Follow-up to the SyncInbound bulk rewrite. Fixes the remaining O(M*N)
and O(M)-round-trip behaviour in the add/delete and bulk paths, which
made them time out on large inbounds (minutes in the worst case),
especially on PostgreSQL.
- compactOrphans: chunk the "email IN (...)" lookup (400/batch) instead
of binding every email at once. A single huge IN list exceeded
PostgreSQL's 65535 bind-parameter limit (and SQLite's own variable
limit) and pushed the planner into pathological behaviour, so add/delete
failed outright past ~100k clients.
- emailsUsedByOtherInbounds: new batched form used by delInboundClients
(BulkDetach) and bulkDelInboundClients (BulkDelete), replacing a
per-email global JSON scan (O(M*N)) with one scan, and skipped entirely
when keepTraffic is set.
- BulkCreate: rewritten to validate/dedup in one pass, then group clients
by inbound and add them in a single addInboundClient call per inbound
(one getAllEmailSubIDs, one settings rewrite, one SyncInbound) instead
of running the full single-create pipeline per client.
- Bulk delete/adjust: batch DelClientStat/DelClientIPs into
DELETE ... IN (...) statements and wrap the settings Save + SyncInbound
in one transaction, so the per-row writes share a single fsync instead
of paying one per row.
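The chunked lookup described for compactOrphans can be sketched as
follows. This is a minimal illustration, not the project's code: the
chunk helper and the generated emails are hypothetical, and only the
batch size of 400 comes from the notes above.

```go
package main

import "fmt"

// chunk splits items into consecutive slices of at most size elements.
// Hypothetical helper for illustration; not taken from the codebase.
func chunk[T any](items []T, size int) [][]T {
	if size <= 0 {
		return nil
	}
	var out [][]T
	for start := 0; start < len(items); start += size {
		end := start + size
		if end > len(items) {
			end = len(items)
		}
		out = append(out, items[start:end])
	}
	return out
}

func main() {
	emails := make([]string, 1000)
	for i := range emails {
		emails[i] = fmt.Sprintf("user%d@example.com", i)
	}
	for _, batch := range chunk(emails, 400) {
		// Each batch would be bound to its own "email IN (...)" query,
		// keeping the parameter count far below the driver limits.
		fmt.Println(len(batch))
	}
}
```

With 1000 emails this yields batches of 400, 400 and 200, so no single
statement binds more than 400 parameters regardless of inbound size.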
Measured on PostgreSQL 16 (one inbound, M=2000 affected clients):
- create: 8m35s (M=500) -> ~1-5s
- detach: 52s -> ~4s (flat in N)
- delete: ~16s -> ~1-4s
- adjust: ~20s -> ~7-10s
A single-client add/delete on a 200k-client inbound still completes in
seconds.
sync_scale_postgres_test.go adds skip-gated benchmarks (run with
XUI_DB_TYPE=postgres) for the single add/delete and the five bulk
operations.
Every client mutation funnels through SyncInbound, which ran O(n) DB
round-trips per call: one SELECT per client, a Save+UpdateColumn per
client, and a per-row junction INSERT. Toggling a single client on a
large inbound issued thousands of queries and timed out, especially on
PostgreSQL, where each round-trip pays TCP latency.
SyncInbound now:
- loads existing records with a single chunked SELECT ... email IN (...)
instead of one query per client
- writes only the records that actually changed (skips no-op Saves), so
toggling/editing one client writes one row, not all of them
- batch-creates new records and batch-inserts the junction rows
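The "write only what changed" step above might look roughly like this.
It is a sketch under assumptions: clientRecord is a hypothetical,
trimmed-down stand-in for the stored row, and changedRecords is an
illustrative name, not the function SyncInbound actually uses.

```go
package main

import "fmt"

// clientRecord is a hypothetical, trimmed-down stand-in for the stored row.
type clientRecord struct {
	Email  string
	Enable bool
	Limit  int64
}

// changedRecords compares the desired state against what is already stored
// and returns only the rows that must be created or updated. Rows that are
// identical to the stored copy are skipped entirely.
func changedRecords(existing map[string]clientRecord, desired []clientRecord) (toCreate, toUpdate []clientRecord) {
	for _, d := range desired {
		cur, ok := existing[d.Email]
		switch {
		case !ok:
			toCreate = append(toCreate, d)
		case cur != d:
			toUpdate = append(toUpdate, d)
		}
	}
	return toCreate, toUpdate
}

func main() {
	existing := map[string]clientRecord{
		"a@x": {Email: "a@x", Enable: true, Limit: 10},
		"b@x": {Email: "b@x", Enable: true, Limit: 10},
	}
	desired := []clientRecord{
		{Email: "a@x", Enable: false, Limit: 10}, // toggled
		{Email: "b@x", Enable: true, Limit: 10},  // unchanged, skipped
		{Email: "c@x", Enable: true, Limit: 10},  // new
	}
	toCreate, toUpdate := changedRecords(existing, desired)
	fmt.Println(len(toCreate), len(toUpdate))
}
```

The existing map would be filled from the chunked SELECT, toCreate would
go to a batch insert, and toUpdate would hold one row for a single-client
toggle no matter how many clients the inbound has.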
Merge and sticky-field semantics are unchanged. Measured on PostgreSQL
16: a single-client toggle on a 50k-client inbound drops from ~8m54s to
~0.9s, and seeding 50k clients from ~2m48s to ~1.6s; 200k clients sync
in seconds.
A skip-gated benchmark (web/service/sync_scale_postgres_test.go, run
with XUI_DB_TYPE=postgres) reproduces and verifies the scaling.
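For readers unfamiliar with skip-gating, the pattern is roughly the one
below. This is a sketch only: postgresSelected and TestSyncScale are
hypothetical names, and the real test file may gate differently. The
test function is kept in a package main file so the sketch runs on its
own; in practice it would live in a _test.go file.

```go
package main

import (
	"fmt"
	"os"
	"testing"
)

// postgresSelected reports whether the given XUI_DB_TYPE value opts in to
// the PostgreSQL-backed run. Hypothetical helper for illustration.
func postgresSelected(dbType string) bool {
	return dbType == "postgres"
}

// TestSyncScale shows the gate: without the opt-in the test is reported as
// skipped rather than failed, so ordinary runs stay fast and need no server.
func TestSyncScale(t *testing.T) {
	if !postgresSelected(os.Getenv("XUI_DB_TYPE")) {
		t.Skip("set XUI_DB_TYPE=postgres to run the scaling benchmark")
	}
	// Seed a large inbound and time the operation under test here.
}

func main() {
	fmt.Println("scaling tests enabled:", postgresSelected(os.Getenv("XUI_DB_TYPE")))
}
```

A run would then look something like
XUI_DB_TYPE=postgres go test ./web/service/, with a reachable PostgreSQL
instance configured however the project expects.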