Bulk client operations bound their entire working set in a single
WHERE x IN (...) clause. On large inbounds/selections this exceeded
PostgreSQL's 65535-parameter limit (and SQLite's 32766) and handed the
planner a pathological query, so the operations failed outright once
the set outgrew the limit. Every such query is now chunked at 400 items:
- BulkDelete / delete-all-clients: six IN queries chunked, and the
per-row delete tombstone (which swept the whole in-memory map on every
call, O(N^2)) replaced with a single bulk sweep.
- BulkAdjust: record and inbound-mapping lookups chunked.
- AddToGroup / RemoveFromGroup (bulk add/remove to group): three IN
queries chunked.
- replaceGroupValue (rename/delete group): inbound-mapping lookup chunked.
- List (all-clients listing): link and traffic lookups chunked.
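The chunking pattern used in the list above can be sketched as follows.
This is a minimal illustration, not the project's actual helper:
forEachChunk, chunkSize and countQueries are hypothetical names, and only
the batch size of 400 comes from the text. Each callback invocation stands
in for one WHERE x IN (...) query bound to at most 400 values.

```go
package main

// chunkSize mirrors the 400-item batch size mentioned above.
const chunkSize = 400

// forEachChunk calls fn once per consecutive slice of at most size items.
// It stops at the first error, so a failed batch aborts the whole operation.
// (Hypothetical helper, for illustration only.)
func forEachChunk[T any](items []T, size int, fn func(batch []T) error) error {
	if size <= 0 {
		size = chunkSize
	}
	for start := 0; start < len(items); start += size {
		end := start + size
		if end > len(items) {
			end = len(items)
		}
		if err := fn(items[start:end]); err != nil {
			return err
		}
	}
	return nil
}

// countQueries reports how many IN queries a lookup over n keys would issue.
func countQueries(n int) int {
	queries := 0
	keys := make([]int, n)
	_ = forEachChunk(keys, chunkSize, func(batch []int) error {
		queries++ // stands in for one "WHERE id IN (?, ..., ?)" with len(batch) parameters
		return nil
	})
	return queries
}
```

Under these assumptions a 100k-key lookup issues 250 queries of at most
400 parameters each, which stays far below either database's limit.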
Measured on PostgreSQL 16: delete-all-clients on a 100k-client inbound
now completes in ~7s (previously crashed at the parameter limit); bulk
add/remove to group ~6s and full client list ~1s at 100k.
sync_scale_postgres_test.go adds skip-gated benchmarks for delete-all,
group add/remove, and list.
Follow-up to the SyncInbound bulk rewrite, fixing the remaining O(M*N)
and O(M)-round-trip behaviour in the add/delete and bulk paths that made
them time out on large inbounds (worst case minutes), especially on
PostgreSQL.
- compactOrphans: chunk the "email IN (...)" lookup (400/batch) instead
of binding every email at once. A single huge IN exceeded PostgreSQL's
65535-parameter limit (and SQLite's) and made the planner pathological,
so add/delete failed outright past ~100k clients.
- emailsUsedByOtherInbounds: new batched form used by delInboundClients
(BulkDetach) and bulkDelInboundClients (BulkDelete). It replaces a
per-email global JSON scan (O(M*N)) with one scan and is skipped
entirely when keepTraffic is set.
- BulkCreate: rewritten to validate/dedup in one pass, then group clients
by inbound and add them in a single addInboundClient call per inbound
(one getAllEmailSubIDs, one settings rewrite, one SyncInbound) instead
of running the full single-create pipeline per client.
- Bulk delete/adjust: batch DelClientStat/DelClientIPs with IN deletes
and wrap the settings Save + SyncInbound in one transaction, so the
per-row writes share a single fsync instead of one per row.
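The batched emailsUsedByOtherInbounds lookup described in the list above
can be illustrated with a simplified in-memory sketch. The inbound type and
the emailsUsedElsewhere function are hypothetical stand-ins: the real code
reads client emails out of JSON settings in the database, and its signature
is not shown in this message. The point is only the shape of the algorithm:
build a set of the M candidate emails once, then walk every inbound a single
time, instead of rescanning all inbounds for each email.

```go
package main

// inbound is a simplified stand-in for a stored inbound. In the real schema
// the client emails live inside a JSON settings blob (assumption).
type inbound struct {
	ID     int
	Emails []string
}

// emailsUsedElsewhere returns the subset of candidates that also appear in
// an inbound other than excludeID. Every inbound is visited once, so the
// cost is proportional to the total number of stored emails plus the number
// of candidates, rather than one full scan per candidate email.
func emailsUsedElsewhere(inbounds []inbound, excludeID int, candidates []string) map[string]bool {
	wanted := make(map[string]bool, len(candidates))
	for _, e := range candidates {
		wanted[e] = true
	}

	used := make(map[string]bool)
	for _, ib := range inbounds {
		if ib.ID == excludeID {
			continue // the inbound being modified does not count as "other"
		}
		for _, e := range ib.Emails {
			if wanted[e] {
				used[e] = true
			}
		}
	}
	return used
}
```

A caller following the text would skip this call entirely when keepTraffic
is set, since the result is then not needed.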
Measured on PostgreSQL 16 (one inbound, M=2000 affected clients):
- create: 8m35s (M=500) -> ~1-5s
- detach: 52s -> ~4s (flat in N)
- delete: ~16s -> ~1-4s
- adjust: ~20s -> ~7-10s
Adding or deleting a single client on a 200k-client inbound still
completes in seconds.
sync_scale_postgres_test.go adds skip-gated benchmarks
(XUI_DB_TYPE=postgres) for the single add/delete and the five bulk
operations.
Every client mutation funnels through SyncInbound, which ran O(n) DB
round-trips per call: one SELECT per client, a Save+UpdateColumn per
client, and a per-row junction INSERT. Toggling a single client on a
large inbound issued thousands of queries and timed out, most severely
on PostgreSQL, where each round-trip pays TCP latency.
SyncInbound now:
- loads existing records with a single chunked SELECT ... email IN (...)
instead of one query per client
- writes only the records that actually changed (skips no-op Saves), so
toggling/editing one client writes one row, not all of them
- batch-creates new records and batch-inserts the junction rows
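The "write only what changed" step in the list above can be sketched as a
small planning function. This is an illustration under stated assumptions,
not the SyncInbound implementation: clientRecord and planSync are
hypothetical names, the record has only three fields, and the merge and
sticky-field rules of the real code are deliberately left out.

```go
package main

// clientRecord is a simplified stand-in for a persisted client row.
type clientRecord struct {
	Email  string
	Enable bool
	Limit  int64
}

// planSync compares the incoming clients with the rows already loaded
// (keyed by email) and returns only the work that is needed: rows to
// batch-create and rows whose fields differ. Unchanged rows yield no write.
func planSync(existing map[string]clientRecord, incoming []clientRecord) (toCreate, toUpdate []clientRecord) {
	for _, c := range incoming {
		old, ok := existing[c.Email]
		switch {
		case !ok:
			toCreate = append(toCreate, c) // new client: goes into the batch insert
		case old != c:
			toUpdate = append(toUpdate, c) // changed client: needs a single-row write
		}
	}
	return toCreate, toUpdate
}
```

With this shape, toggling one client among many produces a one-element
toUpdate slice and an empty toCreate slice, which matches the "writes one
row, not all of them" behaviour described above.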
Merge and sticky-field semantics are unchanged. Measured on PostgreSQL
16: a single-client toggle on a 50k-client inbound drops from ~8m54s to
~0.9s, and seeding 50k clients from ~2m48s to ~1.6s; 200k clients sync
in seconds.
A skip-gated benchmark (web/service/sync_scale_postgres_test.go, run
with XUI_DB_TYPE=postgres) reproduces and verifies the scaling.
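For readers unfamiliar with the term, a skip-gated test is one that skips
itself unless an environment variable opts in. The sketch below shows the
general shape only. XUI_DB_TYPE=postgres is taken from the text;
scaleTestsEnabled and TestSyncScaleSketch are hypothetical names, and the
body of the real benchmark is not reproduced here.

```go
package main

import (
	"os"
	"testing"
)

// scaleTestsEnabled reports whether the PostgreSQL scale tests should run
// for the given XUI_DB_TYPE value. (Hypothetical helper.)
func scaleTestsEnabled(dbType string) bool {
	return dbType == "postgres"
}

// TestSyncScaleSketch skips unless XUI_DB_TYPE=postgres is set, so a normal
// test run never needs a PostgreSQL server.
func TestSyncScaleSketch(t *testing.T) {
	if !scaleTestsEnabled(os.Getenv("XUI_DB_TYPE")) {
		t.Skip("set XUI_DB_TYPE=postgres to run the scale benchmark")
	}
	// A real benchmark would seed a large inbound here and time the operation.
	t.Log("scale benchmark would run here")
}
```

In a real package this function would live in a _test.go file so that
go test discovers it.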