Description
Right now we run the agent and CLI binaries without any privilege escalation, so we cannot increase the buffer sizes of our UDP sockets. This shows up in the following log message:
2023-09-26 21:30:17.014 [debu] net.wgengine: magicsock: [warning] failed to force-set UDP write buffer size to 7340032: operation not permitted; using kernel default values (impacts throughput only)
I tested giving the `coder` binaries `CAP_NET_ADMIN` (which allows us to resize UDP buffers) and got a throughput increase of 50% or more on two cores (about 69% in the runs below).
Without `CAP_NET_ADMIN` (both agent and CLI):
INTERVAL THROUGHPUT
0.00-1.02 sec 362.4548 Mbits/sec
1.02-2.06 sec 419.8718 Mbits/sec
2.06-3.07 sec 529.4830 Mbits/sec
3.07-4.07 sec 536.7345 Mbits/sec
4.07-5.06 sec 663.2100 Mbits/sec
-----------------------------------
0.00-5.06 sec 500.8592 Mbits/sec
With `CAP_NET_ADMIN` (both agent and CLI):
INTERVAL THROUGHPUT
0.00-1.00 sec 786.4530 Mbits/sec
1.00-2.02 sec 822.9956 Mbits/sec
2.02-3.04 sec 875.3655 Mbits/sec
3.04-4.05 sec 877.9789 Mbits/sec
4.05-5.04 sec 865.6894 Mbits/sec
-----------------------------------
0.00-5.04 sec 845.7073 Mbits/sec
Adding `CAP_NET_ADMIN` to agents should be pretty straightforward, as long as the workspace contains `setcap`. For CLI installs, we might be able to add it automatically via the install script. It's worth noting that the speed increase only happens when both the agent and CLI have `CAP_NET_ADMIN`. If either is missing, the lower speeds are seen.
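Since both ends need the capability, it could help to have each binary report whether it actually has it. Below is a minimal sketch, assuming Linux, that parses the `CapEff` bitmask from `/proc/self/status` and tests the `CAP_NET_ADMIN` bit (bit 12 in `linux/capability.h`). The helper names `effectiveCaps` and `hasNetAdmin` are hypothetical and not part of the coder codebase.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// capNetAdmin is the bit index of CAP_NET_ADMIN in the capability bitmask.
const capNetAdmin = 12

// effectiveCaps (hypothetical helper) extracts the CapEff hex bitmask from
// the contents of a /proc/<pid>/status file.
func effectiveCaps(status string) (uint64, error) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, "CapEff:") {
			hex := strings.TrimSpace(strings.TrimPrefix(line, "CapEff:"))
			return strconv.ParseUint(hex, 16, 64)
		}
	}
	return 0, fmt.Errorf("CapEff line not found")
}

// hasNetAdmin reports whether the CAP_NET_ADMIN bit is set in caps.
func hasNetAdmin(caps uint64) bool {
	return caps&(1<<capNetAdmin) != 0
}

func main() {
	data, err := os.ReadFile("/proc/self/status")
	if err != nil {
		fmt.Println("cannot read process status:", err)
		return
	}
	caps, err := effectiveCaps(string(data))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("CAP_NET_ADMIN effective:", hasNetAdmin(caps))
}
```

The capability itself would typically be granted as a file capability, for example with something like `setcap cap_net_admin+ep <path-to-binary>`; the exact path and where this step runs (workspace image or install script) are left open here.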
It might be good to experiment with larger buffer sizes to find a happy medium for our use case.