Elevated number of dropped TCP connections to a listening remote network socket.
The symptom:
“SYNs to LISTEN sockets dropped” increments at a high rate:
root@smtp-out-n01:~# netstat -s | grep -i listen
2608 SYNs to LISTEN sockets dropped
root@smtp-out-n01:~#
Obtaining the baseline:
- Starting from a lower level, let's check the size of the transmit queue on the network interface and make sure there aren’t any collisions:
- Next, let’s check to see if the interface is dropping packets due to the transmit queue:
- Finally, check for any fragmentation problems:
- Moving up the stack, print the Accept Queue sizes for the listening service. Recv-Q shows the number of sockets in the Accept Queue, and Send-Q shows the backlog parameter:
- The accept queue is essentially empty. Let’s check how many connections are in SYN-RECV state for the receiving process in question:
- Connections are moving to ESTABLISHED pretty quickly. Let’s make sure we have enough file descriptors available (the current number of allocated file handles, the number of unused but allocated file handles, the system-wide maximum):
- Check the number of connections in TIME-WAIT (closed locally and waiting out the timeout) and the total number of established connections:
- No concerns here, based on the total number of connections. Let’s check the rate of NEW connections:
- The current rate is 180 NEW connections per second. Observing the rate on a single node over a 24-hour period, we peak at about 250 connections per second. Checking CPU and memory utilization shows a mostly idle system, even during peak send times:
- Finally, checking the counter for dropped SYN packets shows an ever-increasing number, growing at a rate of about 20/sec:
- The usual reason for dropped SYN packets is a full SYN Queue or Accept Queue, but none of the diagnostics above showed that. For better visibility, let’s install some kernel hooks with SystemTap to print details on exactly which connections suffer from Accept Queue overflow. This should help identify periodically hung applications that fail to accept() connections fast enough:
- Unfortunately, running the kernel hook for 24 hours did not yield any results. The SYN and Accept queues were nearly empty, yet the “SYNs to LISTEN sockets dropped” counter kept increasing.
root@smtp-out-n01:~# ifconfig eth0 | grep txqueuelen
collisions:0 txqueuelen:1000
root@smtp-out-n01:~#
root@smtp-out-n01:~# tc -s qdisc show dev eth0 | grep dropped
Sent 17873576470 bytes 21407282 pkt (dropped 0, overlimits 0 requeues 12223)
Sent 2830505875 bytes 2906590 pkt (dropped 0, overlimits 0 requeues 1874)
Sent 1498561593 bytes 2255912 pkt (dropped 0, overlimits 0 requeues 1391)
Sent 3102757206 bytes 2651357 pkt (dropped 0, overlimits 0 requeues 1121)
Sent 5034092949 bytes 2946821 pkt (dropped 0, overlimits 0 requeues 2729)
Sent 1231897711 bytes 2718582 pkt (dropped 0, overlimits 0 requeues 1506)
Sent 1743081970 bytes 2229000 pkt (dropped 0, overlimits 0 requeues 1851)
Sent 1435231757 bytes 2717978 pkt (dropped 0, overlimits 0 requeues 1015)
Sent 997447409 bytes 2981042 pkt (dropped 0, overlimits 0 requeues 736)
root@smtp-out-n01:~#
root@smtp-out-n01:~# cat /proc/net/snmp | grep '^Ip:' | cut -f17 -d' '
ReasmFails
0
root@smtp-out-n01:~#
root@smtp-out-n01:~# ss -plnt sport = :2319|cat && ss -plnt sport = :2320|cat
State   Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
LISTEN  0       65535   :::2319             :::*               users:(("service",pid=3646,fd=47))
State   Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
LISTEN  0       65535   :::2320             :::*               users:(("service",pid=3646,fd=48))
root@smtp-out-n01:~#
root@smtp-out-n01:~# ss -n state syn-recv sport = :2319 | wc -l; ss -n state syn-recv sport = :2320 | wc -l
5
1
root@smtp-out-n01:~#
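One caveat with piping `ss` into `wc -l`: as far as I know, `ss` prints a header row even when a state filter is given, so the counts above most likely include that row. Here is a minimal sketch of a helper that skips it. The function name `count_sockets` is made up for illustration, and the header detection is an assumption about the `ss` output format:

```shell
# count_sockets: count socket rows in `ss` output read from stdin.
# Assumption: if present, the header is the first line and starts with
# Netid, State or Recv-Q. Blank lines are ignored.
count_sockets() {
    awk 'NR == 1 && /^(Netid|State|Recv-Q)/ { next }
         NF > 0 { n++ }
         END { print n + 0 }'
}

# Usage (port taken from the transcript above):
#   ss -n state syn-recv sport = :2319 | count_sockets
```

Newer iproute2 releases also accept `ss -H` to suppress the header, which would make the helper unnecessary where that flag is available.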
root@smtp-out-n01:~# sysctl fs.file-nr
fs.file-nr = 6240 0 3247209
root@smtp-out-n01:~#
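To put those three numbers in perspective, here is a small sketch that turns them into a utilization percentage. The helper name `fd_usage_pct` is hypothetical. It expects the three fields in the order `fs.file-nr` reports them (allocated, unused, maximum):

```shell
# fd_usage_pct: percentage of the system-wide file handle limit in use.
# Reads "allocated unused max" from stdin, the layout of /proc/sys/fs/file-nr.
# Assumption: the maximum (third field) is non-zero.
fd_usage_pct() {
    awk '{ printf "%.2f\n", ($1 - $2) * 100 / $3 }'
}

# Usage:
#   fd_usage_pct < /proc/sys/fs/file-nr
#   echo '6240 0 3247209' | fd_usage_pct    # prints 0.19
```

With the values from the transcript, the node uses roughly 0.19% of the available file handles, so descriptors are not the bottleneck.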
root@smtp-out-n01:~# ss -n state time-wait | wc -l
86
root@smtp-out-n01:~# ss -n state established | wc -l
1441
root@smtp-out-n01:~#
root@smtp-out-n01:~# modprobe ip_conntrack
root@smtp-out-n01:~# conntrack -E -e NEW | pv -l -i 1 -r > /dev/null
[ 180 /s]
^C
root@smtp-out-n01:~#
root@smtp-out-n01:~# free -m
              total        used        free      shared  buff/cache   available
Mem:          31711        7779       17838         296        6092       23083
Swap:             0           0           0
root@smtp-out-n01:~# w
18:12:36 up 2:21, 1 user, load average: 0.61, 0.71, 0.70
root@smtp-out-n01:~#
root@smtp-out-n01:~# netstat -s | grep -i listen
2608 SYNs to LISTEN sockets dropped
root@smtp-out-n01:~# nstat -az | grep -i listen
TcpExtListenOverflows            0     0.0
TcpExtListenDrops                2608  0.0
TcpExtTCPFastOpenListenOverflow  0     0.0
root@smtp-out-n01:~#
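The per-second drop rate can be estimated by sampling the counter twice. Below is a rough sketch that reads `ListenDrops` straight from `/proc/net/netstat`, assuming the usual layout of a `TcpExt:` line of names followed by a `TcpExt:` line of values. The function names `netstat_counter` and `listen_drop_rate` are my own and not part of any tool:

```shell
# netstat_counter NAME [FILE]: print one TcpExt counter by name.
# FILE defaults to /proc/net/netstat; the first "TcpExt:" line holds the
# counter names and the second holds the matching values.
netstat_counter() {
    awk -v name="$1" '
        $1 == "TcpExt:" && !seen { for (i = 2; i <= NF; i++) col[$i] = i; seen = 1; next }
        $1 == "TcpExt:" && seen  { if (name in col) print $(col[name]); exit }
    ' "${2:-/proc/net/netstat}"
}

# listen_drop_rate [SECONDS]: average ListenDrops per second over an interval.
# SECONDS defaults to 10 and must be a positive integer.
listen_drop_rate() {
    interval="${1:-10}"
    before=$(netstat_counter ListenDrops)
    sleep "$interval"
    after=$(netstat_counter ListenDrops)
    echo $(( (after - before) / interval ))
}
```

Running `listen_drop_rate 60` before and after each tuning step gives a comparable number for every change.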
root@smtp-out-n01:~# cat acceptq.stp
probe begin {
    printf("time (us) \tacceptq\tqmax\tlocal addr\tremote_addr\n")
}
// The skb inspected here is the incoming SYN, so the remote peer is the
// packet's source address.
function skb_get_remote_v4addr:string(skb:long)
{
    return format_ipaddr(__ip_skb_saddr(__get_skb_iphdr(skb)), 2 /* AF_INET */)
}
function skb_get_remote_v6addr:string(skb:long)
{
    ipv6_hdr = &@cast(__get_skb_iphdr(skb), "ipv6hdr")
    return format_ipaddr(&ipv6_hdr->saddr, 10 /* AF_INET6 */)
}
function skb_get_remote_port:long(skb:long)
{
    return __tcp_skb_sport(__get_skb_tcphdr(skb))
}
probe kernel.function("tcp_v4_conn_request") {
    if ($sk->sk_ack_backlog > $sk->sk_max_ack_backlog) {
        printf("%d\t%d\t%d\t%s:%d\t%s:%d\n",
            gettimeofday_us(),
            $sk->sk_ack_backlog,
            $sk->sk_max_ack_backlog,
            inet_get_ip_source($sk),
            inet_get_local_port($sk),
            skb_get_remote_v4addr($skb),
            skb_get_remote_port($skb));
    }
}
probe kernel.function("tcp_v6_conn_request") {
    if ($sk->sk_ack_backlog > $sk->sk_max_ack_backlog) {
        printf("%d\t%d\t%d\t[%s]:%d\t[%s]:%d\n",
            gettimeofday_us(),
            $sk->sk_ack_backlog,
            $sk->sk_max_ack_backlog,
            inet_get_ip_source($sk),
            inet_get_local_port($sk),
            skb_get_remote_v6addr($skb),
            skb_get_remote_port($skb));
    }
}
root@smtp-out-n01:~# wget http://launchpadlibrarian.net/483914277/linux-image-4.4.0-1110-aws-dbgsym_4.4.0-1110.121_amd64.ddeb && dpkg --install linux-image-4.4.0-1110-aws-dbgsym_4.4.0-1110.121_amd64.ddeb
root@smtp-out-n01:~# stap -v acceptq.stp
Pass 1: parsed user script and 110 library script(s) using 107992virt/43612res/6352shr/37332data kb, in 80usr/20sys/99real ms.
Pass 2: analyzed script: 6 probe(s), 28 function(s), 5 embed(s), 3 global(s) using 258404virt/195296res/7684shr/187744data kb, in 1530usr/420sys/1960real ms.
Pass 3: using cached /root/.systemtap/cache/4a/stap_4ae7ddea0627fb050d53ced46ad3b670_24938.c
Pass 4: using cached /root/.systemtap/cache/4a/stap_4ae7ddea0627fb050d53ced46ad3b670_24938.ko
Pass 5: starting run.
time (us) acceptq qmax local addr remote_addr
Tuning the kernel for better network performance:
After each incremental change, I measured the rate of SYN errors and checked the SYN and Accept queue utilizations.
- Increased the size of the incoming packet backlog queue. This sets the maximum number of packets queued on the input side when an interface receives packets faster than the kernel can process them:
- Increased the overall TCP memory, in pages (number of guaranteed pages for TCP, the threshold at which TCP should start to conserve pages, maximum number of allocatable pages):
- Increased the core system socket read and write buffers absolute max, in bytes. The applications cannot request more than this value:
- Increased the TCP socket read and write buffers (min, default and max size in bytes):
- Ensured TCP window scaling is enabled:
- Updated the number of times the initial SYN is retransmitted for an active connection attempt. With the default, the final timeout for an active TCP connection attempt happens after 127 seconds:
- And, arguably most importantly, I’ve increased the limit of the socket listen() backlog (net.core.somaxconn), which also acts as the effective upper bound for net.ipv4.tcp_max_syn_backlog. The kernel documentation states that if this limit is reached SYN packets will be dropped:
- Even though that huge number was accepted (the default varies by kernel version, from 128 to 4096), it seems the queue can’t be larger than 65535:
- Increased the Listener queue length for unacknowledged SYN_RECV connection attempts. A SYN_RECV request socket consumes about 304 bytes of memory:
- Checking to see how many connections are in SYN_RECV state after the above change:
- Increased the number of times SYNACKs for a passive TCP connection attempt will be retransmitted. With the default, the final timeout for a passive TCP connection happens after 63 seconds:
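- Finally, disabling the fast recycling of TIME_WAIT sockets (net.ipv4.tcp_tw_recycle), at the expense of more connections in TIME_WAIT and about 120MB of extra memory usage, yielded the best result: dropped SYN packets went down to about 3 per 15 minutes! With recycling enabled, the kernel can reject SYNs whose TCP timestamps look older than the last one seen from the same source address, which typically hits clients behind NAT.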
root@smtp-out-n01:~# sysctl -w net.core.netdev_max_backlog=3000000
net.core.netdev_max_backlog = 3000000
root@smtp-out-n01:~#
root@smtp-out-n01:~# sysctl -w net.ipv4.tcp_mem='758316 1011092 1516632'
net.ipv4.tcp_mem = 758316 1011092 1516632
root@smtp-out-n01:~#
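Since tcp_mem is expressed in pages rather than bytes, it helps to convert the values before comparing them with the memory on the box. Here is a minimal sketch. The helper name `pages_to_mib` is made up, and the 4096-byte page size in the examples is an assumption (check with `getconf PAGESIZE`):

```shell
# pages_to_mib PAGES [PAGE_SIZE]: convert a page count to MiB (integer division).
# PAGE_SIZE defaults to whatever `getconf PAGESIZE` reports on this host.
pages_to_mib() {
    page_size="${2:-$(getconf PAGESIZE)}"
    echo $(( $1 * page_size / 1048576 ))
}

# Assuming 4096-byte pages, the three tcp_mem values above come to:
#   pages_to_mib 758316 4096     # 2962
#   pages_to_mib 1011092 4096    # 3949
#   pages_to_mib 1516632 4096    # 5924
```

Under that assumption the maximum is a little under 6 GiB on a node with about 31 GB of RAM.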
root@smtp-out-n01:~# sysctl -w net.core.rmem_max=67108864
net.core.rmem_max = 67108864
root@smtp-out-n01:~# sysctl -w net.core.wmem_max=67108864
net.core.wmem_max = 67108864
root@smtp-out-n01:~#
root@smtp-out-n01:~# sysctl net.ipv4.tcp_rmem='715827867 1073741800 2147483600'
net.ipv4.tcp_rmem = 715827867 1073741800 2147483600
root@smtp-out-n01:~# sysctl net.ipv4.tcp_wmem='715827867 1073741800 2147483600'
net.ipv4.tcp_wmem = 715827867 1073741800 2147483600
root@smtp-out-n01:~#
root@smtp-out-n01:~# sysctl net.ipv4.tcp_window_scaling
net.ipv4.tcp_window_scaling = 1
root@smtp-out-n01:~#
root@smtp-out-n01:~# sysctl -w net.ipv4.tcp_syn_retries=6
net.ipv4.tcp_syn_retries = 6
root@smtp-out-n01:~#
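The 127-second figure (and the 63 seconds quoted for tcp_synack_retries) follows from exponential backoff. The sketch below reproduces the arithmetic under two assumptions: an initial retransmission timeout of 1 second, and a timeout that doubles after every retransmission without hitting the kernel's upper cap. The function name `syn_timeout` is hypothetical:

```shell
# syn_timeout RETRIES [INITIAL_RTO]: seconds until a connection attempt gives up.
# Assumptions: the RTO starts at INITIAL_RTO (default 1 second) and doubles
# after every retransmission. The kernel caps the RTO, which this ignores,
# so the result is only meaningful for small retry counts.
syn_timeout() {
    retries="$1"
    rto="${2:-1}"
    total=0
    i=0
    # One wait after the initial packet plus one after each retransmission.
    while [ "$i" -le "$retries" ]; do
        total=$(( total + rto ))
        rto=$(( rto * 2 ))
        i=$(( i + 1 ))
    done
    echo "$total"
}

# syn_timeout 6    # 1+2+4+8+16+32+64 = 127
# syn_timeout 5    # 1+2+4+8+16+32    = 63
```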
root@smtp-out-n01:~# sysctl -w net.core.somaxconn=1000000
net.core.somaxconn = 1000000
root@smtp-out-n01:~#
root@smtp-out-n01:~# ss -plnt sport = :2319|cat && ss -plnt sport = :2320|cat
State   Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
LISTEN  0       65535   :::2319             :::*               users:(("service",pid=3646,fd=47))
State   Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
LISTEN  0       65535   :::2320             :::*               users:(("service",pid=3646,fd=48))
root@smtp-out-n01:~#
root@smtp-out-n01:~# sysctl -w net.ipv4.tcp_max_syn_backlog=7064090
net.ipv4.tcp_max_syn_backlog = 7064090
root@smtp-out-n01:~#
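As a sanity check on that value, here is a quick sketch of the worst-case memory a full SYN queue would need, using the figure of about 304 bytes per SYN_RECV request socket quoted above. The helper name `syn_backlog_bytes` is made up for illustration:

```shell
# syn_backlog_bytes ENTRIES [BYTES_PER_ENTRY]: worst-case bytes for a full SYN queue.
# BYTES_PER_ENTRY defaults to 304, the approximate size quoted in the text.
syn_backlog_bytes() {
    echo $(( $1 * ${2:-304} ))
}

# syn_backlog_bytes 7064090    # 2147483360, just under 2 GiB (2147483648)
```

That is only an upper bound. With a handful of connections in SYN_RECV at any time, actual usage stays negligible.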
root@smtp-out-n01:~# netstat -antup | grep SYN_RECV | egrep "2319|2320" | wc -l
3
root@smtp-out-n01:~#
root@smtp-out-n01:~# sysctl -w net.ipv4.tcp_synack_retries=5
net.ipv4.tcp_synack_retries = 5
root@smtp-out-n01:~#
root@smtp-out-n01:~# sysctl -w net.ipv4.tcp_tw_recycle=0
net.ipv4.tcp_tw_recycle = 0
root@smtp-out-n01:~#