Keepin’ TCP Alive

I was debugging an odd network issue lately that turned out to have a pretty simple explanation. A client on the network was intermittently experiencing significant delays in accessing the network. Upon closer inspection, it turned out that prior to the delay, the client was being left idle for long periods of time. With this additional information it was pretty easy to identify that there was likely a connection between the client and server that was being torn down for being idle.

So in the end, the cause of the problem itself was pretty simple to identify. The fix, however, is more of a conundrum. The obvious answer is to adjust the timers and prevent the connection from being torn down. But what timers should be adjusted? There are the keepalive timers on the client, the keepalive timers on the server, and the idle teardown timers on the firewall in the middle.

TCP keepalive handling varies between operating systems. If we look at the three major operating systems, Linux, Windows, and OS X, then we can make the blanket statement that, by default, keepalives are sent after two hours of idle time. But, most firewalls seem to have a default TCP teardown timer of one hour. These defaults are not conducive to keeping idle connections alive.

The optimal scenario for timeouts is for the clients to have a keepalive timer that fires at an interval lower than that of the idle tcp timeout on the firewall. The actual values to use, as well as which devices should be changed, is up for debate. The firewall is clearly the easier point at which to make such a change. Typically there are very few firewall devices that would need to be updated as compared to the larger number of client devices. Additionally, there will likely be fewer firewalls added to the network over time, so ensuring that timers are properly set is much easier. On the other hand, the defaults that firewalls are generally configured with have been chosen specifically by the vendor for legitimate reasons. So perhaps the clients should conform to the setting on the firewall? What is the optimal solution?

And why would we want to allow idle connections anyway? After all, if a connection is idle, it’s not being used. Clearly, any application that needed a connection to remain open would send some sort of keepalive, right? Is there a valid reason to allow these sorts of connections for an extended period of time?

As it turns out, there are valid reasons for connections to remain active, but idle. For instance, database connections are often kept for longer periods of time for performance purposes. The TCP handshake can take a considerable amount of time to perform as opposed to the simple matter of retrieving data from a database. So if the database connection remains established, additional data can be retrieved without the overhead of TCP setup. But in these instances, shouldn’t the application ensure that keepalives are sent so that the connection is not prematurely terminated by an idle timer somewhere along the data path? Well, yes. Sort of. Allow me to explain.

When I first discovered the source of the network problem we were seeing, I chalked it up to lazy programming. While it shouldn’t take much to add a simple keepalive system to a networked application, it is extra work. As it turns out, however, the answer isn’t quite that simple. All three major operating systems, Windows, Linux, and OS X, all have kernel level mechanisms for TCP keepalives. Each OS has a slightly different take on how keepalive timers should work.

Linux has three parameters related to tcp keepalives :

tcp_keepalive_time
The interval between the last data packet sent (simple ACKs are not considered data) and the first keepalive probe; after the connection is marked to need keepalive, this counter is not used any further
tcp_keepalive_intvl
The interval between subsequential keepalive probes, regardless of what the connection has exchanged in the meantime
tcp_keepalive_probes
The number of unacknowledged probes to send before considering the connection dead and notifying the application layer

OS X works quite similar to Linux, which makes sense since they’re both *nix variants. OS X has four parameters that can be set.

keepidle
Amount of time, in milliseconds, that the connection must be idle before keepalive probes (if enabled) are sent. The default is 7200000 msec (2 hours).
keepintvl
The interval, in milliseconds, between keepalive probes sent to remote machines, when no response is received on a keepidle probe. The default is 75000 msec.
keepcnt
Number of probes sent, with no response, before a connection is dropped. The default is 8 packets.
always_keepalive
Assume that SO_KEEPALIVE is set on all TCP connections, the kernel will periodically send a packet to the remote host to verify the connection is still up.

Windows acts very differently from Linux and OS X. Again, there are three parameters, but they perform entirely different tasks. All three parameters are registry entries.

KeepAliveInterval
This parameter determines the interval between TCP keep-alive retransmissions until a response is received. Once a response is received, the delay until the next keep-alive transmission is again controlled by the value of KeepAliveTime. The connection is aborted after the number of retransmissions specified by TcpMaxDataRetransmissions have gone unanswered.
KeepAliveTime
The parameter controls how often TCP attempts to verify that an idle connection is still intact by sending a keep-alive packet. If the remote system is still reachable and functioning, it acknowledges the keep-alive transmission. Keep-alive packets are not sent by default. This feature may be enabled on a connection by an application.
TcpMaxDataRetransmissions
This parameter controls the number of times that TCP retransmits an individual data segment (not connection request segments) before aborting the connection. The retransmission time-out is doubled with each successive retransmission on a connection. It is reset when responses resume. The Retransmission Timeout (RTO) value is dynamically adjusted, using the historical measured round-trip time (Smoothed Round Trip Time) on each connection. The starting RTO on a new connection is controlled by the TcpInitialRtt registry value.

There’s a pretty good reference page with information on how to set these parameters that can be found here.

We still haven’t answered the question of optimal settings. Unfortunately, there doesn’t seem to be a correct answer. The defaults provided by most firewall vendors seem to have been chosen to ensure that the firewall does not run out of resources. Each connection through the firewall must be tracked. As a result, each connection uses up a portion of memory and CPU. Since both memory and CPU are finite resources, administrators must be careful not to exceed the limits of the firewall platform.

There is some good news. Firewalls have had a one hour tcp timeout timer for quite a while. As time has passed and new revisions of firewall hardware are released, the CPU has become more powerful and the amount of memory in each system has grown. The default one hour timer, however, has remained in place. This means that modern firewall platforms are much better prepared to handle an increase in the number of connections tracked. Ultimately, the firewall platform must be monitored and appropriate action taken if resource usage becomes excessive.

My recommendation would be to start by setting the firewall tcp teardown timer to a value slightly higher than that of the clients. For most networks, this would be slightly over two hours. The firewall administrator should monitor the number of connections tracked on the firewall as well as the resources used by the firewall. Adjustments should be made as necessary.

If longer lasting idle connections are unacceptable, then a slightly different tactic can be used. The firewall teardown timer can be set to a level comfortable to the administrator of the network. Problematic clients can be updated to send keepalive packets at a shorter interval. These changes will likely only be necessary on servers. Desktop systems don’t have the same need as servers for long-term establishment of idle connections.