Hello everyone, I’m Kobayashi.

Today, let’s talk about an interesting question: if you unplug the network cable for a few seconds and then plug it back in, does the original TCP connection still exist?

Some readers may say that unplugging the network cable disconnects the physical layer, so the transport layer above it should be disconnected too, and the original TCP connection will no longer exist. It is like a landline call: if one party’s phone line is unplugged, the call is cut off completely.

Is this really the case?

There is a flaw in that reasoning: it wrongly assumes that unplugging the network cable affects the transport layer, when in fact it does not.

In fact, a TCP connection in the Linux kernel is a struct socket, which holds information such as the state of the TCP connection. When the network cable is unplugged, the operating system does not modify this structure, so the state of the TCP connection does not change.

I did a small experiment on my computer. I connected to my cloud server from an ssh terminal and then simulated unplugging the network cable by turning off Wi-Fi. When I checked at that point, the state of the TCP connection had not changed: it was still ESTABLISHED.
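
For readers who want to repeat the check on a Linux host, here is a minimal sketch that decodes entries from /proc/net/tcp, the file that tools such as netstat draw on. The function names are hypothetical, and the address decoding assumes IPv4 on a little-endian machine.

```python
import socket
import struct

# State codes (hex) used in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def decode_endpoint(hex_endpoint):
    """Turn '0100007F:0016' into ('127.0.0.1', 22)."""
    ip_hex, port_hex = hex_endpoint.split(":")
    # Assumption: little-endian host, so the address bytes appear reversed.
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)


def parse_proc_net_tcp_line(line):
    """Return (local_ip, local_port, remote_ip, remote_port, state)."""
    fields = line.split()
    local_ip, local_port = decode_endpoint(fields[1])
    remote_ip, remote_port = decode_endpoint(fields[2])
    state = TCP_STATES.get(fields[3].upper(), "UNKNOWN")
    return local_ip, local_port, remote_ip, remote_port, state


def list_connections(path="/proc/net/tcp"):
    """Parse every entry in the file, skipping the header row."""
    with open(path) as f:
        next(f, None)
        return [parse_proc_net_tcp_line(line) for line in f if line.strip()]
```

Calling list_connections() before and after disconnecting and looking for the ssh connection (port 22) should show the same ESTABLISHED entry both times.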

From this experiment, we know that unplugging the network cable does not in itself affect the state of the TCP connection.

Next, it depends on what the two sides do after unplugging the network cable.

So this question needs to be discussed in two scenarios: one where data is transmitted after the cable is unplugged, and one where it is not.

Consider first the case where data is transmitted. After the client unplugs the network cable, packets sent by the server to the client get no response. After waiting for a certain period, the server triggers the timeout retransmission mechanism and retransmits the unacknowledged packets.

If the client plugs the network cable back in while the server is still retransmitting, then because unplugging the cable did not change the client’s TCP connection state, which is still ESTABLISHED, the client can receive the server’s packets normally and reply with an ACK.

At this point, the TCP connection between the client and the server is still there, as if nothing had happened.

However, if the client does not plug the cable back in while the server is retransmitting, then once the number of timeout retransmissions on the server reaches a certain threshold, the kernel decides that something is wrong with the TCP connection, reports the error to the application through the socket interface, and the TCP connection on the server side is torn down.

After the client plugs the cable back in, if it sends data to the server, the server no longer has a TCP connection with the same four-tuple as the client, so the server-side kernel replies with an RST packet, and the client releases its TCP connection on receiving it.

At this point, the TCP connection is gone on both the client and the server.

How many times is the TCP data message retransmitted?

Linux provides a kernel parameter called tcp_retries2 for this, and its default value is 15, as shown in the following figure:

This kernel parameter controls the maximum number of timeout retransmissions once a TCP connection has been established.
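
To check the value on your own machine, the parameter can be read from the /proc/sys tree. The helper below is a small sketch with a hypothetical name; it simply maps a dotted sysctl name to a file path and returns None when the file is not there (for example, on a non-Linux system).

```python
from pathlib import Path


def read_sysctl(name, root="/proc/sys"):
    """Read a kernel parameter such as 'net.ipv4.tcp_retries2'.

    Returns the value as a string, or None if it cannot be read.
    """
    path = Path(root, *name.split("."))
    try:
        return path.read_text().strip()
    except OSError:
        return None
```

For example, read_sysctl("net.ipv4.tcp_retries2") returns the current setting as a string on a Linux host.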

However, setting tcp_retries2 to 15 does not mean that TCP always retransmits 15 times before the application is told that the connection has failed. The kernel also makes the decision based on a “maximum timeout”.

The timeout of each round grows exponentially. For example, if the first timeout retransmission is triggered after 2s, the second is triggered after 4s, the third after 8s, and so on.

The kernel calculates a maximum timeout based on the value set by tcp_retries2.

If a packet keeps being retransmitted without any response, then as soon as either condition is met, the “maximum number of retransmissions” or the “maximum timeout”, retransmission stops and the TCP connection is torn down.
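
To get a feel for how large the maximum timeout can be, here is a rough sketch of the doubling schedule. It assumes a minimum retransmission timeout of 200 ms and a cap of 120 s per round, which are the usual Linux constants (TCP_RTO_MIN and TCP_RTO_MAX); the real starting timeout depends on the measured round-trip time, so treat the result as a model rather than a measurement. The function names are hypothetical.

```python
TCP_RTO_MIN_MS = 200       # assumed starting timeout (Linux TCP_RTO_MIN)
TCP_RTO_MAX_MS = 120_000   # assumed per-round cap (Linux TCP_RTO_MAX)


def retransmission_schedule_ms(retries=15,
                               initial_rto_ms=TCP_RTO_MIN_MS,
                               max_rto_ms=TCP_RTO_MAX_MS):
    """Waiting time of each round: doubles every round, capped at max_rto_ms.

    There are retries + 1 rounds: the wait after the original send,
    plus one wait after each of the retransmissions.
    """
    return [min(initial_rto_ms << i, max_rto_ms) for i in range(retries + 1)]


def max_timeout_ms(retries=15,
                   initial_rto_ms=TCP_RTO_MIN_MS,
                   max_rto_ms=TCP_RTO_MAX_MS):
    """Total time spent waiting before the connection is given up."""
    return sum(retransmission_schedule_ms(retries, initial_rto_ms, max_rto_ms))
```

Under these assumptions, max_timeout_ms(15) comes to 924600 ms, about 15 minutes, which is why a server may keep retransmitting for quite a while before it gives up.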

For the scenario where no data is transmitted after the network cable is unplugged, it depends on whether the TCP keepalive mechanism is enabled.

If TCP keepalive is not enabled, then after the client unplugs the network cable, as long as neither side transmits data, the TCP connection on both the client and the server will remain in place indefinitely.

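If TCP keepalive is enabled, then after the client unplugs the network cable, even if neither side transmits data, TCP will send a probe packet after a period of time. If the peer responds, the connection is still alive; if several consecutive probes go unanswered, TCP reports the connection as dead.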

Therefore, the TCP keepalive mechanism can use probe packets to determine whether the peer’s TCP connection is still alive, even when the two sides exchange no data.

What exactly does the TCP keepalive mechanism look like?

The principle of this mechanism is as follows:

Define a time period. If there is no connection-related activity during that period, the TCP keepalive mechanism kicks in and sends a probe packet at a fixed interval. Each probe carries very little data. If several consecutive probes get no response, the current TCP connection is considered dead, and the kernel reports the error to the application.

The Linux kernel has parameters for the keepalive time, the number of keepalive probes, and the interval between probes. Their default values are tcp_keepalive_time = 7200 seconds, tcp_keepalive_intvl = 75 seconds, and tcp_keepalive_probes = 9.

This means that on Linux it takes at least 2 hours, 11 minutes and 15 seconds to discover a “dead” connection: 2 hours of idle time, followed by 9 probes sent 75 seconds apart.
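
The arithmetic behind that figure can be written out as a short sketch. The default arguments are the stock Linux values for tcp_keepalive_time, tcp_keepalive_intvl and tcp_keepalive_probes; the function names are hypothetical.

```python
def keepalive_detection_seconds(idle=7200, interval=75, probes=9):
    """Minimum time to declare a silent peer dead.

    idle:     seconds without activity before the first probe
    interval: seconds between probes
    probes:   unanswered probes needed to give up
    """
    return idle + interval * probes


def as_hms(total_seconds):
    """Split a number of seconds into (hours, minutes, seconds)."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return hours, minutes, seconds
```

With the defaults, keepalive_detection_seconds() is 7200 + 75 × 9 = 7875 seconds, and as_hms(7875) gives (2, 11, 15).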

Note that an application that wants to use TCP keepalive must set the SO_KEEPALIVE option on the socket through the socket interface. If the option is not set, TCP keepalive is not used.
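
As an illustration, this is roughly what setting the option looks like from Python. SO_KEEPALIVE is the standard socket option; TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are per-socket overrides that exist on Linux but not on every platform, so the sketch only sets them when they are available. The helper name and the example values (60, 10, 3) are assumptions chosen for illustration.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, probes=3):
    """Turn on TCP keepalive for one socket.

    Without the per-socket overrides, the system-wide defaults apply.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Per-socket overrides; skipped on platforms that lack them.
    overrides = (
        ("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", probes),      # unanswered probes before giving up
    )
    for option_name, value in overrides:
        option = getattr(socket, option_name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
```

With these example values, a silent peer would be detected after about 60 + 10 × 3 = 90 seconds instead of more than two hours.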

Isn’t the TCP keepalive detection time too long?

Yes, it’s a bit long.

TCP keepalive is implemented at the TCP layer (in kernel space) and serves as a safety net for every program built on the TCP transport protocol.

In fact, the application layer can implement its own detection mechanism, one that finds out whether the peer is alive in a much shorter time.

For example, web server software typically provides a keepalive_timeout parameter that specifies the timeout for persistent HTTP connections. If the timeout is set to 60 seconds, the web server starts a timer, and if the client does not issue a new request within 60 seconds of completing the last HTTP request, the timer expires and triggers a callback that releases the connection.
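
The idea can be sketched in a few lines. This is not how any particular web server implements it; IdleTimer is a hypothetical class that only tracks when a connection was last active, and the injectable clock is there to make it easy to test.

```python
import time


class IdleTimer:
    """Tracks how long a connection has been idle."""

    def __init__(self, timeout_s=60.0, clock=time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock
        self._last_activity = clock()

    def touch(self):
        """Call whenever a request completes on the connection."""
        self._last_activity = self._clock()

    def expired(self):
        """True once the connection has been idle for the full timeout."""
        return self._clock() - self._last_activity >= self._timeout_s
```

A server loop would call touch() after each request and close any connection whose expired() returns True.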

When a client unplugs the network cable, the TCP connection state is not directly affected. So whether the TCP connection still exists afterwards depends mainly on whether any data is transmitted after the cable is unplugged.

In the case of data transfer:

After the client unplugs the network cable, if the server sends a packet and the client plugs the cable back in before the server’s retransmission count reaches the maximum, the original TCP connection between the two sides carries on normally, as if nothing had happened.

After the client unplugs the network cable, if the server sends a packet and the server’s retransmission count reaches the maximum before the client plugs the cable back in, the server tears down the TCP connection. When the client later plugs the cable back in and sends data to the server, the server no longer has a TCP connection with the same four-tuple as the client, so it replies with an RST packet, and the client tears down its TCP connection on receiving it. At this point, the connection is gone on both sides.

In the absence of data transfer:

If TCP keepalive is not enabled on either side, then as long as the client does not plug the cable back in, the TCP connection on both sides remains in place indefinitely.

If TCP keepalive is enabled, the probes will go unanswered while the cable is unplugged, and once several consecutive probes have failed the connection is torn down. If the client plugs the cable back in while probing is still under way, the original TCP connection carries on normally.

Besides the client unplugging the network cable, there are two other scenarios: the client host going down, and the client process being killed.

In the first scenario, a client host going down is just like unplugging the network cable: the server cannot perceive it. So if no data is transmitted and TCP keepalive is not enabled, the TCP connection on the server side stays in the ESTABLISHED state until the server process is restarted.

This tells us something: when neither side uses TCP keepalive and no data is being transmitted, one side’s TCP connection being in the ESTABLISHED state does not mean that the other side’s TCP connection is still healthy.

In the second scenario, after the client process is killed, the client’s kernel sends a FIN packet to the server and completes the four-way close handshake with the server.

Therefore, even if TCP keepalive is not enabled and the two sides exchange no data, when the process on one side crashes, the operating system is aware of it, so it sends a FIN packet to the peer and then carries out the four-way close handshake with it.
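
This behavior is easy to observe on a single machine. The sketch below, with hypothetical names, starts a child process that connects to a local listener over the loopback interface and then sleeps; the parent kills the child and reads from the accepted connection. It assumes a POSIX system where loopback connections are allowed. A read that returns empty bytes means the peer’s kernel closed the connection in an orderly way, that is, a FIN arrived.

```python
import socket
import subprocess
import sys

# Code run by the child process: connect, then do nothing.
CHILD_CODE = (
    "import socket, sys, time\n"
    "s = socket.create_connection(('127.0.0.1', int(sys.argv[1])))\n"
    "time.sleep(60)\n"
)


def read_after_killing_peer():
    """Kill a connected client process and return what the server reads."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))   # let the OS pick a free port
        listener.listen(1)
        listener.settimeout(10)
        port = listener.getsockname()[1]

        child = subprocess.Popen([sys.executable, "-c", CHILD_CODE, str(port)])
        try:
            conn, _ = listener.accept()
            with conn:
                conn.settimeout(10)
                child.kill()              # the process never calls close()
                child.wait()
                return conn.recv(1)       # b"" means the peer sent FIN
        finally:
            if child.poll() is None:
                child.kill()
                child.wait()
```

Even though the child never closes its socket, read_after_killing_peer() returns b"", because the child’s kernel closes the connection on the process’s behalf.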

Finish!