123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293 |
- Some less-widely known details of TCP connections.
- Properly closing the connection.
- After this code sequence:
- sock = socket(AF_INET, SOCK_STREAM, 0);
- connect(sock, &remote, sizeof(remote));
- write(sock, buffer, 1000000);
- a large block of data is only buffered by kernel, it can't be sent all at once.
- What will happen if we close the socket?
- "A host MAY implement a 'half-duplex' TCP close sequence, so that
- an application that has called close() cannot continue to read
- data from the connection. If such a host issues a close() call
- while received data is still pending in TCP, or if new data is
- received after close() is called, its TCP SHOULD send a RST
- to show that data was lost."
- IOW: if we just close(sock) now, kernel can reset the TCP connection
- (send RST packet).
- This is problematic for two reasons: it discards some not-yet sent
- data, and it may be reported as error, not EOF, on peer's side.
- What can be done about it?
- Solution #1: block until sending is done:
- /* When enabled, a close(2) or shutdown(2) will not return until
- * all queued messages for the socket have been successfully sent
- * or the linger timeout has been reached.
- */
- struct linger {
- int l_onoff; /* linger active */
- int l_linger; /* how many seconds to linger for */
- } linger;
- linger.l_onoff = 1;
- linger.l_linger = SOME_NUM;
- setsockopt(sock, SOL_SOCKET, SO_LINGER, &linger, sizeof(linger));
- close(sock);
- Solution #2: tell kernel that you are done sending.
- This makes kernel send FIN after all data is written:
- shutdown(sock, SHUT_WR);
- close(sock);
- However, experiments on Linux 3.9.4 show that kernel can return from
- shutdown() and from close() before all data is sent,
- and if peer sends any data to us after this, kernel still responds with
- RST before all our data is sent.
- In practice the protocol in use often does not allow peer to send
- such data to us, in which case this solution is acceptable.
- Solution #3: if you know that peer is going to close its end after it sees
- our FIN (as EOF), it might be a good idea to perform a read after shutdown().
- When read finishes with 0-sized result, we conclude that peer received all
- the data, saw EOF, and closed its end.
- However, this incurs small performance penalty (we run for a longer time)
- and requires safeguards (nonblocking reads, timeouts etc) against
- malicious peers which don't close the connection.
- Solutions #1 and #2 can be combined:
- /* ...set up struct linger... then: */
- setsockopt(sock, SOL_SOCKET, SO_LINGER, &linger, sizeof(linger));
- shutdown(sock, SHUT_WR);
- /* At this point, kernel sent FIN packet, not RST, to the peer, */
- /* even if there is buffered read data from the peer. */
- close(sock);
- Defeating Nagle.
- Method #1: manually control whether partial sends are allowed:
- This prevents partially filled packets being sent:
- int state = 1;
- setsockopt(fd, IPPROTO_TCP, TCP_CORK, &state, sizeof(state));
- and this forces last, partially filled packet (if any) to be sent:
- int state = 0;
- setsockopt(fd, IPPROTO_TCP, TCP_CORK, &state, sizeof(state));
- Method #2: make any write to immediately send data, even if it's partial:
- int state = 1;
- setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &state, sizeof(state));
|