Don't clear connectTimeout until ConnectResponse is received #57

arekinath · 2017-02-14T01:35:13Z

We should not clear the connectTimeout (in ConnectionManager) until after we receive the ConnectResponse. Currently we clear it as soon as we get the TCP 'connect' event. That leaves a window, from the moment the TCP connection emits 'connect' and we send the ConnectRequest until the ConnectResponse comes back, in which no timeout applies at all and we make no further attempt to talk to the socket.
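A minimal sketch of the proposed ordering, for illustration only. `armConnectTimeout`, `handshakeDone`, and the `guard` object are hypothetical names, not the library's actual API; the socket is assumed to be any `EventEmitter` that emits 'connect' the way a `net.Socket` does. The point is only that the TCP 'connect' event leaves the timer running, and the timer is cleared from the ConnectResponse handler instead.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical helper (not the real ConnectionManager code): arms a connect
// timer that survives the TCP 'connect' event and is cleared only when the
// caller reports that the ConnectResponse has arrived.
function armConnectTimeout(socket, timeoutMs, onTimeout) {
  const guard = { armed: true, connected: false };

  const timer = setTimeout(() => {
    guard.armed = false;
    onTimeout(new Error('no ConnectResponse within ' + timeoutMs + ' ms'));
  }, timeoutMs);

  // TCP is up, but the ZK handshake is not finished yet: note the event and
  // deliberately leave the timer running.
  socket.once('connect', () => {
    guard.connected = true;
  });

  // To be called from the ConnectResponse handler (assumed to exist in the
  // caller). Only at this point is the connect timer cleared.
  guard.handshakeDone = () => {
    clearTimeout(timer);
    guard.armed = false;
  };

  return guard;
}

// Usage with a stand-in socket (an assumption for the demo; a real caller
// would pass its net.Socket).
const demoSocket = new EventEmitter();
const demoGuard = armConnectTimeout(demoSocket, 5000, (err) => {
  console.error(err.message); // a real caller would tear down and reconnect
});
demoSocket.emit('connect');   // timer is still armed after this
demoGuard.handshakeDone();    // ConnectResponse received: timer cleared
```

With this ordering, a netsplit that starts after 'connect' but before the ConnectResponse still ends in the timeout callback, so the caller gets a chance to reconnect rather than waiting on a socket that will never emit anything.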

If we get netsplit from ZK during such a window (and stay netsplit long enough both to outlast the ZK session timeout and to miss all of the ZK host OS's FIN retransmits, which may take only a few minutes), we may hang in the CONNECTING state indefinitely, never reconnecting.

Since the OS on the ZK side has given up on the socket and forgotten about it (its FINs/RSTs never got through during the netsplit), the only way our local OS could find out about the issue is if we tried to send data (the remote side would then send an RST back). We will not do that, since we haven't received the ConnectResponse yet or set up the ping timeout. So we will never receive any kind of 'error' event from this socket, and no 'timeout' either, since we only set one up after the ConnectResponse.

This window of time may sound small, but sometimes a heavily loaded ZK (such as when it's in the middle of dealing with, say, a netsplit in the cluster) can take quite a long time between when its host OS ACKs the ConnectRequest we sent, and when it then tries to send a ConnectResponse. We've seen windows in prod here of up to almost 1 sec.

We've hit this bug in prod repeatedly during widespread network events (e.g. switch fabric reconfiguration). I can also reproduce it reliably in the lab by making the ZK connect sequence artificially slow (adding a sleep) and starting a netsplit at the right time.

