mirror of
https://code.videolan.org/rist/librist.git
synced 2026-07-04 15:06:53 +00:00
cadb6c4aa4
A bonded caller-sender leg that lost and regained connectivity on the same
source tuple -- an interface flap, or an upstream outage that resumes
without a reconnect -- stayed wedged and never rejoined the bond without an
operator restart.

While the leg is silent the far-end authenticator times out and purges its
session. The leg keeps its miface-bound UDP socket (a flap does not close
it) and stays in EAP SUCCESS, so it keeps streaming data the authenticator
now drops while it waits for an EAPOL START the caller never sends, because
EAP is authenticator-driven and the caller believes it is still
authenticated.

- Extend try_caller_socket_rebind to sender-mode callers: a sender leg
  silent past session_timeout resets its EAP context
  (eap_reset_authenticatee) and re-drives the SRP handshake on the existing
  socket. It does not rebind -- the miface-bound socket is still valid and
  rebinding would move the source tuple the far end expects.

- Fold the leg back into the weighted bond once it re-authenticates.
  Recovery de-authenticates the leg (so it leaves the sender balancing
  rotation while down) and rewinds eap_authentication_state, so the "EAP
  Authentication succeeded" transition fires again on re-auth and restores
  the connection-level authenticated flag via rist_peer_authenticate.
  Without this the leg re-authenticates but is left out of balancing and
  carries only NACK retransmits instead of its share.

- Only SRP sender legs need this; plaintext/PSK senders have no such
  deadlock and recover via normal reconnect, so they are left untouched.

Reproduced and verified with a bonded advanced-profile ristsender over two
netns/veth legs to an SRP listener: one leg is silenced with 100% packet
loss both ways (tc netem) while its interface stays up, so the socket
persists on the same source tuple.
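
The recovery rule described above can be modelled in a few lines of C. This
is only an illustrative sketch, not the librist code: struct leg, enum
eap_state, leg_check_recovery and leg_on_eap_success are hypothetical names,
and the real logic lives in try_caller_socket_rebind and the EAP code. It
assumes "silent" means nothing has been received from the far end for longer
than session_timeout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; these are not librist types. */
enum eap_state { EAP_IDLE, EAP_IN_PROGRESS, EAP_SUCCESS };

struct leg {
    bool is_sender_caller; /* caller running in sender mode */
    bool uses_srp;         /* SRP (EAP) leg, as opposed to plaintext/PSK */
    bool authenticated;    /* connection-level flag; gates bond balancing */
    enum eap_state eap;    /* authenticatee-side handshake state */
    uint64_t last_rx_ms;   /* last time anything arrived from the far end */
};

/*
 * Decide whether a leg needs recovery. On a hit the leg is de-authenticated
 * and its EAP state rewound so the handshake runs again. No socket is
 * touched here: the model mirrors the "do not rebind" rule, so the source
 * tuple the far end expects stays the same.
 */
static bool leg_check_recovery(struct leg *l, uint64_t now_ms,
                               uint64_t session_timeout_ms)
{
    assert(l != NULL);
    if (!l->is_sender_caller || !l->uses_srp)
        return false; /* plaintext/PSK legs recover via normal reconnect */
    if (l->eap != EAP_SUCCESS)
        return false; /* a handshake is already being driven */
    if (now_ms < l->last_rx_ms ||
        now_ms - l->last_rx_ms <= session_timeout_ms)
        return false; /* far end has been heard from recently enough */

    l->authenticated = false; /* leave the balancing rotation while down */
    l->eap = EAP_IN_PROGRESS; /* rewound: the success edge can fire again */
    return true;
}

/*
 * Called when the handshake completes. The connection-level flag is only
 * restored on the transition into EAP_SUCCESS, which is why recovery has
 * to rewind the state first.
 */
static bool leg_on_eap_success(struct leg *l, uint64_t now_ms)
{
    assert(l != NULL);
    if (l->eap == EAP_SUCCESS)
        return false; /* no transition, flag is left as it was */
    l->eap = EAP_SUCCESS;
    l->authenticated = true;
    l->last_rx_ms = now_ms;
    return true;
}
```

In this model, skipping the rewind in leg_check_recovery would make
leg_on_eap_success return false on re-auth, leaving authenticated unset,
which corresponds to the "re-authenticates but is left out of balancing"
symptom.
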
Before the fix the returning leg never re-authenticates and the listener
floods "handshake is still pending"; with re-auth alone it authenticates
but the sender balances over only the surviving leg; after the full fix the
returning leg re-authenticates and resumes carrying its full weighted share
(verified on a restored zero-loss link, matching a plaintext bond).

Added as test/rist/test_bonded_leg_flap_netns.sh (meson "netns" suite),
which asserts both re-authentication and reintegration; it needs Linux
root + netns/tc and cleanly skips (exit 77) otherwise. An in-process
loopback test cannot reproduce the wedge because a loopback leg has no
miface binding and self-heals with a new-port handshake.
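
To make "full weighted share" concrete, the sketch below shows how a leg's
authenticated flag gates its place in a weighted rotation. It is a toy model
under stated assumptions: struct bond_leg, bond_pick and bond_send are
hypothetical, and the scheduler is a generic smooth weighted round-robin
chosen for illustration, not necessarily the balancing algorithm librist
uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bond member; not a librist type. */
struct bond_leg {
    uint32_t weight;    /* configured share of the traffic */
    bool authenticated; /* only authenticated legs take part in balancing */
    int64_t credit;     /* running credit for smooth weighted round-robin */
    uint64_t sent;      /* packets assigned to this leg so far */
};

/*
 * Pick the leg for the next packet. Each eligible leg earns its weight in
 * credit, the leg with the most credit wins and pays back the total weight.
 * Returns the chosen index, or -1 when no leg is eligible.
 */
static int bond_pick(struct bond_leg *legs, size_t n)
{
    int64_t total = 0;
    int best = -1;

    assert(legs != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!legs[i].authenticated || legs[i].weight == 0)
            continue; /* a de-authenticated leg is skipped entirely */
        legs[i].credit += (int64_t)legs[i].weight;
        total += (int64_t)legs[i].weight;
        if (best < 0 || legs[i].credit > legs[best].credit)
            best = (int)i;
    }
    if (best < 0)
        return -1;
    legs[best].credit -= total;
    legs[best].sent++;
    return best;
}

/* Assign a burst of packets across the bond. */
static void bond_send(struct bond_leg *legs, size_t n, unsigned packets)
{
    for (unsigned p = 0; p < packets; p++)
        (void)bond_pick(legs, n);
}
```

With weights 2 and 1 and both legs authenticated, 300 packets split 200/100
in this model. Clearing authenticated on the second leg sends everything to
the first, and setting it again restores the 2:1 split. That is the
behaviour the netns test checks for on the real sender after the returning
leg re-authenticates.
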