librist_sender04

Author	SHA1	Message	Date
roman	56b7c3bff5	Update tools/meson.build	2026-05-24 16:43:27 +00:00
roman	467841d071	Add tools/moo-ristsender.c	2026-05-24 16:42:30 +00:00
Sergio Ammirata	812fd9dd8f	Merge branch 'feat/pmtu-safety-net' into 'master' udpsocket: set IP don't-fragment on every UDP socket See merge request rist/librist!323	2026-05-24 08:00:38 +00:00
Sergio Ammirata	00ee7c4027	udp,gre: log actionable PMTU hint on EMSGSIZE When a send fails with EMSGSIZE, log the packet size and suggest reducing the application payload. Rate-limited to one message per peer per 5 seconds. Both simple-profile and main-profile send paths use the same helper.	2026-05-24 03:56:06 -04:00
Sergio Ammirata	0b8d3d849a	udpsocket: set IP don't-fragment on every UDP socket Set DF so the kernel returns EMSGSIZE instead of silently IP-fragmenting. Covers Linux, Windows, and BSD for v4/v6. Best-effort: platforms without the option keep legacy behaviour.	2026-05-24 03:55:59 -04:00
Sergio Ammirata	64d511a275	Merge branch 'fix/ristsender-rtp-seq-ts-uncomment' into 'master' fix(ristsender): enable --rtp-sequence and --rtp-timestamp extraction See merge request rist/librist!322	2026-05-23 19:58:25 +00:00
Sergio Ammirata	aa141517a3	fix(ristsender): enable --rtp-sequence and --rtp-timestamp extraction The RTP seq/timestamp extraction in input_udp_recv was disabled since `be32351` (2020) with a "TODO: Figure out why this does not work" comment. The root cause was a buffer offset bug: recvfrom writes the UDP payload at recv_buf + ipheader_bytes, but the extraction code read recv_buf[2..7] — the reserved IP-header prefix area — returning garbage.	2026-05-23 15:54:49 -04:00
Sergio Ammirata	88467c0e4e	Merge branch 'security/audit2-followup' into 'master' Security follow-ups to v0.2.15 (audit, part 2) See merge request rist/librist!319	2026-05-18 17:23:25 +00:00
Sergio Ammirata	91d666fb66	NEWS: document audit2 follow-up fixes for pre-v0.2.16 Add a "Post-0.2.15 audit follow-ups" subsection under the existing 0.2.16 hardening notes. Groups the changes by component (receiver flow accounting, EAP authenticator, PSK, SRP, stats JSON, build portability) so a downstream reader can scan to the bit they care about.	2026-05-18 12:52:49 -04:00
Sergio Ammirata	df2834f906	build(srp): drop dead !USE_SHA_RET branch, require mbedTLS >= 2.7 `a66b7bf` fixed the !USE_SHA_RET branch so it would compile against mbedTLS versions prior to 2.7 (March 2018), where the SHA-256 calls returned void and could not signal an internal allocation failure back to the caller. The branch is otherwise dead: the bundled mbedTLS in contrib/ is 2.28.x, every distro of interest is well past 2.7, and no CI job builds against anything older. A latent bug regressing into that branch would never be caught. Drop the branch, emit an explicit #error if anyone builds against mbedTLS < 2.7 with directions to either upgrade their system mbedTLS or use the bundled tree. Surface area down, untestable dead code gone. Nettle has always returned int and is unaffected. Found by Thomas Guillem during the post-v0.2.15 re-audit (audit2/info-02).	2026-05-18 12:45:42 -04:00
Sergio Ammirata	feabe74762	fix(stats): add schema_version field to JSON payload `86e2a16` changed the sender-stats JSON shape from duplicate "peer" keys (technically valid, practically broken) to a single "peers" array. The change is correct, but it was made without any way for downstream consumers to know which shape they're parsing. A monitoring script that did .["sender-stats"].peer.id silently started getting null after the upgrade. The prometheus exporter shipped inside the tools tree was rewritten for the new shape (`3ea6c49`), so a librist sender on the new tree paired with an external exporter pinned to the old shape sees no peers at all, again silently. Add a top-level "schema_version" integer to the three places we emit stats JSON (sender flow stats, sender per-peer wrapper, receiver flow stats). Set it to 2 - version 1 was the duplicate-key shape, version 2 is the array shape we ship now. Receiver stats has always been array-shaped, but tagging it lets consumers branch on a single field regardless of which side they're parsing. Any future incompatible shape change must bump this number and call it out in NEWS. Found by Thomas Guillem during the post-v0.2.15 re-audit (audit2/medium-04).	2026-05-18 12:44:47 -04:00
Sergio Ammirata	88c1bcc446	fix(srp): reserve former NG_512 / NG_768 enum slots to avoid silent ABI shift `eefe26e` dropped the sub-1024-bit RFC 5054 groups (correct call - they're below current minimum-strength thresholds) but did so by deleting the enum entries outright, which shifted every remaining group's integer value down by two. librist_get_ng_constants is RIST_API-exported and external callers that hardcoded the integer 2 to mean NG_1024 silently started getting NG_4096 instead. Same idea, different group, equally silent and equally wrong. Restore the original integer values by reserving slots 0 and 1 with explicit LIBRIST_SRP_NG_RESERVED_0 / RESERVED_1 names, and make librist_get_ng_constants reject them with -1 (the existing out-of-range error). Code using the symbolic enum names is unaffected; code using literal integers gets either the right group or a clean failure. Found by Thomas Guillem during the post-v0.2.15 re-audit (audit2/low-05).	2026-05-18 12:43:52 -04:00
Sergio Ammirata	ace6eab94e	fix(stats): NUL-terminate sender peer cname explicitly strncpy does not guarantee NUL termination when the source is at least n bytes long. The defensive cname[0] = '\0' before the copy in rist_stats_sender_peer_stats did nothing - strncpy overwrites it. Today the bound is safe in practice (the SDES bounds fix in `c308e10` limits peer->receiver_name to 127 bytes + NUL within its 128-byte field, so the implicit NUL is copied) but the contract is fragile and is the same pattern the udpsocket fix in `38cb58f` already excised elsewhere. Use a copy length one byte shy of the destination and write the terminator unconditionally. Found by Thomas Guillem during the post-v0.2.15 re-audit (audit2/low-03).	2026-05-18 12:43:14 -04:00
Sergio Ammirata	859224faa3	fix(flow): make the RIST_MAX_FLOWS cap race-free rist_receiver_associate_flow counted existing flows under flows_lock, released the lock, called create_flow, and create_flow re-took the lock to insert. Two concurrent callers could each see count == RIST_MAX_FLOWS - 1, each release the lock, and each insert - pushing the actual flow count past the cap by (racers - 1). The practical impact was small because the cap still bounds memory growth to O(N + racers), but the documented invariant "at most RIST_MAX_FLOWS flows" did not hold under concurrency. Move the authoritative count + cap check into create_flow under the same flows_lock acquisition that does the append. The early count in associate_flow stays as an allocation-avoidance optimization (skip the calloc + pthread_init dance when the cap is already definitively reached), but is no longer relied on for correctness. If the early count is stale and create_flow trips the cap, we tear down the freshly-allocated flow cleanly and return NULL - the caller's existing NULL handling kicks in. Move the RIST_MAX_FLOWS / RIST_MAX_PEERS_PER_FLOW defines above create_flow so the cap check sees them. Found by Thomas Guillem during the post-v0.2.15 re-audit (audit2/low-02).	2026-05-18 12:42:55 -04:00
Sergio Ammirata	d5c1ccc1e9	fix(psk): drop the AES-128 fallthrough in the nettle session dispatch The session-path AES-CTR dispatch carried the same default-falls- through-to-AES-128 shape that audit/high-13 spotted on the standalone _librist_crypto_aes_ctr. `9e3703e` fixed the standalone function but missed the session sibling. Today it is not reachable (key_size always comes from SRP and is 128/192/256) but the pattern is a one-typo regression away - a future caller that passes key_size = 0 (e.g. a use-before-init) would silently run AES-128 over an uninitialized key schedule. Replicate the fixed shape: explicit cases for 128 / 192 / 256, and return on default rather than encrypting under whatever key happens to be in key->nettle_ctx.u.ctx128. Found by Thomas Guillem during the post-v0.2.15 re-audit (audit2/low-01).	2026-05-18 12:41:47 -04:00
Sergio Ammirata	0501b30d22	fix(psk): skip PBKDF2 work for nonces while the key is locked out _librist_crypto_psk_decrypt ran a full PBKDF2 + AES re-keying on every attacker-chosen GRE nonce, even after the key had been locked out for excessive decryption failures. `1c1ae12` correctly closed the audit's lockout-reset path by preserving the bad_decryption flag across nonce rolls, but the PBKDF2 itself still ran on every packet with a new nonce. Each iteration costs a few milliseconds of CPU against an attacker that needs to flip four bytes of GRE header per packet, so a modest packet rate could pin the receiver CPU. Move the lockout test before the rekey: while bad_decryption is set, attacker-supplied nonces are ignored without doing any work. The flag is still only cleared by passphrase rotation or by a successful decrypt + RTP validation upstream, so the high-15 fix property is preserved. Found by Thomas Guillem during the post-v0.2.15 re-audit (audit2/medium-01).	2026-05-18 12:41:26 -04:00
Sergio Ammirata	8ae996c0a7	fix(flow): align peer_lst lock discipline on f->mutex, cap peers per flow flow->peer_lst is reallocated under f->mutex on the add path (rist_receiver_associate_flow), but several other writers and readers were not taking the same lock: - remove_peer_from_flow took no lock at all on the shrink path. - send_nack_group took peerlist_lock (not f->mutex) around the peer_lst walk, so a concurrent realloc-move from the protocol thread left it walking a freed array. Reproduced under TSan and under ASan with audit2/high-01/poc-h1-race-harness.c. - The session-timeout deletion path (rist_receiver_pthread_protocol) walked peer_lst under peerlist_lock only after dropping f->mutex, same class of race. Take f->mutex everywhere peer_lst is read or written. Lock order is peerlist_lock outside, f->mutex inside, matching the existing convention elsewhere in the file. Both shrink call-sites (remove_peer_from_flow and the inline writer in rist_peer_remove) now also check the realloc return: a shrink that returns NULL legitimately means "could not move" and we keep the oversized allocation rather than overwriting peer_lst with NULL while peer_lst_len is still positive. Add RIST_MAX_PEERS_PER_FLOW = 256, checked under f->mutex in rist_receiver_associate_flow. peer_lst grew unbounded before this; 256 is generous for any legitimate multipath deployment and bounds the worst case for both attack and misconfiguration. Found by Thomas Guillem during the post-v0.2.15 re-audit (audit2/high-01).	2026-05-18 12:40:59 -04:00
Sergio Ammirata	91da8b5e06	fix(crypto): stop poking at mbedTLS internal entropy struct fields on Windows The `b05ff88` Windows CSPRNG fix reached into mbedtls_entropy_context's internal source_count and source[] fields to drop the default CryptGenRandom source and substitute BCryptGenRandom. Those fields are not part of the mbedTLS public API. mbedTLS 3.x has started hiding internal struct members behind MBEDTLS_PRIVATE() and reserves the right to reorder the layout, so the existing approach will fail to compile (or, worse, compile but read wrong offsets) on any distro that builds librist against system mbedTLS 3.x. Rework the Windows path to bypass the entropy context entirely: feed BCryptGenRandom directly into mbedtls_ctr_drbg_seed via the f_rng interface, which is part of the stable public API and does not care how the random bytes are produced. The entropy context is now declared and initialised only on non-Windows; the seed-failure plumbing (ctr_drbg_seed_ret, the fail-closed property in _librist_crypto_ramdom_get_bytes) is unchanged. Bundled contrib mbedtls 2.28.10 is unaffected, but this removes a build hazard for downstream packagers building against system mbedtls 3.x. Found by Thomas Guillem during the post-v0.2.15 re-audit (audit2/info-03).	2026-05-18 12:38:39 -04:00
Sergio Ammirata	0089625aa2	fix(crypto): propagate CSPRNG failure through the nettle SRP wrapper _librist_srp_nettle_wrap_random is the random-bytes callback nettle invokes for the SRP private exponents (a on the authenticatee side, b on the authenticator side) and for the verifier salt. Its signature is void, so when _librist_crypto_ramdom_get_bytes failed the wrapper used to discard the error and leave the destination buffer at whatever was on the stack. mpz_import then read those stack bytes as the SRP private exponent. On a build using the nettle backend during a sustained entropy outage that meant the SRP session key could be derived from uninitialized memory. `8f22b8e` made the mbedTLS path fail-closed because the mbedTLS wrapper returns int and propagates the error naturally. This commit applies the equivalent property to the nettle path: thread the caller's `int ret` through the otherwise-unused void* context arg of nettle_mpz_random / nettle_mpz_random_size, and have the wrapper write -1 plus zero the buffer on CSPRNG failure. The existing `if (ret != 0)` checks at the two BIGNUM_RANDOM call sites and at the salt-generation site then abort the SRP operation cleanly. The buffer-zero on failure is defense in depth: even if a caller were added that ignored ret, the bignum would be zero rather than attacker-influenced stack, and SRP's mod-N checks already reject 0 as A or B. Found by Thomas Guillem during the post-v0.2.15 re-audit (audit2/high-05).	2026-05-18 12:37:40 -04:00
Sergio Ammirata	4e8c08e64e	fix(crypto): fail PSK encrypt/decrypt closed when CSPRNG is unavailable prand_u32 silently substituted (uint32_t)timestampNTP_u64() when the CSPRNG returned non-zero, and the PSK GRE nonce generator (_librist_crypto_psk_generate_nonce) consumed that fallback as the PBKDF2 salt. The result was a 32-bit wall-clock-truncated nonce feeding key derivation whenever the entropy source failed - predictable to anyone who could observe wire timing, and prone to nonce collisions across restarts that would reuse the AES-CTR keystream. `8f22b8e` was supposed to make the crypto stack fail-closed on entropy failure; this commit closes the prand_u32 hole that defeated that property. Split the API so the fallback only lives where it is genuinely benign: - prand_u32() keeps the wall-clock fallback. Used for SSRC, flow-id, and peer-id where unpredictability is not a security property and the caller just needs "some non-zero u32". - _librist_crypto_random_u32(uint32_t *out) is the new fail-closed sibling. Returns the underlying mbedTLS / gnutls error on failure, leaves out untouched, and is the only entry point used by security-critical sites. Route the PSK nonce generator through the fail-closed call and add a csprng_failed flag on struct rist_key: when set, both _librist_crypto_psk_encrypt and _librist_crypto_psk_decrypt short-circuit. The encrypt path zeroes the output so a caller that ignores the lockout never emits ciphertext under a weak key. The flag is cleared on the next successful passphrase install (a fresh nonce attempt is then made). Found by Thomas Guillem during the post-v0.2.15 re-audit (audit2/high-03).	2026-05-18 12:36:51 -04:00
Sergio Ammirata	fceba02731	fix(eap): cap attacker-supplied SRP modulus and generator lengths process_eap_request_srp_challenge took the bignum lengths straight from the wire and only bounded them against the remaining EAPOL frame size (~10 KB after `d71d96e`). mbedtls_mpi_read_binary plus the subsequent exp_mod / inv_mod are super-linear in input size, so a single CHALLENGE with a ~10 KB N pinned the CPU pre-auth on the authenticatee side without ever completing the handshake. Add an explicit cap at EAP_MAX_MODULUS_BYTES (1024 = 8192 bits, the largest RFC 5054 group librist supports). Anything beyond that is neither legitimate SRP nor decodable into a usable group, and is rejected with EAP_LENERR. `eefe26e` already removed the sub-1024-bit groups; this is the corresponding upper bound. Found by Thomas Guillem during the post-v0.2.15 re-audit (audit2/medium-03).	2026-05-18 12:34:48 -04:00
Sergio Ammirata	985cde4c6b	fix(eap): harden REQUEST_IDENTITY handler against pre-auth abuse Two small follow-ups on process_eap_request_identity surfaced by the audit2 review: 1. Role gate (audit2/low-04). The handler ran on any role, so an authenticator that received a spoofed IDENTITY REQUEST would echo whatever happens to be in config.username (typically empty on a server peer) and burn an eap_reset_data() cycle. The process_eap_response_identity counterpart and the SRP REQUEST dispatch already carry an authenticator-only / authenticatee-only gate; this is the same pattern applied to the IDENTITY REQUEST. 2. Rate-limit on the pre-auth reply path (audit2/medium-02). `b2915a7` already blocks IDENTITY REQUEST once the peer is SUCCESS, but pre-auth the original behaviour was unchanged: one EAPOL frame from an attacker wiped any in-flight SRP state and echoed the configured SRP username back to the spoof source. The username is operationally sensitive (often the production SRP account name) and the wipe is a free DoS on the in-flight handshake. Cap to one reply per EAP_IDENTITY_REPLY_INTERVAL (200 ms). Legitimate authenticator retransmits run at EAP_AUTH_TIMEOUT (500 ms) so the real handshake is unaffected; a flooded peer leaks the username at most 5x/s rather than at line rate. Found by Thomas Guillem during the post-v0.2.15 re-audit.	2026-05-18 12:34:12 -04:00
Sergio Ammirata	bdbe4fb123	fix(eap): gate FAILURE on in-flight identifier; recover from soft FAILED state EAP_CODE_FAILURE was honored regardless of which identifier it carried, so four spoofed FAILURE packets to a peer's listening UDP port were enough to bump tries past EAP_AUTH_RETRY_MAX, set the state to EAP_AUTH_STATE_FAILED, and permanently silence the peer on EAP-SRP until the process restarted. `d71d96e` bounded the reset storm but left the terminal state reachable from any unauthenticated source. Two changes (audit2/high-02): 1. Identifier gating on the FAILURE handler. A legitimate FAILURE always carries the identifier of the in-flight exchange. To make that property usable on both sides we also set last_identifier from the incoming REQUEST in process_eap_request — previously the field was maintained only on outgoing REQUESTs and the authenticatee side had no notion of "current exchange identifier". An off-path attacker now has at best a 1/256 chance per spoof of landing a matching identifier; mismatches are dropped silently with no state mutation and no tries++. 2. Recovery from soft-FAILED. The eap_periodic path now watches for EAP_AUTH_STATE_FAILED entered via the increment path (tries past the retry max but below the permanent sentinel set by SRP-failure paths) and resets to UNAUTH after EAP_AUTH_FAILED_RECOVERY ms of quiet. A new failed_state_timestamp field carries the recovery clock; it is cleared on every SUCCESS / LOGOFF transition. Together these reduce the single-direction DoS from "trivial with four packets, permanent until restart" to "needs to guess the identifier and sustain the spoof faster than 30 s of quiet". On-path attackers who can observe the identifier still win the spoof, but recovery means they cannot make the silence permanent without continuous traffic. Found by Thomas Guillem during the post-v0.2.15 re-audit.	2026-05-18 12:33:02 -04:00
Sergio Ammirata	2f1cd95368	fix(eap): refuse off-path LOGOFF on authenticated peers; saturate tries counter Two issues in eap_process_eapol's EAPOL_TYPE_LOGOFF case that together let an off-path attacker tear down or destabilize an established EAP-SRP session, surfaced by the audit2 follow-up (high-04). 1. A spoofed LOGOFF would unconditionally drop an authenticated peer to UNAUTH state. EAPOL frames carry no origin authentication, so a single packet to the listening UDP port was enough to deauth a peer. Mirror the `b2915a7` guard already applied to REQUEST_IDENTITY: once the peer is SUCCESS or REAUTH, refuse the LOGOFF as EAP_UNEXPECTEDREQUEST. Re-auth is driven by the timers in eap_periodic, never by an attacker-supplied LOGOFF. 2. On the legitimate (pre-auth) LOGOFF path, the retry counter was left untouched. Combined with the 255-sentinel that several SRP-failure paths set on ctx->tries, a LOGOFF could move the peer out of FAILED while tries was still primed at 255, and the next spoofed FAILURE would wrap (uint8_t 255 + 1 = 0), bypassing the FAILED-state gate at process_eap_pkt and putting the attacker back in business. Reset tries=0 on the legitimate LOGOFF, and widen tries from uint8_t to unsigned int with a saturating eap_tries_inc() helper so the wrap primitive is closed for good. The 255 sentinel is replaced by EAP_AUTH_TRIES_PERMANENT = UINT_MAX, which is a fixed point under the saturating increment. Found by Thomas Guillem during the post-v0.2.15 re-audit.	2026-05-18 12:32:16 -04:00
Sergio Ammirata	f9176ed0af	Merge branch 'tests/ci-modernization' into 'master' ci+build: JUnit on Linux, link threads dep for ristsrppasswd Closes #210 See merge request rist/librist!318	2026-05-17 22:34:34 +00:00
Sergio Ammirata	4e294b56bc	NEWS: document the Windows CSPRNG fix and CI improvements Bug fix: BCryptGenRandom-backed seeding unblocks encryption on Windows runtimes where the legacy CryptoAPI is missing (wine, sandboxed containers). Test/CI: SRP and AES suites now run on Windows under wine, and per-test results are visible directly in the GitLab MR UI via JUnit.	2026-05-17 18:30:13 -04:00
Sergio Ammirata	526b13585c	ci: publish JUnit test reports from test-ubuntu Meson writes testlog.junit.xml alongside testlog.txt on every test run. Point GitLab at it so reviewers see per-test pass/fail directly in the MR's "Tests" tab instead of fishing through job logs. The text log is still uploaded as a fallback.	2026-05-17 18:30:13 -04:00
Sergio Ammirata	2a1652cd72	ci: build the Windows test binaries with mbedTLS The Windows cross build was using -Duse_mbedtls=false, which silently disabled the SRP and AES test suites under wine. Build with the same crypto configuration as the released binaries so test-win64 actually exercises those code paths. Coverage goes from 19/27 to 27/27 tests.	2026-05-17 18:30:13 -04:00
Sergio Ammirata	3bacd5f180	build: link the threads dependency into ristsrppasswd ristsrppasswd compiles in contrib/pthread-shim.c, which calls pthread_cond_timedwait under have_mingw_pthreads. The other tools inherit the threads dependency transitively via prometheus and microhttpd; ristsrppasswd does not, and so fails to link on Windows with `undefined reference to pthread_cond_timedwait` as soon as mbedTLS is enabled in the cross build. Add threads to its dependency list. The variable resolves to [] on platforms that don't need it, so the change is a no-op elsewhere.	2026-05-17 18:30:13 -04:00
Sergio Ammirata	b05ff88388	fix(crypto): seed CSPRNG via BCryptGenRandom on Windows mbedTLS auto-installs an entropy source on Windows that uses the legacy CryptoAPI (CryptAcquireContext + CryptGenRandom). It silently depends on the RSA crypto provider being installed and reachable, which is not the case in wine, in several common container images, or on some hardened Windows deployments. On those systems mbedtls_ctr_drbg_seed fails, and after 0.2.15's CSPRNG hardening librist refuses to run any encryption code path at all. Replace the source with BCryptGenRandom (CNG): same cryptographic strength, present on every supported Windows release (Vista+), and reliably implemented by wine. Add bcrypt to the Windows link line. No behaviour change on Linux or macOS. Fixes #210.	2026-05-17 18:30:13 -04:00
Sergio Ammirata	e2b2581729	Merge branch 'tests/regression-coverage' into 'master' tests: regression test for #208 + run Windows meson suite under wine See merge request rist/librist!317	2026-05-17 16:44:44 +00:00
Sergio Ammirata	650212df73	NEWS: document the new Windows test coverage The 0.2.16 pre-release section now lists what the new test infra catches: every protocol test that gates Linux releases also gates Windows releases, so a #208-class winsock-runtime regression cannot ship without being seen first.	2026-05-17 12:40:34 -04:00
Sergio Ammirata	9a274f4a53	ci: run the full Windows test suite under wine CI infrastructure improvement. Until now build-win64 only verified that the code cross-compiles; the test suite ran on Linux only, so 18 of the 19 protocol tests had zero Windows signal. #208 (the 0.2.15 winsock-init bug) made it into a release because of this gap. Splits build-win64 into a pure build job plus a new test-win64 job that runs the same `meson test` suite against the cross-compiled .exe artifacts, transparently launched under wine via the meson cross-file's exe_wrapper. The wine runtime was already provisioned in the CI image; only the test invocation was missing. Does not replace a native-Windows runner (wine does not reproduce every Windows kernel behaviour - see #209's ICMP-unreach interaction) but it would have caught #208 on the first MR run, which is the class of bug we want to stop shipping.	2026-05-17 12:40:29 -04:00
Sergio Ammirata	d4cfb379d6	test: regression test for rist_logging_set remote-address path Test coverage gap. Catches #208 (winsock not initialised before rist_logging_set on Windows 11) and any future regression in the public logging API's startup ordering. The bug shipped in 0.2.15 because no test exercised the rist_logging_set(.., address, ..) path before any rist_*_create call. On POSIX this test is a baseline smoke check; under wine (see the companion .gitlab-ci.yml change) it fails with WSANOTINITIALISED if winsock init regresses. Standalone executable (no cmocka dependency) so it runs on every build that has -Dtest=true.	2026-05-17 12:22:23 -04:00
Sergio Ammirata	3c502814e7	Merge branch 'release-0.2.16' into 'master' release-0.2.16: Windows fixes (#208, #209) and 0.2.15 follow-ups See merge request rist/librist!316	2026-05-17 16:07:42 +00:00
Sergio Ammirata	546a22b5d0	NEWS: open the 0.2.16 maintenance release section Document the Windows fixes (#208, #209) and the 0.2.15 follow-ups that have accumulated so far. 0.2.16 is binary-compatible with 0.2.15; downstream consumers will not need to recompile. More MRs are expected before the release goes out.	2026-05-17 12:05:13 -04:00
Sergio Ammirata	d3263ffc0c	fix(udpsocket): Windows EAP authentication fails when bonding peers Bug fix - Windows only, 0.2.15 regression. Reported and confirmed fixed by Roman Dissertori (@moo) in #209. Setting up multiple peers on a Windows receiver with EAP-SRP authentication (Linux ristsender -> Windows ristreceiver) fails with "Failed to process EAPOL pkt, return code: -1". Ethernet peers fail consistently; wifi peers sometimes succeed depending on timing. Worked in 0.2.14, broken in 0.2.15. Root cause: two 0.2.15 changes interact badly on Windows. 1. Windows enqueues a synthetic WSAECONNRESET indication on any UDP socket whose outbound packets provoke an ICMP port-unreachable reply. The 0.2.15 hardening added a recvfrom(drain[1], ...) call to clear that indication, but on Windows that recvfrom either consumes a real datagram outright or returns WSAEMSGSIZE and silently drops one. librist then handed a truncated buffer to its EAPOL handler. 2. Also in 0.2.15, eap_process_eapol started rejecting anything shorter than a complete EAPOL frame. A truncated buffer that 0.2.14 would have tolerated now fails outright. Together they break the authenticator on Windows whenever ICMP unreachables happen to align with the EAP handshake. Fix: disable the synthetic WSAECONNRESET indication on every UDP socket via WSAIoctl(SIO_UDP_CONNRESET, FALSE), and drop the unsafe drain. ICMP unreachables on a connectionless socket are advisory only, and librist detects peer loss via RTCP keepalive timeouts.	2026-05-17 12:05:13 -04:00
Sergio Ammirata	f61c1560e2	diag(udpsocket): show actual Windows error code on socket failures Diagnostics only - no behaviour change. On Windows, errno carries POSIX values that do not match what the socket layer actually reported, so strerror(errno) in our existing error logs is misleading. End-user bug reports from Windows users were missing the WSA error code needed to identify the real cause. Add WSAGetLastError() to the error log lines in udpsocket_open_connect, udpsocket_open_bind and udpsocket_set_nonblocking. Mirrors the existing handling in udpsocket_open().	2026-05-17 12:05:13 -04:00
Sergio Ammirata	09cead14b3	fix(udpsocket): Windows tools fail to start when -r remote log is used Bug fix - Windows only. Reported by Roman Dissertori (@moo) in #208. Running `ristreceiver -r 127.0.0.1:port ...` on Windows 11 fails immediately with "Failed to open logsocket / Failed to setup logging!" and the tool exits before doing any work. Same failure path for any caller that uses rist_logging_set() before rist_sender_create() / rist_receiver_create(). Root cause: WSAStartup ran inside init_common_ctx(), which is reached only via rist_*_create. The public logging API opens a UDP socket before that, so socket() returns INVALID_SOCKET and the log socket is never created. Fix: initialise winsock from udpsocket.c itself, guarded by InitOnceExecuteOnce / pthread_once so the first call from any thread to udpsocket_open() or udpsocket_resolve_host() is enough. WSAStartup is ref-counted by Windows so the existing init_common_ctx() call stays in place for out-of-tree consumers that may call rist-common directly. POSIX path is a no-op.	2026-05-17 12:05:13 -04:00
Sergio Ammirata	8f22b8e5b5	fix(crypto/random): silent crypto downgrade when entropy source missing Security hardening. Not externally observed; the failure mode requires a deployment with no kernel entropy source (sandboxed container, embedded image without /dev/urandom, broken mbedTLS build). What was wrong: if the CSPRNG seeding call failed at startup, the return value was silently dropped. Every subsequent crypto-random request would then return zeros / failure codes, and the wall-clock-derived fallback inside prand_u32() would be used for the entire process lifetime - including for SRP nonces and AES-CTR IVs. The 0.2.15 PRNG hardening was therefore quietly absent in exactly the environments where it matters most. Fix: capture the seed return code in a static, log a loud error if seeding failed, and short-circuit subsequent crypto-random calls so the failure is visible at every caller instead of papered over. The Nettle path already retries gnutls_rnd up to 10 times; added a single error log on the final failure for parity. No behaviour change when the CSPRNG is healthy.	2026-05-17 12:05:13 -04:00
Sergio Ammirata	5222b3c325	fix(eap): authentication retry exhaustion never dropped failed peer Bug fix. Latent since 0.2.14 - not externally reported but exploitable: an EAP-SRP authentication that fails every retry was supposed to mark the peer as permanently failed and drop it; instead the peer was kept around in a half-authenticated state and kept consuming retry attempts. Root cause: rist-common's EAPOL handler checked `if (eapret == 255)` to detect the permanent-failure path, but eap_process_eapol returns negative error codes (-255 in the old numbering). The comparison could never match. Fixed by introducing named error constants (EAP_INTERNALERR, EAP_AUTH_FAILED, EAP_AUTH_TERMINATED) and comparing against EAP_AUTH_TERMINATED instead. Also caught while adding the names: EAP_SRP_WRONGSUBTYPE and EAP_UNEXPECTEDREQUEST were both defined as -4. No current caller discriminates on the value so the collision had no observed effect, but it was a trap for any future per-error handling. Moved EAP_SRP_WRONGSUBTYPE to -5. No wire-format change. Error codes are internal to librist.	2026-05-17 11:57:21 -04:00
Sergio Ammirata	e7c1fd3b0f	docs(eap): clarify why IDENTITY-response handler rejects authenticatee Documentation only - no behaviour change, no bug fix. The original guard comment was labelled "Defensive:" which made it look like a paranoia check that could be removed. It is not: on the authenticatee side the lookup callback is NULL by construction, so the handler MUST refuse the IDENTITY response or it will dereference NULL. Reword the comment so a future reader does not mistakenly delete the check.	2026-05-17 11:57:21 -04:00
Sergio Ammirata	561c2536e6	Merge branch 'hotfix/srp-nettle-macro' into 'master' hotfix: SRP Nettle backend build failure (BIGNUM_WRITE_BYTES_OR_GOTO) See merge request rist/librist!315 v0.2.15	2026-05-15 04:48:53 +00:00
Sergio Ammirata	443f6d71e4	fix(srp): drop invalid (void)(lbl) cast in Nettle BIGNUM_WRITE_BYTES_OR_GOTO The macro tried to silence an "unused argument" warning by casting lbl to void, but lbl is a goto label, not an expression. GCC and clang both reject this with "undeclared identifier". The Nettle backend builds (Linux ARM, Windows ARM/LLVM) all fail at src/crypto/srp.c:230 calling BIGNUM_WRITE_BYTES_OR_GOTO. Drop the useless cast; macro arguments don't need referencing to suppress unused-parameter warnings.	2026-05-15 00:35:07 -04:00
Sergio Ammirata	d59491fc2f	Merge branch 'release-0.2.15' into 'master' Release 0.2.15: security audit fixes, Windows build, reported bug fixes See merge request rist/librist!314	2026-05-14 22:37:31 +00:00
Sergio Ammirata	e137d0a4a1	release: prepare 0.2.15 Security release on top of 0.2.14. ABI bumps to 4.6.0 / 4:3:0, binary-compatible with 0.2.14 (no symbol additions or removals, soversion stays at 4). Combines: * a full receive-side, EAP/SRP and PSK/AES-CTR audit * a follow-up audit shared by VideoLAN that closes the residual gaps (EAP role enforcement, SRP cleanup paths, NPD scratch sizing, RTP header guard for the FULL/EAPOL paths, hardened TUN helpers, SDES length clamp, qsort/median fixes) * the Windows / MinGW build fixes (clock_gettime probe, the PTHREAD_START_FUNC return type, WSAECONNRESET drain, strtok_r shim header) * three reported bugs picked up while the branch was open: invalid JSON for multi-peer sender stats (#206) plus the prometheus parser landing for the sender side, the UDP log socket condition (!313) and the infinite socket-error loop on hostname resolution failure (!312). NEWS lists the per-fix detail.	2026-05-14 18:29:13 -04:00
Sergio Ammirata	bfcac70ddc	tools/udp2udp: warn that --metrics has no rist_* series to export udp2udp uses librist for URL parsing and the Prometheus httpd shell, not for actual data flow, so the exporter context comes up but no sender/receiver callbacks ever feed it. People enabling --metrics on udp2udp expecting RIST stats end up scratching their heads at an empty scrape endpoint. Print a one-shot warning right after rist_setup_prometheus_stats so the limitation is obvious from the logs.	2026-05-14 18:28:22 -04:00
Sergio Ammirata	1f5299f2fc	tools/ristsender: drop the duplicate sender-stats callback registration setup_rist_peer was calling rist_sender_stats_callback_set twice on the same ctx: once with (void*)w->id as the callback arg, then a second time a few lines down with NULL. The second call wins (the library just overwrites sender_stats_callback_argument), so the JSON callback was running with arg = NULL and the Prometheus parser couldn't tell which sender instance the JSON came from when more than one was running in the same process. Drop the second registration; the first one is correct.	2026-05-14 18:27:59 -04:00
Sergio Ammirata	3ea6c49ca9	feat(prometheus-exporter): parse sender-side stats JSON ristsender uses rist_sender_stats_callback_set, which only delivers the JSON blob to the callback. The Prometheus exporter's parser for that path was a (void)... // TODO stub, so sender-side metrics from ristsender simply never showed up. Walk the "sender-stats" -> "peers" array introduced in 0.2.15 (`86e2a16`), pull each peer's id, cname and stats numbers, and feed them through the existing rist_prometheus_handle_sender_peer_stats helper so the gauge/counter shape stays exactly what already shipped on the receiver-callback path. While here, factor the stale-peer cleanup block out of rist_prometheus_parse_stats into rist_prometheus_cleanup_stale_locked so both parsers run it. Without that, peers added with from_callback=true on the JSON path would never expire. JSON rtt is already in milliseconds (last_rtt / RIST_CLOCK in src/stats.c) and the handler converts ms -> seconds, so the parser feeds the value straight through with no extra scaling. Closes the rist_prometheus_parse_sender_stats half of #206. Builds on the parser shape from manueldev's mr/206 work, reworked against the 0.2.15 JSON schema. Co-authored-by: Manuel <malejandrodev@gmail.com>	2026-05-14 18:27:46 -04:00
Sergio Ammirata	b07fc5f13e	fix(build): always compile random.c, no-op when no CSPRNG backend prand_u32() now routes through _librist_crypto_ramdom_get_bytes() unconditionally, but random.c was only added to the build when have_srp was true (mbedTLS or nettle). The mingw-w64 -Duse_mbedtls=false job legitimately has neither, so the librist.dll link fails with an undefined reference. Move random.c out of the have_srp block so the symbol always exists, and guard the body so it returns -1 when no backend is compiled in. prand_u32 already falls back to the wall clock on that path. Reported by upstream CI on !314.	2026-05-14 17:42:18 -04:00
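
Note on 5222b3c325: the sign mismatch described in that commit can be reproduced in isolation. The sketch below is illustrative only and is not librist code. The enum values other than -4 and -5 are assumptions, `fake_eap_process` is a hypothetical stand-in for eap_process_eapol, and the two `should_drop_peer_*` helpers are invented names for the old and new checks.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdbool.h>

/* Illustrative numbering. Only -4 and -5 come from the commit message;
 * the value of EAP_AUTH_TERMINATED is an assumption. */
enum eap_error {
    EAP_UNEXPECTEDREQUEST = -4,
    EAP_SRP_WRONGSUBTYPE  = -5,   /* moved off -4 to end the collision */
    EAP_AUTH_TERMINATED   = -255  /* assumed value */
};

/* Compile-time guard against two error names sharing one value again. */
static_assert(EAP_SRP_WRONGSUBTYPE != EAP_UNEXPECTEDREQUEST,
              "EAP error codes must be distinct");

/* Hypothetical stand-in for the EAPOL processor: returns a negative
 * code once the retry budget is exhausted, 0 otherwise. */
static int fake_eap_process(int failed_attempts, int max_retries)
{
    return failed_attempts >= max_retries ? EAP_AUTH_TERMINATED : 0;
}

/* Old check: a positive literal can never equal a negative return code. */
static bool should_drop_peer_old(int eapret)
{
    return eapret == 255;
}

/* New check: compare against the named constant. */
static bool should_drop_peer_new(int eapret)
{
    return eapret == EAP_AUTH_TERMINATED;
}
```

With the retry budget exhausted, `should_drop_peer_old` still reports false, which is the half-authenticated state the commit describes, while `should_drop_peer_new` reports true.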


1272 Commits