librist

mirror of https://code.videolan.org/rist/librist.git synced 2026-07-04 15:06:53 +00:00

Author	SHA1	Message	Date
Sergio Ammirata	6e2a5e341b	Merge branch 'fix/eap-srp-bonded-sender-reauth' into 'master' fix(eap): recover and reintegrate a bonded EAP-SRP sender leg after a flap See merge request rist/librist!370	2026-07-03 17:45:38 +00:00
Sergio Ammirata	cadb6c4aa4	fix(eap): recover and reintegrate a bonded EAP-SRP sender leg after a flap A bonded caller-sender leg that lost and regained connectivity on the same source tuple -- an interface flap, or an upstream outage that resumes without a reconnect -- stayed wedged and never rejoined the bond without an operator restart. While the leg is silent the far-end authenticator times out and purges its session. The leg keeps its miface-bound UDP socket (a flap does not close it) and stays in EAP SUCCESS, so it keeps streaming data the authenticator now drops while it waits for an EAPOL START the caller never sends, because EAP is authenticator-driven and the caller believes it is still authenticated. - Extend try_caller_socket_rebind to sender-mode callers: a sender leg silent past session_timeout resets its EAP context (eap_reset_authenticatee) and re-drives the SRP handshake on the existing socket. It does not rebind -- the miface-bound socket is still valid and rebinding would move the source tuple the far end expects. - Fold the leg back into the weighted bond once it re-authenticates. Recovery de-authenticates the leg (so it leaves the sender balancing rotation while down) and rewinds eap_authentication_state, so the "EAP Authentication succeeded" transition fires again on re-auth and restores the connection-level authenticated flag via rist_peer_authenticate. Without this the leg re-authenticates but is left out of balancing and carries only NACK retransmits instead of its share. - Only SRP sender legs need this; plaintext/PSK senders have no such deadlock and recover via normal reconnect, so they are left untouched. Reproduced and verified with a bonded advanced-profile ristsender over two netns/veth legs to an SRP listener: one leg is silenced with 100% packet loss both ways (tc netem) while its interface stays up, so the socket persists on the same source tuple. Before the fix the returning leg never re-authenticates and the listener floods "handshake is still pending"; with re-auth alone it authenticates but the sender balances over only the surviving leg; after the full fix the returning leg re-authenticates and resumes carrying its full weighted share (verified on a restored zero-loss link, matching a plaintext bond). Added as test/rist/test_bonded_leg_flap_netns.sh (meson "netns" suite), which asserts both re-authentication and reintegration; it needs Linux root + netns/tc and cleanly skips (exit 77) otherwise. An in-process loopback test cannot reproduce the wedge because a loopback leg has no miface binding and self-heals with a new-port handshake.	2026-07-03 13:10:42 -04:00
Sergio Ammirata	91826342d8	Merge branch 'fix/url-parse-buffer-safety' into 'master' fix(url): bound scheme-prefix parsing against buffer overflow See merge request rist/librist!368	2026-07-02 00:13:56 +00:00
Sergio Ammirata	3c0c69aa38	fix(url): bound scheme-prefix parsing against buffer overflow Reported by the Fable 5 security audit. parse_url_udp_options() extracted the URL scheme into a fixed 16-byte prefix[] buffer using a copy length derived from the offset of the first '/'. Two inputs corrupted memory: - A URL beginning with '/' set prefix_len to 0, so the strncpy length prefix_len-1 underflowed the uint32_t to 0xFFFFFFFF. strncpy copied the short source and then NUL-padded the destination toward 4 GB, writing far past the 16-byte buffer. - A scheme longer than 15 characters left the strncpy length clamped to 15 but still wrote the terminator at prefix[prefix_len], an out-of-bounds one-byte write at an attacker-influenced index. Both are reachable through the public rist_parse_udp_address() / rist_parse_udp_address2() entry points, which the ristsender, ristreceiver and udp2udp tools call directly on user-supplied -i/-o UDP URLs. Clamp both the copy length and the terminator index to the buffer size, preserving the existing behavior of dropping the trailing ':' from the scheme. Add test/rist/unit/test_url_udp_prefix_parse, which exercises the normal udp/rtp schemes plus the two malformed inputs and fails under AddressSanitizer against the old code. Also replace the unbounded sprintf that builds the Simple-profile RTCP peer address with snprintf, so a long input URL cannot overflow the peer-config address buffer.	2026-07-01 19:40:51 -04:00
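The clamping described in the commit above can be sketched as follows. This is a minimal illustration under stated assumptions, not the librist code: `extract_scheme` and `PREFIX_SIZE` are hypothetical names, and returning an empty scheme for a URL with no `/` is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PREFIX_SIZE 16 /* mirrors the 16-byte prefix[] named in the commit */

/* Hypothetical helper: copies the text before the first '/' (minus the
 * trailing ':') into out, never writing more than out_size bytes.
 * Returns the number of characters stored, excluding the terminator. */
static size_t extract_scheme(const char *url, char *out, size_t out_size)
{
    assert(url != NULL && out != NULL && out_size > 0);

    const char *slash = strchr(url, '/');
    size_t prefix_len = slash ? (size_t)(slash - url) : 0;

    /* Drop the trailing ':' only when there is something to drop, so the
     * subtraction can never underflow for a URL that starts with '/'. */
    size_t copy_len = prefix_len > 0 ? prefix_len - 1 : 0;

    /* Clamp the copy length; the terminator index is derived from the
     * clamped value, so it stays inside the buffer as well. */
    if (copy_len > out_size - 1)
        copy_len = out_size - 1;

    memcpy(out, url, copy_len);
    out[copy_len] = '\0';
    return copy_len;
}
```

Deriving the terminator index from the already-clamped length means an over-long scheme is truncated rather than written out of bounds.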
Sergio Ammirata	aae9bbccfd	Merge branch 'fix/eap-srp-caller-rebind' into 'master' fix(eap): recover EAP-SRP callers after a listener restart See merge request rist/librist!367	2026-07-01 14:03:00 +00:00
Sergio Ammirata	e9af5686e4	fix(eap): recover EAP-SRP callers after a listener restart An EAP-SRP caller (e.g. a caller-mode receiver behind NAT or on a single UDP port) did not recover when its listener restarted. SRP callers were excluded from the silent caller socket rebind, so once the listener dropped all session state the caller stayed authenticated on a dead socket and never re-handshook; recovery required an operator restart. - Include SRP callers in try_caller_socket_rebind and reset their EAP context (eap_reset_authenticatee) so they re-run the SRP handshake on the fresh socket. - Bound the supplicant EAPOL START retransmit: an authenticatee stuck in UNAUTH against a silent authenticator retransmits START on a timer, capped by EAP_AUTH_TIMEOUT_RETRY_MAX. Inbound EAP re-arms the timer and authentication success disarms it, so a live handshake is never disrupted. This covers a restarted listener that never saw the first START. - Reset rebind_attempts/last_rebind_time on successful (re)authentication so a later outage retries promptly instead of after an ever-growing backoff. - Log a warning if the immediate START send fails (the periodic path still retries). test/rist/test_caller_socket_rebind (plaintext, psk, srp) kills the sender, waits past session_timeout, restarts it on the same port, and asserts the caller rebinds to a new local port, the backoff resets to 0, and real payload flows end to end after recovery.	2026-07-01 09:47:17 -04:00
Sergio Ammirata	71c4611441	Merge branch 'fix/recovery-depth-32bit' into 'master' rist: make recovery-depth maximum platform-aware (fix 32-bit test) Closes #222 See merge request rist/librist!366	2026-06-29 18:11:04 +00:00
Sergio Ammirata	9e0fe492c2	rist: make recovery-depth maximum platform-aware (fix 32-bit test) recovery-depth N sizes the Advanced retransmission ring to UINT16_SIZE << N packets. RIST_RECOVERY_DEPTH_MAX is 16, i.e. 2^32 packets - the full 32-bit sequence space. The ring is two parallel arrays (one rist_buffer* and one uint32_t per slot), so 2^32 slots are only addressable when size_t is 64 bits. On a 32-bit size_t the slot count overflows, rist_recovery_depth_apply() correctly refused it (-2), and rist_recovery_depth_set() returned failure for the documented maximum. The recovery_depth unit test, which expects set(200) to clamp to the maximum and succeed, therefore failed on 32-bit targets; its own DEPTH_PKTS(MAX) expectation also overflowed to 0. Add rist_recovery_depth_platform_max() in rist-private.h: the largest exponent whose ring (UINT16_SIZE << depth) fits SIZE_MAX / (sizeof(ptr) + sizeof(u32)). On LP64 this is RIST_RECOVERY_DEPTH_MAX; on a 32-bit size_t it resolves lower (depth 12). rist_recovery_depth_to_packets() now clamps to it - covering both the public setter and the ?recovery-depth= / config apply path - so the result is always representable. The setter logs when a request is capped. The unit test uses the same shared helper for its expectation, so it tracks whatever the platform supports. apply()'s representability/OOM guard is kept as a safety net. No change on 64-bit, where the maximum is unchanged. Fixes #222.	2026-06-29 13:50:48 -04:00
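The platform-aware maximum described above can be sketched as a search for the largest exponent whose slot count still fits. This is an illustration only: `depth_platform_max` and the `SKETCH_` constants are hypothetical, and the function takes the size limit and per-slot byte count as parameters so that both the 32-bit and the LP64 case can be evaluated on any host.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_UINT16_SIZE 65536ULL /* 2^16 slots at depth 0 */
#define SKETCH_DEPTH_MAX   16       /* 2^32 slots, the documented maximum */

/* Largest depth such that (SKETCH_UINT16_SIZE << depth) slots of
 * slot_bytes each remain addressable within size_max bytes. */
static unsigned depth_platform_max(uint64_t size_max, uint64_t slot_bytes)
{
    assert(slot_bytes > 0);

    uint64_t max_slots = size_max / slot_bytes;
    unsigned depth = 0;

    while (depth < SKETCH_DEPTH_MAX &&
           (SKETCH_UINT16_SIZE << (depth + 1)) <= max_slots)
        depth++;
    return depth;
}
```

With a 32-bit size limit and 8 bytes per slot (a 4-byte pointer plus a uint32_t) this yields depth 12, matching the figure in the commit message; with a 64-bit limit and 12 bytes per slot it reaches the cap of 16.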
Sergio Ammirata	b88e20200f	Merge branch 'fix/too-old-reset-spam' into 'master' rist: stop "Too many old packets" reset from flooding the log Closes #223 See merge request rist/librist!365	2026-06-29 17:42:19 +00:00
Sergio Ammirata	fb931c3bc2	rist: stop "Too many old packets" reset from flooding the log receiver_output() resets the flow when more than 100 consecutive packets arrive past twice the recovery window (delay_rtc > 2 * recovery_buffer_ticks). The reset logged "Too many old packets, resetting buffer", set receiver_queue_has_items = false and returned, but it never cleared f->too_late_ctr. The counter therefore stayed latched above 100. The data-output thread calls receiver_output() whenever the queue is non-empty, with no has_items gate, so on the next cycle the first still-old packet immediately tripped the > 100 guard again: log, return, repeat. The flow only recovers when a non-retry packet hits the re-anchor path in receiver_enqueue(), but a packet that arrives while has_items is false and is a retry is dropped before it can re-anchor. On a degraded, heavily retransmitting link (e.g. bonded) retries dominate, so the flow stays in the latched state and the message repeats every output cycle - the flood the report describes. Clear f->too_late_ctr when the reset fires. The stale backlog then drains through the existing drop path (goto next) instead of returning without progress, and a fresh run of 100 too-late packets is required before the next reset is logged. No behavioural change on a healthy flow, where the counter is already reset on every released packet. Fixes #223.	2026-06-29 13:38:07 -04:00
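The latch described above can be modelled with a small counter. This sketch is not the receiver_output() code: `struct flow_sketch` and `on_too_late_packet` are hypothetical, and incrementing the counter once per too-late packet is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct flow_sketch {
    unsigned too_late_ctr;  /* consecutive too-late packets seen */
    unsigned resets_logged; /* how often the reset message fired */
};

/* Returns true when this packet triggers a flow reset. */
static bool on_too_late_packet(struct flow_sketch *f)
{
    assert(f != NULL);

    if (++f->too_late_ctr > 100) {
        f->resets_logged++;
        /* The fix: clear the counter so that a fresh run of too-late
         * packets is needed before the next reset is logged. */
        f->too_late_ctr = 0;
        return true;
    }
    return false;
}
```

Without the line that clears the counter, every packet after the first reset would trip the guard again, which is the per-cycle log flood the commit describes.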
Sergio Ammirata	f86b48bfe7	Merge branch 'fix/udp-socket-buffer-autosize' into 'master' udpsocket: size socket buffers toward the OS maximum, not a fixed 1 MB See merge request rist/librist!364 v0.2.19-rc1	2026-06-26 18:37:40 +00:00
Sergio Ammirata	c4ecf6d47b	NEWS: document the udpsocket buffer-autosize fix Add the 0.2.19-rc1 Bug Fixes bullet for the socket-buffer change so NEWS matches the release tag.	2026-06-26 14:23:16 -04:00
Sergio Ammirata	c35738cc38	NEWS, meson: prepare 0.2.19-rc1 Mark the 0.2.19 NEWS section as -rc1 and bump the project version from 0.2.18 to 0.2.19 for the release candidate tag.	2026-06-26 14:19:46 -04:00
Sergio Ammirata	0cf88d505a	udpsocket: size socket buffers toward the OS maximum, not a fixed 1 MB udpsocket_set_optimal_buffer_size()/_send_size() requested a fixed 1 MB SO_RCVBUF/SO_SNDBUF and, on failure, fell back to ~200 KB. They never used the headroom a large net.core.{r,w}mem_max provides. That headroom is what lets a listener absorb a burst of simultaneous connections: a single protocol thread drains one UDP socket, so when many peers establish at once the inbound control/handshake traffic can momentarily exceed 1 MB and the kernel silently drops the overflow, stalling the peers whose handshake packets were lost. Measured on a 16-core x86 listener (net.core.rmem_max = 16 MB) with 200 peers dialing in within <1 s: at the 1 MB buffer the kernel dropped 924 datagrams at the socket (Udp: RcvbufErrors) and only ~158/200 peers completed authentication; the protocol thread sat at 9% of one core the whole time, so it was never the bottleneck. Rebuilt with the buffer sized to the OS maximum, the same storm dropped 0 datagrams and all peers connected (256/256 at the 256-flow cap). Request UDPSOCKET_SOCK_BUFSIZE_MAX (8 MB) and let the kernel clamp to mem_max, keeping whatever we obtain as long as it is at least the historical 1 MB floor; otherwise fall back to the ~200 KB last resort and warning exactly as before. No API/ABI change, no new configuration, and hosts with a small mem_max are unaffected (they still settle at the same fallback). Also correct the send-buffer error message to reference wmem_max instead of rmem_max.	2026-06-26 12:27:21 -04:00
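The request-then-verify approach described above can be sketched with plain POSIX socket options. The function names and `SKETCH_` constants are hypothetical, and 200 KB is an assumed stand-in for the "~200 KB" fallback mentioned in the commit; the real udpsocket code may differ in detail.

```c
#include <assert.h>
#include <sys/socket.h>

#define SKETCH_BUFSIZE_MAX   (8 * 1024 * 1024) /* 8 MB request, per the commit */
#define SKETCH_BUFSIZE_FLOOR (1024 * 1024)     /* historical 1 MB floor */
#define SKETCH_BUFSIZE_LAST  (200 * 1024)      /* assumed last-resort size */

static int buffer_meets_floor(int obtained)
{
    return obtained >= SKETCH_BUFSIZE_FLOOR;
}

/* Ask for `want` bytes and report what the kernel actually granted.
 * The kernel may clamp silently, so the value is read back; on Linux
 * the value read back is typically double the effective request. */
static int request_rcvbuf(int fd, int want)
{
    int got = 0;
    socklen_t len = sizeof got;

    (void)setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &want, sizeof want);
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &got, &len) != 0)
        return -1;
    return got;
}

/* Keep whatever the large request yields if it meets the floor,
 * otherwise fall back to the small last-resort size. */
static int size_recv_buffer(int fd)
{
    assert(fd >= 0);

    int got = request_rcvbuf(fd, SKETCH_BUFSIZE_MAX);
    if (got >= 0 && buffer_meets_floor(got))
        return got;
    return request_rcvbuf(fd, SKETCH_BUFSIZE_LAST);
}
```

The same pattern would apply to SO_SNDBUF on the send side.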
Sergio Ammirata	28483d6f3a	Merge branch 'staging/0.2.19' into 'master' staging/0.2.19: OOB read, risttunnel single-port, Advanced/Main fixes Closes #221, #220, #219, and #218 See merge request rist/librist!363	2026-06-26 13:20:35 +00:00
Sergio Ammirata	145b8a5fba	fix(sender): honor the configured stats interval for structured stats (#221) rist_stats_callback_set() updated only the common stats interval, but the sender loop reads the sender context's own stats_report_time, which stayed 0. The callback then fired every loop iteration and each report covered a single pass, so packets_sent/packets_received showed tiny deltas (0-5). Set the sender interval here too, matching rist_sender_stats_callback_set(). Also harden the sender loop: read the interval fresh each pass (so a callback registered after the loop starts takes effect) and skip reporting when it is 0, so a stats-disabled sender no longer builds a stats object every iteration. Affected risttunnel's sender stats; the bundled Prometheus path (ristsender) uses the legacy setter and was unaffected. Thanks to laur fb for the precise diagnosis and the patch suggestion.	2026-06-26 09:06:13 -04:00
Sergio Ammirata	4d6c6860d9	fix(receiver): guard the cname re-association SRP check (#220) The listener cname re-association calls eap_is_authenticated(), whose declaration lives under HAVE_SRP_SUPPORT, so the build broke when SRP support is disabled. Guard the call. Without SRP the re-association refuses: the cname is not a per-peer secret, so a peer must not be migrated to a new source tuple without an authenticated session. Thanks to Gyan Doshi for the report.	2026-06-26 09:06:13 -04:00
Sergio Ammirata	9791e5817d	fix(ristsender): split the usage string under the C99 4095-char limit (#219) Adding --blind-send to the help text pushed the single usage string literal past the 4095-character limit C99 requires a compiler to support, so a -Wpedantic -std=c99 -Werror build failed with -Woverlength-strings. Split the help text into two literals printed back to back; the output is unchanged. Thanks to Florian Ernst for the report.	2026-06-26 09:06:13 -04:00
Sergio Ammirata	28b1cfc617	fix(receiver): clear stale clock-drift samples on a framing-baseline reset (#218) When an Advanced flow upgrades from Main to Advanced wire framing mid-stream, the two framings carry source_time in different timestamp domains, so the switch resets the flow timing baseline. That reset re-derived time_offset but left the clock-drift sample buffer full of stale Main-domain samples. The next median recalculation then jumped the offset by several seconds and released the whole receiver buffer at once, overflowing the data-out fifo. Because test_send_receive treats any ERROR-level log as fatal at 0% loss, the single "Rist data out fifo queue overflow" line failed the advanced+unicast+client tests, most reproducibly on slower build hosts. Clear the drift samples on the baseline reset, matching the existing clock-wrap reset path. Thanks to Florian Ernst for the detailed report and build logs.	2026-06-26 09:06:13 -04:00
Sergio Ammirata	45b957a836	NEWS: document the remaining 0.2.19 bug fixes Three fixes had landed without a changelog entry: the risttunnel SRP credential presentation on the sender/client legs, the rist_oob_dequeue use-after-free guard, and the rist_receiver_data_block_free2() NULL guard. Every commit since the 0.2.18 release is now covered.	2026-06-25 18:59:23 -04:00
Sergio Ammirata	acd68c7fa5	fix(sender): retransmit to a Main-downgraded peer in the RTP domain An Advanced-profile sender indexes its retransmit buffer by the 32-bit advanced sequence (seq_index, keyed by seq & (queue_max - 1)). Under TR-06-3 an Advanced device starts every peer in Main and only frames Advanced once that peer advertises I=1, so a peer that stays in Main requests retransmits with a 16-bit RTP sequence and nack_seq_msb = 0. The advanced and RTP sequence counters are independent, so the Main peer's NACK never resolved against the 32-bit index: the lookup found the wrong (or no) buffer slot, the seq-number validation rejected it, and nothing was retransmitted (recovered packet count stayed 0). Advanced<->Advanced sessions were unaffected. Keep a parallel 16-bit RTP retransmit index on Advanced senders, populated alongside seq_index as packets are sent, and resolve, validate, and frame a downgraded-Main peer's retransmits in the RTP domain. The new path is gated by rist_retx_use_rtp_domain(), which is true only for an Advanced context serving a peer that has not negotiated up to Advanced, so the Advanced<->Advanced lookup, validation, and wire framing are byte-for-byte unchanged. Main/Simple senders are untouched: their primary index is already the RTP domain and no second index is allocated. Add a header-only unit test for the domain predicate.	2026-06-25 17:11:37 -04:00
Sergio Ammirata	122000ba8b	fix(risttunnel): present SRP credentials on the sender/client legs risttunnel only enabled EAP-SRP on the listener via the -F verifier file. The sender leg (and the receiver leg in two-port client mode) parsed username/password from the URL into the peer config but never called rist_enable_eap_srp_2, so a caller pointed at an SRP-protected listener failed with "EAP authentication requested but credentials have not been configured" and the listener waited forever for authentication. Mirror ristsender/ristreceiver/rist2rist: when the URL carries both an SRP username and password, present them on the peer; the receiver leg falls back to the verifier file when no credentials are supplied.	2026-06-24 17:25:52 -04:00
Sergio Ammirata	e6a27a4f44	fix(oob): validate the stashed peer before sending in rist_oob_dequeue rist_oob_write() stashes the destination rist_peer* into the oob queue from a caller thread; the protocol thread later sends it in rist_oob_dequeue(). A peer addressed this way (e.g. the child peer the risttunnel --single-port reverse path captures via the auth/connect callback) can be torn down by a NAT rebind / session timeout in the protocol thread between enqueue and dequeue. The non-listening branch then dereferenced a freed peer (p->listening, rist_send_common_rtcp) -> use-after-free / intermittent SIGSEGV. Take peerlist_lock once at the top of the dequeue, verify the stashed peer is still on the live PEERS list, and drop the packet if it is gone. Hold the lock across the whole send (the listener branch already did this for its child walk) so a concurrent add/remove cannot free the peer or a sibling underneath us. Both rist_oob_dequeue callers release peerlist_lock before calling it, so there is no recursive-lock risk.	2026-06-22 15:06:49 -04:00
Sergio Ammirata	45fd8ff498	fix(receiver): guard rist_receiver_data_block_free2 against a NULL block The public free2() wrapper dereferenced (block)->ref unconditionally. In the receiver data_fd/tun delivery path the block can legitimately be NULL by the time the fifo-overflow branch frees it: free_data_block() nulls block after a successful free, and merged FEC pairs never allocate a block at all. With fifo_queue_size advancing on every packet, the overflow branch eventually calls free2() on the now-NULL block, reading offset 0x40 (->ref) of a NULL pointer -> intermittent SIGSEGV (observed as "segfault at 40" on listeners running risttunnel in both two-port and --single-port modes under abrupt link changes). Make free2() a free(NULL)-style no-op, mirroring free_data_block()'s own NULL guard. This protects every caller, not just the data_fd path.	2026-06-22 15:06:49 -04:00
Sergio Ammirata	f7f01d0b63	feat(risttunnel): add --single-port mode (forward RTP+ARQ, reverse OOB) risttunnel -1/--single-port carries a bidirectional IP tunnel over a single UDP port instead of two. One RIST connection runs the forward direction over the RTP data_fd path (ARQ-protected) and the reverse direction over the OOB channel on the same socket (best-effort, no ARQ). The caller passes only -o (sender role); the listener passes only -b (receiver role). librist learns the OOB return peer only from received OOB and the caller sends only forward RTP, so the listener captures the connecting peer via the auth/connect callback (sp_conn_cb) and hands it explicitly to rist_oob_write() from a TUN->OOB reader thread. The caller injects inbound OOB into its TUN through an OOB receive callback. Intended for low-bandwidth, NAT-friendly control tunnels (e.g. remote support over a single UDP/443) where the heavy direction is forward and the reverse is light/interactive.	2026-06-22 11:32:43 -04:00
Sergio Ammirata	f5315052f9	feat(oob): implement rist_oob_read polling receive path rist_oob_read() was a stub that logged an error and returned 0 (success per the public API) without ever writing *oob_block, so an application using the documented polling API received a garbage output pointer. Add an out-of-band receive fifo: when oob is enabled without a callback (rist_oob_callback_set with a NULL callback), the protocol thread queues incoming OOB packets in rist_recv_oob_data() and rist_oob_read() drains them, handing back a borrowed block that stays valid until the next call. The fifo is freed with the rest of the oob state on teardown. rist_oob_read() now returns -1 when oob is disabled or a callback is installed, the number of available packets (>=1) when one is returned, and 0 when the queue is empty. Adds test_oob_read exercising the write->read round-trip over the Main and Advanced profiles.	2026-06-22 11:00:23 -04:00
Sergio Ammirata	4037905026	fix(risttunnel): initialize log_ptr to the static logging settings risttunnel declared log_ptr uninitialised, so a non-NULL garbage stack value could be passed to rist_logging_set(), which only allocates a new settings struct when *log_ptr is NULL and otherwise writes the level/cb/stream through the pointer as-is. That write to an invalid address caused an intermittent SIGSEGV at startup (a zeroed stack slot happened to work, masking the bug). Point log_ptr at the static logging_settings global, matching ristsender/ristreceiver/udp2udp/rist2rist. As a side effect the -v loglevel now also applies to risttunnel's own messages, which previously used the unconfigured global.	2026-06-22 10:41:48 -04:00
Sergio Ammirata	618207c7b0	Merge branch 'release/v0.2.18' into 'master' release: v0.2.18 See merge request rist/librist!362 v0.2.18	2026-06-22 01:54:10 +00:00
Sergio Ammirata	d8d9034777	fix(test): build NAT/multipath rebind tests under MSVC The issue #188 NAT/socket-rebind and multipath cname integration tests included <pthread.h> and initialised their tracker mutex with PTHREAD_MUTEX_INITIALIZER. MSVC has no system pthread.h, and the Windows pthread-shim maps pthread_mutex_t to a CRITICAL_SECTION, which cannot be initialised statically, so the whole librist VS solution failed to build (only these tests; the library and tools were fine). Follow the convention already used by test_send_receive.c and test_reflector.c: include "pthread-shim.h" (directly, or via rist-private.h), initialise the tracker mutex at runtime with pthread_mutex_init(), and declare the feeder threads with the PTHREAD_START_FUNC() macro so they match the shim's __stdcall/LPVOID signature. POSIX builds are unchanged; the suite now also compiles under MSVC.	2026-06-21 21:37:52 -04:00
Sergio Ammirata	2863e43834	NEWS: finalize 0.2.18 release Drop the pre-release marker from the 0.2.18 heading and add the canonical ABI 15:0:11 / API 4.12.0 summary lines, matching the format of prior release sections.	2026-06-21 20:39:45 -04:00
Sergio Ammirata	df64a2f1c0	meson: prepare 0.2.18-rc3 Bump API to 4.12.0, ABI to 15:0:11 (soversion 4 unchanged, binary-compatible with 0.2.15/0.2.16/0.2.17 and rc2). New interfaces since rc2: rist_recovery_depth_set() with the RIST_RECOVERY_DEPTH_MIN/DEFAULT/MAX macros, and the additive stats fields (RIST_STATS_VERSION 3: profile / seq_bits / advanced_active). Changes since rc2: - Advanced-profile correctness fixes (>64k sequence gaps, ring-index masking in receiver_mark_missing, expected-next-seq wrap mask, payload-only byte accounting, sequence-based duplicate detection, 32-bit-safe merge-mode pairing). - Default profile is now Advanced with Main interoperability (TR-06-3 Section 9); risttunnel follows RIST_DEFAULT_PROFILE. - Tunable Advanced recovery-ring depth via ?recovery-depth=. - Advanced-profile profile/framing visibility in the stats API and the Prometheus exporter. v0.2.18-rc3	2026-06-21 16:46:17 -04:00
Sergio Ammirata	1e53f5ce09	NEWS: document changes since 0.2.18-rc2 Add 0.2.18 pre-release entries for the changes that landed since rc2: - Advanced-profile correctness fixes: >64k sequence gaps, ring-index masking in receiver_mark_missing, expected-next-seq wrap mask, payload-only byte accounting, sequence-based duplicate detection, and 32-bit-safe merge-mode pairing - Advanced-profile profile/framing visibility in the stats API (RIST_STATS_VERSION 2 -> 3) and the Prometheus exporter, plus the receiver counters that were previously JSON-only - risttunnel follows RIST_DEFAULT_PROFILE Also normalize two em dashes in the section to ASCII ' - '.	2026-06-21 16:45:39 -04:00
Sergio Ammirata	db88969ebf	Merge branch 'fix/receiver-mark-missing-oob' into 'master' fix(receiver): mask sequence into ring index in receiver_mark_missing See merge request rist/librist!361	2026-06-21 18:46:56 +00:00
Sergio Ammirata	f3a72d9ede	fix(receiver): mask sequence into ring index in receiver_mark_missing receiver_mark_missing() indexed receiver_queue[] with the raw last_seq_found and current_seq, whereas every other queue access masks with (receiver_queue_max - 1). For 16-bit flows the sequence range equals receiver_queue_max, so a raw sequence was always a valid index. A 32-bit (Advanced) sequence can exceed receiver_queue_max once the stream passes the ring size: the access then reads off the end of the array and dereferences a stale or NULL slot's packet_time, crashing the receiver. Mask both indices like the rest of the queue accesses.	2026-06-21 14:29:36 -04:00
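The masking described above reduces to a single AND once the ring size is a power of two. The helper below is a hypothetical illustration (`ring_index` is not a librist function) of why a raw 32-bit sequence cannot be used as an index directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Map a sequence number onto a ring of queue_max slots.
 * queue_max must be a non-zero power of two for the mask to be valid. */
static size_t ring_index(uint32_t seq, size_t queue_max)
{
    assert(queue_max != 0 && (queue_max & (queue_max - 1)) == 0);
    return (size_t)seq & (queue_max - 1);
}
```

For a 65536-slot ring, any 16-bit sequence maps to itself, while a 32-bit sequence such as 70000 wraps to slot 4464 instead of reading past the end of the array.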
Sergio Ammirata	520d1b60d6	Merge branch 'feat/advanced-stats-visibility' into 'master' feat(stats): expose RIST profile and complete Advanced receiver stats See merge request rist/librist!360	2026-06-21 16:36:04 +00:00
Sergio Ammirata	8cf3c817b2	feat(stats): export collected receiver counters Several receiver counters were present in the stats JSON but never surfaced as Prometheus metrics. Export retries, dropped_late, dropped_full, duplicates, and the nack-depth buckets (recovered after two, three, four, and more nacks).	2026-06-21 12:28:30 -04:00
Sergio Ammirata	4d55974da5	feat(stats): expose RIST profile and Advanced framing The stats output gave no way to tell which RIST profile a flow used or whether Advanced framing was active on the wire. Add `profile` and `advanced_active` to the sender peer stats, and `profile`, `seq_bits`, and `advanced_active` to the receiver flow stats, with matching fields in the JSON payloads. Bump RIST_STATS_VERSION to 3 and the JSON schema version to 4. The Prometheus exporter emits new `rist_client_flow_info` and `rist_sender_peer_info` series that carry these as labels, so a scrape can identify an Advanced flow (and 16-bit vs 32-bit framing) without changing the label set of the existing series. On the sender, advanced_active reflects whether Advanced framing is in use toward the peer; on the receiver flow it is true only when the context is Advanced and the flow is on 32-bit framing, so it reads false when an Advanced context stays on Main framing because the peer has not advertised Advanced support (TR-06-3 Section 9) and false for Main and Simple contexts.	2026-06-21 12:27:27 -04:00
Sergio Ammirata	3bcf28caf7	fix(receiver): use rist_seq_next for merge-mode pairing The merge-mode pairing test computed the next sequence number with `& UINT16_MAX`, truncating it to 16 bits, which mis-pairs packets near a 16-bit boundary on a 32-bit Advanced flow. Use rist_seq_next(), which wraps correctly for both short_seq (16-bit) and 32-bit flows.	2026-06-21 12:26:00 -04:00
Sergio Ammirata	83cfd7fd9e	fix(receiver): detect duplicates by sequence number receiver_enqueue compared source_time to detect a duplicate in an occupied slot. The Advanced path stamps source_time with the arrival time, so that comparison never matched a genuine duplicate and the duplicate counter under-reported. Compare the sequence number instead, which is what the slot index is derived from. This does not change how Advanced buffers packets: it still buffers on arrival time, so avg_buffer_time measures buffer residence rather than source-to-output latency. Buffering on the on-wire timestamp is left as a separate change.	2026-06-21 12:26:00 -04:00
Sergio Ammirata	81f8a50722	fix(receiver): match Advanced byte accounting to Main The Advanced receive path passed the full datagram size as the ingest size while the Main path passes the payload only, so received_bytes and the derived bitrate were not comparable across profiles. Pass adv_data_len (the delivered payload) instead. ts_null_bytes stays 0 on the Advanced path, which performs no ts-null reinsertion on receive.	2026-06-21 12:25:38 -04:00
Sergio Ammirata	d1b65ea6dd	Merge branch 'fix/expected-seq-wrap-mask' into 'master' fix(receiver): correct 16-bit wrap mask for expected-next-seq See merge request rist/librist!359	2026-06-20 23:59:55 +00:00
Sergio Ammirata	639b388536	fix(receiver): correct 16-bit wrap mask for expected-next-seq receiver_enqueue computed the expected next sequence number with `& (UINT16_MAX - 1)` (0xFFFE), which clears bit 0 and forces the value even, so every odd successor was mis-computed. The mask was meant to wrap a 16-bit counter, i.e. `& UINT16_MAX`. The error is gated by the `packet_time < last_packet_ts` guard, so it never fires on normal in-order delivery. It bites in ARRIVAL timing mode, where a per-path arrival timeline can legitimately regress for an in-order packet on a bonded flow, sending roughly half of odd-successor packets into the late-drop path. Extract rist_seq_next() (symmetric with rist_seq_gap) so the successor wraps correctly for both short_seq (16-bit) and 32-bit flows, and add a link-free unit test covering the odd-successor regression and both wrap points.	2026-06-20 19:47:25 -04:00
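The effect of the wrong mask is easy to see in isolation. The sketch below contrasts the 0xFFFE mask with a width-aware successor in the spirit of rist_seq_next(); the function names and signatures here are assumptions, since the log does not show the real helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Old behavior for 16-bit flows: the 0xFFFE mask clears bit 0, so the
 * computed successor is always even. */
static uint32_t seq_next_buggy(uint32_t seq)
{
    assert(seq <= UINT16_MAX);
    return (seq + 1) & (UINT16_MAX - 1);
}

/* Width-aware successor: wraps at 2^16 for short_seq flows and at 2^32
 * (natural unsigned overflow) otherwise. */
static uint32_t seq_next_sketch(uint32_t seq, bool short_seq)
{
    assert(!short_seq || seq <= UINT16_MAX);
    return short_seq ? ((seq + 1) & UINT16_MAX) : (seq + 1);
}
```

For seq = 4 the buggy version returns 4 rather than 5, which is the odd-successor regression the commit's unit test covers.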
Sergio Ammirata	b688f44c7f	Merge branch 'fix/advanced-32bit-gap' into 'master' fix(advanced): don't truncate >64k sequence gaps on 32-bit flows See merge request rist/librist!358	2026-06-20 23:18:17 +00:00
Sergio Ammirata	6ac6ce78b5	fix(advanced): don't truncate >64k sequence gaps on 32-bit flows receiver_mark_missing() and receiver_enqueue() unconditionally masked the forward sequence gap to 16 bits (& UINT16_MAX / & (UINT16_MAX-1)). That is correct for Simple/Main (16-bit) flows but wrong for Advanced 32-bit flows: a genuine >64k contiguous gap was truncated, distorting the NACK pacing math and mis-classifying in-order packets in the arrival-timing reorder guard. Reachable now that the Advanced recovery ring is resizable up to the full 32-bit space (?recovery-depth=). Add a short_seq-aware rist_seq_gap() helper and use it at both sites. The short_seq branch reproduces the original masks byte-for-byte, so 16-bit flows are unchanged; only 32-bit flows now see the true gap. The wrap false-positive cap is unified with the recovery-walk hole cap already in the tree (short_seq ? UINT16_SIZE/2 : receiver_queue_max/2), finishing a conversion that was only half-applied. Adds a pure, link-free unit test. MR note: the 16-bit branch intentionally keeps the long-standing `& (UINT16_MAX - 1)` (0xFFFE) mask in expected_seq, which forces the value even and looks like it should be `& UINT16_MAX`. Preserving it keeps this patch a zero-behavior-change for 16-bit flows; whether that mask is itself a bug is a separate, older question and should be its own change.	2026-06-20 19:01:48 -04:00
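A width-aware forward gap can be sketched as below. This is an illustration, not the rist_seq_gap() implementation: the name and signature are assumptions, and the 16-bit branch here uses the plain `& UINT16_MAX` mask rather than reproducing the legacy 0xFFFE mask that the commit deliberately preserves at one call site.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Forward distance from `from` to `to`, modulo the sequence width. */
static uint32_t seq_gap_sketch(uint32_t from, uint32_t to, bool short_seq)
{
    assert(!short_seq || (from <= UINT16_MAX && to <= UINT16_MAX));

    uint32_t gap = to - from; /* unsigned subtraction wraps modulo 2^32 */
    return short_seq ? (gap & UINT16_MAX) : gap;
}
```

On a 32-bit flow a genuine gap of 70000 packets is reported as 70000; masking it to 16 bits would have reported 4464.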
Sergio Ammirata	5046e25965	Merge branch 'feat/advanced-main-interop' into 'master' Advanced profile: Main interop, tunable recovery depth, Advanced default See merge request rist/librist!357	2026-06-20 18:30:31 +00:00
Sergio Ammirata	4b1b264e94	feat(tunnel): default risttunnel to RIST_DEFAULT_PROFILE Completes the default-profile sweep: risttunnel was still hardcoded to Main while ristsender/ristreceiver/YAML now follow RIST_DEFAULT_PROFILE (Advanced). risttunnel uses one profile for both its sender and receiver side, so it now defaults to Advanced too and interoperates with a Main-only peer via the TR-06-3 Section 9 fallback. Help text updated to list advanced and the new default. rist2rist is intentionally left at Simple (its documented purpose is to receive Simple-profile input and re-emit Main); udp2udp does not select a profile.	2026-06-20 14:18:16 -04:00
Sergio Ammirata	e57e1e2ff7	feat(profile): default to the Advanced profile (was Main) Changes the default RIST profile from Main to Advanced. This is now safe because an Advanced endpoint interoperates with Main: it negotiates Advanced framing with an Advanced peer and falls back to Main framing with a Main-only peer (TR-06-3 Section 9, implemented in the preceding commit), so existing Main deployments keep working without a flag. - RIST_DEFAULT_PROFILE is now RIST_PROFILE_ADVANCED. - ristsender, ristreceiver and the YAML config now take their default profile from RIST_DEFAULT_PROFILE instead of a hardcoded RIST_PROFILE_MAIN, so there is a single source of truth. Pass -p 1 (or profile: 1 in YAML) to force Main as before. - yamlparse.c gains an explicit <librist/peer.h> include for the macro. The default only affects the profile a context starts in; the ?profile= URL override and the explicit profile argument to rist_*_create() are unchanged. rist_peer_config still records the default with profile_set = 0, so it is applied only when ?profile= is present.	2026-06-20 14:18:16 -04:00
Sergio Ammirata	615b01d2c4	feat(advanced): Main profile interoperability (TR-06-3 Section 9) An Advanced-profile sender previously emitted Advanced (Type-8) framing unconditionally, so a Main-only receiver could not decode the stream; an Advanced receiver likewise assumed 32-bit framing for the whole flow. This implements the spec's optional Main/Advanced interop method so a single flow works across mismatched profiles. Sender (udp.c): an Advanced device now starts in Main-Profile framing and only switches to Advanced framing for a peer once that peer advertises Advanced capability (I=1 in its Main keep-alives, tracked as remote_supports_advanced). Until then it emits Main-conformant media a Main-only peer can decode. Control and OOB are unchanged. Receiver (rist-common.c): rist_receiver_recv_data() learns the actual wire framing of each data packet (pkt_short_seq) and refines the flow's short_seq from it. The two framings carry different sequence widths and timestamp encodings, so a mid-stream framing change (the Main->Advanced upgrade once a peer advertises I=1) is treated like a flow-id change: the timing baseline is dropped so the next enqueue re-derives time_offset and the seq->index mapping from the new framing instead of blending the two. Without this the stale baseline corrupts delivery after the switch. flow.c: Advanced flows default to 32-bit framing at create time; recv_data refines it per wire. The recovery ring size stays fixed per context. Matched Advanced and matched Main pairs are unaffected. Adds a permanent cross-profile interop suite (both directions, with and without 10% loss) to test/rist/meson.build.	2026-06-20 14:18:16 -04:00
Sergio Ammirata	1ecc10c187	feat(advanced): tunable recovery ring depth (?recovery-depth=) The Advanced-profile recovery/retransmission ring was a fixed compile-time size, and Simple/Main flows over-allocated to that same size. This makes the Advanced ring sizable and stops the 16-bit profiles from paying the Advanced memory footprint. - Recovery rings are now heap-allocated and profile-sized. Simple/Main allocate the 16-bit cap (65536 entries); Advanced allocates the configured depth. Sender and receiver rings are freed on teardown. - New ?recovery-depth=<n> URL parameter and rist_recovery_depth_set(ctx, depth) API size the Advanced ring. depth is a base-2 exponent: the ring holds 65536 << depth packets. Default 3 (8x, the prior fixed size); range 0..16 (16 == full 32-bit sequence space). Must be set before rist_start(). - rist_peer_config grows a trailing recovery_depth field at the existing RIST_PEER_CONFIG_VERSION 5; rist_peer_config_defaults_set_versioned() sets it to RIST_RECOVERY_DEPTH_DEFAULT for version-5 callers. Zero-initialised and older configs keep the default ring. - A one-shot WARN fires at peer-config time when recovery-maxbitrate and the max buffer would queue more packets than the recovery window can address, because packets beyond the window cannot be retransmitted. The Advanced hint points at ?recovery-depth=; the Simple/Main hint points at the Advanced profile. Tests: test_recovery_depth (depth->packet mapping, ring resize, receiver capacity, Main behaviour, range clamping, NULL ctx) and test_url_recovery_depth_parse (numeric parse, range and garbage rejection).	2026-06-20 14:18:00 -04:00
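The depth-to-capacity mapping stated in 1ecc10c187 (the ring holds `65536 << depth` packets, with depth in the range 0..16) can be worked through in a short sketch. The function name and the choice to return 0 for an out-of-range depth are assumptions made for illustration; the log does not show how librist itself handles out-of-range values.

```c
#include <stdint.h>

/* Hypothetical helper: number of packets addressable at a given
 * recovery depth. A 64-bit result is needed because depth 16 gives
 * 2^32, which does not fit in 32 bits. Returns 0 when depth is out
 * of range (a choice made for this sketch only). */
static uint64_t sketch_recovery_depth_packets(unsigned depth)
{
    if (depth > 16)
        return 0;
    return (uint64_t)65536 << depth;
}
```

Under this mapping, depth 0 gives 65536 packets (the 16-bit cap), the default depth 3 gives 524288 packets (8x), and depth 16 gives 4294967296 packets, the full 32-bit sequence space.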
Sergio Ammirata	340fea4285	Merge branch 'fix/advanced-rtt-echo-scale' into 'master' fix(advanced): scale RTT echo to NTP ticks See merge request rist/librist!356	2026-06-17 13:37:10 +00:00

1429 Commits