Commit Graph

8 Commits

Author SHA1 Message Date
rawdigits
3dea496c7f overlay/tio: KeepAlive poll-path readv/writev buffers too
The Poll fallback (used when IFF_VNET_HDR can't be enabled) has the
same unsafe-pointer-via-uintptr pattern as Offload.rawWrite:
readOne/writeOne build a [2]syscall.Iovec on the stack, pass it to
syscall.Syscall as uintptr, and the kernel then reads from or writes
to the slices whose Base pointers the iovec holds.

Escape analysis can't see that use, so under GC pressure the
backing memory could be collected or moved mid-syscall.

Add runtime.KeepAlive on the iovec and the user buffer around both
the SYS_READV and SYS_WRITEV syscalls. Same pattern and rationale
as the prior commit on the offload path.
2026-04-24 21:44:37 +00:00
rawdigits
7c38aa7e6b overlay/tio: KeepAlive writev iovec and payloads through the syscall
rawWrite passed the iovec pointer to syscall.Syscall as a uintptr, so
the Go compiler's escape analysis could not keep the underlying
[]unix.Iovec (or the payload slices its Base pointers reach) rooted
across the syscall. Under heavy sustained write load, GC could
collect or move these before tun_chr_write_iter finished reading
them, at which point the kernel read freed memory.

Observed on a UniFi UXG-Pro (Annapurna Labs Alpine V2, arm64, Linux
4.19.152) forwarding 1 Gbps iperf3 -R between LAN and a remote
Nebula peer, as two paired kernel warnings in the same second:

  refcount_t: underflow; use-after-free
    sock_wfree -> skb_release_head_state -> kfree_skb
    -> skb_release_data -> __kfree_skb -> tcp_recvmsg ...

  refcount_t: addition on 0; use-after-free
    skb_set_owner_w -> sock_alloc_send_pskb
    -> tun_get_user -> tun_chr_write_iter -> do_iter_write
    -> vfs_writev -> do_writev -> __arm64_sys_writev

The Annapurna watchdog then soft-rebooted the device. After patching,
no crash or kernel WARN was seen; the box ran sustained 1 Gbps
iperf3 -R without issue.

Fix: add a variadic `keepAlive ...interface{}` parameter to
rawWrite, and call runtime.KeepAlive on the iovec plus every
supplied root after the syscall returns. writeWithScratch now
passes its buffer + iovec; WriteGSO passes the iovec array, the
header buffer, and the payload fragment slice.

runtime.KeepAlive is a compiler directive, not a runtime barrier,
so the cost is effectively zero: it just forces the compiler's
liveness analysis to treat the object as used at that point.
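
A rough sketch of the shape of this fix, under stated assumptions:
rawWriteSketch, writeFragments, and demoPipe are hypothetical names,
syscall.Iovec is used in place of unix.Iovec so the example needs
only the standard library, and a pipe stands in for the tun fd
(Linux only):

```go
package main

import (
	"runtime"
	"syscall"
	"unsafe"
)

// rawWriteSketch issues writev(2) over iovs, then calls
// runtime.KeepAlive on the iovec slice and on every extra root the
// caller supplied, once the syscall has returned.
func rawWriteSketch(fd int, iovs []syscall.Iovec, keepAlive ...interface{}) (int, error) {
	if len(iovs) == 0 {
		return 0, nil
	}
	n, _, errno := syscall.Syscall(syscall.SYS_WRITEV, uintptr(fd),
		uintptr(unsafe.Pointer(&iovs[0])), uintptr(len(iovs)))

	runtime.KeepAlive(iovs)
	for _, root := range keepAlive {
		runtime.KeepAlive(root)
	}

	if errno != 0 {
		return 0, errno
	}
	return int(n), nil
}

// writeFragments builds one iovec entry for the header and one per
// non-empty fragment, then passes the header and the fragment slice
// as roots, loosely mirroring what the message says WriteGSO passes.
func writeFragments(fd int, hdr []byte, frags [][]byte) (int, error) {
	iovs := make([]syscall.Iovec, 0, 1+len(frags))
	add := func(b []byte) {
		if len(b) == 0 {
			return // skip empty buffers: &b[0] would panic
		}
		var v syscall.Iovec
		v.Base = &b[0]
		v.SetLen(len(b))
		iovs = append(iovs, v)
	}
	add(hdr)
	for _, f := range frags {
		add(f)
	}
	return rawWriteSketch(fd, iovs, hdr, frags)
}

// demoPipe writes a header plus two fragments into a pipe and reads
// the gathered bytes back.
func demoPipe() (string, error) {
	var p [2]int
	if err := syscall.Pipe(p[:]); err != nil {
		return "", err
	}
	defer syscall.Close(p[0])
	defer syscall.Close(p[1])

	frags := [][]byte{[]byte("ab"), []byte("cd")}
	if _, err := writeFragments(p[1], []byte("hdr:"), frags); err != nil {
		return "", err
	}
	buf := make([]byte, 16)
	n, err := syscall.Read(p[0], buf)
	if err != nil {
		return "", err
	}
	return string(buf[:n]), nil
}
```

demoPipe() should return "hdr:abcd". Taking the roots as a variadic
parameter lets each caller name whatever buffers its iovec points
into without rawWriteSketch needing to know their types.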
2026-04-24 21:40:51 +00:00
JackDoan
8fd724d762 fix? 2026-04-24 16:27:23 -05:00
JackDoan
90f2938f9c cruft 2026-04-23 13:12:24 -05:00
JackDoan
f76ac2e216 fix tests 2026-04-23 11:35:51 -05:00
JackDoan
4104a48a86 checksum speed 2026-04-21 17:07:50 -05:00
JackDoan
78af44068f typo! 2026-04-21 14:02:15 -05:00
JackDoan
ad6b918e4d checkpt 2026-04-21 13:31:16 -05:00