mirror of
https://github.com/slackhq/nebula.git
synced 2026-05-16 04:47:38 +02:00
rawWrite passed the iovec pointer to syscall.Syscall as a uintptr. The
garbage collector does not treat a uintptr as a reference, so nothing
kept the underlying []unix.Iovec (or the payload slices its Base
pointers reach) rooted across the syscall. Under heavy sustained write
load, GC could collect or move these before tun_chr_write_iter finished
reading them, at which point the kernel read freed memory.
Observed on a UniFi UXG-Pro (Annapurna Labs Alpine V2, arm64, Linux
4.19.152) forwarding 1 Gbps iperf3 -R between LAN and a remote
Nebula peer, as two paired kernel warnings in the same second:

    refcount_t: underflow; use-after-free
      sock_wfree -> skb_release_head_state -> kfree_skb
      -> skb_release_data -> __kfree_skb -> tcp_recvmsg ...

    refcount_t: addition on 0; use-after-free
      skb_set_owner_w -> sock_alloc_send_pskb
      -> tun_get_user -> tun_chr_write_iter -> do_iter_write
      -> vfs_writev -> do_writev -> __arm64_sys_writev

The Annapurna watchdog then soft-rebooted the device. After patching,
no crash or kernel WARN was seen; the box ran sustained 1 Gbps
iperf3 -R without issue.
Fix: add a variadic `keepAlive ...interface{}` parameter to
rawWrite, and call runtime.KeepAlive on the iovec plus every
supplied root after the syscall returns. writeWithScratch now
passes its buffer + iovec; WriteGSO passes the iovec array, the
header buffer, and the payload fragment slice.
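
The shape of this fix can be sketched as follows. This is a minimal
illustration, not nebula's code: rawWritev, gatherWrite and roundTrip are
hypothetical names, the sketch uses the standard library's syscall.Iovec
instead of unix.Iovec, and it writes to a pipe rather than a TUN device.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"runtime"
	"syscall"
	"unsafe"
)

// rawWritev is a hypothetical stand-in for the rawWrite described above.
// The kernel only receives integers, so the function keeps the iovec array
// and every extra root reachable until the syscall has returned.
func rawWritev(fd int, iovs []syscall.Iovec, keepAlive ...interface{}) (int, error) {
	if len(iovs) == 0 {
		return 0, nil
	}
	n, _, errno := syscall.Syscall(syscall.SYS_WRITEV,
		uintptr(fd),
		uintptr(unsafe.Pointer(&iovs[0])),
		uintptr(len(iovs)))
	// Explicit liveness markers, mirroring the commit: the iovec array
	// first, then each caller-supplied root.
	runtime.KeepAlive(iovs)
	for _, root := range keepAlive {
		runtime.KeepAlive(root)
	}
	if errno != 0 {
		return 0, errno
	}
	return int(n), nil
}

// gatherWrite sends header and payload with a single writev call and passes
// both buffers as extra roots.
func gatherWrite(fd int, header, payload []byte) (int, error) {
	iovs := make([]syscall.Iovec, 0, 2)
	for _, b := range [][]byte{header, payload} {
		if len(b) == 0 {
			continue
		}
		iov := syscall.Iovec{Base: &b[0]}
		iov.SetLen(len(b))
		iovs = append(iovs, iov)
	}
	return rawWritev(fd, iovs, header, payload)
}

// roundTrip writes through a pipe and reads the bytes back, so the sketch
// can be exercised without a TUN device.
func roundTrip(header, payload []byte) (string, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return "", err
	}
	defer r.Close()
	n, err := gatherWrite(int(w.Fd()), header, payload)
	w.Close() // also keeps w alive until the write has finished
	if err != nil {
		return "", err
	}
	buf := make([]byte, n)
	if _, err := io.ReadFull(r, buf); err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	out, err := roundTrip([]byte("hdr:"), []byte("payload"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

In this sketch the pointer conversion sits inside the Syscall call
expression, a pattern Go's unsafe.Pointer rules already cover for the
iovec array itself. The KeepAlive calls make the lifetime explicit and
still hold if the uintptr is computed earlier.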
runtime.KeepAlive is handled by the compiler as a liveness marker, not
a runtime barrier, so the call itself costs effectively nothing: it
just forces the compiler's liveness analysis to treat the object as
used at that point.
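
The liveness effect can be observed with a finalizer. The sketch below is
an illustration only, with hypothetical names (buffer,
collectedBeforeKeepAlive): it forces several GC cycles while a later
runtime.KeepAlive still references the object, and checks that the
finalizer has not run by then.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

// buffer stands in for an object whose memory is handed to the kernel.
type buffer struct{ data [64]byte }

// collectedBeforeKeepAlive reports whether the finalizer attached to buf ran
// before the runtime.KeepAlive call that references buf.
func collectedBeforeKeepAlive() bool {
	var finalized int32
	buf := new(buffer)
	runtime.SetFinalizer(buf, func(*buffer) { atomic.StoreInt32(&finalized, 1) })

	// Without the KeepAlive below, buf would have no further use after
	// SetFinalizer and the collector would be free to reclaim it here.
	for i := 0; i < 3; i++ {
		runtime.GC()
		time.Sleep(5 * time.Millisecond) // give a queued finalizer time to run
	}
	ran := atomic.LoadInt32(&finalized) == 1

	runtime.KeepAlive(buf) // buf counts as reachable up to this point
	return ran
}

func main() {
	fmt.Println("finalized before KeepAlive:", collectedBeforeKeepAlive())
}
```

Removing the KeepAlive line does not guarantee that the finalizer runs
early; it only removes the guarantee that it cannot.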