The scatter-gather writev path in WriteGSO triggered a kernel-side
use-after-free in tun_chr_write_iter → sock_alloc_send_pskb →
skb_set_owner_w on Linux 4.19 TUN when the virtio_net_hdr requested
TSO segmentation. The skb write-memory refcount (sk_wmem_alloc)
underflowed, producing paired traces of refcount_t: addition on 0
(in the write path) and refcount_t: underflow (in the paired recv
socket), reliably rebooting UBIOS UXG-Pro routers under iperf3 -R.
Match wireguard-go's design: coalesce the virtio_net_hdr, IP/TCP
header, and all payload fragments into a single contiguous per-queue
scratch buffer, then emit the superpacket with a single write()
syscall. wireguard-go's offload path handles GRO-merged TSO
superpackets this way and has no equivalent failure mode (see
tun/tun_linux.go Write — it writes bufs[bufsI][offset:] with a
single tunFile.Write call after coalesce).
Cost: one extra memcpy per superpacket (bounded at ~64KiB by the
virtio spec).
Unit tests pass (go test ./overlay/tio/...). Field testing on
UXG-Pro (4.19) pending.
Remove runtime.LockOSThread() because it now makes things worse.
Remove the "custom" Write() method from tun_linux.go; the stdlib path via os.File performs better.
We should change our guidance on the number of routines: about 2 per thread (that you wish to use for Nebula) now seems to be about right.
The recent merge of cert-v2 support introduced the ability to tunnel IPv6. However, FreeBSD's IPv6 tunneling does not work, for 2 reasons:
* The ifconfig commands did not work for IPv6 addresses
* The tunnel device was not configured for link-layer mode, so it only supported IPv4
This PR improves FreeBSD tunneling support in 3 ways:
* Use ioctl instead of exec'ing ifconfig to configure the interface, with additional logic to support IPv6
* Configure the tunnel in link-layer mode, allowing IPv6 traffic
* Use readv() and writev() to communicate with the tunnel device, to avoid the need to copy the packet buffer
We switched to yaml.v3 with #1148, but missed this spot that was still
casting into `map[any]any` when yaml.v3 makes it `map[string]any`. Also
clean up a few more `interface{}` that were added as we changed them all
to `any` with #1148.
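To illustrate why the stale `map[any]any` assertion breaks, here is a small sketch that walks nested maps shaped the way yaml.v3 decodes string-keyed mappings. The `lookup` helper is hypothetical and is not part of the config package; the maps are built by hand so the example needs no YAML library.

```go
package main

import "fmt"

// lookup walks nested map[string]any values along path and returns the
// value found there. Hypothetical helper, for illustration only.
func lookup(root any, path ...string) (any, bool) {
	cur := root
	for _, key := range path {
		m, ok := cur.(map[string]any)
		if !ok {
			// A map[any]any (the yaml.v2 shape) lands here.
			return nil, false
		}
		cur, ok = m[key]
		if !ok {
			return nil, false
		}
	}
	return cur, true
}

func main() {
	cfg := map[string]any{
		"firewall": map[string]any{"outbound_action": "drop"},
	}
	v, ok := lookup(cfg, "firewall", "outbound_action")
	fmt.Println(v, ok)

	// The old shape no longer satisfies the type assertion.
	_, ok = lookup(map[any]any{"firewall": 1}, "firewall")
	fmt.Println(ok)
}
```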
* upgrade to yaml.v3
The main nice fix here is that maps unmarshal into `map[string]any`
instead of `map[any]any`, so it cleans things up a bit.
* add config.AsBool
Since yaml.v3 no longer automatically converts `yes` to a bool, this
helper keeps backwards compatibility.
* use type aliases for m
* more cleanup
* more cleanup
* more cleanup
* go mod cleanup
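The `config.AsBool` helper mentioned above could look roughly like the sketch below. This is an assumption about its shape, not the actual implementation: the name `asBool`, the exact set of accepted strings, and the case-insensitive matching are all illustrative.

```go
package main

import (
	"fmt"
	"strings"
)

// asBool converts a decoded YAML value to a bool. Real bools pass through;
// strings are matched against YAML 1.1 style spellings that yaml.v3 leaves
// as plain strings. The accepted spellings are an assumption.
func asBool(v any) (value bool, ok bool) {
	switch x := v.(type) {
	case bool:
		return x, true
	case string:
		switch strings.ToLower(x) {
		case "y", "yes", "on", "true":
			return true, true
		case "n", "no", "off", "false":
			return false, true
		}
	}
	return false, false
}

func main() {
	for _, v := range []any{true, "yes", "No", "maybe", 1} {
		b, ok := asBool(v)
		fmt.Printf("%v -> %v (ok=%v)\n", v, b, ok)
	}
}
```

Returning a second `ok` value lets callers fall back to a default when the setting holds something that is not a recognizable boolean.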
* firewall: add option to send REJECT replies
This change allows you to configure the firewall to send REJECT packets
when a packet is denied.
    firewall:
      # Action to take when a packet is not allowed by the firewall rules.
      # Can be one of:
      #   `drop` (default): silently drop the packet.
      #   `reject`: send a reject reply.
      #     - For TCP, this will be a RST "Connection Reset" packet.
      #     - For other protocols, this will be an ICMP port unreachable packet.
      outbound_action: drop
      inbound_action: drop
These packets are only sent to established tunnels, and only on the
overlay network (currently IPv4 only).
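The choice of reply described above can be sketched as a small dispatch on the configured action and the IP protocol of the denied packet. The type and function names are hypothetical, and treating an empty action as `drop` is an assumption about how the default is applied; the checks for an established tunnel and for IPv4 are left out.

```go
package main

import "fmt"

const protoTCP = 6 // IANA protocol number for TCP

// reply names the kind of packet sent back for a denied packet.
type reply string

const (
	replyNone                reply = "none"
	replyTCPReset            reply = "tcp-rst"
	replyICMPPortUnreachable reply = "icmp-port-unreachable"
)

// replyFor maps a firewall action and IP protocol number to a reply kind.
// Hypothetical helper, for illustration only.
func replyFor(action string, ipProto uint8) (reply, error) {
	switch action {
	case "", "drop": // assumption: an unset action means the default, drop
		return replyNone, nil
	case "reject":
		if ipProto == protoTCP {
			return replyTCPReset, nil
		}
		return replyICMPPortUnreachable, nil
	default:
		return replyNone, fmt.Errorf("unknown firewall action %q", action)
	}
}

func main() {
	for _, proto := range []uint8{6, 17, 1} { // TCP, UDP, ICMP
		r, err := replyFor("reject", proto)
		fmt.Println(proto, r, err)
	}
}
```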
    $ ping -c1 192.168.100.3
    PING 192.168.100.3 (192.168.100.3) 56(84) bytes of data.
    From 192.168.100.3 icmp_seq=2 Destination Port Unreachable
    --- 192.168.100.3 ping statistics ---
    2 packets transmitted, 0 received, +1 errors, 100% packet loss, time 31ms
    $ nc -nzv 192.168.100.3 22
    (UNKNOWN) [192.168.100.3] 22 (?) : Connection refused
This change also modifies the smoke test to capture tcpdump pcaps from
both the inside and outside to inspect what is going on over the wire.
It also now does TCP and UDP packet tests using the Nmap version of
ncat.
* calculate seq and ack the same way as the kernel
The logic is a bit confusing, so we copy it straight from how the kernel
implements iptables `--reject-with tcp-reset`:
- https://github.com/torvalds/linux/blob/v5.19/net/ipv4/netfilter/nf_reject_ipv4.c#L193-L221
* cleanup
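The seq and ack calculation for the TCP reset mentioned above can be sketched as follows, mirroring the branch in the linked kernel function: if the offending segment carried an ACK, the reset takes its sequence number from that acknowledgement; otherwise the reset acknowledges everything the segment consumed. The struct and function names are hypothetical and this is not the code from the change itself.

```go
package main

import "fmt"

// tcpSegment holds only the fields of the denied segment that the reset
// calculation needs. Hypothetical type, for illustration only.
type tcpSegment struct {
	Seq, Ack      uint32
	SYN, FIN, ACK bool
	PayloadLen    uint32 // bytes after the IP and TCP headers
}

// rstFields are the sequence-related fields of the reset to send back.
type rstFields struct {
	Seq, Ack uint32
	ACK      bool
}

// resetFor computes the reset's seq/ack for a denied segment.
func resetFor(in tcpSegment) rstFields {
	if in.ACK {
		// Reply from where the peer says it expects us to be.
		return rstFields{Seq: in.Ack}
	}
	// No ACK on the segment: acknowledge what it consumed. SYN and FIN
	// each take one sequence number. uint32 arithmetic wraps naturally.
	ack := in.Seq + in.PayloadLen
	if in.SYN {
		ack++
	}
	if in.FIN {
		ack++
	}
	return rstFields{Ack: ack, ACK: true}
}

func main() {
	fmt.Printf("%+v\n", resetFor(tcpSegment{Seq: 1000, SYN: true}))
	fmt.Printf("%+v\n", resetFor(tcpSegment{Seq: 2000, Ack: 5000, ACK: true, PayloadLen: 10}))
}
```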