Commit Graph

65 Commits

Author SHA1 Message Date
JackDoan
c4deb5fc1c checkpt 2026-05-14 14:42:17 -05:00
JackDoan
697294a676 size arena to match batch size 2026-05-14 14:39:02 -05:00
JackDoan
13ebc1b343 batched tun interface 2026-05-14 14:37:35 -05:00
JackDoan
3b1e658bef change Queue.Read signature 2026-05-14 14:31:02 -05:00
JackDoan
37208b1d8f speed 2026-05-14 14:28:06 -05:00
JackDoan
7bb5cd477e new tun interface 2026-05-14 14:27:17 -05:00
JackDoan
c256f2cfbb use less ram pls 2026-05-11 17:30:56 -05:00
JackDoan
3be637bade clean up a comment a bit 2026-05-11 17:10:33 -05:00
JackDoan
5138321491 scoot pinning around 2026-05-11 11:32:57 -05:00
JackDoan
6a46a2913a GSO/GRO offloads, with TCP+ECN and UDP support 2026-05-11 11:32:57 -05:00
JackDoan
4b4331ba42 better and batched tun interface 2026-05-11 11:32:57 -05:00
Nate Brown
d0f02ba873 Switch to slog, remove logrus (#1672) 2026-04-27 09:41:47 -05:00
Nate Brown
8c50fc3f60 Plug the conntrack cache ticker leak and nebula-service log.Fatal calls (#1669) 2026-04-21 13:19:54 -05:00
Nate Brown
2f4532f102 No more dns globals, proper cleanup on shutdown (#1667) 2026-04-21 12:41:10 -05:00
Jack Doan
e80b9830a3 Remove more os.Exit calls and give a more reliable wait for stop function (attempt 3) (#1661) 2026-04-20 16:08:26 -05:00
Jack Doan
b3194236aa udp_linux: wrap socket operations with syscall.RawConn for clean teardown (#1654)
Some checks failed
gofmt / Run gofmt (push) Failing after 3s
smoke-extra / Run extra smoke tests (push) Failing after 2s
smoke / Run multi node smoke test (push) Failing after 3s
Build and test / Build all and test on ubuntu-linux (push) Failing after 3s
Build and test / Build and test on linux with boringcrypto (push) Failing after 2s
Build and test / Build and test on linux with pkcs11 (push) Failing after 2s
Build and test / Build and test on macos-latest (push) Has been cancelled
Build and test / Build and test on windows-latest (push) Has been cancelled
remove runtime.LockOSThread() because it makes things worse now

remove the "custom" Write() method from tun_linux.go, the stdlib path via os.File performs better

We should change our guidance around number of routines, ~2 per thread (that you wish to use for Nebula) seems to be about right now
2026-04-14 18:25:24 -05:00
Jack Doan
42bee7cf17 Report if Nebula start fails because of tun device name (#1588)
Some checks failed
gofmt / Run gofmt (push) Failing after 2s
smoke-extra / Run extra smoke tests (push) Failing after 2s
smoke / Run multi node smoke test (push) Failing after 2s
Build and test / Build all and test on ubuntu-linux (push) Failing after 2s
Build and test / Build and test on linux with boringcrypto (push) Failing after 2s
Build and test / Build and test on linux with pkcs11 (push) Failing after 2s
Build and test / Build and test on macos-latest (push) Has been cancelled
Build and test / Build and test on windows-latest (push) Has been cancelled
* specifically report if nebula start fails because of tun device name

* close all routines when closing the tun
2026-01-28 10:03:36 -06:00
Nate Brown
1283ff0db4 Add option to control accepting recv_error (#1569) 2026-01-13 00:00:27 -06:00
Nate Brown
56067afca2 Stab at better logging when a relay is being used (#1533)
Some checks failed
gofmt / Run gofmt (push) Failing after 5s
smoke-extra / Run extra smoke tests (push) Failing after 2s
smoke / Run multi node smoke test (push) Failing after 3s
Build and test / Build all and test on ubuntu-linux (push) Failing after 2s
Build and test / Build and test on linux with boringcrypto (push) Failing after 3s
Build and test / Build and test on linux with pkcs11 (push) Failing after 2s
Build and test / Build and test on macos-latest (push) Has been cancelled
Build and test / Build and test on windows-latest (push) Has been cancelled
2025-12-03 17:48:29 -06:00
Nate Brown
7aff313a17 Relax the restriction on routines from the config (#1531) 2025-11-19 13:10:11 -06:00
Nate Brown
52623820c2 Drop inactive tunnels (#1427) 2025-07-03 09:58:37 -05:00
Wade Simmons
b8ea55eb90 optimize usage of bart (#1395)
Some checks failed
gofmt / Run gofmt (push) Successful in 9s
smoke-extra / Run extra smoke tests (push) Failing after 19s
smoke / Run multi node smoke test (push) Failing after 1m19s
Build and test / Build all and test on ubuntu-linux (push) Failing after 18m41s
Build and test / Build and test on linux with boringcrypto (push) Failing after 2m47s
Build and test / Build and test on linux with pkcs11 (push) Failing after 2m47s
Build and test / Build and test on macos-latest (push) Has been cancelled
Build and test / Build and test on windows-latest (push) Has been cancelled
Use `bart.Lite` and `.Contains` as suggested by the bart maintainer:

- 9455952eed (commitcomment-155362580)
2025-04-18 12:37:20 -04:00
John Maguire
d4a7df3083 Rename pki.default_version to pki.initiating_version (#1381)
Some checks failed
gofmt / Run gofmt (push) Successful in 9s
smoke-extra / Run extra smoke tests (push) Failing after 20s
smoke / Run multi node smoke test (push) Failing after 1m26s
Build and test / Build all and test on ubuntu-linux (push) Failing after 21m13s
Build and test / Build and test on linux with boringcrypto (push) Failing after 3m19s
Build and test / Build and test on linux with pkcs11 (push) Failing after 2m47s
Build and test / Build and test on macos-latest (push) Has been cancelled
Build and test / Build and test on windows-latest (push) Has been cancelled
2025-04-07 18:08:29 -04:00
Nate Brown
d97ed57a19 V2 certificate format (#1216)
Co-authored-by: Nate Brown <nbrown.us@gmail.com>
Co-authored-by: Jack Doan <jackdoan@rivian.com>
Co-authored-by: brad-defined <77982333+brad-defined@users.noreply.github.com>
Co-authored-by: Jack Doan <me@jackdoan.com>
2025-03-06 11:28:26 -06:00
Nate Brown
08ac65362e Cert interface (#1212) 2024-10-10 18:00:22 -05:00
Nate Brown
e264a0ff88 Switch most everything to netip in prep for ipv6 in the overlay (#1173) 2024-07-31 10:18:56 -05:00
Ben Ritcey
01cddb8013 Added firewall.rules.hash metric (#1010)
* Added firewall.rules.hash metric

Added a FNV-1 hash of the firewall rules as a Prometheus value.

* Switch FNV has to int64, include both hashes in log messages

* Use a uint32 for the FNV hash

Let go-metrics cast the uint32 to a int64, so it won't be lossy
when it eventually emits a float64 Prometheus metric.
2023-11-28 11:56:47 -05:00
Nate Brown
3356e03d85 Default pki.disconnect_invalid to true and make it reloadable (#859) 2023-11-13 12:39:38 -06:00
Nate Brown
5a131b2975 Combine ca, cert, and key handling (#952) 2023-08-14 21:32:40 -05:00
Nate Brown
223cc6e660 Limit how often a busy tunnel can requery the lighthouse (#940)
Co-authored-by: Wade Simmons <wadey@slack-corp.com>
2023-08-08 13:26:41 -05:00
Nate Brown
a3e59a38ef Use registered io on Windows when possible (#905) 2023-07-10 12:43:48 -05:00
Nate Brown
3bbf5f4e67 Use an interface for udp conns (#901) 2023-06-14 10:48:52 -05:00
Nate Brown
03e4a7f988 Rehandshaking (#838)
Co-authored-by: Brad Higgins <brad@defined.net>
Co-authored-by: Wade Simmons <wadey@slack-corp.com>
2023-05-04 15:16:37 -05:00
Wade Simmons
0b67b19771 add boringcrypto Makefile targets (#856)
This adds a few build targets to compile with `GOEXPERIMENT=boringcrypto`:

- `bin-boringcrypto`
- `release-boringcrypto`

It also adds a field to the intial start up log indicating if
boringcrypto is enabled in the binary.
2023-05-04 15:42:45 -04:00
brad-defined
9b03053191 update EncReader and EncWriter interface function args to have concrete types (#844)
* Update LightHouseHandlerFunc to remove EncWriter param.
* Move EncWriter to interface
* EncReader, too
2023-04-07 14:28:37 -04:00
Wade Simmons
6685856b5d emit certificate.expiration_ttl_seconds metric (#782) 2023-04-03 20:18:16 -05:00
Nate Brown
ee8e1348e9 Use connection manager to drive NAT maintenance (#835)
Co-authored-by: brad-defined <77982333+brad-defined@users.noreply.github.com>
2023-03-31 15:45:05 -05:00
Nate Brown
6b3d42efa5 Use atomic.Pointer for certState (#833) 2023-03-30 13:04:09 -05:00
Wade Simmons
9af242dc47 switch to new sync/atomic helpers in go1.19 (#728)
These new helpers make the code a lot cleaner. I confirmed that the
simple helpers like `atomic.Int64` don't add any extra overhead as they
get inlined by the compiler. `atomic.Pointer` adds an extra method call
as it no longer gets inlined, but we aren't using these on the hot path
so it is probably okay.
2022-10-31 13:37:41 -04:00
Wade Simmons
7b9287709c add listen.send_recv_error config option (#670)
By default, Nebula replies to packets it has no tunnel for with a `recv_error` packet. This packet helps speed up re-connection
in the case that Nebula on either side did not shut down cleanly. This response can be abused as a way to discover if Nebula is running
on a host though. This option lets you configure if you want to send `recv_error` packets always, never, or only to private network remotes.
valid values: always, never, private

This setting is reloadable with SIGHUP.
2022-06-27 12:37:54 -04:00
brad-defined
1a7c575011 Relay (#678)
Co-authored-by: Wade Simmons <wsimmons@slack-corp.com>
2022-06-21 13:35:23 -05:00
Nate Brown
78d0d46bae Remove WriteRaw, cidrTree -> routeTree to better describe its purpose, remove redundancy from field names (#582) 2021-11-12 12:47:09 -06:00
Nate Brown
88ce0edf76 Start the overlay package with the old Inside interface (#576) 2021-11-10 21:52:26 -06:00
CzBiX
16be0ce566 Add Wintun support (#289) 2021-11-08 12:36:31 -06:00
Nate Brown
bcabcfdaca Rework some things into packages (#489) 2021-11-03 20:54:04 -05:00
brad-defined
6ae8ba26f7 Add a context object in nebula.Main to clean up on error (#550) 2021-11-02 13:14:26 -05:00
Donatas Abraitis
32e2619323 Teardown tunnel automatically if peer's certificate expired (#370) 2021-10-20 13:23:33 -05:00
Wade Simmons
44cb697552 Add more metrics (#450)
* Add more metrics

This change adds the following counter metrics:

Metrics to track packets dropped at the firewall:

    firewall.dropped.local_ip
    firewall.dropped.remote_ip
    firewall.dropped.no_rule

Metrics to track handshakes attempts that have been initiated and ones
that have timed out (ones that have completed are tracked by the
existing "handshakes" histogram).

    handshake_manager.initiated
    handshake_manager.timed_out

Metrics to track when cached_packets are dropped because we run out of
buffer space, and how many are sent once the handshake completes.

    hostinfo.cached_packets.dropped
    hostinfo.cached_packets.sent

This change also notes how many cached packets we have when we log the
final "Handshake received" message for either stage1 for stage2.

* separate incoming/outgoing metrics

* remove "allowed" firewall metrics

We don't need this on the hotpath, they aren't worh it.

* don't need pointers here
2021-04-27 22:23:18 -04:00
brad-defined
17106f83a0 Ensure the Nebula device exists before attempting to bind to the Nebula IP (#375) 2021-04-16 10:34:28 -05:00
Nathan Brown
64d8e5aa96 More LH cleanup (#429) 2021-04-01 10:23:31 -05:00