Commit Graph

66 Commits

Author SHA1 Message Date
JackDoan d50c3028a2 broken checkpt 2026-05-14 15:56:34 -05:00
JackDoan c4deb5fc1c checkpt 2026-05-14 14:42:17 -05:00
JackDoan 697294a676 size arena to match batch size 2026-05-14 14:39:02 -05:00
JackDoan 13ebc1b343 batched tun interface 2026-05-14 14:37:35 -05:00
JackDoan 3b1e658bef change Queue.Read signature 2026-05-14 14:31:02 -05:00
JackDoan 37208b1d8f speed 2026-05-14 14:28:06 -05:00
JackDoan 7bb5cd477e new tun interface 2026-05-14 14:27:17 -05:00
JackDoan c256f2cfbb use less ram pls 2026-05-11 17:30:56 -05:00
JackDoan 3be637bade clean up a comment a bit 2026-05-11 17:10:33 -05:00
JackDoan 5138321491 scoot pinning around 2026-05-11 11:32:57 -05:00
JackDoan 6a46a2913a GSO/GRO offloads, with TCP+ECN and UDP support 2026-05-11 11:32:57 -05:00
JackDoan 4b4331ba42 better and batched tun interface 2026-05-11 11:32:57 -05:00
Nate Brown d0f02ba873 Switch to slog, remove logrus (#1672) 2026-04-27 09:41:47 -05:00
Nate Brown 8c50fc3f60 Plug the conntrack cache ticker leak and nebula-service log.Fatal calls (#1669) 2026-04-21 13:19:54 -05:00
Nate Brown 2f4532f102 No more dns globals, proper cleanup on shutdown (#1667) 2026-04-21 12:41:10 -05:00
Jack Doan e80b9830a3 Remove more os.Exit calls and give a more reliable wait for stop function (attempt 3) (#1661) 2026-04-20 16:08:26 -05:00
Jack Doan b3194236aa udp_linux: wrap socket operations with syscall.RawConn for clean teardown (#1654)
remove runtime.LockOSThread() because it makes things worse now

remove the "custom" Write() method from tun_linux.go, the stdlib path via os.File performs better

We should change our guidance around the number of routines: ~2 per thread (that you wish to use for Nebula) seems to be about right now
2026-04-14 18:25:24 -05:00
Jack Doan 42bee7cf17 Report if Nebula start fails because of tun device name (#1588)
* specifically report if nebula start fails because of tun device name

* close all routines when closing the tun
2026-01-28 10:03:36 -06:00
Nate Brown 1283ff0db4 Add option to control accepting recv_error (#1569) 2026-01-13 00:00:27 -06:00
Nate Brown 56067afca2 Stab at better logging when a relay is being used (#1533)
2025-12-03 17:48:29 -06:00
Nate Brown 7aff313a17 Relax the restriction on routines from the config (#1531) 2025-11-19 13:10:11 -06:00
Nate Brown 52623820c2 Drop inactive tunnels (#1427) 2025-07-03 09:58:37 -05:00
Wade Simmons b8ea55eb90 optimize usage of bart (#1395)
Use `bart.Lite` and `.Contains` as suggested by the bart maintainer:

- https://github.com/gaissmai/bart/commit/9455952eedcf59a6e755fc28ed16e906fa4f3066#commitcomment-155362580
2025-04-18 12:37:20 -04:00
John Maguire d4a7df3083 Rename pki.default_version to pki.initiating_version (#1381)
2025-04-07 18:08:29 -04:00
Nate Brown d97ed57a19 V2 certificate format (#1216)
Co-authored-by: Nate Brown <nbrown.us@gmail.com>
Co-authored-by: Jack Doan <jackdoan@rivian.com>
Co-authored-by: brad-defined <77982333+brad-defined@users.noreply.github.com>
Co-authored-by: Jack Doan <me@jackdoan.com>
2025-03-06 11:28:26 -06:00
Nate Brown 08ac65362e Cert interface (#1212) 2024-10-10 18:00:22 -05:00
Nate Brown e264a0ff88 Switch most everything to netip in prep for ipv6 in the overlay (#1173) 2024-07-31 10:18:56 -05:00
Ben Ritcey 01cddb8013 Added firewall.rules.hash metric (#1010)
* Added firewall.rules.hash metric

Added a FNV-1 hash of the firewall rules as a Prometheus value.

* Switch FNV hash to int64, include both hashes in log messages

* Use a uint32 for the FNV hash

Let go-metrics cast the uint32 to an int64, so it won't be lossy
when it eventually emits a float64 Prometheus metric.
2023-11-28 11:56:47 -05:00
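The reasoning in the firewall.rules.hash commit above (a 32-bit FNV-1 value survives the trip through int64 and float64 unchanged) can be illustrated with a minimal Go sketch. The helper names `hashRules` and `roundTrips`, the rule strings, and the way rules are serialized before hashing are all assumptions made for illustration; they are not taken from the Nebula source.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// hashRules is a hypothetical helper: it feeds a textual rendering of each
// rule into a 32-bit FNV-1 hash (fnv.New32 is FNV-1; fnv.New32a is FNV-1a).
func hashRules(rules []string) uint32 {
	h := fnv.New32()
	for _, r := range rules {
		h.Write([]byte(r))
		h.Write([]byte{0}) // separator between rules (an assumption of this sketch)
	}
	return h.Sum32()
}

// roundTrips reports whether v is unchanged after uint32 -> int64 -> float64.
// float64 has a 53-bit significand, so every 32-bit value is representable.
func roundTrips(v uint32) bool {
	asInt := int64(v)
	asFloat := float64(asInt)
	return uint32(asFloat) == v
}

func main() {
	sum := hashRules([]string{"allow tcp 443 any", "allow icmp any"})
	fmt.Println(sum, roundTrips(sum))
}
```

A full 64-bit hash would not have this property, since values above 2^53 cannot all be represented exactly as float64, which is presumably why the commit settles on a uint32.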
Nate Brown 3356e03d85 Default pki.disconnect_invalid to true and make it reloadable (#859) 2023-11-13 12:39:38 -06:00
Nate Brown 5a131b2975 Combine ca, cert, and key handling (#952) 2023-08-14 21:32:40 -05:00
Nate Brown 223cc6e660 Limit how often a busy tunnel can requery the lighthouse (#940)
Co-authored-by: Wade Simmons <wadey@slack-corp.com>
2023-08-08 13:26:41 -05:00
Nate Brown a3e59a38ef Use registered io on Windows when possible (#905) 2023-07-10 12:43:48 -05:00
Nate Brown 3bbf5f4e67 Use an interface for udp conns (#901) 2023-06-14 10:48:52 -05:00
Nate Brown 03e4a7f988 Rehandshaking (#838)
Co-authored-by: Brad Higgins <brad@defined.net>
Co-authored-by: Wade Simmons <wadey@slack-corp.com>
2023-05-04 15:16:37 -05:00
Wade Simmons 0b67b19771 add boringcrypto Makefile targets (#856)
This adds a few build targets to compile with `GOEXPERIMENT=boringcrypto`:

- `bin-boringcrypto`
- `release-boringcrypto`

It also adds a field to the initial startup log indicating if
boringcrypto is enabled in the binary.
2023-05-04 15:42:45 -04:00
brad-defined 9b03053191 update EncReader and EncWriter interface function args to have concrete types (#844)
* Update LightHouseHandlerFunc to remove EncWriter param.
* Move EncWriter to interface
* EncReader, too
2023-04-07 14:28:37 -04:00
Wade Simmons 6685856b5d emit certificate.expiration_ttl_seconds metric (#782) 2023-04-03 20:18:16 -05:00
Nate Brown ee8e1348e9 Use connection manager to drive NAT maintenance (#835)
Co-authored-by: brad-defined <77982333+brad-defined@users.noreply.github.com>
2023-03-31 15:45:05 -05:00
Nate Brown 6b3d42efa5 Use atomic.Pointer for certState (#833) 2023-03-30 13:04:09 -05:00
Wade Simmons 9af242dc47 switch to new sync/atomic helpers in go1.19 (#728)
These new helpers make the code a lot cleaner. I confirmed that the
simple helpers like `atomic.Int64` don't add any extra overhead as they
get inlined by the compiler. `atomic.Pointer` adds an extra method call
as it no longer gets inlined, but we aren't using these on the hot path
so it is probably okay.
2022-10-31 13:37:41 -04:00
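For readers unfamiliar with the Go 1.19 helpers named in the commit above, here is a minimal sketch of `atomic.Int64` and `atomic.Pointer[T]` in use. The `tunnelStats` and `certState` types and the `swapCert` method are hypothetical stand-ins, not Nebula's actual definitions.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// certState is a hypothetical payload type for this sketch.
type certState struct{ version int }

// tunnelStats is a hypothetical struct using the typed helpers from Go 1.19.
// The zero value is ready to use; no explicit initialization is required.
type tunnelStats struct {
	packets atomic.Int64
	cert    atomic.Pointer[certState]
}

// swapCert atomically installs next and returns the previous pointer
// (nil if nothing was stored yet).
func (s *tunnelStats) swapCert(next *certState) *certState {
	return s.cert.Swap(next)
}

func main() {
	var s tunnelStats
	s.packets.Add(2)
	s.packets.Add(3)
	old := s.swapCert(&certState{version: 2})
	fmt.Println(s.packets.Load(), old == nil, s.cert.Load().version)
}
```

Compared with the older function-style API (`atomic.AddInt64(&x, 1)`, `atomic.LoadPointer` with `unsafe.Pointer`), the typed helpers make it impossible to access the field non-atomically by accident, which matches the "a lot cleaner" claim in the commit message.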
Wade Simmons 7b9287709c add listen.send_recv_error config option (#670)
By default, Nebula replies to packets it has no tunnel for with a `recv_error` packet. This packet helps speed up re-connection
in the case that Nebula on either side did not shut down cleanly. This response can be abused as a way to discover if Nebula is running
on a host though. This option lets you configure if you want to send `recv_error` packets always, never, or only to private network remotes.
valid values: always, never, private

This setting is reloadable with SIGHUP.
2022-06-27 12:37:54 -04:00
brad-defined 1a7c575011 Relay (#678)
Co-authored-by: Wade Simmons <wsimmons@slack-corp.com>
2022-06-21 13:35:23 -05:00
Nate Brown 78d0d46bae Remove WriteRaw, cidrTree -> routeTree to better describe its purpose, remove redundancy from field names (#582) 2021-11-12 12:47:09 -06:00
Nate Brown 88ce0edf76 Start the overlay package with the old Inside interface (#576) 2021-11-10 21:52:26 -06:00
CzBiX 16be0ce566 Add Wintun support (#289) 2021-11-08 12:36:31 -06:00
Nate Brown bcabcfdaca Rework some things into packages (#489) 2021-11-03 20:54:04 -05:00
brad-defined 6ae8ba26f7 Add a context object in nebula.Main to clean up on error (#550) 2021-11-02 13:14:26 -05:00
Donatas Abraitis 32e2619323 Teardown tunnel automatically if peer's certificate expired (#370) 2021-10-20 13:23:33 -05:00
Wade Simmons 44cb697552 Add more metrics (#450)
* Add more metrics

This change adds the following counter metrics:

Metrics to track packets dropped at the firewall:

    firewall.dropped.local_ip
    firewall.dropped.remote_ip
    firewall.dropped.no_rule

Metrics to track handshake attempts that have been initiated and ones
that have timed out (ones that have completed are tracked by the
existing "handshakes" histogram).

    handshake_manager.initiated
    handshake_manager.timed_out

Metrics to track when cached_packets are dropped because we run out of
buffer space, and how many are sent once the handshake completes.

    hostinfo.cached_packets.dropped
    hostinfo.cached_packets.sent

This change also notes how many cached packets we have when we log the
final "Handshake received" message for either stage1 or stage2.

* separate incoming/outgoing metrics

* remove "allowed" firewall metrics

We don't need these on the hot path; they aren't worth it.

* don't need pointers here
2021-04-27 22:23:18 -04:00
brad-defined 17106f83a0 Ensure the Nebula device exists before attempting to bind to the Nebula IP (#375) 2021-04-16 10:34:28 -05:00