Commit Graph

44 Commits

Author SHA1 Message Date
Jack Doan 2e9117da5b fix tunnels that could permanently escape connection-manager monitoring (#1752)
2026-06-10 11:03:23 -05:00
Nate Brown 213dd46588 Stop leaking goroutines past Control.Stop, consolidate punching in Punchy (#1708) 2026-05-06 16:21:16 -05:00
Nate Brown d0f02ba873 Switch to slog, remove logrus (#1672) 2026-04-27 09:41:47 -05:00
Jack Doan 01909f4715 try to make certificate addition/removal reloadable in some cases (#1468)
* try to make certificate addition/removal reloadable in some cases

* very spicy change to respond to handshakes with cert versions we cannot match with a cert that we can indeed match

* even spicier change to rehandshake if we detect our cert is lower-version than our peer, and we have a newer-version cert available

* make tryRehandshake easier to understand
2025-11-03 19:38:44 -06:00
Jack Doan 932e329164 Don't delete static host mappings for non-primary IPs (#1464)
* Don't delete a vpnaddr if it's part of a certificate that contains a vpnaddr that's in the static host map

* remove unused arg from ConnectionManager.shouldSwapPrimary()
2025-09-04 14:49:40 -05:00
Nate Brown 52623820c2 Drop inactive tunnels (#1427) 2025-07-03 09:58:37 -05:00
brad-defined 94142aded5 Fix relay migration panic by covering every possible relay state (#1414) 2025-07-02 08:48:02 -04:00
John Maguire d4a7df3083 Rename pki.default_version to pki.initiating_version (#1381)
2025-04-07 18:08:29 -04:00
Nate Brown d97ed57a19 V2 certificate format (#1216)
Co-authored-by: Nate Brown <nbrown.us@gmail.com>
Co-authored-by: Jack Doan <jackdoan@rivian.com>
Co-authored-by: brad-defined <77982333+brad-defined@users.noreply.github.com>
Co-authored-by: Jack Doan <me@jackdoan.com>
2025-03-06 11:28:26 -06:00
Nate Brown 08ac65362e Cert interface (#1212) 2024-10-10 18:00:22 -05:00
Nate Brown e264a0ff88 Switch most everything to netip in prep for ipv6 in the overlay (#1173) 2024-07-31 10:18:56 -05:00
Nate Brown a390125935 Support reloading preferred_ranges (#1043) 2024-04-03 22:14:51 -05:00
Nate Brown 072edd56b3 Fix re-entrant GetOrHandshake issues (#1044) 2023-12-19 11:58:31 -06:00
Nate Brown 3356e03d85 Default pki.disconnect_invalid to true and make it reloadable (#859) 2023-11-13 12:39:38 -06:00
brad-defined 06b480e177 Fix relay migration (#964)
* Fix for relay migration on rehandshaking issue. On rehandshake, the relay tunnel doesn't migrate to the new hostinfo object correctly, due to an incorrect Nebula IP sent in the CreateRelayRequest message.
* Add a test for this case

---------

Co-authored-by: Nate Brown <nbrown.us@gmail.com>
2023-09-05 09:29:27 -04:00
Nate Brown 076ebc6c6e Simplify getting a hostinfo or starting a handshake with one (#954) 2023-08-21 18:51:45 -05:00
Nate Brown 7edcf620c0 We only need the certificate in ConnectionState (#953) 2023-08-21 14:11:06 -05:00
Nate Brown 5a131b2975 Combine ca, cert, and key handling (#952) 2023-08-14 21:32:40 -05:00
Wade Simmons 9a7ed57a3f Cache cert verification methods (#871)
* cache cert verification

CheckSignature and Verify are expensive methods, and certificates are
static. Cache the results.

* use atomics

* make sure public key bytes match

* add VerifyWithCache and ResetCache

* cleanup

* use VerifyWithCache

* doc
2023-05-17 10:14:26 -04:00
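
The caching idea in the commit above (#871) can be sketched roughly as follows. This is a minimal illustration, not the code from that commit: the `cert` type, its `signerKey` field, and the lowercase method names are hypothetical stand-ins, and the "expensive" check is simulated with a byte comparison. It assumes Go 1.19 or later for `atomic.Pointer`. The cached result is only reused when the public key bytes match, as the commit message describes.

```go
package main

import (
	"bytes"
	"sync/atomic"
)

// cert is a hypothetical stand-in for a certificate whose signature check
// is expensive and whose contents never change after creation.
type cert struct {
	signerKey   []byte                 // hypothetical: the key that signed this cert
	checks      atomic.Int64           // counts how often the expensive path runs
	verifiedKey atomic.Pointer[[]byte] // key bytes that last passed the check
}

// checkSignature simulates the expensive verification.
func (c *cert) checkSignature(key []byte) bool {
	c.checks.Add(1)
	return bytes.Equal(key, c.signerKey)
}

// checkSignatureWithCache skips the expensive check when the same key bytes
// have already been verified against this certificate.
func (c *cert) checkSignatureWithCache(key []byte) bool {
	if cached := c.verifiedKey.Load(); cached != nil && bytes.Equal(*cached, key) {
		return true
	}
	if !c.checkSignature(key) {
		return false
	}
	// Store a private copy so later changes to the caller's slice
	// cannot alter the cached key.
	k := append([]byte(nil), key...)
	c.verifiedKey.Store(&k)
	return true
}

// resetCache forgets the cached result, forcing the next call to re-verify.
func (c *cert) resetCache() {
	c.verifiedKey.Store(nil)
}
```

Only successful checks are cached in this sketch, so a failed verification is always re-run on the next call.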
Nate Brown d1f786419c Try rehandshaking a main hostinfo after releasing hostmap locks (#863) 2023-05-08 14:43:03 -05:00
Nate Brown 48eb63899f Have lighthouses ack updates to reduce test packet traffic (#851) 2023-05-05 14:44:03 -05:00
Nate Brown 702e1c59bd Always disconnect block listed hosts (#858) 2023-05-04 16:09:42 -05:00
Nate Brown 5fe8f45d05 Clear lighthouse cache for a vpn ip on a dead connection when its the final hostinfo (#857) 2023-05-04 15:42:12 -05:00
Nate Brown 03e4a7f988 Rehandshaking (#838)
Co-authored-by: Brad Higgins <brad@defined.net>
Co-authored-by: Wade Simmons <wadey@slack-corp.com>
2023-05-04 15:16:37 -05:00
Nate Brown fd99ce9a71 Use fewer test packets (#840) 2023-04-04 13:42:24 -05:00
Nate Brown ee8e1348e9 Use connection manager to drive NAT maintenance (#835)
Co-authored-by: brad-defined <77982333+brad-defined@users.noreply.github.com>
2023-03-31 15:45:05 -05:00
brad-defined 2801fb2286 Fix relay (#827)
Co-authored-by: Nate Brown <nbrown.us@gmail.com>
2023-03-30 11:09:20 -05:00
Ryan Huber e28336c5db probes to the lh are not generally useful as recv_error should catch (#408) 2023-03-29 15:09:36 -05:00
Nate Brown 92cc32f844 Remove handshake race avoidance (#820)
Co-authored-by: Wade Simmons <wadey@slack-corp.com>
2023-03-13 12:35:14 -05:00
Nate Brown a06977bbd5 Track connections by local index id instead of vpn ip (#807) 2023-02-13 14:41:05 -06:00
Nate Brown 5278b6f926 Generic timerwheel (#804) 2023-01-18 10:56:42 -06:00
brad-defined 813b64ffb1 Remove unused variables from connection manager (#677) 2022-11-15 20:33:09 -06:00
brad-defined 1a7c575011 Relay (#678)
Co-authored-by: Wade Simmons <wsimmons@slack-corp.com>
2022-06-21 13:35:23 -05:00
Don Stephan 332fa2b825 fix panic in handleInvalidCertificate (#675)
* fix panic in handleInvalidCertificate

when HandleMonitorTick fires, the hostmap can be nil which causes a panic to occur when trying to clean up the hostmap in handleInvalidCertificate. This fix just stops the invalidation from continuing if the hostmap doesn't exist.

* removed conditional for disconnectInvalid in HandleDeletionTick
2022-05-16 13:29:57 -04:00
Wade Simmons b38bd36766 fix connection manager check when disconnect_invalid set (#658)
This restores the hostMap.QueryVpnIP block to how it looked before #370
was merged. I'm not sure why the patch from #370 wanted to continue on
if there was no match found in the hostmap, since there isn't anything
to do at that point (the tunnel has already been closed).

This was causing a crash because the handleInvalidCertificate check
expects the hostinfo to be passed in (but it is nil since there was no
hostinfo in the hostmap).

Fixes: #657
2022-04-04 13:38:36 -04:00
Nate Brown bcabcfdaca Rework some things into packages (#489) 2021-11-03 20:54:04 -05:00
brad-defined 6ae8ba26f7 Add a context object in nebula.Main to clean up on error (#550) 2021-11-02 13:14:26 -05:00
Donatas Abraitis 32e2619323 Teardown tunnel automatically if peer's certificate expired (#370) 2021-10-20 13:23:33 -05:00
Nathan Brown 3ea7e1b75f Don't use a global logger (#423) 2021-03-26 09:46:30 -05:00
Wade Simmons ee7c27093c add HostMap.RemoteIndexes (#329)
This change adds an index based on HostInfo.remoteIndexId. This allows
us to use HostMap.QueryReverseIndex without having to loop over all
entries in the map (this can be a bottleneck under high traffic
lighthouses).

Without this patch, a high traffic lighthouse server receiving recv_error
packets and lots of handshakes, cpu pprof trace can look like this:

      flat  flat%   sum%        cum   cum%
    2000ms 32.26% 32.26%     3040ms 49.03%  github.com/slackhq/nebula.(*HostMap).QueryReverseIndex
     870ms 14.03% 46.29%     1060ms 17.10%  runtime.mapiternext

Which shows 50% of total cpu time is being spent in QueryReverseIndex.
2020-11-23 14:51:16 -05:00
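
The change described in the commit above (#329) replaces a scan over every host with a lookup in a secondary index. A minimal sketch of that pattern follows. The types and method names are simplified, hypothetical stand-ins rather than the real `HostMap` API, and locking is omitted for brevity.

```go
package main

// hostInfo is a hypothetical stand-in for a tunnel record. The field name
// follows the remoteIndexId mentioned in the commit message.
type hostInfo struct {
	vpnIP         string
	remoteIndexID uint32
}

// hostMap keeps the primary map plus a secondary index keyed by remote index.
type hostMap struct {
	hosts         map[string]*hostInfo
	remoteIndexes map[uint32]*hostInfo
}

func newHostMap() *hostMap {
	return &hostMap{
		hosts:         make(map[string]*hostInfo),
		remoteIndexes: make(map[uint32]*hostInfo),
	}
}

// add registers the host in both maps so they stay consistent.
func (hm *hostMap) add(h *hostInfo) {
	hm.hosts[h.vpnIP] = h
	hm.remoteIndexes[h.remoteIndexID] = h
}

// remove deletes the host from both maps.
func (hm *hostMap) remove(vpnIP string) {
	h, ok := hm.hosts[vpnIP]
	if !ok {
		return
	}
	delete(hm.hosts, vpnIP)
	// Only drop the index entry if it still points at this host.
	if hm.remoteIndexes[h.remoteIndexID] == h {
		delete(hm.remoteIndexes, h.remoteIndexID)
	}
}

// queryReverseIndexScan is the pre-index approach: O(n) over all hosts.
func (hm *hostMap) queryReverseIndexScan(id uint32) *hostInfo {
	for _, h := range hm.hosts {
		if h.remoteIndexID == id {
			return h
		}
	}
	return nil
}

// queryReverseIndex uses the secondary index: a single map lookup.
func (hm *hostMap) queryReverseIndex(id uint32) *hostInfo {
	return hm.remoteIndexes[id]
}
```

The cost of the index is that every insert and delete must update two maps. In exchange, the lookup no longer iterates the whole map, which is the `runtime.mapiternext` time visible in the profile quoted in the commit message.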
mhp 672ce1f0a8 Move slice allocations in connection manager monitor loop (#340)
* Move slice allocations in connection manager monitor loop

* move further out

Co-authored-by: Miran Park <mpark@slack-corp.com>
2020-11-19 15:44:05 -08:00
Wade Simmons b4f2f7ce4e log certName alongside vpnIp (#200)
This change adds a new helper, `(*HostInfo).logger()`, that starts a new
logrus.Entry with `vpnIp` and `certName`. We don't use the helper inside
of handshake_ix though since the certificate has not been attached to
the HostInfo yet.

Fixes: #84
2020-04-06 11:34:00 -07:00
Ryan Huber 9333a8e3b7 subnet support 2019-12-12 16:34:17 +00:00
Slack Security Team f22b4b584d Public Release 2019-11-19 17:00:20 +00:00