Commit Graph

41 Commits

Author SHA1 Message Date
Jack Doan
01909f4715 try to make certificate addition/removal reloadable in some cases (#1468)
* try to make certificate addition/removal reloadable in some cases

* very spicy change to respond to handshakes with cert versions we cannot match with a cert that we can indeed match

* even spicier change to rehandshake if we detect our cert is lower-version than our peer, and we have a newer-version cert available

* make tryRehandshake easier to understand
2025-11-03 19:38:44 -06:00
Jack Doan
932e329164 Don't delete static host mappings for non-primary IPs (#1464)
* Don't delete a vpnaddr if it's part of a certificate that contains a vpnaddr that's in the static host map

* remove unused arg from ConnectionManager.shouldSwapPrimary()
2025-09-04 14:49:40 -05:00
Nate Brown
52623820c2 Drop inactive tunnels (#1427) 2025-07-03 09:58:37 -05:00
brad-defined
94142aded5 Fix relay migration panic by covering every possible relay state (#1414) 2025-07-02 08:48:02 -04:00
John Maguire
d4a7df3083 Rename pki.default_version to pki.initiating_version (#1381)
Some checks failed
gofmt / Run gofmt (push) Successful in 9s
smoke-extra / Run extra smoke tests (push) Failing after 20s
smoke / Run multi node smoke test (push) Failing after 1m26s
Build and test / Build all and test on ubuntu-linux (push) Failing after 21m13s
Build and test / Build and test on linux with boringcrypto (push) Failing after 3m19s
Build and test / Build and test on linux with pkcs11 (push) Failing after 2m47s
Build and test / Build and test on macos-latest (push) Has been cancelled
Build and test / Build and test on windows-latest (push) Has been cancelled
2025-04-07 18:08:29 -04:00
Nate Brown
d97ed57a19 V2 certificate format (#1216)
Co-authored-by: Nate Brown <nbrown.us@gmail.com>
Co-authored-by: Jack Doan <jackdoan@rivian.com>
Co-authored-by: brad-defined <77982333+brad-defined@users.noreply.github.com>
Co-authored-by: Jack Doan <me@jackdoan.com>
2025-03-06 11:28:26 -06:00
Nate Brown
08ac65362e Cert interface (#1212) 2024-10-10 18:00:22 -05:00
Nate Brown
e264a0ff88 Switch most everything to netip in prep for ipv6 in the overlay (#1173) 2024-07-31 10:18:56 -05:00
Nate Brown
a390125935 Support reloading preferred_ranges (#1043) 2024-04-03 22:14:51 -05:00
Nate Brown
072edd56b3 Fix re-entrant GetOrHandshake issues (#1044) 2023-12-19 11:58:31 -06:00
Nate Brown
3356e03d85 Default pki.disconnect_invalid to true and make it reloadable (#859) 2023-11-13 12:39:38 -06:00
brad-defined
06b480e177 Fix relay migration (#964)
* Fix for relay migration on rehandshaking issue. On rehandshake, the relay tunnel doesn't migrate to the new hostinfo object correctly, due to an incorrect Nebula IP sent in the CreateRelayRequest message.
* Add a test for this case

---------

Co-authored-by: Nate Brown <nbrown.us@gmail.com>
2023-09-05 09:29:27 -04:00
Nate Brown
076ebc6c6e Simplify getting a hostinfo or starting a handshake with one (#954) 2023-08-21 18:51:45 -05:00
Nate Brown
7edcf620c0 We only need the certificate in ConnectionState (#953) 2023-08-21 14:11:06 -05:00
Nate Brown
5a131b2975 Combine ca, cert, and key handling (#952) 2023-08-14 21:32:40 -05:00
Wade Simmons
9a7ed57a3f Cache cert verification methods (#871)
* cache cert verification

CheckSignature and Verify are expensive methods, and certificates are
static. Cache the results.

* use atomics

* make sure public key bytes match

* add VerifyWithCache and ResetCache

* cleanup

* use VerifyWithCache

* doc
2023-05-17 10:14:26 -04:00
Nate Brown
d1f786419c Try rehandshaking a main hostinfo after releasing hostmap locks (#863) 2023-05-08 14:43:03 -05:00
Nate Brown
48eb63899f Have lighthouses ack updates to reduce test packet traffic (#851) 2023-05-05 14:44:03 -05:00
Nate Brown
702e1c59bd Always disconnect block listed hosts (#858) 2023-05-04 16:09:42 -05:00
Nate Brown
5fe8f45d05 Clear lighthouse cache for a vpn ip on a dead connection when its the final hostinfo (#857) 2023-05-04 15:42:12 -05:00
Nate Brown
03e4a7f988 Rehandshaking (#838)
Co-authored-by: Brad Higgins <brad@defined.net>
Co-authored-by: Wade Simmons <wadey@slack-corp.com>
2023-05-04 15:16:37 -05:00
Nate Brown
fd99ce9a71 Use fewer test packets (#840) 2023-04-04 13:42:24 -05:00
Nate Brown
ee8e1348e9 Use connection manager to drive NAT maintenance (#835)
Co-authored-by: brad-defined <77982333+brad-defined@users.noreply.github.com>
2023-03-31 15:45:05 -05:00
brad-defined
2801fb2286 Fix relay (#827)
Co-authored-by: Nate Brown <nbrown.us@gmail.com>
2023-03-30 11:09:20 -05:00
Ryan Huber
e28336c5db probes to the lh are not generally useful as recv_error should catch (#408) 2023-03-29 15:09:36 -05:00
Nate Brown
92cc32f844 Remove handshake race avoidance (#820)
Co-authored-by: Wade Simmons <wadey@slack-corp.com>
2023-03-13 12:35:14 -05:00
Nate Brown
a06977bbd5 Track connections by local index id instead of vpn ip (#807) 2023-02-13 14:41:05 -06:00
Nate Brown
5278b6f926 Generic timerwheel (#804) 2023-01-18 10:56:42 -06:00
brad-defined
813b64ffb1 Remove unused variables from connection manager (#677) 2022-11-15 20:33:09 -06:00
brad-defined
1a7c575011 Relay (#678)
Co-authored-by: Wade Simmons <wsimmons@slack-corp.com>
2022-06-21 13:35:23 -05:00
Don Stephan
332fa2b825 fix panic in handleInvalidCertificate (#675)
* fix panic in handleInvalidCertificate

when HandleMonitorTick fires, the hostmap can be nil which causes a panic to occur when trying to clean up the hostmap in handleInvalidCertificate. This fix just stops the invalidation from continuing if the hostmap doesn't exist.

* removed conditional for disconnectInvalid in HandleDeletionTick
2022-05-16 13:29:57 -04:00
Wade Simmons
b38bd36766 fix connection manager check when disconnect_invalid set (#658)
This restores the hostMap.QueryVpnIP block to how it looked before #370
was merged. I'm not sure why the patch from #370 wanted to continue on
if there was no match found in the hostmap, since there isn't anything
to do at that point (the tunnel has already been closed).

This was causing a crash because the handleInvalidCertificate check
expects the hostinfo to be passed in (but it is nil since there was no
hostinfo in the hostmap).

Fixes: #657
2022-04-04 13:38:36 -04:00
Nate Brown
bcabcfdaca Rework some things into packages (#489) 2021-11-03 20:54:04 -05:00
brad-defined
6ae8ba26f7 Add a context object in nebula.Main to clean up on error (#550) 2021-11-02 13:14:26 -05:00
Donatas Abraitis
32e2619323 Teardown tunnel automatically if peer's certificate expired (#370) 2021-10-20 13:23:33 -05:00
Nathan Brown
3ea7e1b75f Don't use a global logger (#423) 2021-03-26 09:46:30 -05:00
Wade Simmons
ee7c27093c add HostMap.RemoteIndexes (#329)
This change adds an index based on HostInfo.remoteIndexId. This allows
us to use HostMap.QueryReverseIndex without having to loop over all
entries in the map (this can be a bottleneck under high traffic
lighthouses).

Without this patch, a high traffic lighthouse server receiving recv_error
packets and lots of handshakes, cpu pprof trace can look like this:

      flat  flat%   sum%        cum   cum%
    2000ms 32.26% 32.26%     3040ms 49.03%  github.com/slackhq/nebula.(*HostMap).QueryReverseIndex
     870ms 14.03% 46.29%     1060ms 17.10%  runtime.mapiternext

Which shows 50% of total cpu time is being spent in QueryReverseIndex.
2020-11-23 14:51:16 -05:00
mhp
672ce1f0a8 Move slice allocations in connection manager monitor loop (#340)
* Move slice allocations in connection manager monitor loop

* move further out

Co-authored-by: Miran Park <mpark@slack-corp.com>
2020-11-19 15:44:05 -08:00
Wade Simmons
b4f2f7ce4e log certName alongside vpnIp (#200)
This change adds a new helper, `(*HostInfo).logger()`, that starts a new
logrus.Entry with `vpnIp` and `certName`. We don't use the helper inside
of handshake_ix though since the certificate has not been attached to
the HostInfo yet.

Fixes: #84
2020-04-06 11:34:00 -07:00
Ryan Huber
9333a8e3b7 subnet support 2019-12-12 16:34:17 +00:00
Slack Security Team
f22b4b584d Public Release 2019-11-19 17:00:20 +00:00