nebula

mirror of https://github.com/slackhq/nebula.git synced 2025-11-09 09:13:58 +01:00

Author	SHA1	Message	Date
Wade Simmons	1c9fdba403	Merge remote-tracking branch 'origin/master' into mutex-debug	2025-04-02 09:22:18 -04:00
Nate Brown	d97ed57a19	V2 certificate format (#1216 ) Co-authored-by: Nate Brown <nbrown.us@gmail.com> Co-authored-by: Jack Doan <jackdoan@rivian.com> Co-authored-by: brad-defined <77982333+brad-defined@users.noreply.github.com> Co-authored-by: Jack Doan <me@jackdoan.com>	2025-03-06 11:28:26 -06:00
Nate Brown	08ac65362e	Cert interface (#1212 )	2024-10-10 18:00:22 -05:00
Nate Brown	e264a0ff88	Switch most everything to netip in prep for ipv6 in the overlay (#1173 )	2024-07-31 10:18:56 -05:00
Wade Simmons	0ccfad1a1e	Merge remote-tracking branch 'origin/master' into mutex-debug	2024-04-11 12:15:52 -04:00
Nate Brown	a390125935	Support reloading preferred_ranges (#1043 )	2024-04-03 22:14:51 -05:00
Wade Simmons	91ec6bb1ff	Merge remote-tracking branch 'origin/master' into mutex-debug	2023-12-19 13:30:40 -05:00
Nate Brown	072edd56b3	Fix re-entrant `GetOrHandshake` issues (#1044 )	2023-12-19 11:58:31 -06:00
Wade Simmons	6f27f46965	simplify	2023-12-19 09:10:00 -05:00
Wade Simmons	bcaefce4ac	more types	2023-12-18 22:38:52 -05:00
Wade Simmons	fdb78044ba	Merge remote-tracking branch 'origin/master' into mutex-debug	2023-12-17 09:19:48 -05:00
Nate Brown	5181cb0474	Use generics for CIDRTrees to avoid casting issues (#1004 )	2023-11-02 17:05:08 -05:00
Nate Brown	a44e1b8b05	Clean up a hostinfo to reduce memory usage (#955 )	2023-11-02 16:53:59 -05:00
Nate Brown	076ebc6c6e	Simplify getting a hostinfo or starting a handshake with one (#954 )	2023-08-21 18:51:45 -05:00
Wade Simmons	4c89b3c6a3	cleanup	2023-08-21 13:09:25 -04:00
Wade Simmons	5cc43ea9cd	Merge branch 'master' into mutex-debug	2023-08-21 12:42:36 -04:00
Nate Brown	223cc6e660	Limit how often a busy tunnel can requery the lighthouse (#940 ) Co-authored-by: Wade Simmons <wadey@slack-corp.com>	2023-08-08 13:26:41 -05:00
Nate Brown	a10baeee92	Pull hostmap and pending hostmap apart, remove unused functions (#843 )	2023-07-24 12:37:52 -05:00
Wade Simmons	9105eba939	also validate hostinfo locks	2023-05-09 11:22:55 -04:00
Wade Simmons	e6eeef785e	mutex_debug experimental test to see if we can have a test mode that verifies mutexes lock in the order we want, while having no hit on production performance. Since this uses a build tag, it should all compile out during the build process and be a no-op unless the tag is set.	2023-05-08 11:17:14 -04:00
Nate Brown	03e4a7f988	Rehandshaking (#838 ) Co-authored-by: Brad Higgins <brad@defined.net> Co-authored-by: Wade Simmons <wadey@slack-corp.com>	2023-05-04 15:16:37 -05:00
Nate Brown	ee8e1348e9	Use connection manager to drive NAT maintenance (#835 ) Co-authored-by: brad-defined <77982333+brad-defined@users.noreply.github.com>	2023-03-31 15:45:05 -05:00
brad-defined	2801fb2286	Fix relay (#827 ) Co-authored-by: Nate Brown <nbrown.us@gmail.com>	2023-03-30 11:09:20 -05:00
Nate Brown	f0ef80500d	Remove dead code and re-order transit from pending to main hostmap on stage 2 (#828 )	2023-03-17 15:36:24 -05:00
Wade Simmons	e1af37e46d	add calculated_remotes (#759 ) * add calculated_remotes This setting allows us to "guess" what the remote might be for a host while we wait for the lighthouse response. For networks that were designed with this in mind, it can help speed up handshake performance, as well as improve resiliency in the case that all lighthouses are down. Example: lighthouse: # ... calculated_remotes: # For any Nebula IPs in 10.0.10.0/24, this will apply the mask and add # the calculated IP as an initial remote (while we wait for the response # from the lighthouse). Both CIDRs must have the same mask size. # For example, Nebula IP 10.0.10.123 will have a calculated remote of # 192.168.1.123 10.0.10.0/24: - mask: 192.168.1.0/24 port: 4242 * figure out what is up with this test * add test * better logic for sending handshakes Keep track of the last list of hosts we sent handshakes to. Only log handshake sent messages if the list has changed. Remove the test Test_NewHandshakeManagerTrigger because it is faulty and makes no sense. It relies on the fact that no handshake packets actually get sent, but with these changes we would send packets now (which it should!) * use atomic.Pointer * cleanup to make it clearer * fix typo in example	2023-03-13 15:09:08 -04:00
Nate Brown	92cc32f844	Remove handshake race avoidance (#820 ) Co-authored-by: Wade Simmons <wadey@slack-corp.com>	2023-03-13 12:35:14 -05:00
Nate Brown	a06977bbd5	Track connections by local index id instead of vpn ip (#807 )	2023-02-13 14:41:05 -06:00
Wade Simmons	9af242dc47	switch to new sync/atomic helpers in go1.19 (#728 ) These new helpers make the code a lot cleaner. I confirmed that the simple helpers like `atomic.Int64` don't add any extra overhead as they get inlined by the compiler. `atomic.Pointer` adds an extra method call as it no longer gets inlined, but we aren't using these on the hot path so it is probably okay.	2022-10-31 13:37:41 -04:00
brad-defined	1a7c575011	Relay (#678 ) Co-authored-by: Wade Simmons <wsimmons@slack-corp.com>	2022-06-21 13:35:23 -05:00
Wade Simmons	949ec78653	don't set ConnectionState to nil (#590 ) * don't set ConnectionState to nil We might have packets processing in another thread, so we can't safely just set this to nil. Since we removed it from the hostmaps, the next packets to process should start the handshake over again. I believe this comment is outdated or incorrect, since the next handshake will start over with a new HostInfo, I don't think there is any way a counter reuse could happen: > We must null the connectionstate or a counter reuse may happen Here is a panic we saw that I think is related: panic: runtime error: invalid memory address or nil pointer dereference [signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x93a037] goroutine 59 [running, locked to thread]: github.com/slackhq/nebula.(*Firewall).Drop(...) github.com/slackhq/nebula/firewall.go:380 github.com/slackhq/nebula.(*Interface).consumeInsidePacket(...) github.com/slackhq/nebula/inside.go:59 github.com/slackhq/nebula.(*Interface).listenIn(...) github.com/slackhq/nebula/interface.go:233 created by github.com/slackhq/nebula.(*Interface).run github.com/slackhq/nebula/interface.go:191 * use closeTunnel	2021-12-06 14:09:05 -05:00
Nate Brown	467e605d5e	Push route handling into overlay, a few more nits fixed (#581 )	2021-11-12 11:19:28 -06:00
Nate Brown	e07524a654	Move all of tun into overlay (#577 )	2021-11-11 16:37:29 -06:00
Wade Simmons	304b12f63f	create ConnectionState before adding to HostMap (#535 ) We have a few small race conditions with creating the HostInfo.ConnectionState since we add the host info to the pendingHostMap before we set this field. We can make everything a lot easier if we just add an "init" function so that we can set this field in the hostinfo before we add it to the hostmap.	2021-11-08 14:46:22 -05:00
Nate Brown	bcabcfdaca	Rework some things into packages (#489 )	2021-11-03 20:54:04 -05:00
brad-defined	6ae8ba26f7	Add a context object in nebula.Main to clean up on error (#550 )	2021-11-02 13:14:26 -05:00
Wade Simmons	ea2c186a77	remote_allow_ranges: allow inside CIDR specific remote_allow_lists (#540 ) This allows you to configure remote allow lists specific to different subnets of the inside CIDR. Example: remote_allow_ranges: 10.42.42.0/24: 192.168.0.0/16: true This would only allow hosts with a VPN IP in the 10.42.42.0/24 range to have private IPs (and thus don't connect over public IPs). The PR also refactors AllowList into RemoteAllowList and LocalAllowList to make it clearer which methods are allowed on which allow list.	2021-10-19 10:54:30 -04:00
Wade Simmons	ae5505bc74	handshake: update to preferred remote (#532 ) If we receive a handshake packet for a tunnel that has already been completed, check to see if the new remote is preferred. If so, update to the preferred remote and send a test packet to influence the other side to do the same.	2021-10-19 10:53:55 -04:00
Nate Brown	d004fae4f9	Unlock the hostmap quickly, lock hostinfo instead (#459 )	2021-05-05 13:10:55 -05:00
Wade Simmons	44cb697552	Add more metrics (#450 ) * Add more metrics This change adds the following counter metrics: Metrics to track packets dropped at the firewall: firewall.dropped.local_ip firewall.dropped.remote_ip firewall.dropped.no_rule Metrics to track handshake attempts that have been initiated and ones that have timed out (ones that have completed are tracked by the existing "handshakes" histogram). handshake_manager.initiated handshake_manager.timed_out Metrics to track when cached_packets are dropped because we run out of buffer space, and how many are sent once the handshake completes. hostinfo.cached_packets.dropped hostinfo.cached_packets.sent This change also notes how many cached packets we have when we log the final "Handshake received" message for either stage1 or stage2. * separate incoming/outgoing metrics * remove "allowed" firewall metrics We don't need this on the hotpath, they aren't worth it. * don't need pointers here	2021-04-27 22:23:18 -04:00
Nathan Brown	db23fdf9bc	Dont apply race avoidance to existing handshakes, use the handshake time to determine who wins (#451 ) Co-authored-by: Wade Simmons <wadey@slack-corp.com>	2021-04-27 21:15:34 -05:00
Nathan Brown	710df6a876	Refactor remotes and handshaking to give every address a fair shot (#437 )	2021-04-14 13:50:09 -05:00
Nathan Brown	480036fbc8	Remove unused structs in hostmap.go (#430 )	2021-04-01 22:07:11 -05:00
Nathan Brown	0c2e5973e1	Simple lie test (#427 )	2021-03-31 10:26:35 -05:00
Wade Simmons	4603b5b2dd	fix PromoteEvery check (#424 ) This check was accidentally typo'd in #396 from `%` to `&`. Restore the correct functionality here (we want to do the check every "PromoteEvery" count packets).	2021-03-26 15:01:05 -04:00
Nathan Brown	3ea7e1b75f	Don't use a global logger (#423 )	2021-03-26 09:46:30 -05:00
Nathan Brown	7a9f9dbded	Don't craft buffers if we don't need them (#416 )	2021-03-22 18:25:06 -05:00
Nathan Brown	7073d204a8	IPv6 support for outside (udp) (#369 )	2021-03-18 20:37:24 -05:00
Wade Simmons	6c55d67f18	Refactor handshake_ix (#401 ) There are some subtle race conditions with the previous handshake_ix implementation, mostly around collisions with localIndexId. This change refactors it so that we have a "commit" phase during the handshake where we grab the lock for the hostmap and ensure that we have a unique local index before storing it. We also now avoid using the pending hostmap at all for receiving stage1 packets, since we have everything we need to just store the completed handshake. Co-authored-by: Nate Brown <nbrown.us@gmail.com> Co-authored-by: Ryan Huber <rhuber@gmail.com> Co-authored-by: forfuncsake <drussell@slack-corp.com>	2021-03-12 14:16:25 -05:00
Wade Simmons	d604270966	Fix most known data races (#396 ) This change fixes all of the known data races that `make smoke-docker-race` finds, except for one. Most of these races are around the handshake phase for a hostinfo, so we add a RWLock to the hostinfo and Lock during each of the handshake stages. Some of the other races are around consistently using `atomic` around the `messageCounter` field. To make this harder to mess up, I have renamed the field to `atomicMessageCounter` (I also removed the unnecessary extra pointer dereference as we can just point directly to the struct field). The last remaining data race is around reading `ConnectionInfo.ready`, which is a boolean that is only written to once when the handshake has finished. Due to it being in the hot path for packets and the rare case that this could actually be an issue, holding off on fixing that one for now. Here are the results of `make smoke-docker-race`: before: lighthouse1: Found 2 data race(s) host2: Found 36 data race(s) host3: Found 17 data race(s) host4: Found 31 data race(s) after: host2: Found 1 data race(s) host4: Found 1 data race(s) Fixes: #147 Fixes: #226 Fixes: #283 Fixes: #316	2021-03-05 21:18:33 -05:00
Nathan Brown	b6234abfb3	Add a way to trigger punch backs via lighthouse (#394 )	2021-03-01 19:06:01 -06:00

61 Commits