Commit Graph

74 Commits

Author SHA1 Message Date
Nate Brown
9877648da9 Drop inactive tunnels (#1413) 2025-06-23 11:32:50 -05:00
Jack Doan
248cf194cd fix integer wraparound in the calculation of handshake timeouts on 32-bit targets (#1185)
Fixes: #1169
2024-08-13 09:25:18 -04:00
Nate Brown
e264a0ff88 Switch most everything to netip in prep for ipv6 in the overlay (#1173) 2024-07-31 10:18:56 -05:00
Nate Brown
a390125935 Support reloading preferred_ranges (#1043) 2024-04-03 22:14:51 -05:00
Wade Simmons
0564d0a2cf when listen.port is zero, fix multiple routines (#1057)
This used to work correctly because when the multiple routines work was
first added in #382, but an important part to discover the listen port
before opening the other listeners on the same socket was lost in this
PR: #653.

This change should fix the regression and allow multiple routines to
work correctly when listen.port is set to `0`.

Thanks to @rawdigits for tracking down and discovering this regression.
2024-01-08 13:49:44 -05:00
Ben Ritcey
01cddb8013 Added firewall.rules.hash metric (#1010)
* Added firewall.rules.hash metric

Added a FNV-1 hash of the firewall rules as a Prometheus value.

* Switch FNV has to int64, include both hashes in log messages

* Use a uint32 for the FNV hash

Let go-metrics cast the uint32 to a int64, so it won't be lossy
when it eventually emits a float64 Prometheus metric.
2023-11-28 11:56:47 -05:00
Tristan Rice
1083279a45 add gvisor based service library (#965)
* add service/ library
2023-11-21 11:50:18 -05:00
Nate Brown
3356e03d85 Default pki.disconnect_invalid to true and make it reloadable (#859) 2023-11-13 12:39:38 -06:00
Lars Lehtonen
77a8ce1712 main: fix dropped error (#1002)
This isn't an actual issue because the current implementation of NewSSHServer never returns an error (https://github.com/slackhq/nebula/blob/v1.7.2/sshd/server.go#L56), but still good to fix so no surprises happen in the future.
2023-10-31 10:32:08 -04:00
Nate Brown
076ebc6c6e Simplify getting a hostinfo or starting a handshake with one (#954) 2023-08-21 18:51:45 -05:00
Nate Brown
5a131b2975 Combine ca, cert, and key handling (#952) 2023-08-14 21:32:40 -05:00
Nate Brown
223cc6e660 Limit how often a busy tunnel can requery the lighthouse (#940)
Co-authored-by: Wade Simmons <wadey@slack-corp.com>
2023-08-08 13:26:41 -05:00
Caleb Jasik
ed00f5d530 Remove unused config code (last edited 4yrs ago) (#938) 2023-07-31 15:59:20 -05:00
Nate Brown
14d0106716 Send the lh update worker into its own routine instead of taking over the reload routine (#935) 2023-07-27 14:38:10 -05:00
Nate Brown
a10baeee92 Pull hostmap and pending hostmap apart, remove unused functions (#843) 2023-07-24 12:37:52 -05:00
Nate Brown
3bbf5f4e67 Use an interface for udp conns (#901) 2023-06-14 10:48:52 -05:00
brad-defined
bd9cc01d62 Dns static lookerupper (#796)
* Support lighthouse DNS names, and regularly resolve the name in a background goroutine to discover DNS updates.
2023-05-09 11:22:08 -04:00
Nate Brown
3cb4e0ef57 Allow listen.host to contain names (#825) 2023-04-05 11:29:26 -05:00
Nate Brown
ee8e1348e9 Use connection manager to drive NAT maintenance (#835)
Co-authored-by: brad-defined <77982333+brad-defined@users.noreply.github.com>
2023-03-31 15:45:05 -05:00
Tricia
0fc4d8192f log network as String to match the other log event in interface.go that emits network (#811)
Co-authored-by: Tricia Bogen <tbogen@slack-corp.com>
2023-01-23 14:05:35 -05:00
Jon Rafkind
c2259f14a7 explicitly reload config from ssh command (#725) 2022-08-08 12:44:09 -05:00
Wade Simmons
7b9287709c add listen.send_recv_error config option (#670)
By default, Nebula replies to packets it has no tunnel for with a `recv_error` packet. This packet helps speed up re-connection
in the case that Nebula on either side did not shut down cleanly. This response can be abused as a way to discover if Nebula is running
on a host though. This option lets you configure if you want to send `recv_error` packets always, never, or only to private network remotes.
valid values: always, never, private

This setting is reloadable with SIGHUP.
2022-06-27 12:37:54 -04:00
brad-defined
1a7c575011 Relay (#678)
Co-authored-by: Wade Simmons <wsimmons@slack-corp.com>
2022-06-21 13:35:23 -05:00
brad-defined
03498a0cb2 Make nebula advertise its dynamic port to lighthouses (#653) 2022-03-15 18:03:56 -05:00
Nate Brown
312a01dc09 Lighthouse reload support (#649)
Co-authored-by: John Maguire <contact@johnmaguire.me>
2022-03-14 12:35:13 -05:00
Wade Simmons
befce3f990 fix crash with -test (#602)
When running in `-test` mode, `tun` is set to nil. So we should move the
defer into the `!configTest` if block.

    panic: runtime error: invalid memory address or nil pointer dereference
    [signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0x54855c]

    goroutine 1 [running]:
    github.com/slackhq/nebula.Main.func3(0x4000135e80, {0x0, 0x0})
            github.com/slackhq/nebula/main.go:176 +0x2c
    github.com/slackhq/nebula.Main(0x400022e060, 0x1, {0x76faa0, 0x5}, 0x4000230000, 0x0)
            github.com/slackhq/nebula/main.go:316 +0x2414
    main.main()
            github.com/slackhq/nebula/cmd/nebula/main.go:54 +0x540
2021-12-06 14:06:16 -05:00
Nate Brown
48c47f5841 Warn if no lighthouses were configured on a non lighthouse node (#587) 2021-11-30 10:31:33 -06:00
Nate Brown
467e605d5e Push route handling into overlay, a few more nits fixed (#581) 2021-11-12 11:19:28 -06:00
Nate Brown
e07524a654 Move all of tun into overlay (#577) 2021-11-11 16:37:29 -06:00
Nate Brown
88ce0edf76 Start the overlay package with the old Inside interface (#576) 2021-11-10 21:52:26 -06:00
Nate Brown
4453964e34 Move util to test, contextual errors to util (#575) 2021-11-10 21:47:38 -06:00
Nate Brown
bcabcfdaca Rework some things into packages (#489) 2021-11-03 20:54:04 -05:00
brad-defined
6ae8ba26f7 Add a context object in nebula.Main to clean up on error (#550) 2021-11-02 13:14:26 -05:00
Donatas Abraitis
32e2619323 Teardown tunnel automatically if peer's certificate expired (#370) 2021-10-20 13:23:33 -05:00
Wade Simmons
ea2c186a77 remote_allow_ranges: allow inside CIDR specific remote_allow_lists (#540)
This allows you to configure remote allow lists specific to different
subnets of the inside CIDR. Example:

    remote_allow_ranges:
      10.42.42.0/24:
        192.168.0.0/16: true

This would only allow hosts with a VPN IP in the 10.42.42.0/24 range to
have private IPs (and thus don't connect over public IPs).

The PR also refactors AllowList into RemoteAllowList and LocalAllowList to make it clearer which methods are allowed on which allow list.
2021-10-19 10:54:30 -04:00
brad-defined
7859140711 Only set serveDns if the host is also configured to be a lighthouse. (#433) 2021-04-16 13:33:56 -05:00
brad-defined
17106f83a0 Ensure the Nebula device exists before attempting to bind to the Nebula IP (#375) 2021-04-16 10:34:28 -05:00
Nathan Brown
710df6a876 Refactor remotes and handshaking to give every address a fair shot (#437) 2021-04-14 13:50:09 -05:00
Nathan Brown
1499be3e40 Fix name resolution for host names in config (#431) 2021-04-01 21:48:41 -05:00
Nathan Brown
64d8e5aa96 More LH cleanup (#429) 2021-04-01 10:23:31 -05:00
Nathan Brown
883e09a392 Don't use a global ca pool (#426) 2021-03-29 12:10:19 -05:00
Wade Simmons
a71541fb0b export build version as a prometheus label (#405)
This is how Prometheus recommends you do it, and how they do it
themselves in their client. This makes it easy to see which versions you
have deployed in your fleet, and query over it too.
2021-03-26 14:16:35 -04:00
Nathan Brown
3ea7e1b75f Don't use a global logger (#423) 2021-03-26 09:46:30 -05:00
Nathan Brown
7073d204a8 IPv6 support for outside (udp) (#369) 2021-03-18 20:37:24 -05:00
Ryan Huber
73a5ed90b2 Do not allow someone to run a nebula lighthouse with an ephemeral port (#399)
* Do not allow someone to run a nebula lighthouse with an ephemeral port

* derp - we discover the port so we have to check the config setting

* No context needed for this error

* gofmt yourself

* Revert "gofmt yourself"

This reverts commit c01423498e.

* Revert "No context needed for this error"

This reverts commit 6792af6846.

* snip snap snip snap
2021-03-08 12:42:06 -08:00
Nathan Brown
b6234abfb3 Add a way to trigger punch backs via lighthouse (#394) 2021-03-01 19:06:01 -06:00
Wade Simmons
2a4beb41b9 Routine-local conntrack cache (#391)
Previously, every packet we see gets a lock on the conntrack table and updates it. When running with multiple routines, this can cause heavy lock contention and limit our ability for the threads to run independently. This change caches reads from the conntrack table for a very short period of time to reduce this lock contention. This cache will currently default to disabled unless you are running with multiple routines, in which case the default cache delay will be 1 second. This means that entries in the conntrack table may be up to 1 second out of date and remain in a routine local cache for up to 1 second longer than the global table.

Instead of calling time.Now() for every packet, this cache system relies on a tick thread that updates the current cache "version" each tick. Every packet we check if the cache version is out of date, and reset the cache if so.
2021-03-01 19:52:17 -05:00
Wade Simmons
a0583ebdca tun_disabled: reply to ICMP Echo Request (#342)
This change allows a server running with `tun.disabled: true` (usually
a lighthouse) to still reply to ICMP EchoRequest packets. This allows
you to "ping" the lighthouse Nebula IP as a quick check to make sure the
tunnel is up, even when running with tun.disabled.

This is still gated by allowing `icmp` packets in the inbound firewall
rules.
2021-03-01 11:09:41 -05:00
Wade Simmons
27d9a67dda Proper multiqueue support for tun devices (#382)
This change is for Linux only.

Previously, when running with multiple tun.routines, we would only have one file descriptor. This change instead sets IFF_MULTI_QUEUE and opens a file descriptor for each routine. This allows us to process with multiple threads while preventing out of order packet reception issues.

To attempt to distribute the flows across the queues, we try to write to the tun/UDP queue that corresponds with the one we read from. So if we read a packet from tun queue "2", we will write the outgoing encrypted packet to UDP queue "2". Because of the nature of how multi queue works with flows, a given host tunnel will be sticky to a given routine (so if you try to performance benchmark by only using one tunnel between two hosts, you are only going to be using a max of one thread for each direction).

Because this system works much better when we can correlate flows between the tun and udp routines, we are deprecating the undocumented "tun.routines" and "listen.routines" parameters and introducing a new "routines" parameter that sets the value for both. If you use the old undocumented parameters, the max of the values will be used and a warning logged.

Co-authored-by: Nate Brown <nbrown.us@gmail.com>
2021-02-25 15:01:14 -05:00
Nathan Brown
68e3e84fdc More like a library (#279) 2020-09-18 09:20:09 -05:00