Commit Graph

55 Commits

Author SHA1 Message Date
JackDoan
d2cb854bff make sure hosts use the correct IP addresses when relaying 2025-10-10 13:28:22 -05:00
Jack Doan
f1e992f6dd don't require a detailsVpnAddr in a HostUpdateNotification (#1472)
* don't require a detailsVpnAddr in a HostUpdateNotification

* don't send our own addr on HostUpdateNotification for v2
2025-09-29 13:43:12 -05:00
Jack Doan
5cccd39465 update RemoteList.vpnAddrs when we complete a handshake (#1467) 2025-09-10 09:44:25 -05:00
Jack Doan
8196c22b5a store lighthouses as a slice (#1473)
* store lighthouses as a slice. If you have fewer than 16 lighthouses (and fewer than 16 vpnaddrs on a host, I guess), it's faster
2025-09-10 09:43:25 -05:00
Jack Doan
768325c9b4 cert-v2 chores (#1466) 2025-09-05 15:08:22 -05:00
Jack Doan
932e329164 Don't delete static host mappings for non-primary IPs (#1464)
* Don't delete a vpnaddr if it's part of a certificate that contains a vpnaddr that's in the static host map

* remove unused arg from ConnectionManager.shouldSwapPrimary()
2025-09-04 14:49:40 -05:00
Wade Simmons
b8ea55eb90 optimize usage of bart (#1395)
Some checks failed
gofmt / Run gofmt (push) Successful in 9s
smoke-extra / Run extra smoke tests (push) Failing after 19s
smoke / Run multi node smoke test (push) Failing after 1m19s
Build and test / Build all and test on ubuntu-linux (push) Failing after 18m41s
Build and test / Build and test on linux with boringcrypto (push) Failing after 2m47s
Build and test / Build and test on linux with pkcs11 (push) Failing after 2m47s
Build and test / Build and test on macos-latest (push) Has been cancelled
Build and test / Build and test on windows-latest (push) Has been cancelled
Use `bart.Lite` and `.Contains` as suggested by the bart maintainer:

- 9455952eed (commitcomment-155362580)
2025-04-18 12:37:20 -04:00
John Maguire
d4a7df3083 Rename pki.default_version to pki.initiating_version (#1381)
Some checks failed
gofmt / Run gofmt (push) Successful in 9s
smoke-extra / Run extra smoke tests (push) Failing after 20s
smoke / Run multi node smoke test (push) Failing after 1m26s
Build and test / Build all and test on ubuntu-linux (push) Failing after 21m13s
Build and test / Build and test on linux with boringcrypto (push) Failing after 3m19s
Build and test / Build and test on linux with pkcs11 (push) Failing after 2m47s
Build and test / Build and test on macos-latest (push) Has been cancelled
Build and test / Build and test on windows-latest (push) Has been cancelled
2025-04-07 18:08:29 -04:00
Wade Simmons
879852c32a upgrade to yaml.v3 (#1148)
Some checks failed
gofmt / Run gofmt (push) Successful in 37s
smoke-extra / Run extra smoke tests (push) Failing after 20s
smoke / Run multi node smoke test (push) Failing after 1m25s
Build and test / Build all and test on ubuntu-linux (push) Failing after 18m51s
Build and test / Build and test on linux with boringcrypto (push) Failing after 2m44s
Build and test / Build and test on linux with pkcs11 (push) Failing after 2m27s
Build and test / Build and test on macos-latest (push) Has been cancelled
Build and test / Build and test on windows-latest (push) Has been cancelled
* upgrade to yaml.v3

The main nice fix here is that maps unmarshal into `map[string]any`
instead of `map[any]any`, so it cleans things up a bit.

* add config.AsBool

Since yaml.v3 doesn't automatically convert yes to bool now, for
backwards compat

* use type aliases for m

* more cleanup

* more cleanup

* more cleanup

* go mod cleanup
2025-03-31 16:08:34 -04:00
Nate Brown
d97ed57a19 V2 certificate format (#1216)
Co-authored-by: Nate Brown <nbrown.us@gmail.com>
Co-authored-by: Jack Doan <jackdoan@rivian.com>
Co-authored-by: brad-defined <77982333+brad-defined@users.noreply.github.com>
Co-authored-by: Jack Doan <me@jackdoan.com>
2025-03-06 11:28:26 -06:00
Nate Brown
e264a0ff88 Switch most everything to netip in prep for ipv6 in the overlay (#1173) 2024-07-31 10:18:56 -05:00
Caleb Jasik
8109cf2170 Add puncuation to doc comment (#1164)
* Add puncuation to doc comment

* Fix list formatting inside `EncryptDanger` doc comment
2024-06-24 14:50:17 -04:00
Nate Brown
072edd56b3 Fix re-entrant GetOrHandshake issues (#1044) 2023-12-19 11:58:31 -06:00
Nate Brown
5181cb0474 Use generics for CIDRTrees to avoid casting issues (#1004) 2023-11-02 17:05:08 -05:00
Nate Brown
5a131b2975 Combine ca, cert, and key handling (#952) 2023-08-14 21:32:40 -05:00
Nate Brown
14d0106716 Send the lh update worker into its own routine instead of taking over the reload routine (#935) 2023-07-27 14:38:10 -05:00
Nate Brown
3bbf5f4e67 Use an interface for udp conns (#901) 2023-06-14 10:48:52 -05:00
brad-defined
96f4dcaab8 Fix reconfig freeze attempting to send to an unbuffered, unread channel (#886)
* Fixes a reocnfig freeze where the reconfig attempts to send to an unbuffered channel with no readers.
Only create stop channel when a DNS goroutine is created, and only send when the channel exists.
Buffer to size 1 so that the stop message can be immediately sent even if the goroutine is busy doing DNS lookups.
2023-05-31 16:05:46 -04:00
brad-defined
bd9cc01d62 Dns static lookerupper (#796)
* Support lighthouse DNS names, and regularly resolve the name in a background goroutine to discover DNS updates.
2023-05-09 11:22:08 -04:00
Nate Brown
48eb63899f Have lighthouses ack updates to reduce test packet traffic (#851) 2023-05-05 14:44:03 -05:00
brad-defined
9b03053191 update EncReader and EncWriter interface function args to have concrete types (#844)
* Update LightHouseHandlerFunc to remove EncWriter param.
* Move EncWriter to interface
* EncReader, too
2023-04-07 14:28:37 -04:00
Nate Brown
1a6c657451 Normalize logs (#837) 2023-03-30 15:07:31 -05:00
Wade Simmons
3e5c7e6860 add punchy.respond_delay config option (#721) 2023-03-29 14:32:35 -05:00
Wade Simmons
e1af37e46d add calculated_remotes (#759)
* add calculated_remotes

This setting allows us to "guess" what the remote might be for a host
while we wait for the lighthouse response. For networks that hard
designed with in mind, it can help speed up handshake performance, as well as
improve resiliency in the case that all lighthouses are down.

Example:

    lighthouse:
      # ...

      calculated_remotes:
        # For any Nebula IPs in 10.0.10.0/24, this will apply the mask and add
        # the calculated IP as an initial remote (while we wait for the response
        # from the lighthouse). Both CIDRs must have the same mask size.
        # For example, Nebula IP 10.0.10.123 will have a calculated remote of
        # 192.168.1.123

        10.0.10.0/24:
          - mask: 192.168.1.0/24
            port: 4242

* figure out what is up with this test

* add test

* better logic for sending handshakes

Keep track of the last light of hosts we sent handshakes to. Only log
handshake sent messages if the list has changed.

Remove the test Test_NewHandshakeManagerTrigger because it is faulty and
makes no sense. It relys on the fact that no handshake packets actually
get sent, but with these changes we would send packets now (which it
should!)

* use atomic.Pointer

* cleanup to make it clearer

* fix typo in example
2023-03-13 15:09:08 -04:00
Wade Simmons
9af242dc47 switch to new sync/atomic helpers in go1.19 (#728)
These new helpers make the code a lot cleaner. I confirmed that the
simple helpers like `atomic.Int64` don't add any extra overhead as they
get inlined by the compiler. `atomic.Pointer` adds an extra method call
as it no longer gets inlined, but we aren't using these on the hot path
so it is probably okay.
2022-10-31 13:37:41 -04:00
brad-defined
1a7c575011 Relay (#678)
Co-authored-by: Wade Simmons <wsimmons@slack-corp.com>
2022-06-21 13:35:23 -05:00
Wade Simmons
45d1d2b6c6 Update dependencies - 2022-04 (#664)
Updated  github.com/kardianos/service         https://github.com/kardianos/service/compare/v1.2.0...v1.2.1
    Updated  github.com/miekg/dns                 https://github.com/miekg/dns/compare/v1.1.43...v1.1.48
    Updated  github.com/prometheus/client_golang  https://github.com/prometheus/client_golang/compare/v1.11.0...v1.12.1
    Updated  github.com/prometheus/common         https://github.com/prometheus/common/compare/v0.32.1...v0.33.0
    Updated  github.com/stretchr/testify          https://github.com/stretchr/testify/compare/v1.7.0...v1.7.1
    Updated  golang.org/x/crypto                  5770296d90...ae2d96664a
    Updated  golang.org/x/net                     69e39bad7d...749bd193bc
    Updated  golang.org/x/sys                     7861aae155...289d7a0edf
    Updated  golang.zx2c4.com/wireguard/windows   v0.5.1...v0.5.3
    Updated  google.golang.org/protobuf           v1.27.1...v1.28.0
2022-04-18 12:12:25 -04:00
Nate Brown
d85e24f49f Allow for self reported ips to the lighthouse (#650) 2022-04-04 12:35:23 -05:00
brad-defined
03498a0cb2 Make nebula advertise its dynamic port to lighthouses (#653) 2022-03-15 18:03:56 -05:00
Nate Brown
312a01dc09 Lighthouse reload support (#649)
Co-authored-by: John Maguire <contact@johnmaguire.me>
2022-03-14 12:35:13 -05:00
Nate Brown
94aaab042f Fix race between punchback and lighthouse handler reset (#566) 2021-11-03 21:54:27 -05:00
Nate Brown
bcabcfdaca Rework some things into packages (#489) 2021-11-03 20:54:04 -05:00
brad-defined
6ae8ba26f7 Add a context object in nebula.Main to clean up on error (#550) 2021-11-02 13:14:26 -05:00
Wade Simmons
ea2c186a77 remote_allow_ranges: allow inside CIDR specific remote_allow_lists (#540)
This allows you to configure remote allow lists specific to different
subnets of the inside CIDR. Example:

    remote_allow_ranges:
      10.42.42.0/24:
        192.168.0.0/16: true

This would only allow hosts with a VPN IP in the 10.42.42.0/24 range to
have private IPs (and thus don't connect over public IPs).

The PR also refactors AllowList into RemoteAllowList and LocalAllowList to make it clearer which methods are allowed on which allow list.
2021-10-19 10:54:30 -04:00
Nathan Brown
a1ee521d79 Fix a failed return in an error case (#445) 2021-04-17 18:47:31 -05:00
Nathan Brown
710df6a876 Refactor remotes and handshaking to give every address a fair shot (#437) 2021-04-14 13:50:09 -05:00
Nathan Brown
64d8e5aa96 More LH cleanup (#429) 2021-04-01 10:23:31 -05:00
Nathan Brown
75f7bda0a4 Lighthouse performance pass (#418) 2021-03-31 17:32:02 -05:00
Nathan Brown
3ea7e1b75f Don't use a global logger (#423) 2021-03-26 09:46:30 -05:00
Nathan Brown
7073d204a8 IPv6 support for outside (udp) (#369) 2021-03-18 20:37:24 -05:00
Thomas Roten
ea07a89cc8 Ensure mutex is unlocked when adding remote IP. (#406)
Currently, if you use the remote allow list config, as soon as you attempt to create a tunnel to a node that has a blocked IP address, a mutex is locked and never unlocked. This happens even if the node has an allowed remote IP address in addition to the blocked remote IP address.

This pull request ensures that the lighthouse mutex is unlocked whenever we attempt to add a remote IP.
2021-03-16 12:41:35 -04:00
Nathan Brown
b6234abfb3 Add a way to trigger punch backs via lighthouse (#394) 2021-03-01 19:06:01 -06:00
Wade Simmons
ce9ad37431 fix regression with LightHouseHandler and punchBack (#346)
The change introduced by #320 incorrectly re-uses the output buffer for
sending punchBack packets. Since we are currently spawning a new
goroutine for each send here, we need to allocate a new buffer each
time. We can come back and optimize this in the future, but for now we
should fix the regression.
2020-11-25 17:49:26 -05:00
Wade Simmons
2e7ca027a4 Lighthouse handler optimizations (#320)
We noticed that the number of memory allocations LightHouse.HandleRequest creates for each call can seriously impact performance for high traffic lighthouses. This PR introduces a benchmark in the first commit and then optimizes memory usage by creating a LightHouseHandler struct. This struct allows us to re-use memory between each lighthouse request (one instance per UDP listener go-routine).
2020-11-23 14:50:01 -05:00
Wade Simmons
4756c9613d trigger handshakes when lighthouse reply arrives (#246)
Currently, we wait until the next timer tick to act on the lighthouse's
reply to our HostQuery. This means we can easily add hundreds of
milliseconds of unnecessary delay to the handshake. To fix this, we
can introduce a channel to trigger an outbound handshake without waiting
for the next timer tick.

A few samples of cold ping time between two hosts that require a
lighthouse lookup:

    before (v1.2.0):

    time=156 ms
    time=252 ms
    time=12.6 ms
    time=301 ms
    time=352 ms
    time=49.4 ms
    time=150 ms
    time=13.5 ms
    time=8.24 ms
    time=161 ms
    time=355 ms

    after:

    time=3.53 ms
    time=3.14 ms
    time=3.08 ms
    time=3.92 ms
    time=7.78 ms
    time=3.59 ms
    time=3.07 ms
    time=3.22 ms
    time=3.12 ms
    time=3.08 ms
    time=8.04 ms

I recommend reviewing this PR by looking at each commit individually, as
some refactoring was required that makes the diff a bit confusing when
combined together.
2020-07-22 10:35:10 -04:00
Wade Simmons
b37a91cfbc add meta packet statistics (#230)
This change add more metrics around "meta" (non "message" type packets).
For lighthouse packets, we also record statistics around the specific
lighthouse meta type.

We don't keep statistics for the "message" type so that we don't slow
down the fast path (and you can just look at metrics on the tun
interface to find that information).
2020-06-26 13:45:48 -04:00
Wade Simmons
0a474e757b Add lighthouse.{remoteAllowList,localAllowList} (#217)
These settings make it possible to blacklist / whitelist IP addresses
that are used for remote connections.

`lighthouse.remoteAllowList` filters which remote IPs are allow when
fetching from the lighthouse (or, if you are the lighthouse, which IPs
you store and forward to querying hosts). By default, any remote IPs are
allowed. You can provide CIDRs here with `true` to allow and `false` to
deny. The most specific CIDR rule applies to each remote.  If all rules
are "allow", the default will be "deny", and vice-versa. If both "allow"
and "deny" rules are present, then you MUST set a rule for "0.0.0.0/0"
as the default.

    lighthouse:
      remoteAllowList:
        # Example to block IPs from this subnet from being used for remote IPs.
        "172.16.0.0/12": false

        # A more complicated example, allow public IPs but only private IPs from a specific subnet
        "0.0.0.0/0": true
        "10.0.0.0/8": false
        "10.42.42.0/24": true

`lighthouse.localAllowList` has the same logic as above, but it applies
to the local addresses we advertise to the lighthouse. Additionally, you
can specify an `interfaces` map of regular expressions to match against
interface names. The regexp must match the entire name. All interface
rules must be either true or false (and the default rule will be the
inverse). CIDR rules are matched after interface name rules.

Default is all local IP addresses.

    lighthouse:
      localAllowList:
        # Example to blacklist docker interfaces.
        interfaces:
          'docker.*': false

        # Example to only advertise IPs in this subnet to the lighthouse.
        "10.0.0.0/8": true
2020-04-08 15:36:43 -04:00
Ryan Huber
1297090af3 add configurable punching delay because of race-condition-y conntracks (#210)
* add configurable punching delay because of race-condition-y conntracks

* add changelog

* fix tests

* only do one punch per query

* Coalesce punchy config

* It is not is not set

* Add tests

Co-authored-by: Nate Brown <nbrown.us@gmail.com>
2020-03-27 11:26:39 -07:00
Chad Harp
efe741ad66 Allow ValidateLHStaticEntries to check all static host map entries (#141)
* Allow ValidateLHStaticEntries to check all static host map entries

* Cleaner fix for ValidateLHStaticEntries
2020-01-02 21:04:18 -05:00
Ryan Huber
df8e45c13b if interval is 0 don't even update lh (mobile use case) 2019-12-26 21:12:31 +00:00