64 Commits

Author SHA1 Message Date
Wade Simmons
ae9de47dd9 Merge remote-tracking branch 'origin/master' into multiport 2025-07-11 12:57:52 -04:00
Nate Brown
52623820c2
Drop inactive tunnels (#1427) 2025-07-03 09:58:37 -05:00
brad-defined
b158eb0c4c
Use a list for relay IPs instead of a map (#1423)
* Use a list for relay IPs instead of a map

* linter
2025-07-02 08:47:05 -04:00
Wade Simmons
b8ea55eb90
optimize usage of bart (#1395)
Use `bart.Lite` and `.Contains` as suggested by the bart maintainer:

- 9455952eed (commitcomment-155362580)
2025-04-18 12:37:20 -04:00
Wade Simmons
f36db374ac Merge remote-tracking branch 'origin/master' into multiport 2025-03-06 16:11:32 -05:00
Nate Brown
d97ed57a19
V2 certificate format (#1216)
Co-authored-by: Nate Brown <nbrown.us@gmail.com>
Co-authored-by: Jack Doan <jackdoan@rivian.com>
Co-authored-by: brad-defined <77982333+brad-defined@users.noreply.github.com>
Co-authored-by: Jack Doan <me@jackdoan.com>
2025-03-06 11:28:26 -06:00
Nate Brown
08ac65362e
Cert interface (#1212) 2024-10-10 18:00:22 -05:00
Wade Simmons
dabce8a1b4 1.9.4 Release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEnN7QnoQoG72upUfo5qM118W2lxoFAmbfOr4ACgkQ5qM118W2
 lxoTGQ//SKoaiZwbtWZtEjYWUJPxGL5gbidmqdmtT9b0ttBK+ufRRbRQXeuXv+pY
 KlKE3YxS8aWbW+YPvtQ7Ly6W4KoJ49esZYnFRMwnLnOpJY9KXtWe0ej+ohQIqm0g
 R/7MFx9YiKsO+oNI3Bk8Flfkdhh2RCSECO/i5V0oZIkZHy3ceeM/EAlMXy2slC7Z
 jcDLKkHsDSTkNhuCiNFwR8t04y2sZhYXPDC3xG/9FzO8dlstj6Kj7L0E7uceb3yP
 9LlmnQB8AAXQ/ZpJ82Roe72ORGuL5xwUPDpEPKnM2090h6skIA9cpIn4BpRpg/6S
 rrZb/fSIjLlE8YnkA39kKnMS1SW5O2EXSDtXCzEkZI40vGHIJiVY2j+mELqHiWLf
 8MLVC0qW2DvOMA28ZAipQ2gG9txxuArLBD/Zlhtlzn4KeP8m1Dnnv1kkL8z8+H+6
 18zM9lcE4xK8ET+9yao5yNpYinhwEHQnekeevMBJPrI/5SQxkb53u+FXeg1eGAbK
 IewcLlpxun/IwL8D0NwY2/1EVlemupEed9geHDBIjM9gPmBG/zYJdRvh2aLUXcti
 C5nxXAXUknXYAyUwT2kvplLyj1yZheA9nDonIVI9GY1nyZmzWsT0D7BSoOGxw+6H
 4nhcsQfHpEVQvCfY9G2wOvmqiZEkbFDho/3o7hebowkFljXXcKU=
 =IC32
 -----END PGP SIGNATURE-----

Merge tag 'v1.9.4' into multiport

1.9.4 Release
2024-09-13 10:17:59 -04:00
Nate Brown
e264a0ff88
Switch most everything to netip in prep for ipv6 in the overlay (#1173) 2024-07-31 10:18:56 -05:00
Wade Simmons
b445d14ddb Merge remote-tracking branch 'origin/master' into multiport 2024-05-08 11:22:19 -04:00
Nate Brown
a390125935
Support reloading preferred_ranges (#1043) 2024-04-03 22:14:51 -05:00
Wade Simmons
659d7fece6 1.8.2 Release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEnN7QnoQoG72upUfo5qM118W2lxoFAmWcXeYACgkQ5qM118W2
 lxo8yBAAxnMxvP2d2Mu2n6SExRxqmK5e+CddM0XWNZQzTXO1gyKw7YPLzzQwRPTa
 mhmuGEmqjmG0/VXwz9dl1jrpIJu0ge7APgIn9duFzz5HYnDbb+6+T0cQ/8LQbNe1
 i+xGdY3n1RYHKoeqOi14lmf9uB6zrklfhzFG/05AyYjNNipMtAsC82FrFmySTQ9w
 gp4XGwK5edzWSrBZ0w4nbo8G8r4mP/2qZdbxY+9g9IrrQoeoZtWVttdZ36rkEvIi
 uzyj//PClLTTrAiSHcWdrdPHlLj2L4t1S0ixjnAk2OO/OD/EQ5FwtYggF+x+YE6N
 fedIcUliJNidK7FZ+cWUdB6tUWgjM9TsbfuPoCI786e1OnBRML5ZPCiXZpzhxMWZ
 l+uKJkOUqoC7Nu83+WoedLrJo5zwOhq8oYx0/BVw8dNMdYFGSPrbE3ooFtgUc6Lu
 2TEtD5NzVz6nPAyPOYVNOw726J19fFBKbBZsV12KSTW1ElFafEDCHGelIf2wt8mI
 t23SlYfHMJOhKPMnJWczAFsuVDfMmt5xRvH1mFORiBIm/4EXYIS00IEGKQYuC7m+
 lUmdrk9R6pVdq5lekL1KkB/fjGI/mg5liYY0ubx/4oeHXRyMPXeVY0ZkTqc2PPHi
 7wl2iLytG/FTMdGPC4F4LmXT9xPRzTGNpANItael2PTSBPThQb8=
 =XsOf
 -----END PGP SIGNATURE-----

Merge tag 'v1.8.2' into multiport

1.8.2 Release
2024-01-26 10:45:15 -05:00
Nate Brown
072edd56b3
Fix re-entrant GetOrHandshake issues (#1044) 2023-12-19 11:58:31 -06:00
Nate Brown
5181cb0474
Use generics for CIDRTrees to avoid casting issues (#1004) 2023-11-02 17:05:08 -05:00
Nate Brown
a44e1b8b05
Clean up a hostinfo to reduce memory usage (#955) 2023-11-02 16:53:59 -05:00
Wade Simmons
f2aef0d6eb Merge remote-tracking branch 'origin/master' into multiport 2023-10-27 08:48:13 -04:00
Nate Brown
076ebc6c6e
Simplify getting a hostinfo or starting a handshake with one (#954) 2023-08-21 18:51:45 -05:00
Nate Brown
223cc6e660
Limit how often a busy tunnel can requery the lighthouse (#940)
Co-authored-by: Wade Simmons <wadey@slack-corp.com>
2023-08-08 13:26:41 -05:00
Nate Brown
a10baeee92
Pull hostmap and pending hostmap apart, remove unused functions (#843) 2023-07-24 12:37:52 -05:00
Wade Simmons
0e593ad582 Merge branch 'master' into multiport 2023-05-09 15:37:30 -04:00
Nate Brown
03e4a7f988
Rehandshaking (#838)
Co-authored-by: Brad Higgins <brad@defined.net>
Co-authored-by: Wade Simmons <wadey@slack-corp.com>
2023-05-04 15:16:37 -05:00
Wade Simmons
e71059a410 Merge remote-tracking branch 'origin/master' into multiport 2023-04-03 11:30:41 -04:00
Nate Brown
ee8e1348e9
Use connection manager to drive NAT maintenance (#835)
Co-authored-by: brad-defined <77982333+brad-defined@users.noreply.github.com>
2023-03-31 15:45:05 -05:00
brad-defined
2801fb2286
Fix relay (#827)
Co-authored-by: Nate Brown <nbrown.us@gmail.com>
2023-03-30 11:09:20 -05:00
Nate Brown
f0ef80500d
Remove dead code and re-order transit from pending to main hostmap on stage 2 (#828) 2023-03-17 15:36:24 -05:00
Wade Simmons
e1af37e46d
add calculated_remotes (#759)
* add calculated_remotes

This setting allows us to "guess" what the remote might be for a host
while we wait for the lighthouse response. For networks that were
designed with this in mind, it can help speed up handshake performance, as well as
improve resiliency in the case that all lighthouses are down.

Example:

    lighthouse:
      # ...

      calculated_remotes:
        # For any Nebula IPs in 10.0.10.0/24, this will apply the mask and add
        # the calculated IP as an initial remote (while we wait for the response
        # from the lighthouse). Both CIDRs must have the same mask size.
        # For example, Nebula IP 10.0.10.123 will have a calculated remote of
        # 192.168.1.123

        10.0.10.0/24:
          - mask: 192.168.1.0/24
            port: 4242

* figure out what is up with this test

* add test

* better logic for sending handshakes

Keep track of the last list of hosts we sent handshakes to. Only log
handshake sent messages if the list has changed.

Remove the test Test_NewHandshakeManagerTrigger because it is faulty and
makes no sense. It relies on the fact that no handshake packets actually
get sent, but with these changes we would send packets now (which it
should!)

* use atomic.Pointer

* cleanup to make it clearer

* fix typo in example
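
The calculated_remotes mapping described above can be sketched in Go as follows. This is only an illustration, not the code from this commit: `calculateRemote` and its string-based parameters are hypothetical, and the sketch handles IPv4 only. It keeps the host bits of the Nebula IP and takes the network bits from the `mask` CIDR.

```go
package main

import (
	"errors"
	"fmt"
	"net/netip"
)

// calculateRemote is a hypothetical helper (not nebula's implementation).
// It replaces the network bits of vpnIP with those of mask and keeps the
// host bits, then attaches the configured port.
func calculateRemote(vpnIP, inside, mask string, port uint16) (netip.AddrPort, error) {
	ip, err := netip.ParseAddr(vpnIP)
	if err != nil {
		return netip.AddrPort{}, err
	}
	in, err := netip.ParsePrefix(inside)
	if err != nil {
		return netip.AddrPort{}, err
	}
	m, err := netip.ParsePrefix(mask)
	if err != nil {
		return netip.AddrPort{}, err
	}
	if !ip.Is4() || !in.Addr().Is4() || !m.Addr().Is4() {
		return netip.AddrPort{}, errors.New("this sketch only handles IPv4")
	}
	if in.Bits() != m.Bits() {
		return netip.AddrPort{}, errors.New("both CIDRs must have the same mask size")
	}
	if !in.Contains(ip) {
		return netip.AddrPort{}, errors.New("vpn ip is outside the configured range")
	}

	host := ip.As4()
	network := m.Masked().Addr().As4()
	var out [4]byte
	for i := range out {
		// n is the number of network bits that fall into byte i (0 to 8).
		n := m.Bits() - i*8
		if n < 0 {
			n = 0
		}
		if n > 8 {
			n = 8
		}
		netMask := byte(0xff) << (8 - n)
		out[i] = network[i]&netMask | host[i]&^netMask
	}
	return netip.AddrPortFrom(netip.AddrFrom4(out), port), nil
}

func main() {
	remote, err := calculateRemote("10.0.10.123", "10.0.10.0/24", "192.168.1.0/24", 4242)
	if err != nil {
		panic(err)
	}
	fmt.Println(remote) // 192.168.1.123:4242
}
```

The same-mask-size check mirrors the requirement stated in the example config comment. Without it, the host bits of one range would not line up with the other.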
2023-03-13 15:09:08 -04:00
Wade Simmons
aec7f5f865 Merge remote-tracking branch 'origin/master' into multiport 2023-03-13 15:07:32 -04:00
Nate Brown
92cc32f844
Remove handshake race avoidance (#820)
Co-authored-by: Wade Simmons <wadey@slack-corp.com>
2023-03-13 12:35:14 -05:00
Nate Brown
a06977bbd5
Track connections by local index id instead of vpn ip (#807) 2023-02-13 14:41:05 -06:00
Wade Simmons
9af242dc47
switch to new sync/atomic helpers in go1.19 (#728)
These new helpers make the code a lot cleaner. I confirmed that the
simple helpers like `atomic.Int64` don't add any extra overhead as they
get inlined by the compiler. `atomic.Pointer` adds an extra method call
as it no longer gets inlined, but we aren't using these on the hot path
so it is probably okay.
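
A minimal sketch of the two helper types mentioned above, for readers who have not used them. The `stats` and `remote` types are hypothetical and not taken from nebula. Only `atomic.Int64` and `atomic.Pointer` are the real Go 1.19 standard library types.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// remote is a hypothetical value swapped atomically between goroutines.
type remote struct{ addr string }

// stats is a hypothetical struct using the typed helpers instead of
// atomic.AddInt64 / atomic.LoadPointer on raw fields.
type stats struct {
	packets atomic.Int64
	current atomic.Pointer[remote]
}

// run increments the counter from several goroutines and swaps the pointer.
func run() (int64, string) {
	var s stats
	s.current.Store(&remote{addr: "192.0.2.1:4242"})

	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				s.packets.Add(1) // no &field or type casts needed
			}
		}()
	}
	wg.Wait()

	s.current.Store(&remote{addr: "192.0.2.2:4242"})
	return s.packets.Load(), s.current.Load().addr
}

func main() {
	n, addr := run()
	fmt.Println(n, addr) // 8000 192.0.2.2:4242
}
```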
2022-10-31 13:37:41 -04:00
Wade Simmons
326fc8758d Support multiple UDP source ports (multiport)
The goal of this work is to send packets between two hosts using more than one
5-tuple. When running on networks like AWS where the underlying network driver
and overlay fabric make routing, load balancing, and failover decisions based
on the flow hash, this enables more than one flow between pairs of hosts.

Multiport spreads outgoing UDP packets across multiple UDP send ports,
which allows nebula to work around any issues on the underlay network.
Some example issues this could work around:

- UDP rate limits on a per flow basis.
- Partial underlay network failure in which some flows work and some don't

Agreement is done during the handshake to decide if multiport mode will
be used for a given tunnel (one side must have tx_enabled set, the other
side must have rx_enabled set)

NOTE: you cannot use multiport on a host if you are relying on UDP hole
punching to get through a NAT or firewall.

NOTE: Linux only (uses raw sockets to send). Also currently only works
with IPv4 underlay network remotes.

This is implemented by opening a raw socket and sending packets with
a source port that is based on a hash of the overlay source/destination
port. For ICMP and Nebula metadata packets, we use a random source port.

Example configuration:

    multiport:
      # This host supports sending via multiple UDP ports.
      tx_enabled: false

      # This host supports receiving packets sent from multiple UDP ports.
      rx_enabled: false

      # How many UDP ports to use when sending. The lowest source port will be
      # listen.port and go up to (but not including) listen.port + tx_ports.
      tx_ports: 100

      # NOTE: All of your hosts must be running a version of Nebula that supports
      # multiport if you want to enable this feature. Older versions of Nebula
      # will be confused by these multiport handshakes.
      #
      # If handshakes are not getting a response, attempt to transmit handshakes
      # using random UDP source ports (to get around partial underlay network
      # failures).
      tx_handshake: false

      # How many unresponded handshakes we should send before we attempt to
      # send multiport handshakes.
      tx_handshake_delay: 2
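
The source port selection described in this commit message could look roughly like the Go sketch below. It is not the commit's code: the function names are hypothetical, FNV-1a is an assumed hash (the message does not name one), and drawing the random port from the same `listen.port` to `listen.port + tx_ports` range is also an assumption.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
	"math/rand"
)

// pickSourcePort is a hypothetical helper. It maps the overlay
// source/destination ports to a UDP source port in the range
// [listenPort, listenPort+txPorts). FNV-1a is an assumption.
// Note: the sum wraps around if listenPort+txPorts exceeds 65535.
func pickSourcePort(listenPort, txPorts, overlaySrc, overlayDst uint16) uint16 {
	if txPorts == 0 {
		return listenPort
	}
	var b [4]byte
	binary.BigEndian.PutUint16(b[0:2], overlaySrc)
	binary.BigEndian.PutUint16(b[2:4], overlayDst)
	h := fnv.New32a()
	h.Write(b[:])
	return listenPort + uint16(h.Sum32()%uint32(txPorts))
}

// randomSourcePort is the assumed fallback for packets without overlay
// ports (ICMP, Nebula metadata): any port in the same range.
func randomSourcePort(listenPort, txPorts uint16) uint16 {
	if txPorts == 0 {
		return listenPort
	}
	return listenPort + uint16(rand.Intn(int(txPorts)))
}

func main() {
	// The same overlay flow always maps to the same underlay source port.
	fmt.Println(pickSourcePort(4242, 100, 51000, 443))
	fmt.Println(pickSourcePort(4242, 100, 51000, 443))
	// ICMP or metadata packets get a random port in the range.
	fmt.Println(randomSourcePort(4242, 100))
}
```

Hashing on the overlay ports keeps every packet of one overlay flow on one underlay 5-tuple, while different flows spread across the configured ports.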
2022-10-17 12:58:06 -04:00
brad-defined
1a7c575011
Relay (#678)
Co-authored-by: Wade Simmons <wsimmons@slack-corp.com>
2022-06-21 13:35:23 -05:00
Wade Simmons
949ec78653
don't set ConnectionState to nil (#590)
* don't set ConnectionState to nil

We might have packets processing in another thread, so we can't safely
just set this to nil. Since we removed it from the hostmaps, the next
packets to process should start the handshake over again.

I believe this comment is outdated or incorrect. Since the next
handshake will start over with a new HostInfo, I don't think there is
any way a counter reuse could happen:

> We must null the connectionstate or a counter reuse may happen

Here is a panic we saw that I think is related:

    panic: runtime error: invalid memory address or nil pointer dereference
    [signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x93a037]
    goroutine 59 [running, locked to thread]:
    github.com/slackhq/nebula.(*Firewall).Drop(...)
            github.com/slackhq/nebula/firewall.go:380
    github.com/slackhq/nebula.(*Interface).consumeInsidePacket(...)
            github.com/slackhq/nebula/inside.go:59
    github.com/slackhq/nebula.(*Interface).listenIn(...)
            github.com/slackhq/nebula/interface.go:233
    created by github.com/slackhq/nebula.(*Interface).run
            github.com/slackhq/nebula/interface.go:191

* use closeTunnel
2021-12-06 14:09:05 -05:00
Nate Brown
467e605d5e
Push route handling into overlay, a few more nits fixed (#581) 2021-11-12 11:19:28 -06:00
Nate Brown
e07524a654
Move all of tun into overlay (#577) 2021-11-11 16:37:29 -06:00
Wade Simmons
304b12f63f
create ConnectionState before adding to HostMap (#535)
We have a few small race conditions with creating the HostInfo.ConnectionState
since we add the host info to the pendingHostMap before we set this
field. We can make everything a lot easier if we just add an "init"
function so that we can set this field in the hostinfo before we add it
to the hostmap.
2021-11-08 14:46:22 -05:00
Nate Brown
bcabcfdaca
Rework some things into packages (#489) 2021-11-03 20:54:04 -05:00
brad-defined
6ae8ba26f7
Add a context object in nebula.Main to clean up on error (#550) 2021-11-02 13:14:26 -05:00
Wade Simmons
ea2c186a77
remote_allow_ranges: allow inside CIDR specific remote_allow_lists (#540)
This allows you to configure remote allow lists specific to different
subnets of the inside CIDR. Example:

    remote_allow_ranges:
      10.42.42.0/24:
        192.168.0.0/16: true

This would only allow hosts with a VPN IP in the 10.42.42.0/24 range to
have remotes in the private range 192.168.0.0/16 (and thus they won't
connect over public IPs).

The PR also refactors AllowList into RemoteAllowList and LocalAllowList to make it clearer which methods are allowed on which allow list.
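
A rough Go sketch of how such a lookup might behave, under stated assumptions: the types and `allowRemote` are hypothetical, the first inside CIDR containing the VPN IP is used, the most specific matching rule decides, a remote matching no rule is denied, and VPN IPs outside every configured inside CIDR are allowed. The actual RemoteAllowList code may differ on all of these points.

```go
package main

import (
	"fmt"
	"net/netip"
)

// rule is one "CIDR: bool" entry (hypothetical type).
type rule struct {
	prefix netip.Prefix
	allow  bool
}

// rangeRules holds the rules for one inside CIDR (hypothetical type).
type rangeRules struct {
	inside netip.Prefix
	rules  []rule
}

// allowRemote reports whether the host with vpnIP may use remote as its
// underlay address, following the assumptions listed in the lead-in.
func allowRemote(ranges []rangeRules, vpnIP, remote string) (bool, error) {
	vip, err := netip.ParseAddr(vpnIP)
	if err != nil {
		return false, err
	}
	rip, err := netip.ParseAddr(remote)
	if err != nil {
		return false, err
	}
	for _, rr := range ranges {
		if !rr.inside.Contains(vip) {
			continue
		}
		allowed, best := false, -1
		for _, r := range rr.rules {
			// Longest matching prefix wins.
			if r.prefix.Contains(rip) && r.prefix.Bits() > best {
				allowed, best = r.allow, r.prefix.Bits()
			}
		}
		return allowed, nil
	}
	return true, nil // no inside CIDR applies to this VPN IP
}

// exampleRanges mirrors the remote_allow_ranges example above.
func exampleRanges() []rangeRules {
	return []rangeRules{{
		inside: netip.MustParsePrefix("10.42.42.0/24"),
		rules: []rule{
			{prefix: netip.MustParsePrefix("192.168.0.0/16"), allow: true},
		},
	}}
}

func main() {
	ok, _ := allowRemote(exampleRanges(), "10.42.42.5", "192.168.1.10")
	fmt.Println(ok) // true
	ok, _ = allowRemote(exampleRanges(), "10.42.42.5", "203.0.113.7")
	fmt.Println(ok) // false
}
```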
2021-10-19 10:54:30 -04:00
Wade Simmons
ae5505bc74
handshake: update to preferred remote (#532)
If we receive a handshake packet for a tunnel that has already been
completed, check to see if the new remote is preferred. If so, update to
the preferred remote and send a test packet to influence the other side
to do the same.
2021-10-19 10:53:55 -04:00
Nate Brown
d004fae4f9
Unlock the hostmap quickly, lock hostinfo instead (#459) 2021-05-05 13:10:55 -05:00
Wade Simmons
44cb697552
Add more metrics (#450)
* Add more metrics

This change adds the following counter metrics:

Metrics to track packets dropped at the firewall:

    firewall.dropped.local_ip
    firewall.dropped.remote_ip
    firewall.dropped.no_rule

Metrics to track handshake attempts that have been initiated and ones
that have timed out (ones that have completed are tracked by the
existing "handshakes" histogram).

    handshake_manager.initiated
    handshake_manager.timed_out

Metrics to track when cached_packets are dropped because we run out of
buffer space, and how many are sent once the handshake completes.

    hostinfo.cached_packets.dropped
    hostinfo.cached_packets.sent

This change also notes how many cached packets we have when we log the
final "Handshake received" message for either stage1 or stage2.

* separate incoming/outgoing metrics

* remove "allowed" firewall metrics

We don't need these on the hot path; they aren't worth it.

* don't need pointers here
2021-04-27 22:23:18 -04:00
Nathan Brown
db23fdf9bc
Don't apply race avoidance to existing handshakes, use the handshake time to determine who wins (#451)
Co-authored-by: Wade Simmons <wadey@slack-corp.com>
2021-04-27 21:15:34 -05:00
Nathan Brown
710df6a876
Refactor remotes and handshaking to give every address a fair shot (#437) 2021-04-14 13:50:09 -05:00
Nathan Brown
480036fbc8
Remove unused structs in hostmap.go (#430) 2021-04-01 22:07:11 -05:00
Nathan Brown
0c2e5973e1
Simple lie test (#427) 2021-03-31 10:26:35 -05:00
Wade Simmons
4603b5b2dd
fix PromoteEvery check (#424)
This check was accidentally typo'd in #396 from `%` to `&`. Restore the
correct functionality here (we want to do the check every "PromoteEvery"
count packets).
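
To illustrate why the typo mattered, here is a small Go sketch comparing the two operators. The constant value and function names are hypothetical, and the exact form of the check in the codebase is assumed rather than quoted.

```go
package main

import "fmt"

// promoteEvery is a hypothetical value chosen for illustration.
const promoteEvery = 1000

// shouldPromoteMod is the intended check: true once every promoteEvery packets.
func shouldPromoteMod(count uint32) bool { return count%promoteEvery == 0 }

// shouldPromoteAnd is the typo'd check: true whenever count shares no set
// bits with promoteEvery, which has nothing to do with "every N packets".
func shouldPromoteAnd(count uint32) bool { return count&promoteEvery == 0 }

// countFires counts how often a check passes for packet counts 1..n.
func countFires(check func(uint32) bool, n uint32) int {
	fires := 0
	for c := uint32(1); c <= n; c++ {
		if check(c) {
			fires++
		}
	}
	return fires
}

func main() {
	fmt.Println("with %:", countFires(shouldPromoteMod, 10000)) // with %: 10
	fmt.Println("with &:", countFires(shouldPromoteAnd, 10000))
}
```

With `%` the check passes exactly once per `promoteEvery` packets. With `&` it passes for many unrelated counts, including the very first packet.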
2021-03-26 15:01:05 -04:00
Nathan Brown
3ea7e1b75f
Don't use a global logger (#423) 2021-03-26 09:46:30 -05:00
Nathan Brown
7a9f9dbded
Don't craft buffers if we don't need them (#416) 2021-03-22 18:25:06 -05:00
Nathan Brown
7073d204a8
IPv6 support for outside (udp) (#369) 2021-03-18 20:37:24 -05:00