* firewall can distinguish if the host connecting has an overlapping network, is a VPN peer without an overlapping network, or is a unsafe network
* Cross stack subnet stuff (#1512)
* experiment with not filtering out non-common addresses in hostinfo.networks
* allow handshakes without overlaps
* unsafe network test
* change HostInfo.buildNetworks argument to reference the cert
* try to make certificate addition/removal reloadable in some cases
* very spicy change to respond to handshakes with cert versions we cannot match with a cert that we can indeed match
* even spicier change to rehandshake if we detect our cert is lower-version than our peer, and we have a newer-version cert available
* make tryRehandshake easier to understand
A local index collision happens when two tunnels attempt to use the same
random int32 index ID. This is a rare chance, and we have code to deal
with it, but we have a panic because we return the wrong thing in this
case. This change should fix the panic.
We had a rare deadlock in GetOrHandshake because we kept the hostmap
lock when we do the call to StartHandshake. StartHandshake can block
while sending to the lighthouse query worker channel, and that worker
needs to be able to grab the hostmap lock to do its work. Other calls
for StartHandshake don't hold the hostmap lock so we should be able to
drop it here.
This lock was originally added with: https://github.com/slackhq/nebula/pull/954
* avoid deadlock in lighthouse queryWorker
If the lighthouse queryWorker tries to grab to call StartHandshake on
a lighthouse vpnIp, we can deadlock on the handshake_manager lock. This
change drops the handshake_manager lock before we send on the lighthouse
queryChan (which could block), and also avoids sending to the channel if
this is a lighthouse IP itself.
* need to hold lock during cacheCb
* add calculated_remotes
This setting allows us to "guess" what the remote might be for a host
while we wait for the lighthouse response. For networks that hard
designed with in mind, it can help speed up handshake performance, as well as
improve resiliency in the case that all lighthouses are down.
Example:
lighthouse:
# ...
calculated_remotes:
# For any Nebula IPs in 10.0.10.0/24, this will apply the mask and add
# the calculated IP as an initial remote (while we wait for the response
# from the lighthouse). Both CIDRs must have the same mask size.
# For example, Nebula IP 10.0.10.123 will have a calculated remote of
# 192.168.1.123
10.0.10.0/24:
- mask: 192.168.1.0/24
port: 4242
* figure out what is up with this test
* add test
* better logic for sending handshakes
Keep track of the last light of hosts we sent handshakes to. Only log
handshake sent messages if the list has changed.
Remove the test Test_NewHandshakeManagerTrigger because it is faulty and
makes no sense. It relys on the fact that no handshake packets actually
get sent, but with these changes we would send packets now (which it
should!)
* use atomic.Pointer
* cleanup to make it clearer
* fix typo in example