I just ran an experiment to find out how long it takes Weave to recover from a broken connection. In my test setup, node A is connected to node B through two paths: path 1 has one hop and path 2 has two hops. Initially, Weave routes the traffic through path 1 because it is shorter.
When I break path 1 by bringing down the interface of one of the hops, it takes around 60 seconds for Weave to react and reroute the traffic through path 2.
I am checking the routing by having a look at the weave report output. More precisely, I am checking the information at Router.Peers.UnicastRoutes:
"UnicastRoutes": [
    {
        "Dest": "2a:e4:6e:f0:57:ef",
        "Via": "76:5d:78:64:6d:a6"
    },
    {
        "Dest": "66:c6:2f:12:02:05",
        "Via": "00:00:00:00:00:00"
    },
    {
        "Dest": "76:5d:78:64:6d:a6",
        "Via": "76:5d:78:64:6d:a6"
    },
    {
        "Dest": "a2:eb:a7:ed:41:b8",
        "Via": "76:5d:78:64:6d:a6"
    },
    {
        "Dest": "06:8c:d2:06:2b:eb",
        "Via": "76:5d:78:64:6d:a6"
    },
    {
        "Dest": "aa:be:7b:8b:a2:75",
        "Via": "76:5d:78:64:6d:a6"
    }
]
In this case the connection is already broken and all traffic is routed via the longer path 2.
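For reference, here is a minimal sketch of how such a route list can be checked programmatically instead of by eye. It only assumes that each entry has the "Dest" and "Via" fields shown above. The helper names (next_hops, is_direct) are made up for this example, and reading "Via" equal to "Dest" as a direct route is my interpretation of the output, not something I have confirmed.

```python
import json

def next_hops(unicast_routes):
    """Map each destination peer to the peer its traffic is forwarded through."""
    return {route["Dest"]: route["Via"] for route in unicast_routes}

def is_direct(unicast_routes, dest):
    """Return True when the route to dest names dest itself as the next hop.

    Assumption: "Via" == "Dest" means the peer is reached without a relay.
    """
    return next_hops(unicast_routes).get(dest) == dest

# Excerpt of the UnicastRoutes list shown above.
sample = json.loads("""
[
    {"Dest": "2a:e4:6e:f0:57:ef", "Via": "76:5d:78:64:6d:a6"},
    {"Dest": "66:c6:2f:12:02:05", "Via": "00:00:00:00:00:00"},
    {"Dest": "76:5d:78:64:6d:a6", "Via": "76:5d:78:64:6d:a6"}
]
""")

for dest, via in next_hops(sample).items():
    kind = "direct" if is_direct(sample, dest) else "via " + via
    print(dest, "->", kind)
```

Running this before and after breaking path 1 makes it easy to see which destinations changed their next hop.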
As mentioned, it takes around 60 seconds for Weave to notice that path 1 is broken. I assume there is a timeout to make sure that the connection is really down and won't recover. When I fix path 1 by bringing the interface back up, Weave updates its topology in less than one second, which indicates that it can react a lot faster.
So I was wondering whether there is a way to configure how long Weave keeps trying to connect before it accepts that a connection is broken.
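To put a number on the failover delay rather than estimating it, a small polling loop along these lines could be used. This is only a sketch: get_via is a hypothetical callable that you would have to supply yourself (for example, one that runs weave report and extracts the "Via" value for node B), and the poll interval and timeout are arbitrary defaults. The measured value is only accurate to within one poll interval.

```python
import time

def time_until_route_changes(get_via, initial_via, poll_interval=0.5,
                             timeout=120.0, clock=time.monotonic,
                             sleep=time.sleep):
    """Poll get_via() until it differs from initial_via.

    Returns the elapsed seconds when the next hop changes, or None if the
    timeout expires first. get_via is a caller-supplied (hypothetical)
    function that returns the current "Via" for the destination of interest.
    clock and sleep are injectable so the loop can be tested without waiting.
    """
    start = clock()
    while clock() - start < timeout:
        if get_via() != initial_via:
            return clock() - start
        sleep(poll_interval)
    return None
```

Start the loop, bring the interface down right away, and the return value approximates how long the reroute took.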