Conversation

@Dando-Real-ITA
Contributor

This allows pref-src to be set per router id, for example to get source routing for IPv4:

# ISP1
install ip 0.0.0.0/0 eq 0 id 0e:6a:a3:ff:fe:a7:00:00 pref-src 66.199.5.162
# ISP2
install ip 0.0.0.0/0 eq 0 id 0e:08:1c:ff:fe:dd:00:00 pref-src 12.144.66.186
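
As a rough model of what the two install rules above express, the sketch below maps the router id that originated a default route to the source address the installed route should carry. The function name `pref_src_for_id` is made up for illustration; babeld does this matching internally in its filter code, not in shell.

```shell
# Hypothetical model of the two install rules above: given the router id
# that originated a default route, print the pref-src to install it with.
pref_src_for_id() {
    case "$1" in
        0e:6a:a3:ff:fe:a7:00:00) echo 66.199.5.162 ;;   # ISP1
        0e:08:1c:ff:fe:dd:00:00) echo 12.144.66.186 ;;  # ISP2
        *) return 1 ;;                                  # no rule matched, no pref-src
    esac
}
```

For example, `pref_src_for_id 0e:6a:a3:ff:fe:a7:00:00` prints the ISP1 address, and an unknown id returns a non-zero status.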

@Dando-Real-ITA
Contributor Author

  • Added a fix so that pref-src is updated on route change. change_route was always using the old route's pref-src
  • Added a check_xroutes command to propagate a change in the kernel routing table faster, for example a default route withdrawn by a script

@Dando-Real-ITA
Contributor Author

Added more code that lets the socket config change the metric of a redistribute rule, which makes it possible to retract or reannounce a route.

Example:
Route defined in babeld.conf as redistribute ip 0.0.0.0/0 eq 0 metric 1000

Retract:

echo "redistribute ip 0.0.0.0/0 eq 0 deny" | socat - UNIX-CONNECT:/var/run/babeld.sock 1>&2 > /dev/null
echo "check_xroutes" | socat - UNIX-CONNECT:/var/run/babeld.sock 1>&2 > /dev/null

Reannounce:

echo "redistribute ip 0.0.0.0/0 eq 0 metric 1000" | socat - UNIX-CONNECT:/var/run/babeld.sock
echo "check_xroutes" | socat - UNIX-CONNECT:/var/run/babeld.sock 1>&2 > /dev/null
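
The two sequences above could be wrapped in small helpers, roughly as sketched here. This assumes the patches from this pull request are applied (check_xroutes is not an upstream command); `SOCK`, `DRY_RUN` and the function names are illustrative only.

```shell
# Sketch only: SOCK, DRY_RUN and the function names are illustrative, and
# check_xroutes exists only with the patches from this pull request.
SOCK=${SOCK:-/var/run/babeld.sock}

# Send one line to babeld's local socket; with DRY_RUN=1, print it instead.
babel_cmd() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$1"
    else
        echo "$1" | socat - "UNIX-CONNECT:$SOCK" > /dev/null
    fi
}

# Switch the redistribute rule to deny, then ask babeld to recheck xroutes.
retract_default() {
    babel_cmd "redistribute ip 0.0.0.0/0 eq 0 deny" &&
    babel_cmd "check_xroutes"
}

# Restore the metric (default 1000, as in babeld.conf above) and recheck.
reannounce_default() {
    babel_cmd "redistribute ip 0.0.0.0/0 eq 0 metric ${1:-1000}" &&
    babel_cmd "check_xroutes"
}
```

Setting `DRY_RUN=1` before calling `retract_default` prints the two commands without touching the socket, which is handy for checking a script before it runs against a live router.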

@jech
Owner

jech commented Mar 2, 2024

There appear to be multiple (related?) functionalities in this pull request, and I'm not quite sure what problem each of them is solving. I'd be grateful if you could squash related functionality into a single patch. For example, the two newpref_src patches should be a single commit.

Here's a first review (just having a quick look):

  • put the .gitignore changes in a separate patch; put a newline at the end of .gitignore;
  • provide a more detailed commit message for the install_filter patch, explaining the new functionality it enables;
  • squash the two commits about newpref_src, and explain in the commit message why it is useful;
  • I don't agree with the check_xroutes command: this is an implementation detail that should not leak into the UI; instead, we should make sure that xroutes are checked whenever necessary; if you're concerned about checking xroutes too often, you can schedule check_xroutes by setting the kernel_dump_time value, similarly to what is done in schedule_neighbours_check.

I haven't reviewed the remaining patches, since I don't understand what problem you're trying to solve.

@Dando-Real-ITA
Contributor Author

Ok I will on Monday

The context is:
2 edge routers connected to 2 ISPs, using the provider public IPs, and redistributing the default route. Clients have IPs from both providers ( 2 IPv4 and 2 IPv6 ).

Goal is to keep connectivity even in case one ISP connection fails

Problem 1:
The default route for each ISP must be used with the matching source IP. This is the problem described in the source-specific prefix document, with two caveats:

  • It works only for IPv6, but a solution for IPv4 is needed as well
  • If a client has only source-specific default IPv6 routes and two IPv6 addresses, each matching one route, then when one route is retracted the client gets "unreachable routes" errors for some external IPs, unless the source is forced externally before the route lookup (as with ping -I src addr)

Problem 2:
Edge routers should be able to retract default routes if they detect the upstream connection is failing. Simply deleting the default route did not work, and also is not ideal for the edge router that needs to check when connectivity is back

I am not sure it is the best way, but this is how I got the use case I was modeling to work:

Solution 1a:
Distribute the default route normally from the edge routers, then let the clients install it with the correct source IP. To differentiate the received default routes, id was added to the install filter.

Solution 1b:
When a default route was updated from isp1 to isp2, the src field of the route was not updated. I found that the original ‘pref_src’ field was always used, so I added ‘newpref_src’ to support switching pref_src.

Solution 2a:
Enable metric change of live redistribution filters. This way, a route can be retracted or allowed without touching the kernel route.

Solution 2b:
Enable a special trigger of kernel_dump. Normally, if a redistribute filter has metric deny, the route is just ignored and no further action is taken. With a manual retraction, however, normal processing must not be skipped, so that babeld can notify the route change. A new flag has therefore been added to check_xroutes; it is set only when the command is issued manually, and it allows an infinite redistribute metric to be processed instead of being silently ignored.

@jech
Owner

jech commented Mar 3, 2024

Edge routers should be able to retract default routes if they detect the upstream connection is failing. Simply deleting the default route did not work, and also is not ideal for the edge router that needs to check when connectivity is back

This one is actually easy, and requires no new mechanism. On each of your edge routers, install a fake default route with low priority:

ip route add 0.0.0.0/0 dev lo metric 65534 proto 43

Then redistribute this route, but don't redistribute the real default route:

redistribute ip 0.0.0.0/0 le 0 proto 43 allow
redistribute ip 0.0.0.0/0 le 0 deny

Now when connectivity fails, remove the redistributed route:

ip route del 0.0.0.0/0 dev lo metric 65534 proto 43

and add it back when connectivity resumes. You may use babel-pinger in order to handle the fake route automatically.
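
For illustration only, the add/remove logic described above can be sketched as a small script. babel-pinger is the tool suggested for this; the function names, `PROBE_ADDR` and `DRY_RUN` below are assumptions made for the sketch, and `ip route replace` is used instead of `ip route add` so that repeating the "up" action does not fail.

```shell
# Sketch only: function names, PROBE_ADDR and DRY_RUN are illustrative.
FAKE_ROUTE="0.0.0.0/0 dev lo metric 65534 proto 43"

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

# Announce or withdraw the fake default route depending on $1 (up|down).
# $FAKE_ROUTE is deliberately unquoted so it splits into separate arguments.
set_upstream() {
    case "$1" in
        up)   run ip route replace $FAKE_ROUTE ;;
        down) run ip route del $FAKE_ROUTE ;;
        *)    return 2 ;;
    esac
}

# Example loop (commented out): probe an upstream host every 10 seconds.
# while sleep 10; do
#     if ping -c 1 -W 2 "$PROBE_ADDR" > /dev/null 2>&1; then
#         set_upstream up
#     else
#         set_upstream down 2> /dev/null   # the route may already be gone
#     fi
# done
```

Because babeld only redistributes routes with proto 43 under the filter above, removing the fake route withdraws the default route from the mesh while leaving the router's real default route in place for its own connectivity checks.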

@Dando-Real-ITA
Contributor Author

Interesting, I'll test the fake default route with babel-pinger and clean up the commits for the id and newpref_src.
Do you have also a solution for propagating the source specific return routes?
As of now I am defining them manually in the client like this:

      routes:
        # Default routes are learned from babel
        # Tables for return traffic with correct source address set to correct ISP
        # ISP1, table 1
        - from: "2001:4870:24a:500:2:0:0:1d03"
          to: default
          via: "2001:4870:24a:500:2:0:0:1d01"
          table: 1
        # ISP2, table 2
        - from: "2001:1890:1f76:4400:2:0:0:1d03"
          to: default
          via: "2001:1890:1f76:4400:2:0:0:1d00"
          table: 2
      routing-policy:
        # Select table for return traffic based on the source IP of the response
        # ISP 1
        - from: "2001:4870:24a:500:2:0:0:1d03"
          table: 1
        # ISP 2
        - from: "2001:1890:1f76:4400:2:0:0:1d03"
          table: 2
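
For readers more used to iproute2, the entries above correspond roughly to the commands printed by the sketch below. This assumes the YAML is netplan syntax, where a route-level `from` sets the preferred source (`src`) and a routing-policy `from` becomes an `ip rule`; the function name is illustrative, and the commands are only printed, not executed.

```shell
# Sketch: print iproute2 commands roughly equivalent to one ISP entry above.
# Arguments: $1 source address, $2 gateway, $3 table number.
# The function name is illustrative.
return_route_cmds() {
    echo "ip -6 route replace default via $2 src $1 table $3"
    echo "ip -6 rule add from $1 table $3"
}

# ISP1, table 1
return_route_cmds "2001:4870:24a:500:2:0:0:1d03" "2001:4870:24a:500:2:0:0:1d01" 1
# ISP2, table 2
return_route_cmds "2001:1890:1f76:4400:2:0:0:1d03" "2001:1890:1f76:4400:2:0:0:1d00" 2
```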

@jech
Owner

jech commented Mar 3, 2024

Do you have also a solution for propagating the source specific return routes?

No, I don't think I have. Either we reinstate source-specific routing for IPv4 (which I removed because it was complex code and I thought nobody was using it), or we implement your solution. I'll wait until you've had the opportunity to clean up your patches, and think about it some more.

Let me know if you want me to set up a git repository for babel-pinger, it could do with some tweaking (in particular, it's almost completely undocumented).

As to check_xroutes, I really think the xroute check should be triggered automatically; I'd rather not expose this implementation detail in the UI.

Matching on the router id makes it possible to assign a pref-src based on the route's origin.

One use case is installing default routes originated by two edge routers on different subnets, when the client has two IPs, one in each subnet:

install ip 0.0.0.0/0                  eq 0    id 00:02:00:00:00:00:1d:01      pref-src 66.199.5.163
install ip 0.0.0.0/0                  eq 0    id 00:02:00:00:00:00:1d:00      pref-src 12.144.66.186

When switching between two similar routes that both carry a pref-src, the pref-src was not updated to the new one.
With this fix, switch_routes passes new->src to change_route, which uses it to reapply the install_filter and take the newpref_src from the filter_result.
kernel_route for netlink has been updated to support newpref_src on ROUTE_MODIFY.

@Dando-Real-ITA
Contributor Author

I reduced for now the patches to only install filter id and newpref_src fix ( 3 commits ).
I'll think about the correct way to trigger an update. With the lo route trick, SIGUSR2 on the pid should be enough; otherwise I could add a simplified command that updates the timers, as you said.

I had a quick look at babel-pinger and noticed it only uses IPv4 addresses and routes.
It would make sense for it to also trigger the babel update after a route change.

I'll think more about the patch that changes the redistribute metric live. I recognize it was more pervasive, and I don't know if it has a use (what comes to mind is some form of dynamic multipath when using multiple upstream routers with prefix aggregation, which is the next thing I want to simulate).

For the install source specific I have a simple but hacky idea to test for default routes

@jech
Owner

jech commented Mar 4, 2024

At first sight, this looks very good. It's late now, so I may be saying something stupid, but shouldn't kernel_route take a prefsrc in all cases, not just in the change case?

@Dando-Real-ITA
Contributor Author

For add and flush there is only one prefsrc anyway, and only one set of route parameters.

For change_route_metric, it recreates the route but the prefsrc does not change, thus by default newprefsrc = prefsrc

Then in kernel_route the field is passed separately to ensure that the recursive calls on modify work correctly in all cases, removing the old route with the old prefsrc and creating the new with the newprefsrc
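
As a sketch of the delete-then-add sequence described above, the function below prints the iproute2 equivalent of a modify. babeld itself talks netlink directly rather than running `ip`; the function name and argument order are assumptions, and the 192.0.2.x gateways in the usage note are placeholder documentation addresses.

```shell
# Sketch only: prints the iproute2 equivalent of a route modify, i.e.
# remove the old route with the old prefsrc, then add the new route with
# the new prefsrc. Function name and argument order are illustrative.
modify_route_cmds() {
    # $1 prefix, $2 old gateway, $3 old prefsrc, $4 new gateway,
    # $5 new prefsrc (defaults to the old one, as for a metric-only change)
    echo "ip route del $1 via $2 src $3"
    echo "ip route add $1 via $4 src ${5:-$3}"
}
```

For instance, `modify_route_cmds 0.0.0.0/0 192.0.2.1 66.199.5.162 192.0.2.2 12.144.66.186` prints a delete carrying the ISP1 source and an add carrying the ISP2 source. Omitting the last argument keeps the old source on the new route.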

@Dando-Real-ITA
Contributor Author

This pull request is part of #114
Master branch has a more pervasive change to support dual default route distribution when generated by different routers
