Add addpath support to EVPN #18759

Open
wants to merge 9 commits into
base: master

Tuetuopay
Contributor

Hi,

This pull request implements addpath support for EVPN. This includes:

  • route-reflector and route-server support for multiple paths for the same EVPN prefix
  • importing multiple paths from EVPN type-5 into an overlay VRF
  • exporting multiple (if not all) paths from an overlay VRF to EVPN type-5

Rationale

EVPN has native support for multipath: the route distinguisher is part of the prefix, and each VTEP is supposed to have its own. In a nutshell, this is already what AddPath does, but in a more explicit way (the distinguisher is part of the prefix) and in a simpler way (the distinguisher is carried unchanged across the whole EVPN instance, instead of changing on each speaker).

However, this has a limitation when importing/exporting routes with EVPN: a single VTEP cannot advertise multiple paths to the same overlay prefix in the IP-VRF, as FRR has a single route distinguisher per VRF. A few use cases:

Imbalanced exit paths

When interacting with an outside network, we may have imbalanced exit paths between the PEs and the CEs:

                         ┌─────┐       ┌─────┐ 
                         │     ├───────┤ CE1 │ 
 ┌─────────────┐         │     │       └─────┘ 
 │             ├─────────┤ PE1 │               
 │ EVPN Fabric │         │     │       ┌─────┐ 
 │             ├─────┐   │     ├───────┤ CE2 │ 
 └─────────────┘     │   └─────┘       └─────┘ 
                     │                         
                     │   ┌─────┐       ┌─────┐ 
                     └───┤ PE2 ├───────┤ CE3 │ 
                         └─────┘       └─────┘ 

In the above situation, there are three paths in total to the outside network. However, without addpath, PE1 exports a single EVPN route to the rest of the fabric, so the fabric only sees two paths to the outside network. With classic ECMP, the fabric splits traffic 50/50 between PE1 and PE2, and PE1 splits its half again between CE1 and CE2, so the CEs end up with an imbalanced 25%/25%/50% split.

By using addpath, PE1 can signal that it has two paths to the external network, giving a better overlay multipath split of 33%/33%/33%. This can be used as a workaround for vendors not supporting weighted ECMP, by feeding them all paths. In addition, this enables BGP inspection tools to see which paths are present.

Note: the above scenario can easily arise with dual redundant connections with 4 CEs, when one of the four fails.
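The split arithmetic above can be checked with a small sketch. It assumes perfectly uniform ECMP hashing at every hop; the `ce_shares` helper and its input layout are hypothetical and only model the diagram, not anything in FRR.

```python
from fractions import Fraction


def ce_shares(fabric_paths):
    """Return the fraction of traffic reaching each CE.

    fabric_paths holds one (pe, [ces]) entry per EVPN path visible to the
    fabric. The fabric splits evenly across entries, then the PE splits
    its share evenly across the CEs behind that entry.
    """
    shares = {}
    per_path = Fraction(1, len(fabric_paths))
    for _pe, ces in fabric_paths:
        for ce in ces:
            shares[ce] = shares.get(ce, Fraction(0)) + per_path / len(ces)
    return shares


# Without addpath: PE1 exports one route that hides CE1 and CE2 behind it.
without_addpath = ce_shares([("PE1", ["CE1", "CE2"]), ("PE2", ["CE3"])])

# With addpath: PE1 exports one path per CE.
with_addpath = ce_shares([("PE1", ["CE1"]), ("PE1", ["CE2"]), ("PE2", ["CE3"])])

print(without_addpath)  # CE1 and CE2 get 1/4 each, CE3 gets 1/2
print(with_addpath)     # every CE gets 1/3
```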

Import/export performed out-of-band

In virtualized environments, or with many peers with many tenants, or with some inter-tunnel routing setups, one may want to decouple BGP signaling from the actual router performing the forwarding.

In centralized gateway models, the actual routers are often redundant, anycasted, and sometimes even deployed as anycast VTEPs. This makes peering with them through the overlay impractical, because implementation details such as how many routers there are leak out, or outright impossible when they are anycast VTEPs. The signaling therefore needs to be performed out-of-band, by a speaker acting like a route server that also imports/exports routes to EVPN and leverages Gateway-IP overlay indexes to carry overlay routing information to the actual router(s).

         ┌───────┐                                 
         │  GW1  │                                 
         │  GW2  ├────────┐       EVPN type-5      
         │  GW3  │        │                        
         │  ...  │    ┌───┴───┐                    
         │  GWN  │    │  PE1  │                    
         └───┬───┘    └──┬─┬──┘                    
             │           │ │                       
             │        ┌──┘ └──┐   IPv[46] unicast  
             │        │       │                    
 Dataplane   │     ┌──┴──┐ ┌──┴──┐                 
             │     │ CE1 │ │ CE2 │                 
             │     └──┬──┘ └──┬──┘                 
             │        │       │                    
 ────────────┴────────┴───────┴────────            

For the N unicast gateways to know about both CE1 and CE2, PE1 would need either a route distinguisher per CE, or AddPath.
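As a toy illustration of why a single route distinguisher is not enough here, the sketch below models a receiver's table. It assumes, as in plain BGP, that a new advertisement for the same NLRI replaces the previous one unless a path ID is part of the key. The RD, prefix and `install` helper are made up for the example.

```python
def install(rib, rd, prefix, gateway, path_id=None):
    """Store a path in a receiver's table.

    Without a path ID the key is (rd, prefix), so a second advertisement
    for the same key replaces the first one.
    """
    key = (rd, prefix) if path_id is None else (rd, prefix, path_id)
    rib[key] = gateway


RD_PE1 = "192.0.2.1:10"      # hypothetical RD of the overlay VRF on PE1
PREFIX = "198.51.100.0/24"   # hypothetical prefix announced by both CEs

no_addpath = {}
install(no_addpath, RD_PE1, PREFIX, "CE1")
install(no_addpath, RD_PE1, PREFIX, "CE2")  # replaces the CE1 path

with_addpath = {}
install(with_addpath, RD_PE1, PREFIX, "CE1", path_id=1)
install(with_addpath, RD_PE1, PREFIX, "CE2", path_id=2)

print(len(no_addpath), len(with_addpath))
```

A distinct RD per CE would give the same result as the path ID, since either one makes the two keys differ.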

Another use case for out-of-band signaling is distributed routing, e.g. on hypervisors. If we consider CEs to be VMs and PEs to be hypervisors running FRR, a live migration would cause the BGP session to flap, as the VM would reconnect to the new hypervisor. It also requires shuttling FRR configuration around, increasing the chance of failure. One would want to push the peering outside of the hypervisor, keeping the session alive and reducing configuration changes.

Approach

Pretty much all EVPN functions handling type-5 routes were extended to carry addpath IDs, which are used to distinguish the routes during export. The IPv[46] unicast path's addpath TX ID is used as the EVPN path's addpath RX ID. This, however, required a large change to where EVPN hooks in for export: it previously ran only after the VRF's bestpath selection, so it only had access to the best path. The new hooks are modeled after MPLS VPN leaking, which is conceptually pretty close.
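The ID mapping described above can be sketched as follows. This is an illustrative model only, not the bgpd data structures: the class names, fields and `export_to_evpn` function are hypothetical, and using the unicast nexthop as the gateway IP is an assumption for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnicastPath:
    """A path in the overlay VRF's IPv4/IPv6 unicast table (simplified)."""
    prefix: str
    nexthop: str
    addpath_tx_id: int


@dataclass(frozen=True)
class EvpnType5Path:
    """An exported EVPN type-5 path (simplified)."""
    rd: str
    prefix: str
    gateway_ip: str
    addpath_rx_id: int


def export_to_evpn(vrf_rd, unicast_paths):
    """Export every unicast path, reusing its TX ID as the EVPN RX ID."""
    return [
        EvpnType5Path(vrf_rd, p.prefix, p.nexthop, p.addpath_tx_id)
        for p in unicast_paths
    ]


vrf_paths = [
    UnicastPath("198.51.100.0/24", "10.0.0.1", addpath_tx_id=1),
    UnicastPath("198.51.100.0/24", "10.0.0.2", addpath_tx_id=2),
]
evpn_paths = export_to_evpn("192.0.2.1:10", vrf_paths)
for path in evpn_paths:
    print(path)
```

Because the RD and prefix are identical for both exported paths, the RX ID is the only field that keeps them apart in the EVPN table.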

Route import was not touched: it worked out of the box, and leaving it alone avoids breaking installation of other route types.

The multipath behavior is only enabled when the advertise <ipv4|ipv6> unicast knob bears the gateway-ip flag, in which case, all paths are exported to EVPN, and addpath TX IDs are generated for the VRF.

  • This way, operators that don't encounter the multipath use case (which is most of them) don't pay the multipath price. Note, however, that for all paths to be exported to other peers, addpath-tx-all-paths needs to be enabled on the EVPN peers.
  • Enabling multipath without a Gateway-IP overlay index does not make much sense, as all paths would be basically identical (if all other attributes are identical) and would thus generate unneeded load in the EVPN instance.
  • Requiring Gateway-IP to enable multipath is not that big of a burden, and gives cues to operators when inspecting routing tables. Furthermore, those can easily be dropped anywhere on the route's path in BGP, or ignored by receiving PEs.
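The gating rule and the reasoning behind it can be modeled with a short sketch. It is a simplification under stated assumptions: paths are reduced to `(nexthop, is_best)` tuples, an exported path is reduced to `(vtep, overlay_index)`, and the function names and `LOCAL_VTEP` value are hypothetical.

```python
LOCAL_VTEP = "PE1"  # hypothetical name of the exporting VTEP


def export_paths(vrf_paths, gateway_ip_enabled):
    """Return the (vtep, overlay_index) pairs exported to EVPN.

    vrf_paths is a list of (nexthop, is_best) tuples. Without gateway-ip
    only the best path is exported and it carries no overlay index.
    """
    if not gateway_ip_enabled:
        return [(LOCAL_VTEP, None) for _nh, is_best in vrf_paths if is_best]
    return [(LOCAL_VTEP, nexthop) for nexthop, _is_best in vrf_paths]


def export_all_without_overlay_index(vrf_paths):
    """Counter-example: export every path but without an overlay index."""
    return [(LOCAL_VTEP, None) for _path in vrf_paths]


paths = [("CE1", True), ("CE2", False)]

print(export_paths(paths, gateway_ip_enabled=False))
print(export_paths(paths, gateway_ip_enabled=True))

# Both entries are identical from the receiver's point of view, so the
# extra path only adds load.
print(set(export_all_without_overlay_index(paths)))
```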

Other words

First of all, thank you for taking the time to review this PR, and for any feedback you might provide. Thanks also for this great piece of software!

I am, of course, open to changes in the approach taken here, and to more topotest scenarios.

@Tuetuopay
Contributor Author

Force push: missing signed-off-by and style fixes as noted by CI

@vayetze vayetze requested a review from chiragshah6 May 6, 2025 15:56
@donaldsharp donaldsharp self-requested a review May 6, 2025 15:56
Member

@ton31337 ton31337 left a comment

Does it work with addpath RX disabled flag or limiting the addpath paths to be sent? I meant not only with ALL paths.

This should make addpath work for EVPN when routes "pass-by" the BGP
instances without changing AFI (i.e. when not import/exporting to EVPN).

Signed-off-by: Tuetuopay <[email protected]>
@frrbot frrbot bot added the tests Topotests, make check, etc label May 7, 2025
@Tuetuopay
Contributor Author

Does it work with addpath RX disabled flag or limiting the addpath paths to be sent? I meant not only with ALL paths.

@ton31337 if you mean setting disable-addpath-rx in the IPv4 unicast peers in the overlay VRF, yes, it works as expected since RX addpath in the unicast vrf does not really matter.

If you mean in EVPN (like our R2 in the topotest), which I think you do since it's the issue at hand:

  • disable-addpath-rx works fine: the capability is dropped from the session, and only the selected best is transmitted. Note that, in the case of a multipath route, the sent one may not be stable (well, at least in my examples) as the paths are strictly equal. The only differentiator would be the gateway IP, but it's not part of the route comparison code. It will rely on the first path received. I did not add such a discriminant as it's not described in RFCs AFAIK.
  • addpath-tx-bestpath-per-AS also (now) works as expected ("now" because I fixed a bug when switching away from a specific addpath strategy).

I added two scenarios in the topotest to ensure those work.
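For readers following along, here is a rough model of the behaviors discussed in this reply. It is a sketch under strong assumptions: all paths are strictly equal so "best" is simply the first one received, the `best-only` label stands in for a session without the addpath capability, and `select_for_tx` is a hypothetical helper rather than bgpd code.

```python
def select_for_tx(paths, strategy):
    """Pick the paths sent to an EVPN peer.

    paths is a list of dicts in arrival order, each with 'gateway_ip' and
    'neighbor_as' keys. With strictly equal paths, the first one received
    is treated as best.
    """
    if not paths:
        return []
    if strategy == "all-paths":
        return list(paths)
    if strategy == "bestpath-per-AS":
        first_per_as = {}
        for path in paths:
            first_per_as.setdefault(path["neighbor_as"], path)
        return list(first_per_as.values())
    if strategy == "best-only":
        return [paths[0]]
    raise ValueError(f"unknown strategy: {strategy}")


received = [
    {"gateway_ip": "CE1", "neighbor_as": 65001},
    {"gateway_ip": "CE2", "neighbor_as": 65001},
    {"gateway_ip": "CE3", "neighbor_as": 65002},
]

for name in ("all-paths", "bestpath-per-AS", "best-only"):
    print(name, [p["gateway_ip"] for p in select_for_tx(received, name)])

# The single advertised path depends on arrival order, which is the
# instability mentioned above.
print([p["gateway_ip"] for p in select_for_tx(received[::-1], "best-only")])
```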

Tuetuopay added 7 commits May 8, 2025 18:03
This will properly generate multipath EVPN paths when the IPv4/IPv6
unicast one is multipath, properly forwarding AddPath IDs for multipath
EVPN propagation.

All paths will be forwarded to EVPN (even not selected ones), for
filtering later with `addpath-tx-all-paths` or `addpath-tx-best-path`.

Signed-off-by: Tuetuopay <[email protected]>
When the overlay index is NONE (i.e. use the router's mac), even if we
export multiple paths, they will all point to ourselves. However, when
the `gateway-ip` option is enabled, receiving VTEPs can infer multiple
paths should they have access to the overlay index. This is indeed what
FRR does when receiving type-5 routes bearing such an overlay index.

Thus, only export multiple paths through addpath when the user
explicitly asks for it with `advertise ipv[46] unicast gateway-ip`. This
way, users that don't use the `gateway-ip` flag don't pay the price of
multipath.

Signed-off-by: Tuetuopay <[email protected]>
For EVPN multipath (when the overlay index mode is `gateway-ip`), the
addpath IDs are used to distinguish routes in the EVPN VRF. However, the
current logic only counts peers (and peer-groups) that perform any form
of TX addpath.

This patch counts the `gateway-ip` knob as one "consumer" of addpath tx
ids, ensuring they are present even though peers in the VRF don't
transmit using addpath.

Signed-off-by: Tuetuopay <[email protected]>
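The consumer counting described in this commit message can be sketched roughly as below. The functions and the peer representation are hypothetical and only mirror the rule stated in the message: TX IDs are needed whenever at least one consumer exists, and the `gateway-ip` knob counts as one.

```python
def addpath_tx_consumers(peers, gateway_ip_enabled):
    """Count the users of addpath TX IDs in a VRF.

    peers is a list of dicts with a boolean 'addpath_tx' key. The
    gateway-ip knob counts as one extra consumer.
    """
    count = sum(1 for peer in peers if peer["addpath_tx"])
    if gateway_ip_enabled:
        count += 1
    return count


def tx_ids_needed(peers, gateway_ip_enabled):
    """TX IDs must be allocated as soon as there is one consumer."""
    return addpath_tx_consumers(peers, gateway_ip_enabled) > 0


# No peer in the VRF transmits with addpath.
vrf_peers = [{"addpath_tx": False}, {"addpath_tx": False}]

print(tx_ids_needed(vrf_peers, gateway_ip_enabled=False))
print(tx_ids_needed(vrf_peers, gateway_ip_enabled=True))
```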
Don't output a leading comma `,` for `numPrefix` when there are no
routes before in the object.

The following command:

    show bgp l2vpn evpn [route detail] json

produced the invalid object:

    {
    ,"numPrefix":0,"numPaths":0}

Signed-off-by: Tuetuopay <[email protected]>
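The class of bug fixed by this commit can be reproduced in a few lines. This is an illustration of the technique (emit the separator only when something precedes it), not the actual bgpd code. The `render` and `is_valid_json` helpers and the sample entry are made up.

```python
import json

# The invalid output quoted in the commit message.
BROKEN = '{\n,"numPrefix":0,"numPaths":0}'


def is_valid_json(text):
    """Return True if text parses as JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


def render(prefix_entries, num_paths):
    """Build the object by hand, the way a streaming printer would.

    prefix_entries is a list of pre-rendered '"key":value' strings.
    """
    out = "{\n"
    out += ",".join(prefix_entries)
    if prefix_entries:
        # Only separate the summary from preceding entries if there are any.
        out += ","
    out += '"numPrefix":%d,"numPaths":%d}' % (len(prefix_entries), num_paths)
    return out


print(is_valid_json(BROKEN))
print(is_valid_json(render([], 0)))
print(json.loads(render(['"198.51.100.0/24":{}'], 1)))
```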
This test ensures routes are properly exported to EVPN, handled, and
imported from EVPN.

Signed-off-by: Tuetuopay <[email protected]>
Since the behavior changes between with and without gateway-ip, it needs
to be explicitly mentioned in the docs.

Signed-off-by: Tuetuopay <[email protected]>
These two new scenarios ensure we can tune the TX AddPath strategy in
EVPN, and switch from one to the other live.

Signed-off-by: Tuetuopay <[email protected]>
This pull request has conflicts, please resolve those before we can evaluate the pull request.

Member

@riw777 riw777 left a comment

looks good

@pbrisset pbrisset self-requested a review May 28, 2025 20:56
4 participants