BGP and SD-WAN are like peanut butter and jelly — just better together. And given that a FortiGate has full-blown BGP routing capabilities in addition to its SD-WAN capabilities, it would make sense to use the two functions to share information with each other when steering traffic. To plagiarize our FortiDocs website, “SD-WAN allows you to select different outbound WAN links based on performance SLAs. It is important that BGP neighbors are aware of these settings, and changes to them.” (source: https://docs.fortinet.com/document/fortigate/7.2.0/administration-guide/256748/controlling-traffic-with-bgp-route-mapping-and-service-rules).
And to further plagiarize that FortiDocs page, “BGP can adapt to changes in SD-WAN link SLAs in the following ways:”
- Applying different route-maps based on the SD-WAN’s health checks. For example, different BGP community strings can be advertised to BGP neighbors when SLAs are not met.
- Traffic can be selectively forwarded based on the active BGP neighbor. If the SD-WAN service’s role matches the active SD-WAN neighbor, the service is enabled. If there is no match, then the service is disabled.
To illustrate how I use this method in my SD-WAN Lab, here is a diagram illustrating the components:
In the above diagram, you can see that when our VPN1 tunnel SD-WAN SLA is healthy, the Branch marks the updates it sends to the Hub over VPN1 with a Community that the Hub translates into Route-Tag 1. The Hub's SD-WAN rule then steers traffic for routes carrying Route-Tag 1 out the VPN1 tunnel. This helps us pin outbound and return traffic to the same VPN tunnel that it was received on.
Now we’ll step into the configuration we’ve used and conclude this post with some useful diagnostic commands to verify the Route Tags we use when SD-WAN is in SLA vs. out of SLA.
Hub Configuration
Prefix Lists
We don’t use Prefix Lists here on the Hub, but they typically are used to filter the networks advertised to BGP neighbors. However, we will utilize Prefix Lists on the Branches further down below…
Community Lists
Community lists tie communities, such as xxxxxx:x that are sent by the Branches, to a locally-significant value on the Hub. In my lab, Branches using VPN1 send Community 65000:1, Branches using VPN2 send 65000:2 and Branches using VPN tunnels that are out of SLA send 65000:5. These communities are only useful if the neighbor (i.e. the Branch) sends them to the Hub in their BGP updates; we’ll configure the Branches to do just that further down below. But here is my config for my Hub’s Community lists to match those inbound communities:
config router community-list
    edit "VPN1"
        config rule
            edit 1
                set action permit
                set match "65000:1"
            next
        end
    next
    edit "VPN2"
        config rule
            edit 1
                set action permit
                set match "65000:2"
            next
        end
    next
    edit "Out-of-SLA"
        config rule
            edit 1
                set action permit
                set match "65000:5"
            next
        end
    next
end
Route-Maps
Route-Maps map community lists to route tags. They take the community lists we defined earlier and when matched, set a Route-Tag value inbound in BGP updates from the Branches. If a neighbor tells us (the Hub) that the community is 65000:1, then we (the Hub) know to set the Route-Tag to 1 in our BGP routing table. Here is the configuration I use in my lab for the mappings:
config router route-map
    edit "RouteMap-VPN-RouteTag"
        config rule
            edit 1
                set match-community "VPN1"
                set set-route-tag 1
            next
            edit 2
                set match-community "VPN2"
                set set-route-tag 2
            next
            edit 3
                set match-community "Out-of-SLA"
                set set-route-tag 5
            next
        end
    next
end
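To make the mapping concrete, here is a minimal Python sketch of what the route-map above is doing. It is only a model for illustration, not FortiOS code: the names ROUTE_MAP_RULES and route_tag_for are hypothetical, and returning None when no community matches is an assumption of the sketch, not a statement about how FortiOS treats such a route.

```python
# Hypothetical model of the Hub's "RouteMap-VPN-RouteTag" route-map.
# Rules are listed in the same order as in the config above.
ROUTE_MAP_RULES = [
    ("65000:1", 1),  # community-list "VPN1"       -> route-tag 1
    ("65000:2", 2),  # community-list "VPN2"       -> route-tag 2
    ("65000:5", 5),  # community-list "Out-of-SLA" -> route-tag 5
]


def route_tag_for(communities):
    """Return the route-tag of the first rule whose community is present.

    Returns None when no rule matches (an assumption of this sketch).
    """
    for community, tag in ROUTE_MAP_RULES:
        if community in communities:
            return tag
    return None


if __name__ == "__main__":
    for received in (["65000:1"], ["65000:2"], ["65000:5"], []):
        print(received, "->", route_tag_for(received))
```

The ordered list mirrors the rule numbering (1, 2, 3) in the config, so the first matching rule decides the tag.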
BGP Configuration
In the Hub’s BGP configuration, we configure it to map those Communities to Route-Tags on inbound updates for each of the neighbor groups on both the VPN tunnels:
config router bgp
    config neighbor-group
        edit "VPN1"
            set route-map-in "RouteMap-VPN-RouteTag"
        next
        edit "VPN2"
            set route-map-in "RouteMap-VPN-RouteTag"
        next
    end
end
SD-WAN Configuration
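Finally, we configure SD-WAN on the Hubs to steer traffic whose destination route carries Route-Tag 1 out VPN1 (which is priority-member 1 in my config), and traffic whose destination route carries Route-Tag 2 out VPN2. This helps us pin return traffic to the tunnel it arrived on: inbound VPN1 traffic goes back out VPN1, and likewise for VPN2: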
config system sdwan
    config service
        edit 1
            set name "VPN1-RouteTag-1"
            set route-tag 1
            set src "all"
            set priority-members 1
        next
        edit 2
            set name "VPN2-RouteTag-2"
            set route-tag 2
            set src "all"
            set priority-members 2
        next
    end
end
Spoke Configuration
Prefix Lists
We use Prefix Lists so that only our local LAN subnets receive a Community (and therefore a Route-Tag on the Hub). Prefix Lists also help us advertise only our local networks to our BGP neighbors. Here is the prefix-list I use in my lab for my Branch’s LAN:
config router prefix-list
    edit "Branch-LAN"
        config rule
            edit 1
                set prefix 172.16.200.0 255.255.255.0
                unset ge
                unset le
            next
        end
    next
end
It’s worth noting that in my lab I use Metadata Variables, courtesy of FortiManager 7.2, to plug in the network value dynamically; the config snippet above illustrates how to hardcode it instead.
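As a rough illustration of what this prefix-list rule selects, here is a small Python sketch. It assumes that leaving ge and le unset means the rule matches only the exact prefix and length, which is the conventional prefix-list behavior; the names BRANCH_LAN and matches_branch_lan are hypothetical and exist only for this example.

```python
import ipaddress

# The Branch LAN from the "Branch-LAN" prefix-list (172.16.200.0 255.255.255.0).
BRANCH_LAN = ipaddress.ip_network("172.16.200.0/24")


def matches_branch_lan(prefix):
    """Exact match only: same network address and same prefix length.

    Assumption: with ge/le unset, longer or shorter prefixes do not match.
    """
    return ipaddress.ip_network(prefix) == BRANCH_LAN


if __name__ == "__main__":
    for candidate in ("172.16.200.0/24", "172.16.200.0/25", "172.16.0.0/16"):
        print(candidate, "->", matches_branch_lan(candidate))
```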
Community Lists
We don’t use Community Lists on our Branches for matching, but our Hubs use them to map the Communities we send from the Branches to Route-Tags.
Route-Maps
We use Route-Maps to set a Community value for outbound updates to the Hubs; we set Community 65000:1 on BGP advertisements sent out VPN1 and likewise for 65000:2 on VPN2:
config router route-map
    edit "VPN1"
        config rule
            edit 1
                set match-ip-address "Branch-LAN"
                set set-community "65000:1"
            next
        end
    next
    edit "VPN2"
        config rule
            edit 1
                set match-ip-address "Branch-LAN"
                set set-community "65000:2"
            next
        end
    next
    edit "Out-of-SLA"
        config rule
            edit 1
                set match-ip-address "Branch-LAN"
                set set-community "65000:5"
            next
        end
    next
end
BGP Configuration
Here is where we configure our Branch to set the Community to 65000:1 out to the Hub via VPN1 preferably, assuming our SD-WAN Performance SLA is good. Otherwise we’ll send Community 65000:5.
config router bgp
    config neighbor
        edit "169.254.100.253"
            set route-map-out "Out-of-SLA"
            set route-map-out-preferable "VPN1"
        next
        edit "169.254.200.253"
            set route-map-out "Out-of-SLA"
            set route-map-out-preferable "VPN2"
        next
    end
end
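The choice between the two outbound route-maps can be sketched in a few lines of Python. This is a model of the behavior described above, not FortiOS code: the dictionaries and the advertised_community function are hypothetical names, and the sla_met flag stands in for the result of the SD-WAN Performance SLA.

```python
# Hypothetical model of the Branch's per-neighbor outbound route-map choice.
NEIGHBORS = {
    "169.254.100.253": {"route_map_out": "Out-of-SLA",
                        "route_map_out_preferable": "VPN1"},
    "169.254.200.253": {"route_map_out": "Out-of-SLA",
                        "route_map_out_preferable": "VPN2"},
}

# Community that each Branch route-map sets, per the Route-Maps section.
COMMUNITY_SET_BY = {
    "VPN1": "65000:1",
    "VPN2": "65000:2",
    "Out-of-SLA": "65000:5",
}


def advertised_community(neighbor_ip, sla_met):
    """Pick the preferable route-map when the SLA is met, else the default."""
    cfg = NEIGHBORS[neighbor_ip]
    if sla_met:
        route_map = cfg["route_map_out_preferable"]
    else:
        route_map = cfg["route_map_out"]
    return COMMUNITY_SET_BY[route_map]


if __name__ == "__main__":
    for ip in NEIGHBORS:
        for sla_met in (True, False):
            print(ip, "SLA met" if sla_met else "SLA missed",
                  "->", advertised_community(ip, sla_met))
```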
SD-WAN Config
This snippet of our Branch SD-WAN config is required to tie our Performance SLAs to our BGP neighbors. Each neighbor IP address in SD-WAN needs to match a neighbor IP address in BGP. In my lab, the “HUB” health check pings a loopback IP address on my Hub FortiGate cluster, and I tie its result to the BGP updates sent to each Hub neighbor:
config system sdwan
    config neighbor
        edit "169.254.100.253"
            set member 1
            set health-check "HUB"
            set sla-id 1
        next
        edit "169.254.200.253"
            set member 2
            set health-check "HUB"
            set sla-id 1
        next
    end
end
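Because the SD-WAN neighbor entries only work when their IP addresses line up with the BGP neighbors, a quick offline sanity check can be handy. The Python sketch below is a hypothetical helper, not a FortiOS feature: it assumes you have already copied the neighbor IPs from both config sections into plain Python collections.

```python
# Neighbor IPs copied by hand from "config router bgp" on the Branch.
BGP_NEIGHBORS = {"169.254.100.253", "169.254.200.253"}

# Neighbor entries copied by hand from "config system sdwan".
SDWAN_NEIGHBORS = {
    "169.254.100.253": {"member": 1, "health_check": "HUB", "sla_id": 1},
    "169.254.200.253": {"member": 2, "health_check": "HUB", "sla_id": 1},
}


def unmatched_neighbors(bgp, sdwan):
    """Return the sorted IPs that appear on only one side."""
    return sorted(set(bgp) ^ set(sdwan))


if __name__ == "__main__":
    missing = unmatched_neighbors(BGP_NEIGHBORS, SDWAN_NEIGHBORS)
    print("mismatched neighbors:", missing if missing else "none")
```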
Recap
Let’s take a second to recap all that we’ve configured. When our Branch/Spoke’s “HUB” SD-WAN SLA is healthy, the Branch advertises Community 65000:1 out VPN1 and 65000:2 out VPN2. The Hub then maps these Communities to Route-Tags when it stores the routes in the BGP database: Community 65000:1 maps to Route-Tag 1, 65000:2 to Route-Tag 2 and 65000:5 to Route-Tag 5. The Hub’s SD-WAN rules match Route-Tag 1 to VPN1 and Route-Tag 2 to VPN2 when forwarding traffic back to the Branch.
In the unfortunate event that our “HUB” SD-WAN SLA becomes unhealthy and fails, the Branch will advertise Community 65000:5 out that VPN1 or VPN2 tunnel. The Hub will then receive this Community and store the route in its local BGP database with Route-Tag 5, which none of the Hub’s SD-WAN rules match. Only when the Route-Tag is 1 or 2 (meaning the Branch is in SLA) does the Hub pin traffic back to the Branch to the corresponding VPN tunnel.
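The whole chain can be walked end to end in a short Python sketch. It only models the recap above under the lab's values; the names are hypothetical, and a result of None simply means that neither of the Hub's two Route-Tag rules matches, so this sketch says nothing about how the Hub forwards that traffic instead.

```python
# Branch side: community advertised per tunnel while the SLA is met.
IN_SLA_COMMUNITY = {"VPN1": "65000:1", "VPN2": "65000:2"}
OUT_OF_SLA_COMMUNITY = "65000:5"

# Hub side: community -> route-tag (route-map), route-tag -> tunnel (SD-WAN rule).
ROUTE_TAG = {"65000:1": 1, "65000:2": 2, "65000:5": 5}
RULE_MEMBER = {1: "VPN1", 2: "VPN2"}


def hub_pinned_tunnel(tunnel, sla_met):
    """Return the tunnel the Hub's SD-WAN rules pin return traffic to.

    None means no Route-Tag rule matched (Route-Tag 5 in this lab).
    """
    community = IN_SLA_COMMUNITY[tunnel] if sla_met else OUT_OF_SLA_COMMUNITY
    tag = ROUTE_TAG[community]
    return RULE_MEMBER.get(tag)


if __name__ == "__main__":
    for tunnel in ("VPN1", "VPN2"):
        for sla_met in (True, False):
            state = "in SLA" if sla_met else "out of SLA"
            print(f"{tunnel} {state}: pinned to {hub_pinned_tunnel(tunnel, sla_met)}")
```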
Verification / Troubleshooting
In my lab, my Branch SD-WAN Rule prefers VPN1, as long as the “HUB” SLA is healthy:
And here is what the Hub’s BGP database looks like when the Branch/Spoke is in SLA for routes received from VPN1 (169.254.100.x) and from VPN2 (169.254.200.x):
You can see neighbor 169.254.100.2 (Branch on VPN1) has Community 65000:1 and Route-Tag 1. Likewise for neighbor 169.254.200.7 on VPN2. We also examine SD-WAN on the Hub and see that when we match Route-Tag 1, we route traffic out VPN1 (and likewise for 2):
In my lab, I intentionally made my “HUB” SD-WAN SLA fail by tweaking WANem as my “Internet Backbone” and introducing 500ms latency across the Hub’s VPN1 tunnel. Here is what the Hub’s BGP database looks like when the Branch/Spoke VPN1 tunnel is out of SLA for routes received:
And when we examine the Hub’s SD-WAN rules, we see that it is not setting the Route-Tag on updates back to the Branch:
Equally important, we see “service disabled caused by no destination”, meaning we’re not using VPN1 in our SD-WAN rules on the Hub:
The “Last Used” column reflects this: it shows that VPN1 was last used last week, which is when I introduced the 500ms latency on that link.
Conclusion
Using SD-WAN SLAs and Route Tags, we pin traffic (both inbound and outbound) to a VPN tunnel while we measure the health of that tunnel. I wish I could claim credit for this design, which uses Route-Tags from BGP to drive SD-WAN rules, but I work with a lot of brilliant engineers and this is a common design that they’ve shared with me. And it lays the groundwork to do more creative things with the Out-of-SLA Route-Tag (5 in my lab) on your Hub’s SD-WAN rules.
If you’d like to read further, these two FortiDocs pages have great information:
- https://docs.fortinet.com/document/fortigate/7.2.0/administration-guide/256748/controlling-traffic-with-bgp-route-mapping-and-service-rules
- https://docs.fortinet.com/document/fortigate/7.2.0/administration-guide/89370/applying-bgp-route-map-to-multiple-bgp-neighbors
Thanks for reading and please leave any questions or comments in the comments section below.
Andrew
How do you configure connections that are initiated from the HUB to use VPN1 as the primary and VPN2 as the secondary? When I try this config it seems to pick and choose which one it wants to use.
David, you’re right that the Hub uses BGP to choose. Routes from both VPN tunnels have a distance of 200, I believe, so it would use equal-cost multipathing. If you wanted to prefer one VPN tunnel over the other, you could set the weight higher on the preferred neighbor (i.e. VPN1 in this case). The downside is that the Hub will then prefer that VPN tunnel rather than pinning traffic to the same tunnel it was received on. When I increased the weight of VPN1 on the Hub, all the Branches preferred it as long as the SLA was healthy, regardless of my Route-Tag.
In my lab, I used the following config to give VPN1 a higher weight (default is 100):
config router bgp
    config neighbor-group
        edit "VPN1"
            set weight 150
        next
    end
end
Then I confirmed the weights were correct for routes learned via VPN1 (after an “exec router restart”):
Hub1 # get router info bgp network
VRF 0 BGP table version is 1, local router ID is 192.168.250.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop        Metric LocPrf Weight RouteTag Path
*> 10.0.0.0/24      0.0.0.0                   100  32768        0 i <-/1>
* i172.16.100.0/24  169.254.200.7        0    100      0        2 i <-/->
*>i                 169.254.100.2        0    100    150        1 i <-/1>
And that the routes stored in the forward information base (FIB) were the ones via VPN1:
Hub1 # get router info routing-table all
……
B 172.16.100.0/24 [200/0] via 169.254.100.2 (recursive is directly connected, VPN1), 00:00:08, [1/0]
B 172.16.101.0/24 [200/0] via 169.254.100.9 (recursive is directly connected, VPN1), 00:00:08, [1/0]
B 172.16.102.0/24 [200/0] via 169.254.100.8 (recursive is directly connected, VPN1), 00:00:08, [1/0]
……
The bottom of this document shows the BGP path selection process and what you can tweak to prefer one path over another: https://docs2.fortinet.com/document/fortigate/7.2.4/administration-guide/55339/troubleshooting-bgp
In this case, it’s a design decision as to whether you want return traffic pinned to the same VPN tunnel or forced over the “preferred” VPN tunnel, but either is doable. Great question and please let me know if I can help further! Andrew
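To show why raising the weight overrides the pinning, here is a deliberately simplified Python sketch. It compares only the weight attribute (higher wins) for the two paths to 172.16.100.0/24 from the output above; real BGP best-path selection has many more steps, and the Path type and best_by_weight function are hypothetical names for this example.

```python
from typing import NamedTuple


class Path(NamedTuple):
    next_hop: str
    weight: int
    route_tag: int


# The two paths to 172.16.100.0/24 shown in the BGP table above.
PATHS = [
    Path(next_hop="169.254.200.7", weight=0, route_tag=2),    # via VPN2
    Path(next_hop="169.254.100.2", weight=150, route_tag=1),  # via VPN1
]


def best_by_weight(paths):
    """Simplification: pick the path with the highest weight only."""
    return max(paths, key=lambda p: p.weight)


if __name__ == "__main__":
    best = best_by_weight(PATHS)
    print("best path via", best.next_hop, "with Route-Tag", best.route_tag)
```

With only the VPN1 path installed as best, the route the Hub uses always carries Route-Tag 1, which is consistent with all Branch traffic preferring VPN1 in the test above.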
Could this be remedied by having a like for like config on both sides? Would this force both sides to prefer VPN1 based on received SLAs?