FortiGate BGP and SD-WAN

BGP and SD-WAN are like peanut butter and jelly: just better together. Since a FortiGate has full-blown BGP routing capabilities in addition to its SD-WAN capabilities, it makes sense to have the two functions share information with each other when steering traffic. To plagiarize our FortiDocs website, “SD-WAN allows you to select different outbound WAN links based on performance SLAs. It is important that BGP neighbors are aware of these settings, and changes to them.” (source: https://docs.fortinet.com/document/fortigate/7.2.0/administration-guide/256748/controlling-traffic-with-bgp-route-mapping-and-service-rules).

And to further plagiarize that FortiDocs page, “BGP can adapt to changes in SD-WAN link SLAs in the following ways:

  • Applying different route-maps based on the SD-WAN’s health checks. For example, different BGP community strings can be advertised to BGP neighbors when SLAs are not met.
  • Traffic can be selectively forwarded based on the active BGP neighbor. If the SD-WAN service’s role matches the active SD-WAN neighbor, the service is enabled. If there is no match, then the service is disabled.”

To show how I use this method in my SD-WAN Lab, here is a diagram of the components:

In the above diagram, you can see that when our VPN1 tunnel SD-WAN SLA is healthy, the Branch’s updates to the Hub end up with Route-Tag 1 (the Branch sends a BGP Community and the Hub maps it to the Route-Tag). The Hub then steers traffic for routes with Route-Tag 1 out the VPN1 tunnel. This helps us pin outbound and return traffic to the same VPN tunnel that it was received on.

Now we’ll step into the configuration we’ve used and conclude this post with some useful diagnostic commands to verify the Route Tags we use when SD-WAN is in SLA vs. out of SLA.

Hub Configuration

Prefix Lists

We don’t use Prefix Lists here on the Hub, but they typically are used to filter the networks advertised to BGP neighbors. However, we will utilize Prefix Lists on the Branches further down below…

Community Lists

Community lists tie communities, such as xxxxxx:x that are sent by the Branches, to a locally-significant value on the Hub. In my lab, Branches using VPN1 send Community 65000:1, Branches using VPN2 send 65000:2 and Branches using VPN tunnels that are out of SLA send 65000:5. These communities are only useful if the neighbor (i.e. the Branch) sends them to the Hub in their BGP updates; we’ll configure the Branches to do just that further down below. But here is my config for my Hub’s Community lists to match those inbound communities:

config router community-list
    edit "VPN1"
        config rule
            edit 1
                set action permit
                set match "65000:1"
            next
        end
    next
    edit "VPN2"
        config rule
            edit 1
                set action permit
                set match "65000:2"
            next
        end
    next
    edit "Out-of-SLA"
        config rule
            edit 1
                set action permit
                set match "65000:5"
            next
        end
    next
end

Route-Maps

Route-Maps map community lists to route tags. They take the community lists we defined earlier and when matched, set a Route-Tag value inbound in BGP updates from the Branches. If a neighbor tells us (the Hub) that the community is 65000:1, then we (the Hub) know to set the Route-Tag to 1 in our BGP routing table. Here is the configuration I use in my lab for the mappings:

config router route-map
    edit "RouteMap-VPN-RouteTag"
        config rule
            edit 1
                set match-community "VPN1"
                set set-route-tag 1
            next
            edit 2
                set match-community "VPN2"
                set set-route-tag 2
            next
            edit 3
                set match-community "Out-of-SLA"
                set set-route-tag 5
            next
        end
    next
end
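
As a mental model only (this is Python, not FortiOS, and it is not how FortiOS implements route-maps), the Hub’s inbound mapping above behaves roughly like an ordered lookup from Community to Route-Tag. The names `COMMUNITY_TO_ROUTE_TAG` and `route_tag_for` are my own, made up for illustration:

```python
# Illustrative model of the Hub's "RouteMap-VPN-RouteTag" rules.
# Hypothetical names; the values come from the lab config above.
COMMUNITY_TO_ROUTE_TAG = {
    "65000:1": 1,  # rule 1: community-list "VPN1"
    "65000:2": 2,  # rule 2: community-list "VPN2"
    "65000:5": 5,  # rule 3: community-list "Out-of-SLA"
}


def route_tag_for(communities):
    """Return the Route-Tag of the first rule (in rule order) whose
    community is present on the route, or None if no rule matches."""
    for community, tag in COMMUNITY_TO_ROUTE_TAG.items():
        if community in communities:
            return tag
    return None  # this sketch does not model what happens to unmatched routes


print(route_tag_for(["65000:1"]))  # route learned from a Branch on VPN1, in SLA
```

Python dictionaries keep insertion order, so the loop checks the rules in the same 1, 2, 3 order as the route-map.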

BGP Configuration

In the Hub’s BGP configuration, we configure it to map those Communities to Route-Tags on inbound updates for each of the neighbor groups on both the VPN tunnels:

config router bgp
    config neighbor-group
        edit "VPN1"
            set route-map-in "RouteMap-VPN-RouteTag"
        next
        edit "VPN2"
            set route-map-in "RouteMap-VPN-RouteTag"
        next
    end
end

SD-WAN Configuration

Finally, we configure SD-WAN on the Hubs to send traffic for routes with Route-Tag 1 out VPN1 (which is priority-member 1 in my config) and traffic for routes with Route-Tag 2 out VPN2. This helps us pin return traffic to the same VPN tunnel the Branch’s traffic arrived on:

config system sdwan
    config service
        edit 1
            set name "VPN1-RouteTag-1"
            set route-tag 1
            set src "all"
            set priority-members 1
        next
        edit 2
            set name "VPN2-RouteTag-2"
            set route-tag 2
            set src "all"
            set priority-members 2
        next
    end
end
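
To make the intent of these two rules concrete, here is a small Python sketch (again a simplified model, not FortiOS behavior in full) of how a destination’s Route-Tag picks the outbound member. `SERVICES` and `member_for_route_tag` are hypothetical names I made up for this example:

```python
# Simplified model of the Hub's two SD-WAN service rules above.
SERVICES = [
    {"name": "VPN1-RouteTag-1", "route_tag": 1, "member": 1},
    {"name": "VPN2-RouteTag-2", "route_tag": 2, "member": 2},
]


def member_for_route_tag(route_tag):
    """Return the SD-WAN member of the first rule matching the
    destination's Route-Tag, or None when no rule matches."""
    for service in SERVICES:
        if service["route_tag"] == route_tag:
            return service["member"]
    return None  # e.g. Route-Tag 5 (out of SLA): neither rule applies


for tag in (1, 2, 5):
    print(tag, member_for_route_tag(tag))
```

The point to notice is that Route-Tag 5 deliberately has no rule, so out-of-SLA routes are not pinned to either tunnel by these rules.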

Spoke Configuration

Prefix Lists

We use Prefix Lists to limit the Communities we set (and therefore the Route-Tags the Hub assigns) to only our local LAN subnets. Prefix Lists also help us advertise only our local networks to our BGP neighbors. Here is the prefix-list I use in my lab for my Branch’s LAN:

config router prefix-list
    edit "Branch-LAN"
        config rule
            edit 1
                set prefix 172.16.200.0 255.255.255.0
                unset ge
                unset le
            next
        end
    next
end

It’s worth noting that in my lab I actually use Metadata Variables, courtesy of FortiManager 7.2, to plug in dynamic values for the network, but the above config snippet illustrates how to hardcode the network value.
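
With `ge` and `le` unset, the prefix-list rule is an exact match on both the network and the prefix length. Here is a minimal Python sketch of that idea using the standard `ipaddress` module; the names `BRANCH_LAN` and `matches_branch_lan` are hypothetical and only for illustration:

```python
import ipaddress

# The Branch LAN from the prefix-list above (172.16.200.0 255.255.255.0).
BRANCH_LAN = ipaddress.ip_network("172.16.200.0/24")


def matches_branch_lan(prefix):
    """Exact match: same network address AND same prefix length,
    which is how a rule without ge/le is expected to behave."""
    return ipaddress.ip_network(prefix) == BRANCH_LAN


for candidate in ("172.16.200.0/24", "172.16.200.0/25", "172.16.0.0/16"):
    print(candidate, matches_branch_lan(candidate))
```

A more specific /25 inside the LAN or a covering /16 does not match, so only the /24 itself would pick up the Community in the route-maps that reference this list.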

Community Lists

We don’t use Community Lists on our Branches for matching, but our Hubs use them to map the Communities we send from the Branches to Route-Tags.

Route-Maps

We use Route-Maps to set a Community value on outbound updates to the Hubs: Community 65000:1 on BGP advertisements sent out VPN1, 65000:2 on those sent out VPN2, and 65000:5 when the tunnel is out of SLA:

config router route-map
    edit "VPN1"
        config rule
            edit 1
                set match-ip-address "Branch-LAN"
                set set-community "65000:1"
            next
        end
    next
    edit "VPN2"
        config rule
            edit 1
                set match-ip-address "Branch-LAN"
                set set-community "65000:2"
            next
        end
    next
    edit "Out-of-SLA"
        config rule
            edit 1
                set match-ip-address "Branch-LAN"
                set set-community "65000:5"
            next
        end
    next
end

BGP Configuration

Here is where we configure our Branch to set the Community to 65000:1 out to the Hub via VPN1 preferably, assuming our SD-WAN Performance SLA is good. Otherwise we’ll send Community 65000:5.

config router bgp
    config neighbor
        edit "169.254.100.253"
            set route-map-out "Out-of-SLA"
            set route-map-out-preferable "VPN1"
        next
        edit "169.254.200.253"
            set route-map-out "Out-of-SLA"
            set route-map-out-preferable "VPN2"
        next
    end
end
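
The choice between `route-map-out` and `route-map-out-preferable` can be pictured as a simple switch on the SLA state. The Python below is only a sketch of that decision, assuming a boolean `sla_met` input; the dictionaries and the function name `advertised_community` are hypothetical, with values copied from my lab config:

```python
# Per-neighbor route-maps from the Branch BGP config above.
NEIGHBORS = {
    "169.254.100.253": {"route_map_out": "Out-of-SLA", "route_map_out_preferable": "VPN1"},
    "169.254.200.253": {"route_map_out": "Out-of-SLA", "route_map_out_preferable": "VPN2"},
}

# Community set by each Branch route-map.
ROUTE_MAP_COMMUNITY = {"VPN1": "65000:1", "VPN2": "65000:2", "Out-of-SLA": "65000:5"}


def advertised_community(neighbor_ip, sla_met):
    """Pick the preferable route-map while the SLA is met, otherwise the
    default outbound route-map, and return the Community it sets."""
    neighbor = NEIGHBORS[neighbor_ip]
    key = "route_map_out_preferable" if sla_met else "route_map_out"
    return ROUTE_MAP_COMMUNITY[neighbor[key]]


print(advertised_community("169.254.100.253", sla_met=True))
print(advertised_community("169.254.100.253", sla_met=False))
```

Each neighbor is evaluated on its own, so VPN1 can advertise the out-of-SLA Community while VPN2 keeps advertising 65000:2.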

SD-WAN Configuration

This snippet of our Branch SD-WAN config is required to tie our Performance SLAs to our BGP neighbors. Each neighbor IP address in SD-WAN needs to match each neighbor IP address in BGP. In my lab, I tie my “HUB” health check ping of a loopback IP address on my Hub FortiGate cluster to the BGP updates to the Hub neighbor:

config system sdwan
    config neighbor
        edit "169.254.100.253"
            set member 1
            set health-check "HUB"
            set sla-id 1
        next
        edit "169.254.200.253"
            set member 2
            set health-check "HUB"
            set sla-id 1
        next
    end
end
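
Since the neighbor IP addresses in SD-WAN have to line up with the ones in BGP, a quick sanity check can save some head-scratching. This is a hedged Python sketch that compares two lists you would fill in by hand from your own config; `unmatched_neighbors` is a hypothetical helper, not a FortiOS or FortiManager feature:

```python
def unmatched_neighbors(bgp_neighbors, sdwan_neighbors):
    """Return the neighbor IPs that appear on only one side."""
    bgp, sdwan = set(bgp_neighbors), set(sdwan_neighbors)
    return {
        "bgp_only": sorted(bgp - sdwan),
        "sdwan_only": sorted(sdwan - bgp),
    }


# Values from the Branch config in this post.
bgp = ["169.254.100.253", "169.254.200.253"]
sdwan = ["169.254.100.253", "169.254.200.253"]
print(unmatched_neighbors(bgp, sdwan))
```

Two empty lists mean every BGP neighbor has a matching SD-WAN neighbor entry and vice versa.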

Recap

Let’s take a second to recap all that we’ve configured. When our Branch/Spoke’s “HUB” SD-WAN SLA is healthy, the Branch advertises Community 65000:1 out VPN1 and 65000:2 out VPN2. The Hub then maps these Communities to Route-Tags when it stores the routes in the BGP database: Community 65000:1 maps to Route-Tag 1, 65000:2 to Route-Tag 2 and 65000:5 to Route-Tag 5. The Hub’s SD-WAN configuration then sends traffic for routes with Route-Tag 1 out VPN1 and traffic for routes with Route-Tag 2 out VPN2.

In the unfortunate event that our “HUB” SD-WAN SLA becomes unhealthy and fails, the Branch will advertise Community 65000:5 out the affected VPN1 or VPN2 tunnel. The Hub will then receive this Community and store the route in its local BGP database with Route-Tag 5, which neither of the Hub’s SD-WAN rules matches. Only when the Route-Tag is 1 or 2 (meaning the Branch is in SLA) does the Hub pin traffic to the corresponding VPN tunnel when routing back to the Branch.

Verification / Troubleshooting

In my lab, my Branch SD-WAN Rule prefers VPN1, as long as the “HUB” SLA is healthy:

And here is what the Hub’s BGP database looks like when the Branch/Spoke is in SLA for routes received from VPN1 (169.254.100.x) and from VPN2 (169.254.200.x):

You can see neighbor 169.254.100.2 (Branch on VPN1) has Community 65000:1 and Route-Tag 1. Likewise for neighbor 169.254.200.7 on VPN2. We also examine SD-WAN on the Hub and see that when we match Route-Tag 1, we route traffic out VPN1 (and likewise for 2):

In my lab, I intentionally made my “HUB” SD-WAN SLA fail by tweaking WANem as my “Internet Backbone” and introducing 500ms latency across the Hub’s VPN1 tunnel. Here is what the Hub’s BGP database looks like when the Branch/Spoke VPN1 tunnel is out of SLA for routes received:

And when we examine the Hub’s SD-WAN rules, we see that it is not setting the Route-Tag on updates back to the Branch:

But equally important, we see “service disabled caused by no destination”, meaning we’re not using VPN1 in our SD-WAN rules on the Hub:

The “Last Used” column reflects this: it shows that VPN1 was last used last week, which is when I introduced the 500ms latency on that link.

Conclusion

Using SD-WAN SLAs and Route-Tags, we pin traffic (both inbound and outbound) to a VPN tunnel while we measure the health of that tunnel. I wish I could claim credit for this design, which feeds Route-Tags from BGP into SD-WAN rules, but I work with a lot of brilliant engineers and this is a common design that they’ve shared with me. It also lays the groundwork to do more creative things with the Out-of-SLA Route-Tag (5 in my lab) in your Hub’s SD-WAN rules.

If you’d like to read further, these two FortiDocs pages have great information:

Thanks for reading and please leave any questions or comments in the comments section below.

Andrew

3 thoughts on “FortiGate BGP and SD-WAN”

  1. How do you configure connections that are initiated from the HUB to use VPN1 as the primary and VPN2 as the secondary? When I try this config it seems to pick and choose which one it wants to use.

    1. David, you’re right that the Hub would use BGP to choose. Routes from both VPN tunnels have a distance of 200, I believe, so it would use equal-cost multipathing. If you wanted to prefer one VPN tunnel over the other, you could set the weight higher on the preferred neighbor (i.e. VPN1 in this case). The downside is that it will prefer that VPN tunnel rather than pinning traffic to the same tunnel it was received on. When I increased the weight of VPN1 on the Hub, all the Branches preferred it as long as the SLA was healthy, regardless of my Route-Tag.

      In my lab, I used the following config to give VPN1 a higher weight (default is 100):
      config router bgp
          config neighbor-group
              edit "VPN1"
                  set weight 150
              next
          end
      end

      Then I confirmed the weights were correct for routes learned via VPN1 (after an “exec router restart”):
      Hub1 # get router info bgp network
      VRF 0 BGP table version is 1, local router ID is 192.168.250.1
      Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
                    S Stale
      Origin codes: i - IGP, e - EGP, ? - incomplete

         Network            Next Hop         Metric LocPrf Weight RouteTag Path
      *> 10.0.0.0/24        0.0.0.0                    100  32768        0 i <-/1>
      * i172.16.100.0/24    169.254.200.7         0    100      0        2 i <-/->
      *>i                   169.254.100.2         0    100    150        1 i <-/1>

      And that the routes stored in the forwarding information base (FIB) were the ones via VPN1:
      Hub1 # get router info routing-table all
      ……
      B 172.16.100.0/24 [200/0] via 169.254.100.2 (recursive is directly connected, VPN1), 00:00:08, [1/0]
      B 172.16.101.0/24 [200/0] via 169.254.100.9 (recursive is directly connected, VPN1), 00:00:08, [1/0]
      B 172.16.102.0/24 [200/0] via 169.254.100.8 (recursive is directly connected, VPN1), 00:00:08, [1/0]
      ……

      The bottom of this document shows the BGP path selection process and what you can tweak to prefer one path over another: https://docs2.fortinet.com/document/fortigate/7.2.4/administration-guide/55339/troubleshooting-bgp

      In this case, it’s a design decision as to whether you want return traffic pinned to the same VPN tunnel vs. forcing it over the “preferred” VPN tunnel, but either is doable. Great question and please let me know if I can help further! Andrew

      1. Could this be remedied by having a like for like config on both sides? Would this force both sides to prefer VPN1 based on received SLAs?
