In networks that use BGP as part of their routing protocols it is very important to understand how the BGP route selection works. BGP is an important part of the JNCIE exams so this information is also very useful for candidates preparing for any of the practical exams.
BGP route selection can be broken down into discrete steps, which makes it easy to understand how you can influence the route selection with the appropriate attributes. So let's have a look at the algorithm used in Junos OS in a somewhat simplified form. For all detailed steps I refer to http://www.juniper.net/techpubs/en_US/junos13.1/topics/reference/general/routing-ptotocols-address-representation.html
Before the actual BGP route selection can start, the router needs to make sure that the route is valid, so it checks for Martian routes, AS loops and next-hop reachability. The actual route selection steps are:
- The first route selection is done based on the lowest route preference. By default this step doesn't result in a selected route, because the route preference is 170 for all incoming BGP routes. The route preference can, however, be changed for specific neighbors or specific routes.
- The second step in the route selection is commonly viewed as the first step as this looks at the highest local preference value. The default value for all learned BGP routes is 100, so normally this doesn't result in a route selection with default settings.
- The third step in the route selection looks for the route with the shortest AS path. This is normally the first step at which route selection can take place with default settings.
- The fourth step will look for the lowest origin value of the routes. Lower origin values are preferred, so IGP (I) is preferred over EGP (E), and EGP is preferred over incomplete (?).
- The fifth step will look for the lowest multiple exit discriminator (MED) value of the routes. By default MED values are only compared when routes are learned from the same AS. This behavior can be changed using the always-compare-med command, which allows MED values to be compared even if the routes were learned from different ASs. Also very important to remember is that routes without a MED value are treated as if they have a MED value of 0, which is the lowest and therefore always the most preferred value.
- The sixth step compares the route type (Internal BGP / External BGP). External BGP learned routes are preferred over internal BGP learned routes at this point. This behavior is commonly referred to as hot potato routing. Keep in mind that after this step only routes of the same type are compared, so either both routes are internal or both routes are external.
- The seventh step selects the route with the lowest IGP cost towards the BGP next-hop of the route.
- The eighth step uses a selection criterion that differs for internal routes and external routes. For internal routes it prefers the route with the lowest router-id. For external routes it selects the oldest active route. The behavior for external routes can be changed to also prefer the lowest router-id by using the path-selection external-router-id command.
- The ninth step is only used if route reflection is used in the internal BGP network. The route with the shortest cluster list is preferred.
- The tenth step is the last option for BGP to make a route selection and it is needed in case multiple peerings are used between the same routers. As the router-id will be the same for both sessions, the lowest peer IP address is chosen as the final tie-breaker.
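The ten steps above can be sketched as a single sort key in Python. This is only an illustration of the ordering, not how Junos OS implements it: the Route class and its field names are hypothetical, MED is compared between all routes (as if always-compare-med were set), step 8 always uses the router-id (as if path-selection external-router-id were set), and the validity checks are assumed to have passed already.

```python
from dataclasses import dataclass
from ipaddress import ip_address

# Origin codes ranked so that a lower number is preferred: IGP < EGP < incomplete.
ORIGIN_RANK = {"I": 0, "E": 1, "?": 2}


@dataclass
class Route:
    """Hypothetical, simplified view of one BGP path (not a Junos structure)."""
    peer: str                 # peer IP address of the session
    router_id: str            # router-id of the advertising peer
    preference: int = 170     # default route preference for BGP
    local_pref: int = 100     # default local preference
    as_path: tuple = ()       # AS numbers in the AS path
    origin: str = "I"         # "I", "E" or "?"
    med: int = 0              # a missing MED is treated as 0
    external: bool = True     # True if learned over an E-BGP session
    igp_cost: int = 0         # IGP cost towards the BGP next-hop
    cluster_list: tuple = ()  # empty when no route reflection is involved


def selection_key(route):
    """Return a tuple; the route with the smallest tuple wins."""
    return (
        route.preference,                  # 1. lowest route preference
        -route.local_pref,                 # 2. highest local preference
        len(route.as_path),                # 3. shortest AS path
        ORIGIN_RANK[route.origin],         # 4. lowest origin
        route.med,                         # 5. lowest MED (simplified)
        0 if route.external else 1,        # 6. external before internal
        route.igp_cost,                    # 7. lowest IGP cost to next-hop
        int(ip_address(route.router_id)),  # 8. lowest router-id (simplified)
        len(route.cluster_list),           # 9. shortest cluster list
        int(ip_address(route.peer)),       # 10. lowest peer IP address
    )


def best_route(routes):
    """Pick the active route from a list of valid candidate routes."""
    return min(routes, key=selection_key)


# Two candidates that tie until step 3, where the shorter AS path wins.
r_short = Route(peer="10.0.0.1", router_id="192.168.1.1", as_path=(65001,))
r_long = Route(peer="10.0.0.2", router_id="192.168.1.2",
               as_path=(65002, 65003, 65004))
print(best_route([r_short, r_long]).peer)
```

Because Python compares tuples element by element, a later field is only consulted when all earlier fields tie, which mirrors the step-by-step elimination described above.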
The route selection result and the criteria used in the selection can be seen in the output of the show route detail or show route extensive commands. This can be useful if it is not immediately obvious why a certain route was selected or not selected.
Below we will look at the different results of the BGP route selection process in the route table. Each step will be illustrated by the route output. The output is from router R1, which has 2 sessions that advertise the same route (20.20.20/24) to it with different attribute values, so that all selection steps can be shown.
1. Selection based on lowest route preference
The output below shows the active 20.20.20/24 route having a non-default route preference of 169. The other learned route to 20.20.20/24 has a default BGP route preference of 170. Junos OS is verbose in the output showing us why the second route has lost in the route selection. The inactive reason given is Route Preference.
2. Selection based on highest local preference
The output below shows the active route that has been selected because of its non-default local preference of 500, which is higher than 100 for the second route that is inactive. The inactive reason shows Local Preference.
3. Selection based on shortest AS path
The active route is chosen because its AS path value shows only 1 AS. The other route has 3 AS hops in its AS path. The inactive reason shows AS path.
4. Selection based on lowest origin
The active route is chosen based on its lower origin setting of I, compared to the second route which has an origin of ?. The origin value is shown after the AS path information. The inactive reason given is Origin.
5. Selection based on lowest multiple exit discriminator (MED)
The active route has a MED value of 10. This is shown as metric in the output. The second route has a MED of 100 and this results in the inactive reason showing Not Best in its group - Route Metric or MED comparison.
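The grouping behavior of the MED step can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the Junos OS implementation: routes are plain dictionaries with the hypothetical keys name, neighbor_as and med, and the always_compare_med parameter stands in for the always-compare-med option.

```python
from collections import defaultdict


def effective_med(route):
    """A route without a MED value is treated as having a MED of 0."""
    med = route.get("med")
    return 0 if med is None else med


def med_survivors(routes, always_compare_med=False):
    """Return the routes that are still candidates after the MED step."""
    groups = defaultdict(list)
    for route in routes:
        # By default MED is only compared between routes learned from the
        # same neighboring AS; with always_compare_med all routes share
        # one group.
        key = None if always_compare_med else route["neighbor_as"]
        groups[key].append(route)

    survivors = []
    for group in groups.values():
        best = min(effective_med(r) for r in group)
        survivors.extend(r for r in group if effective_med(r) == best)
    return survivors


# Hypothetical candidates: two from AS 65001, two from AS 65002.
candidates = [
    {"name": "A", "neighbor_as": 65001, "med": 10},
    {"name": "B", "neighbor_as": 65001, "med": 100},
    {"name": "C", "neighbor_as": 65002, "med": 50},
    {"name": "D", "neighbor_as": 65002},  # no MED, treated as 0
]
print([r["name"] for r in med_survivors(candidates)])
print([r["name"] for r in med_survivors(candidates, always_compare_med=True)])
```

With the default setting one route per neighboring AS survives, which is why the inactive reason speaks of a route being not best "in its group".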
6. Selection based on route type (internal vs external BGP)
The active route is selected because the strictly external route is preferred over the external route learned through an internal neighbor. Have a look at the Local AS and Peer AS line, which shows whether the route was learned from an I-BGP neighbor or an E-BGP neighbor. The second route is inactive and the inactive reason shown is Interior > Exterior > Exterior via Interior.
So the order here is interior routes (local redistributed routes into BGP), then strictly external routes (directly learned from E-BGP peers), and lastly external routes learned through internal peers.
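This ordering can be written down as a small ranking in Python. The sketch below is an assumption-laden illustration: the RouteType names and the classify helper are hypothetical, the AS numbers are made up, and confederations are ignored. It classifies a BGP route by comparing the Local AS and Peer AS values, as described above.

```python
from enum import IntEnum


class RouteType(IntEnum):
    """Lower value is preferred at this step."""
    INTERIOR = 0               # not learned through a BGP session
    EXTERIOR = 1               # learned directly from an E-BGP peer
    EXTERIOR_VIA_INTERIOR = 2  # external route learned from an I-BGP peer


def classify(learned_via_bgp, local_as=None, peer_as=None):
    """Classify a route for the Interior > Exterior > Exterior via Interior check."""
    if not learned_via_bgp:
        return RouteType.INTERIOR
    if local_as != peer_as:
        return RouteType.EXTERIOR
    return RouteType.EXTERIOR_VIA_INTERIOR


# Hypothetical example: local AS 65000, one E-BGP peer and one I-BGP peer.
from_ebgp = classify(True, local_as=65000, peer_as=65001)
from_ibgp = classify(True, local_as=65000, peer_as=65000)
print(min(from_ebgp, from_ibgp).name)
```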
7. Selection based on lowest IGP cost towards the BGP next-hop
The active route is chosen based on the lowest IGP cost to the BGP next-hop, shown as protocol next-hop in the output. The IGP cost is shown as metric2 in the output. The active route has a metric2 of 5. The inactive route has a metric2 of 10. The inactive reason given is Not Best in its group - IGP metric.
8. Selection based on lowest router-id for I-BGP or oldest active route for E-BGP
This step is different for internal sessions compared to external sessions. Examples of both I-BGP and E-BGP are shown below.
The first output shows the behavior for internal sessions. For internal sessions the lowest router-id is used as the tie-breaker. The inactive route has a higher router-id than the active route. The inactive reason given is Not Best in its group - Router ID.
For external sessions the default behavior is to stay on the current active route for stability reasons. The inactive reason given in this scenario is Not Best in its group - Active preferred.
The default behavior for external sessions can be changed using the path-selection external-router-id command. When this option is configured the external session will use the router-id as the tie-breaker (just like it does for internal sessions). The inactive reason given is Not Best in its group - Router ID.
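The three cases of step 8 can be summarized in a short Python sketch. The function and parameter names are hypothetical and the router-ids are made-up values; the external_router_id parameter stands in for the path-selection external-router-id option.

```python
from ipaddress import ip_address


def step8_winner(active, challenger, external, external_router_id=False):
    """Decide step 8 between the current active route and a challenger.

    Both routes are dictionaries with a 'router_id' key (hypothetical layout).
    """
    if external and not external_router_id:
        # Default for external sessions: stay on the current active route.
        return active
    # Internal sessions, or external sessions with the option configured:
    # the lowest router-id wins. On a tie, min() keeps the active route.
    return min((active, challenger),
               key=lambda r: int(ip_address(r["router_id"])))


active = {"router_id": "192.168.1.2"}
challenger = {"router_id": "192.168.1.1"}
print(step8_winner(active, challenger, external=True)["router_id"])
print(step8_winner(active, challenger, external=True,
                   external_router_id=True)["router_id"])
```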
9. Selection based on shortest cluster list
Only in the case of I-BGP learned routes and the use of route reflection will point 9 become a potential selection tie-breaker. The shortest cluster list is preferred at this point. A missing cluster list is treated as a length of 0, so it is always preferred.
In the output below the active route has no cluster list whereas the inactive route has a cluster list length of 1 (192.168.2.1). The inactive reason given is Not Best in its group - Cluster list length.
10. Selection based on lowest peer IP address
The ultimate tie-breaker is needed when 2 routers peer 2 or more times with each other. This is typically done for load-balancing purposes. The router-id in step 8 can't be used as it will be the same for all the peering sessions.
The active route has the lowest peer IP address, shown as Source in the output. The inactive reason given is Not Best in its group - Update source.
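When reproducing this last tie-breaker in a script, for example to predict which session will win, the addresses should be compared numerically rather than as text. The Python sketch below assumes that "lowest" means the numerically lowest address and that all peers are in the same address family; the lowest_peer helper and the addresses are hypothetical.

```python
from ipaddress import ip_address


def lowest_peer(peers):
    """Return the numerically lowest peer address from a list of strings."""
    return min(peers, key=ip_address)


# Two hypothetical parallel sessions between the same pair of routers.
peers = ["10.0.0.10", "10.0.0.9"]
print(lowest_peer(peers))  # numeric comparison
print(min(peers))          # plain string comparison, which differs here
```

A plain string comparison goes character by character, so it ranks 10.0.0.10 before 10.0.0.9, which is not the order a numeric address comparison gives.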
Summary
This blog post covered the main steps of the BGP route selection algorithm in Junos OS. The steps have been illustrated by the route output for each of the individual selection options. The show route detail or show route extensive output makes it easy to figure out why a route is in the inactive state.
This information is useful for anybody using BGP in their production network and for future JNCIE candidates.
More information about BGP route selection and other BGP topics can be found in iNET ZERO’s JNCIE workbooks.
Let us know what other topics you would like to have covered in a blog post.