What is your OSPF neighbor doing? Adjacency problems in OSPF

January 30, 2014 2:42 pm

As a JNCIE you are expected to know how to troubleshoot misconfigurations in your given network and fix them. Troubleshooting IGP neighbor issues is easiest if you can compare the configurations of the routers involved. However, this is not always possible, as in some situations you cannot view the configuration of all routers. A real-life example is an MPLS L3 VPN with unmanaged CE devices where OSPF is used between PE and CE. This blog covers OSPF adjacency issues that might arise from misconfiguration of parameters between OSPF neighbors.

For troubleshooting routing adjacencies there are basically three tools within JUNOS: syslog, traceoptions, and monitor traffic interface. The show commands can also be of use, but in many cases they do not reveal the cause of the problem. The syslog output might be useful to detect problems with your OSPF neighbor, but typically it doesn’t specify what caused your neighbor to go down, as shown in the output below.

OSPF-Syslog-Down

The topology used to demonstrate the different issues that can arise between OSPF neighbors is very simple. Two routers are connected via Ethernet interfaces, and both routers have a loopback interface configured as well. For readers unfamiliar with JUNOS Olive the interface names might look a bit strange, but in Olive the emX interfaces behave like Ethernet interfaces. In the diagram below you can see the router under our control, JNCIE-R1, and the router that we cannot control, JNCIE-HIDDEN. In the exam, connectivity between both routers might need to be established without access to the configuration of the JNCIE-HIDDEN router.

test-size1

Let’s start with the basic configuration where both routers are in the FULL state as neighbors. Both routers have a very basic configuration with both interfaces (em and lo0) in the same area (0.0.0.0). Below we show the configuration of JNCIE-R1, including traceoptions settings useful for neighbor issues, and the baseline show ospf interface detail and show ospf neighbor detail output. All the output is from JUNOS 13.2R2.4, so a newer version than currently on the exam.

Basic-OSPF-config-traceoptions
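As the configuration screenshot is not available as text, here is a minimal sketch of what such a baseline could look like in set format. The file name OSPF-DEBUG and area 0.0.0.0 come from this post; the unit numbers (em1.0, lo0.0) and the choice of trace flags are assumptions and may differ from what the screenshot showed.

```
# Baseline OSPF: both interfaces in area 0.0.0.0 (unit 0 is assumed)
set protocols ospf area 0.0.0.0 interface em1.0
set protocols ospf area 0.0.0.0 interface lo0.0
# Traceoptions for neighbor problems (flag selection is an assumption)
set protocols ospf traceoptions file OSPF-DEBUG
set protocols ospf traceoptions flag error detail
set protocols ospf traceoptions flag hello detail
set protocols ospf traceoptions flag database-description detail
```

The error and hello flags are the ones most likely to matter for Hello-level mismatches, while database-description is mainly of interest for problems that appear later in the adjacency process, such as MTU.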

Basic-OSPF-Interface

Basic-OSPF-neighbor

The output above shows the default interface settings like timers (hello and dead), DR priority and so on. The neighbor state is FULL which means that the LSDB has been exchanged and routing between the routers should be fully functional.

There are quite a few parameters that must match between OSPF neighbors for them to become fully adjacent. Most of these parameters are normally configured under the [edit protocols ospf] hierarchy and are therefore easy to spot when you can compare the configurations of both routers involved. As this is not always possible, we need to troubleshoot in a different manner.

Before we go into the more interesting issues, we assume that you have checked the basic reasons for OSPF adjacency problems. So make sure that the correct interfaces are configured under [edit protocols ospf], that the interfaces are not passive, that the interfaces are not down, and that there is no firewall filter blocking OSPF (protocol 89) communications inbound or outbound.
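These basic checks could be run from operational mode roughly as follows. This is only a sketch: the interface name em1 is taken from the monitor traffic example later in this post, and you may prefer other commands for the same checks.

```
# Which interfaces are configured for OSPF on this router?
show configuration protocols ospf
# Is the interface running OSPF, and is it passive?
show ospf interface detail
# Is the interface up, and which address does it have?
show interfaces terse em1
# Are any firewall filters applied to the interface?
show interfaces filters em1
```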

Let’s start with a list of potential issues that could break or prevent a full OSPF adjacency between two routers. These issues arise from misconfiguration, migration of settings, or vendor interoperability issues (differences in vendor defaults).
1. Area mismatch
2. Timer mismatch
3. Authentication mismatch
4. Area type mismatch (stub vs normal)
5. Subnet mask mismatch
6. Duplicate router-id
7. MTU mismatch
8. DR priority = 0 on both routers
9. Duplicate interface ip address
10. Interface type mismatch

To investigate the above issues, different settings on one of the routers are configured to make the problem appear. The focus is on spotting the issue using the traceoptions and monitor traffic output. In some cases the show commands are also shown.

1. Area mismatch
An area mismatch can be caused by misconfiguration, bad documentation, or because it is unknown what the other side has configured. As OSPF neighbors are required to belong to the same area, routers that belong to different areas stay hidden from each other, so the show ospf neighbor output is empty. Below we can see the output of our traceoptions log file. It clearly indicates that there is an area mismatch, and also gives information about the two areas involved (0.0.0.0 and 0.0.0.1). Remember from the configuration shown earlier that the traceoptions file for OSPF is named OSPF-DEBUG. Using show log OSPF-DEBUG | last 10 makes it easy to focus on the issue. You can increase the number of lines if needed.

OSPF-DEBUG-AREA-MISMATCH

The same area mismatch information can be seen in the monitor traffic output, but it is not that obvious. Only by studying the output carefully can we see that the input packet is generated in area 0.0.0.1, whereas the output packet is generated in area 0.0.0.0. My preference for this problem is to use the traceoptions output, as it is much easier and faster to spot the issue. And remember, time is crucial in your exam!

OSPF-monitor-AREA-MISMATCH

In the command monitor traffic interface em1 size 1500 no-resolve detail matching “ip proto ospf”, the matching option restricts the capture to OSPF packets. As there might be other control traffic on the interface, this is an easy way to focus only on OSPF. Using “ip proto 89” in the matching option gives the same result.

2. Timer mismatch
Similar to the previous issue, OSPF neighbors must have exactly the same timers to even become neighbors. Timers can be changed to fine-tune convergence, and different interface types (p2p, p2mp, nbma, broadcast) might have different default timers, especially in multivendor networks. Again, the show ospf neighbor command does not show any neighbor if there is a mismatch. The output of the traceoptions log is again much easier and faster to interpret than the monitor traffic output.

OSPF-DEBUG-Timer-Mismatch

OSPF-monitor-Timer-MISMATCH
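Once the neighbor’s timers are known from the trace or capture, matching them on the router you control could look like the sketch below. The interface em1.0 is an assumption, and the values shown are simply the JUNOS defaults for LAN and P2P interfaces; substitute whatever the neighbor actually advertises.

```
# Configuration mode: align Hello and Dead intervals with the neighbor
# (10 and 40 are the JUNOS defaults; replace with the observed values)
set protocols ospf area 0.0.0.0 interface em1.0 hello-interval 10
set protocols ospf area 0.0.0.0 interface em1.0 dead-interval 40
```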

3. Authentication mismatch
This is a somewhat tricky issue in real life because MD5 authentication is typically used, and it is impossible to recover the key from the traceoptions or monitor traffic output. This means that for the exam the only viable scenario is a misconfigured simple-text password, as it can be learned from the monitor traffic output.

OSPF-DEBUG-AUTH-MISMATCH

OSPF-Monitor-Auth-MISMATCH

In this scenario it is advantageous to use both methods: the traceoptions log pinpoints the issue quickly, but it doesn’t give the plain-text password used by the other side. The monitor traffic output reveals that password, which is junos123 in this example.
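With the password known, the matching configuration on the local router might look like this. It is a sketch that assumes authentication is applied per interface on em1.0 in area 0.0.0.0, which the post does not state explicitly.

```
# Configuration mode: use the plain-text password seen in the capture
# (interface em1.0 and area 0.0.0.0 are assumptions)
set protocols ospf area 0.0.0.0 interface em1.0 authentication simple-password junos123
```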

4. Area type mismatch (stub vs normal)
Area type mismatches typically happen during migrations or because of inconsistent configuration management. In the Hello message the Options field signals the type of area the interface belongs to. If the two sides differ, for example stub versus normal, the neighbors do not show up. The traceoptions output is again more efficient for finding this type of mismatch.

OSPF-DEBUG-Area-type-MISMATCH

OSPF-Monitor-Area-type-MISMATCH

The monitor traffic output shows the difference, but some background knowledge is required to figure out what is going on. In the Options field one side is signalling External, LLS (=normal area) whereas the other side signals only LLS (=stub area). An NSSA area would signal NSSA, LLS in the Options field.
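Depending on what the neighbor signals in its Options field, the local area type can be adjusted along these lines. This sketch assumes the link sits in a non-backbone area, shown here as 0.0.0.1, because the backbone cannot be configured as a stub area; the area number is hypothetical.

```
# Configuration mode, pick the one that matches the neighbor.
# Neighbor signals only LLS (stub area):
set protocols ospf area 0.0.0.1 stub
# Neighbor signals External, LLS (normal area): remove the area type
delete protocols ospf area 0.0.0.1 stub
# Neighbor signals NSSA, LLS:
set protocols ospf area 0.0.0.1 nssa
```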

5. Subnet mask mismatch
The most common cause of subnet mask mismatches is misconfiguration, possibly with bad documentation playing a supporting role. With JUNOS it doesn’t seem to matter whether the OSPF interface-type is P2P or not. Again, the traceoptions output delivers a fast resolution of this issue.

OSPF-DEBUG-SUBNET-MISMATCH
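Correcting the mask on the local side is then a matter of replacing the interface address. The addresses below are purely hypothetical documentation-range values, since the post does not give the addressing of the link.

```
# Configuration mode: replace the local address with one using the
# neighbor's mask (192.0.2.1 and both prefix lengths are hypothetical)
delete interfaces em1 unit 0 family inet address 192.0.2.1/30
set interfaces em1 unit 0 family inet address 192.0.2.1/24
```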

6. Duplicate router-id
Duplicate router-ids are really bad news in an OSPF network, as they create permanent instability in the OSPF domain: the routers involved will continuously send updated LSAs to reestablish their unique identity in the network. If the routers with the duplicate router-id are direct neighbors, they will not form an adjacency with each other. This is actually somewhat good news, as this neighbor problem identifies the duplicate router-id very easily. Reasons for duplicate router-ids are misconfiguration and bad documentation. Pay attention to the primary address on your loopback interface if you are implementing Anycast-RP with MSDP, so that the shared anycast address is not picked as the router-id. You can also avoid this by hardcoding the router-id in the [edit routing-options] hierarchy.

The traceoptions output shows the problem very clearly. The monitor traffic output also makes it obvious that in this example both routers use router-id 6.6.6.6 as their identifier.

OSPF-DEBUG-ROUTERID

OSPF-MONITOR-ROUTERID
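Hardcoding the router-id, as suggested above, is a one-line change. The value 1.1.1.1 is a hypothetical choice for JNCIE-R1; use whatever is unique in your OSPF domain.

```
# Configuration mode: pin the router-id explicitly
# (1.1.1.1 is a hypothetical value, not taken from the screenshots)
set routing-options router-id 1.1.1.1
```

Keep in mind that changing the router-id resets the existing OSPF adjacencies on that router.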

7. MTU mismatch
A parameter that is not configured under the [edit protocols ospf] hierarchy, but that is important for OSPF neighbor establishment, is the interface MTU size. If there is a difference in the MTU size, the OSPF neighbors typically get stuck in the Exstart phase. Differences in MTU size are caused by misconfiguration, migrations, and vendor interoperability issues.

OSPF-SHOW-MTU-NEIGHBOR

As expected by now the output of the traceoptions log file is the best way to spot this problem.

OSPF-DEBUG-MTU

However, it doesn’t show us the MTU size of the other side. For that the monitor traffic output is needed. It provides information about the MTUs advertised in the Database Description packets.

OSPF-Monitor-MTU
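After reading the neighbor’s MTU from its Database Description packets, the local IP MTU can be aligned with it. In this sketch the value 1400 is hypothetical, and whether you adjust the family MTU or the physical interface MTU depends on how the mismatch was introduced.

```
# Configuration mode: match the IP MTU the neighbor advertises
# (1400 is a hypothetical value; use the one seen in the capture)
set interfaces em1 unit 0 family inet mtu 1400
```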

8. DR priority = 0 on both routers
A much less likely problem is that all routers on a multi-access segment (typically Ethernet) have their Designated Router election priority set to 0. If all routers are unelectable, then the segment cannot function properly if the interface-type requires a DR. To me this is more of a trick scenario than a real-world issue, but I added it for completeness.

This issue is harder to troubleshoot as the traceoptions output doesn’t show any problem as shown below.

OSPF-DEBUG-DR-0

Only by looking carefully at the monitor traffic output can you spot that both sides have their Priority set to 0. The show ospf neighbor output also shows the neighbors staying in the 2-way state forever.

OSPF-MONITOR-DR-0

OSPF-SHOW-DR-0
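Since the other side cannot be changed, the fix is to make the local router electable again, for example as sketched here. The interface em1.0 is an assumption; 128 is the JUNOS default priority.

```
# Configuration mode: make JNCIE-R1 eligible for DR election again
# (128 is the JUNOS default; any non-zero value makes it electable)
set protocols ospf area 0.0.0.0 interface em1.0 priority 128
```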

9. Duplicate interface ip address
An easy-to-notice issue is two neighbors having the same IP address on the interface over which they are trying to establish an OSPF adjacency. Checking the proper documentation and pinging the local and remote side addresses should already highlight this misconfiguration. If you have to troubleshoot it from an OSPF standpoint, you can use the general tools such as show commands, monitor traffic, and traceoptions output.

The issue symptom can be easily seen from the show ospf neighbor detail command. As shown below it results in the session getting stuck in the Exstart phase.

DupIP-sh-ospf-neig-detail

When looking at the output of the show ospf interface detail command you can see that there is something weird going on, as both the DR and BDR addresses are the same, which is not normal.

DupIP-sh-ospf-int-detail

The output of the monitor traffic command shows the problem very clearly, as the source addresses of the incoming and outgoing OSPF hello packets are the same.

DupIP-monitor-traffic

The traceoptions log is the least clear way to troubleshoot this issue if you don’t have the experience to know what the message below indicates. In the output below, IFL 0 is the IFL number of the interface with the OSPF issue.

DupIP-log-debug

10. Interface type mismatch
This last issue is maybe the sneakiest of them all, as the neighbor does come up, at least on one side, but it causes SPF calculation problems that result in missing routes. The core of this problem is that OSPF supports different interface-types. These can be divided into interface-types that require a DR (LAN, NBMA) and interface-types that do not use a DR (P2P, P2MP). The two kinds cannot be mixed on the same physical link, yet that is exactly what happens here: one side thinks the Ethernet is an OSPF LAN, and the other side thinks it is an OSPF P2P link.

Reasons for this problem to appear in real life are misconfiguration, migration, vendor interoperability, and the use of different WAN interface types on Frame-Relay and ATM networks (P2P, NBMA). Nowadays it is quite common to optimize OSPF over Ethernet point-to-point links (only two routers) by setting the OSPF interface-type to P2P. This example shows that this needs to be done consistently, as otherwise parts of the network might become unreachable.

If the timers are equal, which they are by default in JUNOS (for LAN/P2P the Hello = 10s and Dead = 40s), the neighbors will establish a relationship (at least on one side), but during SPF calculations the difference in how the interface is represented results in routes not being installed in the routing table.

In our example you can see below that the neighbor is up from the perspective of JNCIE-R1 (LAN type). If you were able to look at the other side, JNCIE-HIDDEN (P2P type), it would become more obvious that there is a problem, as it doesn’t see a neighbor.

OSPF-SHOW-INT-INTF-TYPE

Looking at the routing table and the OSPF database on JNCIE-R1 it becomes clear that the 2.2.2.2/32 route is visible in the OSPF database but it is not installed in the routing table.

OSPF-ROUTE-DB-INTF-TYPE
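In this example the fix would be to make JNCIE-R1 treat the link as point-to-point as well, and then verify that the route is installed. The sketch assumes interface em1.0 in area 0.0.0.0.

```
# Configuration mode: match a neighbor that runs the link as P2P
set protocols ospf area 0.0.0.0 interface em1.0 interface-type p2p

# Operational mode, after commit: the route should now be installed
show route 2.2.2.2/32
show ospf neighbor detail
```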

Summary
This blog post covered troubleshooting multiple issues that can arise between OSPF neighbors, especially if you do not control both sides of the link. The main troubleshooting tools used are traceoptions, monitor traffic, and sometimes the show ospf commands. By knowing how to use these tools effectively in your exam you should be able to resolve any OSPF neighbor issue. This applies to misconfigurations inserted into the exam by design, or by yourself during the exam.

More information about OSPF topics and advanced troubleshooting scenarios can be found in iNET ZERO’s JNCIE workbooks.

Let us know what other OSPF neighbor problems you have encountered, or what other OSPF problems you would like to have covered in a blog post.


  • One thing I’ve seen is that JNCIE lab exams do not go too much into details of N1/2 or E1/2 OSPF routes’ metrics (to use Cisco terminology) and their forwarding addresses. Maybe something from here in the future 🙂

  • Joe Zhu says:

    good article.
