Troubleshooting techniques are vital to understand for anyone working toward the Cisco Certified Network Professional (CCNP) and other higher-level certifications. Sean Wilkins, an accomplished networking consultant for SR-W Consulting, wrote an article on Cisco troubleshooting techniques and procedures. It is reproduced below for readers who want more detailed CCNP TSHOOT information.
Part 1: Cisco Troubleshooting Techniques
As most experienced network engineers know, people use a number of different methods to troubleshoot problems on a network (or on systems in general). Determining which one is “better” is very subjective and can end up being a bit like having a political conversation with other engineers. This article looks at a number of the common troubleshooting techniques, which are vital to understand for candidates looking to obtain the Cisco Certified Network Professional (CCNP) and other higher-level certifications. The CCNP TSHOOT exam is one of the exams required to achieve the CCNP, and it requires a knowledge base that includes the concepts discussed in this article.
Top-Down
The Top-Down approach takes advantage of the hierarchy of the Open Systems Interconnection (OSI) model. As most network engineers are drilled in the structure of both the OSI and TCP/IP models, basing a troubleshooting model on them makes sense and tends to feel very “natural” to most trained engineers. The Top-Down model, as the name indicates, looks first at the application layer (OSI model) and then works down the layers until a problem is found. This model tends to be used when troubleshooting apparent application problems on specific computers. An example of the Top-Down approach would be to look first at the application in use when the trouble happens and determine whether it is causing the reported problem. If it is not, continue to work down the layers until the physical connection is verified.
Bottom-Up
The Bottom-Up approach uses the same OSI model as a basis but looks at the physical layer first. Obviously, using this model requires some amount of physical access (for example, to see whether a cable is plugged in or connected correctly). Sometimes this is possible, sometimes not. The drawback of starting with this model appears when the problem turns out to exist at the application layer, as a lot of time will have been spent working through all of the other layers first. A simple example of this approach would be to verify first that the cabling into a device is connected, then move to the data link layer, and so on. If no problem is found, the last step is to look at the specific application in use to determine whether it is the problem.
Divide and Conquer
Divide and Conquer is a very popular starting technique when troubleshooting network problems. Instead of starting at the top or the bottom of the OSI model, the Divide and Conquer model starts in the middle and works in the direction of the problem. For example, by attempting a ping or traceroute from a device, an engineer can determine whether to troubleshoot down from the network layer or up through the transport layer. The Divide and Conquer method is one of the most commonly taught troubleshooting methods, mainly because it avoids the weakness that the Top-Down and Bottom-Up approaches share: starting without knowing which end of the OSI model the problem is on. By starting in the middle of the OSI model, there are fewer layers to work through, and the problem is typically found faster than with the Top-Down or Bottom-Up methods.
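As a rough illustration of how the three OSI-based approaches differ, the following Python sketch returns the order in which layers would be examined. The function name layer_check_order, the approach labels, and the ping_succeeded flag are assumptions made for this example and do not come from Cisco. It also assumes that a successful ping means the network layer and everything below it are working along that path.

```python
# Illustrative sketch only; the names and labels here are hypothetical.
OSI_LAYERS = [
    "physical", "data link", "network", "transport",
    "session", "presentation", "application",
]

def layer_check_order(approach, ping_succeeded=None):
    """Return the OSI layers in the order the given approach examines them."""
    if approach == "top-down":
        return OSI_LAYERS[::-1]          # application first
    if approach == "bottom-up":
        return OSI_LAYERS[:]             # physical first
    if approach == "divide-and-conquer":
        if ping_succeeded is None:
            raise ValueError("divide-and-conquer needs a ping or traceroute result")
        split = OSI_LAYERS.index("transport")
        if ping_succeeded:
            # Layers 1-3 are assumed healthy, so work up from the transport layer.
            return OSI_LAYERS[split:]
        # The ping failed, so work down from the network layer.
        return OSI_LAYERS[:split][::-1]
    raise ValueError(f"unknown approach: {approach}")

print(layer_check_order("divide-and-conquer", ping_succeeded=False))
# prints ['network', 'data link', 'physical']
```

With a failed ping, only three layers remain to be checked instead of seven, which is the time saving the technique is taught for.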
Follow the Path
The Follow the Path technique is used to locate a problem by following the path that the traffic takes through the network. To start, tools like traceroute are used to determine the path being taken through the network. If the traceroute is unable to complete, the problem may exist at the point where it stops. Sometimes the point of the troubleshooting is instead to determine whether the traffic is taking the “correct” path through the network. This is easily determined with a traceroute as well: wherever the traffic “steps” off the “correct” path is where to continue troubleshooting.
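The comparison between the “correct” path and what traceroute reports can be sketched in a few lines of Python. This is a minimal sketch, assuming the hop addresses have already been collected from traceroute output into a list. The function name first_divergence and the addresses (taken from the documentation ranges) are hypothetical.

```python
from itertools import zip_longest

def first_divergence(expected_path, observed_hops):
    """Return (hop_index, expected, observed) for the first differing hop.

    Returns None when the observed hops match the expected path exactly.
    A missing hop (the trace stopped early, or the path is longer than
    expected) shows up as None on the shorter side.
    """
    pairs = zip_longest(expected_path, observed_hops)
    for index, (expected, observed) in enumerate(pairs):
        if expected != observed:
            return index, expected, observed
    return None

# Hypothetical hop lists for illustration.
expected = ["192.0.2.1", "198.51.100.1", "203.0.113.10"]
observed = ["192.0.2.1", "198.51.100.9", "203.0.113.10"]

print(first_divergence(expected, observed))
# prints (1, '198.51.100.1', '198.51.100.9')
```

The returned index points at the hop where troubleshooting would continue, whether the traffic left the expected path there or the trace simply stopped.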
Spot the Differences
The Spot the Differences technique is used when there is something to compare against. For example, when troubleshooting access router configurations that are similar, an engineer can compare the configurations to find a missing or extra command. To use this technique correctly, the engineer must be knowledgeable enough to know what is supposed to be different and what is supposed to be the same.
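A configuration comparison of this kind can be sketched as follows. This is a simplified illustration, not a production diff tool: it treats each configuration as a flat list of commands and ignores which interface or section a command belongs to. The function name compare_configs, the ignore_prefixes parameter, and the sample configurations are assumptions made for this example. The prefixes stand in for the engineer's knowledge of what is supposed to differ between devices.

```python
def compare_configs(reference, candidate, ignore_prefixes=()):
    """Return (missing, extra) commands in candidate relative to reference.

    ignore_prefixes is a tuple of command prefixes that are expected to
    differ between devices (for example hostnames and addresses).
    """
    prefixes = tuple(ignore_prefixes)

    def commands(text):
        stripped = (line.strip() for line in text.splitlines())
        return [c for c in stripped if c and not c.startswith(prefixes)]

    ref, cand = commands(reference), commands(candidate)
    missing = [c for c in ref if c not in cand]
    extra = [c for c in cand if c not in ref]
    return missing, extra

# Hypothetical configurations for two similar branch routers.
branch_a = """
hostname branch-a
service password-encryption
interface GigabitEthernet0/0
 ip address 192.0.2.1 255.255.255.0
"""
branch_b = """
hostname branch-b
interface GigabitEthernet0/0
 ip address 198.51.100.1 255.255.255.0
ip http server
"""

missing, extra = compare_configs(branch_a, branch_b, ("hostname", "ip address"))
print(missing)  # prints ['service password-encryption']
print(extra)    # prints ['ip http server']
```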
Move the Problem
The last troubleshooting technique covered in this article is the Move the Problem technique. The basic principle here is that if a component is moved and the problem moves with it, then the problem exists with the component. If the problem does not move, then the component is probably not the cause. As with all of the techniques discussed, this one applies only in specific situations: it is not always possible to move a component to test it, which is one of its main limiting factors. A simple example would be two branch offices that use the same router model and the same (or very similar) configuration. If the devices were swapped and the problem moved from one office to the other, the problem would probably exist with the router now in the office where the problem appeared.
Obviously there are a number of different techniques that can be used when troubleshooting, and which one to use depends greatly on the specific situation and what is being troubleshot. Each method has its advantages and disadvantages, so none of them is the perfect technique in general, only in specific situations. What really comes with experience is knowing which one to use in which situation, to limit the time spent troubleshooting and to resolve a problem quickly. Hopefully you find this advice useful whether or not you’re looking to take the CCNP TSHOOT exam.
Part 2: Cisco Troubleshooting Procedures
The overall process of troubleshooting is very subjective, as is the choice of troubleshooting techniques (covered in Part 1) used for specific problems. This article looks at the basic troubleshooting process steps as laid out by Cisco. These procedures are vital to understand for candidates looking to obtain the Cisco Certified Network Professional (CCNP) and other higher-level certifications. The CCNP TSHOOT exam is one of the exams required to achieve the CCNP, and it requires a knowledge base that includes the concepts discussed in this article. It is also important to note that these process steps are given in a specific order but can be used and reused in a number of different orders, depending on the experience of the engineer.
Defining a Problem
A common issue for troubleshooters is the lack of a clear definition of the problem being reported; a common report is “My **** is not working”. While this gives a basic idea of what to look at, it does not really give an engineer a good idea of where to start. It is sort of like “my car doesn’t work”. During this step in the process an engineer must define the problem being reported. This includes talking to the reporting party and, ideally, observing the exact problem being reported. The more specific the definition of the problem, the easier it is to narrow down and fix.
Gathering Information
Once a proper definition of a problem exists, the specific devices to gather information from can be determined. Knowing what information to gather really comes with experience, but in general it is better to have too much information than too little. Examples would be gathering event logs and status information and verifying current operations information from each affected device and from the devices along the affected path. Once all of the relevant information is obtained, go to the next step.
Analyzing the Information
The gather and analyze steps are really two sides of the same step but are presented as separate steps in the process. Once an engineer has gathered all of the information that is relevant to the problem, it must be analyzed and formatted. No specific format is prescribed; it depends on how the engineer best reviews information. Once the information is formatted to the engineer's liking, it can be reviewed more easily and the problem can be located faster. An example would be taking the information gathered from the event logs, status information and operations information and assessing it for observed potential problems.
Eliminating Possible Problem Causes
To find the cause of a problem, it is often necessary to eliminate what is not the cause. This step builds on the information formatted in the previous step and attempts to isolate the potential causes of the problem and eliminate those that are not. An example would be observing that the configuration on a device was not altered around the time the trouble occurred; this eliminates a configuration change as a potential cause. Once the potential causes of the problem have been isolated, move to the next step in the process.
Formulating a Hypothesis about the Likely Cause of the Problem
Once the potential causes of the problem have been isolated, a hypothesis can be derived about how the problem occurred and how to fix it. This step in the process includes formulating the hypothesis and mapping out a procedure for testing it. An example would be observing that a circuit between major offices went down when the trouble occurred, and from this hypothesizing that the circuit trouble caused the problem being reported.
Testing the Hypothesis
During this step the engineer takes the procedure (or procedures) laid out in the previous step and tests whether the hypothesis was correct and, if not, why. At times it is necessary to take the information from this step and return to the previous step (especially when the hypothesis was incorrect). An example would be to bring down the circuit between offices (during designated testing and change times) and see whether the problem recurs.
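The eliminate, hypothesize and test steps form a loop, which can be sketched in Python. This is only a schematic view under stated assumptions: the function name find_cause and the cause labels are hypothetical, and test_hypothesis stands in for whatever real test procedure the engineer mapped out (here it is just a callable that returns True when a hypothesis is confirmed).

```python
def find_cause(candidate_causes, ruled_out, test_hypothesis):
    """Test the remaining candidate causes in order.

    Returns (confirmed_cause, tested), where confirmed_cause is None when
    no hypothesis was confirmed and tested lists every hypothesis tried.
    """
    # Eliminate causes that the gathered information has already ruled out.
    remaining = [c for c in candidate_causes if c not in ruled_out]
    tested = []
    for hypothesis in remaining:
        tested.append(hypothesis)
        if test_hypothesis(hypothesis):
            return hypothesis, tested
        # Hypothesis was wrong: go back and formulate the next one.
    return None, tested

# Hypothetical example based on the circuit scenario in this article.
candidates = ["configuration change", "circuit outage", "hardware failure"]
ruled_out = {"configuration change"}  # the configuration was not altered

cause, tested = find_cause(candidates, ruled_out, lambda c: c == "circuit outage")
print(cause)   # prints circuit outage
print(tested)  # prints ['circuit outage']
```

Eliminating the configuration change up front means it is never tested at all, which is the practical benefit of the elimination step.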
Solving the Problem
As the most obviously named step, this one takes the information from the testing step and implements the fix (or fixes) proven during testing. An example would be ensuring that the circuit does not go down, and pursuing a method to ensure that the problem does not recur even if the circuit does go down again (e.g. offline file support).
As even Cisco admits, the specific process steps to fix a problem can depend greatly on the engineer doing the troubleshooting and on the specific problem. This set of steps is laid out as a backbone that can be followed to achieve successful results; the streamlining of the process comes with experience.