Basic Troubleshooting Techniques
- William Estep
- Programming
- May 27, 2008
Troubleshooting is tracking down pesky bugs, rooting them out of the code, and squashing them. Having some basic troubleshooting skills can greatly enhance your bug fighting. The goal of troubleshooting is to quickly identify the root cause of the problem. Don’t confuse troubleshooting with problem solving, however. Problem solving is answering the question, “Can this be done?” or “How can we do this?” Troubleshooting, on the other hand, is answering the question, “Why isn’t this working?”
In this article, I will cover the basic troubleshooting steps I use, discuss tools that aid in troubleshooting, and identify what to avoid. These basic steps are platform and environment independent. They apply equally to developing a simple database-driven website, an eCommerce solution, or a mid-range finance application in RPG.
I learned basic troubleshooting techniques in Electronics Technician ‘A’ school while serving in the Navy. I have found the same basic approach applies to software development. The fundamental skills are easy to learn and will greatly reduce the amount of stress pesky coding errors can cause.
Basic Troubleshooting Steps
The first step is recognizing the symptoms. From my days in the Navy, I remembered the first step as “Check the bulkhead power supply on.” While very appropriate in the Navy, the wording may not seem relevant when troubleshooting a piece of software. The idea carries over, though: look at what is actually happening before touching anything. Be specific. Saying, “The Add button doesn’t work,” may be correct, but it does not clearly identify the symptom. What happens when the button is clicked: nothing, an error message, or the wrong result? As I will talk about in the “Things to Avoid” section, it is important not to jump on the first symptom and start troubleshooting. Step back, take a breath, and look at what the program is saying. More often than not, jumping on the first symptom leads to chasing non-existent problems, because the root cause hasn’t been identified. Exercise the system and make sure you see the same errors the user reported, and see what other clues can be easily uncovered.
If the application is not running at all, I am probably looking for an environmental failure. Remember the bulkhead power supply? Is the correct Perl path set? Or for a Java application, is a valid JVM available?
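As a rough illustration of that kind of environment check, here is a minimal Python sketch. The tool names passed in are only examples taken from the questions above, and check_environment is a hypothetical helper name, not part of any library. It uses shutil.which from the standard library to see whether each required program can be found on the PATH.

```python
import shutil
import sys


def check_environment(required_tools):
    """Return a dict mapping each tool name to its resolved path, or None if missing."""
    return {tool: shutil.which(tool) for tool in required_tools}


if __name__ == "__main__":
    # Hypothetical tool list; replace it with what the application actually needs.
    results = check_environment(["perl", "java"])
    for tool, path in results.items():
        print(f"{tool}: {path if path else 'NOT FOUND on PATH'}")
    print(f"Python interpreter in use: {sys.executable}")
```

A check like this only answers "is it installed and reachable?" It does not verify versions or configuration, which may need their own checks.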
Are there multiple symptoms? If a user reported the bug, can I repeat the errors the user is seeing? Is the error unique to the user’s machine?
Now that I have a good picture of the problem, and a clear list of symptoms, I can identify the faulty areas of the application. If the program I am troubleshooting is large enough to have various use cases, limiting the search based on the symptoms reported will greatly improve the troubleshooting time. For example, did the error occur when trying to add an item to the shopping cart? If so, it is safe to assume the bug is in the “Add Item to Shopping Cart” use case. This may seem obvious, but tracking down a more complex issue, for example, “The Accounts Receivable balance is off by X dollars at month end”, requires a systematic approach, and listing possible faulty modules will identify the places to focus troubleshooting efforts.
Once I have identified the areas in the application likely to hold the bug, it’s time to start localizing the error. There are several approaches to take. The most common is to “Easter Egg”. In electronics, “Easter Egging” is searching for the fault by randomly replacing components. We are all guilty of this: randomly trying anything in the desperate hope of stumbling on the solution. It is, however, a very unproductive approach to troubleshooting.
The best approach to take is “Half Splitting”. Most programs can be subdivided into modules or subroutines. If users are unable to log in or authenticate to an application, the problem is probably not related to a reporting module. In our example of the “Add Item to Shopping Cart” use case, the use case may include the following steps:
- User navigates to item detail page
- User clicks “Add to Cart” button
- System validates user session
- System adds item to user cart object
- System takes user to shopping cart detail page
- System displays contents of cart
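To “Half Split” this process and determine where the bug is occurring, I could monitor the output of the “System validates user session” process, which sits near the middle of the sequence. Did I get the expected output? If so, I know the problem is downstream of this point. If not, the problem is at or upstream of this point. Either way, I have ruled out roughly half of the steps with a single check. I just need to continue Half Splitting the remaining steps until the error is isolated.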
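The idea can be sketched in Python as a binary search over a chain of steps. Everything here is hypothetical and simplified: the three stage functions loosely stand in for the use case steps, add_item contains a deliberate bug for the demo, and the sketch assumes each step's output can be checked on its own and that a fault, once introduced, keeps every later check failing.

```python
def run_through(stages, data, upto):
    """Run the first `upto` stages and return the intermediate result."""
    for stage in stages[:upto]:
        data = stage(data)
    return data


def find_first_bad_stage(stages, checks, data):
    """Binary-search for the first stage whose output fails its check.

    checks[i] validates the output produced after stage i has run.
    Returns the index of the faulty stage, or None if the final check passes.
    """
    lo, hi = 0, len(stages)  # the fault lies somewhere in stages[lo:hi]
    if checks[-1](run_through(stages, data, hi)):
        return None
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if checks[mid - 1](run_through(stages, data, mid)):
            lo = mid  # good so far: the fault is downstream
        else:
            hi = mid  # already bad here: the fault is at or upstream
    return lo


def validate_session(state):
    return {**state, "session_ok": bool(state.get("user"))}


def add_item(state):
    # Deliberate bug for the demo: the item is never added to the cart.
    return {**state, "cart": list(state.get("cart", []))}


def render_cart(state):
    return {**state, "page": "Cart: " + ", ".join(state["cart"])}


stages = [validate_session, add_item, render_cart]
checks = [
    lambda s: s["session_ok"],
    lambda s: s["item"] in s["cart"],
    lambda s: s["item"] in s["page"],
]

faulty = find_first_bad_stage(stages, checks, {"user": "alice", "item": "widget"})
print("faulty stage:", stages[faulty].__name__)
```

With only three steps the saving is small, but the number of checks grows with the logarithm of the number of steps, so the approach pays off quickly on longer chains.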
Once I have found the problem and made any necessary corrections, I still have another, important troubleshooting step. I need to test the changes, and ensure a) the original problem has been corrected, and b) we have not introduced a new bug.
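One way to make this step repeatable is to capture both checks as small automated tests. The sketch below is only an illustration in Python: add_to_cart is a hypothetical stand-in for whatever function was fixed, and the three tests are assumptions about what "fixed" and "nothing else broken" might mean for it.

```python
def add_to_cart(cart, item):
    """Return a new cart list with `item` appended (hypothetical fixed function)."""
    return cart + [item]


def test_original_bug_is_fixed():
    # a) the reported problem: the added item must show up in the cart
    assert "widget" in add_to_cart([], "widget")


def test_existing_items_are_preserved():
    # b) no new bug: items already in the cart are kept, in order
    assert add_to_cart(["gadget"], "widget") == ["gadget", "widget"]


def test_input_cart_is_not_modified():
    # b) no new bug: the caller's list is left untouched
    cart = ["gadget"]
    add_to_cart(cart, "widget")
    assert cart == ["gadget"]


for test in (
    test_original_bug_is_fixed,
    test_existing_items_are_preserved,
    test_input_cart_is_not_modified,
):
    test()
    print(f"{test.__name__}: ok")
```

Keeping the first test around after the fix means the same bug cannot quietly return later.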
Troubleshooting Toolbox
Programmers have many troubleshooting tools at their disposal, from simple log files to complex debugging environments. For basic troubleshooting, here are some important tools to be aware of.
Keep Notes
Making a record of the troubleshooting effort is perhaps the easiest troubleshooting tool to use, and probably the least used. If I don’t find the problem in one sitting, I know I won’t remember everything I have already tried. A good set of notes will keep me from repeating the same steps.
Log Files
Know where the log files are for the development environment. If I am writing PHP for a web application running on an Apache server, Apache will frequently put useful information in the access and error logs. The error reports can aid in quickly tracking down syntax errors such as:
[17-May-2007 00:22:10] PHP Parse error: syntax error, unexpected '.' in /path to file/test.php on line 51
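When a log is long, a small script can pull out just the lines of interest. The following Python sketch is modelled only on the single sample line above, so the pattern is an assumption and real logs may be formatted differently. find_parse_errors is a hypothetical helper name, and the second sample line is made up to show a line that does not match.

```python
import re

# Pattern based on the one sample line shown above; adjust it for real logs.
PARSE_ERROR = re.compile(
    r"^\[(?P<when>[^\]]+)\] PHP Parse error:\s+(?P<message>.*)"
    r" in (?P<file>.+) on line (?P<line>\d+)"
)


def find_parse_errors(lines):
    """Yield (timestamp, file, line number, message) for each PHP parse error line."""
    for text in lines:
        match = PARSE_ERROR.match(text)
        if match:
            yield (match["when"], match["file"], int(match["line"]), match["message"])


sample = [
    "[17-May-2007 00:22:10] PHP Parse error: syntax error, unexpected '.' "
    "in /path to file/test.php on line 51",
    "[17-May-2007 00:22:11] some other, unrelated log entry",  # made-up line
]

for when, path, line_no, message in find_parse_errors(sample):
    print(f"{path}:{line_no}: {message} ({when})")
```

To scan a real file, the same function can be given an open file object instead of the sample list, since it only needs something that yields lines.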
Logging Services
Another tool to add to the troubleshooting kit is a logging service, such as log4j for Java. This simple add-on library can greatly enhance debugging information. The Apache Logging Services project (http://logging.apache.org) includes logging packages for C++, Java, .NET, Perl and PHP.
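To show the general idea of a logging service without tying it to log4j, here is a minimal sketch using Python's standard logging module, which works along similar lines (named loggers, severity levels, a configurable message format). The logger name "cart" and the add_item function are hypothetical.

```python
import logging

# Send everything at DEBUG level and above to the console with a timestamp.
logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
log = logging.getLogger("cart")  # hypothetical logger name


def add_item(cart, item):
    """Return a new cart with `item` added, logging what happens along the way."""
    log.debug("add_item called with cart=%r item=%r", cart, item)
    if not item:
        log.error("refusing to add an empty item")
        return cart
    cart = cart + [item]
    log.info("cart now holds %d item(s)", len(cart))
    return cart


add_item([], "widget")
add_item(["widget"], "")
```

Raising the level to logging.WARNING in basicConfig silences the debug and info messages without touching the rest of the code, which is the main advantage over scattered print statements.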
Manual Debugging Points
When coding, I try to stay in the habit of including manual debugging points. In Perl I do something like this:
$debugMode = 1; # Set 1 to turn debug statements on, 0 for off
…
print "DEBUG:var: DSN, USER, PASS: $DSN, $USR, $PASS\n" if $debugMode;
…
print "DEBUG:stat: Oracle Login Successful\n" if $debugMode;
I make a habit of adding these markers as I am programming, because it makes on-the-fly bug hunting much easier. In this way, I can easily see the values of variables and the progress of program execution as needed. And when the program is ready for production, all I need to do is set $debugMode = 0.
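The same habit carries over to other languages. As a rough Python equivalent, the sketch below wraps the check of the flag in a small helper so the debug lines stay short. DEBUG_MODE, the debug helper, and the sample values are all hypothetical. Unlike the Perl lines above, this version leaves the password out, on the assumption that debug output may end up somewhere other people can read it.

```python
import sys

DEBUG_MODE = True  # Set True to turn debug statements on, False for off


def debug(label, **values):
    """Print a DEBUG line to stderr when DEBUG_MODE is on; otherwise do nothing."""
    if DEBUG_MODE:
        details = ", ".join(f"{name}={value!r}" for name, value in values.items())
        print(f"DEBUG:{label}: {details}", file=sys.stderr)


dsn, user = "dbi:Oracle:example", "scott"  # hypothetical connection values
debug("var", dsn=dsn, user=user)
debug("stat", login="successful")
```

Writing to stderr keeps the debug lines separate from the program's normal output, so they can be redirected or discarded independently.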
Variable Dumps
Many languages include functions to facilitate viewing the current state of variables. For example, var_dump() in PHP is an excellent tool to quickly analyze the state of data as it moves through the program. Here is a simple example of var_dump() in action:
<?php
$days_of_week = array ('mon'=>'Monday',
'tue'=>'Tuesday',
'wed'=>'Wednesday',
'thur'=>'Thursday',
'fri'=>'Friday',
'sat'=>'Saturday',
'sun'=>'Sunday');
…
var_dump ($days_of_week);
?>
The output of var_dump() looks like this:
array(7) {
["mon"]=>
string(6) "Monday"
["tue"]=>
string(7) "Tuesday"
["wed"]=>
string(9) "Wednesday"
["thur"]=>
string(8) "Thursday"
["fri"]=>
string(6) "Friday"
["sat"]=>
string(8) "Saturday"
["sun"]=>
string(6) "Sunday"
}
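Other languages have comparable helpers. In Python, for instance, the standard pprint module gives a similar readable dump of nested data. The sketch below reuses the same days-of-week data and assumes Python 3.8 or later for the sort_dicts argument.

```python
from pprint import pformat

days_of_week = {
    "mon": "Monday",
    "tue": "Tuesday",
    "wed": "Wednesday",
    "thur": "Thursday",
    "fri": "Friday",
    "sat": "Saturday",
    "sun": "Sunday",
}

# sort_dicts=False keeps the keys in insertion order instead of sorting them.
dump = pformat(days_of_week, sort_dicts=False)
print(f"{type(days_of_week).__name__} with {len(days_of_week)} entries:")
print(dump)
```

Unlike var_dump(), pprint does not print types or lengths on its own, which is why the sketch adds them by hand in the first print call.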
Leave Footprints
Another good habit to foster is leaving footprints in the code as enhancements or changes are made. This is not a replacement for a good version control methodology, but placing footprints and adding enhancement notes to the header of files greatly improves code readability. If I need to make a fix to a script, I will often document in the top of the script the date and time, and the changes I am making. Then in the code, I will leave a marker where the changes are made. A marker may look something like /BLE-05272007/, the programmer’s initials and the date. Six months or a year later, these footprints can be a lifesaver.
Frequently Made Mistakes
Keep a list of common errors or frequently made mistakes. For example, I have on more than one occasion used an assignment instead of a comparison in an if condition:
- Right: if ($myvar == "test value") …
- Wrong: if ($myvar = "test value") …
This can be a very difficult error to track down, because the code runs without complaint: the single = overwrites $myvar, and the condition is then true whenever the assigned value is. Once the general area is identified using half-splitting, take a look at each variable value at steps along the way. Keep an eye out for variables that never change. Hopefully, you will only have to find this type of error once, and keeping a reminder in a frequent mistakes list will help jog your memory in the future.
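How much help the language gives with this mistake varies. As a side illustration, Python refuses to compile a plain = inside an if condition, so there the slip surfaces immediately as a syntax error. The snippet below demonstrates this by compiling a one-line source string; the variable name and value simply mirror the example above.

```python
# The same slip written as Python source, held in a string so it can be compiled safely.
source = 'if myvar = "test value":\n    pass\n'

try:
    compile(source, "<example>", "exec")
    rejected = False
except SyntaxError as err:
    rejected = True
    print("rejected at compile time:", err.msg)

print("assignment in condition rejected:", rejected)
```

In languages that accept the assignment, a linter or compiler warning setting may catch it instead, which is worth noting next to the entry in the mistakes list.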
Other Debugging Tools
I recently started using Firebug (a Firefox extension, http://www.getfirebug.com) to help troubleshoot some CSS problems, and it is an excellent utility. There are many other useful debugging tools available, and it can be worth the time to explore new options.
Things to Avoid
- As I have discussed, “Easter Egging” is not a productive troubleshooting technique. The problem may eventually be found, but taking a logical, systematic approach will find it faster and with less stress.
- Don’t assume there is only one problem to fix. Several symptoms may indicate one or more problems.
- Don’t jump on the first thing you see. Take your time and make sure the symptom you noticed is actually a problem.
Conclusion
These simple techniques can be used in any programming environment, whether it is programming RPG on an AS/400, writing Perl scripts for data conversion or PHP for a website. Good troubleshooting skills will set you apart from your colleagues.