I would like to take some time and pontificate about a subject that is near and dear to my heart – Debugging Code. So here are some basic principles:
1) Developers should be responsible for debugging – not Quality Assurance.
Developers know the code – they wrote it.
The same is not true for QA.
So there is no way that QA can debug code as well as Development.
2) Code must be designed to facilitate debugging.
Debugging has to be built into the code.
I will go into this subject in detail in a bit.
3) A goal should be to never have to say: “I have no idea what the bug is”.
If this happens, then there is not enough debugging infrastructure built
into the product.
4) How quickly can a developer diagnose a bug?
A goal should be for bug identification in under an hour.
If it takes more than 24 hours to diagnose a single bug,
then there is not enough debugging infrastructure built into the product.
Code Debugging should be driven by a Test Plan. Here is how Code Debugging ties into a Test Plan:
1) Developers should write the Test Plan.
QA should review the Test Plan.
2) Developers should write all tools needed by the Test Plan.
3) QA should execute the Test Plan.
Development should diagnose and fix all bugs seen by QA.
And here is Code Debugging from the developer's perspective:
1) Each subsystem can be defined by its data structures.
The question is: how are these data structures manipulated?
2) The developer needs to create a utility that does a formatted display of
these data structures.
3) The developer needs to create a utility that sanity checks these data
structures.
This utility can be run from the CLI during problem resolution or as a
check point during code execution.
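
To make points 2 and 3 concrete, here is a minimal sketch of a display utility and a sanity checker for one data structure. Everything in it is an assumption made for illustration: the `BlockPool` subsystem, its fields, and the function names `format_pool` and `check_pool` are hypothetical, and a real product would have its own structures and invariants.

```python
from dataclasses import dataclass, field


@dataclass
class BlockPool:
    """Hypothetical subsystem state: a pool of fixed-size blocks."""
    total: int                                  # number of blocks in the pool
    free: list = field(default_factory=list)    # ids of blocks available for use
    used: dict = field(default_factory=dict)    # block id -> name of current owner


def format_pool(pool):
    """Formatted display: a summary line, then one line per block in use."""
    lines = [
        f"BlockPool total={pool.total} free={len(pool.free)} used={len(pool.used)}"
    ]
    for block_id in sorted(pool.used):
        lines.append(f"  block {block_id:3d} owner={pool.used[block_id]}")
    return "\n".join(lines)


def check_pool(pool):
    """Sanity check: return a list of problems; an empty list means consistent."""
    problems = []
    if len(set(pool.free)) != len(pool.free):
        problems.append("duplicate entries on free list")
    both = set(pool.free) & set(pool.used)
    if both:
        problems.append(f"blocks both free and used: {sorted(both)}")
    if len(set(pool.free) | set(pool.used)) != pool.total:
        problems.append("free + used does not account for every block")
    bad = [b for b in list(pool.free) + list(pool.used) if not 0 <= b < pool.total]
    if bad:
        problems.append(f"block ids out of range: {sorted(bad)}")
    return problems
```

Because `check_pool` returns its findings instead of printing them, the same function can back a CLI command during problem resolution and can be called as a check point inside the running code.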
In addition, Development needs to provide a Code Logging Facility. Code detects problems by the use of ASSERTs. An ASSERT tests the validity of a conditional. If the result of an ASSERT is false, then the action can be:
1) Panic
Crash the product, take a core dump, and restart the product.
This is the most drastic action.
You obviously would like to avoid this action as much as possible.
But sometimes this is the only appropriate action, if the system is totally
hosed or data corruption is possible.
2) Log File
Write a debugging message to a disk log file and continue with normal
function.
This option does not interrupt service, but every message costs a disk
write, so it can mean a large performance hit.
3) In Memory Tracing
Write a debugging message to an in-memory circular buffer.
This option also does not interrupt service and writing a message to memory
is fast.
You have to make the circular buffer large enough to hold enough info
without wasting too much space.
If the buffer is too small, then info will be quickly overwritten.
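
The three actions above can be sketched as one small logging facility. This is only an illustration under stated assumptions: the names `DebugLog`, `Action`, `Panic`, and `check` are made up for this example, the default buffer size of 1024 entries is arbitrary, and the panic path raises an exception as a stand-in for a real crash with a core dump.

```python
import collections
import enum
import time


class Action(enum.Enum):
    PANIC = "panic"        # stop the product
    LOG_FILE = "log_file"  # write to a disk log and continue
    TRACE = "trace"        # write to the in-memory circular buffer and continue


class Panic(RuntimeError):
    """Stand-in for crashing the product; a real system might call os.abort()."""


class DebugLog:
    def __init__(self, log_path, trace_size=1024):
        self.log_path = log_path
        # A deque with maxlen drops its oldest entry once full,
        # which gives circular-buffer behavior.
        self.trace = collections.deque(maxlen=trace_size)

    def check(self, condition, message, action=Action.TRACE):
        """ASSERT: return True if the condition holds, else take the action."""
        if condition:
            return True
        record = f"{time.time():.6f} ASSERT failed: {message}"
        if action is Action.PANIC:
            self.trace.append(record)  # keep the last words for post-mortem
            raise Panic(record)
        if action is Action.LOG_FILE:
            with open(self.log_path, "a", encoding="utf-8") as f:
                f.write(record + "\n")
        else:
            self.trace.append(record)
        return False

    def dump_trace(self):
        """Return the buffered messages, oldest first."""
        return list(self.trace)
```

A call such as `log.check(count >= 0, "negative count")` records the failure in memory and keeps going, while passing `Action.PANIC` is reserved for the cases where continuing could corrupt data. With `trace_size=2`, the third failed check pushes the first one out of the buffer, which is the overwriting problem described in point 3.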