Here’s the dilemma: You’ve got some code in front of you that you know basically nothing about and it doesn’t work. Well, actually, there’s just one little part of it that doesn’t work, but the trouble is you have absolutely no idea where or what that little part is. Your job is to fix it. Where do you start?
I recently ran into this problem when trying to debug a drag and drop problem in Delphi 2006. I hadn’t debugged drag and drop code before, and I’m sure anyone who has appreciates the challenges of using the debugger to figure out what is going on during a drag operation. The big problem is that if you set a breakpoint in the code which executes when the drag operation begins, you won’t be able to step through all of the code that needs to execute to complete a drop operation, because the application cannot process mouse messages properly while it is stopped in the debugger. As soon as the breakpoint hits, your drag operation is basically cancelled.
The bug I was working on had to do with being unable to rearrange items in the tool palette using the mouse. On certain systems, the tool palette would simply not allow dragged items to be dropped.
In this situation, the Delphi IDE uses "OLE" drag and drop, which means much of the code that executes during a drag operation is invoked by callbacks from the operating system. Since these callbacks are invoked indirectly, they are more difficult to identify. It’s awfully tough to set a breakpoint on a line of code when you don’t know where it is! The problem was further complicated by the fact that code that should have been executing wasn’t getting called, so even if I had known where to put the breakpoint it would never have been hit.
I decided to use AQTime to tackle this problem. Not only is it a powerful code profiler, it is also a very useful debugging tool. The results it generates show which functions are called, how many times they’re called, and how long each call takes. In addition, it provides a call stack that clearly shows which functions call other functions. Another nice thing about the call stack AQTime provides is its "static" nature: you don’t have to be sitting on a breakpoint with the program running to review it, as you normally would with a debugger.
The Debugging Process
To get started, I found a system that was not affected by the bug and profiled the drag and drop operation. This gave me a "good" set of profiler data. I copied the results to a system that couldn’t perform the drag and drop. Now, using the same AQTime project with the good results, I executed the same steps (which failed) under the profiler. This gave me a "bad" set of profiler data. Next, I used a feature in AQTime which allows you to compare these two sets of results. That allowed me to quickly see which methods were not being called on the computer affected by the bug. Finally, I analyzed the call stacks from each profiler run to determine where the execution paths diverged.
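The comparison step can be illustrated outside of any particular tool. The sketch below is not AQTime's comparison feature. It assumes each profiler run has already been reduced by hand to a mapping of routine name to hit count, plus one call path per run, and every routine name and number in it is hypothetical.

```python
from itertools import zip_longest


def missing_calls(good, bad):
    """Return routines hit in the good run but never hit in the bad run."""
    return sorted(
        name for name, hits in good.items()
        if hits > 0 and bad.get(name, 0) == 0
    )


def first_divergence(good_path, bad_path):
    """Return (index, good_frame, bad_frame) for the first differing frame.

    Paths are listed from the outermost caller inward. A frame of None
    means that path ended before the other one did. Returns None when
    both paths are identical.
    """
    for index, (good_frame, bad_frame) in enumerate(zip_longest(good_path, bad_path)):
        if good_frame != bad_frame:
            return index, good_frame, bad_frame
    return None


# Hypothetical hit counts from the "good" and "bad" runs.
good_run = {"Palette.BeginDrag": 1, "Palette.DragEnter": 1,
            "Palette.DragOver": 42, "Palette.Drop": 1}
bad_run = {"Palette.BeginDrag": 1, "Palette.DragEnter": 1}

# Hypothetical call paths leading into the drag handling code.
good_path = ["WndProc", "DoDragDrop", "Palette.DragEnter", "Palette.DragOver"]
bad_path = ["WndProc", "DoDragDrop", "Palette.DragEnter"]

print(missing_calls(good_run, bad_run))
print(first_divergence(good_path, bad_path))
```

The first function answers "which methods never ran on the affected machine", and the second points at the last frame the two runs had in common, which is a reasonable place to set the first breakpoint.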
At this point, I had the information I needed to jump back into the debugger, set some breakpoints, and figure out exactly what was going wrong. This took me just a couple of minutes, now that I had a clear picture of what the expected code path was. From there I made a few code changes to fix the bug and voilà, problem solved [1]. I’m sure there were other ways I could have attacked this bug, and had I gone into this exercise with more background in drag and drop code I probably would have known right where to look for the problem without any extra help. Fortunately, with AQTime in my toolbox, I knew I could get the job done with this approach.
It’s incredibly useful to have a "static" call stack that you can examine after the fact. You can walk up and down the stack, review the associated code in the code editor window, and review any part of the profiler results without having to run the program. Another big advantage of the call stack AQTime provides is that it traverses the .NET managed/unmanaged code interop boundary, showing you a full picture of the code execution. The IDE debugger only allows you to see either .NET or Win32 calls in the call stack, not both.
[1] In case you are curious, the bug involved the use of IDropTargetHelper. Its DragEnter method returns an error code when the "Show window contents while dragging" option is disabled in the operating system. The workaround for the bug is to enable this option in the OS.
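Since the failure depends on an operating system setting, it can help to check that setting on an affected machine before reaching for the profiler. The following is a minimal sketch, not code from the Delphi IDE. It assumes the Win32 SystemParametersInfo call with SPI_GETDRAGFULLWINDOWS reports the "Show window contents while dragging" option, and the function name is made up for illustration.

```python
import ctypes
import sys

SPI_GETDRAGFULLWINDOWS = 0x0026  # constant from WinUser.h


def full_window_drag_enabled():
    """Return True or False for the OS setting, or None if it cannot be read."""
    if sys.platform != "win32":
        return None  # the setting only exists on Windows
    enabled = ctypes.c_int(0)  # receives a Win32 BOOL
    ok = ctypes.windll.user32.SystemParametersInfoW(
        SPI_GETDRAGFULLWINDOWS, 0, ctypes.byref(enabled), 0
    )
    return bool(enabled.value) if ok else None


print(full_window_drag_enabled())
```

A result of False on a machine that shows the bug would be consistent with the cause described in the footnote. None means the value could not be queried.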