Learning WinDBG/SOS and Advanced Debugging

In my daily R&D work as well as in my general development, I always keep WinDBG open so I can quickly debug major problems in a system or just to take a look under the covers. WinDBG is short for Windows Debugger and it's what you would use if you were to debug a Windows driver or figure out why your system blue screened.  It's an advanced unmanaged debugger.  If you're into internals and eat up books like Windows Internals and Windows via C/C++, then you will or probably already do love Windows Debugger.

You can use it for more than unmanaged debugging though.  The .NET framework ships with a product called SOS, which you can load into WinDBG to enable advanced managed debugging.  Actually, with the proper settings ("Enable unmanaged code debugging" to true) you can sometimes load SOS in Visual Studio.  Using either, you can do anything from break when a particular method is called, dump out the IL at a particular place in your call stack (yes, this means you can view the code at runtime without needing the code!), view the methods of a type, or even break when the CLR first loads.  It's incredibly powerful.  You don't even need to be at the system to use it.  Just have someone send you a memory dump and you can use that just as easily as if you were physically at the system.

You can even use it to debug .NET applications inside of unmanaged applications.  For example, WinDBG is the tool I used to figure out why Visual Studio 2008 didn't allow .NET assemblies to be referenced in Silverlight.  Visual Studio was simply loading a managed assembly to do all of its assembly reference dirty work.  Since .NET has this awesome thing called the CTS (Common Type System), types are actual types instead of just chunks of memory.  So when you pause a process and walk through memory, you don't just see memory addresses like 0x018271927, but you see things like System.String with a value of "ctl02".

In addition to being able to debug unmanaged code, .NET code, and .NET code in unmanaged code, you can also use WinDBG to debug Silverlight code.  As it turns out, when you install Silverlight, sos.dll is installed in your %ProgramFiles%\Microsoft Silverlight\2.0.31005.0\sos.dll folder (or on a 64-bit system, %ProgramFiles(x86)%\Microsoft Silverlight\2.0.31005.0\sos.dll).  Just attach your debugger to the unmanaged browser to debug the managed Silverlight application.

At this point you might expect me to announce that I'm going to do some long series on how to effectively use SOS, but that's not the case for two reasons: first, I'm not going to dump out a series of screen shots when many others have done this already.  Second, 90%+ of advanced debugging is all about understanding internals; something that takes many hours of study over many months.  Don't think that learning some tool will help you to effectively debug serious problems.

Thus, remember this: your ability to debug something is directly proportional to your knowledge of the system.  As I say regularly, if you ever try to fix something without understanding it, you are, at best, doing little more than hacking.  Therefore, instead of writing a series, I'm going to refer you to a series of places where you can easily learn all about WinDBG and, more importantly, advanced debugging and system internals.

Before you go diving into a list of resources, though, you need to realize that WinDBG, like Visual Studio's built-in, toned-down debugger, is only a tool.  As I've already mentioned, no tool will ever replace the need for the human to know how to use the tool or how to interpret the information provided by the tool.  Knowledge of debugging is required before you can use a debugging tool effectively.  Logic is also always required when debugging any issue.  Not only that, but as previously mentioned, deep knowledge of the internals of the system you are debugging is non-negotiable.  If you don't know what a MethodDesc, what a method table, or even what a module is, then your debugging abilities will be severely limited.  You won't have a clue how to find or interpret your debug output.

The more you know about the CLR, memory management, threading, garbage collection, the CLR's type system, and Windows the better off you are.  If you understand why .NET finalizers should only be used once every 10 years, you're on the right track.  If you don't even know how many generations are in .NET's garbage collector, you have some serious research ahead of you.  The more you know, the better off you are.  This applies to the whole of the system.  I hear people all the time talking about the "stack" and the "heap" as if they are some ethereal concept beyond human grasp.  These people typically have no idea that these are concepts that predate .NET by decades.  If you can't explain the stack and heap from a C/C++ perspective, then you don't truly understand the concepts.

Also, if you understand memory management in C/C++, you're way ahead of the game.  Even though you don't normally see them in a typical ASP.NET web form, you should know your pointers as they are the foundation of all memory management.  Without awareness of pointers, the entire concept of "passing by reference" is little more than magic.  You need to know why something is a value type and why something is a reference type.  If you don't understand the concept of a sync block index, then you're always wonder why you can't lock on a value type and why you can lock on a value type cast to an interface.  This type of information will go a long way to help you not only debug a system, but, equally important, to make sure it has optimal performance.

You should also not be afraid of IL.  This is the language of .NET.  You don't need to care about becoming fluent in writing in IL.  I promise you that you won't be required by any fascist employer to write a full-scale WPF system in IL.  However, if you don't even know what boxing, you need to hit the books.  Knowledge of IL dramatically aides in your understanding of both the framework class library and the CLR.  You should be aware of what is actually happening under the covers.  Don't simply assume that the csc.exe or vbc.exe compilers will fix your coding flaws.  Some times the code created by these compilers isn't optimal.  You should also understand what the evaluation stack is and how stack-based programming works as this is the foundation for IL development.

Fortunately, there are resources for these prerequisites of advanced debugging in addition to the following debugging resources.  Not only that, but there's two books that explains just about every thing I've just mentioned!  It's Jeffery Richter's CLR via C# and Joe Duffy's .NET Framework 2.0.  Buy these books and a stack of highlighters and study until every page has marks, notes, and coffee stains on them.  In addition to this, you should probably also drop by your local book store, grab a book on C/C++ and read a chapter or two on memory management so you can see the types of things the CLR has to deal with so you don't have to.

For even deeper debugging, though, you may want to study the "Processor Architecture" section of the WinDBG help files, get the aforementioned Windows/C++ books, and buy Advanced Windows Debugging as well.  Everything you study is just "code", so the more you know, the deeper you can debug.  Not just debugging though, but knowledge of internals help you get more knowledge of internals.  By understanding C++, assembly language, and Windows internals, you already have just about everything you need to do effective reverse engineering.  This will help you to understand the undocumented internals and features of various technologies.

These things will help further your knowledge of what's actually going on in .NET's brain.  It's also important to remember that you won't learns this stuff over night.  As I said, your debugging skills are directly related to your knowledge of internals.  Therefore, as pick up more prerequisites in your career, you will become better at debugging (and performance optimization).  Every few months I take some time to further my understanding of the internals of new (and even sometimes old) technologies for these specific reasons.  For example, in Q1 and Q2 of 2009, I'm spending the bulk of my time studying assembly language and Windows architecture to further my own understanding.

Now on to the resources (which will be updated every now and again):

Advanced Debugging Resources