Author: Craig Stuntz
I helped another developer debug an interesting problem this morning. Let’s see if you can spot the problem. The code in question looked something like this simplified version containing only enough code to show the problem:
[Code listing missing from this copy of the post.]
Note that the result of the function DoStuff is not used by Execute. That result actually exists only for testing purposes; it’s essentially a log we use to monitor changes the method makes to external state. The unit tests in question passed, so it was clear that DoStuff worked correctly, at least in a test context. The problem was that when the code ran outside of a test context (i.e., in the real application), the DoStuff method would never run. In the “real” application, the debugger would stop at breakpoint 1 but not at breakpoint 2. Similarly, attempting to step into DoStuff would not actually go into the method body. If we debugged the unit tests, the debugger would stop at both breakpoints, and the method worked.
Can you spot the bug?
Perhaps it would help if I showed more of the method:
[Code listing missing from this copy of the post.]
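As a stand-in for the listing, here is a hypothetical reconstruction pieced together from the description; it is not the original code. The class name Worker, the SideEffects counter standing in for the external state, and the string log entry are all assumptions. The comments mark where the two breakpoints mentioned above would sit.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical reconstruction; only Execute and DoStuff are names from the post.
public class Worker
{
    // Assumed stand-in for the external state the real method changes.
    public int SideEffects { get; private set; }

    public void Execute()
    {
        DoStuff(); // breakpoint 1 (the result is not used)
    }

    public IEnumerable<string> DoStuff()
    {
        SideEffects++; // breakpoint 2
        var entry = "changed external state"; // assumed log entry, used only by tests
        yield return entry;
    }
}

public static class Program
{
    public static void Main()
    {
        var worker = new Worker();
        worker.Execute();
        // Shows how many times the body of DoStuff has run.
        Console.WriteLine(worker.SideEffects);
    }
}
```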
Now do you see the bug? Remember, the unit tests pass. There is no special knowledge about our application needed to see the problem here; all of the information required to spot the bug is in the code snippets above. The problem is a code bug, not a setup or configuration issue.
Perhaps it would help if I showed you a version of DoStuff which “works.”
[Code listing missing from this copy of the post.]
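As a stand-in for the listing, here is a hypothetical sketch of what such a version might look like; it is not the original code. It assumes the only difference is a final line that returns an ordinary array. The names EagerWorker and SideEffects and the string log entry are assumptions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical reconstruction; only Execute and DoStuff are names from the post.
public class EagerWorker
{
    // Assumed stand-in for the external state the real method changes.
    public int SideEffects { get; private set; }

    public void Execute()
    {
        DoStuff(); // breakpoint 1 (the result is still not used)
    }

    public IEnumerable<string> DoStuff()
    {
        SideEffects++; // breakpoint 2
        var entry = "changed external state"; // assumed log entry, used only by tests
        return new[] { entry }; // ordinary return of an already-built array
    }
}

public static class EagerProgram
{
    public static void Main()
    {
        var worker = new EagerWorker();
        worker.Execute();
        Console.WriteLine(worker.SideEffects); // prints 1
    }
}
```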
With this version, both the unit tests and the “real” application work correctly.
The Solution
At first glance, this might seem puzzling. I’ve changed only the last line, and both of those versions appear to do almost exactly the same thing. Why is the behavior of the breakpoint at the previous line different?
The answer is that using yield return causes the C# compiler to change the entire method, not just that single line. It surrounds the code with a state machine containing the rest of the method body. Importantly, the iterator returned from the “yield return” method is entirely lazy; it will not run the method body at all until you attempt to iterate the result of the method. But Execute ignores this result, so the method never runs at all.
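The laziness is easy to observe in isolation. The following is a minimal sketch, not code from the application: LazyDemo, Numbers, and the Runs counter are hypothetical names chosen for illustration. It shows that calling an iterator method runs none of its body, that the first MoveNext runs it up to the yield, and that enumerating the same result again runs the body again.

```csharp
using System;
using System.Collections.Generic;

public static class LazyDemo
{
    // Counts how many times the iterator body has started executing.
    public static int Runs;

    public static IEnumerable<int> Numbers()
    {
        Runs++;          // side effect inside the iterator body
        yield return 42;
    }

    public static void Main()
    {
        IEnumerable<int> sequence = Numbers();
        Console.WriteLine(Runs); // prints 0: the call only created the iterator object

        using (IEnumerator<int> e = sequence.GetEnumerator())
        {
            Console.WriteLine(Runs);      // prints 0: still nothing has run
            e.MoveNext();
            Console.WriteLine(Runs);      // prints 1: MoveNext ran the body up to the yield
            Console.WriteLine(e.Current); // prints 42
        }

        foreach (int n in sequence) { }
        Console.WriteLine(Runs); // prints 2: a second enumeration runs the body again
    }
}
```

The last line is worth noting when an iterator has side effects: a caller that ignores the result triggers none of them, and a caller that enumerates it twice triggers them twice.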
Discussion
Some languages, like Haskell, go to great lengths to segregate expressions and side effects. C# isn’t one of them, but even so it’s common to try to improve quality by doing so. Eric Lippert, a member of the C# compiler team, once wrote:
I am philosophically opposed to providing [an IEnumerable<T>.ForEach() method], for two reasons. The first reason is that doing so violates the functional programming principles that all the other sequence operators are based upon. Clearly the sole purpose of a call to this method is to cause side effects.
The purpose of an expression is to compute a value, not to cause a side effect. The purpose of a statement is to cause a side effect.
It is clear that causing side effects could cause an expression to change in mid-computation. This is problematic for debugging and quality, especially if some of the evaluations are lazy. But as this example demonstrates, the opposite is also true: Adding expressions to a computation can change the side effects, too.