If you’re a C# developer, you probably use the foreach language keyword all the time. I love this keyword, because it drastically simplifies a lot of the boilerplate we’d otherwise have to write ourselves. It makes consuming Iterators very easy, and aligns the programming language very closely to natural language constructs. With that said, have you ever looked at the code that gets generated by the C# compiler? That’s the topic of this post, so let’s dig in!

I love digging into details about programming languages, and frequently look at the Intermediate Language (IL) the Roslyn compiler generates for my code. A lot of my interest in this area came from Joe Duffy, Eric Lippert, and Mads Torgersen. Joe talks a lot about safety mechanisms in programming languages and concurrency on his blog, and Eric has discussed all sorts of interesting programming language topics on his. A lot of this also gave me a deep appreciation for code contracts (not to be confused with the Code Contracts technology).

So how does the foreach keyword work, anyway? You could read the part of the language specification regarding foreach, and you would save yourself quite a bit of time by doing so. The docs are also great, and I highly recommend reading them (if you haven’t). But, with that said, it’s more satisfying to dig in yourself, which is what we’re going to do here!

We’re going to use an extraordinarily simple method as our sample for looking at foreach:

static void PrintStuff(IEnumerable<string> strings) {
    foreach (var str in strings)
        Console.WriteLine(str);
}

This one will be easy enough to deconstruct. So, let’s get this thing compiled. I compiled the method above using Visual Studio and the default Debug configuration. Here’s the resulting IL:

.method private hidebysig static void 
    PrintStuff(class [System.Runtime]System.Collections.Generic.IEnumerable`1<string> strings)
    cil managed
{
  // Code size       47 (0x2f)
  .maxstack  1
  .locals init (
    class [System.Runtime]System.Collections.Generic.IEnumerator`1<string> V_0,
    string V_1)
  IL_0000:  nop
  IL_0001:  nop
  IL_0002:  ldarg.0
  IL_0003:  callvirt   instance class [System.Runtime]System.Collections.Generic.IEnumerator`1<!0> class [System.Runtime]System.Collections.Generic.IEnumerable`1<string>::GetEnumerator()
  IL_0008:  stloc.0
  .try
  {
    IL_0009:  br.s       IL_0019
    IL_000b:  ldloc.0
    IL_000c:  callvirt   instance !0 class [System.Runtime]System.Collections.Generic.IEnumerator`1<string>::get_Current()
    IL_0011:  stloc.1
    IL_0012:  ldloc.1
    IL_0013:  call       void [System.Console]System.Console::WriteLine(string)
    IL_0018:  nop
    IL_0019:  ldloc.0
    IL_001a:  callvirt   instance bool [System.Runtime]System.Collections.IEnumerator::MoveNext()
    IL_001f:  brtrue.s   IL_000b
    IL_0021:  leave.s    IL_002e
  }  // end .try
  finally
  {
    IL_0023:  ldloc.0
    IL_0024:  brfalse.s  IL_002d
    IL_0026:  ldloc.0
    IL_0027:  callvirt   instance void [System.Runtime]System.IDisposable::Dispose()
    IL_002c:  nop
    IL_002d:  endfinally
  }  // end handler
  IL_002e:  ret
} // end of method Program::PrintStuff

This first bit, .method private hidebysig static void, is just header information. It’s so similar to C# that I don’t think you’ll need an explanation for it. The .maxstack directive tells the CLR how deep the evaluation stack can get while the method executes. Specifically, it’s saying “the evaluation stack must have space for at most n values at any given time.” Here n is 1: the method never holds more than one value on the stack at once.

The .locals init directive declares the method’s local variables, and the init flag tells the runtime to zero-initialize them. There is sometimes some disparity between the number of variables we declare in our code, and the number emitted to support our application. Here we declared one variable (str) and got two: V_0 holds the enumerator, and V_1 holds str. This isn’t surprising: many of C#’s language features are syntactic sugar for much more complex code - hence the syntactic sugar!
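If you’d rather not open a disassembler, reflection can report the same header information. The following is a minimal sketch, assuming a runtime where MethodBase.GetMethodBody() is available; the MethodBodyInspector class and its method names are made up for this illustration. The values it prints depend on whether you compile in Debug or Release, so no expected output is shown.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical helper class, not part of the original sample.
static class MethodBodyInspector
{
    // Same shape as the PrintStuff sample from the post.
    static void PrintStuff(IEnumerable<string> strings)
    {
        foreach (var str in strings)
            Console.WriteLine(str);
    }

    public static MethodBody GetPrintStuffBody()
    {
        MethodInfo method = typeof(MethodBodyInspector).GetMethod(
            "PrintStuff", BindingFlags.NonPublic | BindingFlags.Static)
            ?? throw new InvalidOperationException("PrintStuff not found");

        // GetMethodBody() exposes the header data that ildasm shows as
        // .maxstack and .locals init.
        return method.GetMethodBody()
            ?? throw new InvalidOperationException("No method body available");
    }

    public static void Dump()
    {
        MethodBody body = GetPrintStuffBody();
        Console.WriteLine($"maxstack: {body.MaxStackSize}");
        Console.WriteLine($"init locals: {body.InitLocals}");
        foreach (LocalVariableInfo local in body.LocalVariables)
            Console.WriteLine($"V_{local.LocalIndex}: {local.LocalType}");
    }
}
```

Calling MethodBodyInspector.Dump() lists one line per local slot, which makes it easy to compare a Debug build against a Release build.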

The next two items are nops:

  IL_0000:  nop
  IL_0001:  nop

These are only there because I compiled the application in Debug mode. They’re the bits that allow us to set breakpoints in locations like the start of the method (at the brace). Release builds would not usually include extraneous nops (which we’ll see at the end of this post).

The next part gives us real functionality:

  IL_0003:  callvirt   instance class [System.Runtime]System.Collections.Generic.IEnumerator`1<!0> class [System.Runtime]System.Collections.Generic.IEnumerable`1<string>::GetEnumerator()
  IL_0008:  stloc.0

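Just before these two instructions, ldarg.0 (IL_0002 in the full listing) pushes the strings argument onto the evaluation stack. callvirt is used here because GetEnumerator() is an interface method, so the concrete implementation is only known at run time. You’ll also see callvirt frequently on non-virtual instance methods, because callvirt performs a null check on the receiver. stloc.0 then, predictably, pops the result of calling GetEnumerator() and stores it in local variable slot 0 (V_0).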
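The null check that comes with callvirt is easy to observe from C#. Here is a small sketch; Greeter and CallvirtNullCheckDemo are names I invented for the example. Greet() is non-virtual and never touches this, yet calling it on a null reference still throws, because the C# compiler emits callvirt for the call.

```csharp
using System;

// Hypothetical type for illustration only.
sealed class Greeter
{
    // Non-virtual, and the body never dereferences 'this'.
    public string Greet() => "hello";
}

static class CallvirtNullCheckDemo
{
    // Returns true when invoking Greet() on the given reference throws.
    public static bool ThrowsOnNull(Greeter greeter)
    {
        try
        {
            greeter.Greet(); // compiled to callvirt, which null-checks the receiver
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }
}
```

ThrowsOnNull(null) returns true, while ThrowsOnNull(new Greeter()) returns false.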

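The next part - br.s IL_0019 - is an unconditional branch. It isn’t there for the debugger (the Release build at the end of this post has the same branch); it jumps straight to the loop condition at IL_0019, so that MoveNext() is called before Current is read for the first time. This next part is where the really interesting stuff happens: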

  IL_000b:  ldloc.0
  IL_000c:  callvirt   instance !0 class [System.Runtime]System.Collections.Generic.IEnumerator`1<string>::get_Current()
  IL_0011:  stloc.1
  IL_0012:  ldloc.1
  IL_0013:  call       void [System.Console]System.Console::WriteLine(string)
  IL_0018:  nop
  IL_0019:  ldloc.0
  IL_001a:  callvirt   instance bool [System.Runtime]System.Collections.IEnumerator::MoveNext()
  IL_001f:  brtrue.s   IL_000b
  IL_0021:  leave.s    IL_002e

ldloc.0 pushes our local enumerator onto the evaluation stack, and the callvirt that follows reads the IEnumerator<string>.Current property. In the C# language, properties are just syntactic sugar for equivalent <property-type> get_<property_name>() and void set_<property_name>(<property-type> value) methods, which is why the snippet above calls get_Current(). After that, stloc.1 stores the value in the local variable for str, and ldloc.1 re-loads it so we can do something with it. In this case, we’re just calling a method - Console.WriteLine(string). Calls that take more arguments would need more values pushed first, which would lead to more stack operations.
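You can confirm the property-to-method mapping with reflection. This is a tiny sketch (PropertyAccessorDemo is a made-up name) that asks the runtime for the getter behind IEnumerator<string>.Current:

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical helper class for illustration.
static class PropertyAccessorDemo
{
    public static string GetterNameOfCurrent()
    {
        PropertyInfo current = typeof(IEnumerator<string>).GetProperty("Current")
            ?? throw new InvalidOperationException("Current property not found");

        MethodInfo getter = current.GetGetMethod()
            ?? throw new InvalidOperationException("Current has no public getter");

        // The getter is an ordinary method whose name is the property
        // name prefixed with "get_".
        return getter.Name;
    }
}
```

GetterNameOfCurrent() returns "get_Current", the same name that appears in the callvirt instruction above.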

The next IL instruction - IL_0018: nop - is also there to support the debugger. Thereafter, we re-load the enumerator, call its bool MoveNext() method, and branch based on the result of calling MoveNext():

  1. true: brtrue.s jumps back to IL_000b and continues executing the loop
  2. false: leave.s exits the try block, which runs the finally handler and then continues at IL_002e: ret
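To watch the order of these calls without reading IL, you can hand foreach an enumerator that records what it is asked to do. The sketch below is illustrative only: LoggingEnumerable, its nested enumerator, and CallOrderDemo are names I made up for it.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical collection that records every call foreach makes.
sealed class LoggingEnumerable : IEnumerable<string>
{
    private readonly string[] _items;
    public List<string> Log { get; } = new List<string>();

    public LoggingEnumerable(params string[] items) => _items = items;

    public IEnumerator<string> GetEnumerator()
    {
        Log.Add("GetEnumerator");
        return new LoggingEnumerator(_items, Log);
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    private sealed class LoggingEnumerator : IEnumerator<string>
    {
        private readonly string[] _items;
        private readonly List<string> _log;
        private int _index = -1;

        public LoggingEnumerator(string[] items, List<string> log)
        {
            _items = items;
            _log = log;
        }

        public string Current
        {
            get { _log.Add("Current"); return _items[_index]; }
        }

        object IEnumerator.Current => Current;

        public bool MoveNext()
        {
            _log.Add("MoveNext");
            return ++_index < _items.Length;
        }

        public void Reset() => _index = -1;

        public void Dispose() => _log.Add("Dispose");
    }
}

static class CallOrderDemo
{
    public static string Run()
    {
        var source = new LoggingEnumerable("a", "b");
        var seen = new List<string>();
        foreach (var item in source)
            seen.Add(item);
        return string.Join(", ", source.Log);
    }
}
```

For the two-element input, Run() returns "GetEnumerator, MoveNext, Current, MoveNext, Current, MoveNext, Dispose": MoveNext() always runs before Current, and the final MoveNext() that returns false is followed by Dispose().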

Finally, we enter the finally block (pun intended):

  finally
  {
    IL_0023:  ldloc.0
    IL_0024:  brfalse.s  IL_002d
    IL_0026:  ldloc.0
    IL_0027:  callvirt   instance void [System.Runtime]System.IDisposable::Dispose()
    IL_002c:  nop
    IL_002d:  endfinally
  }  // end handler

This all looks straightforward, but there’s actually a hidden gem in here: IL_0024: brfalse.s IL_002d. The documentation for this instruction says:

Transfers control to a target instruction if value is false, a null reference, or zero.

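Thus, after loading the local variable, a null check is issued before any other work is performed. We then re-load the local, and call the IDisposable.Dispose() implementation. Interestingly, as described above, the callvirt opcode will issue its own null check, but that one throws a NullReferenceException, whereas brfalse.s simply skips the Dispose() call. The obvious reasons for why the compiler would issue its own null check first are:

  1. The enumerator is not proven to be non-null: GetEnumerator() is an interface call, and nothing stops an implementation from returning null
  2. The check is not a Debug-only artifact (Debug builds pass the -optimize- compiler flag): the Release build below keeps it, and the foreach expansion in the language specification guards the Dispose() call with a null test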
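To see why the compiler cannot simply assume the enumerator is non-null, consider a badly behaved collection. The following sketch uses invented names (NullEnumerable, NullEnumeratorDemo) and returns null from GetEnumerator() on purpose. The MoveNext() call inside the try block throws, and the finally handler then runs with a null local, which is the case the brfalse.s guard covers.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical, deliberately broken collection: it returns a null enumerator.
sealed class NullEnumerable : IEnumerable<string>
{
    public IEnumerator<string> GetEnumerator() => null;
    IEnumerator IEnumerable.GetEnumerator() => null;
}

static class NullEnumeratorDemo
{
    public static string Describe(IEnumerable<string> strings)
    {
        try
        {
            foreach (var str in strings)
                Console.WriteLine(str);
            return "completed";
        }
        catch (NullReferenceException)
        {
            // Thrown by the callvirt to MoveNext() on the null enumerator.
            return "NullReferenceException";
        }
    }
}
```

Describe(new NullEnumerable()) returns "NullReferenceException", while Describe(new List<string>()) returns "completed".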

The remainder of the method is fairly self-explanatory… Just finish the method and leave.
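Putting the whole walkthrough together, the IL corresponds roughly to the hand-written C# below. Treat it as my own sketch of the expansion rather than the compiler’s literal output; ForeachLowering and PrintStuffLowered are names I chose for the illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class showing a manual equivalent of the foreach loop.
static class ForeachLowering
{
    public static void PrintStuffLowered(IEnumerable<string> strings)
    {
        // IL_0003 / IL_0008: call GetEnumerator() and store it in V_0.
        IEnumerator<string> enumerator = strings.GetEnumerator();
        try
        {
            // The condition is evaluated first, matching the initial br.s
            // to the MoveNext() call.
            while (enumerator.MoveNext())
            {
                string str = enumerator.Current; // get_Current(), stored in V_1
                Console.WriteLine(str);
            }
        }
        finally
        {
            // brfalse.s: skip Dispose() when the enumerator is null.
            if (enumerator != null)
                enumerator.Dispose();
        }
    }
}
```

Calling ForeachLowering.PrintStuffLowered(new[] { "a", "b" }) prints each string on its own line, just as the foreach version does.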

Now, with all this review of a Debug target, let’s have a look at a Release target:

.method private hidebysig static void  PrintStuff(class [System.Runtime]System.Collections.Generic.IEnumerable`1<string> strings) cil managed
{
  // Code size       41 (0x29)
  .maxstack  1
  .locals init (class [System.Runtime]System.Collections.Generic.IEnumerator`1<string> V_0)
  IL_0000:  ldarg.0
  IL_0001:  callvirt   instance class [System.Runtime]System.Collections.Generic.IEnumerator`1<!0> class [System.Runtime]System.Collections.Generic.IEnumerable`1<string>::GetEnumerator()
  IL_0006:  stloc.0
  .try
  {
    IL_0007:  br.s       IL_0014
    IL_0009:  ldloc.0
    IL_000a:  callvirt   instance !0 class [System.Runtime]System.Collections.Generic.IEnumerator`1<string>::get_Current()
    IL_000f:  call       void [System.Console]System.Console::WriteLine(string)
    IL_0014:  ldloc.0
    IL_0015:  callvirt   instance bool [System.Runtime]System.Collections.IEnumerator::MoveNext()
    IL_001a:  brtrue.s   IL_0009
    IL_001c:  leave.s    IL_0028
  }  // end .try
  finally
  {
    IL_001e:  ldloc.0
    IL_001f:  brfalse.s  IL_0027
    IL_0021:  ldloc.0
    IL_0022:  callvirt   instance void [System.Runtime]System.IDisposable::Dispose()
    IL_0027:  endfinally
  }  // end handler
  IL_0028:  ret
} // end of method Program::PrintStuff

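As you can see, the Release build is a tiny bit smaller than the Debug build (0x29 versus 0x2f bytes, or 41 versus 47). Four of the six bytes saved come from eliminating the nop instructions. The other two come from dropping the stloc.1/ldloc.1 pair: the Release build passes the result of get_Current() straight to Console.WriteLine, so it no longer needs the V_1 local at all.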

This concludes Part 1 of our evaluation of C# iterators. In Part 2, we’ll look at how the compiler optimizes the foreach loop when it can prove the type being iterated is an array. Thanks for reading!