Iterators are very useful, but in the past they've been a bit of a nuisance to write.
It wasn't difficult as such, but you always needed an extra class to store the state of
where you'd got up to in the collection, etc. yield statements allow you to write
iterators "inline" in a single method, with the compiler doing all the hard work of
keeping track of state behind the scenes.
yield statements are only valid in a method/operator/property which returns one of
IEnumerable, IEnumerable<T>, IEnumerator or IEnumerator<T>. You can't mix and match - if
a member uses yield statements, it can't use normal return statements too.
There are two types of yield statements - yield return (for returning the next item) and
yield break (to signify the end of the iterator). Here's virtually the simplest example
possible. (I've chosen not to use the generic form of IEnumerable to avoid confusion if
you're not familiar with generics yet.)
    using System;
    using System.Collections;

    class Test
    {
        static void Main(string[] args)
        {
            foreach (string x in Foo())
            {
                Console.WriteLine(x);
            }
        }

        static IEnumerable Foo()
        {
            yield return "Hello";
            yield return "there";
        }
    }
The result is:
    Hello
    there
The important thing to understand is that although Foo() is only called once, the compiler
has effectively built a state machine - the yield return "there"; statement is only
executed after "Hello" has already been printed on the screen. Every time MoveNext() is
called on the iterator (in this case MoveNext() is called implicitly by the foreach
statement), execution continues from where it had got to in what we've declared as the
Foo() method, until it next reaches a yield statement.
If you're familiar with coroutines from other languages, that's effectively what is going
on here - it's just that the compiler has done all the hard work.
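One way to see the state machine at work is to add some diagnostic output to Foo() and call
MoveNext() explicitly rather than relying on foreach. This is only an illustrative sketch:
the LazyDemo class and the extra Console.WriteLine calls are additions made for this
demonstration and are not part of the example above.

```csharp
using System;
using System.Collections;

class LazyDemo
{
    static IEnumerable Foo()
    {
        Console.WriteLine("Foo: started");
        yield return "Hello";
        Console.WriteLine("Foo: resumed after the first item");
        yield return "there";
        Console.WriteLine("Foo: reached the end");
    }

    public static void Main()
    {
        // Calling Foo() runs none of its body; it only creates the state machine.
        IEnumerable sequence = Foo();
        Console.WriteLine("Main: Foo() has been called");

        // Each MoveNext() call runs the body up to the next yield statement.
        IEnumerator iterator = sequence.GetEnumerator();
        while (iterator.MoveNext())
        {
            Console.WriteLine("Main: received " + iterator.Current);
        }
    }
}
```

The output interleaves the two methods, which shows that the body of Foo() only runs on demand:

    Main: Foo() has been called
    Foo: started
    Main: received Hello
    Foo: resumed after the first item
    Main: received there
    Foo: reached the end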
Within the method, you can use perfectly normal code, with a few restrictions - you can't
put a yield statement in a finally block, you can't put a yield return statement in a try
block if there's a catch block, and you can't use unsafe code. However, you can use normal
looping, access other variables etc. Here's an example, this time implementing IEnumerable:
    using System;
    using System.Collections;

    class Test
    {
        static void Main(string[] args)
        {
            NameAndPlaces t = new NameAndPlaces("Jon", new string[]{"London", "Hereford", "Cambridge", "Reading"});
            foreach (string x in t)
            {
                Console.WriteLine(x);
            }
        }
    }

    public class NameAndPlaces : IEnumerable
    {
        string name;
        string[] places;

        public NameAndPlaces(string name, string[] places)
        {
            this.name = name;
            this.places = places;
        }

        public IEnumerator GetEnumerator()
        {
            yield return "My name is "+name;
            yield return "I have lived in: ";
            foreach (string place in places)
            {
                yield return place;
            }
        }
    }
The result is:
    My name is Jon
    I have lived in: 
    London
    Hereford
    Cambridge
    Reading
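The restrictions listed earlier still leave room for try/finally: a yield return statement
is allowed in a try block which has only a finally block. The sketch below assumes a
made-up Numbers() iterator and FinallyDemo class, purely for illustration. It also shows
that the finally block runs when the caller stops iterating early, because foreach disposes
of the iterator.

```csharp
using System;
using System.Collections;

class FinallyDemo
{
    // Hypothetical iterator, named only for this illustration.
    static IEnumerable Numbers()
    {
        try
        {
            // Allowed: this try block has a finally block but no catch block.
            yield return 1;
            yield return 2;
            yield return 3;
        }
        finally
        {
            Console.WriteLine("Cleaning up");
        }
    }

    public static void Main()
    {
        foreach (int n in Numbers())
        {
            Console.WriteLine(n);
            if (n == 2)
            {
                // Leaving the loop makes foreach dispose of the iterator,
                // which in turn executes the finally block above.
                break;
            }
        }
    }
}
```

This prints 1, 2 and then "Cleaning up"; the third item is never requested.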
As mentioned before, yield break is used to stop iterating. Usually this is not needed,
as you naturally reach the end of the iterator block. As well as stopping iterating,
yield break can also be used to create a simple "empty" iterator which doesn't yield
anything. If you had a completely empty method body, the compiler wouldn't know whether
you wanted to write an iterator block or a "normal" block (with normal return statements
etc). A single yield break; statement as the whole method body is enough to satisfy the
compiler.
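As a minimal sketch of such an "empty" iterator (the Nothing() method and EmptyDemo class
are names invented for this illustration):

```csharp
using System;
using System.Collections;

class EmptyDemo
{
    // The single yield break; marks this as an iterator block which yields nothing.
    static IEnumerable Nothing()
    {
        yield break;
    }

    public static void Main()
    {
        int count = 0;
        foreach (object item in Nothing())
        {
            count++;
        }
        Console.WriteLine("Items yielded: " + count);
    }
}
```

The loop body never executes, so this prints "Items yielded: 0".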
yield break can be useful if you want to stop iterating due to some external signal - the
user clicking on a "cancel" button for instance. Sometimes it is easier to stop the code
which is providing the data than the code which is requesting that data. In simple cases,
of course, you can just use while loops to only keep going while more data is really
wanted. In more complicated scenarios, however, that can make the code messy - yield break
ends the method abruptly in the same way that a normal return statement does, with no need
to make sure that every level of iteration checks whether or not to continue. Here's an
example:
    using System;
    using System.Collections;
    using System.Threading;

    class Test
    {
        static TripleCounter counter;

        static void Main(string[] args)
        {
            counter = new TripleCounter();
            new Thread(new ThreadStart(ShowCounter)).Start();

            // After 5 seconds, stop the counter
            Thread.Sleep(5000);
            counter.stop = true;
        }

        static void ShowCounter()
        {
            // This would keep going for a very long
            // time if the counter wasn't stopped!
            foreach (string count in counter)
            {
                Console.WriteLine(count);
            }
        }
    }

    public class TripleCounter : IEnumerable
    {
        // Of course in normal code we'd never use a public field...
        public volatile bool stop = false;

        public IEnumerator GetEnumerator()
        {
            for (int i=0; i < 10000; i++)
            {
                for (int j=0; j < 4; j++)
                {
                    for (int k=0; k < 4; k++)
                    {
                        Thread.Sleep(250);
                        if (stop)
                        {
                            yield break;
                        }
                        yield return string.Format("{0} {1} {2}", i, j, k);
                    }
                }
            }
        }
    }
As noted in the code, normally you'd never have a public field - it just makes the code
simpler in this case, so you can concentrate on the yield statements. One thread reads
values from the enumerator and the other thread just stops it after five seconds. Coding
this without yield break would involve each of the for loops checking whether or not the
loop ought to stop - and other situations could be even more complicated.
Behind the scenes, the compiler creates an extra nested type to store the state of the enumerator.
It should be noted that a hand-written enumerator may end up being significantly faster
than the compiler-generated one. However, in most cases I'd suggest that the iteration speed
is unlikely to be significant - and the solution using yield statements is likely
to be much easier to read and maintain than a custom solution. As ever, if you have performance
concerns, measure and compare different solutions.
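To give a feel for what that nested type saves you from writing, here is a rough
hand-written counterpart of the earlier Foo() iterator. It is a sketch only: the class
names HelloThere and HelloThereEnumerator and the state numbering are invented for
illustration, and the code the compiler really generates is more involved.

```csharp
using System;
using System.Collections;

// Hand-written counterpart of the Foo() iterator block: a rough sketch only.
class HelloThere : IEnumerable
{
    public IEnumerator GetEnumerator()
    {
        return new HelloThereEnumerator();
    }

    // The kind of extra class which yield statements make unnecessary.
    class HelloThereEnumerator : IEnumerator
    {
        // 0 = not started, 1 = on "Hello", 2 = on "there", 3 = finished
        int state = 0;
        object current;

        public object Current
        {
            get { return current; }
        }

        public bool MoveNext()
        {
            switch (state)
            {
                case 0:
                    current = "Hello";
                    state = 1;
                    return true;
                case 1:
                    current = "there";
                    state = 2;
                    return true;
                default:
                    state = 3;
                    return false;
            }
        }

        public void Reset()
        {
            // This sketch does not support restarting the iteration.
            throw new NotSupportedException();
        }
    }
}

class HandWrittenDemo
{
    public static void Main()
    {
        foreach (string x in new HelloThere())
        {
            Console.WriteLine(x);
        }
    }
}
```

This prints "Hello" and "there" on separate lines, just as the two-line iterator block
does, but every piece of state tracking has to be spelled out by hand.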
While iterators are usually used for collections of one sort or another, the yield
statement syntax makes other styles of programming possible too. For instance, the
Concurrency and Coordination Runtime under development by Microsoft uses iterators and
yield statements to make asynchronous execution much simpler to understand.