Thursday, 3 July 2008

Lazy nature of yield return

It is commonly believed that using the yield return mechanism in .NET 2.0 is ‘equivalent’ to building your own collection and returning it as an IEnumerable (ignoring performance, which I have never tested myself). However, today I realized that the behaviour is actually quite different. When you use yield return, the code that generates each element of the sequence runs only when that element is requested. When you build your own collection first, on the other hand, you generate all the elements up front and use them later. Let’s look at a simple example:

public IEnumerable ThrowExeptionForNullParameterYieldReturn(string argument)
{
    if (argument == null)
    {
        throw new ArgumentNullException("argument");
    }
    yield return 0;
}

[Test]
public void ExceptionWithYieldReturnIsConsumed()
{
    Assert.IsNotNull(ThrowExeptionForNullParameterYieldReturn(null));
}

Do you think this test will pass or fail? It seems quite obvious that it will fail with the ArgumentNullException before we even get to the first yield return statement, yes? Actually, not at all! Methods that use yield return are not ordinary methods that simply execute when you call them. The compiler turns the method body into an enumerator object, and calling the method only creates that object. This test will pass and get a valid IEnumerable object, and none of the code in ThrowExeptionForNullParameterYieldReturn is going to execute. Try it yourself: place a breakpoint on the first line of the function, run the test in the debugger, and notice that the breakpoint is never hit.

The code checking the argument runs lazily, during the first call to MoveNext() on the enumerator:

[Test]
[ExpectedException(
    ExceptionType = typeof(ArgumentNullException),
    ExpectedMessage = "Value cannot be null.\r\nParameter name: argument")]
public void ExceptionWithYieldReturnIsThrownInMoveNext()
{
    IEnumerable myStuff = ThrowExeptionForNullParameterYieldReturn(null);
    myStuff.GetEnumerator().MoveNext();
}

The test above passes, so this time the exception we expected really was thrown. With the traditional approach of building our own collection, we get the exception before we get hold of the IEnumerable, as expected:

public IEnumerable ThrowExeptionForNullParameterList(string argument)
{
    // Ordinary method: this check runs as soon as the method is called.
    if (argument == null)
    {
        throw new ArgumentNullException("argument");
    }
    List<int> result = new List<int>();
    result.Add(0);
    return result;
}

[Test]
[ExpectedException(
    ExceptionType = typeof(ArgumentNullException),
    ExpectedMessage = "Value cannot be null.\r\nParameter name: argument")]
public void ExceptionWithListIsThrownAtTheBeginning()
{
    ThrowExeptionForNullParameterList(null);
}
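If you want the argument check to run immediately but still generate the elements lazily, one common approach is to split the method in two: an ordinary method that validates, and a private iterator that does the yielding. The sketch below is only an illustration of that idea; the names EagerValidation, ValidateThenYield and YieldCore are made up for this example, and yielding argument.Length is just a stand-in for real work.

```csharp
using System;
using System.Collections.Generic;

public static class EagerValidation
{
    // Ordinary method (no yield return here), so the null check
    // runs as soon as the method is called.
    public static IEnumerable<int> ValidateThenYield(string argument)
    {
        if (argument == null)
        {
            throw new ArgumentNullException("argument");
        }
        return YieldCore(argument);
    }

    // Iterator method: its body still runs lazily,
    // starting with the first MoveNext() call.
    private static IEnumerable<int> YieldCore(string argument)
    {
        yield return argument.Length;
    }
}
```

With this split, calling EagerValidation.ValidateThenYield(null) throws straight away, just like the list-based version, while the elements themselves are still produced on demand.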

The fact that yield return produces a lazy sequence is quite important: it affects not only parameter validation but also any side effects your function has. Imagine logic like this:

IEnumerable GetCustomers()
{
    foreach (Customer customer in Customers)
    {
        if (customer.EligibleOrThrow)
        {
            yield return customer;
        }
    }
}

void Pay()
{
    IEnumerable eligibleCustomers = GetCustomers();
    foreach (Customer eligibleCustomer in eligibleCustomers)
    {
        eligibleCustomer.Pay(10000);
    }
}

So, who did your system pay if one of the customer.EligibleOrThrow checks throws?
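To make the difference concrete, here is a small self-contained sketch. Everything in it is hypothetical: the Customer class, its Broken flag and the Payroll helper are stand-ins invented for this example, not the real types from the snippet above. It compares paying straight from the lazy sequence with copying the sequence into a List first, which forces every eligibility check to run before any payment is made.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the Customer type used above.
public class Customer
{
    public bool Broken;   // when true, the eligibility check throws
    public int Paid;      // total amount paid so far

    public Customer(bool broken) { Broken = broken; }

    public bool EligibleOrThrow
    {
        get
        {
            if (Broken) throw new InvalidOperationException("eligibility check failed");
            return true;
        }
    }

    public void Pay(int amount) { Paid += amount; }
}

public static class Payroll
{
    // Lazy filter, same shape as GetCustomers() above.
    public static IEnumerable<Customer> GetCustomers(IEnumerable<Customer> customers)
    {
        foreach (Customer customer in customers)
        {
            if (customer.EligibleOrThrow)
            {
                yield return customer;
            }
        }
    }

    // Checks and payments interleave: customers before the failing one get paid.
    public static void PayLazily(IEnumerable<Customer> customers)
    {
        foreach (Customer customer in GetCustomers(customers))
        {
            customer.Pay(10000);
        }
    }

    // The List constructor enumerates the whole sequence first,
    // so a failing check throws before anyone is paid.
    public static void PayAfterChecking(IEnumerable<Customer> customers)
    {
        List<Customer> eligible = new List<Customer>(GetCustomers(customers));
        foreach (Customer customer in eligible)
        {
            customer.Pay(10000);
        }
    }
}
```

Given a good customer followed by a broken one, PayLazily pays the first customer and then throws, while PayAfterChecking throws without paying anybody. Which behaviour you want depends on your system, but it should be a deliberate choice.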
