Cache the results of expensive LINQ queries to improve performance and sidestep lazy evaluation. Suppose our code runs an expensive LINQ query that takes, for example, about 0.000300 seconds each time it executes. For a query that runs on every call, that is slow, and it can be made much faster.
We improve performance by forcing immediate evaluation. The coolest thing about LINQ is that a query doesn't do anything until you actually walk through its results. This can enhance performance, or severely hamper it. The following example shows a LINQ query similar to the one I was using.
// Example LINQ query that is lazily evaluated.
// The var keyword simplifies the syntax.
var groupList = from groupItem in _site.Pages
                orderby _site.Categories[groupItem.Category], groupItem.Title
                where groupItem.Visibility == VisibilityType.Regular
                group groupItem by _site.Categories[groupItem.Category];
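The statement above doesn't do any work on its own; it only defines the query. The query is evaluated "lazily" (deferred execution): the work happens when the results are enumerated, which here is in the next few lines of the function. For an in-memory query like this one, the ordering and grouping steps have to read the whole source before they can yield their first result, so most of the cost lands at the start of the first loop. Every new enumeration runs the query again from the start. This is where the query is evaluated.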
StringBuilder builder = new StringBuilder();
foreach (var group in groupList)
{
    // Query is evaluated now, when enumeration starts.
    builder.Append("String");
    foreach (SitePage page in group)
    {
        // The group's items were already collected by the outer loop.
        // Append to a StringBuilder.
        builder.Append("String");
    }
}
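To make the deferred behavior visible, here is a minimal sketch that counts how often the where predicate runs. The names DeferredDemo, IsVisible, and PredicateCalls are made up for this illustration, and a plain int array stands in for _site.Pages.

```csharp
using System;
using System.Linq;

static class DeferredDemo
{
    // Counts how many times the where predicate has run (illustrative only).
    public static int PredicateCalls;

    static bool IsVisible(int value)
    {
        PredicateCalls++;
        return value % 2 == 0;
    }

    public static int[] Run()
    {
        PredicateCalls = 0;
        int[] pages = { 1, 2, 3, 4, 5, 6 }; // stand-in for _site.Pages

        // Defining the query does not call the predicate.
        var query = from p in pages
                    where IsVisible(p)
                    orderby p descending
                    group p by p % 4;
        int afterDefinition = PredicateCalls;

        // The first enumeration runs the whole query.
        foreach (var g in query)
        {
            foreach (int p in g) { }
        }
        int afterFirstLoop = PredicateCalls;

        // A second enumeration runs the whole query again.
        foreach (var g in query) { }
        int afterSecondLoop = PredicateCalls;

        return new[] { afterDefinition, afterFirstLoop, afterSecondLoop };
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // prints "0, 6, 12"
    }
}
```

The predicate runs zero times at definition, six times for the first loop, and six more for the second. The inner loop over each group adds nothing, because the grouping step has already collected the items.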
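My first optimization attempt was to cache the query itself in an IEnumerable<IGrouping<string, SitePage>> field. This didn't work, because that field only holds the query definition, not its results, so the query still ran again on every enumeration. What does work is calling ToArray, which forces the LINQ query to be fully evaluated and stores the results in an array.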
/// <summary>
/// An example class containing the forced evaluation optimization.
/// </summary>
class SampleClass
{
    /// <summary>
    /// This is where the collection is cached.
    /// </summary>
    IGrouping<string, SitePage>[] _groupCache;

    /// <summary>
    /// Generate the HTML (contains the query string).
    /// </summary>
    public string GetSidebarString()
    {
        if (_groupCache == null)
        {
            _groupCache = (from groupItem in _site.Pages
                           orderby _site.Categories[groupItem.Category],
                               groupItem.Title
                           where groupItem.Visibility == VisibilityType.Regular
                           group groupItem by _site.Categories[groupItem.Category]
                           ).ToArray();
        }
        StringBuilder builder = new StringBuilder();
        foreach (var group in _groupCache)
        {
            builder.Append("...");
            foreach (SitePage page in group)
            {
                builder.Append("...");
            }
        }
        return builder.ToString();
    }
}
We declare an array of IGrouping and store it as a member variable. It is a cache of the LINQ query's results. Then, we only run the LINQ query when that array is null. This way, the query is evaluated exactly once per instance, and its results are stored in the member array. The trade-off is that the cache will not pick up later changes to the pages unless it is reset to null.
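The difference between caching the query and caching its results can be checked with a counter. This is a sketch under assumptions: CacheComparison and its members are hypothetical names, and a small string array replaces the site's pages.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class CacheComparison
{
    readonly string[] _titles = { "b", "a", "cc" }; // stand-in data
    public int PredicateCalls;

    IEnumerable<IGrouping<int, string>> _queryCache; // holds only the query
    IGrouping<int, string>[] _arrayCache;            // holds the results

    bool Track(string title)
    {
        PredicateCalls++;
        return true;
    }

    IEnumerable<IGrouping<int, string>> BuildQuery()
    {
        return from t in _titles
               where Track(t)
               orderby t
               group t by t.Length;
    }

    // Caches the query object; every call still re-runs the query.
    public void UseCachedQuery()
    {
        if (_queryCache == null) _queryCache = BuildQuery();
        foreach (var g in _queryCache) { }
    }

    // Caches the evaluated results; the query runs on the first call only.
    public void UseCachedArray()
    {
        if (_arrayCache == null) _arrayCache = BuildQuery().ToArray();
        foreach (var g in _arrayCache) { }
    }

    static void Main()
    {
        var a = new CacheComparison();
        for (int i = 0; i < 3; i++) a.UseCachedQuery();

        var b = new CacheComparison();
        for (int i = 0; i < 3; i++) b.UseCachedArray();

        Console.WriteLine(a.PredicateCalls); // prints 9
        Console.WriteLine(b.PredicateCalls); // prints 3
    }
}
```

Three calls against the cached query evaluate the predicate nine times in total, while three calls against the cached array evaluate it only three times, all during the first call.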
In this particular situation, with about 70 objects in the SitePage array, I cut the time required for the query by a factor of roughly 4.5 (going by the measurements in the table below) by forcing immediate evaluation and caching the results. The end result isn't too ugly and doesn't look too much like a hack to me. In fact, it looks just about as graceful as the original.
| Version | Time required (seconds) |
| --- | --- |
| Lazy evaluation with var | 0.000250 |
| Cached array of IGrouping | 0.000055 |
The point here is to carefully dissect the behavior of LINQ and, in doing so, learn more about it and how to make it more useful. By micro-benchmarking, we can find out what's really happening in our code. If something is taking a bit too long, it may be doing something we are not aware of.
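For anyone who wants to repeat this kind of measurement, here is one possible micro-benchmark sketch using Stopwatch. It is not the harness used for the numbers above. QueryTiming and SecondsPerCall are hypothetical names, the data is 70 plain integers, and the timings will vary by machine and runtime.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

static class QueryTiming
{
    // Returns the average seconds per call of an action over many iterations.
    public static double SecondsPerCall(Action action, int iterations)
    {
        action(); // warm-up call, so JIT compilation is not measured
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        watch.Stop();
        return watch.Elapsed.TotalSeconds / iterations;
    }

    static void Main()
    {
        int[] pages = Enumerable.Range(0, 70).ToArray(); // stand-in data

        // Lazy query: re-evaluated on every enumeration.
        var lazy = from p in pages
                   orderby p % 7, p
                   where p % 3 != 0
                   group p by p % 7;

        // Forced evaluation: the results are stored once.
        var cached = lazy.ToArray();

        int sink = 0; // consumes results so the loops do real work
        double lazySeconds = SecondsPerCall(() =>
        {
            foreach (var g in lazy) foreach (int p in g) sink += p;
        }, 10000);
        double cachedSeconds = SecondsPerCall(() =>
        {
            foreach (var g in cached) foreach (int p in g) sink += p;
        }, 10000);

        Console.WriteLine("Lazy:   " + lazySeconds.ToString("F9"));
        Console.WriteLine("Cached: " + cachedSeconds.ToString("F9"));
    }
}
```

Averaging over many iterations matters here because a single run at this scale is too short to time reliably.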