Readability, Sub-Queries in LINQ
I’ve been working with LINQ and LINQ to SQL for a while now and I have to say, it definitely makes object enumeration, data access and binding extremely easy and efficient. Even the dynamic SQL it generates using LINQ to SQL is acceptable in efficiency and performance. One thing I noticed however is when you need to do complex sub queries or provide summary information within nested levels of relationships, the code can become pretty messy, until I discovered you can write LINQ statements within a generic type from inside an existing LINQ statement.
I’m sure somewhere in the documentation it says you can do this, but I just discovered it for myself, so no laughing if you’re thinking to yourself, “DUH; what a moron!”
Setting up the LINQ to SQL DBML
Below is an example that I set up using everyone’s favorite Northwind database (which had to be installed separately on SQL 2005 for good reason). First I created a simple console application targeted for the .NET 3.5 Framework and added a LINQ to SQL class (dbml) called “Northwind” which created my NorthwindDataContext class.

For this example I’m using the Employee, Order, and Order_Detail table in order to calculate employee commission at 10%. You could also use LINQ to Objects or LINQ to XML in this same fashion as I’m about to show below.
In the code, my first attempt at doing this calculation was pretty hard to read and involved nested Lambda expressions, which of and in themselves can be difficult to read.
using (NorthwindDataContext northwind = new NorthwindDataContext())
{
var employees = from e in northwind.Employees
select new
{
FirstName = e.FirstName,
LastName = e.LastName,
Title = e.Title,
Commission = e.Orders
.Sum(r => r.Order_Details
.Where(r1 => r1.Order.ShippedDate.HasValue && r1.Order.ShippedDate < DateTime.Today.AddDays(-1))
.Sum(r2 => (r2.Quantity * r2.UnitPrice) – Convert.ToDecimal(r2.Discount)) * .1M)
};
foreach (var emp in employees)
{
Console.WriteLine(“First Name\tLast Name\tTitle\tCommission”);
Console.WriteLine(“{0}\t{1}\t{2}\t{3:C}”,
emp.FirstName,
emp.LastName,
emp.Title,
emp.Commission);
}
}
Then, after looking at this code, I figured I would give a shot at selecting commission using another LINQ statement within the new generic class I was creating. The result looked like this:
using (NorthwindDataContext northwind = new NorthwindDataContext())
{
var employees = from e in northwind.Employees
select new
{
FirstName = e.FirstName,
LastName = e.LastName,
Title = e.Title,
Commission = (from t in e.Orders
where t.ShippedDate.HasValue && t.ShippedDate < DateTime.Today.AddDays(-1)
select t.Order_Details.Sum(r => (r.Quantity * r.UnitPrice) – Convert.ToDecimal(r.Discount)) * .1M).Sum(r => r)
};
foreach (var emp in employees)
{
Console.WriteLine(“First Name\tLast Name\tTitle\tCommission”);
Console.WriteLine(“{0}\t{1}\t{2}\t{3:C}”,
emp.FirstName,
emp.LastName,
emp.Title,
emp.Commission);
}
}
SQL Output
The output from the 2 above LINQ statements was as follows:
SELECT [t0].[FirstName], [t0].[LastName], [t0].[Title], (
SELECT SUM([t4].[value])
FROM (
SELECT ((
SELECT SUM([t3].[value])
FROM (
SELECT ((CONVERT(Decimal(29,4),[t2].[Quantity])) * [t2].[UnitPrice]) - (CONVERT(Decimal(29,4),[t2].[Discount])) AS [value], [t2].[OrderID]
FROM [dbo].[Order Details] AS [t2]
) AS [t3]
WHERE ([t1].[ShippedDate] IS NOT NULL) AND ([t1].[ShippedDate] < ‘2008-03-27 00:00:00.000′) AND ([t3].[OrderID] = [t1].[OrderID])
)) * 0.1 AS [value], [t1].[EmployeeID]
FROM [dbo].[Orders] AS [t1]
) AS [t4]
WHERE [t4].[EmployeeID] = [t0].[EmployeeID]
) AS [Commission]
FROM [dbo].[Employees] AS [t0]
Compared with the revised code:
SELECT [t0].[FirstName], [t0].[LastName], [t0].[Title], (
SELECT SUM([t4].[value])
FROM (
SELECT ((
SELECT SUM([t3].[value])
FROM (
SELECT ((CONVERT(Decimal(29,4),[t2].[Quantity])) * [t2].[UnitPrice]) - (CONVERT(Decimal(29,4),[t2].[Discount])) AS [value], [t2].[OrderID]
FROM [dbo].[Order Details] AS [t2]
) AS [t3]
WHERE [t3].[OrderID] = [t1].[OrderID]
)) * 0.1 AS [value], [t1].[ShippedDate], [t1].[EmployeeID]
FROM [dbo].[Orders] AS [t1]
) AS [t4]
WHERE ([t4].[ShippedDate] IS NOT NULL) AND ([t4].[ShippedDate] < ‘2008-03-27 00:00:00.000′) AND ([t4].[EmployeeID] = [t0].[EmployeeID])
) AS [Commission]
FROM [dbo].[Employees] AS [t0]
Conclusion
Both of these sets of code produce SQL that returns the exact same results from the database, however the joins that they use and indexes are very different, and the second one that is not only easier to read also shows lower consumption and more efficiency in the execution plan than the first. The first query took consistently between 334 and 386 miliseconds longer than the second, although both executed under one second after the queries were cached by SQL Server.

4 Responses Leave a comment
Me Like!
Thanks for the writeup. Linq2Sql is quite flexible. Didn’t think about doing it in the “select new{}” way. Good idea.
I was thinking about doing an “in” or “contains” style operator for seeing if a record contains information from another related table.. but this was good info to know.
If you talk about readability i would split it up into 2 queries
var qcommision = from t in e.Orders
where t.ShippedDate.HasValue && t.ShippedDate (r.Quantity * r.UnitPrice) – Convert.ToDecimal(r.Discount)) * .1M).Sum(r => r)
and then
var employees = from e in northwind.Employees
select new
{
FirstName = e.FirstName,
LastName = e.LastName,
Title = e.Title,
Commission = qcommision
}
haven’t tested this but it think it would work this way also.
I see the direction you’re going and I can’t argue that this may be more legible from a readability standpoint by breaking up the subquery as an IQueryable, then assigning the “var” to the Commission property in the actual select statement. My assumption (without validation of course) would be that you would end up with the exact same SQL output, which any optimizing tricks you can use when LINQ is generating your SQL and not you the better.
Of course, my idea of readability is putting everything humanly possible in a single statement (or single line if I can get away with it). I hate vertical space
. This was a good suggestion though, thanks!
Other variant is possible also