Optimizing join queries in Linq to Objects.

For many years working in SQL server environment and writing dozens of joins, I never cared which table needs to come first and which one comes second. The reason was SQL server would use table statistics to determine best way to join two tables. However we don't get this luxury when we are writing Linq to object queries in C# 3.0. For instance if you have two sequences in memory and you have to join those two sequences, than always use the smaller sequence on the right side of your query and the larger sequence on the left side. It makes a considerable difference in performance. To prove that there is a big difference in the time it takes for the query to execute I will put a small sequence on the right side and run the query to see how long it takes.

image

image

In the screen shot above I am creating two sequence with num2 being a small sequence. Than I am joining both the sequences with smaller sequence on the right hand side. Notice it took .214 sec to complete the entire query.  Changing the query to where we are using larger sequence on the right hand side causes the query execution time to increase to more than 3 sec. That is a huge difference. Below is an example of the expensive query.

image

 

image

No Comments