Loading recursive queries

Entity framework supports self referencing entity where an entity has a a reference to itself. A common example of that would be a category with subcategories. In the database you would typically see a category table with a ParentCategoryId column which represents the parent category of the current category. Figure below shows the Category table in the database.

image

When you import the above table in Entity Framework, you would see the association for category pointing back to itself. Figure below shows the entity.

image

Notice that we have an entity reference, ParentCategory  because a category belongs to a parent unless its a top level category. Also a category may have subcategories which can be accessed by SubCategories property.

Suppose we have a requirement where you want to load all categories any level deep for a given category. You can do that using several different approaches some are better then the other but you have to look at your scenario and do a test to see which one offers the optimal performance. List below shows some option.

1. Load all categories from the database and let relationship span fix the references.

2. use common table expression to load all the categories at once.

3. If you know the depth of your collection, you can do that many includes in your query.

I find the first approach , loading entire table and leveraging relationship span to offer the best performance.

Let’s go through each option and compare its pros and cons.

Retrieving entire table and fixing relationship with Relationship Span

I have found that if your table is generally small like 1000 records, it is easier to just do a query to return all the records from the database and as each record is brought into the object context, EF sees that a given category’s parentcategoryid points to a category which it is already tracking. Hence it just fixes the relationship and the experience seems as if you have retrieve the category and its subcategories in one single query.  Code below shows an example of loading all related categories for the top level parent category.

image

In the code above, I am retrieving all the categories in the memory and using the first operator to get the top level category which is the category that does not have a parent. After that i am printing out 3 level deep category hierarchy on the console window.

 image

The benefit to the above approach is the simplicity of the query. If you look at the query sent to the database, its simply a select * from categories table. However you pay the cost of sending more data over the network and doing fix up of categories you do need. If you have a fairly large table then this option wouldn’t be ideal.

Using Common Table Expression to load the hierarchy

You can create a stored procedure in the database that takes in the categoryid and returns all the nested categories any level. This is very efficient because you are only retrieving sub categories for the top level node you are interested in.  Code below shows the stored procedure that uses common table expression on the database to retrieve related categories.

image

To consume the above stored procedure we can use Add Function Import feature to import the stored procedure. This makes it available on the ObjectContext as a method that we can  call.  In the code below, we retrieve our parent category first and then we pass that categoryId to our GetCategories method which calls our stored procedure and returns all the categories for the categoryid passed in. Rest of the code prints out nested categories.

image

The draw back to this approach is you have a stored procedure that you have to maintain. Also you can’t further filter the query because you are using a stored procedure and any filtering applied would be done in memory. In the current version of EF4, there is no support for table valued function. Once that option becomes available, getting related categories with a function would be the biggest bank for your buck!

The above code also uses relationship span to fix the references but the benefit is, you are only loading categories you need in the object context. I am not quite sure but relationship span fixing does not work if you are using independent association. It only works with foreign key association.

Using Include to load the hierarchy

If you know deep your hierarchy is you can use that many includes to load the depth. However i would caution you that you will see a very nasty query generated which may sql server have a hard time running. Also you would receive redundant data meaning for each child category, the parent category information would be repeated. This would increase the size of the data travelling over the network. In the code below we use the Include to retrieve SubCategories 3 level deep.

db.Categories.Include("SubCategories.SubCategories.SubCategories").Single(c => c.CategoryId == 1);

Th query generated by the above linq query is shown below.

image

We covered several options to retrieve the depth i find that retrieving entire table and letting relationship span fix the reference offers me in in-between option where i don't have to maintain the stored procedure and the query sent to the database is very simple.

2 Comments

  • Hi,

    Great blog and post!
    But I have different situation than the ones you mention.

    I need to load an object similar to "Categories" that has a self-reference ("ParentID"). What I need to be able to do is to query for a category using one context and then querying for its sub categories using another context and connect the category with its children (load-on-demand situation). After this, I want to be able to mark some or all of the objects in the hierarchical structure as deleted and call ApplyChanges to another context. Is this even possible with EF and STEs?

  • what about this one for the first method:


    IList categoryList = new List();
    public IList GetCategoryListByParentID(int? categoryId)
    {

    var category = this.ObjectContext.Category.Include("SubCategories").FirstOrDefault(e => e.CategoryId == categoryId);
    categoryList.Add(category);
    foreach (var item in category.SubCategories)
    {
    GetCategoryListByParentID(item.CategoryId);
    }

    return CategoryList;
    }

Comments have been disabled for this content.