New "Orcas" Language Feature: Anonymous Types
Over the last two months I've published a series of posts covering some of the new language features that are coming as part of the Visual Studio and .NET Framework "Orcas" release. Here are pointers to the first four posts in my series:
- Automatic Properties, Object Initializers and Collection Initializers
- Extension Methods
- Lambda Expressions
- Query Syntax
Today's blog post covers the last new feature in my language series: Anonymous Types.
What are Anonymous Types?
Anonymous types are a convenient language feature of C# and VB that enables developers to concisely define inline CLR types within code, without having to explicitly define a formal class declaration of the type.
Anonymous types are particularly useful when querying and transforming/projecting/shaping data with LINQ.
Anonymous Type Example
In my previous Query Syntax blog post I demonstrated how you could transform data with projections. This is a powerful feature of LINQ that enables you to perform query operations on a data source (regardless of whether it is a database, an XML file, or an in-memory collection), and shape the results of the data being queried into a different structure/format than the original data source is in.
In that post I defined a custom "MyProduct" class to represent my transformed product data. Explicitly defining the "MyProduct" class gives me a formal CLR type contract that I can use to easily pass my custom-shaped product results between web-services or between multiple classes/assemblies within my application solution.
However, there are times when I just want to query and work with data within my current code scope, and I don't want to have to formally define an explicit class that represents my data in order to work with it. This is where anonymous types are very useful, as they allow you to concisely define a new type to use inline within your code.
For example, assume I use the LINQ to SQL object relational mapper designer within "Orcas" to model the "Northwind" database with classes like below:
I can then use the below code to query the Product data in my database, and use the projection/transformation capability of LINQ to custom shape the data result to be something other than the "Product" class above. Rather than use an explicitly defined "MyProduct" class to represent each custom-shaped row of data retrieved from the database, I can instead use the anonymous type feature to implicitly define a new type with 4 properties to represent my custom shaped data like so:
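A minimal sketch of this kind of query, run against an in-memory list rather than a LINQ to SQL data context so that it is self-contained. The Product class shown here, its UnitsSold field, and the sample rows are assumptions made purely for illustration; only the four projected property names come from the text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for the LINQ to SQL "Product" entity class.
public class Product
{
    public int ProductID { get; set; }
    public string ProductName { get; set; }
    public decimal UnitPrice { get; set; }
    public int UnitsSold { get; set; }   // assumed field, used only to compute revenue
}

public static class AnonymousProjectionSketch
{
    // Returns one formatted line per projected row so the result is easy to inspect.
    public static List<string> Describe(IEnumerable<Product> source)
    {
        // "select new { ... }" asks the compiler to generate a type with four properties.
        var products = from p in source
                       select new
                       {
                           Id = p.ProductID,
                           Name = p.ProductName,
                           UnitPrice = p.UnitPrice,
                           TotalRevenue = p.UnitPrice * p.UnitsSold
                       };

        var lines = new List<string>();
        foreach (var p in products)
        {
            // p is strongly typed: p.Id is an int, p.TotalRevenue is a decimal.
            lines.Add(string.Format("{0}: {1} earned {2}", p.Id, p.Name, p.TotalRevenue));
        }
        return lines;
    }

    public static void Main()
    {
        var source = new List<Product>
        {
            new Product { ProductID = 1, ProductName = "Chai", UnitPrice = 18m, UnitsSold = 10 },
            new Product { ProductID = 2, ProductName = "Chang", UnitPrice = 19m, UnitsSold = 5 }
        };

        foreach (var line in Describe(source))
        {
            Console.WriteLine(line);
        }
    }
}
```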
In the code above I'm declaring an anonymous type as part of the select clause within my LINQ expression, and am having the compiler automatically create the anonymous type with 4 properties (Id, Name, UnitPrice and TotalRevenue) - whose names and types are inferred from the shape of the query.
I'm then using the new "var" keyword within C# to programmatically refer to the IEnumerable<T> sequence of this anonymous type that is returned from the LINQ expression, as well as to refer to each of the anonymous type instances within this sequence when I programmatically loop over them within a foreach statement later in my code.
While this syntax gives me dynamic language-like flexibility, I also still retain the benefits of a strongly-typed language - including support for compile-time checking and code intellisense within Visual Studio. For example, notice above how I am doing a foreach over the returned products sequence and I am still able to get full code intellisense and compilation checking on the anonymous type with custom properties that was inferred from the LINQ query.
Understanding the Var Keyword
C# "Orcas" introduces a new var keyword that may be used in place of the type name when performing local variable declarations.
A common misconception that people often have when first seeing the new var keyword is to think that it is a late-bound or un-typed variable reference (for example: a reference of type Object or a late-bound object like in JavaScript). This is incorrect -- the var keyword always generates a strongly typed variable reference. Rather than require the developer to explicitly define the variable type, though, the var keyword instead tells the compiler to infer the type of the variable from the expression used to initialize the variable when it is first declared.
The var keyword can be used to reference any type in C# (meaning it can be used with both anonymous types and explicitly declared types). In fact, the easiest way to understand the var keyword is to look at a few examples of it using common explicit types. For example, I could use the var keyword like below to declare three variables:
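A small sketch of three such declarations. The variable names come from the text; the values are illustrative assumptions, and the helper method exists only so the inferred types can be printed.

```csharp
using System;

public static class VarInferenceSketch
{
    // Returns the runtime type names of three locals declared with "var".
    public static string[] InferredTypeNames()
    {
        var name = "Scott";  // compiler infers string
        var age = 30;        // compiler infers int (illustrative value)
        var male = true;     // compiler infers bool

        return new[] { name.GetType().Name, age.GetType().Name, male.GetType().Name };
    }

    public static void Main()
    {
        // Prints: String, Int32, Boolean
        Console.WriteLine(string.Join(", ", InferredTypeNames()));
    }
}
```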
The compiler will infer the type of the "name", "age" and "male" variables based on the type of their initial assignment value (in this case a string, an integer, and a boolean). This means it will generate IL that is absolutely identical to the code below:
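The explicitly typed form would look roughly like this. Again the values are illustrative assumptions; the point is only that each local is declared with the type the compiler would otherwise have inferred.

```csharp
using System;

public static class ExplicitTypeSketch
{
    // Returns the runtime type names of three explicitly typed locals.
    public static string[] DeclaredTypeNames()
    {
        string name = "Scott";
        int age = 30;        // illustrative value
        bool male = true;

        return new[] { name.GetType().Name, age.GetType().Name, male.GetType().Name };
    }

    public static void Main()
    {
        // Prints: String, Int32, Boolean
        Console.WriteLine(string.Join(", ", DeclaredTypeNames()));
    }
}
```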
The CLR actually never knows that the var keyword is being used - from its perspective there is absolutely no difference between the above two code examples. The first version is simply syntactic sugar provided by the compiler that saves the developer some keystrokes, and has the compiler do the work of inferring and declaring the type name.
In addition to using built-in datatypes with the var keyword, you can obviously also use any custom types you define. For example, I could go back to the LINQ query projection I did in my previous blog post that used an explicit "MyProduct" type for the data-shaping and adapt it to use the var keyword like so:
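A sketch of what that adaptation might look like, assuming simplified Product and MyProduct classes and an in-memory source in place of the database (all three are hypothetical stand-ins, not the classes from the earlier post):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the LINQ to SQL entity class.
public class Product
{
    public int ProductID { get; set; }
    public string ProductName { get; set; }
}

// Hypothetical explicitly declared projection type.
public class MyProduct
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class VarWithNamedTypeSketch
{
    public static List<string> Names(IEnumerable<Product> source)
    {
        // "var" is used here, but the projected type is the named MyProduct class.
        var products = from p in source
                       select new MyProduct { Id = p.ProductID, Name = p.ProductName };

        // This assignment compiles, which shows the sequence is a sequence of MyProduct.
        // (Against a LINQ to SQL source the inferred type would be IQueryable<MyProduct>,
        // which is also an IEnumerable<MyProduct>.)
        IEnumerable<MyProduct> sameSequence = products;

        var names = new List<string>();
        foreach (var p in sameSequence)   // "var p" is inferred as MyProduct
        {
            names.Add(p.Name);
        }
        return names;
    }

    public static void Main()
    {
        var source = new List<Product>
        {
            new Product { ProductID = 1, ProductName = "Chai" },
            new Product { ProductID = 2, ProductName = "Chang" }
        };

        foreach (var name in Names(source))
        {
            Console.WriteLine(name);
        }
    }
}
```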
Important: Although I'm using the "var" keyword above, I'm not using it with an anonymous type. My LINQ query is still shaping the returned data using the "MyProduct" type - which means that the "var products" declaration is simply a shorthand for "IEnumerable<MyProduct> products". Likewise, the "var p" variable I defined within my foreach statement is simply shorthand for a variable of type "MyProduct p".
Important Rule about the Var Keyword
Because the var keyword produces a strongly-typed variable declaration, the compiler needs to be able to infer the type to declare based on its usage. This means that you always need to assign an initial value when declaring a variable with var. The compiler will produce a compiler error if you don't:
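A sketch of the rule. The failing declaration is left as a comment so that the sample still compiles; the variable name and value are illustrative assumptions.

```csharp
using System;

public static class VarInitializerSketch
{
    public static int NameLength()
    {
        // var name;          // would not compile (error CS0818): nothing to infer the type from
        var name = "Scott";   // fine: the initializer is a string, so name is a string

        return name.Length;
    }

    public static void Main()
    {
        // Prints: 5
        Console.WriteLine(NameLength());
    }
}
```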
Declaring Anonymous Types
Now that we've introduced the "var" keyword, we can start to use it to refer to anonymous types.
Anonymous types in C# are defined using the same object initializer syntax I covered in my first blog post in this language series. The difference is that instead of declaring the type-name as part of the initialization grammar, when instantiating anonymous types you instead just leave the type-name blank after the "new" keyword:
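A sketch of such a declaration. The four property names follow the earlier query; the values are made up for illustration.

```csharp
using System;

public static class AnonymousTypeSketch
{
    public static string Describe()
    {
        // No type name after "new": the compiler generates a type with four properties.
        var product = new
        {
            Id = 1,                  // inferred as int
            Name = "Chai",           // inferred as string
            UnitPrice = 18.00m,      // inferred as decimal
            TotalRevenue = 1800.00m  // inferred as decimal (illustrative value)
        };

        // Property access is checked at compile time, just as with a named class.
        return string.Format("{0} / {1} / Id is {2}",
                             product.Id, product.Name, product.Id.GetType().Name);
    }

    public static void Main()
    {
        // Prints: 1 / Chai / Id is Int32
        Console.WriteLine(Describe());
    }
}
```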
The compiler will parse the above syntax and automatically define a new standard CLR type that has 4 properties. The types of each of the 4 properties are determined based on the type of the initialization values being assigned to them (for example: in the sample above the "Id" property is being assigned an integer - so the compiler will generate the property to be of type integer).
The actual CLR name of the anonymous type will automatically be generated by the C# compiler. The CLR itself actually doesn't know the difference between an anonymous type and a named type - so the runtime semantics of the two are absolutely identical. Bart De Smet has a good blog post here that details this if you want to see the exact class name pattern and IL generated.
Note above how when you type "product." on the anonymous type, you still get compile-time checking and full intellisense within Visual Studio. Notice also how the intellisense description indicates it is an "AnonymousType" - but still provides full declaration information of the properties (this is the text circled in red).
Using Anonymous Types for Hierarchical Shaping
One of the powerful scenarios that anonymous types make easy is performing hierarchical shape projections of data with a minimum amount of code.
For example, I could write the below LINQ expression to query all products from the Northwind database whose price is greater than $50, and then shape the returned products in a hierarchical structure grouped and sorted by each product's stock reorder level (using the "group into" clause supported by LINQ query syntax):
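A sketch of a query along these lines, written as a console program over an in-memory list instead of the Northwind database. The Product class, the sample rows, and the choice to order the groups by their key are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for the Northwind "Product" entity.
public class Product
{
    public string ProductName { get; set; }
    public decimal UnitPrice { get; set; }
    public int ReorderLevel { get; set; }
}

public static class GroupIntoSketch
{
    public static List<string> Describe(IEnumerable<Product> source)
    {
        // Each result is an anonymous type holding a key and the products in that group.
        var levels = from p in source
                     where p.UnitPrice > 50
                     group p by p.ReorderLevel into g
                     orderby g.Key
                     select new { ReorderLevel = g.Key, Products = g };

        var lines = new List<string>();
        foreach (var level in levels)
        {
            lines.Add("Reorder level " + level.ReorderLevel);
            foreach (var p in level.Products)   // nested collection is strongly typed too
            {
                lines.Add("  " + p.ProductName);
            }
        }
        return lines;
    }

    public static void Main()
    {
        var source = new List<Product>
        {
            new Product { ProductName = "A", UnitPrice = 60m, ReorderLevel = 0 },
            new Product { ProductName = "B", UnitPrice = 80m, ReorderLevel = 15 },
            new Product { ProductName = "C", UnitPrice = 55m, ReorderLevel = 0 },
            new Product { ProductName = "D", UnitPrice = 10m, ReorderLevel = 15 }  // filtered out
        };

        foreach (var line in Describe(source))
        {
            Console.WriteLine(line);
        }
    }
}
```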
When the above code is run in ASP.NET, I'll get the below output rendered in my browser:
I could likewise do nice hierarchical shapings based on JOIN results. For example, the below code creates a new anonymous type with some standard product column properties, as well as a hierarchical sub-collection property that contains the orderdetails of the 5 most recent orders that customers have placed for that particular product:
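A sketch of this shape using a group join over two in-memory lists. The Product and OrderDetail classes are simplified, hypothetical stand-ins; in particular, putting OrderDate directly on OrderDetail is an assumption made to keep the sample short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-ins for the Northwind entities.
public class Product
{
    public int ProductID { get; set; }
    public string ProductName { get; set; }
}

public class OrderDetail
{
    public int ProductID { get; set; }
    public int Quantity { get; set; }
    public DateTime OrderDate { get; set; }   // assumed to live here for brevity
}

public static class HierarchicalJoinSketch
{
    public static List<string> Describe(IEnumerable<Product> products,
                                        IEnumerable<OrderDetail> details)
    {
        // "join ... into" gathers each product's order details into a sub-collection.
        var query = from p in products
                    join d in details on p.ProductID equals d.ProductID into productOrders
                    select new
                    {
                        ID = p.ProductID,
                        Name = p.ProductName,
                        LastFiveOrders = productOrders
                            .OrderByDescending(d => d.OrderDate)
                            .Take(5)
                    };

        var lines = new List<string>();
        foreach (var p in query)
        {
            lines.Add(p.Name);
            foreach (var d in p.LastFiveOrders)   // drill into the nested collection
            {
                lines.Add("  quantity " + d.Quantity);
            }
        }
        return lines;
    }

    public static void Main()
    {
        var products = new List<Product> { new Product { ProductID = 1, ProductName = "Chai" } };

        // Six sample orders, so that Take(5) visibly drops the oldest one.
        var details = new List<OrderDetail>();
        for (int day = 1; day <= 6; day++)
        {
            details.Add(new OrderDetail
            {
                ProductID = 1,
                Quantity = day * 10,
                OrderDate = new DateTime(2007, 5, day)
            });
        }

        foreach (var line in Describe(products, details))
        {
            Console.WriteLine(line);
        }
    }
}
```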
Notice how I can neatly traverse the hierarchical data. Above I'm looping over the product query, and then drilling into the collection of the last 5 orders for each product. As you can see, I have full intellisense and compile-time checking everywhere (even on properties of objects within the nested sub-collection of order details on the anonymous type).
Data Binding Anonymous Types
As I mentioned earlier in this blog post, there is absolutely no difference from a CLR perspective between an anonymous type and an explicitly defined/named type. Anonymous types and the var keyword are purely "syntactic sugar" that saves you from having to type code - the runtime semantics are the same as using explicitly defined types.
Among other things, this means that all of the standard .NET type reflection features work with anonymous types - which means that features like databinding to UI controls work just fine with them. For example, if I wanted to display the results of my previous hierarchical LINQ query, I could define an <asp:gridview> control within a .aspx page like below:
The .aspx above contains a gridview with 2 standard boundfield columns, and one templated field column that contains a nested <asp:bulletedlist> control that I'll use to display the product's hierarchical orderdetail sub-results.
I could then write the below LINQ code to perform my hierarchical query against the database and databind the custom-shaped results against the GridView to display:
Because the GridView supports binding against any IEnumerable<T> sequence, and uses reflection to retrieve property values, it will work just fine against the anonymous type I'm using above.
At runtime the above code will produce a simple grid of product details with a hierarchical list of their recent order quantities like so:
Obviously you could make this report much richer and prettier - but hopefully you get the idea of how easy it is to now perform hierarchical queries against a database, shape the returned results however you want, and then either work against the results programmatically or databind them to UI controls.
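The point above about reflection can be sketched outside of ASP.NET. The helper below is a hypothetical illustration of reading an anonymous type's properties by reflection; it is not how GridView itself is implemented.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class ReflectionSketch
{
    // Reads every public property of an object without knowing its type at compile time.
    public static List<string> ReadProperties(object item)
    {
        var lines = new List<string>();
        foreach (PropertyInfo property in item.GetType().GetProperties())
        {
            lines.Add(property.Name + " = " + property.GetValue(item, null));
        }
        return lines;
    }

    public static void Main()
    {
        // The anonymous type exposes ordinary public properties to reflection.
        var product = new { Id = 1, Name = "Chai" };

        foreach (var line in ReadProperties(product))
        {
            Console.WriteLine(line);
        }
    }
}
```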
Summary
Anonymous types are a convenient language feature that enables developers to concisely define inline CLR types within code, without having to explicitly provide a formal class declaration of the type. Although they can be used in lots of scenarios, they are particularly useful when querying and transforming/shaping data with LINQ.
This post concludes my 5-part language series for "Orcas". Going forward I'll be doing many more LINQ posts that will demonstrate how to actually take advantage of all of these new language features to perform common data access operations (defining data models, querying, updating, using sprocs, validation, etc). I wanted to get this 5-part language series done first, though, so that you'll have a good way to really understand the underlying language constructs as we drill into scenarios within my upcoming posts.
Hope this has helped,
Scott