New "Orcas" Language Feature: Extension Methods
Last week I started the first in a series of blog posts I'll be making that cover some of the new VB and C# language features that are coming as part of the Visual Studio and .NET Framework "Orcas" release later this year.
My last blog post covered the new Automatic Properties, Object Initializer and Collection Initializer features. If you haven't read my previous post yet, please read it here. Today's blog post covers a much more significant new feature that is available with both VB and C#: Extension Methods.
What are Extension Methods?
Extension methods allow developers to add new methods to the public contract of an existing CLR type, without having to sub-class it or recompile the original type. Extension Methods help blend the flexibility of "duck typing" support popular within dynamic languages today with the performance and compile-time validation of strongly-typed languages.
Extension Methods enable a variety of useful scenarios, and help make possible the really powerful LINQ query framework that is being introduced with .NET as part of the "Orcas" release.
Simple Extension Method Example:
Ever wanted to check to see whether a string variable is a valid email address? Today you'd probably implement this by calling a separate class (probably with a static method) to check to see whether the string is valid. For example, something like:
if ( EmailValidator.IsValid(email) ) {
}
Using the new "extension method" language feature in C# and VB, I can instead add a useful "IsValidEmailAddress()" method onto the string class itself, which returns whether the string instance is a valid email address or not. I can then re-write my code to be cleaner and more descriptive like so:
if ( email.IsValidEmailAddress() ) {
}
How did we add this new IsValidEmailAddress() method to the existing string type? We did it by defining a static class with a static method containing our "IsValidEmailAddress" extension method like below:
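using System.Text.RegularExpressions;

// The class name is illustrative; any static class can host extension methods.
public static class StringExtensions
{
    // "this" on the first parameter makes the method callable on any string.
    public static bool IsValidEmailAddress(this string s)
    {
        Regex regex = new Regex(@"^[\w-\.]+@([\w-]+\.)+[\w-]{2,4}$");
        return regex.IsMatch(s);
    }
}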
Note how the static method above has a "this" keyword before the first parameter argument of type string. This tells the compiler that this particular Extension Method should be added to objects of type "string". Within the IsValidEmailAddress() method implementation I can then access all of the public properties/methods/events of the actual string instance that the method is being called on, and return true/false depending on whether it is a valid email or not.
To add this specific Extension Method implementation to string instances within my code, I simply use a standard "using" statement to import the namespace containing the extension method implementation:
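A minimal sketch of that import, assuming the extension method lives in a namespace called MyExtensions and a class called StringExtensions (both names are hypothetical, chosen only for illustration). It also shows that the extension-style call is compiled as an ordinary static method call, so the two forms below are equivalent:

```csharp
using System;
using System.Text.RegularExpressions;
using MyExtensions; // hypothetical namespace; importing it brings the extension method into scope

namespace MyExtensions
{
    // Hypothetical class name; extension methods must live in a static class.
    public static class StringExtensions
    {
        public static bool IsValidEmailAddress(this string s)
        {
            // Same pattern as the sample in the post.
            return Regex.IsMatch(s, @"^[\w-\.]+@([\w-]+\.)+[\w-]{2,4}$");
        }
    }
}

public static class Program
{
    public static void Main()
    {
        string email = "someone@example.com";

        // Extension-method syntax, resolved through the using directive above.
        bool viaExtension = email.IsValidEmailAddress();

        // The equivalent plain static call.
        bool viaStatic = StringExtensions.IsValidEmailAddress(email);

        Console.WriteLine(viaExtension == viaStatic);
    }
}
```

Without the using directive, the extension-style call would not compile, while the fully qualified static call would still work.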
The compiler will then correctly resolve the IsValidEmailAddress() method on any string. C# and VB in the public "Orcas" March CTP now provide full intellisense support for extension methods within the Visual Studio code-editor. So when I type the "." character after a string variable, my extension methods will now show up in the intellisense drop-down list.
The VB and C# compilers also naturally give you compile-time checking of all Extension Method usage - meaning you'll get a compile-time error if you mis-type or mis-use one.
[Credit: Thanks to David Hayden for first coming up with the IsValidEmailAddress scenario I used above in a prior blog post of his from last year.]
Extension Methods Scenarios Continued...
Leveraging the new extension method feature to add methods to individual types opens up a number of useful extensibility scenarios for developers. What makes Extension Methods really powerful, though, is their ability to be applied not just to individual types - but also to any parent base class or interface within the .NET Framework. This enables developers to build a variety of rich, composable, framework extensions that can be used across the .NET Framework.
For example, consider a scenario where I want an easy, descriptive, way to check whether an object is already included within a collection or array of objects. I could define a simple .In(collection) extension method that I want to add to all objects within .NET to enable this. I could implement this "In()" extension method within C# like so:
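One way such an "In()" method could be written is sketched below. The class name ObjectExtensions and the parameter name "sequence" are assumptions made for illustration, and this is a sketch rather than a definitive implementation:

```csharp
using System.Collections;

// Hypothetical class name; the method itself is what matters.
public static class ObjectExtensions
{
    // "this object o" attaches the method to every type, since all types derive from System.Object.
    public static bool In(this object o, IEnumerable sequence)
    {
        foreach (object item in sequence)
        {
            // object.Equals(a, b) is null-safe and defers to the item's own Equals override.
            if (object.Equals(item, o))
            {
                return true;
            }
        }
        return false;
    }
}
```

The non-generic IEnumerable parameter keeps the method usable with arrays as well as with both generic and non-generic collections.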
Note above how I've declared the first parameter to the extension method to be "this object o". This indicates that this extension method should be applied to all types that derive from the System.Object base type - which means I can now use it on every object in .NET.
The "In" method implementation above allows me to check to see whether a specific object is included within an IEnumerable sequence passed as an argument to the method. Because all .NET collections and arrays implement the IEnumerable interface, I now have a useful and descriptive method for checking whether any .NET object belongs to any .NET collection or array.
I could then use this "In()" extension method to see whether a particular string is within an array of strings:
I could use it to check to see whether a particular ASP.NET control is within a container control collection:
I could even use it with scalar datatypes like integers:
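The string and integer usages can be sketched as follows. The sample values and the ObjectExtensions class name are assumptions for illustration; the ASP.NET control case is left out because it needs a web page to run in, but the call would have the same shape (a control on the left, a control collection as the argument):

```csharp
using System;
using System.Collections;

// Hypothetical host class for the In() sketch.
public static class ObjectExtensions
{
    public static bool In(this object o, IEnumerable sequence)
    {
        foreach (object item in sequence)
        {
            if (object.Equals(item, o)) return true;
        }
        return false;
    }
}

public static class Program
{
    public static void Main()
    {
        // A string checked against an array of strings (made-up sample values).
        string[] names = { "Scott", "Bill", "Susanne" };
        bool nameFound = "Susanne".In(names);

        // An integer literal checked against an array of integers.
        int[] numbers = { 1, 2, 42 };
        bool numberFound = 42.In(numbers);

        Console.WriteLine(nameFound);
        Console.WriteLine(numberFound);
    }
}
```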
Note above how you can even use extension methods on base datatype values (like the integer value 42). Because the CLR supports automatic boxing/unboxing of value types, extension methods can be applied on numeric and other scalar datatypes directly.
As you can probably begin to see from the samples above, extension methods enable some really rich and descriptive extensibility scenarios. When applied against common base classes and interfaces across .NET, they enable some really nice domain specific framework and composition scenarios.
Built-in System.Linq Extension Methods
One of the built-in extension method libraries that we are shipping within .NET in the "Orcas" timeframe is a set of very powerful query extension method implementations that enable developers to easily query data. These extension method implementations live under the new "System.Linq" namespace, and define standard query operator extension methods that can be used by any .NET developer to easily query XML, Relational Databases, .NET objects that implement IEnumerable, and/or any other type of data structure.
A few of the advantages of using the extension method extensibility model for this query support include:
1) It enables a common query programming model and syntax that can be used across all types of data (databases, XML files, in-memory objects, web-services, etc).
2) It is composable and allows developers to easily add new methods/operators into the query syntax. For example: we could use our custom "In()" method together with the standard "Where()" method defined by LINQ as part of a single query. Our custom In() method will look just as natural as the "standard" methods supplied under the System.Linq namespace.
3) It is extensible and allows any type of data provider to be used with it. For example: an existing ORM engine like NHibernate or LLBLGen could implement the LINQ standard query operators to enable LINQ queries against their existing ORM implementation and mapping engines. This will enable developers to learn a common way to query data, and then apply the same skills against a wide variety of rich data store implementations.
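The composability described in point 2 can be sketched by calling a custom In() method inside the standard Where() operator. The In() implementation, the class name, and the sample data here are all assumptions for illustration:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical host class for the custom In() sketch.
public static class ObjectExtensions
{
    public static bool In(this object o, IEnumerable sequence)
    {
        foreach (object item in sequence)
        {
            if (object.Equals(item, o)) return true;
        }
        return false;
    }
}

public static class Program
{
    public static void Main()
    {
        // Made-up sample data.
        string[] invited = { "Scott", "Susanne" };
        string[] arrivals = { "Bill", "Susanne", "Scott" };

        // The custom In() sits inside the standard Where() like any built-in operator.
        IEnumerable<string> expectedGuests = arrivals.Where(name => name.In(invited));

        foreach (string name in expectedGuests)
        {
            Console.WriteLine(name);
        }
    }
}
```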
I'll be walking through LINQ much more over the next few weeks, but wanted to leave you with a few samples that show how to use a few of the built-in LINQ query extension methods with different types of data:
Scenario 1: Using LINQ Extension Methods Against In-Memory .NET Objects
Assume we have defined a class to represent a "Person" like so:
I could then use the new object Initializer and collection Initializer features to create and populate a collection of "people" like so:
I could then use the standard "Where()" extension method provided by System.Linq to retrieve a sequence of those "Person" objects within this collection whose FirstName starts with the letter "S" like so:
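Putting the three steps above together, a minimal sketch might look like the following. The Person class is reduced to the two properties the text relies on (FirstName and Age), and the third name and all ages are made-up sample values:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    // Automatic properties, as covered in the previous post.
    public string FirstName { get; set; }
    public int Age { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Collection initializer plus object initializers; the values are illustrative only.
        List<Person> people = new List<Person>
        {
            new Person { FirstName = "Scott", Age = 32 },
            new Person { FirstName = "Bill", Age = 50 },
            new Person { FirstName = "Susanne", Age = 32 }
        };

        // Where() is an extension method on IEnumerable<T> defined in System.Linq.
        IEnumerable<Person> query = people.Where(p => p.FirstName.StartsWith("S"));

        foreach (Person p in query)
        {
            Console.WriteLine(p.FirstName);
        }
    }
}
```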
The new p => syntax above is an example of a "Lambda expression", which is a more concise evolution of C# 2.0's anonymous method support. It lets us easily express a query filter with an argument (in this case we are indicating that we only want to return a sequence of those Person objects whose FirstName property starts with the letter "S"). The above query will then return 2 objects as part of the sequence (for Scott and Susanne).
I could also write code that takes advantage of the new "Average" and "Max" extension methods provided by System.Linq to determine the average age of the people in my collection, as well as the age of the oldest person like so:
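A small sketch of those two aggregate calls, again assuming a Person class with FirstName and Age properties and made-up ages:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public string FirstName { get; set; }
    public int Age { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Illustrative sample values only.
        List<Person> people = new List<Person>
        {
            new Person { FirstName = "Scott", Age = 32 },
            new Person { FirstName = "Bill", Age = 50 },
            new Person { FirstName = "Susanne", Age = 32 }
        };

        // Average() over an int selector returns a double; Max() returns an int.
        double averageAge = people.Average(p => p.Age);
        int maxAge = people.Max(p => p.Age);

        Console.WriteLine(averageAge);
        Console.WriteLine(maxAge);
    }
}
```

With these sample ages the average works out to 38 and the maximum to 50.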
Scenario 2: Using LINQ Extension Methods Against an XML File
It is probably rare that you manually create a collection of hard-coded data in-memory. More likely you'll retrieve the data either from an XML file, a database, or a web-service.
Let's assume we have an XML file on disk that contains the data below:
I could obviously use the existing System.Xml APIs today to either load this XML file into a DOM and access it, or use a low-level XmlReader API to manually parse it myself. Alternatively, with "Orcas" I can now use the System.Xml.Linq implementation that supports the standard LINQ extension methods (aka "XLINQ") to more elegantly parse and process the XML.
The below code-sample shows how to use LINQ to retrieve all of the <person> XML elements that have a first-name sub-node whose inner value starts with the letter "S":
Note that it uses the exact same Where() extension method as with the in-memory object sample. Right now it is returning a sequence of "XElement" elements, which is an un-typed XML node element. I could alternatively re-write the query to "shape" the data that is returned instead by using LINQ's Select() extension method and provide a Lambda expression that uses the new object initializer syntax to populate the same "Person" class that we used with our first in-memory collection example:
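Both queries can be sketched as below. The XML shape (a <people> root with <person>, <firstname> and <age> elements) and its values are assumptions for illustration, and the XML is parsed from an inline string so the sample is self-contained; XElement.Load would read the same content from a file on disk:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public class Person
{
    public string FirstName { get; set; }
    public int Age { get; set; }
}

public static class Program
{
    // Assumed XML shape and made-up values, standing in for the file on disk.
    private const string Xml = @"<people>
  <person><firstname>Scott</firstname><age>32</age></person>
  <person><firstname>Bill</firstname><age>50</age></person>
  <person><firstname>Susanne</firstname><age>32</age></person>
</people>";

    public static void Main()
    {
        XElement root = XElement.Parse(Xml);

        // Untyped result: a sequence of XElement nodes filtered with Where().
        IEnumerable<XElement> matches = root.Elements("person")
            .Where(p => p.Element("firstname").Value.StartsWith("S"));

        // Shaped result: Select() projects each node into a Person via an object initializer.
        IEnumerable<Person> people = matches.Select(p => new Person
        {
            FirstName = p.Element("firstname").Value,
            Age = (int)p.Element("age") // XElement defines an explicit conversion to int
        });

        foreach (Person person in people)
        {
            Console.WriteLine(person.FirstName + " " + person.Age);
        }
    }
}
```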
The above code does all the work necessary to open, parse and filter the XML in the "test.xml" file, and return back a strongly-typed sequence of Person objects. No mapping or persistence file is necessary to map the values - instead I am expressing the shaping from XML->objects directly within the LINQ query above.
I could also use the same Average() and Max() LINQ extension methods as before to calculate the average age of <person> elements within the XML file, as well as the maximum age like so:
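A sketch of the aggregate calls over XML, under the same assumed element names (<person> with an <age> child) and made-up values, parsed from an inline string rather than a file:

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class Program
{
    // Assumed XML shape and illustrative values.
    private const string Xml = @"<people>
  <person><firstname>Scott</firstname><age>32</age></person>
  <person><firstname>Bill</firstname><age>50</age></person>
  <person><firstname>Susanne</firstname><age>32</age></person>
</people>";

    public static void Main()
    {
        XElement root = XElement.Parse(Xml);

        // The selector converts each <age> element to int before aggregating.
        double averageAge = root.Elements("person").Average(p => (int)p.Element("age"));
        int maxAge = root.Elements("person").Max(p => (int)p.Element("age"));

        Console.WriteLine(averageAge);
        Console.WriteLine(maxAge);
    }
}
```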
I do not have to manually parse the XML file. Not only will XLINQ handle that for me, but it will parse the file using a low-level XmlReader and not have to create a DOM in order to evaluate the LINQ expression. This means that it is lightning fast and doesn't allocate much memory.
Scenario 3: Using LINQ Extension Methods Against a Database
Let's assume we have a SQL database that contains a table called "People" that has the following database schema:
I could use the new LINQ to SQL WYSIWYG ORM designer within Visual Studio to quickly create a "Person" class that maps to the database:
I can then use the same LINQ Where() extension method I used previously with objects and XML to retrieve a sequence of strongly-typed "Person" objects from the database whose first name starts with the letter "S":
Note how the query syntax is the same as with objects and XML.
I could then use the same LINQ Average() and Max() extension methods as before to retrieve the average and maximum age values from the database like so:
You don't need to write any SQL code yourself to have the above code snippets work. The LINQ to SQL object relational mapper provided with "Orcas" will handle retrieving, tracking and updating objects that map to your database schema and/or SPROCs. You can simply use any LINQ extension method to filter and shape the results, and LINQ to SQL will execute the SQL code necessary to retrieve the data (note: the Average and Max extension methods above obviously don't return all the rows from the table - they instead use TSQL aggregate functions to compute the values in the database and just return a scalar result).
Please watch this video I did in January to see how LINQ to SQL dramatically improves data productivity in "Orcas". In the video you can also see the new LINQ to SQL WYSIWYG ORM designer in action, as well as see full intellisense provided in the code-editor when writing LINQ code against the data model.
Summary
Hopefully the above post gives you a basic understanding of how extension methods work, and some of the cool extensibility approaches you will be able to take with them. As with any extensibility mechanism, I'd really caution against going overboard creating new extension methods to begin with. Just because you have a shiny new hammer doesn't mean that everything in the world has suddenly become a nail!
To get started trying out extension methods, I'd recommend first exploring the standard query operators provided within the System.Linq namespace in "Orcas". These enable rich query support against any array, collection, XML stream, or relational database, and can dramatically improve your productivity when working with data. I think you'll find they'll significantly reduce the amount of code you write within your applications, and allow you to write really clean and descriptive syntax. They'll also enable you to get automatic intellisense and compile-time checking of query logic within your code.
In the next few weeks I'll continue this series on new language features in "Orcas" and explore Anonymous Types and Type Inference, as well as talk more about Lambdas and other cool features. I'll also obviously be talking a lot more about LINQ.
[April 21st Update: I recently posted the next topic in this Orcas language series - covering Query Syntax -- here.]
Hope this helps,
Scott