Performance Statistics of Various Implementations of the Data Mapper Pattern
What started off as a quick how-to example of rehydrating business objects from data access layers (in reply to this Jay Kimble post) morphed into a whole lot more (thanks to Scott Hanselman’s DataSet post, and my thoughts on the topic). The idea was pretty simple: I wanted to see how long it took to rehydrate a multi-dimensional object graph from data retrieved from a database, using the most common methods in .Net. Going into this project I had some preconceived notions about how things would turn out, but I was very surprised at the results.
For those of you that are not pattern-aware (yet), the process you go through when converting data to business objects is known as the Data Mapper pattern. The Data Mapper layer handles the creation of your business objects (aka your domain) from data, and the business objects themselves hopefully don’t even know that there is a database (thanks to the data access layer). In a domain-less (aka no business objects) world the mapping step is skipped entirely and datasets are used instead, with the business rules all tied up in the presentation tier or down in the database. A slight advancement over the pure dataset method is strongly-typed datasets, but unless you modify the generated code, you cannot add business rules to your strongly-typed datasets (at least until .Net 2.0’s partial classes).
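Boiled down to code, the pattern is nothing more than a boundary like this (illustrative names only, not the ones from the project):

```csharp
using System.Collections;
using System.Data;

// The Data Mapper in a nutshell: raw data in, business objects out.
// Everything above this boundary works with domain objects and never
// sees a connection, a command, or a DataSet.
public interface IEmployeeMapper
{
    // Returns a collection of Employee business objects built from the data.
    ArrayList Map(IDataReader reader);
}
```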
In this project, I decided to use the Data Access Application Block (DAAB) v3.1 as the basis for my data access layer, since I know that a lot of people use it, and it saves me from having to explain my own data access layer (more on that another time). All I did was add my Connection String encryption project to it (to keep it a more enterprise-ready example) and add a service layer around the DAAB (to encapsulate the data access and hide it as much as possible from the business objects).
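The service layer is just a thin set of static methods over the DAAB. It looks roughly like this; the stored proc names are placeholders, and I’m assuming the stock SqlHelper overloads here rather than quoting the project code verbatim:

```csharp
using System.Data;
using System.Data.SqlClient;
using System.Xml;
using Microsoft.ApplicationBlocks.Data;   // DAAB SqlHelper

// A thin service layer over the DAAB. The proc names below are placeholders;
// in the real project the connection string comes out of the encrypted config.
public class EmployeeService
{
    private static readonly string connString = "...";  // from (encrypted) config

    // DataReader version - the caller is responsible for closing the reader.
    public static IDataReader GetEmployeesReader()
    {
        return SqlHelper.ExecuteReader(connString,
            CommandType.StoredProcedure, "GetEmployeeTerritories");
    }

    // DataSet version - fully disconnected once it is returned.
    public static DataSet GetEmployeesDataSet()
    {
        return SqlHelper.ExecuteDataset(connString,
            CommandType.StoredProcedure, "GetEmployeeTerritories");
    }

    // XmlReader version - uses the FOR XML AUTO flavor of the proc. The
    // XmlReader keeps the connection open, so the caller must close both.
    public static XmlReader GetEmployeesXml(SqlConnection connection)
    {
        return SqlHelper.ExecuteXmlReader(connection,
            CommandType.StoredProcedure, "GetEmployeeTerritoriesXml");
    }
}
```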
For the actual data mapping, the code really depends on only one thing: how you expose the data to the data mapper. There are three popular ways to pass data from the data access layer to the layers above: the DataReader, the DataSet, and the XmlReader. Both the DataReader and the DataSet require you to write the actual data mapping code by hand, but with the XmlReader you can use XML serialization to deserialize the XML straight into an instance of your business objects. There is one more way that I’ve been working on to pass data up from the data access layer, and that is by using XPathNavigators, so I added this as a fourth example. This is the implementation I was most interested in getting performance figures for.
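To make that concrete, here is a simplified version of the business objects being rehydrated. The real classes are in the download; these are trimmed down, and the serialization attribute names are just a guess that would have to line up with whatever the XML stream from the proc actually emits:

```csharp
using System;
using System.Xml;
using System.Xml.Serialization;

// Simplified business objects (this is .Net 1.1, so arrays instead of generics).
// The XmlRoot/XmlElement/XmlAttribute names are illustrative only - they must
// match the shape of the FOR XML stream for deserialization to work.
[XmlRoot("Employees")]
public class EmployeeList
{
    [XmlElement("Employee")]
    public Employee[] Employees;
}

public class Employee
{
    [XmlAttribute("Id")]        public int      Id;
    [XmlAttribute("LastName")]  public string   LastName;
    [XmlAttribute("FirstName")] public string   FirstName;
    [XmlAttribute("Title")]     public string   Title;
    [XmlAttribute("BirthDate")] public DateTime BirthDate;

    [XmlElement("Territory")]
    public Territory[] Territories;
}

public class Territory
{
    [XmlAttribute("Id")]          public string Id;
    [XmlAttribute("Description")] public string Description;

    [XmlElement("Region")]
    public Region Region;
}

public class Region
{
    [XmlAttribute("Id")]          public int    Id;
    [XmlAttribute("Description")] public string Description;
}

// With the XmlReader flavor there is no hand-written mapping at all:
// a cached XmlSerializer rehydrates the whole graph in one call.
public class XmlEmployeeMapper
{
    // Cache the XmlSerializer once; constructing it is expensive.
    private static readonly XmlSerializer serializer =
        new XmlSerializer(typeof(EmployeeList));

    public static EmployeeList Map(XmlReader xmlReader)
    {
        return (EmployeeList) serializer.Deserialize(xmlReader);
    }
}
```

The DataReader and DataSet flavors don’t get that shortcut; they have to walk the rows and build the graph by hand, which is what the mapping sketch after the stored procedure below shows.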
The sample database I used was the all-too-familiar Northwind database on SQL Server 2000 (with all the latest service packs). I created a cloned copy of the database and added a stored procedure to test against (well, actually two versions of the same proc: one “normal” proc, and one that returns an XML stream using the For Xml Auto clause). The proc returns all the employees, with their associated Territories and the Region for each Territory:
SELECT Employee.EmployeeID AS "Employee.Id",
       Employee.LastName AS "Employee.LastName",
       Employee.FirstName AS "Employee.FirstName",
       Employee.Title AS "Employee.Title",
       Employee.BirthDate AS "Employee.BirthDate",
       Territory.TerritoryID AS "Territory.Id",
       RTRIM(Territory.TerritoryDescription) AS "Territory.Description",
       Region.RegionID AS "Region.Id",
       RTRIM(Region.RegionDescription) AS "Region.Description"
FROM dbo.Employees Employee
INNER JOIN dbo.EmployeeTerritories et
    ON Employee.EmployeeID = et.EmployeeID
INNER JOIN dbo.Territories Territory
    ON et.TerritoryID = Territory.TerritoryID
INNER JOIN dbo.Region Region
    ON Region.RegionID = Territory.RegionID
ORDER BY Employee.EmployeeID, Territory.TerritoryID, Region.RegionID;
This gives me a nice three-level object graph (Employee, Territory, Region) to test the performance numbers against. If you want to see all the code, you can pull it down from here (or you can get it from the Mvp.Xml SourceForge project). Just add the stored procs to a copy of Northwind, then compile and walk through the code (until I can find some time to write this all up in a series of articles).
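For comparison, here is roughly what the hand-written mapping looks like for the DataReader case, built against the dotted column aliases from the proc above and the class shapes sketched earlier (again simplified; the real code is in the download):

```csharp
using System;
using System.Collections;
using System.Data;

// Hand-written Data Mapper for the DataReader case (simplified sketch).
// The proc is ordered by EmployeeID, so a new Employee starts whenever the
// "Employee.Id" column changes; every row contributes one Territory/Region.
public class EmployeeReaderMapper
{
    public static ArrayList Map(IDataReader rdr)
    {
        ArrayList employees = new ArrayList();
        ArrayList territories = null;
        Employee current = null;

        while (rdr.Read())
        {
            int employeeId = (int) rdr["Employee.Id"];

            // Row belongs to a new employee - start a new object.
            if (current == null || current.Id != employeeId)
            {
                if (current != null)
                    current.Territories = (Territory[]) territories.ToArray(typeof(Territory));

                current = new Employee();
                current.Id        = employeeId;
                current.LastName  = (string) rdr["Employee.LastName"];
                current.FirstName = (string) rdr["Employee.FirstName"];
                current.Title     = (string) rdr["Employee.Title"];
                current.BirthDate = (DateTime) rdr["Employee.BirthDate"];

                territories = new ArrayList();
                employees.Add(current);
            }

            // Every row carries one Territory, with its Region hanging off it.
            Territory territory   = new Territory();
            territory.Id          = (string) rdr["Territory.Id"];
            territory.Description = (string) rdr["Territory.Description"];

            territory.Region             = new Region();
            territory.Region.Id          = (int) rdr["Region.Id"];
            territory.Region.Description = (string) rdr["Region.Description"];

            territories.Add(territory);
        }

        // Don't forget the last employee's territories.
        if (current != null)
            current.Territories = (Territory[]) territories.ToArray(typeof(Territory));

        return employees;
    }
}
```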
Now, on to the results. I used my Dell 8500 laptop as the test machine, with Windows XP SP2, VS.Net 2003, and SQL Server 2000. I wrote a small console app for each of the four test cases, and they all use the High Performance Timer code to get better time measurements. To get more consistent results, each console app loads the business objects once outside the timed run (so everything is compiled and in memory, including the connection and all the SQL Server-side work), and then starts the timer and loops through the rehydration 1000 times.
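Each console app follows the same basic shape: one warm-up pass, then the timer wrapped around a 1000-iteration loop. Roughly like this (the HiPerfTimer here is just the usual QueryPerformanceCounter wrapper, not necessarily line-for-line the one in the download):

```csharp
using System;
using System.Runtime.InteropServices;

// The usual Win32 high-resolution timer wrapper (QueryPerformanceCounter).
public class HiPerfTimer
{
    [DllImport("Kernel32.dll")]
    private static extern bool QueryPerformanceCounter(out long count);

    [DllImport("Kernel32.dll")]
    private static extern bool QueryPerformanceFrequency(out long frequency);

    private long start, stop, frequency;

    public HiPerfTimer()  { QueryPerformanceFrequency(out frequency); }
    public void Start()   { QueryPerformanceCounter(out start); }
    public void Stop()    { QueryPerformanceCounter(out stop); }

    // Elapsed time in seconds.
    public double Duration
    {
        get { return (double)(stop - start) / frequency; }
    }
}

public class TestHarness
{
    public static void Main()
    {
        // Warm-up pass: JIT the code, open the connection pool,
        // and let SQL Server cache the plan before we start timing.
        RunOnce();

        HiPerfTimer timer = new HiPerfTimer();
        timer.Start();
        for (int i = 0; i < 1000; i++)
        {
            RunOnce();
        }
        timer.Stop();

        Console.WriteLine("Elapsed: " + timer.Duration);
    }

    // One full rehydration: call the service layer and map the results.
    private static void RunOnce()
    {
        // e.g. EmployeeReaderMapper.Map(EmployeeService.GetEmployeesReader())
        // - swapped out per test case (DataReader, DataSet, XmlReader, XPathNavigator).
    }
}
```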
DataReader - 341
DataSet - 411 (18.6% slower than the DataReader)
XPathNavigator - 450 (27.7% slower than the DataReader)
XmlSerialization - 542 (46.3% slower than the DataReader)
It wasn’t really surprising that the DataReader was the fastest of the implementations, but what surprised me was how slow the XmlSerialization was, and that the XPathNavigator was slower than the DataSet. (Oh, and if you are wondering how I calculated the percent change, see this article on why what you were taught in high school was wrong and very misleading.) I went over my code pretty carefully, and I’m sure people will find some performance enhancements, but overall I’m confident that things were done consistently across the implementations, so most enhancements would affect all the results equally and wash out of the final analysis. (And yes, I made sure that I cached the XmlSerializer instance, so that isn’t why that implementation is so slow.) But if you find something, definitely let me know so that I can update it.
I’m sure there are lots of other ways to implement a Data Mapper in .Net. You are welcome to clone the code, add your own implementation, and let us know what your results are.