Examples of O/R Mapping vs Stored Procedures
Mike Schinkel of Xtras.net responded to my previous post about "dynamic sql" by asking for a real-world comparison "example" that would clearly demonstrate the pros and cons of O/R Mapping. I seriously thought about providing such an "example" for a few minutes, but (1) there are already enough "examples" for those that are truly interested and (2) it would be a lot of work that I try to avoid. :) Seriously, see my ASPAlliance article for an introduction, see my ORMapper.net site for more snippets, and finally, and most importantly, download my (or someone else's) ORMapper demo and see it all work for yourself. And I also really do seriously mean giving you what you want would be a lot of work -- but only on the "traditional" side, since the examples on the O/R mapping "side" are quite trivial, as the article, snippet, and demo should make very clear if you think about it.
Why? Lets say for a moment that our "example" is going to just be a single simple table with a few fields, along with the basic CRUD, and nothing else. The first thing I need to do for both "sides" is to create my database table -- simple enough. Now for my ORMapper, all I need to do is create a very simple entity class and an xml mapping file, both of which can either be done with my automated tool (the ORHelper) or manually without too much work. An entity class, if you don't know, is typically a class with private member variables for each field and public properties to wrap those members with any additional business or validation logic necessary. Note that you are also going to probably be creating this very same entity class in the "traditional" approach, unless you are going to work with untyped and bloated datasets (yuck), so the entity class is something you need on both "sides" again. That really just means that the xml mapping file is the only thing extra required to use the ORMapper -- and again it can be generated or created without too much trouble by hand (see my demo or snippets to see for yourself if you don't believe it). Seriously, that's all -- now you just create an instance of the ObjectSpace, passing it your connection string and the mapping file, and you can instantly create new objects, update them, and delete them -- let alone retrieve entire collections, filtered and sorted collections, and paged collections -- for any database provider at that.
What about the "traditional" approach? We've already noted that we need the database table and entity class for both "sides", so what else is needed? We need a minimum of 4 stored procedures, and I really mean a minimum since you will likely want at least 5 or 6, or even more. Why a minimum of 4? One for inserts, one for updates, one for deletes, and one for retrieval -- and its that last one that may require several variants even in the simplest of cases. Why? How about one stored proc to retrieve all objects, one to retrieve a single object by its primary key, and one to retrieve some other variations. Note that even this, a simple situation where just 6 stored procedures are involved, is often not enough since that "other variations" concept can get rather involved. Again, why? You probably want to retrieve one object by some other more "natural" key, and you are likely to want to retrieve collections with various filters and sorts -- and each of these either require a new proc -- or you're going to end up creating a huge ugly proc that accepts parameters for filters and sorts and which then uses dynamic sql to get the resultset! And that kind of brings us back full circle to what you wanted to avoid, and we haven't even hooked anything up yet -- all we've done is create a bunch of boring stored procs. Before we move on to see what else is needed, lets note that you're options really are either limit your query flexibility, write lots of stored procs, or write one massive proc that uses dynamic sql anyhow -- and will likely now require your application to know all of the field names in order to even work.
So what else is needed beyond procs? You need a DAL, a Data Access Layer, that enables you to call all of these stored procs -- and hopefully one that is easy to use (the MS DAAB is not in my opinion) and which actually closes all of your connections (CSLA examples didn't at one time, although hopefully it does now after it was pointed out by yours truly). I personally don't think a DAL is all that difficult, but then again that's why I could write an O/R mapper -- I've worked with lots of developers over the years and most fall seriously short in understanding how to use any DAL given to them, let alone build one themselves. OK, so lets assume you have a DAL or are going to use the MS DAAB -- now you still have to add all the code in your entity class to "hook up" the class with all of these various stored procs -- and I assume you've done this enough to recognize that even using cut-n-paste this isn't trivial -- its a lot of boring repetitive work which is prone to mistakes. And you've got to either have your code know all of the field names in order to associate the correct values with your class members, or you've got to very carefully assume a certain field order (hopefully you'll use an enum to at least make it a little more maintainable -- but most people don't). As noted earlier, you'll either have limited functionality, or a lot of procs to hard-code into your class, or a very big monster proc that can be used lots of ways and which will be difficult to work with in reality.
That seems like a heck of a lot of work to me, just for one simple little table -- and no I'm not going to create you an "example" of it to compare with my O/R Mapper since we've all done it a million times. Come on -- you should know the amount of work your "side" requires, so why don't you download the demo of my mapper, or one of the other ones, and see for yourself how little work they require! Yes, there are alternatives -- maybe you like to use untyped and bloated datasets instead of entity classes and commonly accepted object-oriented principles -- you've still got to write all those procs, get or build a DAL, and then "hook up" all those stored procs for retrieval and persistence with these datasets. Maybe you prefer typed datasets instead -- its still going to take that same work, although now you'll be able to at least avoid some compile-time errors and have a little bit of intellisense (not much). Finally, you could of course generate all of this "traditional" method (using CodeSmith or some other tool) -- and this is one alternative that I won't make fun of -- it really is a valid workable alternative. That said, you still have to either give up a lot of flexibility that a mapper will give you for free, or create and maintain some very sophisticated templates for your code generation. For those that need the utmost of control (yes, everyone thinks they are in this category, but realistically very few ever are) this is a valid option, but its a lot of work and I know that most development teams aren't up to it!
What about maintenance? What if you need to change a field name (silly example, but lots of people bring it up)? For my mapper you just change the name of the field in the xml mapping file -- end of story -- no recompile. Some mappers use attributes and would require a recompile, but what about these other "traditional" techniques. First, you need to change the name of this field in a lot of stored procedures -- and if you're very lucky that's all -- still more work than using my mapper. But if you hardcoded your field names in your application (and many really do as pointed out earlier) then you're also going to have to change source code, maybe in multiple places, and do a recompile -- and you'd better have the test team do some testing. If you used code generation then you maybe only need to regenerate the application's code and recompile -- still more work than my mapper. Oops, most teams really end up customizing the generated code since they didn't spend enough time on the templates and so they'll either have to redo this customization, or just give up on code generation and instead alter their procs and code manually! What if you need to add a new field (a little more realistic)? Add a new field to the mappings and a new member to the entity class if you're using a mapper -- ah ha the critics roar -- you have to recompile. Seriously, can you really just add this field to your stored procs and never recompile? Not -- because both of us also need to add this new field to our UIs or business logic in some manner -- otherwise why are we bothering to add this field in the first place.
Hopefully I've done a fair job at convincing you a good O/R Mapper really is a huge time-savings, both initially and for maintenance -- but what about other concerns? Everyone knows that performance sucks -- NOT -- people can debate this one forever, so try it for yourself -- that's exactly what I did before I made the jump from "traditional" to mapper! Most CRUD is pretty straight-forward actually, and most mappers avoid the bloat of datasets, so there's actually little difference, and sometimes they perform even better. But seriously, there are many other factors that weight much heavier -- network traffic or really cool UIs come to mind immediately -- so try it out for yourself and get over it. The other gotcha is security -- again mostly NOT -- although I'll readily admit this one isn't as clear-cut. Basically, you are "forced" to give a database "user" direct read and write to tables (unless you want to map stored procs and lose a lot of the benefits), but no one ever said you had to give out the credentials of this database "user" to your real end users! On the flip side, can your "traditional" approach support a different database if you have a client that demands it (common if you sell your application), can your updates handle optimistic concurrency or update only changed fields, and is your DAL and your application easy to port to a distributed scenario when it grows? Do your users demand more flexibility in their queries, with the ability to specify any criteria or sort, possibly along with paging? Does your team do a good job of syncing up all the changes to stored procs, entity-DAL logic, and UI changes, as opposed to a much smaller set of changes (I remember how painful this was in team development using the "traditional" model).
Finally, do you really think that your "traditional" model is really even all that traditional in the first place? If so, then talk to some enterprise Java developers -- they are probably either using an O/R Mapper or some serious code generation! Or what about SAP and other similar apps that are very configurable -- you'll find something very much like an O/R Mapper under the covers, although it probably won't be very generic. For that matter, if you look closely at very configurable Microsoft applications you're going to see some very "mapper"-like use of meta-data, as opposed to everything being stored procs! This is NOT something that Frans, Thomas, or myself made up -- O/R mappers are a commonly used, and tried and tested, enterprise scale design pattern! That's also why half of the Enterprise Pattern book that I have by Martin Fowler is about nothing other than O/R Mapping -- no kidding at all. No, I didn't invent this, nor did Frans -- in fact both of us were in the "traditional" Microsoft stored proc camp until we listened, tried it for ourselves, and learned something new. I only created my mapper since I think most of the others out there are too complex for my taste -- simplification is the one thing that I'm actually very good at.
So I stand by my claim that you should seriously think about either using a good O/R mapper, or do some serious code generation -- and if code generation then please spend the time and do it the right way or you'll be back where you started when you have to extend or maintain. The choices are simple -- spend all your time writing routine and boring CRUD instead of a killer application, use a tool like an O/R Mapper that does it for you and lets you focus on the real issues, or if you must insist on the utmost control then spend some real time up-front and write some killer code generation templates so that later you'll be free to focus on the real issues. I think most projects don't justify the time or need for the utmost control, so I typically prefer the O/R Mapper approach lately, but I can also agree that a well-designed set of templates can be reusable to a large degree if you have multiple projects that you can fit into the same scheme. Lastly, I also like O/R Mappers because I like simplicity -- I don't want to see the code of the third party grids that I use, and I really honestly don't typically need to see the code for all the boring CRUD either. But that's just me -- I just hope that you aren't doing it the old cut-n-paste way that many samples and books (and even tools like Visual Studio) "teach" you to do. Actually I take that back -- I hope you do continue to use those inefficient techniques so that I'll be able to underbid you on your next project! :)