O/R Mappers: Avoiding Reflection
Steve Eichert posted his findings yesterday about the performance cost of reflection. I knew reflection was slower, but I had no clue it could be that bad. I haven't run many tests myself yet to see if it really is that bad, but it doesn't really matter, since I can agree that reflection is definitely slow. So why does this matter? Well, right now my WilsonORMapper uses a lot of reflection to get and set the field values. I was planning on doing something to fix that sooner or later, but Steve's post got me thinking about making it my next priority.
So how should I go about getting rid of reflection? The first solution I came up with is the easiest for me to implement, although I'll admit it seems kind of kludgy and ugly to the user. Basically, I would provide an interface with one property whose single parameter would be the member name specified in the mapping file. The user (or my WilsonORHelper) would simply implement this interface's property with a switch block where they set or get their own member variable to avoid reflection if performance is a consideration. I would simply need to use reflection one time, on the initial load of the mappings, to see if they implemented this interface.
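To make the idea concrete, here is a minimal sketch of the optional-interface approach. It is written in Java only as an illustration; in C# the single member would most naturally be an indexer taking the member name. All of the names here (MemberAccess, Contact, LegacyContact, TypeMapping) are hypothetical and are not taken from WilsonORMapper or ObjectSpaces.

```java
import java.lang.reflect.Field;

// Hypothetical optional interface: the mapper passes the member name from the mapping file.
interface MemberAccess {
    Object getMember(String member);
    void setMember(String member, Object value);
}

// An entity that opts in: a switch block reads and writes its own private fields.
class Contact implements MemberAccess {
    private int id;
    private String name;

    @Override
    public Object getMember(String member) {
        switch (member) {
            case "id":   return id;
            case "name": return name;
            default: throw new IllegalArgumentException("Unknown member: " + member);
        }
    }

    @Override
    public void setMember(String member, Object value) {
        switch (member) {
            case "id":   id = (Integer) value; break;
            case "name": name = (String) value; break;
            default: throw new IllegalArgumentException("Unknown member: " + member);
        }
    }
}

// An entity that does not opt in; the mapper keeps using reflection for it.
class LegacyContact {
    private String name;
}

// Hypothetical per-type mapping: the interface check happens once, when the mapping loads.
class TypeMapping {
    private final Class<?> type;
    private final boolean direct;

    TypeMapping(Class<?> type) {
        this.type = type;
        this.direct = MemberAccess.class.isAssignableFrom(type);
    }

    boolean usesDirectAccess() { return direct; }

    void set(Object entity, String member, Object value) {
        if (direct) {
            ((MemberAccess) entity).setMember(member, value);
            return;
        }
        try {
            // Fallback path: reflection on the private field.
            Field f = type.getDeclaredField(member);
            f.setAccessible(true);
            f.set(entity, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    Object get(Object entity, String member) {
        if (direct) {
            return ((MemberAccess) entity).getMember(member);
        }
        try {
            Field f = type.getDeclaredField(member);
            f.setAccessible(true);
            return f.get(entity);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}

class MemberAccessDemo {
    public static void main(String[] args) {
        TypeMapping fast = new TypeMapping(Contact.class);
        Contact contact = new Contact();
        fast.set(contact, "name", "Ann");
        System.out.println(fast.usesDirectAccess() + " " + fast.get(contact, "name"));

        TypeMapping slow = new TypeMapping(LegacyContact.class);
        LegacyContact legacy = new LegacyContact();
        slow.set(legacy, "name", "Bob");
        System.out.println(slow.usesDirectAccess() + " " + slow.get(legacy, "name"));
    }
}
```

The point of the sketch is that classes which ignore the interface keep working unchanged, while classes that implement it never touch reflection after the one check made when the mapping is loaded.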
OK, that does sound pretty kludgy, and it does mean that I would be requiring the user to do something specific for a change. But this would not be "required" unless they want or need the performance gain, and it's still not requiring them to inherit from a specific class either. Implementing an interface, while definitely a requirement, is not as big a "burden", since a class can implement multiple interfaces in .NET. And again, it's not really required unless the user really wants or needs the performance, which may be necessary for collections of many objects, but maybe not for other cases.
So what other options exist to avoid reflection? One option that Steve mentioned is to use CodeDOM to dynamically create an assembly, with a signature that the O/R mapping framework understands, that knows how to call the public members or properties of the original class. Those public members or properties might be specified totally in the mappings, or reflection might need to be used one time at startup at most. The problem I have with this technique is that it requires public members or properties, and it doesn't handle any member that is read-only publicly. Assuming that there aren't many read-only cases, what's the problem with using public properties, since they almost always exist anyhow? The problem is that properties are often (and should be) wrappers around the private fields that contain additional validation or business logic. There's nothing wrong with your public property rejecting or modifying the user's attempt to set an invalid value, but it should not prevent me from loading data that currently exists in the database.
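A small sketch may help show why going through public properties is a problem when loading. It is in Java for illustration, and every name in it (Product, setMember, LoadDemo, the negative stored price) is a made-up assumption, not part of any real mapper. The public setter rejects a value that already sits in the database, while a member-level setter written by the class itself can still load it.

```java
// Hypothetical entity whose public setter enforces a business rule.
class Product {
    private double price;

    public double getPrice() { return price; }

    public void setPrice(double value) {
        if (value < 0) {
            throw new IllegalArgumentException("Price cannot be negative");
        }
        price = value;
    }

    // What an opt-in member-access method would do: write the field itself, no validation.
    public void setMember(String member, Object value) {
        switch (member) {
            case "price": price = (Double) value; break;
            default: throw new IllegalArgumentException("Unknown member: " + member);
        }
    }
}

class LoadDemo {
    // Stand-in for a generated accessor, which can only reach the public setter.
    static boolean loadViaPublicSetter(Product p, double stored) {
        try {
            p.setPrice(stored);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        double stored = -1.0; // hypothetical legacy row that predates the validation rule

        Product a = new Product();
        System.out.println(loadViaPublicSetter(a, stored)); // prints "false"

        Product b = new Product();
        b.setMember("price", stored);
        System.out.println(b.getPrice()); // prints "-1.0"
    }
}
```

Whether loading such a row silently is desirable is a separate question; the sketch only shows that the public setter takes that decision away from the mapper.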
So what do other O/R mappers do instead? Some O/R mappers use CodeDOM to dynamically create an assembly with new classes that inherit from the original classes, and these become the real ones used by the mapping framework. This can be done either by having the original classes be abstract, with all the real logic created in the new dynamic inherited class, or by requiring the original class to expose its member variables as protected. The problems with this approach are that you can't use new to explicitly create your classes anymore, and the classes that the framework returns are technically a different type than was originally expected. Neither of those is a significant problem, but the requirements to do this aren't trivial. You either have to forgo providing your own implementation that includes your validation and business logic by using an abstract class, or you have to expose all of your member variables as protected, and neither of those requirements is very friendly. There may be solutions to this that some O/R mapping frameworks have discovered, so I'm not trying to imply there aren't, but I doubt any such solutions are trivial, and they are therefore probably out of my reach to easily implement.
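The protected-member variant of the inheritance approach can be sketched roughly as follows. This is Java for illustration, and the subclass is written by hand here, whereas a framework would generate it at run time. Customer, CustomerProxy, and CustomerFactory are hypothetical names.

```java
// The user's class: its field has to be protected so a subclass can reach it.
class Customer {
    protected String name;

    public String getName() { return name; }
}

// Stand-in for the class a framework would generate dynamically.
class CustomerProxy extends Customer {
    void load(String member, Object value) {
        switch (member) {
            case "name": name = (String) value; break;
            default: throw new IllegalArgumentException("Unknown member: " + member);
        }
    }
}

// Callers must go through a factory instead of writing "new Customer()".
class CustomerFactory {
    static Customer create() {
        return new CustomerProxy();
    }
}

class ProxyDemo {
    public static void main(String[] args) {
        Customer c = CustomerFactory.create();
        ((CustomerProxy) c).load("name", "Ann");
        System.out.println(c.getName());
        // The runtime type is the generated subclass, not Customer itself.
        System.out.println(c.getClass() == Customer.class);
    }
}
```

Both drawbacks mentioned above show up directly: creation goes through the factory, and the object that comes back is a CustomerProxy rather than a plain Customer.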
What other solutions are used by O/R mappers? Some O/R mappers require you to generate lots of code in order for them to work, so that they don't have to use reflection and so that they can also gain other "insider" knowledge. I don't want to imply that this is "bad", or not a valid technique for O/R mappers, both because that's not necessarily the case, and because there have been other discussions on this already. That said, it's not what I want to do with my O/R mapper, so this was never an alternative I seriously considered. One thing it does do for me, however, is at least validate that allowing the user to implement an optional interface with a single property, if they want or need the extra performance, is not something totally out of line with other tools. And since I can make my WilsonORHelper generate this code if the user wants to use my helper and wants to turn on this feature, I do feel like it's not at all too much of a "burden".
So at this point I've just about concluded that my first solution is probably good enough, at least for my simple O/R mapper, and I have also decided that I don't really like the other alternatives, at least those that I can think of or find. Then it occurred to me that I should try to figure out what ObjectSpaces is doing, but I quickly gave up, since their code is just too much for me to figure out without lots of time and work. Then, on a whim, I decided to Google for ObjectSpaces and reflection. The first result was a blog entry by Andres Aguiar that confirms ObjectSpaces does use lots of reflection, but this is somehow going to be less of a performance hit in .NET v2.0. The fifth result was some documentation about ObjectSpaces and an IObjectHelper interface that I had never noticed before -- and remarkably it sounds exactly like what I was proposing to do! There's also an IObjectNotification interface that can be implemented to enable your objects to receive events when they are updated or deleted or when an exception occurs, which was something else I was wanting to do somehow.
That's enough research for me. I liked my solution to begin with, and since I'm mostly using the syntax of ObjectSpaces anyhow, this will now definitely be the thing I implement in the next few weeks or so. Of course, that still doesn't mean it's the best or "right" solution, so I'm still interested in what others think of my solution and the other options.