Stored Procedures vs. Dynamic Queries.
In his blog this morning, Ted Graham wrote a short piece about a debate that is starting to show up here and there: is there a move away from 'Stored Procedures' towards 'Dynamically Created Queries'? I think there is.
In the big, ugly Data Layer Generator/Toolkit debate over at the asp.net forums, there was some discussion about dynamic queries vs. stored procedures. As you can see if you read the thread, I was one of the people who thought stored procedures were the way to go, for performance reasons among others. In the last couple of weeks I ran some tests to see if what I argued back then, when the thread was going on, was actually as true as I thought it was.
In a recent blog, I talked about stored procedures with optional parameters. The reason I was looking into these is that if you have a table with a couple of interesting fields and you want to filter on any combination of those fields, you end up with a lot of stored procedures. LLBLGen, for example, generates for each foreign key found in a table T a separate stored procedure to filter on that foreign key field alone, so not even on combinations of those fields when a table contains more than one. This results in excessive code bloat if you are not careful.
During the past year, users of LLBLGen have asked me to generate even more stored procedures, e.g. selects with filters based on a combination of fields, with joins, etcetera. For version 2.x, which I'm currently developing, I looked into ways to do this without generating a zillion stored procedures or requiring the user to design every stored procedure in a fancy editor (which would take ages). I came up with two solutions: 1) stored procedures with optional parameters, and 2) dynamically generated queries (so no stored procedures at all). I didn't like option 2) very much, because my mind told me stored procedures were the way to go: for performance, for security, for separation of tiers, and perhaps a couple of other reasons you can come up with. However, I'm not 15 anymore and the Stubbornness of Youth left me a long time ago, so I decided to set up a (simple) test.
To make this test fair, an explanation is necessary. As mentioned above, if you want to avoid excessive code bloat in the stored procedure API, you have to use stored procedures with optional parameters; there is no way around it. For a database with 10 tables or so, code bloat isn't an issue, but it gets nasty when you have 50 or 60 tables, or even more. So I ran a test with a simple select using 3 foreign keys as possible filters. The stored procedure used is mentioned in this blog. I wrote two routines. One would build up a dynamic query based on the input parameters (e.g. if the caller wanted to filter on just CustomerID, the other parameters would be null) and generate that query using parameters. The other would simply call the stored procedure.
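To sketch the stored procedure side of the test: it used an optional-parameter procedure along the lines of the one below. This is a reconstruction, not the exact procedure from the blog mentioned above; it assumes Northwind's Orders table, whose three foreign key fields (CustomerID, EmployeeID, ShipVia) act as the filters, and the procedure name is made up. Passing NULL for a parameter effectively disables that filter, and a single execution plan has to serve every filter combination.

CREATE PROCEDURE pr_Orders_SelectFiltered
    @CustomerID nchar(5) = NULL,
    @EmployeeID int      = NULL,
    @ShipVia    int      = NULL
AS
-- COALESCE(@p, Column) turns the predicate into a no-op when @p is NULL.
-- Caveat: rows in which a filtered column itself is NULL never match.
SELECT  *
FROM    Orders
WHERE   CustomerID = COALESCE(@CustomerID, CustomerID)
        AND EmployeeID = COALESCE(@EmployeeID, EmployeeID)
        AND ShipVia = COALESCE(@ShipVia, ShipVia)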
I'll save you the benchmark code itself, but the results were interesting. I compiled two programs using the routines: one would call the dynamic query routine while feeding it random input parameters, the other would call the stored procedure routine, also with random input parameters. I ran both at the same time, as a rough simulation of multi-user database hammering. The programs were run on my P3-933 workstation (although, in these days of P4's, a better name would be 'electric typewriter') against my dual P3-933 Serverworks-powered server running SqlServer 2000 and the Northwind database. To be fair, each time a query was started, the connection was opened and afterwards closed again. All queries retrieved their data in a DataTable object.
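The dynamic query routine had roughly the following shape. This is a minimal sketch with hypothetical names, not the exact test code: it appends one parameterized predicate per non-null filter, so the statement sent to the server contains only the filters actually used and needs no COALESCE() tricks.

using System;
using System.Data;
using System.Data.SqlClient;
using System.Text;

public class OrderFetcher
{
    // Builds and executes a SELECT tailored to the non-null filters passed in.
    public static DataTable FetchOrders(string connectionString, string customerID, int? employeeID, int? shipVia)
    {
        StringBuilder query = new StringBuilder("SELECT * FROM Orders");
        SqlCommand command = new SqlCommand();
        string separator = " WHERE ";

        if(customerID != null)
        {
            query.Append(separator).Append("CustomerID = @CustomerID");
            command.Parameters.Add("@CustomerID", SqlDbType.NChar, 5).Value = customerID;
            separator = " AND ";
        }
        if(employeeID.HasValue)
        {
            query.Append(separator).Append("EmployeeID = @EmployeeID");
            command.Parameters.Add("@EmployeeID", SqlDbType.Int).Value = employeeID.Value;
            separator = " AND ";
        }
        if(shipVia.HasValue)
        {
            query.Append(separator).Append("ShipVia = @ShipVia");
            command.Parameters.Add("@ShipVia", SqlDbType.Int).Value = shipVia.Value;
        }
        command.CommandText = query.ToString();

        // Open and close the connection per call, as in the test. Fill() does
        // the opening/closing itself when handed a closed connection.
        DataTable result = new DataTable("Orders");
        using(SqlConnection connection = new SqlConnection(connectionString))
        {
            command.Connection = connection;
            new SqlDataAdapter(command).Fill(result);
        }
        return result;
    }
}

Note that the routine uses parameters instead of embedding the filter values in the query text; besides closing the door on SQL injection, this matters for performance, as we'll see below.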
The dynamic query routine was twice as fast as the stored procedure routine. I ran the routines for about 10,000 loops, a number of times, and each time the dynamic query version was the faster one. This was of course because the dynamic query was tailored to the task at hand, without expensive constructs like COALESCE(), while the stored procedure always used the same, slower execution plan that has to cover every filter combination. I didn't expect this, because I thought the execution plan of a dynamic query is thrown away each time the query is executed; however, this is not the case: because the queries are parameterized, SqlServer caches and reuses their plans as well. The stored procedure version used more CPU power on the server, while the dynamic query version used more CPU power on the client. I didn't cache any generated query, so the query text had to be rebuilt in a StringBuilder object on every call; however, that worked fine (and garbage collection kicked in rather smoothly).
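If you want to see this plan reuse for yourself, SqlServer 2000 exposes its plan cache through the master.dbo.syscacheobjects view; a usecounts value greater than 1 for the parameterized statement shows the plan is not thrown away. (The filter on the sql column is just an example to narrow down the output.)

SELECT cacheobjtype, objtype, usecounts, sql
FROM   master.dbo.syscacheobjects
WHERE  sql LIKE '%Orders%'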
This convinced me to choose the route of dynamic queries. It makes the code in your n-tier stack so much simpler and more uniform. LLBLGen's code generator will also be much simpler, because it can now generate C# / VB.NET statements that create a dynamic query through the dynamic query engine, instead of me having to write a complete stored procedure generator.
Aren't there any disadvantages? Of course. There is less separation between the tiers, which means that the security you would normally set on stored procedures has to be enforced elsewhere. With n-tier web applications this is not a real issue, since they connect to the database via the same user most of the time. Security is then managed through functionality in the application, not in its lowest layer: if you can't reach a function in the database because you aren't allowed to call that function's caller in the Business Tier, you can't execute the function in the database either. If you trim down the number of users who can access your application, you trim down the number of users who can execute code in your database. Admittedly, without stored procedures it is harder to set up 'last resort' hurdles against unwanted code execution. However, because a lot of ASP.NET web applications connect to the database using the same user (the ASPNET account) or an impersonated user, there is a tunnel to the stored procedure code via that user anyway (which implies that security through functionality has to be implemented as well).