Code Generation - Templating vs CodeDOM and automatic refactoring

Tuesday, June 14, 2005

Code Generation General Software Development

We are currently using two code-generation techniques. For the data access and business logic layers, we have a code generator written in Prolog. For the layers on top of those (e.g., a WS layer or a UI layer), we started out using CodeDom and then switched to an ASP.NET-like template-based code generation engine.

The main reason for switching from CodeDom to templates was that we wanted our customers to be able to customize the generators. Customizing CodeDom code is very difficult, while customizing a template is quite easy. From our point of view, templates are also easy to write, even though we need to maintain two sets of them, one for C# and another for VB.NET.

The main issue with template-based code generators is that it's quite difficult to keep a 'clean' set of templates. For anything but the simplest cases, templates turn into spaghetti code. They are hard to read and difficult to modularize, even with the help of template editors. CodeDom is much better in this respect, as you can use OO techniques to design your code generator.

Additionally, when using templates, keeping the generated code clean is also a challenge. You usually need to decide whether you prefer cleaner templates or cleaner generated code. If your main development artifact is the template, you could decide that it is better to have a cleaner template. However, if the people using the generated code tend to read it, try to understand it, and judge your code-generation tools by it, you could prefer cleaner generated code.

For example, in DeKlarit you can write things like:

ModifiedDate = System.DateTime.Today if update or insert;
CreatedDate = System.DateTime.Today if insert;
MyNamespace.MyMethod.AddRelatedRecord(CustomerId, CustomerName) on AfterInsert;

These rules are then generated in methods like:

public void AfterInsertRules()
{
    MyNamespace.MyMethod.AddRelatedRecord(row.CustomerId, row.CustomerName);
}

and invoked from other methods like:

public void Insert()
{
    // Do something here..

    AfterInsertRules();
}

The same happens for 'AfterInsert', 'AfterUpdate', 'AfterDelete', etc.

If we are generating code using templates, we can generate that code in several ways. One is to write something like:

<% if (ListOfAfterInsertRules.Length > 0)
   { %>
public void AfterInsertRules()
{
    <% PrintListOfRules(); %>
}
<% } %>

and, around the call to the method, write:

public void Insert()
{
    // Do something here..

    <% if (ListOfAfterInsertRules.Length > 0)
       { %>
    AfterInsertRules();
    <% } %>
}

If we do it this way, then the generated code will be cleaner, and the 'AfterInsertRules' method won't be generated or called when there are no rules to execute.

If I want to keep the template code cleaner, then I could write:

public void AfterInsertRules()
{
    <% PrintListOfRules(); %>
}

public void Insert()
{
    // Do something here..
    AfterInsertRules();
}

In this case, if there are no rules to trigger, my generated code will have an empty method and a call to an empty method, so my generated code will look worse.

You may need to make this decision whether you are using a CodeDom generator or a template-based generator.
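To make the CodeDom side of that concrete, here is a minimal sketch of the 'cleaner generated code' variant written against the System.CodeDom types. The InsertMethodBuilder class and the ruleStatements parameter are hypothetical names invented for this illustration, and the sketch assumes the rule statements have already been built elsewhere. On newer .NET runtimes, System.CodeDom ships as a separate package.

```csharp
using System.CodeDom;

public static class InsertMethodBuilder
{
    // ruleStatements is a hypothetical, already-built list of statements
    // for the 'AfterInsert' rules (empty when there are no rules).
    public static void AddInsert(CodeTypeDeclaration type, CodeStatement[] ruleStatements)
    {
        var insert = new CodeMemberMethod { Name = "Insert" };
        insert.Attributes = MemberAttributes.Public | MemberAttributes.Final;
        insert.Statements.Add(new CodeCommentStatement("Do something here.."));

        // Same condition as in the template: only emit the rules method
        // and the call to it when there is something to execute.
        if (ruleStatements.Length > 0)
        {
            var rules = new CodeMemberMethod { Name = "AfterInsertRules" };
            rules.Attributes = MemberAttributes.Public | MemberAttributes.Final;
            rules.Statements.AddRange(ruleStatements);
            type.Members.Add(rules);

            insert.Statements.Add(new CodeExpressionStatement(
                new CodeMethodInvokeExpression(
                    new CodeThisReferenceExpression(), "AfterInsertRules")));
        }

        type.Members.Add(insert);
    }
}
```

The condition is checked once here rather than twice as in the template, but the generator still has to know about the 'no rules' case.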

The interesting thing about a CodeDom-like approach is that, before writing the DOM to the source file, you can refactor it...

This means that as long as you don't touch the public interface, you can do whatever you want with the DOM. For example, you can look for empty methods and remove them together with their calls. You can find variables that are not used and remove them. You can find unreachable code and delete it. You can find member variables that are only used in one method and make them local to that method.
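As an illustration of the first of those refactorings, here is a rough sketch using System.CodeDom. The EmptyMethodPruner class and the CustomerAdapter type name are hypothetical, and AfterInsertRules is generated as private here (an assumption, so that removing it does not change the public interface). The pass is deliberately simple: it only looks at parameterless private methods, only removes top-level statements of the form this.Name(), and runs once, so a method that becomes empty because of the pruning is not removed.

```csharp
using System;
using System.CodeDom;
using System.CodeDom.Compiler;
using System.Collections.Generic;
using System.IO;
using Microsoft.CSharp;

public static class EmptyMethodPruner
{
    // Removes private, parameterless methods with no statements,
    // together with the top-level "this.Name();" calls to them.
    public static void Prune(CodeTypeDeclaration type)
    {
        var emptyMethods = new List<CodeMemberMethod>();
        foreach (CodeTypeMember member in type.Members)
        {
            // Exact type check skips constructors and entry points.
            if (member.GetType() != typeof(CodeMemberMethod)) continue;
            var method = (CodeMemberMethod)member;
            bool isPrivate =
                (method.Attributes & MemberAttributes.AccessMask) == MemberAttributes.Private;
            if (isPrivate && method.Parameters.Count == 0 && method.Statements.Count == 0)
                emptyMethods.Add(method);
        }

        var emptyNames = new HashSet<string>();
        foreach (CodeMemberMethod method in emptyMethods)
        {
            emptyNames.Add(method.Name);
            type.Members.Remove(method);
        }

        foreach (CodeTypeMember member in type.Members)
        {
            var method = member as CodeMemberMethod;
            if (method == null) continue;
            // Walk backwards so removals do not shift pending indexes.
            for (int i = method.Statements.Count - 1; i >= 0; i--)
            {
                if (IsCallToAny(method.Statements[i], emptyNames))
                    method.Statements.RemoveAt(i);
            }
        }
    }

    private static bool IsCallToAny(CodeStatement statement, HashSet<string> names)
    {
        var expression = statement as CodeExpressionStatement;
        if (expression == null) return false;
        var call = expression.Expression as CodeMethodInvokeExpression;
        return call != null
            && call.Parameters.Count == 0
            && call.Method.TargetObject is CodeThisReferenceExpression
            && names.Contains(call.Method.MethodName);
    }

    public static void Main()
    {
        // The generator always emits the method and the call (the 'clean
        // generator' choice); the pruning pass cleans up afterwards.
        var type = new CodeTypeDeclaration("CustomerAdapter");

        var rules = new CodeMemberMethod { Name = "AfterInsertRules" };
        rules.Attributes = MemberAttributes.Private | MemberAttributes.Final;

        var insert = new CodeMemberMethod { Name = "Insert" };
        insert.Attributes = MemberAttributes.Public | MemberAttributes.Final;
        insert.Statements.Add(new CodeCommentStatement("Do something here.."));
        insert.Statements.Add(new CodeExpressionStatement(
            new CodeMethodInvokeExpression(
                new CodeThisReferenceExpression(), "AfterInsertRules")));

        type.Members.Add(rules);
        type.Members.Add(insert);

        Prune(type);

        using (var provider = new CSharpCodeProvider())
        using (var writer = new StringWriter())
        {
            provider.GenerateCodeFromType(type, writer, new CodeGeneratorOptions());
            Console.WriteLine(writer.ToString());
        }
    }
}
```

With this in place, the generator code can always emit both the method and the call, and the printed class should contain neither when there are no rules.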

Some of these changes will probably have no significant impact on the runtime performance of the application, but they will make the generated code look much better, and they also let you keep your code-generation code much cleaner.

One of the main reasons why we work with Prolog is that it's an easy way (OK, easy if you know/like Prolog ;) to have a CodeDom-like generator without requiring us to use a typed language for it, so creating a DOM for the code is just creating a complex list. Refactoring the code is just processing that list and transforming it.

When using CodeSmith in our code generation architecture, we had a similar issue. We found that the more complex our templates got, the more out of control they became.

Fortunately CodeSmith's template system also supports using a CodeBehind class file as the base class for each template.

We used this functionality to create a "templateHelper" class, which abstracted database columns out even further into collections of "RenderedColumns". This allowed us to encapsulate things like the UI control to use for a particular Sql Type, default assignment statements, and validation for specific types.

This greatly decreased the amount of spaghetti code that resided in our templates.

Jeff Gonzalez - Tuesday, June 14, 2005 5:28:00 AM

Another example of the cleaner templates / cleaner generated code trade-off is that, for cleaner generated code, you'd always want to eliminate implementation redundancies using base classes, implementer classes, or generics. However, from the template point of view, those redundancies might not actually exist as long as you have the generation in only one template, so provided you don't mind the impact on code size, you might choose not to eliminate the duplication in the generated code.

Jose Lamas Rios - Tuesday, June 14, 2005 5:30:00 AM
