Deriving from a fluent class
Here’s an interesting one, and maybe you can help me make my design less crappy. I have this library that I’m a little proud of, called FluentPath. It’s a fluent wrapper around System.IO that enables you to manipulate files not unlike how jQuery manipulates the DOM, with operations over sets, and lots of functional concepts:
Path.Get(args.Length != 0 ? args[0] : ".")
    .Files(
        p => new[] {
            ".avi", ".m4v", ".wmv",
            ".mp4", ".dvr-ms", ".mpg", ".mkv"
        }.Contains(p.Extension))
    .CreateDirectories(
        p => p.Parent()
              .Combine(p.FileNameWithoutExtension))
    .End()
    .Move(
        p => p.Parent()
              .Combine(p.FileNameWithoutExtension)
              .Combine(p.FileName));
In fact, it’s totally inspired by jQuery, and was born from the frustration of having to work with the antiquated monstrosity of an API that System.IO is.
One user of the library wanted to extend it. The way I had designed things, extensions are made through extension methods. One example is available in the source code that adds zip/unzip capabilities.
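For readers who haven’t written one, here is roughly what extending a fluent class through an extension method looks like. This is only a sketch: MiniPath, WithSuffix and Demo are made-up stand-ins for illustration, not FluentPath’s actual types or its zip/unzip extension.

```csharp
using System;

// Hypothetical stand-in for a sealed fluent class; not FluentPath's real Path.
public sealed class MiniPath
{
    public string Value { get; private set; }

    public MiniPath(string value) { Value = value; }

    // Chainable method: returns a new MiniPath for the parent directory.
    public MiniPath Parent()
    {
        int slash = Value.LastIndexOf('/');
        return new MiniPath(slash > 0 ? Value.Substring(0, slash) : Value);
    }
}

public static class MiniPathExtensions
{
    // Hypothetical extension: it returns MiniPath, so it chains like a built-in method.
    public static MiniPath WithSuffix(this MiniPath path, string suffix)
    {
        return new MiniPath(path.Value + suffix);
    }
}

public static class Demo
{
    public static string Run()
    {
        // The extension is available after any chainable call, no subclass needed.
        return new MiniPath("foo/bar").Parent().WithSuffix("-backup").Value;
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

Because the extension lives outside the class, it is available on every MiniPath, including the ones returned by chainable methods.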
This user however wanted to derive from the Path class. I’m not a fan of inheritance, but who am I to judge? The problem comes from the way fluent APIs work: their methods return an instance of the same class, in order to enable chaining. For example, here’s the signature of the ChangeExtension method:
public Path ChangeExtension(Func<Path, string> extensionTransformation)
If you want to derive DerivedPath from this class, one big problem you’re going to have is that this method will return a Path, not a DerivedPath, so your extensions won’t be available on the results of chainable methods. This, for example, won’t work (DoStuff is a method on DerivedPath that doesn’t exist on the Path that Parent() returns):
new DerivedPath("foo").Parent().DoStuff();
Whereas this works just fine:
new DerivedPath("foo").DoStuff();
That’s pretty lame and awkward. So how do we make sure that the methods a derived class inherits use the derived type, rather than the base Path class, for their parameters and return values?
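To make the problem concrete, here is a stripped-down reproduction. SimplePath, DerivedSimplePath and Demo are hypothetical names invented for this sketch; they only mimic the shape of the original, non-generic Path class.

```csharp
using System;

// Hypothetical stand-in for the original, non-generic Path class.
public class SimplePath
{
    public string Value { get; private set; }

    public SimplePath(string value) { Value = value; }

    // Chainable method: the return type is fixed to the base class.
    public SimplePath Parent()
    {
        int slash = Value.LastIndexOf('/');
        return new SimplePath(slash > 0 ? Value.Substring(0, slash) : Value);
    }
}

public class DerivedSimplePath : SimplePath
{
    public DerivedSimplePath(string value) : base(value) { }

    public string DoStuff() { return "stuff on " + Value; }
}

public static class Demo
{
    // Calling DoStuff directly on the derived instance works.
    public static string Direct()
    {
        return new DerivedSimplePath("foo/bar").DoStuff();
    }

    // After one chained call, the derived type is gone.
    public static bool ParentKeepsDerivedType()
    {
        SimplePath parent = new DerivedSimplePath("foo/bar").Parent();
        // parent.DoStuff(); // does not compile: SimplePath has no DoStuff method
        return parent is DerivedSimplePath;
    }

    public static void Main()
    {
        Console.WriteLine(Direct());
        Console.WriteLine(ParentKeepsDerivedType());
    }
}
```

The loss is not only a compile-time matter: Parent() really does construct a base-class object, so even a downcast would not bring DoStuff back.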
The solution I implemented and checked in was suggested by PlasmaSoft, the user who exposed the problem in the first place: change Path into a base generic class, then have the generic type parameter be an alias of sorts for the class itself that we can use as a parameter or return type.
Here is what the class declaration for the new base class looks like:
public class PathBase<T> : IEnumerable<T> where T : PathBase<T>, new()
Method declarations have been changed to use the T generic parameter in place of Path for almost the whole public surface of the API. For example, ChangeExtension becomes:
public T ChangeExtension(Func<T, string> extensionTransformation)
The Path class itself has been changed into a derived class of that base class, and sealed:
public sealed class Path : PathBase<Path>
If you want to derive from Path, well, don’t. Instead, derive from PathBase:
public class DerivedPath : PathBase<DerivedPath>
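Put together, the pattern can be sketched as follows. This is a miniature written for this post, not FluentPath’s actual code: MiniPathBase, MiniDerivedPath, From and Demo are hypothetical names, and the string handling is deliberately naive.

```csharp
using System;

// Hypothetical miniature of the self-referencing generic base class.
public class MiniPathBase<T> where T : MiniPathBase<T>, new()
{
    private string _value = "";

    public string Value { get { return _value; } }

    // Builds a T (not a MiniPathBase<T>), so results keep the derived type.
    public static T From(string value)
    {
        T result = new T();
        result._value = value;
        return result;
    }

    public T Parent()
    {
        int slash = _value.LastIndexOf('/');
        return From(slash > 0 ? _value.Substring(0, slash) : _value);
    }

    // Same shape as the ChangeExtension signature above.
    public T ChangeExtension(Func<T, string> extensionTransformation)
    {
        int dot = _value.LastIndexOf('.');
        string stem = dot > 0 ? _value.Substring(0, dot) : _value;
        // Assumes this instance really is a T, which holds when it was built by From.
        return From(stem + extensionTransformation((T)this));
    }
}

public class MiniDerivedPath : MiniPathBase<MiniDerivedPath>
{
    public string DoStuff() { return "stuff on " + Value; }
}

public static class Demo
{
    public static string Run()
    {
        // Every link in the chain is a MiniDerivedPath, so DoStuff stays reachable.
        return MiniDerivedPath.From("foo/bar.avi")
            .ChangeExtension(p => ".mkv")
            .Parent()
            .DoStuff();
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

The chain that failed with the non-generic class now compiles, because each method is declared to return T and T is bound to MiniDerivedPath.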
I really don’t like how this looks (convoluted, redundant), but I don’t see another solution. There will be a few awkward places as well. For example, some methods can take an arbitrary derived type for a parameter, but what the return type of the method should be is not always clear: should it be the type of “this”, or the type of the parameter? Here is Combine, which as declared returns the type of “this”:
public T Combine<TOther>(PathBase<TOther> relativePath) where TOther: PathBase<TOther>, new()
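Here is a small sketch of what that signature means in practice when the two paths have different types. All the names (MiniPathBase, MiniPath, MiniDerivedPath, From, Demo) are hypothetical, and joining with "/" is a simplification made for the example.

```csharp
using System;

// Hypothetical miniature; not FluentPath's actual code.
public class MiniPathBase<T> where T : MiniPathBase<T>, new()
{
    private string _value = "";

    public string Value { get { return _value; } }

    public static T From(string value)
    {
        T result = new T();
        result._value = value;
        return result;
    }

    // Accepts any path type as the argument, but returns the receiver's type.
    public T Combine<TOther>(MiniPathBase<TOther> relativePath)
        where TOther : MiniPathBase<TOther>, new()
    {
        return From(_value + "/" + relativePath.Value);
    }
}

public sealed class MiniPath : MiniPathBase<MiniPath> { }

public class MiniDerivedPath : MiniPathBase<MiniDerivedPath>
{
    public string DoStuff() { return "stuff on " + Value; }
}

public static class Demo
{
    public static string Run()
    {
        MiniDerivedPath left = MiniDerivedPath.From("videos");
        MiniPath right = MiniPath.From("clip.avi");
        // The receiver is a MiniDerivedPath, so the result is one too.
        return left.Combine(right).DoStuff();
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

Swapping the operands (right.Combine(left)) would yield a MiniPath instead, and DoStuff would no longer be available on the result.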
Another problem is that the compiler gets a little confused and doesn’t know that the current class and T are one and the same. There are many places where we’ll want to instantiate a new PathBase<T>, and return it as a T. We can’t just cast the PathBase<T> after instantiation, because as far as the compiler and runtime are concerned, T derives from PathBase<T>, not the other way around, so we need a way to instantiate a T directly (and that instance will behave as a PathBase<T>). All the compiler knows about T’s constructors is provided by the new() constraint: it has a parameterless constructor, which is not the one we need. In order to work around that, I had to implement a number of Create factory methods and use them internally where I would normally have used a regular constructor. This bled into the quality of the implementation as I had to remove “readonly” qualifiers from private fields for entirely technical reasons. I hate when the language doesn’t let you do the right thing (or doesn’t steer you to it).
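The difference between casting and using a factory can be sketched like this. Again, this is a hypothetical miniature (MiniPathBase, MiniDerivedPath, ParentViaCast and Demo are made up, and Create is public here only to keep the demo short), not the actual FluentPath implementation.

```csharp
using System;

// Hypothetical miniature of the pattern; not FluentPath's actual code.
public class MiniPathBase<T> where T : MiniPathBase<T>, new()
{
    // Not readonly: Create assigns it after the parameterless constructor has run.
    private string _value = "";

    public string Value { get { return _value; } }

    // Stands in for "new T(value)", which the new() constraint does not allow.
    public static T Create(string value)
    {
        T result = new T();
        result._value = value;
        return result;
    }

    // Compiles, but the object really is a plain MiniPathBase<T>,
    // so the cast to T throws InvalidCastException at run time.
    public T ParentViaCast()
    {
        MiniPathBase<T> parent = new MiniPathBase<T>();
        parent._value = ParentOf(_value);
        return (T)parent;
    }

    // The working version goes through the factory instead.
    public T Parent() { return Create(ParentOf(_value)); }

    private static string ParentOf(string value)
    {
        int slash = value.LastIndexOf('/');
        return slash > 0 ? value.Substring(0, slash) : value;
    }
}

public class MiniDerivedPath : MiniPathBase<MiniDerivedPath> { }

public static class Demo
{
    public static bool CastThrows()
    {
        try
        {
            MiniDerivedPath.Create("foo/bar").ParentViaCast();
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static string ViaFactory()
    {
        return MiniDerivedPath.Create("foo/bar").Parent().Value;
    }

    public static void Main()
    {
        Console.WriteLine(CastThrows());
        Console.WriteLine(ViaFactory());
    }
}
```

Note how the field has to be assigned after construction, which is exactly why it can no longer be readonly.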
Finally, derived classes have to implement a bunch of constructors and cast operators that do nothing but delegate to base.
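As an illustration of that boilerplate, here is a sketch of what a derived class ends up carrying. The names are hypothetical, and the particular constructors and string conversions shown are assumptions chosen for the example, not the exact set that FluentPath requires.

```csharp
using System;

// Hypothetical miniature base class with two constructors.
public class MiniPathBase<T> where T : MiniPathBase<T>, new()
{
    private string _value;

    public MiniPathBase() { _value = "."; }

    public MiniPathBase(string value) { _value = value; }

    public string Value { get { return _value; } }
}

public class MiniDerivedPath : MiniPathBase<MiniDerivedPath>
{
    // Constructors are not inherited in C#, so each one is re-declared and forwarded.
    public MiniDerivedPath() : base() { }

    public MiniDerivedPath(string value) : base(value) { }

    // A conversion operator must be declared in one of the two types it
    // converts between, so it has to be repeated in every derived class.
    public static explicit operator MiniDerivedPath(string value)
    {
        return new MiniDerivedPath(value);
    }

    public static explicit operator string(MiniDerivedPath path)
    {
        return path.Value;
    }
}

public static class Demo
{
    public static string RoundTrip()
    {
        MiniDerivedPath path = (MiniDerivedPath)"foo";
        return (string)path;
    }

    public static void Main()
    {
        Console.WriteLine(RoundTrip());
        Console.WriteLine(new MiniDerivedPath().Value);
    }
}
```

None of these members adds any behavior; they exist only to satisfy the new() constraint and to re-expose what the base class already knows how to do.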
In the end, everything behaves as it should, all tests pass, and we have the additional derivation feature, but I’m really unhappy about the hoops I had to jump through, and about the disagreeable form of derived classes. I prefer the extension method way of extending the library even more than I did before today.
I do have a question for you, my readers, however: can any of you think of or point me to a better way of writing base fluent classes?