Identity and your Domain Model
I’ve been struggling with a concept today that I wanted to flesh out. I may ramble on but I think there’s a point to be had deep down here (somewhere).
How often do you see a class begin its life like this:
    public class Customer
    {
        private int id;

        public Customer()
        {
        }

        public int Id
        {
            get { return id; }
            set { id = value; }
        }
    }
Looks innocent enough, but many times that Id value is there because
- an object of this class has to eventually persist into a database and someone thought it would be easy to store it here
- that database uses an identity column and thus, the value in your business entity has to be an integer to maintain a reference to it
- someone wants to use it in a UI layer so they can retrieve details about the item (DisplayCustomer.aspx?Id=3) (or someone wants to show a “nice” number to a user)
An identity column (more of a SQL Server term; Oracle can pull it off but it’s a little more involved) is a column that provides a counter for you. In its simplest form, an identity column generates a numeric sequence, handing each new row the next number.
More often than not though, it gets tied (directly or indirectly) to a class design. This is where the fun begins.
What happens when I want to test this class? When I want to write a test checking that two objects have a unique identity I might write some tests that look like this:
    using NUnit.Framework;

    [TestFixture]
    public class CustomerFixture
    {
        [Test]
        public void TwoCustomersAreUnique()
        {
            Customer firstCustomer = new Customer();
            Customer secondCustomer = new Customer();
            // Two separately created customers should not share an identity.
            Assert.IsFalse(firstCustomer.Id == secondCustomer.Id);
        }
    }
With the above code, my test fails because I haven’t initialized Id to anything, so both customers carry the default value of 0 and look the same. To initialize them to something unique each time, I need something to do the initializing. Since Id was put there because someone knew this object was eventually going to be stored in a database, the easy answer is: create the customer, and when it’s saved (and loaded back into my object) a new Id is created. Voila. Test passes.
This is great but it means I’m inherently tied to my data source layer (in order to get an identity) to create my business entity. That’s no good for testing.
Maybe with a mock customer I can fix this, but again I would have to create some kind of mock system that generates id numbers on the fly. Not as easy as it sounds (especially when they have to be unique). In any case, it doesn’t model my business domain, and at the end of the day, why do I need some number floating around that tells me what record # my object is in some database somewhere? That has nothing to do with the problem at hand.
I’m not saying an object couldn’t/shouldn’t/wouldn’t have identity, but a domain object’s identity is not its ordinal position in a database.
Eric Evans makes a great statement about this:
“When an object is distinguished by its identity, rather than its attributes, make this primary to its definition in the model.”
I completely believe this and try to follow it as best as possible. Given an object (say a bank transaction) where each transaction has to be unique, identity is an important thing. However, imagine if you tied bank transactions’ identities to an endless numbering system in SQL Server. How can I guarantee uniqueness when I have multiple data stores (say an active and a passive one)? Or a data warehouse? Or an internationally distributed system where I have to generate two unique transaction numbers on opposite sides of the planet? What if someone resets/restarts the identity counter?
Okay, maybe I’m getting carried away here but eventually, IMHO, the identity approach falls short and you need something better.
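To make the multiple-data-store worry a bit more concrete, here is a rough sketch. The IdentityCounter class is a made-up stand-in for an identity column (it is not a real SQL Server API), and the two "stores" are just two instances of it, each counting from 1 on its own.

```csharp
using System;

// Hypothetical stand-in for a database identity column: a counter starting at 1.
public class IdentityCounter
{
    private int next = 1;

    public int NextId()
    {
        return next++;
    }
}

public static class CollisionDemo
{
    public static void Main()
    {
        // Two independent stores, each with its own counter.
        IdentityCounter activeStore = new IdentityCounter();
        IdentityCounter passiveStore = new IdentityCounter();

        int idFromActive = activeStore.NextId();
        int idFromPassive = passiveStore.NextId();

        // Both stores hand out the same number to two different transactions.
        Console.WriteLine(idFromActive == idFromPassive); // True
    }
}
```

Real systems work around this with seeds, increments or ranges per store, but that is extra infrastructure the domain object ends up depending on.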
Relying on infrastructure for your domain objects is a bit of a cheat, and even something like a GUID isn’t perfect. It still requires infrastructure: older version 1 GUIDs are built from things like the network card’s MAC address and a timestamp, and the random version 4 kind leans on the system’s random number generator. Still, a GUID is pretty much guaranteed to be unique no matter what. Even creating one in Java and one in .NET, on the same machine, at the same time will get you two different identifiers (although I’m not sure whether a dual-core system could ever generate the same GUID twice, but I’ll leave that for the weary traveller to test out).
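For the weary traveller, a quick and unscientific way to poke at the multi-core question might look like the sketch below. It assumes a .NET version that has Parallel.For and ConcurrentDictionary (4.0 or later), and the GuidUniquenessCheck class and its method names are invented for this example. It proves nothing mathematically; it just shows that a large batch generated across threads comes back without repeats.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class GuidUniquenessCheck
{
    // Generates 'count' GUIDs across multiple threads and returns
    // how many distinct values were seen.
    public static int CountDistinct(int count)
    {
        ConcurrentDictionary<Guid, bool> seen = new ConcurrentDictionary<Guid, bool>();

        Parallel.For(0, count, i =>
        {
            // TryAdd returns false for a key that is already present,
            // so a duplicate GUID would not increase the count.
            seen.TryAdd(Guid.NewGuid(), true);
        });

        return seen.Count;
    }

    public static void Main()
    {
        int requested = 100000;
        int distinct = CountDistinct(requested);
        Console.WriteLine("Requested {0}, distinct {1}", requested, distinct);
    }
}
```

If the two numbers ever differ, you have found a duplicate (and should probably buy a lottery ticket).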
So if we change our Customer class to use GUIDs for identity we get something like this:
    using System;

    public class Customer
    {
        // Each new customer gets its own identity at construction, no database required.
        private Guid id = Guid.NewGuid();

        public Customer()
        {
        }

        public Guid Id
        {
            get { return id; }
            set { id = value; }
        }
    }
Now the test we wrote before passes, because each object gets its own unique identity with no database required. Much better.
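One way to read the Evans quote in code is to let equality follow identity rather than attributes. The sketch below is my own assumption about how that could look, not something from his book: the Name property and the second constructor are invented for the example, and Id is made read-only so the identity can’t be swapped out after creation.

```csharp
using System;

public class Customer
{
    private readonly Guid id;
    private string name;

    // A brand new customer gets a brand new identity.
    public Customer(string name) : this(Guid.NewGuid(), name)
    {
    }

    // Used when rebuilding a customer whose identity already exists
    // (for example, one loaded back from storage).
    public Customer(Guid id, string name)
    {
        this.id = id;
        this.name = name;
    }

    public Guid Id
    {
        get { return id; }
    }

    public string Name
    {
        get { return name; }
        set { name = value; }
    }

    // Two customers are the same customer when their identities match,
    // regardless of what their attributes say.
    public override bool Equals(object obj)
    {
        Customer other = obj as Customer;
        return other != null && id.Equals(other.id);
    }

    public override int GetHashCode()
    {
        return id.GetHashCode();
    }
}

public static class EqualityDemo
{
    public static void Main()
    {
        Customer original = new Customer("Ann");
        Customer renamed = new Customer(original.Id, "Ann Smith");
        Customer namesake = new Customer("Ann");

        Console.WriteLine(original.Equals(renamed));  // True: same identity, different attributes
        Console.WriteLine(original.Equals(namesake)); // False: same attributes, different identity
    }
}
```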
So all I’m saying is (to quote Jimmy Nilsson) “Get rid of those nasty IDENTITY/sequences…” and “let the model set the values, for example by calling a simple service at the right place/time”.
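What a “simple service” might look like is open to interpretation. Here is one possible sketch, with every name (IIdentityGenerator, GuidIdentityGenerator, FixedIdentityGenerator) made up for illustration: the model asks a small interface for its identity, production code plugs in a GUID-backed implementation, and tests plug in one that returns a known value.

```csharp
using System;

// Hypothetical service the model calls when it needs a new identity.
public interface IIdentityGenerator
{
    Guid NextId();
}

// Default implementation: every call produces a fresh GUID.
public class GuidIdentityGenerator : IIdentityGenerator
{
    public Guid NextId()
    {
        return Guid.NewGuid();
    }
}

// Test double: always hands back the value it was given.
public class FixedIdentityGenerator : IIdentityGenerator
{
    private readonly Guid value;

    public FixedIdentityGenerator(Guid value)
    {
        this.value = value;
    }

    public Guid NextId()
    {
        return value;
    }
}

public class Customer
{
    private readonly Guid id;

    // The customer asks the service for its identity; no database involved.
    public Customer(IIdentityGenerator identities)
    {
        id = identities.NextId();
    }

    public Guid Id
    {
        get { return id; }
    }
}

public static class ServiceDemo
{
    public static void Main()
    {
        Customer first = new Customer(new GuidIdentityGenerator());
        Customer second = new Customer(new GuidIdentityGenerator());
        Console.WriteLine(first.Id != second.Id); // True

        Guid known = new Guid("11111111-2222-3333-4444-555555555555");
        Customer predictable = new Customer(new FixedIdentityGenerator(known));
        Console.WriteLine(predictable.Id == known); // True
    }
}
```

The nice side effect is that the identity strategy can change later (sequential GUIDs, a numbering service, whatever) without the Customer class knowing or caring.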
Just something to consider when you’re building out your classes. Sure, what’s a system without storage, but that doesn’t mean you have to pollute your model with extra numbers just to keep track of a row in a database somewhere. Identity in a database is just that, and not something you should rely on in your domain (especially if you’re doing TDD and don’t have a database yet).
Try using GUIDs (or some other method if you prefer, like a service) to keep your domain model limited to what it needs to operate, and leave the non-business stuff like tracking numbers to the infrastructure layer.
Note: if you’re still hung up on using identity and SQL to generate ids for your business objects, check out Don Schlichting’s article here on getting the right identity.