Nov 27, 2013

Repository Pattern Hell (Part 3): Enlightenment

Modern object-relational mappers have come a long, long way since the dark days of writing DataTableAdapters, CRUD stored procedures, or, worse, inlining dynamic SQL strings. In fact, they are so good now, and come packed with so many fancy features, that it’s almost surprising people are still hung up on using the same old generic repository pattern that abstracts it all away and aims for the lowest common denominator.

It’s true that they didn’t originally support all of these features, and the “lowest common denominator” was basically the status quo anyway. At that time, the problems with generic repositories weren’t really apparent or all that serious. Nowadays, with lazy loading, change tracking, and so on, we’re feeling the limitations more and more. Everyone is. I know, because I keep seeing people jump through a ton of hoops to still be able to use these advanced features without abandoning the noble IRepository. This is where mind-bogglingly complex systems of Specifications and such come in to allow for advanced filtering, pagination, and other things. They also have to come up with crazy ways of abstracting simple things like joins. For every Order I also want the associated Customer. This is super easy in most ORMs (in EF it is context.Orders.Include("Customer")) but it usually requires Expression fetch paths or other craziness to get it to work in an abstracted-to-all-hell repository pattern.
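
To make that concrete, the abstractions people end up building tend to look something like this. It's just an illustrative sketch, not any particular library; ISpecification and the fetch-path parameter are hypothetical:

using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// A hypothetical "flexible" generic repository. The fetch paths exist solely
// so callers can ask for eager loads without ever touching the ORM directly.
public interface IRepository<T> where T : class
{
    IEnumerable<T> Find(
        ISpecification<T> specification,
        params Expression<Func<T, object>>[] fetchPaths);
}

// A hypothetical specification for "advanced" filtering.
public interface ISpecification<T>
{
    Expression<Func<T, bool>> Criteria { get; }
}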

It’s all because, while modern ORMs are powerful and many have overlapping features, they are usually different enough in implementation that we cannot easily make a one-size-fits-all interface “glove” for them. It may be as simple as calling Include on an EF set, but what about specifying a child of a child? How is that different in OData/Data Services? In NHibernate?
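
To show what I mean, here is a rough sketch of the same “Order with its Customer’s Address” eager load in three different stacks. The entity stubs, the Address child-of-a-child, and the navigation properties are all assumed for illustration:

using System.Collections.Generic;
using System.Data.Entity;           // EF's Include extension
using System.Data.Services.Client;  // WCF Data Services client
using System.Linq;
using NHibernate;
using NHibernate.Linq;              // NHibernate's Fetch/ThenFetch extensions

// Illustrative stubs for this sketch only.
public class Address { }
public class Customer { public virtual Address Address { get; set; } }
public class Order { public virtual Customer Customer { get; set; } }

public static class EagerLoadExamples
{
    // Entity Framework: dot-delimited include path.
    public static List<Order> WithEntityFramework(IQueryable<Order> orders)
    {
        return orders.Include("Customer.Address").ToList();
    }

    // WCF Data Services client: slash-delimited expand path.
    public static List<Order> WithDataServices(DataServiceQuery<Order> orders)
    {
        return orders.Expand("Customer/Address").ToList();
    }

    // NHibernate LINQ: strongly typed fetch chain.
    public static List<Order> WithNHibernate(ISession session)
    {
        return session.Query<Order>()
            .Fetch(o => o.Customer)
            .ThenFetch(c => c.Address)
            .ToList();
    }
}

Three very similar capabilities, three different shapes, which is exactly why a one-size-fits-all interface over them gets ugly fast.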

Some try to solve this by writing pre-baked queries (GetOrdersByCustomerId), but you’re still stuck with odd problems. Which child tables should it eager-load? Do all consumers of this query actually need all of that? And what about the fact that we now have multiple reasons to change, since the query must attempt to satisfy all consumers? What if a new class needs the query but also needs more child tables included? Now another class that was using it heavily, on the assumption that it was a small and fast query (which it previously was), is suddenly performing slowly. It’s not pretty.
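
Roughly, a pre-baked query like that ends up looking something like this (a hedged sketch; the LineItems and Product navigations and the CustomerId property are made up for illustration):

using System.Collections.Generic;
using System.Data.Entity;
using System.Linq;

public class OrderQueries
{
    private readonly DbContext _context;

    public OrderQueries(DbContext context)
    {
        _context = context;
    }

    // One query trying to serve every consumer at once.
    public IList<Order> GetOrdersByCustomerId(int customerId)
    {
        return _context.Set<Order>()
            .Include("Customer")           // the first consumer needed this
            .Include("LineItems")          // a later consumer needed these too...
            .Include("LineItems.Product")  // ...and now every caller pays for all of it
            .Where(o => o.CustomerId == customerId)
            .ToList();
    }
}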

As I said before, the answer has been right in front of us.

Just use the ORM.

Let it work its magic. Let it be a powerful and useful tool that actually saves you time and headaches instead of forcing you to write a lot of confusing architecture just to end up calling Include anyway in your EntityFrameworkRepository implementation class. Just use the damn DbContext directly!

But! What about all those benefits we talked about? DbContext isn’t mockable! How would I decorate it? And what if I need to switch to Entity Framework 6 or, worse, NHibernate or something else entirely? What if…

Relax. It’ll be okay. Watch:

using System;
using System.Data.Entity;

public interface IModelDbContext : IDisposable
{
    IDbSet<Customer> Customers { get; }
    IDbSet<Order> Orders { get; }
}

public class MyEFModelContext : DbContext, IModelDbContext
{
    // EF's DbSet<T> implements IDbSet<T>, so auto-properties like these
    // satisfy the interface and get initialized by DbContext for us.
    public IDbSet<Customer> Customers { get; set; }
    public IDbSet<Order> Orders { get; set; }
}

Oh. Look at that. It’s our old friend the interface! I guess that means we can inject mock implementations just fine now. And look, we can even fake the DbSet that EF uses (in fact, EF defines the IDbSet interface for us). We can even use practically the same T4 template that generates the DbContext to generate the interface for us!
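
For a unit test, a mocking framework can fake the whole thing. Here’s a minimal sketch using Moq, assuming Customer has a Name property; it’s just enough to run LINQ-to-Objects queries against the fake set:

using System.Collections.Generic;
using System.Data.Entity;
using System.Linq;
using Moq;

public class CustomerQueryTests
{
    public void CanQueryCustomersWithoutADatabase()
    {
        // In-memory data standing in for the Customers table.
        var data = new List<Customer>
        {
            new Customer { Name = "Ada" },   // Name is an assumed property
            new Customer { Name = "Grace" }
        }.AsQueryable();

        // IDbSet<T> is an IQueryable<T>, so wiring up these four members is
        // enough for LINQ queries to run against the in-memory list.
        var customers = new Mock<IDbSet<Customer>>();
        customers.Setup(s => s.Provider).Returns(data.Provider);
        customers.Setup(s => s.Expression).Returns(data.Expression);
        customers.Setup(s => s.ElementType).Returns(data.ElementType);
        customers.Setup(s => s.GetEnumerator()).Returns(() => data.GetEnumerator());

        var context = new Mock<IModelDbContext>();
        context.Setup(c => c.Customers).Returns(customers.Object);

        // The code under test just sees a normal IModelDbContext.
        var names = context.Object.Customers.Select(c => c.Name).ToList();
    }
}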

Wait, so does this mean I need a reference to EntityFramework.dll in all my projects? And a using System.Data.Entity at the top of my classes?

Well, yeah. So what? Is that really so bad?

Look, if you ever decide not to use EF, then you’re looking at needing to rewrite things anyway. And will it really be so bad? I don’t think so. I recently did just that when we switched from using EF directly (via an interface like the one above) to using OData via WCF Data Services. The entire “rewrite”, after getting all of the new OData service stuff set up of course, wasn’t that bad. I made a new IDataServiceContext interface with a different set type than IDbSet, fixed up my projects to reference the Data Services Client dlls instead of Entity Framework, and then adjusted for a few other minor differences in usage (for instance, Data Services uses Expand instead of Include). Sure, there were a few bigger differences that required more work to rewrite (it’s much easier to do a WHERE IN style filter through EF than through Data Services, for example), but overall it was a very manageable change.
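
Roughly, the Data Services flavor of the interface looks something like this (a simplified sketch; the member names just mirror the EF version and the exact shape is an assumption):

using System.Collections.Generic;
using System.Data.Services.Client;
using System.Linq;

public interface IDataServiceContext
{
    DataServiceQuery<Customer> Customers { get; }
    DataServiceQuery<Order> Orders { get; }
}

public static class DataServiceUsage
{
    // The day-to-day difference is mostly cosmetic: Expand instead of Include.
    public static List<Order> OrdersWithCustomers(IDataServiceContext context)
    {
        return context.Orders.Expand("Customer").ToList();
    }
}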

And besides, with the new data access method it was worth re-examining how we were getting data, now that it was going to go through WCF and HTTP rather than straight to a database over an intranet connection string. That’s a pretty fundamental underlying difference that shouldn’t be taken lightly. Just as one example, the amount of data being transferred matters more now, since WCF HTTP bindings have limits on request and response size. Sure, it’s probably bad to be transferring 50 MB over EF anyway, but now it will actively fail if WCF isn’t configured to explicitly allow such large transfers. Like I said, the differences matter.

But look at what we gain now. We still get injection and testing support via the interface, and we can easily extend the T4 templates we already have to generate this extra piece. The ORM manages what to connect to, the schema, and all of the translating to SQL for us, so the nitty-gritty DAL work is encapsulated and isolated from the rest of the app. We can even still decorate our context implementation at run-time if need be. But it’s still, fundamentally, exposed as a DbContext-like interface first and foremost, so we don’t have to cater to the lowest common denominator. We get Include. We get ToListAsync. We get Attach, Create, Find, Remove. All for free. All with the power of a modern ORM. Heck, the DbContext is already a Unit of Work, if that is something you normally like to layer on top of your IRepositories.
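
To give an idea of how this looks from the consuming side, here is a sketch of a service class written against the interface (OrderDate is a made-up property, and ToListAsync assumes EF6):

using System.Collections.Generic;
using System.Data.Entity;   // Include and ToListAsync extensions
using System.Linq;
using System.Threading.Tasks;

public class OrderService
{
    private readonly IModelDbContext _context;

    public OrderService(IModelDbContext context)
    {
        _context = context;
    }

    public Task<List<Order>> GetRecentOrdersWithCustomersAsync()
    {
        return _context.Orders
            .Include("Customer")                  // eager load straight through the interface
            .OrderByDescending(o => o.OrderDate)  // OrderDate is illustrative
            .Take(20)
            .ToListAsync();
    }
}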

Are we wedding ourselves a bit too closely to a specific ORM and persistence method? Well, somewhat. Certainly to EF in this case, but thankfully EF supports some flexibility on the underlying persistence via providers, so we aren’t wedded to a specific database (SQL Server, Oracle, or whatever).

And really, how likely is it that you will change it? More to the point, how likely is it that you will be too far into the project by the time it becomes apparent that, for whatever reason, EF isn’t going to cut it? My guess is that in most cases that won’t actually happen. And the benefits you’ll gain, and the simplicity you’ll see in your architecture, are well worth the “sin” of using a specific ORM so directly.

But, even so… if IRepository is working for you, if the limitations I’ve described don’t apply or aren’t a big deal in your specific project, then keep using them. At the end of the day, the real point here is to have the flexibility of mind and the knowledge to bring the right tool to the job. The best skill of all is knowing which pattern to use in which case. It’s about remembering that the Repository Pattern is considered a “best practice” because it is infinitely better than a lot of alternatives and is so often a good [enough] choice. But it isn’t always the right tool. Don’t be a slave to these guidelines and practices as if they were hard and fast rules. Repositories can be powerful and useful too, but please stop treating them like an inviolable law of programming.
