The Fable of the Perfect ORM
Data is a funny business. While at the moment I am spending a lot of time teaching Silverlight, my passion still lives in the data. I was brought up on Minisystems (Multi-user CP/M and the like) where you were dealing with something like a database (though we didn’t have that as firm a concept as you might think). Later I did quite a lot of desktop database development starting with dBase II (yes, I am that old), Paradox, Clipper, FoxPro and even Access. That naturally led to client-server and N-Tier development. Throughout all that time it’s become exceptionally clear how much data matters to most applications.
But using databases can be difficult as there is an impedance mismatch between object-oriented development and the natural structure of data. The solution to this for many organizations has been to build data access and business object layers around the data. These layers were meant to simplify data access for most developers and embed the business logic directly into a re-usable set of code instead of it ending up in the UI layer. This was a good thing…
But the problem is that much of the data access (and to a lesser extent, business object) code was very redundant. Developers ended up writing the same code over and over or simply mimicking the schema that was in the database. Some started to develop ways to use the database schema to their advantage to limit the amount of hand-written code that had to be created. While not always called this, this is where object-relational mapping (ORM) products got their start.
ORM is a good thing. But ORM is about data access, not business rules. It’s an important distinction that needs to be understood. ORMs typically were bad at defining and managing business rules, but that was never their job. Keep this in mind when you think about ORM and Business Objects (or just read Rocky Lhotka’s book on the subject).
Now that ORMs have become a staple of data access, we have an ecosystem where there are a huge number of competing products (1st party, 3rd party and open source). The most common question I get these days when I meet people at user groups or conferences is “What ORM should I use for my new project?” This question is flawed. The problem with this question is that there is not a singular solution for data access (ORM et al.) that solves all problems. In fact there are many solutions to this problem that fit the needs of particular use cases. So when I get this question, I attempt to ask more questions, but in reality this isn’t a question that can be answered on the back of a napkin.
The recent brouhaha about the Entity Framework is a great example of this problem. Many of the complaints about the Entity Framework (or any ORM really) are rooted in the viewer’s point of view. This is true of NHibernate as well. It’s great in the right environment, but lousy in others. I wish I could encourage the community to stop trying to find the perfect ORM. It doesn’t exist. It’s like the perfect car, perfect beer or perfect woman. The perfect car for speeding on a racetrack is horrible for taking the kids to hockey practice.
Why do I think this is true? Because I tried to write it. Several years ago I was fed up with the ORM landscape and decided that I would try to write one that fixed the flaws of all the other solutions. I spent nearly four months part-time tinkering with the code to get it working the way I wanted it to. But I kept finding myself backed into corners. “If I implement it this way, it’s great for X, but lousy for Y.” I finally decided that all ORMs are flawed because each design is inherently driven by its own core set of requirements. No tool could possibly meet the criteria of every project.
Picking a data access strategy involves more than just functional requirements. It’s not about the size of the project, the speed of the runtime environment or even the veracity of the tools. It’s bigger than that. Though this list is incomplete, when I talk to people about this problem I encourage them to look at the requirements of their project, including (but not limited to):
- Functional Requirements
- System Requirements
- Skill-set of the Development Team
- Business Factors
- Business Culture
- Lifetime of Project
- Volatility of Schema
I could go on, but I think you can fill this out with lots more. The situation is that ORMs that are good for certain environments are bad for others. For example, if I were writing a website for a mom-n-pop Pizza Parlour in my neighborhood and I had to have data, I’d likely pick something like “LINQ to SQL”, as the objects are always going to be directly mapped to tables and the size of their database and its throughput are low. Getting the job done on budget is more important than worrying about performance or refactor-ability.
In contrast, if I were building a large financial system where concurrent transactions and high-volume processing were critical to the project’s success, I’d likely hand-code or use something like NHibernate. Spending more time on hand coding for performance pays off on a big, high-volume project like this but would be wasted on the other project.
Lastly, if I were to be remotely working on a small project with a team who are not that well versed in database development, I might pick something that did a lot of the code generation for me (like LLBLGenPro) where the developers could get up to speed quickly without having to understand the basics of database development.
Sometimes it’s specific philosophies that help find the right match. For example, if persistence ignorance and implicit data loading are important to your team, then a technology like NHibernate makes a lot of sense. But NHibernate often comes with the higher cost of object tracking (e.g. often you’ll consume 2x memory with NHibernate since it keeps the old and new objects to do comparisons).
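To make that memory cost concrete, here is a minimal sketch of snapshot-based change tracking in Python. It is not NHibernate’s implementation; the `Session`, `attach` and `dirty_fields` names are hypothetical and exist only to illustrate the general idea of keeping an old copy next to the live object.

```python
import copy


class Session:
    """Hypothetical unit of work that tracks changes by keeping snapshots."""

    def __init__(self):
        # Each entry pairs the live object with a copy of its state at attach time.
        self._tracked = []

    def attach(self, entity):
        # This snapshot is the second copy that roughly doubles memory use.
        snapshot = copy.deepcopy(vars(entity))
        self._tracked.append((entity, snapshot))
        return entity

    def dirty_fields(self, entity):
        # Compare the current state with the snapshot to find changed fields.
        for tracked, snapshot in self._tracked:
            if tracked is entity:
                current = vars(entity)
                return sorted(
                    name for name in current
                    if name not in snapshot or current[name] != snapshot[name]
                )
        raise ValueError("entity is not tracked by this session")


class Customer:
    def __init__(self, name, city):
        self.name = name
        self.city = city


session = Session()
customer = session.attach(Customer("Ada", "London"))
customer.city = "Paris"
changed = session.dirty_fields(customer)
print(changed)
```

The convenience is that `Customer` knows nothing about persistence; the price is that every attached object is held twice for as long as the session lives.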
Another example is the difference in philosophy of data access. One of the driving ideas in Entity Framework is that a developer should never make a request to the database that they don’t know about. This is very different from the idea in NHibernate that requesting a related object should *just work*. That’s why understanding your team, your culture and your project all come together to help you find the right solution.
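The two philosophies can be sketched side by side. This is only an illustration in Python under my own assumptions, not the API of Entity Framework or NHibernate: `FakeDatabase`, `ImplicitOrder`, `ExplicitOrder` and `load_customer` are all made-up names, and the “database” is just a dictionary that counts queries.

```python
class FakeDatabase:
    """Stand-in for a real database; counts how many queries are issued."""

    def __init__(self):
        self.query_count = 0
        self._customers = {1: "Ada"}

    def fetch_customer(self, customer_id):
        self.query_count += 1
        return self._customers[customer_id]


class ImplicitOrder:
    """Related data loads on first access, so it 'just works'."""

    def __init__(self, db, customer_id):
        self._db = db
        self._customer_id = customer_id
        self._customer = None

    @property
    def customer(self):
        if self._customer is None:
            # Hidden round trip: the caller never asked for a query.
            self._customer = self._db.fetch_customer(self._customer_id)
        return self._customer


class ExplicitOrder:
    """Related data must be requested before it can be read."""

    def __init__(self, db, customer_id):
        self._db = db
        self._customer_id = customer_id
        self._customer = None
        self._loaded = False

    def load_customer(self):
        # Visible round trip: the query happens only when the caller asks.
        self._customer = self._db.fetch_customer(self._customer_id)
        self._loaded = True

    @property
    def customer(self):
        if not self._loaded:
            raise RuntimeError("call load_customer() first")
        return self._customer


db = FakeDatabase()

implicit = ImplicitOrder(db, 1)
implicit_name = implicit.customer      # triggers a query behind the scenes
queries_after_implicit = db.query_count

explicit = ExplicitOrder(db, 1)
explicit.load_customer()               # the query is spelled out in the code
explicit_name = explicit.customer
queries_after_explicit = db.query_count
```

Both versions issue exactly one query here. The difference is whether that query shows up in the calling code, which is the kind of thing a team’s culture tends to have strong opinions about.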
Please don’t take my mentioning of specific technologies as a statement of preference, but instead understand that picking a tool requires more than trivial review (e.g. “It’s included in Visual Studio for free so we should use it.”). Ultimately most projects spend more time tuning and tweaking their data access than building it, so picking the right tool that gives you enough control is key to success. Don’t get blinded by shiny designers; it’s ultimately the code that is more important.
I welcome your experiences and opinions…