Rants Tagged with “Entity Framework”


The Busy Developer's Guide to SQL Server Modeling


As part of my continued fascination with all things SQL Server Modeling, I was asked to write a short article introducing the basics of SQL Server Modeling for developers.  The result is the article "The Busy Developer's Guide to SQL Server Modeling", which was released on MSDN today. It is short, so you can get the big picture without investing a month learning the technology. Let me know what you think!

 

Are ORMs Solving Anything?


I like to write blog posts where I offer some pragmatic advice.  In most posts I try to include tons of code samples and example projects...but this post is different.  I am trying to get my head around something so I want to share what is in my head so I can get a conversation started with my readers to help me out.  Once you read this post, please comment...

The other day I was responding to a tweet from Doug Purdy. He had posted a link to some new EF 4.0 features by the boss at Microsoft's DevDiv. I, as usual, complained about the list instead of lauding it. I started a conversation about lazy loading and its potential danger, but Doug quickly suggested that ORMs might be the wrong approach in general. That got me thinking (not always a good idea).

Those of you who have followed me for a long time are apt to remember that I go back to ADO (actually even farther back) to a time when writing your own data access layer was the norm. And using that data access layer against your business objects was what we all did.

Things are really different today as most of us now use an Object Relational Mapper (ORM) of one sort or another: Entity Framework, LINQ to SQL, NHibernate, LLBLGen Pro, etc. They all essentially do the same thing (but in very different ways). The problem is that to some it feels like ORMs are business objects. In fact, ORMs are just data access. Are they making our jobs easier?  It depends...

Most ORMs lean heavily on code generation to create entities and other code. The mapping generally tries to take some relational model and create an object-oriented model from it. It works this way because this is what we've done for a couple of decades.  But is it helpful? Perhaps not. Still, because we build class-based (or object-oriented) software today, it seems like a plausible thing to do.

Object orientation is at a crossroads in many ways. We've been told for a couple of decades now that it is the way to benefit from reuse, ease of development and modeling. Of course, like all magic pills, applying it to every problem space doesn't help. In the case of ORMs, it's always felt like we were putting a round peg into a square hole.

The Stack

Don't get me wrong, mapping a rectangular result (e.g. SELECT * FROM Foo) makes a lot of sense.  But the problem is that the mapping happens between *related* entities, which is hard. Relationships in data stores are not as simple as 1..n, 1..1, or 0..1. By shaping the relational model into a class structure, we're losing fidelity.  You can tell you're losing fidelity when an ORM has to let you reach deep inside it to hint at the actual queries. At that point we're fighting the tools.

I am not anti-ORM at all and I can clearly see how ORMs make smaller projects easier to build.  This is especially true if your relational model looks like the object model (or, as is common in DDD, the class model is used to push down the schema). That's great if you're building small projects, but when those same techniques are used on large or enterprise solutions, the stack gets big fast (see right).  The problem is that we end up with in-memory classes that represent entities, then we need business objects that apply business rules, then DTOs to communicate the shape of the data across the wire and in some cases data contract classes outside the firewall. Under the ORM entity objects, the ADO.NET managed providers still exist and sometimes an unmanaged stack too. Sometimes that's a lot of code to worry about.

My big worry here is that ORMs are seductive.  When you start a project and build a data layer with a couple of clicks, you can focus on building the benefit to the business.  Like most development issues, the real cost is in the maintenance. I fear that much of the work in this space is done by consultants because typically consultants help get a solution built and move on. It's not that consultants are bad, but because they can go from greenfield project to greenfield project (I certainly have been guilty of this before), the pain of maintaining this code isn't always obvious.  What happens when the database is overwhelmed with data?  What happens when the use cases change and you need to tune indexing in an existing application, but the generated queries make adding index hints hard (though in SQL Server 2008 you can do this without modifying the calls)?

That comes back to Doug Purdy and other folks in the twitterverse.  What's the next wave of solutions out there?  I admittedly haven't spent enough time playing with solutions like ActiveRecord, but they feel like dynamic classes which doesn't feel like it will solve the problem. I have heard that maybe getting away from relational databases is the solution, but the object database/non-SQL/BigTable solutions I've seen never seem to scale to high transaction systems or work well with reporting.

Your turn...what's your take?

RIA Services, Silverlight and MVVM


I've finally had a chance to take a look at the July CTP of RIA Services. My opinion is mixed, but it's still pretty early. I ran through the simple walkthrough and it was easy to set up, but it still felt as if there is too much Visual Studio magic (a complaint I've had for a long time now).

A Brief Overview

For the uninitiated let me explain a couple of things (though, please don't assume this blog entry is an introduction to RIA Services):

RIA Services is trying to share logic with the Silverlight client. That's the big story (IMHO). Some of this shared logic is a surface to query, some is validation attributes and in other cases it is outright code. It's trying to solve a difficult problem, but they've made a fundamental mistake in my book: RIA Services requires that all pieces are in a single solution file. But why? The magic is code generation.

RIA Services starts with a Domain class which normally uses a model (Entity Framework or others) to expose the data.  As you change the domain class and the entities, RIA Services builds a code generated file in the Silverlight project (or other client files) for you:

The RIA Magic

As the GameDomain file and the GameModel's entities are changed, the generated code is regenerated to keep up with the changes.

Why This is Troubling

This makes sense in one case because as the domain and entities are changed, the code magically stays updated.  Which is a better experience than updating a Service Reference.  But it requires something called a RIA Link.  This link is between a client-side component (typically a Silverlight application) and a server-side component (typically an ASP.NET Web Site/Project).

The cost of this approach is that the projects must exist in the same solution file. This works for demos and small projects, but in the big world of enterprise or Internet development this breaks down. Throw in composition strategies (like those Prism addresses) and it gets complicated quickly.  I certainly hope that when it reaches 1.0 they'll have a solution for this. The current workaround is to wrap the domain service with an ADO.NET Data Service, but that means there are two layers to go through instead of one, and if that's the approach, why not just use an ADO.NET Data Service instead?

Another Concern

While reading the well-written and lengthy overview on RIA Services (that comes with the RIA Services CTP) I noticed that RIA Services comes with a data source object (called DomainDataSource) that can be used directly within XAML to communicate with the domain class:

RIA Services Data Source

If you've been reading this blog for any length of time, you'll know that I think that data source objects are almost always evil in that they suggest that it's OK to include data access in the user interface.  And the depth of interaction with the data source is really troubling here.

So What About RIA Services in MVVM?

My first thought was how this impacts the best practice of not commingling UI and data.  I thought it might not work at all, but as often happens in these cases, the walkthrough example isn't the exemplar I would suggest.  The framework (aside from my concern about the solution file) actually allows this separation pretty easily.

So I went back and grabbed my MVVM example from my MSDN article (seen here: http://msdn.microsoft.com/en-us/magazine/dd458800.aspx) and refactored it to use RIA Services. The breakdown of that architecture was pretty simple:

MVVM Architecture

Notice I am using a RIA Domain Service on the server to expose the data model (instead of ADO.NET Data Services). This is broken up in the solution as a set of client-side projects and server-side projects:

RIA Services Project Breakdown

Notice that while in the typical scenario (and walkthroughs) the RIA Link is between the Silverlight application and the web project, RIA Services allows you to have that link between separated parts of the solution. In this case, we have an MVVM.Data library that contains the entity model and the domain service class.  This allows us to re-use this in separate web projects (which is something that is harder to do in ADO.NET Data Services). We also have an MVVM.Client.Data Silverlight library that contains the model for the Silverlight application (and separates the access to the services so the client does not need to change).

The refactoring was fairly painless in that the entity types that were created with the generated code were identical to the data contract classes created by the ADO.NET Data Services - Service Reference class. I had to change the namespaces but the rest was identical.

In the model class I had to change how I was performing the queries, but the major change there was using the extension method syntax for the query instead of the LINQ syntax. The separation of the Model meant that the refactoring was simple (as the whole pattern is supposed to do).

public void GetGamesByGenre(string genre)
{
  // Get all the games ordered by release date
  var qry = Context.GetGamesQuery()
    .Where(g => g.Genre.ToLower() == genre.ToLower())     
    .OrderByDescending(g => g.ReleaseDate)
    .Take(MAX_RESULTS);

  ExecuteGameQuery(qry);
}

I do wish that the RIA Services style wasn't so RPC (Remote Procedure Call) but that may be because I see the value in client-side LINQ queries instead of "GetGamesQuery()" and such. I think the syntax could be a lot simpler.

Overall, I think that RIA Services can help solve some problems but the bottom line is still too much Visual Studio magic for my taste and the validation is still only covering the very simple cases. Rich validation is always going to be hard (read Rocky Lhotka's book if you don't believe me).

If you want to play with it, you can grab the code here:

http://wildermuth.com/downloads/MVVMExample_RIA.zip

What do you think?

Entity Framework Samples now in Visual Basic


My friend Jim Wooley has been doing some work with the Entity Framework team to help them get their examples into Visual Basic.  The first of the fruits of these labors is now available!  Hop on over to his blog to see all the details:

http://thinqlinq.com/Default/Entity-Framework-Samples-in-Visual-Basic.aspx

I, for one, am glad that this is finally happening. I would much prefer that examples be available in both languages.  As much pain as that is pre-VB10, it's always my goal!

Entity Framework Model Generation with TPH Detection


One of my favorite patterns in the Entity Framework is a smarter way to use discriminators for data in a database. Using Table per Hierarchy (TPH) or Table per Type (TPT) allows for more complex modeling of data in a relational data store. In a simple example, you might have a database that looks like this:

ER Diagram

Notice that the ProductType is actually helping define what type of product it is (and therefore which decorator table holds additional information about that type). In the Entity Framework I like to model this as inheritance in the model:

EF Inheritance

Notice the connection between the Product and the Game/Accessory/Console tables is inheritance, not a foreign key. A Game contains the properties defined by the Game type plus the properties defined by the Product type.  It's real inheritance. This allows us to make LINQ queries like so:

var qry = from g in ctx.Products
          where g is Game
          select g;

The second line is the magic part of the query.  This example shows searching through the Products in the model that are actually of the type "Game". So the model has knowledge in it that makes it easier to query the real intent of the data without having to trouble with the details.  The user of this model doesn't need to worry about what properties make it a Game, but can just treat it as such. This is very powerful, and it is just using tooling to describe data we've always modeled like this.
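To see what the type filter does without a database, here is a minimal in-memory sketch using LINQ to Objects. The Product, Game and Accessory classes (and their properties) are hypothetical stand-ins I made up for illustration, not the generated entity types from the model above. It also shows OfType, a standard LINQ operator that filters by type the same way but hands back the sequence typed as Game:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the model's entity types (assumption).
public class Product { public string Name { get; set; } }
public class Game : Product { public string Genre { get; set; } }
public class Accessory : Product { }

public static class TphSketch
{
    // A small in-memory list standing in for ctx.Products.
    public static List<Product> SampleProducts()
    {
        return new List<Product>
        {
            new Game { Name = "Halo 3", Genre = "Shooter" },
            new Accessory { Name = "Wireless Controller" },
            new Game { Name = "Forza 2", Genre = "Racing" },
        };
    }

    // Same shape as the query above: filtered by type, but still typed as Product.
    public static List<Product> GamesAsProducts(IEnumerable<Product> products)
    {
        var qry = from p in products
                  where p is Game
                  select p;
        return qry.ToList();
    }

    // OfType filters by type and casts, so Game-only members like Genre are usable.
    public static List<Game> GamesAsGames(IEnumerable<Product> products)
    {
        return products.OfType<Game>().ToList();
    }
}
```

The difference is only in the static type of the result: the first form gives you Products that happen to be Games, the second gives you Games directly.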

Up to this point we've had to build these models ourselves (not always the easiest to do with the current designer). Today the ADO.NET EF team released a new version of the command-line compiler (EDMGEN) that can look for these patterns and produce the right model for you. The team used some resources over at MS Research to help provide heuristics to detect these patterns and do the right thing.  Check it out:

http://blogs.msdn.com/adonet/archive/2009/04/04/edmgen2-now-with-reverse-engineering-options.aspx

 

Using LinqPad and Entity Framework Models

Pardon the link post, but this is a great short walk-through of using LinqPad to execute EF queries against your own model. Great for testing query ideas against your data.  Highly recommended.

One note: while the AutoCompletion extension of LinqPad is great, it is not required for this technique to work.  Please buy AutoCompletion to support the great work in LinqPad, but you can follow the walk-through without it.

See my Silverlight Data Access Talk from DevReach


If you missed me at Bulgaria's DevReach conference, the video of my Silverlight Data Access talk is now available.  The original talk's description was:

In this session, we will explore the different methods for dealing with data in your Silverlight 2 applications including LINQ, ASMX, WCF, REST, Astoria and WebClient calls. The session covers both how to consume data as well as how to expose data to Silverlight 2.

Hope you enjoy it!

Caution when Eager Loading in the Entity Framework


UPDATE: As Roger Jennings correctly pointed out, I meant to say that Include is *not* a guarantee.

When I am using the Entity Framework for a project, I have gotten into the habit of using eager loading via the Include syntax. In case you're not familiar with it, the Entity Framework has a different philosophy than other data layers (e.g. NHibernate). In the Entity Framework, relationships that are not eager loaded have to be loaded manually (so developers never incur network round-trips without explicitly knowing about it). Whether you agree or not with this philosophy, understanding how it works is helpful when you're working with the Entity Framework.

The Entity Framework supports eager loading of the data as well using the Include syntax.  For example:

var qry = from w in ctx.Workshops.Include("Topic")
          orderby w.Name
          select w;

By amending the source of the query with the Include method, you can eager load relationships.  The problem is that these are really hints to the Entity Framework to load the relationships, not a guarantee.  Depending on the query, these Includes may be dropped. The two scenarios where I see this most often are grouping and subselects:

// Drops the Include
var qry = from e in ctx.Events
                       .Include("Workshop")
          where e.EventDate >= DateTime.Today
          group e by e.EventLocation.Region into r
          select r;

I ran into a good post on the forums on the subject:

In that post, Diego Vega says that Include only makes sense in the following scenarios:

  • Include only applies to items in the query results: objects that are projected at the outermost operation in the query, for instance, the last Select() operation (in your query, you tried to apply Include to a subquery)
  • The type of the results has to be an entity type (not the case in your query)
  • The query cannot contain operations that change the type of the result between Include and the outermost operation (i.e. a GroupBy() or a Select() operation that changes the result type)
  • The parameter taken by Include is a dot-delimited path of navigation properties that have to be navigable from an instance of the type returned at the outermost operation
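The third point is the easiest one to trip over, so here is a minimal sketch of what "changing the type of the result" means. It runs against LINQ to Objects (via AsQueryable) rather than the Entity Framework, so it does not exercise Include itself; the Event class and its properties are hypothetical stand-ins. It only shows that filtering and ordering keep the element type as Event, while grouping turns it into IGrouping, which is no longer an entity type:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Event entity (assumption).
public class Event
{
    public string Region { get; set; }
    public DateTime EventDate { get; set; }
}

public static class IncludeShapeSketch
{
    // In-memory data exposed as IQueryable, standing in for ctx.Events.
    public static IQueryable<Event> SampleEvents()
    {
        return new List<Event>
        {
            new Event { Region = "East", EventDate = new DateTime(2009, 3, 1) },
            new Event { Region = "West", EventDate = new DateTime(2009, 4, 1) },
            new Event { Region = "East", EventDate = new DateTime(2008, 6, 1) },
        }.AsQueryable();
    }

    // Where/OrderBy keep the element type: the result is still a query of Event.
    public static IQueryable<Event> Upcoming(IQueryable<Event> events, DateTime today)
    {
        return events.Where(e => e.EventDate >= today)
                     .OrderBy(e => e.EventDate);
    }

    // GroupBy changes the element type: the result is a query of groupings, not entities.
    public static IQueryable<IGrouping<string, Event>> ByRegion(IQueryable<Event> events)
    {
        return events.GroupBy(e => e.Region);
    }
}
```

If the outermost operation of your query looks like ByRegion rather than Upcoming, that is the shape where I would expect an earlier Include to be dropped.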

To alleviate the problem in some scenarios, you can use the EFExtensions Include Extension method to add includes on the complete query like so:

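using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.Data.Extensions; // EFExtensions
// ...
var qry = from e in ctx.Events
          where e.EventDate >= DateTime.Today
          select e;

// Apply the Include to the complete query, then materialize the results
List<Event> results = qry.Include("Workshop").ToList();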

You can find the EFExtensions at the MSDN Code site here:

Has this bitten you before?

New Silverlight 2 ADO.NET Data Service Example


I've finally had a chance to update my Silverlight 2-ADO.NET Data Services example. In this new sample I show how to create a Line-of-Business application (an XBox Game editor) using ADO.NET Data Services against both an Entity Framework model and NHibernate. Unlike earlier examples, this one includes implementation against the ADO.NET Data Service Silverlight 2 library to support saving of changed entities. In addition, I show some techniques for paging, retrieving simple types over an ADO.NET Data Service and full styling of the application. I hope to add support for Forms Authentication in the coming weeks.

Feel free to post replies with questions about the sample.

The Fable of the Perfect ORM


Data is a funny business. While at the moment I am spending a lot of time teaching Silverlight, my passion still lives in the data. I was brought up on minisystems (multi-user CP/M and the like) where you were dealing with something like a database (though we didn't have that as firm a concept as you might think). Later I did quite a lot of desktop database development starting with dBase II (yes, I am that old), Paradox, Clipper, FoxPro and even Access. That naturally led to client-server and N-Tier development. Throughout all that time it's become exceptionally clear how much data matters to most applications.

But using databases can be difficult as there is an impedance mismatch with object-oriented development and the natural structure of data. The solution to this for many organizations has been to build data access and business object layers around the data.  These layers were meant to simplify data access for most developers and embed the business logic directly into a re-usable set of code instead of it ending up in the UI layer. This was a good thing...

But the problem is that much of the data access (and to a lesser extent, business object) code was very redundant. Developers ended up writing the same code over and over or simply mimicking the schema that was in the database. Some started to develop ways to use the database schema to their advantage to limit the amount of hand-written code that had to be created. While not always called this, this is where object-relational mapping (ORM) products got their start.

ORM is a good thing.  But ORM is about data access, not business rules. It's an important distinction that needs to be understood.  ORMs typically were bad at defining and managing business rules, but that was never their job. Keep this in mind when you think about ORM and Business Objects (or just read Rocky Lhotka's book on the subject).

Now that ORMs have become a staple of data access, we have an ecosystem where there are a huge number of competing products (1st party, 3rd party and open source). The most common question I get these days when I meet people at user groups or conferences is "What ORM should I use for my new project?" This question is flawed. The problem with it is that there is no single solution for data access (ORM et al.) that solves all problems. In fact there are many solutions that fit the needs of particular use cases. So when I get this question, I attempt to ask more questions, but in reality this isn't a question that can be answered on the back of a napkin.

The recent brouhaha about the Entity Framework is a great example of this problem. Many of the complaints about the Entity Framework (or any ORM really) are central to the viewer's point of view. This is true of NHibernate as well. It's great in the right environment, but lousy in others. I wish I could encourage the community to stop trying to find the perfect ORM.  It doesn't exist. It's like the perfect car, perfect beer or perfect woman. The perfect car for speeding on a racetrack is horrible for taking the kids to hockey practice.

Why do I think this is true? Because I tried to write it. Several years ago I was fed up with the ORM landscape and decided that I would try to write one that fixed the flaws of all the other solutions. I spent nearly four months part time tinkering with the code to get it working the way I wanted it to.  But I kept finding myself backed into corners.  "If I implement it this way, it's great for X, but lousy for Y." I finally decided that all ORMs are flawed because the problem is inherently driven by a core set of requirements, and those requirements differ from project to project.  No tool could possibly meet the criteria of every project.

Picking a data access strategy involves more than just functional requirements. It's not about the size of the project, the speed of the runtime environment or even the veracity of the tools. It's bigger than that. Though this list is incomplete, when I talk to people about this problem I encourage them to look at the requirements of their project, including (but not limited to):

  • Functional Requirements
  • System Requirements
  • Skill-set of the Development Team
  • Business Factors
  • Time-to-Market
  • Business Culture
  • Lifetime of Project
  • Volatility of Schema

I could go on, but I think you can fill this out with lots more. The situation is that ORMs that are good for certain environments are bad for others.  For example, if I were writing a website for a mom-n-pop pizza parlour in my neighborhood and I had to have data, I'd likely pick something like "LINQ to SQL" as it is always going to be directly mapped to tables and the size of their database and throughput are low.  Getting the job done on budget is more important than worrying about performance or refactor-ability.

In contrast, if I were building a large financial system where concurrent transactions and high volume processing were critical to the project's success, I'd likely hand-code or use something like NHibernate.  Spending more time on hand coding for performance pays off on a big, high-volume project like this but would be wasted on the other project.

Lastly, if I were to be remotely working on a small project with a team who are not that well versed in database development, I might pick something that did a lot of the code generation for me (like LLBLGenPro) where the developers could get up to speed quickly without having to understand the basics of database development.

Sometimes it's specific philosophies that help find the right match. For example, if persistence ignorance and implicit data loading are important to your team, then a technology like NHibernate makes a lot of sense. But NHibernate often comes with the higher cost of object tracking (e.g. often you'll consume 2x memory with NHibernate since it keeps both the old and new objects to do comparisons).

Another example is the difference in philosophy of data access. One of the driving ideas in Entity Framework is that a developer should never make a request to the database that they don't know about. This is very different from the idea in NHibernate that requesting a related object should *just work*. That's why understanding your team, your culture and your project all come together to help you find the right solution.

Please don't take my mentioning of specific technologies as specific preference but instead understand that picking a tool requires more than trivial review (e.g. "It's included in Visual Studio for free so we should use it.").  Ultimately most projects spend more time tuning and tweaking their data access than building it, so picking the right tool that gives you enough control is key to success.  Don't get blinded by shiny designers; it's ultimately the code that is more important.

I welcome your experiences and opinions...