Reversing My Opinion of LINQ
Most of my exposure to LINQ has come from short snippets on blogs and sessions at various conferences. My initial reaction was fairly negative, and it was based on two key factors:
- The evangelizing of LINQ as an ORM (making LINQ for SQL the main focus of LINQ).
- The claim that LINQ is based on SQL, so developers should already be comfortable with the syntax.
Let me discuss these points individually. I think that LINQ for SQL is not a compelling ORM (it is mostly useful for RAD or prototyping of applications, much like Typed DataSets are now). The more problematic part is that it confuses developers: LINQ is not about database development but about integrating queries into code. Integrating a query mechanism into the language is a great feature that deserves to be explained, but using LINQ for SQL as the demonstration invites comparison with NHibernate, LLBLGen Pro, Typed DataSets and the like. The reality is that LINQ for SQL is an interesting implementation of LINQ, but it clouds the issue.
As for describing LINQ as having a SQL-like syntax, I see three specific problems with this:
- Many developers know only the basics of SQL, so basing a language on it does not necessarily do much for skill reuse.
- LINQ is only vaguely SQL-like in my opinion. The problem is that LINQ is attempting to create a language that is useful for a different set of tasks than SQL was designed for. SQL is a set-based query language, whereas LINQ needs to be a query language for multiple types of data: sets, hierarchies, trees, inverted trees, etc. (The good news is that LINQ is actually fairly adept at handling these other data structures.)
- Lastly, I would have liked the syntax to be more intrinsically self-describing. SQL is a classic case of a declarative language that is feature-rich but very difficult to decipher (and whose results are even harder to predict, as they often depend on execution paths). Bringing a purely SQL-like syntax into the language will not make the code clearer to read.
This is still my opinion. I believe all of these are real problems with how LINQ is presented. So what has changed?
As I’ve been digging in, I have realized that there really isn’t “LINQ for Objects”, “LINQ for DataSets”, “LINQ for SQL”, etc. There is just Language INtegrated Query. I think the moniker “LINQ for Objects” is a misnomer, because LINQ is always about managed objects. Anyone who enables LINQ in their own collections is simply saying that, with the data they have available, we can tie into a centralized query facility. This became abundantly clear when I was chatting with Jim Wooley (co-author of a forthcoming book on LINQ) about LINQ for SQL.
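To make "enabling LINQ in your own collections" concrete, here is a minimal sketch. It assumes a hypothetical `Playlist` class (not from any library, invented purely for illustration) that does nothing more than implement `IEnumerable<string>`. That alone should be enough for the standard query operators in `System.Linq` to work against it.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical collection type, invented for this illustration.
public sealed class Playlist : IEnumerable<string>
{
    private readonly List<string> _titles = new List<string>();

    // A public Add method plus IEnumerable also allows collection initializers.
    public void Add(string title) { _titles.Add(title); }

    // Implementing IEnumerable<T> is what lets the query operators see this type.
    public IEnumerator<string> GetEnumerator() { return _titles.GetEnumerator(); }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class PlaylistDemo
{
    // An ordinary query expression over the custom collection; no database involved.
    public static List<string> TitlesStartingWith(Playlist playlist, string prefix)
    {
        var query = from title in playlist
                    where title.StartsWith(prefix, StringComparison.Ordinal)
                    orderby title
                    select title;
        return query.ToList();
    }
}
```

Nothing in `Playlist` knows about LINQ. The query facility comes entirely from the language and the `System.Linq` extension methods, which is the sense in which there is only one LINQ.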
LINQ for SQL depends on managed classes that use attributes to map them to the database (though you can also use mapping files or execute arbitrary SQL). This means that the metadata about a query must exist in the CLR (or in memory, if you prefer). If the metadata is in your memory space, then LINQ is really just querying objects. Underneath the covers LINQ for SQL may be creating new objects for you from the database, but that’s an implementation detail as far as I am concerned. LINQ for SQL is not special, just an implementation (much like LINQ for Amazon and LINQ for Flickr are just inventive implementations).
What does this all really mean? It means that LINQ is a generalized query mechanism for managed code. That’s great news, as we needed one. The cost may be a bit high, as I think introducing “Extension Methods” and “Variable Inference” may cause more trouble than they are worth, but it’s here and I had better get used to it. At the end of the day LINQ is just syntactic sugar that maps to operations on IEnumerable<>, so I can live with that.
This reminds me of a great conversation I had with a bunch of guys at the MVP summit. Several of us were very negative about LINQ and others were very passionate about how great it was. I remember someone pointing out that our languages already contain other syntactic sugar. The most obvious one for me is the “foreach” statement. We do not need foreach. We can just as easily write code that uses the enumerator to walk a collection, but foreach certainly makes the code both more readable and easier to write. I hope I feel that way about LINQ in the coming years, and I am certainly coming around to it.
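Both kinds of sugar mentioned above can be put side by side. The sketch below uses a hypothetical `SugarDemo` class of my own invention. The method-call form is roughly what a query expression translates to, and the explicit enumerator loop is roughly what foreach expands to; the compiler's actual output differs in its details.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, invented for this illustration.
public static class SugarDemo
{
    // Query-expression form: the SQL-like syntax.
    public static List<int> EvenSquaresQuery(IEnumerable<int> numbers)
    {
        var query = from n in numbers
                    where n % 2 == 0
                    select n * n;
        return query.ToList();
    }

    // Roughly the same query written as extension method calls on IEnumerable<int>.
    public static List<int> EvenSquaresMethods(IEnumerable<int> numbers)
    {
        return numbers.Where(n => n % 2 == 0)
                      .Select(n => n * n)
                      .ToList();
    }

    // foreach form: readable and short.
    public static int SumForeach(IEnumerable<int> numbers)
    {
        int total = 0;
        foreach (int n in numbers)
        {
            total += n;
        }
        return total;
    }

    // Roughly the same loop written against the enumerator directly.
    public static int SumEnumerator(IEnumerable<int> numbers)
    {
        int total = 0;
        using (IEnumerator<int> e = numbers.GetEnumerator())
        {
            while (e.MoveNext())
            {
                total += e.Current;
            }
        }
        return total;
    }
}
```

In both pairs the two methods should produce the same result for the same input; the sugared version just says it in fewer, clearer words.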
How are you feeling about LINQ?