LINQ and Language Complexity

I’ve been away from reading tech weblogs for a while and a lot has been happening, apparently. Most notable, I think, is Microsoft’s LINQ project. LINQ stands for Language Integrated Query.

Here is the MSDN site, and here’s an explanation of what it is and does. What it comes down to is that it integrates SQL-like querying into the programming language (C# and VB.NET in this case), so that you can do stuff like this:

    string[] names = { "Burke", "Connor", "Frank", "Everett",
                       "Albert", "George", "Harris", "David" };
    var query = from item in names
                orderby item
                group item by item.Length into lengthGroups
                orderby lengthGroups.Key descending
                select lengthGroups;

If you look at this piece of code you’ll notice a few things. What’s that var type thing doing there? That looks kind of JavaScript-ish. Then of course you’ll notice the new query syntax: from something in something, select this and that, orderby this. This syntax, however, is just syntactic sugar for ordinary method calls. This:

    var query1 = from p in people
                 where p.Age > 20
                 orderby p.Age descending, p.Name
                 select new {
                     p.Name, Senior = p.Age > 30, p.CanCode
                 };

is exactly the same as writing this:

    var query1 = people.Where(p => p.Age > 20)
                       .OrderByDescending(p => p.Age)
                       .ThenBy(p => p.Name)
                       .Select(p => new {
                           p.Name,
                           Senior = p.Age > 30,
                           p.CanCode
                       });

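There are more new things here: the p => p.Age > 20 notation (a lambda expression) and the new { … } notation (an anonymous type). And it doesn’t stop there. Methods like Where and OrderByDescending are extension methods, for example. What’s all this stuff and why do we need it?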
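To make that question a bit more concrete, here is a rough sketch of how an extension method, a lambda and an anonymous type fit together. The names SequenceExtensions, Keep and ExtensionDemo are made up for illustration; the real LINQ operators live in System.Linq and are more involved than this.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class; not part of LINQ itself.
public static class SequenceExtensions
{
    // The 'this' modifier on the first parameter lets Keep be called
    // as if it were an instance method on any IEnumerable<T>.
    public static IEnumerable<T> Keep<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (T item in source)
        {
            if (predicate(item))
            {
                yield return item;
            }
        }
    }
}

public static class ExtensionDemo
{
    public static string Run()
    {
        string[] names = { "Burke", "Connor", "Frank", "Everett" };

        // n => n.Length > 5 is a lambda expression passed as the predicate.
        var kept = new List<string>(names.Keep(n => n.Length > 5));

        // An anonymous type: the compiler generates a class with these two
        // properties, and var is the only way to name its type.
        var summary = new { First = kept[0], Count = kept.Count };

        return summary.First + "/" + summary.Count;
    }
}
```

Calling names.Keep(...) is just shorthand for SequenceExtensions.Keep(names, ...), which is why these operators can be chained onto types that were written long before LINQ existed.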

Well, the C# 3.0 team needed all this new stuff to make LINQ a comfortable working environment, to make it usable, and I think that’s great. I’m all for LINQ. I’m not so sure, however, about the side effect: a whole pile of new language features.

Programmers had to struggle with C++ for many years. It was hard; its syntax is complex and difficult to understand. Luckily Java came around. It cleaned up the language and made programming simple again. Then C# 1.0 came along and added a couple of features to the Java stack, but nothing major. Then C# 2.0 (and Java 5) came along, and those added a major new feature: generics. For a lot of people generics are still hard to grasp.

And now C# comes with stuff like anonymous methods (those actually arrived in C# 2.0 already), anonymous types, lambda expressions that can be interpreted in two ways depending on how you store them (as executable code or as an expression tree), and extension methods. I must say I love those features and I can appreciate them. But I’m a weird guy; I like getting into the dark corners of language syntax and semantics. Not many people are like me. They just want to learn a language quickly and get things done.
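The “two ways” bit is easier to see in code than to describe. A minimal sketch, assuming nothing beyond the standard System.Linq.Expressions namespace; the class name LambdaDemo and the age predicate are mine, purely for illustration:

```csharp
using System;
using System.Linq.Expressions;

public static class LambdaDemo
{
    public static string Describe()
    {
        // Stored in a delegate type, the lambda is compiled to executable code.
        Func<int, bool> asCode = age => age > 20;

        // Stored in an Expression<...> type, the very same lambda becomes a
        // data structure describing the code, which a query provider can
        // inspect (for instance to translate it into SQL).
        Expression<Func<int, bool>> asData = age => age > 20;

        bool direct = asCode(25);

        // An expression tree has to be compiled before it can be invoked.
        bool viaTree = asData.Compile()(25);

        // Body.NodeType names the top-level operation in the tree.
        return asData.Body.NodeType + " " + direct + " " + viaTree;
    }
}
```

The text of the lambda is identical in both assignments; only the declared type of the variable decides which of the two the compiler produces.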

I wonder if C# 3.0 doesn’t make things too hard. C# is statically typed, which has its advantages, but it also makes things a lot more complicated, as you will see if you read the LINQ article. What a query returns is usually of some generic type, which can be very complex to spell out. This is where the var keyword comes in. It basically tells the compiler “I’m lost, you figure it out” (and it saves typing). At compile time, the compiler replaces var with the static type of the expression assigned to the variable, so the variable is still statically typed.
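Here is a small sketch of what that means in practice. The class name VarDemo and the sample names are just for illustration; GroupBy and IGrouping are the standard System.Linq ones.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VarDemo
{
    public static string InferredTypes()
    {
        string[] names = { "Burke", "Connor", "Frank" };

        // The compiler infers int here; count is still statically typed.
        var count = names.Length;

        // Spelling the result type out by hand is the verbose part...
        IEnumerable<IGrouping<int, string>> explicitGroups = names.GroupBy(n => n.Length);

        // ...that var lets you skip. Both variables have the same static type.
        var inferredGroups = names.GroupBy(n => n.Length);

        // count = "three";  // would not compile: count is an int, not a dynamic value

        return count.GetType().Name + " " + explicitGroups.Count() + " " + inferredGroups.Count();
    }
}
```

So var is nothing like JavaScript’s var: the type is fixed at compile time, you just don’t have to write it down.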

So, here’s my question. How far are we willing to drag along the huge beast that is a statically typed language? If you look at a language like Ruby or Python, they already have most of the features that C# had to add to make this happen, but in Ruby and Python they’re not half as complicated. In Ruby you could already add methods to existing classes, and anonymous methods (in Ruby known as blocks) are something a Ruby programmer breathes. Anonymous types? The var keyword? Generics? Don’t need those.

If we want to carry on in the direction that LINQ is heading, and I think we should, shouldn’t we sacrifice this one thing: static typing? That would make things a lot simpler in many ways, and the sacrifice may just be worth it.