Balancing Readability and New Syntax Sugar in C# 3.0
Posted by Keith Elder | Posted in .Net | Posted on 08-10-2008
I’ve been struggling with something the past several weeks since I’ve been heads down programming some new features at work. It is the first time in awhile I’ve really had a chance to dig in and use the latest and greatest features in .Net 3.5 and C# 3.0 in a production environment. Like most developers, I find myself struggling to remember to use the new features in the language since I’ve been doing things the same way for years. I have to constantly remind myself that I could do things differently than I’ve done them before. Anonymous methods, automatic properties, lambda expressions, LINQ and other new features just to mention a few.
Here’s what I’m struggling with. Where do you draw the line in the sand? At which point do you say “Just because you can doesn’t mean you should?”. In other words, how do you balance readability versus using new language features. The absolute simplest example I’ve run across recently is as follows.
I needed to get the first IP4 Address of the current machine. A computer can have multiple addresses (not only IP4 but IP6 addresses too). To start with I already had a method that would validate an IP4 address. Here’s the method.
1: public static bool ValidateIP4Address(string ipAddress)
2: {
3: string ipRegex = @"^(2[0-5]{2}|2[0-4]\d|1\d{2}|[1-9]\d|\d)\." +
4: @"(2[0-5]{2}|2[0-4]\d|1\d{2}|[1-9]\d|\d)\." +
5: @"(2[0-5]{2}|2[0-4]\d|1\d{2}|[1-9]\d|\d)\." +
6: @"(2[0-5]{2}|2[0-4]\d|1\d{2}|[1-9]\d|\d)(?:/(3[012]|[12]\d|\d))?$";
7: Regex regex = new Regex(ipRegex);
8: return regex.IsMatch(ipAddress);
9: }
As I solved this problem I knew what I wanted to do. I wanted to loop through all IP Addresses on the machine and take the first IP4 Address. The solution is simple but it can be done numerous ways, especially when you start using new C# 3.0 features. I started writing the code and here’s what I came up with using a Lambda expression (a new C# feature).
1: private static void GetIpUsingLambda()
2: {
3: string ipAddy = Dns.GetHostAddresses(Dns.GetHostName()).First(i => ValidateIP4Address(i.ToString())).ToString();
4: Console.WriteLine(ipAddy);
5: }
After writing it I checked my unit test and it passed. I then switched gears in my brain and starting looking at it as if I had seen it for the first time. Sure it looks cool and complicated but what the heck is it doing? There is a plethora of parentheses, nested methods and dots. Honestly, it isn’t easy to read. But it is in one line!
Then I rewrote it like this to hopefully make it more “readable”.
1: private static void GetIpUsingLambda()
2: {
3: string ipAddy = Dns.GetHostAddresses(
4: Dns.GetHostName()
5: ).First(
6: i => ValidateIP4Address(i.ToString())
7: ).ToString();
8: Console.WriteLine(ipAddy);
9: }
Is it more readable? I don’t know to be honest. I’m struggling with it myself. It is only marginally easier to read since it is somewhat broken up by the parentheses.
I decided to then write it “old school” just for comparison.
1: private static void GetIpOldWay()
2: {
3: IPAddress[] addresses = Dns.GetHostAddresses(Dns.GetHostName());
4: foreach (IPAddress addy in addresses)
5: {
6: if (ValidateIP4Address(addy.ToString()))
7: {
8: Console.WriteLine(addy.ToString());
9: return;
10: }
11: }
12: }
Line #3 gets all IP addresses on the machine. Then we loop through each one calling ValidateIP4Address(). If we find a match we stop on the first one.
Here’s the question, which one is more readable and understandable at first glance, the lamba or the old way? I’ll let you decide.
Of course we could take the GetIpOldWay() and throw in some new C# features using the keyword “var”.
1: private static void GetIpUsingVar()
2: {
3: var addresses = Dns.GetHostAddresses(Dns.GetHostName());
4: foreach (var addy in addresses)
5: {
6: if (ValidateIP4Address(addy.ToString()))
7: {
8: Console.WriteLine(addy.ToString());
9: return;
10: }
11: }
12: }
Really there isn’t much difference. It is less typing for me the developer but that’s about it. Not much difference.
One more way we could write this is using LINQ.
1: private static void GetIpUsingLinq()
2: {
3: var ipAddy = Dns.GetHostAddresses(Dns.GetHostName());
4: var foo = (from i in ipAddy
5: where ValidateIP4Address(i.ToString())
6: select i).First();
7:
8: Console.WriteLine(foo);
9: }
Interesting.
For me the LINQ version above is the most interesting and I think the simplest to understand for someone looking at the problem for the first time. I think it expresses more of the “intent” of what I was doing. It is more typing compared to the Lamba expression in the GetIpUsingLambda() but it is easier on the eyes.
After writing all four variations on this simple real world problem I think I like the LINQ example the best. I am positive I’m not the only person out there struggling with this and I think it is a discussion that is happening in developers minds. Of course this whole discussion is just personal preference but I think maintaining readable code is something we should strive for in our systems. I also think if another developer had to change or modify the Lambda expression they’d spend a lot more time figuring out what it is doing compared to other samples, especially the LINQ example even though we are using new language features.
For developers throwing around Lambda expressions like they are candy I would question if it is truly the best and most maintainable code. Just because you can doesn’t mean you should in all cases. Definitely something to think about. Feel free to weigh in, I’m curious as to which example you find the easiest to understand.
UPDATE:
Several of us on IRC continued this conversation there yesterday and I wanted to capture some of that conversation.
Jay Wren brought up using AddressFamily which in this example would be faster. Here is a rewritten example using the Lambda expression using the AddressFamily enumeration.
1: private static void GetIpUsingLambda()
2: {
3: string ipAddy = Dns.GetHostAddresses(Dns.GetHostName()).Where(i => i.AddressFamily == AddressFamily.InterNetwork).First().ToString();
4: Console.WriteLine(ipAddy);
5: }
Nate Kohari added onto Jay’s work and took on the challenge of readability by suggesting an extension method of IsIP4() for the IPAddress.
1: public static class Extensions
2: {
3: public static bool IsIP4(this IPAddress i)
4: {
5: return i.AddressFamily == AddressFamily.InterNetwork;
6: }
7: }
Adding the extension method would then allow the Lambda and the LINQ example to really flow. They would now look like this.
1: private static void GetIpUsingLambdaWithExtension()
2: {
3: string ipAddy = Dns.GetHostAddresses(Dns.GetHostName()).Where(i => i.IsIP4()).First().ToString();
4: Console.WriteLine(ipAddy);
5: }
1: private static void GetIpUsingLinqExtension()
2: {
3: var foo = (from i in Dns.GetHostAddresses(Dns.GetHostName())
4: where i.IsIP4()
5: select i).First();
6:
7: Console.WriteLine(foo);
8: }
Now we are truly getting somewhere balancing the new features in C# 3.0 and readability. This final LINQ version is the winner in my book although the Lambda example makes my geek senses start to tingle.
Great experiment view! I’ve just learned this critical construction. It’s really hard and worthy reading. Thanks for letting us know.
Hi:
Agree. Sometimes as engineers we would like to be just right and can’t sleep until it is just the right way.
Your choice of LinQ appears more appropriate ’cause, the nature of the problem is to choose an item form a collection. My thinking is, considering the appropriate problem domain gives us some guidance.
@Harley Jones
Yes, the var keyword is type-safe at compile time. The compiler knows exactly what type is going to be assigned to that variable.
Further, anonymous types (e.g. those returned by some LINQ expressions are also compile time constructs. When you build your assembly you can take a look at the byte code and see that an honest to goodness .NET class was created to represent the anonymous type.
@everyone
This seems like an interesting discussion and really the topic seems to boil down to, “Is it better to use lambda expressions implicitly or explicitly”.
In both of the examples LINQ and Lambda expressions are being used.
In the first example the LINQ extension methods are being used explicilty and a Lambda Expression is being explicitly created.
In the second, the LINQ syntax is being used which the compiler will in turn translate into the same extension method calls and Lambda Expression.
To me it seems obvious that that implicit use of the extension methods and lambda expression is more readable.
If you want to take your latest update further into “Should I do this?” land you could use s func:
<pre>
Func<IPAddress, Boolean> isIP4 = i => i.IsIP4();
</pre>
And chuck that into the where clause:
<pre>
Dns.GetHostAddresses(Dns.GetHostName()).Where(isIP4).First().ToString()
</pre>
I try to only use “var” when it is clear what the resulting type is. For example:
var ex = new ArgumentOutOfRangeException(argName, msg);
instead of
ArgumentOutOfRangeException ex = new ArgumentOutOfRangeException(argName, msg);
In the posted code, I would stick with IPAddress[] to make it clear that you are ending up with an array without having to lookup “GetHostAddresses”.
I agree that Jay’s second post is the most readable. Your first lambda example reminds me of C programmers that used to see how much functionality they could cram on a single like.
If I have to study a line of code closely to figure out what it is doing I think it’s too complicated. Of course, it may be because we’re not ‘used’ to reading lambda expressions yet.
I have to question the wisdom of using the “var” keyword. I understand it’s supposed to be type-safe at runtime. But is it type-safe at compile time?
I think the extra keystrokes to explicitly define your type is worth it in terms of readability and maintainability.
I’m a big fan of JavaScript, and I think it handles implicit types very well. Then again, it’s a totally separate tool for a totally separate job.
I think that this is very readable:
var v4 = from i in Dns.GetHostAddresses(Dns.GetHostName()) where i.IsIpV4() select i;
v4.First().ToString()
with nkohari’s IsIpV4() xmethod
public static bool IsIpV4(this IPAddress addr) {
return addr.AddressFamily == AddressFamily.InterNetwork;
}
In fact, I think it is most readable to me.
I think that “readable” is very important, but it should be recognized as highly subjective.
Here is a more readable 2.0 version:
var firstAddress2 = Array.Find(Dns.GetHostAddresses(Dns.GetHostName()), delegate(IPAddress addr) { return addr.AddressFamily == System.Net.Sockets.AddressFamily.InterNetwork; });
Console.WriteLine(firstAddress2);
Here is a more readable 3.5/LINQ version:
var firstAddress = (from addr in Dns.GetHostAddresses(Dns.GetHostName()) where addr.AddressFamily==System.Net.Sockets.AddressFamily.InterNetwork select addr).First();
Console.WriteLine(firstAddress);