Programming Standards, Naming Conventions
Posted by Keith Elder | Posted in .Net, Programming | Posted on 30-11-2007
It never ceases to amaze me how different naming conventions are from language to language. It seems each language has its own coding convention or style that gets adopted over time. More often than not these naming conventions are established by authors in books, magazines, and Internet articles, and developers who read those publications follow suit the majority of the time. For example, I can remember years ago developers who wrote PHP gave the files they included in their programs the extension “.inc” while other files ended in “.php”. The huge downside was that, by default, the .inc files would be passed straight through by an Apache server as plain text instead of being parsed as code. This would allow would-be hackers to see the code to a web site. The worst part is that a lot of the time the database username and password were visible in plain text. Why was this done? It was done because Rasmus (the guy who wrote PHP) put it in his samples. Once it was printed, it became a standard that took many years to correct.
After programming in a language for several years one tends to adopt a certain style, and I am no exception to that. It is funny how caught up we get in how things are named. Our brains work on consistency though. If I have to read code someone else wrote that doesn’t follow my naming conventions and style, I am out of my comfort zone and have to do a double translation: first translate their code into what I would have written so I understand the problem better, and then read their code and translate that into what it is actually doing. The worst examples of this are when you have to read code written by a developer who is new to the language and brings along the style of the language they’ve used for years. It is painful to read code that isn’t written in the “implied” standard of the language. By implied I mean that if you look at Intellisense for .Net objects, enumerations, namespaces, methods and so on, you’ll quickly realize that everything starts with a capital letter. A developer learning a programming language from scratch will pick up on this and more than likely follow this implied naming convention. However, if a Java or PHP/Perl/Bash/Ruby programmer writes .Net code, he or she will more than likely write it the same way they did in their previous language. Example:
Typical .Net Version
public int AddTwoNumbers(int x, int y) { return x + y; }
Typical Java Version
public int addTwoNumbers(int x, int y) { return x + y; }
Typical Scripting Language Version
public int add_two_numbers(int x, int y) { return x + y; }
Three distinct ways to name a method. The only difference between the first two is the casing of the first letter, but the third example makes it more obvious how differently each platform writes its code. I’ve had the pleasure, I mean pain, of reading .Net code written or translated by developers who are new to .Net. Assuming they haven’t changed how they write their code, I can tell within a few lines which language the person translated the code from or wrote in previously. I suspect most others can as well.
When developers transition to .Net or learn .Net it amazes me how many of them deviate from the implied naming conventions. For example, all namespaces in .Net are Pascal case with no underscores yet I see developers coming from a Java background creating namespaces like “com.something.foo” and breaking this convention instead of following what has been established before they got there. I find the .Net framework very consistent in how it is written and coded. If you are learning .Net the best rule you can follow is first follow the implied naming conventions.
I like standards, especially when moving developers in and out of development teams. It amazes me how much longer it takes me to understand what a developer is doing when I look at code that isn’t formatted or named the way I prefer. This can range from how curly braces are structured to variable naming conventions and much more. I have had situations where I had to refactor a person’s code before I could understand it well enough to help them. More often than not, while I am refactoring, the person will find their own mistake because of how I refactored it (simplicity is genius). Naming conventions are good because they help others quickly understand what is going on. Again I go back to my point that if someone writes in a style you don’t follow, you are doing double the work to understand it.
Here’s an example of how subtle differences in naming can be confusing. If I see a property of an object called “AddressCollection” I know immediately this is a generic list of Address objects. Easy, Forrest Gump, no questions asked. Conversely, if I see “Addresses” as a property I have no idea what it is. It could be a collection or just a single object. I’ve taught many a programming class and I’ve seen a lot of code written by complete newbie programmers. It always amazed me how I could write code they couldn’t understand just by changing my “style” or naming conventions. Folks, this stuff does matter, especially in larger team environments. If you don’t believe me, write a program with variables named after Flintstone characters, give it to a Senior Engineer, and see how long it takes them to figure out what it does versus one with meaningful names.
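As a rough sketch of that difference, assume hypothetical Address and Customer classes (neither comes from a real library or from the original post). The property name alone tells the reader that a list comes back:

```csharp
using System.Collections.Generic;

// Hypothetical type, made up for illustration.
public class Address
{
    public string City { get; set; }
}

// Hypothetical type, made up for illustration.
public class Customer
{
    private readonly List<Address> _addressCollection = new List<Address>();

    // The "Collection" suffix signals a list of Address objects
    // before the reader ever looks at the declared type.
    public List<Address> AddressCollection
    {
        get { return _addressCollection; }
    }
}
```

A caller can then write customer.AddressCollection.Add(new Address { City = "Hattiesburg" }) without checking the declaration first. The city name here is just sample data.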
There are a lot of preferences when you talk to developers about how they code. I personally like what makes sense and is as Gump as I can get it. I also like to follow suit with how other things are named to provide consistency. So what are my preferences? Well, I won’t go into each and every one, but here are a few of my stronger opinions.
Variable Prefixes
A controversial topic no doubt, but I prefix all private variables with an underscore. I do this because it is easier to read. It also makes the private name differ by one character, which follows the rule of not having private and public variables differ only by case. Example:
private string _propertyName;
public string PropertyName
{
    get { return _propertyName; }
    set { _propertyName = value; }
}
I’ve seen a lot of code written and published on the Internet that prefixes variables with “m_”. Most of the time this is due to legacy habits. For those that do this, please stop, you aren’t coding in VB6 anymore. It boils down to simplicity. I use one character: it avoids naming collisions, is easy on the eyes, and is less distracting.
I have also seen code where developers prefix a variable with an abbreviation of its type. For example, if they are creating an array they’ll write arrMyVariable, or if the variable is a string they’ll write strMyVariable. Usually when I see this type of code it tells me the person came from a scripting language background, where a variable can literally be a different type in the next statement. This is something we used to do in Perl, where it helped other developers know what was in the variable, or was supposed to be in it, so they wouldn’t try to change the type.
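To illustrate why the prefix adds little in a statically typed language, here is a minimal sketch using a hypothetical PrefixDemo class (the class and method names are invented for this example). Both methods do the same thing; in the second the declaration already carries the type information the prefix was repeating:

```csharp
// Hypothetical class, made up for illustration.
public static class PrefixDemo
{
    // Type-prefixed names: "arr" and "str" restate what the declarations say.
    public static string JoinPrefixed(string[] arrNames)
    {
        string strResult = string.Join(", ", arrNames);
        return strResult;
    }

    // Plain names: the compiler still enforces the types.
    public static string JoinPlain(string[] names)
    {
        string result = string.Join(", ", names);
        return result;
    }
}
```

In C#, trying to assign an int to either result variable fails at compile time, so the prefix is not protecting anyone from changing the type.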
Curly Braces
Another controversial topic is curly braces. I line curly braces up on top of each other (each gets its own line). Some may not agree with this, but I find it extremely easy to debug sections since the braces line up and allow the eyes to see blocks of code quicker. When you get into nested situations it is easier to see the “blocks”. For those that don’t agree, I have tested this in many programming classes I’ve taught, and students who are “learning” a language from scratch agree 100% of the time that lining them up is easier. Here’s what I’m talking about, with some random code that makes no sense and isn’t supposed to. Two versions of it with a lot of nested blocks.
HARD TO READ
private void MyFunction(int x) {
    if (x % 2 == 0) {
        for (int i = 0; i < x; i++) {
            if (i % 2 == 0) {
                // do something
            } else {
                // do something
                for (int j = 0; j < x * i; j++) {
                    // something else
                    if (DateTime.Today.DayOfWeek == DayOfWeek.Saturday) {
                        Console.WriteLine("Take the day off.");
                    }
                }
            }
        }
    }
}
EASIER TO READ
private void MyFunction(int x)
{
    if (x % 2 == 0)
    {
        for (int i = 0; i < x; i++)
        {
            if (i % 2 == 0)
            {
                // do something
            }
            else
            {
                // do something
                for (int j = 0; j < x * i; j++)
                {
                    // something else
                    if (DateTime.Today.DayOfWeek == DayOfWeek.Saturday)
                    {
                        Console.WriteLine("Take the day off.");
                    }
                }
            }
        }
    }
}
While the second example is longer in terms of lines of code (this is why God created regions) one thing you can see easily is how nested it is and how the code breathes more. The first approach looks like it is cramming the code together. Like I’ve said, I have tested this in every programming class I’ve taught and the consensus is the second example is always preferred and easier to understand. The nice thing is for those using Visual Studio this is the out of the box behavior.
If Statements
A pet peeve of mine is when developers shortcut if blocks. Here is what I mean.
If Block Shortcut
public void Foo(int x)
{
    if (x % 2 == 0) Console.WriteLine(x.ToString() + " is even");
}
If you have a single line for an if block you can shortcut it and leave off the curly braces. It is a feature, yeah. Some other languages support this and C# does as well. Personally I can’t stand this, for several reasons. One, debugging. If I want to set a break point that is hit only when Console.WriteLine runs, I can’t. I have to set it on the one line and check it each time. Another reason this disturbs me is that if the business logic changes and I have to add an else branch, it has to be reformatted anyway. Go ahead and add the curly braces up front instead of waiting. The third reason is that when other statements follow the if, it is confusing to read and hard to know what the person who originally wrote it intended. This also ties into the readability of the code as outlined in the curly braces section. Bottom line, don’t shortcut just because you can.
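The third reason is easier to see in a sketch. Assuming a hypothetical BraceDemo class (the names are made up for illustration), the first method gains a second line under a braceless if, and the indentation suggests something the compiler does not do:

```csharp
using System.Collections.Generic;

// Hypothetical class, made up for illustration.
public static class BraceDemo
{
    // Braceless if: only the first statement after it is conditional.
    public static List<string> WithoutBraces(int x)
    {
        var log = new List<string>();
        if (x % 2 == 0)
            log.Add(x + " is even");
            log.Add("handled even number"); // indented, but runs for every x
        return log;
    }

    // Braces make the intended block explicit.
    public static List<string> WithBraces(int x)
    {
        var log = new List<string>();
        if (x % 2 == 0)
        {
            log.Add(x + " is even");
            log.Add("handled even number");
        }
        return log;
    }
}
```

With x = 3, WithoutBraces still records "handled even number" because only the first statement is governed by the if, while WithBraces records nothing.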
Namespaces
I mentioned this earlier but I’ll mention it again. Namespaces should be Pascal case. A namespace should be something that is agreed on BEFORE you write code for a project. The standard is to use CompanyName[.Team][.SubTeam].Project[.Feature][.Design]. Example: Microsoft.Practices.EnterpriseLibrary.Caching. Of course you can leave off the team, subteam, feature or design if you choose, depending on how global the project is. Don’t worry if you do that. ScottGu isn’t going to show up at your doorstep asking you to change it. There aren’t any hard and fast rules; these are just the “implied what Microsoft does” rules. For more detailed information, Microsoft has a reference page about namespaces available on MSDN.
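A minimal sketch of that pattern, assuming a hypothetical company called Contoso with a Billing project and an Invoicing feature (all names here, including the class, are invented for illustration):

```csharp
// CompanyName.Project.Feature, Pascal case, no underscores.
namespace Contoso.Billing.Invoicing
{
    // Hypothetical class, made up for illustration.
    public class InvoiceNumberFormatter
    {
        // Pads the number to six digits, e.g. 42 becomes "INV-000042".
        public string Format(int number)
        {
            return "INV-" + number.ToString("D6");
        }
    }
}
```

A Java-style equivalent such as com.contoso.billing would compile in C#, but it breaks the convention every framework namespace follows.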
Naming Preferences
This is a large topic, or can be. So instead of rehashing it on my own, go read Peter Brown’s article on naming conventions. It is fantastic. This is one of the few articles where I can say I agree with 96.78582762% or more of what it states. I imagine if I read Peter’s code it would look exactly like mine in terms of naming preferences. Beyond Peter’s article, there are also naming guidelines already established on MSDN that outline how to name classes, enumerations, parameters, properties, events, and so on. If you are wondering what I do, for the most part I follow those guidelines.
If it takes you longer to debug someone’s code because they did not follow the naming standards that you like, that tells me you are really a mid-level programmer. I mean, if something like that SLOWS you down then you have other issues to deal with. IMHO. I have NEVER used naming standards to tell me what something means or does. That would be stupid to assume something like that.
Thanks for the great post! I’m just starting to move into a project lead role and was feeling a little timid about enforcing coding standards. You provided a lot of good justification for doing so.
I agree with most of what you wrote, and also name my private variables the same way, but live in constant fear that someone will call me a horrible person for not adding “m_str_warandpeace_” so that people coding in notepad can fully comprehend the meaning or something. While I style my “curly braces” in the same way that you do, I really don’t like the code example that you’re using, as you should never end up with all those control loops in one method.
Keith,
great article. I personally think the last point you made is the best one. Microsoft has great standards for these things, use them!
Also, once you start using these standards just enable code analysis and you can get compile time checking of these standards as well. Make it part of the build process and people will be forced to adopt a standard.
One thing that does make me laugh though, is the idea of the implicit conventions from Microsoft. I think there is even disagreement there. One of my favorites is the following items from the AjaxControlToolkit:
[System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Naming", "CA1706:ShortAcronymsShouldBeUppercase", Justification = "Following ASP.NET AJAX pattern")]
public string PopupControlID
You’ll find things like that everywhere. Code analysis said to do one thing, yet Microsoft doesn’t follow the convention.
Also, while you have tons of great things here, you haven’t even gotten into where to place the code within a given scope (where variables go versus properties in a class, or where to place locally scoped variables within a method). You also didn’t mention one of my favorites, which is how to region your code.
These items aren’t typically discussed by Microsoft in their standards, but following a good standard on regions and code placement can make code far easier to read and understand.