Programming Standards, Naming Conventions

It never ceases to amaze me how different naming conventions are from language to language.  It seems each language has its own coding convention or style that gets adopted over time.  More often than not these naming conventions are established by authors in books, magazines, and Internet articles, and developers who read those publications follow suit the majority of the time.  For example, I can remember years ago developers who wrote PHP named the files they included in their programs with the extension ".inc" while other files ended in ".php".  The huge downside was that the .inc files would be passed straight through by an Apache server as plain text instead of getting parsed as code.  This allowed would-be hackers to see the code behind a web site.  The worst part is that a lot of the time the database username and password were visible in plain text.  Why was this done?  It was done because Rasmus (the guy who wrote PHP) put it in his samples.  Once it was printed, it became a standard that took many years to correct.

After programming in a language for several years one tends to adopt a certain style, and I am no exception.  It is funny how caught up we get in how things are named.  Our brains work on consistency though.  If I have to read code someone else wrote that doesn't follow my naming conventions and style, I am out of my comfort zone and have to do a double translation: first I translate their code into what I would have written so I can understand the problem, and then I read their code again and translate that into what it is actually doing.  The worst examples of this are when you have to read code written by a developer who is new to the language and brings along the style of the language they've used for years.  It is painful to read code that isn't written in the "implied" standard of the language.  By implied I mean that if you look at IntelliSense for .NET objects, enumerations, namespaces, methods and so on you'll quickly realize that everything starts with a capital letter.  A developer learning a programming language from scratch will pick up on this and more than likely follow this implied naming convention.  However, if a Java or PHP/Perl/Bash/Ruby programmer writes .NET code, he or she will more than likely write it the same way they did in their previous language.  Example:

Typical .NET Version

        public int AddTwoNumbers(int x, int y)
        {
            return x + y;
        }

Typical Java Version

        public int addTwoNumbers(int x, int y)
        {
            return x + y;
        }

Typical Scripting Language Version

        public int add_two_numbers(int x, int y)
        {
            return x + y;
        }

Three distinct ways to name a method.  The only difference between the first two is the casing of the first letter, but the third example makes it more obvious how differently each platform writes its code.  I've had the pleasure, I mean pain, of reading .NET code written or translated by developers who are new to .NET.  Assuming they haven't changed how they write their code, I can tell within a few lines which language the person translated the code from or wrote in previously.  I suspect most others can as well.

When developers transition to .NET or learn .NET it amazes me how many of them deviate from the implied naming conventions.  For example, all namespaces in .NET are Pascal case with no underscores, yet I see developers coming from a Java background creating namespaces like "com.something.foo", breaking this convention instead of following what was established before they got there.  I find the .NET framework very consistent in how it is written and coded.  If you are learning .NET, the best rule you can follow is to first follow the implied naming conventions.

I like standards, especially when moving developers in and out of development teams.  It amazes me how much longer it takes me to understand what a developer is doing when the code isn't formatted or named the way I prefer.  This can be anything from how curly braces are structured to variable naming conventions and much more.  I have had situations where I had to refactor a person's code before I could understand it well enough to help them.  More often than not, while I am refactoring, the person will find their own mistake because of how I refactored it (simplicity is genius).  Naming conventions are good because they help others quickly understand what is going on.  Again I go back to my point that if someone writes in a style you don't follow, you are doing double the work to understand it.

Here's an example of how subtle differences in variable naming can be confusing.  If I see a property of an object called "AddressCollection" I know immediately this is a generic list of Address objects.  Easy, Forrest Gump, no questions asked.  Conversely, if I see "Addresses" as a property I have no idea what this is.  It could be a collection or just an object.  I've taught many a programming class and I've seen a lot of code written by complete newbie programmers.  It always amazed me how I could write code they couldn't understand just by changing my "style" or naming conventions.  Folks, this stuff does matter, especially in larger team environments.  If you don't believe me, write a program with variables named after Flintstone characters, give it to a Senior Engineer, and see how long it takes them to figure out what it does compared to the same program with sensible names.
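To make the property-naming point concrete, here is a minimal sketch.  The Address and Customer classes are hypothetical, invented only for illustration; the point is that the "Collection" suffix tells the reader what the property holds before they look at its type.

```csharp
using System.Collections.Generic;

// Hypothetical types, invented only to illustrate the naming point.
public class Address
{
    public string Street { get; set; }
    public string City { get; set; }
}

public class Customer
{
    private readonly List<Address> _addressCollection = new List<Address>();

    // The "Collection" suffix signals a list of Address objects
    // without the reader having to inspect the declared type.
    public List<Address> AddressCollection
    {
        get { return _addressCollection; }
    }
}
```

A caller would then write something like customer.AddressCollection.Add(...), which reads unambiguously as adding to a list.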

There are a lot of preferences when you talk to developers about how they code.  I personally like what makes sense and is as Gump as I can get it.  I also like to follow suit with how other things are named to provide consistency.  So what are my preferences?  Well, I won't go into each and every one, but here are a few of my stronger opinions.

Variable Prefixes
A controversial topic no doubt, but I prefix all private variables with an underscore.  I do this because it is easier to read.  It also makes the private name differ from the public one by a character rather than by case alone, which follows the rule of not distinguishing private and public members only by case.  Example:

        private string _propertyName;
        public string PropertyName
        {
            get { return _propertyName; }
            set { _propertyName = value; }
        }

I've seen a lot of code written and published on the Internet that prefixes variables with "m_".  Most of the time this is due to legacy habits.  For those that do this, please stop, you aren't coding in VB6 anymore.   It boils down to simplicity.  I use one character, it avoids naming collisions and is easy on the eyes to read and less distracting. 

I have also seen code where developers prefix a variable with an abbreviation of its type.  For example, if they are creating an array they'll write arrMyVariable, or if the variable is a string they'll write strMyVariable.  Usually when I see this type of code it tells me the person came from a scripting language background, where a variable can literally hold a different type in the next statement.  This is something we used to do in Perl: the prefix told other developers what was in the variable, or what was supposed to be in it, so they wouldn't try to change the type.
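As a rough illustration of why the prefix adds little in a statically typed language, here is a small sketch.  The class and method names are hypothetical; both methods do exactly the same thing, and in the second one the declared parameter types already carry the information the prefixes were repeating.

```csharp
public static class PrefixExample
{
    // Type-prefixed names, as carried over from a scripting background.
    public static string JoinPrefixed(string[] arrStreets, string strSeparator)
    {
        return string.Join(strSeparator, arrStreets);
    }

    // Same logic without prefixes; the compiler enforces the types,
    // so the names are free to describe meaning instead.
    public static string Join(string[] streets, string separator)
    {
        return string.Join(separator, streets);
    }
}
```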

Curly Braces
Another controversial topic is curly braces.  I line curly braces up on top of each other (each gets its own line).  Some may not agree with this, but I find it extremely easy to debug sections since the braces line up and allow the eyes to see blocks of code quicker.  When you get into nested situations it is easier to see the "blocks".  For those that don't agree, I have tested this in many programming classes I've taught, and students who are learning a language from scratch agree 100% of the time that lining them up is easier.  Here's what I'm talking about, with some random code that makes no sense and isn't supposed to.  Two versions of it with a lot of nested blocks.

HARD TO READ

        private void MyFunction(int x) {
            if (x % 2 == 0) {
                for (int i = 0; i < x; i++) {
                    if (i % 2 == 0) {
                        // do something
                    }
                    else {
                        // do something
                        for (int j = 0; j < x * i; j++) {
                            // something else
                            if (DateTime.Today.DayOfWeek == DayOfWeek.Saturday) {
                                Console.WriteLine("Take the day off.");
                            }
                        }
                    }
                }
            }
        }

EASIER TO READ

        private void MyFunction(int x)
        {
            if (x % 2 == 0)
            {
                for (int i = 0; i < x; i++)
                {
                    if (i % 2 == 0)
                    {
                        // do something
                    }
                    else
                    {
                        // do something
                        for (int j = 0; j < x * i; j++)
                        {
                            // something else
                            if (DateTime.Today.DayOfWeek == DayOfWeek.Saturday)
                            {
                                Console.WriteLine("Take the day off.");
                            }
                        }
                    }
                }
            }
        }

While the second example is longer in terms of lines of code (this is why God created regions) one thing you can see easily is how nested it is and how the code breathes more.  The first approach looks like it is cramming the code together.  Like I've said, I have tested this in every programming class I've taught and the consensus is the second example is always preferred and easier to understand.  The nice thing is for those using Visual Studio this is the out of the box behavior. 

If Statements
A pet peeve of mine is when developers shortcut if blocks.  Here is what I mean.

If Block Shortcut

    public void Foo(int x)
    {
        if (x % 2 == 0) Console.WriteLine(x.ToString() + " is even");
    }

If you have a single line for an if block you can shortcut it and leave off the curly braces.  It is a feature, yeah.  Some other languages support this and C# does as well.  Personally I can't stand this for several reasons.  One, debugging.  If I want a breakpoint that fires only when the Console.WriteLine is hit, clicking in the margin doesn't give me one; the breakpoint covers the whole line and I have to check the condition each time it stops.  Another reason is that if the business logic changes and I have to add an else branch, the statement has to be reformatted anyway.  Go ahead and add the curly braces up front instead of waiting.  The third reason is that when other statements follow the if, it is hard to tell from reading the code what the original author intended to be part of the condition.  This also ties into the readability of the code as outlined in the curly braces section.  Bottom line, don't shortcut just because you can.
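For comparison, here is a minimal sketch of the same check written with braces from the start.  The EvenOddPrinter class and its Describe method are hypothetical names used only for this illustration, and the method returns the message instead of printing it so the result is easy to check.  Each branch sits on its own line, so a breakpoint can go on exactly the statement of interest, and the else branch was added without reformatting anything.

```csharp
public static class EvenOddPrinter
{
    // Hypothetical helper: returns the message rather than writing it to the console.
    public static string Describe(int x)
    {
        if (x % 2 == 0)
        {
            return x.ToString() + " is even";
        }
        else
        {
            return x.ToString() + " is odd";
        }
    }
}
```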

Namespaces
I mentioned this earlier but I'll mention it again.  Namespaces should be Pascal case.  A namespace should be something that is agreed on BEFORE you write code for a project.  The standard is CompanyName[.Team][.SubTeam].Project[.Feature][.Design].  Example:  Microsoft.Practices.EnterpriseLibrary.Caching.  Of course you can leave off the team, subteam, feature or design if you choose, depending on how global the project is.  Don't worry if you do that.  ScottGu isn't going to show up at your doorstep asking you to change it.  There aren't any hard and fast rules; these are just the "implied what Microsoft does" rules.  For more detailed information, Microsoft has a reference page about namespaces available on MSDN.
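As a small sketch of that pattern, the namespace below uses a made-up company ("Contoso"), project ("Billing") and feature ("Invoicing"); none of these names come from a real library, and the InvoiceFormatter class exists only to give the namespace something to hold.

```csharp
// Pascal case, no underscores: CompanyName.Project.Feature
// (all three names are hypothetical).
namespace Contoso.Billing.Invoicing
{
    public class InvoiceFormatter
    {
        // Pads the invoice number to five digits, e.g. 42 becomes "INV-00042".
        public string Format(int invoiceNumber)
        {
            return "INV-" + invoiceNumber.ToString("D5");
        }
    }
}
```

A Java-style equivalent such as com.contoso.billing would compile just as well in C#, which is exactly why the convention has to be agreed on rather than relied on the compiler to enforce.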

Naming Preferences
This is a large topic, or can be.  So instead of re-hashing this on my own, go read Peter Brown's article on naming conventions.  It is fantastic.  This is one of the few articles where I can say I agree with 96.78582762% or more of what it states.  I imagine if I read Peter's code it would look exactly like mine in terms of naming preferences.  Beyond Peter's article there are also naming guidelines already established on MSDN that outline how to name classes, enumerations, parameters, properties, events, and so on.  If you are wondering what I do, for the most part I follow those guidelines.
