Using Regular Expressions in C# vs PHP

Posted by Keith Elder | Posted in .Net, PHP | Posted on 05-12-2007

Recently at work I put out a call to all of the engineers to send me code they thought should be placed in the common library for .NET development. In one of the responses, a PHP developer shared a regular expression to check the format of money. He said he didn’t know how often we use them (regular expressions) in C#, but he thought he’d pass it along anyway. That got me thinking: do I use regular expressions as much in C# as I did in PHP?

For those who are curious, the regular expression that was sent over works perfectly well in C#, no problems. I think developers will find that regular expressions move easily from one language to the other. If you happen to be porting PHP regular expressions to C# or vice versa, you shouldn’t run into too many problems. For example, here is a quick test I threw together to make sure the regular expression worked.

using System;
using System.Text.RegularExpressions;

class Money
{
    public static void Main()
    {
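        // Money format: a leading 1-9 plus up to two more digits, thousands groups
        // with an optional comma, then a required decimal point and up to two digits.
        Regex exp = new Regex(@"^[1-9][0-9]{0,2}(,{0,1}[0-9]{3})*(\.[0-9]{0,2})$");
        Console.WriteLine(exp.IsMatch("123.23")); // prints True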
    }
}

When compiled and run, it yielded the following result:

[image: console output showing True]

It worked. I have used regular expressions in C#, but I find they are more like pepper sprinkled in a good stew than a main course. That is how I would describe the difference in their use between the two languages. If I had to put a number on it, I would say I use regular expressions at least 50% less in C#, and perhaps as much as 90% less depending on the situation. Why?

I think it boils down to the reason that C# is strongly typed and PHP is loosely typed.  For example, the following code could never be written in C#. 

<?php
$x = 1;
$y = '2keith';

echo $x + $y;  // will print 3 (yes you can add strings and numbers in php)
// set $x to something completely different
$x = array('a', 'b', 'c', 'd');

echo $x[1]; // will print b
?>

Running the entire script prints: 3b

To some this example may be scary; others may look at it as a feature. Whichever way you see it, most developers give all sorts of answers when asked what the result will be. When I taught PHP classes as a consultant I would use a similar example with the class. Without exception, no one could predict the outcome. Answers ranged from “it should throw an exception since you can’t add a string and a number” to “it should throw an exception because the variable $x was reassigned to a different type.”

Do you see now why PHP code relies on regular expressions more? Since $x can literally become any type at any time, a PHP developer can never rely on the fact that $x is an integer. A regular expression is a common way to check that value. In the PHP world this behavior is called Type Juggling. Conversely, in C#, once the variable x is declared with a type, that type cannot change and only valid values of that type can be assigned to it, which eliminates the need for a regular expression to check the value of the variable.
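To make the contrast concrete, here is a small sketch of what happens when the PHP example is attempted in C#. The class name TypeDemo is just a placeholder I made up, and the two commented-out assignments are left commented because the compiler rejects them.

```csharp
using System;

class TypeDemo
{
    public static void Main()
    {
        int x = 1;

        // Neither of these compiles: x was declared as int and stays an int.
        // x = "2keith";
        // x = new[] { "a", "b", "c", "d" };

        // Text coming from outside the program has to be converted explicitly,
        // and the conversion reports failure instead of guessing a number.
        bool ok = int.TryParse("2keith", out int y);

        Console.WriteLine(ok);    // False
        Console.WriteLine(x + y); // 1, because y is set to 0 when parsing fails
    }
}
```

The point is that the mistake shows up either at compile time or as an explicit false from TryParse, rather than as a silently converted value.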

The question then becomes: is a regular expression the C# way to test a money value? I would argue it isn’t the “best” way to handle money in C#. While it certainly works, there are other things to take into consideration when working with money. For example, two different currencies such as US Dollars and Euros cannot be added together directly. One must first be converted to the other and then added. The same could be said of other operators applied to a variable of type money. This is where it would be suitable to use a struct and create a new type called Money.
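As a rough illustration of that idea, here is a minimal sketch of such a struct. The CurrencyCode enum and the shape of Money are my own assumptions for illustration, not taken from any library, and the sketch uses newer C# syntax (readonly struct, expression-bodied members). It only covers addition and does no currency conversion.

```csharp
using System;
using System.Globalization;

// Hypothetical currency list, for illustration only.
public enum CurrencyCode { USD, EUR }

public readonly struct Money
{
    public CurrencyCode Currency { get; }
    public decimal Amount { get; }

    public Money(CurrencyCode currency, decimal amount)
    {
        Currency = currency;
        Amount = amount;
    }

    // Adding two amounts is only allowed when the currencies match.
    public static Money operator +(Money a, Money b)
    {
        if (a.Currency != b.Currency)
        {
            throw new InvalidOperationException(
                "Cannot add " + a.Currency + " to " + b.Currency + " without converting first.");
        }
        return new Money(a.Currency, a.Amount + b.Amount);
    }

    public override string ToString() =>
        Amount.ToString("0.00", CultureInfo.InvariantCulture) + " " + Currency;
}

class MoneyDemo
{
    public static void Main()
    {
        Money a = new Money(CurrencyCode.USD, 10.50m);
        Money b = new Money(CurrencyCode.USD, 4.25m);
        Console.WriteLine(a + b); // 14.75 USD

        try
        {
            Console.WriteLine(a + new Money(CurrencyCode.EUR, 1m));
        }
        catch (InvalidOperationException ex)
        {
            // Mixed currencies are rejected by the type itself.
            Console.WriteLine(ex.Message);
        }
    }
}
```

Because the rule lives inside the type, every piece of code that adds money gets the currency check for free.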

In C# we can also declare a variable of type decimal and use it as money if we choose. In this case we still don’t need a regular expression to validate the value of our variable. Here’s a sample showing one way to handle a rogue value:

            decimal money;
            if (Decimal.TryParse("123.a234", out money))
            {
                Console.WriteLine("money is valid");
            }
            else
            {
                Console.WriteLine("money is invalid");
            }
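If the incoming text may carry a currency symbol or thousands separators, TryParse can be told to accept them. The following is a sketch under the assumption of US-style formatting; the ParseMoney class and TryParseUsd helper are names I made up for this example, while NumberStyles.Currency and NumberFormatInfo are part of System.Globalization.

```csharp
using System;
using System.Globalization;

class ParseMoney
{
    // Assumption: "$" as the symbol, "," for thousands, "." for decimals.
    private static readonly NumberFormatInfo UsFormat = CreateUsFormat();

    private static NumberFormatInfo CreateUsFormat()
    {
        // Clone the invariant format (which is read-only) and set the symbol.
        NumberFormatInfo format =
            (NumberFormatInfo)CultureInfo.InvariantCulture.NumberFormat.Clone();
        format.CurrencySymbol = "$";
        return format;
    }

    public static bool TryParseUsd(string text, out decimal amount)
    {
        return decimal.TryParse(text, NumberStyles.Currency, UsFormat, out amount);
    }

    public static void Main()
    {
        foreach (string s in new[] { "123.23", "$1,234.50", "123.a234" })
        {
            bool ok = TryParseUsd(s, out _);
            Console.WriteLine(s + " -> " + (ok ? "valid" : "invalid"));
        }
    }
}
```

Note that parsing is more lenient than the regular expression shown earlier: it does not limit the number of decimal places, so a format check such as "exactly two decimals" would still need its own rule.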

I suspect a lot of programmers use this method but again a struct is more desirable.  Andre de Cavaignac has a great example of building a struct for a money type.  He provides these examples:

            Money eur10 = new Money(MoneyCurrency.Euro, 10);
            Money eurNeg10 = new Money(MoneyCurrency.Euro, -10);
            Money usd10 = new Money(MoneyCurrency.USDollar, 10);
            Money usdZero = new Money(MoneyCurrency.USDollar, 0);

            bool result;
            result = (eur10 == usd10); // returns false
            result = (eur10 > usd10); // throws InvalidOperationException (comparison not valid)
            result = (eur10 > Money.Zero); // returns true
            result = (eur10 > usdZero); // returns true
            result = (usd10 > eurNeg10); // returns true (positive always greater than negative)

Obviously he’s put a lot of thought into how money should be handled, and if you look at his library you’ll see he accounts for many different types of currencies.

For those wondering about the differences in how regular expressions are used in PHP and C#, I hope this gives you some insight into how the two languages handle these situations. It all boils down to strong typing vs loose typing and the ability to create new types based on structs. Happy Holidays!

Comments (4)

Handling raw string data (whether from user input or some other place) many times requires extensive use of regular expressions, both in C# and php. Especially when working with complex data.

Good article

I would add something that might increase the use of regular expressions in C#. It’s not something I am easily reminded of, but it did come up just now. Hence the reason I’m on this topic.

I am looking for the money expression without going through a gazillion combinations myself. Why am I having to look for a regular expression in C# if I could do all the things you specified?

Because I’m using the validation control on text boxes in ASP.NET. It requires the regular expression. Alive and kicking when validating with a common control that requires input for validation. Thought I would add a situation that just came up that would qualify more use of Regular Expressions.

So would Type Casting in PHP force a variable to be strongly typed?

In other words, is there a way to force a variable to be a specific type in PHP? It would be a great feature of the language to be able to switch between loosely typed and strongly typed.

The Daily Find