Looking for the previous guiStuff?

It's still here, the content didn't go anywhere. You may want to check out this new guiStuff though -- It's rather informative.

::Stuff for the multi-spec coder;

Coding, formats, standards, and other practical things.


JavaScript

Regular Expressions


Regular expressions are patterns (or, as the name suggests, 'expressions') that are used to search for and match text within a string.

In JavaScript we have a RegExp object to create regular expressions. Strictly speaking there are two kinds of RegExp object to tell apart: the predefined RegExp object (the constructor itself) and the individual RegExp objects created from it. The two work in conjunction with each other. The properties of the predefined RegExp are static, meaning they belong to RegExp itself rather than to any one expression, and they are updated after every successful match. The predefined object has properties but no matching methods (functions), whereas individual RegExp objects have both properties and methods. Note that these static properties are legacy features; modern code usually reads the same values from the result of a match.

RegExp.propertyname

The above is the syntax used to access the properties of the Predefined Regular expressions.

There are two ways to create Individual Regular Expressions:

re_name = new RegExp( "pattern", ["flags"] )

...and:

re_name = /pattern/[flags]

The first one is created with the RegExp() constructor function, and the other one is just a normal literal declaration.

The pattern is the text you are looking for in a string. It can contain ordinary letters, digits, and special characters, depending on what you want to find.

The flags parameter is optional; flags add more functionality to your pattern matching. For example, the i flag signals that the search should not be case-sensitive. The g flag stands for global and is used to find every match rather than stopping at the first one. Flags can be combined, so gi applies both of the above. A third flag, m (multiline), makes ^ and $ match at line breaks as well.
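To make the effect of the flags concrete, here is a small sketch that runs the same pattern with no flag, with i, with g, and with gi. The sample text and variable names are made up for illustration.

```javascript
const text = "Hot tea, hot soup, HOT stove"; // made-up sample text

const first = text.match(/hot/)[0];         // no flag: first case-sensitive match
const firstAnyCase = text.match(/hot/i)[0]; // i: ignore case, still first match only
const all = text.match(/hot/g);             // g: every case-sensitive match
const allAnyCase = text.match(/hot/gi);     // gi: every match, any case

console.log(first);        // hot
console.log(firstAnyCase); // Hot
console.log(all);          // [ 'hot' ]
console.log(allAnyCase);   // [ 'Hot', 'hot', 'HOT' ]
```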

An important point to note in the syntax is the quotes: in the literal notation the pattern sits between slashes and the flags follow without any quotes, whereas the constructor takes the pattern and the flags as strings, so they are wrapped in quotes (single or double). Because the constructor pattern is a string, any backslash in it has to be doubled, for example "\\d" for \d.

The methods of individual regular expressions, test() and exec(), are covered with the examples further down. First, let's check out the pattern syntax of our regular expressions. If we are looking for a complete word in a string, we can simply write it between the slashes.
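As a minimal sketch of the two notations side by side, the snippet below writes one pattern both ways. The variable names and the search term are hypothetical; the point to watch is that a backslash must be doubled inside the constructor's string.

```javascript
// The same pattern written both ways (names are illustrative only).
const literalForm = /\d{4}/g;
const constructorForm = new RegExp("\\d{4}", "g"); // backslash doubled inside a string

console.log(literalForm.source);     // \d{4}
console.log(constructorForm.source); // \d{4}

// The constructor is handy when the pattern is only known at run time.
const word = "heaven"; // hypothetical search term, e.g. typed in by a user
const dynamic = new RegExp(word, "i");
console.log(dynamic.test("Heaven or hell")); // true
```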

Examples:

/heaven/ - will find 'heaven' in the string 'Going to heaven'.
/250/ - will find '250' in the string 'The cost of this is 250 dollars'.

However,

/heaven/ will NOT match anything in the string 'Heaven or hell'.

The reason for the above is the case-sensitive nature of regular expressions. If we use the i flag we can obtain that match, so /heaven/i will match 'Heaven' in the string 'Heaven or hell'.
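The examples above can be checked with test(), which returns true or false, and with the string method search(), which returns the position of the match or -1. This is only a quick sketch; the variable names are made up.

```javascript
const caseSensitive = /heaven/;
const ignoreCase = /heaven/i;

console.log(caseSensitive.test("Going to heaven")); // true
console.log(caseSensitive.test("Heaven or hell"));  // false
console.log(ignoreCase.test("Heaven or hell"));     // true
console.log(/250/.test("The cost of this is 250 dollars")); // true

// search() reports where the match starts, or -1 when there is none.
const foundAt = "Heaven or hell".search(ignoreCase);
const notFound = "Heaven or hell".search(caseSensitive);
console.log(foundAt);  // 0
console.log(notFound); // -1
```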

Now, suppose you want to match 4 consecutive letters or numbers in a particular string.

/p{4}/ will match four consecutive 'p's, and only the first occurrence. So, in the string 'jfk ppppl ppppj' it will match only the first 'pppp'; if we want to match both of them, we need to add the g flag after the pattern to show that the pattern should be matched globally.

/4{4}/ will match four consecutive 4's in a number.

/\d{4}/ will match any four consecutive digits.
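A short sketch of the {4} quantifier in action follows. The first string comes from the text above; the second string is invented for the example.

```javascript
const letters = "jfk ppppl ppppj";
const firstRun = letters.match(/p{4}/)[0]; // without g: first occurrence only
const allRuns = letters.match(/p{4}/g);    // with g: every occurrence

const order = "Order 4444 shipped in 2019"; // made-up sample text
const fours = order.match(/4{4}/)[0];       // four consecutive 4's
const fourDigits = order.match(/\d{4}/g);   // any four consecutive digits

console.log(firstRun);   // pppp
console.log(allRuns);    // [ 'pppp', 'pppp' ]
console.log(fours);      // 4444
console.log(fourDigits); // [ '4444', '2019' ]
```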

We also have some special characters which we can use to enhance our pattern matching experience:

+ The plus sign is used to indicate matching of one or more occurrences of the preceding character.

Example:

/p+/g - will match the letter 'p' occurring one or more times in a row. Therefore it will match 'p' and then 'pp' in the string 'prepping always'.
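For comparison, this sketch runs the pattern with and without the plus sign on the same string, so the grouping of repeated letters is visible. The variable names are illustrative.

```javascript
const phrase = "prepping always";

const runs = phrase.match(/p+/g);   // each run of one or more 'p'
const singles = phrase.match(/p/g); // each 'p' on its own

console.log(runs);    // [ 'p', 'pp' ]
console.log(singles); // [ 'p', 'p', 'p' ]
```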

^ The ^ matches the pattern only at the start of the string (or at the start of each line when the m flag is used).

Example:

/^the/i - This regular expression will match 'The' in the string 'The sky is blue', and will not match 'the' in 'That was the best house'.

$ The $ matches the pattern only at the end of the string (or at the end of each line when the m flag is used).

Example:

/water$/ This will match 'water' in 'Give me a glass of water' and not in 'The water is clean'.
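The two anchors can be tried out as follows. The last two lines assume a made-up two-line string to show how the m (multiline) flag changes what ^ treats as a start.

```javascript
const startsWithThe = /^the/i;
const endsWithWater = /water$/;

console.log(startsWithThe.test("The sky is blue"));          // true
console.log(startsWithThe.test("That was the best house"));  // false
console.log(endsWithWater.test("Give me a glass of water")); // true
console.log(endsWithWater.test("The water is clean"));       // false

// With the m flag, ^ also matches right after a line break.
const twoLines = "first line\nthe second line"; // made-up sample text
console.log(/^the/.test(twoLines));  // false
console.log(/^the/m.test(twoLines)); // true
```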

| The | gives an OR option for pattern matching, so the regular expression will match either the substring on the left of the | or the one on its right.

Example:

/hot|cold/ will match either the word 'hot' or 'cold' in the given string.
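A brief sketch of alternation is shown below. The sample sentences are invented, and the last line uses parentheses to limit the OR to one part of a longer pattern.

```javascript
const hotOrCold = /hot|cold/g;

const both = "hot tea and cold milk".match(hotOrCold);
const none = "warm soup".match(hotOrCold);

console.log(both); // [ 'hot', 'cold' ]
console.log(none); // null

// Parentheses keep the alternation local: "hot water" or "cold water".
const grouped = /(hot|cold) water/;
console.log(grouped.test("cold water")); // true
console.log(grouped.test("cold milk"));  // false
```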


? The ? matches the character preceding it either zero times or once.

Examples: (Consider the regular expression - /lo?/g. A match starts at the leftmost position where it can, so the first 'l' in each string matches on its own.)

Hello - In this string /lo?/g will match 'l' and then 'lo'.
Helloo - In this string /lo?/g will match 'l' and then 'lo'.
Hell - In this string /lo?/g will match 'l' twice.

* The * character works almost in the same way as the ?. Instead of matching the preceding character zero times or once, the * matches it zero or more times.

Examples: (Consider the regular expression - /lo*/g)

Hello - In this string /lo*/g will match 'l' and then 'lo'.
Helloo - In this string /lo*/g will match 'l' and then 'loo'.
Hell - In this string /lo*/g will match 'l' twice, because the 'o' may occur zero times.
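Since the behaviour of ? and * is easy to misjudge, here is a small sketch that runs both patterns over the three strings with the g flag, so that every match is listed. A match always begins at the leftmost position possible, which is why the first 'l' of 'Hello' is reported on its own. The helper name allMatches is made up for this example.

```javascript
// Hypothetical helper: list every match of a pattern in each sample word.
function allMatches(pattern, words) {
  return words.map((word) => word.match(pattern));
}

const words = ["Hello", "Helloo", "Hell"];

const optional = allMatches(/lo?/g, words);   // 'o' zero times or once
const zeroOrMore = allMatches(/lo*/g, words); // 'o' zero or more times

console.log(optional);   // [ [ 'l', 'lo' ], [ 'l', 'lo' ], [ 'l', 'l' ] ]
console.log(zeroOrMore); // [ [ 'l', 'lo' ], [ 'l', 'loo' ], [ 'l', 'l' ] ]

// Without the g flag only the first (leftmost) match is returned.
console.log("Hello".match(/lo?/)[0]); // l
```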

/[any set of characters]/ - This regular expression will match any single character listed within the [ ].

Examples:

/[abc]/ - will match a single a, b or c in the string.

/[a-z]/ - will match any single lowercase letter.

In this same arrangement we can have a different use for the special character ^.
If we have an expression:

/[^a-z]/, it will match any single character except the lowercase letters. Notice that the ^ is within the square brackets.

Ranges can be combined within one pair of brackets, so we can conveniently match only letters:

/[A-Za-z]/ - will match any letter, uppercase or lowercase. Now if you want to allow digits as well as letters, add 0-9 to this, so the regular expression will look like /[A-Za-z0-9]/. There is also a shortcut which checks for the above characters along with the underscore: /\w/ will match any letter, digit or underscore character.

We also have various characters for spaces, tabs, etc. /\s/ can be used to match a whitespace character (a space, tab or line break). /\r/ is used to match a carriage return. /\t/ will match a normal tab and /\v/ will match a vertical tab.
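The character classes above can be compared on one short string. This is just a sketch; the sample string is invented so that each class picks out a different set of characters.

```javascript
const sample = "Ab1_ cz"; // made-up sample text

const abc = sample.match(/[abc]/g);            // one of a, b, c
const lower = sample.match(/[a-z]/g);          // lowercase letters
const notLower = sample.match(/[^a-z]/g);      // anything except lowercase letters
const alnum = sample.match(/[A-Za-z0-9]/g);    // letters and digits
const wordChars = sample.match(/\w/g);         // letters, digits and underscore
const spaces = sample.match(/\s/g);            // whitespace characters

console.log(abc);       // [ 'b', 'c' ]
console.log(lower);     // [ 'b', 'c', 'z' ]
console.log(notLower);  // [ 'A', '1', '_', ' ' ]
console.log(alnum);     // [ 'A', 'b', '1', 'c', 'z' ]
console.log(wordChars); // [ 'A', 'b', '1', '_', 'c', 'z' ]
console.log(spaces);    // [ ' ' ]
```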

Let's move on to some small pieces of code and functions which can solve some implementation problems.

<form>
<input type="text" name="color" id="id1" value="What color of the background do you want?">
<input type="button" value="Change background" onclick="change_b()">
</form>

Suppose we have a form in which the user enters the color they want for the background of the document. Above is the code for that; on pressing the button, the function change_b() will be called.

function change_b() {
  var col = document.forms[0].color.value;
  // ^ and $ anchor the pattern, so the whole input must be exactly six hex digits.
  var rx = /^[a-f0-9]{6}$/i;
  // test() returns a Boolean value: true if a match occurs
  // and false if there is no match.
  var test_bool = rx.test(col);
  if (test_bool) {
    col = "#" + col;
    document.bgColor = col;
  } else {
    document.forms[0].color.value = "Enter a valid color code";
  }
  document.forms[0].id1.focus();
  document.forms[0].id1.select();
}

This function will test whether the user has entered a correct color code or not. The colors must be entered as six-digit hexadecimal color codes, without the leading #. The regular expression checks the length and the characters of the input string. The string entered (color) should contain only letters from [a-f] or [A-F] and digits from [0-9], and its total length should be exactly 6 characters. The # is added later on, once this is satisfied.
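The validation step can also be tried on its own, outside the browser. The sketch below assumes a hypothetical helper, toHexColor, that is not part of the page's code; it returns the color with the # added, or null when the input is not valid. The last line shows why the ^ and $ anchors matter when the length has to be exactly six.

```javascript
// Hypothetical helper (not from the original page): validate a color code.
function toHexColor(input) {
  const rx = /^[a-f0-9]{6}$/i; // exactly six hex digits, any case
  return rx.test(input) ? "#" + input : null;
}

console.log(toHexColor("ff00aa"));    // #ff00aa
console.log(toHexColor("FFFFFF"));    // #FFFFFF
console.log(toHexColor("ff00a"));     // null (too short)
console.log(toHexColor("xyz123456")); // null (extra characters)

// Without anchors the pattern is satisfied by any six hex digits in a row.
console.log(/[a-f0-9]{6}/i.test("xyz123456")); // true
```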

Let's have a look at a small example using the exec() method and the RegExp properties. exec() does more than test(), which only returns a Boolean value according to whether there was a match. The exec() method returns an array holding the matched substring and, if the regular expression contains parenthesized groups, the text each group matched as well; it returns null when there is no match. So exec() has more uses than test().

Consider a string taken in as entered by the user, str1 = "It has been a lonnggiissssh day". With Regular expression, rex = /(nn)(gg)(ii)(s{4})/

s_match = rex.exec(str1);

In the above code, str1 is passed straight to exec(). Netscape-era documentation also describes assigning the string to RegExp.input and calling exec() with no argument, but current engines convert a missing argument to the string "undefined", so the input string should always be passed explicitly. After a successful match, the input property of the predefined RegExp holds the string that was searched.

document.write("Match = " + s_match);

The above code writes out the array returned by exec(), which holds the whole pattern match followed by the parenthesized substrings that matched. When an array is joined to a string its elements are separated by commas, so the output will look like:

Match = nnggiissss,nn,gg,ii,ssss
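Here is the same example as a sketch that runs outside the browser, with console.log standing in for document.write. It also shows the position of the match and what exec() returns when nothing matches.

```javascript
const str1 = "It has been a lonnggiissssh day";
const rex = /(nn)(gg)(ii)(s{4})/;

const s_match = rex.exec(str1);
const line = "Match = " + s_match; // the array is joined with commas

console.log(line);          // Match = nnggiissss,nn,gg,ii,ssss
console.log(s_match.index); // 16 (where the match starts in str1)

// exec() returns null when there is no match, so check before using the result.
const noMatch = rex.exec("It has been a long day");
console.log(noMatch); // null
```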

Let's have a look at what the other regular expression properties would display.

document.write("RegExp.$1 = " + RegExp.$1);
document.write("RegExp.$2 = " + RegExp.$2);
document.write("RegExp.$3 = " + RegExp.$3);
document.write("RegExp.$4 = " + RegExp.$4);
document.write("RegExp.lastMatch = " + RegExp.lastMatch);
document.write("rex.source = " + rex.source);


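These are some of the properties. For the match above, RegExp.$1 to RegExp.$4 hold nn, gg, ii and ssss, RegExp.lastMatch holds nnggiissss, and rex.source holds (nn)(gg)(ii)(s{4}).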
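The static RegExp.$1 style properties are legacy features, so as an alternative the same values can be read from the array that exec() returns. The sketch below assumes the string and pattern from the example above; the names whole and g1 to g4 are made up.

```javascript
const rex = /(nn)(gg)(ii)(s{4})/;
const match = rex.exec("It has been a lonnggiissssh day");

// match[0] is the whole match (like RegExp.lastMatch);
// match[1] to match[4] are the groups (like RegExp.$1 to RegExp.$4).
const [whole, g1, g2, g3, g4] = match;

console.log(whole);       // nnggiissss
console.log(g1, g2, g3, g4); // nn gg ii ssss
console.log(match.input); // It has been a lonnggiissssh day
console.log(rex.source);  // (nn)(gg)(ii)(s{4})
```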


