Wednesday, 19 November 2014

Using Regular Expressions with PHP

PHP provides rich support for regular expressions. Regular expressions, or RegEx, can be used for pattern matching, for replacing a particular part of a string, or for extracting some part of a string.

A RegEx is a string of characters that defines a particular pattern and follows its own rules.
There are two types of RegEx:
  • POSIX Regular Expressions
  • PERL Style Regular Expressions

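This tutorial covers only the basic syntax, which both styles share. The patterns are written between delimiters, as PHP's Perl-style (preg_*) functions expect; the POSIX (ereg_*) functions take patterns without delimiters and have been deprecated since PHP 5.3.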

What is a Regular Expression

A RegEx is a string of characters. For example, a is a regex, \"([^\"]+)\" is also a regex, and so is [0-9]+([a-z]).* At first sight these look very weird, but as we go through the tutorial it will become easy for you to understand these patterns.

Matching with literals

Literals match exactly the characters they specify. For example, "/abc/" is a regular expression. It will match any string that contains "abc" as a substring, like abcdef, xyzabc, xyabcef etc. The forward slashes at the beginning and end are called delimiters. They mark the start and end of the pattern. Both delimiters must be the same character, and it cannot be a backslash, a whitespace character or an alphanumeric character; that is, you can use /, |, : etc.
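
As a quick check, the pattern can be tried in JavaScript, whose regex literals are written with the same /.../ delimiters (in PHP the pattern would instead be passed as a string to a function such as preg_match). This is only a sketch; the subject "abxc" is an extra string added here to show a non-match.

```javascript
// The pattern "/abc/" from the text, written as a JavaScript regex literal.
const pattern = /abc/;

// Subject strings from the text, plus "abxc", which does not contain "abc".
const subjects = ["abcdef", "xyzabc", "xyabcef", "abxc"];

for (const subject of subjects) {
  // test() returns true when the pattern occurs anywhere in the subject.
  console.log(subject, pattern.test(subject));
}
// Prints true for the first three subjects and false for "abxc".
```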

Matching start and end

Consider the pattern "/abc/". As we saw, it will match "abcdef" and "xyzabcdef". Suppose we want "abc" to come only at the beginning, that is, we don't want to match "xyzabcdef". We can use ^ for this purpose. Anything that comes after ^ must come at the beginning of the subject string. Thus "/^abc/" will match "abcdef" but not "xyzabcdef".
Just as ^ matches the start, $ matches the end of the string. So "/abc$/" will match neither "abcdef" nor "xyzabcdef", but it will match "defabc".
Some particular examples:
"/^$/" will match the empty string.
"/^abc$/" will match only "abc", i.e. none of "abcdef", "xyzabcdef", "defabc" is matched.

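The anchor examples above can be checked with a short JavaScript sketch; JavaScript regex literals use the same /.../ form, and the variable names below are made up for illustration.

```javascript
const atStart = /^abc/;  // "abc" must come at the beginning
const atEnd = /abc$/;    // "abc" must come at the end
const whole = /^abc$/;   // the whole string must be "abc"
const empty = /^$/;      // nothing between start and end

console.log(atStart.test("abcdef"));     // true
console.log(atStart.test("xyzabcdef"));  // false
console.log(atEnd.test("defabc"));       // true
console.log(atEnd.test("abcdef"));       // false
console.log(whole.test("abc"));          // true
console.log(whole.test("defabc"));       // false
console.log(empty.test(""));             // true
```
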
Giving Range with brackets

Brackets [] can be used in a regex to specify a range of characters. For example, [0-9] matches a single digit from 0 to 9, and [a-z] matches any single lowercase letter. Consider the pattern "/^[a-z][a-z][0-9][0-9]/". It will match any string starting with two lowercase letters followed by two digits. So it will match "aa10" and "xy44", but not "12fv", "ddrt" or "1123".
When ^ is used at the start of a pattern, it indicates the start of the subject string. But inside [] it has the special purpose of negation. For example, "/^[^0-9]/" will match any string whose first character is not a digit. Here the first ^ marks the beginning of the string, while the second one, inside the brackets, gives negation.
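
A small JavaScript sketch of the bracket examples, using the same patterns as regex literals; the variable names are illustrative only.

```javascript
// Two lowercase letters followed by two digits, at the start of the string.
const twoLettersTwoDigits = /^[a-z][a-z][0-9][0-9]/;

for (const s of ["aa10", "xy44", "12fv", "ddrt", "1123"]) {
  console.log(s, twoLettersTwoDigits.test(s));
}
// "aa10" and "xy44" print true; the other three print false.

// The first ^ anchors to the start; the ^ inside [] negates the range.
const firstCharNotDigit = /^[^0-9]/;
console.log(firstCharNotDigit.test("abc1"));  // true
console.log(firstCharNotDigit.test("1abc"));  // false
```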

Giving choices

Suppose we want to match a pattern where the first character is either a digit or a letter, followed by two digits. From the above examples, a simple solution would be to first check "/^[a-z][0-9][0-9]/" and, if it does not match, check "/^[0-9][0-9][0-9]/". But this is not a good solution, as you have to write a case for each possible choice. For example, consider a date-month-year pattern, where the date can be 0 followed by a digit, 1 followed by a digit, 2 followed by a digit, or 3 followed by either 0 or 1; the month can be 0 followed by a digit, or 1 followed by 0, 1 or 2; and the year is any two digits. If we use the above method, the code will be really cumbersome to write and prone to error.
Fortunately RegEx provides the | symbol for making a choice. Consider our first example: if we want a letter OR a digit followed by two digits, our pattern would be "/^([a-z]|[0-9])[0-9][0-9]/". | serves as OR in patterns. Remember that | needs a pattern on both sides. Also note that | applies to everything on either side of it: the pattern "/a|bc/" will match a OR (b and then c), not (a OR b) and then c. To get the latter, group the choice with parentheses, like "/(a|b)c/"; this is why the parentheses are needed in the example above.
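
The following JavaScript sketch illustrates how | groups. In JavaScript, as in PHP's Perl-style patterns, | has the lowest precedence, so "/a|bc/" means a OR bc. The date pattern at the end is a hypothetical reading of the date-month-year description; the "-" separator and the variable names are assumptions, since the text does not specify them.

```javascript
// Without parentheses, | splits the whole pattern: "a" OR "bc".
const aOrBc = /a|bc/;
// With parentheses, the choice is limited to the group: ("a" OR "b") then "c".
const aOrBThenC = /(a|b)c/;

console.log(aOrBc.test("a"));       // true:  "a" alone satisfies the left side
console.log(aOrBThenC.test("a"));   // false: a "c" must follow
console.log(aOrBThenC.test("ac"));  // true

// A letter OR a digit, followed by two digits.
const letterOrDigitThenTwoDigits = /^([a-z]|[0-9])[0-9][0-9]/;
console.log(letterOrDigitThenTwoDigits.test("a12"));  // true
console.log(letterOrDigitThenTwoDigits.test("712"));  // true
console.log(letterOrDigitThenTwoDigits.test("ab1"));  // false

// Hypothetical date-month-year pattern, assuming "-" as the separator.
const datePattern = /^(0[0-9]|1[0-9]|2[0-9]|3[01])-(0[0-9]|1[0-2])-[0-9][0-9]$/;
console.log(datePattern.test("31-12-14"));  // true
console.log(datePattern.test("32-01-14"));  // false
console.log(datePattern.test("15-13-14"));  // false
```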

Using Quantifiers

Quantifiers are used to match long repeating strings of patterns. For example, assume that we want to match a string containing only digits. It is not possible to do this directly using any of the above features. Quantifiers are provided for situations like this.
They are +, *, ? and {}. Here is a short explanation of each.

Quantifier    Use
*             Matches zero or more occurrences of the preceding pattern.
+             Matches one or more occurrences of the preceding pattern.
?             Matches zero or one occurrence of the preceding pattern.
{min,max}     Matches min to max occurrences of the preceding pattern.
{min,}        Matches at least min occurrences of the preceding pattern.


Here are some examples:
Pattern       Matches
a*            The empty string, a, aa, aaa, aaaa...
a+            a, aa, aaa, aaaa... (can be thought of as aa*)
a?            The empty string or a.
a{2,5}        aa, aaa, aaaa or aaaaa.
a{2,}         aa, aaa, aaaa... (at least two a's)
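
The quantifiers can be tried out with a JavaScript sketch like the one below. Each pattern is wrapped in ^ and $ so that the whole subject has to match; the "digits only" pattern is one possible answer to the example that opened this section, and all names are illustrative.

```javascript
// One or more digits and nothing else.
const digitsOnly = /^[0-9]+$/;
console.log(digitsOnly.test("12345"));  // true
console.log(digitsOnly.test("12a45"));  // false
console.log(digitsOnly.test(""));       // false: + needs at least one digit

const zeroOrMore = /^a*$/;
const zeroOrOne = /^a?$/;
const twoToFive = /^a{2,5}$/;
const atLeastTwo = /^a{2,}$/;

console.log(zeroOrMore.test(""));        // true
console.log(zeroOrOne.test("aa"));       // false: at most one a
console.log(twoToFive.test("aaa"));      // true
console.log(twoToFive.test("aaaaaa"));   // false: six is more than five
console.log(atLeastTwo.test("aaaaaa"));  // true
console.log(atLeastTwo.test("a"));       // false
```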


Escaping

Sometimes you want to match symbols like [ or / in the subject string, but these symbols are also part of the pattern syntax. Thus, it is necessary to distinguish whether we want to use a particular symbol as a literal or as a part of the RegEx. A backslash (\) is used for this. So if you want to match / you would actually write \/. This is called an escape sequence. The same goes for other symbols, like \[, \" etc.
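
A brief JavaScript sketch of escaping; the regex literal syntax is the same, so the / delimiter and the [ bracket both need a backslash to be matched literally. The subject strings are made up for illustration.

```javascript
// \/ matches a literal forward slash instead of ending the pattern.
const slash = /\//;
console.log(slash.test("path/to/file"));   // true
console.log(slash.test("no separators"));  // false

// \[ matches a literal opening bracket instead of starting a range.
const openBracket = /\[/;
console.log(openBracket.test("items[0]"));  // true
console.log(openBracket.test("items"));     // false
```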

Remembering with parenthesis

Suppose you want to extract an IP address from a line of text. IP addresses look like 127.0.0.1 or 192.168.2.6. They are four numbers separated by dots, and say we want all four numbers separately. With what we have learned so far it is hard to do this. However, with the help of parentheses this becomes really easy. The pattern "/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/" will match the IP address. Now if you want to extract some part of the matched string, you can put parentheses around that part of the pattern and then use references to it. Thus, "/([0-9]+)\.([0-9]+)\.([0-9]+)\.([0-9]+)/" can be used to remember the four numbers of the IP address, which can be accessed later. We will see how to reference them when we look at the PHP functions that use RegEx. Also note that I have escaped the . because it has a special meaning, as we will see next.
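
To get a feel for what the parentheses remember, here is a JavaScript sketch using the same pattern. It is not the PHP approach the tutorial refers to; in JavaScript, String.prototype.match returns an array whose first element is the whole match and whose remaining elements are the parenthesized parts. The sample line of text is invented.

```javascript
const ipPattern = /([0-9]+)\.([0-9]+)\.([0-9]+)\.([0-9]+)/;
const line = "Connection from 192.168.2.6 accepted";

// match() returns null when the pattern is not found.
const match = line.match(ipPattern);

// Index 0 is the full match; indexes 1 to 4 are the four remembered numbers.
const wholeAddress = match ? match[0] : null;
const parts = match ? match.slice(1) : [];

console.log(wholeAddress);  // 192.168.2.6
console.log(parts);         // [ '192', '168', '2', '6' ]
```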

The Dot

There is a special symbol "." which matches any one character. You can use it when you don't know the actual characters in the pattern but you know the text, pattern or symbols bounding the required part. Combined with the quantifiers above, . is really helpful in many cases. For example, suppose you want to find the text in a line between two hash symbols. You don't know what the text is or what its length is; it could be empty as well. We use . in situations like this. So for the above scenario, "/#(.*)#/" will match #AnyText#, and using parentheses we can extract the text between the hashes.
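
The hash example can be sketched in JavaScript as follows; textBetweenHashes is a hypothetical helper written for this illustration, and the sample lines are made up.

```javascript
const betweenHashes = /#(.*)#/;

// Returns the text between the hashes, or null when the line has no such pair.
function textBetweenHashes(line) {
  const m = line.match(betweenHashes);
  return m === null ? null : m[1];
}

console.log(textBetweenHashes("note #AnyText# here"));  // AnyText
console.log(textBetweenHashes("##"));                   // (empty string)
console.log(textBetweenHashes("no hashes"));            // null
console.log(textBetweenHashes("#a# and #b#"));          // a# and #b
```

The last call shows that .* takes as much as it can: with more than two hashes on a line, the match runs from the first hash to the last one.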

You can practice regular expressions on the following site:
regex101.com


Friday, 28 February 2014

Submitting and Securing the form data with Javascript

Whenever you create a form in HTML, you use the "action" attribute of the <form> to submit your form. Now if you are not using SSL, the data that you are sending is not encrypted and not secure.
You can use JavaScript to encrypt the data and also submit the form. <form> has an event attribute, "onsubmit", which is fired when the user clicks the submit button. We can use this event to submit the form using JavaScript.

Here is a little demo of what you can do.

First create a simple form.
<form action="process.php" method="post" onsubmit="return submitForm(this)">
    <input type="text" name="uname" id="uname"> <br>
    <input type="password" name="pass" id="pass"> <br>
    <input type="submit" value="Submit">
</form>
Now suppose we want to send the username and password after encrypting them.
One way to do this is to replace the values of the fields with the encrypted values when the user clicks the submit button.
The other way is to create a form in JavaScript, add the values to it and then submit that form to the server side.
Here is a little JS that does the latter.
<script>
// Dummy Encrypt function, returns the given data as it is.
function Encrypt(data) {
    return data;
}

function submitForm(form) {
    // Create a form element
    var f = document.createElement("form");
    // Set the method and action attributes
    f.setAttribute('method', "post");
    f.setAttribute('action', "process.php");

    // Create a new element of type input
    var uname = document.createElement("input");
    // Set its type
    uname.setAttribute('type', "text");
    // Give it a name
    uname.setAttribute('name', "uname");
    // Encrypt can be any function that you define to
    // encrypt the given data
    uname.value = Encrypt(form.uname.value);

    // Create one more to store the password
    var pass = document.createElement("input");
    pass.setAttribute('type', "password");
    pass.setAttribute('name', "pass");
    pass.value = Encrypt(form.pass.value);

    // Add both to the form
    f.appendChild(uname);
    f.appendChild(pass);

    // Attach the form to the page; browsers do not submit
    // a form that is not part of the document
    f.style.display = "none";
    document.body.appendChild(f);

    // Submit the new form
    f.submit();
    // Returning false cancels submission of the original form
    return false;
}
</script>
Here we create a new form element, add child elements to it and submit it. The advantage here is that you can choose your own encryption logic based on your needs and send the data to the server in encrypted form.
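
For comparison, the first approach mentioned earlier, replacing the field values in place, might look like the sketch below. The handler name encryptInPlace is hypothetical, and the Encrypt body is only a visible placeholder (it reverses the text and is not real encryption); the field names uname and pass are the ones from the form above.

```javascript
// Placeholder only: reverses the text so that the change is visible.
// This is NOT real encryption; replace it with your own routine.
function Encrypt(data) {
  return data.split("").reverse().join("");
}

// Hypothetical onsubmit handler: overwrite the field values in place
// and let the browser submit the original form.
function encryptInPlace(form) {
  form.uname.value = Encrypt(form.uname.value);
  form.pass.value = Encrypt(form.pass.value);
  // Returning true lets the normal submission continue.
  return true;
}
```

It would be attached to the original form with onsubmit="return encryptInPlace(this)". No second form is needed, but the user may briefly see the changed values in the fields before the page navigates.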