Regular Expressions

Regular expressions are a powerful, if sometimes tedious, tool available in most of today's programming languages.

The first type of character you will use for defining patterns is the literal. A literal is a character that matches exactly itself: the pattern "cat" matches the text "cat".

Metacharacters are characters that carry a special meaning instead of matching themselves. To match a metacharacter literally, you need to escape it with a backslash. For example, "." matches any single character, so "\." matches the period itself.

  • ^ caret indicates the beginning of a string
  • $ dollar sign indicates the end of a string
  • . period matches any single character
  • | pipe means alternatives (or)
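
The metacharacters listed above can be tried out with a short sketch. It assumes Python and its standard "re" module, which the article itself does not name; the sample strings and variable names are made up for illustration.

```python
import re

# "." matches any single character, so "a.c" matches "abc" and "a-c",
# but not "ac" (there is no character between "a" and "c").
dot_any = [s for s in ["abc", "a-c", "ac"] if re.search(r"a.c", s)]

# Escaping the period makes it match only a literal ".".
dot_literal = [s for s in ["a.c", "abc"] if re.search(r"a\.c", s)]

# "^" and "$" anchor the pattern to the beginning and end of the string.
anchored = [s for s in ["cat", "concat", "cats"] if re.search(r"^cat$", s)]

# "|" matches either alternative.
either = [s for s in ["cat", "dog", "bird"] if re.search(r"cat|dog", s)]

print(dot_any)      # ['abc', 'a-c']
print(dot_literal)  # ['a.c']
print(anchored)     # ['cat']
print(either)       # ['cat', 'dog']
```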

The quantifiers "?", "*", and "+", along with the curly braces "{}", let you match repeated occurrences of the element that precedes them. To match a certain quantity of a character, put the quantity between curly braces, stating an exact number, a minimum, or a minimum and a maximum.

  • "?" means 0 or 1
  • "*" means 0 or more
  • "+" means 1 or more
  • "{x}" means exactly x occurrences
  • "{x,y}" means between x and y occurrences (inclusive)
  • "{x,}" means at least x occurrences
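
Each quantifier above can be checked against a few sample strings. This is a minimal sketch assuming Python's "re" module; the helper "full" is a hypothetical name introduced here, and it uses re.fullmatch so that the whole string has to match the pattern.

```python
import re

def full(pattern, text):
    """Return True when the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

# "?" : the "u" may appear 0 or 1 times.
optional = [full(r"colou?r", s) for s in ["color", "colour", "colouur"]]

# "*" : the "b" may appear 0 or more times.
star = [full(r"ab*", s) for s in ["a", "ab", "abbb"]]

# "+" : the "b" must appear 1 or more times.
plus = [full(r"ab+", s) for s in ["a", "ab", "abbb"]]

# "{3}" : exactly three occurrences.
exact = [full(r"a{3}", s) for s in ["aa", "aaa", "aaaa"]]

# "{2,3}" : between two and three occurrences, inclusive.
ranged = [full(r"a{2,3}", s) for s in ["a", "aa", "aaa", "aaaa"]]

# "{2,}" : at least two occurrences.
at_least = [full(r"a{2,}", s) for s in ["a", "aa", "aaaaa"]]

print(optional)  # [True, True, False]
print(star)      # [True, True, True]
print(plus)      # [False, True, True]
print(exact)     # [False, True, False]
print(ranged)    # [False, True, True, False]
print(at_least)  # [False, True, True]
```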

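Grouping: use parentheses to combine characters into a single unit, so that a quantifier or an alternation applies to the whole group. This is how you build more involved patterns.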
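
The effect of grouping is easiest to see by comparing a pattern with and without parentheses. The following is a small sketch assuming Python's "re" module; the sample strings are invented, and the last example relies on the fact that in Python a parenthesized group also captures the text it matched.

```python
import re

# With parentheses, "+" repeats the whole group "ha".
grouped = re.fullmatch(r"(ha)+", "hahaha") is not None

# Without parentheses, "+" repeats only the "a", so "hahaha" does not match.
ungrouped = re.fullmatch(r"ha+", "hahaha") is not None

# Parentheses also limit how far "|" reaches: only "a" or "e" alternate here.
scoped = [s for s in ["gray", "grey", "gry"] if re.fullmatch(r"gr(a|e)y", s)]

# Each group captures the part of the text that it matched.
match = re.search(r"([0-9]+)-([0-9]+)", "pages 10-25")
pages = match.groups()

print(grouped)    # True
print(ungrouped)  # False
print(scoped)     # ['gray', 'grey']
print(pages)      # ('10', '25')
```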

Character classes are created by placing characters within square brackets ([]). A character class matches any single character from the set it lists, and a hyphen between two characters denotes a range. Combine a class with a quantifier to match a sequence of such characters.

  • [a-z] means any lowercase letter
  • [a-zA-Z] means any letter
  • [0-9] means any digit
  • [\f\r\t\n\v] means any whitespace character other than a plain space (form feed, carriage return, tab, newline, vertical tab)
  • [aeiou] means any vowel
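
The character classes above can be combined with quantifiers to pull pieces out of a string. This sketch again assumes Python's "re" module, and the sample strings are made up for illustration.

```python
import re

# A class matches one character at a time, so findall returns each vowel.
vowels = re.findall(r"[aeiou]", "regular expression")

# Adding "+" matches runs of digits instead of single digits.
digits = re.findall(r"[0-9]+", "room 42, floor 7")

# [a-zA-Z] accepts both cases; [a-z] skips the capital letters.
words = re.findall(r"[a-zA-Z]+", "Hello, World 2024")
lower = re.findall(r"[a-z]+", "Hello World")

# Split on tabs and newlines using the whitespace class from the list.
fields = re.split(r"[\f\r\t\n\v]+", "a\tb\nc")

print(vowels)  # ['e', 'u', 'a', 'e', 'e', 'i', 'o']
print(digits)  # ['42', '7']
print(words)   # ['Hello', 'World']
print(lower)   # ['ello', 'orld']
print(fields)  # ['a', 'b', 'c']
```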

Comments 1 comment

tonystubblebine 6 years ago from San Francisco and NYC

Great post and thanks for including a link to the regex pocket reference (my book).
