Saturday, April 15, 2006

PostgreSQL: Documentation: Manuals: PostgreSQL 7.3: Pattern Matching:

6.6.2. SIMILAR TO and SQL99 Regular Expressions

string SIMILAR TO pattern [ESCAPE escape-character]
string NOT SIMILAR TO pattern [ESCAPE escape-character]

The SIMILAR TO operator returns true or false depending on whether its pattern matches the given string. It is much like LIKE, except that it interprets the pattern using SQL99's definition of a regular expression. SQL99's regular expressions are a curious cross between LIKE notation and common regular expression notation.

Like LIKE, the SIMILAR TO operator succeeds only if its pattern matches the entire string; this is unlike common regular expression practice, wherein the pattern may match any part of the string. Also like LIKE, SIMILAR TO uses % and _ as wildcard characters denoting any string and any single character, respectively (these are comparable to .* and . in POSIX regular expressions).

In addition to these facilities borrowed from LIKE, SIMILAR TO supports these pattern-matching metacharacters borrowed from POSIX regular expressions:

  • | denotes alternation (either of two alternatives).

  • * denotes repetition of the previous item zero or more times.

  • + denotes repetition of the previous item one or more times.

  • Parentheses () may be used to group items into a single logical item.

  • A bracket expression [...] specifies a character class, just as in POSIX regular expressions.

Notice that bounded repetition (? and {...}) are not provided, though they exist in POSIX. Also, dot (.) is not a metacharacter.

As with LIKE, a backslash disables the special meaning of any of these metacharacters; or a different escape character can be specified with ESCAPE.

Some examples:

'abc' SIMILAR TO 'abc'      true
'abc' SIMILAR TO 'a' false
'abc' SIMILAR TO '%(b|d)%' true
'abc' SIMILAR TO '(b|c)%' false

The SUBSTRING function with three parameters, SUBSTRING(string FROM pattern FOR escape), provides extraction of a substring that matches a SQL99 regular expression pattern. As with SIMILAR TO, the specified pattern must match to the entire data string, else the function fails and returns null. To indicate the part of the pattern that should be returned on success, SQL99 specifies that the pattern must contain two occurrences of the escape character followed by double quote ("). The text matching the portion of the pattern between these markers is returned.

Some examples:

SUBSTRING('foobar' FROM '%#"o_b#"%' FOR '#')   oob
SUBSTRING('foobar' FROM '#"o_b#"%' FOR '#') NULL

0 Comments:

Post a Comment

<< Home