Regular expressions

P* has adopted the regular expression syntax of the Perl language. Regular expressions are delimited with slashes. Here is an example of a program which uses regular expressions:

SCENE main {
	/* Create a strange string */
	string test = "aaa   bbbb";

	/* Check if a regular expression matches */
	if (test =~ /a\s*b/) {
		echo "Does match\n";
	}

	/* Check if a regular expression does not match */
	if (test !~ /ffff/) {
		echo "Does not match\n";
	}
}

You can interpolate variables into the regular expression by using the $-prefix. The content of the variable is considered part of the regular expression syntax, so to avoid unexpected behaviour, you might want to escape special characters like . \ + * ? [ ^ ] $ ( ) { } = ! < > | : -.
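
As a small sketch of why this matters (the variable names and strings are made up for illustration, and the program has not been run), the + in the interpolated variable below acts as a quantifier rather than a literal plus sign:

SCENE main {
	string pattern = "a+b";
	string test = "aaab";

	/* $pattern becomes part of the regular expression, so this is
	   expected to behave like /a+b/ and match "aaab" */
	if (test =~ /$pattern/) {
		echo "Does match\n";
	}
}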

You can escape a variable by wrapping it in \Q and \E, like this:

SCENE main {
	ENV env;
	string search = "/cgi-bin/test.pstar";

	/* Check if script name ends with $search, while making
	   sure that the forward slashes are escaped.
	 */
	if (env->SCRIPT_NAME =~ /\Q$search\E$/) {
		echo "This script is placed inside the cgi-bin directory";
	}
}

Notice that $ is only treated as the variable interpolation prefix when it is directly followed by a word character. Otherwise it means end-of-string, as the final $ in the example above does.

Flags

One or more flags may be used by specifying them directly after the last / of the regular expression. These change the way the regular expression works.

Table 3.3. Regular Expression flags

Flag  Name                  Description
/q    Automatic meta-quote  Automatically escape all inline variables with \Q \E
/g    Global                Run the regular expression repeatedly until the end of the string is reached
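
As a sketch of the /q flag, based only on the description in the table and not verified by running it, the earlier \Q \E example could presumably be written without the explicit escapes:

SCENE main {
	ENV env;
	string search = "/cgi-bin/test.pstar";

	/* With /q, $search should be meta-quoted automatically, so this is
	   expected to behave like /\Q$search\E$/ */
	if (env->SCRIPT_NAME =~ /$search$/q) {
		echo "This script is placed inside the cgi-bin directory\n";
	}
}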

Store matches

When a regular expression has matched, it pushes the full match and sub-matches onto the discard queue. The discard queue is a hidden array in all expressions, and we can push it onto an array later by using the => operator.

SCENE main {
	string subject = "aabbc";
	array<string> matches;

	/* Push the matches in the regular expression into the matches-array */
	subject =~ /..|./g => matches;

	int i;
	for (i = 0; i < @matches-1; i++) {
		echo "Match $i: '" . matches[i] . "'\n";
	}

	/* This will print out:
Match 0: 'aa'
Match 1: 'bb'
Match 2: 'c'
	*/
}

Note that the => operator also pushes the value of the expression to its left onto the array, as the last element. In this case, this value is 'true', since the regular expression matched. We should therefore ignore the last element of the array, hence the -1 (minus 1) in the for-loop.

Text replacement

As in the Perl language, you can use regular expressions to replace text. To use matches from the regular expression in the replacement part, we can use the special variables $0 $1 $2 $3. The full match, that is, the text matched by everything between / /, is placed into $0. The first sub-expression (inside ( )) is placed into $1, the second into $2 and so on. These variables are only available inside regular expressions.

SCENE main {
	string subject = "23 24 25 26 27";

	echo "Before: $subject\n";

	/* Replace each '2' with a copy of the character following it */
	subject =~ s/2(.)/$1$1/g;

	echo " After: $subject\n";
}

This program will output:

Before: 23 24 25 26 27
 After: 33 44 55 66 77
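
As a further sketch, assuming sub-expressions are numbered from left to right as described above (the input string is made up, and the result in the comment has not been verified by running the program), $1 and $2 can be used together to swap two sub-matches:

SCENE main {
	string subject = "23 24 25";

	/* Swap the two digits of each number */
	subject =~ s/([0-9])([0-9])/$2$1/g;

	/* Expected result: 32 42 52 */
	echo "$subject\n";
}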

More regex!

Take a look at the Perl documentation for regular expressions for more information.