Regular Expressions in Perl 5.10

The regular expression engine of Perl 5.10 has many new features. Here I point out some of them.

Named captures

Suppose I want to match a phone number and save its parts (country code, area code and local number) in a hash.

One way to do it is:

if ($str =~ /^(\d+)-(\d+)-(\d+)$/) {
    $num{country} = $1;
    $num{area}    = $2;
    $num{phone}   = $3;
}

The new way is:

if ($str =~ /^(?<country>\d+)-(?<area>\d+)-(?<phone>\d+)$/) {
    %num = %+;
}

Starting with 5.10 we can name the capturing parentheses. The strings they match are then available in the %+ hash, with the names of the parentheses as the keys.
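
As a minimal, self-contained sketch of the same idea: the sample phone number below is made up for illustration, and the copy into %num is done immediately because %+ always reflects the last successful match.

```perl
use strict;
use warnings;
use 5.010;

# Hypothetical sample input, not taken from the article
my $str = '1-800-5551234';
my %num;

if ($str =~ /^(?<country>\d+)-(?<area>\d+)-(?<phone>\d+)$/) {
    # Copy right away: a later successful match would change %+
    %num = %+;
}

# Print the captured parts, sorted by name
say "$_ = $num{$_}" for sort keys %num;
```

Individual values can also be read directly, for example $+{area}, without copying the whole hash.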

Not only that, but we can also use these names instead of the numbered backreferences \1, \2 by writing \k<name>, as in the following example:

/(?<letters>[a-z]+)-(?<digits>\d+)-\k<letters>-\k<digits>/
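
A short sketch of how this pattern behaves. The two test strings are invented, and the ^ and $ anchors are an addition here so that the whole string has to match:

```perl
use strict;
use warnings;
use 5.010;

# Same pattern as above, anchored at both ends (the anchors are an assumption)
my $re = qr/^(?<letters>[a-z]+)-(?<digits>\d+)-\k<letters>-\k<digits>$/;

# The first string repeats both parts exactly, the second changes the letters
for my $s ('abc-42-abc-42', 'abc-42-abd-42') {
    printf "%-15s %s\n", $s, ($s =~ $re ? 'match' : 'no match');
}
```

\k<letters> has to match exactly the text captured by the group named letters, just as \1 would for the first numbered group.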

Using names makes it much clearer what each pair of parentheses is matching, and eliminates the bugs created when adding or removing a pair changes the numbering.

For example, take this regex:

/(.)(.)\2\1/

If I want to add a repetition to it, I would start by writing

/((.)(.)\2\1){2}/

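but this is incorrect. The regex still compiles, but the new outer pair of parentheses has become buffer 1 and shifted the other two, so it no longer matches what I intended. I need to change the numbers of the buffers: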

/((.)(.)\3\2){2}/
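
The effect of the shifted numbering can be checked with a small sketch. The test string is made up: two "abba"-style blocks in a row.

```perl
use strict;
use warnings;
use 5.010;

# Hypothetical sample input: "abba" followed by "cddc"
my $str = 'abbacddc';

my $unchanged = qr/((.)(.)\2\1){2}/;   # wrapped, numbers left as they were
my $shifted   = qr/((.)(.)\3\2){2}/;   # wrapped, numbers adjusted

say 'numbers unchanged: ', ($str =~ $unchanged ? 'match' : 'no match');
say 'numbers shifted:   ', ($str =~ $shifted   ? 'match' : 'no match');
```

Both patterns compile. Only the second one matches the sample string, which is what makes this kind of bug easy to miss.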

Using named buffers, even if the names are just single letters, solves this problem:

/(?<c>(?<a>.)(?<b>.)\k<b>\k<a>){2}/
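
A sketch of the named version with and without the outer group, on made-up input. The references are written as \k<b> followed by \k<a> so that they mirror \2\1 from the numbered regex, and the {2} repetition is included to parallel the numbered example:

```perl
use strict;
use warnings;
use 5.010;

my $str = 'abbacddc';   # hypothetical sample input

# Before wrapping: equivalent to /(.)(.)\2\1/
my $single   = qr/(?<a>.)(?<b>.)\k<b>\k<a>/;

# After wrapping in an outer group: the backreferences are untouched
my $repeated = qr/(?<c>(?<a>.)(?<b>.)\k<b>\k<a>){2}/;

say 'single:   ', ('abba' =~ $single   ? 'match' : 'no match');
say 'repeated: ', ($str   =~ $repeated ? 'match' : 'no match');
```

Adding or removing a surrounding pair of parentheses does not require touching \k<a> or \k<b>, because the names do not depend on position.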