Perl 5.10 was released on 18th December 2007,
the 20th birthday of Perl. There are several interesting
additions to the language. I already wrote about a few of them in
"What's new in Perl 5.10? say, //, state".
Now I am going to look at something called Smart Matching.
In 5.10 there is a new operator that looks like this: ~~
It is called the Smart Matching operator and it is available
only if specifically requested. The easiest way to do that is to write
the following in your code:
use 5.010;
As it is a commutative operator, you will normally use ~~ just like
you would use "==" or "eq" between two scalar variables
(but not like =~, which is not commutative).
Smart Match will check if the two values are string-equal using "eq",
unless it finds some better way to match them,
so all the following will be true:
"Foo" ~~ "Foo"
42 ~~ 42
42 ~~ 42.0
42 ~~ "42.0"
and all these will be false:
"Foo" ~~ "Bar"
42 ~~ "42x"
And that is already a nice advantage over what we had earlier.
With every other operator, Perl converts the values based on the operator.
That is, == turns both sides into numerical values and compares them as numbers,
while eq turns both sides into string values and compares them as strings.
When turning a string into a number, Perl looks at the left side of the string,
uses as many characters as it can understand as being a number, and warns if there
are more, non-number, characters in the string.
On the other hand, ~~ fits the comparison method to the values on the two sides.
In a smart way.
For the old operators this means that these are all true:
42 == 42
42 == 42.0
42 == "42.0"
42 == "42\n"
but these are false:
42 eq "42.0"
42 eq "42\n"
and the following are true:
42 == "42x"
"Foo" == "Bar"
albeit with a warning, if you have use warnings enabled.
This behavior, while consistent, is a bit hard to understand.
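To make the operator-driven behavior concrete, here is a small sketch that runs the same pairs through == and eq and counts the warnings each comparison triggers. The helpers num_eq and str_eq are hypothetical names made up for this illustration, and the warnings are caught with a $SIG{__WARN__} handler only so they can be counted.

```perl
use strict;
use warnings;

# Collect warnings in an array instead of printing them to STDERR.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, $_[0] };

# Hypothetical helper: compare two values with == and return the
# result together with the number of warnings that were triggered.
sub num_eq {
    my ($x, $y) = @_;
    @warnings = ();
    my $equal = ($x == $y) ? 1 : 0;
    return ($equal, scalar @warnings);
}

# Hypothetical helper: the same comparison using eq.
sub str_eq {
    my ($x, $y) = @_;
    @warnings = ();
    my $equal = ($x eq $y) ? 1 : 0;
    return ($equal, scalar @warnings);
}

for my $pair ([42, "42.0"], [42, "42x"], ["Foo", "Bar"]) {
    my ($x, $y) = @$pair;
    my ($num, $num_warn) = num_eq($x, $y);
    my ($str, $str_warn) = str_eq($x, $y);
    printf "%-4s vs %-5s  ==: %d (%d warnings)  eq: %d (%d warnings)\n",
        $x, $y, $num, $num_warn, $str, $str_warn;
}
```

The operator alone decides how the values are treated: == reports the last two pairs as equal and only the warnings hint that something is off.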
On the other hand the new ~~ is strange in another way.
Its comparison is value driven as opposed to the other operations in
Perl which are operator driven.
Let's look at it from a different angle: as I wrote, ~~ will compare the values
as strings using eq, unless it finds some better way to compare them.
"Foo" ~~ "Bar"
will return false but
"Moose" ~~ "Moose"
will return true. Nothing to surprise us.
So what better ways might there be to compare two scalars?
I wrote a small function that can be used to see what smart matching does.
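sub compare {
    my ($x, $y, $description) = @_;
    # do the smart match first, while undef values are still undef
    my $t = ($x ~~ $y ? "True" : "False");
    # replace undef by the empty string to avoid warnings when printing
    $x //= '';
    $y //= '';
    printf("%-4s ~~ %-4s is %-5s (%s)\n", $x, $y, $t, $description);
}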
This will get two values (and a description), compare the two
with smart matching and print the result. So I can now supply two
values to this function and see how the respective variables are compared.
So
"Foo" ~~ "Bar" is the same as "Foo" eq "Bar" (this is called Any ~~ Any in the documentation).
If one side is a number, it seems to be better to compare them as numbers (using ==):
42 ~~ 42.0 are compared using == (this is called Any ~~ Num).
If one side is a number but the other one is a string (e.g. "xyz" or "23xxyz"), then
we would be better off comparing with eq and not trying to change them to numbers.
Hence:
42 ~~ "42.0x" are compared using eq (this is called Any ~~ Str).
But what if the string actually contains a numish thing such as
"42" or "42.0" or even "42\n", to forgive those who forget to chomp()?
The following are all compared as numbers so they are true:
42 ~~ "42"
42 ~~ "42.0"
42 ~~ "42\n"
42 ~~ "42 "
This is called (Num ~~ numish)
There is a strange thing though, and I am not yet sure what
its purpose is, but if both are numish strings then they are compared as strings, so this is false:
"42" ~~ "42.0"
I have posted a question regarding this on
PerlMonks, where
you might find the answer.
But ~~ can do more. If one of the values in an == or eq is undef,
Perl will complain about that. Smart Matching on the other hand
understands that an undef is just an undef. So if one of the
values is undef then ~~ checks if the other one is undef too using
defined();
So these are false:
3 ~~ undef
"x" ~~ undef
While this is true:
undef ~~ undef
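The scalar cases discussed so far can be tried with a short script like the one below. It is only a sketch: smart is a hypothetical helper name, and the script assumes a perl where ~~ is available after use 5.010, as in the 5.10 release discussed here (later perl versions may print warnings about smart matching).

```perl
use 5.010;
use strict;
use warnings;

# Hypothetical helper: return "true" or "false" for a smart match.
sub smart {
    my ($x, $y) = @_;
    return ($x ~~ $y) ? "true" : "false";
}

say 'undef ~~ undef : ', smart(undef, undef);
say '3 ~~ undef     : ', smart(3, undef);
say '"x" ~~ undef   : ', smart("x", undef);
say '42 ~~ "42.0"   : ', smart(42, "42.0");
say '"42" ~~ "42.0" : ', smart("42", "42.0");
```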
In addition, one can provide a regular expression on
one side, and ~~ will then apply the regex to the other value, so one can write
either of these without any success:
What if one of the given values is not a real scalar?
What if that is actually a reference to an array?
The smart match will do The Right Thing: it will check if the given scalar value is
the same as one of the elements in the Array. (Str ~~ Array)
"Moose" ~~ [qw(Foo Bar Baz)] is false
"Moose" ~~ [qw(Foo Bar Moose Baz)] is true
The way the individual values are compared is based on the type of the scalar.
So if the scalar is a string all the values of the array are compared to the
string using "eq" while if the scalar is a number, all the comparisons will
be done by "==".
(Str ~~ Array and Num ~~ Array in the documentation)
I am not fully convinced that the last one is really good, but that's the behavior.
If you would like to read more about that, here is the
PerlMonks post about that issue.
So that actually means we now have an operator to check if an individual scalar
is present in an array. It is still slower than a hash lookup but it is faster
than the grep that most people use. It is definitely easier to write.
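To put the three approaches side by side, here is a rough sketch of the same membership test written with ~~, with grep and with a hash lookup. The function names are made up for this example, and the hash version rebuilds the hash on every call only to keep the sketch short; in real code you would build the hash once and reuse it.

```perl
use 5.010;
use strict;
use warnings;

my @names = qw(Foo Bar Moose Baz);

# Smart match: is the scalar equal to one of the elements?
sub in_array_smart {
    my ($value, $array_ref) = @_;
    return ($value ~~ $array_ref) ? 1 : 0;
}

# grep: count the elements that are string-equal to the value.
sub in_array_grep {
    my ($value, $array_ref) = @_;
    return (grep { $_ eq $value } @$array_ref) ? 1 : 0;
}

# Hash lookup: turn the elements into hash keys, then use exists.
sub in_array_hash {
    my ($value, $array_ref) = @_;
    my %seen = map { $_ => 1 } @$array_ref;
    return exists $seen{$value} ? 1 : 0;
}

for my $value (qw(Moose Camel)) {
    printf "%-5s smart: %d  grep: %d  hash: %d\n",
        $value,
        in_array_smart($value, \@names),
        in_array_grep($value, \@names),
        in_array_hash($value, \@names);
}
```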
Similarly, we can have a hash reference instead of one of the values, and
~~ will check if the given scalar is one of the keys in the hash,
that is, using exists(). (Any ~~ Hash)
'a' ~~ {a => 19, b => 23} is true
42 ~~ {a => 19, b => 23} is false
But that is a bit less interesting in this context.
As a side note, instead of reference to Array or reference to Hash
you can actually put there the real Array or the real Hash
(but not a simple list) so this works and it is true:
my @a = (2, 3);
say 3 ~~ @a ? "T" : "F";
but this does not work:
say 3 ~~ (2, 3) ? "T" : "F";
The obvious question is then: what happens when both sides are
complex data structures, Arrays or Hashes?
With Arrays, it will check if each element is the same (Array ~~ Array):
["Foo", "Bar"] ~~ ["Foo", "Bar"] is true
["Foo", "Bar"] ~~ ["Foo", "Bar", "Baz"] is false
[1,2,3] ~~ [1,2,3] is true
With hashes it checks if the keys are identical (Hash ~~ Hash):
{Foo => 19, Bar => 23} ~~ {Foo => 23, Bar => 19} is true
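The Array and Hash comparisons above can be checked with a few lines like these. Again this is just a sketch: same is a hypothetical helper, and the script assumes a perl where ~~ is available after use 5.010 (later perl versions may print warnings about smart matching).

```perl
use 5.010;
use strict;
use warnings;

# Hypothetical helper: smart match two references and report the result.
sub same {
    my ($left, $right) = @_;
    return ($left ~~ $right) ? "true" : "false";
}

# Arrays: every element has to match, in the same order.
say same(["Foo", "Bar"], ["Foo", "Bar"]);
say same(["Foo", "Bar"], ["Foo", "Bar", "Baz"]);
say same([1, 2, 3], [1, 2, 3]);

# Hashes: only the keys are compared, the values are ignored.
say same({Foo => 19, Bar => 23}, {Foo => 23, Bar => 19});
say same({Foo => 19, Bar => 23}, {Foo => 19, Baz => 23});
```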
There are several more cases, but even this was probably too much for an introduction.
To get more details, search for "Smart matching in detail"
after typing perldoc perlsyn on the command line
(you do have 5.10 installed already, don't you?),
or browse to the
perlsyn page.