Smart Matching in Perl 5.10

Published on 2007.12.24 at 11:14:48

Tags: Perl, Perl 5, 5.10, smart match


Perl 5.10 was released on 18 December 2007, the 20th birthday of Perl. There are several interesting additions to the language. I already wrote about a few of them in What's new in Perl 5.10? say, //, state. Now I am going to look at something called Smart Matching.

In 5.10 there is a new operator that looks like this: ~~

It is called the Smart Matching operator and it is available only if specifically requested. The easiest way to do that is to write the following in your code:

    use 5.010;

Because it is a commutative operator, you will normally use ~~ just as you would use "==" or "eq" between two scalar values (unlike =~, which is not commutative).

Smart Match checks whether the two values are string-equal using "eq", unless it finds a better way to match them.

So all of the following are true:

    "Foo" ~~ "Foo"
    42    ~~ 42
    42    ~~ 42.0
    42    ~~ "42.0"

and all of these are false:

    "Foo"  ~~ "Bar"
    42     ~~ "42x"

And that is already a nice advantage over what we had earlier.

With every other operator, Perl converts the values to the type the operator expects. That is, == turns both sides into numbers and compares them numerically, while eq turns both sides into strings and compares them as strings.

When turning a string into a number, Perl reads the string from the left, uses as many characters as it can interpret as a number, and warns if other, non-numeric characters follow.

~~, on the other hand, fits the comparison method to the values on the two sides, in a smart way.

For == and eq this means that these are all true:

    42 == 42
    42 == 42.0
    42 == "42.0"
    42 == "42\n"

but these are false:

    42 eq "42.0"
    42 eq "42\n"

and the following are true, albeit with a warning if use warnings is in effect:

    42 == "42x"
    "Foo" == "Bar"

This behavior, while consistent, is a bit hard to understand.
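
A minimal sketch of the difference between the two classic operators. The helper name num_and_str is made up for this example. It runs each pair through == and eq so the two results can be read side by side; the numeric warnings are silenced only to keep the output readable.

```perl
use strict;
use warnings;

# Hypothetical helper: compare two values numerically and as strings.
sub num_and_str {
    my ($x, $y) = @_;
    no warnings 'numeric';    # "42x" and "Foo" would otherwise warn under ==
    return ($x == $y ? 1 : 0, $x eq $y ? 1 : 0);
}

for my $pair ([42, "42.0"], [42, "42\n"], [42, "42x"], ["Foo", "Bar"], ["Foo", "Foo"]) {
    my ($num, $str) = num_and_str(@$pair);
    (my $shown = $pair->[1]) =~ s/\n/\\n/;    # make the newline visible
    printf "%-4s vs %-5s  ==: %-5s  eq: %s\n",
        $pair->[0], $shown, ($num ? "true" : "false"), ($str ? "true" : "false");
}
```

Only the last pair should come out true under eq, while every pair comes out true under ==, including "Foo" and "Bar", which both turn into 0.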

The new ~~, on the other hand, is strange in another way: its comparison is value-driven, as opposed to the other operators in Perl, which are operator-driven.


Let's look at it from a different angle. As I wrote, ~~ compares the values as strings using eq, unless it finds a better way to compare them.

    "Foo" ~~ "Bar"

will return false but

    "Moose" ~~ "Moose"

will return true. No surprises there.

So what better ways might there be to compare two scalars?

I wrote a small function that can be used to see what smart matching does.

    sub compare {
        my ($x, $y, $description) = @_;
        my $t = ($x ~~ $y ? "True" : "False");
        $x //= '';
        $y //= '';
        printf("%-4s ~~ %-4s is %-5s   (%s)\n", $x, $y, $t, $description);
    }

It takes two values (and a description), compares the two with smart matching, and prints the result. So I can now pass two values to this function and see how they are compared.
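
Here is a self-contained sketch of how the function might be called. The only change from the version above is that the match itself is moved into a small helper, verdict, a name invented for this example so the result can also be used without printing. The cases are chosen to behave the same way in later Perl releases, where some of the smart match rules were revised.

```perl
use 5.010;

# Newer perls may warn that smartmatch is experimental or deprecated.

# Hypothetical helper: "True" or "False" for a smart match of two values.
sub verdict {
    my ($x, $y) = @_;
    return $x ~~ $y ? "True" : "False";
}

sub compare {
    my ($x, $y, $description) = @_;
    my $t = verdict($x, $y);
    $x //= '';
    $y //= '';
    printf("%-5s ~~ %-5s is %-5s   (%s)\n", $x, $y, $t, $description);
}

compare("Foo",   "Bar",   "different strings");
compare("Moose", "Moose", "identical strings");
compare(42,      42.0,    "two numbers");
compare(42,      "42x",   "a number and a non-numeric string");
```

The four lines should report False, True, True and False, in that order.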

So

    "Foo" ~~ "Bar"   is the same as "Foo" eq "Bar" (this is called Any ~~ Any in the documentation)

If one side is a number, it seems better to compare them as numbers (using ==):

    42 ~~ 42.0       is compared using == (this is called Any ~~ Num)

If one side is a number but the other is a string (e.g. "xyz" or "23xxyz"), we are better off comparing with eq rather than trying to turn them into numbers. Hence:

    42 ~~ "42.0x"    is compared using eq (this is called Any ~~ Str)

But what if the string actually contains something numish, such as "42" or "42.0", or even "42\n", to forgive those who forget to chomp()?

The following are all compared as numbers, so they are true:

    42 ~~ "42"
    42 ~~ "42.0"
    42 ~~ "42\n"
    42 ~~ "42 "

This is called Num ~~ numish.

There is one strange thing, though, and I am not yet sure what its purpose is: if both sides are numish strings, they are compared as strings, so this is false:

    "42" ~~ "42.0"

I have posted a question about this on PerlMonks, where you might find the answer.
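
A rough way to explore which strings count as numish. This sketch assumes that "numish" corresponds to what looks_like_number from Scalar::Util accepts; that is my reading, not something stated above, so treat the label as a hint and the smart match result as the real answer.

```perl
use 5.010;
use Scalar::Util qw(looks_like_number);

for my $string ("42", "42.0", "42x", "xyz") {
    # Assumption: looks_like_number approximates the "numish" test.
    my $numish = looks_like_number($string) ? "numish" : "not numish";
    my $match  = (42 ~~ $string) ? "true" : "false";
    say "42 ~~ \"$string\" is $match ($numish)";
}

# Two numish strings are still compared with eq.
say '"42" ~~ "42.0" is ', ("42" ~~ "42.0" ? "true" : "false");
```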

But ~~ can do more. If one of the values in an == or eq comparison is undef, Perl will complain about it. Smart Matching, on the other hand, understands that an undef is just an undef. So if one of the values is undef, ~~ checks whether the other one is undef too, using defined().

So these are false:

    3 ~~ undef
    "x" ~~ undef

While this is true:

    undef ~~ undef
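
A small sketch of the undef rule in use. The helper name is_missing is hypothetical; it simply wraps the match so the check can be reused, and no "uninitialized value" warning is expected from the comparison even with warnings enabled.

```perl
use strict;
use warnings;
use 5.010;

# Hypothetical helper: 1 if the value is undef, 0 otherwise.
sub is_missing {
    my ($value) = @_;
    return ($value ~~ undef) ? 1 : 0;
}

for my $value (undef, 3, "") {
    my $shown = defined $value ? "'$value'" : "undef";
    say "$shown ~~ undef is ", (is_missing($value) ? "true" : "false");
}
```

Note that the empty string is defined, so it does not match undef.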

In addition, one can provide a regular expression on one side and ~~ will apply the regex. You can write it in any of these forms; none of them matches here, because the string does not contain Moose:

    "Perl 5.10" ~~ /Moose/
    /Moose/ ~~ "Perl 5.10"
    "Perl 5.10" ~~ qr/Moose/
    qr/Moose/ ~~ "Perl 5.10"
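
For contrast, a sketch with one successful and one failed match. It keeps the regex on the right-hand side, which, as far as I know, is the only order the documentation of later Perl releases describes as a pattern match.

```perl
use 5.010;

my $pattern = qr/Perl/;

for my $text ("Perl 5.10", "Moose") {
    # String on the left, compiled regex on the right.
    my $result = ($text ~~ $pattern) ? "matches" : "does not match";
    say "'$text' $result $pattern";
}
```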


There are more interesting things though.

  • What if one of the given values is not a real scalar?
  • What if that is actually a reference to an array?

The smart match will do The Right Thing: it checks whether the given scalar value is the same as one of the elements of the array (Str ~~ Array).

    "Moose" ~~ [qw(Foo Bar Baz)]         is false
    "Moose" ~~ [qw(Foo Bar Moose Baz)]   is true

How the individual values are compared depends on the type of the scalar. If the scalar is a string, all the values of the array are compared to it using "eq", while if the scalar is a number, all the comparisons are done with "==".

(Str ~~ Array and Num ~~ Array in the documentation)

So

    42 ~~ [23, 17, 70]              false
    42 ~~ [23, 17, 42, 70]          true
    42 ~~ [23, 17, "42\n", 70]      true
    42 ~~ [23, 17, "42 ", 70]       true
    
    42 ~~ [23, 17, "42x", 70]       true with a warning Argument "42x" isn't numeric in smart match

I am not fully convinced that the last one is really good, but that's the behavior. If you would like to read more, there is a PerlMonks post about that issue.

So we now have an operator to check whether an individual scalar is present in an array. It is still slower than a hash lookup, but it is faster than the grep that most people use, and it is definitely easier to write.
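
The three approaches side by side, as a sketch. The function names in_array_smart and in_array_grep and the sample data are invented for this example, and it only shows that the answers agree; it does not measure speed.

```perl
use strict;
use warnings;
use 5.010;

my @names = qw(Foo Bar Moose Baz);

# 1. Smart match against the array (passed as a reference).
sub in_array_smart {
    my ($value, $list) = @_;
    return ($value ~~ $list) ? 1 : 0;
}

# 2. The traditional grep, which always walks the whole list.
sub in_array_grep {
    my ($value, $list) = @_;
    my $count = grep { $_ eq $value } @$list;
    return $count ? 1 : 0;
}

# 3. A hash lookup, after building the hash once.
my %is_name = map { $_ => 1 } @names;

for my $candidate (qw(Moose Camel)) {
    say "${candidate}: smart=", in_array_smart($candidate, \@names),
        " grep=", in_array_grep($candidate, \@names),
        " hash=", (exists $is_name{$candidate} ? 1 : 0);
}
```

The hash version pays off when the same list is searched many times; for a one-off check the smart match is the shortest to write.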

Similarly, we can have a hash reference in place of one of the values, and ~~ will check whether the given scalar is one of the keys of the hash, that is, using exists() (Any ~~ Hash):

    'a' ~~ {a => 19, b => 23}        true
    42  ~~ {a => 19, b => 23}        false

But that is a bit less interesting in this context.
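
To make the equivalence concrete, this sketch runs the same keys through ~~ and through exists. The variable $ages is just sample data made up for the example.

```perl
use 5.010;

my $ages = { a => 19, b => 23 };

for my $key ('a', 42) {
    my $smart  = ($key ~~ $ages)        ? "true" : "false";
    my $exists = (exists $ages->{$key}) ? "true" : "false";
    say "${key}: ~~ says $smart, exists says $exists";
}
```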

As a side note, instead of a reference to an array or a reference to a hash, you can put the real array or the real hash there (but not a simple list), so this works and is true:

    my @a = (2, 3);
    say 3 ~~ @a ? "T" : "F";

but this does not work:

    say 3 ~~ (2, 3) ? "T" : "F";

The obvious question then is: what happens when both sides are complex data structures, arrays or hashes?

With arrays, it checks whether each element is the same (Array ~~ Array):

    ["Foo", "Bar"] ~~ ["Foo", "Bar"]          is true
    ["Foo", "Bar"] ~~ ["Foo", "Bar", "Baz"]   is false
    [1,2,3] ~~ [1,2,3]                        is true

With hashes, it checks whether the keys are identical (Hash ~~ Hash):

    {Foo => 19, Bar => 23} ~~ {Foo => 23, Bar => 19}     is true

and the check descends into nested structures:

    ["Foo", ["Bar", "Baz"]] ~~ ["Foo", ["Bar", "Baz"]]     is true
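
A sketch that collects the structure comparisons in one place, with a failing counterpart for each passing case. The labels and the extra values (Qux, the Baz key) are invented for the example.

```perl
use 5.010;

my @cases = (
    [ 'same nested arrays',      ["Foo", ["Bar", "Baz"]],  ["Foo", ["Bar", "Baz"]] ],
    [ 'inner element differs',   ["Foo", ["Bar", "Baz"]],  ["Foo", ["Bar", "Qux"]] ],
    [ 'same keys, other values', { Foo => 19, Bar => 23 }, { Foo => 23, Bar => 19 } ],
    [ 'different keys',          { Foo => 19, Bar => 23 }, { Foo => 19, Baz => 23 } ],
);

for my $case (@cases) {
    my ($label, $left, $right) = @$case;
    say "${label}: ", ($left ~~ $right ? "true" : "false");
}
```

The first and third cases should be true, the second and fourth false: arrays are compared element by element, hashes only by their keys.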

There are several more cases, but even this was probably too much for an introduction.


For more details, type perldoc perlsyn on the command line (you do have 5.10 installed already, don't you?) or browse to the perlsyn page, and search for "Smart matching in detail".

Last Update: Tue Sep 25 17:06:26 2007