
Hashes

Most languages have arrays. Another very useful data structure, which not every language provides (C, for example, has none built in), is the hash, also called an associative array. The Unix language "awk" was one of the early languages to offer them.

Java has hashes as well (java.util.HashMap, for example), but they come from a library rather than being an integral part of the language as in Perl, so the syntax is more verbose and cumbersome.

Hashes are like arrays, but the index (also called a key) is an arbitrary string instead of an integer; any other scalar used as a key is converted to a string. Arrays are ordered collections; hashes are not.

To indicate that a variable contains a hash you use the percent '%' instead of the at sign '@'. They are initialized with pairs of scalars:

my %age = ("Jon", 53, "Helen", 3, "Bill", 23, "Mary", 28);
They are referenced just like arrays except that you use braces { } rather than brackets [ ]:
print $age{"Jon"};    # 53
$age{$name} = 32;     # the key can be a variable
++$age{"Jon"};        # it is now 54
One way to think of hashes is that instead of asking for the 3rd element of an array you now can ask for the Bill'th element of a hash :).

If you assign to an existing hash element it will overwrite the contents (just like arrays).
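As a minimal sketch of the point above, assuming the %age hash from earlier: assigning to an existing key replaces its value, while assigning to a new key creates an entry. The name "Zoe" and the variable $count are made up for illustration.

```perl
use strict;
use warnings;

my %age = ("Jon", 53, "Helen", 3, "Bill", 23, "Mary", 28);

$age{"Bill"} = 24;    # existing key: the old value 23 is replaced
$age{"Zoe"}  = 41;    # new key (hypothetical name): an entry is created

my $count = keys %age;    # keys in scalar context gives the number of entries

print "Bill is ", $age{"Bill"}, "\n";    # prints: Bill is 24
print "There are $count people\n";       # prints: There are 5 people
```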

Syntactic Sugar for Hashes

Perl offers this syntactic sugar: the string used as the key does not need to be quoted if it is a simple word made up only of letters, digits, and underscores:
print $age{Jon};
$age{Jack} = 32;
++$age{Jon};
And syntactic sugar for the initialization of a hash. Instead of:
my %age = ("Jon", 53, "Helen", 3, "Mary", 28);
You can say:
my %age = (
    Jon   => 53,
    Helen => 3,
    Mary  => 28,
);
This makes it clear that the data comes in pairs. The => operator is very much like a comma and can be used wherever a comma is used; as a side effect, the word to its left is automatically quoted. That word must be a simple identifier: it cannot contain any blanks or punctuation.

Also note the comma after 28 above. It is not necessary because it is the last element in the list. But Perl allows it and it is very nice because it makes it easy to add, delete, or move lines.
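A small sketch of where the automatic quoting stops, assuming a made-up key "Mary Ann" that contains a blank: such a key still has to be quoted by hand, both in the initialization and in the lookup.

```perl
use strict;
use warnings;

my %age = (
    Jon        => 53,    # bare word: quoted automatically by =>
    "Mary Ann" => 28,    # contains a blank, so it is quoted by hand
    Helen      => 3,     # the trailing comma is allowed
);

print $age{Jon}, "\n";           # prints: 53
print $age{"Mary Ann"}, "\n";    # prints: 28
```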

Keys

To get a list of all the keys that you can validly use, there is the function "keys", which returns a list.
my @names = keys %age;
or
for my $name (keys %age) {
    print "name: $name\n";
}
Note that the list returned by keys is not in any particular order. If you want a certain order, sort it!
@names = sort keys %age;
or within a for statement:
for my $name (sort keys %age) {
    print "name: $name\n";
}

Does a Key Exist?

To see if a string can be used as a key there is the function 'exists':
unless (exists $age{Harry}) {
    print "I have no idea how old Harry is!\n";
}
Note that there may be an entry in a hash for a given string but its value may be undefined. This IS a valid value and the entry DOES exist. Because of this it is best to check for existence rather than definedness or for truth (because 0 or the empty string IS a valid value!).

If you access a hash with a key that does not exist you will get the undefined (and false) value undef.
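To make the difference between the three tests concrete, here is a sketch with made-up data: Helen has an entry whose value is undef, Baby has the valid value 0, and Harry has no entry at all.

```perl
use strict;
use warnings;

# Hypothetical data: Helen's age is unknown (undef) and Baby is 0.
my %age = (Jon => 53, Helen => undef, Baby => 0);

for my $name ("Jon", "Helen", "Baby", "Harry") {
    my $exists  = exists  $age{$name} ? "yes" : "no";
    my $defined = defined $age{$name} ? "yes" : "no";
    my $true    = $age{$name}         ? "yes" : "no";
    print "$name: exists=$exists defined=$defined true=$true\n";
}
# Jon:   exists=yes defined=yes true=yes
# Helen: exists=yes defined=no  true=no
# Baby:  exists=yes defined=yes true=no
# Harry: exists=no  defined=no  true=no
```

Only exists answers "yes" for all three people who are actually in the hash. Note that merely reading $age{"Harry"} does not create an entry for Harry.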

Hash Interpolation

Hash elements can be interpolated into a double quoted string but not the hash as a whole (unlike arrays).
print "$name is $age{$name}\n";
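A short sketch of the difference, using a one-entry %age; the variables $element and $whole are only there for illustration. Inside double quotes the percent sign has no special meaning, so the text %age is left exactly as written.

```perl
use strict;
use warnings;

my %age  = (Jon => 53);
my $name = "Jon";

my $element = "$name is $age{$name}";    # a hash element is interpolated
my $whole   = "ages: %age";              # the whole hash is not

print "$element\n";    # prints: Jon is 53
print "$whole\n";      # prints: ages: %age
```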

When to use a Hash?

Hashes are very useful when you need to make a lookup table or for finding all the unique elements of a set. This happens much more frequently than you might realize! Hashes prove to be a very useful part of the language.
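Both uses can be sketched in a few lines. The data here is made up: %days_in is a hypothetical lookup table from month abbreviation to number of days, and @words is a list containing duplicates. Because a key can appear only once in a hash, storing each word as a key collapses the duplicates.

```perl
use strict;
use warnings;

# Lookup table (hypothetical data): month abbreviation to number of days.
my %days_in = (Jan => 31, Apr => 30, Jun => 30);
print "Apr has $days_in{Apr} days\n";    # prints: Apr has 30 days

# Unique elements: each word becomes a key, so duplicates collapse.
my @words = ("red", "blue", "red", "green", "blue");
my %seen;
for my $word (@words) {
    $seen{$word} = 1;
}
my @unique = sort keys %seen;
print "@unique\n";    # prints: blue green red
```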

Exercises

  1. (Do this in stages - first things first!).

    Initialize a hash with names and ages. Print out the age of one of the people. Ask the user for a name, look up that person's age in the hash and print it out. Get into a loop asking for multiple names.

    Extend this idea to check to see if you know the person's age (use the function exists). If you don't know the name, ask for that person's age and remember it for later.

    At the end print out all the people's names and their ages. Then print it sorted by name. Then (a worthy challenge!) print it from oldest to youngest.

    Think about how you might have done this exercise without hashes! Remember that two people can have the same age!

    If you are familiar with networking, think about how you could have done this exercise with domain names and their corresponding IP addresses. A hash becomes a simple DNS!

    Instead of storing an age for a person's name, store their telephone number. A hash becomes a simple phone book!

  2. Read a file of names. Print a sorted list of the unique names found there and how often they occurred.
