Hash

A Ruby hash is an optimized collection that stores keys and values. We use keys to access the values—this is called a hash lookup. The Hash class provides this function.

With a hash, we can add, remove and enumerate values. These lookup tables are powerful. They can be combined with Arrays to solve complex problems.

New example

A hash is created by calling Hash.new(). In this example, we pass no argument to new. The hash has no special default value when a key is not found.

Start We add 3 string keys that have integer values. These are key-value pairs.

Info We use the square brackets, "[" and "]" to specify the key. The value we add is indicated by assignment.

# Create a new hash.
items = Hash.new()

# Store these key-value pairs in it.
items["milk"] = 5
items["eggs"] = 10
items["bread"] = 15

# Display the values.
puts items["milk"]
puts items["eggs"]
puts items["bread"]

5
10
15

Default value

A hash can have a custom default value. This is returned when a nonexistent key is accessed. Here we access a key that does not exist, and we get the specified default value of -1.

# Use the default value of -1.
sizes = Hash.new(-1)

# Add keys and values.
sizes["jeans"] = 32
sizes["shirt"] = "medium"

# Access existing data.
puts sizes["jeans"]
puts sizes["shirt"]

# This doesn't exist.
puts sizes["jacket"]

32
medium
-1
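
A default value is often used for counting. The following is a minimal sketch, assuming a made-up list of words: every missing key starts at 0, so each word can be incremented without first testing whether its key exists.

```ruby
# Hypothetical input data for illustration.
words = ["milk", "eggs", "milk", "bread", "milk"]

# Missing keys read as 0, so += works on the first occurrence.
counts = Hash.new(0)
words.each do |word|
    counts[word] += 1
end

puts counts["milk"]       # 3
puts counts["eggs"]       # 1

# A key that was never stored still returns the default.
puts counts["jam"]        # 0

# Reading a missing key does not add it to the hash.
puts counts.key?("jam")   # false
```

Only the assignment inside the loop stores a key. The lookup of "jam" returns the default and leaves the hash unchanged.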

Count, delete

In this example, we use the count() method—this returns the number of keys in the collection. We then use the delete() method to remove a key and its value.

Note Before the delete() method is called, there are four elements. After it is invoked, there are only three.

# Add data to the new hash.
elements = Hash.new()
elements[100] = "a"
elements[200] = "b"
elements[300] = "c"
elements[400] = "d"

# Display count.
print "Count: ", elements.count(), "\n"

# Delete a key and its value.
elements.delete(100)

# Display new count.
print "Count: ", elements.count(), "\n"

Count: 4
Count: 3
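
The delete() method also returns a result. This short sketch, using a smaller hash than the example above, shows that it returns the removed value, or nil when the key is not present.

```ruby
elements = {100 => "a", 200 => "b"}

# delete returns the value that was removed.
removed = elements.delete(100)
puts removed            # a

# Deleting a missing key returns nil and leaves the hash unchanged.
missing = elements.delete(999)
puts missing.nil?       # true
puts elements.count()   # 1
```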

Delete_if

This method receives a block argument. In the block, the keys and values are evaluated in pairs. We return true or false based on the key and value, and each pair for which the block returns true is deleted.

Here We delete keys of length greater than 3. The key "rabbit" is deleted, but the other two ("cat" and "dog") remain.

Note We can also delete based on the value. Just test the second parameter in the block.

# A hash with three pairs.
values = {"cat" => 1, "dog" => 2, "rabbit" => 4}

# Delete keys longer than 3 chars.
values.delete_if{|key, value| key.length > 3}
puts values

{"cat"=>1, "dog"=>2}
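
Deleting by value works the same way. As a small sketch with the same three pairs, this block ignores the key and tests only the second parameter.

```ruby
values = {"cat" => 1, "dog" => 2, "rabbit" => 4}

# Delete pairs whose value is greater than 1.
values.delete_if { |key, value| value > 1 }

# Only the pair with value 1 remains.
puts values.keys.inspect   # ["cat"]
```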

Loop, keys

A for-loop can be used over the keys in a Hash. We call the keys() method on the hash variable in the for-loop statement. We must then access the value with a lookup.

# Add names.
names = Hash.new()
names["charlotte"] = "stant"
names["maggie"] = "verver"
names["adam"] = "verver"

# Use for-loop on keys.
for key in names.keys()
    print key, "/", names[key], "\n"
end

charlotte/stant
maggie/verver
adam/verver

Each

With this iterator, we loop over each pair in a hash. We must declare two variables. We use the names key and value for them. These are the current key and value in the hash.

Note A block that spans several lines is usually written with the "do" and "end" keywords. Curly brackets are the usual choice when the entire statement fits on one line.

Note 2 The each_pair method is aliased to each. There is no advantage in using it—in my experience, each is more standard.

numbers = {10 => 100, 20 => 200, 30 => 300}

# Use each to enumerate the pairs in the hash.
numbers.each do |key, value|
    # Display the key and value.
    print "  KEY: ", key, "\n"
    print "VALUE: ", value, "\n"
end

  KEY: 10
VALUE: 100
  KEY: 20
VALUE: 200
  KEY: 30
VALUE: 300

One-line syntax

We do not need multiple lines to use each over a hash. Here we use the each method on a hash with a block contained by curly brackets. We print the keys and values.

numbers = {100 => "A", 200 => "B", 300 => "C"}

# Use each with one-line syntax.
numbers.each {|k, v| print k, " ", v, "\n"}

100 A
200 B
300 C

Empty

With count() we can see if a hash is empty by checking it for zero elements. But the "empty?" method is another syntax for this check. If it returns true, the hash has no elements.

# New hash.
items = Hash.new()

# Check emptiness.
if items.empty?
    puts "Empty"
end

# Add something.
items["sam"] = 1

# It is no longer empty.
if !items.empty?
    puts "Not empty"
end

Empty
Not empty

Merge

It is sometimes necessary to combine (union, or merge) two hashes into one. This puts all distinct keys from both hashes into a single hash. When a key appears in both hashes, only one value is kept: the one from the hash passed as the argument.

Result The merged hash has only one value for "b", which is 2. This comes from the argument hash (one) and replaces the 3 stored in the receiver (two).

So The duplicate value was lost in the merged hash. This could be a problem if there are two valid values.

# Two input hashes.
one = Hash["a" => 1, "b" => 2]
two = Hash["b" => 3, "c" => 0, "d" => 3]

# Merge them into a third.
both = two.merge(one)

# Display result.
puts both

{"b"=>2, "c"=>0, "d"=>3, "a"=>1}
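
When both values for a shared key are needed, merge() accepts a block that decides what to store. This is a sketch with shortened input hashes, and the choice to keep both values in an array is only one possible policy.

```ruby
one = {"a" => 1, "b" => 2}
two = {"b" => 3, "c" => 0}

# The block runs only for keys present in both hashes.
# It receives the key, the receiver's value, then the argument's value.
both = two.merge(one) { |key, two_value, one_value| [two_value, one_value] }

puts both["b"].inspect   # [3, 2]
puts both["a"]           # 1
puts both["c"]           # 0
```

Keys that occur in only one hash are copied as they are, so the block has no effect on "a" or "c".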

Invert

In a hash, each key points to one value. We can quickly access keys, but not values. To efficiently access values, we can use the invert() method to create an inverted hash.

Then We can use a fast lookup on a value to get its original key. And this optimizes certain program requirements.

# Create a hash and invert it.
one = Hash["a" => 1, "b" => 2]
two = one.invert()

# Display both hashes.
puts one
puts two

{"a"=>1, "b"=>2}
{1=>"a", 2=>"b"}
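
One caution applies when values repeat. In this sketch, which uses a made-up scores hash, two keys share the value 1, so the inverted hash can hold only one of them and the later pair wins.

```ruby
scores = {"a" => 1, "b" => 2, "c" => 1}
inverted = scores.invert()

# The value 1 appears twice, so only the last key ("c") survives.
puts inverted[1]       # c
puts inverted[2]       # b
puts inverted.size     # 2
```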

Keys

How can we test a key for existence? In Ruby, the "key?" and "has_key?" methods are effective. These methods, which have the same functionality, return true or false.

Note These methods are fast because they just perform a lookup. Two other methods in Ruby, "member?" and "include?", are equivalent.

# A hash of utensils
silverware = Hash["spoon" => 10, "fork" => 20]

# See if key exists.
if silverware.key?("fork")
    puts "Found: fork"
end

# Check that key does not exist.
if !silverware.has_key?("knife")
    puts "Not found: knife"
end

Found: fork
Not found: knife
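
These methods matter most when a stored value can be nil. In this sketch, the "ladle" entry is a made-up example of such a value: a plain lookup cannot tell it apart from a missing key, but "key?" can.

```ruby
silverware = {"spoon" => 10, "ladle" => nil}

# A plain lookup returns nil for a nil value and for a missing key.
puts silverware["ladle"].nil?        # true
puts silverware["knife"].nil?        # true

# key? and its equivalents test the key itself.
puts silverware.key?("ladle")        # true
puts silverware.include?("knife")    # false
puts silverware.member?("spoon")     # true
```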

Flatten

Sometimes we want to place a hash's pairs into an array. Flatten() has this effect. It places keys and values into a single array. The ordering of key, then value, is retained.

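Tip Flatten has the same effect as pairing each key from keys() with its value from values() and placing the pairs, one after another, into a third array. Simply appending the values to the keys would group all the keys first.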

However It is better to use flatten() when this combined data structure is required—it is simpler.

# Create and flatten a hash.
numbers = Hash[1 => 10, 2 => 20, 3 => 30]
flat = numbers.flatten()

# Display flattened array.
puts flat

1
10
2
20
3
30
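
To make the ordering concrete, this sketch compares flatten() with two hand-built arrays: one that pairs each key with its value, and one that simply appends the values to the keys.

```ruby
numbers = {1 => 10, 2 => 20, 3 => 30}

flat = numbers.flatten()

# Pair each key with its value, then flatten the pairs.
zipped = numbers.keys().zip(numbers.values()).flatten()

# Append all values after all keys.
appended = numbers.keys() + numbers.values()

puts flat == zipped      # true
puts flat == appended    # false
```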

Eql method

Next, the "eql?" method compares two hashes for exact content equality. It checks all keys and all values. If any difference is found, "eql?" returns false.

Here Three hashes are created. The first hash, fruit1, is equal to the third hash, fruit3. But the second hash is not equal.

# Create three hashes.
fruit1 = Hash["apple" => 1, "pear" => 2]
fruit2 = Hash["guava" => 3, "apricot" => 4]
fruit3 = Hash["pear" => 2, "apple" => 1]

# See if first hash equals second hash.
if !fruit1.eql?(fruit2)
    puts "1 not equal to 2"
end

# First hash equals third hash.
# ... Ordering does not matter.
if fruit1.eql?(fruit3)
    puts "1 equals 3"
end

1 not equal to 2
1 equals 3
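
The "eql?" method is stricter than the == operator about values. In this sketch, the two hashes differ only in that one stores the integer 1 and the other the float 1.0.

```ruby
ints = {"apple" => 1}
floats = {"apple" => 1.0}

# The == operator compares values with ==, so 1 and 1.0 match.
puts ints == floats        # true

# eql? compares values with eql?, which also requires the same numeric class.
puts ints.eql?(floats)     # false
```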

Inspect

A hash is converted to a string with inspect(). This is helpful when we want to store string representations in a file or in memory.

Note The string is a snapshot. We can make further changes to the hash, and the string returned earlier stays the same.

Note 2 The string representation returned by inspect() uses arrows to separate keys and values.

# An input hash.
values = {"a" => 10, "b" => 20}

# Convert to string.
s = values.inspect

# Display the string.
puts s

# String length.
puts s.length

{"a"=>10, "b"=>20}
18
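
The string from inspect() is independent of the hash it came from. As a small sketch, here the hash is changed after the call, and the saved string does not pick up the new key.

```ruby
values = {"a" => 10}

# Save the string representation.
s = values.inspect

# Change the hash afterwards.
values["b"] = 20

# The saved string no longer matches the current hash.
puts s == values.inspect   # false
puts s.include?("b")       # false
```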

Nested

A hash can contain other hashes—we can access items by chaining lookups. This allows us to create a tree-like data structure. Here we create a hash with two nested hashes in it.

lookup = Hash[]

# Create and add a subhash.
subhash = Hash[]
subhash["paws"] = 4
subhash["fur"] = "orange"

lookup["cat"] = subhash

# Create another subhash.
subhash = Hash[]
subhash["tail"] = 1
subhash["ears"] = 2

lookup["dog"] = subhash

# Display nested hashes.
puts lookup["cat"]
puts lookup["dog"]

# Get values from nested hashes.
puts lookup["cat"]["paws"]
puts lookup["dog"]["ears"]

{"paws"=>4, "fur"=>"orange"}
{"tail"=>1, "ears"=>2}
4
2
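
Chained lookups fail when an outer key is missing, because the first lookup returns nil. One alternative is the dig() method, sketched here with the same data written as a literal. The "bird" key is a made-up missing entry.

```ruby
lookup = {
    "cat" => {"paws" => 4, "fur" => "orange"},
    "dog" => {"tail" => 1, "ears" => 2}
}

# dig follows the keys in order.
puts lookup.dig("cat", "paws")           # 4

# A missing outer key gives nil instead of an error.
puts lookup.dig("bird", "wings").nil?    # true

# The chained form raises NoMethodError, because nil has no [] method.
begin
    lookup["bird"]["wings"]
rescue NoMethodError
    puts "No bird entry"
end
```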

Sort

A hash can be sorted. When we call sort on it, Ruby converts it into a sortable array of key-value pairs. By default, the keys are sorted, in ascending order (from lowest to highest).

plants = {"carrot" => 5, "turnip" => 10, "apple" => 8}

# Sort the hash by its keys.
plants.sort.each do |key, value|
    # Display the entry.
    puts key + ": " + String(value)
end

apple: 8
carrot: 5
turnip: 10

Sort on values

We can also use the sort method to create an array sorted by a hash's values. We specify a block with sort. In it, we compare the second element in each pair (at index 1).

Note There is no in-place sorting method on a hash. Calling "sort!" causes an error. We must always copy into a new variable.

plants = {"carrot" => 5, "turnip" => 10, "apple" => 8}

# Sort the hash by its values.
# ... Iterate over the resulting array.
result = plants.sort{|x, y| x[1] <=> y[1]}
result.each do |key, value|
    puts String(value) + "..." + key
end

5...carrot
8...apple
10...turnip
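
If a hash is wanted instead of an array of pairs, the sorted pairs can be converted back. This sketch uses sort_by() for the comparison and to_h() for the conversion. It relies on a hash keeping its insertion order.

```ruby
plants = {"carrot" => 5, "turnip" => 10, "apple" => 8}

# sort_by returns an array of [key, value] pairs ordered by value.
pairs = plants.sort_by { |key, value| value }

# to_h builds a new hash that keeps the sorted order.
by_value = pairs.to_h()

puts by_value.keys.inspect   # ["carrot", "apple", "turnip"]
puts by_value["apple"]       # 8
```

The original plants hash is not modified. The result is a new hash.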

Benchmark, hash

One goal of Hash is to improve performance. When a lookup is done, a hash code is computed. This estimates the element's location in memory and speeds searching.

Version 1 In this code, we test lookups in a hash. We look up the string keys "two" and "four".

Version 2 In this version of the code, we search for those same elements in a similar Array.

Result We find Hash improves performance. Even on a small collection of five elements, the hash lookups finished in 55 ms while the array searches took 86 ms, so the array version needed about 56% more time.

h = Hash["one" => 0, "two" => 1, "three" => 2,
         "four" => 3, "five" => 4]
a = Array["one", "two", "three", "four", "five"]
count = 100000

n1 = Time.now

# Version 1: perform hash lookups.
count.times do
    # Two hash lookups.
    v = h["four"]
    v = h["two"]
end

n2 = Time.now

# Version 2: perform array searches.
count.times do
    # Two array find operations.
    v = a.index("four")
    v = a.index("two")
end

n3 = Time.now

# Compute milliseconds total.
# ... Subtracting two Time objects gives elapsed seconds as a float.
puts(((n2 - n1) * 1000).round)
puts(((n3 - n2) * 1000).round)

55 ms   Hash lookup
86 ms   Array index
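
Wall-clock time can shift if the system clock is adjusted during a run. As an alternative sketch, this version reads a monotonic clock through Process.clock_gettime. The measure_ms helper is a made-up name for illustration, not part of Ruby.

```ruby
# Hypothetical helper: runs the block and returns elapsed milliseconds.
def measure_ms
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
    (Process.clock_gettime(Process::CLOCK_MONOTONIC) - start) * 1000.0
end

h = {"one" => 0, "two" => 1, "three" => 2, "four" => 3, "five" => 4}
a = ["one", "two", "three", "four", "five"]
count = 100000

# Version 1: hash lookups.
hash_ms = measure_ms do
    count.times { h["four"]; h["two"] }
end

# Version 2: array searches.
array_ms = measure_ms do
    count.times { a.index("four"); a.index("two") }
end

# Timings vary by machine and Ruby version, so no output is shown here.
puts "Hash lookup: #{hash_ms.round} ms"
puts "Array index: #{array_ms.round} ms"
```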

Detect duplicates

The hash detects duplicates—each key is unique. With a hash, we can find the duplicates in an array. This is fast even on arrays with many elements.
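
One way to do this is sketched below. The find_duplicates method is a made-up helper, not part of Ruby: it records each element as a key the first time it is seen, and any element whose key already exists is a duplicate.

```ruby
# Hypothetical helper: returns the elements that occur more than once.
def find_duplicates(array)
    seen = Hash.new()
    duplicates = Hash.new()
    array.each do |item|
        if seen.key?(item)
            # The key already exists, so this element is a repeat.
            duplicates[item] = true
        else
            seen[item] = true
        end
    end
    duplicates.keys()
end

letters = ["a", "b", "a", "c", "b", "a"]
puts find_duplicates(letters).inspect   # ["a", "b"]
```

Each element needs only one key test and at most one assignment, so the array is scanned a single time. Storing duplicates as keys also means that "a", which repeats twice, is reported once.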

Discussion

We often have a choice of data structures. If lookups predominate, a hash is fastest. If elements must be unique, a hash is also easiest—it detects duplicates.

The Hash class is a key part of Ruby. It is part of the core library, which makes it simple to access and readily available. It can have keys and values of many types.