Understanding Enumerable

by Pete Nicholls on at Christchurch Ruby

The first few sections are aimed at newcomers to Ruby, so feel free to skip it if you already know it.

Enumerable is a fascinating mixin. Its use makes for unusual, characteristic, and expressive Ruby code. In this talk, we’ll take a look at Ruby’s Array class, discuss why Rubyists don’t use for, and learn how to count the uncountable.

Forming arrays

When I first starting writing Ruby, I remember being impressed with the breadth and expressiveness of the standard library. Perhaps none stood out to me more than the Array class.

Let’s look at a few examples. You can try any of these examples using the interactive console that ships with Ruby, irb. Just type irb in a terminal and hit return to start an interactive Ruby session.

Let’s start by creating an array:

parents = ["Ned", "Catelyn"]
# => ["Ned", "Catelyn"]

Nothing out of the ordinary here, but there’s a simpler way to write arrays of whitespace-separated values:

children = %w( Robb Sansa Arya Bran Rickon )
# => ["Robb", "Sansa", "Arya", "Bran", "Rickon"]

Now, let’s combine the two arrays together:

starks = parents + children
# => ["Ned", "Catelyn", "Robb", "Sansa", "Arya", "Bran", "Rickon"]

Seems natural, right? Subtraction works too:

starks - children
# => ["Ned", "Catelyn"]

Accessing items:

starks.first # => "Ned"
starks.at(3) # => "Sansa"
starks[-1] # => "Rickon"
starks[3..5] # => ["Sansa", "Arya", "Bran"]

In the last example, 3..5 is a Range. We’ll come back to those later.

Array iteration

Ruby has a for loop:

for name in starks
  puts name
end

You won’t often see it used, though. Instead, most Rubyists will use each:

starks.each do |name|
  puts name
end

There’s an important distinction, and we’ll come back to it later. For the moment, just take a look at this example.

Between the do and end we have a block. You can think of a block as an anonymous function or a closure, if you’re familiar with those concepts. Otherwise, just think of it as a “block” of code.

The block gets executed once per item in the array.

Another way to write a block is with curly braces. The following is identical to the last example:

starks.each { |name| puts name }

What if we wanted the index, too?

starks.each_with_index { |name, i| puts "#{i}. #{name}" }

Which would print:

0. Ned
1. Catelyn
2. Robb
...

Filtering and sorting arrays

Many of the methods on arrays use blocks.

Let’s try filtering an array:

starks.select { |name| name.end_with?('a') }
# => ["Sansa", "Arya"]

When using select, the return value of the block will be evaluated to determine which elements are returned in the new array. In Ruby, all methods and blocks have a return value, in this case the last line of the block.

Similar to select, we have reject:

starks.reject { |name| name.include?('a') }
# => ["Ned", "Robb", "Rickon"]

sort is available:

starks.sort
# => ["Arya", "Bran", "Catelyn", "Ned", "Rickon", "Robb", "Sansa"]

Because most methods on array return another array, you can chain methods together:

starks.sort.reverse
# => ["Sansa", "Robb", "Rickon", "Ned", "Catelyn", "Bran", "Arya"]

Here, we’re using a block sorting by the length of their name:

starks.sort_by { |name| name.length }
# => ["Ned", "Robb", "Arya", "Bran", "Sansa", "Rickon", "Catelyn"]

If you want to call a single method on each item, there’s a shorthand syntax for that:

starks.sort_by(&:length)

I won’t discuss how this works, but it’s pretty neat if you want to find out for yourself.

Transforming arrays

What if we wanted to transform all the names?

starks.map { |stark| stark.upcase }
# => ["NED", "CATELYN", "ROBB", "SANSA", "ARYA", "BRAN", "RICKON"]

Let’s add the last name to each member of the Stark house. This time, we’re going to use map’s brother, map!:

starks.map! { |first_name| "#{first_name} Stark" }
# => ["Ned Stark", "Catelyn Stark", "Robb Stark", "Sansa Stark", "Arya Stark", "Bran Stark", "Rickon Stark"]

The difference between map and map! is that map will return a new array, while map! will modify the existing array.

In the same way, select and reject have select! and reject! counterparts that modify the original array rather than retuen a new one.

Ruby Tip: The exclamation mark (also known as the bang), is a perfectly valid character to use as part of a method name. Rubyists tend to use exclamation marks at the end of their methods to let other developers know that the method is dangerous. It might be making permanent changes to your object, database, or doing something unexpected.

Searching for answers

Now that we have all our Starks properly named, let’s add one more:

starks << "Jon Snow"
# => ["Ned Stark", "Catelyn Stark", "Robb Stark", "Sansa Stark", "Arya Stark", "Bran Stark", "Rickon Stark", "Jon Snow"]

How would we know if this array contained someone whose last name was Snow? We could use select, but it would have to iterate through every item in the array, even if it had already found a Snow. Wouldn’t it be better if we had a method that returned true the moment it found one?

starks.any? { |name| name.end_with?('Snow') }
# => true

The question mark in the any? method is similar to the exclamation mark before – it’s simply a Ruby convention, this time to indicate a method that returns a true (or truthy) value.

starks.all? { |name| name.end_with?('Stark') }
# => false

starks.delete('John Snow')
starks.all? { |name| name.end_with?('Stark') }
# => true

Range iteration

Now you’ve seen a few tricks from the Array class, let’s take a look at another common class, Range:

(0..5).each { |n| puts n }

Notice how we’re using each again. Here’s a method that allows us to iterate over the range in slices:

(1..100).each_slice(5) { |set| puts set.inspect }

Which would produce:

[1, 2, 3, 4, 5]
[6, 7, 8, 9, 10]
...snip...
[96, 97, 98, 99, 100]

Counting to Infinity

In most languages, dividing by zero raises an error:

>>> 1.0/0
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ZeroDivisionError: float division by zero

Not so in Ruby:

Infinity = 1.0/0
# => Infinity

Interestingly, you can create a range that uses Infinity:

all_positive_numbers = 0..Infinity
all_positive_numbers.include?(3) # => true
all_positive_numbers.include?(41343124) # => true
us_debt = 17_000_000_000_000
all_positive_numbers.include?(us_debt) # => true

You can even create a range between negative and positive Infinity:

all_numbers = -Infinity..Infinity
all_numbers.include?(-35623) # => true

You could, if you wanted to, use each to iterate over every postive number.

all_positive_numbers.each { |n| puts n }

It may take some time. Better, perhaps, to take just a few:

all_positive_numbers.take(5)
# => [0, 1, 2, 3, 4]

Or perhaps in slices:

 all_positive_numbers.each_slice(5).take(2)
# => [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]

Notice how we didn’t pass a block to each_slice this time? I wonder what happens if we just call each by itself?

enumerator = all_positive_numbers.each
# => #<Enumerator: 0..Infinity:each>

enumerator.next # => 0
enumerator.next # => 1
enumerator.next # => 2

Interesting.

Hashes

Hashes allow table-like mappings between names and values:

sigils = {
  'Stark'     => 'Direwolf',
  'Lannister' => 'Lion',
  'Greyjoy'   => 'Kraken'
}

Hashes, too, have an each method:

sigils.each do |house, animal|
  puts "The sigil of House #{house} is a #{animal}."
end

Let’s have a look at the methods on a Hash:

Hash.instance_methods
# => [..., :each, :map, :select, :reject, :all?, :any?, ...]

Hmm, these are the same methods that appear on the Array and Range classes.

Introducing Enumerable

Array, Range, and Hash share a number of identical (and useful) methods. How is this happening?

Hash.included_modules # => [Enumerable, …]
Array.included_modules # => [Enumerable, …]
Range.included_modules # => [Enumerable, …]

Each of these classes include the Enumerable module. What does that do?

Enumerable.instance_methods
# => [:to_a, :entries, :sort, :sort_by, :grep, :count, :find, :detect, :find_index, :find_all, :select, :reject, :collect, :map, :flat_map, :collect_concat, :inject, :reduce, :partition, :group_by, :first, :all?, :any?, :one?, :none?, :min, :max, :minmax, :min_by, :max_by, :minmax_by, :member?, :include?, :each_with_index, :reverse_each, :each_entry, :each_slice, :each_cons, :each_with_object, :zip, :take, :take_while, :drop, :drop_while, :cycle, :chunk, :slice_before, :lazy]

Aha! The methods we’ve been using have been defined inside the Enumerable module.

What’s a module?

A module is a collection of methods. Modules are sometimes called “mixins”, because you can mix in the methods defined in a module into a class.

Do you bake? I’m terrible at it. I can never remember to keep an eye on the time. Let’s write some Ruby that knows how to bake:

module Baking
  def bake!
    "Baking for #{cooking_time} at #{temperature}"
  end
end

And let’s add our newfound baking prowess to a cake:

class Cake
  include Baking

  def cooking_time
    "45 mins"
  end

  def temperature
    "180ºC"
  end
end

And now our cake knows how to bake itself:

cake = Cake.new
cake.bake! # => "Baking for 45 mins at 180ºC"

Browsing books

Let’s say we had a Book and a Bookshop class:

class Book < Struct.new(:title, :price)
end

class Bookshop
  def initialize(books)
    @books = books
  end

  def number_of_books
    @books.length
  end
end

The Struct is a little trick we’re using to sketch out the classes.

Book and Bookshop can be used like this:

books = [
  Book.new('The Little Book of Calm', 2.20),
  Book.new('Complete Works of Dickens', 300),
  Book.new('The Elephant and the Balloon', 5)
]

black_books = Bookshop.new(books)
black_books.number_of_books # => 3

How could we browse through each of the books in the bookstore?

Defining each

Let’s make our own each, which will hand us each of the books:

class Bookshop
  # ...snip...

  def each
    @books.each { |book| yield book }
  end
end

Here’s how you could use the each method we’ve implemented:

bookshop.each do |book|
  puts book.title
end

Our each just calls each on the underlying @books array. Your each could be much more interesting – every iteration of each could execute a database query, or query a web service, or take a reading from a weather station, for example. It’s entirely up to you.

Mixing in the magic

The clever bit comes when we include the Enumerable module.

class Bookshop
  include Enumerable
  # ...snip...
end

Now, we can use all the wonderful methods we could with Array, Range, and Hash:

bookshop.filter { |book| book.price <= 5.00 }
bookshop.any? { |book| book.title.start_with?('Complete') }
...

Enumerable gives you all these methods for free. You just need to define what each means to you.

Building on the abstraction

We can also easily implement methods using Enumerable-provided methods:

class Bookshop
  # ...snip...

  def prices
    map(&:price)
  end

  def titles
    map(&:title)
  end

  def under_ten_dollars
    filter { |book| book.price < 10 }
  end

  def total_value_of_stock
    prices.inject(:+)
  end
end

Let’s take a look:

black_books.prices # => [2.20, 300, 5]
black_books.under_ten_dollars # a list of books
black_books.total_value_of_stock # => 307.2

Neat!

Ducking under the hood

Array, Range, and Hash all implement their own each method, and then simply include the Enumerable module to get all of the convienient methods. How does that work?

Well, here’s how map could have been implemented inside Enumerable:

module Enumerable
  def map
    new_collection = []

    self.each do |item|
      modified_item = yield item
      new_collection << modified_item
    end

    new_collection
  end
end

In other words, all the methods you’ve seen so far (select, each_with_index, all?, map, etc.) have been written purely in terms of each.

This is a beautiful example of polymorphism in the wild. By abstracting away the underlying class or data structure, Enumerable solves a number of common problems concerning collections in one fell swoop. It can handle everything from hashes to bookshops.

Enumerable: simple, flexible, and beautiful. Just like Ruby.

Further reading