A Short Journey
From Higher-Order Functions in Ruby
to Lazy Enumerables

Yuri Leikind

Part 1 – Higher-Order Functions and Ruby

Higher-Order Functions

A higher-order function is a function that does at least one of the following:

Source: http://en.wikipedia.org/wiki/Higher-order_function

Taking one or more functions as an input

Usual thing for every Javascript developer:

Taking functions as an input in Ruby

Functions which output a function

Higher-Order Functions are essential to the FP paradigm

All of these use higher-order functions as the primary construct.

Higher-order functions in Ruby are also very important.




If fact, functions as arguments are so ubiquitous that there is a special syntax for it.



Part 2 – Blocks

Blocks

A block is a higher-order function which is an argument to a method call


It is easy to prove that blocks are functions:


The good design of blocks and their terse syntax allow for beautiful code and awesome APIs:

Standard library:

Blocks allow to build your own syntax constructs and DSL’s:

Sinatra


Rails routes:

It all looks like language syntax, but it is not:

Rspec:

More examples

A Rails plugin by yours truly:

Ruby blocks are the reason why Martin Odersky believes that Ruby is a functional language.

More and deeper on Ruby blocks

You can read more on Ruby blocks and how they are special in a post by Yehuda Katz The Building Blocks of Ruby

http://yehudakatz.com/2010/02/07/the-building-blocks-of-ruby/

Part 3 – Higher-order functions on enumerables

Who needs for loops?

when you can build language constructs with blocks and while:

Higher-order methods on enumerables

Using only each Ruby can implement the same set of higher-order methods on enumerables as functional languages:

All these are found in module Enumerable which can be included in any collection class which implements each

Example: map

Example implementation:

Example: select (filter)

Example implementation:

Example: all?

Example implementation:

A more real-life example




One more example

Part 4 – Problem

Problem

Each such method creates a new collection. They can be huge…

We can rewrite this code to process each word one by one. That’s one solution…

But we can solve the problem without changing the API much

This is what we have, a sequential execution:

This is what we need

Our blocks working concurrently, and processing elements one by one:

In other words

This is what we have


incrementing 1
incrementing 2
incrementing 3
incrementing 4
incrementing 5
checking if 2 is even
checking if 3 is even
checking if 4 is even
checking if 5 is even
checking if 6 is even
2
4
6

This is what we need


incrementing 1
checking if 2 is even
2
incrementing 2
checking if 3 is even
incrementing 3
checking if 4 is even
4
incrementing 4
checking if 5 is even
incrementing 5
checking if 6 is even
6

The second modus operandi is on demand, it is lazy.

How do we do it?



Part 5 – Fibers

Fibers: lightweight cooperative concurrency

Fibers were added in Ruby 1.9

Here’s what the documentation says:

Fibers are primitives for implementing light weight cooperative concurrency in Ruby. Basically they are a means of creating code blocks that can be paused and resumed, much like threads. The main difference is that they are never preempted and that the scheduling must be done by the programmer and not the VM.

Fibers: Example

Example: Fibonacci numbers as a never-ending fiber

Part 6 – Implementing laziness

Disclaimer

The goal of this implementation is only to demonstrate how laziness can be achieved with the help of fibers.

This is neither a complete nor a true implementation of Enumerable::Lazy in Ruby 2.0.

Beauty and completeness have been sacrificed in favor of simplicity.

Enumerable::Lazy in Ruby 2.0 uses Enumerator added in Ruby 1.9 as the basis. Enumerator uses fibers. This proof of concept code uses fibers directly because Enumerator is beyond the scope of this presentation.

Implementation: step 1

Let’s create a wrapper class which will contain a fiber (which in its turn will contain the block defined in the client code)

Let’s add a method in Enumerable which will create an instance of this wrapper out of an enumerable:

Implementation: step 2

The wrapper object should have an API similar to Enumerable. each should be non-lazy, map, select and others should be lazy.

Let’s implement the non-lazy behavior and each. We just access the underlying enumerable via a fiber, no added logic here:

Let’s run it

Oops…

Implementation: step 3

The value of the block is also returned from the fiber, so we have to rewrite each to ignore the last value returned from the fiber.

Implementation: step 4

Let’s implement map and select. We’ll convert the client code block to an explicit function object for the sake of readability.

Implementation: step 5

We are incapsulating the client code block in yet another function which implements the logic of map or select, but it calls Fiber.yield to return the result, because this function will be called from within a fiber.

Implementation: step 6

Both map and select return a new LazyPanda instance incapsulating new logic. In a way, we are dealing with a Russian doll pattern: the first LazyPanda incapsulates a real Enumerable, the next LazyPanda incapsulates the first LazyPanda, and so on.

Implementation: step 7

Now we need to modify the constructor. The constructor should have a second optional function parameter. In case such a function is present, the constructor places this function into a fiber instead of the default Fiber.yield(e):

Full source

Let’s test

No output. It is lazy. This code has only created the Russian doll structure. No block has been executed.

Let’s test

Voila!

Achievement unlocked

Further reading if you want to dig deeper

The End – Thank You