Transducers provide a nice way of efficiently processing arrays (and list-like types) by combining many operations into a single one. To illustrate this we will come up with a situation where the naive way of mapping and filtering an array of integers has some obvious flaws, and then look at how we can remedy it.
Before we get to that we need to build up some tools to make function composition a little nicer. After all, function composition is one of the most important tools in functional programming. Here are a few simple combinators to start things off with:
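The combinators themselves aren't shown in this excerpt. A minimal sketch of the pipe-forward and composition operators the later examples rely on might look like this (the operator names `|>` and `>>>` are assumptions, following common Swift convention):

```swift
precedencegroup ForwardApplication {
  associativity: left
}
infix operator |>: ForwardApplication

// Pipe a value into a function: x |> f is the same as f(x).
func |> <A, B>(_ x: A, _ f: (A) -> B) -> B {
  return f(x)
}

precedencegroup ForwardComposition {
  associativity: left
  higherThan: ForwardApplication
}
infix operator >>>: ForwardComposition

// Compose two functions left-to-right: (f >>> g)(x) == g(f(x)).
func >>> <A, B, C>(_ f: @escaping (A) -> B, _ g: @escaping (B) -> C) -> (A) -> C {
  return { g(f($0)) }
}
```

With these, `xs |> fmap(square >>> incr)` reads left-to-right in the order the data flows.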
This signature helps clarify that `filter` simply lifts a predicate `A -> Bool` to a function on arrays `[A] -> [A]`.
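The definition being described is missing from this excerpt; a curried free-function `filter` with that signature might look like:

```swift
// Lifts a predicate (A) -> Bool to a function on arrays ([A]) -> [A].
func filter<A>(_ predicate: @escaping (A) -> Bool) -> ([A]) -> [A] {
  return { xs in xs.filter(predicate) }
}
```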
Using these combinators and array processing functions we can do all types of fun stuff. Let's take a large array of integers and run them through a few functions:
This isn't very efficient because we know that this will process the array `xs` twice: once to `square` and again to `incr`. Instead we could compose `square` and `incr` once and feed that into `fmap`:
Now we are simultaneously squaring and incrementing the `xs` in a single pass. What if we then wanted to `filter` that array of numbers to find all of the primes? We're back to being inefficient again, since we are processing the `xs` twice: once to `square` and `incr`, and again to check `isPrime`.
Transducers aim to remedy this by collecting all mapping and filtering functions into a single function that can be run once to process the `xs`. The idea stems from the observation that most of the functions we write for processing arrays can actually be written in terms of `reduce`. (In fact, one can make a precise statement about all array functions being rewritable in terms of `reduce`.)
For example, here is how one might write `map`, `filter` and `take` in terms of `reduce`:
Now that we know `reduce` is in some sense "universal" among functions that process arrays, we can try unifying all of our array processing under `reduce` and see if that aids in composition. To get to that point we are going to define a few more things. First, a term: given data types `A` and `C`, we call a function of the form `(C, A) -> C` a reducer on `A`. These are precisely the kinds of functions we can feed into `reduce`. The first argument is the accumulation, and the second is the element of the array currently being inspected. A function that takes a reducer on `A` and returns a reducer on `B` is called a transducer. A simple example would be the following:
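The example itself isn't shown in this excerpt; a sketch of such a `mapping` transducer (the name matches how it's used later in the text) might be:

```swift
// mapping lifts f: (A) -> B to a transducer: it takes a reducer on B
// and returns a reducer on A, transforming each element with f before
// handing it to the underlying reducer.
func mapping<A, B, C>(_ f: @escaping (A) -> B) -> ((C, B) -> C) -> ((C, A) -> C) {
  return { reducer in
    { accum, x in reducer(accum, f(x)) }
  }
}
```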
It takes any function `A -> B` and lifts it to a transducer from `B` to `A`. It is very important to note that the direction changed! This is called contravariance. Note that the implementation of this function is pretty much the only thing we could do to make it compile. It is almost as if the compiler is writing it for us.
Another example:
This lifts a predicate `A -> Bool` to a transducer from `A` to `A`.
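The code for this example is also missing from the excerpt; a sketch of a `filtering` transducer with that behavior could be:

```swift
// filtering lifts a predicate on A to a transducer from A to A:
// the new reducer folds in only the elements that pass the predicate.
func filtering<A, C>(_ p: @escaping (A) -> Bool) -> ((C, A) -> C) -> ((C, A) -> C) {
  return { reducer in
    { accum, x in p(x) ? reducer(accum, x) : accum }
  }
}
```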
We can use these functions to lift functions and predicates to transducers, and then feed them into `reduce`. In particular, consider `mapping(square)`. That lifts `square` to a transducer `((C, Int) -> C) -> ((C, Int) -> C)`, where `C` can be any data type. If we feed a reducer into `mapping(square)` we get another reducer. A simple reducer that comes up often when dealing with arrays is `append`, which Swift doesn't have natively implemented, but we can write easily enough:
Then what does `mapping(square)(append)` do? It just squares an integer and appends it to an array of integers.
Feeding the reducer `mapping(square)(append)` into `reduce`, we see that we get the same thing we would have gotten had we mapped with `square`:
Ok, but now we've just made this code more verbose to seemingly accomplish the same thing. The reason to do this is that transducers are highly composable, whereas regular reducers are not. We can also do:
Well, once again we didn't produce anything new that `map` didn't provide before. However, now we will mix in filters!
There we go. This is the first time we've written something equivalent with both `reduce` and `map`, but the `reduce` way resulted in processing the `xs` a single time, whereas the `map` way needed to iterate over `xs` twice.
Let's add another wrinkle. Say we didn't just want those primes that are of the form `n*n+1` for `2 <= n <= 100`, but we wanted to find their sum. It's a very easy change:
Now that looks pretty good! Some really terrific code reusability going on right there.
Sometimes it's useful to truncate an array to the first `n` elements, especially when dealing with large data sets. How can we introduce truncation into our `reduce` processing pipeline? Well, we need another transducer. Given an integer and a reducer, we need to construct a new reducer that simply accumulates until the accumulation has reached size `n`. Or in code:
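The code itself is missing from this excerpt; a sketch of a `taking` transducer with that behavior could be:

```swift
// taking(n) only works with reducers whose accumulation is an array:
// it stops folding in new elements once the accumulation has n of them.
func taking<A, C>(_ n: Int) -> (([C], A) -> [C]) -> (([C], A) -> [C]) {
  return { reducer in
    { accum, x in accum.count < n ? reducer(accum, x) : accum }
  }
}
```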
Note that this transducer is specialized in that it can only work with reducers of the type `([C], A) -> [C]`, i.e. the accumulation needs to be an array, whereas our other transducers allowed for the more general case of `(C, A) -> C`.
Now say that we don't just want to find the primes of the form `n^2+1` for `2 <= n <= 100`. Say we want to find the first 10 twin primes of the form `n^2+1` (a twin prime is a prime `p` such that `p+2` is also prime). First we need a function for twin primality testing:
Then we can find the first 10 twin primes of the form `n^2+1` via:
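A self-contained sketch of that pipeline, restating the transducers from above (`isPrime` is again an assumed helper):

```swift
func isPrime(_ p: Int) -> Bool {
  if p <= 1 { return false }
  if p <= 3 { return true }
  var i = 2
  while i * i <= p {
    if p % i == 0 { return false }
    i += 1
  }
  return true
}

func isTwinPrime(_ p: Int) -> Bool {
  return isPrime(p) && isPrime(p + 2)
}

func mapping<A, B, C>(_ f: @escaping (A) -> B) -> ((C, B) -> C) -> ((C, A) -> C) {
  return { reducer in { accum, x in reducer(accum, f(x)) } }
}

func filtering<A, C>(_ p: @escaping (A) -> Bool) -> ((C, A) -> C) -> ((C, A) -> C) {
  return { reducer in { accum, x in p(x) ? reducer(accum, x) : accum } }
}

func taking<A, C>(_ n: Int) -> (([C], A) -> [C]) -> (([C], A) -> [C]) {
  return { reducer in { accum, x in accum.count < n ? reducer(accum, x) : accum } }
}

let appendInt: ([Int], Int) -> [Int] = { accum, x in accum + [x] }

// Map n to n*n + 1, keep only twin primes, stop accumulating at 10,
// all inside a single reduce over 1...200.
let pipeline: ([Int], Int) -> [Int] =
  mapping({ (n: Int) in n * n + 1 })(filtering(isTwinPrime)(taking(10)(appendInt)))

let twinPrimes = Array(1...200).reduce([], pipeline)
```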
We will also re-define `map` to have a signature that is easier to work with:
This is all done with a single pass of the array of integers `1...200`, and it stops the moment 10 twin primes are found.
Note that we had to choose a large enough range (`1...200` in this case) to ensure that we would find all the twin primes we were looking for. That's an unfortunate choice to have to make. Instead, it would be nice to work on the full sequence of positive integers so we didn't have to worry about this. In fact, transducers are great for working on any list-like data type (streams, sequences, arrays, ...), and we could easily beef up everything we've done so far to work in the more general setting.
To be continued...
This signature makes it clear that `map` simply lifts a function `A -> B` to a function `[A] -> [B]`. For example:
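A sketch of that curried `map` together with an example use:

```swift
// map re-defined in curried form: map(f) is itself a function ([A]) -> [B].
func map<A, B>(_ f: @escaping (A) -> B) -> ([A]) -> [B] {
  return { xs in xs.map(f) }
}

let incremented = map({ (x: Int) in x + 1 })([1, 2, 3])
```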
Finally, we need to re-define `filter` just like we did for `map`: