Folds and recursion¶
I recently read a paper by Graham Hutton on folds and recursion. If you don't care much about functional programming, feel free to disengage now. 😆
It is an interesting paper, and I encourage you to give it a read. This article is more squirrel chasing and less about the central thesis of Graham Hutton's work.
Spoiler alert: Haskell doesn't have loops. That leaves tools like `map`, recursion, and `fold` in our toolbox.
Recursion can be very expensive, so the paper shows us how to abstract it away with `fold`. One of the examples rewrites `dropWhile` using `fold` instead of recursion.
```haskell
dropWhile' p = foldr f v
  where
    f x (ys, xs) = (if p x then ys else x : xs, x : xs)
    v = ([], [])

dropWhile p = fst . dropWhile' p
```
It took me a while to unpack what that chunk of code is doing.
- Iterate over the list, in reverse
- Store every visited value in `xs`
- Whenever the predicate stops firing, flip `ys` (which has been empty up to this point) to the current element prepended to `xs`
- The last line returns a function that takes a list of values and returns another (filtered) list
Clearly, the list must be finite for this to work.
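To make those steps concrete, here is a quick trace of the accumulator pair at each step (my own Python sketch, not from the paper; `reduce` over the reversed list plays the role of `foldr`):

```python
from functools import reduce

def trace_drop_while(pred, items):
    """Print the (ys, xs) pair after each step of the right fold."""
    def step(acc, x):
        ys, xs = acc
        # same rule as the Haskell `f`: keep ys while pred holds, else reset it
        acc = (ys, [x] + xs) if pred(x) else ([x] + xs, [x] + xs)
        print(f"x={x} -> ys={acc[0]} xs={acc[1]}")
        return acc
    # reducing over the reversed list mimics Haskell's foldr
    return reduce(step, reversed(items), ([], []))[0]

trace_drop_while(lambda x: x < 3, [1, 2, 3, 4])  # returns [3, 4]
```

Watching `ys` stay frozen while the predicate holds (reading bottom-up, i.e. left-to-right in the original list) is what finally made the Haskell version click for me.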
dropWhile in Python¶
Let's mess around with a translation and a couple variants. These are quick-and-dirty examples, missing guardrails.
```python
from functools import reduce

def drop_while(pred, items):
    def _drop_while(tup, x):
        ys, xs = tup
        if pred(x):
            return (ys, [x] + xs)
        return ([x] + xs, [x] + xs)
    # take the first element of the pair, like `fst` in the Haskell version
    return reduce(_drop_while, reversed(items), ([], []))[0]

%timeit drop_while(lambda x: x < 42, range(1000))
```
5.21 ms ± 37.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
```python
def drop_it_hot(pred, items):
    for i, v in enumerate(items):
        if pred(v):
            continue
        return items[i:]
    return items

%timeit drop_it_hot(lambda x: x < 42, range(1000))
```
6.43 µs ± 9.13 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
```python
from itertools import chain

def drop_it_hotter(pred, items):
    it = iter(items)
    v = next(it)
    while pred(v):
        v = next(it)
    # put the first failing value back in front of the rest of the iterator
    return chain([v], it)

%timeit drop_it_hotter(lambda x: x < 42, range(1000))
```
7.37 µs ± 33.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Not bad! If you had asked me ahead of time how those would perform, I would have guessed the exact opposite of the order that actually happened. The functional version has to traverse the entire list, while the other two short-circuit as soon as the predicate stops firing. If you know why this is the case, please leave a comment! (Buttons at the bottom of the page.)
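For comparison (not part of the timings above), Python's standard library already ships a lazy, short-circuiting version of this exact function, `itertools.dropwhile`:

```python
from itertools import dropwhile

# dropwhile yields items starting from the first one that fails the
# predicate; once the predicate fails, remaining items pass through untested
result = list(dropwhile(lambda x: x < 42, range(1000)))
print(result[:3], len(result))  # [42, 43, 44] 958
```

It behaves like `drop_it_hotter`: it returns an iterator, so nothing is materialized until you ask for it.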
All this made me think about the Fibonacci function (everyone's pet recursion function).
Let's start with the typical recursive implementation, then see if we can rewrite it using a fold.
```python
def fibr(n):
    if n in [1, 2]:
        return 1
    return fibr(n - 1) + fibr(n - 2)

%timeit fibr(10)
```
15.4 µs ± 95.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
```python
from functools import reduce

def fib(n):
    # `range` is simply a generator here, and we ignore its value
    return reduce(lambda x, _: (x[1], x[0] + x[1]), range(n - 1), (1, 1))[0]

%timeit fib(100)
```
18.6 µs ± 141 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
```python
def fibfor(n):
    # Boring, imperative version
    a, b = 1, 1
    if n in [1, 2]:
        return 1
    for i in range(2, n + 1):  # Am I the only one that finds this cringey?!
        a, b = b, a + b
    return a

%timeit fibfor(100)
```
5.82 µs ± 84.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Recursion is expensive¶
That recursive implementation is orders of magnitude worse than the other two: `fibr(10)` takes about as long as the other versions computing `fib(100)`.
I was again surprised that the imperative version kicked the functional version's butt.
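As an aside (this goes beyond the comparison above), most of the recursive version's cost comes from recomputing the same subproblems over and over. A sketch with `functools.lru_cache` shows memoization rescuing it:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fibr_cached(n):
    # same recursion as fibr, but each subproblem is computed only once
    if n in (1, 2):
        return 1
    return fibr_cached(n - 1) + fibr_cached(n - 2)

print(fibr_cached(100))  # plain fibr(100) would run for an absurdly long time
```

With the cache, the call tree collapses from exponential to linear in `n`, so even `fibr_cached(100)` returns immediately.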
TL;DR: when performance matters, measure!
I have more mathy work in progress; I wanted to slip this out between bigger projects. More to come soon!
Want to play with the Haskell code in the intro, but in the comfort of a Jupyter notebook? Check out IHaskell!