I was browsing
/r/learnprogramming
yesterday, and I saw a few posts discussing recursion: some posters
were confused by how recursive calls worked and how to write their own
recursive functions. (There were also questions as to why recursion
was desirable; I'll leave that for another post.) I understand how
these people feel really well: I also struggled and scratched my head
often before I was able to use recursion as easily as I can use a
for
loop.
In this post, I'd like to present a tip that has helped me grok recursion: assuming you have something that wish you had. Whenever I write recursive code and that I forget this tip, I struggle and when I remember it, code just flows out of my fingertips. We'll look at three small functions written recursively, and I hope that I am able to properly explain how I thought about the solutions and that it is helpful to you. The code examples will be written in Python.
The Towers of Hanoi is a classic computer science program that most CS undergrads have seen in at least one class. It's a "game" where you have a stack of disks of increasing diameter on a rod and you want to move those disks to a second rod, using a third intermediate rod, and respecting some rules:
Our goal is to write a function that will give the movements that we need to perform to move n disks from one rod to another. One way to achieve this is with a recursive function, so let's take a look at a solution.
First, what are the inputs of the function? We'll need to pass in 4 parameters:
So we have the following function definition:
def hanoi(disks, src, dst, temp):
pass
Easy enough. Let's implement the logic. In a recursive function, we always have two parts:
The base case is usually the simpler one, so let's start with it. In our problem, what is the simplest case we can imagine? What about moving 0 disks? Well that's pretty simple: we do nothing! Let's implement that part:
def hanoi(disks, src, dst, temp):
if disks == 0:
return []
else:
pass
Our function returns the moves as a list; doing nothing is represented by returning an empty list.
Now we come to the interesting part, the inductive case. It's easy to get lost here: we can look at the rules and start imagining all sorts of tests that we could perform to move the disks. At first view, it seems that there is a lot of logic to be implemented, and that can make your head spin a little bit. But let's remember the tip:
Assume you have something that you wish you had.
So what do we need to solve the towers of Hanoi? Well, it would be really swell if we had a way to move all but the largest disk into the temporary rod. Moving the largest disk from the source to the destination would then be a single step. Afterwards, it would also be really nice if we had a way to move the smaller disks from the temporary rod to the destination. To summarize, here's what we need:
Do we have that? Well, sort of... See, hanoi
is a function that is
supposed to take n
disks from one rod to another, so maybe we can
use that?
def hanoi(disks, src, dst, temp):
if disks == 0:
return []
else:
## Move all but the last disks to the temp rod.
step1 = hanoi(disks-1, src=src, dst=temp, temp=dst)
## Move the largest disk to the destination.
step2 = [(src, dst)]
## Move the smaller disks that are on the temp rod
## to the destination.
step3 = hanoi(disks-1, src=temp, dst=dst, temp=src)
## Combine and return the moves
return step1 + step2 + step3
>>> hanoi(3, "A", "B", "C")
[('A', 'B'), ('A', 'C'), ('B', 'C'), ('A', 'B'), ('C', 'A'), ('C', 'B'), ('A', 'B')]
step1
and step3
describe the two things we wish we could do, and
step2
is simply the move of the largest disk from source to
destination. And as it turns out, that is the complete solution!
Now if you're thinking that there is something a bit "magical" going
on, you are not wrong. After all, it's not because we wish we had
something that we have it. How is it that by simply stating our two
wishes in code, that the code works? You can convince find out in
detail by simulating it with values of disks
from 0, 1, 2, etc. if
you wish, but I find that thinking of it in high level terms just
"well, by some process, the smaller disks are moved out of the way"
is easier. This is the power of our little tip. Let's look at
another example.
If you took a discrete math class, you'll probably have heard of power sets; given a set of elements S, it's all the different ways we can take some of the elements of S.
For example, if S = {1, 2, 3}
, then we can take no elements ({}
),
just one element ({1}, {2}, {3}
), two elements ({1,2}, {1,3},
{2,3}
) or three elements ({1,2,3}
).
Let's say we wanted to write a function to generate a power set for
us, how would we do it? Perhaps you'll think of some way to use a
while
loop and some clever technique to pick out the correct
elements, but let's try our hand at another recursive function.
Again, we need to consider what the inputs of the function are, and
we'll also consider the base case. The input to the function will be
a set (we'll use a list in Python). What about the base case? Let's
think, what would be the simplest input we could give to our
powerset
function? The empty set sounds like a good candidate. And
how many ways can we pick elements out of the empty set? There's just
one way: pick no elements. So we have the following base case:
def powerset(s):
if not s:
return [[]]
else:
pass
(Note that we return [[]]
and not []
: there is exactly one member
in the power set of the empty set, so it's important that we return a
list containing one element.)
Now we have to deal with the inductive case, so let's look at our tip again:
Assume you have something that you wish you had.
So what do we wish we had? What if we had the power set of the set,
but with the first element removed? What I mean by that is, suppose
we have S = [1,2,3]
, we take out the first element (1), and compute
the power set for [2,3]
. Once we have that power set, we can make a
copy, add 1 to every set in the copy and do the union between the
original power set and the modified copy. Here's what I mean in
pseudo code:
x, rest = S[0], S[1:]
P = powerset(rest)
P2 = P.copy()
for s in P2:
s.append(x)
return P + P2
All right, let's implement this (we'll simplify the Python code a bit by using a list comprehension).
def powerset(s):
if not s:
return [[]]
else:
x, rest = s[0], s[1:]
p = powerset(rest)
return p + [[x]+t for t in p]
>>> powerset([1,2,3])
[[], [3], [2], [2, 3], [1], [1, 3], [1, 2], [1, 2, 3]]
And it works! We used the same strategy, we found out what we wished
we had, assumed we had it by making a recursive call (powerset
is
supposed to give us the thing we need) and use it to build our result.
Again, you can work through small examples to see how it works. We
have one last example of applying recursion to solve a problem.
Our last example will involve money: given an amount and a list of coins, how many ways can we make change for that amount?
We'll once again start by considering our inputs and base case. The inputs are simply going to be the amount and the list of coins (n.b.: the coins are going to be in decreasing order). What about the base case? Well in this particular problem, there are actually going to be two of them:
Let's put in those base cases:
def make_change(amount, coins):
if amount == 0:
return 1
elif not coins or amount < 0:
return 0
else:
pass
And we now get to the inductive case, and remind ourselves, for the last time, of our tip:
Assume you have something that you wish you had.
What do we need to solve the problem? We'd like to know two things: how many ways are there to make change without using the first coin, and how many ways are there to make change when we use the first coin. If we had these two numbers, we could simply add them and be done. And we can describe those numbers using a recursive call:
def make_change(amount, coins):
if amount == 0:
return 1
elif not coins or amount < 0:
return 0
else:
return make_change(amount, coins[1:]) + \
make_change(amount-coins[0], coins)
>>> make_change(45, [25,10,5,1])
39
In the else
clause of our function, we do two recursive calls that
get us exactly what we said we needed. The first recursive call
calculates how to make change for the amount using all but the first
coin; the second call "uses" the first coin and calculates how to make
change for that new amount using all the coins.
Once more, by simply stating in code what we wish we had, and using it to compute the result, we end up with a correct, working function!
(Warning: do not use this algorithm with very large amounts, it's much too slow and will use all available memory. A better solution to this problem is to use dynamic programming. Nevertheless, I thought it was a nice problem to exercise our recursion reflexes.)
I hope this post helped you realize that recursion doesn't have to be scary if you proceed step-by-step:
That isn't to say that it's easy; sometimes figuring out what you need or how to combine it takes creativity. That's when pen and paper, a white board, or a helpful colleague are very helpful. And one last time, let's look at our tip for writing recursive functions:
Assume you have something that you wish you had.