Closures

7 September 2011 21:59

Closures were the last significant feature of Python that I mastered, and I now seem to use them regularly in any significant project. (I haven't yet had occasion to use metaclasses, but I can't imagine they will become so fundamental to me.) Closures in Python are pretty similar to those in most other languages which have them, which is basically all dynamic languages (JavaScript, Lisp, Clojure).

So, what is a closure? Simply, it's a function together with bindings for the variables it uses from its enclosing scope. For example:

>>> def make_adder(x):
...   def f(y):
...       print x+y
...   return f
...
>>> plus_four = make_adder(4)
>>> plus_four(3)
7

Right, great, but that's not very useful. However, there are very many cases where it is useful to pass around functions, and often those functions need to be generated dynamically. Without closures, in Python we would be stuck with code generation.
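As a small illustration of generating a function in order to pass it around, here is a sketch that builds a sort key on the fly. The names make_distance_key and target are invented for this example, and the code returns values rather than printing them so that it also runs on Python 3.

```python
def make_distance_key(target):
    # The returned function remembers `target` from this particular call.
    def distance(value):
        return abs(value - target)
    return distance

numbers = [1, 9, 4, 7, 12]

# sorted() only knows how to call key(value); the closure supplies the context.
closest_to_five = sorted(numbers, key=make_distance_key(5))
# closest_to_five is [4, 7, 1, 9, 12]
```

Without the closure, the target would have to be passed alongside the function or stored in a global.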

Objects are a Poor Man's Closures?

When closures refer to mutable objects, it can be possible to use them in place of class instances. For example:

>>> def make_stack():
...    stack = []
...    def push(x):
...        stack.append(x)
...    def pop():
...        return stack.pop()
...    return push, pop
...

Now we can generate a pair of functions which share access to a data structure, but with no other way of accessing that data structure. Let's just check that we can create two and that they are independent:

>>> push1, pop1 = make_stack()
>>> push2, pop2 = make_stack()
>>> push1(1)
>>> push2(2)
>>> push1(3)
>>> pop2()
2
>>> pop1()
3
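Strictly speaking, the list is hidden only by convention: CPython exposes captured variables through each function's `__closure__` attribute. The sketch below repeats make_stack so that it stands alone, and shows both that the data is reachable by introspection and that push and pop share a single cell.

```python
def make_stack():
    stack = []
    def push(x):
        stack.append(x)
    def pop():
        return stack.pop()
    return push, pop

push, pop = make_stack()
push(1)
push(3)

# Each closure carries a tuple of cells, one per captured variable.
hidden = push.__closure__[0].cell_contents   # the list [1, 3]

# push and pop were created by the same call, so they share one cell.
shared = push.__closure__[0] is pop.__closure__[0]   # True
```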

Although this is a very neat technique, I don't think I have ever used it in finished code. Producing objects always seems cleaner, and passing around methods from objects actually results in passing around closures over those objects anyway. I will just make that point clear with an example:

>>> class Adder:
...   def __init__(self, x):
...       self.x = x
...
...   def add(self, y):
...       return self.x + y
...
>>> add2 = Adder(2)
>>> add2.add(5)
7
>>> def caller(f, arg):
...   return f(arg)
...
>>> caller(add2.add, 8)
10
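The bound method really does carry its object with it, which can be checked through the method's `__self__` attribute. This is a minimal sketch that repeats the Adder class so it runs on its own; the variable name bound is mine.

```python
class Adder:
    def __init__(self, x):
        self.x = x

    def add(self, y):
        return self.x + y

add2 = Adder(2)
bound = add2.add          # a bound method: function plus instance

# The instance travels with the method, much as a captured variable
# travels with a closure.
carried = bound.__self__  # the add2 object itself
result = bound(8)         # 10
```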

But take care

If you use closures in Python, eventually you will write some code which looks something like this:

>>> functions = []
>>> for i in range(9):
...   def f(y):
...       print i+y
...   functions.append(f)
...

Which looks good until...

>>> functions[2](2)
10

Erm, 2+2 = 10? What about

>>> functions[0](2)
10

So what's going on here? Well, each function is bound to the variable i in the enclosing scope, which changes after each iteration. Indeed, I could still change it:

>>> i=100
>>> functions[0](2)
102
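The loop above runs at module level, so i is a global there. The same late lookup happens inside a function, where it can be observed directly: every closure made in the loop holds the very same cell. This sketch wraps the loop in a function I have called make_functions and returns values instead of printing them.

```python
def make_functions():
    functions = []
    for i in range(9):
        def f(y):
            return i + y
        functions.append(f)
    return functions

functions = make_functions()

# Every function sees the final value of i, which is 8.
results = [f(2) for f in functions]

# All nine closures point at one shared cell for i.
cells = [f.__closure__[0] for f in functions]
same_cell = all(c is cells[0] for c in cells)
final_value = cells[0].cell_contents
```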

Some languages (Scheme, Clojure, maybe Perl) work differently. In those languages, each iteration of a loop gets a fresh binding of the loop variable, and thus there is no risk of writing code which has the same bug. Here's a demonstration in Clojure:

user=> (def l (take 9 (for [x (iterate inc 0)] (fn [i] (* x i)))))
#'user/l
user=> ((nth l 3) 1)
3
user=> ((nth l 7) 3)
21

There are a couple of fixes; one is to use a separate function to create the closure, so that each closure is created in a new scope. Here I'll use make_adder from the first example:

>>> functions = []
>>> for i in range(9):
...    functions.append(make_adder(i))
...
>>> functions[4](3)
7
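A variation on the same idea, not used in the original post, is the standard library's functools.partial, which stores the current value of its arguments when it is called. This sketch uses operator.add in place of a hand-written adder.

```python
import functools
import operator

functions = []
for i in range(9):
    # partial() captures the value i has right now, not the variable i.
    functions.append(functools.partial(operator.add, i))

result = functions[4](3)   # 4 + 3
```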

Another solution is to pass the loop variable in as a default argument instead; default values are evaluated once, when the function is defined, and stored on the function object:

>>> functions = []
>>> for i in range(9):
...   def f(y, i=i):
...       print i+y
...   functions.append(f)
...
>>> functions[2](2)
4

As ever, beware mutable default arguments.
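For anyone who hasn't met that warning before, the same evaluate-once behaviour that makes the fix work is what causes trouble with mutable defaults. The function names remember and remember_safely are made up for this sketch.

```python
def remember(item, seen=[]):
    # `seen` is created once, at definition time, and reused on every call.
    seen.append(item)
    return seen

first = remember(1)
second = remember(2)
# first and second are the same list, now [1, 2]

def remember_safely(item, seen=None):
    # The usual workaround: use None as the default, build the list per call.
    if seen is None:
        seen = []
    seen.append(item)
    return seen
```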

This issue sounds like it could be fixed by Python behaving differently, and creating a new variable i and de-scoping the old i each time through the loop, as Clojure does. But that wouldn't make a lot of difference; there are many occasions when one wants to create closures in loops bound to something other than the loop variable itself, and these would still have the same problem. Making each iteration of a loop an entirely different scope would solve the problem, but then it wouldn't be Python.
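To make that concrete, here is a sketch in which the closure captures a value computed inside the loop body rather than the loop variable. The name doubled is hypothetical; a fresh i per iteration would not help here, because doubled still lives in the enclosing function's single scope.

```python
def make_functions():
    functions = []
    for i in range(3):
        doubled = i * 2        # derived from i, but not the loop variable
        def f(y):
            return doubled + y
        functions.append(f)
    return functions

# Each f looks up doubled when called, by which time it is 4.
results = [f(0) for f in make_functions()]
# results is [4, 4, 4], not [0, 2, 4]
```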

Another one to watch for

Today I wanted a decorator which would check that the number of arguments passed to a function was equal to a given integer, though in one case I wanted it to check a range instead. The obvious way to do this is for the function producing the decorator to accept either an integer or a function.

>>> def check_args(condition):
...   if isinstance(condition, int):
...       condition = lambda x:x==condition
...   def checked(f):
...       def replacement(*args):
...            assert condition(len(args))
...            f(*args)
...       return replacement
...   return checked

Now we can use this decorator:

>>> @check_args(2)
... def f(*args):
...   print "Got %s args" % len(args)
>>> f(1)
Traceback (most recent call last):
...
AssertionError

Good ...

>>> f(1, 2)
Traceback (most recent call last):
...
AssertionError

Eh? Well, this is basically the same issue as in the for loop, but has quite a different feel. Indeed, the fact that we have created a closure here is accidental, and this could be quite a confusing issue if you aren't expecting it.

The problem is that the function defined by the lambda refers to condition, which by the time it is called has been rebound to that very function. So the function is testing whether integers are equal to itself, which unsurprisingly they aren't. Again, this can be fixed either by having an external function create the check function:

>>> def check_int(n):
...   return lambda x:x==n

or by using a default argument:

>>> condition = lambda x, cond=condition:x==cond
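Putting the first fix back into the decorator gives something like the sketch below. It is one possible arrangement rather than the definitive version: the decorated functions pair and few are invented for the demonstration, and replacement now returns the result of f, which the original version discarded.

```python
def check_int(n):
    # n belongs to this helper's scope, so rebinding `condition`
    # inside check_args cannot affect it.
    return lambda x: x == n

def check_args(condition):
    if isinstance(condition, int):
        condition = check_int(condition)
    def checked(f):
        def replacement(*args):
            assert condition(len(args))
            return f(*args)
        return replacement
    return checked

@check_args(2)
def pair(*args):
    return "Got %s args" % len(args)

# Passing a function instead of an integer checks a range.
@check_args(lambda n: 1 <= n <= 3)
def few(*args):
    return "Got %s args" % len(args)
```

With this version pair(1, 2) succeeds and pair(1) raises AssertionError, while few accepts between one and three arguments.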

Overall, closures are a very powerful feature, and one well worth understanding. They make it easy to pass functions with context around, rather than trying to pass around functions with argument lists as may be tempting otherwise. Here's a final example to demonstrate that:

>>> class DelayedCalls:
...    def __init__(self):
...       self.buffered = []
...
...    def __getattr__(self, name):
...       def delay_call(*args, **kwargs):
...           def dispatch():
...               getattr(self, '_' + name)(*args, **kwargs)
...           self.buffered.append(dispatch)
...       return delay_call
...
...    def _print_args(self, *args):
...       print args
...
...    def call_buffered(self):
...       self.buffered.pop(0)()
>>> dc = DelayedCalls()
>>> dc.print_args(1, 2, 3)
>>> dc.print_args(4, 5)
>>> dc.call_buffered()
(1, 2, 3)
>>> dc.call_buffered()
(4, 5)

Comments

Andreas wrote on 21 October 2013:

Nice blog post! Especially the comparison to Clojure closures ;-) is insightful!

Regarding the part where you describe the closure related problem in the condition check decorator: One could also just use a different name for the function created by the lambda, so for example: "cond_fnc = lambda x:x==condition". And then it works as expected.

Greetings, Andreas
