Thumbtack Engineering

A primer on Python decorators

spock cake

Python allows you, the programmer, to do some very cool things with functions. In Python, functions are first-class objects, which means that you can do anything with them that you can do with strings, integers, or any other objects. For example, you can assign a function to a variable:

>>> def square(n):
...     return n * n
>>> square(4)
16
>>> alias = square
>>> alias(4)
16

The real power from having first-class functions, however, comes from the fact that you can pass them to and return them from other functions. Python’s built-in map function uses this ability: you pass it a function and a list, and map creates a new list by calling your function individually for each item in the list you gave it. Here’s an example that uses our square function from above:

>>> numbers = [1, 2, 3, 4, 5]
>>> map(square, numbers)
[1, 4, 9, 16, 25]

A function that accepts other function(s) as arguments and/or returns a function is called a higher-order function. While map simply made use of our function without making any changes to it, we can also use higher-order functions to change the behavior of other functions.

For example, let’s say we have a function which we call a lot that is very expensive:

>>> def fib(n):
...     "Recursively (i.e., dreadfully) calculate the nth Fibonacci number."
...     return n if n in [0, 1] else fib(n - 2) + fib(n - 1)

We would like to save the results of this calculation, so that if we ever need to calculate the value for some n (which happens very often, given this function's call tree), we don’t have to repeat our hard work. We could do that in a number of ways; for example, we could store the results in a dictionary somewhere, and every time we need a value from fib, we first see if it is in the dictionary.

But that would require us to reproduce the same dictionary-checking boilerplate every time we wanted a value from fib. Instead, it would be convenient if fib took care of saving its results internally, and our code that uses it could simply call it as it normally would. This technique is called memoization (note the lack of an ‘r’).

We could build this memoization code directly into fib, but Python gives us another, more elegant option. Since we can write functions that modify other functions, we can write a generic memoization function that takes a function and returns a memoized version of it:

def memoize(fn):
    stored_results = {}

    def memoized(*args):
        try:
            # try to get the cached result
            return stored_results[args]
        except KeyError:
            # nothing was cached for those args. let's fix that.
            result = stored_results[args] = fn(*args)
            return result

    return memoized

This memoize function takes another function as an argument, and creates a dictionary where it stores the results of previous calls to that function: the keys are the arguments passed to the function being memoized, and the values are what the function returned when called with those arguments. memoize returns a new function that first checks to see if there is an entry in the stored_results dictionary for the current arguments; if there is, the stored value is returned; otherwise, the wrapped function is called, and its return value is stored and returned back to the caller. This new function is often called a “wrapper” function, since it’s just a thin layer around a different function that does real work.

Now that we have our memoization function, we can just pass fib to it to get a wrapped version of it that won’t needlessly repeat any of the hard work it’s done before:

def fib(n):
    return n if n in [0, 1] else fib(n - 2) + fib(n - 1)
fib = memoize(fib)

By using our higher-order memoize function, we get all the benefits of memoization without having to make any changes to our fib function itself, which would have obscured the real work that function did in the midst of the memoization baggage. But you might notice that the code above is still a little awkward, as we have to write fib three times in the above example. Since this pattern – passing a function to another function and saving the result back under the name of the original function – is extremely common in code that makes use of wrapper functions, Python provides a special syntax for it: decorators.

@memoize
def fib(n):
    return n if n in [0, 1] else fib(n - 2) + fib(n - 1)

Here, we say that memoize is acting decorating fib. It’s important to realize that this is only a syntactic convenience. This code does exactly the same thing as the above snippet: it defines a function called fib, passes it to memoize, and saves the result of that as fib. The special (and, at first, a bit odd-looking) @ syntax simply cuts out the redundancy.

You can stack these decorators on top of each other, and they will apply in bottom-out fashion. For example, let's say we also have another higher-order function to help with debugging:

def make_verbose(fn):
    def verbose(*args):
        # will print (e.g.) fib(5)
        print '%s(%s)' % (fn.__name__, ', '.join(repr(arg) for arg in args))
        return fn(*args) # actually call the decorated function

    return verbose

The following two code snippets then do the same thing:

@memoize
@make_verbose
def fib(n):
    return n if n in [0, 1] else fib(n - 2) + fib(n - 1)

:::python
def fib(n):
    return n if n in [0, 1] else fib(n - 2) + fib(n - 1)
fib = memoize(make_verbose(fib))

Interestingly, you’re not restricted to simply writing a function name after the @ symbol: you can also call a function there, letting you effectively pass arguments to a decorator. Let’s say that we aren’t content with simple memoization, and we want to store the function results in memcached. If we’ve written a memcached decorator function, we could (for example) pass in the address of the server as an argument:

@memcached('127.0.0.1:11211')
def fib(n):
    return n if n in [0, 1] else fib(n - 2) + fib(n - 1)

Written without decorator syntax, this expands to:

fib = memcached('127.0.0.1:11211')(fib)

Python comes with some functions that are very useful when applied as decorators. For example, Python has a classmethod function that creates the rough equivalent of a Java static method:

class Foo(object):
    SOME_CLASS_CONSTANT = 42

    @classmethod
    def add_to_my_constant(cls, value):
        # Here, `cls` will just be Foo, but if you called this method on a
        # subclass of Foo, `cls` would be that subclass instead.
        return cls.SOME_CLASS_CONSTANT + value

Foo.add_to_my_constant(10) # => 52

# unlike in Java, you can also call a classmethod on an instance
f = Foo()
f.add_to_my_constant(10) # => 52

Sidenote: Docstrings

Python functions carry more information than just code: they also carry useful help information, like their name and docstring:

>>> def fib(n):
...     "Recursively (i.e., dreadfully) calculate the nth Fibonacci number."
...     return n if n in [0, 1] else fib(n - 2) + fib(n - 1)
...
>>> fib.__name__
'fib'
>>> fib.__doc__
'Recursively (i.e., dreadfully) calculate the nth Fibonacci number.'

This information powers Python's built-in help function. But when we wrap our function, we instead see the name and docstring of the wrapper:

>>> fib = memoized(fib)
>>> fib.__name__
'memoized'
>>> fib.__doc__

That’s not particularly helpful. Luckily, Python includes a helper function that will copy this documentation onto wrappers, called functools.wraps:

import functools
def memoize(fn):
    stored_results = {}

    @functools.wraps(fn)
    def memoized(*args):
        # (as before)

    return memoized

There's something very satisfying about using a decorator to help you write a decorator. Now, if we were to retry our code from before with the updated memoize, we see the documentation is preserved:

>>> fib = memoized(fib)
>>> fib.__name__
'fib'
>>> fib.__doc__
'Recursively (i.e., dreadfully) calculate the nth Fibonacci number.'

Thumbtack is hiring engineers! Come work with us on making it easy to hire service professionals, and enjoy our in-house chef and sweet San Francisco office.