=============================================== Parameterisation - the most important pattern =============================================== (This page is a re-worked version of an article that I first wrote for Toptal, and is also on my blog as `A Look at Python, Parameterized `_) Probably the most common and important technique for writing Python code is one that you may already have come across and used without thinking. It is one that plays a very important role in Python, where it is a lot more powerful than in some languages, and takes the place of various other “design patterns” that are common in other languages. For these reasons this post will start with some very simple examples, but progress to more advanced topics. .. contents:: Contents The simplest case ================= For most of our examples we'll use the instructional standard library `turtle `_ module for doing some graphics. Here is some code that will draw a 100x100 square using ``turtle``: .. code-block:: python from turtle import Turtle turtle = Turtle() for i in range(0, 4): turtle.forward(100) turtle.left(90) Suppose we now want to draw a different size square. A very junior programmer at this point would be tempted to copy and paste this block and modify. Obviously, a much better method would be to first extract the square drawing code into a function: .. code-block:: python :emphasize-lines: 5-8 from turtle import Turtle turtle = Turtle() def draw_square(): for i in range(0, 4): turtle.forward(100) turtle.left(90) draw_square() And then, we pull out the size of the square as parameter: .. code-block:: python :emphasize-lines: 1,3 def draw_square(size: int): for i in range(0, 4): turtle.forward(size) turtle.left(90) draw_square(100) So we can now draw squares of any size using ``draw_square``. That's all there is to the essential technique of parameterisation: - identify the part of a bit of code that you want to make into a variable - turn it into a parameter of a function or method. An immediate issue with the code above is that ``draw_square`` depends on a global variable ``turtle``. This has `lots of bad consequences `_, and there are two easy ways we can fix it. The first would be for ``draw_square`` to create the ``Turtle`` instance itself (which I'll discuss later), which might not be desirable if we want to use a single ``Turtle`` for all our drawing. The other is to simply use parameterisation again to make ``turtle`` a parameter to ``draw_square``: .. code-block:: python :emphasize-lines: 3,9 from turtle import Turtle def draw_square(turtle: Turtle, size: int): for i in range(0, 4): turtle.forward(size) turtle.left(90) turtle = Turtle() draw_square(turtle, 100) This has a fancy name - `dependency injection `_. It simply means that if a function needs some kind of object to do its work, like ``draw_square`` needs a ``Turtle``, the caller is responsible for passing that object in as a parameter. So far we've dealt with two very basic usages. The key observation for the rest of this article is that in Python, there is a large range of things that can become parameters — more than in some other languages — and this makes it a very powerful technique. Anything that is an object ========================== In Python you can use this technique to parameterise anything that is an object, and in Python most things you come across are in fact objects. This includes: * instances of builtin types - like the string ``"I'm a string"`` and the integer ``42``, or a dictionary. * instances of other types and classes e.g. a ``datetime.datetime`` object. * functions and methods * builtin types and custom classes. The last two are the ones that are the most surprising, especially if you are coming from other languages. Functions as parameters ======================= Let's tackle functions first. The function statement in Python does two things: 1. It creates a function object. 2. It creates a name in the local scope that points at that object. We can play with these objects in a REPL: .. code-block:: python >>> def foo(): ... return "Hello from foo" >>> >>> foo() 'Hello from foo' >>> print(foo) >>> type(foo) >>> foo.__name__ 'foo' And just like all objects, we can assign functions to other variables: .. code-block:: python >>> bar = foo >>> bar() 'Hello from foo' Note that ``bar`` is another name for the same object, so it has the same internal ``__name__`` property as before: .. code-block:: python >>> bar.__name__ 'foo' >>> bar That is, the ``def`` statement created a new function object, and set its internal ``__name__`` property, while the ``bar =`` statement just assigned a new name in the local scope for the same object. But the crucial point is that because functions are just objects, anywhere you see a function being used, it could be a parameter. And, any bit of code could be packaged up into a function. So, suppose we extend our square drawing function above, and now, sometimes when we draw squares we want to pause at each corner: .. code-block:: python :emphasize-lines: 6 import time def draw_square(turtle: Turtle, size: int): for i in range(0, 4): turtle.forward(size) time.sleep(5) turtle.left(90) But sometimes we don't want to pause. The simplest way to achieve this would be to add a pause parameter, perhaps with a default of zero so that by default we don't pause. However, we later discover that sometimes we actually want to do something completely different at the corners. Perhaps we want to draw another shape at each corner, or change the pen colour etc. We might be tempted to add lots more parameters, one for each thing we need to do. However, a much nicer solution would be to allow any function to be passed in as the action to take. For a default, we'll make a function that does nothing. We'll also make this function accept the local ``turtle`` and ``size`` parameters, in case they are required. .. code-block:: python :emphasize-lines: 5,8,11,18 from turtle import Turtle from typing import Callable import time def do_nothing(turtle: Turtle, size: int) -> None: pass def draw_square(turtle: Turtle, size: int, at_corner: Callable[[Turtle, int], None] = do_nothing): for i in range(0, 4): turtle.forward(size) at_corner(turtle, size) turtle.left(90) def pause(turtle, size): time.sleep(5) turtle = Turtle() draw_square(turtle, 100, at_corner=pause) Or, we could do something a bit cooler like recursively draw smaller squares at each corner: .. code-block:: python def smaller_square(turtle, size): if size < 10: return draw_square(turtle, size / 2, at_corner=smaller_square) draw_square(turtle, 128, at_corner=smaller_square) .. image:: _static/img/parameterisation/python_parameterised_squares.png :align: center There are of course variations on this. In many examples, the return value of the function would be used. Here we have a more imperative style of programming, and the function is called only for its side-effects. In other languages... --------------------- Having first class functions in Python makes this very easy. In languages that lack them, or some statically typed languages that require type signatures for parameters, this can be harder. How would we do this if we had no first class functions? One method would be to turn ``draw_square`` into a class, ``SquareDrawer``: .. code-block:: python class SquareDrawer: def __init__(self, size: int): self.size = size def draw(self, turtle: Turtle) -> None: for i in range(0, 4): turtle.forward(self.size) self.at_corner(turtle, size) turtle.left(90) def at_corner(self, turtle: Turtle, size: int) -> None: pass Now we can subclass ``SquareDrawer`` and add an ``at_corner()`` method that does what we need. This pattern is known as the `template method pattern `_. A base class defines the shape of the whole operation or algorithm, and the variant portions of the operation are put into methods that need to be implemented by sub-classes. While this pattern may sometimes be helpful in Python, pulling out the variant code into a function that is simply passed as a parameter is often going to be much simpler. A second way we might approach this problem in languages without first class functions is to wrap our functions up as methods inside classes, like this: .. code-block:: python class DoNothing: def run(self, turtle: Turtle, size: int): pass def draw_square(turtle: Turtle, size: int, at_corner=DoNothing()): for i in range(0, 4): turtle.forward(size) at_corner.run(turtle, size) t.left(90) class Pauser: def run(self, turtle, size): time.sleep(5) draw_square(turtle, 100, at_corner=Pauser()) This is known as the `strategy pattern `_. Again, this is certainly a valid pattern to use in Python, especially if the strategy class actually has not just one but a set of related functions. However, often all we really need is a function and we can `stop writing classes `_. Other callables --------------- In the examples above, I've talked about passing functions into other functions as parameters. However, everything I wrote was in fact true of any callable object. Functions are the simplest example, but we can also consider methods. Suppose we have a list ``foo``: .. code-block:: python foo = [1, 2, 3] ``foo`` now has a whole bunch of methods attached to it, such as ``.append()`` and ``.count()``. These “bound methods” can be passed around and used like functions: .. code-block:: python >>> foo = [1, 2, 3] >>> append_to_foo = foo.append >>> append_to_foo(4) >>> foo [1, 2, 3, 4] In addition to these instance methods, there other types of callable objects — class `staticmethods `_ and `classmethods `_, instances of classes that implement `__call__ `_, and classes/types themselves. Classes as parameters ===================== In Python, classes are “first class” – they are run-time objects just like dicts and strings etc. This might seem even more strange than functions being objects, but thankfully it is actually easier to demonstrate this fact than for functions. The class statement you are familiar with is a nice way of creating classes, but it isn't the only way — we can also use the `3 argument version of type `_. The following two statements do exactly the same thing: .. code-block:: python class Foo: pass Foo = type('Foo', (), {}) In the second version, note the two things we just did (which are done more conveniently using the ``class`` statement): 1. On the right hand side of the equals sign, we created a new class, with an internal name of ``'Foo'``. This is the name that you will get back if you do ``Foo.__name__``. 2. With the assignment, we then created a name in the current scope, ``Foo``, which refers to that class object we just created. We made the same observations for what the function statement does. The key insight here is that classes are objects that can be assigned names (i.e. can be put in a variable). Anywhere that you see a class in use, you are actually just seeing a variable in use. And if it's a variable, it can be a parameter. We can break that down into a number of uses: Classes are factories ===================== A class is a callable object that creates an instance of itself: .. code-block:: python >>> class Foo: ... pass >>> Foo() <__main__.Foo at 0x7f73e0c96780> And as an object it can be assigned to other variables: .. code-block:: python >>> my_class = Foo >>> my_class() <__main__.Foo at 0x7f73e0ca93c8> Going back to our turtle example above, one problem with using turtles for drawing is that the position and orientation of the drawing depends on the current position and orientation of the turtle, and it can also leave it in a different state which might be unhelpful for the caller, which we might not want. To solve this, our ``draw_square`` function could create its own turtle, move it to the desired position and then draw a square. .. code-block:: python :emphasize-lines: 2-5 def draw_square(x: int, y: int, size: int): turtle = Turtle() turtle.penup() # Don't draw while moving to the start position turtle.goto(x, y) turtle.pendown() for i in range(0, 4): turtle.forward(size) turtle.left(90) However, we now have a customisation problem. Suppose the caller wanted to set some attributes of the turtle, or use a different kind of turtle that has the same interface but has some special customisation? We could solve this with dependency injection, like we had before — the caller would be responsible for setting up the ``Turtle`` object. But what if our function sometimes needs to make many turtles for different drawing purposes - or if perhaps it wants to kick off 4 threads each with its own turtle to draw one side of the square? The answer is simply to make the ``Turtle`` class a parameter to the function. We can use a keyword argument with a default value, so that client code that doesn't care just uses the default: .. code-block:: python :emphasize-lines: 1,2 def draw_square(x: int, y: int, size: int, *, make_turtle: Callable[[], Turtle] = Turtle): turtle = make_turtle() turtle.penup() turtle.goto(x, y) turtle.pendown() for i in range(0, 4): turtle.forward(size) turtle.left(90) The ``make_turtle`` parameter here has a complex type hint which might require a bit of breaking down: .. code-block:: python make_turtle: Callable[[], Turtle] = Turtle This means: - we have a parameter called ``make_turtle`` - which must be a callable (like a function or a class, some ``foo`` that you can call: ``foo()``) - this callable must take no parameters (the ``[]`` bit) - …and it must return a ``Turtle`` instance - the default value of this parameter is ``Turtle`` (the class itself – which is indeed a callable, and it does indeed return a Turtle instance when you call it like ``Turtle()``, so it matches the requirement in the type hint). To use this function, we could write our own function to pass as the ``make_turtle`` parameter. It must create a turtle but it could also modify it before returning it. Suppose we want to hide the turtle when drawing squares: .. code-block:: python def make_hidden_turtle() -> Turtle: turtle = Turtle() turtle.hideturtle() return turtle draw_square(5, 10, 20, make_turtle=make_hidden_turtle) Or we could subclass ``Turtle`` to make that behaviour built-in, and pass the subclass as the parameter: .. code-block:: python class HiddenTurtle(Turtle): def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) self.hideturtle() draw_square(5, 10, 20, make_turtle=HiddenTurtle) In other languages... --------------------- Several other OOP languages like Java and C# lack first class classes. To instantiate a class, you have to use the ``new`` keyword followed by an actual class name. This limitation is the reason for patterns like `abstract factory `_ (which requires the creation of a set of classes whose only job is to instantiate other classes) and the `Factory Method pattern `_. As you can see, in Python it is just a matter of pulling out the class as a parameter, because a class is its own factory. Classes as base classes ----------------------- The application of parameterisation below is much less common than those above, but it can be useful when you need it. Suppose we find ourselves creating sub-classes to add the same feature to different classes. For example, to use our example above, we want a ``Turtle`` subclass that will write out to a log when it is created: .. code-block:: python import logging logger = logging.getLogger() class LoggingTurtle(Turtle): def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) logger.debug("Turtle got created") But then, we find ourselves doing exactly the same thing with another class: .. code-block:: python class LoggingHippo(Hippo): def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) logger.debug("Hippo got created") The only things varying between these two are: 1. The base class 2. The name of the sub-class - but we don't really care about that and could generate it automatically from the base class ``__name__`` attribute. 3. The name used inside the ``debug`` call — but again we could generate this from the base class name. Faced with two very similar bits of code with only one variant, what can we do? Just like in our very first example - we create a function and pull out the variant part as a parameter. The only thing you need to realize is that in the class statements above, ``Turtle`` and ``Hippo`` are just variables that happen to refer to class objects. So we can do this: .. code-block:: python def make_logging_class(cls: type): class LoggingThing(cls): def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) logger.debug("{0} got created".format(cls.__name__)) cls.__name__ = "Logging{0}".format(cls.__name__) return cls LoggingTurtle = make_logging_class(Turtle) LoggingHippo = make_logging_class(Hippo) Here we have a demonstration of first class classes: * We passed a class into a function - giving the parameter a conventional name ``cls`` to avoid the clash with keyword ``class`` (you will also see ``class_`` and ``klass`` used for this purpose). * Inside the function we made a new class * We returned that class as the return value of the function. We also set ``cls.__name__`` which is entirely optional but can help with debugging. Another application of this technique is when we have a whole bunch of features that we sometimes want to add to a class, and we might want to add various combinations of these features. Manually creating all the different combinations we need could get very unwieldy. In languages where classes are created at compile-time rather than run-time, this isn't possible. Instead, you have to use the `decorator pattern `_. That pattern may be useful sometimes in Python, but mostly you can just use the technique above. Normally, I actually avoid creating lots of subclasses for customising. Usually there are simpler and more Pythonic methods that don't involve classes at all. But this technique is available if you need it. See also `Brandon Rhodes full treatment of the decorator pattern in Python `_. Classes as exceptions --------------------- Another place you see classes being used is in the ``except`` clause of a try/except/finally statement. No surprises for guessing that we can parameterise those classes too. For example, the following code implements a very generic strategy of attempting an action that could fail, and retrying with exponential backoff until a maximum number of attempts is reached: .. code-block:: python import time def retry_with_backoff(action: Callable, exceptions_to_catch: type[Exception] | tuple[type[Exception], ...], max_attempts: int = 10, attempts_so_far: int = 0): try: return action() except exceptions_to_catch: attempts_so_far += 1 if attempts_so_far >= max_attempts: raise else: time_to_sleep = attempts_so_far ** 2 print(f"Waiting {time_to_sleep}") time.sleep(time_to_sleep) return retry_with_backoff(action, exceptions_to_catch, attempts_so_far=attempts_so_far, max_attempts=max_attempts) We have pulled out both the action to take and the exceptions to catch as parameters. ``exceptions_to_catch`` can be either a single class, such as ``IOError`` or ``httplib.client.HTTPConnectionError``, or a tuple of such classes. (We want to avoid bare excepts or even ``except Exception`` because `this is known to hide other programming errors `_). Warnings ======== Parameterisation is a powerful technique for re-using code and reducing code duplication. It is not without some drawbacks. In the pursuit of code re-use, several problems often surface: * Overly generic or abstracted code that becomes very difficult to understand. * Code with a `proliferation of parameters `_ that obscures the big picture, or introduces bugs because in reality only certain combinations of parameters are properly tested. * Unhelpful coupling of different parts of the code base because their 'common code' has been factored out into a single place. Sometimes code in two places is similar only accidentally, and the two places should be independent from each other because `they may need to change independently `_. Sometimes a bit of 'duplicated' code is far better than these problems, so use this technique with care.