SOLID Principles In Python

SOLID is a set of object oriented design principles aimed at making code more maintainable and flexible. They were coined by Robert "Uncle Bob" Martin in the year 2000 in his paper Design Principles and Design Patterns. The SOLID principles apply to any object oriented language, but I'm going to concentrate on what they mean in a Python application in this post.

I originally wrote about SOLID principles with PHP as the basis of the article, but as the lessons here can be easily applied to any object oriented language I thought that I would re-write it with Python in mind. If you are familiar with only PHP or Python then this will be a good learning resource on learning the other side.

I should also note here that Python doesn't really have an interface system so I have used Metaclasses to create the needed situations. For more explanation about Metaclasses see the interface section in the basics of getting started with object oriented programming in Python article.

SOLID is an acronym that stands for the following:

I'll be tackling them each in turn.

Single Responsibility Principle

This states that a class should have a single responsibility, but more than that, a class should only have one reason to change.

Taking an example of the (simple) class called Page.

import json

class Page():
    def __init__(self, title):
        self._title = title

    def get_title(self):
        return self._title

    def set_title(self, title):
        self._title = title

    def get_page(self):
        return [self._title]

    def format_json(self):
        return json.dumps(self.get_page())

This class knows about a title property and allows this title property to be retrieved by a get() method. We can also use a method in this class called format_json() to return the page as a JSON string. This might seem like a good idea as the class is responsible for its own formatting.

What happens, however, if we want to change the output of the JSON string, or to add another type of output to the class? We would need to alter the class to either add another method or change an existing method to suit. This is fine for a class as simple as this, but if it contained more properties then the formatting would be more complex to change.

A better approach to this is to modify the Page class so that is only knows about the data is handles. We then create a secondary class called JsonPageFormatter that is used to format the Page objects into JSON.

import json

class Page():
    def __init__(self, title):
        self._title = title

    def get_title(self):
        return self._title

    def set_title(self, title):
        self._title = title

    def get_page(self):
        return [self._title]

class JsonPageFormatter():
    def format_json(page: Page):
        return json.dumps(page.get_page())

Doing this means that if we wanted to create an XML format we could just add a class called XmlPageFormatter and write some simple code to output XML. We now have only one reason to change the Page class.

Open/Closed Principle

In the open/closed principle classes should be open for extension, but closed for modification. Essentially meaning that classes should be extended to change functionality, rather than being altered into something else.

As an example, take the following two classes. 

class Rectangle():
    def __init__(self, width, height):
        self._width = width
        self._height = height

    def get_width(self):
        return self._width

    def set_width(self, width):
        self._width = width

    def get_height(self):
        return self._height

    def set_height(self, height):
        self._height = height

class Board():
    @property
    def rectangles(self):
        return self._rectangles

    @rectangles.setter
    def rectangles(self, value):
        self._rectangles = value  

    def calculateArea(self):
        area = 0
        for item in self.rectangles:
            area += item.get_height() * item.get_width()
        return area

We have a Rectangle class that contains the data for a rectangle, and a Board class that is used as a collection of Rectangle objects. With this setup we can easily find out the area of the board by looping through the items in the rectangles collection property and calculating their area.

The problem with this setup is that we are restricted by the types of object we can pass to the Board class. For example, if we wanted to pass a Circle object to the Board class we would need to write conditional statements and code to detect and calculate the area of the Board.

The correct way to approach this problem is to move the area calculation code into the shape class and have all shape classes extend a Shape interface. We can now create a Rectangle and Circle shape classes that will calculate their area when asked.

import math

class ShapeMeta(type):
    def __instancecheck__(self, instance):
        return self.__subclasscheck__(type(instance))

    def __subclasscheck__(self, subclass):
        return (hasattr(subclass, 'area') and callable(subclass.area))

class ShapeInterface(metaclass=ShapeMeta):
    pass

class Rectangle(ShapeInterface):
    def __init__(self, width, height):
        self._width = width
        self._height = height

    def get_width(self):
        return self._width

    def set_width(self, width):
        self._width = width

    def get_height(self):
        return self._height

    def set_height(self, height):
        self._height = height

    def area(self):
        return self.get_width() * self.get_height()

class Circle(ShapeInterface):
    def __init__(self, radius):
        self._radius = radius

    def get_radius(self):
        return self._radius

    def set_radius(self, radius):
        self._radius = radius

    def area(self):
        return self.get_radius() * self.get_radius() * math.pi

The Board class can now be reworked so that it doesn't care what type of shape is passed to it, as long as they implement the area() method.

class Board():
    def __init__(self, shapes):
        self._shapes = shapes

    def calculateArea(self):
        area = 0
        for shape in self._shapes:
            area += shape.area()
        return area

We have now setup these objects in a way that means we don't need to alter the Board class if we have a different type of object. We just create the object that implements Shape and pass it into the collection in the same way as the other classes.

Liskov Substitution Principle

Created by Barbara Liskov in a 1987, this states that objects should be replaceable by their subtypes without altering how the program works. In other words, derived classes must be substitutable for their base classes without causing errors.

The following code defines a Rectangle class that we can use to create and calculate the area of a rectangle.

class Rectangle():
    def __init__(self, width, height):
        self._width = width
        self._height = height

    def get_width(self):
        return self._width

    def set_width(self, width):
        self._width = width

    def get_height(self):
        return self._height

    def set_height(self, height):
        self._height = height

    def area(self):
        return self.get_width() * self.get_height()

Using that we can extend this into a Square class. Because a square a little different from a rectangle we need to override some of the code in order to allow a Square to exist correctly.

class Square(Rectangle):
    def __init__(self, width):
        self._width = width
        self._height = width

    def get_width(self):
        return self._width

    def set_width(self, width):
        self._width = width
        self._height = width

    def get_height(self):
        return self._height

    def set_height(self, height):
        self._height = height
        self._width = height

This seems fine, but ultimately a square is not a rectangle and so we have added code to force this situation to work.

A good analogy that I read once was to think about a Duck and a Rubber Duck as represented by classes. Although it is possible to extend a Duck class into a Rubber Duck class we would need to override a lot of Duck functionality to suit the Rubber Duck. For example, a Duck quacks, but a Rubber Duck doesn't (ok, maybe it squeaks a bit), A Duck is alive, but a Rubber Duck isn't.

Overriding lots of code in classes to suit specific situations can lead to maintenance problems. The more code you add to override specific conditions, the more fragile you code will become.

One solution to the rectangle vs square situation is to create an interface called Quadrilateral and implement this in separate Rectangle and Square classes. In this situation we are allowing the classes to be responsible for their own data, but enforcing the need for certain method footprints being available.

class QuadrilateralMeta(type):
    def __instancecheck__(self, instance):
        return self.__subclasscheck__(type(instance))

    def __subclasscheck__(self, subclass):
        return (hasattr(subclass, 'area') and callable(subclass.area)) \
          and (hasattr(subclass, 'get_height') and callable(subclass.get_height)) \
          and (hasattr(subclass, 'get_width') and callable(subclass.get_width)) \

class QuadrilateralInterface(metaclass=QuadrilateralMeta):
    pass

class Rectangle(QuadrilateralInterface):
    pass

class Square(QuadrilateralInterface):
    pass

The bottom line here is that if you find you are overriding a lot of code then maybe your architecture is wrong and you should think about the Liskov Substitution principle.

Interface Segregation Principle

This states that many client-specific interfaces are better than one general-purpose interface. In other words, classes should not be forced to implement interfaces they do not use.

Let's take an example of a Worker interface. This defines several different methods that can be applied to a worker at a typical development agency.

class WorkerMeta(type):
    def __instancecheck__(self, instance):
        return self.__subclasscheck__(type(instance))

    def __subclasscheck__(self, subclass):
        return (hasattr(subclass, 'take_break') and callable(subclass.take_break)) \
          and (hasattr(subclass, 'write_code') and callable(subclass.write_code)) \
          and (hasattr(subclass, 'call_client') and callable(subclass.call_client)) \
          and (hasattr(subclass, 'get_paid') and callable(subclass.get_paid))

class WorkerInterface(metaclass=WorkerMeta):
    pass

The problem is that because this interface is too generic we are forced to create methods in classes that implement this interface just to suit the interface.

For example, if we create a Manager class then we are forced to implement a write_code() method because that's what the interface requires. Because managers generally don't code we can't actually do anything in this method so we just return false.

class Manager(WorkerInterface):
    def write_code(self):
        pass

Also, if we have a Developer class that implements Worker then we are forced to implement a call_client() method because that's what the interface requires.

class Developer(WorkerInterface):
    def call_client(self):
        pass

Having a fat and bloated interface means having to implement methods that do nothing.

The correct solution to this is to split our interfaces into separate parts, each of which deals with specific functionality. Here, we split out a Coder and ClientFacer interface from our generic Worker interface.

class WorkerMeta(type):
    def __instancecheck__(self, instance):
        return self.__subclasscheck__(type(instance))

    def __subclasscheck__(self, subclass):
        return (hasattr(subclass, 'take_break') and callable(subclass.take_break)) \
          and (hasattr(subclass, 'get_paid') and callable(subclass.get_paid))

class WorkerInterface(metaclass=WorkerMeta):
    pass


class ClientFacerMeta(type):
    def __instancecheck__(self, instance):
        return self.__subclasscheck__(type(instance))

    def __subclasscheck__(self, subclass):
        return (hasattr(subclass, 'call_client') and callable(subclass.call_client))

class ClientFacerInterface(metaclass=ClientFacerMeta):
    pass


class CoderMeta(type):
    def __instancecheck__(self, instance):
        return self.__subclasscheck__(type(instance))

    def __subclasscheck__(self, subclass):
        return (hasattr(subclass, 'write_code') and callable(subclass.write_code))

class CoderInterface(metaclass=CoderMeta):
    pass

With this in place we can implement our sub-classes without having to write code that we don't need. So our Developer and Manager classes would look like this.

class Manager(WorkerInterface, ClientFacerInterface):
    pass

class Developer(WorkerInterface, CoderInterface):
    pass

Having lots of specific interfaces means that we don't have to write code just to support an interface.

Dependency Inversion Principle

Perhaps the simplest of the principles, this states that classes should depend upon abstractions, not concretions. Essentially, don't depend on concrete classes, depend upon interfaces.

Taking an example of a PageLoader class that uses a MySqlConnection class to load pages from a database we might create the classes so that the connection class is passed to the constructor of the PageLoader class.

class MySqlConnection():
    def connect(self):
        pass

class PageLoader():
    def __init__(self, mysql_connection: MySqlConnection):
        self._mysql_connection = mysql_connection

This structure means that we are essentially stuck with using MySQL for our database layer. What happens if we want to swap this out for a different database adaptor? We could extend the MySqlConnection class in order to create a connection to Memcache or something, but that would contravene the Liskov Substitution principle. Chances are that alternate database managers might be used to load the pages so we need to find a way to do this.

The solution here is to create an interface called DbConnectionInterface and then implement this interface in the MySqlConnection class. Then, instead of relying on a MySqlConnection object being passed to the PageLoader class, we instead rely on any class that implements the DbConnectionInterface interface.

class DbConnectionMeta(type):
    def __instancecheck__(self, instance):
        return self.__subclasscheck__(type(instance))

    def __subclasscheck__(self, subclass):
        return (hasattr(subclass, 'connect') and callable(subclass.connect))

class DbConnectionInterface(metaclass=DbConnectionMeta):
    pass


class MySqlConnection(DbConnectionInterface):
    def connect(self):
        pass

class PageLoader():
    def __init__(self, db_connection: DbConnectionInterface):
        self._db_connection = db_connection

With this in place we can now create a MemcacheConnection class and as long as it implements the DbConnectionInterface then we can use it in the PageLoader class to load pages.

This approach also forces us to write code in such a way that prevents specific implementation details in classes that don't care about it. Because we have passed in a MySqlConnection class to our PageLoader class we shouldn't then write SQL queries in the PageLoader class. This means that when we pass in a MemcacheConnection object it will behave in the same way as any other type of connection class.

When thinking about interfaces instead of classes it forces us to move that specific domain code out of our PageLoader class and into the MySqlConnection class.

How To Spot It?

A bigger question might be how can you spot if you need to apply SOLID principles to your code or if you are writing code that isn't SOLID.

Knowing about these principles is only half of the picture, you also need to know when you should step back and think about applying SOLID principles. I came up with a quick list of things you need to keep an eye on that are 'tells', showing that your code might need to be re-worked.

  • You're writing a lot of "if" statements to handle different situations in object code.
  • You're writing a lot of code that doesn't actually do anything just to satisfy interface design.
  • You keep opening the same class to change the code.
  • You are writing code in classes that don't really have anything to do with that class. For example, putting SQL queries in a class outside the database connection class.

Conclusion

SOLID isn't a perfect methodology, and can lead to complex applications with many moving parts, and occasionally lead to writing code just in case it's needed. Using SOLID means writing more classes and creating more interfaces, but many modern IDE's will solve that problem through automated code completion.

That said, it does force you to separate concerns, to think about inheritance, prevent repeating code and carefully approach writing applications. Thinking about how objects fit together in an application is, after all, what object oriented code is all about.

Comments

Great job, Phil. Self teaching is a hard work and suddenly I could see how wrong my decisions were while in this process. Anyway, I'd like to ask you about some books you'd recommend for studying OOP principles, mostly those with Python code examples.

Thanks in advance

Permalink

Well, I would start with Robert Martin's book on Clean Code (https://amzn.to/3EJbTCh) as that is where these principles come from. That book is in Java, but the principles of OOP and writing clean code are interchangeable between languages.

There are a couple of clean code in python books available, but I can't say how good they are since I've not read them.

Name
Philip Norton
Permalink

Hi

Thank you for your nice content.

In the Interface Segregation Principle, the example will raise error in python3.8.10

TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases

Permalink

Add new comment

The content of this field is kept private and will not be shown publicly.
CAPTCHA
2 + 11 =
Solve this simple math problem and enter the result. E.g. for 1+3, enter 4.
This question is for testing whether or not you are a human visitor and to prevent automated spam submissions.