Post

The power of interfaces

Using interfaces to achieve dependency inversion in our code

The power of interfaces

Programming interfaces are essentially abstractions that define a set of method signatures without their actual implementation.

They are a form of contracts that define what kind of functionality is offered by a component, without revealing the way this functionality is achieved.

Interfaces are a great way to decouple parts of your code and make them depend on abstractions instead of implementation details.

This might not be the best way to describe them, but hopefully an example can clear things up.

Simple is not always the best

Let’s assume that your boss asked you to write a program that connects to a MongoDB node and returns a report with information about a specific user.

A very naive implementation of this program would be something like the one below.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
from typing import Any, Dict


class MongoClient:
    def find_one(self, query: Dict[str, Any]) -> Dict[str, Any]:
        return {"userId": 12345, "Name": "John Doe"}


class Report:
    def __init__(self, client):
        self.client = client

    def get_user(self, user_id: int) -> Dict[str, Any]:
        user = self.client.find_one({"userId": user_id})
        return user


def get_report(user_id: int) -> Dict[str, Any]:
    client = MongoClient()
    report = Report(client)
    user = report.get_user(user_id)
    return user


if __name__ == "__main__":
    user_id = 12345
    user = get_report(user_id)
    print(user)

Let’s explain what’s going on here.

In line 4 we define a MongoClient class that can connect to a Mongo database and retrieve data. In reality, we would use a proper library like pymongo, or motor etc, but this would complicate the code a bit more so we’ll keep things simple since this is not what we are focusing on right now.

1
2
3
class MongoClient:
    def find_one(self, query: Dict[str, Any]) -> Dict[str, Any]:
        return {"userId": 12345, "Name": "John Doe"}

The MongoClient class is exposing a find_one method that returns the first document that matches the query provided. Again, the result here is mocked for the sake of simplicity.

In line 9 we define a Report class that will generate our actual report information.

1
2
3
4
5
6
7
class Report:
    def __init__(self, client):
        self.client = client

    def get_user(self, user_id: int) -> Dict[str, Any]:
        user = self.client.find_one({"userId": user_id})
        return user

The Report class is instantiated by providing a DB client, in this case MongoClient, and is exposing a get_user function that uses the DB client to retrieve the information from the database.

Finally, in line 18 we define a function get_report that takes a user_id and returns the user data.

1
2
3
4
5
def get_report(user_id: int) -> Dict[str, Any]:
    client = MongoClient()
    report = Report(client)
    user = report.get_user(user_id)
    return user

This all works fine and sort of makes sense, however, one might notice an interesting detail..

The Report class is very tightly coupled with the MongoClient class, since the get_user method is directly calling the find_one method from the latter.

1
user = self.client.find_one({"userId": user_id})

This behavior is a violation of the Dependency Inversion Principle, one of the five SOLID principles, which states that high level components (Report) should not depend on concrete implementations of low level components (MongoClient), but rather on abstractions (interfaces).

But what does that mean exactly?

The problem of tight coupling

Let’s now imagine that your boss comes back to notify you that, due to some licensing issues, you have to switch from MongoDB to CasandraDB.

You go back thinking and you decide that the natural thing to do would be to:

  • Create a new CasandraClient class
  • Alter the Report class so that it can now support queries both to Mongo and Casandra (you don’t want to scrape all the hard work you did to support Mongo queries)
  • Alter the get_report method to use a Casandra client instead of a Mongo one

Note that in reality, you cannot expect different libraries to expose the same functions, thus for the purpose of this example, the CasandraClient will expose a get_item method rather than a find_one.

Again, a naive implementation would now look something like the one below.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
from typing import Any, Dict


class MongoClient:
    def find_one(self, query: Dict[str, Any]) -> Dict[str, Any]:
        return {"userId": 12345, "Name": "John Doe"}


class CasandraClient:
    def get_item(self, key: Dict[str, Any]) -> Dict[str, Any]:
        return {"userId": 12345, "Name": "John Doe"}


class Report:
    def __init__(self, client):
        self.client = client

    def get_user_mongo(self, user_id: int) -> Dict[str, Any]:
        user = self.client.find_one({"userId": user_id})
        return user

    def get_user_casandra(self, user_id: int) -> Dict[str, Any]:
        user = self.client.get_item({"userId": user_id})
        return user


def get_report(user_id: int) -> Dict[str, Any]:
    client = CasandraClient()
    report = Report(client)
    user = report.get_user_casandra(user_id)
    return user


if __name__ == "__main__":
    user_id = 12345
    user = get_report(user_id)
    print(user)

Just like our first example, this could work and would probably make your boss happy, but there are a few caveats following this approach.

First of all, in line 18, you’ve now renamed the initial get_user method of the Report class to get_user_mongo, in order to differentiate it from the new get_user_casandra.

This violates the Open–closed principle which states that a class should be open for extension, but closed for modification.

But, as if that was not enough, in lines 28 and 30, you also had to change the get_report function to instantiate a CasandraClient and now call a different function report.get_user_casandra.

1
2
3
4
5
def get_report(user_id: int) -> Dict[str, Any]:
    client = CasandraClient()
    report = Report(client)
    user = report.get_user_casandra(user_id)
    return user

This is bad…

Imagine that somehow you had published the Report as a library internally in your company and now other departments are also using it in their own programs.

The problem is that:

  • If they update to the latest version, their programs will break because you introduced a breaking change (get_user was renamed to get_user_mongo)
  • If they don’t update, their programs will not return anything since the data would be migrated from Mongo to Casandra

Dependency Inversion using an Interface

The way to solve this problem is by introducing an interface between the high level Report and the low level MongoClient, CasandraClient components.

And this is rather simple, especially if you do it in the early stages of development.

It goes something like this below..

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
from abc import ABC, abstractmethod
from typing import Any, Dict


class DBClientInterface(ABC):

    @classmethod
    def client_factory(cls, db_type: str):
        if db_type == "mongo":
            return MongoClient()
        elif db_type == "casandra":
            return CasandraClient()
        else:
            raise ValueError("Invalid db type")

    @abstractmethod
    def get_user(self, user_id: int) -> Dict[str, Any]:
        ...


class MongoClient(DBClientInterface):
    def find_one(self, query: Dict[str, Any]) -> Dict[str, Any]:
        return {"userId": 12345, "Name": "John Doe"}
    
    def get_user(self, user_id: int) -> Dict[str, Any]:
        user = self.find_one({"userId": user_id})
        return user


class CasandraClient(DBClientInterface):
    def get_item(self, key: Dict[str, Any]) -> Dict[str, Any]:
        return {"userId": 12345, "Name": "John Doe"}

    def get_user(self, user_id: int) -> Dict[str, Any]:
        user = self.get_item({"userId": user_id})
        return user


class Report:
    def __init__(self, client):
        self.client = client

    def get_user(self, user_id: int) -> Dict[str, Any]:
        user = self.client.get_user({"userId": user_id})
        return user


def get_report(user_id: int, db_type: str) -> Dict[str, Any]:
    client = DBClientInterface.client_factory(db_type)
    report = Report(client)
    user = report.get_user(user_id)
    return user


if __name__ == "__main__":
    user_id = 12345
    get_report(user_id, db_type="casandra")
    print(user)

Let’s explain what’s going on this time.

First, in line 5, we implement a DBClientInterface class that inherits from ABC and we create an abstract get_user method.

This will be our generic DB client interface, forcing any DB client (mongo, casandra, etc) to implement this method.

We can also create a factory method client_factory to dynamically choose the correct client that needs to be instantiated.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
class DBClientInterface(ABC):

    @classmethod
    def client_factory(cls, db_type: str):
        if db_type == "mongo":
            return MongoClient()
        elif db_type == "casandra":
            return CasandraClient()
        else:
            raise ValueError("Invalid db type")

    @abstractmethod
    def get_user(self, user_id: int) -> Dict[str, Any]:
        ...

Then, in lines 21 and 30, we implement our low level components by inheriting from the Interface.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
class MongoClient(DBClientInterface):
    def _find_one(self, query: Dict[str, Any]) -> Dict[str, Any]:
        return {"userId": 12345, "Name": "John Doe"}
    
    def get_user(self, user_id: int) -> Dict[str, Any]:
        user = self._find_one({"userId": user_id})
        return user


class CasandraClient(DBClientInterface):
    def _get_item(self, key: Dict[str, Any]) -> Dict[str, Any]:
        return {"userId": 12345, "Name": "John Doe"}

    def get_user(self, user_id: int) -> Dict[str, Any]:
        user = self._get_item({"userId": user_id})
        return user

Each client can implement the get_user method however they like.

For example, the MongoClient chooses to call a _find_one method, whereas the CasandraClient a _get_item.

This is an implementation detail and doesn’t really matter since the high level component Report will only be calling get_user directly, which is consistent in any client that implements the DBClientInterface.

Then, in line 39, we implement the Report class, which now calls the generic get_user function of the interface.

1
2
3
4
5
6
7
class Report:
    def __init__(self, client):
        self.client = client

    def get_user(self, user_id: int) -> Dict[str, Any]:
        user = self.client.get_user({"userId": user_id})
        return user

And.. voila, the Report class is now completely client agnostic and decoupled from the intricate details of the individual DB clients.

Finally, in line 48, the get_report function doesn’t care anymore about what specific client we want to use since:

  • The factory method handles the client instantiation
  • The report.get_user function is consistent across all clients that implement the interface
1
2
3
4
5
6
7
8
9
10
11
def get_report(user_id: int, db_type: str) -> Dict[str, Any]:
    client = DBClientInterface.client_factory(db_type)
    report = Report(client)
    user = report.get_user(user_id)
    return user


if __name__ == "__main__":
    user_id = 12345
    user = get_report(user_id, db_type="casandra")
    print(user)

Thus, even if you now wanted to switch to a third database, you would only have to implement a new client class and ask your colleagues to change the db_type.

No Dependency Inversion, or Open-Closed principles violated!

Conclusion

This is a very simple example that can demonstrate the power of programming interfaces.

Obviously, in reality it is a lot harder to achieve (especially if you are trying to swap databases..) but the more you practice, the more you will realize its potential and hopefully it will bring a lot of value to your code.

My personal opinion is that even if you don’t immediately think of sticking an interface between every component (ending up over-engineering things) you should at least reconsider it the moment a second low level component appears.

Let me know your thoughts.

This post is licensed under CC BY 4.0 by the author.