Dynamically import all classes in Python submodules

Sep 23, 2019

Python code is organized in folders called packages containing .py files called modules. In this article, I show a trick to automatically import into the top-level package all the classes defined in its submodules. This shortens the import statements for the convenience of the end user. Namely:

from package_name import ClassName 

instead of:

from package_name.module_name import ClassName

What's the point?

I needed this while creating my static blog generator. The generator is organized as a transformation pipeline that takes markdown files as input and produces an html blog as output. Somewhat similar to a plugin-based architecture, I decided to implement each step of the pipeline in its own file. The project's structure is as follows:

calepin
├── pipeline.py
└── processors
    ├── __init__.py
    ├── processor_1.py
    ├── ...
    └── processor_50.py

And inside pipeline.py, the goal is to write:

pipeline.py
from processors import LoadFromDisk, ParseMarkdown, WriteAsHtml

pipeline = [LoadFromDisk, ParseMarkdown, WriteAsHtml]

Or even:

pipeline.py
from processors import *

pipeline = [LoadFromDisk, ParseMarkdown, WriteAsHtml]

Instead of:

pipeline.py
from processors.loadfromdisk import LoadFromDisk
from processors.parsemarkdown import ParseMarkdown
from processors.writeashtml import WriteAsHtml

pipeline = [LoadFromDisk, ParseMarkdown, WriteAsHtml]

Of course, in a typical Python project, I would argue against using import *. But for my generator project, the design pattern is very clear: define a pipeline, and implement all its steps in the processors sub-package. So it's pretty safe to import * all the steps.

There are several ways to achieve this in Python. For instance, I could have used an import hook in the pipeline.py file. But I chose to fiddle with the processors/ sub-package to dynamically import all the classes contained in its submodules. Here's how.

Dynamically import all classes in package

The code was built based on these resources.

Put this in processors/__init__.py and you're good to go:

processors/__init__.py
from inspect import isclass
from pkgutil import iter_modules
from pathlib import Path
from importlib import import_module

# iterate through the modules in the current package
package_dir = Path(__file__).resolve().parent
# pass the directory as a str: some Python versions do not handle Path objects here
for (_, module_name, _) in iter_modules([str(package_dir)]):

    # import the module and iterate through its attributes
    module = import_module(f"{__name__}.{module_name}")
    for attribute_name in dir(module):
        attribute = getattr(module, attribute_name)

        if isclass(attribute):
            # Add the class to this package's variables
            globals()[attribute_name] = attribute
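To check the loader end to end without touching a real project, here is a minimal sketch that writes a throwaway package to a temporary directory and imports it. The package name demo_processors and the helper build_demo_package are made up for the demo. The __all__ list is my addition, not part of the snippet above: without it, import * also exports helper names such as isclass, Path or package_dir.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Loader from the article, plus an __all__ list (an addition for this demo)
INIT_SOURCE = textwrap.dedent('''
    from inspect import isclass
    from pkgutil import iter_modules
    from pathlib import Path
    from importlib import import_module

    __all__ = []

    package_dir = Path(__file__).resolve().parent
    for (_, module_name, _) in iter_modules([str(package_dir)]):
        module = import_module(f"{__name__}.{module_name}")
        for attribute_name in dir(module):
            attribute = getattr(module, attribute_name)
            if isclass(attribute):
                globals()[attribute_name] = attribute
                if attribute_name not in __all__:
                    __all__.append(attribute_name)
''')


def build_demo_package(root: Path, package_name: str = "demo_processors") -> None:
    """Write a hypothetical package with two one-class submodules."""
    package_dir = root / package_name
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text(INIT_SOURCE)
    (package_dir / "loadfromdisk.py").write_text("class LoadFromDisk:\n    pass\n")
    (package_dir / "parsemarkdown.py").write_text("class ParseMarkdown:\n    pass\n")


with tempfile.TemporaryDirectory() as tmp:
    build_demo_package(Path(tmp))
    sys.path.insert(0, tmp)  # make the throwaway package importable
    try:
        demo = importlib.import_module("demo_processors")
        exported = sorted(demo.__all__)

        # Simulate "from demo_processors import *" in a fresh namespace
        namespace = {}
        exec("from demo_processors import *", namespace)
        star_names = sorted(n for n in namespace if not n.startswith("__"))
    finally:
        sys.path.remove(tmp)

print(exported)
print(star_names)
```

Both lists should contain only the two class names. If you drop __all__ from the sketch, the star import pulls in every public name defined in __init__.py, which is harmless in my pipeline but worth knowing.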

A loader for plugin architectures

By the way, a nice way to load all plugins in a directory is to have them extend a PluginBase base class, and then dynamically load its subclasses.

To do so, add the Python builtin issubclass(attribute, PluginBase) to the isclass(attribute) check:

from inspect import isclass  # issubclass is a builtin and needs no import

...

    for attribute_name in dir(module):
        attribute = getattr(module, attribute_name)

        # isclass() guards issubclass(), which raises TypeError on non-classes
        if isclass(attribute) and issubclass(attribute, PluginBase):
            globals()[attribute_name] = attribute
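The same idea can be wrapped in a function that returns the plugins instead of writing them into globals(). This is only a sketch under a few assumptions: the function name load_plugins, the package demo_plugins and the classes Hello and NotAPlugin are all hypothetical, and I assume PluginBase lives in the package's __init__.py. It also skips the base class itself, because issubclass(PluginBase, PluginBase) is True and any submodule that imports PluginBase would otherwise register it as a plugin.

```python
import importlib
import pkgutil
import sys
import tempfile
from inspect import isclass
from pathlib import Path


def load_plugins(package_name: str, base_class: type) -> dict:
    """Return {class name: class} for each strict subclass of base_class
    found in the direct submodules of package_name."""
    package = importlib.import_module(package_name)
    plugins = {}
    for _, module_name, _ in pkgutil.iter_modules(package.__path__):
        module = importlib.import_module(f"{package_name}.{module_name}")
        for attribute_name in dir(module):
            attribute = getattr(module, attribute_name)
            if (isclass(attribute)
                    and issubclass(attribute, base_class)
                    and attribute is not base_class):  # skip the base itself
                plugins[attribute_name] = attribute
    return plugins


# Demo on a throwaway package written to a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / "demo_plugins"
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("class PluginBase:\n    pass\n")
    (package_dir / "hello.py").write_text(
        "from demo_plugins import PluginBase\n\n"
        "class Hello(PluginBase):\n    pass\n\n"
        "class NotAPlugin:\n    pass\n"
    )
    sys.path.insert(0, tmp)  # make the throwaway package importable
    try:
        base = importlib.import_module("demo_plugins").PluginBase
        plugins = load_plugins("demo_plugins", base)
    finally:
        sys.path.remove(tmp)

print(sorted(plugins))
```

Only Hello should be reported: NotAPlugin does not extend the base class, and PluginBase is filtered out explicitly. Returning a dict keeps the package namespace untouched, which I find easier to reason about when the plugins come from third parties.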