Monday, November 1, 2021

Python Morsels: Modules are cached

Transcript

Python caches modules.

Re-importing modules doesn't update them

We have a points module here (a file called points.py) that contains a Point class:

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

Let's say we're at the Python REPL (testing out this code) and we decide that the current string representation for a Point object isn't friendly enough:

>>> from points import Point
>>> p = Point(1, 2)
>>> p
<points.Point object at 0x7fc5ee136970>

So we modify our Point class to add a __repr__ method:

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __repr__(self):
        return f"Point({self.x}, {self.y})"

Then we save the points.py file that this Point class lives in. And then we re-import this module:

>>> from points import Point

And make a new instance of this class:

>>> p = Point(1, 2)

And then look at its string representation (expecting it to have changed):

>>> p
<points.Point object at 0x7fc5ef33f220>

But then we notice that the string representation hasn't actually changed! It's the same as it was before.

Restarting the Python REPL to force-update our module

In a desperate attempt to figure out why our Point class's string representation won't update, we exit the Python REPL, restart the Python REPL, and do everything all over again.

>>> exit()
$ python3
Python 3.10.0
Type "help", "copyright", "credits" or "license" for more information.
>>>

We import our Point class from our module, make a new instance of it, and then look at its string representation to see that it changed this time!

>>> from points import Point
>>> p = Point(1, 2)
>>> p
Point(1, 2)

What's going on here? Why did importing the module a second time not work, when starting a new Python interpreter did?

Python caches all imported modules

This all happened because Python caches modules.

In Python, every module that is imported is stored in a dictionary called sys.modules.

This dictionary maps the name of each module to the module object it represents:

>>> import sys
>>> sys.modules['points']
<module 'points' from '/home/trey/points.py'>

Each time the same module is imported again, Python doesn't actually reevaluate the code for that module: it just gives us back the same module object as before.

>>> import points
>>> points
<module 'points' from '/home/trey/points.py'>
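As a rough way to check this behavior outside the REPL, the sketch below writes a throwaway module to a temporary directory and imports it twice. The module name cache_demo_mod and its single print call are made up for illustration; they are not part of the points example above.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical throwaway module: it prints every time its code is executed.
    Path(tmp, "cache_demo_mod.py").write_text('print("module code ran")\n')
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()  # make sure the import system notices the new file
    output = io.StringIO()
    try:
        with contextlib.redirect_stdout(output):
            first = importlib.import_module("cache_demo_mod")   # executes the file
            second = importlib.import_module("cache_demo_mod")  # served from sys.modules
    finally:
        sys.path.remove(tmp)

# Both imports produced the very same module object that sys.modules holds.
same_object = first is second is sys.modules["cache_demo_mod"]
# The module's code was only executed once, even though we imported it twice.
run_count = output.getvalue().count("module code ran")
print(same_object, run_count)  # True 1
```

The second import never touches the file on disk, which is why edits saved after the first import go unnoticed.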

Methods to clear the cache for a module

This module caching that Python does is a feature, not a bug: Python does this for performance reasons. But this feature does start to feel like a bug when we're testing our code from the Python REPL while also making changes to our code.

The easiest way to fix this problem is to exit the REPL and start a new one, which starts a brand-new Python process (with a brand-new sys.modules dictionary). But there are other ways to do this as well.

You could also try to modify sys.modules, deleting things from it to clear the cache for that module:

>>> del sys.modules['points']

Or you could use the reload function from Python's importlib module, which reloads a module object:

>>> from importlib import reload
>>> points = reload(points)

In either case, you might not fully fix your problem, though, because neither approach updates the references you already hold: names bound to the old versions of classes, and instances created from those old classes, keep pointing at the old code. So the easiest way to fix this really is to exit the REPL and restart it.
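Here is a minimal sketch of that limitation, assuming a throwaway module named reload_demo_points (a hypothetical stand-in for points.py) written to a temporary directory. It "edits" the file by rewriting it, reloads the module, and then compares an instance made before the reload with one made after it. It also tries the sys.modules deletion approach at the end.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Stand-ins for the two versions of points.py shown earlier.
OLD_SOURCE = '''\
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y
'''

NEW_SOURCE = OLD_SOURCE + '''
    def __repr__(self):
        return f"Point({self.x}, {self.y})"
'''

with tempfile.TemporaryDirectory() as tmp:
    source_file = Path(tmp, "reload_demo_points.py")  # hypothetical module name
    source_file.write_text(OLD_SOURCE)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        module = importlib.import_module("reload_demo_points")
        OldPoint = module.Point          # a reference to the old class
        old_instance = OldPoint(1, 2)    # an instance of the old class

        source_file.write_text(NEW_SOURCE)  # "edit and save" the file
        importlib.reload(module)            # re-run the code in the same module object

        # Looking the class up on the module again gives the new version.
        new_repr = repr(module.Point(1, 2))
        # The old class object and the old instance were not touched by the reload.
        classes_differ = module.Point is not OldPoint
        old_repr_is_default = repr(old_instance).startswith("<")

        # Deleting the cache entry makes the next import build a new module object.
        del sys.modules["reload_demo_points"]
        fresh = importlib.import_module("reload_demo_points")
        fresh_is_new_object = fresh is not module
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("reload_demo_points", None)

print(new_repr)             # Point(1, 2)
print(classes_differ)       # True
print(old_repr_is_default)  # True
```

Anything that grabbed Point before the reload (including a name bound by from points import Point) still refers to the old class until it is looked up again.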

Summary

Python caches modules. So when you're testing your code in a Python REPL while also making changes to it, keep in mind that when you re-import a module, Python will use the cached version of your module instead of reevaluating all the code in your module.

To really refresh a module, you should exit the Python REPL and start a new REPL.


