Wednesday, November 4, 2020

Real Python: Caching in Python Using the LRU Cache Strategy

There are many ways to achieve fast and responsive applications. Caching is one approach that, when used correctly, makes things much faster while decreasing the load on computing resources. Python’s functools module comes with the @lru_cache decorator, which gives you the ability to cache the result of your functions using the Least Recently Used (LRU) strategy. This is a simple yet powerful technique that you can use to leverage the power of caching in your code.

In this tutorial, you’ll learn:

  • What caching strategies are available and how to implement them using Python decorators
  • What the LRU strategy is and how it works
  • How to improve performance by caching with the @lru_cache decorator
  • How to expand the functionality of the @lru_cache decorator and make it expire after a specific time (a minimal sketch of one approach follows this list)

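On that last point, @lru_cache has no built-in notion of time, so expiry means wrapping the decorator yourself. As a rough preview, here's a minimal sketch of one way to do it (the timed_lru_cache helper and its monotonic-clock approach are assumptions for illustration, not necessarily the implementation the tutorial builds):

from functools import lru_cache, wraps
from time import monotonic

def timed_lru_cache(seconds, maxsize=128):
    # Hypothetical helper: an @lru_cache whose entries all expire together
    # once the given lifetime has elapsed.
    def decorator(func):
        cached = lru_cache(maxsize=maxsize)(func)
        expiration = monotonic() + seconds

        @wraps(func)
        def wrapper(*args, **kwargs):
            nonlocal expiration
            if monotonic() >= expiration:
                cached.cache_clear()  # drop every entry and restart the clock
                expiration = monotonic() + seconds
            return cached(*args, **kwargs)

        return wrapper

    return decorator
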
By the end of this tutorial, you’ll have a deeper understanding of how caching works and how to take advantage of it in Python.

Caching and Its Uses

Caching is an optimization technique that you can use in your applications to keep recent or often-used data in memory locations that are faster or computationally cheaper to access than their source.

Imagine you’re building a newsreader application that fetches the latest news from different sources. As the user navigates through the list, your application downloads the articles and displays them on the screen.

What would happen if the user decided to move repeatedly back and forth between a couple of news articles? Unless you were caching the data, your application would have to fetch the same content every time! That would make your user’s system sluggish and put extra pressure on the server hosting the articles.

A better approach would be to store the content locally after fetching each article. Then, the next time the user decided to open an article, your application could open the content from a locally stored copy instead of going back to the source. In computer science, this technique is called caching.

Implementing a Cache Using a Python Dictionary

You can implement a caching solution in Python using a dictionary.

Staying with the newsreader example, instead of going directly to the server every time you need to download an article, you can check whether you have the content in your cache and go back to the server only if you don’t. You can use the article’s URL as the key and its content as the value.

Here’s an example of how this caching technique might look:

 1 import requests
 2
 3 cache = dict()
 4
 5 def get_article_from_server(url):
 6     print("Fetching article from server...")
 7     response = requests.get(url)
 8     return response.text
 9
10 def get_article(url):
11     print("Getting article...")
12     if url not in cache:
13         cache[url] = get_article_from_server(url)
14
15     return cache[url]
16
17 get_article("https://realpython.com/sorting-algorithms-python/")
18 get_article("https://realpython.com/sorting-algorithms-python/")

Save this code to a caching.py file, install the requests library, then run the script:

$ pip install requests
$ python caching.py
Getting article...
Fetching article from server...
Getting article...

Notice how you get the string "Fetching article from server..." printed a single time despite calling get_article() twice, in lines 17 and 18. This happens because, after accessing the article for the first time, you put its URL and content in the cache dictionary. The second time, the code doesn’t need to fetch the item from the server again.

Caching Strategies

There’s one big problem with this cache implementation: the content of the dictionary will grow indefinitely! As the user downloads more articles, the application will keep storing them in memory, eventually causing the application to crash.

To work around this issue, you need a strategy to decide which articles should stay in memory and which should be removed. These caching strategies are algorithms that focus on managing the cached information and choosing which items to discard to make room for new ones.

There are several different strategies that you can use to evict items from the cache and keep it from growing past its maximum size. Here are five of the most popular ones, with an explanation of when each is most useful:

Strategy                      Eviction policy                         Use case
First-In/First-Out (FIFO)     Evicts the oldest of the entries        Newer entries are most likely to be reused
Last-In/First-Out (LIFO)      Evicts the latest of the entries        Older entries are most likely to be reused
Least Recently Used (LRU)     Evicts the least recently used entry    Recently used entries are most likely to be reused
Most Recently Used (MRU)      Evicts the most recently used entry     Least recently used entries are most likely to be reused
Least Frequently Used (LFU)   Evicts the least often accessed entry   Entries with a lot of hits are more likely to be reused

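To make the table concrete, here's a minimal sketch of the LRU policy layered onto the dictionary cache from the newsreader example (the MAX_ENTRIES limit is an arbitrary illustration, and get_article_from_server() is the function defined earlier):

from collections import OrderedDict

MAX_ENTRIES = 3  # arbitrary small limit, just to demonstrate eviction

cache = OrderedDict()

def get_article(url):
    if url in cache:
        cache.move_to_end(url)  # mark the entry as most recently used
        return cache[url]
    content = get_article_from_server(url)  # defined in the earlier example
    cache[url] = content
    if len(cache) > MAX_ENTRIES:
        cache.popitem(last=False)  # evict the least recently used entry
    return content

With this version, the fourth distinct article pushes the least recently requested one out of the cache, so memory use stays bounded no matter how many articles the user reads.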
In the sections below, you’ll take a closer look at the LRU strategy and how to implement it using the @lru_cache decorator from Python’s functools module.
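
As a quick taste before diving in, applying the decorator is a one-line change. The recursive fibonacci() function below is an assumption for illustration, not necessarily the example the full article uses:

from functools import lru_cache

@lru_cache(maxsize=16)
def fibonacci(n):
    # Without caching, this recursion repeats work exponentially;
    # with @lru_cache, each distinct value is computed only once.
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

print(fibonacci(30))           # 832040, reached with only 31 distinct computations
print(fibonacci.cache_info())  # reports hits, misses, maxsize, and currsize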

Diving Into the Least Recently Used (LRU) Cache Strategy

Read the full article at https://realpython.com/lru-cache-python/ »

