Monday, March 29, 2021

OrderedDict vs dict in Python: The Right Tool for the Job

Sometimes you need a Python dictionary that remembers the order of its items. In the past, you had only one tool for solving this specific problem: Python’s OrderedDict. It’s a dictionary subclass specially designed to remember the order of items, which is defined by the insertion order of keys.

This changed in Python 3.6, when the built-in dict class started keeping its items in insertion order as well. Because of that, many in the Python community now wonder if OrderedDict is still useful. A closer look at OrderedDict will uncover that this class still provides valuable features.

In this tutorial, you’ll learn how to:

  • Create and use OrderedDict objects in your code
  • Identify the differences between OrderedDict and dict
  • Understand the pros and cons of using OrderedDict vs dict

With this knowledge, you’ll be able to choose the dictionary class that best fits your needs when you want to preserve the order of items.

By the end of the tutorial, you’ll see an example of implementing a dictionary-based queue using OrderedDict, which would be more challenging if you used a regular dict object.

Choosing Between OrderedDict and dict

For years, Python dictionaries were unordered data structures. Python developers were used to this fact, and they relied on lists or other sequences when they needed to keep their data in order. With time, developers found a need for a new type of dictionary, one that would keep its items ordered.

Back in 2008, PEP 372 introduced the idea of adding a new dictionary class to collections. Its main goal was to remember the order of items as defined by the order in which keys were inserted. That was the origin of OrderedDict.

Core Python developers wanted to fill in the gap and provide a dictionary that could preserve the order of inserted keys. That, in turn, allowed for a more straightforward implementation of specific algorithms that rely on this property.

OrderedDict was added to the standard library in Python 3.1. Its API is essentially the same as dict. However, OrderedDict iterates over keys and values in the same order that the keys were inserted. If a new entry overwrites an existing entry, then the order of items is left unchanged. If an entry is deleted and reinserted, then it will be moved to the end of the dictionary.

Python 3.6 introduced a new implementation of dict in CPython. This new implementation represents a big win in terms of memory usage and iteration efficiency. Additionally, it provides a new and somewhat unexpected feature: dict objects now keep their items in the order in which they were inserted. Initially, this feature was considered an implementation detail, and the documentation advised against relying on it.

In the words of Raymond Hettinger, core Python developer and coauthor of OrderedDict, the class was specially designed to keep its items ordered, whereas the new implementation of dict was designed to be compact and to provide fast iteration:

The current regular dictionary is based on the design I proposed several years ago. The primary goals of that design were compactness and faster iteration over the dense arrays of keys and values. Maintaining order was an artifact rather than a design goal. The design can maintain order but that is not its specialty.

In contrast, I gave collections.OrderedDict a different design (later coded in C by Eric Snow). The primary goal was to have efficient maintenance of order even for severe workloads such as that imposed by the lru_cache which frequently alters order without touching the underlying dict. Intentionally, the OrderedDict has a design that prioritizes ordering capabilities at the expense of additional memory overhead and a constant factor worse insertion time.

It is still my goal to have collections.OrderedDict have a different design with different performance characteristics than regular dicts. It has some order specific methods that regular dicts don’t have (such as a move_to_end() and a popitem() that pops efficiently from either end). The OrderedDict needs to be good at those operations because that is what differentiates it from regular dicts. (Source)
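The kind of workload the quote describes, one that reorders items far more often than it adds or removes them, can be sketched with a small least-recently-used cache. This is only an illustration under stated assumptions: the LRUCache class, its get() and put() methods, and the maxsize parameter are hypothetical names, and this is not how functools.lru_cache is actually implemented.

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical minimal LRU cache built on OrderedDict (illustration only)."""

    def __init__(self, maxsize=2):
        self.maxsize = maxsize
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        # Mark the key as most recently used by moving it to the end.
        self._data.move_to_end(key)
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.maxsize:
            # Evict the least recently used item, which sits at the front.
            self._data.popitem(last=False)


cache = LRUCache(maxsize=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")      # "a" becomes the most recently used key
cache.put("c", 3)   # Cache is full, so "b" is evicted
remaining = list(cache._data)
print(remaining)  # ['a', 'c']
```

Every call to get() changes the order without changing the set of keys, which is the pattern the quote says OrderedDict is designed to handle efficiently.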

In Python 3.7, insertion-order preservation in dict objects was declared an official part of the Python language specification. So, from that point on, developers could rely on dict when they needed a dictionary that keeps its items ordered.
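As a quick illustration of that guarantee, here is a minimal sketch that assumes Python 3.7 or later. The numbers dictionary and its keys are made up for the example.

```python
# Assumes Python 3.7+, where dict insertion order is part of the language.
numbers = {}
numbers["one"] = 1
numbers["two"] = 2
numbers["three"] = 3

# Iteration follows the order in which the keys were inserted.
keys_in_order = list(numbers)
print(keys_in_order)  # ['one', 'two', 'three']

# Updating the value of an existing key does not change its position.
numbers["one"] = 100
items_after_update = list(numbers.items())
print(items_after_update)  # [('one', 100), ('two', 2), ('three', 3)]
```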

At this point, a question arises: Is OrderedDict still needed after this new implementation of dict? The answer depends on your specific use case and also on how explicit you want to be in your code.

At the time of writing, some features of OrderedDict still made it valuable and different from a regular dict:

  1. Intent signaling: If you use OrderedDict over dict, then your code makes it clear that the order of items in the dictionary is important. You’re clearly communicating that your code needs or relies on the order of items in the underlying dictionary.
  2. Control over the order of items: If you need to rearrange or reorder the items in a dictionary, then you can use .move_to_end() and also the enhanced variation of .popitem().
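  3. Equality test behavior: If your code compares dictionaries for equality, and the order of items is important in that comparison, then OrderedDict is the right choice. Equality between two OrderedDict objects takes order into account, whereas equality between regular dict objects ignores it.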
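The second and third points in the list above can be sketched as follows. The variable names and sample data are assumptions made for the example. The calls themselves, .move_to_end() and .popitem() with their last argument, are part of the standard OrderedDict API.

```python
from collections import OrderedDict

letters = OrderedDict([("a", 1), ("b", 2), ("c", 3)])

# Reorder items in place.
letters.move_to_end("a")              # Move "a" to the end
print(list(letters))  # ['b', 'c', 'a']
letters.move_to_end("c", last=False)  # Move "c" to the beginning
print(list(letters))  # ['c', 'b', 'a']

# Pop from either end. A regular dict can only pop the last item.
first = letters.popitem(last=False)
last = letters.popitem()
print(first, last)  # ('c', 3) ('a', 1)

# Equality between regular dicts ignores order.
d1 = {"x": 1, "y": 2}
d2 = {"y": 2, "x": 1}
print(d1 == d2)  # True

# Equality between two OrderedDict objects is order-sensitive.
od1 = OrderedDict([("x", 1), ("y", 2)])
od2 = OrderedDict([("y", 2), ("x", 1)])
print(od1 == od2)  # False

# Comparing an OrderedDict with a regular dict ignores order again.
print(od1 == d2)  # True
```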

There’s at least one more reason to continue using OrderedDict in your code: backward compatibility. Regular dict objects don’t preserve insertion order in versions of Python older than 3.6, so code that relies on that order can break in those environments.

It’s difficult to say if dict will fully replace OrderedDict soon. Nowadays, OrderedDict still offers interesting and valuable features that you might want to consider when selecting a tool for a given job.

Getting Started With Python’s OrderedDict

Python’s OrderedDict is a dict subclass that preserves the order in which key-value pairs, commonly known as items, are inserted into the dictionary. When you iterate over an OrderedDict object, items are traversed in the original order. If you update the value of an existing key, then the order remains unchanged. If you remove an item and reinsert it, then the item is added at the end of the dictionary.
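Here is a minimal sketch of those three behaviors. The numbers dictionary and its contents are illustrative only.

```python
from collections import OrderedDict

numbers = OrderedDict()
numbers["one"] = 1
numbers["two"] = 2
numbers["three"] = 3

# Items are traversed in insertion order.
original_order = list(numbers)
print(original_order)  # ['one', 'two', 'three']

# Updating the value of an existing key keeps its position.
numbers["one"] = 1.0
order_after_update = list(numbers)
print(order_after_update)  # ['one', 'two', 'three']

# Removing an item and reinserting it moves it to the end.
del numbers["one"]
numbers["one"] = 1
order_after_reinsert = list(numbers)
print(order_after_reinsert)  # ['two', 'three', 'one']
```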

Being a dict subclass means that it inherits all the methods a regular dictionary provides. OrderedDict also has additional features that you’ll learn about in this tutorial. In this section, however, you’ll learn the basics of creating and using OrderedDict objects in your code.

Read the full article at https://realpython.com/python-ordereddict/ »





from Real Python