Monday, July 26, 2021

Real Python: Python's collections: A Buffet of Specialized Data Types

Python’s collections module provides a rich set of specialized container data types carefully designed to approach specific programming problems in a Pythonic and efficient way. The module also provides wrapper classes that make it safer to create custom classes that behave similarly to the built-in types dict, list, and str.

Learning about the data types and classes in collections will allow you to grow your programming tool kit with a valuable set of reliable and efficient tools.

In this tutorial, you’ll learn how to:

  • Write readable and explicit code with namedtuple
  • Build efficient queues and stacks with deque
  • Count objects quickly with Counter
  • Handle missing dictionary keys with defaultdict
  • Guarantee the insertion order of keys with OrderedDict
  • Manage multiple dictionaries as a single unit with ChainMap

To better understand the data types and classes in collections, you should know the basics of working with Python’s built-in data types, such as lists, tuples, and dictionaries. Additionally, the last part of the article requires some basic knowledge about object-oriented programming in Python.

Getting Started With Python’s collections

Back in Python 2.4, Raymond Hettinger contributed a new module called collections to the standard library. The goal was to provide various specialized collection data types to approach specific programming problems.

At that time, collections only included one data structure, deque, which was specially designed as a double-ended queue that supports efficient append and pop operations on either end of the sequence. From this point on, several modules in the standard library took advantage of deque to improve the performance of their classes and structures. Some outstanding examples are queue and threading.
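As a minimal sketch of what "efficient append and pop operations on either end" looks like in practice, the snippet below adds and removes items at both ends of a deque. The variable names are made up for illustration.

```python
from collections import deque

# A deque supports appends and pops at both ends.
tasks = deque(["b", "c"])
tasks.append("d")        # add to the right end
tasks.appendleft("a")    # add to the left end

last = tasks.pop()       # remove from the right end
first = tasks.popleft()  # remove from the left end

print(first, last)   # a d
print(list(tasks))   # ['b', 'c']
```

With a regular list, inserting or removing at the left end has to shift every other item, which is why deque is usually the better fit for queue-like access patterns.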

With time, a handful of specialized container data types populated the module:

Data type | Python version | Description
deque | 2.4 | A sequence-like collection that supports efficient addition and removal of items from either end of the sequence
defaultdict | 2.5 | A dictionary subclass for constructing default values for missing keys and automatically adding them to the dictionary
namedtuple() | 2.6 | A factory function for creating subclasses of tuple that provides named fields that allow accessing items by name while keeping the ability to access items by index
OrderedDict | 2.7, 3.1 | A dictionary subclass that keeps the key-value pairs ordered according to when the keys are inserted
Counter | 2.7, 3.1 | A dictionary subclass that supports convenient counting of unique items in a sequence or iterable
ChainMap | 3.3 | A dictionary-like class that allows treating a number of mappings as a single dictionary object
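To make the descriptions above a bit more concrete, here is a quick, hedged tour of four of these types. The sample data and variable names are invented for illustration, and each type has more features than shown here.

```python
from collections import ChainMap, Counter, OrderedDict, defaultdict

# defaultdict: missing keys get a default value built by the factory (list here).
groups = defaultdict(list)
for word in ["apple", "avocado", "banana"]:
    groups[word[0]].append(word)

# Counter: counts hashable items from an iterable.
letters = Counter("mississippi")

# OrderedDict: remembers insertion order and can reorder keys.
order = OrderedDict(one=1, two=2, three=3)
order.move_to_end("one")

# ChainMap: looks keys up in each mapping in turn; the first match wins.
defaults = {"color": "blue", "size": "M"}
overrides = {"color": "red"}
settings = ChainMap(overrides, defaults)

print(dict(groups))                         # {'a': ['apple', 'avocado'], 'b': ['banana']}
print(letters.most_common(2))               # [('i', 4), ('s', 4)]
print(list(order))                          # ['two', 'three', 'one']
print(settings["color"], settings["size"])  # red M
```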

Besides these specialized data types, collections also provides three base classes that facilitate the creation of custom lists, dictionaries, and strings:

Class | Description
UserDict | A wrapper class around a dictionary object that facilitates subclassing dict
UserList | A wrapper class around a list object that facilitates subclassing list
UserString | A wrapper class around a string object that facilitates subclassing str

The need for these wrapper classes was partially eclipsed by the ability to subclass the corresponding standard built-in data types. However, sometimes using these classes is safer and less error-prone than using standard data types.
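One way to see why the wrapper classes can be safer: in CPython, the built-in dict's constructor and .update() don't go through an overridden .__setitem__(), while UserDict routes those operations through it. The sketch below uses two hypothetical classes, UpperDict and UpperUserDict, that try to upper-case every key.

```python
from collections import UserDict

class UpperDict(dict):
    """Hypothetical dict subclass that upper-cases keys on assignment."""
    def __setitem__(self, key, value):
        super().__setitem__(key.upper(), value)

class UpperUserDict(UserDict):
    """Same idea, but built on the UserDict wrapper class."""
    def __setitem__(self, key, value):
        super().__setitem__(key.upper(), value)

plain = UpperDict(a=1)    # the constructor bypasses __setitem__
plain.update(b=2)         # so does update()
plain["c"] = 3            # only direct assignment is intercepted

wrapped = UpperUserDict(a=1)
wrapped.update(b=2)
wrapped["c"] = 3

print(plain)         # {'a': 1, 'b': 2, 'C': 3}
print(wrapped.data)  # {'A': 1, 'B': 2, 'C': 3}
```

With the dict subclass, you'd need to override the constructor and .update() as well to get consistent behavior. With UserDict, overriding .__setitem__() is enough in this case.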

With this brief introduction to collections and the specific use cases that the data structures and classes in this module can solve, it’s time to take a closer look at them. Before that, it’s important to point out that this tutorial is an introduction to collections as a whole. In most of the following sections, you’ll find a note that points you to a dedicated article on the class or function at hand.

Improving Code Readability: namedtuple()

Python’s namedtuple() is a factory function that allows you to create tuple subclasses with named fields. These fields give you direct access to the values in a given named tuple using the dot notation, like in obj.attr.

The need for this feature arose because using indices to access the values in a regular tuple is annoying, difficult to read, and error-prone. This is especially true if the tuple you’re working with has several items and is constructed far away from where you’re using it.

Note: Check out Write Pythonic and Clean Code With namedtuple for a deeper dive into how to use namedtuple in Python.

A tuple subclass with named fields that developers can access with the dot notation seemed like a desirable feature back in Python 2.6. That’s the origin of namedtuple(). The tuple subclasses you can build with this function are a big win in code readability if you compare them with regular tuples.

To put the code readability problem in perspective, consider divmod(). This built-in function takes two (non-complex) numbers and returns a tuple with the quotient and remainder that result from the integer division of the input values:

>>> divmod(12, 5)
(2, 2)

It works nicely. However, is this result readable? Can you tell what the meaning of each number in the output is? Fortunately, Python offers a way to improve this. You can code a custom version of divmod() with an explicit result using namedtuple:

>>> from collections import namedtuple

>>> def custom_divmod(x, y):
...     DivMod = namedtuple("DivMod", "quotient remainder")
...     return DivMod(*divmod(x, y))
...

>>> result = custom_divmod(12, 5)
>>> result
DivMod(quotient=2, remainder=2)

>>> result.quotient
2
>>> result.remainder
2

Now you know the meaning of each value in the result. You can also access each independent value using the dot notation and a descriptive field name.
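Because a named tuple is still a tuple, index access and unpacking keep working alongside the field names. The following sketch rebuilds the same DivMod type at module level, an arrangement chosen here only to keep the example short:

```python
from collections import namedtuple

DivMod = namedtuple("DivMod", "quotient remainder")
result = DivMod(*divmod(12, 5))

# Regular tuple operations still work.
quotient, remainder = result       # unpacking
by_index = (result[0], result[1])  # index access

print(quotient, remainder)        # 2 2
print(by_index)                   # (2, 2)
print(isinstance(result, tuple))  # True
print(DivMod._fields)             # ('quotient', 'remainder')
```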

Read the full article at https://realpython.com/python-collections-module/ »

