Dictionaries are best used for key-value lookups: we provide a key and the dictionary very quickly returns the corresponding value.
But what if you need both key-value lookups and iteration? It is possible to loop over a dictionary, and when looping we might care about the order of its items.
With dictionary item order in mind, you might wonder: how can we sort a dictionary?
Dictionaries are ordered
As of Python 3.6, dictionaries are ordered (this was a CPython implementation detail in 3.6 and became an official language guarantee in 3.7).
Dictionary keys are stored in insertion order, meaning whenever a new key is added it gets added at the very end.
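A quick sketch of this behavior, using a made-up colors dictionary (the names and counts are only illustrative):

```python
# Hypothetical example data: color names mapped to counts.
colors = {"pink": 3, "blue": 5}

# A brand new key is placed after all of the existing keys.
colors["green"] = 2

print(list(colors))  # ['pink', 'blue', 'green']
```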
But if we update a key-value pair, the key remains where it was before:
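For instance, assuming a small made-up colors dictionary, reassigning an existing key changes its value but not its position:

```python
# Hypothetical example data: color names mapped to counts.
colors = {"pink": 3, "blue": 5, "green": 2}

# "pink" already exists, so only its value changes; it stays first.
colors["pink"] = 10

print(colors)  # {'pink': 10, 'blue': 5, 'green': 2}
```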
So if you plan to populate a dictionary with some specific data and then leave that dictionary as-is, all you need to do is make sure that original data is in the order you’d like.
For example, if we have a CSV file of US state abbreviations and our file is ordered alphabetically by state name, our dictionary will be ordered the same way:
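Here is a minimal sketch of that idea. To keep it self-contained, an in-memory string stands in for the CSV file, and the column layout (state name, then abbreviation) is an assumption:

```python
import csv
import io

# Stand-in for a real CSV file; the contents and column order are assumptions.
csv_text = "Alabama,AL\nAlaska,AK\nArizona,AZ\nArkansas,AR\n"

state_abbreviations = {}
with io.StringIO(csv_text) as csv_file:
    for name, abbreviation in csv.reader(csv_file):
        # Rows are read top to bottom, so keys are inserted in file order.
        state_abbreviations[name] = abbreviation

print(list(state_abbreviations))  # ['Alabama', 'Alaska', 'Arizona', 'Arkansas']
```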
If our input data is already ordered correctly, our dictionary will end up ordered correctly as well.
How to sort a dictionary by its keys
What if our data isn’t sorted yet?
Say we have a list-of-tuples that pair meeting rooms to their corresponding room numbers:
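Such a list might look like this (the room names and numbers are made up for illustration):

```python
# Hypothetical data: (meeting room name, room number) pairs in no particular order.
rooms = [("Pink", "Rm 403"), ("Space", "Rm 201"), ("Quail", "Rm 500"), ("Lime", "Rm 503")]
```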
And we’d like to turn these pairs into a dictionary that’s sorted by its keys.
We could use the built-in sorted function to sort it:
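A sketch of that call, assuming the same made-up list of room tuples:

```python
# Hypothetical data: (meeting room name, room number) pairs.
rooms = [("Pink", "Rm 403"), ("Space", "Rm 201"), ("Quail", "Rm 500"), ("Lime", "Rm 503")]

# sorted returns a new list and leaves the original list untouched.
sorted_rooms = sorted(rooms)

print(sorted_rooms)
# [('Lime', 'Rm 503'), ('Pink', 'Rm 403'), ('Quail', 'Rm 500'), ('Space', 'Rm 201')]
```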
The sorted function uses the < operator to compare many items in the given iterable and return a sorted list. The sorted function always returns a list.
To make these key-value pairs into a dictionary, we can pass them straight to the dict constructor:
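For example, with the same made-up room tuples:

```python
# Hypothetical data: (meeting room name, room number) pairs.
rooms = [("Pink", "Rm 403"), ("Space", "Rm 201"), ("Quail", "Rm 500"), ("Lime", "Rm 503")]

# Sort the pairs, then build a dictionary whose insertion order is the sorted order.
sorted_rooms = dict(sorted(rooms))

print(sorted_rooms)
# {'Lime': 'Rm 503', 'Pink': 'Rm 403', 'Quail': 'Rm 500', 'Space': 'Rm 201'}
```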
The dict constructor will accept a list of 2-item tuples (or any iterable of 2-item iterables) and make a dictionary out of it, using the first item from each tuple as a key and the second as the corresponding value.
Key-value pairs are sorted lexicographically… what?
We’re sorting tuples of the key-value pairs before making a dictionary out of them. But how does sorting tuples work?
When sorting tuples, Python uses lexicographical ordering (which sounds fancier than it is). Comparing a 2-item tuple basically boils down to this algorithm:
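That comparison can be sketched roughly as follows. The function name is a hypothetical helper written for illustration; Python does this internally when you write a < b:

```python
def compare_two_item_tuples(a, b):
    """Mimic a < b for two 2-item tuples (illustrative helper, not a built-in)."""
    if a[0] != b[0]:
        # The first items differ, so they alone decide the result.
        return a[0] < b[0]
    # The first items are equal, so fall back to comparing the second items.
    return a[1] < b[1]


print(compare_two_item_tuples((1, 3), (1, 4)))  # True, same as (1, 3) < (1, 4)
print(compare_two_item_tuples((2, 3), (1, 4)))  # False, same as (2, 3) < (1, 4)
```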
I’ve written an article on tuple ordering that explains this in more detail.
You might be thinking: it seems like this sorts not just by keys but by keys and values. And you’re right! But only sort of.
The keys in a dictionary should always compare as unequal (if two keys are equal, they’re seen as the same key). So as long as the keys are comparable to each other with the less than operator (<), sorting 2-item tuples of key-value pairs should always sort by the keys.
Dictionaries can’t be sorted in-place
What if we already have our items in a dictionary and we’d like to sort that dictionary? Unlike lists, there’s no sort method on dictionaries.
We can’t sort a dictionary in-place, but we could get the items from our dictionary, sort those items using the same technique we used before, and then turn those items into a new dictionary:
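A minimal sketch, assuming a made-up rooms dictionary:

```python
# Hypothetical data: meeting room names mapped to room numbers.
rooms = {"Pink": "Rm 403", "Space": "Rm 201", "Quail": "Rm 500", "Lime": "Rm 503"}

# items() yields (key, value) tuples, which sort by key.
sorted_rooms = dict(sorted(rooms.items()))

print(sorted_rooms)
# {'Lime': 'Rm 503', 'Pink': 'Rm 403', 'Quail': 'Rm 500', 'Space': 'Rm 201'}
```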
That creates a new dictionary object. If we really wanted to update our original dictionary object, we could take the items from the dictionary, sort them, clear the dictionary of all its items, and then add all the items back into the dictionary:
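Those steps could look something like this (again with a made-up rooms dictionary):

```python
# Hypothetical data: meeting room names mapped to room numbers.
rooms = {"Pink": "Rm 403", "Space": "Rm 201", "Quail": "Rm 500", "Lime": "Rm 503"}

old_items = sorted(rooms.items())  # copy the items out, sorted by key
rooms.clear()                      # empty the original dictionary object
rooms.update(old_items)            # re-insert the items in sorted order

print(rooms)
# {'Lime': 'Rm 503', 'Pink': 'Rm 403', 'Quail': 'Rm 500', 'Space': 'Rm 201'}
```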
But why bother? We don’t usually want to operate on data structures in-place in Python: we tend to prefer making a new data structure rather than re-using an old one (this preference is partly thanks to how variables work in Python).
How to sort a dictionary by its values
What if we wanted to sort a dictionary by its values instead of its keys?
We could make a new list of value-key tuples (actually a generator in our case below), sort that, then flip them back to key-value tuples and recreate our dictionary:
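A sketch of that approach, assuming a made-up rooms dictionary:

```python
# Hypothetical data: meeting room names mapped to room numbers.
rooms = {"Pink": "Rm 403", "Space": "Rm 201", "Quail": "Rm 500", "Lime": "Rm 503"}

# Flip each pair so the value comes first (a generator, not a list).
value_key_pairs = ((value, key) for key, value in rooms.items())

# Sorting (value, key) tuples orders by value first, then by key.
sorted_value_key_pairs = sorted(value_key_pairs)

# Flip the pairs back while building the new dictionary.
rooms_by_value = {key: value for value, key in sorted_value_key_pairs}

print(rooms_by_value)
# {'Space': 'Rm 201', 'Pink': 'Rm 403', 'Quail': 'Rm 500', 'Lime': 'Rm 503'}
```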
This works but it’s a bit long. Also this technique actually sorts both our values and our keys (giving the values precedence in the sorting).
What if we wanted to just sort by the values, ignoring the contents of the keys entirely? Python’s sorted function accepts a key argument that we can use for this!
The key function we pass to sorted should accept an item from the iterable we’re sorting and return the key to sort by. Note that the word “key” here isn’t related to dictionary keys. Dictionary keys are used for looking up dictionary values whereas this key function returns an object that determines how to order items in an iterable.
If we want to sort by our values, we could make a key function that accepts each item in our list of 2-item tuples and returns just the value:
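Such a key function might look like this (value_getter is simply a name chosen for this sketch):

```python
def value_getter(item):
    """Return the value from a (key, value) tuple."""
    return item[1]


print(value_getter(("Pink", "Rm 403")))  # Rm 403
```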
Then we’d use our key function by passing it to the sorted function (yes, functions can be passed to other functions in Python) and pass the result to dict to create a new dictionary:
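Put together, a sketch with a made-up rooms dictionary and a value_getter helper (both names are assumptions for this example):

```python
def value_getter(item):
    """Return the value from a (key, value) tuple."""
    return item[1]


# Hypothetical data: meeting room names mapped to room numbers.
rooms = {"Pink": "Rm 403", "Space": "Rm 201", "Quail": "Rm 500", "Lime": "Rm 503"}

# Note that value_getter is passed without parentheses: we hand over the function itself.
rooms_by_value = dict(sorted(rooms.items(), key=value_getter))

print(rooms_by_value)
# {'Space': 'Rm 201', 'Pink': 'Rm 403', 'Quail': 'Rm 500', 'Lime': 'Rm 503'}
```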
If you prefer not to create a custom key function just to use it once, you could use a lambda function (which I don’t usually recommend):
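The lambda version might look like this, assuming the same made-up rooms dictionary:

```python
# Hypothetical data: meeting room names mapped to room numbers.
rooms = {"Pink": "Rm 403", "Space": "Rm 201", "Quail": "Rm 500", "Lime": "Rm 503"}

# The lambda receives each (key, value) tuple and returns the value to sort by.
rooms_by_value = dict(sorted(rooms.items(), key=lambda item: item[1]))

print(rooms_by_value)
# {'Space': 'Rm 201', 'Pink': 'Rm 403', 'Quail': 'Rm 500', 'Lime': 'Rm 503'}
```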
Or you could use operator.itemgetter to make a key function that gets the second item from each key-value tuple:
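A sketch using itemgetter from the standard library's operator module, with the same made-up rooms dictionary:

```python
from operator import itemgetter

# Hypothetical data: meeting room names mapped to room numbers.
rooms = {"Pink": "Rm 403", "Space": "Rm 201", "Quail": "Rm 500", "Lime": "Rm 503"}

# itemgetter(1) returns a function that fetches index 1 (the value) from each tuple.
rooms_by_value = dict(sorted(rooms.items(), key=itemgetter(1)))

print(rooms_by_value)
# {'Space': 'Rm 201', 'Pink': 'Rm 403', 'Quail': 'Rm 500', 'Lime': 'Rm 503'}
```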
I discussed my preference for itemgetter in my article on lambda functions.
Ordering a dictionary in some other way
What if we needed to sort by something other than just a key or a value? For example what if our room number strings include numbers that aren’t always the same length:
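For instance, a made-up rooms dictionary like this one, where the numbers have two, three, or four digits:

```python
# Hypothetical data: room numbers of varying length.
rooms = {
    "Pink": "Rm 403",
    "Space": "Rm 201",
    "Quail": "Rm 500",
    "Lime": "Rm 503",
    "Ocean": "Rm 2000",
    "Big": "Rm 30",
}
```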
If we sorted these rooms by value, those strings wouldn’t be sorted in the numerical way we’re hoping for:
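Here is what that looks like with a made-up rooms dictionary whose room numbers vary in length:

```python
from operator import itemgetter

# Hypothetical data: room numbers of varying length.
rooms = {
    "Pink": "Rm 403",
    "Space": "Rm 201",
    "Quail": "Rm 500",
    "Lime": "Rm 503",
    "Ocean": "Rm 2000",
    "Big": "Rm 30",
}

# Sorting by the value strings compares them one character at a time.
rooms_by_value = dict(sorted(rooms.items(), key=itemgetter(1)))

print(list(rooms_by_value.values()))
# ['Rm 2000', 'Rm 201', 'Rm 30', 'Rm 403', 'Rm 500', 'Rm 503']
```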
Rm 30 should be first and Rm 2000 should be last. But we’re sorting strings, which are ordered character-by-character based on the Unicode code point of each character (I noted this in my article on tuple ordering).
We could customize the key function we’re using to sort numerically instead:
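One way to write such a key function is sketched below. The name by_room_number is made up, and it assumes every value looks like "Rm" followed by a space and an integer:

```python
def by_room_number(item):
    """Return the room number as an integer, given a (name, room) tuple."""
    name, room = item
    # Assumes values shaped like "Rm 403": a prefix, a space, then digits.
    _, number = room.split()
    return int(number)


print(by_room_number(("Ocean", "Rm 2000")))  # 2000
```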
When we use this key function to sort our dictionary, it will be sorted by the integer room number, as expected:
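A self-contained sketch of the whole thing, with a hypothetical by_room_number key function and made-up room data:

```python
def by_room_number(item):
    """Return the room number as an integer, given a (name, room) tuple."""
    name, room = item
    # Assumes values shaped like "Rm 403": a prefix, a space, then digits.
    _, number = room.split()
    return int(number)


# Hypothetical data: room numbers of varying length.
rooms = {
    "Pink": "Rm 403",
    "Space": "Rm 201",
    "Quail": "Rm 500",
    "Lime": "Rm 503",
    "Ocean": "Rm 2000",
    "Big": "Rm 30",
}

sorted_rooms = dict(sorted(rooms.items(), key=by_room_number))

print(list(sorted_rooms.values()))
# ['Rm 30', 'Rm 201', 'Rm 403', 'Rm 500', 'Rm 503', 'Rm 2000']
```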
Should you sort a dictionary?
When you’re about to sort a dictionary, first ask yourself, “Do I need to do this?” In fact, when you’re considering looping over a dictionary, you might ask, “Do I really need a dictionary here?”
Dictionaries are used for key-value lookups: you can quickly get a value given a key. They’re very fast at retrieving values for keys. But dictionaries take up more space than a list of tuples.
If you can get away with using a list of tuples in your code (because you don’t actually need a key-value lookup), you probably should use a list of tuples instead of a dictionary.
But if key lookups are what you need, it’s unlikely that you also need to loop over your dictionary.
Now it’s certainly possible that right now you do in fact have a good use case for sorting a dictionary (for example maybe you’re sorting keys in a dictionary of attributes), but keep in mind that you’ll need to sort a dictionary very rarely.
Summary
Dictionaries are used for quickly looking up a value based on a key. The order of a dictionary’s items is rarely important.
In the rare case that you care about the order of your dictionary’s items, keep in mind that dictionaries are ordered by the insertion order of their keys (as of Python 3.6). So the keys in your dictionary will remain in the order they were added to the dictionary.
If you’d like to sort a dictionary by its keys, you can use the built-in sorted function along with the dict constructor:
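In short (old_dictionary is placeholder data for this sketch):

```python
old_dictionary = {"b": 2, "c": 1, "a": 3}  # placeholder data

sorted_dictionary = dict(sorted(old_dictionary.items()))

print(sorted_dictionary)  # {'a': 3, 'b': 2, 'c': 1}
```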
If you’d like to sort a dictionary by its values, you can pass a custom key function (one which returns the value for each item) to sorted:
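In short (value_getter and old_dictionary are placeholder names for this sketch):

```python
def value_getter(item):
    """Return the value from a (key, value) tuple."""
    return item[1]


old_dictionary = {"b": 2, "c": 1, "a": 3}  # placeholder data

sorted_dictionary = dict(sorted(old_dictionary.items(), key=value_getter))

print(sorted_dictionary)  # {'c': 1, 'b': 2, 'a': 3}
```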
But remember, it’s not often that we care about the order of a dictionary. Whenever you’re sorting a dictionary, ask yourself: do I really need to sort this data structure, and would a list of tuples be more suitable than a dictionary here?