Wednesday, December 5, 2018

Stack Abuse: Sets in Python

Introduction

In Python, a set is a data structure that stores unordered items. The set items are also unindexed. Like a list, a set allows the addition and removal of elements. However, there are a few unique characteristics that define a set and separate it from other data structures:

  • A set does not hold duplicate items.
  • The elements of a set must be hashable, which in practice means immutable values such as numbers, strings, and tuples. The set itself is mutable, so elements can be added and removed.
  • Since set items are not indexed, sets don't support any slicing or indexing operations.
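The characteristics listed above can be checked directly in the interpreter. The following is a minimal sketch; the variable name letters is purely illustrative.

```python
# Duplicates are collapsed when the set is built.
letters = set("hello")
print(len(letters))  # 4 distinct characters: h, e, l, o

# Sets do not support indexing, so this raises a TypeError.
try:
    letters[0]
except TypeError as err:
    print("TypeError:", err)

# The set itself is mutable: elements can be added and removed.
letters.add("y")
letters.discard("h")
print(sorted(letters))  # ['e', 'l', 'o', 'y']
```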

In this article, we will be discussing the various operations that can be performed on sets in Python.

How to Create a Set

There are two ways through which we can create sets in Python.

We can create a set by placing the elements inside curly braces {} and separating them with commas (,). A set can hold any number of items, and the items can be of different types, for example, integers, strings, and tuples. However, a set does not accept an unhashable (mutable) element such as a list, a dictionary, or another set.

Here is an example of how to create a set in Python:

num_set = {1, 2, 3, 4, 5, 6}  
print(num_set)  

Output

{1, 2, 3, 4, 5, 6}

We just created a set of numbers. We can also create a set of string values. For example:

string_set = {"Nicholas", "Michelle", "John", "Mercy"}  
print(string_set)  

Output

{'Michelle', 'Nicholas', 'John', 'Mercy'}

You may have noticed that the elements in the above output are not in the order in which we added them to the set. This is because set items are unordered. If you run the same code again, you may get the elements in a different order.

We can also create a set with elements of different types. For example:

mixed_set = {2.0, "Nicholas", (1, 2, 3)}  
print(mixed_set)  

Output

{2.0, 'Nicholas', (1, 2, 3)}

All the elements of the above set belong to different types.
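As a rough illustration of the restriction on mutable elements, the sketch below tries to build a set that contains a list. The names valid_set, invalid_set, and error_message are illustrative, and the exact wording of the error message may differ between Python versions.

```python
# A tuple is hashable, so it is accepted as a set element.
valid_set = {(1, 2), "text", 3.5}
print(len(valid_set))  # 3

# A list is mutable and unhashable, so Python rejects it.
try:
    invalid_set = {[1, 2], "text"}
except TypeError as err:
    error_message = str(err)
    print("TypeError:", error_message)
```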

We can also create a set from a list (or any other iterable). This can be done by calling Python's built-in set() constructor. For example:

num_set = set([1, 2, 3, 4, 5, 6])  
print(num_set)  

Output

{1, 2, 3, 4, 5, 6}

As stated above, sets do not hold duplicate items. Suppose our list had duplicate items, as shown below:

num_set = set([1, 2, 3, 1, 2])  
print(num_set)  

Output

{1, 2, 3}

The set has removed the duplicates and kept only one copy of each item. The same happens when we create a set with curly braces. For example:

num_set = {1, 2, 3, 1, 2}  
print(num_set)  

Output

{1, 2, 3}

Again, the duplicates have been removed and only one copy of each item is kept.
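One common use of this behavior is removing duplicates from a list. The sketch below assumes a hypothetical list called readings; because a set is unordered, sorted() is used to get a predictable order back.

```python
readings = [3, 1, 3, 2, 1]

# Converting to a set drops duplicates but does not preserve order.
unique_readings = set(readings)

# sorted() returns a list, giving a predictable order for display.
print(sorted(unique_readings))  # [1, 2, 3]
```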

Creating an empty set is somewhat tricky. If you use empty curly braces {} in Python, you create an empty dictionary rather than an empty set. For example:

x = {}  
print(type(x))  

Output

<class 'dict'>  

As shown in the output, the type of variable x is a dictionary.

To create an empty set in Python, we must call set() without any arguments, as shown below:

x = set()  
print(type(x))  

Output

<class 'set'>  

The output shows that we have created a set.

Accessing Set Items

Because sets are unindexed, Python does not provide a way of accessing an individual set item by position. However, we can use a for loop to iterate through all the items of a set. For example:

months = set(["Jan", "Feb", "March", "Apr", "May", "June", "July", "Aug", "Sep", "Oct", "Nov", "Dec"])

for m in months:  
    print(m)

Output

March  
Feb  
Dec  
Jan  
May  
Nov  
Oct  
Apr  
June  
Aug  
Sep  
July  
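If a predictable iteration order is needed, one option is to iterate over sorted(set) rather than the set itself. This is only a sketch with a shortened set of months, and the name ordered_months is illustrative.

```python
months = {"Jan", "Feb", "March"}

# sorted() builds a new list; the set itself stays unordered.
ordered_months = sorted(months)
for m in ordered_months:
    print(m)
# Prints Feb, Jan, March (alphabetical order), one per line.
```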

We can also check for the presence of an element in a set using the in keyword as shown below:

months = set(["Jan", "Feb", "March", "Apr", "May", "June", "July", "Aug", "Sep", "Oct", "Nov", "Dec"])

print("May" in months)  

Output

True  

The code returned "True", which means that the item was found in the set. Similarly, searching for an element that doesn't exist in the set returns "False", as shown below:

months = set(["Jan", "Feb", "March", "Apr", "May", "June", "July", "Aug", "Sep", "Oct", "Nov", "Dec"])

print("Nicholas" in months)  

Output

False  

As expected, the code returned "False".
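The same membership test can be used inside conditions and comprehensions. The sketch below filters a hypothetical list named candidates against a small set; found and missing are illustrative names.

```python
months = {"Jan", "Feb", "March"}
candidates = ["Jan", "Nicholas", "March"]

# Keep only the candidates that are members of the set.
found = [c for c in candidates if c in months]
# "not in" is the negated form of the same test.
missing = [c for c in candidates if c not in months]

print(found)    # ['Jan', 'March']
print(missing)  # ['Nicholas']
```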

Adding Items to a Set

Python allows us to add new items to a set via the add() method. For example:

months = set(["Jan", "March", "Apr", "May", "June", "July", "Aug", "Sep", "Oct", "Nov", "Dec"])

months.add("Feb")  
print(months)  

Output

{'Oct', 'Dec', 'Feb', 'July', 'May', 'Jan', 'June', 'March', 'Sep', 'Aug', 'Nov', 'Apr'}

The item "Feb" has been successfully added to the set. The quotes are needed only because "Feb" is a string. When adding a number, we pass it without quotes. For example:

num_set = {1, 2, 3}  
num_set.add(4)  
print(num_set)  

Output

{1, 2, 3, 4}
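Two related behaviors are worth a quick sketch: adding an element that is already present has no effect, and the set's update() method, which is not covered above, adds every element of an iterable in one call.

```python
num_set = {1, 2, 3}

# Adding an element that is already present leaves the set unchanged.
num_set.add(3)
print(sorted(num_set))  # [1, 2, 3]

# update() adds every element of an iterable in one call.
num_set.update([4, 5])
print(sorted(num_set))  # [1, 2, 3, 4, 5]
```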

In the next section, we will be discussing how to remove elements from sets.

Removing Items from a Set

Python allows us to remove an item from a set, but not by index, since set elements are not indexed. Items can be removed using either the discard() or the remove() method.

Keep in mind that the discard() method will not raise an error if the item is not found in the set. However, if the remove() method is used and the item is not found, a KeyError will be raised.

Let us demonstrate how to remove an element using the discard() method:

num_set = {1, 2, 3, 4, 5, 6}  
num_set.discard(3)  
print(num_set)  

Output

{1, 2, 4, 5, 6}

The element 3 has been removed from the set.

Similarly, the remove() method can be used as follows:

num_set = {1, 2, 3, 4, 5, 6}  
num_set.remove(3)  
print(num_set)  

Output

{1, 2, 4, 5, 6}

Now, let us try to remove an element that does not exist in the set. Let's first use the discard() method:

num_set = {1, 2, 3, 4, 5, 6}  
num_set.discard(7)  
print(num_set)  

Output

{1, 2, 3, 4, 5, 6}

The above output shows that the set was not affected in any way. Now let's see what happens when we use the remove() method in the same scenario:

num_set = {1, 2, 3, 4, 5, 6}  
num_set.remove(7)  
print(num_set)  

Output

Traceback (most recent call last):  
  File "C:\Users\admin\sets.py", line 2, in <module>
    num_set.remove(7)
KeyError: 7  

The output shows that the method raised an error as we attempted to remove an element that is not in the set.
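When remove() is preferred but the element may be absent, the error can be handled explicitly. The sketch below shows two possible approaches; neither is the only way to do it.

```python
num_set = {1, 2, 3, 4, 5, 6}

# Option 1: catch the KeyError raised by remove().
try:
    num_set.remove(7)
except KeyError:
    print("7 is not in the set")

# Option 2: test membership before removing.
if 3 in num_set:
    num_set.remove(3)

print(sorted(num_set))  # [1, 2, 4, 5, 6]
```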

With the pop() method, we can remove and return an element. Since the elements are unordered, we cannot tell or predict the item that will be removed. For example:

num_set = {1, 2, 3, 4, 5, 6}  
print(num_set.pop())  

Output

1  

Note that pop() returns only the removed element, not the remaining ones. To see what is left, print the set after calling pop(). For example:

num_set = {1, 2, 3, 4, 5, 6}  
num_set.pop()  
print(num_set)  

Output

{2, 3, 4, 5, 6}

Those are the elements remaining in the set.
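Calling pop() on an empty set raises a KeyError, so a loop that drains a set should stop when the set becomes empty. A minimal sketch, with removed as an illustrative name:

```python
num_set = {1, 2, 3}
removed = []

# A non-empty set is truthy, so the loop stops once it is drained.
while num_set:
    removed.append(num_set.pop())

print(sorted(removed))  # [1, 2, 3]

# pop() on an empty set raises KeyError.
try:
    num_set.pop()
except KeyError:
    print("cannot pop from an empty set")
```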

The clear() method removes all elements from a set. For example:

num_set = {1, 2, 3, 4, 5, 6}  
num_set.clear()  
print(num_set)  

Output

set()  

The output is an empty set() with no elements in it.

Set Union

Suppose we have two sets, A and B. The union of the two sets is a set with all the elements from both sets. Such an operation is accomplished via the union() method.

Here is an example:

months_a = set(["Jan", "Feb", "March", "Apr", "May", "June"])  
months_b = set(["July", "Aug", "Sep", "Oct", "Nov", "Dec"])

all_months = months_a.union(months_b)  
print(all_months)  

Output

{'Oct', 'Jan', 'Nov', 'May', 'Aug', 'Feb', 'Sep', 'March', 'Apr', 'Dec', 'June', 'July'}

A union can also be performed on more than two sets, and all their elements will be combined into a single set. For example:

x = {1, 2, 3}  
y = {4, 5, 6}  
z = {7, 8, 9}

output = x.union(y, z)

print(output)  

Output

{1, 2, 3, 4, 5, 6, 7, 8, 9}

During the union operation, duplicates are ignored, and each element appears only once in the result. For example:

x = {1, 2, 3}  
y = {4, 3, 6}  
z = {7, 4, 9}

output = x.union(y, z)

print(output)  

Output

{1, 2, 3, 4, 6, 7, 9}

The | operator can also be used to find the union of two or more sets. For example:

months_a = set(["Jan","Feb", "March", "Apr", "May", "June"])  
months_b = set(["July", "Aug", "Sep", "Oct", "Nov", "Dec"])

print(months_a | months_b)  

Output

{'Feb', 'Apr', 'Sep', 'Dec', 'Nov', 'June', 'May', 'Oct', 'Jan', 'July', 'March', 'Aug'}

If you want to perform a union on more than two sets, separate the set names using the | operator. For example:

x = {1, 2, 3}  
y = {4, 3, 6}  
z = {7, 4, 9}

print(x | y | z)  

Output

{1, 2, 3, 4, 6, 7, 9}
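One difference between the two forms is that union() accepts any iterable, while the | operator requires both operands to be sets. Neither modifies the original set. A short sketch, with combined as an illustrative name:

```python
x = {1, 2, 3}

# union() accepts any iterable, such as a list.
combined = x.union([3, 4])
print(sorted(combined))  # [1, 2, 3, 4]

# The | operator requires both operands to be sets.
try:
    x | [3, 4]
except TypeError:
    print("| does not accept a list")

# Neither form modifies the original set.
print(sorted(x))  # [1, 2, 3]
```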

Set Intersection

Suppose you have two sets A and B. Their intersection is the set of elements that are common to both A and B.

The intersection operation can be performed with either the & operator or the intersection() method. For example:

x = {1, 2, 3}  
y = {4, 3, 6}

print(x & y)  

Output

{3}

The two sets have 3 as the common element. The same can also be achieved with the intersection() method:

x = {1, 2, 3}  
y = {4, 3, 6}

z = x.intersection(y)  
print(z)  

Output

{3}
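As with union, intersection can be applied to more than two sets. The sketch below also shows that sets with nothing in common produce an empty set.

```python
x = {1, 2, 3, 4}
y = {2, 3, 4, 5}
z = {3, 4, 5, 6}

# Both forms keep only the elements present in every set.
print(sorted(x.intersection(y, z)))  # [3, 4]
print(sorted(x & y & z))             # [3, 4]

# Sets with nothing in common produce an empty set.
print({1, 2} & {3, 4})  # set()
```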

In the next section, we will be discussing how to determine the difference between sets.

Set Difference

Suppose you have two sets A and B. The difference of A and B (A - B) is the set with all elements that are in A but not in B. Conversely, (B - A) is the set with all the elements in B but not in A.

To determine set differences in Python, we can use either the difference() method or the - operator. For example:

set_a = {1, 2, 3, 4, 5}  
set_b = {4, 5, 6, 7, 8}  
diff_set = set_a.difference(set_b)  
print(diff_set)  

Output

{1, 2, 3}

In the script above, only the first three elements of set_a are not present in set_b, hence they form our output. The minus - operator can also be used to find the difference between the two sets as shown below:

set_a = {1, 2, 3, 4, 5}  
set_b = {4, 5, 6, 7, 8}  
print(set_a - set_b)  

Output

{1, 2, 3}

The symmetric difference of sets A and B is the set of all elements that are in either A or B but not in both. It is determined using the symmetric_difference() method or the ^ operator. For example:

set_a = {1, 2, 3, 4, 5}  
set_b = {4, 5, 6, 7, 8}  
symm_diff = set_a.symmetric_difference(set_b)  
print(symm_diff)  

Output

{1, 2, 3, 6, 7, 8}

The symmetric difference can also be found as follows:

set_a = {1, 2, 3, 4, 5}  
set_b = {4, 5, 6, 7, 8}  
print(set_a ^ set_b)  

Output

{1, 2, 3, 6, 7, 8}
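The definition can be cross-checked by building the symmetric difference from the other set operations. This is a sketch for illustration; from_differences and from_union are hypothetical names.

```python
set_a = {1, 2, 3, 4, 5}
set_b = {4, 5, 6, 7, 8}

# Elements in exactly one of the two sets.
symm_diff = set_a ^ set_b

# Two equivalent ways of building the same result.
from_differences = (set_a - set_b) | (set_b - set_a)
from_union = (set_a | set_b) - (set_a & set_b)

print(symm_diff == from_differences == from_union)  # True

# Unlike the symmetric difference, plain difference depends on order.
print(sorted(set_b - set_a))  # [6, 7, 8]
```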

Set Comparison

We can compare sets depending on the elements they have. This way, we can tell whether a set is a superset or a subset of another set. The result from such a comparison will be either True or False.

To check whether set A is a subset of set B, we can use the following operation:

A <= B  

To check whether B is a superset of A, we can use the following operation:

B >= A  

For example:

months_a = set(["Jan", "Feb", "March", "Apr", "May", "June"])  
months_b = set(["Jan", "Feb", "March", "Apr", "May", "June", "July", "Aug", "Sep", "Oct", "Nov", "Dec"])

subset_check = months_a <= months_b  
superset_check = months_b >= months_a

print(subset_check)  
print(superset_check)  

Output

True  
True  

The subset and superset can also be checked using issubset() and issuperset() methods as shown below:

months_a = set(["Jan","Feb", "March", "Apr", "May", "June"])  
months_b = set(["Jan","Feb", "March", "Apr", "May", "June", "July", "Aug", "Sep", "Oct", "Nov", "Dec"])

subset_check = months_a.issubset(months_b)  
superset_check = months_b.issuperset(months_a)

print(subset_check)  
print(superset_check)  

Output

True  
True  
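Python also provides the strict comparison operators < and >, which test for a proper subset or superset, meaning the two sets must not be equal. A small sketch with illustrative sets a, b, and c:

```python
a = {1, 2, 3}
b = {1, 2, 3}
c = {1, 2, 3, 4}

print(a <= b)  # True: every element of a is in b
print(a < b)   # False: a equals b, so it is not a proper subset
print(a < c)   # True: c has an extra element
print(c > a)   # True: c is a proper superset of a
```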

In the next section, we will discuss some of the most commonly used set methods provided by Python that we have not already discussed.

Set Methods

Python comes with numerous built-in set methods, including the following:

copy()

This method returns a copy of the set in question. For example:

string_set = {"Nicholas", "Michelle", "John", "Mercy"}  
x = string_set.copy()

print(x)  

Output

{'John', 'Michelle', 'Nicholas', 'Mercy'}

The output shows that x is a copy of the set string_set.
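The copy is independent of the original, which is the difference between copy() and plain assignment. The sketch below uses the illustrative names alias and snapshot.

```python
original = {"Nicholas", "Michelle"}

alias = original            # second name for the same set object
snapshot = original.copy()  # separate set with the same elements

original.add("John")

print("John" in alias)     # True: alias sees the change
print("John" in snapshot)  # False: the copy is unaffected
```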

isdisjoint()

This method checks whether the sets in question have an intersection or not. If the sets don't have common items, this method returns True, otherwise it returns False. For example:

names_a = {"Nicholas", "Michelle", "John", "Mercy"}  
names_b = {"Jeff", "Bosco", "Teddy", "Milly"}

x = names_a.isdisjoint(names_b)  
print(x)  

Output

True  

The two sets don't have common items, hence the output is True.
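For contrast, here is a sketch of the False case, using a hypothetical set names_c that shares one name with names_a. It also shows that isdisjoint() agrees with testing for an empty intersection.

```python
names_a = {"Nicholas", "Michelle", "John", "Mercy"}
names_c = {"John", "Teddy"}

# "John" appears in both sets, so they are not disjoint.
print(names_a.isdisjoint(names_c))  # False

# isdisjoint() agrees with testing for an empty intersection.
print(names_a.isdisjoint(names_c) == (len(names_a & names_c) == 0))  # True
```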

len()

Strictly speaking, len() is a built-in function rather than a set method. It returns the length of a set, which is the total number of elements in the set. For example:

names_a = {"Nicholas", "Michelle", "John", "Mercy"}

print(len(names_a))  

Output

4  

The output shows that the set has a length of 4.

Python Frozen Set

frozenset is a built-in class with the characteristics of a set, except that its elements cannot be changed once they have been assigned. Tuples can be seen as immutable lists, while frozensets can be seen as immutable sets.

Sets are mutable and unhashable, which means we cannot use them as dictionary keys. Frozensets are hashable and we can use them as dictionary keys.

To create frozensets, we call the frozenset() constructor. Let us create two frozensets, X and Y:

X = frozenset([1, 2, 3, 4, 5, 6])  
Y = frozenset([4, 5, 6, 7, 8, 9])

print(X)  
print(Y)  

Output

frozenset({1, 2, 3, 4, 5, 6})  
frozenset({4, 5, 6, 7, 8, 9})  

Frozensets support the set methods that do not modify the set, such as copy(), difference(), symmetric_difference(), isdisjoint(), issubset(), intersection(), issuperset(), and union().
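The sketch below illustrates both points made in this section: a frozenset can be used as a dictionary key, and methods that would modify it, such as add(), are not available. The dictionary labels and the variable combined are illustrative.

```python
X = frozenset([1, 2, 3])

# A frozenset is hashable, so it can serve as a dictionary key.
labels = {X: "small numbers"}
print(labels[frozenset([3, 2, 1])])  # small numbers

# Mutating methods such as add() are not defined on frozenset.
try:
    X.add(4)
except AttributeError:
    print("frozenset has no add() method")

# Non-mutating operations return a new frozenset.
combined = X | {4}
print(type(combined).__name__, sorted(combined))  # frozenset [1, 2, 3, 4]
```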

Conclusion

This article provides an introduction to sets in Python. Python sets closely follow the mathematical notion of a set: an unordered collection of distinct items. The set itself is mutable, so we can add and remove elements freely, but the elements themselves must be hashable, which in practice means immutable. Unlike in most data structures, set elements are not indexed, so operations that target an element by position are not possible.



from Planet Python