Sets in Python

Sets are one of the more interesting collection types in Python. Available as a built-in type since version 2.4, a set is an unordered collection of unique, hashable elements that supports operations corresponding to mathematical set theory. The set type itself is mutable; its immutable counterpart is frozenset.

Sets can be created in two ways.

>>> x = {1,2,3,4,5,3,2,6}        # New syntax - 2.7, 3.x
>>> x = set([1,2,3,4,5,3,2,6])   # Built-in, all versions
>>> x
{1, 2, 3, 4, 5, 6}

The new syntax looks clean and mirrors the mathematical notation for a set. Note that the literal syntax only works for non-empty sets; to create an empty set you still need to call the built-in set() constructor.

Note: The set type was initially available through an external module, sets, added in version 2.3. That module has been deprecated since version 2.6. Before the built-in type arrived in 2.4, you had to import the module to use sets.

from sets import Set

The built-in set/frozenset types now replace the old sets module.

# You cannot define an empty set like this. This is actually an empty dictionary
>>> x = {}
>>> x
{}
>>> type(x)
<class 'dict'>

# To define an empty set use the following
>>> x = set()
>>> type(x)
<class 'set'>
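
One thing worth knowing when creating sets is that the elements have to be hashable. The following is a small sketch of what that means in practice; the variable names are only for illustration.

```python
# Any iterable can seed a set; duplicates collapse to one element.
letters = set("hello")            # four distinct letters, in no particular order
pairs = {(1, 2), (1, 2), (3, 4)}  # tuples are hashable, so they can be elements

# Mutable containers such as lists are unhashable and are rejected.
try:
    bad = {[1, 2], [3, 4]}
    error = None
except TypeError as exc:
    error = str(exc)

print(len(letters), len(pairs))   # 4 2
print(error)                      # unhashable type: 'list'
```

If you need a collection of collections, tuples or frozensets are the usual choice for the inner items.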

Uses of the set type

Listing the benefits of something out of context can be difficult, but let us give it a shot. Because sets store only one copy of each item, they can be used to filter duplicates from other collections by converting a list to a set and back. Keep in mind that a set is unordered, so the original order of the list is not guaranteed to survive the round trip.

>>> l = [1,2,2,3,4,5,6]
>>> l
[1, 2, 2, 3, 4, 5, 6]
>>> set(l)
{1, 2, 3, 4, 5, 6}
>>> l = list(set(l))
>>> l
[1, 2, 3, 4, 5, 6]
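
Since a set does not promise any ordering, the round trip through set() may shuffle the list. If the order of first appearance matters, one possible alternative is dict.fromkeys(). This is a sketch that assumes Python 3.7 or later, where dictionaries keep insertion order; dedupe_keep_order is a made-up helper name.

```python
def dedupe_keep_order(items):
    """Return a list with duplicates removed, keeping first occurrences in order."""
    # Assumes Python 3.7+, where dict preserves insertion order.
    return list(dict.fromkeys(items))

words = ['pear', 'apple', 'pear', 'fig', 'apple']
print(dedupe_keep_order(words))   # ['pear', 'apple', 'fig']
print(sorted(set(words)))         # ['apple', 'fig', 'pear']
```

The second line shows the other common option: go through a set and sort the result when any consistent order will do.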

You can find the characters that appear in one string but not in another using set difference.

>>> set('bill') - set('jill')
{'b'}
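
Note that the difference depends on which side you subtract from, and that repeated letters and letter order play no part. A quick sketch with the same two strings:

```python
a, b = set('bill'), set('jill')

only_in_a = a - b          # letters of 'bill' missing from 'jill'
only_in_b = b - a          # letters of 'jill' missing from 'bill'
either_not_both = a ^ b    # letters found in exactly one of the strings

print(only_in_a, only_in_b, sorted(either_not_both))   # {'b'} {'j'} ['b', 'j']
```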

Besides filtering and differences, you can do the usual mathematical set operations.

>>> football_team = set(['john', 'jack', 'mark', 'tony'])  # set A
>>> football_team
{'jack', 'tony', 'john', 'mark'}
>>> baseball_team = set(['mark', 'peter', 'tony', 'ruth']) # set B
>>> baseball_team
{'peter', 'tony', 'ruth', 'mark'}

# Difference - Which football players are not in the baseball team
>>> football_team - baseball_team
{'jack', 'john'}

# Difference - Which baseball players are not in the football team
>>> baseball_team - football_team
{'peter', 'ruth'}

# Union - Show players from both teams
>>> baseball_team | football_team
{'john', 'mark', 'peter', 'tony', 'ruth', 'jack'}

# Intersection - Which players are common to both teams
>>> baseball_team & football_team
{'tony', 'mark'}

# Symmetric difference (XOR) - Which players are in exactly one of the two teams
>>> baseball_team ^ football_team
{'john', 'peter', 'ruth', 'jack'}

# Superset - Is the baseball team a proper superset of the football team?
>>> baseball_team > football_team
False

# Subset - Is the baseball team a proper subset of the football team?
>>> baseball_team < football_team
False

# Disjoint - Return True if a set has no elements in common with the other.
# Sets are disjoint if and only if their intersection is the empty set.
>>> baseball_team.isdisjoint(football_team)
False

# Combine both teams, removing duplicates
>>> all_teams = baseball_team | football_team
>>> baseball_team < all_teams
True

# Test for set membership
>>> 'john' in football_team
True

# Test for non-membership
>>> 'peter' not in football_team
True
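
The operators above are tied together by a few identities from set theory. The sketch below checks some of them on the same two teams; the variable names union, common and one_team_only are just for this example.

```python
football_team = {'john', 'jack', 'mark', 'tony'}
baseball_team = {'mark', 'peter', 'tony', 'ruth'}

union = football_team | baseball_team
common = football_team & baseball_team
one_team_only = football_team ^ baseball_team

# Symmetric difference is the union minus the intersection...
assert one_team_only == union - common
# ...and also the union of the two one-sided differences.
assert one_team_only == (football_team - baseball_team) | (baseball_team - football_team)
# Each team is a subset of the union, and a proper one here
# because the other team contributes extra players.
assert football_team <= union and football_team < union
# Two sets are disjoint exactly when their intersection is empty.
assert football_team.isdisjoint(baseball_team) == (len(common) == 0)

print(sorted(common))          # ['mark', 'tony']
print(sorted(one_team_only))   # ['jack', 'john', 'peter', 'ruth']
```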

You can also use the set literal syntax (Python 2.7 and 3.x) when creating the sets.

>>> football_team = {'john', 'jack', 'mark', 'tony'}

The operations above can also be written as method calls, although I prefer the operator notation.

>>> baseball_team.union(football_team)
{'john', 'mark', 'peter', 'tony', 'ruth', 'jack'}

>>> baseball_team.difference({'john', 'jack', 'mark', 'tony'})
{'peter', 'ruth'}

>>> baseball_team.union({'john', 'jack', 'mark', 'tony'})
{'john', 'mark', 'peter', 'tony', 'ruth', 'jack'}

>>> baseball_team.intersection({'john', 'jack', 'mark', 'tony'})
{'tony', 'mark'}

>>> baseball_team.symmetric_difference({'john', 'jack', 'mark', 'tony'})
{'john', 'peter', 'ruth', 'jack'}

>>> baseball_team.issubset({'john', 'jack', 'mark', 'tony'})
False

>>> baseball_team.issuperset({'john', 'jack', 'mark', 'tony'})
False
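
There is one practical difference between the two notations: the named methods accept any iterable, while the operators need a set (or frozenset) on both sides. A small sketch, using a list called football_list for illustration:

```python
baseball_team = {'mark', 'peter', 'tony', 'ruth'}
football_list = ['john', 'jack', 'mark', 'tony']   # a list, not a set

# Named methods accept any iterable.
both = baseball_team.union(football_list)
common = baseball_team.intersection(football_list)

# Operators require a set (or frozenset) on both sides.
try:
    baseball_team | football_list
    operator_error = None
except TypeError as exc:
    operator_error = type(exc).__name__

print(sorted(common))    # ['mark', 'tony']
print(operator_error)    # TypeError
```

So the method form can save you a set() conversion when the other operand is a list or a generator.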

Adding and deleting elements of a set

We can add and delete elements of a set using various built-in methods.

>>> football_team = {'john', 'jack', 'mark', 'tony'}
# Add a new member to the set
>>> football_team.add('peter')
>>> football_team
{'mark', 'jack', 'peter', 'john', 'tony'}

# Remove an element from a set
>>> football_team.remove('tony')
>>> football_team
{'mark', 'jack', 'peter', 'john'}

# If an element does not exist, the remove() method raises a KeyError
>>> football_team.remove('william')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'william'

# The discard() method is the same as remove(), except it does not raise
# an error if the element is not found.
>>> football_team.discard('william')
>>> football_team
{'mark', 'jack', 'peter', 'john'}

# pop() removes and returns an arbitrary element
>>> football_team.pop()
'mark'
>>> football_team
{'jack', 'peter', 'john'}
>>> football_team.pop()
'jack'
>>> football_team
{'peter', 'john'}
>>> football_team.pop()
'peter'
>>> football_team
{'john'}
>>> football_team.pop()
'john'
>>> football_team
set()

# Using pop() on an empty set raises a KeyError
>>> football_team.pop()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'pop from an empty set'

We can use the built-in len() function to test for emptiness before calling pop(). An empty set is also falsy, so "if football_team:" works as well.

if len(football_team) > 0:
    football_team.pop()
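
The same emptiness check can drive a loop that drains a set one element at a time. Below is a rough sketch; pop_or_default is a hypothetical helper written for this example, not something the set type provides.

```python
def pop_or_default(items, default=None):
    """Pop an arbitrary element, or return `default` if the set is empty."""
    return items.pop() if items else default

team = {'peter', 'john'}
removed = []
while team:                     # an empty set is falsy, so the loop ends on its own
    removed.append(team.pop())

print(sorted(removed))          # ['john', 'peter']
print(pop_or_default(team))     # None
```

The order in which pop() hands back elements is arbitrary, which is why the result is sorted before printing.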

Types of Set

There are currently two built-in set types, set and frozenset. The set we have used so far in this post is the mutable kind, because its contents can be changed using methods like add() and remove(). The other type, frozenset, is immutable: its contents cannot be altered after it is created, so adding and deleting elements is not permitted.

>>> football_team = frozenset(['john', 'jack', 'mark', 'tony'])
>>> football_team
frozenset({'mark', 'jack', 'john', 'tony'})

>>> football_team.pop()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'frozenset' object has no attribute 'pop'

>>> football_team.add('peter')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'frozenset' object has no attribute 'add'
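
What you get in exchange for immutability is hashability: a frozenset can be used as a dictionary key or as a member of another set, which a plain set cannot. A short sketch follows; the home_ground mapping and its field names are made up for illustration.

```python
football_team = frozenset(['john', 'jack', 'mark', 'tony'])
baseball_team = frozenset(['mark', 'peter', 'tony', 'ruth'])

# frozensets are hashable, so they work as dictionary keys and set members.
home_ground = {football_team: 'north field', baseball_team: 'south field'}
teams = {football_team, baseball_team}

# Operations that do not mutate still work and return a new frozenset.
both = football_team | baseball_team

# Lookup depends on the members only, not on the order they were given in.
print(home_ground[frozenset(['tony', 'mark', 'jack', 'john'])])   # north field
print(type(both).__name__, len(both))                              # frozenset 6
```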
