What is the Python Collections Module?
The Python collections module provides access to six main specialized container datatypes; namedtuple()
, deque
, ChainMap
, Counter
, OrderedDict
and defaultdict
.
Syntax: import collections
This guide walks you through everything you need to work with the Python collections module, with clear, commented code samples.
namedtuple() Container
A named tuple allows individual elements of a tuple to be referenced by name as well as by index. They can be used to assign meaning to each field, require no more memory than a regular tuple and are immutable.
Syntax: namedtuple(typename, field_names)
NamedTuple Example 1: Creation and reference by field namefrom collections import namedtuple # Create a namedtuple Car = namedtuple('Car', ['make', 'model', 'doors']) # Method 1 Car = namedtuple('Car', 'make, model, doors') # Method 2 cars_list = [ Car('BMW','X6','5'), Car('Mercedes-Benz','GLC-Class','5'), Car('Tesla','Model X','5') ] print("Cars named tuples using index references:") print("-----------------------------------------") for car in cars_list: print(car[0], car[1], car[2]) print("\nCars named tuples using name references:") print("-----------------------------------------") for car in cars_list: print(car.make, car.model, car.doors)
Output:
Cars named tuples using index references:
-----------------------------------------
BMW X6 5
Mercedes-Benz GLC-Class 5
Tesla Model X 5
Cars named tuples using name references:
-----------------------------------------
BMW X6 5
Mercedes-Benz GLC-Class 5
Tesla Model X 5
NamedTuple Example 2: Using _replace to replace the value in a field
from collections import namedtuple # Create a namedtuple Car = namedtuple('Car', ['make', 'model', 'doors']) # Method 1 Car = namedtuple('Car', 'make, model, doors') # Method 2 cars_list = [ Car('BMW','X6','5'), Car('Mercedes-Benz','GLC-Class','5'), Car('Tesla','Model X','5') ] example = cars_list[0] print(example) example = example._replace(doors='6') print(example)
Output:
Car(make='BMW', model='X6', doors='5')
Car(make='BMW', model='X6', doors='6')
deque Container
A deque is a double-ended queue that’s optimized for fast appends and pops from each end. This is a significant advantage over lists, which are generally optimized for fast fixed-length operations and random access to elements.
If a value is specified for the maxlen
parameter, the deque is limited to this length. If new items are added, the same number of items are removed from the opposite end.
Most deque methods are the same as you get for lists, such as append(x)
, clear()
and count()
. But you also have methods unique to deques, which are:
appendleft(x)
extendleft(iterable)
popleft()
rotate(n=1)
A deque also supports slicing.
Deque Syntax: deque([iterable[, maxlen]])
A full list of deque methods can be found in the official Python 3 documentation. You can also find examples of standard list methods in our Python lists guide.
Below, you’ll find examples of the methods that are specific to Python deques.
Example 1: Create a new dequefrom collections import deque print("Using deque iteration:") deque1 = deque('abc') for letter in deque1: print(letter.upper()) print("\n---------------------\n") print("Using list iteration:") list1 = list('abc') for letter in list1: print(letter.upper())
Output:
Using deque iteration:
A
B
C
---------------------
Using list iteration:
A
B
C
Example 2: Attempting to add items to a full deque
from collections import deque # Create a full deque (maxlen=3, already has 3 elements) deque1 = deque('abc',maxlen=3) print("deque1 before append: \t{}".format(deque1)) # Add a new item 'x' to deque1 deque1.append('x') print("deque1 after append: \t{}".format(deque1))
Output:
deque1 before append: deque(['a', 'b', 'c'], maxlen=3)
deque1 after append: deque(['b', 'c', 'x'], maxlen=3)
Example 3: Using appendleft(x), extendleft(x), popleft() and rotate(n=1)
from collections import deque # Testing appendleft(), extendleft(), popleft() and rotate() deque1 = deque('abc') print("deque1: \t\t\t{}".format(deque1)) deque1.appendleft('x') print("deque1 after appendleft('x'): \t{}".format(deque1)) # Right rotation (1) deque1.rotate(1) print("deque1 after rotate(1): \t{}".format(deque1)) # Left rotation (-1) deque1.rotate(-1) print("deque1 after rotate(-1): \t{}".format(deque1)) deque1.popleft() print("deque1 after popleft(): \t{}".format(deque1)) # extendleft() reverses the input order deque1.extendleft('def') print("deque1 after extendleft(): \t{}".format(deque1))
Output:
deque1: deque(['a', 'b', 'c'])
deque1 after appendleft('x'): deque(['x', 'a', 'b', 'c'])
deque1 after rotate(1): deque(['c', 'x', 'a', 'b'])
deque1 after rotate(-1): deque(['x', 'a', 'b', 'c'])
deque1 after popleft(): deque(['a', 'b', 'c'])
deque1 after extendleft(): deque(['f', 'e', 'd', 'a', 'b', 'c'])
ChainMap objects
The ChainMap
class groups multiple mappings objects together in a single view. A ChainMap
holds references to the underlying mapping objects, so if one of them changes, the change is reflected in the ChainMap
view.
Syntax: ChainMap(*maps)
Example 1: Creating a ChainMap()from collections import ChainMap map1 = {'a':1, 'b':2, 'c':3} map2 = {'c':4, 'd':4, 'e':5} # Perform a ChainMap() between two dicts chain = ChainMap(map1, map2) print("Keys and Values in ChainMap:") for k, v in chain.items(): print("Key: {}, Value: {}".format(k, v))
Output:
Keys and Values in ChainMap:
Key: a, Value: 1
Key: c, Value: 3
Key: e, Value: 5
Key: d, Value: 4
Key: b, Value: 2
The child mappings are searched for in the order they are passed to the constructor. This is why, although map2
assigns the value of 4
to key c
, because it appears after map1
in ChainMap(map1, map2)
, it’s the value of 3
from map1
that’s returned.
We can prove this by reversing the order in which we pass map1
and map2
to ChainMap
.
from collections import ChainMap map1 = {'a':1, 'b':2, 'c':3} map2 = {'c':4, 'd':4, 'e':5} # Perform a ChainMap() between two dicts chain = ChainMap(map2, map1) print("Keys and Values in ChainMap:") for k, v in chain.items(): print("Key: {}, Value: {}".format(k, v))
Output:
Keys and Values in ChainMap:
Key: a, Value: 1
Key: c, Value: 4
Key: e, Value: 5
Key: d, Value: 4
Key: b, Value: 2
In the output above, we now have the value of 4
assigned to key c
. It’s possible to achieve the same result without manually changing the order map1
and map2
are passed to ChainMap
, using reversed()
.
from collections import ChainMap map1 = {'a':1, 'b':2, 'c':3} map2 = {'c':4, 'd':4, 'e':5} # Perform a ChainMap() between two dicts chain = ChainMap(map1, map2) print("Maps before reversal:\n{}\n".format(chain.maps)) chain.maps = list(reversed(chain.maps)) print("Maps after reversal:\n{}\n".format(chain.maps)) print("Keys and Values in ChainMap:") for k, v in chain.items(): print("Key: {}, Value: {}".format(k, v))
Output:
Maps before reversal:
[{'a': 1, 'b': 2, 'c': 3}, {'c': 4, 'd': 4, 'e': 5}]
Maps after reversal:
[{'c': 4, 'd': 4, 'e': 5}, {'a': 1, 'b': 2, 'c': 3}]
Keys and Values in ChainMap:
Key: a, Value: 1
Key: c, Value: 4
Key: e, Value: 5
Key: d, Value: 4
Key: b, Value: 2
Counter Container
One way to count items in a tuple, string or list is to use their count()
method. But this only returns the count of a single value. A faster alternative to returning counts of multiple items in a single iterable or mapping is to create a Counter
.
Counter objects in the Python collections module have the following properties:
dict
subclass- Counters have the same methods as a
dict
, except thefromkeys()
method, which isn’t available. Theupdate()
method of a Counter also combines counts, rather than replacing the values - Counters are an unordered collection
- Values are stored as dictionary keys, counts as dictionary values
- Counter returns a zero count if item doesn’t exist. No KeyError
c1 = Counter() # Create an empty counter c2 = Counter("test_string") # Create a counter from an iterable c3 = Counter({"a": 3, "b": 4}) # Create a counter from a mapping c4 = Counter(first=3, second=2) # Create a counter from keyword args # Display the contents of each Counter object print("c1: {0}\nc2: {1}\nc3: {2}\nc4: {3}".format(c1, c2, c3, c4))
Output:
c1: Counter()
c2: Counter({'t': 3, 's': 2, 'e': 1, '_': 1, 'r': 1, 'i': 1, 'n': 1, 'g': 1})
c3: Counter({'b': 4, 'a': 3})
c4: Counter({'first': 3, 'second': 2})
Counter Example 1: Count occurrences of letters in a list
# Count occurrences of letters in a list cnt = Counter() letters = ["a", "a", "b", "b", "b", "c", "d"] # Loop through the list and count each letter for letter in letters: cnt[letter] += 1 # Display the Counter print("cnt: {0}".format(cnt))
Output:
Counter({'b': 3, 'a': 2, 'c': 1, 'd': 1})
Counter Example 2: most_common()
The most_common()
method returns a list of (value, count) tuples, ordered from the most common to the least common. Only values with a count greater than zero will be returned.
Syntax: most_common([n])
# Create a list with multiple occurrences of letters letters = ["a", "a", "b", "b", "b", "c", "d"] print("letters = {0}".format(letters)) # Find the three most common letters in the list Counter(letters).most_common(3) # Check most_common() returns all counts Counter(letters).most_common()
Output:
[('b', 3), ('a', 2), ('c', 1), ('d', 1)]
Counter Example 3: subtract()
The subtract([iterable])
method subtracts counts of elements in one Counter from another. For example, if a = Counter(a=3)
and b = Counter(a=1)
, a.subtract(b)
would update a
to be Counter({'a': 2})
.
Syntax: subtract([iterable-or-mapping])
# Create two counters c1 = Counter(a=3, b=4, c=2, d=7, e=0) c2 = Counter(a=3, b=1, c=1, d=3, e=5) print("c1: {0}".format(c1)) print("c2: {0}\n".format(c2)) # Subtract c2 from c1 c1.subtract(c2) print("c1.subtract(c2)\n{0}\n{1}".format('-'*15, c1))
Output:
c1: Counter({'d': 7, 'b': 4, 'a': 3, 'c': 2, 'e': 0})
c2: Counter({'e': 5, 'a': 3, 'd': 3, 'b': 1, 'c': 1})
c1.subtract(c2)
---------------
Counter({'d': 4, 'b': 3, 'c': 1, 'a': 0, 'e': -5})
The subtract operator -
can also used to subtract counts. Important: Only positive counts will be retained. This means we won’t get the zero count for ‘a’ or the -5 count for ‘e’ that we saw when using the subtract()
method in the code above.
# Create two counters c1 = Counter(a=3, b=4, c=2, d=7, e=0) c2 = Counter(a=3, b=1, c=1, d=3, e=5) print("c1: {0}".format(c1)) print("c2: {0}\n".format(c2)) # Subtract c2 from c1 using the '-' operator c1 - c2 print("c1 - c2\n{0}\n{1}".format('-'*15, c1 - c2))
Output:
c1: Counter({'d': 7, 'b': 4, 'a': 3, 'c': 2, 'e': 0})
c2: Counter({'e': 5, 'a': 3, 'd': 3, 'b': 1, 'c': 1})
c1 - c2
---------------
Counter({'d': 4, 'b': 3, 'c': 1})
Counter Example 4: update()
The update()
method adds counts of elements in one Counter to another. For example, if a = Counter(a=3)
and b = Counter(a=1)
, a.update(b)
would update a
to be Counter({'a': 4})
.
Syntax: update([iterable-or-mapping])
# Create two counters c1 = Counter(a=3, b=4, c=2, d=7, e=0) c2 = Counter(a=3, b=1, c=1, d=3, e=5) print("c1: {0}".format(c1)) print("c2: {0}\n".format(c2)) # Add c2 to c1 c1.update(c2) print("c1.update(c2)\n{0}\n{1}".format('-'*15, c1))
Output:
c1: Counter({'d': 7, 'b': 4, 'a': 3, 'c': 2, 'e': 0})
c2: Counter({'e': 5, 'a': 3, 'd': 3, 'b': 1, 'c': 1})
c1.update(c2)
---------------
Counter({'d': 10, 'a': 6, 'b': 5, 'e': 5, 'c': 3})
The addition operator +
can also used to add multiple counters together.
# Create two counters c1 = Counter(a=3, b=4, c=2, d=7, e=0) c2 = Counter(a=3, b=1, c=1, d=3, e=5) print("c1: {0}".format(c1)) print("c2: {0}\n".format(c2)) # Add c2 to c1 using the '+' operator c1 + c2 print("c1 + c2:\n{0}\n{1}".format('-'*15, c1 + c2))
Output:
c1: Counter({'d': 7, 'b': 4, 'a': 3, 'c': 2, 'e': 0})
c2: Counter({'e': 5, 'a': 3, 'd': 3, 'b': 1, 'c': 1})
c1 + c2
---------------
Counter({'d': 10, 'a': 6, 'b': 5, 'e': 5, 'c': 3})
OrderedDict Container
An OrderedDict
remembers its insertion order, which means you can create a sorted dictionary. However, as of Python 3.7, regular Python dictionaries also keep their insertion order.
So, when should you use an OrderedDict
?
If you need an ordered dictionary in code that runs using a Python version older than 3.7, you’ll need to use OrderedDict
. Also, OrderedDict
offers several useful methods that standard Python dictionaries don’t have, such as move_to_end
.
Another reason to use the OrderedDict
is if you want to test equality. When comparing multiple instances of OrderedDict
, the order of (key, value) pairs must be the same to be considered equal. But with standard Python dictionaries, as long as the instances contain the same (key, value) pairs, they will be considered equal, irrespective of their order.
Syntax: OrderedDict(mapping)
OrderedDict Example 1: Create an OrderedDictfrom collections import OrderedDict # OrderedDict creation dict1 = OrderedDict({'a':1, 'b':2, 'd':4, 'c':3}) # Create a sorted OrderedDict, sorted by key dict1 = OrderedDict(sorted(dict1.items(), key= lambda x: x[0])) print("Sorted OrderedDict\n{0}\n{1}".format("-"*18,dict1))
Output:
Sorted OrderedDict
------------------
OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4)])
OrderedDict Example 2: OrderedDict equality vs. dict equality
from collections import OrderedDict # Create two standard dictionaries dict1 = {'a':1, 'c':3, 'b':2} dict2 = {'a':1, 'b':2, 'c':3} # Now test their equality, when (key, value) pair order doesn't match print("Using dict, dict1 == dict2: {0}".format(dict1 == dict2)) # Perform the equality check between two OrderedDict instances dict1 = OrderedDict({'a':1, 'c':3, 'b':2}) dict2 = OrderedDict({'a':1, 'b':2, 'c':3}) print("Using OrderedDict, dict1 == dict2: {0}".format(dict1 == dict2))
Output:
Using dict, dict1 == dict2: True
Using OrderedDict, dict1 == dict2: False
defaultdict Container
Using a defaultdict
, if a key is not found in the dictionary, a new entry is created. This is in contrast to a standard Python dictionary, which would throw a KeyError
.
A defaultdict
is a subclass of the built-in dict
class, and supports the _missing_
method, in addition to the standard dict
operations.
The default_factory
function is useful for specifying what type of dictionary values you want to create if the key does not exist. Some examples are a list
, int
or set
.
Syntax: defaultdict([default_factory])
Example 1: Creating a defaultdictfrom collections import defaultdict # Create a list defaultdict list_dict = defaultdict(list) # Attempt to access a key that doesn't exist val = list_dict[1] # Check this key has now been created print("defaultdict(list) = {}".format(list_dict)) # Create a set defaultdict set_dict = defaultdict(set) # Attempt to access a key that doesn't exist val = set_dict[1] # Check this key has now been created print("defaultdict(set) = {}".format(set_dict)) # Create an integer defaultdict int_dict = defaultdict(int) # Attempt to access a key that doesn't exist val = int_dict[1] # Check this key has now been created print("defaultdict(int) = {}".format(int_dict))
Output:
defaultdict(list) = defaultdict(, {1: []})
defaultdict(set) = defaultdict(, {1: set()})
defaultdict(int) = defaultdict(, {1: 0})
Example 2: Using a defaultdict in counting
from collections import defaultdict count = defaultdict(int) letters = "a b b c a b a c c d d b a".split() for letter in letters: count[letter] +=1 print(count)
Output:
defaultdict(, {'a': 4, 'b': 4, 'c': 3, 'd': 2})