Python’s collections.defaultdict in Action
Let's have a look at Python's collections module, which is part of the standard library. According to the docs, the collections module "implements specialized container datatypes providing alternatives to Python’s general purpose built-in containers".
One of the most useful bits of collections is defaultdict, a dict subclass that makes your life a lot easier. Say you are going through a list of people and their hometowns, and you want to find out which hometowns are associated with each name (very interesting!).
Using a normal dictionary you might do something along the following lines:
    people = [
        ('Jane', 'Austin'),
        ('Jane', 'SF'),
        ('Jane', 'Dallas'),
        ('Mark', 'Chicago'),
        ('David', 'St. Louis'),
        ('David', 'Miami'),
    ]

    normal_dict = {}
    for name, home_city in people:
        # The first time we see a name we have to create its list ourselves.
        if name not in normal_dict:
            normal_dict[name] = [home_city]
        else:
            normal_dict[name].append(home_city)

    print(normal_dict)
    # {'Jane': ['Austin', 'SF', 'Dallas'], 'Mark': ['Chicago'], 'David': ['St. Louis', 'Miami']}
    # (keys appear in insertion order on Python 3.7+)

    assert 'chicken' not in normal_dict
With defaultdict from the collections module, you could do it the following way:
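    import collections

    # list is the default factory: a missing key starts out as an empty list.
    cities_by_name = collections.defaultdict(list)
    for name, home_city in people:
        cities_by_name[name].append(home_city)

    print(cities_by_name)
    # defaultdict(<class 'list'>, {'Jane': ['Austin', 'SF', 'Dallas'], 'Mark': ['Chicago'], 'David': ['St. Louis', 'Miami']})

    # Looking up a missing key returns the default (and stores it under that key).
    print(cities_by_name['chicken'])
    # []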
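If you used a normal dictionary and tried appending to a key that does not exist in the dictionary, Python would raise a KeyError. When a defaultdict does not find the key, it calls the factory you passed in (list in this case), stores the result under that key, and returns it. Note that this means simply looking up a missing key with square brackets adds it to the dictionary. You can also use defaultdict with sets and ints.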
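    # set as the default factory: each name collects its distinct cities.
    unique_cities = collections.defaultdict(set)
    for name, home_city in people:
        unique_cities[name].add(home_city)

    print(unique_cities)
    # defaultdict(<class 'set'>, {'Jane': {'SF', 'Dallas', 'Austin'}, 'Mark': {'Chicago'}, 'David': {'St. Louis', 'Miami'}})
    # (the order of elements inside each set may vary)
    print(unique_cities['chicken'])
    # set()

    # int as the default factory: missing keys start at 0, handy for counting.
    counts = collections.defaultdict(int)
    for name, city in people:
        counts[name] += 1

    print(counts)
    # defaultdict(<class 'int'>, {'Jane': 3, 'Mark': 1, 'David': 2})
    print(counts['chicken'])
    # 0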
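To make the missing-key behaviour a bit more concrete, here is a minimal sketch that contrasts a plain dict with a defaultdict. The variable names (plain, cities and so on) are made up for this illustration. It shows that a membership test or .get() leaves the defaultdict untouched, while a square-bracket lookup inserts the key.

```python
from collections import defaultdict

# A plain dict raises KeyError when we index a key that is not there.
plain = {}
try:
    plain["chicken"].append("Austin")
except KeyError as exc:
    missing_key = exc.args[0]
    print("KeyError for", missing_key)

cities = defaultdict(list)
cities["Jane"].append("Austin")

# Membership tests and .get() do not call the default factory,
# so the dictionary is left unchanged.
in_before = "chicken" in cities
got = cities.get("chicken")
keys_before = sorted(cities)

# Indexing with square brackets does call the factory and stores the result.
value = cities["chicken"]
keys_after = sorted(cities)

print(in_before, got, keys_before)
print(value, keys_after)
```

If you only want to peek at a key without creating it, `in` or .get() is the safer choice; otherwise the dictionary can quietly fill up with empty entries.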