3 Common Python List, Set and Dictionary Mistakes

If you've worked with Python, you've probably noticed some strange behaviour with its lists, sets and dictionaries. Here are three common mistakes and how to avoid them.

Python is a well-loved language. It's easy to get started with, it reads almost like English and the dynamic typing makes development a lot quicker. For my first post, I'll demystify some strange behaviour in lists, sets and dictionaries.

Prerequisite Concepts

An object is a collection of data and the methods that act on that data. Everything in Python is an object, including strings, lists and dictionaries, meaning every value is an instance of a class that inherits from Python's object class.

my_string = 'abcd'
isinstance(my_string, object)  # output: True

my_list = ['a', 'b', 'c', 'd']
isinstance(my_list, object)  # output: True

my_dictionary = {'a': 'b', 'c': 'd'}
isinstance(my_dictionary, object)  # output: True

Variables, in general, are labelled containers carrying data in your program. Variables in Python aren't technically containers; they are references to the object in your computer's memory that actually holds the data in question. This is similar to pointers in C and object references in Java.
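
A quick way to see this for yourself is with the is operator and the built-in id function. The snippet below is only a minimal sketch; the variable names are made up for illustration.

```python
first = ['a', 'b']
second = first        # binds a second name to the same list object
third = ['a', 'b']    # builds a separate list with equal contents

# 'is' checks whether two names refer to the very same object
same_object = first is second
# '==' compares contents, so equal lists can still be distinct objects
equal_but_distinct = (first == third) and (first is not third)

print(same_object)                # output: True
print(equal_but_distinct)         # output: True
print(id(first) == id(second))    # output: True
```

The id values themselves differ from run to run, so only the comparisons are meaningful.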

Mutability refers to an object's ability to be changed after it has been created. Mutable objects can be changed after they're created while immutable ones cannot. Lists, dictionaries and sets are mutable Python objects. Integers, floats, strings, booleans and tuples are immutable Python objects. To demonstrate mutability, let's examine string and list behaviour. While both are iterables supporting indexing, strings are immutable.

my_list = ['a', 'b', 'c', 'd']
my_string = 'abcd'

# both are iterable, the loops below will have the same output
for element in my_list:
    print(element)

for character in my_string:
    print(character)

# both support indexing
print(my_list[1])  # output: b
print(my_string[1])  # output: b

# lists are mutable but strings are not
my_list[1] = 'z'
print(my_list)  # output: ['a', 'z', 'c', 'd']

my_string[1] = 'z'  # throws a TypeError

Common Mistake 1: Mutable Variable Assignments

Say you want to make a copy of a list, dictionary or set. Your first thought would probably be to write the following code:

my_list = ['a', 'b', 'c', 'd']
my_list_copy = my_list

my_list.append('e')

We expect the copy to remain unaffected, but that's not the case. Here's what actually happens:

print(my_list)  # output: ['a', 'b', 'c', 'd', 'e']
print(my_list_copy)  # output: ['a', 'b', 'c', 'd', 'e']

Remember when we said variables hold references to the memory container? my_list holds a memory reference to where the list ['a', 'b', 'c', 'd'] is stored. The assignment my_list_copy = my_list just copies the reference to the list, not the actual list. When we then add the element 'e' to my_list, it adds it to the referenced list that my_list_copy also refers to.

This behaviour similarly affects sets and dictionaries.
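
As a small illustration of that claim, here is the same experiment repeated with a dictionary and a set. The names are hypothetical and chosen only for this sketch.

```python
my_dict = {'a': 1}
my_dict_alias = my_dict   # copies the reference, not the dictionary
my_dict['b'] = 2
print(my_dict_alias)      # output: {'a': 1, 'b': 2}

my_set = {1, 2}
my_set_alias = my_set     # copies the reference, not the set
my_set.add(3)
print(my_set_alias == {1, 2, 3})  # output: True
```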

Solutions

Create a new object using the object's constructor

You could use the object's constructor to create an entirely new object:

# lists
original_list = [1, 2, 3]
new_list = list(original_list)  # creates a new list object in memory and assigns it to new_list

# dictionaries
original_dict = {'a': 1, 'b': 2}
new_dict = dict(original_dict)  # creates a new dict object in memory and assigns it to new_dict

# sets
original_set = {1, 2, 3}
new_set = set(original_set)  # creates a new set object in memory and assigns it to new_set

Use the copy utility

The copy function makes a shallow copy: it creates a new outer object but does not create new objects for the nested items, so those remain shared with the original.
The deepcopy function, as its name suggests, recursively copies the nested objects as well. Keep this in mind and choose the appropriate one depending on your use case.

from copy import copy, deepcopy

# lists
original_list = [1, 2, 3]
new_list = copy(original_list)  # creates a new list object in memory and assigns it to new_list

# a shallow copy still shares nested lists with the original
original_nested_list = [1, [2, 3], 4]
partially_new_list = copy(original_nested_list)
completely_new_list = deepcopy(original_nested_list)

partially_new_list[1][0] = 99  # replace 2 with 99 in the nested list [2, 3]
print(original_nested_list)  # output: [1, [99, 3], 4]

print(completely_new_list)  # output: [1, [2, 3], 4]
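
It may be worth noting that the constructor approach from the first solution is also shallow. The sketch below, using a made-up dictionary, suggests how dict() and deepcopy differ once the values themselves are mutable.

```python
from copy import deepcopy

original = {'letters': ['a', 'b']}
shallow = dict(original)     # new outer dict, but the inner list is shared
deep = deepcopy(original)    # new outer dict and a new inner list

original['letters'].append('c')

print(shallow['letters'])  # output: ['a', 'b', 'c']
print(deep['letters'])     # output: ['a', 'b']
```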

Common Mistake 2: Mutable Default Function Arguments

Python functions allow declaration of default arguments. These are convenient when we don't expect the caller to always supply a certain argument. Let's play with the simple function below which optionally takes in a list and returns the same list or an empty list if no list is supplied:

def my_func(x=[]):
    return x

my_list = my_func()
print(my_list)  # output: []

my_list = my_func(x=[1, 2, 3, 4])
print(my_list)  # output: [1, 2, 3, 4]

What if we made a few modifications to our returned list my_list and then called my_func again? We'd expect it to return an empty list when called without any arguments, right? Wrong!

my_list = my_func()  # returns []
my_list.append('abcd')
print(my_list)  # output: ['abcd']

second_list = my_func()
print(second_list)  # output: ['abcd']

The default empty list is created only once, when the def statement runs and the function object is created, not each time the function is called. Calling my_func() without arguments returns a reference to that one list, so mutating my_list mutates the default itself, and every later call sees the change. This similarly applies to sets and dictionaries.
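
One way to watch this happen is to inspect the function's __defaults__ attribute, which holds the default values for its positional parameters. This is only a small sketch that reuses my_func from above; the name returned is introduced here for illustration.

```python
def my_func(x=[]):
    return x

print(my_func.__defaults__)  # output: ([],)

returned = my_func()
returned.append('abcd')

# the stored default has been mutated through the returned reference
print(my_func.__defaults__)                  # output: (['abcd'],)
print(returned is my_func.__defaults__[0])   # output: True
```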

Solution

A new object needs to be created every time the function is called without any arguments. We achieve this with the following pattern:

def my_func(x=None):
    if x is None:
        x = []  # or {} for a new empty dictionary or set() for a new empty set
    return x

This guarantees that a new object is created every time the function is called with no arguments.
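
Rerunning the earlier experiment against this version gives a rough check that the fix behaves as described. The names first and second are illustrative only.

```python
def my_func(x=None):
    if x is None:
        x = []  # a fresh list is built on every call without arguments
    return x

first = my_func()
first.append('abcd')
second = my_func()

print(first)            # output: ['abcd']
print(second)           # output: []
print(first is second)  # output: False
```

Keep in mind that when the caller does pass a list, this function still returns that same list rather than a copy.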

Common Mistake 3: Mutating Iterables during Iteration

This mistake creeps up on even experienced Pythonistas. Let's consider the example below:

my_list = ['a', 'b', 'c', 'd']

for element in my_list:
    print(element)
    my_list.append('a')

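If you run that code, no exception is raised, but the loop never ends: it prints 'a', 'b', 'c' and 'd', then keeps printing 'a' until you interrupt it or the program runs out of memory. Each pass appends another element to the list being iterated over, so the iterator never reaches the end.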

Dictionaries raise a RuntimeError when keys are added or removed during iteration:

my_dict = {'a': 1, 'b': 2}

for key, value in my_dict.items():
    print(key, value)
    my_dict['new_key'] = 'new value'  # the next iteration step raises RuntimeError: dictionary changed size during iteration

Sets also throw an exception when mutated within an iteration:

my_set = {1, 2, 3, 4}

for element in my_set:
    print(element)
    my_set.add(element * 10)  # adding a new element to the set throws a RuntimeError
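
A possible way around all three cases, in the same spirit as the copying solutions from Common Mistake 1, is to iterate over a snapshot of the collection and mutate the original. The following is a sketch under that assumption rather than the only approach; the '_copy' key suffix is invented for the example.

```python
my_list = ['a', 'b', 'c', 'd']
for element in list(my_list):        # iterate over a copy of the list
    my_list.append(element.upper())  # mutate the original safely
print(my_list)  # output: ['a', 'b', 'c', 'd', 'A', 'B', 'C', 'D']

my_dict = {'a': 1, 'b': 2}
for key, value in list(my_dict.items()):  # snapshot of the items
    my_dict[key + '_copy'] = value
print(my_dict)  # output: {'a': 1, 'b': 2, 'a_copy': 1, 'b_copy': 2}

my_set = {1, 2, 3, 4}
for element in set(my_set):          # iterate over a copy of the set
    my_set.add(element * 10)
print(my_set == {1, 2, 3, 4, 10, 20, 30, 40})  # output: True
```

The snapshot costs extra memory proportional to the collection's size, which is usually an acceptable trade for predictable behaviour.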

Conclusion

My intention isn't to scare you away from using mutable objects, but to make you aware of the pitfalls of working with them. I hope this post saves you a lot of debugging time!

Drop a comment if you've got more gotchas you've come across and we can dig into them!