The copy module provides two ways to copy objects: shallow and deep. Understanding the difference prevents subtle bugs with mutable data.
The Problem
Assignment doesn't copy—it creates a reference:
original = [1, 2, [3, 4]]
reference = original
reference[0] = 99
print(original) # [99, 2, [3, 4]] - original changed!
Both variables point to the same object.
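One way to confirm that two names refer to the same object is the `is` operator, which compares identity rather than equality. The short sketch below reuses the names from the example above; `lookalike` is an extra name introduced only for illustration.

```python
original = [1, 2, [3, 4]]
reference = original

# `is` compares identity: both names refer to one list object
same_object = reference is original

# A separately built list with equal contents is a different object
lookalike = [1, 2, [3, 4]]
equal_values = lookalike == original
different_object = lookalike is not original

print(same_object, equal_values, different_object)  # True True True
```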
Shallow Copy
A shallow copy creates a new object but references the same nested objects:
import copy
original = [1, 2, [3, 4]]
shallow = copy.copy(original)
# Top level is independent
shallow[0] = 99
print(original) # [1, 2, [3, 4]] - unchanged
# But nested objects are shared
shallow[2][0] = 99
print(original) # [1, 2, [99, 4]] - changed!
Deep Copy
A deep copy recursively copies everything:
import copy
original = [1, 2, [3, 4]]
deep = copy.deepcopy(original)
# Completely independent
deep[2][0] = 99
print(original) # [1, 2, [3, 4]] - unchanged
print(deep) # [1, 2, [99, 4]]
Visual Comparison
original = [1, 2, [3, 4]]
# Reference (assignment)
reference ───────────────┐
                         ▼
original ──────────────► [1, 2, ●]──► [3, 4]
# Shallow copy
shallow ───────────────► [1, 2, ●]────┐
                                      ▼
original ──────────────► [1, 2, ●]──► [3, 4]
# Deep copy
deep ──────────────────► [1, 2, ●]──► [3, 4] (independent)
original ──────────────► [1, 2, ●]──► [3, 4] (independent)
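The three pictures above can also be checked in code by comparing object identities. This is a minimal sketch using the same variable names as the diagrams; each assertion corresponds to one arrow.

```python
import copy

original = [1, 2, [3, 4]]
reference = original
shallow = copy.copy(original)
deep = copy.deepcopy(original)

# Assignment: the same outer list
assert reference is original

# Shallow copy: a new outer list that holds the same inner list
assert shallow is not original
assert shallow[2] is original[2]

# Deep copy: a new outer list and a new inner list with equal contents
assert deep is not original
assert deep[2] is not original[2]
assert deep == original
```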
When to Use Each
Shallow copy when:
- Objects contain only immutable items (strings, numbers, tuples of immutable values)
- You want to share nested objects intentionally
- Performance matters and deep copy is overkill
Deep copy when:
- Objects have nested mutable containers
- You need complete independence
- Modifying the copy should never affect the original
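The two cases above can be illustrated side by side. In this sketch the names `tags` and `grid` are made up for the example: the first holds only immutable strings, so a shallow copy is already independent, while the second holds nested lists and needs a deep copy.

```python
import copy

# Only immutable items: a shallow copy is already fully independent
tags = ["red", "green", "blue"]
tags_copy = copy.copy(tags)
tags_copy[0] = "purple"   # rebinding a slot affects only the copy

# Nested mutable containers: a deep copy is needed for independence
grid = [[0, 0], [0, 0]]
grid_shallow = copy.copy(grid)
grid_deep = copy.deepcopy(grid)
grid_shallow[0][0] = 1    # mutates the row that is shared with grid
grid_deep[1][1] = 9       # touches only the deep copy

print(tags)  # ['red', 'green', 'blue']
print(grid)  # [[1, 0], [0, 0]]
```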
Built-in Shortcuts
Some types have copy methods:
# Lists
original = [1, 2, 3]
shallow = original.copy() # or list(original) or original[:]
# Dicts
original = {'a': 1, 'b': 2}
shallow = original.copy() # or dict(original)
# Sets
original = {1, 2, 3}
shallow = original.copy() # or set(original)
These are all shallow copies.
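To see that these shortcuts really are shallow, it helps to try them on a nested structure. The sketch below uses illustrative names (`nested`, `settings`) that are not part of the examples above.

```python
nested = [[1, 2], [3, 4]]

by_method = nested.copy()
by_constructor = list(nested)
by_slice = nested[:]
copies = (by_method, by_constructor, by_slice)

# Each shortcut builds a new outer list...
new_outer = all(c is not nested for c in copies)
# ...but every one of them still holds the original inner lists
shared_inner = all(c[0] is nested[0] for c in copies)

by_slice[0].append(99)
print(nested[0])  # [1, 2, 99]

# The same applies to dict.copy()
settings = {"hosts": ["a", "b"]}
settings_copy = settings.copy()
settings_copy["hosts"].append("c")
print(settings["hosts"])  # ['a', 'b', 'c']
```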
Dictionaries Example
import copy
original = {
    'name': 'Alice',
    'scores': [95, 87, 92]
}
shallow = copy.copy(original)
deep = copy.deepcopy(original)
# Modify nested list in shallow copy
shallow['scores'].append(100)
print(original['scores']) # [95, 87, 92, 100] - affected!
# Modify nested list in deep copy
deep['scores'].append(50)
print(original['scores']) # [95, 87, 92, 100] - unchanged by the deep copy's append
Custom Objects
Classes can customize copying:
import copy
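class Config:
    def __init__(self, settings):
        self.settings = settings
    def __copy__(self):
        # Custom shallow copy: the new Config shares the same settings object
        return Config(self.settings)
    def __deepcopy__(self, memo):
        # Custom deep copy: pass memo along so already-copied objects are reused
        return Config(copy.deepcopy(self.settings, memo))
The memo dict records objects that have already been copied, which prevents infinite loops with circular references.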
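When instances can refer back to each other, a common pattern is to register the new object in `memo` before copying its attributes. The sketch below assumes a hypothetical `Node` class with a `neighbor` attribute, invented for this illustration; it shows one way to do this and is not the only valid approach.

```python
import copy

class Node:
    """Hypothetical class, used only to illustrate memo handling."""
    def __init__(self, name):
        self.name = name
        self.neighbor = None  # may point to another Node, forming a cycle

    def __deepcopy__(self, memo):
        new = Node(self.name)
        # Register the copy first, keyed by id(self), so that a cycle
        # leading back to this object reuses `new` instead of recursing
        memo[id(self)] = new
        new.neighbor = copy.deepcopy(self.neighbor, memo)
        return new

a = Node("a")
b = Node("b")
a.neighbor = b
b.neighbor = a  # a -> b -> a

a2 = copy.deepcopy(a)
print(a2.neighbor.neighbor is a2)  # True - cycle preserved in the copy
print(a2.neighbor is b)            # False - the neighbor was copied too
```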
Circular References
Deep copy handles circular references:
import copy
a = [1, 2]
a.append(a) # Circular reference
deep = copy.deepcopy(a)
print(deep[2] is deep) # True - circular structure preserved
print(deep[2] is a) # False - independent copy
Performance
Deep copy is slower:
import copy
import timeit
data = {'a': [1, 2, 3], 'b': {'x': [4, 5, 6]}}
shallow_time = timeit.timeit(lambda: copy.copy(data), number=100000)
deep_time = timeit.timeit(lambda: copy.deepcopy(data), number=100000)
# Deep copy is typically several times slower; the exact ratio depends on the data and the Python version
Common Pitfall: Default Arguments
# Bug: mutable default argument
def add_item(item, items=[]):
    items.append(item)
    return items
print(add_item(1)) # [1]
print(add_item(2)) # [1, 2] - same list!
# Fix: default to None and create a new list on each call
def add_item(item, items=None):
    if items is None:
        items = []
    items.append(item)
    return items
Testing Copies
Verify your copies work correctly:
import copy
def test_deep_copy():
    original = {'nested': [1, 2, 3]}
    copied = copy.deepcopy(original)
    # Modify copy
    copied['nested'].append(4)
    # Original unchanged
    assert original['nested'] == [1, 2, 3]
    assert copied['nested'] == [1, 2, 3, 4]
Summary
| Method | New Object? | Nested Objects |
|---|---|---|
| Assignment (=) | No | Shared |
| copy.copy() | Yes | Shared |
| copy.deepcopy() | Yes | Copied |
When in doubt, use deepcopy(). It's slower but safer. Only optimize to shallow copy when you've verified nested objects don't need independence.
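One rough way to verify that a shallow copy is enough is to check that a container holds only immutable items. The helper below, `holds_only_immutables`, is a hypothetical name and a deliberately conservative sketch: it recognizes a fixed set of built-in types and treats everything else as potentially mutable.

```python
def holds_only_immutables(container):
    """Return True if every item is of a known immutable built-in type."""
    atomic = (int, float, complex, bool, str, bytes, type(None))
    for item in container:
        if isinstance(item, atomic):
            continue
        # Tuples and frozensets are immutable themselves but may hold
        # mutable items, so their contents are checked recursively
        if isinstance(item, (tuple, frozenset)) and holds_only_immutables(item):
            continue
        return False
    return True

print(holds_only_immutables([1, "a", (2, 3)]))      # True
print(holds_only_immutables([1, (2, [3])]))         # False
print(holds_only_immutables({"a": [1]}.values()))   # False
```

For a dict, pass `d.values()`, since iterating over the dict itself yields only the keys.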