Python copy and intern

25 Aug 2017

[ python  copy  intern  memory  ]

“Assignment in Python never copies values”.

Sometimes it seems magical, like when you can swap values without using a third variable:

>>> a = 5
>>> b = 88
>>> a,b = b,a
>>> a
88
>>> b
5
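
The swap is not really magic: the right-hand side is packed into a tuple first, and then the two names are rebound to the objects in it. A minimal sketch that checks this with id() (the helper names id_a and id_b are only for illustration, not from the original session):

```python
a = 5
b = 88

# Remember which objects the names point to before the swap.
id_a, id_b = id(a), id(b)

# The right-hand side is evaluated first, then the names are rebound.
a, b = b, a

# The names traded objects; no value was copied.
print(a, b)                              # 88 5
print(id(a) == id_b, id(b) == id_a)      # True True
```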

But sometimes it can catch you off-guard:

>>> stuff = ['a','b','c']
>>> other_stuff = []
>>> for x in range(5):
...     other_stuff.append(stuff)
...
>>> other_stuff
[['a', 'b', 'c'], ['a', 'b', 'c'], ['a', 'b', 'c'], ['a', 'b', 'c'], ['a', 'b', 'c']]
>>> other_stuff[1][1] = 'z'
>>> other_stuff
[['a', 'z', 'c'], ['a', 'z', 'c'], ['a', 'z', 'c'], ['a', 'z', 'c'], ['a', 'z', 'c']]
>>>

Let’s check the id of each element (in CPython, id() returns the object’s memory address):

>>> for x in other_stuff:
...     print(id(x))
...
140701071838856
140701071838856
140701071838856
140701071838856
140701071838856
>>>

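Yep, all have the same id! other_stuff holds five references to one and the same list, so a change made through any of them shows up in all of them.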
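
Comparing printed ids by eye works, but the same check can be done with the is operator. A small sketch of that idea, reusing the names from the session above (same_object and distinct_ids are names made up for this example):

```python
stuff = ['a', 'b', 'c']
other_stuff = []
for _ in range(5):
    other_stuff.append(stuff)  # appends a reference, not a copy

# Every element is the very same object as stuff.
same_object = all(item is stuff for item in other_stuff)
print(same_object)             # True

# Only one distinct id across all five elements.
distinct_ids = {id(item) for item in other_stuff}
print(len(distinct_ids))       # 1
```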

If we instead use copy:

>>> import copy
>>> stuff = ['a','b','c']
>>> other_stuff = []
>>> for x in range(5):
...     other_stuff.append(copy.copy(stuff))
...
>>> other_stuff
[['a', 'b', 'c'], ['a', 'b', 'c'], ['a', 'b', 'c'], ['a', 'b', 'c'], ['a', 'b', 'c']]
>>> for x in other_stuff:
...     print(id(x))
...
140098088041544
140098088077128
140098088075336
140098088077448
140098088110536
>>>

Now each item in the other_stuff list is a separate object with its own id, so changing one no longer affects the others:

>>> other_stuff[1][1] = 'z'
>>> other_stuff
[['a', 'b', 'c'], ['a', 'z', 'c'], ['a', 'b', 'c'], ['a', 'b', 'c'], ['a', 'b', 'c']]
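
To make the difference explicit, here is a short sketch showing that the copies start out equal to the original (==) without being the same object (is), and that the original list is left alone. It uses a list comprehension instead of the loop above, which should be equivalent; all_equal and none_identical are names invented for this example:

```python
import copy

stuff = ['a', 'b', 'c']
other_stuff = [copy.copy(stuff) for _ in range(5)]

# Before any change: every copy equals the original, yet is a separate object.
all_equal = all(item == stuff for item in other_stuff)
none_identical = all(item is not stuff for item in other_stuff)

other_stuff[1][1] = 'z'   # touches only the second copy

print(all_equal, none_identical)   # True True
print(stuff)                       # ['a', 'b', 'c']
print(other_stuff[0])              # ['a', 'b', 'c']
print(other_stuff[1])              # ['a', 'z', 'c']
```

Keep in mind that copy.copy makes a shallow copy: if the list itself contained mutable objects, those inner objects would still be shared, and copy.deepcopy is the tool for that case.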

Why use copy.copy?

Lists and strings (both are sequences) support slice notation, but other objects may or may not have a copying function built in. That is where copy is useful. The first copy is the library module copy; the second copy is the function that performs the actual copying… (if you use) the import statement from copy import copy, then you could use copy(object) directly. However, by using import copy you get the module, and to reference the function you need to use the form module.function. (Chris Freeman)
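
A rough sketch of the three spellings mentioned here. The Point class is hypothetical and stands in for "an object with no slice notation and no copy method of its own"; the alias shallow_copy is also just a choice for this example, so the function does not shadow the module name:

```python
import copy
from copy import copy as shallow_copy  # alias chosen here to avoid shadowing the module

class Point:
    """Hypothetical class with no slice support and no copy method of its own."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

letters = ['a', 'b', 'c']
sliced = letters[:]           # slice notation copies a list

p = Point(1, 2)
q = copy.copy(p)              # module.function form
r = shallow_copy(p)           # function imported directly

q.x = 99                      # changing the copy leaves the original alone
print(sliced is letters)      # False
print(p.x, q.x, r.x)          # 1 99 1
```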

Here are a few links that helped me understand this better: