Early-bird pricing for Weekly Python Exercise ends today!

Just a reminder: Registration for the advanced (B2) cohort of Weekly Python Exercise, which will begin on July 2nd, will remain open for the next two weeks. BUT early-bird pricing ($80 for 15 weeks of Python exercises, solutions, and community) ends today — Tuesday, June 18th.

If you want to sharpen your Python skills, then there’s no better way to do that than Weekly Python Exercise.

And hey, if you’re going to sharpen your skills, why not do it at a discount? As of tomorrow, you’ll have to pay more for the same course.

Learn more at . 100% money-back guarantee if you aren’t satisfied — but I’m sure you’ll learn so much, and be able to solve so many new problems, that you won’t want to do that.


Understanding Python assignment

Here’s a quick question I often ask students in my Python classes:

>>> x = 100
>>> y = x
>>> x = 200

After executing the above code, what is the value of y?

The answer:

>>> print(y)
100

Many of my students, especially those with a background in C, are surprised. Didn’t we say that “y = x”? Thus, shouldn’t a change in x be reflected by a similar change in y?

Obviously not. But why is this the case?

Assignment in Python means one thing, and one thing only: The variable named on the left should now refer to the value on the right.

In other words, when I said:

y = x

Python doesn’t read this as, “y should now refer to the variable x.” Rather, it reads it as, “y should now refer to whatever value x refers to.”

Because x refers to the integer 100, y now refers to the integer 100. After these two assignments (“x = 100” and “y = x”), there are now two references to the integer 100 that didn’t previously exist.

When we say that “x = 200”, we’re removing one of those references, such that x no longer refers to 100. Instead, x will now refer to the integer 200.

But y’s reference remains in place, still pointing to 100. And indeed, the only way to change what y is referring to is via … assignment.

Think of this as assignment inertia: Without a new and explicit assignment, a variable will continue to refer to whatever it referred to previously.

Thus, while Python does have references (i.e., variables pointing to objects), it doesn’t have pointers (i.e., variables pointing to other variables). That’s a big difference, and one that makes the language easier to understand. But references can still be a bit tricky and confusing, especially for newcomers to the language.
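
The difference between rebinding a name and changing an object is easiest to see with a mutable value. Here’s a minimal sketch; the names “first” and “second” are just illustrative and don’t come from the example above:

```python
# Rebinding: assignment makes a name refer to a different object.
x = 100
y = x          # y refers to the same int object that x refers to
x = 200        # x is rebound; y is untouched
print(y)       # 100

# Mutation: two names, one list object.
first = [10, 20, 30]
second = first          # no copy is made; both names refer to one list
first.append(40)        # changes the shared object, not the names
print(second)           # [10, 20, 30, 40]
print(first is second)  # True

# Assigning to "first" again breaks the sharing, exactly as with x and y.
first = [1, 2, 3]
print(second)           # [10, 20, 30, 40]
```

In both halves, assignment does the same thing: it points a name at an object. The only reason the list case looks different is that “append” changes the object itself, and both names happen to refer to it.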

Remember also that in an assignment, the right side is evaluated before the left side. By the time the left-hand-side is being assigned, any variables on the right-hand-side are long gone, replaced by the final value of the expression. For example:

>>> a = 10
>>> b = 20
>>> c = 30
>>> d = a + b * c

When we assign a value to the variable “d” above, it’s only after Python has evaluated “a + b * c”. The variables are replaced by the values to which they refer, the operations are evaluated, and the final result (610) is then assigned to “d”. “d” has no idea that it was ever getting a value from “a”, “b”, or “c”.

Reminder: Early-bird pricing for Weekly Python Exercise ends tomorrow

This is just a quick reminder that if you want to join the advanced cohort of Weekly Python Exercise starting July 2nd, you should do it by tomorrow (Tuesday, June 18th).

Don’t miss this opportunity to improve your Python coding skills! We’ll be talking about iterators, generators, decorators, threads, and functional programming, and helping you to improve your skills.

Questions? Just e-mail me at But hurry, before the price goes up!


Playing with Python strings, lists, and variable names — or, a complex answer to a simple question

I recently received a question from a reader of my “Better developers” list. He asks:

Is there any way to turn a str type into a list type? For example, I have a list of elements, and want to turn that element into a separate list. For example, if I have

test = ['a', 'b', 'c']

I want the output to be

a=[], b=[], c=[]

One of the mantras of Python is that there should be one, and only one, way to do something. Reality has a way of being more complex than that, though, and in this particular case, the problem that my reader described in words and what he put in code weren’t exactly the same thing. (Which is a common problem in the professional software world — the specifications say one thing, but the client’s intentions say another.)

Let’s start with what my reader says he wants to do, and then get to what he actually seems to want:

He says that he wants to turn a string into a list. Well, there are a few ways to do that. The easiest is to use the “list” class, and apply it to a string:

>>> s = 'abc'
>>> mylist = list(s)
>>> mylist
['a', 'b', 'c']         

In such a case, the “list” class (which can be called, like a function, and is thus known as a “callable” in the Python world) iterates over the elements of our string. Each element is turned into a separate element in a new list that it returns.

This is fine if you want to create a new list with the same number of elements as there are characters in the string. After all, both strings and lists are Python sequences; when you create a list in this way, based on a string, you’ll find that the new list’s length and elements are identical. So s[0] and mylist[0] will return the same result, as will “len(s)” and “len(mylist)” even though “s” and “mylist” are different types.

Another way to create a list from a string is via the “str.split” method. I use this method all the time, especially when taking input from a user and iterating over the words, or fields, that the user provides. For example:

>>> words = 'here are some words'
>>> words.split(' ')
['here', 'are', 'some', 'words']

The result of “str.split” is always a list of strings. And as you can see in the above example, we can tell “str.split” what string should be used as a field delimiter; “str.split” removes all occurrences of that string, returning a list of strings.

What happens if our string is a bit weird, though, such as:

>>> words = 'here    are some     words'

Now we’re going to get an equally weird result:

>>> words.split(' ')
['here', '', '', '', 'are', 'some', '', '', '', '', 'words']

This happens because “str.split” has taken our instructions very literally, as computers do: Whenever you encounter a space character, create a new element in the output list. However, this is rarely the solution that you want, and thus “str.split” has a great default: If you don’t pass anything (or pass “None” explicitly), then any length of whitespace characters will be treated as a single delimiter. Which means that we can say:

>>> words = 'here    are some     words'
>>> words.split()
['here', 'are', 'some', 'words']

This is quite useful… and yet, while this is how I interpreted the question I got, it’s not what the user wants.

Rather, what he seems to want is to create new variables based on the elements of the string. So if the string is “abc”, then we want to create new variables “a”, “b”, and “c”, each of which references an empty list.

This is certainly possible, but I’ll admit it’s a bit odd. However, it gives us a chance to delve into some of Python’s more rarely used capabilities. (At least, I almost never use them — maybe other people are different!)

My first reaction to creating variables dynamically is to say, “No, you don’t really want to do that,” and to suggest that we create a dictionary, instead. You can think of a dict as your own private namespace, one which can’t and won’t interfere with the variables created elsewhere.

We could create an empty dictionary, and then iterate over the string, adding new key-value pairs to it, with each value being an empty list:

>>> d = {}
>>> for one_letter in 'abc':
        d[one_letter] = []

>>> d
{'a': [], 'b': [], 'c': []}

There is, however, a better way to do what we did here, and that is by using the “dict.fromkeys” class method. This is a great shortcut to creating a dictionary whose keys are known but whose values aren’t, at least not at the start. So we can say:

>>> dict.fromkeys('abc')
{'a': None, 'b': None, 'c': None}

As you can see, the value associated with each key here is “None”. We don’t want that; instead, we want to have an empty list. So we can pass an empty list as a second, optional argument to “dict.fromkeys”:

>>> dict.fromkeys('abc', [])
{'a': [], 'b': [], 'c': []}

However, you should be a bit nervous before working with the dictionary I’ve created here, because every single one of the values now refers to the same list! For example:

>>> d = dict.fromkeys('abc', [])
>>> d
{'a': [], 'b': [], 'c': []}
>>> d['a'].append(1)
>>> d['b'].append(2)
>>> d['c'].append(3)
>>> d
{'a': [1, 2, 3], 'b': [1, 2, 3], 'c': [1, 2, 3]}                

In many ways, this is similar to the problem of mutable defaults, in that we have a single value referenced in multiple places. It’s pretty obvious to experienced Python developers that this will happen, but it’s far from obvious to newcomers.
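
For anyone who hasn’t run into the mutable-defaults problem, here’s a short sketch of it. The two function names are made up for this illustration; the point is only that a default value is created once, when the function is defined, and is then shared by every call:

```python
def add_item_shared(item, items=[]):
    # The default list is created once, when "def" runs,
    # so every call without "items" appends to the same object.
    items.append(item)
    return items

def add_item_fresh(item, items=None):
    # Using None as a sentinel gives each call its own new list.
    if items is None:
        items = []
    items.append(item)
    return items

first_shared = add_item_shared('a')
second_shared = add_item_shared('b')
print(second_shared)                  # ['a', 'b']
print(first_shared is second_shared)  # True

print(add_item_fresh('a'))   # ['a']
print(add_item_fresh('b'))   # ['b']
```

Just as with “dict.fromkeys”, nothing is copied: one list object is simply referenced from more than one place.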

Another way to do this would be to use a dict comprehension:

>>> {one_letter : []
     for one_letter in 'abc'}
{'a': [], 'b': [], 'c': []}

“Wait,” you might be saying, “Maybe we have to worry about these lists also all referring to the same thing?”


>>> d = {one_letter : []
         for one_letter in 'abc'}
>>> d['a'].append(1)
>>> d['b'].append(2)
>>> d['c'].append(3)
>>> d
{'a': [1], 'b': [2], 'c': [3]}         

What’s the difference between this, and our previous use of “dict.fromkeys”? The difference is that here, the “[]” empty list is evaluated anew with each iteration over the string. Thus, we get a new empty list each time. By contrast, passing the same empty list as a second argument to “dict.fromkeys” gave us the same list each time.
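
If you’d rather check this directly than infer it from the output of “append”, the “is” operator and the “id” function will tell you whether two values are the same object. A small sketch, with “shared” and “separate” as illustrative names:

```python
shared = dict.fromkeys('abc', [])
separate = {one_letter: [] for one_letter in 'abc'}

# fromkeys received one list object and stored it under every key.
print(shared['a'] is shared['b'])      # True
# The comprehension evaluated "[]" once per key, so each list is distinct.
print(separate['a'] is separate['b'])  # False

# Counting distinct objects by id() shows the same thing.
print(len({id(v) for v in shared.values()}))    # 1
print(len({id(v) for v in separate.values()}))  # 3
```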

So if you want to use a dict — and that’s my recommendation — then you are good to go! But if you really and truly want to create variables based on the values in the string, then we’ll have to use a few more tricks.

One is to take advantage of the fact that global variables are actually stored in a dictionary. Yes, that’s right — you might think that when you write “x=100” that you’re storing things in some magical location. But actually, Python turns your variable name into a string, and uses that string as a key into a dictionary.

We don’t have direct access to this dictionary, but we can retrieve it using the “globals” builtin function. Here’s what happens when I invoke “globals” in a brand-new Python 3 interactive shell:

>>> globals()
{'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <class '_frozen_importlib.BuiltinImporter'>, '__spec__': None, '__annotations__': {}, '__builtins__': <module 'builtins' (built-in)>}

See what happens now, after I assign some variables:

>>> x = 100
>>> y = [10, 20, 30]
>>> z = {'a':1, 'b':2}
>>> globals()         
{'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <class '_frozen_importlib.BuiltinImporter'>, '__spec__': None, '__annotations__': {}, '__builtins__': <module 'builtins' (built-in)>, 'x': 100, 'y': [10, 20, 30], 'z': {'a': 1, 'b': 2}}

Take a look at the end, and you’ll see our three newly assigned variables.

It turns out that we can also define (or update the values of) global variables in this way, too:

>>> globals()['x'] = 234
>>> globals()['y'] = [9,8,7,6]
>>> globals()['z'] = 'hello out there'         
>>> globals()
{'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <class '_frozen_importlib.BuiltinImporter'>, '__spec__': None, '__annotations__': {}, '__builtins__': <module 'builtins' (built-in)>, 'x': 234, 'y': [9, 8, 7, 6], 'z': 'hello out there'}

I don’t really recommend this in actual code, but if you’re absolutely, positively sure that you want to do this, then you can accomplish this task in the following way:

>>> for one_letter in 'abc':
    globals()[one_letter] = []         

Sure enough:

>>> a
[]
>>> b
[]
>>> c
[]

Again, you almost certainly don’t want to have this sort of code in production. But it does work, as we see here.

Something else we could do is use the “exec” function, which lets us run any string as a tiny Python program. We could thus say:

>>> for one_letter in 'abc':
        exec(f'{one_letter} = []')

>>> a
[]
>>> b
[]
>>> c
[]

As you can see, it worked: We used an f-string to create a tiny (one-statement) Python program, and then used “exec” to run it. Note that we wouldn’t be able to use the related “eval” function here, because “eval” expects to have an expression, and assignment in Python isn’t an expression.
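
If you do reach for “exec”, it also accepts a dictionary as an optional second argument, to be used as the global namespace for the code it runs. That keeps the new names out of your own globals, which brings us most of the way back to the dict-based approach. A minimal sketch, in which “namespace” and “created” are just names I picked for the illustration:

```python
namespace = {}   # a plain dict that will hold the new names

for one_letter in 'abc':
    # The second argument tells exec which dict to use as its globals,
    # so the assignment lands in "namespace" rather than in our own globals.
    exec(f'{one_letter} = []', namespace)

# exec also adds a '__builtins__' entry to the dict; filter it out for display.
created = {name: value for name, value in namespace.items()
           if name != '__builtins__'}
print(created)   # {'a': [], 'b': [], 'c': []}
```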

Finally, I’d generally argue that it’s a good idea not to create or manipulate global variables whose names are created dynamically from the user’s input. It’s probably best (as I wrote above) to use a dictionary. However, if you really insist on doing this, then you should probably do it in a module.

But wait: aren’t modules normally defined in files? Yes, but you can create a module on the fly by calling the “module” class, just as we did above with the “list” class. There’s just one hitch, namely that the “module” class isn’t available to us under a builtin name.

That’s OK: We can grab the class via another module (e.g., __builtins__), and then invoke it, passing it the name of the module we want to create. Then we can use the builtin “setattr” function to assign a new attribute to the module. Here’s how that would look:

>>> mymod = type(__builtins__)('mymod')
>>> for one_letter in 'abc':
        setattr(mymod, one_letter, [])

>>> vars(mymod)
{'__name__': 'mymod', '__doc__': None, '__package__': None, '__loader__': None, '__spec__': None, 'a': [], 'b': [], 'c': []}         

Sure enough, we’ve managed to do it!

By the way, remember how I mentioned, all the way back, that it would probably be best to use a dictionary, rather than create actual variables? Well, as you can see here, a module is actually just a fancy wrapper around… a dictionary.
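
The standard library also exposes the module class by name, as “types.ModuleType”. That’s probably the more robust way to get hold of it, because outside of the interactive shell CPython may make “__builtins__” a plain dict rather than a module. Here’s a sketch of the same idea; it also checks that “vars” hands back the module’s actual attribute dictionary:

```python
import types

# types.ModuleType is the same class that type(__builtins__) returns
# in the interactive shell.
mymod = types.ModuleType('mymod')

for one_letter in 'abc':
    setattr(mymod, one_letter, [])

print(mymod.a, mymod.b, mymod.c)        # [] [] []

# vars() returns the module's own attribute dictionary, not a copy,
# so writing to that dict creates an attribute on the module.
print(vars(mymod) is mymod.__dict__)    # True
vars(mymod)['d'] = []
print(mymod.d)                          # []
```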

This seemingly simple question raised all sorts of interesting Python functionality, none of which (I’m guessing) was ever intended by the person who asked the question. But I hope that this has given you a glimpse into the ways in which Python is implemented, and how a dynamic language allows us to play with our environment in ways that not only stretch our minds, but sometimes even the boundaries of good taste.


“Python Workout” is Manning’s Deal of the Day!

I’m a firm believer in improving your Python fluency via practice, practice, and more practice. “Python Workout” is a collection of my 50 favorite exercises from my 20 years of on-site Python training at some of the world’s largest and best-known companies.

I’m delighted to announce that “Python Workout,” my book with 50 exercises to improve your Python mastery and fluency, is Manning’s “Deal of the Day.”

That means that if you buy it today (i.e., June 13th), you’ll get 50% off!

Python Workout is currently available as an online MEAP (Manning Early-Access Program) book. Chapters 1-3 are already available, and chapters 4-6 will be up within another week or two. (And I’m working hard to finish editing the remaining chapters ASAP.)

Don’t miss this chance to get lots of extra Python practice for a low price. Get the book at , but only today (Thursday, June 13th)!


Variables are pronouns: A simple metaphor for Python newbies

I teach about 10 different courses to companies around the world, but my favorite remains “Python for non-programmers.” Participants in this course are typically network and system administrators, support engineers, and managers who want to learn some programming skills, but don’t see themselves as programmers. Moreover, many of them took a programming course back when they were university students, and were so horrified, overwhelmed, and frustrated that they gave up. Perhaps they’re still working for a high-tech company, but they have tried to avoid programming.

But jobs increasingly require some knowledge of programming, and Python is a perfect language with which to start: The syntax is consistent, and the number of things you need to learn is relatively small in order to get up and running.

But that doesn’t mean that there’s nothing to learn. And one of the hardest ideas for people to learn is that of variables. Sure, people know about variables from when they learned algebra — but variables in programming languages aren’t exactly the same thing, even if there are similarities.

For years, I struggled to explain variables: I used the mailbox metaphor (which I learned from the wonderful “Computer Science, Logo Style”). But the mailbox model doesn’t really fit Python, so I’d end up saying, “Actually, I lied to you yesterday. Variables are actually references.” Which didn’t do much to clear things up. And when we started to talk about lists of lists, it wasn’t clear how much my explanations really helped.

Finally, I hit upon a metaphor that seems to resonate with people: Variables are pronouns. This has several advantages:

  • Everyone knows what pronouns are. So it’s easy to understand how we might use them, and how they save us time. After all, it’s far easier to say “he” or “him,” rather than “Rufus Xavier Sarsaparilla.”
  • The notion of references emerges naturally from this description. No longer do I introduce the mailbox model, and then point out how it doesn’t really work, given how assignment in Python works.
  • It becomes obvious that a variable refers to the object to which it was most recently assigned, much as “she” refers to the most recent female to whom we referred. Just as “he” and “she” can only refer to a single object at a time, so too can variables only refer to a single object at a time.
  • It also follows that while a variable (pronoun) can only refer to a single object, an object might be referred to by several pronouns.

No metaphor is perfect, and it’s still tough for many people to wrap their heads around the idea of variables when they’re programming for the first time. But this model seems to have had the greatest success so far. If you teach Python programming, then give it a whirl, and let me know if it seems to help!


Sharpen your Python skills with Weekly Python Exercise

A new WPE cohort starts on July 2nd! Join now, and take advantage of early-bird pricing.

It’s time for another cohort of Weekly Python Exercise! This time, it’s an advanced cohort with 15 weeks of practice in such subjects as functional programming, object-oriented programming, iterators, generators, and decorators.

Early-bird pricing ends in just one week, on June 18th!

Learn more, and get a sample, at


Why do Python lists let you += a tuple, when you can’t + a tuple?

Let’s say you have a list in Python:

>>> mylist = [10, 20, 30]

You want to add something to that list. The most standard way to do this is with the “append” method, which adds its argument to the end of the list:

>>> mylist.append(40)
>>> print(mylist)
[10, 20, 30, 40]

But what if you want to add multiple items to a list? If you’re new to Python, then you might think that you can and should use a “for” loop. For example:

>>> mylist = [10, 20, 30]
>>> new_items = [40, 50, 60]
>>> for one_item in new_items:
        mylist.append(one_item)

>>> print(mylist)
[10, 20, 30, 40, 50, 60]

Great, right? But it turns out that there is a smarter and faster way to do this. You can use the += operator. This operator, which invokes the “__iadd__” (“inplace add”) method on the object to its left, effectively does what we did above, but in much less code:

>>> mylist = [10, 20, 30]
>>> new_items = [40, 50, 60]
>>> mylist += new_items
>>> print(mylist)
[10, 20, 30, 40, 50, 60]

It’s not a huge surprise that += can do this. After all, we normally expect += to add and assign to the variable on its left; it works with numbers and strings, as well as other types. And we know that we can use the + operator on lists, too:

>>> [1, 2, 3] + [4, 5, 6]
[1, 2, 3, 4, 5, 6]

Can we join a list and a tuple? Let’s check:

>>> mylist = [10, 20, 30]
>>> t = (40, 50, 60)
>>> mylist + t
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate list (not "tuple") to list

In other words: No. Trying to add a list and a tuple, even if we’re not affecting either, results in the above error.

Which is why it’s so surprising to many of my students that the following does work:

>>> mylist = [10, 20, 30]
>>> t = (40, 50, 60)
>>> mylist += t
>>> mylist
[10, 20, 30, 40, 50, 60]         

That’s right: Adding a list to a tuple with + doesn’t work. But if we use +=, it does.

What gives?

It’s common, when teaching Python, to say that

x += 5

is basically a rewrite of

x = x + 5

And in the majority of cases, that’s actually true. But it’s not always true.

Consider: When you say “x + y” in Python, the “+” operator is translated into a method call. Behind the scenes, no matter what “x” and “y” are, the expression is translated into:

x.__add__(y)

The “__add__” magic method is what’s invoked on an object when it is added to another object. The object on the right-hand side of the “+” is passed as an argument to the method, while the object on the left-hand side is the recipient of the method call. That’s why, if you want your own objects to handle the “+” operator, you need to define the “__add__” method in your class definition. Do that, and things work just fine.

And thus, when we say “x = x + 5”, this is turned into

x = x.__add__(5)

Meaning: First invoke the method, and then assign its result back to the variable “x”. In this case, the object that “x” referred to isn’t changing; rather, the variable is now referencing a new object.

Now consider the “+=” operator: It’s translated by Python into “__iadd__”, short for “inplace add.” Notice the slightly different syntax that we use here:

x += y

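is translated into

x = x.__iadd__(y)

Did you see the difference between __add__ and __iadd__? The latter is expected to do its work in place, modifying the object itself, and then to return that same object. So although Python still assigns the result back to “x”, the variable normally ends up referring to the object it referred to before, now changed, rather than to a new one.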

It turns out that the implementation of list.__iadd__ takes the second (right-hand side) argument and adds it, one element at a time, to the list. It does this internally, so that you don’t need to execute any assignment after. The second argument to “+=” must be iterable; if you say

mylist += 5

you will get an error, saying that integers are not iterable. But if you put a string, list, tuple, or any other iterable type on the right-hand side, “+=” will execute a “for” loop on that object, adding each of its elements, one at a time, to the list.

In other words: When you use + on a list, then the right-hand object must be a list. But when you use +=, then any iterable type is acceptable:

>>> mylist = [10, 20, 30]
>>> mylist += [40, 50]       # list
>>> mylist
[10, 20, 30, 40, 50]

>>> mylist += (60, 70)       # tuple
>>> mylist
[10, 20, 30, 40, 50, 60, 70]

>>> mylist += 'abc'          # string
>>> mylist
[10, 20, 30, 40, 50, 60, 70, 'a', 'b', 'c']

>>> mylist += {'x':1, 'y':2, 'z':3}    # dict!
>>> mylist
[10, 20, 30, 40, 50, 60, 70, 'a', 'b', 'c', 'x', 'y', 'z']
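
One way to convince yourself that “+=” really does work in place is to keep a second name pointing at the same list and watch what happens to it. The sketch below also uses “list.extend”, which accepts any iterable just as “+=” does; the name “alias” is only there for the illustration:

```python
mylist = [10, 20, 30]
alias = mylist                 # a second name for the same list

mylist += (40, 50)             # in place: the existing list grows
print(alias)                   # [10, 20, 30, 40, 50]
print(alias is mylist)         # True

mylist.extend('ab')            # extend accepts any iterable, like +=
print(alias)                   # [10, 20, 30, 40, 50, 'a', 'b']

mylist = mylist + [60]         # + builds a new list and rebinds the name
print(alias is mylist)         # False
print(alias)                   # [10, 20, 30, 40, 50, 'a', 'b']
print(mylist)                  # [10, 20, 30, 40, 50, 'a', 'b', 60]
```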

Does this work with other types? Not really. For example:

>>> t = (10, 20, 30)
>>> t += [40, 50]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate tuple (not "list") to tuple         

What happened here? Let’s check the definition of tuple.__iadd__ to find out:

>>> help(tuple.__iadd__)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'tuple' has no attribute '__iadd__'

Wait a second: There is no “__iadd__” method for tuples? If so, then how can “+=” work at all?

Because Python tries to be smart in such cases: If the object implements “__iadd__”, then the “+=” operator invokes it. But if the object lacks an “__iadd__” implementation, then Python does what we all guess it normally does — namely, invoke “__add__”, and then assign the results back to the variable. For example:

>>> class Foo(object):
        def __init__(self, x):
            self.x = x
        def __add__(self, other):
            print("In __add__")
            return Foo(self.x + other.x)

>>> f1 = Foo(10)
>>> f2 = Foo(20)
>>> f1 += f2
In __add__
>>> vars(f1)
{'x': 30}         

In other words, Python notices that our Foo class lacks an implementation of “__iadd__”, and substitutes “__add__” for it, assigning its result (a new instance of Foo) to the original variable.

But if we add (so to speak) the right method, then it’s invoked:

>>> class Foo(object):
        def __init__(self, x):
            self.x = x
        def __add__(self, other):
            print("In __add__")
            return Foo(self.x + other.x)
        def __iadd__(self, other):
            print("In __iadd__")
            self.x = self.x + other.x
            return self         
>>> f1 = Foo(10)
>>> f2 = Foo(20)
>>> f1 += f2
In __iadd__
>>> vars(f1)
{'x': 30}         

In the case of Python lists, __iadd__ was implemented such that it doesn’t simply add “other” to its own value; rather, it iterates over each element of “other” in a “for” loop, appending as it goes. And thus, while “__add__” with a tuple won’t work, “__iadd__” with just about any iterable data type will.
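
To make that concrete, here’s a sketch of a class that mimics the list behavior: a strict “__add__” and a permissive “__iadd__”. The “Bag” class is hypothetical, written only for this illustration, and isn’t how CPython implements lists internally:

```python
class Bag:
    """Hypothetical container used only to illustrate the two methods."""

    def __init__(self, items=()):
        self.items = list(items)

    def __add__(self, other):
        # Strict, like list.__add__: only another Bag is accepted.
        if not isinstance(other, Bag):
            return NotImplemented
        return Bag(self.items + other.items)

    def __iadd__(self, other):
        # Permissive, like list.__iadd__: loop over any iterable.
        for one_item in other:
            self.items.append(one_item)
        return self   # Python assigns this return value back to the name

    def __iter__(self):
        return iter(self.items)


b = Bag([10, 20, 30])
b += (40, 50)          # tuple: fine, because __iadd__ just iterates
b += 'ab'              # string: also fine
print(b.items)         # [10, 20, 30, 40, 50, 'a', 'b']

try:
    b + (60, 70)       # __add__ refuses a tuple
except TypeError as e:
    print('TypeError:', e)
```

Returning “self” from “__iadd__” matters: whatever the method returns is what the variable on the left will refer to afterward.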


Registration (and early-bird pricing) is open for Weekly Python Exercise

Do you use Python, but sometimes feel stuck?

Do you visit Stack Overflow every time you want to solve a problem?

Do you wish that you understood how to use advanced techniques, such as generators and decorators, better?

If so, then good news: I’m opening a new cohort of Weekly Python Exercise, specifically aimed at intermediate/advanced developers! For 15 weeks, starting on July 2nd, you’ll be able to improve your Python fluency — just as many other Python developers from around the world have done over the last three years.

This cohort works the same as all the others in the WPE family: On Tuesday, you get a new question (along with “pytest” tests), posing a problem for you to solve. On the following Monday, you get the solution and a detailed explanation. In between, you can discuss the question with others in your cohort via our private forum.

Among the topics we’ll discuss in this cohort:

  • Iterators and generators
  • Decorators
  • Advanced object-oriented techniques
  • Advanced data structures
  • Functional programming techniques
  • Threads and processes

If you register by June 18th, then the price of this cohort is $80. It’ll then go up to $100 on June 19th, and then $120 in the final week before it starts. So sign up now for this cohort, improve your Python fluency, and save some money along the way.

Wondering what WPE is like? You can read more on at, as well as sign up for sample exercises.

Are you a student, pensioner/retiree/senior, or do you live outside of the world’s 30 richest countries? Then you’re entitled to a discount; just e-mail me at for the appropriate coupon code.

Many hundreds of Python developers from around the world have leveled up their Python skills with Weekly Python Exercise. Join this course, use Stack Overflow less, and get more done at work — and maybe even a better job.

Python dicts and memory usage

Let’s say that we create a new, empty Python dictionary:

>>> d = {}

How much memory does this new, empty dict consume? We can find out with “sys.getsizeof“:

>>> import sys
>>> sys.getsizeof(d)
240

In other words, our dictionary, with nothing in it at all, consumes 240 bytes. Not bad; given how often dictionaries are used in Python, it’s good to know that they don’t normally consume that much memory.

What if I add something to the dict? What will happen to the memory usage?

>>> d['a'] = 1
>>> sys.getsizeof(d)
240

Something seems a bit fishy here, right? How can it be that our newly created dictionary, with zero key-value pairs, takes up the same space in memory as our dictionary with one key-value pair?

The answer is that “sys.getsizeof” is returning the size of the dictionary as a data structure, not the data inside of it. In other words: When we first create a dictionary, it contains eight slots that can be filled with key-value pairs. Only when the dictionary needs to grow, because it has too many key-value pairs for its current size, does it allocate more memory.

Moreover, the key-value pairs themselves aren’t stored in the dict itself. Rather, just a reference to the place in memory that holds the keys and values is stored there. So neither the type nor the size of the data is kept in the dictionary, and it certainly doesn’t affect the result of “sys.getsizeof” for the dictionary. Indeed, watch this:

>>> d['a'] = 'a' * 100000
>>> sys.getsizeof(d)
240

Even when the value is 100,000 characters long, our dictionary only needs 240 bytes.
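
If you want a number that does reflect the data, you have to ask for the sizes of the keys and values separately and add them up. The helper below, “rough_total_size”, is a made-up name and only a rough sketch: it goes one level deep, doesn’t follow nested containers, and counts an object twice if it appears twice. Exact byte counts also depend on your Python version and platform, so the example compares sizes rather than printing them:

```python
import sys

small = {'a': 1}
large = {'a': 'a' * 100000}

# The dict object itself is the same size in both cases...
print(sys.getsizeof(small) == sys.getsizeof(large))   # True

# ...because the big string is a separate object with its own size.
print(sys.getsizeof(large['a']) > 100000)             # True

def rough_total_size(d):
    """Dict size plus the sizes of its keys and values (one level deep)."""
    return (sys.getsizeof(d)
            + sum(sys.getsizeof(k) for k in d)
            + sum(sys.getsizeof(v) for v in d.values()))

print(rough_total_size(large) > rough_total_size(small))  # True
```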

What happens as we expand our dictionary? When does it request more memory? Let’s take a look:

>>> d = {}
>>> for one_letter in 'abcdefghijklmnopqrstuvwxyz':
        d[one_letter] = one_letter
        print(f'{len(d)}, sys.getsizeof(d) = {sys.getsizeof(d)}')

1, sys.getsizeof(d) = 240
2, sys.getsizeof(d) = 240
3, sys.getsizeof(d) = 240
4, sys.getsizeof(d) = 240
5, sys.getsizeof(d) = 240
6, sys.getsizeof(d) = 368
7, sys.getsizeof(d) = 368
8, sys.getsizeof(d) = 368
9, sys.getsizeof(d) = 368
10, sys.getsizeof(d) = 368
11, sys.getsizeof(d) = 648
12, sys.getsizeof(d) = 648
13, sys.getsizeof(d) = 648
14, sys.getsizeof(d) = 648
15, sys.getsizeof(d) = 648
16, sys.getsizeof(d) = 648
17, sys.getsizeof(d) = 648
18, sys.getsizeof(d) = 648
19, sys.getsizeof(d) = 648
20, sys.getsizeof(d) = 648
21, sys.getsizeof(d) = 648
22, sys.getsizeof(d) = 1184
23, sys.getsizeof(d) = 1184
24, sys.getsizeof(d) = 1184
25, sys.getsizeof(d) = 1184
26, sys.getsizeof(d) = 1184

As you can see, as the dictionary adds more key-value pairs, it needs more memory. But it doesn’t grow with each addition; each time it needs more space, it allocates more than it needs, so that the allocations can be relatively rare.
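
The exact thresholds and byte counts shown above come from the Python version I was running, and you’ll likely see different numbers on yours. Here’s a small sketch that reports only the points at which the size changes, so you can try it on your own system. The function name “resize_points” is my own invention for this example:

```python
import sys

def resize_points(n_items):
    """Return (length, size) pairs at which the dict's reported size changed."""
    d = {}
    last_size = sys.getsizeof(d)
    changes = []
    for i in range(n_items):
        d[i] = None
        size = sys.getsizeof(d)
        if size != last_size:
            changes.append((len(d), size))
            last_size = size
    return changes

for length, size in resize_points(1000):
    print(f'{length:>5} items -> {size} bytes')
```

Even with 1,000 insertions, only a handful of lines are printed, which is the whole point of allocating more than is immediately needed.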

What happens if we remove items from our dictionary? Will it return memory to the system? Let’s find out:

>>> for key in list(d.keys()):
        del d[key]

>>> len(d)
0


Notice that in the above code, I didn’t iterate over “d” or “d.keys”. Doing so would have led to an error, because changing a dictionary while iterating over it is a problem. I thus created a list based on the keys, and iterated over that.

You can also see that after removing these name-value pairs from my dict, it is indeed empty. And its memory usage?

>>> sys.getsizeof(d)
1184

In other words: Even though we’ve removed items from our dict, it hasn’t released the memory that it previously allocated. Of course, given how rarely I find myself removing items from dicts in actual Python code, I’m not hugely surprised that this happens. After all, why return memory to the system if you’re unlikely to need to do that? But it means that if you do allocate tons of memory to a dict, then you’re unlikely to get it back until the program ends, even if you remove items.

But wait: What if I remove everything from the dict? There’s a method, “dict.clear“, that does this. I don’t use it very often, but it might at least provide us with some useful data:

>>> d.clear()
>>> len(d)
0
>>> sys.getsizeof(d)
72

Wait a second here: After running “dict.clear”, our dict size is indeed 0. Which is what it was before. But we’re somehow using less memory than we even did at the start, when we created an empty dict! How can that be?

It would seem that when you run “dict.clear”, it removes not only all of the key-value pairs, but also that initial allocation of memory that is done for new, empty dictionaries. Meaning that we now have an “emptier than new” dictionary, taking up a paltry 72 bytes in our system.

If we add a new key-value pair to our dict, then if my theory is right, we should get back to the original size of 240 bytes:

>>> d['a'] = 1
>>> len(d)
1
>>> sys.getsizeof(d)
240

Sure enough, adding that one key-value pair to “d” forced the dictionary to allocate the same amount of memory it had before, back when we first created it.
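
What if you’ve deleted most of a large dict’s items, but not all of them, and want the space back? One option, sketched here under the assumption that you can afford to build a second dict, is to copy the surviving pairs into a brand-new dictionary. The variable names are illustrative, and the exact sizes will vary with your Python version:

```python
import sys

d = {i: None for i in range(10000)}
full_size = sys.getsizeof(d)

# Remove all but ten of the keys; iterate over a list copy of the keys.
for key in list(d):
    if key >= 10:
        del d[key]

after_delete = sys.getsizeof(d)

# Building a new dict from the survivors allocates only what they need.
d = {key: value for key, value in d.items()}
after_rebuild = sys.getsizeof(d)

print(len(d))                          # 10
print(after_delete, after_rebuild)     # exact numbers depend on the Python version
print(after_rebuild < full_size)       # True
```

Note that this only helps if nothing else still refers to the old dict; as long as another variable points to it, the old object and its memory stay around.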