Reuven's blog — Page 11 of 25

Python dicts and memory usage

May 12, 2019, by reuven
Python

Let’s say that we create a new, empty Python dictionary:

>>> d = {}

How much memory does this new, empty dict consume? We can find out with “sys.getsizeof”:

>>> import sys
>>> sys.getsizeof(d)
240

In other words, our dictionary, with nothing in it at all, consumes 240 bytes. Not bad; given how often dictionaries are used in Python, it’s good to know that they don’t normally consume that much memory.

What if I add something to the dict? What will happen to the memory usage?

>>> d['a'] = 1
>>> sys.getsizeof(d)
240

Something seems a bit fishy here, right? How can it be that our newly created dictionary, with zero key-value pairs, takes up the same space in memory as our dictionary with one key-value pair?

The answer is that “sys.getsizeof” is returning the size of the dictionary as a data structure, not the data inside of it. In other words: When we first create a dictionary, it contains eight slots that can be filled with key-value pairs. Only when the dictionary needs to grow, because it has too many key-value pairs for its current size, does it allocate more memory.

Moreover, the key-value pairs themselves aren’t stored in the dict itself. Rather, just a reference to the place in memory that holds the keys and values is stored there. So neither the type nor the size of the data is kept in the dictionary, and it certainly doesn’t affect the result of “sys.getsizeof” for the dictionary. Indeed, watch this:

>>> d['a'] = 'a' * 100000
>>> sys.getsizeof(d)
240

Even when the value is 100,000 characters long, our dictionary only needs 240 bytes.
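Here is a minimal sketch of that point, comparing two one-item dicts whose values differ hugely in size. The names “small”, “large”, and “big_value” are just for illustration, and the exact byte counts you see will vary with your Python version and platform, so the sketch compares sizes rather than assuming any particular number:

```python
import sys

big_value = 'a' * 100000

# Two dicts with the same single key, but very different values
small = {}
small['a'] = 1

large = {}
large['a'] = big_value

# The dict only holds references, so both dicts report the same size
print(sys.getsizeof(small) == sys.getsizeof(large))

# The string itself is a separate object, with its own (much larger) size
value_size = sys.getsizeof(big_value)
print(value_size)
```

If you want to know how much memory a dict and its contents use together, you need to add up the sizes of the keys and values yourself; “sys.getsizeof” won't do that for you.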

What happens as we expand our dictionary? When does it request more memory? Let’s take a look:

>>> d = {}
>>> for one_letter in 'abcdefghijklmnopqrstuvwxyz':
        d[one_letter] = one_letter
        print(f'{len(d)}, sys.getsizeof(d) = {sys.getsizeof(d)}')

1, sys.getsizeof(d) = 240
2, sys.getsizeof(d) = 240
3, sys.getsizeof(d) = 240
4, sys.getsizeof(d) = 240
5, sys.getsizeof(d) = 240
6, sys.getsizeof(d) = 368
7, sys.getsizeof(d) = 368
8, sys.getsizeof(d) = 368
9, sys.getsizeof(d) = 368
10, sys.getsizeof(d) = 368
11, sys.getsizeof(d) = 648
12, sys.getsizeof(d) = 648
13, sys.getsizeof(d) = 648
14, sys.getsizeof(d) = 648
15, sys.getsizeof(d) = 648
16, sys.getsizeof(d) = 648
17, sys.getsizeof(d) = 648
18, sys.getsizeof(d) = 648
19, sys.getsizeof(d) = 648
20, sys.getsizeof(d) = 648
21, sys.getsizeof(d) = 648
22, sys.getsizeof(d) = 1184
23, sys.getsizeof(d) = 1184
24, sys.getsizeof(d) = 1184
25, sys.getsizeof(d) = 1184
26, sys.getsizeof(d) = 1184

As you can see, as the dictionary adds more key-value pairs, it needs more memory. But it doesn’t grow with each addition; each time it needs more space, it allocates more than it needs, so that the allocations can be relatively rare.
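To see just how rare those allocations are, here is a small sketch that records only the points at which the reported size changes. The function name “resize_points” is hypothetical, and I'm assuming integer keys rather than letters so that we can add far more than 26 items. The specific sizes and thresholds depend on your Python version:

```python
import sys

def resize_points(n):
    """Return (length, size) pairs for each insertion that changed the dict's size."""
    d = {}
    last_size = sys.getsizeof(d)
    points = []
    for i in range(n):
        d[i] = i
        size = sys.getsizeof(d)
        if size != last_size:       # the dict just allocated more memory
            points.append((len(d), size))
            last_size = size
    return points

points = resize_points(1000)
for length, size in points:
    print(f'{length} items: {size} bytes')
```

Across 1,000 insertions, you should see only a handful of lines printed, each with a larger size than the one before it.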

What happens if we remove items from our dictionary? Will it return memory to the system? Let’s find out:

>>> for key in list(d.keys()):
        d.pop(key)

>>> len(d)
0

Notice that in the above code, I didn’t iterate over “d” or “d.keys”. Doing so would have led to an error, because changing a dictionary while iterating over it is a problem. I thus created a list based on the keys, and iterated over that.
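A short sketch of what goes wrong, and of the workaround, using a small three-item dict that I've made up for the example:

```python
d = {'a': 1, 'b': 2, 'c': 3}

# Removing items while iterating over the dict itself raises RuntimeError
try:
    for key in d:
        d.pop(key)
except RuntimeError as e:
    error_message = str(e)
    print(f'RuntimeError: {error_message}')

# Iterating over a list of the keys is safe, because the list
# is a separate object that doesn't change as the dict shrinks
for key in list(d):
    d.pop(key)

print(len(d))
```

Note that the first loop does manage to remove one item before Python notices the problem on the next iteration, which is another reason to avoid this pattern.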

You can also see that after removing these key-value pairs from my dict, it is indeed empty. And its memory usage?

>>> sys.getsizeof(d)
1184

In other words: Even though we’ve removed items from our dict, it hasn’t released the memory that it previously allocated. Of course, given how rarely I find myself removing items from dicts in actual Python code, I’m not hugely surprised that this happens. After all, why return memory to the system if you’re unlikely to need to do that? But it means that if you do allocate tons of memory to a dict, then you’re unlikely to get it back until the program ends, even if you remove items.

But wait: What if I remove everything from the dict? There’s a method, “dict.clear”, that does this. I don’t use it very often, but it might at least provide us with some useful data:

>>> d.clear()
>>> len(d)
0
>>> sys.getsizeof(d)
72

Wait a second here: After running “dict.clear”, our dict size is indeed 0. Which is what it was before. But we’re somehow using less memory than we even did at the start, when we created an empty dict! How can that be?

It would seem that when you run “dict.clear”, it removes not only all of the key-value pairs, but also that initial allocation of memory that is done for new, empty dictionaries. Meaning that we now have an “emptier than new” dictionary, taking up a paltry 72 bytes in our system.

If we add a new key-value pair to our dict, then if my theory is right, we should get back to the original size of 240 bytes:

>>> d['a'] = 1
>>> len(d)
1
>>> sys.getsizeof(d)
240

Sure enough, adding that one key-value pair to “d” forced the dictionary to allocate the same amount of memory it had before, back when we first created it.
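Here is a side-by-side sketch of the two ways of emptying a dict, assuming CPython's behavior as described above. The helper “filled” and the variable names are hypothetical, and I compare sizes rather than print specific numbers, since those differ across Python versions:

```python
import sys

def filled(n):
    """Return a new dict with n integer keys."""
    return {i: i for i in range(n)}

# Empty one dict by popping every key
popped = filled(1000)
for key in list(popped):
    popped.pop(key)

# Empty another, identical dict with dict.clear
cleared = filled(1000)
cleared.clear()

print(len(popped), len(cleared))           # both are empty
print(sys.getsizeof(popped))               # still holds its earlier allocation
print(sys.getsizeof(cleared))              # allocation has been released
```

So if you really do need to empty a large dict and want its memory back, “dict.clear” (or simply assigning a new, empty dict) is the way to go.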

Weekly Python Exercise A2 (functions + modules for beginners) closes today

May 10, 2019, by reuven
Python

If you are a relative beginner to Python, and want to improve your understanding of functions and modules, then there’s no better way to do so than practice.

Weekly Python Exercise provides you with that practice, with a family of six 15-week courses. In each course, you get a question on Tuesday, the answer the following Monday, discussion among your cohort in our private forum, and live, monthly office hours.

And today’s the last day to sign up for the latest cohort for beginners, with an emphasis on functions and modules.

Do you have to check on Stack Overflow every time you write a Python function? Then this cohort of WPE is for you.
Do you want to have a better understanding of how scoping — local vs. global vs. builtins — works in Python? Then this cohort of WPE is for you.
Are you confused between *args and **kwargs, and want to know how and when to use them, without using Google? Then this cohort of WPE is for you.
And the Python standard library, which comes with the language — how familiar are you with the most common modules that come with the language? If you want to better understand how to use them, then this cohort of WPE is for you.
Finally, if you’ve wanted to write modules and use them in your code, so that you can reuse functionality across programs, then this cohort of WPE is for you.

Hundreds of previous participants in Weekly Python Exercise have improved their Python fluency, and gotten better at their jobs as a result. If you also want to improve your Python skills, then WPE is a great way to do it.

Want to learn more, or to sign up? Check out Weekly Python Exercise at https://WeeklyPythonExercise.com/ . And if you have questions? Just e-mail me, at reuven@lerner.co.il.

But don’t delay, because today (Friday) is the last day to join! I’ll be running more cohorts of WPE this year, but this particular one (A2) won’t run again until 2020.

“Python Workout” is Manning’s Deal of the Day!

May 8, 2019, by reuven
Python

If you’ve just finished a Python course or book, then you might feel a bit nervous about your Python knowledge. You might be wondering how you can become a master Python developer, solving problems without turning to Stack Overflow every few minutes.

The good news is that you can improve! But getting better at Python means practice, practice, and more practice. Just like everything else in life.

My new book, “Python Workout,” has 50 short Python challenges designed to help you become a more fluent Python developer. And today, it’s Manning’s “Deal of the day,” at 50% off its normal price!

Just go to https://www.manning.com/dotd and get 50% off “Python Workout,” as well as other Manning books.

There’s still time to join Weekly Python Exercise

May 8, 2019, by reuven
Uncategorized

Another cohort of Weekly Python Exercise starts next week! This time, it’s course A2 — for beginners, focusing on functions and modules.

Registration closes on Friday. So if you want to level up your Python skills, you should check out, and register with, Weekly Python Exercise sooner rather than later.

This cohort of WPE is for you, if:

You’re a bit shaky on the difference between *args and **kwargs, and when to use them
You don’t understand why mutable defaults are a bad thing
You don’t know what the “global” keyword does, or why you should (or shouldn’t) use it
You want to create new Python modules, but aren’t sure where to start
You would like to be more familiar with the builtin Python standard library

The best way to learn is through a combination of practice and interactions with others — and that’s what WPE provides.

Weekly Python Exercise is the best way I know of for Python developers to improve their skills, become more fluent, and get better jobs. Join WPE, and you’ll have access not only to 15 weeks of problems and solutions, but also to a community of peers, and to monthly office hours with me.

Want to learn more? Just go to https://WeeklyPythonExercise.com/. Or if you have questions, e-mail them to me, at reuven@lerner.co.il.

My interview on the “Talk Python” podcast

May 7, 2019, by reuven
Python

I was delighted to appear on the popular “Talk Python to Me” podcast, run by Michael Kennedy. In the podcast, I talk about teaching, learning, and teaching Python to companies. If you’re interested in how to learn better or teach better, then I think you’ll enjoy this episode!

The episode is here: https://talkpython.fm/episodes/show/210/making-the-most-out-of-in-person-training

Making your Python decorators even better, with functools.wraps

May 5, 2019, by reuven
Python

The good news: I gave a talk on Friday morning, at PyCon 2019, called “Practical decorators.”

The better news: It was a huge crowd, and people have responded very warmly to the talk. Thanks to everyone at PyCon who came to talk to me about it!

However: Several people, at the talk and afterwards, asked me about “functools.wraps”.

So, please think of this post as an addendum to my talk.

Let’s assume that I have the same simple decorator that I showed at the top of my talk, “mydeco”, which takes a function’s output and puts it into a string, followed by three exclamation points:

def mydeco(func):
    def wrapper(*args, **kwargs):
        return f'{func(*args, **kwargs)}!!!'
    return wrapper

Let’s now decorate two different functions with “mydeco”:

@mydeco
def add(a, b):
    '''Add two objects together, the long way'''
    return a + b

@mydeco
def mysum(*args):
    '''Sum any numbers together, the long way'''
    total = 0
    for one_item in args:
        total += one_item
    return total

What happens when I run these functions? They do what we would expect:

>>> add(10, 20)
'30!!!'

>>> mysum(10, 20, 30, 40, 50)
'150!!!'

Fantastic! We get each function’s result back, as a string, with the exclamation points. The decorator worked.

But there are a few issues with what we did. For example, what if I ask each function for its name:

>>> add.__name__
'wrapper'

>>> mysum.__name__
'wrapper'

The __name__ attribute, which gives us the name of a function when we define it, now reflects the returned internal function, “wrapper”, in our decorator. Now, this might be true, but it’s not helpful.

It gets even worse if we ask to see the docstring:

>>> help(add)
Help on function wrapper in module __main__:
wrapper(*args, **kwargs)

>>> help(mysum)
Help on function wrapper in module __main__:
wrapper(*args, **kwargs)

In other words: We are now getting the docstring and function signature of “wrapper”, the inner function. And this is a problem, because now someone cannot easily find out how our decorated function works.

We can solve this problem, at least partially, by assigning to the __name__ and __doc__ attributes in our decorator:

def mydeco(func):
    def wrapper(*args, **kwargs):
        return f'{func(*args, **kwargs)}!!!'
    wrapper.__name__ = func.__name__
    wrapper.__doc__ = func.__doc__
    return wrapper

If we use this version of the decorator, then each time we return “wrapper”, we do so only after assigning the original function’s name and docstring to it. With that change, things will work the way we want. Mostly:

>>> help(add)
Help on function add in module __main__:

add(*args, **kwargs)
     Add two objects together, the long way

>>> help(mysum)
Help on function mysum in module __main__:

mysum(*args, **kwargs)
    Sum any numbers together, the long way

The good news is that we’ve now fixed the naming and the docstring problem. But the function signature is still that super-generic one, looking for both *args and **kwargs.
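You can confirm this without “help”, by asking “inspect.signature” (from the standard library) what it sees. This is a small sketch using the same decorator and the “add” function from above:

```python
import inspect

def mydeco(func):
    def wrapper(*args, **kwargs):
        return f'{func(*args, **kwargs)}!!!'
    # Copy the name and docstring by hand
    wrapper.__name__ = func.__name__
    wrapper.__doc__ = func.__doc__
    return wrapper

@mydeco
def add(a, b):
    '''Add two objects together, the long way'''
    return a + b

print(add.__name__)              # add
print(add.__doc__)               # Add two objects together, the long way
print(inspect.signature(add))    # (*args, **kwargs)
```

The name and docstring now belong to “add”, but the signature is still that of “wrapper”.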

The solution, as people reminded me after my talk, is to use functools.wraps. It’s designed to solve precisely these problems. The irony, of course, is that it might make your head spin a bit more than decorators normally do, because functools.wraps is … a decorator, which takes an argument! Here’s how it looks:

from functools import wraps

def mydeco(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        return f'{func(*args, **kwargs)}!!!'
    return wrapper

Notice what we’ve done here: We have used the “functools.wraps” decorator to decorate our inner function, “wrapper”. We’ve passed it an argument of “func”, the decorated function passed to “mydeco”. By applying this “wraps” decorator to our inner function, we copy over func’s name, docstring, and signature to our inner function, avoiding the issues that we had seen before:

>>> help(add)
Help on function add in module __main__:
add(a, b)
     Add two objects together, the long way

>>> help(mysum)
Help on function mysum in module __main__:
mysum(*args)
     Sum any numbers together, the long way

So, to answer the questions that I got after my talk: Yes, I would definitely recommend using functools.wraps! It costs you almost nothing (i.e., one line of code), and makes your decorated function work more normally and naturally. And I’m going to try to find a way to squeeze this recommendation into future versions of this talk, as well.
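For anyone who wants to check the result programmatically rather than through “help”, here is a short, self-contained sketch. It relies on two things that “functools.wraps” does: it copies attributes such as “__name__” and “__doc__”, and it stores the original function in a “__wrapped__” attribute, which “inspect.signature” follows by default:

```python
import inspect
from functools import wraps

def mydeco(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        return f'{func(*args, **kwargs)}!!!'
    return wrapper

@mydeco
def add(a, b):
    '''Add two objects together, the long way'''
    return a + b

print(add.__name__)              # add
print(add.__doc__)               # Add two objects together, the long way
print(inspect.signature(add))    # (a, b)

# The undecorated function is still reachable, which can be handy in tests
print(add(10, 20))               # 30!!!
print(add.__wrapped__(10, 20))   # 30
```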

Get code + slides from my “Practical Decorators” talk from EuroPython / PyCon 2019

May 3, 2019, by reuven
Python

I presented my “Practical Decorators” talk twice this year — once at PyCon 2019 in Cleveland, and again at EuroPython 2019 in Basel. Here is the video of my presentation in Cleveland:

If you want to get the PDF of my slides, as well as the Python code that I showed, then just enter your e-mail address here. I’ll send you a link to the zipfile that you can download.


Announcing: Weekly Python Exercise A2 — functions and modules for Python beginners

May 2, 2019, by reuven
Python

I spend just about every day teaching Python to people at companies around the world.

I’m always amazed to see just how popular Python is, and how many people are using it — and in how many ways they are using it.

But I’m also amazed by how many people are just “getting by” with Python. You know, they’re able to write some basic functions, and read data from files, and even perform some basic manipulations on their data, without too much help.

But those people are turning to Stack Overflow for just about anything non-trivial. And that might seem fine, except:

They’re spending lots of time just searching for the right answer to their questions
Then they’re spending lots of time modifying the answer they found online, usually through trial and error
Then they’re not really sure what they’ve done, so if it breaks, they’re out of luck.

Does this describe you? Because it describes a huge number of the people I teach.

These people can use Python, in the same way that you can use a phrasebook to get around in a foreign country whose language you don’t speak. Yes, you can get through some basic tasks. But you’ll never be able to take on big jobs, and you’ll always feel frustrated, or stuck, that you don’t really know what you’re doing.

And maybe you’re even a bit nervous that your boss will discover just how little Python you know.

And besides, let’s face it: There are many problems you can’t Google your way out of.

Fortunately, there is a solution to this problem: Practice. If you have already learned Python’s basics, but you haven’t learned how to actually use the language, then you need practice. Just as if you want to learn a language, you need to surround yourself with it, so that you start to think in that language.

Weekly Python Exercise, now in its third year, is my best solution to this problem of Python non-fluency. Each WPE cohort has 15 exercises (and detailed solutions), which you solve along with others taking the course at the same time as you. WPE is designed to force you to think in new ways, to become more familiar with Python’s syntax, libraries, and capabilities — and then to be better at your current job, or even (I hope) get a better job in the future.

On May 14th, I’ll be starting a new cohort of Weekly Python Exercise A2. “A” is the level (for beginners), and this is the 2nd course in the A series. A2 focuses on Python functions and modules. So we’ll talk about function parameters and defaults, a bit of passing functions as arguments to other functions, and then how to best use the modules in Python’s standard libraries to accomplish your goals.

Registration is only open until May 10th. So if you want to join this cohort, you should act now!

Click here, and learn more about Weekly Python Exercise!

Improve your Python skills with my new book: Python Workout

April 29, 2019, by reuven
Python

A few years ago, I noticed that many of the participants in my corporate Python courses were asking the same question: How can I get additional practice?

And I have to say, this question made a lot of sense. After all, you can only absorb so much information from a course, regardless of how good it is. It’s only through repeated practice that you really gain the mastery and fluency that you need. This is true in sports. This is true in language. This is true in crossword puzzles. And it’s true in programming — even in a language as straightforward as Python.

Thus was born “Practice Makes Python,” my first ebook. That ebook became a course with accompanying videos. Those led me to write another book, a bunch of additional video courses (with many more on the way), and (of course) Weekly Python Exercise, now a family of six 15-week courses.

Well, I have exciting news to announce today: “Practice Makes Python” has undergone massive editing and expansion, and is being republished by Manning as “Python Workout.”

How has it changed?

It now uses Python 3 exclusively.
I’ve added many diagrams and figures.
Just about every exercise has a link to PythonTutor.com, where you can follow the code yourself, line by line.
There are numerous sidebars, describing aspects of Python’s functionality that you might not have understood previously.
After presenting my solution to an exercise, I then present three additional “beyond the exercise” challenges.
It has gone through a lot of editing by people with a great deal of experience in the editing and publishing worlds.

The book was just released as a MEAP (“Manning Early Access Program”), which means that it’s available as an online book today, with three of the 10 chapters already online. The next three chapters should be released within the next 1-2 months, and the full book should be done (if all goes well) by September or October. The videos are still, for the time being, the old ones that use Python 2 — but will be replaced in the coming months, as well.

If you buy the MEAP, you’ll have access to these updates as they happen, and will also be able to tell me what worked well… and what didn’t. You can be sure that I’m always experimenting with my exercises, trying to figure out how to get the questions, the tasks, and the explanations to be a bit more effective and useful to people.

If this sounds good, then I want to make it even better: As a reader of my blog, you can get 50% off “Python Workout” by using the promo code mllerner50. Note that this promo code is good for all Manning books, in all formats (online and print). So if you see other things you like, go wild!

Once again: Get “Python Workout” for 50% off with the promo code mllerner50!

Announcing: My new NumPy course is live!

March 29, 2019, by reuven
Python

Guess what? Python is the #1 language for data science. I know, it doesn’t seem like this should be true. Python is a great language, and easy to learn, but it’s not the most efficient language, either in execution speed or in its memory usage.

That’s where NumPy comes in: NumPy lets you have the best of both worlds, enjoying Python’s friendliness with C’s efficiency. As a result:

Companies are switching from C, C++, and Java to Python — because NumPy allows them to do so, with no loss of execution speed and with huge gains in their programmer productivity.
Companies are switching from Matlab to Python — because Python’s open-source license saves them huge amounts of money, and NumPy provides the functionality they need.
Developers who never saw themselves as analysts or data scientists are learning these disciplines, because NumPy gives them an easy onramp into doing so.
Students are discovering that you don’t need to choose between a high-level language and ruthless program efficiency, thanks to NumPy.

So, what’s the problem? Well, NumPy works differently from regular Python data structures. Learning the ins and outs, and how to apply these ideas to your own work, can take some time, even (or especially) if you have lots of Python experience.

It shouldn’t come as a surprise, then, that my “Intro to data science with Python” course has become one of my most popular. Companies around the world, from Apple to Ericsson, IBM to PayPal, VMWare to Western Digital, have asked me to teach it to their engineers. What do I teach on the first day of that course? NumPy. Because without NumPy, you can’t do any serious data science with Python.

Companies keep inviting me back, time after time, to teach this course. Almost immediately, their people use the techniques I teach to do more in less time — which is, after all, the promise that Python has offered to us.

I’m thus delighted to announce that my new “NumPy” course is available online. This course includes nearly 5 hours of videos and nearly 60 exercises, designed to help you understand how to use NumPy — along with its companion software, Jupyter and Matplotlib. It includes the same content as I present to these Fortune 500 companies, but for your own personal use, whenever and wherever you want to learn.

If you’re a programmer itching to learn data science, then this course is for you — providing an introduction to data science.
If you’re a data scientist interested in learning Python, then this course is for you — showing you how Python can serve your analysis needs.
If you’re an analyst who wants to use Python instead of Excel, then this course is for you — giving you a step-by-step introduction to the NumPy library.
If your job involves heavy doses of math, then this course is for you — showing you how NumPy can, together with Python, help you write easy-to-maintain code that executes at blazing speeds.

In short: If you want to learn one of the hottest topics in the computer industry, gaining skills that are highly in demand, then this course is for you.

Buy NumPy now

Want to learn more? Just go to the course page, and see what topics I cover. You can even watch a few of the videos for free. And then start your data-science journey with the tool that is getting lots of people excited: NumPy.

Learn more about my NumPy course at https://store.lerner.co.il/numpy .
