January 18, 2015, by Reuven
Python

One of the first things that Python programmers learn is that you can easily read through the contents of an open file by iterating over it:

f = open('/etc/passwd')
for line in f:
    print(line)

Note that the above code is possible because our file object “f” is an iterator. In other words, f knows how to behave inside of a loop — or any other iteration context, such as a list comprehension.
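
For example, here is a small sketch that builds a list with a comprehension instead of a for loop; it assumes the usual colon-separated format of /etc/passwd:

f = open('/etc/passwd')
shells = [line.split(':')[-1].strip() for line in f]   # one entry per line of the file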

Most of the students in my Python courses come from other programming languages, in which they are expected to close a file when they’re done using it. It thus doesn’t surprise me when, soon after I introduce them to files in Python, they ask how we’re expected to close them.

The simplest answer is that we can explicitly close our file by invoking f.close(). Once we have done that, the object continues to exist — but we can no longer read from it, and the object’s printed representation will also indicate that the file has been closed:

>>> f = open('/etc/passwd')
>>> f
<open file '/etc/passwd', mode 'r' at 0x10f023270>
>>> f.read(5)
'##\n# '

>>> f.close()
>>> f
<closed file '/etc/passwd', mode 'r' at 0x10f023270>

>>> f.read(5)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-11-ef8add6ff846> in <module>()
----> 1 f.read(5)
ValueError: I/O operation on closed file

But here’s the thing: When I’m programming in Python, it’s pretty rare for me to explicitly invoke the “close” method on a file. Moreover, the odds are good that you probably don’t want or need to do so, either.

The preferred, best-practice way of opening files is with the “with” statement, as in the following:

with open('/etc/passwd') as f:
    for line in f:
        print(line)

The “with” statement invokes what Python calls a “context manager” on f. That is, it assigns f to be the new file instance, pointing to the contents of /etc/passwd. Within the block of code opened by “with”, our file is open, and can be read from freely.

However, once Python exits from the “with” block, the file is automatically closed. Trying to read from f after we have exited from the “with” block will result in the same ValueError exception that we saw above. Thus, by using “with”, you avoid the need to explicitly close files. Python does it for you, in a somewhat un-Pythonic way, magically, silently, and behind the scenes.
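
If the magic bothers you, it may help to know that the "with" block above behaves roughly like the following hand-written sketch (an approximation, not the exact code that Python runs):

f = open('/etc/passwd')
try:
    for line in f:
        print(line)
finally:
    f.close()   # runs whether or not the loop raises an exception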

But what if you don’t explicitly close the file? What if you’re a bit lazy, and neither use a “with” block nor invoke f.close()?  When is the file closed?  When should the file be closed?

I ask this, because I have taught Python to many people over the years, and am convinced that trying to teach “with” and/or context managers, while also trying to teach many other topics, is more than students can absorb. While I touch on “with” in my introductory classes, I normally tell them that at this point in their careers, it’s fine to let Python close files, either when the reference count to the file object drops to zero, or when Python exits.

In my free e-mail course about working with Python files, I took a similarly with-less view of things, and didn’t use it in all of my proposed solutions. Several people challenged me, saying that not using “with” is showing people a bad practice, and runs the risk of having data not saved to disk.

I got enough e-mail on the subject to ask myself: When does Python close files, if we don’t explicitly do so ourselves or use a “with” block? That is, if I let the file close automatically, then what can I expect?

My assumption was always that Python closes files when the object’s reference count drops to zero, and thus is garbage collected. This is hard to prove or check when we have opened a file for reading, but it’s trivially easy to check when we open a file for writing. That’s because when you write to a file, the contents aren’t immediately flushed to disk (unless you pass 0, for unbuffered output, as the third, optional "buffering" argument to "open"), but are only flushed when the file is closed.

I thus decided to conduct some experiments, to better understand what I can (and cannot) expect Python to do for me automatically. My experiment consisted of opening a file, writing some data to it, deleting the reference, and then exiting from Python. I was curious to know when the data would be written, if ever.

My experiment looked like this:

f = open('/tmp/output', 'w')
f.write('abc\n')
f.write('def\n')
# check contents of /tmp/output (1)
del(f)
# check contents of /tmp/output (2)
# exit from Python
# check contents of /tmp/output (3)

In my first experiment, conducted with Python 2.7.9 on my Mac, I can report that at stage (1) the file existed but was empty, and at stages (2) and (3), the file contained all of its contents. Thus, it would seem that in CPython 2.7, my original intuition was correct: When a file object is garbage collected, its __del__ (or the equivalent thereof) flushes and closes the file. And indeed, invoking “lsof” on my IPython process showed that the file was closed after the reference was removed.

What about Python 3?  I ran the above experiment under Python 3.4.2 on my Mac, and got identical results. Removing the final (well, only) reference to the file object resulted in the file being flushed and closed.

This is good for 2.7 and 3.4.  But what about alternative implementations, such as PyPy and Jython?  Perhaps they do things differently.

I thus tried the same experiment under PyPy 2.7.8. And this time, I got different results!  Deleting the reference to our file object — that is, stage (2), did not result in the file’s contents being flushed to disk. I have to assume that this has to do with differences in the garbage collector, or something else that works differently in PyPy than in CPython. But if you’re running programs in PyPy, then you should definitely not expect files to be flushed and closed, just because the final reference pointing to them has gone out of scope. lsof showed that the file stuck around until the Python process exited.

For fun, I decided to try Jython 2.7b3. And Jython exhibited the same behavior as PyPy: Deleting the final reference did not flush the data; only exiting from the process ensured that the data was flushed from the buffers and stored to disk.

I repeated these experiments, but instead of writing “abc\n” and “def\n”, I wrote “abc\n” * 1000 and “def\n” * 1000.

In the case of Python 2.7, nothing was written after the “abc\n” * 1000. But when I wrote “def\n” * 1000, the file contained 4096 bytes — which probably indicates the buffer size. Invoking del(f) to remove the reference to the file object resulted in its being flushed and closed, with a total of 8,000 bytes. So in the case of Python 2.7, the behavior is basically the same regardless of string size; the only difference is that if you exceed the size of the buffer, then some data will be written to disk before the final flush + close.

In the case of Python 3, the behavior was different: No data was written after either of the 4,000 byte outputs written with f.write. But as soon as the reference was removed, the file was flushed and closed. This might point to a larger buffer size. But still, it means that removing the final reference to a file causes the file to be flushed and closed.

In the case of PyPy and Jython, the behavior with a large file was the same as with a small one: The file was flushed and closed when the PyPy or Jython process exited, not when the last reference to the file object was removed.

Just to double check, I also tried these using “with”. In all of these cases, it was easy to predict when the file would be flushed and closed: When the block exited, and the context manager fired the appropriate method behind the scenes.

In other words: If you don’t use “with”, then your data isn’t necessarily in danger of disappearing — at least, not in simple situations. However, you cannot know for sure when the data will be saved — whether it’s when the final reference is removed, or when the program exits. If you’re assuming that files will be closed when functions return, because the only reference to the file is in a local variable, then you might be in for a surprise. And if you have multiple processes or threads writing to the same file, then you’re really going to want to be careful here.
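
If you find yourself in that situation and cannot restructure the code around "with", an explicit flush or close removes the guesswork, regardless of which implementation you are running. A minimal sketch:

f = open('/tmp/output', 'w')
f.write('abc\n' * 1000)
f.flush()    # push the buffered data out to disk now
f.close()    # and close deterministically, on CPython, PyPy, and Jython alike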

Perhaps this behavior could be specified better, and thus work similarly or identically on different platforms? Perhaps we could even see the start of a Python specification, rather than pointing to CPython and saying, “Yeah, whatever that version does is the right thing.”

I still think that “with” and context managers are great. And I still think that it’s hard for newcomers to Python to understand what “with” does. But I also think that I’ll have to start warning new developers that if they decide to use alternative versions of Python, there are all sorts of weird edge cases that might not work identically to CPython, and that might bite them hard if they’re not careful.


January 5, 2015, by Reuven
Education

If you’re like me, you love to learn. And in our industry, a primary way of learning involves attending conferences.

However, if you’re like me, you never have the time to actually attend them.  (In my case, the fact that I live far away from where many conferences take place is an additional hindrance.)

Fortunately, a very large number of talks at modern conferences are recorded. This means that even if you didn’t attend a conference, you can still enjoy (and learn from) the talks that were there.

However, this leads to a new and different problem: There are too many talks for any one person to watch. How can you find things that are interesting and relevant?

My latest side project aims to solve this problem, at least in part: DailyTechVideo.com offers, as its name implies, a high-quality, thought-provoking talk about technology each day. To date, almost all of the talks reflect the technologies that are of interest to me, which typically means that they are open source programming languages, databases, or Web application frameworks. But I have tried to include conference videos that have provoked and prodded my thinking, and which are likely to be helpful for other professionals in the computer industry. Moreover, I’m hoping to receive suggestions from people who have seen interesting videos in fields with which I’m less familiar (e.g., hardware or robotics), who can help me to improve my own understanding and knowledge.

So if you enjoy learning, I invite you to subscribe to DailyTechVideo.com, and/or to follow its Twitter feed at @DailyTechVideo.

And if you can suggest videos to include, e-mail me at reuven@lerner.co.il, or tweet me at @ReuvenMLerner or @DailyTechVideo. I already have another 4-5 weeks of videos queued up, but I’m always on the lookout for new and interesting ones.

November 13, 2014, by Reuven
Python

The latest draft of “Practice Makes Python,” my ebook intended to sharpen your Python programming skills, is now out. This draft includes all 50 exercises, solutions, and explanations that I had hoped to include in the book.

I’m very excited to have reached this milestone, and appreciate the input from my many students and colleagues who have provided feedback.

The next steps in my release of the book are: Use a different toolchain that will allow for internal hyperlinks in the PDF, generate epub and mobi formats, and then start on the video explanations that will be included in a higher-tier version of the book package. Even without these steps, the content of the book is ready, and is a great way for you to improve your Python skills. The book is not meant to teach you Python, and assumes that you are familiar with the basics.

Please check out the latest version of Practice Makes Python. In case you’re not sure whether the book is for you, I am enclosing another sample exercise, this time from the chapter on modules and packages. As always, comments and suggestions are welcome.

Sales tax

The Republic of Freedonia has a strange tax system. To help businesses calculate their sales taxes, the government has decided to provide a Python software library.

Sales tax on a purchase depends on where the purchase was made, as well as the time of the purchase. Freedonia has four provinces, each of which charges a different percentage of tax:

  • Chico: 50%
  • Groucho: 70%
  • Harpo: 50%
  • Zeppo: 40%

Yes, the taxes are quite high in Freedonia. (So high, in fact, that they are said to have a Marxist government.) However, these taxes rarely apply in full. That’s because the amount of tax applied depends on the hour at which the purchase takes place. The tax percentage is always multiplied by the hour (as a fraction of 24) at which the purchase was made. At midnight, there is no sales tax. From 12 noon until 1 p.m., only 50% (12/24) of the tax applies. And from 11 p.m. until midnight, 95.8% (i.e., 23/24) of the tax applies.

Your job is to implement that Python module, “freedonia.py”. It should provide a function, “calculate_tax”, which takes three arguments: The amount of the purchase, the province in which the purchase took place, and the hour (using 24-hour notation) at which it happened. The “calculate_tax” function should return the final price.

Thus, if I were to invoke

calculate_tax(500, 'Harpo', 12)

the function should return 625. A $500 purchase in Harpo province (with 50% tax) would normally come to $750. However, because the purchase was made at 12 noon, the tax is only half of its usual amount, or $125, for a total of $625. If the purchase were made at 9 p.m. (i.e., 21:00 on a 24-hour clock), then the tax would be 87.5% of its full rate, or 43.75%, for a total price of $718.75.

Note that while you can still use a single file, exercises such as this one lend themselves to having two files, one of which (“use_freedonia.py”) imports and then uses “freedonia.py”.

Solution

# freedonia.py

# Tax rate charged in each province
rates = {
    'Chico': 0.5,
    'Groucho': 0.7,
    'Harpo': 0.5,
    'Zeppo': 0.4
}

def time_percentage(hour):
    # Fraction of the full tax rate that applies at this hour;
    # dividing by 24.0 (a float) avoids Python 2 integer division
    return hour / 24.0

def calculate_tax(amount, province, hour):
    # Final price: the purchase amount plus the time-adjusted provincial tax
    return amount + (amount * rates[province] * time_percentage(hour))

And now, the program that uses it:

# use_freedonia.py
from freedonia import calculate_tax

print("You owe a total of: {}".format(calculate_tax(100, 'Harpo', 12)))

print("You owe a total of: {}".format(calculate_tax(100, 'Harpo', 21)))

Discussion

The “freedonia” module does precisely what a Python module should do: Namely, it defines data structures and functions that provide functionality to one or more other programs. By providing this layer of abstraction, it allows a programmer to focus on what is important to him or her, such as the implementation of an online store, without having to worry about the nitty-gritty of particular details.

While some countries have extremely simple systems for calculating sales tax, others — such as the United States — have many overlapping jurisdictions, each of which applies its own sales tax, often at different rates and on different types of goods. Thus, while the Freedonia example is somewhat contrived, it is not unusual to purchase or use libraries of this sort for calculating sales taxes.

Our module defines a dictionary (“rates”), in which the keys are the provinces of Freedonia, and the values are the taxation rates that should be applied there. Thus, we can find out the rate of taxation in Groucho province with “rates[‘Groucho’]”. Or we can ask the user to enter a province name in the “province” variable, and then get “rates[province]”. Either way, that will give us a floating-point number which we can use to calculate the tax.

A wrinkle in the calculation of Freedonian taxation is the fact that taxes get progressively higher as the day goes on. In order to make this calculation easier, I wrote a “time_percentage” function, which simply takes the hour and returns it as a percentage of 24 hours. In Python 2, integer division always returns an integer, even when that means throwing away the remainder. Thus, we divide the current hour not by “24” (an int) but by “24.0” (a float), which ensures that the result will be a floating-point number.
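
A quick illustration of that Python 2 pitfall, and of why the function divides by 24.0 rather than 24:

>>> 12 / 24      # Python 2: integer division truncates
0
>>> 12 / 24.0    # dividing by a float gives the fraction we actually want
0.5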

Finally, the “calculate_tax” function takes three parameters — the amount of the sale, the name of the province in which the sale is taking place, and the hour at which the sale happened — and returns a floating-point number indicating the final price, with the tax included.

It should be noted that if you’re actually doing calculations involving serious money, you should almost certainly *not* be using floats. Rather, you should use integers, and then calculate everything in terms of cents, rather than dollars. This avoids the fact that floating-point numbers are not completely accurate on computers. (Try to add “0.1” and “0.7” in Python, and see what the result is.) However, for the purposes of this example, and given the current state of the Freedonian economy in any event, this is an acceptable risk for us to take.
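
For example, here is what that addition looks like in a recent CPython (the exact digits shown may vary slightly between versions), along with the cents-based alternative:

>>> 0.1 + 0.7
0.7999999999999999
>>> 10 + 70     # counting in cents sidesteps the problem entirely
80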

October 22, 2014, by Reuven
Business, Education, Python

My first ebook, “Practice Makes Python” — containing 50 exercises that will help to sharpen your Python skills — is now available for early-bird purchase!

The book is already about 130 pages (and 26,000 words) long, containing about 40 exercises on such subjects as basic data structures, working with files, functional programming, and object-oriented development. But it’s not quite done, and thus I’m calling this an “early-bird” purchase of the book: Not all of the exercises are ready, the formatting isn’t quite there yet, and PDF is the only format available for now. That said, even in this draft version, there is more than enough here to help many Python developers to gain fluency and improve their skills with the language.

Anyone who purchases the book now can use the coupon code EARLY to get a 10% discount. Perhaps it goes without saying, but anyone buying the book now will also get all updates and improvements, free of charge, as they occur over the coming weeks. And anyone who finds that they didn’t get value from the book is welcome to e-mail me and say so — and I’ll refund 100 percent of your purchase price.

The basic idea behind “Practice Makes Python” is that learning Python — or any language — is a long, slow process. Even the best courses cannot possibly give you enough practice with the language for it to feel natural. That only comes with practice. Most people end up practicing, as it were, on projects at work. My goal with this book is to give people who have taken Python courses a chance to become more familiar with the language.

My PhD studies in Learning Sciences taught me a great deal about how people learn, and one of the most important lessons was that of “constructionism” — that one of the best ways to learn is through the creation of things that are important to the individual. I have tried to make the exercises in “Practice Makes Python” interesting and fun, as well as relevant to what people do with Python on a day-to-day basis. Perhaps you won’t be creating Pig Latin translation programs in your day job, but the techniques that you learn from writing such programs in the book will undoubtedly help you out. Certainly, by working through the exercises — not by reading the answers and discussions! — you will learn a great deal about Python programming.

If you recently took a course in Python, or even if you have been working with it for up to a year, I believe that “Practice Makes Python” will give you the knowledge and confidence you need to master this fun and interesting language. These exercises are based on the many Python courses I have taught in the United States, Europe, Israel, and China over the years, and have proven themselves to help programmers start to really “get” Python.

I’d be delighted to hear what you think about “Practice Makes Python,” and how it can help to improve people’s Python programming skills even more. Contact me at reuven@lerner.co.il if you have thoughts or ideas.

October 14, 2014, by Reuven
Python

Newcomers to Python are often amazed to discover how easily we can create new classes. For example:

class Foo(object): 
    pass

is a perfectly valid (if boring) class. We can even create instances of this class:

f = Foo()

This is a perfectly valid instance of Foo. Indeed, if we ask it to identify itself, we’ll find that it’s an instance of Foo:

>>> type(f)
<class '__main__.Foo'>


Now, while “f” might be a perfectly valid and reasonable instance of Foo, it’s not very useful. It’s at this point that many people who have come to Python from another language expect to learn where they can define instance variables. They’re relieved to know that they can write an  __init__ method, which is invoked on a new object immediately after its creation. For example:

class Foo(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

>>> f = Foo(100, 'abc')

>>> f.x
100
>>> f.y
'abc'

On the surface, it might seem like we’re setting two instance variables, x and y, on f, our new instance of Foo. And indeed, the behavior is something like that, and many Python programmers think in these terms. But that’s not really the case, and the sooner that Python programmers stop thinking in terms of “instance variables” and “class variables,” the sooner they’ll understand how much of Python works, why objects work in the ways that they do, and how “instance variables” and “class variables” are specific cases of a more generalized system that exists throughout Python.

The bottom line is that inside of __init__, we’re adding new attributes to self, the local reference to our newly created object.  Attributes are a fundamental part of objects in Python. Heck, attributes are fundamental to everything in Python. The sooner you understand what attributes are, and how they work, the sooner you’ll have a deeper understanding of Python.

Every object in Python has attributes. You can get a list of those attributes using the built-in “dir” function. For example:

>>> s = 'abc'
>>> len(dir(s))
71
>>> dir(s)[:5]
['__add__', '__class__', '__contains__', '__delattr__', '__doc__']

>>> i = 123
>>> len(dir(i))
64
>>> dir(i)[:5]
['__abs__', '__add__', '__and__', '__class__', '__cmp__']

>>> t = (1,2,3)
>>> len(dir(t))
32
>>> dir(t)[:5]
['__add__', '__class__', '__contains__', '__delattr__', '__doc__']

As you can see, even the basic data types in Python have a large number of attributes. We can see the first five attributes by limiting the output from “dir”; you can look at them yourself inside of your Python environment.

The thing is, these attribute names returned by “dir” are strings. How can I use this string to get or set the value of an attribute? We somehow need a way to translate between the world of strings and the world of attribute names.

Fortunately, Python provides us with several built-in functions that do just that. The “getattr” function lets us get the value of an attribute. We pass “getattr” two arguments: The object whose attribute we wish to read, and the name of the attribute, as a string:

>>> getattr(t, '__class__')
tuple

This is equivalent to:

>>> t.__class__
tuple

In other words, the dot notation that we use in Python all of the time is nothing more than syntactic sugar for “getattr”. Each has its uses; dot notation is far easier to read, and “getattr” gives us the flexibility to retrieve an attribute value with a dynamically built string.
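
Here is a small sketch of that flexibility, building the attribute name at runtime; the attribute chosen here is purely for illustration:

>>> method_name = 'up' + 'per'        # imagine this string coming from user input
>>> getattr('abc', method_name)()     # same as 'abc'.upper()
'ABC'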

Python also provides us with “setattr”, a function that takes three arguments: An object, a string indicating the name of the attribute, and the new value of the attribute. There is no difference between “setattr” and using the dot-notation on the left side of the = assignment operator:

>>> f = Foo()
>>> setattr(f, 'x', 5)
>>> getattr(f, 'x')
5
>>> f.x
5
>>> f.x = 100
>>> f.x
100

As with all assignments in Python, the new value can be any legitimate Python object. In this case, we’ve assigned f.x to be 5 and 100, both integers, but there’s no reason why we couldn’t assign a tuple, dictionary, file, or even a more complex object. From Python’s perspective, it really doesn’t matter.

In the above case, I used “setattr” and the dot notation (f.x) to assign a new value to the “x” attribute. f.x already existed, because it was set in __init__. But what if I were to assign an attribute that didn’t already exist?

The answer: It would work just fine:

>>> f.new_attrib = 'hello'
>>> f.new_attrib
'hello' 

>>> f.favorite_number = 72
>>> f.favorite_number
72

In other words, we can create and assign a new attribute value by … well, by assigning to it, just as we can create a new variable by assigning to it. (There are some exceptions to this rule, mainly in that you cannot add new attributes to many built-in classes.) Python is much less forgiving if we try to retrieve an attribute that doesn’t exist:

>>> f.no_such_attribute
AttributeError: 'Foo' object has no attribute 'no_such_attribute'

So, we’ve now seen that every Python object has attributes, that we can retrieve existing attributes using dot notation or “getattr”, and that we can always set attribute values. If the attribute didn’t exist before our assignment, then it certainly exists afterwards.

We can assign new attributes to nearly any object in Python. For example:

def hello():
    return "Hello"

>>> hello.abc_def = 'hi there!'

>>> hello.abc_def
'hi there!'

Yes, Python functions are objects. And because they’re objects, they have attributes. And because they’re objects, we can assign new attributes to them, as well as retrieve the values of those attributes.

So the first thing to understand about these “instance variables” that we oh-so-casually create in our __init__ methods is that we’re not creating variables at all. Rather, we’re adding one or more additional attributes to the particular object (i.e., instance) that has been passed to __init__. From Python’s perspective, there is no difference between saying “self.x = 5” inside of __init__, or “f.x = 5” outside of __init__. We can add new attributes whenever we want, and the fact that we do so inside of __init__ is convenient, and makes our code easier to read.

This is one of those conventions that is really useful to follow: Yes, you can create and assign object attributes wherever you want. But it makes life so much easier for everyone if you assign all of an object’s attributes in __init__, even if it’s just to give it a default value, or even None. Just because you can create an attribute whenever you want doesn’t mean that you should do such a thing.
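
In other words, a class following this convention looks something like the following sketch (the attribute names are just examples):

class Person(object):
    def __init__(self, first, last):
        self.first = first
        self.last = last
        self.email = None    # not known yet, but declared here with a default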

Now that you know every object has attributes, it’s time to consider the fact that classes (i.e., user-defined types) also have attributes. Indeed, we can see this:

>>> class Foo(object):
        pass

Can we assign an attribute to a class?  Sure we can:

>>> Foo.bar = 100
>>> Foo.bar
100

Classes are objects, and thus classes have attributes. But it seems a bit annoying and roundabout for us to define attributes on our class in this way. We can define attributes on each individual instance inside of __init__; where, and when, can we stick the equivalent attribute assignments for the class itself?

The answer is easier than you might imagine. That’s because there is a fundamental difference between the body of a function definition (i.e., the block under a “def” statement) and the body of a class definition (i.e., the block under a “class” statement). A function’s body is only executed when we invoke the function. However, the body of a class definition is executed immediately, and only once — when we define the class. We can execute code in our class definitions:

class Foo(object):
    print("Hello from inside of the class!")

Of course, you should never do this, but this is a byproduct of the fact that class definitions execute immediately. What if we put a variable assignment in the class definition?

class Foo(object):
    x = 100

If we assign a variable inside of the class definition, it turns out that we’re not assigning a variable at all. Rather, we’re creating (and then assigning to) an attribute. The attribute is on the class object. So immediately after executing the above, I can say:

Foo.x

and I’ll get the integer 100 returned back to me.

Are you a little surprised to discover that variable assignments inside of the class definition turn into attribute assignments on the class object? Many people are. They’re even more surprised, however, when they think a bit more deeply about what it must mean to have a function (or “method”) definition inside of the class:

>>> class Foo(object):
        def blah(self):
            return "blah"

>>> Foo.blah
<unbound method Foo.blah>

Think about it this way: If I define a new function with “def”, I’m defining a new variable in the current scope (usually the global scope). But if I define a new function with “def” inside of a class definition, then I’m really defining a new attribute with that name on the class.

In other words: Instance methods sit on a class in Python, not on an instance. When you invoke “f.blah()” on an instance of Foo, Python actually invokes the “blah” method on Foo, passing f as the first argument. This is why there is no difference between “f.blah()” and “Foo.blah(f)”, and why we need to catch the instance with “self”.
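
We can see this equivalence directly, continuing with the Foo class defined above:

>>> f = Foo()
>>> f.blah()        # method call via the instance
'blah'
>>> Foo.blah(f)     # the same call, with the instance passed explicitly as "self"
'blah'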

But wait a second: If I invoke “f.blah()”, then how does Python know to invoke “Foo.blah”?  f and Foo are two completely different objects; f is an instance of Foo, whereas Foo is an instance of type. Why is Python even looking for the “blah” attribute on Foo?

The answer is that Python has different rules for variable and attribute scoping. With variables, Python follows the LEGB rule: Local, Enclosing, Global, and Builtin. (See my free, five-part e-mail course on Python scopes, if you aren’t familiar with them.)  But with attributes, Python follows a different set of rules: First, it looks on the object in question. Then, it looks on the object’s class. Then it follows the inheritance chain up from the object’s class, until it hits “object” at the top.
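
Here is a tiny sketch of that search path, using a made-up Base/Child pair of classes:

>>> class Base(object):
        x = 'found on Base'

>>> class Child(Base):
        pass

>>> c = Child()
>>> c.x    # not on the instance, not on Child; found by climbing up to Base
'found on Base'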

Thus, in our case, we invoke “f.blah()”. Python looks on the instance f, and doesn’t find an attribute named “blah”. Thus, it looks on f’s class, Foo. It finds the attribute there, and performs some Python method rewriting magic, thus invoking “Foo.blah(f)”.

So Python doesn’t really have “instance variables” or “class variables.”  Rather, it has objects with attributes. Some of those attributes are defined on class objects, and others are defined on instance objects. (Of course, class objects are just instances of “type”, but let’s ignore that for now.)  This also explains why people sometimes think that they can or should define attributes on a class (“class variables”), because they’re visible to the instances. Yes, that is true, but it makes more sense in some cases than in others to do so.

What you really want to avoid is creating an attribute on the instance that has the same name as an attribute on the class. For example, imagine this:

class Person(object):
    population = 0
    def __init__(self, first, last):
        self.first = first        
        self.last = last
        self.population += 1

p1 = Person('Reuven', 'Lerner')
p2 = Person('foo', 'bar')

This looks all nice, until you actually try to run it. You’ll quickly discover that Person.population remains stuck at 0, but p1.population and p2.population are both set to 1. What’s going on here?

The answer is that the line

self.population += 1

can be turned into

self.population = self.population + 1

As always, the right side of an assignment is evaluated before the left side. Thus, on the right side, we say “self.population”. Python looks at the instance, self, and looks for an attribute named “population”. No such attribute exists. It thus goes to Person, self’s class, and does find an attribute by that name, with a value of 0. It thus returns 0, and executes 0 + 1. That gives us the answer 1, which is then passed to the left side of the assignment. The left side says that we should store this result in self.population — in other words, an attribute on the instance! This works, because we can always assign any attribute. But in this case, we will now get different results for Person.population (which will remain at 0) and the individual instance values of population, such as p1 and p2.
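
If what you actually want is a per-class counter, one straightforward fix is to name the class explicitly on the left side, so that the assignment updates the class attribute rather than creating a new instance attribute:

class Person(object):
    population = 0
    def __init__(self, first, last):
        self.first = first
        self.last = last
        Person.population += 1    # read and write the class attribute explicitly

p1 = Person('Reuven', 'Lerner')
p2 = Person('foo', 'bar')
print(Person.population)          # 2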

We can see which attributes were actually set on the instance and on the class, using a list comprehension:

class Foo(object):
    def blah(self):
        return "blah"

>>> f = Foo()
>>> [attr_name for attr_name in dir(f) if attr_name not in dir(Foo)]
[]

>>> [attr_name for attr_name in dir(Foo) if attr_name not in dir(object)]
['__dict__', '__module__', '__weakref__', 'blah']

In the above, we first define “Foo”, with a method “blah”. That method definition, as we know, is stored in the “blah” attribute on Foo. We haven’t assigned any attributes to f, which means that the only attributes available to f are those in its class.


October 14, 2014, by Reuven
Python

It’s time again for me to offer a free Webinar, as well as two online courses:

  • I’m repeating my hour-long free Webinar about functional programming in Python on Wednesday, October 22nd.  When I offered it last month, more than 200 people got tickets, and 70 participated.  I had a blast, and nearly 2,000 people have viewed the YouTube version online.  Well, I have updated and improved this seminar, and we’ll be using technology (Google Hangouts On Air) that will let people participate with less hassle.  So if you’ve always wanted to learn about functional programming in Python, or if you have questions you would like to ask, please join me!
  • I’m also repeating my live, day-long class in functional Python programming on Monday, October 27th. This class goes into far more depth than the Webinar, and includes many exercises, as well as explanations of what functional programming is, what techniques are available in Python, and how we can use them to improve our program’s efficiency and maintainability.  We’ll cover some of the most powerful, but also misunderstood, aspects of Python programming, including list/set/dict comprehensions and the oft-maligned “reduce” function.
  • Finally, I’m offering a new live, day-long class in object-oriented Python programming on Monday, October 27th. This class starts off by describing what objects are, and quickly works its way toward classes, instance vs. class attributes, methods, class and static methods, inheritance, and even the basics of iteration. In my experience, many people who have been writing object-oriented Python for a year or more can benefit from this class; it’ll really help to solidify the concepts, and help you to understand how Python’s objects work behind the scenes.

These classes are versions of what I have taught many times at such companies as Apple, Cisco, HP, SanDisk, and VMware around the world.  They have helped many Python programmers to become more proficient with the language, and to solve bigger and better problems in less time, and without sacrificing maintainability.

Even if you aren’t interested in the paid courses, you’re more than welcome to join the free Webinar.

As always, if you have questions, please let me know at reuven@lerner.co.il.

 

October 12, 2014, by Reuven
Python

Cover of “Practice Makes Python”

My ebook, Practice Makes Python, will go on pre-sale one week from today. The book is a collection of 50 exercises that I have used and refined when training people in Python in the United States, Europe, Israel, and China. I have found these exercises to be useful in helping people to go beyond Python’s syntax, and see how the parts of the language fit together.

(By the way, I’ll be giving my functional programming class on October 27th, and my object-oriented programming class on October 29th. Both are full-day classes, taught live and online. Signups for both classes will be announced here more formally in the coming days. Contact me at reuven@lerner.co.il if you already want details.)

Today, I’m posting an exercise that involves a common complex data structure, a list of dictionaries (aka “dicts”). I welcome feedback on the exercise, my proposed solution, and my discussion of that solution. If you’re trying to improve your Python skills, I strongly encourage you to try to solve the exercise yourself before looking at the answer. You will get much more out of struggling through the solutions to these exercises than from simply reading the answers.

Alphabetizing names

Let’s assume that you have phone book data in a list of dictionaries, as follows:

people = [
    {'first': 'Reuven', 'last': 'Lerner', 'email': 'reuven@lerner.co.il'},
    {'first': 'Barack', 'last': 'Obama', 'email': 'president@whitehouse.gov'},
    {'first': 'Vladimir', 'last': 'Putin', 'email': 'president@kremvax.kremlin.ru'}
]

First of all, if these are the only people in your phone book, then you should rethink whether Python programming is truly the best use of your time and connections. Regardless, let’s assume that you want to print information about all of these people, but in phone-book order — that is, sorted by last name and then by first name. Each line of the output should just look like this:

LastName, FirstName: email@example.com

Solution

for person in sorted(people, key=lambda person: [person['last'], person['first']]):
    print("{last}, {first}: {email}".format(**person))

Discussion

While Python’s data structures are useful by themselves, they become even more powerful and useful when combined. Lists of lists, lists of tuples, lists of dictionaries, and dictionaries of dictionaries are all quite common in Python. Learning to work with these is an important part of being a fluent Python programmer.

There are two parts to the above solution. The first is how we sort the names of the people in our list, and the second is how we print each of the people.

Let’s take the second problem first: We have a list of dictionaries. This means that when we iterate over our list, “person” is assigned a dictionary in each iteration. The dictionary has three keys: “first”, “last”, and “email”. We will want to use each of these keys to display each phone-book entry.

It’s true that the “str.format” method allows us to pass individual values, and then to grab those values in numerical order. Thus, we could say:

for person in people:
    print("{0}, {1}: {2}".format(person['last'], person['first'], person['email'])

Starting in Python 2.7, we can even eliminate the numbers, if we are planning to use them in order:

for person in people:
   print("{}, {}: {}".format(person['last'], person['first'], person['email'])

The thing is, we can also pass name-value pairs to “str.format”. For example, we could say:

for person in people:
    print("{last}, {first}: {email}".format(last=person['last'],
                                            first=person['first'],
                                            email=person['email']))

Even if our format string, with the “{first}” and “{last}”, is more readable, the name-value pairs we are passing are annoying to write. All we’re basically doing is taking our “person” dictionary, expanding it, and passing its name-value pairs as arguments to “str.format”.

However, there is a better way: We can take a dictionary and turn it into a set of keyword arguments by applying the “double splat” operator, “**”, on a dictionary. In other words, we can say:

for person in people:
    print("{last}, {first}: {email}".format(**person)

So far, so good. But we still haven’t covered the first problem, namely sorting the list of dictionaries by last name and then first name. Basically, we want to tell Python’s sort facility that before it compares two dictionaries from our “people” list, it should turn the dictionary into a list, consisting of the person’s last and first names. In other words, we want:

{'first':'Vladimir', 
 'last':'Putin', 
 'email':'president@kremvax.kremlin.ru'} 

to become

['Putin', 'Vladimir']

Note that we’re not trying to sort them as strings. That would work in our particular case, but if two people have *almost* the same last name (e.g., “Lerner” and “Lerner-Friedman”), then sorting them as strings won’t work. Sorting them by lists will work, because Python sorts lists by comparing each element in sequence. One element cannot “spill over” into the next element when making the comparison.
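
Here is a quick, contrived demonstration of the difference, using hypothetical first names:

>>> 'Lerner' + 'Zvi' < 'Lerner-Friedman' + 'Avi'     # one long string: 'Z' is compared with '-'
False
>>> ['Lerner', 'Zvi'] < ['Lerner-Friedman', 'Avi']   # lists: the last names are compared in full first
True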

If we want to apply a function to each list element before the sorting comparison takes place, we can pass that function in the “key” parameter. For example, we can sort the elements of a list by their length:

mylist = ['abcd', 'efg', 'hi', 'j']
mylist.sort(key=len)

After executing the above, “mylist” will now be sorted in increasing order of length, because the built-in “len” function will be applied to each element before it is compared with others. In the case of our alphabetizing exercise, we could write a function that takes a dict and returns the sort of list that’s necessary:

def person_dict_to_list(d):
    return [d['last'], d['first']]

We could then apply this function when sorting our list:

people.sort(key=person_dict_to_list)

Following that, we could then iterate over the now-sorted list, and display our people.

However, it feels wrong to me to sort “people” permanently, if it’s just for the purposes of displaying its elements. Furthermore, I don’t see the point in writing a special-purpose named function if I’m only going to use it once.

We can thus use two pieces of Python that come from the functional programming world: the built-in “sorted” function, which returns a new, sorted list based on its inputs, and the “lambda” operator, which returns a new, anonymous function. Combining these, we get to the solution suggested above, namely:

for person in sorted(people, key=lambda person: [person['last'], person['first']]):
    print("{last}, {first}: {email}".format(**person))

This solution does not change the “people” list, but it does sort its elements for the purposes of printing them. And it prints them, in the phone-book order that we wanted, combining the “sorted” function, “lambda” for a built-in anonymous function, and the double-splat (“**”) operator on an argument to “str.format”.

September 16, 2014, by Reuven
Python

In the free Webinar I gave yesterday about functional programming, I mentioned that “map,” or its equivalent (e.g., Python’s list comprehensions), is a powerful tool that I use nearly every day. Once you get into the functional mode of thinking, you’re constantly finding ways to turn one collection into another collection. It’s a mindset that takes time and practice, but allows you to solve many problems quickly and easily. The trick is to see beyond the initial problem, and to understand how you can think in terms of a source collection and a target collection.

For example, I was teaching an introductory Python course just today, and someone came to me and asked how they can turn a URL query string (e.g., x=1&y=2&z=abc) into a dictionary. Now, this isn’t a super-hard problem, but the reaction on his face to the way in which I solved it demonstrated how he would have used a completely different approach, and that functional thinking didn’t even cross his mind.

The first thing to notice is that in a query string, you’ve got name-value pairs separated by & signs. So the first task is to take the query string, and turn it into a list:

>>> query_string = 'x=1&y=2&z=abc'
>>> query_string.split('&')
['x=1', 'y=2', 'z=abc']

Now that we have these items in a list, we can transform each of them. But wait — transform them?  Yes, and that’s where the “map” mindset comes in. You want to be moving your data into a list, which allows you to transform each element into another one. In this case, I want to transform each of the elements of the list into a dictionary pair.

Fortunately, we see that each name-value pair has a “=” sign between the name and the value. We can use that to our advantage, splitting each of the pairs:

>>> [item.split('=') for item in query_string.split('&')]
[['x', '1'], ['y', '2'], ['z', 'abc']]

In other words, we have now created a list of lists, in which the first element of each sub-list is our intended dictionary key, and the second element is our intended dictionary value.

Well, we can use dict() to construct dictionaries in Python. And whadaya know, it works just fine with a sequence of two-element sequences. We normally think of feeding dict() a list of tuples, but it turns out that a list of lists works just fine, as well:

>>> dict([item.split('=') for item in query_string.split('&')])
{'x': '1', 'y': '2', 'z': 'abc'}

And just like that, we’ve created our dictionary.

Of course, we could also use a dictionary comprehension:

>>> { item.split('=')[0] : item.split('=')[1] 
    for item in query_string.split('&') }
{'x': '1', 'y': '2', 'z': 'abc'}

Now, none of the steps here was particularly difficult. Indeed, while the syntax of comprehensions can be a bit complex, the real difficulty here was seeing the original string, and immediately thinking, “Wait, if I can just turn that into a list, then I can easily create a dictionary from that.”

These sorts of transformations are everywhere, and they allow us to take seemingly difficult tasks and turn them into relatively simple ones.

September 14, 2014, by Reuven
Python

My most recent blog post talked about the use of str.format instead of the % operator for interpolating values into strings. Some people who read the post wondered about their relative speeds.

I should first note that my first response to this is: I don’t really care that much. I’m not saying that speed isn’t important, or that optimization should never be done. Rather, my philosophy is that people are expensive and computers are cheap — and thus, anything we do to make people more productive, even if that comes at the expense of program speed, is probably fine.

Of course, that’s not always going to be true. Sometimes, you need (or just want) to squeeze more out of your computer. And to be a good programmer, you also need to know the relative advantages and disadvantages of the techniques you’re using.

So I decided to run a few, quick benchmarks on the relative speeds of str.format and %.  Sure enough, the % operator was a lot faster.  I ran my benchmarks using the magic %timeit command that is built into the IPython interactive shell.  (If you’re using Python and aren’t using IPython, you should really switch ASAP.)  Note that in order to make things easier to read, I’m removing the IPython input and output prompts, and using >>> to denote where I entered text.

>>> name = 'Reuven'
>>> %timeit 'Hello there, {}'.format(name)
1000000 loops, best of 3: 243 ns per loop

>>> %timeit 'Hello there, %s' % name
10000000 loops, best of 3: 147 ns per loop

Wow.  As you can see, %timeit executed each of these lines of code at least 1,000,000 times. It then gave the average speed per loop. The % operator was, on average, about 100 ns faster than str.format. That shouldn’t come as a huge surprise, given that % is an operator (and thus doesn’t require a method call), doesn’t handle indexes and attributes, and can (I’m guessing) pass a great deal of its work off to C’s printf function.

Then again, is 100 ns really that long to wait for a formatted string?  I’m not so sure, to be honest.

What happens if we perform more than one interpolation?

>>> first = 'Reuven'
>>> last = 'Lerner'
>>> %timeit 'Hello there, {} {}'.format(first, last)
1000000 loops, best of 3: 371 ns per loop

>>> %timeit 'Hello there, %s %s' % (first, last)
1000000 loops, best of 3: 243 ns per loop

Each of these takes significantly longer to run than was the case with a single replacement. The difference between them continues to be about 120 ns per call — still not something to worry about too much, but the difference does exist.

What if I make the strings space-padded?

>>> %timeit 'Hello there, {:10} {:15}'.format(first, last)
1000000 loops, best of 3: 459 ns per loop

>>> %timeit 'Hello there, %10s %15s' % (first, last)
1000000 loops, best of 3: 254 ns per loop

Now we see an even starker difference between the two ways of handling things. What about something like floating-point math, which takes longer?

>>> import math
>>> %timeit 'I love to eat {}'.format(math.pi)
1000000 loops, best of 3: 587 ns per loop

>>> %timeit 'I love to eat %f' % math.pi
1000000 loops, best of 3: 354 ns per loop

Limiting the number of decimals shown doesn’t seem to change the outcome very much:

>>> %timeit 'I love to eat {:.3}'.format(math.pi)
1000000 loops, best of 3: 582 ns per loop

>>> %timeit 'I love to eat %.3f' % math.pi
1000000 loops, best of 3: 329 ns per loop

UPDATE: Several people on Reddit took me to task for failing to consider the overhead of the str.format method call.  I mentioned this briefly above, but should have realized that there was an easy way to avoid this, namely aliasing the attributes (the method str.format and the float math.pi) to local variables:

>>> f = 'I love to eat {:.3}'.format
>>> p = math.pi
>>> %timeit f(p)
1000000 loops, best of 3: 489 ns per loop

>>> %timeit 'I love to eat %f' % p
1000000 loops, best of 3: 370 ns per loop

We still see significant overhead. Again, I’m guessing that a lot of this has to do with the overhead of a method vs. an operator. I’m not about to start looking at the bytecodes; this wasn’t meant to be a super-deep investigation or benchmark, but rather a quick check and comparison, and I think that on that front, it did the trick.
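
If you want to try the same comparison outside of IPython, the standard library's "timeit" module does essentially the same job; a minimal sketch:

import timeit

print(timeit.timeit("'Hello there, %s' % name", setup="name = 'Reuven'", number=1000000))
print(timeit.timeit("'Hello there, {}'.format(name)", setup="name = 'Reuven'", number=1000000))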

So, what have we learned?

  • Yes, str.format is slower than %.
  • The number of parameters you pass to str.format, and whether you then adjust the output with padding or a specified number of decimals, significantly influences the execution speed.
  • That said, in many programs, the difference in execution speed is often on the order of 100 ns, which is not enough to cause trouble in many systems.

If speed is really important to you, then you should probably use %, and not str.format. However, if speed is more important than the maintainability or readability of your code, then I’d argue that Python is probably a poor choice of programming language.

September 12, 2014, by Reuven
Python

I have been programming in Python for many years. One of the things that I wondered, soon after starting to work in Python, was how you can get Perl-style variable interpolation. After all, Perl (like the Unix shell) supports two types of quotes — single quotes (in which everything is taken literally) and double quotes (in which variables’ values are inserted). Thus, in Perl, you can do something like:

$name = 'Reuven';
print "Hello, $name\n";

And sure enough, it’ll print “Hello, Reuven”.

Because single and double quotes are equivalent in Python (so long as they come as a matched set), there is no variable interpolation. The technique that I learned years ago, when I started with Python, was that you could use the % operator on a string. In this context, % looks to the string on its left, determines how many values within the string need to be replaced, and then looks right to find those values. It then returns a new string, effectively interpolating the values. For example:

>>> name = 'Reuven'
>>> "Hello, %s" % name

'Hello, Reuven'

The above Python code works just fine, returning a string with a nice, personalized greeting. And indeed, for the length of my time working with Python, I have enjoyed using this % syntax. Yes, it’s a bit weird. And no, I cannot ever remember more than the absolute basics of printf’s various % codes, meaning that I either make everything a string (with %s), or I guess, or I look up the printf codes and what they do. But to be honest, I normally just use %s, and thus benefit additionally from the fact that Python will silently invoke “str” on the parameter.

The thing is, % is supposedly going away, or is at least deprecated. (A note on the python-dev list indicates that % will go away no sooner than 2022, which is a heckuva long time from now.) As of Python 2.6, not to mention Python 3.x, we have been told that it will eventually disappear, and that we shouldn’t use % any more. Instead, we should use the str.format method.

I have always mentioned str.format to my Python classes, but to be honest, I’ve usually relied upon % when giving live demonstrations and answering questions. And I would even encourage my students to use the % syntax, in part because I found it to be so much easier.

And yet.  I knew that I was doing something wrong, and I knew that I was probably misleading my students to some degree. Thus, in the last three classes I taught, I started to push harder in the direction of str.format. And that’s when I realized two things: (1) It’s just as easy as %, and actually easier in some ways, and (2) I hadn’t learned enough about str.format to use it, beyond the simplest ways. I thus spent a great deal of time researching it — and found out that str.format, while it takes some getting used to, is more than worth the effort.

Let’s start with the simplest case. I’d like to be able to say “Good morning” to someone, using both their first and last names. Assuming that I have variables named “first” and “last”, I can do this with the old syntax as follows:

>>> first = 'Reuven'
>>> last = 'Lerner'
>>> "Good morning, %s %s" % (first, last)

'Good morning, Reuven Lerner'

In this example, we already see one of the problems with the % syntax — namely, that if we have more than one value, we need to put it into a tuple. Perhaps this is logical and reasonable from Python’s perspective, but I can assure you that it surprises many of my students.

So, how would we do it using str.format? Pretty similarly, in many ways:

>>> "Good morning, {} {}".format(first, last)

'Good morning, Reuven Lerner'

Notice that we’ve changed things a bit here. No longer are we invoking a binary operator (%) on the string. Rather, we’re invoking a string method that takes a set of parameters. This is more logical and consistent. I can’t tell you how many of my students think that % is somehow connected to “print”, when in fact it’s connected (in the case of string formatting) to strings. Having to put the “.format” at the end of the string makes the method call more obvious.

As you might already know, the “{} {}” in the string tells str.format to take its two parameters, and to insert them, in order, into the string. Because there are two arguments, we can only have two {} inside of the string. This is a bit harder to understand, both because having {} in Python reminds many people of a dictionary, and because the empty curly braces look a bit weird. But fine, I can live with that, and got used to it very quickly.

Where str.format quickly shows its advantages over %, however, is if I want to display the input parameters in reverse order. When I use %, there is no real way to do that. Plus, if I want to reuse a value passed to %, I cannot do so. With str.format, I can swap the order in which the inputs are displayed:

>>> "Good morning, {1} {0}".format(first, last)

'Good morning, Lerner Reuven'

Notice what happened in the above: If I just use “{} {}”, then str.format uses the two parameters in order. However, I’m also able to treat the parameters as a sequence, with indexes starting at 0. I can then insert them in reverse order, as I did above, or in the regular order:

>>> "Good morning, {0} {1}".format(first, last)

'Good morning, Reuven Lerner'

Note that if you explicitly state the field numbers, then you cannot rely on the automatic numbering.

Of course, this lets me also pass a sequence of values to be inserted, so long as we then use the splat (*) operator on it, to turn it into a parameter list:

>>> names = ('Reuven', 'Lerner')
>>> "Good morning, {} {}".format(*names)

'Good morning, Reuven Lerner'

You can also call str.format with keyword arguments. When you do this, you can then put a keyword name within the {}:

>>> "Good morning, {first} {last}".format(first='Reuven', last='Lerner')

'Good morning, Reuven Lerner'

The above really appeals to me. The named parameters are explicit (if a bit long), and the use of {first} and {last} is quite readable — certainly more so than %(first)s ever was with the % operator!

I can, of course, also pass a dictionary of names, using the ** operator to turn it into a set of keyword arguments:

>>> person = {'first':'Reuven', 'last':'Lerner'}
>>> "Good morning, {first} {last}".format(**person)

'Good morning, Reuven Lerner'

I described all of these to my students in the last month, and I was pleasantly surprised to see how comfortable they were with the syntax. I’m sure that this reflects, to some degree, my comfort with the syntax, as well.

I should note that you can combine numeric and keyword arguments when working with str.format. I really suggest that you not do so. The results would look like this:

>>> person = {'first':'Reuven', 'last':'Lerner'}
>>> "Good {0}, {first} {last}".format('morning', **person)

'Good morning, Reuven Lerner'

Yukko.

Now, the one thing that would appear to be missing from str.format is… well, formatting! The bad news is that str.format has a completely different way of indicating how you want to format output. The good news is that it’s not too hard to learn and understand.

Let’s start with the easiest part: If you want to display a string within a fixed-width field, then you can do so by adding a colon (:) and then a number.  So to put my name in a fixed-width field of 10 spaces, we would say:

>>> "Your name is {name:10}".format(name="Reuven")

'Your name is Reuven    '

(Notice the trailing spaces after my name.)

In the above example, my name is left-justified. If I want it to be right-justified, I could use a > sign between the : and the number:

>>> "Your name is {name:>10}".format(name="Reuven")

'Your name is     Reuven'

And yes, I could have used an optional < symbol to say that my name should be left-justified within the field of 10 spaces in the first example.  Or I could center the text in a field of 10 spaces with the ^ specifier instead of < or >.

To pad the string with something other than a space, we specify it before the <, >, or ^ character. For example, if I’m moving to Hollywood, then perhaps I should do something like this:

>>> "Your name is {name:*^10}".format(name="Reuven")

'Your name is **Reuven**'

If I want to put the string in the (default) left-most position of the string, filling with characters on the right, then I must use the < specifier, so that the text will be on the left, and the stars on the right.

So it’s pretty clear that str.format is pretty snazzy when it comes to text. How about numbers? I wasn’t really sure how things would work here, but it turns out that they’re also quite straightforward. If you’re displaying integers, then you can go ahead and say:

>>> "The price is ${number}.".format(number=123)

'The price is $123.'

So far, we don’t see any difference between passing an integer and a string. And indeed, they share many characteristics. However, we might want to display an integer in a different way. We can do that using one of the (many) modifiers that str.format provides — letters placed just before the closing } character. For example, we can get the price in binary (with a trailing “b”), or in hexadecimal (with a trailing “x”), as in the following example:

>>> "The price is ${number:x}.".format(number=123)

'The price is $7b.'

Of course, we can also zero-pad the number, such that it will always take up a fixed width. Just place a 0 between the colon and the width:

>>> "Your call is important to us. You are call #{number:05}.".format(number=123)

'Your call is important to us. You are call #00123.'

Notice that inside of the {}, we cannot put executable Python code. Instead, there is a mini-language that is separate and different from Python. However, there are two small exceptions to this rule: (1) We can retrieve any attribute with the standard . notation, and (2) we can retrieve a single item with the [] notation.

For example:

>>> class Foo(object):
        def __init__(self):
            self.x = 100

>>> f = Foo()
>>> 'Your number is {o.x}'.format(o=f)

'Your number is 100'

Notice how we were able to retrieve the “x” attribute from the “f” object, which we mapped to “o” within the string. However, while you can retrieve an attribute, you cannot execute it. Thus, the following will not work:

>>> "Your name is {name.upper()}".format(name="Reuven")

AttributeError: 'str' object has no attribute 'upper()'

See what happened? I said “name.upper()”, in order to execute the method “str.upper” on “name”.  However, Python doesn’t want me to execute code there. So it takes the name of the attribute literally — and thus complained that there is no attribute “upper()”, with the parentheses. Of course, if you try it without the parentheses, it’ll work, for some value of “work”:

>>> "Your name is {name.upper}".format(name="Reuven")

'Your name is <built-in method upper of str object at 0x1028bf2a0>'

Similarly, we can retrieve an individual element of a sequence or mapping with []. However, we cannot use the slice notation for more than one element. For example:

>>> "Your favorite number is {n[3]}.".format(n=numbers)

'Your favorite number is 3.'

However:

>>> "Your favorite numbers are {n[2:4]}.".format(n=numbers)

ValueError: Missing ']' in format string

The “:” character, which we use for slices, isn’t available in format strings, because it’s used to control the formatting of the output.

You can, of course, use [] on a dictionary, as well. However — and this is a bit weird for Python — we omit the quote marks, even when our key is a string. For example:

>>> person = {'first':'Reuven', 'last':'Lerner'}
>>> "Your name is {p[first]}.".format(p=person)

'Your name is Reuven.'

If we were to include the quotes…

>>> "Your name is {p['first']}.".format(p=person)

KeyError: "'first'"

There is actually a lot more to str.format than what I have shared here. In particular,  each type has its own format specifications, which means that you can do certain things with floats (e.g., setting the precision) that you cannot do with strings.

You can even add formatting functionality to your own Python classes, such that they’ll be displayed in the way that you want, along with format specifiers that you define.
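
As a taste of that, here is a sketch of a class that defines __format__ and invents its own "polar" format specifier; the class and the specifier are made up for illustration:

import math

class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __format__(self, spec):
        if spec == 'polar':                        # our own, invented specifier
            r = math.hypot(self.x, self.y)
            theta = math.atan2(self.y, self.x)
            return '(r={:.2f}, theta={:.2f})'.format(r, theta)
        return '({}, {})'.format(self.x, self.y)   # default display

>>> 'The point is {:polar}'.format(Point(3, 4))
'The point is (r=5.00, theta=0.93)'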

If you want to learn more about this, I’d definitely suggest reading PEP 3101, which describes str.format. I’d also suggest a slide show by Eric Smith, which summarizes things nicely. Finally, the Python documentation has some excellent examples, including a guide for moving from % to str.format.

I hope that this was helpful and useful! If you enjoyed this blog post, check out my many other resources, including my free e-mail course on Python scoping, and my free Webinar on functional programming in Python.