Five Python function parameters you should know and use

July 16, 2017 . By Reuven

One of Python’s mantras is “batteries included.” which means that even with a bare-bones installation, you can do quite a bit. You can (and should) install packages from PyPI, but many day-to-day tasks can be accomplished with just the built-in data structures, functions, and methods.

What I’ve discovered over the years is that some of the functions and methods have useful parameters that can make our code shorter and more elegant. Here are some of the most elegant ones that I’ve found and use in my work:

1. str.split  (part 1)

One of the methods I use most often in my work is str.split. This method always returns a list, breaking the string into different elements. For example:

In [1]: s = 'abc,def,ghi'

In [2]: s.split(',')
Out[2]: ['abc', 'def', 'ghi']

In [3]: s = 'abc::def::ghi'

In [4]: s.split('::')
Out[4]: ['abc', 'def', 'ghi']

In [5]: s = 'this is a bunch of words'

In [6]: s.split(' ')
Out[6]: ['this', 'is', 'a', 'bunch', 'of', 'words']

All of this is great, and works just fine. But what if I do the following:

In [7]: s = 'a   b   c   d'  # Note: Three spaces between each letter

In [8]: s.split(' ')
Out[8]: ['a', '', '', 'b', '', '', 'c', '', '', 'd']

Yuck!  Of course, this is one of those times that the computer does what we tell it, not what we want: We said that every time it encounters a space character, it should give us a new element in the output list. Sure enough, by having multiple space characters between the letters, we end up having lots of empty strings in our resulting list.

What’s worse is that I often use str.split to take input from users, or from files, and break it into individual elements. I’d love to break on one or more whitespace characters, ideally without reverting to the “re” module’s re.split(‘\s’).

Solution: Don’t pass any argument. The first parameter, named “sep”, has a default value of None. And when it has a value of None, str.split does indeed use one or more whitespace characters.  That’s right — str.split, when called with zero arguments, actually does more (and is often more useful) then when called with an argument:

In [10]: s = 'abc \n\n def \n\t ghi jkl\n\n'

In [11]: s.split()
Out[11]: ['abc', 'def', 'ghi', 'jkl']

2. str.split (part 2)

Let’s say you ask a user to enter their name, which you want to split into first and last names. For example:

In [13]: person = raw_input("Enter your name: ") # "input" in Python 3
Enter your name: Reuven Lerner

In [14]: first_name, last_name = person.split()

In [15]: print("First name is '{}', last name is '{}'".format(first_name, 
                                                              last_name))
First name is 'Reuven', last name is 'Lerner'

Line 14 uses Python’s unpacking; since we know that the list produced by person.split() will contain two elements, I can safely assign those two elements into two variables (first_name and last_name).

But what if the person also enters a third name?

In [16]: person = raw_input("Enter your name: ")
Enter your name: Reuven Moshe Lerner

In [17]: first_name, last_name = person.split()
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-17-cf257c8e7997> in <module>()
----> 1 first_name, last_name = person.split()

ValueError: too many values to unpack

Yikes! str.split() returned a list of three elements. And you cannot assign three elements into two variables. (Fine, Python 3 does allow for this, but we won’t discuss this here.)

We can, however, tell str.split how many times it should split. This is done by passing the second, optional argument. If you want to split on all whitespace, as we saw above, then you must explicitly pass None as the first argument:

In [18]: first_name, last_name = person.split(None, 1)

In [19]: print("First name is '{}', last name is '{}'".format(first_name,
 ...: last_name))
First name is 'Reuven', last name is 'Moshe Lerner'

Remember that the second parameter is called “maxsplits”, meaning the number of times str.split should do its thing. This means that the number you give will be the index of the final element in the returned list. In other words: I call person.split(None, 1), which means that I’ll get back a list with two elements — the latter of which has an index of 1.

What if I want to split things in the other direction, such that my first and middle names are in the first variable, and just my last name in the second variable? We can use a variant of str.split called str.rsplit (“right-side split”):

In [20]: first_name, last_name = person.rsplit(None, 1)

In [21]: print("First name is '{}', last name is '{}'".format(first_name,
 ...: last_name))
First name is 'Reuven Moshe', last name is 'Lerner'

3. enumerate

“enumerate” is a built-in function that exists to let us number things as we iterate over them. For example, let’s assume that I have a string, and want to print the letters of the string:

In [22]: s = 'abc'

In [23]: for one_letter in s:
 ...: print(one_letter)
 ...:
a
b
c

What if I want to get the index of each letter, too? I can do this manually:

In [24]: s = 'abc'

In [25]: index = 0

In [26]: for one_letter in s:
 ...: print("{}: {}".format(index, one_letter))
 ...: index += 1
 ...:
0: a
1: b
2: c

Because this is such a common thing that people want to do, we can instead use enumerate, which returns an iterator that produces tuples. Each tuple contains two elements, the first of which is the index and the second of which is the element from the enumerated sequence. Because we know that each tuple will contain two elements, we can grab them with unpacking:

In [27]: for index, one_letter in enumerate(s):
…: print(“{}: {}”.format(index, one_letter))
…:
0: a
1: b
2: c

But wait, what if you are presenting this information to a non-programmer, for whom it seems weird to start numbering with zero? One solution is to send them to a programming course, but if there’s no time, then you can pass “enumerate” a second argument, the number with which numbering should start:

In [28]: for index, one_letter in enumerate(s, 1):
 ...: print("{}: {}".format(index, one_letter))
 ...:
 ...:
1: a
2: b
3: c

Of course, we can start with any number we want:

In [29]: for index, one_letter in enumerate(s, 72):
 ...: print("{}: {}".format(index, one_letter))
 ...:
 ...:
 ...:
72: a
73: b
74: c

4. int()

“int” is the integer type, widely used in Python to represent whole numbers. Many Python developers know that we can turn strings into integers by invoking the “int” function.

Of course, there’s not really an “int function.”  Instead, we’re using “int” as a class to create a new instance of int. So when I say

int('5')

I get a new instance of “int” back, with the value 5.  And when I say

int('12345')

I get back a different instance of “int”, with the value 12345.

But “int” takes a second argument, which lets us tell Python the base of the source data. For example, if I say

int('12345', 16)

then we get back 74565, because we asked Python to give us the value of 0x12345.

You can actually use any number base you want, from 1 through 36 — which is particularly useful for those of us with 36 fingers. But it’s not uncommon for my clients to be reading from files containing hexadecimal numbers. For example, let’s assume that we want to sum the hex numbers on each line of the following file:

10 20 30 4a 5b 6c
ff ef fa 00 20 3b
ab cd ef af be cd

We can do something like this:

In [35]: for one_line in open('hexnums.txt'):
 ...: print(sum([int(one_number, 16)
 ...: for one_number in one_line.split()]))

In other words:

  • We open the file, and read it line by line
  • We split each line on whitespace, resulting in a list
  • We interpret each number in hex
  • The resulting list of integers can then be passed to the “sum” function
  • We print the sum for each line

If you work with files that contain binary, octal, or hex numbers, this can really be handy. To be honest, I don’t do this very much — but I work with a number of companies that do, and for whom this is a real lifesaver.

5. dict.get

Dictionaries are everywhere in Python. They’re easy to define, and easy to work with. For example:

In [36]: d = {'a':1, 'b':2, 'c':3}

In [37]: d['a']
Out[37]: 1

In [38]: d['z']
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-38-da9eddaf4274> in <module>()
----> 1 d['z']

KeyError: 'z'

Oh, right — but if you request a key that doesn’t exist, you’re going to get a KeyError exception.

Let’s write a little program that lets the user repeatedly query a dictionary. If the user gives us an empty string, then we’ll exit from the loop, but otherwise we’ll either print the value associated with the key, or give an error:

In [42]: while True:
 ...:         k = raw_input("Enter key: ")
 ...:         if not k:
 ...:             break
 ...:         elif k in d:
 ...:             print("d[{}] is {}".format(k, d[k]))
 ...:         else:
 ...:             print("{} isn't a key in d".format(k))
 ...:
Enter key: a
d[a] is 1
Enter key: b
d[b] is 2
Enter key: c
d[c] is 3
Enter key: d
d isn't a key in d
Enter key: <enter>

This works fine, and is a pretty standard way that I’ve seen people check for keys in order to avoid exceptions. But often, the dict.get method will work even better.  Basically, dict.get does the same thing as square brackets ([ ]), except that if the key doesn’t exist, it returns None. For example:

In [43]: while True:
 ...:         k = raw_input("Enter key: ")
 ...:         if not k:
 ...:             break
 ...:         else:
 ...:             print("value of d[{}] is {}".format(k, d.get(k)))
 ...:
 ...:
Enter key: a
value of d[a] is 1
Enter key: b
value of d[b] is 2
Enter key: c
value of d[c] is 3
Enter key: d
value of d[d] is None
Enter key:

Now, you might not want to display None to your users. But you can always trap for None in your code, and then tell the user that the key doesn’t exist.

But dict.get takes a second, optional parameter. If you pass a second argument, you can change the default value from None to something else. For example:

In [44]: p = {'first':'Reuven', 'last':'Lerner'}

In [45]: p.get('first')
Out[45]: 'Reuven'

In [46]: p.get('last')
Out[46]: 'Lerner'

In [47]: p.get('middle', '')
Out[47]: ''

Now I can query the dict for the “middle” key, getting an empty string if the key doesn’t exist.  Another example:

In [48]: countries = {'New York':'USA', 'London':'England', 'Moscow':'Russia'}

In [51]: countries.get('Amsterdam')

In [52]: countries.get('Amsterdam', "I don't know")
Out[52]: "I don't know"

In [53]: countries.get('New York', "I don't know")
Out[53]: 'USA'

In [54]: countries.get('Moscow', "I don't know")
Out[54]: 'Russia'

Notice that when the key does exist, there’s no difference between d[k] and d.get(k).  The question is how you want to deal with a key that doesn’t exist, and dict.get often makes life much easier in these cases.

What optional parameters do you find most useful in Python?

Also: I’m teaching three live Python courses (on functional programming, advanced objects, and decorators) in the next few weeks; early-bird tickets are still available for a limited time. Grab them now, and improve your Python fluency from your own computer.

Related Posts

Prepare yourself for a better career, with my new Python learning memberships

Prepare yourself for a better career, with my new Python learning memberships

I’m banned for life from advertising on Meta. Because I teach Python.

I’m banned for life from advertising on Meta. Because I teach Python.

Sharpen your Pandas skills with “Bamboo Weekly”

Sharpen your Pandas skills with “Bamboo Weekly”
  • The ‘key’ argument in the sorted (and the like) is quite handy – a way to sort an iterable in a way other than regular ascending order.

  • John Rigby says:

    Your double space example in second box perplexed me. I get:
    [‘a’, ”, ‘b’, ”, ‘c’, ”, ‘d’]

      • When I repeat your three space example in second box, I get:
        >>> s = ‘a b c d’
        >>> s.split(‘ ‘)
        [‘a’, ”, ”, ‘b’, ”, ”, ‘c’, ”, ”, ‘d’]

        >>> s = ‘a b c d’
        >>> s.split(‘ ‘) # three space
        [‘a’, ‘b’, ‘c’, ‘d’]
        >>>
        but idea “Don’t pass any argument” – better!
        Very useful article!

  • Hello Reuven!

    Very idiomatic and useful article!

    Just found a mistake there. In your following example, the split method returns a list without empty strings:
    In [7]: s = ‘abc def ghi jkl’

    I guess, the formatting in the blog misses some more spaces in text.

    • Yes, you’re right — the formatting lost some extra spaces. I’ve fixed it, and put a comment in there to ensure it’s less ambiguous.

      Thanks for noticing!

  • {"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}
    >