In the free webinar I gave yesterday about functional programming, I mentioned that “map,” or its equivalent (e.g., Python’s list comprehensions), is a powerful tool that I use nearly every day. Once you get into the functional mode of thinking, you’re constantly finding ways to turn one collection into another collection. It’s a mindset that takes time and practice, but it allows you to solve many problems quickly and easily. The trick is to see beyond the initial problem, and to understand how you can think in terms of a source collection and a target collection.
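As a minimal illustration of that mindset (the data here is made up for the example), here is the same transformation written both with map and with a list comprehension:

```python
# Two equivalent ways to turn one collection into another.
numbers = [1, 2, 3, 4]

# Using map with a function applied to each element
squares_map = list(map(lambda n: n * n, numbers))

# Using a list comprehension, which many Python programmers prefer
squares_comp = [n * n for n in numbers]

print(squares_map)   # [1, 4, 9, 16]
print(squares_comp)  # [1, 4, 9, 16]
```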
For example, I was teaching an introductory Python course just today, and someone came to me and asked how he could turn a URL query string (e.g., x=1&y=2&z=abc) into a dictionary. Now, this isn’t a super-hard problem, but the reaction on his face to the way in which I solved it showed that he would have used a completely different approach, and that functional thinking hadn’t even crossed his mind.
The first thing to notice is that in a query string, you’ve got name-value pairs separated by & signs. So the first task is to take the query string, and turn it into a list:
>>> query_string = 'x=1&y=2&z=abc'
>>> query_string.split('&')
['x=1', 'y=2', 'z=abc']
Now that we have these items in a list, we can transform each of them. But wait, transform them? Yes, and that’s where the “map” mindset comes in. You want to be moving your data into a list, which allows you to transform each element into another one. In this case, I want to transform each of the elements of the list into a dictionary pair.
Fortunately, we see that each name-value pair has a “=” sign between the name and the value. We can use that to our advantage, splitting each of the pairs:
>>> [item.split('=') for item in query_string.split('&')]
[['x', '1'], ['y', '2'], ['z', 'abc']]
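For what it’s worth, the same transformation can be written with map itself; the comprehension above is equivalent to:

```python
query_string = 'x=1&y=2&z=abc'

# map applies the splitting function to each name-value pair
pairs = list(map(lambda item: item.split('='), query_string.split('&')))
print(pairs)  # [['x', '1'], ['y', '2'], ['z', 'abc']]
```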
In other words, we have now created a list of lists, in which the first element of each sub-list is our intended dictionary key, and the second element is our intended dictionary value.
Well, we can use dict() to construct dictionaries in Python. And whadaya know, it works just fine with a sequence of two-element sequences. We normally think of feeding dict() a list of tuples, but it turns out that a list of lists works just fine, as well:
>>> dict([item.split('=') for item in query_string.split('&')])
{'x': '1', 'y': '2', 'z': 'abc'}
And just like that, we’ve created our dictionary.
Of course, we could also use a dictionary comprehension:
>>> { item.split('=')[0] : item.split('=')[1] for item in query_string.split('&') }
{'x': '1', 'y': '2', 'z': 'abc'}
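One small wrinkle in that dictionary comprehension is that it calls split twice per item. A sketch of one way to split each pair only once, by unpacking the two pieces from an inner generator expression:

```python
query_string = 'x=1&y=2&z=abc'

# Split each pair once, then unpack the two pieces as key and value
d = {key: value
     for key, value in (item.split('=') for item in query_string.split('&'))}
print(d)  # {'x': '1', 'y': '2', 'z': 'abc'}
```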
Now, none of the steps here was particularly difficult. Indeed, while the syntax of comprehensions can be a bit complex, the real difficulty here was seeing the original string, and immediately thinking, “Wait, if I can just turn that into a list, then I can easily create a dictionary from that.”
These sorts of transformations are everywhere, and they allow us to take seemingly difficult tasks and turn them into relatively simple ones.
Although it doesn’t improve much for this use case, the “dict” function can of course also act on generator expressions. So this:
> dict([item.split('=') for item in query_string.split('&')])
can be expressed without the temporary “list” object:
> dict(item.split('=') for item in query_string.split('&'))
which has fewer brackets and is easier to read, but of course is also more efficient.
Excellent point, thanks!
The list version actually runs slightly quicker on my computer. The generator expression is theoretically more efficient because it avoids creating an intermediate list, but in practice this is only going to matter if you have far more items than a typical query string would contain.
I suspect that the difference is caused by some overhead that Python generators have, which a list comprehension avoids, but this is just speculation.
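For the curious, here is a rough sketch of how one might compare the two versions with the standard timeit module; the exact numbers will vary by machine and Python version:

```python
import timeit

setup = "query_string = 'x=1&y=2&z=abc'"

list_version = "dict([item.split('=') for item in query_string.split('&')])"
gen_version = "dict(item.split('=') for item in query_string.split('&'))"

# Both produce the same dictionary; only the timing differs.
t_list = timeit.timeit(list_version, setup=setup, number=100_000)
t_gen = timeit.timeit(gen_version, setup=setup, number=100_000)
print(f'list comprehension:   {t_list:.3f}s')
print(f'generator expression: {t_gen:.3f}s')
```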
But as you said, that is probably an implementation detail, and even if not, I would always prefer to write the most readable and overall efficient solution. The complexity of an algorithm is quite often distorted by internal overhead, and the theoretical benefit only kicks in beyond a certain input size.
So IMHO it is better to stick to the common idiomatic solutions instead of fine-tuning every line where no optimization is necessary.