The (lack of a) case against Python 3

November 29, 2016 . By Reuven

A few days ago, well-known author and developer Zed Shaw wrote a blog post, “The Case Against Python 3.”  I have a huge amount of respect for Zed’s work, and his book (Learn Python the Hard Way) takes an approach similar to mine — so much so that I often tell people who are about to take my course to read it in preparation, and tell people who want more practice after finishing my course to work through it afterwards.

It was thus disappointing for me to see Zed’s post about Python 3, with which I disagree.

Let’s make it clear: About 90% of my work is as a Python trainer at various large companies; my classes range from “Python for non-programmers” and “Intro Python” to “Data science and machine learning in Python,” with a correspondingly wide range of backgrounds. I would estimate that at least 95% of the people I teach are using Python 2 in their work.

In my own development work, I switch back and forth between Python 2 and 3, depending on whether the code is for a client or for myself, and on what I plan to do with it.

So I’m far from a die-hard “Python 3 or bust” person. I recognize that there are reasons to use either 2 or 3.  And I do think that if there’s a major issue in the Python world today, it’s in the world of 2 vs. 3.

But there’s a difference between recognizing a problem, and saying that Python 3 is a waste of time — or, as Zed is saying, that it’s a mistake to teach Python 3 to new developers today.  Moreover, I think that the reasons he gives aren’t very compelling, either for newcomers to programming in general or for experienced programmers moving to Python.

Zed’s argument seems to boil down to:

  • Implementing Unicode in Python 3 has made things harder, and
  • The fact that you cannot run Python 2 programs in the Python 3 environment, but instead need to translate them semi-automatically with a combination of 2to3 and manual intervention, is crazy and broken.

I think that the first is a bogus argument, and the second is overstating the issues by a lot.

As for Unicode: This was painful. It was going to be painful no matter what.  Maybe the designers got some things wrong, but on the whole, Unicode works well (I think) in Python 3.

In my experience, 90% of programmers don’t need to think about Unicode, because so many programmers use ASCII in their work.  For them, Python 3 works just fine, no better (and no worse) than Python 2 on this front.

For people who do need Unicode, Python 3 isn’t perfect, but it’s far, far better than Python 2. And given that some huge proportion of the world doesn’t speak English, the notion that a modern language won’t natively support Unicode strings is just nonsense.
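To make the distinction concrete, here is a minimal sketch (Python 3) of the str/bytes separation the language now enforces:

```python
# Python 3 separates text (str) from binary data (bytes).
s = "héllo"               # str: a sequence of Unicode code points
b = s.encode("utf-8")     # converting to bytes requires an explicit encoding
assert b.decode("utf-8") == s

# Mixing the two types raises TypeError instead of silently guessing
# an encoding, as Python 2 would have tried to do:
try:
    s + b
except TypeError:
    print("str and bytes cannot be concatenated")
```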

This does mean that code needs to be rewritten, and that people need to think more before using strings that contain Unicode.  Yes, those are problems.  And Zed points out some issues with the implementation that can be painful for people.

But again, the population that will be affected is the 10% who deal with Unicode.  That generally doesn’t include new developers — and if it does, then everything is hard for them anyway.  So the notion that Unicode problems make Python 3 impossible to use is just silly.  And the notion that Python can simply ignore Unicode needs, or treat non-English characters as an afterthought, is laughable in the modern world.

The decision not to let Python 2 programs run in the Python 3 VM might look foolish in hindsight.  But if the migration from Python 2 to 3 is slow now, imagine what would have happened if companies never needed to migrate at all.  Heck, that might still effectively happen come 2020, when large companies decline to migrate.  I actually believe that large companies won’t ever translate their Python 2 code into Python 3.  It’s cheaper and easier for them to pay people to keep maintaining Python 2 code than to move mission-critical code to a new platform.  So new stuff will be in Python 3, and old stuff will be in Python 2.

I’m not a language designer, and I’m not sure how hard it would have been to allow both 2 and 3 to run on the same VM. I’m guessing that it would have been quite hard, though — had it been feasible, it would surely have been done, since it would have saved a great deal of pain and angst among Python developers. And I do think that the Python developers have gone out of their way to make the transition easier.

Let’s consider who this lack of v2 backward compatibility affects, and what a compatible VM might have meant to them:

  • For new developers using Python 3, it doesn’t matter.
  • For small (and individual) shops that have some software in Python 2 and want to move to 3, this is frustrating, but it’s doable to switch, albeit incrementally.  This switch wouldn’t have been necessary if the VM were multi-version capable.
  • For big shops, they won’t switch no matter what. They are fully invested in Python 2, and it’s going to be very hard to convince them to migrate their code — in 2016, in 2020, and in 2030.

(PS: I sense a business opportunity for consultants who will offer Python 2 maintenance support contracts starting in 2020.)

So the only losers here are legacy developers, who will need to switch in the coming three years.  That doesn’t sound so catastrophic to me, especially given how many new developers are learning Python 3, the growing library compatibility with 3, and the fact that 3 increasingly has features that people want. With libraries such as six, making your code run in both 2 and 3 isn’t so terrible; it’s not ideal, but it’s certainly possible.
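The single-source style that six supports can be sketched without the library itself. This hedged example shows the underlying idiom; the names `text_type`, `binary_type`, and `ensure_text` are illustrative, not part of any particular API:

```python
import sys

# Branch once on the interpreter version and alias the names that differ.
# (The six library packages up exactly this kind of shim.)
if sys.version_info[0] >= 3:
    text_type = str
    binary_type = bytes
else:
    text_type = unicode  # noqa: F821 -- only defined on Python 2
    binary_type = str

def ensure_text(value, encoding="utf-8"):
    """Return value as text on both Python 2 and Python 3."""
    if isinstance(value, binary_type):
        return value.decode(encoding)
    return value
```

With a handful of shims like this, one codebase can run under both interpreters.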

One of Zed’s points strikes me as particularly silly: the claim that the lack of Python 3 adoption means that Python 3 is a failure.  It doesn’t.  It means that Python users have entrenched business interests, and would rather stick with something they know than upgrade to something they don’t.  This is a natural way to do things, and you see it all the time in the computer industry.  (Case in point: Airlines and banks, which run on mainframes with software from the 1970s and 1980s.)

Zed does have some fair points: Strings are more muddled than I’d like (with too many options for formatting, especially in the next release), and some of the core libraries do need to be updated and/or documented better. And maybe some of those error messages you get when mixing Unicode and bytestrings could be improved.

But to say that the entire language is a failure because you get weird results when combining a (Unicode) string and a bytestring using str.format… in my experience, if someone is doing such things, then they’re no longer a newcomer, and know how to deal with some of these issues.
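For what it’s worth, the “weird result” in question looks something like this in Python 3:

```python
# Formatting bytes into a str quietly embeds the bytes object's repr:
result = "payload: {}".format(b"abc")
assert result == "payload: b'abc'"   # almost never what was intended

# The fix is an explicit decode at the boundary:
result = "payload: {}".format(b"abc".decode("ascii"))
assert result == "payload: abc"
```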

Python 3 isn’t a failure, but it’s not a massive success, either.  I believe that the reasons for that are (1) the Python community is too nice, and has allowed people to delay upgrading, and (2) no one ever updates anything unless they have a super-compelling reason to do so and they can’t afford not to.  There is a growing number of super-compelling reasons, but many companies are still skeptical of the advantages of upgrading. I know of people who have upgraded to Python 3 for its async capabilities.

Could the Python community have handled the migration better? Undoubtedly. Would it be nice to have more, and better, translation tools?  Yes.  Is Unicode a bottomless pit of pain, no matter how you slice it, with Python 3’s implementation being a pretty good one, given the necessary trade-offs? Yes.

At the same time, Python 3 is growing in acceptance and usage. Oodles of universities now teach Python 3 as an introductory language, which means that in the coming years, a new generation of developers will graduate and expect/want to use Python 3. People in all sorts of fields are using Python, and many of them are switching to Python 3.

The changes are happening: Slowly, perhaps, but they are happening. And it turns out that Python 3 is just as friendly to newbies as Python 2 was. Which doesn’t mean that it’s wart-free, of course — but as time goes on, the inertia keeping people from upgrading will wane.

I doubt that we’ll ever see everyone in the Python world using Python 3. But to dismiss Python 3 as a grave error, and to say that it’ll never catch on, is far too sweeping, and ignores trends on the ground.

Enjoyed this article? Subscribe to my free weekly newsletter; every Monday, I’ll send you new ideas and insights into programming — typically in Python, but with some other technologies thrown in, as well!  Subscribe at http://lerner.co.il/newsletter.

Related Posts

Prepare yourself for a better career, with my new Python learning memberships

I’m banned for life from advertising on Meta. Because I teach Python.

Sharpen your Pandas skills with “Bamboo Weekly”
  • Jürgen Erhard says:

    “[Unicode in] Python 3 isn’t perfect, but it’s far, far better than Python 2.”

    No, it isn’t far, far better. Actually, in many areas it is far worse. Far, far worse. And where it’s better (and I frankly can’t think of anything) it’s not that much better to make up for the pain.

    “I would estimate that at least 95% of the people I teach are using Python 2 in their work.”

    That, in late 2016, should actually tell you all you need to know about how much Python 3 (“Purity beats pragmatism”) has failed.

    • I haven’t seen the Unicode problems. I’m definitely curious to hear more about them.

      As for why my clients are using Python 2, it’s largely because they’re working at big corporations that have a legacy code base that won’t be upgraded. That’s the big problem in this whole 2->3 thing, from my perspective; it should have been wildly easier to upgrade. But right now, companies are still using Python 2 because they won’t upgrade their existing apps.

      People working on new things, by contrast, are often (but not always) using Python 3.

  • Hiroshi Manabe says:

    “And the notion that Python can simply ignore Unicode needs, or treat non-English characters are a second thought, is laughable in the modern world.”
    Did you mean “treat non-English characters ‘as’ a second thought”?
    I read the Japanese version of this article and found this part doesn’t fit in the context. http://postd.cc/case-python-3/

  • So basically, if you read all the comments, you learn that the original writeup is an apology for Python3, and it’s pretty clear to anyone that Python3 was a mistake. No way around it. They should really find a way to offer an olive branch to the Python2 folks. Even if you don’t care if you lose those diehards (and Guido & Co simply do not, if you speak to them about this), the damage to the reputation of Python is pretty deep. I’m not using it anymore. I’m now playing with .NET Core, which is actually pretty cool and has a lot of built-in benefits: broad usage with Xamarin, a Linux backend, and first-class support on Windows, still the most popular desktop platform.

    Python isn’t in a vacuum; the success Python2 afforded it really went to the core devs’ heads.

  • Terry Reedy says:

    I decided to read Zed’s diatribe myself. His unicode example amounts to this silliness: u'hello' + b' world' fails in 3.x, whereas b'hello ' + b' world' works in 2.x. What is the point? The latter also works in 3.x. And while the former works in 2.x, it only works for ASCII bytes: for instance, u'hello' + b'\xa0 world' fails even in 2.7, with “UnicodeDecodeError: 'ascii' codec can't decode byte 0xa0 in position 0: ordinal not in range(128).” An operator that only worked with toy examples sometimes resulted in people shipping inadequately tested code that eventually failed in use.

    As for some core libraries still returning bytes: these were sometimes hard decisions. Changing return types to strings would have broken even more 2.x code than we did, at least for some people. In any case, at least some of the users of these libraries *wanted* to keep working with encoded bytes rather than unicode.

  • Good article! I agree with you, though I would not have been so kind to Zed Shaw’s opinions.

    (What seems like) A small typo: “It was going to [^be] painful no matter what”

  • Jon Forrest says:

    “It was going to painful no matter what.”
    ->
    “It was going to be painful no matter what.”

  • I work for a Medical startup and we have always had to deal with UTF-8 reading HL7 messages. We switched from Python 2 to 3 and never looked back. Handling Unicode at first took a rewiring of the brain but things are much better now. We aid the change by running a base docker image that has both Python 2 and Python 3 installed. Currently we have one more service left to port to Python 3. That said, I feel there is always a missing argument for the move to Python 3, the standard library improvements! There are so many improvements in the standard library that make it worth moving from 2 to 3. Specifically 3.5+. Asyncio for one, or little things like glob.glob finally getting recursive = True.
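Two of the stdlib improvements this comment mentions, sketched briefly (recursive globbing needs 3.5+; `asyncio.run` needs 3.7+; the coroutine name here is just illustrative):

```python
import asyncio
import glob

# glob.glob grew a recursive flag in 3.5: ** crosses directory boundaries.
py_files = glob.glob("**/*.py", recursive=True)

# asyncio ships in the standard library (async/await syntax since 3.5).
async def fetch_label():
    await asyncio.sleep(0)   # stand-in for real asynchronous work
    return "done"

result = asyncio.run(fetch_label())
```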

    • Ugh! I had to deal with HL7 back in my first job, and that strikes me as much worse than Unicode. 🙂

  • In both Python 2 and 3, the frameworks I use all shielded me from bytes. It’s always unicode. They call it the “unicode sandwich” and it just works.

    Now, I can see how writing one-off scripts means you have to deal with it. But this is where Python 2 would give me *unexpected* UnicodeDecodeErrors. Python 3 forces me to deal with it right away, so I don’t get those errors at all.

    Dealing with non-ASCII character sets is a regular task for me. Not only is Python a global community, the projects are often international too. The point is: ASCII will only get you so far, it’s better to deal with it right away. Also: explicit is better than implicit.
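The “unicode sandwich” this comment describes can be sketched in a few lines:

```python
# Bytes at the edges, str everywhere in the middle.
raw = "café".encode("utf-8")      # bytes as they arrive from a file or socket
text = raw.decode("utf-8")        # decode immediately at the boundary
processed = text.upper()          # all internal logic operates on str
out = processed.encode("utf-8")   # encode only on the way back out
assert out.decode("utf-8") == "CAFÉ"
```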

  • The improvements in Python 3 are too small to force people to switch. I think that a massive change that breaks with the past but brings great benefit is better.

  • I think Python 3 has had too slow a start, though I understand why that happened. But adoption is improving.

    Consider that Fedora is pushing hard for Python 3: dnf (the replacement for yum) is written in Python 3, more and more of the other Python packages are getting a clear signal that they need to be ported to Python 3, and I expect there will be a Python 2 cut-off date in the not-too-distant future.
    http://portingdb-encukou.rhcloud.com/

    When Fedora pushes so hard, it would surprise me if Python 2 support is not dropped completely in the next Red Hat Enterprise Linux 8 (which then includes CentOS, ScientificLinux and Oracle Linux).

    Since many of these distro packages in the Fedora/RHEL sphere are also used in Debian, the Python 3 dependency will grow stronger there too, and adoption there carries over to Ubuntu and Linux Mint. In addition, Debian is working towards a Python 3-only world too: https://www.debian.org/doc/packaging-manuals/python-policy/ch-python3.html

    It also seems like Arch Linux is putting Python 3 into the driver’s seat, as packages for Python 2 must be named python2-* and use /usr/bin/python2 explicitly: https://wiki.archlinux.org/index.php/Python#Python_2
    Python 3 packages, meanwhile, are named python-* and use /usr/bin/python: https://wiki.archlinux.org/index.php/Python_package_guidelines#Package_naming

    So the rumors of Python 3’s death are greatly exaggerated.

  • Alessandro Stamatto says:

    The Python 2 -> Python 3 transition was made in a terrible way, it almost killed the language…

    The only change that made it backward incompatible was making strings unicode by default. They should have added a transitional string type (something like strbytes); then 2to3 would just have had to add parentheses for print, turn / into //, and make every string a strbytes.
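For reference, the two mechanical rewrites mentioned here look like this in Python 3 (a small illustrative sketch):

```python
# print became a function, so it needs parentheses:
print("hello")

# / became true division; // is the explicit floor-division operator
# that preserves Python 2's old integer behavior:
assert 7 / 2 == 3.5
assert 7 // 2 == 3
```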

    Anyhow, I think Python 3 is a better language (it is where all Python progress happened), and it’s finally flourishing. By 2020, Debian and Red Hat will ship Python 3 by default, Facebook already uses Python 3 by default, and Google is transitioning to Python 3 (web2py is finally being ported to Python 3). In the end everyone will be on Python 3+ (and by everyone I mean 85% of active Python devs).

    About formatting strings, I do not think there are “too many ways” of doing it.

    The new way should be the default, and it’s just a shortcut for “.format”. Sometimes you cannot use f-strings (maybe you want to use a prepared string that codifies the format), and then you should use the unsugared “.format”. The percent style should be used when you want to treat bytes and strings more or less equally (it would be perfect for the “strbytes” compatibility string, but alas, that does not exist).
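A small sketch of the three styles and the niches described here (bytes %-formatting needs 3.5+, f-strings 3.6+):

```python
name = "world"

# %-interpolation: the oldest style, and since 3.5 also defined for bytes:
assert "Hello, %s" % name == "Hello, world"
assert b"Hello, %s" % b"world" == b"Hello, world"

# str.format: works with a prepared template string decided at runtime:
template = "Hello, {}"
assert template.format(name) == "Hello, world"

# f-strings: evaluated inline where they appear, so they cannot be
# stored ahead of time as templates:
assert f"Hello, {name}" == "Hello, world"
```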

    • TheBlackCat says:

      How, exactly, would you expect “strbytes” to behave?

      • Alessandro Stamatto says:

        More or less like strings worked on Python 2: treat them as ASCII encoded in text contexts, and as bytes in other contexts. And ignoring errors.

        I know it’s a bad string type, but it would allow a direct transition to Python 3. And “Wall of shame/superpowers” would show projects that are using strbytes alongside Python projects that lack a Python 3 version.

        • TheBlackCat says:

          Although that would make the transition easier, it seems it wouldn’t necessarily improve things. There would be little, if any, incentive for projects to switch, and we would end up with two string types instead of just one.

          Python does have “from __future__ import unicode_literals” to make the transition easier by doing the opposite: letting you use Python 3’s string behavior in Python 2.
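That future import looks like this (a no-op on Python 3, where literals are already Unicode):

```python
from __future__ import unicode_literals

# On Python 2, bare string literals are now unicode, matching Python 3;
# binary data then needs an explicit b prefix, just as in Python 3.
s = "hello"
b = b"hello"
assert isinstance(s, str)    # on Python 3, str *is* the Unicode type
assert isinstance(b, bytes)
```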

  • Amnon Harel says:

    I couldn’t agree less.

    “Maybe the designers got some things wrong, but on the whole, Unicode works well (I think) in Python 3.”
    Come on. No maybes about it. Even if PEP 467 very politely downplays those mistakes. And Unicode does not work well in python3, as demonstrated brilliantly by Zed. Maybe it works well for the hard cases, your 10%, but “working well” means the simple cases (those of the 90%) are also handled well. And they are not.

    “In my experience, 90% of programmers don’t need to think about Unicode, because so many programmers use ASCII in their work. ”
    I’m a proud member of the 90% in all my three main use cases for python (scientific work, GUI and async control, scripting).
    “For them, Python 3 works just fine, no better (and no worse) than Python 2 on this front.”
    No!
    No!
    No!

    This is exactly the one and only main point.
    This is why python3 almost killed python.
    Because for some reason people pretended it was true, and made all those problematic decisions in Python3, and they keep on pretending this is true. (Zed speculated on some possible reasons)

    Zed gave detailed thoughtful examples. But it’s so much simpler: every time I want to use some library, touch something external to python (e.g. ports), or just store lots of data efficiently python3 forces me to think about Unicode. I have b-prefixed strings cluttering my code. I have to keep a mental picture of what strings are for display and what are for internal computations, because only the first types get to be called “str” and the rest are bytes and they give crappy, unexpected, error messages when I confuse them. And I do confuse them because their protocols are just similar enough for it, with enough subtle differences to byte you every time you forget. As Zed said, this is the worst of static typing in a language which used to showcase how fun dynamic typing can be when done right. And it lacks the tools to support the programmer in his struggles against these type mismatches.

    Oh yeah. As long as I’m ranting: removing the print statement was a similarly arrogant and ignorant decision which seems to have come from the mindset of the same crowd of Macho web programmers. (Good for you that you don’t use debug print outs. But some people do. Sometimes for very good reasons.) No wonder there are now 3 competing alternatives. And there used to be “one right way to do it”. Oh well. That was the python2 mantra. python3 is different. Now we get to choose between inferior options. But hey, one less statement 🙂

    The one thing I agree with is that python3 is now equal, and in many cases, superior to python2. For me, the decade of progress in python3 has roughly cancelled the damage of those two bizarre missteps at its inception.

    • What’s really interesting to me is how much experiences differ depending on your typical use case. To paraphrase Amnon Harel’s answer…

      As someone who deals with non-ASCII characters regularly, Python 2 constantly forced me to think about Unicode. I had u-prefixed strings cluttering my code. I had to keep a mental picture of what strings are for display and what are for internal computations, because only the latter types are “unicode” and behave like I’d expect them to, and the rest are “str” and they give crappy, unexpected, error messages when I confuse them. And I did confuse them because as long as it’s only ASCII characters, they behave the same way, but as soon as there’s a non-ASCII character there are enough differences to bite me every time I forgot.

      With Python 3, it’s a breeze and just works for me 99% of the time. I’ve never had to use any b-prefixed strings and I can’t remember having to deal with byte strings from an external library — but maybe I just got lucky so far.

    • My approach to unicode: Every text that comes into the system must be decoded (usually just `input.decode('utf8')`). Internally, *all* code uses unicode strings. Every time something exits the system, it must be encoded (usually just `output.encode('utf8')`).

      With this approach, I have *zero* issues with unicode. Claiming a language is a mistake because your native language does not include unicode characters is ignorant. Besides, the code might work in 99% of the cases, but then someone enters a name containing a sŧràŋgé character into your input field and the process crashes with a UnicodeDecodeError, thanks to the strange auto-decoding rules of Python 2 that cause terrible, cryptic error messages…

      To me, Unicode support in Python 3 is *much* better than before. We as Python 2 developers need to unlearn some bad habits, but once you have the mindset of “a string is independent of encoding” it just works.

      PS: I highly doubt that the use cases that only need ASCII make up 90% of the Python code. I’d rather say something like 40%.

      • I deal with many different types of programmers — network engineers, system engineers, QA people, and so forth. For most of them, Unicode isn’t something they deal with, because the computer world is so English-centric.

        I admit that my sample might be skewed. But when I ask people in my courses (and I teach Python just about every day), the majority of them don’t encounter Unicode issues, because they don’t deal with Unicode.

        But for the people who do, it’s a do-or-die situation. And Python 2 was closer to “die.”

      • Paul Boddie says:

        “With this approach, i have *zero* issues with unicode.”

        I’ve used that approach with Python 2 for years. What never gets explained is how Python 3 is actually any better, apart from apparently doing what Java seems to do and making a snap decision about automatic decoding and encoding based on the user’s locale.

        “Claiming a language is a mistake because your native language does not include unicode characters is ignorant.”

        I don’t think it helps to start labelling native English speakers as ignorant. English does use non-ASCII characters in certain contexts, and there are many native English speakers who use other languages, too.

        “To me, Unicode support in Python 3 is *much* better than before.”

        Again, how is it better, exactly? And how is it better in the context of Zed’s article?

        “We as Python 2 developers need to unlearn some bad habits, but once you have the mindset of “a string is independent of encoding” it just works.”

        Well, I got into the Unicode game when Python 2 came out precisely because it has Unicode objects, so all it needed was better library support and maybe more people would have been doing things in the way you described. (For instance, how many people were opening files with codecs.open?)

        I think Zed’s remarks about propaganda are quite relevant. If you ask the average uninformed pundit, they’ll probably tell you that Python 2 didn’t support Unicode, just like it supposedly didn’t have a module to expose the abstract syntax tree of Python programs. And so on.

        • “Again, how is it better, exactly?”

          Take the following example code:

          print u'hello asdföäüŋɨł'

          This will work perfectly in the Python shell:

          >>> print u'hello asdföäüŋɨł'
          hello asdföäüŋɨł

          It will work somewhat when running on a UTF-8 terminal (although Python will encode it as ASCII):

          $ python2 -c "print u'hello asdföäüŋɨł'"
          hello asdföäüÅɨÅ

          Then you put it into production and pipe the output to a log file. By some magic process, Python realizes that the output has a different encoding and blows up the process:

          $ python2 -c "print u'hello asdföäüŋɨł'" > log.txt
          Traceback (most recent call last):
            File "<string>", line 1, in <module>
          UnicodeEncodeError: 'ascii' codec can't encode characters in position 10-21: ordinal not in range(128)

          Another issue are the weird cryptic error messages that occur if you actually try to encode something that is not a string. Let’s assume someone gets strings from somewhere and wants to process them using UTF8 encoding.

          $ python2 -c "print 'hello asdf'.encode('utf8')"
          hello asdf

          Perfect, this seems to work. Put it into production! But then suddenly, some input contains non-ASCII characters:

          $ python2 -c "print 'hello asdföäüŋɨł'.encode('utf8')"
          Traceback (most recent call last):
            File "<string>", line 1, in <module>
          UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 10: ordinal not in range(128)

          So we’re trying to encode to UTF8, but the error message is telling me that it cannot decode from ASCII? That is *very* incomprehensible to someone without a firm grasp of unicode handling in Python 2. If you know how it works, you realize that Python first decodes from the default encoding (assuming ASCII) and then re-encodes to a byte string with the chosen encoding.

          In Python 3, you cannot accidentally encode an already encoded string:

          $ python3 -c "print(b'hello asdf'.encode('utf8'))"
          Traceback (most recent call last):
            File "<string>", line 1, in <module>
          AttributeError: 'bytes' object has no attribute 'encode'

          The error message already pops up during development, not just in production once someone tries to feed “international characters” to the program. Furthermore, the error message makes it clear that you’re trying to encode bytes, which does not make sense.

          This is not a theoretical issue, there are so many questions on SO related to this issue from people that never understood the difference between text and bytes (because Python does not make it obvious at all): http://stackoverflow.com/q/9942594/284318

          The clear separation between strings and bytes helps build up a correct mental model. In Python 2, it’s simply a mess. I’m convinced that the difference between unicode strings without any encoding and encoded bytes needs to be taught from the beginning. Otherwise people will get tripped up on their first real world project.

          • Paul Boddie says:

            (We’ll ignore the automatic “smartquoting” here in case it messes up the code fragments again…)

            """
            It will work somewhat when running on a UTF-8 terminal (although Python will encode it as ASCII)

            $ python2 -c "print u'hello asdföäüŋɨł'"
            hello asdföäüÅɨÅ
            """

            That isn’t ASCII after the first ten characters. But anyway, this and the errors that you get when printing Unicode objects can be explained by the locale not being initialised automatically by Python. If you do set the locale, it works:

            python -c "import locale; locale.setlocale(locale.LC_CTYPE, locale.getlocale(locale.LC_CTYPE)); print 'hello asdföäüŋɨł'"
            hello asdföäüŋɨł

            I imagine that Python 3 just does this for you, perhaps as it should, whereas the core developers decided not to fix this during the whole of the Python 2 era (or before).

            “So we’re trying to encode to UTF8, but the error message is telling me that it cannot decode from ASCII? That is *very* incomprehensible to someone without a firm grasp of unicode handling in Python 2.”

            Python’s error messages are not great. Zed mentions other examples and people bring this up all the time, anyway.

            “In Python 3, you cannot accidentally encode an already encoded string”

            You can’t do that in Python 2, either. There is no “encode” method on “plain strings”, only on Unicode objects (which are the equivalent of Python 3 strings).

            I do agree that Python 3 gave the core developers the chance to break with previous practice, tidy up various APIs, and tell everyone to fix their code. As a result, they got to apply that locale initialisation magic and remedy some – but apparently not all – API problems.

            And maybe this improves usability for many people, mostly because the documentation never really cultivated the best practices available to Python 2 users, and so they get caught out in the ways you describe. But the Unicode support is most certainly there in Python 2 and, usability discussions aside, certainly isn’t much worse, which is what you have implied.

    • TheBlackCat says:

      Yes, Python 3 makes you be explicit about whether you are dealing with text or binary data. That is because you should be explicit about it (“explicit is better than implicit”). Binary data is not necessarily text, and even if it is text it may be any one of many different ways of storing text. Automatically coercing random binary data into one arbitrary type of text is making assumptions that the language shouldn’t be making. Python doesn’t coerce numbers into text, or images into text, or sound into text, so why should it coerce unknown binary data into text?

      And this isn’t about static typing vs. dynamic typing. Claiming that shows just how little about programming Zed actually knows. It is about strong typing vs. weak typing. Python overall is a strongly-typed language.

      And the print statement vs. print function has nothing to do with string formatting. The print function was implemented to allow you to explicitly define things like separators and line endings, something you couldn’t do with the statement, and because there was no good reason to make print a statement to begin with.
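      A quick illustration of those keyword arguments (capturing stdout here just to show the exact output):

```python
import io
from contextlib import redirect_stdout

# sep and end are ordinary keyword arguments on the print function,
# something the Python 2 print statement had no syntax for.
buf = io.StringIO()
with redirect_stdout(buf):
    print("a", "b", "c", sep="-", end="!\n")
assert buf.getvalue() == "a-b-c!\n"
```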

      And Python 2 already had two different types of string formatting, the two that actually directly compete. The third type is for a different use-case, one that wouldn’t have even been possible under the original string formatting system.

  • Great post! Two minor errors:

    – “They are fully invested in Python 3” should be Python 2.
    – “is just as friendly to newbies as Python 3 was” should also be Python 2.

      • Aubrey Kohn says:

        Working in machine translation, I lived in unicode, and I always used 2.7. My code was maximally multilingual, and I eschewed 3.x, and still do. The gratuitous incompatibilities make it a non-starter.

        • Aubrey, shhhh. You’re not supposed to exist, with your rational reaction and thoughts. When in reality, you’re exactly the guy that Guido and the core dev team have purposefully and intentionally screwed over. They’ve pulled this one over on all of us, actually.

          Rule #1: don’t break userland.
          Python3 broke this miserably, and the core devs are unrepentant. There were a hundred ways to avoid it; instead, they chose to shift the work onto the developer rather than onto the language implementation.

          If bothering to migrate, migrate OUT. To Rust, Go, Elixir or one of the big boys like Java/C#.

  • Sebastian Rittau says:

    The Unicode changes alone were worth the porting efforts to Python 3. In Germany, we use mostly ASCII characters with the occasional non-ASCII character (Umlauts) thrown in. In Python 2 the dreaded “UnicodeDecodeError” was a constant companion, despite constant diligence.

    Porting to Python 3 is quite an effort at the beginning, but it does certainly pay off for me. The UnicodeDecodeErrors are gone, thanks to the clear separation of Unicode and byte strings.

    • Yup. I think that many English speakers, not having to deal with such characters on a day-to-day basis, are ignorant of the pain caused by a lack of Unicode compliance.

      • This.
        (Insert snarky comment on Americans being unaware of the outside world here.)

  • Once they make Python 3.x / 4.x much, much faster than Python 2.x, people will move.

    Speed must be the focus; that will be the best reason for people to migrate.
