owaislone
2 hours ago
Oh man, Python 2 > 3 was such a massive shift. It took almost half a decade, if not more, and yet it was mainly changing superficial syntax stuff. They should have allowed ABIs to break and gotten these internal things done. They probably should have come up with a new, tighter API for integrating with other lower-level languages, so that going forward Python internals could be changed more freely without breaking everything.
scorpioxy
2 hours ago
The text encoding stuff wasn't a small change considering what it could break, at least. And remember we're sometimes talking about software that would cost a lot of money to migrate or upgrade. I still maintain some Python 2.x codebases that will be very expensive to migrate, and the customer is not willing to invest that money.
Although your general sentiment is something I agree with (if it's going to be painful, do it and get it over with), I don't believe anybody knew or could've guessed what the reaction of the ecosystem would be.
Your last point about being able to change internals more freely is also great in theory but very difficult (if not impossible) to achieve in practice.
I don't know. Having maintained some small projects that were free and open source, I saw the hostility and entitlement that can come from that position. And those projects were a speck of dust next to something like Python. So I think the core team is doing the best they can. It was always going to be damned if you do, damned if you don't.
gjvc
2 hours ago
yes. it was not a massive shift. it was barely worth the effort.
pansa2
2 hours ago
The Python devs didn’t want to make huge changes because they were worried Python 3 would end up taking forever like Perl 6. Instead they went to the other extreme and broke everyone’s code for trivial reasons and minimal benefit, which meant no-one wanted to upgrade.
Even the main driver for Python 3, the bytes-Unicode split, has unfortunately turned out to be sub-optimal. Python essentially bet on UTF-32 (with space-saving optimisations), while everyone else has chosen UTF-8.
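For illustration, a small sketch of what "space-saving optimisations" looks like in practice. This assumes CPython 3.3 or later, where a string is stored with 1, 2, or 4 bytes per code point depending on the widest character it contains. The variable names are made up for the example, and exact sizes vary between versions, so only the relative growth is meaningful.

```python
import sys

# Three strings with the same number of code points but different "widest" characters.
ascii_s = "a" * 1000            # fits in 1 byte per code point
bmp_s = "\u20ac" * 1000         # EURO SIGN, needs 2 bytes per code point
astral_s = "\U0001F600" * 1000  # emoji outside the BMP, needs 4 bytes per code point

for label, s in [("ascii", ascii_s), ("bmp", bmp_s), ("astral", astral_s)]:
    # len() counts code points; sys.getsizeof() reports the object's memory footprint.
    print(label, len(s), sys.getsizeof(s))
```

All three report the same length, while the memory footprint grows with the width of the widest character.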
diziet_sma
37 minutes ago
> Python essentially bet on UTF-32 (with space-saving optimisations)
How so? Python 3 strings are Unicode and all the encoding/decoding functions default to utf-8. In practice this means all the Python I write is utf-8 compatible Unicode and I don't ever have to think about it.
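A quick sketch of the defaults in question, using only `str.encode()` and `bytes.decode()` called with no arguments. The sample text is arbitrary. Note that this covers those two methods only; `open()` without an explicit `encoding` has traditionally used a locale-dependent default instead.

```python
text = "na\u00efve caf\u00e9 \u2615"  # "naïve café ☕"

# With no arguments, str.encode() uses UTF-8.
encoded = text.encode()
same_as_explicit = encoded == text.encode("utf-8")

# With no arguments, bytes.decode() also uses UTF-8, so the round trip is lossless.
decoded = encoded.decode()

print(same_as_explicit, decoded == text)
```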
pansa2
3 minutes ago
> all the encoding/decoding functions default to utf-8
Languages that use UTF-8 natively don't need those functions at all. And the ones in Python aren't trivial: see, for example, `surrogateescape`.
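To make the `surrogateescape` point concrete, here is a minimal sketch. The byte string is an invented example of a filename that is not valid UTF-8. The error handler smuggles each undecodable byte into the `str` as a lone surrogate so the original bytes can be recovered later, at the cost of producing a string that strict encoding rejects.

```python
raw = b"caf\xe9.txt"  # Latin-1 bytes; 0xE9 is not valid UTF-8 here

# Undecodable bytes become lone surrogates (0xE9 -> U+DCE9) instead of raising.
name = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
round_trip = name.encode("utf-8", errors="surrogateescape")

# A strict encode refuses the lone surrogate.
try:
    name.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

print(repr(name), round_trip == raw, strict_ok)
```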
As the sibling comment says, the only benefit of all this encoding/decoding is that it allows strings to support constant-time indexing of code points, which isn't something that's commonly needed.
sheept
16 minutes ago
UTF-32 allows for constant-time character access, which means that `mystr[i]` isn't O(n). Most other languages can only provide constant-time access to code units.
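A rough sketch of the difference, with Python's `str` indexing on one side and a hand-rolled scan over UTF-8 bytes on the other. The helper `code_point_at` is hypothetical, written only for this example, and assumes its input is valid UTF-8.

```python
s = "a\U0001F600b"  # 'a', an emoji outside the BMP, 'b'

# Python indexes by code point, so s[1] is the whole emoji.
second = s[1]

utf8 = s.encode("utf-8")                       # 1 + 4 + 1 bytes
utf16_units = len(s.encode("utf-16-le")) // 2  # 1 + 2 + 1 code units


def code_point_at(data: bytes, index: int) -> str:
    """Return the index-th code point of UTF-8 `data` by scanning from the start (O(n))."""
    if index < 0:
        raise IndexError(index)
    seen = -1
    start = None
    for pos, byte in enumerate(data):
        if byte & 0xC0 != 0x80:  # not a continuation byte, so a new code point starts here
            seen += 1
            if seen == index:
                start = pos
            elif seen == index + 1:
                return data[start:pos].decode("utf-8")
    if start is None:
        raise IndexError(index)
    return data[start:].decode("utf-8")


print(len(s), len(utf8), utf16_units, code_point_at(utf8, 1) == second)
```

Indexing the bytes directly, as in `utf8[1]`, would give a single code unit (the first byte of the emoji) rather than the character.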
rjh29
an hour ago
Ironically Perl 5 managed to do the bytes-Unicode split with a feature gate, no giant major version change.