Fixing Python’s Greatest Mistake
Python is easily my favourite programming language right now. When I can use it, it lets me be massively more productive than I can be in other languages. Its powerful data-structuring and data-manipulation facilities, in particular, let me solve the same problems in much less code.
One specific example I can offer is this program which, given certain parameters for a display screen, will calculate the rest. The program is driven by a set of rules which give the formulas by which parameters are calculated from other parameters. The rules are kept in the dictionary named paramdefs. The definition of this dictionary, as of this writing, takes up 102 lines of the program.
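To give a feel for what such a rule table can look like, here is a minimal sketch. It is not the actual program: only the name paramdefs comes from the description above, while the parameter names, the formulas and the solve function are assumptions made up for illustration.

```python
import math

# Hypothetical rule table: each parameter maps to a list of
# (required inputs, formula) pairs. All names here are illustrative.
paramdefs = {
    "diagonal_px": [
        (("width_px", "height_px"), lambda w, h: math.hypot(w, h)),
    ],
    "density_dpi": [
        (("diagonal_px", "diagonal_in"), lambda d_px, d_in: d_px / d_in),
    ],
    "diagonal_in": [
        (("diagonal_px", "density_dpi"), lambda d_px, dpi: d_px / dpi),
    ],
    "width_in": [
        (("width_px", "density_dpi"), lambda w, dpi: w / dpi),
    ],
    "height_in": [
        (("height_px", "density_dpi"), lambda h, dpi: h / dpi),
    ],
}

def solve(known):
    """Apply any rule whose inputs are all known, until nothing new can be derived."""
    values = dict(known)
    progress = True
    while progress:
        progress = False
        for name, rules in paramdefs.items():
            if name in values:
                continue
            for inputs, formula in rules:
                if all(i in values for i in inputs):
                    values[name] = formula(*(values[i] for i in inputs))
                    progress = True
                    break
    return values

result = solve({"width_px": 1920, "height_px": 1080, "density_dpi": 96})
for name in sorted(result):
    print(name, result[name])
```

The point of the sketch is that the rules are plain data: adding a parameter means adding a dictionary entry, not writing a new class.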
Compare the Java version, written for Android. Ignore the fact that this app includes a GUI, whereas the Python version runs from the command line; just look at the Rules.java source file, specifically the creation of the ParamDefs class that corresponds to the Python dictionary: this part of the module alone is currently 578 lines—an expansion in code size of well over 5:1.
I could offer more examples, but that should already give you a good idea about the expressiveness of Python.
Of course, Python gets a few things wrong. With all its eschewing of C-style conventions that have infected most programming languages over the last few decades, it is surprising to see it use “=” for assignment and “==” for the equality comparison, instead of the older ALGOL-style convention of “:=” and “=” respectively, which would have kept it more consistent with usual mathematical notation.
But the worst thing wrong with Python is its use of indentation, instead of bracketing symbols, to delimit compound statements.
Yes, it is true that code should be indented anyway, since it helps immensely to keep the code readable. And Python’s convention for omitting semicolons at the ends of simple statements is actually quite intelligently thought out, unlike, for example, the corresponding rule in JavaScript. The net result is that my personal conventions for laying out code, painstakingly evolved over many years using several different languages, adapt well to Python, but not so well to JavaScript.
The problem is that Python wants to do away with useful redundancy. In a language like C, I could write a construct like
    for (int i = 0; i < 10; ++i) {
        if (is_prime(i)) {
            fprintf(stdout, "%d\n", i);
        } /*if*/
    } /*for*/
Here the compiler pays no attention to the indentation whitespace, only to the actual symbols. The redundancy comes in having both: if there is a discrepancy between the two, the compiler may not pick it up, but there is at least a chance that a human reader could do so. Remember the saying: “given enough eyeballs, all bugs are shallow”.
The way the corresponding Python version is usually written, there is no such redundancy:
    for i in range(10) :
        if is_prime(i) :
            print(i)
Now, what happens if these pieces of code get posted online somewhere, say in a discussion forum which makes it hard, or even impossible, to keep the correct formatting?
The C version might turn into this:
    for (int i = 0; i < 10; ++i) {
    if (is_prime(i)) {
    fprintf(stdout, "%d\n", i);
    } /*if*/
    } /*for*/
That’s harder to read, but at least it will still compile correctly, and with some work, the formatting can be recreated.
But the Python version turns into complete gibberish:
    for i in range(10) :
    if is_prime(i) :
    print(i)
This is a simple enough example that you could guess how it is supposed to be indented, and recover the original statement structure manually. But imagine trying to do that for just a few dozen lines of code...
For this reason, I like to put in “#end” comment lines to explicitly mark the ends of compound statements. For example, I would write the above as
    for i in range(10) :
        if is_prime(i) :
            print(i)
        #end if
    #end for
Now, if the indentation were to be lost for any reason, you stand a much better chance of reconstructing it correctly. It could even be done automatically by some prettyprinting tool, just as with the C version.
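As a rough illustration of how such a tool could work, here is a minimal sketch of a re-indenter driven by the “#end” lines. The function name reindent is made up for this example, and the sketch assumes that every compound statement header fits on one line ending in a colon, that each one is closed by its own “#end” line, and that there are no multi-line strings or trailing comments to confuse it.

```python
# Keywords that continue an existing compound statement rather than open a new one.
CONTINUATIONS = {"else", "elif", "except", "finally"}

def reindent(lines, indent="    "):
    """Rebuild indentation for code whose block ends are marked with '#end' lines."""
    depth = 0
    out = []
    for raw in lines:
        line = raw.strip()
        if not line:
            out.append("")
            continue
        first_word = line.split()[0].rstrip(":")
        if line.startswith("#end"):
            # The closing comment lines up with the header it closes.
            depth = max(depth - 1, 0)
            out.append(indent * depth + line)
        elif first_word in CONTINUATIONS:
            # 'else' and friends sit one level out; the block depth is unchanged.
            out.append(indent * max(depth - 1, 0) + line)
        else:
            out.append(indent * depth + line)
            if line.endswith(":"):
                depth += 1
    return out

mangled = [
    "for i in range(10) :",
    "if is_prime(i) :",
    "print(i)",
    "#end if",
    "#end for",
]
restored = reindent(mangled)
print("\n".join(restored))
```

Without the “#end” lines the same function would have no way of knowing where the “if” and “for” bodies stop, which is exactly the information that is lost along with the whitespace.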
Another problem with how Python code is usually written comes up when trying to edit the text of a program. For example, the Emacs editor provides commands (by default ctrl-alt-n and ctrl-alt-p) for jumping between matching opening and closing bracket symbols (“(” and “)”, “[” and “]”, “{” and “}”). While these can still be used within expressions in Python, they are useless for jumping around statements.
But because of my bracketing-comments convention, I was able to add commands to my custom Emacs definitions that provide corresponding functionality for Python statements. Specifically, ctrl-super-n jumps to the next line with the same indentation level as the current line, while ctrl-super-p jumps to the previous such line. So in the above example, it is easy enough to jump between the “for” and “#end for” lines, or between the “if” and “#end if” lines using these keystrokes.
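The logic behind these two commands is simple enough to sketch. What follows is Python rather than the actual Emacs Lisp, and the function names are invented for the example. It assumes indentation uses spaces only, and it takes the description literally: it looks for the nearest non-blank line with exactly the same indentation, without stopping at less-indented lines on the way.

```python
def indent_of(line):
    """Number of leading spaces on a line."""
    return len(line) - len(line.lstrip(" "))

def jump_same_indent(lines, start, step=1):
    """Index of the nearest non-blank line after (step=1) or before (step=-1)
    lines[start] that has the same indentation, or None if there is none."""
    target = indent_of(lines[start])
    i = start + step
    while 0 <= i < len(lines):
        if lines[i].strip() and indent_of(lines[i]) == target:
            return i
        i += step
    return None

lines = [
    "for i in range(10) :",
    "    if is_prime(i) :",
    "        print(i)",
    "    #end if",
    "#end for",
]
print(jump_same_indent(lines, 0))      # from "for" to "#end for"
print(jump_same_indent(lines, 3, -1))  # from "#end if" back to "if"
```

Because the “#end” line sits at the same indentation as its header, a plain same-indentation search is enough to pair them up.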
In short, most other languages pay no attention to indentation whitespace, using bracketing symbols to delimit compound statements. But it is generally considered a good idea to add the indentation whitespace anyway, as a form of redundancy to help catch errors.
Conversely, Python pays no attention to these “#end” bracket comment lines, using only the indentation whitespace to delimit its compound statements. But I think it is a good idea to add the bracketing comment lines anyway, as a form of redundancy to help catch errors.
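To make the error-catching idea concrete, here is a minimal sketch of a checker that compares the two sources of information. The name check_end_comments is hypothetical, and the sketch assumes the convention shown earlier: each header ends in a colon on a single line, and each “#end” comment repeats the header’s first keyword at the same indentation.

```python
# Keywords that continue an existing compound statement rather than open a new one.
CONTINUATIONS = {"else", "elif", "except", "finally"}

def check_end_comments(source):
    """Return (line number, message) pairs where '#end' comments and
    indentation disagree about the statement structure."""
    problems = []
    open_statements = []  # stack of (line number, indent, keyword)
    for number, raw in enumerate(source.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        indent = len(raw) - len(raw.lstrip(" "))
        if line.startswith("#end"):
            keyword = line[len("#end"):].strip()
            if not open_statements:
                problems.append((number, "'#end' without an open statement"))
                continue
            opened_at, open_indent, open_keyword = open_statements.pop()
            if indent != open_indent or keyword != open_keyword:
                problems.append(
                    (number, f"'{line}' does not match '{open_keyword}' on line {opened_at}")
                )
        elif line.endswith(":"):
            keyword = line.split()[0].rstrip(":")
            if keyword not in CONTINUATIONS:
                open_statements.append((number, indent, keyword))
    for opened_at, _, open_keyword in open_statements:
        problems.append((opened_at, f"'{open_keyword}' has no matching '#end'"))
    return problems

good = (
    "for i in range(10) :\n"
    "    if is_prime(i) :\n"
    "        print(i)\n"
    "    #end if\n"
    "#end for\n"
)
# Same code, but the '#end if' line has lost its indentation.
bad = good.replace("    #end if", "#end if")

print(check_end_comments(good))
print(check_end_comments(bad))
```

The interpreter would accept both versions without complaint, since comments mean nothing to it; the discrepancy only shows up because the structure is stated twice.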