You probably know, code golf is a “sport” of writing a program which solves a task with as little number of characters of source code as possible.
At the last Python meetup Brian Lauridsen presented a code golf challenge. The challenge is called grid computing and was available at codegolf.com, which at the moment of writing is down. Here is the approximate description of the challenge:
Write a program that expects a 10-by-10 matrix from standard input. The program should compute sum of each row and each column and print the highest of these numbers to standard output.
An example input:
01 34 46 31 55 21 16 88 87 87 32 40 82 40 43 96 08 82 41 86 30 16 24 18 04 54 65 96 38 48 32 00 99 90 24 75 89 41 04 01 11 80 31 83 08 93 37 96 27 64 09 81 28 41 48 23 68 55 86 72 64 61 14 55 33 39 40 18 57 59 49 34 50 81 85 12 22 54 80 76 18 45 50 26 81 95 25 14 46 75 22 52 37 50 37 40 16 71 52 17
The challenge is very tempting by its simplicity. However, according to codegolf.com, the best solution in Python was written in only 73 characters. At the meetup we got it down to 78 characters, but not less. So next weekend after the meetup I immersed into this challenge to get it down to 73. And here are the results.
Spoiler alert: this page contains the 73 character solution.
Let’s see what a naïve solution can look like in Python:
import sys # Parse matrix. matrix =  for line in sys.stdin.read().splitlines(): row =  for digits in line.split(' '): number = int(digits) row.append(number) matrix.append(row) # Row sums. all_sums =  for row in matrix: all_sums.append(sum(row)) # Column sums. for i in range(10): column =  for row in matrix: column.append(row[i]) all_sums.append(sum(column)) print(max(all_sums))
Total: 446 characters.
I hope you are displeased by this code as much as I am. Let’s gradually improve it.
File-like objects have a
.readlines() method, so we could
call it instead of
you might know that file-like objects support the iterator
protocol, so you can iterate over them directly:
for line in sys.stdin: ...
Also you might know that calling
.split() on a string
will split over any whitespace, so no need to call
Next, using list-comprehensions instead of for loops tightens up the code considerably:
import sys matrix = [[int(digits) for digits in line.split()] for line in sys.stdin] row_sums = [sum(row) for row in matrix] column_sums = [sum([row[i] for row in matrix]) for i in range(10)] print(max(row_sums + column_sums))
Total: 256 characters.
Some people will already claim that this is too tight and unreadable, but I think it is still sensible.
map instead of list comprehensions will
save us some characters. We can use:
map(int, line.split())to parse each line,
However, there is some duplication between
column_sums. We could pull out summation,
and apply it in the last step instead:
import sys matrix = [map(int, line.split()) for line in sys.stdin] rows = [row for row in matrix] columns = [[row[i] for row in matrix] for i in range(10)] print(max(map(sum, rows + columns)))
Total: 197 characters.
Now, the above code is silly, because
rows is the same as
columns is the same matrix, but transposed. We could
use a library like
numpy to transpose a matrix, but using
libraries is against the rules of code golf.
Otherwise you could just
write libraries that solve a challenge in one function call.
But you might know this “party trick,” that
transposes a matrix. If you didn’t know about this,
stop and think about it for a minute.
import sys matrix = [map(int, line.split()) for line in sys.stdin] print(max(map(sum, matrix + zip(*matrix))))
Total: 113 characters.
Now let’s go all-dirty on this and remove all unnecessary whitespace, and make every variable a single letter.
import sys m=[map(int,l.split())for l in sys.stdin] print(max(map(sum,m+zip(*m))))
Total: 83 characters.
Now, we are much closer to our goal of 73 characters.
One way to save 5 characters would be to use a built-in
input, which reads a single line from
standard input. However, this is when we realize that
the codegolf.com checks the solutions using Python 2.5.
That forces us to use
raw_input instead, saving a single
character instead of 5.
On the other hand, Python 2 allows to drop parenthesis
m=[map(int,raw_input().split())for _ in range(10)] print max(map(sum,m+zip(*m)))
Total: 81 characters.
raw_input forced us
range(10) instead of iterating over lines.
How can we mitigate this?! Another way to iterate 10 times
would be to iterate over a collection of length 10.
We can get this collection by multiplying a list with
a single item by 10:
m=[map(int,raw_input().split())for _ in*10] print max(map(sum,m+zip(*m)))
Total: 77 characters.
However, we still haven’t used one important code golf trick.
If you do a
hexdump of our program, you can see that
we have a trailing newline, which is usually added by the
editor. You might also discover that your editor is
\r\n instead of
$ hexdump -c grid_challenge.py | cut -c '9-' m = [ m a p ( i n t , r a w _ i n p u t ( ) . s p l i t ( ) ) f o r _ i n [ 0 ] * 1 0 ] \n p r i n t m a x ( m a p ( s u m , m + z i p ( * m ) ) ) \n
Yep, we have a trailing
\n. Let’s write a script that
removes it from our programs, and now we are down to…
Total: 76 characters.
And here the really hard part starts. I have spent many hours trying to reduce this number. And it got really nasty.
One little known misfeature of Python 2 is that
it leaks variable bindings from comprehensions
(as well as
except clauses). This was fixed in Python 3.
Here’s some code to illustrate the issue:
$ python3 >>> [x for x in range(10)] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] >>> x Traceback (most recent call last): File "<stdin>", line 1, in
NameError: name 'x' is not defined
$ python2 >>> [x for x in range(10)] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] >>> x 9
x – “leaked” from inside the list comprehension
and was available in the outer scope with its last assigned value.
How can we use this?! Well, right now we are not using
the list comprehension variable at all. But instead we
could put it to some use… We are calling
It would be so handy if we could put it into a variable,
without paying the cost of an explicit assignment…
m=[_(int,raw_input().split())for _ in[map]*10] print max(_(sum,m+zip(*m)))
Total: 74 characters.
Here we created a list of 10
map functions, which allowed
us to assign
map to the undescore (
_) variable and save 3 characters in total.
But we are still behind the
world’s best 73 character solution.
One less obscure feature of
raw_input is that it takes
an optional argument called
prompt, which gets printed
as a prompt before it reads a line from standard input.
Let’s see if we gain anything by assigning
the undescore (
_), instead of
map, and then…
m=[map(int,_().split())for _ in[raw_input]*10] _(max(map(sum,m+zip(*m))))
Total: 73 characters.
Yes! By using
raw_input instead of
I bet you have learned something new about Python today. ■