Mike Higginbottom

A Digital Garden - frequent growth but a bit weedy in the back corner

Writing a ROT13 Encrypter/Decrypter in Python

There’s always more than one way to skin a cat. One of the CTFs I did recently was based around ROT13 encryption and it occurred to me that this would make a good little project to learn a bit of Python. And it’s a good way to have a think about the style of code you’re writing and who you’re writing it for as well.

What is ROT13

ROT13 is a simple Ceaser Cipher where the letters in the plain text are shifted 13 places through the alphabet to produce the ciphertext. So a becomes n, b becomes o, c becomes p etc. until we get to n which ‘wraps around’ to become a and o becomes b and p becomes c. We can do the same thing for the uppercase letters as well. There’s little point in going into more detail here so maybe just check Wikipedia.

Writing to showcase the algorithm

I wrote a bit of python to do this

which you can run here and then started to explore. The first thing to note is that I cheated. The canonical test string is not “HelloWorld”. It’s “Hello World!”. My code fails on this input. My code always fails. I’m kinda proud of my consistency. Now, to be fair, I knew my code would fail. But I always know my code will fail. This time though, I knew how and why my code would fail. Which is a good thing. It fails with an exception because it doesn’t take into account any characters in the plaintext that are not in the range A-Z or a-z. Like the space and the exclamation mark. What’s the value of code that is this brittle and inflexible? Partly because it’s simple to understand as it uses only basic cross-language programming idioms like if-else, for and while loops, and simple variables. More importantly though, the purpose of this code is to show the general idea of the algorithm. Focusing on corner cases and flexibility and error checking would just distract from this. So from a ‘publishing to production’ or ‘writing elegant code’ perspective this example sucks. But that’s not what it’s for.

Considering corner cases

So I fixed that with this code

which you run here - it handles everything it can handle and completely ignores everything it can’t. But now the code looks a bit more complicated. This increase in complexity is an example of the trade-off between clarity and safety, reliability and security. And yes, using isalpha() would be a much clearer solution but it requires the reader of the code knowing about the function isalpha() and I’m trying to keep this code using the most basic and universal programming constructs I can.

Making it efficient

Another trade-off to consider is that of efficiency vs clarity. Both these versions are super inefficient. Order O(1) vs O(n). Stepping through an array for every character is not a smart way to do it. But it is clear what the code is doing. There’s a clear mapping between the description of what ROT13 does and how the code does it. This makes the code very readbale and this is important for maintainability. And considering maintainability is important, as every coder who has ever had to look at their old code knows - “What the hell was I thinking? How does this even work?” Emotionally, it’s much better to look at someone else’s old code cos then at least you feel superior in your bafflement at the idiocy of their shenanigans.

So, efficiency… We can take advantage of the fact that we’re only dealing with ASCII alpha characters. The ASCII codes for A-Z and a-z are contiguous - 65-90 and 97-122 respectively. In Python we can use the ord() and chr() functions to convert between the character and its ASCII value so I did this

which can be run here. If we’re now assuming the reader knows about ord() and chr(), it’s reasonable to assume they know about isalpha(), isupper() and islower() as well so we can make the code clearer by using those functions.

So why is this more efficient? It still has to examine each character in turn but now that examination doesn’t involve scanning through a loop. Instead it performs a simple calculation. Avoiding loops where you can substitute a single operation is a huge improvement in efficiency.

Exploring language features

For plain vanilla Python this is a pretty good solution. And because it doesn’t use any advanced, Python specific programming concepts, it’s pretty easy to translate it to another language. Portability for the win!

But what does Python offer in terms of advanced stuff? I had a bit of a Google for other implementations of ROT13 to see what I could learn and two things came up - list comprehensions and lambda functions. Play time.

Doing it with list comprehensions would look like this

which can be run here. This looks very different but if you are familiar with list comprehensions it actually reads pretty cleanly.

The lambda route looks like this

which can be run here

Not re-inventing the wheel

Coders have a tendency to re-invent the wheel. We’ll see some minor problem with an existing tool and spend six months writing a better version. Objectively speaking, this could potentially be classed as a bad idea. What existing tools will already do ROT13 for us?

Python has a module called codecs which will do ROT13 for us. This makes it super simple.

Run it here

On linux we’ve got the tr command. So we can just do:

echo "Hello World!" | tr 'A-Za-z' 'N-ZA-Mn-za-m'

In Cyber Chef we can use the ROT13 recipe.

These are in some ways black box solutions. They do little to explain the algorithm for ROT13 but they are a quick and easy way to solve the problem of having a ROT13 encoded ciphertext that you want to decrypt. It’s important to understand that these are not simpler solutions - they just hide the complexity in a black box. And fabricated simplicity doesn’t help with understanding. But often that’s exactly what we need to do. Get ’er done and move on.

Finally, just for cryptic fun, here’s a ridiculously short version in C

that you can run here