Wednesday, 8 February 2017

Understanding Base64 Encoding #5

Tier 5

This tier is aimed at filling in a few gaps, showing the wider applicability of base64 encoding, and pointing to further reading.

Padding: The Trailing Equals Character

When I first looked at the characters used in base64 encoding I noticed there was a cheeky 65th character (‘=’) sometimes appearing once or twice at the end of encoded data. It’s actually a special character used when the source binary data doesn’t divide neatly into three-byte blocks. A quick example to illustrate.

Imagine I want to base64 encode the following four 8-bit bytes:
01000001 01100100 01100001 01101101

I take the first three octets:
01000001 01100100 01100001

Represent them as four sextets:
010000 010110 010001 100001

And encode using my encoding key, producing: QWRh

But now I have a lonely, final octet left to encode: 01101101

In base64 encoding it’s simply padded out with trailing zeros until we have another three octets:
01101101 00000000 00000000

And converted to sextets as normal:
011011 010000 000000 000000

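Any sextet which contains nothing but padded zeros gets represented as ‘=’. The second sextet (010000) still carries two bits of the original octet, so it is encoded as normal.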

So the rest of the encoded data becomes: bQ==.
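The padding steps above can be sketched in a few lines of Python. This is a minimal, hand-rolled illustration rather than how any particular library does it: the names ALPHABET and encode_with_padding are my own, and the result is only compared against Python's built-in base64 module as a sanity check.

```python
import base64

# The 64-character encoding key: A-Z, a-z, 0-9, + and /.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_with_padding(data: bytes) -> str:
    """Hand-rolled base64 encoder (illustrative name) following the steps above."""
    out = []
    for i in range(0, len(data), 3):
        block = data[i:i + 3]
        missing = 3 - len(block)            # 0, 1 or 2 octets short of a full block
        padded = block + b"\x00" * missing  # pad out with zero octets
        bits = "".join(f"{byte:08b}" for byte in padded)
        sextets = [bits[j:j + 6] for j in range(0, 24, 6)]
        chars = [ALPHABET[int(sextet, 2)] for sextet in sextets]
        if missing:
            # Sextets made up entirely of padding become '='.
            chars[-missing:] = "=" * missing
        out.append("".join(chars))
    return "".join(out)


encoded = encode_with_padding(b"Adam")
print(encoded)                              # QWRhbQ==
print(base64.b64encode(b"Adam").decode())   # QWRhbQ==
```

The four octets in the example spell "Adam" in ASCII, which is why the byte string b"Adam" is used as the input here.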

The ‘=’ character is a bit of a courtesy and not every implementation of base64 encoding uses it; it is possible to recreate the original binary data without using ‘=’ for padding, it’s just more explicit to include it.
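To show why the padding is recoverable, here is a small sketch that decodes data whose trailing ‘=’ characters have been stripped. The helper name decode_unpadded is made up for this example; it relies only on the fact that padded base64 text always has a length divisible by four.

```python
import base64


def decode_unpadded(text: str) -> bytes:
    """Decode base64 text whose trailing '=' characters may be missing."""
    # Padded output is always a multiple of 4 characters long, so the
    # number of missing '=' characters follows from the length alone.
    missing = -len(text) % 4
    return base64.b64decode(text + "=" * missing)


print(decode_unpadded("QWRhbQ"))    # b'Adam'
print(decode_unpadded("QWRhbQ=="))  # b'Adam'
```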

Other Uses:

Base64 encoding is typically used in scenarios where representing binary data as a limited set of ASCII characters is desirable. This could be when using an 8-bit (or greater) character encoding isn’t viable, or when you wish to embed binary data in an explicitly text-based medium, or when sending non-alphanumeric characters could be an issue.

Attachments to emails are base64 encoded, as are the username and password sent for basic HTTP authentication. The specifics of why base64 encoding is used in these scenarios are beyond this series, but reading https://en.wikipedia.org/wiki/8-bit_clean and https://en.wikipedia.org/wiki/Email_attachment gives you a good idea of why this is the case. The quote below, taken from the Email Attachment Wikipedia page, gives a good sense of the history:

“Originally Internet SMTP email was 7-bit ASCII text only, and attaching files was done by manually encoding 8-bit files using uuencode, BinHex or xxencode and pasting the resulting text into the body of the message.”
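Returning to the basic HTTP authentication case mentioned above, the sketch below shows roughly how the header value is put together: the username and password are joined with a colon and the result is base64 encoded. The function name basic_auth_header and the credentials are purely illustrative.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP 'Authorization' header for basic auth."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Basic {token}"


# Made-up credentials, for illustration only.
header = basic_auth_header("Aladdin", "open sesame")
print(header)  # Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```

Note that this is encoding, not encryption: anyone who sees the header can decode it just as easily.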

Further Resources:

Once you’ve grasped the basics of base64 encoding the Wikipedia article actually becomes useful. To my mind it’s missing a Tier 1 style explanation but it’s otherwise quite passable.

There’s an Oracle blog post which is also good – again, if you’ve got some base knowledge to work from.

And when you want to go full nerd there’s the IETF spec!

Tuesday, 7 February 2017

Understanding Base64 Encoding #4

Tier 4

For this tier I’m going to start to push the strained and sanitised analogy into the background and, hopefully, bring the hard edges of base64 encoding into focus.

First, a quick recap on what we’ve established:
  • Base64 encoding is a methodology by which we can represent arbitrary binary data (an image, in our example) as a string of ASCII characters.
  • The 64 characters used when base64 encoding are a subset of the full ASCII character set. In our case: A-Z, a-z, 0-9, +, and /.
  • 64 characters can be neatly represented by a block of 6 bits.
  • When base64 encoding, the binary source data is broken into blocks of 3 octets (24 bits), each of which is then read as 4 sextets (also 24 bits); 24 being the lowest common multiple of 8 and 6.
And our encoding key looked like this:
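The key can also be regenerated programmatically. A minimal sketch, assuming the standard ordering of the characters listed above (A-Z, a-z, 0-9, then + and /); the variable names are my own:

```python
import string
from math import gcd

# The 64-character encoding key in its usual order.
ALPHABET = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
)

# Print each value as decimal, as a 6-bit sextet, and as its character.
for value, char in enumerate(ALPHABET):
    print(f"{value:02d} | {value:06b} = {char}")

# 24 bits is the smallest block that divides evenly into both
# 8-bit octets and 6-bit sextets.
block_bits = 8 * 6 // gcd(8, 6)
print(block_bits)  # 24
```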


So far we’ve been using a contrived example – a world with no digital communication – in an attempt to remove the contextual complexity of base64 encoding, concentrating on the essence of the subject instead. But this only takes us so far. Let’s take a real world example of where base64 encoding could be used: embedding images in XML.

Occasionally, it may be useful to be able to create an XML document which contains images – not references to images stored elsewhere, but the actual images themselves. I’ve seen this kind of thing done when archiving orders in an e-commerce context: a business wishes to archive orders made over five years ago, however, it also wants some reasonable level of access to that data should a pressing need to retrieve it arise.

One approach to take could be to create an XML document for each order, one which contains a complete record of the transaction: top-level order details, item details, invoice address, delivery address, etc. All this is relatively straightforward. But the company may also decide, for completeness’ sake, that they wish to store a copy of the primary product images alongside the order. This causes a problem for a developer who doesn’t know about something like base64 encoding. For one who does, it’s fairly trivial. It could look something like this:
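As a rough sketch of how such a document might be produced, the Python below builds a tiny order record with an embedded image. The element and attribute names (order, item, image, encoding) and the function build_order_xml are hypothetical, and a few arbitrary bytes stand in for a real image file.

```python
import base64
import xml.etree.ElementTree as ET


def build_order_xml(order_id: str, image_bytes: bytes) -> str:
    """Return an order document with the image embedded as base64 text."""
    order = ET.Element("order", id=order_id)
    item = ET.SubElement(order, "item")
    image = ET.SubElement(item, "image", encoding="base64")
    # The binary data becomes plain ASCII text inside the element.
    image.text = base64.b64encode(image_bytes).decode("ascii")
    return ET.tostring(order, encoding="unicode")


# A few arbitrary bytes standing in for the contents of an image file.
fake_image = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF])
xml_text = build_order_xml("12345", fake_image)
print(xml_text)
```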


Here you have co-opted a medium which is designed to carry text to also carry binary data, although it doesn't even necessarily know it! Those characters between the image nodes are just text characters as far as the XML is concerned. But if the reader knows they're base64 encoded binary data, then the images can be retrieved.
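Getting the images back out is the same process in reverse. A minimal sketch, again with hypothetical element names and a deliberately tiny payload (the text QWRhbQ== decodes to the four bytes of "Adam" rather than a real image):

```python
import base64
import xml.etree.ElementTree as ET

# A made-up archived order; the structure is an assumption for illustration.
ARCHIVED_ORDER = """
<order id="12345">
  <item>
    <image encoding="base64">QWRhbQ==</image>
  </item>
</order>
""".strip()


def extract_images(xml_text: str) -> list:
    """Return the decoded bytes of every image element in the document."""
    root = ET.fromstring(xml_text)
    # To the XML parser the payload is just text; decoding is up to the reader.
    return [base64.b64decode(node.text) for node in root.iter("image")]


print(extract_images(ARCHIVED_ORDER))  # [b'Adam']
```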

Tier 5 will look to fill in a few of the gaps we've glossed over, briefly give a couple of other examples, and point at some further reading.

Next >> Understanding Base64 Encoding #5

Thursday, 2 February 2017

Understanding Base64 Encoding #3

Tier 3

In Tier 2 I found myself with some binary data (a 10 x 10 pixel image) and a text-based medium (pen and paper) with which to communicate that image to another computer. The image is about a thousand bytes large and I didn’t fancy having to write down eight thousand ones and zeros in order to communicate that image. I’d decided I needed to encode the raw data to save myself some pain.

My initial instinct to encode the raw data is that I’ll use a character to represent each possible byte value. This means I can reduce the characters I have to write out from 8000 to 1000. i.e. instead of having to write the byte value ‘00000000’, I could instead write ‘A’. As long as the recipient of my encoded image knows the encoding, e.g. ‘A’ = ‘00000000’, then they can decode the image. I start to write out my encoding key:

00 | 00000000 = A
01 | 00000001 = B
02 | 00000010 = C
24 | 00011000 = Y
25 | 00011001 = Z
26 | 00011010 = a
27 | 00011011 = b
50 | 00110010 = y
51 | 00110011 = z
52 | 00110100 = 0
53 | 00110101 = 1
61 | 00111101 = 9

However, as you might be able to see, by the time I’ve covered byte values 0 to 61 I’ve run out of standard alphanumeric characters (A-Z, a-z and 0-9). I’m going to have to start using some less recognised characters – and/or possibly even fabricating new ones – in order to get all the way to 256 (the number of distinct values which can be represented by an 8-bit byte: 0 to 255). This gives me pause for thought. It feels like there’s potential for confusion if I start using arcane or made-up characters.

I stop and have a think. I’ve got 62 alphanumeric characters I’m confident any decoder can easily recognise. I also suspect I could probably be fairly confident using a handful of other characters, e.g. ‘=’, ‘!’, ‘+’, ‘:’, ‘&’, ‘/’, ‘\’, ‘%’, etc. But that doesn’t bring me anywhere near to the 256 characters I’d need for this encoding method.

While I’m ruminating on the problem a thought appears: 62, the number of easily recognised characters I have, is close to the binarily-significant number 64 – the number of distinct values which can be represented by 6 bits: 0 to 63 or 000000 to 111111. Perhaps I can use this? If I picked a couple of my additional characters at random, say ‘+’ and ‘/’, that would bring me up to an encoding set of 64 easily recognisable characters. I bank the thought.
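The counting behind that thought is easy to check. A quick sketch (the variable names are mine):

```python
import string

# The characters I'm confident any decoder can recognise.
alphanumeric = string.ascii_uppercase + string.ascii_lowercase + string.digits

print(len(alphanumeric))          # 62
print(2 ** 8)                     # 256 values in an 8-bit byte
print(2 ** 6)                     # 64 values in 6 bits
print(len(alphanumeric + "+/"))   # 64
```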

Then comes the flash of inspiration! Ultimately, I’m just trying to communicate a series of ones and zeros from A to B. When thinking about those ones and zeros I’ve always naturally separated them into 8-bit bytes, but for the purpose of transmission there’s no inherent reason to do so; as long as the correct sequence of ones and zeros reaches the other end the interpretation of that data as 8-bit bytes is the receiving computer’s decision.

I start to jot down my thinking. Imagine the first three 8-bit bytes of my 10 x 10 image are as follows:

00000010 – 00011011 – 00110100

For transmission, I could split those 24 bits any way I like. Into two-bit chunks, for example:

00 – 00 – 00 – 10 – 00 – 01 – 10 – 11 – 00 – 11 – 01 – 00

Or – going back to my previous thinking! – as 6-bit chunks:

000000 – 100001 – 101100 – 110100

And with 6-bit chunks, I can use my recognisable character encoding key!

A – h – s – 0
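The regrouping above can be sketched in Python. This assumes the three bytes are 00000010, 00011011 and 00110100 (as the chunked forms imply) and uses the 64-character key A-Z, a-z, 0-9, + and /; the variable names are illustrative.

```python
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)

first_three_bytes = bytes([0b00000010, 0b00011011, 0b00110100])

# Join the three octets into one 24-bit string...
bits = "".join(f"{byte:08b}" for byte in first_three_bytes)

# ...then cut the same bits into four 6-bit chunks instead.
sextets = [bits[i:i + 6] for i in range(0, len(bits), 6)]
print(sextets)  # ['000000', '100001', '101100', '110100']

# Look each sextet up in the encoding key.
encoded = "".join(ALPHABET[int(sextet, 2)] for sextet in sextets)
print(encoded)  # Ahs0
```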

I could send you "Ahs0" and, as long as you knew the encoding key, you could reverse the encoding and retrieve the bits.
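Going the other way is just as mechanical. A minimal sketch of what the recipient would do, assuming the same 64-character key and input whose length is a multiple of four with no padding (the function name decode_block is made up for this example):

```python
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def decode_block(text: str) -> bytes:
    """Turn encoded characters back into 8-bit bytes (no padding handled)."""
    # Each character's position in the key gives back its 6 bits...
    bits = "".join(f"{ALPHABET.index(char):06b}" for char in text)
    # ...and the bit string is then re-read in groups of 8.
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


decoded = decode_block("Ahs0")
print(list(decoded))                      # [2, 27, 52]
print([f"{byte:08b}" for byte in decoded])  # ['00000010', '00011011', '00110100']
```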

And this is the bare bones of base64 encoding. I’ll fill in the gaps and attempt to extricate the tortured analogy from this explanation, applying the real world, in Tier 4.

Next >> Understanding Base64 Encoding #4