Disclaimer: I’m writing this blog
post in an attempt to present a tiered approach to learning a new subject. It's also to solidify my understanding of the topic of base64 encoding
as well as to act as an aide-memoire. I’m not presenting this information as
infallible fact.
Preamble: Personally, learning
a new programming concept (or any complex topic, for that matter) requires me to
take a very particular approach if I want to gain and maintain a comprehensive
understanding of it, and I rarely come across resources that represent and
facilitate my learning process.
Learning, for me, involves moving from the general to the specific, and I need my sources of
information to assume as little as possible while establishing context and
purpose quickly. A learning resource of this type usually takes the form of
tiered levels of explanation. To my mind, Tier 1 is where the biggest shortage
of good resources on a topic generally is. It should be what the opening
paragraph of a Wikipedia article strives to attain: a succinct and clear
overview of the topic that someone immersed in the relevant field can read and
feel more illuminated right away. Further tiers of explanation should elaborate on what previous tiers have established.
Let
me try presenting the first couple of tiers for base64 encoding in the style I'm talking about.
What I assume: you have a
programming background and that you’re looking to better understand base64
encoding.
Tier 1:
Okay,
Tier 1 explanations might be relevant if you’ve just heard someone say “base64
encode” in a meeting and you’re thinking “I should probably have some idea what
on Earth they’re talking about”; you’re googling about for five minutes to see if you can shed some light on the topic.
Wikipedia’s
Base64 opening salvo is: “Base64 is a
[...] binary-to-text encoding scheme that represent[s] binary data in an ASCII
string format”.
This
isn’t particularly illuminating on its own but there are a couple of clues in
there: it’s something to do with binary data being represented as ASCII
characters.
Warning: rather unhelpfully, it
is possible to immediately jump down the rabbit hole with base64 encoding and
you may be thinking, as I was, “hang on a minute, everything eventually boils
down to binary data - including ASCII characters - so that seems like a bit of
a nonsense”. Or perhaps you have come across an example whereby someone is
showing you how they converted a sentence (one string of characters) into
base64 encoded text (another string of characters) and are thinking “what could
possibly be the value in that!?”. If you’ve done either (or both) of these
things, please, for the moment, put those thoughts on ice – don’t worry, I’m with
you comrade, I feel your pain.
A
concrete example might help. Imagine I have a 10 x 10 pixel jpeg image (some
binary data) and I want to represent it (for some ungodly reason) as ASCII
characters. Up steps base64 encoding. In fact, here is a base64 encoded 10 x 10
jpeg:
/9j/4AAQSkZJRgABAQEAYABgAAD/4QBmRXhpZgAATU0AKgAAAAgABAEaAAUAAAAB
AAAAPgEbAAUAAAABAAAARgEoAAMAAAABAAIAAAExAAIAAAAQAAAATgAAAAAAAABg
AAAAAQAAAGAAAAABcGFpbnQubmV0IDQuMC45AP/bAEMAAQEBAQEBAQEBAQEBAQEB
AQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEB
Af/bAEMBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEB
AQEBAQEBAQEBAQEBAQEBAQEBAQEBAf/AABEIAAoACgMBIgACEQEDEQH/xAAfAAAB
BQEBAQEBAQAAAAAAAAAAAQIDBAUGBwgJCgv/xAC1EAACAQMDAgQDBQUEBAAAAX0B
AgMABBEFEiExQQYTUWEHInEUMoGRoQgjQrHBFVLR8CQzYnKCCQoWFxgZGiUmJygp
KjQ1Njc4OTpDREVGR0hJSlNUVVZXWFlaY2RlZmdoaWpzdHV2d3h5eoOEhYaHiImK
kpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4eLj
5OXm5+jp6vHy8/T19vf4+fr/xAAfAQADAQEBAQEBAQEBAAAAAAAAAQIDBAUGBwgJ
Cgv/xAC1EQACAQIEBAMEBwUEBAABAncAAQIDEQQFITEGEkFRB2FxEyIygQgUQpGh
scEJIzNS8BVictEKFiQ04SXxFxgZGiYnKCkqNTY3ODk6Q0RFRkdISUpTVFVWV1hZ
WmNkZWZnaGlqc3R1dnd4eXqCg4SFhoeIiYqSk5SVlpeYmZqio6Slpqeoqaqys7S1
tre4ubrCw8TFxsfIycrS09TV1tfY2dri4+Tl5ufo6ery8/T19vf4+fr/2gAMAwEA
AhEDEQA/AP5/fg7rH7M+nWv7EPjn4nfsT/ACH49fsz/D/wDZbvPgR+xr4R+Hf7Qn
x1tf+C7/APwvX9o7x3D4p8R+Ivi14A+I3jn4ZeAviB8FbK8Xw/P4K8QeGfil/bn7
R3hz4i/sufEfwBqngX4ZaX+xF8FPxB+LFn/Z3xT+Jen/APCN/D/wd9g+IHjKz/4R
H4T+Nf8AhZXws8K/ZfEepQf8I58NPiL/AMJ/8V/+E++H+h7P7M8G+Nf+FpfEr/hK
vDlrpuu/8J/4x+3/APCRaj6B4A/ax/an+FHws8a/Av4W/tLftAfDX4JfEr/hI/8A
hYvwd8AfGT4i+DvhZ4+/4THw5Z+D/F3/AAmvw+8O+I9O8JeKv+Eq8Jadp/hbxH/b
ukX/APbnhyws9E1P7VplrBap8/0Af//Z
Sceptical?
If you copy that text and save it into a new text file (called, say, “encodedJpg.txt”)
and then navigate to the folder the file is saved in from a Windows
command prompt, you can run the command certutil -decode encodedJpg.txt 10x10.jpg
and you should see the jpg recreated.
You can turn the jpg back into text by running the alternative certutil -encode "input" "output" command (certutil wraps its output in -----BEGIN CERTIFICATE----- and -----END CERTIFICATE----- lines, but the text in between is the base64).
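If you're not on Windows, or would just rather not use certutil, the same round trip can be sketched in Python with the standard library's base64 module. This is only a minimal sketch: the function names decode_file and encode_file are my own, and the file names in the comments are the ones assumed in the example above.

```python
import base64
from pathlib import Path


def decode_file(text_path: str, binary_path: str) -> None:
    """Read base64 text from text_path and write the decoded bytes to binary_path."""
    encoded = Path(text_path).read_text(encoding="ascii")
    # By default b64decode ignores characters outside the base64 alphabet,
    # so the line breaks in the pasted text are not a problem.
    Path(binary_path).write_bytes(base64.b64decode(encoded))


def encode_file(binary_path: str, text_path: str) -> None:
    """Read raw bytes from binary_path and write them to text_path as base64 text."""
    raw = Path(binary_path).read_bytes()
    # encodebytes breaks its output into lines of at most 76 characters,
    # so the line lengths will differ from the 64-character lines shown above.
    Path(text_path).write_bytes(base64.encodebytes(raw))


# Hypothetical usage, assuming encodedJpg.txt is in the current folder:
#   decode_file("encodedJpg.txt", "10x10.jpg")
#   encode_file("10x10.jpg", "reEncodedJpg.txt")
```

The line lengths of the re-encoded text won't match the original, but decoding either version should give back the same bytes.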
And
that’s Tier 1. For the moment we’re not going to worry about the mechanics of
the operation; it’s enough to know that base64 encoding changes binary data into text that looks like the above. Why you'd want to do such a thing and how it's achieved are Tier 2 explanations. N.B. the binary data doesn’t
have to be a jpeg image, it could be anything: an executable, a zip file,
a Word document, etc.
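To convince yourself that any bytes will do, here is a small Python sketch (again using the standard base64 module; the sample byte values are arbitrary ones I picked). It also decodes just the first eight characters of the encoded jpeg above, which should come out as bytes beginning ff d8, the marker that starts a jpeg file.

```python
import base64

# Any bytes will do: a few arbitrary values, including non-printable ones.
raw = bytes([0x00, 0xFF, 0x10, 0x80, 0x7F])
text = base64.b64encode(raw).decode("ascii")

# The encoded form contains only printable ASCII characters...
assert text.isascii() and text.isprintable()
# ...and decoding it returns exactly the bytes we started with.
assert base64.b64decode(text) == raw

# The first eight characters of the encoded jpeg shown earlier.
header = base64.b64decode("/9j/4AAQ")
print(header.hex(" "))  # ff d8 ff e0 00 10
```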
Next >> Understanding base64 Encoding #2