Today, Model View Culture published an article I wrote about Unicode, character encoding, and non-Latin alphabets. I’ve included an excerpt below:
I am an engineer, and I am a writer. As an engineer, I spend a lot of time thinking about how text is stored, but relatively little about what information the text actually represents. To the computer, text is an abstract entity – a
stream of 0s and 1s, and any semantic meaning is in the eye of the
beholder. As a writer, of course, the meaning is everything, and the
mechanics of how the text is stored are merely a technical detail.
But in an economy that is increasingly digital, increasingly global,
and increasingly multilingual, we can no longer maintain this
distinction. The information we want to represent is intimately linked
to how it is stored. We can no longer separate the two.