The Proof of the Blogging

Bloggers fall into three categories.
  • Those who don't proofread their entries before posting because it's a blog
  • Those who don't proofread their entries before posting because they've never heard of proofreading
  • Those who don't proofread their entries until after posting
The proof of the blog is in the posting; the act of committing a piece changes the state of mind of the author, and he can suddenly spot errors much more easily. Outside the blogosphere this generally manifests itself as the discovery of glaring errors in your document when you see it, upside down, printed out on your boss's or customer's desk, or indeed when your customer has approved it for publication.

At least in a blog we can go back and make corrections to something already published at almost no cost at all.

I therefore advocate proofing of blog entries, but only after publication.

A Little Writing

Well folks, NaNoWriMo is almost upon us again. I began something last year, and then slipped a disk and spent three months on my back. This year, I'm going to go for it once again. I hope I don't get another serious injury. Writing isn't usually all that dangerous, no matter how much mightier than the sword the keyboard may be.



I didn't know how happy I was, until I discovered that I had been living without knowing of the existence of "i18n" and "L10n".

These abominations are what I think of as tertiary jargon. Primary jargon arises organically, by accident, often from slang, corruptions, abbreviations and audibility adjustments (a lot of printing jargon uses words that are easily distinguished against a lot of background noise). Primary jargon has a certain nobility; its very existence justifies its existence. Secondary jargon is a conscious invention in the presence of a need for a word - usually to differentiate between concepts or items where no differentiation is needed in other domains or contexts. Sometimes it is created in response to an innovation. Tertiary jargon is invented by people who think that jargon is cool, and is used by people who want to show that they are with it, fab, hip and trendy, and generally on the bus. Primary and secondary jargons can both enrich language, provide extra meaning, and give practical benefits - even if in some cases the benefit is restricted to those using the jargon, as with nautical and theatrical jargons. Tertiary jargon is a form of weaseling; it leaves us with less meaning and less understanding.

True, it takes less time to type, and to say, "i18n" than to type or say "internationalization" (although if you type at up to 100wpm who cares?).

Well, I've just been handed an internationalization project, and you can be sure that in all communications and documents I shall be writing it in full. I can't think of a reason why I would ever want to write it in a text message; if I had a word like that to say to someone, I think I'd call them and say it aloud.


en > en

I wonder sometimes about the sanity of it - but of course I understand the mind-set that says that if we internationalize our software, we should support everything classified as a language. Essentially, though, it becomes a political, or at the very least politic, choice to have an enUS version and enUK, enOZ, enSA, enMY, etc, etc... Taking this to its equitable conclusion results in a profusion of languages where every pidgin, dialect and creole is included. There are at least 10 versions of French that are spoken worldwide - even frCA is at least as different from frFR as enUS is from enUK.

Going to these lengths is perhaps laudable, and probably, in the end, worthwhile, if you are the world's largest software company. And I'm all for diversity of language. The more variation there is the more modes of expression there are. At home I often switch between en (Int) and fr (FR) depending on the subject of the conversation - a facility that I value enormously.

I've been looking at Facebook's recent initiative to produce an English (UK) version. Now Facebook may use American spellings, but the language it uses is largely international English - whether they mean to or not. And what is International English?

It is a subset of English that has a reduced vocabulary. In software, if your interface designer is disciplined, you can end up with an en(Int) version unintentionally, just because he tried to keep it simple.

Most of the "translations" from enUS to enUK in Facebook are not translations at all. A few good writers have improved the style and clarity of some of the onscreen text, and the result is more comprehensible to all English speakers, native or not, worldwide. (There are also a few pointless "corrections" by the grammar Nazis, and some varied spellings. Surprisingly little fighting over -ise -ize.)

It seems to me that a good designer should consider the political implications of his choice of language. I suspect however that in many cases the designer isn't considering it at all. He is either making the assumption that he will cause offense if he only uses Perugian Italian, Parisian French or NY English - in which case he should damn well say so - or he is making the assumption that a Taiwanese won't be able to use software with an interface in ch(PRC) - which is quite untrue.

When you set the editing language in MSWord, you're doing something else altogether; MSWord tries to help you to write without error in all the languages and variations that you speak. It is hence a requirement that every language with a written tradition be represented in the editing and correction tools, but by no means is this necessary for the interface itself. Indeed, it would make me completely crazy if the toolbar and keyboard changed language every time I changed the editing language.

Selection of languages for the interface should be more than just pragmatic. It should meet user expectations. An interface that can be presented in simplified English, simplified Chinese, French, Italian, Russian, modern Spanish and Japanese (westernized or standard) will meet the expectations of, and be usable by, most of the human race - and most of them will be happier that you have piled your resources into making the software efficient and easy to use, than you could possibly have made them by giving them an interface in their particular regional variation of their particular language.

Indeed, the grammar Nazis can't possibly object if you tell them explicitly that you have used an "internationalized" version of their language. And the minorities that seek to gain political capital from complaints that their language has been marginalized? If you can't ignore them, as they deserve, then get the buggers to pay for a localized version; your answer should always be "our product can be translated into any language". Give them access to the language files. After all, you'll offend far more easily by making errors in arLE (okay, that one's a bit more obscure: Lebanese Arabic) than you would by doing just one version of Arabic (there is an ar(Int)) and letting users customize it to a local version if they need to.
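For what it's worth, the "one base version per language, plus whatever local overrides people care to supply" idea fits in a few lines of Python. This is only a sketch under my own assumptions: the BASE and OVERRIDES dictionaries stand in for real language files, the sample strings are made up, and translate is a hypothetical helper of mine, not part of any real localization library.

```python
# Hypothetical language files: one "international" base per language,
# plus optional local overrides that users could supply themselves.
BASE = {
    "en": {"colour_picker": "Color", "trash": "Trash", "save": "Save"},
    "fr": {"colour_picker": "Couleur", "trash": "Corbeille", "save": "Enregistrer"},
}
OVERRIDES = {
    "enUK": {"colour_picker": "Colour", "trash": "Bin"},
    "frCA": {},  # nobody has supplied a local version yet
}


def translate(key, locale, default_language="en"):
    """Look up a UI string: local override, then base language, then default."""
    # Assumes the informal codes used in this post: "enUK" -> "en".
    language = locale[:2]
    tables = (
        OVERRIDES.get(locale, {}),
        BASE.get(language, {}),
        BASE[default_language],
    )
    for table in tables:
        if key in table:
            return table[key]
    return key  # last resort: show the key rather than crash
```

With these made-up files, translate("trash", "enUK") picks up the local "Bin", translate("trash", "frCA") falls back to the base French "Corbeille", and a locale with no files at all, such as "arLE" here, falls through to the default English string. The vendor maintains the base files; anyone who wants a regional flavour only has to fill in the handful of strings that actually differ.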

OK. This wasn't meant to be a lengthy rant. The message is pretty simple. Keep it translatable. Limit your standard package to languages spoken by four fifths of the world. It's not as if your interface is a novel by Thomas Hardy. If you have more than about 150 different words in your interface, YOU'RE DOING IT WRONG.
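If you want to hold yourself to that 150-word rule of thumb, a rough check is easy to write. The sketch below is mine and makes some loud assumptions: the interface strings are available as a plain list, a "word" is any run of ASCII letters or apostrophes (so it only suits an English interface), and distinct_words and within_budget are hypothetical helper names.

```python
import re


def distinct_words(strings):
    """Return the set of distinct lowercase words used across UI strings."""
    words = set()
    for text in strings:
        # Crude tokenizer: runs of ASCII letters and apostrophes only.
        words.update(re.findall(r"[a-z']+", text.lower()))
    return words


def within_budget(strings, limit=150):
    """True if the interface uses no more than `limit` distinct words."""
    return len(distinct_words(strings)) <= limit


# Made-up sample strings, standing in for a real interface.
ui_strings = ["Save file", "Open file", "Close", "Save as"]
```

The four sample strings use five distinct words between them (save, file, open, close, as), so within_budget(ui_strings) is comfortably true. Run it over your real string table and see how far over you are.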