Friday, April 29, 2005

The best tools are the least intrusive ones

The only software that people really want to use is games.

All the rest is supposedly productivity: Software that makes your life easier and/or more comfortable. You may like an app but at the end of the day, what you really want is not to use it but to use its output.

That's why I just added command-line support in appTranslator to build translated files. The idea is that what you like most in appTranslator is not the nice toolbar buttons but the fact that it produces translated files instantly.

So when you compile your app in Visual Studio (or using an unattended nightly build process), you want updates of the translated versions of your apps as well. Automatically !

Just add a post-build step to run appTranslator from the command line and let it silently deliver updates of your translated app to your door !

(This feature will be available in the next beta version).

Monday, April 25, 2005

UTF-8 is your friend

This morning, I experienced a great example of REALLY BAD internationalization.

I was busy booking tickets on the website of one of the largest European low-cost airlines. At some point during the booking process, I was asked to type the first name and family name of each passenger. I typed my daughter's first name: Iséa. When I tabbed to the next field, I received this error message:



It appears that this European website selling tickets to airports in more than ten different countries, counting almost as many different official languages, was not able to process the acute accent in my daughter's first name !
And don't think it's because of some English-only arrogant company spirit: The site is available in 18 languages !

UTF-8 offers support for diacritics and non-Latin character sets for free

I blame them because allowing me to write Iséa correctly is such an easy job: Just save your HTML pages using the UTF-8 encoding (Yes, your HTML editor can do that. Even Notepad can!).

UTF-8 encodes your data as Unicode. But the _very_ nice thing is that it preserves the basic properties of null-terminated ASCII strings: basic 7-bit ASCII chars are not modified by the encoding. Other characters are converted to a sequence of 2 or 3 bytes (OK, sometimes 4 but it's very rare), and none of these bytes is ever a null byte or any other 7-bit ASCII value.

This has a really cool consequence: As long as you simply store and forward the data, you don't have to modify your code at all. Your code doesn't have to be Unicode-aware. Simply store and forward UTF-8 encoded strings as if they were simple and stupid ASCII, null terminated strings ! It's the browser's job to display them correctly (and they all do a good job at it).
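To make the store-and-forward idea concrete, here is a minimal C++ sketch. The names kIseaUtf8, storeAndForward and countNonAsciiBytes are made up for this illustration; the point is only that code written for plain null-terminated ASCII strings carries UTF-8 bytes through unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// "Iséa" spelled out byte by byte in UTF-8, so the example does not depend
// on the encoding of this source file. 'é' (U+00E9) becomes 0xC3 0xA9.
// The literal is split in two so that the 'a' is not read as a hex digit.
static const char kIseaUtf8[] = "Is\xC3\xA9" "a";

// A deliberately Unicode-unaware routine (hypothetical): it treats its input
// as a plain null-terminated byte string and copies it, as legacy code would.
std::string storeAndForward(const char* text)
{
    assert(text != nullptr);
    return std::string(text, std::strlen(text));
}

// Counts bytes with the high bit set, i.e. bytes that belong to a
// multi-byte UTF-8 sequence. Pure 7-bit ASCII text yields 0.
std::size_t countNonAsciiBytes(const std::string& s)
{
    std::size_t n = 0;
    for (unsigned char c : s)
        if (c >= 0x80) ++n;
    return n;
}
```

Passing kIseaUtf8 through storeAndForward gives back the same 5 bytes: strlen stops at the real terminator because no byte of a multi-byte sequence is ever zero, and the two bytes of the é are the only ones with the high bit set.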

The irony, in this case, is that even the so-called special character é has exactly the same value in Unicode (U+00E9) as in the infamous ISO-8859-1 specified in the HTML page (0xE9). UTF-8 simply writes it as two bytes (0xC3 0xA9) instead of one.

Conclusion: Save your HTML pages as UTF-8

It's the biggest yet simplest step to make your web site world-ready.

Saturday, April 23, 2005

No need to know your base class name. __super !

A question I love asking interviewees is: If you could fix one weakness of C++, what would it be ?
It's pretty interesting to see how deeply a candidate has thought about the language and its implementation.
My answer if I were asked the question is: I'd add a baseclass keyword. It would allow calling a base class member without having to explicitly specify the base class name. That way, one can more easily inject a small class in a hierarchy chain.
There's a trick to work around that though: #define or typedef the base class. That way, you only have to change the #define/typedef when you inject a class.
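The typedef variant of that trick can be sketched as follows. Window, Dialog and BaseClass are hypothetical names picked for the illustration, not classes from any real framework.

```cpp
#include <cassert>
#include <string>

class Window
{
public:
    virtual ~Window() {}
    virtual std::string describe() const { return "Window"; }
};

class Dialog : public Window
{
    // Besides the inheritance list above, this is the only line to touch
    // if a class is later injected between Window and Dialog.
    typedef Window BaseClass;

public:
    std::string describe() const override
    {
        // Calls the base implementation without naming Window here.
        return BaseClass::describe() + " > Dialog";
    }
};

// Small self-check showing the call chain.
void demo()
{
    Dialog d;
    assert(d.describe() == "Window > Dialog");
}
```

Every member of Dialog that needs its parent goes through BaseClass, so injecting a class means changing the inheritance list and the typedef, and nothing else.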

Today, thanks to Raymond Chen, I learned that the Microsofties shared the same dream. And they made it come true: __super !

Just one more question Raymond: Why didn't you tell me 10 years ago !

Wednesday, April 20, 2005

Save the planet: Stop converting CRLFs !

Today was one of those days when I wonder why the hell I am a developer when there are so many great jobs out there : Mattress tester, pizza eater, Aussie snorkeler,...

I was debugging a couple of seemingly unrelated small bugs which together led me to discover that I was really not dealing correctly with carriage returns and line feeds in the multiline translatable text items.

Yesterday, I already spent much time trying to solve a stupid little usability problem related to multiline texts: When <enter> means 'OK', do people know that <ctrl+enter> is used to enter a line feed ? I eventually came up with this visual cue, which appears when a multiline text item is selected:



Today, I noticed that this warning is ALWAYS on when the message table is opened (The message table looks like the string table, just different!). It turned out that messages from that table always end in CRLF, which made appTranslator believe they contain at least 2 lines of text! Since I don't want people to stupidly have to type <ctrl+enter> at the end of each message, I decided to strip these final CRLFs when importing the tables and re-append them when building the target file.
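A minimal sketch of that strip-and-re-append step, in C++. This is not appTranslator's actual code: MessageItem, importMessage and buildMessage are names invented for the illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical record for one message-table entry.
struct MessageItem
{
    std::string text;       // text shown to the translator, final CRLF removed
    bool hadTrailingCrLf;   // remembered so the build step can restore it
};

// Build step: puts the final CRLF back if the imported message had one.
std::string buildMessage(const MessageItem& item, const std::string& translated)
{
    return item.hadTrailingCrLf ? translated + "\r\n" : translated;
}

// Import step: strips one final CRLF, if present, and remembers that it did.
MessageItem importMessage(const std::string& raw)
{
    MessageItem item;
    item.hadTrailingCrLf = raw.size() >= 2 &&
                           raw.compare(raw.size() - 2, 2, "\r\n") == 0;
    item.text = item.hadTrailingCrLf ? raw.substr(0, raw.size() - 2) : raw;

    // Rebuilding from the untranslated text must give the original bytes back.
    assert(buildMessage(item, item.text) == raw);
    return item;
}
```

Only the very last CRLF is touched, so a message that really contains several lines still shows up as multiline in the editor.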

I must also say that when you click OK in the translation pad, appTranslator converts all CRLFs back to simple LFs (line feeds), because multiline strings usually contain single LFs. And since the rich edit control in the translation pad happily and silently converts single LFs to CRLFs, someone must reverse the conversion some day...

Convert all CRLFs to LFs ? That's not true anymore, since the message compiler emits CRLFs only. So when I get a string back from the rich edit with CRLFs in it, how do I know whether the app that is going to use these strings (your app!) expects single LFs, single CRs or CRLFs ?

So I wrote another 50+ lines of code to compare the line breaks of the translated string with the ones in the source string, and to make sure the translated string uses the same kind as the source string.

And each time an appTranslator user presses OK, a Pentium-class processor will spend a couple of microseconds checking if these conversions must be performed...
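For the curious, the comparison can be sketched in much less than 50 lines if you ignore the corner cases. This is a simplified illustration with made-up function names, assuming the source string uses a single line-break style throughout; it is not the code that ships in appTranslator.

```cpp
#include <cassert>
#include <string>

// Returns the first line-break sequence found in s: "\r\n", "\n" or "\r".
// Returns an empty string when s has no line break at all.
std::string detectLineBreak(const std::string& s)
{
    for (std::string::size_type i = 0; i < s.size(); ++i)
    {
        if (s[i] == '\r')
            return (i + 1 < s.size() && s[i + 1] == '\n') ? "\r\n" : "\r";
        if (s[i] == '\n')
            return "\n";
    }
    return "";
}

// Rewrites every line break in text (CRLF, lone CR or lone LF) as lineBreak.
std::string convertLineBreaks(const std::string& text, const std::string& lineBreak)
{
    assert(lineBreak == "\r\n" || lineBreak == "\n" || lineBreak == "\r");

    std::string out;
    for (std::string::size_type i = 0; i < text.size(); ++i)
    {
        if (text[i] == '\r')
        {
            if (i + 1 < text.size() && text[i + 1] == '\n')
                ++i;                 // swallow the LF of a CRLF pair
            out += lineBreak;
        }
        else if (text[i] == '\n')
            out += lineBreak;
        else
            out += text[i];
    }
    return out;
}

// Makes the translated string use the same line-break style as the source.
// If the source has no line break, the translation is returned untouched.
std::string matchSourceLineBreaks(const std::string& source, const std::string& translated)
{
    const std::string style = detectLineBreak(source);
    return style.empty() ? translated : convertLineBreaks(translated, style);
}
```

With a source string that uses single LFs, a translation coming back from the rich edit with CRLFs is converted to LFs; with a CRLF source, it is left as CRLFs.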

Are you still with me ?

Rocket science, isn't it! All this because some day, probably at a time when the Universe had not yet planned to turn a bunch of atoms into you and me, someone decided that a line break would consist of 2 characters (CR and LF). And later, another someone figured it would be easier to process the files if all CRLF pairs were first converted to a single LF, but keep them as CRLFs in the file so that the first someone doesn't notice. So, now, 40+ years later, I have to spend hours figuring out how to deal with that mess.

Save the planet!

I'm sure that the power consumption required by millions of computers worldwide to perform CRLF conversions all day long easily adds up to the output of a nuclear power plant. Here's a great slogan for all the ecologist geeks out there: "Save the planet: Stop converting CRLFs !"

Monday, April 18, 2005

International Glossaries

Last week, I explained how to install several Windows UI languages.

If you just want to peek into a Windows translation to check the translation of a menu item or some terminology, and you don't want to install full support of this language using MUI, there is an alternative : The Microsoft Glossaries.

A set of Excel files for each language

MS publishes its international glossaries as a set of Excel files. There is one set per language (it's a self-extracting exe). The files are more or less ordered collections of English/translation pairs of texts for the most important Windows components.

Note that the amount of translations available for each language varies quite a lot. Consistency doesn't seem to be a motto for the people who collected the information.

Where ?

The files can be downloaded from the MSDN subscribers download area (MSDN subscription required). In the menu, choose Tools, SDKs and DDKs, then Developer Tools, SDKs and DDKs, then Microsoft Glossaries.

Friday, April 15, 2005

MSDN CDs

Does he know that MSDN ships on DVDs as well ?

Wednesday, April 13, 2005

New preview version...

I just posted a new preview version of appTranslator (version 0.53).

Docs !

Back when I published the first preview version, the biggest missing piece was docs. The good news is that the problem is fixed in this new version. It even contains a tutorial ! Pfeewww... I knew I wouldn't cook it in an evening! But I didn't figure putting everything together would take so much time. Not that it's difficult really, but it is indeed a time consuming task: You must organize the sequence of steps so that it's fairly straightforward and not too boring, yet covers the most important features. And is as accurate as possible ! Because when you're following a tutorial, you actually are... well... following ! If one single step doesn't work as expected, you are going to try bridging the gap yourself and if you don't do it the way I expected you to do it, we're no longer on the same path, which unavoidably makes you mad because if these #$&! are not even able to correctly describe the easiest tasks in their software, what can you expect from the software itself !?

Making it easier for first-time users

When they started to use the first version, many users told me that the first steps were a little confusing. I wrote that even if they only had to spend a few minutes trying to find their way, it's important to avoid these situations because for one potential customer who is willing to spend these 3 minutes, how many are going to hit the Uninstall button immediately !
I (hopefully) addressed this problem by creating a wizard to help people get started. The nice thing with wizards is that they lead you through the necessary steps to get you started. The bad thing is that wizards are like tutorials: They take so long to design and program correctly !

Coding time again...

I didn't code much at all during this last month. There was even a stretch of over 2 weeks when I didn't start Visual Studio at all. I'm not sure that had happened a single time during the last 10 years ! Now that I am a one-man band, the 2 out of 100 calories rule applies to me as a whole.
But I'm now starting to dive into code again !

Monday, April 11, 2005

Choose your Windows UI language

Ever wanted to know what Windows looks like in German, Italian, Japanese, Vietnamese, Arabic, Thai, Greek,... ?

Did you know that you can install the Windows GUI in all these languages (and more) in addition to your default language ?
It's called MUI (for Multilingual User Interface). You simply run the MUI setup on your existing Windows 2000 or Windows XP box and choose which language(s) you want to add.

When the installation is completed, your Regional and Language Options control panel applet has a new combobox where you can choose your Windows UI language (you must log off for the change to take effect).

The nice thing is that it's a user setting. Just create a few users and assign each one a different language. You can now toggle between different UI languages using FUS (Fast User Switching) !

The MUI CDs are available to MSDN subscribers (Operating Systems+ subscriptions). And of course, they are available as downloads from the MSDN subscribers download area.

This guy is pretty good...

Fake Nike ad featuring Tiger Woods. (through Ian Landsman)

Wednesday, April 06, 2005

How to avoid string ID allocation collisions

The resource string table is not team-friendly
It's a flat table where all developers add entries randomly, usually picking the next available index. Devs usually have their own copy of the .rc file and add strings to the table using the same new IDs as their teammates, resulting in painful index collisions. Since the resource script (.rc) is just a flat file, it doesn't provide locking or transaction capabilities as a full-blown DB would. Locking access using version control is hardly ever a solution : Since there is only one .rc file, members of the team very often need to access some part of it, making exclusive check-out a real bottleneck. For other resources, it's usually not such a big problem: Devs usually don't work on the same resource simultaneously.

The problem with the string table is that it's global and multi-purpose: It holds strings as different as listview column headers, error messages, prompts and all kinds of custom texts. Therefore devs need almost constant write access to it.

A simple trick to avoid string ID allocation collisions
There's a very simple way to avoid index collisions: Assign each developer his/her own range of indexes in the table.

e.g.:

  1. Mike gets the range 5000-5999,
  2. Goran gets 6000-6999,
  3. Serge gets 7000-7999,...

To make it even clearer, start each range with a string mentioning the dev's name and the range.
This way, people don't step on each other's toes anymore.
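If you want the convention to be checkable rather than just written on a whiteboard, the ranges can be put in a small table. The following C++ sketch is only an illustration: IdRange, kRanges, ownerOf, rangesAreDisjoint and checkRanges are invented names, and the ranges are the example ones listed above.

```cpp
#include <cassert>
#include <string>

// Hypothetical per-developer string ID ranges (inclusive bounds).
struct IdRange
{
    const char* owner;
    int first;
    int last;
};

constexpr IdRange kRanges[] = {
    { "Mike",  5000, 5999 },
    { "Goran", 6000, 6999 },
    { "Serge", 7000, 7999 },
};
constexpr int kRangeCount = sizeof(kRanges) / sizeof(kRanges[0]);

// Returns the owner of a string ID, or an empty string if the ID falls
// outside every assigned range.
std::string ownerOf(int id)
{
    for (const IdRange& r : kRanges)
        if (id >= r.first && id <= r.last)
            return r.owner;
    return std::string();
}

// True when no two ranges share an ID.
bool rangesAreDisjoint()
{
    for (int i = 0; i < kRangeCount; ++i)
        for (int j = i + 1; j < kRangeCount; ++j)
            if (kRanges[i].first <= kRanges[j].last &&
                kRanges[j].first <= kRanges[i].last)
                return false;
    return true;
}

// Could be called once at startup in debug builds.
void checkRanges()
{
    assert(rangesAreDisjoint());
}
```

A new teammate then gets a new row in the table, and the check catches an overlapping range before it turns into a collision in the .rc file.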

Of course, this trick works for small and medium size teams only. Large teams probably need stronger policies.

Monday, April 04, 2005

Misconception: It's a better idea to translate a .rc file than a .exe file

Developers often think that using the executable file (EXE) as the source file for the translations is not a good idea. They would rather have their source file translated: The resource script (in other words the .rc files). That's what I first thought, too. But it's wrong !

This article points out a few reasons why using the EXE as the source file makes developers' lives much easier. It also covers several major misconceptions about EXE-based translations.

Friday, April 01, 2005

NASA finds water on Mars

Yes! Lots of water !

And April fools as well :-)