Tuesday, November 28, 2006

appTranslator 2.1 is released

I already mentioned the new features in this release in the beta announcement post.
Here is a short reminder subjectively sorted by descending importance.

Feel free to download!
Note: The demo version contains some restrictions (documented on the download page) regarding word count and pseudo-localization.

Thursday, November 23, 2006

New EU Airports Security Rules

Security rules about carry-on luggage in airplanes have become fairly confusing lately.
Here is a summary of what you can or cannot take aboard.

You may take your razor (not this kind) but you may not take your shaving gel or aftershave. Yes Sir! I experienced it in London recently :-(

Friday, November 17, 2006

HTML Tables Import into Excel

Some of the software applications we use daily involve really smart engineering. Take word processors: think of the algorithms and computation involved in positioning every word of your text according to justification, font, size, bidirectional alignment, embedded pictures, and so on. It's a really tough problem that takes smart people to solve. (D)HTML rendering and spreadsheet updates are other examples of hard computational problems, yet we take it for granted that these programs just work. And let's not even speak about games!

What usually amazes us is not these smart pieces of code but the slick ones. Take HTML tabular data import into Excel! Let's say you have this nice HTML report produced by someone else and you want to get it back into Excel:



Don't try to copy/paste it from your browser. There's something much better: the Web Query dialog.
In the main menu (Excel 2003; I don't know about earlier versions, and Excel 2007 is coming tonight to a computer near me), choose Data / Import External Data / New Web Query and type the URL in the address bar. The dialog opens the page and puts a small black-on-yellow arrow in front of each HTML table in the page.



Click the arrow to select a table and click Import. And there you go: Your table is now in Excel!



Being smart doesn't pay off. Be slick!

There's no rocket science in there: Just a DHTML script to inject the arrow heads and some code to parse the table layout and contents. But it's damn slick if you want my opinion.
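To give an idea of how little code the "parse the table" half needs, here is a minimal sketch in Python. It is certainly not Excel's actual code: the TableExtractor class and the extract_tables() helper are names I made up for the illustration, and nested tables, rowspan and colspan are ignored.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Hypothetical helper: collects the cell text of every <table> in a page."""

    def __init__(self):
        super().__init__()
        self.tables = []   # each table is a list of rows, each row a list of strings
        self._row = None   # row being filled, or None when outside a <tr>
        self._cell = None  # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def _end_cell(self):
        # Store the finished cell in the current row (no-op if no cell is open).
        if self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._end_cell()
        elif tag == "tr" and self._row is not None:
            self._end_cell()  # tolerate a missing </td>
            self.tables[-1].append(self._row)
            self._row = None


def extract_tables(markup):
    """Return all tables found in the markup as nested lists of strings."""
    parser = TableExtractor()
    parser.feed(markup)
    parser.close()
    return parser.tables
```

Feed it the page source and each table comes back as a list of rows, ready to be dropped into cells. The slick part is the arrow in front of each table, not this.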

So if you want to impress people with your software: No need to be smart. "Just" be slick!

Thursday, November 09, 2006

appTranslator 2.1 beta version

I just posted a beta version of v2.1. (Download). Here's the main new stuff:

Pseudo-Localization

I already blogged about it. Remember: Pseudo-localization is limited to menus in the demo version.

French Version

Several people asked me why I didn't translate my own software! Well, I have been eating my own dog food for a long time, but for some reason I can't recall, I never included that French version in the package. This mistake is now fixed. (BTW, a German version will follow in a few weeks, as soon as my favorite German translator can find some time.)
If you are French, chances are appTranslator 2.1 will automatically start in French. If it doesn't (or if you want to switch back to English), point to View/Language/whatever.

Word Count

A sorely missing feature so far. Word Count is important when you work with external translators: It helps them make a quotation by providing a count of source words. It helps you check translators' bills by giving you the number of words translated when you import a TE or XLIFF file back into the main project.
Point to Translations/Word Count and here's what you get:



CLanguageSupport class and Sample Code

An article (in the online help, including a detailed step-by-step how-to) and ready-to-use MFC code to implement satellite DLL support and a Language sub-menu for your application.

It's not new really: It's a copy of an article and code I published last year on codeproject.com.

Tuesday, November 07, 2006

Input Validation: What is an alpha-num character?

A few days ago, I wrote that I didn't agree with Eric Lippert about using a regex to filter alphanumerical input. Let's take a registration system using such validation. It would rule out my daughter Iséa because of the accented é. (It would also rule out all text using non-Latin scripts such as Greek, Russian, Japanese,...)
I said I would post some code to do such alphanum validation.

Unicode Categories

The idea behind such code is to loop over all chars in the string and examine their Unicode categories: Each one must match Char.IsLetter() and/or Char.IsNumber(). Which means Greek letters, accented French letters, Japanese ideograms et al are accepted (Yes, the docs for IsLetter() say alphabetical. But it includes ideograms). Even Myanmar digits actually.

That's much better than a simple regex, but it's still not enough. The input might include decomposed characters, where the letter and its diacritic mark(s) are coded as two (or more) separate characters. For example, é is either U+00E9 or the pair U+0065 U+0301. Hence our check should also accept the NonSpacingMark Unicode category.
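Here is a minimal sketch of that check, written in Python rather than a .NET language. I'm assuming that Python's unicodedata categories starting with L and N are a fair stand-in for Char.IsLetter() and Char.IsNumber(), and that Mn corresponds to NonSpacingMark. The function name is_alphanumeric() is made up for the example.

```python
import unicodedata


def is_alphanumeric(text):
    """Return True if every character is a letter, a number or a non-spacing mark.

    Hypothetical helper: L* categories are letters, N* are numbers and
    Mn is a non-spacing (combining) mark such as U+0301.
    """
    if not text:
        return False  # assumption: an empty string is not valid input
    for ch in text:
        category = unicodedata.category(ch)
        if category[0] not in ("L", "N") and category != "Mn":
            return False
    return True
```

With this, both spellings of Iséa pass (precomposed U+00E9, or e followed by U+0301), and so do Greek or Japanese names, while Jean-Pierre is rejected because of the hyphen. Note that the sketch doesn't check that a combining mark actually follows a letter.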

Writing such a little routine in an ASP.NET-compatible language is left as an exercise to the reader, given the links in the paragraph above. I admit that I don't know what it would look like in PHP even though I have some non-negligible experience in that language.

Want to make similar checks in your C++ Win32 apps? GetStringTypeW is your friend.

Inclusion Set vs Exclusion Set

But the more I think about it, the more I wonder if such alphanumerical checks are a good idea at all. Unless you want to validate input for a very specific format (e.g. a Belgian car license number), validation that consists of checking whether all chars are within a given set is flawed by design: In most cases, you just can't define the acceptable set. Example: alphanumerical chars are not enough for first name validation: One needs the "-" as well (such as in Jean-Pierre). It's actually easier to work the other way around: Check that all characters fall outside a given set of unacceptable characters, such as apostrophes, which are SQL string delimiters.

The problem with my suggestion is that you let slip some unacceptable chars that you're not aware of. Security zealots would tell me that they prefer to force me to write Isea instead of Iséa rather than taking the risk of leaving a door open.

I agree with them. BTW, is it true that there was once a vulnerability in Windows based on the use of the Turkish dotless i? That would prove that even though a regex is far too restrictive, using an exclusion set is asking for security holes.

Why exclude characters at all?

What to do then? Well, we can usually rely on our programming platform (language, DB driver author, whatever...) to provide such safety checks for us. Better yet, this check shouldn't reject unacceptable data. It should rather escape it in a way that the DB will be happy with. That allows company names with an apostrophe to be typed correctly.

Great, we're back to another security measure enumerated by Eric: Use your DB API to escape input. Of course, pay attention to escaping it for HTML/WML/whatever rendering as well. It is obviously a little more difficult than simply accepting US-ASCII alphanum chars only. But security tradeoffs should not become an excuse for promoting incompetence: We must do our homework!
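As an illustration, here is a minimal sketch in Python using the standard sqlite3 and html modules. Strictly speaking the DB API binds the value as a parameter instead of escaping it into the SQL text, but the effect is what we want: the apostrophe goes through untouched. The companies table and the add_company() and render_company() helpers are made up for the example.

```python
import html
import sqlite3


def add_company(connection, name):
    # The ? placeholder makes the driver bind the value separately from the
    # SQL text: no string concatenation, so an apostrophe is just data.
    connection.execute("INSERT INTO companies (name) VALUES (?)", (name,))


def render_company(name):
    # Escape again, differently, when the same value goes into HTML.
    return "<li>" + html.escape(name) + "</li>"


conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE companies (name TEXT)")

add_company(conn, "O'Reilly & Sons")
add_company(conn, "x'); DROP TABLE companies; --")

names = [row[0] for row in conn.execute("SELECT name FROM companies ORDER BY rowid")]
items = [render_company(name) for name in names]
```

Both names are stored exactly as typed, including the one that tries to drop the table, and nothing was rejected along the way.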

The Short Version

All this to say that input validation is sometimes not a good idea: You'd rather make sure you can safely accept all input.
And Unicode categories are cool ;-)

Before you decide to get started with WPF...

...you may want to follow Eric Gunnerson's check-list.
It looks like it should save you time, tears and a handful of hair!

And if you're looking for a WPF book, here's a list of authors ;-)

(WPF=Windows Presentation Foundation, formerly known as Avalon)

Saturday, November 04, 2006

Irritating games - Jeux chiants

Or "How to waste your time at work".

Kek is a French Flash specialist. He makes games. Irritating games. IRRITATING GAAAAMES!!! If you see what I mean. You don't? Form your own opinion.

In English or in French.

Note: Part of the fun is in the wording of the congratulations at the end of the games but you'll want to choose the French version to truly enjoy that part.

If you read French and want to learn Flash, consider buying Kek's book: Flash Professional 8, le guide complet.

Thursday, November 02, 2006

OK, we must care about SQL injections. But how?

"For Joel's proposed attack to succeed, everything has to go wrong. The server has to fail to validate input, then use it in an insecure way, then connect to the database as an administrator. Regrettably, many server-side web apps leave themselves wide open to these sorts of attacks. Eliminate all of these problems, not just the string concatenation."

Very nice HowTo written by Eric Lippert in response to Joel's post about SQL injections.

I have an objection though: Using a regex to validate alphanumerical input is a neon sign saying "This site is for Americans only!". Because your regex will hardly take accented letters into account. Let's not even speak of letters and digits in non-Latin scripts!
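To make the objection concrete, here is a small Python sketch. The pattern is my assumption of what a typical "alphanumerical" regex looks like, not the one from Eric's post, and passes_ascii_filter() is a made-up name.

```python
import re

# Assumed pattern: the usual ASCII-only idea of "letters and digits".
ASCII_ALNUM = re.compile(r"[A-Za-z0-9]+")


def passes_ascii_filter(text):
    """Return True only if the whole string is made of ASCII letters and digits."""
    return ASCII_ALNUM.fullmatch(text) is not None
```

"Isea" gets through such a filter. "Iséa" doesn't, and neither does any name written in Greek or Cyrillic.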

I'll post some code to do such validation correctly. Stay tuned...