Daybreakin Things

Filed under In English

Today is one of our national holidays, "Hangul Day". It celebrates the creation of the Korean alphabet, Hangul. Hangul is a truly wonderful achievement of our nation, which has almost no illiterate people. We even have a saying, "One doesn't know Giyeok(ㄱ) with a sickle," which means that the person is a fool, because Hangul is so easy that everyone should know it.

However, we have many Hangul-related issues in the computer world, because typical software and operating systems were designed by people who speak a single language, such as Americans. Many foreign computer games, like Supreme Commander, do not support typing Hangul. (Blizzard's games are the notable exceptions.)

Recently, problems with displaying Hangul have become almost negligible, because much software now uses Unicode and almost all operating systems ship with Unicode fonts (although some of them do not look pretty to native Koreans).

Problem 1: Fonts

For Latin characters, major operating systems like Microsoft Windows and Apple Mac OS X share a set of common fonts: Arial, Times New Roman, Courier, Impact, Comic Sans MS, Georgia, Lucida Console, Lucida Sans, Palatino, Trebuchet MS/Helvetica and Tahoma/Geneva. This lets web designs be diverse and beautiful while still looking consistent across many different OSs. I think this is why the W3C had not worked actively on a web fonts specification.

But for Korean fonts, the situation is very bad. Because Microsoft Windows holds a monopoly, the four basic fonts of Korean Windows hold one as well. They are Gulim(굴림), Dotum(돋움), Batang(바탕) and Gungseo(궁서). The former two are sans-serif, and the latter two are serif. At small sizes like 10pt, they are displayed as bitmaps, because in the past anti-aliasing techniques were not good enough to improve the readability of complicated East Asian characters, including Hangul.

The problem is that Gulim, the most frequently used of them, is derived from a Japanese font, Naru, and has been criticized for destroying the native geometry and beauty of Hangul. Many web designers don't like it either, but their only other option is Dotum, which is not pretty either. So it became very common to use images for titles and brochures in order to use commercial fonts like YoonGothic(윤고딕). Another problem here is that making a new Korean font requires a huge amount of money and time. You only need to design about 100 characters for a basic Latin font, but for a Unicode Korean font we need 11,172 + alpha characters. If you want to make it more complete, you also have to include common Chinese characters, which may add thousands more. So many good fonts were developed as commercial, non-free products and never became popular, and consequently web designers could not avoid using images instead of text.
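The figure 11,172 is not arbitrary. As a quick check, assuming the standard Unicode range for precomposed Hangul syllables (U+AC00 through U+D7A3), it is the product of 19 initial consonants, 21 vowels and 28 final positions (27 final consonants plus "no final"). The constant and function names below are only illustrative:

```python
# Precomposed Hangul syllables occupy U+AC00..U+D7A3 in Unicode.
FIRST_SYLLABLE = 0xAC00  # 가
LAST_SYLLABLE = 0xD7A3   # 힣

INITIALS = 19  # leading consonants
VOWELS = 21    # medial vowels
FINALS = 28    # 27 trailing consonants + "no final consonant"


def hangul_syllable_count() -> int:
    """Number of code points in the precomposed Hangul syllable range."""
    return LAST_SYLLABLE - FIRST_SYLLABLE + 1


# Every combination of initial, vowel and final gets its own glyph.
assert hangul_syllable_count() == INITIALS * VOWELS * FINALS
print(hangul_syllable_count())  # 11172
```

A font designer can reuse component shapes, but each of these combinations still has to be balanced within its own square, which is where the cost comes from.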

[Figure] Font Test: some sample Hangul fonts

Windows Vista is a very remarkable version for Koreans because it was the first to introduce a new default UI font, Malgun Gothic(맑은 고딕), with ClearType enabled. Yes, we have finally "graduated" from the bitmap font era. Also, many local governments such as Seoul and major IT companies are developing new vector fonts and releasing them for free to improve their brand images. This will make the situation better, but it is still not as good as having such fonts bundled with operating systems.

For programmers, separating the English font from the Korean font is very important for readability, because the Korean fonts that get auto-selected to go with a monospace Latin font are generally very unreadable. Fortunately, my favorite text editor, gVim, supports this, but the SSH client PuTTY didn't. So I had to make a patch: dPuTTY.

[Figure] Font Separation: comparison of separated fonts and auto-selected fonts
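The idea of font separation can be sketched in a few lines of Python. This is only an illustration of the principle, not how gVim or dPuTTY actually implement it: the text is split into runs of wide (East Asian) and narrow characters, and each run gets its own font. The font names are placeholders I picked for the example. gVim exposes a similar split through its `guifont` and `guifontwide` options.

```python
import unicodedata
from itertools import groupby

# Placeholder font names, chosen only for this example.
LATIN_FONT = "Consolas"
KOREAN_FONT = "Malgun Gothic"


def is_wide(ch: str) -> bool:
    """True for characters that East Asian fonts draw at double width."""
    return unicodedata.east_asian_width(ch) in ("W", "F")


def split_font_runs(text: str) -> list:
    """Split text into (font, run) pairs so each script gets its own font."""
    runs = []
    for wide, chars in groupby(text, key=is_wide):
        font = KOREAN_FONT if wide else LATIN_FONT
        runs.append((font, "".join(chars)))
    return runs


print(split_font_runs("ls 한글.txt"))
# [('Consolas', 'ls '), ('Malgun Gothic', '한글'), ('Consolas', '.txt')]
```

A terminal or editor that draws each run with its assigned font keeps the Latin text in the programmer's preferred monospace font while Hangul is drawn with a font designed for it.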

Problem 2: IME

All major operating systems provide internationalized input through their own IME subsystems. CJK[1] characters in particular need a complicated IME with automata and dictionaries.

On Microsoft Windows, the operating system offers a set of Input Method APIs. They encapsulate the composition process, so applications only need to know whether a composition has begun, is in progress, or has finished. Of course, they have to update their text views or edit controls in response to those events from the IME.
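To give a feeling for what the IME hides from the application, here is a heavily simplified Hangul automaton in Python. It is a sketch under my own assumptions, not the Windows implementation: the event names "begin", "update" and "finish" are made up to mirror the three states above (on Windows they roughly correspond to the WM_IME_STARTCOMPOSITION, WM_IME_COMPOSITION and WM_IME_ENDCOMPOSITION messages), and compound vowels, compound final consonants, backspace and dictionaries are all left out.

```python
CHO = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"        # 19 initial consonants
JUNG = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"   # 21 vowels
JONG = ["", *"ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ"]  # 28 finals


def compose(lead, vowel, tail=""):
    """Return the text shown for the current composition state."""
    if not vowel:
        return lead  # a lone consonant is shown as-is
    index = (CHO.index(lead) * 21 + JUNG.index(vowel)) * 28 + JONG.index(tail)
    return chr(0xAC00 + index)


def type_jamo(keys):
    """Feed jamo one at a time; return the (event, text) pairs an app would see."""
    events = []
    lead = vowel = tail = ""

    def finish():
        nonlocal lead, vowel, tail
        if lead:
            events.append(("finish", compose(lead, vowel, tail)))
        lead = vowel = tail = ""

    def begin(consonant):
        nonlocal lead
        lead = consonant
        events.append(("begin", lead))

    for key in keys:
        if key in JUNG and lead and not vowel:
            vowel = key                    # ㅎ + ㅏ -> 하
        elif key in JUNG and tail:
            moved, tail = tail, ""         # 한 + ㅡ: the ㄴ moves on -> 하, 느
            finish()
            begin(moved)
            vowel = key
        elif key in CHO and vowel and not tail and key in JONG:
            tail = key                     # 하 + ㄴ -> 한
        elif key in CHO:
            finish()                       # the consonant starts a new syllable
            begin(key)
            continue
        else:
            finish()                       # anything else is delivered as-is
            events.append(("finish", key))
            continue
        events.append(("update", compose(lead, vowel, tail)))
    finish()
    return events


def committed_text(keys):
    """Join only the finished pieces, i.e. what ends up in the document."""
    return "".join(text for event, text in type_jamo(keys) if event == "finish")


print(committed_text("ㅎㅏㄴㄱㅡㄹ"))  # 한글
print(committed_text("ㅎㅏㄴㅡㄹ"))    # 하늘
```

The second example shows why the text under composition cannot simply be appended to the document: after typing ㅎ, ㅏ, ㄴ the screen shows 한, but the next vowel pulls the ㄴ into the following syllable, so the application has to redraw on every "update" event.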

If an application does not support IME interaction, Windows falls back to behaviour like this:

[Flash] /blog/attachment/9747313015.swf

Compare with this native behaviour:

[Flash] /blog/attachment/8816602530.swf

The latter feels much more comfortable for Korean people. Enabling an application to interact with the native IME is very important for internationalization.

There is another important IME problem. No operating system lets users see the IME state conveniently. Users must look around, moving their eyes to the IME toolbar, to tell whether the current input mode is Hangul or English. Why haven't Microsoft or Apple changed the color or shape of the input cursor according to the IME status? I think they simply weren't aware of this problem, because they don't use an IME and don't switch between two languages frequently.

A few programs implemented this feature in the past, but we currently have no such software in our major computing environments. Almost every program uses only Windows' native IME features as provided.

* * *

Truly internationalizing software involves headache-inducing problems in many cases. There are other problems with file encodings, MP3 tag encodings, ANSI applications that need AppLocale, and many more. I hope developers who speak Latin-script languages will take basic i18n habits more seriously. Sometimes, I imagine what it would be like if modern computers or operating systems had been designed in Korea. :P


  1. Chinese, Japanese, Korean. It implies that processing these three languages properly is difficult for developers.