Global Solutions for Multilingual Applications: Real World Techniques for Developers and Designers
by Chris Ott
The complete resource for developing multilingual applications and Web sites.
Global Solutions for Multilingual Applications
With the increasing need for worldwide accessibility to the Web and other computer applications, Webmasters, developers, and IT managers must find solutions to any multilingual computing problem that may arise. This book provides you with the hands-on information you'll need to address these language and translation issues. After a concise overview of multilingual capabilities, you'll find real-world details and techniques for creating global Web sites and applications. And you'll gain additional insight on how to make multilingual electronic communications easier. Chris Ott provides you with:
• Advice for setting up both PC and Macintosh computers
• An overview of the multilingual capabilities of productivity applications (including Web browsers)
• Information on the new standard Unicode
• Tips to travelers on how to connect to the Web anywhere in the world
• A better understanding of the major issues involved when developing multilingual applications, and intranet and Internet sites
• An examination of the multilingual aspects of work in publishing, graphic design, and multimedia
The companion Web site at www.wiley.com/compbooks/ott features:
• Update information
• Examples of online multilingual techniques
• Links to translation sites and resources
Visit our Web site at www.wiley.com/compbooks/
- Publication date:
- Product dimensions: 7.54(w) x 9.47(h) x 0.62(d)
Read an Excerpt
Note: The Figures and/or Tables mentioned in this sample chapter do not appear on the Web.
The World-Ready Computer
Before you can do (or even view) any work with your computer in languages other than English, you may need to get a few things set up. This can be easy or fairly complex, all depending on what languages you want to work with.
For languages that are relatively close to English, like Spanish, German, and other Western European languages, you may need to do almost nothing, because support is generally already built in. But for extra convenience in working with these languages-- and for more complex capabilities, such as work in languages that use another writing system (Chinese, for example)-- you may need to buy and install additional software.
This chapter covers the basic multilingual features of Windows 98 and Mac OS 8.5. It also explains how to add support for each of the world's major languages that are not already supported in the standard U.S. English versions of the major operating systems.
NOTE Additional details about other systems, such as Windows NT, OS/2, and Linux, as well as forthcoming systems such as Windows 2000 and Mac OS X, are provided in Chapter 3, "Multilingual Compatibility Issues."
The Basics: Built-In ASCII Support
Thanks to the American Standard Code for Information Interchange (ASCII), even a computer set up for use in English has some built-in support for working in other languages. The standard ASCII set has 128 different characters, and the extended ASCII set (introduced by IBM in 1981 and still widely used today) has an additional 128 characters, for a total of 256.
What this means is that with many English fonts, you can type as many as 256 different characters. These characters include the 26 uppercase and 26 lowercase letters of the English alphabet, the numbers 0 to 9, punctuation marks, and mathematical symbols such as the equals sign (=). They may also have characters for all Western European languages, including characters such as é, ñ, and ü.
The way that computers handle these characters in extended ASCII is different from the way that you might think of them yourself. In other words, the computer doesn't see a character like á as "a with an accent" (although this may be how it is composed for print, for example). The computer considers á to be a completely different character from a, and each character like this has its own place in the ASCII table.
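The point that á occupies its own slot in the table can be checked in a few lines. The following is a minimal Python sketch, not part of the original chapter; it assumes the Latin-1 encoding as a stand-in for the "extended ASCII" table described above, and the variable names are purely illustrative.

```python
import unicodedata

a_plain = "a"
a_acute = "\u00e1"  # á stored as one precomposed character

# Each character has its own numeric position; á is not "a plus a mark".
print(ord(a_plain), ord(a_acute))
print(unicodedata.name(a_acute))

# Assumption: Latin-1 stands in for the 256-character extended set.
latin1_byte = a_acute.encode("latin-1")
print(len(latin1_byte))  # one byte per character in this encoding
```

The numeric position of á in this table is 225, which matches the Alt+0225 code that Character Map reports on Windows.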
NOTE A new standard called Unicode is emerging as a truly global successor to ASCII and other local character sets. Unicode is discussed in detail in Chapter 3, but references to Unicode support are made in this chapter and in Chapter 2, "World-Savvy Applications."
When someone wants to know how to type one of these characters, they generally ask about "accent marks." A better term is diacritics, since these marks include more than just the acute accent (') and the grave accent (`). They also include marks such as the circumflex (^) and the tilde (~). The term diacritics is generally used in this book to refer to these markings.
Foreign Character Basics: Windows
To see all the characters that are available in a particular font in Windows, use Character Map (click Start, then go to Programs, then Accessories, then System Tools). Character Map can display all the characters available for each of the fonts you have installed. Since some fonts have a different selection of characters, you can look at the character set for a different font by using the font pulldown menu within Character Map.
Character Map allows you to select and copy one or more characters into the Windows clipboard to be pasted into a document you are working on in another application. Character Map also reveals the keystroke combinations needed to type any of the font's characters. For characters not included on a standard keyboard, you hold down the Alt key while typing the character's numeric code. As Figure 1.1 shows, the numeric code for the character á is 0225.
Using Character Map or numeric codes to enter characters from the extended ASCII set works fine if you only need to enter a few characters, but it can be cumbersome if you need to work extensively in the languages these characters are used in. Fortunately, there are more convenient ways to do this.
Characters or Glyphs?
In multilingual work, you may come across references to both characters and glyphs. The difference is that character refers to an entity in the abstract, like the letter c, while glyph refers to the particular shape in which a character is displayed. The character c, for example, can be displayed in italics, with or without serifs, in a font that makes it look like it has been written in crayon, and so forth. Each of these different glyphs is a different rendering of the character c. This difference between characters and glyphs is especially important in languages such as Arabic, where the appearance of a character may vary depending on its position in a word.
Some individual applications, such as Microsoft Word and Corel WordPerfect, offer their own methods for entering characters for other languages. See the documentation for the applications you use for further details.
Another solution is to choose a keyboard layout that puts foreign characters in convenient locations-- usually the places they would be on the keyboards used in the country or countries where the language you are typing is spoken. See the next section for more details about keyboard layouts.
Third-party accessories are also available. One shareware option is 3-D Keyboard, from Fingertip Software (www.fingertipsoft.com), which shows a map of your current keyboard layout and also lets you design your own layouts. For example, you could designate that the Alt key in combination with the o key will produce an ö, ó, o, or any other character you prefer. See Chapter 7, "Creating and Converting Multilingual Resources," for more information. Other products that allow you to create macros also allow you to choose certain keys or key combinations that will type special characters for you.
Because the methods for entering special language characters can vary so much-- and because using some key combinations to type characters can interfere with the keyboard equivalents for commands in some applications-- some users prefer memorizing the ASCII codes (such as Alt+0225 for á), or keeping a list of the numerical codes handy. Cumbersome though this may be, these codes work regardless of the application you are using. The codes for special characters required by Western European languages are included in Table 1.1.
The Multilanguage Capabilities of Windows
If you are searching for additional information online or in Windows Help about multilingual capabilities, it is important to know that Microsoft usually refers to multilingual features as Multilanguage capabilities, or occasionally as international language support. Use these words and phrases as search keywords in addition to multilingual or the name of the language you are interested in.
Foreign Character Basics: Mac OS
The Mac OS already has an accessory similar to 3-D Keyboard for Windows built in: Key Caps, which is found under the Apple menu.
Key Caps, shown in Figure 1.2, displays the characters you get by typing particular key combinations. Press the Shift key, and it shows you uppercase letters. Press the Option key, and it shows you the characters you get when you type letters in combination with the Option key.
Some characters, such as the German letter Eszett (ß) (also known as a scharfes S in Austria and some parts of Germany) have their own unique key combinations, but for diacritics such as accent marks that can appear over a number of letters, you press a series of keys in combination (key combinations such as Option-e, which produce no character on their own but modify the next letter typed, are sometimes called dead keys). For example, to put an acute accent (') over a particular letter, you type Option-e and then the letter. To type the letter e with an acute accent (é), you type Option-e, then e. To type the letter a with an acute accent (á), you type Option-e, then a. Combinations for other diacritics are shown in Table 1.2.
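The two-step sequence described above (a dead key, then a letter) can be modeled as combining a diacritic with a base letter. This is a hedged Python sketch rather than a description of how the Mac OS implements it: the DEAD_KEYS table and the compose function are hypothetical names, only the Option-e entry comes from the text, and the other entries are the customary Mac OS combinations, which should be checked against Table 1.2.

```python
import unicodedata

# Hypothetical table: dead-key combination -> Unicode combining mark.
# Only Option-e is taken from the text; verify the rest against Table 1.2.
DEAD_KEYS = {
    "Option-e": "\u0301",  # combining acute accent
    "Option-`": "\u0300",  # combining grave accent
    "Option-i": "\u0302",  # combining circumflex
    "Option-n": "\u0303",  # combining tilde
    "Option-u": "\u0308",  # combining diaeresis (umlaut)
}

def compose(dead_key: str, letter: str) -> str:
    """Return the single accented character for a dead key plus a letter."""
    combined = letter + DEAD_KEYS[dead_key]
    # NFC normalization merges the pair into one precomposed character.
    return unicodedata.normalize("NFC", combined)

e_acute = compose("Option-e", "e")
a_acute = compose("Option-e", "a")
print(e_acute, a_acute)
```

The result in each case is one character, consistent with the earlier point that the computer treats á as distinct from a.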
These key combinations do not vary from one Mac OS application to another. A complete listing of the key combinations required for the special characters of Western European languages is provided in Table 1.3.
As an alternative, third-party shareware solutions such as PopChar Pro (www.unisoft.co.at/products/popchar.html) can display a window with the entire character set for a given font and allow you to enter characters into your documents by clicking them in the PopChar window.
Typing the Euro and Other International Currency Symbols
The euro, the new unit of currency that 11 of the 15 nations in the European Union began to introduce on January 1, 1999-- and for which bills and coins are scheduled to come into circulation in 2002-- presents a problem for some computer users. Some can't yet type its symbol, which looks like a letter e with an extra horizontal line, as shown in Figure 1.3.
Fortunately, this is relatively easy to fix. Both Microsoft and Apple have already added the euro symbol to fonts that ship with Windows 98 and Mac OS 8.5. Microsoft, Adobe, Monotype, and others have also created freely downloadable fonts which include the euro symbol.
For more information and downloadable fonts, see Web sites from the following companies and organizations:
Adobe. www.adobe.com/type/eurofont.htm
Apple. http://developer.apple.com/technotes/tn/tn1140.html
European Union. http://europa.eu.int/euro/html/entry.html
Microsoft. www.microsoft.com/typography/faq/faq12.htm
Monotype Typography. www.monotype.com/html/oem/euro%5Ffont/download.html
It is also possible to add the euro symbol to your existing fonts with a product called EuroFonter from Pyrus N.A. (www.pyrus.com). See Chapter 7 for more information.
To type the euro symbol in Windows, type Alt+0128; on the Mac, Shift-Option-2. To type the symbol for Japanese yen (¥) in Windows, type Alt+0165; on the Mac, Option-y. For British pounds (£) in Windows, type Alt+0163; on the Mac, Option-3.
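The Windows Alt codes above are numeric positions in the system's character table. As a rough illustration, the Python sketch below looks them up under the assumption that the table is Windows code page 1252 (the usual Western European code page on U.S. English Windows); ALT_CODES and alt_code_to_char are hypothetical names used only for this example.

```python
# Alt codes quoted in the text, keyed by currency name.
ALT_CODES = {"euro": 128, "yen": 165, "pound": 163}

def alt_code_to_char(code: int, encoding: str = "cp1252") -> str:
    """Decode one byte value as a character in the given code page."""
    # Assumption: cp1252 is the active code page behind Alt+0nnn entry.
    return bytes([code]).decode(encoding)

symbols = {name: alt_code_to_char(code) for name, code in ALT_CODES.items()}
for name, symbol in symbols.items():
    print(name, symbol)
```

A system using a different code page would map the same numbers to different characters, which is one reason these codes are tied to the Western European setup described in this chapter.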
NOTE There is an error in the Windows 98 Character Map utility, which Microsoft says it will fix in a forthcoming "service pack." The key combination Ctrl+2 does not produce the euro symbol; Alt+0128 does.
For most other currencies it's generally possible to "cheat" by simply typing an abbreviation, but if you want the real symbols used for your naira, sheqels, and cruzeiros, you'll probably need to get a font designed for the language of the country whose currency you are working in. For more information about custom fonts, see Chapter 7.
For inputting more than just a few special characters-- and for users who are used to native keyboard arrangements that make these characters easier to type-- you may want to use a different keyboard layout.
This doesn't mean an actual physical keyboard, but rather a software keyboard that determines which keys produce which characters. The standard U.S. keyboard layout is set up to match the keys on a U.S. keyboard, but there is no reason that this can't be changed. This is what different keyboard layouts do.
For example, on French keyboards, the positions of the letters q and a are switched, the w and z are switched, and characters such as é, è, ç, à, and ù are placed for convenient typing without needing to press combinations of keys. Instead of a QWERTY keyboard (named for the first six letters on a standard U.S. keyboard), this keyboard is called AZERTY.
Similarly, German has a keyboard called QWERTZ, for its switch of the letters y and z. This is also the same principle behind the Dvorak keyboard, which some people use for better typing efficiency.
Keyboards are independent of content and do not behave like fonts or formatting. Once text is input, no indication remains of what keyboard layout was used to type it. In other words, with fonts and formatting, if you insert your cursor in the middle of an italicized word in the font Courier, whatever you type will be in italic Courier. But if you insert your cursor in the middle of a sentence typed in the French AZERTY keyboard, you might not be typing in French AZERTY unless you deliberately choose this keyboard. You'll know there is a problem if the wrong letters appear when you type. For example, if you think you are using French AZERTY but are not, you'll keep getting the letter q when you really want an a.
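A software keyboard layout can be thought of as a lookup table from physical keys to characters. The Python sketch below illustrates the idea using only the two swaps named in the text (q with a, w with z); it is a deliberately partial, hypothetical mapping, and a real AZERTY layout moves other keys as well.

```python
# Partial, illustrative mapping: character printed on a U.S. key ->
# character produced when the French AZERTY layout is active.
QWERTY_TO_AZERTY = {"q": "a", "a": "q", "w": "z", "z": "w"}

def type_on_layout(keys: str, layout: dict) -> str:
    """Return the text produced by pressing the given U.S.-labeled keys."""
    # Keys missing from the table are assumed to be unchanged.
    return "".join(layout.get(key, key) for key in keys)

typed = type_on_layout("qwerty", QWERTY_TO_AZERTY)
print(typed)
```

The output is an ordinary string with no record of which layout produced it, which is why text typed under one layout gives no hint about the layout to use when you resume typing.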
It is possible to get an actual physical alternative to the standard U.S. keyboard. You can buy these in the countries where the keyboards are used most often, but if a trip abroad isn't in the works, there are options for ordering alternative keyboards. CEL Tech Services (www.celtech.net) makes keyboards (for IBMs only, not PC clones) for a variety of languages. They require no additional software, just a switch of the default keyboard in Windows. Several dozen other options (for PCs and Macs) are also available through California-based World Language Resources (www.worldlanguage.com), and Fingertip Software (www.fingertipsoft.com) sells sets of keytop labels for labeling an existing keyboard, plus a variety of physical keyboards (mostly for PCs).
Let the User Choose Keyboard Layouts
The fact that many languages have more than one possible keyboard layout means that users should be able to choose their own layouts. The differences between one layout and another may be quite significant. For example, Russian users may be used to the layout used on Russian typewriters, but they may also be accustomed to one of several layouts that map Cyrillic characters to their approximate phonetic equivalents on a U.S. English keyboard. Just providing any keyboard layout for one particular language won't do much good if it's not one that the user knows how to use.
The Keyboard Properties control panel in Windows 98 provides access to more than 70 different keyboards. All of these are for European languages that use the Latin alphabet, in addition to a few keyboards for languages that use the Cyrillic alphabet, such as Russian and Bulgarian. (For information on other languages, such as Japanese, see the last section in this chapter, Computing Support for Other World Languages and Language Groups.)
The Keyboard Properties control panel, shown in Figure 1.4, gives the option of adding any of these keyboards to your system by installing them from the Windows 98 CD. Some languages have more than one keyboard option (Spanish has more than a dozen, for the various nations of Latin America), and some keyboards have different options available by clicking the Properties button, such as the option of using a Dvorak key layout in the U.S. keyboard.
When you have installed all the new keyboards you want, be sure to check the "Enable indicator on taskbar" check box. This displays the keyboards that you have available on the right side of your taskbar, as shown in Figure 1.5. The current keyboard is designated by a two-letter code, such as En for English, and you can choose from a menu of installed keyboards by clicking the keyboard indicator.
A favorite general choice for those who aren't familiar with foreign keyboards but who still need to type characters for Western European languages is the U.S. International keyboard. It contains most of the characters you'll need for Western European languages in relatively convenient and intuitive locations. You may want to make the U.S. International keyboard your default keyboard.
Mac OS Keyboards
The Mac OS comes with 21 built-in keyboards for languages that use the Latin alphabet. To use them, simply go to the Keyboard control panel as shown in Figure 1.6, and select whatever keyboard you need. When more than one is selected, the flag of the current keyboard appears automatically in the upper right corner of the screen near the Applications menu.
You can switch between keyboards by pulling down the Keyboard menu, as shown in Figure 1.7, or you can select Options to set up key combinations for choosing from among the available keyboards. The default key combination for doing this is Command-Space.
Mac OS 8.5 doesn't include as many keyboard options as Windows 98, but all the major Western European languages are covered. Freeware and shareware options are generally available for the rest. In addition, while Windows 98 includes support only for languages that use the Latin and Cyrillic alphabets, Mac OS 8.5 includes limited built-in support for the major Asian and Middle Eastern languages. For more information, see the last section of this chapter, Computing Support for Other World Languages and Language Groups.
If you aren't already familiar with font installation in Windows, see Windows Help. On the Mac OS, fonts are installed by simply dragging and dropping them into the System Folder.
For languages with writing systems similar to English-- those that read left to right, have alphabets with only a few dozen letters, and so forth-- simply installing a font, keyboard layout, and a System script will be enough to allow you to create documents in that language. Czech, for example, has several characters not found in the extended ASCII character set, but simply adding a Czech font, keyboard layout, and script is all that is needed to begin working in Czech.
For fonts that are not included with your system, one source of fonts is Linguist's Software (www.linguistsoftware.com), which sells fonts that can be used with more than 365 languages. Fonts are available for Windows and Macintosh, and in some cases for DOS, Unix, OS/2, NeXT, and other operating systems as well. Other commercial sources of fonts include Ecological Linguistics (P.O. Box 15156, Washington, DC 20003, 202-546-5862); Monotype (www.monotype.com), which creates modules of fonts and scripts for a customer's required language support; FontWorld (www.fontworld.com); and Adobe (www.adobe.com).
There are also many freeware and shareware sources of fonts. A good place to start looking is the Web site for the Yamada Language Center at the University of Oregon, which maintains a thorough catalog of freely downloadable fonts (mostly for the Macintosh) at http://babel.uoregon.edu/yamada/fonts.html. Another list of links to downloadable fonts (some free, some for a fee) with more Windows options is the Fonts in Cyberspace page, maintained by the Summer Institute of Linguistics at www.sil.org/computing/fonts/index.htm.
Buying a font works well for individual work-- for example, the word processing of documents that will be printed and distributed on paper-- but many fonts are copyrighted and cannot be freely distributed. For sharing multilingual information with others who may not have the same fonts that you do-- or who may not be able to use them because they are on another platform-- see Chapter 3 and Chapters 9 through 11.
Other languages, however, have features that are difficult or impossible for applications designed for the Latin alphabet to handle. Hebrew, for example, is written from right to left. In Arabic, the appearance of characters varies depending on their position within words. Chinese has thousands of characters, and no simple keyboard-layout rearrangement allows you to type them all.
Advice for working in these languages is provided by language or major language group in the following section.
Computing Support for Other World Languages and Language Groups
When you need to go beyond the languages that originated in Western Europe, you will generally need to install additional software. The way to do this depends on the platform you are using. Although Microsoft is reportedly making significant moves to offer better multilingual support in Windows 2000, at the present time, Apple and Microsoft differ in their approach to multilingual issues.
Apple's approach has been to make it relatively easy and convenient to get multilingual support for one or more languages, and to be able to use them singly or in combination with one another on any version of the Mac OS (English, French, Hebrew, and so forth).
Apple's support for non-Latin alphabets is based on a technology called WorldScript, which allows software developers to easily add multilingual capabilities to their applications, such as the ability to enter text from right to left. These capabilities are made available to users through a family of language kits (www.apple.com/macos/multilingual/languagekits.html). More than one language kit can be installed on the same computer, as shown in Figure 1.8, which enables support for truly multilingual documents. Each kit comes with a variety of keyboard layouts and transparent labels to attach to your actual physical keyboard, as well as an assortment of TrueType, PostScript, and bitmapped fonts.
Apple has been faulted for deviations from accepted standards for some languages-- Apple's Czech fonts, for example, do not use an encoding called Latin 2 that has been established for Eastern European languages by the International Organization for Standardization (ISO)-- but applications are now frequently written in a way that takes care of conversions from one encoding scheme to another. The conversion takes place behind the scenes, and the user may not even be aware of the issue. More information is available in Chapter 3.
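The kind of behind-the-scenes conversion described above can be sketched in Python. This example assumes that the Apple encoding in question corresponds to the Mac Central European table that Python exposes as the mac_latin2 codec, and that ISO Latin 2 corresponds to the iso8859_2 codec; both mappings are assumptions made for illustration, not statements about any particular application.

```python
text = "\u010de\u0161tina"  # the Czech word "čeština"

# The same characters are stored as different byte values in each scheme.
iso_bytes = text.encode("iso8859_2")   # ISO Latin 2
mac_bytes = text.encode("mac_latin2")  # assumed Apple Central European table

# Conversion: decode with the source scheme, re-encode with the target.
converted = mac_bytes.decode("mac_latin2").encode("iso8859_2")

print(iso_bytes != mac_bytes)   # the raw bytes differ
print(converted == iso_bytes)   # after conversion they agree
```

Because both schemes use one byte per character, the converted text has the same length; only the byte values for the accented letters change.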
Apple also provides significant built-in language support in the Mac OS for the Internet through a feature called Multilingual Internet Access. Apple first included Multilingual Internet Access on the Mac OS 8.5 CD, which lets you install support for viewing languages that use the following writing systems: Arabic, Indian (Devanagari, Gujarati, and Gurmukhi), Hebrew, Japanese, Korean, and Chinese (simplified and traditional). Input support is also included for all of these languages except Chinese, Japanese, and Korean. For these, it is currently necessary to buy the full Chinese, Japanese, or Korean language kits.
To install Multilingual Internet Access, run Mac OS Install from the Mac OS CD, and when prompted, choose Add/Remove and scroll down the list of options to Multilingual Internet Access. To view languages that use non-Latin alphabets on Macs running an older version of the Mac OS, it is necessary to buy the full Apple language kits for those languages in order to get support.
For additional information about multilingual support from Apple, see also the links at www.apple.com/macos/multilingual/webinfo.html for Mac-oriented language-specific advice.
Microsoft offers built-in support for more languages in its U.S. version of Windows (namely for the languages of Eastern Europe) than Apple does in the Mac OS, but beyond this, Microsoft recommends using localized versions of the Windows operating system for work in languages that do not use the Latin alphabet. This is an approach that can be cumbersome, time-consuming, and expensive because it is necessary to track down, install, and switch back and forth between multiple language versions of Windows if you want to use more than one non-Latin language on your computer.
Although Microsoft discourages the use of language layers on top of the operating system (warning that their independent development has tended to fragment language support), there are nonetheless a variety of third-party solutions, some of which offer all-in-one language functionality similar to that of Apple's language kits. Details are provided in the following sections. The advice given for the languages covered in this chapter is cross-platform, unless specified otherwise. Not all languages are included, but as much detail as possible is given for the world's major languages and language groups.
Software companies generally sell the localized versions of their products only in the countries they have been localized for. In other words, you won't have much luck trying to get a localized Japanese version of Windows directly from Microsoft (or mainline PC software vendors) unless you buy it in Japan, and although Apple's family of language kits is available worldwide, Apple officially sells localized versions of the Mac OS only in the countries they were created for.
This doesn't mean that these localized products are impossible to get elsewhere. Whenever possible, options for obtaining them outside of the countries for which they were produced are listed. Most are available through third-party vendors who specialize in international products.
Fonts for the languages discussed in the following sections (and many others) are available from companies including Adobe (www.adobe.com), Ecological Linguistics (ecoling@aol.com), Linguists Software (www.linguistsoftware.com), and Monotype (www.monotype.com).
One Byte or Two?
Reading about multilingual computing issues, you may come across references to one-byte and two-byte languages. One-byte languages are languages that contain fewer than 256 characters, which means that they can be represented by 8 bits, or 1 byte. English, German, and Russian are examples of one-byte languages. Two-byte languages require 2 bytes of memory, which allows for as many as 65,536 characters. This accommodates most of the world's languages, but when a language has thousands of characters instead of dozens, it becomes more complex to handle. Two-byte languages may also have more than one script. Japanese, for example, uses two phonetic scripts (kana) with more than 160 unique characters in each, as well as an ideographic script (kanji) with thousands of characters. Phonetic scripts describe pronunciation, whereas in ideographic scripts, each character has a specific meaning.
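The one-byte versus two-byte distinction can be seen by counting the bytes that a single character occupies in a typical encoding for its language. The Python sketch below is illustrative only; the choice of encodings (Latin-1 for English and German, KOI8-R for Russian, Shift JIS for Japanese) is an assumption, since the text does not name specific encodings.

```python
# (label, sample character, assumed encoding for that language)
SAMPLES = [
    ("English", "A", "latin-1"),
    ("German", "\u00df", "latin-1"),           # ß
    ("Russian", "\u042f", "koi8-r"),           # Я
    ("Japanese kana", "\u3042", "shift_jis"),  # あ
    ("Japanese kanji", "\u6f22", "shift_jis"), # 漢
]

# Number of bytes each sample character needs in its encoding.
byte_counts = {label: len(ch.encode(enc)) for label, ch, enc in SAMPLES}

for label, count in byte_counts.items():
    print(label, count)

# Capacity of one and two bytes, matching the figures in the text.
print(2 ** 8, 2 ** 16)
```

The first three samples fit in one byte each, while both Japanese samples need two, which is the practical meaning of the terms used above.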
Except for Arabic (see the Middle Eastern Languages section later in this chapter) and Amharic (the national language of Ethiopia), most of the languages of Africa are written in variants of the Latin alphabet. This makes their use on computers relatively easy, since it is only necessary to install fonts that include special characters and keyboards for ease of input.
The font TransRoman, available from Linguists Software (www.linguistsoftware.com), has the characters needed for more than two dozen African languages, including Afrikaans, Ashanti, Masai, Swahili, and Yoruba. The font AfroRoman, also from Linguists Software, has characters needed for some of the same African languages, in addition to others, such as Hausa. Both are available for Windows and the Mac OS.
An Afrikaans keyboard is included on the Windows 98 CD.
Tips for Avoiding Trouble
With more and more options built into operating systems and applications for working in other languages-- or several languages at once-- it can be tempting to install all the language support that is available, so that every one of your machines is ready to work in nearly any language. For machines that are used by more than one person, this would allow input or viewing in any language the user desires. This may not be as good an idea as it sounds at first. Most of the time, you can get away with installing support for multiple languages and writing systems on the same computer without any problems. But to be on the safe side, remove (or better yet, don't install in the first place) support for any language you don't need. It's a good bet that quality-assurance testing of, say, Arabic and Chinese support running side by side hasn't been as thorough as for most other elements of your system. Truly baffling conflicts can arise and cause problems such as control panels appearing in other languages, or printers spewing forth the wrong alphabet. This doesn't mean that it's necessary to segregate languages, but keeping your language-support options as simple as possible will help avoid needless conflicts.
East Asian Languages
All three of the major East Asian languages (Chinese, Japanese, and Korean) have complex writing systems with thousands of characters, requiring special input methods.
Support for all three of the major East Asian languages, plus other languages, is offered in one Windows product that allows you to view and work in all three: WinMASS 2000, from Singapore-based Star+Globe Technologies (www.starglobe.com.sg). WinMASS 2000 lets you add these capabilities to off-the-shelf applications within the English Windows environment. A similar Windows product called AsianSuite is available from UnionWay (www.unionway.com).
Windows support for Traditional and Simplified Chinese, Japanese, and Korean-- including both display capabilities and input method editors (IMEs)-- is also available through a well-regarded product called NJStar Communicator (www.njstar.com), as well as Microsoft's freely downloadable Global IME (www.microsoft.com/windows/ie/features/ime.asp).
Because of similarities in writing systems, in some cases it is also possible to use support for one language to work in another. For example, some Chinese and Korean input options also provide limited support for entering Japanese.
Generally, however, it is necessary to get separate support for each individual language. Use of a Japanese operating system for Chinese work, for example, can cause trouble when printing because characters get mapped to Japanese-encoding equivalents, which means you don't get the characters on paper that you were expecting. "People do need to be careful about what they combine on their computers, since the results are often unpredictable," says Laurel Mittenthal, foreign language computing specialist for the Faculty of Arts and Sciences Computer Services at Harvard University.
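The mapping problem described above, where Chinese text comes out as the wrong characters under a Japanese system, can be imitated by decoding bytes with the wrong encoding. The Python sketch below assumes GB2312 for the Chinese text and Shift JIS for the Japanese system; both are illustrative choices, not a description of any specific printer or operating system.

```python
chinese = "\u4e2d\u6587"  # 中文, "Chinese (written language)"

# Bytes as a Chinese (GB2312) application would store the text.
gb_bytes = chinese.encode("gb2312")

# A Japanese system interpreting the same bytes as Shift JIS.
misread = gb_bytes.decode("shift_jis", errors="replace")

print(chinese, "->", misread)
print(gb_bytes.decode("gb2312") == chinese)  # correct encoding recovers it
```

The bytes themselves are unchanged; only the interpretation differs, so the fix is to use matching language support rather than to retype the text.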
The main options for individual East Asian language support are described in the following sections.
When adding support for the Chinese language's thousands of characters, it is frequently necessary to distinguish between the Traditional and Simplified Chinese scripts. The Simplified script is the result of an effort that began in the 1950s to simplify the complex Chinese writing system-- some Traditional characters require as many as 33 strokes to write by hand. All Chinese characters are drawn to fit within an invisible square frame.
Today, Simplified Chinese is generally used in the People's Republic of China and Singapore, while Traditional is used in Taiwan and by Chinese communities in other countries.
When Chinese is written out in the Latin alphabet, the system of Romanization most commonly used today is called pinyin.
Dialects of Chinese such as Mandarin and Cantonese use the same writing system and are written identically, even though in speech they are not mutually intelligible.
Traditional and Simplified Chinese versions of the Windows operating system itself can be purchased through AsiaSoft (www.asiasoft.com) and World Language Resources (www.worldlanguage.com). Versions of products in Chinese are generally designated with the letter C after the name, or CS for Chinese Simplified or CT for Chinese Traditional, as in Microsoft Windows 98CT.
Support for Chinese in English versions of Windows is available through Chinese Partner from TwinBridge (www.twinbridge.com). Chinese Partner supports Windows 95 and 98 and Unicode-compliant applications, and includes several Traditional and Simplified fonts, in addition to eight Unified Chinese fonts that include both the Simplified and Traditional Chinese character sets. Chinese Partner also supports several input methods and can convert to and from several Chinese encoding standards, such as GB, Big Five, GBK, Big Five Plus, and Unicode.
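Conversion between encoding standards of the kind listed above generally works by passing through Unicode. The sketch below illustrates that idea with Python's built-in codecs; it is not how Chinese Partner itself is implemented. The `transcode` helper and the sample text are hypothetical, and only the codec names (`big5`, `gbk`, `utf-8`) come from Python's standard library.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Re-encode bytes from one character encoding to another via Unicode."""
    text = data.decode(source)    # bytes in the source encoding -> Unicode text
    return text.encode(target)    # Unicode text -> bytes in the target encoding

# Hypothetical sample: two characters present in both Big Five and GBK.
sample = "中文"
big5_bytes = sample.encode("big5")

gbk_bytes = transcode(big5_bytes, "big5", "gbk")
utf8_bytes = transcode(gbk_bytes, "gbk", "utf-8")

print(big5_bytes.hex(), gbk_bytes.hex(), utf8_bytes.hex())
```

Converting the encoding does not convert Traditional characters to Simplified ones or vice versa; it only changes how the same characters are stored. A character that the target encoding does not contain makes `encode` raise `UnicodeEncodeError`.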
Another Chinese option for Windows is Chinese Star from SunTendy America (www.suntendyusa.com), which supports Windows 3.x and higher and Windows NT.
Yet another Chinese input option is Motorola's Wisdom Pen (www.mot.com/MIMS/lexicus), a Chinese handwriting-recognition system that transforms characters written with a stylus on a pressure-sensitive tablet to digital characters.
Simplified and Traditional Chinese versions of the Mac OS itself can be purchased from AsiaSoft (www.asiasoft.com) and World Language Resources (www.worldlanguage.com).
Other versions of Mac OS 8.5 (including U.S. English) include display-only support for Chinese Web browsing through the Multilingual Internet Access option, which can be installed from the Mac OS 8.5 CD.
Apple's Chinese Language Kit (www.apple.com/macos/multilingual/chinese.html) provides support for Simplified and Traditional characters, with six fonts and a variety of input methods, including Pinyin, Zhuyin, and Cangjie. Character entry is made easier by automatic prompts for the most likely phrases beginning with the last character you have entered. You can also enter characters through handwriting or voice with the Apple Advanced Chinese Input Suite (www.asia.apple.com/datasheets/as/acis.html), which must be purchased separately.
Some users have expressed concerns about the Chinese Language Kit's inability to convert to and from certain encoding schemes, but generally the kit gets high marks.
The Chinese Language Kit displays the menus of localized applications in Chinese and also allows you to work with Chinese in other applications that support Apple's WorldScript technology. Documentation is in English and Chinese.
Getting Localized Apple OSs and Language Kits
One way to get updates of language kits and localized versions of the Mac OS is by subscribing to the Apple Developer Connection Mailing (developer.apple.com/programs/mailing.html). The monthly CD includes software updates, including many of interest to multilingual users, as they become available. The entire family of Apple's language kits is also bundled with the multilingual publishing application Ready, Set, Go! Global, detailed in Chapter 10, "Multilingual Publishing, Graphic Design, and Multimedia."
Japanese writing uses three scripts, kanji, hiragana, and katakana, in addition to a widely used Latinization called romaji. Kanji is based on Chinese ideographic characters, while kana (hiragana and katakana) is syllabic. It is common to write in a mix of both kana and kanji.
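In Unicode, the three scripts occupy separate code-point ranges, so mixed text can be sorted out character by character. The sketch below assumes the basic Hiragana, Katakana, and CJK Unified Ideographs blocks only and ignores extension blocks and half-width forms; the function name and sample string are illustrative.

```python
def japanese_script(ch: str) -> str:
    """Classify one character by its Unicode block (basic blocks only)."""
    code = ord(ch)
    if 0x3040 <= code <= 0x309F:
        return "hiragana"
    if 0x30A0 <= code <= 0x30FF:
        return "katakana"
    if 0x4E00 <= code <= 0x9FFF:
        return "kanji"
    return "other"

# Hypothetical sample mixing kanji, hiragana, and katakana.
sample = "日本語のテキスト"
labels = [(ch, japanese_script(ch)) for ch in sample]

for ch, script in labels:
    print(ch, script)
```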
Japanese versions of the Windows operating system itself can be purchased through AsiaSoft (www.asiasoft.com) and World Language Resources (www.worldlanguage.com). Versions of products in Japanese are generally designated with the letter J after the name, as in Microsoft Windows 98J.
One of the main options for Japanese support in English versions of Windows is Pacific Software Publishing's KanjiKit 97 (www.pspinc.com/lsg). KanjiKit 97 is a Japanese utility for English versions of Windows that allows you to use the Internet and email in Japanese with Netscape or Internet Explorer, and to use Japanese in the English versions of Microsoft Word, Excel, and other Unicode-compliant applications. KanjiKit 97 includes the Katana FEP (front-end processor, an input method), plus Mincho and Gothic TrueType fonts. An additional font pack is also available. KanjiKit 97 runs on Windows 3.x or higher and Windows NT. A 15-day trial version can be downloaded from Pacific Software Publishing's site.
Another source of Japanese support in English versions of Windows is Japanese Partner, from TwinBridge (www.twinbridge.com). Japanese Partner allows you to work with Japanese characters (kanji and kana) in the standard English Windows environment, and a variety of input methods are included. The basic version includes four Japanese fonts, and an extended version offers 15 additional fonts and full compatibility with Microsoft Office.
J-Text Pro 1.1 from NeocorTech (www.neocor.com) is a basic $39 Japanese word processor for Windows 95/98 and NT. J-Text Pro lacks the full features of a standard word processor like Microsoft Word, but it allows basic word processing in Japanese because it comes with its own fonts and front-end processor for entering Japanese characters. Some reviewers say the J-Text input method for Japanese characters takes some getting used to, but one of J-Text's advantages is that it requires no additional support.
The Japanese version of the Mac OS itself can be purchased from AsiaSoft (www.asiasoft.com) and World Language Resources (www.worldlanguage.com). Versions of products in Japanese are generally designated with the letter J after the name, as in Mac OS 8.5J.
Other versions of Mac OS 8.5 (including U.S. English) include display-only support for Japanese Web browsing through the Multilingual Internet Access option, which can be installed from the OS 8.5 CD.
Apple's full Japanese Language Kit (www.apple.com/macos/multilingual/japanese.html), which can be installed with any language version of the Mac OS in addition to other language kits, includes the Kotoeri input method, along with three kanji fonts and a Roman and kana keyboard layout. When the kit is installed, applications that have been localized for Japan will display Japanese menu bars. Documentation is in English and Japanese.
The kit handles Japanese text well but does not by itself allow you to type text vertically. The kit can be used with applications supported by Apple's localized Japanese system as well as with English applications that support Apple's WorldScript technology, such as the word processor Nisus Writer.
Korean writing consists of two scripts, Hangul and Hanja. Hanja refers to Chinese ideographic characters, while Hangul is an alphabet consisting of 10 vowels and 14 consonants that was invented for Korean 500 years ago but not widely accepted until the twentieth century. It is possible for Hanja and Hangul characters to be used in the same text.
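Because Hangul is an alphabet, each written syllable block is built from individual letters (jamo). Unicode normalization makes this visible. The sketch below uses Python's standard `unicodedata` module to split one syllable into its letters and put it back together; the sample syllable is arbitrary.

```python
import unicodedata

# Hypothetical sample: a single precomposed Hangul syllable.
syllable = "한"

# NFD normalization decomposes the syllable into its component letters (jamo).
jamo = unicodedata.normalize("NFD", syllable)
names = [unicodedata.name(j) for j in jamo]

# NFC normalization recomposes the letters into the original syllable block.
recomposed = unicodedata.normalize("NFC", jamo)

for j, name in zip(jamo, names):
    print(f"U+{ord(j):04X} {name}")
print(recomposed)
```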
Korean versions of the Windows operating system can be purchased through AsiaSoft (www.asiasoft.com) and World Language Resources (www.worldlanguage.com). Versions of products in Korean are generally designated with the letter K after the name, as in Microsoft Windows NT 4.0K Workstation.
One of the main options for Korean support in English versions of Windows is Korean Partner, from TwinBridge (www.twinbridge.com). Korean Partner includes four TrueType Korean fonts, an electronic Korean dictionary, and a code converter and font editor. Korean Partner is available for Windows 3.x and higher.
Another option is Union Way's Hangul Pro Pack (www.unionway.com), which allows you to read and input Korean characters in the standard English Windows environment, including Microsoft Office, Publisher, WinFax Pro, and other applications. Hangul Pro Pack is available for Windows 3.x and higher.
The localized Korean version of the Mac OS can be purchased from AsiaSoft (www.asiasoft.com) and World Language Resources (www.worldlanguage.com). Versions of products in Korean are generally designated with the letter K after the name, as in Mac OS 8.5K.
Other versions of Mac OS 8.5 (including U.S. English) include display-only support for Web browsing in Korean through the Multilingual Internet Access option, which can be installed from the OS 8.5 CD.
The full Korean Language Kit (www.apple.com/macos/multilingual/korean.html) supports both Hangul and Hanja and can also display the menus of applications in English or Korean if they are designed for this possibility. It includes two keyboard layouts for native speakers and two romaja modes for easy Korean input by nonnative speakers.
The Korean Language Kit also includes the Hanja Dictionary Utility, which allows a user to create a personal Hanja dictionary. The kit includes five Korean TrueType fonts that include more than 2,000 symbols and 4,888 Hanja characters, as well as an extended symbol character set and support for Japanese hiragana and katakana characters. Documentation is in English and Korean.
A Korean font package with 27 fonts for Macintosh is available from World Zusson (www.worldzusson.com).
Eastern European Languages
Many of the languages of Eastern Europe (including Czech, Polish, and Rumanian) use modified versions of the Latin alphabet, and so basic support generally requires only a font and keyboard layout. Things are more complicated, however, for languages including Russian, Bulgarian, Serbian, and Macedonian (and some of the non-Slavic languages of the former Soviet Union, such as Uzbek and Kazakh, although many of these languages are returning to their original writing systems), which all use the Cyrillic alphabet and therefore require Cyrillic fonts and keyboard layouts.
Support for the Cyrillic alphabet is also complicated by several competing standards, including Windows Cyrillic, Unicode, and the Russian KOI8 (for Code of Information Exchange) standards. In addition, not every Cyrillic font contains all the characters needed for every language that uses the Cyrillic alphabet, so you may need to use fonts designed specifically for the languages you are working in. For more details, including information on support for other operating systems, see the Russify Everything page at www.siber.com/sib/russify.
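What "competing standards" means in practice is that the same letter is stored as different byte values under each one. The sketch below makes this concrete using Python's built-in codecs; it assumes `cp1251` for Windows Cyrillic, `koi8_r` for the Russian KOI8 variant, and UTF-8 as the Unicode encoding form, and the sample letter is arbitrary.

```python
# Hypothetical sample: one Cyrillic capital letter.
letter = "Я"

# Codec names are Python's; the mapping to the standards named in the
# text (Windows Cyrillic, KOI8, Unicode) is an assumption of this sketch.
encodings = ["cp1251", "koi8_r", "utf-8"]
encoded = {name: letter.encode(name) for name in encodings}

for name, data in encoded.items():
    print(f"{name:8} {data.hex()}")
```

A file written under one standard and read under another therefore shows the wrong letters, which is why software and fonts need to agree on the standard in use.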
The Windows 98 CD includes support both for Eastern European languages that are based on the Latin alphabet and for those that use Cyrillic. However, some of these keyboard layouts lack phonetic options that may be very helpful to non-native typists.
Sources of additional support, such as keyboard layouts, converters, fonts, and proofreading tools, include the Cyrillic Starter Kit and Central European Starter Kit, both from Fingertip Software (www.cyrillic.com). Links to additional sources of keyboard layouts, coding converters, and other tools and advice are provided through a page maintained by Paul Gorodyansky, at http://ourworld.compuserve.com/homepages/Paul_Gorodyansky.
Fully localized versions of Windows for Russian and other Eastern European languages are available from World Language Resources (www.worldlanguage.com).
Mac OS support for Eastern European languages that use the Latin alphabet is available from commercial providers of fonts, or sometimes as shareware and freeware. Common sources for these fonts are the localized versions of Mac OS 7.0.1 that Apple makes available by FTP from ftp://ftp.info.apple.com/Apple_Support_Area.
For languages such as Russian and Bulgarian, which use the Cyrillic alphabet, it is necessary to get a copy of Apple's Cyrillic Language Kit (www.apple.com/macos/multilingual/cyrillic.html), which includes support for Russian, Ukrainian, Bulgarian, Belorussian, Macedonian, and Serbian. Several TrueType fonts are included, as are keyboard layouts for typists familiar with native Russian keyboards and those who are using Latin keyboards.
Fonts in Apple's Cyrillic Language Kit lack accented vowels, but these can be added with shareware fonts such as Nevsky (ftp://ftp.brown.edu/pub/language_lab/Nevsky.sea.bin). For additional details about Russian, see the Russification of the Macintosh page at http://solar.rtd.utk.edu/partners/rusmac.
Indian languages use many writing systems, several of which may be used in the same country. The government of India, for example, makes official use of 12 different writing systems.
Devanagari is used by languages such as Hindi, Marathi, and Nepali (as well as Sanskrit, from which they descended). Devanagari has 52 basic symbols and is written from left to right. Devanagari also has its own set of symbols for numerals, but today Arabic numbers are typically used.
Gujarati is used for the languages Gujarati and Kacchi, has 51 basic characters, and is written from left to right. Gurmukhi, used by the Panjabi language, has 55 characters and is also written from left to right. Malayalam has 50 characters, is written left to right, and is used by languages including Tamil and Telugu.
Can't Find What You're Looking For?
Email discussion lists and other Internet-based resources exist for many of the world's languages, and these can be a great place to find out more about the computing issues that arise in one language or another. One source is the Internet Resources for Language Teachers and Learners page, maintained by the University of Hull in Britain at www.hull.ac.uk/cti/langsite.htm; another is the Human Languages Page at www.june29.com/HLP. Links to other online resources are also available through www.indigo.ie/egt/langlist.html, and through the Yamada Language Center at the University of Oregon, which maintains a catalog of these lists at babel.uoregon.edu/yamada/lists.html. Both Microsoft and Apple generally keep their overseas offices separate from operations in the United States, but for times when it may help to go straight to the source, a complete listing of Microsoft's overseas offices is available at www.microsoft.com/worldwide, and for Apple at www.apple.com.
A Perso-Arabic script is used for Urdu, the official language of Pakistan. See the Middle Eastern Languages section later in this chapter for more information.
The diversity of Indian languages has led to a lack of standardization, but the main options are detailed below. For additional information about Hindi in particular, see the Hindi Language Resources site at www.cs.colostate.edu/~malaiya/hindilinks.html.
Although support for Tamil, Hindi, and other Indian languages that use Devanagari script is planned for Windows 2000, Microsoft and third-party support for these languages has generally been limited. One option is the Indian Language Keyboard Program from AC Zone (www.aczone.com/ilkeyb) for Windows 3.1 and higher and Windows NT, which provides fonts and support for Indian scripts such as Devanagari, Gujarati, and Malayalam. Another option is to use software such as OnePen (www.zem.co.uk/mlsoft/onepen.htm), which allows you to insert non-Latin characters in most Windows applications.
Fonts for Indian writing systems and individual Indian languages are available from Monotype (www.monotype.com) and Ecological Linguistics (email: ecoling@aol.com).
Apple's Indian Language Kit (www.apple.com/macos/multilingual/indian.html) provides support for languages written in the Devanagari, Gujarati, and Gurmukhi writing systems. It can also be used to view Indian Standard Code for Information Interchange (ISCII)-compatible Indian language sites on the Web, and includes the default INSCRIPT keyboard layout, as well as a phonetic Latinized option for input on a QWERTY keyboard. The kit includes three TrueType Indian fonts, along with matching English typefaces.
Middle Eastern Languages
Two of the major Middle Eastern languages, Arabic and Hebrew, have special requirements because they are bi-directional languages and require both right-to-left and left-to-right input.
Arabic's writing system of 28 letters is written from right to left and is a cursive script in which most letters connect with the letters next to them. There are no block letters, nor is there any distinction between upper- and lowercase. However, each letter can have up to four forms (initial, medial, final, and separate) depending on its context, and unlike the rest of Arabic writing, numbers and other mathematical expressions are written from left to right.
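The four contextual forms can be inspected through Unicode, which assigns compatibility code points to each positional shape of a letter. The sketch below lists the four forms of the letter beh using Python's standard `unicodedata` module and shows that they all normalize back to the same base letter. Unicode calls the separate form "isolated." The choice of letter is arbitrary, and in normal text the shaping is done by the rendering software rather than by storing these code points.

```python
import unicodedata

# Presentation-form code points for the Arabic letter beh (base letter U+0628).
forms = ["\uFE8F", "\uFE90", "\uFE91", "\uFE92"]

for form in forms:
    print(f"U+{ord(form):04X} {unicodedata.name(form)}")

# Compatibility normalization (NFKC) maps every positional form back to
# the single underlying letter.
base_letters = {unicodedata.normalize("NFKC", form) for form in forms}
print(base_letters == {"\u0628"})
```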
Localized Arabic versions of the Windows operating system can be purchased from the AramediA Group (aramedia.com) or World Language Resources (www.worldlanguage.com). Two versions, enabled and localized, ship on the same CD. The enabled version supports Arabic text in an English environment, while in the localized version, the entire user interface is in Arabic. Both fully support the Arabic language and the Hijri calendar, which is based on lunar cycles.
An Arabic text-to-speech option called Reading Machine is available from Sakhr Software (www.sakhrsoft.com).
Apple's Arabic Language Kit (www.apple.com/macos/multilingual/arabicfarsi.html) includes support for Arabic and Persian, including seven Arabic fonts and six Persian fonts. It also supports the Hijri (lunar) calendar. The kit basically supports Urdu and other languages that use the Arabic writing system, although it lacks characters and diacritics for some of these languages. In these cases, it may be necessary to modify an existing Arabic font with Fontographer (see Chapter 7) or to buy third-party pan-Arabic fonts.
Like Arabic, to which it is related, Hebrew is a bi-directional language. Text is written from right to left, but numbers and mathematical expressions are written left to right as in English. Hebrew script has 22 letters, 5 of which have two shapes depending on their position within words. The Hebrew writing system is also used for languages such as Yiddish.
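Bidirectional behavior is driven by a direction class that Unicode assigns to every character. The sketch below prints that class for a mixed string using Python's standard `unicodedata.bidirectional` function. In its output, `R` marks right-to-left letters, `EN` marks European digits, and `L` marks left-to-right letters. The sample string is illustrative, and actual display order is decided by the rendering software's implementation of the Unicode bidirectional algorithm, which this sketch does not attempt.

```python
import unicodedata

# Hypothetical sample: a Hebrew word, a number, and a Latin word.
sample = "שלום 123 abc"

# Look up the bidirectional class of each non-space character.
classes = [(ch, unicodedata.bidirectional(ch)) for ch in sample if not ch.isspace()]

for ch, bidi_class in classes:
    print(f"U+{ord(ch):04X} {bidi_class}")
```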
Hebrew versions of the Windows operating system can be purchased from World Language Resources (www.worldlanguage.com). Two versions, enabled and localized, ship on the same CD. The enabled version supports Hebrew text in an English environment, while in the localized version, the entire user interface is in Hebrew.
Apple's Hebrew Language Kit (www.apple.com/macos/multilingual/hebrew.html) includes support for Hebrew and Yiddish, with six fonts for Hebrew with Yiddish characters and various keyboard options, including a QWERTY keyboard. The kit does not support cantillation marks (for musical recitation) and therefore may not be appropriate for religious texts.
The Windows 98 CD includes support for Turkish, which uses a modified version of the Latin alphabet. A selection of Turkish fonts for the Mac OS is available at the Multilingual Macintosh Support page at www.indigo.ie/egt/earra_bog/apple/index.html, or http://babel.uoregon.edu/FontLayout/FontMain.html, as well as from commercial font vendors.
The following languages do not fit easily into any of the language groups discussed above. Nonetheless, their use with computers requires some explanation.
Esperanto, an artificial language, uses a modified version of the Latin alphabet. For fonts, check www.homunculus.com/babel/Fonts/EspFonts.html and http://babel.uoregon.edu/yamada/fonts/esperanto.html.
Greek is written from left to right and requires no special support other than fonts and keyboard layouts. Fonts, however, may be either monotonic (reflecting a reform made in the 1980s to make Greek writing conform to a single accent) or the traditional polytonic.
Fonts and keyboards for Greek are included on the Windows 98 CD. For support on the Mac OS, try the Multilingual Macintosh Support page at www.indigo.ie/egt/earra_bog/apple/index.html, or babel.uoregon.edu/yamada/fonts/greek.html.
The font TransRoman, available from Linguists Software (www.linguistsoftware.com), has the characters needed for major Polynesian languages such as Fijian and Samoan, as well as Hawaiian. Windows support for Maori (spoken in New Zealand) is available from Reddfish (www.reddfish.co.nz/reddfish/kete.htm).
For More Information
For information about these and additional languages, an excellent source is the WorldType Solutions Catalogue from Monotype (www.monotype.com), which provides font samples of the company's type offerings for more than two dozen writing systems. More than this, however, the informative catalog is also a great primer on the world's languages.
After setting up your system for multilingual work, you'll also need applications that can handle various languages. These are discussed in the next chapter.
Meet the Author
Chris Ott is an expert on multilingual computing. He has written for Multilingual Computing & Technology, Independent Publisher, The Denver Business Journal, and the online magazine Salon. Chris also managed the technical resources of a university language center, participated in a foreign-language education project sponsored by the National Endowment for the Humanities, and has given presentations on language courseware at educational conferences.