Microsoft® Office XP Resource Kit

http://www.microsoft.com/office/ork  


Taking Advantage of Unicode Support

Microsoft Office XP is based on Unicode, an international character encoding standard, which allows users upgrading to Office XP to share documents across languages more easily. Unicode support in Office XP also allows users to read international documents created in any previous version of Office.

Office XP also provides the conversion tables necessary to convert code page–based data to Unicode and back again for interaction with previous applications. Because Office XP provides fonts to support many languages, users can create multilingual documents with text from multiple scripts.

Unicode support in Office XP means that users can copy multilingual text from Office 97 documents and paste it into any Office XP document, and the text is displayed correctly. Conversely, multilingual text copied from any Office XP document can be pasted into a document created in any Office 97 application (except Microsoft Access and Microsoft Outlook).

In addition to document text, Office XP supports Unicode in other areas, including document properties, bookmarks, style names, footnotes, and user information. Unicode support in Office XP also means that you can edit and display multilingual text in dialog boxes. For example, you can search for a file by a Greek author's name in the Open dialog box.

Outlook 2002 supports Unicode in the body of mail messages. However, Outlook data — such as Contacts, Tasks, and the To and Subject lines of messages — is limited to characters defined by the user's code page.


Note   Microsoft Windows NT 4.0 and Microsoft Windows 2000 provide full support for Unicode. Some support is provided in Microsoft Windows 98.


Using Unicode values in Visual Basic for Applications

The Microsoft Visual Basic® for Applications environment does not support Unicode. Only text supported by the operating system can be used in the Visual Basic Editor or displayed in custom dialog boxes or message boxes.

You can use the ChrW() function to manipulate text outside the code page. The ChrW() function accepts a number that represents the Unicode value of a character and returns a string containing that character.
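
The idea behind ChrW() can be sketched outside VBA. The following Python snippet is only an analogy, not VBA code: Python's built-in chr() stands in for ChrW(), the helper fits_code_page is a hypothetical name, and cp1252 is assumed here as an example of a Western European system code page.

```python
# Analogy only: Python's chr() plays the role that ChrW() plays in VBA.
GREEK_CAPITAL_OMEGA = 0x03A9  # Unicode value of the character

# Build the character from its numeric Unicode value.
omega = chr(GREEK_CAPITAL_OMEGA)


def fits_code_page(text, code_page="cp1252"):
    """Return True if every character of text exists in the given code page."""
    try:
        text.encode(code_page)
        return True
    except UnicodeEncodeError:
        return False


# The character cannot be typed in a Western European code page,
# but it can still be produced from its Unicode value.
in_western_page = fits_code_page(omega)          # checked against cp1252
in_greek_page = fits_code_page(omega, "cp1253")  # checked against the Greek code page
```

The point of the sketch is that a character missing from the current code page can still be created from its Unicode value, which is what ChrW() allows in VBA code.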

Using local language file names

Windows NT and Windows 2000 support Unicode characters in file names; Windows 98 and Microsoft Windows Millennium Edition (Windows Me) do not. In Windows 98 and Windows Me, file names must use characters that exist in the code page of the operating system.

If users in your organization share files between language versions of Windows, they should use ASCII characters (unaccented Latin script) to ensure that the file names can be used in any language version of the operating system.
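
An administrator might want to check shared file names against this recommendation. The following Python sketch shows one way to do so; the function names and the sample file names are hypothetical and are not part of Office or Windows.

```python
def non_ascii_characters(file_name):
    """Return the characters in file_name that fall outside ASCII, in order."""
    return [ch for ch in file_name if ord(ch) > 127]


def is_portable_file_name(file_name):
    """Return True if file_name uses only ASCII characters."""
    return not non_ascii_characters(file_name)


# Hypothetical file names used only for illustration.
portable = is_portable_file_name("Quarterly_Report.doc")
offenders = non_ascii_characters("R\u00e9sum\u00e9.doc")  # accented e appears twice
```

A name for which is_portable_file_name returns False is not necessarily unusable. It only means that the name might not display or open correctly on a language version of Windows whose code page lacks those characters.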

In Office XP, all applications (except Microsoft FrontPage® and Outlook) now support opening and saving files with Unicode file names, either by using File | Open in the application or by double-clicking the file name in Windows Explorer.


Note   While Microsoft Excel can open and save files with Unicode file names, it cannot save a new file whose name contains characters that do not exist in the current system code page.


Printing and displaying Unicode text

Not all printers can print characters from more than one code page. In particular, printers that have built-in fonts might not have characters for other scripts in those fonts. Also, new characters such as the euro currency symbol might be missing from a particular font.

Although the Office applications contain many workarounds to enable printing on such printers, correct printing is not possible in all cases. If text is not printing correctly, updating the printer driver might fix the problem. If the latest driver does not fix the problem, look in the printer driver options for an option called "download soft fonts" or "print TrueType as graphic." Change this setting and try printing again.

If the text still does not print correctly, you can create a registry entry that works around the printing problems of most printers; the printing quality, however, might be lowered.

To set the registry so that extended characters are printed correctly

  1. Go to the following registry subkey:

    HKEY_CURRENT_USER\Software\Microsoft\Office\10.0\Word\Options

  2. Add a new value entry named NoWideTextPrinting and set its value to 1.

Compressing files that contain Unicode text

Office XP stores text in a form of Unicode called "UTF-16". Unicode characters are encoded in two bytes (or, very rarely, four bytes), whereas non-Unicode systems use a single byte per character, or a mixture of one and two bytes for some Asian languages. Generally, Office XP files with multilingual text are similar in size to Office 97 or Office 2000 files. However, Office XP files may be 30 to 50 percent larger than files created in previous, non-Unicode versions of Office (Office 95 and earlier).


Note   If a file contains text from only English or Western European languages, there is little or no increase in file size because Office XP applications can compress the text.
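
The byte counts behind this size difference can be checked with a short Python sketch. It measures raw encoded text only and says nothing about how Office lays out or compresses a file; the helper name encoded_size and the choice of cp1252 and cp932 as example code pages are assumptions made for illustration.

```python
def encoded_size(text, encoding):
    """Return the number of bytes text occupies in the given encoding."""
    return len(text.encode(encoding))


latin_a = "A"                 # U+0041
greek_omega = "\u03a9"        # U+03A9
kanji = "\u65e5"              # U+65E5
supplementary = "\U00020000"  # U+20000, outside the Basic Multilingual Plane

# UTF-16 (little-endian, no byte order mark): two bytes per character,
# or four bytes for the rare characters that need a surrogate pair.
utf16_latin = encoded_size(latin_a, "utf-16-le")
utf16_greek = encoded_size(greek_omega, "utf-16-le")
utf16_supplementary = encoded_size(supplementary, "utf-16-le")

# Code page-based encodings: one byte for Western text,
# a mixture of one and two bytes for a Japanese code page.
cp1252_latin = encoded_size(latin_a, "cp1252")
cp932_latin = encoded_size(latin_a, "cp932")
cp932_kanji = encoded_size(kanji, "cp932")
```

In this sketch the Latin letter doubles in size when moved from cp1252 to UTF-16, while the kanji character occupies two bytes either way, which is consistent with Western text gaining the most from compression.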


When Microsoft Word 2002 users open and save an English or Western European file from a previous, non-Unicode version of Word (Word 95 or earlier), Word converts the contents to Unicode. The first time the file is saved, Word analyzes the file and notes regions that can be compressed, resulting in a file that is temporarily twice the size of the original file. The next time the file is saved, Word performs the compression, and file size returns to normal.

For Microsoft PowerPoint® files, text is typically a small percentage of file size, so Unicode does not significantly increase file size.

Copying multilingual text

You can use the Clipboard to copy multilingual text from one Office application to another. Text from the Clipboard in RTF, HTML, and Unicode formats can successfully be pasted into Office applications.

Multilingual text in RTF, HTML, and Unicode

When you copy text from an Office XP document, the RTF or HTML formatting data, as well as the Unicode text data, is stored on the Clipboard. This allows applications that do not support Unicode but do support data in multiple code pages to accept RTF text from the Clipboard, which retains some of the multilingual content. For example, both Word 95 and Word 6.0 accept multilingual text from Word 2002 (as well as from Word 2000 and Word 97) from the Clipboard in RTF format.

All language versions of Word 95 and Word 6.0 can display text in most European languages. However, Asian and right-to-left language versions cannot display other Asian or right-to-left languages. Also, English and European versions of Word 6.0 and Word 95 cannot display any Asian or right-to-left text properly.

Word 97 can accept RTF and Unicode text from the Clipboard and display content in all European and most Asian languages. Word 2000 accepts HTML as well, and properly handles all Asian and right-to-left content.

Access 2000, Access 2002, Excel 2000, and Excel 2002 all support copying multilingual Unicode, RTF, or HTML text to the Clipboard. However, Access and Excel cannot accept RTF content. They can accept HTML-formatted text or Unicode text from the Clipboard instead.

Multilingual code page–based single-byte text

In rare cases, users may paste into an Office XP document single-byte (ANSI) text that is encoded in a code page different from the one their operating system uses. If this occurs, they may get unintelligible characters in their document, depending on the application they are pasting into. This problem occurs because Office cannot determine which code page to use to interpret the single-byte text.

For example, you might paste text from a non-Unicode text editor that uses fonts to indicate which code page to use. If the text editor supplies only RTF and single-byte text, the font (and code page) information is lost when the text is pasted in an application that does not accept RTF (for example, Excel). Instead, the application uses the operating system's code page, which maps some characters' code points to unexpected or nonexistent characters.
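
The effect can be reproduced with a short Python sketch. It assumes, purely for illustration, that the source text is Greek stored in code page 1253 and that the receiving system uses code page 1252; it does not model the Clipboard or any Office application.

```python
# Greek sample text, written with escapes so the source stays ASCII.
greek_text = "\u039a\u03b1\u03bb\u03b7\u03bc\u03ad\u03c1\u03b1"

# The source application stores the text as single bytes in the Greek code page.
single_byte_data = greek_text.encode("cp1253")

# With the font (and code page) information lost, the receiver falls back
# to its own code page and maps the same bytes to Western European characters.
misread = single_byte_data.decode("cp1252")

# With the correct code page, the original text is recovered.
recovered = single_byte_data.decode("cp1253")
```

The bytes themselves are never changed. Only the code page used to interpret them differs, which is why restoring the correct code page information restores the text.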

Troubleshooting corrupt text results with older multilingual files

There may be occasions when a user cannot successfully use Office XP to open a file created on an older system. There are several possible scenarios that can create this problem, and for each situation there are steps you can take to work around the issues.

  • The document is a pre-Office 97 document that was created using some incorrectly made TrueType® fonts.

    For example, a document that looked fine in Word 95 might, when opened in Word 2002, display its text as a mixture of Western European characters. This situation occurs because the fonts used in the Word 95 document were marked internally as Western European, and the text data was therefore converted to Unicode Western European text. There are a few other variations on this problem involving symbol fonts, but in all cases you can try one of the following solutions to correct the problem:

    • Change the fonts that display the incorrect characters.

    • Use the "broken fonts add-in" that ships with Office XP. In Word, install the add-in. Then, under the Tools menu, click Fix Broken Text.


  • The document is a pre-Office 97 document created under a "shell" program designed to enable English Windows to support Chinese or another Asian language.

    Examples of such programs include Chinese Star, RichWin, and TwinBridge. In this case, try one of the following solutions:

    • If you open the document in Word, ensure that the correct Chinese language is enabled by checking the setting in the Microsoft Office Language Settings tool (go to Start | Programs | Microsoft Office Tools | Microsoft Office Language Settings). Start Word, and go to Tools | Options | General. Set the value of the option English Word 6/95 documents to the appropriate setting — for example, Contain Asian text.

    • If you open the document in PowerPoint, go to Tools | Options | Asian. Locate the option convert from font-associated text, and set the language correctly.


  • The document is HTML, and the encoding of the file is not marked correctly in the file.

    In this case, with the document open, go to Tools | Options | General, and click Web Options. Click the Encoding tab, and then try different encoding values until the characters in the file are displayed correctly.
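
The trial-and-error approach in the last scenario can be sketched in Python. This is not how Office selects an encoding; the function name preview_decodings, the candidate list, and the sample page content are all hypothetical.

```python
def preview_decodings(raw_bytes, candidate_encodings):
    """Decode raw_bytes with each candidate encoding and collect the results.

    Encodings that cannot decode the bytes at all are skipped, so a person
    only has to compare the remaining previews by eye.
    """
    previews = {}
    for encoding in candidate_encodings:
        try:
            previews[encoding] = raw_bytes.decode(encoding)
        except UnicodeDecodeError:
            continue
    return previews


# Hypothetical page fragment: Cyrillic text saved in code page 1251
# without any encoding declared in the file.
original = "<p>\u041f\u0440\u0438\u0432\u0435\u0442</p>"
raw = original.encode("cp1251")

previews = preview_decodings(raw, ["utf-8", "cp1252", "cp1251"])
```

Several encodings may decode the same bytes without error, as cp1252 and cp1251 both do here, so the final choice still depends on a reader recognizing which preview shows the intended text.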



 
© 2001 Microsoft Corporation. All rights reserved.