windows-nt/Source/XPSP1/NT/sdktools/restools/rltools/doc/rlman.txt

665 lines
24 KiB
Plaintext
Raw Normal View History

2020-09-26 03:20:57 -05:00
Resource Localization Manager User's Guide
Version 1.72
05/03/94 09:51 AM
Copyright <20> 1992-1994 Microsoft Corporation
Table of Contents
Table of Contents 2
Introduction 3
Usage: 4
Languages IDs 6
Localization Process 8
One-Shot Process 8
Parallel Process 9
Project Creation 9
Maintaining the Master Project 9
Maintaining Each Locale Project 9
Data Files and Formats 10
Master Project Files (MPJ) 10
Project Files (PRJ) 10
Master Token Files (MTK) 11
Language Token Files (TOK) 12
Resource Description Files (RDF) 13
Introduction
In the past, localization of a software product required the localizer to edit strings and controls
embedded in source code and then rebuild the product in order to test the localized version. Such a process
requires at least a rudimentary knowledge of computer programming and is often prone to human error.
The Resource Localization Manager (hereafter referred to as RLMan) was designed to automate
localization of products that make use of the Windows resource model by allowing the localizer to extract
localizable resources directly from the applications that use them, modify the resources, and use the
modified resources to create localized versions of the original applications. All this can be achieved
without rebuilding the product and with minimal knowledge of computer operations.
RLMan was designed with several goals in mind. Some of these goals were:
* Allow the product to be localized without re-compilation.
* Allow localization to proceed concurrent with development (provide update capability).
* Allow localizers to share glossaries of common terms among applications.
The localization model followed by the RLMan is very simple. Localizable resources are extracted from a
source resource file and put into special text file called a token file. Each localizable resource may
generate one or more tokens. Each token is contained on a single line of text and consists of a unique
identifier followed by the localizable data associated with that particular resource. These tokens can then
be localized by using a standard text editor.
The localized tokens are then used in conjunction with the source resource file to generate a new localized
resource file. The term "resource file" in this document means "any Windows executable format file
(.EXE, .DLL, .CPL, etc.) or .RES file". The target resource file will contain exactly the same resources as
the source resource file, the only difference is that the data will be localized.
This model has been expanded a little to allow for update tracking. When localization is done in
conjunction with development a target resource file may change after the localizer has tokenized the file
and begun translation. Update tracking allows the localizer to update the localized token file without
losing any work that might have been done. Any resources that may have changed since the most recent
update are marked "dirty" and the change is tracked in the token file so the localizer may see exactly what
changed and exactly how it changed.
To allow for update tracking, the source resource file is used to generate a "master token file" which tracks
changes. The master token file is then used to update any number of "language token files" (one for each
target language). These language token files are then localized and used to generate the target resource
files.
Usage:
Any file name can include a UNC (Unified Naming Convention) share name or drive letter, and a
directory path.
The syntax of the 'RLMan' command is as follows:
rlman [-c RDFile] [ -p CodePage ] [ -f ResType ] [-{n|o} PriLang SubLang] [-w] -{e|t|m|l|r|a} files
-e InputExeFile OutputResFile
Extract localizable resources from the resource file (.exe, .dll, .cpl, .etc.) InputExeFile and create
a Win32 resource file OutputResFile. This output file can then be used, for example, by the
dialog box editor or the bit map editors that come with the SDK.
-t InputResOrExeFile OuputTokenFile
This option will extract the localizable resources from the executable or resource format file
InputResOrExeFile and create a project token file OuputTokenFile. Using this option will
circumvent the history-keeping mechanism of RLMan. It is made available for those times when
the user wants to simply see what localizable resources are in the input file or when the history
mechanism is not needed. Using the -o option and the -p option with the -t option will allow one
to extract and tokenize resources of a specific language. The resulting token file will contain only
those resources that have the specified locale and the text will be in the given code page.
-m MasterProjectFile [ InputResOrExeFile MasterTokenFile ]
When the history mechanism is wanted, the first step when creating a new master project or
updating an existing master project is to use this option. If the MasterProjectFile does not exist,
the optional InputResOrExeFile and MasterTokenFile arguments must be provided. These last
two arguments will be ignored if the MasterProjectFile exists.
-l LanguageProjectFile [ MasterProjectFile LanguageTokenFile ]
This option is used, when the history mechanism is wanted, to create a new localization project
or to update an existing project. If the LanguageProjectFile does not exist, the optional
MasterProjectFile (the one created via the -m option) and LanguageTokenFile arguments must
be provided. These last two arguments will be ignored if the LanguageProjectFile exists.
-r InputResOrExeFile LanguageTokenFile_or_ResFile OutputResOrExeFile
This option is used to create a localized version of the InputResOrExeFile. The resources in that
InputResOrExeFile will be replaced with the localized resources in
LanguageTokenFile_or_ResFile. OutputResOrExeFile is the localized version. . If the
LanguageTokenFile_or_ResFile is a Token file, use the -o option to specifiy what the old
language to be replaced is and use the -n option to specifiy what the new language is.
-a InputResOrExeFile LanguageTokenFile_or_ResFile OutputResOrExeFile
This option is used to create a localized version of the InputResOrExeFile. The resources in the
resulting OutputResOrExeFile will include the original resources from InputResOrExeFile plus
the localized resources in LanguageTokenFile_or_ResFile. If the
LanguageTokenFile_or_ResFile is a Token file, use the -n option to specifiy what the new
language is.
-n PriLang SubLang
Specifies what the language the tokens in the token file are in and consequently what language
the new resources are in. Used when setting up a new Language Project (-l) and with the -r and -
a options. Can also be used with the -m option when the resources in the original resource file
are not in U.S. English. PriLang SubLang are decimal values from the list of allowed values in
the SDK.
-o PriLangID SubID
Specifies what the language the resources being replaced in, or extracted from, the source
resource file are in.
Used with the -r and -a options when the resources being replaced, or added to, in the original
resource file are not in U.S. English. PriLang SubLang are decimal values from the list of
allowed values in the SDK. For example, if the U.S. English resources had been replaced with
German resources and now you wanted to add French resources to that German file, use the -o
option (with arguments 7 1) to indicate that the original resources are in German not U.S English
together with the -a option. Or, if you had created a file with U.S. English plus German resources
in it, and now you wanted to replace the German resources with French (thus making a U.S
English plus French file), you would use the -r option with the -o option (with arguments 7 1)
and the -n option ( with arguments 12 1) to indicate that the old German resources are to be
replaced with the new ones in the given French token file.
Used with the -t option, this identifies the language of the resources that are to be tokenized. You
should also use the -p option to specifiy the code page the text in the token file is to be written in.
-c RDFile
Use custom resources defined in the resource description file RDFile.
-p CodePage
The default code page used in converting between the Unicode resources and the text in the token
files is the Windows ANSI code page. To change the code page, use this option and use the IBM
code page number as the CodePage argument. For example; -p 932.
-f ResourceType
By default, all localizable resources are extracted. To extract a single resource type, use this
option with ResourceType set to the resource type's numeric value (1-16 for Windows resources).
-w
Print warnings about unknown custom resource types (-c option not given or resource type is not
in the RDFFile), and about resources that are not tokenized because their language is not the
language requested (-o option, or US English by default). It will also warn of any zero-length
resources found.
Languages IDs
Primary Language ID's
Sub language IDs.
Neutral
0x00
Neutral
0x00
Albanian
0x1c
Default
0x01
Arabic
0x01
System Default
0x02
Bahasa
0x21
Arabic (Saudia Arabia)
0x04
Bulgarian
0x02
Arabic (Iraq)
0x08
Byelorussian
0x23
Arabic (Egypt)
0x0C
Catalan
0x03
Arabic (Libya)
0x10
Chinese
0x04
Arabic (Algeria)
0x14
Czech
0x05
Arabic (Morocco)
0x18
Danish
0x06
Arabic (Tuinisa)
0x1C
Dutch
0x13
Arabic (Oman)
0x20
English
0x09
Arabic (Yemen)
0x24
Estonian
0x25
Arabic (Syria)
0x28
Farsi
0x29
Arabic (Jordan)
0x2C
Finnish
0x0b
Arabic (Lebanon)
0x30
French
0x0c
Arabic (Kuwait)
0x34
German
0x07
Arabic (U.A.E.)
0x38
Greek
0x08
Arabic (Bahrain)
0x3C
Hebrew
0x0d
Arabic (Qatar)
0x40
Hungarian
0x0e
Chinese (Traditional)
0x01
Icelandic
0x0f
Chinese (Simplified)
0x02
Italian
0x10
Chinese (Taiwan)
0x04
Japanese
0x11
Chinese (PRC)
0x08
Kampuchean
0x2c
Chinese (Hong Kong)
0x0C
Korean
0x12
Chinese (Singapore)
0x10
Laotian
0x2b
Dutch
0x01
Latvian
0x26
Dutch (Belgian)
0x02
Lithuanian
0x27
English (US)
0x01
Maori
0x28
English (UK)
0x02
Norwegian
0x14
English (Australian)
0x03
Polish
0x15
English (Canadian)
0x04
Portuguese
0x16
English (New Zealand)
0x05
Rhaeto Roman
0x17
English (Ireland)
0x06
Romanian
0x18
French
0x01
Russian
0x19
French (Belgian)
0x02
Serbo Croatian
0x1a
French (Canadian)
0x03
Slovak
0x1b
French (Swiss)
0x04
Spanish
0x0a
German
0x01
Swedish
0x1d
German (Swiss)
0x02
Thai
0x1e
German (Austrian)
0x03
Turkish
0x1f
Hebrew (Israel)
0x04
Ukrainian
0x22
Italian
0x01
Urdu
0x20
Italian (Swiss)
0x02
Vietnamese
0x2a
Japanese (Japan)
0x04
Sub language IDs. (cont.)
Korean (Korea)
0x04
Norwegian (Bokmal)
0x01
Norwegian (Nynorsk)
0x02
Portuguese (Brazilian)
0x01
Portuguese
0x02
Serbo Croatian (Latin)
0x01
Serbo Croatian
(Cyrillic)
0x02
Spanish (Traditional
Sort)
0x01
Spanish (Mexican)
0x02
Spanish (Modern Sort)
0x03
Thai (Thailand)
0x04
Localization Process
There are two basic types of localization. The first is when a product is correctly enabled for localization
("globalized"), the product development is finished, and all that is needed is to modify the localizable
resources. We'll call this the "one-shot process". The second is when a product is being localized in
parallel with the development process and the localization work is to be preserved across new builds of the
original (typically English) file. We'll call this the "parallel process". It is done in parallel with product
development so the localized versions are ready as soon as possible after the domestic version is.
The next two sections describe, in a step-by-step list, the procedure to use for the two processes. In the
sample command lines, the executable is called "prog.exe". A GUI version of RLMan is being developed.
That version will make the various steps invisible to the user and will incorporate an editor that hides the
token ID part of the lines in the token file.
One-Shot Process
To simplify the file names on the command line, change directories to the place where the localized files
are to be kept. Leave the source executable (typically the English version) in some other directory *
anywhere on the network or on the local machine.
If you have a localized .RES file, simply skip to step 4 and use the .RES file in place of the .TOK file
name shown in that step<65>s example.
1. Create the project token file "prog.tok".
rlman -t prog.exe prog.tok
2. Translate the text in the "prog.tok" file with your favorite text editor. Assume German in this
example.
3. If dialog boxes are to be resized, create the file "tmpprog.exe" which will contain the translated text.
rlman -n 7 1 -r prog.exe prog.tok tmpprog.exe
Create the .RES file, in German, needed by the dialog editor in the SDK.
rlman -e tmpprog.exe prog.res
Resize the dialog boxes as appropriate to account for the different lengths of the translated text.
dlgedit prog.res
Update the project token file with the revised dialog box coordinates and sizes.
rlman -t prog.res prog.tok
4. Create the final, localized, executable.
rlman -n 7 1 -r prog.exe prog.tok newprog.exe
This completes the process. The German file "newprog.exe" is now ready to be tested.
Parallel Process
This process maintains the localization work from one build of the source executable (typically the
English version) to the next. With this version of RLMan there is one caveat * if the developers change
the ID number of a localizable item, the previous translation will be lost. This is being addressed and a
solution will be available in a future release of RLMan. New items can be added or old ones deleted but an
item with a changed ID will show up as a new item.
Project Creation
1. Create a directory for the master files and a separate directory for the project files. These
directories may be anywhere on the net. The master project directory may contain any
number of master projects, one need not create a new directory for each project as long as the
base name for each master project in unique. There should be a separate directory for each
localized version of the executable file. Typically this means one directory for each language.
2. Move to the master directory.
3. Copy the source executable to the master directory.
4. Create the master project file and the master token file.
rlman -m prog.mpj prog.exe prog.mtk
5. Move to the project directory (for German in this example).
6. Create the project file and the project token file.
rlman -n 7 1 -l prog.prj prog.mpj prog.tok
7. Steps 5 and 6 need to be repeated for each project directory (language). The resulting
prog.tok files can then be translated to the appropriate language.
Maintaining the Master Project
1. Copy the newly built source executable to the master directory.
2. Move to the master directory, then update the master project file and the master token file.
rlman -m prog.mpj
Maintaining Each Locale Project
1. Follow steps 2 and 3 of the "one-shot process" (previous page) as often as desired until the
resources are localized satisfactorily.
2. Every time the master project is been updated (step 2 in the "Maintaining the Master Project"
section), update the project file and project token file.
rlman -l prog.prj
Repeat step 1 as needed to catch new or changed resource items.
Data Files and Formats
RLMan uses a variety of special file types. All of the file formats described below are a special form of text
file. Each file is human-readable and can be edited with any standard text file editor (such as Notepad).
As a general rule, all text in these files follows the C escape convention when dealing with non-
displayable characters. This convention uses escape characters to represent non-displayable characters.
For example, \n is newline and \t is tab.
Master Project Files (MPJ)
Master project files consist of four lines of text:
* The first line contains the path to the source resource file. This may be either a .RES file, or an .EXE
format file.
* The second line contains the path to the master token file (MTK).
* The third line contains zero, one or more paths to resource description files (RDFs) separated by
spaces.
* The fourth line contains a date stamp indicating the date of the source resource file as of the last
update.
* The fifth line contains a date stamp indicating the date of the master token file as of the last update of
the master project.
* The sixth line contains the Language ID of the resources in the master token file (.MTK).
Project Files (PRJ)
Project files consist of five lines of text:
* The first line contains the path to the master project file (MPJ).
* The second line contains the path to the language token file (TOK).
* The third line contains the path to the target resource file. This may be either a .RES file or (if the
source resource in the MTK is an .EXE file) an .EXE format file.
* The fourth line contains a date stamp indicating the date of the master token file (MTK) as of the last
update of the project.
* The fifth line may be left blank or it may contain the path to a glossary file. (Not used in this release.)
* The sixth line contains the code page used when reading/writing the project token file. A zero (0)
means the system's Windows ANSI code page. A one (1) means the syetem's default OEM code page.
* The seventh line contains the Language ID of the resources in the language token file (.TOK).
Master Token Files (MTK)
Master token files are text files which contain tokenized resources taken from some source resource file.
Each token consists of a unique identifier followed by the text form of the resource data. Tokens are
delimited by end-of-line characters.
Master token files are used for update tracking. They contain no localized resource data and should not be
changed except by RLMan.
An example of what one token might look like is shown below:
[[5|255|1|32|5|"FOO"]]=Localizable string containing text in C format.
The token ID is surrounded by double square brackets and divided into 6 fields delimited by the vertical
pipe '|' symbol:
* The first field indicates the type of the resource
* The second field is the resource name in the case of an enumerated resource, or it is 65535 if the
name is a label (string) in which case the label itself is stored in the sixth field.
* The third field is the internal resource id number taken from the resource header.
* The fourth field is made up of a combination of data taken from the resource header and generated by
the tools. This value is used in conjunction with the other values in the token ID to uniquely identify
the resource.
* The fifth field is a status field used by the update tools to determine the status of the current token.
* The sixth field contains the name of the resource if the resource is identified by a label. Otherwise it
contains a null string.
A token ID is followed by an equal sign which is in turn followed by the resource data. The data extends
from after the equal sign to the end of the line (exclusive). Non printing characters (such as new-line and
control characters) are represented using C escape sequences. Two of the most common are \n for new-
line and \t for tab. Some characters are shown in the form '\nnn' where nnn is the decimal value of that
character.
A token's status field is made up of combinations (bitwise OR'ing) of three basic flags:
* CHANGED 4 indicates that the token has changed since the last update
* READONLY 2 indicates that the token should not be localized.
* NEW 1 used in conjunction with the CHANGED flag to indicate that this is
the new version of the token.
For example, if a token has changed during an update, the current text would be stored in a token with a
status of CHANGED+NEW (4 + 1) = 5. The previous text is also stored in the token file using the same
token ID but the status field would contain a 4 (CHANGED). This way both the current and the previous
text are retained.
Language Token Files (TOK)
Language token files are similar to master token files; the only difference being the meaning of the status
fields found in the token identifiers.
Language token files are used during localization. They contain localized resource data.
A token's status field is made up by combinations of four flags:
* TRANSLATED 4 indicates that the token contains text that should be put in the target
resource. If a token is not marked as TRANSLATED then it contains
unlocalized text from the master token file which is maintained for
update tracking purposes.
* READONLY 2 indicates that the token should not be localized.
* NEW 1 used only for tokens that are not marked with the TRANSLATED
flag to indicate that this is the new version of the unlocalized token.
* DIRTY 1 used only for tokens that are marked with the TRANSLATED flag to
indicate that the token is in need of attention (either the original
translation has changed or the token has never been localized).
For example, a clean, localized token is marked only with the TRANSLATED flag and therefore has a
status value of 4.
As in the Master token files, non printing characters (such as new-line and control characters) are
represented (and entered by the localizer) using C escape sequences. Two of the most common are \n for
new-line and \t for tab. Some characters are shown or entered in the form '\nnn' where nnn is the decimal
value of that character. The localizer may enter any character it's '\nnn' form.
Resource Description Files (RDF)
Custom resources are described in resource description files (RDFs) using c-like structure definitions.
Each definition is identified with a specific resource type and the definition is applied to every resource of
that given type.
An identifier is declared by the following syntax:
<type>
Types are numbers or quoted names unless they are normal windows types in which case the standard
Windows type name may be used in place of a number or name. (CURSOR, BITMAP, ICON, MENU,
DIALOG, STRING, FONTDIR, FONT, ACCELERATORS, RCDATA, ERRTABLE, GROUP_CURSOR,
GROUP_ICON, NAMETABLE, and VERSION).
A structure definition follows normal 'C' syntax with the following limitations and differences:
* Each definition must be fully enclosed in braces { }.
* The standard 'C' types: char (single-byte OEM characters), int, float, long, short,
unsigned, and near and far pointers are accepted. Additionaly, the types wchar (Unicode
character) and tchar (Unicode in the NT version, OEM otherwise) are accepted. (Labels and macros
are not legal.)
* Nested structures, arrays and arrays of structures are legal. All arrays must have a fixed count except
for strings which are described below. int[10] is legal int[] is not.
* Null terminated strings (sz's) are the only variable length structures allowed. They are represented as
an array of characters with no length: char[]
* Fixed length strings are represented as arrays of characters with a fixed length: char[10]
* Comments may be included in the file using standard c comment delimiters (/* */ and //) or by
placing them after the pound symbol #.
* Localizable types (types that need to be placed in token files) are indicated by all caps. Hence INT
would generate a token while int would not.
The following is a sample RDF file:
# This is a sample Resource Description File
<"type">
{
int, // no token will be generated for this integer
CHAR // this single-byte character will be placed in a token
}
<RCDATA>
{
WCHAR[] // a null terminated Unicode string that requires a token
wchar[] // no token will be generated for this Unicode string
}
<1000>
{
TCHAR[], // a null terminated Unicode or OEM string that requires
// a token (Unicode if running NT version, else OEM).
{
int,
FLOAT, // localizable floating point value
far *,
CHAR[20] // localizable 20 character single-byte string
}[3], // an array of three structures (NOT IMPLEMENTED YET)
int
}
END // Optional
Throughout this document, the term localization refers to the process of preparing a product for an
international market. This process involves (among other things) translating text and resizing controls
such as dialogs and buttons. A person performing localization is referred to as a localizer.
This document refers to a resource file as being any file that contains Windows resources. This can be a
.EXE file (or a .DLL, a .CPL, etc.), as well as a .RES file. RLMan can use any of these files as resource
files.
The Resource Localization Manager Page 14