windows-nt/Source/XPSP1/NT/sdktools/kbdtool/syntax.txt
2020-09-26 16:20:57 +08:00

376 lines
16 KiB
Plaintext

A keyboard layout description file is a text file that fully describes a
keyboard layout for Windows NT.
The tool KBDTOOL.EXE parses the text file, and produces C sourcce files that
can be compiled and linked to form the keyboard layout DLL itself.
The keyboard layout description file consists of different sections:
KBD The header
VERSION Version
MODIFIERS Optional: Modifier Keys other than Ctrl, Shift, Alt
SHIFTSTATE Describes columns of the LAYOUT section
LAYOUT Defines each key's scancode, VK and characters etc.
DEADKEY Optional: how dead keys combine with base characters
KEYNAME Names of standard keys
KEYNAME_EXT Names of extended keys
KEYNAME_DEAD Names of dead keys
ENDKBD Keyword indicating the end of the keyboard layout description
comments // anywhere on the line or ; as the first non-space char
These sections normally appear in this order but some changes in order are
possible but not recommended.
There may be more than one DEADKEY section.
KBD
===
Syntax: KBD <name> "<description>"
KBD - required keyword starts the description
<name> - output C source files names derived from this. The files will
be kbd<name>.h kbd<name>.rc kbd<name>.c & kbd<name>.def
<name> is typically 2 to 5 letters, but can any length > 0
<description> - Becomes the VER_FILEDESCRIPTION_STR resource in the
final DLL after " Keyboard Layout" is appended.
Example: KBD CAN "Canadian National Standard"
Comments
========
// comments
can appear on the end of any line, or on a line by themselves, in
any section
; comments
can appear on the end of any line, or on a line by themselves, except
within the LAYOUT section
VERSION
=======
Currently this is ignored, but should be 1.0 since future versions of kbdtool
may require this.
This section appears as a single line:
VERSION 1.0
ATTRIBUTES
==========
This section begins with a line containing the word ATTRIBUTES, followed by
a list of attributes, one per line. Each of these attributes corresponds to
an KLF_ flag (defined in winuserp.h) that may be specified in the registry
under MACHINE\System\CurrentControlSet\Control\Keyboard Layouts\*\Attributes
The attributes are:
LRM_RLM Left Shift + Backspace will generate a Left-to-Right marker
Right Shift + Backspace will generate a Right-to-Left marker
This is useful mainl for Arabic and Hebrew languages.
Corresponds to the KLF_LRM_RLM flag.
SHIFTLOCK CapsLock will be turned off by pressing either shift key.
This corresponds to the KLF_SHIFTLOCK flag.
ALTGR Pressing the right-hand Alt key will actually generate the key
sequence Left Control + Right Alt, thereby obtaining a third
modifier state known as "AltGr" which is used by many European
layouts.
This bit need not be explicity listed here: if any Control + Alt
modifiers are listed in the MODIFIERS section (described below),
then this attribute will be automatically defined.
When the layout is loaded, the attributes specified in this section will be
combined with those obtained from the layout's Attributes value in the registry.
The section should be followed by a blank line. It is terminated by a
keyword starting the next section (which should be MODIFIERS).
MODIFIERS
=========
This section defines keys that may be held down to alter the character produced
by typing another key, in addition to the standard modifiers Shift, Ctrl and
Alt (AltGr is the same as Alt + Ctrl).
Each consists of a VK value and a hexadecimal single-bit value greater than 4.
For example, the infamous Canadian National Standard CAN/CSA-Z243.200-92 layout,
kbdcan.txt, specifies that the VK_OEM_8 key is another modifier, identified by
the value 0x08, by having a "MODIFIER" line like this: "OEM_8 8"
SHIFTSTATE
==========
This section is a list of Shift/Ctrl/Alt key combinations, represented by a
hexadecimal number formed by OR-ing in zero or more of the following bits:
0x1 Shift
0x2 Ctrl
0x4 Alt
0x8 - 0x80 Optional modifiers as defined in the previous "MODIFIERS" section
Each entry in the list is on a separate line: the first line is the modifier
state for column 4 of the "LAYOUT" section that follows, the second line is
the modifier state for LAYOUT column 5, the third line is for LAYOUT column 6
and so on. This is usually commented to make it clearer, eg:
0 ;Column 4 :
1 ;Column 5 : Shift
6 ;Column 6 : Control Alt
2 ;Column 7 : Control
3 ;Column 8 : Shift Control
In the example above, AltGr (Alt + Ctrl) character will be found in column 6
of the following "LAYOUT" section.
LAYOUT
======
This describes each key, one row per key.
Each row contains either 2 columns (for keys with no character) or more than 5
columns. Each column is separated by whitespace.
Column 1 - Scancode
-------------------
The scancode emitted by the keyboard hardware in hexadecimal.
This is usually 2 digits.
Extended or enhanced keys, which produce a 2-byte sequence starting 0xE0 or
0xE1 usually obtain default VK values hard-coded into kbdtool, but these may be
overridden by specifying a scanocde in the form E0xx or E1xx. The only example
of this so far is in kbdcan.txt (the notorious Canadian National Standard
CAN/CSA-Z243.200-92 layout)
Sometimes the scancode will appear as "-1", which indicates that this row is
information supplementing the preceding row: see SGCAPS under Column 3.
Column 2 - Virtual key value
----------------------------
This is the VK value, as passed in Windows messages such as WM_KEYDOWN.
The value is either a VK literal with the first 3 letters ("VK_") removed, or
it is an ASCII character 0 through 9 or A through Z for those VK values that
are not explicitly defined in our header files.
Column 3 - CapsLock and SGCAPS
------------------------------
This is a bitwise combination of the following values:
Value Name Appearance
0x01 CAPLOK "1"
0x02 SGCAPS "SGCap"
0x04 CAPLOKALTGR "4" but usually appears added to CAPLOK as "5"
0x08 KANALOK unknown (some FE meaning)
0x80 GRPSELTAP unused
CAPLOK: means that this key is affected by the CapsLock state (or ShiftLock state)
SGCap: means that this key is affected by the Shift state and the CapsLock
state, but in different ways. This is described in two lines: the First
is marked SGCap in column 3 and indicates how the character values are
determined according to whether a Shift key is down or not. The Second
line is marked with -1 in column 1 (scancode) and indicates how character
values for the same key are determined according to the CapsLock state.
This occurs in the Swiss German layout, kbdsg.txt, hence the term SGCaps
but it is also used in various middle-eastern layouts as well.
CAPLOKALTGR: means that the CapsLock state affects those chracaters that are
produced with AltGr too. AltGr (the right hand Alt, or Ctrl+Alt), is
a "third shift" that is used to make a key generate more than just an
upper an lowercase version of the same character. If a key generates
one character while AltGr is held down and another character while AltGr
and Shift are held down, then the CAPLOKALTGR bit says that CapsLock
state has the same effect as changing the Shift key state. This bit
usually appears together with the CAPLOK bit on the same key to form
the value "5" in column 3.
KANALOK: means ????
GRPSELTAP: not used (residue from an early attempt at the CAN/CSA layout)
Columns 4 and beyond - character values
---------------------------------------
Each column from column 4 onward represents a character value for a specific
Shift/Ctrl/Alt/CapsLock state. They appear either literally (in the US English
Windows codepage 1252), or as hexadecimal Unicode values.
Characters that have a "@" suffix are "dead keys". These will generate a
diacritic (accent mark) which will not appear at the time it is typed, but
will be applied to the next letter typed. See WM_DEADCHAR. The ways in which
each diacritic combines with the next character typed are defined in "DEADKEY"
sections that follow this "LAYOUT" section.
The "SHIFTSTATE" section that precedes the "LAYOUT" section defines the
Shift/Ctrl/Alt combination for each column.
Default Entries
----------------
Certain keys which are common to all keyboard layouts have default entries
created for them by kbdtool.exe.
These may be overridden by specifying them in the LAYOUT section.
These are the default entries:
VK_BACK 0 0008 0008
VK_ESCAPE 0 001b 001b
VK_RETURN 0 000d 000d
VK_SPACE 0 ' ' ' '
VK_CANCEL 0 0003 0003
TAB 0 0009 0009
ADD 0 + +
DIVIDE 0 / /
MULTIPLY 0 * *
SUBTRACT 0 - -
NUMPAD0 0 0
NUMPAD1 0 1
NUMPAD2 0 2
NUMPAD3 0 3
NUMPAD4 0 4
NUMPAD5 0 5
NUMPAD6 0 6
NUMPAD7 0 7
NUMPAD8 0 8
NUMPAD9 0 9
Entry order
-----------
The way the LAYOUT entries are ordered is important in determining how the
Win32 API VkKeyScanEx() functions.
Order of numeric keypad entries.
The most important thing is that any Numeric Keypad Virtual keys (VK_DECIMAL,
VK_MULTIPLY, VK_SUBTRACT, VK_ADD, VK_DIVIDE) should appear after all the other
Virtual Keys. Normally, these keys, with the exception of VK_DCIMAL, do not
need explicit entries anyway: kbdtool will use the defaults (shown above) and
place them in the correct order. VK_DECIMAL is typically either a period "."
or a comma "," and has no default entry. Its entry should be placed at the end
of the LAYOUT section.
Order of entries for duplicated characters.
If the layout contains more than one way to produce a given character, the
way to arrange that VkKeyScanEx reports a particular key combination is to
make sure that all entries containing the given charcter have the same number
of columns, and that the chosen one appears first. For example, the Hungarian
keyboard layout has more than one way to type I acute: the VK_OEM_102 key
has i acute and I acute on it, but you can also type I acute with AltGr+I.
The entry for the "I" key would normally look like this:
17 I 1 i I 00cd // 0x00cd isI acute
and the entry for the "I acute" key looks like this:
56 OEM_102 1 00ed 00cd < 001c // i acute, I acute
To ensure that VkKeyScanEx reports Shift+VK_OEM_102 as the key combo for
I acute (0xcd), we must give the "I" key an extra column with with the value -1
(meaning no character), so that both entries have the same number of columns,
and place the preferred one first:
56 OEM_102 1 00ed 00cd < 001c // i acute, I acute
17 I 1 i I 00cd -1 // 0x00cd isI acute
By giving both entries the same number of columns, they will appear in the same
subsection of the keyboard layout DLL. VkKeyScan searches within each
subsection in order, but the order in which it searches subsections themselves
is not specified.
The exception to this requirement is when the duplicated character is produced
with Ctrl or Ctrl+Shift - VkKeyScan will report such a key combo only if no
other key combination produces that character.
DEADKEY
=======
There may be many DEADKEY tables in the layout. There should be one for each
dead key that appears in the LAYOUT section.
The format of a standard DEADKEY table is a s follows.
DEADKEY <deadchar> ; Comment
<basechar1> <compchar1> ; comment
<basechar2> <compchar2> ; comment
<basechar3> <compchar3> ; comment
... ...
<basecharN> <compcharN> ; comment
0020 <spacing_deadchar>
A DEADKEY table describe how a specified diacritic or "dead" character can
combine with other characters to form a single composite character.
The first line of each DEADKEY section is of the following form:
DEADKEY <deadchar> ; Comment
Where <deadchar> represents the dead character, either in the form of a four
digit hexadecimal number or the character itself (except semicolon).
This is followed by any number of lines of the following form:
<basechar> <compchar>
Where the <basechar> is the base character and <compchar> is the resulting
composite character when the <deadchar> is typed followed by <basechar>.
Dead characters appear in the LAYOUT table as hexadecimal values or characters
with a @ appended. These represent diacritics which may be combined with
other characters to form a single composite character.
Dead chars do not appear when you type them, but combine with the next (base)
character typed to form a single composite character (or two separate characters
if there is no valid way for the dead and base characters to combine).
By convention, Microsoft ships keyboard layouts that encode the spacing version
of the diacritic (if it exists) rather than the non-spacing or combining form.
For example, use 0060 (GRAVE ACCENT) rather than 0300 (COMBINING GRAVE ACCENT).
Using spacing dead characters will cause the dead character to appear as a
spacing character when there is no combination with the base character. Also,
the dead character is sent to the application in a WM_DEADCHAR message at the
time it is typed: most apps ignore this, but for backward compatibility it will
be best to define the deadkey as a spacing character whenever possible.
When chosing the best spacing version of a diacritic, try to pick one that is
most likely to be found in many fonts, as well most likely to be translated by
the NLS subsystem into as many different ANSI codepages as possible.
For example, choose 0027 (Apostrophe) as the dead character value for 030d
(Combining Line Above) instead of 0384 (Greek Tonos) or 02c8 (Modifier Letter
Vertical Line Above), since Apostrophe it is present in most fonts, and the NLS
API will be able to convert the Unicode Apostrophe into many different ANSI
codepages so that ANSI apps will work well with the keyboardf layout. If you
030d itself as the dead character, an ANSI app would receive question marks
instead of the uncomposed 030d unless it happened to be running in the Greek
input locale, since the NLS API can only translate 030d to an ANSI character
(0xb4) in the Greek codepage (1253).
The results of a composition is usually a spacing character but it can also be
another dead character. This is achieved by including lines such as:
<basechar> <compchar>@
which is useful if you want to produce a single composite character with
multiple diacritics, such as Acute + Acute + u giving u with double acute,
which could be implemented by the following two tables.
eg:
DEADKEY 00b4 ; combinations with Acute
00b4 02ba@ ; Acute + Acute -> Double Acute (Modifier Letter Double Prime)
0020 00b4 ; Acute + Space -> Acute
DEADKEY 02ba ; combinations with Double Acute (Modifier Letter Double Prime)
u 0171 ; Double Acute + u -> Small Letter u With Double Acute
U 0170 ; Double Acute + u -> Capital Letter U With Double Acute
0020 2033 ; Double Acute + space -> Double Prime
By convention, the last line of every DEADKEY table should be a line composing
the deadkey with SPACE. Standard keyboard behavior is to have all dead keys
compose with space to form a spacing representation of that dead character.
KEYNAME
=======
This is a table of scancodes of certain keys and the user-readable names for
those keys as returned by the GetKeyNameText API.
The syntax is 0 or more lines of the form:
<scancode> <name>
Where <scancode> is 2 hexadecimal digits, and <name> is either a single word
optionally enclosed in quotes, or multiple words enclosed in quotes.
The keys listed in this section are those that do not have an 0xE0 prefixed
scancode, and which do not produce normally an alpha-numeric character.
Examples are Backspace, Enter, Ctrl, etc.
The names are typically in the language that the keyboard supports, or English.
KEYNAME_EXT
===========
The same as the KEYNAME section (above) but for keys that do have the 0xE0
scancode prefix.
KEYNAME_DEAD
============
Similar to the KEYNAME and KEYNAME_EXT sections, this provides names for keys
that produce dead characters.
The syntax is 0 or more lines of the form:
<unicode> <name>
where <unicode> is 4 hex digits representing the Unicode value of the character.
This character must appear elsewhere in the LAYOUT section as a dead key, and
should also have a DEADKEY table.
The <name> is the same format as the <name> in the KEYNAME and KEYNAME_EXT
sections.
The KEYNAME_DEAD section need only be present if dead characters are present
in the keyboard layout.
ENDKBD
======
Signifies the end of the keyboard layout definition.
The syntax is simple the keyword ENDKBD appearing by itself on the last line.
Any lines following this are ignored.