DirectInput Input Semantics

Raymond Chen
Microsoft Corporation
7 November 1997

Abstract

A common problem faced by gaming device manufacturers is ensuring that applications use the gaming device in the best possible way. There is currently no way for a device to express high-level semantic information that can be used by an application to assign, for example, proper motion semantics to each input control. A solution to this problem is proposed.

The Variety of Hardware

High degree-of-freedom devices are growing in popularity. These types of devices reveal a failing in the way axes are currently assigned: Different applications interpret (for example) the X axis differently. Flight simulator type applications typically interpret the X axis as a bank/roll control. Driving games and first-person shooting games, on the other hand, typically interpret it as a turning control. And still other types of games might interpret it as controlling left/right translation ("sliding").

Currently, hardware vendors include a software component which dynamically reconfigures the hardware based on guesses as to the semantics expected by the currently running application. This technique is error-prone and leaves the hardware vendor in a constant game of catch-up, releasing updates to the software to accommodate new games. Consequently, it is not suitable as a long-term solution.

The HID Approach

The USB HID committee attempted to address this problem by defining "usages" which express information about the intended use of the device. Unfortunately, this approach has a few failings:

Semantics

This document proposes a new concept, tentatively named "Semantics". (Suggestions for alternate names welcome. "Behavior" is a possibility, although it tends to imply some sort of AI.)

A semantic expresses what application behavior should result from the user's operation of the control.

A list of semantics would be agreed to by the gaming community.

How Devices Express Semantics

A hardware device can express the semantics that can be applied to each control, in descending order of priority. For example, the X axis on a joystick could be listed as

This information would be recorded in the registry "type key" associated with the device. (Information of this ilk is already kept in the type key.) The INF file distributed with a hardware device would establish these semantics. In the absence of semantic information, DirectInput would apply a default set of semantics.

How Applications Request Semantics

One of the goals of this proposal is to shift the onus of establishing mappings between game behaviors and device controls from the application to DirectInput. Doing this would accomplish several things.

According to this proposal, the application would present to DirectInput a list of semantics it desires from the device. Example:

    LPDIDATAFORMAT pdf;
    HRESULT hres;
    DWORD rgSemantics[] = {
        DISEM_AXIS_BANK              , /* Bank/roll control */
        DISEM_AXIS_CLIMB             , /* Climb/dive control */
        DISEM_AXIS_THROTTLE          , /* Throttle (velocity) control */
        DISEM_BUTTON_FIREWEAPON      , /* Fire selected weapon */
        DISEM_BUTTON_WEAPONSELECTUP  , /* Select next weapon */
        DISEM_BUTTON_WEAPONSELECTDOWN, /* Select previous weapon */
        DISEM_BUTTON_SHOWMAP         , /* Show/hide onscreen map */
        DISEM_BUTTON_ANY             , /* For special game feature */
    };

    hres = pdev->BuildDataFormat(rgSemantics, &pdf);
    if (SUCCEEDED(hres)) {
        hres = pdev->SetDataFormat(pdf);
        pdev->FreeDataFormat(pdf);
    }

Observe that this is a simple extension of what applications already do, except that instead of using a fixed data format, the application asks DirectInput to build a custom data format based on its semantics requirements.

The rgSemantics array describes the controls which the application requests. The BuildDataFormat method compares this array against the capabilities of the device and determines how semantics should be assigned to controls. Note the special semantic named DISEM_BUTTON_ANY which acts as a catch-all that matches any button (just like in the old days).

The application can look to see which semantics got assigned to which controls (or perhaps to no control if all compatible controls are already in use) by inspecting the DIDATAFORMAT structure. For example, the above sample application could check if DirectInput successfully found a control for use as a throttle as follows:

    // Do this before pdev->FreeDataFormat(pdf), of course
    //
    // rgSemantics[2] = DISEM_AXIS_THROTTLE, corresponding to rgodf[2]
    if (!pdf->rgodf[2].dwType) {
        // Unable to find a throttle on the device
    }

Most game applications provide keyboard equivalents for all functions, so there would typically be no need for checking if a particular semantic was supported on the gaming device.

This is merely the basic idea; many details are not covered. For example, suppose the application selects DISEM_POV_GLANCE to request a control that can be used to glance around the environment (turning the head without turning the body). A device can express this with a single control (a hatswitch), a pair of controls (two axes), a quartet of controls (four buttons arranged in a diamond pattern), or even a quintet (a diamond pattern with a center button). It is also not clear how relative and absolute controls should be managed.

One approach is to add a translation layer to the data retrieval functions as well as to the data format functions. So the application can assume that it will always receive the information as two LONGs (say), one describing horizontal glance information and one describing vertical glance information. DirectInput would do the work of mapping the hatswitch or buttons into LONGs. However, this leaves open the question of what numerical value to assign to a glance action triggered by a button rather than an axis.
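To make the translation-layer idea concrete, here is a minimal sketch of how a hatswitch reading or a diamond of four buttons might be reduced to the same pair of values. Everything named here (Glance, kGlanceRange, GlanceFromPov, GlanceFromButtons) is hypothetical and not part of DirectInput. The full-deflection value of 1000 is an arbitrary placeholder for exactly the open question noted above, and the sign conventions (right and up are positive) are likewise assumptions. The only thing borrowed from DirectInput is its existing POV convention: hundredths of a degree clockwise from "up", with a low word of 0xFFFF meaning centered.

```cpp
#include <cmath>

// Hypothetical pair of values handed to the application (stand-ins for two LONGs).
struct Glance {
    long horizontal;  // negative = left, positive = right (assumed convention)
    long vertical;    // negative = down, positive = up (assumed convention)
};

// Arbitrary full-deflection value. What a button press "should" report is
// the open question; 1000 is only a placeholder.
constexpr long kGlanceRange = 1000;

// pov: hatswitch position in hundredths of a degree clockwise from "up".
// A low word of 0xFFFF means the hatswitch is centered.
Glance GlanceFromPov(unsigned long pov)
{
    if ((pov & 0xFFFF) == 0xFFFF) {
        return Glance{0, 0};
    }
    const double kPi = 3.14159265358979323846;
    const double radians = (pov / 100.0) * kPi / 180.0;
    return Glance{ std::lround(kGlanceRange * std::sin(radians)),
                   std::lround(kGlanceRange * std::cos(radians)) };
}

// Four buttons in a diamond reduce to the same representation. A pressed
// button is treated as full deflection; opposing buttons cancel out.
Glance GlanceFromButtons(bool up, bool down, bool left, bool right)
{
    return Glance{ (right ? kGlanceRange : 0) - (left ? kGlanceRange : 0),
                   (up ? kGlanceRange : 0) - (down ? kGlanceRange : 0) };
}
```

With a layer like this, the application reads the same two values no matter which kind of control the device happens to provide.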

Another approach is to split BuildDataFormat into two separate methods, CreateEmptyDataFormat and AddToDataFormat. An application can then, for example, use the following code to select the best control for glancing:

    hres = pdev->CreateEmptyDataFormat(&pdf);
    if (FAILED(hres)) {
        goto panic;
    }

    //
    //  1 = number of semantics we are adding this time
    //  DIADF_ALL = fail if not all semantics are available
    //
    DWORD dwSem = DISEM_POV_GLANCE;
    hres = pdev->AddToDataFormat(pdf, &dwSem, 1, DIADF_ALL);
    if (SUCCEEDED(hres)) {
        GlanceViaPOV = TRUE;
    } else {
        // Couldn't find it on a POV; try it on a pair of axes
        DWORD rgSem[2] = { DISEM_AXIS_GLANCEUPDOWN,
                           DISEM_AXIS_GLANCELEFTRIGHT };
        hres = pdev->AddToDataFormat(pdf, rgSem, 2, DIADF_ALL);
        if (SUCCEEDED(hres)) {
            GlanceViaAxes = TRUE;
        } else {
            // If I were really into it, I could try glancing via buttons
        }
    }

The downside is that this requires application vendors to write more code, which results in the impression that "DirectInput is hard to use". (Be honest, that's what you were thinking when you saw that code snippet.)

Associating Semantics to Controls

Via the Control Panel, the end user can adjust the list of semantics associated with each control. The interface would be a simple list with "Move Up" and "Move Down" buttons (to reorder items) and "Add" and "Remove" buttons (to change the contents of the list).

The Mapping Algorithm

For each control requested by the application, consult the list of all controls on the device not already assigned by a previous step. Among the controls that support the requested semantic, choose the one that lists the semantic earliest in its own priority list. For example, if there are two axes that claim to be usable as "Translate left/right", but one of them lists the capability as its primary behavior, whereas the other lists it in third place, then the behavior will be assigned to the first axis.
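The algorithm can be sketched in a few lines. This is only an illustration under stated assumptions, not a proposed implementation: the names Semantic, Control, kUnassigned and MapSemantics are hypothetical, requests are assumed to be processed in the order the application lists them, and ties between controls that rank a semantic equally are broken in favor of whichever control comes first on the device (the proposal does not say how ties should be resolved).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical types for illustration; none of these names exist in DirectInput.
using Semantic = unsigned long;

struct Control {
    std::vector<Semantic> semantics;  // supported semantics, highest priority first
};

constexpr int kUnassigned = -1;

// For each requested semantic, returns the index of the control chosen for it,
// or kUnassigned if no unassigned control supports that semantic.
std::vector<int> MapSemantics(const std::vector<Control>& controls,
                              const std::vector<Semantic>& requested)
{
    std::vector<bool> used(controls.size(), false);
    std::vector<int> result;
    result.reserve(requested.size());

    for (Semantic want : requested) {
        int best = kUnassigned;
        std::size_t bestRank = 0;

        for (std::size_t c = 0; c < controls.size(); ++c) {
            if (used[c]) continue;  // already assigned by a previous step

            const std::vector<Semantic>& list = controls[c].semantics;
            for (std::size_t rank = 0; rank < list.size(); ++rank) {
                if (list[rank] != want) continue;
                // Prefer the control that lists this semantic earliest.
                if (best == kUnassigned || rank < bestRank) {
                    best = static_cast<int>(c);
                    bestRank = rank;
                }
                break;  // only the first occurrence in a list matters
            }
        }

        if (best != kUnassigned) {
            used[static_cast<std::size_t>(best)] = true;
        }
        result.push_back(best);
    }
    return result;
}
```

Applied to the example above, the axis that lists "Translate left/right" first wins over the axis that lists it third; if the application asked for the same semantic a second time, the remaining axis would be used.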

The List of Semantics

The list of semantics would be agreed upon by the members of the gaming community. Here follows a list of possibilities.

References

Universal Serial Bus HID Usage Tables, Version 1.0, USB Implementers Forum.

DirectX 5.0 SDK, Microsoft Corporation.