Initial revision

Micah Dowty 2009-04-13 07:05:42 +00:00
commit 75775e0dff
139 changed files with 21478 additions and 0 deletions

145
README.txt Normal file

@ -0,0 +1,145 @@
--------------------------------
VMware SVGA Device Developer Kit
--------------------------------
The "VMware SVGA II" device is the virtual graphics card implemented
by all VMware virtualization products. It is a virtual PCI device,
which implements a basic 2D framebuffer, as well as 3D acceleration,
video overlay acceleration, and hardware cursor support.
This is a package of documentation and example code for the VMware
SVGA device's programming model. Currently it consists of some very
basic documentation, and a collection of examples which illustrate the
more advanced features of the device. These examples are written to
run on the "virtual bare metal", without an operating system.
This package is intended for educational purposes, or for people who
are developing 3D drivers. This code won't help you if you're writing
normal user-level apps that you'd like to run inside a virtual
machine. It's for driver authors, and it assumes a reasonable amount
of prior knowledge about graphics hardware.
Requirements
------------
To compile the example code, you'll need a few basic open source tools:
- A recent version of GCC. (I use 4.2. Older versions may
require tweaking the Makefile.rules file slightly.)
- binutils
- GNU Make
- Python
To run the examples, you'll need a recent version of VMware
Workstation, Player, or Fusion. Some of the examples will work on
older versions, but Workstation 6.5.x or Fusion 2.0.x is strongly
recommended.
Contents
--------
* bin/
Precompiled binaries and .vmx files for all examples. These can be
loaded directly into VMware Workstation, Player, or Fusion.
* doc/
Basic SVGA hardware documentation. This includes a text file with
information about the programming model, plus it includes a copy of
a WIOV paper which describes our 3D acceleration architecture.
* lib/metalkit/
Metalkit is a very simple open source OS, which bootstraps the
examples and provides basic hardware support.
* lib/refdriver/
The SVGA "Reference Driver". This is a sample implementation of a
driver for our device, which is used by the examples. It provides
device initialization, an implementation of the low-level FIFO
protocol, and wrappers around common FIFO commands.
If you're writing a driver for the VMware SVGA device, "svga.c"
from this directory is required reading. The FIFO protocol has
many subtle gotchas, and this source file is the only place
where they're publicly documented.
* lib/vmware/
Header files which define VMware's protocols and virtual hardware.
The svga_reg.h and svga3d_reg.h files are (in places, at least)
commented with more information on the programming model.
If you can't find specific documentation or an example on a feature,
this is the next place to look. This is also where to get a complete
list of the supported registers and commands.
* lib/util/
Higher-level utilities built on top of the refdriver layer. This
directory won't contain any novel information about the virtual
hardware, but it does contain some higher-level abstractions used
by the examples, and these abstractions demonstrate some useful
idioms for programming the SVGA device.
* examples/
Each example has a separate subdirectory. You can run "make" in the
top-level directory to compile all examples, or you can build them
individually.
Many of the examples are self-explanatory, but some of them are
not. See the comments at the beginning of the 'main.c' file in each
example.
Development
-----------
This project isn't intended to be a one-time "code drop" from VMware.
Our intent is for the examples in this package to be maintained out in
the open. If we have a bugfix, or a new example that works on released
VMware products, we'll check it in directly to the public repository.
For examples of not-yet-released features, we will be developing on an
internal branch. This branch will be merged to the public repository
shortly after the first release which has working versions of these
features.
License
-------
Except where noted in individual source files, the whole package is
Copyright (C) 1998-2009 VMware, Inc. It is released under the MIT
license:
Permission is hereby granted, free of charge, to any person
obtaining a copy of this software and associated documentation files
(the "Software"), to deal in the Software without restriction,
including without limitation the rights to use, copy, modify, merge,
publish, distribute, sublicense, and/or sell copies of the Software,
and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:
The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
Contact
-------
This project is provided as-is, with no official support from
VMware. However, I will try to answer questions as time permits.
If you have questions or you'd like to submit a patch, feel free
to email me at: micah at vmware.com
--

BIN
bin/2dmark.img Executable file

Binary file not shown.

9
bin/2dmark.vmx Normal file

@ -0,0 +1,9 @@
config.version = 8
virtualHW.version = 7
memsize = 4
displayname = 2dmark
guestOS = other
mks.enable3d = TRUE
floppy0.startConnected = TRUE
floppy0.fileType = file
floppy0.fileName = 2dmark.img

BIN
bin/blit-cube.img Executable file

Binary file not shown.

9
bin/blit-cube.vmx Normal file

@ -0,0 +1,9 @@
config.version = 8
virtualHW.version = 7
memsize = 4
displayname = blit-cube
guestOS = other
mks.enable3d = TRUE
floppy0.startConnected = TRUE
floppy0.fileType = file
floppy0.fileName = blit-cube.img

BIN
bin/bunnies.img Executable file

Binary file not shown.

9
bin/bunnies.vmx Normal file

@ -0,0 +1,9 @@
config.version = 8
virtualHW.version = 7
memsize = 4
displayname = bunnies
guestOS = other
mks.enable3d = TRUE
floppy0.startConnected = TRUE
floppy0.fileType = file
floppy0.fileName = bunnies.img

BIN
bin/cube.img Executable file

Binary file not shown.

9
bin/cube.vmx Normal file

@ -0,0 +1,9 @@
config.version = 8
virtualHW.version = 7
memsize = 4
displayname = cube
guestOS = other
mks.enable3d = TRUE
floppy0.startConnected = TRUE
floppy0.fileType = file
floppy0.fileName = cube.img

BIN
bin/cubemark.img Executable file

Binary file not shown.

9
bin/cubemark.vmx Normal file

@ -0,0 +1,9 @@
config.version = 8
virtualHW.version = 7
memsize = 4
displayname = cubemark
guestOS = other
mks.enable3d = TRUE
floppy0.startConnected = TRUE
floppy0.fileType = file
floppy0.fileName = cubemark.img

BIN
bin/dynamic-vertex-stress.img Executable file

Binary file not shown.

9
bin/dynamic-vertex-stress.vmx Normal file

@ -0,0 +1,9 @@
config.version = 8
virtualHW.version = 7
memsize = 4
displayname = dynamic-vertex-stress
guestOS = other
mks.enable3d = TRUE
floppy0.startConnected = TRUE
floppy0.fileType = file
floppy0.fileName = dynamic-vertex-stress.img

BIN
bin/dynamic-vertex.img Executable file

Binary file not shown.

9
bin/dynamic-vertex.vmx Normal file

@ -0,0 +1,9 @@
config.version = 8
virtualHW.version = 7
memsize = 4
displayname = dynamic-vertex
guestOS = other
mks.enable3d = TRUE
floppy0.startConnected = TRUE
floppy0.fileType = file
floppy0.fileName = dynamic-vertex.img

BIN
bin/fence-stress.img Executable file

Binary file not shown.

9
bin/fence-stress.vmx Normal file

@ -0,0 +1,9 @@
config.version = 8
virtualHW.version = 7
memsize = 4
displayname = fence-stress
guestOS = other
mks.enable3d = TRUE
floppy0.startConnected = TRUE
floppy0.fileType = file
floppy0.fileName = fence-stress.img

BIN
bin/gmr-test.img Executable file

Binary file not shown.

9
bin/gmr-test.vmx Normal file

@ -0,0 +1,9 @@
config.version = 8
virtualHW.version = 7
memsize = 128
displayname = gmr-test
guestOS = other
mks.enable3d = TRUE
floppy0.startConnected = TRUE
floppy0.fileType = file
floppy0.fileName = gmr-test.img

BIN
bin/half-float-test.img Executable file

Binary file not shown.

9
bin/half-float-test.vmx Normal file

@ -0,0 +1,9 @@
config.version = 8
virtualHW.version = 7
memsize = 4
displayname = half-float-test
guestOS = other
mks.enable3d = TRUE
floppy0.startConnected = TRUE
floppy0.fileType = file
floppy0.fileName = half-float-test.img

BIN
bin/pong.img Executable file

Binary file not shown.

9
bin/pong.vmx Normal file

@ -0,0 +1,9 @@
config.version = 8
virtualHW.version = 7
memsize = 4
displayname = pong
guestOS = other
mks.enable3d = TRUE
floppy0.startConnected = TRUE
floppy0.fileType = file
floppy0.fileName = pong.img

BIN
bin/presentReadback.img Executable file

Binary file not shown.

9
bin/presentReadback.vmx Normal file

@ -0,0 +1,9 @@
config.version = 8
virtualHW.version = 7
memsize = 4
displayname = presentReadback
guestOS = other
mks.enable3d = TRUE
floppy0.startConnected = TRUE
floppy0.fileType = file
floppy0.fileName = presentReadback.img

BIN
bin/simple-shaders.img Executable file

Binary file not shown.

9
bin/simple-shaders.vmx Normal file

@ -0,0 +1,9 @@
config.version = 8
virtualHW.version = 7
memsize = 4
displayname = simple-shaders
guestOS = other
mks.enable3d = TRUE
floppy0.startConnected = TRUE
floppy0.fileType = file
floppy0.fileName = simple-shaders.img

BIN
bin/simple_blit.img Executable file

Binary file not shown.

9
bin/simple_blit.vmx Normal file

@ -0,0 +1,9 @@
config.version = 8
virtualHW.version = 7
memsize = 4
displayname = simple_blit
guestOS = other
mks.enable3d = TRUE
floppy0.startConnected = TRUE
floppy0.fileType = file
floppy0.fileName = simple_blit.img

BIN
bin/video-formats.img Executable file

Binary file not shown.

9
bin/video-formats.vmx Normal file

@ -0,0 +1,9 @@
config.version = 8
virtualHW.version = 7
memsize = 4
displayname = video-formats
guestOS = other
mks.enable3d = TRUE
floppy0.startConnected = TRUE
floppy0.fileType = file
floppy0.fileName = video-formats.img

BIN
bin/video-sync.img Executable file

Binary file not shown.

9
bin/video-sync.vmx Normal file

@ -0,0 +1,9 @@
config.version = 8
virtualHW.version = 7
memsize = 4
displayname = video-sync
guestOS = other
mks.enable3d = TRUE
floppy0.startConnected = TRUE
floppy0.fileType = file
floppy0.fileName = video-sync.img

BIN
doc/gpu-wiov.pdf Normal file

Binary file not shown.

872
doc/svga_interface.txt Executable file

@ -0,0 +1,872 @@
Copyright (C) 1999-2009 VMware, Inc.
All Rights Reserved
VMware SVGA Device Interface and Programming Model
--------------------------------------------------
Revision 3, 2009-04-12
Table of Contents:
1. Introduction
2. Examples and Reference Implementation
3. Virtual Hardware Overview
4. 2D Graphics Model
5. 3D Graphics Model
6. Overview of SVGA3D features
7. Programming the VMware SVGA Device
XXX - Todo
----------
This document does not yet describe the 3D hardware in great
detail. It is an architectural overview. See the accompanying sample
and reference code for details.
Section (7) is biased toward describing much older features of the
virtual hardware. Many new capability flags and FIFO commands have
been added, and these are sparsely documented in svga_reg.h.
1. Introduction
----------------
This document describes the virtual graphics adapter interface which
is implemented by VMware products. The VMware SVGA Device is a virtual
PCI video card. It does not directly correspond to any real video
card, but it serves as an interface for exposing accelerated graphics
capabilities to virtual machines in a hardware-independent way.
In its simplest form, the VMware SVGA Device can be used as a basic
memory-mapped framebuffer. In this mode, the main advantage of
VMware's SVGA device over alternatives like VBE is that the virtual
machine can explicitly indicate which areas of the screen have changed
by sending update rectangles through the device's command FIFO. This
allows VMware products to avoid reading areas of the framebuffer which
haven't been modified by the virtualized OS.
The VMware SVGA device also supports several advanced features:
- Accelerated video overlays
- 2D acceleration
- Synchronization primitives
- DMA transfers
- Device-independent 3D acceleration, with shaders
- Multiple monitors
- Desktop resizing
2. Examples and Reference Implementation
-----------------------------------------
This document is not yet complete, in that it doesn't describe the
entire SVGA device interface in detail. It is an architectural
overview of the entire device, as well as an introduction to a few
basic areas of programming for the device.
For deeper details, see the attached example code. The "examples"
directory contains individual example applications which show various
features in action. The "lib" directory contains support code for the
example applications.
Some of this support code is designed to act as a reference
implementation. For example, the process of writing to the command
FIFO safely and efficiently is very complicated. The attached
reference implementation is a must-read for anyone attempting to write
their own driver for the SVGA device.
For simplicity and OS-neutrality, all examples compile to floppy disk
images which execute "on the bare metal" in a VMware virtual
machine. There are no run-time dependencies. At compile-time, most of
the examples only require a GNU toolchain (GCC and Binutils). Some of
the examples require Python at compile-time.
Each example will generate a .vmx virtual machine configuration file
which can be used to boot it in VMware Workstation or Fusion.
The included example code focuses on advanced features of the SVGA
device, such as 3D and synchronization primitives. There are also a
couple of examples that demonstrate 3D graphics and video overlays.
For more examples of basic 2D usage, the Xorg driver is also a good
reference.
Header files and reference implementation files in 'lib':
* svga_reg.h
SVGA register definitions, SVGA capabilities, and FIFO command
definitions.
* svga_overlay.h
Definitions required to use the SVGA device's hardware video overlay
support.
* svga_escape.h
A list of definitions for the SVGA Escape commands, a way to send
arbitrary data over the SVGA command FIFO. Escapes are used for video
overlays, for vendor-specific extensions to the SVGA device, and for
various tools internal to VMware.
* svga3d_reg.h
Defines the SVGA3D protocol, the set of FIFO commands used for hardware
3D acceleration.
* svga3d_shaderdefs.h
Defines the bytecode format for SVGA3D shaders. This is used for
accelerated 3D with programmable vertex and pixel pipelines.
* svga.c
Reference implementation of low-level SVGA device functionality.
This contains sample code for device initialization, safe and
efficient FIFO writes, and various synchronization primitives.
* svga3d.c
Reference implementation for the SVGA3D protocol. This file
uses the FIFO primitives in svga.c to speak the SVGA3D protocol.
Includes a lot of in-line documentation.
* svga3dutil.c
This is a collection of high-level utilities which provide usage
examples for svga3d.c and svga.c, and which demonstrate common
SVGA3D idioms.
3. Virtual Hardware Overview
-----------------------------
The VMware SVGA Device is a virtual PCI device. It provides the
following low-level hardware features, which are used to implement
various feature-specific protocols for 2D, 3D, and video overlays:
* I/O space, at PCI Base Address Register 0 (BAR0)
There are only a few I/O ports. Besides the ports used to access
registers, these are generally either legacy features, or they are for
I/O which is performance critical but may have side-effects, such as
clearing IRQs after they occur.
* Registers, accessed indirectly via INDEX and VALUE I/O ports.
The device's register space is the principal method by which
configuration takes place. In general, registers are for actions which
may have side-effects and which take place synchronously with the CPU.
* Guest Framebuffer (BAR1)
The SVGA device itself owns a variable amount of "framebuffer" memory,
up to a maximum of 128MB. This memory size is fixed at power-on. The
memory exists outside of the virtual machine's "main memory", and it's
mapped into PCI space via BAR1. The size of this framebuffer may be
determined either by probing BAR1 in typical PCI fashion, or by
reading SVGA_REG_FB_SIZE.
The beginning of framebuffer memory is reserved for the 2D
framebuffer. The rest of framebuffer memory may be used as buffer
space for DMA operations.
* Command FIFO (BAR2)
The SVGA device can be thought of as a co-processor which executes
commands asynchronously with the virtual machine's CPU. To enqueue
commands for this coprocessor, the SVGA device uses another
device-owned memory region which is mapped into PCI space.
The command FIFO is usually much smaller than the framebuffer. While
the framebuffer usually ranges from 4MB to 128MB, the FIFO ranges in
size from 256KB to 2MB. Like the framebuffer, the FIFO size is fixed
at power-on. The FIFO is mapped via PCI BAR2.
* FIFO Registers
The beginning of FIFO memory is reserved for a set of "registers".
Some of these are used to implement the FIFO command queueing
protocol, but many of these are used for other purposes. The main
difference between FIFO registers and non-FIFO registers is that FIFO
registers are backed by normal RAM whereas non-FIFO registers require
I/O operations to access. This means that only non-FIFO registers can
have side-effects, but FIFO registers are much more efficient when
side-effects aren't necessary.
The FIFO register space is variable-sized. The driver is responsible
for partitioning FIFO memory into register space and command space.
* Synchronization Primitives
Conceptually, the part of the SVGA device which processes FIFO
commands can be thought of as a coprocessor or a separate thread of
execution. The virtual machine may need to:
- Wake up the FIFO processor when it's sleeping, to ensure that
new commands are processed with low-latency. (FIFO doorbell)
- Check whether a previously enqueued FIFO command has been
processed. (FIFO fence)
- Wait until the FIFO processor has passed a particular
command. (Sync to fence)
- Wait until more space is available in the FIFO. (Wait for
FIFO progress)
* Interrupts (Workstation 6.5 virtual machines and later only)
On virtual machines which have been upgraded to Workstation 6.5 virtual
hardware, the SVGA device provides an IRQ which can be used to notify
the virtual machine when a synchronization event occurs. This allows
implementing operations like "Sync to fence" without interfering with
a virtual machine's ability to multitask.
On older virtual hardware versions, the SVGA device only supports a
"legacy sync" mechanism, in which a particular register access has the
side-effect of waiting for host FIFO processing to occur. This older
mechanism completely halts the virtual machine's CPU while the FIFO is
being processed.
* Physical VRAM
The VMware SVGA device provides management of physical VRAM resources
via "surface objects". However, physical VRAM is never directly visible
to the virtual machine; it can only be accessed via DMA
transfers.
Note that framebuffer memory is simply a convenient place to put DMA
buffers. Even if a virtual machine only has 16MB of framebuffer memory
allocated to it, it could be using gigabytes of physical VRAM if that
memory is available to the physical GPU.
* DMA engine
The VMware SVGA device can asynchronously transfer surface data
between physical VRAM and guest-visible memory. This guest-visible
memory could be part of framebuffer memory, or it could be part of
guest system memory.
The DMA engine uses a "Guest Pointer" abstraction to refer to any
guest-visible memory. Guest pointers consist of an offset and a Guest
Memory Region (GMR) ID. There is a pre-defined GMR which refers to
framebuffer memory. The virtual machine can create additional GMRs to
refer to regions of system memory which may or may not be physically
contiguous.
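As a rough illustration of the guest pointer abstraction, here is a
minimal C sketch. The type name, the helper names, and the numeric ID
used for the pre-defined framebuffer GMR are all assumptions made for
this example; the authoritative definitions are in svga_reg.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder ID for the pre-defined framebuffer GMR. This value is an
 * assumption for illustration; take the real one from svga_reg.h. */
#define EXAMPLE_GMR_FRAMEBUFFER 0xFFFFFFFEu

/* A guest pointer: which memory region, and where inside it. */
typedef struct {
    uint32_t gmr_id;   /* Guest Memory Region ID */
    uint32_t offset;   /* Byte offset within that region */
} ExampleGuestPtr;

/* Build a guest pointer that refers to framebuffer memory. */
ExampleGuestPtr example_fb_ptr(uint32_t offset)
{
    ExampleGuestPtr ptr = { EXAMPLE_GMR_FRAMEBUFFER, offset };
    return ptr;
}

/* Check that `len` bytes starting at the pointer fit inside a region of
 * `region_size` bytes. Written so that it cannot overflow 32 bits. */
bool example_range_ok(ExampleGuestPtr ptr, uint32_t len, uint32_t region_size)
{
    assert(region_size > 0);
    return ptr.offset <= region_size && len <= region_size - ptr.offset;
}
```

A driver would typically run a check like example_range_ok() before
handing a DMA buffer description to the device, since the device
executes the transfer asynchronously.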
4. 2D Graphics Model
---------------------
Conceptually, the 2D portion of the VMware SVGA device is a compositor
which displays a user-visible image composed of several planes. From
back to front, those planes are:
- The 2D framebuffer
- 3D regions
- Video overlay regions
- The virtual hardware mouse cursor ("cursor bypass")
- The physical hardware mouse cursor ("host cursor")
It is important to note that host-executed 2D graphics commands do not
necessarily modify the 2D framebuffer; they may write directly to the
physical display or display window. Like a physical video card, the
VMware SVGA device's framebuffer is never modified by a mouse cursor
or video overlay. Unlike a physical video card, however, 3D display
regions in the VMware SVGA device may or may not modify the 2D
framebuffer.
The following basic 2D operations are available:
* Update
Redraw a portion of the screen, using data from the 2D
framebuffer. Any update rectangles are subtracted from the set of
on-screen 3D regions, so 2D updates always overwrite 3D regions. 2D
updates still appear behind video overlays and mouse cursors.
An update command must be sent any time the driver wishes to make
changes to the 2D framebuffer available. The user-visible screen is
not guaranteed to update unless an explicit update command is sent.
Also note that the SVGA device is allowed to read the 2D framebuffer
even if no update command has been sent. For example, if the virtual
machine is running in a partially obscured window, the SVGA device
will read the 2D framebuffer immediately when the window is uncovered
in order to draw the newly visible portion of the VM's window.
This means that the virtual machine must not treat the 2D framebuffer
as a back-buffer. It must contain a completely rendered image at all
times.
There is not yet any way to synchronize updates with the vertical
refresh. Current VMware SVGA devices may suffer from tearing
artifacts.
* 2D acceleration operations
These include fills, copies, and various kinds of blits. All 2D
acceleration operations happen directly on the user-visible screen,
not in 2D framebuffer memory.
Use of the 2D acceleration operations is encouraged only in very
limited circumstances, for example when moving or scrolling
windows. Mixing accelerated and unaccelerated 2D operations is
difficult to implement properly, and incurs a significant
synchronization penalty.
* Present 3D surface
"Present" is an SVGA3D command which copies a finished image from an
SVGA3D surface to the user-visible screen. It may or may not update
the 2D framebuffer in the process.
Present commands effectively create a 3D overlay on top of part of the
2D framebuffer. This overlay can be overwritten by Update commands or
by other Present commands.
Present is the only way in which the 2D and 3D portions of the VMware
SVGA device interact.
* Video overlay operations
The SVGA device defines a group of virtual "video overlay units", each
of which can color-convert, scale, and display a frame of YUV video
overlaid with the 2D framebuffer. Overlay units each have a set of
virtual registers which are configured using the commands in
svga_overlay.h.
* Virtual mouse cursor operations
The virtual mouse cursor is an overlay which shows the SVGA device's
current cursor image at a particular location. It may not be
hardware-accelerated by the physical machine, and it does not
necessarily correspond with the position of the user's physical mouse.
There are three "Cursor Bypass" mechanisms by which the virtual
machine can set the position of the virtual mouse cursor. Cursor
bypass 1 did not follow the overlay model described above, and it has
long been obsolete. Cursor bypass 2 and 3 are functionally equivalent,
except that cursor bypass 2 operates via non-FIFO registers and cursor
bypass 3 operates via FIFO registers. If cursor bypass 3 is supported
(SVGA_FIFO_CAP_CURSOR_BYPASS_3), it should be used instead of cursor
bypass 2.
For all forms of cursor bypass, the cursor image is defined by
SVGA_CMD_DEFINE_CURSOR.
* Physical mouse cursor operations
The virtual machine does not define the location of the physical mouse
cursor, but it can define the cursor image and hide/show it. It does
so using the SVGA_CMD_DEFINE_CURSOR and SVGA_CMD_DISPLAY_CURSOR
commands.
5. 3D Graphics Model
---------------------
The VMware SVGA device supports hardware-independent accelerated 3D
graphics via the "SVGA3D" protocol. This is a set of extended FIFO
commands. SVGA3D utilizes the same underlying FIFO and synchronization
primitives as the 2D portion of the SVGA device, but the 2D and 3D
portions of the device are largely independent.
The SVGA3D protocol is relatively high-level. The device is
responsible for tracking render state among multiple contexts, for
managing physical VRAM, and for implementing both fixed-function and
programmable vertex and pixel processing.
The SVGA3D protocol is designed to be vendor- and API-neutral, but for
convenience it has been designed to be compatible with Direct3D in
most places. The shader bytecode is fully binary-compatible with
Direct3D bytecode, and most render states are identical to those
defined by Direct3D.
Note that the VMware SVGA device still supports 3D acceleration on all
operating systems that VMware products run on. Internally, hardware
accelerated 3D is implemented on top of the OpenGL graphics API.
To summarize the SVGA3D device's design:
* SVGA3D is an extension to the VMware SVGA device's command FIFO
protocol.
* In some ways it looks like a graphics API:
o SVGA3D device manages all physical VRAM allocation.
o High-level render states, relatively high-level shader bytecode.
* In some ways it looks like hardware:
o All commands are executed asynchronously.
o Driver must track memory ownership, schedule DMA transfers.
o All physical VRAM represented by generic "Surface" objects
* Supports both fixed-function and programmable vertex and fragment
pipelines.
6. Overview of SVGA3D features
------------------------------
* Capabilities
o Extensible key/value pair list describes the SVGA3D device's
capabilities.
o Number of texture image units, max texture size, number of
lights, texture formats, etc.
* Surfaces
o Formats: 8-bit RGB/RGBA, 16-bit RGB/RGBA, depth, packed
depth/stencil, luminance/alpha, DXT compressed, signed, floating
point, etc.
o Supports 3D (volume) textures and cube maps.
o Surfaces are also used as vertex and index buffers.
o Generic DMA blits between surfaces and system memory or
offscreen "virtual VRAM".
o Generic surface-to-surface blits, with and without scaling.
* Contexts
o Surfaces are global, other objects are per-context, render
states are per-context.
o Commands to create/delete contexts.
* Render State (Mostly Direct3D-style)
o Matrices
o Texture stage states: Filtering, combiners, LOD, gamma
correction, etc.
o Stencil, depth, culling, blending, lighting, materials, etc.
* Render Targets
o Few restrictions on which surfaces can be used as render
targets (More lenient than OpenGL FBOs)
o Supports depth, stencil, color buffer(s)
* Present
o The "present" operation is a blit from an SVGA3D surface back
to the user-visible screen.
o May or may not update the guest-visible 2D framebuffer.
* Occlusion queries
o Submitted via FIFO commands
o Results returned asynchronously: a results structure is filled
in via DMA.
* Shaders
o We define an "SVGA3D bytecode", which is binary-compatible
with Direct3D's shader bytecode.
o SVGA3D may define extensions to the bytecode format in the future.
* Drawing
o A single generic "draw primitives" command performs a list of
rendering operations from a list of vertex buffers.
o Index buffer is optional.
o Similar to drawing with OpenGL vertex arrays and VBOs.
7. Programming the VMware SVGA Device
-------------------------------------
1. Reading/writing a register:
The SVGA registers are addressed by an index/value pair of 32 bit
registers in the IO address space.
The 0710 VMware SVGA chipset (PCI device ID PCI_DEVICE_ID_VMWARE_SVGA) has
its index and value ports hardcoded at:
index: SVGA_LEGACY_BASE_PORT + 4 * SVGA_INDEX_PORT
value: SVGA_LEGACY_BASE_PORT + 4 * SVGA_VALUE_PORT
The 0405 VMware SVGA chipset (PCI device ID PCI_DEVICE_ID_VMWARE_SVGA2)
determines its index and value ports as a function of the first base
address register in its PCI configuration space as:
index: <Base Address Register 0> + SVGA_INDEX_PORT
value: <Base Address Register 0> + SVGA_VALUE_PORT
To read a register:
Set the index port to the index of the register, using a dword OUT
Do a dword IN from the value port
To write a register:
Set the index port to the index of the register, using a dword OUT
Do a dword OUT to the value port
Example, setting the width to 1024:
mov eax, SVGA_REG_WIDTH
mov edx, <SVGA Index Port>
out dx, eax
mov eax, 1024
mov edx, <SVGA Value Port>
out dx, eax
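The same index/value sequence can be sketched in C. Real port I/O
needs privileged IN/OUT instructions, so this sketch substitutes a
small simulated device (FakeSvga, fake_out32, fake_in32) so that it
can run anywhere. Those names, the port numbering, and the register
index EX_REG_WIDTH are assumptions for illustration only; a real
driver would use the definitions in svga_reg.h and its platform's
dword port I/O routines.

```c
#include <assert.h>
#include <stdint.h>

enum { PORT_INDEX = 0, PORT_VALUE = 1 };  /* illustrative port numbers */
enum { FAKE_NUM_REGS = 32 };              /* size of the simulated register file */
enum { EX_REG_WIDTH = 2 };                /* placeholder register index */

/* Simulated device: an index latch plus a register file. */
typedef struct {
    uint32_t index;
    uint32_t regs[FAKE_NUM_REGS];
} FakeSvga;

/* Stand-in for a dword OUT instruction. */
static void fake_out32(FakeSvga *dev, unsigned port, uint32_t value)
{
    if (port == PORT_INDEX) {
        dev->index = value;
    } else {
        assert(port == PORT_VALUE && dev->index < FAKE_NUM_REGS);
        dev->regs[dev->index] = value;
    }
}

/* Stand-in for a dword IN instruction. */
static uint32_t fake_in32(const FakeSvga *dev, unsigned port)
{
    if (port == PORT_INDEX) {
        return dev->index;
    }
    assert(port == PORT_VALUE && dev->index < FAKE_NUM_REGS);
    return dev->regs[dev->index];
}

/* Read a register: select it via the index port, then IN from the value port. */
uint32_t svga_read_reg(FakeSvga *dev, uint32_t reg)
{
    fake_out32(dev, PORT_INDEX, reg);
    return fake_in32(dev, PORT_VALUE);
}

/* Write a register: select it via the index port, then OUT to the value port. */
void svga_write_reg(FakeSvga *dev, uint32_t reg, uint32_t value)
{
    fake_out32(dev, PORT_INDEX, reg);
    fake_out32(dev, PORT_VALUE, value);
}
```

With these helpers, the assembly example above corresponds to
svga_write_reg(&dev, EX_REG_WIDTH, 1024). Note that the index/value
pair is shared state, so a driver which touches registers from more
than one context would need to serialize these two-step accesses.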
2. Initialization
Check the version number
loop:
Write into SVGA_REG_ID the maximum SVGA_ID_* the driver supports.
Read from SVGA_REG_ID.
Check if it is the value you wrote.
If yes, VMware SVGA device supports it
If no, decrement SVGA_ID_* and goto loop
This algorithm converges.
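The negotiation loop above might look like the following in C. This
is only a sketch: FakeIdReg models the ID register under the
assumption that a write "sticks" only when the device supports the
written ID, and the IDs are treated as small ordered integers rather
than the real SVGA_ID_* values from svga_reg.h.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated SVGA_REG_ID. Assumption: writing an unsupported ID leaves
 * the register unchanged, so reading it back reveals the mismatch. */
typedef struct {
    uint32_t max_supported_id;
    uint32_t current_id;
} FakeIdReg;

static void write_id(FakeIdReg *reg, uint32_t id)
{
    if (id <= reg->max_supported_id) {
        reg->current_id = id;
    }
}

static uint32_t read_id(const FakeIdReg *reg)
{
    return reg->current_id;
}

/* Try IDs from driver_max down to driver_min. Returns the first ID the
 * device accepts, or -1 if none of them is accepted. */
int64_t svga_negotiate_id(FakeIdReg *reg, uint32_t driver_max, uint32_t driver_min)
{
    assert(driver_min <= driver_max);

    for (uint32_t id = driver_max; ; id--) {
        write_id(reg, id);
        if (read_id(reg) == id) {
            return (int64_t)id;
        }
        if (id == driver_min) {
            return -1;   /* Ran out of IDs the driver understands */
        }
    }
}
```

The loop terminates because the candidate ID strictly decreases and
stops at the lowest ID the driver supports.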
Map the frame buffer and the command FIFO
Read SVGA_REG_FB_START, SVGA_REG_FB_SIZE, SVGA_REG_MEM_START,
SVGA_REG_MEM_SIZE.
Map the frame buffer (FB) and the FIFO memory (MEM).
This step must occur after the version negotiation above, since by
default the device is in a legacy-compatibility mode in which there
is no command FIFO.
Get the device capabilities and frame buffer dimensions
Read SVGA_REG_CAPABILITIES, SVGA_REG_MAX_WIDTH, SVGA_REG_MAX_HEIGHT,
and SVGA_REG_HOST_BITS_PER_PIXEL / SVGA_REG_BITS_PER_PIXEL.
Note: The capabilities can and do change without the PCI device ID
changing or the SVGA_REG_ID changing. A driver should always check
the capabilities register when loading before expecting any
capabilities-determined feature to be available. See below for a list
of capabilities as of this writing.
Note: If SVGA_CAP_8BIT_EMULATION is not set, then it is possible that
SVGA_REG_HOST_BITS_PER_PIXEL does not exist and
SVGA_REG_BITS_PER_PIXEL should be read instead.
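A driver might wrap these capability checks in small helpers, as in
the sketch below. The flag value and register indices are
placeholders invented for this example; the real ones are defined in
svga_reg.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder values, for illustration only. */
#define EX_CAP_8BIT_EMULATION (1u << 0)
enum { EX_REG_BITS_PER_PIXEL = 1, EX_REG_HOST_BITS_PER_PIXEL = 2 };

/* True if every bit in `flag` is set in the capabilities word. */
bool svga_has_cap(uint32_t caps, uint32_t flag)
{
    assert(flag != 0);
    return (caps & flag) == flag;
}

/* Pick which register to read for the host's pixel depth: the host
 * register may not exist unless 8-bit emulation is advertised. */
uint32_t svga_bpp_register(uint32_t caps)
{
    return svga_has_cap(caps, EX_CAP_8BIT_EMULATION)
               ? EX_REG_HOST_BITS_PER_PIXEL
               : EX_REG_BITS_PER_PIXEL;
}
```

The capabilities word should be read from the device each time the
driver loads, rather than inferred from the PCI device ID or the
negotiated SVGA_REG_ID.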
Optional: Report the Guest Operating System
Write SVGA_REG_GUEST_ID with the appropriate value from <guest_os.h>.
While not required in any way, this is useful information for the
virtual machine to have available for reporting and sanity checking
purposes.
SetMode
Set SVGA_REG_WIDTH, SVGA_REG_HEIGHT, SVGA_REG_BITS_PER_PIXEL
Read SVGA_REG_FB_OFFSET
(SVGA_REG_FB_OFFSET is the offset from SVGA_REG_FB_START of the
visible portion of the frame buffer)
Read SVGA_REG_BYTES_PER_LINE, SVGA_REG_DEPTH, SVGA_REG_PSEUDOCOLOR,
SVGA_REG_RED_MASK, SVGA_REG_GREEN_MASK, SVGA_REG_BLUE_MASK
Note: SVGA_REG_BITS_PER_PIXEL is readonly if
SVGA_CAP_8BIT_EMULATION is not set in the capabilities register. Even
if it is set, values other than 8 and SVGA_REG_HOST_BITS_PER_PIXEL
will be ignored.
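Once the mode is set, the values read back are enough to locate any
pixel in the mapped frame buffer. The sketch below assumes a packed
pixel format whose depth is a whole number of bytes; the ExampleMode
structure and function name are hypothetical, and the fields are
simply copies of the registers named above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values read back from the device after SetMode. */
typedef struct {
    uint32_t fb_offset;       /* SVGA_REG_FB_OFFSET */
    uint32_t bytes_per_line;  /* SVGA_REG_BYTES_PER_LINE */
    uint32_t bits_per_pixel;  /* SVGA_REG_BITS_PER_PIXEL */
    uint32_t width;           /* SVGA_REG_WIDTH */
    uint32_t height;          /* SVGA_REG_HEIGHT */
} ExampleMode;

/* Byte offset of pixel (x, y), relative to SVGA_REG_FB_START. */
size_t example_pixel_offset(const ExampleMode *mode, uint32_t x, uint32_t y)
{
    assert(x < mode->width && y < mode->height);
    assert(mode->bits_per_pixel % 8 == 0);

    return (size_t)mode->fb_offset
         + (size_t)y * mode->bytes_per_line
         + (size_t)x * (mode->bits_per_pixel / 8);
}
```

Using the reported bytes-per-line value, rather than computing
width * bytes-per-pixel, keeps the calculation correct if the device
pads each scanline.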
Enable SVGA
Set SVGA_REG_ENABLE to 1
(to disable SVGA, set SVGA_REG_ENABLE to 0. Setting SVGA_REG_ENABLE
to 0 also enables VGA.)
Initialize the command FIFO
The FIFO is exclusively dword (32-bit) aligned. The first four
dwords define the portion of the MEM area that is used for the
command FIFO. These values are all byte offsets from the
start of the MEM area.
A minimum sized FIFO would have these values:
mem[SVGA_FIFO_MIN] = 16;
mem[SVGA_FIFO_MAX] = 16 + (10 * 1024);
mem[SVGA_FIFO_NEXT_CMD] = 16;
mem[SVGA_FIFO_STOP] = 16;
Various addresses near the beginning of the FIFO are defined as
"FIFO registers" with special meaning. If the driver wishes to
take advantage of the special meaning of these addresses rather
than using them as part of the command FIFO, the driver must
reserve space for these registers when setting up the FIFO.
Typically the driver will set MIN to SVGA_FIFO_NUM_REGS*4.
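The FIFO setup above can be sketched like this. It is an illustration
only: the four indices are placeholders standing in for the SVGA_FIFO_*
constants, fifo_init() is a hypothetical helper, and using the whole MEM
area as the FIFO (MAX equal to the region size) is an assumption of this
example rather than a requirement.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder word indices for the first four FIFO registers. */
enum { FIFO_MIN, FIFO_MAX, FIFO_NEXT_CMD, FIFO_STOP };

/*
 * Initialize an empty command FIFO in 'mem'. 'numRegs' is the number of
 * dwords to reserve for FIFO registers (at least the four above), so the
 * command area starts at byte offset numRegs * 4.
 */
static void fifo_init(uint32_t *mem, uint32_t memBytes, uint32_t numRegs)
{
    uint32_t min = numRegs * 4u;

    assert(numRegs >= 4);
    assert(memBytes % 4u == 0 && memBytes > min);

    mem[FIFO_MIN]      = min;
    mem[FIFO_MAX]      = memBytes;
    mem[FIFO_NEXT_CMD] = min;   /* NEXT_CMD == STOP means empty */
    mem[FIFO_STOP]     = min;
}
```

Passing numRegs = 4 reproduces the MIN = 16 layout shown above. A driver
that uses the extended FIFO registers would pass a larger count.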
Report the guest 3D version
If your driver supports 3D, write the latest supported 3D
version (SVGA3D_HWVERSION_CURRENT) to the
SVGA_FIFO_GUEST_3D_HWVERSION register.
Enable the command FIFO
Set SVGA_REG_CONFIG_DONE to 1 after these values have been set.
Note: Setting SVGA_REG_CONFIG_DONE to 0 will stop the device from
reading the FIFO until it is reinitialized and SVGA_REG_CONFIG_DONE is
set to 1 again.
3. SVGA command FIFO protocol
The FIFO is empty when SVGA_FIFO_NEXT_CMD == SVGA_FIFO_STOP. The
driver writes commands to the FIFO starting at the offset specified
by SVGA_FIFO_NEXT_CMD, and then increments SVGA_FIFO_NEXT_CMD.
The FIFO is full when SVGA_FIFO_NEXT_CMD is one word before SVGA_FIFO_STOP.
When the FIFO becomes full, the driver must wait for space to become
available. It can do this via various methods (busy-wait, legacy sync)
but the preferred method is to use the FIFO_PROGRESS interrupt.
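As a rough sketch, the empty and full conditions and a one-word write
might look like the code below. It assumes that the FIFO is a ring which
wraps from MAX back to MIN, and it operates on an ordinary array. The
helper names and index values are hypothetical. A real driver would also
need volatile accesses and would wait (rather than fail) when the FIFO
is full.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder word indices for the first four FIFO registers. */
enum { FIFO_MIN, FIFO_MAX, FIFO_NEXT_CMD, FIFO_STOP };

static int fifo_is_empty(const uint32_t *mem)
{
    return mem[FIFO_NEXT_CMD] == mem[FIFO_STOP];
}

/* Advance a byte offset by one word, wrapping from MAX to MIN. */
static uint32_t fifo_advance(const uint32_t *mem, uint32_t offset)
{
    offset += 4;
    if (offset == mem[FIFO_MAX])
        offset = mem[FIFO_MIN];
    return offset;
}

/* Full when NEXT_CMD is one word before STOP. */
static int fifo_is_full(const uint32_t *mem)
{
    return fifo_advance(mem, mem[FIFO_NEXT_CMD]) == mem[FIFO_STOP];
}

/* Write one word. Returns 1 on success, 0 if the FIFO is full. */
static int fifo_write_word(uint32_t *mem, uint32_t value)
{
    uint32_t next = mem[FIFO_NEXT_CMD];

    assert(next % 4u == 0);
    if (fifo_is_full(mem))
        return 0;
    mem[next / 4u] = value;                        /* store the data first */
    mem[FIFO_NEXT_CMD] = fifo_advance(mem, next);  /* then publish it */
    return 1;
}
```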
The SVGA device does not guarantee that all of FIFO memory is valid
at all times. The device is free to discard the contents of any memory
which is not part of the active portion of the FIFO. The active portion
of the FIFO is defined as the region with valid commands (starting
at SVGA_FIFO_STOP and ending at SVGA_FIFO_NEXT_CMD) plus the reserved
portion of the FIFO.
By default, only one word of memory is 'reserved'. If the FIFO supports
the SVGA_FIFO_CAP_RESERVE capability, the device supports reserving
driver-defined amounts of memory. If both the device and driver support
this operation, it's possible to write multiple words of data between
updates to the FIFO control registers.
The simplest way to use the FIFO is to write one word at a time, but the
highest-performance way to use the FIFO is to reserve enough space for
an entire command or group of commands, write the commands directly to
FIFO memory, then "commit" the command(s) by updating the FIFO control
registers.
A reference implementation of this reserve/commit algorithm is provided
in svga.c, in SVGA_FIFOReserve() and SVGA_FIFOCommit(). In the common
case, this algorithm lets drivers assemble commands directly in FIFO
memory without any additional copies or memory allocation.
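To give a feel for the decision a reserve step has to make, here is a
simplified sketch that only classifies a request. It is not a copy of
the reference implementation in svga.c. The ReserveResult type, the
function name, and the index values are hypothetical, and the ring is
assumed to wrap from MAX back to MIN.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder word indices for the first four FIFO registers. */
enum { FIFO_MIN, FIFO_MAX, FIFO_NEXT_CMD, FIFO_STOP };

typedef enum {
    RESERVE_IN_PLACE,    /* contiguous space exists at NEXT_CMD */
    RESERVE_NEEDS_COPY,  /* space exists, but only across the wrap */
    RESERVE_FIFO_FULL    /* caller must wait for the device */
} ReserveResult;

static ReserveResult fifo_classify_reserve(const uint32_t *mem,
                                           uint32_t bytes)
{
    uint32_t min = mem[FIFO_MIN], max = mem[FIFO_MAX];
    uint32_t next = mem[FIFO_NEXT_CMD], stop = mem[FIFO_STOP];

    assert(bytes > 0 && bytes % 4u == 0);

    if (next >= stop) {
        /* Free space is [next, max) followed by [min, stop). */
        if (next + bytes < max || (next + bytes == max && stop > min))
            return RESERVE_IN_PLACE;
        if ((max - next) + (stop - min) <= bytes)
            return RESERVE_FIFO_FULL;
        return RESERVE_NEEDS_COPY;
    }

    /* Free space is the single run [next, stop). */
    if (next + bytes < stop)
        return RESERVE_IN_PLACE;
    return RESERVE_FIFO_FULL;
}
```

The comparisons are strict because a commit must never make NEXT_CMD
equal to STOP, which would be indistinguishable from an empty FIFO.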
4. Synchronization
The primary synchronization primitive defined by the SVGA device is
"Sync to fence". A "fence" is a numbered marker inserted into the FIFO
command stream. The driver can insert fences at any time, and efficiently
determine the value of the last fence processed by the device.
"Sync to fence" is the process of waiting for a particular fence to be
processed. This may be important for several reasons:
- Flow control. For interactivity, it is important to put an upper
limit on the amount by which the device may lag the application.
- Waiting for DMA completion. If the driver needs to recycle a DMA
buffer or complete a DMA operation synchronously, it must sync
to a fence which occurred after the DMA operation in the command
stream.
- Waiting for accelerated 2D operations. If a 2D driver needs to
write to a portion of the framebuffer which is affected by
an accelerated blit, it should sync to a fence which occurred
after the blit.
There are multiple possible implementations of Sync to Fence, depending
on the capabilities of the SVGA device you're driving. Very old versions
of the VMware SVGA device did not support fences at all. For these
devices, you must always perform a "legacy sync". New virtual machines
with Workstation 6.5 virtual hardware or later support an IRQ-driven
sync operation. For all other versions of the SVGA device, the best
approach is a hybrid in which you synchronously use the SYNC/BUSY
registers to process the FIFO until the sync has passed.
FIFO synchronization is a very complex topic, and it isn't covered fully
by this document. Please see the synchronization-related comments in
svga_reg.h, and the reference implementation of these primitives in
svga.c.
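One detail worth illustrating is how a driver can decide whether a fence
has been processed. The sketch below assumes that fences are 32-bit
counters which increase monotonically and may wrap around. That
assumption, and the function name, belong to this example and are not
taken from the text above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Nonzero if 'fence' has been reached, given the last fence value the
 * device reports as processed. The unsigned subtraction makes the test
 * robust against counter wraparound, as long as the two values are
 * less than 2^31 apart.
 */
static int fence_has_passed(uint32_t lastProcessed, uint32_t fence)
{
    return (uint32_t)(lastProcessed - fence) < 0x80000000u;
}
```

A sync-to-fence loop would poll this predicate, or sleep until an
interrupt arrives on devices that support IRQ-driven sync.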
5. Cursor
When SVGA_CAP_CURSOR is set, hardware cursor support is available. In
practice, SVGA_CAP_CURSOR will only be set when SVGA_CAP_CURSOR_BYPASS is
also set, so drivers supporting a hardware cursor need only check
SVGA_CAP_CURSOR_BYPASS and should use the FIFO only to define the cursor.
See below for more information.
6. Pseudocolor
When the read-only register SVGA_REG_PSEUDOCOLOR is 1, the device is in a
colormapped mode whose index width and color width are both SVGA_REG_DEPTH.
Thus far, 8 is the only depth at which pseudocolor is ever used.
In pseudocolor, the colormap is programmed by writing to the SVGA palette
registers. These start at SVGA_PALETTE_BASE and are interpreted as
follows:
SVGA_PALETTE_BASE + 3*n - The nth red component
SVGA_PALETTE_BASE + 3*n + 1 - The nth green component
SVGA_PALETTE_BASE + 3*n + 2 - The nth blue component
And n ranges from 0 to ((1<<SVGA_REG_DEPTH) - 1).
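The register arithmetic above can be sketched as follows, using a plain
array in place of the device registers. The palette base is passed in as
a parameter rather than hard-coded, and the 0..255 component range used
by load_gray_ramp() is an assumption of this example. Both helper names
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Register index of one component (0 = red, 1 = green, 2 = blue). */
static uint32_t palette_reg(uint32_t base, uint32_t n, uint32_t component)
{
    assert(component < 3);
    return base + 3u * n + component;
}

/*
 * Fill the colormap with a gray ramp and return the number of entries,
 * which is 1 << depth.
 */
static uint32_t load_gray_ramp(uint32_t *regs, uint32_t base, uint32_t depth)
{
    uint32_t count, n;

    assert(depth >= 1 && depth <= 8);
    count = 1u << depth;
    for (n = 0; n < count; n++) {
        uint32_t level = (n * 255u) / (count - 1u);
        regs[palette_reg(base, n, 0)] = level;
        regs[palette_reg(base, n, 1)] = level;
        regs[palette_reg(base, n, 2)] = level;
    }
    return count;
}
```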
7. Drawing to the screen
After initialization, the driver can write directly to the frame
buffer. The updated frame buffer is not displayed immediately, but
only when an update command is sent. The update command
(SVGA_CMD_UPDATE) defines the rectangle in the frame buffer that has
been modified by the driver, and causes that rectangle to be updated
on the screen.
A complete driver can be developed this way. For increased
performance, additional commands are available to accelerate common
operations. The two most useful are SVGA_CMD_RECT_FILL and
SVGA_CMD_RECT_COPY.
After issuing an accelerated command, the FIFO should be sync'd, as
described above, before writing to the frame buffer.
SVGA_REG_FB_OFFSET and SVGA_REG_BYTES_PER_LINE may change after SVGA_REG_WIDTH
or SVGA_REG_HEIGHT is set. Also the VGA registers must be written to after
setting SVGA_REG_ENABLE to 0 to change the display to a VGA mode.
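A small sketch of the two pieces of arithmetic involved in drawing:
locating a pixel in the frame buffer, and building an update command.
The command ID value and the five-word layout (ID, x, y, width, height)
are assumptions of this example and should be checked against
svga_reg.h. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CMD_UPDATE 1u   /* placeholder for SVGA_CMD_UPDATE */

/*
 * Byte offset of pixel (x, y) from the start of the frame buffer memory,
 * using the values read back after SetMode.
 */
static size_t pixel_offset(uint32_t fbOffset, uint32_t bytesPerLine,
                           uint32_t bitsPerPixel, uint32_t x, uint32_t y)
{
    size_t bytesPerPixel = (bitsPerPixel + 7u) / 8u;

    assert(bitsPerPixel > 0);
    return (size_t)fbOffset + (size_t)y * bytesPerLine + x * bytesPerPixel;
}

/* Encode an update for one rectangle. Returns the number of words. */
static size_t encode_update(uint32_t out[5], uint32_t x, uint32_t y,
                            uint32_t width, uint32_t height)
{
    out[0] = CMD_UPDATE;
    out[1] = x;
    out[2] = y;
    out[3] = width;
    out[4] = height;
    return 5;
}
```

The encoded words would then be written to the command FIFO after the
driver has finished modifying the rectangle.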
8. Mode changes
The video mode may be changed by writing to the WIDTH, HEIGHT,
and/or DEPTH registers again, after initialization. All of the
registers listed in the 'SetMode' initialization section above
should be reread afterwards. Additionally, when changing modes, it
can be convenient to set SVGA_REG_ENABLE to 0, change
SVGA_REG_WIDTH, SVGA_REG_HEIGHT, and SVGA_REG_BITS_PER_PIXEL (if
available), and then set SVGA_REG_ENABLE to 1 again. This is
optional, but it will avoid intermediate states in which only one
component of the new mode has been set.
9. Capabilities
The capabilities register (SVGA_REG_CAPABILITIES) is an array of bits that
indicates the capabilities of the SVGA emulation. A driver should check
SVGA_REG_CAPABILITIES every time it loads before relying on any feature that
is only optionally available.
XXX: There is also a capabilities register in the FIFO register space.
It is not documented in this file, but all of the available bits
are listed in svga_reg.h.
Some of the capabilities determine which FIFO commands are available. This
table shows which capability indicates support for which command.
FIFO Command Capability
------------ ----------
SVGA_CMD_RECT_FILL SVGA_CAP_RECT_FILL
SVGA_CMD_RECT_COPY SVGA_CAP_RECT_COPY
SVGA_CMD_DEFINE_BITMAP SVGA_CAP_OFFSCREEN
SVGA_CMD_DEFINE_BITMAP_SCANLINE SVGA_CAP_OFFSCREEN
SVGA_CMD_DEFINE_PIXMAP SVGA_CAP_OFFSCREEN
SVGA_CMD_DEFINE_PIXMAP_SCANLINE SVGA_CAP_OFFSCREEN
SVGA_CMD_RECT_BITMAP_FILL SVGA_CAP_RECT_PAT_FILL
SVGA_CMD_RECT_PIXMAP_FILL SVGA_CAP_RECT_PAT_FILL
SVGA_CMD_RECT_BITMAP_COPY SVGA_CAP_RECT_PAT_FILL
SVGA_CMD_RECT_PIXMAP_COPY SVGA_CAP_RECT_PAT_FILL
SVGA_CMD_FREE_OBJECT SVGA_CAP_OFFSCREEN
SVGA_CMD_RECT_ROP_FILL SVGA_CAP_RECT_FILL +
SVGA_CAP_RASTER_OP
SVGA_CMD_RECT_ROP_COPY SVGA_CAP_RECT_COPY +
SVGA_CAP_RASTER_OP
SVGA_CMD_RECT_ROP_BITMAP_FILL SVGA_CAP_RECT_PAT_FILL +
SVGA_CAP_RASTER_OP
SVGA_CMD_RECT_ROP_PIXMAP_FILL SVGA_CAP_RECT_PAT_FILL +
SVGA_CAP_RASTER_OP
SVGA_CMD_RECT_ROP_BITMAP_COPY SVGA_CAP_RECT_PAT_FILL +
SVGA_CAP_RASTER_OP
SVGA_CMD_RECT_ROP_PIXMAP_COPY SVGA_CAP_RECT_PAT_FILL +
SVGA_CAP_RASTER_OP
SVGA_CMD_DEFINE_CURSOR SVGA_CAP_CURSOR
SVGA_CMD_DISPLAY_CURSOR SVGA_CAP_CURSOR
SVGA_CMD_MOVE_CURSOR SVGA_CAP_CURSOR
SVGA_CMD_DEFINE_ALPHA_CURSOR SVGA_CAP_ALPHA_CURSOR
SVGA_CMD_DRAW_GLYPH SVGA_CAP_GLYPH
SVGA_CMD_DRAW_GLYPH_CLIPPED SVGA_CAP_GLYPH_CLIPPING
SVGA_CMD_ESCAPE SVGA_FIFO_CAP_ESCAPE
(NOTE: Many of the commands here are deprecated, and listed
in the table only for reference. None of the commands for glyph,
bitmap, and pixmap drawing are implemented in the
latest releases of VMware products.)
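A driver can turn part of the table above into a lookup, as in this
sketch. Only a few rows are shown. The Cmd enum, the bit values of the
CAP_* macros, and command_available() are hypothetical stand-ins for the
real definitions in svga_reg.h.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder capability bits. */
#define CAP_RECT_FILL    (1u << 0)
#define CAP_RECT_COPY    (1u << 1)
#define CAP_RASTER_OP    (1u << 4)
#define CAP_ALPHA_CURSOR (1u << 9)

typedef enum {
    CMD_RECT_FILL,
    CMD_RECT_COPY,
    CMD_RECT_ROP_FILL,
    CMD_RECT_ROP_COPY,
    CMD_DEFINE_ALPHA_CURSOR,
    CMD_COUNT
} Cmd;

/* Capabilities that must ALL be present for each command. */
static const uint32_t requiredCaps[CMD_COUNT] = {
    CAP_RECT_FILL,                  /* CMD_RECT_FILL */
    CAP_RECT_COPY,                  /* CMD_RECT_COPY */
    CAP_RECT_FILL | CAP_RASTER_OP,  /* CMD_RECT_ROP_FILL */
    CAP_RECT_COPY | CAP_RASTER_OP,  /* CMD_RECT_ROP_COPY */
    CAP_ALPHA_CURSOR                /* CMD_DEFINE_ALPHA_CURSOR */
};

static int command_available(uint32_t caps, Cmd cmd)
{
    assert(cmd >= 0 && cmd < CMD_COUNT);
    return (caps & requiredCaps[cmd]) == requiredCaps[cmd];
}
```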
Other capabilities indicate other functionality as described below:
SVGA_CAP_CURSOR_BYPASS
The hardware cursor can be drawn via SVGA registers, without requiring
the FIFO to be synchronized. It may therefore be drawn before any
outstanding unprocessed FIFO commands.
Note: Without SVGA_CAP_CURSOR_BYPASS_2, cursors drawn this way still
appear in the guest's framebuffer and need to be turned off before any
save under / overlapping drawing and turned back on after. This can
cause very noticeable cursor flicker.
SVGA_CAP_CURSOR_BYPASS_2
Instead of turning the cursor off and back on around any overlapping
drawing, the driver can write SVGA_CURSOR_ON_REMOVE_FROM_FB and
SVGA_CURSOR_ON_RESTORE_TO_FB to SVGA_REG_CURSOR_ON. In almost all
cases these are NOPs and the cursor will remain visible without
appearing in the guest framebuffer. In 'direct graphics' modes like
Linux host fullscreen local displays, however, the cursor will still
be drawn in the framebuffer, still flicker, and be drawn incorrectly
if a driver does not use SVGA_CURSOR_ON_REMOVE_FROM_FB / RESTORE_TO_FB.
SVGA_CAP_8BIT_EMULATION
SVGA_REG_BITS_PER_PIXEL is writable and can be set to either 8 or
SVGA_REG_HOST_BITS_PER_PIXEL. Otherwise the only SVGA modes available
inside a virtual machine must match the host's bits per pixel.
Note: Some versions which lack SVGA_CAP_8BIT_EMULATION also lack the
SVGA_REG_HOST_BITS_PER_PIXEL register. If SVGA_CAP_8BIT_EMULATION is not
set, a driver should assume that SVGA_REG_BITS_PER_PIXEL is both read-only
and initialized to the only available value.
SVGA_CAP_OFFSCREEN_1
SVGA_CMD_RECT_FILL, SVGA_CMD_RECT_COPY, SVGA_CMD_RECT_ROP_FILL,
SVGA_CMD_RECT_ROP_COPY can operate with a source or destination (or
both) in offscreen memory.
Usable offscreen memory is a rectangle located below the last scanline
of the visible memory:
x1 = 0
y1 = (SVGA_REG_FB_SIZE + SVGA_REG_BYTES_PER_LINE - 1) /
SVGA_REG_BYTES_PER_LINE
x2 = SVGA_REG_BYTES_PER_LINE / SVGA_REG_DEPTH
y2 = SVGA_REG_VRAM_SIZE / SVGA_REG_BYTES_PER_LINE
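As a worked example of the vertical extent, the sketch below computes y1
and y2 from register values. Only the rows are computed here. The
OffscreenRows type and the function name are hypothetical, and the
sample numbers in the note that follows are illustrative rather than
taken from a real device.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t y1;   /* first scanline below the visible frame buffer */
    uint32_t y2;   /* scanline count that fits in VRAM */
} OffscreenRows;

static OffscreenRows offscreen_rows(uint32_t fbSize, uint32_t bytesPerLine,
                                    uint32_t vramSize)
{
    OffscreenRows r;

    assert(bytesPerLine > 0);
    /* Round up so a partially used scanline is not counted as free. */
    r.y1 = (fbSize + bytesPerLine - 1u) / bytesPerLine;
    r.y2 = vramSize / bytesPerLine;
    return r;
}
```

With a 1024x768 mode at 4 bytes per pixel (FB_SIZE = 3145728,
BYTES_PER_LINE = 4096) and 16 MB of VRAM, this gives y1 = 768 and
y2 = 4096.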
Cursor Handling
---------------
Several cursor drawing mechanisms are supported for legacy
compatibility. The current mechanism, and the only one that new
drivers need to support, is "Cursor Bypass 3".
In Cursor Bypass 3 mode, the cursor image is defined via FIFO
commands, but the cursor position and visibility are reported
asynchronously by writing to FIFO registers.
A driver defines an AND/XOR hardware cursor using
SVGA_CMD_DEFINE_CURSOR to assign an ID and establish the AND and XOR
masks with the hardware. A driver uses SVGA_CMD_DEFINE_ALPHA_CURSOR
to define a 32 bit mask whose top 8 bits are used to blend the cursor
image with the pixels it covers. Alpha cursor support is only
available when SVGA_CAP_ALPHA_CURSOR is set. Note that alpha cursors
use pre-multiplied alpha.
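Since alpha cursors use pre-multiplied alpha, a driver that receives a
cursor image with straight (non-premultiplied) alpha has to convert it
first. A minimal sketch of that conversion for one 32-bit pixel, with
alpha in the top 8 bits, is shown below. The function name and the
rounding choice are assumptions of this example.

```c
#include <assert.h>
#include <stdint.h>

/* Scale one 8-bit color component by an 8-bit alpha, with rounding. */
static uint32_t scale_component(uint32_t c, uint32_t a)
{
    assert(c <= 255u && a <= 255u);
    return (c * a + 127u) / 255u;
}

/* Convert a straight-alpha pixel to pre-multiplied alpha. */
static uint32_t premultiply_argb(uint32_t argb)
{
    uint32_t a = argb >> 24;
    uint32_t c2 = scale_component((argb >> 16) & 0xFFu, a);
    uint32_t c1 = scale_component((argb >> 8) & 0xFFu, a);
    uint32_t c0 = scale_component(argb & 0xFFu, a);

    return (a << 24) | (c2 << 16) | (c1 << 8) | c0;
}
```

Fully opaque pixels are unchanged by this conversion, and fully
transparent pixels become zero.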
---

6
examples/2dmark/Makefile Normal file

@ -0,0 +1,6 @@
TARGET = 2dmark.img
APP_SOURCES = main.c
LIB_DIR = ../../lib
include $(LIB_DIR)/Makefile.rules

148
examples/2dmark/main.c Normal file

@ -0,0 +1,148 @@
/*
* Simple 2D graphics benchmark.
*
* The VMware SVGA device typically coalesces update rectangles and
* processes them asynchronously. This makes it difficult to get
* meaningful 2D benchmark numbers from tools which run inside a
* normal guest OS.
*
* This tool sweeps through multiple 2D update sizes on multiple
* video modes. After the test, results are summarized to the screen
* (in VGA text mode) and to vmware.log.
*
* Copyright (C) 2008-2009 VMware, Inc. Licensed under the MIT
* License, please see the README.txt. All rights reserved.
*/
#include "svga.h"
#include "intr.h"
#include "console_vga.h"
#include "vmbackdoor.h"
#include "svga3dutil.h"
struct {
uint32 value;
const char *label;
} sizes[] = {
{ 1, " 1" },
{ 8, " 8" },
{ 64, " 64" },
{ 233, " 233" }, /* Prime */
{ 256, " 256" },
{ 2048, " 2048" },
{ 2099, " 2099" }, /* Prime */
{ 4096, " 4096" },
};
/*
* benchmarkAtSize --
*
* Inner benchmarking loop, tests one combination of fb and update sizes.
*/
static FPSCounterState *
benchmarkAtSize(uint32 screen, uint32 update)
{
int i = 3;
static FPSCounterState fps;
memset(&fps, 0, sizeof fps);
/* Clear the screen and change modes */
memset(gSVGA.fbMem, 0x40, screen * screen * sizeof(uint32));
SVGA_SetMode(screen, screen, 32);
/* Make sure the FIFO is empty */
SVGA_SyncToFence(SVGA_InsertFence());
/*
* UpdateFPSCounter returns TRUE each time its output is updated.
* The first time, it won't have an FPS reading available yet. (It is
* guaranteed to return TRUE on its first call.) The second time,
* an FPS reading will be ready. We wait until the third time, in
* order to give the readings extra time to stabilize.
*
* Note that the 'i--' part of this expression only executes when
* UpdateFPSCounter returns TRUE.
*/
do {
/* Synchronously update the screen */
SVGA_Update(0, 0, update, update);
SVGA_SyncToFence(SVGA_InsertFence());
} while (!SVGA3DUtil_UpdateFPSCounter(&fps) || i--);
return &fps;
}
/*
* runBenchmark --
*
* Main benchmark loop. Run through all valid combinations
* of update and display sizes.
*/
static void
runBenchmark()
{
const int numSizes = sizeof sizes / sizeof sizes[0];
int i, j;
Console_WriteString("Synchronous 2D updates per second.\n"
"Video mode width/height on Y axis, update size on X axis.\n"
"\n");
/* Size headings across the top of the screen */
Console_WriteString(" | ");
for (i = 0; i < numSizes; i++) {
Console_WriteString(" ");
Console_WriteString(sizes[i].label);
}
Console_WriteString("\n");
for (i = 0; i < 79; i++) {
Console_WriteString("-");
}
Console_WriteString("\n");
for (i = 0; i < numSizes; i++) {
Console_Format("%s | ", sizes[i].label);
for (j = 0; j <= i; j++) {
char *fps = benchmarkAtSize(sizes[i].value, sizes[j].value)->text;
/* Hack to make the string shorter by cutting off "FPS" label. */
fps[7] = '\0';
Console_Format(" %s", fps);
}
Console_WriteString("\n");
}
Console_WriteString("\nBenchmark complete. Results are "
"also available in the VMX log.");
}
/*
* main --
*
* Initialization and results reporting.
*/
int
main(void)
{
Intr_Init();
Intr_SetFaultHandlers(SVGA_DefaultFaultHandler);
ConsoleVGA_Init();
SVGA_Init();
runBenchmark();
SVGA_Disable();
VMBackdoor_VGAScreenshot();
return 0;
}

11
examples/Makefile Normal file

@ -0,0 +1,11 @@
SUBDIRS = $(subst /Makefile,,$(wildcard */Makefile))
.PHONY: subdirs clean $(SUBDIRS)
subdirs: $(SUBDIRS)
$(SUBDIRS):
$(MAKE) -C $@
clean:
for dir in $(SUBDIRS); do $(MAKE) -C $$dir clean; done


@ -0,0 +1,6 @@
TARGET = blit-cube.img
APP_SOURCES = main.c
LIB_DIR = ../../lib
include $(LIB_DIR)/Makefile.rules

349
examples/blit-cube/main.c Normal file

@ -0,0 +1,349 @@
/*
* SVGA3D example: Spinning cube, with various blit operations.
*
* Copyright (C) 2008-2009 VMware, Inc. Licensed under the MIT
* License, please see the README.txt. All rights reserved.
*/
#include "svga3dutil.h"
#include "svga3dtext.h"
#include "matrix.h"
#include "math.h"
typedef struct {
float position[3];
float texcoord[2];
float color[3];
} MyVertex;
static const MyVertex vertexData[] = {
{ {-1, -1, -1}, { 0, 0 }, {0.5, 0.5, 0.5} }, /* -X */
{ {-1, -1, 1}, { 0, 1 }, {1.0, 1.0, 1.0} },
{ {-1, 1, -1}, { 1, 0 }, {0.5, 0.5, 0.5} },
{ {-1, 1, 1}, { 1, 1 }, {1.0, 1.0, 1.0} },
{ { 1, -1, -1}, { 0, 0 }, {0.5, 0.5, 0.5} }, /* +X */
{ { 1, -1, 1}, { 0, 1 }, {1.0, 1.0, 1.0} },
{ { 1, 1, -1}, { 1, 0 }, {0.5, 0.5, 0.5} },
{ { 1, 1, 1}, { 1, 1 }, {1.0, 1.0, 1.0} },
{ {-1, -1, -1}, { 0, 0 }, {0.5, 0.5, 0.5} }, /* -Y */
{ {-1, -1, 1}, { 0, 1 }, {1.0, 1.0, 1.0} },
{ { 1, -1, -1}, { 1, 0 }, {0.5, 0.5, 0.5} },
{ { 1, -1, 1}, { 1, 1 }, {1.0, 1.0, 1.0} },
{ {-1, 1, -1}, { 0, 0 }, {0.5, 0.5, 0.5} }, /* +Y */
{ {-1, 1, 1}, { 0, 1 }, {1.0, 1.0, 1.0} },
{ { 1, 1, -1}, { 1, 0 }, {0.5, 0.5, 0.5} },
{ { 1, 1, 1}, { 1, 1 }, {1.0, 1.0, 1.0} },
{ {-1, -1, -1}, { 0, 0 }, {0.5, 0.5, 0.5} }, /* -Z */
{ {-1, 1, -1}, { 0, 1 }, {1.0, 1.0, 1.0} },
{ { 1, -1, -1}, { 1, 0 }, {0.5, 0.5, 0.5} },
{ { 1, 1, -1}, { 1, 1 }, {1.0, 1.0, 1.0} },
{ {-1, -1, 1}, { 0, 0 }, {0.5, 0.5, 0.5} }, /* +Z */
{ {-1, 1, 1}, { 0, 1 }, {1.0, 1.0, 1.0} },
{ { 1, -1, 1}, { 1, 0 }, {0.5, 0.5, 0.5} },
{ { 1, 1, 1}, { 1, 1 }, {1.0, 1.0, 1.0} },
};
#define QUAD(a,b,c,d) a, b, d, d, c, a
static const uint16 indexData[] = {
QUAD(0, 1, 2, 3), // -X
QUAD(4, 5, 6, 7), // +X
QUAD(8, 9, 10, 11), // -Y
QUAD(12, 13, 14, 15), // +Y
QUAD(16, 17, 18, 19), // -Z
QUAD(20, 21, 22, 23), // +Z
};
#undef QUAD
const uint32 numTriangles = sizeof indexData / sizeof indexData[0] / 3;
uint32 vertexSid, indexSid, textureSid;
Matrix perspectiveMat;
FPSCounterState gFPS;
VMMousePacket lastMouseState;
/*
* render --
*
* Set up render state, and draw our cube scene from static index
* and vertex buffers.
*
* This render state only needs to be set each frame because
* SVGA3DText_Draw() changes it.
*/
void
render(void)
{
SVGA3dTextureState *ts;
SVGA3dRenderState *rs;
SVGA3dVertexDecl *decls;
SVGA3dPrimitiveRange *ranges;
static Matrix view;
Matrix_Copy(view, gIdentityMatrix);
Matrix_Scale(view, 0.5, 0.5, 0.5, 1.0);
if (lastMouseState.buttons & VMMOUSE_LEFT_BUTTON) {
Matrix_RotateX(view, lastMouseState.y * 0.0001);
Matrix_RotateY(view, lastMouseState.x * -0.0001);
} else {
Matrix_RotateX(view, 30.0 * M_PI / 180.0);
Matrix_RotateY(view, gFPS.frame * 0.01f);
}
Matrix_Translate(view, 0, 0, 2);
SVGA3D_SetTransform(CID, SVGA3D_TRANSFORM_VIEW, view);
SVGA3D_SetTransform(CID, SVGA3D_TRANSFORM_WORLD, gIdentityMatrix);
SVGA3D_SetTransform(CID, SVGA3D_TRANSFORM_PROJECTION, perspectiveMat);
SVGA3D_BeginSetRenderState(CID, &rs, 4);
{
rs[0].state = SVGA3D_RS_BLENDENABLE;
rs[0].uintValue = FALSE;
rs[1].state = SVGA3D_RS_ZENABLE;
rs[1].uintValue = TRUE;
rs[2].state = SVGA3D_RS_ZWRITEENABLE;
rs[2].uintValue = TRUE;
rs[3].state = SVGA3D_RS_ZFUNC;
rs[3].uintValue = SVGA3D_CMP_LESS;
}
SVGA_FIFOCommitAll();
SVGA3D_BeginSetTextureState(CID, &ts, 10);
{
ts[0].stage = 0;
ts[0].name = SVGA3D_TS_BIND_TEXTURE;
ts[0].value = textureSid;
ts[1].stage = 0;
ts[1].name = SVGA3D_TS_COLOROP;
ts[1].value = SVGA3D_TC_MODULATE;
ts[2].stage = 0;
ts[2].name = SVGA3D_TS_COLORARG1;
ts[2].value = SVGA3D_TA_TEXTURE;
ts[3].stage = 0;
ts[3].name = SVGA3D_TS_COLORARG2;
ts[3].value = SVGA3D_TA_DIFFUSE;
ts[4].stage = 0;
ts[4].name = SVGA3D_TS_ALPHAOP;
ts[4].value = SVGA3D_TC_SELECTARG1;
ts[5].stage = 0;
ts[5].name = SVGA3D_TS_ALPHAARG1;
ts[5].value = SVGA3D_TA_DIFFUSE;
ts[6].stage = 0;
ts[6].name = SVGA3D_TS_MINFILTER;
ts[6].value = SVGA3D_TEX_FILTER_LINEAR;
ts[7].stage = 0;
ts[7].name = SVGA3D_TS_MAGFILTER;
ts[7].value = SVGA3D_TEX_FILTER_LINEAR;
ts[8].stage = 0;
ts[8].name = SVGA3D_TS_ADDRESSU;
ts[8].value = SVGA3D_TEX_ADDRESS_WRAP;
ts[9].stage = 0;
ts[9].name = SVGA3D_TS_ADDRESSV;
ts[9].value = SVGA3D_TEX_ADDRESS_WRAP;
}
SVGA_FIFOCommitAll();
SVGA3D_BeginDrawPrimitives(CID, &decls, 3, &ranges, 1);
{
decls[0].identity.type = SVGA3D_DECLTYPE_FLOAT3;
decls[0].identity.usage = SVGA3D_DECLUSAGE_POSITION;
decls[0].array.surfaceId = vertexSid;
decls[0].array.stride = sizeof(MyVertex);
decls[0].array.offset = offsetof(MyVertex, position);
decls[1].identity.type = SVGA3D_DECLTYPE_FLOAT2;
decls[1].identity.usage = SVGA3D_DECLUSAGE_TEXCOORD;
decls[1].array.surfaceId = vertexSid;
decls[1].array.stride = sizeof(MyVertex);
decls[1].array.offset = offsetof(MyVertex, texcoord);
decls[2].identity.type = SVGA3D_DECLTYPE_FLOAT3;
decls[2].identity.usage = SVGA3D_DECLUSAGE_COLOR;
decls[2].array.surfaceId = vertexSid;
decls[2].array.stride = sizeof(MyVertex);
decls[2].array.offset = offsetof(MyVertex, color);
ranges[0].primType = SVGA3D_PRIMITIVE_TRIANGLELIST;
ranges[0].primitiveCount = numTriangles;
ranges[0].indexArray.surfaceId = indexSid;
ranges[0].indexArray.stride = sizeof(uint16);
ranges[0].indexWidth = sizeof(uint16);
}
SVGA_FIFOCommitAll();
}
/*
* defineCheckerboard --
*
* Create a new checkerboard texture of the specified size.
*/
uint32
defineCheckerboard(uint32 width, uint32 height)
{
uint32 *buffer;
int i, j;
SVGAGuestPtr gPtr;
uint32 size = width * height * sizeof *buffer;
uint32 sid = SVGA3DUtil_DefineSurface2D(width, height, SVGA3D_A8R8G8B8);
buffer = SVGA3DUtil_AllocDMABuffer(size, &gPtr);
for (j = 0; j < height; j++) {
for (i = 0; i < width; i++) {
*buffer = (i + j) & 1 ? 0xFFFFFFFF : 0x00000000;
buffer++;
}
}
SVGA3DUtil_SurfaceDMA2D(sid, &gPtr, SVGA3D_WRITE_HOST_VRAM, width, height);
return sid;
}
/*
* main --
*
* Our example's entry point, invoked directly by the bootloader.
*/
int
main(void)
{
uint32 texSize = 256;
uint32 checkerSid;
SVGA3DUtil_InitFullscreen(CID, 1024, 768);
SVGA3DText_Init();
vertexSid = SVGA3DUtil_DefineStaticBuffer(vertexData, sizeof vertexData);
indexSid = SVGA3DUtil_DefineStaticBuffer(indexData, sizeof indexData);
textureSid = SVGA3DUtil_DefineSurface2D(texSize, texSize, SVGA3D_A8R8G8B8);
checkerSid = defineCheckerboard(texSize, texSize);
Matrix_Perspective(perspectiveMat, 45.0f,
gSVGA.width / (float)gSVGA.height, 0.1f, 100.0f);
while (1) {
if (SVGA3DUtil_UpdateFPSCounter(&gFPS)) {
Console_Clear();
Console_Format(
"VMware SVGA3D Example:\n"
"Spinning cube blitter test: \n"
" - SurfaceStretchBlt from back buffer to cube texture\n"
" - SurfaceCopy from cube texture to back buffer\n"
" - Checkerboard pattern in bottom left\n"
"\n"
"Verify performance and correctness with all blitter implementations.\n"
"\n"
"%s",
gFPS.text);
SVGA3DText_Update();
VMBackdoor_VGAScreenshot();
}
while (VMBackdoor_MouseGetPacket(&lastMouseState));
SVGA3DUtil_ClearFullscreen(CID, SVGA3D_CLEAR_COLOR | SVGA3D_CLEAR_DEPTH,
0x6666dd, 1.0f, 0);
render();
SVGA3DText_Draw();
/* Surface copy from cube texture to the lower-right corner of the back buffer */
{
SVGA3dSurfaceImageId src = { textureSid };
SVGA3dCopyBox *boxes;
SVGA3D_BeginSurfaceCopy(&src, &gFullscreen.colorImage, &boxes, 1);
boxes[0].w = texSize;
boxes[0].h = texSize;
boxes[0].d = 1;
boxes[0].x = gFullscreen.screen.w - texSize;
boxes[0].y = gFullscreen.screen.h - texSize;
SVGA_FIFOCommitAll();
}
/*
* We're displaying the checkerboard texture in the lower-left
* corner of the back buffer. This tests for subpixel alignment
* errors within the blitter.
*
* Draw the top half with a regular blit, bottom half with a
* stretch blit. You should see a contiguous checkerboard.
*/
{
SVGA3dSurfaceImageId src = { checkerSid };
SVGA3dCopyBox *boxes;
SVGA3dBox boxSrc = { 0 };
SVGA3dBox boxDest = { 0 };
SVGA3D_BeginSurfaceCopy(&src, &gFullscreen.colorImage, &boxes, 1);
boxes[0].w = texSize;
boxes[0].h = texSize/2;
boxes[0].d = 1;
boxes[0].y = gFullscreen.screen.h - texSize;
SVGA_FIFOCommitAll();
boxSrc.w = texSize;
boxSrc.y = texSize/2;
boxSrc.h = texSize/2;
boxSrc.d = 1;
boxDest.w = texSize;
boxDest.y = gFullscreen.screen.h - texSize/2;
boxDest.h = texSize/2;
boxDest.d = 1;
SVGA3D_SurfaceStretchBlt(&src, &gFullscreen.colorImage, &boxSrc, &boxDest,
SVGA3D_STRETCH_BLT_LINEAR);
}
SVGA3DUtil_PresentFullscreen();
/* Stretch blit from back buffer to cube */
{
SVGA3dSurfaceImageId dest = { textureSid };
SVGA3dBox boxSrc = { 0 };
SVGA3dBox boxDest = { 0 };
boxSrc.w = gFullscreen.screen.w;
boxSrc.h = gFullscreen.screen.h;
boxSrc.d = 1;
boxDest.w = texSize;
boxDest.h = texSize;
boxDest.d = 1;
SVGA3D_SurfaceStretchBlt(&gFullscreen.colorImage, &dest, &boxSrc, &boxDest,
SVGA3D_STRETCH_BLT_LINEAR);
}
}
return 0;
}


@ -0,0 +1,6 @@
TARGET = bunnies.img
APP_SOURCES = main.c bunny.ib.z.data.o bunny.vb.z.data.o
LIB_DIR = ../../lib
include $(LIB_DIR)/Makefile.rules

BIN
examples/bunnies/bunny.ib Normal file

Binary file not shown.

BIN
examples/bunnies/bunny.vb Normal file

Binary file not shown.

202
examples/bunnies/main.c Normal file

@ -0,0 +1,202 @@
/*
* SVGA3D example: Bunnies.
*
* This example loads the famous Stanford Bunny model, and draws many
* copies of it. This demonstrates large models and fixed-function
* lighting.
*
* Copyright (C) 2008-2009 VMware, Inc. Licensed under the MIT
* License, please see the README.txt. All rights reserved.
*/
#include "svga3dutil.h"
#include "svga3dtext.h"
#include "matrix.h"
#include "math.h"
DECLARE_DATAFILE(ibFile, bunny_ib_z);
DECLARE_DATAFILE(vbFile, bunny_vb_z);
uint32 vertexSid, indexSid;
uint32 ibSize, vbSize;
Matrix perspectiveMat;
FPSCounterState gFPS;
/*
* setupFrame --
*
* Set up render state that we load once per frame (because
* SVGA3DText clobbered it) and perform matrix calculations that we
* only need once per frame.
*/
void
setupFrame(void)
{
static Matrix world;
SVGA3dTextureState *ts;
SVGA3dRenderState *rs;
static const SVGA3dLightData light = {
.type = SVGA3D_LIGHTTYPE_POINT,
.inWorldSpace = TRUE,
.diffuse = { 10.0f, 10.0f, 10.0f, 1.0f },
.ambient = { 0.05f, 0.05f, 0.1f, 1.0f },
.position = { -5.0f, 5.0f, 0.0f, 1.0f },
.attenuation0 = 1.0f,
.attenuation1 = 0.0f,
.attenuation2 = 0.0f,
};
static const SVGA3dMaterial mat = {
.diffuse = { 1.0f, 0.9f, 0.9f, 1.0f },
.ambient = { 1.0f, 1.0f, 1.0f, 1.0f },
};
Matrix_Copy(world, gIdentityMatrix);
Matrix_Scale(world, 10, 10, 10, 1);
Matrix_RotateY(world, gFPS.frame * 0.001f);
SVGA3D_SetTransform(CID, SVGA3D_TRANSFORM_WORLD, world);
SVGA3D_SetTransform(CID, SVGA3D_TRANSFORM_PROJECTION, perspectiveMat);
SVGA3D_SetMaterial(CID, SVGA3D_FACE_FRONT_BACK, &mat);
SVGA3D_SetLightData(CID, 0, &light);
SVGA3D_SetLightEnabled(CID, 0, TRUE);
SVGA3D_BeginSetRenderState(CID, &rs, 8);
{
rs[0].state = SVGA3D_RS_BLENDENABLE;
rs[0].uintValue = FALSE;
rs[1].state = SVGA3D_RS_ZENABLE;
rs[1].uintValue = TRUE;
rs[2].state = SVGA3D_RS_ZWRITEENABLE;
rs[2].uintValue = TRUE;
rs[3].state = SVGA3D_RS_ZFUNC;
rs[3].uintValue = SVGA3D_CMP_LESS;
rs[4].state = SVGA3D_RS_LIGHTINGENABLE;
rs[4].uintValue = TRUE;
rs[5].state = SVGA3D_RS_VERTEXMATERIALENABLE;
rs[5].uintValue = FALSE;
rs[6].state = SVGA3D_RS_CULLMODE;
rs[6].uintValue = SVGA3D_FACE_FRONT;
rs[7].state = SVGA3D_RS_AMBIENT;
rs[7].uintValue = 0x00000000;
}
SVGA_FIFOCommitAll();
SVGA3D_BeginSetTextureState(CID, &ts, 4);
{
ts[0].stage = 0;
ts[0].name = SVGA3D_TS_BIND_TEXTURE;
ts[0].value = SVGA3D_INVALID_ID;
ts[1].stage = 0;
ts[1].name = SVGA3D_TS_COLOROP;
ts[1].value = SVGA3D_TC_SELECTARG1;
ts[2].stage = 0;
ts[2].name = SVGA3D_TS_COLORARG1;
ts[2].value = SVGA3D_TA_DIFFUSE;
ts[3].stage = 0;
ts[3].name = SVGA3D_TS_ALPHAARG1;
ts[3].value = SVGA3D_TA_DIFFUSE;
}
SVGA_FIFOCommitAll();
}
/*
* drawMesh --
*
* Draw our bunny mesh at a particular position.
*/
void
drawMesh(float posX, float posY, float posZ)
{
SVGA3dVertexDecl *decls;
SVGA3dPrimitiveRange *ranges;
static Matrix view;
Matrix_Copy(view, gIdentityMatrix);
Matrix_Translate(view, posX, posY, posZ);
SVGA3D_SetTransform(CID, SVGA3D_TRANSFORM_VIEW, view);
SVGA3D_BeginDrawPrimitives(CID, &decls, 2, &ranges, 1);
{
decls[0].identity.type = SVGA3D_DECLTYPE_FLOAT3;
decls[0].identity.usage = SVGA3D_DECLUSAGE_POSITION;
decls[0].array.surfaceId = vertexSid;
decls[0].array.stride = 6 * sizeof(float);
decls[1].identity.type = SVGA3D_DECLTYPE_FLOAT3;
decls[1].identity.usage = SVGA3D_DECLUSAGE_NORMAL;
decls[1].array.surfaceId = vertexSid;
decls[1].array.stride = 6 * sizeof(float);
decls[1].array.offset = 3 * sizeof(float);
ranges[0].primType = SVGA3D_PRIMITIVE_TRIANGLELIST;
ranges[0].primitiveCount = ibSize / sizeof(uint32) / 3;
ranges[0].indexArray.surfaceId = indexSid;
ranges[0].indexArray.stride = sizeof(uint32);
ranges[0].indexWidth = sizeof(uint32);
}
SVGA_FIFOCommitAll();
}
/*
* main --
*
* Our example's entry point, invoked directly by the bootloader.
*/
int
main(void)
{
SVGA3DUtil_InitFullscreen(CID, 800, 600);
SVGA3DText_Init();
vertexSid = SVGA3DUtil_LoadCompressedBuffer(vbFile, &vbSize);
indexSid = SVGA3DUtil_LoadCompressedBuffer(ibFile, &ibSize);
Matrix_Perspective(perspectiveMat, 45.0f,
gSVGA.width / (float)gSVGA.height, 0.1f, 100.0f);
while (1) {
int i;
if (SVGA3DUtil_UpdateFPSCounter(&gFPS)) {
Console_Clear();
Console_Format("VMware SVGA3D Example:\n"
"Bunnies: Drawing 4 copies of the Stanford Bunny,"
" at 65K triangles each.\n\n%s",
gFPS.text);
SVGA3DText_Update();
}
SVGA3DUtil_ClearFullscreen(CID, SVGA3D_CLEAR_COLOR | SVGA3D_CLEAR_DEPTH,
0x113366, 1.0f, 0);
setupFrame();
for (i = 0; i < 4; i++) {
drawMesh(0.8 - i * 1.0f, -1, 3 + i * 1.0f);
}
SVGA3DText_Draw();
SVGA3DUtil_PresentFullscreen();
}
return 0;
}

6
examples/cube/Makefile Normal file

@ -0,0 +1,6 @@
TARGET = cube.img
APP_SOURCES = main.c
LIB_DIR = ../../lib
include $(LIB_DIR)/Makefile.rules

191
examples/cube/main.c Normal file

@ -0,0 +1,191 @@
/*
* SVGA3D example: Spinning cube, with static vertex/index buffers.
*
* Copyright (C) 2008-2009 VMware, Inc. Licensed under the MIT
* License, please see the README.txt. All rights reserved.
*/
#include "svga3dutil.h"
#include "svga3dtext.h"
#include "matrix.h"
#include "math.h"
#include "keyboard.h"
#include "apm.h"
typedef struct {
float position[3];
uint32 color;
} MyVertex;
static const MyVertex vertexData[] = {
{ {-1, -1, -1}, 0xFFFFFF },
{ {-1, -1, 1}, 0xFFFF00 },
{ {-1, 1, -1}, 0xFF00FF },
{ {-1, 1, 1}, 0xFF0000 },
{ { 1, -1, -1}, 0x00FFFF },
{ { 1, -1, 1}, 0x00FF00 },
{ { 1, 1, -1}, 0x0000FF },
{ { 1, 1, 1}, 0x000000 },
};
#define QUAD(a,b,c,d) a, b, d, d, c, a
static const uint16 indexData[] = {
QUAD(0,1,2,3), // -X
QUAD(4,5,6,7), // +X
QUAD(0,1,4,5), // -Y
QUAD(2,3,6,7), // +Y
QUAD(0,2,4,6), // -Z
QUAD(1,3,5,7), // +Z
};
#undef QUAD
const uint32 numTriangles = sizeof indexData / sizeof indexData[0] / 3;
uint32 vertexSid, indexSid;
Matrix perspectiveMat;
FPSCounterState gFPS;
VMMousePacket lastMouseState;
/*
* render --
*
* Set up render state, and draw our cube scene from static index
* and vertex buffers.
*
* This render state only needs to be set each frame because
* SVGA3DText_Draw() changes it.
*/
void
render(void)
{
SVGA3dTextureState *ts;
SVGA3dRenderState *rs;
SVGA3dVertexDecl *decls;
SVGA3dPrimitiveRange *ranges;
static Matrix view;
Matrix_Copy(view, gIdentityMatrix);
Matrix_Scale(view, 0.5, 0.5, 0.5, 1.0);
if (lastMouseState.buttons & VMMOUSE_LEFT_BUTTON) {
Matrix_RotateX(view, lastMouseState.y * 0.0001);
Matrix_RotateY(view, lastMouseState.x * -0.0001);
} else {
Matrix_RotateX(view, 30.0 * M_PI / 180.0);
Matrix_RotateY(view, gFPS.frame * 0.01f);
}
Matrix_Translate(view, 0, 0, 3);
SVGA3D_SetTransform(CID, SVGA3D_TRANSFORM_VIEW, view);
SVGA3D_SetTransform(CID, SVGA3D_TRANSFORM_WORLD, gIdentityMatrix);
SVGA3D_SetTransform(CID, SVGA3D_TRANSFORM_PROJECTION, perspectiveMat);
SVGA3D_BeginSetRenderState(CID, &rs, 4);
{
rs[0].state = SVGA3D_RS_BLENDENABLE;
rs[0].uintValue = FALSE;
rs[1].state = SVGA3D_RS_ZENABLE;
rs[1].uintValue = TRUE;
rs[2].state = SVGA3D_RS_ZWRITEENABLE;
rs[2].uintValue = TRUE;
rs[3].state = SVGA3D_RS_ZFUNC;
rs[3].uintValue = SVGA3D_CMP_LESS;
}
SVGA_FIFOCommitAll();
SVGA3D_BeginSetTextureState(CID, &ts, 4);
{
ts[0].stage = 0;
ts[0].name = SVGA3D_TS_BIND_TEXTURE;
ts[0].value = SVGA3D_INVALID_ID;
ts[1].stage = 0;
ts[1].name = SVGA3D_TS_COLOROP;
ts[1].value = SVGA3D_TC_SELECTARG1;
ts[2].stage = 0;
ts[2].name = SVGA3D_TS_COLORARG1;
ts[2].value = SVGA3D_TA_DIFFUSE;
ts[3].stage = 0;
ts[3].name = SVGA3D_TS_ALPHAARG1;
ts[3].value = SVGA3D_TA_DIFFUSE;
}
SVGA_FIFOCommitAll();
SVGA3D_BeginDrawPrimitives(CID, &decls, 2, &ranges, 1);
{
decls[0].identity.type = SVGA3D_DECLTYPE_FLOAT3;
decls[0].identity.usage = SVGA3D_DECLUSAGE_POSITION;
decls[0].array.surfaceId = vertexSid;
decls[0].array.stride = sizeof(MyVertex);
decls[0].array.offset = offsetof(MyVertex, position);
decls[1].identity.type = SVGA3D_DECLTYPE_D3DCOLOR;
decls[1].identity.usage = SVGA3D_DECLUSAGE_COLOR;
decls[1].array.surfaceId = vertexSid;
decls[1].array.stride = sizeof(MyVertex);
decls[1].array.offset = offsetof(MyVertex, color);
ranges[0].primType = SVGA3D_PRIMITIVE_TRIANGLELIST;
ranges[0].primitiveCount = numTriangles;
ranges[0].indexArray.surfaceId = indexSid;
ranges[0].indexArray.stride = sizeof(uint16);
ranges[0].indexWidth = sizeof(uint16);
}
SVGA_FIFOCommitAll();
}
/*
* main --
*
* Our example's entry point, invoked directly by the bootloader.
*/
int
main(void)
{
SVGA3DUtil_InitFullscreen(CID, 800, 600);
SVGA3DText_Init();
Keyboard_Init();
APM_Init();
vertexSid = SVGA3DUtil_DefineStaticBuffer(vertexData, sizeof vertexData);
indexSid = SVGA3DUtil_DefineStaticBuffer(indexData, sizeof indexData);
Matrix_Perspective(perspectiveMat, 45.0f,
gSVGA.width / (float)gSVGA.height, 0.1f, 100.0f);
while (!Keyboard_IsKeyPressed(KEY_ESCAPE)) {
if (SVGA3DUtil_UpdateFPSCounter(&gFPS)) {
Console_Clear();
Console_Format("VMware SVGA3D Example:\n"
"Spinning cube with static vertex and index buffer.\n"
"Drag with left mouse button to rotate.\n"
"Press ESC to exit.\n"
"\n%s",
gFPS.text);
SVGA3DText_Update();
VMBackdoor_VGAScreenshot();
}
while (VMBackdoor_MouseGetPacket(&lastMouseState));
SVGA3DUtil_ClearFullscreen(CID, SVGA3D_CLEAR_COLOR | SVGA3D_CLEAR_DEPTH,
0x113366, 1.0f, 0);
render();
SVGA3DText_Draw();
SVGA3DUtil_PresentFullscreen();
}
APM_SetPowerState(POWER_OFF);
return 0;
}


@ -0,0 +1,16 @@
TARGET = cubemark.img
APP_SOURCES = main.c
LIB_DIR = ../../lib
include $(LIB_DIR)/Makefile.rules
.PHONY: shaders
shaders: cube_vs.h cube_ps.h
cube_vs.h: cube.fx
	wine fxc.exe /T vs_2_0 /E MyVertexShader /Fh cube_vs.h cube.fx
cube_ps.h: cube.fx
	wine fxc.exe /T ps_2_0 /E MyPixelShader /Fh cube_ps.h cube.fx

examples/cubemark/cube.fx Normal file

@ -0,0 +1,36 @@
float4x4 matView, matProj;
struct VS_Input
{
float4 Pos : POSITION;
float4 Color : COLOR0;
};
struct VS_Output
{
float4 Pos : POSITION;
float4 Color : COLOR0;
};
VS_Output
MyVertexShader(VS_Input Input)
{
VS_Output Output;
Output.Pos = mul(mul(Input.Pos, matView), matProj);
Output.Color = Input.Color;
return Output;
}
struct PS_Input
{
float4 Color : COLOR0;
};
float4
MyPixelShader(PS_Input Input) : COLOR
{
return Input.Color;
}


@ -0,0 +1,21 @@
#if 0
//
// Generated by Microsoft (R) D3DX9 Shader Compiler
//
// fxc /T ps_2_0 /E MyPixelShader /Fh cube_ps.h cube.fx
//
ps_2_0
dcl v0
mov oC0, v0
// approximately 1 instruction slot used
#endif
const DWORD g_ps20_MyPixelShader[] =
{
0xffff0200, 0x0013fffe, 0x42415443, 0x0000001c, 0x00000023, 0xffff0200,
0x00000000, 0x00000000, 0x20000100, 0x0000001c, 0x325f7370, 0x4d00305f,
0x6f726369, 0x74666f73, 0x29522820, 0x44334420, 0x53203958, 0x65646168,
0x6f432072, 0x6c69706d, 0x00207265, 0x0200001f, 0x80000000, 0x900f0000,
0x02000001, 0x800f0800, 0x90e40000, 0x0000ffff
};
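
The token array above can be checked against the disassembly in its "#if 0" block. The following is a minimal sketch, assuming the usual Direct3D 9 shader token layout: a version token with 0xFFFF (pixel) or 0xFFFE (vertex) in the high half, comment tokens with 0xFFFE in the low half and a DWORD count in bits 16-30, the instruction length in bits 24-27, and an end token of 0x0000FFFF. The names ShaderSummary and summarize_shader are hypothetical and are not part of the SVGA library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Verbatim copy of g_ps20_MyPixelShader from cube_ps.h. */
static const uint32_t psTokens[] = {
    0xffff0200, 0x0013fffe, 0x42415443, 0x0000001c, 0x00000023, 0xffff0200,
    0x00000000, 0x00000000, 0x20000100, 0x0000001c, 0x325f7370, 0x4d00305f,
    0x6f726369, 0x74666f73, 0x29522820, 0x44334420, 0x53203958, 0x65646168,
    0x6f432072, 0x6c69706d, 0x00207265, 0x0200001f, 0x80000000, 0x900f0000,
    0x02000001, 0x800f0800, 0x90e40000, 0x0000ffff
};

/* Hypothetical summary structure, for illustration only. */
typedef struct {
    int isPixelShader;       /* 1 for ps_x_y, 0 for vs_x_y */
    unsigned major, minor;   /* Shader model version */
    unsigned commentDwords;  /* DWORDs of embedded comment data */
    unsigned instructions;   /* Instruction tokens, including dcl */
} ShaderSummary;

/*
 * Walk a shader token stream up to its end token.
 * Returns 0 on success, or -1 if the stream looks malformed.
 */
static int
summarize_shader(const uint32_t *tok, size_t count, ShaderSummary *out)
{
    size_t i = 1;
    uint32_t type;

    assert(tok != NULL && out != NULL);
    if (count < 2) {
        return -1;
    }

    type = tok[0] >> 16;
    if (type != 0xFFFFu && type != 0xFFFEu) {
        return -1;
    }
    out->isPixelShader = (type == 0xFFFFu);
    out->major = (tok[0] >> 8) & 0xFFu;
    out->minor = tok[0] & 0xFFu;
    out->commentDwords = 0;
    out->instructions = 0;

    while (i < count) {
        uint32_t t = tok[i];

        if (t == 0x0000FFFFu) {
            return 0;                          /* End token */
        }
        if ((t & 0xFFFFu) == 0xFFFEu) {
            uint32_t len = (t >> 16) & 0x7FFFu; /* Comment length in DWORDs */
            out->commentDwords += len;
            i += 1 + len;
        } else {
            uint32_t len = (t >> 24) & 0xFu;    /* Operand DWORDs that follow */
            out->instructions++;
            i += 1 + len;
        }
    }
    return -1;                                  /* Ran off the end */
}
```

For this array the walk should report a 2.0 pixel shader with a 19-DWORD comment block (the constant table and compiler string) and two instruction tokens, matching the "dcl v0" and "mov oC0, v0" lines in the listing.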


@ -0,0 +1,54 @@
#if 0
//
// Generated by Microsoft (R) D3DX9 Shader Compiler
//
// fxc /T vs_2_0 /E MyVertexShader /Fh cube_vs.h cube.fx
//
//
// Parameters:
//
// float4x4 matProj;
// float4x4 matView;
//
//
// Registers:
//
// Name Reg Size
// ------------ ----- ----
// matView c0 4
// matProj c4 4
//
vs_2_0
dcl_position v0
dcl_color v1
dp4 r0.x, v0, c0
dp4 r0.y, v0, c1
dp4 r0.z, v0, c2
dp4 r0.w, v0, c3
dp4 oPos.x, r0, c4
dp4 oPos.y, r0, c5
dp4 oPos.z, r0, c6
dp4 oPos.w, r0, c7
mov oD0, v1
// approximately 9 instruction slots used
#endif
const DWORD g_vs20_MyVertexShader[] =
{
0xfffe0200, 0x0025fffe, 0x42415443, 0x0000001c, 0x0000006b, 0xfffe0200,
0x00000002, 0x0000001c, 0x20000100, 0x00000064, 0x00000044, 0x00040002,
0x00000004, 0x0000004c, 0x00000000, 0x0000005c, 0x00000002, 0x00000004,
0x0000004c, 0x00000000, 0x5074616d, 0x006a6f72, 0x00030003, 0x00040004,
0x00000001, 0x00000000, 0x5674616d, 0x00776569, 0x325f7376, 0x4d00305f,
0x6f726369, 0x74666f73, 0x29522820, 0x44334420, 0x53203958, 0x65646168,
0x6f432072, 0x6c69706d, 0x00207265, 0x0200001f, 0x80000000, 0x900f0000,
0x0200001f, 0x8000000a, 0x900f0001, 0x03000009, 0x80010000, 0x90e40000,
0xa0e40000, 0x03000009, 0x80020000, 0x90e40000, 0xa0e40001, 0x03000009,
0x80040000, 0x90e40000, 0xa0e40002, 0x03000009, 0x80080000, 0x90e40000,
0xa0e40003, 0x03000009, 0xc0010000, 0x80e40000, 0xa0e40004, 0x03000009,
0xc0020000, 0x80e40000, 0xa0e40005, 0x03000009, 0xc0040000, 0x80e40000,
0xa0e40006, 0x03000009, 0xc0080000, 0x80e40000, 0xa0e40007, 0x02000001,
0xd00f0000, 0x90e40001, 0x0000ffff
};
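
The disassembly above shows that each component of the intermediate and output positions is one dp4 against a single constant register. As a rough CPU-side reference, the sketch below transliterates those eight dp4 instructions into C. Vec4, dp4, and run_vertex_shader are hypothetical names; the register contents are whatever the guest loads into c0-c7, and how SVGA3DUtil_SetShaderConstMatrix lays a Matrix out across those registers is not assumed here.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y, z, w; } Vec4;

/* Four-component dot product, the same operation as the dp4 instruction. */
static float
dp4(Vec4 a, Vec4 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}

/*
 * CPU transliteration of the vs_2_0 listing.
 * c[0..3] are the matView registers, c[4..7] are the matProj registers.
 */
static Vec4
run_vertex_shader(Vec4 v0, const Vec4 c[8])
{
    Vec4 r0, oPos;

    assert(c != NULL);

    r0.x = dp4(v0, c[0]);
    r0.y = dp4(v0, c[1]);
    r0.z = dp4(v0, c[2]);
    r0.w = dp4(v0, c[3]);

    oPos.x = dp4(r0, c[4]);
    oPos.y = dp4(r0, c[5]);
    oPos.z = dp4(r0, c[6]);
    oPos.w = dp4(r0, c[7]);

    return oPos;
}
```

A reference like this can be handy when comparing shader output against the fixed-function path, since both should produce the same clip-space position for the same matrices.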

examples/cubemark/main.c Normal file

@ -0,0 +1,235 @@
/*
* Cubemark, a microbenchmark which renders a very large number of
* very simple objects. This stresses the throughput of the SVGA3D
* command pipeline and API layers.
*
* Half of the cubes are rendered using fixed-function, and half of
* them are rendered using shaders. This helps highlight any performance
* differences between per-draw setup for FFP vs. for shaders.
*
* Copyright (C) 2008-2009 VMware, Inc. Licensed under the MIT
* License, please see the README.txt. All rights reserved.
*/
#include "svga3dutil.h"
#include "svga3dtext.h"
#include "matrix.h"
#include "math.h"
typedef uint32 DWORD;
#include "cube_vs.h"
#include "cube_ps.h"
#define MY_VSHADER_ID 0
#define MY_PSHADER_ID 0
#define CONST_MAT_VIEW 0
#define CONST_MAT_PROJ 4
typedef struct {
float position[3];
uint32 color;
} MyVertex;
/*
* Two colors for the cubes, so we can see them rotate more easily.
*/
#define COLOR1 0x8080FF
#define COLOR2 0x000080
/*
* This defines the grid spacing, as well as the total number of cubes we draw.
*/
#define GRID_X_MIN (-35)
#define GRID_X_MAX 35
#define GRID_Y_MIN (-20)
#define GRID_Y_MAX 20
#define GRID_STEP 2
static const MyVertex vertexData[] = {
{ {-1, -1, -1}, COLOR1 },
{ {-1, -1, 1}, COLOR1 },
{ {-1, 1, -1}, COLOR1 },
{ {-1, 1, 1}, COLOR1 },
{ { 1, -1, -1}, COLOR2 },
{ { 1, -1, 1}, COLOR2 },
{ { 1, 1, -1}, COLOR2 },
{ { 1, 1, 1}, COLOR2 },
};
#define QUAD(a,b,c,d) a, b, d, d, c, a
static const uint16 indexData[] = {
QUAD(0,1,2,3), // -X
QUAD(4,5,6,7), // +X
QUAD(0,1,4,5), // -Y
QUAD(2,3,6,7), // +Y
QUAD(0,2,4,6), // -Z
QUAD(1,3,5,7), // +Z
};
#undef QUAD
const uint32 numTriangles = sizeof indexData / sizeof indexData[0] / 3;
uint32 vertexSid, indexSid;
Matrix perspectiveMat;
FPSCounterState gFPS;
VMMousePacket lastMouseState;
/*
* render --
*
* Set up common render state and matrices, then enter a loop
* drawing many cubes with individual draw commands.
*
* This render state only needs to be set each frame because
* SVGA3DText_Draw() changes it.
*/
void
render(void)
{
SVGA3dTextureState *ts;
SVGA3dRenderState *rs;
SVGA3dVertexDecl *decls;
SVGA3dPrimitiveRange *ranges;
static Matrix view, instance;
float x, y;
Bool useShaders = FALSE;
Matrix_Copy(view, gIdentityMatrix);
Matrix_Scale(view, 0.5, 0.5, 0.5, 1.0);
Matrix_RotateX(view, 30.0 * M_PI / 180.0);
Matrix_RotateY(view, gFPS.frame * 0.1f);
Matrix_Translate(view, 0, 0, 75);
SVGA3D_SetTransform(CID, SVGA3D_TRANSFORM_WORLD, gIdentityMatrix);
SVGA3D_SetTransform(CID, SVGA3D_TRANSFORM_PROJECTION, perspectiveMat);
SVGA3DUtil_SetShaderConstMatrix(CID, CONST_MAT_PROJ,
SVGA3D_SHADERTYPE_VS, perspectiveMat);
SVGA3D_BeginSetRenderState(CID, &rs, 4);
{
rs[0].state = SVGA3D_RS_BLENDENABLE;
rs[0].uintValue = FALSE;
rs[1].state = SVGA3D_RS_ZENABLE;
rs[1].uintValue = TRUE;
rs[2].state = SVGA3D_RS_ZWRITEENABLE;
rs[2].uintValue = TRUE;
rs[3].state = SVGA3D_RS_ZFUNC;
rs[3].uintValue = SVGA3D_CMP_LESS;
}
SVGA_FIFOCommitAll();
SVGA3D_BeginSetTextureState(CID, &ts, 4);
{
ts[0].stage = 0;
ts[0].name = SVGA3D_TS_BIND_TEXTURE;
ts[0].value = SVGA3D_INVALID_ID;
ts[1].stage = 0;
ts[1].name = SVGA3D_TS_COLOROP;
ts[1].value = SVGA3D_TC_SELECTARG1;
ts[2].stage = 0;
ts[2].name = SVGA3D_TS_COLORARG1;
ts[2].value = SVGA3D_TA_DIFFUSE;
ts[3].stage = 0;
ts[3].name = SVGA3D_TS_ALPHAARG1;
ts[3].value = SVGA3D_TA_DIFFUSE;
}
SVGA_FIFOCommitAll();
for (x = GRID_X_MIN; x <= GRID_X_MAX; x += GRID_STEP) {
for (y = GRID_Y_MIN; y <= GRID_Y_MAX; y += GRID_STEP) {
Matrix_Copy(instance, view);
Matrix_Translate(instance, x, y, 0);
if (useShaders) {
SVGA3D_SetShader(CID, SVGA3D_SHADERTYPE_VS, MY_VSHADER_ID);
SVGA3D_SetShader(CID, SVGA3D_SHADERTYPE_PS, MY_PSHADER_ID);
SVGA3DUtil_SetShaderConstMatrix(CID, CONST_MAT_VIEW,
SVGA3D_SHADERTYPE_VS, instance);
} else {
SVGA3D_SetShader(CID, SVGA3D_SHADERTYPE_VS, SVGA3D_INVALID_ID);
SVGA3D_SetShader(CID, SVGA3D_SHADERTYPE_PS, SVGA3D_INVALID_ID);
SVGA3D_SetTransform(CID, SVGA3D_TRANSFORM_VIEW, instance);
}
SVGA3D_BeginDrawPrimitives(CID, &decls, 2, &ranges, 1);
{
decls[0].identity.type = SVGA3D_DECLTYPE_FLOAT3;
decls[0].identity.usage = SVGA3D_DECLUSAGE_POSITION;
decls[0].array.surfaceId = vertexSid;
decls[0].array.stride = sizeof(MyVertex);
decls[0].array.offset = offsetof(MyVertex, position);
decls[1].identity.type = SVGA3D_DECLTYPE_D3DCOLOR;
decls[1].identity.usage = SVGA3D_DECLUSAGE_COLOR;
decls[1].array.surfaceId = vertexSid;
decls[1].array.stride = sizeof(MyVertex);
decls[1].array.offset = offsetof(MyVertex, color);
ranges[0].primType = SVGA3D_PRIMITIVE_TRIANGLELIST;
ranges[0].primitiveCount = numTriangles;
ranges[0].indexArray.surfaceId = indexSid;
ranges[0].indexArray.stride = sizeof(uint16);
ranges[0].indexWidth = sizeof(uint16);
}
SVGA_FIFOCommitAll();
}
useShaders = !useShaders;
}
SVGA3D_SetShader(CID, SVGA3D_SHADERTYPE_VS, SVGA3D_INVALID_ID);
SVGA3D_SetShader(CID, SVGA3D_SHADERTYPE_PS, SVGA3D_INVALID_ID);
}
/*
* main --
*
* Our example's entry point, invoked directly by the bootloader.
*/
int
main(void)
{
SVGA3DUtil_InitFullscreen(CID, 800, 600);
SVGA3DText_Init();
vertexSid = SVGA3DUtil_DefineStaticBuffer(vertexData, sizeof vertexData);
indexSid = SVGA3DUtil_DefineStaticBuffer(indexData, sizeof indexData);
SVGA3D_DefineShader(CID, MY_VSHADER_ID, SVGA3D_SHADERTYPE_VS,
g_vs20_MyVertexShader, sizeof g_vs20_MyVertexShader);
SVGA3D_DefineShader(CID, MY_PSHADER_ID, SVGA3D_SHADERTYPE_PS,
g_ps20_MyPixelShader, sizeof g_ps20_MyPixelShader);
Matrix_Perspective(perspectiveMat, 45.0f,
gSVGA.width / (float)gSVGA.height, 10.0f, 100.0f);
while (1) {
if (SVGA3DUtil_UpdateFPSCounter(&gFPS)) {
Console_Clear();
Console_Format("Cubemark microbenchmark\n\n%s", gFPS.text);
SVGA3DText_Update();
VMBackdoor_VGAScreenshot();
}
SVGA3DUtil_ClearFullscreen(CID, SVGA3D_CLEAR_COLOR | SVGA3D_CLEAR_DEPTH,
0x000000, 1.0f, 0);
render();
SVGA3DText_Draw();
SVGA3DUtil_PresentFullscreen();
}
return 0;
}
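
To get a feel for how much work Cubemark submits per frame, the sketch below replays the grid loops from render() with the drawing commands removed and simply counts. The grid constants are copied from main.c; FrameStats, count_frame_work, and TRIANGLES_PER_CUBE are hypothetical names used only for this illustration.

```c
#include <assert.h>

#define GRID_X_MIN  (-35)
#define GRID_X_MAX  35
#define GRID_Y_MIN  (-20)
#define GRID_Y_MAX  20
#define GRID_STEP   2

/* Six QUAD() faces in indexData, two triangles each. */
#define TRIANGLES_PER_CUBE  12

typedef struct {
    unsigned ffpCubes;     /* Cubes drawn with fixed-function state */
    unsigned shaderCubes;  /* Cubes drawn with the VS/PS pair bound */
    unsigned triangles;    /* Total triangles submitted */
} FrameStats;

/* Replay the loop structure of render(), counting instead of drawing. */
static FrameStats
count_frame_work(void)
{
    FrameStats s = { 0, 0, 0 };
    int useShaders = 0;
    float x, y;

    for (x = GRID_X_MIN; x <= GRID_X_MAX; x += GRID_STEP) {
        for (y = GRID_Y_MIN; y <= GRID_Y_MAX; y += GRID_STEP) {
            if (useShaders) {
                s.shaderCubes++;
            } else {
                s.ffpCubes++;
            }
            s.triangles += TRIANGLES_PER_CUBE;
        }
        /* The path alternates once per column, as in render(). */
        useShaders = !useShaders;
    }

    assert(s.ffpCubes + s.shaderCubes == s.triangles / TRIANGLES_PER_CUBE);
    return s;
}
```

With the constants above this comes to 36 columns of 21 cubes, so 756 draw commands and 9072 triangles per frame, split evenly between the two paths. Each of those draws is preceded by its own SetShader and matrix commands, which is the per-draw overhead the benchmark is meant to stress.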


@ -0,0 +1,6 @@
TARGET = dynamic-vertex-stress.img
APP_SOURCES = main.c
LIB_DIR = ../../lib
include $(LIB_DIR)/Makefile.rules


@ -0,0 +1,353 @@
/*
* SVGA3D example: Dynamic vertex buffer stress-test.
*
* This example is a performance stress-test for dynamic vertex
* buffers, and specifically for performing DMA on buffers which may
* still be in use by the GPU.
*
* Like the original dynamic-vertex test, we compute an animated
* function on the guest CPU and upload it via a vertex buffer before
* each draw. To simulate the stresses involved in dealing with apps
* that render in immediate-mode, however, this test breaks the vertex
* buffer up into very small pieces which are all DMA'ed and rendered
* individually.
*
* If the SVGA3D implementation has any bottlenecks related to reusing
* vertex buffers that are still in use by the physical GPU, this test
* will expose them.
*
* Copyright (C) 2008-2009 VMware, Inc. Licensed under the MIT
* License, please see the README.txt. All rights reserved.
*/
#include "svga3dutil.h"
#include "svga3dtext.h"
#include "matrix.h"
#include "math.h"
#define MESH_WIDTH 256 /* 64 kilovertices, 1.5MB */
#define MESH_HEIGHT 256
#define MESH_NUM_VERTICES (MESH_WIDTH * MESH_HEIGHT)
#define MESH_NUM_QUADS ((MESH_WIDTH-1) * (MESH_HEIGHT-1))
#define MESH_NUM_TRIANGLES (MESH_NUM_QUADS * 2)
#define MESH_NUM_INDICES (MESH_NUM_TRIANGLES * 3)
#define MESH_NUM_BYTES (MESH_NUM_VERTICES * sizeof(MyVertex))
#define TRIANGLES_PER_ROW ((MESH_WIDTH-1) * 2)
#define INDICES_PER_ROW (TRIANGLES_PER_ROW * 3)
#define MESH_ELEMENT(x, y) (MESH_WIDTH * (y) + (x))
typedef struct {
float position[3];
float color[3];
} MyVertex;
typedef uint16 IndexType;
DMAPool vertexDMA;
uint32 vertexSid, indexSid;
Matrix perspectiveMat;
FPSCounterState gFPS;
/*
* setupFrame --
*
* Set up render state that we load once per frame (because
* SVGA3DText clobbered it) and perform matrix calculations that we
* only need once per frame.
*/
void
setupFrame(void)
{
static Matrix world;
static Matrix view;
SVGA3dTextureState *ts;
SVGA3dRenderState *rs;
Matrix_Copy(view, gIdentityMatrix);
Matrix_Translate(view, 0, 0, 3);
SVGA3D_SetTransform(CID, SVGA3D_TRANSFORM_VIEW, view);
Matrix_Copy(world, gIdentityMatrix);
Matrix_RotateX(world, -60.0 * PI_OVER_180);
Matrix_RotateY(world, gFPS.frame * 0.01f);
SVGA3D_SetTransform(CID, SVGA3D_TRANSFORM_WORLD, world);
SVGA3D_SetTransform(CID, SVGA3D_TRANSFORM_PROJECTION, perspectiveMat);
SVGA3D_BeginSetRenderState(CID, &rs, 4);
{
rs[0].state = SVGA3D_RS_BLENDENABLE;
rs[0].uintValue = FALSE;
rs[1].state = SVGA3D_RS_ZENABLE;
rs[1].uintValue = TRUE;
rs[2].state = SVGA3D_RS_ZWRITEENABLE;
rs[2].uintValue = TRUE;
rs[3].state = SVGA3D_RS_ZFUNC;
rs[3].uintValue = SVGA3D_CMP_LESS;
}
SVGA_FIFOCommitAll();
SVGA3D_BeginSetTextureState(CID, &ts, 4);
{
ts[0].stage = 0;
ts[0].name = SVGA3D_TS_BIND_TEXTURE;
ts[0].value = SVGA3D_INVALID_ID;
ts[1].stage = 0;
ts[1].name = SVGA3D_TS_COLOROP;
ts[1].value = SVGA3D_TC_SELECTARG1;
ts[2].stage = 0;
ts[2].name = SVGA3D_TS_COLORARG1;
ts[2].value = SVGA3D_TA_DIFFUSE;
ts[3].stage = 0;
ts[3].name = SVGA3D_TS_ALPHAARG1;
ts[3].value = SVGA3D_TA_DIFFUSE;
}
SVGA_FIFOCommitAll();
}
/*
* updateVertices --
*
* Calculate new vertices, writing them directly into an available
* DMA buffer. Returns a DMAPoolBuffer which contains the vertex
* data for an entire frame.
*/
DMAPoolBuffer *
updateVertices(float red, float green, float blue, float phase, float offset)
{
DMAPoolBuffer *dma;
MyVertex *vert;
int x, y;
float t = gFPS.frame * 0.1f + phase;
dma = SVGA3DUtil_DMAPoolGetBuffer(&vertexDMA);
vert = (MyVertex*) dma->buffer;
for (y = 0; y < MESH_HEIGHT; y++) {
for (x = 0; x < MESH_WIDTH; x++) {
float fx = x * (2.0 / MESH_WIDTH) - 1.0;
float fy = y * (2.0 / MESH_HEIGHT) - 1.0;
float fxo = fx + offset;
float dist = fxo * fxo + fy * fy;
float z = sinf(dist * 8.0 + t) / (1 + dist * 10.0);
vert->position[0] = fx;
vert->position[1] = fy;
vert->position[2] = z;
vert->color[0] = red - z;
vert->color[1] = green - z;
vert->color[2] = blue - z;
vert++;
}
}
return dma;
}
/*
* createIndexBuffer --
*
* Create a static index buffer that renders our vertices as a 2D
* mesh. For simplicity, we use a triangle list rather than a
* triangle strip.
*/
uint32
createIndexBuffer(void)
{
IndexType *indexBuffer;
const uint32 bufferSize = MESH_NUM_INDICES * sizeof *indexBuffer;
SVGAGuestPtr gPtr;
uint32 sid;
int x, y;
sid = SVGA3DUtil_DefineSurface2D(bufferSize, 1, SVGA3D_BUFFER);
indexBuffer = SVGA3DUtil_AllocDMABuffer(bufferSize, &gPtr);
for (y = 0; y < (MESH_HEIGHT - 1); y++) {
for (x = 0; x < (MESH_WIDTH - 1); x++) {
indexBuffer[0] = MESH_ELEMENT(x, y );
indexBuffer[1] = MESH_ELEMENT(x+1, y );
indexBuffer[2] = MESH_ELEMENT(x+1, y+1);
indexBuffer[3] = MESH_ELEMENT(x+1, y+1);
indexBuffer[4] = MESH_ELEMENT(x, y+1);
indexBuffer[5] = MESH_ELEMENT(x, y );
indexBuffer += 6;
}
}
SVGA3DUtil_SurfaceDMA2D(sid, &gPtr, SVGA3D_WRITE_HOST_VRAM, bufferSize, 1);
return sid;
}
/*
* trashBuffer --
*
* Upload zeroes to the vertex buffer, to make any future DMA errors obvious.
*/
void trashBuffer(void)
{
DMAPoolBuffer *dma = SVGA3DUtil_DMAPoolGetBuffer(&vertexDMA);
memset(dma->buffer, 0, MESH_NUM_BYTES);
SVGA3DUtil_SurfaceDMA2D(vertexSid, &dma->ptr,
SVGA3D_WRITE_HOST_VRAM, MESH_NUM_BYTES, 1);
SVGA3DUtil_AsyncCall((AsyncCallFn) SVGA3DUtil_DMAPoolFreeBuffer, dma);
}
/*
* uploadRow --
*
* Upload the vertex data for one row of the mesh.
*/
void uploadRow(int row, DMAPoolBuffer *dma)
{
SVGA3dCopyBox *boxes;
SVGA3dGuestImage guestImage;
SVGA3dSurfaceImageId hostImage = { vertexSid };
guestImage.ptr = dma->ptr;
guestImage.pitch = 0;
SVGA3D_BeginSurfaceDMA(&guestImage, &hostImage, SVGA3D_WRITE_HOST_VRAM, &boxes, 1);
{
boxes[0].x = MESH_WIDTH * sizeof(MyVertex) * row;
boxes[0].w = MESH_WIDTH * sizeof(MyVertex);
boxes[0].srcx = boxes[0].x;
boxes[0].h = 1;
boxes[0].d = 1;
}
SVGA_FIFOCommitAll();
}
/*
* drawStrip --
*
* Draw all triangles between 'row' and 'row+1'.
*/
void
drawStrip(int row)
{
SVGA3dVertexDecl *decls;
SVGA3dPrimitiveRange *ranges;
SVGA3D_BeginDrawPrimitives(CID, &decls, 2, &ranges, 1);
{
decls[0].identity.type = SVGA3D_DECLTYPE_FLOAT3;
decls[0].identity.usage = SVGA3D_DECLUSAGE_POSITION;
decls[0].array.surfaceId = vertexSid;
decls[0].array.stride = sizeof(MyVertex);
decls[0].array.offset = offsetof(MyVertex, position);
decls[1].identity.type = SVGA3D_DECLTYPE_FLOAT3;
decls[1].identity.usage = SVGA3D_DECLUSAGE_COLOR;
decls[1].array.surfaceId = vertexSid;
decls[1].array.stride = sizeof(MyVertex);
decls[1].array.offset = offsetof(MyVertex, color);
ranges[0].primType = SVGA3D_PRIMITIVE_TRIANGLELIST;
ranges[0].primitiveCount = TRIANGLES_PER_ROW;
ranges[0].indexArray.surfaceId = indexSid;
ranges[0].indexArray.stride = sizeof(IndexType);
ranges[0].indexArray.offset = sizeof(IndexType) * INDICES_PER_ROW * row;
ranges[0].indexWidth = sizeof(IndexType);
}
SVGA_FIFOCommitAll();
}
/*
* render --
*
* Calculate, upload, and draw the entire mesh.
*/
void
render(void)
{
DMAPoolBuffer *dma = updateVertices(0.2, 0.8, 0.2, 0, 0);
int row;
trashBuffer();
uploadRow(0, dma);
for (row = 1; row < MESH_HEIGHT; row++) {
uploadRow(row, dma);
drawStrip(row - 1);
}
SVGA3DUtil_AsyncCall((AsyncCallFn) SVGA3DUtil_DMAPoolFreeBuffer, dma);
}
/*
* main --
*
* Our example's entry point, invoked directly by the bootloader.
*/
int
main(void)
{
SVGA3DUtil_InitFullscreen(CID, 800, 600);
SVGA3DText_Init();
vertexSid = SVGA3DUtil_DefineSurface2D(MESH_NUM_BYTES, 1, SVGA3D_BUFFER);
indexSid = createIndexBuffer();
SVGA3DUtil_AllocDMAPool(&vertexDMA, MESH_NUM_BYTES, 16);
Matrix_Perspective(perspectiveMat, 45.0f,
gSVGA.width / (float)gSVGA.height, 0.1f, 100.0f);
while (1) {
if (SVGA3DUtil_UpdateFPSCounter(&gFPS)) {
Console_Clear();
Console_Format("VMware SVGA3D Example:\n"
"Dynamic vertex buffer stress-test.\n"
"This example performs a separate DMA and "
"Draw for each row of the mesh.\n\n%s",
gFPS.text);
SVGA3DText_Update();
}
SVGA3DUtil_ClearFullscreen(CID, SVGA3D_CLEAR_COLOR | SVGA3D_CLEAR_DEPTH,
0x113366, 1.0f, 0);
setupFrame();
render();
SVGA3DText_Draw();
SVGA3DUtil_PresentFullscreen();
}
return 0;
}
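
The per-row DMA and draw calls in this example depend on two byte offsets: where a row of vertices starts in the vertex buffer, and where a strip's indices start in the index buffer. The sketch below works those out from the mesh macros, following the row-major MESH_ELEMENT() layout. The macros and MyVertex are copied from main.c; vertex_row_offset and strip_index_offset are hypothetical helpers, and uint16_t stands in for the library's uint16.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MESH_WIDTH         256
#define MESH_HEIGHT        256
#define MESH_NUM_VERTICES  (MESH_WIDTH * MESH_HEIGHT)
#define MESH_NUM_BYTES     (MESH_NUM_VERTICES * sizeof(MyVertex))
#define TRIANGLES_PER_ROW  ((MESH_WIDTH-1) * 2)
#define INDICES_PER_ROW    (TRIANGLES_PER_ROW * 3)
#define MESH_ELEMENT(x, y) (MESH_WIDTH * (y) + (x))

typedef struct {
    float position[3];
    float color[3];
} MyVertex;

typedef uint16_t IndexType;

/* Byte offset of the first vertex in 'row' within the vertex buffer. */
static uint32_t
vertex_row_offset(int row)
{
    assert(row >= 0 && row < MESH_HEIGHT);
    return (uint32_t)(MESH_WIDTH * sizeof(MyVertex) * (size_t)row);
}

/* Byte offset of the indices for the strip between 'row' and 'row+1'. */
static uint32_t
strip_index_offset(int row)
{
    assert(row >= 0 && row < MESH_HEIGHT - 1);
    return (uint32_t)(sizeof(IndexType) * INDICES_PER_ROW * (size_t)row);
}
```

Each row is 6144 bytes of vertex data and each strip is 3060 bytes of indices, so a frame issues 256 small DMA transfers and 255 draws against the same 1.5MB vertex buffer. The largest index, MESH_ELEMENT(255, 255), is 65535, which is the most a 16-bit IndexType can address.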


@ -0,0 +1,6 @@
TARGET = dynamic-vertex.img
APP_SOURCES = main.c
LIB_DIR = ../../lib
include $(LIB_DIR)/Makefile.rules


@ -0,0 +1,279 @@
/*
* SVGA3D example: Dynamic vertex buffers.
*
* This example shows how to efficiently stream vertex data to the
* GPU, using multiple DMA buffers but a single vertex buffer. We
* allocate DMA buffers from a pool every time we want to draw a new
* dynamic mesh, then we asynchronously recycle those buffers after
* the DMA transfer has completed.
*
* Copyright (C) 2008-2009 VMware, Inc. Licensed under the MIT
* License, please see the README.txt. All rights reserved.
*/
#include "svga3dutil.h"
#include "svga3dtext.h"
#include "matrix.h"
#include "math.h"
#define MESH_WIDTH 128
#define MESH_HEIGHT 128
#define MESH_NUM_VERTICES (MESH_WIDTH * MESH_HEIGHT)
#define MESH_NUM_QUADS ((MESH_WIDTH-1) * (MESH_HEIGHT-1))
#define MESH_NUM_TRIANGLES (MESH_NUM_QUADS * 2)
#define MESH_NUM_INDICES (MESH_NUM_TRIANGLES * 3)
#define MESH_ELEMENT(x, y) (MESH_WIDTH * (y) + (x))
typedef struct {
float position[3];
float color[3];
} MyVertex;
typedef uint16 IndexType;
DMAPool vertexDMA;
uint32 vertexSid, indexSid;
Matrix perspectiveMat;
FPSCounterState gFPS;
/*
* setupFrame --
*
* Set up render state that we load once per frame (because
* SVGA3DText clobbered it) and perform matrix calculations that we
* only need once per frame.
*/
void
setupFrame(void)
{
static Matrix world;
SVGA3dTextureState *ts;
SVGA3dRenderState *rs;
Matrix_Copy(world, gIdentityMatrix);
Matrix_RotateX(world, -60.0 * PI_OVER_180);
Matrix_RotateY(world, gFPS.frame * 0.001f);
SVGA3D_SetTransform(CID, SVGA3D_TRANSFORM_WORLD, world);
SVGA3D_SetTransform(CID, SVGA3D_TRANSFORM_PROJECTION, perspectiveMat);
SVGA3D_BeginSetRenderState(CID, &rs, 4);
{
rs[0].state = SVGA3D_RS_BLENDENABLE;
rs[0].uintValue = FALSE;
rs[1].state = SVGA3D_RS_ZENABLE;
rs[1].uintValue = TRUE;
rs[2].state = SVGA3D_RS_ZWRITEENABLE;
rs[2].uintValue = TRUE;
rs[3].state = SVGA3D_RS_ZFUNC;
rs[3].uintValue = SVGA3D_CMP_LESS;
}
SVGA_FIFOCommitAll();
SVGA3D_BeginSetTextureState(CID, &ts, 4);
{
ts[0].stage = 0;
ts[0].name = SVGA3D_TS_BIND_TEXTURE;
ts[0].value = SVGA3D_INVALID_ID;
ts[1].stage = 0;
ts[1].name = SVGA3D_TS_COLOROP;
ts[1].value = SVGA3D_TC_SELECTARG1;
ts[2].stage = 0;
ts[2].name = SVGA3D_TS_COLORARG1;
ts[2].value = SVGA3D_TA_DIFFUSE;
ts[3].stage = 0;
ts[3].name = SVGA3D_TS_ALPHAARG1;
ts[3].value = SVGA3D_TA_DIFFUSE;
}
SVGA_FIFOCommitAll();
}
/*
* updateVertices --
*
* Calculate new vertices, writing them directly into an available
* DMA buffer. Asynchronously begin DMA and recycle the buffer.
*/
void
updateVertices(float red, float green, float blue, float phase, float offset)
{
DMAPoolBuffer *dma;
MyVertex *vert;
int x, y;
float t = gFPS.frame * 0.01f + phase;
dma = SVGA3DUtil_DMAPoolGetBuffer(&vertexDMA);
vert = (MyVertex*) dma->buffer;
for (y = 0; y < MESH_HEIGHT; y++) {
for (x = 0; x < MESH_WIDTH; x++) {
float fx = x * (2.0 / MESH_WIDTH) - 1.0;
float fy = y * (2.0 / MESH_HEIGHT) - 1.0;
float fxo = fx + offset;
float dist = fxo * fxo + fy * fy;
float z = sinf(dist * 8.0 + t) / (1 + dist * 10.0);
vert->position[0] = fx;
vert->position[1] = fy;
vert->position[2] = z;
vert->color[0] = red - z;
vert->color[1] = green - z;
vert->color[2] = blue - z;
vert++;
}
}
SVGA3DUtil_SurfaceDMA2D(vertexSid, &dma->ptr, SVGA3D_WRITE_HOST_VRAM,
MESH_NUM_VERTICES * sizeof(MyVertex), 1);
SVGA3DUtil_AsyncCall((AsyncCallFn) SVGA3DUtil_DMAPoolFreeBuffer, dma);
}
/*
* drawMesh --
*
* Draw our mesh at a particular position. This uses the index and
* vertex data which is resident in the host VRAM buffers at the
* time the drawing command is executed asynchronously.
*/
void
drawMesh(float posX, float posY, float posZ)
{
SVGA3dVertexDecl *decls;
SVGA3dPrimitiveRange *ranges;
static Matrix view;
Matrix_Copy(view, gIdentityMatrix);
Matrix_Translate(view, posX, posY, posZ);
SVGA3D_SetTransform(CID, SVGA3D_TRANSFORM_VIEW, view);
SVGA3D_BeginDrawPrimitives(CID, &decls, 2, &ranges, 1);
{
decls[0].identity.type = SVGA3D_DECLTYPE_FLOAT3;
decls[0].identity.usage = SVGA3D_DECLUSAGE_POSITION;
decls[0].array.surfaceId = vertexSid;
decls[0].array.stride = sizeof(MyVertex);
decls[0].array.offset = offsetof(MyVertex, position);
decls[1].identity.type = SVGA3D_DECLTYPE_FLOAT3;
decls[1].identity.usage = SVGA3D_DECLUSAGE_COLOR;
decls[1].array.surfaceId = vertexSid;
decls[1].array.stride = sizeof(MyVertex);
decls[1].array.offset = offsetof(MyVertex, color);
ranges[0].primType = SVGA3D_PRIMITIVE_TRIANGLELIST;
ranges[0].primitiveCount = MESH_NUM_TRIANGLES;
ranges[0].indexArray.surfaceId = indexSid;
ranges[0].indexArray.stride = sizeof(IndexType);
ranges[0].indexWidth = sizeof(IndexType);
}
SVGA_FIFOCommitAll();
}
/*
* createIndexBuffer --
*
* Create a static index buffer that renders our vertices as a 2D
* mesh. For simplicity, we use a triangle list rather than a
* triangle strip.
*/
uint32
createIndexBuffer(void)
{
IndexType *indexBuffer;
const uint32 bufferSize = MESH_NUM_INDICES * sizeof *indexBuffer;
SVGAGuestPtr gPtr;
uint32 sid;
int x, y;
sid = SVGA3DUtil_DefineSurface2D(bufferSize, 1, SVGA3D_BUFFER);
indexBuffer = SVGA3DUtil_AllocDMABuffer(bufferSize, &gPtr);
for (y = 0; y < (MESH_HEIGHT - 1); y++) {
for (x = 0; x < (MESH_WIDTH - 1); x++) {
indexBuffer[0] = MESH_ELEMENT(x, y );
indexBuffer[1] = MESH_ELEMENT(x+1, y );
indexBuffer[2] = MESH_ELEMENT(x+1, y+1);
indexBuffer[3] = MESH_ELEMENT(x+1, y+1);
indexBuffer[4] = MESH_ELEMENT(x, y+1);
indexBuffer[5] = MESH_ELEMENT(x, y );
indexBuffer += 6;
}
}
SVGA3DUtil_SurfaceDMA2D(sid, &gPtr, SVGA3D_WRITE_HOST_VRAM, bufferSize, 1);
return sid;
}
/*
* main --
*
* Our example's entry point, invoked directly by the bootloader.
*/
int
main(void)
{
SVGA3DUtil_InitFullscreen(CID, 800, 600);
SVGA3DText_Init();
vertexSid = SVGA3DUtil_DefineSurface2D(MESH_NUM_VERTICES * sizeof(MyVertex),
1, SVGA3D_BUFFER);
indexSid = createIndexBuffer();
SVGA3DUtil_AllocDMAPool(&vertexDMA, MESH_NUM_VERTICES * sizeof(MyVertex), 16);
Matrix_Perspective(perspectiveMat, 45.0f,
gSVGA.width / (float)gSVGA.height, 0.1f, 100.0f);
while (1) {
if (SVGA3DUtil_UpdateFPSCounter(&gFPS)) {
Console_Clear();
Console_Format("VMware SVGA3D Example:\n"
"Dynamic vertex buffers.\n\n%s",
gFPS.text);
SVGA3DText_Update();
}
SVGA3DUtil_ClearFullscreen(CID, SVGA3D_CLEAR_COLOR | SVGA3D_CLEAR_DEPTH,
0x113366, 1.0f, 0);
setupFrame();
updateVertices(1, 0.5, 0.5, M_PI, 0);
drawMesh(-1.5, -1, 6);
updateVertices(0.5, 1.0, 0.5, 0, 0);
drawMesh(0, 1, 6);
updateVertices(0.5, 0.5, 1.0, 0, 1.5);
drawMesh(1.5, -1, 6);
SVGA3DText_Draw();
SVGA3DUtil_PresentFullscreen();
}
return 0;
}
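
The index pattern produced by createIndexBuffer() is easier to see on a tiny mesh. The sketch below is a standalone version of the same loop that writes into a caller-supplied array instead of a DMA buffer, with the mesh size passed as parameters. build_grid_indices is a hypothetical name, and uint16_t stands in for the library's uint16.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t IndexType;

/*
 * Fill 'out' with a triangle list covering a width x height grid of
 * vertices stored in row-major order. 'out' must have room for
 * (width-1) * (height-1) * 6 entries. Returns the number of indices written.
 */
static size_t
build_grid_indices(IndexType *out, int width, int height)
{
    size_t n = 0;
    int x, y;

    assert(out != NULL && width >= 2 && height >= 2);
    assert((long)width * height <= 65536);  /* Must fit in 16-bit indices */

    for (y = 0; y < height - 1; y++) {
        for (x = 0; x < width - 1; x++) {
            /* Two triangles per quad, same vertex order as the example. */
            out[n + 0] = (IndexType)(width * y       + x);
            out[n + 1] = (IndexType)(width * y       + x + 1);
            out[n + 2] = (IndexType)(width * (y + 1) + x + 1);
            out[n + 3] = (IndexType)(width * (y + 1) + x + 1);
            out[n + 4] = (IndexType)(width * (y + 1) + x);
            out[n + 5] = (IndexType)(width * y       + x);
            n += 6;
        }
    }
    return n;
}
```

For a 3x3 grid this writes 24 indices, and the first quad comes out as 0, 1, 4, 4, 3, 0. A triangle strip would need fewer indices, but the list form keeps each quad independent, which is the simplification the comment on createIndexBuffer() refers to.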


@ -0,0 +1,6 @@
TARGET = fence-stress.img
APP_SOURCES = main.c
LIB_DIR = ../../lib
include $(LIB_DIR)/Makefile.rules


@ -0,0 +1,58 @@
/*
* SVGA3D example: Stress-test for our FIFO Fence synchronization.
*
* Copyright (C) 2008-2009 VMware, Inc. Licensed under the MIT
* License, please see the README.txt. All rights reserved.
*/
#include "svga3dutil.h"
#include "svga3dtext.h"
#define SYNCS_PER_FRAME 1024
int
main(void)
{
int i, j;
uint32 fence = 0;
static FPSCounterState gFPS;
SVGA3DUtil_InitFullscreen(CID, 640, 480);
SVGA3DText_Init();
while (1) {
SVGA3DUtil_UpdateFPSCounter(&gFPS);
Console_Clear();
Console_Format("VMware SVGA3D Example:\n"
"FIFO Fence stress-test.\n"
"%d syncs per frame.\n"
"\n"
"%s\n"
"\n"
"Latest fence: 0x%08x\n"
" IRQ count: %d\n",
SYNCS_PER_FRAME, gFPS.text, fence, gSVGA.irq.count);
SVGA3DText_Update();
SVGA3DUtil_ClearFullscreen(CID, SVGA3D_CLEAR_COLOR, 0, 1.0f, 0);
SVGA3DText_Draw();
SVGA3DUtil_PresentFullscreen();
for (j = 0; j < SYNCS_PER_FRAME; j++) {
for (i=0; i<100; i++) {
SVGA_InsertFence();
}
fence = SVGA_InsertFence();
for (i=0; i<50; i++) {
SVGA_InsertFence();
}
SVGA_SyncToFence(fence);
}
}
return 0;
}
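
The loop structure of this test determines how many fences are inserted and which ones are actually waited on. The sketch below only counts; it does not model the device or the FIFO. FenceStats, count_fences_per_frame, FENCES_BEFORE, and FENCES_AFTER are hypothetical names; the values 100 and 50 are the inner loop bounds from main().

```c
#include <assert.h>

#define SYNCS_PER_FRAME  1024
#define FENCES_BEFORE    100   /* Fences inserted before the one we wait on */
#define FENCES_AFTER     50    /* Fences inserted after it, before the sync */

typedef struct {
    unsigned inserted;         /* Total fences inserted this frame */
    unsigned syncs;            /* Calls that block on a fence */
    unsigned lastSyncOrdinal;  /* 1-based position of the last waited fence */
} FenceStats;

/* Replay the loop structure of main(), counting instead of calling SVGA_*. */
static FenceStats
count_fences_per_frame(void)
{
    FenceStats s = { 0, 0, 0 };
    int i, j;

    for (j = 0; j < SYNCS_PER_FRAME; j++) {
        for (i = 0; i < FENCES_BEFORE; i++) {
            s.inserted++;
        }
        s.inserted++;                      /* The fence we will sync to */
        s.lastSyncOrdinal = s.inserted;
        for (i = 0; i < FENCES_AFTER; i++) {
            s.inserted++;
        }
        s.syncs++;
    }

    /* The final batch of trailing fences is never waited on directly. */
    assert(s.inserted - s.lastSyncOrdinal == FENCES_AFTER);
    return s;
}
```

Each frame therefore inserts 154624 fences but blocks on only 1024 of them, and every sync targets a fence that already has 50 newer fences queued behind it.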


@ -0,0 +1,7 @@
TARGET = gmr-test.img
VMX_MEMSIZE = 128
APP_SOURCES = main.c
LIB_DIR = ../../lib
include $(LIB_DIR)/Makefile.rules

examples/gmr-test/main.c Normal file

@ -0,0 +1,578 @@
/*
* SVGA3D example: Test harness and low-level example program for
* Guest Memory Regions.
*
* With Guest Memory regions, the SVGA device can perform DMA
* operations directly between guest system memory and host
* VRAM. Guest drivers use the device's GMR registers to set up
* regions of guest memory which can be accessed by the device, then
* the driver refers to these regions by ID when sending pointers over
* the command FIFO.
*
* GMRs support physically contiguous or discontiguous memory. This
* example is a bit contrived because we're testing GMRs without an
* operating system or a virtual memory subsystem. In a real OS,
* support for physically discontiguous addresses would often be
* required in order to ensure that the GMR's address space matches
* that of a particular virtual address space in the OS. In this
* example, we just test physically discontiguous regions for the sake
* of testing them.
*
* This test harness is focused on system memory GMRs; however, it also
* ends up testing much of the GLSurface and GLFBO code, since it
* performs GMR-to-GMR copies by way of surface DMA operations.
*
* Copyright (C) 2008-2009 VMware, Inc. Licensed under the MIT
* License, please see the README.txt. All rights reserved.
*/
#include "svga.h"
#include "svga3dutil.h"
#include "svga3dtext.h"
#include "console_vga.h"
#include "gmr.h"
#include "math.h"
#include "mt19937ar.h"
/* Maximum number of copy boxes we'll test with. The host has no limit. */
#define MAX_COPY_BOXES 128
/*
* Global data
*/
static uint32 tempSurfaceId;
static uint32 randSeed;
static uint32 testIters;
static uint32 testRegionSize;
static const char *testPass;
/*
* TestPattern_Write --
* TestPattern_Check --
*
* Write/check an arbitrary deterministic test pattern in the
* provided buffer. The buffer must be a multiple of 4 bytes long.
*
* Instead of generating a unique random number for every word,
* which would be pretty slow, this generates a prime number of
* random words, which then repeat across the entire check range.
*/
#define PATTERN_BUFFER_LEN 41 // Must be prime
void
TestPattern_Write(uint32 *buffer,
uint32 size)
{
#ifndef DISABLE_CHECKING
uint32 pattern[PATTERN_BUFFER_LEN];
int i;
init_genrand(randSeed);
for (i = 0; i < PATTERN_BUFFER_LEN; i++) {
pattern[i] = genrand_int32();
}
i = 0;
size /= sizeof *buffer;
while (size--) {
*(buffer++) = pattern[i];
if (++i == PATTERN_BUFFER_LEN) {
i = 0;
}
}
#endif
}
void
TestPattern_Check(uint32 *buffer,
uint32 size,
uint32 offset,
uint32 line,
uint32 index)
{
#ifndef DISABLE_CHECKING
uint32 pattern[PATTERN_BUFFER_LEN];
int i;
init_genrand(randSeed);
for (i = 0; i < PATTERN_BUFFER_LEN; i++) {
pattern[i] = genrand_int32();
}
offset /= sizeof *buffer;
size /= sizeof *buffer;
i = offset % PATTERN_BUFFER_LEN;
while (size) {
uint32 v = pattern[i];
if (++i == PATTERN_BUFFER_LEN) {
i = 0;
}
if (*buffer != v) {
SVGA_Disable();
ConsoleVGA_Init();
Console_Format("Test pattern mismatch on %4x.%4x\n"
"Test pass: %s\n"
"Mismatch at %08x, with %08x bytes left in block.\n\n",
line, index, testPass, buffer, size * sizeof *buffer);
size = MIN(size, 16);
while (size) {
Console_Format("Actual: %08x Expected: %08x\n",
*buffer, v);
buffer++;
size--;
v = pattern[i];
if (++i == PATTERN_BUFFER_LEN) {
i = 0;
}
}
Intr_Disable();
Intr_Halt();
}
buffer++;
size--;
}
#endif
}
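
The write/check pair above relies on one property: the expected value of any word depends only on its word offset modulo PATTERN_BUFFER_LEN, so a sub-range such as a single page can be checked on its own. The sketch below shows that idea in isolation. It is not the harness code: a small linear congruential generator stands in for init_genrand()/genrand_int32(), the check returns a mismatch index instead of halting, and make_pattern, pattern_write, and pattern_check are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PATTERN_BUFFER_LEN 41  /* Prime, as in the harness */

/* Stand-in generator (an LCG), NOT the Mersenne Twister used by the harness. */
static void
make_pattern(uint32_t seed, uint32_t pattern[PATTERN_BUFFER_LEN])
{
    int i;

    for (i = 0; i < PATTERN_BUFFER_LEN; i++) {
        seed = seed * 1664525u + 1013904223u;
        pattern[i] = seed;
    }
}

/* Fill 'bytes' bytes of 'buf' with the repeating pattern. */
static void
pattern_write(uint32_t *buf, size_t bytes, uint32_t seed)
{
    uint32_t pattern[PATTERN_BUFFER_LEN];
    size_t w, words = bytes / sizeof *buf;

    make_pattern(seed, pattern);
    for (w = 0; w < words; w++) {
        buf[w] = pattern[w % PATTERN_BUFFER_LEN];
    }
}

/*
 * Check a sub-range that begins 'offsetBytes' into the pattern stream.
 * Returns -1 if every word matches, else the index of the first bad word.
 */
static long
pattern_check(const uint32_t *buf, size_t bytes, size_t offsetBytes,
              uint32_t seed)
{
    uint32_t pattern[PATTERN_BUFFER_LEN];
    size_t w, words = bytes / sizeof *buf;
    size_t first = offsetBytes / sizeof *buf;

    assert(bytes % sizeof *buf == 0 && offsetBytes % sizeof *buf == 0);

    make_pattern(seed, pattern);
    for (w = 0; w < words; w++) {
        if (buf[w] != pattern[(first + w) % PATTERN_BUFFER_LEN]) {
            return (long)w;
        }
    }
    return -1;
}
```

One plausible reason for the "must be prime" requirement is that a 41-word period never lines up with power-of-two sizes such as a page, so data that lands at the wrong page or the wrong offset will presumably fail the check instead of matching by coincidence.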
/*
* GMR_GenericCopy --
*
* Copy between two GMRs, using an arbitrarily shaped buffer
* surface and an arbitrary list of copy boxes.
*
* In the copy boxes, the 'source' represents locations
* on both guest surfaces and the 'destination' represents
* a location in host VRAM.
*/
void
GMR_GenericCopy(SVGAGuestPtr *dest,
SVGAGuestPtr *src,
SVGA3dSize *surfSize,
SVGA3dSurfaceFormat format,
SVGA3dCopyBox *boxes,
uint32 numBoxes)
{
SVGA3dSize *mipSizes;
SVGA3dSurfaceFace *faces;
SVGA3dCopyBox *dmaBoxes;
SVGA3dGuestImage srcImage = { *src };
SVGA3dGuestImage destImage = { *dest };
SVGA3dSurfaceImageId hostImage = { tempSurfaceId };
SVGA3D_BeginDefineSurface(tempSurfaceId, 0, format, &faces, &mipSizes, 1);
faces[0].numMipLevels = 1;
mipSizes[0] = *surfSize;
SVGA_FIFOCommitAll();
SVGA3D_BeginSurfaceDMA(&srcImage, &hostImage, SVGA3D_WRITE_HOST_VRAM,
&dmaBoxes, numBoxes);
memcpy(dmaBoxes, boxes, numBoxes * sizeof boxes[0]);
SVGA_FIFOCommitAll();
SVGA3D_BeginSurfaceDMA(&destImage, &hostImage, SVGA3D_READ_HOST_VRAM,
&dmaBoxes, numBoxes);
memcpy(dmaBoxes, boxes, numBoxes * sizeof boxes[0]);
SVGA_FIFOCommitAll();
SVGA3D_DestroySurface(tempSurfaceId);
/* Wait for both DMA operations to finish. */
SVGA_SyncToFence(SVGA_InsertFence());
}
/*
* Display_BeginPass --
*
* Begin a new test pass, and update the on-screen display.
*/
void
Display_BeginPass(const char *pass)
{
testPass = pass;
Console_Clear();
Console_Format("VMware SVGA3D Example:\n"
"Guest Memory Region stress-test.\n"
"\n"
"Host capabilities\n"
"-----------------\n"
"\n"
" Max IDs: %d\n"
" Max Descriptor Len: %d\n"
"\n"
"Test status\n"
"-----------\n"
"\n"
" Iterations: %d\n"
" Seed: %08x\n"
" Running: %s\n"
"\n"
#ifdef DISABLE_CHECKING
"CHECKING DISABLED. This test can't fail.\n",
#else
"Test is running successfully so far. Will Panic on failure.\n",
#endif
gGMR.maxIds, gGMR.maxDescriptorLen, testIters,
randSeed, testPass);
VMBackdoor_VGAScreenshot();
SVGA3DText_Update();
SVGA3DUtil_ClearFullscreen(CID, SVGA3D_CLEAR_COLOR, 0x000080, 1.0f, 0);
SVGA3DText_Draw();
SVGA3DUtil_PresentFullscreen();
}
/*
* runTestPass --
*
* Run one test pass: create two large GMRs, one contiguous and one
* discontiguous. Copy a test pattern back and forth between the
* two buffers, using the provided surface size and type.
*/
void
runTestPass(uint32 testRegionSize,
SVGA3dSize *surfSize,
SVGA3dSurfaceFormat format,
SVGA3dCopyBox *boxes,
uint32 numBoxes)
{
SVGAGuestPtr contig = { 0, 0 };
SVGAGuestPtr evenPages = { gGMR.maxIds - 1, 0 };
int i;
uint32 contigPages = GMR_DefineContiguous(contig.gmrId, gGMR.maxDescriptorLen * 2);
uint32 discontigPages = GMR_DefineEvenPages(evenPages.gmrId, gGMR.maxDescriptorLen);
/*
* Write a test pattern into the contiguous GMR.
*/
TestPattern_Write(PPN_POINTER(contigPages), testRegionSize);
TestPattern_Check(PPN_POINTER(contigPages), testRegionSize, 0, __LINE__, 0);
/*
* Copy from contiguous to discontiguous.
*/
GMR_GenericCopy(&evenPages, &contig, surfSize, format, boxes, numBoxes);
/*
* Check the discontiguous GMR, page-by-page.
*/
for (i = 0; i < testRegionSize / PAGE_SIZE; i++) {
TestPattern_Check(PPN_POINTER(discontigPages + 2*i),
PAGE_SIZE, PAGE_SIZE * i, __LINE__, i);
}
/*
* Clear the contiguous GMR, then copy data back into it from the discontiguous GMR.
*/
memset(PPN_POINTER(contigPages), 0x42, testRegionSize);
GMR_GenericCopy(&contig, &evenPages, surfSize, format, boxes, numBoxes);
/*
* Check the contiguous GMR again.
*/
TestPattern_Check(PPN_POINTER(contigPages), testRegionSize, 0, __LINE__, i);
GMR_FreeAll();
Heap_Reset();
}
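/*
 * Illustration only (not part of the original example): the page-by-page
 * check above relies on logical page i of the discontiguous region living
 * at physical page 2*i. The sketch below simulates that layout in ordinary
 * memory. The Sketch_* names and SKETCH_PAGE_SIZE are assumptions made for
 * this example; no SVGA device or GMR is involved.
 */

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u  /* Assumed page size, for illustration only */

/* Copy logical page i of 'contig' into physical page 2*i of 'sparse'. */
static void
Sketch_ScatterEvenPages(uint8_t *sparse, const uint8_t *contig, uint32_t numPages)
{
   uint32_t i;

   assert(sparse != NULL && contig != NULL);
   for (i = 0; i < numPages; i++) {
      memcpy(sparse + (size_t)2 * i * SKETCH_PAGE_SIZE,
             contig + (size_t)i * SKETCH_PAGE_SIZE,
             SKETCH_PAGE_SIZE);
   }
}

/* Return 1 if every logical page matches, comparing one page at a time. */
static int
Sketch_CheckEvenPages(const uint8_t *sparse, const uint8_t *contig, uint32_t numPages)
{
   uint32_t i;

   for (i = 0; i < numPages; i++) {
      if (memcmp(sparse + (size_t)2 * i * SKETCH_PAGE_SIZE,
                 contig + (size_t)i * SKETCH_PAGE_SIZE,
                 SKETCH_PAGE_SIZE) != 0) {
         return 0;
      }
   }
   return 1;
}
```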
/*
* createBoxes --
*
* Create an array of N copyboxes which cover an entire surface.
* This begins with a single large copybox, and iteratively splits
* small boxes off from a random face on the original box.
*
* This function can and will generate degenerate copy boxes
* (zero-size). The SVGA3D device must ignore those boxes.
*/
void
createBoxes(SVGA3dSize *size,
SVGA3dCopyBox *boxes,
uint32 numBoxes)
{
uint32 i;
SVGA3dCopyBox space = {
.w = size->width,
.h = size->height,
.d = size->depth,
};
init_genrand(randSeed);
for (i = 0; i < numBoxes - 1; i++) {
uint32 rand = genrand_int32();
uint32 a;
memcpy(&boxes[i], &space, sizeof space);
switch (rand % 6) {
case 0: /* X- */
a = rand % space.w;
boxes[i].w = a;
space.x += a;
space.w -= a;
break;
case 1: /* Y- */
a = rand % space.h;
boxes[i].h = a;
space.y += a;
space.h -= a;
break;
case 2: /* Z- */
a = rand % space.d;
boxes[i].d = a;
space.z += a;
space.d -= a;
break;
case 3: /* X+ */
a = rand % space.w;
boxes[i].w = a;
space.w -= a;
boxes[i].x += space.w;
break;
case 4: /* Y+ */
a = rand % space.h;
boxes[i].h = a;
space.h -= a;
boxes[i].y += space.h;
break;
case 5: /* Z+ */
a = rand % space.d;
boxes[i].d = a;
space.d -= a;
boxes[i].z += space.d;
break;
}
}
boxes[i] = space;
for (i = 0; i < numBoxes; i++) {
boxes[i].srcx = boxes[i].x;
boxes[i].srcy = boxes[i].y;
boxes[i].srcz = boxes[i].z;
}
}
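/*
 * Illustration only: a reduced sketch of the splitting scheme used by
 * createBoxes(). Each step slices a slab off one face of the remaining
 * space, so the box volumes always add up to the original volume, even
 * when some slabs are zero-sized. SketchBox, Sketch_Rand (a small LCG
 * standing in for genrand_int32) and the other Sketch_* names are
 * assumptions made for this example.
 */

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
   uint32_t pos[3];   /* x, y, z */
   uint32_t dim[3];   /* w, h, d */
} SketchBox;

static uint32_t sketchRandState = 1;

/* Small linear congruential generator; a stand-in random source. */
static uint32_t
Sketch_Rand(void)
{
   sketchRandState = sketchRandState * 1664525u + 1013904223u;
   return sketchRandState >> 8;
}

static void
Sketch_SplitBoxes(const uint32_t size[3], SketchBox *boxes, uint32_t numBoxes)
{
   SketchBox space = { { 0, 0, 0 }, { size[0], size[1], size[2] } };
   uint32_t i;

   assert(numBoxes >= 1);
   assert(size[0] > 0 && size[1] > 0 && size[2] > 0);

   for (i = 0; i < numBoxes - 1; i++) {
      uint32_t r = Sketch_Rand();
      uint32_t face = r % 6;
      uint32_t axis = face % 3;
      uint32_t a = r % space.dim[axis];   /* Slab thickness, may be zero */

      boxes[i] = space;
      boxes[i].dim[axis] = a;
      space.dim[axis] -= a;
      if (face < 3) {
         space.pos[axis] += a;                   /* Slab taken from the low face */
      } else {
         boxes[i].pos[axis] += space.dim[axis];  /* Slab taken from the high face */
      }
   }
   boxes[i] = space;   /* The last box is whatever space remains */
}

static uint64_t
Sketch_TotalVolume(const SketchBox *boxes, uint32_t numBoxes)
{
   uint64_t total = 0;
   uint32_t i;

   for (i = 0; i < numBoxes; i++) {
      total += (uint64_t)boxes[i].dim[0] * boxes[i].dim[1] * boxes[i].dim[2];
   }
   return total;
}
```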
/*
* createMisaligned1dBoxes --
*
* Create an array of N 1-dimensional copyboxes, most of which
* have a width of PAGE_SIZE-1 bytes.
*
* The boxes may extend past the end of 'size'. This is okay, because
* the SVGA3D device is responsible for clipping them.
*/
void
createMisaligned1dBoxes(uint32 size,
SVGA3dCopyBox *boxes,
uint32 numBoxes)
{
uint32 offset = 0;
uint32 i;
memset(boxes, 0, sizeof *boxes * numBoxes);
for (i = 0; i < numBoxes - 1; i++) {
boxes[i].x = boxes[i].srcx = offset;
boxes[i].w = PAGE_SIZE-1;
boxes[i].h = 1;
boxes[i].d = 1;
offset += boxes[i].w;
}
boxes[i].x = boxes[i].srcx = offset;
boxes[i].w = size - offset;
boxes[i].h = 1;
boxes[i].d = 1;
}
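/*
 * Illustration only: the same chunking idea in one dimension, without
 * the SVGA3D types. Every chunk but the last is one byte short of a
 * page, so each chunk boundary after the first falls off a page
 * boundary. SketchSpan and Sketch_MisalignedSpans are names invented
 * for this sketch, and unlike the function above it assumes the fixed
 * chunks fit inside 'size'.
 */

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
   uint32_t x;   /* Start offset */
   uint32_t w;   /* Width in bytes */
} SketchSpan;

static void
Sketch_MisalignedSpans(uint32_t size, uint32_t chunk, SketchSpan *spans, uint32_t numSpans)
{
   uint32_t offset = 0;
   uint32_t i;

   assert(numSpans >= 1);
   assert((uint64_t)chunk * (numSpans - 1) <= size);

   for (i = 0; i < numSpans - 1; i++) {
      spans[i].x = offset;
      spans[i].w = chunk;
      offset += chunk;
   }

   /* The final span absorbs whatever is left. */
   spans[i].x = offset;
   spans[i].w = size - offset;
}
```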
/*
* runTests --
*
* Main function to run one iteration of all tests.
*/
void
runTests(void)
{
/* Maximum size of worst-case-discontiguous region we can represent */
uint32 largeRegionSize = gGMR.maxDescriptorLen * PAGE_SIZE;
/* Smaller region, to speed up other testing. */
uint32 regionSize = 0x20 * PAGE_SIZE;
/* Smallest region, suitable for 1D textures. */
uint32 tinyRegionSize = 1024;
SVGA3dSize size1dLarge = {
.width = largeRegionSize,
.height = 1,
.depth = 1,
};
SVGA3dSize size1d = {
.width = tinyRegionSize,
.height = 1,
.depth = 1,
};
SVGA3dSize size2d = {
.width = 0x100,
.height = regionSize / 0x100,
.depth = 1,
};
SVGA3dSize size3d = {
.width = 0x40,
.height = 0x40,
.depth = regionSize / 0x1000,
};
/* A single maximally-sized 1D copybox. The host will clip it. */
SVGA3dCopyBox maxBox1d = {
.w = 0xFFFFFFFFUL,
.h = 1,
.d = 1,
};
SVGA3dCopyBox boxes[MAX_COPY_BOXES];
/*
* Basic per-surface-format tests.
*
* Note that 3D compressed textures are not expected to work yet,
* so we skip those tests.
*/
#define TEST_FORMAT_2D(f, b) \
{ \
Display_BeginPass("Single copy via 1D " #f " surface."); \
runTestPass(tinyRegionSize*b, &size1d, SVGA3D_ ## f, &maxBox1d, 1); \
\
Display_BeginPass("Single copy via 2D " #f " surface."); \
createBoxes(&size2d, boxes, 1); \
runTestPass(regionSize*b, &size2d, SVGA3D_ ## f, boxes, 1); \
}
#define TEST_FORMAT(f, b) \
{ \
TEST_FORMAT_2D(f, b) \
\
Display_BeginPass("Single copy via 3D " #f " surface."); \
createBoxes(&size3d, boxes, 1); \
runTestPass(regionSize*b, &size3d, SVGA3D_ ## f, boxes, 1); \
}
TEST_FORMAT(BUFFER, 1) // Buffers use their own host VRAM type
TEST_FORMAT(LUMINANCE8, 1) // Test a simple 8bpp format
TEST_FORMAT(ALPHA8, 1) // To isolate alpha channel bugs
TEST_FORMAT(A8R8G8B8, 4) // ARGB surfaces have more readback paths than others
TEST_FORMAT_2D(DXT2, 1) // Test 4x4 block size, and compressed texture upload/download
#undef TEST_FORMAT
#undef TEST_FORMAT_2D
/*
* Test large buffers (Limited by max size of worst-case fragmented GMR)
*/
Display_BeginPass("Single copy via 1D BUFFER surface. (Large region)");
runTestPass(largeRegionSize, &size1dLarge, SVGA3D_BUFFER, &maxBox1d, 1);
/*
* Test with randomly subdivided copyboxes.
*/
#define TEST_FORMAT_2D(f, b) \
{ \
Display_BeginPass("Subdivided copy via 2D " #f " surface."); \
createBoxes(&size2d, boxes, MAX_COPY_BOXES); \
runTestPass(regionSize*b, &size2d, SVGA3D_ ## f, boxes, MAX_COPY_BOXES); \
}
#define TEST_FORMAT(f, b) \
{ \
TEST_FORMAT_2D(f, b) \
\
Display_BeginPass("Subdivided copy via 3D " #f " surface."); \
createBoxes(&size3d, boxes, MAX_COPY_BOXES); \
runTestPass(regionSize*b, &size3d, SVGA3D_ ## f, boxes, MAX_COPY_BOXES); \
}
TEST_FORMAT(BUFFER, 1)
TEST_FORMAT(ALPHA8, 1)
TEST_FORMAT(A8R8G8B8, 4)
TEST_FORMAT_2D(DXT2, 1) // Test compressed texture rectangle clipping
#undef TEST_FORMAT
#undef TEST_FORMAT_2D
/*
* Test another large 1D copy, split into slightly misaligned chunks.
*/
Display_BeginPass("Misaligned copies via 1D BUFFER surface. (Large region)");
createMisaligned1dBoxes(largeRegionSize, boxes, MAX_COPY_BOXES);
runTestPass(largeRegionSize, &size1dLarge, SVGA3D_BUFFER, boxes, MAX_COPY_BOXES);
}
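/*
 * Illustration only: the TEST_FORMAT macros above lean on two
 * preprocessor operators. '#f' turns the macro argument into a string
 * literal for the status message, and 'SVGA3D_ ## f' pastes it onto a
 * prefix to form an enum name. The sketch below shows both in
 * isolation. The SKETCH_FMT_* values and Sketch_* names are
 * hypothetical, not SVGA3D definitions.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { SKETCH_FMT_BUFFER = 1, SKETCH_FMT_ALPHA8 = 2 };  /* Hypothetical format IDs */

typedef struct {
   const char *label;
   int format;
   unsigned bytesPerPixel;
} SketchPass;

/* #f stringifies the argument; ## pastes it onto the SKETCH_FMT_ prefix. */
#define SKETCH_PASS(f, b) \
   { "Single copy via 2D " #f " surface.", SKETCH_FMT_ ## f, (b) }

static const SketchPass sketchPasses[] = {
   SKETCH_PASS(BUFFER, 1),
   SKETCH_PASS(ALPHA8, 1),
};

static const SketchPass *
Sketch_GetPass(size_t idx)
{
   assert(idx < sizeof sketchPasses / sizeof sketchPasses[0]);
   return &sketchPasses[idx];
}
```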
/*
* main --
*
* Entry point and main loop for the example.
*/
int
main(void)
{
SVGA3DUtil_InitFullscreen(CID, 640, 480);
SVGA3DText_Init();
GMR_Init();
Heap_Reset();
tempSurfaceId = SVGA3DUtil_AllocSurfaceID();
testRegionSize = gGMR.maxDescriptorLen * PAGE_SIZE;
while (1) {
runTests();
randSeed = genrand_int32();
testIters++;
}
return 0;
}


@ -0,0 +1,16 @@
TARGET = half-float-test.img
APP_SOURCES = main.c
LIB_DIR = ../../lib
include $(LIB_DIR)/Makefile.rules
.PHONY: shaders
shaders: cube_vs.h cube_ps.h
cube_vs.h: cube.fx
	wine fxc.exe /T vs_2_0 /E MyVertexShader /Fh cube_vs.h cube.fx
cube_ps.h: cube.fx
	wine fxc.exe /T ps_2_0 /E MyPixelShader /Fh cube_ps.h cube.fx


@ -0,0 +1,36 @@
float4x4 matView, matProj;
struct VS_Input
{
float4 Pos : POSITION;
float4 Color : COLOR0;
};
struct VS_Output
{
float4 Pos : POSITION;
float4 Color : COLOR0;
};
VS_Output
MyVertexShader(VS_Input Input)
{
VS_Output Output;
Output.Pos = mul(mul(Input.Pos, matView), matProj);
Output.Color = Input.Color;
return Output;
}
struct PS_Input
{
float4 Color : COLOR0;
};
float4
MyPixelShader(PS_Input Input) : COLOR
{
return Input.Color;
}


@ -0,0 +1,21 @@
#if 0
//
// Generated by Microsoft (R) D3DX9 Shader Compiler
//
// fxc /T ps_2_0 /E MyPixelShader /Fh cube_ps.h cube.fx
//
ps_2_0
dcl v0
mov oC0, v0
// approximately 1 instruction slot used
#endif
const DWORD g_ps20_MyPixelShader[] =
{
0xffff0200, 0x0013fffe, 0x42415443, 0x0000001c, 0x00000023, 0xffff0200,
0x00000000, 0x00000000, 0x20000100, 0x0000001c, 0x325f7370, 0x4d00305f,
0x6f726369, 0x74666f73, 0x29522820, 0x44334420, 0x53203958, 0x65646168,
0x6f432072, 0x6c69706d, 0x00207265, 0x0200001f, 0x80000000, 0x900f0000,
0x02000001, 0x800f0800, 0x90e40000, 0x0000ffff
};
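/*
 * Illustration only: how the first tokens of a Direct3D 9 shader token
 * stream like the one above can be read. The first DWORD is a version
 * token (0xFFFF in the high half for pixel shaders, 0xFFFE for vertex
 * shaders, then major and minor version bytes). Comment tokens carry
 * 0xFFFE in the low half and a DWORD count in bits 16-30. The Sketch_*
 * names are invented for this example and are not part of the SVGA3D
 * library.
 */

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
   int isPixelShader;
   unsigned major;
   unsigned minor;
} SketchShaderVersion;

static SketchShaderVersion
Sketch_DecodeVersion(uint32_t token)
{
   SketchShaderVersion v;
   uint32_t type = token >> 16;

   assert(type == 0xFFFFu || type == 0xFFFEu);
   v.isPixelShader = (type == 0xFFFFu);
   v.major = (token >> 8) & 0xFFu;
   v.minor = token & 0xFFu;
   return v;
}

/* Starting at 'idx', skip any comment tokens and return the next index. */
static uint32_t
Sketch_SkipComments(const uint32_t *tokens, uint32_t numTokens, uint32_t idx)
{
   while (idx < numTokens && (tokens[idx] & 0xFFFFu) == 0xFFFEu) {
      uint32_t lengthInDwords = (tokens[idx] >> 16) & 0x7FFFu;
      idx += 1 + lengthInDwords;
   }
   return idx;
}
```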


@ -0,0 +1,54 @@
#if 0
//
// Generated by Microsoft (R) D3DX9 Shader Compiler
//
// fxc /T vs_2_0 /E MyVertexShader /Fh cube_vs.h cube.fx
//
//
// Parameters:
//
// float4x4 matProj;
// float4x4 matView;
//
//
// Registers:
//
// Name Reg Size
// ------------ ----- ----
// matView c0 4
// matProj c4 4
//
vs_2_0
dcl_position v0
dcl_color v1
dp4 r0.x, v0, c0
dp4 r0.y, v0, c1
dp4 r0.z, v0, c2
dp4 r0.w, v0, c3
dp4 oPos.x, r0, c4
dp4 oPos.y, r0, c5
dp4 oPos.z, r0, c6
dp4 oPos.w, r0, c7
mov oD0, v1
// approximately 9 instruction slots used
#endif
const DWORD g_vs20_MyVertexShader[] =
{
0xfffe0200, 0x0025fffe, 0x42415443, 0x0000001c, 0x0000006b, 0xfffe0200,
0x00000002, 0x0000001c, 0x20000100, 0x00000064, 0x00000044, 0x00040002,
0x00000004, 0x0000004c, 0x00000000, 0x0000005c, 0x00000002, 0x00000004,
0x0000004c, 0x00000000, 0x5074616d, 0x006a6f72, 0x00030003, 0x00040004,
0x00000001, 0x00000000, 0x5674616d, 0x00776569, 0x325f7376, 0x4d00305f,
0x6f726369, 0x74666f73, 0x29522820, 0x44334420, 0x53203958, 0x65646168,
0x6f432072, 0x6c69706d, 0x00207265, 0x0200001f, 0x80000000, 0x900f0000,
0x0200001f, 0x8000000a, 0x900f0001, 0x03000009, 0x80010000, 0x90e40000,
0xa0e40000, 0x03000009, 0x80020000, 0x90e40000, 0xa0e40001, 0x03000009,
0x80040000, 0x90e40000, 0xa0e40002, 0x03000009, 0x80080000, 0x90e40000,
0xa0e40003, 0x03000009, 0xc0010000, 0x80e40000, 0xa0e40004, 0x03000009,
0xc0020000, 0x80e40000, 0xa0e40005, 0x03000009, 0xc0040000, 0x80e40000,
0xa0e40006, 0x03000009, 0xc0080000, 0x80e40000, 0xa0e40007, 0x02000001,
0xd00f0000, 0x90e40001, 0x0000ffff
};


@ -0,0 +1,227 @@
/*
* Test support for half-precision (16-bit) floating point.
*
* This test draws four cubes, to test fixed-function and programmable
* pipelines, and to test 16-bit and 32-bit float vertices.
*
* Copyright (C) 2008-2009 VMware, Inc. Licensed under the MIT
* License, please see the README.txt. All rights reserved.
*/
#include "svga3dutil.h"
#include "svga3dtext.h"
#include "matrix.h"
#include "math.h"
typedef uint32 DWORD;
#include "cube_vs.h"
#include "cube_ps.h"
/* 16-bit floating point constants */
#define HALF_0 0x0000
#define HALF_POS_1 0x3c00
#define HALF_NEG_1 0xbc00
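/*
 * Illustration only: where constants like 0x3c00 and 0xbc00 come from.
 * A half-precision float has 1 sign bit, 5 exponent bits (bias 15) and
 * 10 mantissa bits. The sketch below converts a 32-bit float by
 * re-biasing the exponent and truncating the mantissa. It is a
 * simplified conversion: it flushes small values to zero, does not
 * round, and maps NaN to infinity. Sketch_FloatToHalf is a name
 * invented for this example.
 */

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static uint16_t
Sketch_FloatToHalf(float f)
{
   uint32_t bits;
   uint32_t sign, mant;
   int32_t exp;

   assert(sizeof f == sizeof bits);
   memcpy(&bits, &f, sizeof bits);

   sign = (bits >> 16) & 0x8000u;
   exp = (int32_t)((bits >> 23) & 0xFFu) - 127 + 15;   /* Re-bias 127 -> 15 */
   mant = bits & 0x7FFFFFu;

   if (exp <= 0) {
      return (uint16_t)sign;               /* Zero, denormal, or underflow */
   }
   if (exp >= 31) {
      return (uint16_t)(sign | 0x7C00u);   /* Overflow or infinity */
   }
   return (uint16_t)(sign | ((uint32_t)exp << 10) | (mant >> 13));
}
```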
#define MY_VSHADER_ID 0
#define MY_PSHADER_ID 0
#define CONST_MAT_VIEW 0
#define CONST_MAT_PROJ 4
typedef struct {
float position32[3];
uint16 position16[4];
uint32 color;
} MyVertex;
static const MyVertex vertexData[] = {
{ {-1, -1, -1}, {HALF_NEG_1, HALF_NEG_1, HALF_NEG_1, HALF_POS_1}, 0xFFFFFF },
{ {-1, -1, 1}, {HALF_NEG_1, HALF_NEG_1, HALF_POS_1, HALF_POS_1}, 0xFFFF00 },
{ {-1, 1, -1}, {HALF_NEG_1, HALF_POS_1, HALF_NEG_1, HALF_POS_1}, 0xFF00FF },
{ {-1, 1, 1}, {HALF_NEG_1, HALF_POS_1, HALF_POS_1, HALF_POS_1}, 0xFF0000 },
{ { 1, -1, -1}, {HALF_POS_1, HALF_NEG_1, HALF_NEG_1, HALF_POS_1}, 0x00FFFF },
{ { 1, -1, 1}, {HALF_POS_1, HALF_NEG_1, HALF_POS_1, HALF_POS_1}, 0x00FF00 },
{ { 1, 1, -1}, {HALF_POS_1, HALF_POS_1, HALF_NEG_1, HALF_POS_1}, 0x0000FF },
{ { 1, 1, 1}, {HALF_POS_1, HALF_POS_1, HALF_POS_1, HALF_POS_1}, 0x000000 },
};
#define QUAD(a,b,c,d) a, b, d, d, c, a
static const uint16 indexData[] = {
QUAD(0,1,2,3), // -X
QUAD(4,5,6,7), // +X
QUAD(0,1,4,5), // -Y
QUAD(2,3,6,7), // +Y
QUAD(0,2,4,6), // -Z
QUAD(1,3,5,7), // +Z
};
#undef QUAD
const uint32 numTriangles = sizeof indexData / sizeof indexData[0] / 3;
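/*
 * Illustration only: what the QUAD macro expands to. Each quad becomes
 * six indices, forming the triangles (a, b, d) and (d, c, a), so the
 * triangle count is the index count divided by three. The SKETCH_* and
 * Sketch_* names are invented for this example.
 */

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_QUAD(a,b,c,d) a, b, d, d, c, a

static const uint16_t sketchIndices[] = {
   SKETCH_QUAD(0,1,2,3),
   SKETCH_QUAD(4,5,6,7),
};

#define SKETCH_NUM_TRIANGLES \
   (sizeof sketchIndices / sizeof sketchIndices[0] / 3)

/* Copy the three vertex indices of triangle 'tri' into 'out'. */
static void
Sketch_GetTriangle(uint32_t tri, uint16_t out[3])
{
   assert(tri < SKETCH_NUM_TRIANGLES);
   out[0] = sketchIndices[tri * 3 + 0];
   out[1] = sketchIndices[tri * 3 + 1];
   out[2] = sketchIndices[tri * 3 + 2];
}
```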
uint32 vertexSid, indexSid;
Matrix perspectiveMat;
FPSCounterState gFPS;
/*
* renderCube --
*
* Render one cube at the supplied X/Y coordinate, using either
* shaders or fixed-function, and using either 16-bit or 32-bit
* vertex data.
*/
void
renderCube(float x,
float y,
Bool useShaders,
Bool useHalf)
{
SVGA3dTextureState *ts;
SVGA3dRenderState *rs;
SVGA3dVertexDecl *decls;
SVGA3dPrimitiveRange *ranges;
static Matrix view;
Matrix_Copy(view, gIdentityMatrix);
Matrix_RotateX(view, 30.0 * M_PI / 180.0);
Matrix_RotateY(view, gFPS.frame * 0.01f);
Matrix_Translate(view, x, y, 15);
if (useShaders) {
SVGA3D_SetShader(CID, SVGA3D_SHADERTYPE_VS, MY_VSHADER_ID);
SVGA3D_SetShader(CID, SVGA3D_SHADERTYPE_PS, MY_PSHADER_ID);
SVGA3DUtil_SetShaderConstMatrix(CID, CONST_MAT_PROJ,
SVGA3D_SHADERTYPE_VS, perspectiveMat);
SVGA3DUtil_SetShaderConstMatrix(CID, CONST_MAT_VIEW,
SVGA3D_SHADERTYPE_VS, view);
} else {
SVGA3D_SetShader(CID, SVGA3D_SHADERTYPE_VS, SVGA3D_INVALID_ID);
SVGA3D_SetShader(CID, SVGA3D_SHADERTYPE_PS, SVGA3D_INVALID_ID);
SVGA3D_SetTransform(CID, SVGA3D_TRANSFORM_VIEW, view);
SVGA3D_SetTransform(CID, SVGA3D_TRANSFORM_WORLD, gIdentityMatrix);
SVGA3D_SetTransform(CID, SVGA3D_TRANSFORM_PROJECTION, perspectiveMat);
}
SVGA3D_BeginSetRenderState(CID, &rs, 4);
{
rs[0].state = SVGA3D_RS_BLENDENABLE;
rs[0].uintValue = FALSE;
rs[1].state = SVGA3D_RS_ZENABLE;
rs[1].uintValue = TRUE;
rs[2].state = SVGA3D_RS_ZWRITEENABLE;
rs[2].uintValue = TRUE;
rs[3].state = SVGA3D_RS_ZFUNC;
rs[3].uintValue = SVGA3D_CMP_LESS;
}
SVGA_FIFOCommitAll();
SVGA3D_BeginSetTextureState(CID, &ts, 4);
{
ts[0].stage = 0;
ts[0].name = SVGA3D_TS_BIND_TEXTURE;
ts[0].value = SVGA3D_INVALID_ID;
ts[1].stage = 0;
ts[1].name = SVGA3D_TS_COLOROP;
ts[1].value = SVGA3D_TC_SELECTARG1;
ts[2].stage = 0;
ts[2].name = SVGA3D_TS_COLORARG1;
ts[2].value = SVGA3D_TA_DIFFUSE;
ts[3].stage = 0;
ts[3].name = SVGA3D_TS_ALPHAARG1;
ts[3].value = SVGA3D_TA_DIFFUSE;
}
SVGA_FIFOCommitAll();
SVGA3D_BeginDrawPrimitives(CID, &decls, 2, &ranges, 1);
{
decls[0].identity.usage = SVGA3D_DECLUSAGE_POSITION;
decls[0].array.surfaceId = vertexSid;
decls[0].array.stride = sizeof(MyVertex);
if (useHalf) {
decls[0].identity.type = SVGA3D_DECLTYPE_FLOAT16_4;
decls[0].array.offset = offsetof(MyVertex, position16);
} else {
decls[0].identity.type = SVGA3D_DECLTYPE_FLOAT3;
decls[0].array.offset = offsetof(MyVertex, position32);
}
decls[1].identity.type = SVGA3D_DECLTYPE_D3DCOLOR;
decls[1].identity.usage = SVGA3D_DECLUSAGE_COLOR;
decls[1].array.surfaceId = vertexSid;
decls[1].array.stride = sizeof(MyVertex);
decls[1].array.offset = offsetof(MyVertex, color);
ranges[0].primType = SVGA3D_PRIMITIVE_TRIANGLELIST;
ranges[0].primitiveCount = numTriangles;
ranges[0].indexArray.surfaceId = indexSid;
ranges[0].indexArray.stride = sizeof(uint16);
ranges[0].indexWidth = sizeof(uint16);
}
SVGA_FIFOCommitAll();
SVGA3D_SetShader(CID, SVGA3D_SHADERTYPE_VS, SVGA3D_INVALID_ID);
SVGA3D_SetShader(CID, SVGA3D_SHADERTYPE_PS, SVGA3D_INVALID_ID);
}
/*
* main --
*
* Our example's entry point, invoked directly by the bootloader.
*/
int
main(void)
{
SVGA3DUtil_InitFullscreen(CID, 800, 600);
SVGA3DText_Init();
vertexSid = SVGA3DUtil_DefineStaticBuffer(vertexData, sizeof vertexData);
indexSid = SVGA3DUtil_DefineStaticBuffer(indexData, sizeof indexData);
SVGA3D_DefineShader(CID, MY_VSHADER_ID, SVGA3D_SHADERTYPE_VS,
g_vs20_MyVertexShader, sizeof g_vs20_MyVertexShader);
SVGA3D_DefineShader(CID, MY_PSHADER_ID, SVGA3D_SHADERTYPE_PS,
g_ps20_MyPixelShader, sizeof g_ps20_MyPixelShader);
Matrix_Perspective(perspectiveMat, 45.0f,
gSVGA.width / (float)gSVGA.height, 10.0f, 100.0f);
while (1) {
if (SVGA3DUtil_UpdateFPSCounter(&gFPS)) {
Console_Clear();
Console_Format("Half-precision floating point test.\n"
"You should see four identical cubes.\n"
"\n"
"Top row: Fixed function, Bottom row: Shaders.\n"
"Left column: 32-bit float, Right column: 16-bit float.\n"
"\n%s",
gFPS.text);
SVGA3DText_Update();
VMBackdoor_VGAScreenshot();
}
SVGA3DUtil_ClearFullscreen(CID, SVGA3D_CLEAR_COLOR | SVGA3D_CLEAR_DEPTH,
0x113366, 1.0f, 0);
renderCube(-2, 2, FALSE, FALSE); /* Top-left */
renderCube(2, 2, FALSE, TRUE); /* Top-right */
renderCube(-2, -2, TRUE, FALSE); /* Bottom-left */
renderCube(2, -2, TRUE, TRUE); /* Bottom-right */
SVGA3DText_Draw();
SVGA3DUtil_PresentFullscreen();
}
return 0;
}

8
examples/pong/Makefile Normal file

@ -0,0 +1,8 @@
TARGET = pong.img
APP_SOURCES = main.c
DEFS = -DREALLY_TINY
LIB_DIR = ../../lib
include $(LIB_DIR)/Makefile.rules

710
examples/pong/main.c Normal file

@ -0,0 +1,710 @@
/*
* PongOS v2.0
*
* Micah Dowty <micah@vmware.com>
*
* Copyright (C) 2008-2009 VMware, Inc. Licensed under the MIT
* License, please see the README.txt. All rights reserved.
*/
#include "svga.h"
#include "intr.h"
#include "io.h"
#include "timer.h"
#include "keyboard.h"
#include "vmbackdoor.h"
#define PONG_DOT_SIZE 8
#define PONG_DIGIT_PIXEL_SIZE 10
#define PONG_BG_COLOR 0x000000
#define PONG_SPRITE_COLOR 0xFFFFFF
#define PONG_PLAYFIELD_COLOR 0xAAAAAA
#define PONG_FRAME_RATE 60
#define MAX_DIRTY_RECTS 128
#define MAX_SPRITES 8
typedef struct {
float x, y;
} Vector2;
typedef struct {
int x, y, w, h;
} Rect;
typedef struct {
Rect r;
uint32 color;
} FillRect;
static struct {
uint32 *buffer;
Rect dirtyRects[MAX_DIRTY_RECTS];
uint32 numDirtyRects;
} back;
static struct {
FillRect paddles[2];
FillRect ball;
uint8 scores[2];
float ballSpeed;
float paddleVelocities[2];
float paddlePos[2];
Vector2 ballVelocity;
Vector2 ballPos;
Bool playfieldDirty;
} pong;
/*
*-----------------------------------------------------------------------------
*
* Random32 --
*
* "Random" number generator. To save code space, we actually just use
* the low bits of the TSC. This of course isn't actually random, but
* it's good enough for Pong.
*
*-----------------------------------------------------------------------------
*/
static uint32
Random32(void)
{
uint64 t;
__asm__ __volatile__("rdtsc" : "=A" (t));
return (uint32)t;
}
/*
*-----------------------------------------------------------------------------
*
* RectTestIntersection --
*
* Returns TRUE iff two Rects intersect with each other.
*
*-----------------------------------------------------------------------------
*/
static Bool
RectTestIntersection(Rect *a, // IN
Rect *b) // IN
{
return !(a->x + a->w < b->x ||
a->x > b->x + b->w ||
a->y + a->h < b->y ||
a->y > b->y + b->h);
}
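/*
 * Illustration only: the same overlap test on a standalone rectangle
 * type, to show its edge behavior. Because the comparisons are strict,
 * rectangles that merely touch along an edge count as intersecting.
 * SketchRect and Sketch_RectsIntersect are names invented for this
 * example.
 */

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
   int x, y, w, h;
} SketchRect;

static int
Sketch_RectsIntersect(const SketchRect *a, const SketchRect *b)
{
   assert(a != NULL && b != NULL);

   /* No overlap if one rect lies entirely to one side of the other. */
   return !(a->x + a->w < b->x ||
            a->x > b->x + b->w ||
            a->y + a->h < b->y ||
            a->y > b->y + b->h);
}
```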
/*
*-----------------------------------------------------------------------------
*
* BackFill --
*
* Perform a color fill on the backbuffer.
*
*-----------------------------------------------------------------------------
*/
static void
BackFill(FillRect fr) // IN
{
int i, j;
for (i = 0; i < fr.r.h; i++) {
uint32 *line = &back.buffer[(fr.r.y + i) * gSVGA.width + fr.r.x];
for (j = 0; j < fr.r.w; j++) {
line[j] = fr.color;
}
}
}
/*
*-----------------------------------------------------------------------------
*
* BackMarkDirty --
*
* Mark a region of the backbuffer as dirty. We'll copy it to the
* front buffer and ask the host to update it on the next
* BackUpdate().
*
*-----------------------------------------------------------------------------
*/
static void
BackMarkDirty(Rect rect) // IN
{
back.dirtyRects[back.numDirtyRects++] = rect;
}
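/*
 * Illustration only: BackMarkDirty() above appends without checking
 * MAX_DIRTY_RECTS, which is fine for Pong because only a handful of
 * rectangles are queued per frame. One possible way to make such a
 * list safe for heavier use is to collapse it to a single full-screen
 * rectangle when it fills up. This is a sketch of that alternative,
 * not the example's behavior; every name here is hypothetical.
 */

```c
#include <assert.h>

#define SKETCH_MAX_DIRTY_RECTS 4

typedef struct {
   int x, y, w, h;
} SketchDirtyRect;

typedef struct {
   SketchDirtyRect rects[SKETCH_MAX_DIRTY_RECTS];
   unsigned count;
   SketchDirtyRect fullScreen;   /* Used as the fallback when the list is full */
} SketchDirtyList;

static void
Sketch_MarkDirty(SketchDirtyList *list, SketchDirtyRect rect)
{
   assert(list->count <= SKETCH_MAX_DIRTY_RECTS);

   if (list->count == SKETCH_MAX_DIRTY_RECTS) {
      /* Out of slots: one full-screen rect covers everything queued so far. */
      list->rects[0] = list->fullScreen;
      list->count = 1;
      return;
   }
   list->rects[list->count++] = rect;
}
```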
/*
*-----------------------------------------------------------------------------
*
* BackUpdate --
*
* Copy all dirty regions of the backbuffer to the frontbuffer, and
* send updates to the SVGA device. Clears the dirtyRects list.
*
* For flow control, this also waits for the host to process the
* batch of updates we just queued into the FIFO.
*
*-----------------------------------------------------------------------------
*/
static void
BackUpdate(void)
{
int rectNum;
for (rectNum = 0; rectNum < back.numDirtyRects; rectNum++) {
Rect rect = back.dirtyRects[rectNum];
uint32 i, j;
for (i = 0; i < rect.h; i++) {
uint32 offset = (rect.y + i) * gSVGA.width + rect.x;
uint32 *src = &back.buffer[offset];
uint32 *dest = &((uint32*) gSVGA.fbMem)[offset];
for (j = 0; j < rect.w; j++) {
dest[j] = src[j];
}
}
SVGA_Update(rect.x, rect.y, rect.w, rect.h);
}
back.numDirtyRects = 0;
SVGA_SyncToFence(SVGA_InsertFence());
}
/*
*-----------------------------------------------------------------------------
*
* PongDrawString --
*
* Draw a string of digits, using our silly blocky font. The
* string's origin is the top-middle.
*
*-----------------------------------------------------------------------------
*/
static void
PongDrawString(uint32 x, // IN
uint32 y, // IN
const char *str, // IN
uint32 strLen) // IN
{
const int charW = 4;
const int charH = 5;
static const uint8 font[] = {
0xF1, // **** ...*
0x91, // *..* ...*
0x91, // *..* ...*
0x91, // *..* ...*
0xF1, // **** ...*
0xFF, // **** ****
0x11, // ...* ...*
0xFF, // **** ****
0x81, // *... ...*
0xFF, // **** ****
0x9F, // *..* ****
0x98, // *..* *...
0xFF, // **** ****
0x11, // ...* ...*
0x1F, // ...* ****
0xFF, // **** ****
0x81, // *... ...*
0xF1, // **** ...*
0x91, // *..* ...*
0xF1, // **** ...*
0xFF, // **** ****
0x99, // *..* *..*
0xFF, // **** ****
0x91, // *..* ...*
0xF1, // **** ...*
};
x -= (PONG_DIGIT_PIXEL_SIZE * (strLen * (charW + 1) - 1)) / 2;
while (*str) {
int digit = *str - '0';
if (digit >= 0 && digit <= 9) {
int i, j;
for (j = 0; j < charH; j++) {
for (i = 0; i < charW; i++) {
if ((font[digit / 2 * 5 + j] << i) & (digit & 1 ? 0x08 : 0x80)) {
FillRect pixel = {
{x + i * PONG_DIGIT_PIXEL_SIZE,
y + j * PONG_DIGIT_PIXEL_SIZE,
PONG_DIGIT_PIXEL_SIZE,
PONG_DIGIT_PIXEL_SIZE},
PONG_PLAYFIELD_COLOR,
};
BackFill(pixel);
}
}
}
}
x += PONG_DIGIT_PIXEL_SIZE * (charW + 1);
str++;
}
}
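/*
 * Illustration only: how the packed font above is laid out. Each byte
 * holds one row of two glyphs: the high nibble belongs to the even
 * digit, the low nibble to the odd digit, and each glyph is five rows
 * tall. The lookup below extracts a single pixel and is equivalent to
 * the shift-and-mask test in PongDrawString(). Sketch_FontPixel and
 * sketchFont are names invented for this example.
 */

```c
#include <assert.h>
#include <stdint.h>

static const uint8_t sketchFont[25] = {
   0xF1, 0x91, 0x91, 0x91, 0xF1,   /* Digits 0 and 1 */
   0xFF, 0x11, 0xFF, 0x81, 0xFF,   /* Digits 2 and 3 */
   0x9F, 0x98, 0xFF, 0x11, 0x1F,   /* Digits 4 and 5 */
   0xFF, 0x81, 0xF1, 0x91, 0xF1,   /* Digits 6 and 7 */
   0xFF, 0x99, 0xFF, 0x91, 0xF1,   /* Digits 8 and 9 */
};

/* Return 1 if the pixel at (col, row) of 'digit' is lit, else 0. */
static int
Sketch_FontPixel(int digit, int col, int row)
{
   uint8_t bits, nibble;

   assert(digit >= 0 && digit <= 9);
   assert(col >= 0 && col < 4);
   assert(row >= 0 && row < 5);

   bits = sketchFont[digit / 2 * 5 + row];
   nibble = (digit & 1) ? (uint8_t)(bits & 0x0F) : (uint8_t)(bits >> 4);

   /* Column 0 is the leftmost pixel, stored in the nibble's top bit. */
   return (nibble >> (3 - col)) & 1;
}
```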
/*
*-----------------------------------------------------------------------------
*
* DecDigit --
*
* Utility for extracting a decimal digit.
*
*-----------------------------------------------------------------------------
*/
static char
DecDigit(int i, int div, Bool blank)
{
if (blank && i < div) {
return ' ';
}
return (i / div) % 10 + '0';
}
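/*
 * Illustration only: using the DecDigit() approach to right-justify a
 * small number in a three-character field without sprintf(). Leading
 * positions are blanked when the value is smaller than that position's
 * divisor. The Sketch_* names are invented for this example.
 */

```c
#include <assert.h>
#include <string.h>

static char
Sketch_DecDigit(int i, int div, int blank)
{
   if (blank && i < div) {
      return ' ';
   }
   return (char)((i / div) % 10 + '0');
}

/* Write 'value' (0..999) into 'out' as three characters plus a NUL. */
static void
Sketch_FormatRightJustified(int value, char out[4])
{
   assert(value >= 0 && value <= 999);
   out[0] = Sketch_DecDigit(value, 100, 1);
   out[1] = Sketch_DecDigit(value, 10, 1);
   out[2] = Sketch_DecDigit(value, 1, 0);
   out[3] = '\0';
}
```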
/*
*-----------------------------------------------------------------------------
*
* PongDrawPlayfield --
*
* Redraw the playfield for Pong.
*
*-----------------------------------------------------------------------------
*/
static void
PongDrawPlayfield()
{
int i;
/*
* Clear the screen
*/
FillRect background = {
{0, 0, gSVGA.width, gSVGA.height},
PONG_BG_COLOR,
};
BackFill(background);
/*
* Draw the dotted dividing line
*/
for (i = PONG_DOT_SIZE;
i <= gSVGA.height - PONG_DOT_SIZE * 2;
i += PONG_DOT_SIZE * 2) {
FillRect dot = {
{(gSVGA.width - PONG_DOT_SIZE) / 2, i,
PONG_DOT_SIZE, PONG_DOT_SIZE},
PONG_PLAYFIELD_COLOR,
};
BackFill(dot);
}
/*
* Draw the score counters.
*
* sprintf() is big, so we'll format this the old-fashioned way.
* Right-justify the left score, and left-justify the right score.
*/
{
char scoreStr[7] = "      ";
char *p = scoreStr;
*(p++) = DecDigit(pong.scores[0], 100, TRUE);
*(p++) = DecDigit(pong.scores[0], 10, TRUE);
*(p++) = DecDigit(pong.scores[0], 1, FALSE);
p++;
if (pong.scores[1] >= 100) {
*(p++) = DecDigit(pong.scores[1], 100, TRUE);
}
if (pong.scores[1] >= 10) {
*(p++) = DecDigit(pong.scores[1], 10, TRUE);
}
*(p++) = DecDigit(pong.scores[1], 1, FALSE);
PongDrawString(gSVGA.width/2, PONG_DIGIT_PIXEL_SIZE,
scoreStr, sizeof scoreStr);
}
}
/*
*-----------------------------------------------------------------------------
*
* PongDrawScreen --
*
* Top-level redraw function for Pong. This does a lot of unnecessary
* drawing to the backbuffer, but we're careful to only send update
* rectangles for a few things:
*
* - When the playfield changes, we update the entire screen.
* - Each sprite (the paddles and ball) gets two rectangles:
* - One for its new position
* - One for its old position
*
* None of these rectangles are ever merged.
*
*-----------------------------------------------------------------------------
*/
static void
PongDrawScreen()
{
PongDrawPlayfield();
if (pong.playfieldDirty) {
Rect r = {0, 0, gSVGA.width, gSVGA.height};
BackMarkDirty(r);
pong.playfieldDirty = FALSE;
}
/* Draw all sprites at the current positions */
BackFill(pong.paddles[0]);
BackMarkDirty(pong.paddles[0].r);
BackFill(pong.paddles[1]);
BackMarkDirty(pong.paddles[1].r);
BackFill(pong.ball);
BackMarkDirty(pong.ball.r);
/* Commit this to the front buffer and the host's screen */
BackUpdate();
/* Make sure we erase all sprites at the current positions on the next frame */
BackMarkDirty(pong.paddles[0].r);
BackMarkDirty(pong.paddles[1].r);
BackMarkDirty(pong.ball.r);
}
/*
*-----------------------------------------------------------------------------
*
* PongLaunchBall --
*
* Reset the ball position, and give it a random angle.
*
*-----------------------------------------------------------------------------
*/
static void
PongLaunchBall()
{
/* sin() from 0 to PI/2 */
static const float sineTable[64] = {
0.000000, 0.024931, 0.049846, 0.074730, 0.099568, 0.124344, 0.149042, 0.173648,
0.198146, 0.222521, 0.246757, 0.270840, 0.294755, 0.318487, 0.342020, 0.365341,
0.388435, 0.411287, 0.433884, 0.456211, 0.478254, 0.500000, 0.521435, 0.542546,
0.563320, 0.583744, 0.603804, 0.623490, 0.642788, 0.661686, 0.680173, 0.698237,
0.715867, 0.733052, 0.749781, 0.766044, 0.781831, 0.797133, 0.811938, 0.826239,
0.840026, 0.853291, 0.866025, 0.878222, 0.889872, 0.900969, 0.911506, 0.921476,
0.930874, 0.939693, 0.947927, 0.955573, 0.962624, 0.969077, 0.974928, 0.980172,
0.984808, 0.988831, 0.992239, 0.995031, 0.997204, 0.998757, 0.999689, 1.000000,
};
int t;
float sinT, cosT;
pong.ballPos.x = gSVGA.width / 2;
pong.ballPos.y = gSVGA.height / 2;
/* Limit the random angle to avoid those within 45 degrees of vertical */
t = 32 + (Random32() & 31);
sinT = sineTable[t];
cosT = -sineTable[(t + 32) & 63];
sinT *= pong.ballSpeed;
cosT *= pong.ballSpeed;
switch (Random32() & 3) {
case 0:
pong.ballVelocity.x = sinT;
pong.ballVelocity.y = cosT;
break;
case 1:
pong.ballVelocity.x = -sinT;
pong.ballVelocity.y = cosT;
break;
case 2:
pong.ballVelocity.x = -sinT;
pong.ballVelocity.y = -cosT;
break;
case 3:
pong.ballVelocity.x = sinT;
pong.ballVelocity.y = -cosT;
break;
}
}
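/*
 * Illustration only: one common way to get a cosine out of a
 * quarter-wave sine table is the complementary index, since
 * cos(theta) = sin(PI/2 - theta). With 64 entries spanning 0 to PI/2,
 * that is table[63 - t]. This sketch differs from the (t + 32) & 63
 * indexing in PongLaunchBall(), which does not yield a unit-length
 * vector from a quarter-wave table. The Sketch_* names are invented
 * for this example.
 */

```c
#include <assert.h>

/* sin() from 0 to PI/2, same values as sineTable above. */
static const float sketchSineTable[64] = {
   0.000000, 0.024931, 0.049846, 0.074730, 0.099568, 0.124344, 0.149042, 0.173648,
   0.198146, 0.222521, 0.246757, 0.270840, 0.294755, 0.318487, 0.342020, 0.365341,
   0.388435, 0.411287, 0.433884, 0.456211, 0.478254, 0.500000, 0.521435, 0.542546,
   0.563320, 0.583744, 0.603804, 0.623490, 0.642788, 0.661686, 0.680173, 0.698237,
   0.715867, 0.733052, 0.749781, 0.766044, 0.781831, 0.797133, 0.811938, 0.826239,
   0.840026, 0.853291, 0.866025, 0.878222, 0.889872, 0.900969, 0.911506, 0.921476,
   0.930874, 0.939693, 0.947927, 0.955573, 0.962624, 0.969077, 0.974928, 0.980172,
   0.984808, 0.988831, 0.992239, 0.995031, 0.997204, 0.998757, 0.999689, 1.000000,
};

/* Look up sin and cos for table step t (0..63, covering 0 to PI/2). */
static void
Sketch_LaunchDirection(unsigned t, float *sinT, float *cosT)
{
   assert(t < 64);
   *sinT = sketchSineTable[t];
   *cosT = sketchSineTable[63 - t];   /* Complementary angle */
}
```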
/*
*-----------------------------------------------------------------------------
*
* PongInit --
*
* Initialize all game variables, including sprite location/size/color.
* Requires that SVGA has already been initialized.
*
*-----------------------------------------------------------------------------
*/
static void
PongInit()
{
pong.scores[0] = 0;
pong.scores[1] = 0;
pong.playfieldDirty = TRUE;
pong.paddlePos[0] = pong.paddlePos[1] = gSVGA.height / 2;
pong.paddles[0].r.x = 10;
pong.paddles[0].r.w = 16;
pong.paddles[0].r.h = 64;
pong.paddles[0].color = PONG_SPRITE_COLOR;
pong.paddles[1].r.x = gSVGA.width - 16 - 10;
pong.paddles[1].r.w = 16;
pong.paddles[1].r.h = 64;
pong.paddles[1].color = PONG_SPRITE_COLOR;
pong.ball.r.w = 16;
pong.ball.r.h = 16;
pong.ball.color = PONG_SPRITE_COLOR;
pong.ballSpeed = 400;
PongLaunchBall();
}
/*
*-----------------------------------------------------------------------------
*
* PongUpdateMotion --
*
* Perform motion updates for the ball and paddles. This includes
* bounce/goal detection.
*
*-----------------------------------------------------------------------------
*/
static void
PongUpdateMotion(float dt) // IN
{
int playableWidth = gSVGA.width - pong.ball.r.w;
int playableHeight = gSVGA.height - pong.ball.r.h;
int i;
pong.ballPos.x += pong.ballVelocity.x * dt;
pong.ballPos.y += pong.ballVelocity.y * dt;
for (i = 0; i < 2; i++) {
int pos = pong.paddlePos[i] + pong.paddleVelocities[i] * dt;
pong.paddlePos[i] = MIN(gSVGA.height - pong.paddles[i].r.h, MAX(0, pos));
pong.paddles[i].r.y = (int)pong.paddlePos[i];
}
if (pong.ballPos.x >= playableWidth) {
/* Goal off the right edge */
pong.scores[0]++;
pong.playfieldDirty = TRUE;
PongLaunchBall();
}
if (pong.ballPos.x <= 0) {
/* Goal off the left edge */
pong.scores[1]++;
pong.playfieldDirty = TRUE;
PongLaunchBall();
}
if (pong.ballPos.y >= playableHeight) {
/* Bounce off the bottom edge */
pong.ballVelocity.y = -pong.ballVelocity.y;
pong.ballPos.y = playableHeight - (pong.ballPos.y - playableHeight);
}
if (pong.ballPos.y <= 0) {
/* Bounce off the top edge */
pong.ballVelocity.y = -pong.ballVelocity.y;
pong.ballPos.y = -pong.ballPos.y;
}
pong.ballPos.y = MIN(playableHeight, pong.ballPos.y);
pong.ballPos.y = MAX(0, pong.ballPos.y);
pong.ball.r.x = (int)pong.ballPos.x;
pong.ball.r.y = (int)pong.ballPos.y;
/*
* Lame collision detection between ball and paddles. Really we
* should be testing the ball's entire path over this time step,
* not just the ball's new position. Using the current
* implementation, it's possible for the ball to move through a
* paddle if it's going fast enough or our frame rate is slow
* enough.
*/
for (i = 0; i < 2; i++) {
/*
* Only bounce off the paddle when we're moving toward it, to
* prevent the ball from getting stuck inside the paddle
*/
if ((pong.paddles[i].r.x > gSVGA.width / 2) == (pong.ballVelocity.x > 0) &&
RectTestIntersection(&pong.ball.r, &pong.paddles[i].r)) {
/*
* Boing! The ball bounces back, plus it gets a little spin
* if the paddle itself was moving at the time.
*/
pong.ballVelocity.x = -pong.ballVelocity.x;
pong.ballVelocity.y += pong.paddleVelocities[i];
pong.ballVelocity.y = MIN(pong.ballVelocity.y, pong.ballSpeed * 2);
pong.ballVelocity.y = MAX(pong.ballVelocity.y, -pong.ballSpeed * 2);
}
}
}
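/*
 * Illustration only: the wall bounce from PongUpdateMotion() on a
 * single axis. When the position overshoots a wall, the velocity is
 * negated and the overshoot is mirrored back into the playable range,
 * then the result is clamped. Sketch_Reflect is a name invented for
 * this example.
 */

```c
#include <assert.h>

/* Return the reflected position; '*vel' is negated on each bounce. */
static float
Sketch_Reflect(float pos, float *vel, float lo, float hi)
{
   assert(lo < hi);

   if (pos >= hi) {
      *vel = -*vel;
      pos = hi - (pos - hi);   /* Mirror the overshoot past the far wall */
   }
   if (pos <= lo) {
      *vel = -*vel;
      pos = lo + (lo - pos);   /* Mirror the overshoot past the near wall */
   }

   /* Clamp, in case the overshoot was larger than the range itself. */
   if (pos > hi) {
      pos = hi;
   }
   if (pos < lo) {
      pos = lo;
   }
   return pos;
}
```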
/*
*-----------------------------------------------------------------------------
*
* PongKeyboardPlayer --
*
* A human player, using the up and down arrows on a keyboard.
*
*-----------------------------------------------------------------------------
*/
static void
PongKeyboardPlayer(int playerNum, // IN
float maxSpeed, // IN
float accel) // IN
{
float v = pong.paddleVelocities[playerNum];
Bool up = Keyboard_IsKeyPressed(KEY_UP);
Bool down = Keyboard_IsKeyPressed(KEY_DOWN);
if (up && !down) {
v -= accel;
} else if (down && !up) {
v += accel;
} else {
v = 0;
}
v = MIN(maxSpeed, MAX(-maxSpeed, v));
pong.paddleVelocities[playerNum] = v;
}
/*
*-----------------------------------------------------------------------------
*
* PongAbsMousePlayer --
*
* A human player, controlled with the Y axis of the absolute mouse.
*
*-----------------------------------------------------------------------------
*/
static void
PongAbsMousePlayer(int playerNum) // IN
{
int currentY = pong.paddles[playerNum].r.y;
int newY = currentY;
VMMousePacket p;
Bool mouseMoved = FALSE;
while (VMBackdoor_MouseGetPacket(&p)) {
newY = (p.y * gSVGA.height / 0xFFFF) - pong.paddles[playerNum].r.h / 2;
newY = MAX(0, newY);
newY = MIN(gSVGA.height - pong.paddles[playerNum].r.h, newY);
mouseMoved = TRUE;
}
if (newY != currentY && mouseMoved) {
pong.paddleVelocities[playerNum] = (newY - currentY) * (float)PONG_FRAME_RATE;
}
}
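/*
 * Illustration only: the coordinate mapping used by
 * PongAbsMousePlayer(), isolated from the backdoor API. An absolute
 * mouse Y in the range 0..0xFFFF is scaled to screen rows, offset so
 * the paddle is centered on the pointer, and clamped to the screen.
 * Sketch_MouseToPaddleY is a name invented for this example.
 */

```c
#include <assert.h>
#include <stdint.h>

static int
Sketch_MouseToPaddleY(uint32_t mouseY, int screenHeight, int paddleHeight)
{
   int y;

   assert(mouseY <= 0xFFFFu);
   assert(paddleHeight > 0 && screenHeight >= paddleHeight);

   /* Scale to screen rows, then center the paddle on the pointer. */
   y = (int)(mouseY * (uint32_t)screenHeight / 0xFFFFu) - paddleHeight / 2;

   if (y < 0) {
      y = 0;
   }
   if (y > screenHeight - paddleHeight) {
      y = screenHeight - paddleHeight;
   }
   return y;
}
```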
/*
*-----------------------------------------------------------------------------
*
* PongComputerPlayer --
*
* Simple computer player. Always moves its paddle toward the ball.
*
*-----------------------------------------------------------------------------
*/
static void
PongComputerPlayer(int playerNum, // IN
float maxSpeed) // IN
{
int paddleCenter = pong.paddles[playerNum].r.y + pong.paddles[playerNum].r.h / 2;
int ballCenter = pong.ball.r.y + pong.ball.r.h / 2;
int distance = ballCenter - paddleCenter;
pong.paddleVelocities[playerNum] = distance / (float)gSVGA.height * maxSpeed;
}
/*
*-----------------------------------------------------------------------------
*
* main --
*
* Initialization and main loop.
*
*-----------------------------------------------------------------------------
*/
void
main(void)
{
Intr_Init();
SVGA_Init();
SVGA_SetMode(800, 600, 32);
back.buffer = (uint32*) (gSVGA.fbMem + gSVGA.width * gSVGA.height * sizeof(uint32));
Keyboard_Init();
VMBackdoor_MouseInit(TRUE);
PongInit();
Timer_InitPIT(PIT_HZ / PONG_FRAME_RATE);
Intr_SetMask(0, TRUE);
while (1) {
PongKeyboardPlayer(0, 1000, 50);
PongAbsMousePlayer(0);
PongComputerPlayer(1, 2000);
PongUpdateMotion(1.0 / PONG_FRAME_RATE);
PongDrawScreen();
Intr_Halt();
}
}


@ -0,0 +1,6 @@
TARGET = presentReadback.img
APP_SOURCES = main.c
LIB_DIR = ../../lib
include $(LIB_DIR)/Makefile.rules


@ -0,0 +1,233 @@
/*
* SVGA3D example: Present Readback. This example tests the
* presentReadback command, which synchronizes 3D and 2D rendering.
* It draws a spinning cube using 3D. After every frame, parts of the
* 2D framebuffer are updated with a present readback followed by a
* 2D update. Parts of the 3D region are cleared before the present
* readback command, to test that the last presented 3D data is what
* gets copied to the 2D framebuffer. The cube should spin with no
* flicker.
*
* Copyright (C) 2008-2009 VMware, Inc. Licensed under the MIT
* License, please see the README.txt. All rights reserved.
*/
#include "svga3dutil.h"
#include "svga3dtext.h"
#include "matrix.h"
#include "math.h"
typedef struct {
float position[3];
uint32 color;
} MyVertex;
static const MyVertex vertexData[] = {
{ {-1, -1, -1}, 0xFFFFFF },
{ {-1, -1, 1}, 0xFFFF00 },
{ {-1, 1, -1}, 0xFF00FF },
{ {-1, 1, 1}, 0xFF0000 },
{ { 1, -1, -1}, 0x00FFFF },
{ { 1, -1, 1}, 0x00FF00 },
{ { 1, 1, -1}, 0x0000FF },
{ { 1, 1, 1}, 0x000000 },
};
#define QUAD(a,b,c,d) a, b, d, d, c, a
static const uint16 indexData[] = {
QUAD(0,1,2,3), // -X
QUAD(4,5,6,7), // +X
QUAD(0,1,4,5), // -Y
QUAD(2,3,6,7), // +Y
QUAD(0,2,4,6), // -Z
QUAD(1,3,5,7), // +Z
};
#undef QUAD
const uint32 numTriangles = sizeof indexData / sizeof indexData[0] / 3;
uint32 vertexSid, indexSid;
Matrix perspectiveMat;
FPSCounterState gFPS;
VMMousePacket lastMouseState;
/*
* render --
*
* Set up render state, and draw our cube scene from static index
* and vertex buffers.
*
* This render state only needs to be set each frame because
* SVGA3DText_Draw() changes it.
*/
void
render(void)
{
SVGA3dTextureState *ts;
SVGA3dRenderState *rs;
SVGA3dVertexDecl *decls;
SVGA3dPrimitiveRange *ranges;
static Matrix view;
Matrix_Copy(view, gIdentityMatrix);
Matrix_Scale(view, 0.5, 0.5, 0.5, 1.0);
if (lastMouseState.buttons & VMMOUSE_LEFT_BUTTON) {
Matrix_RotateX(view, lastMouseState.y * 0.0001);
Matrix_RotateY(view, lastMouseState.x * -0.0001);
} else {
Matrix_RotateX(view, 30.0 * M_PI / 180.0);
Matrix_RotateY(view, gFPS.frame * 0.01f);
}
Matrix_Translate(view, 0, 0, 3);
SVGA3D_SetTransform(CID, SVGA3D_TRANSFORM_VIEW, view);
SVGA3D_SetTransform(CID, SVGA3D_TRANSFORM_WORLD, gIdentityMatrix);
SVGA3D_SetTransform(CID, SVGA3D_TRANSFORM_PROJECTION, perspectiveMat);
SVGA3D_BeginSetRenderState(CID, &rs, 4);
{
rs[0].state = SVGA3D_RS_BLENDENABLE;
rs[0].uintValue = FALSE;
rs[1].state = SVGA3D_RS_ZENABLE;
rs[1].uintValue = TRUE;
rs[2].state = SVGA3D_RS_ZWRITEENABLE;
rs[2].uintValue = TRUE;
rs[3].state = SVGA3D_RS_ZFUNC;
rs[3].uintValue = SVGA3D_CMP_LESS;
}
SVGA_FIFOCommitAll();
SVGA3D_BeginSetTextureState(CID, &ts, 4);
{
ts[0].stage = 0;
ts[0].name = SVGA3D_TS_BIND_TEXTURE;
ts[0].value = SVGA3D_INVALID_ID;
ts[1].stage = 0;
ts[1].name = SVGA3D_TS_COLOROP;
ts[1].value = SVGA3D_TC_SELECTARG1;
ts[2].stage = 0;
ts[2].name = SVGA3D_TS_COLORARG1;
ts[2].value = SVGA3D_TA_DIFFUSE;
ts[3].stage = 0;
ts[3].name = SVGA3D_TS_ALPHAARG1;
ts[3].value = SVGA3D_TA_DIFFUSE;
}
SVGA_FIFOCommitAll();
SVGA3D_BeginDrawPrimitives(CID, &decls, 2, &ranges, 1);
{
decls[0].identity.type = SVGA3D_DECLTYPE_FLOAT3;
decls[0].identity.usage = SVGA3D_DECLUSAGE_POSITION;
decls[0].array.surfaceId = vertexSid;
decls[0].array.stride = sizeof(MyVertex);
decls[0].array.offset = offsetof(MyVertex, position);
decls[1].identity.type = SVGA3D_DECLTYPE_D3DCOLOR;
decls[1].identity.usage = SVGA3D_DECLUSAGE_COLOR;
decls[1].array.surfaceId = vertexSid;
decls[1].array.stride = sizeof(MyVertex);
decls[1].array.offset = offsetof(MyVertex, color);
ranges[0].primType = SVGA3D_PRIMITIVE_TRIANGLELIST;
ranges[0].primitiveCount = numTriangles;
ranges[0].indexArray.surfaceId = indexSid;
ranges[0].indexArray.stride = sizeof(uint16);
ranges[0].indexWidth = sizeof(uint16);
}
SVGA_FIFOCommitAll();
}
/*
* main --
*
* Our example's entry point, invoked directly by the bootloader.
*/
int
main(void)
{
SVGAGuestPtr ptr;
SVGA3DUtil_InitFullscreen(CID, 800, 600);
SVGA3DUtil_AllocDMABuffer(gSVGA.width * gSVGA.height * 4, &ptr);
SVGA3DText_Init();
vertexSid = SVGA3DUtil_DefineStaticBuffer(vertexData, sizeof vertexData);
indexSid = SVGA3DUtil_DefineStaticBuffer(indexData, sizeof indexData);
Matrix_Perspective(perspectiveMat, 45.0f,
gSVGA.width / (float)gSVGA.height, 0.1f, 100.0f);
while (1) {
SVGA3dRect *rects;
int halfWidth = gSVGA.width / 2;
int halfHeight = gSVGA.height / 2;
if (SVGA3DUtil_UpdateFPSCounter(&gFPS)) {
Console_Clear();
Console_Format("VMware SVGA3D Example:\n"
"Present Readback:\n"
" - upper left quadrant:\n"
" present\n"
" - lower right quadrant:\n"
" present -> presentReadback -> update\n"
" - upper right and lower left quadrants:\n"
" present -> clear -> presentReadback -> update\n"
"\n"
"The cube should appear to be smoothly spinning\n"
"with all quadrants of the screen in sync.\n\n%s",
gFPS.text);
SVGA3DText_Update();
VMBackdoor_VGAScreenshot();
}
while (VMBackdoor_MouseGetPacket(&lastMouseState));
SVGA3DUtil_ClearFullscreen(CID, SVGA3D_CLEAR_COLOR | SVGA3D_CLEAR_DEPTH,
0x113366, 1.0f, 0);
render();
SVGA3DText_Draw();
SVGA3DUtil_PresentFullscreen();
SVGA3D_BeginPresentReadback(&rects, 1);
rects[0].x = halfWidth;
rects[0].y = halfHeight;
rects[0].w = halfWidth;
rects[0].h = halfHeight;
SVGA_FIFOCommitAll();
SVGA_SyncToFence(SVGA_InsertFence());
SVGA_Update(halfWidth,halfHeight,halfWidth,halfHeight);
SVGA3DUtil_ClearFullscreen(CID, SVGA3D_CLEAR_COLOR | SVGA3D_CLEAR_DEPTH,
0xff00ff, 1.0f, 0);
SVGA3D_BeginPresentReadback(&rects, 2);
rects[0].x = halfWidth;
rects[0].y = 0;
rects[0].w = halfWidth;
rects[0].h = halfHeight;
rects[1].x = 0;
rects[1].y = halfHeight;
rects[1].w = halfWidth;
rects[1].h = halfHeight;
SVGA_FIFOCommitAll();
SVGA_SyncToFence(SVGA_InsertFence());
SVGA_Update(halfWidth, 0, halfWidth, halfHeight);
SVGA_Update(0, halfHeight, halfWidth, halfHeight);
}
return 0;
}
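
The main loop above splits the screen into four quadrants and issues readback rectangles for three of them. As a rough sketch of that geometry, the helper below computes the rectangle for any quadrant from the screen size. The `QuadrantRect` type, the `QUAD_*` constants, and the `quadrantRect` function are hypothetical names used only for illustration. The struct merely mirrors the x/y/w/h fields that the example writes into its rects.

```c
#include <assert.h>

/* Hypothetical stand-in for a rectangle with x/y/w/h fields. */
typedef struct {
   int x, y, w, h;
} QuadrantRect;

enum {
   QUAD_UPPER_LEFT,
   QUAD_UPPER_RIGHT,
   QUAD_LOWER_LEFT,
   QUAD_LOWER_RIGHT
};

/* Return the rectangle covering one quadrant of a width x height screen. */
QuadrantRect
quadrantRect(int width, int height, int quadrant)
{
   assert(quadrant >= QUAD_UPPER_LEFT && quadrant <= QUAD_LOWER_RIGHT);

   int halfWidth = width / 2;
   int halfHeight = height / 2;
   int isRight = (quadrant == QUAD_UPPER_RIGHT || quadrant == QUAD_LOWER_RIGHT);
   int isLower = (quadrant == QUAD_LOWER_LEFT || quadrant == QUAD_LOWER_RIGHT);
   QuadrantRect r;

   r.x = isRight ? halfWidth : 0;
   r.y = isLower ? halfHeight : 0;
   r.w = halfWidth;
   r.h = halfHeight;
   return r;
}
```

With an 800x600 mode, the lower right quadrant starts at (400, 300), which matches the first readback rectangle in the loop.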

@ -0,0 +1,6 @@
TARGET = simple_blit.img
APP_SOURCES = main.c
LIB_DIR = ../../lib
include $(LIB_DIR)/Makefile.rules

104
examples/simple-blit/main.c Normal file
@ -0,0 +1,104 @@
/*
* SVGA3D example: Simple BLIT (Block Image Transfer) which updates
* the render target surface ID.
*
* Copyright (C) 2008-2009 VMware, Inc. Licensed under the MIT
* License, please see the README.txt. All rights reserved.
*/
#include "types.h"
#include "svga3dutil.h"
#include "svga3dtext.h"
FPSCounterState gFPS;
/*
* Alpha, red, green, blue components of color.
*/
static uint32 a = 255, r = 0, g = 0, b = 0;
/*
* DMA pools allow for the allocation of GMR memory.
* The re-use policy is handled by the allocation routines.
*/
static uint32 blitSize = 0;
DMAPool blitDMA;
/*
* render --
*
* Set up render state, and use surface DMA to update the
* render target with a solid color that goes from black
* to white.
*
*/
void
render(void)
{
DMAPoolBuffer *dma = NULL;
uint32 *buffer = NULL;
uint32 color;
dma = SVGA3DUtil_DMAPoolGetBuffer(&blitDMA);
buffer = (uint32 *)dma->buffer;
/* Pack alpha, red, green, blue from the top byte down, then do a uint32 memset. */
color = ((a&255) << 24) | ((r&255) << 16) | ((g&255) << 8) | (b&255);
memset32(buffer, color, blitSize / sizeof *buffer);
r++; g++; b++;
/*
* Copy pixel data from our temporary memory in the GMR into
* the render target. This is a BLIT operation from memory
* in the guest to the host render target.
*/
SVGA3DUtil_SurfaceDMA2D(gFullscreen.colorImage.sid, &dma->ptr,
SVGA3D_WRITE_HOST_VRAM,
gSVGA.width, gSVGA.height);
SVGA3DUtil_AsyncCall((AsyncCallFn) SVGA3DUtil_DMAPoolFreeBuffer, dma);
}
/*
* main --
*
* Our example's entry point, invoked directly by the bootloader.
*/
int
main(void)
{
SVGA3DUtil_InitFullscreen(CID, 800, 600);
SVGA3DText_Init();
/*
* Allocate a pool of 4 buffers for DMA. Each buffer is the size of the display
* so that we can fill the buffers with color data and DMA that buffer
* to the render target.
*/
blitSize = gSVGA.width * gSVGA.height * sizeof(uint32);
SVGA3DUtil_AllocDMAPool(&blitDMA, blitSize, 4);
while (1) {
if (SVGA3DUtil_UpdateFPSCounter(&gFPS)) {
Console_Clear();
Console_Format("VMware SVGA3D Example:\n"
"Simple BLIT of image into render target.\n%s",
gFPS.text);
SVGA3DText_Update();
VMBackdoor_VGAScreenshot();
}
SVGA3DUtil_ClearFullscreen(CID, SVGA3D_CLEAR_COLOR, 0x113366, 1.0f, 0);
render();
SVGA3DText_Draw();
SVGA3DUtil_PresentFullscreen();
}
return 0;
}
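
The render function above packs four 8-bit components into one 32-bit pixel and fills a whole buffer with it. A small sketch of those two steps follows, assuming an alpha/red/green/blue layout from the most significant byte down. `packARGB` and `fill32` are hypothetical helpers written for this illustration. `fill32` is only a plain-C stand-in for the example's `memset32`, whose real implementation is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Pack four components into one pixel. Masking with 255 means the
 * counters can keep incrementing forever and simply wrap around.
 */
uint32_t
packARGB(uint32_t a, uint32_t r, uint32_t g, uint32_t b)
{
   return ((a & 255) << 24) | ((r & 255) << 16) | ((g & 255) << 8) | (b & 255);
}

/* Fill numWords 32-bit words at dest with the same value. */
void
fill32(uint32_t *dest, uint32_t value, size_t numWords)
{
   assert(dest != NULL || numWords == 0);

   while (numWords--) {
      *dest++ = value;
   }
}
```

Since red, green, and blue advance together, the color ramps from black to white and then wraps back to black once the counters pass 255.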

@ -0,0 +1,16 @@
TARGET = simple-shaders.img
APP_SOURCES = main.c
LIB_DIR = ../../lib
include $(LIB_DIR)/Makefile.rules
.PHONY: shaders
shaders: simple_vs.h simple_ps.h
simple_vs.h: simple.fx
wine fxc.exe /T vs_2_0 /E MyVertexShader /Fh simple_vs.h simple.fx
simple_ps.h: simple.fx
wine fxc.exe /T ps_2_0 /E MyPixelShader /Fh simple_ps.h simple.fx

@ -0,0 +1,249 @@
/*
* SVGA3D example: Simple Shaders.
*
* This is a simple example to demonstrate the programmable pixel
* and vertex pipelines. A vertex shader animates a rippling surface,
* and a pixel shader generates a procedural checkerboard pattern.
*
* For simplicity, this example generates shader bytecode at
* compile-time using the Microsoft HLSL compiler.
*
* Copyright (C) 2008-2009 VMware, Inc. Licensed under the MIT
* License, please see the README.txt. All rights reserved.
*/
#include "svga3dutil.h"
#include "svga3dtext.h"
#include "matrix.h"
#include "math.h"
typedef uint32 DWORD;
#include "simple_vs.h"
#include "simple_ps.h"
/*
* Small integers to identify our shaders.
*/
#define MY_VSHADER_ID 0
#define MY_PSHADER_ID 0
/*
* Shader constants. These must match the constant registers in the
* bytecode we send the device, so in this example the constants are
* actually assigned by the Microsoft HLSL compiler.
*/
#define CONST_MAT_WORLDVIEWPROJ 0
#define CONST_TIMESTEP 4
/*
* Macros for the simple mesh we generate as input for the vertex
* shader. It's a static grid in the XY plane.
*/
#define MESH_WIDTH 256
#define MESH_HEIGHT 256
#define MESH_NUM_VERTICES (MESH_WIDTH * MESH_HEIGHT)
#define MESH_NUM_QUADS ((MESH_WIDTH-1) * (MESH_HEIGHT-1))
#define MESH_NUM_TRIANGLES (MESH_NUM_QUADS * 2)
#define MESH_NUM_INDICES (MESH_NUM_TRIANGLES * 3)
#define MESH_ELEMENT(x, y) (MESH_WIDTH * (y) + (x))
typedef struct {
float position[3];
} MyVertex;
typedef uint16 IndexType;
uint32 vertexSid, indexSid;
FPSCounterState gFPS;
/*
* render --
*
* Set up render state that we load once per frame (because
* SVGA3DText clobbered it) and render the scene.
*/
void
render(void)
{
SVGA3dVertexDecl *decls;
SVGA3dPrimitiveRange *ranges;
SVGA3dRenderState *rs;
float shaderTimestep[4] = { gFPS.frame * 0.01 };
SVGA3D_SetShaderConst(CID, CONST_TIMESTEP, SVGA3D_SHADERTYPE_VS,
SVGA3D_CONST_TYPE_FLOAT, shaderTimestep);
SVGA3D_BeginSetRenderState(CID, &rs, 4);
{
rs[0].state = SVGA3D_RS_BLENDENABLE;
rs[0].uintValue = FALSE;
rs[1].state = SVGA3D_RS_ZENABLE;
rs[1].uintValue = TRUE;
rs[2].state = SVGA3D_RS_ZWRITEENABLE;
rs[2].uintValue = TRUE;
rs[3].state = SVGA3D_RS_ZFUNC;
rs[3].uintValue = SVGA3D_CMP_LESS;
}
SVGA_FIFOCommitAll();
SVGA3D_SetShader(CID, SVGA3D_SHADERTYPE_VS, MY_VSHADER_ID);
SVGA3D_SetShader(CID, SVGA3D_SHADERTYPE_PS, MY_PSHADER_ID);
SVGA3D_BeginDrawPrimitives(CID, &decls, 1, &ranges, 1);
{
decls[0].identity.type = SVGA3D_DECLTYPE_FLOAT3;
decls[0].identity.usage = SVGA3D_DECLUSAGE_POSITION;
decls[0].array.surfaceId = vertexSid;
decls[0].array.stride = sizeof(MyVertex);
decls[0].array.offset = offsetof(MyVertex, position);
ranges[0].primType = SVGA3D_PRIMITIVE_TRIANGLELIST;
ranges[0].primitiveCount = MESH_NUM_TRIANGLES;
ranges[0].indexArray.surfaceId = indexSid;
ranges[0].indexArray.stride = sizeof(IndexType);
ranges[0].indexWidth = sizeof(IndexType);
}
SVGA_FIFOCommitAll();
SVGA3D_SetShader(CID, SVGA3D_SHADERTYPE_VS, SVGA3D_INVALID_ID);
SVGA3D_SetShader(CID, SVGA3D_SHADERTYPE_PS, SVGA3D_INVALID_ID);
}
/*
* createIndexBuffer --
*
* Create a static index buffer that renders our vertices as a 2D
* mesh. For simplicity, we use a triangle list rather than a
* triangle strip.
*/
uint32
createIndexBuffer(void)
{
IndexType *indexBuffer;
const uint32 bufferSize = MESH_NUM_INDICES * sizeof *indexBuffer;
SVGAGuestPtr gPtr;
uint32 sid;
int x, y;
sid = SVGA3DUtil_DefineSurface2D(bufferSize, 1, SVGA3D_BUFFER);
indexBuffer = SVGA3DUtil_AllocDMABuffer(bufferSize, &gPtr);
for (y = 0; y < (MESH_HEIGHT - 1); y++) {
for (x = 0; x < (MESH_WIDTH - 1); x++) {
indexBuffer[0] = MESH_ELEMENT(x, y );
indexBuffer[1] = MESH_ELEMENT(x+1, y );
indexBuffer[2] = MESH_ELEMENT(x+1, y+1);
indexBuffer[3] = MESH_ELEMENT(x+1, y+1);
indexBuffer[4] = MESH_ELEMENT(x, y+1);
indexBuffer[5] = MESH_ELEMENT(x, y );
indexBuffer += 6;
}
}
SVGA3DUtil_SurfaceDMA2D(sid, &gPtr, SVGA3D_WRITE_HOST_VRAM, bufferSize, 1);
return sid;
}
/*
* createVertexBuffer --
*
* Create a static vertex buffer that holds a flat grid of
* vertices on the XY plane. The vertex shader displaces these
* vertices along Z at draw time.
*/
uint32
createVertexBuffer(void)
{
MyVertex *vert;
const uint32 bufferSize = MESH_NUM_VERTICES * sizeof(MyVertex);
SVGAGuestPtr gPtr;
uint32 sid;
int x, y;
sid = SVGA3DUtil_DefineSurface2D(bufferSize, 1, SVGA3D_BUFFER);
vert = SVGA3DUtil_AllocDMABuffer(bufferSize, &gPtr);
for (y = 0; y < MESH_HEIGHT; y++) {
for (x = 0; x < MESH_WIDTH; x++) {
vert->position[0] = x * (2.0 / MESH_WIDTH) - 1.0;
vert->position[1] = y * (2.0 / MESH_HEIGHT) - 1.0;
vert->position[2] = 0.0f;
vert++;
}
}
SVGA3DUtil_SurfaceDMA2D(sid, &gPtr, SVGA3D_WRITE_HOST_VRAM, bufferSize, 1);
return sid;
}
/*
* main --
*
* Our example's entry point, invoked directly by the bootloader.
*/
int
main(void)
{
Matrix worldViewProj, proj;
SVGA3DUtil_InitFullscreen(CID, 800, 600);
SVGA3DText_Init();
vertexSid = createVertexBuffer();
indexSid = createIndexBuffer();
SVGA3D_DefineShader(CID, MY_VSHADER_ID, SVGA3D_SHADERTYPE_VS,
g_vs20_MyVertexShader, sizeof g_vs20_MyVertexShader);
SVGA3D_DefineShader(CID, MY_PSHADER_ID, SVGA3D_SHADERTYPE_PS,
g_ps20_MyPixelShader, sizeof g_ps20_MyPixelShader);
/*
* Compute a single matrix for the world, view, and projection
* transforms, then upload that to the shader.
*/
Matrix_Copy(worldViewProj, gIdentityMatrix);
Matrix_RotateX(worldViewProj, 60.0 * PI_OVER_180);
Matrix_Translate(worldViewProj, 0, 0, 3);
Matrix_Perspective(proj, 45.0f, gSVGA.width / (float)gSVGA.height, 0.1f, 100.0f);
Matrix_Multiply(worldViewProj, proj);
SVGA3DUtil_SetShaderConstMatrix(CID, CONST_MAT_WORLDVIEWPROJ,
SVGA3D_SHADERTYPE_VS, worldViewProj);
while (1) {
if (SVGA3DUtil_UpdateFPSCounter(&gFPS)) {
Console_Clear();
Console_Format("VMware SVGA3D Example:\n"
"Simple Shaders.\n\n%s",
gFPS.text);
SVGA3DText_Update();
}
SVGA3DUtil_ClearFullscreen(CID, SVGA3D_CLEAR_COLOR | SVGA3D_CLEAR_DEPTH,
0x113366, 1.0f, 0);
render();
SVGA3DText_Draw();
SVGA3DUtil_PresentFullscreen();
}
return 0;
}
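
The index buffer in this example turns a grid of vertices into a triangle list, emitting two triangles (six indices) per grid cell. The sketch below repeats the same loop on a tiny 3x3 grid so the output is easy to inspect by hand. The `GRID_*` macros and `buildGridIndices` are hypothetical names chosen for this illustration and use no device or DMA calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GRID_WIDTH        3
#define GRID_HEIGHT       3
#define GRID_NUM_QUADS    ((GRID_WIDTH - 1) * (GRID_HEIGHT - 1))
#define GRID_NUM_INDICES  (GRID_NUM_QUADS * 2 * 3)
#define GRID_ELEMENT(x, y) (GRID_WIDTH * (y) + (x))

/*
 * Fill 'indices' (room for GRID_NUM_INDICES entries) with a triangle
 * list covering the grid. Returns the number of indices written.
 */
unsigned
buildGridIndices(uint16_t *indices)
{
   unsigned count = 0;
   int x, y;

   assert(indices != NULL);

   for (y = 0; y < GRID_HEIGHT - 1; y++) {
      for (x = 0; x < GRID_WIDTH - 1; x++) {
         /* First triangle of the cell. */
         indices[count++] = GRID_ELEMENT(x,     y);
         indices[count++] = GRID_ELEMENT(x + 1, y);
         indices[count++] = GRID_ELEMENT(x + 1, y + 1);
         /* Second triangle, sharing the diagonal edge. */
         indices[count++] = GRID_ELEMENT(x + 1, y + 1);
         indices[count++] = GRID_ELEMENT(x,     y + 1);
         indices[count++] = GRID_ELEMENT(x,     y);
      }
   }
   return count;
}
```

A 3x3 grid has 4 cells, so the list holds 24 indices. The full 256x256 mesh has 65536 vertices, whose largest index (65535) still fits in a 16-bit index type.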

@ -0,0 +1,60 @@
float4x4 matWorldViewProj;
float timestep;
struct VS_Output
{
float4 Pos : POSITION;
float4 Coord : TEXCOORD0;
};
VS_Output
MyVertexShader(float4 inputPos : POSITION)
{
VS_Output Output;
float4 objectCoord = inputPos;
float dist = pow(objectCoord.x, 2) + pow(objectCoord.y, 2);
objectCoord.z = sin(dist * 8.0 + timestep) / (1 + dist * 10.0);
Output.Pos = mul(objectCoord, matWorldViewProj);
Output.Coord = objectCoord;
return Output;
}
struct PS_Input
{
float4 Coord : TEXCOORD0;
};
float4
MyPixelShader(PS_Input Input) : COLOR
{
/*
* Simple 2D procedural checkerboard.
*/
const float4 color1 = { 0.25, 0.25, 0.25, 1.0 };
const float4 color2 = { 1.0, 1.0, 1.0, 1.0 };
const float checkerSize = 0.2;
float2 s = fmod(Input.Coord.xy / checkerSize, 1);
float check = ( (float)(s.x > 0.5 || (s.x < 0 && s.x > -0.5)) +
(float)(s.y > 0.5 || (s.y < 0 && s.y > -0.5)) );
float4 color = lerp(color1, color2, fmod(check, 2));
/*
* Do a little fake shading
*/
const float4 shadeTop = { 1.0, 1.0, 0.5, 1.0 };
const float4 shadeBottom = { 0.5, 0.5, 1.0, 1.0 };
float z = Input.Coord.z * 2;
color = lerp(color, shadeBottom, clamp(z, 0, 0.25));
color = lerp(color, shadeTop, clamp(-z, 0, 0.25));
return color;
}

@ -0,0 +1,89 @@
#if 0
//
// Generated by Microsoft (R) D3DX9 Shader Compiler
//
// fxc /T ps_2_0 /E MyPixelShader /Fh simple_ps.h simple.fx
//
ps_2_0
def c0, 5, 0.5, 0, 1
def c1, -0.5, 0.25, 0, 0
def c2, 0.25, 0.25, 1, 0
def c3, 1.5, 1.5, 0, 0
def c4, 0.5, 1, 1, 0
def c5, 1, 0.5, 1, 0
dcl t0.xyz
mul r0.xy, t0, c0.x
abs r0.xy, r0
frc r0.xy, r0
cmp r0.xy, t0, r0, -r0
add r0.w, -r0.x, c0.y
cmp r0.w, r0.w, c0.z, c0.w
add r1.w, -r0.x, c1.x
cmp r1.w, r1.w, c0.z, c0.w
cmp r2.w, r0.x, c0.z, c0.w
mad r0.w, r2.w, r1.w, r0.w
cmp r0.w, -r0.w, c0.z, c0.w
add r1.w, -r0.y, c0.y
cmp r1.w, r1.w, c0.z, c0.w
add r2.w, -r0.y, c1.x
cmp r3.w, r0.y, c0.z, c0.w
cmp r2.w, r2.w, c0.z, c0.w
mad r1.w, r3.w, r2.w, r1.w
cmp r1.w, -r1.w, c0.z, c0.w
add r0.w, r0.w, r1.w
mul r0.w, r0.w, c0.y
frc r0.w, r0.w
mov r0.xyz, c3
mad r0.xyz, r0.w, r0, c2
add r0.w, t0.z, t0.z
max r1.w, r0.w, c0.z
max r2.w, -r0.w, c0.z
min r0.w, r1.w, c1.y
lrp r1.xyz, r0.w, c4, r0
min r0.w, r2.w, c1.y
lrp r2.xyz, r0.w, c5, r1
mov r0.xy, r2.x
mov r0.w, r2.z
mov r0.z, r2.y
mov oC0, r0
// approximately 34 instruction slots used
#endif
const DWORD g_ps20_MyPixelShader[] =
{
0xffff0200, 0x0013fffe, 0x42415443, 0x0000001c, 0x00000023, 0xffff0200,
0x00000000, 0x00000000, 0x20000100, 0x0000001c, 0x325f7370, 0x4d00305f,
0x6f726369, 0x74666f73, 0x29522820, 0x44334420, 0x53203958, 0x65646168,
0x6f432072, 0x6c69706d, 0x00207265, 0x05000051, 0xa00f0000, 0x40a00000,
0x3f000000, 0x00000000, 0x3f800000, 0x05000051, 0xa00f0001, 0xbf000000,
0x3e800000, 0x00000000, 0x00000000, 0x05000051, 0xa00f0002, 0x3e800000,
0x3e800000, 0x3f800000, 0x00000000, 0x05000051, 0xa00f0003, 0x3fc00000,
0x3fc00000, 0x00000000, 0x00000000, 0x05000051, 0xa00f0004, 0x3f000000,
0x3f800000, 0x3f800000, 0x00000000, 0x05000051, 0xa00f0005, 0x3f800000,
0x3f000000, 0x3f800000, 0x00000000, 0x0200001f, 0x80000000, 0xb0070000,
0x03000005, 0x80030000, 0xb0e40000, 0xa0000000, 0x02000023, 0x80030000,
0x80e40000, 0x02000013, 0x80030000, 0x80e40000, 0x04000058, 0x80030000,
0xb0e40000, 0x80e40000, 0x81e40000, 0x03000002, 0x80080000, 0x81000000,
0xa0550000, 0x04000058, 0x80080000, 0x80ff0000, 0xa0aa0000, 0xa0ff0000,
0x03000002, 0x80080001, 0x81000000, 0xa0000001, 0x04000058, 0x80080001,
0x80ff0001, 0xa0aa0000, 0xa0ff0000, 0x04000058, 0x80080002, 0x80000000,
0xa0aa0000, 0xa0ff0000, 0x04000004, 0x80080000, 0x80ff0002, 0x80ff0001,
0x80ff0000, 0x04000058, 0x80080000, 0x81ff0000, 0xa0aa0000, 0xa0ff0000,
0x03000002, 0x80080001, 0x81550000, 0xa0550000, 0x04000058, 0x80080001,
0x80ff0001, 0xa0aa0000, 0xa0ff0000, 0x03000002, 0x80080002, 0x81550000,
0xa0000001, 0x04000058, 0x80080003, 0x80550000, 0xa0aa0000, 0xa0ff0000,
0x04000058, 0x80080002, 0x80ff0002, 0xa0aa0000, 0xa0ff0000, 0x04000004,
0x80080001, 0x80ff0003, 0x80ff0002, 0x80ff0001, 0x04000058, 0x80080001,
0x81ff0001, 0xa0aa0000, 0xa0ff0000, 0x03000002, 0x80080000, 0x80ff0000,
0x80ff0001, 0x03000005, 0x80080000, 0x80ff0000, 0xa0550000, 0x02000013,
0x80080000, 0x80ff0000, 0x02000001, 0x80070000, 0xa0e40003, 0x04000004,
0x80070000, 0x80ff0000, 0x80e40000, 0xa0e40002, 0x03000002, 0x80080000,
0xb0aa0000, 0xb0aa0000, 0x0300000b, 0x80080001, 0x80ff0000, 0xa0aa0000,
0x0300000b, 0x80080002, 0x81ff0000, 0xa0aa0000, 0x0300000a, 0x80080000,
0x80ff0001, 0xa0550001, 0x04000012, 0x80070001, 0x80ff0000, 0xa0e40004,
0x80e40000, 0x0300000a, 0x80080000, 0x80ff0002, 0xa0550001, 0x04000012,
0x80070002, 0x80ff0000, 0xa0e40005, 0x80e40001, 0x02000001, 0x80030000,
0x80000002, 0x02000001, 0x80080000, 0x80aa0002, 0x02000001, 0x80040000,
0x80550002, 0x02000001, 0x800f0800, 0x80e40000, 0x0000ffff
};

@ -0,0 +1,75 @@
#if 0
//
// Generated by Microsoft (R) D3DX9 Shader Compiler
//
// fxc /T vs_2_0 /E MyVertexShader /Fh simple_vs.h simple.fx
//
//
// Parameters:
//
// float4x4 matWorldViewProj;
// float timestep;
//
//
// Registers:
//
// Name Reg Size
// ---------------- ----- ----
// matWorldViewProj c0 4
// timestep c4 1
//
vs_2_0
def c5, 8, 0.159154937, 0.5, 0
def c6, 6.28318548, -3.14159274, 10, 1
def c7, -1.55009923e-06, -2.17013894e-05, 0.00260416674, 0.00026041668
def c8, -0.020833334, -0.125, 1, 0.5
dcl_position v0
mul r0.xy, v0, v0
add r0.x, r0.y, r0.x
mov r1.x, c5.x
mad r0.y, r0.x, r1.x, c4.x
mad r0.x, r0.x, c6.z, c6.w
mad r0.y, r0.y, c5.y, c5.z
frc r0.y, r0.y
mad r0.y, r0.y, c6.x, c6.y
sincos r1.y, r0.y, c7, c8
rcp r0.x, r0.x
mul r0.z, r1.y, r0.x
mov r0.xyw, v0
dp4 oPos.x, r0, c0
dp4 oPos.y, r0, c1
dp4 oPos.z, r0, c2
dp4 oPos.w, r0, c3
mov oT0, r0
// approximately 24 instruction slots used
#endif
const DWORD g_vs20_MyVertexShader[] =
{
0xfffe0200, 0x002dfffe, 0x42415443, 0x0000001c, 0x0000008b, 0xfffe0200,
0x00000002, 0x0000001c, 0x20000100, 0x00000084, 0x00000044, 0x00000002,
0x00000004, 0x00000058, 0x00000000, 0x00000068, 0x00040002, 0x00000001,
0x00000074, 0x00000000, 0x5774616d, 0x646c726f, 0x77656956, 0x6a6f7250,
0xababab00, 0x00030003, 0x00040004, 0x00000001, 0x00000000, 0x656d6974,
0x70657473, 0xababab00, 0x00030000, 0x00010001, 0x00000001, 0x00000000,
0x325f7376, 0x4d00305f, 0x6f726369, 0x74666f73, 0x29522820, 0x44334420,
0x53203958, 0x65646168, 0x6f432072, 0x6c69706d, 0x00207265, 0x05000051,
0xa00f0005, 0x41000000, 0x3e22f983, 0x3f000000, 0x00000000, 0x05000051,
0xa00f0006, 0x40c90fdb, 0xc0490fdb, 0x41200000, 0x3f800000, 0x05000051,
0xa00f0007, 0xb5d00d01, 0xb7b60b61, 0x3b2aaaab, 0x39888889, 0x05000051,
0xa00f0008, 0xbcaaaaab, 0xbe000000, 0x3f800000, 0x3f000000, 0x0200001f,
0x80000000, 0x900f0000, 0x03000005, 0x80030000, 0x90e40000, 0x90e40000,
0x03000002, 0x80010000, 0x80550000, 0x80000000, 0x02000001, 0x80010001,
0xa0000005, 0x04000004, 0x80020000, 0x80000000, 0x80000001, 0xa0000004,
0x04000004, 0x80010000, 0x80000000, 0xa0aa0006, 0xa0ff0006, 0x04000004,
0x80020000, 0x80550000, 0xa0550005, 0xa0aa0005, 0x02000013, 0x80020000,
0x80550000, 0x04000004, 0x80020000, 0x80550000, 0xa0000006, 0xa0550006,
0x04000025, 0x80020001, 0x80550000, 0xa0e40007, 0xa0e40008, 0x02000006,
0x80010000, 0x80000000, 0x03000005, 0x80040000, 0x80550001, 0x80000000,
0x02000001, 0x800b0000, 0x90e40000, 0x03000009, 0xc0010000, 0x80e40000,
0xa0e40000, 0x03000009, 0xc0020000, 0x80e40000, 0xa0e40001, 0x03000009,
0xc0040000, 0x80e40000, 0xa0e40002, 0x03000009, 0xc0080000, 0x80e40000,
0xa0e40003, 0x02000001, 0xe00f0000, 0x80e40000, 0x0000ffff
};

@ -0,0 +1,6 @@
TARGET = video-formats.img
APP_SOURCES = main.c screen.png.data.o wols4x3.yuv.z.data.o
LIB_DIR = ../../lib
include $(LIB_DIR)/Makefile.rules

@ -0,0 +1,297 @@
/*
* video-formats -- Demonstrate all supported video overlay formats.
*
* XXX: There are some known bugs in the currently released VMware
* products, which are exposed by this test:
*
* 1. The very first VideoFlush may not appear. In this test,
* the bug manifests as "No Overlay" for test #1.
*
* 2. Software emulated scaling is very low quality.
*
* 3. If the host is using hardware video overlay rather than
* its software fallback, it assumes that colorkey is always
* enabled. This means our video will only draw in the black
* portions of the background image (inside the "X", and
* the box around the "No overlay" text.)
*
* Copyright (C) 2008-2009 VMware, Inc. Licensed under the MIT
* License, please see the README.txt. All rights reserved.
*/
#include "svga.h"
#include "png.h"
#include "intr.h"
#include "datafile.h"
/*
* This is our video test card, in UYVY format.
*
* It's a 720x576 pixel 4:3 aspect test card designed by Barney
* Wol. (http://www.barney-wol.net/testpatterns)
*/
DECLARE_DATAFILE(testCardFile, wols4x3_yuv_z);
#define TESTCARD_WIDTH 720
#define TESTCARD_HEIGHT 576
/*
* Our background image, in PNG format.
*
* This has 'cutouts' where we're supposed to display the test
* pattern. Each of these is described by the table of overlay
* settings below.
*/
DECLARE_DATAFILE(screenPNGFile, screen_png);
#define OFFSET_YUY2 0x400000
#define OFFSET_UYVY 0x500000
#define OFFSET_YV12 0x600000
static SVGAOverlayUnit overlays[] = {
// #0 - YUY2 Large
{
.enabled = TRUE,
.format = VMWARE_FOURCC_YUY2,
.width = TESTCARD_WIDTH,
.height = TESTCARD_HEIGHT,
.srcWidth = TESTCARD_WIDTH,
.srcHeight = TESTCARD_HEIGHT,
.dstX = 109,
.dstY = 407,
.dstWidth = 320,
.dstHeight = 240,
.pitches[0] = TESTCARD_WIDTH * 2,
.dataOffset = OFFSET_YUY2,
},
// #1 - YV12 Large
{
.enabled = TRUE,
.format = VMWARE_FOURCC_YV12,
.width = TESTCARD_WIDTH,
.height = TESTCARD_HEIGHT,
.srcWidth = TESTCARD_WIDTH,
.srcHeight = TESTCARD_HEIGHT,
.dstX = 564,
.dstY = 58,
.dstWidth = 320,
.dstHeight = 240,
.pitches[0] = TESTCARD_WIDTH,
.pitches[1] = TESTCARD_WIDTH / 2,
.pitches[2] = TESTCARD_WIDTH / 2,
.dataOffset = OFFSET_YV12,
},
// #2 - UYVY Large
{
.enabled = TRUE,
.format = VMWARE_FOURCC_UYVY,
.width = TESTCARD_WIDTH,
.height = TESTCARD_HEIGHT,
.srcWidth = TESTCARD_WIDTH,
.srcHeight = TESTCARD_HEIGHT,
.dstX = 564,
.dstY = 407,
.dstWidth = 320,
.dstHeight = 240,
.pitches[0] = TESTCARD_WIDTH * 2,
.dataOffset = OFFSET_UYVY,
},
// #3 - YUY2 Small
{
.enabled = TRUE,
.format = VMWARE_FOURCC_YUY2,
.width = TESTCARD_WIDTH,
.height = TESTCARD_HEIGHT,
.srcX = 34,
.srcY = 31,
.srcWidth = 76,
.srcHeight = 79,
.dstX = 109,
.dstY = 652,
.dstWidth = 64,
.dstHeight = 64,
.pitches[0] = TESTCARD_WIDTH * 2,
.dataOffset = OFFSET_YUY2,
},
// #4 - YV12 Small
{
.enabled = TRUE,
.format = VMWARE_FOURCC_YV12,
.width = TESTCARD_WIDTH,
.height = TESTCARD_HEIGHT,
.srcX = 34,
.srcY = 31,
.srcWidth = 76,
.srcHeight = 79,
.dstX = 564,
.dstY = 303,
.dstWidth = 64,
.dstHeight = 64,
.pitches[0] = TESTCARD_WIDTH,
.pitches[1] = TESTCARD_WIDTH / 2,
.pitches[2] = TESTCARD_WIDTH / 2,
.dataOffset = OFFSET_YV12,
},
// #5 - UYVY Small
{
.enabled = TRUE,
.format = VMWARE_FOURCC_UYVY,
.width = TESTCARD_WIDTH,
.height = TESTCARD_HEIGHT,
.srcX = 34,
.srcY = 31,
.srcWidth = 76,
.srcHeight = 79,
.dstX = 564,
.dstY = 652,
.dstWidth = 64,
.dstHeight = 64,
.pitches[0] = TESTCARD_WIDTH * 2,
.dataOffset = OFFSET_UYVY,
},
};
/*
* convertUYVYtoYUY2 --
*
* Convert the test card image from UYVY format to YUY2.
* Both of these are packed-pixel formats; they differ only
* in byte order.
*/
static void
convertUYVYtoYUY2(uint8 *src, // IN
uint8 *dest) // OUT
{
uint32 numWords = TESTCARD_WIDTH / 2 * TESTCARD_HEIGHT;
while (numWords--) {
uint8 u = *(src++);
uint8 y1 = *(src++);
uint8 v = *(src++);
uint8 y2 = *(src++);
*(dest++) = y1;
*(dest++) = u;
*(dest++) = y2;
*(dest++) = v;
}
}
/*
* convertUYVYtoYV12 --
*
* Convert the test card image from UYVY format (packed pixel) to
* YV12 (planar). This vertically decimates the chroma planes by
* 1/2.
*/
static void
convertUYVYtoYV12(uint8 *src, // IN
uint8 *dest) // OUT
{
/*
* Y plane, full resolution.
*/
uint8 *s = src;
uint32 numWords = TESTCARD_WIDTH / 2 * TESTCARD_HEIGHT;
while (numWords--) {
s++; // U
*(dest++) = *(s++); // Y1
s++; // V
*(dest++) = *(s++); // Y2
}
/*
* U and V planes, at 1/2 height.
*/
uint32 x, y;
const uint32 pitch = TESTCARD_WIDTH * 2;
uint8 *line1 = src;
uint8 *v = dest;
uint8 *u = v + (TESTCARD_WIDTH * TESTCARD_HEIGHT) / 4;
for (y = TESTCARD_HEIGHT/2; y; y--) {
uint8 *line2 = line1 + pitch;
for (x = TESTCARD_WIDTH/2; x; x--) {
uint8 u1 = *(line1)++; // U
line1++; // Y1
uint8 v1 = *(line1)++; // V
line1++; // Y2
uint8 u2 = *(line2)++; // U
line2++; // Y1
uint8 v2 = *(line2)++; // V
line2++; // Y2
*(u++) = ((int)u1 + (int)u2) >> 1;
*(v++) = ((int)v1 + (int)v2) >> 1;
}
line1 = line2;
}
}
/*
* main --
*
* Set up the virtual hardware, decompress the YUV images, and
* program the overlay units.
*/
int
main(void)
{
PNGChunkIHDR *screenPNG = PNG_Header(screenPNGFile->ptr);
uint32 width = bswap32(screenPNG->width);
uint32 height = bswap32(screenPNG->height);
uint32 streamId;
Intr_Init();
Intr_SetFaultHandlers(SVGA_DefaultFaultHandler);
SVGA_Init();
SVGA_SetMode(width, height, 32);
/*
* Draw the background image
*/
PNG_DecompressBGRX(screenPNG, (uint32*) gSVGA.fbMem, gSVGA.pitch);
SVGA_Update(0, 0, width, height);
/*
* Decompress the UYVY image, and use it to generate YUY2 and YV12 versions.
*/
DataFile_Decompress(testCardFile, gSVGA.fbMem + OFFSET_UYVY, 0x100000);
convertUYVYtoYUY2(gSVGA.fbMem + OFFSET_UYVY, gSVGA.fbMem + OFFSET_YUY2);
convertUYVYtoYV12(gSVGA.fbMem + OFFSET_UYVY, gSVGA.fbMem + OFFSET_YV12);
/*
* Program the overlay units
*/
for (streamId = 0; streamId < arraysize(overlays); streamId++) {
SVGA_VideoSetAllRegs(streamId, &overlays[streamId], SVGA_VIDEO_PITCH_3);
SVGA_VideoFlush(streamId);
}
return 0;
}
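
The conversions in this file are byte shuffles over fixed-size frames, so they are easy to check on a tiny buffer. The sketch below shows the UYVY to YUY2 swizzle for an arbitrary number of pixel pairs, together with the frame sizes for the packed and planar layouts. The names `uyvyToYuy2`, `packedFrameSize`, and `yv12FrameSize` are hypothetical helpers for this illustration. The size formulas assume even width and height.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Convert packed UYVY to packed YUY2. Each group of 4 bytes holds
 * two pixels: U Y1 V Y2 on input, Y1 U Y2 V on output.
 */
void
uyvyToYuy2(const uint8_t *src, uint8_t *dest, size_t numPixelPairs)
{
   assert(src != NULL && dest != NULL);

   while (numPixelPairs--) {
      uint8_t u  = *src++;
      uint8_t y1 = *src++;
      uint8_t v  = *src++;
      uint8_t y2 = *src++;

      *dest++ = y1;
      *dest++ = u;
      *dest++ = y2;
      *dest++ = v;
   }
}

/* Bytes in one packed 4:2:2 frame (UYVY or YUY2): 2 bytes per pixel. */
size_t
packedFrameSize(size_t width, size_t height)
{
   return width * height * 2;
}

/* Bytes in one YV12 frame: full-size Y plane plus two quarter-size chroma planes. */
size_t
yv12FrameSize(size_t width, size_t height)
{
   return width * height + 2 * ((width / 2) * (height / 2));
}
```

For the 720x576 test card these come to 829440 and 622080 bytes, so each converted image fits inside the 0x100000 bytes that separate the `OFFSET_*` values.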

Binary file not shown.

File diff suppressed because one or more lines are too long

@ -0,0 +1,6 @@
TARGET = video-sync.img
APP_SOURCES = main.c screen.png.data.o
LIB_DIR = ../../lib
include $(LIB_DIR)/Makefile.rules

147
examples/video-sync/main.c Normal file
@ -0,0 +1,147 @@
/*
* video-sync -- Test video DMA synchronization, by displaying a
* sequence of animated frames with flow control and multi-frame
* buffering.
*
* Copyright (C) 2008-2009 VMware, Inc. Licensed under the MIT
* License, please see the README.txt. All rights reserved.
*/
#include "svga.h"
#include "png.h"
#include "intr.h"
#include "datafile.h"
/*
* Our background image, in PNG format.
*/
DECLARE_DATAFILE(screenPNGFile, screen_png);
/*
* generateFrame --
*
* Generate one frame of video, in UYVY format.
*/
static void
generateFrame(uint8 *buffer, // OUT
uint32 width, // IN
uint32 height, // IN
uint32 frame) // IN
{
uint32 wordPitch = width / 2;
uint32 numWords = wordPitch * height;
int x = frame % width;
uint32 *linePtr = (uint32*)buffer + (x >> 1);
uint32 lineWord;
/*
* Clear it multiple times, so it will be obvious if the
* host reads a frame that we're still writing to.
*/
// Y1VVY0UU
memset32(buffer, 0xFFFFFFFF, numWords);
memset32(buffer, 0x40804080, numWords);
/*
* Draw a vertical line that moves right on each frame. This is
* the easiest way to make it obvious when the image tears.
*
* This test will also show when the luminance bytes in the
* packed-pixel decoder are out of order.
*/
if (x & 1) {
lineWord = 0xFF804080;
} else {
lineWord = 0x4080FF80;
}
while (height--) {
*linePtr = lineWord;
linePtr += wordPitch;
}
}
/*
* main --
*
* Initialization and main loop.
*/
int
main(void)
{
PNGChunkIHDR *screenPNG = PNG_Header(screenPNGFile->ptr);
uint32 width = bswap32(screenPNG->width);
uint32 height = bswap32(screenPNG->height);
Intr_Init();
Intr_SetFaultHandlers(SVGA_DefaultFaultHandler);
SVGA_Init();
SVGA_SetMode(width, height, 32);
/*
* Draw the background image
*/
PNG_DecompressBGRX(screenPNG, (uint32*) gSVGA.fbMem, gSVGA.pitch);
SVGA_Update(0, 0, width, height);
/*
* Initialize the video overlay unit. We're displaying DVD-resolution
* letterboxed 16:9 video, in UYVY (packed-pixel) format.
*/
SVGAOverlayUnit overlay = {
.enabled = TRUE,
.format = VMWARE_FOURCC_UYVY,
.width = 720,
.height = 480,
.srcWidth = 720,
.srcHeight = 480,
.dstX = 1,
.dstY = 92,
.dstWidth = 1022,
.dstHeight = 574,
.pitches[0] = 1440,
};
SVGA_VideoSetAllRegs(0, &overlay, SVGA_VIDEO_PITCH_3);
/*
* Main loop. Loop over each frame in the ring buffer repeatedly.
* We wait for the DMA buffer to become available, fill it with the
* next frame, then program the overlay unit to display that frame.
*/
uint32 frameCounter = 0;
uint32 baseOffset = width * height * 4;
uint32 frameSize = overlay.pitches[0] * overlay.height;
static uint32 fences[16];
while (1) {
uint32 bufferId;
for (bufferId = 0; bufferId < arraysize(fences); bufferId++) {
uint32 bufferOffset = baseOffset + bufferId * frameSize;
uint8 *bufferPtr = gSVGA.fbMem + bufferOffset;
SVGA_SyncToFence(fences[bufferId]);
generateFrame(bufferPtr, overlay.width, overlay.height, frameCounter++);
SVGA_VideoSetReg(0, SVGA_VIDEO_DATA_OFFSET, bufferOffset);
SVGA_VideoFlush(0);
fences[bufferId] = SVGA_InsertFence();
}
}
return 0;
}
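
The main loop above places a ring of video frames in framebuffer memory directly after the visible screen. The arithmetic can be sketched as follows. `ringBufferOffset` and `ringBufferEnd` are hypothetical helpers for this illustration. The 1024x768 screen size used in the note below is an assumption, since the real size is read from the background PNG at run time.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset of frame 'bufferId' within framebuffer memory. */
uint32_t
ringBufferOffset(uint32_t baseOffset, uint32_t frameSize, uint32_t bufferId)
{
   assert(frameSize > 0);
   return baseOffset + bufferId * frameSize;
}

/*
 * First byte past the last frame in the ring. The framebuffer must
 * be at least this large for all frames to fit.
 */
uint32_t
ringBufferEnd(uint32_t baseOffset, uint32_t frameSize, uint32_t numBuffers)
{
   return ringBufferOffset(baseOffset, frameSize, numBuffers);
}
```

With a 720x480 UYVY frame (pitch 1440) each frame takes 691200 bytes. Assuming a 1024x768 32-bit screen, the ring starts at byte 3145728, and 16 frames end at byte 14204928, a little under 14 MB of framebuffer memory.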

Binary file not shown.

132
lib/Makefile.rules Normal file
@ -0,0 +1,132 @@
#
# Common GNU Make rules for the VMware SVGA examples.
#
# To build your own apps, you just need a makefile which
# defines a few variables and includes this one. For example:
#
# LIB_DIR = path/to/lib
# TARGET = myapp.img
# APP_SOURCES = main.c
# include $(LIB_DIR)/Makefile.rules
#
# All examples get compiled with all library code, and we let
# GCC garbage collect modules that aren't being used.
#
# Basic options necessary to produce our standalone binary.
# Produce 32-bit code, even on 64-bit machines. Don't use
# the standard library at all. Begin the text segment at 1MB.
CFLAGS := -m32 -ffreestanding -nostdinc -fno-stack-protector
LDFLAGS := -nostdlib -Wl,-T,$(LIB_DIR)/metalkit/image.ld
# Extra warnings
CFLAGS += -Wall -Werror
# Size Optimizations.
CFLAGS += -Os -Wl,--gc-sections -ffunction-sections -fdata-sections
# This enables extra gcc builtins for floating point math.
CFLAGS += -march=i686 -ffast-math
# Generate debug symbols. These only show up in the .elf file, not the
# final image. Recent versions of VMware have a gdb debug stub that
# you can use along with these symbols for source-level debugging of
# Metalkit apps.
CFLAGS += -g
# Most of the examples only need 4MB of memory. Some examples
# override this, so only set it if it isn't already defined.
ifeq ($(VMX_MEMSIZE),)
VMX_MEMSIZE = 4
endif
CFLAGS += \
-I$(LIB_DIR)/metalkit \
-I$(LIB_DIR)/util \
-I$(LIB_DIR)/refdriver \
-I$(LIB_DIR)/vmware
SOURCES := \
$(LIB_DIR)/metalkit/boot.S \
$(LIB_DIR)/metalkit/pci.c \
$(LIB_DIR)/metalkit/intr.c \
$(LIB_DIR)/metalkit/console.c \
$(LIB_DIR)/metalkit/console_vga.c \
$(LIB_DIR)/metalkit/puff.c \
$(LIB_DIR)/metalkit/timer.c \
$(LIB_DIR)/metalkit/keyboard.c \
$(LIB_DIR)/metalkit/bios.c \
$(LIB_DIR)/metalkit/apm.c \
$(LIB_DIR)/metalkit/gcc_support.c \
$(LIB_DIR)/util/matrix.c \
$(LIB_DIR)/util/svga3dutil.c \
$(LIB_DIR)/util/svga3dtext.c \
$(LIB_DIR)/util/vmbackdoor.c \
$(LIB_DIR)/util/mt19937ar.c \
$(LIB_DIR)/util/png.c \
$(LIB_DIR)/refdriver/svga.c \
$(LIB_DIR)/refdriver/svga3d.c \
$(LIB_DIR)/refdriver/gmr.c \
$(APP_SOURCES)
ELF_TARGET := $(subst .img,.elf,$(TARGET))
LST_TARGET := $(subst .img,.lst,$(TARGET))
VMX_TARGET := $(subst .img,.vmx,$(TARGET))
PLAIN_TARGET := $(subst .img,,$(TARGET))
.PHONY: all target clean sizeprof listing
target: $(TARGET) $(VMX_TARGET)
%.lst: %.elf
objdump -d $< > $@
%.img: %.elf
objcopy -O binary $< $@
# Stackable rules for processing data files
%.data.o: %
objcopy -I binary -O elf32-i386 -B i386 $< $@
%.z: %
python $(LIB_DIR)/metalkit/deflate.py < $< > $@
# To optimize size, we compile all input files in one step. This
# lets GCC use information available from all files during its
# optimization phase.
$(ELF_TARGET): $(SOURCES)
$(CC) $(LDFLAGS) $(CFLAGS) -o $@ $(SOURCES)
clean:
rm -f $(TARGET) $(ELF_TARGET) $(LST_TARGET) $(VMX_TARGET) *.o
# This is a phony target which prints a list of symbols, sorted by
# size, and excluding the BSS segment. This is a quick way to see
# which functions and initialized data are taking the most space in
# the final binary.
sizeprof: $(ELF_TARGET)
@nm --size-sort -S $< | egrep -v " [bBsS] "
# Another phony target, for convenience, which dumps an assembly
# listing to stdout.
listing: $(ELF_TARGET)
objdump -d $<
# Generate a .vmx config file for VMware
$(VMX_TARGET):
@echo config.version = 8 > $(VMX_TARGET)
@echo virtualHW.version = 7 >> $(VMX_TARGET)
@echo memsize = $(VMX_MEMSIZE) >> $(VMX_TARGET)
@echo displayname = $(PLAIN_TARGET) >> $(VMX_TARGET)
@echo guestOS = other >> $(VMX_TARGET)
@echo mks.enable3d = TRUE >> $(VMX_TARGET)
@echo floppy0.startConnected = TRUE >> $(VMX_TARGET)
@echo floppy0.fileType = file >> $(VMX_TARGET)
@echo floppy0.fileName = $(TARGET) >> $(VMX_TARGET)

25
lib/README Normal file

@ -0,0 +1,25 @@
Library Code
------------
metalkit -
Open source (MIT-licensed) library code for writing programs
that run on the IA32 architecture on the "bare metal", without
an operating system.
refdriver -
Source code for the VMware SVGA reference driver.
util -
Utility code used by the accompanying examples. Includes higher
level APIs built on top of the reference driver, as well as
miscellaneous code such as text rendering and matrix math.
vmware -
VMware-provided header files, including the headers which define
registers and FIFO commands used by the VMware SVGA device.
win32 -
A Win32 port of the VMware SVGA reference driver. This driver runs
in userspace, using the kernel mode interface provided by VMware's
proprietary kernel-mode graphics driver for Windows XP.

164
lib/metalkit/apm.c Normal file

@ -0,0 +1,164 @@
/* -*- Mode: C; c-basic-offset: 3 -*-
*
* apm.c - Support for the legacy Advanced Power Management (APM) BIOS
*
* This file is part of Metalkit, a simple collection of modules for
* writing software that runs on the bare metal. Get the latest code
* at http://svn.navi.cx/misc/trunk/metalkit/
*
* Copyright (c) 2009 Micah Dowty
*
* Permission is hereby granted, free of charge, to any person
* obtaining a copy of this software and associated documentation
* files (the "Software"), to deal in the Software without
* restriction, including without limitation the rights to use,
* copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following
* conditions:
*
* The above copyright notice and this permission notice shall be
* included in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*/
#include "apm.h"
#include "bios.h"
#include "intr.h"
APMState gAPM;
/*
* APM_Init --
*
* Probe for APM support. If APM is available, this connects to APM.
*
* It would be easier to use the 16-bit real mode interface via
* Metalkit's BIOS module, but that wouldn't work very well for us
* because we can't handle interrupts during a real-mode BIOS call.
* So, any APM_Idle() call would hang!
*
* Instead, we need to use 16-bit BIOS calls to bootstrap APM, then
* we do our real work via the 32-bit APM interface that is present
* in all APM 1.2 BIOSes.
*
* On exit, gAPM will have a valid 'connected' flag, APM version,
* and flags.
*/
fastcall void
APM_Init()
{
APMState *self = &gAPM;
Regs reg = {};
/* Real mode "APM Installation Check" call */
reg.ax = 0x5300;
reg.bx = 0x0000;
BIOS_Call(0x15, &reg);
if (reg.bx == SIGNATURE_APM && reg.cf == 0) {
self->version = reg.ax;
self->flags = reg.cx;
} else {
return;
}
/* Real-mode interface connect */
reg.ax = 0x5303;
reg.bx = 0x0000;
BIOS_Call(0x15, &reg);
if (reg.cf != 0) {
return;
}
/* Indicate that we want APM v1.2 */
reg.ax = 0x530e;
reg.bx = 0x0000;
reg.cx = 0x0102;
BIOS_Call(0x15, &reg);
if (reg.cf != 0) {
return;
}
/* Success! */
self->connected = TRUE;
}
/*
* APM_Idle --
*
* If we're connected to APM, issue a "CPU Idle" call. The BIOS may
* halt the CPU until the next interrupt and/or slow or stop the
* CPU clock.
*
* If we aren't connected to APM or the APM call is unsuccessful,
* this issues a CPU HLT instruction.
*/
fastcall void
APM_Idle()
{
/*
* XXX: This doesn't actually work, because BIOS_Call disables
* interrupts! To get idle calls working, we'll need to use
* the real 32-bit APM interface.
*/
#if 0
APMState *self = &gAPM;
if (self->connected) {
Regs reg = {};
/* Real mode "CPU Idle" call */
reg.ax = 0x5305;
BIOS_Call(0x15, &reg);
if (reg.cf == 0) {
/* Success */
return;
}
}
#endif
/* Fall back to CPU HLT */
Intr_Halt();
}
/*
* APM_SetPowerState --
*
* Set the power state of all APM-managed devices.
* If we aren't connected to APM, always fails.
*
* Returns TRUE on success, FALSE on error.
*/
fastcall Bool
APM_SetPowerState(uint16 state)
{
APMState *self = &gAPM;
if (self->connected) {
Regs reg = {};
reg.ax = 0x5307; // APM Set Power State
reg.bx = 0x0001; // All devices
reg.cx = state;
BIOS_Call(0x15, &reg);
return reg.cf == 0;
}
return FALSE;
}
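
APM_Init() above accepts the installation check only if %bx holds the "PM" signature and the carry flag is clear, and it stores the version from %ax in BCD. The following is a hedged, host-side sketch of how a caller might interpret those values. `ApmVersion`, `bcd_byte_to_int`, `apm_decode_version`, and `apm_check_ok` are hypothetical helpers, not part of Metalkit.

```c
#include <assert.h>
#include <stdint.h>

#define SIGNATURE_APM 0x504d   /* "PM", as defined in apm.h */

/* Hypothetical decoded form of the BCD version word. */
typedef struct {
   int major;
   int minor;
} ApmVersion;

/* Convert one packed BCD byte (two decimal digits) to an integer. */
static int
bcd_byte_to_int(uint8_t b)
{
   assert((b >> 4) <= 9 && (b & 0xF) <= 9);   /* Must be valid BCD */
   return (b >> 4) * 10 + (b & 0xF);
}

/* High byte is the major version, low byte is the minor version. */
static ApmVersion
apm_decode_version(uint16_t bcd)
{
   ApmVersion v;

   v.major = bcd_byte_to_int((uint8_t)(bcd >> 8));
   v.minor = bcd_byte_to_int((uint8_t)(bcd & 0xFF));
   return v;
}

/* Same acceptance test as APM_Init(): signature in BX, carry clear. */
static int
apm_check_ok(uint16_t bx, unsigned cf)
{
   return bx == SIGNATURE_APM && cf == 0;
}
```

For example, a version word of 0x0102 decodes to major 1, minor 2, which is the APM v1.2 level that APM_Init() requests.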

65
lib/metalkit/apm.h Normal file

@ -0,0 +1,65 @@
/* -*- Mode: C; c-basic-offset: 3 -*-
*
* apm.h - Support for the legacy Advanced Power Management (APM) BIOS
*
* This file is part of Metalkit, a simple collection of modules for
* writing software that runs on the bare metal. Get the latest code
* at http://svn.navi.cx/misc/trunk/metalkit/
*
* Copyright (c) 2008-2009 Micah Dowty
*
* Permission is hereby granted, free of charge, to any person
* obtaining a copy of this software and associated documentation
* files (the "Software"), to deal in the Software without
* restriction, including without limitation the rights to use,
* copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following
* conditions:
*
* The above copyright notice and this permission notice shall be
* included in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*/
#ifndef __APM_H__
#define __APM_H__
#include "types.h"
#include "bios.h"
#define SIGNATURE_APM 0x504d // "PM"
#define APM_FLAG_16BIT (1 << 0)
#define APM_FLAG_32BIT (1 << 1)
#define APM_FLAG_SLOW_CPU_ON_IDLE (1 << 2)
#define APM_FLAG_DISABLED (1 << 3)
#define APM_FLAG_DISENGAGED (1 << 4)
/* APM power states */
#define POWER_ON 0
#define POWER_STANDBY 1
#define POWER_SUSPEND 2
#define POWER_OFF 3
typedef struct {
Bool connected; // Are we successfully connected to APM?
uint16 version; // Supported APM version in BCD, 0 if not supported
uint16 flags;
} APMState;
extern APMState gAPM;
fastcall void APM_Init();
fastcall void APM_Idle();
fastcall Bool APM_SetPowerState(uint16 state);
#endif /* __APM_H__ */

247
lib/metalkit/bios.c Normal file

@ -0,0 +1,247 @@
/* -*- Mode: C; c-basic-offset: 3 -*-
*
* bios.c - Make real-mode BIOS calls from protected mode.
* For simplicity and small size, this implementation
* switches back to real-mode rather than using virtual 8086
* mode. A v86 mode implementation may be more robust.
*
* This file is part of Metalkit, a simple collection of modules for
* writing software that runs on the bare metal. Get the latest code
* at http://svn.navi.cx/misc/trunk/metalkit/
*
* Copyright (c) 2008-2009 Micah Dowty
*
* Permission is hereby granted, free of charge, to any person
* obtaining a copy of this software and associated documentation
* files (the "Software"), to deal in the Software without
* restriction, including without limitation the rights to use,
* copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following
* conditions:
*
* The above copyright notice and this permission notice shall be
* included in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*/
#include "bios.h"
#include "boot.h"
#include "intr.h"
/*
* BIOSCallInternal --
*
* Internal implementation of BIOS_Call. This is a C function which
* wraps the assembly-language internal implementation. The only
* reason to use C here, really, is so we can easily calculate
* offsets into our shared C structures.
*
* This function must not make any function calls, since we need to
* be able to trust the value of %esp. This is also why it must not
* be inlined into BIOS_Call itself.
*/
static __attribute__((noinline)) void
BIOSCallInternal(void)
{
/*
* Save registers and stack in a safe place.
*/
asm volatile ("pusha");
asm volatile ("mov %%esp, %0" :"=m" (BIOS_SHARED->esp));
/*
* Jump to the relocated 16-bit trampoline (source code below).
*/
asm volatile ("ljmp %0, %1"
:: "i" (BOOT_CODE16_SEG), "i" (BIOS_SHARED->trampoline));
/*
* This is where we return from the relocated trampoline.
* We're back in protected mode, but the data segments
* are still 16-bit. Restore them.
*/
asm volatile ("BIOSReturn32: \n"
"mov %0, %%ax \n"
"mov %%ax, %%ss \n"
"mov %%ax, %%ds \n"
"mov %%ax, %%es \n"
"mov %%ax, %%fs \n"
"mov %%ax, %%gs \n"
:: "i" (BOOT_DATA_SEG));
/*
* Restore our stack and saved registers.
* Now we can safely execute C code again.
*/
asm volatile("mov %0, %%esp" ::"m" (BIOS_SHARED->esp));
asm volatile ("popa");
/*
* Return here. The rest of this code is never run directly,
* but we need to prevent GCC from optimizing it out.
*/
asm volatile("jmp BIOSTrampolineEnd\n");
/*
* This is a 16-bit assembly-language trampoline, relocated at
* runtime to low memory, which actually makes the BIOS call. It
* handles saving/restoring registers, and it switches in and out
* of real mode.
*
* This code is never run directly.
*/
asm volatile("BIOSTrampoline: .code16");
/*
* Switch to our 16-bit data segment.
*/
asm volatile("movw %0, %%ax \n"
"movw %%ax, %%ds \n"
"movw %%ax, %%es \n"
"movw %%ax, %%ss \n"
:: "i" (BOOT_DATA16_SEG));
/*
* Disable protected mode.
*/
asm volatile("movl %cr0, %eax \n"
"andl $(~1), %eax \n"
"movl %eax, %cr0 \n");
/*
* Do another long jump to reset the real-mode %cs
* register to a valid paragraph number. Right now
* it's still a protected-mode-style selector index.
*
* XXX: I'm not sure how to do this address calculation cleanly.
* Currently I'm hardcoding the address of the relocated trampoline.
*/
asm volatile("ljmp $0, $(BIOSTrampolineCS16 - BIOSTrampoline + 0x7C00)\n"
"BIOSTrampolineCS16: \n");
/*
* Set up the real-mode stack and %ss register.
*/
asm volatile("xorw %%ax, %%ax \n"
"mov %%ax, %%ss \n"
"mov %0, %%esp \n"
:: "i" (&BIOS_SHARED->stackTop[-sizeof(Regs)]));
/*
* Pop Regs off the stack.
*/
asm volatile("pop %ds \n"
"pop %es \n"
"pop %eax \n" // Ignore EFLAGS value.
"popal \n");
/*
* This interrupt instruction is a placeholder that gets
* patched at runtime (after relocation) to point to the
* right interrupt vector.
*/
asm volatile("BIOSTrampolineVector: \n"
"int $0xFF");
/*
* Push Regs back onto the stack.
*/
asm volatile("pushal \n"
"pushfl \n"
"push %es \n"
"push %ds \n");
/*
* Enable protected mode.
*/
asm volatile("movl %cr0, %eax \n"
"orl $1, %eax \n"
"movl %eax, %cr0 \n");
/*
* Return via a long 16-to-32 bit jump.
*/
asm volatile("data32 ljmp %0, $BIOSReturn32 \n"
:: "i" (BOOT_CODE_SEG));
asm volatile("BIOSTrampolineEnd: .code32 \n");
}
extern struct {
uint16 limit;
uint32 base;
} PACKED IDTDesc;
/*
* BIOS_Call --
*
* Make BIOS calls after boot, by temporarily switching
* back into real mode.
*
* This function relocates the trampoline and stack into
* real-mode-addressable low memory, then makes a 32-to-16-bit jump
* into the trampoline.
*/
fastcall void
BIOS_Call(uint8 vector, Regs *regs)
{
extern uint8 BIOSTrampoline[];
extern uint8 BIOSTrampolineVector[];
extern uint8 BIOSTrampolineEnd[];
const uint32 trampSize = (uint8*)BIOSTrampolineEnd - (uint8*)BIOSTrampoline;
const uint32 vectorOffset = (uint8*)BIOSTrampolineVector - (uint8*)BIOSTrampoline + 1;
Bool iFlag = Intr_Save();
Intr_Disable();
/*
* Relocate the trampoline code itself.
*/
memcpy(BIOS_SHARED->trampoline, BIOSTrampoline, trampSize);
/*
* Save the 32-bit IDT descriptor, and set up a legacy 256-entry
* 16-bit IDT descriptor.
*/
asm volatile("sidt %0" : "=m" (BIOS_SHARED->idtr32));
BIOS_SHARED->idtr16.base = 0;
BIOS_SHARED->idtr16.limit = 0x3ff;
asm volatile("lidt %0" :: "m" (BIOS_SHARED->idtr16));
/*
* Binary-patch the trampoline code with the right interrupt vector.
*/
BIOS_SHARED->trampoline[vectorOffset] = vector;
/*
* Copy Regs onto the top of the 16-bit stack.
*/
memcpy(&BIOS_SHARED->stackTop[-sizeof *regs], regs, sizeof *regs);
BIOSCallInternal();
/* Copy Regs back */
memcpy(regs, &BIOS_SHARED->stackTop[-sizeof *regs], sizeof *regs);
/*
* Back to 32-bit IDT.
*/
asm volatile("lidt %0" :: "m" (BIOS_SHARED->idtr32));
Intr_Restore(iFlag);
}
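
BIOS_Call() above copies the trampoline to low memory and then overwrites one byte of the copy at `vectorOffset`, which is the offset of the `BIOSTrampolineVector` label plus one. The "plus one" works because `int imm8` is encoded as the opcode byte 0xCD followed by the vector number. The following host-side sketch shows the same relocate-then-patch step on plain byte arrays. `relocate_and_patch` is a hypothetical helper, and its bounds and opcode checks are additions for illustration that the original code does not perform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OPCODE_INT_IMM8 0xCD   /* x86 "int imm8" opcode byte */

/*
 * Copy a code template into 'dest', then patch the operand of the
 * INT instruction found at 'int_offset' within the copy.
 *
 * Returns 0 on success, or -1 if the template does not fit or does
 * not contain an INT opcode at the expected offset.
 */
static int
relocate_and_patch(uint8_t *dest, size_t dest_size,
                   const uint8_t *tmpl, size_t tmpl_size,
                   size_t int_offset, uint8_t vector)
{
   assert(dest != NULL && tmpl != NULL);

   if (tmpl_size > dest_size || int_offset + 1 >= tmpl_size) {
      return -1;
   }
   if (tmpl[int_offset] != OPCODE_INT_IMM8) {
      return -1;
   }

   /* Relocate first, so the original template is never modified. */
   memcpy(dest, tmpl, tmpl_size);

   /* The vector number is the byte right after the opcode. */
   dest[int_offset + 1] = vector;
   return 0;
}
```

Patching the relocated copy rather than the template keeps the code in the main image constant, so the same template can be reused for every call with a different vector.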

176
lib/metalkit/bios.h Normal file

@ -0,0 +1,176 @@
/* -*- Mode: C; c-basic-offset: 3 -*-
*
* bios.h - Make real-mode BIOS calls from protected mode.
* For simplicity and small size, this implementation
* switches back to real-mode rather than using virtual 8086
* mode. A v86 mode implementation may be more robust.
*
* This file is part of Metalkit, a simple collection of modules for
* writing software that runs on the bare metal. Get the latest code
* at http://svn.navi.cx/misc/trunk/metalkit/
*
* Copyright (c) 2008-2009 Micah Dowty
*
* Permission is hereby granted, free of charge, to any person
* obtaining a copy of this software and associated documentation
* files (the "Software"), to deal in the Software without
* restriction, including without limitation the rights to use,
* copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following
* conditions:
*
* The above copyright notice and this permission notice shall be
* included in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*/
#ifndef __BIOS_H__
#define __BIOS_H__
#include "types.h"
#include "boot.h"
typedef struct Regs {
/*
* Subset of segment registers
*/
uint16 ds;
uint16 es;
/*
* CPU flags (Saved on BIOS exit, ignored on entry)
*/
union {
uint16 flags;
uint32 eflags;
struct {
uint32 cf : 1;
uint32 reserved_0 : 1;
uint32 pf : 1;
uint32 reserved_1 : 1;
uint32 af : 1;
uint32 reserved_2 : 1;
uint32 zf : 1;
uint32 sf : 1;
uint32 tp : 1;
uint32 intf : 1;
uint32 df : 1;
uint32 of : 1;
uint32 iopl : 2;
uint32 nt : 1;
uint32 reserved_3 : 1;
uint32 rf : 1;
uint32 vm : 1;
uint32 vif : 1;
uint32 vip : 1;
uint32 id : 1;
uint32 reserved_4 : 10;
};
};
/*
* General purpose 32-bit registers, in the order expected by
* pushad/popad. Note that while most BIOS routines need only the
* 16-bit portions of these registers, some 32-bit-aware routines
* use them even in real mode.
*/
union {
uint32 edi;
uint16 di;
};
union {
uint32 esi;
uint16 si;
};
union {
uint32 ebp;
uint16 bp;
};
union { // Saved on BIOS exit, ignored on entry
uint32 esp;
uint16 sp;
};
union {
uint32 ebx;
uint16 bx;
struct {
uint8 bl;
uint8 bh;
};
};
union {
uint32 edx;
uint16 dx;
struct {
uint8 dl;
uint8 dh;
};
};
union {
uint32 ecx;
uint16 cx;
struct {
uint8 cl;
uint8 ch;
};
};
union {
uint32 eax;
uint16 ax;
struct {
uint8 al;
uint8 ah;
};
};
} PACKED Regs;
/*
* This is the communication area between the real-mode BIOS
* and protected mode. Parts of it are used internally by this
* module, but the 'userdata' area is available to the caller.
*/
struct BIOSShared {
uint8 trampoline[512];
uint8 stack[4096];
uint8 stackTop[0];
uint32 esp;
struct {
uint16 limit;
uint32 base;
} PACKED idtr16, idtr32;
uint8 userdata[1024];
} PACKED;
#define BIOS_SHARED ((struct BIOSShared*) BOOT_REALMODE_SCRATCH)
/*
* Macros for converting between 32-bit and 16-bit near/far pointers.
*/
typedef uint32 far_ptr_t;
#define PTR_32_TO_NEAR(p, seg) ((uint16)((uint32)(p) - ((seg) << 4)))
#define PTR_NEAR_TO_32(seg, off) ((void*)((((uint32)(seg)) << 4) + ((uint32)(off))))
#define PTR_FAR_TO_32(p) PTR_NEAR_TO_32((p) >> 16, (p) & 0xFFFF)
/*
* Public entry point.
*/
fastcall void BIOS_Call(uint8 vector, Regs *regs);
#endif /* __BIOS_H__ */
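
The pointer macros above rely on real-mode addressing, where a linear address is the segment shifted left by four bits plus the 16-bit offset, and a far pointer packs the segment into the high word and the offset into the low word. The following sketch restates that arithmetic as host-runnable functions. It uses `uint32_t` linear addresses instead of `void*` so that it also works on 64-bit hosts, and the function names are hypothetical, not part of Metalkit.

```c
#include <assert.h>
#include <stdint.h>

/* Linear address of seg:off (same math as PTR_NEAR_TO_32). */
static uint32_t
seg_off_to_linear(uint16_t seg, uint16_t off)
{
   return ((uint32_t)seg << 4) + off;
}

/* Unpack a far pointer: segment in bits 31..16, offset in bits 15..0. */
static uint32_t
far_to_linear(uint32_t far_ptr)
{
   return seg_off_to_linear((uint16_t)(far_ptr >> 16),
                            (uint16_t)(far_ptr & 0xFFFF));
}

/* Offset of a linear address within a segment (as PTR_32_TO_NEAR). */
static uint16_t
linear_to_near(uint32_t linear, uint16_t seg)
{
   uint32_t base = (uint32_t)seg << 4;

   /* The address must lie inside the segment's 64KB window. */
   assert(linear >= base && linear - base <= 0xFFFF);
   return (uint16_t)(linear - base);
}
```

Many different seg:off pairs map to the same linear address. For example, 0x07C0:0x0000 and 0x0000:0x7C00 both refer to 0x7C00, which is why the conversion back to a near pointer needs the segment as an explicit argument.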

586
lib/metalkit/boot.S Normal file

@ -0,0 +1,586 @@
/*
* boot.S --
*
* This is a tiny but relatively featureful bootloader for
* 32-bit standalone apps and kernels. It compiles into one
* binary that can be used either stand-alone (loaded directly
* by the BIOS, from a floppy or USB disk image) or as a GNU
* Multiboot image, loaded by GRUB.
*
* This bootloader loads itself and the attached main program
* at 1MB, with the available portions of the first megabyte of
* RAM set up as stack space by default.
*
* This loader is capable of loading an arbitrarily big binary
* image from the boot device into high memory. If you're booting
* from a floppy, it can load the whole 1.44MB disk. If you're
* booting from USB, it can load any amount of data from the USB
* disk.
*
* This loader works by using the BIOS's disk services, so we
* should be able to read the whole binary image off of any device
* the BIOS knows how to boot from. Since we have only a tiny
* amount of buffer space, and we need to store the resulting image
* above the 1MB boundary, we have to keep switching back and forth
* between real mode and protected mode.
*
* To avoid device-specific CHS addressing madness, we require LBA
* mode to boot off of anything other than a 1.44MB floppy or a
* Multiboot loader. We try to use the INT 13h AH=42h "Extended Read
* Sectors From Drive" command, which uses LBA addressing. If this
* doesn't work, we fall back to floppy-disk-style CHS addressing.
*
*
* This file is part of Metalkit, a simple collection of modules for
* writing software that runs on the bare metal. Get the latest code
* at http://svn.navi.cx/misc/trunk/metalkit/
*
* Copyright (c) 2008-2009 Micah Dowty
*
* Permission is hereby granted, free of charge, to any person
* obtaining a copy of this software and associated documentation
* files (the "Software"), to deal in the Software without
* restriction, including without limitation the rights to use,
* copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following
* conditions:
*
* The above copyright notice and this permission notice shall be
* included in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*/
#define ASM
#include "boot.h"
/*
* Constants that affect our early boot memory map.
*/
#define BIOS_START_ADDRESS 0x7C00 // Defined by the BIOS
#define EARLY_STACK_ADDRESS 0x2000 // In low DOS memory
#define SECTORS_AT_A_TIME 18 // Must equal CHS sectors per head
#define SECTOR_SIZE 512
#define DISK_BUFFER 0x2800
#define DISK_BUFFER_SIZE (SECTORS_AT_A_TIME * SECTOR_SIZE)
#define BIOS_PTR(x) (x - _start + BIOS_START_ADDRESS)
.section .boot
.global _start
/*
* External symbols. main() is self-explanatory, but these
* other symbols must be provided by the linker script. See
* "image.ld" for the actual partition size and LDT calculations.
*/
.extern main
.extern _end
.extern _edata
.extern _bss_size
.extern _stack
.extern _partition_chs_head
.extern _partition_chs_sector_byte
.extern _partition_chs_cylinder_byte
.extern _partition_blocks
.extern _ldt_byte0
.extern _ldt_byte1
.extern _ldt_byte2
.extern _ldt_byte3
/*
* Other modules can optionally define an LDT in uninitialized
* memory. By default this LDT will be all zeroes, but this
* is a simple and code-size-efficient way of letting other
* Metalkit modules allocate segment descriptors when they
* need to.
*
* Note that we page-align the LDT. This isn't strictly
* necessary, but it might be useful for performance in
* some environments.
*/
.comm LDT, BOOT_LDT_SIZE, 4096
/*
* This begins our 16-bit DOS MBR boot sector segment. This
* sits in the first 512 bytes of our floppy image, and it
* gets loaded by the BIOS at BIOS_START_ADDRESS.
*
* Until we've loaded the memory image off of disk into
* its final location, this code is running at a different
* address than the linker is expecting. Any absolute
* addresses must be fixed up by the BIOS_PTR() macro.
*/
.code16
_start:
ljmp $0, $BIOS_PTR(bios_main)
/*
* gnu_multiboot --
*
* GNU Multiboot header. This can come anywhere in the
* first 8192 bytes of the image file.
*/
.p2align 2
.code32
gnu_multiboot:
#define MULTIBOOT_MAGIC 0x1BADB002
#define MULTIBOOT_FLAGS 0x00010000
.long MULTIBOOT_MAGIC
.long MULTIBOOT_FLAGS
.long -(MULTIBOOT_MAGIC + MULTIBOOT_FLAGS)
.long gnu_multiboot
.long _start
.long _edata
.long _end
.long entry32
/*
* String table, located in the boot sector.
*/
loading_str: .string "\r\nMETALKIT "
disk_error_str: .string " err!"
/*
* bios_main --
*
* Main routine for our BIOS MBR based loader. We set up the
* stack, display some welcome text, then load the rest of
* the boot image from disk. We have to use real mode to
* call the BIOS's floppy driver, then protected mode to
* copy each disk block to its final location above the 1MB
* barrier.
*/
.code16
bios_main:
/*
* Early init: setup our stack and data segments, make sure
* interrupts are off.
*/
cli
xorw %ax, %ax
movw %ax, %ss
movw %ax, %ds
movw %ax, %es
movw $EARLY_STACK_ADDRESS, %sp
/*
* Save parameters that the BIOS gave us via registers.
*/
mov %dl, BIOS_PTR(disk_drive)
/*
* Switch on the A20 gate, so we can access more than 1MB
* of memory. There are multiple ways to do this: The
* original way was to write to bit 1 of the keyboard
* controller's output port. There's also a bit on PS2
* System Control port A to enable A20.
*
* The keyboard controller method should always work, but
* it's kind of slow and it takes a lot of code space in
* our already-cramped bootloader. Instead, we ask the BIOS
* to enable A20.
*
* If your computer doesn't support this BIOS interface,
* you'll see our "err!" message before "METALKIT" appears.
*
* References:
* http://www.win.tue.nl/~aeb/linux/kbd/A20.html
*/
mov $0x2401, %ax // Enable A20
int $0x15
jc fatal_error
/*
* Load our image, starting at the beginning of whatever disk
* the BIOS told us we booted from. The Disk Address Packet
* (DAP) has already been initialized statically.
*/
mov $BIOS_PTR(loading_str), %si
call print_str
/*
* Fill our DISK_BUFFER, reading SECTORS_AT_A_TIME sectors.
*
* First, try to use LBA addressing. This is required in
* order to boot off of non-floppy devices, like USB drives.
*/
disk_copy_loop:
mov $0x42, %ah
mov BIOS_PTR(disk_drive), %dl
mov $BIOS_PTR(dap_buffer), %si
int $0x13
jnc disk_success
/*
* If LBA fails, fall back to old fashioned CHS addressing.
* This works everywhere, but only if we're on a 1.44MB floppy.
*/
mov $(0x0200 | SECTORS_AT_A_TIME), %ax
mov BIOS_PTR(chs_sector), %cx // Sector and cylinder
mov BIOS_PTR(disk_drive), %dx // Drive and head
mov $DISK_BUFFER, %bx
int $0x13
jnc disk_success
/*
* If both CHS and LBA fail, the error is fatal.
*/
fatal_error:
mov $BIOS_PTR(disk_error_str), %si
call print_str
cli
hlt
disk_success:
mov $'.', %al
call print_char
/*
* Enter protected mode, so we can copy this sector to
* memory above the 1MB boundary.
*
* Note that we reset CS, DS, and ES, but we don't
* modify the stack at all.
*/
cli
lgdt BIOS_PTR(bios_gdt_desc)
movl %cr0, %eax
orl $1, %eax
movl %eax, %cr0
ljmp $BOOT_CODE_SEG, $BIOS_PTR(copy_enter32)
.code32
copy_enter32:
movw $BOOT_DATA_SEG, %ax
movw %ax, %ds
movw %ax, %es
/*
* Copy the buffer to high memory.
*/
mov $DISK_BUFFER, %esi
mov BIOS_PTR(dest_address), %edi
mov $(DISK_BUFFER_SIZE / 4), %ecx
rep movsl
/*
* Next...
*
* Even though the CHS and LBA addresses are mutually exclusive,
* there's no harm in incrementing them both. The LBA increment
* is pretty straightforward, but CHS is of course less so.
* We only support CHS on 1.44MB floppies. We always copy one
* head at a time (SECTORS_AT_A_TIME must equal 18), so we have
* to hop between disk head 0 and 1, and increment the cylinder
* on every other head.
*
* When we're done copying, branch to entry32 while we're
* still in protected mode. Also note that we do a long branch
* to its final address, not its temporary BIOS_PTR() address.
*/
addl $DISK_BUFFER_SIZE, BIOS_PTR(dest_address)
addl $SECTORS_AT_A_TIME, BIOS_PTR(dap_sector)
xorb $1, BIOS_PTR(chs_head)
jnz same_cylinder
incb BIOS_PTR(chs_cylinder)
same_cylinder:
cmpl $_edata, BIOS_PTR(dest_address)
jl not_done_copying
ljmp $BOOT_CODE_SEG, $entry32
not_done_copying:
/*
* Back to 16-bit mode for the next copy.
*
* To understand this code, it's important to know the difference
* between how segment registers are treated in protected-mode and
* in real-mode. Loading a segment register in PM is actually a
* request for the processor to fill the hidden portion of that
* segment register with data from the GDT. When we switch to
* real-mode, the segment registers change meaning (now they're
* paragraph offsets again) but that hidden portion of the
* register remains set.
*/
/* 1. Load protected-mode segment registers (CS, DS, ES) */
movw $BOOT_DATA16_SEG, %ax
movw %ax, %ds
movw %ax, %es
ljmp $BOOT_CODE16_SEG, $BIOS_PTR(copy_enter16)
/* (We're entering a 16-bit code segment now) */
.code16
copy_enter16:
/* 2. Disable protected mode */
movl %cr0, %eax
andl $(~1), %eax
movl %eax, %cr0
/*
* 3. Load real-mode segment registers. (CS, DS, ES)
*/
xorw %ax, %ax
movw %ax, %ds
movw %ax, %es
ljmp $0, $BIOS_PTR(disk_copy_loop)
/*
* print_char --
*
* Use the BIOS's TTY emulation to output one character, from %al.
*/
.code16
print_char:
mov $0x0E, %ah
mov $0x0001, %bx
int $0x10
ret_label:
ret
/*
* print_str --
*
* Print a NUL-terminated string, starting at %si.
*/
.code16
print_str:
lodsb
test %al, %al
jz ret_label
call print_char
jmp print_str
/*
* entry32 --
*
* Main 32-bit entry point. To be here, we require that:
*
* - We're running in protected mode
* - The A20 gate is enabled
* - The entire image is loaded at _start
*
* We jump directly here from GNU Multiboot loaders (like
* GRUB), and this is where we jump directly from our
* protected mode disk block copy routine after we've copied
* the last block.
*
* We still need to set up our final stack and GDT.
*/
.code32
entry32:
cli
lgdt boot_gdt_desc
movl %cr0, %eax
orl $1, %eax
movl %eax, %cr0
ljmp $BOOT_CODE_SEG, $entry32_gdt_done
entry32_gdt_done:
movw $BOOT_DATA_SEG, %ax
movw %ax, %ds
movw %ax, %ss
movw %ax, %es
movw %ax, %fs
movw %ax, %gs
mov $_stack, %esp
/*
* Zero out the BSS segment.
*/
xor %eax, %eax
mov $_bss_size, %ecx
mov $_edata, %edi
rep stosb
/*
* Set our LDT segment as the current LDT.
*/
mov $BOOT_LDT_SEG, %ax
lldt %ax
/*
* Call main().
*
* If it returns, put the machine in a halt loop. We don't
* disable interrupts: if main() has returned but the
* application is still doing useful work in its interrupt
* handlers, there's no reason to stop them.
*/
call main
halt_loop:
hlt
jmp halt_loop
/*
* boot_gdt --
*
* This is a Global Descriptor Table that gives us a
* code and data segment, with a flat memory model.
*
* See section 3.4.5 of the Intel IA32 software developer's manual.
*/
.code32
.p2align 3
boot_gdt:
/*
* This is BOOT_NULL_SEG, the unusable segment zero.
* Reuse this memory as bios_gdt_desc, a GDT descriptor
* which uses our pre-relocation (BIOS_PTR) GDT address.
*/
bios_gdt_desc:
.word (boot_gdt_end - boot_gdt - 1)
.long BIOS_PTR(boot_gdt)
.word 0 // Unused
.word 0xFFFF, 0x0000 // BOOT_CODE_SEG
.byte 0x00, 0x9A, 0xCF, 0x00
.word 0xFFFF, 0x0000 // BOOT_DATA_SEG
.byte 0x00, 0x92, 0xCF, 0x00
.word 0xFFFF, 0x0000 // BOOT_CODE16_SEG
.byte 0x00, 0x9A, 0x00, 0x00
.word 0xFFFF, 0x0000 // BOOT_DATA16_SEG
.byte 0x00, 0x92, 0x00, 0x00
.word 0xFFFF // BOOT_LDT_SEG
.byte _ldt_byte0
.byte _ldt_byte1
.byte _ldt_byte2
.byte 0x82, 0x40
.byte _ldt_byte3
boot_gdt_end:
boot_gdt_desc: // Uses final address
.word (boot_gdt_end - boot_gdt - 1)
.long boot_gdt
/*
* dap_buffer --
*
* The Disk Address Packet buffer holds the current LBA
* disk address. We pass this to BIOS INT 13h, and we
* statically initialize it here.
*
* Note that the DAP is only used in LBA mode, not CHS mode.
*
* References:
* http://en.wikipedia.org/wiki/INT_13
* #INT_13h_AH.3D42h:_Extended_Read_Sectors_From_Drive
*/
dap_buffer:
.byte 0x10 // DAP structure size
.byte 0x00 // (Unused)
.byte SECTORS_AT_A_TIME // Number of sectors to read
.byte 0x00 // (Unused)
.word DISK_BUFFER // Buffer offset
.word 0x00 // Buffer segment
dap_sector:
.long 0x00000000 // Disk sector number
.long 0x00000000
/*
* Statically initialized disk addressing variables. The CHS
* address here is only used in CHS mode, not LBA mode, but
* the disk drive number and dest address are always used.
*/
chs_sector: // Order matters. Cylinder/sector and head/drive
.byte 0x01 // are packed into words together.
chs_cylinder:
.byte 0x00
disk_drive:
.byte 0x00
chs_head:
.byte 0x00
dest_address:
.long _start // Initial dest address for 16-to-32-bit copy.
/*
* Partition table and Boot Signature --
*
* This must be at the end of the first 512-byte disk
* sector. The partition table marks the end of the
* portion of this binary which is loaded by the BIOS.
*
* Each partition record is 16 bytes.
*
* After installing Metalkit, a disk can be partitioned as
* long as the space used by the Metalkit binary itself is
* reserved. By default, we create a single "Non-FS data"
* partition which holds the Metalkit binary. Note that
* this default partition starts at sector 1 (the first
* sector) so it covers the entire Metalkit image including
* bootloader.
*
* Partitions 2 through 4 are unused, and must be all zero
* or fdisk will complain.
*
* References:
* http://en.wikipedia.org/wiki/Master_boot_record
*/
.org 0x1BE // Partition 1
boot_partition_table:
.byte 0x80 // Status (Bootable)
.byte 0x00 // First block (head, sector/cylinder, cylinder)
.byte 0x01
.byte 0x00
.byte 0xda // Partition type ("Non-FS data" in fdisk)
.byte _partition_chs_head // Last block (head, sector/cylinder, cylinder)
.byte _partition_chs_sector_byte
.byte _partition_chs_cylinder_byte
.long 0 // LBA of first sector
.long _partition_blocks // Number of blocks in partition
.org 0x1CE // Partition 2 (Unused)
.org 0x1DE // Partition 3 (Unused)
.org 0x1EE // Partition 4 (Unused)
.org 0x1FE // Boot signature
.byte 0x55, 0xAA // This marks the end of the 512-byte MBR.
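The boot_gdt entries above are written as raw bytes. As a reading aid, the following host-side C sketch packs a base, a 20-bit limit, an access byte, and a flags nibble into the 8-byte segment descriptor layout. The helper name gdt_pack is hypothetical and is not part of Metalkit.

```c
#include <stdint.h>
#include <string.h>

/*
 * gdt_pack (hypothetical helper, not part of Metalkit) --
 *
 *    Pack a segment descriptor into its 8-byte in-memory layout.
 *    'limit' is a 20-bit value. 'flags' becomes the upper nibble
 *    of byte 6 (bit 3 = 4 KB granularity, bit 2 = 32-bit segment).
 */
static void
gdt_pack(uint8_t out[8], uint32_t base, uint32_t limit,
         uint8_t access, uint8_t flags)
{
   out[0] = (uint8_t)(limit & 0xFF);           /* Limit bits 0..7   */
   out[1] = (uint8_t)((limit >> 8) & 0xFF);    /* Limit bits 8..15  */
   out[2] = (uint8_t)(base & 0xFF);            /* Base bits 0..7    */
   out[3] = (uint8_t)((base >> 8) & 0xFF);     /* Base bits 8..15   */
   out[4] = (uint8_t)((base >> 16) & 0xFF);    /* Base bits 16..23  */
   out[5] = access;                            /* Type and present  */
   out[6] = (uint8_t)(((flags & 0x0F) << 4) |  /* Flags nibble      */
                      ((limit >> 16) & 0x0F)); /* Limit bits 16..19 */
   out[7] = (uint8_t)((base >> 24) & 0xFF);    /* Base bits 24..31  */
}
```

With base 0, limit 0xFFFFF, access 0x9A, and flags 0xC, this produces FF FF 00 00 00 9A CF 00, which matches the BOOT_CODE_SEG entry. Changing the access byte to 0x92 gives the BOOT_DATA_SEG entry.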

59
lib/metalkit/boot.h Normal file

@ -0,0 +1,59 @@
/* -*- Mode: C; c-basic-offset: 3 -*-
*
* boot.h - Definitions used by both the bootloader and
* the rest of the library. This file must be valid
* C and assembly.
*
* This file is part of Metalkit, a simple collection of modules for
* writing software that runs on the bare metal. Get the latest code
* at http://svn.navi.cx/misc/trunk/metalkit/
*
* Copyright (c) 2008-2009 Micah Dowty
*
* Permission is hereby granted, free of charge, to any person
* obtaining a copy of this software and associated documentation
* files (the "Software"), to deal in the Software without
* restriction, including without limitation the rights to use,
* copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following
* conditions:
*
* The above copyright notice and this permission notice shall be
* included in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*/
#ifndef __BOOT_H__
#define __BOOT_H__
#define BOOT_NULL_SEG 0x00
#define BOOT_CODE_SEG 0x08
#define BOOT_DATA_SEG 0x10
#define BOOT_CODE16_SEG 0x18
#define BOOT_DATA16_SEG 0x20
#define BOOT_LDT_SEG 0x28
#define BOOT_LDT_ENTRIES 1024
#define BOOT_LDT_SIZE (BOOT_LDT_ENTRIES * 8)
/* Unused real-mode-accessible scratch memory. */
#define BOOT_REALMODE_SCRATCH 0x7C00
/*
* The bootloader defines an LDT table which can be modified
* by C code, for loading segments dynamically.
*/
#ifndef ASM
extern unsigned char LDT[BOOT_LDT_SIZE];
#endif
#endif /* __BOOT_H__ */
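The BOOT_*_SEG values in boot.h are x86 segment selectors: bits 15..3 index a descriptor table, bit 2 selects the LDT instead of the GDT, and bits 1..0 hold the requested privilege level. The sketch below shows how C code might build a selector for an LDT slot it has filled in. The helper names are hypothetical and do not exist in Metalkit.

```c
#include <stdint.h>

/*
 * Hypothetical helpers, not part of Metalkit.
 *
 * make_selector builds a selector from a descriptor index, a table
 * choice (GDT or LDT), and a requested privilege level.
 */
static uint16_t
make_selector(unsigned index, int useLDT, unsigned rpl)
{
   return (uint16_t)((index << 3) | (useLDT ? 4u : 0u) | (rpl & 3u));
}

/* Recover the descriptor table index from a selector. */
static unsigned
selector_index(uint16_t sel)
{
   return sel >> 3;
}

/* Nonzero if the selector refers to the LDT rather than the GDT. */
static int
selector_is_ldt(uint16_t sel)
{
   return (sel >> 2) & 1;
}
```

Under this encoding BOOT_CODE_SEG (0x08) is GDT index 1 and BOOT_LDT_SEG (0x28) is GDT index 5, which agrees with the order of the entries in boot_gdt.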

291
lib/metalkit/console.c Normal file

@ -0,0 +1,291 @@
/* -*- Mode: C; c-basic-offset: 3 -*-
*
* console.c - Abstract text console
*
* This file is part of Metalkit, a simple collection of modules for
* writing software that runs on the bare metal. Get the latest code
* at http://svn.navi.cx/misc/trunk/metalkit/
*
* Copyright (c) 2008-2009 Micah Dowty
*
* Permission is hereby granted, free of charge, to any person
* obtaining a copy of this software and associated documentation
* files (the "Software"), to deal in the Software without
* restriction, including without limitation the rights to use,
* copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following
* conditions:
*
* The above copyright notice and this permission notice shall be
* included in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*/
#include "types.h"
#include "console.h"
#include "intr.h"
ConsoleInterface gConsole;
/*
* Console_WriteString --
*
* Write a NUL-terminated string.
*/
fastcall void
Console_WriteString(const char *str)
{
char c;
while ((c = *(str++))) {
Console_WriteChar(c);
}
}
/*
* Console_WriteUInt32 --
*
* Write an unsigned 32-bit integer with arbitrary base from 2 to
* 16, up to 'digits' characters long. If 'padding' is non-NUL,
* this character is used for leading digits that would be zero.
* If padding is NUL, leading digits are suppressed entirely.
* Callers pass suppressZero = FALSE, so that a value of zero
* still prints a single '0' digit.
*/
fastcall void
Console_WriteUInt32(uint32 num, int digits, char padding, int base, Bool suppressZero)
{
if (digits == 0) {
return;
}
Console_WriteUInt32(num / base, digits - 1, padding, base, TRUE);
if (num == 0 && suppressZero) {
if (padding) {
Console_WriteChar(padding);
}
} else {
uint8 digit = num % base;
Console_WriteChar(digit >= 10 ? digit - 10 + 'A' : digit + '0');
}
}
/*
* Console_Format --
* Console_FormatV --
*
* Write a formatted string. This is for the most part a tiny
* subset of printf(). Supports the standard %c, %s, %d, %u,
* and %X specifiers.
*
* Deviates from a standard printf() in a few ways, in the interest
* of low-level utility and small code size:
*
* - Adds a nonstandard %b specifier, for binary numbers.
* - Width specifiers set an exact width, not a minimum width.
* - %x is treated as %X.
*/
void
Console_Format(const char *fmt, ...)
{
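/*
* The variadic arguments follow 'fmt' on the stack (32-bit x86
* cdecl), so &fmt doubles as the argument list.
*/
Console_FormatV(&fmt);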
}
fastcall void
Console_FormatV(const char **args)
{
char c;
const char *fmt = *(args++);
while ((c = *(fmt++))) {
int width = 0;
Bool isSigned = FALSE;
char padding = '\0';
if (c != '%') {
Console_WriteChar(c);
continue;
}
while ((c = *(fmt++))) {
if (c == '0' && width == 0) {
/* If we get a leading 0 in the width specifier, turn on zero-padding */
padding = '0';
continue;
}
if (c >= '0' && c <= '9') {
/* Add another digit to the width specifier */
width = (width * 10) + (c - '0');
if (padding == '\0') {
padding = ' ';
}
continue;
}
/*
* Any other character means the width specifier has
* ended. If it's still zero, set the defaults.
*/
if (width == 0) {
width = 32;
}
/*
* Non-integer format specifiers
*/
if (c == 's') {
Console_WriteString((char*) *(args++));
break;
}
if (c == 'c') {
Console_WriteChar((char)(uint32) *(args++));
break;
}
/*
* Integers of different bases
*/
int base = 0;
if (c == 'X' || c == 'x') {
base = 16;
} else if (c == 'd') {
base = 10;
isSigned = TRUE;
} else if (c == 'u') {
base = 10;
} else if (c == 'b') {
base = 2;
}
if (base) {
uint32 value = (uint32)*(args++);
/*
* Print the sign for negative numbers.
*/
if (isSigned && 0 > (int32)value) {
Console_WriteChar('-');
width--;
value = -value;
}
Console_WriteUInt32(value, width, padding, base, FALSE);
break;
}
/* Unrecognized */
Console_WriteChar(c);
break;
}
}
}
/*
* Console_HexDump --
*
* Write a 32-bit hex dump to the console, labelling each
* line with addresses starting at 'startAddr'.
*/
fastcall void
Console_HexDump(uint32 *data, uint32 startAddr, uint32 numWords)
{
while (numWords) {
int32 lineWords = 4;
Console_Format("%08x:", startAddr);
while (numWords && lineWords) {
Console_Format(" %08x", *data);
data++;
startAddr += 4;
numWords--;
lineWords--;
}
Console_WriteChar('\n');
}
}
/*
* Console_UnhandledFault --
*
* Display a fatal error message with register and stack trace when
* an unhandled fault occurs. This fault handler must be installed
* using the Intr module.
*/
void
Console_UnhandledFault(int vector)
{
IntrContext *ctx = Intr_GetContext(vector);
/*
* Keep the format in a static array. With a regular inline
* string constant, the linker can't optimize out this string
* when the function isn't used.
*/
static const char faultFmt[] =
"Fatal error:\n"
"Unhandled fault %d at %04x:%08x\n"
"\n"
"eax=%08x ebx=%08x ecx=%08x edx=%08x\n"
"esi=%08x edi=%08x esp=%08x ebp=%08x\n"
"eflags=%032b\n"
"\n";
Console_BeginPanic();
/*
* IntrContext's stack pointer includes the three values that were
* pushed by the hardware interrupt. Advance past these, so the
* stack trace shows the state of execution at the time of the
* fault rather than at the time our interrupt trampoline was
* invoked.
*/
ctx->esp += 3 * sizeof(int);
Console_Format(faultFmt,
vector, ctx->cs, ctx->eip,
ctx->eax, ctx->ebx, ctx->ecx, ctx->edx,
ctx->esi, ctx->edi, ctx->esp, ctx->ebp,
ctx->eflags);
Console_HexDump((void*)ctx->esp, ctx->esp, 64);
Console_Flush();
Intr_Disable();
Intr_Halt();
}
/*
* Console_Panic --
*
* Default panic handler. Prints a caller-defined message, and
* halts the machine.
*/
void
Console_Panic(const char *fmt, ...)
{
Console_BeginPanic();
Console_WriteString("Panic:\n");
Console_FormatV(&fmt);
Console_Flush();
Intr_Disable();
Intr_Halt();
}
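The exact-width behaviour of Console_WriteUInt32 is easiest to see by running the same recursion on a host machine. The sketch below repeats that algorithm but writes into a small buffer instead of the console. OutBuf, out_char, and write_uint32 are hypothetical names introduced only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical capture buffer standing in for the console. */
typedef struct {
   char   buf[64];
   size_t len;
} OutBuf;

static void
out_char(OutBuf *o, char c)
{
   if (o->len + 1 < sizeof o->buf) {
      o->buf[o->len++] = c;
      o->buf[o->len] = '\0';
   }
}

/*
 * Same recursion as Console_WriteUInt32: emit the higher digits
 * first, then this digit. Leading zeroes become 'padding', or
 * nothing at all when padding is NUL.
 */
static void
write_uint32(OutBuf *o, uint32_t num, int digits, char padding,
             int base, int suppressZero)
{
   if (digits == 0) {
      return;
   }
   write_uint32(o, num / (uint32_t)base, digits - 1, padding, base, 1);
   if (num == 0 && suppressZero) {
      if (padding) {
         out_char(o, padding);
      }
   } else {
      unsigned digit = num % (uint32_t)base;
      out_char(o, (char)(digit >= 10 ? digit - 10 + 'A' : digit + '0'));
   }
}
```

Because the recursion stops after 'digits' steps, a value that needs more digits than the width allows loses its high digits: 0x12345 with a width of 4 comes out as 2345. This is the "exact width, not a minimum width" deviation from printf() noted for Console_Format.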

63
lib/metalkit/console.h Normal file

@ -0,0 +1,63 @@
/* -*- Mode: C; c-basic-offset: 3 -*-
*
* console.h - Abstract text console
*
* This file is part of Metalkit, a simple collection of modules for
* writing software that runs on the bare metal. Get the latest code
* at http://svn.navi.cx/misc/trunk/metalkit/
*
* Copyright (c) 2008-2009 Micah Dowty
*
* Permission is hereby granted, free of charge, to any person
* obtaining a copy of this software and associated documentation
* files (the "Software"), to deal in the Software without
* restriction, including without limitation the rights to use,
* copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following
* conditions:
*
* The above copyright notice and this permission notice shall be
* included in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*/
#ifndef __CONSOLE_H__
#define __CONSOLE_H__
#include "types.h"
typedef struct {
fastcall void (*beginPanic)(void); // Initialize the console for a Panic message
fastcall void (*clear)(void); // Clear the screen, home the cursor
fastcall void (*moveTo)(int x, int y); // Move the cursor
fastcall void (*writeChar)(char c); // Write one character, with support for control codes
fastcall void (*flush)(void); // Finish writing a string of characters
} ConsoleInterface;
extern ConsoleInterface gConsole;
#define Console_BeginPanic() gConsole.beginPanic()
#define Console_Clear() gConsole.clear()
#define Console_MoveTo(x, y) gConsole.moveTo(x, y)
#define Console_WriteChar(c) gConsole.writeChar(c)
#define Console_Flush() gConsole.flush()
fastcall void Console_WriteString(const char *str);
fastcall void Console_WriteUInt32(uint32 num, int digits, char padding, int base, Bool suppressZero);
fastcall void Console_FormatV(const char **args);
fastcall void Console_HexDump(uint32 *data, uint32 startAddr, uint32 numWords);
void Console_Format(const char *fmt, ...);
void Console_Panic(const char *fmt, ...);
void Console_UnhandledFault(int vector);
#endif /* __CONSOLE_H__ */
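ConsoleInterface is a table of function pointers that a driver fills in at init time. To illustrate the pattern, here is a reduced host-side sketch with a driver that captures output in memory. It is not Metalkit code: MiniConsole, gMini, and the Capture* functions are hypothetical, only two of the five callbacks are modelled, and the fastcall qualifier from types.h is left out.

```c
#include <string.h>

/* Reduced, hypothetical version of ConsoleInterface. */
typedef struct {
   void (*writeChar)(char c);
   void (*flush)(void);
} MiniConsole;

static MiniConsole gMini;

/* State for the hypothetical capture driver. */
static char     gCapture[128];
static unsigned gCaptureLen;
static unsigned gFlushCount;

static void
CaptureWriteChar(char c)
{
   if (gCaptureLen + 1 < sizeof gCapture) {
      gCapture[gCaptureLen++] = c;
      gCapture[gCaptureLen] = '\0';
   }
}

static void
CaptureFlush(void)
{
   gFlushCount++;
}

/* Install the capture driver, as ConsoleVGA_Init does for VGA. */
static void
Capture_Init(void)
{
   gMini.writeChar = CaptureWriteChar;
   gMini.flush = CaptureFlush;
   gCaptureLen = 0;
   gCapture[0] = '\0';
   gFlushCount = 0;
}

/* Driver-independent code only ever goes through the table. */
static void
Mini_WriteString(const char *str)
{
   char c;
   while ((c = *(str++))) {
      gMini.writeChar(c);
   }
}
```

The design choice this shows is that the formatting code never names a specific driver, so swapping the VGA console for another backend only requires assigning different function pointers.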

269
lib/metalkit/console_vga.c Normal file

@ -0,0 +1,269 @@
/* -*- Mode: C; c-basic-offset: 3 -*-
*
* console_vga.c - Console driver for VGA text mode.
*
* This file is part of Metalkit, a simple collection of modules for
* writing software that runs on the bare metal. Get the latest code
* at http://svn.navi.cx/misc/trunk/metalkit/
*
* Copyright (c) 2008-2009 Micah Dowty
*
* Permission is hereby granted, free of charge, to any person
* obtaining a copy of this software and associated documentation
* files (the "Software"), to deal in the Software without
* restriction, including without limitation the rights to use,
* copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following
* conditions:
*
* The above copyright notice and this permission notice shall be
* included in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*/
#include "types.h"
#include "console_vga.h"
#include "io.h"
#include "intr.h"
#define VGA_TEXT_FRAMEBUFFER ((uint8*)0xB8000)
#define VGA_CRTCREG_CURSOR_LOC_HIGH 0x0E
#define VGA_CRTCREG_CURSOR_LOC_LOW 0x0F
typedef struct {
uint16 crtc_iobase;
struct {
int8 x, y;
} cursor;
int8 attr;
} ConsoleVGAObject;
ConsoleVGAObject gConsoleVGA[1];
/*
* ConsoleVGAWriteCRTC --
*
* Write to a VGA CRT Control register.
*/
static fastcall void
ConsoleVGAWriteCRTC(uint8 addr, uint8 value)
{
ConsoleVGAObject *self = gConsoleVGA;
IO_Out8(self->crtc_iobase, addr);
IO_Out8(self->crtc_iobase + 1, value);
}
/*
* ConsoleVGAMoveHardwareCursor --
*
* Set the hardware cursor to the current cursor position.
*/
static fastcall void
ConsoleVGAMoveHardwareCursor(void)
{
ConsoleVGAObject *self = gConsoleVGA;
uint16 loc = self->cursor.x + self->cursor.y * VGA_TEXT_WIDTH;
ConsoleVGAWriteCRTC(VGA_CRTCREG_CURSOR_LOC_LOW, loc & 0xFF);
ConsoleVGAWriteCRTC(VGA_CRTCREG_CURSOR_LOC_HIGH, loc >> 8);
}
/*
* ConsoleVGAMoveTo --
*
* Set the text insertion point. This will move the hardware cursor
* at the next Console_Flush().
*/
static fastcall void
ConsoleVGAMoveTo(int x, int y)
{
ConsoleVGAObject *self = gConsoleVGA;
self->cursor.x = x;
self->cursor.y = y;
}
/*
* ConsoleVGAClear --
*
* Clear the screen and move the cursor to the home position.
*/
static fastcall void
ConsoleVGAClear(void)
{
ConsoleVGAObject *self = gConsoleVGA;
uint8 *fb = VGA_TEXT_FRAMEBUFFER;
int i, j;
ConsoleVGAMoveTo(0, 0);
for (j = 0; j < VGA_TEXT_HEIGHT; j++) {
for (i = 0; i < VGA_TEXT_WIDTH; i++) {
fb[0] = ' ';
fb[1] = self->attr;
fb += 2;
}
}
}
/*
* ConsoleVGA_SetColor --
*
* Set the text foreground color.
*/
fastcall void
ConsoleVGA_SetColor(int8 fgColor)
{
ConsoleVGAObject *self = gConsoleVGA;
self->attr &= 0xF0;
self->attr |= fgColor;
}
/*
* ConsoleVGA_SetBgColor --
*
* Set the text background color.
*/
fastcall void
ConsoleVGA_SetBgColor(int8 bgColor)
{
ConsoleVGAObject *self = gConsoleVGA;
self->attr &= 0x0F;
self->attr |= bgColor << 4;
}
/*
* ConsoleVGAWriteChar --
*
* Write one character, TTY-style. Interprets \n, \t, and \b
* characters, wraps at the right margin, and scrolls when the
* cursor moves past the last row.
*/
static fastcall void
ConsoleVGAWriteChar(char c)
{
ConsoleVGAObject *self = gConsoleVGA;
uint8 *fb = VGA_TEXT_FRAMEBUFFER;
if (c == '\n') {
self->cursor.y++;
self->cursor.x = 0;
} else if (c == '\t') {
while (self->cursor.x & 7) {
ConsoleVGAWriteChar(' ');
}
} else if (c == '\b') {
if (self->cursor.x > 0) {
self->cursor.x--;
ConsoleVGAWriteChar(' ');
self->cursor.x--;
}
} else {
fb += self->cursor.x * 2 + self->cursor.y * VGA_TEXT_WIDTH * 2;
fb[0] = c;
fb[1] = self->attr;
self->cursor.x++;
}
if (self->cursor.x >= VGA_TEXT_WIDTH) {
self->cursor.x = 0;
self->cursor.y++;
}
if (self->cursor.y >= VGA_TEXT_HEIGHT) {
int i;
uint8 *fb = VGA_TEXT_FRAMEBUFFER;
const uint32 scrollSize = VGA_TEXT_WIDTH * 2 * (VGA_TEXT_HEIGHT - 1);
self->cursor.y = VGA_TEXT_HEIGHT - 1;
memcpy(fb, fb + VGA_TEXT_WIDTH * 2, scrollSize);
fb += scrollSize;
for (i = 0; i < VGA_TEXT_WIDTH; i++) {
fb[0] = ' ';
fb[1] = self->attr;
fb += 2;
}
}
}
/*
* ConsoleVGABeginPanic --
*
* Prepare for a panic in VGA mode: Set up the panic colors,
* and clear the screen.
*/
static fastcall void
ConsoleVGABeginPanic(void)
{
ConsoleVGA_SetColor(VGA_COLOR_WHITE);
ConsoleVGA_SetBgColor(VGA_COLOR_RED);
ConsoleVGAClear();
ConsoleVGAMoveHardwareCursor();
}
/*
* ConsoleVGA_Init --
*
* Perform first-time initialization for VGA text mode,
* set VGA as the current console driver, and clear the
* screen with a default color.
*/
fastcall void
ConsoleVGA_Init(void)
{
ConsoleVGAObject *self = gConsoleVGA;
/*
* Read the I/O address select bit, to determine where the CRTC
* registers are.
*/
if (IO_In8(0x3CC) & 1) {
self->crtc_iobase = 0x3D4;
} else {
self->crtc_iobase = 0x3B4;
}
gConsole.beginPanic = ConsoleVGABeginPanic;
gConsole.clear = ConsoleVGAClear;
gConsole.moveTo = ConsoleVGAMoveTo;
gConsole.writeChar = ConsoleVGAWriteChar;
gConsole.flush = ConsoleVGAMoveHardwareCursor;
ConsoleVGA_SetColor(VGA_COLOR_WHITE);
ConsoleVGA_SetBgColor(VGA_COLOR_BLUE);
ConsoleVGAClear();
ConsoleVGAMoveHardwareCursor();
}
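Two small calculations recur in console_vga.c: the linear cell index that ConsoleVGAMoveHardwareCursor splits across the CRTC cursor location registers, and the byte offset of a cell in the text framebuffer, where each cell is a character byte followed by an attribute byte. The sketch below isolates them. The function names are hypothetical, and the width of 80 is taken from VGA_TEXT_WIDTH.

```c
#include <stdint.h>

#define TEXT_WIDTH 80   /* Same value as VGA_TEXT_WIDTH */

/* Linear index of a character cell, counted in cells. */
static uint16_t
cursor_location(int x, int y)
{
   return (uint16_t)(x + y * TEXT_WIDTH);
}

/* Byte offset of a cell from the start of the framebuffer. */
static uint32_t
cell_offset(int x, int y)
{
   return (uint32_t)cursor_location(x, y) * 2u;
}

/* Value for the "cursor location high" CRTC register (0x0E). */
static uint8_t
cursor_loc_high(uint16_t loc)
{
   return (uint8_t)(loc >> 8);
}

/* Value for the "cursor location low" CRTC register (0x0F). */
static uint8_t
cursor_loc_low(uint16_t loc)
{
   return (uint8_t)(loc & 0xFF);
}
```

For the bottom-right cell of an 80x25 screen (x = 79, y = 24) the index is 1999, so the registers receive 0x07 and 0xCF, and the cell starts 3998 bytes into the framebuffer.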


@ -0,0 +1,63 @@
/* -*- Mode: C; c-basic-offset: 3 -*-
*
* console_vga.h - Console driver for VGA text mode.
*
* This file is part of Metalkit, a simple collection of modules for
* writing software that runs on the bare metal. Get the latest code
* at http://svn.navi.cx/misc/trunk/metalkit/
*
* Copyright (c) 2008-2009 Micah Dowty
*
* Permission is hereby granted, free of charge, to any person
* obtaining a copy of this software and associated documentation
* files (the "Software"), to deal in the Software without
* restriction, including without limitation the rights to use,
* copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following
* conditions:
*
* The above copyright notice and this permission notice shall be
* included in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*/
#ifndef __CONSOLE_VGA_H__
#define __CONSOLE_VGA_H__
#include "types.h"
#include "console.h"
#define VGA_COLOR_BLACK 0
#define VGA_COLOR_BLUE 1
#define VGA_COLOR_GREEN 2
#define VGA_COLOR_CYAN 3
#define VGA_COLOR_RED 4
#define VGA_COLOR_MAGENTA 5
#define VGA_COLOR_BROWN 6
#define VGA_COLOR_LIGHT_GRAY 7
#define VGA_COLOR_DARK_GRAY 8
#define VGA_COLOR_LIGHT_BLUE 9
#define VGA_COLOR_LIGHT_GREEN 10
#define VGA_COLOR_LIGHT_CYAN 11
#define VGA_COLOR_LIGHT_RED 12
#define VGA_COLOR_LIGHT_MAGENTA 13
#define VGA_COLOR_YELLOW 14
#define VGA_COLOR_WHITE 15
#define VGA_TEXT_WIDTH 80
#define VGA_TEXT_HEIGHT 25
fastcall void ConsoleVGA_Init(void);
fastcall void ConsoleVGA_SetColor(int8 fgColor);
fastcall void ConsoleVGA_SetBgColor(int8 bgColor);
#endif /* __CONSOLE_VGA_H__ */
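The VGA_COLOR_* constants are 4-bit palette indices. ConsoleVGA_SetColor and ConsoleVGA_SetBgColor combine two of them into one attribute byte, with the foreground in the low nibble and the background in the high nibble. A minimal sketch of that packing follows. The helper names are hypothetical, and unlike the driver this version masks both inputs to 4 bits.

```c
#include <stdint.h>

/* Pack foreground and background palette indices into one byte. */
static uint8_t
vga_attr(uint8_t fg, uint8_t bg)
{
   return (uint8_t)(((bg & 0x0F) << 4) | (fg & 0x0F));
}

/* Extract the foreground index from an attribute byte. */
static uint8_t
vga_attr_fg(uint8_t attr)
{
   return attr & 0x0F;
}

/* Extract the background index from an attribute byte. */
static uint8_t
vga_attr_bg(uint8_t attr)
{
   return (uint8_t)(attr >> 4);
}
```

With these definitions the default console colors (white on blue) give 0x1F, and the panic colors (white on red) give 0x4F.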

71
lib/metalkit/datafile.h Normal file

@ -0,0 +1,71 @@
/* -*- Mode: C; c-basic-offset: 3 -*-
*
* datafile.h - Macros for using raw data files included via objcopy.
*
* This file is part of Metalkit, a simple collection of modules for
* writing software that runs on the bare metal. Get the latest code
* at http://svn.navi.cx/misc/trunk/metalkit/
*
* Copyright (c) 2008-2009 Micah Dowty
*
* Permission is hereby granted, free of charge, to any person
* obtaining a copy of this software and associated documentation
* files (the "Software"), to deal in the Software without
* restriction, including without limitation the rights to use,
* copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following
* conditions:
*
* The above copyright notice and this permission notice shall be
* included in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*/
#ifndef __DATAFILE_H__
#define __DATAFILE_H__
#include "types.h"
#include "puff.h"
typedef struct DataFile {
uint8 *ptr;
uint32 size;
} DataFile;
#define DECLARE_DATAFILE(symbol, filename) \
extern uint8 _binary_ ## filename ## _start[]; \
extern uint8 _binary_ ## filename ## _size[]; \
static const DataFile symbol[1] = {{ \
(uint8*) _binary_ ## filename ## _start, \
(uint32) _binary_ ## filename ## _size, \
}}
static inline uint32
DataFile_Decompress(const DataFile *f, void *buffer, uint32 bufferSize)
{
unsigned long sourcelen = f->size;
unsigned long destlen = bufferSize;
if (puff(buffer, &destlen, f->ptr, &sourcelen)) {
asm volatile ("int3");
}
return destlen;
}
static inline uint32
DataFile_GetDecompressedSize(const DataFile *f)
{
return DataFile_Decompress(f, NULL, 0);
}
#endif /* __DATAFILE_H__ */

25
lib/metalkit/deflate.py Normal file

@ -0,0 +1,25 @@
#!/usr/bin/env python
#
# A simple Python script to compress data
# with zlib's DEFLATE algorithm at build time.
#
import zlib, sys
level = 9
input = sys.stdin.read()
zData = zlib.compress(input, level)
# Strip off the zlib header, and return the raw DEFLATE data stream.
# See the zlib RFC: http://www.gzip.org/zlib/rfc-zlib.html
cmf = ord(zData[0])
flg = ord(zData[1])
assert (cmf & 0x0F) == 8 # DEFLATE algorithm
assert (flg & 0x20) == 0 # No preset dictionary
# Strip off 2-byte header and 4-byte checksum
rawData = zData[2:len(zData)-4]
sys.stdout.write(rawData)
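For readers who want to see the framing that deflate.py removes, the same checks can be sketched in C. The function below validates the 2-byte zlib header and returns the raw DEFLATE span between the header and the 4-byte checksum. The name zlib_strip is hypothetical. The extra "multiple of 31" test is the header check defined by the zlib format (RFC 1950) and is an addition that the Python script does not perform.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * zlib_strip (hypothetical helper, not part of Metalkit) --
 *
 *    Return a pointer to the raw DEFLATE data inside a zlib stream
 *    and store its length in *rawLen. Returns NULL if the header
 *    is not one this sketch accepts.
 */
static const uint8_t *
zlib_strip(const uint8_t *z, size_t zLen, size_t *rawLen)
{
   if (zLen < 6) {
      return NULL;                 /* Header plus checksum */
   }
   if ((z[0] & 0x0F) != 8) {
      return NULL;                 /* Not the DEFLATE method */
   }
   if (z[1] & 0x20) {
      return NULL;                 /* Preset dictionary present */
   }
   if (((unsigned)z[0] * 256u + z[1]) % 31u != 0) {
      return NULL;                 /* Header check bits are wrong */
   }
   *rawLen = zLen - 6;             /* Drop 2-byte header, 4-byte checksum */
   return z + 2;
}
```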


@ -0,0 +1,48 @@
/* -*- Mode: C; c-basic-offset: 3 -*-
*
* gcc_support.c - Older versions of GCC will call functions for
* common operations like memcpy/memset instead of
* using compiler intrinsics. This file provides
* non-inlined memcpy/memset functions for this
* purpose, and it's a good place to put any other
* compiler-specific functionality.
*
* This file is part of Metalkit, a simple collection of modules for
* writing software that runs on the bare metal. Get the latest code
* at http://svn.navi.cx/misc/trunk/metalkit/
*
* Copyright (c) 2008-2009 Micah Dowty
*
* Permission is hereby granted, free of charge, to any person
* obtaining a copy of this software and associated documentation
* files (the "Software"), to deal in the Software without
* restriction, including without limitation the rights to use,
* copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following
* conditions:
*
* The above copyright notice and this permission notice shall be
* included in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*/
void
memcpy(void *dest, const void *src, unsigned long size)
{
asm volatile ("cld; rep movsb" : "+c" (size), "+S" (src), "+D" (dest) :: "memory");
}
void
memset(void *dest, unsigned char value, unsigned long size)
{
asm volatile ("cld; rep stosb" : "+c" (size), "+D" (dest) : "a" (value) : "memory");
}
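The inline assembly above clears the direction flag before "rep movsb" and "rep stosb", so bytes are processed in ascending address order. A portable C sketch of the same behaviour is shown below, with hypothetical names chosen so that it does not clash with the C library. It is meant as an illustration for host-side testing, not as a replacement for the functions in this file.

```c
#include <string.h>

/* Copy 'size' bytes from low to high addresses, like cld; rep movsb. */
static void
copy_forward(void *dest, const void *src, unsigned long size)
{
   unsigned char *d = dest;
   const unsigned char *s = src;

   while (size--) {
      *d++ = *s++;
   }
}

/* Fill 'size' bytes with 'value', like cld; rep stosb. */
static void
fill_bytes(void *dest, unsigned char value, unsigned long size)
{
   unsigned char *d = dest;

   while (size--) {
      *d++ = value;
   }
}
```

The ascending order matters for the scroll in console_vga.c, which copies between overlapping regions with the destination below the source. A forward copy handles that case correctly, whereas the standard C memcpy() makes no promise about overlapping buffers.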

110
lib/metalkit/image.ld Normal file

@ -0,0 +1,110 @@
/*
* GNU Linker script for assembling a Metalkit binary image.
*
* Notable changes from ld's default behaviour:
*
* - Load address is at the 1MB boundary.
*
* - Our binary begins with a .boot section.
*
* - The end of the data section is padded to a
* 512-byte boundary, to make sure that our disk
* image ends on a sector boundary. (Required by QEMU)
*
* - We calculate a few auxiliary values used by the
* bootloader, which depend on knowing the size of
* the entire binary.
*/
OUTPUT_FORMAT("elf32-i386", "elf32-i386", "elf32-i386")
OUTPUT_ARCH(i386)
ENTRY(_start)
/*
* Stack starts at the top of the usable portion of the first 1MB, and
* grows downward.
*/
_stack = 0x9fffc;
SECTIONS
{
. = 0x100000;
.text : {
_file_origin = .;
*(.boot);
*(.text .text.*);
}
.data : {
*(.rodata .rodata.* .data .data.*)
_edata = .;
_sector_padding = .;
. = ALIGN(512);
_sector_padding_end = .;
}
.bss : {
__bss_start = .;
*(.bss .bss.*);
}
_end = .;
/DISCARD/ : {
*(.note .note.* .comment .comment.*);
}
}
_bss_size = _end - _edata;
_image_size = _edata - _file_origin;
/*
* Disk geometry. CHS geometry is mostly irrelevant these days, so we
* just pick something that will make fdisk happy. It tries to
* autodetect the disk size by looking at the disk's existing
* partitions, so the easiest way to keep it happy is to align the
* partition to a cylinder boundary.
*
* We'd like to use a floppy-disk-compatible geometry for images that
* are small enough to fit on a 1.44 MB disk, but for larger images we
* need to use a bigger geometry so that our cylinder numbers can fit
* in 10 bits. This larger geometry has 1 megabyte cylinders, so we
* can address 1 GB without breaking the 10 bit boundary.
*/
_geom_large_disk = _image_size >= (2880 * 512);
_geom_sectors_per_head = _geom_large_disk ? 32 : 18;
_geom_heads_per_cylinder = _geom_large_disk ? 64 : 2;
_geom_sectors_per_cylinder = _geom_sectors_per_head * _geom_heads_per_cylinder;
/*
* Partition is just big enough to hold our initialized data, rounded
* up to the nearest cylinder. The "_partition_chs_cylinder" is the
* number of the last cylinder in the partition. Also note that
* sector numbers are 1-based.
*/
_image_sectors = (_image_size + 511) / 512;
_partition_chs_cylinder = _image_sectors / _geom_sectors_per_cylinder;
_partition_blocks = (_partition_chs_cylinder + 1) * _geom_sectors_per_cylinder;
_partition_chs_head = _geom_heads_per_cylinder - 1;
_partition_chs_sector = _geom_sectors_per_head;
/*
* Encode the sector and cylinder bytes in the format expected by MBR
* partition tables.
*/
_partition_chs_cylinder_byte = _partition_chs_cylinder & 0xff;
_partition_chs_sector_byte = _partition_chs_sector |
((_partition_chs_cylinder - _partition_chs_cylinder_byte) >> 2);
/*
* Split up the LDT address into byte-wide chunks, so we can write it
* into the GDT at link time. We can't do this entirely in boot.S,
* because the LDT address isn't contiguous in the GDT.
*/
_ldt_byte0 = (LDT >> 0) & 0xff;
_ldt_byte1 = (LDT >> 8) & 0xff;
_ldt_byte2 = (LDT >> 16) & 0xff;
_ldt_byte3 = (LDT >> 24) & 0xff;
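The geometry and partition arithmetic in this linker script can be checked on a host machine by restating it in C. The sketch below follows the same expressions step by step. The struct and function names are hypothetical, and the input is the image size in bytes (the script's _image_size).

```c
#include <stdint.h>

/* Hypothetical container for the values the script computes. */
typedef struct {
   uint32_t blocks;        /* _partition_blocks            */
   uint8_t  head;          /* _partition_chs_head          */
   uint8_t  sectorByte;    /* _partition_chs_sector_byte   */
   uint8_t  cylinderByte;  /* _partition_chs_cylinder_byte */
} PartitionGeometry;

static PartitionGeometry
partition_geometry(uint32_t imageSize)
{
   PartitionGeometry g;
   int large = imageSize >= 2880u * 512u;   /* Bigger than a floppy? */
   uint32_t sectorsPerHead = large ? 32 : 18;
   uint32_t heads = large ? 64 : 2;
   uint32_t sectorsPerCylinder = sectorsPerHead * heads;
   uint32_t imageSectors = (imageSize + 511) / 512;
   uint32_t cylinder = imageSectors / sectorsPerCylinder;
   uint32_t cylinderLow = cylinder & 0xff;

   /* Round the partition up to a whole number of cylinders. */
   g.blocks = (cylinder + 1) * sectorsPerCylinder;
   g.head = (uint8_t)(heads - 1);
   g.cylinderByte = (uint8_t)cylinderLow;

   /* Cylinder bits 9..8 go into bits 7..6 of the sector byte. */
   g.sectorByte = (uint8_t)(sectorsPerHead |
                            ((cylinder - cylinderLow) >> 2));
   return g;
}
```

For a 100000-byte image this selects the floppy geometry: 196 sectors fall in cylinder 5, so the partition spans 216 blocks and ends at head 1, sector 18. For a 600 MB image the large geometry applies, the last cylinder is 600, and its two high bits appear in the sector byte as 0xA0.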

419
lib/metalkit/intr.c Normal file

@ -0,0 +1,419 @@
/* -*- Mode: C; c-basic-offset: 3 -*-
*
* intr.c - Interrupt vector management, interrupt routing,
* and low-level building blocks for multithreading.
*
* This file is part of Metalkit, a simple collection of modules for
* writing software that runs on the bare metal. Get the latest code
* at http://svn.navi.cx/misc/trunk/metalkit/
*
* Copyright (c) 2008-2009 Micah Dowty
*
* Permission is hereby granted, free of charge, to any person
* obtaining a copy of this software and associated documentation
* files (the "Software"), to deal in the Software without
* restriction, including without limitation the rights to use,
* copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following
* conditions:
*
* The above copyright notice and this permission notice shall be
* included in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*/
#include "intr.h"
#include "boot.h"
#include "io.h"
/*
* Definitions for the two PIC chips.
*/
#define PIC1_COMMAND_PORT 0x20
#define PIC1_DATA_PORT 0x21
#define PIC2_COMMAND_PORT 0xA0
#define PIC2_DATA_PORT 0xA1
/*
* IDT table and IDT table descriptor. The table itself lives in the
* BSS segment, the descriptor lives in the data segment.
*/
typedef union {
struct {
uint16 offsetLow;
uint16 segment;
uint16 flags;
uint16 offsetHigh;
};
struct {
uint32 offsetLowSeg;
uint32 flagsOffsetHigh;
};
} PACKED IDTType;
/*
* Note the IDT is page-aligned. Only 8-byte alignment is actually
* necessary, though page alignment may help performance in some
* environments.
*/
static IDTType ALIGNED(4096) IDT[NUM_INTR_VECTORS];
const struct {
uint16 limit;
void *address;
} PACKED IDTDesc = {
.limit = NUM_INTR_VECTORS * 8 - 1,
.address = IDT,
};
/*
* To save space, we don't include assembly-language trampolines for
* each interrupt vector. Instead, we allocate a table in the BSS
* segment which we can fill in at runtime with simple trampoline
* functions. This structure actually describes executable 32-bit
* code.
*/
typedef struct {
uint16 code1;
uint32 arg;
uint8 code2;
IntrHandler handler;
uint32 code3;
uint32 code4;
uint32 code5;
uint32 code6;
uint32 code7;
uint32 code8;
} PACKED IntrTrampolineType;
static IntrTrampolineType ALIGNED(4) IntrTrampoline[NUM_INTR_VECTORS];
/*
* IntrDefaultHandler --
*
* Default no-op interrupt handler.
*/
static void
IntrDefaultHandler(int vector)
{
/* Do nothing. */
}
/*
* Intr_Init --
*
* Initialize the interrupt descriptor table and the programmable
* interrupt controller (PIC). On return, interrupts are enabled
* but all handlers are no-ops.
*/
fastcall void
Intr_Init(void)
{
int i;
Intr_Disable();
IDTType *idt = IDT;
IntrTrampolineType *tramp = IntrTrampoline;
for (i = 0; i < NUM_INTR_VECTORS; i++) {
uint32 trampolineAddr = (uint32) tramp;
/*
* Set up the IDT entry as a 32-bit interrupt gate, pointing at
* our trampoline for this vector. Fill in the IDT with two 32-bit
* writes, since GCC generates significantly smaller code for this
* than when writing four 16-bit fields separately.
*/
idt->offsetLowSeg = (trampolineAddr & 0x0000FFFF) | (BOOT_CODE_SEG << 16);
idt->flagsOffsetHigh = (trampolineAddr & 0xFFFF0000) | 0x00008E00;
/*
* Set up the trampoline, pointing it at the default handler.
* The trampoline function wraps our C interrupt handler, and
* handles placing a vector number onto the stack. It also allows
* interrupt handlers to switch stacks upon return by writing
* to the saved 'esp' register.
*
* Note that the old stack and new stack may actually be different
* stack frames on the same stack. We require that the new stack
* is in a higher or equal stack frame, but the two stacks may
* overlap. This is why the trampoline does its copy in reverse.
*
* Keep the trampoline function consistent with the definition
* of IntrContext in intr.h.
*
* Stack layout:
*
* 8 eflags
* 4 cs
* 0 eip <- esp on entry to IRQ handler
* -4 eax
* -8 ecx
* -12 edx
* -16 ebx
* -20 esp
* -24 ebp
* -28 esi
* -32 edi
* -36 <arg> <- esp on entry to handler function
*
* Our trampolines each look like:
*
* 60 pusha // Save general-purpose regs
* 68 <32-bit arg> push <arg> // Call handler(arg)
* b8 <32-bit addr> mov <addr>, %eax
* ff d0 call *%eax
* 58 pop %eax // Remove arg from stack
* 8b 7c 24 0c mov 12(%esp), %edi // Load new stack address
* 8d 74 24 28 lea 40(%esp), %esi // Addr of eflags on old stack
* 83 c7 08 add $8, %edi // Addr of eflags on new stack
* fd std // Copy backwards
* a5 movsl // Copy eflags
* a5 movsl // Copy cs
* a5 movsl // Copy eip
* 61 popa // Restore general-purpose regs
* 8b 64 24 ec mov -20(%esp), %esp // Switch stacks
* cf iret // Restore eip, cs, eflags
*
* Note: Surprisingly enough, it's actually more size-efficient to initialize
* the structure in code like this than it is to memcpy() the trampoline from
* a template in the data segment.
*/
tramp->code1 = 0x6860;
tramp->code2 = 0xb8;
tramp->code3 = 0x8b58d0ff;
tramp->code4 = 0x8d0c247c;
tramp->code5 = 0x83282474;
tramp->code6 = 0xa5fd08c7;
tramp->code7 = 0x8b61a5a5;
tramp->code8 = 0xcfec2464;
tramp->handler = IntrDefaultHandler;
tramp->arg = i;
idt++;
tramp++;
}
asm volatile ("lidt IDTDesc");
typedef struct {
uint8 port, data;
} PortData8;
static const PortData8 pitInit[] = {
/*
* Program the PICs to map all IRQs linearly starting at
* IRQ_VECTOR_BASE.
*/
{ PIC1_COMMAND_PORT, 0x11 }, // Begin init, use 4 command words
{ PIC2_COMMAND_PORT, 0x11 },
{ PIC1_DATA_PORT, IRQ_VECTOR_BASE },
{ PIC2_DATA_PORT, IRQ_VECTOR_BASE + 8 },
{ PIC1_DATA_PORT, 0x04 },
{ PIC2_DATA_PORT, 0x02 },
{ PIC1_DATA_PORT, 0x03 }, // 8086 mode, auto-end-of-interrupt.
{ PIC2_DATA_PORT, 0x03 },
/*
* All IRQs start out masked, except IRQ 2 (the cascade line to the
* second PIC) and IRQ 4.
*/
{ PIC1_DATA_PORT, 0xEB },
{ PIC2_DATA_PORT, 0xFF },
};
const PortData8 *p = pitInit;
for (i = arraysize(pitInit); i; i--, p++) {
IO_Out8(p->port, p->data);
}
Intr_Enable();
}
/*
* Intr_SetHandler --
*
* Set a C-language interrupt handler for a particular vector.
* Note that the argument is a vector number, not an IRQ.
*/
fastcall void
Intr_SetHandler(int vector, IntrHandler handler)
{
IntrTrampoline[vector].handler = handler;
}
/*
* Intr_SetMask --
*
* (Un)mask a particular IRQ.
*/
fastcall void
Intr_SetMask(int irq, Bool enable)
{
uint8 port, bit, mask;
if (irq >= 8) {
bit = 1 << (irq - 8);
port = PIC2_DATA_PORT;
} else {
bit = 1 << irq;
port = PIC1_DATA_PORT;
}
mask = IO_In8(port);
/* A '1' bit in the mask inhibits the interrupt. */
if (enable) {
mask &= ~bit;
} else {
mask |= bit;
}
IO_Out8(port, mask);
}
/*
* Intr_SetFaultHandlers --
*
* Set all processor fault handlers to the provided function.
*/
fastcall void
Intr_SetFaultHandlers(IntrHandler handler)
{
int vector;
for (vector = 0; vector < NUM_FAULT_VECTORS; vector++) {
Intr_SetHandler(vector, handler);
}
}
/*
* Intr_InitContext --
*
* Create an IntrContext representing a brand new thread of
* execution. This can be used as a primitive to implement
* light-weight cooperative or pre-emptive multithreading.
*
* 'Stack' points to the initial value of the stack pointer.
* Stacks grow downward, so this should point to the top word of
* the allocated stack memory.
*/
fastcall void
Intr_InitContext(IntrContext *ctx, uint32 *stack, IntrContextFn main)
{
Intr_SaveContext(ctx);
ctx->esp = (uint32) stack;
ctx->eip = (uint32) main;
}
/*
* Intr_SaveContext --
*
* This is a C-callable function which constructs an
* IntrContext representing the current execution
* context. This is nearly equivalent to invoking a
* software interrupt and saving the interrupt's
* IntrContext, but this implementation doesn't have the
* overhead of an actual interrupt invocation.
*/
asm (".global Intr_SaveContext \n Intr_SaveContext:"
"pusha \n"
/*
* Adjust the saved stack pointer. IntrContexts always
* store an %esp which has three words on the stack
* prior to the general-purpose regs, but since we don't
* use cs or eflags we only have 1.
*/
"sub $8, 12(%esp) \n"
/*
* The stack now matches the layout of the first 9 words
* of IntrContext. Copy these, then manually save CS and
* eflags.
*/
"mov %esp, %esi \n"
"mov 36(%esp), %edi \n"
"mov $9, %ecx \n"
"rep movsl \n"
"xor %eax, %eax \n"
"mov %cs, %ax \n"
"stosl \n"
"pushf \n"
"pop %eax \n"
"stosl \n"
/* Return 0 when this function is called directly. */
"popa \n"
"xor %eax, %eax \n"
"ret" );
/*
* Intr_RestoreContext --
*
* This is the inverse of Intr_SaveContext: copy the
* IntrContext onto the target context's stack frame,
* switch stacks, then restore the rest of the context's
* saved state.
*/
asm(".global Intr_RestoreContext \n Intr_RestoreContext:"
"mov 4(%esp), %esi \n" // Load pointer to IntrContext
"mov 12(%esi), %esp \n" // Switch stacks
/*
* esp was saved with 3 words on the stack (eip, cs, eflags).
* Position esp so we have 9 words instead. (General purpose
* regs plus eip, but no cs/eflags.)
*/
"sub $24, %esp \n"
// Copy the first 9 words of IntrContext back onto the stack.
"mov %esp, %edi \n"
"mov $9, %ecx \n"
"rep movsl \n"
// Restore the general purpose regs and eip
"popa \n"
"ret" );
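
The trampoline constants written by Intr_Init in intr.c can be sanity-checked on an ordinary host, without booting anything. The sketch below rebuilds one trampoline in a packed struct and compares it byte for byte against the disassembly listing in the comment. It is only an illustration: TrampolineImage, Trampoline_Build and Trampoline_Check are hypothetical names that do not exist in Metalkit, the handler field is narrowed to a uint32_t so the layout stays 32-bit on a 64-bit host, and a little-endian host with GCC-style attributes is assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Host-side mirror of IntrTrampolineType. Assumption: the handler
 * field is a uint32_t here so the layout stays 32-bit on any host.
 */
typedef struct {
   uint16_t code1;
   uint32_t arg;
   uint8_t  code2;
   uint32_t handler;
   uint32_t code3;
   uint32_t code4;
   uint32_t code5;
   uint32_t code6;
   uint32_t code7;
   uint32_t code8;
} __attribute__((packed)) TrampolineImage;

/* Fill in one trampoline with the same constants Intr_Init uses. */
static void
Trampoline_Build(TrampolineImage *t, uint32_t arg, uint32_t handler)
{
   t->code1 = 0x6860;        /* pusha; push imm32 opcode */
   t->arg = arg;
   t->code2 = 0xb8;          /* mov imm32, %eax opcode */
   t->handler = handler;
   t->code3 = 0x8b58d0ff;
   t->code4 = 0x8d0c247c;
   t->code5 = 0x83282474;
   t->code6 = 0xa5fd08c7;
   t->code7 = 0x8b61a5a5;
   t->code8 = 0xcfec2464;
}

/*
 * Return 1 if the built image matches the disassembly listing byte
 * for byte, 0 otherwise. Assumes a little-endian host.
 */
static int
Trampoline_Check(uint32_t arg, uint32_t handler)
{
   /* Everything after the handler address, copied from the listing. */
   static const uint8_t tail[] = {
      0xff, 0xd0,                /* call *%eax */
      0x58,                      /* pop %eax */
      0x8b, 0x7c, 0x24, 0x0c,    /* mov 12(%esp), %edi */
      0x8d, 0x74, 0x24, 0x28,    /* lea 40(%esp), %esi */
      0x83, 0xc7, 0x08,          /* add $8, %edi */
      0xfd,                      /* std */
      0xa5, 0xa5, 0xa5,          /* movsl x3 */
      0x61,                      /* popa */
      0x8b, 0x64, 0x24, 0xec,    /* mov -20(%esp), %esp */
      0xcf,                      /* iret */
   };
   TrampolineImage t;
   uint8_t bytes[sizeof t];

   Trampoline_Build(&t, arg, handler);
   memcpy(bytes, &t, sizeof t);

   return sizeof t == 35 &&
          bytes[0] == 0x60 && bytes[1] == 0x68 &&
          memcmp(bytes + 2, &arg, 4) == 0 &&
          bytes[6] == 0xb8 &&
          memcmp(bytes + 7, &handler, 4) == 0 &&
          memcmp(bytes + 11, tail, sizeof tail) == 0;
}
```

A check like this is mainly useful when editing the trampoline: if an instruction is added or an offset changes, the code3..code8 constants and the listing have to be updated together, and a host-side comparison catches a mismatch early.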

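The read-modify-write logic in Intr_SetMask can also be exercised on a host by swapping the port accessors for a small simulation. In this sketch FakeIO_In8, FakeIO_Out8, Sim_SetMask and the two fake mask variables are hypothetical stand-ins invented for the example; only the port numbers and the initial mask values (0xEB and 0xFF) are taken from intr.c.

```c
#include <assert.h>
#include <stdint.h>

#define PIC1_DATA_PORT 0x21
#define PIC2_DATA_PORT 0xA1

/* Simulated interrupt mask registers, preset as Intr_Init leaves them. */
static uint8_t fakePic1Mask = 0xEB;
static uint8_t fakePic2Mask = 0xFF;

/* Hypothetical replacement for IO_In8: read a simulated mask register. */
static uint8_t
FakeIO_In8(uint16_t port)
{
   return port == PIC1_DATA_PORT ? fakePic1Mask : fakePic2Mask;
}

/* Hypothetical replacement for IO_Out8: write a simulated mask register. */
static void
FakeIO_Out8(uint16_t port, uint8_t value)
{
   if (port == PIC1_DATA_PORT) {
      fakePic1Mask = value;
   } else {
      fakePic2Mask = value;
   }
}

/* Same logic as Intr_SetMask, routed through the simulated ports. */
static void
Sim_SetMask(int irq, int enable)
{
   uint8_t port, bit, mask;

   if (irq >= 8) {
      bit = (uint8_t)(1 << (irq - 8));
      port = PIC2_DATA_PORT;
   } else {
      bit = (uint8_t)(1 << irq);
      port = PIC1_DATA_PORT;
   }

   mask = FakeIO_In8(port);

   /* A '1' bit in the mask inhibits the interrupt. */
   if (enable) {
      mask = (uint8_t)(mask & ~bit);
   } else {
      mask = (uint8_t)(mask | bit);
   }

   FakeIO_Out8(port, mask);
}
```

Under these assumptions, enabling IRQ 0 clears bit 0 of the first PIC's mask (0xEB becomes 0xEA), and enabling IRQ 12 clears bit 4 of the second PIC's mask (0xFF becomes 0xEF).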
176
lib/metalkit/intr.h Normal file

@ -0,0 +1,176 @@
/* -*- Mode: C; c-basic-offset: 3 -*-
*
* intr.h - Interrupt vector management and interrupt routing.
*
* This file is part of Metalkit, a simple collection of modules for
* writing software that runs on the bare metal. Get the latest code
* at http://svn.navi.cx/misc/trunk/metalkit/
*
* Copyright (c) 2008-2009 Micah Dowty
*
* Permission is hereby granted, free of charge, to any person
* obtaining a copy of this software and associated documentation
* files (the "Software"), to deal in the Software without
* restriction, including without limitation the rights to use,
* copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following
* conditions:
*
* The above copyright notice and this permission notice shall be
* included in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*/
#ifndef __INTR_H__
#define __INTR_H__
#include "types.h"
#define NUM_INTR_VECTORS 256
#define NUM_FAULT_VECTORS 0x20
#define NUM_IRQ_VECTORS 0x10
#define IRQ_VECTOR_BASE NUM_FAULT_VECTORS
#define IRQ_VECTOR(irq) ((irq) + IRQ_VECTOR_BASE)
#define USER_VECTOR_BASE (IRQ_VECTOR_BASE + NUM_IRQ_VECTORS)
#define USER_VECTOR(n) ((n) + USER_VECTOR_BASE)
#define IRQ_TIMER 0
#define IRQ_KEYBOARD 1
#define FAULT_DE 0x00 // Divide error
#define FAULT_NMI 0x02 // Non-maskable interrupt
#define FAULT_BP 0x03 // Breakpoint
#define FAULT_OF 0x04 // Overflow
#define FAULT_BR 0x05 // Bound range
#define FAULT_UD 0x06 // Undefined opcode
#define FAULT_NM 0x07 // No FPU
#define FAULT_DF 0x08 // Double Fault
#define FAULT_TS 0x0A // Invalid TSS
#define FAULT_NP 0x0B // Segment not present
#define FAULT_SS 0x0C // Stack-segment fault
#define FAULT_GP 0x0D // General Protection Fault
#define FAULT_PF 0x0E // Page fault
#define FAULT_MF 0x10 // Math fault
#define FAULT_AC 0x11 // Alignment check
#define FAULT_MC 0x12 // Machine check
#define FAULT_XM 0x13 // SIMD floating point exception
typedef void (*IntrHandler)(int vector);
typedef void (*IntrContextFn)(void);
fastcall void Intr_Init(void);
fastcall void Intr_SetFaultHandlers(IntrHandler handler);
fastcall void Intr_SetHandler(int vector, IntrHandler handler);
fastcall void Intr_SetMask(int irq, Bool enable);
static inline void
Intr_Enable(void) {
asm volatile ("sti");
}
static inline void
Intr_Disable(void) {
asm volatile ("cli");
}
static inline Bool
Intr_Save(void) {
uint32 eflags;
asm volatile ("pushf; pop %0" : "=r" (eflags));
return (eflags & 0x200) != 0;
}
static inline void
Intr_Restore(Bool flag) {
if (flag) {
Intr_Enable();
} else {
Intr_Disable();
}
}
static inline void
Intr_Halt(void) {
asm volatile ("hlt");
}
static inline void
Intr_Break(void) {
asm volatile ("int3");
}
/*
* This structure describes all execution state that's saved when an
* interrupt or a setjmp occurs. In the case of an interrupt, this
* structure actually describes the stack frame of the interrupt
* trampoline.
*
* An interrupt handler can get a pointer to its IntrContext by
* passing its first argument to the Intr_GetContext macro. This
* allows an interrupt handler to examine the execution context in
* which the interrupt occurred, to modify the interrupt's return
* address, or even to implement input and output for OS traps.
*
* This module also provides functions for directly saving and
* restoring IntrContext structures. This can be used much like
* setjmp/longjmp, or it can even be used for simple cooperative or
* preemptive multithreading. An interrupt handler can perform a
* context switch by overwriting its IntrContext with a saved context.
*
* The definition of this structure must be kept in sync with the
* machine code in our interrupt trampolines, and with the
* assembly-language implementation of Intr_SaveContext and
* Intr_RestoreContext.
*/
typedef struct IntrContext {
/*
* General purpose registers. These are all saved after the value
* of %esp is captured.
*/
uint32 edi;
uint32 esi;
uint32 ebp;
uint32 esp;
uint32 ebx;
uint32 edx;
uint32 ecx;
uint32 eax;
/*
* These values are saved by the CPU during an interrupt. By
* convention, these values are at the top of the stack when %esp
* was saved.
*
* The values of cs and eflags are ignored by Intr_SaveContext
* and Intr_RestoreContext.
*/
uint32 eip;
uint32 cs;
uint32 eflags;
} IntrContext;
/*
* Always use the 'volatile' keyword when storing the result
* of Intr_GetContext. GCC may otherwise optimize out writes
* made through this pointer, because it can't see that the
* values will be read back by our trampoline.
*/
#define Intr_GetContext(arg) ((IntrContext*) &(&arg)[1])
uint32 Intr_SaveContext(IntrContext *ctx);
void Intr_RestoreContext(IntrContext *ctx);
fastcall void Intr_InitContext(IntrContext *ctx, uint32 *stack, IntrContextFn main);
#endif /* __INTR_H__ */
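
Because IntrContext in intr.h has to stay in sync with hand-written machine code, it can help to pin down the offsets that the assembly relies on. The sketch below is a host-side check, not part of Metalkit: it copies the structure and vector macros from the header, uses uint32_t in place of the uint32 type from types.h, and adds a hypothetical helper, IntrContext_LayoutOk, that tests the offsets implied by "mov 12(%esi), %esp" and the 9-word "rep movsl" in intr.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copied from intr.h, with uint32_t standing in for uint32. */
typedef struct IntrContext {
   uint32_t edi;
   uint32_t esi;
   uint32_t ebp;
   uint32_t esp;
   uint32_t ebx;
   uint32_t edx;
   uint32_t ecx;
   uint32_t eax;
   uint32_t eip;
   uint32_t cs;
   uint32_t eflags;
} IntrContext;

/* Vector numbering macros, copied from intr.h. */
#define NUM_FAULT_VECTORS  0x20
#define NUM_IRQ_VECTORS    0x10
#define IRQ_VECTOR_BASE    NUM_FAULT_VECTORS
#define IRQ_VECTOR(irq)    ((irq) + IRQ_VECTOR_BASE)
#define USER_VECTOR_BASE   (IRQ_VECTOR_BASE + NUM_IRQ_VECTORS)
#define USER_VECTOR(n)     ((n) + USER_VECTOR_BASE)
#define IRQ_TIMER          0
#define IRQ_KEYBOARD       1

/*
 * Hypothetical helper: return 1 if the structure layout matches the
 * constants hard-coded in the assembly, 0 otherwise.
 */
static int
IntrContext_LayoutOk(void)
{
   return offsetof(IntrContext, edi) == 0 &&     /* first word stored by pusha */
          offsetof(IntrContext, esp) == 12 &&    /* mov 12(%esi), %esp */
          offsetof(IntrContext, eip) == 32 &&    /* 9th word of rep movsl */
          offsetof(IntrContext, cs) == 36 &&
          offsetof(IntrContext, eflags) == 40 &&
          sizeof(IntrContext) == 44;
}
```

The same macros show how vectors are numbered: with the values above, IRQ_VECTOR(IRQ_KEYBOARD) is 0x21 and USER_VECTOR(0) is 0x30, so a handler for the keyboard IRQ would be installed with Intr_SetHandler(IRQ_VECTOR(IRQ_KEYBOARD), handler) and unmasked with Intr_SetMask(IRQ_KEYBOARD, TRUE), assuming TRUE is the Bool constant provided by types.h.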

Some files were not shown because too many files have changed in this diff.