* [Qemu-devel] [RFC] docs: describe QEMU's VMGenID design
@ 2015-08-28 20:18 Laszlo Ersek
From: Laszlo Ersek @ 2015-08-28 20:18 UTC
  To: qemu-devel; +Cc: Gal Hammer, Paolo Bonzini, Michael S. Tsirkin, Igor Mammedov

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gal Hammer <ghammer@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
---

Notes:
    This is based on the super long private email discussion we had two
    months ago, plus on the IRL discussion between Michael and myself @ the
    KVM Forum 2015.

 docs/specs/vmgenid.txt | 343 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 343 insertions(+)
 create mode 100644 docs/specs/vmgenid.txt

diff --git a/docs/specs/vmgenid.txt b/docs/specs/vmgenid.txt
new file mode 100644
index 0000000..d4bf132
--- /dev/null
+++ b/docs/specs/vmgenid.txt
@@ -0,0 +1,343 @@
+Virtual Machine Generation ID Device
+====================================
+
+The Microsoft specification entitled "Virtual Machine Generation ID",
+maintained at <http://go.microsoft.com/fwlink/?LinkId=260709>, defines an ACPI
+feature that allows the guest OSPM to recognize when it has been returned "to
+an earlier point in time", e.g. by restoring from a snapshot, or by incoming
+migration. Quoting the spec,
+
+    The virtual machine generation ID is a feature whereby the virtual machine's
+    BIOS will expose a new ID. This is a 128-bit, cryptographically random
+    integer value identifier that will be different every time the virtual
+    machine executes from a different configuration file -- such as executing
+    from a recovered snapshot, or executing after restoring from backup. [...]
+
+The document you are reading now extracts the requirements set forth by the
+VMGenID spec for hypervisors that intend to provide the feature, and describes
+QEMU's implementation. The design below targets both SeaBIOS and OVMF as
+compatible guest firmwares, without any changes to either of them.
+
+Requirements
+------------
+
+These requirements are extracted from the "How to implement virtual machine
+generation ID support in a virtualization platform" section of the
+specification, dated August 1, 2012.
+
+R1a. The generation ID shall live in an 8-byte aligned buffer.
+
+R1b. The buffer holding the generation ID shall be in guest RAM, ROM, or device
+     MMIO range.
+
+R1c. The buffer holding the generation ID shall be kept separate from areas
+     used by the operating system.
+
+R1d. The buffer shall not be covered by an AddressRangeMemory or
+     AddressRangeACPI entry in the E820 or UEFI memory map.
+
+R1e. The generation ID shall not live in a page frame that could be mapped with
+     caching disabled. (In other words, if the generation ID lives in RAM, then
+     it shall only be mapped as cacheable.)
+
+R2 to R5. [These AML requirements are isolated well enough in the Microsoft
+          specification for us to simply refer to them here.]
+
+R6. The hypervisor shall expose a _HID (hardware identifier) object in the
+    VMGenId device's scope that is unique to the hypervisor vendor.
+
+Generation ID buffer design
+---------------------------
+
+QEMU places the generation ID buffer inside a separate fw_cfg blob that is
+exposed to the guest OS with the ACPI linker/loader.
+
+The structure of the blob is as follows. Offsets, sizes and numeric values are
+given in decimal; furthermore the latter are encoded in little endian.
+
+  Offs  Field               Size  Value
+  ----  ------------------  ----  ------------------------------------
+     0  System Description    36
+        Table Header
+     0    Signature            4                                "UEFI"
+     4    Length               4                                    62
+     8    Revision             1                                     1
+     9    Checksum             1                                     0
+    10    OEMID                6        ACPI_BUILD_APPNAME6 ("BOCHS ")
+    16    OEM Table ID         8                            "QEMUPARM"
+    24    OEM Revision         4                                     1
+    28    Creator ID           4          ACPI_BUILD_APPNAME4 ("BXPC")
+    32    Creator Revision     4                                     1
+
+    36  UEFI Table            18
+        Sub-Header
+    36    Identifier          16  417a5dff-bf4b-4abc-a839-6593bb41f452
+    52    DataOffset           2                                    54
+
+    54  ADDR base pointer      8                                    62
+  ....................................................................
+    62  OVMF SDT Header       36                                zeroes
+        probe suppressor
+    98  VMGenID alignment      6                                zeroes
+        padding
+   104  generation ID         16                       128-bit VMGenID
+   120  fw_cfg blob         3976                                zeroes
+        padding
+  4096  <end of blob>
+
+The fw_cfg blob is divided into two parts conceptually (separated by the dotted
+line in the diagram). The first part, up to and excluding offset 62, is a
+"UEFI" ACPI Table, governed by the UEFI specification 2.5, Appendix O. The
+second part is mainly padding, but it also contains the generation ID.
+
+The "UEFI" ACPI Table -- in the first part -- is a "normal" ACPI table whose
+generic header is defined by the ACPI specification, but for which the UEFI
+spec defines the "UEFI" signature and adds two more fixed fields, "Identifier"
+and "DataOffset".
+
+- The Identifier field carries a 128-bit GUID, and enables firmware
+  implementors to install several "UEFI" tables with different internal
+  structures, enabling OSPM to tell them apart based on the (Type-)Identifier
+  GUID field.
+
+  For the purposes of QEMU's VMGenID implementation, we generated a new GUID
+  with the "uuidgen" utility. It should be different from all other
+  "Identifier" values, present and future, but otherwise no other software need
+  be aware of the concrete GUID value we generated.
+
+- The DataOffset field is just an offset into the table where the actual
+  (Identifier-specific) data starts.
+
+  For the purposes of QEMU's VMGenID implementation, we simply set it to the
+  next (QEMU-specific) field, "ADDR base pointer".
+
+Linker/loader commands
+----------------------
+
+The name of the fw_cfg blob is "etc/acpi/qemuparam". The ALLOCATE command that
+instructs the guest firmware to download this fw_cfg blob specifies an
+alignment of 4096, and the blob will have size 4096 too.
+
+An ADD_POINTER command links the "UEFI" ACPI Table at the start of the blob
+into the RSDT.
+
+Another ADD_POINTER command relocates the "ADDR base pointer" field to the
+absolute address of the "OVMF SDT Header probe suppressor" field, within the
+same blob.
+
+After this relocation, an ADD_CHECKSUM command updates the Checksum field,
+covering the entire "UEFI" ACPI Table (which extends up to and excluding offset
+62).
+
+Blob behavior under SeaBIOS
+---------------------------
+
+(Most of the complexity in the blob is ignored when the guest firmware is
+SeaBIOS.)
+
+- SeaBIOS's ACPI linker/loader client allocates the blob in normal RAM
+  (satisfying R1b).
+
+- Because the ALLOCATE command prescribes an alignment of 4KB, and the blob's
+  size is also 4KB, the allocation covers a standalone page frame in full
+  (satisfying R1e).
+
+- The 128-bit VMGenID field is located at offset 104 within that page,
+  resulting in a guest-physical address divisible by 8 (satisfying R1a).
+
+- The blob is marked as Reserved in the E820 map (satisfying R1c and R1d).
+
+- The "UEFI" ACPI Table at the start of the blob is linked into the RSDT,
+  in-place.
+
+- The "ADDR" AML method (see later) is allowed to refer to the "UEFI" ACPI
+  Table with the DataTableRegion operator, because the table is located in
+  memory marked as AddressRangeReserved.
+
+- The "ADDR base pointer" field points at "OVMF SDT Header probe suppressor",
+  which is right after the "UEFI" ACPI Table inside the blob. At OSPM runtime,
+  the "ADDR" AML method reads the "ADDR base pointer" field, and adds 42, to
+  arrive at the address of the VMGenID field.
+
+  blob @ page offset 0              RSDT
+  +-----------------------+         +-----+
+  | "UEFI" ACPI Table <---------+   | ... |
+  | +-------------------+ |     |   | ... |
+  | | ...               | |     +---- ... |
+  | | ...               | |         +-----+
+  | | ADDR base pointer -----+
+  | +-------------------+ |  |
+  | probe suppressor <-------+
+  | VMGenID @ offset 104  |
+  | padding               |
+  +-----------------------+
+
+Blob behavior under OVMF
+------------------------
+
+The complexity in the blob is required by the two-pass nature of OVMF's ACPI
+linker/loader client, which in turn comes from the fact that OVMF has to
+dissect blobs into individual ACPI tables vs. "other things", tracking the
+ADD_POINTER commands, so that tables can be installed individually, with
+EFI_ACPI_TABLE_PROTOCOL.
+
+- OVMF's ACPI linker/loader client allocates the blob in normal RAM (satisfying
+  R1b).
+
+- Because the ALLOCATE command prescribes an alignment of 4KB, and the blob's
+  size is also 4KB, the allocation covers a standalone page frame in full
+  (satisfying R1e).
+
+- The 128-bit VMGenID field is located at offset 104 within that page,
+  resulting in a guest-physical address divisible by 8 (satisfying R1a).
+
+- OVMF's ACPI linker/loader allocates the blob in EfiACPIMemoryNVS type memory,
+  therefore it is marked as such in the UEFI memmap (satisfying R1c and R1d).
+
+- OVMF identifies the "UEFI" ACPI Table at the start of the blob in the second
+  pass, following the ADD_POINTER command that is meant to link the table into
+  the RSDT. OVMF installs a *copy* of the "UEFI" ACPI Table with
+  EFI_ACPI_TABLE_PROTOCOL (linking the copy into both RSDT and XSDT). Given the
+  "UEFI" signature of the table, EFI_ACPI_TABLE_PROTOCOL places the copy of the
+  table in EfiACPIMemoryNVS type memory.
+
+- The "ADDR" AML method (see later) is allowed to refer to the "UEFI" ACPI
+  Table with the DataTableRegion operator, because the table is located in
+  memory marked as AddressRangeNVS.
+
+- The "ADDR base pointer" field inside the installed table points at "OVMF SDT
+  Header probe suppressor" in the original blob. Because this field is filled
+  with zeros, OVMF's table identification heuristic unconditionally reports a
+  negative when it tracks the relevant ADD_POINTER command to it in the second
+  pass. Therefore the blob is marked as "hosts something other than just ACPI
+  tables", and it is preserved permanently (in the same EfiACPIMemoryNVS type
+  memory where it has been originally allocated).
+
+  At OSPM runtime, the "ADDR" AML method reads the "ADDR base pointer" field,
+  and adds 42, to arrive at the address of the VMGenID field.
+
+  blob @ page offset 0               RSDT         XSDT
+  +-----------------------------+    +-----+      +-----+
+  | "UEFI" ACPI Table (in blob) |    | ... |      | ... |
+  | +-------------------------+ |    | ... ---+   | ... ---------------+
+  | |XXXXXXXXXXXXXXXXXXXXXXXXX| |    +-----+  |   +-----+              |
+  | |XXXXXXX [unused] XXXXXXXX| |             |                        |
+  | |XXXXXXXXXXXXXXXXXXXXXXXXX| |             +------------------------+
+  | +-------------------------+ |                                      |
+  | probe suppressor <-------------+  "UEFI" ACPI Table (installed) <--+
+  | VMGenID @ offset 104        |  |  +---------------------------+
+  | padding                     |  |  | ...                       |
+  +-----------------------------+  |  | ...                       |
+                                   +--- ADDR base pointer         |
+                                      +---------------------------+
+
+ACPI device, control methods
+----------------------------
+
+Requirements R2 through R6 of the VMGenID specification are satisfied with the
+following ACPI logic, exposed by QEMU's ACPI generator in one of the SSDTs, and
+installed by both guest firmwares as such.
+
+The basic idea is that, when the appropriate guest driver calls the ADDR method
+(see R4), OSPM locates the generation ID field in the 4KB blob that lives in
+E820 Reserved (SeaBIOS) or EfiACPIMemoryNVS type (OVMF) memory. The
+guest-physical address of the field is communicated to QEMU via IO ports
+[0x512..0x519] inclusive. Then QEMU is cued through IO port 0x51A to refresh
+(and keep refreshing when appropriate) the generation ID at the passed back
+address. Finally, the method returns the address to the guest driver too, in
+the format required by R4.
+
+    Scope(\_SB) {
+        Device (VMGI) {
+            /* satisfy R2 */
+            Name (_CID, "VM_Gen_Counter")
+
+            /* satisfy R3 */
+            Name (_DDN, "VM_Gen_Counter")
+
+            /* satisfy R6 */
+            Name (_HID, "QEMU0002")
+
+            /* the device owns this IO port range */
+            Name (_CRS, ResourceTemplate () {
+                IO (Decode16, 0x512, 0x512, 1, 9)
+            })
+
+            /* Device status: present, enabled & decoding resources, should be
+             * shown in the UI, functioning properly.
+             */
+            Name (_STA, 0xF)
+
+            /* Satisfy R4.
+             *
+             * This method is serialized because it creates named objects.
+             */
+            Method (ADDR, 0, Serialized) {
+                /* The 8-byte integer field defined as ADBP below is the
+                 * "ADDR base pointer" field in the UEFI ACPI Table.
+                 *
+                 * The DataTableRegion() operator locates that ACPI table by
+                 * scanning the RSDT/XSDT using the (SignatureString,
+                 * OemIDString, OemTableIDString) triplet as key.
+                 *
+                 * Windows XP would normally crash on the DataTableRegion()
+                 * operator, but it never calls the ADDR method, hence it never
+                 * reaches or evaluates DataTableRegion().
+                 */
+                DataTableRegion (TBLR, "UEFI", "BOCHS", "QEMUPARM")
+                Field (TBLR, AnyAcc, NoLock, Preserve) {
+                  Offset (54),
+                  ADBP, 64
+                }
+
+                /* This is the IO port range exposed in the _CRS above.
+                 *
+                 * The first two 4-byte ports are used to communicate the
+                 * 64-bit guest-physical address of the actual (relocated)
+                 * 128-bit generation ID field to QEMU, in little endian
+                 * encoding, so that QEMU can rewrite that field in guest RAM.
+                 *
+                 * A write to the last 1-byte port signals that the address has
+                 * been written fully, and QEMU is free to dereference it.
+                 */
+                OperationRegion (VMGR, SystemIO, 0x512, 9)
+                Field (VMGR, DWordAcc, NoLock, Preserve) {
+                    PTLO, 32,
+                    PTHI, 32,
+                    AccessAs (ByteAcc),
+                    DONE, 8
+                }
+
+                /* The ADBP field points to the "OVMF SDT Header probe
+                 * suppressor" area in the blob, at offset 62. In order to
+                 * arrive at the generation ID field at offset 104, we must add
+                 * 42 dynamically.
+                 *
+                 * The RESU buffer below will contain the result of the
+                 * addition. The ADFU field exposes it as an 8-byte integer
+                 * (for storing the sum), while the ADLO and ADHI fields enable
+                 * us to access the result in two separate 4-byte integers.
+                 * This exact integer width is especially important for
+                 * composing the package object that the ADDR method must
+                 * return.
+                 */
+                Name (RESU, Buffer (8) {})
+                CreateQWordField (RESU, 0, ADFU)
+                CreateDWordField (RESU, 0, ADLO)
+                CreateDWordField (RESU, 4, ADHI)
+
+                Add (ADBP, 42, ADFU)
+                Store (ADLO, PTLO)
+                Store (ADHI, PTHI)
+                Store (0, DONE)
+                Return (Package (2) { ADLO, ADHI })
+            }
+        }
+    }
+
+    /* satisfy R5 */
+    Scope (\_GPE) {
+        Method (_E04) {
+            Notify (\_SB.VMGI, 0x80)
+        }
+    }
-- 
1.8.3.1
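
A minimal sketch, not part of the patch, for checking the offsets in the
"Generation ID buffer design" table and the effect of the linker/loader
commands. The names build_blob(), relocate() and guest_base are hypothetical
and exist only for this illustration. Two assumptions are made: the Identifier
GUID is stored in the mixed-endian EFI byte order, and ADD_POINTER and
ADD_CHECKSUM are reduced to their effect on this single blob.

```python
import struct
import uuid

BLOB_SIZE = 4096
TABLE_LEN = 62           # "UEFI" ACPI Table: 36 + 18 + 8 bytes
CHECKSUM_OFFSET = 9
ADBP_OFFSET = 54         # "ADDR base pointer"
SUPPRESSOR_OFFSET = 62   # "OVMF SDT Header probe suppressor"
VMGENID_OFFSET = 104     # 128-bit generation ID
IDENTIFIER = uuid.UUID("417a5dff-bf4b-4abc-a839-6593bb41f452")


def build_blob(vmgenid: bytes) -> bytearray:
    """Return the 4096-byte blob as laid out in the table, before relocation."""
    if len(vmgenid) != 16:
        raise ValueError("the generation ID must be 16 bytes")
    table = struct.pack(
        "<4sIBB6s8sI4sI",
        b"UEFI", TABLE_LEN, 1, 0,      # Signature, Length, Revision, Checksum
        b"BOCHS ", b"QEMUPARM", 1,     # OEMID, OEM Table ID, OEM Revision
        b"BXPC", 1,                    # Creator ID, Creator Revision
    )
    # Assumption: GUID in mixed-endian EFI layout; DataOffset points at ADBP.
    table += struct.pack("<16sH", IDENTIFIER.bytes_le, ADBP_OFFSET)
    # Before relocation the pointer holds an offset within the blob.
    table += struct.pack("<Q", SUPPRESSOR_OFFSET)
    assert len(table) == TABLE_LEN
    blob = bytearray(BLOB_SIZE)        # everything else is zero padding
    blob[:TABLE_LEN] = table
    blob[VMGENID_OFFSET:VMGENID_OFFSET + 16] = vmgenid
    return blob


def relocate(blob: bytearray, guest_base: int) -> None:
    """Simplified ADD_POINTER followed by ADD_CHECKSUM, applied in place."""
    (offset_in_blob,) = struct.unpack_from("<Q", blob, ADBP_OFFSET)
    struct.pack_into("<Q", blob, ADBP_OFFSET, guest_base + offset_in_blob)
    # Make the bytes of the "UEFI" ACPI Table sum to zero modulo 256.
    blob[CHECKSUM_OFFSET] = 0
    blob[CHECKSUM_OFFSET] = -sum(blob[:TABLE_LEN]) & 0xFF
```

With a page-aligned guest_base, the relocated pointer plus 42 lands on offset
104 of the page, which is divisible by 8 as R1a requires.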

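The IO port handshake performed by the ADDR method can be modelled in the same
spirit. This is a hedged sketch only: the VmGenIdPorts class and addr_method()
are hypothetical stand-ins for the device model and the AML method, and they
do not reflect actual QEMU code. The model latches the 64-bit address only
when the 1-byte port at 0x51A is written, as described in the AML comments.

```python
import struct

PORT_BASE = 0x512   # PTLO at +0, PTHI at +4, DONE at +8


class VmGenIdPorts:
    """Hypothetical device-side model of the 9-byte IO port range."""

    def __init__(self):
        self.regs = bytearray(8)   # PTLO and PTHI, little endian
        self.latched = None        # address accepted after DONE is written

    def write(self, port: int, size: int, value: int) -> None:
        offset = port - PORT_BASE
        if offset in (0, 4) and size == 4:
            struct.pack_into("<I", self.regs, offset, value & 0xFFFFFFFF)
        elif offset == 8 and size == 1:
            # The address is complete; the device may now dereference it.
            (self.latched,) = struct.unpack("<Q", self.regs)
        else:
            raise ValueError("unsupported port access")


def addr_method(adbp: int, ports: VmGenIdPorts):
    """Mirror the ADDR method: add 42, publish the address, return halves."""
    addr = (adbp + 42) & 0xFFFFFFFFFFFFFFFF
    lo, hi = addr & 0xFFFFFFFF, addr >> 32
    ports.write(0x512, 4, lo)    # Store (ADLO, PTLO)
    ports.write(0x516, 4, hi)    # Store (ADHI, PTHI)
    ports.write(0x51A, 1, 0)     # Store (0, DONE)
    return (lo, hi)              # Package (2) { ADLO, ADHI }
```

Splitting the address into two 32-bit halves matches both the DWordAcc field
access and the two-element package that the method returns.
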

Thread overview: 37+ messages
2015-08-28 20:18 [Qemu-devel] [RFC] docs: describe QEMU's VMGenID design Laszlo Ersek
2015-09-01 19:47 ` Eric Blake
2015-09-01 22:05   ` Laszlo Ersek
2015-09-01 22:22     ` Eric Blake
2015-09-07 16:30       ` Paolo Bonzini
2015-09-03 13:49 ` Michael S. Tsirkin
2015-09-03 14:24   ` Laszlo Ersek
2015-09-13 11:56 ` [Qemu-devel] Windows does not support DataTableRegion at all [was: docs: describe QEMU's VMGenID design] Laszlo Ersek
2015-09-13 12:34   ` Michael S. Tsirkin
2015-09-13 12:57     ` Laszlo Ersek
2015-09-14  8:24     ` Igor Mammedov
2015-09-14 10:24       ` Laszlo Ersek
2015-09-14 16:53         ` [Qemu-devel] [edk2] " Bill Paul
2015-09-14 17:14           ` Moore, Robert
2015-09-14 17:23             ` Walz, Michael C
2015-09-14 18:04               ` Moore, Robert
2015-09-14 18:24               ` Laszlo Ersek
2015-09-15 10:49               ` Laszlo Ersek
2015-09-14 18:20           ` Laszlo Ersek
2015-09-14 21:12             ` Bill Paul
2015-09-15 10:49               ` Laszlo Ersek
2015-09-15 13:45                 ` Moore, Robert
2015-09-15 14:29                   ` Laszlo Ersek
2015-09-13 12:43   ` [Qemu-devel] [PATCH FYI 00/13] ACPI stuff for the DataTableRegion()-based VMGenID Laszlo Ersek
2015-09-13 12:43     ` [Qemu-devel] [PATCH FYI 01/13] docs: describe QEMU's VMGenID design Laszlo Ersek
2015-09-13 12:43     ` [Qemu-devel] [PATCH FYI 02/13] hw/acpi: add i386 callbacks for injecting GPE 04 when the VMGENID changes Laszlo Ersek
2015-09-13 12:43     ` [Qemu-devel] [PATCH FYI 03/13] hw/acpi: rename "AcpiBuildTables.table_data" to "main_blob" Laszlo Ersek
2015-09-13 12:43     ` [Qemu-devel] [PATCH FYI 04/13] hw/acpi: allow RSDT entries to be relocated to various fw_cfg blobs Laszlo Ersek
2015-09-13 12:43     ` [Qemu-devel] [PATCH FYI 05/13] hw/acpi: add more flexible acpi_add_table() and build_header() variants Laszlo Ersek
2015-09-13 12:43     ` [Qemu-devel] [PATCH FYI 06/13] hw/acpi: introduce ACPI_BUILD_QEMUPARAM_FILE Laszlo Ersek
2015-09-13 12:43     ` [Qemu-devel] [PATCH FYI 07/13] hw/acpi: introduce the AcpiQemuParamTable structure Laszlo Ersek
2015-09-13 12:43     ` [Qemu-devel] [PATCH FYI 08/13] hw/i386: build UEFI ACPI Data Table for VMGENID in the dedicated blob (WIP) Laszlo Ersek
2015-09-13 12:43     ` [Qemu-devel] [PATCH FYI 09/13] hw/acpi: expose more parameters for aml_method() Laszlo Ersek
2015-09-13 12:43     ` [Qemu-devel] [PATCH FYI 10/13] hw/acpi: add AML generator function for DataTableRegion() Laszlo Ersek
2015-09-13 12:43     ` [Qemu-devel] [PATCH FYI 11/13] hw/acpi: add AML generator function for AccessAs() Laszlo Ersek
2015-09-13 12:43     ` [Qemu-devel] [PATCH FYI 12/13] hw/acpi: add AML generator function for CreateQWordField() Laszlo Ersek
2015-09-13 12:43     ` [Qemu-devel] [PATCH FYI 13/13] hw/i386: generate AML for the VMGENID device (WIP) Laszlo Ersek
