From: "Marc-André Lureau" <marcandre.lureau@redhat.com>
To: Laszlo Ersek <lersek@redhat.com>
Cc: qemu-devel@nongnu.org, ehabkost@redhat.com,
"Michael S. Tsirkin" <mst@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
anderson@redhat.com, imammedo@redhat.com,
Richard Henderson <rth@twiddle.net>
Subject: Re: [Qemu-devel] [PATCH 2/7] acpi: add vmcoreinfo device
Date: Wed, 5 Jul 2017 09:54:49 -0400 (EDT) [thread overview]
Message-ID: <1945214498.47130577.1499262889217.JavaMail.zimbra@redhat.com> (raw)
In-Reply-To: <0c4c4552-1e82-3fa2-31fc-c266f5f23f91@redhat.com>
Hi
----- Original Message -----
> comments below
>
> On 06/29/17 15:23, Marc-André Lureau wrote:
> > The VM coreinfo (vmcoreinfo) device is an emulated device which
> > exposes a 4k memory range to the guest to store various informations
> > useful to debug the guest OS. (it is greatly inspired by the VMGENID
> > device implementation)
> >
> > This is an early-boot alternative to the qemu-ga VMDUMP_INFO event
> > proposed in "[PATCH 00/21] WIP: dump: add kaslr support".
> >
> > A proof-of-concept kernel module:
> > https://github.com/elmarco/vmgenid-test/blob/master/qemuvmci-test.c
> >
> > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> > ---
> > include/hw/acpi/aml-build.h | 1 +
> > include/hw/acpi/vmcoreinfo.h | 36 +++++++
> > hw/acpi/aml-build.c | 2 +
> > hw/acpi/vmcoreinfo.c | 198
> > +++++++++++++++++++++++++++++++++++++
> > hw/i386/acpi-build.c | 14 +++
> > default-configs/arm-softmmu.mak | 1 +
> > default-configs/i386-softmmu.mak | 1 +
> > default-configs/x86_64-softmmu.mak | 1 +
> > docs/specs/vmcoreinfo.txt | 138 ++++++++++++++++++++++++++
> > hw/acpi/Makefile.objs | 1 +
> > 10 files changed, 393 insertions(+)
> > create mode 100644 include/hw/acpi/vmcoreinfo.h
> > create mode 100644 hw/acpi/vmcoreinfo.c
> > create mode 100644 docs/specs/vmcoreinfo.txt
> >
> > diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
> > index 88d0738d76..cf781bcd34 100644
> > --- a/include/hw/acpi/aml-build.h
> > +++ b/include/hw/acpi/aml-build.h
> > @@ -211,6 +211,7 @@ struct AcpiBuildTables {
> > GArray *rsdp;
> > GArray *tcpalog;
> > GArray *vmgenid;
> > + GArray *vmcoreinfo;
> > BIOSLinker *linker;
> > } AcpiBuildTables;
> >
> > diff --git a/include/hw/acpi/vmcoreinfo.h b/include/hw/acpi/vmcoreinfo.h
> > new file mode 100644
> > index 0000000000..40fe99c3ed
> > --- /dev/null
> > +++ b/include/hw/acpi/vmcoreinfo.h
> > @@ -0,0 +1,36 @@
> > +#ifndef ACPI_VMCOREINFO_H
> > +#define ACPI_VMCOREINFO_H
> > +
> > +#include "hw/acpi/bios-linker-loader.h"
> > +#include "hw/qdev.h"
> > +
> > +#define VMCOREINFO_DEVICE "vmcoreinfo"
> > +#define VMCOREINFO_FW_CFG_FILE "etc/vmcoreinfo"
> > +#define VMCOREINFO_ADDR_FW_CFG_FILE "etc/vmcoreinfo-addr"
> > +
> > +#define VMCOREINFO_FW_CFG_SIZE 4096 /* Occupy a page of memory */
> > +#define VMCOREINFO_OFFSET 40 /* allow space for
> > + * OVMF SDT Header Probe
> > Supressor
> > + */
> > +
> > +#define VMCOREINFO(obj) OBJECT_CHECK(VmcoreinfoState, (obj),
> > VMCOREINFO_DEVICE)
> > +
> > +typedef struct VmcoreinfoState {
>
> I think this should be spelled with a bit more camel-casing, like
> VMCoreInfoState or some such.
>
ok
> > + DeviceClass parent_obj;
> > + uint8_t vmcoreinfo_addr_le[8]; /* Address of memory region */
> > + bool write_pointer_available;
> > +} VmcoreinfoState;
> > +
> > +/* returns NULL unless there is exactly one device */
> > +static inline Object *find_vmcoreinfo_dev(void)
> > +{
> > + return object_resolve_path_type("", VMCOREINFO_DEVICE, NULL);
> > +}
> > +
> > +void vmcoreinfo_build_acpi(VmcoreinfoState *vis, GArray *table_data,
> > + GArray *vmci, BIOSLinker *linker);
> > +void vmcoreinfo_add_fw_cfg(VmcoreinfoState *vis, FWCfgState *s, GArray
> > *vmci);
> > +bool vmcoreinfo_get(VmcoreinfoState *vis, uint64_t *paddr, uint64_t *size,
> > + Error **errp);
> > +
> > +#endif
> > diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
> > index 36a6cc450e..47043ade4a 100644
> > --- a/hw/acpi/aml-build.c
> > +++ b/hw/acpi/aml-build.c
> > @@ -1561,6 +1561,7 @@ void acpi_build_tables_init(AcpiBuildTables *tables)
> > tables->table_data = g_array_new(false, true /* clear */, 1);
> > tables->tcpalog = g_array_new(false, true /* clear */, 1);
> > tables->vmgenid = g_array_new(false, true /* clear */, 1);
> > + tables->vmcoreinfo = g_array_new(false, true /* clear */, 1);
> > tables->linker = bios_linker_loader_init();
> > }
> >
> > @@ -1571,6 +1572,7 @@ void acpi_build_tables_cleanup(AcpiBuildTables
> > *tables, bool mfre)
> > g_array_free(tables->table_data, true);
> > g_array_free(tables->tcpalog, mfre);
> > g_array_free(tables->vmgenid, mfre);
> > + g_array_free(tables->vmcoreinfo, mfre);
> > }
> >
> > /* Build rsdt table */
> > diff --git a/hw/acpi/vmcoreinfo.c b/hw/acpi/vmcoreinfo.c
> > new file mode 100644
> > index 0000000000..216e0bb83a
> > --- /dev/null
> > +++ b/hw/acpi/vmcoreinfo.c
> > @@ -0,0 +1,198 @@
> > +/*
> > + * Virtual Machine coreinfo device
> > + * (based on Virtual Machine Generation ID Device)
> > + *
> > + * Copyright (C) 2017 Red Hat, Inc.
> > + * Copyright (C) 2017 Skyport Systems.
> > + *
> > + * Authors: Marc-André Lureau <marcandre.lureau@redhat.com>
> > + * Ben Warren <ben@skyportsystems.com>
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or
> > later.
> > + * See the COPYING file in the top-level directory.
> > + *
> > + */
> > +#include "qemu/osdep.h"
> > +#include "hw/acpi/acpi.h"
> > +#include "hw/acpi/aml-build.h"
> > +#include "hw/acpi/vmcoreinfo.h"
> > +#include "hw/nvram/fw_cfg.h"
> > +#include "sysemu/sysemu.h"
> > +#include "qapi/error.h"
> > +
> > +void vmcoreinfo_build_acpi(VmcoreinfoState *vis, GArray *table_data,
> > + GArray *vmci, BIOSLinker *linker)
> > +{
> > + Aml *ssdt, *dev, *scope, *method, *addr, *if_ctx;
> > + uint32_t vgia_offset;
>
> This should be called "vcia_offset".
ok
>
> > +
> > + g_array_set_size(vmci, VMCOREINFO_FW_CFG_SIZE);
> > +
> > + /* Put this in a separate SSDT table */
> > + ssdt = init_aml_allocator();
> > +
> > + /* Reserve space for header */
> > + acpi_data_push(ssdt->buf, sizeof(AcpiTableHeader));
> > +
> > + /* Storage address */
> > + vgia_offset = table_data->len +
> > + build_append_named_dword(ssdt->buf, "VCIA");
> > + scope = aml_scope("\\_SB");
> > + dev = aml_device("VMCI");
> > + aml_append(dev, aml_name_decl("_HID", aml_string("QEMUVMCI")));
> > +
> > + /* Simple status method to check that address is linked and non-zero
> > */
> > + method = aml_method("_STA", 0, AML_NOTSERIALIZED);
> > + addr = aml_local(0);
> > + aml_append(method, aml_store(aml_int(0xf), addr));
> > + if_ctx = aml_if(aml_equal(aml_name("VCIA"), aml_int(0)));
> > + aml_append(if_ctx, aml_store(aml_int(0), addr));
> > + aml_append(method, if_ctx);
> > + aml_append(method, aml_return(addr));
> > + aml_append(dev, method);
> > +
> > + /* the ADDR method returns two 32-bit words representing the lower and
> > + * upper halves of the physical address of the vmcoreinfo area
> > + */
> > + method = aml_method("ADDR", 0, AML_NOTSERIALIZED);
> > +
> > + addr = aml_local(0);
> > + aml_append(method, aml_store(aml_package(2), addr));
> > +
> > + aml_append(method, aml_store(aml_add(aml_name("VCIA"),
> > + aml_int(VMCOREINFO_OFFSET),
> > NULL),
> > + aml_index(addr, aml_int(0))));
> > + aml_append(method, aml_store(aml_int(0), aml_index(addr,
> > aml_int(1))));
> > + aml_append(method, aml_return(addr));
> > +
> > + aml_append(dev, method);
> > + aml_append(scope, dev);
> > + aml_append(ssdt, scope);
> > +
> > + g_array_append_vals(table_data, ssdt->buf->data, ssdt->buf->len);
> > +
> > + /* Allocate guest memory */
> > + bios_linker_loader_alloc(linker, VMCOREINFO_FW_CFG_FILE, vmci, 4096,
> > + false /* page boundary, high memory */);
> > +
> > + /* Patch address of vmcoreinfo fw_cfg blob into the ADDR fw_cfg
> > + * blob so QEMU can read the info from there. The address is
> > + * expected to be < 4GB, but write 64 bits anyway.
> > + * The address that is patched in is offset in order to implement
> > + * the "OVMF SDT Header probe suppressor"
> > + * see docs/specs/vmcoreinfo.txt for more details.
> > + */
> > + bios_linker_loader_write_pointer(linker,
> > + VMCOREINFO_ADDR_FW_CFG_FILE, 0, sizeof(uint64_t),
> > + VMCOREINFO_FW_CFG_FILE, VMCOREINFO_OFFSET);
> > +
> > + /* Patch address of vmcoreinfo into the AML so OSPM can retrieve
> > + * and read it. Note that while we provide storage for 64 bits, only
> > + * the least-signficant 32 get patched into AML.
> > + */
> > + bios_linker_loader_add_pointer(linker,
> > + ACPI_BUILD_TABLE_FILE, vgia_offset, sizeof(uint32_t),
> > + VMCOREINFO_FW_CFG_FILE, 0);
> > +
> > + build_header(linker, table_data,
> > + (void *)(table_data->data + table_data->len - ssdt->buf->len),
> > + "SSDT", ssdt->buf->len, 1, NULL, "VMCOREIN");
> > + free_aml_allocator();
> > +}
> > +
> > +void vmcoreinfo_add_fw_cfg(VmcoreinfoState *vis, FWCfgState *s, GArray
> > *vmci)
> > +{
> > + /* Create a read-only fw_cfg file for vmcoreinfo allocation */
> > + /* XXX: linker could learn to allocate without backing fw_cfg? */
>
> Yes, and a number of other things, as I'm sure Igor and MST will point
> out upon reading the patch :)
>
> > + fw_cfg_add_file(s, VMCOREINFO_FW_CFG_FILE, vmci->data,
> > + VMCOREINFO_FW_CFG_SIZE);
> > + /* Create a read-write fw_cfg file for Address */
> > + fw_cfg_add_file_callback(s, VMCOREINFO_ADDR_FW_CFG_FILE, NULL, NULL,
> > + vis->vmcoreinfo_addr_le,
> > + ARRAY_SIZE(vis->vmcoreinfo_addr_le), false);
> > +}
> > +
> > +bool vmcoreinfo_get(VmcoreinfoState *vis,
> > + uint64_t *paddr, uint64_t *size,
> > + Error **errp)
> > +{
> > + uint32_t vmcoreinfo_addr;
> > + uint32_t version;
> > +
> > + assert(vis);
> > + assert(paddr);
> > + assert(size);
> > +
> > + memcpy(&vmcoreinfo_addr, vis->vmcoreinfo_addr_le,
> > sizeof(vmcoreinfo_addr));
> > + vmcoreinfo_addr = le32_to_cpu(vmcoreinfo_addr);
> > + if (!vmcoreinfo_addr) {
> > + error_setg(errp, "BIOS has not yet written the address of %s",
> > + VMCOREINFO_DEVICE);
> > + return false;
> > + }
> > +
> > + cpu_physical_memory_read(vmcoreinfo_addr, &version, sizeof(version));
> > + if (version != 0) {
> > + error_setg(errp, "Unknown %s memory version", VMCOREINFO_DEVICE);
> > + return false;
> > + }
> > +
> > + cpu_physical_memory_read(vmcoreinfo_addr + 4, paddr, sizeof(paddr));
>
> This is wrong, it should be sizeof(*paddr).
right
>
> > + *paddr = le64_to_cpu(*paddr);
> > + cpu_physical_memory_read(vmcoreinfo_addr + 12, size, sizeof(size));
>
> This is wrong for two reasons:
> - first, it should be sizeof(*size),
ok
> - second, sizeof(*size) is 8, however, according to the design, the size
> field is 4 bytes wide. I think the function prototype should be
> updated to fix this.
>
yes
> > + *size = le32_to_cpu(*size);
> > +
> > + return true;
> > +}
> > +
> > +static const VMStateDescription vmstate_vmcoreinfo = {
> > + .name = "vmcoreinfo",
> > + .version_id = 1,
> > + .minimum_version_id = 1,
> > + .fields = (VMStateField[]) {
> > + VMSTATE_UINT8_ARRAY(vmcoreinfo_addr_le, VmcoreinfoState,
> > sizeof(uint64_t)),
> > + VMSTATE_END_OF_LIST()
> > + },
> > +};
> > +
> > +static void vmcoreinfo_realize(DeviceState *dev, Error **errp)
> > +{
> > + if (!bios_linker_loader_can_write_pointer()) {
> > + error_setg(errp, "%s requires DMA write support in fw_cfg, "
> > + "which this machine type does not provide",
> > + VMCOREINFO_DEVICE);
> > + return;
> > + }
> > +
> > + /* Given that this function is executing, there is at least one
> > VMCOREINFO
> > + * device. Check if there are several.
> > + */
> > + if (!find_vmcoreinfo_dev()) {
> > + error_setg(errp, "at most one %s device is permitted",
> > + VMCOREINFO_DEVICE);
> > + return;
> > + }
> > +}
>
> You didn't copy the reset logic from vmgenid, but that's wrong -- all
> device models that produce WRITE_POINTER linker/loader commands must
> forget about guest-returned GPAs upon reset. Think of it this way: when
> the guest executes WRITE_POINTER, it creates a reference to guest-phys
> memory in QEMU; and when the guest initiates a reboot, guest-phys memory
> is fully "invalidated", so all such references must be dropped in QEMU.
>
> (S3 resume is an exception, because guest memory is preserved, but both
> SeaBIOS and OVMF handle that specially -- they stash the WRITE_POINTER
> commands in a condensed format during normal boot, and then replay them
> during S3 resume. Thus, they restore the QEMU references discussed
> abovem at S3 resume.)
>
> In more practical terms, assume the guest boots Linux, the address is
> passed back fine, then the Linux guest reboots, and then some other
> guest OS is launched from the guest firmware and/or bootloader (or we
> just stay in the firmware / bootloader indefinitely). If a dump is
> requested at this point, QEMU shouldn't go looking for the VMCOREINFO
> ELF note at the previously communicated address.
>
Good point, I didn't realize it.
> The rest looks OK to me.
Thanks
next prev parent reply other threads:[~2017-07-05 13:55 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-06-29 13:23 [Qemu-devel] [PATCH 0/7] KASLR kernel dump support Marc-André Lureau
2017-06-29 13:23 ` [Qemu-devel] [PATCH 1/7] vmgenid: replace x-write-pointer-available hack Marc-André Lureau
2017-06-29 14:11 ` Michael S. Tsirkin
2017-07-02 3:09 ` Ben Warren
2017-07-03 14:48 ` Eduardo Habkost
2017-07-03 18:06 ` Laszlo Ersek
2017-07-03 18:27 ` Eduardo Habkost
2017-07-03 18:35 ` Laszlo Ersek
2017-07-03 18:21 ` Michael S. Tsirkin
2017-07-03 18:38 ` Michael S. Tsirkin
2017-07-03 18:50 ` Eduardo Habkost
2017-07-03 19:51 ` Michael S. Tsirkin
2017-06-29 13:23 ` [Qemu-devel] [PATCH 2/7] acpi: add vmcoreinfo device Marc-André Lureau
2017-07-04 22:07 ` Laszlo Ersek
2017-07-05 13:54 ` Marc-André Lureau [this message]
2017-06-29 13:23 ` [Qemu-devel] [PATCH 3/7] tests: add simple vmcoreinfo test Marc-André Lureau
2017-06-29 13:23 ` [Qemu-devel] [PATCH 4/7] dump: add vmcoreinfo ELF note Marc-André Lureau
2017-07-04 23:48 ` Laszlo Ersek
2017-07-05 21:52 ` Marc-André Lureau
2017-07-06 10:29 ` Laszlo Ersek
2017-06-29 13:23 ` [Qemu-devel] [PATCH 5/7] kdump: " Marc-André Lureau
2017-07-05 0:07 ` Laszlo Ersek
2017-07-06 10:05 ` Marc-André Lureau
2017-06-29 13:23 ` [Qemu-devel] [PATCH 6/7] scripts/dump-guest-memory.py: add vmcoreinfo Marc-André Lureau
2017-07-05 0:22 ` Laszlo Ersek
2017-07-05 9:58 ` Marc-André Lureau
2017-07-05 11:05 ` Laszlo Ersek
2017-06-29 13:23 ` [Qemu-devel] [PATCH 7/7] MAINTAINERS: add Dump maintainers Marc-André Lureau
2017-07-05 0:26 ` Laszlo Ersek
2017-07-06 9:54 ` Marc-André Lureau
2017-07-06 10:17 ` Laszlo Ersek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1945214498.47130577.1499262889217.JavaMail.zimbra@redhat.com \
--to=marcandre.lureau@redhat.com \
--cc=anderson@redhat.com \
--cc=ehabkost@redhat.com \
--cc=imammedo@redhat.com \
--cc=lersek@redhat.com \
--cc=mst@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=rth@twiddle.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.