From: Sergio Lopez <slp@redhat.com>
To: Stefano Garzarella <sgarzare@redhat.com>
Cc: ehabkost@redhat.com, maran.wilson@oracle.com, mst@redhat.com,
qemu-devel@nongnu.org, pbonzini@redhat.com, rth@twiddle.net
Subject: Re: [Qemu-devel] [PATCH v2 4/4] hw/i386: Introduce the microvm machine type
Date: Tue, 02 Jul 2019 10:47:08 +0200
Message-ID: <877e90ygab.fsf@redhat.com>
In-Reply-To: <20190702081914.ulccsaokivd6epgv@steredhat>
Stefano Garzarella <sgarzare@redhat.com> writes:
> On Mon, Jul 01, 2019 at 04:47:05PM +0200, Sergio Lopez wrote:
>> Microvm is a machine type inspired by both NEMU and Firecracker, and
>> constructed after the machine model implemented by the latter.
>>
>> Its main purpose is providing users a KVM-only machine type with fast
>> boot times, minimal attack surface (measured as the number of IO ports
>> and MMIO regions exposed to the Guest) and small footprint (especially
>> when combined with the ongoing QEMU modularization effort).
>>
>> Normally, other than the device support provided by KVM itself,
>> microvm only supports virtio-mmio devices. Microvm also includes a
>> legacy mode, which adds an ISA bus with a 16550A serial port, useful
>> for being able to see the early boot kernel messages.
>>
>> Microvm only supports booting PVH-enabled Linux ELF images. Booting
>> other PVH-enabled kernels may be possible, but due to the lack of ACPI
>> and firmware, we're relying on the command line for specifying the
>> location of the virtio-mmio transports. If there is interest in
>> using this machine type with other kernels, we'll try to find some
>> kind of middle ground solution.
>>
>> Signed-off-by: Sergio Lopez <slp@redhat.com>
>> ---
>> default-configs/i386-softmmu.mak | 1 +
>> hw/i386/Kconfig | 4 +
>> hw/i386/Makefile.objs | 1 +
>> hw/i386/microvm.c | 500 +++++++++++++++++++++++++++++++
>> include/hw/i386/microvm.h | 77 +++++
>> 5 files changed, 583 insertions(+)
>> create mode 100644 hw/i386/microvm.c
>> create mode 100644 include/hw/i386/microvm.h
>>
>> diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak
>> index cd5ea391e8..338f07420f 100644
>> --- a/default-configs/i386-softmmu.mak
>> +++ b/default-configs/i386-softmmu.mak
>> @@ -26,3 +26,4 @@ CONFIG_ISAPC=y
>> CONFIG_I440FX=y
>> CONFIG_Q35=y
>> CONFIG_ACPI_PCI=y
>> +CONFIG_MICROVM=y
>> diff --git a/hw/i386/Kconfig b/hw/i386/Kconfig
>> index 9817888216..94c565d8db 100644
>> --- a/hw/i386/Kconfig
>> +++ b/hw/i386/Kconfig
>> @@ -87,6 +87,10 @@ config Q35
>> select VMMOUSE
>> select FW_CFG_DMA
>>
>> +config MICROVM
>> + bool
>> + select VIRTIO_MMIO
>> +
>> config VTD
>> bool
>>
>> diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs
>> index c5f20bbd72..7bffca413e 100644
>> --- a/hw/i386/Makefile.objs
>> +++ b/hw/i386/Makefile.objs
>> @@ -4,6 +4,7 @@ obj-y += pvh.o
>> obj-y += pc.o
>> obj-$(CONFIG_I440FX) += pc_piix.o
>> obj-$(CONFIG_Q35) += pc_q35.o
>> +obj-$(CONFIG_MICROVM) += mptable.o microvm.o
>> obj-y += fw_cfg.o pc_sysfw.o
>> obj-y += x86-iommu.o
>> obj-$(CONFIG_VTD) += intel_iommu.o
>> diff --git a/hw/i386/microvm.c b/hw/i386/microvm.c
>> new file mode 100644
>> index 0000000000..8b5efe9e45
>> --- /dev/null
>> +++ b/hw/i386/microvm.c
>> @@ -0,0 +1,500 @@
>> +/*
>> + * Copyright (c) 2018 Intel Corporation
>> + * Copyright (c) 2019 Red Hat, Inc.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2 or later, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
>> + * more details.
>> + *
>> + * You should have received a copy of the GNU General Public License along with
>> + * this program. If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#include "qemu/osdep.h"
>> +#include "qemu/error-report.h"
>> +#include "qemu/cutils.h"
>> +#include "qapi/error.h"
>> +#include "qapi/visitor.h"
>> +#include "sysemu/sysemu.h"
>> +#include "sysemu/cpus.h"
>> +#include "sysemu/numa.h"
>> +
>> +#include "hw/loader.h"
>> +#include "hw/nmi.h"
>> +#include "hw/kvm/clock.h"
>> +#include "hw/i386/microvm.h"
>> +#include "hw/i386/pc.h"
>> +#include "target/i386/cpu.h"
>> +#include "hw/timer/i8254.h"
>> +#include "hw/char/serial.h"
>> +#include "hw/i386/topology.h"
>> +#include "hw/virtio/virtio-mmio.h"
>> +#include "hw/i386/mptable.h"
>> +
>> +#include "cpu.h"
>> +#include "elf.h"
>> +#include "pvh.h"
>> +#include "kvm_i386.h"
>> +#include "hw/xen/start_info.h"
>> +
>> +static void microvm_gsi_handler(void *opaque, int n, int level)
>> +{
>> + qemu_irq *ioapic_irq = opaque;
>> +
>> + qemu_set_irq(ioapic_irq[n], level);
>> +}
>> +
>> +static void microvm_legacy_init(MicrovmMachineState *mms)
>> +{
>> + ISABus *isa_bus;
>> + GSIState *gsi_state;
>> + qemu_irq *i8259;
>> + int i;
>> +
>> + assert(kvm_irqchip_in_kernel());
>> + gsi_state = g_malloc0(sizeof(*gsi_state));
>> + mms->gsi = qemu_allocate_irqs(gsi_handler, gsi_state, GSI_NUM_PINS);
>> +
>> + isa_bus = isa_bus_new(NULL, get_system_memory(), get_system_io(),
>> + &error_abort);
>> + isa_bus_irqs(isa_bus, mms->gsi);
>> +
>> + assert(kvm_pic_in_kernel());
>> + i8259 = kvm_i8259_init(isa_bus);
>> +
>> + for (i = 0; i < ISA_NUM_IRQS; i++) {
>> + gsi_state->i8259_irq[i] = i8259[i];
>> + }
>> +
>> + kvm_pit_init(isa_bus, 0x40);
>> +
>> + for (i = 0; i < VIRTIO_NUM_TRANSPORTS; i++) {
>> + int nirq = VIRTIO_IRQ_BASE + i;
>> + ISADevice *isadev = isa_create(isa_bus, TYPE_ISA_SERIAL);
>> + qemu_irq mmio_irq;
>> +
>> + isa_init_irq(isadev, &mmio_irq, nirq);
>> + sysbus_create_simple("virtio-mmio",
>> + VIRTIO_MMIO_BASE + i * 512,
>> + mms->gsi[VIRTIO_IRQ_BASE + i]);
>> + }
>> +
>> + g_free(i8259);
>> +
>> + serial_hds_isa_init(isa_bus, 0, 1);
>> +}
>> +
>> +static void microvm_ioapic_init(MicrovmMachineState *mms)
>> +{
>> + qemu_irq *ioapic_irq;
>> + DeviceState *ioapic_dev;
>> + SysBusDevice *d;
>> + int i;
>> +
>> + assert(kvm_irqchip_in_kernel());
>> + ioapic_irq = g_new0(qemu_irq, IOAPIC_NUM_PINS);
>> + kvm_pc_setup_irq_routing(true);
>> +
>> + assert(kvm_ioapic_in_kernel());
>> + ioapic_dev = qdev_create(NULL, "kvm-ioapic");
>> +
>> + object_property_add_child(qdev_get_machine(),
>> + "ioapic", OBJECT(ioapic_dev), NULL);
>> +
>> + qdev_init_nofail(ioapic_dev);
>> + d = SYS_BUS_DEVICE(ioapic_dev);
>> + sysbus_mmio_map(d, 0, IO_APIC_DEFAULT_ADDRESS);
>> +
>> + for (i = 0; i < IOAPIC_NUM_PINS; i++) {
>> + ioapic_irq[i] = qdev_get_gpio_in(ioapic_dev, i);
>> + }
>> +
>> + mms->gsi = qemu_allocate_irqs(microvm_gsi_handler,
>> + ioapic_irq, IOAPIC_NUM_PINS);
>> +
>> + for (i = 0; i < VIRTIO_NUM_TRANSPORTS; i++) {
>> + sysbus_create_simple("virtio-mmio",
>> + VIRTIO_MMIO_BASE + i * 512,
>> + mms->gsi[VIRTIO_IRQ_BASE + i]);
>> + }
>> +}
>> +
>> +static void microvm_memory_init(MicrovmMachineState *mms)
>> +{
>> + MachineState *machine = MACHINE(mms);
>> + MemoryRegion *ram, *ram_below_4g, *ram_above_4g;
>> + MemoryRegion *system_memory = get_system_memory();
>> +
>> + if (machine->ram_size > MICROVM_MAX_BELOW_4G) {
>> + mms->above_4g_mem_size = machine->ram_size - MICROVM_MAX_BELOW_4G;
>> + mms->below_4g_mem_size = MICROVM_MAX_BELOW_4G;
>> + } else {
>> + mms->above_4g_mem_size = 0;
>> + mms->below_4g_mem_size = machine->ram_size;
>> + }
>> +
>> + ram = g_malloc(sizeof(*ram));
>> + memory_region_allocate_system_memory(ram, NULL, "microvm.ram",
>> + machine->ram_size);
>> +
>> + ram_below_4g = g_malloc(sizeof(*ram_below_4g));
>> + memory_region_init_alias(ram_below_4g, NULL, "ram-below-4g", ram,
>> + 0, mms->below_4g_mem_size);
>> + memory_region_add_subregion(system_memory, 0, ram_below_4g);
>> +
>> + e820_add_entry(0, mms->below_4g_mem_size, E820_RAM);
>> +
>> + if (mms->above_4g_mem_size > 0) {
>> + ram_above_4g = g_malloc(sizeof(*ram_above_4g));
>> + memory_region_init_alias(ram_above_4g, NULL, "ram-above-4g", ram,
>> + mms->below_4g_mem_size,
>> + mms->above_4g_mem_size);
>> + memory_region_add_subregion(system_memory, 0x100000000ULL,
>> + ram_above_4g);
>> + e820_add_entry(0x100000000ULL, mms->above_4g_mem_size, E820_RAM);
>> + }
>> +}
>> +
>> +static void microvm_cpus_init(const char *typename, Error **errp)
>> +{
>> + int i;
>> +
>> + for (i = 0; i < smp_cpus; i++) {
>> + Object *cpu = NULL;
>> + Error *local_err = NULL;
>> +
>> + cpu = object_new(typename);
>> +
>> + object_property_set_uint(cpu, i, "apic-id", &local_err);
>> + object_property_set_bool(cpu, true, "realized", &local_err);
>> +
>> + object_unref(cpu);
>> + error_propagate(errp, local_err);
>> + }
>> +}
>> +
>> +static void microvm_machine_state_init(MachineState *machine)
>> +{
>> + MicrovmMachineState *mms = MICROVM_MACHINE(machine);
>> + Error *local_err = NULL;
>> +
>> + if (machine->kernel_filename == NULL) {
>> + error_report("missing kernel image file name, required by microvm");
>> + exit(1);
>> + }
>
> Could it be useful to support initrd as well?
>
> I'm thinking of the possibility for a microvm to use only the initrd
> without a block device.
I agree, thanks for the suggestion. I'll add support for it.
Sergio.
> In this case, Linux expects the initrd address and size in the first
> element of the modlist in the 'struct hvm_start_info'.
>
> See pc-bios/optionrom/pvh_main.c
>
> Cheers,
> Stefano
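
For reference, a minimal sketch of what describing the initrd through the
modlist could look like. The two struct layouts mirror Xen's public
start_info.h (include/hw/xen/start_info.h in the QEMU tree); the helper
name pvh_describe_initrd() and its parameters are hypothetical and not
part of this patch, and the guest physical addresses are assumed to have
been chosen by the caller.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Layouts as published in Xen's start_info.h. */
#define XEN_HVM_START_MAGIC_VALUE 0x336ec578

struct hvm_start_info {
    uint32_t magic;          /* XEN_HVM_START_MAGIC_VALUE */
    uint32_t version;
    uint32_t flags;
    uint32_t nr_modules;     /* number of entries in the modlist */
    uint64_t modlist_paddr;  /* guest physical address of the modlist */
    uint64_t cmdline_paddr;
    uint64_t rsdp_paddr;
    uint64_t memmap_paddr;   /* version 1 and later */
    uint32_t memmap_entries; /* version 1 and later */
    uint32_t reserved;
};

struct hvm_modlist_entry {
    uint64_t paddr;          /* guest physical address of the module */
    uint64_t size;           /* size of the module in bytes */
    uint64_t cmdline_paddr;
    uint64_t reserved;
};

/*
 * Hypothetical helper: record an initrd already loaded at initrd_gpa as
 * the first (and only) module. modlist_gpa is the guest physical address
 * where the caller will later copy *mod.
 */
static void pvh_describe_initrd(struct hvm_start_info *info,
                                struct hvm_modlist_entry *mod,
                                uint64_t modlist_gpa,
                                uint64_t initrd_gpa,
                                uint64_t initrd_size)
{
    assert(info != NULL && mod != NULL);

    memset(mod, 0, sizeof(*mod));
    mod->paddr = initrd_gpa;
    mod->size = initrd_size;

    info->magic = XEN_HVM_START_MAGIC_VALUE;
    info->nr_modules = 1;
    info->modlist_paddr = modlist_gpa;
}
```

The sketch only fills in the host-side copies of the structures; copying
them and the initrd itself into guest memory is left out.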