From: "Michael S. Tsirkin" <mst@redhat.com>
To: David Hildenbrand <david@redhat.com>
Cc: qemu-devel@nongnu.org, Juraj Marcin <jmarcin@redhat.com>,
Peter Maydell <peter.maydell@linaro.org>
Subject: Re: [PATCH v1] virtio-mem: unplug memory only during system resets, not device resets
Date: Fri, 8 Nov 2024 06:52:38 -0500 [thread overview]
Message-ID: <20241108065204-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <20241025104103.342188-1-david@redhat.com>
On Fri, Oct 25, 2024 at 12:41:03PM +0200, David Hildenbrand wrote:
> We recently converted from the LegacyReset to the new reset framework
> in commit c009a311e939 ("virtio-mem: Use new Resettable framework instead
> of LegacyReset") to be able to use the ResetType to filter out wakeup
> resets.
>
> However, this change had an undesired implications: as we override the
> Resettable interface methods in VirtIOMEMClass, the reset handler will
> not only get called during system resets (i.e., qemu_devices_reset())
> but also during any direct or indirect device rests (e.g.,
> device_cold_reset()).
>
> Further, we might now receive two reset callbacks during
> qemu_devices_reset(), first when reset by a parent and later when reset
> directly.
>
> The memory state of virtio-mem devices is rather special: it's supposed to
> be persistent/unchanged during most resets (similar to resetting a hard
> disk will not destroy the data), unless actually cold-resetting the whole
> system (different to a hard disk where a reboot will not destroy the data):
> ripping out system RAM is something guest OSes don't particularly enjoy,
> but we want to detect when rebooting to an OS that does not support
> virtio-mem and wouldn't be able to detect+use the memory -- and we want
> to force-defragment hotplugged memory to also shrink the usable device
> memory region. So we rally want to catch system resets to do that.
>
> On supported targets (e.g., x86), getting a cold reset on the
> device/parent triggers is not that easy (but looks like PCI code
> might trigger it), so this implication went unnoticed.
>
> However, with upcoming s390x support it is problematic: during
> kdump, s390x triggers a subsystem reset, ending up in
> s390_machine_reset() and calling only subsystem_reset() instead of
> qemu_devices_reset() -- because it's not a full system reset.
>
> In subsystem_reset(), s390x performs a device_cold_reset() of any
> TYPE_VIRTUAL_CSS_BRIDGE device, which ends up resetting all children,
> including the virtio-mem device. Consequently, we wrongly detect a system
> reset and unplug all device memory, resulting in hotplugged memory not
> getting included in the crash dump -- undesired.
>
> We really must not mess with hotplugged memory state during simple
> device resets. To fix, create+register a new reset object that will only
> get triggered during qemu_devices_reset() calls, but not during any other
> resets as it is logically not the child of any other object.
>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Juraj Marcin <jmarcin@redhat.com>
> Cc: Peter Maydell <peter.maydell@linaro.org>
> Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> hw/virtio/virtio-mem.c | 103 +++++++++++++++++++++++----------
> include/hw/virtio/virtio-mem.h | 12 +++-
> 2 files changed, 83 insertions(+), 32 deletions(-)
>
> diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
> index ae1e81d7ba..08e0e9da1c 100644
> --- a/hw/virtio/virtio-mem.c
> +++ b/hw/virtio/virtio-mem.c
> @@ -956,6 +956,7 @@ static void virtio_mem_device_realize(DeviceState *dev, Error **errp)
> VirtIOMEM *vmem = VIRTIO_MEM(dev);
> uint64_t page_size;
> RAMBlock *rb;
> + Object *obj;
> int ret;
>
> if (!vmem->memdev) {
> @@ -1121,7 +1122,28 @@ static void virtio_mem_device_realize(DeviceState *dev, Error **errp)
> vmstate_register_any(VMSTATE_IF(vmem),
> &vmstate_virtio_mem_device_early, vmem);
> }
> - qemu_register_resettable(OBJECT(vmem));
> +
> + /*
> + * We only want to unplug all memory to start with a clean slate when
> + * it is safe for the guest -- during system resets that call
> + * qemu_devices_reset().
> + *
> + * We'll filter out selected qemu_devices_reset() calls used for other
> + * purposes, like resetting all devices during wakeup from suspend on
> + * x86 based on the reset type passed to qemu_devices_reset().
> + *
> + * Unplugging all memory during simple device resets can result in the VM
> + * unexpectedly losing RAM, corrupting VM state.
> + *
> + * Simple device resets (or resets triggered by getting a parent device
> + * reset) must not change the state of plugged memory blocks. Therefore,
> + * we need a dedicated reset object that only gets called during
> + * qemu_devices_reset().
> + */
> + obj = object_new(TYPE_VIRTIO_MEM_SYSTEM_RESET);
> + vmem->system_reset = VIRTIO_MEM_SYSTEM_RESET(obj);
> + vmem->system_reset->vmem = vmem;
> + qemu_register_resettable(obj);
>
> /*
> * Set ourselves as RamDiscardManager before the plug handler maps the
> @@ -1141,7 +1163,10 @@ static void virtio_mem_device_unrealize(DeviceState *dev)
> * found via an address space anymore. Unset ourselves.
> */
> memory_region_set_ram_discard_manager(&vmem->memdev->mr, NULL);
> - qemu_unregister_resettable(OBJECT(vmem));
> +
> + qemu_unregister_resettable(OBJECT(vmem->system_reset));
> + object_unref(OBJECT(vmem->system_reset));
> +
> if (vmem->early_migration) {
> vmstate_unregister(VMSTATE_IF(vmem), &vmstate_virtio_mem_device_early,
> vmem);
> @@ -1841,38 +1866,12 @@ static void virtio_mem_unplug_request_check(VirtIOMEM *vmem, Error **errp)
> }
> }
>
> -static ResettableState *virtio_mem_get_reset_state(Object *obj)
> -{
> - VirtIOMEM *vmem = VIRTIO_MEM(obj);
> - return &vmem->reset_state;
> -}
> -
> -static void virtio_mem_system_reset_hold(Object *obj, ResetType type)
> -{
> - VirtIOMEM *vmem = VIRTIO_MEM(obj);
> -
> - /*
> - * When waking up from standby/suspend-to-ram, do not unplug any memory.
> - */
> - if (type == RESET_TYPE_WAKEUP) {
> - return;
> - }
> -
> - /*
> - * During usual resets, we will unplug all memory and shrink the usable
> - * region size. This is, however, not possible in all scenarios. Then,
> - * the guest has to deal with this manually (VIRTIO_MEM_REQ_UNPLUG_ALL).
> - */
> - virtio_mem_unplug_all(vmem);
> -}
> -
> static void virtio_mem_class_init(ObjectClass *klass, void *data)
> {
> DeviceClass *dc = DEVICE_CLASS(klass);
> VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass);
> VirtIOMEMClass *vmc = VIRTIO_MEM_CLASS(klass);
> RamDiscardManagerClass *rdmc = RAM_DISCARD_MANAGER_CLASS(klass);
> - ResettableClass *rc = RESETTABLE_CLASS(klass);
>
> device_class_set_props(dc, virtio_mem_properties);
> dc->vmsd = &vmstate_virtio_mem;
> @@ -1899,9 +1898,6 @@ static void virtio_mem_class_init(ObjectClass *klass, void *data)
> rdmc->replay_discarded = virtio_mem_rdm_replay_discarded;
> rdmc->register_listener = virtio_mem_rdm_register_listener;
> rdmc->unregister_listener = virtio_mem_rdm_unregister_listener;
> -
> - rc->get_state = virtio_mem_get_reset_state;
> - rc->phases.hold = virtio_mem_system_reset_hold;
> }
>
> static const TypeInfo virtio_mem_info = {
> @@ -1924,3 +1920,48 @@ static void virtio_register_types(void)
> }
>
> type_init(virtio_register_types)
> +
> +OBJECT_DEFINE_SIMPLE_TYPE_WITH_INTERFACES(VirtioMemSystemReset, virtio_mem_system_reset, VIRTIO_MEM_SYSTEM_RESET, OBJECT, { TYPE_RESETTABLE_INTERFACE }, { })
> +
> +static void virtio_mem_system_reset_init(Object *obj)
> +{
> +}
> +
> +static void virtio_mem_system_reset_finalize(Object *obj)
> +{
> +}
> +
> +static ResettableState *virtio_mem_system_reset_get_state(Object *obj)
> +{
> + VirtioMemSystemReset *vmem_reset = VIRTIO_MEM_SYSTEM_RESET(obj);
> +
> + return &vmem_reset->reset_state;
> +}
> +
> +static void virtio_mem_system_reset_hold(Object *obj, ResetType type)
> +{
> + VirtioMemSystemReset *vmem_reset = VIRTIO_MEM_SYSTEM_RESET(obj);
> + VirtIOMEM *vmem = vmem_reset->vmem;
> +
> + /*
> + * When waking up from standby/suspend-to-ram, do not unplug any memory.
> + */
> + if (type == RESET_TYPE_WAKEUP) {
> + return;
> + }
> +
> + /*
> + * During usual resets, we will unplug all memory and shrink the usable
> + * region size. This is, however, not possible in all scenarios. Then,
> + * the guest has to deal with this manually (VIRTIO_MEM_REQ_UNPLUG_ALL).
> + */
> + virtio_mem_unplug_all(vmem);
> +}
> +
> +static void virtio_mem_system_reset_class_init(ObjectClass *klass, void *data)
> +{
> + ResettableClass *rc = RESETTABLE_CLASS(klass);
> +
> + rc->get_state = virtio_mem_system_reset_get_state;
> + rc->phases.hold = virtio_mem_system_reset_hold;
> +}
> diff --git a/include/hw/virtio/virtio-mem.h b/include/hw/virtio/virtio-mem.h
> index a1af144c28..abde1c4101 100644
> --- a/include/hw/virtio/virtio-mem.h
> +++ b/include/hw/virtio/virtio-mem.h
> @@ -25,6 +25,10 @@
> OBJECT_DECLARE_TYPE(VirtIOMEM, VirtIOMEMClass,
> VIRTIO_MEM)
>
> +#define TYPE_VIRTIO_MEM_SYSTEM_RESET "virtio-mem-system-reset"
> +
> +OBJECT_DECLARE_SIMPLE_TYPE(VirtioMemSystemReset, VIRTIO_MEM_SYSTEM_RESET)
> +
> #define VIRTIO_MEM_MEMDEV_PROP "memdev"
> #define VIRTIO_MEM_NODE_PROP "node"
> #define VIRTIO_MEM_SIZE_PROP "size"
> @@ -117,8 +121,14 @@ struct VirtIOMEM {
> /* listeners to notify on plug/unplug activity. */
> QLIST_HEAD(, RamDiscardListener) rdl_list;
>
> - /* State of the resettable container */
> + /* Catch system resets -> qemu_devices_reset() only. */
> + VirtioMemSystemReset *system_reset;
> +};
> +
> +struct VirtioMemSystemReset {
> + Object parent;
> ResettableState reset_state;
> + VirtIOMEM *vmem;
> };
>
> struct VirtIOMEMClass {
> --
> 2.46.1
next prev parent reply other threads:[~2024-11-08 11:53 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-25 10:41 [PATCH v1] virtio-mem: unplug memory only during system resets, not device resets David Hildenbrand
2024-11-08 10:46 ` David Hildenbrand
2024-11-08 11:52 ` Michael S. Tsirkin [this message]
2024-11-08 14:37 ` Philippe Mathieu-Daudé
2024-11-08 14:57 ` David Hildenbrand
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20241108065204-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=david@redhat.com \
--cc=jmarcin@redhat.com \
--cc=peter.maydell@linaro.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.