qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Igor Mammedov <imammedo@redhat.com>
To: Milan Zamazal <mzamazal@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>,
	"Liu, Jingqi" <jingqi.liu@intel.com>,
	"Lai, Paul C" <paul.c.lai@intel.com>,
	Cornelia Huck <cohuck@redhat.com>,
	qemu-devel@nongnu.org, Stefan Hajnoczi <stefanha@redhat.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Amnon Ilan <ailan@redhat.com>
Subject: Re: Adjustments of NVDIMM devices and future data safety
Date: Mon, 3 May 2021 16:09:39 +0200	[thread overview]
Message-ID: <20210503160939.30047450@redhat.com> (raw)
In-Reply-To: <87y2d0gjw6.fsf@redhat.com>

On Fri, 30 Apr 2021 14:18:30 +0200
Milan Zamazal <mzamazal@redhat.com> wrote:

> Hi,
> 
> I work on NVDIMM support in oVirt/RHV, I think other virtualization
> management software built on top of QEMU may have similar concerns.
> 
> When a virtual NVDIMM device size is specified, it's not necessarily the
> eventual NVDIMM device size visible to the guest OS.  As seen in
> https://github.com/qemu/qemu/blob/v6.0.0/hw/mem/nvdimm.c#L117, QEMU
> makes some adjustments (other adjustments are performed by libvirt but
> that's a topic for a different forum):
> 
> - NVDIMM label size is subtracted from the NVDIMM size.
> 
> - NVDIMM label is pointed to a certain memory region.
> 
> - The remaining NVDIMM size is aligned down.
> 
> There are some related potential problems:
> 
> - If the alignment rules change in a future QEMU version, it may result
>   in a different device size visible to the guest (even if the requested
>   size remains the same) and cause trouble there up to data loss.
> 
> - If the layout on the backing device changes, e.g. a label placement,
>   then the stored data may become corrupt or inaccessible.

We usually tie ABI changes to machine versions, so if in future we decide to
change NVDIMM layout, we should preserve old layout for old machine types
(which is accomplished using compat mechanism)

> 
> - I'm not sure about the current QEMU version, but at least in previous
>   QEMU versions, the resulting size is important for memory hot plug.
>   The NVDIMM alignment size is smaller than the required regular memory
>   DIMM placement alignment.  If a VM contains an NVDIMM with the
>   resulting size not matching the DIMM placement requirements and a
>   memory hot plug is attempted then the hot plug fails because the DIMM
>   is mapped next to the end of the NVDIMM region, which is not
>   DIMM-aligned.


I vaguely recall that, start address of any hotplugged (NV)DIMM
is aligned on 1G boundary (only the very first versions of memory
hotplug used unaligned address). Described above situation shouldn't happen.

I'd try to fix alignment issues first if there is any before talking about
splitting label out.


> All this means:
> 
> - The requested NVDIMM size must be computed and specified carefully,
>   with attention to QEMU internal implementation.
> 
> - And because it depends on QEMU internal implementation, there is a
>   risk of malfunction or data loss when the same backing device with the
>   same parameters is used with a future QEMU version.

When making incompatible changes, we usually add a new property to enable them,
so normally situation when NVDIMM "with the same parameters" is used
should lead to old behaviour.
If we change 'default' values then as long as one uses versioned machine
type, old behaviour will be preserved with future QEMU.
However if one uses un-versioned/another machine type or enables new behavior,
QEMU doesn't guarantee any compatibility.


> As for labels, I was told NVDIMM labels might be put to regular files in
> future to avoid some problems.  Since label placement is not visible to
> the guest, such a change could be made transparently without disrupting
> access to the data.  (As long as the label data is transferred to the
> new location properly and undesirable resulting NVDIMM size changes are
> not induced by such a change.)

I think current approach resembles real nvdimm devices
(the only problem is that one has to configure size/label size,
where with real devices it's done by manufacturer).

if we add a dedicated property, It should be possible to split label out
into a separate file.
However I don't fancy carrying transparent migration from old format to
the new one in QEMU, I think it should be done by separate utility. So
that if users have access to backend + and know used label size,
they should be able split it.

Also I'm not sure that splitting label out is fixing anything, it just
replaces one set of rules how to set size/label (assuming there is one)
with another + user will have to manage 1 more backend (content + label).
 
> The primary point is still how to ensure that data kept on a backing
> device will remain accessible and safe in future QEMU versions and how
> to possibly avoid reliance on QEMU implementation details.  A big
> warning in the NVDIMM handling source code to keep backward
> compatibility (incl. memory hot plugs) and data safety on mind before
> making any changes there might be a reasonable minimum measure.
> Any additional ideas?  What do you think about it all?
we usually are trying to keep compatibility (versioned or new features
are disabled by default and user has to explicitly enable them)
when making breaking changes.
(and that is done without extra warnings in the code,
otherwise QEMU will be full of them).

> 
> Thank you,
> Milan
> 
> 



  reply	other threads:[~2021-05-03 14:12 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-30 12:18 Adjustments of NVDIMM devices and future data safety Milan Zamazal
2021-05-03 14:09 ` Igor Mammedov [this message]
2021-05-05 20:46   ` Milan Zamazal
2021-05-08  7:30 ` Liu, Jingqi
2021-05-18 15:29   ` Milan Zamazal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210503160939.30047450@redhat.com \
    --to=imammedo@redhat.com \
    --cc=ailan@redhat.com \
    --cc=cohuck@redhat.com \
    --cc=dan.j.williams@intel.com \
    --cc=ehabkost@redhat.com \
    --cc=jingqi.liu@intel.com \
    --cc=mzamazal@redhat.com \
    --cc=paul.c.lai@intel.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).