From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Stefan Hajnoczi <stefanha@gmail.com>,
qemu-devel@nongnu.org,
Xiao Guangrong <xiaoguangrong.eric@gmail.com>,
mst@redhat.com, Juan Quintela <quintela@redhat.com>,
Stefan Hajnoczi <stefanha@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Igor Mammedov <imammedo@redhat.com>,
Dan Williams <dan.j.williams@intel.com>,
Eduardo Habkost <ehabkost@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v4 0/8] nvdimm: guarantee persistence of QEMU writes to persistent memory
Date: Tue, 13 Mar 2018 09:36:01 +0000
Message-ID: <20180313093600.GA3548@work-vm>
In-Reply-To: <20180313001150.gtbs4jz4degeblba@hz-desktop>
* Haozhong Zhang (haozhong.zhang@intel.com) wrote:
> On 03/12/18 15:39 +0000, Stefan Hajnoczi wrote:
> > On Wed, Feb 28, 2018 at 03:25:50PM +0800, Haozhong Zhang wrote:
> > > QEMU writes to vNVDIMM backends in the vNVDIMM label emulation and
> > > live migration. If the backend is on persistent memory, QEMU needs
> > > to take proper steps to ensure its writes are persistent on the
> > > persistent memory. Otherwise, a host power failure may result in the
> > > loss of the guest data on the persistent memory.
> > >
> > > This v4 patch series is based on Marcel's patch "mem: add share
> > > parameter to memory-backend-ram" [1] because of the changes in patch 1.
> > >
> > > [1] https://lists.gnu.org/archive/html/qemu-devel/2018-02/msg03858.html
> > >
> > > Previous versions can be found at
> > > v3: https://lists.gnu.org/archive/html/qemu-devel/2018-02/msg04365.html
> > > v2: https://lists.gnu.org/archive/html/qemu-devel/2018-02/msg01579.html
> > > v1: https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg05040.html
> > >
> > > Changes in v4:
> > > * (Patch 2) Fix compilation errors found by patchew.
> > >
> > > Changes in v3:
> > > * (Patch 5) Add an is_pmem flag to ram_handle_compressed() and handle
> > >   PMEM writes in it, so we don't need the _common function.
> > > * (Patch 6) Expose qemu_get_buffer_common so we can remove the
> > >   unnecessary qemu_get_buffer_to_pmem wrapper.
> > > * (Patch 8) Add an is_pmem flag to xbzrle_decode_buffer() and handle
> > >   PMEM writes in it, so we can remove the unnecessary
> > >   xbzrle_decode_buffer_{common, to_pmem}.
> > > * Move libpmem stubs to stubs/pmem.c and fix the compilation failures
> > >   of test-{xbzrle,vmstate}.c.
> > >
> > > Changes in v2:
> > > * (Patch 1) Use a flags parameter in file ram allocation functions.
> > > * (Patch 2) Add a new option 'pmem' to hostmem-file.
> > > * (Patch 3) Use libpmem to operate on the persistent memory, rather
> > >   than re-implementing those operations in QEMU.
> > > * (Patch 5-8) Consider the write persistence in the migration path.
> > >
> > > Haozhong Zhang (8):
> > > [1/8] memory, exec: switch file ram allocation functions to 'flags' parameters
> > > [2/8] hostmem-file: add the 'pmem' option
> > > [3/8] configure: add libpmem support
> > > [4/8] mem/nvdimm: ensure write persistence to PMEM in label emulation
> > > [5/8] migration/ram: ensure write persistence on loading zero pages to PMEM
> > > [6/8] migration/ram: ensure write persistence on loading normal pages to PMEM
> > > [7/8] migration/ram: ensure write persistence on loading compressed pages to PMEM
> > > [8/8] migration/ram: ensure write persistence on loading xbzrle pages to PMEM
> > >
> > >  backends/hostmem-file.c             | 27 +++++++++++++++++++-
> > >  configure                           | 35 ++++++++++++++++++++++++++
> > >  docs/nvdimm.txt                     | 14 +++++++++++
> > >  exec.c                              | 20 ++++++++++++---
> > >  hw/mem/nvdimm.c                     |  9 ++++++-
> > >  include/exec/memory.h               | 12 +++++++--
> > >  include/exec/ram_addr.h             | 28 +++++++++++++++++++--
> > >  include/migration/qemu-file-types.h |  2 ++
> > >  include/qemu/pmem.h                 | 27 ++++++++++++++++++++
> > >  memory.c                            |  8 +++---
> > >  migration/qemu-file.c               | 29 ++++++++++++++--------
> > >  migration/ram.c                     | 49 +++++++++++++++++++++++++++----------
> > >  migration/ram.h                     |  2 +-
> > >  migration/rdma.c                    |  2 +-
> > >  migration/xbzrle.c                  |  8 ++++--
> > >  migration/xbzrle.h                  |  3 ++-
> > >  numa.c                              |  2 +-
> > >  qemu-options.hx                     |  9 ++++++-
> > >  stubs/Makefile.objs                 |  1 +
> > >  stubs/pmem.c                        | 37 ++++++++++++++++++++++++++++
> > >  tests/Makefile.include              |  4 +--
> > >  tests/test-xbzrle.c                 |  4 +--
> > >  22 files changed, 285 insertions(+), 47 deletions(-)
> > >  create mode 100644 include/qemu/pmem.h
> > >  create mode 100644 stubs/pmem.c
> >
> > A few thoughts:
> >
> > 1. Can you use pmem_is_pmem() to auto-detect the pmem=on|off value?
>
> The manpage [1] of pmem_is_pmem says:
>
> "The result of pmem_is_pmem() query is only valid for the mappings
> created using pmem_map_file(). For other memory regions, in
> particular those created by a direct call to mmap(2), pmem_is_pmem()
> always returns false, even if the queried range is entirely
> persistent memory."
>
> QEMU is using mmap for NVDIMM mapping, so pmem_is_pmem does not work.
>
> [1] http://pmem.io/pmdk/manpages/linux/master/libpmem/pmem_is_pmem.3#caveats
>
> >
> > 2. The migration/ram code is invasive. Is it really necessary to
> > persist data each time pages are loaded from a migration stream? It
> > seems simpler to migrate as normal and call pmem_persist() just once
> > after RAM has been migrated but before the migration completes.
>
> The concern is about the overhead of cache flush.
>
> In this patch series, if possible, QEMU will use pmem_mem{set,cpy}_nodrain
> APIs to copy NVDIMM blocks. Those APIs use movnt (if it's available) and
> can avoid the subsequent cache flush.
>
> Anyway, I'll run some microbenchmarks to check which one is better.

The problem is not just the overhead but the code complexity: this
series makes every path through the migration code more complex, in
places we wouldn't expect to change.
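
To make the trade-off concrete, here is a toy model in C; it is not QEMU
code. The model_* helpers are hypothetical stand-ins that only count
calls where a real build might call libpmem's pmem_memcpy_nodrain(),
pmem_drain() and pmem_persist(). The function names, the is_pmem
plumbing and the page size are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u   /* assumed page size for the model */

/* Call counters; they stand in for the real cost of each operation. */
static unsigned long n_nodrain_copies;
static unsigned long n_drains;
static unsigned long n_persists;

/* Hypothetical stand-in for a non-temporal copy (no flush needed after). */
static void *model_memcpy_nodrain(void *dst, const void *src, size_t len)
{
    n_nodrain_copies++;
    return memcpy(dst, src, len);
}

/* Hypothetical stand-in for the fence that completes earlier stores. */
static void model_drain(void)
{
    n_drains++;
}

/* Hypothetical stand-in for "flush this range, then fence". */
static void model_persist(const void *addr, size_t len)
{
    (void)addr;
    (void)len;
    n_persists++;
}

/* Approach A: every page-load path has to know about is_pmem. */
static void load_all_per_page(uint8_t *ram, const uint8_t *stream,
                              size_t len, int is_pmem)
{
    assert(len % MODEL_PAGE_SIZE == 0);
    for (size_t off = 0; off < len; off += MODEL_PAGE_SIZE) {
        if (is_pmem) {
            model_memcpy_nodrain(ram + off, stream + off, MODEL_PAGE_SIZE);
        } else {
            memcpy(ram + off, stream + off, MODEL_PAGE_SIZE);
        }
    }
    if (is_pmem) {
        model_drain();          /* one fence once all pages are copied */
    }
}

/* Approach B: load path unchanged, one persist over the whole block. */
static void load_all_then_persist(uint8_t *ram, const uint8_t *stream,
                                  size_t len, int is_pmem)
{
    assert(len % MODEL_PAGE_SIZE == 0);
    for (size_t off = 0; off < len; off += MODEL_PAGE_SIZE) {
        memcpy(ram + off, stream + off, MODEL_PAGE_SIZE);
    }
    if (is_pmem) {
        model_persist(ram, len);    /* single hook after RAM is loaded */
    }
}
```

In approach B the only pmem-aware line sits outside the per-page loop,
which is the shape Stefan suggested; whether the extra cache flush it
implies is acceptable is what the microbenchmark would have to show.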
>
> >
> > 3. This is independent of this patch series and can be done later.
> > NVDIMM seems incompatible with post-copy live migration. It would be
> > good to have a postcopy_add_blocker() API so that a nice error
> > message is printed if post-copy live migration is attempted.
>
> Post-copy with NVDIMM currently fails with message "Postcopy on shared
> RAM (...) is not yet supported". Is it enough?

Once shared support arrives (see my patch series) that check goes away;
it might still get trapped by one of the other checks, though. I'll
need to try simulated pmem to find out.
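
For reference, a blocker API of the kind Stefan describes could look
roughly like the self-contained sketch below. postcopy_add_blocker()
does not exist in QEMU at this point; the name comes from his mail, and
the signature, the fixed-size table and postcopy_try_start() are all
assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_POSTCOPY_BLOCKERS 8     /* arbitrary limit for the sketch */

static const char *postcopy_blockers[MAX_POSTCOPY_BLOCKERS];
static size_t n_postcopy_blockers;

/*
 * Hypothetical API: register a human-readable reason why postcopy
 * cannot be used. Returns 0 on success, -1 if the table is full.
 */
static int postcopy_add_blocker(const char *reason)
{
    assert(reason != NULL);
    if (n_postcopy_blockers == MAX_POSTCOPY_BLOCKERS) {
        return -1;
    }
    postcopy_blockers[n_postcopy_blockers++] = reason;
    return 0;
}

/*
 * Hypothetical entry point: refuse to start postcopy while any blocker
 * is registered, reporting the first reason in errbuf.
 * Returns 0 if postcopy may start, -1 otherwise.
 */
static int postcopy_try_start(char *errbuf, size_t errlen)
{
    if (n_postcopy_blockers > 0) {
        snprintf(errbuf, errlen, "Postcopy is blocked: %s",
                 postcopy_blockers[0]);
        return -1;
    }
    return 0;
}
```

A device such as NVDIMM would register its reason once at realize time,
and the user would then see that reason rather than a generic failure.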
Dave
> >
> > The code itself seems fine though:
> >
> > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
>
> Thanks,
> Haozhong
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK