From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pagupta@redhat.com>,
	David Hildenbrand <david@redhat.com>,
	qemu-devel@nongnu.org, Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v3 3/3] virtio-pmem: should we make it migratable???
Date: Fri, 4 May 2018 13:26:51 +0100
Message-ID: <20180504122651.GC2611@work-vm>
In-Reply-To: <20180504111323.7cb4c7c8@redhat.com>

* Igor Mammedov (imammedo@redhat.com) wrote:
> On Thu, 26 Apr 2018 03:37:51 -0400 (EDT)
> Pankaj Gupta <pagupta@redhat.com> wrote:
> 
> trimming CC list to keep people that might be interested in the topic
> and renaming thread to reflect it.
> 
> > > > > > > > >> +
> > > > > > > > >> +    memory_region_add_subregion(&hpms->mr, addr - hpms->base, mr);
> > > > > > > > > missing vmstate registration?  
> > > > > > > > 
> > > > > > > > Missed this one: To be called by the caller. Important
> > > > > > > > because e.g. for virtio-pmem we don't want this (I assume :) ).
> > > > > > > if pmem isn't on shared storage, then we'd probably want to
> > > > > > > migrate it as well, otherwise the target would experience data
> > > > > > > loss. Anyways, I'd just treat it as normal RAM in the migration case
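
For reference, "treat it as normal RAM" here means registering the
region with vmstate so the generic RAM migration code picks it up; a
minimal sketch, assuming 'dev' is the owning DeviceState and that
virtio-pmem would do this in its realize path:

    /* Sketch: make the hotplugged region migratable like normal RAM.
     * vmstate_register_ram() gives the underlying RAMBlock a stable
     * idstr (derived from the device) so the migration stream can
     * match it up on the destination. */
    memory_region_add_subregion(&hpms->mr, addr - hpms->base, mr);
    vmstate_register_ram(mr, dev);
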
> > > > > > 
> > > > > > The main difference between RAM and pmem is that pmem acts like a
> > > > > > combination of RAM and disk.
> > > > > > That said, in a normal use case the size would be in the 100 GBs
> > > > > > to a few TBs range.
> > > > > > I am not sure we really want to migrate it for the non-shared
> > > > > > storage use case.
> > > > > with non-shared storage you'd have to migrate it to the target host,
> > > > > but with shared storage it might be possible to flush it and use it
> > > > > directly from the target host. That probably won't work right out of
> > > > > the box and would need some sort of synchronization between src/dst
> > > > > hosts.
> > > > 
> > > > Shared storage should work out of the box.
> > > > The only thing is that data on the destination host will be cache
> > > > cold, and existing pages in the cache should be invalidated first.
> > > > But if we migrate the entire fake DAX RAM state, it will populate the
> > > > destination host page cache, including pages which were idle on the
> > > > source host. This would unnecessarily create entropy on the
> > > > destination host.
> > > > 
> > > > To me this feature doesn't make much sense. The problem we are
> > > > solving is: efficiently use guest RAM.
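
FWIW, the "invalidated first" step on the destination could be as
simple as dropping the backing file's cached pages before the guest
starts using it; a sketch using posix_fadvise(), assuming the
destination has the shared backing file open as 'backing_fd':

    /* Sketch: drop stale page cache for the shared backing file on
     * the destination host.  POSIX_FADV_DONTNEED is only a hint, but
     * it evicts clean cached pages. */
    #include <fcntl.h>

    posix_fadvise(backing_fd, 0, 0, POSIX_FADV_DONTNEED);
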
> > > What would the live migration handover flow look like in the case of a
> > > guest constantly dirtying memory provided by virtio-pmem and
> > > sometimes issuing an async flush request along with it?
> > 
> > Dirtying the entire pmem (disk) at once is not a usual scenario. Some
> > part of the disk/pmem would get dirty and we need to handle that. I just
> > want to say that moving the entire pmem (disk) is not an efficient
> > solution, because we are using this solution to manage guest memory
> > efficiently. Otherwise it will be like any block device copy with
> > non-shared storage.
> not sure if we can use a block-layer analogy here.
> 
> > > > > The same applies to nv/pc-dimm as well, as the backend file could
> > > > > easily be on pmem storage too.
> > > > 
> > > > Are you saying the backing file is on actual nvdimm hardware? Then we
> > > > don't need emulation at all.
> > > depends on whether the file is on a DAX filesystem, but your argument
> > > about migrating a huge 100 GB to TBs range applies in this case as well.
> > >   
> > > >   
> > > > > 
> > > > > Maybe for now we should migrate everything so it would work in the
> > > > > case of a non-shared NVDIMM on the host. And then later add a
> > > > > migration-less capability to all of them.
> > > > 
> > > > not sure I agree.  
> > > So would you inhibit migration in the case of non-shared backend
> > > storage, to avoid losing data since it isn't migrated?
> > 
> > I am just thinking about what features we want to support with pmem, and
> > live migration with shared storage is the one which comes to my mind.
> > 
> > If live migration with non-shared storage is what we want to support (I
> > don't know yet), we can add this? Even with shared storage, would it copy
> > the entire pmem state?
> Perhaps we should register vmstate like for normal RAM and use something
> similar to this:
>   http://lists.gnu.org/archive/html/qemu-devel/2018-04/msg00003.html
> to skip shared memory on migration.
> In this case we could use this for pc-dimms as well.
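
The linked approach, roughly: let the RAM migration pass skip any
RAMBlock whose backing is shared.  A conceptual sketch follows;
qemu_ram_is_shared() is a real helper, but migrate_skip_shared() just
stands in for whatever capability flag that patch adds:

    /* Sketch: decide whether a RAMBlock's pages go on the wire. */
    static bool ram_block_needs_migration(RAMBlock *rb)
    {
        if (migrate_skip_shared() && qemu_ram_is_shared(rb)) {
            /* the destination maps the same backing file itself */
            return false;
        }
        return true;
    }
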
> 
> David,
>  what's your take on it?

My feeling is that something is going to have to migrate it; I'm just not
sure how.
So let me just check that I understand:
  a) It's potentially huge
  b) It's a RAMBlock
  c) It's backed by ????
     c1) Something machine local - i.e. a physical lump of flash in a
         socket rather than something sharable by machines?
  d) It can potentially be rapidly changing as the guest writes to it?

Dave

> > Thanks,
> > Pankaj
> >  
> > > 
> > >   
> > > > > > One reason why nvdimm added vmstate info could be: there would
> > > > > > still be transient writes in memory with fake DAX, and there is
> > > > > > (until now) no way to flush the guest writes. But with virtio-pmem
> > > > > > we can flush such writes before migration, and on the destination
> > > > > > host with a shared disk we will automatically have the updated
> > > > > > data.
> > > > > nvdimm has the concept of a flush hint address (maybe not
> > > > > implemented in qemu yet), but it can flush. The only reason I'm
> > > > > buying into the virtio-pmem idea is that it would allow async flush
> > > > > queues, which would reduce the number of vmexits.
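
As I understand it, the host side of such a flush is simple: each guest
flush request becomes an fsync() on the backing file, which is also
what would make a "flush before migration" step cheap.  A sketch, with
the handler name invented for illustration:

    #include <unistd.h>

    /* Sketch: persist guest writes to the backing file so that a
     * shared-storage destination sees them. */
    static int handle_virtio_pmem_flush(int backing_fd)
    {
        return fsync(backing_fd);
    }
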
> > > > 
> > > > That's correct.
> > > > 
> > > > Thanks,
> > > > Pankaj
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
