qemu-devel.nongnu.org archive mirror
From: Peter Xu <peterx@redhat.com>
To: Alexey Perevalov <a.perevalov@samsung.com>
Cc: qemu-devel@nongnu.org, i.maximets@samsung.com,
	quintela@redhat.com, heetae82.ahn@samsung.com,
	dgilbert@redhat.com
Subject: Re: [Qemu-devel] [PATCH v8 3/3] migration: add bitmap for received page
Date: Fri, 28 Jul 2017 12:27:58 +0800
Message-ID: <20170728042758.GS1595@pxdev.xzpeter.org>
In-Reply-To: <bb34d360-3ad8-f6c5-c03a-8d2dc994fb95@samsung.com>

On Thu, Jul 27, 2017 at 10:27:41AM +0300, Alexey Perevalov wrote:
> On 07/27/2017 05:35 AM, Peter Xu wrote:
> >On Wed, Jul 26, 2017 at 06:24:11PM +0300, Alexey Perevalov wrote:
> >>On 07/26/2017 11:43 AM, Peter Xu wrote:
> >>>On Wed, Jul 26, 2017 at 11:07:17AM +0300, Alexey Perevalov wrote:
> >>>>On 07/26/2017 04:49 AM, Peter Xu wrote:
> >>>>>On Thu, Jul 20, 2017 at 09:52:34AM +0300, Alexey Perevalov wrote:
> >>>>>>This patch adds the ability to track already received
> >>>>>>pages. It is necessary for calculating vCPU block time in
> >>>>>>the postcopy migration feature, and possibly for recovery
> >>>>>>after a postcopy migration failure.
> >>>>>>It is also necessary to solve the shared memory issue in
> >>>>>>postcopy live migration. Information about received pages
> >>>>>>will be transferred to the software virtual bridge
> >>>>>>(e.g. OVS-VSWITCHD), to avoid fallocate (unmap) for
> >>>>>>already received pages. The fallocate syscall is required for
> >>>>>>remapped shared memory, because remapping itself blocks
> >>>>>>ioctl(UFFDIO_COPY); the ioctl in this case will fail with an
> >>>>>>EEXIST error (the struct page exists after remap).
> >>>>>>
> >>>>>>The bitmap is placed into RAMBlock, like the other
> >>>>>>postcopy/precopy related bitmaps.
> >>>>>>
> >>>>>>Reviewed-by: Peter Xu <peterx@redhat.com>
> >>>>>>Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
> >>>>>>---
> >>>>>[...]
> >>>>>
> >>>>>>  static int qemu_ufd_copy_ioctl(int userfault_fd, void *host_addr,
> >>>>>>-        void *from_addr, uint64_t pagesize)
> >>>>>>+                               void *from_addr, uint64_t pagesize, RAMBlock *rb)
> >>>>>>  {
> >>>>>>+    int ret;
> >>>>>>      if (from_addr) {
> >>>>>>          struct uffdio_copy copy_struct;
> >>>>>>          copy_struct.dst = (uint64_t)(uintptr_t)host_addr;
> >>>>>>          copy_struct.src = (uint64_t)(uintptr_t)from_addr;
> >>>>>>          copy_struct.len = pagesize;
> >>>>>>          copy_struct.mode = 0;
> >>>>>>-        return ioctl(userfault_fd, UFFDIO_COPY, &copy_struct);
> >>>>>>+        ret = ioctl(userfault_fd, UFFDIO_COPY, &copy_struct);
> >>>>>>      } else {
> >>>>>>          struct uffdio_zeropage zero_struct;
> >>>>>>          zero_struct.range.start = (uint64_t)(uintptr_t)host_addr;
> >>>>>>          zero_struct.range.len = pagesize;
> >>>>>>          zero_struct.mode = 0;
> >>>>>>-        return ioctl(userfault_fd, UFFDIO_ZEROPAGE, &zero_struct);
> >>>>>>+        ret = ioctl(userfault_fd, UFFDIO_ZEROPAGE, &zero_struct);
> >>>>>>+    }
> >>>>>>+    if (!ret) {
> >>>>>>+        ramblock_recv_bitmap_set(host_addr, rb);
> >>>>>Wait...
> >>>>>
> >>>>>Now we are using 4k-page/bit bitmap, do we need to take care of the
> >>>>>huge pages here?  Looks like we are only setting the first bit of it
> >>>>>if it is a huge page?
> >>>>First version was per ramblock page size, IOW bitmap was smaller in
> >>>>case of hugepages.
> >>>Yes, but this is not the first version any more. :)
> >>>
> >>>This patch is using:
> >>>
> >>>   bitmap_new(rb->max_length >> TARGET_PAGE_BITS);
> >>>
> >>>to allocate bitmap, so it is using small pages always for bitmap,
> >>>right? (I should not really say "4k" pages, here I think the size is
> >>>host page size, which is the thing returned from getpagesize()).
> >>>
> >>>>You mentioned that TARGET_PAGE_SIZE is reasonable for the precopy case,
> >>>>in "Re: [Qemu-devel] [PATCH v1 2/2] migration: add bitmap for copied page"
> >>>>I thought TARGET_PAGE_SIZE, as the transmission unit, is used in precopy
> >>>>even in the hugepage case.
> >>>>But logically that is not so: a page marked as dirty should be sent as a
> >>>>whole page.
> >>>Sorry if I misunderstood, but I didn't see anything wrong - we are
> >>>sending pages in small pages, but when postcopy is there, we do
> >>>UFFDIO_COPY in huge page, so everything is fine?
> >>I think yes, we chose TARGET_PAGE_SIZE because of wider
> >>use case ranges.
> >So... are you going to post another version? IIUC we just need to use
> >a bitmap_set() in place of ramblock_recv_bitmap_set(), with the size
> >set to "pagesize / TARGET_PAGE_SIZE"?
> From my point of view, TARGET_PAGE_SIZE/TARGET_PAGE_BITS is platform
> specific, and it is used in ram_load to copy to the buffer, so it is
> preferable for the bitmap size. I'm not going to replace the
> ramblock_recv_bitmap_set helper - it calculates the offset.
> 
> >
> >(I think I was wrong when saying getpagesize() above: the small page
> >  should be target page size, while the huge page should be the host's)
> I think we should forget about huge page case in "received bitmap"
> concept, maybe in "uffd_copied bitmap" it was reasonable ;)

Again, I am not sure I got the whole idea of the reply...

However, I do think when we UFFDIO_COPY a huge page, then we should do
bitmap_set() on the received bitmap for the whole range that the huge
page covers.

IMHO, the bitmap is defined as "one bit per small page", and the small
page size is TARGET_PAGE_SIZE. We cannot just assume that "as long as
the first bit of the huge page is set, all the small pages in the huge
page are set".
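
To illustrate, here is a minimal standalone sketch of marking the whole
range. It is not the QEMU code: the sketch_* names and the 4 KiB target
page size are assumptions made up for this example, and the real patch
would presumably go through QEMU's own bitmap helpers instead.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB target pages. In QEMU the real value
 * comes from the per-target TARGET_PAGE_BITS. */
#define SKETCH_TARGET_PAGE_BITS 12
#define SKETCH_TARGET_PAGE_SIZE ((uint64_t)1 << SKETCH_TARGET_PAGE_BITS)
#define SKETCH_BITS_PER_LONG    (sizeof(unsigned long) * CHAR_BIT)

/* Set bit number nr in a bitmap stored as an array of unsigned long. */
static void sketch_set_bit(unsigned long *map, uint64_t nr)
{
    map[nr / SKETCH_BITS_PER_LONG] |= 1UL << (nr % SKETCH_BITS_PER_LONG);
}

/* Return 1 if bit number nr is set, 0 otherwise. */
static int sketch_test_bit(const unsigned long *map, uint64_t nr)
{
    return (int)((map[nr / SKETCH_BITS_PER_LONG] >>
                  (nr % SKETCH_BITS_PER_LONG)) & 1UL);
}

/*
 * Mark every target page covered by one placed host page as received.
 * block_offset is the offset of the host page inside its RAM block and
 * pagesize is the host page size used for the UFFDIO_COPY (for example
 * 2 MiB for a huge page). The caller must size the bitmap to cover the
 * whole block.
 */
static void sketch_recv_bitmap_set_range(unsigned long *map,
                                         uint64_t block_offset,
                                         uint64_t pagesize)
{
    assert(pagesize >= SKETCH_TARGET_PAGE_SIZE);
    assert(block_offset % pagesize == 0);

    uint64_t first = block_offset >> SKETCH_TARGET_PAGE_BITS;
    uint64_t count = pagesize >> SKETCH_TARGET_PAGE_BITS;

    for (uint64_t i = 0; i < count; i++) {
        sketch_set_bit(map, first + i);
    }
}
```

With a 2 MiB huge page at block offset 2 MiB, this sets bits 512..1023
(512 bits) instead of only bit 512, so a later lookup of any small page
inside the huge page sees it as received.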

Thanks,

-- 
Peter Xu