From: Peter Xu <peterx@redhat.com>
To: Alexey Perevalov <a.perevalov@samsung.com>
Cc: qemu-devel@nongnu.org, i.maximets@samsung.com,
quintela@redhat.com, heetae82.ahn@samsung.com,
dgilbert@redhat.com
Subject: Re: [Qemu-devel] [PATCH v8 3/3] migration: add bitmap for received page
Date: Fri, 28 Jul 2017 12:27:58 +0800 [thread overview]
Message-ID: <20170728042758.GS1595@pxdev.xzpeter.org> (raw)
In-Reply-To: <bb34d360-3ad8-f6c5-c03a-8d2dc994fb95@samsung.com>
On Thu, Jul 27, 2017 at 10:27:41AM +0300, Alexey Perevalov wrote:
> On 07/27/2017 05:35 AM, Peter Xu wrote:
> >On Wed, Jul 26, 2017 at 06:24:11PM +0300, Alexey Perevalov wrote:
> >>On 07/26/2017 11:43 AM, Peter Xu wrote:
> >>>On Wed, Jul 26, 2017 at 11:07:17AM +0300, Alexey Perevalov wrote:
> >>>>On 07/26/2017 04:49 AM, Peter Xu wrote:
> >>>>>On Thu, Jul 20, 2017 at 09:52:34AM +0300, Alexey Perevalov wrote:
> >>>>>>This patch adds ability to track down already received
> >>>>>>pages, it's necessary for calculation vCPU block time in
> >>>>>>postcopy migration feature, maybe for restore after
> >>>>>>postcopy migration failure.
> >>>>>>Also it's necessary to solve shared memory issue in
> >>>>>>postcopy livemigration. Information about received pages
> >>>>>>will be transferred to the software virtual bridge
> >>>>>>(e.g. OVS-VSWITCHD), to avoid fallocate (unmap) for
> >>>>>>already received pages. fallocate syscall is required for
> >>>>>>remmaped shared memory, due to remmaping itself blocks
> >>>>>>ioctl(UFFDIO_COPY, ioctl in this case will end with EEXIT
> >>>>>>error (struct page is exists after remmap).
> >>>>>>
> >>>>>>Bitmap is placed into RAMBlock as another postcopy/precopy
> >>>>>>related bitmaps.
> >>>>>>
> >>>>>>Reviewed-by: Peter Xu <peterx@redhat.com>
> >>>>>>Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
> >>>>>>---
> >>>>>[...]
> >>>>>
> >>>>>> static int qemu_ufd_copy_ioctl(int userfault_fd, void *host_addr,
> >>>>>>- void *from_addr, uint64_t pagesize)
> >>>>>>+ void *from_addr, uint64_t pagesize, RAMBlock *rb)
> >>>>>> {
> >>>>>>+ int ret;
> >>>>>> if (from_addr) {
> >>>>>> struct uffdio_copy copy_struct;
> >>>>>> copy_struct.dst = (uint64_t)(uintptr_t)host_addr;
> >>>>>> copy_struct.src = (uint64_t)(uintptr_t)from_addr;
> >>>>>> copy_struct.len = pagesize;
> >>>>>> copy_struct.mode = 0;
> >>>>>>- return ioctl(userfault_fd, UFFDIO_COPY, ©_struct);
> >>>>>>+ ret = ioctl(userfault_fd, UFFDIO_COPY, ©_struct);
> >>>>>> } else {
> >>>>>> struct uffdio_zeropage zero_struct;
> >>>>>> zero_struct.range.start = (uint64_t)(uintptr_t)host_addr;
> >>>>>> zero_struct.range.len = pagesize;
> >>>>>> zero_struct.mode = 0;
> >>>>>>- return ioctl(userfault_fd, UFFDIO_ZEROPAGE, &zero_struct);
> >>>>>>+ ret = ioctl(userfault_fd, UFFDIO_ZEROPAGE, &zero_struct);
> >>>>>>+ }
> >>>>>>+ if (!ret) {
> >>>>>>+ ramblock_recv_bitmap_set(host_addr, rb);
> >>>>>Wait...
> >>>>>
> >>>>>Now we are using 4k-page/bit bitmap, do we need to take care of the
> >>>>>huge pages here? Looks like we are only setting the first bit of it
> >>>>>if it is a huge page?
> >>>>First version was per ramblock page size, IOW bitmap was smaller in
> >>>>case of hugepages.
> >>>Yes, but this is not the first version any more. :)
> >>>
> >>>This patch is using:
> >>>
> >>> bitmap_new(rb->max_length >> TARGET_PAGE_BITS);
> >>>
> >>>to allocate bitmap, so it is using small pages always for bitmap,
> >>>right? (I should not really say "4k" pages, here I think the size is
> >>>host page size, which is the thing returned from getpagesize()).
> >>>
> >>>>You mentioned that TARGET_PAGE_SIZE is reasonable for precopy case,
> >>>>in "Re: [Qemu-devel] [PATCH v1 2/2] migration: add bitmap for copied page"
> >>>>I though TARGET_PAGE_SIZE as transmition unit, is using in precopy even
> >>>>hugepage case.
> >>>>But it's not so logically, page being marked as dirty, should be sent as a
> >>>>whole page.
> >>>Sorry if I misunderstood, but I didn't see anything wrong - we are
> >>>sending pages in small pages, but when postcopy is there, we do
> >>>UFFDIO_COPY in huge page, so everything is fine?
> >>I think yes, we chose TARGET_PAGE_SIZE because of wider
> >>use case ranges.
> >So... are you going to post another version? IIUC we just need to use
> >a bitmap_set() to replace the ramblock_recv_bitmap_set(), while set
> >the size with "pagesize / TARGET_PAGE_SIZE"?
> From my point of view TARGET_PAGE_SIZE/TARGET_PAGE_BITS it's a platform
> specific
>
> and it used in ram_load to copy to buffer so it's more preferred for bitmap size
> and I'm not going to replace ramblock_recv_bitmap_set helper - it calculates offset.
>
> >
> >(I think I was wrong when saying getpagesize() above: the small page
> > should be target page size, while the huge page should be the host's)
> I think we should forget about huge page case in "received bitmap"
> concept, maybe in "uffd_copied bitmap" it was reasonable ;)
Again, I am not sure I got the whole idea of the reply...
However, I do think when we UFFDIO_COPY a huge page, then we should do
bitmap_set() on the received bitmap for the whole range that the huge
page covers.
IMHO, the bitmap is defined as "one bit per small page", and the small
page size is TARGET_PAGE_SIZE. We cannot just assume that "as long as
the first bit of the huge page is set, all the small pages in the huge
page are set".
Thanks,
--
Peter Xu
next prev parent reply other threads:[~2017-07-28 4:28 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <CGME20170720065246eucas1p1fb290bf54218503c2eb59db1c472faeb@eucas1p1.samsung.com>
2017-07-20 6:52 ` [Qemu-devel] [PATCH v8 0/3] Add bitmap for received pages in postcopy migration Alexey Perevalov
2017-07-20 6:52 ` [Qemu-devel] [PATCH v8 1/3] migration: postcopy_place_page factoring out Alexey Perevalov
2017-07-20 6:52 ` [Qemu-devel] [PATCH v8 2/3] migration: introduce qemu_ufd_copy_ioctl helper Alexey Perevalov
2017-07-20 6:52 ` [Qemu-devel] [PATCH v8 3/3] migration: add bitmap for received page Alexey Perevalov
2017-07-26 1:49 ` Peter Xu
2017-07-26 8:07 ` Alexey Perevalov
2017-07-26 8:43 ` Peter Xu
2017-07-26 15:24 ` Alexey Perevalov
2017-07-27 2:35 ` Peter Xu
2017-07-27 7:27 ` Alexey Perevalov
2017-07-28 4:27 ` Peter Xu [this message]
2017-07-28 6:43 ` Alexey Perevalov
2017-07-28 6:57 ` Peter Xu
2017-07-28 7:06 ` Alexey Perevalov
2017-07-28 15:29 ` Alexey Perevalov
2017-07-31 1:33 ` Peter Xu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170728042758.GS1595@pxdev.xzpeter.org \
--to=peterx@redhat.com \
--cc=a.perevalov@samsung.com \
--cc=dgilbert@redhat.com \
--cc=heetae82.ahn@samsung.com \
--cc=i.maximets@samsung.com \
--cc=qemu-devel@nongnu.org \
--cc=quintela@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.