From: Peter Xu <peterx@redhat.com>
To: Prasad Pandit <ppandit@redhat.com>
Cc: Fabiano Rosas <farosas@suse.de>,
	qemu-devel@nongnu.org, berrange@redhat.com,
	Prasad Pandit <pjp@fedoraproject.org>
Subject: Re: [PATCH v9 0/7] Allow to enable multifd and postcopy migration together
Date: Tue, 29 Apr 2025 09:04:46 -0400
Message-ID: <aBDObgL7hDQMy63F@x1.local>
In-Reply-To: <CAE8KmOz7P+Pz8zwJq+mTEJbZjhCk7iAo9+c5DrZzhbTmz=VtUQ@mail.gmail.com>

On Tue, Apr 29, 2025 at 06:21:13PM +0530, Prasad Pandit wrote:
> Hi,
> 
> > On Thu, Apr 17, 2025 at 01:05:37PM -0300, Fabiano Rosas wrote:
> > > It's not that page faults happen during multifd. The page was already
> > > sent during precopy, but multifd-recv didn't write to it, it just marked
> > > the receivedmap. When postcopy starts, the page gets accessed and
> > > faults. Since postcopy is on, the migration wants to request the page
> > > from the source, but it's present in the receivedmap, so it doesn't
> > > ask. No page ever comes and the code hangs waiting for the page fault to
> > > be serviced (or potentially faults continuously? I'm not sure on the
> > > details).
> >
> > I think your previous analysis is correct on the zero pages.  I am not 100%
> > sure if that's the issue but very likely.  I tend to also agree with you
> > that we could skip zero page optimization in multifd code when postcopy is
> > enabled (maybe plus some comment right above..).
> 
>    migration/multifd: solve zero page causing multiple page faults
>      -> https://gitlab.com/qemu-project/qemu/-/commit/5ef7e26bdb7eda10d6d5e1b77121be9945e5e550
> 
> * Is this the optimization that is causing the migration hang issue?

I think that's what Fabiano mentioned, but ultimately we need to verify it
on a reproducer to know.
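
To make the suspected sequence concrete, here is a toy model of it.  It is
only a sketch of the scenario Fabiano described, not QEMU code: every name
in it (toy_dest, toy_recv_zero_page, toy_access) is made up for
illustration, and "populated" stands in for whether the destination page
has ever been written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_NPAGES    4
#define TOY_PAGE_SIZE 16

/* Toy destination state; none of these names exist in QEMU. */
struct toy_dest {
    bool received[TOY_NPAGES];   /* stand-in for the receivedmap */
    bool populated[TOY_NPAGES];  /* has the page ever been written? */
    unsigned char ram[TOY_NPAGES][TOY_PAGE_SIZE];
};

/* Precopy: a zero page arrives on a multifd channel (unpatched logic). */
static void toy_recv_zero_page(struct toy_dest *d, size_t pg)
{
    assert(pg < TOY_NPAGES);
    if (d->received[pg]) {
        /* Seen before, so it may hold stale data: really zero it. */
        memset(d->ram[pg], 0, TOY_PAGE_SIZE);
        d->populated[pg] = true;
    } else {
        /* First time: only mark it, the page itself is never touched. */
        d->received[pg] = true;
    }
}

enum toy_fault_result { TOY_NO_FAULT, TOY_REQUESTED, TOY_STUCK };

/* Postcopy: the guest touches a page on the destination. */
static enum toy_fault_result toy_access(const struct toy_dest *d, size_t pg)
{
    assert(pg < TOY_NPAGES);
    if (d->populated[pg]) {
        return TOY_NO_FAULT;     /* page is there, no fault */
    }
    if (d->received[pg]) {
        return TOY_STUCK;        /* fault, but no request is sent */
    }
    return TOY_REQUESTED;        /* fault, page requested from source */
}
```

In this model, a page that went through toy_recv_zero_page() exactly once
ends up in the TOY_STUCK state when it is touched during postcopy, which is
the hang we'd want the reproducer to confirm or rule out.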

> 
> ===
> diff --git a/migration/multifd-zero-page.c b/migration/multifd-zero-page.c
> index dbc1184921..00f69ff965 100644
> --- a/migration/multifd-zero-page.c
> +++ b/migration/multifd-zero-page.c
> @@ -85,7 +85,8 @@ void multifd_recv_zero_page_process(MultiFDRecvParams *p)
>  {
>      for (int i = 0; i < p->zero_num; i++) {
>          void *page = p->host + p->zero[i];
> -        if (ramblock_recv_bitmap_test_byte_offset(p->block, p->zero[i])) {
> +        if (!migrate_postcopy() &&
> +            ramblock_recv_bitmap_test_byte_offset(p->block, p->zero[i])) {
>              memset(page, 0, multifd_ram_page_size());
>          } else {
>              ramblock_recv_bitmap_set_offset(p->block, p->zero[i]);
> ===
> 
> * Would the above patch help to resolve it?

Looks ok, but please add some comments explaining why postcopy needs this,
and especially why it has to be done during the precopy phase.

I'd use migrate_postcopy_ram() instead.  I wish migrate_dirty_bitmaps() had
a better name, maybe migrate_postcopy_block()..  I have no idea who is using
that feature, especially when postcopy-ram is off.
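
As an illustration of the kind of comment I mean, below is a toy version of
the receive path.  It is one reading of the intent discussed earlier in the
thread (skip the zero-page optimization when postcopy is enabled), not the
diff quoted above and not necessarily what should be merged.  All names are
hypothetical; the postcopy_ram field merely stands in for a
migrate_postcopy_ram() check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SK_NPAGES    4
#define SK_PAGE_SIZE 16

/* Toy destination state; names are illustrative, not QEMU's. */
struct sk_dest {
    bool postcopy_ram;           /* stand-in for migrate_postcopy_ram() */
    bool received[SK_NPAGES];    /* stand-in for the receivedmap */
    bool populated[SK_NPAGES];   /* has the page ever been written? */
    unsigned char ram[SK_NPAGES][SK_PAGE_SIZE];
};

static void sk_recv_zero_page(struct sk_dest *d, size_t pg)
{
    assert(pg < SK_NPAGES);
    bool received = d->received[pg];

    /*
     * Without postcopy, a page that was never received is still zero
     * on the destination, so writing zeros to it can be skipped.
     *
     * With postcopy-ram enabled the page must be populated here, while
     * still in the precopy phase: once postcopy starts, touching an
     * unpopulated page faults, and a page whose receivedmap bit is
     * already set is not requested from the source.
     */
    if (d->postcopy_ram || received) {
        memset(d->ram[pg], 0, SK_PAGE_SIZE);
        d->populated[pg] = true;
    }
    if (!received) {
        d->received[pg] = true;
    }
}
```

The point of the sketch is only where the comment sits and what it says:
why postcopy cares, and why the write cannot wait until the postcopy phase.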

> 
> * Another way could be when the page fault occurs during postcopy
> phase, if we know (from receivedmap) that the faulted page is a
> zero-page, maybe we could write it locally on the destination to
> service the page-fault?

I don't think we can know that - a set receivedmap bit doesn't mean it's a
zero page, only that the page has been received before.  It can also happen
when e.g. >1 threads fault on the same page: the 2nd thread to fault may see
the receivedmap bit set because the 1st thread's fault has already been
resolved.
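
A small sketch of why the bit alone is not enough, assuming a map that
keeps one "received" flag per page.  The names (rm_page, rm_place_page,
rm_mark_zero_page, rm_seen_before) are invented for this example and are
not QEMU functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define RM_PAGE_SIZE 16

/* Toy page with a single "received" flag; names are illustrative. */
struct rm_page {
    bool received;                   /* the only per-page bit we keep */
    unsigned char data[RM_PAGE_SIZE];
};

/* A page arrives with real contents, e.g. after a resolved fault. */
static void rm_place_page(struct rm_page *p, unsigned char fill)
{
    assert(p != NULL);
    memset(p->data, fill, RM_PAGE_SIZE);
    p->received = true;
}

/* A zero page is recorded without being written. */
static void rm_mark_zero_page(struct rm_page *p)
{
    assert(p != NULL);
    p->received = true;
}

/* Everything a fault handler can learn from the map alone. */
static bool rm_seen_before(const struct rm_page *p)
{
    assert(p != NULL);
    return p->received;
}
```

After rm_mark_zero_page() on one page and rm_place_page() on another,
rm_seen_before() returns true for both, so a handler looking only at the
flag cannot tell "zero page, never written" apart from "data page, already
placed by the first faulting thread".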

-- 
Peter Xu