From: Peter Xu <peterx@redhat.com>
To: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com>
Cc: Juan Quintela <quintela@redhat.com>,
Markus Armbruster <armbru@redhat.com>,
qemu-devel@nongnu.org,
"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>, Den Lunev <den@openvz.org>
Subject: Re: [PATCH v3 3/7] support UFFD write fault processing in ram_save_iterate()
Date: Fri, 20 Nov 2020 10:07:03 -0500 [thread overview]
Message-ID: <20201120150703.GE32525@xz-x1> (raw)
In-Reply-To: <1e35a550-a6a0-fbe2-ac8d-6844ce23b3fb@virtuozzo.com>
On Fri, Nov 20, 2020 at 01:44:53PM +0300, Andrey Gruzdev wrote:
> On 19.11.2020 21:25, Peter Xu wrote:
> > On Thu, Nov 19, 2020 at 03:59:36PM +0300, Andrey Gruzdev via wrote:
> >
> > [...]
> >
> > > +/**
> > > + * ram_find_block_by_host_address: find RAM block containing host page
> > > + *
> > > + * Returns true if RAM block is found and pss->block/page are
> > > + * pointing to the given host page, false in case of an error
> > > + *
> > > + * @rs: current RAM state
> > > + * @pss: page-search-status structure
> > > + */
> > > +static bool ram_find_block_by_host_address(RAMState *rs, PageSearchStatus *pss,
> > > +                                           hwaddr page_address)
> > > +{
> > > +    bool found = false;
> > > +
> > > +    pss->block = rs->last_seen_block;
> > > +    do {
> > > +        if (page_address >= (hwaddr) pss->block->host &&
> > > +            (page_address + TARGET_PAGE_SIZE) <=
> > > +                ((hwaddr) pss->block->host + pss->block->used_length)) {
> > > +            pss->page = (unsigned long)
> > > +                ((page_address - (hwaddr) pss->block->host) >> TARGET_PAGE_BITS);
> > > +            found = true;
> > > +            break;
> > > +        }
> > > +
> > > +        pss->block = QLIST_NEXT_RCU(pss->block, next);
> > > +        if (!pss->block) {
> > > +            /* Hit the end of the list */
> > > +            pss->block = QLIST_FIRST_RCU(&ram_list.blocks);
> > > +        }
> > > +    } while (pss->block != rs->last_seen_block);
> > > +
> > > +    rs->last_seen_block = pss->block;
> > > +    /*
> > > +     * Since we are in the same loop with ram_find_and_save_block(),
> > > +     * need to reset pss->complete_round after switching to
> > > +     * another block/page in pss.
> > > +     */
> > > +    pss->complete_round = false;
> > > +
> > > +    return found;
> > > +}
> >
> > I forgot whether Denis and I have discussed this, but I'll try anyways... do
> > you think we can avoid touching PageSearchStatus at all?
> >
> > PageSearchStatus is used to track a single migration iteration for precopy, so
> > that we scan from the 1st ramblock until the last one. Then we finish one
> > iteration.
> >
>
> Yes, my first idea also was to completely separate the normal iteration
> from the write-fault page source and leave pss for the normal scan.. But
> the other idea is to keep some locality with respect to the last write
> fault. I mean, it seems more optimal to restart the normal scan at the
> page next to the faulting one. In this case we can save and un-protect
> the neighborhood faster and prevent many write faults.
Yeah, locality sounds reasonable, and you just reminded me that postcopy
already has that, I think. :) Just see get_queued_page():
    if (block) {
        /*
         * As soon as we start servicing pages out of order, then we have
         * to kill the bulk stage, since the bulk stage assumes
         * in (migration_bitmap_find_and_reset_dirty) that every page is
         * dirty, that's no longer true.
         */
        rs->ram_bulk_stage = false;

        /*
         * We want the background search to continue from the queued page
         * since the guest is likely to want other pages near to the page
         * it just requested.
         */
        pss->block = block;
        pss->page = offset >> TARGET_PAGE_BITS;

        /*
         * This unqueued page would break the "one round" check, even if
         * it is really rare.
         */
        pss->complete_round = false;
    }
So as long as we queue the pages onto the src_page_requests queue, it'll take
care of write locality already, iiuc.
>
> > Snapshot is really something, imho, that can easily leverage this structure
> > without touching it - basically we want to do two things:
> >
> > - Do the 1st iteration of precopy (when ram_bulk_stage==true), and do that
> > only. We never need the 2nd, 3rd, ... iterations because we're snapshoting.
> >
> > - Leverage the postcopy queue mechanism so that when some page got written,
> > queue that page. We should have this queue higher priority than the
> > precopy scanning mentioned above.
> >
> > As long as we follow above rules, then after the above 1st round precopy, we're
> > simply done... If that works, the whole logic of precopy and PageSearchStatus
> > does not need to be touched, iiuc.
> >
> > [...]
> >
>
> It's quite a good alternative, and I thought about using the postcopy
> page queue, but that implementation won't consider the locality of writes..
>
> What do you think?
So now I think the "Do the 1st iteration of precopy only" idea won't work,
but please still consider whether it's natural to just reuse postcopy's
queue mechanism. IOW, to see whether we can avoid most of the pss logic
changes in this patch.
Thanks,
--
Peter Xu