qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com>
Cc: Juan Quintela <quintela@redhat.com>,
	Markus Armbruster <armbru@redhat.com>,
	qemu-devel@nongnu.org,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>, Den Lunev <den@openvz.org>
Subject: Re: [PATCH v3 2/7] introduce UFFD-WP low-level interface helpers
Date: Fri, 20 Nov 2020 10:01:24 -0500	[thread overview]
Message-ID: <20201120150124.GD32525@xz-x1> (raw)
In-Reply-To: <8eb862a9-90d3-e3ea-5bdf-50287ce2226f@virtuozzo.com>

On Fri, Nov 20, 2020 at 02:04:46PM +0300, Andrey Gruzdev wrote:
> > > +    RAMBLOCK_FOREACH_NOT_IGNORED(bs) {
> > > +        /* Nothing to do with read-only and MMIO-writable regions */
> > > +        if (bs->mr->readonly || bs->mr->rom_device) {
> > > +            continue;
> > > +        }
> > > +
> > > +        /* Register block memory with UFFD to track writes */
> > > +        if (uffd_register_memory(rs->uffdio_fd, (hwaddr) bs->host,
> > > +                bs->max_length, false, true)) {
> > > +            goto fail;
> > > +        }
> > > +        /* Apply UFFD write protection to the block memory range */
> > > +        if (uffd_protect_memory(rs->uffdio_fd, (hwaddr) bs->host,
> > > +                bs->max_length, true)) {
> > 
> > Here logically we need to undo the previous register first, however userfaultfd
> > will auto-clean these when close(fd), so it's ok.  However still better to
> > unwind the protection of pages, I think.  So...
> > 
> 
> It should auto-clean, but as an additional safety measure - yes.

I'm afraid it will only clean up the registers, but not the page table updates;
at least that should be what we do now in the kernel. I'm not sure whether we
should always force the kernel to unprotect those when close(). The problem is
the registered range is normally quite large while the wr-protect range can be
very small (page-based), so that could take a lot of time, which can be
unnecessary, since the userspace is the one that knows the best on which range
was protected.

Indeed I can't think if anything really bad even if not unprotect the pages as
you do right now - what will happen is that the wr-protected pages will have
UFFD_WP set and PAGE_RW cleared in the page tables even after the close(fd).
It means after the snapshot got cancelled those wr-protected pages could still
trigger page fault again when being written, though since it's not covered by
uffd-wp protected vmas, it'll become a "normal cow" fault, and the write bit
will be recovered.  However the UFFD_WP bit in the page tables could got
leftover there...  So maybe it's still best to unprotect from userspace.

There's an idea that maybe we can auto-remove the UFFD_WP bit in kernel when
cow happens for a page, but that's definitely out of topic (and we must make
sure things like "enforced cow for read-only get_user_pages() won't happen
again").  No hard to do that in userspace, anyways.

-- 
Peter Xu



  reply	other threads:[~2020-11-20 15:02 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-19 12:59 [PATCH v3 0/7] UFFD write-tracking migration/snapshots Andrey Gruzdev via
2020-11-19 12:59 ` [PATCH v3 1/7] introduce 'track-writes-ram' migration capability Andrey Gruzdev via
2020-11-19 18:51   ` Peter Xu
2020-11-19 19:07     ` Peter Xu
2020-11-20 11:35       ` Andrey Gruzdev
2020-11-24 16:55       ` Dr. David Alan Gilbert
2020-11-24 17:25         ` Andrey Gruzdev
2020-11-20 11:32     ` Andrey Gruzdev
2020-11-19 12:59 ` [PATCH v3 2/7] introduce UFFD-WP low-level interface helpers Andrey Gruzdev via
2020-11-19 18:39   ` Peter Xu
2020-11-20 11:04     ` Andrey Gruzdev
2020-11-20 15:01       ` Peter Xu [this message]
2020-11-20 15:43         ` Andrey Gruzdev
2020-11-24 17:57   ` Dr. David Alan Gilbert
2020-11-25  8:11     ` Andrey Gruzdev
2020-11-25 18:43       ` Dr. David Alan Gilbert
2020-11-25 19:17         ` Andrey Gruzdev
2020-11-19 12:59 ` [PATCH v3 3/7] support UFFD write fault processing in ram_save_iterate() Andrey Gruzdev via
2020-11-19 18:25   ` Peter Xu
2020-11-20 10:44     ` Andrey Gruzdev
2020-11-20 15:07       ` Peter Xu
2020-11-20 16:15         ` Andrey Gruzdev
2020-11-20 16:43           ` Peter Xu
2020-11-20 16:53             ` Andrey Gruzdev
2020-11-23 21:34               ` Peter Xu
2020-11-24  8:02                 ` Andrey Gruzdev
2020-11-24 15:17                   ` Peter Xu
2020-11-24 17:40                     ` Andrey Gruzdev
2020-11-25 13:08   ` Dr. David Alan Gilbert
2020-11-25 14:40     ` Andrey Gruzdev
2020-11-25 18:41       ` Dr. David Alan Gilbert
2020-11-25 19:12         ` Andrey Gruzdev
2020-11-19 12:59 ` [PATCH v3 4/7] implementation of write-tracking migration thread Andrey Gruzdev via
2020-11-19 18:47   ` Peter Xu
2020-11-20 11:41     ` Andrey Gruzdev
2020-11-19 12:59 ` [PATCH v3 5/7] implementation of vm_start() BH Andrey Gruzdev via
2020-11-19 18:46   ` Peter Xu
2020-11-20 11:13     ` Andrey Gruzdev
2020-11-19 12:59 ` [PATCH v3 6/7] the rest of write tracking migration code Andrey Gruzdev via
2020-11-19 12:59 ` [PATCH v3 7/7] introduce simple linear scan rate limiting mechanism Andrey Gruzdev via
2020-11-19 20:02   ` Peter Xu
2020-11-20 12:06     ` Andrey Gruzdev
2020-11-20 15:23       ` Peter Xu
2020-11-24 16:41 ` [PATCH v3 0/7] UFFD write-tracking migration/snapshots Dr. David Alan Gilbert

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201120150124.GD32525@xz-x1 \
    --to=peterx@redhat.com \
    --cc=andrey.gruzdev@virtuozzo.com \
    --cc=armbru@redhat.com \
    --cc=den@openvz.org \
    --cc=dgilbert@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).