All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Marc Zyngier <maz@kernel.org>
Cc: kvm@vger.kernel.org, catalin.marinas@arm.com,
	andrew.jones@linux.dev, dmatlack@google.com, will@kernel.org,
	shan.gavin@gmail.com, bgardon@google.com, kvmarm@lists.linux.dev,
	pbonzini@redhat.com, zhenyzha@redhat.com, shuah@kernel.org,
	kvmarm@lists.cs.columbia.edu, ajones@ventanamicro.com
Subject: Re: [PATCH v8 3/7] KVM: Support dirty ring in conjunction with bitmap
Date: Sun, 6 Nov 2022 11:22:29 -0500	[thread overview]
Message-ID: <Y2ffRYoqlQOxgVtk@x1n> (raw)
In-Reply-To: <87o7tkf5re.wl-maz@kernel.org>

Hi, Marc,

On Sun, Nov 06, 2022 at 03:43:17PM +0000, Marc Zyngier wrote:
> > +Note that the bitmap here is only a backup of the ring structure, and
> > +normally should only contain a very small amount of dirty pages, which
> 
> I don't think we can claim this. It is whatever amount of memory is
> dirtied outside of a vcpu context, and we shouldn't make any claim
> regarding the number of dirty pages.

The thing is the current with-bitmap design assumes that the two logs are
collected in different windows of migration, while the dirty log is only
collected after the VM is stopped.  So collecting dirty bitmap and sending
the dirty pages within the bitmap will be part of the VM downtime.

It will stop to make sense if the dirty bitmap can contain a large portion
of the guest memory, because then it'll be simpler to just stop the VM,
transfer pages, and restart on dest node without any tracking mechanism.

[1]

> 
> > +needs to be transferred during VM downtime. Collecting the dirty bitmap
> > +should be the very last thing that the VMM does before transmitting state
> > +to the target VM. VMM needs to ensure that the dirty state is final and
> > +avoid missing dirty pages from another ioctl ordered after the bitmap
> > +collection.
> > +
> > +To collect dirty bits in the backup bitmap, the userspace can use the
> > +same KVM_GET_DIRTY_LOG ioctl. KVM_CLEAR_DIRTY_LOG shouldn't be needed
> > +and its behavior is undefined since collecting the dirty bitmap always
> > +happens in the last phase of VM's migration.
> 
> It isn't clear to me why KVM_CLEAR_DIRTY_LOG should be called out. If
> you have multiple devices that dirty the memory, such as multiple
> ITSs, why shouldn't userspace be allowed to snapshot the dirty state
> multiple time? This doesn't seem like a reasonable restriction, and I
> really dislike the idea of undefined behaviour here.

I suggested the paragraph because it's very natural to ask whether we'd
need to CLEAR_LOG for this special GET_LOG phase, so I thought this could
be helpful as a reference to answer that.

I wanted to make it clear that we don't need CLEAR_LOG at all in this case,
as fundamentally clear log is about re-protect the guest pages, but if
we're with the restriction of above (having the dirty bmap the last to
collect and once and for all) then it'll make no sense to protect the guest
page at all at this stage since src host shouldn't run after the GET_LOG
then the CLEAR_LOG will be a vain effort.

I used "undefined" here just to be loose on the interface, also a hint that
we should never do that for this specific GET_LOG.  If we want, we can
ignore CLEAR_LOG in the future with ALLOW_BITMAP, and the undefined also
provides the flexibility, but that's not really that important.

The wording could definitely be improved, or maybe even avoid mentioning
the CLEAR_LOG might help, but IIUC the major thing to reach the consensus
is not CLEAR_LOG itself but on whether we can have that assumption [1] and
whether such a design of using dirty bmap is acceptable in general.

Thanks,

-- 
Peter Xu

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

WARNING: multiple messages have this Message-ID (diff)
From: Peter Xu <peterx@redhat.com>
To: Marc Zyngier <maz@kernel.org>
Cc: Gavin Shan <gshan@redhat.com>,
	kvmarm@lists.linux.dev, kvmarm@lists.cs.columbia.edu,
	kvm@vger.kernel.org, shuah@kernel.org, catalin.marinas@arm.com,
	andrew.jones@linux.dev, ajones@ventanamicro.com,
	bgardon@google.com, dmatlack@google.com, will@kernel.org,
	suzuki.poulose@arm.com, alexandru.elisei@arm.com,
	pbonzini@redhat.com, seanjc@google.com, oliver.upton@linux.dev,
	zhenyzha@redhat.com, shan.gavin@gmail.com
Subject: Re: [PATCH v8 3/7] KVM: Support dirty ring in conjunction with bitmap
Date: Sun, 6 Nov 2022 11:22:29 -0500	[thread overview]
Message-ID: <Y2ffRYoqlQOxgVtk@x1n> (raw)
Message-ID: <20221106162229.VUYhAkvAPc8DxTgNu7of1U7l5k-ZTGMUE6yazBMI6ek@z> (raw)
In-Reply-To: <87o7tkf5re.wl-maz@kernel.org>

Hi, Marc,

On Sun, Nov 06, 2022 at 03:43:17PM +0000, Marc Zyngier wrote:
> > +Note that the bitmap here is only a backup of the ring structure, and
> > +normally should only contain a very small amount of dirty pages, which
> 
> I don't think we can claim this. It is whatever amount of memory is
> dirtied outside of a vcpu context, and we shouldn't make any claim
> regarding the number of dirty pages.

The thing is the current with-bitmap design assumes that the two logs are
collected in different windows of migration, while the dirty log is only
collected after the VM is stopped.  So collecting dirty bitmap and sending
the dirty pages within the bitmap will be part of the VM downtime.

It will stop to make sense if the dirty bitmap can contain a large portion
of the guest memory, because then it'll be simpler to just stop the VM,
transfer pages, and restart on dest node without any tracking mechanism.

[1]

> 
> > +needs to be transferred during VM downtime. Collecting the dirty bitmap
> > +should be the very last thing that the VMM does before transmitting state
> > +to the target VM. VMM needs to ensure that the dirty state is final and
> > +avoid missing dirty pages from another ioctl ordered after the bitmap
> > +collection.
> > +
> > +To collect dirty bits in the backup bitmap, the userspace can use the
> > +same KVM_GET_DIRTY_LOG ioctl. KVM_CLEAR_DIRTY_LOG shouldn't be needed
> > +and its behavior is undefined since collecting the dirty bitmap always
> > +happens in the last phase of VM's migration.
> 
> It isn't clear to me why KVM_CLEAR_DIRTY_LOG should be called out. If
> you have multiple devices that dirty the memory, such as multiple
> ITSs, why shouldn't userspace be allowed to snapshot the dirty state
> multiple time? This doesn't seem like a reasonable restriction, and I
> really dislike the idea of undefined behaviour here.

I suggested the paragraph because it's very natural to ask whether we'd
need to CLEAR_LOG for this special GET_LOG phase, so I thought this could
be helpful as a reference to answer that.

I wanted to make it clear that we don't need CLEAR_LOG at all in this case,
as fundamentally clear log is about re-protect the guest pages, but if
we're with the restriction of above (having the dirty bmap the last to
collect and once and for all) then it'll make no sense to protect the guest
page at all at this stage since src host shouldn't run after the GET_LOG
then the CLEAR_LOG will be a vain effort.

I used "undefined" here just to be loose on the interface, also a hint that
we should never do that for this specific GET_LOG.  If we want, we can
ignore CLEAR_LOG in the future with ALLOW_BITMAP, and the undefined also
provides the flexibility, but that's not really that important.

The wording could definitely be improved, or maybe even avoid mentioning
the CLEAR_LOG might help, but IIUC the major thing to reach the consensus
is not CLEAR_LOG itself but on whether we can have that assumption [1] and
whether such a design of using dirty bmap is acceptable in general.

Thanks,

-- 
Peter Xu


  reply	other threads:[~2022-11-06 16:22 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-04 23:40 [PATCH v8 0/7] KVM: arm64: Enable ring-based dirty memory tracking Gavin Shan
2022-11-04 23:40 ` Gavin Shan
2022-11-04 23:40 ` [PATCH v8 1/7] KVM: x86: Introduce KVM_REQ_DIRTY_RING_SOFT_FULL Gavin Shan
2022-11-04 23:40   ` Gavin Shan
2022-11-04 23:40 ` [PATCH v8 2/7] KVM: Move declaration of kvm_cpu_dirty_log_size() to kvm_dirty_ring.h Gavin Shan
2022-11-04 23:40   ` Gavin Shan
2022-11-04 23:40 ` [PATCH v8 3/7] KVM: Support dirty ring in conjunction with bitmap Gavin Shan
2022-11-04 23:40   ` Gavin Shan
2022-11-06 15:43   ` Marc Zyngier
2022-11-06 15:43     ` Marc Zyngier
2022-11-06 16:22     ` Peter Xu [this message]
2022-11-06 16:22       ` Peter Xu
2022-11-06 20:12       ` Marc Zyngier
2022-11-06 20:12         ` Marc Zyngier
2022-11-06 21:06         ` Peter Xu
2022-11-06 21:06           ` Peter Xu
2022-11-06 21:23           ` Gavin Shan
2022-11-06 21:23             ` Gavin Shan
2022-11-07  9:38             ` Marc Zyngier
2022-11-07  9:38               ` Marc Zyngier
2022-11-07 14:29               ` Peter Xu
2022-11-07 14:29                 ` Peter Xu
2022-11-07  9:21           ` Marc Zyngier
2022-11-07  9:21             ` Marc Zyngier
2022-11-07 14:59             ` Peter Xu
2022-11-07 14:59               ` Peter Xu
2022-11-07 15:30               ` Marc Zyngier
2022-11-07 15:30                 ` Marc Zyngier
2022-11-06 21:40     ` Gavin Shan
2022-11-06 21:40       ` Gavin Shan
2022-11-07  9:45       ` Marc Zyngier
2022-11-07  9:45         ` Marc Zyngier
2022-11-07 10:45   ` Gavin Shan
2022-11-07 10:45     ` Gavin Shan
2022-11-07 11:33     ` Marc Zyngier
2022-11-07 11:33       ` Marc Zyngier
2022-11-07 23:53       ` Gavin Shan
2022-11-07 23:53         ` Gavin Shan
2022-11-07 16:05   ` Sean Christopherson
2022-11-07 16:05     ` Sean Christopherson
2022-11-08  0:44     ` Gavin Shan
2022-11-08  0:44       ` Gavin Shan
2022-11-08  1:13       ` Oliver Upton
2022-11-08  1:13         ` Oliver Upton
2022-11-08  3:30         ` Gavin Shan
2022-11-08  3:30           ` Gavin Shan
2022-11-04 23:40 ` [PATCH v8 4/7] KVM: arm64: Enable ring-based dirty memory tracking Gavin Shan
2022-11-04 23:40   ` Gavin Shan
2022-11-06 15:50   ` Marc Zyngier
2022-11-06 15:50     ` Marc Zyngier
2022-11-06 21:46     ` Gavin Shan
2022-11-06 21:46       ` Gavin Shan
2022-11-07  9:47       ` Marc Zyngier
2022-11-07  9:47         ` Marc Zyngier
2022-11-07 10:47         ` Gavin Shan
2022-11-07 10:47           ` Gavin Shan
2022-11-04 23:40 ` [PATCH v8 5/7] KVM: selftests: Use host page size to map ring buffer in dirty_log_test Gavin Shan
2022-11-04 23:40   ` Gavin Shan
2022-11-04 23:40 ` [PATCH v8 6/7] KVM: selftests: Clear dirty ring states between two modes " Gavin Shan
2022-11-04 23:40   ` Gavin Shan
2022-11-04 23:40 ` [PATCH v8 7/7] KVM: selftests: Automate choosing dirty ring size " Gavin Shan
2022-11-04 23:40   ` Gavin Shan
2022-11-06 16:08 ` [PATCH v8 0/7] KVM: arm64: Enable ring-based dirty memory tracking Marc Zyngier
2022-11-06 16:08   ` Marc Zyngier
2022-11-06 21:50   ` Gavin Shan
2022-11-06 21:50     ` Gavin Shan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y2ffRYoqlQOxgVtk@x1n \
    --to=peterx@redhat.com \
    --cc=ajones@ventanamicro.com \
    --cc=andrew.jones@linux.dev \
    --cc=bgardon@google.com \
    --cc=catalin.marinas@arm.com \
    --cc=dmatlack@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=kvmarm@lists.linux.dev \
    --cc=maz@kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=shan.gavin@gmail.com \
    --cc=shuah@kernel.org \
    --cc=will@kernel.org \
    --cc=zhenyzha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.