From: Sean Christopherson <seanjc@google.com>
To: Marc Zyngier <maz@kernel.org>
Cc: shuah@kernel.org, kvm@vger.kernel.org, catalin.marinas@arm.com,
andrew.jones@linux.dev, dmatlack@google.com,
shan.gavin@gmail.com, bgardon@google.com, kvmarm@lists.linux.dev,
pbonzini@redhat.com, zhenyzha@redhat.com, will@kernel.org,
kvmarm@lists.cs.columbia.edu
Subject: Re: [PATCH v6 3/8] KVM: Add support for using dirty ring in conjunction with bitmap
Date: Mon, 24 Oct 2022 23:50:29 +0000 [thread overview]
Message-ID: <Y1ckxYst3tc0LCqb@google.com> (raw)
In-Reply-To: <87edv0gnb3.wl-maz@kernel.org>
On Sat, Oct 22, 2022, Marc Zyngier wrote:
> On Fri, 21 Oct 2022 17:05:26 +0100, Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Fri, Oct 21, 2022, Marc Zyngier wrote:
> > > Because dirtying memory outside of a vcpu context makes it
> > > incredibly awkward to handle a "ring full" condition?
> >
> > Kicking all vCPUs with the soft-full request isn't _that_ awkward.
> > It's certainly sub-optimal, but if inserting into the per-VM ring is
> > relatively rare, then in practice it's unlikely to impact guest
> > performance.
>
> But there is *nothing* to kick here. The kernel is dirtying pages,
> devices are dirtying pages (DMA), and there is no context associated
> with that. Which is why a finite ring is the wrong abstraction.
I don't follow. If there's a VM, KVM can always kick all vCPUs. Again, might
be far from optimal, but it's an option. If there's literally no VM, then KVM
isn't involved at all and there's no "ring vs. bitmap" decision.
> > Would it be possible to require a dirty bitmap when an ITS is
> > created? That would allow treating the above condition as a KVM
> > bug.
>
> No. This should be optional. Everything about migration should be
> absolutely optional (I run plenty of concurrent VMs on sub-2GB
> systems). You want to migrate a VM that has an ITS or will collect
> dirty bits originating from a SMMU with HTTU, you enable the dirty
> bitmap. You want to have *vcpu* based dirty rings, you enable them.
>
> In short, there shouldn't be any reason for the two are either
> mandatory or conflated. Both should be optional, independent, because
> they cover completely disjoined use cases. *userspace* should be in
> charge of deciding this.
I agree about userspace being in control, what I want to avoid is letting userspace
put KVM into a bad state without any indication from KVM that the setup is wrong
until something actually dirties a page.
Specifically, if mark_page_dirty_in_slot() is invoked without a running vCPU, on
a memslot with dirty tracking enabled but without a dirty bitmap, then the migration
is doomed. Dropping the dirty page isn't a sane response as that'd all but
guaranatee memory corruption in the guest. At best, KVM could kick all vCPUs out
to userspace with a new exit reason, but that's not a very good experience for
userspace as either the VM is unexpectedly unmigratable or the VM ends up being
killed (or I suppose userspace could treat the exit as a per-VM dirty ring of
size '1').
That's why I asked if it's possible for KVM to require a dirty_bitmap when KVM
might end up collecting dirty information without a vCPU. KVM is still
technically prescribing a solution to userspace, but only because there's only
one solution.
> > > > The acquire-release thing is irrelevant for x86, and no other
> > > > architecture supports the dirty ring until this series, i.e. there's
> > > > no need for KVM to detect that userspace has been updated to gain
> > > > acquire-release semantics, because the fact that userspace is
> > > > enabling the dirty ring on arm64 means userspace has been updated.
> > >
> > > Do we really need to make the API more awkward? There is an
> > > established pattern of "enable what is advertised". Some level of
> > > uniformity wouldn't hurt, really.
> >
> > I agree that uniformity would be nice, but for capabilities I don't
> > think that's ever going to happen. I'm pretty sure supporting
> > enabling is actually in the minority. E.g. of the 20 capabilities
> > handled in kvm_vm_ioctl_check_extension_generic(), I believe only 3
> > support enabling (KVM_CAP_HALT_POLL, KVM_CAP_DIRTY_LOG_RING, and
> > KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2).
>
> I understood that you were advocating that a check for KVM_CAP_FOO
> could result in enabling KVM_CAP_BAR. That I definitely object to.
I was hoping KVM could make the ACQ_REL capability an extension of DIRTY_LOG_RING,
i.e. userspace would DIRTY_LOG_RING _and_ DIRTY_LOG_RING_ACQ_REL for ARM and other
architectures, e.g.
int enable_dirty_ring(void)
{
if (!kvm_check(KVM_CAP_DIRTY_LOG_RING))
return -EINVAL;
if (!tso && !kvm_check(KVM_CAP_DIRTY_LOG_RING_ACQ_REL))
return -EINVAL;
return kvm_enable(KVM_CAP_DIRTY_LOG_RING);
}
But I failed to consider that userspace might try to enable DIRTY_LOG_RING on
all architectures, i.e. wouldn't arbitrarily restrict DIRTY_LOG_RING to x86.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
WARNING: multiple messages have this Message-ID (diff)
From: Sean Christopherson <seanjc@google.com>
To: Marc Zyngier <maz@kernel.org>
Cc: Gavin Shan <gshan@redhat.com>,
kvmarm@lists.linux.dev, kvmarm@lists.cs.columbia.edu,
kvm@vger.kernel.org, peterx@redhat.com, will@kernel.org,
catalin.marinas@arm.com, bgardon@google.com, shuah@kernel.org,
andrew.jones@linux.dev, dmatlack@google.com, pbonzini@redhat.com,
zhenyzha@redhat.com, james.morse@arm.com, suzuki.poulose@arm.com,
alexandru.elisei@arm.com, oliver.upton@linux.dev,
shan.gavin@gmail.com
Subject: Re: [PATCH v6 3/8] KVM: Add support for using dirty ring in conjunction with bitmap
Date: Mon, 24 Oct 2022 23:50:29 +0000 [thread overview]
Message-ID: <Y1ckxYst3tc0LCqb@google.com> (raw)
Message-ID: <20221024235029.3wy41p0MBDw-jMF-4HEwrK8j5O0gL09fz5Ckq3kJDUg@z> (raw)
In-Reply-To: <87edv0gnb3.wl-maz@kernel.org>
On Sat, Oct 22, 2022, Marc Zyngier wrote:
> On Fri, 21 Oct 2022 17:05:26 +0100, Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Fri, Oct 21, 2022, Marc Zyngier wrote:
> > > Because dirtying memory outside of a vcpu context makes it
> > > incredibly awkward to handle a "ring full" condition?
> >
> > Kicking all vCPUs with the soft-full request isn't _that_ awkward.
> > It's certainly sub-optimal, but if inserting into the per-VM ring is
> > relatively rare, then in practice it's unlikely to impact guest
> > performance.
>
> But there is *nothing* to kick here. The kernel is dirtying pages,
> devices are dirtying pages (DMA), and there is no context associated
> with that. Which is why a finite ring is the wrong abstraction.
I don't follow. If there's a VM, KVM can always kick all vCPUs. Again, might
be far from optimal, but it's an option. If there's literally no VM, then KVM
isn't involved at all and there's no "ring vs. bitmap" decision.
> > Would it be possible to require a dirty bitmap when an ITS is
> > created? That would allow treating the above condition as a KVM
> > bug.
>
> No. This should be optional. Everything about migration should be
> absolutely optional (I run plenty of concurrent VMs on sub-2GB
> systems). You want to migrate a VM that has an ITS or will collect
> dirty bits originating from a SMMU with HTTU, you enable the dirty
> bitmap. You want to have *vcpu* based dirty rings, you enable them.
>
> In short, there shouldn't be any reason for the two are either
> mandatory or conflated. Both should be optional, independent, because
> they cover completely disjoined use cases. *userspace* should be in
> charge of deciding this.
I agree about userspace being in control, what I want to avoid is letting userspace
put KVM into a bad state without any indication from KVM that the setup is wrong
until something actually dirties a page.
Specifically, if mark_page_dirty_in_slot() is invoked without a running vCPU, on
a memslot with dirty tracking enabled but without a dirty bitmap, then the migration
is doomed. Dropping the dirty page isn't a sane response as that'd all but
guaranatee memory corruption in the guest. At best, KVM could kick all vCPUs out
to userspace with a new exit reason, but that's not a very good experience for
userspace as either the VM is unexpectedly unmigratable or the VM ends up being
killed (or I suppose userspace could treat the exit as a per-VM dirty ring of
size '1').
That's why I asked if it's possible for KVM to require a dirty_bitmap when KVM
might end up collecting dirty information without a vCPU. KVM is still
technically prescribing a solution to userspace, but only because there's only
one solution.
> > > > The acquire-release thing is irrelevant for x86, and no other
> > > > architecture supports the dirty ring until this series, i.e. there's
> > > > no need for KVM to detect that userspace has been updated to gain
> > > > acquire-release semantics, because the fact that userspace is
> > > > enabling the dirty ring on arm64 means userspace has been updated.
> > >
> > > Do we really need to make the API more awkward? There is an
> > > established pattern of "enable what is advertised". Some level of
> > > uniformity wouldn't hurt, really.
> >
> > I agree that uniformity would be nice, but for capabilities I don't
> > think that's ever going to happen. I'm pretty sure supporting
> > enabling is actually in the minority. E.g. of the 20 capabilities
> > handled in kvm_vm_ioctl_check_extension_generic(), I believe only 3
> > support enabling (KVM_CAP_HALT_POLL, KVM_CAP_DIRTY_LOG_RING, and
> > KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2).
>
> I understood that you were advocating that a check for KVM_CAP_FOO
> could result in enabling KVM_CAP_BAR. That I definitely object to.
I was hoping KVM could make the ACQ_REL capability an extension of DIRTY_LOG_RING,
i.e. userspace would DIRTY_LOG_RING _and_ DIRTY_LOG_RING_ACQ_REL for ARM and other
architectures, e.g.
int enable_dirty_ring(void)
{
if (!kvm_check(KVM_CAP_DIRTY_LOG_RING))
return -EINVAL;
if (!tso && !kvm_check(KVM_CAP_DIRTY_LOG_RING_ACQ_REL))
return -EINVAL;
return kvm_enable(KVM_CAP_DIRTY_LOG_RING);
}
But I failed to consider that userspace might try to enable DIRTY_LOG_RING on
all architectures, i.e. wouldn't arbitrarily restrict DIRTY_LOG_RING to x86.
next prev parent reply other threads:[~2022-10-24 23:50 UTC|newest]
Thread overview: 86+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-11 6:14 [PATCH v6 0/8] KVM: arm64: Enable ring-based dirty memory tracking Gavin Shan
2022-10-11 6:14 ` Gavin Shan
2022-10-11 6:14 ` [PATCH v6 1/8] KVM: x86: Introduce KVM_REQ_RING_SOFT_FULL Gavin Shan
2022-10-11 6:14 ` Gavin Shan
2022-10-20 22:42 ` Sean Christopherson
2022-10-20 22:42 ` Sean Christopherson
2022-10-21 5:54 ` Gavin Shan
2022-10-21 5:54 ` Gavin Shan
2022-10-21 15:25 ` Sean Christopherson
2022-10-21 15:25 ` Sean Christopherson
2022-10-21 23:03 ` Gavin Shan
2022-10-21 23:03 ` Gavin Shan
2022-10-21 23:48 ` Sean Christopherson
2022-10-21 23:48 ` Sean Christopherson
2022-10-22 0:16 ` Gavin Shan
2022-10-22 0:16 ` Gavin Shan
2022-10-11 6:14 ` [PATCH v6 2/8] KVM: x86: Move declaration of kvm_cpu_dirty_log_size() to kvm_dirty_ring.h Gavin Shan
2022-10-11 6:14 ` Gavin Shan
2022-10-11 6:14 ` [PATCH v6 3/8] KVM: Add support for using dirty ring in conjunction with bitmap Gavin Shan
2022-10-11 6:14 ` Gavin Shan
2022-10-18 16:07 ` Peter Xu
2022-10-18 16:07 ` Peter Xu
2022-10-18 22:20 ` Gavin Shan
2022-10-18 22:20 ` Gavin Shan
2022-10-20 18:58 ` Oliver Upton
2022-10-20 18:58 ` Oliver Upton
2022-10-20 23:44 ` Sean Christopherson
2022-10-20 23:44 ` Sean Christopherson
2022-10-21 8:06 ` Marc Zyngier
2022-10-21 8:06 ` Marc Zyngier
2022-10-21 16:05 ` Sean Christopherson
2022-10-21 16:05 ` Sean Christopherson
2022-10-22 8:27 ` Gavin Shan
2022-10-22 8:27 ` Gavin Shan
2022-10-22 10:54 ` Marc Zyngier
2022-10-22 10:54 ` Marc Zyngier
2022-10-22 10:33 ` Marc Zyngier
2022-10-22 10:33 ` Marc Zyngier
2022-10-24 23:50 ` Sean Christopherson [this message]
2022-10-24 23:50 ` Sean Christopherson
2022-10-25 0:08 ` Sean Christopherson
2022-10-25 0:08 ` Sean Christopherson
2022-10-25 0:24 ` Oliver Upton
2022-10-25 0:24 ` Oliver Upton
2022-10-25 7:31 ` Marc Zyngier
2022-10-25 7:31 ` Marc Zyngier
2022-10-25 17:47 ` Sean Christopherson
2022-10-25 17:47 ` Sean Christopherson
2022-10-27 8:29 ` Marc Zyngier
2022-10-27 8:29 ` Marc Zyngier
2022-10-27 17:44 ` Sean Christopherson
2022-10-27 17:44 ` Sean Christopherson
2022-10-27 18:30 ` Marc Zyngier
2022-10-27 18:30 ` Marc Zyngier
2022-10-27 19:09 ` Sean Christopherson
2022-10-27 19:09 ` Sean Christopherson
2022-10-28 6:43 ` Gavin Shan
2022-10-28 6:43 ` Gavin Shan
2022-10-28 16:51 ` Sean Christopherson
2022-10-28 16:51 ` Sean Christopherson
2022-10-31 3:37 ` Gavin Shan
2022-10-31 3:37 ` Gavin Shan
2022-10-31 9:08 ` Marc Zyngier
2022-10-31 9:08 ` Marc Zyngier
2022-10-31 22:48 ` Gavin Shan
2022-10-31 22:48 ` Gavin Shan
2022-10-25 7:22 ` Marc Zyngier
2022-10-25 7:22 ` Marc Zyngier
2022-10-21 10:13 ` Gavin Shan
2022-10-21 10:13 ` Gavin Shan
2022-10-21 23:20 ` Sean Christopherson
2022-10-21 23:20 ` Sean Christopherson
2022-10-22 0:33 ` Gavin Shan
2022-10-22 0:33 ` Gavin Shan
2022-10-11 6:14 ` [PATCH v6 4/8] KVM: arm64: Enable ring-based dirty memory tracking Gavin Shan
2022-10-11 6:14 ` Gavin Shan
2022-10-11 6:14 ` [PATCH v6 5/8] KVM: selftests: Enable KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP if possible Gavin Shan
2022-10-11 6:14 ` Gavin Shan
2022-10-11 6:14 ` [PATCH v6 6/8] KVM: selftests: Use host page size to map ring buffer in dirty_log_test Gavin Shan
2022-10-11 6:14 ` Gavin Shan
2022-10-11 6:14 ` [PATCH v6 7/8] KVM: selftests: Clear dirty ring states between two modes " Gavin Shan
2022-10-11 6:14 ` Gavin Shan
2022-10-11 6:14 ` [PATCH v6 8/8] KVM: selftests: Automate choosing dirty ring size " Gavin Shan
2022-10-11 6:14 ` Gavin Shan
2022-10-11 6:23 ` [PATCH v6 0/8] KVM: arm64: Enable ring-based dirty memory tracking Gavin Shan
2022-10-11 6:23 ` Gavin Shan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y1ckxYst3tc0LCqb@google.com \
--to=seanjc@google.com \
--cc=andrew.jones@linux.dev \
--cc=bgardon@google.com \
--cc=catalin.marinas@arm.com \
--cc=dmatlack@google.com \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.cs.columbia.edu \
--cc=kvmarm@lists.linux.dev \
--cc=maz@kernel.org \
--cc=pbonzini@redhat.com \
--cc=shan.gavin@gmail.com \
--cc=shuah@kernel.org \
--cc=will@kernel.org \
--cc=zhenyzha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.