From: Sean Christopherson <seanjc@google.com>
To: Gavin Shan <gshan@redhat.com>
Cc: shuah@kernel.org, kvm@vger.kernel.org, maz@kernel.org,
bgardon@google.com, andrew.jones@linux.dev, dmatlack@google.com,
shan.gavin@gmail.com, catalin.marinas@arm.com,
kvmarm@lists.linux.dev, pbonzini@redhat.com, zhenyzha@redhat.com,
will@kernel.org, kvmarm@lists.cs.columbia.edu
Subject: Re: [PATCH v6 3/8] KVM: Add support for using dirty ring in conjunction with bitmap
Date: Fri, 21 Oct 2022 23:20:33 +0000 [thread overview]
Message-ID: <Y1MpQTprxk+XdYFb@google.com> (raw)
In-Reply-To: <0adc538b-594e-c662-5a38-3ca6b98ab059@redhat.com>
On Fri, Oct 21, 2022, Gavin Shan wrote:
> > What about inverting the naming to better capture that this is about the dirty
> > bitmap, and less so about the dirty ring? It's not obvious what "exclusive"
> > means, e.g. I saw this stub before reading the changelog and assumed it was
> > making a dirty ring exclusive to something.
> >
> > Something like this?
> >
> > bool kvm_use_dirty_bitmap(struct kvm *kvm)
> > {
> > return !kvm->dirty_ring_size || kvm->dirty_ring_with_bitmap;
> > }
> >
>
> If you agree, I would rename is to kvm_dirty_ring_use_bitmap(). In this way,
> we will have "kvm_dirty_ring" prefix for the function name, consistent with
> other functions from same module.
I'd prefer to avoid "ring" in the name at all, because in the common case (well,
legacy case at least) the dirty ring has nothing to do with using the dirty
bitmap, e.g. this code ends up being very confusing because the "dirty_ring"
part implies that KVM _doesn't_ need to allocate the bitmap when the dirty ring
isn't being used.
if (!(new->flags & KVM_MEM_LOG_DIRTY_PAGES))
new->dirty_bitmap = NULL;
else if (old && old->dirty_bitmap)
new->dirty_bitmap = old->dirty_bitmap;
else if (kvm_dirty_ring_use_bitmap(kvm) {
r = kvm_alloc_dirty_bitmap(new);
if (r)
return r;
if (kvm_dirty_log_manual_protect_and_init_set(kvm))
bitmap_set(new->dirty_bitmap, 0, new->npages);
}
The helper exists because the dirty ring exists, but the helper is fundamentally
about the dirty bitmap, not the ring.
> > But dirty_ring_with_bitmap really shouldn't need to exist. It's mandatory for
> > architectures that have HAVE_KVM_DIRTY_RING_WITH_BITMAP, and unsupported for
> > architectures that don't. In other words, the API for enabling the dirty ring
> > is a bit ugly.
> >
> > Rather than add KVM_CAP_DIRTY_LOG_RING_ACQ_REL, which hasn't been officially
> > released yet, and then KVM_CAP_DIRTY_LOG_ING_WITH_BITMAP on top, what about
> > usurping bits 63:32 of cap->args[0] for flags? E.g.
> >
> > Ideally we'd use cap->flags directly, but we screwed up with KVM_CAP_DIRTY_LOG_RING
> > and didn't require flags to be zero :-(
> >
> > Actually, what's the point of allowing KVM_CAP_DIRTY_LOG_RING_ACQ_REL to be
> > enabled? I get why KVM would enumerate this info, i.e. allowing checking, but I
> > don't seen any value in supporting a second method for enabling the dirty ring.
> >
> > The acquire-release thing is irrelevant for x86, and no other architecture
> > supports the dirty ring until this series, i.e. there's no need for KVM to detect
> > that userspace has been updated to gain acquire-release semantics, because the
> > fact that userspace is enabling the dirty ring on arm64 means userspace has been
> > updated.
> >
> > Same goes for the "with bitmap" capability. There are no existing arm64 users,
> > so there's no risk of breaking existing userspace by suddenly shoving stuff into
> > the dirty bitmap.
> >
> > KVM doesn't even get the enabling checks right, e.g. KVM_CAP_DIRTY_LOG_RING can be
> > enabled on architectures that select CONFIG_HAVE_KVM_DIRTY_RING_ACQ_REL but not
> > KVM_CAP_DIRTY_LOG_RING. The reverse is true (ignoring that x86 selects both and
> > is the only arch that selects the TSO variant).
> >
> > Ditto for KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP...
>
> If I didn't miss anything in the previous discussions, we don't want to make
> KVM_CAP_DIRTY_LOG_RING_ACQ_REL and KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP
> architecture dependent. If they become architecture dependent, the userspace
> will have different stubs (x86, arm64, other architectures to support
> dirty-ring in future) to enable those capabilities. It's not friendly to
> userspace. So I intend to prefer the existing pattern: advertise, enable. To
> enable a capability without knowing if it's supported sounds a bit weird to
> me.
Enabling without KVM advertising that it's supported would indeed be odd. Ugh,
and QEMU doesn't have existing checks to restrict the dirty ring to x86, i.e. we
can't make the ACQ_REL capability a true attribute without breaking userspace.
Rats.
> I think it's a good idea to enable KVM_CAP_DIRTY_LOG_RING_{ACQ_REL, WITH_BITMAP} as
> flags, instead of standalone capabilities. In this way, those two capabilities can
> be treated as sub-capability of KVM_CAP_DIRTY_LOG_RING. The question is how these
> two flags can be exposed by kvm_vm_ioctl_check_extension_generic(), if we really
> want to expose those two flags.
>
> I don't understand your question on how KVM has wrong checks when KVM_CAP_DIRTY_LOG_RING
> and KVM_CAP_DIRTY_LOG_RING_ACQ_REL are enabled.
In the current code base, KVM only checks that _a_ form of dirty ring is supported,
by way of kvm_vm_ioctl_enable_dirty_log_ring()'s check on KVM_DIRTY_LOG_PAGE_OFFSET.
The callers don't verify that the "correct" capability is enabled.
case KVM_CAP_DIRTY_LOG_RING:
case KVM_CAP_DIRTY_LOG_RING_ACQ_REL:
return kvm_vm_ioctl_enable_dirty_log_ring(kvm, cap->args[0]);
E.g. userspace could do
if (kvm_check(KVM_CAP_DIRTY_LOG_RING_ACQ_REL))
kvm_enable(KVM_CAP_DIRTY_LOG_RING)
and KVM would happily enable the dirty ring. Functionally it doesn't cause
problems, it's just weird.
Heh, we can fix without more ifdeffery by using the check internally.
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index e30f1b4ecfa5..300489a0eba5 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4585,6 +4585,8 @@ static int kvm_vm_ioctl_enable_cap_generic(struct kvm *kvm,
}
case KVM_CAP_DIRTY_LOG_RING:
case KVM_CAP_DIRTY_LOG_RING_ACQ_REL:
+ if (!kvm_vm_ioctl_check_extension_generic(kvm, cap->cap))
+ return -EINVAL;
return kvm_vm_ioctl_enable_dirty_log_ring(kvm, cap->args[0]);
default:
return kvm_vm_ioctl_enable_cap(kvm, cap);
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
WARNING: multiple messages have this Message-ID (diff)
From: Sean Christopherson <seanjc@google.com>
To: Gavin Shan <gshan@redhat.com>
Cc: kvmarm@lists.linux.dev, kvmarm@lists.cs.columbia.edu,
kvm@vger.kernel.org, peterx@redhat.com, maz@kernel.org,
will@kernel.org, catalin.marinas@arm.com, bgardon@google.com,
shuah@kernel.org, andrew.jones@linux.dev, dmatlack@google.com,
pbonzini@redhat.com, zhenyzha@redhat.com, james.morse@arm.com,
suzuki.poulose@arm.com, alexandru.elisei@arm.com,
oliver.upton@linux.dev, shan.gavin@gmail.com
Subject: Re: [PATCH v6 3/8] KVM: Add support for using dirty ring in conjunction with bitmap
Date: Fri, 21 Oct 2022 23:20:33 +0000 [thread overview]
Message-ID: <Y1MpQTprxk+XdYFb@google.com> (raw)
Message-ID: <20221021232033.LjjUlmzKzSAzOXhdBCmulpV-jLpd502WJQegj2IaT5E@z> (raw)
In-Reply-To: <0adc538b-594e-c662-5a38-3ca6b98ab059@redhat.com>
On Fri, Oct 21, 2022, Gavin Shan wrote:
> > What about inverting the naming to better capture that this is about the dirty
> > bitmap, and less so about the dirty ring? It's not obvious what "exclusive"
> > means, e.g. I saw this stub before reading the changelog and assumed it was
> > making a dirty ring exclusive to something.
> >
> > Something like this?
> >
> > bool kvm_use_dirty_bitmap(struct kvm *kvm)
> > {
> > return !kvm->dirty_ring_size || kvm->dirty_ring_with_bitmap;
> > }
> >
>
> If you agree, I would rename is to kvm_dirty_ring_use_bitmap(). In this way,
> we will have "kvm_dirty_ring" prefix for the function name, consistent with
> other functions from same module.
I'd prefer to avoid "ring" in the name at all, because in the common case (well,
legacy case at least) the dirty ring has nothing to do with using the dirty
bitmap, e.g. this code ends up being very confusing because the "dirty_ring"
part implies that KVM _doesn't_ need to allocate the bitmap when the dirty ring
isn't being used.
if (!(new->flags & KVM_MEM_LOG_DIRTY_PAGES))
new->dirty_bitmap = NULL;
else if (old && old->dirty_bitmap)
new->dirty_bitmap = old->dirty_bitmap;
else if (kvm_dirty_ring_use_bitmap(kvm) {
r = kvm_alloc_dirty_bitmap(new);
if (r)
return r;
if (kvm_dirty_log_manual_protect_and_init_set(kvm))
bitmap_set(new->dirty_bitmap, 0, new->npages);
}
The helper exists because the dirty ring exists, but the helper is fundamentally
about the dirty bitmap, not the ring.
> > But dirty_ring_with_bitmap really shouldn't need to exist. It's mandatory for
> > architectures that have HAVE_KVM_DIRTY_RING_WITH_BITMAP, and unsupported for
> > architectures that don't. In other words, the API for enabling the dirty ring
> > is a bit ugly.
> >
> > Rather than add KVM_CAP_DIRTY_LOG_RING_ACQ_REL, which hasn't been officially
> > released yet, and then KVM_CAP_DIRTY_LOG_ING_WITH_BITMAP on top, what about
> > usurping bits 63:32 of cap->args[0] for flags? E.g.
> >
> > Ideally we'd use cap->flags directly, but we screwed up with KVM_CAP_DIRTY_LOG_RING
> > and didn't require flags to be zero :-(
> >
> > Actually, what's the point of allowing KVM_CAP_DIRTY_LOG_RING_ACQ_REL to be
> > enabled? I get why KVM would enumerate this info, i.e. allowing checking, but I
> > don't seen any value in supporting a second method for enabling the dirty ring.
> >
> > The acquire-release thing is irrelevant for x86, and no other architecture
> > supports the dirty ring until this series, i.e. there's no need for KVM to detect
> > that userspace has been updated to gain acquire-release semantics, because the
> > fact that userspace is enabling the dirty ring on arm64 means userspace has been
> > updated.
> >
> > Same goes for the "with bitmap" capability. There are no existing arm64 users,
> > so there's no risk of breaking existing userspace by suddenly shoving stuff into
> > the dirty bitmap.
> >
> > KVM doesn't even get the enabling checks right, e.g. KVM_CAP_DIRTY_LOG_RING can be
> > enabled on architectures that select CONFIG_HAVE_KVM_DIRTY_RING_ACQ_REL but not
> > KVM_CAP_DIRTY_LOG_RING. The reverse is true (ignoring that x86 selects both and
> > is the only arch that selects the TSO variant).
> >
> > Ditto for KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP...
>
> If I didn't miss anything in the previous discussions, we don't want to make
> KVM_CAP_DIRTY_LOG_RING_ACQ_REL and KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP
> architecture dependent. If they become architecture dependent, the userspace
> will have different stubs (x86, arm64, other architectures to support
> dirty-ring in future) to enable those capabilities. It's not friendly to
> userspace. So I intend to prefer the existing pattern: advertise, enable. To
> enable a capability without knowing if it's supported sounds a bit weird to
> me.
Enabling without KVM advertising that it's supported would indeed be odd. Ugh,
and QEMU doesn't have existing checks to restrict the dirty ring to x86, i.e. we
can't make the ACQ_REL capability a true attribute without breaking userspace.
Rats.
> I think it's a good idea to enable KVM_CAP_DIRTY_LOG_RING_{ACQ_REL, WITH_BITMAP} as
> flags, instead of standalone capabilities. In this way, those two capabilities can
> be treated as sub-capability of KVM_CAP_DIRTY_LOG_RING. The question is how these
> two flags can be exposed by kvm_vm_ioctl_check_extension_generic(), if we really
> want to expose those two flags.
>
> I don't understand your question on how KVM has wrong checks when KVM_CAP_DIRTY_LOG_RING
> and KVM_CAP_DIRTY_LOG_RING_ACQ_REL are enabled.
In the current code base, KVM only checks that _a_ form of dirty ring is supported,
by way of kvm_vm_ioctl_enable_dirty_log_ring()'s check on KVM_DIRTY_LOG_PAGE_OFFSET.
The callers don't verify that the "correct" capability is enabled.
case KVM_CAP_DIRTY_LOG_RING:
case KVM_CAP_DIRTY_LOG_RING_ACQ_REL:
return kvm_vm_ioctl_enable_dirty_log_ring(kvm, cap->args[0]);
E.g. userspace could do
if (kvm_check(KVM_CAP_DIRTY_LOG_RING_ACQ_REL))
kvm_enable(KVM_CAP_DIRTY_LOG_RING)
and KVM would happily enable the dirty ring. Functionally it doesn't cause
problems, it's just weird.
Heh, we can fix without more ifdeffery by using the check internally.
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index e30f1b4ecfa5..300489a0eba5 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4585,6 +4585,8 @@ static int kvm_vm_ioctl_enable_cap_generic(struct kvm *kvm,
}
case KVM_CAP_DIRTY_LOG_RING:
case KVM_CAP_DIRTY_LOG_RING_ACQ_REL:
+ if (!kvm_vm_ioctl_check_extension_generic(kvm, cap->cap))
+ return -EINVAL;
return kvm_vm_ioctl_enable_dirty_log_ring(kvm, cap->args[0]);
default:
return kvm_vm_ioctl_enable_cap(kvm, cap);
next prev parent reply other threads:[~2022-10-21 23:20 UTC|newest]
Thread overview: 86+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-11 6:14 [PATCH v6 0/8] KVM: arm64: Enable ring-based dirty memory tracking Gavin Shan
2022-10-11 6:14 ` Gavin Shan
2022-10-11 6:14 ` [PATCH v6 1/8] KVM: x86: Introduce KVM_REQ_RING_SOFT_FULL Gavin Shan
2022-10-11 6:14 ` Gavin Shan
2022-10-20 22:42 ` Sean Christopherson
2022-10-20 22:42 ` Sean Christopherson
2022-10-21 5:54 ` Gavin Shan
2022-10-21 5:54 ` Gavin Shan
2022-10-21 15:25 ` Sean Christopherson
2022-10-21 15:25 ` Sean Christopherson
2022-10-21 23:03 ` Gavin Shan
2022-10-21 23:03 ` Gavin Shan
2022-10-21 23:48 ` Sean Christopherson
2022-10-21 23:48 ` Sean Christopherson
2022-10-22 0:16 ` Gavin Shan
2022-10-22 0:16 ` Gavin Shan
2022-10-11 6:14 ` [PATCH v6 2/8] KVM: x86: Move declaration of kvm_cpu_dirty_log_size() to kvm_dirty_ring.h Gavin Shan
2022-10-11 6:14 ` Gavin Shan
2022-10-11 6:14 ` [PATCH v6 3/8] KVM: Add support for using dirty ring in conjunction with bitmap Gavin Shan
2022-10-11 6:14 ` Gavin Shan
2022-10-18 16:07 ` Peter Xu
2022-10-18 16:07 ` Peter Xu
2022-10-18 22:20 ` Gavin Shan
2022-10-18 22:20 ` Gavin Shan
2022-10-20 18:58 ` Oliver Upton
2022-10-20 18:58 ` Oliver Upton
2022-10-20 23:44 ` Sean Christopherson
2022-10-20 23:44 ` Sean Christopherson
2022-10-21 8:06 ` Marc Zyngier
2022-10-21 8:06 ` Marc Zyngier
2022-10-21 16:05 ` Sean Christopherson
2022-10-21 16:05 ` Sean Christopherson
2022-10-22 8:27 ` Gavin Shan
2022-10-22 8:27 ` Gavin Shan
2022-10-22 10:54 ` Marc Zyngier
2022-10-22 10:54 ` Marc Zyngier
2022-10-22 10:33 ` Marc Zyngier
2022-10-22 10:33 ` Marc Zyngier
2022-10-24 23:50 ` Sean Christopherson
2022-10-24 23:50 ` Sean Christopherson
2022-10-25 0:08 ` Sean Christopherson
2022-10-25 0:08 ` Sean Christopherson
2022-10-25 0:24 ` Oliver Upton
2022-10-25 0:24 ` Oliver Upton
2022-10-25 7:31 ` Marc Zyngier
2022-10-25 7:31 ` Marc Zyngier
2022-10-25 17:47 ` Sean Christopherson
2022-10-25 17:47 ` Sean Christopherson
2022-10-27 8:29 ` Marc Zyngier
2022-10-27 8:29 ` Marc Zyngier
2022-10-27 17:44 ` Sean Christopherson
2022-10-27 17:44 ` Sean Christopherson
2022-10-27 18:30 ` Marc Zyngier
2022-10-27 18:30 ` Marc Zyngier
2022-10-27 19:09 ` Sean Christopherson
2022-10-27 19:09 ` Sean Christopherson
2022-10-28 6:43 ` Gavin Shan
2022-10-28 6:43 ` Gavin Shan
2022-10-28 16:51 ` Sean Christopherson
2022-10-28 16:51 ` Sean Christopherson
2022-10-31 3:37 ` Gavin Shan
2022-10-31 3:37 ` Gavin Shan
2022-10-31 9:08 ` Marc Zyngier
2022-10-31 9:08 ` Marc Zyngier
2022-10-31 22:48 ` Gavin Shan
2022-10-31 22:48 ` Gavin Shan
2022-10-25 7:22 ` Marc Zyngier
2022-10-25 7:22 ` Marc Zyngier
2022-10-21 10:13 ` Gavin Shan
2022-10-21 10:13 ` Gavin Shan
2022-10-21 23:20 ` Sean Christopherson [this message]
2022-10-21 23:20 ` Sean Christopherson
2022-10-22 0:33 ` Gavin Shan
2022-10-22 0:33 ` Gavin Shan
2022-10-11 6:14 ` [PATCH v6 4/8] KVM: arm64: Enable ring-based dirty memory tracking Gavin Shan
2022-10-11 6:14 ` Gavin Shan
2022-10-11 6:14 ` [PATCH v6 5/8] KVM: selftests: Enable KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP if possible Gavin Shan
2022-10-11 6:14 ` Gavin Shan
2022-10-11 6:14 ` [PATCH v6 6/8] KVM: selftests: Use host page size to map ring buffer in dirty_log_test Gavin Shan
2022-10-11 6:14 ` Gavin Shan
2022-10-11 6:14 ` [PATCH v6 7/8] KVM: selftests: Clear dirty ring states between two modes " Gavin Shan
2022-10-11 6:14 ` Gavin Shan
2022-10-11 6:14 ` [PATCH v6 8/8] KVM: selftests: Automate choosing dirty ring size " Gavin Shan
2022-10-11 6:14 ` Gavin Shan
2022-10-11 6:23 ` [PATCH v6 0/8] KVM: arm64: Enable ring-based dirty memory tracking Gavin Shan
2022-10-11 6:23 ` Gavin Shan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y1MpQTprxk+XdYFb@google.com \
--to=seanjc@google.com \
--cc=andrew.jones@linux.dev \
--cc=bgardon@google.com \
--cc=catalin.marinas@arm.com \
--cc=dmatlack@google.com \
--cc=gshan@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.cs.columbia.edu \
--cc=kvmarm@lists.linux.dev \
--cc=maz@kernel.org \
--cc=pbonzini@redhat.com \
--cc=shan.gavin@gmail.com \
--cc=shuah@kernel.org \
--cc=will@kernel.org \
--cc=zhenyzha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.