public inbox for kvm@vger.kernel.org
From: Marc Zyngier <maz@kernel.org>
To: Andrew Jones <drjones@redhat.com>
Cc: kvm@vger.kernel.org, kvmarm@lists.cs.columbia.edu,
	pbonzini@redhat.com, steven.price@arm.com
Subject: Re: [PATCH 2/4] arm64/x86: KVM: Introduce steal time cap
Date: Mon, 22 Jun 2020 10:51:47 +0100
Message-ID: <5a52210e5f123d52459f15c594e77bad@kernel.org>
In-Reply-To: <20200622084110.uosiqx3oy22lremu@kamzik.brq.redhat.com>

On 2020-06-22 09:41, Andrew Jones wrote:
> On Mon, Jun 22, 2020 at 09:20:02AM +0100, Marc Zyngier wrote:
>> Hi Andrew,
>> 
>> On 2020-06-19 19:46, Andrew Jones wrote:
>> > arm64 requires a vcpu fd (KVM_HAS_DEVICE_ATTR vcpu ioctl) to probe
>> > support for steal time. However this is unnecessary and complicates
>> > userspace (userspace may prefer delaying vcpu creation until after
>> > feature probing). Since probing steal time only requires a KVM fd,
>> > we introduce a cap that can be checked.
>> 
>> So this is purely an API convenience, right? You want a way to
>> identify the presence of steal time accounting without having to
>> create a vcpu? It would have been nice to have this requirement
>> before we merged this code :-(.
> 
> Yes. I wish I had considered it more closely when I was reviewing the
> patches. And, I believe we have yet another user interface issue that
> I'm looking at now. Without the VCPU feature bit I'm not sure how easy
> it will be for a migration to fail when attempting to migrate from a
> host with steal-time enabled to one that does not support steal-time.
> So it's starting to look like steal-time should have followed the pmu
> pattern completely, not just the vcpu device ioctl part.

Should we consider disabling steal time altogether until this is
worked out?
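
For context, the vcpu-less probe that the quoted patch proposes could be
exercised from userspace roughly as follows. This is only a sketch:
probe_steal_time_cap() is a hypothetical helper name, and it assumes the
installed <linux/kvm.h> already defines KVM_CAP_STEAL_TIME. On older
headers it reports failure rather than guessing the capability number.

```c
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper: ask KVM whether steal time accounting is usable.
 * Returns > 0 if supported, 0 if not, < 0 on error (bad fd, or uapi
 * headers that predate KVM_CAP_STEAL_TIME).
 */
static int probe_steal_time_cap(int kvm_fd)
{
#ifdef KVM_CAP_STEAL_TIME
	/* KVM_CHECK_EXTENSION only needs the /dev/kvm fd, not a vcpu fd. */
	return ioctl(kvm_fd, KVM_CHECK_EXTENSION, KVM_CAP_STEAL_TIME);
#else
	(void)kvm_fd;
	return -1;
#endif
}
```

A caller would pass the fd obtained from opening /dev/kvm, before any VM
or vcpu has been created.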

>> 
>> > Additionally, when probing
>> > steal time we should check delayacct_on, because even though
>> > CONFIG_KVM selects TASK_DELAY_ACCT, it's possible for the host
>> > kernel to have delay accounting disabled with the 'nodelayacct'
>> > command line option. x86 already determines support for steal time
>> > by checking delayacct_on and can already probe steal time support
>> > with a kvm fd (KVM_GET_SUPPORTED_CPUID), but we add the cap there
>> > too for consistency.
>> >
>> > Signed-off-by: Andrew Jones <drjones@redhat.com>
>> > ---
>> >  Documentation/virt/kvm/api.rst | 11 +++++++++++
>> >  arch/arm64/kvm/arm.c           |  3 +++
>> >  arch/x86/kvm/x86.c             |  3 +++
>> >  include/uapi/linux/kvm.h       |  1 +
>> >  4 files changed, 18 insertions(+)
>> >
>> > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
>> > index 9a12ea498dbb..05b1fdb88383 100644
>> > --- a/Documentation/virt/kvm/api.rst
>> > +++ b/Documentation/virt/kvm/api.rst
>> > @@ -6151,3 +6151,14 @@ KVM can therefore start protected VMs.
>> >  This capability governs the KVM_S390_PV_COMMAND ioctl and the
>> >  KVM_MP_STATE_LOAD MP_STATE. KVM_SET_MP_STATE can fail for protected
>> >  guests when the state change is invalid.
>> > +
>> > +8.24 KVM_CAP_STEAL_TIME
>> > +-----------------------
>> > +
>> > +:Architectures: arm64, x86
>> > +
>> > +This capability indicates that KVM supports steal time accounting.
>> > +When steal time accounting is supported it may be enabled with
>> > +architecture-specific interfaces.  For x86 see
>> > +Documentation/virt/kvm/msr.rst "MSR_KVM_STEAL_TIME".  For arm64 see
>> > +Documentation/virt/kvm/devices/vcpu.rst "KVM_ARM_VCPU_PVTIME_CTRL"
>> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>> > index 90cb90561446..f6dca6d09952 100644
>> > --- a/arch/arm64/kvm/arm.c
>> > +++ b/arch/arm64/kvm/arm.c
>> > @@ -222,6 +222,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>> >  		 */
>> >  		r = 1;
>> >  		break;
>> > +	case KVM_CAP_STEAL_TIME:
>> > +		r = sched_info_on();
>> > +		break;
>> >  	default:
>> >  		r = kvm_arch_vm_ioctl_check_extension(kvm, ext);
>> >  		break;
>> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> > index 00c88c2f34e4..ced6335e403e 100644
>> > --- a/arch/x86/kvm/x86.c
>> > +++ b/arch/x86/kvm/x86.c
>> > @@ -3533,6 +3533,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>> >  	case KVM_CAP_HYPERV_ENLIGHTENED_VMCS:
>> >  		r = kvm_x86_ops.nested_ops->enable_evmcs != NULL;
>> >  		break;
>> > +	case KVM_CAP_STEAL_TIME:
>> > +		r = sched_info_on();
>> > +		break;
>> >  	default:
>> >  		break;
>> >  	}
>> > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> > index 4fdf30316582..121fb29ac004 100644
>> > --- a/include/uapi/linux/kvm.h
>> > +++ b/include/uapi/linux/kvm.h
>> > @@ -1031,6 +1031,7 @@ struct kvm_ppc_resize_hpt {
>> >  #define KVM_CAP_PPC_SECURE_GUEST 181
>> >  #define KVM_CAP_HALT_POLL 182
>> >  #define KVM_CAP_ASYNC_PF_INT 183
>> > +#define KVM_CAP_STEAL_TIME 184
>> >
>> >  #ifdef KVM_CAP_IRQ_ROUTING
>> 
>> Shouldn't you also add the same check of sched_info_on() to
>> the various pvtime attribute handling functions? It feels odd
>> that the capability can say "no", and yet we'd accept userspace
>> messing with the steal time parameters...
> 
> I considered that, but the 'has attr' interface is really only asking
> if the interface exists, not if it should be used. I'm not sure what
> we should do about it other than document that the cap needs to say
> it's usable, rather than just the attr presence. But, since we've had
> the attr merged quite a while without the cap, then maybe we can't
> rely on a doc change alone?

Accepting the pvtime attributes (setting up the per-vcpu area) amounts
to a promise, made to both the guest and userspace, that we will provide
the guest with steal time. By not checking sched_info_on(), we lie to
both, with potential consequences. It really feels like a bug.
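
To make the concern concrete, here is a small standalone model (not
kernel code) of gating the attribute handler on the same predicate as
the capability. Every model_* name is hypothetical, model_sched_info_on()
merely stands in for the kernel's sched_info_on(), and returning -ENXIO
is an assumption about what a fix might choose.

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative toggle standing in for the host's scheduler accounting state. */
static bool model_sched_info_enabled;

/* Stand-in for the kernel's sched_info_on(); not the real symbol. */
static int model_sched_info_on(void)
{
	return model_sched_info_enabled;
}

/* Mirrors the KVM_CAP_STEAL_TIME case from the quoted patch. */
static int model_check_steal_time_cap(void)
{
	return model_sched_info_on();
}

/*
 * Hypothetical pvtime attribute handler: refuse to record the per-vcpu
 * area when steal time cannot actually be provided, so that the
 * capability and the attribute interface give the same answer.
 */
static int model_pvtime_set_attr(unsigned long long ipa,
				 unsigned long long *base)
{
	if (!model_sched_info_on())
		return -ENXIO;	/* error code is an assumption */

	*base = ipa;
	return 0;
}
```

With the toggle off, both the capability check and the attribute handler
say "no"; with it on, both say "yes".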

Thanks,

          M.
-- 
Jazz is not dead. It just smells funny...
