Re: [PATCH RFC 3/4] KVM: Define KVM_USER_MEM_SLOTS in arch-neutral include/linux/kvm_host.h

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Sean Christopherson <seanjc@google.com>
To: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>,
	kvm@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>
Subject: Re: [PATCH RFC 3/4] KVM: Define KVM_USER_MEM_SLOTS in arch-neutral include/linux/kvm_host.h
Date: Wed, 20 Jan 2021 09:25:13 -0800	[thread overview]
Message-ID: <YAhnebWjQCOfLtJ0@google.com> (raw)
In-Reply-To: <87czxz6cl9.fsf@vitty.brq.redhat.com>

On Wed, Jan 20, 2021, Vitaly Kuznetsov wrote:
> Igor Mammedov <imammedo@redhat.com> writes:
> 
> > On Tue, 19 Jan 2021 09:20:42 -0800
> > Sean Christopherson <seanjc@google.com> wrote:
> >
> >> 
> >> Were you planning on adding a capability to check for the new and improved
> >> memslots limit, e.g. to know whether or not KVM might die on a large VM?
> >> If so, requiring the VMM to call an ioctl() to set a higher (or lower?) limit
> >> would be another option.  That wouldn't have the same permission requirements as
> >> a module param, but it would likely be a more effective safeguard in practice,
> >> e.g. use cases with a fixed number of memslots or a well-defined upper bound
> >> could use the capability to limit themselves.
> > Currently QEMU uses KVM_CAP_NR_MEMSLOTS to get limit, and depending on place the
> > limit is reached it either fails gracefully (i.e. it checks if free slot is
> > available before slot allocation) or aborts (in case where it tries to allocate
> > slot without check).
> 
> FWIW, 'synic problem' causes it to abort.
> 
> > New ioctl() seems redundant as we already have upper limit check
> > (unless it would allow go over that limit, which in its turn defeats purpose of
> > the limit).

Gah, and I even looked for an ioctl().  No idea how I didn't find NR_MEMSLOTS.

> Right, I didn't plan to add any new CAP as what we already have should
> be enough to query the limits. Having an ioctl to set the upper limit
> seems complicated: with device and CPU hotplug it may not be easy to
> guess what it should be upfront so VMMs will likely just add a call to
> raise the limit in memslot modification code so it won't be providing
> any new protection.

True.  Maybe the best approach, if we want to make the limit configurable, would
be to make a lower limit opt-in.  I.e. default=KVM_CAP_NR_MEMSLOTS=SHRT_MAX,
with an ioctl() to set a lower limit.  That would also allow us to defer adding
a new ioctl() until someone actually plans on using it.

> >> Thoughts?  An ioctl() feels a little over-engineered, but I suspect that adding
> >> a module param that defaults to N*KVM_MAX_VPCUS will be a waste, e.g. no one
> >> will ever touch the param and we'll end up with dead, rarely-tested code.
> 
> Alternatively, we can hard-code KVM_USER_MEM_SLOTS to N*KVM_MAX_VPCUS so
> no new parameter is needed but personally, I'd prefer to have it
> configurable (in case we decide to set it to something lower than
> SHRT_MAX of course) even if it won't be altered very often (which is a
> good thing for 'general purpose' usage, right?).
>
> First, it will allow tightening the limit for some very specific deployments
> (e.g. FaaS/ Firecracker-style) to say '20' which should be enough.

A module param likely isn't usable for many such deployments though, as it would
require a completely isolated pool of systems.  That's why an ioctl() is
appealing; the expected number of memslots is a property of the VM, not of the
platform.

> Second, we may be overlooking some configurations where the number of
> memslots is actually dependent on the number of vCPUs but nobody complained
> so far just because these configutrarions use a farly small number and the
> ceiling wasn't hit yet.
> 
> One more spare thought. Per-vCPU memslots may come handy if someone
> decides to move some of the KVM PV features to userspace. E.g. I can
> imagine an attempt to move async_pf out of kernel.

Memslots aren't per-vCPU though, any userspace feature that tries to treat them
as such will be flawed in some way.  Practically speaking, per-vCPU memslots
just aren't feasible, at least not without KVM changes that are likely
unpalatable, e.g. KVM would need to incorporate the existence of a vCPU specific
memslot into the MMU role.

next prev parent reply	other threads:[~2021-01-20 17:34 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-15 13:18 [PATCH RFC 0/4] KVM: x86: Drastically raise KVM_USER_MEM_SLOTS limit Vitaly Kuznetsov
2021-01-15 13:18 ` [PATCH RFC 1/4] KVM: x86: Drop redundant KVM_MEM_SLOTS_NUM definition Vitaly Kuznetsov
2021-01-15 16:47   ` Sean Christopherson
2021-01-15 13:18 ` [PATCH RFC 2/4] KVM: mips: Drop KVM_PRIVATE_MEM_SLOTS definition Vitaly Kuznetsov
2021-01-15 13:18 ` [PATCH RFC 3/4] KVM: Define KVM_USER_MEM_SLOTS in arch-neutral include/linux/kvm_host.h Vitaly Kuznetsov
2021-01-15 17:05   ` Sean Christopherson
2021-01-18  9:52     ` Vitaly Kuznetsov
2021-01-19 17:20       ` Sean Christopherson
2021-01-20 11:34         ` Igor Mammedov
2021-01-20 12:02           ` Vitaly Kuznetsov
2021-01-20 17:25             ` Sean Christopherson [this message]
2021-01-15 13:18 ` [PATCH RFC 4/4] KVM: x86: Stop limiting KVM_USER_MEM_SLOTS Vitaly Kuznetsov
2021-01-15 16:03 ` [PATCH RFC 0/4] KVM: x86: Drastically raise KVM_USER_MEM_SLOTS limit Sean Christopherson
2021-01-15 16:23   ` Vitaly Kuznetsov
2021-01-15 16:45     ` Sean Christopherson
2021-01-15 18:47 ` Maciej S. Szmigiero
2021-01-27 17:55   ` Vitaly Kuznetsov
2021-01-27 22:26     ` Maciej S. Szmigiero
2021-01-28  8:47       ` Vitaly Kuznetsov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YAhnebWjQCOfLtJ0@google.com \
    --to=seanjc@google.com \
    --cc=imammedo@redhat.com \
    --cc=jmattson@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.