From: Vitaly Kuznetsov <vkuznets@redhat.com>
To: Sean Christopherson <seanjc@google.com>,
Maxim Levitsky <mlevitsk@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>,
kvm@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>,
Wanpeng Li <wanpengli@tencent.com>,
Jim Mattson <jmattson@google.com>,
"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
Nitesh Narayan Lal <nitesh@redhat.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 4/4] KVM: x86: Fix stack-out-of-bounds memory access from ioapic_write_indirect()
Date: Wed, 25 Aug 2021 10:26:44 +0200 [thread overview]
Message-ID: <87eeaij5ff.fsf@vitty.brq.redhat.com> (raw)
In-Reply-To: <YSUvESHcms6B3+DA@google.com>
Sean Christopherson <seanjc@google.com> writes:
> On Tue, Aug 24, 2021, Maxim Levitsky wrote:
>> On Tue, 2021-08-24 at 16:42 +0200, Vitaly Kuznetsov wrote:
>> > Eduardo Habkost <ehabkost@redhat.com> writes:
>> >
>> > > On Tue, Aug 24, 2021 at 3:13 AM Vitaly Kuznetsov <vkuznets@redhat.com> wrote:
>> > > > Eduardo Habkost <ehabkost@redhat.com> writes:
>> > > >
>> > > > > On Mon, Aug 23, 2021 at 04:30:28PM +0200, Vitaly Kuznetsov wrote:
>> > > > > > diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c
>> > > > > > index ff005fe738a4..92cd4b02e9ba 100644
>> > > > > > --- a/arch/x86/kvm/ioapic.c
>> > > > > > +++ b/arch/x86/kvm/ioapic.c
>> > > > > > @@ -319,7 +319,7 @@ static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val)
>> > > > > > unsigned index;
>> > > > > > bool mask_before, mask_after;
>> > > > > > union kvm_ioapic_redirect_entry *e;
>> > > > > > - unsigned long vcpu_bitmap;
>> > > > > > + unsigned long vcpu_bitmap[BITS_TO_LONGS(KVM_MAX_VCPUS)];
>
> The preferred pattern is:
>
> DECLARE_BITMAP(vcpu_bitmap, KVM_MAX_VCPUS);
>
Yes, thanks!
>> > > > >
>> > > > > Is there a way to avoid this KVM_MAX_VCPUS-sized variable on the
>> > > > > stack? This might hit us back when we increase KVM_MAX_VCPUS to
>> > > > > a few thousand VCPUs (I was planning to submit a patch for that
>> > > > > soon).
>> > > >
>> > > > What's the short- or mid-term target?
>> > >
>> > > Short term target is 2048 (which was already tested). Mid-term target
>> > > (not tested yet) is 4096, maybe 8192.
>> > >
>> > > > Note, we're allocating KVM_MAX_VCPUS bits (not bytes!) here, this means
>> > > > that for e.g. 2048 vCPUs we need 256 bytes of the stack only. In case
>> > > > the target much higher than that, we will need to either switch to
>> > > > dynamic allocation or e.g. use pre-allocated per-CPU variables and make
>> > > > this a preempt-disabled region. I, however, would like to understand if
>> > > > the problem with allocating this from stack is real or not first.
>> > >
>> > > Is 256 bytes too much here, or would that be OK?
>> > >
>> >
>> > AFAIR, on x86_64 stack size (both reqular and irq) is 16k, eating 256
>
> Don't forget i386! :-)
>
I'm not forgetting, I'm deliberately ignoring its existence :-)
Whoever tries to raise KVM_MAX_VCPUS from '288' may limit the change to
x86_64, I seriosly doubt 32bit users want to run guests with thouthands
of CPUs.
>> > bytes of it is probably OK. I'd start worrying when we go to 1024 (8k
>> > vCPUs) and above (but this is subjective of course).
>
> 256 is fine, 1024 would indeed be problematic, e.g. CONFIG_FRAME_WARN defaults to
> 1024 on 32-bit kernels. That's not a hard limit per se, but ideally KVM will stay
> warn-free on all flavors of x86.
Thanks for the CONFIG_FRAME_WARN pointer, I said '1024' out of top of my
head but it seems the number wasn't random after all)
>
>> On the topic of enlarging these bitmaps to cover all vCPUs.
>>
>> I also share the worry of having the whole bitmap on kernel stack for very
>> large number of vcpus.
>> Maybe we need to abstract and use a bitmap for a sane number of vcpus,
>> and use otherwise a 'kmalloc'ed buffer?
>
> That's a future problem. More specifically, it's the problem of whoever wants to
> push KVM_MAX_VCPUS > ~2048. There are a lot of ways to solve the problem, e.g.
> this I/O APIC code runs under a spinlock so a dedicated bitmap in struct kvm_ioapic
> could be used to avoid a large stack allocation.
+1
>
>> Also in theory large bitmaps might affect performance a bit.
>
> Maybe. The only possible degredation for small VMs, i.e. VMs that don't need the
> full bitmap, is if the compiler puts other variables below the bitmap and causes
> sub-optimal cache line usage. But I suspect/hope the compiler is smart enough to
> use GPRs and/or organize the local variables on the stack so that doesn't happen.
>
--
Vitaly
next prev parent reply other threads:[~2021-08-25 8:26 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-08-23 14:30 [PATCH v2 0/4] KVM: Various fixes and improvements around kicking vCPUs Vitaly Kuznetsov
2021-08-23 14:30 ` [PATCH v2 1/4] KVM: Clean up benign vcpu->cpu data races when " Vitaly Kuznetsov
2021-08-23 14:30 ` [PATCH v2 2/4] KVM: Guard cpusmask NULL check with CONFIG_CPUMASK_OFFSTACK Vitaly Kuznetsov
2021-08-23 14:30 ` [PATCH v2 3/4] KVM: Optimize kvm_make_vcpus_request_mask() a bit Vitaly Kuznetsov
2021-08-23 14:30 ` [PATCH v2 4/4] KVM: x86: Fix stack-out-of-bounds memory access from ioapic_write_indirect() Vitaly Kuznetsov
2021-08-23 18:58 ` Eduardo Habkost
2021-08-24 7:13 ` Vitaly Kuznetsov
2021-08-24 14:23 ` Eduardo Habkost
2021-08-24 14:42 ` Vitaly Kuznetsov
2021-08-24 16:07 ` Maxim Levitsky
2021-08-24 17:40 ` Sean Christopherson
2021-08-25 8:26 ` Vitaly Kuznetsov [this message]
2021-08-25 8:21 ` Vitaly Kuznetsov
2021-08-25 9:11 ` Maxim Levitsky
2021-08-25 9:43 ` Vitaly Kuznetsov
2021-08-25 10:41 ` Maxim Levitsky
2021-08-25 13:19 ` Eduardo Habkost
2021-08-26 12:40 ` Vitaly Kuznetsov
2021-08-26 14:52 ` Eduardo Habkost
2021-08-26 18:01 ` Sean Christopherson
2021-08-26 18:13 ` Eduardo Habkost
2021-08-26 19:27 ` Eduardo Habkost
2021-08-30 19:47 ` Nitesh Lal
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87eeaij5ff.fsf@vitty.brq.redhat.com \
--to=vkuznets@redhat.com \
--cc=dgilbert@redhat.com \
--cc=ehabkost@redhat.com \
--cc=jmattson@google.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mlevitsk@redhat.com \
--cc=nitesh@redhat.com \
--cc=pbonzini@redhat.com \
--cc=seanjc@google.com \
--cc=wanpengli@tencent.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.