From: "Daniel P. Berrangé" <berrange@redhat.com>
To: David Woodhouse <dwmw2@infradead.org>
Cc: Igor Mammedov <imammedo@redhat.com>,
	mst@redhat.com, qemu-devel@nongnu.org, pbonzini@redhat.com
Subject: Re: [PATCH] x86: q35: require split irqchip for large CPU count
Date: Mon, 14 Mar 2022 13:21:18 +0000
Message-ID: <Yi9BTkZIM3iZsvdK@redhat.com>
In-Reply-To: <0d0f10cf1e593be0fb4546749cd7ee11765accb5.camel@infradead.org>

On Mon, Mar 14, 2022 at 12:59:38PM +0000, David Woodhouse wrote:
> On Mon, 2022-03-14 at 11:35 +0100, Igor Mammedov wrote:
> > On Fri, 11 Mar 2022 14:58:41 +0000
> > David Woodhouse <dwmw2@infradead.org> wrote:
> > 
> > > On Fri, 2022-03-11 at 09:39 -0500, Igor Mammedov wrote:
> > > > if VM is started with:
> > > > 
> > > >    -enable-kvm -smp 256
> > > > 
> > > > without specifying 'split' irqchip, VM might eventually boot
> > > > but no more than 255 CPUs will be operational and following
> > > > error messages in guest could be observed:
> > > >    ...
> > > >    smpboot: native_cpu_up: bad cpu 256
> > > >    ...
> > > > It's a regression introduced by [1], which removed dependency
> > > > on intremap=on that were implicitly requiring 'split' irqchip
> > > > and forgot to check for 'split' irqchip.
> > > > Instead of letting VM boot a broken VM, error out and tell
> > > > user how to fix CLI.  
> > > 
> > > Hm, wasn't that already fixed in the patches I posted in December?
> > 
> > It might be, could you point to the commit/series that fixed it.
> 
> https://lore.kernel.org/all/20211209220840.14889-1-dwmw2@infradead.org/
> is the patch I was thinking of, but although that moves the check to a
> more useful place and fixes the X2APIC check, it *doesn't* include the
> fix you're making; it's still using kvm_irqchip_in_kernel().
> 
> I can change that and repost the series, which is still sitting (with
> fixed Reviewed-By/Acked-By attributions that I screwed up last time) in
> https://git.infradead.org/users/dwmw2/qemu.git
> 
> > Regardless of that, fixing it in recent kernels doesn't help
> > as still supported kernels are still affected by it.
> > 
> > If there is a way to detect that fix, I can add to q35 a compat
> > property and an extra logic to enable kernel-irqchip if fix is present.
> > Otherwise the fix does not exist until minimum supported kernel
> > version reaches version where it was fixed.
> 
> Hm, I'm not sure I follow here. Do you mean recent versions of *qemu*
> when you say 'kernels'? 
> 
> I'm not even sure I agree with the observation that qemu should error
> out here. The guest boots fine and the guest can even *use* all the
> CPUs. IPIs etc. will all work fine. The only thing that doesn't work is
> delivering *external* interrupts to CPUs above 254.
> 
> Ultimately, this is the *guest's* problem. Some operating systems can
> cope; some can't.
> 
> The fact that *Linux* has a fundamental assumption that *all* CPUs can
> receive all interrupts and that affinity can't be limited in hardware,
> is a Linux problem. I tried to fix it once but it was distinctly non-
> trivial and eventually I gave up and took a different approach.
> https://lore.kernel.org/linux-iommu/87lfgj59mp.fsf@nanos.tec.linutronix.de/T/
> 
> But even if we 'fix' the check as you suggest to bail out and refuse to
> boot a certain configuration because Linux guest wouldn't be able to
> fully utilize it... Even if we boot with the split IRQ chip and the 15-
> bit MSI enlightenment, we're still in the same position. Some guests
> will be able to use it; some won't.
> 
> In fact, there are operating systems that don't even know about X2APIC.
> 
> Why should qemu refuse to even start up?

We've generally said QEMU should not reject / block startup of valid
hardware configurations based on the existence of bugs in certain guest
OSes, if the config would be valid for other guests.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
