From: Greg Kurz <groug@kaod.org>
To: "Cédric Le Goater" <clg@kaod.org>
Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org,
David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [PATCH for-6.0 0/8] spapr: Address the confusion between IPI numbers and vCPU ids
Date: Mon, 23 Nov 2020 11:07:29 +0100 [thread overview]
Message-ID: <20201123110729.19954272@bahia.lan> (raw)
In-Reply-To: <97e23014-efa1-4ea3-95dc-1686ef097cf5@kaod.org>
On Mon, 23 Nov 2020 09:04:42 +0100
Cédric Le Goater <clg@kaod.org> wrote:
> On 11/20/20 6:46 PM, Greg Kurz wrote:
> > A regression was recently fixed in the sPAPR XIVE code for QEMU 5.2
> > RC3 [1]. It boiled down to a confusion between IPI numbers and vCPU
> > ids, which happen to be numerically equal in general, but are really
> > different entities that can diverge in some setups. When this happens,
> > we end up misconfiguring XIVE in a way that is almost fatal for the
> > guest.
> >
> > The confusion comes from XICS which has historically assumed equality
> > between interrupt server numbers and vCPU ids, as mentionned in a
> > comment back from 2011 in the linux kernel icp_native_init_one_node()
> > function:
> >
> > /* This code does the theorically broken assumption that the interrupt
> > * server numbers are the same as the hard CPU numbers.
> > * This happens to be the case so far but we are playing with fire...
> > * should be fixed one of these days. -BenH.
> > */
> >
> > This assumption crept into QEMU through the "ibm,interrupt-server-ranges"
> > property of the "interrupt-controller" node in the DT. This property
> > contains ranges of consecutive vCPU ids defined as (first id, # of ids).
> > In the case of QEMU, we define a single range starting from 0 up to the
> > highest vCPU id, as returned by spapr_max_server_number(). This has
> > always been associated to the "nr_servers" wording when naming variables
> > or function arguments. When XIVE got added, we introduced an sPAPR IRQ
> > abstraction to be able to control several interrupt controller backends.
> > The sPAPR IRQ base class provides a dt() handler used to populate the
> > "interrupt-controller" node in the DT. This handler takes an "nr_server"
> > argument inherited from XICS and we ended up using it to populate the
> > "ibm,xive-lisn-ranges" property used with XIVE, which has completely
> > different semantics. It contain ranges of interrupt numbers that the
> > guest can use for any purpose. Since one obvious purpose is IPI, its
> > first range should just be able to accomodate all possible vCPUs.
>
> To clarify, PAPR says it is a requirement :
>
> "The first range will contain at least one per possible thread in the
> partition."
>
> The regression showed that we were not initializing correctly this range
> in QEMU/KVM. I an not even sure it had the correct size either since
> we were anyhow initializing 4096 IPIs.
>
The bad thing was that each online vCPU would reset it's IPI in
KVM using a bogus IPI number (the vCPU id), and thus doesn't reset
the interrupt the guest is really going to use for the IPI.
> > In the case of QEMU, we define a single range starting from 0 up
> > to "nr_server" but we should rather size it to the number of vCPUs
> > actually (ie. smp.max_cpus).
>
> ah. And so spapr_max_server_number(spapr) is crap ? This is starting
> to be complex to follow :/
>
No. spapr_max_server_number(spapr) gives the highest vCPU id that
we end over to KVM in order to optimize VP id allocation in the HW.
But it definitely has nothing to do with "ibm,xive-lisn-ranges".
David suggested in some other mail that we could maybe pass
both spapr_max_server_number(spapr) and smp.max_cpus to the
activate() handler.
> > This series aims at getting rid of the "nr_server" argument in the
> > sPAPR IC handlers. Since both the highest possible vCPU id and the
> > maximum number of vCPUs are invariants for XICS and XIVE respectively,
>
> What XIVE cares about is the number of possible IPIs and the number
> of vCPUs since we deduced from that the number of event queue
> descriptors, which is another XIVE structure.
>
> > let's make them device properties to be configured by the machine when
> > it creates the interrupt controllers and use them where needed.
> >
> > This doesn't cause any visible change to setups using the default
> > VSMT machine settings. This changes "ibm,xive-lisn-ranges" for
> > setups that mess with VSMT, but this is acceptable since linux
> > only allocates one interrupt per vCPU and the higher part of the
> > range was never used.
>
> This range is only used for vCPUs.
>
> C.
>
> > [1] https://git.qemu.org/?p=qemu.git;a=commit;h=6d24795ee7e3199d199d3c415312c93382ad1807
> >
> > Greg Kurz (8):
> > spapr/xive: Turn some sanity checks into assertions
> > spapr/xive: Introduce spapr_xive_nr_ends()
> > spapr/xive: Add "nr-servers" property
> > spapr/xive: Add "nr-ipis" property
> > spapr/xics: Drop unused argument to xics_kvm_has_broken_disconnect()
> > spapr/xics: Add "nr-servers" property
> > spapr: Drop "nr_servers" argument of the sPAPR IC activate() operation
> > spapr: Drop "nr_servers" argument of the sPAPR IC dt() operation
> >
> > include/hw/ppc/spapr.h | 4 +--
> > include/hw/ppc/spapr_irq.h | 9 ++---
> > include/hw/ppc/spapr_xive.h | 25 ++++++++++++-
> > include/hw/ppc/xics_spapr.h | 23 +++++++++---
> > hw/intc/spapr_xive.c | 72 ++++++++++++++++++++++---------------
> > hw/intc/spapr_xive_kvm.c | 4 +--
> > hw/intc/xics_kvm.c | 4 +--
> > hw/intc/xics_spapr.c | 45 ++++++++++++++---------
> > hw/ppc/spapr.c | 7 ++--
> > hw/ppc/spapr_irq.c | 27 +++++++-------
> > 10 files changed, 141 insertions(+), 79 deletions(-)
> >
>
prev parent reply other threads:[~2020-11-23 10:08 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-11-20 17:46 [PATCH for-6.0 0/8] spapr: Address the confusion between IPI numbers and vCPU ids Greg Kurz
2020-11-20 17:46 ` [PATCH for-6.0 1/8] spapr/xive: Turn some sanity checks into assertions Greg Kurz
2020-11-23 3:33 ` David Gibson
2020-11-23 8:09 ` Cédric Le Goater
2020-11-20 17:46 ` [PATCH for-6.0 2/8] spapr/xive: Introduce spapr_xive_nr_ends() Greg Kurz
2020-11-23 3:33 ` David Gibson
2020-11-25 22:43 ` Greg Kurz
2020-11-26 0:06 ` David Gibson
2020-11-23 9:46 ` Cédric Le Goater
2020-11-23 11:16 ` Greg Kurz
2020-11-24 13:54 ` Cédric Le Goater
2020-11-24 17:01 ` Greg Kurz
2020-11-24 17:56 ` Cédric Le Goater
2020-11-25 9:33 ` Greg Kurz
2020-11-25 11:34 ` Cédric Le Goater
2020-11-25 12:26 ` Greg Kurz
2020-11-26 7:06 ` Cédric Le Goater
2020-11-20 17:46 ` [PATCH for-6.0 3/8] spapr/xive: Add "nr-servers" property Greg Kurz
2020-11-23 3:52 ` David Gibson
2020-11-23 9:20 ` Greg Kurz
2020-11-23 9:56 ` Cédric Le Goater
2020-11-23 11:25 ` Greg Kurz
2020-11-24 14:18 ` Cédric Le Goater
2020-11-20 17:46 ` [PATCH for-6.0 4/8] spapr/xive: Add "nr-ipis" property Greg Kurz
2020-11-23 4:10 ` David Gibson
2020-11-23 10:13 ` Cédric Le Goater
2020-11-24 17:18 ` Greg Kurz
2020-11-20 17:46 ` [PATCH for-6.0 5/8] spapr/xics: Drop unused argument to xics_kvm_has_broken_disconnect() Greg Kurz
2020-11-23 4:10 ` David Gibson
2020-11-23 10:15 ` Cédric Le Goater
2020-11-20 17:46 ` [PATCH for-6.0 6/8] spapr/xics: Add "nr-servers" property Greg Kurz
2020-11-23 4:18 ` David Gibson
2020-11-23 9:39 ` Greg Kurz
2020-11-23 10:24 ` Cédric Le Goater
2020-11-20 17:46 ` [PATCH for-6.0 7/8] spapr: Drop "nr_servers" argument of the sPAPR IC activate() operation Greg Kurz
2020-11-23 4:38 ` David Gibson
2020-11-23 9:47 ` Greg Kurz
2020-11-20 17:46 ` [PATCH for-6.0 8/8] spapr: Drop "nr_servers" argument of the sPAPR IC dt() operation Greg Kurz
2020-11-23 8:04 ` [PATCH for-6.0 0/8] spapr: Address the confusion between IPI numbers and vCPU ids Cédric Le Goater
2020-11-23 10:07 ` Greg Kurz [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20201123110729.19954272@bahia.lan \
--to=groug@kaod.org \
--cc=clg@kaod.org \
--cc=david@gibson.dropbear.id.au \
--cc=qemu-devel@nongnu.org \
--cc=qemu-ppc@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).