From: Greg Kurz <groug@kaod.org>
To: "Cédric Le Goater" <clg@kaod.org>
Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org,
	David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [PATCH for-6.0 0/8] spapr: Address the confusion between IPI numbers and vCPU ids
Date: Mon, 23 Nov 2020 11:07:29 +0100
Message-ID: <20201123110729.19954272@bahia.lan>
In-Reply-To: <97e23014-efa1-4ea3-95dc-1686ef097cf5@kaod.org>

On Mon, 23 Nov 2020 09:04:42 +0100
Cédric Le Goater <clg@kaod.org> wrote:

> On 11/20/20 6:46 PM, Greg Kurz wrote:
> > A regression was recently fixed in the sPAPR XIVE code for QEMU 5.2
> > RC3 [1]. It boiled down to a confusion between IPI numbers and vCPU
> > ids, which happen to be numerically equal in general, but are really
> > different entities that can diverge in some setups. When this happens,
> > we end up misconfiguring XIVE in a way that is almost fatal for the
> > guest.
> > 
> > The confusion comes from XICS which has historically assumed equality
> > between interrupt server numbers and vCPU ids, as mentioned in a
> > comment back from 2011 in the Linux kernel icp_native_init_one_node()
> > function:
> > 
> >     /* This code does the theorically broken assumption that the interrupt
> >      * server numbers are the same as the hard CPU numbers.
> >      * This happens to be the case so far but we are playing with fire...
> >      * should be fixed one of these days. -BenH.
> >      */
> > 
> > This assumption crept into QEMU through the "ibm,interrupt-server-ranges"
> > property of the "interrupt-controller" node in the DT. This property
> > contains ranges of consecutive vCPU ids defined as (first id, # of ids).
> > In the case of QEMU, we define a single range starting from 0 up to the
> > highest vCPU id, as returned by spapr_max_server_number(). This has
> > always been associated with the "nr_servers" wording when naming variables
> > or function arguments. When XIVE got added, we introduced an sPAPR IRQ
> > abstraction to be able to control several interrupt controller backends.
> > The sPAPR IRQ base class provides a dt() handler used to populate the
> > "interrupt-controller" node in the DT. This handler takes an "nr_server"
> > argument inherited from XICS and we ended up using it to populate the
> > "ibm,xive-lisn-ranges" property used with XIVE, which has completely
> > different semantics. It contains ranges of interrupt numbers that the
> > guest can use for any purpose. Since one obvious purpose is IPI, its
> > first range should just be able to accommodate all possible vCPUs.
> 
> To clarify, PAPR says it is a requirement :
> 
>   "The first range will contain at least one per possible thread in the 
>    partition."
> 
> The regression showed that we were not initializing this range correctly
> in QEMU/KVM. I am not even sure it had the correct size either since
> we were anyhow initializing 4096 IPIs.
> 

The bad thing was that each online vCPU would reset its IPI in
KVM using a bogus IPI number (the vCPU id), and thus would not reset
the interrupt the guest is really going to use for the IPI.
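
To make the divergence concrete, here is a minimal sketch. It is not
QEMU code: both helpers are hypothetical, and it assumes that vCPU ids
are spaced by the VSMT value while IPI numbers are handed out densely
in vCPU index order.

```c
#include <assert.h>

/*
 * Hypothetical helper (not a QEMU function): vCPU id of thread 'thread'
 * on core 'core', ASSUMING ids are spaced by the VSMT value.
 */
static int vcpu_id(int core, int thread, int vsmt)
{
    assert(thread >= 0 && thread < vsmt);
    return core * vsmt + thread;
}

/*
 * Hypothetical helper (not a QEMU function): IPI number of the same
 * vCPU, ASSUMING IPIs are packed from 0 in vCPU index order.
 */
static int ipi_number(int core, int thread, int threads_per_core)
{
    assert(thread >= 0 && thread < threads_per_core);
    return core * threads_per_core + thread;
}
```

Under these assumptions, with one thread per core and VSMT set to 8,
the vCPU on core 3 has vCPU id 24 but IPI number 3, so resetting
"IPI 24" misses the interrupt the guest will use. With VSMT equal to
the number of threads per core, both values coincide, which is why the
confusion went unnoticed.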

> > In the case of QEMU, we define a single range starting from 0 up
> > to "nr_server" but we should really size it to the actual number of
> > vCPUs (i.e. smp.max_cpus).
> 
> ah. And so spapr_max_server_number(spapr) is crap ? This is starting
> to be complex to follow :/
>  

No. spapr_max_server_number(spapr) gives the highest vCPU id that
we hand over to KVM in order to optimize VP id allocation in the HW.
But it definitely has nothing to do with "ibm,xive-lisn-ranges".
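
As a rough illustration of why the two quantities must not be mixed
up, here is a simplified model. The helper names are hypothetical and
the formula is only a sketch, assuming vCPU ids are spaced by VSMT; it
is not meant to be the actual QEMU implementation.

```c
/*
 * Hypothetical model (not a QEMU function): upper bound on vCPU ids,
 * ASSUMING each core's ids start at a multiple of 'vsmt'.
 * The division rounds up.
 */
static int model_max_server_number(int max_cpus, int vsmt,
                                   int threads_per_core)
{
    return (max_cpus * vsmt + threads_per_core - 1) / threads_per_core;
}

/*
 * Hypothetical model (not a QEMU function): size of the first
 * "ibm,xive-lisn-ranges" range, one IPI per possible vCPU.
 */
static int model_nr_ipis(int max_cpus)
{
    return max_cpus;
}
```

With max_cpus=4, one thread per core and VSMT=8, the first helper
yields 32 while only 4 IPIs are needed. With VSMT equal to the number
of threads per core, both yield max_cpus.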

David suggested in some other mail that we could maybe pass
both spapr_max_server_number(spapr) and smp.max_cpus to the
activate() handler.

> > This series aims at getting rid of the "nr_server" argument in the
> > sPAPR IC handlers. Since both the highest possible vCPU id and the
> > maximum number of vCPUs are invariants for XICS and XIVE respectively,
> 
> What XIVE cares about is the number of possible IPIs and the number
> of vCPUs, since we deduce from them the number of event queue
> descriptors, which is another XIVE structure.
> 
> > let's make them device properties to be configured by the machine when
> > it creates the interrupt controllers and use them where needed.
> > 
> > This doesn't cause any visible change to setups using the default
> > VSMT machine settings. This changes "ibm,xive-lisn-ranges" for
> > setups that mess with VSMT, but this is acceptable since Linux
> > only allocates one interrupt per vCPU and the higher part of the
> > range was never used.
> 
> This range is only used for vCPUs. 
> 
> C.
> 
> > [1] https://git.qemu.org/?p=qemu.git;a=commit;h=6d24795ee7e3199d199d3c415312c93382ad1807
> > 
> > Greg Kurz (8):
> >   spapr/xive: Turn some sanity checks into assertions
> >   spapr/xive: Introduce spapr_xive_nr_ends()
> >   spapr/xive: Add "nr-servers" property
> >   spapr/xive: Add "nr-ipis" property
> >   spapr/xics: Drop unused argument to xics_kvm_has_broken_disconnect()
> >   spapr/xics: Add "nr-servers" property
> >   spapr: Drop "nr_servers" argument of the sPAPR IC activate() operation
> >   spapr: Drop "nr_servers" argument of the sPAPR IC dt() operation
> > 
> >  include/hw/ppc/spapr.h      |  4 +--
> >  include/hw/ppc/spapr_irq.h  |  9 ++---
> >  include/hw/ppc/spapr_xive.h | 25 ++++++++++++-
> >  include/hw/ppc/xics_spapr.h | 23 +++++++++---
> >  hw/intc/spapr_xive.c        | 72 ++++++++++++++++++++++---------------
> >  hw/intc/spapr_xive_kvm.c    |  4 +--
> >  hw/intc/xics_kvm.c          |  4 +--
> >  hw/intc/xics_spapr.c        | 45 ++++++++++++++---------
> >  hw/ppc/spapr.c              |  7 ++--
> >  hw/ppc/spapr_irq.c          | 27 +++++++-------
> >  10 files changed, 141 insertions(+), 79 deletions(-)
> > 
> 


