All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: Greg Kurz <groug@kaod.org>
Cc: qemu-ppc@nongnu.org, "Cédric Le Goater" <clg@kaod.org>,
	qemu-devel@nongnu.org
Subject: Re: [PATCH] spapr/xive: skip partially initialized vCPUs in presenter
Date: Wed, 2 Oct 2019 11:02:45 +1000	[thread overview]
Message-ID: <20191002010245.GT11105@umbus.fritz.box> (raw)
In-Reply-To: <20191001185629.0b284ba1@bahia.lan>

[-- Attachment #1: Type: text/plain, Size: 6117 bytes --]

On Tue, Oct 01, 2019 at 06:56:28PM +0200, Greg Kurz wrote:
> On Tue, 1 Oct 2019 13:56:10 +0200
> Cédric Le Goater <clg@kaod.org> wrote:
> 
> > On 01/10/2019 13:06, Greg Kurz wrote:
> > > On Tue,  1 Oct 2019 10:57:22 +0200
> > > Cédric Le Goater <clg@kaod.org> wrote:
> > > 
> > >> When vCPUs are hotplugged, they are added to the QEMU CPU list before
> > >> being fully realized. This can crash the XIVE presenter because the
> > >> 'tctx' pointer is not necessarily initialized when looking for a
> > >> matching target.
> > >>
> > > 
> > > Ouch... :-\
> > > 
> > >> These vCPUs are not valid targets for the presenter. Skip them.
> > >>
> > > 
> > > This likely fixes this specific issue, but more generally, this
> > > seems to indicate that using CPU_FOREACH() is rather fragile.
> > > 
> > > What about tracking XIVE TM contexts with a QLIST in xive.c ?
> > 
> > This is a good idea.  
> > 
> > On HW, the thread interrupt contexts belong to the XIVE presenter 
> > subengine. This is the logic doing the CAM line matching to find
> > a target for an event notification. So we should model the context 
> > list below the XiveRouter in QEMU which models both router and 
> > presenter subengines. We have done without a presenter model for 
> > the moment and I don't think we will need to introduce one.  
> > 
> > This would be a nice improvements of my patchset adding support
> > for xive escalations and better support of multi chip systems. 
> > I have introduced a PNV_CHIP_CPU_FOREACH in this patchset which 
> > would become useless with a list of tctx under the XIVE interrupt
> > controller, XiveRouter, SpaprXive, PnvXive.
> > 
> 
> I agree. It makes more sense to have the list below the XiveRouter,
> rather than relying on CPU_FOREACH(), which looks a bit weird from
> a device emulation code perspective.

That does sound like a better idea long term.  However, for now, I
think the NULL check is a reasonable way of fixing the real error
we're hitting, so I've applied the patch here.

Future cleanups to a different approach remain welcome, of course.

> > Next step would be to get rid of the tctx->cs pointer. In my latest
> > patches, it is only used to calculate the HW CAM line. 
> > 
> > There might be some consequences on the object hierarchy and it will
> > break migration.
> > 
> 
> This could break if the contexts were devices sitting in a bus, which
> isn't the case here. I'll try to come up with a proposal for spapr,
> and we can work out the changes on your recent XIVE series for pnv.
> 
> > Thanks,
> > 
> > C.
> > 
> > > 
> > > ================================================================================
> > > diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
> > > index 6d38755f8459..89b9ef7f20b1 100644
> > > --- a/include/hw/ppc/xive.h
> > > +++ b/include/hw/ppc/xive.h
> > > @@ -319,6 +319,8 @@ typedef struct XiveTCTX {
> > >      qemu_irq    os_output;
> > >  
> > >      uint8_t     regs[XIVE_TM_RING_COUNT * XIVE_TM_RING_SIZE];
> > > +
> > > +    QTAILQ_ENTRY(XiveTCTX) next;
> > >  } XiveTCTX;
> > >  
> > >  /*
> > > diff --git a/hw/intc/xive.c b/hw/intc/xive.c
> > > index b7417210d817..f7721c711041 100644
> > > --- a/hw/intc/xive.c
> > > +++ b/hw/intc/xive.c
> > > @@ -568,6 +568,8 @@ static void xive_tctx_reset(void *dev)
> > >          ipb_to_pipr(tctx->regs[TM_QW3_HV_PHYS + TM_IPB]);
> > >  }
> > >  
> > > +static QTAILQ_HEAD(, XiveTCTX) xive_tctx_list = QTAILQ_HEAD_INITIALIZER(xive_tctx_list);
> > > +
> > >  static void xive_tctx_realize(DeviceState *dev, Error **errp)
> > >  {
> > >      XiveTCTX *tctx = XIVE_TCTX(dev);
> > > @@ -609,10 +611,14 @@ static void xive_tctx_realize(DeviceState *dev, Error **errp)
> > >      }
> > >  
> > >      qemu_register_reset(xive_tctx_reset, dev);
> > > +    QTAILQ_INSERT_HEAD(&xive_tctx_list, tctx, next);
> > >  }
> > >  
> > >  static void xive_tctx_unrealize(DeviceState *dev, Error **errp)
> > >  {
> > > +    XiveTCTX *tctx = XIVE_TCTX(dev);
> > > +
> > > +    QTAILQ_REMOVE(&xive_tctx_list, tctx, next);
> > >      qemu_unregister_reset(xive_tctx_reset, dev);
> > >  }
> > >  
> > > @@ -1385,15 +1391,14 @@ static bool xive_presenter_match(XiveRouter *xrtr, uint8_t format,
> > >                                   bool cam_ignore, uint8_t priority,
> > >                                   uint32_t logic_serv, XiveTCTXMatch *match)
> > >  {
> > > -    CPUState *cs;
> > > +    XiveTCTX *tctx;
> > >  
> > >      /*
> > >       * TODO (PowerNV): handle chip_id overwrite of block field for
> > >       * hardwired CAM compares
> > >       */
> > >  
> > > -    CPU_FOREACH(cs) {
> > > -        XiveTCTX *tctx = xive_router_get_tctx(xrtr, cs);
> > > +    QTAILQ_FOREACH(tctx, &xive_tctx_list, next) {
> > >          int ring;
> > >  
> > >          /*
> > > ================================================================================
> > > 
> > >> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> > >> ---
> > >>  hw/intc/xive.c | 8 ++++++++
> > >>  1 file changed, 8 insertions(+)
> > >>
> > >> diff --git a/hw/intc/xive.c b/hw/intc/xive.c
> > >> index b7417210d817..29df06df1136 100644
> > >> --- a/hw/intc/xive.c
> > >> +++ b/hw/intc/xive.c
> > >> @@ -1396,6 +1396,14 @@ static bool xive_presenter_match(XiveRouter *xrtr, uint8_t format,
> > >>          XiveTCTX *tctx = xive_router_get_tctx(xrtr, cs);
> > >>          int ring;
> > >>  
> > >> +        /*
> > >> +         * Skip partially initialized vCPUs. This can happen when
> > >> +         * vCPUs are hotplugged.
> > >> +         */
> > >> +        if (!tctx) {
> > >> +            continue;
> > >> +        }
> > >> +
> > >>          /*
> > >>           * HW checks that the CPU is enabled in the Physical Thread
> > >>           * Enable Register (PTER).
> > > 
> > 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2019-10-02  1:58 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-01  8:57 [PATCH] spapr/xive: skip partially initialized vCPUs in presenter Cédric Le Goater
2019-10-01 11:06 ` Greg Kurz
2019-10-01 11:56   ` Cédric Le Goater
2019-10-01 16:56     ` Greg Kurz
2019-10-02  1:02       ` David Gibson [this message]
2019-10-02 14:21         ` Greg Kurz
2019-10-02 14:47           ` Cédric Le Goater
2019-10-02 22:37             ` David Gibson
2019-10-03  8:02               ` Cédric Le Goater

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191002010245.GT11105@umbus.fritz.box \
    --to=david@gibson.dropbear.id.au \
    --cc=clg@kaod.org \
    --cc=groug@kaod.org \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.