All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "Cédric Le Goater" <clg@kaod.org>,
	qemu-ppc@nongnu.org, qemu-devel@nongnu.org,
	"Marcel Apfelbaum" <marcel.apfelbaum@gmail.com>,
	"Andrea Bolognani" <abologna@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v2 1/2] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge
Date: Mon, 23 Jul 2018 14:16:14 +1000	[thread overview]
Message-ID: <20180723041614.GD6830@umbus.fritz.box> (raw)
In-Reply-To: <bc6c5befa0a24b35f439419d6dea493dc19035b9.camel@kernel.crashing.org>

[-- Attachment #1: Type: text/plain, Size: 10461 bytes --]

On Wed, Jul 18, 2018 at 06:03:29PM +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2018-07-18 at 16:12 +1000, David Gibson wrote:
> > On Thu, Jun 28, 2018 at 10:36:32AM +0200, Cédric Le Goater wrote:
> > > From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> 
> Can you trim your replies ? It's really hard to find your comments in
> such a long patch...
> 
> > > diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
> > > index 6ac8a9392da6..966a996c2eac 100644
> > > --- a/include/hw/ppc/xics.h
> > > +++ b/include/hw/ppc/xics.h
> > > @@ -194,6 +194,7 @@ void icp_set_mfrr(ICPState *icp, uint8_t mfrr);
> > >  uint32_t icp_accept(ICPState *ss);
> > >  uint32_t icp_ipoll(ICPState *ss, uint32_t *mfrr);
> > >  void icp_eoi(ICPState *icp, uint32_t xirr);
> > > +void icp_irq(ICSState *ics, int server, int nr, uint8_t priority);
> > 
> > Hrm... are you sure you need to expose this?

[snip]
> > > +    limit = pci_config_size(pdev);
> > > +    if (limit <= cfg_addr) {
> > > +        /* conventional pci device can be behind pcie-to-pci bridge.
> > > +           256 <= addr < 4K has no effects. */
> > > +        return ~0ull;
> > > +    }
> > > +    val = pci_host_config_read_common(pdev, cfg_addr, limit, size);
> > > +    switch (size) {
> > > +    case 1:
> > > +        return val;
> > > +    case 2:
> > > +        return bswap16(val);
> > > +    case 4:
> > > +        return bswap32(val);
> > > +    default:
> > > +        g_assert_not_reached();
> > > +    }
> > > +}
> > > +
> > > +static void pnv_phb3_check_m32(PnvPHB3 *phb)
> > > +{
> > > +    uint64_t base, start, size;
> > > +    MemoryRegion *parent;
> > > +    PnvPBCQState *pbcq = &phb->pbcq;
> > > +
> > > +    if (phb->m32_mapped) {
> > > +        memory_region_del_subregion(phb->mr_m32.container, &phb->mr_m32);
> > > +        phb->m32_mapped = false;
> > 
> > Could you use memory_region_set_enabled() rather than having to add
> > and delete the subregion and keep track of it in this separate
> > variable?
> 
> There was a reason for that but it's years old and I forgot... I think
> when re-enabled it moves around, among other things. So it's not more
> efficient.

Hm, ok.  It'd be good to know what this is.

> > > +    }
> > > +
> > > +    /* Disabled ? move on with life ... */
> > 
> > Trivial: "nothing to do" seems to be the standard way to express this.
> > Even trivialler: usual English rules don't put a space before '?' or
> > '!' punctuation.
> 
> No, that's the tasteless English rule :-) It shall be overridden by
> anybody interested in making things actually readable :-)
> 
> > > +
> > > +static void pnv_phb3_lxivt_write(PnvPHB3 *phb, unsigned idx, uint64_t val)
> > > +{
> > > +    ICSState *ics = &phb->lsis;
> > > +    uint8_t server, prio;
> > > +
> > > +    phb->ioda_LXIVT[idx] = val & (IODA2_LXIVT_SERVER |
> > > +                                  IODA2_LXIVT_PRIORITY |
> > > +                                  IODA2_LXIVT_NODE_ID);
> > > +    server = GETFIELD(IODA2_LXIVT_SERVER, val);
> > > +    prio = GETFIELD(IODA2_LXIVT_PRIORITY, val);
> > > +
> > > +    /*
> > > +     * The low order 2 bits are the link pointer (Type II interrupts).
> > 
> > I don't think we've currently implemented link pointers in our xics
> > code.  Do we need to do that?
> 
> Not until somebody uses them (other than pHyp).
> 
> > > +     * Shift back to get a valid IRQ server.
> > > +     */
> > > +    server >>= 2;
> > > +
> > > +    ics_simple_write_xive(ics, idx, server, prio, prio);
> > > +}
> > > +
> > > +static uint64_t *pnv_phb3_ioda_access(PnvPHB3 *phb,
> > > +                                      unsigned *out_table, unsigned *out_idx)
> > > +{
> > > +    uint64_t adreg = phb->regs[PHB_IODA_ADDR >> 3];
> > > +    unsigned int index = GETFIELD(PHB_IODA_AD_TADR, adreg);
> > > +    unsigned int table = GETFIELD(PHB_IODA_AD_TSEL, adreg);
> > > +    unsigned int mask;
> > > +    uint64_t *tptr = NULL;
> > > +
> > > +    switch (table) {
> > > +    case IODA2_TBL_LIST:
> > > +        tptr = phb->ioda_LIST;
> > > +        mask = 7;
> > 
> > I'd suggest hex for the masks.
> 
> This is more readable when matched with the documentation but not a big
> deal.

Matching the docs is a good enough reason to keep decimal.

> > > +        break;
> > > +    case IODA2_TBL_LXIVT:
> > > +        tptr = phb->ioda_LXIVT;
> > > +        mask = 7;
> > > +        break;
> > > +    case IODA2_TBL_IVC_CAM:
> > > +    case IODA2_TBL_RBA:
> > > +        mask = 31;
> > > +        break;
> > > +    case IODA2_TBL_RCAM:
> > > +        mask = 63;
> > > +        break;
> > > +    case IODA2_TBL_MRT:
> > > +        mask = 7;
> > > +        break;
> > > +    case IODA2_TBL_PESTA:
> > > +    case IODA2_TBL_PESTB:
> > > +        mask = 255;
> > > +        break;
> > > +    case IODA2_TBL_TVT:
> > > +        tptr = phb->ioda_TVT;
> > > +        mask = 511;
> > > +        break;
> > > +    case IODA2_TBL_TCAM:
> > > +    case IODA2_TBL_TDR:
> > > +        mask = 63;
> > > +        break;
> > > +    case IODA2_TBL_M64BT:
> > > +        tptr = phb->ioda_M64BT;
> > > +        mask = 15;
> > > +        break;
> > > +    case IODA2_TBL_M32DT:
> > > +        tptr = phb->ioda_MDT;
> > > +        mask = 255;
> > > +        break;
> > > +    case IODA2_TBL_PEEV:
> > > +        tptr = phb->ioda_PEEV;
> > > +        mask = 3;
> > > +        break;
> > > +    default:
> > > +        return NULL;
> > > +    }
> > > +    index &= mask;
> > 
> > Do we want to silently mask, rather than logging an error if there's
> > an access out of bounds for a particular table?
> 
> We could do both, as to behave like the HW but also flag an error.

Sounds reasonable.

> But
> you have to be careful with the auto-increment below.

Hm, ok.

> > > +    if (out_idx) {
> > > +        *out_idx = index;
> > > +    }
> > > +    if (out_table) {
> > > +        *out_table = table;
> > > +    }
> > > +    if (adreg & PHB_IODA_AD_AUTOINC) {
> > > +        index = (index + 1) & mask;
> > > +        adreg = SETFIELD(PHB_IODA_AD_TADR, adreg, index);
> > > +    }
> > > +    if (tptr) {
> > > +        tptr += index;
> > > +    }
> > > +    phb->regs[PHB_IODA_ADDR >> 3] = adreg;
> > > +    return tptr;
> > > +}
> 
>  ../..
> 
> > > +    if ((comp & mask) != comp) {
> > > +        qemu_log_mask(LOG_GUEST_ERROR,
> > > +                      "IRQ compare bits not in mask: comp=0x%x mask=0x%x",
> > > +                      comp, mask);
> > > +        comp &= mask;
> > > +    }
> > > +    /* Setup LSI offset */
> > > +    ics->offset = comp + global;
> > 
> > Oh.. changing ICS offset at runtime.  I hadn't considered that case..
> 
> As above, see further down.
> 
> > > +    /* Special case configuration data */
> > > +    if ((off & 0xfffc) == PHB_CONFIG_DATA) {
> > > +        pnv_phb3_config_write(phb, off & 0x3, size, val);
> > > +        return;
> > > +    }
> > > +
> > > +    /* Other registers are 64-bit only */
> > > +    if (size != 8 || off & 0x7) {
> > > +        qemu_log_mask(LOG_GUEST_ERROR,
> > > +                      "Invalid register access, offset: 0x%"PRIx64" size: %d",
> > > +                      off, size);
> > > +        return;
> > > +    }
> > > +
> > > +    /* Handle masking */
> > > +    switch (off) {
> > > +    case PHB_M64_UPPER_BITS:
> > 
> > Couldn't you handle this in the switch below - it should be ok to
> > momentarily have the invalid bits set in your reg case, as long as you
> > mask them before applying the side-effects, yes?
> 
> That felt easier that way ...
> 
> > That said... shouldn't you filter our invalid or read-only regs before
> > updating the cache?
> 
> Well, I had a reason for doing it that way, I do have a vague memory of
> that but I can't remember it now :-)
> 
> > > +/*
> > > + * MSI/MSIX memory region implementation.
> > > + * The handler handles both MSI and MSIX.
> > > + */
> > > +static void pnv_phb3_msi_write(void *opaque, hwaddr addr,
> > > +                               uint64_t data, unsigned size)
> > > +{
> > > +    PnvPhb3DMASpace *ds = opaque;
> > > +
> > > +    /* Resolve PE# */
> > > +    if (!pnv_phb3_resolve_pe(ds)) {
> > > +        qemu_log_mask(LOG_GUEST_ERROR,
> > > +                      "Failed to resolve PE# for bus @%p (%d) devfn 0x%x",
> > > +                      ds->bus, pci_bus_num(ds->bus), ds->devfn);
> > > +        return;
> > > +    }
> > > +
> > > +    pnv_phb3_msi_send(&ds->phb->msis, addr, data, ds->pe_num);
> > > +}
> > > +
> > > +static const MemoryRegionOps pnv_phb3_msi_ops = {
> > > +    /* There is no .read as the read result is undefined by PCI spec */
> > 
> > What will qemu do if it hits a NULL read here?  The behaviour may be
> > undefind, but we'd prefer it didn't crash qemu..
> 
> Yeah.

[snip]
> > > +    /*
> > > +     * The low order 2 bits are the link pointer (Type II interrupts).
> > > +     * Shift back to get a valid IRQ server.
> > > +     */
> > > +    server >>= 2;
> > > +
> > > +    switch (pq) {
> > > +    case 0: /* 00 */
> > > +        if (prio == 0xff) {
> > > +            /* Masked, set Q */
> > > +            phb3_msi_set_q(msi, srcno);
> > > +        } else {
> > > +            /* Enabled, set P and send */
> > > +            phb3_msi_set_p(msi, srcno, gen);
> > > +            icp_irq(ics, server, srcno + ics->offset, prio);
> > 
> > Can't you just pulse the right qirq here, which will use the core ICS
> > logic to propagate to the ICP?
> 
> Ok, interrupts ... sooooo...
> 
> This code predates a pile of XICS work you guys did to start with.
> 
> Now, this is an ICS subclass, so why shouldn't it directly poke at the
> target ICP ?

That's ok in theory, but causing it to expose the icp interface to a
new module isn't great.

> It's an alternate to the normal ICS since it behaves a bit
> differently (PQ bits & all).

AFAICT the PQ bits are effectively another filtering layer on top of
the same ICS logic as elsewhere.  I think it's better if we can share
that shared logic, rather than replicating it.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2018-07-23  4:16 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-28  8:36 [Qemu-devel] [PATCH v2 0/2] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge Cédric Le Goater
2018-06-28  8:36 ` [Qemu-devel] [PATCH v2 1/2] " Cédric Le Goater
2018-07-09  7:22   ` Cédric Le Goater
2018-07-18  6:13     ` David Gibson
2018-07-18  6:12   ` David Gibson
2018-07-18  8:03     ` Benjamin Herrenschmidt
2018-07-23  4:16       ` David Gibson [this message]
2018-07-23 23:55         ` Benjamin Herrenschmidt
2018-07-24  2:14           ` David Gibson
2018-07-24  3:49             ` Benjamin Herrenschmidt
2018-07-24  4:10               ` David Gibson
2018-07-23 21:35     ` Cédric Le Goater
2018-07-23 21:37     ` Cédric Le Goater
2018-07-24  1:29       ` David Gibson
2018-07-23 21:38     ` Cédric Le Goater
2018-07-24 12:18     ` Cédric Le Goater
2018-07-27  5:32       ` David Gibson
2018-07-27  7:33         ` Cédric Le Goater
2018-07-27  8:08         ` Benjamin Herrenschmidt
2018-07-27  8:25           ` Cédric Le Goater
2018-07-27  8:43             ` Benjamin Herrenschmidt
2018-07-27  9:19               ` Cédric Le Goater
2018-07-30  8:56                 ` Cédric Le Goater
2018-07-26  9:03   ` Cédric Le Goater
2018-07-26 22:36     ` Benjamin Herrenschmidt
2018-07-27  7:16       ` Cédric Le Goater
2018-07-27  8:18         ` Benjamin Herrenschmidt
2018-06-28  8:36 ` [Qemu-devel] [PATCH v2 2/2] ppc/pnv: make the PHB3 devices user creatable Cédric Le Goater
2018-07-01 18:33 ` [Qemu-devel] [PATCH v2 0/2] ppc/pnv: Add model for Power8 PHB3 PCIe Host bridge no-reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180723041614.GD6830@umbus.fritz.box \
    --to=david@gibson.dropbear.id.au \
    --cc=abologna@redhat.com \
    --cc=benh@kernel.crashing.org \
    --cc=clg@kaod.org \
    --cc=marcel.apfelbaum@gmail.com \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.