From: David Gibson <david@gibson.dropbear.id.au>
To: "Cédric Le Goater" <clg@kaod.org>
Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org,
    Benjamin Herrenschmidt <benh@kernel.crashing.org>,
    Greg Kurz <groug@kaod.org>
Subject: Re: [Qemu-devel] [PATCH v2 03/19] spapr: introduce the XIVE interrupt sources
Date: Wed, 20 Dec 2017 16:22:25 +1100
Message-ID: <20171220052225.GE5981@umbus.fritz.box>
In-Reply-To: <20171209084338.29395-4-clg@kaod.org>
On Sat, Dec 09, 2017 at 09:43:22AM +0100, Cédric Le Goater wrote:
> Each XIVE interrupt source is associated with a two bit state machine
> called an Event State Buffer (ESB) : the first bit "P" means that an
> interrupt is "pending" and waiting for an EOI and the bit "Q" (queued)
> means a new interrupt was triggered while another was still pending.
>
> When an event is triggered, the associated interrupt state bits are
> fetched and modified and forwarded to the virtualization engine of the
> controller doing the routing. These can also be controlled by MMIO, to
> trigger events or turn off the sources for instance. See code for more
> details on the states and transitions.
>
> The MMIO space for the ESBs is 512GB large on the bare-metal system
> (PowerNV) and the BAR depends on the chip id. In our model for the
> sPAPR machine, we choose to only map the sub-region for the
> provisioned IRQ numbers and to use the mapping address of chip 0 of a
> real system.

I think we probably want a device property to make the virtualized
base address arbitrary. It's fine for it to default to the chip 0
base, but that'll make it easier to adapt if we need to later on.

As noted in the followup messages, I think you're going to want to
move this stuff from the current xive object into a "block of sources"
object.

Apart from that this looks pretty sound.
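
To make the property idea a bit more concrete, here is a rough
standalone sketch. It is not QEMU code: XiveSourceBlock,
xive_source_block_init() and xive_esb_addr() are made-up names, and
"a base of 0 means use the default" is just an assumption for the
illustration. In the device itself this would presumably be a qdev
property rather than an init argument. The constants are the ones from
this patch.

```c
#include <stdint.h>

/* Values taken from the patch (hw/intc/xive-internal.h). */
#define P9_MMIO_BASE    0x006000000000000ull
#define VC_BAR_DEFAULT  0x10000000000ull
#define ESB_SHIFT       16      /* one 64k ESB page per source */

/* Hypothetical "block of sources" with a configurable base address. */
typedef struct XiveSourceBlock {
    uint64_t esb_base;  /* guest physical base of the ESB region */
    uint32_t nr_irqs;   /* number of provisioned sources */
} XiveSourceBlock;

/* A base of 0 selects the chip 0 default (assumption of this sketch). */
static void xive_source_block_init(XiveSourceBlock *blk, uint64_t esb_base,
                                   uint32_t nr_irqs)
{
    blk->esb_base = esb_base ? esb_base : (P9_MMIO_BASE | VC_BAR_DEFAULT);
    blk->nr_irqs = nr_irqs;
}

/* Address of the ESB page of a source, or 0 if lisn is out of range. */
static uint64_t xive_esb_addr(const XiveSourceBlock *blk, uint32_t lisn)
{
    if (lisn >= blk->nr_irqs) {
        return 0;
    }
    return blk->esb_base + ((uint64_t)lisn << ESB_SHIFT);
}
```

With the default base, source 2 would land at 0x6010000020000; with an
overridden base the same arithmetic applies, which is all the machine
code needs in order to map the region elsewhere later.
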
> In the real world, each source may have different characteristics
> depending on the revision of a controller or the CPU. Early systems
> had two different MMIO pages for trigger and for EOI. We choose to use
> the same characteristics for all sources to simplify the model. The
> minimum CPU level for XIVE exploitation mode will be DD2.X as it has
> full support.
>
> The OS will obtain the address of the MMIO page of the ESB entry
> associated with a source and its characteristic using the
> H_INT_GET_SOURCE_INFO hcall. This will be addressed in the patch
> introducing the hcalls.
>
> The spapr_xive_irq() routine in charge of triggering the CPU interrupt
> line will be filled later on.
>
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
>
> Changes since v1:
>
> - merged in the same patch the qemu_irq handlers
> - reworked the event notification logic of the qemu_irq handlers.
> - introduced XIVE_ESB_STORE_EOI support
> - removed 'esb_shift' field
> - removed a useless check on the validity of the IVE in the memory
> region handlers.
> - fixed spapr_xive_pq_trigger() to return true when XIVE_ESB_QUEUED
> is set
> - removed the overall ESB memory region. We now have only one region
> for the provisioned sources.
> - improved 'info pic' output
>
> hw/intc/spapr_xive.c | 254 +++++++++++++++++++++++++++++++++++++++++++-
> hw/intc/xive-internal.h | 10 ++
> include/hw/ppc/spapr_xive.h | 9 ++
> 3 files changed, 271 insertions(+), 2 deletions(-)
>
> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> index e6e8841add17..43df6814619d 100644
> --- a/hw/intc/spapr_xive.c
> +++ b/hw/intc/spapr_xive.c
> @@ -18,23 +18,252 @@
>
> #include "xive-internal.h"
>
> +static void spapr_xive_irq(sPAPRXive *xive, int lisn)
> +{
> +
> +}
> +
> /*
> - * Main XIVE object
> + * XIVE Interrupt Source
> + */
> +
> +/*
> + * "magic" Event State Buffer (ESB) MMIO offsets.
> + *
> + * Each interrupt source has a 2-bit state machine called ESB
> + * which can be controlled by MMIO. It's made of 2 bits, P and
> + * Q. P indicates that an interrupt is pending (has been sent
> + * to a queue and is waiting for an EOI). Q indicates that the
> + * interrupt has been triggered while pending.
> + *
> + * This acts as a coalescing mechanism in order to guarantee
> + * that a given interrupt only occurs at most once in a queue.
> + *
> + * When doing an EOI, the Q bit will indicate if the interrupt
> + * needs to be re-triggered.
> + *
> + * The following offsets into the ESB MMIO allow to read or
> + * manipulate the PQ bits. They must be used with an 8-bytes
> + * load instruction. They all return the previous state of the
> + * interrupt (atomically).
> + *
> + * Additionally, some ESB pages support doing an EOI via a
> + * store at 0 and some ESBs support doing a trigger via a
> + * separate trigger page.
> + */
> +#define XIVE_ESB_STORE_EOI 0x400 /* Store */
> +#define XIVE_ESB_LOAD_EOI 0x000 /* Load */
> +#define XIVE_ESB_GET 0x800 /* Load */
> +#define XIVE_ESB_SET_PQ_00 0xc00 /* Load */
> +#define XIVE_ESB_SET_PQ_01 0xd00 /* Load */
> +#define XIVE_ESB_SET_PQ_10 0xe00 /* Load */
> +#define XIVE_ESB_SET_PQ_11 0xf00 /* Load */
> +
> +#define XIVE_ESB_VAL_P 0x2
> +#define XIVE_ESB_VAL_Q 0x1
> +
> +#define XIVE_ESB_RESET 0x0
> +#define XIVE_ESB_PENDING XIVE_ESB_VAL_P
> +#define XIVE_ESB_QUEUED (XIVE_ESB_VAL_P | XIVE_ESB_VAL_Q)
> +#define XIVE_ESB_OFF XIVE_ESB_VAL_Q
> +
> +static uint8_t spapr_xive_pq_get(sPAPRXive *xive, uint32_t lisn)
> +{
> +    uint32_t byte = lisn / 4;
> +    uint32_t bit = (lisn % 4) * 2;
> +
> +    assert(byte < xive->sbe_size);
> +
> +    return (xive->sbe[byte] >> bit) & 0x3;
> +}
> +
> +static uint8_t spapr_xive_pq_set(sPAPRXive *xive, uint32_t lisn, uint8_t pq)
> +{
> +    uint32_t byte = lisn / 4;
> +    uint32_t bit = (lisn % 4) * 2;
> +    uint8_t old, new;
> +
> +    assert(byte < xive->sbe_size);
> +
> +    old = xive->sbe[byte];
> +
> +    new = xive->sbe[byte] & ~(0x3 << bit);
> +    new |= (pq & 0x3) << bit;
> +
> +    xive->sbe[byte] = new;
> +
> +    return (old >> bit) & 0x3;
> +}
> +
> +static bool spapr_xive_pq_eoi(sPAPRXive *xive, uint32_t lisn)
> +{
> +    uint8_t old_pq = spapr_xive_pq_get(xive, lisn);
> +
> +    switch (old_pq) {
> +    case XIVE_ESB_RESET:
> +        spapr_xive_pq_set(xive, lisn, XIVE_ESB_RESET);
> +        return false;
> +    case XIVE_ESB_PENDING:
> +        spapr_xive_pq_set(xive, lisn, XIVE_ESB_RESET);
> +        return false;
> +    case XIVE_ESB_QUEUED:
> +        spapr_xive_pq_set(xive, lisn, XIVE_ESB_PENDING);
> +        return true;
> +    case XIVE_ESB_OFF:
> +        spapr_xive_pq_set(xive, lisn, XIVE_ESB_OFF);
> +        return false;
> +    default:
> +        g_assert_not_reached();
> +    }
> +}
> +
> +/*
> + * Returns whether the event notification should be forwarded to the
> + * IVE for routing.
> */
> +static bool spapr_xive_pq_trigger(sPAPRXive *xive, uint32_t lisn)
> +{
> +    uint8_t old_pq = spapr_xive_pq_get(xive, lisn);
>
> +    switch (old_pq) {
> +    case XIVE_ESB_RESET:
> +        spapr_xive_pq_set(xive, lisn, XIVE_ESB_PENDING);
> +        return true;
> +    case XIVE_ESB_PENDING:
> +        spapr_xive_pq_set(xive, lisn, XIVE_ESB_QUEUED);
> +        return false;
> +    case XIVE_ESB_QUEUED:
> +        spapr_xive_pq_set(xive, lisn, XIVE_ESB_QUEUED);
> +        return false;
> +    case XIVE_ESB_OFF:
> +        spapr_xive_pq_set(xive, lisn, XIVE_ESB_OFF);
> +        return false;
> +    default:
> +        g_assert_not_reached();
> +    }
> +}
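
The trigger and EOI transitions match my reading of the ESB
description. As a sanity check, here is a condensed standalone version
that can be exercised outside QEMU. The names (esb_trigger, esb_eoi,
ESB_*) are hypothetical and it works on a single 2-bit value rather
than on the packed table, so take it as an illustration of the state
machine only.

```c
#include <stdbool.h>
#include <stdint.h>

#define ESB_VAL_P    0x2
#define ESB_VAL_Q    0x1

#define ESB_RESET    0x0
#define ESB_OFF      ESB_VAL_Q
#define ESB_PENDING  ESB_VAL_P
#define ESB_QUEUED   (ESB_VAL_P | ESB_VAL_Q)

/* Trigger: returns true when the event should be forwarded for routing. */
static bool esb_trigger(uint8_t *pq)
{
    switch (*pq & 0x3) {
    case ESB_RESET:
        *pq = ESB_PENDING;
        return true;
    case ESB_PENDING:
    case ESB_QUEUED:
        /* Already pending: coalesce by remembering it in Q. */
        *pq = ESB_QUEUED;
        return false;
    default: /* ESB_OFF: source is disabled, state is left alone */
        return false;
    }
}

/* EOI: returns true when a coalesced event has to be forwarded again. */
static bool esb_eoi(uint8_t *pq)
{
    switch (*pq & 0x3) {
    case ESB_QUEUED:
        *pq = ESB_PENDING;
        return true;
    case ESB_OFF:
        return false;
    default: /* ESB_RESET or ESB_PENDING */
        *pq = ESB_RESET;
        return false;
    }
}
```

Starting from reset, two triggers followed by two EOIs should give:
forward, coalesce, re-forward on the first EOI, and back to reset on
the second.
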
> +
> +/*
> + * XIVE Interrupt Source MMIOs
> + */
> +
> +/*
> + * Some HW use a separate page for trigger. We only support the case
> + * in which the trigger can be done in the same page as the EOI.
> + */
> +static uint64_t spapr_xive_esb_read(void *opaque, hwaddr addr, unsigned size)
> +{
> +    sPAPRXive *xive = SPAPR_XIVE(opaque);
> +    uint32_t offset = addr & 0xF00;
> +    uint32_t lisn = addr >> ESB_SHIFT;
> +    uint64_t ret = -1;
> +
> +    switch (offset) {
> +    case XIVE_ESB_LOAD_EOI:
> +        /*
> +         * EOI on load is not used anymore as we now advertise
> +         * XIVE_ESB_STORE_EOI support for the interrupt sources
> +         */
> +        ret = spapr_xive_pq_eoi(xive, lisn);
> +        break;
> +
> +    case XIVE_ESB_GET:
> +        ret = spapr_xive_pq_get(xive, lisn);
> +        break;
> +
> +    case XIVE_ESB_SET_PQ_00:
> +    case XIVE_ESB_SET_PQ_01:
> +    case XIVE_ESB_SET_PQ_10:
> +    case XIVE_ESB_SET_PQ_11:
> +        ret = spapr_xive_pq_set(xive, lisn, (offset >> 8) & 0x3);
> +        break;
> +    default:
> +        qemu_log_mask(LOG_GUEST_ERROR, "XIVE: invalid ESB addr %d\n", offset);
> +    }
> +
> +    return ret;
> +}
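
For anyone following the decode logic in the read handler: an access
address splits into the source number (one 64k page per source) and
the page offset masked with 0xF00, and for the SET_PQ_xx offsets bits
9:8 of the offset are the new PQ value. A small standalone sketch,
with made-up helper names (esb_decode, esb_set_pq_value) and assuming
the single-page layout used in this patch:

```c
#include <stdint.h>

#define ESB_SHIFT           16      /* one 64k page per source */

#define XIVE_ESB_GET        0x800
#define XIVE_ESB_SET_PQ_00  0xc00
#define XIVE_ESB_SET_PQ_01  0xd00
#define XIVE_ESB_SET_PQ_10  0xe00
#define XIVE_ESB_SET_PQ_11  0xf00

/* Hypothetical decoded form of an ESB MMIO access. */
typedef struct EsbAccess {
    uint32_t lisn;    /* source number, from the page index */
    uint32_t offset;  /* "magic" offset within the page */
} EsbAccess;

/* addr is relative to the start of the ESB region. */
static EsbAccess esb_decode(uint64_t addr)
{
    EsbAccess a = {
        .lisn = (uint32_t)(addr >> ESB_SHIFT),
        .offset = (uint32_t)(addr & 0xF00),
    };
    return a;
}

/* For the SET_PQ_xx offsets, bits 9:8 carry the PQ value to set. */
static uint8_t esb_set_pq_value(uint32_t offset)
{
    return (offset >> 8) & 0x3;
}
```

Note that the 0xF00 mask drops the low byte of the offset, so an access
at 0x808 within a page decodes the same way as one at 0x800.
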
> +
> +static void spapr_xive_esb_write(void *opaque, hwaddr addr,
> +                                 uint64_t value, unsigned size)
> +{
> +    sPAPRXive *xive = SPAPR_XIVE(opaque);
> +    uint32_t offset = addr & 0xF00;
> +    uint32_t lisn = addr >> ESB_SHIFT;
> +    bool notify = false;
> +
> +    switch (offset) {
> +    case 0:
> +        notify = spapr_xive_pq_trigger(xive, lisn);
> +        break;
> +    case XIVE_ESB_STORE_EOI:
> +        /* If the Q bit is set, we should forward a new source event
> +         * notification
> +         */
> +        notify = spapr_xive_pq_eoi(xive, lisn);
> +        break;
> +    default:
> +        qemu_log_mask(LOG_GUEST_ERROR, "XIVE: invalid ESB write addr %d\n",
> +                      offset);
> +        return;
> +    }
> +
> +    /* Forward the source event notification for routing */
> +    if (notify) {
> +        spapr_xive_irq(xive, lisn);
> +    }
> +}
> +
> +static const MemoryRegionOps spapr_xive_esb_ops = {
> +    .read = spapr_xive_esb_read,
> +    .write = spapr_xive_esb_write,
> +    .endianness = DEVICE_BIG_ENDIAN,
> +    .valid = {
> +        .min_access_size = 8,
> +        .max_access_size = 8,
> +    },
> +    .impl = {
> +        .min_access_size = 8,
> +        .max_access_size = 8,
> +    },
> +};
> +
> +static void spapr_xive_source_set_irq(void *opaque, int lisn, int val)
> +{
> +    sPAPRXive *xive = SPAPR_XIVE(opaque);
> +    bool notify = false;
> +
> +    if (val) {
> +        notify = spapr_xive_pq_trigger(xive, lisn);
> +    }
> +
> +    /* Forward the source event notification for routing */
> +    if (notify) {
> +        spapr_xive_irq(xive, lisn);
> +    }
> +}
> +
> +/*
> + * Main XIVE object
> + */
> void spapr_xive_pic_print_info(sPAPRXive *xive, Monitor *mon)
> {
> int i;
>
> for (i = 0; i < xive->nr_irqs; i++) {
> XiveIVE *ive = &xive->ivt[i];
> + uint8_t pq;
>
> if (!(ive->w & IVE_VALID)) {
> continue;
> }
>
> - monitor_printf(mon, " %4x %s %08x %08x\n", i,
> + pq = spapr_xive_pq_get(xive, i);
> +
> + monitor_printf(mon, " %4x %s %c%c %08x %08x\n", i,
> ive->w & IVE_MASKED ? "M" : " ",
> + pq & XIVE_ESB_VAL_P ? 'P' : '-',
> + pq & XIVE_ESB_VAL_Q ? 'Q' : '-',
> (int) GETFIELD(IVE_EQ_INDEX, ive->w),
> (int) GETFIELD(IVE_EQ_DATA, ive->w));
> }
> @@ -52,6 +281,9 @@ static void spapr_xive_reset(DeviceState *dev)
> ive->w |= IVE_MASKED;
> }
> }
> +
> +    /* SBEs are initialized to 0b01 which corresponds to "ints off" */
> +    memset(xive->sbe, 0x55, xive->sbe_size);
> }
>
> static void spapr_xive_realize(DeviceState *dev, Error **errp)
> @@ -65,6 +297,23 @@ static void spapr_xive_realize(DeviceState *dev, Error **errp)
>
> /* Allocate the IVT (Interrupt Virtualization Table) */
> xive->ivt = g_new0(XiveIVE, xive->nr_irqs);
> +
> +    /* QEMU IRQs */
> +    xive->qirqs = qemu_allocate_irqs(spapr_xive_source_set_irq, xive,
> +                                     xive->nr_irqs);
> +
> +    /* Allocate SBEs (State Bit Entry). 2 bits, so 4 entries per byte */
> +    xive->sbe_size = DIV_ROUND_UP(xive->nr_irqs, 4);
> +    xive->sbe = g_malloc0(xive->sbe_size);
> +
> +    /* VC BAR. Use address of chip 0 to install the ESB memory region
> +     * for *all* interrupt sources */
> +    xive->esb_base = (P9_MMIO_BASE | VC_BAR_DEFAULT);
> +
> +    memory_region_init_io(&xive->esb_iomem, OBJECT(xive),
> +                          &spapr_xive_esb_ops, xive, "xive.esb",
> +                          (1ull << ESB_SHIFT) * xive->nr_irqs);
> +    sysbus_init_mmio(SYS_BUS_DEVICE(dev), &xive->esb_iomem);
> }
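
The SBE sizing and the reset value can be checked in isolation. This
is only an illustration with hypothetical helpers (sbe_bytes,
sbe_reset, sbe_get), not code from the patch: with 2 bits per source
there are 4 entries per byte, and 0x55 is 0b01010101, so every entry
reads back as 0b01 (P=0, Q=1), i.e. off.

```c
#include <stdint.h>
#include <string.h>

/* Bytes needed for nr_irqs 2-bit entries, rounded up. */
static uint32_t sbe_bytes(uint32_t nr_irqs)
{
    return (nr_irqs + 3) / 4;
}

/* Put every entry in the "off" state (PQ = 0b01). */
static void sbe_reset(uint8_t *sbe, size_t size)
{
    memset(sbe, 0x55, size);
}

/* Read the 2-bit PQ entry of one source from the packed table. */
static uint8_t sbe_get(const uint8_t *sbe, uint32_t lisn)
{
    uint32_t byte = lisn / 4;
    uint32_t bit = (lisn % 4) * 2;

    return (sbe[byte] >> bit) & 0x3;
}
```

For 10 sources this gives a 3-byte table whose last byte carries two
unused entries; they are also set to 0b01 by the reset, which is
harmless.
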
>
> static const VMStateDescription vmstate_spapr_xive_ive = {
> @@ -92,6 +341,7 @@ static const VMStateDescription vmstate_spapr_xive = {
> VMSTATE_UINT32_EQUAL(nr_irqs, sPAPRXive, NULL),
> VMSTATE_STRUCT_VARRAY_UINT32(ivt, sPAPRXive, nr_irqs, 1,
> vmstate_spapr_xive_ive, XiveIVE),
> + VMSTATE_VBUFFER_UINT32(sbe, sPAPRXive, 1, NULL, sbe_size),
> VMSTATE_END_OF_LIST()
> },
> };
> diff --git a/hw/intc/xive-internal.h b/hw/intc/xive-internal.h
> index 132b71a6daf0..872648dd96a2 100644
> --- a/hw/intc/xive-internal.h
> +++ b/hw/intc/xive-internal.h
> @@ -16,6 +16,16 @@
> #define SETFIELD(m, v, val) \
> (((v) & ~(m)) | ((((typeof(v))(val)) << MASK_TO_LSH(m)) & (m)))
>
> +/*
> + * XIVE MMIO regions
> + */
> +#define P9_MMIO_BASE 0x006000000000000ull
> +
> +/* VC BAR contains set translations for the ESBs and the EQs. */
> +#define VC_BAR_DEFAULT 0x10000000000ull
> +#define VC_BAR_SIZE 0x08000000000ull
> +#define ESB_SHIFT 16 /* One 64k page. OPAL has two */
> +
> /* IVE/EAS
> *
> * One per interrupt source. Targets that interrupt to a given EQ
> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> index 5b1f78e06a1e..ecc15d889b74 100644
> --- a/include/hw/ppc/spapr_xive.h
> +++ b/include/hw/ppc/spapr_xive.h
> @@ -24,8 +24,17 @@ struct sPAPRXive {
> /* Properties */
> uint32_t nr_irqs;
>
> + /* IRQ */
> + qemu_irq *qirqs;
> +
> /* XIVE internal tables */
> XiveIVE *ivt;
> + uint8_t *sbe;
> + uint32_t sbe_size;
> +
> + /* ESB memory region */
> + hwaddr esb_base;
> + MemoryRegion esb_iomem;
> };
>
> bool spapr_xive_irq_enable(sPAPRXive *xive, uint32_t lisn);
--
David Gibson                   | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
                               | _way_ _around_!
http://www.ozlabs.org/~dgibson