qemu-devel.nongnu.org archive mirror
From: David Gibson <david@gibson.dropbear.id.au>
To: "Cédric Le Goater" <clg@kaod.org>
Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>
Subject: Re: [Qemu-devel] [PATCH v7 09/19] spapr: add device tree support for the XIVE exploitation mode
Date: Wed, 12 Dec 2018 11:19:27 +1100
Message-ID: <20181212001927.GA2719@umbus.fritz.box>
In-Reply-To: <4070f6b9-a4e3-6fd0-f6d7-45f33b869426@kaod.org>

On Tue, Dec 11, 2018 at 10:06:46AM +0100, Cédric Le Goater wrote:
> On 12/11/18 1:38 AM, David Gibson wrote:
> > On Mon, Dec 10, 2018 at 08:53:17AM +0100, Cédric Le Goater wrote:
> >> On 12/10/18 7:39 AM, David Gibson wrote:
> >>> On Sun, Dec 09, 2018 at 08:46:00PM +0100, Cédric Le Goater wrote:
> >>>> The XIVE interface for the guest is described in the device tree under
> >>>> the "interrupt-controller" node. A couple of new properties are
> >>>> specific to XIVE :
> >>>>
> >>>>  - "reg"
> >>>>
> >>>>    contains the base address and size of the thread interrupt
> >>>>    management areas (TIMA), for the User level and for the Guest OS
> >>>>    level. Only the Guest OS level is taken into account today.
> >>>>
> >>>>  - "ibm,xive-eq-sizes"
> >>>>
> >>>>    the size of the event queues. One cell per size supported, contains
> >>>>    log2 of size, in ascending order.
> >>>>
> >>>>  - "ibm,xive-lisn-ranges"
> >>>>
> >>>>    the IRQ interrupt number ranges assigned to the guest for the IPIs.
> >>>>
> >>>> and also under the root node :
> >>>>
> >>>>  - "ibm,plat-res-int-priorities"
> >>>>
> >>>>    contains a list of priorities that the hypervisor has reserved for
> >>>>    its own use. OPAL uses the priority 7 queue to automatically
> >>>>    escalate interrupts for all other queues (DD2.X POWER9). So only
> >>>>    priorities [0..6] are allowed for the guest.
> >>>>
> >>>> Extend the sPAPR IRQ backend with a new handler to populate the DT
> >>>> with the appropriate "interrupt-controller" node.
> >>>>
> >>>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> >>>> ---
> >>>>  include/hw/ppc/spapr_irq.h  |  2 ++
> >>>>  include/hw/ppc/spapr_xive.h |  2 ++
> >>>>  include/hw/ppc/xics.h       |  4 +--
> >>>>  hw/intc/spapr_xive.c        | 64 +++++++++++++++++++++++++++++++++++++
> >>>>  hw/intc/xics_spapr.c        |  3 +-
> >>>>  hw/ppc/spapr.c              |  3 +-
> >>>>  hw/ppc/spapr_irq.c          |  3 ++
> >>>>  7 files changed, 77 insertions(+), 4 deletions(-)
> >>>>
> >>>> diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
> >>>> index 23cdb51b879e..e51e9f052f63 100644
> >>>> --- a/include/hw/ppc/spapr_irq.h
> >>>> +++ b/include/hw/ppc/spapr_irq.h
> >>>> @@ -39,6 +39,8 @@ typedef struct sPAPRIrq {
> >>>>      void (*free)(sPAPRMachineState *spapr, int irq, int num);
> >>>>      qemu_irq (*qirq)(sPAPRMachineState *spapr, int irq);
> >>>>      void (*print_info)(sPAPRMachineState *spapr, Monitor *mon);
> >>>> +    void (*dt_populate)(sPAPRMachineState *spapr, uint32_t nr_servers,
> >>>> +                        void *fdt, uint32_t phandle);
> >>>>  } sPAPRIrq;
> >>>>  
> >>>>  extern sPAPRIrq spapr_irq_xics;
> >>>> diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
> >>>> index 9506a8f4d10a..728a5e8dc163 100644
> >>>> --- a/include/hw/ppc/spapr_xive.h
> >>>> +++ b/include/hw/ppc/spapr_xive.h
> >>>> @@ -45,5 +45,7 @@ qemu_irq spapr_xive_qirq(sPAPRXive *xive, uint32_t lisn);
> >>>>  typedef struct sPAPRMachineState sPAPRMachineState;
> >>>>  
> >>>>  void spapr_xive_hcall_init(sPAPRMachineState *spapr);
> >>>> +void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
> >>>> +                   uint32_t phandle);
> >>>>  
> >>>>  #endif /* PPC_SPAPR_XIVE_H */
> >>>> diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
> >>>> index 9958443d1984..14afda198cdb 100644
> >>>> --- a/include/hw/ppc/xics.h
> >>>> +++ b/include/hw/ppc/xics.h
> >>>> @@ -181,8 +181,6 @@ typedef struct XICSFabricClass {
> >>>>      ICPState *(*icp_get)(XICSFabric *xi, int server);
> >>>>  } XICSFabricClass;
> >>>>  
> >>>> -void spapr_dt_xics(int nr_servers, void *fdt, uint32_t phandle);
> >>>> -
> >>>>  ICPState *xics_icp_get(XICSFabric *xi, int server);
> >>>>  
> >>>>  /* Internal XICS interfaces */
> >>>> @@ -204,6 +202,8 @@ void icp_resend(ICPState *ss);
> >>>>  
> >>>>  typedef struct sPAPRMachineState sPAPRMachineState;
> >>>>  
> >>>> +void spapr_dt_xics(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
> >>>> +                   uint32_t phandle);
> >>>>  int xics_kvm_init(sPAPRMachineState *spapr, Error **errp);
> >>>>  void xics_spapr_init(sPAPRMachineState *spapr);
> >>>>  
> >>>> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> >>>> index 982ac6e17051..a6d854b07690 100644
> >>>> --- a/hw/intc/spapr_xive.c
> >>>> +++ b/hw/intc/spapr_xive.c
> >>>> @@ -14,6 +14,7 @@
> >>>>  #include "target/ppc/cpu.h"
> >>>>  #include "sysemu/cpus.h"
> >>>>  #include "monitor/monitor.h"
> >>>> +#include "hw/ppc/fdt.h"
> >>>>  #include "hw/ppc/spapr.h"
> >>>>  #include "hw/ppc/spapr_xive.h"
> >>>>  #include "hw/ppc/xive.h"
> >>>> @@ -1381,3 +1382,66 @@ void spapr_xive_hcall_init(sPAPRMachineState *spapr)
> >>>>      spapr_register_hypercall(H_INT_SYNC, h_int_sync);
> >>>>      spapr_register_hypercall(H_INT_RESET, h_int_reset);
> >>>>  }
> >>>> +
> >>>> +void spapr_dt_xive(sPAPRMachineState *spapr, uint32_t nr_servers, void *fdt,
> >>>> +                   uint32_t phandle)
> >>>> +{
> >>>> +    sPAPRXive *xive = spapr->xive;
> >>>> +    int node;
> >>>> +    uint64_t timas[2 * 2];
> >>>> +    /* Interrupt number ranges for the IPIs */
> >>>> +    uint32_t lisn_ranges[] = {
> >>>> +        cpu_to_be32(0),
> >>>> +        cpu_to_be32(nr_servers),
> >>>> +    };
> >>>> +    uint32_t eq_sizes[] = {
> >>>> +        cpu_to_be32(12), /* 4K */
> >>>> +        cpu_to_be32(16), /* 64K */
> >>>> +        cpu_to_be32(21), /* 2M */
> >>>> +        cpu_to_be32(24), /* 16M */
> >>>
> >>> For KVM, are we going to need to clamp this list based on the
> >>> pagesizes the guest can use?
> >>
> >> I would say so. Is there a KVM service for that ?
> > 
> > I'm not sure what you mean by a KVM service here.
> 
> I meant a routine giving a list of the possible backing pagesizes.

I'm still not really sure what you mean by that.

> >> Today, the OS scans the list and picks the size fitting its current 
> >> PAGE_SIZE/SHIFT. But I suppose it would be better to advertise 
> >> only the page sizes that QEMU/KVM supports. Or should we play safe 
> >> and only export : 4K, 64K ? 
> > 
> > So, I'm guessing the limitation here is that when we use the KVM XIVE
> > acceleration, the EQ will need to be host-contiguous?  
> 
> The EQ should be a single page (from the specs). So we don't have 
> a contiguity problem.

A single page according to whom, though?  An RPT guest can use 2MiB
pages, even if it is backed with smaller pages on the host, but I'm
guessing it would not work to have its EQ be a 2MiB page, since it
won't be host-contiguous.
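
To illustrate the constraint only: a minimal sketch, not code from the
patch, of what clamping the advertised EQ sizes to the backing page size
would look like. The helper name clamp_eq_sizes() and its max_shift
parameter are hypothetical, and the values are host-endian log2 sizes
rather than the big-endian cells stored in the device tree. The
discussion quoted below ends up preferring a fixed 64KiB limit over this
kind of runtime filtering.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: copy into 'out' only the
 * EQ sizes (log2 values, ascending) that fit within one backing page of
 * 2^max_shift bytes. Returns the number of entries kept.
 */
static size_t clamp_eq_sizes(const uint32_t *in, size_t n,
                             uint32_t max_shift, uint32_t *out)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++) {
        if (in[i] <= max_shift) {
            out[kept++] = in[i];
        }
    }
    return kept;
}
```

With the list from the patch (12, 16, 21, 24) and a 64KiB backing page
(max_shift of 16), only the 4K and 64K entries would remain.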

> > So, we can't have EQs larger than the backing pagesize for guest 
> > memory.
> >
> > We try to avoid having guest visible differences based on whether
> > we're using KVM or not, so we don't want to change this list depending
> > on whether KVM is active etc. - or on whether we're backing the guest
> > with hugepages.
> > 
> > We could reuse the cap-hpt-max-page-size machine parameter which also
> > relates to backing page size, but that might be confusing since it's
> > explicitly about HPT only whereas this limitation would apply to RPT
> > guests too, IIUC.
> > 
> > I think our best bet is probably to limit to 64kiB queues
> > unconditionally.  
> 
> OK. That's the default. We can refine this list of page sizes later on
> when we introduce KVM support if needed.
> 
> > AIUI, that still gives us 8192 slots in the queue,
> 
> The entries are 32bits. So that's 16384 entries. 

Even better.
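
For reference, the arithmetic behind that figure, as a small sketch. It
assumes 4-byte (32-bit) queue entries as stated above; the names
EQ_ENTRY_SHIFT and eq_entries() are made up for the illustration and do
not come from the patch.

```c
#include <stdint.h>

/* Assumption: each EQ entry is 4 bytes (32 bits), i.e. 2^2 bytes. */
#define EQ_ENTRY_SHIFT 2

/*
 * Number of entries in a queue of 2^eq_shift bytes.
 * eq_shift is the log2 size advertised in "ibm,xive-eq-sizes".
 */
static uint64_t eq_entries(uint32_t eq_shift)
{
    return 1ull << (eq_shift - EQ_ENTRY_SHIFT);
}
```

A 64KiB queue (eq_shift of 16) holds 2^14 = 16384 entries, and a 4KiB
queue (eq_shift of 12) holds 1024.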

> > which is enough to store one of every possible irq, which should be
> > enough.
> 
> yes. 
> 
> I don't think Linux measures how many entries are dequeued at once.
> It would give us a feel of how much pressure these EQs can sustain.
> 
> C.
>  
> > There's still an issue if we have a 4kiB pagesize host kernel, but we
> > could just disable kernel_irqchip in that situation, since we don't
> > expect people to be using that much in practice.
> > 
> >>>> +    };
> >>>> +    /* The following array is in sync with the reserved priorities
> >>>> +     * defined by the 'spapr_xive_priority_is_reserved' routine.
> >>>> +     */
> >>>> +    uint32_t plat_res_int_priorities[] = {
> >>>> +        cpu_to_be32(7),    /* start */
> >>>> +        cpu_to_be32(0xf8), /* count */
> >>>> +    };
> >>>> +    gchar *nodename;
> >>>> +
> >>>> +    /* Thread Interrupt Management Area : User (ring 3) and OS (ring 2) */
> >>>> +    timas[0] = cpu_to_be64(xive->tm_base +
> >>>> +                           XIVE_TM_USER_PAGE * (1ull << TM_SHIFT));
> >>>> +    timas[1] = cpu_to_be64(1ull << TM_SHIFT);
> >>>> +    timas[2] = cpu_to_be64(xive->tm_base +
> >>>> +                           XIVE_TM_OS_PAGE * (1ull << TM_SHIFT));
> >>>> +    timas[3] = cpu_to_be64(1ull << TM_SHIFT);
> >>>
> >>> So this gives the MMIO address of the TIMA.  
> >>
> >> Yes. It is considered to be a fixed "address" 
> >>
> >>> Where does the guest get the MMIO address for the ESB pages from?
> >>
> >> with the hcall H_INT_GET_SOURCE_INFO.
> > 
> > Ah, right.
> > 
> 
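
As an aside on the plat_res_int_priorities array in the quoted hunk: going
by its "start" and "count" comments, the pair (7, 0xf8) covers priorities
7 through 254, which leaves [0..6] to the guest as the commit message
says. A minimal sketch of that range check follows. The function name
prio_in_reserved_range() is hypothetical and this is not the
spapr_xive_priority_is_reserved routine the patch refers to.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if 'prio' falls inside the reserved range
 * described by a (start, count) pair, i.e. start <= prio < start + count.
 */
static bool prio_in_reserved_range(uint32_t prio, uint32_t start,
                                   uint32_t count)
{
    return prio >= start && prio - start < count;
}
```

With start 7 and count 0xf8, priority 6 is usable by the guest while 7
and 254 are reserved.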

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson


Thread overview: 61+ messages
2018-12-09 19:45 [Qemu-devel] [PATCH v7 00/19] ppc: support for the XIVE interrupt controller (POWER9) Cédric Le Goater
2018-12-09 19:45 ` [Qemu-devel] [PATCH v7 01/19] ppc/xive: add support for the END Event State Buffers Cédric Le Goater
2018-12-10  4:16   ` David Gibson
2018-12-10  7:11     ` Cédric Le Goater
2018-12-09 19:45 ` [Qemu-devel] [PATCH v7 02/19] ppc/xive: introduce the XIVE interrupt thread context Cédric Le Goater
2018-12-10  4:19   ` David Gibson
2018-12-09 19:45 ` [Qemu-devel] [PATCH v7 03/19] ppc/xive: introduce a simplified XIVE presenter Cédric Le Goater
2018-12-10  4:27   ` David Gibson
2018-12-10  7:15     ` Cédric Le Goater
2018-12-11  1:37       ` David Gibson
2018-12-11 10:43         ` Cédric Le Goater
2018-12-09 19:45 ` [Qemu-devel] [PATCH v7 04/19] ppc/xive: notify the CPU when the interrupt priority is more privileged Cédric Le Goater
2018-12-09 19:45 ` [Qemu-devel] [PATCH v7 05/19] spapr/xive: introduce a XIVE interrupt controller Cédric Le Goater
2018-12-10  4:36   ` David Gibson
2018-12-09 19:45 ` [Qemu-devel] [PATCH v7 06/19] spapr/xive: use the VCPU id as a NVT identifier Cédric Le Goater
2018-12-10  4:42   ` David Gibson
2018-12-09 19:45 ` [Qemu-devel] [PATCH v7 07/19] spapr: introduce a new machine IRQ backend for XIVE Cédric Le Goater
2018-12-10  4:45   ` David Gibson
2018-12-09 19:45 ` [Qemu-devel] [PATCH v7 08/19] spapr: add hcalls support for the XIVE exploitation interrupt mode Cédric Le Goater
2018-12-10  6:34   ` David Gibson
2018-12-09 19:46 ` [Qemu-devel] [PATCH v7 09/19] spapr: add device tree support for the XIVE exploitation mode Cédric Le Goater
2018-12-10  6:39   ` David Gibson
2018-12-10  7:53     ` Cédric Le Goater
2018-12-11  0:38       ` David Gibson
2018-12-11  9:06         ` Cédric Le Goater
2018-12-12  0:19           ` David Gibson [this message]
2018-12-12  7:37             ` Cédric Le Goater
2018-12-09 19:46 ` [Qemu-devel] [PATCH v7 10/19] spapr: allocate the interrupt thread context under the CPU core Cédric Le Goater
2018-12-09 19:46 ` [Qemu-devel] [PATCH v7 11/19] spapr: extend the sPAPR IRQ backend for XICS migration Cédric Le Goater
2018-12-09 19:46 ` [Qemu-devel] [PATCH v7 12/19] spapr: add a 'reset' method to the sPAPR IRQ backend Cédric Le Goater
2018-12-10  6:42   ` David Gibson
2018-12-10  7:30     ` Cédric Le Goater
2018-12-11 10:55     ` Cédric Le Goater
2018-12-09 19:46 ` [Qemu-devel] [PATCH v7 13/19] spapr: add an extra OV5 field " Cédric Le Goater
2018-12-09 19:46 ` [Qemu-devel] [PATCH v7 14/19] spapr: set the interrupt presenter at reset Cédric Le Goater
2018-12-11  1:46   ` David Gibson
2018-12-11 10:58     ` Cédric Le Goater
2018-12-09 19:46 ` [Qemu-devel] [PATCH v7 15/19] spapr/xive: enable XIVE MMIOs " Cédric Le Goater
2018-12-11  1:47   ` David Gibson
2018-12-11 10:14     ` Cédric Le Goater
2018-12-12  0:32       ` David Gibson
2018-12-09 19:46 ` [Qemu-devel] [PATCH v7 16/19] spapr: introduce a new sPAPR IRQ backend supporting XIVE and XICS Cédric Le Goater
2018-12-11  2:03   ` David Gibson
2018-12-11 10:19     ` Cédric Le Goater
2018-12-12  0:54       ` David Gibson
2018-12-12  9:13         ` Cédric Le Goater
2018-12-15  8:09           ` David Gibson
2018-12-09 19:46 ` [Qemu-devel] [PATCH v7 17/19] spapr: Add a pseries-4.0 machine type Cédric Le Goater
2018-12-09 22:05   ` Benjamin Herrenschmidt
2018-12-10  3:41     ` David Gibson
2018-12-10  7:09       ` Cédric Le Goater
2018-12-10  6:45   ` David Gibson
2018-12-09 19:46 ` [Qemu-devel] [PATCH v7 18/19] spapr: add a 'pseries-4.0-xive' " Cédric Le Goater
2018-12-10 22:17   ` Cédric Le Goater
2018-12-11  2:06     ` David Gibson
2018-12-11 10:42       ` Cédric Le Goater
2018-12-11 16:44         ` Cédric Le Goater
2018-12-15  8:03           ` David Gibson
2018-12-12  0:34         ` David Gibson
2018-12-12  7:26           ` Cédric Le Goater
2018-12-09 19:46 ` [Qemu-devel] [PATCH v7 19/19] spapr: add a 'pseries-4.0-dual' " Cédric Le Goater
