qemu-devel.nongnu.org archive mirror
From: Greg Kurz <groug@kaod.org>
To: "Cédric Le Goater" <clg@kaod.org>
Cc: Daniel Henrique Barboza <danielhb413@gmail.com>,
	qemu-ppc@nongnu.org, qemu-devel@nongnu.org,
	David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [RFC PATCH] spapr/xive: Allocate vCPU IPIs from local context
Date: Thu, 23 Sep 2021 11:12:49 +0200
Message-ID: <20210923111249.33c41068@bahia.huguette>
In-Reply-To: <20210922144120.1277504-1-clg@kaod.org>

On Wed, 22 Sep 2021 16:41:20 +0200
Cédric Le Goater <clg@kaod.org> wrote:

> When QEMU switches to the XIVE interrupt mode, it creates all possible
> guest interrupts at the level of the KVM device. These interrupts are
> backed by real HW interrupts from the IPI interrupt pool of the XIVE
> controller.
> 
> Currently, this is done from the QEMU main thread, which results in
> allocating all interrupts from the chip on which QEMU is running. IPIs
> are not distributed across the system and the load is not well
> balanced across the interrupt controllers.
> 
> To improve distribution on the system, we should try to allocate the
> underlying HW IPI from the vCPU context. This also benefits
> configurations using CPU pinning: the HW IPI is then allocated on the
> local chip, rerouting between interrupt controllers is reduced, and
> performance improves.
> 
> This moves the initialization of the vCPU IPIs from reset time to the
> H_INT_SET_SOURCE_CONFIG hcall, which is called from the vCPU context.
> This requires extra checks in the sequences that get and set the
> source states, to make sure a valid HW IPI is backing the guest
> interrupt. For that, in the case of a vCPU IPI, we check whether a
> target was configured in the END.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
> 
>  I have tested different SMT configurations, kernel_irqchip=off/on,
>  did some migrations, CPU hotplug, etc. This is not enough and I would
>  like more testing but, at least, the code no longer makes the bad
>  assumption that vCPU id = IPI number.
> 

Yeah, the IPI number is provided by the guest, so h_int_set_source_config()
is really the only place where we can know the IPI number of a given vCPU.
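To spell that constraint out, here is a toy standalone model (not the QEMU code; all names and numbers below are made up for illustration): the guest picks the IPI number, so the host side can only bind it to a chip-local HW IPI at the moment the hcall hands it over.

```c
#include <assert.h>

#define NR_IPIS      16
#define IPI_UNBOUND  -1

/* Toy host-side table: guest IPI number -> backing HW IPI.
 * Nothing in it can be filled in at reset time, because the
 * guest has not told us its IPI numbers yet. */
static int hw_ipi_for[NR_IPIS];

static void host_reset(void)
{
    for (int i = 0; i < NR_IPIS; i++) {
        hw_ipi_for[i] = IPI_UNBOUND;
    }
}

/* Stand-in for allocating a HW IPI on the chip local to the
 * calling vCPU; here we just derive a number from the chip id. */
static int alloc_hw_ipi_on_chip(int chip_id)
{
    static int next;
    return chip_id * 100 + next++;
}

/* The hcall is the first point where the host sees the guest's
 * chosen IPI number, hence the first point where a local HW IPI
 * can be allocated to back it. */
static void h_int_set_source_config(int lisn, int calling_chip)
{
    if (hw_ipi_for[lisn] == IPI_UNBOUND) {
        hw_ipi_for[lisn] = alloc_hw_ipi_on_chip(calling_chip);
    }
}
```

Calling h_int_set_source_config() from a vCPU running on chip 1 binds the guest IPI to a HW IPI derived from chip 1, which is the whole point of moving the allocation out of the main thread.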

>  Comments?
> 

LGTM but I didn't check if more users of xive_end_is_valid() should
be converted to using xive_source_is_initialized().
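For archive readers, the distinction the new helper draws boils down to something like the following standalone sketch (the bit layout and SPAPR_XIRQ_BASE value are simplified stand-ins, not QEMU's actual definitions): a device source is initialized as soon as its EAS is valid, whereas a vCPU IPI additionally needs a non-zero END index, which only H_INT_SET_SOURCE_CONFIG can have set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified model of the check in xive_source_is_initialized().
 * Field layout and constants are illustrative only. */
#define SPAPR_XIRQ_BASE  0x1000       /* IRQ numbers below this are vCPU IPIs */
#define EAS_VALID        (1ull << 63) /* EAS has been configured */
#define EAS_END_INDEX(w) ((w) & 0xffffffffull) /* target END, 0 = none */

static bool source_is_initialized(uint64_t eas_w, int lisn)
{
    if (!(eas_w & EAS_VALID)) {
        return false;
    }
    /* vCPU IPIs only get a backing HW IPI once the hcall has
     * routed them to an END, so require a non-zero END index. */
    if (lisn < SPAPR_XIRQ_BASE) {
        return EAS_END_INDEX(eas_w) != 0;
    }
    /* Device sources: a valid EAS is enough. */
    return true;
}
```

Any state get/set loop that touches the backing HW interrupt would need this stronger check, which is why a blanket conversion of xive_end_is_valid() users is worth auditing.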

Any chance you have some perf numbers to share?

>  hw/intc/spapr_xive.c     | 17 +++++++++++++++++
>  hw/intc/spapr_xive_kvm.c | 36 +++++++++++++++++++++++++++++++-----
>  2 files changed, 48 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
> index 6f31ce74f198..2dc594a720b1 100644
> --- a/hw/intc/spapr_xive.c
> +++ b/hw/intc/spapr_xive.c
> @@ -1089,6 +1089,23 @@ static target_ulong h_int_set_source_config(PowerPCCPU *cpu,
>      if (spapr_xive_in_kernel(xive)) {
>          Error *local_err = NULL;
>  
> +        /*
> +         * Initialize the vCPU IPIs from the vCPU context to allocate
> +         * the backing HW IPI on the local chip. This improves
> +         * distribution of the IPIs in the system and when the vCPUs
> +         * are pinned, it reduces rerouting between interrupt
> +         * controllers for better performance.
> +         */
> +        if (lisn < SPAPR_XIRQ_BASE) {
> +            XiveSource *xsrc = &xive->source;
> +
> +            kvmppc_xive_source_reset_one(xsrc, lisn, &local_err);
> +            if (local_err) {
> +                error_report_err(local_err);
> +                return H_HARDWARE;
> +            }
> +        }
> +
>          kvmppc_xive_set_source_config(xive, lisn, &new_eas, &local_err);
>          if (local_err) {
>              error_report_err(local_err);
> diff --git a/hw/intc/spapr_xive_kvm.c b/hw/intc/spapr_xive_kvm.c
> index 53731d158625..1607a59e9483 100644
> --- a/hw/intc/spapr_xive_kvm.c
> +++ b/hw/intc/spapr_xive_kvm.c
> @@ -254,7 +254,12 @@ static int kvmppc_xive_source_reset(XiveSource *xsrc, Error **errp)
>      SpaprXive *xive = SPAPR_XIVE(xsrc->xive);
>      int i;
>  
> -    for (i = 0; i < xsrc->nr_irqs; i++) {
> +    /*
> +     * vCPU IPIs are initialized at the KVM level when configured by
> +     * H_INT_SET_SOURCE_CONFIG.
> +     */
> +
> +    for (i = SPAPR_XIRQ_BASE; i < xsrc->nr_irqs; i++) {
>          int ret;
>  
>          if (!xive_eas_is_valid(&xive->eat[i])) {
> @@ -342,6 +347,27 @@ uint64_t kvmppc_xive_esb_rw(XiveSource *xsrc, int srcno, uint32_t offset,
>      }
>  }
>  
> +static bool xive_source_is_initialized(SpaprXive *xive, int lisn)
> +{
> +    assert(lisn < xive->nr_irqs);
> +
> +    if (!xive_eas_is_valid(&xive->eat[lisn])) {
> +        return false;
> +    }
> +
> +    /*
> +     * vCPU IPIs are initialized at the KVM level when configured by
> +     * H_INT_SET_SOURCE_CONFIG, in which case, we should have a valid
> +     * target (server, priority) in the END.
> +     */
> +    if (lisn < SPAPR_XIRQ_BASE) {
> +        return !!xive_get_field64(EAS_END_INDEX, xive->eat[lisn].w);
> +    }
> +
> +    /* Device sources */
> +    return true;
> +}
> +
>  static void kvmppc_xive_source_get_state(XiveSource *xsrc)
>  {
>      SpaprXive *xive = SPAPR_XIVE(xsrc->xive);
> @@ -350,7 +376,7 @@ static void kvmppc_xive_source_get_state(XiveSource *xsrc)
>      for (i = 0; i < xsrc->nr_irqs; i++) {
>          uint8_t pq;
>  
> -        if (!xive_eas_is_valid(&xive->eat[i])) {
> +        if (!xive_source_is_initialized(xive, i)) {
>              continue;
>          }
>  
> @@ -533,7 +559,7 @@ static void kvmppc_xive_change_state_handler(void *opaque, bool running,
>              uint8_t pq;
>              uint8_t old_pq;
>  
> -            if (!xive_eas_is_valid(&xive->eat[i])) {
> +            if (!xive_source_is_initialized(xive, i)) {
>                  continue;
>              }
>  
> @@ -561,7 +587,7 @@ static void kvmppc_xive_change_state_handler(void *opaque, bool running,
>      for (i = 0; i < xsrc->nr_irqs; i++) {
>          uint8_t pq;
>  
> -        if (!xive_eas_is_valid(&xive->eat[i])) {
> +        if (!xive_source_is_initialized(xive, i)) {
>              continue;
>          }
>  
> @@ -666,7 +692,7 @@ int kvmppc_xive_post_load(SpaprXive *xive, int version_id)
>  
>      /* Restore the EAT */
>      for (i = 0; i < xive->nr_irqs; i++) {
> -        if (!xive_eas_is_valid(&xive->eat[i])) {
> +        if (!xive_source_is_initialized(xive, i)) {
>              continue;
>          }
>  



Thread overview: 11+ messages
2021-09-22 14:41 [RFC PATCH] spapr/xive: Allocate vCPU IPIs from local context Cédric Le Goater
2021-09-23  9:12 ` Greg Kurz [this message]
2021-09-24 12:40   ` Cédric Le Goater
2021-09-24 13:49     ` Greg Kurz
2021-09-24 14:58       ` Cédric Le Goater
2021-09-24 17:13         ` Greg Kurz
2021-09-27 16:50           ` Cédric Le Goater
2021-09-28  7:19             ` Greg Kurz
2021-10-01  9:59               ` Cédric Le Goater
2021-10-01 10:38                 ` Greg Kurz
2021-10-01 11:11                   ` Cédric Le Goater
