From: Marc Zyngier <maz@kernel.org>
To: Yu Zhao <yuzhao@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Catalin Marinas <catalin.marinas@arm.com>,
Muchun Song <muchun.song@linux.dev>,
Thomas Gleixner <tglx@linutronix.de>,
Will Deacon <will@kernel.org>,
Douglas Anderson <dianders@chromium.org>,
Mark Rutland <mark.rutland@arm.com>,
Nanyong Sun <sunnanyong@huawei.com>,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v1 3/6] irqchip/gic-v3: support SGI broadcast
Date: Tue, 22 Oct 2024 16:03:30 +0100
Message-ID: <86a5ew41tp.wl-maz@kernel.org>
In-Reply-To: <20241021042218.746659-4-yuzhao@google.com>

On Mon, 21 Oct 2024 05:22:15 +0100,
Yu Zhao <yuzhao@google.com> wrote:
>
> GIC v3 and later support SGI broadcast, i.e., the mode that routes
> interrupts to all PEs in the system excluding the local CPU.
>
> Supporting this mode can avoid looping through all the remote CPUs
> when broadcasting SGIs, especially for systems with 200+ CPUs. The
> performance improvement can be measured with the rest of this series
> booted with "hugetlb_free_vmemmap=on irqchip.gicv3_pseudo_nmi=1":
>
>   cd /sys/kernel/mm/hugepages/
>   echo 600 >hugepages-1048576kB/nr_hugepages
>   echo 2048kB >hugepages-1048576kB/demote_size
>   perf record -g -- bash -c "echo 600 >hugepages-1048576kB/demote"
>
>            gic_ipi_send_mask()  bash sys time
>   Before:  38.14%               0m10.513s
>   After:    0.20%               0m5.132s
>
> Signed-off-by: Yu Zhao <yuzhao@google.com>
> ---
> drivers/irqchip/irq-gic-v3.c | 20 +++++++++++++++++++-
> 1 file changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index ce87205e3e82..42c39385e1b9 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -1394,9 +1394,20 @@ static void gic_send_sgi(u64 cluster_id, u16 tlist, unsigned int irq)
>  	gic_write_sgi1r(val);
>  }
>
> +static void gic_broadcast_sgi(unsigned int irq)
> +{
> +	u64 val;
> +
> +	val = BIT(ICC_SGI1R_IRQ_ROUTING_MODE_BIT) | (irq << ICC_SGI1R_SGI_ID_SHIFT);

As picked up by the test bot, please fix the 32bit build.
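
The robot's report is not quoted in this message, so the following is only a guess at the cause: BIT() shifts an unsigned long, which is 32 bits wide on 32-bit Arm, while the routing mode bit sits above bit 31. A minimal userspace sketch of a width-safe construction is below. The two constants are assumed values meant to mirror include/linux/irqchip/arm-gic-v3.h, and BIT_U64() and sgi1r_broadcast_val() are hypothetical names standing in for the kernel's BIT_ULL() and the helper in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, meant to mirror include/linux/irqchip/arm-gic-v3.h. */
#define SGI1R_IRQ_ROUTING_MODE_BIT	40
#define SGI1R_SGI_ID_SHIFT		24

/* Hypothetical stand-in for the kernel's BIT_ULL(): always 64 bits wide. */
#define BIT_U64(n)	(UINT64_C(1) << (n))

/* Build an SGI1R value that selects broadcast routing for SGI 'irq'. */
static uint64_t sgi1r_broadcast_val(unsigned int irq)
{
	assert(irq < 16);	/* same bound the driver checks with WARN_ON() */

	/* Widen irq before shifting so nothing is computed in 32 bits. */
	return BIT_U64(SGI1R_IRQ_ROUTING_MODE_BIT) |
	       ((uint64_t)irq << SGI1R_SGI_ID_SHIFT);
}
```

In the driver itself the equivalent change would presumably be to use BIT_ULL() and cast irq to u64, but that is an assumption until checked against the actual build log.
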
> +
> +	pr_devel("CPU %d: broadcasting SGI %u\n", smp_processor_id(), irq);
> +	gic_write_sgi1r(val);
> +}
> +
>  static void gic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
>  {
>  	int cpu;
> +	cpumask_t broadcast;
>
>  	if (WARN_ON(d->hwirq >= 16))
>  		return;
> @@ -1407,6 +1418,13 @@ static void gic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
>  	 */
>  	dsb(ishst);
>
> +	cpumask_copy(&broadcast, cpu_present_mask);

Why cpu_present_mask? I'd expect that cpu_online_mask should be the
correct mask to use -- we don't IPI offline CPUs, in general.

> +	cpumask_clear_cpu(smp_processor_id(), &broadcast);
> +	if (cpumask_equal(&broadcast, mask)) {
> +		gic_broadcast_sgi(d->hwirq);
> +		goto done;
> +	}

So the (valid) case where you would IPI *everyone* is not handled as a
fast path? That seems like a missed opportunity.

This also seems like an expensive way to do it. How about something
like:

	int mcnt = cpumask_weight(mask);
	int ocnt = cpumask_weight(cpu_online_mask);

	if (mcnt == ocnt) {
		/* Broadcast to all CPUs including self */
	} else if (mcnt == (ocnt - 1) &&
		   !cpumask_test_cpu(smp_processor_id(), mask)) {
		/* Broadcast to all but self */
	}

which avoids the copy+update+full compare.
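
To make the counting argument concrete, here is a small userspace model of the same check, as a sketch only. It represents CPU masks as 64-bit words instead of struct cpumask, and every name in it (sgi_route, mask_weight(), classify_ipi_mask()) is hypothetical. It also assumes that the mask passed in is a subset of the online mask, which the weight comparison relies on.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical routing decisions; not names used by the driver. */
enum sgi_route {
	SGI_ROUTE_TLIST,	/* fall back to per-cluster target lists */
	SGI_ROUTE_ALL,		/* every online CPU, including the sender */
	SGI_ROUTE_ALL_BUT_SELF,	/* every online CPU except the sender */
};

/* Population count, standing in for cpumask_weight(). */
static int mask_weight(uint64_t mask)
{
	int n = 0;

	for (; mask; mask &= mask - 1)	/* clear the lowest set bit */
		n++;
	return n;
}

/* Classify 'mask', assuming it only contains bits also set in 'online'. */
static enum sgi_route classify_ipi_mask(uint64_t mask, uint64_t online,
					unsigned int self)
{
	int mcnt = mask_weight(mask);
	int ocnt = mask_weight(online);

	assert(self < 64);

	if (mcnt == ocnt)
		return SGI_ROUTE_ALL;
	if (mcnt == ocnt - 1 && !(mask & (UINT64_C(1) << self)))
		return SGI_ROUTE_ALL_BUT_SELF;
	return SGI_ROUTE_TLIST;
}
```

With four CPUs online (0xf) and CPU 0 sending, a mask of 0xe classifies as all-but-self, while 0x7 has the same weight but includes the sender and so falls back to the target list. Since the broadcast mode described in the commit message excludes the local CPU, how the including-self case gets delivered is left open in this sketch.
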

Thanks,

	M.

--
Without deviation from the norm, progress is not possible.