From: "Roger Pau Monné" <roger.pau@citrix.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: "xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
Andrew Cooper <andrew.cooper3@citrix.com>,
Oleksii Kurochko <oleksii.kurochko@gmail.com>
Subject: Re: [PATCH v2 for-4.21 3/9] x86/HPET: replace handle_hpet_broadcast()'s on-stack cpumask_t
Date: Tue, 21 Oct 2025 15:08:18 +0100
Message-ID: <aPeT0h-S7FVgk3TZ@Mac.lan>
In-Reply-To: <c357cb79-a10d-4d81-9695-9d16a4080595@suse.com>
On Mon, Oct 20, 2025 at 01:19:20PM +0200, Jan Beulich wrote:
> With large NR_CPUS on-stack cpumask_t variables are problematic. Now that
> the IRQ handler can't be invoked in a nested manner anymore, we can
> instead use a per-CPU variable. While we can't use scratch_cpumask in code
> invoked from IRQ handlers, simply amend that one with a HPET-special form.
> (Note that only one of the two IRQ handling functions can come into play
> at any one time.)
>
> Fixes: 996576b965cc ("xen: allow up to 16383 cpus")
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> While doing this I noticed that this and all pre-existing uses of
> DEFINE_PER_CPU_READ_MOSTLY() aren't quite right: When the type is
> cpumask_var_t, the variable is read-mostly only when NR_CPUS <=
> 2 * BITS_PER_LONG. That'll want sorting separately, though.
>
> It is important for this to not be moved ahead of "x86/HPET: use single,
> global, low-priority vector for broadcast IRQ", as the original call here
> from set_channel_irq_affinity() may not use the new variable (it would
> need to use scratch_cpumask instead).
> ---
> v2: New.
>
> --- a/xen/arch/x86/hpet.c
> +++ b/xen/arch/x86/hpet.c
> @@ -196,7 +196,7 @@ static void evt_do_broadcast(cpumask_t *
>
> static void cf_check handle_hpet_broadcast(struct hpet_event_channel *ch)
> {
> - cpumask_t mask;
> + cpumask_t *scratch = this_cpu(hpet_scratch_cpumask);
> s_time_t now, next_event;
> unsigned int cpu;
> unsigned long flags;
> @@ -209,7 +209,7 @@ again:
> spin_unlock_irqrestore(&ch->lock, flags);
>
> next_event = STIME_MAX;
> - cpumask_clear(&mask);
> + cpumask_clear(scratch);
> now = NOW();
>
> /* find all expired events */
> @@ -218,13 +218,13 @@ again:
> s_time_t deadline = ACCESS_ONCE(per_cpu(timer_deadline, cpu));
>
> if ( deadline <= now )
> - __cpumask_set_cpu(cpu, &mask);
> + __cpumask_set_cpu(cpu, scratch);
> else if ( deadline < next_event )
> next_event = deadline;
> }
>
> /* wakeup the cpus which have an expired event. */
> - evt_do_broadcast(&mask);
> + evt_do_broadcast(scratch);
>
> if ( next_event != STIME_MAX )
> {
> --- a/xen/arch/x86/include/asm/smp.h
> +++ b/xen/arch/x86/include/asm/smp.h
> @@ -22,6 +22,7 @@
> DECLARE_PER_CPU(cpumask_var_t, cpu_sibling_mask);
> DECLARE_PER_CPU(cpumask_var_t, cpu_core_mask);
> DECLARE_PER_CPU(cpumask_var_t, scratch_cpumask);
> +DECLARE_PER_CPU(cpumask_var_t, hpet_scratch_cpumask);
Should this be declared in the hpet.h header?
> DECLARE_PER_CPU(cpumask_var_t, send_ipi_cpumask);
>
> /*
> --- a/xen/arch/x86/smpboot.c
> +++ b/xen/arch/x86/smpboot.c
> @@ -55,6 +55,9 @@ DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t
> DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, scratch_cpumask);
> static cpumask_t scratch_cpu0mask;
>
> +DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, hpet_scratch_cpumask);
> +static cpumask_t hpet_scratch_cpu0mask;
And then this would be defined in hpet.c.
I'm just wondering whether someone might want to introduce support for
disabling HPET at build time in the future.
It can always be moved later anyway.
Thanks, Roger.