From: "Roger Pau Monné" <roger.pau@citrix.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: "xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
Andrew Cooper <andrew.cooper3@citrix.com>,
Julien Grall <julien@xen.org>,
Stefano Stabellini <sstabellini@kernel.org>,
Anthony PERARD <anthony.perard@vates.tech>,
Michal Orzel <michal.orzel@amd.com>
Subject: Re: [PATCH RFC DO-NOT-APPLY] x86/IRQ: don't pass offline CPU(s) to on_selected_cpus()
Date: Tue, 11 Feb 2025 19:37:56 +0100 [thread overview]
Message-ID: <Z6uZBATkX72AeDfU@macbook.local> (raw)
In-Reply-To: <0396a0fa-e318-493a-9858-30f63304cc99@suse.com>
On Tue, Feb 11, 2025 at 10:38:41AM +0100, Jan Beulich wrote:
> on_selected_cpus() asserts that it's only passed online CPUs. Between
> __cpu_disable() removing a CPU from the online map and fixup_eoi()
> cleaning up for the CPU being offlined there's a window, though, where
> the CPU in question can still appear in an action's cpu_eoi_map. Remove
> offline CPUs, as they're (supposed to be) taken care of by fixup_eoi().
>
> Move the entire tail of desc_guest_eoi() into a new helper, thus
> streamlining processinf also for other call sites when the sole
^ processing
> remaining CPU is the local one. Move and split the assertion, to be a
> functionally equivalent replacement for the earlier BUG_ON() in
> __pirq_guest_unbind(). Note that we can't use scratch_cpumask there, in
> particular in the context of a timer handler and with data held there
> needing to stay intact across process_pending_softirqs().
I would probably add the crash backtrace to the commit message.
>
> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> While this builds okay, for now I didn't even consider testing it: Both
> aspects below (plus the ugly locking arrangement) speak against dealing
> with the issue this way, imo. Cc-ing REST rather than just x86 for this
> reason.
Indeed, the locking is specially problematic here, both for being
ugly, and because I assume EOI is a hot path.
> RFC: Don't we need {get,put}_cpu_maps() around quite a few more uses of
> cpu_online_map (e.g. _clear_irq_vector() when called from
> destroy_irq(), or about every caller of smp_call_function())?
Indeed, that's my understanding also. Quite a lot of users of
cpu_online_mask likely requires wrapping around {get,put}_cpu_maps()
calls, otherwise they very likely incur in TOCTOU races.
> Roger
> suggests using stop-machine context for CPU {on,off}lining, but I
> seem to vaguely recall that there was a reason not to do so at
> least for the offlining part.
I would have to explore this more thoroughly, it does seems feasible
on first sight, but devil (and bugs) is in the details.
I don't think we want to go to the lengths of what you do below for
quite a lot of users of cpu_online_mask.
>
> RFC: I doubt process_pending_softirqs() is okay to use from a timer
> handler. (Callers of, in particular, {desc,pirq}_guest_eoi() would
> also need checking for process_pending_softirqs() to be okay to use
> in their contexts.)
Yeah, I would be worry of doing softirq processing from this context,
the more that (even if not the common case) I would be afraid of the
unbounded latency that process_pending_softirqs() could introduce.
>
> --- unstable.orig/xen/arch/x86/irq.c 2021-04-08 11:10:47.000000000 +0200
> +++ unstable/xen/arch/x86/irq.c 2025-02-11 09:54:38.532567287 +0100
> @@ -6,6 +6,7 @@
> */
>
> #include <xen/init.h>
> +#include <xen/cpu.h>
> #include <xen/delay.h>
> #include <xen/errno.h>
> #include <xen/event.h>
> @@ -1183,7 +1184,7 @@ static inline void clear_pirq_eoi(struct
> }
> }
>
> -static void cf_check set_eoi_ready(void *data);
> +static bool invoke_set_eoi_ready(struct irq_desc *desc, bool wait);
>
> static void cf_check irq_guest_eoi_timer_fn(void *data)
> {
> @@ -1224,18 +1225,13 @@ static void cf_check irq_guest_eoi_timer
>
> switch ( action->ack_type )
> {
> - cpumask_t *cpu_eoi_map;
> -
> case ACKTYPE_UNMASK:
> if ( desc->handler->end )
> desc->handler->end(desc, 0);
> break;
>
> case ACKTYPE_EOI:
> - cpu_eoi_map = this_cpu(scratch_cpumask);
> - cpumask_copy(cpu_eoi_map, action->cpu_eoi_map);
> - spin_unlock_irq(&desc->lock);
> - on_selected_cpus(cpu_eoi_map, set_eoi_ready, desc, 0);
> + invoke_set_eoi_ready(desc, false);
> return;
> }
>
> @@ -1468,6 +1464,49 @@ static void cf_check set_eoi_ready(void
> flush_ready_eoi();
> }
>
> +/*
> + * Locking is somewhat special here: In all cases the function is invoked with
> + * desc's lock held. When @wait is true, the function tries to avoid unlocking.
> + * It always returns whether the lock is still held.
> + */
> +static bool invoke_set_eoi_ready(struct irq_desc *desc, bool wait)
> +{
> + const irq_guest_action_t *action = guest_action(desc);
> + cpumask_t cpu_eoi_map;
> +
> + cpumask_copy(&cpu_eoi_map, action->cpu_eoi_map);
> +
> + if ( __cpumask_test_and_clear_cpu(smp_processor_id(), &cpu_eoi_map) )
> + {
> + ASSERT(action->ack_type == ACKTYPE_EOI);
> + __set_eoi_ready(desc);
> + spin_unlock(&desc->lock);
> + flush_ready_eoi();
> + local_irq_enable();
> + }
> + else if ( wait && cpumask_empty(&cpu_eoi_map) )
> + return true;
> + else
> + {
> + ASSERT(action->ack_type == ACKTYPE_EOI);
> + spin_unlock_irq(&desc->lock);
> + }
> +
> + if ( cpumask_empty(&cpu_eoi_map) )
> + return false;
> +
> + while ( !get_cpu_maps() )
> + process_pending_softirqs();
> +
> + cpumask_and(&cpu_eoi_map, &cpu_eoi_map, &cpu_online_map);
> + if ( !cpumask_empty(&cpu_eoi_map) )
> + on_selected_cpus(&cpu_eoi_map, set_eoi_ready, desc, wait);
> +
> + put_cpu_maps();
> +
> + return false;
> +}
> +
> void pirq_guest_eoi(struct pirq *pirq)
Possibly for pirq_guest_eoi() when called from the PHYSDEVOP_eoi
hypercall we should propagate whether the EOI has been performed, and
otherwise use an hypercall continuation to retry?
As said above however, this approach is so ugly, and we would require
similar adjustments in other places that also use cpu_online_maks that
I would only consider going this route as last resort.
Thanks, Roger.
prev parent reply other threads:[~2025-02-11 18:38 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-11 9:38 [PATCH RFC DO-NOT-APPLY] x86/IRQ: don't pass offline CPU(s) to on_selected_cpus() Jan Beulich
2025-02-11 18:37 ` Roger Pau Monné [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z6uZBATkX72AeDfU@macbook.local \
--to=roger.pau@citrix.com \
--cc=andrew.cooper3@citrix.com \
--cc=anthony.perard@vates.tech \
--cc=jbeulich@suse.com \
--cc=julien@xen.org \
--cc=michal.orzel@amd.com \
--cc=sstabellini@kernel.org \
--cc=xen-devel@lists.xenproject.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.