From: Catalin Marinas <catalin.marinas@arm.com>
To: Kiryl Shutsemau <kirill@shutemov.name>
Cc: Will Deacon <will@kernel.org>, James Morse <james.morse@arm.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Marc Zyngier <maz@kernel.org>,
	Doug Anderson <dianders@chromium.org>,
	Petr Mladek <pmladek@suse.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Baoquan He <bhe@redhat.com>, Puranjay Mohan <puranjay@kernel.org>,
	Usama Arif <usama.arif@linux.dev>,
	Breno Leitao <leitao@debian.org>,
	Julien Thierry <julien.thierry.kdev@gmail.com>,
	Lecopzer Chen <lecopzer@gmail.com>,
	Sumit Garg <sumit.garg@kernel.org>,
	kernel-team@meta.com, kexec@lists.infradead.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 0/4] arm64: cross-CPU NMI via SDEI
Date: Mon, 29 Jun 2026 16:54:18 +0100
Message-ID: <akKVKhOsv6EJVFv4@arm.com>
In-Reply-To: <akJsQK2iG2oZ3vYE@thinkstation>

On Mon, Jun 29, 2026 at 02:05:14PM +0100, Kiryl Shutsemau wrote:
> On Fri, Jun 26, 2026 at 08:40:57PM +0100, Kiryl Shutsemau wrote:
> > But I have not tried calling CPU_OFF directly, without completing the
> > event. I assumed it is required. Will give it a try when I have time.
> 
> Tried it now, and it doesn't work either -- in a more interesting way.
> 
> Calling PSCI CPU_OFF directly from the SDEI handler (event left
> uncompleted) reproducibly breaks the kdump capture kernel, and this
> reproduces under QEMU's TF-A, not just on Grace -- so it isn't a Grace
> firmware quirk.

I had a quick grep (with the help of Claude) through the TF-A code and
it doesn't seem to be compliant with the spec. It should subscribe to
the PSCI CPU_OFF event and complete the SDEI event, but it doesn't. It
seems to handle CPU_ON but that may not be sufficient. It only EOIs the
SGI once the OS has completed the event, which doesn't happen if you
issue CPU_OFF.

> The test: a CPU wedged with interrupts masked is stopped via the SDEI
> rung; its handler calls __cpu_try_die() instead of parking. A/B in QEMU,
> changing only that wedged CPU's handling (everything else identical):
> 
>   - park it (current series):  capture kernel boots fully to a shell.
>   - CPU_OFF from the handler:  capture kernel hangs in early boot, around
>                                SDEI re-init, never reaches a shell.
> 
> Powering the PE off while its SDEI event is still active leaves EL3's
> dispatch state dangling, and the capture kernel trips over it. Completing
> the event first and then CPU_OFF -- what I tried originally -- silently
> wedges EL3 on Grace instead.
> 
> So both routes off fail, and the CPU stays parked. The dump is complete
> either way; only re-onlining the stopped CPU in an SMP capture kernel is
> lost. It's a cheap QEMU repro now if anyone wants to dig into the EL3
> side.

Have you tried SDEI_EVENT_COMPLETE_AND_RESUME instead? Just COMPLETE
won't return to the kernel. We have sdei_handler_abort() to complete the
event and, hopefully, you can continue with the CPU_OFF. It's a
workaround for the TF-A non-compliance but I think this is useful even if
you don't issue the CPU_OFF (e.g. no CPU hotplug, just the park loop).
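
To make the ordering concrete, here is a toy user-space model of the two
sequences. It is only a sketch and not kernel or TF-A code: every name in
it (pe_model, model_dispatch(), model_complete_and_resume(),
model_cpu_off()) is hypothetical, and it assumes firmware that, as
described above, does not complete an active event on CPU_OFF by itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one PE; all names are hypothetical, not kernel/TF-A APIs. */
enum pe_power { PE_ON, PE_OFF };

struct pe_model {
	enum pe_power power;
	bool event_active;	/* firmware still tracks an uncompleted SDEI event */
};

/* Firmware dispatches an SDEI event to a powered-on PE. */
static void model_dispatch(struct pe_model *pe)
{
	assert(pe->power == PE_ON);
	pe->event_active = true;
}

/*
 * Stand-in for SDEI_EVENT_COMPLETE_AND_RESUME: the event is completed
 * and the PE carries on running kernel code afterwards.
 */
static void model_complete_and_resume(struct pe_model *pe)
{
	pe->event_active = false;
}

/*
 * Stand-in for PSCI CPU_OFF on firmware that does NOT complete an
 * active event for us (assumption). Returns true if the PE went off
 * without leaving an event dangling.
 */
static bool model_cpu_off(struct pe_model *pe)
{
	pe->power = PE_OFF;
	return !pe->event_active;
}

/* CPU_OFF straight from the handler, event left uncompleted. */
static bool stop_direct_cpu_off(void)
{
	struct pe_model pe = { PE_ON, false };

	model_dispatch(&pe);
	return model_cpu_off(&pe);
}

/* Complete-and-resume first, then CPU_OFF from normal kernel context. */
static bool stop_complete_resume_then_off(void)
{
	struct pe_model pe = { PE_ON, false };

	model_dispatch(&pe);
	model_complete_and_resume(&pe);
	return model_cpu_off(&pe);
}
```

In this model only the second sequence leaves the PE off with no event
outstanding. Whether real firmware behaves that way is exactly what
needs testing.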

-- 
Catalin


