From: Tyler Hicks <code@tyhicks.com>
To: Will Deacon <will@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>,
Jason Gunthorpe <jgg@ziepe.ca>,
Jerry Snitselaar <jsnitsel@redhat.com>,
linux-arm-kernel@lists.infradead.org, iommu@lists.linux.dev,
linux-kernel@vger.kernel.org, Dexuan Cui <decui@microsoft.com>,
Easwar Hariharan <eahariha@linux.microsoft.com>
Subject: Re: Why is the ARM SMMU v1/v2 put into bypass mode on kexec?
Date: Tue, 19 Mar 2024 14:14:26 -0500
Message-ID: <ZfnkEqglNPRzH3Zk@sequoia>
In-Reply-To: <20240319154756.GB2901@willie-the-truck>
On 2024-03-19 15:47:56, Will Deacon wrote:
> On Tue, Mar 19, 2024 at 12:57:52PM +0000, Robin Murphy wrote:
> > Beyond properly quiescing and resetting the system back to a boot-time
> > state, the outgoing kernel in a kexec can only really do things which affect
> > itself. Sure, we *could* configure the SMMU to block all traffic and disable
> > the interrupt to avoid getting stuck in a storm of faults on the way out,
> > but what does that mean for the incoming kexec payload? That it can have the
> > pleasure of discovering the SMMU, innocently enabling the interrupt and
> > getting stuck in an unexpected storm of faults. Or perhaps just resetting
> > the SMMU into a disabled state and thus still unwittingly allowing its
> > memory to be corrupted by the previous kernel not supporting kexec properly.
>
> Right, it's hard to win if DMA-active devices weren't quiesced properly
> by the outgoing kernel. Either the SMMU was left in abort (leading to the
> problems you list above) or the SMMU is left in bypass (leading to possible
> data corruption). Which is better?
My view is that a loud and obvious failure (via unidentified stream
fault messages and/or a possible interrupt storm preventing the new
kernel from booting) is preferable to silent and subtle data corruption
of the target kernel.
> The best solution is obviously to implement those missing ->shutdown()
> callbacks.
Completely agree here, but it can be difficult to even identify that a
missing ->shutdown hook is the root cause without code changes to put
the SMMU into abort mode and sleep for a bit in the SMMU's ->shutdown
hook.
Tyler