* Why is the ARM SMMU v1/v2 put into bypass mode on kexec?
@ 2024-03-14 7:49 Tyler Hicks
2024-03-14 19:06 ` Tyler Hicks
0 siblings, 1 reply; 12+ messages in thread
From: Tyler Hicks @ 2024-03-14 7:49 UTC (permalink / raw)
To: Will Deacon, Robin Murphy, Jason Gunthorpe, Jerry Snitselaar
Cc: linux-arm-kernel, iommu, linux-kernel, Dexuan Cui,
Easwar Hariharan
Given that drivers are only optionally asked to implement the .shutdown
hook, which is required to properly quiesce devices before a kexec, why
is it that we put the ARM SMMU v1/v2 into bypass mode in the arm-smmu
driver's own .shutdown hook?
arm_smmu_device_shutdown() -> set SMMU_sCR0.CLIENTPD bit to 1
Driver authors often forget to even implement a .shutdown hook, which
results in some hard-to-debug memory corruption issues in the kexec'ed
target kernel due to pending DMA operations happening on untranslated
addresses. Why not leave the SMMU in translate mode but clear the stream
mapping table (or maybe even call arm_smmu_device_reset()) in the SMMU's
.shutdown hook to prevent the memory corruption from happening in the
first place?
Fully acknowledging that the proper fix is to quiesce the devices, I
feel like resetting the SMMU and leaving it in translate mode across
kexec would be more consistent with the intent behind v5.2 commit
954a03be033c ("iommu/arm-smmu: Break insecure users by disabling bypass
by default"). The incoming transactions of devices, that weren't
properly quiesced during a kexec, would be blocked until their drivers
have a chance to reinitialize the devices in the new kernel.
I appreciate any help understanding why bypass mode is utilized here as
I'm sure there are nuances that I haven't considered. Thank you!
Tyler
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: Why is the ARM SMMU v1/v2 put into bypass mode on kexec? 2024-03-14 7:49 Why is the ARM SMMU v1/v2 put into bypass mode on kexec? Tyler Hicks @ 2024-03-14 19:06 ` Tyler Hicks 2024-03-19 12:57 ` Robin Murphy 0 siblings, 1 reply; 12+ messages in thread From: Tyler Hicks @ 2024-03-14 19:06 UTC (permalink / raw) To: Will Deacon, Robin Murphy, Jason Gunthorpe, Jerry Snitselaar Cc: linux-arm-kernel, iommu, linux-kernel, Dexuan Cui, Easwar Hariharan On 2024-03-14 09:55:46, Tyler Hicks wrote: > Given that drivers are only optionally asked to implement the .shutdown > hook, which is required to properly quiesce devices before a kexec, why > is it that we put the ARM SMMU v1/v2 into bypass mode in the arm-smmu > driver's own .shutdown hook? > > arm_smmu_device_shutdown() -> set SMMU_sCR0.CLIENTPD bit to 1 > > Driver authors often forget to even implement a .shutdown hook, which > results in some hard-to-debug memory corruption issues in the kexec'ed > target kernel due to pending DMA operations happening on untranslated > addresses. Why not leave the SMMU in translate mode but clear the stream > mapping table (or maybe even call arm_smmu_device_reset()) in the SMMU's > .shutdown hook to prevent the memory corruption from happening in the > first place? > > Fully acknowledging that the proper fix is to quiesce the devices, I > feel like resetting the SMMU and leaving it in translate mode across > kexec would be more consistent with the intent behind v5.2 commit > 954a03be033c ("iommu/arm-smmu: Break insecure users by disabling bypass > by default"). The incoming transactions of devices, that weren't > properly quiesced during a kexec, would be blocked until their drivers > have a chance to reinitialize the devices in the new kernel. > > I appreciate any help understanding why bypass mode is utilized here as > I'm sure there are nuances that I haven't considered. Thank you! I now see that Will has previously mentioned that he'd be open to such a change: One thing I would be in favour of is changing the ->shutdown() code to honour disable_bypass=1 so that we put the SMMU in an aborting state instead of passthrough. Would that help at all? It would at least avoid the memory corruption on missing shutdown callback. - https://lore.kernel.org/linux-arm-kernel/20200608113852.GA3108@willie-the-truck/ Robin mentions the need to support kexec into a non-SMMU-aware OS. I hadn't considered that bit of complexity: ... consider if the first kernel *did* leave it enabled with whatever left-over translations in place, and kexec'ed into another OS that wasn't SMMU-aware... - https://lore.kernel.org/linux-arm-kernel/e072f61a-d6cf-2e34-efd5-c1db38c5c622@arm.com/ Now that we're 3-4 years removed from that conversation, has anything changed? Will, is there anything we'd need to watch out for if we were to prototype this sort of change? For example, would it be wise to disable fault interrupts when putting the SMMU in an aborting state before kexec'ing? Thanks again! Tyler _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Why is the ARM SMMU v1/v2 put into bypass mode on kexec? 2024-03-14 19:06 ` Tyler Hicks @ 2024-03-19 12:57 ` Robin Murphy 2024-03-19 15:47 ` Will Deacon 0 siblings, 1 reply; 12+ messages in thread From: Robin Murphy @ 2024-03-19 12:57 UTC (permalink / raw) To: Tyler Hicks, Will Deacon, Jason Gunthorpe, Jerry Snitselaar Cc: linux-arm-kernel, iommu, linux-kernel, Dexuan Cui, Easwar Hariharan On 2024-03-14 7:06 pm, Tyler Hicks wrote: > On 2024-03-14 09:55:46, Tyler Hicks wrote: >> Given that drivers are only optionally asked to implement the .shutdown >> hook, which is required to properly quiesce devices before a kexec, why >> is it that we put the ARM SMMU v1/v2 into bypass mode in the arm-smmu >> driver's own .shutdown hook? >> >> arm_smmu_device_shutdown() -> set SMMU_sCR0.CLIENTPD bit to 1 >> >> Driver authors often forget to even implement a .shutdown hook, which >> results in some hard-to-debug memory corruption issues in the kexec'ed >> target kernel due to pending DMA operations happening on untranslated >> addresses. Why not leave the SMMU in translate mode but clear the stream >> mapping table (or maybe even call arm_smmu_device_reset()) in the SMMU's >> .shutdown hook to prevent the memory corruption from happening in the >> first place? >> >> Fully acknowledging that the proper fix is to quiesce the devices, I >> feel like resetting the SMMU and leaving it in translate mode across >> kexec would be more consistent with the intent behind v5.2 commit >> 954a03be033c ("iommu/arm-smmu: Break insecure users by disabling bypass >> by default"). The incoming transactions of devices, that weren't >> properly quiesced during a kexec, would be blocked until their drivers >> have a chance to reinitialize the devices in the new kernel. >> >> I appreciate any help understanding why bypass mode is utilized here as >> I'm sure there are nuances that I haven't considered. Thank you! > > I now see that Will has previously mentioned that he'd be open to such a > change: > > One thing I would be in favour of is changing the ->shutdown() code to > honour disable_bypass=1 so that we put the SMMU in an aborting state > instead of passthrough. Would that help at all? It would at least > avoid the memory corruption on missing shutdown callback. > > - https://lore.kernel.org/linux-arm-kernel/20200608113852.GA3108@willie-the-truck/ > > Robin mentions the need to support kexec into a non-SMMU-aware OS. I > hadn't considered that bit of complexity: > > ... consider if the first kernel *did* leave it enabled with whatever > left-over translations in place, and kexec'ed into another OS that > wasn't SMMU-aware... > > - https://lore.kernel.org/linux-arm-kernel/e072f61a-d6cf-2e34-efd5-c1db38c5c622@arm.com/ > > Now that we're 3-4 years removed from that conversation, has anything > changed? Will, is there anything we'd need to watch out for if we were > to prototype this sort of change? For example, would it be wise to > disable fault interrupts when putting the SMMU in an aborting state > before kexec'ing? Fundamentally, we expect the SMMU to be disabled at initial boot, so per the intent of kexec we put it back in that state. That also seems the most likely expectation of anything we could kexec into, given that it is the natural state of an untouched SMMU after a hard reset, and thus comes out as the least-worst option. Beyond properly quiescing and resetting the system back to a boot-time state, the outgoing kernel in a kexec can only really do things which affect itself. Sure, we *could* configure the SMMU to block all traffic and disable the interrupt to avoid getting stuck in a storm of faults on the way out, but what does that mean for the incoming kexec payload? That it can have the pleasure of discovering the SMMU, innocently enabling the interrupt and getting stuck in an unexpected storm of faults. Or perhaps just resetting the SMMU into a disabled state and thus still unwittingly allowing its memory to be corrupted by the previous kernel not supporting kexec properly. So no, I would not say that anything has changed here, at least not in favour of this idea. If anything, it's become even more impractical now that we have RMRs to properly support cases like an EFI framebuffer where neither the outgoing nor incoming kernels necessarily have the ability to quiesce the underlying DMA or recover it from faults, thus we have to be even more careful. Thanks, Robin. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Why is the ARM SMMU v1/v2 put into bypass mode on kexec? 2024-03-19 12:57 ` Robin Murphy @ 2024-03-19 15:47 ` Will Deacon 2024-03-19 17:50 ` Jason Gunthorpe ` (2 more replies) 0 siblings, 3 replies; 12+ messages in thread From: Will Deacon @ 2024-03-19 15:47 UTC (permalink / raw) To: Robin Murphy Cc: Tyler Hicks, Jason Gunthorpe, Jerry Snitselaar, linux-arm-kernel, iommu, linux-kernel, Dexuan Cui, Easwar Hariharan On Tue, Mar 19, 2024 at 12:57:52PM +0000, Robin Murphy wrote: > On 2024-03-14 7:06 pm, Tyler Hicks wrote: > > On 2024-03-14 09:55:46, Tyler Hicks wrote: > > > Given that drivers are only optionally asked to implement the .shutdown > > > hook, which is required to properly quiesce devices before a kexec, why > > > is it that we put the ARM SMMU v1/v2 into bypass mode in the arm-smmu > > > driver's own .shutdown hook? > > > > > > arm_smmu_device_shutdown() -> set SMMU_sCR0.CLIENTPD bit to 1 > > > > > > Driver authors often forget to even implement a .shutdown hook, which > > > results in some hard-to-debug memory corruption issues in the kexec'ed > > > target kernel due to pending DMA operations happening on untranslated > > > addresses. Why not leave the SMMU in translate mode but clear the stream > > > mapping table (or maybe even call arm_smmu_device_reset()) in the SMMU's > > > .shutdown hook to prevent the memory corruption from happening in the > > > first place? > > > > > > Fully acknowledging that the proper fix is to quiesce the devices, I > > > feel like resetting the SMMU and leaving it in translate mode across > > > kexec would be more consistent with the intent behind v5.2 commit > > > 954a03be033c ("iommu/arm-smmu: Break insecure users by disabling bypass > > > by default"). The incoming transactions of devices, that weren't > > > properly quiesced during a kexec, would be blocked until their drivers > > > have a chance to reinitialize the devices in the new kernel. > > > > > > I appreciate any help understanding why bypass mode is utilized here as > > > I'm sure there are nuances that I haven't considered. Thank you! > > > > I now see that Will has previously mentioned that he'd be open to such a > > change: > > > > One thing I would be in favour of is changing the ->shutdown() code to > > honour disable_bypass=1 so that we put the SMMU in an aborting state > > instead of passthrough. Would that help at all? It would at least > > avoid the memory corruption on missing shutdown callback. > > > > - https://lore.kernel.org/linux-arm-kernel/20200608113852.GA3108@willie-the-truck/ > > > > Robin mentions the need to support kexec into a non-SMMU-aware OS. I > > hadn't considered that bit of complexity: > > > > ... consider if the first kernel *did* leave it enabled with whatever > > left-over translations in place, and kexec'ed into another OS that > > wasn't SMMU-aware... > > > > - https://lore.kernel.org/linux-arm-kernel/e072f61a-d6cf-2e34-efd5-c1db38c5c622@arm.com/ > > > > Now that we're 3-4 years removed from that conversation, has anything > > changed? Will, is there anything we'd need to watch out for if we were > > to prototype this sort of change? For example, would it be wise to > > disable fault interrupts when putting the SMMU in an aborting state > > before kexec'ing? I've grown older and wiser in those four years and no longer think that's a good idea :) Well, older maybe, but the reality is that the code around the driver has evolved and 'disable_bypass' is even more of a hack now than it used to be. > Fundamentally, we expect the SMMU to be disabled at initial boot, so per the > intent of kexec we put it back in that state. That also seems the most > likely expectation of anything we could kexec into, given that it is the > natural state of an untouched SMMU after a hard reset, and thus comes out as > the least-worst option. Heh, that sounded too good to be true when I read it so I went and looked at the code: SMMUv3: arm_smmu_device_shutdown() -> clears CR0 but doesn't touch GBPA SMMUv2: arm_smmu_device_shutdown() -> writes CLIENTPD to CR0 So it's a bit of a muddle afaict; SMMUv2 explicitly goes into bypass whereas SMMUv3 probably does honour disable_bypass=false! Did I miss something? As discussed elsewhere, if we remove disable_bypass from SMMUv3, then we should be able to be consistent here. The question is, what's the right behaviour? > Beyond properly quiescing and resetting the system back to a boot-time > state, the outgoing kernel in a kexec can only really do things which affect > itself. Sure, we *could* configure the SMMU to block all traffic and disable > the interrupt to avoid getting stuck in a storm of faults on the way out, > but what does that mean for the incoming kexec payload? That it can have the > pleasure of discovering the SMMU, innocently enabling the interrupt and > getting stuck in an unexpected storm of faults. Or perhaps just resetting > the SMMU into a disabled state and thus still unwittingly allowing its > memory to be corrupted by the previous kernel not supporting kexec properly. Right, it's hard to win if DMA-active devices weren't quiesced properly by the outgoing kernel. Either the SMMU was left in abort (leading to the problems you list above) or the SMMU is left in bypass (leading to possible data corruption). Which is better? The best solution is obviously to implement those missing ->shutdown() callbacks. Will _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Why is the ARM SMMU v1/v2 put into bypass mode on kexec? 2024-03-19 15:47 ` Will Deacon @ 2024-03-19 17:50 ` Jason Gunthorpe 2024-03-22 15:55 ` Will Deacon 2024-03-19 18:17 ` Robin Murphy 2024-03-19 19:14 ` Tyler Hicks 2 siblings, 1 reply; 12+ messages in thread From: Jason Gunthorpe @ 2024-03-19 17:50 UTC (permalink / raw) To: Will Deacon Cc: Robin Murphy, Tyler Hicks, Jerry Snitselaar, linux-arm-kernel, iommu, linux-kernel, Dexuan Cui, Easwar Hariharan On Tue, Mar 19, 2024 at 03:47:56PM +0000, Will Deacon wrote: > Right, it's hard to win if DMA-active devices weren't quiesced properly > by the outgoing kernel. Either the SMMU was left in abort (leading to the > problems you list above) or the SMMU is left in bypass (leading to possible > data corruption). Which is better? For whatever reason (and I really don't like this design) alot of work was done on x86 so that device continues to work as-was right up until the crash kernel does the first DMA operation. Including having the crash kernel non disruptively inherit and retain the IOMMU configuration. (eg see translation_pre_enabled() stuff in intel driver) I think the idea was that the crash kernel driver will recover control of the device prior to trying to do DMA. Devices without a driver or devices that are not operated by the crash kernel just keep going as they were. In general practice this is unworkable as some devices can't be recovered without doing DMA in the first place creating a catch-22. So now lots of devices use their shutdown handler to quiet the device before handing over to the crash kernel. I think this emerged as some 'small work' to try and make crash kernels functional at all. Implementing every shutdown handler would be pretty hard, but many (?) devices seem to work OK if the crash kernel drivers runs for a bit before destroying their DMA setup. We don't trigger weird platform crashes or anything due to failing DMA operations either. Now we have all kinds of infrastructure and deployed crash kernels that have this assumption baked in. :( It sure would be nice to not spread this full complexity to ARM. If the original kernel could signal to the crash kernel that specific devices are quieted and then the crash kernel could simply ignore unquieted devices and set the IOMMU to abort them and don't allow any crash drivers to attach. (or maybe FLR them?) If someone wants a device to be usuable in the crash kernel then the original kernel needs to implement the shutdown handler. Regardless, I think if your goal is to support crash kernels then you have to do at least a bit of the x86 'keep the iommu unchanged'. The iommu shutdown should do less like x86 does and the iommu startup should detect the special case and try to atomic switch to the new STE table that aborts unquieted devices. Booting a non-crash OS is a different matter and in that case you really want every bit of HW put back to a clean "just booted" state, and arguably it can't work unless the original kernel implements all the shutdown handlers... I don't know if x86 kexec actually support this, it looks like it only works on Linux OS and things like the Linux iommu driver have code to support the crash focused hand over even in non crash cases. Jason _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Why is the ARM SMMU v1/v2 put into bypass mode on kexec? 2024-03-19 17:50 ` Jason Gunthorpe @ 2024-03-22 15:55 ` Will Deacon 2024-03-22 19:52 ` Tyler Hicks 0 siblings, 1 reply; 12+ messages in thread From: Will Deacon @ 2024-03-22 15:55 UTC (permalink / raw) To: Jason Gunthorpe Cc: Robin Murphy, Tyler Hicks, Jerry Snitselaar, linux-arm-kernel, iommu, linux-kernel, Dexuan Cui, Easwar Hariharan Hey Jason, On Tue, Mar 19, 2024 at 02:50:07PM -0300, Jason Gunthorpe wrote: > On Tue, Mar 19, 2024 at 03:47:56PM +0000, Will Deacon wrote: > > > Right, it's hard to win if DMA-active devices weren't quiesced properly > > by the outgoing kernel. Either the SMMU was left in abort (leading to the > > problems you list above) or the SMMU is left in bypass (leading to possible > > data corruption). Which is better? > > For whatever reason (and I really don't like this design) alot of work > was done on x86 so that device continues to work as-was right up until > the crash kernel does the first DMA operation. Including having the > crash kernel non disruptively inherit and retain the IOMMU > configuration. (eg see translation_pre_enabled() stuff in intel > driver) Right, I'm also not thrilled about trying to implement that :) What we have at the moment seems to be good enough to avoid folks complaining about it. For the case Tyler is reporting, though, I _think_ it's just a standard kexec() rather than a crashkernel. Will _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Why is the ARM SMMU v1/v2 put into bypass mode on kexec? 2024-03-22 15:55 ` Will Deacon @ 2024-03-22 19:52 ` Tyler Hicks 0 siblings, 0 replies; 12+ messages in thread From: Tyler Hicks @ 2024-03-22 19:52 UTC (permalink / raw) To: Will Deacon Cc: Jason Gunthorpe, Robin Murphy, Jerry Snitselaar, linux-arm-kernel, iommu, linux-kernel, Dexuan Cui, Easwar Hariharan On 2024-03-22 15:55:29, Will Deacon wrote: > Hey Jason, > > On Tue, Mar 19, 2024 at 02:50:07PM -0300, Jason Gunthorpe wrote: > > On Tue, Mar 19, 2024 at 03:47:56PM +0000, Will Deacon wrote: > > > > > Right, it's hard to win if DMA-active devices weren't quiesced properly > > > by the outgoing kernel. Either the SMMU was left in abort (leading to the > > > problems you list above) or the SMMU is left in bypass (leading to possible > > > data corruption). Which is better? > > > > For whatever reason (and I really don't like this design) alot of work > > was done on x86 so that device continues to work as-was right up until > > the crash kernel does the first DMA operation. Including having the > > crash kernel non disruptively inherit and retain the IOMMU > > configuration. (eg see translation_pre_enabled() stuff in intel > > driver) > > Right, I'm also not thrilled about trying to implement that :) > What we have at the moment seems to be good enough to avoid folks > complaining about it. > > For the case Tyler is reporting, though, I _think_ it's just a standard > kexec() rather than a crashkernel. That's correct. Tyler _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Why is the ARM SMMU v1/v2 put into bypass mode on kexec? 2024-03-19 15:47 ` Will Deacon 2024-03-19 17:50 ` Jason Gunthorpe @ 2024-03-19 18:17 ` Robin Murphy 2024-03-22 15:51 ` Will Deacon 2024-03-19 19:14 ` Tyler Hicks 2 siblings, 1 reply; 12+ messages in thread From: Robin Murphy @ 2024-03-19 18:17 UTC (permalink / raw) To: Will Deacon Cc: Tyler Hicks, Jason Gunthorpe, Jerry Snitselaar, linux-arm-kernel, iommu, linux-kernel, Dexuan Cui, Easwar Hariharan On 2024-03-19 3:47 pm, Will Deacon wrote: > On Tue, Mar 19, 2024 at 12:57:52PM +0000, Robin Murphy wrote: >> On 2024-03-14 7:06 pm, Tyler Hicks wrote: >>> On 2024-03-14 09:55:46, Tyler Hicks wrote: >>>> Given that drivers are only optionally asked to implement the .shutdown >>>> hook, which is required to properly quiesce devices before a kexec, why >>>> is it that we put the ARM SMMU v1/v2 into bypass mode in the arm-smmu >>>> driver's own .shutdown hook? >>>> >>>> arm_smmu_device_shutdown() -> set SMMU_sCR0.CLIENTPD bit to 1 >>>> >>>> Driver authors often forget to even implement a .shutdown hook, which >>>> results in some hard-to-debug memory corruption issues in the kexec'ed >>>> target kernel due to pending DMA operations happening on untranslated >>>> addresses. Why not leave the SMMU in translate mode but clear the stream >>>> mapping table (or maybe even call arm_smmu_device_reset()) in the SMMU's >>>> .shutdown hook to prevent the memory corruption from happening in the >>>> first place? >>>> >>>> Fully acknowledging that the proper fix is to quiesce the devices, I >>>> feel like resetting the SMMU and leaving it in translate mode across >>>> kexec would be more consistent with the intent behind v5.2 commit >>>> 954a03be033c ("iommu/arm-smmu: Break insecure users by disabling bypass >>>> by default"). The incoming transactions of devices, that weren't >>>> properly quiesced during a kexec, would be blocked until their drivers >>>> have a chance to reinitialize the devices in the new kernel. >>>> >>>> I appreciate any help understanding why bypass mode is utilized here as >>>> I'm sure there are nuances that I haven't considered. Thank you! >>> >>> I now see that Will has previously mentioned that he'd be open to such a >>> change: >>> >>> One thing I would be in favour of is changing the ->shutdown() code to >>> honour disable_bypass=1 so that we put the SMMU in an aborting state >>> instead of passthrough. Would that help at all? It would at least >>> avoid the memory corruption on missing shutdown callback. >>> >>> - https://lore.kernel.org/linux-arm-kernel/20200608113852.GA3108@willie-the-truck/ >>> >>> Robin mentions the need to support kexec into a non-SMMU-aware OS. I >>> hadn't considered that bit of complexity: >>> >>> ... consider if the first kernel *did* leave it enabled with whatever >>> left-over translations in place, and kexec'ed into another OS that >>> wasn't SMMU-aware... >>> >>> - https://lore.kernel.org/linux-arm-kernel/e072f61a-d6cf-2e34-efd5-c1db38c5c622@arm.com/ >>> >>> Now that we're 3-4 years removed from that conversation, has anything >>> changed? Will, is there anything we'd need to watch out for if we were >>> to prototype this sort of change? For example, would it be wise to >>> disable fault interrupts when putting the SMMU in an aborting state >>> before kexec'ing? > > I've grown older and wiser in those four years and no longer think that's > a good idea :) Well, older maybe, but the reality is that the code around > the driver has evolved and 'disable_bypass' is even more of a hack now > than it used to be. > >> Fundamentally, we expect the SMMU to be disabled at initial boot, so per the >> intent of kexec we put it back in that state. That also seems the most >> likely expectation of anything we could kexec into, given that it is the >> natural state of an untouched SMMU after a hard reset, and thus comes out as >> the least-worst option. > > Heh, that sounded too good to be true when I read it so I went and looked at > the code: > > SMMUv3: arm_smmu_device_shutdown() -> clears CR0 but doesn't touch GBPA > SMMUv2: arm_smmu_device_shutdown() -> writes CLIENTPD to CR0 > > So it's a bit of a muddle afaict; SMMUv2 explicitly goes into bypass > whereas SMMUv3 probably does honour disable_bypass=false! Did I miss > something? I think so, namely the utter madness around how and when we do actually touch GBPA - if we found SMMUEN set at the start of probe, then we set GBPA to abort before initially clearing SMMUEN; if the DT is broken then we set GBPA to bypass instead of enabling SMMUEN at the end of probe, *unless* disable_bypass was set. Thus by the time we get to shutdown, SMMUEN may or may not already be 0 and GPBA may or may not have been changed from its initial value to either one of bypass or abort. > As discussed elsewhere, if we remove disable_bypass from SMMUv3, then > we should be able to be consistent here. The question is, what's the > right behaviour? "Not that", at the very least ;) In terms of the shutdown behaviour, I think it actually works out as-is. For the normal case we haven't touched GBPA, so we are truly returning to the boot-time condition; in the unexpected case where SMMUEN was already enabled then we'll go into an explicit GPBA abort state, but that seems a not-unreasonable compromise for not preserving the entire boot-time Stream Table etc., whose presence kind of implies it wouldn't have been bypassing everything anyway. The more I look at the remaining aspect of disable_bypass for controlling broken-DT behaviour the more I suspect it can't actually be useful either way, especially not since default domains. I have no memory of what my original reasoning might have been, so I'm inclined to just rip that all out and let probe fail. I see no reason these days not to expect a broken DT to leads to a broken system, especially not now with DTSchema validation. Then there's just the kdump warning it suppresses, of which I also have no idea why it's there either, but apparently that one's on you :P Cheers, Robin. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Why is the ARM SMMU v1/v2 put into bypass mode on kexec? 2024-03-19 18:17 ` Robin Murphy @ 2024-03-22 15:51 ` Will Deacon 2024-04-02 16:32 ` Robin Murphy 0 siblings, 1 reply; 12+ messages in thread From: Will Deacon @ 2024-03-22 15:51 UTC (permalink / raw) To: Robin Murphy Cc: Tyler Hicks, Jason Gunthorpe, Jerry Snitselaar, linux-arm-kernel, iommu, linux-kernel, Dexuan Cui, Easwar Hariharan On Tue, Mar 19, 2024 at 06:17:39PM +0000, Robin Murphy wrote: > In terms of the shutdown behaviour, I think it actually works out as-is. For > the normal case we haven't touched GBPA, so we are truly returning to the > boot-time condition; in the unexpected case where SMMUEN was already enabled > then we'll go into an explicit GPBA abort state, but that seems a > not-unreasonable compromise for not preserving the entire boot-time Stream > Table etc., whose presence kind of implies it wouldn't have been bypassing > everything anyway. > > The more I look at the remaining aspect of disable_bypass for controlling > broken-DT behaviour the more I suspect it can't actually be useful either > way, especially not since default domains. I have no memory of what my > original reasoning might have been, so I'm inclined to just rip that all out > and let probe fail. I see no reason these days not to expect a broken DT to > leads to a broken system, especially not now with DTSchema validation. That sounds reasonable to me, although we may end up having to back it out if we regress systems with borked firmware :( > Then there's just the kdump warning it suppresses, of which I also have no > idea why it's there either, but apparently that one's on you :P I think _that_ one is because the previous (crashed) kernel won't have torn anything down, so there could be active DMA using translations in the SMMU. In that case, the crashkernel (which is running from some carveout) may find the SMMU enabled, but it really can't stick it into bypass mode because that's likely to corrupt random memory. So in that case, we do stick it into abort before we reinitialise it and then we disabling fault reporting altogether to avoid the log spam: if (is_kdump_kernel()) enables &= ~(CR0_EVTQEN | CR0_PRIQEN) Will _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Why is the ARM SMMU v1/v2 put into bypass mode on kexec? 2024-03-22 15:51 ` Will Deacon @ 2024-04-02 16:32 ` Robin Murphy 0 siblings, 0 replies; 12+ messages in thread From: Robin Murphy @ 2024-04-02 16:32 UTC (permalink / raw) To: Will Deacon Cc: Tyler Hicks, Jason Gunthorpe, Jerry Snitselaar, linux-arm-kernel, iommu, linux-kernel, Dexuan Cui, Easwar Hariharan On 2024-03-22 3:51 pm, Will Deacon wrote: > On Tue, Mar 19, 2024 at 06:17:39PM +0000, Robin Murphy wrote: >> In terms of the shutdown behaviour, I think it actually works out as-is. For >> the normal case we haven't touched GBPA, so we are truly returning to the >> boot-time condition; in the unexpected case where SMMUEN was already enabled >> then we'll go into an explicit GPBA abort state, but that seems a >> not-unreasonable compromise for not preserving the entire boot-time Stream >> Table etc., whose presence kind of implies it wouldn't have been bypassing >> everything anyway. >> >> The more I look at the remaining aspect of disable_bypass for controlling >> broken-DT behaviour the more I suspect it can't actually be useful either >> way, especially not since default domains. I have no memory of what my >> original reasoning might have been, so I'm inclined to just rip that all out >> and let probe fail. I see no reason these days not to expect a broken DT to >> leads to a broken system, especially not now with DTSchema validation. > > That sounds reasonable to me, although we may end up having to back it > out if we regress systems with borked firmware :( > >> Then there's just the kdump warning it suppresses, of which I also have no >> idea why it's there either, but apparently that one's on you :P > > I think _that_ one is because the previous (crashed) kernel won't have > torn anything down, so there could be active DMA using translations in > the SMMU. In that case, the crashkernel (which is running from some > carveout) may find the SMMU enabled, but it really can't stick it into > bypass mode because that's likely to corrupt random memory. So in that > case, we do stick it into abort before we reinitialise it and then we > disabling fault reporting altogether to avoid the log spam: > > if (is_kdump_kernel()) > enables &= ~(CR0_EVTQEN | CR0_PRIQEN) Oh, I know why we do what we do for the kdump situation in general - it was merely the matter of why we chose to demand that the user explicitly tells us to do what we know is the right thing (and scream at them if they don't), rather than to just go ahead and do the right thing anyway. (the significance of disable_bypass for kdump is after we turn the SMMU back on from GBPA Abort state - we don't want any ongoing traffic being able to inadvertently bypass via an STE config either) Cheers, Robin. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Why is the ARM SMMU v1/v2 put into bypass mode on kexec? 2024-03-19 15:47 ` Will Deacon 2024-03-19 17:50 ` Jason Gunthorpe 2024-03-19 18:17 ` Robin Murphy @ 2024-03-19 19:14 ` Tyler Hicks 2024-03-22 16:06 ` Will Deacon 2 siblings, 1 reply; 12+ messages in thread From: Tyler Hicks @ 2024-03-19 19:14 UTC (permalink / raw) To: Will Deacon Cc: Robin Murphy, Jason Gunthorpe, Jerry Snitselaar, linux-arm-kernel, iommu, linux-kernel, Dexuan Cui, Easwar Hariharan On 2024-03-19 15:47:56, Will Deacon wrote: > On Tue, Mar 19, 2024 at 12:57:52PM +0000, Robin Murphy wrote: > > Beyond properly quiescing and resetting the system back to a boot-time > > state, the outgoing kernel in a kexec can only really do things which affect > > itself. Sure, we *could* configure the SMMU to block all traffic and disable > > the interrupt to avoid getting stuck in a storm of faults on the way out, > > but what does that mean for the incoming kexec payload? That it can have the > > pleasure of discovering the SMMU, innocently enabling the interrupt and > > getting stuck in an unexpected storm of faults. Or perhaps just resetting > > the SMMU into a disabled state and thus still unwittingly allowing its > > memory to be corrupted by the previous kernel not supporting kexec properly. > > Right, it's hard to win if DMA-active devices weren't quiesced properly > by the outgoing kernel. Either the SMMU was left in abort (leading to the > problems you list above) or the SMMU is left in bypass (leading to possible > data corruption). Which is better? My thoughts are that a loud and obvious failure (via unidentified stream fault messages and/or a possible interrupt storm preventing the new kernel from booting) is favorable to silent and subtle data corruption of the target kernel. > The best solution is obviously to implement those missing ->shutdown() > callbacks. Completely agree here but it can be difficult to even identify that a missing ->shutdown hook is the root cause without code changes to put the SMMU into abort mode and sleep for a bit in the SMMU's ->shutdown hook. Tyler _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: Why is the ARM SMMU v1/v2 put into bypass mode on kexec? 2024-03-19 19:14 ` Tyler Hicks @ 2024-03-22 16:06 ` Will Deacon 0 siblings, 0 replies; 12+ messages in thread From: Will Deacon @ 2024-03-22 16:06 UTC (permalink / raw) To: Tyler Hicks Cc: Robin Murphy, Jason Gunthorpe, Jerry Snitselaar, linux-arm-kernel, iommu, linux-kernel, Dexuan Cui, Easwar Hariharan On Tue, Mar 19, 2024 at 02:14:26PM -0500, Tyler Hicks wrote: > On 2024-03-19 15:47:56, Will Deacon wrote: > > On Tue, Mar 19, 2024 at 12:57:52PM +0000, Robin Murphy wrote: > > > Beyond properly quiescing and resetting the system back to a boot-time > > > state, the outgoing kernel in a kexec can only really do things which affect > > > itself. Sure, we *could* configure the SMMU to block all traffic and disable > > > the interrupt to avoid getting stuck in a storm of faults on the way out, > > > but what does that mean for the incoming kexec payload? That it can have the > > > pleasure of discovering the SMMU, innocently enabling the interrupt and > > > getting stuck in an unexpected storm of faults. Or perhaps just resetting > > > the SMMU into a disabled state and thus still unwittingly allowing its > > > memory to be corrupted by the previous kernel not supporting kexec properly. > > > > Right, it's hard to win if DMA-active devices weren't quiesced properly > > by the outgoing kernel. Either the SMMU was left in abort (leading to the > > problems you list above) or the SMMU is left in bypass (leading to possible > > data corruption). Which is better? > > My thoughts are that a loud and obvious failure (via unidentified stream > fault messages and/or a possible interrupt storm preventing the new > kernel from booting) is favorable to silent and subtle data corruption > of the target kernel. Looking at the SMMUv3 spec, the architecture does actually allow hardware to reset into an aborting state: [GBPA.ABORT] | Note: An implementation can reset this field to 1, in order to | implement a default deny policy on reset. so perhaps it's not that unreasonable. I just dread the flood of emails I'll get because the SMMU driver is noisy due to missing ->shutdown() callbacks elsewhere :/ > > The best solution is obviously to implement those missing ->shutdown() > > callbacks. > > Completely agree here but it can be difficult to even identify that a > missing ->shutdown hook is the root cause without code changes to put > the SMMU into abort mode and sleep for a bit in the SMMU's ->shutdown > hook. Perhaps that's the thing to tackle first, then? If we make it easier for folks to diagnose and fix the missing ->shutdown() callbacks, then going into abort is much more reasonable, Will _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2024-04-02 16:33 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-03-14 7:49 Why is the ARM SMMU v1/v2 put into bypass mode on kexec? Tyler Hicks 2024-03-14 19:06 ` Tyler Hicks 2024-03-19 12:57 ` Robin Murphy 2024-03-19 15:47 ` Will Deacon 2024-03-19 17:50 ` Jason Gunthorpe 2024-03-22 15:55 ` Will Deacon 2024-03-22 19:52 ` Tyler Hicks 2024-03-19 18:17 ` Robin Murphy 2024-03-22 15:51 ` Will Deacon 2024-04-02 16:32 ` Robin Murphy 2024-03-19 19:14 ` Tyler Hicks 2024-03-22 16:06 ` Will Deacon
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).