* [PATCH] iommu/vt-d: fix system hang on reboot -f
@ 2025-02-20 10:15 Yunhui Cui
2025-02-21 8:40 ` Ethan Zhao
2025-02-24 1:02 ` Baolu Lu
0 siblings, 2 replies; 8+ messages in thread
From: Yunhui Cui @ 2025-02-20 10:15 UTC (permalink / raw)
To: dwmw2, baolu.lu, joro, will, robin.murphy, iommu, linux-kernel; +Cc: Yunhui Cui
When entering intel_iommu_shutdown, system interrupts are disabled,
and the reboot process might be scheduled out by down_write(). If the
scheduled process does not yield (e.g., while(1)), the system will hang.
Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
---
drivers/iommu/intel/iommu.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index cc46098f875b..76a1d83b46bf 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -2871,7 +2871,8 @@ void intel_iommu_shutdown(void)
 	if (no_iommu || dmar_disabled)
 		return;
 
-	down_write(&dmar_global_lock);
+	if (!down_write_trylock(&dmar_global_lock))
+		return;
 
 	/* Disable PMRs explicitly here. */
 	for_each_iommu(iommu, drhd)
--
2.39.2
* Re: [PATCH] iommu/vt-d: fix system hang on reboot -f
2025-02-20 10:15 [PATCH] iommu/vt-d: fix system hang on reboot -f Yunhui Cui
@ 2025-02-21 8:40 ` Ethan Zhao
2025-02-21 9:46 ` [External] " yunhui cui
2025-02-24 1:02 ` Baolu Lu
1 sibling, 1 reply; 8+ messages in thread
From: Ethan Zhao @ 2025-02-21 8:40 UTC (permalink / raw)
To: Yunhui Cui, dwmw2, baolu.lu, joro, will, robin.murphy, iommu,
linux-kernel
On 2025/2/20 18:15, Yunhui Cui wrote:
> When entering intel_iommu_shutdown, system interrupts are disabled,
System interrupts are disabled? If you mean that all interrupts are
disabled when entering intel_iommu_shutdown(), that may not be true, at
least for the latest upstream code.
> and the reboot process might be scheduled out by down_write(). If the
> scheduled process does not yield (e.g., while(1)), the system will hang.
Doesn't the NMI lockup watchdog fire here?
Thanks,
Ethan
>
> Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
> ---
> drivers/iommu/intel/iommu.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index cc46098f875b..76a1d83b46bf 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -2871,7 +2871,8 @@ void intel_iommu_shutdown(void)
> if (no_iommu || dmar_disabled)
> return;
>
> - down_write(&dmar_global_lock);
> + if (!down_write_trylock(&dmar_global_lock))
> + return;
>
> /* Disable PMRs explicitly here. */
> for_each_iommu(iommu, drhd)
--
"firm, enduring, strong, and long-lived"
* Re: [External] Re: [PATCH] iommu/vt-d: fix system hang on reboot -f
2025-02-21 8:40 ` Ethan Zhao
@ 2025-02-21 9:46 ` yunhui cui
2025-02-24 2:53 ` Ethan Zhao
2025-02-24 3:21 ` Ethan Zhao
0 siblings, 2 replies; 8+ messages in thread
From: yunhui cui @ 2025-02-21 9:46 UTC (permalink / raw)
To: Ethan Zhao; +Cc: dwmw2, baolu.lu, joro, will, robin.murphy, iommu, linux-kernel
Hi Ethan,
On Fri, Feb 21, 2025 at 4:40 PM Ethan Zhao <haifeng.zhao@linux.intel.com> wrote:
>
>
> On 2025/2/20 18:15, Yunhui Cui wrote:
> > When entering intel_iommu_shutdown, system interrupts are disabled,
>
> System interrupts were disabled ? you mean all interrupts were disabled
> when entering intel_iommu_shutdown(), perhaps it is not true, at least
> for upstream latest code.
>
> > and the reboot process might be scheduled out by down_write(). If the
> > scheduled process does not yield (e.g., while(1)), the system will hang.
>
> No NMI lockup watchdog jumping out here ?
Steps to reproduce:
1. Avoid return in:
if (no_iommu || dmar_disabled)
return;
2. Write a.out with while(1).
3. ./a.out &; reboot -f.
4. Observe. Send NMI via BIOS to check system response.
5. Add console=ttyS0,115200 to cmdline to increase reproduction chance.
Let's continue discussing based on the above.
>
> Thanks,
> Ethan
>
> >
> > Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
> > ---
> > drivers/iommu/intel/iommu.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > index cc46098f875b..76a1d83b46bf 100644
> > --- a/drivers/iommu/intel/iommu.c
> > +++ b/drivers/iommu/intel/iommu.c
> > @@ -2871,7 +2871,8 @@ void intel_iommu_shutdown(void)
> > if (no_iommu || dmar_disabled)
> > return;
> >
> > - down_write(&dmar_global_lock);
> > + if (!down_write_trylock(&dmar_global_lock))
> > + return;
> >
> > /* Disable PMRs explicitly here. */
> > for_each_iommu(iommu, drhd)
>
> --
> "firm, enduring, strong, and long-lived"
>
Thanks,
Yunhui
* Re: [PATCH] iommu/vt-d: fix system hang on reboot -f
2025-02-20 10:15 [PATCH] iommu/vt-d: fix system hang on reboot -f Yunhui Cui
2025-02-21 8:40 ` Ethan Zhao
@ 2025-02-24 1:02 ` Baolu Lu
2025-02-24 3:42 ` [External] " yunhui cui
2025-02-24 5:37 ` Ethan Zhao
1 sibling, 2 replies; 8+ messages in thread
From: Baolu Lu @ 2025-02-24 1:02 UTC (permalink / raw)
To: Yunhui Cui, dwmw2, joro, will, robin.murphy, iommu, linux-kernel
On 2/20/25 18:15, Yunhui Cui wrote:
> When entering intel_iommu_shutdown, system interrupts are disabled,
> and the reboot process might be scheduled out by down_write(). If the
> scheduled process does not yield (e.g., while(1)), the system will hang.
>
> Signed-off-by: Yunhui Cui<cuiyunhui@bytedance.com>
> ---
> drivers/iommu/intel/iommu.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index cc46098f875b..76a1d83b46bf 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -2871,7 +2871,8 @@ void intel_iommu_shutdown(void)
> if (no_iommu || dmar_disabled)
> return;
>
> - down_write(&dmar_global_lock);
> + if (!down_write_trylock(&dmar_global_lock))
> + return;
If system interrupts are disabled here, locking is unnecessary. Hotplug
operations depend on interrupt events, so it's better to remove the
lock. The shutdown helper then appears like this:
void intel_iommu_shutdown(void)
{
	struct dmar_drhd_unit *drhd;
	struct intel_iommu *iommu = NULL;

	if (no_iommu || dmar_disabled)
		return;

	/*
	 * System interrupts are disabled when it reaches here. Locking
	 * is unnecessary when iterating the IOMMU list.
	 */
	list_for_each_entry(drhd, &dmar_drhd_units, list) {
		if (drhd->ignored)
			continue;

		iommu = drhd->iommu;
		/* Disable PMRs explicitly here. */
		iommu_disable_protect_mem_regions(iommu);
		iommu_disable_translation(iommu);
	}
}
Does it work for you?
Thanks,
baolu
* Re: [External] Re: [PATCH] iommu/vt-d: fix system hang on reboot -f
2025-02-21 9:46 ` [External] " yunhui cui
@ 2025-02-24 2:53 ` Ethan Zhao
2025-02-24 3:21 ` Ethan Zhao
1 sibling, 0 replies; 8+ messages in thread
From: Ethan Zhao @ 2025-02-24 2:53 UTC (permalink / raw)
To: yunhui cui; +Cc: dwmw2, baolu.lu, joro, will, robin.murphy, iommu, linux-kernel
On 2025/2/21 17:46, yunhui cui wrote:
> Hi Ethan,
>
> On Fri, Feb 21, 2025 at 4:40 PM Ethan Zhao <haifeng.zhao@linux.intel.com> wrote:
>>
>> On 2025/2/20 18:15, Yunhui Cui wrote:
>>> When entering intel_iommu_shutdown, system interrupts are disabled,
>> System interrupts were disabled ? you mean all interrupts were disabled
>> when entering intel_iommu_shutdown(), perhaps it is not true, at least
>> for upstream latest code.
>>
>>> and the reboot process might be scheduled out by down_write(). If the
>>> scheduled process does not yield (e.g., while(1)), the system will hang.
>> No NMI lockup watchdog jumping out here ?
> Steps to reproduce:
>
> 1. Avoid return in:
> if (no_iommu || dmar_disabled)
> return;
>
> 2. Write a.out with while(1).
>
> 3. ./a.out &; reboot -f.
>
> 4. Observe. Send NMI via BIOS to check system response.
>
> 5. Add console=ttyS0,115200 to cmdline to increase reproduction chance.
>
> Let's continue discussing based on the above.
I will try these steps to reproduce.
Per the latest upstream code, the local processor's interrupt mask is cleared, so
the processor can accept and handle interrupts. Legacy interrupts should be restored
for the later boot if there is a legacy device, and as for NMI, nothing can stop
it. In short, it may be a fact of your hardware configuration that no interrupt
event comes in to kick the scheduler once a.out (while(1)) gets scheduled in,
but that is not because all system interrupts are disabled.
Thanks,
Ethan
>> Thanks,
>> Ethan
>>
>>> Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
>>> ---
>>> drivers/iommu/intel/iommu.c | 3 ++-
>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>>> index cc46098f875b..76a1d83b46bf 100644
>>> --- a/drivers/iommu/intel/iommu.c
>>> +++ b/drivers/iommu/intel/iommu.c
>>> @@ -2871,7 +2871,8 @@ void intel_iommu_shutdown(void)
>>> if (no_iommu || dmar_disabled)
>>> return;
>>>
>>> - down_write(&dmar_global_lock);
>>> + if (!down_write_trylock(&dmar_global_lock))
>>> + return;
>>>
>>> /* Disable PMRs explicitly here. */
>>> for_each_iommu(iommu, drhd)
>> --
>> "firm, enduring, strong, and long-lived"
>>
> Thanks,
> Yunhui
>
--
"firm, enduring, strong, and long-lived"
* Re: [External] Re: [PATCH] iommu/vt-d: fix system hang on reboot -f
2025-02-21 9:46 ` [External] " yunhui cui
2025-02-24 2:53 ` Ethan Zhao
@ 2025-02-24 3:21 ` Ethan Zhao
1 sibling, 0 replies; 8+ messages in thread
From: Ethan Zhao @ 2025-02-24 3:21 UTC (permalink / raw)
To: yunhui cui; +Cc: dwmw2, baolu.lu, joro, will, robin.murphy, iommu, linux-kernel
On 2025/2/21 17:46, yunhui cui wrote:
> Hi Ethan,
>
> On Fri, Feb 21, 2025 at 4:40 PM Ethan Zhao <haifeng.zhao@linux.intel.com> wrote:
>>
>> On 2025/2/20 18:15, Yunhui Cui wrote:
>>> When entering intel_iommu_shutdown, system interrupts are disabled,
>> System interrupts were disabled ? you mean all interrupts were disabled
>> when entering intel_iommu_shutdown(), perhaps it is not true, at least
>> for upstream latest code.
>>
>>> and the reboot process might be scheduled out by down_write(). If the
>>> scheduled process does not yield (e.g., while(1)), the system will hang.
>> No NMI lockup watchdog jumping out here ?
> Steps to reproduce:
>
> 1. Avoid return in:
> if (no_iommu || dmar_disabled)
> return;
>
> 2. Write a.out with while(1).
>
> 3. ./a.out &; reboot -f.
>
> 4. Observe. Send NMI via BIOS to check system response.
Via the BMC? Some machines have a physical 'NMI' button that triggers an NMI
to the OS for diagnostic purposes; you could check your box for that. No luck
on my side, though: there is no NMI trigger in my GNR BMC.
Thanks,
Ethan
>
> 5. Add console=ttyS0,115200 to cmdline to increase reproduction chance.
>
> Let's continue discussing based on the above.
>
>> Thanks,
>> Ethan
>>
>>> Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
>>> ---
>>> drivers/iommu/intel/iommu.c | 3 ++-
>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>>> index cc46098f875b..76a1d83b46bf 100644
>>> --- a/drivers/iommu/intel/iommu.c
>>> +++ b/drivers/iommu/intel/iommu.c
>>> @@ -2871,7 +2871,8 @@ void intel_iommu_shutdown(void)
>>> if (no_iommu || dmar_disabled)
>>> return;
>>>
>>> - down_write(&dmar_global_lock);
>>> + if (!down_write_trylock(&dmar_global_lock))
>>> + return;
>>>
>>> /* Disable PMRs explicitly here. */
>>> for_each_iommu(iommu, drhd)
>> --
>> "firm, enduring, strong, and long-lived"
>>
> Thanks,
> Yunhui
--
"firm, enduring, strong, and long-lived"
* Re: [External] Re: [PATCH] iommu/vt-d: fix system hang on reboot -f
2025-02-24 1:02 ` Baolu Lu
@ 2025-02-24 3:42 ` yunhui cui
2025-02-24 5:37 ` Ethan Zhao
1 sibling, 0 replies; 8+ messages in thread
From: yunhui cui @ 2025-02-24 3:42 UTC (permalink / raw)
To: Baolu Lu; +Cc: dwmw2, joro, will, robin.murphy, iommu, linux-kernel
Hi Baolu,
On Mon, Feb 24, 2025 at 9:06 AM Baolu Lu <baolu.lu@linux.intel.com> wrote:
>
> On 2/20/25 18:15, Yunhui Cui wrote:
> > When entering intel_iommu_shutdown, system interrupts are disabled,
> > and the reboot process might be scheduled out by down_write(). If the
> > scheduled process does not yield (e.g., while(1)), the system will hang.
> >
> > Signed-off-by: Yunhui Cui<cuiyunhui@bytedance.com>
> > ---
> > drivers/iommu/intel/iommu.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > index cc46098f875b..76a1d83b46bf 100644
> > --- a/drivers/iommu/intel/iommu.c
> > +++ b/drivers/iommu/intel/iommu.c
> > @@ -2871,7 +2871,8 @@ void intel_iommu_shutdown(void)
> > if (no_iommu || dmar_disabled)
> > return;
> >
> > - down_write(&dmar_global_lock);
> > + if (!down_write_trylock(&dmar_global_lock))
> > + return;
>
> If system interrupts are disabled here, locking is unnecessary. Hotplug
> operations depend on interrupt events, so it's better to remove the
> lock. The shutdown helper then appears like this:
Currently, intel_iommu_shutdown() is only called by
native_machine_shutdown(). The down_write()/up_write() operations can
be removed. Even if there's a hardware access error, IOMMU_WAIT_OP()
will trigger a panic().
>
> void intel_iommu_shutdown(void)
> {
> 	struct dmar_drhd_unit *drhd;
> 	struct intel_iommu *iommu = NULL;
>
> 	if (no_iommu || dmar_disabled)
> 		return;
>
> 	/*
> 	 * System interrupts are disabled when it reaches here. Locking
> 	 * is unnecessary when iterating the IOMMU list.
> 	 */
> 	list_for_each_entry(drhd, &dmar_drhd_units, list) {
> 		if (drhd->ignored)
> 			continue;
>
> 		iommu = drhd->iommu;
> 		/* Disable PMRs explicitly here. */
> 		iommu_disable_protect_mem_regions(iommu);
> 		iommu_disable_translation(iommu);
> 	}
> }
>
> Does it work for you?
Yes.
>
> Thanks,
> baolu
Thanks,
Yunhui
* Re: [PATCH] iommu/vt-d: fix system hang on reboot -f
2025-02-24 1:02 ` Baolu Lu
2025-02-24 3:42 ` [External] " yunhui cui
@ 2025-02-24 5:37 ` Ethan Zhao
1 sibling, 0 replies; 8+ messages in thread
From: Ethan Zhao @ 2025-02-24 5:37 UTC (permalink / raw)
To: Baolu Lu, Yunhui Cui, dwmw2, joro, will, robin.murphy, iommu,
linux-kernel
On 2025/2/24 9:02, Baolu Lu wrote:
> On 2/20/25 18:15, Yunhui Cui wrote:
>> When entering intel_iommu_shutdown, system interrupts are disabled,
>> and the reboot process might be scheduled out by down_write(). If the
>> scheduled process does not yield (e.g., while(1)), the system will hang.
>>
>> Signed-off-by: Yunhui Cui<cuiyunhui@bytedance.com>
>> ---
>> drivers/iommu/intel/iommu.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>> index cc46098f875b..76a1d83b46bf 100644
>> --- a/drivers/iommu/intel/iommu.c
>> +++ b/drivers/iommu/intel/iommu.c
>> @@ -2871,7 +2871,8 @@ void intel_iommu_shutdown(void)
>> if (no_iommu || dmar_disabled)
>> return;
>> - down_write(&dmar_global_lock);
Only the BSP is running at this point, so no protection against concurrent DMAR
access is needed anymore. Even if an interrupt (only legacy and NMI) comes in, it
is impossible for the scheduler to run any other IOMMU access code.
Thanks,
Ethan
>> + if (!down_write_trylock(&dmar_global_lock))
>> + return;
>
> If system interrupts are disabled here, locking is unnecessary. Hotplug
> operations depend on interrupt events, so it's better to remove the
> lock. The shutdown helper then appears like this:
>
> void intel_iommu_shutdown(void)
> {
> 	struct dmar_drhd_unit *drhd;
> 	struct intel_iommu *iommu = NULL;
>
> 	if (no_iommu || dmar_disabled)
> 		return;
>
> 	/*
> 	 * System interrupts are disabled when it reaches here. Locking
> 	 * is unnecessary when iterating the IOMMU list.
> 	 */
> 	list_for_each_entry(drhd, &dmar_drhd_units, list) {
> 		if (drhd->ignored)
> 			continue;
>
> 		iommu = drhd->iommu;
> 		/* Disable PMRs explicitly here. */
> 		iommu_disable_protect_mem_regions(iommu);
> 		iommu_disable_translation(iommu);
> 	}
> }
>
> Does it work for you?
>
> Thanks,
> baolu
>
--
"firm, enduring, strong, and long-lived"
Thread overview: 8+ messages
2025-02-20 10:15 [PATCH] iommu/vt-d: fix system hang on reboot -f Yunhui Cui
2025-02-21 8:40 ` Ethan Zhao
2025-02-21 9:46 ` [External] " yunhui cui
2025-02-24 2:53 ` Ethan Zhao
2025-02-24 3:21 ` Ethan Zhao
2025-02-24 1:02 ` Baolu Lu
2025-02-24 3:42 ` [External] " yunhui cui
2025-02-24 5:37 ` Ethan Zhao