public inbox for linux-kernel@vger.kernel.org
From: Ethan Zhao <haifeng.zhao@linux.intel.com>
To: Baolu Lu <baolu.lu@linux.intel.com>,
	Yunhui Cui <cuiyunhui@bytedance.com>,
	dwmw2@infradead.org, joro@8bytes.org, will@kernel.org,
	robin.murphy@arm.com, iommu@lists.linux.dev,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] iommu/vt-d: fix system hang on reboot -f
Date: Tue, 25 Feb 2025 16:54:54 +0800
Message-ID: <c059fb19-9e03-426c-a06a-41f46a07b30a@linux.intel.com>
In-Reply-To: <0691a295-0883-47b3-84a6-47d9a94af69a@linux.intel.com>


On 2025/2/25 15:01, Baolu Lu wrote:
> On 2025/2/25 14:48, Yunhui Cui wrote:
>> We found that executing the command ./a.out &;reboot -f (where a.out is a
>> program that only executes a while(1) infinite loop) can probabilistically
>> cause the system to hang in the intel_iommu_shutdown() function, rendering
>> it unresponsive. Through analysis, we identified that the factors
>> contributing to this issue are as follows:
>>
>> 1. The reboot -f command does not prompt the kernel to notify the
>> application layer to perform cleanup actions, allowing the application to
>> continue running.
>>
>> 2. When the kernel reaches the intel_iommu_shutdown() function, only the
>> BSP (Bootstrap Processor) CPU is operational in the system.
>>
>> 3. During the execution of intel_iommu_shutdown(), the call to
>> down_write(&dmar_global_lock) causes the process to sleep and be
>> scheduled out.
>>
>> 4. At this point, the processor's interrupt flag is not cleared, so
>> interrupts can still be accepted. However, only legacy device interrupts
>> and NMIs (Non-Maskable Interrupts) can arrive, as routing for other
>> interrupts has already been disabled. If no legacy or NMI interrupts occur
>> at this stage, the scheduler will not be able to run.
>>
>> 5. If the application that gets scheduled at this time is executing a
>> while(1)-type loop, it cannot be preempted, leading to an infinite loop
>> and causing the system to become unresponsive.
>>
>> To resolve this issue, the intel_iommu_shutdown() function should not
>> execute down_write(), which can potentially cause the process to be
>> scheduled out. Furthermore, since only the BSP is running during the later
>> stages of the reboot, there is no need for protection against parallel
>> access to the DMAR (DMA Remapping) unit. Therefore, the following lines
>> could be removed:
>
> Good summary! Thank you!
>
>>
>> down_write(&dmar_global_lock);
>> up_write(&dmar_global_lock);
>>
>> After testing, the issue has been resolved.
>>
>> Fixes: 6c3a44ed3c55 ("iommu/vt-d: Turn off translations at shutdown")
>> Co-developed-by: Ethan Zhao <haifeng.zhao@linux.intel.com>
>> Signed-off-by: Ethan Zhao <haifeng.zhao@linux.intel.com>
>> Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
>> ---
>>   drivers/iommu/intel/iommu.c | 4 ----
>>   1 file changed, 4 deletions(-)
>>
>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>> index cc46098f875b..6d9f2e56ce88 100644
>> --- a/drivers/iommu/intel/iommu.c
>> +++ b/drivers/iommu/intel/iommu.c
>> @@ -2871,16 +2871,12 @@ void intel_iommu_shutdown(void)
>>       if (no_iommu || dmar_disabled)
>>           return;
>> -    down_write(&dmar_global_lock);
>> -
>>       /* Disable PMRs explicitly here. */
>>       for_each_iommu(iommu, drhd)
>
> Removing the locking for for_each_iommu() will trigger a suspicious RCU
> usage splat. You need to replace this helper with a raw
> list_for_each_entry() with some comments around it to explain why it is
> safe.
>
Oops, the RCU checking hides behind the for_each_iommu() macro.

How about

void intel_iommu_shutdown(void)
{
     struct dmar_drhd_unit *drhd;
     struct intel_iommu *iommu = NULL;

     if (no_iommu || dmar_disabled)
         return;

     /* Only the BSP is running here, so no RCU or concurrent-access lock checking is needed */
     list_for_each_entry(drhd, &dmar_drhd_units, list) {
         iommu = drhd->iommu;

         /* Disable PMRs explicitly here. */
         iommu_disable_protect_mem_regions(iommu);

         iommu_disable_translation(iommu);
     }
}
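
For illustration only, here is a rough user-space sketch of the traversal
proposed above. It is not kernel code: struct model_iommu, struct model_drhd
and the model_* helpers are hypothetical stand-ins for struct intel_iommu,
struct dmar_drhd_unit, iommu_disable_protect_mem_regions() and
iommu_disable_translation(), and a plain singly linked list stands in for
the kernel list_head. It only shows the shape of the loop: every unit is
visited once, with no lock taken, on the assumption that nothing else can
touch the list at this point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct intel_iommu. */
struct model_iommu {
	bool pmr_enabled;         /* protected memory regions still on */
	bool translation_enabled; /* DMA remapping still on */
};

/* Hypothetical stand-in for struct dmar_drhd_unit. */
struct model_drhd {
	struct model_iommu *iommu;
	struct model_drhd *next;  /* simple replacement for list_head */
};

/* Stand-ins for the two per-IOMMU calls in the proposal. */
static void model_disable_pmr(struct model_iommu *iommu)
{
	iommu->pmr_enabled = false;
}

static void model_disable_translation(struct model_iommu *iommu)
{
	iommu->translation_enabled = false;
}

/*
 * Walk every unit without taking a lock, as the proposal does.
 * Returns the number of units visited.
 */
static int model_shutdown(struct model_drhd *head)
{
	int visited = 0;

	for (struct model_drhd *drhd = head; drhd != NULL; drhd = drhd->next) {
		struct model_iommu *iommu = drhd->iommu;

		/* Assumption of this model: every unit has an IOMMU attached. */
		assert(iommu != NULL);

		model_disable_pmr(iommu);
		model_disable_translation(iommu);
		visited++;
	}

	return visited;
}
```

Calling model_shutdown() on a two-entry list should leave both entries with
PMRs and translation switched off. Note that the sketch, like the proposal,
does both steps in a single pass over the list.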


Thanks,

Ethan

>>           iommu_disable_protect_mem_regions(iommu);
>>
>>       /* Make sure the IOMMUs are switched off */
>>       intel_disable_iommus();
>> -
>> -    up_write(&dmar_global_lock);
>>   }
>>
>>   static struct intel_iommu *dev_to_intel_iommu(struct device *dev)
>
> Thanks,
> baolu

-- 
"firm, enduring, strong, and long-lived"



Thread overview: 14+ messages
2025-02-25  6:48 [PATCH v2] iommu/vt-d: fix system hang on reboot -f Yunhui Cui
2025-02-25  7:01 ` Baolu Lu
2025-02-25  8:54   ` Ethan Zhao [this message]
2025-02-25 14:26     ` Jason Gunthorpe
2025-02-26  0:35       ` Ethan Zhao
2025-02-26  3:50       ` Ethan Zhao
2025-02-26  5:18         ` Baolu Lu
2025-02-26  5:55           ` Ethan Zhao
2025-02-26 13:04             ` Jason Gunthorpe
2025-02-27  0:40               ` Ethan Zhao
2025-02-27 20:38                 ` Jason Gunthorpe
2025-02-28  0:51                   ` Ethan Zhao
2025-02-28  2:18                     ` [External] " yunhui cui
2025-02-28  4:34                       ` Ethan Zhao
