Re: [PATCH v3] iommu/vt-d: fix intel iommu iotlb sync hardlockup and retry


From: "guanghuifeng@linux.alibaba.com" <guanghuifeng@linux.alibaba.com>
To: baolu.lu@linux.intel.com, dwmw2@infradead.org, joro@8bytes.org,
	will@kernel.org, robin.murphy@arm.com, kevin.tian@intel.com,
	skhawaja@google.com
Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] iommu/vt-d: fix intel iommu iotlb sync hardlockup and retry
Date: Mon, 9 Mar 2026 17:05:26 +0800
Message-ID: <f6af2887-a524-4bb0-8ea8-acdca50e66fe@linux.alibaba.com>
In-Reply-To: <20260306101516.3885775-1-guanghuifeng@linux.alibaba.com>

I have some concerns:

1. While executing invalidation requests, the IOMMU first fetches
   requests from the invalidation queue into its internal cache.

2. If an ITE timeout occurs while a request fetched into the cache in
   step 1 is executing, the IOMMU driver clears the ITE status, which
   allows the IOMMU to resume processing requests from the
   invalidation queue.

3. For requests that were already fetched in step 1 and then hit an
   ITE timeout: after the IOMMU driver clears ITE, will the IOMMU
   discard these timed-out, cached requests, or will it execute them
   again?

Currently, the IOMMU driver first clears ITE to resume IOMMU execution
and only then sets desc_status to QI_ABORT.

If the IOMMU does re-execute requests from its cache, the IOMMU driver
needs to be modified. It should first set desc_status to QI_ABORT and
then execute writel(DMA_FSTS_ITE, iommu->reg + DMAR_FSTS_REG) to resume
IOMMU execution. (In this case, some requests will be resubmitted and
executed twice.)

Otherwise, the IOMMU may write the QI_DONE result back to desc_status
after execution while the IOMMU driver is simultaneously setting
desc_status to QI_ABORT, leading to concurrent modification of
desc_status and timing issues.
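
To make the ordering question concrete, here is a minimal userspace
sketch of the "abort first, then clear ITE" sequence. It is a toy model,
not driver code: struct qi_model, hw_try_complete(), abort_then_resume()
and demo_abort_first() are hypothetical names, FSTS_ITE is a local
stand-in for DMA_FSTS_ITE, and the rule that the hardware writes QI_DONE
only while ITE is clear is an assumption of the model. Whether real
hardware behaves that way is exactly the open question above.

```c
#include <assert.h>
#include <stdint.h>

#define QI_LENGTH 256            /* ring size assumed for the model */
#define FSTS_ITE  (1u << 6)      /* model bit, stands in for DMA_FSTS_ITE */

enum qi_status { QI_FREE, QI_IN_USE, QI_DONE, QI_ABORT };

struct qi_model {
    enum qi_status desc_status[QI_LENGTH];
    uint32_t fsts;               /* stands in for DMAR_FSTS_REG */
};

/* Model assumption: hardware writes QI_DONE only while ITE is clear. */
static int hw_try_complete(struct qi_model *m, int idx)
{
    assert(idx >= 0 && idx < QI_LENGTH);
    if (m->fsts & FSTS_ITE)
        return 0;                /* queue stalled, nothing is written */
    m->desc_status[idx] = QI_DONE;
    return 1;
}

/* Proposed order: mark aborts while stalled, then clear ITE. */
static void abort_then_resume(struct qi_model *m, int head, int tail)
{
    do {
        if (m->desc_status[head] == QI_IN_USE)
            m->desc_status[head] = QI_ABORT;
        head = (head - 1 + QI_LENGTH) % QI_LENGTH;
    } while (head != tail);

    m->fsts &= ~FSTS_ITE;        /* models writel(DMA_FSTS_ITE, ...) */
}

/* Returns the wait descriptor's status after the abort-first sequence. */
static enum qi_status demo_abort_first(void)
{
    struct qi_model m = { .fsts = FSTS_ITE };   /* start stalled on ITE */

    m.desc_status[4] = m.desc_status[5] = QI_IN_USE;   /* requests */
    m.desc_status[6] = QI_IN_USE;                      /* wait_desc */

    /* While ITE is set, the modelled hardware cannot write QI_DONE. */
    assert(hw_try_complete(&m, 6) == 0);

    abort_then_resume(&m, 6, 9);   /* head = 6, tail = 9 (arbitrary) */
    return m.desc_status[6];
}
```

In this model every QI_ABORT store happens while the queue is still
stalled, so no QI_DONE write can interleave with it. Once ITE is
cleared the modelled hardware may complete a cached request again,
which corresponds to the "executed twice" case mentioned above.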


Thanks.


On 2026/3/6 18:15, Guanghui Feng wrote:
> During the qi_check_fault process after an IOMMU ITE event,
> requests at odd-numbered positions in the queue are set to
> QI_ABORT, only satisfying single-request submissions. However,
> qi_submit_sync now supports multiple simultaneous submissions,
> and can't guarantee that the wait_desc will be at an odd-numbered
> position. Therefore, if an item times out, IOMMU can't re-initiate
> the request, resulting in an infinite polling wait.
>
> This patch modifies the process by setting the status of all requests
> already fetched by IOMMU and recorded as QI_IN_USE status (including
> wait_desc requests) to QI_ABORT, thus enabling multiple requests to
> be resubmitted.
>
> Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
> Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com>
> ---
>   drivers/iommu/intel/dmar.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
> index d68c06025cac..69222dbd2af0 100644
> --- a/drivers/iommu/intel/dmar.c
> +++ b/drivers/iommu/intel/dmar.c
> @@ -1314,7 +1314,6 @@ static int qi_check_fault(struct intel_iommu *iommu, int index, int wait_index)
>   	if (fault & DMA_FSTS_ITE) {
>   		head = readl(iommu->reg + DMAR_IQH_REG);
>   		head = ((head >> shift) - 1 + QI_LENGTH) % QI_LENGTH;
> -		head |= 1;
>   		tail = readl(iommu->reg + DMAR_IQT_REG);
>   		tail = ((tail >> shift) - 1 + QI_LENGTH) % QI_LENGTH;
>   
> @@ -1331,7 +1330,7 @@ static int qi_check_fault(struct intel_iommu *iommu, int index, int wait_index)
>   		do {
>   			if (qi->desc_status[head] == QI_IN_USE)
>   				qi->desc_status[head] = QI_ABORT;
> -			head = (head - 2 + QI_LENGTH) % QI_LENGTH;
> +			head = (head - 1 + QI_LENGTH) % QI_LENGTH;
>   		} while (head != tail);
>   
>   		/*
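
The difference between the old and the patched walk can be sketched in
userspace as follows. This is only an illustration under assumptions:
mark_aborted(), fill_batch() and wait_desc_aborted() are hypothetical
helpers, the slot numbers are made up, and the step bound in the loop is
an addition of the sketch (the driver uses a do/while loop) so that the
model always terminates. The batch places two requests in slots 4 and 5
and their wait_desc in even slot 6.

```c
#include <assert.h>
#include <string.h>

#define QI_LENGTH 256   /* ring size assumed for the model */

enum qi_status { QI_FREE, QI_IN_USE, QI_DONE, QI_ABORT };

/*
 * Walk backwards from head to tail, turning QI_IN_USE into QI_ABORT.
 * stride = 2 with force_odd = 1 models the old walk; stride = 1 with
 * force_odd = 0 models the patched walk. Returns the number of slots
 * marked.
 */
static int mark_aborted(enum qi_status *st, int head, int tail,
                        int stride, int force_odd)
{
    int marked = 0;

    assert(head >= 0 && head < QI_LENGTH);
    assert(tail >= 0 && tail < QI_LENGTH);
    if (force_odd)
        head |= 1;

    for (int steps = 0; steps < QI_LENGTH; steps++) {
        if (st[head] == QI_IN_USE) {
            st[head] = QI_ABORT;
            marked++;
        }
        head = (head - stride + QI_LENGTH) % QI_LENGTH;
        if (head == tail)
            break;
    }
    return marked;
}

/* Hypothetical batch: slots 4 and 5 are requests, slot 6 is wait_desc. */
static void fill_batch(enum qi_status *st)
{
    memset(st, 0, QI_LENGTH * sizeof(*st));   /* QI_FREE == 0 */
    st[4] = st[5] = st[6] = QI_IN_USE;
}

/* Returns 1 if the wait descriptor in slot 6 ends up as QI_ABORT. */
static int wait_desc_aborted(int stride, int force_odd)
{
    enum qi_status st[QI_LENGTH];

    fill_batch(st);
    /* head = 6; tail = 9 is an arbitrary odd value for the model. */
    mark_aborted(st, 6, 9, stride, force_odd);
    return st[6] == QI_ABORT;
}
```

With these assumptions, wait_desc_aborted(2, 1) models the old walk and
leaves slot 6 in QI_IN_USE, while wait_desc_aborted(1, 0) models the
patched walk and marks it QI_ABORT.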

Thread overview: 7+ messages
2026-03-06 10:15 [PATCH v3] iommu/vt-d: fix intel iommu iotlb sync hardlockup and retry Guanghui Feng
2026-03-06 19:42 ` Samiullah Khawaja
2026-03-09  9:05 ` guanghuifeng [this message]
2026-03-10  6:02   ` Baolu Lu
2026-03-10  6:46     ` Baolu Lu
2026-03-10 12:59 ` Shuai Xue
2026-03-11  2:18   ` Baolu Lu
