From: Baolu Lu <baolu.lu@linux.intel.com>
To: Ethan Zhao <haifeng.zhao@linux.intel.com>,
kevin.tian@intel.com, bhelgaas@google.com, dwmw2@infradead.org,
will@kernel.org, robin.murphy@arm.com, lukas@wunner.de
Cc: baolu.lu@linux.intel.com, linux-pci@vger.kernel.org,
iommu@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH v10 0/5] fix vt-d hard lockup when hotplug ATS capable device
Date: Wed, 17 Jan 2024 11:24:10 +0800 [thread overview]
Message-ID: <3ee904e9-8a93-4bd9-8df7-6294885589e4@linux.intel.com> (raw)
In-Reply-To: <1a2a4069-c737-4a3c-a2f6-cce06823331b@linux.intel.com>
On 2024/1/15 15:58, Ethan Zhao wrote:
> -static int qi_check_fault(struct intel_iommu *iommu, int index, int
> wait_index)
> +static int qi_check_fault(struct intel_iommu *iommu, int index, int
> wait_index,
> + pci_dev *target_pdev)
> {
> u32 fault;
> int head, tail;
> + u64 iqe_err, ice_sid;
> struct q_inval *qi = iommu->qi;
> int shift = qi_shift(iommu);
>
> if (qi->desc_status[wait_index] == QI_ABORT)
> return -EAGAIN;
>
> + /*
> + * If the ATS invalidation target device is gone this moment
> (surprise
> + * removed, died, no response) don't try this request again. this
> + * request will not get valid result anymore. but the request was
> + * already submitted to hardware and we predict to get a ITE in
> + * followed batch of request, if so, it will get handled then.
> + */
We can't leave the ITE triggered by this request for the next one, which
has no context about why this happened. Perhaps move below code down to
the segment that handles ITEs.
Another concern is about qi_dump_fault(), which pr_err's the fault
message as long as the register is set. Some faults are predictable,
such as cache invalidation for surprise-removed devices. Unconditionally
reporting errors with pr_err() may lead the user to believe that a more
serious hardware error has occurred. Probably we can refine this part of
the code as well.
Others look sane to me.
> + if (target_pdev && !pci_device_is_present(target_pdev))
> + return -EINVAL;
> +
> fault = readl(iommu->reg + DMAR_FSTS_REG);
> if (fault & (DMA_FSTS_IQE | DMA_FSTS_ITE | DMA_FSTS_ICE))
> qi_dump_fault(iommu, fault);
> @@ -1315,6 +1327,13 @@ static int qi_check_fault(struct intel_iommu
> *iommu, int index, int wait_index)
> tail = readl(iommu->reg + DMAR_IQT_REG);
> tail = ((tail >> shift) - 1 + QI_LENGTH) % QI_LENGTH;
>
> + /*
> + * SID field is valid only when the ITE field is Set in
> FSTS_REG
> + * see Intel VT-d spec r4.1, section 11.4.9.9
> + */
> + iqe_err = dmar_readq(iommu->reg + DMAR_IQER_REG);
> + ice_sid = DMAR_IQER_REG_ITESID(iqe_err);
> +
> writel(DMA_FSTS_ITE, iommu->reg + DMAR_FSTS_REG);
> pr_info("Invalidation Time-out Error (ITE) cleared\n");
>
> @@ -1324,6 +1343,16 @@ static int qi_check_fault(struct intel_iommu
> *iommu, int index, int wait_index)
> head = (head - 2 + QI_LENGTH) % QI_LENGTH;
> } while (head != tail);
>
> + /*
> + * If got ITE, we need to check if the sid of ITE is the
> same as
> + * current ATS invalidation target device, if yes, don't
> try this
> + * request anymore, the target device has a response
> time beyound
> + * expected. 0 value of ice_sid means old device, no
> ice_sid value.
> + */
> + if (target_pdev && ice_sid && ice_sid ==
> + pci_dev_id(pci_physfn(target_pdev))
> + return -ETIMEDOUT;
> +
> if (qi->desc_status[wait_index] == QI_ABORT)
> return -EAGAIN;
> }
Best regards,
baolu
next prev parent reply other threads:[~2024-01-17 3:24 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-12-28 17:02 [RFC PATCH v10 0/5] fix vt-d hard lockup when hotplug ATS capable device Ethan Zhao
2023-12-28 17:02 ` [RFC PATCH v10 1/5] iommu/vt-d: add pci_dev parameter to qi_submit_sync and refactor callers Ethan Zhao
2024-01-10 4:59 ` Baolu Lu
2024-01-10 7:51 ` Ethan Zhao
2023-12-28 17:02 ` [RFC PATCH v10 2/5] iommu/vt-d: break out ATS Invalidation if target device is gone Ethan Zhao
2024-01-10 5:17 ` Baolu Lu
2024-01-10 8:29 ` Ethan Zhao
2023-12-29 8:18 ` [RFC PATCH v10 0/5] fix vt-d hard lockup when hotplug ATS capable device Tian, Kevin
2023-12-29 9:28 ` Ethan Zhao
2024-01-09 1:24 ` Ethan Zhao
2024-01-15 7:58 ` Ethan Zhao
2024-01-17 3:24 ` Baolu Lu [this message]
2024-01-17 5:26 ` Ethan Zhao
2024-01-17 5:38 ` Ethan Zhao
2024-01-17 9:00 ` Ethan Zhao
2024-01-18 0:46 ` Baolu Lu
2024-01-18 2:26 ` Ethan Zhao
2024-01-18 2:32 ` Ethan Zhao
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3ee904e9-8a93-4bd9-8df7-6294885589e4@linux.intel.com \
--to=baolu.lu@linux.intel.com \
--cc=bhelgaas@google.com \
--cc=dwmw2@infradead.org \
--cc=haifeng.zhao@linux.intel.com \
--cc=iommu@lists.linux.dev \
--cc=kevin.tian@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=lukas@wunner.de \
--cc=robin.murphy@arm.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox