From: Ethan Zhao <haifeng.zhao@linux.intel.com>
To: Lukas Wunner <lukas@wunner.de>
Cc: bhelgaas@google.com, baolu.lu@linux.intel.com,
dwmw2@infradead.org, will@kernel.org, robin.murphy@arm.com,
linux-pci@vger.kernel.org, iommu@lists.linux.dev,
linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH v6 4/4] iommu/vt-d: break out devTLB invalidation if target device is gone
Date: Mon, 25 Dec 2023 09:16:00 +0800
Message-ID: <09b432c6-7716-4db5-a33d-23b8407955f1@linux.intel.com>
In-Reply-To: <20231224104709.GB31197@wunner.de>
On 12/24/2023 6:47 PM, Lukas Wunner wrote:
> On Sun, Dec 24, 2023 at 12:06:57AM -0500, Ethan Zhao wrote:
>> --- a/drivers/iommu/intel/dmar.c
>> +++ b/drivers/iommu/intel/dmar.c
>> @@ -1423,6 +1423,13 @@ int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc,
>>  	writel(qi->free_head << shift, iommu->reg + DMAR_IQT_REG);
>>
>>  	while (qi->desc_status[wait_index] != QI_DONE) {
>> +		/*
>> +		 * if the devTLB invalidation target device is gone, don't wait
>> +		 * anymore, it might take up to 1min+50%, causes system hang.
>> +		 */
>> +		if (type == QI_DIOTLB_TYPE && iommu->flush_target_dev)
>> +			if (!pci_device_is_present(to_pci_dev(iommu->flush_target_dev)))
>> +				break;
> As a general approach, this is much better now.
>
> Please combine the nested if-clauses into one.
Wouldn't that be harder to read?
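For comparison, here is a minimal userspace sketch of the nested and the combined form side by side. The mock_* names and the should_break_* helpers are hypothetical stand-ins made up for this illustration, not kernel code; mock_device_is_present() only plays the role of pci_device_is_present().

```c
#include <stdbool.h>
#include <stddef.h>

/* Placeholder value for this sketch; the real constant lives in the kernel headers. */
#define QI_DIOTLB_TYPE 0x3

/* Hypothetical stand-ins for the kernel structures. */
struct mock_dev {
	bool present;
};

struct mock_iommu {
	struct mock_dev *flush_target_dev;
};

/* Plays the role of pci_device_is_present() in this sketch. */
static bool mock_device_is_present(const struct mock_dev *dev)
{
	return dev->present;
}

/* Nested form, mirroring the structure of the patch hunk. */
static bool should_break_nested(int type, const struct mock_iommu *iommu)
{
	if (type == QI_DIOTLB_TYPE && iommu->flush_target_dev)
		if (!mock_device_is_present(iommu->flush_target_dev))
			return true;
	return false;
}

/* Combined form: one condition, short-circuit evaluation keeps the NULL check first. */
static bool should_break_combined(int type, const struct mock_iommu *iommu)
{
	return type == QI_DIOTLB_TYPE && iommu->flush_target_dev &&
	       !mock_device_is_present(iommu->flush_target_dev);
}
```

Both helpers return the same result for every input, so the choice between them is about readability only.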
> Please amend the code comment with a spec reference, i.e.
> "(see Implementation Note in PCIe r6.1 sec 10.3.1)"
> so that readers of the code know where the magic number "1min+50%"
> is coming from.
Yup.
>
> Is flush_target_dev guaranteed to always be a pci_dev?
Yes. As Baolu said, only ATS-capable PCI devices support the devTLB
invalidation operation, and this is checked in the caller path.
>
> I'll let iommu maintainers comment on whether storing a flush_target_dev
> pointer is the right approach. (May store a back pointer from
> struct intel_iommu to struct device_domain_info?)
Either one could work, and I wonder which is better. But device_domain_info
is still per device, so it doesn't seem like a good place for the back pointer.
>
> Maybe move the "to_pci_dev(iommu->flush_target_dev)" lookup outside the
> loop to avoid doing this over and over again?
Hmm. to_pci_dev() is just a macro wrapping container_of(), so it
shouldn't matter, right?
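To make the point concrete, here is a small userspace sketch of what a container_of()-style lookup does and what hoisting it out of a loop looks like. The struct layouts are hypothetical miniatures, the container_of() below is a simplified re-creation without the kernel's type checking, and the two sum_devfn_* helpers exist only for this illustration.

```c
#include <stddef.h>

/* Simplified container_of(): step back from a member to its enclosing struct. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical miniature versions of the kernel structures. */
struct device {
	int id;
};

struct pci_dev {
	unsigned int devfn;
	struct device dev;	/* embedded, as in the kernel's struct pci_dev */
};

#define to_pci_dev(n) container_of(n, struct pci_dev, dev)

/* Lookup repeated on every iteration. */
static unsigned int sum_devfn_in_loop(struct device *d, int iterations)
{
	unsigned int sum = 0;

	for (int i = 0; i < iterations; i++)
		sum += to_pci_dev(d)->devfn;
	return sum;
}

/* Lookup hoisted: the pointer adjustment is written once, before the loop. */
static unsigned int sum_devfn_hoisted(struct device *d, int iterations)
{
	struct pci_dev *pdev = to_pci_dev(d);
	unsigned int sum = 0;

	for (int i = 0; i < iterations; i++)
		sum += pdev->devfn;
	return sum;
}
```

The lookup itself is a subtraction by a compile-time constant, so hoisting it is mostly a matter of how the source reads rather than of measurable cost.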
>
> I think we still have a problem here if the device is not removed
> but simply takes a long time to respond to Invalidate Requests
> (as it is permitted to do per the Implementation Note). We'll
> busy-wait for the completion and potentially run into the watchdog's
> time limit again. So I think you or someone else in your org should
> add OKRs to refactor the code so that it sleeps in-between polling
Refactoring the code would be a long story; for now this is still a quick
fix for the issue. I also think the developers have other justifications or
concerns about a non-sync version. Once again, thanks for your comments.
regards,
Ethan
> for Invalidate Completions (instead of busy-waiting with interrupts
> disabled).
>
> Thanks,
>
> Lukas
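The idea raised in the quoted review, sleeping between polls and giving up after a bounded time instead of busy-waiting, can be sketched in userspace roughly as follows. Everything here is an assumption made for illustration: the wait_ops callbacks, wait_for_completion_sleeping(), the mock_* helpers and INVAL_BUDGET_US are hypothetical names, and the 90 s figure is simply "1min+50%" read as 60 s * 1.5. The current loop runs with interrupts disabled, where sleeping is not possible, so this is not a drop-in change.

```c
#include <stdbool.h>
#include <stddef.h>

enum wait_result { WAIT_DONE, WAIT_DEVICE_GONE, WAIT_TIMEOUT };

/* Hypothetical hooks: status check, presence check, and a sleep primitive. */
struct wait_ops {
	bool (*done)(void *ctx);
	bool (*present)(void *ctx);
	void (*sleep_us)(void *ctx, long us);
};

/* "1min+50%" read as 60 s * 1.5 = 90 s, expressed in microseconds. */
#define INVAL_BUDGET_US (90L * 1000 * 1000)

/* Poll for completion, sleeping between polls, bounded by budget_us. */
static enum wait_result wait_for_completion_sleeping(const struct wait_ops *ops,
						     void *ctx, long interval_us,
						     long budget_us)
{
	long waited = 0;

	for (;;) {
		if (ops->done(ctx))
			return WAIT_DONE;
		if (!ops->present(ctx))
			return WAIT_DEVICE_GONE;	/* target removed: stop waiting */
		if (waited >= budget_us)
			return WAIT_TIMEOUT;		/* slow device: give up eventually */
		ops->sleep_us(ctx, interval_us);
		waited += interval_us;
	}
}

/* Mock device so the loop can be exercised without hardware. */
struct mock_wait {
	int polls_left;		/* done() reports completion once this reaches 0 */
	bool present;
	long slept_us;		/* total time the mock was asked to sleep */
};

static bool mock_done(void *ctx)
{
	struct mock_wait *m = ctx;

	if (m->polls_left == 0)
		return true;
	m->polls_left--;
	return false;
}

static bool mock_present(void *ctx)
{
	return ((struct mock_wait *)ctx)->present;
}

static void mock_sleep(void *ctx, long us)
{
	((struct mock_wait *)ctx)->slept_us += us;
}

static const struct wait_ops mock_ops = { mock_done, mock_present, mock_sleep };
```

The sleep primitive is passed in as a callback so the loop stays independent of any particular timer API; the mock just records how long it was asked to sleep.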