public inbox for linux-kernel@vger.kernel.org
From: Ethan Zhao <haifeng.zhao@linux.intel.com>
To: Lukas Wunner <lukas@wunner.de>
Cc: bhelgaas@google.com, baolu.lu@linux.intel.com,
	dwmw2@infradead.org, will@kernel.org, robin.murphy@arm.com,
	linux-pci@vger.kernel.org, iommu@lists.linux.dev,
	linux-kernel@vger.kernel.org,
	Haorong Ye <yehaorong@bytedance.com>
Subject: Re: [PATCH 2/2] iommu/vt-d: don's issue devTLB flush request when device is disconnected
Date: Fri, 15 Dec 2023 08:43:24 +0800
Message-ID: <b270f606-4a34-4477-9795-63cd4f019be3@linux.intel.com>
In-Reply-To: <7f756fc6-e8ea-4fea-ad8b-30066f41037e@linux.intel.com>


On 12/14/2023 10:16 AM, Ethan Zhao wrote:
>
> On 12/13/2023 6:44 PM, Lukas Wunner wrote:
>> On Tue, Dec 12, 2023 at 10:46:37PM -0500, Ethan Zhao wrote:
>>> For those endpoint devices connected to the system via hotplug-capable ports,
>>> users could request a warm reset of the device by flapping the device's link
>>> through setting the slot's link control register,
>> Well, users could just *unplug* the device, right?  Why is it relevant
>> that they could fiddle with registers in config space?
>>
> Yes, if the device and its slot are hotplug capable, users could just
> 'unplug' the device.
>
> But in the case reported, users tried to do a warm reset with a tool
> command like:
>
>   mlxfwreset -d <busid> -y reset
>
> Actually, it accesses configuration space just as this does:
>
>  setpci -s 0000:17:01.0 0x78.L=0x21050010
>
> Well, we can't tell users not to fiddle with PCIe config space registers
> like that.
>
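
To make the setpci value above easier to read, here is a small user-space
sketch that splits it into fields. It assumes that offset 0x78 on this port
is the Link Control register of the PCI Express capability, with Link Status
in the upper 16 bits of the same dword; I have not confirmed that layout for
this particular device. The helper names are made up for illustration. The
bit masks are the ones the PCIe spec defines (Linux exposes the same values
as PCI_EXP_LNKCTL_LD and PCI_EXP_LNKSTA_*).

```c
#include <stdbool.h>
#include <stdint.h>

#define LNKCTL_LINK_DISABLE 0x0010u /* Link Control bit 4 */
#define LNKSTA_SPEED_MASK   0x000fu /* Link Status bits 3:0 */
#define LNKSTA_WIDTH_MASK   0x03f0u /* Link Status bits 9:4 */
#define LNKSTA_DLL_ACTIVE   0x2000u /* Link Status bit 13 */

/* Assumption: Link Control sits in bits 15:0, Link Status in bits 31:16. */
static uint16_t lnkctl_of(uint32_t dword)
{
	return (uint16_t)(dword & 0xffffu);
}

static uint16_t lnksta_of(uint32_t dword)
{
	return (uint16_t)(dword >> 16);
}

/* True if the written value sets Link Disable, which takes the link down. */
static bool link_disable_set(uint32_t dword)
{
	return (lnkctl_of(dword) & LNKCTL_LINK_DISABLE) != 0;
}

/* Encoded link speed (a code, not GT/s). */
static unsigned int link_speed_code(uint32_t dword)
{
	return lnksta_of(dword) & LNKSTA_SPEED_MASK;
}

/* Negotiated link width in lanes. */
static unsigned int link_width(uint32_t dword)
{
	return (lnksta_of(dword) & LNKSTA_WIDTH_MASK) >> 4;
}

/* True if the Data Link Layer Link Active bit is set in the status half. */
static bool dll_link_active(uint32_t dword)
{
	return (lnksta_of(dword) & LNKSTA_DLL_ACTIVE) != 0;
}
```

Under that assumption, 0x21050010 sets Link Disable in the control half,
while the status half reads as a x16 link at speed code 5 with the data link
layer active, i.e. it looks like a read-back value with the Link Disable bit
OR'ed in.
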
>>> as pciehp_ist() DLLSC
>>> interrupt sequence response, pciehp will unload the device driver and
>>> then power it off, thus causing an IOMMU devTLB flush request for the device
>>> to be sent and a long completion/timeout wait in interrupt context.
>> A completion timeout should be on the order of usecs or msecs, why does it
>> cause a hard lockup?  The dmesg excerpt you've provided shows a 12 *second*
>> delay between hot removal and watchdog reaction.
>>
> In my understanding, the devTLB flush request sent to an ATS-capable device
> is a non-posted request. If the ATS transaction is broken by an endpoint
> link-down or power-off event, the timeout can take 60 to 90 seconds,
> see the "Invalidate Completion Timeout" part of
> chapter 10.3.1 "Invalidate Request" in PCIe spec 6.1:
>
> "
> IMPLEMENTATION NOTE:
> INVALIDATE COMPLETION TIMEOUT
>
> Devices should respond to Invalidate Requests within 1 minute (+50% -0%).
> Having a bounded time permits an ATPT to implement Invalidate Completion
> Timeouts and reuse the associated ITag values.
>
> ATPT designs are implementation specific. As such, Invalidate Completion
> Timeouts and their associated error handling are outside the scope of this
> specification.
> "
>
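
The arithmetic behind that window, as a minimal sketch: "1 minute (+50% -0%)"
gives a lower bound of 60 seconds and an upper bound of 90 seconds. The
struct and function names below are hypothetical, and the 10 second figure in
the usage note is what I believe the default watchdog_thresh to be, not
something taken from the report.

```c
#include <stdbool.h>

/* Bounds, in seconds, within which a device should complete an invalidation. */
struct timeout_window {
	unsigned int min_s;
	unsigned int max_s;
};

/* Apply "+plus_pct% / -minus_pct%" tolerances to a nominal timeout. */
static struct timeout_window inv_completion_window(unsigned int nominal_s,
						   unsigned int plus_pct,
						   unsigned int minus_pct)
{
	struct timeout_window w = {
		.min_s = nominal_s - nominal_s * minus_pct / 100,
		.max_s = nominal_s + nominal_s * plus_pct / 100,
	};

	return w;
}

/* True if even the shortest allowed wait exceeds the watchdog period. */
static bool wait_outlasts_watchdog(struct timeout_window w,
				   unsigned int watchdog_s)
{
	return w.min_s > watchdog_s;
}
```

Calling inv_completion_window(60, 50, 0) yields the 60 to 90 second window,
and wait_outlasts_watchdog() is true for a 10 second watchdog period. So if
the wait really is bounded only by this spec note, a watchdog reaction after
about 12 seconds is what I would expect to see.
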
>>> Fix it by checking the device's error_state in
>>> devtlb_invalidation_with_pasid() to avoid sending meaningless devTLB flush
>>> request to link down device that is set to pci_channel_io_perm_failure and
>>> then powered off in
>> This doesn't seem to be a proper fix.  It will work most of the time
>> but not always.  A user might bring down the slot via sysfs, then yank
>> the card from the slot just when the iommu flush occurs such that the
>> pci_dev_is_disconnected(pdev) check returns false but the card is
>> physically gone immediately afterwards.  In other words, you've shrunk
>> the time window during which the issue may occur, but haven't eliminated
>> it completely.
>
> If you mean disabling the slot via sysfs, that's SAFE_REMOVAL, right?
> That would issue the devTLB invalidation first and power off the device
> later, so it wouldn't trigger the hard lockup even though
> pci_dev_is_disconnected() returns false. This fix works in such a case.
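
To spell out the two cases, here is a user-space model of the check. It is
only a sketch of the idea, not the patch itself: the model_* names are
invented for illustration, and the enum is a stand-in for the kernel's
channel states. The one thing it borrows from the kernel is that
pci_dev_is_disconnected() compares error_state against
pci_channel_io_perm_failure.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's PCI channel states. */
enum model_channel_state {
	MODEL_IO_NORMAL,
	MODEL_IO_FROZEN,
	MODEL_IO_PERM_FAILURE,
};

struct model_dev {
	enum model_channel_state error_state;
	bool ats_enabled;
};

/* Convenience constructor for the model device. */
static struct model_dev model_dev_make(enum model_channel_state state,
				       bool ats_enabled)
{
	struct model_dev dev = {
		.error_state = state,
		.ats_enabled = ats_enabled,
	};

	return dev;
}

/* Same idea as pci_dev_is_disconnected(): perm_failure means "gone". */
static bool model_dev_is_disconnected(struct model_dev dev)
{
	return dev.error_state == MODEL_IO_PERM_FAILURE;
}

/* Only send a devTLB flush to an ATS device that is still reachable. */
static bool model_should_flush_devtlb(struct model_dev dev)
{
	if (!dev.ats_enabled)
		return false;

	return !model_dev_is_disconnected(dev);
}
```

In the surprise link-down case the device is already marked as permanently
failed when the driver is unbound, so the flush is skipped. In the safe
removal case the state is still normal, the flush is sent, and the device is
still powered and can answer it. The model does not cover a device that
vanishes between the check and the flush, which is the window you describe.
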

Could you help point out whether there are any other windows to close?

Thanks,

Ethan


>
> Thanks,
>
> Ethan
>
>>
>> Thanks,
>>
>> Lukas
