public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Sathyanarayanan Kuppuswamy  <sathyanarayanan.kuppuswamy@linux.intel.com>
To: Lukas Wunner <lukas@wunner.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1] PCI: pciehp: Make sure DPC trigger status is reset in PDC handler
Date: Thu, 15 Jun 2023 16:03:54 -0700	[thread overview]
Message-ID: <713d71dc-c4a5-cd7b-2deb-343c244dd14d@linux.intel.com> (raw)
In-Reply-To: <20230615183550.GA9773@wunner.de>

Hi Lukas,

Thanks for the review.

On 6/15/23 11:35 AM, Lukas Wunner wrote:
> On Wed, Jun 14, 2023 at 11:25:59PM -0700, Kuppuswamy Sathyanarayanan wrote:
>> During the EDR-based DPC recovery process, for devices with persistent
>> issues, the firmware may choose not to handle the DPC error and leave
>> the port in DPC triggered state. In such scenarios, if the user
>> replaces the faulty device with a new one, the OS is expected to clear
>> the DPC trigger status in the hotplug error handler to enable the new
>> device enumeration.
> 
> You're clearing the DPC trigger status upon a PDC event, yet are saying
> here the purpose is to reset port state for a future hotplugged device.

Sorry, it is a typo. I meant "hotplug interrupt handler".

The goal is to ensure that when a new device presence is detected, the
old DPC trigger status is cleared.

> 
> A PDC event may be synthesized, e.g. to trigger slot bringup via
> sysfs, so using a PDC event to clear DPC trigger status feels wrong.

IMO, it is harmless. We just want to make sure the previous DPC trigger
status is cleared before enumerating a new device.

> pciehp_unconfigure_device() seems like a more appropriate place to me.
> 

I initially thought to add it there. Spec also recommends clearing it
when removing the device. But I wasn't sure if pciehp_unconfigure_device()
would be called only during device removal. 

Let me test this path and get back to you.

> 
>> More details about this issue can be found in PCIe
>> firmware specification, r3.3, sec titled "DPC Event Handling"
>> Implementation note.
> 
> That Implementation Note contains a lot of text and a fairly complex
> flow chart. If you could point to specific paragraphs or numbers in
> the Implementation Note that would make life easier for a reviewer
> to make the connection between your code and the spec.

It is the text at the end of the flowchart. Copied it here for reference.

For devices with persistent errors, a port may be kept in the DPC triggered
state (disabled) to keep those devices from continuing to generate errors. For
hot-plug slots, the errant device may be removed and replaced with a new device.
If the DPC trigger state is not cleared, then the port above the newly inserted
device will still be disabled and will be non-operational. Therefore, operating
systems may need to modify their hot-plug interrupt handling code to clear DPC
Trigger Status when a device is removed so that a subsequent insertion
will succeed.

> 
> 
>> Similar issue might also happen if the DPC or EDR recovery handler
>> exits before clearing the trigger status. To fix this issue, clear the
>> DPC trigger status in PDC interrupt handler.
> 
> I was about to ask why the code is added to dpc.c, not edr.c,
> and why it's not constrained to CONFIG_PCIE_EDR, but I assume
> that's the reason?  Because it "might" happen for OS-native DPC
> as well?

Yes. There are code paths in the DPC driver where error recover handler
can exit before clearing the DPC trigger status. So I think this fix is
applicable for native code as well.

> 
> 
>> +/**
>> + * pci_reset_trigger - Clear DPC trigger status
>> + * @pdev: PCI device
>> + *
>> + * It is called from the PCIe hotplug driver to clean the DPC
>> + * trigger status in the PDC interrupt handler.
>> + */
>> +void pci_dpc_reset_trigger(struct pci_dev *pdev)
>> +{
>> +	if (!pdev->dpc_cap)
>> +		return;
>> +
>> +	pci_write_config_word(pdev, pdev->dpc_cap + PCI_EXP_DPC_STATUS,
>> +			      PCI_EXP_DPC_STATUS_TRIGGER);
>> +}
> 
> This may run concurrently to dpc_reset_link(), so I'd expect that
> you need some kind of serialization.  What happens if pciehp clears
> trigger status behind the DPC driver's back while it is handling an
> error?

Currently, we only call pci_dpc_reset_trigger() in PDC interrupt handler.

Do you think there would be a race between error handler and PDC handler?

> 
> Thanks,
> 
> Lukas

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

  reply	other threads:[~2023-06-15 23:04 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-15  6:25 [PATCH v1] PCI: pciehp: Make sure DPC trigger status is reset in PDC handler Kuppuswamy Sathyanarayanan
2023-06-15  8:18 ` kernel test robot
2023-06-15 18:35 ` Lukas Wunner
2023-06-15 23:03   ` Sathyanarayanan Kuppuswamy [this message]
2023-06-16  9:06     ` Lukas Wunner
2023-06-16 23:27       ` Sathyanarayanan Kuppuswamy
2023-06-17  8:31         ` Lukas Wunner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=713d71dc-c4a5-cd7b-2deb-343c244dd14d@linux.intel.com \
    --to=sathyanarayanan.kuppuswamy@linux.intel.com \
    --cc=bhelgaas@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lukas@wunner.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox