From: Keith Busch <keith.busch@intel.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Linux PCI <linux-pci@vger.kernel.org>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Sinan Kaya <okaya@kernel.org>, Thomas Tai <thomas.tai@oracle.com>,
	poza@codeaurora.org, Lukas Wunner <lukas@wunner.de>,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCHv2 15/20] PCI/pciehp: Fix powerfault detection order
Date: Thu, 6 Sep 2018 13:50:47 -0600
Message-ID: <20180906195047.GD31024@localhost.localdomain>
In-Reply-To: <20180906193657.GH214747@bhelgaas-glaptop.roam.corp.google.com>
On Thu, Sep 06, 2018 at 02:36:57PM -0500, Bjorn Helgaas wrote:
> On Wed, Sep 05, 2018 at 02:35:41PM -0600, Keith Busch wrote:
> > A device add in a power controller controlled slot will power on and
> > clear power fault slot events, but this was happening before the interrupt
> > handler attempted to set the sticky status and attention indicators. The
> > wrong status will be set if a hot-add and power fault are handled in
> > one interrupt. This patch fixes that by checking for power faults before
> > checking for new devices.
>
> Can you clarify the part about "the interrupt handler attempting to set the
> sticky status and attention indicators"? My first impression is that
> you're talking about bits in the Slot Status register, but that's
> obviously wrong because those bits are set by hardware (not the interrupt
> handler) and they're RW1C so software clears them by writing 1 to them.
The sticky status is the pciehp driver's "power_fault_detected" field.
We set it on the first observation of a slot's PFD and do not clear it
until we have a successful board_added event.
> Lukas suggests that this patch should be in v4.19. Do you agree, and if
> so, can you help me justify it by describing the user-visible effect of
> this? I'm not sure what "setting the wrong status" means to a user, e.g.,
> does this result in a non-functional device, an incorrect status LED on the
> slot, something else? Does it fix a regression or something we merged for
> v4.19?
From a user's point of view, the attention LED could be left on after a
successful hot add.
The only reason this worked before is that everything was chained
through work queues, with the work ordered as:

  INT_PRESENCE_ON -> INT_POWER_FAULT -> ENABLE_REQ

ENABLE_REQ cleared the power fault at the end. Now everything is
handled inline in the interrupt thread (which was a great change, IMO),
so the work ENABLE_REQ used to do happens before power fault handling.

The commit that changed that order:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=0e94916e6091f48391b65110e71c87c583021640
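
To make the ordering problem concrete, here is a minimal user-space sketch.
It is not the driver code: the toy_* names, the EV_* bits and the struct are
made up for illustration, and it assumes that the hot add succeeds and that a
successful add clears both the sticky flag and the attention indicator.

```c
#include <assert.h>
#include <stdbool.h>

/* Event bits for this sketch only; the driver uses PCI_EXP_SLTSTA_* bits. */
#define EV_PRESENCE    0x1u
#define EV_POWER_FAULT 0x2u

/* Hypothetical, pared-down stand-in for the controller state. */
struct toy_ctrl {
	bool power_fault_detected;	/* sticky software flag */
	bool attention_led_on;
};

/* Assumption: a successful hot add clears the flag and the indicator. */
static void toy_board_added(struct toy_ctrl *ctrl)
{
	ctrl->power_fault_detected = false;
	ctrl->attention_led_on = false;
}

/* Same shape as the PFD check in the patch: latch once, light the LED. */
static void toy_check_power_fault(struct toy_ctrl *ctrl, unsigned int events)
{
	if ((events & EV_POWER_FAULT) && !ctrl->power_fault_detected) {
		ctrl->power_fault_detected = true;
		ctrl->attention_led_on = true;
	}
}

static void toy_handle_presence(struct toy_ctrl *ctrl, unsigned int events)
{
	if (events & EV_PRESENCE)
		toy_board_added(ctrl);
}

/*
 * One pass of a toy interrupt thread over a fresh controller.
 * Returns true if the attention LED is left on afterwards.
 */
bool toy_ist(unsigned int events, bool fault_first)
{
	struct toy_ctrl ctrl = { false, false };

	if (fault_first) {
		toy_check_power_fault(&ctrl, events);
		toy_handle_presence(&ctrl, events);
	} else {
		toy_handle_presence(&ctrl, events);
		toy_check_power_fault(&ctrl, events);
	}
	return ctrl.attention_led_on;
}

/* Self-check of both orderings; call it from main() to try the sketch. */
void toy_selfcheck(void)
{
	unsigned int both = EV_PRESENCE | EV_POWER_FAULT;

	assert(toy_ist(both, false));	/* add first: LED left on */
	assert(!toy_ist(both, true));	/* fault first: LED ends up off */
}
```

With both events in one pass, handling the add first leaves the indicator on
even though the add succeeded, while checking the fault first lets the add
clear it. A power fault on its own still lights the indicator in either order.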
> > Signed-off-by: Keith Busch <keith.busch@intel.com>
> > Reviewed-by: Lukas Wunner <lukas@wunner.de>
> > ---
> > drivers/pci/hotplug/pciehp_hpc.c | 16 ++++++++--------
> > 1 file changed, 8 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
> > index 9eb28a06cac6..52a18a7ec2a2 100644
> > --- a/drivers/pci/hotplug/pciehp_hpc.c
> > +++ b/drivers/pci/hotplug/pciehp_hpc.c
> > @@ -630,6 +630,14 @@ static irqreturn_t pciehp_ist(int irq, void *dev_id)
> >  		pciehp_handle_button_press(slot);
> >  	}
> >
> > +	/* Check Power Fault Detected */
> > +	if ((events & PCI_EXP_SLTSTA_PFD) && !ctrl->power_fault_detected) {
> > +		ctrl->power_fault_detected = 1;
> > +		ctrl_err(ctrl, "Slot(%s): Power fault\n", slot_name(slot));
> > +		pciehp_set_attention_status(slot, 1);
> > +		pciehp_green_led_off(slot);
> > +	}
> > +
> >  	/*
> >  	 * Disable requests have higher priority than Presence Detect Changed
> >  	 * or Data Link Layer State Changed events.
> > @@ -641,14 +649,6 @@ static irqreturn_t pciehp_ist(int irq, void *dev_id)
> >  		pciehp_handle_presence_or_link_change(slot, events);
> >  	up_read(&ctrl->reset_lock);
> >
> > -	/* Check Power Fault Detected */
> > -	if ((events & PCI_EXP_SLTSTA_PFD) && !ctrl->power_fault_detected) {
> > -		ctrl->power_fault_detected = 1;
> > -		ctrl_err(ctrl, "Slot(%s): Power fault\n", slot_name(slot));
> > -		pciehp_set_attention_status(slot, 1);
> > -		pciehp_green_led_off(slot);
> > -	}
> > -
> >  	pci_config_pm_runtime_put(pdev);
> >  	wake_up(&ctrl->requester);
> >  	return IRQ_HANDLED;
> > --
> > 2.14.4
> >