From: "Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>
To: Lukas Wunner <lukas@wunner.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
linux-pci@vger.kernel.org, Guenter Roeck <groeck@juniper.net>,
Mika Westerberg <mika.westerberg@linux.intel.com>,
"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
Rajat Jain <rajatxjain@gmail.com>,
Joel Mathew Thomas <proxy0@tutamail.com>,
LKML <linux-kernel@vger.kernel.org>,
Jonathan Cameron <Jonathan.Cameron@huawei.com>,
stable@vger.kernel.org
Subject: Re: [PATCH 1/4] PCI/hotplug: Disable HPIE over reset
Date: Mon, 17 Mar 2025 20:08:12 +0200 (EET) [thread overview]
Message-ID: <a18432fc-a9ff-0435-cd94-912bf2dcb4b3@linux.intel.com> (raw)
In-Reply-To: <Z9Wjk2GzrSURZoTG@wunner.de>
[-- Attachment #1: Type: text/plain, Size: 3671 bytes --]
On Sat, 15 Mar 2025, Lukas Wunner wrote:
> On Thu, Mar 13, 2025 at 04:23:30PM +0200, Ilpo Järvinen wrote:
> > pciehp_reset_slot() disables PDCE (Presence Detect Changed Enable) and
> > DLLSCE (Data Link Layer State Changed Enable) for the duration of reset
> > and clears the related status bits PDC and DLLSC from the Slot Status
> > register after the reset to avoid hotplug incorrectly assuming the card
> > was removed.
> >
> > However, hotplug shares interrupt with PME and BW notifications both of
> > which can make pciehp_isr() to run despite PDCE and DLLSCE bits being
> > off. pciehp_isr() then picks PDC or DLLSC bits from the Slot Status
> > register due to the events that occur during reset and caches them into
> > ->pending_events. Later, the IRQ thread in pciehp_ist() will process
> > the ->pending_events and will assume the Link went Down due to a card
> > change (in pciehp_handle_presence_or_link_change()).
> >
> > Change pciehp_reset_slot() to also clear HPIE (Hot-Plug Interrupt
> > Enable) as pciehp_isr() will first check HPIE to see if the interrupt
> > is not for it. Then synchronize with the IRQ handling to ensure no
> > events are pending, before invoking the reset.
>
> After dwelling on this for a while, I'm thinking that it may re-introduce
> the issue fixed by commit f5eff5591b8f ("PCI: pciehp: Fix AB-BA deadlock
> between reset_lock and device_lock"):
>
> Looking at the second and third stack trace in its commit message,
> down_write(reset_lock) in pciehp_reset_slot() is basically equivalent
> to synchronize_irq() and we're holding device_lock() at that point,
> hindering progress of pciehp_ist().
This description was somewhat confusing but what I can see, now that you
mentioned this, is that if pciehp_reset_slot() calls synchronize_irq(), it
can result in trying to acquire device_lock() again while trying to drain
the pending events. ->reset_lock seems irrelevant to that problem.
Thus, pciehp_reset_slot() cannot ever rely on completing the processing of
all pending events before it invokes the reset as long as any of its
callers is holding device_lock().
It's a bit sad, because removing most of the reset_lock complexity would
have been nice simplification in locking, effectively it would have
reverted f5eff5591b8f too.
> So I think I have guided you in the wrong direction and I apologize
> for that.
>
> However it seems to me that this should be solvable with the small
> patch below. Am I missing something?
>
> @Joel Mathew Thomas, could you give the below patch a spin and see
> if it helps?
>
> Thanks!
>
> -- >8 --
>
> diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
> index bb5a8d9f03ad..99a2ac13a3d1 100644
> --- a/drivers/pci/hotplug/pciehp_hpc.c
> +++ b/drivers/pci/hotplug/pciehp_hpc.c
> @@ -688,6 +688,11 @@ static irqreturn_t pciehp_isr(int irq, void *dev_id)
> return IRQ_HANDLED;
> }
>
> + /* Ignore events masked by pciehp_reset_slot(). */
> + events &= ctrl->slot_ctrl;
> + if (!events)
> + return IRQ_HANDLED;
> +
> /* Save pending events for consumption by IRQ thread. */
> atomic_or(events, &ctrl->pending_events);
> return IRQ_WAKE_THREAD;
Yes, this should work, I think.
I'm not entirely sure though how reading ->slot_ctrl here synchronizes
wrt. pciehp_reset_slot() invoking reset. What guarantees pciehp_isr() sees
the updated ->slot_ctrl when pciehp_reset_slot() has proceeded to invoke
the reset? (I'm in general very hesitant about lockless and barrierless
reader being race free, I might be just paranoid about it.)
--
i.
next prev parent reply other threads:[~2025-03-17 18:08 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-03-13 14:23 [PATCH 0/4] PCI/hotplug: Fix interrupt / event handling problems Ilpo Järvinen
2025-03-13 14:23 ` [PATCH 1/4] PCI/hotplug: Disable HPIE over reset Ilpo Järvinen
2025-03-15 15:58 ` Lukas Wunner
[not found] ` <OLQ9qyD--F-9@tutamail.com>
2025-03-15 21:12 ` Lukas Wunner
2025-03-17 18:08 ` Ilpo Järvinen [this message]
2025-03-13 14:23 ` [PATCH 2/4] PCI/hotplug: Clearing HPIE for the duration of reset is enough Ilpo Järvinen
2025-03-13 14:23 ` [PATCH 3/4] PCI/hotplug: reset_lock is not required synchronizing with irq thread Ilpo Järvinen
2025-03-14 8:32 ` Lukas Wunner
2025-03-14 11:18 ` Ilpo Järvinen
2025-03-13 14:23 ` [PATCH 4/4] PCI/hotplug: Don't enabled HPIE in poll mode Ilpo Järvinen
2025-03-15 11:57 ` Lukas Wunner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a18432fc-a9ff-0435-cd94-912bf2dcb4b3@linux.intel.com \
--to=ilpo.jarvinen@linux.intel.com \
--cc=Jonathan.Cameron@huawei.com \
--cc=bhelgaas@google.com \
--cc=groeck@juniper.net \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=lukas@wunner.de \
--cc=mika.westerberg@linux.intel.com \
--cc=proxy0@tutamail.com \
--cc=rafael.j.wysocki@intel.com \
--cc=rajatxjain@gmail.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.