public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>
To: Lukas Wunner <lukas@wunner.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
	linux-pci@vger.kernel.org,  Guenter Roeck <groeck@juniper.net>,
	 Mika Westerberg <mika.westerberg@linux.intel.com>,
	 "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	 Rajat Jain <rajatxjain@gmail.com>,
	 Joel Mathew Thomas <proxy0@tutamail.com>,
	 LKML <linux-kernel@vger.kernel.org>,
	 Jonathan Cameron <Jonathan.Cameron@huawei.com>,
	stable@vger.kernel.org
Subject: Re: [PATCH 1/4] PCI/hotplug: Disable HPIE over reset
Date: Mon, 17 Mar 2025 20:08:12 +0200 (EET)	[thread overview]
Message-ID: <a18432fc-a9ff-0435-cd94-912bf2dcb4b3@linux.intel.com> (raw)
In-Reply-To: <Z9Wjk2GzrSURZoTG@wunner.de>

[-- Attachment #1: Type: text/plain, Size: 3671 bytes --]

On Sat, 15 Mar 2025, Lukas Wunner wrote:

> On Thu, Mar 13, 2025 at 04:23:30PM +0200, Ilpo Järvinen wrote:
> > pciehp_reset_slot() disables PDCE (Presence Detect Changed Enable) and
> > DLLSCE (Data Link Layer State Changed Enable) for the duration of reset
> > and clears the related status bits PDC and DLLSC from the Slot Status
> > register after the reset to avoid hotplug incorrectly assuming the card
> > was removed.
> > 
> > However, hotplug shares interrupt with PME and BW notifications both of
> > which can make pciehp_isr() to run despite PDCE and DLLSCE bits being
> > off. pciehp_isr() then picks PDC or DLLSC bits from the Slot Status
> > register due to the events that occur during reset and caches them into
> > ->pending_events. Later, the IRQ thread in pciehp_ist() will process
> > the ->pending_events and will assume the Link went Down due to a card
> > change (in pciehp_handle_presence_or_link_change()).
> > 
> > Change pciehp_reset_slot() to also clear HPIE (Hot-Plug Interrupt
> > Enable) as pciehp_isr() will first check HPIE to see if the interrupt
> > is not for it. Then synchronize with the IRQ handling to ensure no
> > events are pending, before invoking the reset.
> 
> After dwelling on this for a while, I'm thinking that it may re-introduce
> the issue fixed by commit f5eff5591b8f ("PCI: pciehp: Fix AB-BA deadlock
> between reset_lock and device_lock"):
> 
> Looking at the second and third stack trace in its commit message,
> down_write(reset_lock) in pciehp_reset_slot() is basically equivalent
> to synchronize_irq() and we're holding device_lock() at that point,
> hindering progress of pciehp_ist().

This description was somewhat confusing but what I can see, now that you 
mentioned this, is that if pciehp_reset_slot() calls synchronize_irq(), it 
can result in trying to acquire device_lock() again while trying to drain 
the pending events. ->reset_lock seems irrelevant to that problem.

Thus, pciehp_reset_slot() cannot ever rely on completing the processing of 
all pending events before it invokes the reset as long as any of its 
callers is holding device_lock().

It's a bit sad, because removing most of the reset_lock complexity would 
have been nice simplification in locking, effectively it would have 
reverted f5eff5591b8f too.

> So I think I have guided you in the wrong direction and I apologize
> for that.
> 
> However it seems to me that this should be solvable with the small
> patch below.  Am I missing something?
> 
> @Joel Mathew Thomas, could you give the below patch a spin and see
> if it helps?
> 
> Thanks!
> 
> -- >8 --
> 
> diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
> index bb5a8d9f03ad..99a2ac13a3d1 100644
> --- a/drivers/pci/hotplug/pciehp_hpc.c
> +++ b/drivers/pci/hotplug/pciehp_hpc.c
> @@ -688,6 +688,11 @@ static irqreturn_t pciehp_isr(int irq, void *dev_id)
>  		return IRQ_HANDLED;
>  	}
>  
> +	/* Ignore events masked by pciehp_reset_slot(). */
> +	events &= ctrl->slot_ctrl;
> +	if (!events)
> +		return IRQ_HANDLED;
> +
>  	/* Save pending events for consumption by IRQ thread. */
>  	atomic_or(events, &ctrl->pending_events);
>  	return IRQ_WAKE_THREAD;

Yes, this should work, I think.

I'm not entirely sure though how reading ->slot_ctrl here synchronizes 
wrt. pciehp_reset_slot() invoking reset. What guarantees pciehp_isr() sees 
the updated ->slot_ctrl when pciehp_reset_slot() has proceeded to invoke 
the reset? (I'm in general very hesitant about lockless and barrierless 
reader being race free, I might be just paranoid about it.)

-- 
 i.

  parent reply	other threads:[~2025-03-17 18:08 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-13 14:23 [PATCH 0/4] PCI/hotplug: Fix interrupt / event handling problems Ilpo Järvinen
2025-03-13 14:23 ` [PATCH 1/4] PCI/hotplug: Disable HPIE over reset Ilpo Järvinen
2025-03-15 15:58   ` Lukas Wunner
     [not found]     ` <OLQ9qyD--F-9@tutamail.com>
2025-03-15 21:12       ` Lukas Wunner
2025-03-17 18:08     ` Ilpo Järvinen [this message]
2025-03-13 14:23 ` [PATCH 2/4] PCI/hotplug: Clearing HPIE for the duration of reset is enough Ilpo Järvinen
2025-03-13 14:23 ` [PATCH 3/4] PCI/hotplug: reset_lock is not required synchronizing with irq thread Ilpo Järvinen
2025-03-14  8:32   ` Lukas Wunner
2025-03-14 11:18     ` Ilpo Järvinen
2025-03-13 14:23 ` [PATCH 4/4] PCI/hotplug: Don't enabled HPIE in poll mode Ilpo Järvinen
2025-03-15 11:57   ` Lukas Wunner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a18432fc-a9ff-0435-cd94-912bf2dcb4b3@linux.intel.com \
    --to=ilpo.jarvinen@linux.intel.com \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=bhelgaas@google.com \
    --cc=groeck@juniper.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lukas@wunner.de \
    --cc=mika.westerberg@linux.intel.com \
    --cc=proxy0@tutamail.com \
    --cc=rafael.j.wysocki@intel.com \
    --cc=rajatxjain@gmail.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox