From: Lukas Wunner <lukas@wunner.de>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: "Bitao Hu" <yaoma@linux.alibaba.com>,
	bhelgaas@google.com, weirongguang@kylinos.cn,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	kanie@linux.alibaba.com,
	"Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>
Subject: Re: [PATCHv2] PCI: pciehp: Use appropriate conditions to check the hotplug controller status
Date: Sat, 15 Jun 2024 12:06:58 +0200
Message-ID: <Zm1nwq97LdLNhrTz@wunner.de>
In-Reply-To: <20240614220327.GA1125489@bhelgaas>

On Fri, Jun 14, 2024 at 05:03:27PM -0500, Bjorn Helgaas wrote:
> On Fri, Jun 14, 2024 at 09:36:57PM +0200, Lukas Wunner wrote:
> > Hm, good point.  I guess we should change the logical expression instead:
> > 
> > -	if (present <= 0 && link_active <= 0) {
> > +	if (present < 0 || link_active < 0 || (!present && !link_active)) {
> 
> It gets to be a fairly complicated expression, and I'm not 100% sure
> we should handle the config read failure the same as the "!present &&
> !link_active" case.  The config read failure probably means the
> Downstream Port is gone, the other case means the device *below* that
> port is gone.
> 
> We likely want to cancel the delayed work in both cases, but what
> about the indicators?  If the Downstream Port is gone, we're not going
> to be able to change them.  Do we want the same message for both?
> 
> Maybe we should handle the config failures separately first?  These
> error conditions make everything so ugly.

To keep the code simple, I'm leaning towards not making the call to
pciehp_set_indicators() conditional.  The worst thing that can happen
is that pciehp waits 1 sec for a previous write to the Slot Control
register to time out.
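
To illustrate how the two conditions quoted above differ, here is a
minimal userspace sketch, not the driver code.  It assumes the tri-state
convention discussed in this thread (negative errno on config read
failure, 0 for "no", 1 for "yes").  The helper names slot_gone_old() and
slot_gone_new() are hypothetical and exist only for this illustration:

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Assumed tri-state inputs: a negative errno (e.g. -ENODEV) if the
 * config read failed, 0 if not present / link down, 1 otherwise.
 * Both helper names are made up for this sketch.
 */

/* Condition before the proposed change. */
static bool slot_gone_old(int present, int link_active)
{
	return present <= 0 && link_active <= 0;
}

/* Condition from the quoted diff: any read failure also counts. */
static bool slot_gone_new(int present, int link_active)
{
	return present < 0 || link_active < 0 ||
	       (!present && !link_active);
}
```

The two only disagree when one value is an error and the other is 1,
e.g. present == -ENODEV and link_active == 1: the old condition is false,
the new one is true.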


> > > These are cases where we misinterpreted -ENODEV as "device is present"
> > > or "link is active".
> > > 
> > > pciehp_ignore_dpc_link_change() and pciehp_slot_reset() also call
> > > pciehp_check_link_active(), and I think they also interpret -ENODEV as
> > > "link is active".
> > > 
> > > Do we need similar changes there?
> > 
> > Another good observation, both need to check for <= 0 instead of == 0.
> > Do you want to fix that yourself or would you prefer me (or someone else)
> > to submit a patch?
> 
> It'd be great if you or somebody else could do that.

After looking at this with a fresh pair of eyeballs, I'm thinking now
that the code is actually fine the way it is:

- pciehp_ignore_dpc_link_change():

  If pciehp_check_link_active() returns -ENODEV, it means we recovered
  from DPC but immediately afterwards the hotplug port became inaccessible,
  perhaps because it was hot-removed or because a DPC event occurred
  further up in the hierarchy.  In neither case would it be appropriate
  to synthesize a Data Link Layer State Changed event:

  If the hotplug port was hot-removed, it's better to let the hotplug port
  in its ancestry handle the de-enumeration of its sub-hierarchy and not
  interfere with that by trying to concurrently remove a portion of that
  sub-hierarchy.
  
  If a DPC event occurred further up, it's better to let the DPC-capable
  port in the ancestry handle the recovery and not interfere with that.

- pciehp_slot_reset():

  If pciehp_check_link_active() returns -ENODEV, it means a Hot Reset
  was propagated down the hierarchy after which the hotplug port is
  no longer accessible.  Perhaps the hotplug port was hot-removed by
  the user, in which case we should let the hotplug port in the
  ancestry handle de-enumeration.  Another possibility is that reset
  recovery failed.  I don't think we should try to de-enumerate devices
  below the hotplug port in that case.  Maybe another error occurred
  which triggered another reset and things will be fine after we've
  recovered from that.
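
As a rough model of the behavior argued for above (a userspace sketch,
not the actual pciehp code), both callers can be thought of as
synthesizing a Data Link Layer State Changed event only when the link is
known to be down.  The helper names below are hypothetical; the second
one shows the "<= 0" alternative that was floated earlier in the thread:

```c
#include <errno.h>
#include <stdbool.h>

/*
 * link_active is assumed to be a negative errno (e.g. -ENODEV) if the
 * hotplug port is inaccessible, 0 if the link is down, 1 if it is up.
 * Both helper names are made up for this sketch.
 */

/* Current behavior: act only if the link is known to be down. */
static bool synthesize_dllsc(int link_active)
{
	return link_active == 0;
}

/* Alternative discussed earlier: also act on a read failure. */
static bool synthesize_dllsc_on_error(int link_active)
{
	return link_active <= 0;
}
```

With the "== 0" check, -ENODEV results in no synthesized event, which
leaves de-enumeration or recovery to the port further up in the
hierarchy.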

Thanks,

Lukas

Thread overview: 11+ messages
2024-05-24  6:30 [PATCH] PCI: pciehp: Use appropriate conditions to check the hotplug controller status Bitao Hu
2024-05-24  7:53 ` Lukas Wunner
2024-05-26 14:45   ` yaoma
2024-05-27  8:50     ` Lukas Wunner
2024-05-27  9:43       ` yaoma
2024-05-28  6:42 ` [PATCHv2] " Bitao Hu
2024-05-28 10:54   ` Ilpo Järvinen
2024-06-14 18:41   ` Bjorn Helgaas
2024-06-14 19:36     ` Lukas Wunner
2024-06-14 22:03       ` Bjorn Helgaas
2024-06-15 10:06         ` Lukas Wunner [this message]
