public inbox for linux-kernel@vger.kernel.org
From: "Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>
To: Lukas Wunner <lukas@wunner.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
	linux-pci@vger.kernel.org,  LKML <linux-kernel@vger.kernel.org>,
	 "Maciej W. Rozycki" <macro@orcam.me.uk>
Subject: Re: [PATCH v2 1/1] PCI/bwctrl: Replace lbms_count with PCI_LINK_LBMS_SEEN flag
Date: Thu, 24 Apr 2025 15:37:38 +0300 (EEST)	[thread overview]
Message-ID: <e639b361-785e-d39b-3c3f-957bcdc54fcd@linux.intel.com> (raw)
In-Reply-To: <aAnOOj91-N6rwt2x@wunner.de>

On Thu, 24 Apr 2025, Lukas Wunner wrote:

> On Wed, Apr 23, 2025 at 02:37:11PM +0300, Ilpo Järvinen wrote:
> > On Wed, 23 Apr 2025, Lukas Wunner wrote:
> > > On Tue, Apr 22, 2025 at 02:55:47PM +0300, Ilpo Järvinen wrote:
> > > > +void pcie_reset_lbms(struct pci_dev *port)
> > > >  {
> > > > -	struct pcie_bwctrl_data *data;
> > > > -
> > > > -	guard(rwsem_read)(&pcie_bwctrl_lbms_rwsem);
> > > > -	data = port->link_bwctrl;
> > > > -	if (data)
> > > > -		atomic_set(&data->lbms_count, 0);
> > > > -	else
> > > > -		pcie_capability_write_word(port, PCI_EXP_LNKSTA,
> > > > -					   PCI_EXP_LNKSTA_LBMS);
> > > > +	clear_bit(PCI_LINK_LBMS_SEEN, &port->priv_flags);
> > > > +	pcie_capability_write_word(port, PCI_EXP_LNKSTA, PCI_EXP_LNKSTA_LBMS);
> > > >  }
> > > 
> > > Hm, previously the LBMS bit was only cleared in the Link Status register
> > > if the bandwidth controller hadn't probed yet.  Now it's cleared
> > > unconditionally.  I'm wondering if this changes the logic somehow?
> > 
> > Hmm, that's a good question and I hadn't thought all the implications.
> > I suppose leaving if (!port->link_bwctrl) there would retain the existing 
> > behavior better allowing bwctrl to pick the link speed changes more 
> > reliably.
> 
> I think the only potential issue with clearing the LBMS bit in the register
> is that the bandwidth controller's irq handler won't see the bit and may
> return with IRQ_NONE.
> 
> However, looking at the callers of pcie_reset_lbms(), that doesn't seem
> to be a real issue.  There are only two of them:
> 
> - pcie_retrain_link() calls the function after the link was retrained.
>   I guess the LBMS bit in the register may be set as a side-effect of
>   the link retraining?

Retraining does set LBMS regardless of whether the speed ends up the same 
as before. I think it's because, LTSSM-wise, retraining transitions 
through Recovery.

(I don't know why, but in most tests I've done, LBMS is actually asserted 
not just once but twice for a single Link Retraining event.)

>   The only concern here is whether the cached
>   link speed is updated.  pcie_bwctrl_change_speed() does call
>   pcie_update_link_speed() after calling pcie_retrain_link(), so that
>   looks fine.  But there's a second caller of pcie_retrain_link():
>   pcie_aspm_configure_common_clock().  It doesn't update the cached
>   link speed after calling pcie_retrain_link().  Not sure if this can
>   lead to a change in link speed and therefore the cached link speed
>   should be updated?  The Target Link Speed isn't changed, but maybe
>   the link fails to retrain to the same speed for electrical reasons?

I've never seen that happen, but it would seem odd if it were forbidden 
(the alternative is presumably that the link remains down).

Perhaps pcie_reset_lbms() should just call pcie_update_link_speed() as the 
last step; then the irq handler returning IRQ_NONE wouldn't matter.

> - pciehp's remove_board() calls the function after bringing down the slot
>   to avoid a stale PCI_LINK_LBMS_SEEN flag.  No real harm in clearing the
>   bit in the register at this point I guess.  But I do wonder, is the link
>   speed updated somewhere when a new board is added?  The replacement
>   device may not support the same speeds as the previous device.

The supported speeds are always recalculated using dev->supported_speeds, 
and a new board implies a new pci_dev structure with freshly read 
supported speeds. Bringing the link up with the replacement device will 
also trigger LBMS, so the new Link Speed should be picked up through that 
path.

Racing the LBMS reset from remove_board() with an LBMS assertion from the 
replacement board shouldn't result in a stale Link Speed because of:

board_added()
  pciehp_check_link_status()
    __pcie_update_link_speed()

> > Given this flag is only for the purposes of the quirk, it seems very much 
> > out of proportion.
> 
> Yes, let's try to minimize the amount of locking, flags and code to support
> the quirk.  Keep it as simple as possible.  So in that sense, the solution
> you've chosen is probably fine.
> 
> 
> > > >  static bool pcie_lbms_seen(struct pci_dev *dev, u16 lnksta)
> > > >  {
> > > > -	unsigned long count;
> > > > -	int ret;
> > > > -
> > > > -	ret = pcie_lbms_count(dev, &count);
> > > > -	if (ret < 0)
> > > > -		return lnksta & PCI_EXP_LNKSTA_LBMS;
> > > > +	if (test_bit(PCI_LINK_LBMS_SEEN, &dev->priv_flags))
> > > > +		return true;
> > > >  
> > > > -	return count > 0;
> > > > +	return lnksta & PCI_EXP_LNKSTA_LBMS;
> > > >  }
> > > 
> > > Another small logic change here:  Previously pcie_lbms_count()
> > > returned a negative value if the bandwidth controller hadn't
> > > probed yet or wasn't compiled into the kernel.
> > > Only in those two cases was the LBMS flag in the lnksta variable 
> > > returned.
> > > 
> > > Now the LBMS flag is also returned if the bandwidth controller
> > > is compiled into the kernel and has probed, but its irq handler
> > > hasn't recorded a seen LBMS bit yet.
> > > 
> > > I'm guessing this can happen if the quirk races with the irq
> > > handler and wins the race, so this safety net is needed?
> > 
> > The main reason why this check is here is for the boot when bwctrl is not 
> > yet probed when the quirk runs. But the check just seems harmless, or 
> > even somewhat useful, in the case when bwctrl has already probed. LBMS 
> > being asserted should result in PCI_LINK_LBMS_SEEN even if the irq 
> > handler has not yet done its job to transfer it into priv_flags.
> 
> Okay I'm convinced that the logic change in pcie_lbms_seen() is fine.
> 
> Thanks,
> 
> Lukas
> 

-- 
 i.

