From: Bjorn Helgaas <helgaas@kernel.org>
To: Ido Schimmel <idosch@nvidia.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>,
Petr Machata <petrm@nvidia.com>,
mlxsw@nvidia.com, linux-pci@vger.kernel.org,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
netdev@vger.kernel.org, Dan Williams <dan.j.williams@intel.com>
Subject: Re: [PATCH net-next 3/3] mlxsw: pci: Lock configuration space of upstream bridge during reset
Date: Fri, 12 Jul 2024 16:21:57 -0500
Message-ID: <20240712212157.GA339030@bhelgaas>
In-Reply-To: <ZoVjPb_OwbKh7kHu@shredder.lan>
[+cc Dan]
On Wed, Jul 03, 2024 at 05:42:05PM +0300, Ido Schimmel wrote:
> On Tue, Jul 02, 2024 at 09:35:50AM +0200, Przemek Kitszel wrote:
> > On 7/1/24 18:41, Petr Machata wrote:
> > > From: Ido Schimmel <idosch@nvidia.com>
> > >
> > > The driver triggers a "Secondary Bus Reset" (SBR) by calling
> > > __pci_reset_function_locked() which asserts the SBR bit in the "Bridge
> > > Control Register" in the configuration space of the upstream bridge for
> > > 2ms. This is done without locking the configuration space of the
> > > upstream bridge port, allowing user space to access it concurrently.
> >
> > This means your patch is a bugfix.
> >
> > > Linux 6.11 will start warning about such unlocked resets [1][2]:
> > >
> > > pcieport 0000:00:01.0: unlocked secondary bus reset via: pci_reset_bus_function+0x51c/0x6a0
> > >
> > > Avoid the warning by locking the configuration space of the upstream
> > > bridge prior to the reset and unlocking it afterwards.
> >
> > You are not avoiding the warning but protecting concurrent access,
> > please add a Fixes tag.
>
> The patch that added the missing lock in PCI core was posted without a
> Fixes tag and merged as part of the 6.10 PR. See commit 7e89efc6e9e4
> ("PCI: Lock upstream bridge for pci_reset_function()").
>
> I don't see a good reason for root to poke in the configuration space of
> the upstream bridge during SBR, but AFAICT the worst that can happen is
> that reset will fail and while it is a bug, it is not a regression.
>
> Bjorn, do you see a reason to post this as a fix?

Sorry, I was on vacation and missed this when I returned.

mlxsw is one of the few users of __pci_reset_function_locked().
Others are liquidio (octeon), VFIO, and Xen.

You need __pci_reset_function_locked() if you're already holding the
device mutex, i.e., device_lock(&pdev->dev). I looked at the
mlxsw_pci_reset_at_pci_disable() path, and didn't see where it holds
that device lock, but I probably missed it.

The usual pci_reset_function() path, which would be preferable if you
can use it, does basically this:

  pci_dev_lock(bridge)
    device_lock(&bridge->dev)
    pci_cfg_access_lock(bridge)
  pci_dev_lock(pdev)
    device_lock(&pdev->dev)
    pci_cfg_access_lock(pdev)
  pci_dev_save_and_disable(pdev)
  __pci_reset_function_locked(pdev)

This patch adds pci_cfg_access_lock(bridge), but doesn't acquire the
device_lock for the bridge.
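
To make the difference concrete, here is a rough userspace model of the two
call orders. It is only a sketch: struct model_dev, call_log and every
model_*() function are hypothetical stand-ins that record the names of the
kernel helpers mentioned above, they do not call into the PCI core, and the
unlock side is left out. The second function models the patch only as
described in this message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct pci_dev; only a label is needed here. */
struct model_dev {
	const char *name;
};

/* Records the order of calls as "fn(dev);fn(dev);..." */
static char call_log[512];

static void log_call(const char *fn, const struct model_dev *dev)
{
	size_t used = strlen(call_log);

	assert(used < sizeof(call_log));
	snprintf(call_log + used, sizeof(call_log) - used, "%s(%s);",
		 fn, dev->name);
}

/* Models pci_dev_lock(): device lock first, then config access lock. */
static void model_pci_dev_lock(struct model_dev *dev)
{
	log_call("device_lock", dev);
	log_call("pci_cfg_access_lock", dev);
}

/* Order of the usual pci_reset_function() path as listed above. */
static void model_usual_reset_path(struct model_dev *bridge,
				   struct model_dev *pdev)
{
	model_pci_dev_lock(bridge);
	model_pci_dev_lock(pdev);
	log_call("pci_dev_save_and_disable", pdev);
	log_call("__pci_reset_function_locked", pdev);
}

/*
 * Order as this message describes the patch: config access to the bridge
 * is locked, but the bridge's device lock is never taken.
 */
static void model_patched_driver_path(struct model_dev *bridge,
				      struct model_dev *pdev)
{
	log_call("pci_cfg_access_lock", bridge);
	log_call("__pci_reset_function_locked", pdev);
}
```

Running both functions against the same pair of model devices and comparing
call_log shows the gap: device_lock(bridge) appears in the first sequence and
is absent from the second.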

It looks like you always reset the device at mlxsw_pci_probe()-time,
which is quite unusual in the first place, but I suppose there's some
good reason for it.

If you can use pci_reset_function() directly (or avoid the reset
altogether), it would be far preferable and would avoid potential
issues like the warning here.

Bjorn

> > > [1] https://lore.kernel.org/all/171711746953.1628941.4692125082286867825.stgit@dwillia2-xfh.jf.intel.com/
> > > [2] https://lore.kernel.org/all/20240531213150.GA610983@bhelgaas/
> > >
> > > Cc: linux-pci@vger.kernel.org
> > > Signed-off-by: Ido Schimmel <idosch@nvidia.com>
> > > Signed-off-by: Petr Machata <petrm@nvidia.com>