Linux PCI subsystem development
 help / color / mirror / Atom feed
* [PATCHv2 1/2] pci: provide bus reset attribute
@ 2024-10-25 22:27 Keith Busch
  2024-10-25 22:27 ` [PATCHv2 2/2] pci: warn if a running device is unaware of reset Keith Busch
                   ` (5 more replies)
  0 siblings, 6 replies; 19+ messages in thread
From: Keith Busch @ 2024-10-25 22:27 UTC (permalink / raw)
  To: linux-pci, bhelgaas, alex.williamson
  Cc: ameynarkhede03, raphael.norwitz, Keith Busch

From: Keith Busch <kbusch@kernel.org>

Resetting a bus from an end device only works if it's the only function
on or below that bus.

Provide an attribute on the pci_dev bridge device that can perform the
secondary bus reset. This makes it possible for a user to safely reset
multiple devices in a single command using the secondary bus reset
action.

Signed-off-by: Keith Busch <kbusch@kernel.org>
---
v1->v2:

  Moved the attribute from the pci_bus to the bridge's pci_dev

  And renamed it to "reset_subordinate" to distinguish from other
  existing device "reset" attributes.

  Added documentation.

  Follow up patch to warn if the action was potentially harmful.

 Documentation/ABI/testing/sysfs-bus-pci | 11 +++++++++++
 drivers/pci/pci-sysfs.c                 | 23 +++++++++++++++++++++++
 drivers/pci/pci.c                       |  2 +-
 drivers/pci/pci.h                       |  1 +
 4 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index 7f63c7e977735..5da6a14dc326b 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -163,6 +163,17 @@ Description:
 		will be present in sysfs.  Writing 1 to this file
 		will perform reset.
 
+What:		/sys/bus/pci/devices/.../reset_subordinate
+Date:		October 2024
+Contact:	linux-pci@vger.kernel.org
+Description:
+		This is visible only for bridge devices. If you want to reset
+		all devices attached through the subordinate bus of a specific
+		bridge device, writing 1 to this will try to do it.  This will
+		affect all devices attached to the system through this bridge
+		similiar to writing 1 to their individual "reset" file, so use
+		with caution.
+
 What:		/sys/bus/pci/devices/.../vpd
 Date:		February 2008
 Contact:	Ben Hutchings <bwh@kernel.org>
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 5d0f4db1cab78..480a99e50612b 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -521,6 +521,28 @@ static ssize_t bus_rescan_store(struct device *dev,
 static struct device_attribute dev_attr_bus_rescan = __ATTR(rescan, 0200, NULL,
 							    bus_rescan_store);
 
+static ssize_t reset_subordinate_store(struct device *dev,
+				struct device_attribute *attr,
+				const char *buf, size_t count)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	struct pci_bus *bus = pdev->subordinate;
+	unsigned long val;
+
+	if (kstrtoul(buf, 0, &val) < 0)
+		return -EINVAL;
+
+	if (val) {
+		int ret = __pci_reset_bus(bus);
+
+		if (ret)
+			return ret;
+	}
+
+	return count;
+}
+static DEVICE_ATTR_WO(reset_subordinate);
+
 #if defined(CONFIG_PM) && defined(CONFIG_ACPI)
 static ssize_t d3cold_allowed_store(struct device *dev,
 				    struct device_attribute *attr,
@@ -625,6 +647,7 @@ static struct attribute *pci_dev_attrs[] = {
 static struct attribute *pci_bridge_attrs[] = {
 	&dev_attr_subordinate_bus_number.attr,
 	&dev_attr_secondary_bus_number.attr,
+	&dev_attr_reset_subordinate.attr,
 	NULL,
 };
 
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 7d85c04fbba2a..338dfcd983f1e 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5880,7 +5880,7 @@ EXPORT_SYMBOL_GPL(pci_probe_reset_bus);
  *
  * Same as above except return -EAGAIN if the bus cannot be locked
  */
-static int __pci_reset_bus(struct pci_bus *bus)
+int __pci_reset_bus(struct pci_bus *bus)
 {
 	int rc;
 
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 14d00ce45bfa9..1cdc2c9547a7e 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -104,6 +104,7 @@ bool pci_reset_supported(struct pci_dev *dev);
 void pci_init_reset_methods(struct pci_dev *dev);
 int pci_bridge_secondary_bus_reset(struct pci_dev *dev);
 int pci_bus_error_reset(struct pci_dev *dev);
+int __pci_reset_bus(struct pci_bus *bus);
 
 struct pci_cap_saved_data {
 	u16		cap_nr;
-- 
2.43.5


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCHv2 2/2] pci: warn if a running device is unaware of reset
  2024-10-25 22:27 [PATCHv2 1/2] pci: provide bus reset attribute Keith Busch
@ 2024-10-25 22:27 ` Keith Busch
  2024-10-28 19:35   ` Alex Williamson
                     ` (2 more replies)
  2024-10-28 19:35 ` [PATCHv2 1/2] pci: provide bus reset attribute Alex Williamson
                   ` (4 subsequent siblings)
  5 siblings, 3 replies; 19+ messages in thread
From: Keith Busch @ 2024-10-25 22:27 UTC (permalink / raw)
  To: linux-pci, bhelgaas, alex.williamson
  Cc: ameynarkhede03, raphael.norwitz, Keith Busch

From: Keith Busch <kbusch@kernel.org>

If a reset is issued to a running device with a driver that didn't
register the notification callbacks, the driver may be unaware of this
event and have an inconsistent view of the device's state. Log a warning
of this event because there's nothing else indicating the event occured,
which could be confusing when debugging such situations.

Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 drivers/pci/pci.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 338dfcd983f1e..bbf12d4998269 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5158,6 +5158,8 @@ static void pci_dev_save_and_disable(struct pci_dev *dev)
 	 */
 	if (err_handler && err_handler->reset_prepare)
 		err_handler->reset_prepare(dev);
+	else if (dev->driver)
+		pci_warn(dev, "resetting");
 
 	/*
 	 * Wake-up device prior to save.  PM registers default to D0 after
@@ -5191,6 +5193,8 @@ static void pci_dev_restore(struct pci_dev *dev)
 	 */
 	if (err_handler && err_handler->reset_done)
 		err_handler->reset_done(dev);
+	else if (dev->driver)
+		pci_warn(dev, "reset done");
 }
 
 /* dev->reset_methods[] is a 0-terminated list of indices into this array */
-- 
2.43.5


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCHv2 2/2] pci: warn if a running device is unaware of reset
  2024-10-25 22:27 ` [PATCHv2 2/2] pci: warn if a running device is unaware of reset Keith Busch
@ 2024-10-28 19:35   ` Alex Williamson
  2024-10-29 11:27   ` Niklas Schnelle
  2024-10-29 15:06   ` ameynarkhede03
  2 siblings, 0 replies; 19+ messages in thread
From: Alex Williamson @ 2024-10-28 19:35 UTC (permalink / raw)
  To: Keith Busch
  Cc: linux-pci, bhelgaas, ameynarkhede03, raphael.norwitz, Keith Busch

On Fri, 25 Oct 2024 15:27:55 -0700
Keith Busch <kbusch@meta.com> wrote:

> From: Keith Busch <kbusch@kernel.org>
> 
> If a reset is issued to a running device with a driver that didn't
> register the notification callbacks, the driver may be unaware of this
> event and have an inconsistent view of the device's state. Log a warning
> of this event because there's nothing else indicating the event occured,
> which could be confusing when debugging such situations.
> 
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
>  drivers/pci/pci.c | 4 ++++
>  1 file changed, 4 insertions(+)

Reviewed-by: Alex Williamson <alex.williamson@redhat.com>


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv2 1/2] pci: provide bus reset attribute
  2024-10-25 22:27 [PATCHv2 1/2] pci: provide bus reset attribute Keith Busch
  2024-10-25 22:27 ` [PATCHv2 2/2] pci: warn if a running device is unaware of reset Keith Busch
@ 2024-10-28 19:35 ` Alex Williamson
  2024-10-29 15:05 ` ameynarkhede03
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: Alex Williamson @ 2024-10-28 19:35 UTC (permalink / raw)
  To: Keith Busch
  Cc: linux-pci, bhelgaas, ameynarkhede03, raphael.norwitz, Keith Busch

On Fri, 25 Oct 2024 15:27:54 -0700
Keith Busch <kbusch@meta.com> wrote:

> From: Keith Busch <kbusch@kernel.org>
> 
> Resetting a bus from an end device only works if it's the only function
> on or below that bus.
> 
> Provide an attribute on the pci_dev bridge device that can perform the
> secondary bus reset. This makes it possible for a user to safely reset
> multiple devices in a single command using the secondary bus reset
> action.
> 
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
> v1->v2:
> 
>   Moved the attribute from the pci_bus to the bridge's pci_dev
> 
>   And renamed it to "reset_subordinate" to distinguish from other
>   existing device "reset" attributes.
> 
>   Added documentation.
> 
>   Follow up patch to warn if the action was potentially harmful.
> 
>  Documentation/ABI/testing/sysfs-bus-pci | 11 +++++++++++
>  drivers/pci/pci-sysfs.c                 | 23 +++++++++++++++++++++++
>  drivers/pci/pci.c                       |  2 +-
>  drivers/pci/pci.h                       |  1 +
>  4 files changed, 36 insertions(+), 1 deletion(-)

Reviewed-by: Alex Williamson <alex.williamson@redhat.com>


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv2 2/2] pci: warn if a running device is unaware of reset
  2024-10-25 22:27 ` [PATCHv2 2/2] pci: warn if a running device is unaware of reset Keith Busch
  2024-10-28 19:35   ` Alex Williamson
@ 2024-10-29 11:27   ` Niklas Schnelle
  2024-11-01 19:21     ` Keith Busch
  2024-10-29 15:06   ` ameynarkhede03
  2 siblings, 1 reply; 19+ messages in thread
From: Niklas Schnelle @ 2024-10-29 11:27 UTC (permalink / raw)
  To: Keith Busch, linux-pci, bhelgaas, alex.williamson
  Cc: ameynarkhede03, raphael.norwitz, Keith Busch

On Fri, 2024-10-25 at 15:27 -0700, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
> 
> If a reset is issued to a running device with a driver that didn't
> register the notification callbacks, the driver may be unaware of this
> event and have an inconsistent view of the device's state. Log a warning
> of this event because there's nothing else indicating the event occured,
> which could be confusing when debugging such situations.
> 
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
>  drivers/pci/pci.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 338dfcd983f1e..bbf12d4998269 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -5158,6 +5158,8 @@ static void pci_dev_save_and_disable(struct pci_dev *dev)
>  	 */
>  	if (err_handler && err_handler->reset_prepare)
>  		err_handler->reset_prepare(dev);
> +	else if (dev->driver)
> +		pci_warn(dev, "resetting");
>  
>  	/*
>  	 * Wake-up device prior to save.  PM registers default to D0 after
> @@ -5191,6 +5193,8 @@ static void pci_dev_restore(struct pci_dev *dev)
>  	 */
>  	if (err_handler && err_handler->reset_done)
>  		err_handler->reset_done(dev);
> +	else if (dev->driver)
> +		pci_warn(dev, "reset done");
>  }
>  
>  /* dev->reset_methods[] is a 0-terminated list of indices into this array */

For what it's worth on s390 I think the previous proposal of having the
attribute on the pci_bus would have been better in principle. On s390
the PCI bus probing is done by firmware and Linux doesn't see a pci_dev
for bridges but we do create struct pci_bus for example for a PF and
its child VFs.

Then again we can't really do a reset on this level, other than
manually going through the PCI functions and resetting them one by one.
In fact we may see PCI functions on their own bus while another Linux
instance (LPAR) sees other functions from that bus. So yeah, I guess
it's fair not to have this attribute but still wanted to offer these
thoughts.

Thanks,
Niklas

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv2 1/2] pci: provide bus reset attribute
  2024-10-25 22:27 [PATCHv2 1/2] pci: provide bus reset attribute Keith Busch
  2024-10-25 22:27 ` [PATCHv2 2/2] pci: warn if a running device is unaware of reset Keith Busch
  2024-10-28 19:35 ` [PATCHv2 1/2] pci: provide bus reset attribute Alex Williamson
@ 2024-10-29 15:05 ` ameynarkhede03
  2024-11-04 21:28 ` Keith Busch
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: ameynarkhede03 @ 2024-10-29 15:05 UTC (permalink / raw)
  To: Keith Busch
  Cc: linux-pci, bhelgaas, alex.williamson, raphael.norwitz,
	Keith Busch

On 24/10/25 03:27pm, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> Resetting a bus from an end device only works if it's the only function
> on or below that bus.
>
> Provide an attribute on the pci_dev bridge device that can perform the
> secondary bus reset. This makes it possible for a user to safely reset
> multiple devices in a single command using the secondary bus reset
> action.
>
> Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Amey Narkhede <ameynarkhede03@gmail.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv2 2/2] pci: warn if a running device is unaware of reset
  2024-10-25 22:27 ` [PATCHv2 2/2] pci: warn if a running device is unaware of reset Keith Busch
  2024-10-28 19:35   ` Alex Williamson
  2024-10-29 11:27   ` Niklas Schnelle
@ 2024-10-29 15:06   ` ameynarkhede03
  2 siblings, 0 replies; 19+ messages in thread
From: ameynarkhede03 @ 2024-10-29 15:06 UTC (permalink / raw)
  To: Keith Busch
  Cc: linux-pci, bhelgaas, alex.williamson, raphael.norwitz,
	Keith Busch

On 24/10/25 03:27pm, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> If a reset is issued to a running device with a driver that didn't
> register the notification callbacks, the driver may be unaware of this
> event and have an inconsistent view of the device's state. Log a warning
> of this event because there's nothing else indicating the event occured,
> which could be confusing when debugging such situations.
>
> Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Amey Narkhede <ameynarkhede03@gmail.com>

> ---
>  drivers/pci/pci.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 338dfcd983f1e..bbf12d4998269 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -5158,6 +5158,8 @@ static void pci_dev_save_and_disable(struct pci_dev *dev)
>  	 */
>  	if (err_handler && err_handler->reset_prepare)
>  		err_handler->reset_prepare(dev);
> +	else if (dev->driver)
> +		pci_warn(dev, "resetting");
>
>  	/*
>  	 * Wake-up device prior to save.  PM registers default to D0 after
> @@ -5191,6 +5193,8 @@ static void pci_dev_restore(struct pci_dev *dev)
>  	 */
>  	if (err_handler && err_handler->reset_done)
>  		err_handler->reset_done(dev);
> +	else if (dev->driver)
> +		pci_warn(dev, "reset done");
>  }
>
>  /* dev->reset_methods[] is a 0-terminated list of indices into this array */
> --
> 2.43.5
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv2 2/2] pci: warn if a running device is unaware of reset
  2024-10-29 11:27   ` Niklas Schnelle
@ 2024-11-01 19:21     ` Keith Busch
  2024-11-04  9:44       ` Niklas Schnelle
  0 siblings, 1 reply; 19+ messages in thread
From: Keith Busch @ 2024-11-01 19:21 UTC (permalink / raw)
  To: Niklas Schnelle
  Cc: Keith Busch, linux-pci, bhelgaas, alex.williamson, ameynarkhede03,
	raphael.norwitz

On Tue, Oct 29, 2024 at 12:27:41PM +0100, Niklas Schnelle wrote:
> For what it's worth on s390 I think the previous proposal of having the
> attribute on the pci_bus would have been better in principle. On s390
> the PCI bus probing is done by firmware and Linux doesn't see a pci_dev
> for bridges but we do create struct pci_bus for example for a PF and
> its child VFs.
> 
> Then again we can't really do a reset on this level, other than
> manually going through the PCI functions and resetting them one by one.
> In fact we may see PCI functions on their own bus while another Linux
> instance (LPAR) sees other functions from that bus. So yeah, I guess
> it's fair not to have this attribute but still wanted to offer these
> thoughts.

This attribute uses the pci_dev bridge control register. If you don't
have the bridge device, I don't think this would be useful, so I guess
you'd have to fallback to resetting individual functions.

It seems people can navigate /sys/bus/pci/devices/ easier than finding a
pci_bus under /sys/devices/, though, so that's a plus for pci_dev.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv2 2/2] pci: warn if a running device is unaware of reset
  2024-11-01 19:21     ` Keith Busch
@ 2024-11-04  9:44       ` Niklas Schnelle
  2024-11-04 17:01         ` Keith Busch
  0 siblings, 1 reply; 19+ messages in thread
From: Niklas Schnelle @ 2024-11-04  9:44 UTC (permalink / raw)
  To: Keith Busch
  Cc: Keith Busch, linux-pci, bhelgaas, alex.williamson, ameynarkhede03,
	raphael.norwitz

On Fri, 2024-11-01 at 13:21 -0600, Keith Busch wrote:
> On Tue, Oct 29, 2024 at 12:27:41PM +0100, Niklas Schnelle wrote:
> > For what it's worth on s390 I think the previous proposal of having the
> > attribute on the pci_bus would have been better in principle. On s390
> > the PCI bus probing is done by firmware and Linux doesn't see a pci_dev
> > for bridges but we do create struct pci_bus for example for a PF and
> > its child VFs.
> > 
> > Then again we can't really do a reset on this level, other than
> > manually going through the PCI functions and resetting them one by one.
> > In fact we may see PCI functions on their own bus while another Linux
> > instance (LPAR) sees other functions from that bus. So yeah, I guess
> > it's fair not to have this attribute but still wanted to offer these
> > thoughts.
> 
> This attribute uses the pci_dev bridge control register. If you don't
> have the bridge device, I don't think this would be useful, so I guess
> you'd have to fallback to resetting individual functions.
> 
> It seems people can navigate /sys/bus/pci/devices/ easier than finding a
> pci_bus under /sys/devices/, though, so that's a plus for pci_dev.
> 

Yes you're right, since the reset via __pci_reset_bus() and then
pci_bridge_secondary_bus_reset() requires the bridge device (unlike
pci_reset_bus()) it makes more sense to have it on the bridge thus also
requiring the bridge device to exist.

I still kind of wish pci_bridge_secondary_bus_reset() and
pcibios_reset_secondary_bus() would take a struct pci_bus and then we
could do the fallback in the latter platform specific code and having
the attribute on the bus would make sense, but I'm not sure it's all
that useful.

One more question though, what would happen with this reset for a bus
with an SR-IOV device with more than 256 VFs i.e. where
pci_iov_virtfn_bus() returns anything other than 0. I'm guessing since
VFs are physically still controlled by the bridge all VFs would be
reset but at the same time virtfn_add_bus() sets the bridge device for
the added bus as NULL so I think it might look odd in sysfs, sadly I
don't have such a device to test with. Still, this might actually be an
argument for having the attribute on the bridge.

Thanks,
Niklas

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv2 2/2] pci: warn if a running device is unaware of reset
  2024-11-04  9:44       ` Niklas Schnelle
@ 2024-11-04 17:01         ` Keith Busch
  2024-11-05 12:28           ` Niklas Schnelle
  0 siblings, 1 reply; 19+ messages in thread
From: Keith Busch @ 2024-11-04 17:01 UTC (permalink / raw)
  To: Niklas Schnelle
  Cc: Keith Busch, linux-pci, bhelgaas, alex.williamson, ameynarkhede03,
	raphael.norwitz

On Mon, Nov 04, 2024 at 10:44:23AM +0100, Niklas Schnelle wrote:
> 
> One more question though, what would happen with this reset for a bus
> with an SR-IOV device with more than 256 VFs i.e. where
> pci_iov_virtfn_bus() returns anything other than 0. I'm guessing since
> VFs are physically still controlled by the bridge all VFs would be
> reset but at the same time virtfn_add_bus() sets the bridge device for
> the added bus as NULL so I think it might look odd in sysfs, sadly I
> don't have such a device to test with. Still, this might actually be an
> argument for having the attribute on the bridge.

I assume everything is reset at the PCI level.

Are you asking what the kernel does? I don't think it does anything
special with SR-IOV functions. Those pci_dev's aren't attached to the
bridge pci_dev; you have to go through the pci_bus' children instead.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv2 1/2] pci: provide bus reset attribute
  2024-10-25 22:27 [PATCHv2 1/2] pci: provide bus reset attribute Keith Busch
                   ` (2 preceding siblings ...)
  2024-10-29 15:05 ` ameynarkhede03
@ 2024-11-04 21:28 ` Keith Busch
  2024-11-04 21:53   ` Krzysztof Wilczyński
  2024-11-12 19:12 ` Keith Busch
  2024-11-12 23:16 ` Bjorn Helgaas
  5 siblings, 1 reply; 19+ messages in thread
From: Keith Busch @ 2024-11-04 21:28 UTC (permalink / raw)
  To: Keith Busch
  Cc: linux-pci, bhelgaas, alex.williamson, ameynarkhede03,
	raphael.norwitz

On Fri, Oct 25, 2024 at 03:27:54PM -0700, Keith Busch wrote:
> Resetting a bus from an end device only works if it's the only function
> on or below that bus.
> 
> Provide an attribute on the pci_dev bridge device that can perform the
> secondary bus reset. This makes it possible for a user to safely reset
> multiple devices in a single command using the secondary bus reset
> action.

Hi Bjorn, just want to check in. Do you have concerns remaining for this
feature? 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv2 1/2] pci: provide bus reset attribute
  2024-11-04 21:28 ` Keith Busch
@ 2024-11-04 21:53   ` Krzysztof Wilczyński
  2024-11-04 21:58     ` Keith Busch
  0 siblings, 1 reply; 19+ messages in thread
From: Krzysztof Wilczyński @ 2024-11-04 21:53 UTC (permalink / raw)
  To: Keith Busch
  Cc: Keith Busch, linux-pci, bhelgaas, alex.williamson, ameynarkhede03,
	raphael.norwitz

Hello,

> > Resetting a bus from an end device only works if it's the only function
> > on or below that bus.
> > 
> > Provide an attribute on the pci_dev bridge device that can perform the
> > secondary bus reset. This makes it possible for a user to safely reset
> > multiple devices in a single command using the secondary bus reset
> > action.
> 
> Hi Bjorn, just want to check in. Do you have concerns remaining for this
> feature? 

Would you have anything against if we put this new bus reset sysfs object
access behind the following test?

  if (!capable(CAP_SYS_ADMIN))
  	return -EPERM;

This is irregardless of what the permissions on the sysfs objects from the
DAC point of view are set to.

Checking CAP_SYS_ADMIN capability, to improve our default security stance,
on a number of important sysfs objects (e.g., reset, remove, etc.) we have
was something I discussed in the past with Bjorn, but never got around to
sending a patch to add this check.

Thoughts?

	Krzysztof

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv2 1/2] pci: provide bus reset attribute
  2024-11-04 21:53   ` Krzysztof Wilczyński
@ 2024-11-04 21:58     ` Keith Busch
  2024-11-04 22:42       ` Krzysztof Wilczy´nski
  0 siblings, 1 reply; 19+ messages in thread
From: Keith Busch @ 2024-11-04 21:58 UTC (permalink / raw)
  To: Krzysztof Wilczy´nski
  Cc: Keith Busch, linux-pci, bhelgaas, alex.williamson, ameynarkhede03,
	raphael.norwitz

On Tue, Nov 05, 2024 at 06:53:32AM +0900, Krzysztof Wilczy´nski wrote:
> Would you have anything against if we put this new bus reset sysfs object
> access behind the following test?
> 
>   if (!capable(CAP_SYS_ADMIN))
>   	return -EPERM;
> 
> This is irregardless of what the permissions on the sysfs objects from the
> DAC point of view are set to.
> 
> Checking CAP_SYS_ADMIN capability, to improve our default security stance,
> on a number of important sysfs objects (e.g., reset, remove, etc.) we have
> was something I discussed in the past with Bjorn, but never got around to
> sending a patch to add this check.
> 
> Thoughts?

Sure, I'm okay that. We are using DEVICE_ATTR_WO file attribute which
says should make it writable only by an admin, but totally fine with
adding this explicit check here too.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv2 1/2] pci: provide bus reset attribute
  2024-11-04 21:58     ` Keith Busch
@ 2024-11-04 22:42       ` Krzysztof Wilczy´nski
  0 siblings, 0 replies; 19+ messages in thread
From: Krzysztof Wilczy´nski @ 2024-11-04 22:42 UTC (permalink / raw)
  To: Keith Busch
  Cc: Keith Busch, linux-pci, bhelgaas, alex.williamson, ameynarkhede03,
	raphael.norwitz

On 24-11-04 14:58:38, Keith Busch wrote:
> On Tue, Nov 05, 2024 at 06:53:32AM +0900, Krzysztof Wilczy´nski wrote:
> > Would you have anything against if we put this new bus reset sysfs object
> > access behind the following test?
> > 
> >   if (!capable(CAP_SYS_ADMIN))
> >   	return -EPERM;
> > 
> > This is irregardless of what the permissions on the sysfs objects from the
> > DAC point of view are set to.
> > 
> > Checking CAP_SYS_ADMIN capability, to improve our default security stance,
> > on a number of important sysfs objects (e.g., reset, remove, etc.) we have
> > was something I discussed in the past with Bjorn, but never got around to
> > sending a patch to add this check.
> > 
> > Thoughts?
> 
> Sure, I'm okay that. We are using DEVICE_ATTR_WO file attribute which
> says should make it writable only by an admin, but totally fine with
> adding this explicit check here too.

Thank you!

Depending on whether Bjorn will have any feedback to might prompt a new
version of the patches to be sent, if there won't be one, then I will add
this extra check directly on the branch.

	Krzysztof


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv2 2/2] pci: warn if a running device is unaware of reset
  2024-11-04 17:01         ` Keith Busch
@ 2024-11-05 12:28           ` Niklas Schnelle
  2024-11-05 15:46             ` Keith Busch
  0 siblings, 1 reply; 19+ messages in thread
From: Niklas Schnelle @ 2024-11-05 12:28 UTC (permalink / raw)
  To: Keith Busch
  Cc: Keith Busch, linux-pci, bhelgaas, alex.williamson, ameynarkhede03,
	raphael.norwitz

On Mon, 2024-11-04 at 10:01 -0700, Keith Busch wrote:
> On Mon, Nov 04, 2024 at 10:44:23AM +0100, Niklas Schnelle wrote:
> > 
> > One more question though, what would happen with this reset for a bus
> > with an SR-IOV device with more than 256 VFs i.e. where
> > pci_iov_virtfn_bus() returns anything other than 0. I'm guessing since
> > VFs are physically still controlled by the bridge all VFs would be
> > reset but at the same time virtfn_add_bus() sets the bridge device for
> > the added bus as NULL so I think it might look odd in sysfs, sadly I
> > don't have such a device to test with. Still, this might actually be an
> > argument for having the attribute on the bridge.
> 
> I assume everything is reset at the PCI level.
> 
> Are you asking what the kernel does? I don't think it does anything
> special with SR-IOV functions. Those pci_dev's aren't attached to the
> bridge pci_dev; you have to go through the pci_bus' children instead.
> 

I just want to make sure we're okay with the behavior with such VFs as
it seems like the one case where a reset via the bridge affects PCI
functions which aren't attached to the bridge pci_dev otherwise. And
for example as I understand it these would not be covered by the
pci_bus_save_and_disable_locked().

Thanks,
Niklas

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv2 2/2] pci: warn if a running device is unaware of reset
  2024-11-05 12:28           ` Niklas Schnelle
@ 2024-11-05 15:46             ` Keith Busch
  0 siblings, 0 replies; 19+ messages in thread
From: Keith Busch @ 2024-11-05 15:46 UTC (permalink / raw)
  To: Niklas Schnelle
  Cc: Keith Busch, linux-pci, bhelgaas, alex.williamson, ameynarkhede03,
	raphael.norwitz

On Tue, Nov 05, 2024 at 01:28:39PM +0100, Niklas Schnelle wrote:
> On Mon, 2024-11-04 at 10:01 -0700, Keith Busch wrote:
> > On Mon, Nov 04, 2024 at 10:44:23AM +0100, Niklas Schnelle wrote:
> > > 
> > > One more question though, what would happen with this reset for a bus
> > > with an SR-IOV device with more than 256 VFs i.e. where
> > > pci_iov_virtfn_bus() returns anything other than 0. I'm guessing since
> > > VFs are physically still controlled by the bridge all VFs would be
> > > reset but at the same time virtfn_add_bus() sets the bridge device for
> > > the added bus as NULL so I think it might look odd in sysfs, sadly I
> > > don't have such a device to test with. Still, this might actually be an
> > > argument for having the attribute on the bridge.
> > 
> > I assume everything is reset at the PCI level.
> > 
> > Are you asking what the kernel does? I don't think it does anything
> > special with SR-IOV functions. Those pci_dev's aren't attached to the
> > bridge pci_dev; you have to go through the pci_bus' children instead.
> > 
> 
> I just want to make sure we're okay with the behavior with such VFs as
> it seems like the one case where a reset via the bridge affects PCI
> functions which aren't attached to the bridge pci_dev otherwise. And
> for example as I understand it these would not be covered by the
> pci_bus_save_and_disable_locked().

Well, it's no different than doing FLR to the PF while VFs are
configured, which is existing behavior, so I think it's okay. The PF's
SRIOV state still gets restored to the end of the reset. The VFs
themselves don't get saved and locked, though. But again, this new patch
is following how it already works for other reset methods.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv2 1/2] pci: provide bus reset attribute
  2024-10-25 22:27 [PATCHv2 1/2] pci: provide bus reset attribute Keith Busch
                   ` (3 preceding siblings ...)
  2024-11-04 21:28 ` Keith Busch
@ 2024-11-12 19:12 ` Keith Busch
  2024-11-12 23:16 ` Bjorn Helgaas
  5 siblings, 0 replies; 19+ messages in thread
From: Keith Busch @ 2024-11-12 19:12 UTC (permalink / raw)
  To: Keith Busch
  Cc: linux-pci, bhelgaas, alex.williamson, ameynarkhede03,
	raphael.norwitz

On Fri, Oct 25, 2024 at 03:27:54PM -0700, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
> 
> Resetting a bus from an end device only works if it's the only function
> on or below that bus.
> 
> Provide an attribute on the pci_dev bridge device that can perform the
> secondary bus reset. This makes it possible for a user to safely reset
> multiple devices in a single command using the secondary bus reset
> action.

Hi Bjorn, are we okay with this one? I am trying to get some tooling and
processes in place that rely on this reset capability, but I don't want
to get ahead of myself if the kernel side needs to go a different
direction.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv2 1/2] pci: provide bus reset attribute
  2024-10-25 22:27 [PATCHv2 1/2] pci: provide bus reset attribute Keith Busch
                   ` (4 preceding siblings ...)
  2024-11-12 19:12 ` Keith Busch
@ 2024-11-12 23:16 ` Bjorn Helgaas
  2024-11-13 17:38   ` Keith Busch
  5 siblings, 1 reply; 19+ messages in thread
From: Bjorn Helgaas @ 2024-11-12 23:16 UTC (permalink / raw)
  To: Keith Busch
  Cc: linux-pci, bhelgaas, alex.williamson, ameynarkhede03,
	raphael.norwitz, Keith Busch

On Fri, Oct 25, 2024 at 03:27:54PM -0700, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
> 
> Resetting a bus from an end device only works if it's the only function
> on or below that bus.

Can we clarify this wording?  "Resetting a bus from an end device"?

I guess this has something to do with pci_parent_bus_reset(dev), which
declines to call pci_bridge_secondary_bus_reset() if there are any
other devices on the same bus as "dev"? 

pci_parent_bus_reset() is only called from pci_reset_bus_function(),
which is used by the "bus" and "cxl_bus" reset methods.

So I guess the point is something like:

  The "bus" and "cxl_bus" reset methods reset a device by asserting
  Secondary Bus Reset on the bridge leading to the device.
  pci_parent_bus_reset() only allows that if the device is the only
  device below the bridge.

  Add a sysfs attribute on bridges that can assert Secondary Bus Reset
  regardless of how many devices are below the bridge.  This makes it
  possible for users to reset multiple devices in a single command.

I omitted "safely" because this doesn't do any checking to ensure
safety, so I don't know in what sense it is safe.

It seems like this is partly just a convenience, to reset several
devices at once.

But I think it is *also* a new way to reset devices that we might not
be able to reset otherwise, e.g., if there's more than one device on
the bus, and they don't support ACPI, FLR, or PM reset, there
previously was no reset method that worked at all.  Right?

> Provide an attribute on the pci_dev bridge device that can perform the
> secondary bus reset. This makes it possible for a user to safely reset
> multiple devices in a single command using the secondary bus reset
> action.
> 
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
> v1->v2:
> 
>   Moved the attribute from the pci_bus to the bridge's pci_dev
> 
>   And renamed it to "reset_subordinate" to distinguish from other
>   existing device "reset" attributes.
> 
>   Added documentation.
> 
>   Follow up patch to warn if the action was potentially harmful.
> 
>  Documentation/ABI/testing/sysfs-bus-pci | 11 +++++++++++
>  drivers/pci/pci-sysfs.c                 | 23 +++++++++++++++++++++++
>  drivers/pci/pci.c                       |  2 +-
>  drivers/pci/pci.h                       |  1 +
>  4 files changed, 36 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
> index 7f63c7e977735..5da6a14dc326b 100644
> --- a/Documentation/ABI/testing/sysfs-bus-pci
> +++ b/Documentation/ABI/testing/sysfs-bus-pci
> @@ -163,6 +163,17 @@ Description:
>  		will be present in sysfs.  Writing 1 to this file
>  		will perform reset.
>  
> +What:		/sys/bus/pci/devices/.../reset_subordinate
> +Date:		October 2024
> +Contact:	linux-pci@vger.kernel.org
> +Description:
> +		This is visible only for bridge devices. If you want to reset
> +		all devices attached through the subordinate bus of a specific
> +		bridge device, writing 1 to this will try to do it.  This will
> +		affect all devices attached to the system through this bridge
> +		similiar to writing 1 to their individual "reset" file, so use
> +		with caution.
> +
>  What:		/sys/bus/pci/devices/.../vpd
>  Date:		February 2008
>  Contact:	Ben Hutchings <bwh@kernel.org>
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 5d0f4db1cab78..480a99e50612b 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -521,6 +521,28 @@ static ssize_t bus_rescan_store(struct device *dev,
>  static struct device_attribute dev_attr_bus_rescan = __ATTR(rescan, 0200, NULL,
>  							    bus_rescan_store);
>  
> +static ssize_t reset_subordinate_store(struct device *dev,
> +				struct device_attribute *attr,
> +				const char *buf, size_t count)
> +{
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	struct pci_bus *bus = pdev->subordinate;
> +	unsigned long val;
> +
> +	if (kstrtoul(buf, 0, &val) < 0)
> +		return -EINVAL;
> +
> +	if (val) {
> +		int ret = __pci_reset_bus(bus);
> +
> +		if (ret)
> +			return ret;
> +	}
> +
> +	return count;
> +}
> +static DEVICE_ATTR_WO(reset_subordinate);
> +
>  #if defined(CONFIG_PM) && defined(CONFIG_ACPI)
>  static ssize_t d3cold_allowed_store(struct device *dev,
>  				    struct device_attribute *attr,
> @@ -625,6 +647,7 @@ static struct attribute *pci_dev_attrs[] = {
>  static struct attribute *pci_bridge_attrs[] = {
>  	&dev_attr_subordinate_bus_number.attr,
>  	&dev_attr_secondary_bus_number.attr,
> +	&dev_attr_reset_subordinate.attr,
>  	NULL,
>  };
>  
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 7d85c04fbba2a..338dfcd983f1e 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -5880,7 +5880,7 @@ EXPORT_SYMBOL_GPL(pci_probe_reset_bus);
>   *
>   * Same as above except return -EAGAIN if the bus cannot be locked
>   */
> -static int __pci_reset_bus(struct pci_bus *bus)
> +int __pci_reset_bus(struct pci_bus *bus)
>  {
>  	int rc;
>  
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 14d00ce45bfa9..1cdc2c9547a7e 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -104,6 +104,7 @@ bool pci_reset_supported(struct pci_dev *dev);
>  void pci_init_reset_methods(struct pci_dev *dev);
>  int pci_bridge_secondary_bus_reset(struct pci_dev *dev);
>  int pci_bus_error_reset(struct pci_dev *dev);
> +int __pci_reset_bus(struct pci_bus *bus);
>  
>  struct pci_cap_saved_data {
>  	u16		cap_nr;
> -- 
> 2.43.5
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCHv2 1/2] pci: provide bus reset attribute
  2024-11-12 23:16 ` Bjorn Helgaas
@ 2024-11-13 17:38   ` Keith Busch
  0 siblings, 0 replies; 19+ messages in thread
From: Keith Busch @ 2024-11-13 17:38 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Keith Busch, linux-pci, bhelgaas, alex.williamson, ameynarkhede03,
	raphael.norwitz

On Tue, Nov 12, 2024 at 05:16:23PM -0600, Bjorn Helgaas wrote:
> On Fri, Oct 25, 2024 at 03:27:54PM -0700, Keith Busch wrote:
> > From: Keith Busch <kbusch@kernel.org>
> > 
> > Resetting a bus from an end device only works if it's the only function
> > on or below that bus.
> 
> Can we clarify this wording?  "Resetting a bus from an end device"?

What I mean is, if you find a device you want to reset in the sysfs
hierarchy, and find its reset method:

  /sys/bus/pci/devices/<D:B:D.f>/reset_method

If you request "bus", it only works if it is the only function on that bus.

I agree my commit message there is a bit unclear, though. I was a bit to
deep in that code when I wrote it.
 
> I guess this has something to do with pci_parent_bus_reset(dev), which
> declines to call pci_bridge_secondary_bus_reset() if there are any
> other devices on the same bus as "dev"? 

Yep, exactly.

> pci_parent_bus_reset() is only called from pci_reset_bus_function(),
> which is used by the "bus" and "cxl_bus" reset methods.
> 
> So I guess the point is something like:
> 
>   The "bus" and "cxl_bus" reset methods reset a device by asserting
>   Secondary Bus Reset on the bridge leading to the device.
>   pci_parent_bus_reset() only allows that if the device is the only
>   device below the bridge.
> 
>   Add a sysfs attribute on bridges that can assert Secondary Bus Reset
>   regardless of how many devices are below the bridge.  This makes it
>   possible for users to reset multiple devices in a single command.
> 
> I omitted "safely" because this doesn't do any checking to ensure
> safety, so I don't know in what sense it is safe.

By "safe", what I mean is the device is locked by the kernel, config
space is saved and restored on either side of the reset, and the
attached driver (if any) is notified this action is about to happen and
when it completes so it do whatever quiescent and restoring actions it
needs for a bus reset.

I can't say this is universally "safe" since it's a bit optomistic to
assume everything affected by this action is going to work as the pci
driver expects, but the alternatives (setpci) don't coordinate anything
with the kernel, so it's "safe" relative to that :)

> It seems like this is partly just a convenience, to reset several
> devices at once.
>
> But I think it is *also* a new way to reset devices that we might not
> be able to reset otherwise, e.g., if there's more than one device on
> the bus, and they don't support ACPI, FLR, or PM reset, there
> previously was no reset method that worked at all.  Right?

Exactly! I have multi-function devices in a switch hierarchy where this
unfortunately is really the only way to do it. They don't support
resetting individual functions, so SBR is the only way we have to
reliably reset the device.

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2024-11-13 17:38 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-25 22:27 [PATCHv2 1/2] pci: provide bus reset attribute Keith Busch
2024-10-25 22:27 ` [PATCHv2 2/2] pci: warn if a running device is unaware of reset Keith Busch
2024-10-28 19:35   ` Alex Williamson
2024-10-29 11:27   ` Niklas Schnelle
2024-11-01 19:21     ` Keith Busch
2024-11-04  9:44       ` Niklas Schnelle
2024-11-04 17:01         ` Keith Busch
2024-11-05 12:28           ` Niklas Schnelle
2024-11-05 15:46             ` Keith Busch
2024-10-29 15:06   ` ameynarkhede03
2024-10-28 19:35 ` [PATCHv2 1/2] pci: provide bus reset attribute Alex Williamson
2024-10-29 15:05 ` ameynarkhede03
2024-11-04 21:28 ` Keith Busch
2024-11-04 21:53   ` Krzysztof Wilczyński
2024-11-04 21:58     ` Keith Busch
2024-11-04 22:42       ` Krzysztof Wilczy´nski
2024-11-12 19:12 ` Keith Busch
2024-11-12 23:16 ` Bjorn Helgaas
2024-11-13 17:38   ` Keith Busch

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox