* avoid null pointer rereference during FLR
@ 2017-05-23 5:42 Christoph Hellwig
2017-05-23 5:42 ` [PATCH] PCI: ensure the PCI device is locked over ->reset_notify calls Christoph Hellwig
0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2017-05-23 5:42 UTC (permalink / raw)
To: helgaas; +Cc: rakesh, linux-pci, linux-nvme
Hi all,
Rakesh reported a bug where a FLR can trivially crash his system.
The reason for that is that NVMe unbinds the driver from the PCI device
on an unrecoverable error, and that races with the reset_notify method.
This is fairly easily fixable by taking the device lock for a slightly
longer period. Note that the other PCI error handling methods actually
have the same issue, but with them not taking the lock yet and me having
no good way to reproducibly call them I'm a little reluctant to touch
them, but it would be great if we could fix those issues as well.
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH] PCI: ensure the PCI device is locked over ->reset_notify calls
2017-05-23 5:42 avoid null pointer rereference during FLR Christoph Hellwig
@ 2017-05-23 5:42 ` Christoph Hellwig
2017-05-29 9:19 ` Rakesh Pandit
2017-05-30 22:28 ` Bjorn Helgaas
0 siblings, 2 replies; 7+ messages in thread
From: Christoph Hellwig @ 2017-05-23 5:42 UTC (permalink / raw)
To: helgaas; +Cc: rakesh, linux-pci, linux-nvme
Without this ->notify_reset instance may race with ->remove calls,
which can be easily triggered in NVMe.
Reported-by: Rakesh Pandit <rakesh@tuxera.com>
Tested-by: Rakesh Pandit <rakesh@tuxera.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/pci/pci.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b01bd5bba8e6..b61ad77dc322 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4275,11 +4275,13 @@ int pci_reset_function(struct pci_dev *dev)
if (rc)
return rc;
+ pci_dev_lock(dev);
pci_dev_save_and_disable(dev);
- rc = pci_dev_reset(dev, 0);
+ rc = __pci_dev_reset(dev, 0);
pci_dev_restore(dev);
+ pci_dev_unlock(dev);
return rc;
}
--
2.11.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] PCI: ensure the PCI device is locked over ->reset_notify calls
2017-05-23 5:42 ` [PATCH] PCI: ensure the PCI device is locked over ->reset_notify calls Christoph Hellwig
@ 2017-05-29 9:19 ` Rakesh Pandit
2017-05-30 22:28 ` Bjorn Helgaas
1 sibling, 0 replies; 7+ messages in thread
From: Rakesh Pandit @ 2017-05-29 9:19 UTC (permalink / raw)
To: linux-pci; +Cc: helgaas, linux-nvme, Christoph Hellwig
Hi,
On Tue, May 23, 2017 at 07:42:02AM +0200, Christoph Hellwig wrote:
> Without this ->notify_reset instance may race with ->remove calls,
> which can be easily triggered in NVMe.
>
Any input on this sometime before next -rc release would be great ?
> Reported-by: Rakesh Pandit <rakesh@tuxera.com>
> Tested-by: Rakesh Pandit <rakesh@tuxera.com>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> drivers/pci/pci.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index b01bd5bba8e6..b61ad77dc322 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4275,11 +4275,13 @@ int pci_reset_function(struct pci_dev *dev)
> if (rc)
> return rc;
>
> + pci_dev_lock(dev);
> pci_dev_save_and_disable(dev);
>
> - rc = pci_dev_reset(dev, 0);
> + rc = __pci_dev_reset(dev, 0);
>
> pci_dev_restore(dev);
> + pci_dev_unlock(dev);
>
> return rc;
> }
> --
> 2.11.0
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] PCI: ensure the PCI device is locked over ->reset_notify calls
2017-05-23 5:42 ` [PATCH] PCI: ensure the PCI device is locked over ->reset_notify calls Christoph Hellwig
2017-05-29 9:19 ` Rakesh Pandit
@ 2017-05-30 22:28 ` Bjorn Helgaas
2017-05-31 4:58 ` Christoph Hellwig
1 sibling, 1 reply; 7+ messages in thread
From: Bjorn Helgaas @ 2017-05-30 22:28 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: rakesh, linux-pci, linux-nvme, Alex Williamson
[+cc Alex]
On Tue, May 23, 2017 at 07:42:02AM +0200, Christoph Hellwig wrote:
> Without this ->notify_reset instance may race with ->remove calls,
Do you mean the .reset_notify() method in struct pci_error_handlers?
I don't see a "notify_reset" symbol.
Can you elaborate on exactly how this race happens? I'm trying to
figure out whether this is also a problem or potential problem with
other reset paths like pci_try_reset_function(), pci_reset_bus(),
pci_try_reset_bus(), pci_reset_slot(), and pci_try_reset_slot().
What does the race look like when it happens? Oops, panic, etc?
Can this also be triggered via the sysfs "reset" file?
> which can be easily triggered in NVMe.
>
> Reported-by: Rakesh Pandit <rakesh@tuxera.com>
> Tested-by: Rakesh Pandit <rakesh@tuxera.com>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> drivers/pci/pci.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index b01bd5bba8e6..b61ad77dc322 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4275,11 +4275,13 @@ int pci_reset_function(struct pci_dev *dev)
> if (rc)
> return rc;
>
> + pci_dev_lock(dev);
> pci_dev_save_and_disable(dev);
>
> - rc = pci_dev_reset(dev, 0);
> + rc = __pci_dev_reset(dev, 0);
>
> pci_dev_restore(dev);
> + pci_dev_unlock(dev);
>
> return rc;
> }
> --
> 2.11.0
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] PCI: ensure the PCI device is locked over ->reset_notify calls
2017-05-30 22:28 ` Bjorn Helgaas
@ 2017-05-31 4:58 ` Christoph Hellwig
2017-05-31 16:51 ` Bjorn Helgaas
0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2017-05-31 4:58 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Christoph Hellwig, rakesh, linux-pci, linux-nvme, Alex Williamson
On Tue, May 30, 2017 at 05:28:44PM -0500, Bjorn Helgaas wrote:
> [+cc Alex]
>
> On Tue, May 23, 2017 at 07:42:02AM +0200, Christoph Hellwig wrote:
> > Without this ->notify_reset instance may race with ->remove calls,
>
> Do you mean the .reset_notify() method in struct pci_error_handlers?
> I don't see a "notify_reset" symbol.
Yes.
> Can you elaborate on exactly how this race happens? I'm trying to
> figure out whether this is also a problem or potential problem with
> other reset paths like pci_try_reset_function(), pci_reset_bus(),
> pci_try_reset_bus(), pci_reset_slot(), and pci_try_reset_slot().
>
> What does the race look like when it happens? Oops, panic, etc?
>
> Can this also be triggered via the sysfs "reset" file?
Here is the original bug report:
http://lists.infradead.org/pipermail/linux-nvme/2017-May/010345.html
the issue is triggered through sysfs.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] PCI: ensure the PCI device is locked over ->reset_notify calls
2017-05-31 4:58 ` Christoph Hellwig
@ 2017-05-31 16:51 ` Bjorn Helgaas
2017-06-01 10:46 ` Christoph Hellwig
0 siblings, 1 reply; 7+ messages in thread
From: Bjorn Helgaas @ 2017-05-31 16:51 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: rakesh, linux-pci, linux-nvme, Alex Williamson
On Wed, May 31, 2017 at 06:58:48AM +0200, Christoph Hellwig wrote:
> On Tue, May 30, 2017 at 05:28:44PM -0500, Bjorn Helgaas wrote:
> > [+cc Alex]
> >
> > On Tue, May 23, 2017 at 07:42:02AM +0200, Christoph Hellwig wrote:
> > > Without this ->notify_reset instance may race with ->remove calls,
> >
> > Do you mean the .reset_notify() method in struct pci_error_handlers?
> > I don't see a "notify_reset" symbol.
>
> Yes.
>
> > Can you elaborate on exactly how this race happens? I'm trying to
> > figure out whether this is also a problem or potential problem with
> > other reset paths like pci_try_reset_function(), pci_reset_bus(),
> > pci_try_reset_bus(), pci_reset_slot(), and pci_try_reset_slot().
> >
> > What does the race look like when it happens? Oops, panic, etc?
> >
> > Can this also be triggered via the sysfs "reset" file?
>
> Here is the original bug report:
>
> http://lists.infradead.org/pipermail/linux-nvme/2017-May/010345.html
>
> the issue is triggered through sysfs.
Thanks.
Do you have any thoughts on whether the paths I mentioned above are
also susceptible?
I don't want to apply a patch that fixes one issue while leaving
similar ones unfixed.
Bjorn
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] PCI: ensure the PCI device is locked over ->reset_notify calls
2017-05-31 16:51 ` Bjorn Helgaas
@ 2017-06-01 10:46 ` Christoph Hellwig
0 siblings, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2017-06-01 10:46 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Christoph Hellwig, rakesh, linux-pci, linux-nvme, Alex Williamson
On Wed, May 31, 2017 at 11:51:31AM -0500, Bjorn Helgaas wrote:
> Do you have any thoughts on whether the paths I mentioned above are
> also susceptible?
At least in theory ever method call that doesn't have the device
locked is susceptible, so yes.
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2017-06-01 10:46 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-05-23 5:42 avoid null pointer rereference during FLR Christoph Hellwig
2017-05-23 5:42 ` [PATCH] PCI: ensure the PCI device is locked over ->reset_notify calls Christoph Hellwig
2017-05-29 9:19 ` Rakesh Pandit
2017-05-30 22:28 ` Bjorn Helgaas
2017-05-31 4:58 ` Christoph Hellwig
2017-05-31 16:51 ` Bjorn Helgaas
2017-06-01 10:46 ` Christoph Hellwig
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).