* [PATCH] nvme: avoid NULL pointer dereference in error recovery path
@ 2017-04-05 19:40 Guilherme G. Piccoli
2017-04-05 19:43 ` Christoph Hellwig
0 siblings, 1 reply; 3+ messages in thread
From: Guilherme G. Piccoli @ 2017-04-05 19:40 UTC (permalink / raw)
It's possible that driver fails to recover from a PCI error and the
PCI core (or arch PCI specifics, like EEH in PowerPC) starts a process
of device removal. While this removal process is happening, if another
PCI error is triggered, we might have a NULL address for
"struct *nvme_dev", pointed by "pci_dev *driver_data" - for example this
happens if nvme_remove() already have set that pci_dev struct's field
to NULL.
In this case, the driver error handler functions will dereferece a NULL
pointer, causing a kernel oops. This patch checks for NULL pointer on
error handlers and in case "driver_data" points to NULL, it aborts the
error recovery path and return a fail error value to PCI core.
Also, the patch "standardize" the use of "pci_dev dev" as pointer to
"struct device", instead of "nvme_dev->nvme_ctrl.device".
Fixes: a0a3408ee614 ("NVMe: Add pci error handlers")
Cc: stable at vger.kernel.org # v4.5+
Signed-off-by: Guilherme G. Piccoli <gpiccoli at linux.vnet.ibm.com>
---
drivers/nvme/host/pci.c | 18 +++++++++++++++---
1 file changed, 15 insertions(+), 3 deletions(-)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 26a5fd05fe88..283a930ab6ac 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2089,6 +2089,12 @@ static pci_ers_result_t nvme_error_detected(struct pci_dev *pdev,
{
struct nvme_dev *dev = pci_get_drvdata(pdev);
+ if (!dev) {
+ dev_err(&pdev->dev,
+ "device already removed, aborting error recovery\n");
+ return PCI_ERS_RESULT_DISCONNECT;
+ }
+
/*
* A frozen channel requires a reset. When detected, this method will
* shutdown the controller to quiesce. The controller will be restarted
@@ -2098,12 +2104,12 @@ static pci_ers_result_t nvme_error_detected(struct pci_dev *pdev,
case pci_channel_io_normal:
return PCI_ERS_RESULT_CAN_RECOVER;
case pci_channel_io_frozen:
- dev_warn(dev->ctrl.device,
+ dev_warn(&pdev->dev,
"frozen state error detected, reset controller\n");
nvme_dev_disable(dev, false);
return PCI_ERS_RESULT_NEED_RESET;
case pci_channel_io_perm_failure:
- dev_warn(dev->ctrl.device,
+ dev_warn(&pdev->dev,
"failure state error detected, request disconnect\n");
return PCI_ERS_RESULT_DISCONNECT;
}
@@ -2114,7 +2120,13 @@ static pci_ers_result_t nvme_slot_reset(struct pci_dev *pdev)
{
struct nvme_dev *dev = pci_get_drvdata(pdev);
- dev_info(dev->ctrl.device, "restart after slot reset\n");
+ if (!dev) {
+ dev_err(&pdev->dev,
+ "device already removed, aborting slot reset\n");
+ return PCI_ERS_RESULT_DISCONNECT;
+ }
+
+ dev_info(&pdev->dev, "restart after slot reset\n");
pci_restore_state(pdev);
nvme_reset(dev);
return PCI_ERS_RESULT_RECOVERED;
--
2.11.0
^ permalink raw reply related [flat|nested] 3+ messages in thread
* [PATCH] nvme: avoid NULL pointer dereference in error recovery path
2017-04-05 19:40 [PATCH] nvme: avoid NULL pointer dereference in error recovery path Guilherme G. Piccoli
@ 2017-04-05 19:43 ` Christoph Hellwig
2017-04-05 20:01 ` Guilherme G. Piccoli
0 siblings, 1 reply; 3+ messages in thread
From: Christoph Hellwig @ 2017-04-05 19:43 UTC (permalink / raw)
On Wed, Apr 05, 2017@04:40:37PM -0300, Guilherme G. Piccoli wrote:
> It's possible that driver fails to recover from a PCI error and the
> PCI core (or arch PCI specifics, like EEH in PowerPC) starts a process
> of device removal. While this removal process is happening, if another
> PCI error is triggered, we might have a NULL address for
> "struct *nvme_dev", pointed by "pci_dev *driver_data" - for example this
> happens if nvme_remove() already have set that pci_dev struct's field
> to NULL.
>
> In this case, the driver error handler functions will dereferece a NULL
> pointer, causing a kernel oops. This patch checks for NULL pointer on
> error handlers and in case "driver_data" points to NULL, it aborts the
> error recovery path and return a fail error value to PCI core.
I think this needs to be fixed at a higher level, that is the PCI
core. Once you have the callbacks run in parallel a simple null check
isn't going to fix this but every single access to the structure
is a possible use after free.
^ permalink raw reply [flat|nested] 3+ messages in thread
* [PATCH] nvme: avoid NULL pointer dereference in error recovery path
2017-04-05 19:43 ` Christoph Hellwig
@ 2017-04-05 20:01 ` Guilherme G. Piccoli
0 siblings, 0 replies; 3+ messages in thread
From: Guilherme G. Piccoli @ 2017-04-05 20:01 UTC (permalink / raw)
On 04/05/2017 04:43 PM, Christoph Hellwig wrote:
> On Wed, Apr 05, 2017@04:40:37PM -0300, Guilherme G. Piccoli wrote:
>> It's possible that driver fails to recover from a PCI error and the
>> PCI core (or arch PCI specifics, like EEH in PowerPC) starts a process
>> of device removal. While this removal process is happening, if another
>> PCI error is triggered, we might have a NULL address for
>> "struct *nvme_dev", pointed by "pci_dev *driver_data" - for example this
>> happens if nvme_remove() already have set that pci_dev struct's field
>> to NULL.
>>
>> In this case, the driver error handler functions will dereferece a NULL
>> pointer, causing a kernel oops. This patch checks for NULL pointer on
>> error handlers and in case "driver_data" points to NULL, it aborts the
>> error recovery path and return a fail error value to PCI core.
>
> I think this needs to be fixed at a higher level, that is the PCI
> core. Once you have the callbacks run in parallel a simple null check
> isn't going to fix this but every single access to the structure
> is a possible use after free.
>
Christoph, your point makes sense.
It's possible (I guess not expected, but at least feasible) to have a
2nd PCI error being triggered while a 1st error is in
recovery/removal-on-failure process, after the slot_reset() step of the
1st error specifically.
Many drivers around do these NULL pointer checks - solving this in EEH
core would mean having a way to every driver communicate the error
recovery process is completed and the next error could be "issued".
Like PCI core would "stack" the next error.
But...what if the device has really a severe issue in its slot, and it's
unable to recover at all? So, it'll trigger another error and naturally
the callbacks are going to be called, expecting to drop recoveries and
signal to PCI core something went real bad.
So, I believe these NULL checks will avoid at least this dereference
case and enforce signaling to PCI core to remove the device if this
"racy" error sequence happens after a device recovery failure is in process.
If you see another paths that multiple PCI errors happening in sequence
might cause trouble to nvme driver that I'm not aware, please let me
know and we can think a better solution. Also, if you have ideas on how
to address this in a more elegant/generic way, let me know too, I really
appreciate. Guess this patch is the most simplistic approach, at least
the one I could think of.
Thanks for the quick review!
Cheers,
Guilherme
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2017-04-05 20:01 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-04-05 19:40 [PATCH] nvme: avoid NULL pointer dereference in error recovery path Guilherme G. Piccoli
2017-04-05 19:43 ` Christoph Hellwig
2017-04-05 20:01 ` Guilherme G. Piccoli
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).