All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lukas Wunner <lukas@wunner.de>
To: gokul cg <gokuljnpr@gmail.com>, Thomas Tai <thomas.tai@oracle.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>,
	Bjorn Helgaas <helgaas@kernel.org>,
	Ashok Raj <ashok.raj@intel.com>,
	Keith Busch <keith.busch@intel.com>,
	Yinghai Lu <yinghai@kernel.org>, Sinan Kaya <okaya@kernel.org>,
	linux-pci@vger.kernel.org,
	Alexandru Gagniuc <mr.nuke.me@gmail.com>
Subject: Re: [PATCH] PCI: pciehp: Differentiate between surprise and safe removal
Date: Thu, 2 Aug 2018 17:07:49 +0200	[thread overview]
Message-ID: <20180802150749.GA31683@wunner.de> (raw)
In-Reply-To: <20180802084657.GA21267@wunner.de>

[cc += Thomas Tai]

On Thu, Aug 02, 2018 at 10:46:57AM +0200, Lukas Wunner wrote:
> On Thu, Aug 02, 2018 at 12:59:18PM +0530, gokul cg wrote:
> > I am suspecting a possible race condition in the kernel between PCI driver
> > and AER handling.
> 
> The solution is to acquire a ref on each device in add_error_device().
> Then release the ref aer_process_err_devices() by calling pci_dev_put().

So in case it wasn't clear, the below is what I had in mind.
Completely untested though.  Does this work for you?

For v3.10 compatibility, cherry-pick 89ee9f768003 (or alternatively
cherry-pick 8496e85c20e7 and replace pci_dev_is_disconnected(dev)
with !pci_device_is_present(dev)).

-- >8 --
Subject: [PATCH] PCI/AER: Fix use-after-free on surprise removal

The work item to consume errors, aer_isr(), walks the hierarchy using
pci_walk_bus() and stores a pointer to PCI devices which reported an
error in an array.  As long as pci_walk_bus() runs, those pointers are
valid because pci_bus_sem is held.  But once pci_walk_bus() finishes,
nothing prevents the pointers from becoming invalid, e.g. through
unplugging of the PCI devices.  The unprotected pointers are then
dereferenced in aer_process_err_devices(), which may oops:

  #5  general_protection at ffffffff8176cdf2
      [exception RIP: pci_bus_read_config_dword+100]
  #6  pci_find_next_ext_capability at ffffffff81345d7b
  #7  pci_find_ext_capability at ffffffff81347225
  #8  get_device_error_info at ffffffff81356c4d
  #9  aer_isr at ffffffff81357a38

Fix by holding a ref on the devices until they have been processed.
Skip processing of unplugged devices.

Reported-by: gokul cg <gokuljnpr@gmail.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
---
 drivers/pci/pcie/aer.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index a2e8838..937592e 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -657,7 +657,7 @@ void cper_print_aer(struct pci_dev *dev, int aer_severity,
 static int add_error_device(struct aer_err_info *e_info, struct pci_dev *dev)
 {
 	if (e_info->error_dev_num < AER_MAX_MULTI_ERR_DEVICES) {
-		e_info->dev[e_info->error_dev_num] = dev;
+		e_info->dev[e_info->error_dev_num] = pci_dev_get(dev);
 		e_info->error_dev_num++;
 		return 0;
 	}
@@ -898,6 +898,9 @@ static int get_device_error_info(struct pci_dev *dev, struct aer_err_info *info)
 	if (!pos)
 		return 0;
 
+	if (pci_dev_is_disconnected(dev))
+		return 0;
+
 	if (info->severity == AER_CORRECTABLE) {
 		pci_read_config_dword(dev, pos + PCI_ERR_COR_STATUS,
 			&info->status);
@@ -948,6 +951,7 @@ static inline void aer_process_err_devices(struct aer_err_info *e_info)
 	for (i = 0; i < e_info->error_dev_num && e_info->dev[i]; i++) {
 		if (get_device_error_info(e_info->dev[i], e_info))
 			handle_error_source(e_info->dev[i], e_info);
+		pci_dev_put(e_info->dev[i]);
 	}
 }
 
-- 
2.18.0

  parent reply	other threads:[~2018-08-02 15:07 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-31  5:50 [PATCH] PCI: pciehp: Differentiate between surprise and safe removal Lukas Wunner
2018-08-01 16:43 ` Mika Westerberg
2018-08-01 17:15   ` Lukas Wunner
2018-08-01 19:09     ` Alex G.
2018-08-02  7:20     ` Mika Westerberg
2018-08-02  7:29       ` gokul cg
2018-08-02  8:46         ` Lukas Wunner
2018-08-02 12:28           ` gokul cg
2018-08-02 15:07           ` Lukas Wunner [this message]
2018-08-02 17:09             ` Thomas Tai
2018-08-06 18:33               ` gokul cg
2018-08-07 14:26                 ` Thomas Tai
2018-08-07 15:30                 ` Thomas Tai
2018-08-08  9:59                   ` gokul cg
2018-08-08 11:21                   ` gokul cg
2018-08-08 20:49                     ` Thomas Tai
2018-09-04 17:53 ` Bjorn Helgaas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180802150749.GA31683@wunner.de \
    --to=lukas@wunner.de \
    --cc=ashok.raj@intel.com \
    --cc=gokuljnpr@gmail.com \
    --cc=helgaas@kernel.org \
    --cc=keith.busch@intel.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=mika.westerberg@linux.intel.com \
    --cc=mr.nuke.me@gmail.com \
    --cc=okaya@kernel.org \
    --cc=thomas.tai@oracle.com \
    --cc=yinghai@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.