From: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
To: dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org
Cc: Jason Gunthorpe
<jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
Tadeusz Struk
<tadeusz.struk-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Subject: [PATCH v2 03/12] IB/hfi1: Fix an Oops on pci device force remove
Date: Tue, 25 Oct 2016 08:57:55 -0700 [thread overview]
Message-ID: <20161025155754.4950.23412.stgit@scvm10.sc.intel.com> (raw)
In-Reply-To: <20161017160731.GA5679-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
From: Tadeusz Struk <tadeusz.struk-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
This patch fixes an Oops on device unbind, when the device is used
by a PSM user process. PSM processes access device resources which
are freed on device removal. Similar protection exists in uverbs
in ib_core for Verbs clients, but PSM doesn't use ib_uverbs hence
a separate protection is required for PSM clients.
Cc: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Tadeusz Struk <tadeusz.struk-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
Changes in v2:
Fixed race condition between open and unbind repored by Jason.
---
drivers/infiniband/hw/hfi1/chip.c | 5 +++++
drivers/infiniband/hw/hfi1/file_ops.c | 19 ++++++++++++++++---
drivers/infiniband/hw/hfi1/hfi.h | 4 ++++
drivers/infiniband/hw/hfi1/init.c | 21 +++++++++++++++++++--
4 files changed, 44 insertions(+), 5 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index 9bf5f23..7992152 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -14691,6 +14691,11 @@ struct hfi1_devdata *hfi1_init_dd(struct pci_dev *pdev,
if (ret)
goto bail_free_cntrs;
+ init_completion(&dd->user_comp);
+
+ /* The user refcount starts with one to inidicate an active device */
+ atomic_set(&dd->user_refcount, 1);
+
goto bail;
bail_free_rcverr:
diff --git a/drivers/infiniband/hw/hfi1/file_ops.c b/drivers/infiniband/hw/hfi1/file_ops.c
index 677efa0..bd786b7 100644
--- a/drivers/infiniband/hw/hfi1/file_ops.c
+++ b/drivers/infiniband/hw/hfi1/file_ops.c
@@ -172,6 +172,9 @@ static int hfi1_file_open(struct inode *inode, struct file *fp)
struct hfi1_devdata,
user_cdev);
+ if (!atomic_inc_not_zero(&dd->user_refcount))
+ return -ENXIO;
+
/* Just take a ref now. Not all opens result in a context assign */
kobject_get(&dd->kobj);
@@ -183,11 +186,17 @@ static int hfi1_file_open(struct inode *inode, struct file *fp)
fd->rec_cpu_num = -1; /* no cpu affinity by default */
fd->mm = current->mm;
atomic_inc(&fd->mm->mm_count);
- }
+ fp->private_data = fd;
+ } else {
+ fp->private_data = NULL;
+
+ if (atomic_dec_and_test(&dd->user_refcount))
+ complete(&dd->user_comp);
- fp->private_data = fd;
+ return -ENOMEM;
+ }
- return fd ? 0 : -ENOMEM;
+ return 0;
}
static long hfi1_file_ioctl(struct file *fp, unsigned int cmd,
@@ -798,6 +807,10 @@ static int hfi1_file_close(struct inode *inode, struct file *fp)
done:
mmdrop(fdata->mm);
kobject_put(&dd->kobj);
+
+ if (atomic_dec_and_test(&dd->user_refcount))
+ complete(&dd->user_comp);
+
kfree(fdata);
return 0;
}
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index 3c06d20..368e96c 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -1174,6 +1174,10 @@ struct hfi1_devdata {
spinlock_t aspm_lock;
/* Number of verbs contexts which have disabled ASPM */
atomic_t aspm_disabled_cnt;
+ /* Keeps track of user space clients */
+ atomic_t user_refcount;
+ /* Used to wait for outstanding user space clients before dev removal */
+ struct completion user_comp;
struct hfi1_affinity *affinity;
struct rhashtable sdma_rht;
diff --git a/drivers/infiniband/hw/hfi1/init.c b/drivers/infiniband/hw/hfi1/init.c
index 60db615..e28a6b6 100644
--- a/drivers/infiniband/hw/hfi1/init.c
+++ b/drivers/infiniband/hw/hfi1/init.c
@@ -1538,12 +1538,31 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
return ret;
}
+static void wait_for_clients(struct hfi1_devdata *dd)
+{
+ /*
+ * Remove the device init value and complete the device if there is
+ * no clients or wait for active clients to finish.
+ */
+ if (atomic_dec_and_test(&dd->user_refcount))
+ complete(&dd->user_comp);
+
+ wait_for_completion(&dd->user_comp);
+}
+
static void remove_one(struct pci_dev *pdev)
{
struct hfi1_devdata *dd = pci_get_drvdata(pdev);
/* close debugfs files before ib unregister */
hfi1_dbg_ibdev_exit(&dd->verbs_dev);
+
+ /* remove the /dev hfi1 interface */
+ hfi1_device_remove(dd);
+
+ /* wait for existing user space clients to finish */
+ wait_for_clients(dd);
+
/* unregister from IB core */
hfi1_unregister_ib_device(dd);
@@ -1558,8 +1577,6 @@ static void remove_one(struct pci_dev *pdev)
/* wait until all of our (qsfp) queue_work() calls complete */
flush_workqueue(ib_wq);
- hfi1_device_remove(dd);
-
postinit_cleanup(dd);
}
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2016-10-25 15:57 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-10-17 11:19 [PATCH 00/12] For 4.9 rc Dennis Dalessandro
[not found] ` <20161017103326.7934.21558.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
2016-10-17 11:19 ` [PATCH 01/12] IB/rdmvat: Organize hot path calldowns into a single cacheline Dennis Dalessandro
2016-10-17 11:19 ` [PATCH 02/12] IB/hfi1: Optimize pio cachelines Dennis Dalessandro
2016-10-17 11:19 ` [PATCH 03/12] IB/hfi1: Fix an Oops on pci device force remove Dennis Dalessandro
[not found] ` <20161017111918.7934.72325.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
2016-10-17 16:07 ` Jason Gunthorpe
[not found] ` <20161017160731.GA5679-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-10-19 18:01 ` Tadeusz Struk
2016-10-25 15:57 ` Dennis Dalessandro [this message]
[not found] ` <20161025155754.4950.23412.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
2016-10-25 16:48 ` [PATCH v2 " Jason Gunthorpe
[not found] ` <20161025164851.GA28096-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2016-10-25 17:38 ` Tadeusz Struk
[not found] ` <7f4cbe0c-0c83-48b2-9901-4a5e27b306b4-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2016-10-25 18:03 ` Jason Gunthorpe
2016-10-17 11:19 ` [PATCH 04/12] IB/hfi1: Return ENODEV for unsupported PCI device ids Dennis Dalessandro
2016-10-17 11:19 ` [PATCH 05/12] IB/hfi1: Unify access to GUID entries Dennis Dalessandro
2016-10-17 11:19 ` [PATCH 06/12] IB/hfi1: Optimize devdata cachelines Dennis Dalessandro
2016-10-17 11:19 ` [PATCH 07/12] IB/hfi1: Fix a potential memory leak in hfi1_create_ctxts() Dennis Dalessandro
2016-10-17 11:19 ` [PATCH 08/12] IB/hfi1: Add active channel and backplane support for integrated devices Dennis Dalessandro
2016-10-17 11:19 ` [PATCH 09/12] IB/hfi1: Remove leftover snoop references Dennis Dalessandro
2016-10-17 11:19 ` [PATCH 10/12] IB/hfi1: Clean up unused argument Dennis Dalessandro
2016-10-17 11:20 ` [PATCH 11/12] IB/hfi1: Delete unused lock Dennis Dalessandro
2016-10-17 11:20 ` [PATCH 12/12] IB/hfi1: Fix rnr_timer addition Dennis Dalessandro
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20161025155754.4950.23412.stgit@scvm10.sc.intel.com \
--to=dennis.dalessandro-ral2jqcrhueavxtiumwx3w@public.gmane.org \
--cc=dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org \
--cc=dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
--cc=ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org \
--cc=jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org \
--cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=tadeusz.struk-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.