linux-rdma.vger.kernel.org archive mirror
* [PATCH for-next V7 0/6] HW Device hot-removal support
@ 2015-08-04 14:03 Yishai Hadas
From: Yishai Hadas @ 2015-08-04 14:03 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, yishaih-VPRAkNaXOzVWk0Htik3J/w,
	raindel-VPRAkNaXOzVWk0Htik3J/w, jackm-VPRAkNaXOzVWk0Htik3J/w,
	jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/

Currently, if there is any user space application using an IB device,
it is impossible to unload the HW device driver for this device.

Similarly, if the device is hot-unplugged or reset, the device driver
hardware removal flow blocks until all user contexts are destroyed.

This patchset removes the above limitations from both uverbs and ucma.

The IB-core and uverbs layers are still required to remain loaded as
long as there are user applications using the verbs API. However, the
hardware device drivers are no longer blocked by user space activity.

To support this, the hardware device needs to expose a new kernel API
named 'disassociate_ucontext'. The device driver is given a ucontext
to detach from, and it should block this user context from any future
hardware access. At the IB-core level, we use this interface to
deactivate all ucontexts that address a specific device when handling a
remove_one callback for it.
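
A minimal user-space sketch of the driver side of this idea, not the
kernel code from this series: the driver provides a callback that marks
a context as detached, and every later hardware access checks that mark
first. All names below (toy_ucontext, toy_device_ops, toy_hw_access) are
hypothetical; only the callback name 'disassociate_ucontext' comes from
this cover letter, and its real signature is whatever patch 4 defines.
The -EIO return value is chosen purely for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-process user context. */
struct toy_ucontext {
    bool disassociated;   /* set once the driver has detached the context */
    int  hw_accesses;     /* counts accesses that reached the "hardware" */
};

/* Hypothetical driver operations table holding the new callback. */
struct toy_device_ops {
    void (*disassociate_ucontext)(struct toy_ucontext *ctx);
};

/* Driver-side implementation: block all future hardware access. */
static void toy_disassociate_ucontext(struct toy_ucontext *ctx)
{
    assert(ctx != NULL);
    ctx->disassociated = true;
}

/* Every hardware access path checks the flag before touching the device. */
static int toy_hw_access(struct toy_ucontext *ctx)
{
    if (ctx->disassociated)
        return -EIO;      /* illustrative error code, an assumption */
    ctx->hw_accesses++;
    return 0;
}
```

In this model the context object itself stays valid after the callback
runs, so user space can still close it later; only the path to the
hardware is cut.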

In the RDMA-CM layer, a similar change is needed in the ucma module,
to prevent blocking of the remove_one operation. This allows for
hot-removal of devices with RDMA-CM users in the user space.

The first three patches are preparation for this series.
The first patch fixes a reference counting issue pointed out by Jason
Gunthorpe.
The second patch fixes a race condition pointed out by Jason Gunthorpe.
The third patch is a preparation step for deploying RCU for the device
removal flow.

The fourth patch introduces the new API between the HW device driver and
the IB core. For devices which implement the functionality, IB core
will use it in remove_one, disassociating any active ucontext from the
hardware device. Drivers that do not implement it will behave as
today: remove_one will block until all ucontexts referring to the
device are destroyed before returning.
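
The two behaviours described above can be sketched in user-space C as
follows. This is an illustration under stated assumptions, not the
uverbs implementation: toy_device, toy_remove_one and the result enum
are hypothetical names, and the real code blocks in the legacy case
where this sketch only reports that waiting would be required.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_CTX 8

/* Hypothetical user context; 'active' means it still uses the device. */
struct toy_ucontext {
    bool active;
};

struct toy_device {
    /* NULL when the driver does not implement the optional callback. */
    void (*disassociate_ucontext)(struct toy_ucontext *ctx);
    struct toy_ucontext *ctx[TOY_MAX_CTX];
    size_t nctx;
};

enum toy_remove_result {
    TOY_REMOVED,    /* all contexts detached, removal can proceed */
    TOY_MUST_WAIT   /* legacy path: wait for user space to close */
};

/* Example driver callback: detach the context from the hardware. */
static void toy_driver_disassociate(struct toy_ucontext *ctx)
{
    ctx->active = false;
}

/* Core-side removal: disassociate when supported, otherwise wait. */
static enum toy_remove_result toy_remove_one(struct toy_device *dev)
{
    size_t i, still_active = 0;

    assert(dev != NULL && dev->nctx <= TOY_MAX_CTX);

    for (i = 0; i < dev->nctx; i++) {
        if (!dev->ctx[i]->active)
            continue;
        if (dev->disassociate_ucontext)
            dev->disassociate_ucontext(dev->ctx[i]);
        else
            still_active++;   /* nothing the core can do but wait */
    }
    return still_active ? TOY_MUST_WAIT : TOY_REMOVED;
}
```

Keeping the callback optional is what lets existing drivers keep their
current behaviour without any change.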

The fifth patch provides implementation of this API for the mlx4
driver.

The last patch extends ucma to avoid blocking the remove_one operation
in the cma module. When such a device removal event is received, ucma
turns all user contexts into zombie contexts. This is done by
releasing all underlying resources and preventing any further user
operations on the context.
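
A rough user-space model of the zombie-context idea, assuming nothing
about the actual ucma data structures: toy_cm_ctx and the toy_ctx_*
helpers are hypothetical, and -ENODEV is an illustrative error code
rather than the one ucma returns.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical user-visible CM context. */
struct toy_cm_ctx {
    void *resources;  /* stands in for the underlying CM resources */
    bool  zombie;     /* true once the device has been removed */
};

static int toy_ctx_init(struct toy_cm_ctx *ctx)
{
    ctx->resources = malloc(64);
    ctx->zombie = false;
    return ctx->resources ? 0 : -ENOMEM;
}

/* Device removal: release resources now, keep the shell for user space. */
static void toy_ctx_make_zombie(struct toy_cm_ctx *ctx)
{
    free(ctx->resources);
    ctx->resources = NULL;
    ctx->zombie = true;
}

/* Any later user operation on a zombie context fails. */
static int toy_ctx_user_op(struct toy_cm_ctx *ctx)
{
    if (ctx->zombie)
        return -ENODEV;   /* illustrative error code, an assumption */
    assert(ctx->resources != NULL);
    return 0;
}

/* User space still closes the context; nothing is freed twice. */
static void toy_ctx_destroy(struct toy_cm_ctx *ctx)
{
    free(ctx->resources);   /* free(NULL) is a no-op */
    ctx->resources = NULL;
}
```

Because the resources are released at removal time, the device removal
does not have to wait for user space to close the context.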

Changes from V6:
Added an extra patch #2 to solve a race that was introduced 5 years ago and was reported by Jason.
patch #4 (previously #3): Adapted to the fix of patch #2.

Changes from V5:
Addressed Jason's comments for below patches:
patch #1: Improve kref usage.
patch #3: Use 2 different krefs for complete and memory, improve some comments.

Changes from V4:
patch #1,#3 - addressed Jason's comments.
patch #2, #4 - rebased on top of the latest changes.

Changes from V3:
Add 2 patches as a preparation for this series, details above.
patch #3: Change the locking scheme based on Jason's comments.

Changes from V2:
patch #1: Rebase over ODP patches.

Changes from V1:
patch #1: Use uverbs flags instead of disassociate support, drop fatal_event_raised flag.
patch #3: Add support in ucma for handling device removal.

Changes from V0:
patch #1: ib_uverbs_close, reduced mutex scope to enable tasks to run in parallel.

Yishai Hadas (6):
  IB/uverbs: Fix reference counting usage of event files
  IB/uverbs: Fix race between ib_uverbs_open and remove_one
  IB/uverbs: Explicitly pass ib_dev to uverbs commands
  IB/uverbs: Enable device removal when there are active user space
    applications
  IB/mlx4_ib: Disassociate support
  IB/ucma: HW Device hot-removal support

 drivers/infiniband/core/ucma.c        |  130 +++++++++-
 drivers/infiniband/core/uverbs.h      |   16 +-
 drivers/infiniband/core/uverbs_cmd.c  |  114 ++++++----
 drivers/infiniband/core/uverbs_main.c |  442 +++++++++++++++++++++++++++------
 drivers/infiniband/hw/mlx4/main.c     |  139 ++++++++++-
 drivers/infiniband/hw/mlx4/mlx4_ib.h  |   13 +
 include/rdma/ib_verbs.h               |    1 +
 7 files changed, 718 insertions(+), 137 deletions(-)


end of thread, other threads:[~2015-08-11 16:35 UTC | newest]

Thread overview: 18+ messages
2015-08-04 14:03 [PATCH for-next V7 0/6] HW Device hot-removal support Yishai Hadas
     [not found] ` <1438697008-26209-1-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-08-04 14:03   ` [PATCH for-next V7 1/6] IB/uverbs: Fix reference counting usage of event files Yishai Hadas
2015-08-04 14:03   ` [PATCH for-next V7 2/6] IB/uverbs: Fix race between ib_uverbs_open and remove_one Yishai Hadas
     [not found]     ` <1438697008-26209-3-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-08-04 21:36       ` Jason Gunthorpe
2015-08-04 14:03   ` [PATCH for-next V7 3/6] IB/uverbs: Explicitly pass ib_dev to uverbs commands Yishai Hadas
2015-08-04 14:03   ` [PATCH for-next V7 4/6] IB/uverbs: Enable device removal when there are active user space applications Yishai Hadas
     [not found]     ` <1438697008-26209-5-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-08-04 21:48       ` Jason Gunthorpe
2015-08-04 14:03   ` [PATCH for-next V7 5/6] IB/mlx4_ib: Disassociate support Yishai Hadas
2015-08-04 14:03   ` [PATCH for-next V7 6/6] IB/ucma: HW Device hot-removal support Yishai Hadas
     [not found]     ` <1438697008-26209-7-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
2015-08-04 22:09       ` Jason Gunthorpe
     [not found]         ` <20150804220903.GE10934-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-08-05 15:09           ` Yishai Hadas
     [not found]             ` <55C22739.5060808-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2015-08-05 18:21               ` Jason Gunthorpe
     [not found]                 ` <20150805182117.GA15583-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-08-06 15:36                   ` Liran Liss
     [not found]                     ` <HE1PR05MB1418EA195D579C09EB1FB746B1740-eBadYZ65MZ87O8BmmlM1zNqRiQSDpxhJvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2015-08-11  5:48                       ` Jason Gunthorpe
     [not found]                         ` <20150811054852.GC13314-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-08-11 12:47                           ` Liran Liss
     [not found]                             ` <HE1PR05MB1418DE9D40F9B2B20FD1AC3BB17F0-eBadYZ65MZ87O8BmmlM1zNqRiQSDpxhJvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2015-08-11 16:35                               ` Jason Gunthorpe
2015-08-05  0:23       ` Jason Gunthorpe
     [not found]         ` <20150805002338.GB22959-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2015-08-05 15:51           ` Yishai Hadas
