public inbox for linux-rdma@vger.kernel.org
* race in ULPs when processing RDMA_CM_EVENT_DEVICE_REMOVAL
@ 2013-05-05 14:05 Or Gerlitz
From: Or Gerlitz @ 2013-05-05 14:05 UTC
  To: Hefty, Sean
  Cc: Roi Dayan,
	linux-rdma (linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org)

Hi Sean,

When the low-level driver exercises the hot-unplug path (e.g. if the user
does modprobe -r, PCI hot unplug, etc.), it calls the rdma-cm remove_one
callback, which generates an RDMA_CM_EVENT_DEVICE_REMOVAL event for the
cma consumers. Now, if a consumer doesn't destroy all the IB objects it
created on that low-level device instance (e.g. mlx4_0) before it finishes
processing the DEVICE_REMOVAL callback, the rdma-cm gives the low-level
driver the green light to finalize its de-registration with the IB core
(destruction of the IB device instance etc.). A later call from the
consumer to (say) ib_destroy_cq(dev, cq) will then crash, since that dev
object is practically null, or the call points to a function/module which
no longer exists in the kernel address space. Agree?

What is the correct approach for consumers? Should they make sure to
destroy all their IB objects (PDs, MRs, CQs, QPs, etc.) first, before
destroying the last rdma_cm id on a device removal event? Any other ideas?
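
To make the ordering concrete, here is a minimal user-space sketch of the
two behaviours. It is only a model under my own assumptions, not kernel
code: every model_* name is hypothetical, and model_destroy_object() merely
stands in for calls such as ib_destroy_cq(). The "safe" handler destroys
everything before it returns from the removal event; the "deferred" handler
returns first and tries to clean up afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these are real kernel APIs. */
struct model_device {
    bool registered;   /* false once the low-level driver has unregistered */
    int  live_objects; /* PDs, MRs, CQs, QPs still allocated on the device */
};

struct model_consumer {
    struct model_device *dev;
    int objects;       /* IB objects this consumer holds on dev */
};

/*
 * Stand-in for ib_destroy_cq() and friends. Returns 0 on success and -1
 * where the real kernel would touch an already destroyed device.
 */
static int model_destroy_object(struct model_consumer *c)
{
    if (!c->dev->registered)
        return -1;
    c->objects--;
    c->dev->live_objects--;
    return 0;
}

/* Destroy every object the consumer holds; stop at the first failure. */
static int model_destroy_all(struct model_consumer *c)
{
    while (c->objects > 0)
        if (model_destroy_object(c) != 0)
            return -1;
    return 0;
}

/* Proposed ordering: tear everything down before returning from the event. */
static void model_removal_handler_safe(struct model_consumer *c)
{
    int rc = model_destroy_all(c);

    assert(rc == 0);
    (void)rc;
}

/* Problematic ordering: return (ack) now, leave the teardown for later. */
static void model_removal_handler_deferred(struct model_consumer *c)
{
    (void)c;
}

/*
 * Models the remove_one path: deliver DEVICE_REMOVAL to the consumer, then
 * let the low-level driver finish its de-registration.
 */
static void model_remove_one(struct model_device *dev,
                             struct model_consumer *c,
                             void (*handler)(struct model_consumer *))
{
    handler(c);
    dev->registered = false;
}
```

Calling model_remove_one() with the safe handler leaves no objects behind.
With the deferred handler, a later model_destroy_all() fails, which is the
point where the real consumer would crash.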

In iSER we don't make sure to destroy all the IB objects before acking
this event to the rdma-cm, and we see crashes under the RoCE link layer
but not under the IB link layer. Most likely some timing differs between
the link layers, but the architectural question is still valid.

Or.


Thread overview: 5+ messages
2013-05-05 14:05 race in ULPs when processing RDMA_CM_EVENT_DEVICE_REMOVAL Or Gerlitz
2013-05-06 15:46   ` Hefty, Sean
2013-05-06 20:25       ` Or Gerlitz
2013-05-06 20:33           ` Hefty, Sean
2013-05-06 21:08               ` Or Gerlitz
