public inbox for linux-rdma@vger.kernel.org
From: Shirley Ma <shirley.ma-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
To: Chuck Lever <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>,
	Steve Wise
	<swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
Cc: "Hefty,
	Sean" <sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	Devesh Sharma
	<devesh.sharma-laKkSmNT4hbQT0dZR+AlfA@public.gmane.org>,
	Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider module
Date: Thu, 17 Jul 2014 14:25:14 -0700
Message-ID: <53C83F3A.7020608@oracle.com>
In-Reply-To: <DF7CE85B-288D-4CC2-AD51-B326D5F1EE1A-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>


On 07/17/2014 01:41 PM, Chuck Lever wrote:
> On Jul 17, 2014, at 4:08 PM, Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org> wrote:
>
>>> -----Original Message-----
>>> From: Steve Wise [mailto:swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org]
>>> Sent: Thursday, July 17, 2014 2:56 PM
>>> To: 'Hefty, Sean'; 'Shirley Ma'; 'Devesh Sharma'; 'Roland Dreier'
>>> Cc: 'linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org'; 'chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org'
>>> Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider module
>>>
>>>> -----Original Message-----
>>>> From: Hefty, Sean [mailto:sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org]
>>>> Sent: Thursday, July 17, 2014 2:50 PM
>>>> To: Steve Wise; 'Shirley Ma'; 'Devesh Sharma'; 'Roland Dreier'
>>>> Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org
>>>> Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider module
>>>>
>>>>>> So the rdma cm is expected to increase the driver reference count
>>>>>> (try_module_get) for each new cm id, then deference count
>>>>>> (module_put) when cm id is destroyed?
>>>>>
>>>>> No, I think he's saying the rdma-cm posts a RDMA_CM_DEVICE_REMOVAL event
>>>>> to each application with rdmacm objects allocated, and each application
>>>>> is expected to destroy all the objects it has allocated before returning
>>>>> from the event handler.
>>>>
>>>> This is almost correct.  The applications do not have to destroy all the
>>>> objects that it has allocated before returning from their event handler.
>>>> E.g. an app can queue a work item that does the destruction.  The rdmacm
>>>> will block in its ib_client remove handler until all relevant
>>>> rdma_cm_id's have been destroyed.
>>>
>>> Thanks for the clarification.
>>
>> And looking at xprtrdma, it does handle the DEVICE_REMOVAL event in rpcrdma_conn_upcall().
>> It sets ep->rep_connected to -ENODEV, wakes everybody up, and calls rpcrdma_conn_func()
>> for that endpoint, which schedules rep_connect_worker...  and I gave up following the code
>> path at this point... :)
>>
>> For this to all work correctly, it would need to destroy all the QPs, MRs, CQs, etc for
>> that device _before_ destroying the rdma cm ids.  Otherwise the provider module could be
>> unloaded too soon…
>
> We can’t really deal with a CM_DEVICE_REMOVE event while there are active
> NFS mounts.
>
> System shutdown ordering should guarantee (one would hope) that NFS
> mount points are unmounted before the RDMA/IB core infrastructure is
> torn down. Ordering shouldn’t matter as long all NFS activity has
> ceased before the CM tries to remove the device.
>
> So if something is hanging up the CM, there’s something xprtrdma is not
> cleaning up properly.

I saw a problem once when restarting the system without unmounting NFS: the CM hung waiting for a completion. It looks like a bug in the xprtrdma cleanup path. I couldn't reproduce it.

Call Trace:
 [<ffffffff815c9aa9>] schedule+0x29/0x70
 [<ffffffff815c8d35>] schedule_timeout+0x165/0x200
 [<ffffffff815ca9ff>] ? wait_for_completion+0xcf/0x110
 [<ffffffff810a708e>] ? __lock_release+0x9e/0x1f0
 [<ffffffff815ca9ff>] ? wait_for_completion+0xcf/0x110
 [<ffffffff815caa07>] wait_for_completion+0xd7/0x110
 [<ffffffff8108bce0>] ? try_to_wake_up+0x260/0x260
 [<ffffffffa064cb6e>] cma_process_remove+0xee/0x110 [rdma_cm]
 [<ffffffffa064cbdc>] cma_remove_one+0x4c/0x60 [rdma_cm]
 [<ffffffffa0279e0f>] ib_unregister_device+0x4f/0x100 [ib_core]
 [<ffffffffa02f76ee>] mlx4_ib_remove+0x2e/0x260 [mlx4_ib]
 [<ffffffffa01754c9>] mlx4_remove_device+0x69/0x80 [mlx4_core]
 [<ffffffffa01755b3>] mlx4_unregister_interface+0x43/0x80 [mlx4_core]
 [<ffffffffa030970c>] mlx4_ib_cleanup+0x10/0x23 [mlx4_ib]
 [<ffffffff810d9183>] SyS_delete_module+0x183/0x1e0
 [<ffffffff810f7c94>] ? __audit_syscall_entry+0x94/0x100
 [<ffffffff812c5789>] ? lockdep_sys_exit_thunk+0x35/0x67
 [<ffffffff815cec92>] system_call_fastpath+0x16/0x1b
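
The wait in cma_process_remove() shown in the trace can be modeled in user space. What follows is a minimal Python sketch, not kernel code: the names Device, CmId, deferred_destroy and leaky_handler are hypothetical, and the timeout exists only so the demonstration terminates (the kernel's wait_for_completion() has none). It illustrates the behavior described earlier in the thread: the removal path delivers a DEVICE_REMOVAL event to every cm id, a handler may defer the actual destroy to a work item, and removal blocks until the last id is gone.

```python
import threading


class CmId:
    """Toy stand-in for an rdma_cm_id owned by a consumer such as xprtrdma."""

    def __init__(self, device, handler):
        self.device = device
        self.handler = handler

    def destroy(self):
        # Tell the device that this id no longer exists.
        self.device._drop(self)


class Device:
    """Toy stand-in for a device whose removal waits on outstanding ids."""

    def __init__(self):
        self._lock = threading.Lock()
        self._ids = set()
        self._all_gone = threading.Event()
        self._all_gone.set()  # nothing outstanding yet

    def create_id(self, handler):
        cm_id = CmId(self, handler)
        with self._lock:
            self._ids.add(cm_id)
            self._all_gone.clear()
        return cm_id

    def _drop(self, cm_id):
        with self._lock:
            self._ids.discard(cm_id)
            if not self._ids:
                self._all_gone.set()

    def remove(self, timeout=None):
        """Notify every id, then block until all of them are destroyed.

        Returns True if every id was destroyed, False if the wait timed out.
        """
        with self._lock:
            ids = list(self._ids)
        for cm_id in ids:
            cm_id.handler(cm_id, "DEVICE_REMOVAL")
        return self._all_gone.wait(timeout)


def deferred_destroy(cm_id, event):
    # Well-behaved consumer: returns at once and destroys the id later,
    # like queueing a work item from the event handler.
    threading.Timer(0.05, cm_id.destroy).start()


def leaky_handler(cm_id, event):
    # Misbehaving consumer: never destroys its id, so removal cannot finish.
    pass
```

With deferred_destroy, remove() returns True shortly after the timer fires. With leaky_handler, remove() can only time out, which is the user-space analogue of the hang in the trace.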


Shirley
