From: Pradeep Satyanarayana <pradeeps-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
To: Ralph Campbell <ralph.campbell-h88ZbnxC6KDQT0dZR+AlfA@public.gmane.org>
Cc: Roland Dreier <rdreier-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>,
"linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
<linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: [PATCH v5] IB/ipoib: fix dangling pointer references to ipoib_neigh and ipoib_path
Date: Fri, 01 Oct 2010 15:38:25 -0700
Message-ID: <4CA662E1.1040902@linux.vnet.ibm.com>
In-Reply-To: <1285893348.22791.120.camel-/vjeY7uYZjrPXfVEPVhPGq6RkeBMCJyt@public.gmane.org>
On 9/30/2010 5:35 PM, Ralph Campbell wrote:
> I was looking at the Rx connection tear down and found a bug.
> I don't know if it would cause this panic but you might try it.
> I haven't stress tested it but it compiles and basic network
> connections work.
>
> I also don't like the call to cancel_delayed_work(&priv->cm.stale_task)
> at the end of ipoib_cm_dev_stop(). I think it should be called after
> ib_destroy_cm_id() and priv->cm.id = NULL.
>
Ralph,
I have managed to recreate this crash a few times under stress. I expect to
be able to try your patch some time next week, and will let you know. Thanks
for taking time to look into this.
Thanks
Pradeep
> On Thu, 2010-09-02 at 20:41 -0700, Pradeep Satyanarayana wrote:
>> Ralph,
>>
>> I see the following crash sporadically (only under stress) with SLES 11 SP1 (which is a 2.6.32 kernel).
>> I saw this crash with V4 of your patch and have not yet had a chance to try V5. Have you seen this
>> in your testing? If this is not the crash stack, can you please share what your patch fixes?
>>
>> <4>ib0: RX drain timing out
>> <4>idr_remove called for id=11491974 which is not allocated.
>> <4>Call Trace:
>> <4>[c000000749fe33b0] [c0000000000129e4] .show_stack+0x6c/0x198 (unreliable)
>> <4>[c000000749fe3460] [c0000000002ea594] .sub_remove+0x1ec/0x1f8
>> <4>[c000000749fe3520] [c0000000002ea5e0] .idr_remove+0x40/0xf8
>> <4>[c000000749fe35b0] [d000000012d84d70] .cm_destroy_id+0xa0/0x520 [ib_cm]
>> <4>[c000000749fe3680] [d00000001b7fb644] .ipoib_cm_free_rx_reap_list+0xd4/0x190 [ib_ipoib]
>> <4>[c000000749fe3740] [d00000001b7fe404] .ipoib_cm_dev_stop+0x23c/0x360 [ib_ipoib]
>> <4>[c000000749fe3800] [d00000001b7f4dbc] .ipoib_ib_dev_stop+0xe4/0x4b0 [ib_ipoib]
>> <4>[c000000749fe3960] [d00000001b7f0f30] .ipoib_stop+0x88/0x178 [ib_ipoib]
>> <4>[c000000749fe39f0] [c0000000004eacf4] .dev_close+0xdc/0x148
>> <4>[c000000749fe3a80] [c0000000004ea2b8] .dev_change_flags+0x1f0/0x288
>> <4>[c000000749fe3b20] [d00000001b7f11b8] .ipoib_remove_one+0xb8/0x140 [ib_ipoib]
>> <4>[c000000749fe3bc0] [d00000001210425c] .ib_unregister_client+0xb4/0x1b8 [ib_core]
>> <4>[c000000749fe3c90] [d00000001b7ffde8] .ipoib_cleanup_module+0x20/0x60 [ib_ipoib]
>> <4>[c000000749fe3d20] [c0000000000ec408] .SyS_delete_module+0x238/0x320
>> <4>[c000000749fe3e30] [c0000000000085b4] syscall_exit+0x0/0x40
>> <1>Unable to handle kernel paging request for data at address 0x45000027228d1ffb
>> <1>Faulting instruction address: 0xc0000000005a8e88
>> 12:mon> e
>> cpu 0x12: Vector: 300 (Data Access) at [c000000749fe3250]
>> pc: c0000000005a8e88: .wait_for_common+0xb8/0x268
>> lr: c0000000005a8e20: .wait_for_common+0x50/0x268
>> sp: c000000749fe34d0
>> msr: 8000000000009032
>> dar: 45000027228d1ffb
>> dsisr: 42000000
>> current = 0xc00000074b4ce0e0
>> paca = 0xc000000000f64a00
>> pid = 13605, comm = modprobe
>> 12:mon>
>>
>> Thanks
>> Pradeep
>
Thread overview: 7+ messages
2010-08-17 20:36 [PATCH v5 0/1] IB/ipoib: fix dangling pointer references to ipoib_neigh and ipoib_path Ralph Campbell
[not found] ` <20100817203619.22174.62871.stgit-/vjeY7uYZjrPXfVEPVhPGq6RkeBMCJyt@public.gmane.org>
2010-08-17 20:36 ` [PATCH v5] " Ralph Campbell
[not found] ` <20100817203624.22174.69480.stgit-/vjeY7uYZjrPXfVEPVhPGq6RkeBMCJyt@public.gmane.org>
2010-09-03 3:41 ` Pradeep Satyanarayana
[not found] ` <4C806E72.6030507-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2010-09-03 16:06 ` Ralph Campbell
2010-10-01 0:35 ` Ralph Campbell
[not found] ` <1285893348.22791.120.camel-/vjeY7uYZjrPXfVEPVhPGq6RkeBMCJyt@public.gmane.org>
2010-10-01 22:38 ` Pradeep Satyanarayana [this message]
2010-08-17 20:48 ` [PATCH v5 0/1] " Ralph Campbell