From: Pradeep Satyanarayana <pradeeps-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
To: Ralph Campbell <ralph.campbell-h88ZbnxC6KDQT0dZR+AlfA@public.gmane.org>
Cc: Roland Dreier <rdreier-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>,
linux-rdma <linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: IB/ipoib: fix dangling pointer reference to ipoib_neigh and ipoib_path -when will it go upstream?
Date: Fri, 16 Jul 2010 02:13:32 -0700
Message-ID: <4C4022BC.3030401@linux.vnet.ibm.com>
In-Reply-To: <1279243768.31421.48.camel-/vjeY7uYZjrPXfVEPVhPGq6RkeBMCJyt@public.gmane.org>

Ralph Campbell wrote:
> On Thu, 2010-07-15 at 04:56 -0700, Pradeep Satyanarayana wrote:
>> Pradeep Satyanarayana wrote:
>>> Pradeep Satyanarayana wrote:
>>>> Roland Dreier wrote:
>>>>> > I guess I came to a premature conclusion. One set of tests ran fine and I made that
>>>>> > conclusion. Another set of tests caused the following crash:
>>>>>
>>>>> I don't really know how to interpret this. Is this crash new, or is it
>>>>> the same crash you were hoping this patch fixed?
>>>> This is a new crash.
>>> I see other manifestations resulting in different crashes:
>>>
>>> :mon> t
>>> [c00000074603ba20] d0000000193527ac .ipoib_neigh_flush+0x6c/0x350 [ib_ipoib]
>>> [c00000074603bb10] d000000019356dac .ipoib_mcast_free+0x74/0x2a0 [ib_ipoib]
>>> [c00000074603bbe0] d000000019358558 .ipoib_mcast_restart_task+0x3d0/0x560 [ib_ipoib]
>>> [c00000074603bd40] c0000000000c6fe4 .run_workqueue+0xf4/0x1e0
>>> [c00000074603be00] c0000000000c7190 .worker_thread+0xc0/0x180
>>> [c00000074603bed0] c0000000000ccf4c .kthread+0xb4/0xc0
>>> [c00000074603bf90] c0000000000309fc .kernel_thread+0x54/0x70
>>> 9:mon> e
>>> cpu 0x9: Vector: 300 (Data Access) at [c00000074603b720]
>>> pc: c0000000005ac390: ._spin_lock+0x20/0xc8
>>> lr: d0000000193527ac: .ipoib_neigh_flush+0x6c/0x350 [ib_ipoib]
>>> sp: c00000074603b9a0
>>> msr: 8000000000009032
>>> dar: 3a0
>>> dsisr: 40000000
>>> current = 0xc000000756ce8b00
>>> paca = 0xc000000000f63800
>>> pid = 18095, comm = ipoib
>>> 9:mon>
>> Recreating the crash has been tricky. I have tried several hundred times today
>> to unload and reload IPoIB while there is traffic and no crashes happened. I took
>> a closer look at the IPoIB CM code and I see a few things that look suspicious.
>>
>> In the ipoib_cm_send() path no priv->lock is held, whereas the priv->lock is held before
>> calling ipoib_cm_destroy_tx(). This is true with and without Ralph's patch (fix dangling pointer).
>> Is this a potential race?
>
> ipoib_cm_send() is only called by ipoib_start_xmit() so it is protected
> by netif_tx_lock(dev) or stopping the ipoib network device.

I still see one case in ipoib_neigh_cleanup() wherein ipoib_cm_destroy_tx() appears to be called
without netif_tx_lock(dev) held. Is that correct?
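
The concern can be sketched in user-space C as a lock-set check: two paths that touch the same object exclude each other only if they hold at least one lock in common. This is an illustrative model, not kernel code. The names lock_id, code_path, send_path, cleanup_path and mutually_excluded() are hypothetical, and the lock sets are taken from the statements in this thread (netif_tx_lock(dev) on the send path, priv->lock around ipoib_cm_destroy_tx()), not verified against the driver source.

```c
#include <stdbool.h>

/* Hypothetical lock identifiers for this sketch. They stand in for
 * priv->lock and netif_tx_lock(dev); they are not kernel types. */
enum lock_id {
    LOCK_PRIV     = 1u << 0,
    LOCK_NETIF_TX = 1u << 1,
};

/* One code path that touches the connected-mode TX state, together with
 * the set of locks assumed to be held while it does so. */
struct code_path {
    const char *name;
    unsigned int held;   /* bitmask of enum lock_id values */
};

/* Lock sets as described in this thread (assumption, not checked
 * against the ib_ipoib source). */
static const struct code_path send_path = {
    "ipoib_start_xmit() -> ipoib_cm_send()", LOCK_NETIF_TX
};
static const struct code_path cleanup_path = {
    "ipoib_neigh_cleanup() -> ipoib_cm_destroy_tx()", LOCK_PRIV
};

/* Two paths are serialized against each other only if their lock sets
 * intersect; disjoint sets mean they can run concurrently. */
static bool mutually_excluded(const struct code_path *a,
                              const struct code_path *b)
{
    return (a->held & b->held) != 0;
}
```

Under these assumptions, mutually_excluded(&send_path, &cleanup_path) is false, which is the race being asked about. A cleanup path that also took the TX lock (held = LOCK_PRIV | LOCK_NETIF_TX) would intersect with the send path.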

Thanks,
Pradeep