public inbox for linux-rdma@vger.kernel.org
From: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
To: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH] iw_cm: reject connect requests if cmid is not in LISTEN
Date: Thu, 23 Feb 2012 13:55:13 -0600
Message-ID: <4F4699A1.7030402@opengridcomputing.com>
In-Reply-To: <4F465A46.3060301-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>

On 02/23/2012 09:24 AM, Steve Wise wrote:
> On 02/23/2012 01:46 AM, Roland Dreier wrote:
>> On Wed, Feb 22, 2012 at 1:43 PM, Steve Wise<swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>  wrote:
>>> diff --git a/drivers/infiniband/core/iwcm.c b/drivers/infiniband/core/iwcm.c
>>> index 1a696f7..6847d76 100644
>>> --- a/drivers/infiniband/core/iwcm.c
>>> +++ b/drivers/infiniband/core/iwcm.c
>>> @@ -631,6 +631,8 @@ static void cm_conn_req_handler(struct iwcm_id_private *listen_id_priv,
>>>         spin_lock_irqsave(&listen_id_priv->lock, flags);
>>>         if (listen_id_priv->state != IW_CM_STATE_LISTEN) {
>>>                 spin_unlock_irqrestore(&listen_id_priv->lock, flags);
>>> +               iw_cm_reject(cm_id, NULL, 0);
>>> +               iw_destroy_cm_id(cm_id);
>>>                 goto out;
>>>         }
>>>         spin_unlock_irqrestore(&listen_id_priv->lock, flags);
>> Thanks, this makes more sense to my brain at least.
>>
>
> Yes, this is the best fix methinks.  Thanks for the review!
>
>> I assume this works just as well in your testing? :)
>
> Yes, I've run some large NP MPI tests that tickle this condition and all the connections get cleaned up now.  I also 
> ran some other MPI regression tests with this fix.
>

Hrm.  I just hit this oops after more testing.  Debugging now.  Please hold off on this patch until I root-cause it.


Unable to handle kernel paging request at 0000000000200200 RIP:
  [<0000000000200200>]
PGD 183c984067 PUD 0
Oops: 0010 [1] SMP
last sysfs file: /class/infiniband/cxgb4_0/node_guid
CPU 10
Modules linked in: nfs fscache nfs_acl cxgb3(U) iw_cxgb4(U) kretprobes(U) autofs4 hidp rfcomm l2cap bluetooth lockd 
sunrpc be2iscsi iscsi_tcp bnx2i cnic uio libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi rdma_ucm(U) 
ib_sdp(U) rdma_cm(U) iw_cm(U) ib_addr(U) ib_ipoib(U) ipoib_helper(U) ib_cm(U) ib_sa(U) ipv6 xfrm_nalgo crypto_api 
ib_uverbs(U) ib_umad(U) iw_nes(U) ib_qib(U) dca mlx4_ib(U) mlx4_en(U) mlx4_core(U) ib_mthca(U) ib_mad(U) ib_core(U) 
dm_mirror dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec dell_wmi wmi button battery asus_acpi 
acpi_memhotplug ac parport_pc lp parport joydev cxgb4(U) tpm_tis tpm e1000e tpm_bios sr_mod shpchp i7core_edac edac_mc 
cdrom i2c_i801 i2c_core serio_raw 8021q sg pcspkr dm_raid45 dm_message dm_region_hash dm_log dm_mod dm_mem_cache ahci 
libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 5708, comm: iw_cm_wq Tainted: G      2.6.18-238.el5 #1
RIP: 0010:[<0000000000200200>]  [<0000000000200200>]
RSP: 0018:ffff81183e0cfcf8  EFLAGS: 00010097
RAX: ffff810c3cf3ca58 RBX: 0c30100000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffff81012aad6a58
RBP: ffff81183e0cfd30 R08: ffff81012aad6a70 R09: 0000000000000282
R10: 0000000000000000 R11: 0000000000000280 R12: 0000000000000000
R13: 0000000000003c15 R14: ffff810c3cf3ca50 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff810c6a3c42c0(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000200200 CR3: 0000000c3d5a4000 CR4: 00000000000006e0
Process iw_cm_wq (pid: 5708, threadinfo ffff81183e0ce000, task ffff810c3ea79080)
Stack:  ffffffff8008c846 0000000300000000 ffff810c3cf3ca50 0000000000000000
  0000000000000000 0000000000000282 0000000000000003 ffff81183e0cfd70
  ffffffff8002e261 0000000000000000 ffff810c3cf3c9c0 ffff810c3cf3c900
Call Trace:
  [<ffffffff8008c846>] __wake_up_common+0x3e/0x68
  [<ffffffff8002e261>] __wake_up+0x38/0x4f
  [<ffffffff8867410b>] :iw_cm:iw_cm_reject+0x5a/0xa7
  [<ffffffff88674baa>] :iw_cm:cm_work_handler+0x15e/0x424
  [<ffffffff88674a4c>] :iw_cm:cm_work_handler+0x0/0x424
  [<ffffffff8004d7ae>] run_workqueue+0x99/0xf6
  [<ffffffff80049ff6>] worker_thread+0x0/0x122
  [<ffffffff800a269c>] keventd_create_kthread+0x0/0xc4
  [<ffffffff8004a0e6>] worker_thread+0xf0/0x122
  [<ffffffff8008e40a>] default_wake_function+0x0/0xe
  [<ffffffff800a269c>] keventd_create_kthread+0x0/0xc4
  [<ffffffff80032974>] kthread+0xfe/0x132
  [<ffffffff8005dfb1>] child_rip+0xa/0x11
  [<ffffffff800a269c>] keventd_create_kthread+0x0/0xc4
  [<ffffffff80032876>] kthread+0x0/0x132
  [<ffffffff8005dfa7>] child_rip+0x0/0x11


Code:  Bad RIP value.
RIP  [<0000000000200200>]
  RSP <ffff81183e0cfcf8>
crash>
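
A note on reading the trace above: the faulting address 0000000000200200 is numerically equal to LIST_POISON2, the value the kernel's list_del() stores in an entry's prev pointer (LIST_POISON1, 0x00100100, goes into next). Seeing that value as a jump target inside __wake_up_common() often points at a list entry being used after it was deleted or freed, but that is only one possible reading and is not confirmed anywhere in this message. The sketch below is a user-space imitation of the list helpers, not the kernel source; the function name stale_prev_after_del() is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* User-space imitation of the kernel's doubly linked list (illustrative only). */
struct list_head {
	struct list_head *next, *prev;
};

/* Poison values as defined in include/linux/poison.h on kernels of this era. */
#define LIST_POISON1 ((struct list_head *)(uintptr_t)0x00100100)
#define LIST_POISON2 ((struct list_head *)(uintptr_t)0x00200200)

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert 'entry' right after 'head'. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink 'entry' and poison its pointers so stale use faults loudly. */
static void list_del(struct list_head *entry)
{
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
	entry->next = LIST_POISON1;
	entry->prev = LIST_POISON2;
}

/* Hypothetical helper: what a stale reader sees in 'prev' after deletion. */
static uintptr_t stale_prev_after_del(void)
{
	struct list_head head, entry;

	list_init(&head);
	list_add(&entry, &head);
	list_del(&entry);

	/* The list itself is consistent and empty again. */
	assert(head.next == &head && head.prev == &head);

	/* The deleted entry, however, now carries the poison value. */
	return (uintptr_t)entry.prev;
}
```

Any code that still holds a pointer to the deleted entry and walks or calls through it would end up dereferencing or jumping to one of the two poison addresses, which is why they tend to show up verbatim in CR2 or RIP.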


