From: Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
To: Roland Dreier <roland-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH] iw_cm: reject connect requests if cmid is not in LISTEN
Date: Thu, 23 Feb 2012 14:23:50 -0600
Message-ID: <4F46A056.5090005@opengridcomputing.com>
In-Reply-To: <4F4699A1.7030402-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>


>
> Hrm.  I just hit this after more testing.  Debugging now.  Just hold off on this patch until I root-cause this.
>
>
> Unable to handle kernel paging request at 0000000000200200 RIP:
>  [<0000000000200200>]
> PGD 183c984067 PUD 0
> Oops: 0010 [1] SMP
> last sysfs file: /class/infiniband/cxgb4_0/node_guid
> CPU 10
> Modules linked in: nfs fscache nfs_acl cxgb3(U) iw_cxgb4(U) kretprobes(U) autofs4 hidp rfcomm l2cap bluetooth lockd 
> sunrpc be2iscsi iscsi_tcp bnx2i cnic uio libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi rdma_ucm(U) 
> ib_sdp(U) rdma_cm(U) iw_cm(U) ib_addr(U) ib_ipoib(U) ipoib_helper(U) ib_cm(U) ib_sa(U) ipv6 xfrm_nalgo crypto_api 
> ib_uverbs(U) ib_umad(U) iw_nes(U) ib_qib(U) dca mlx4_ib(U) mlx4_en(U) mlx4_core(U) ib_mthca(U) ib_mad(U) ib_core(U) 
> dm_mirror dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec dell_wmi wmi button battery asus_acpi 
> acpi_memhotplug ac parport_pc lp parport joydev cxgb4(U) tpm_tis tpm e1000e tpm_bios sr_mod shpchp i7core_edac edac_mc 
> cdrom i2c_i801 i2c_core serio_raw 8021q sg pcspkr dm_raid45 dm_message dm_region_hash dm_log dm_mod dm_mem_cache ahci 
> libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
> Pid: 5708, comm: iw_cm_wq Tainted: G      2.6.18-238.el5 #1
> RIP: 0010:[<0000000000200200>]  [<0000000000200200>]
> RSP: 0018:ffff81183e0cfcf8  EFLAGS: 00010097
> RAX: ffff810c3cf3ca58 RBX: 0c30100000000000 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffff81012aad6a58
> RBP: ffff81183e0cfd30 R08: ffff81012aad6a70 R09: 0000000000000282
> R10: 0000000000000000 R11: 0000000000000280 R12: 0000000000000000
> R13: 0000000000003c15 R14: ffff810c3cf3ca50 R15: 0000000000000000
> FS:  0000000000000000(0000) GS:ffff810c6a3c42c0(0000) knlGS:0000000000000000
> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000200200 CR3: 0000000c3d5a4000 CR4: 00000000000006e0
> Process iw_cm_wq (pid: 5708, threadinfo ffff81183e0ce000, task ffff810c3ea79080)
> Stack:  ffffffff8008c846 0000000300000000 ffff810c3cf3ca50 0000000000000000
>  0000000000000000 0000000000000282 0000000000000003 ffff81183e0cfd70
>  ffffffff8002e261 0000000000000000 ffff810c3cf3c9c0 ffff810c3cf3c900
> Call Trace:
>  [<ffffffff8008c846>] __wake_up_common+0x3e/0x68
>  [<ffffffff8002e261>] __wake_up+0x38/0x4f
>  [<ffffffff8867410b>] :iw_cm:iw_cm_reject+0x5a/0xa7
>  [<ffffffff88674baa>] :iw_cm:cm_work_handler+0x15e/0x424
>  [<ffffffff88674a4c>] :iw_cm:cm_work_handler+0x0/0x424
>  [<ffffffff8004d7ae>] run_workqueue+0x99/0xf6
>  [<ffffffff80049ff6>] worker_thread+0x0/0x122
>  [<ffffffff800a269c>] keventd_create_kthread+0x0/0xc4
>  [<ffffffff8004a0e6>] worker_thread+0xf0/0x122
>  [<ffffffff8008e40a>] default_wake_function+0x0/0xe
>  [<ffffffff800a269c>] keventd_create_kthread+0x0/0xc4
>  [<ffffffff80032974>] kthread+0xfe/0x132
>  [<ffffffff8005dfb1>] child_rip+0xa/0x11
>  [<ffffffff800a269c>] keventd_create_kthread+0x0/0xc4
>  [<ffffffff80032876>] kthread+0x0/0x132
>  [<ffffffff8005dfa7>] child_rip+0x0/0x11
>
>


Strange.  From my analysis, cm_work_handler + 0x15e points into cm_conn_req_handler(), in the block where
alloc_work_entries() returns nonzero:

         cm_id_priv = container_of(cm_id, struct iwcm_id_private, id);
         cm_id_priv->state = IW_CM_STATE_CONN_RECV;

         ret = alloc_work_entries(cm_id_priv, 3);
         if (ret) {
                 iw_cm_reject(cm_id, NULL, 0);
                 iw_destroy_cm_id(cm_id);
                 goto out;
         }


So it's calling iw_cm_reject() in the block above, having just set the state to CONN_RECV.

Now, iw_cm_reject + 0x5a points to this code in iw_cm_reject():

         if (cm_id_priv->state != IW_CM_STATE_CONN_RECV) {
                 spin_unlock_irqrestore(&cm_id_priv->lock, flags);
                 clear_bit(IWCM_F_CONNECT_WAIT, &cm_id_priv->flags);
                 wake_up_all(&cm_id_priv->connect_wait);
                 return -EINVAL;
         }


The oops is inside the wake_up_all() call, so the state was not CONN_RECV when iw_cm_reject() checked it, even though
the calling frame had just set it to CONN_RECV.  I can only assume some other thread is modifying the cm_id concurrently.
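
To make that reasoning concrete, here is a minimal user-space sketch of the sequence, not the actual iw_cm code. All names (model_state, model_cm_id, model_reject, model_conn_req_failure) are hypothetical stand-ins, and the "other thread" is simulated by a plain assignment at the point where a concurrent writer would have to interleave:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the iw_cm connection states. */
enum model_state { MODEL_LISTEN, MODEL_CONN_RECV, MODEL_DESTROYING };

struct model_cm_id {
	enum model_state state;
};

/*
 * Models only the state check quoted from iw_cm_reject():
 * any state other than CONN_RECV takes the -EINVAL path
 * (the path that calls wake_up_all() in the real code).
 */
static int model_reject(struct model_cm_id *id)
{
	if (id->state != MODEL_CONN_RECV)
		return -EINVAL;
	return 0;
}

/*
 * Replays the failure branch of the handler: set CONN_RECV, then reject.
 * If 'interfere' is nonzero, a simulated concurrent writer changes the
 * state in between (assumption: this is where the race would occur).
 */
static int model_conn_req_failure(int interfere)
{
	struct model_cm_id id = { .state = MODEL_LISTEN };

	id.state = MODEL_CONN_RECV;
	if (interfere)
		id.state = MODEL_DESTROYING;	/* hypothetical other thread */

	return model_reject(&id);
}

/* Small self-check showing both outcomes. */
static void model_self_check(void)
{
	assert(model_conn_req_failure(0) == 0);
	assert(model_conn_req_failure(1) == -EINVAL);
}
```

Without interference the check passes and the -EINVAL path is never taken, so reaching wake_up_all() at iw_cm_reject+0x5a implies something changed the state (or the memory behind it) between the two statements.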




