From: "Steve Wise" <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
To: sagig <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>,
	Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org
Subject: nvmet_rdma crash - DISCONNECT event with NULL queue
Date: Tue, 1 Nov 2016 10:57:44 -0500
Message-ID: <01b401d23458$af277210$0d765630$@opengridcomputing.com>

Hey guys,

I just hit an nvmf target NULL pointer deref BUG after a few hours of keep-alive
timeout testing.  It appears that nvmet_rdma_cm_handler() was called with
cm_id->qp == NULL, so the local nvmet_rdma_queue * variable queue is left as
NULL.  But then nvmet_rdma_queue_disconnect() is called with queue == NULL which
causes the crash.
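
For illustration only, here is a userspace sketch of the path described above.
All mock_* names are hypothetical stand-ins and not the real kernel
definitions. It models a handler that derives the queue from cm_id->qp, plus
one possible guard for the DISCONNECTED case. Whether such a guard is the right
fix, or merely hides a teardown race, is not established here.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct mock_queue { int idx; int disconnects; };
struct mock_qp    { void *qp_context; };
struct mock_cm_id { struct mock_qp *qp; };

enum mock_event { MOCK_EVENT_DISCONNECTED };

/* Dereferences queue unconditionally, like the crashing path in the report. */
static void mock_queue_disconnect(struct mock_queue *queue)
{
	queue->disconnects++;
}

/* Models the handler: queue stays NULL when the cm_id has no QP attached. */
static int mock_cm_handler(struct mock_cm_id *cm_id, enum mock_event ev)
{
	struct mock_queue *queue = NULL;

	if (cm_id->qp)
		queue = cm_id->qp->qp_context;

	switch (ev) {
	case MOCK_EVENT_DISCONNECTED:
		if (!queue) {
			/* Assumed guard: the queue is already gone, nothing to do. */
			fprintf(stderr, "DISCONNECTED with no queue, ignoring\n");
			break;
		}
		mock_queue_disconnect(queue);
		break;
	}
	return 0;
}
```

Calling mock_cm_handler() with a cm_id whose qp is NULL then returns without
touching any queue, while a cm_id with a live QP still reaches
mock_queue_disconnect().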

In the log, I see that the target side keep-alive fired:

[20676.867545] eth2: link up, 40Gbps, full-duplex, Tx/Rx PAUSE
[20677.079669] nvmet: ctrl 1 keep-alive timer (15 seconds) expired!
[20677.079684] nvmet: ctrl 1 keep-alive timer (15 seconds) expired!

Then all the queues are freed, followed by the crash:

[20677.080066] nvmet_rdma: freeing queue 222
[20677.080074] nvmet_rdma: sending cmd response failed
[20677.080351] nvmet_rdma: freeing queue 227
[20677.080775] nvmet_rdma: freeing queue 230
[20677.081137] nvmet_rdma: freeing queue 232
[20677.081371] nvmet_rdma: freeing queue 234
[20677.081604] nvmet_rdma: freeing queue 236
[20677.081835] nvmet_rdma: freeing queue 237
[20677.082062] nvmet_rdma: freeing queue 238
[20677.082106] nvmet_rdma: freeing queue 239
[20677.082366] nvmet_rdma: freeing queue 240
[20677.082570] nvmet_rdma: freeing queue 241
[20677.082995] nvmet_rdma: freeing queue 242
[20677.083222] nvmet_rdma: freeing queue 243
[20677.083475] nvmet_rdma: freeing queue 244
[20677.083522] nvmet_rdma: freeing queue 245
[20677.083801] nvmet_rdma: freeing queue 246
[20677.084264] nvmet_rdma: freeing queue 247
[20677.084307] nvmet_rdma: freeing queue 248
[20677.084501] nvmet_rdma: freeing queue 249
[20677.084846] nvmet_rdma: freeing queue 250
[20677.085184] nvmet_rdma: freeing queue 252
[20677.085500] nvmet_rdma: freeing queue 254
[20677.085733] nvmet_rdma: freeing queue 256
[20677.085997] nvmet_rdma: freeing queue 258
[20677.086224] nvmet_rdma: freeing queue 260
[20677.086517] nvmet_rdma: freeing queue 262
[20677.086768] nvmet_rdma: freeing queue 264
[20677.087031] nvmet_rdma: freeing queue 266
[20677.087359] nvmet_rdma: freeing queue 268
[20677.087567] nvmet_rdma: freeing queue 270
[20677.087821] nvmet_rdma: freeing queue 272
[20677.088162] nvmet_rdma: freeing queue 274
[20677.088402] nvmet_rdma: freeing queue 276
[20677.090981] BUG: unable to handle kernel NULL pointer dereference at 0000000000000120
[20677.090988] IP: [<ffffffffa084b6b4>] nvmet_rdma_queue_disconnect+0x24/0x90 [nvmet_rdma]
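
The small non-zero fault address is consistent with a member access through a
NULL base pointer: the faulting address would then be the member's offset
within the structure. The sketch below shows only that arithmetic. The layout
is hypothetical, and which member of the real queue structure sits at offset
0x120 is not known from this log.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, NOT the real queue structure: the padding is sized
 * only so that the next member lands at offset 0x120. */
struct mock_layout {
	char earlier_members[0x120];
	int  member_touched_first;
};

/* Address accessed for a member at `offset` in a struct based at `base`. */
static uintptr_t member_address(uintptr_t base, size_t offset)
{
	return base + offset;
}
```

With base == 0 and offsetof(struct mock_layout, member_touched_first), the
computed address is 0x120, matching the form of the address in the BUG line.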


So maybe there is just a race: keep-alive can free the queue, and yet a
DISCONNECTED event is still received on the cm_id after the queue is freed?

Steve.
