From: Keith Busch <kbusch@kernel.org>
To: Ming Lei <ming.lei@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>,
"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
Hannes Reinecke <hare@suse.com>,
"Busch, Keith" <keith.busch@intel.com>,
"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
Sagi Grimberg <sagi@grimberg.me>,
Dongli Zhang <dongli.zhang@oracle.com>,
James Smart <james.smart@broadcom.com>,
Bart Van Assche <bart.vanassche@wdc.com>,
"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
"Martin K . Petersen" <martin.petersen@oracle.com>,
Christoph Hellwig <hch@lst.de>,
"James E . J . Bottomley" <jejb@linux.vnet.ibm.com>,
jianchao wang <jianchao.w.wang@oracle.com>
Subject: Re: [PATCH V6 9/9] nvme: hold request queue's refcount in ns's whole lifetime
Date: Wed, 17 Apr 2019 09:55:12 -0600
Message-ID: <20190417155511.GA6005@localhost.localdomain>
In-Reply-To: <20190417034410.31957-10-ming.lei@redhat.com>
On Tue, Apr 16, 2019 at 08:44:10PM -0700, Ming Lei wrote:
> Hannes reported the following kernel oops:
>
> There is a race condition between namespace rescanning and
> controller reset; during controller reset all namespaces are
> quiesced via nvme_stop_ctrl(), and after reset all namespaces
> are unquiesced again.
> When namespace scanning was active by the time controller reset
> was triggered the rescan code will call nvme_ns_remove(), which
> then will cause a kernel crash in nvme_start_ctrl() as it'll trip
> over uninitialized namespaces.
>
> Patch "blk-mq: free hw queue's resource in hctx's release handler"
> should make this issue quite difficult to trigger. However it can't
> kill the issue completely because pre-condition of that patch is to
> hold request queue's refcount before calling block layer API, and
> there is still a small window between blk_cleanup_queue() and removing
> the ns from the controller namespace list in nvme_ns_remove().
>
> Hold request queue's refcount until the ns is freed, then the above race
> can be avoided completely. Given the 'namespaces_rwsem' is always held
> to retrieve ns for starting/stopping request queue, this lock can prevent
> namespaces from being freed.
This looks good to me.
Reviewed-by: Keith Busch <keith.busch@intel.com>