linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jens Axboe <axboe@kernel.dk>
To: Tejun Heo <tj@kernel.org>
Cc: linux-kernel@vger.kernel.org, vgoyal@redhat.com,
	ctalbott@google.com, ni@google.com
Subject: Re: [PATCH 10/10] block: fix request_queue lifetime handling by making blk_queue_cleanup() proper shutdown
Date: Wed, 19 Oct 2011 14:43:46 +0200	[thread overview]
Message-ID: <4E9EC602.5010207@kernel.dk> (raw)
In-Reply-To: <1318998384-22525-11-git-send-email-tj@kernel.org>

On 2011-10-19 06:26, Tejun Heo wrote:
> request_queue is refcounted but actually depdends on lifetime
> management from the queue owner - on blk_cleanup_queue(), block layer
> expects that there's no request passing through request_queue and no
> new one will.
> 
> This is fundamentally broken.  The queue owner (e.g. SCSI layer)
> doesn't have a way to know whether there are other active users before
> calling blk_cleanup_queue() and other users (e.g. bsg) don't have any
> guarantee that the queue is and would stay valid while it's holding a
> reference.
> 
> With delay added in blk_queue_bio() before queue_lock is grabbed, the
> following oops can be easily triggered when a device is removed with
> in-flight IOs.
> 
>  sd 0:0:1:0: [sdb] Stopping disk
>  ata1.01: disabled
>  general protection fault: 0000 [#1] PREEMPT SMP
>  CPU 2
>  Modules linked in:
> 
>  Pid: 648, comm: test_rawio Not tainted 3.1.0-rc3-work+ #56 Bochs Bochs
>  RIP: 0010:[<ffffffff8137d651>]  [<ffffffff8137d651>] elv_rqhash_find+0x61/0x100
>  ...
>  Process test_rawio (pid: 648, threadinfo ffff880019efa000, task ffff880019ef8a80)
>  ...
>  Call Trace:
>   [<ffffffff8137d774>] elv_merge+0x84/0xe0
>   [<ffffffff81385b54>] blk_queue_bio+0xf4/0x400
>   [<ffffffff813838ea>] generic_make_request+0xca/0x100
>   [<ffffffff81383994>] submit_bio+0x74/0x100
>   [<ffffffff811c53ec>] dio_bio_submit+0xbc/0xc0
>   [<ffffffff811c610e>] __blockdev_direct_IO+0x92e/0xb40
>   [<ffffffff811c39f7>] blkdev_direct_IO+0x57/0x60
>   [<ffffffff8113b1c5>] generic_file_aio_read+0x6d5/0x760
>   [<ffffffff8118c1ca>] do_sync_read+0xda/0x120
>   [<ffffffff8118ce55>] vfs_read+0xc5/0x180
>   [<ffffffff8118cfaa>] sys_pread64+0x9a/0xb0
>   [<ffffffff81afaf6b>] system_call_fastpath+0x16/0x1b
> 
> This happens because blk_queue_cleanup() destroys the queue and
> elevator whether IOs are in progress or not and DEAD tests are
> sprinkled in the request processing path without proper
> synchronization.
> 
> Similar problem exists for blk-throtl.  On queue cleanup, blk-throtl
> is shutdown whether it has requests in it or not.  Depending on
> timing, it either oopses or throttled bios are lost putting tasks
> which are waiting for bio completion into eternal D state.
> 
> The way it should work is having the usual clear distinction between
> shutdown and release.  Shutdown drains all currently pending requests,
> marks the queue dead, and performs partial teardown of the now
> unnecessary part of the queue.  Even after shutdown is complete,
> reference holders are still allowed to issue requests to the queue
> although they will be immmediately failed.  The rest of teardown
> happens on release.
> 
> This patch makes the following changes to make blk_queue_cleanup()
> behave as proper shutdown.
> 
> * QUEUE_FLAG_DEAD is now set while holding both q->exit_mutex and
>   queue_lock.
> 
> * Unsynchronized DEAD check in generic_make_request_checks() removed.
>   This couldn't make any meaningful difference as the queue could die
>   after the check.
> 
> * blk_drain_queue() updated such that it can drain all requests and is
>   now called during cleanup.
> 
> * blk_throtl updated such that it checks DEAD on grabbing queue_lock,
>   drains all throttled bios during cleanup and free td when queue is
>   released.

This patch clashes a bit with the previous "fix" for getting rid of
privately assigned queue locks. I've kept that single bit.

-- 
Jens Axboe


  reply	other threads:[~2011-10-19 12:43 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-10-19  4:26 [PATCHSET block/for-next] fix request_queue life-cycle management Tejun Heo
2011-10-19  4:26 ` [PATCH 01/10] block: make gendisk hold a reference to its queue Tejun Heo
2011-10-19  4:26 ` [PATCH 02/10] block: fix genhd refcounting in blkio_policy_parse_and_set() Tejun Heo
2011-10-19 13:26   ` Vivek Goyal
2011-10-19 16:29     ` Tejun Heo
2011-10-19 16:59       ` Vivek Goyal
2011-10-19 22:05         ` Tejun Heo
2011-10-19 22:07           ` Tejun Heo
2011-10-19 23:51             ` Tejun Heo
2011-10-20 13:41               ` Vivek Goyal
2011-10-20 16:11                 ` Tejun Heo
2011-10-20 16:16                   ` Kay Sievers
2011-10-20 17:50                     ` Vivek Goyal
2011-10-20 17:47                   ` Vivek Goyal
2011-10-19  4:26 ` [PATCH 03/10] block: move blk_throtl prototypes to block/blk.h Tejun Heo
2011-10-19 13:33   ` Vivek Goyal
2011-10-19  4:26 ` [PATCH 04/10] block: pass around REQ_* flags instead of broken down booleans during request alloc/free Tejun Heo
2011-10-19 13:44   ` Vivek Goyal
2011-10-19 16:31     ` Tejun Heo
2011-10-19  4:26 ` [PATCH 05/10] block: drop unnecessary blk_get/put_queue() in scsi_cmd_ioctl() and blk_get_tg() Tejun Heo
2011-10-19 13:52   ` Vivek Goyal
2011-10-19 16:35     ` Tejun Heo
2011-10-19  4:26 ` [PATCH 06/10] block: reorganize queue draining Tejun Heo
2011-10-19  4:26 ` [PATCH 07/10] block: reorganize throtl_get_tg() and blk_throtl_bio() Tejun Heo
2011-10-19 14:56   ` Vivek Goyal
2011-10-19 17:06     ` Tejun Heo
2011-10-19 17:19       ` Vivek Goyal
2011-10-19 17:30         ` Tejun Heo
2011-10-19 17:45           ` Vivek Goyal
2011-10-19 17:49             ` Tejun Heo
2011-10-19  4:26 ` [PATCH 08/10] block: make get_request[_wait]() fail if queue is dead Tejun Heo
2011-10-19 15:22   ` Vivek Goyal
2011-10-19  4:26 ` [PATCH 09/10] block: drop @tsk from attempt_plug_merge() and explain sync rules Tejun Heo
2011-10-19  4:26 ` [PATCH 10/10] block: fix request_queue lifetime handling by making blk_queue_cleanup() proper shutdown Tejun Heo
2011-10-19 12:43   ` Jens Axboe [this message]
2011-10-19 17:13     ` Tejun Heo
2011-10-19 18:04       ` Jens Axboe
2011-10-19 16:18   ` Vivek Goyal
2011-10-19 17:12     ` Tejun Heo
2011-10-19 17:29       ` Vivek Goyal
2011-10-19 17:33         ` Tejun Heo
2011-10-19  4:29 ` [PATCHSET block/for-next] fix request_queue life-cycle management Tejun Heo
2011-10-19 12:44 ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4E9EC602.5010207@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=ctalbott@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ni@google.com \
    --cc=tj@kernel.org \
    --cc=vgoyal@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).