public inbox for linux-kernel@vger.kernel.org
From: Tejun Heo <tj@kernel.org>
To: Kent Overstreet <koverstreet@google.com>
Cc: linux-bcache@vger.kernel.org, linux-kernel@vger.kernel.org,
	dm-devel@redhat.com, axboe@kernel.dk,
	Vivek Goyal <vgoyal@redhat.com>,
	Mikulas Patocka <mpatocka@redhat.com>,
	bharrosh@panasas.com, david@fromorbit.com
Subject: Re: [PATCH 2/2] block: Avoid deadlocks with bio allocation by stacking drivers
Date: Mon, 10 Sep 2012 15:09:10 -0700
Message-ID: <20120910220910.GB7677@google.com>
In-Reply-To: <20120910215633.GA19739@google.com>

Hello, Kent.

On Mon, Sep 10, 2012 at 02:56:33PM -0700, Kent Overstreet wrote:
> commit df7e63cbffa3065fcc4ba2b9a93418d7c7312243
> Author: Kent Overstreet <koverstreet@google.com>
> Date:   Mon Sep 10 14:33:46 2012 -0700
> 
>     block: Avoid deadlocks with bio allocation by stacking drivers
>     
>     Previously, if we ever try to allocate more than once from the same bio
>     set while running under generic_make_request() (i.e. a stacking block
>     driver), we risk deadlock.
>     
>     This is because of the code in generic_make_request() that converts
>     recursion to iteration; any bios we submit won't actually be submitted
>     (so they can complete and eventually be freed) until after we return -
>     this means if we allocate a second bio, we're blocking the first one
>     from ever being freed.
>     
>     Thus if enough threads call into a stacking block driver at the same
>     time with bios that need multiple splits, and the bio_set's reserve gets
>     used up, we deadlock.
>     
>     This can be worked around in the driver code - we could check if we're
>     running under generic_make_request(), then mask out __GFP_WAIT when we
>     go to allocate a bio, and if the allocation fails punt to workqueue and
>     retry the allocation.
>     
>     But this is tricky and not a generic solution. This patch solves it for
>     all users by inverting the previously described technique. We allocate a
>     rescuer workqueue for each bio_set, and then in the allocation code if
>     there are bios on current->bio_list we would be blocking, we punt them
>     to the rescuer workqueue to be submitted.
>     
>     This guarantees forward progress for bio allocations under
>     generic_make_request() provided each bio is submitted before allocating
>     the next, and provided the bios are freed after they complete.
>     
>     Note that this doesn't do anything for allocation from other mempools.
>     Instead of allocating per bio data structures from a mempool, code
>     should use bio_set's front_pad.
>     
>     Tested it by forcing the rescue codepath to be taken (by disabling the
>     first GFP_NOWAIT attempt), and then ran it with bcache (which does a lot
>     of arbitrary bio splitting) and verified that the rescuer was being
>     invoked.
>     
>     Signed-off-by: Kent Overstreet <koverstreet@google.com>
>     CC: Jens Axboe <axboe@kernel.dk>
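
For anyone following along, the mechanism described in the commit message
can be modelled in user space roughly as below. This is only a sketch and
not the patch itself: every toy_* name and TOY_RESERVE are made up for
illustration, the list stands in for current->bio_list, a "rescued" bio is
assumed to complete immediately, and all queued bios are assumed to come
from the same bio_set.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_RESERVE 2   /* size of the toy reserve; hypothetical value */

struct toy_bio {
	struct toy_bio *next;
};

struct toy_bio_set {
	struct toy_bio slots[TOY_RESERVE];
	struct toy_bio *free;   /* free list of reserve entries */
	int rescued;            /* bios submitted by the rescuer */
};

/* Stand-in for current->bio_list: bios queued but not yet submitted. */
struct toy_bio_list {
	struct toy_bio *head, *tail;
};

static void toy_bioset_init(struct toy_bio_set *bs)
{
	bs->free = NULL;
	bs->rescued = 0;
	for (int i = 0; i < TOY_RESERVE; i++) {
		bs->slots[i].next = bs->free;
		bs->free = &bs->slots[i];
	}
}

/* Completion path: the bio goes back to the reserve. */
static void toy_bio_put(struct toy_bio_set *bs, struct toy_bio *bio)
{
	bio->next = bs->free;
	bs->free = bio;
}

/* Under generic_make_request(), submitting only queues the bio. */
static void toy_submit(struct toy_bio_list *list, struct toy_bio *bio)
{
	bio->next = NULL;
	if (list->tail)
		list->tail->next = bio;
	else
		list->head = bio;
	list->tail = bio;
}

/*
 * Rescuer: take over the queued bios and really submit them. In this
 * model a submitted bio completes at once and returns to the reserve.
 */
static void toy_rescue(struct toy_bio_set *bs, struct toy_bio_list *list)
{
	struct toy_bio *bio = list->head;

	list->head = list->tail = NULL;
	while (bio) {
		struct toy_bio *next = bio->next;

		toy_bio_put(bs, bio);
		bs->rescued++;
		bio = next;
	}
}

static struct toy_bio *toy_bio_alloc(struct toy_bio_set *bs,
				     struct toy_bio_list *list)
{
	struct toy_bio *bio;

	/* Reserve empty and we are holding bios back: punt them first. */
	if (!bs->free && list->head) {
		toy_rescue(bs, list);
		assert(bs->free);   /* a rescue always refills the reserve */
	}

	bio = bs->free;
	if (bio)
		bs->free = bio->next;
	return bio;   /* NULL is where real code would block on the reserve */
}

/* A stacking driver splitting one request into nr pieces. */
static int toy_split(struct toy_bio_set *bs, struct toy_bio_list *list, int nr)
{
	int done = 0;

	while (done < nr) {
		struct toy_bio *bio = toy_bio_alloc(bs, list);

		if (!bio)
			break;
		toy_submit(list, bio);   /* submit before allocating the next */
		done++;
	}
	return done;
}
```

With a reserve of two, toy_split() for five pieces runs to completion
because the queued bios get punted whenever the reserve runs dry. If the
caller instead allocates twice without submitting, the third allocation
has nothing to rescue and returns NULL, which is the case the "provided
each bio is submitted before allocating the next" condition excludes.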

I'm still a bit scared but think this is correct.

 Acked-by: Tejun Heo <tj@kernel.org>

One last thing is that we may want to add @name on bioset creation so
that we can name the workqueue properly but that's for another patch.

Thanks.

-- 
tejun
