All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
To: Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
	axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org,
	Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Mikulas Patocka
	<mpatocka-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	bharrosh-C4P08NqkoRlBDgjK7y7TUQ@public.gmane.org,
	david-FqsqvQoI3Ljby3iVrkZq2A@public.gmane.org
Subject: Re: [PATCH 2/2] block: Avoid deadlocks with bio allocation by stacking drivers
Date: Mon, 10 Sep 2012 15:09:10 -0700	[thread overview]
Message-ID: <20120910220910.GB7677@google.com> (raw)
In-Reply-To: <20120910215633.GA19739-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>

Hello, Kent.

On Mon, Sep 10, 2012 at 02:56:33PM -0700, Kent Overstreet wrote:
> commit df7e63cbffa3065fcc4ba2b9a93418d7c7312243
> Author: Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
> Date:   Mon Sep 10 14:33:46 2012 -0700
> 
>     block: Avoid deadlocks with bio allocation by stacking drivers
>     
>     Previously, if we ever try to allocate more than once from the same bio
>     set while running under generic_make_request() (i.e. a stacking block
>     driver), we risk deadlock.
>     
>     This is because of the code in generic_make_request() that converts
>     recursion to iteration; any bios we submit won't actually be submitted
>     (so they can complete and eventually be freed) until after we return -
>     this means if we allocate a second bio, we're blocking the first one
>     from ever being freed.
>     
>     Thus if enough threads call into a stacking block driver at the same
>     time with bios that need multiple splits, and the bio_set's reserve gets
>     used up, we deadlock.
>     
>     This can be worked around in the driver code - we could check if we're
>     running under generic_make_request(), then mask out __GFP_WAIT when we
>     go to allocate a bio, and if the allocation fails punt to workqueue and
>     retry the allocation.
>     
>     But this is tricky and not a generic solution. This patch solves it for
>     all users by inverting the previously described technique. We allocate a
>     rescuer workqueue for each bio_set, and then in the allocation code if
>     there are bios on current->bio_list we would be blocking, we punt them
>     to the rescuer workqueue to be submitted.
>     
>     This guarantees forward progress for bio allocations under
>     generic_make_request() provided each bio is submitted before allocating
>     the next, and provided the bios are freed after they complete.
>     
>     Note that this doesn't do anything for allocation from other mempools.
>     Instead of allocating per bio data structures from a mempool, code
>     should use bio_set's front_pad.
>     
>     Tested it by forcing the rescue codepath to be taken (by disabling the
>     first GFP_NOWAIT) attempt, and then ran it with bcache (which does a lot
>     of arbitrary bio splitting) and verified that the rescuer was being
>     invoked.
>     
>     Signed-off-by: Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
>     CC: Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>

I'm still a bit scared but think this is correct.

 Acked-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>

One last thing is that we may want to add @name on bioset creation so
that we can name the workqueue properly but that's for another patch.

Thanks.

-- 
tejun

WARNING: multiple messages have this Message-ID (diff)
From: Tejun Heo <tj@kernel.org>
To: Kent Overstreet <koverstreet@google.com>
Cc: linux-bcache@vger.kernel.org, linux-kernel@vger.kernel.org,
	dm-devel@redhat.com, axboe@kernel.dk,
	Vivek Goyal <vgoyal@redhat.com>,
	Mikulas Patocka <mpatocka@redhat.com>,
	bharrosh@panasas.com, david@fromorbit.com
Subject: Re: [PATCH 2/2] block: Avoid deadlocks with bio allocation by stacking drivers
Date: Mon, 10 Sep 2012 15:09:10 -0700	[thread overview]
Message-ID: <20120910220910.GB7677@google.com> (raw)
In-Reply-To: <20120910215633.GA19739@google.com>

Hello, Kent.

On Mon, Sep 10, 2012 at 02:56:33PM -0700, Kent Overstreet wrote:
> commit df7e63cbffa3065fcc4ba2b9a93418d7c7312243
> Author: Kent Overstreet <koverstreet@google.com>
> Date:   Mon Sep 10 14:33:46 2012 -0700
> 
>     block: Avoid deadlocks with bio allocation by stacking drivers
>     
>     Previously, if we ever try to allocate more than once from the same bio
>     set while running under generic_make_request() (i.e. a stacking block
>     driver), we risk deadlock.
>     
>     This is because of the code in generic_make_request() that converts
>     recursion to iteration; any bios we submit won't actually be submitted
>     (so they can complete and eventually be freed) until after we return -
>     this means if we allocate a second bio, we're blocking the first one
>     from ever being freed.
>     
>     Thus if enough threads call into a stacking block driver at the same
>     time with bios that need multiple splits, and the bio_set's reserve gets
>     used up, we deadlock.
>     
>     This can be worked around in the driver code - we could check if we're
>     running under generic_make_request(), then mask out __GFP_WAIT when we
>     go to allocate a bio, and if the allocation fails punt to workqueue and
>     retry the allocation.
>     
>     But this is tricky and not a generic solution. This patch solves it for
>     all users by inverting the previously described technique. We allocate a
>     rescuer workqueue for each bio_set, and then in the allocation code if
>     there are bios on current->bio_list we would be blocking, we punt them
>     to the rescuer workqueue to be submitted.
>     
>     This guarantees forward progress for bio allocations under
>     generic_make_request() provided each bio is submitted before allocating
>     the next, and provided the bios are freed after they complete.
>     
>     Note that this doesn't do anything for allocation from other mempools.
>     Instead of allocating per bio data structures from a mempool, code
>     should use bio_set's front_pad.
>     
>     Tested it by forcing the rescue codepath to be taken (by disabling the
>     first GFP_NOWAIT) attempt, and then ran it with bcache (which does a lot
>     of arbitrary bio splitting) and verified that the rescuer was being
>     invoked.
>     
>     Signed-off-by: Kent Overstreet <koverstreet@google.com>
>     CC: Jens Axboe <axboe@kernel.dk>

I'm still a bit scared but think this is correct.

 Acked-by: Tejun Heo <tj@kernel.org>

One last thing is that we may want to add @name on bioset creation so
that we can name the workqueue properly but that's for another patch.

Thanks.

-- 
tejun

  parent reply	other threads:[~2012-09-10 22:09 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-09-07 22:12 [PATCH 0/2] Avoid deadlocks with bio allocation Kent Overstreet
2012-09-07 22:12 ` Kent Overstreet
     [not found] ` <1347055973-11581-1-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-07 22:12   ` [PATCH 1/2] block: Reorder struct bio_set Kent Overstreet
2012-09-07 22:12     ` Kent Overstreet
2012-09-07 22:12 ` [PATCH 2/2] block: Avoid deadlocks with bio allocation by stacking drivers Kent Overstreet
     [not found]   ` <1347055973-11581-3-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-08 19:36     ` Tejun Heo
2012-09-08 19:36       ` Tejun Heo
     [not found]       ` <20120908193641.GB12773-RcKxWJ4Cfj1J2suj2OqeGauc2jM2gXBXkQQo+JxHRPFibQn6LdNjmg@public.gmane.org>
2012-09-10  0:28         ` Kent Overstreet
2012-09-10  0:28           ` Kent Overstreet
     [not found]           ` <20120910002810.GA23241-jC9Py7bek1znysI04z7BkA@public.gmane.org>
2012-09-10 15:25             ` Vivek Goyal
2012-09-10 15:25               ` Vivek Goyal
2012-09-10 17:22             ` Tejun Heo
2012-09-10 17:22               ` Tejun Heo
     [not found]               ` <20120910172210.GC14103-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-10 20:24                 ` Kent Overstreet
2012-09-10 20:24                   ` Kent Overstreet
     [not found]                   ` <20120910202435.GG16360-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-10 20:40                     ` Tejun Heo
2012-09-10 20:40                       ` Tejun Heo
     [not found]                       ` <20120910204010.GA32310-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-10 21:33                         ` Kent Overstreet
2012-09-10 21:33                           ` Kent Overstreet
     [not found]                           ` <20120910213349.GH16360-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-10 21:37                             ` Tejun Heo
2012-09-10 21:37                               ` Tejun Heo
2012-09-10 21:56                               ` Kent Overstreet
     [not found]                                 ` <20120910215633.GA19739-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-10 22:09                                   ` Tejun Heo [this message]
2012-09-10 22:09                                     ` Tejun Heo
     [not found]                                     ` <20120910220910.GB7677-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-10 22:50                                       ` [dm-devel] " Alasdair G Kergon
2012-09-10 22:50                                         ` Alasdair G Kergon
     [not found]                                         ` <20120910225057.GA10477-FDJ95KluN3Z0klwcnFlA1dvLeJWuRmrY@public.gmane.org>
2012-09-10 23:01                                           ` Tejun Heo
2012-09-10 23:01                                             ` Tejun Heo
2012-09-10 23:06                                             ` Kent Overstreet
     [not found]                                             ` <CAOS58YO5puV2iPhwe0vbyEydPnR=eWPStNRm=xEpMeo-TMHCaw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-09-10 23:09                                               ` Alasdair G Kergon
2012-09-10 23:09                                                 ` Alasdair G Kergon
     [not found]                                                 ` <20120910230917.GB10477-FDJ95KluN3Z0klwcnFlA1dvLeJWuRmrY@public.gmane.org>
2012-09-10 23:35                                                   ` Tejun Heo
2012-09-10 23:35                                                     ` Tejun Heo
     [not found]                                                     ` <20120910233502.GH7677-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-10 23:45                                                       ` Alasdair G Kergon
2012-09-10 23:45                                                         ` Alasdair G Kergon
2012-09-10 23:01                                           ` Kent Overstreet
2012-09-10 23:01                                             ` Kent Overstreet
2012-09-10 23:13                                       ` Tejun Heo
2012-09-10 23:13                                         ` Tejun Heo
2012-09-11 18:36                                   ` Muthu Kumar
2012-09-11 18:36                                     ` Muthu Kumar
     [not found]                                     ` <CAFR8ued=Le4B4i_Rf4eECXsVgKvce9J7bO9sDSS91PFzBZ6j5w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-09-11 18:45                                       ` Tejun Heo
2012-09-11 18:45                                         ` Tejun Heo
     [not found]                                         ` <20120911184553.GT7677-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-11 18:58                                           ` Muthu Kumar
2012-09-11 18:58                                             ` Muthu Kumar
2012-09-11 19:31                                             ` Kent Overstreet
     [not found]                                               ` <20120911193124.GI19739-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-11 20:00                                                 ` Muthu Kumar
2012-09-11 20:00                                                   ` Muthu Kumar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120910220910.GB7677@google.com \
    --to=tj-dgejt+ai2ygdnm+yrofe0a@public.gmane.org \
    --cc=axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org \
    --cc=bharrosh-C4P08NqkoRlBDgjK7y7TUQ@public.gmane.org \
    --cc=david-FqsqvQoI3Ljby3iVrkZq2A@public.gmane.org \
    --cc=dm-devel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
    --cc=linux-bcache-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=mpatocka-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.