All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
To: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: linux-bcache-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
	axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org,
	Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Mikulas Patocka
	<mpatocka-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	bharrosh-C4P08NqkoRlBDgjK7y7TUQ@public.gmane.org,
	david-FqsqvQoI3Ljby3iVrkZq2A@public.gmane.org
Subject: Re: [PATCH 2/2] block: Avoid deadlocks with bio allocation by stacking drivers
Date: Mon, 10 Sep 2012 13:24:35 -0700	[thread overview]
Message-ID: <20120910202435.GG16360@google.com> (raw)
In-Reply-To: <20120910172210.GC14103-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>

On Mon, Sep 10, 2012 at 10:22:10AM -0700, Tejun Heo wrote:
> Hello, Kent.
> 
> On Sun, Sep 09, 2012 at 05:28:10PM -0700, Kent Overstreet wrote:
> > > > +	while ((bio = bio_list_pop(current->bio_list)))
> > > > +		bio_list_add(bio->bi_pool == bs ? &punt : &nopunt, bio);
> > > > +
> > > > +	*current->bio_list = nopunt;
> > > 
> > > Why this is necessary needs explanation and it's done in rather
> > > unusual way.  I suppose the weirdness is from bio_list API
> > > restriction?
> > 
> > It's because bio_lists are singly linked, so deleting an entry from the
> > middle of the list would be a real pain - just much cleaner/simpler to
> > do it this way.
> 
> Yeah, I wonder how benefical that singly linked list is.  Eh well...

Well, this is the first time I can think of that it's come up, and IMO
this is no less clean a way of writing it... just a bit unusual in C,
it feels more functional to me instead of imperative.

> > > Wouldn't the following be better?
> > > 
> > > 	p = mempool_alloc(bs->bi_pool, gfp_mask);
> > > 	if (unlikely(!p) && gfp_mask != saved_gfp) {
> > > 		punt_bios_to_rescuer(bs);
> > > 		p = mempool_alloc(bs->bi_pool, saved_gfp);
> > > 	}
> > 
> > That'd require duplicating the error handling in two different places -
> > once for the initial allocation, once for the bvec allocation. And I
> > really hate that writing code that does
> > 
> > alloc_something()
> > if (fail) {
> > 	alloc_something_again()
> > }
> > 
> > it just screams ugly to me.
> 
> I don't know.  That at least represents what's going on and goto'ing
> back and forth is hardly pretty.  Sometimes the code gets much uglier
> / unwieldy and we have to live with gotos.  Here, that doesn't seem to
> be the case.

I think this is really more personal preference than anything, but:

Setting gfp_mask = saved_gfp after calling punt_bio_to_rescuer() is
really the correct thing to do, and makes the code clearer IMO: once
we've run punt_bio_to_rescuer() we don't need to mask out GFP_WAIT (not
until the next time a bio is submitted, really).

This matters a bit for the bvl allocation too, if we call
punt_bio_to_rescuer() for the bio allocation no point doing it again.

So to be rigorously correct, your way would have to be

p = mempool_alloc(bs->bio_pool, gfp_mask);
if (!p && gfp_mask != saved_gfp) {
	punt_bios_to_rescuer(bs);
	gfp_mask = saved_gfp;
	p = mempool_alloc(bs->bio_pool, gfp_mask);
}

And at that point, why duplicate that line of code? It doesn't matter that
much, but IMO a goto retry better labels what's actually going on (it's
something that's not uncommon in the kernel and if I see a retry label
in a function I pretty immediately have an idea of what's going on).

So we could do

retry:
p = mempool_alloc(bs->bio_pool, gfp_mask);
if (!p && gfp_mask != saved_gfp) {
	punt_bios_to_rescuer(bs);
	gfp_mask = saved_gfp;
	goto retry;
}

(side note: not that it really matters here, but gcc will inline the
bvec_alloc_bs() call if it's not duplicated, I've never seen it
consolidate duplicated code and /then/ inline based off that)

This does have the advantage that we're not freeing and reallocating the
bio like Vivek pointed out, but I'm not a huge fan of having the
punting/retry logic in the main code path.

I don't care that much though. I'd prefer not to have the actual
allocations duplicated, but it's starting to feel like bikeshedding to
me.

> > +static void punt_bios_to_rescuer(struct bio_set *bs)
> > +{
> > +	struct bio_list punt, nopunt;
> > +	struct bio *bio;
> > +
> > +	/*
> > +	 * Don't want to punt all bios on current->bio_list; if there was a bio
> > +	 * on there for a stacking driver higher up in the stack, processing it
> > +	 * could require allocating bios from this bio_set, and we don't want to
> > +	 * do that from our own rescuer.
> 
> Hmmm... isn't it more like we "must" process only the bios which are
> from this bio_set to have any kind of forward-progress guarantee?  The
> above sounds like it's just something undesirable.

Yeah, that'd be better, I'll change it.

WARNING: multiple messages have this Message-ID (diff)
From: Kent Overstreet <koverstreet@google.com>
To: Tejun Heo <tj@kernel.org>
Cc: linux-bcache@vger.kernel.org, linux-kernel@vger.kernel.org,
	dm-devel@redhat.com, axboe@kernel.dk,
	Vivek Goyal <vgoyal@redhat.com>,
	Mikulas Patocka <mpatocka@redhat.com>,
	bharrosh@panasas.com, david@fromorbit.com
Subject: Re: [PATCH 2/2] block: Avoid deadlocks with bio allocation by stacking drivers
Date: Mon, 10 Sep 2012 13:24:35 -0700	[thread overview]
Message-ID: <20120910202435.GG16360@google.com> (raw)
In-Reply-To: <20120910172210.GC14103@google.com>

On Mon, Sep 10, 2012 at 10:22:10AM -0700, Tejun Heo wrote:
> Hello, Kent.
> 
> On Sun, Sep 09, 2012 at 05:28:10PM -0700, Kent Overstreet wrote:
> > > > +	while ((bio = bio_list_pop(current->bio_list)))
> > > > +		bio_list_add(bio->bi_pool == bs ? &punt : &nopunt, bio);
> > > > +
> > > > +	*current->bio_list = nopunt;
> > > 
> > > Why this is necessary needs explanation and it's done in rather
> > > unusual way.  I suppose the weirdness is from bio_list API
> > > restriction?
> > 
> > It's because bio_lists are singly linked, so deleting an entry from the
> > middle of the list would be a real pain - just much cleaner/simpler to
> > do it this way.
> 
> Yeah, I wonder how benefical that singly linked list is.  Eh well...

Well, this is the first time I can think of that it's come up, and IMO
this is no less clean a way of writing it... just a bit unusual in C,
it feels more functional to me instead of imperative.

> > > Wouldn't the following be better?
> > > 
> > > 	p = mempool_alloc(bs->bi_pool, gfp_mask);
> > > 	if (unlikely(!p) && gfp_mask != saved_gfp) {
> > > 		punt_bios_to_rescuer(bs);
> > > 		p = mempool_alloc(bs->bi_pool, saved_gfp);
> > > 	}
> > 
> > That'd require duplicating the error handling in two different places -
> > once for the initial allocation, once for the bvec allocation. And I
> > really hate that writing code that does
> > 
> > alloc_something()
> > if (fail) {
> > 	alloc_something_again()
> > }
> > 
> > it just screams ugly to me.
> 
> I don't know.  That at least represents what's going on and goto'ing
> back and forth is hardly pretty.  Sometimes the code gets much uglier
> / unwieldy and we have to live with gotos.  Here, that doesn't seem to
> be the case.

I think this is really more personal preference than anything, but:

Setting gfp_mask = saved_gfp after calling punt_bio_to_rescuer() is
really the correct thing to do, and makes the code clearer IMO: once
we've run punt_bio_to_rescuer() we don't need to mask out GFP_WAIT (not
until the next time a bio is submitted, really).

This matters a bit for the bvl allocation too, if we call
punt_bio_to_rescuer() for the bio allocation no point doing it again.

So to be rigorously correct, your way would have to be

p = mempool_alloc(bs->bio_pool, gfp_mask);
if (!p && gfp_mask != saved_gfp) {
	punt_bios_to_rescuer(bs);
	gfp_mask = saved_gfp;
	p = mempool_alloc(bs->bio_pool, gfp_mask);
}

And at that point, why duplicate that line of code? It doesn't matter that
much, but IMO a goto retry better labels what's actually going on (it's
something that's not uncommon in the kernel and if I see a retry label
in a function I pretty immediately have an idea of what's going on).

So we could do

retry:
p = mempool_alloc(bs->bio_pool, gfp_mask);
if (!p && gfp_mask != saved_gfp) {
	punt_bios_to_rescuer(bs);
	gfp_mask = saved_gfp;
	goto retry;
}

(side note: not that it really matters here, but gcc will inline the
bvec_alloc_bs() call if it's not duplicated, I've never seen it
consolidate duplicated code and /then/ inline based off that)

This does have the advantage that we're not freeing and reallocating the
bio like Vivek pointed out, but I'm not a huge fan of having the
punting/retry logic in the main code path.

I don't care that much though. I'd prefer not to have the actual
allocations duplicated, but it's starting to feel like bikeshedding to
me.

> > +static void punt_bios_to_rescuer(struct bio_set *bs)
> > +{
> > +	struct bio_list punt, nopunt;
> > +	struct bio *bio;
> > +
> > +	/*
> > +	 * Don't want to punt all bios on current->bio_list; if there was a bio
> > +	 * on there for a stacking driver higher up in the stack, processing it
> > +	 * could require allocating bios from this bio_set, and we don't want to
> > +	 * do that from our own rescuer.
> 
> Hmmm... isn't it more like we "must" process only the bios which are
> from this bio_set to have any kind of forward-progress guarantee?  The
> above sounds like it's just something undesirable.

Yeah, that'd be better, I'll change it.

  parent reply	other threads:[~2012-09-10 20:24 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-09-07 22:12 [PATCH 0/2] Avoid deadlocks with bio allocation Kent Overstreet
2012-09-07 22:12 ` Kent Overstreet
     [not found] ` <1347055973-11581-1-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-07 22:12   ` [PATCH 1/2] block: Reorder struct bio_set Kent Overstreet
2012-09-07 22:12     ` Kent Overstreet
2012-09-07 22:12 ` [PATCH 2/2] block: Avoid deadlocks with bio allocation by stacking drivers Kent Overstreet
     [not found]   ` <1347055973-11581-3-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-08 19:36     ` Tejun Heo
2012-09-08 19:36       ` Tejun Heo
     [not found]       ` <20120908193641.GB12773-RcKxWJ4Cfj1J2suj2OqeGauc2jM2gXBXkQQo+JxHRPFibQn6LdNjmg@public.gmane.org>
2012-09-10  0:28         ` Kent Overstreet
2012-09-10  0:28           ` Kent Overstreet
     [not found]           ` <20120910002810.GA23241-jC9Py7bek1znysI04z7BkA@public.gmane.org>
2012-09-10 15:25             ` Vivek Goyal
2012-09-10 15:25               ` Vivek Goyal
2012-09-10 17:22             ` Tejun Heo
2012-09-10 17:22               ` Tejun Heo
     [not found]               ` <20120910172210.GC14103-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-10 20:24                 ` Kent Overstreet [this message]
2012-09-10 20:24                   ` Kent Overstreet
     [not found]                   ` <20120910202435.GG16360-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-10 20:40                     ` Tejun Heo
2012-09-10 20:40                       ` Tejun Heo
     [not found]                       ` <20120910204010.GA32310-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-10 21:33                         ` Kent Overstreet
2012-09-10 21:33                           ` Kent Overstreet
     [not found]                           ` <20120910213349.GH16360-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-10 21:37                             ` Tejun Heo
2012-09-10 21:37                               ` Tejun Heo
2012-09-10 21:56                               ` Kent Overstreet
     [not found]                                 ` <20120910215633.GA19739-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-10 22:09                                   ` Tejun Heo
2012-09-10 22:09                                     ` Tejun Heo
     [not found]                                     ` <20120910220910.GB7677-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-10 22:50                                       ` [dm-devel] " Alasdair G Kergon
2012-09-10 22:50                                         ` Alasdair G Kergon
     [not found]                                         ` <20120910225057.GA10477-FDJ95KluN3Z0klwcnFlA1dvLeJWuRmrY@public.gmane.org>
2012-09-10 23:01                                           ` Tejun Heo
2012-09-10 23:01                                             ` Tejun Heo
2012-09-10 23:06                                             ` Kent Overstreet
     [not found]                                             ` <CAOS58YO5puV2iPhwe0vbyEydPnR=eWPStNRm=xEpMeo-TMHCaw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-09-10 23:09                                               ` Alasdair G Kergon
2012-09-10 23:09                                                 ` Alasdair G Kergon
     [not found]                                                 ` <20120910230917.GB10477-FDJ95KluN3Z0klwcnFlA1dvLeJWuRmrY@public.gmane.org>
2012-09-10 23:35                                                   ` Tejun Heo
2012-09-10 23:35                                                     ` Tejun Heo
     [not found]                                                     ` <20120910233502.GH7677-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-10 23:45                                                       ` Alasdair G Kergon
2012-09-10 23:45                                                         ` Alasdair G Kergon
2012-09-10 23:01                                           ` Kent Overstreet
2012-09-10 23:01                                             ` Kent Overstreet
2012-09-10 23:13                                       ` Tejun Heo
2012-09-10 23:13                                         ` Tejun Heo
2012-09-11 18:36                                   ` Muthu Kumar
2012-09-11 18:36                                     ` Muthu Kumar
     [not found]                                     ` <CAFR8ued=Le4B4i_Rf4eECXsVgKvce9J7bO9sDSS91PFzBZ6j5w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-09-11 18:45                                       ` Tejun Heo
2012-09-11 18:45                                         ` Tejun Heo
     [not found]                                         ` <20120911184553.GT7677-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-11 18:58                                           ` Muthu Kumar
2012-09-11 18:58                                             ` Muthu Kumar
2012-09-11 19:31                                             ` Kent Overstreet
     [not found]                                               ` <20120911193124.GI19739-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-09-11 20:00                                                 ` Muthu Kumar
2012-09-11 20:00                                                   ` Muthu Kumar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120910202435.GG16360@google.com \
    --to=koverstreet-hpiqsd4aklfqt0dzr+alfa@public.gmane.org \
    --cc=axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org \
    --cc=bharrosh-C4P08NqkoRlBDgjK7y7TUQ@public.gmane.org \
    --cc=david-FqsqvQoI3Ljby3iVrkZq2A@public.gmane.org \
    --cc=dm-devel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=linux-bcache-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=mpatocka-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
    --cc=vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.