From: Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Kent Overstreet <koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
Cc: NeilBrown <neilb-l3A5Bk7waGM@public.gmane.org>,
axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org,
dm-devel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org,
linux-bcache-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
agk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org
Subject: Re: [dm-devel] [PATCH 12/13] Make generic_make_request handle arbitrarily large bios
Date: Mon, 21 May 2012 13:17:06 -0400 [thread overview]
Message-ID: <20120521171706.GH23993@redhat.com> (raw)
In-Reply-To: <20120518081444.GA27205-RcKxWJ4Cfj3IzGYXcIpNmNLIRw13R84JkQQo+JxHRPFibQn6LdNjmg@public.gmane.org>
On Fri, May 18, 2012 at 04:14:44AM -0400, Kent Overstreet wrote:
> On Fri, May 18, 2012 at 06:05:50PM +1000, NeilBrown wrote:
> >
> > Hi Kent,
> > there is lots of good stuff in this series and I certainly hope a lot of it
> > can eventually go upstream.
> >
> > However there is a part of this patch that I don't think is safe:
> >
> >
> > > +static void __bio_submit_split(struct closure *cl)
> > > +{
> > > + struct bio_split_hook *s = container_of(cl, struct bio_split_hook, cl);
> > > + struct bio *bio = s->bio, *n;
> > > +
> > > + do {
> > > + n = bio_split(bio, bio_max_sectors(bio),
> > > + GFP_NOIO, s->q->bio_split);
> > > + if (!n)
> > > + continue_at(cl, __bio_submit_split, bio_split_wq);
> > > +
> > > + closure_get(cl);
> > > + generic_make_request(n);
> > > + } while (n != bio);
> > > +
> > > + continue_at(cl, bio_submit_split_done, NULL);
> > > +}
> >
> > Firstly a small point: Can bio_split ever return NULL here? I don't
> > think it can, so there is no need to test.
> > But if it can, then calling generic_make_request(NULL) doesn't seem like a
> > good idea.
> >
> > More significantly though::
> > This is called from generic_make_request which can be called recursively and
> > enforces a tail-recursion semantic.
> > If that generic_make_request was a recursive call, then the
> > generic_make_request in __bio_submit_split will not start the request, but
> > will queue the bio for later handling. If we then call bio_split again, we
> > could try to allocation from a mempool while we are holding one entry
> > allocated from that pool captive. This can deadlock.
> >
> > i.e. if the original bio is so large that it needs to be split into 3 pieces,
> > then we will try to allocate the second piece before the first piece has a
> > chance to be released. If this happens in enough threads to exhaust the pool
> > (4 I think), it will deadlock.
> >
> > I realise this sounds like a very unlikely case, but of course they happen.
> >
> > One possible approach might be to count how many splits will be required,
> > then have an interface to mempools so you can allocate them all at once,
> > or block and wait.
>
> Yeah, I actually thought of that (I probably should've documented it,
> though).
>
> That's why the code is checking if bio_split() returned NULL; my verison
> of bio_split() checks if current->bio_list is non NULL, and if it is it
> doesn't pass __GFP_WAIT to bio_alloc_bioset().
>
> (I was debating whether bio_split() should do that itself or leave it up
> to the caller. I tend to think it's too easy to accidentally screw up -
> and not notice it - so it should be enforced by generic code. Possibly
> the check should be moved to bio_alloc_bioset(), though.)
>
> If we do get a memory allocation failure, then we just punt to workqueue
> - continue_at() runs __bio_submit_split from the bio_split workqueue -
> where we can safely use the mempool.
May be I am missing something, hence I will ask. Is punting to workqueue
will really solve the issue raised by Neil.
Due to spliting required, we will be holding some bios in the stack and
these bios can't be submitted till further allocation from pool happens. So
will it matter whether we are waiting for allocation in submitting process
context or in worker thread context.
IOW, say you have a pool of 2 bios. We allocate 1 bio (say bio A), and submit
it for IO (now 1 bio left in pool). Now, bio A needs to be split up, so we
allocate bio B and submit it (pool is empty now). Now we try to submit bio
B and this also needs to be split. There are no more free bios so we will
wait for some to get free but none of the bios (A and B) have actually
been submitted for IO so nothing will get freed and we have a deadlock
(This is assuming that memory is tight enough that we are not able to do
any allocations from the slab backing the mempool).
biodoc.txt says following.
**************************************************************************
The caller of bio_alloc is expected to taken certain steps to avoid
deadlocks, e.g. avoid trying to allocate more memory from the pool while
already holding memory obtained from the pool.
[TBD: This is a potential issue, though a rare possibility
in the bounce bio allocation that happens in the current code, since
it ends up allocating a second bio from the same pool while
holding the original bio ]
**************************************************************************
Thanks
Vivek
next prev parent reply other threads:[~2012-05-21 17:17 UTC|newest]
Thread overview: 48+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-05-18 2:59 [PATCH 00/13] Block cleanups (for bcache) koverstreet
2012-05-18 2:59 ` [PATCH 01/13] block: Generalized bio pool freeing koverstreet
[not found] ` <ea774fea2a27c9f1028a12ce31a7ee5e5517bef4.1337308722.git.koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-05-18 15:55 ` Tejun Heo
[not found] ` <20120518155538.GA19388-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-05-18 16:14 ` [dm-devel] " Alasdair G Kergon
2012-05-18 2:59 ` [PATCH 02/13] dm: kill dm_rq_bio_destructor koverstreet
2012-05-18 15:57 ` Tejun Heo
[not found] ` <20120518155729.GB19388-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-05-18 16:43 ` [dm-devel] " Alasdair G Kergon
[not found] ` <20120518164319.GJ29330-FDJ95KluN3Z0klwcnFlA1dvLeJWuRmrY@public.gmane.org>
2012-05-18 18:50 ` Kent Overstreet
[not found] ` <20120518185027.GA9673-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-05-22 4:29 ` Jun'ichi Nomura
2012-05-18 2:59 ` [PATCH 03/13] block: Add bio_clone_bioset() koverstreet
2012-05-18 16:05 ` Tejun Heo
2012-05-18 20:31 ` Kent Overstreet
[not found] ` <eb6a7d3fe7ae203202bc365d7274ee531631a9ca.1337308722.git.koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-05-18 16:11 ` [dm-devel] " Vivek Goyal
2012-05-18 18:55 ` Kent Overstreet
2012-05-18 2:59 ` [PATCH 04/13] block: Add bio_clone_kmalloc() koverstreet
[not found] ` <1c7c2d4b89bc3d0e907608cec37bcf0ee50f4c0e.1337308722.git.koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-05-18 16:09 ` Tejun Heo
[not found] ` <20120518160903.GD19388-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-05-18 20:39 ` Kent Overstreet
2012-05-18 16:45 ` Boaz Harrosh
2012-05-18 2:59 ` [PATCH 05/13] block: Only clone bio vecs that are in use koverstreet
2012-05-18 16:13 ` Tejun Heo
2012-05-18 21:14 ` Kent Overstreet
2012-05-18 2:59 ` [PATCH 06/13] block: Add bio_reset() koverstreet
2012-05-18 16:16 ` Tejun Heo
[not found] ` <20120518161608.GF19388-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-05-18 21:48 ` Kent Overstreet
2012-05-18 2:59 ` [PATCH 07/13] pktcdvd: Switch to bio_kmalloc() koverstreet
[not found] ` <04a4d6c2c8b6f0097e3594c6e0932093afffc1da.1337308722.git.koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-05-18 16:18 ` Tejun Heo
2012-05-18 2:59 ` [PATCH 08/13] block: Kill bi_destructor koverstreet
[not found] ` <de05855cf0fa4800cfab7e8340f106dccc7a75a1.1337308722.git.koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-05-18 16:21 ` Tejun Heo
[not found] ` <20120518162142.GH19388-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-05-18 22:21 ` Kent Overstreet
2012-05-18 2:59 ` [PATCH 09/13] block: Add an explicit bio flag for bios that own their bvec koverstreet
[not found] ` <363875943e9d0e13bee6ed28239280543e6e5055.1337308722.git.koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-05-18 16:30 ` Tejun Heo
2012-05-18 21:49 ` Kent Overstreet
2012-05-18 17:07 ` Boaz Harrosh
2012-05-18 2:59 ` [PATCH 10/13] block: Rework bio splitting koverstreet
[not found] ` <ac4c1cbd10934fdc3a4af74d4cb5ec370f9139d5.1337308722.git.koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-05-18 17:07 ` Tejun Heo
2012-05-18 17:46 ` Boaz Harrosh
[not found] ` <cover.1337308722.git.koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-05-18 2:59 ` [PATCH 11/13] Closures koverstreet-hpIqsD4AKlfQT0dZR+AlfA
[not found] ` <a184989cfbf92297d4cca2f823e5ac24ec8fe1e3.1337308722.git.koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-05-18 17:08 ` Tejun Heo
2012-05-18 2:59 ` [PATCH 12/13] Make generic_make_request handle arbitrarily large bios koverstreet
2012-05-18 8:05 ` NeilBrown
[not found] ` <20120518180550.0a6cdc34-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2012-05-18 8:14 ` Kent Overstreet
[not found] ` <20120518081444.GA27205-RcKxWJ4Cfj3IzGYXcIpNmNLIRw13R84JkQQo+JxHRPFibQn6LdNjmg@public.gmane.org>
2012-05-21 17:17 ` Vivek Goyal [this message]
2012-05-21 17:55 ` [dm-devel] " Kent Overstreet
2012-05-21 18:32 ` Vivek Goyal
2012-05-18 17:52 ` Tejun Heo
2012-05-19 0:59 ` [dm-devel] " Alasdair G Kergon
[not found] ` <1114e7019b0055fc09a54b59b36398d5c54f5e32.1337308722.git.koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-05-18 22:48 ` Mikulas Patocka
2012-05-18 3:00 ` [PATCH 13/13] Gut bio_add_page() koverstreet
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120521171706.GH23993@redhat.com \
--to=vgoyal-h+wxahxf7alqt0dzr+alfa@public.gmane.org \
--cc=agk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
--cc=axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org \
--cc=dm-devel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
--cc=koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org \
--cc=linux-bcache-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=neilb-l3A5Bk7waGM@public.gmane.org \
--cc=tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).