From: Kent Overstreet <koverstreet@google.com>
To: Boaz Harrosh <bharrosh@panasas.com>
Cc: linux-bcache@vger.kernel.org, linux-kernel@vger.kernel.org,
dm-devel@redhat.com, tj@kernel.org, axboe@kernel.dk,
agk@redhat.com, neilb@suse.de, drbd-dev@lists.linbit.com,
vgoyal@redhat.com, mpatocka@redhat.com, sage@newdream.net,
yehuda@hq.newdream.net
Subject: Re: [PATCH v4 08/12] block: Introduce new bio_split()
Date: Wed, 25 Jul 2012 16:26:11 -0700 [thread overview]
Message-ID: <20120725232611.GD8673@moria.home.lan> (raw)
In-Reply-To: <500FDEBC.9050607@panasas.com>
On Wed, Jul 25, 2012 at 02:55:40PM +0300, Boaz Harrosh wrote:
> On 07/24/2012 11:11 PM, Kent Overstreet wrote:
>
> > The new bio_split() can split arbitrary bios - it's not restricted to
> > single page bios, like the old bio_split() (previously renamed to
> > bio_pair_split()). It also has different semantics - it doesn't allocate
> > a struct bio_pair, leaving it up to the caller to handle completions.
> >
> > Signed-off-by: Kent Overstreet <koverstreet@google.com>
> > ---
> > fs/bio.c | 99 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > 1 files changed, 99 insertions(+), 0 deletions(-)
> >
> > diff --git a/fs/bio.c b/fs/bio.c
> > index 5d02aa5..a15e121 100644
> > --- a/fs/bio.c
> > +++ b/fs/bio.c
> > @@ -1539,6 +1539,105 @@ struct bio_pair *bio_pair_split(struct bio *bi, int first_sectors)
> > EXPORT_SYMBOL(bio_pair_split);
> >
> > /**
> > + * bio_split - split a bio
> > + * @bio: bio to split
> > + * @sectors: number of sectors to split from the front of @bio
> > + * @gfp: gfp mask
> > + * @bs: bio set to allocate from
> > + *
> > + * Allocates and returns a new bio which represents @sectors from the start of
> > + * @bio, and updates @bio to represent the remaining sectors.
> > + *
> > + * If bio_sectors(@bio) was less than or equal to @sectors, returns @bio
> > + * unchanged.
> > + *
> > + * The newly allocated bio will point to @bio's bi_io_vec, if the split was on a
> > + * bvec boundry; it is the caller's responsibility to ensure that @bio is not
> > + * freed before the split.
> > + *
> > + * If bio_split() is running under generic_make_request(), it's not safe to
> > + * allocate more than one bio from the same bio set. Therefore, if it is running
> > + * under generic_make_request() it masks out __GFP_WAIT when doing the
> > + * allocation. The caller must check for failure if there's any possibility of
> > + * it being called from under generic_make_request(); it is then the caller's
> > + * responsibility to retry from a safe context (by e.g. punting to workqueue).
> > + */
> > +struct bio *bio_split(struct bio *bio, int sectors,
> > + gfp_t gfp, struct bio_set *bs)
> > +{
> > + unsigned idx, vcnt = 0, nbytes = sectors << 9;
> > + struct bio_vec *bv;
> > + struct bio *ret = NULL;
> > +
> > + BUG_ON(sectors <= 0);
> > +
> > + /*
> > + * If we're being called from underneath generic_make_request() and we
> > + * already allocated any bios from this bio set, we risk deadlock if we
> > + * use the mempool. So instead, we possibly fail and let the caller punt
> > + * to workqueue or somesuch and retry in a safe context.
> > + */
> > + if (current->bio_list)
> > + gfp &= ~__GFP_WAIT;
>
>
> NACK!
>
> If as you said above in the comment:
> if there's any possibility of it being called from under generic_make_request();
> it is then the caller's responsibility to ...
>
> So all the comment needs to say is:
> ... caller's responsibility to not set __GFP_WAIT at gfp.
>
> And drop this here. It is up to the caller to decide. If the caller wants he can do
> "if (current->bio_list)" by his own.
>
> This is a general purpose utility you might not know it's context.
> for example with osdblk above will break.
Well I'm highly highly skeptical that using __GFP_WAIT under
generic_make_request() is ever a sane thing to do - it could certainly
be safe in specific circumstances, but it's just such a fragile thing to
rely on, you have to _never_ use the same bio pool more than once. And
even then I bet there's other subtle ways it could break.
But you're not the first to complain about it, and your point about
existing code is compelling.
commit ea124f899af29887e24d07497442066572012e5b
Author: Kent Overstreet <koverstreet@google.com>
Date: Wed Jul 25 16:25:10 2012 -0700
block: Introduce new bio_split()
The new bio_split() can split arbitrary bios - it's not restricted to
single page bios, like the old bio_split() (previously renamed to
bio_pair_split()). It also has different semantics - it doesn't allocate
a struct bio_pair, leaving it up to the caller to handle completions.
diff --git a/fs/bio.c b/fs/bio.c
index 0470376..312e5de 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -1537,6 +1537,102 @@ struct bio_pair *bio_pair_split(struct bio *bi, int first_sectors)
EXPORT_SYMBOL(bio_pair_split);
/**
+ * bio_split - split a bio
+ * @bio: bio to split
+ * @sectors: number of sectors to split from the front of @bio
+ * @gfp: gfp mask
+ * @bs: bio set to allocate from
+ *
+ * Allocates and returns a new bio which represents @sectors from the start of
+ * @bio, and updates @bio to represent the remaining sectors.
+ *
+ * If bio_sectors(@bio) was less than or equal to @sectors, returns @bio
+ * unchanged.
+ *
+ * The newly allocated bio will point to @bio's bi_io_vec, if the split was on a
+ * bvec boundry; it is the caller's responsibility to ensure that @bio is not
+ * freed before the split.
+ *
+ * BIG FAT WARNING:
+ *
+ * If you're calling this from under generic_make_request() (i.e.
+ * current->bio_list != NULL), you should mask out __GFP_WAIT and punt to
+ * workqueue if the allocation fails. Otherwise, your code will probably
+ * deadlock.
+ *
+ * You can't allocate more than once from the same bio pool without submitting
+ * the previous allocations (so they'll eventually complete and deallocate
+ * themselves), but if you're under generic_make_request() those previous
+ * allocations won't submit until you return . And if you have to split bios,
+ * you should expect that some bios will require multiple splits.
+ */
+struct bio *bio_split(struct bio *bio, int sectors,
+ gfp_t gfp, struct bio_set *bs)
+{
+ unsigned idx, vcnt = 0, nbytes = sectors << 9;
+ struct bio_vec *bv;
+ struct bio *ret = NULL;
+
+ BUG_ON(sectors <= 0);
+
+ if (sectors >= bio_sectors(bio))
+ return bio;
+
+ trace_block_split(bdev_get_queue(bio->bi_bdev), bio,
+ bio->bi_sector + sectors);
+
+ bio_for_each_segment(bv, bio, idx) {
+ vcnt = idx - bio->bi_idx;
+
+ if (!nbytes) {
+ ret = bio_alloc_bioset(gfp, 0, bs);
+ if (!ret)
+ return NULL;
+
+ ret->bi_io_vec = bio_iovec(bio);
+ ret->bi_flags |= 1 << BIO_CLONED;
+ break;
+ } else if (nbytes < bv->bv_len) {
+ ret = bio_alloc_bioset(gfp, ++vcnt, bs);
+ if (!ret)
+ return NULL;
+
+ memcpy(ret->bi_io_vec, bio_iovec(bio),
+ sizeof(struct bio_vec) * vcnt);
+
+ ret->bi_io_vec[vcnt - 1].bv_len = nbytes;
+ bv->bv_offset += nbytes;
+ bv->bv_len -= nbytes;
+ break;
+ }
+
+ nbytes -= bv->bv_len;
+ }
+
+ ret->bi_bdev = bio->bi_bdev;
+ ret->bi_sector = bio->bi_sector;
+ ret->bi_size = sectors << 9;
+ ret->bi_rw = bio->bi_rw;
+ ret->bi_vcnt = vcnt;
+ ret->bi_max_vecs = vcnt;
+ ret->bi_end_io = bio->bi_end_io;
+ ret->bi_private = bio->bi_private;
+
+ bio->bi_sector += sectors;
+ bio->bi_size -= sectors << 9;
+ bio->bi_idx = idx;
+
+ if (bio_integrity(bio)) {
+ bio_integrity_clone(ret, bio, gfp, bs);
+ bio_integrity_trim(ret, 0, bio_sectors(ret));
+ bio_integrity_trim(bio, bio_sectors(ret), bio_sectors(bio));
+ }
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(bio_split);
+
+/**
* bio_sector_offset - Find hardware sector offset in bio
* @bio: bio to inspect
* @index: bio_vec index
next prev parent reply other threads:[~2012-07-25 23:26 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-07-24 20:11 [PATCH v4 00/12] Block cleanups Kent Overstreet
2012-07-24 20:11 ` [PATCH v4 01/12] block: Generalized bio pool freeing Kent Overstreet
[not found] ` <1343160689-12378-2-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-07-25 11:06 ` Boaz Harrosh
[not found] ` <500FD32F.2010809-C4P08NqkoRlBDgjK7y7TUQ@public.gmane.org>
2012-07-25 23:38 ` Kent Overstreet
2012-07-24 20:11 ` [PATCH v4 02/12] dm: Use bioset's front_pad for dm_rq_clone_bio_info Kent Overstreet
2012-07-24 20:11 ` [PATCH v4 03/12] block: Add bio_reset() Kent Overstreet
[not found] ` <1343160689-12378-4-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-07-25 11:19 ` Boaz Harrosh
[not found] ` <500FD63F.7050501-C4P08NqkoRlBDgjK7y7TUQ@public.gmane.org>
2012-07-25 22:56 ` Kent Overstreet
2012-07-24 20:11 ` [PATCH v4 04/12] pktcdvd: Switch to bio_kmalloc() Kent Overstreet
[not found] ` <1343160689-12378-5-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-07-25 11:29 ` Boaz Harrosh
[not found] ` <500FD8B7.9040701-C4P08NqkoRlBDgjK7y7TUQ@public.gmane.org>
2012-07-25 23:01 ` Kent Overstreet
2012-07-24 20:11 ` [PATCH v4 05/12] block: Kill bi_destructor Kent Overstreet
[not found] ` <1343160689-12378-6-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-07-25 11:39 ` Boaz Harrosh
[not found] ` <500FDB0D.4070605-C4P08NqkoRlBDgjK7y7TUQ@public.gmane.org>
2012-07-25 23:15 ` Kent Overstreet
2012-07-24 20:11 ` [PATCH v4 06/12] block: Add an explicit bio flag for bios that own their bvec Kent Overstreet
2012-07-24 20:11 ` [PATCH v4 08/12] block: Introduce new bio_split() Kent Overstreet
2012-07-25 11:55 ` Boaz Harrosh
2012-07-25 23:26 ` Kent Overstreet [this message]
2012-07-27 0:50 ` [PATCH] A possible deadlock with stacked devices (was: [PATCH v4 08/12] block: Introduce new bio_split()) Mikulas Patocka
2012-08-15 23:07 ` Kent Overstreet
[not found] ` <20120815230715.GD2758-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-08-29 16:08 ` Mikulas Patocka
[not found] ` <1343160689-12378-1-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-07-24 20:11 ` [PATCH v4 07/12] block: Rename bio_split() -> bio_pair_split() Kent Overstreet
2012-07-24 20:11 ` [PATCH v4 09/12] block: Rework bio_pair_split() Kent Overstreet
[not found] ` <1343160689-12378-10-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-07-25 12:03 ` Boaz Harrosh
2012-07-24 20:11 ` [PATCH v4 10/12] block: Add bio_clone_kmalloc() Kent Overstreet
[not found] ` <1343160689-12378-11-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-07-25 12:05 ` Boaz Harrosh
2012-07-24 20:11 ` [PATCH v4 11/12] block: Add bio_clone_bioset() Kent Overstreet
2012-07-24 20:11 ` [PATCH v4 12/12] block: Only clone bio vecs that are in use Kent Overstreet
[not found] ` <1343160689-12378-13-git-send-email-koverstreet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2012-08-07 3:17 ` Muthu Kumar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120725232611.GD8673@moria.home.lan \
--to=koverstreet@google.com \
--cc=agk@redhat.com \
--cc=axboe@kernel.dk \
--cc=bharrosh@panasas.com \
--cc=dm-devel@redhat.com \
--cc=drbd-dev@lists.linbit.com \
--cc=linux-bcache@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mpatocka@redhat.com \
--cc=neilb@suse.de \
--cc=sage@newdream.net \
--cc=tj@kernel.org \
--cc=vgoyal@redhat.com \
--cc=yehuda@hq.newdream.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).