public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Mike Snitzer <snitzer@redhat.com>
Cc: Jan Kara <jack@suse.cz>,
	dm-devel@redhat.com, Jens Axboe <axboe@kernel.dk>,
	linux-kernel@vger.kernel.org, Chris Mason <clm@fb.com>
Subject: Re: [PATCH for-4.2 01/14] block: remove management of bi_remaining when restoring original bi_end_io
Date: Mon, 18 May 2015 17:36:44 +0200	[thread overview]
Message-ID: <20150518153644.GA10187@quack.suse.cz> (raw)
In-Reply-To: <20150518131358.GA13998@redhat.com>

On Mon 18-05-15 09:13:59, Mike Snitzer wrote:
> On Mon, May 18 2015 at  3:22am -0400,
> Jan Kara <jack@suse.cz> wrote:
> 
> > On Thu 14-05-15 17:04:59, Mike Snitzer wrote:
> > > Commit c4cf5261 ("bio: skip atomic inc/dec of ->bi_remaining for
> > > non-chains") regressed all existing callers that followed this pattern:
> > >  1) saving a bio's original bi_end_io
> > >  2) wiring up an intermediate bi_end_io
> > >  3) restoring the original bi_end_io from intermediate bi_end_io
> > >  4) calling bio_endio() to execute the restored original bi_end_io
> > > 
> > > The regression was due to BIO_CHAIN only ever getting set if
> > > bio_inc_remaining() is called.  For the above pattern it isn't set until
> > > step 3 above (step 2 would've needed to establish BIO_CHAIN).  As such
> > > the first bio_endio(), in step 2 above, never decremented __bi_remaining
> > > before calling the intermediate bi_end_io -- leaving __bi_remaining with
> > > the value 1 instead of 0.  When bio_inc_remaining() occurred during step
> > > 3 it brought it to a value of 2.  When the second bio_endio() was
> > > called, in step 4 above, it should've called the original bi_end_io but
> > > it didn't because there was an extra reference that wasn't dropped (due
> > > to atomic operations being optimized away since BIO_CHAIN wasn't set
> > > upfront).
> > > 
> > > Fix this issue by removing the __bi_remaining management complexity for
> > > all callers that use the above pattern -- bio_chain() is the only
> > > interface that _needs_ to be concerned with __bi_remaining.  For the
> > > above pattern callers just expect the bi_end_io they set to get called!
> > > Remove bio_endio_nodec() and also remove all bio_inc_remaining() calls
> > > that aren't associated with the bio_chain() interface.
> > > 
> > > The bio_inc_remaining() interface has been left exported because it is
> > > still useful for more elaborate uses of bio_chain() -- it will be used
> > > in an upcoming DM commit "dm thin: range discard support".
> > > 
> > > Fixes: c4cf5261 ("bio: skip atomic inc/dec of ->bi_remaining for non-chains")
> > > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > > Cc: Jan Kara <jack@suse.cz>
> > > Cc: Chris Mason <clm@fb.com>
> >   One question: What happens if you stack dm-thin on top of e.g. dm-linear?
> > dm-thin will do it's thing to a bio and passes it to dm-linear. That will
> > split & chain the bio so BIO_CHAIN will be set. And on IO completion you
> > will have troubles in dm-thinp as now bi_remaining gets decremented in
> > bio_endio(). That's the reason why I suggested that we should clear
> > BIO_CHAIN once bi_remaining hits zero...
> 
> I think you need to be more precise in explaining the scenario you're
> concerned about.  Could be there is an issue but I'm not seeing it yet.
> 
> Are you referring to the patch that makes DM thinp use the proposed
> blkdev_issue_discard_async() interface?  The bios issued to DM linear
> are generated by blkdev_issue_discard_async().  By using bio_chain()
> they establish ancestory with the parent DM thinp bio (which has
> had BIO_CHAIN set even before calling blkdev_issue_discard_async because
> there is potential for DM thinp to complete the parent bio before all N
> blkdev_issue_discard_async() generated bios complete -- so that is why
> DM thinp itself takes an extra reference on the parent bio using
> bio_inc_remaining() before calling blkdev_issue_discard_async)

  No, I'm not referring to your proposed interface. I'm referring to
current kernel + your patch to remove bio_inc_remaining() from all the dm
targets. Ah, after checking again I see where misunderstanding may have
come from - the device below has to be handled by drivers/md/linear.c which
is MD linear driver, not DM one. I confused those two. Anyway here is the
failure I envision (and frankly, I don't understand dm details much so I may
be just completely wrong but I'd like to understand what prevents the following
from happening):
* We have dm-thin stacked on top of drivers/dm/linear.c
* FS issues bio to dm-thin. remap_and_issue_overwrite() sets bi_end_io to
  overwrite_endio. dm-thin eventually calls generic_make_request(bio).
* Now linear_make_request() gets called and it ends up calling
  bio_chain(split, bio). This sets BIO_CHAIN on bio.
* IO for all chained bios is completed. So bio->bi_remaining is now zero,
  bio still has BIO_CHAIN set and overwrite_endio gets called.
* process_prepared_mapping() will eventually try to call original bi_end_io
  callback but that never happens because bi_remaining is 0 and BIO_CHAIN
  remained set.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

  reply	other threads:[~2015-05-18 15:36 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-14 21:04 [PATCH for-4.2 00/14] block, dm: first batch of changes for 4.2 Mike Snitzer
2015-05-14 21:04 ` [PATCH for-4.2 01/14] block: remove management of bi_remaining when restoring original bi_end_io Mike Snitzer
2015-05-18  7:22   ` Jan Kara
2015-05-18 13:13     ` Mike Snitzer
2015-05-18 15:36       ` Jan Kara [this message]
2015-05-18 15:59         ` Mike Snitzer
2015-05-18 20:40           ` [PATCH for-4.2 v2 " Mike Snitzer
2015-05-19  6:28             ` Christoph Hellwig
2015-05-19  7:20             ` Jan Kara
2015-05-18  8:24   ` [dm-devel] [PATCH for-4.2 " Christoph Hellwig
2015-05-18 13:20     ` Mike Snitzer
2015-05-14 21:05 ` [PATCH for-4.2 02/14] block: remove export for blk_queue_bio Mike Snitzer
2015-05-14 21:05 ` [PATCH for-4.2 03/14] block, dm: don't copy bios for request clones Mike Snitzer
2015-05-14 21:05 ` [PATCH for-4.2 04/14] block: factor out blkdev_issue_discard_async Mike Snitzer
2015-05-18  8:27   ` [dm-devel] " Christoph Hellwig
2015-05-18 13:32     ` Mike Snitzer
2015-05-18 16:17       ` [dm-devel] " Christoph Hellwig
2015-05-18 19:18         ` Mike Snitzer
2015-05-19  8:32           ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150518153644.GA10187@quack.suse.cz \
    --to=jack@suse.cz \
    --cc=axboe@kernel.dk \
    --cc=clm@fb.com \
    --cc=dm-devel@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=snitzer@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox