All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Mike Snitzer <snitzer@redhat.com>
Cc: Jan Kara <jack@suse.cz>,
	dm-devel@redhat.com, Jens Axboe <axboe@kernel.dk>,
	linux-kernel@vger.kernel.org, Chris Mason <clm@fb.com>
Subject: Re: [PATCH for-4.2 01/14] block: remove management of bi_remaining when restoring original bi_end_io
Date: Mon, 18 May 2015 17:36:44 +0200	[thread overview]
Message-ID: <20150518153644.GA10187@quack.suse.cz> (raw)
In-Reply-To: <20150518131358.GA13998@redhat.com>

On Mon 18-05-15 09:13:59, Mike Snitzer wrote:
> On Mon, May 18 2015 at  3:22am -0400,
> Jan Kara <jack@suse.cz> wrote:
> 
> > On Thu 14-05-15 17:04:59, Mike Snitzer wrote:
> > > Commit c4cf5261 ("bio: skip atomic inc/dec of ->bi_remaining for
> > > non-chains") regressed all existing callers that followed this pattern:
> > >  1) saving a bio's original bi_end_io
> > >  2) wiring up an intermediate bi_end_io
> > >  3) restoring the original bi_end_io from intermediate bi_end_io
> > >  4) calling bio_endio() to execute the restored original bi_end_io
> > > 
> > > The regression was due to BIO_CHAIN only ever getting set if
> > > bio_inc_remaining() is called.  For the above pattern it isn't set until
> > > step 3 above (step 2 would've needed to establish BIO_CHAIN).  As such
> > > the first bio_endio(), in step 2 above, never decremented __bi_remaining
> > > before calling the intermediate bi_end_io -- leaving __bi_remaining with
> > > the value 1 instead of 0.  When bio_inc_remaining() occurred during step
> > > 3 it brought it to a value of 2.  When the second bio_endio() was
> > > called, in step 4 above, it should've called the original bi_end_io but
> > > it didn't because there was an extra reference that wasn't dropped (due
> > > to atomic operations being optimized away since BIO_CHAIN wasn't set
> > > upfront).
> > > 
> > > Fix this issue by removing the __bi_remaining management complexity for
> > > all callers that use the above pattern -- bio_chain() is the only
> > > interface that _needs_ to be concerned with __bi_remaining.  For the
> > > above pattern callers just expect the bi_end_io they set to get called!
> > > Remove bio_endio_nodec() and also remove all bio_inc_remaining() calls
> > > that aren't associated with the bio_chain() interface.
> > > 
> > > The bio_inc_remaining() interface has been left exported because it is
> > > still useful for more elaborate uses of bio_chain() -- it will be used
> > > in an upcoming DM commit "dm thin: range discard support".
> > > 
> > > Fixes: c4cf5261 ("bio: skip atomic inc/dec of ->bi_remaining for non-chains")
> > > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > > Cc: Jan Kara <jack@suse.cz>
> > > Cc: Chris Mason <clm@fb.com>
> >   One question: What happens if you stack dm-thin on top of e.g. dm-linear?
> > dm-thin will do it's thing to a bio and passes it to dm-linear. That will
> > split & chain the bio so BIO_CHAIN will be set. And on IO completion you
> > will have troubles in dm-thinp as now bi_remaining gets decremented in
> > bio_endio(). That's the reason why I suggested that we should clear
> > BIO_CHAIN once bi_remaining hits zero...
> 
> I think you need to be more precise in explaining the scenario you're
> concerned about.  Could be there is an issue but I'm not seeing it yet.
> 
> Are you referring to the patch that makes DM thinp use the proposed
> blkdev_issue_discard_async() interface?  The bios issued to DM linear
> are generated by blkdev_issue_discard_async().  By using bio_chain()
> they establish ancestory with the parent DM thinp bio (which has
> had BIO_CHAIN set even before calling blkdev_issue_discard_async because
> there is potential for DM thinp to complete the parent bio before all N
> blkdev_issue_discard_async() generated bios complete -- so that is why
> DM thinp itself takes an extra reference on the parent bio using
> bio_inc_remaining() before calling blkdev_issue_discard_async)

  No, I'm not referring to your proposed interface. I'm referring to
current kernel + your patch to remove bio_inc_remaining() from all the dm
targets. Ah, after checking again I see where misunderstanding may have
come from - the device below has to be handled by drivers/md/linear.c which
is MD linear driver, not DM one. I confused those two. Anyway here is the
failure I envision (and frankly, I don't understand dm details much so I may
be just completely wrong but I'd like to understand what prevents the following
from happening):
* We have dm-thin stacked on top of drivers/dm/linear.c
* FS issues bio to dm-thin. remap_and_issue_overwrite() sets bi_end_io to
  overwrite_endio. dm-thin eventually calls generic_make_request(bio).
* Now linear_make_request() gets called and it ends up calling
  bio_chain(split, bio). This sets BIO_CHAIN on bio.
* IO for all chained bios is completed. So bio->bi_remaining is now zero,
  bio still has BIO_CHAIN set and overwrite_endio gets called.
* process_prepared_mapping() will eventually try to call original bi_end_io
  callback but that never happens because bi_remaining is 0 and BIO_CHAIN
  remained set.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

  reply	other threads:[~2015-05-18 15:36 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-14 21:04 [PATCH for-4.2 00/14] block, dm: first batch of changes for 4.2 Mike Snitzer
2015-05-14 21:04 ` [PATCH for-4.2 01/14] block: remove management of bi_remaining when restoring original bi_end_io Mike Snitzer
2015-05-18  7:22   ` Jan Kara
2015-05-18 13:13     ` Mike Snitzer
2015-05-18 15:36       ` Jan Kara [this message]
2015-05-18 15:59         ` Mike Snitzer
2015-05-18 20:40           ` [PATCH for-4.2 v2 " Mike Snitzer
2015-05-19  6:28             ` Christoph Hellwig
2015-05-19  7:20             ` Jan Kara
2015-05-18  8:24   ` [dm-devel] [PATCH for-4.2 " Christoph Hellwig
2015-05-18 13:20     ` Mike Snitzer
2015-05-14 21:05 ` [PATCH for-4.2 02/14] block: remove export for blk_queue_bio Mike Snitzer
2015-05-14 21:05 ` [PATCH for-4.2 03/14] block, dm: don't copy bios for request clones Mike Snitzer
2015-05-14 21:05 ` [PATCH for-4.2 04/14] block: factor out blkdev_issue_discard_async Mike Snitzer
2015-05-18  8:27   ` [dm-devel] " Christoph Hellwig
2015-05-18 13:32     ` Mike Snitzer
2015-05-18 16:17       ` [dm-devel] " Christoph Hellwig
2015-05-18 19:18         ` Mike Snitzer
2015-05-19  8:32           ` Christoph Hellwig
2015-05-14 21:05 ` [PATCH for-4.2 05/14] dm: do not allocate any mempools for blk-mq request-based DM Mike Snitzer
2015-05-14 21:05 ` [PATCH for-4.2 06/14] dm: rename methods that requeue requests Mike Snitzer
2015-05-18  8:29   ` Christoph Hellwig
2015-05-18 15:44     ` Mike Snitzer
2015-05-14 21:05 ` [PATCH for-4.2 07/14] dm: factor out a common cleanup_mapped_device() Mike Snitzer
2015-05-14 21:05 ` [PATCH for-4.2 08/14] dm btree: add dm_btree_remove_leaves() Mike Snitzer
2015-05-14 21:05 ` [PATCH for-4.2 09/14] dm thin metadata: add dm_thin_find_mapped_range() Mike Snitzer
2015-05-14 21:05 ` [PATCH for-4.2 10/14] dm thin metadata: add dm_thin_remove_range() Mike Snitzer
2015-05-14 21:05 ` [PATCH for-4.2 11/14] dm thin: range discard support Mike Snitzer
2015-05-14 21:05 ` [PATCH for-4.2 12/14] dm thin: cleanup overwrite's endio restore to be centralized Mike Snitzer
2015-05-14 21:05 ` [PATCH for-4.2 13/14] dm thin: cleanup schedule_zero() to read more logically Mike Snitzer
2015-05-14 21:05 ` [PATCH for-4.2 14/14] dm thin metadata: remove in-core 'read_only' flag Mike Snitzer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150518153644.GA10187@quack.suse.cz \
    --to=jack@suse.cz \
    --cc=axboe@kernel.dk \
    --cc=clm@fb.com \
    --cc=dm-devel@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=snitzer@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.