From: Mike Snitzer <snitzer@redhat.com>
To: Greg KH <gregkh@linuxfoundation.org>
Cc: bgurney@redhat.com, ming.lei@redhat.com, stable@vger.kernel.org,
axboe@kernel.dk
Subject: Re: FAILED: patch "[PATCH] dm: fix redundant IO accounting for bios that need splitting" failed to apply to 4.20-stable tree
Date: Mon, 28 Jan 2019 12:18:12 -0500
Message-ID: <20190128171811.GA22768@redhat.com>
In-Reply-To: <20190128160013.GA28714@kroah.com>

On Mon, Jan 28 2019 at 11:00am -0500,
Greg KH <gregkh@linuxfoundation.org> wrote:
> On Mon, Jan 28, 2019 at 10:31:41AM -0500, Mike Snitzer wrote:
> > On Mon, Jan 28 2019 at 7:50am -0500,
> > gregkh@linuxfoundation.org <gregkh@linuxfoundation.org> wrote:
> >
> > >
> > > The patch below does not apply to the 4.20-stable tree.
> > > If someone wants it applied there, or to any other stable or longterm
> > > tree, then please email the backport, including the original git commit
> > > id to <stable@vger.kernel.org>.
> > >
> > > thanks,
> > >
> > > greg k-h
> > >
> > > ------------------ original commit in Linus's tree ------------------
> > >
> > > From a1e1cb72d96491277ede8d257ce6b48a381dd336 Mon Sep 17 00:00:00 2001
> > > From: Mike Snitzer <snitzer@redhat.com>
> > > Date: Thu, 17 Jan 2019 10:48:01 -0500
> > > Subject: [PATCH] dm: fix redundant IO accounting for bios that need splitting
> > >
> > > The risk of redundant IO accounting was not taken into consideration
> > > when commit 18a25da84354 ("dm: ensure bio submission follows a
> > > depth-first tree walk") introduced IO splitting in terms of recursion
> > > via generic_make_request().
> > >
> > > Fix this by subtracting the split bio's payload from the IO stats that
> > > were already accounted for by start_io_acct() upon dm_make_request()
> > > entry. This repeat oscillation of the IO accounting, up then down,
> > > isn't ideal but refactoring DM core's IO splitting to pre-split bios
> > > _before_ they are accounted turned out to be an excessive amount of
> > > change that will need a full development cycle to refine and verify.
> > >
> > > Before this fix:
> > >
> > > /dev/mapper/stripe_dev is a 4-way stripe using a 32k chunksize, so
> > > bios are split on 32k boundaries.
> > >
> > > # fio --name=16M --filename=/dev/mapper/stripe_dev --rw=write --bs=64k --size=16M \
> > > --iodepth=1 --ioengine=libaio --direct=1 --refill_buffers
> > >
> > > with debugging added:
> > > [103898.310264] device-mapper: core: start_io_acct: dm-2 WRITE bio->bi_iter.bi_sector=0 len=128
> > > [103898.318704] device-mapper: core: __split_and_process_bio: recursing for following split bio:
> > > [103898.329136] device-mapper: core: start_io_acct: dm-2 WRITE bio->bi_iter.bi_sector=64 len=64
> > > ...
> > >
> > > 16M written yet 136M (278528 * 512b) accounted:
> > > # cat /sys/block/dm-2/stat | awk '{ print $7 }'
> > > 278528
> > >
> > > After this fix:
> > >
> > > 16M written and 16M (32768 * 512b) accounted:
> > > # cat /sys/block/dm-2/stat | awk '{ print $7 }'
> > > 32768
> > >
> > > Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
> > > Cc: stable@vger.kernel.org # 4.16+
> > > Reported-by: Bryan Gurney <bgurney@redhat.com>
> > > Reviewed-by: Ming Lei <ming.lei@redhat.com>
> > > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > >
> > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > index fcb97b0a5743..fbadda68e23b 100644
> > > --- a/drivers/md/dm.c
> > > +++ b/drivers/md/dm.c
> > > @@ -1584,6 +1584,9 @@ static void init_clone_info(struct clone_info *ci, struct mapped_device *md,
> > > ci->sector = bio->bi_iter.bi_sector;
> > > }
> > >
> > > +#define __dm_part_stat_sub(part, field, subnd) \
> > > + (part_stat_get(part, field) -= (subnd))
> > > +
> > > /*
> > > * Entry point to split a bio into clones and submit them to the targets.
> > > */
> > > @@ -1638,6 +1641,19 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
> > > struct bio *b = bio_split(bio, bio_sectors(bio) - ci.sector_count,
> > > GFP_NOIO, &md->queue->bio_split);
> > > ci.io->orig_bio = b;
> > > +
> > > + /*
> > > + * Adjust IO stats for each split, otherwise upon queue
> > > + * reentry there will be redundant IO accounting.
> > > + * NOTE: this is a stop-gap fix, a proper fix involves
> > > + * significant refactoring of DM core's bio splitting
> > > + * (by eliminating DM's splitting and just using bio_split)
> > > + */
> > > + part_stat_lock();
> > > + __dm_part_stat_sub(&dm_disk(md)->part0,
> > > + sectors[op_stat_group(bio_op(bio))], ci.sector_count);
> > > + part_stat_unlock();
> > > +
> > > bio_chain(b, bio);
> > > ret = generic_make_request(bio);
> > > break;
> > >
> >
> > Seems to apply fine.. not sure what the problem is on your end:
> >
> > $ git checkout stable/linux-4.20.y
> > Previous HEAD position was 8fe28cb58bcb... Linux 4.20
> > HEAD is now at 9f1a389a0b5b... Linux 4.20.5
> >
> > $ git show a1e1cb72d96491277ede8d257ce6b48a381dd336 | patch -p1 --dry
> > patching file drivers/md/dm.c
> > Hunk #1 succeeded at 1578 (offset -6 lines).
> > Hunk #2 succeeded at 1626 (offset -15 lines).
> >
> > $ git cherry-pick a1e1cb72d96491277ede8d257ce6b48a381dd336
> > [detached HEAD 3d6015ea633a] dm: fix redundant IO accounting for bios that need splitting
> > Date: Thu Jan 17 10:48:01 2019 -0500
> > 1 file changed, 16 insertions(+)
>
> Try building it, it blows up into tiny pieces :)
>
> I guess I need a different script that says, "the patch applied, but
> broke the build", but it is so rare it's almost not worth it...

Ah, gotcha. Because it uses part_stat_get(), the fix implicitly depends on
commit 1226b8dd0e913, but that commit shouldn't go to stable.

For stable@, someone would need to factor out __part_stat_sub(), like
__part_stat_add(), and update part_stat_sub() to use __part_stat_sub().
As is, the existing part_stat_sub() is broken on all kernels (and with no
callers, nobody cares).
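
To make the up-then-down accounting concrete, here is a minimal userspace sketch in C. It is not kernel code and does not use the block layer's part_stat helpers; struct io_stats, acct_start(), acct_sub() and submit_bio_model() are hypothetical names made up for illustration, and the model assumes every bio starts on a chunk boundary. It only mirrors the sequence in the debug output quoted above: the whole bio is accounted on entry, the remainder past the split point re-enters and is accounted again, and the fix subtracts that remainder before it re-enters.

```c
/* Userspace model only; all names below are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stdint.h>

struct io_stats {
	uint64_t sectors;	/* total sectors accounted so far */
};

/* Models the accounting done each time a bio enters the DM request path. */
static void acct_start(struct io_stats *st, uint64_t nr_sectors)
{
	st->sectors += nr_sectors;
}

/* Models the subtraction: take back sectors that will be accounted again. */
static void acct_sub(struct io_stats *st, uint64_t nr_sectors)
{
	assert(st->sectors >= nr_sectors);	/* never drive the counter negative */
	st->sectors -= nr_sectors;
}

/*
 * Submit one bio of nr_sectors to a device that splits on chunk_sectors
 * boundaries. Each loop iteration stands for one entry into the request
 * path; the remainder after the split is what re-enters next.
 */
static void submit_bio_model(struct io_stats *st, uint64_t nr_sectors,
			     uint64_t chunk_sectors, int apply_fix)
{
	while (nr_sectors > 0) {
		uint64_t head = nr_sectors < chunk_sectors ?
				nr_sectors : chunk_sectors;
		uint64_t rest = nr_sectors - head;

		acct_start(st, nr_sectors);	/* whole bio accounted on entry */
		if (rest > 0 && apply_fix)
			acct_sub(st, rest);	/* remainder is re-accounted later */
		nr_sectors = rest;		/* remainder re-enters the path */
	}
}
```

In this model, a 128-sector write split on 64-sector chunks is accounted as 192 sectors without the subtraction and as 128 sectors with it. The model makes no attempt to reproduce the exact 278528-sector figure from the commit message.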

Now that I've woken the dragon (Jens) and told him I papered over block
core's broken part_stat_sub() in DM... I'll do whatever Jens wants me to
do ;)

Mike

p.s. since this bug has existed for 1.5 years maybe nobody cares that
DM's io stats are completely bogus and we can just ignore it? Yeah,
that is a cop-out...