* FAILED: patch "[PATCH] dm: fix redundant IO accounting for bios that need splitting" failed to apply to 4.20-stable tree
From: gregkh @ 2019-01-28 12:50 UTC
To: snitzer, bgurney, ming.lei; +Cc: stable
The patch below does not apply to the 4.20-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a1e1cb72d96491277ede8d257ce6b48a381dd336 Mon Sep 17 00:00:00 2001
From: Mike Snitzer <snitzer@redhat.com>
Date: Thu, 17 Jan 2019 10:48:01 -0500
Subject: [PATCH] dm: fix redundant IO accounting for bios that need splitting
The risk of redundant IO accounting was not taken into consideration
when commit 18a25da84354 ("dm: ensure bio submission follows a
depth-first tree walk") introduced IO splitting in terms of recursion
via generic_make_request().
Fix this by subtracting the split bio's payload from the IO stats that
were already accounted for by start_io_acct() upon dm_make_request()
entry. This repeat oscillation of the IO accounting, up then down,
isn't ideal but refactoring DM core's IO splitting to pre-split bios
_before_ they are accounted turned out to be an excessive amount of
change that will need a full development cycle to refine and verify.
Before this fix:
/dev/mapper/stripe_dev is a 4-way stripe using a 32k chunksize, so
bios are split on 32k boundaries.
# fio --name=16M --filename=/dev/mapper/stripe_dev --rw=write --bs=64k --size=16M \
--iodepth=1 --ioengine=libaio --direct=1 --refill_buffers
with debugging added:
[103898.310264] device-mapper: core: start_io_acct: dm-2 WRITE bio->bi_iter.bi_sector=0 len=128
[103898.318704] device-mapper: core: __split_and_process_bio: recursing for following split bio:
[103898.329136] device-mapper: core: start_io_acct: dm-2 WRITE bio->bi_iter.bi_sector=64 len=64
...
16M written yet 136M (278528 * 512b) accounted:
# cat /sys/block/dm-2/stat | awk '{ print $7 }'
278528
After this fix:
16M written and 16M (32768 * 512b) accounted:
# cat /sys/block/dm-2/stat | awk '{ print $7 }'
32768
Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
Cc: stable@vger.kernel.org # 4.16+
Reported-by: Bryan Gurney <bgurney@redhat.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index fcb97b0a5743..fbadda68e23b 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1584,6 +1584,9 @@ static void init_clone_info(struct clone_info *ci, struct mapped_device *md,
 	ci->sector = bio->bi_iter.bi_sector;
 }
 
+#define __dm_part_stat_sub(part, field, subnd) \
+	(part_stat_get(part, field) -= (subnd))
+
 /*
  * Entry point to split a bio into clones and submit them to the targets.
  */
@@ -1638,6 +1641,19 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
 				struct bio *b = bio_split(bio, bio_sectors(bio) - ci.sector_count,
 							  GFP_NOIO, &md->queue->bio_split);
 				ci.io->orig_bio = b;
+
+				/*
+				 * Adjust IO stats for each split, otherwise upon queue
+				 * reentry there will be redundant IO accounting.
+				 * NOTE: this is a stop-gap fix, a proper fix involves
+				 * significant refactoring of DM core's bio splitting
+				 * (by eliminating DM's splitting and just using bio_split)
+				 */
+				part_stat_lock();
+				__dm_part_stat_sub(&dm_disk(md)->part0,
+					sectors[op_stat_group(bio_op(bio))], ci.sector_count);
+				part_stat_unlock();
+
 				bio_chain(b, bio);
 				ret = generic_make_request(bio);
 				break;
* Re: FAILED: patch "[PATCH] dm: fix redundant IO accounting for bios that need splitting" failed to apply to 4.20-stable tree
From: Mike Snitzer @ 2019-01-28 15:31 UTC
To: gregkh; +Cc: bgurney, ming.lei, stable
On Mon, Jan 28 2019 at 7:50am -0500,
gregkh@linuxfoundation.org <gregkh@linuxfoundation.org> wrote:
> The patch below does not apply to the 4.20-stable tree.
> If someone wants it applied there, or to any other stable or longterm
> tree, then please email the backport, including the original git commit
> id to <stable@vger.kernel.org>.
> [...]
Seems to apply fine.. not sure what the problem is on your end:
$ git checkout stable/linux-4.20.y
Previous HEAD position was 8fe28cb58bcb... Linux 4.20
HEAD is now at 9f1a389a0b5b... Linux 4.20.5
$ git show a1e1cb72d96491277ede8d257ce6b48a381dd336 | patch -p1 --dry
patching file drivers/md/dm.c
Hunk #1 succeeded at 1578 (offset -6 lines).
Hunk #2 succeeded at 1626 (offset -15 lines).
$ git cherry-pick a1e1cb72d96491277ede8d257ce6b48a381dd336
[detached HEAD 3d6015ea633a] dm: fix redundant IO accounting for bios that need splitting
Date: Thu Jan 17 10:48:01 2019 -0500
1 file changed, 16 insertions(+)
* Re: FAILED: patch "[PATCH] dm: fix redundant IO accounting for bios that need splitting" failed to apply to 4.20-stable tree
From: Greg KH @ 2019-01-28 16:00 UTC
To: Mike Snitzer; +Cc: bgurney, ming.lei, stable
On Mon, Jan 28, 2019 at 10:31:41AM -0500, Mike Snitzer wrote:
> Seems to apply fine.. not sure what the problem is on your end:
> [...]
> $ git cherry-pick a1e1cb72d96491277ede8d257ce6b48a381dd336
> [detached HEAD 3d6015ea633a] dm: fix redundant IO accounting for bios that need splitting
> Date: Thu Jan 17 10:48:01 2019 -0500
> 1 file changed, 16 insertions(+)
Try building it, it blows up into tiny pieces :)
I guess I need a different script that says, "the patch applied, but
broke the build", but it is so rare it's almost not worth it...
thanks,
greg k-h
* Re: FAILED: patch "[PATCH] dm: fix redundant IO accounting for bios that need splitting" failed to apply to 4.20-stable tree
From: Mike Snitzer @ 2019-01-28 17:18 UTC
To: Greg KH; +Cc: bgurney, ming.lei, stable, axboe
On Mon, Jan 28 2019 at 11:00am -0500,
Greg KH <gregkh@linuxfoundation.org> wrote:
> [...]
> Try building it, it blows up into tiny pieces :)
>
> I guess I need a different script that says, "the patch applied, but
> broke the build", but it is so rare it's almost not worth it...
Ah, gotcha. Because of part_stat_get() it implicitly depends on commit
1226b8dd0e913 but that shouldn't go to stable.
stable@ would need to factor out __part_stat_sub(), like
__part_stat_add(), with part_stat_sub() updated to use __part_stat_sub().
As is, existing part_stat_sub() is broken on all kernels (and with no
callers nobody cares).
Now that I've woken the dragon (Jens) and told him I papered over block
core's broken part_stat_sub() in DM.. I'll do whatever Jens wants me to
do ;)
Mike
p.s. since this bug has existed for 1.5 years maybe nobody cares that
DM's io stats are completely bogus and we can just ignore it? Yeah,
that is a cop-out...