From: Ming Lei <ming.lei@redhat.com>
To: Mike Snitzer <snitzer@redhat.com>
Cc: axboe@kernel.dk, linux-block@vger.kernel.org,
	dm-devel@redhat.com, NeilBrown <neilb@suse.com>
Subject: Re: [PATCH 2/4] dm: fix redundant IO accounting for bios that need splitting
Date: Mon, 21 Jan 2019 11:52:04 +0800
Message-ID: <20190121035203.GC24610@ming.t460p>
In-Reply-To: <20190119180506.1300-3-snitzer@redhat.com>

On Sat, Jan 19, 2019 at 01:05:04PM -0500, Mike Snitzer wrote:
> The risk of redundant IO accounting was not taken into consideration
> when commit 18a25da84354 ("dm: ensure bio submission follows a
> depth-first tree walk") introduced IO splitting in terms of recursion
> via generic_make_request().
> 
> Fix this by subtracting the split bio's payload from the IO stats that
> were already accounted for by start_io_acct() upon dm_make_request()
> entry.  This repeat oscillation of the IO accounting, up then down,
> isn't ideal but refactoring DM core's IO splitting to pre-split bios
> _before_ they are accounted turned out to be an excessive amount of
> change that will need a full development cycle to refine and verify.
> 
> Before this fix:
> 
>   /dev/mapper/stripe_dev is a 4-way stripe using a 32k chunksize, so
>   bios are split on 32k boundaries.
> 
>   # fio --name=16M --filename=/dev/mapper/stripe_dev --rw=write --bs=64k --size=16M \
>     	--iodepth=1 --ioengine=libaio --direct=1 --refill_buffers
> 
>   with debugging added:
>   [103898.310264] device-mapper: core: start_io_acct: dm-2 WRITE bio->bi_iter.bi_sector=0 len=128
>   [103898.318704] device-mapper: core: __split_and_process_bio: recursing for following split bio:
>   [103898.329136] device-mapper: core: start_io_acct: dm-2 WRITE bio->bi_iter.bi_sector=64 len=64
>   ...
> 
>   16M written yet 136M (278528 * 512b) accounted:
>   # cat /sys/block/dm-2/stat | awk '{ print $7 }'
>   278528
> 
> After this fix:
> 
>   16M written and 16M (32768 * 512b) accounted:
>   # cat /sys/block/dm-2/stat | awk '{ print $7 }'
>   32768
> 
> Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
> Cc: stable@vger.kernel.org # 4.16+
> Reported-by: Bryan Gurney <bgurney@redhat.com>
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  drivers/md/dm.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index fcb97b0a5743..fbadda68e23b 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1584,6 +1584,9 @@ static void init_clone_info(struct clone_info *ci, struct mapped_device *md,
>  	ci->sector = bio->bi_iter.bi_sector;
>  }
>  
> +#define __dm_part_stat_sub(part, field, subnd)	\
> +	(part_stat_get(part, field) -= (subnd))
> +
>  /*
>   * Entry point to split a bio into clones and submit them to the targets.
>   */
> @@ -1638,6 +1641,19 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
>  				struct bio *b = bio_split(bio, bio_sectors(bio) - ci.sector_count,
>  							  GFP_NOIO, &md->queue->bio_split);
>  				ci.io->orig_bio = b;
> +
> +				/*
> +				 * Adjust IO stats for each split, otherwise upon queue
> +				 * reentry there will be redundant IO accounting.
> +				 * NOTE: this is a stop-gap fix, a proper fix involves
> +				 * significant refactoring of DM core's bio splitting
> +				 * (by eliminating DM's splitting and just using bio_split)
> +				 */
> +				part_stat_lock();
> +				__dm_part_stat_sub(&dm_disk(md)->part0,
> +						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
> +				part_stat_unlock();
> +
>  				bio_chain(b, bio);
>  				ret = generic_make_request(bio);
>  				break;

This approach is a bit ugly, but it looks like it works and it is simple,
especially since a DM target may accept a partial bio, so:

Reviewed-by: Ming Lei <ming.lei@redhat.com>

Thanks,
Ming
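
The accounting behaviour described in the quoted commit message can be
sketched in userspace C. This is only a simplified model, not kernel code:
struct model_stats, model_submit(), model_accounted() and sectors_to_mib()
are hypothetical names introduced for illustration. It assumes that the
whole bio is accounted on every entry (as start_io_acct() does), that the
bio is cut at the next chunk boundary, and that the remainder re-enters
the queue. With the fix, the remainder is subtracted before re-entry.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-disk sectors[] counter; not a kernel type. */
struct model_stats {
	uint64_t sectors;
};

/*
 * Model one bio of 'len' sectors starting at sector 'start', submitted to a
 * device that splits on 'chunk'-sector boundaries (chunk must be non-zero).
 * 'apply_fix' selects whether the remainder is subtracted before re-entry,
 * mirroring the __dm_part_stat_sub() call in the quoted patch.
 */
static void model_submit(struct model_stats *st, uint64_t start, uint64_t len,
			 uint64_t chunk, int apply_fix)
{
	while (len > 0) {
		/* Analogue of start_io_acct(): the whole bio is accounted on entry. */
		st->sectors += len;

		/* Sectors that fit before the next chunk boundary. */
		uint64_t room = chunk - (start % chunk);
		uint64_t done = len < room ? len : room;
		uint64_t remaining = len - done; /* analogue of ci.sector_count */

		if (remaining == 0)
			break;

		/* The fix: undo the accounting for the part that will re-enter. */
		if (apply_fix)
			st->sectors -= remaining;

		/* The remainder is resubmitted and accounted again on re-entry. */
		start += done;
		len = remaining;
	}
}

/* Convenience wrapper: total sectors accounted for a single bio. */
static uint64_t model_accounted(uint64_t start, uint64_t len, uint64_t chunk,
				int apply_fix)
{
	struct model_stats st = { 0 };

	model_submit(&st, start, len, chunk, apply_fix);
	return st.sectors;
}

/* Convert a count of 512-byte sectors to whole MiB. */
static uint64_t sectors_to_mib(uint64_t sectors)
{
	return sectors * 512 / (1024 * 1024);
}
```

For the 64k write in the quoted debug log (128 sectors, 32k chunks, i.e.
64 sectors), model_accounted(0, 128, 64, 0) yields 192 sectors and
model_accounted(0, 128, 64, 1) yields 128. sectors_to_mib() gives the unit
conversion used in the report: 278528 sectors is 136M and 32768 sectors is
16M. The model is not meant to reproduce the measured 278528 total.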


