Re: [RFC 1/2] Eliminate over- and under-counting of io_ticks

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Ming Lei <ming.lei@redhat.com>
To: Josh Snyder <joshs@netflix.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	Mikulas Patocka <mpatocka@redhat.com>,
	Mike Snitzer <snitzer@redhat.com>,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	Josh Snyder <josh@code406.com>
Subject: Re: [RFC 1/2] Eliminate over- and under-counting of io_ticks
Date: Tue, 9 Jun 2020 16:08:08 +0800	[thread overview]
Message-ID: <20200609080808.GA270404@T590> (raw)
In-Reply-To: <20200609040724.448519-2-joshs@netflix.com>

On Mon, Jun 08, 2020 at 09:07:23PM -0700, Josh Snyder wrote:
> Previously, io_ticks could be under-counted. Consider these I/Os along
> the time axis (in jiffies):
> 
>   t          012345678
>   io1        |----|
>   io2            |---|

In current way, when io2 is done, io tickes should be 5, since 1 tick
is added for two io start.

> 
> Under the old approach, io_ticks would count up to 6, like so:
> 
>   t          012345678
>   io1        |----|
>   io2            |---|
>   stamp      0   45  8
>   io_ticks   1   23  6

Before commit 5b18b5a73760("block: delete part_round_stats and switch to less precise counting"),
io tick is calculated accurately, which is basically:

	(4 - 0) + (5 - 4) + (8 - 5) = 8

> 
> With this change, io_ticks instead counts to 8, eliminating the
> under-counting:
> 
>   t          012345678
>   io1        |----|
>   io2            |---|
>   stamp      0    5  8
>   io_ticks   0    5  8
> 
> It was also possible for io_ticks to be over-counted. Consider a
> workload that issues I/Os deterministically at intervals of 8ms (125Hz).
> If each I/O takes 1ms, then the true utilization is 12.5%. The previous
> implementation will increment io_ticks once for each jiffy in which an
> I/O ends. Since the workload issues an I/O reliably for each jiffy, the
> reported utilization will be 100%. This commit changes the approach such
> that only I/Os which cross a boundary between jiffies are counted. With
> this change, the given workload would count an I/O tick on every eighth
> jiffy, resulting in a (correct) calculated utilization of 12.5%.
> 
> Signed-off-by: Josh Snyder <joshs@netflix.com>
> Fixes: 5b18b5a73760 ("block: delete part_round_stats and switch to less precise counting")
> ---
>  block/blk-core.c | 20 +++++++++++++-------
>  1 file changed, 13 insertions(+), 7 deletions(-)
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index d1b79dfe9540..a0bbd9e099b9 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -1396,14 +1396,22 @@ unsigned int blk_rq_err_bytes(const struct request *rq)
>  }
>  EXPORT_SYMBOL_GPL(blk_rq_err_bytes);
>  
> -static void update_io_ticks(struct hd_struct *part, unsigned long now, bool end)
> +static void update_io_ticks(struct hd_struct *part, unsigned long now, unsigned long start)
>  {
>  	unsigned long stamp;
> +	unsigned long elapsed;
>  again:
>  	stamp = READ_ONCE(part->stamp);
>  	if (unlikely(stamp != now)) {
> -		if (likely(cmpxchg(&part->stamp, stamp, now) == stamp))
> -			__part_stat_add(part, io_ticks, end ? now - stamp : 1);
> +		if (likely(cmpxchg(&part->stamp, stamp, now) == stamp)) {
> +			// stamp denotes the last IO to finish
> +			// If this IO started before stamp, then there was overlap between this IO
> +			// and that one. We increment only by the non-overlap time.
> +			// If not, there was no overlap and we increment by our own time,
> +			// disregarding stamp.

Linux kernel's comment style is '/**/'

> +			elapsed = now - (start < stamp ? stamp : start);
> +			__part_stat_add(part, io_ticks, elapsed);

Looks this way of only sampling IO done is smart, io ticks becomes much
more accurate than before.

> +		}
>  	}
>  	if (part->partno) {
>  		part = &part_to_disk(part)->part0;
> @@ -1439,7 +1447,7 @@ void blk_account_io_done(struct request *req, u64 now)
>  		part_stat_lock();
>  		part = req->part;
>  
> -		update_io_ticks(part, jiffies, true);
> +		update_io_ticks(part, jiffies, nsecs_to_jiffies(req->start_time_ns));

jiffies and req->start_time_ns may be from different clock sources, so
I'd suggest to merge the two patches into one.


Thanks, 
Ming

next prev parent reply	other threads:[~2020-06-09  8:08 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-09  4:07 [RFC 0/2] Increase accuracy and precision of sampled io_ticks Josh Snyder
2020-06-09  4:07 ` [RFC 1/2] Eliminate over- and under-counting of io_ticks Josh Snyder
2020-06-09  8:08   ` Ming Lei [this message]
2020-06-10  1:41   ` Hou Tao
2020-06-10  7:26     ` Josh Snyder
2020-06-09  4:07 ` [RFC 2/2] Track io_ticks at microsecond granularity Josh Snyder
2020-06-09  6:23   ` kernel test robot
2020-06-09  6:36   ` kernel test robot
2020-06-09  8:50   ` Ming Lei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200609080808.GA270404@T590 \
    --to=ming.lei@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=josh@code406.com \
    --cc=joshs@netflix.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mpatocka@redhat.com \
    --cc=snitzer@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.