public inbox for linux-block@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] blk-throttle: remove last_bytes/ios and carryover byte/ios
@ 2025-03-05  4:31 Ming Lei
  2025-03-05  4:31 ` [PATCH 1/3] blk-throttle: remove last_bytes_disp and last_ios_disp Ming Lei
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Ming Lei @ 2025-03-05  4:31 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: Ming Lei, Tejun Heo, Josef Bacik, Yu Kuai

Hello,

Remove last_bytes_disp and last_ios_disp, which aren't used any more.

Remove the carryover bytes/ios fields, because the compensation bytes/ios can
be applied against the dispatched bytes/ios directly.

Depends on "[PATCH v2] blk-throttle: fix lower bps rate by throtl_trim_slice()"[1]

[1] https://lore.kernel.org/linux-block/20250227120645.812815-1-yukuai1@huaweicloud.com/raw

Thanks,
Ming

Cc: Tejun Heo <tj@kernel.org>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Yu Kuai <yukuai3@huawei.com>

Ming Lei (3):
  blk-throttle: remove last_bytes_disp and last_ios_disp
  blk-throttle: don't take carryover for prioritized processing of
    metadata
  blk-throttle: carry over directly

 block/blk-throttle.c | 69 +++++++++++++++++---------------------------
 block/blk-throttle.h |  7 ++---
 2 files changed, 29 insertions(+), 47 deletions(-)

-- 
2.47.0


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/3] blk-throttle: remove last_bytes_disp and last_ios_disp
  2025-03-05  4:31 [PATCH 0/3] blk-throttle: remove last_bytes/ios and carryover byte/ios Ming Lei
@ 2025-03-05  4:31 ` Ming Lei
  2025-03-05 19:07   ` Tejun Heo
  2025-03-05  4:31 ` [PATCH 2/3] blk-throttle: don't take carryover for prioritized processing of metadata Ming Lei
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Ming Lei @ 2025-03-05  4:31 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: Ming Lei, Tejun Heo, Josef Bacik, Yu Kuai

The two fields are not used any more, so remove them.

Cc: Tejun Heo <tj@kernel.org>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-throttle.c | 5 +----
 block/blk-throttle.h | 3 ---
 2 files changed, 1 insertion(+), 7 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index a52f0d6b40ad..213e7b04617a 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -819,13 +819,10 @@ static void throtl_charge_bio(struct throtl_grp *tg, struct bio *bio)
 	unsigned int bio_size = throtl_bio_data_size(bio);
 
 	/* Charge the bio to the group */
-	if (!bio_flagged(bio, BIO_BPS_THROTTLED)) {
+	if (!bio_flagged(bio, BIO_BPS_THROTTLED))
 		tg->bytes_disp[rw] += bio_size;
-		tg->last_bytes_disp[rw] += bio_size;
-	}
 
 	tg->io_disp[rw]++;
-	tg->last_io_disp[rw]++;
 }
 
 /**
diff --git a/block/blk-throttle.h b/block/blk-throttle.h
index 1a36d1278eea..ba8f6e986994 100644
--- a/block/blk-throttle.h
+++ b/block/blk-throttle.h
@@ -106,9 +106,6 @@ struct throtl_grp {
 	/* Number of bio's dispatched in current slice */
 	unsigned int io_disp[2];
 
-	uint64_t last_bytes_disp[2];
-	unsigned int last_io_disp[2];
-
 	/*
 	 * The following two fields are updated when new configuration is
 	 * submitted while some bios are still throttled, they record how many
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/3] blk-throttle: don't take carryover for prioritized processing of metadata
  2025-03-05  4:31 [PATCH 0/3] blk-throttle: remove last_bytes/ios and carryover byte/ios Ming Lei
  2025-03-05  4:31 ` [PATCH 1/3] blk-throttle: remove last_bytes_disp and last_ios_disp Ming Lei
@ 2025-03-05  4:31 ` Ming Lei
  2025-03-05 19:11   ` Tejun Heo
  2025-03-05  4:31 ` [PATCH 3/3] blk-throttle: carry over directly Ming Lei
  2025-03-05 23:25 ` [PATCH 0/3] blk-throttle: remove last_bytes/ios and carryover byte/ios Jens Axboe
  3 siblings, 1 reply; 13+ messages in thread
From: Ming Lei @ 2025-03-05  4:31 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: Ming Lei, Tejun Heo, Josef Bacik, Yu Kuai

Commit 29390bb5661d ("blk-throttle: support prioritized processing of metadata")
uses the bytes/ios carryover for prioritized processing of metadata. It turns
out we can support this by charging the bio directly without trimming the
slice, and the result is the same as with carryover.

Cc: Tejun Heo <tj@kernel.org>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-throttle.c | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 213e7b04617a..7271aee94faf 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -1620,13 +1620,6 @@ static bool tg_within_limit(struct throtl_grp *tg, struct bio *bio, bool rw)
 	return tg_may_dispatch(tg, bio, NULL);
 }
 
-static void tg_dispatch_in_debt(struct throtl_grp *tg, struct bio *bio, bool rw)
-{
-	if (!bio_flagged(bio, BIO_BPS_THROTTLED))
-		tg->carryover_bytes[rw] -= throtl_bio_data_size(bio);
-	tg->carryover_ios[rw]--;
-}
-
 bool __blk_throtl_bio(struct bio *bio)
 {
 	struct request_queue *q = bdev_get_queue(bio->bi_bdev);
@@ -1663,10 +1656,12 @@ bool __blk_throtl_bio(struct bio *bio)
 			/*
 			 * IOs which may cause priority inversions are
 			 * dispatched directly, even if they're over limit.
-			 * Debts are handled by carryover_bytes/ios while
-			 * calculating wait time.
+			 *
+			 * Charge and dispatch directly, and our throttle
+			 * control algorithm is adaptive, and extra IO bytes
+			 * will be throttled for paying the debt
 			 */
-			tg_dispatch_in_debt(tg, bio, rw);
+			throtl_charge_bio(tg, bio);
 		} else {
 			/* if above limits, break to queue */
 			break;
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/3] blk-throttle: carry over directly
  2025-03-05  4:31 [PATCH 0/3] blk-throttle: remove last_bytes/ios and carryover byte/ios Ming Lei
  2025-03-05  4:31 ` [PATCH 1/3] blk-throttle: remove last_bytes_disp and last_ios_disp Ming Lei
  2025-03-05  4:31 ` [PATCH 2/3] blk-throttle: don't take carryover for prioritized processing of metadata Ming Lei
@ 2025-03-05  4:31 ` Ming Lei
  2025-03-05 19:16   ` Tejun Heo
  2025-04-11  2:53   ` Yu Kuai
  2025-03-05 23:25 ` [PATCH 0/3] blk-throttle: remove last_bytes/ios and carryover byte/ios Jens Axboe
  3 siblings, 2 replies; 13+ messages in thread
From: Ming Lei @ 2025-03-05  4:31 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: Ming Lei, Tejun Heo, Josef Bacik, Yu Kuai

Now ->carryover_bytes[] and ->carryover_ios[] only cover limit/config
updates.

Actually the carryover bytes/ios can be carried to ->bytes_disp[] and
->io_disp[] directly, since the carryover is a one-shot thing and only valid
in the current slice.

Then we can remove the two fields and simplify the code a lot.

The types of ->bytes_disp[] and ->io_disp[] have to become signed because the
two fields may go negative when limits or config are updated, but both are
big enough to hold the bytes/ios dispatched in a single slice.

Cc: Tejun Heo <tj@kernel.org>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-throttle.c | 49 +++++++++++++++++++-------------------------
 block/blk-throttle.h |  4 ++--
 2 files changed, 23 insertions(+), 30 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 7271aee94faf..91dab43c65ab 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -478,8 +478,6 @@ static inline void throtl_start_new_slice_with_credit(struct throtl_grp *tg,
 {
 	tg->bytes_disp[rw] = 0;
 	tg->io_disp[rw] = 0;
-	tg->carryover_bytes[rw] = 0;
-	tg->carryover_ios[rw] = 0;
 
 	/*
 	 * Previous slice has expired. We must have trimmed it after last
@@ -498,16 +496,14 @@ static inline void throtl_start_new_slice_with_credit(struct throtl_grp *tg,
 }
 
 static inline void throtl_start_new_slice(struct throtl_grp *tg, bool rw,
-					  bool clear_carryover)
+					  bool clear)
 {
-	tg->bytes_disp[rw] = 0;
-	tg->io_disp[rw] = 0;
+	if (clear) {
+		tg->bytes_disp[rw] = 0;
+		tg->io_disp[rw] = 0;
+	}
 	tg->slice_start[rw] = jiffies;
 	tg->slice_end[rw] = jiffies + tg->td->throtl_slice;
-	if (clear_carryover) {
-		tg->carryover_bytes[rw] = 0;
-		tg->carryover_ios[rw] = 0;
-	}
 
 	throtl_log(&tg->service_queue,
 		   "[%c] new slice start=%lu end=%lu jiffies=%lu",
@@ -617,20 +613,16 @@ static inline void throtl_trim_slice(struct throtl_grp *tg, bool rw)
 	 */
 	time_elapsed -= tg->td->throtl_slice;
 	bytes_trim = calculate_bytes_allowed(tg_bps_limit(tg, rw),
-					     time_elapsed) +
-		     tg->carryover_bytes[rw];
-	io_trim = calculate_io_allowed(tg_iops_limit(tg, rw), time_elapsed) +
-		  tg->carryover_ios[rw];
+					     time_elapsed);
+	io_trim = calculate_io_allowed(tg_iops_limit(tg, rw), time_elapsed);
 	if (bytes_trim <= 0 && io_trim <= 0)
 		return;
 
-	tg->carryover_bytes[rw] = 0;
 	if ((long long)tg->bytes_disp[rw] >= bytes_trim)
 		tg->bytes_disp[rw] -= bytes_trim;
 	else
 		tg->bytes_disp[rw] = 0;
 
-	tg->carryover_ios[rw] = 0;
 	if ((int)tg->io_disp[rw] >= io_trim)
 		tg->io_disp[rw] -= io_trim;
 	else
@@ -645,7 +637,8 @@ static inline void throtl_trim_slice(struct throtl_grp *tg, bool rw)
 		   jiffies);
 }
 
-static void __tg_update_carryover(struct throtl_grp *tg, bool rw)
+static void __tg_update_carryover(struct throtl_grp *tg, bool rw,
+				  long long *bytes, int *ios)
 {
 	unsigned long jiffy_elapsed = jiffies - tg->slice_start[rw];
 	u64 bps_limit = tg_bps_limit(tg, rw);
@@ -658,26 +651,28 @@ static void __tg_update_carryover(struct throtl_grp *tg, bool rw)
 	 * configuration.
 	 */
 	if (bps_limit != U64_MAX)
-		tg->carryover_bytes[rw] +=
-			calculate_bytes_allowed(bps_limit, jiffy_elapsed) -
+		*bytes = calculate_bytes_allowed(bps_limit, jiffy_elapsed) -
 			tg->bytes_disp[rw];
 	if (iops_limit != UINT_MAX)
-		tg->carryover_ios[rw] +=
-			calculate_io_allowed(iops_limit, jiffy_elapsed) -
+		*ios = calculate_io_allowed(iops_limit, jiffy_elapsed) -
 			tg->io_disp[rw];
+	tg->bytes_disp[rw] -= *bytes;
+	tg->io_disp[rw] -= *ios;
 }
 
 static void tg_update_carryover(struct throtl_grp *tg)
 {
+	long long bytes[2] = {0};
+	int ios[2] = {0};
+
 	if (tg->service_queue.nr_queued[READ])
-		__tg_update_carryover(tg, READ);
+		__tg_update_carryover(tg, READ, &bytes[READ], &ios[READ]);
 	if (tg->service_queue.nr_queued[WRITE])
-		__tg_update_carryover(tg, WRITE);
+		__tg_update_carryover(tg, WRITE, &bytes[WRITE], &ios[WRITE]);
 
 	/* see comments in struct throtl_grp for meaning of these fields. */
 	throtl_log(&tg->service_queue, "%s: %lld %lld %d %d\n", __func__,
-		   tg->carryover_bytes[READ], tg->carryover_bytes[WRITE],
-		   tg->carryover_ios[READ], tg->carryover_ios[WRITE]);
+		   bytes[READ], bytes[WRITE], ios[READ], ios[WRITE]);
 }
 
 static unsigned long tg_within_iops_limit(struct throtl_grp *tg, struct bio *bio,
@@ -695,8 +690,7 @@ static unsigned long tg_within_iops_limit(struct throtl_grp *tg, struct bio *bio
 
 	/* Round up to the next throttle slice, wait time must be nonzero */
 	jiffy_elapsed_rnd = roundup(jiffy_elapsed + 1, tg->td->throtl_slice);
-	io_allowed = calculate_io_allowed(iops_limit, jiffy_elapsed_rnd) +
-		     tg->carryover_ios[rw];
+	io_allowed = calculate_io_allowed(iops_limit, jiffy_elapsed_rnd);
 	if (io_allowed > 0 && tg->io_disp[rw] + 1 <= io_allowed)
 		return 0;
 
@@ -729,8 +723,7 @@ static unsigned long tg_within_bps_limit(struct throtl_grp *tg, struct bio *bio,
 		jiffy_elapsed_rnd = tg->td->throtl_slice;
 
 	jiffy_elapsed_rnd = roundup(jiffy_elapsed_rnd, tg->td->throtl_slice);
-	bytes_allowed = calculate_bytes_allowed(bps_limit, jiffy_elapsed_rnd) +
-			tg->carryover_bytes[rw];
+	bytes_allowed = calculate_bytes_allowed(bps_limit, jiffy_elapsed_rnd);
 	if (bytes_allowed > 0 && tg->bytes_disp[rw] + bio_size <= bytes_allowed)
 		return 0;
 
diff --git a/block/blk-throttle.h b/block/blk-throttle.h
index ba8f6e986994..7964cc041e06 100644
--- a/block/blk-throttle.h
+++ b/block/blk-throttle.h
@@ -102,9 +102,9 @@ struct throtl_grp {
 	unsigned int iops[2];
 
 	/* Number of bytes dispatched in current slice */
-	uint64_t bytes_disp[2];
+	int64_t bytes_disp[2];
 	/* Number of bio's dispatched in current slice */
-	unsigned int io_disp[2];
+	int io_disp[2];
 
 	/*
 	 * The following two fields are updated when new configuration is
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] blk-throttle: remove last_bytes_disp and last_ios_disp
  2025-03-05  4:31 ` [PATCH 1/3] blk-throttle: remove last_bytes_disp and last_ios_disp Ming Lei
@ 2025-03-05 19:07   ` Tejun Heo
  0 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2025-03-05 19:07 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, linux-block, Josef Bacik, Yu Kuai

On Wed, Mar 05, 2025 at 12:31:19PM +0800, Ming Lei wrote:
> The two fields are not used any more, so remove them.
> 
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Josef Bacik <josef@toxicpanda.com>
> Cc: Yu Kuai <yukuai3@huawei.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/3] blk-throttle: don't take carryover for prioritized processing of metadata
  2025-03-05  4:31 ` [PATCH 2/3] blk-throttle: don't take carryover for prioritized processing of metadata Ming Lei
@ 2025-03-05 19:11   ` Tejun Heo
  0 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2025-03-05 19:11 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, linux-block, Josef Bacik, Yu Kuai

On Wed, Mar 05, 2025 at 12:31:20PM +0800, Ming Lei wrote:
> Commit 29390bb5661d ("blk-throttle: support prioritized processing of metadata")
> uses the bytes/ios carryover for prioritized processing of metadata. It turns
> out we can support this by charging the bio directly without trimming the
> slice, and the result is the same as with carryover.
> 
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Josef Bacik <josef@toxicpanda.com>
> Cc: Yu Kuai <yukuai3@huawei.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] blk-throttle: carry over directly
  2025-03-05  4:31 ` [PATCH 3/3] blk-throttle: carry over directly Ming Lei
@ 2025-03-05 19:16   ` Tejun Heo
  2025-04-11  2:53   ` Yu Kuai
  1 sibling, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2025-03-05 19:16 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, linux-block, Josef Bacik, Yu Kuai

On Wed, Mar 05, 2025 at 12:31:21PM +0800, Ming Lei wrote:
> Now ->carryover_bytes[] and ->carryover_ios[] only cover limit/config
> updates.
> 
> Actually the carryover bytes/ios can be carried to ->bytes_disp[] and
> ->io_disp[] directly, since the carryover is a one-shot thing and only valid
> in the current slice.
> 
> Then we can remove the two fields and simplify the code a lot.
> 
> The types of ->bytes_disp[] and ->io_disp[] have to become signed because the
> two fields may go negative when limits or config are updated, but both are
> big enough to hold the bytes/ios dispatched in a single slice.
> 
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Josef Bacik <josef@toxicpanda.com>
> Cc: Yu Kuai <yukuai3@huawei.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/3] blk-throttle: remove last_bytes/ios and carryover byte/ios
  2025-03-05  4:31 [PATCH 0/3] blk-throttle: remove last_bytes/ios and carryover byte/ios Ming Lei
                   ` (2 preceding siblings ...)
  2025-03-05  4:31 ` [PATCH 3/3] blk-throttle: carry over directly Ming Lei
@ 2025-03-05 23:25 ` Jens Axboe
  3 siblings, 0 replies; 13+ messages in thread
From: Jens Axboe @ 2025-03-05 23:25 UTC (permalink / raw)
  To: linux-block, Ming Lei; +Cc: Tejun Heo, Josef Bacik, Yu Kuai


On Wed, 05 Mar 2025 12:31:18 +0800, Ming Lei wrote:
> Remove last_bytes_disp and last_ios_disp, which aren't used any more.
> 
> Remove the carryover bytes/ios fields, because the compensation bytes/ios can
> be applied against the dispatched bytes/ios directly.
> 
> Depends on "[PATCH v2] blk-throttle: fix lower bps rate by throtl_trim_slice()"[1]
> 
> [...]

Applied, thanks!

[1/3] blk-throttle: remove last_bytes_disp and last_ios_disp
      commit: 483a393e7e6189aac7d47b5295029159ab7a1cf1
[2/3] blk-throttle: don't take carryover for prioritized processing of metadata
      commit: a9fc8868b350cbf4ff730a4ea9651319cc669516
[3/3] blk-throttle: carry over directly
      commit: 6cc477c36875ea5329b8bfbdf4d91f83dc653c91

Best regards,
-- 
Jens Axboe




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] blk-throttle: carry over directly
  2025-03-05  4:31 ` [PATCH 3/3] blk-throttle: carry over directly Ming Lei
  2025-03-05 19:16   ` Tejun Heo
@ 2025-04-11  2:53   ` Yu Kuai
  2025-04-11 15:01     ` Ming Lei
  1 sibling, 1 reply; 13+ messages in thread
From: Yu Kuai @ 2025-04-11  2:53 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe, linux-block
  Cc: Tejun Heo, Josef Bacik, yukuai (C), WoZ1zh1

Hi, Ming

On 2025/03/05 12:31, Ming Lei wrote:
> Now ->carryover_bytes[] and ->carryover_ios[] only cover limit/config
> updates.
> 
> Actually the carryover bytes/ios can be carried to ->bytes_disp[] and
> ->io_disp[] directly, since the carryover is a one-shot thing and only valid
> in the current slice.
> 
> Then we can remove the two fields and simplify the code a lot.
> 
> The types of ->bytes_disp[] and ->io_disp[] have to become signed because the
> two fields may go negative when limits or config are updated, but both are
> big enough to hold the bytes/ios dispatched in a single slice.
> 
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Josef Bacik <josef@toxicpanda.com>
> Cc: Yu Kuai <yukuai3@huawei.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>   block/blk-throttle.c | 49 +++++++++++++++++++-------------------------
>   block/blk-throttle.h |  4 ++--
>   2 files changed, 23 insertions(+), 30 deletions(-)
> 
> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> index 7271aee94faf..91dab43c65ab 100644
> --- a/block/blk-throttle.c
> +++ b/block/blk-throttle.c
> @@ -478,8 +478,6 @@ static inline void throtl_start_new_slice_with_credit(struct throtl_grp *tg,
>   {
>   	tg->bytes_disp[rw] = 0;
>   	tg->io_disp[rw] = 0;
> -	tg->carryover_bytes[rw] = 0;
> -	tg->carryover_ios[rw] = 0;
>   
>   	/*
>   	 * Previous slice has expired. We must have trimmed it after last
> @@ -498,16 +496,14 @@ static inline void throtl_start_new_slice_with_credit(struct throtl_grp *tg,
>   }
>   
>   static inline void throtl_start_new_slice(struct throtl_grp *tg, bool rw,
> -					  bool clear_carryover)
> +					  bool clear)
>   {
> -	tg->bytes_disp[rw] = 0;
> -	tg->io_disp[rw] = 0;
> +	if (clear) {
> +		tg->bytes_disp[rw] = 0;
> +		tg->io_disp[rw] = 0;
> +	}
>   	tg->slice_start[rw] = jiffies;
>   	tg->slice_end[rw] = jiffies + tg->td->throtl_slice;
> -	if (clear_carryover) {
> -		tg->carryover_bytes[rw] = 0;
> -		tg->carryover_ios[rw] = 0;
> -	}
>   
>   	throtl_log(&tg->service_queue,
>   		   "[%c] new slice start=%lu end=%lu jiffies=%lu",
> @@ -617,20 +613,16 @@ static inline void throtl_trim_slice(struct throtl_grp *tg, bool rw)
>   	 */
>   	time_elapsed -= tg->td->throtl_slice;
>   	bytes_trim = calculate_bytes_allowed(tg_bps_limit(tg, rw),
> -					     time_elapsed) +
> -		     tg->carryover_bytes[rw];
> -	io_trim = calculate_io_allowed(tg_iops_limit(tg, rw), time_elapsed) +
> -		  tg->carryover_ios[rw];
> +					     time_elapsed);
> +	io_trim = calculate_io_allowed(tg_iops_limit(tg, rw), time_elapsed);
>   	if (bytes_trim <= 0 && io_trim <= 0)
>   		return;
>   
> -	tg->carryover_bytes[rw] = 0;
>   	if ((long long)tg->bytes_disp[rw] >= bytes_trim)
>   		tg->bytes_disp[rw] -= bytes_trim;
>   	else
>   		tg->bytes_disp[rw] = 0;
>   
> -	tg->carryover_ios[rw] = 0;
>   	if ((int)tg->io_disp[rw] >= io_trim)
>   		tg->io_disp[rw] -= io_trim;
>   	else
> @@ -645,7 +637,8 @@ static inline void throtl_trim_slice(struct throtl_grp *tg, bool rw)
>   		   jiffies);
>   }
>   
> -static void __tg_update_carryover(struct throtl_grp *tg, bool rw)
> +static void __tg_update_carryover(struct throtl_grp *tg, bool rw,
> +				  long long *bytes, int *ios)
>   {
>   	unsigned long jiffy_elapsed = jiffies - tg->slice_start[rw];
>   	u64 bps_limit = tg_bps_limit(tg, rw);
> @@ -658,26 +651,28 @@ static void __tg_update_carryover(struct throtl_grp *tg, bool rw)
>   	 * configuration.
>   	 */
>   	if (bps_limit != U64_MAX)
> -		tg->carryover_bytes[rw] +=
> -			calculate_bytes_allowed(bps_limit, jiffy_elapsed) -
> +		*bytes = calculate_bytes_allowed(bps_limit, jiffy_elapsed) -
>   			tg->bytes_disp[rw];
>   	if (iops_limit != UINT_MAX)
> -		tg->carryover_ios[rw] +=
> -			calculate_io_allowed(iops_limit, jiffy_elapsed) -
> +		*ios = calculate_io_allowed(iops_limit, jiffy_elapsed) -
>   			tg->io_disp[rw];
> +	tg->bytes_disp[rw] -= *bytes;
> +	tg->io_disp[rw] -= *ios;

This patch was applied before I got a chance to review it. :( I think the
above update should be:

tg->bytes_disp[rw] = -*bytes;
tg->io_disp[rw] = -*ios;

Otherwise, the result is actually (2 * disp - allowed), which might be a
huge value, causing a long delay for the next dispatch.

This is what the old carryover fields did; the above change will work, but
looks weird.

Thanks,
Kuai

>   }
>   
>   static void tg_update_carryover(struct throtl_grp *tg)
>   {
> +	long long bytes[2] = {0};
> +	int ios[2] = {0};
> +
>   	if (tg->service_queue.nr_queued[READ])
> -		__tg_update_carryover(tg, READ);
> +		__tg_update_carryover(tg, READ, &bytes[READ], &ios[READ]);
>   	if (tg->service_queue.nr_queued[WRITE])
> -		__tg_update_carryover(tg, WRITE);
> +		__tg_update_carryover(tg, WRITE, &bytes[WRITE], &ios[WRITE]);
>   
>   	/* see comments in struct throtl_grp for meaning of these fields. */
>   	throtl_log(&tg->service_queue, "%s: %lld %lld %d %d\n", __func__,
> -		   tg->carryover_bytes[READ], tg->carryover_bytes[WRITE],
> -		   tg->carryover_ios[READ], tg->carryover_ios[WRITE]);
> +		   bytes[READ], bytes[WRITE], ios[READ], ios[WRITE]);
>   }
>   
>   static unsigned long tg_within_iops_limit(struct throtl_grp *tg, struct bio *bio,
> @@ -695,8 +690,7 @@ static unsigned long tg_within_iops_limit(struct throtl_grp *tg, struct bio *bio
>   
>   	/* Round up to the next throttle slice, wait time must be nonzero */
>   	jiffy_elapsed_rnd = roundup(jiffy_elapsed + 1, tg->td->throtl_slice);
> -	io_allowed = calculate_io_allowed(iops_limit, jiffy_elapsed_rnd) +
> -		     tg->carryover_ios[rw];
> +	io_allowed = calculate_io_allowed(iops_limit, jiffy_elapsed_rnd);
>   	if (io_allowed > 0 && tg->io_disp[rw] + 1 <= io_allowed)
>   		return 0;
>   
> @@ -729,8 +723,7 @@ static unsigned long tg_within_bps_limit(struct throtl_grp *tg, struct bio *bio,
>   		jiffy_elapsed_rnd = tg->td->throtl_slice;
>   
>   	jiffy_elapsed_rnd = roundup(jiffy_elapsed_rnd, tg->td->throtl_slice);
> -	bytes_allowed = calculate_bytes_allowed(bps_limit, jiffy_elapsed_rnd) +
> -			tg->carryover_bytes[rw];
> +	bytes_allowed = calculate_bytes_allowed(bps_limit, jiffy_elapsed_rnd);
>   	if (bytes_allowed > 0 && tg->bytes_disp[rw] + bio_size <= bytes_allowed)
>   		return 0;
>   
> diff --git a/block/blk-throttle.h b/block/blk-throttle.h
> index ba8f6e986994..7964cc041e06 100644
> --- a/block/blk-throttle.h
> +++ b/block/blk-throttle.h
> @@ -102,9 +102,9 @@ struct throtl_grp {
>   	unsigned int iops[2];
>   
>   	/* Number of bytes dispatched in current slice */
> -	uint64_t bytes_disp[2];
> +	int64_t bytes_disp[2];
>   	/* Number of bio's dispatched in current slice */
> -	unsigned int io_disp[2];
> +	int io_disp[2];
>   
>   	/*
>   	 * The following two fields are updated when new configuration is
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] blk-throttle: carry over directly
  2025-04-11  2:53   ` Yu Kuai
@ 2025-04-11 15:01     ` Ming Lei
  2025-04-12  0:37       ` Yu Kuai
  0 siblings, 1 reply; 13+ messages in thread
From: Ming Lei @ 2025-04-11 15:01 UTC (permalink / raw)
  To: Yu Kuai
  Cc: Jens Axboe, linux-block, Tejun Heo, Josef Bacik, yukuai (C),
	WoZ1zh1

On Fri, Apr 11, 2025 at 10:53:12AM +0800, Yu Kuai wrote:
> Hi, Ming
> 
> On 2025/03/05 12:31, Ming Lei wrote:
> > Now ->carryover_bytes[] and ->carryover_ios[] only cover limit/config
> > updates.
> > 
> > Actually the carryover bytes/ios can be carried to ->bytes_disp[] and
> > ->io_disp[] directly, since the carryover is a one-shot thing and only valid
> > in the current slice.
> > 
> > Then we can remove the two fields and simplify the code a lot.
> > 
> > The types of ->bytes_disp[] and ->io_disp[] have to become signed because the
> > two fields may go negative when limits or config are updated, but both are
> > big enough to hold the bytes/ios dispatched in a single slice.
> > 
> > Cc: Tejun Heo <tj@kernel.org>
> > Cc: Josef Bacik <josef@toxicpanda.com>
> > Cc: Yu Kuai <yukuai3@huawei.com>
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >   block/blk-throttle.c | 49 +++++++++++++++++++-------------------------
> >   block/blk-throttle.h |  4 ++--
> >   2 files changed, 23 insertions(+), 30 deletions(-)
> > 
> > diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> > index 7271aee94faf..91dab43c65ab 100644
> > --- a/block/blk-throttle.c
> > +++ b/block/blk-throttle.c
> > @@ -478,8 +478,6 @@ static inline void throtl_start_new_slice_with_credit(struct throtl_grp *tg,
> >   {
> >   	tg->bytes_disp[rw] = 0;
> >   	tg->io_disp[rw] = 0;
> > -	tg->carryover_bytes[rw] = 0;
> > -	tg->carryover_ios[rw] = 0;
> >   	/*
> >   	 * Previous slice has expired. We must have trimmed it after last
> > @@ -498,16 +496,14 @@ static inline void throtl_start_new_slice_with_credit(struct throtl_grp *tg,
> >   }
> >   static inline void throtl_start_new_slice(struct throtl_grp *tg, bool rw,
> > -					  bool clear_carryover)
> > +					  bool clear)
> >   {
> > -	tg->bytes_disp[rw] = 0;
> > -	tg->io_disp[rw] = 0;
> > +	if (clear) {
> > +		tg->bytes_disp[rw] = 0;
> > +		tg->io_disp[rw] = 0;
> > +	}
> >   	tg->slice_start[rw] = jiffies;
> >   	tg->slice_end[rw] = jiffies + tg->td->throtl_slice;
> > -	if (clear_carryover) {
> > -		tg->carryover_bytes[rw] = 0;
> > -		tg->carryover_ios[rw] = 0;
> > -	}
> >   	throtl_log(&tg->service_queue,
> >   		   "[%c] new slice start=%lu end=%lu jiffies=%lu",
> > @@ -617,20 +613,16 @@ static inline void throtl_trim_slice(struct throtl_grp *tg, bool rw)
> >   	 */
> >   	time_elapsed -= tg->td->throtl_slice;
> >   	bytes_trim = calculate_bytes_allowed(tg_bps_limit(tg, rw),
> > -					     time_elapsed) +
> > -		     tg->carryover_bytes[rw];
> > -	io_trim = calculate_io_allowed(tg_iops_limit(tg, rw), time_elapsed) +
> > -		  tg->carryover_ios[rw];
> > +					     time_elapsed);
> > +	io_trim = calculate_io_allowed(tg_iops_limit(tg, rw), time_elapsed);
> >   	if (bytes_trim <= 0 && io_trim <= 0)
> >   		return;
> > -	tg->carryover_bytes[rw] = 0;
> >   	if ((long long)tg->bytes_disp[rw] >= bytes_trim)
> >   		tg->bytes_disp[rw] -= bytes_trim;
> >   	else
> >   		tg->bytes_disp[rw] = 0;
> > -	tg->carryover_ios[rw] = 0;
> >   	if ((int)tg->io_disp[rw] >= io_trim)
> >   		tg->io_disp[rw] -= io_trim;
> >   	else
> > @@ -645,7 +637,8 @@ static inline void throtl_trim_slice(struct throtl_grp *tg, bool rw)
> >   		   jiffies);
> >   }
> > -static void __tg_update_carryover(struct throtl_grp *tg, bool rw)
> > +static void __tg_update_carryover(struct throtl_grp *tg, bool rw,
> > +				  long long *bytes, int *ios)
> >   {
> >   	unsigned long jiffy_elapsed = jiffies - tg->slice_start[rw];
> >   	u64 bps_limit = tg_bps_limit(tg, rw);
> > @@ -658,26 +651,28 @@ static void __tg_update_carryover(struct throtl_grp *tg, bool rw)
> >   	 * configuration.
> >   	 */
> >   	if (bps_limit != U64_MAX)
> > -		tg->carryover_bytes[rw] +=
> > -			calculate_bytes_allowed(bps_limit, jiffy_elapsed) -
> > +		*bytes = calculate_bytes_allowed(bps_limit, jiffy_elapsed) -
> >   			tg->bytes_disp[rw];
> >   	if (iops_limit != UINT_MAX)
> > -		tg->carryover_ios[rw] +=
> > -			calculate_io_allowed(iops_limit, jiffy_elapsed) -
> > +		*ios = calculate_io_allowed(iops_limit, jiffy_elapsed) -
> >   			tg->io_disp[rw];
> > +	tg->bytes_disp[rw] -= *bytes;
> > +	tg->io_disp[rw] -= *ios;
> 
> This patch was applied before I got a chance to review it. :( I think the
> above update should be:

oops, your review took too long (> 1 month), :-(

> 
> tg->bytes_disp[rw] = -*bytes;
> tg->io_disp[rw] = -*ios;

I think the above is wrong since it simply overrides the existing dispatched
bytes/ios.

The calculation can be understood in two ways:

1) delta = calculate_bytes_allowed(bps_limit, jiffy_elapsed) - tg->bytes_disp[rw];

`delta` represents the difference between theoretical and actual dispatched bytes.

If `delta` > 0, it means we dispatched too little in the past, and we have to
subtract `delta` from ->bytes_disp[], so that we can dispatch more in the future.

Similarly for `delta < 0`.

2) from the consumer's viewpoint:

tg_within_bps_limit(): patched

	...
	bytes_allowed = calculate_bytes_allowed(bps_limit, jiffy_elapsed_rnd);
	if (bytes_allowed > 0 && tg->bytes_disp[rw] + bio_size <= bytes_allowed)
		...

tg_within_bps_limit(): before the patch
	...
    bytes_allowed = calculate_bytes_allowed(bps_limit, jiffy_elapsed_rnd) +
		tg->carryover_bytes[rw];
	if (bytes_allowed > 0 && tg->bytes_disp[rw] + bio_size <= bytes_allowed)
		...

So if `delta` is dropped from `bytes_allowed` in the patched code, we should
subtract the same bytes on the ->bytes_disp[] side to maintain the relation.


> 
> Otherwise, the result is actually (2 * disp - allowed), which might be a
> huge value, causing a long delay for the next dispatch.
> 
> This is what the old carryover fields did; the above change will work, but
> looks weird.

As I explained, the patched code follows the original carryover calculation, and it
passes all throtl blktests.

Or do you have a test case broken by this patch?



Thanks.
Ming


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] blk-throttle: carry over directly
  2025-04-11 15:01     ` Ming Lei
@ 2025-04-12  0:37       ` Yu Kuai
  2025-04-14  2:32         ` Ming Lei
  0 siblings, 1 reply; 13+ messages in thread
From: Yu Kuai @ 2025-04-12  0:37 UTC (permalink / raw)
  To: Ming Lei, Yu Kuai
  Cc: Jens Axboe, linux-block, Tejun Heo, Josef Bacik, WoZ1zh1,
	yukuai (C)

Hi,

On 2025/04/11 23:01, Ming Lei wrote:
> On Fri, Apr 11, 2025 at 10:53:12AM +0800, Yu Kuai wrote:
>> Hi, Ming
>>
>> On 2025/03/05 12:31, Ming Lei wrote:
>>> Now ->carryover_bytes[] and ->carryover_ios[] only cover limit/config
>>> updates.
>>>
>>> Actually the carryover bytes/ios can be carried to ->bytes_disp[] and
>>> ->io_disp[] directly, since the carryover is a one-shot thing and only valid
>>> in the current slice.
>>>
>>> Then we can remove the two fields and simplify the code a lot.
>>>
>>> The types of ->bytes_disp[] and ->io_disp[] have to become signed because the
>>> two fields may go negative when limits or config are updated, but both are
>>> big enough to hold the bytes/ios dispatched in a single slice.
>>>
>>> Cc: Tejun Heo <tj@kernel.org>
>>> Cc: Josef Bacik <josef@toxicpanda.com>
>>> Cc: Yu Kuai <yukuai3@huawei.com>
>>> Signed-off-by: Ming Lei <ming.lei@redhat.com>
>>> ---
>>>    block/blk-throttle.c | 49 +++++++++++++++++++-------------------------
>>>    block/blk-throttle.h |  4 ++--
>>>    2 files changed, 23 insertions(+), 30 deletions(-)
>>>
>>> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
>>> index 7271aee94faf..91dab43c65ab 100644
>>> --- a/block/blk-throttle.c
>>> +++ b/block/blk-throttle.c
>>> @@ -478,8 +478,6 @@ static inline void throtl_start_new_slice_with_credit(struct throtl_grp *tg,
>>>    {
>>>    	tg->bytes_disp[rw] = 0;
>>>    	tg->io_disp[rw] = 0;
>>> -	tg->carryover_bytes[rw] = 0;
>>> -	tg->carryover_ios[rw] = 0;
>>>    	/*
>>>    	 * Previous slice has expired. We must have trimmed it after last
>>> @@ -498,16 +496,14 @@ static inline void throtl_start_new_slice_with_credit(struct throtl_grp *tg,
>>>    }
>>>    static inline void throtl_start_new_slice(struct throtl_grp *tg, bool rw,
>>> -					  bool clear_carryover)
>>> +					  bool clear)
>>>    {
>>> -	tg->bytes_disp[rw] = 0;
>>> -	tg->io_disp[rw] = 0;
>>> +	if (clear) {
>>> +		tg->bytes_disp[rw] = 0;
>>> +		tg->io_disp[rw] = 0;
>>> +	}
>>>    	tg->slice_start[rw] = jiffies;
>>>    	tg->slice_end[rw] = jiffies + tg->td->throtl_slice;
>>> -	if (clear_carryover) {
>>> -		tg->carryover_bytes[rw] = 0;
>>> -		tg->carryover_ios[rw] = 0;
>>> -	}
>>>    	throtl_log(&tg->service_queue,
>>>    		   "[%c] new slice start=%lu end=%lu jiffies=%lu",
>>> @@ -617,20 +613,16 @@ static inline void throtl_trim_slice(struct throtl_grp *tg, bool rw)
>>>    	 */
>>>    	time_elapsed -= tg->td->throtl_slice;
>>>    	bytes_trim = calculate_bytes_allowed(tg_bps_limit(tg, rw),
>>> -					     time_elapsed) +
>>> -		     tg->carryover_bytes[rw];
>>> -	io_trim = calculate_io_allowed(tg_iops_limit(tg, rw), time_elapsed) +
>>> -		  tg->carryover_ios[rw];
>>> +					     time_elapsed);
>>> +	io_trim = calculate_io_allowed(tg_iops_limit(tg, rw), time_elapsed);
>>>    	if (bytes_trim <= 0 && io_trim <= 0)
>>>    		return;
>>> -	tg->carryover_bytes[rw] = 0;
>>>    	if ((long long)tg->bytes_disp[rw] >= bytes_trim)
>>>    		tg->bytes_disp[rw] -= bytes_trim;
>>>    	else
>>>    		tg->bytes_disp[rw] = 0;
>>> -	tg->carryover_ios[rw] = 0;
>>>    	if ((int)tg->io_disp[rw] >= io_trim)
>>>    		tg->io_disp[rw] -= io_trim;
>>>    	else
>>> @@ -645,7 +637,8 @@ static inline void throtl_trim_slice(struct throtl_grp *tg, bool rw)
>>>    		   jiffies);
>>>    }
>>> -static void __tg_update_carryover(struct throtl_grp *tg, bool rw)
>>> +static void __tg_update_carryover(struct throtl_grp *tg, bool rw,
>>> +				  long long *bytes, int *ios)
>>>    {
>>>    	unsigned long jiffy_elapsed = jiffies - tg->slice_start[rw];
>>>    	u64 bps_limit = tg_bps_limit(tg, rw);
>>> @@ -658,26 +651,28 @@ static void __tg_update_carryover(struct throtl_grp *tg, bool rw)
>>>    	 * configuration.
>>>    	 */
>>>    	if (bps_limit != U64_MAX)
>>> -		tg->carryover_bytes[rw] +=
>>> -			calculate_bytes_allowed(bps_limit, jiffy_elapsed) -
>>> +		*bytes = calculate_bytes_allowed(bps_limit, jiffy_elapsed) -
>>>    			tg->bytes_disp[rw];
>>>    	if (iops_limit != UINT_MAX)
>>> -		tg->carryover_ios[rw] +=
>>> -			calculate_io_allowed(iops_limit, jiffy_elapsed) -
>>> +		*ios = calculate_io_allowed(iops_limit, jiffy_elapsed) -
>>>    			tg->io_disp[rw];
>>> +	tg->bytes_disp[rw] -= *bytes;
>>> +	tg->io_disp[rw] -= *ios;
>>
>> This patch was applied before I got a chance to review it. :( I think the
>> above update should be:
> 
> oops, your review period took too long (> 1 month), :-(

Yes, I just didn't review in detail when I saw this set was applied...
> 
>>
>> tg->bytes_disp[rw] = -*bytes;
>> tg->io_disp[rw] = -*ios;
> 
> I think the above is wrong since it simply overrides the existing dispatched
> bytes/ios.
> 
> The calculation can be understood in two ways:
> 
> 1) delta = calculate_bytes_allowed(bps_limit, jiffy_elapsed) - tg->bytes_disp[rw];
> 
> `delta` represents the difference between theoretical and actual dispatched bytes.
> 
> If `delta` > 0, it means we dispatched too little in the past, and we have to
> subtract `delta` from ->bytes_disp so that we can dispatch more in the future.

But the problem is that in this patch, slice_start is set to *jiffies*;
keeping the old disp field, which covers the span from the old slice_start
to jiffies, does not make sense.
> 
> Similar with 'delta < 0'.
> 
> 2) from consumer viewpoint:
> 
> tg_within_bps_limit(): patched
> 
> 	...
> 	bytes_allowed = calculate_bytes_allowed(bps_limit, jiffy_elapsed_rnd);
> 	if (bytes_allowed > 0 && tg->bytes_disp[rw] + bio_size <= bytes_allowed)
> 		...
> 
> tg_within_bps_limit(): before patched
> 	...
>      bytes_allowed = calculate_bytes_allowed(bps_limit, jiffy_elapsed_rnd) +
> 		tg->carryover_bytes[rw];
> 	if (bytes_allowed > 0 && tg->bytes_disp[rw] + bio_size <= bytes_allowed)
> 		...
> 
> So if `delta` is subtracted from `bytes_allowed` in the patched code, we should
> subtract the same bytes from the ->bytes_disp[] side to maintain the relation.
> 

In the original carryover calculation, bytes_disp is always set to 0,
while slice_start is set to jiffies. The patched version's allowance will
actually be less than the old version's if bytes_disp is not 0.
> 
>>
>> Otherwise, the result is actually (2 * disp - allowed), which might be a
>> huge value, causing a long delay for the next dispatch.
>>
>> This is what the old carryover fields do; the above change will work, but
>> looks weird.
> 
> As I explained, the patched code follows the original carryover calculation, and it
> passes all throtl blktests.

BTW, there is another case that this patch breaks: if bps_limit is set
while iops_limit is not, and a BIO is throttled, io_disp will be recorded.

And if the user then sets iops_limit as well: before this set, io_disp
would be reset to 0; however, this set keeps the old io_disp, which will
also cause a long delay.

tg_conf_updated
  throtl_start_new_slice
> 
> Or do you have a test case broken by this patch?

Yes, we can change test/005 a bit for this problem; currently the test
just issues one BIO, and bytes_disp/io_disp are zero when the config is
changed.

Thanks,
Kuai

> 
> 
> 
> Thanks.
> Ming
> 
> 
> .
> 



* Re: [PATCH 3/3] blk-throttle: carry over directly
  2025-04-12  0:37       ` Yu Kuai
@ 2025-04-14  2:32         ` Ming Lei
  2025-04-14 11:47           ` Yu Kuai
  0 siblings, 1 reply; 13+ messages in thread
From: Ming Lei @ 2025-04-14  2:32 UTC (permalink / raw)
  To: Yu Kuai
  Cc: Jens Axboe, linux-block, Tejun Heo, Josef Bacik, WoZ1zh1,
	yukuai (C)

On Sat, Apr 12, 2025 at 08:37:28AM +0800, Yu Kuai wrote:
> Hi,
> 
> 在 2025/04/11 23:01, Ming Lei 写道:
> > On Fri, Apr 11, 2025 at 10:53:12AM +0800, Yu Kuai wrote:
> > > Hi, Ming
> > > 
> > > 在 2025/03/05 12:31, Ming Lei 写道:
> > > > Now ->carryover_bytes[] and ->carryover_ios[] only covers limit/config
> > > > update.
> > > > 
> > > > Actually the carryover bytes/ios can be carried to ->bytes_disp[] and
> > > > ->io_disp[] directly, since the carryover is one-shot thing and only valid
> > > > in current slice.
> > > > 
> > > > Then we can remove the two fields and simplify code much.
> > > > 
> > > > Type of ->bytes_disp[] and ->io_disp[] has to change as signed because the
> > > > two fields may become negative when updating limits or config, but both are
> > > > big enough for holding bytes/ios dispatched in single slice
> > > > 
> > > > Cc: Tejun Heo <tj@kernel.org>
> > > > Cc: Josef Bacik <josef@toxicpanda.com>
> > > > Cc: Yu Kuai <yukuai3@huawei.com>
> > > > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > > > ---
> > > >    block/blk-throttle.c | 49 +++++++++++++++++++-------------------------
> > > >    block/blk-throttle.h |  4 ++--
> > > >    2 files changed, 23 insertions(+), 30 deletions(-)
> > > > 
> > > > diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> > > > index 7271aee94faf..91dab43c65ab 100644
> > > > --- a/block/blk-throttle.c
> > > > +++ b/block/blk-throttle.c
> > > > @@ -478,8 +478,6 @@ static inline void throtl_start_new_slice_with_credit(struct throtl_grp *tg,
> > > >    {
> > > >    	tg->bytes_disp[rw] = 0;
> > > >    	tg->io_disp[rw] = 0;
> > > > -	tg->carryover_bytes[rw] = 0;
> > > > -	tg->carryover_ios[rw] = 0;
> > > >    	/*
> > > >    	 * Previous slice has expired. We must have trimmed it after last
> > > > @@ -498,16 +496,14 @@ static inline void throtl_start_new_slice_with_credit(struct throtl_grp *tg,
> > > >    }
> > > >    static inline void throtl_start_new_slice(struct throtl_grp *tg, bool rw,
> > > > -					  bool clear_carryover)
> > > > +					  bool clear)
> > > >    {
> > > > -	tg->bytes_disp[rw] = 0;
> > > > -	tg->io_disp[rw] = 0;
> > > > +	if (clear) {
> > > > +		tg->bytes_disp[rw] = 0;
> > > > +		tg->io_disp[rw] = 0;
> > > > +	}
> > > >    	tg->slice_start[rw] = jiffies;
> > > >    	tg->slice_end[rw] = jiffies + tg->td->throtl_slice;
> > > > -	if (clear_carryover) {
> > > > -		tg->carryover_bytes[rw] = 0;
> > > > -		tg->carryover_ios[rw] = 0;
> > > > -	}
> > > >    	throtl_log(&tg->service_queue,
> > > >    		   "[%c] new slice start=%lu end=%lu jiffies=%lu",
> > > > @@ -617,20 +613,16 @@ static inline void throtl_trim_slice(struct throtl_grp *tg, bool rw)
> > > >    	 */
> > > >    	time_elapsed -= tg->td->throtl_slice;
> > > >    	bytes_trim = calculate_bytes_allowed(tg_bps_limit(tg, rw),
> > > > -					     time_elapsed) +
> > > > -		     tg->carryover_bytes[rw];
> > > > -	io_trim = calculate_io_allowed(tg_iops_limit(tg, rw), time_elapsed) +
> > > > -		  tg->carryover_ios[rw];
> > > > +					     time_elapsed);
> > > > +	io_trim = calculate_io_allowed(tg_iops_limit(tg, rw), time_elapsed);
> > > >    	if (bytes_trim <= 0 && io_trim <= 0)
> > > >    		return;
> > > > -	tg->carryover_bytes[rw] = 0;
> > > >    	if ((long long)tg->bytes_disp[rw] >= bytes_trim)
> > > >    		tg->bytes_disp[rw] -= bytes_trim;
> > > >    	else
> > > >    		tg->bytes_disp[rw] = 0;
> > > > -	tg->carryover_ios[rw] = 0;
> > > >    	if ((int)tg->io_disp[rw] >= io_trim)
> > > >    		tg->io_disp[rw] -= io_trim;
> > > >    	else
> > > > @@ -645,7 +637,8 @@ static inline void throtl_trim_slice(struct throtl_grp *tg, bool rw)
> > > >    		   jiffies);
> > > >    }
> > > > -static void __tg_update_carryover(struct throtl_grp *tg, bool rw)
> > > > +static void __tg_update_carryover(struct throtl_grp *tg, bool rw,
> > > > +				  long long *bytes, int *ios)
> > > >    {
> > > >    	unsigned long jiffy_elapsed = jiffies - tg->slice_start[rw];
> > > >    	u64 bps_limit = tg_bps_limit(tg, rw);
> > > > @@ -658,26 +651,28 @@ static void __tg_update_carryover(struct throtl_grp *tg, bool rw)
> > > >    	 * configuration.
> > > >    	 */
> > > >    	if (bps_limit != U64_MAX)
> > > > -		tg->carryover_bytes[rw] +=
> > > > -			calculate_bytes_allowed(bps_limit, jiffy_elapsed) -
> > > > +		*bytes = calculate_bytes_allowed(bps_limit, jiffy_elapsed) -
> > > >    			tg->bytes_disp[rw];
> > > >    	if (iops_limit != UINT_MAX)
> > > > -		tg->carryover_ios[rw] +=
> > > > -			calculate_io_allowed(iops_limit, jiffy_elapsed) -
> > > > +		*ios = calculate_io_allowed(iops_limit, jiffy_elapsed) -
> > > >    			tg->io_disp[rw];
> > > > +	tg->bytes_disp[rw] -= *bytes;
> > > > +	tg->io_disp[rw] -= *ios;
> > > 
> > > This patch was applied before I got a chance to review it. :( I think the
> > > above update should be:
> > 
> > oops, your review period took too long (> 1 month), :-(
> 
> Yes, I just didn't review in detail when I saw this set was applied...
> > 
> > > 
> > > tg->bytes_disp[rw] = -*bytes;
> > > tg->io_disp[rw] = -*ios;
> > 
> > I think the above is wrong since it simply overrides the existing dispatched
> > bytes/ios.
> > 
> > The calculation can be understood in two ways:
> > 
> > 1) delta = calculate_bytes_allowed(bps_limit, jiffy_elapsed) - tg->bytes_disp[rw];
> > 
> > `delta` represents the difference between theoretical and actual dispatched bytes.
> > 
> > If `delta` > 0, it means we dispatched too little in the past, and we have to
> > subtract `delta` from ->bytes_disp so that we can dispatch more in the future.
> 
> But the problem is that in this patch, slice_start is set to *jiffies*;
> keeping the old disp field, which covers the span from the old slice_start
> to jiffies, does not make sense.



> > 
> > Similar with 'delta < 0'.
> > 
> > 2) from consumer viewpoint:
> > 
> > tg_within_bps_limit(): patched
> > 
> > 	...
> > 	bytes_allowed = calculate_bytes_allowed(bps_limit, jiffy_elapsed_rnd);
> > 	if (bytes_allowed > 0 && tg->bytes_disp[rw] + bio_size <= bytes_allowed)
> > 		...
> > 
> > tg_within_bps_limit(): before patched
> > 	...
> >      bytes_allowed = calculate_bytes_allowed(bps_limit, jiffy_elapsed_rnd) +
> > 		tg->carryover_bytes[rw];
> > 	if (bytes_allowed > 0 && tg->bytes_disp[rw] + bio_size <= bytes_allowed)
> > 		...
> > 
> > So if `delta` is subtracted from `bytes_allowed` in the patched code, we should
> > subtract the same bytes from the ->bytes_disp[] side to maintain the relation.
> > 
> 
> In the original carryover calculation, bytes_disp is always set to 0,
> while slice_start is set to jiffies. The patched version's allowance will
> actually be less than the old version's if bytes_disp is not 0.

Indeed, you are right. Care to send a fix?

Otherwise, please let me know, and I can do it too.


Thanks,
Ming



* Re: [PATCH 3/3] blk-throttle: carry over directly
  2025-04-14  2:32         ` Ming Lei
@ 2025-04-14 11:47           ` Yu Kuai
  0 siblings, 0 replies; 13+ messages in thread
From: Yu Kuai @ 2025-04-14 11:47 UTC (permalink / raw)
  To: Ming Lei, Yu Kuai
  Cc: Jens Axboe, linux-block, Tejun Heo, Josef Bacik, WoZ1zh1,
	yukuai (C)

Hi,

在 2025/04/14 10:32, Ming Lei 写道:
> On Sat, Apr 12, 2025 at 08:37:28AM +0800, Yu Kuai wrote:
>> Hi,
>>
>> 在 2025/04/11 23:01, Ming Lei 写道:
>>> On Fri, Apr 11, 2025 at 10:53:12AM +0800, Yu Kuai wrote:
>>>> Hi, Ming
>>>>
>>>> 在 2025/03/05 12:31, Ming Lei 写道:
>>>>> Now ->carryover_bytes[] and ->carryover_ios[] only covers limit/config
>>>>> update.
>>>>>
>>>>> Actually the carryover bytes/ios can be carried to ->bytes_disp[] and
>>>>> ->io_disp[] directly, since the carryover is one-shot thing and only valid
>>>>> in current slice.
>>>>>
>>>>> Then we can remove the two fields and simplify code much.
>>>>>
>>>>> Type of ->bytes_disp[] and ->io_disp[] has to change as signed because the
>>>>> two fields may become negative when updating limits or config, but both are
>>>>> big enough for holding bytes/ios dispatched in single slice
>>>>>
>>>>> Cc: Tejun Heo <tj@kernel.org>
>>>>> Cc: Josef Bacik <josef@toxicpanda.com>
>>>>> Cc: Yu Kuai <yukuai3@huawei.com>
>>>>> Signed-off-by: Ming Lei <ming.lei@redhat.com>
>>>>> ---
>>>>>     block/blk-throttle.c | 49 +++++++++++++++++++-------------------------
>>>>>     block/blk-throttle.h |  4 ++--
>>>>>     2 files changed, 23 insertions(+), 30 deletions(-)
>>>>>
>>>>> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
>>>>> index 7271aee94faf..91dab43c65ab 100644
>>>>> --- a/block/blk-throttle.c
>>>>> +++ b/block/blk-throttle.c
>>>>> @@ -478,8 +478,6 @@ static inline void throtl_start_new_slice_with_credit(struct throtl_grp *tg,
>>>>>     {
>>>>>     	tg->bytes_disp[rw] = 0;
>>>>>     	tg->io_disp[rw] = 0;
>>>>> -	tg->carryover_bytes[rw] = 0;
>>>>> -	tg->carryover_ios[rw] = 0;
>>>>>     	/*
>>>>>     	 * Previous slice has expired. We must have trimmed it after last
>>>>> @@ -498,16 +496,14 @@ static inline void throtl_start_new_slice_with_credit(struct throtl_grp *tg,
>>>>>     }
>>>>>     static inline void throtl_start_new_slice(struct throtl_grp *tg, bool rw,
>>>>> -					  bool clear_carryover)
>>>>> +					  bool clear)
>>>>>     {
>>>>> -	tg->bytes_disp[rw] = 0;
>>>>> -	tg->io_disp[rw] = 0;
>>>>> +	if (clear) {
>>>>> +		tg->bytes_disp[rw] = 0;
>>>>> +		tg->io_disp[rw] = 0;
>>>>> +	}
>>>>>     	tg->slice_start[rw] = jiffies;
>>>>>     	tg->slice_end[rw] = jiffies + tg->td->throtl_slice;
>>>>> -	if (clear_carryover) {
>>>>> -		tg->carryover_bytes[rw] = 0;
>>>>> -		tg->carryover_ios[rw] = 0;
>>>>> -	}
>>>>>     	throtl_log(&tg->service_queue,
>>>>>     		   "[%c] new slice start=%lu end=%lu jiffies=%lu",
>>>>> @@ -617,20 +613,16 @@ static inline void throtl_trim_slice(struct throtl_grp *tg, bool rw)
>>>>>     	 */
>>>>>     	time_elapsed -= tg->td->throtl_slice;
>>>>>     	bytes_trim = calculate_bytes_allowed(tg_bps_limit(tg, rw),
>>>>> -					     time_elapsed) +
>>>>> -		     tg->carryover_bytes[rw];
>>>>> -	io_trim = calculate_io_allowed(tg_iops_limit(tg, rw), time_elapsed) +
>>>>> -		  tg->carryover_ios[rw];
>>>>> +					     time_elapsed);
>>>>> +	io_trim = calculate_io_allowed(tg_iops_limit(tg, rw), time_elapsed);
>>>>>     	if (bytes_trim <= 0 && io_trim <= 0)
>>>>>     		return;
>>>>> -	tg->carryover_bytes[rw] = 0;
>>>>>     	if ((long long)tg->bytes_disp[rw] >= bytes_trim)
>>>>>     		tg->bytes_disp[rw] -= bytes_trim;
>>>>>     	else
>>>>>     		tg->bytes_disp[rw] = 0;
>>>>> -	tg->carryover_ios[rw] = 0;
>>>>>     	if ((int)tg->io_disp[rw] >= io_trim)
>>>>>     		tg->io_disp[rw] -= io_trim;
>>>>>     	else
>>>>> @@ -645,7 +637,8 @@ static inline void throtl_trim_slice(struct throtl_grp *tg, bool rw)
>>>>>     		   jiffies);
>>>>>     }
>>>>> -static void __tg_update_carryover(struct throtl_grp *tg, bool rw)
>>>>> +static void __tg_update_carryover(struct throtl_grp *tg, bool rw,
>>>>> +				  long long *bytes, int *ios)
>>>>>     {
>>>>>     	unsigned long jiffy_elapsed = jiffies - tg->slice_start[rw];
>>>>>     	u64 bps_limit = tg_bps_limit(tg, rw);
>>>>> @@ -658,26 +651,28 @@ static void __tg_update_carryover(struct throtl_grp *tg, bool rw)
>>>>>     	 * configuration.
>>>>>     	 */
>>>>>     	if (bps_limit != U64_MAX)
>>>>> -		tg->carryover_bytes[rw] +=
>>>>> -			calculate_bytes_allowed(bps_limit, jiffy_elapsed) -
>>>>> +		*bytes = calculate_bytes_allowed(bps_limit, jiffy_elapsed) -
>>>>>     			tg->bytes_disp[rw];
>>>>>     	if (iops_limit != UINT_MAX)
>>>>> -		tg->carryover_ios[rw] +=
>>>>> -			calculate_io_allowed(iops_limit, jiffy_elapsed) -
>>>>> +		*ios = calculate_io_allowed(iops_limit, jiffy_elapsed) -
>>>>>     			tg->io_disp[rw];
>>>>> +	tg->bytes_disp[rw] -= *bytes;
>>>>> +	tg->io_disp[rw] -= *ios;
>>>>
>>>> This patch was applied before I got a chance to review it. :( I think the
>>>> above update should be:
>>>
>>> oops, your review period took too long (> 1 month), :-(
>>
>> Yes, I just didn't review in detail when I saw this set was applied...
>>>
>>>>
>>>> tg->bytes_disp[rw] = -*bytes;
>>>> tg->io_disp[rw] = -*ios;
>>>
>>> I think the above is wrong since it simply overrides the existing dispatched
>>> bytes/ios.
>>>
>>> The calculation can be understood in two ways:
>>>
>>> 1) delta = calculate_bytes_allowed(bps_limit, jiffy_elapsed) - tg->bytes_disp[rw];
>>>
>>> `delta` represents the difference between theoretical and actual dispatched bytes.
>>>
>>> If `delta` > 0, it means we dispatched too little in the past, and we have to
>>> subtract `delta` from ->bytes_disp so that we can dispatch more in the future.
>>
>> But the problem is that in this patch, slice_start is set to *jiffies*;
>> keeping the old disp field, which covers the span from the old slice_start
>> to jiffies, does not make sense.
> 
> 
> 
>>>
>>> Similar with 'delta < 0'.
>>>
>>> 2) from consumer viewpoint:
>>>
>>> tg_within_bps_limit(): patched
>>>
>>> 	...
>>> 	bytes_allowed = calculate_bytes_allowed(bps_limit, jiffy_elapsed_rnd);
>>> 	if (bytes_allowed > 0 && tg->bytes_disp[rw] + bio_size <= bytes_allowed)
>>> 		...
>>>
>>> tg_within_bps_limit(): before patched
>>> 	...
>>>       bytes_allowed = calculate_bytes_allowed(bps_limit, jiffy_elapsed_rnd) +
>>> 		tg->carryover_bytes[rw];
>>> 	if (bytes_allowed > 0 && tg->bytes_disp[rw] + bio_size <= bytes_allowed)
>>> 		...
>>>
>>> So if `delta` is subtracted from `bytes_allowed` in the patched code, we should
>>> subtract the same bytes from the ->bytes_disp[] side to maintain the relation.
>>>
>>
>> In the original carryover calculation, bytes_disp is always set to 0,
>> while slice_start is set to jiffies. The patched version's allowance will
>> actually be less than the old version's if bytes_disp is not 0.
> 
> Indeed, you are right, care to send one fix?

Sure, my colleague is working on this. :)
I'll review internally first, if you don't mind.

Thanks,
Kuai

> 
> Otherwise, please let me know, and I can do it too.
> 
> 
> Thanks,
> Ming
> 
> 
> .
> 



end of thread, other threads:[~2025-04-14 11:47 UTC | newest]

Thread overview: 13+ messages
2025-03-05  4:31 [PATCH 0/3] blk-throttle: remove last_bytes/ios and carryover byte/ios Ming Lei
2025-03-05  4:31 ` [PATCH 1/3] blk-throttle: remove last_bytes_disp and last_ios_disp Ming Lei
2025-03-05 19:07   ` Tejun Heo
2025-03-05  4:31 ` [PATCH 2/3] blk-throttle: don't take carryover for prioritized processing of metadata Ming Lei
2025-03-05 19:11   ` Tejun Heo
2025-03-05  4:31 ` [PATCH 3/3] blk-throttle: carry over directly Ming Lei
2025-03-05 19:16   ` Tejun Heo
2025-04-11  2:53   ` Yu Kuai
2025-04-11 15:01     ` Ming Lei
2025-04-12  0:37       ` Yu Kuai
2025-04-14  2:32         ` Ming Lei
2025-04-14 11:47           ` Yu Kuai
2025-03-05 23:25 ` [PATCH 0/3] blk-throttle: remove last_bytes/ios and carryover byte/ios Jens Axboe
