cgroups.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 00/10] A few bugfix and cleanup patches for blk-throttle
@ 2022-11-29  3:01 Kemeng Shi
  2022-11-29  3:01 ` [PATCH v2 01/10] blk-throttle: correct stale comment in throtl_pd_init Kemeng Shi
                   ` (6 more replies)
  0 siblings, 7 replies; 19+ messages in thread
From: Kemeng Shi @ 2022-11-29  3:01 UTC (permalink / raw)
  To: tj, josef, axboe; +Cc: cgroups, linux-block, linux-kernel, shikemeng

Hi, this series contain a few patches to fix problem when on the default
hierarchy, corrent comment and so on. More detail can be found in
respective changelogs. Thanks.

---
V1->V2:
 -Thanks for the review and advice from Tejun. The corrected comment of
  "blk-throttle: correct stale comment in throtl_pd_init" and the
  solution of "blk-throttle: Fix that bps of child could exceed bps
  limited in parent" are from reply of Tejun.
 -Collect Ack from Tejun.
 -Fix the compile problem when CONFIG_BLK_DEV_THROTTLING_LOW is set.
 -Drop "blk-throttle: Limit whole system if root group is configured
  when on the default hierarchy", "blk-throttle: remove unnecessary check
  for validation of limit index" and "blk-throttle: remove unused variable
  td in tg_update_has_rules"
 -Add "blk-throttle: correct stale comment in throtl_pd_init" and
  "blk-throttle: avoid dead code in throtl_hierarchy_can_upgrade"
 -Use solution that set the BIO_BPS_THROTTLED flag only when the bio
  traversed the entire tree to fix that bps of child could exceed bps
  limited in parent in patch 2/10.
 -Improve the description and comment of most commits.
---

Kemeng Shi (10):
  blk-throttle: correct stale comment in throtl_pd_init
  blk-throttle: Fix that bps of child could exceed bps limited in parent
  blk-throttle: ignore cgroup without io queued in
    blk_throtl_cancel_bios
  blk-throttle: correct calculation of wait time in tg_may_dispatch
  blk-throttle: simpfy low limit reached check in throtl_tg_can_upgrade
  blk-throttle: fix typo in comment of throtl_adjusted_limit
  blk-throttle: remove incorrect comment for tg_last_low_overflow_time
  blk-throttle: remove repeat check of elapsed time from last upgrade in
    throtl_hierarchy_can_downgrade
  blk-throttle: Use more siutable time_after check for update of
    slice_start
  blk-throttle: avoid dead code in throtl_hierarchy_can_upgrade

 block/blk-throttle.c | 104 +++++++++++++++++++++++--------------------
 1 file changed, 56 insertions(+), 48 deletions(-)

-- 
2.30.0


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v2 01/10] blk-throttle: correct stale comment in throtl_pd_init
  2022-11-29  3:01 [PATCH v2 00/10] A few bugfix and cleanup patches for blk-throttle Kemeng Shi
@ 2022-11-29  3:01 ` Kemeng Shi
  2022-11-30 21:08   ` Tejun Heo
  2022-11-29  3:01 ` [PATCH v2 04/10] blk-throttle: correct calculation of wait time in tg_may_dispatch Kemeng Shi
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 19+ messages in thread
From: Kemeng Shi @ 2022-11-29  3:01 UTC (permalink / raw)
  To: tj, josef, axboe; +Cc: cgroups, linux-block, linux-kernel, shikemeng

On the default hierarchy (cgroup2), the throttle interface files don't
exist in the root cgroup, so the ablity to limit the whole system
by configuring root group is not existing anymore. In general, cgroup
doesn't wanna be in the business of restricting resources at the
system level, so correct the stale comment that we can limit whole
system to we can only limit subtree.

Signed-off-by: Kemeng Shi <shikemeng@huawei.com>
---
 block/blk-throttle.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 847721dc2b2b..59acfac87764 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -395,8 +395,9 @@ static void throtl_pd_init(struct blkg_policy_data *pd)
 	 * If on the default hierarchy, we switch to properly hierarchical
 	 * behavior where limits on a given throtl_grp are applied to the
 	 * whole subtree rather than just the group itself.  e.g. If 16M
-	 * read_bps limit is set on the root group, the whole system can't
-	 * exceed 16M for the device.
+	 * read_bps limit is set on a "parent" group, summary bps of
+	 * "parent" group and its subtree groups can't exceed 16M for the
+	 * device.
 	 *
 	 * If not on the default hierarchy, the broken flat hierarchy
 	 * behavior is retained where all throtl_grps are treated as if
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 02/10] blk-throttle: Fix that bps of child could exceed bps limited in parent
       [not found] ` <20221129030147.27400-1-shikemeng-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
@ 2022-11-29  3:01   ` Kemeng Shi
       [not found]     ` <20221129030147.27400-3-shikemeng-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
  2022-11-29  3:01   ` [PATCH v2 03/10] blk-throttle: ignore cgroup without io queued in blk_throtl_cancel_bios Kemeng Shi
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 19+ messages in thread
From: Kemeng Shi @ 2022-11-29  3:01 UTC (permalink / raw)
  To: tj-DgEjT+Ai2ygdnm+yROfE0A, josef-DigfWCa+lFGyeJad7bwFQA,
	axboe-tSWWG44O7X1aa/9Udqfwiw
  Cc: cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-block-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	shikemeng-hv44wF8Li93QT0dZR+AlfA

Consider situation as following (on the default hierarchy):
 HDD
  |
root (bps limit: 4k)
  |
child (bps limit :8k)
  |
fio bs=8k
Rate of fio is supposed to be 4k, but result is 8k. Reason is as
following:
Size of single IO from fio is larger than bytes allowed in one
throtl_slice in child, so IOs are always queued in child group first.
When queued IOs in child are dispatched to parent group, BIO_BPS_THROTTLED
is set and these IOs will not be limited by tg_within_bps_limit anymore.
Fix this by only set BIO_BPS_THROTTLED when the bio traversed the entire
tree.

There patch has no influence on situation which is not on the default
hierarchy as each group is a single root group without parent.

Signed-off-by: Kemeng Shi <shikemeng-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
---
 block/blk-throttle.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 59acfac87764..d516a37e82fb 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -1067,7 +1067,6 @@ static void tg_dispatch_one_bio(struct throtl_grp *tg, bool rw)
 	sq->nr_queued[rw]--;
 
 	throtl_charge_bio(tg, bio);
-	bio_set_flag(bio, BIO_BPS_THROTTLED);
 
 	/*
 	 * If our parent is another tg, we just need to transfer @bio to
@@ -1080,6 +1079,7 @@ static void tg_dispatch_one_bio(struct throtl_grp *tg, bool rw)
 		throtl_add_bio_tg(bio, &tg->qnode_on_parent[rw], parent_tg);
 		start_parent_slice_with_credit(tg, parent_tg, rw);
 	} else {
+		bio_set_flag(bio, BIO_BPS_THROTTLED);
 		throtl_qnode_add_bio(bio, &tg->qnode_on_parent[rw],
 				     &parent_sq->queued[rw]);
 		BUG_ON(tg->td->nr_queued[rw] <= 0);
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 03/10] blk-throttle: ignore cgroup without io queued in blk_throtl_cancel_bios
       [not found] ` <20221129030147.27400-1-shikemeng-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
  2022-11-29  3:01   ` [PATCH v2 02/10] blk-throttle: Fix that bps of child could exceed bps limited in parent Kemeng Shi
@ 2022-11-29  3:01   ` Kemeng Shi
  2022-11-29  3:01   ` [PATCH v2 05/10] blk-throttle: simpfy low limit reached check in throtl_tg_can_upgrade Kemeng Shi
  2022-11-29  3:01   ` [PATCH v2 10/10] blk-throttle: avoid dead code in throtl_hierarchy_can_upgrade Kemeng Shi
  3 siblings, 0 replies; 19+ messages in thread
From: Kemeng Shi @ 2022-11-29  3:01 UTC (permalink / raw)
  To: tj-DgEjT+Ai2ygdnm+yROfE0A, josef-DigfWCa+lFGyeJad7bwFQA,
	axboe-tSWWG44O7X1aa/9Udqfwiw
  Cc: cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-block-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	shikemeng-hv44wF8Li93QT0dZR+AlfA

Ignore cgroup without io queued in blk_throtl_cancel_bios for two
reasons:
1. Save cpu cycle for trying to dispatch cgroup which is no io queued.
2. Avoid non-consistent state that cgroup is inserted to service queue
without THROTL_TG_PENDING set as tg_update_disptime will unconditional
re-insert cgroup to service queue. If we are on the default hierarchy,
IO dispatched from child in tg_dispatch_one_bio will trigger inserting
cgroup to service queue without erase first and ruin the tree.

Signed-off-by: Kemeng Shi <shikemeng-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Acked-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 block/blk-throttle.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index d516a37e82fb..ee7dc1a0adfd 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -1738,7 +1738,18 @@ void blk_throtl_cancel_bios(struct gendisk *disk)
 		 * Set the flag to make sure throtl_pending_timer_fn() won't
 		 * stop until all throttled bios are dispatched.
 		 */
-		blkg_to_tg(blkg)->flags |= THROTL_TG_CANCELING;
+		tg->flags |= THROTL_TG_CANCELING;
+
+		/*
+		 * Do not dispatch cgroup without THROTL_TG_PENDING or cgroup
+		 * will be inserted to service queue without THROTL_TG_PENDING
+		 * set in tg_update_disptime below. Then IO dispatched from
+		 * child in tg_dispatch_one_bio will trigger double insertion
+		 * and corrupt the tree.
+		 */
+		if (!(tg->flags & THROTL_TG_PENDING))
+			continue;
+
 		/*
 		 * Update disptime after setting the above flag to make sure
 		 * throtl_select_dispatch() won't exit without dispatching.
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 04/10] blk-throttle: correct calculation of wait time in tg_may_dispatch
  2022-11-29  3:01 [PATCH v2 00/10] A few bugfix and cleanup patches for blk-throttle Kemeng Shi
  2022-11-29  3:01 ` [PATCH v2 01/10] blk-throttle: correct stale comment in throtl_pd_init Kemeng Shi
@ 2022-11-29  3:01 ` Kemeng Shi
  2022-11-30 21:27   ` Tejun Heo
       [not found] ` <20221129030147.27400-1-shikemeng-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 19+ messages in thread
From: Kemeng Shi @ 2022-11-29  3:01 UTC (permalink / raw)
  To: tj, josef, axboe; +Cc: cgroups, linux-block, linux-kernel, shikemeng

In C language, When executing "if (expression1 && expression2)" and
expression1 return false, the expression2 may not be executed.
For "tg_within_bps_limit(tg, bio, bps_limit, &bps_wait) &&
tg_within_iops_limit(tg, bio, iops_limit, &iops_wait))", if bps is
limited, tg_within_bps_limit will return false and
tg_within_iops_limit will not be called. So even bps and iops are
both limited, iops_wait will not be calculated and is always zero.
So wait time of iops is always ignored.

Fix this by always calling tg_within_bps_limit and tg_within_iops_limit
to get wait time for both bps and iops.

Observed that:
1. Wait time in tg_within_iops_limit/tg_within_bps_limit need always
be stored as wait argument is always passed.
2. wait time is stored to zero if iops/bps is limited otherwise non-zero
is stored.
Simpfy tg_within_iops_limit/tg_within_bps_limit by removing wait argument
and return wait time directly. Caller tg_may_dispatch checks if wait time
is zero to find if iops/bps is limited.

Signed-off-by: Kemeng Shi <shikemeng@huawei.com>
---
 block/blk-throttle.c | 38 +++++++++++++-------------------------
 1 file changed, 13 insertions(+), 25 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index ee7dc1a0adfd..06e4193b064e 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -822,17 +822,15 @@ static void tg_update_carryover(struct throtl_grp *tg)
 		   tg->carryover_ios[READ], tg->carryover_ios[WRITE]);
 }
 
-static bool tg_within_iops_limit(struct throtl_grp *tg, struct bio *bio,
-				 u32 iops_limit, unsigned long *wait)
+static unsigned long tg_within_iops_limit(struct throtl_grp *tg, struct bio *bio,
+				 u32 iops_limit)
 {
 	bool rw = bio_data_dir(bio);
 	unsigned int io_allowed;
 	unsigned long jiffy_elapsed, jiffy_wait, jiffy_elapsed_rnd;
 
 	if (iops_limit == UINT_MAX) {
-		if (wait)
-			*wait = 0;
-		return true;
+		return 0;
 	}
 
 	jiffy_elapsed = jiffies - tg->slice_start[rw];
@@ -842,21 +840,16 @@ static bool tg_within_iops_limit(struct throtl_grp *tg, struct bio *bio,
 	io_allowed = calculate_io_allowed(iops_limit, jiffy_elapsed_rnd) +
 		     tg->carryover_ios[rw];
 	if (tg->io_disp[rw] + 1 <= io_allowed) {
-		if (wait)
-			*wait = 0;
-		return true;
+		return 0;
 	}
 
 	/* Calc approx time to dispatch */
 	jiffy_wait = jiffy_elapsed_rnd - jiffy_elapsed;
-
-	if (wait)
-		*wait = jiffy_wait;
-	return false;
+	return jiffy_wait;
 }
 
-static bool tg_within_bps_limit(struct throtl_grp *tg, struct bio *bio,
-				u64 bps_limit, unsigned long *wait)
+static unsigned long tg_within_bps_limit(struct throtl_grp *tg, struct bio *bio,
+				u64 bps_limit)
 {
 	bool rw = bio_data_dir(bio);
 	u64 bytes_allowed, extra_bytes;
@@ -865,9 +858,7 @@ static bool tg_within_bps_limit(struct throtl_grp *tg, struct bio *bio,
 
 	/* no need to throttle if this bio's bytes have been accounted */
 	if (bps_limit == U64_MAX || bio_flagged(bio, BIO_BPS_THROTTLED)) {
-		if (wait)
-			*wait = 0;
-		return true;
+		return 0;
 	}
 
 	jiffy_elapsed = jiffy_elapsed_rnd = jiffies - tg->slice_start[rw];
@@ -880,9 +871,7 @@ static bool tg_within_bps_limit(struct throtl_grp *tg, struct bio *bio,
 	bytes_allowed = calculate_bytes_allowed(bps_limit, jiffy_elapsed_rnd) +
 			tg->carryover_bytes[rw];
 	if (tg->bytes_disp[rw] + bio_size <= bytes_allowed) {
-		if (wait)
-			*wait = 0;
-		return true;
+		return 0;
 	}
 
 	/* Calc approx time to dispatch */
@@ -897,9 +886,7 @@ static bool tg_within_bps_limit(struct throtl_grp *tg, struct bio *bio,
 	 * up we did. Add that time also.
 	 */
 	jiffy_wait = jiffy_wait + (jiffy_elapsed_rnd - jiffy_elapsed);
-	if (wait)
-		*wait = jiffy_wait;
-	return false;
+	return jiffy_wait;
 }
 
 /*
@@ -947,8 +934,9 @@ static bool tg_may_dispatch(struct throtl_grp *tg, struct bio *bio,
 				jiffies + tg->td->throtl_slice);
 	}
 
-	if (tg_within_bps_limit(tg, bio, bps_limit, &bps_wait) &&
-	    tg_within_iops_limit(tg, bio, iops_limit, &iops_wait)) {
+	bps_wait = tg_within_bps_limit(tg, bio, bps_limit);
+	iops_wait = tg_within_iops_limit(tg, bio, iops_limit);
+	if (bps_wait + iops_wait == 0) {
 		if (wait)
 			*wait = 0;
 		return true;
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 05/10] blk-throttle: simpfy low limit reached check in throtl_tg_can_upgrade
       [not found] ` <20221129030147.27400-1-shikemeng-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
  2022-11-29  3:01   ` [PATCH v2 02/10] blk-throttle: Fix that bps of child could exceed bps limited in parent Kemeng Shi
  2022-11-29  3:01   ` [PATCH v2 03/10] blk-throttle: ignore cgroup without io queued in blk_throtl_cancel_bios Kemeng Shi
@ 2022-11-29  3:01   ` Kemeng Shi
       [not found]     ` <20221129030147.27400-6-shikemeng-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
  2022-11-29  3:01   ` [PATCH v2 10/10] blk-throttle: avoid dead code in throtl_hierarchy_can_upgrade Kemeng Shi
  3 siblings, 1 reply; 19+ messages in thread
From: Kemeng Shi @ 2022-11-29  3:01 UTC (permalink / raw)
  To: tj-DgEjT+Ai2ygdnm+yROfE0A, josef-DigfWCa+lFGyeJad7bwFQA,
	axboe-tSWWG44O7X1aa/9Udqfwiw
  Cc: cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-block-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	shikemeng-hv44wF8Li93QT0dZR+AlfA

Commit c79892c557616 ("blk-throttle: add upgrade logic for LIMIT_LOW
state") added upgrade logic for low limit and methioned that
1. "To determine if a cgroup exceeds its limitation, we check if the cgroup
has pending request. Since cgroup is throttled according to the limit,
pending request means the cgroup reaches the limit."
2. "If a cgroup has limit set for both read and write, we consider the
combination of them for upgrade. The reason is read IO and write IO can
interfere with each other. If we do the upgrade based in one direction IO,
the other direction IO could be severly harmed."
Besides, we also determine that cgroup reaches low limit if low limit is 0,
see comment in throtl_tg_can_upgrade.

Collect the information above, the desgin of upgrade check is as following:
1.The low limit is reached if limit is zero or io is already queued.
2.Cgroup will pass upgrade check if low limits of READ and WRITE are both
reached.

Simpfy the check code described above to removce repeat check and improve
readability. There is no functional change.

Detail equivalence proof is as following:
All replaced conditions to return true are as following:
condition 1
(!read_limit && !write_limit)
condition 2
read_limit && sq->nr_queued[READ] && (!write_limit || sq->nr_queued[WRITE])
condition 3
write_limit && sq->nr_queued[WRITE] && (!read_limit || sq->nr_queued[READ])

Transferring condition 2 as following:
read_limit && sq->nr_queued[READ] && (!write_limit || sq->nr_queued[WRITE])
is equivalent to
(read_limit && sq->nr_queued[READ]) &&
(!write_limit || (write_limit && sq->nr_queued[WRITE]))
is equivalent to
condition 2.1
(read_limit && sq->nr_queued[READ] && !write_limit) ||
condition 2.2
(read_limit && sq->nr_queued[READ] &&
(write_limit && sq->nr_queued[WRITE]))

Transferring condition 3 as following:
write_limit && sq->nr_queued[WRITE] && (!read_limit || sq->nr_queued[READ])
is equivalent to
(write_limit && sq->nr_queued[WRITE]) &&
(!read_limit || (read_limit && sq->nr_queued[READ]))
is equivalent to
condition 3.1
((write_limit && sq->nr_queued[WRITE]) && !read_limit) ||
condition 3.2
((write_limit && sq->nr_queued[WRITE]) &&
(read_limit && sq->nr_queued[READ]))

Condition 3.2 is the same as condition 2.2, so all conditions we get to
return are as following:
(!read_limit && !write_limit) (1)
(!read_limit && (write_limit && sq->nr_queued[WRITE])) (3.1)
((read_limit && sq->nr_queued[READ]) && !write_limit) (2.1)
((write_limit && sq->nr_queued[WRITE]) &&
(read_limit && sq->nr_queued[READ])) (2.2)

As we can extract conditions "(a1 || a2) && (b1 || b2)" to:
a1 && b1
a1 && b2
a2 && b1
ab && b2

Considering that:
a1 = !read_limit
a2 = read_limit && sq->nr_queued[READ]
b1 = !write_limit
b2 = write_limit && sq->nr_queued[WRITE]

We can pack replaced conditions to
(!read_limint || (read_limit && sq->nr_queued[READ])) &&
(!write_limit || (write_limit && sq->nr_queued[WRITE])
which is equivalent to
(!read_limint || sq->nr_queued[READ]) &&
(!write_limit || sq->nr_queued[WRITE])

Signed-off-by: Kemeng Shi <shikemeng-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Reported-by: kernel test robot <lkp-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 block/blk-throttle.c | 31 ++++++++++++++++++-------------
 1 file changed, 18 insertions(+), 13 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 06e4193b064e..4977fac56a00 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -1816,24 +1816,29 @@ static bool throtl_tg_is_idle(struct throtl_grp *tg)
 	return ret;
 }
 
-static bool throtl_tg_can_upgrade(struct throtl_grp *tg)
+static bool throtl_tg_reach_low_limit(struct throtl_grp *tg, int rw)
 {
 	struct throtl_service_queue *sq = &tg->service_queue;
-	bool read_limit, write_limit;
+	bool limit = tg->bps[rw][LIMIT_LOW] || tg->iops[rw][LIMIT_LOW];
 
 	/*
-	 * if cgroup reaches low limit (if low limit is 0, the cgroup always
-	 * reaches), it's ok to upgrade to next limit
+	 * if low limit is zero, low limit is always reached.
+	 * if low limit is non-zero, we can check if there is any request
+	 * is queued to determine if low limit is reached as we throttle
+	 * request according to limit.
 	 */
-	read_limit = tg->bps[READ][LIMIT_LOW] || tg->iops[READ][LIMIT_LOW];
-	write_limit = tg->bps[WRITE][LIMIT_LOW] || tg->iops[WRITE][LIMIT_LOW];
-	if (!read_limit && !write_limit)
-		return true;
-	if (read_limit && sq->nr_queued[READ] &&
-	    (!write_limit || sq->nr_queued[WRITE]))
-		return true;
-	if (write_limit && sq->nr_queued[WRITE] &&
-	    (!read_limit || sq->nr_queued[READ]))
+	return !limit || sq->nr_queued[rw];
+}
+
+static bool throtl_tg_can_upgrade(struct throtl_grp *tg)
+{
+	/*
+	 * cgroup reaches low limit when low limit of READ and WRITE are
+	 * both reached, it's ok to upgrade to next limit if cgroup reaches
+	 * low limit
+	 */
+	if (throtl_tg_reach_low_limit(tg, READ) &&
+	    throtl_tg_reach_low_limit(tg, WRITE))
 		return true;
 
 	if (time_after_eq(jiffies,
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 06/10] blk-throttle: fix typo in comment of throtl_adjusted_limit
  2022-11-29  3:01 [PATCH v2 00/10] A few bugfix and cleanup patches for blk-throttle Kemeng Shi
                   ` (2 preceding siblings ...)
       [not found] ` <20221129030147.27400-1-shikemeng-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
@ 2022-11-29  3:01 ` Kemeng Shi
  2022-11-29  3:01 ` [PATCH v2 07/10] blk-throttle: remove incorrect comment for tg_last_low_overflow_time Kemeng Shi
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 19+ messages in thread
From: Kemeng Shi @ 2022-11-29  3:01 UTC (permalink / raw)
  To: tj, josef, axboe; +Cc: cgroups, linux-block, linux-kernel, shikemeng

lapsed time -> elapsed time

Signed-off-by: Kemeng Shi <shikemeng@huawei.com>
Acked-by: Tejun Heo <tj@kernel.org>
---
 block/blk-throttle.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 4977fac56a00..4b11099625da 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -129,7 +129,7 @@ static struct throtl_data *sq_to_td(struct throtl_service_queue *sq)
 /*
  * cgroup's limit in LIMIT_MAX is scaled if low limit is set. This scale is to
  * make the IO dispatch more smooth.
- * Scale up: linearly scale up according to lapsed time since upgrade. For
+ * Scale up: linearly scale up according to elapsed time since upgrade. For
  *           every throtl_slice, the limit scales up 1/2 .low limit till the
  *           limit hits .max limit
  * Scale down: exponentially scale down if a cgroup doesn't hit its .low limit
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 07/10] blk-throttle: remove incorrect comment for tg_last_low_overflow_time
  2022-11-29  3:01 [PATCH v2 00/10] A few bugfix and cleanup patches for blk-throttle Kemeng Shi
                   ` (3 preceding siblings ...)
  2022-11-29  3:01 ` [PATCH v2 06/10] blk-throttle: fix typo in comment of throtl_adjusted_limit Kemeng Shi
@ 2022-11-29  3:01 ` Kemeng Shi
  2022-11-29  3:01 ` [PATCH v2 08/10] blk-throttle: remove repeat check of elapsed time from last upgrade in throtl_hierarchy_can_downgrade Kemeng Shi
  2022-11-29  3:01 ` [PATCH v2 09/10] blk-throttle: Use more siutable time_after check for update of slice_start Kemeng Shi
  6 siblings, 0 replies; 19+ messages in thread
From: Kemeng Shi @ 2022-11-29  3:01 UTC (permalink / raw)
  To: tj, josef, axboe; +Cc: cgroups, linux-block, linux-kernel, shikemeng

Function tg_last_low_overflow_time is called with intermediate node as
following:
throtl_hierarchy_can_downgrade
  throtl_tg_can_downgrade
    tg_last_low_overflow_time

throtl_hierarchy_can_upgrade
  throtl_tg_can_upgrade
    tg_last_low_overflow_time

throtl_hierarchy_can_downgrade/throtl_hierarchy_can_upgrade will traverse
from leaf node to sub-root node and pass traversed intermediate node
to tg_last_low_overflow_time.

No such limit could be found from context and implementation of
tg_last_low_overflow_time, so remove this limit in comment.

Signed-off-by: Kemeng Shi <shikemeng@huawei.com>
Acked-by: Tejun Heo <tj@kernel.org>
---
 block/blk-throttle.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 4b11099625da..3ddd8a15aa3f 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -1762,7 +1762,6 @@ static unsigned long __tg_last_low_overflow_time(struct throtl_grp *tg)
 	return min(rtime, wtime);
 }
 
-/* tg should not be an intermediate node */
 static unsigned long tg_last_low_overflow_time(struct throtl_grp *tg)
 {
 	struct throtl_service_queue *parent_sq;
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 08/10] blk-throttle: remove repeat check of elapsed time from last upgrade in throtl_hierarchy_can_downgrade
  2022-11-29  3:01 [PATCH v2 00/10] A few bugfix and cleanup patches for blk-throttle Kemeng Shi
                   ` (4 preceding siblings ...)
  2022-11-29  3:01 ` [PATCH v2 07/10] blk-throttle: remove incorrect comment for tg_last_low_overflow_time Kemeng Shi
@ 2022-11-29  3:01 ` Kemeng Shi
       [not found]   ` <20221129030147.27400-9-shikemeng-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
  2022-11-29  3:01 ` [PATCH v2 09/10] blk-throttle: Use more siutable time_after check for update of slice_start Kemeng Shi
  6 siblings, 1 reply; 19+ messages in thread
From: Kemeng Shi @ 2022-11-29  3:01 UTC (permalink / raw)
  To: tj, josef, axboe; +Cc: cgroups, linux-block, linux-kernel, shikemeng

There is no need to check elapsed time from last upgrade for each node in
hierarchy. Move this check before traversing as throtl_can_upgrade do
to remove repeat check.

Signed-off-by: Kemeng Shi <shikemeng@huawei.com>
Reported-by: kernel test robot <lkp@intel.com>
---
 block/blk-throttle.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 3ddd8a15aa3f..94d850b57462 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -1955,8 +1955,7 @@ static bool throtl_tg_can_downgrade(struct throtl_grp *tg)
 	 * If cgroup is below low limit, consider downgrade and throttle other
 	 * cgroups
 	 */
-	if (time_after_eq(now, td->low_upgrade_time + td->throtl_slice) &&
-	    time_after_eq(now, tg_last_low_overflow_time(tg) +
+	if (time_after_eq(now, tg_last_low_overflow_time(tg) +
 					td->throtl_slice) &&
 	    (!throtl_tg_is_idle(tg) ||
 	     !list_empty(&tg_to_blkg(tg)->blkcg->css.children)))
@@ -1966,6 +1965,11 @@ static bool throtl_tg_can_downgrade(struct throtl_grp *tg)
 
 static bool throtl_hierarchy_can_downgrade(struct throtl_grp *tg)
 {
+	struct throtl_data *td = tg->td;
+
+	if (time_before(jiffies, td->low_upgrade_time + td->throtl_slice))
+		return false;
+
 	while (true) {
 		if (!throtl_tg_can_downgrade(tg))
 			return false;
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 09/10] blk-throttle: Use more siutable time_after check for update of slice_start
  2022-11-29  3:01 [PATCH v2 00/10] A few bugfix and cleanup patches for blk-throttle Kemeng Shi
                   ` (5 preceding siblings ...)
  2022-11-29  3:01 ` [PATCH v2 08/10] blk-throttle: remove repeat check of elapsed time from last upgrade in throtl_hierarchy_can_downgrade Kemeng Shi
@ 2022-11-29  3:01 ` Kemeng Shi
  2022-11-30 21:34   ` Tejun Heo
  6 siblings, 1 reply; 19+ messages in thread
From: Kemeng Shi @ 2022-11-29  3:01 UTC (permalink / raw)
  To: tj, josef, axboe; +Cc: cgroups, linux-block, linux-kernel, shikemeng

There is no need to update tg->slice_start[rw] to start when they are
equal already. So remove "eq" part of check before update slice_start.

Signed-off-by: Kemeng Shi <shikemeng@huawei.com>
---
 block/blk-throttle.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 94d850b57462..4c80f2aa1e29 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -645,7 +645,7 @@ static inline void throtl_start_new_slice_with_credit(struct throtl_grp *tg,
 	 * that bandwidth. Do try to make use of that bandwidth while giving
 	 * credit.
 	 */
-	if (time_after_eq(start, tg->slice_start[rw]))
+	if (time_after(start, tg->slice_start[rw]))
 		tg->slice_start[rw] = start;
 
 	tg->slice_end[rw] = jiffies + tg->td->throtl_slice;
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 10/10] blk-throttle: avoid dead code in throtl_hierarchy_can_upgrade
       [not found] ` <20221129030147.27400-1-shikemeng-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
                     ` (2 preceding siblings ...)
  2022-11-29  3:01   ` [PATCH v2 05/10] blk-throttle: simpfy low limit reached check in throtl_tg_can_upgrade Kemeng Shi
@ 2022-11-29  3:01   ` Kemeng Shi
       [not found]     ` <20221129030147.27400-11-shikemeng-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
  3 siblings, 1 reply; 19+ messages in thread
From: Kemeng Shi @ 2022-11-29  3:01 UTC (permalink / raw)
  To: tj-DgEjT+Ai2ygdnm+yROfE0A, josef-DigfWCa+lFGyeJad7bwFQA,
	axboe-tSWWG44O7X1aa/9Udqfwiw
  Cc: cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-block-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	shikemeng-hv44wF8Li93QT0dZR+AlfA

Function throtl_hierarchy_can_upgrade will always return from while loop,
so the return outside while loop is never reached. Break the loop when
we traverse to root as throtl_hierarchy_can_downgrade do to avoid dead
code.

Signed-off-by: Kemeng Shi <shikemeng-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
---
 block/blk-throttle.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 4c80f2aa1e29..9946524de158 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -1854,7 +1854,7 @@ static bool throtl_hierarchy_can_upgrade(struct throtl_grp *tg)
 			return true;
 		tg = sq_to_tg(tg->service_queue.parent_sq);
 		if (!tg || !tg_to_blkg(tg)->parent)
-			return false;
+			break;
 	}
 	return false;
 }
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 01/10] blk-throttle: correct stale comment in throtl_pd_init
  2022-11-29  3:01 ` [PATCH v2 01/10] blk-throttle: correct stale comment in throtl_pd_init Kemeng Shi
@ 2022-11-30 21:08   ` Tejun Heo
  0 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2022-11-30 21:08 UTC (permalink / raw)
  To: Kemeng Shi; +Cc: josef, axboe, cgroups, linux-block, linux-kernel

On Tue, Nov 29, 2022 at 11:01:38AM +0800, Kemeng Shi wrote:
> On the default hierarchy (cgroup2), the throttle interface files don't
> exist in the root cgroup, so the ablity to limit the whole system
> by configuring root group is not existing anymore. In general, cgroup
> doesn't wanna be in the business of restricting resources at the
> system level, so correct the stale comment that we can limit whole
> system to we can only limit subtree.
> 
> Signed-off-by: Kemeng Shi <shikemeng@huawei.com>

Acked-by: Tejun Heo <tj@kernel.org>

> +	 * read_bps limit is set on a "parent" group, summary bps of
> +	 * "parent" group and its subtree groups can't exceed 16M for the

but why the quotes around parent?

-- 
tejun

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 02/10] blk-throttle: Fix that bps of child could exceed bps limited in parent
       [not found]     ` <20221129030147.27400-3-shikemeng-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
@ 2022-11-30 21:09       ` Tejun Heo
  0 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2022-11-30 21:09 UTC (permalink / raw)
  To: Kemeng Shi
  Cc: josef-DigfWCa+lFGyeJad7bwFQA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-block-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Tue, Nov 29, 2022 at 11:01:39AM +0800, Kemeng Shi wrote:
> Consider situation as following (on the default hierarchy):
>  HDD
>   |
> root (bps limit: 4k)
>   |
> child (bps limit :8k)
>   |
> fio bs=8k
> Rate of fio is supposed to be 4k, but result is 8k. Reason is as
> following:
> Size of single IO from fio is larger than bytes allowed in one
> throtl_slice in child, so IOs are always queued in child group first.
> When queued IOs in child are dispatched to parent group, BIO_BPS_THROTTLED
> is set and these IOs will not be limited by tg_within_bps_limit anymore.
> Fix this by only set BIO_BPS_THROTTLED when the bio traversed the entire
> tree.
> 
> There patch has no influence on situation which is not on the default
> hierarchy as each group is a single root group without parent.
> 
> Signed-off-by: Kemeng Shi <shikemeng-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>

Acked-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>

-- 
tejun

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 04/10] blk-throttle: correct calculation of wait time in tg_may_dispatch
  2022-11-29  3:01 ` [PATCH v2 04/10] blk-throttle: correct calculation of wait time in tg_may_dispatch Kemeng Shi
@ 2022-11-30 21:27   ` Tejun Heo
  0 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2022-11-30 21:27 UTC (permalink / raw)
  To: Kemeng Shi; +Cc: josef, axboe, cgroups, linux-block, linux-kernel

On Tue, Nov 29, 2022 at 11:01:41AM +0800, Kemeng Shi wrote:
> In C language, When executing "if (expression1 && expression2)" and
> expression1 return false, the expression2 may not be executed.
> For "tg_within_bps_limit(tg, bio, bps_limit, &bps_wait) &&
> tg_within_iops_limit(tg, bio, iops_limit, &iops_wait))", if bps is
> limited, tg_within_bps_limit will return false and
> tg_within_iops_limit will not be called. So even bps and iops are
> both limited, iops_wait will not be calculated and is always zero.
> So wait time of iops is always ignored.
> 
> Fix this by always calling tg_within_bps_limit and tg_within_iops_limit
> to get wait time for both bps and iops.
> 
> Observed that:
> 1. Wait time in tg_within_iops_limit/tg_within_bps_limit need always
> be stored as wait argument is always passed.
> 2. wait time is stored to zero if iops/bps is limited otherwise non-zero
> is stored.
> Simpfy tg_within_iops_limit/tg_within_bps_limit by removing wait argument
> and return wait time directly. Caller tg_may_dispatch checks if wait time
> is zero to find if iops/bps is limited.
> 
> Signed-off-by: Kemeng Shi <shikemeng@huawei.com>

Acked-by: Tejun Heo <tj@kernel.org>

-- 
tejun

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 08/10] blk-throttle: remove repeat check of elapsed time from last upgrade in throtl_hierarchy_can_downgrade
       [not found]   ` <20221129030147.27400-9-shikemeng-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
@ 2022-11-30 21:34     ` Tejun Heo
  0 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2022-11-30 21:34 UTC (permalink / raw)
  To: Kemeng Shi
  Cc: josef-DigfWCa+lFGyeJad7bwFQA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-block-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Tue, Nov 29, 2022 at 11:01:45AM +0800, Kemeng Shi wrote:
> There is no need to check elapsed time from last upgrade for each node in
> hierarchy. Move this check before traversing as throtl_can_upgrade do
> to remove repeat check.
> 
> Signed-off-by: Kemeng Shi <shikemeng-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> Reported-by: kernel test robot <lkp-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Acked-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>

-- 
tejun

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 09/10] blk-throttle: Use more siutable time_after check for update of slice_start
  2022-11-29  3:01 ` [PATCH v2 09/10] blk-throttle: Use more siutable time_after check for update of slice_start Kemeng Shi
@ 2022-11-30 21:34   ` Tejun Heo
  0 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2022-11-30 21:34 UTC (permalink / raw)
  To: Kemeng Shi; +Cc: josef, axboe, cgroups, linux-block, linux-kernel

On Tue, Nov 29, 2022 at 11:01:46AM +0800, Kemeng Shi wrote:
> There is no need to update tg->slice_start[rw] to start when they are
> equal already. So remove "eq" part of check before update slice_start.
> 
> Signed-off-by: Kemeng Shi <shikemeng@huawei.com>

Acked-by: Tejun Heo <tj@kernel.org>

-- 
tejun

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 10/10] blk-throttle: avoid dead code in throtl_hierarchy_can_upgrade
       [not found]     ` <20221129030147.27400-11-shikemeng-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
@ 2022-11-30 21:36       ` Tejun Heo
       [not found]         ` <Y4fMyEo0dxPl/Kt1-NiLfg/pYEd1N0TnZuCh8vA@public.gmane.org>
  0 siblings, 1 reply; 19+ messages in thread
From: Tejun Heo @ 2022-11-30 21:36 UTC (permalink / raw)
  To: Kemeng Shi
  Cc: josef-DigfWCa+lFGyeJad7bwFQA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-block-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Tue, Nov 29, 2022 at 11:01:47AM +0800, Kemeng Shi wrote:
> Function throtl_hierarchy_can_upgrade will always return from while loop,
> so the return outside while loop is never reached. Break the loop when
> we traverse to root as throtl_hierarchy_can_downgrade do to avoid dead
> code.

I don't know why this is an improvement.

-- 
tejun

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 05/10] blk-throttle: simpfy low limit reached check in throtl_tg_can_upgrade
       [not found]     ` <20221129030147.27400-6-shikemeng-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
@ 2022-11-30 22:09       ` Tejun Heo
  0 siblings, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2022-11-30 22:09 UTC (permalink / raw)
  To: Kemeng Shi
  Cc: josef-DigfWCa+lFGyeJad7bwFQA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-block-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Tue, Nov 29, 2022 at 11:01:42AM +0800, Kemeng Shi wrote:
> Commit c79892c557616 ("blk-throttle: add upgrade logic for LIMIT_LOW
> state") added upgrade logic for low limit and methioned that
> 1. "To determine if a cgroup exceeds its limitation, we check if the cgroup
> has pending request. Since cgroup is throttled according to the limit,
> pending request means the cgroup reaches the limit."
> 2. "If a cgroup has limit set for both read and write, we consider the
> combination of them for upgrade. The reason is read IO and write IO can
> interfere with each other. If we do the upgrade based in one direction IO,
> the other direction IO could be severly harmed."
> Besides, we also determine that cgroup reaches low limit if low limit is 0,
> see comment in throtl_tg_can_upgrade.
> 
> Collect the information above, the desgin of upgrade check is as following:
> 1.The low limit is reached if limit is zero or io is already queued.
> 2.Cgroup will pass upgrade check if low limits of READ and WRITE are both
> reached.
> 
> Simpfy the check code described above to removce repeat check and improve
> readability. There is no functional change.
> 
> Detail equivalence proof is as following:
> All replaced conditions to return true are as following:
> condition 1
> (!read_limit && !write_limit)
> condition 2
> read_limit && sq->nr_queued[READ] && (!write_limit || sq->nr_queued[WRITE])
> condition 3
> write_limit && sq->nr_queued[WRITE] && (!read_limit || sq->nr_queued[READ])
> 
> Transferring condition 2 as following:
> read_limit && sq->nr_queued[READ] && (!write_limit || sq->nr_queued[WRITE])
> is equivalent to
> (read_limit && sq->nr_queued[READ]) &&
> (!write_limit || (write_limit && sq->nr_queued[WRITE]))
> is equivalent to
> condition 2.1
> (read_limit && sq->nr_queued[READ] && !write_limit) ||
> condition 2.2
> (read_limit && sq->nr_queued[READ] &&
> (write_limit && sq->nr_queued[WRITE]))
> 
> Transferring condition 3 as following:
> write_limit && sq->nr_queued[WRITE] && (!read_limit || sq->nr_queued[READ])
> is equivalent to
> (write_limit && sq->nr_queued[WRITE]) &&
> (!read_limit || (read_limit && sq->nr_queued[READ]))
> is equivalent to
> condition 3.1
> ((write_limit && sq->nr_queued[WRITE]) && !read_limit) ||
> condition 3.2
> ((write_limit && sq->nr_queued[WRITE]) &&
> (read_limit && sq->nr_queued[READ]))
> 
> Condition 3.2 is the same as condition 2.2, so all conditions we get to
> return are as following:
> (!read_limit && !write_limit) (1)
> (!read_limit && (write_limit && sq->nr_queued[WRITE])) (3.1)
> ((read_limit && sq->nr_queued[READ]) && !write_limit) (2.1)
> ((write_limit && sq->nr_queued[WRITE]) &&
> (read_limit && sq->nr_queued[READ])) (2.2)
> 
> As we can extract conditions "(a1 || a2) && (b1 || b2)" to:
> a1 && b1
> a1 && b2
> a2 && b1
> ab && b2
> 
> Considering that:
> a1 = !read_limit
> a2 = read_limit && sq->nr_queued[READ]
> b1 = !write_limit
> b2 = write_limit && sq->nr_queued[WRITE]
> 
> We can pack replaced conditions to
> (!read_limint || (read_limit && sq->nr_queued[READ])) &&
> (!write_limit || (write_limit && sq->nr_queued[WRITE])
> which is equivalent to
> (!read_limint || sq->nr_queued[READ]) &&
             ^
             typo
> (!write_limit || sq->nr_queued[WRITE])

Can you indent the whole thing a bit so that it's more readable?

> -static bool throtl_tg_can_upgrade(struct throtl_grp *tg)
> +static bool throtl_tg_reach_low_limit(struct throtl_grp *tg, int rw)
>  {
>  	struct throtl_service_queue *sq = &tg->service_queue;
> -	bool read_limit, write_limit;
> +	bool limit = tg->bps[rw][LIMIT_LOW] || tg->iops[rw][LIMIT_LOW];
>  
>  	/*
> -	 * if cgroup reaches low limit (if low limit is 0, the cgroup always
> -	 * reaches), it's ok to upgrade to next limit
> +	 * if low limit is zero, low limit is always reached.
> +	 * if low limit is non-zero, we can check if there is any request
> +	 * is queued to determine if low limit is reached as we throttle
> +	 * request according to limit.
>  	 */
> -	read_limit = tg->bps[READ][LIMIT_LOW] || tg->iops[READ][LIMIT_LOW];
> -	write_limit = tg->bps[WRITE][LIMIT_LOW] || tg->iops[WRITE][LIMIT_LOW];
> -	if (!read_limit && !write_limit)
> -		return true;
> -	if (read_limit && sq->nr_queued[READ] &&
> -	    (!write_limit || sq->nr_queued[WRITE]))
> -		return true;
> -	if (write_limit && sq->nr_queued[WRITE] &&
> -	    (!read_limit || sq->nr_queued[READ]))
> +	return !limit || sq->nr_queued[rw];
> +}
> +
> +static bool throtl_tg_can_upgrade(struct throtl_grp *tg)
> +{
> +	/*
> +	 * cgroup reaches low limit when low limit of READ and WRITE are
> +	 * both reached, it's ok to upgrade to next limit if cgroup reaches
> +	 * low limit
> +	 */
> +	if (throtl_tg_reach_low_limit(tg, READ) &&
> +	    throtl_tg_reach_low_limit(tg, WRITE))

Can you please name it throtl_low_limit_reached()? Other than that,

Acked-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 10/10] blk-throttle: avoid dead code in throtl_hierarchy_can_upgrade
       [not found]         ` <Y4fMyEo0dxPl/Kt1-NiLfg/pYEd1N0TnZuCh8vA@public.gmane.org>
@ 2022-12-01  1:36           ` Kemeng Shi
  0 siblings, 0 replies; 19+ messages in thread
From: Kemeng Shi @ 2022-12-01  1:36 UTC (permalink / raw)
  To: Tejun Heo, Kemeng Shi
  Cc: josef-DigfWCa+lFGyeJad7bwFQA, axboe-tSWWG44O7X1aa/9Udqfwiw,
	cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-block-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA


Hi, Tejun
on 12/1/2022 5:36 AM, Tejun Heo wrote:
> On Tue, Nov 29, 2022 at 11:01:47AM +0800, Kemeng Shi wrote:
>> Function throtl_hierarchy_can_upgrade will always return from while loop,
>> so the return outside while loop is never reached. Break the loop when
>> we traverse to root as throtl_hierarchy_can_downgrade do to avoid dead
>> code.
> 
> I don't know why this is an improvement.
> 
I just found that the "return false" outside of the while loop is never
executed which may be not reached by code coverage test. Of course,
we can simply remove this "return false", but I found a similar function
throtl_hierarchy_can_upgrade which has no such problem. So I avoid
unreachable code as throtl_hierarchy_can_upgrade do.
Is this make sense to you? If not, I will remove this patch in next
version.

Thanks.
-- 
Best wishes
Kemeng Shi


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2022-12-01  1:36 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-11-29  3:01 [PATCH v2 00/10] A few bugfix and cleanup patches for blk-throttle Kemeng Shi
2022-11-29  3:01 ` [PATCH v2 01/10] blk-throttle: correct stale comment in throtl_pd_init Kemeng Shi
2022-11-30 21:08   ` Tejun Heo
2022-11-29  3:01 ` [PATCH v2 04/10] blk-throttle: correct calculation of wait time in tg_may_dispatch Kemeng Shi
2022-11-30 21:27   ` Tejun Heo
     [not found] ` <20221129030147.27400-1-shikemeng-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2022-11-29  3:01   ` [PATCH v2 02/10] blk-throttle: Fix that bps of child could exceed bps limited in parent Kemeng Shi
     [not found]     ` <20221129030147.27400-3-shikemeng-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2022-11-30 21:09       ` Tejun Heo
2022-11-29  3:01   ` [PATCH v2 03/10] blk-throttle: ignore cgroup without io queued in blk_throtl_cancel_bios Kemeng Shi
2022-11-29  3:01   ` [PATCH v2 05/10] blk-throttle: simpfy low limit reached check in throtl_tg_can_upgrade Kemeng Shi
     [not found]     ` <20221129030147.27400-6-shikemeng-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2022-11-30 22:09       ` Tejun Heo
2022-11-29  3:01   ` [PATCH v2 10/10] blk-throttle: avoid dead code in throtl_hierarchy_can_upgrade Kemeng Shi
     [not found]     ` <20221129030147.27400-11-shikemeng-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2022-11-30 21:36       ` Tejun Heo
     [not found]         ` <Y4fMyEo0dxPl/Kt1-NiLfg/pYEd1N0TnZuCh8vA@public.gmane.org>
2022-12-01  1:36           ` Kemeng Shi
2022-11-29  3:01 ` [PATCH v2 06/10] blk-throttle: fix typo in comment of throtl_adjusted_limit Kemeng Shi
2022-11-29  3:01 ` [PATCH v2 07/10] blk-throttle: remove incorrect comment for tg_last_low_overflow_time Kemeng Shi
2022-11-29  3:01 ` [PATCH v2 08/10] blk-throttle: remove repeat check of elapsed time from last upgrade in throtl_hierarchy_can_downgrade Kemeng Shi
     [not found]   ` <20221129030147.27400-9-shikemeng-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2022-11-30 21:34     ` Tejun Heo
2022-11-29  3:01 ` [PATCH v2 09/10] blk-throttle: Use more siutable time_after check for update of slice_start Kemeng Shi
2022-11-30 21:34   ` Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).