public inbox for linux-block@vger.kernel.org
* [PATCH v2] block: flush all throttled bios when deleting the cgroup
@ 2024-06-27 14:26 Li Lingfeng
  2024-06-27 20:43 ` Tejun Heo
  0 siblings, 1 reply; 6+ messages in thread
From: Li Lingfeng @ 2024-06-27 14:26 UTC
  To: tj, josef, hch, axboe, mkoutny
  Cc: cgroups, linux-block, linux-kernel, yangerkun, yukuai1, houtao1,
	yi.zhang, lilingfeng, lilingfeng3

From: Li Lingfeng <lilingfeng3@huawei.com>

When a process migrates to another cgroup and the original cgroup is deleted,
the restrictions of throttled bios cannot be removed. If the restrictions
are set too low, it will take a long time to complete these bios.

Refer to the process of deleting a disk to remove the restrictions and
issue bios when deleting the cgroup.

This changes the behavior of throttled bios:
Before: the limit on the throttled bios can't be changed and the bios
complete under this limit;
Now: the limit is canceled and the throttled bios are flushed
immediately.

References:
https://lore.kernel.org/r/20220318130144.1066064-4-ming.lei@redhat.com

Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
---
  v1->v2:
    Use "flush" instead of "cancel";
    Add a description of the effect on throttled bios.
 block/blk-throttle.c | 68 ++++++++++++++++++++++++++++----------------
 1 file changed, 44 insertions(+), 24 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index c1bf73f8c75d..a0e5b28951ca 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -1534,6 +1534,42 @@ static void throtl_shutdown_wq(struct request_queue *q)
 	cancel_work_sync(&td->dispatch_work);
 }
 
+static void tg_cancel_bios(struct throtl_grp *tg)
+{
+	struct throtl_service_queue *sq = &tg->service_queue;
+
+	if (tg->flags & THROTL_TG_CANCELING)
+		return;
+	/*
+	 * Set the flag to make sure throtl_pending_timer_fn() won't
+	 * stop until all throttled bios are dispatched.
+	 */
+	tg->flags |= THROTL_TG_CANCELING;
+
+	/*
+	 * Do not dispatch cgroup without THROTL_TG_PENDING or cgroup
+	 * will be inserted to service queue without THROTL_TG_PENDING
+	 * set in tg_update_disptime below. Then IO dispatched from
+	 * child in tg_dispatch_one_bio will trigger double insertion
+	 * and corrupt the tree.
+	 */
+	if (!(tg->flags & THROTL_TG_PENDING))
+		return;
+
+	/*
+	 * Update disptime after setting the above flag to make sure
+	 * throtl_select_dispatch() won't exit without dispatching.
+	 */
+	tg_update_disptime(tg);
+
+	throtl_schedule_pending_timer(sq, jiffies + 1);
+}
+
+static void throtl_pd_offline(struct blkg_policy_data *pd)
+{
+	tg_cancel_bios(pd_to_tg(pd));
+}
+
 struct blkcg_policy blkcg_policy_throtl = {
 	.dfl_cftypes		= throtl_files,
 	.legacy_cftypes		= throtl_legacy_files,
@@ -1541,6 +1577,7 @@ struct blkcg_policy blkcg_policy_throtl = {
 	.pd_alloc_fn		= throtl_pd_alloc,
 	.pd_init_fn		= throtl_pd_init,
 	.pd_online_fn		= throtl_pd_online,
+	.pd_offline_fn		= throtl_pd_offline,
 	.pd_free_fn		= throtl_pd_free,
 };
 
@@ -1561,32 +1598,15 @@ void blk_throtl_cancel_bios(struct gendisk *disk)
 	 */
 	rcu_read_lock();
 	blkg_for_each_descendant_post(blkg, pos_css, q->root_blkg) {
-		struct throtl_grp *tg = blkg_to_tg(blkg);
-		struct throtl_service_queue *sq = &tg->service_queue;
-
-		/*
-		 * Set the flag to make sure throtl_pending_timer_fn() won't
-		 * stop until all throttled bios are dispatched.
-		 */
-		tg->flags |= THROTL_TG_CANCELING;
-
 		/*
-		 * Do not dispatch cgroup without THROTL_TG_PENDING or cgroup
-		 * will be inserted to service queue without THROTL_TG_PENDING
-		 * set in tg_update_disptime below. Then IO dispatched from
-		 * child in tg_dispatch_one_bio will trigger double insertion
-		 * and corrupt the tree.
+		 * disk_release will call pd_offline_fn to cancel bios.
+		 * However, disk_release can't be called if someone get
+		 * the refcount of device and issued bios which are
+		 * inflight after del_gendisk.
+		 * Cancel bios here to ensure no bios are inflight after
+		 * del_gendisk.
 		 */
-		if (!(tg->flags & THROTL_TG_PENDING))
-			continue;
-
-		/*
-		 * Update disptime after setting the above flag to make sure
-		 * throtl_select_dispatch() won't exit without dispatching.
-		 */
-		tg_update_disptime(tg);
-
-		throtl_schedule_pending_timer(sq, jiffies + 1);
+		tg_cancel_bios(blkg_to_tg(blkg));
 	}
 	rcu_read_unlock();
 	spin_unlock_irq(&q->queue_lock);
-- 
2.39.2
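
For readers following the refactor, the control flow that tg_cancel_bios()
factors out can be modeled in user space. This is only a sketch under stated
assumptions: model_tg, the MODEL_TG_* flag values and the timer_kicks counter
are hypothetical stand-ins, not kernel definitions, and the real function
additionally calls tg_update_disptime() before scheduling the pending timer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values, for this model only. */
enum {
	MODEL_TG_PENDING   = 1 << 0,	/* group sits on the parent's service tree */
	MODEL_TG_CANCELING = 1 << 1,	/* limits are being ignored for this group */
};

struct model_tg {
	unsigned int flags;
	unsigned int timer_kicks;	/* times the pending timer was armed */
};

/* Returns true when this call scheduled a dispatch. */
bool model_tg_cancel_bios(struct model_tg *tg)
{
	assert(tg);

	/* A repeated call (disk deletion, then cgroup offline) is a no-op. */
	if (tg->flags & MODEL_TG_CANCELING)
		return false;

	tg->flags |= MODEL_TG_CANCELING;

	/* Nothing queued: mark the group but leave the service tree alone. */
	if (!(tg->flags & MODEL_TG_PENDING))
		return false;

	tg->timer_kicks++;
	return true;
}
```

In this model, calling the function twice on a group with MODEL_TG_PENDING set
arms the timer once, and calling it on an idle group only sets the canceling
flag.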



* Re: [PATCH v2] block: flush all throttled bios when deleting the cgroup
  2024-06-27 14:26 [PATCH v2] block: flush all throttled bios when deleting the cgroup Li Lingfeng
@ 2024-06-27 20:43 ` Tejun Heo
  2024-06-28  2:04   ` Li Lingfeng
  0 siblings, 1 reply; 6+ messages in thread
From: Tejun Heo @ 2024-06-27 20:43 UTC
  To: Li Lingfeng
  Cc: josef, hch, axboe, mkoutny, cgroups, linux-block, linux-kernel,
	yangerkun, yukuai1, houtao1, yi.zhang, lilingfeng3

Hello, Li.

On Thu, Jun 27, 2024 at 10:26:06PM +0800, Li Lingfeng wrote:
> From: Li Lingfeng <lilingfeng3@huawei.com>
> 
> When a process migrates to another cgroup and the original cgroup is deleted,
> the restrictions of throttled bios cannot be removed. If the restrictions
> are set too low, it will take a long time to complete these bios.
> 
> Refer to the process of deleting a disk to remove the restrictions and
> issue bios when deleting the cgroup.
> 
> This makes difference on the behavior of throttled bios:
> Before: the limit of the throttled bios can't be changed and the bios will
> complete under this limit;
> Now: the limit will be canceled and the throttled bios will be flushed
> immediately.

I'm not necessarily against this but the description doesn't explain why
this is better either. Can you please detail why this behavior is better?

Thanks.

-- 
tejun


* Re: [PATCH v2] block: flush all throttled bios when deleting the cgroup
  2024-06-27 20:43 ` Tejun Heo
@ 2024-06-28  2:04   ` Li Lingfeng
  2024-06-28  3:32     ` Yu Kuai
  2024-07-02 14:25     ` Michal Koutný
  0 siblings, 2 replies; 6+ messages in thread
From: Li Lingfeng @ 2024-06-28  2:04 UTC
  To: Tejun Heo
  Cc: josef, hch, axboe, mkoutny, cgroups, linux-block, linux-kernel,
	yangerkun, yukuai1, houtao1, yi.zhang, lilingfeng3


On 2024/6/28 4:43, Tejun Heo wrote:
> Hello, Li.
>
> On Thu, Jun 27, 2024 at 10:26:06PM +0800, Li Lingfeng wrote:
>> From: Li Lingfeng <lilingfeng3@huawei.com>
>>
>> When a process migrates to another cgroup and the original cgroup is deleted,
>> the restrictions of throttled bios cannot be removed. If the restrictions
>> are set too low, it will take a long time to complete these bios.
>>
>> Refer to the process of deleting a disk to remove the restrictions and
>> issue bios when deleting the cgroup.
>>
>> This makes difference on the behavior of throttled bios:
>> Before: the limit of the throttled bios can't be changed and the bios will
>> complete under this limit;
>> Now: the limit will be canceled and the throttled bios will be flushed
>> immediately.
> I'm not necessarily against this but the description doesn't explain why
> this is better either. Can you please detail why this behavior is better?

I think it is more appropriate to remove the limit on the bios once the
cgroup is deleted, rather than let them continue to be throttled by a
cgroup that no longer exists.

If the limit is set too low and the original cgroup has been deleted, there
is currently no way to make the bios complete immediately; we can only wait
for them to complete slowly under the limit.

Thanks.

>
> Thanks.
>



* Re: [PATCH v2] block: flush all throttled bios when deleting the cgroup
  2024-06-28  2:04   ` Li Lingfeng
@ 2024-06-28  3:32     ` Yu Kuai
  2024-07-02 14:25     ` Michal Koutný
  1 sibling, 0 replies; 6+ messages in thread
From: Yu Kuai @ 2024-06-28  3:32 UTC
  To: Li Lingfeng, Tejun Heo
  Cc: josef, hch, axboe, mkoutny, cgroups, linux-block, linux-kernel,
	yangerkun, yukuai1, houtao1, yi.zhang, lilingfeng3, yukuai (C)

Hi,

On 2024/06/28 10:04, Li Lingfeng wrote:
> 
> On 2024/6/28 4:43, Tejun Heo wrote:
>> Hello, Li.
>>
>> On Thu, Jun 27, 2024 at 10:26:06PM +0800, Li Lingfeng wrote:
>>> From: Li Lingfeng <lilingfeng3@huawei.com>
>>>
>>> When a process migrates to another cgroup and the original cgroup is 
>>> deleted,
>>> the restrictions of throttled bios cannot be removed. If the 
>>> restrictions
>>> are set too low, it will take a long time to complete these bios.
>>>
>>> Refer to the process of deleting a disk to remove the restrictions and
>>> issue bios when deleting the cgroup.
>>>
>>> This makes difference on the behavior of throttled bios:
>>> Before: the limit of the throttled bios can't be changed and the bios 
>>> will
>>> complete under this limit;
>>> Now: the limit will be canceled and the throttled bios will be flushed
>>> immediately.
>> I'm not necessarily against this but the description doesn't explain why
>> this is better either. Can you please detail why this behavior is better?
> I think it may be more appropriate to remove the limit of bios after the
> cgroup is deleted, rather than let the bios continue to be throttled by a
> non-existent cgroup.

The background is that our test found this by:

1) setting a low limit in one cgroup;
2) binding a task to the cgroup and issuing lots of IO;
3) migrating the task to the root cgroup;
4) deleting the cgroup.

And oops, unless the disk is deleted, IO will hang for a long time and
there is no way to recover.

The good thing is that after flushing throttled bios while deleting the
cgroup, this "IO hang" can be avoided. However, I'm not sure about this
change, because users may still want the bios to be throttled. Anyway,
I don't think this will be a problem in real life.
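
To put a rough number on "hang for a long time", here is a back-of-the-envelope
model. It assumes only a bytes-per-second limit applies and ignores iops limits
and slice accounting; the function name and the figures are hypothetical and
chosen purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Seconds needed to drain queued_bytes at bps_limit bytes per second,
 * rounded up. Simplified model, not the blk-throttle algorithm.
 */
uint64_t model_drain_seconds(uint64_t queued_bytes, uint64_t bps_limit)
{
	assert(bps_limit > 0);
	return queued_bytes / bps_limit + (queued_bytes % bps_limit != 0);
}
```

For example, model_drain_seconds(64ULL << 20, 1024) gives 65536 seconds, i.e.
roughly 18 hours to drain 64 MiB of queued IO at 1 KiB/s.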

Thanks,
Kuai

> 
> If the limit is set too low, and the original cgroup has been deleted, we
> now have no way to make the bios complete immediately, but to wait for the
> bios to slowly complete under the limit.
> 
> Thanks.
> 
>>
>> Thanks.
>>
> 
> 
> .
> 



* Re: [PATCH v2] block: flush all throttled bios when deleting the cgroup
  2024-06-28  2:04   ` Li Lingfeng
  2024-06-28  3:32     ` Yu Kuai
@ 2024-07-02 14:25     ` Michal Koutný
  2024-07-06  7:21       ` Li Lingfeng
  1 sibling, 1 reply; 6+ messages in thread
From: Michal Koutný @ 2024-07-02 14:25 UTC
  To: Li Lingfeng
  Cc: Tejun Heo, josef, hch, axboe, cgroups, linux-block, linux-kernel,
	yangerkun, yukuai1, houtao1, yi.zhang, lilingfeng3


On Fri, Jun 28, 2024 at 10:04:20AM GMT, Li Lingfeng <lilingfeng@huaweicloud.com> wrote:
> I think it may be more appropriate to remove the limit of bios after the
> cgroup is deleted, rather than let the bios continue to be throttled by a
> non-existent cgroup.

I'm not that familiar with this part -- can this also happen for IOs
submitted by an exited task? (In contrast to a running task migrated
elsewhere.)

> If the limit is set too low, and the original cgroup has been deleted, we
> now have no way to make the bios complete immediately, but to wait for the
> bios to slowly complete under the limit.

It makes some sense; it's not unlike reparenting of memcg objects. IIRC
flushed bios would actually be passed to a parent throtl_grp, right?

Thanks,
Michal



* Re: [PATCH v2] block: flush all throttled bios when deleting the cgroup
  2024-07-02 14:25     ` Michal Koutný
@ 2024-07-06  7:21       ` Li Lingfeng
  0 siblings, 0 replies; 6+ messages in thread
From: Li Lingfeng @ 2024-07-06  7:21 UTC
  To: Michal Koutný
  Cc: Tejun Heo, josef, hch, axboe, cgroups, linux-block, linux-kernel,
	yangerkun, yukuai1, houtao1, yi.zhang, lilingfeng3


On 2024/7/2 22:25, Michal Koutný wrote:
> On Fri, Jun 28, 2024 at 10:04:20AM GMT, Li Lingfeng <lilingfeng@huaweicloud.com> wrote:
>> I think it may be more appropriate to remove the limit of bios after the
>> cgroup is deleted, rather than let the bios continue to be throttled by a
>> non-existent cgroup.
> I'm not that familiar with this part -- can this also happen for IOs
> submitted by an exited task? (In contrast to a running task migrated
> elsewhere.)
Yes, IOs will be throttled regardless of whether the task that submitted
them has exited.
>> If the limit is set too low, and the original cgroup has been deleted, we
>> now have no way to make the bios complete immediately, but to wait for the
>> bios to slowly complete under the limit.
> It makes some sense, it's not unlike reparenting of memcg objects, IIRC
> flushed bios would actually be passed to a parent throtl_grp, right?
Yes, flushed bios would be throttled by the parent throtl_grp.
> Thanks,
> Michal
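
The reparenting point above (flushed bios are handed to the parent throtl_grp
and throttled there) can be sketched as a toy user-space model. Everything here
is an assumption for illustration: model_grp and model_flush_to_parent() are
hypothetical names, and the model moves bare counters rather than real bios.

```c
#include <assert.h>
#include <stddef.h>

struct model_grp {
	struct model_grp *parent;	/* NULL for the root group */
	unsigned int queued;		/* bios waiting in this group */
	unsigned int issued;		/* bios sent to the device (root only) */
};

/*
 * Flush every bio queued in grp one level up, ignoring grp's own limit.
 * Bios leaving the root are issued to the device. Returns the number moved.
 */
unsigned int model_flush_to_parent(struct model_grp *grp)
{
	unsigned int moved;

	assert(grp);
	moved = grp->queued;
	grp->queued = 0;

	if (grp->parent)
		grp->parent->queued += moved;	/* now subject to the parent's limit */
	else
		grp->issued += moved;

	return moved;
}
```

With a child holding five queued bios under a root group, flushing the child
leaves the five bios queued on the root, where the root's own limit (if any)
still applies before they are issued.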


