[PATCH] block: fix q->blkg_list corruption during disk rebind

public inbox for linux-block@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH] block: fix q->blkg_list corruption during disk rebind
@ 2024-04-07 12:59 Ming Lei
  2024-04-07 13:40 ` Yu Kuai
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Ming Lei @ 2024-04-07 12:59 UTC (permalink / raw)
  To: Jens Axboe, linux-block; +Cc: Ming Lei, Yu Kuai, Tejun Heo

Multiple gendisk instances can allocated/added for single request queue
in case of disk rebind. blkg may still stay in q->blkg_list when calling
blkcg_init_disk() for rebind, then q->blkg_list becomes corrupted.

Fix the list corruption issue by:

- add blkg_init_queue() to initialize q->blkg_list & q->blkcg_mutex only
- move calling blkg_init_queue() into blk_alloc_queue()

The list corruption should be started since commit f1c006f1c685 ("blk-cgroup:
synchronize pd_free_fn() from blkg_free_workfn() and blkcg_deactivate_policy()")
which delays removing blkg from q->blkg_list into blkg_free_workfn().

Fixes: f1c006f1c685 ("blk-cgroup: synchronize pd_free_fn() from blkg_free_workfn() and blkcg_deactivate_policy()")
Fixes: 1059699f87eb ("block: move blkcg initialization/destroy into disk allocation/release handler")
Cc: Yu Kuai <yukuai3@huawei.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
blktests:
	https://lore.kernel.org/linux-block/20240407125717.4052964-1-ming.lei@redhat.com/

 block/blk-cgroup.c | 9 ++++++---
 block/blk-cgroup.h | 2 ++
 block/blk-core.c   | 2 ++
 3 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index bdbb557feb5a..059467086b13 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1409,6 +1409,12 @@ static int blkcg_css_online(struct cgroup_subsys_state *css)
 	return 0;
 }
 
+void blkg_init_queue(struct request_queue *q)
+{
+	INIT_LIST_HEAD(&q->blkg_list);
+	mutex_init(&q->blkcg_mutex);
+}
+
 int blkcg_init_disk(struct gendisk *disk)
 {
 	struct request_queue *q = disk->queue;
@@ -1416,9 +1422,6 @@ int blkcg_init_disk(struct gendisk *disk)
 	bool preloaded;
 	int ret;
 
-	INIT_LIST_HEAD(&q->blkg_list);
-	mutex_init(&q->blkcg_mutex);
-
 	new_blkg = blkg_alloc(&blkcg_root, disk, GFP_KERNEL);
 	if (!new_blkg)
 		return -ENOMEM;
diff --git a/block/blk-cgroup.h b/block/blk-cgroup.h
index 78b74106bf10..90b3959d88cf 100644
--- a/block/blk-cgroup.h
+++ b/block/blk-cgroup.h
@@ -189,6 +189,7 @@ struct blkcg_policy {
 extern struct blkcg blkcg_root;
 extern bool blkcg_debug_stats;
 
+void blkg_init_queue(struct request_queue *q);
 int blkcg_init_disk(struct gendisk *disk);
 void blkcg_exit_disk(struct gendisk *disk);
 
@@ -482,6 +483,7 @@ struct blkcg {
 };
 
 static inline struct blkcg_gq *blkg_lookup(struct blkcg *blkcg, void *key) { return NULL; }
+static inline void blkg_init_queue(struct request_queue *q) { }
 static inline int blkcg_init_disk(struct gendisk *disk) { return 0; }
 static inline void blkcg_exit_disk(struct gendisk *disk) { }
 static inline int blkcg_policy_register(struct blkcg_policy *pol) { return 0; }
diff --git a/block/blk-core.c b/block/blk-core.c
index a16b5abdbbf5..3a6f5603fb44 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -442,6 +442,8 @@ struct request_queue *blk_alloc_queue(struct queue_limits *lim, int node_id)
 	init_waitqueue_head(&q->mq_freeze_wq);
 	mutex_init(&q->mq_freeze_lock);
 
+	blkg_init_queue(q);
+
 	/*
 	 * Init percpu_ref in atomic mode so that it's faster to shutdown.
 	 * See blk_register_queue() for details.
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] block: fix q->blkg_list corruption during disk rebind
  2024-04-07 12:59 [PATCH] block: fix q->blkg_list corruption during disk rebind Ming Lei
@ 2024-04-07 13:40 ` Yu Kuai
  2024-04-08  3:05   ` Ming Lei
  2024-04-07 21:50 ` Jens Axboe
  2024-04-08  5:58 ` Christoph Hellwig
  2 siblings, 1 reply; 6+ messages in thread
From: Yu Kuai @ 2024-04-07 13:40 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe, linux-block; +Cc: Tejun Heo, yukuai (C)

Hi,

在 2024/04/07 20:59, Ming Lei 写道:
> Multiple gendisk instances can allocated/added for single request queue
> in case of disk rebind. blkg may still stay in q->blkg_list when calling
> blkcg_init_disk() for rebind, then q->blkg_list becomes corrupted.
> 
> Fix the list corruption issue by:
> 
> - add blkg_init_queue() to initialize q->blkg_list & q->blkcg_mutex only
> - move calling blkg_init_queue() into blk_alloc_queue()
> 
> The list corruption should be started since commit f1c006f1c685 ("blk-cgroup:
> synchronize pd_free_fn() from blkg_free_workfn() and blkcg_deactivate_policy()")
> which delays removing blkg from q->blkg_list into blkg_free_workfn().

I'm not familiar with how bind/unbind works yet, however, the patch
itself looks reasonable to me, the initialization of fields related to
queue should not be delayed to disk allocation.

Reviewed-by: Yu Kuai <yukuai3@huawei.com>

BTW, it looks like the whole blkcg_init_disk() can go away:
  - init of ioprio and blk-throttle can be delayed to the first user
configuration;
  - root_blkg allocation doesn't rely on disk at all;

Or is there any plan to move the blkcg related field or code path to
gendisk instead of queue? I might missing some previous discussions.

Thanks,
Kuai

> 
> Fixes: f1c006f1c685 ("blk-cgroup: synchronize pd_free_fn() from blkg_free_workfn() and blkcg_deactivate_policy()")
> Fixes: 1059699f87eb ("block: move blkcg initialization/destroy into disk allocation/release handler")
> Cc: Yu Kuai <yukuai3@huawei.com>
> Cc: Tejun Heo <tj@kernel.org>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
> blktests:
> 	https://lore.kernel.org/linux-block/20240407125717.4052964-1-ming.lei@redhat.com/
> 
>   block/blk-cgroup.c | 9 ++++++---
>   block/blk-cgroup.h | 2 ++
>   block/blk-core.c   | 2 ++
>   3 files changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
> index bdbb557feb5a..059467086b13 100644
> --- a/block/blk-cgroup.c
> +++ b/block/blk-cgroup.c
> @@ -1409,6 +1409,12 @@ static int blkcg_css_online(struct cgroup_subsys_state *css)
>   	return 0;
>   }
>   
> +void blkg_init_queue(struct request_queue *q)
> +{
> +	INIT_LIST_HEAD(&q->blkg_list);
> +	mutex_init(&q->blkcg_mutex);
> +}
> +
>   int blkcg_init_disk(struct gendisk *disk)
>   {
>   	struct request_queue *q = disk->queue;
> @@ -1416,9 +1422,6 @@ int blkcg_init_disk(struct gendisk *disk)
>   	bool preloaded;
>   	int ret;
>   
> -	INIT_LIST_HEAD(&q->blkg_list);
> -	mutex_init(&q->blkcg_mutex);
> -
>   	new_blkg = blkg_alloc(&blkcg_root, disk, GFP_KERNEL);
>   	if (!new_blkg)
>   		return -ENOMEM;
> diff --git a/block/blk-cgroup.h b/block/blk-cgroup.h
> index 78b74106bf10..90b3959d88cf 100644
> --- a/block/blk-cgroup.h
> +++ b/block/blk-cgroup.h
> @@ -189,6 +189,7 @@ struct blkcg_policy {
>   extern struct blkcg blkcg_root;
>   extern bool blkcg_debug_stats;
>   
> +void blkg_init_queue(struct request_queue *q);
>   int blkcg_init_disk(struct gendisk *disk);
>   void blkcg_exit_disk(struct gendisk *disk);
>   
> @@ -482,6 +483,7 @@ struct blkcg {
>   };
>   
>   static inline struct blkcg_gq *blkg_lookup(struct blkcg *blkcg, void *key) { return NULL; }
> +static inline void blkg_init_queue(struct request_queue *q) { }
>   static inline int blkcg_init_disk(struct gendisk *disk) { return 0; }
>   static inline void blkcg_exit_disk(struct gendisk *disk) { }
>   static inline int blkcg_policy_register(struct blkcg_policy *pol) { return 0; }
> diff --git a/block/blk-core.c b/block/blk-core.c
> index a16b5abdbbf5..3a6f5603fb44 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -442,6 +442,8 @@ struct request_queue *blk_alloc_queue(struct queue_limits *lim, int node_id)
>   	init_waitqueue_head(&q->mq_freeze_wq);
>   	mutex_init(&q->mq_freeze_lock);
>   
> +	blkg_init_queue(q);
> +
>   	/*
>   	 * Init percpu_ref in atomic mode so that it's faster to shutdown.
>   	 * See blk_register_queue() for details.
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] block: fix q->blkg_list corruption during disk rebind
  2024-04-07 13:40 ` Yu Kuai
@ 2024-04-08  3:05   ` Ming Lei
  0 siblings, 0 replies; 6+ messages in thread
From: Ming Lei @ 2024-04-08  3:05 UTC (permalink / raw)
  To: Yu Kuai; +Cc: Jens Axboe, linux-block, Tejun Heo, yukuai (C)

On Sun, Apr 07, 2024 at 09:40:52PM +0800, Yu Kuai wrote:
> Hi,
> 
> 在 2024/04/07 20:59, Ming Lei 写道:
> > Multiple gendisk instances can allocated/added for single request queue
> > in case of disk rebind. blkg may still stay in q->blkg_list when calling
> > blkcg_init_disk() for rebind, then q->blkg_list becomes corrupted.
> > 
> > Fix the list corruption issue by:
> > 
> > - add blkg_init_queue() to initialize q->blkg_list & q->blkcg_mutex only
> > - move calling blkg_init_queue() into blk_alloc_queue()
> > 
> > The list corruption should be started since commit f1c006f1c685 ("blk-cgroup:
> > synchronize pd_free_fn() from blkg_free_workfn() and blkcg_deactivate_policy()")
> > which delays removing blkg from q->blkg_list into blkg_free_workfn().
> 
> I'm not familiar with how bind/unbind works yet, however, the patch
> itself looks reasonable to me, the initialization of fields related to
> queue should not be delayed to disk allocation.
> 
> Reviewed-by: Yu Kuai <yukuai3@huawei.com>

Thanks!

> 
> BTW, it looks like the whole blkcg_init_disk() can go away:
>  - init of ioprio and blk-throttle can be delayed to the first user
> configuration;
>  - root_blkg allocation doesn't rely on disk at all;
> 
> Or is there any plan to move the blkcg related field or code path to
> gendisk instead of queue? I might missing some previous discussions.

Christoph worked towards this direction, but not succeed:

a06377c5d01e Revert "blk-cgroup: pin the gendisk in struct blkcg_gq"
9a9c261e6b55 Revert "blk-cgroup: pass a gendisk to blkg_lookup"
1231039db31c Revert "blk-cgroup: move the cgroup information to struct gendisk"
dcb522014351 Revert "blk-cgroup: simplify blkg freeing from initialization failure paths"



Thanks, 
Ming


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] block: fix q->blkg_list corruption during disk rebind
  2024-04-07 12:59 [PATCH] block: fix q->blkg_list corruption during disk rebind Ming Lei
  2024-04-07 13:40 ` Yu Kuai
@ 2024-04-07 21:50 ` Jens Axboe
  2024-04-08  5:58 ` Christoph Hellwig
  2 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2024-04-07 21:50 UTC (permalink / raw)
  To: linux-block, Ming Lei; +Cc: Yu Kuai, Tejun Heo


On Sun, 07 Apr 2024 20:59:10 +0800, Ming Lei wrote:
> Multiple gendisk instances can allocated/added for single request queue
> in case of disk rebind. blkg may still stay in q->blkg_list when calling
> blkcg_init_disk() for rebind, then q->blkg_list becomes corrupted.
> 
> Fix the list corruption issue by:
> 
> - add blkg_init_queue() to initialize q->blkg_list & q->blkcg_mutex only
> - move calling blkg_init_queue() into blk_alloc_queue()
> 
> [...]

Applied, thanks!

[1/1] block: fix q->blkg_list corruption during disk rebind
      commit: 8b8ace080319a866f5dfe9da8e665ae51d971c54

Best regards,
-- 
Jens Axboe




^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] block: fix q->blkg_list corruption during disk rebind
  2024-04-07 12:59 [PATCH] block: fix q->blkg_list corruption during disk rebind Ming Lei
  2024-04-07 13:40 ` Yu Kuai
  2024-04-07 21:50 ` Jens Axboe
@ 2024-04-08  5:58 ` Christoph Hellwig
  2024-04-08  7:42   ` Ming Lei
  2 siblings, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2024-04-08  5:58 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, linux-block, Yu Kuai, Tejun Heo

On Sun, Apr 07, 2024 at 08:59:10PM +0800, Ming Lei wrote:
> Multiple gendisk instances can allocated/added for single request queue
> in case of disk rebind. blkg may still stay in q->blkg_list when calling
> blkcg_init_disk() for rebind, then q->blkg_list becomes corrupted.
> 
> Fix the list corruption issue by:

The right fix is to move the blkgs to the gendisk as there is no use
for them on a request_queue without a gendisk.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] block: fix q->blkg_list corruption during disk rebind
  2024-04-08  5:58 ` Christoph Hellwig
@ 2024-04-08  7:42   ` Ming Lei
  0 siblings, 0 replies; 6+ messages in thread
From: Ming Lei @ 2024-04-08  7:42 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jens Axboe, linux-block, Yu Kuai, Tejun Heo

On Sun, Apr 07, 2024 at 10:58:29PM -0700, Christoph Hellwig wrote:
> On Sun, Apr 07, 2024 at 08:59:10PM +0800, Ming Lei wrote:
> > Multiple gendisk instances can allocated/added for single request queue
> > in case of disk rebind. blkg may still stay in q->blkg_list when calling
> > blkcg_init_disk() for rebind, then q->blkg_list becomes corrupted.
> > 
> > Fix the list corruption issue by:
> 
> The right fix is to move the blkgs to the gendisk as there is no use
> for them on a request_queue without a gendisk.

The fix needs to be backported to stable, so it isn't good to fix in
that aggressive way, especially last time your attempt failed, please see
the revert commits in my last reply to Yu Kuai.

Thanks,
Ming


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-04-08  7:42 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-04-07 12:59 [PATCH] block: fix q->blkg_list corruption during disk rebind Ming Lei
2024-04-07 13:40 ` Yu Kuai
2024-04-08  3:05   ` Ming Lei
2024-04-07 21:50 ` Jens Axboe
2024-04-08  5:58 ` Christoph Hellwig
2024-04-08  7:42   ` Ming Lei

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox