public inbox for cgroups@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC v2 0/7] blk-iocost: support to build iocost as kernel module
@ 2024-06-18  3:17 Yu Kuai
  2024-06-18  3:17 ` [PATCH RFC v2 1/7] blk-cgroup: add a new helper pr_cont_blkg_path() Yu Kuai
                   ` (7 more replies)
  0 siblings, 8 replies; 13+ messages in thread
From: Yu Kuai @ 2024-06-18  3:17 UTC (permalink / raw)
  To: axboe, tj, gregkh, bvanassche, hch, josef, lizefan.x, hannes
  Cc: linux-block, linux-kernel, cgroups, yukuai3, yukuai1, yi.zhang,
	yangerkun

From: Yu Kuai <yukuai3@huawei.com>

Changes from RFC v1:
 - replace the first patch.
 - add commit message of our motivation and advantages to build iocost
 as kernel module.

The motivation is that iocost is not used widely in our production, and
some customers don't want to increase kernel size to enable iocost that
they will never use, and it'll be painful to maintain a new downstream
kernel. Hence it'll be beneficially to build iocost as kernel module:

- Kernel Size and Resource Usage, modules are loaded only when their
specific functionality is required.

- Flexibility and Maintainability, allows for dynamic loading and unloading
of modules at runtime without the need to recompile and restart the kernel,
for example we can just replace blk-iocost.ko to fix iocost CVE in our
production environment.

Yu Kuai (7):
  blk-cgroup: add a new helper pr_cont_blkg_path()
  cgroup: export cgroup_parse_float
  block: export some API
  blk-iocost: factor out helpers to handle params from ioc_qos_write()
  blk-iocost: parse params before initializing iocost
  blk-iocost: support to free iocost
  blk-iocost: support to build iocost as kernel module

 block/Kconfig             |   2 +-
 block/blk-cgroup.c        |  10 ++
 block/blk-cgroup.h        |   1 +
 block/blk-iocost.c        | 225 ++++++++++++++++++++++++++------------
 block/blk-rq-qos.c        |   2 +
 include/linux/blk_types.h |   2 +-
 kernel/cgroup/cgroup.c    |   1 +
 7 files changed, 170 insertions(+), 73 deletions(-)

-- 
2.39.2


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH RFC v2 1/7] blk-cgroup: add a new helper pr_cont_blkg_path()
  2024-06-18  3:17 [PATCH RFC v2 0/7] blk-iocost: support to build iocost as kernel module Yu Kuai
@ 2024-06-18  3:17 ` Yu Kuai
  2024-06-18  3:17 ` [PATCH RFC v2 2/7] cgroup: export cgroup_parse_float Yu Kuai
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Yu Kuai @ 2024-06-18  3:17 UTC (permalink / raw)
  To: axboe, tj, gregkh, bvanassche, hch, josef, lizefan.x, hannes
  Cc: linux-block, linux-kernel, cgroups, yukuai3, yukuai1, yi.zhang,
	yangerkun

From: Yu Kuai <yukuai3@huawei.com>

To print blkg path in iocost, and prepare to build iocost as kernel
module.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 block/blk-cgroup.c | 6 ++++++
 block/blk-cgroup.h | 1 +
 block/blk-iocost.c | 2 +-
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 37e6cc91d576..4da70fc7775e 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -2196,5 +2196,11 @@ bool blk_cgroup_congested(void)
 	return ret;
 }
 
+void pr_cont_blkg_path(struct blkcg_gq *blkg)
+{
+	return pr_cont_cgroup_path(blkg->blkcg->css.cgroup);
+}
+EXPORT_SYMBOL_GPL(pr_cont_blkg_path);
+
 module_param(blkcg_debug_stats, bool, 0644);
 MODULE_PARM_DESC(blkcg_debug_stats, "True if you want debug stats, false if not");
diff --git a/block/blk-cgroup.h b/block/blk-cgroup.h
index 90b3959d88cf..25833221a12b 100644
--- a/block/blk-cgroup.h
+++ b/block/blk-cgroup.h
@@ -220,6 +220,7 @@ int blkg_conf_open_bdev(struct blkg_conf_ctx *ctx);
 int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
 		   struct blkg_conf_ctx *ctx);
 void blkg_conf_exit(struct blkg_conf_ctx *ctx);
+void pr_cont_blkg_path(struct blkcg_gq *blkg);
 
 /**
  * bio_issue_as_root_blkg - see if this bio needs to be issued as root blkg
diff --git a/block/blk-iocost.c b/block/blk-iocost.c
index 690ca99dfaca..ca72a62bc9c0 100644
--- a/block/blk-iocost.c
+++ b/block/blk-iocost.c
@@ -1954,7 +1954,7 @@ static void transfer_surpluses(struct list_head *surpluses, struct ioc_now *now)
 				 iocg->hweight_donating <= 1 ||
 				 iocg->hweight_after_donation == 0)) {
 			pr_warn("iocg: invalid donation weights in ");
-			pr_cont_cgroup_path(iocg_to_blkg(iocg)->blkcg->css.cgroup);
+			pr_cont_blkg_path(iocg_to_blkg(iocg));
 			pr_cont(": active=%u donating=%u after=%u\n",
 				iocg->hweight_active, iocg->hweight_donating,
 				iocg->hweight_after_donation);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH RFC v2 2/7] cgroup: export cgroup_parse_float
  2024-06-18  3:17 [PATCH RFC v2 0/7] blk-iocost: support to build iocost as kernel module Yu Kuai
  2024-06-18  3:17 ` [PATCH RFC v2 1/7] blk-cgroup: add a new helper pr_cont_blkg_path() Yu Kuai
@ 2024-06-18  3:17 ` Yu Kuai
  2024-06-18  3:17 ` [PATCH RFC v2 3/7] block: export some API Yu Kuai
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Yu Kuai @ 2024-06-18  3:17 UTC (permalink / raw)
  To: axboe, tj, gregkh, bvanassche, hch, josef, lizefan.x, hannes
  Cc: linux-block, linux-kernel, cgroups, yukuai3, yukuai1, yi.zhang,
	yangerkun

From: Yu Kuai <yukuai3@huawei.com>

The symbol is used by iocost, prepare to build iocost as kernel module.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 kernel/cgroup/cgroup.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 4abd817b0c7c..81b579495f8c 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -6966,6 +6966,7 @@ int cgroup_parse_float(const char *input, unsigned dec_shift, s64 *v)
 	*v = whole * power_of_ten(dec_shift) + frac;
 	return 0;
 }
+EXPORT_SYMBOL_GPL(cgroup_parse_float);
 
 /*
  * sock->sk_cgrp_data handling.  For more info, see sock_cgroup_data
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH RFC v2 3/7] block: export some API
  2024-06-18  3:17 [PATCH RFC v2 0/7] blk-iocost: support to build iocost as kernel module Yu Kuai
  2024-06-18  3:17 ` [PATCH RFC v2 1/7] blk-cgroup: add a new helper pr_cont_blkg_path() Yu Kuai
  2024-06-18  3:17 ` [PATCH RFC v2 2/7] cgroup: export cgroup_parse_float Yu Kuai
@ 2024-06-18  3:17 ` Yu Kuai
  2024-06-18  7:14   ` Christoph Hellwig
  2024-06-18  3:17 ` [PATCH RFC v2 4/7] blk-iocost: factor out helpers to handle params from ioc_qos_write() Yu Kuai
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 13+ messages in thread
From: Yu Kuai @ 2024-06-18  3:17 UTC (permalink / raw)
  To: axboe, tj, gregkh, bvanassche, hch, josef, lizefan.x, hannes
  Cc: linux-block, linux-kernel, cgroups, yukuai3, yukuai1, yi.zhang,
	yangerkun

From: Yu Kuai <yukuai3@huawei.com>

These APIs are used in iocost, prepare to build iocost as kernel module.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 block/blk-cgroup.c | 4 ++++
 block/blk-rq-qos.c | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 4da70fc7775e..787e3023a366 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -57,6 +57,7 @@ static struct blkcg_policy *blkcg_policy[BLKCG_MAX_POLS];
 static LIST_HEAD(all_blkcgs);		/* protected by blkcg_pol_mutex */
 
 bool blkcg_debug_stats = false;
+EXPORT_SYMBOL_GPL(blkcg_debug_stats);
 
 static DEFINE_RAW_SPINLOCK(blkg_stat_lock);
 
@@ -688,6 +689,7 @@ const char *blkg_dev_name(struct blkcg_gq *blkg)
 		return NULL;
 	return bdi_dev_name(blkg->q->disk->bdi);
 }
+EXPORT_SYMBOL_GPL(blkg_dev_name);
 
 /**
  * blkcg_print_blkgs - helper for printing per-blkg data
@@ -815,6 +817,7 @@ int blkg_conf_open_bdev(struct blkg_conf_ctx *ctx)
 	ctx->bdev = bdev;
 	return 0;
 }
+EXPORT_SYMBOL_GPL(blkg_conf_open_bdev);
 
 /**
  * blkg_conf_prep - parse and prepare for per-blkg config update
@@ -2011,6 +2014,7 @@ void blkcg_schedule_throttle(struct gendisk *disk, bool use_memdelay)
 		current->use_memdelay = use_memdelay;
 	set_notify_resume(current);
 }
+EXPORT_SYMBOL_GPL(blkcg_schedule_throttle);
 
 /**
  * blkcg_add_delay - add delay to this blkg
diff --git a/block/blk-rq-qos.c b/block/blk-rq-qos.c
index dd7310c94713..c3fdf91ddf8d 100644
--- a/block/blk-rq-qos.c
+++ b/block/blk-rq-qos.c
@@ -332,6 +332,7 @@ int rq_qos_add(struct rq_qos *rqos, struct gendisk *disk, enum rq_qos_id id,
 	blk_mq_unfreeze_queue(q);
 	return -EBUSY;
 }
+EXPORT_SYMBOL_GPL(rq_qos_add);
 
 void rq_qos_del(struct rq_qos *rqos)
 {
@@ -353,3 +354,4 @@ void rq_qos_del(struct rq_qos *rqos)
 	blk_mq_debugfs_unregister_rqos(rqos);
 	mutex_unlock(&q->debugfs_mutex);
 }
+EXPORT_SYMBOL_GPL(rq_qos_del);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH RFC v2 4/7] blk-iocost: factor out helpers to handle params from ioc_qos_write()
  2024-06-18  3:17 [PATCH RFC v2 0/7] blk-iocost: support to build iocost as kernel module Yu Kuai
                   ` (2 preceding siblings ...)
  2024-06-18  3:17 ` [PATCH RFC v2 3/7] block: export some API Yu Kuai
@ 2024-06-18  3:17 ` Yu Kuai
  2024-06-18  3:17 ` [PATCH RFC v2 5/7] blk-iocost: parse params before initializing iocost Yu Kuai
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Yu Kuai @ 2024-06-18  3:17 UTC (permalink / raw)
  To: axboe, tj, gregkh, bvanassche, hch, josef, lizefan.x, hannes
  Cc: linux-block, linux-kernel, cgroups, yukuai3, yukuai1, yi.zhang,
	yangerkun

From: Yu Kuai <yukuai3@huawei.com>

Currently the procedures are:

1) parse input disk and open the disk;
2) get ioc, if this is the first writer, init iocost;
3) init qos params, by copying from ioc;
4) parse input qos params;
5) update qos params to ioc;

This patch just factor out step 3-5 into separate helpers, there are no
functional changes, and prepare for fulture optimizations:

- move step 3-4 before setp 2, and don't init iocost for invalid input
  qos params;
- add a new input, and support to free iocost after step 4;
- support to build iocost as kernel module;

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 block/blk-iocost.c | 155 ++++++++++++++++++++++++++-------------------
 1 file changed, 91 insertions(+), 64 deletions(-)

diff --git a/block/blk-iocost.c b/block/blk-iocost.c
index ca72a62bc9c0..07f7f49b61b3 100644
--- a/block/blk-iocost.c
+++ b/block/blk-iocost.c
@@ -3206,45 +3206,24 @@ static const match_table_t qos_tokens = {
 	{ NR_QOS_PARAMS,	NULL		},
 };
 
-static ssize_t ioc_qos_write(struct kernfs_open_file *of, char *input,
-			     size_t nbytes, loff_t off)
-{
-	struct blkg_conf_ctx ctx;
-	struct gendisk *disk;
-	struct ioc *ioc;
+struct ioc_qos_params {
 	u32 qos[NR_QOS_PARAMS];
-	bool enable, user;
-	char *body, *p;
-	int ret;
-
-	blkg_conf_init(&ctx, input);
-
-	ret = blkg_conf_open_bdev(&ctx);
-	if (ret)
-		goto err;
-
-	body = ctx.body;
-	disk = ctx.bdev->bd_disk;
-	if (!queue_is_mq(disk->queue)) {
-		ret = -EOPNOTSUPP;
-		goto err;
-	}
-
-	ioc = q_to_ioc(disk->queue);
-	if (!ioc) {
-		ret = blk_iocost_init(disk);
-		if (ret)
-			goto err;
-		ioc = q_to_ioc(disk->queue);
-	}
+	bool enable;
+	bool user;
+};
 
-	blk_mq_freeze_queue(disk->queue);
-	blk_mq_quiesce_queue(disk->queue);
+static void ioc_qos_params_init(struct ioc *ioc, struct ioc_qos_params *params)
+{
+	memcpy(params->qos, ioc->params.qos, sizeof(params->qos));
+	params->enable = ioc->enabled;
+	params->user = ioc->user_qos_params;
+}
 
-	spin_lock_irq(&ioc->lock);
-	memcpy(qos, ioc->params.qos, sizeof(qos));
-	enable = ioc->enabled;
-	user = ioc->user_qos_params;
+static int ioc_qos_params_parse(struct blkg_conf_ctx *ctx,
+				struct ioc_qos_params *params)
+{
+	char *body = ctx->body;
+	char *p;
 
 	while ((p = strsep(&body, " \t\n"))) {
 		substring_t args[MAX_OPT_ARGS];
@@ -3258,17 +3237,17 @@ static ssize_t ioc_qos_write(struct kernfs_open_file *of, char *input,
 		switch (match_token(p, qos_ctrl_tokens, args)) {
 		case QOS_ENABLE:
 			if (match_u64(&args[0], &v))
-				goto einval;
-			enable = v;
+				return -EINVAL;
+			params->enable = v;
 			continue;
 		case QOS_CTRL:
 			match_strlcpy(buf, &args[0], sizeof(buf));
 			if (!strcmp(buf, "auto"))
-				user = false;
+				params->user = false;
 			else if (!strcmp(buf, "user"))
-				user = true;
+				params->user = true;
 			else
-				goto einval;
+				return -EINVAL;
 			continue;
 		}
 
@@ -3278,61 +3257,110 @@ static ssize_t ioc_qos_write(struct kernfs_open_file *of, char *input,
 		case QOS_WPPM:
 			if (match_strlcpy(buf, &args[0], sizeof(buf)) >=
 			    sizeof(buf))
-				goto einval;
+				return -EINVAL;
 			if (cgroup_parse_float(buf, 2, &v))
-				goto einval;
+				return -EINVAL;
 			if (v < 0 || v > 10000)
-				goto einval;
-			qos[tok] = v * 100;
+				return -EINVAL;
+			params->qos[tok] = v * 100;
 			break;
 		case QOS_RLAT:
 		case QOS_WLAT:
 			if (match_u64(&args[0], &v))
-				goto einval;
-			qos[tok] = v;
+				return -EINVAL;
+			params->qos[tok] = v;
 			break;
 		case QOS_MIN:
 		case QOS_MAX:
 			if (match_strlcpy(buf, &args[0], sizeof(buf)) >=
 			    sizeof(buf))
-				goto einval;
+				return -EINVAL;
 			if (cgroup_parse_float(buf, 2, &v))
-				goto einval;
+				return -EINVAL;
 			if (v < 0)
-				goto einval;
-			qos[tok] = clamp_t(s64, v * 100,
-					   VRATE_MIN_PPM, VRATE_MAX_PPM);
+				return -EINVAL;
+			params->qos[tok] = clamp_t(s64, v * 100,
+						   VRATE_MIN_PPM,
+						   VRATE_MAX_PPM);
 			break;
 		default:
-			goto einval;
+			return -EINVAL;
 		}
-		user = true;
+		params->user = true;
 	}
 
-	if (qos[QOS_MIN] > qos[QOS_MAX])
-		goto einval;
+	if (params->qos[QOS_MIN] > params->qos[QOS_MAX])
+		return -EINVAL;
+
+	return 0;
+}
 
-	if (enable && !ioc->enabled) {
+static void ioc_qos_params_update(struct gendisk *disk, struct ioc *ioc,
+				  struct ioc_qos_params *params)
+{
+	if (params->enable && !ioc->enabled) {
 		blk_stat_enable_accounting(disk->queue);
 		blk_queue_flag_set(QUEUE_FLAG_RQ_ALLOC_TIME, disk->queue);
 		ioc->enabled = true;
-	} else if (!enable && ioc->enabled) {
+	} else if (!params->enable && ioc->enabled) {
 		blk_stat_disable_accounting(disk->queue);
 		blk_queue_flag_clear(QUEUE_FLAG_RQ_ALLOC_TIME, disk->queue);
 		ioc->enabled = false;
 	}
 
-	if (user) {
-		memcpy(ioc->params.qos, qos, sizeof(qos));
+	if (params->user) {
+		memcpy(ioc->params.qos, params->qos, sizeof(params->qos));
 		ioc->user_qos_params = true;
 	} else {
 		ioc->user_qos_params = false;
 	}
 
 	ioc_refresh_params(ioc, true);
+}
+
+static ssize_t ioc_qos_write(struct kernfs_open_file *of, char *input,
+			     size_t nbytes, loff_t off)
+{
+	struct ioc_qos_params params;
+	struct blkg_conf_ctx ctx;
+	struct gendisk *disk;
+	struct ioc *ioc;
+	int ret;
+
+	blkg_conf_init(&ctx, input);
+
+	ret = blkg_conf_open_bdev(&ctx);
+	if (ret)
+		goto err;
+
+	disk = ctx.bdev->bd_disk;
+	if (!queue_is_mq(disk->queue)) {
+		ret = -EOPNOTSUPP;
+		goto err;
+	}
+
+	ioc = q_to_ioc(disk->queue);
+	if (!ioc) {
+		ret = blk_iocost_init(disk);
+		if (ret)
+			goto err;
+		ioc = q_to_ioc(disk->queue);
+	}
+
+	blk_mq_freeze_queue(disk->queue);
+	blk_mq_quiesce_queue(disk->queue);
+
+	spin_lock_irq(&ioc->lock);
+	ioc_qos_params_init(ioc, &params);
+
+	ret = ioc_qos_params_parse(&ctx, &params);
+	if (ret)
+		goto err_parse;
+
+	ioc_qos_params_update(disk, ioc, &params);
 	spin_unlock_irq(&ioc->lock);
 
-	if (enable)
+	if (params.enable)
 		wbt_disable_default(disk);
 	else
 		wbt_enable_default(disk);
@@ -3342,13 +3370,12 @@ static ssize_t ioc_qos_write(struct kernfs_open_file *of, char *input,
 
 	blkg_conf_exit(&ctx);
 	return nbytes;
-einval:
+
+err_parse:
 	spin_unlock_irq(&ioc->lock);
 
 	blk_mq_unquiesce_queue(disk->queue);
 	blk_mq_unfreeze_queue(disk->queue);
-
-	ret = -EINVAL;
 err:
 	blkg_conf_exit(&ctx);
 	return ret;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH RFC v2 5/7] blk-iocost: parse params before initializing iocost
  2024-06-18  3:17 [PATCH RFC v2 0/7] blk-iocost: support to build iocost as kernel module Yu Kuai
                   ` (3 preceding siblings ...)
  2024-06-18  3:17 ` [PATCH RFC v2 4/7] blk-iocost: factor out helpers to handle params from ioc_qos_write() Yu Kuai
@ 2024-06-18  3:17 ` Yu Kuai
  2024-06-18  3:17 ` [PATCH RFC v2 6/7] blk-iocost: support to free iocost Yu Kuai
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Yu Kuai @ 2024-06-18  3:17 UTC (permalink / raw)
  To: axboe, tj, gregkh, bvanassche, hch, josef, lizefan.x, hannes
  Cc: linux-block, linux-kernel, cgroups, yukuai3, yukuai1, yi.zhang,
	yangerkun

From: Yu Kuai <yukuai3@huawei.com>

On the one hand prevent to initialize iocost for invalid input, on the
ohter hand prepare to add a new input to free iocost.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 block/blk-iocost.c | 44 ++++++++++++++++++++++++++++++--------------
 1 file changed, 30 insertions(+), 14 deletions(-)

diff --git a/block/blk-iocost.c b/block/blk-iocost.c
index 07f7f49b61b3..34dcf4cdfadb 100644
--- a/block/blk-iocost.c
+++ b/block/blk-iocost.c
@@ -3214,9 +3214,20 @@ struct ioc_qos_params {
 
 static void ioc_qos_params_init(struct ioc *ioc, struct ioc_qos_params *params)
 {
-	memcpy(params->qos, ioc->params.qos, sizeof(params->qos));
-	params->enable = ioc->enabled;
-	params->user = ioc->user_qos_params;
+	if (ioc) {
+		memcpy(params->qos, ioc->params.qos, sizeof(params->qos));
+		params->enable = ioc->enabled;
+		params->user = ioc->user_qos_params;
+	} else {
+		params->qos[QOS_RPPM] = 0;
+		params->qos[QOS_RLAT] = 0;
+		params->qos[QOS_WPPM] = 0;
+		params->qos[QOS_WLAT] = 0;
+		params->qos[QOS_MIN] = VRATE_MIN_PPM;
+		params->qos[QOS_MAX] = VRATE_MAX_PPM;
+		params->enable = false;
+		params->user = false;
+	}
 }
 
 static int ioc_qos_params_parse(struct blkg_conf_ctx *ctx,
@@ -3309,7 +3320,16 @@ static void ioc_qos_params_update(struct gendisk *disk, struct ioc *ioc,
 	}
 
 	if (params->user) {
-		memcpy(ioc->params.qos, params->qos, sizeof(params->qos));
+		if (params->qos[QOS_RPPM])
+			ioc->params.qos[QOS_RPPM] = params->qos[QOS_RPPM];
+		if (params->qos[QOS_RLAT])
+			ioc->params.qos[QOS_RLAT] = params->qos[QOS_RLAT];
+		if (params->qos[QOS_WPPM])
+			ioc->params.qos[QOS_RPPM] = params->qos[QOS_WPPM];
+		if (params->qos[QOS_WLAT])
+			ioc->params.qos[QOS_WLAT] = params->qos[QOS_WLAT];
+		ioc->params.qos[QOS_MIN] = params->qos[QOS_MIN];
+		ioc->params.qos[QOS_MAX] = params->qos[QOS_MAX];
 		ioc->user_qos_params = true;
 	} else {
 		ioc->user_qos_params = false;
@@ -3340,6 +3360,12 @@ static ssize_t ioc_qos_write(struct kernfs_open_file *of, char *input,
 	}
 
 	ioc = q_to_ioc(disk->queue);
+	ioc_qos_params_init(ioc, &params);
+
+	ret = ioc_qos_params_parse(&ctx, &params);
+	if (ret)
+		goto err;
+
 	if (!ioc) {
 		ret = blk_iocost_init(disk);
 		if (ret)
@@ -3351,11 +3377,6 @@ static ssize_t ioc_qos_write(struct kernfs_open_file *of, char *input,
 	blk_mq_quiesce_queue(disk->queue);
 
 	spin_lock_irq(&ioc->lock);
-	ioc_qos_params_init(ioc, &params);
-
-	ret = ioc_qos_params_parse(&ctx, &params);
-	if (ret)
-		goto err_parse;
 
 	ioc_qos_params_update(disk, ioc, &params);
 	spin_unlock_irq(&ioc->lock);
@@ -3371,11 +3392,6 @@ static ssize_t ioc_qos_write(struct kernfs_open_file *of, char *input,
 	blkg_conf_exit(&ctx);
 	return nbytes;
 
-err_parse:
-	spin_unlock_irq(&ioc->lock);
-
-	blk_mq_unquiesce_queue(disk->queue);
-	blk_mq_unfreeze_queue(disk->queue);
 err:
 	blkg_conf_exit(&ctx);
 	return ret;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH RFC v2 6/7] blk-iocost: support to free iocost
  2024-06-18  3:17 [PATCH RFC v2 0/7] blk-iocost: support to build iocost as kernel module Yu Kuai
                   ` (4 preceding siblings ...)
  2024-06-18  3:17 ` [PATCH RFC v2 5/7] blk-iocost: parse params before initializing iocost Yu Kuai
@ 2024-06-18  3:17 ` Yu Kuai
  2024-06-18  3:17 ` [PATCH RFC v2 7/7] blk-iocost: support to build iocost as kernel module Yu Kuai
  2024-06-18  7:12 ` [PATCH RFC v2 0/7] " Christoph Hellwig
  7 siblings, 0 replies; 13+ messages in thread
From: Yu Kuai @ 2024-06-18  3:17 UTC (permalink / raw)
  To: axboe, tj, gregkh, bvanassche, hch, josef, lizefan.x, hannes
  Cc: linux-block, linux-kernel, cgroups, yukuai3, yukuai1, yi.zhang,
	yangerkun

From: Yu Kuai <yukuai3@huawei.com>

Currently if iocost is initialized, it can never be freed until the disk
is deleted.

A new param "free" is added to the blk-io cgroup api "io.cost.qos", and
user can use it to free iocost. On the one hand prevent overhead from
fast path, on the other hand prepare to build iocost as kernel module.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 block/blk-iocost.c | 38 +++++++++++++++++++++++++++++++++-----
 1 file changed, 33 insertions(+), 5 deletions(-)

diff --git a/block/blk-iocost.c b/block/blk-iocost.c
index 34dcf4cdfadb..9765c988113f 100644
--- a/block/blk-iocost.c
+++ b/block/blk-iocost.c
@@ -324,6 +324,7 @@ enum ioc_running {
 enum {
 	QOS_ENABLE,
 	QOS_CTRL,
+	QOS_FREE,
 	NR_QOS_CTRL_PARAMS,
 };
 
@@ -2847,11 +2848,9 @@ static void ioc_rqos_queue_depth_changed(struct rq_qos *rqos)
 	spin_unlock_irq(&ioc->lock);
 }
 
-static void ioc_rqos_exit(struct rq_qos *rqos)
+static void __ioc_exit(struct ioc *ioc)
 {
-	struct ioc *ioc = rqos_to_ioc(rqos);
-
-	blkcg_deactivate_policy(rqos->disk, &blkcg_policy_iocost);
+	blkcg_deactivate_policy(ioc->rqos.disk, &blkcg_policy_iocost);
 
 	spin_lock_irq(&ioc->lock);
 	ioc->running = IOC_STOP;
@@ -2862,6 +2861,13 @@ static void ioc_rqos_exit(struct rq_qos *rqos)
 	kfree(ioc);
 }
 
+static void ioc_rqos_exit(struct rq_qos *rqos)
+{
+	struct ioc *ioc = rqos_to_ioc(rqos);
+
+	__ioc_exit(ioc);
+}
+
 static const struct rq_qos_ops ioc_rqos_ops = {
 	.throttle = ioc_rqos_throttle,
 	.merge = ioc_rqos_merge,
@@ -3193,6 +3199,7 @@ static int ioc_qos_show(struct seq_file *sf, void *v)
 static const match_table_t qos_ctrl_tokens = {
 	{ QOS_ENABLE,		"enable=%u"	},
 	{ QOS_CTRL,		"ctrl=%s"	},
+	{ QOS_FREE,		"free"		},
 	{ NR_QOS_CTRL_PARAMS,	NULL		},
 };
 
@@ -3210,6 +3217,7 @@ struct ioc_qos_params {
 	u32 qos[NR_QOS_PARAMS];
 	bool enable;
 	bool user;
+	bool free;
 };
 
 static void ioc_qos_params_init(struct ioc *ioc, struct ioc_qos_params *params)
@@ -3228,6 +3236,8 @@ static void ioc_qos_params_init(struct ioc *ioc, struct ioc_qos_params *params)
 		params->enable = false;
 		params->user = false;
 	}
+
+	params->free = false;
 }
 
 static int ioc_qos_params_parse(struct blkg_conf_ctx *ctx,
@@ -3260,6 +3270,9 @@ static int ioc_qos_params_parse(struct blkg_conf_ctx *ctx,
 			else
 				return -EINVAL;
 			continue;
+		case QOS_FREE:
+			params->free = true;
+			continue;
 		}
 
 		tok = match_token(p, qos_tokens, args);
@@ -3338,6 +3351,15 @@ static void ioc_qos_params_update(struct gendisk *disk, struct ioc *ioc,
 	ioc_refresh_params(ioc, true);
 }
 
+static void ioc_free(struct ioc *ioc)
+{
+	if (!ioc)
+		return;
+
+	rq_qos_del(&ioc->rqos);
+	__ioc_exit(ioc);
+}
+
 static ssize_t ioc_qos_write(struct kernfs_open_file *of, char *input,
 			     size_t nbytes, loff_t off)
 {
@@ -3366,7 +3388,7 @@ static ssize_t ioc_qos_write(struct kernfs_open_file *of, char *input,
 	if (ret)
 		goto err;
 
-	if (!ioc) {
+	if (!params.free && !ioc) {
 		ret = blk_iocost_init(disk);
 		if (ret)
 			goto err;
@@ -3376,6 +3398,11 @@ static ssize_t ioc_qos_write(struct kernfs_open_file *of, char *input,
 	blk_mq_freeze_queue(disk->queue);
 	blk_mq_quiesce_queue(disk->queue);
 
+	if (params.free) {
+		ioc_free(ioc);
+		goto out;
+	}
+
 	spin_lock_irq(&ioc->lock);
 
 	ioc_qos_params_update(disk, ioc, &params);
@@ -3386,6 +3413,7 @@ static ssize_t ioc_qos_write(struct kernfs_open_file *of, char *input,
 	else
 		wbt_enable_default(disk);
 
+out:
 	blk_mq_unquiesce_queue(disk->queue);
 	blk_mq_unfreeze_queue(disk->queue);
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH RFC v2 7/7] blk-iocost: support to build iocost as kernel module
  2024-06-18  3:17 [PATCH RFC v2 0/7] blk-iocost: support to build iocost as kernel module Yu Kuai
                   ` (5 preceding siblings ...)
  2024-06-18  3:17 ` [PATCH RFC v2 6/7] blk-iocost: support to free iocost Yu Kuai
@ 2024-06-18  3:17 ` Yu Kuai
  2024-06-18  7:12 ` [PATCH RFC v2 0/7] " Christoph Hellwig
  7 siblings, 0 replies; 13+ messages in thread
From: Yu Kuai @ 2024-06-18  3:17 UTC (permalink / raw)
  To: axboe, tj, gregkh, bvanassche, hch, josef, lizefan.x, hannes
  Cc: linux-block, linux-kernel, cgroups, yukuai3, yukuai1, yi.zhang,
	yangerkun

From: Yu Kuai <yukuai3@huawei.com>

The motivation is that iocost is not used widely in our production, and
some customers don't want to increase kernel size to enable iocost that
they will never use, and it'll be painful to maintain a new downstream
kernel version. Hence it'll be beneficially to build iocost as kernel
module:

- Kernel Size and Resource Usage, modules are loaded only when their
specific functionality is required.

- Flexibility and Maintainability, allows for dynamic loading and unloading
of modules at runtime without the need to recompile and restart the kernel,
for example we can just replace blk-iocost.ko to fix iocost CVE in our
production environment.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 block/Kconfig             |  2 +-
 block/blk-iocost.c        | 14 +++++++++++++-
 include/linux/blk_types.h |  2 +-
 3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/block/Kconfig b/block/Kconfig
index dc12af58dbae..b94b93158e57 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -156,7 +156,7 @@ config BLK_CGROUP_FC_APPID
 	  application specific identification into the FC frame.
 
 config BLK_CGROUP_IOCOST
-	bool "Enable support for cost model based cgroup IO controller"
+	tristate "Enable support for cost model based cgroup IO controller"
 	depends on BLK_CGROUP
 	select BLK_RQ_ALLOC_TIME
 	help
diff --git a/block/blk-iocost.c b/block/blk-iocost.c
index 9765c988113f..33ab6e436af2 100644
--- a/block/blk-iocost.c
+++ b/block/blk-iocost.c
@@ -2850,6 +2850,7 @@ static void ioc_rqos_queue_depth_changed(struct rq_qos *rqos)
 
 static void __ioc_exit(struct ioc *ioc)
 {
+	module_put(THIS_MODULE);
 	blkcg_deactivate_policy(ioc->rqos.disk, &blkcg_policy_iocost);
 
 	spin_lock_irq(&ioc->lock);
@@ -2882,13 +2883,19 @@ static int blk_iocost_init(struct gendisk *disk)
 	struct ioc *ioc;
 	int i, cpu, ret;
 
+	if (!try_module_get(THIS_MODULE))
+		return -ENODEV;
+
 	ioc = kzalloc(sizeof(*ioc), GFP_KERNEL);
-	if (!ioc)
+	if (!ioc) {
+		module_put(THIS_MODULE);
 		return -ENOMEM;
+	}
 
 	ioc->pcpu_stat = alloc_percpu(struct ioc_pcpu_stat);
 	if (!ioc->pcpu_stat) {
 		kfree(ioc);
+		module_put(THIS_MODULE);
 		return -ENOMEM;
 	}
 
@@ -2938,6 +2945,7 @@ static int blk_iocost_init(struct gendisk *disk)
 	rq_qos_del(&ioc->rqos);
 err_free_ioc:
 	free_percpu(ioc->pcpu_stat);
+	module_put(THIS_MODULE);
 	kfree(ioc);
 	return ret;
 }
@@ -3616,3 +3624,7 @@ static void __exit ioc_exit(void)
 
 module_init(ioc_init);
 module_exit(ioc_exit);
+
+MODULE_AUTHOR("Tejun Heo");
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("Cost model based cgroup IO controller");
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 781c4500491b..8da12ebc7777 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -234,7 +234,7 @@ struct bio {
 	 */
 	struct blkcg_gq		*bi_blkg;
 	struct bio_issue	bi_issue;
-#ifdef CONFIG_BLK_CGROUP_IOCOST
+#if IS_ENABLED(CONFIG_BLK_CGROUP_IOCOST)
 	u64			bi_iocost_cost;
 #endif
 #endif
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH RFC v2 0/7] blk-iocost: support to build iocost as kernel module
  2024-06-18  3:17 [PATCH RFC v2 0/7] blk-iocost: support to build iocost as kernel module Yu Kuai
                   ` (6 preceding siblings ...)
  2024-06-18  3:17 ` [PATCH RFC v2 7/7] blk-iocost: support to build iocost as kernel module Yu Kuai
@ 2024-06-18  7:12 ` Christoph Hellwig
  2024-06-18  8:03   ` Yu Kuai
  7 siblings, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2024-06-18  7:12 UTC (permalink / raw)
  To: Yu Kuai
  Cc: axboe, tj, gregkh, bvanassche, hch, josef, lizefan.x, hannes,
	linux-block, linux-kernel, cgroups, yukuai3, yi.zhang, yangerkun

On Tue, Jun 18, 2024 at 11:17:44AM +0800, Yu Kuai wrote:
> The motivation is that iocost is not used widely in our production, and
> some customers don't want to increase kernel size to enable iocost that
> they will never use, and it'll be painful to maintain a new downstream
> kernel. Hence it'll be beneficially to build iocost as kernel module:
> 
> - Kernel Size and Resource Usage, modules are loaded only when their
> specific functionality is required.
> 
> - Flexibility and Maintainability, allows for dynamic loading and unloading
> of modules at runtime without the need to recompile and restart the kernel,
> for example we can just replace blk-iocost.ko to fix iocost CVE in our
> production environment.

Given the amount of new exports and infrastructure it adds this still
feels like a bad tradeoff.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH RFC v2 3/7] block: export some API
  2024-06-18  3:17 ` [PATCH RFC v2 3/7] block: export some API Yu Kuai
@ 2024-06-18  7:14   ` Christoph Hellwig
  2024-06-18  7:58     ` Yu Kuai
  0 siblings, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2024-06-18  7:14 UTC (permalink / raw)
  To: Yu Kuai
  Cc: axboe, tj, gregkh, bvanassche, hch, josef, lizefan.x, hannes,
	linux-block, linux-kernel, cgroups, yukuai3, yi.zhang, yangerkun

On Tue, Jun 18, 2024 at 11:17:47AM +0800, Yu Kuai wrote:
>  bool blkcg_debug_stats = false;
> +EXPORT_SYMBOL_GPL(blkcg_debug_stats);

exporting variables is lways a bad idea.a

>  
>  static DEFINE_RAW_SPINLOCK(blkg_stat_lock);
>  
> @@ -688,6 +689,7 @@ const char *blkg_dev_name(struct blkcg_gq *blkg)
>  		return NULL;
>  	return bdi_dev_name(blkg->q->disk->bdi);
>  }
> +EXPORT_SYMBOL_GPL(blkg_dev_name);

And this helper is just horribly to be honest.  The bdi_dev_name
should not be used anywhere in block code, and something like this
certainly should not be exported even if we have to keep it for
legacy reasons.

>  /**
>   * blkcg_print_blkgs - helper for printing per-blkg data
> @@ -815,6 +817,7 @@ int blkg_conf_open_bdev(struct blkg_conf_ctx *ctx)
>  	ctx->bdev = bdev;
>  	return 0;
>  }
> +EXPORT_SYMBOL_GPL(blkg_conf_open_bdev);

This again is a horrible function papeing over blk-cgroup design
mistakes.  I absolutely should not be exported.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH RFC v2 3/7] block: export some API
  2024-06-18  7:14   ` Christoph Hellwig
@ 2024-06-18  7:58     ` Yu Kuai
  2024-06-20  6:48       ` Christoph Hellwig
  0 siblings, 1 reply; 13+ messages in thread
From: Yu Kuai @ 2024-06-18  7:58 UTC (permalink / raw)
  To: Christoph Hellwig, Yu Kuai
  Cc: axboe, tj, gregkh, bvanassche, josef, lizefan.x, hannes,
	linux-block, linux-kernel, cgroups, yi.zhang, yangerkun,
	yukuai (C)

Hi,

在 2024/06/18 15:14, Christoph Hellwig 写道:
> On Tue, Jun 18, 2024 at 11:17:47AM +0800, Yu Kuai wrote:
>>   bool blkcg_debug_stats = false;
>> +EXPORT_SYMBOL_GPL(blkcg_debug_stats);
> 
> exporting variables is lways a bad idea.a

Yes, export variable is the easiest way here. Will it be acceptable to
make this variable static and add a helper to access it?

> 
>>   
>>   static DEFINE_RAW_SPINLOCK(blkg_stat_lock);
>>   
>> @@ -688,6 +689,7 @@ const char *blkg_dev_name(struct blkcg_gq *blkg)
>>   		return NULL;
>>   	return bdi_dev_name(blkg->q->disk->bdi);
>>   }
>> +EXPORT_SYMBOL_GPL(blkg_dev_name);
> 
> And this helper is just horribly to be honest.  The bdi_dev_name
> should not be used anywhere in block code, and something like this
> certainly should not be exported even if we have to keep it for
> legacy reasons.

And I can reuse __blkg_prfill_u64() for iocost like bfq did, then this
can be removed.

> 
>>   /**
>>    * blkcg_print_blkgs - helper for printing per-blkg data
>> @@ -815,6 +817,7 @@ int blkg_conf_open_bdev(struct blkg_conf_ctx *ctx)
>>   	ctx->bdev = bdev;
>>   	return 0;
>>   }
>> +EXPORT_SYMBOL_GPL(blkg_conf_open_bdev);
> 
> This again is a horrible function papeing over blk-cgroup design
> mistakes.  I absolutely should not be exported.

There is already bigger helper blkg_conf_prep() exported, as used by bfq
for a long time already, and this helper is introduced for code
optimization, as iocost configuration doesn't need to access the blkg.

Perhaps this should be another discussion about "design mistakes",
however, I'm not sure why. :( Do you have previous discussions?

Thanks,
Kuai

> 
> .
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH RFC v2 0/7] blk-iocost: support to build iocost as kernel module
  2024-06-18  7:12 ` [PATCH RFC v2 0/7] " Christoph Hellwig
@ 2024-06-18  8:03   ` Yu Kuai
  0 siblings, 0 replies; 13+ messages in thread
From: Yu Kuai @ 2024-06-18  8:03 UTC (permalink / raw)
  To: Christoph Hellwig, Yu Kuai
  Cc: axboe, tj, gregkh, bvanassche, josef, lizefan.x, hannes,
	linux-block, linux-kernel, cgroups, yi.zhang, yangerkun,
	yukuai (C)

Hi,

在 2024/06/18 15:12, Christoph Hellwig 写道:
> On Tue, Jun 18, 2024 at 11:17:44AM +0800, Yu Kuai wrote:
>> The motivation is that iocost is not used widely in our production, and
>> some customers don't want to increase kernel size to enable iocost that
>> they will never use, and it'll be painful to maintain a new downstream
>> kernel. Hence it'll be beneficially to build iocost as kernel module:
>>
>> - Kernel Size and Resource Usage, modules are loaded only when their
>> specific functionality is required.
>>
>> - Flexibility and Maintainability, allows for dynamic loading and unloading
>> of modules at runtime without the need to recompile and restart the kernel,
>> for example we can just replace blk-iocost.ko to fix iocost CVE in our
>> production environment.
> 
> Given the amount of new exports and infrastructure it adds this still
> feels like a bad tradeoff.

Yes, I understand your concern, let's see if we can export less and
hopefully accept the tradeoff. :) All other cgroup policies and wbt can
benefit the same without more helpers to be exported.

Thanks,
Kuai
> 
> 
> .
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH RFC v2 3/7] block: export some API
  2024-06-18  7:58     ` Yu Kuai
@ 2024-06-20  6:48       ` Christoph Hellwig
  0 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2024-06-20  6:48 UTC (permalink / raw)
  To: Yu Kuai
  Cc: Christoph Hellwig, axboe, tj, gregkh, bvanassche, josef,
	lizefan.x, hannes, linux-block, linux-kernel, cgroups, yi.zhang,
	yangerkun, yukuai (C)

On Tue, Jun 18, 2024 at 03:58:54PM +0800, Yu Kuai wrote:
> There is already bigger helper blkg_conf_prep() exported, as used by bfq
> for a long time already, and this helper is introduced for code
> optimization, as iocost configuration doesn't need to access the blkg.
> 
> Perhaps this should be another discussion about "design mistakes",
> however, I'm not sure why. :( Do you have previous discussions?

blkg_conf_prep at least has some level of abstraction.

But at least my conclusion is:  don't make blk-iocost modular.  It's
far too messy to split out, and not really worth it.


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2024-06-20  6:48 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-18  3:17 [PATCH RFC v2 0/7] blk-iocost: support to build iocost as kernel module Yu Kuai
2024-06-18  3:17 ` [PATCH RFC v2 1/7] blk-cgroup: add a new helper pr_cont_blkg_path() Yu Kuai
2024-06-18  3:17 ` [PATCH RFC v2 2/7] cgroup: export cgroup_parse_float Yu Kuai
2024-06-18  3:17 ` [PATCH RFC v2 3/7] block: export some API Yu Kuai
2024-06-18  7:14   ` Christoph Hellwig
2024-06-18  7:58     ` Yu Kuai
2024-06-20  6:48       ` Christoph Hellwig
2024-06-18  3:17 ` [PATCH RFC v2 4/7] blk-iocost: factor out helpers to handle params from ioc_qos_write() Yu Kuai
2024-06-18  3:17 ` [PATCH RFC v2 5/7] blk-iocost: parse params before initializing iocost Yu Kuai
2024-06-18  3:17 ` [PATCH RFC v2 6/7] blk-iocost: support to free iocost Yu Kuai
2024-06-18  3:17 ` [PATCH RFC v2 7/7] blk-iocost: support to build iocost as kernel module Yu Kuai
2024-06-18  7:12 ` [PATCH RFC v2 0/7] " Christoph Hellwig
2024-06-18  8:03   ` Yu Kuai

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox