cgroups.vger.kernel.org archive mirror
* [PATCHSET cgroup/for-3.14] cgroup: consolidate file handling
@ 2013-11-27 23:42 Tejun Heo
       [not found] ` <1385595759-17656-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: Tejun Heo @ 2013-11-27 23:42 UTC (permalink / raw)
  To: lizefan-hv44wF8Li93QT0dZR+AlfA
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, peterz-wEGCiKHe2LqWVfeAwA7xHQ,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-AlSwsSmVLrQ,
	bsingharora-Re5JQEeQqe8AvxtiuMwx3w,
	kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A,
	nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	arozansk-H+wXaHxf7aLQT0dZR+AlfA

Hello,

cgroup is scheduled to be converted to use kernfs, which is currently
in the process of being separated out of sysfs, so that, among other
things, cgroup core locking can be decoupled from vfs layer.  This
patchset cleans up and consolidates cgroup file handling to facilitate
such conversion.

There currently are a couple of different read/write paths, including
ones which don't impose any structure on the data.  All existing users
and all reasonably expected use cases can be served by the standard
seq_file interface and buffered writes, which is what kernfs provides.

This patchset updates cgroup file handling so that the interface and
its usages are more concise and there is a single path for read and a
single path for write, both of which map closely to the interface
kernfs provides.

This series adds some code that will eventually be replaced by kernfs
but, overall, things get more streamlined and LOC is reduced.

The following 12 patches are included in the series.

 0001-cgroup-sched-convert-away-from-cftype-read_map.patch
 0002-cpuset-convert-away-from-cftype-read.patch
 0003-memcg-convert-away-from-cftype-read-and-read_map.patch
 0004-netprio_cgroup-convert-away-from-cftype-read_map.patch
 0005-hugetlb_cgroup-convert-away-from-cftype-read.patch
 0006-cgroup-remove-cftype-read-read_map-and-write.patch
 0007-cgroup-unify-cgroup_write_X64-and-cgroup_write_strin.patch
 0008-cgroup-unify-read-path-so-that-seq_file-is-always-us.patch
 0009-cgroup-generalize-cgroup_pidlist_open_file.patch
 0010-cgroup-attach-cgroup_open_file-to-all-cgroup-files.patch
 0011-cgroup-replace-cftype-read_seq_string-with-cftype-se.patch
 0012-cgroup-unify-pidlist-and-other-file-handling.patch

While this series touches a lot of controllers, the updates to the
controllers are mostly mechanical.  I think it'd be best if the series
can be routed through the cgroup/for-3.14 branch.

This patchset is on top of

  cgroup/for-3.14 c729b11edf74 ("cgroup: Merge branch 'for-3.13-fixes' into for-3.14")
+ [1] [PATCHSET] cgroup: restructure pidlist handling

and available in the following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-consolidate-file-handling

diffstat follows.

 block/blk-throttle.c      |   35 +---
 block/cfq-iosched.c       |  131 +++++++--------
 include/linux/cgroup.h    |   71 ++++----
 kernel/cgroup.c           |  380 ++++++++++++++++------------------------------
 kernel/cgroup_freezer.c   |    7
 kernel/cpuset.c           |   71 ++------
 kernel/sched/core.c       |   13 -
 kernel/sched/cpuacct.c    |   18 --
 mm/hugetlb_cgroup.c       |   22 --
 mm/memcontrol.c           |   73 +++-----
 net/core/netprio_cgroup.c |    8
 security/device_cgroup.c  |    7
 12 files changed, 339 insertions(+), 497 deletions(-)

Thanks.

--
tejun

[1] http://lkml.kernel.org/g/1385331096-7918-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 01/12] cgroup, sched: convert away from cftype->read_map()
       [not found] ` <1385595759-17656-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2013-11-27 23:42   ` Tejun Heo
  2013-11-27 23:42   ` [PATCH 02/12] cpuset: convert away from cftype->read() Tejun Heo
                     ` (12 subsequent siblings)
  13 siblings, 0 replies; 38+ messages in thread
From: Tejun Heo @ 2013-11-27 23:42 UTC (permalink / raw)
  To: lizefan-hv44wF8Li93QT0dZR+AlfA
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, peterz-wEGCiKHe2LqWVfeAwA7xHQ,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-AlSwsSmVLrQ,
	bsingharora-Re5JQEeQqe8AvxtiuMwx3w,
	kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A,
	nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	arozansk-H+wXaHxf7aLQT0dZR+AlfA, Tejun Heo

In preparation of conversion to kernfs, cgroup file handling is being
consolidated so that it can be easily mapped to the seq_file based
interface of kernfs.

cftype->read_map() doesn't add any value and is being replaced with
->read_seq_string().  Update cpu_stats_show() and cpuacct_stats_show()
accordingly.

This patch doesn't make any visible behavior changes.

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Ingo Molnar <mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
---
 kernel/sched/core.c    | 10 +++++-----
 kernel/sched/cpuacct.c |  8 ++++----
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c180860..f28ec67 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7257,14 +7257,14 @@ static int __cfs_schedulable(struct task_group *tg, u64 period, u64 quota)
 }
 
 static int cpu_stats_show(struct cgroup_subsys_state *css, struct cftype *cft,
-		struct cgroup_map_cb *cb)
+			  struct seq_file *sf)
 {
 	struct task_group *tg = css_tg(css);
 	struct cfs_bandwidth *cfs_b = &tg->cfs_bandwidth;
 
-	cb->fill(cb, "nr_periods", cfs_b->nr_periods);
-	cb->fill(cb, "nr_throttled", cfs_b->nr_throttled);
-	cb->fill(cb, "throttled_time", cfs_b->throttled_time);
+	seq_printf(sf, "nr_periods %d\n", cfs_b->nr_periods);
+	seq_printf(sf, "nr_throttled %d\n", cfs_b->nr_throttled);
+	seq_printf(sf, "throttled_time %llu\n", cfs_b->throttled_time);
 
 	return 0;
 }
@@ -7318,7 +7318,7 @@ static struct cftype cpu_files[] = {
 	},
 	{
 		.name = "stat",
-		.read_map = cpu_stats_show,
+		.read_seq_string = cpu_stats_show,
 	},
 #endif
 #ifdef CONFIG_RT_GROUP_SCHED
diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c
index f64722f..dd88738 100644
--- a/kernel/sched/cpuacct.c
+++ b/kernel/sched/cpuacct.c
@@ -184,7 +184,7 @@ static const char * const cpuacct_stat_desc[] = {
 };
 
 static int cpuacct_stats_show(struct cgroup_subsys_state *css,
-			      struct cftype *cft, struct cgroup_map_cb *cb)
+			      struct cftype *cft, struct seq_file *sf)
 {
 	struct cpuacct *ca = css_ca(css);
 	int cpu;
@@ -196,7 +196,7 @@ static int cpuacct_stats_show(struct cgroup_subsys_state *css,
 		val += kcpustat->cpustat[CPUTIME_NICE];
 	}
 	val = cputime64_to_clock_t(val);
-	cb->fill(cb, cpuacct_stat_desc[CPUACCT_STAT_USER], val);
+	seq_printf(sf, "%s %lld\n", cpuacct_stat_desc[CPUACCT_STAT_USER], val);
 
 	val = 0;
 	for_each_online_cpu(cpu) {
@@ -207,7 +207,7 @@ static int cpuacct_stats_show(struct cgroup_subsys_state *css,
 	}
 
 	val = cputime64_to_clock_t(val);
-	cb->fill(cb, cpuacct_stat_desc[CPUACCT_STAT_SYSTEM], val);
+	seq_printf(sf, "%s %lld\n", cpuacct_stat_desc[CPUACCT_STAT_SYSTEM], val);
 
 	return 0;
 }
@@ -224,7 +224,7 @@ static struct cftype files[] = {
 	},
 	{
 		.name = "stat",
-		.read_map = cpuacct_stats_show,
+		.read_seq_string = cpuacct_stats_show,
 	},
 	{ }	/* terminate */
 };
-- 
1.8.4.2


* [PATCH 02/12] cpuset: convert away from cftype->read()
       [not found] ` <1385595759-17656-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2013-11-27 23:42   ` [PATCH 01/12] cgroup, sched: convert away from cftype->read_map() Tejun Heo
@ 2013-11-27 23:42   ` Tejun Heo
  2013-11-27 23:42   ` [PATCH 03/12] memcg: convert away from cftype->read() and ->read_map() Tejun Heo
                     ` (11 subsequent siblings)
  13 siblings, 0 replies; 38+ messages in thread
From: Tejun Heo @ 2013-11-27 23:42 UTC (permalink / raw)
  To: lizefan-hv44wF8Li93QT0dZR+AlfA
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, peterz-wEGCiKHe2LqWVfeAwA7xHQ,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-AlSwsSmVLrQ,
	bsingharora-Re5JQEeQqe8AvxtiuMwx3w,
	kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A,
	nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	arozansk-H+wXaHxf7aLQT0dZR+AlfA, Tejun Heo

In preparation of conversion to kernfs, cgroup file handling is being
consolidated so that it can be easily mapped to the seq_file based
interface of kernfs.

All users of cftype->read() can be easily served, usually better, by
seq_file and other methods.  Rename cpuset_common_file_read() to
cpuset_common_read_seq_string() and convert it to use the
read_seq_string() interface instead.  This not only simplifies the
code but also makes it more versatile.  Before, the file couldn't
produce output longer than PAGE_SIZE.  After the conversion, seq_file
automatically grows the buffer until the output fits.

This patch doesn't make any visible behavior changes except for being
able to handle output larger than PAGE_SIZE.

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 kernel/cpuset.c | 71 +++++++++++++++++++--------------------------------------
 1 file changed, 24 insertions(+), 47 deletions(-)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 4772034..032929f 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1731,66 +1731,43 @@ out_unlock:
  * used, list of ranges of sequential numbers, is variable length,
  * and since these maps can change value dynamically, one could read
  * gibberish by doing partial reads while a list was changing.
- * A single large read to a buffer that crosses a page boundary is
- * ok, because the result being copied to user land is not recomputed
- * across a page fault.
  */
-
-static size_t cpuset_sprintf_cpulist(char *page, struct cpuset *cs)
-{
-	size_t count;
-
-	mutex_lock(&callback_mutex);
-	count = cpulist_scnprintf(page, PAGE_SIZE, cs->cpus_allowed);
-	mutex_unlock(&callback_mutex);
-
-	return count;
-}
-
-static size_t cpuset_sprintf_memlist(char *page, struct cpuset *cs)
-{
-	size_t count;
-
-	mutex_lock(&callback_mutex);
-	count = nodelist_scnprintf(page, PAGE_SIZE, cs->mems_allowed);
-	mutex_unlock(&callback_mutex);
-
-	return count;
-}
-
-static ssize_t cpuset_common_file_read(struct cgroup_subsys_state *css,
-				       struct cftype *cft, struct file *file,
-				       char __user *buf, size_t nbytes,
-				       loff_t *ppos)
+static int cpuset_common_read_seq_string(struct cgroup_subsys_state *css,
+					 struct cftype *cft,
+					 struct seq_file *sf)
 {
 	struct cpuset *cs = css_cs(css);
 	cpuset_filetype_t type = cft->private;
-	char *page;
-	ssize_t retval = 0;
-	char *s;
+	ssize_t count;
+	char *buf, *s;
+	int ret = 0;
 
-	if (!(page = (char *)__get_free_page(GFP_TEMPORARY)))
-		return -ENOMEM;
+	count = seq_get_buf(sf, &buf);
+	s = buf;
 
-	s = page;
+	mutex_lock(&callback_mutex);
 
 	switch (type) {
 	case FILE_CPULIST:
-		s += cpuset_sprintf_cpulist(s, cs);
+		s += cpulist_scnprintf(s, count, cs->cpus_allowed);
 		break;
 	case FILE_MEMLIST:
-		s += cpuset_sprintf_memlist(s, cs);
+		s += nodelist_scnprintf(s, count, cs->mems_allowed);
 		break;
 	default:
-		retval = -EINVAL;
-		goto out;
+		ret = -EINVAL;
+		goto out_unlock;
 	}
-	*s++ = '\n';
 
-	retval = simple_read_from_buffer(buf, nbytes, ppos, page, s - page);
-out:
-	free_page((unsigned long)page);
-	return retval;
+	if (s < buf + count - 1) {
+		*s++ = '\n';
+		seq_commit(sf, s - buf);
+	} else {
+		seq_commit(sf, -1);
+	}
+out_unlock:
+	mutex_unlock(&callback_mutex);
+	return ret;
 }
 
 static u64 cpuset_read_u64(struct cgroup_subsys_state *css, struct cftype *cft)
@@ -1847,7 +1824,7 @@ static s64 cpuset_read_s64(struct cgroup_subsys_state *css, struct cftype *cft)
 static struct cftype files[] = {
 	{
 		.name = "cpus",
-		.read = cpuset_common_file_read,
+		.read_seq_string = cpuset_common_read_seq_string,
 		.write_string = cpuset_write_resmask,
 		.max_write_len = (100U + 6 * NR_CPUS),
 		.private = FILE_CPULIST,
@@ -1855,7 +1832,7 @@ static struct cftype files[] = {
 
 	{
 		.name = "mems",
-		.read = cpuset_common_file_read,
+		.read_seq_string = cpuset_common_read_seq_string,
 		.write_string = cpuset_write_resmask,
 		.max_write_len = (100U + 6 * MAX_NUMNODES),
 		.private = FILE_MEMLIST,
-- 
1.8.4.2


* [PATCH 03/12] memcg: convert away from cftype->read() and ->read_map()
       [not found] ` <1385595759-17656-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2013-11-27 23:42   ` [PATCH 01/12] cgroup, sched: convert away from cftype->read_map() Tejun Heo
  2013-11-27 23:42   ` [PATCH 02/12] cpuset: convert away from cftype->read() Tejun Heo
@ 2013-11-27 23:42   ` Tejun Heo
       [not found]     ` <1385595759-17656-4-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2013-11-27 23:42   ` [PATCH 04/12] netprio_cgroup: convert away from cftype->read_map() Tejun Heo
                     ` (10 subsequent siblings)
  13 siblings, 1 reply; 38+ messages in thread
From: Tejun Heo @ 2013-11-27 23:42 UTC (permalink / raw)
  To: lizefan-hv44wF8Li93QT0dZR+AlfA
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, peterz-wEGCiKHe2LqWVfeAwA7xHQ,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-AlSwsSmVLrQ,
	bsingharora-Re5JQEeQqe8AvxtiuMwx3w,
	kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A,
	nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	arozansk-H+wXaHxf7aLQT0dZR+AlfA, Tejun Heo

In preparation of conversion to kernfs, cgroup file handling is being
consolidated so that it can be easily mapped to the seq_file based
interface of kernfs.

cftype->read_map() doesn't add any value and is being replaced with
->read_seq_string(), and all users of cftype->read() can be easily
served, usually better, by seq_file and other methods.

Update mem_cgroup_read() to return u64 instead of printing itself and
rename it to mem_cgroup_read_u64(), and update
mem_cgroup_oom_control_read() to use ->read_seq_string() instead of
->read_map().

This patch doesn't make any visible behavior changes.

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
Cc: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org>
Cc: Balbir Singh <bsingharora-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
---
 mm/memcontrol.c | 49 +++++++++++++++++++++----------------------------
 1 file changed, 21 insertions(+), 28 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 7aa0d40..f149521 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5150,14 +5150,12 @@ static inline u64 mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
 	return val << PAGE_SHIFT;
 }
 
-static ssize_t mem_cgroup_read(struct cgroup_subsys_state *css,
-			       struct cftype *cft, struct file *file,
-			       char __user *buf, size_t nbytes, loff_t *ppos)
+static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css,
+				   struct cftype *cft)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
-	char str[64];
 	u64 val;
-	int name, len;
+	int name;
 	enum res_type type;
 
 	type = MEMFILE_TYPE(cft->private);
@@ -5183,8 +5181,7 @@ static ssize_t mem_cgroup_read(struct cgroup_subsys_state *css,
 		BUG();
 	}
 
-	len = scnprintf(str, sizeof(str), "%llu\n", (unsigned long long)val);
-	return simple_read_from_buffer(buf, nbytes, ppos, str, len);
+	return val;
 }
 
 static int memcg_update_kmem_limit(struct cgroup_subsys_state *css, u64 val)
@@ -5911,16 +5908,12 @@ static void mem_cgroup_oom_unregister_event(struct mem_cgroup *memcg,
 }
 
 static int mem_cgroup_oom_control_read(struct cgroup_subsys_state *css,
-	struct cftype *cft,  struct cgroup_map_cb *cb)
+				       struct cftype *cft, struct seq_file *sf)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 
-	cb->fill(cb, "oom_kill_disable", memcg->oom_kill_disable);
-
-	if (atomic_read(&memcg->under_oom))
-		cb->fill(cb, "under_oom", 1);
-	else
-		cb->fill(cb, "under_oom", 0);
+	seq_printf(sf, "oom_kill_disable %d\n", memcg->oom_kill_disable);
+	seq_printf(sf, "under_oom %d\n", (bool)atomic_read(&memcg->under_oom));
 	return 0;
 }
 
@@ -6239,31 +6232,31 @@ static struct cftype mem_cgroup_files[] = {
 	{
 		.name = "usage_in_bytes",
 		.private = MEMFILE_PRIVATE(_MEM, RES_USAGE),
-		.read = mem_cgroup_read,
+		.read_u64 = mem_cgroup_read_u64,
 	},
 	{
 		.name = "max_usage_in_bytes",
 		.private = MEMFILE_PRIVATE(_MEM, RES_MAX_USAGE),
 		.trigger = mem_cgroup_reset,
-		.read = mem_cgroup_read,
+		.read_u64 = mem_cgroup_read_u64,
 	},
 	{
 		.name = "limit_in_bytes",
 		.private = MEMFILE_PRIVATE(_MEM, RES_LIMIT),
 		.write_string = mem_cgroup_write,
-		.read = mem_cgroup_read,
+		.read_u64 = mem_cgroup_read_u64,
 	},
 	{
 		.name = "soft_limit_in_bytes",
 		.private = MEMFILE_PRIVATE(_MEM, RES_SOFT_LIMIT),
 		.write_string = mem_cgroup_write,
-		.read = mem_cgroup_read,
+		.read_u64 = mem_cgroup_read_u64,
 	},
 	{
 		.name = "failcnt",
 		.private = MEMFILE_PRIVATE(_MEM, RES_FAILCNT),
 		.trigger = mem_cgroup_reset,
-		.read = mem_cgroup_read,
+		.read_u64 = mem_cgroup_read_u64,
 	},
 	{
 		.name = "stat",
@@ -6297,7 +6290,7 @@ static struct cftype mem_cgroup_files[] = {
 	},
 	{
 		.name = "oom_control",
-		.read_map = mem_cgroup_oom_control_read,
+		.read_seq_string = mem_cgroup_oom_control_read,
 		.write_u64 = mem_cgroup_oom_control_write,
 		.private = MEMFILE_PRIVATE(_OOM_TYPE, OOM_CONTROL),
 	},
@@ -6315,24 +6308,24 @@ static struct cftype mem_cgroup_files[] = {
 		.name = "kmem.limit_in_bytes",
 		.private = MEMFILE_PRIVATE(_KMEM, RES_LIMIT),
 		.write_string = mem_cgroup_write,
-		.read = mem_cgroup_read,
+		.read_u64 = mem_cgroup_read_u64,
 	},
 	{
 		.name = "kmem.usage_in_bytes",
 		.private = MEMFILE_PRIVATE(_KMEM, RES_USAGE),
-		.read = mem_cgroup_read,
+		.read_u64 = mem_cgroup_read_u64,
 	},
 	{
 		.name = "kmem.failcnt",
 		.private = MEMFILE_PRIVATE(_KMEM, RES_FAILCNT),
 		.trigger = mem_cgroup_reset,
-		.read = mem_cgroup_read,
+		.read_u64 = mem_cgroup_read_u64,
 	},
 	{
 		.name = "kmem.max_usage_in_bytes",
 		.private = MEMFILE_PRIVATE(_KMEM, RES_MAX_USAGE),
 		.trigger = mem_cgroup_reset,
-		.read = mem_cgroup_read,
+		.read_u64 = mem_cgroup_read_u64,
 	},
 #ifdef CONFIG_SLABINFO
 	{
@@ -6349,25 +6342,25 @@ static struct cftype memsw_cgroup_files[] = {
 	{
 		.name = "memsw.usage_in_bytes",
 		.private = MEMFILE_PRIVATE(_MEMSWAP, RES_USAGE),
-		.read = mem_cgroup_read,
+		.read_u64 = mem_cgroup_read_u64,
 	},
 	{
 		.name = "memsw.max_usage_in_bytes",
 		.private = MEMFILE_PRIVATE(_MEMSWAP, RES_MAX_USAGE),
 		.trigger = mem_cgroup_reset,
-		.read = mem_cgroup_read,
+		.read_u64 = mem_cgroup_read_u64,
 	},
 	{
 		.name = "memsw.limit_in_bytes",
 		.private = MEMFILE_PRIVATE(_MEMSWAP, RES_LIMIT),
 		.write_string = mem_cgroup_write,
-		.read = mem_cgroup_read,
+		.read_u64 = mem_cgroup_read_u64,
 	},
 	{
 		.name = "memsw.failcnt",
 		.private = MEMFILE_PRIVATE(_MEMSWAP, RES_FAILCNT),
 		.trigger = mem_cgroup_reset,
-		.read = mem_cgroup_read,
+		.read_u64 = mem_cgroup_read_u64,
 	},
 	{ },	/* terminate */
 };
-- 
1.8.4.2


* [PATCH 04/12] netprio_cgroup: convert away from cftype->read_map()
       [not found] ` <1385595759-17656-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (2 preceding siblings ...)
  2013-11-27 23:42   ` [PATCH 03/12] memcg: convert away from cftype->read() and ->read_map() Tejun Heo
@ 2013-11-27 23:42   ` Tejun Heo
       [not found]     ` <1385595759-17656-5-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2013-11-27 23:42   ` [PATCH 05/12] hugetlb_cgroup: convert away from cftype->read() Tejun Heo
                     ` (9 subsequent siblings)
  13 siblings, 1 reply; 38+ messages in thread
From: Tejun Heo @ 2013-11-27 23:42 UTC (permalink / raw)
  To: lizefan-hv44wF8Li93QT0dZR+AlfA
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, peterz-wEGCiKHe2LqWVfeAwA7xHQ,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-AlSwsSmVLrQ,
	bsingharora-Re5JQEeQqe8AvxtiuMwx3w,
	kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A,
	nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	arozansk-H+wXaHxf7aLQT0dZR+AlfA, Tejun Heo

In preparation of conversion to kernfs, cgroup file handling is being
consolidated so that it can be easily mapped to the seq_file based
interface of kernfs.

cftype->read_map() doesn't add any value and is being replaced with
->read_seq_string().  Update read_priomap() to use ->read_seq_string()
instead.

This patch doesn't make any visible behavior changes.

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Neil Horman <nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>
Cc: Daniel Wagner <daniel.wagner-98C5kh4wR6ohFhg+JK9F0w@public.gmane.org>
---
 net/core/netprio_cgroup.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index 9b7cf6c..498710d 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -174,13 +174,13 @@ static u64 read_prioidx(struct cgroup_subsys_state *css, struct cftype *cft)
 }
 
 static int read_priomap(struct cgroup_subsys_state *css, struct cftype *cft,
-			struct cgroup_map_cb *cb)
+			struct seq_file *sf)
 {
 	struct net_device *dev;
 
 	rcu_read_lock();
 	for_each_netdev_rcu(&init_net, dev)
-		cb->fill(cb, dev->name, netprio_prio(css, dev));
+		seq_printf(sf, "%s %u\n", dev->name, netprio_prio(css, dev));
 	rcu_read_unlock();
 	return 0;
 }
@@ -238,7 +238,7 @@ static struct cftype ss_files[] = {
 	},
 	{
 		.name = "ifpriomap",
-		.read_map = read_priomap,
+		.read_seq_string = read_priomap,
 		.write_string = write_priomap,
 	},
 	{ }	/* terminate */
-- 
1.8.4.2


* [PATCH 05/12] hugetlb_cgroup: convert away from cftype->read()
       [not found] ` <1385595759-17656-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (3 preceding siblings ...)
  2013-11-27 23:42   ` [PATCH 04/12] netprio_cgroup: convert away from cftype->read_map() Tejun Heo
@ 2013-11-27 23:42   ` Tejun Heo
       [not found]     ` <1385595759-17656-6-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2013-11-27 23:42   ` [PATCH 06/12] cgroup: remove cftype->read(), ->read_map() and ->write() Tejun Heo
                     ` (8 subsequent siblings)
  13 siblings, 1 reply; 38+ messages in thread
From: Tejun Heo @ 2013-11-27 23:42 UTC (permalink / raw)
  To: lizefan-hv44wF8Li93QT0dZR+AlfA
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, peterz-wEGCiKHe2LqWVfeAwA7xHQ,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-AlSwsSmVLrQ,
	bsingharora-Re5JQEeQqe8AvxtiuMwx3w,
	kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A,
	nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	arozansk-H+wXaHxf7aLQT0dZR+AlfA, Tejun Heo, Aneesh Kumar K.V

In preparation of conversion to kernfs, cgroup file handling is being
consolidated so that it can be easily mapped to the seq_file based
interface of kernfs.

All users of cftype->read() can be easily served, usually better, by
seq_file and other methods.  Update hugetlb_cgroup_read() to return
u64 instead of printing itself and rename it to
hugetlb_cgroup_read_u64().

This patch doesn't make any visible behavior changes.

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Aneesh Kumar K.V <aneesh.kumar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Cc: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
Cc: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org>
---
 mm/hugetlb_cgroup.c | 22 ++++++++--------------
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
index bda8e44..d747a84 100644
--- a/mm/hugetlb_cgroup.c
+++ b/mm/hugetlb_cgroup.c
@@ -242,22 +242,16 @@ void hugetlb_cgroup_uncharge_cgroup(int idx, unsigned long nr_pages,
 	return;
 }
 
-static ssize_t hugetlb_cgroup_read(struct cgroup_subsys_state *css,
-				   struct cftype *cft, struct file *file,
-				   char __user *buf, size_t nbytes,
-				   loff_t *ppos)
+static u64 hugetlb_cgroup_read_u64(struct cgroup_subsys_state *css,
+				   struct cftype *cft)
 {
-	u64 val;
-	char str[64];
-	int idx, name, len;
+	int idx, name;
 	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_css(css);
 
 	idx = MEMFILE_IDX(cft->private);
 	name = MEMFILE_ATTR(cft->private);
 
-	val = res_counter_read_u64(&h_cg->hugepage[idx], name);
-	len = scnprintf(str, sizeof(str), "%llu\n", (unsigned long long)val);
-	return simple_read_from_buffer(buf, nbytes, ppos, str, len);
+	return res_counter_read_u64(&h_cg->hugepage[idx], name);
 }
 
 static int hugetlb_cgroup_write(struct cgroup_subsys_state *css,
@@ -337,28 +331,28 @@ static void __init __hugetlb_cgroup_file_init(int idx)
 	cft = &h->cgroup_files[0];
 	snprintf(cft->name, MAX_CFTYPE_NAME, "%s.limit_in_bytes", buf);
 	cft->private = MEMFILE_PRIVATE(idx, RES_LIMIT);
-	cft->read = hugetlb_cgroup_read;
+	cft->read_u64 = hugetlb_cgroup_read_u64;
 	cft->write_string = hugetlb_cgroup_write;
 
 	/* Add the usage file */
 	cft = &h->cgroup_files[1];
 	snprintf(cft->name, MAX_CFTYPE_NAME, "%s.usage_in_bytes", buf);
 	cft->private = MEMFILE_PRIVATE(idx, RES_USAGE);
-	cft->read = hugetlb_cgroup_read;
+	cft->read_u64 = hugetlb_cgroup_read_u64;
 
 	/* Add the MAX usage file */
 	cft = &h->cgroup_files[2];
 	snprintf(cft->name, MAX_CFTYPE_NAME, "%s.max_usage_in_bytes", buf);
 	cft->private = MEMFILE_PRIVATE(idx, RES_MAX_USAGE);
 	cft->trigger = hugetlb_cgroup_reset;
-	cft->read = hugetlb_cgroup_read;
+	cft->read_u64 = hugetlb_cgroup_read_u64;
 
 	/* Add the failcntfile */
 	cft = &h->cgroup_files[3];
 	snprintf(cft->name, MAX_CFTYPE_NAME, "%s.failcnt", buf);
 	cft->private  = MEMFILE_PRIVATE(idx, RES_FAILCNT);
 	cft->trigger  = hugetlb_cgroup_reset;
-	cft->read = hugetlb_cgroup_read;
+	cft->read_u64 = hugetlb_cgroup_read_u64;
 
 	/* NULL terminate the last cft */
 	cft = &h->cgroup_files[4];
-- 
1.8.4.2


* [PATCH 06/12] cgroup: remove cftype->read(), ->read_map() and ->write()
       [not found] ` <1385595759-17656-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (4 preceding siblings ...)
  2013-11-27 23:42   ` [PATCH 05/12] hugetlb_cgroup: convert away from cftype->read() Tejun Heo
@ 2013-11-27 23:42   ` Tejun Heo
  2013-11-27 23:42   ` [PATCH 07/12] cgroup: unify cgroup_write_X64() and cgroup_write_string() Tejun Heo
                     ` (7 subsequent siblings)
  13 siblings, 0 replies; 38+ messages in thread
From: Tejun Heo @ 2013-11-27 23:42 UTC (permalink / raw)
  To: lizefan-hv44wF8Li93QT0dZR+AlfA
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, peterz-wEGCiKHe2LqWVfeAwA7xHQ,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-AlSwsSmVLrQ,
	bsingharora-Re5JQEeQqe8AvxtiuMwx3w,
	kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A,
	nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	arozansk-H+wXaHxf7aLQT0dZR+AlfA, Tejun Heo

In preparation of conversion to kernfs, cgroup file handling is being
consolidated so that it can be easily mapped to the seq_file based
interface of kernfs.

After recent updates, ->read() and ->read_map() no longer have any
users and ->write() never had any.  Remove them.

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 include/linux/cgroup.h | 25 -------------------------
 kernel/cgroup.c        | 26 ++++----------------------
 2 files changed, 4 insertions(+), 47 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 50d8cc3..53e11da 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -387,16 +387,6 @@ struct css_set {
 };
 
 /*
- * cgroup_map_cb is an abstract callback API for reporting map-valued
- * control files
- */
-
-struct cgroup_map_cb {
-	int (*fill)(struct cgroup_map_cb *cb, const char *key, u64 value);
-	void *state;
-};
-
-/*
  * struct cftype: handler definitions for cgroup control files
  *
  * When reading/writing to a file:
@@ -444,9 +434,6 @@ struct cftype {
 	struct cgroup_subsys *ss;
 
 	int (*open)(struct inode *inode, struct file *file);
-	ssize_t (*read)(struct cgroup_subsys_state *css, struct cftype *cft,
-			struct file *file,
-			char __user *buf, size_t nbytes, loff_t *ppos);
 	/*
 	 * read_u64() is a shortcut for the common case of returning a
 	 * single integer. Use it in place of read()
@@ -457,24 +444,12 @@ struct cftype {
 	 */
 	s64 (*read_s64)(struct cgroup_subsys_state *css, struct cftype *cft);
 	/*
-	 * read_map() is used for defining a map of key/value
-	 * pairs. It should call cb->fill(cb, key, value) for each
-	 * entry. The key/value pairs (and their ordering) should not
-	 * change between reboots.
-	 */
-	int (*read_map)(struct cgroup_subsys_state *css, struct cftype *cft,
-			struct cgroup_map_cb *cb);
-	/*
 	 * read_seq_string() is used for outputting a simple sequence
 	 * using seqfile.
 	 */
 	int (*read_seq_string)(struct cgroup_subsys_state *css,
 			       struct cftype *cft, struct seq_file *m);
 
-	ssize_t (*write)(struct cgroup_subsys_state *css, struct cftype *cft,
-			 struct file *file,
-			 const char __user *buf, size_t nbytes, loff_t *ppos);
-
 	/*
 	 * write_u64() is a shortcut for the common case of accepting
 	 * a single integer (as parsed by simple_strtoull) from
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index c36d906..3592515 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2324,8 +2324,6 @@ static ssize_t cgroup_file_write(struct file *file, const char __user *buf,
 	struct cftype *cft = __d_cft(file->f_dentry);
 	struct cgroup_subsys_state *css = cfe->css;
 
-	if (cft->write)
-		return cft->write(css, cft, file, buf, nbytes, ppos);
 	if (cft->write_u64 || cft->write_s64)
 		return cgroup_write_X64(css, cft, file, buf, nbytes, ppos);
 	if (cft->write_string)
@@ -2366,8 +2364,6 @@ static ssize_t cgroup_file_read(struct file *file, char __user *buf,
 	struct cftype *cft = __d_cft(file->f_dentry);
 	struct cgroup_subsys_state *css = cfe->css;
 
-	if (cft->read)
-		return cft->read(css, cft, file, buf, nbytes, ppos);
 	if (cft->read_u64)
 		return cgroup_read_u64(css, cft, file, buf, nbytes, ppos);
 	if (cft->read_s64)
@@ -2380,25 +2376,12 @@ static ssize_t cgroup_file_read(struct file *file, char __user *buf,
  * supports string->u64 maps, but can be extended in future.
  */
 
-static int cgroup_map_add(struct cgroup_map_cb *cb, const char *key, u64 value)
-{
-	struct seq_file *sf = cb->state;
-	return seq_printf(sf, "%s %llu\n", key, (unsigned long long)value);
-}
-
 static int cgroup_seqfile_show(struct seq_file *m, void *arg)
 {
 	struct cfent *cfe = m->private;
 	struct cftype *cft = cfe->type;
 	struct cgroup_subsys_state *css = cfe->css;
 
-	if (cft->read_map) {
-		struct cgroup_map_cb cb = {
-			.fill = cgroup_map_add,
-			.state = m,
-		};
-		return cft->read_map(css, cft, &cb);
-	}
 	return cft->read_seq_string(css, cft, m);
 }
 
@@ -2444,7 +2427,7 @@ static int cgroup_file_open(struct inode *inode, struct file *file)
 	WARN_ON_ONCE(cfe->css && cfe->css != css);
 	cfe->css = css;
 
-	if (cft->read_map || cft->read_seq_string) {
+	if (cft->read_seq_string) {
 		file->f_op = &cgroup_seqfile_operations;
 		err = single_open(file, cgroup_seqfile_show, cfe);
 	} else if (cft->open) {
@@ -2658,12 +2641,11 @@ static umode_t cgroup_file_mode(const struct cftype *cft)
 	if (cft->mode)
 		return cft->mode;
 
-	if (cft->read || cft->read_u64 || cft->read_s64 ||
-	    cft->read_map || cft->read_seq_string)
+	if (cft->read_u64 || cft->read_s64 || cft->read_seq_string)
 		mode |= S_IRUGO;
 
-	if (cft->write || cft->write_u64 || cft->write_s64 ||
-	    cft->write_string || cft->trigger)
+	if (cft->write_u64 || cft->write_s64 || cft->write_string ||
+	    cft->trigger)
 		mode |= S_IWUSR;
 
 	return mode;
-- 
1.8.4.2

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 07/12] cgroup: unify cgroup_write_X64() and cgroup_write_string()
       [not found] ` <1385595759-17656-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (5 preceding siblings ...)
  2013-11-27 23:42   ` [PATCH 06/12] cgroup: remove cftype->read(), ->read_map() and ->write() Tejun Heo
@ 2013-11-27 23:42   ` Tejun Heo
       [not found]     ` <1385595759-17656-8-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2013-11-27 23:42   ` [PATCH 08/12] cgroup: unify read path so that seq_file is always used Tejun Heo
                     ` (6 subsequent siblings)
  13 siblings, 1 reply; 38+ messages in thread
From: Tejun Heo @ 2013-11-27 23:42 UTC (permalink / raw)
  To: lizefan-hv44wF8Li93QT0dZR+AlfA
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, peterz-wEGCiKHe2LqWVfeAwA7xHQ,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-AlSwsSmVLrQ,
	bsingharora-Re5JQEeQqe8AvxtiuMwx3w,
	kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A,
	nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	arozansk-H+wXaHxf7aLQT0dZR+AlfA, Tejun Heo

cgroup_write_X64() and cgroup_write_string() both implement about the
same buffering logic.  Unify the two into cgroup_file_write(), which
always allocates a dynamic buffer for simplicity and uses kstrto*()
instead of simple_strto*().

This patch doesn't make any visible behavior changes except for
possibly different error values from kstrto*().

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 kernel/cgroup.c | 112 ++++++++++++++++++--------------------------------------
 1 file changed, 36 insertions(+), 76 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 3592515..23abe8e 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2249,90 +2249,50 @@ static int cgroup_sane_behavior_show(struct cgroup_subsys_state *css,
 /* A buffer size big enough for numbers or short strings */
 #define CGROUP_LOCAL_BUFFER_SIZE 64
 
-static ssize_t cgroup_write_X64(struct cgroup_subsys_state *css,
-				struct cftype *cft, struct file *file,
-				const char __user *userbuf, size_t nbytes,
-				loff_t *unused_ppos)
-{
-	char buffer[CGROUP_LOCAL_BUFFER_SIZE];
-	int retval = 0;
-	char *end;
-
-	if (!nbytes)
-		return -EINVAL;
-	if (nbytes >= sizeof(buffer))
-		return -E2BIG;
-	if (copy_from_user(buffer, userbuf, nbytes))
-		return -EFAULT;
-
-	buffer[nbytes] = 0;     /* nul-terminate */
-	if (cft->write_u64) {
-		u64 val = simple_strtoull(strstrip(buffer), &end, 0);
-		if (*end)
-			return -EINVAL;
-		retval = cft->write_u64(css, cft, val);
-	} else {
-		s64 val = simple_strtoll(strstrip(buffer), &end, 0);
-		if (*end)
-			return -EINVAL;
-		retval = cft->write_s64(css, cft, val);
-	}
-	if (!retval)
-		retval = nbytes;
-	return retval;
-}
-
-static ssize_t cgroup_write_string(struct cgroup_subsys_state *css,
-				   struct cftype *cft, struct file *file,
-				   const char __user *userbuf, size_t nbytes,
-				   loff_t *unused_ppos)
+static ssize_t cgroup_file_write(struct file *file, const char __user *userbuf,
+				 size_t nbytes, loff_t *ppos)
 {
-	char local_buffer[CGROUP_LOCAL_BUFFER_SIZE];
-	int retval = 0;
-	size_t max_bytes = cft->max_write_len;
-	char *buffer = local_buffer;
+	struct cfent *cfe = __d_cfe(file->f_dentry);
+	struct cftype *cft = __d_cft(file->f_dentry);
+	struct cgroup_subsys_state *css = cfe->css;
+	size_t max_bytes = cft->max_write_len ?: CGROUP_LOCAL_BUFFER_SIZE - 1;
+	char *buf;
+	int ret;
 
-	if (!max_bytes)
-		max_bytes = sizeof(local_buffer) - 1;
 	if (nbytes >= max_bytes)
 		return -E2BIG;
-	/* Allocate a dynamic buffer if we need one */
-	if (nbytes >= sizeof(local_buffer)) {
-		buffer = kmalloc(nbytes + 1, GFP_KERNEL);
-		if (buffer == NULL)
-			return -ENOMEM;
-	}
-	if (nbytes && copy_from_user(buffer, userbuf, nbytes)) {
-		retval = -EFAULT;
-		goto out;
-	}
 
-	buffer[nbytes] = 0;     /* nul-terminate */
-	retval = cft->write_string(css, cft, strstrip(buffer));
-	if (!retval)
-		retval = nbytes;
-out:
-	if (buffer != local_buffer)
-		kfree(buffer);
-	return retval;
-}
+	buf = kmalloc(nbytes + 1, GFP_KERNEL);
+	if (!buf)
+		return -ENOMEM;
 
-static ssize_t cgroup_file_write(struct file *file, const char __user *buf,
-				 size_t nbytes, loff_t *ppos)
-{
-	struct cfent *cfe = __d_cfe(file->f_dentry);
-	struct cftype *cft = __d_cft(file->f_dentry);
-	struct cgroup_subsys_state *css = cfe->css;
+	if (copy_from_user(buf, userbuf, nbytes)) {
+		ret = -EFAULT;
+		goto out_free;
+	}
 
-	if (cft->write_u64 || cft->write_s64)
-		return cgroup_write_X64(css, cft, file, buf, nbytes, ppos);
-	if (cft->write_string)
-		return cgroup_write_string(css, cft, file, buf, nbytes, ppos);
-	if (cft->trigger) {
-		int ret = cft->trigger(css, (unsigned int)cft->private);
-		return ret ? ret : nbytes;
+	buf[nbytes] = '\0';
+
+	if (cft->write_string) {
+		ret = cft->write_string(css, cft, strstrip(buf));
+	} else if (cft->write_u64) {
+		unsigned long long v;
+		ret = kstrtoull(buf, 0, &v);
+		if (!ret)
+			ret = cft->write_u64(css, cft, v);
+	} else if (cft->write_s64) {
+		long long v;
+		ret = kstrtoll(buf, 0, &v);
+		if (!ret)
+			ret = cft->write_s64(css, cft, v);
+	} else if (cft->trigger) {
+		ret = cft->trigger(css, (unsigned int)cft->private);
+	} else {
+		ret = -EINVAL;
 	}
-	return -EINVAL;
+out_free:
+	kfree(buf);
+	return ret ?: nbytes;
 }
 
 static ssize_t cgroup_read_u64(struct cgroup_subsys_state *css,
-- 
1.8.4.2

* [PATCH 08/12] cgroup: unify read path so that seq_file is always used
       [not found] ` <1385595759-17656-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (6 preceding siblings ...)
  2013-11-27 23:42   ` [PATCH 07/12] cgroup: unify cgroup_write_X64() and cgroup_write_string() Tejun Heo
@ 2013-11-27 23:42   ` Tejun Heo
  2013-11-27 23:42   ` [PATCH 09/12] cgroup: generalize cgroup_pidlist_open_file Tejun Heo
                     ` (5 subsequent siblings)
  13 siblings, 0 replies; 38+ messages in thread
From: Tejun Heo @ 2013-11-27 23:42 UTC (permalink / raw)
  To: lizefan-hv44wF8Li93QT0dZR+AlfA
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, peterz-wEGCiKHe2LqWVfeAwA7xHQ,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-AlSwsSmVLrQ,
	bsingharora-Re5JQEeQqe8AvxtiuMwx3w,
	kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A,
	nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	arozansk-H+wXaHxf7aLQT0dZR+AlfA, Tejun Heo

With the recent removal of cftype->read() and ->read_map(), only
three operations remain: ->read_u64(), ->read_s64() and
->read_seq_string().  Currently, the first two are handled directly
while the last is handled through seq_file.

It is trivial to serve the first two through the seq_file path too.
This patch restructures the read path so that all operations are
served through cgroup_seqfile_show().  This makes all cgroup files
seq_files - single_open/release() are now used by default,
cgroup_seqfile_operations is dropped, and cgroup_file_operations uses
seq_read() for reads.

This simplifies the code and makes the read path easy to convert to
use kernfs.

Note that, while cgroup_file_operations uses seq_read() for reads, it
still uses generic_file_llseek() for seeking instead of seq_lseek().
This differs from cgroup_seqfile_operations but shouldn't break
anything, and it brings the seeking behavior in line with kernfs.

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 kernel/cgroup.c | 68 +++++++++++++--------------------------------------------
 1 file changed, 15 insertions(+), 53 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 23abe8e..7048e7f 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2295,42 +2295,6 @@ out_free:
 	return ret ?: nbytes;
 }
 
-static ssize_t cgroup_read_u64(struct cgroup_subsys_state *css,
-			       struct cftype *cft, struct file *file,
-			       char __user *buf, size_t nbytes, loff_t *ppos)
-{
-	char tmp[CGROUP_LOCAL_BUFFER_SIZE];
-	u64 val = cft->read_u64(css, cft);
-	int len = sprintf(tmp, "%llu\n", (unsigned long long) val);
-
-	return simple_read_from_buffer(buf, nbytes, ppos, tmp, len);
-}
-
-static ssize_t cgroup_read_s64(struct cgroup_subsys_state *css,
-			       struct cftype *cft, struct file *file,
-			       char __user *buf, size_t nbytes, loff_t *ppos)
-{
-	char tmp[CGROUP_LOCAL_BUFFER_SIZE];
-	s64 val = cft->read_s64(css, cft);
-	int len = sprintf(tmp, "%lld\n", (long long) val);
-
-	return simple_read_from_buffer(buf, nbytes, ppos, tmp, len);
-}
-
-static ssize_t cgroup_file_read(struct file *file, char __user *buf,
-				size_t nbytes, loff_t *ppos)
-{
-	struct cfent *cfe = __d_cfe(file->f_dentry);
-	struct cftype *cft = __d_cft(file->f_dentry);
-	struct cgroup_subsys_state *css = cfe->css;
-
-	if (cft->read_u64)
-		return cgroup_read_u64(css, cft, file, buf, nbytes, ppos);
-	if (cft->read_s64)
-		return cgroup_read_s64(css, cft, file, buf, nbytes, ppos);
-	return -EINVAL;
-}
-
 /*
  * seqfile ops/methods for returning structured data. Currently just
  * supports string->u64 maps, but can be extended in future.
@@ -2342,15 +2306,17 @@ static int cgroup_seqfile_show(struct seq_file *m, void *arg)
 	struct cftype *cft = cfe->type;
 	struct cgroup_subsys_state *css = cfe->css;
 
-	return cft->read_seq_string(css, cft, m);
-}
+	if (cft->read_seq_string)
+		return cft->read_seq_string(css, cft, m);
 
-static const struct file_operations cgroup_seqfile_operations = {
-	.read = seq_read,
-	.write = cgroup_file_write,
-	.llseek = seq_lseek,
-	.release = cgroup_file_release,
-};
+	if (cft->read_u64)
+		seq_printf(m, "%llu\n", cft->read_u64(css, cft));
+	else if (cft->read_s64)
+		seq_printf(m, "%lld\n", cft->read_s64(css, cft));
+	else
+		return -EINVAL;
+	return 0;
+}
 
 static int cgroup_file_open(struct inode *inode, struct file *file)
 {
@@ -2387,12 +2353,10 @@ static int cgroup_file_open(struct inode *inode, struct file *file)
 	WARN_ON_ONCE(cfe->css && cfe->css != css);
 	cfe->css = css;
 
-	if (cft->read_seq_string) {
-		file->f_op = &cgroup_seqfile_operations;
-		err = single_open(file, cgroup_seqfile_show, cfe);
-	} else if (cft->open) {
+	if (cft->open)
 		err = cft->open(inode, file);
-	}
+	else
+		err = single_open(file, cgroup_seqfile_show, cfe);
 
 	if (css->ss && err)
 		css_put(css);
@@ -2406,9 +2370,7 @@ static int cgroup_file_release(struct inode *inode, struct file *file)
 
 	if (css->ss)
 		css_put(css);
-	if (file->f_op == &cgroup_seqfile_operations)
-		single_release(inode, file);
-	return 0;
+	return single_release(inode, file);
 }
 
 /*
@@ -2519,7 +2481,7 @@ static ssize_t cgroup_listxattr(struct dentry *dentry, char *buf, size_t size)
 }
 
 static const struct file_operations cgroup_file_operations = {
-	.read = cgroup_file_read,
+	.read = seq_read,
 	.write = cgroup_file_write,
 	.llseek = generic_file_llseek,
 	.open = cgroup_file_open,
-- 
1.8.4.2

* [PATCH 09/12] cgroup: generalize cgroup_pidlist_open_file
       [not found] ` <1385595759-17656-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (7 preceding siblings ...)
  2013-11-27 23:42   ` [PATCH 08/12] cgroup: unify read path so that seq_file is always used Tejun Heo
@ 2013-11-27 23:42   ` Tejun Heo
  2013-11-27 23:42   ` [PATCH 10/12] cgroup: attach cgroup_open_file to all cgroup files Tejun Heo
                     ` (4 subsequent siblings)
  13 siblings, 0 replies; 38+ messages in thread
From: Tejun Heo @ 2013-11-27 23:42 UTC (permalink / raw)
  To: lizefan-hv44wF8Li93QT0dZR+AlfA
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, peterz-wEGCiKHe2LqWVfeAwA7xHQ,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-AlSwsSmVLrQ,
	bsingharora-Re5JQEeQqe8AvxtiuMwx3w,
	kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A,
	nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	arozansk-H+wXaHxf7aLQT0dZR+AlfA, Tejun Heo

In preparation for the conversion to kernfs, cgroup file handling is
updated so that it can be easily mapped to kernfs.  This patch renames
cgroup_pidlist_open_file to cgroup_open_file and updates it so that it
only contains a field identifying the specific file, ->cfe, and an
opaque ->priv pointer.  When cgroup is converted to kernfs, this will
be replaced by kernfs_open_file, which contains about the same
information.

As whether the file is "cgroup.procs" or "tasks" should now be
determined from cgroup_open_file->cfe, the cftype->private for the two
files now carries the file type, and cgroup_pidlist_start() reads the
type through cfe->type->private.  This makes the distinction between
cgroup_tasks_open() and cgroup_procs_open() unnecessary;
cgroup_pidlist_open() is now used directly as the open method.

This patch doesn't make any behavior changes.

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 kernel/cgroup.c | 66 ++++++++++++++++++++++++++-------------------------------
 1 file changed, 30 insertions(+), 36 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 7048e7f..bdb8e8d 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -3369,10 +3369,9 @@ struct cgroup_pidlist {
 };
 
 /* seq_file->private points to the following */
-struct cgroup_pidlist_open_file {
-	enum cgroup_filetype		type;
-	struct cgroup			*cgrp;
-	struct cgroup_pidlist		*pidlist;
+struct cgroup_open_file {
+	struct cfent			*cfe;
+	void				*priv;
 };
 
 /*
@@ -3689,33 +3688,35 @@ static void *cgroup_pidlist_start(struct seq_file *s, loff_t *pos)
 	 * after a seek to the start). Use a binary-search to find the
 	 * next pid to display, if any
 	 */
-	struct cgroup_pidlist_open_file *of = s->private;
-	struct cgroup *cgrp = of->cgrp;
+	struct cgroup_open_file *of = s->private;
+	struct cgroup *cgrp = of->cfe->css->cgroup;
 	struct cgroup_pidlist *l;
+	enum cgroup_filetype type = of->cfe->type->private;
 	int index = 0, pid = *pos;
 	int *iter, ret;
 
 	mutex_lock(&cgrp->pidlist_mutex);
 
 	/*
-	 * !NULL @of->pidlist indicates that this isn't the first start()
+	 * !NULL @of->priv indicates that this isn't the first start()
 	 * after open.  If the matching pidlist is around, we can use that.
-	 * Look for it.  Note that @of->pidlist can't be used directly.  It
+	 * Look for it.  Note that @of->priv can't be used directly.  It
 	 * could already have been destroyed.
 	 */
-	if (of->pidlist)
-		of->pidlist = cgroup_pidlist_find(cgrp, of->type);
+	if (of->priv)
+		of->priv = cgroup_pidlist_find(cgrp, type);
 
 	/*
 	 * Either this is the first start() after open or the matching
 	 * pidlist has been destroyed inbetween.  Create a new one.
 	 */
-	if (!of->pidlist) {
-		ret = pidlist_array_load(of->cgrp, of->type, &of->pidlist);
+	if (!of->priv) {
+		ret = pidlist_array_load(cgrp, type,
+					 (struct cgroup_pidlist **)&of->priv);
 		if (ret)
 			return ERR_PTR(ret);
 	}
-	l = of->pidlist;
+	l = of->priv;
 
 	if (pid) {
 		int end = l->length;
@@ -3742,19 +3743,19 @@ static void *cgroup_pidlist_start(struct seq_file *s, loff_t *pos)
 
 static void cgroup_pidlist_stop(struct seq_file *s, void *v)
 {
-	struct cgroup_pidlist_open_file *of = s->private;
+	struct cgroup_open_file *of = s->private;
+	struct cgroup_pidlist *l = of->priv;
 
-	if (of->pidlist)
-		mod_delayed_work(cgroup_pidlist_destroy_wq,
-				 &of->pidlist->destroy_dwork,
+	if (l)
+		mod_delayed_work(cgroup_pidlist_destroy_wq, &l->destroy_dwork,
 				 CGROUP_PIDLIST_DESTROY_DELAY);
-	mutex_unlock(&of->cgrp->pidlist_mutex);
+	mutex_unlock(&of->cfe->css->cgroup->pidlist_mutex);
 }
 
 static void *cgroup_pidlist_next(struct seq_file *s, void *v, loff_t *pos)
 {
-	struct cgroup_pidlist_open_file *of = s->private;
-	struct cgroup_pidlist *l = of->pidlist;
+	struct cgroup_open_file *of = s->private;
+	struct cgroup_pidlist *l = of->priv;
 	pid_t *p = v;
 	pid_t *end = l->list + l->length;
 	/*
@@ -3765,7 +3766,7 @@ static void *cgroup_pidlist_next(struct seq_file *s, void *v, loff_t *pos)
 	if (p >= end) {
 		return NULL;
 	} else {
-		*pos = cgroup_pid_fry(of->cgrp, *p);
+		*pos = cgroup_pid_fry(of->cfe->css->cgroup, *p);
 		return p;
 	}
 }
@@ -3799,10 +3800,10 @@ static const struct file_operations cgroup_pidlist_operations = {
  * in the cgroup.
  */
 /* helper function for the two below it */
-static int cgroup_pidlist_open(struct file *file, enum cgroup_filetype type)
+static int cgroup_pidlist_open(struct inode *unused, struct file *file)
 {
-	struct cgroup *cgrp = __d_cgrp(file->f_dentry->d_parent);
-	struct cgroup_pidlist_open_file *of;
+	struct cfent *cfe = __d_cfe(file->f_dentry);
+	struct cgroup_open_file *of;
 	int retval;
 
 	/* configure file information */
@@ -3814,18 +3815,9 @@ static int cgroup_pidlist_open(struct file *file, enum cgroup_filetype type)
 		return retval;
 
 	of = ((struct seq_file *)file->private_data)->private;
-	of->type = type;
-	of->cgrp = cgrp;
+	of->cfe = cfe;
 	return 0;
 }
-static int cgroup_tasks_open(struct inode *unused, struct file *file)
-{
-	return cgroup_pidlist_open(file, CGROUP_FILE_TASKS);
-}
-static int cgroup_procs_open(struct inode *unused, struct file *file)
-{
-	return cgroup_pidlist_open(file, CGROUP_FILE_PROCS);
-}
 
 static u64 cgroup_read_notify_on_release(struct cgroup_subsys_state *css,
 					 struct cftype *cft)
@@ -3880,7 +3872,8 @@ static int cgroup_clone_children_write(struct cgroup_subsys_state *css,
 static struct cftype cgroup_base_files[] = {
 	{
 		.name = "cgroup.procs",
-		.open = cgroup_procs_open,
+		.open = cgroup_pidlist_open,
+		.private = CGROUP_FILE_PROCS,
 		.write_u64 = cgroup_procs_write,
 		.mode = S_IRUGO | S_IWUSR,
 	},
@@ -3904,7 +3897,8 @@ static struct cftype cgroup_base_files[] = {
 	{
 		.name = "tasks",
 		.flags = CFTYPE_INSANE,		/* use "procs" instead */
-		.open = cgroup_tasks_open,
+		.open = cgroup_pidlist_open,
+		.private = CGROUP_FILE_TASKS,
 		.write_u64 = cgroup_tasks_write,
 		.mode = S_IRUGO | S_IWUSR,
 	},
-- 
1.8.4.2

* [PATCH 10/12] cgroup: attach cgroup_open_file to all cgroup files
       [not found] ` <1385595759-17656-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (8 preceding siblings ...)
  2013-11-27 23:42   ` [PATCH 09/12] cgroup: generalize cgroup_pidlist_open_file Tejun Heo
@ 2013-11-27 23:42   ` Tejun Heo
       [not found]     ` <1385595759-17656-11-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2013-11-27 23:42   ` [PATCH 11/12] cgroup: replace cftype->read_seq_string() with cftype->seq_show() Tejun Heo
                     ` (3 subsequent siblings)
  13 siblings, 1 reply; 38+ messages in thread
From: Tejun Heo @ 2013-11-27 23:42 UTC (permalink / raw)
  To: lizefan-hv44wF8Li93QT0dZR+AlfA
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, peterz-wEGCiKHe2LqWVfeAwA7xHQ,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-AlSwsSmVLrQ,
	bsingharora-Re5JQEeQqe8AvxtiuMwx3w,
	kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A,
	nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	arozansk-H+wXaHxf7aLQT0dZR+AlfA, Tejun Heo

In preparation for the conversion to kernfs, cgroup file handling is
updated so that it can be easily mapped to kernfs.  This patch
attaches cgroup_open_file, which used to be attached only to pidlist
files, to all cgroup files, introduces the seq_css() and seq_cft()
accessors to determine the cgroup_subsys_state and cftype associated
with a given cgroup seq_file, and exports them as a public interface.

This doesn't cause any behavior changes but unifies cgroup file
handling across different file types and will help converting them to
the kernfs seq_show() interface.

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 include/linux/cgroup.h | 33 +++++++++++++++++++++++++++++++++
 kernel/cgroup.c        | 47 +++++++++++++++++------------------------------
 2 files changed, 50 insertions(+), 30 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 53e11da..c3d698a 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -21,6 +21,7 @@
 #include <linux/xattr.h>
 #include <linux/fs.h>
 #include <linux/percpu-refcount.h>
+#include <linux/seq_file.h>
 
 #ifdef CONFIG_CGROUPS
 
@@ -490,6 +491,26 @@ struct cftype_set {
 };
 
 /*
+ * cgroupfs file entry, pointed to from leaf dentry->d_fsdata.  Don't
+ * access directly.
+ */
+struct cfent {
+	struct list_head		node;
+	struct dentry			*dentry;
+	struct cftype			*type;
+	struct cgroup_subsys_state	*css;
+
+	/* file xattrs */
+	struct simple_xattrs		xattrs;
+};
+
+/* seq_file->private points to the following, only ->priv is public */
+struct cgroup_open_file {
+	struct cfent			*cfe;
+	void				*priv;
+};
+
+/*
  * See the comment above CGRP_ROOT_SANE_BEHAVIOR for details.  This
  * function can be called as long as @cgrp is accessible.
  */
@@ -504,6 +525,18 @@ static inline const char *cgroup_name(const struct cgroup *cgrp)
 	return rcu_dereference(cgrp->name)->name;
 }
 
+static inline struct cgroup_subsys_state *seq_css(struct seq_file *seq)
+{
+	struct cgroup_open_file *of = seq->private;
+	return of->cfe->css;
+}
+
+static inline struct cftype *seq_cft(struct seq_file *seq)
+{
+	struct cgroup_open_file *of = seq->private;
+	return of->cfe->type;
+}
+
 int cgroup_add_cftypes(struct cgroup_subsys *ss, struct cftype *cfts);
 int cgroup_rm_cftypes(struct cftype *cfts);
 
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index bdb8e8d..1189de7 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -41,7 +41,6 @@
 #include <linux/rcupdate.h>
 #include <linux/sched.h>
 #include <linux/backing-dev.h>
-#include <linux/seq_file.h>
 #include <linux/slab.h>
 #include <linux/magic.h>
 #include <linux/spinlock.h>
@@ -130,19 +129,6 @@ static struct cgroupfs_root cgroup_dummy_root;
 /* dummy_top is a shorthand for the dummy hierarchy's top cgroup */
 static struct cgroup * const cgroup_dummy_top = &cgroup_dummy_root.top_cgroup;
 
-/*
- * cgroupfs file entry, pointed to from leaf dentry->d_fsdata.
- */
-struct cfent {
-	struct list_head		node;
-	struct dentry			*dentry;
-	struct cftype			*type;
-	struct cgroup_subsys_state	*css;
-
-	/* file xattrs */
-	struct simple_xattrs		xattrs;
-};
-
 /* The list of hierarchy roots */
 
 static LIST_HEAD(cgroup_roots);
@@ -2302,9 +2288,8 @@ out_free:
 
 static int cgroup_seqfile_show(struct seq_file *m, void *arg)
 {
-	struct cfent *cfe = m->private;
-	struct cftype *cft = cfe->type;
-	struct cgroup_subsys_state *css = cfe->css;
+	struct cftype *cft = seq_cft(m);
+	struct cgroup_subsys_state *css = seq_css(m);
 
 	if (cft->read_seq_string)
 		return cft->read_seq_string(css, cft, m);
@@ -2353,10 +2338,18 @@ static int cgroup_file_open(struct inode *inode, struct file *file)
 	WARN_ON_ONCE(cfe->css && cfe->css != css);
 	cfe->css = css;
 
-	if (cft->open)
+	if (cft->open) {
 		err = cft->open(inode, file);
-	else
-		err = single_open(file, cgroup_seqfile_show, cfe);
+	} else {
+		err = single_open_size(file, cgroup_seqfile_show, cfe,
+				       sizeof(struct cgroup_open_file));
+		if (!err) {
+			struct seq_file *sf = file->private_data;
+			struct cgroup_open_file *of = sf->private;
+
+			of->cfe = cfe;
+		}
+	}
 
 	if (css->ss && err)
 		css_put(css);
@@ -3368,12 +3361,6 @@ struct cgroup_pidlist {
 	struct delayed_work destroy_dwork;
 };
 
-/* seq_file->private points to the following */
-struct cgroup_open_file {
-	struct cfent			*cfe;
-	void				*priv;
-};
-
 /*
  * The following two functions "fix" the issue where there are more pids
  * than kmalloc will give memory for; in such cases, we use vmalloc/vfree.
@@ -3689,9 +3676,9 @@ static void *cgroup_pidlist_start(struct seq_file *s, loff_t *pos)
 	 * next pid to display, if any
 	 */
 	struct cgroup_open_file *of = s->private;
-	struct cgroup *cgrp = of->cfe->css->cgroup;
+	struct cgroup *cgrp = seq_css(s)->cgroup;
 	struct cgroup_pidlist *l;
-	enum cgroup_filetype type = of->cfe->type->private;
+	enum cgroup_filetype type = seq_cft(s)->private;
 	int index = 0, pid = *pos;
 	int *iter, ret;
 
@@ -3749,7 +3736,7 @@ static void cgroup_pidlist_stop(struct seq_file *s, void *v)
 	if (l)
 		mod_delayed_work(cgroup_pidlist_destroy_wq, &l->destroy_dwork,
 				 CGROUP_PIDLIST_DESTROY_DELAY);
-	mutex_unlock(&of->cfe->css->cgroup->pidlist_mutex);
+	mutex_unlock(&seq_css(s)->cgroup->pidlist_mutex);
 }
 
 static void *cgroup_pidlist_next(struct seq_file *s, void *v, loff_t *pos)
@@ -3766,7 +3753,7 @@ static void *cgroup_pidlist_next(struct seq_file *s, void *v, loff_t *pos)
 	if (p >= end) {
 		return NULL;
 	} else {
-		*pos = cgroup_pid_fry(of->cfe->css->cgroup, *p);
+		*pos = cgroup_pid_fry(seq_css(s)->cgroup, *p);
 		return p;
 	}
 }
-- 
1.8.4.2

* [PATCH 11/12] cgroup: replace cftype->read_seq_string() with cftype->seq_show()
       [not found] ` <1385595759-17656-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (9 preceding siblings ...)
  2013-11-27 23:42   ` [PATCH 10/12] cgroup: attach cgroup_open_file to all cgroup files Tejun Heo
@ 2013-11-27 23:42   ` Tejun Heo
       [not found]     ` <1385595759-17656-12-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2013-11-27 23:42   ` [PATCH 12/12] cgroup: unify pidlist and other file handling Tejun Heo
                     ` (2 subsequent siblings)
  13 siblings, 1 reply; 38+ messages in thread
From: Tejun Heo @ 2013-11-27 23:42 UTC (permalink / raw)
  To: lizefan-hv44wF8Li93QT0dZR+AlfA
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, peterz-wEGCiKHe2LqWVfeAwA7xHQ,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-AlSwsSmVLrQ,
	bsingharora-Re5JQEeQqe8AvxtiuMwx3w,
	kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A,
	nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	arozansk-H+wXaHxf7aLQT0dZR+AlfA, Tejun Heo, Jens Axboe

In preparation for the conversion to kernfs, cgroup file handling is
updated so that it can be easily mapped to kernfs.  This patch
replaces cftype->read_seq_string() with cftype->seq_show(), which is
not limited to single_open() operation and will map directly to the
kernfs seq_file interface.

The conversions are mechanical.  As ->seq_show() doesn't have @css and
@cft, functions which make use of them are converted to use seq_css()
and seq_cft() respectively.  In several cases, e.g. when the function
has seq_string in its name, the name is updated to fit the new method
better.

This patch does not introduce any behavior changes.

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>
Cc: Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Ingo Molnar <mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Cc: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
Cc: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
Cc: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org>
Cc: Balbir Singh <bsingharora-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
Cc: Neil Horman <nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>
Cc: Daniel Wagner <daniel.wagner-98C5kh4wR6ohFhg+JK9F0w@public.gmane.org>
Cc: Aristeu Rozanski <arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
 block/blk-throttle.c      |  35 ++++++-------
 block/cfq-iosched.c       | 131 ++++++++++++++++++++--------------------------
 include/linux/cgroup.h    |   9 ++--
 kernel/cgroup.c           |  34 ++++++------
 kernel/cgroup_freezer.c   |   7 ++-
 kernel/cpuset.c           |  12 ++---
 kernel/sched/core.c       |   7 ++-
 kernel/sched/cpuacct.c    |  14 +++--
 mm/memcontrol.c           |  28 +++++-----
 net/core/netprio_cgroup.c |   8 +--
 security/device_cgroup.c  |   7 ++-
 11 files changed, 128 insertions(+), 164 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 0653404..a760857 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -1303,13 +1303,10 @@ static u64 tg_prfill_cpu_rwstat(struct seq_file *sf,
 	return __blkg_prfill_rwstat(sf, pd, &rwstat);
 }
 
-static int tg_print_cpu_rwstat(struct cgroup_subsys_state *css,
-			       struct cftype *cft, struct seq_file *sf)
+static int tg_print_cpu_rwstat(struct seq_file *sf, void *v)
 {
-	struct blkcg *blkcg = css_to_blkcg(css);
-
-	blkcg_print_blkgs(sf, blkcg, tg_prfill_cpu_rwstat, &blkcg_policy_throtl,
-			  cft->private, true);
+	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)), tg_prfill_cpu_rwstat,
+			  &blkcg_policy_throtl, seq_cft(sf)->private, true);
 	return 0;
 }
 
@@ -1335,19 +1332,17 @@ static u64 tg_prfill_conf_uint(struct seq_file *sf, struct blkg_policy_data *pd,
 	return __blkg_prfill_u64(sf, pd, v);
 }
 
-static int tg_print_conf_u64(struct cgroup_subsys_state *css,
-			     struct cftype *cft, struct seq_file *sf)
+static int tg_print_conf_u64(struct seq_file *sf, void *v)
 {
-	blkcg_print_blkgs(sf, css_to_blkcg(css), tg_prfill_conf_u64,
-			  &blkcg_policy_throtl, cft->private, false);
+	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)), tg_prfill_conf_u64,
+			  &blkcg_policy_throtl, seq_cft(sf)->private, false);
 	return 0;
 }
 
-static int tg_print_conf_uint(struct cgroup_subsys_state *css,
-			      struct cftype *cft, struct seq_file *sf)
+static int tg_print_conf_uint(struct seq_file *sf, void *v)
 {
-	blkcg_print_blkgs(sf, css_to_blkcg(css), tg_prfill_conf_uint,
-			  &blkcg_policy_throtl, cft->private, false);
+	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)), tg_prfill_conf_uint,
+			  &blkcg_policy_throtl, seq_cft(sf)->private, false);
 	return 0;
 }
 
@@ -1428,40 +1423,40 @@ static struct cftype throtl_files[] = {
 	{
 		.name = "throttle.read_bps_device",
 		.private = offsetof(struct throtl_grp, bps[READ]),
-		.read_seq_string = tg_print_conf_u64,
+		.seq_show = tg_print_conf_u64,
 		.write_string = tg_set_conf_u64,
 		.max_write_len = 256,
 	},
 	{
 		.name = "throttle.write_bps_device",
 		.private = offsetof(struct throtl_grp, bps[WRITE]),
-		.read_seq_string = tg_print_conf_u64,
+		.seq_show = tg_print_conf_u64,
 		.write_string = tg_set_conf_u64,
 		.max_write_len = 256,
 	},
 	{
 		.name = "throttle.read_iops_device",
 		.private = offsetof(struct throtl_grp, iops[READ]),
-		.read_seq_string = tg_print_conf_uint,
+		.seq_show = tg_print_conf_uint,
 		.write_string = tg_set_conf_uint,
 		.max_write_len = 256,
 	},
 	{
 		.name = "throttle.write_iops_device",
 		.private = offsetof(struct throtl_grp, iops[WRITE]),
-		.read_seq_string = tg_print_conf_uint,
+		.seq_show = tg_print_conf_uint,
 		.write_string = tg_set_conf_uint,
 		.max_write_len = 256,
 	},
 	{
 		.name = "throttle.io_service_bytes",
 		.private = offsetof(struct tg_stats_cpu, service_bytes),
-		.read_seq_string = tg_print_cpu_rwstat,
+		.seq_show = tg_print_cpu_rwstat,
 	},
 	{
 		.name = "throttle.io_serviced",
 		.private = offsetof(struct tg_stats_cpu, serviced),
-		.read_seq_string = tg_print_cpu_rwstat,
+		.seq_show = tg_print_cpu_rwstat,
 	},
 	{ }	/* terminate */
 };
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 4d5cec1..744833b 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -1632,11 +1632,11 @@ static u64 cfqg_prfill_weight_device(struct seq_file *sf,
 	return __blkg_prfill_u64(sf, pd, cfqg->dev_weight);
 }
 
-static int cfqg_print_weight_device(struct cgroup_subsys_state *css,
-				    struct cftype *cft, struct seq_file *sf)
+static int cfqg_print_weight_device(struct seq_file *sf, void *v)
 {
-	blkcg_print_blkgs(sf, css_to_blkcg(css), cfqg_prfill_weight_device,
-			  &blkcg_policy_cfq, 0, false);
+	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
+			  cfqg_prfill_weight_device, &blkcg_policy_cfq,
+			  0, false);
 	return 0;
 }
 
@@ -1650,26 +1650,23 @@ static u64 cfqg_prfill_leaf_weight_device(struct seq_file *sf,
 	return __blkg_prfill_u64(sf, pd, cfqg->dev_leaf_weight);
 }
 
-static int cfqg_print_leaf_weight_device(struct cgroup_subsys_state *css,
-					 struct cftype *cft,
-					 struct seq_file *sf)
+static int cfqg_print_leaf_weight_device(struct seq_file *sf, void *v)
 {
-	blkcg_print_blkgs(sf, css_to_blkcg(css), cfqg_prfill_leaf_weight_device,
-			  &blkcg_policy_cfq, 0, false);
+	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
+			  cfqg_prfill_leaf_weight_device, &blkcg_policy_cfq,
+			  0, false);
 	return 0;
 }
 
-static int cfq_print_weight(struct cgroup_subsys_state *css, struct cftype *cft,
-			    struct seq_file *sf)
+static int cfq_print_weight(struct seq_file *sf, void *v)
 {
-	seq_printf(sf, "%u\n", css_to_blkcg(css)->cfq_weight);
+	seq_printf(sf, "%u\n", css_to_blkcg(seq_css(sf))->cfq_weight);
 	return 0;
 }
 
-static int cfq_print_leaf_weight(struct cgroup_subsys_state *css,
-				 struct cftype *cft, struct seq_file *sf)
+static int cfq_print_leaf_weight(struct seq_file *sf, void *v)
 {
-	seq_printf(sf, "%u\n", css_to_blkcg(css)->cfq_leaf_weight);
+	seq_printf(sf, "%u\n", css_to_blkcg(seq_css(sf))->cfq_leaf_weight);
 	return 0;
 }
 
@@ -1762,23 +1759,17 @@ static int cfq_set_leaf_weight(struct cgroup_subsys_state *css,
 	return __cfq_set_weight(css, cft, val, true);
 }
 
-static int cfqg_print_stat(struct cgroup_subsys_state *css, struct cftype *cft,
-			   struct seq_file *sf)
+static int cfqg_print_stat(struct seq_file *sf, void *v)
 {
-	struct blkcg *blkcg = css_to_blkcg(css);
-
-	blkcg_print_blkgs(sf, blkcg, blkg_prfill_stat, &blkcg_policy_cfq,
-			  cft->private, false);
+	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)), blkg_prfill_stat,
+			  &blkcg_policy_cfq, seq_cft(sf)->private, false);
 	return 0;
 }
 
-static int cfqg_print_rwstat(struct cgroup_subsys_state *css,
-			     struct cftype *cft, struct seq_file *sf)
+static int cfqg_print_rwstat(struct seq_file *sf, void *v)
 {
-	struct blkcg *blkcg = css_to_blkcg(css);
-
-	blkcg_print_blkgs(sf, blkcg, blkg_prfill_rwstat, &blkcg_policy_cfq,
-			  cft->private, true);
+	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)), blkg_prfill_rwstat,
+			  &blkcg_policy_cfq, seq_cft(sf)->private, true);
 	return 0;
 }
 
@@ -1798,23 +1789,19 @@ static u64 cfqg_prfill_rwstat_recursive(struct seq_file *sf,
 	return __blkg_prfill_rwstat(sf, pd, &sum);
 }
 
-static int cfqg_print_stat_recursive(struct cgroup_subsys_state *css,
-				     struct cftype *cft, struct seq_file *sf)
+static int cfqg_print_stat_recursive(struct seq_file *sf, void *v)
 {
-	struct blkcg *blkcg = css_to_blkcg(css);
-
-	blkcg_print_blkgs(sf, blkcg, cfqg_prfill_stat_recursive,
-			  &blkcg_policy_cfq, cft->private, false);
+	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
+			  cfqg_prfill_stat_recursive, &blkcg_policy_cfq,
+			  seq_cft(sf)->private, false);
 	return 0;
 }
 
-static int cfqg_print_rwstat_recursive(struct cgroup_subsys_state *css,
-				       struct cftype *cft, struct seq_file *sf)
+static int cfqg_print_rwstat_recursive(struct seq_file *sf, void *v)
 {
-	struct blkcg *blkcg = css_to_blkcg(css);
-
-	blkcg_print_blkgs(sf, blkcg, cfqg_prfill_rwstat_recursive,
-			  &blkcg_policy_cfq, cft->private, true);
+	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
+			  cfqg_prfill_rwstat_recursive, &blkcg_policy_cfq,
+			  seq_cft(sf)->private, true);
 	return 0;
 }
 
@@ -1835,13 +1822,11 @@ static u64 cfqg_prfill_avg_queue_size(struct seq_file *sf,
 }
 
 /* print avg_queue_size */
-static int cfqg_print_avg_queue_size(struct cgroup_subsys_state *css,
-				     struct cftype *cft, struct seq_file *sf)
+static int cfqg_print_avg_queue_size(struct seq_file *sf, void *v)
 {
-	struct blkcg *blkcg = css_to_blkcg(css);
-
-	blkcg_print_blkgs(sf, blkcg, cfqg_prfill_avg_queue_size,
-			  &blkcg_policy_cfq, 0, false);
+	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
+			  cfqg_prfill_avg_queue_size, &blkcg_policy_cfq,
+			  0, false);
 	return 0;
 }
 #endif	/* CONFIG_DEBUG_BLK_CGROUP */
@@ -1851,14 +1836,14 @@ static struct cftype cfq_blkcg_files[] = {
 	{
 		.name = "weight_device",
 		.flags = CFTYPE_ONLY_ON_ROOT,
-		.read_seq_string = cfqg_print_leaf_weight_device,
+		.seq_show = cfqg_print_leaf_weight_device,
 		.write_string = cfqg_set_leaf_weight_device,
 		.max_write_len = 256,
 	},
 	{
 		.name = "weight",
 		.flags = CFTYPE_ONLY_ON_ROOT,
-		.read_seq_string = cfq_print_leaf_weight,
+		.seq_show = cfq_print_leaf_weight,
 		.write_u64 = cfq_set_leaf_weight,
 	},
 
@@ -1866,26 +1851,26 @@ static struct cftype cfq_blkcg_files[] = {
 	{
 		.name = "weight_device",
 		.flags = CFTYPE_NOT_ON_ROOT,
-		.read_seq_string = cfqg_print_weight_device,
+		.seq_show = cfqg_print_weight_device,
 		.write_string = cfqg_set_weight_device,
 		.max_write_len = 256,
 	},
 	{
 		.name = "weight",
 		.flags = CFTYPE_NOT_ON_ROOT,
-		.read_seq_string = cfq_print_weight,
+		.seq_show = cfq_print_weight,
 		.write_u64 = cfq_set_weight,
 	},
 
 	{
 		.name = "leaf_weight_device",
-		.read_seq_string = cfqg_print_leaf_weight_device,
+		.seq_show = cfqg_print_leaf_weight_device,
 		.write_string = cfqg_set_leaf_weight_device,
 		.max_write_len = 256,
 	},
 	{
 		.name = "leaf_weight",
-		.read_seq_string = cfq_print_leaf_weight,
+		.seq_show = cfq_print_leaf_weight,
 		.write_u64 = cfq_set_leaf_weight,
 	},
 
@@ -1893,114 +1878,114 @@ static struct cftype cfq_blkcg_files[] = {
 	{
 		.name = "time",
 		.private = offsetof(struct cfq_group, stats.time),
-		.read_seq_string = cfqg_print_stat,
+		.seq_show = cfqg_print_stat,
 	},
 	{
 		.name = "sectors",
 		.private = offsetof(struct cfq_group, stats.sectors),
-		.read_seq_string = cfqg_print_stat,
+		.seq_show = cfqg_print_stat,
 	},
 	{
 		.name = "io_service_bytes",
 		.private = offsetof(struct cfq_group, stats.service_bytes),
-		.read_seq_string = cfqg_print_rwstat,
+		.seq_show = cfqg_print_rwstat,
 	},
 	{
 		.name = "io_serviced",
 		.private = offsetof(struct cfq_group, stats.serviced),
-		.read_seq_string = cfqg_print_rwstat,
+		.seq_show = cfqg_print_rwstat,
 	},
 	{
 		.name = "io_service_time",
 		.private = offsetof(struct cfq_group, stats.service_time),
-		.read_seq_string = cfqg_print_rwstat,
+		.seq_show = cfqg_print_rwstat,
 	},
 	{
 		.name = "io_wait_time",
 		.private = offsetof(struct cfq_group, stats.wait_time),
-		.read_seq_string = cfqg_print_rwstat,
+		.seq_show = cfqg_print_rwstat,
 	},
 	{
 		.name = "io_merged",
 		.private = offsetof(struct cfq_group, stats.merged),
-		.read_seq_string = cfqg_print_rwstat,
+		.seq_show = cfqg_print_rwstat,
 	},
 	{
 		.name = "io_queued",
 		.private = offsetof(struct cfq_group, stats.queued),
-		.read_seq_string = cfqg_print_rwstat,
+		.seq_show = cfqg_print_rwstat,
 	},
 
 	/* the same statictics which cover the cfqg and its descendants */
 	{
 		.name = "time_recursive",
 		.private = offsetof(struct cfq_group, stats.time),
-		.read_seq_string = cfqg_print_stat_recursive,
+		.seq_show = cfqg_print_stat_recursive,
 	},
 	{
 		.name = "sectors_recursive",
 		.private = offsetof(struct cfq_group, stats.sectors),
-		.read_seq_string = cfqg_print_stat_recursive,
+		.seq_show = cfqg_print_stat_recursive,
 	},
 	{
 		.name = "io_service_bytes_recursive",
 		.private = offsetof(struct cfq_group, stats.service_bytes),
-		.read_seq_string = cfqg_print_rwstat_recursive,
+		.seq_show = cfqg_print_rwstat_recursive,
 	},
 	{
 		.name = "io_serviced_recursive",
 		.private = offsetof(struct cfq_group, stats.serviced),
-		.read_seq_string = cfqg_print_rwstat_recursive,
+		.seq_show = cfqg_print_rwstat_recursive,
 	},
 	{
 		.name = "io_service_time_recursive",
 		.private = offsetof(struct cfq_group, stats.service_time),
-		.read_seq_string = cfqg_print_rwstat_recursive,
+		.seq_show = cfqg_print_rwstat_recursive,
 	},
 	{
 		.name = "io_wait_time_recursive",
 		.private = offsetof(struct cfq_group, stats.wait_time),
-		.read_seq_string = cfqg_print_rwstat_recursive,
+		.seq_show = cfqg_print_rwstat_recursive,
 	},
 	{
 		.name = "io_merged_recursive",
 		.private = offsetof(struct cfq_group, stats.merged),
-		.read_seq_string = cfqg_print_rwstat_recursive,
+		.seq_show = cfqg_print_rwstat_recursive,
 	},
 	{
 		.name = "io_queued_recursive",
 		.private = offsetof(struct cfq_group, stats.queued),
-		.read_seq_string = cfqg_print_rwstat_recursive,
+		.seq_show = cfqg_print_rwstat_recursive,
 	},
 #ifdef CONFIG_DEBUG_BLK_CGROUP
 	{
 		.name = "avg_queue_size",
-		.read_seq_string = cfqg_print_avg_queue_size,
+		.seq_show = cfqg_print_avg_queue_size,
 	},
 	{
 		.name = "group_wait_time",
 		.private = offsetof(struct cfq_group, stats.group_wait_time),
-		.read_seq_string = cfqg_print_stat,
+		.seq_show = cfqg_print_stat,
 	},
 	{
 		.name = "idle_time",
 		.private = offsetof(struct cfq_group, stats.idle_time),
-		.read_seq_string = cfqg_print_stat,
+		.seq_show = cfqg_print_stat,
 	},
 	{
 		.name = "empty_time",
 		.private = offsetof(struct cfq_group, stats.empty_time),
-		.read_seq_string = cfqg_print_stat,
+		.seq_show = cfqg_print_stat,
 	},
 	{
 		.name = "dequeue",
 		.private = offsetof(struct cfq_group, stats.dequeue),
-		.read_seq_string = cfqg_print_stat,
+		.seq_show = cfqg_print_stat,
 	},
 	{
 		.name = "unaccounted_time",
 		.private = offsetof(struct cfq_group, stats.unaccounted_time),
-		.read_seq_string = cfqg_print_stat,
+		.seq_show = cfqg_print_stat,
 	},
 #endif	/* CONFIG_DEBUG_BLK_CGROUP */
 	{ }	/* terminate */
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index c3d698a..b32a0f8 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -444,12 +444,9 @@ struct cftype {
 	 * read_s64() is a signed version of read_u64()
 	 */
 	s64 (*read_s64)(struct cgroup_subsys_state *css, struct cftype *cft);
-	/*
-	 * read_seq_string() is used for outputting a simple sequence
-	 * using seqfile.
-	 */
-	int (*read_seq_string)(struct cgroup_subsys_state *css,
-			       struct cftype *cft, struct seq_file *m);
+
+	/* generic seq_file read interface */
+	int (*seq_show)(struct seq_file *sf, void *v);
 
 	/*
 	 * write_u64() is a shortcut for the common case of accepting
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 1189de7..39fdf75 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2212,10 +2212,9 @@ static int cgroup_release_agent_write(struct cgroup_subsys_state *css,
 	return 0;
 }
 
-static int cgroup_release_agent_show(struct cgroup_subsys_state *css,
-				     struct cftype *cft, struct seq_file *seq)
+static int cgroup_release_agent_show(struct seq_file *seq, void *v)
 {
-	struct cgroup *cgrp = css->cgroup;
+	struct cgroup *cgrp = seq_css(seq)->cgroup;
 
 	if (!cgroup_lock_live_group(cgrp))
 		return -ENODEV;
@@ -2225,10 +2224,11 @@ static int cgroup_release_agent_show(struct cgroup_subsys_state *css,
 	return 0;
 }
 
-static int cgroup_sane_behavior_show(struct cgroup_subsys_state *css,
-				     struct cftype *cft, struct seq_file *seq)
+static int cgroup_sane_behavior_show(struct seq_file *seq, void *v)
 {
-	seq_printf(seq, "%d\n", cgroup_sane_behavior(css->cgroup));
+	struct cgroup *cgrp = seq_css(seq)->cgroup;
+
+	seq_printf(seq, "%d\n", cgroup_sane_behavior(cgrp));
 	return 0;
 }
 
@@ -2291,8 +2291,8 @@ static int cgroup_seqfile_show(struct seq_file *m, void *arg)
 	struct cftype *cft = seq_cft(m);
 	struct cgroup_subsys_state *css = seq_css(m);
 
-	if (cft->read_seq_string)
-		return cft->read_seq_string(css, cft, m);
+	if (cft->seq_show)
+		return cft->seq_show(m, arg);
 
 	if (cft->read_u64)
 		seq_printf(m, "%llu\n", cft->read_u64(css, cft));
@@ -2556,7 +2556,7 @@ static umode_t cgroup_file_mode(const struct cftype *cft)
 	if (cft->mode)
 		return cft->mode;
 
-	if (cft->read_u64 || cft->read_s64 || cft->read_seq_string)
+	if (cft->read_u64 || cft->read_s64 || cft->seq_show)
 		mode |= S_IRUGO;
 
 	if (cft->write_u64 || cft->write_s64 || cft->write_string ||
@@ -3873,7 +3873,7 @@ static struct cftype cgroup_base_files[] = {
 	{
 		.name = "cgroup.sane_behavior",
 		.flags = CFTYPE_ONLY_ON_ROOT,
-		.read_seq_string = cgroup_sane_behavior_show,
+		.seq_show = cgroup_sane_behavior_show,
 	},
 
 	/*
@@ -3898,7 +3898,7 @@ static struct cftype cgroup_base_files[] = {
 	{
 		.name = "release_agent",
 		.flags = CFTYPE_INSANE | CFTYPE_ONLY_ON_ROOT,
-		.read_seq_string = cgroup_release_agent_show,
+		.seq_show = cgroup_release_agent_show,
 		.write_string = cgroup_release_agent_write,
 		.max_write_len = PATH_MAX,
 	},
@@ -5273,9 +5273,7 @@ static u64 current_css_set_refcount_read(struct cgroup_subsys_state *css,
 	return count;
 }
 
-static int current_css_set_cg_links_read(struct cgroup_subsys_state *css,
-					 struct cftype *cft,
-					 struct seq_file *seq)
+static int current_css_set_cg_links_read(struct seq_file *seq, void *v)
 {
 	struct cgrp_cset_link *link;
 	struct css_set *cset;
@@ -5300,9 +5298,9 @@ static int current_css_set_cg_links_read(struct cgroup_subsys_state *css,
 }
 
 #define MAX_TASKS_SHOWN_PER_CSS 25
-static int cgroup_css_links_read(struct cgroup_subsys_state *css,
-				 struct cftype *cft, struct seq_file *seq)
+static int cgroup_css_links_read(struct seq_file *seq, void *v)
 {
+	struct cgroup_subsys_state *css = seq_css(seq);
 	struct cgrp_cset_link *link;
 
 	read_lock(&css_set_lock);
@@ -5348,12 +5346,12 @@ static struct cftype debug_files[] =  {
 
 	{
 		.name = "current_css_set_cg_links",
-		.read_seq_string = current_css_set_cg_links_read,
+		.seq_show = current_css_set_cg_links_read,
 	},
 
 	{
 		.name = "cgroup_css_links",
-		.read_seq_string = cgroup_css_links_read,
+		.seq_show = cgroup_css_links_read,
 	},
 
 	{
diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
index f0ff64d..6c3154e 100644
--- a/kernel/cgroup_freezer.c
+++ b/kernel/cgroup_freezer.c
@@ -301,10 +301,9 @@ out_unlock:
 	spin_unlock_irq(&freezer->lock);
 }
 
-static int freezer_read(struct cgroup_subsys_state *css, struct cftype *cft,
-			struct seq_file *m)
+static int freezer_read(struct seq_file *m, void *v)
 {
-	struct cgroup_subsys_state *pos;
+	struct cgroup_subsys_state *css = seq_css(m), *pos;
 
 	rcu_read_lock();
 
@@ -458,7 +457,7 @@ static struct cftype files[] = {
 	{
 		.name = "state",
 		.flags = CFTYPE_NOT_ON_ROOT,
-		.read_seq_string = freezer_read,
+		.seq_show = freezer_read,
 		.write_string = freezer_write,
 	},
 	{
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 032929f..4410ac6 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1732,12 +1732,10 @@ out_unlock:
  * and since these maps can change value dynamically, one could read
  * gibberish by doing partial reads while a list was changing.
  */
-static int cpuset_common_read_seq_string(struct cgroup_subsys_state *css,
-					 struct cftype *cft,
-					 struct seq_file *sf)
+static int cpuset_common_seq_show(struct seq_file *sf, void *v)
 {
-	struct cpuset *cs = css_cs(css);
-	cpuset_filetype_t type = cft->private;
+	struct cpuset *cs = css_cs(seq_css(sf));
+	cpuset_filetype_t type = seq_cft(sf)->private;
 	ssize_t count;
 	char *buf, *s;
 	int ret = 0;
@@ -1824,7 +1822,7 @@ static s64 cpuset_read_s64(struct cgroup_subsys_state *css, struct cftype *cft)
 static struct cftype files[] = {
 	{
 		.name = "cpus",
-		.read_seq_string = cpuset_common_read_seq_string,
+		.seq_show = cpuset_common_seq_show,
 		.write_string = cpuset_write_resmask,
 		.max_write_len = (100U + 6 * NR_CPUS),
 		.private = FILE_CPULIST,
@@ -1832,7 +1830,7 @@ static struct cftype files[] = {
 
 	{
 		.name = "mems",
-		.read_seq_string = cpuset_common_read_seq_string,
+		.seq_show = cpuset_common_seq_show,
 		.write_string = cpuset_write_resmask,
 		.max_write_len = (100U + 6 * MAX_NUMNODES),
 		.private = FILE_MEMLIST,
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f28ec67..7e8cbb9 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7256,10 +7256,9 @@ static int __cfs_schedulable(struct task_group *tg, u64 period, u64 quota)
 	return ret;
 }
 
-static int cpu_stats_show(struct cgroup_subsys_state *css, struct cftype *cft,
-			  struct seq_file *sf)
+static int cpu_stats_show(struct seq_file *sf, void *v)
 {
-	struct task_group *tg = css_tg(css);
+	struct task_group *tg = css_tg(seq_css(sf));
 	struct cfs_bandwidth *cfs_b = &tg->cfs_bandwidth;
 
 	seq_printf(sf, "nr_periods %d\n", cfs_b->nr_periods);
@@ -7318,7 +7317,7 @@ static struct cftype cpu_files[] = {
 	},
 	{
 		.name = "stat",
-		.read_seq_string = cpu_stats_show,
+		.seq_show = cpu_stats_show,
 	},
 #endif
 #ifdef CONFIG_RT_GROUP_SCHED
diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c
index dd88738..622e081 100644
--- a/kernel/sched/cpuacct.c
+++ b/kernel/sched/cpuacct.c
@@ -163,10 +163,9 @@ out:
 	return err;
 }
 
-static int cpuacct_percpu_seq_read(struct cgroup_subsys_state *css,
-				   struct cftype *cft, struct seq_file *m)
+static int cpuacct_percpu_seq_show(struct seq_file *m, void *V)
 {
-	struct cpuacct *ca = css_ca(css);
+	struct cpuacct *ca = css_ca(seq_css(m));
 	u64 percpu;
 	int i;
 
@@ -183,10 +182,9 @@ static const char * const cpuacct_stat_desc[] = {
 	[CPUACCT_STAT_SYSTEM] = "system",
 };
 
-static int cpuacct_stats_show(struct cgroup_subsys_state *css,
-			      struct cftype *cft, struct seq_file *sf)
+static int cpuacct_stats_show(struct seq_file *sf, void *v)
 {
-	struct cpuacct *ca = css_ca(css);
+	struct cpuacct *ca = css_ca(seq_css(sf));
 	int cpu;
 	s64 val = 0;
 
@@ -220,11 +218,11 @@ static struct cftype files[] = {
 	},
 	{
 		.name = "usage_percpu",
-		.read_seq_string = cpuacct_percpu_seq_read,
+		.seq_show = cpuacct_percpu_seq_show,
 	},
 	{
 		.name = "stat",
-		.read_seq_string = cpuacct_stats_show,
+		.seq_show = cpuacct_stats_show,
 	},
 	{ }	/* terminate */
 };
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f149521..9252219 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3014,10 +3014,9 @@ static struct kmem_cache *memcg_params_to_cache(struct memcg_cache_params *p)
 }
 
 #ifdef CONFIG_SLABINFO
-static int mem_cgroup_slabinfo_read(struct cgroup_subsys_state *css,
-				    struct cftype *cft, struct seq_file *m)
+static int mem_cgroup_slabinfo_read(struct seq_file *m, void *v)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(seq_css(m));
 	struct memcg_cache_params *params;
 
 	if (!memcg_can_account_kmem(memcg))
@@ -5418,8 +5417,7 @@ static int mem_cgroup_move_charge_write(struct cgroup_subsys_state *css,
 #endif
 
 #ifdef CONFIG_NUMA
-static int memcg_numa_stat_show(struct cgroup_subsys_state *css,
-				struct cftype *cft, struct seq_file *m)
+static int memcg_numa_stat_show(struct seq_file *m, void *v)
 {
 	struct numa_stat {
 		const char *name;
@@ -5435,7 +5433,7 @@ static int memcg_numa_stat_show(struct cgroup_subsys_state *css,
 	const struct numa_stat *stat;
 	int nid;
 	unsigned long nr;
-	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(seq_css(m));
 
 	for (stat = stats; stat < stats + ARRAY_SIZE(stats); stat++) {
 		nr = mem_cgroup_nr_lru_pages(memcg, stat->lru_mask);
@@ -5474,10 +5472,9 @@ static inline void mem_cgroup_lru_names_not_uptodate(void)
 	BUILD_BUG_ON(ARRAY_SIZE(mem_cgroup_lru_names) != NR_LRU_LISTS);
 }
 
-static int memcg_stat_show(struct cgroup_subsys_state *css, struct cftype *cft,
-				 struct seq_file *m)
+static int memcg_stat_show(struct seq_file *m, void *v)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(seq_css(m));
 	struct mem_cgroup *mi;
 	unsigned int i;
 
@@ -5907,10 +5904,9 @@ static void mem_cgroup_oom_unregister_event(struct mem_cgroup *memcg,
 	spin_unlock(&memcg_oom_lock);
 }
 
-static int mem_cgroup_oom_control_read(struct cgroup_subsys_state *css,
-				       struct cftype *cft, struct seq_file *sf)
+static int mem_cgroup_oom_control_read(struct seq_file *sf, void *v)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
+	struct mem_cgroup *memcg = mem_cgroup_from_css(seq_css(sf));
 
 	seq_printf(sf, "oom_kill_disable %d\n", memcg->oom_kill_disable);
 	seq_printf(sf, "under_oom %d\n", (bool)atomic_read(&memcg->under_oom));
@@ -6260,7 +6256,7 @@ static struct cftype mem_cgroup_files[] = {
 	},
 	{
 		.name = "stat",
-		.read_seq_string = memcg_stat_show,
+		.seq_show = memcg_stat_show,
 	},
 	{
 		.name = "force_empty",
@@ -6290,7 +6286,7 @@ static struct cftype mem_cgroup_files[] = {
 	},
 	{
 		.name = "oom_control",
-		.read_seq_string = mem_cgroup_oom_control_read,
+		.seq_show = mem_cgroup_oom_control_read,
 		.write_u64 = mem_cgroup_oom_control_write,
 		.private = MEMFILE_PRIVATE(_OOM_TYPE, OOM_CONTROL),
 	},
@@ -6300,7 +6296,7 @@ static struct cftype mem_cgroup_files[] = {
 #ifdef CONFIG_NUMA
 	{
 		.name = "numa_stat",
-		.read_seq_string = memcg_numa_stat_show,
+		.seq_show = memcg_numa_stat_show,
 	},
 #endif
 #ifdef CONFIG_MEMCG_KMEM
@@ -6330,7 +6326,7 @@ static struct cftype mem_cgroup_files[] = {
 #ifdef CONFIG_SLABINFO
 	{
 		.name = "kmem.slabinfo",
-		.read_seq_string = mem_cgroup_slabinfo_read,
+		.seq_show = mem_cgroup_slabinfo_read,
 	},
 #endif
 #endif
diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index 498710d..56cbb69 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -173,14 +173,14 @@ static u64 read_prioidx(struct cgroup_subsys_state *css, struct cftype *cft)
 	return css->cgroup->id;
 }
 
-static int read_priomap(struct cgroup_subsys_state *css, struct cftype *cft,
-			struct seq_file *sf)
+static int read_priomap(struct seq_file *sf, void *v)
 {
 	struct net_device *dev;
 
 	rcu_read_lock();
 	for_each_netdev_rcu(&init_net, dev)
-		seq_printf(sf, "%s %u\n", dev->name, netprio_prio(css, dev));
+		seq_printf(sf, "%s %u\n", dev->name,
+			   netprio_prio(seq_css(sf), dev));
 	rcu_read_unlock();
 	return 0;
 }
@@ -238,7 +238,7 @@ static struct cftype ss_files[] = {
 	},
 	{
 		.name = "ifpriomap",
-		.read_seq_string = read_priomap,
+		.seq_show = read_priomap,
 		.write_string = write_priomap,
 	},
 	{ }	/* terminate */
diff --git a/security/device_cgroup.c b/security/device_cgroup.c
index 7c2a0a7..d3b6d2c 100644
--- a/security/device_cgroup.c
+++ b/security/device_cgroup.c
@@ -274,10 +274,9 @@ static void set_majmin(char *str, unsigned m)
 		sprintf(str, "%u", m);
 }
 
-static int devcgroup_seq_read(struct cgroup_subsys_state *css,
-			      struct cftype *cft, struct seq_file *m)
+static int devcgroup_seq_show(struct seq_file *m, void *v)
 {
-	struct dev_cgroup *devcgroup = css_to_devcgroup(css);
+	struct dev_cgroup *devcgroup = css_to_devcgroup(seq_css(m));
 	struct dev_exception_item *ex;
 	char maj[MAJMINLEN], min[MAJMINLEN], acc[ACCLEN];
 
@@ -679,7 +678,7 @@ static struct cftype dev_cgroup_files[] = {
 	},
 	{
 		.name = "list",
-		.read_seq_string = devcgroup_seq_read,
+		.seq_show = devcgroup_seq_show,
 		.private = DEVCG_LIST,
 	},
 	{ }	/* terminate */
-- 
1.8.4.2

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 12/12] cgroup: unify pidlist and other file handling
       [not found] ` <1385595759-17656-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (10 preceding siblings ...)
  2013-11-27 23:42   ` [PATCH 11/12] cgroup: replace cftype->read_seq_string() with cftype->seq_show() Tejun Heo
@ 2013-11-27 23:42   ` Tejun Heo
       [not found]     ` <1385595759-17656-13-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2013-12-05  1:48   ` [PATCHSET cgroup/for-3.14] cgroup: consolidate " Li Zefan
  2013-12-05 17:26   ` Tejun Heo
  13 siblings, 1 reply; 38+ messages in thread
From: Tejun Heo @ 2013-11-27 23:42 UTC (permalink / raw)
  To: lizefan-hv44wF8Li93QT0dZR+AlfA
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, peterz-wEGCiKHe2LqWVfeAwA7xHQ,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-AlSwsSmVLrQ,
	bsingharora-Re5JQEeQqe8AvxtiuMwx3w,
	kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A,
	nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	arozansk-H+wXaHxf7aLQT0dZR+AlfA, Tejun Heo

In preparation for the conversion to kernfs, cgroup file handling is
updated so that it can be easily mapped to kernfs.  With the previous
changes, the difference between pidlist and other files is very
small.  Both are served by seq_file in a pretty standard way, the
only difference being that !pidlist files use single_open().
This patch adds cftype->seq_start(), ->seq_next() and ->seq_stop() and
implements the matching cgroup_seqfile_start/next/stop() which either
emulates single_open() behavior or invokes cftype->seq_*() operations
if specified.  This allows using single seq_operations for both
pidlist and other files and makes cgroup_pidlist_operations and
cgroup_pidlist_open() no longer necessary.  As cgroup_pidlist_open()
was the only user of cftype->open(), the method is dropped together.

This brings cftype file interface very close to kernfs interface and
mapping shouldn't be too difficult.  Once converted to kernfs, most of
the plumbing code including cgroup_seqfile_*() will be removed as
kernfs provides those facilities.

This patch does not introduce any behavior changes.

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
---
 include/linux/cgroup.h |   6 ++-
 kernel/cgroup.c        | 109 ++++++++++++++++++++++++++++---------------------
 2 files changed, 68 insertions(+), 47 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index b32a0f8..8b9a594 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -434,7 +434,6 @@ struct cftype {
 	 */
 	struct cgroup_subsys *ss;
 
-	int (*open)(struct inode *inode, struct file *file);
 	/*
 	 * read_u64() is a shortcut for the common case of returning a
 	 * single integer. Use it in place of read()
@@ -448,6 +447,11 @@ struct cftype {
 	/* generic seq_file read interface */
 	int (*seq_show)(struct seq_file *sf, void *v);
 
+	/* optional ops, implement all or none */
+	void *(*seq_start)(struct seq_file *sf, loff_t *ppos);
+	void *(*seq_next)(struct seq_file *sf, void *v, loff_t *ppos);
+	void (*seq_stop)(struct seq_file *sf, void *v);
+
 	/*
 	 * write_u64() is a shortcut for the common case of accepting
 	 * a single integer (as parsed by simple_strtoull) from
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 39fdf75..13a3d3b 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2286,6 +2286,45 @@ out_free:
  * supports string->u64 maps, but can be extended in future.
  */
 
+static void *cgroup_seqfile_start(struct seq_file *seq, loff_t *ppos)
+{
+	struct cftype *cft = seq_cft(seq);
+
+	if (cft->seq_start) {
+		return cft->seq_start(seq, ppos);
+	} else {
+		/*
+		 * The same behavior and code as single_open().  Returns
+		 * !NULL if pos is at the beginning; otherwise, NULL.
+		 */
+		return NULL + !*ppos;
+	}
+}
+
+static void *cgroup_seqfile_next(struct seq_file *seq, void *v, loff_t *ppos)
+{
+	struct cftype *cft = seq_cft(seq);
+
+	if (cft->seq_next) {
+		return cft->seq_next(seq, v, ppos);
+	} else {
+		/*
+		 * The same behavior and code as single_open(), always
+		 * terminate after the initial read.
+		 */
+		++*ppos;
+		return NULL;
+	}
+}
+
+static void cgroup_seqfile_stop(struct seq_file *seq, void *v)
+{
+	struct cftype *cft = seq_cft(seq);
+
+	if (cft->seq_stop)
+		cft->seq_stop(seq, v);
+}
+
 static int cgroup_seqfile_show(struct seq_file *m, void *arg)
 {
 	struct cftype *cft = seq_cft(m);
@@ -2303,6 +2342,13 @@ static int cgroup_seqfile_show(struct seq_file *m, void *arg)
 	return 0;
 }
 
+static struct seq_operations cgroup_seq_operations = {
+	.start		= cgroup_seqfile_start,
+	.next		= cgroup_seqfile_next,
+	.stop		= cgroup_seqfile_stop,
+	.show		= cgroup_seqfile_show,
+};
+
 static int cgroup_file_open(struct inode *inode, struct file *file)
 {
 	struct cfent *cfe = __d_cfe(file->f_dentry);
@@ -2338,20 +2384,17 @@ static int cgroup_file_open(struct inode *inode, struct file *file)
 	WARN_ON_ONCE(cfe->css && cfe->css != css);
 	cfe->css = css;
 
-	if (cft->open) {
-		err = cft->open(inode, file);
-	} else {
-		err = single_open_size(file, cgroup_seqfile_show, cfe,
-				       sizeof(struct cgroup_open_file));
-		if (!err) {
-			struct seq_file *sf = file->private_data;
-			struct cgroup_open_file *of = sf->private;
+	err = seq_open_private(file, &cgroup_seq_operations,
+			       sizeof(struct cgroup_open_file));
+	if (!err) {
+		struct seq_file *sf = file->private_data;
+		struct cgroup_open_file *of = sf->private;
 
-			of->cfe = cfe;
-		}
+		of->cfe = cfe;
+		return 0;
 	}
 
-	if (css->ss && err)
+	if (css->ss)
 		css_put(css);
 	return err;
 }
@@ -2363,7 +2406,7 @@ static int cgroup_file_release(struct inode *inode, struct file *file)
 
 	if (css->ss)
 		css_put(css);
-	return single_release(inode, file);
+	return seq_release_private(inode, file);
 }
 
 /*
@@ -3774,38 +3817,6 @@ static const struct seq_operations cgroup_pidlist_seq_operations = {
 	.show = cgroup_pidlist_show,
 };
 
-static const struct file_operations cgroup_pidlist_operations = {
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.write = cgroup_file_write,
-	.release = seq_release_private,
-};
-
-/*
- * The following functions handle opens on a file that displays a pidlist
- * (tasks or procs). Prepare an array of the process/thread IDs of whoever's
- * in the cgroup.
- */
-/* helper function for the two below it */
-static int cgroup_pidlist_open(struct inode *unused, struct file *file)
-{
-	struct cfent *cfe = __d_cfe(file->f_dentry);
-	struct cgroup_open_file *of;
-	int retval;
-
-	/* configure file information */
-	file->f_op = &cgroup_pidlist_operations;
-
-	retval = seq_open_private(file, &cgroup_pidlist_seq_operations,
-				  sizeof(*of));
-	if (retval)
-		return retval;
-
-	of = ((struct seq_file *)file->private_data)->private;
-	of->cfe = cfe;
-	return 0;
-}
-
 static u64 cgroup_read_notify_on_release(struct cgroup_subsys_state *css,
 					 struct cftype *cft)
 {
@@ -3859,7 +3870,10 @@ static int cgroup_clone_children_write(struct cgroup_subsys_state *css,
 static struct cftype cgroup_base_files[] = {
 	{
 		.name = "cgroup.procs",
-		.open = cgroup_pidlist_open,
+		.seq_start = cgroup_pidlist_start,
+		.seq_next = cgroup_pidlist_next,
+		.seq_stop = cgroup_pidlist_stop,
+		.seq_show = cgroup_pidlist_show,
 		.private = CGROUP_FILE_PROCS,
 		.write_u64 = cgroup_procs_write,
 		.mode = S_IRUGO | S_IWUSR,
@@ -3884,7 +3898,10 @@ static struct cftype cgroup_base_files[] = {
 	{
 		.name = "tasks",
 		.flags = CFTYPE_INSANE,		/* use "procs" instead */
-		.open = cgroup_pidlist_open,
+		.seq_start = cgroup_pidlist_start,
+		.seq_next = cgroup_pidlist_next,
+		.seq_stop = cgroup_pidlist_stop,
+		.seq_show = cgroup_pidlist_show,
 		.private = CGROUP_FILE_TASKS,
 		.write_u64 = cgroup_tasks_write,
 		.mode = S_IRUGO | S_IWUSR,
-- 
1.8.4.2

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH 03/12] memcg: convert away from cftype->read() and ->read_map()
       [not found]     ` <1385595759-17656-4-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2013-11-28  8:26       ` Michal Hocko
  0 siblings, 0 replies; 38+ messages in thread
From: Michal Hocko @ 2013-11-28  8:26 UTC (permalink / raw)
  To: Tejun Heo
  Cc: nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	hannes-druUgvl0LCNAfugRpC6u6w, cgroups-u79uwXL29TY76Z2rM5mHXA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA

On Wed 27-11-13 18:42:30, Tejun Heo wrote:
> In preparation of conversion to kernfs, cgroup file handling is being
> consolidated so that it can be easily mapped to the seq_file based
> interface of kernfs.
> 
> cftype->read_map() doesn't add any value and is being replaced with
> ->read_seq_string(), and all users of cftype->read() can be easily
> served, usually better, by seq_file and other methods.
> 
> Update mem_cgroup_read() to return u64 instead of printing itself and
> rename it to mem_cgroup_read_u64(), and update
> mem_cgroup_oom_control_read() to use ->read_seq_string() instead of
> ->read_map().
> 
> This patch doesn't make any visible behavior changes.
> 
> Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Cc: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
> Cc: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org>
> Cc: Balbir Singh <bsingharora-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>

Acked-by: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org>

Thanks!

> ---
>  mm/memcontrol.c | 49 +++++++++++++++++++++----------------------------
>  1 file changed, 21 insertions(+), 28 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 7aa0d40..f149521 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5150,14 +5150,12 @@ static inline u64 mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
>  	return val << PAGE_SHIFT;
>  }
>  
> -static ssize_t mem_cgroup_read(struct cgroup_subsys_state *css,
> -			       struct cftype *cft, struct file *file,
> -			       char __user *buf, size_t nbytes, loff_t *ppos)
> +static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css,
> +				   struct cftype *cft)
>  {
>  	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
> -	char str[64];
>  	u64 val;
> -	int name, len;
> +	int name;
>  	enum res_type type;
>  
>  	type = MEMFILE_TYPE(cft->private);
> @@ -5183,8 +5181,7 @@ static ssize_t mem_cgroup_read(struct cgroup_subsys_state *css,
>  		BUG();
>  	}
>  
> -	len = scnprintf(str, sizeof(str), "%llu\n", (unsigned long long)val);
> -	return simple_read_from_buffer(buf, nbytes, ppos, str, len);
> +	return val;
>  }
>  
>  static int memcg_update_kmem_limit(struct cgroup_subsys_state *css, u64 val)
> @@ -5911,16 +5908,12 @@ static void mem_cgroup_oom_unregister_event(struct mem_cgroup *memcg,
>  }
>  
>  static int mem_cgroup_oom_control_read(struct cgroup_subsys_state *css,
> -	struct cftype *cft,  struct cgroup_map_cb *cb)
> +				       struct cftype *cft, struct seq_file *sf)
>  {
>  	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
>  
> -	cb->fill(cb, "oom_kill_disable", memcg->oom_kill_disable);
> -
> -	if (atomic_read(&memcg->under_oom))
> -		cb->fill(cb, "under_oom", 1);
> -	else
> -		cb->fill(cb, "under_oom", 0);
> +	seq_printf(sf, "oom_kill_disable %d\n", memcg->oom_kill_disable);
> +	seq_printf(sf, "under_oom %d\n", (bool)atomic_read(&memcg->under_oom));
>  	return 0;
>  }
>  
> @@ -6239,31 +6232,31 @@ static struct cftype mem_cgroup_files[] = {
>  	{
>  		.name = "usage_in_bytes",
>  		.private = MEMFILE_PRIVATE(_MEM, RES_USAGE),
> -		.read = mem_cgroup_read,
> +		.read_u64 = mem_cgroup_read_u64,
>  	},
>  	{
>  		.name = "max_usage_in_bytes",
>  		.private = MEMFILE_PRIVATE(_MEM, RES_MAX_USAGE),
>  		.trigger = mem_cgroup_reset,
> -		.read = mem_cgroup_read,
> +		.read_u64 = mem_cgroup_read_u64,
>  	},
>  	{
>  		.name = "limit_in_bytes",
>  		.private = MEMFILE_PRIVATE(_MEM, RES_LIMIT),
>  		.write_string = mem_cgroup_write,
> -		.read = mem_cgroup_read,
> +		.read_u64 = mem_cgroup_read_u64,
>  	},
>  	{
>  		.name = "soft_limit_in_bytes",
>  		.private = MEMFILE_PRIVATE(_MEM, RES_SOFT_LIMIT),
>  		.write_string = mem_cgroup_write,
> -		.read = mem_cgroup_read,
> +		.read_u64 = mem_cgroup_read_u64,
>  	},
>  	{
>  		.name = "failcnt",
>  		.private = MEMFILE_PRIVATE(_MEM, RES_FAILCNT),
>  		.trigger = mem_cgroup_reset,
> -		.read = mem_cgroup_read,
> +		.read_u64 = mem_cgroup_read_u64,
>  	},
>  	{
>  		.name = "stat",
> @@ -6297,7 +6290,7 @@ static struct cftype mem_cgroup_files[] = {
>  	},
>  	{
>  		.name = "oom_control",
> -		.read_map = mem_cgroup_oom_control_read,
> +		.read_seq_string = mem_cgroup_oom_control_read,
>  		.write_u64 = mem_cgroup_oom_control_write,
>  		.private = MEMFILE_PRIVATE(_OOM_TYPE, OOM_CONTROL),
>  	},
> @@ -6315,24 +6308,24 @@ static struct cftype mem_cgroup_files[] = {
>  		.name = "kmem.limit_in_bytes",
>  		.private = MEMFILE_PRIVATE(_KMEM, RES_LIMIT),
>  		.write_string = mem_cgroup_write,
> -		.read = mem_cgroup_read,
> +		.read_u64 = mem_cgroup_read_u64,
>  	},
>  	{
>  		.name = "kmem.usage_in_bytes",
>  		.private = MEMFILE_PRIVATE(_KMEM, RES_USAGE),
> -		.read = mem_cgroup_read,
> +		.read_u64 = mem_cgroup_read_u64,
>  	},
>  	{
>  		.name = "kmem.failcnt",
>  		.private = MEMFILE_PRIVATE(_KMEM, RES_FAILCNT),
>  		.trigger = mem_cgroup_reset,
> -		.read = mem_cgroup_read,
> +		.read_u64 = mem_cgroup_read_u64,
>  	},
>  	{
>  		.name = "kmem.max_usage_in_bytes",
>  		.private = MEMFILE_PRIVATE(_KMEM, RES_MAX_USAGE),
>  		.trigger = mem_cgroup_reset,
> -		.read = mem_cgroup_read,
> +		.read_u64 = mem_cgroup_read_u64,
>  	},
>  #ifdef CONFIG_SLABINFO
>  	{
> @@ -6349,25 +6342,25 @@ static struct cftype memsw_cgroup_files[] = {
>  	{
>  		.name = "memsw.usage_in_bytes",
>  		.private = MEMFILE_PRIVATE(_MEMSWAP, RES_USAGE),
> -		.read = mem_cgroup_read,
> +		.read_u64 = mem_cgroup_read_u64,
>  	},
>  	{
>  		.name = "memsw.max_usage_in_bytes",
>  		.private = MEMFILE_PRIVATE(_MEMSWAP, RES_MAX_USAGE),
>  		.trigger = mem_cgroup_reset,
> -		.read = mem_cgroup_read,
> +		.read_u64 = mem_cgroup_read_u64,
>  	},
>  	{
>  		.name = "memsw.limit_in_bytes",
>  		.private = MEMFILE_PRIVATE(_MEMSWAP, RES_LIMIT),
>  		.write_string = mem_cgroup_write,
> -		.read = mem_cgroup_read,
> +		.read_u64 = mem_cgroup_read_u64,
>  	},
>  	{
>  		.name = "memsw.failcnt",
>  		.private = MEMFILE_PRIVATE(_MEMSWAP, RES_FAILCNT),
>  		.trigger = mem_cgroup_reset,
> -		.read = mem_cgroup_read,
> +		.read_u64 = mem_cgroup_read_u64,
>  	},
>  	{ },	/* terminate */
>  };
> -- 
> 1.8.4.2
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 05/12] hugetlb_cgroup: convert away from cftype->read()
       [not found]     ` <1385595759-17656-6-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2013-11-28  8:29       ` Michal Hocko
  0 siblings, 0 replies; 38+ messages in thread
From: Michal Hocko @ 2013-11-28  8:29 UTC (permalink / raw)
  To: Tejun Heo
  Cc: nhorman-2XuSBdqkA4R54TAoqtyWWQ, Aneesh Kumar K.V,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	hannes-druUgvl0LCNAfugRpC6u6w, cgroups-u79uwXL29TY76Z2rM5mHXA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA

On Wed 27-11-13 18:42:32, Tejun Heo wrote:
> In preparation of conversion to kernfs, cgroup file handling is being
> consolidated so that it can be easily mapped to the seq_file based
> interface of kernfs.
> 
> All users of cftype->read() can be easily served, usually better, by
> seq_file and other methods.  Update hugetlb_cgroup_read() to return
> u64 instead of printing itself and rename it to
> hugetlb_cgroup_read_u64().
> 
> This patch doesn't make any visible behavior changes.
> 
> Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Cc: Aneesh Kumar K.V <aneesh.kumar-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
> Cc: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
> Cc: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org>

Reviewed-by: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org>

Thanks!

> ---
>  mm/hugetlb_cgroup.c | 22 ++++++++--------------
>  1 file changed, 8 insertions(+), 14 deletions(-)
> 
> diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
> index bda8e44..d747a84 100644
> --- a/mm/hugetlb_cgroup.c
> +++ b/mm/hugetlb_cgroup.c
> @@ -242,22 +242,16 @@ void hugetlb_cgroup_uncharge_cgroup(int idx, unsigned long nr_pages,
>  	return;
>  }
>  
> -static ssize_t hugetlb_cgroup_read(struct cgroup_subsys_state *css,
> -				   struct cftype *cft, struct file *file,
> -				   char __user *buf, size_t nbytes,
> -				   loff_t *ppos)
> +static u64 hugetlb_cgroup_read_u64(struct cgroup_subsys_state *css,
> +				   struct cftype *cft)
>  {
> -	u64 val;
> -	char str[64];
> -	int idx, name, len;
> +	int idx, name;
>  	struct hugetlb_cgroup *h_cg = hugetlb_cgroup_from_css(css);
>  
>  	idx = MEMFILE_IDX(cft->private);
>  	name = MEMFILE_ATTR(cft->private);
>  
> -	val = res_counter_read_u64(&h_cg->hugepage[idx], name);
> -	len = scnprintf(str, sizeof(str), "%llu\n", (unsigned long long)val);
> -	return simple_read_from_buffer(buf, nbytes, ppos, str, len);
> +	return res_counter_read_u64(&h_cg->hugepage[idx], name);
>  }
>  
>  static int hugetlb_cgroup_write(struct cgroup_subsys_state *css,
> @@ -337,28 +331,28 @@ static void __init __hugetlb_cgroup_file_init(int idx)
>  	cft = &h->cgroup_files[0];
>  	snprintf(cft->name, MAX_CFTYPE_NAME, "%s.limit_in_bytes", buf);
>  	cft->private = MEMFILE_PRIVATE(idx, RES_LIMIT);
> -	cft->read = hugetlb_cgroup_read;
> +	cft->read_u64 = hugetlb_cgroup_read_u64;
>  	cft->write_string = hugetlb_cgroup_write;
>  
>  	/* Add the usage file */
>  	cft = &h->cgroup_files[1];
>  	snprintf(cft->name, MAX_CFTYPE_NAME, "%s.usage_in_bytes", buf);
>  	cft->private = MEMFILE_PRIVATE(idx, RES_USAGE);
> -	cft->read = hugetlb_cgroup_read;
> +	cft->read_u64 = hugetlb_cgroup_read_u64;
>  
>  	/* Add the MAX usage file */
>  	cft = &h->cgroup_files[2];
>  	snprintf(cft->name, MAX_CFTYPE_NAME, "%s.max_usage_in_bytes", buf);
>  	cft->private = MEMFILE_PRIVATE(idx, RES_MAX_USAGE);
>  	cft->trigger = hugetlb_cgroup_reset;
> -	cft->read = hugetlb_cgroup_read;
> +	cft->read_u64 = hugetlb_cgroup_read_u64;
>  
>  	/* Add the failcntfile */
>  	cft = &h->cgroup_files[3];
>  	snprintf(cft->name, MAX_CFTYPE_NAME, "%s.failcnt", buf);
>  	cft->private  = MEMFILE_PRIVATE(idx, RES_FAILCNT);
>  	cft->trigger  = hugetlb_cgroup_reset;
> -	cft->read = hugetlb_cgroup_read;
> +	cft->read_u64 = hugetlb_cgroup_read_u64;
>  
>  	/* NULL terminate the last cft */
>  	cft = &h->cgroup_files[4];
> -- 
> 1.8.4.2
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 11/12] cgroup: replace cftype->read_seq_string() with cftype->seq_show()
       [not found]     ` <1385595759-17656-12-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2013-11-28  9:07       ` Daniel Wagner
  2013-11-28 11:25       ` Michal Hocko
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 38+ messages in thread
From: Daniel Wagner @ 2013-11-28  9:07 UTC (permalink / raw)
  To: Tejun Heo, lizefan-hv44wF8Li93QT0dZR+AlfA
  Cc: Jens Axboe, nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	mhocko-AlSwsSmVLrQ, mingo-H+wXaHxf7aLQT0dZR+AlfA,
	hannes-druUgvl0LCNAfugRpC6u6w, cgroups-u79uwXL29TY76Z2rM5mHXA,
	vgoyal-H+wXaHxf7aLQT0dZR+AlfA

Hi Tejun,

On 11/28/2013 12:42 AM, Tejun Heo wrote:
> In preparation of conversion to kernfs, cgroup file handling is
> updated so that it can be easily mapped to kernfs.  This patch
> replaces cftype->read_seq_string() with cftype->seq_show() which is
> not limited to single_open() operation and will map directly to the
> kernfs seq_file interface.
>
> The conversions are mechanical.  As ->seq_show() doesn't have @css and
> @cft, the functions which make use of them are converted to use
> seq_css() and seq_cft() respectively.  In several cases, e.g. when a
> function has seq_string in its name, the function name is updated to fit the
> new method better.
>
> This patch does not introduce any behavior changes.
>
> Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Cc: Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>
> Cc: Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> Cc: Ingo Molnar <mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> Cc: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
> Cc: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
> Cc: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org>
> Cc: Balbir Singh <bsingharora-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
> Cc: Neil Horman <nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>
> Cc: Daniel Wagner <daniel.wagner-98C5kh4wR6ohFhg+JK9F0w@public.gmane.org>
> Cc: Aristeu Rozanski <arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

Ack for netprio_cgroup.c.

Thanks,
daniel

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 07/12] cgroup: unify cgroup_write_X64() and cgroup_write_string()
       [not found]     ` <1385595759-17656-8-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2013-11-28 11:18       ` Michal Hocko
       [not found]         ` <20131128111818.GG2761-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: Michal Hocko @ 2013-11-28 11:18 UTC (permalink / raw)
  To: Tejun Heo
  Cc: nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	hannes-druUgvl0LCNAfugRpC6u6w, cgroups-u79uwXL29TY76Z2rM5mHXA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA

On Wed 27-11-13 18:42:34, Tejun Heo wrote:
> cgroup_write_X64() and cgroup_write_string() both implement about the
> same buffering logic. 

> Unify the two into cgroup_file_write(), which always allocates a
> dynamic buffer for simplicity

It's true that this is simpler, but the allocation might cause some
issues with memcg. Although it is not common for a userspace OOM handler
to be part of the target memcg, there are users (e.g. Google) who do that.

Why is that a problem? Consider a memcg under OOM: the handler gets
notified and tries to resolve the situation by enlarging the hard limit.
This worked before this patch because cgroup_write_X64() used an on-stack
buffer, but now it would use kmalloc(), which might be accounted and trip
over the same OOM situation.

This is not limited to OOM handling. The group might be close to OOM,
and the last thing the user expects is to trigger OOM when he tries to
enlarge the limit.

Is the additional simplicity worth breaking those usecases?

> and uses kstrto*() instead of simple_strto*().
> 
> This patch doesn't make any visible behavior changes except for
> possibly different error values from kstrto*().
> 
> Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> ---
>  kernel/cgroup.c | 112 ++++++++++++++++++--------------------------------------
>  1 file changed, 36 insertions(+), 76 deletions(-)
> 
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 3592515..23abe8e 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -2249,90 +2249,50 @@ static int cgroup_sane_behavior_show(struct cgroup_subsys_state *css,
>  /* A buffer size big enough for numbers or short strings */
>  #define CGROUP_LOCAL_BUFFER_SIZE 64
>  
> -static ssize_t cgroup_write_X64(struct cgroup_subsys_state *css,
> -				struct cftype *cft, struct file *file,
> -				const char __user *userbuf, size_t nbytes,
> -				loff_t *unused_ppos)
> -{
> -	char buffer[CGROUP_LOCAL_BUFFER_SIZE];
> -	int retval = 0;
> -	char *end;
> -
> -	if (!nbytes)
> -		return -EINVAL;
> -	if (nbytes >= sizeof(buffer))
> -		return -E2BIG;
> -	if (copy_from_user(buffer, userbuf, nbytes))
> -		return -EFAULT;
> -
> -	buffer[nbytes] = 0;     /* nul-terminate */
> -	if (cft->write_u64) {
> -		u64 val = simple_strtoull(strstrip(buffer), &end, 0);
> -		if (*end)
> -			return -EINVAL;
> -		retval = cft->write_u64(css, cft, val);
> -	} else {
> -		s64 val = simple_strtoll(strstrip(buffer), &end, 0);
> -		if (*end)
> -			return -EINVAL;
> -		retval = cft->write_s64(css, cft, val);
> -	}
> -	if (!retval)
> -		retval = nbytes;
> -	return retval;
> -}
> -
> -static ssize_t cgroup_write_string(struct cgroup_subsys_state *css,
> -				   struct cftype *cft, struct file *file,
> -				   const char __user *userbuf, size_t nbytes,
> -				   loff_t *unused_ppos)
> +static ssize_t cgroup_file_write(struct file *file, const char __user *userbuf,
> +				 size_t nbytes, loff_t *ppos)
>  {
> -	char local_buffer[CGROUP_LOCAL_BUFFER_SIZE];
> -	int retval = 0;
> -	size_t max_bytes = cft->max_write_len;
> -	char *buffer = local_buffer;
> +	struct cfent *cfe = __d_cfe(file->f_dentry);
> +	struct cftype *cft = __d_cft(file->f_dentry);
> +	struct cgroup_subsys_state *css = cfe->css;
> +	size_t max_bytes = cft->max_write_len ?: CGROUP_LOCAL_BUFFER_SIZE - 1;
> +	char *buf;
> +	int ret;
>  
> -	if (!max_bytes)
> -		max_bytes = sizeof(local_buffer) - 1;
>  	if (nbytes >= max_bytes)
>  		return -E2BIG;
> -	/* Allocate a dynamic buffer if we need one */
> -	if (nbytes >= sizeof(local_buffer)) {
> -		buffer = kmalloc(nbytes + 1, GFP_KERNEL);
> -		if (buffer == NULL)
> -			return -ENOMEM;
> -	}
> -	if (nbytes && copy_from_user(buffer, userbuf, nbytes)) {
> -		retval = -EFAULT;
> -		goto out;
> -	}
>  
> -	buffer[nbytes] = 0;     /* nul-terminate */
> -	retval = cft->write_string(css, cft, strstrip(buffer));
> -	if (!retval)
> -		retval = nbytes;
> -out:
> -	if (buffer != local_buffer)
> -		kfree(buffer);
> -	return retval;
> -}
> +	buf = kmalloc(nbytes + 1, GFP_KERNEL);
> +	if (!buf)
> +		return -ENOMEM;
>  
> -static ssize_t cgroup_file_write(struct file *file, const char __user *buf,
> -				 size_t nbytes, loff_t *ppos)
> -{
> -	struct cfent *cfe = __d_cfe(file->f_dentry);
> -	struct cftype *cft = __d_cft(file->f_dentry);
> -	struct cgroup_subsys_state *css = cfe->css;
> +	if (copy_from_user(buf, userbuf, nbytes)) {
> +		ret = -EFAULT;
> +		goto out_free;
> +	}
>  
> -	if (cft->write_u64 || cft->write_s64)
> -		return cgroup_write_X64(css, cft, file, buf, nbytes, ppos);
> -	if (cft->write_string)
> -		return cgroup_write_string(css, cft, file, buf, nbytes, ppos);
> -	if (cft->trigger) {
> -		int ret = cft->trigger(css, (unsigned int)cft->private);
> -		return ret ? ret : nbytes;
> +	buf[nbytes] = '\0';
> +
> +	if (cft->write_string) {
> +		ret = cft->write_string(css, cft, strstrip(buf));
> +	} else if (cft->write_u64) {
> +		unsigned long long v;
> +		ret = kstrtoull(buf, 0, &v);
> +		if (!ret)
> +			ret = cft->write_u64(css, cft, v);
> +	} else if (cft->write_s64) {
> +		long long v;
> +		ret = kstrtoll(buf, 0, &v);
> +		if (!ret)
> +			ret = cft->write_s64(css, cft, v);
> +	} else if (cft->trigger) {
> +		ret = cft->trigger(css, (unsigned int)cft->private);
> +	} else {
> +		ret = -EINVAL;
>  	}
> -	return -EINVAL;
> +out_free:
> +	kfree(buf);
> +	return ret ?: nbytes;
>  }
>  
>  static ssize_t cgroup_read_u64(struct cgroup_subsys_state *css,
> -- 
> 1.8.4.2
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 11/12] cgroup: replace cftype->read_seq_string() with cftype->seq_show()
       [not found]     ` <1385595759-17656-12-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2013-11-28  9:07       ` Daniel Wagner
@ 2013-11-28 11:25       ` Michal Hocko
  2013-12-02 14:41       ` Aristeu Rozanski
  2013-12-02 14:52       ` Vivek Goyal
  3 siblings, 0 replies; 38+ messages in thread
From: Michal Hocko @ 2013-11-28 11:25 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jens Axboe, nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	hannes-druUgvl0LCNAfugRpC6u6w, cgroups-u79uwXL29TY76Z2rM5mHXA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA

On Wed 27-11-13 18:42:38, Tejun Heo wrote:
> In preparation of conversion to kernfs, cgroup file handling is
> updated so that it can be easily mapped to kernfs.  This patch
> replaces cftype->read_seq_string() with cftype->seq_show() which is
> not limited to single_open() operation and will map directly to the
> kernfs seq_file interface.
> 
> The conversions are mechanical.  As ->seq_show() doesn't have @css and
> @cft, the functions which make use of them are converted to use
> seq_css() and seq_cft() respectively.  In several cases, e.g. when a
> function has seq_string in its name, the function name is updated to fit the
> new method better.
> 
> This patch does not introduce any behavior changes.
> 
> Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Cc: Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>
> Cc: Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> Cc: Ingo Molnar <mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> Cc: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
> Cc: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
> Cc: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org>
> Cc: Balbir Singh <bsingharora-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
> Cc: Neil Horman <nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>
> Cc: Daniel Wagner <daniel.wagner-98C5kh4wR6ohFhg+JK9F0w@public.gmane.org>
> Cc: Aristeu Rozanski <arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

Looks good to me. For memcg part
Acked-by: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org>

Thanks!

> ---
>  block/blk-throttle.c      |  35 ++++++-------
>  block/cfq-iosched.c       | 131 ++++++++++++++++++++--------------------------
>  include/linux/cgroup.h    |   9 ++--
>  kernel/cgroup.c           |  34 ++++++------
>  kernel/cgroup_freezer.c   |   7 ++-
>  kernel/cpuset.c           |  12 ++---
>  kernel/sched/core.c       |   7 ++-
>  kernel/sched/cpuacct.c    |  14 +++--
>  mm/memcontrol.c           |  28 +++++-----
>  net/core/netprio_cgroup.c |   8 +--
>  security/device_cgroup.c  |   7 ++-
>  11 files changed, 128 insertions(+), 164 deletions(-)
> 
> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> index 0653404..a760857 100644
> --- a/block/blk-throttle.c
> +++ b/block/blk-throttle.c
> @@ -1303,13 +1303,10 @@ static u64 tg_prfill_cpu_rwstat(struct seq_file *sf,
>  	return __blkg_prfill_rwstat(sf, pd, &rwstat);
>  }
>  
> -static int tg_print_cpu_rwstat(struct cgroup_subsys_state *css,
> -			       struct cftype *cft, struct seq_file *sf)
> +static int tg_print_cpu_rwstat(struct seq_file *sf, void *v)
>  {
> -	struct blkcg *blkcg = css_to_blkcg(css);
> -
> -	blkcg_print_blkgs(sf, blkcg, tg_prfill_cpu_rwstat, &blkcg_policy_throtl,
> -			  cft->private, true);
> +	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)), tg_prfill_cpu_rwstat,
> +			  &blkcg_policy_throtl, seq_cft(sf)->private, true);
>  	return 0;
>  }
>  
> @@ -1335,19 +1332,17 @@ static u64 tg_prfill_conf_uint(struct seq_file *sf, struct blkg_policy_data *pd,
>  	return __blkg_prfill_u64(sf, pd, v);
>  }
>  
> -static int tg_print_conf_u64(struct cgroup_subsys_state *css,
> -			     struct cftype *cft, struct seq_file *sf)
> +static int tg_print_conf_u64(struct seq_file *sf, void *v)
>  {
> -	blkcg_print_blkgs(sf, css_to_blkcg(css), tg_prfill_conf_u64,
> -			  &blkcg_policy_throtl, cft->private, false);
> +	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)), tg_prfill_conf_u64,
> +			  &blkcg_policy_throtl, seq_cft(sf)->private, false);
>  	return 0;
>  }
>  
> -static int tg_print_conf_uint(struct cgroup_subsys_state *css,
> -			      struct cftype *cft, struct seq_file *sf)
> +static int tg_print_conf_uint(struct seq_file *sf, void *v)
>  {
> -	blkcg_print_blkgs(sf, css_to_blkcg(css), tg_prfill_conf_uint,
> -			  &blkcg_policy_throtl, cft->private, false);
> +	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)), tg_prfill_conf_uint,
> +			  &blkcg_policy_throtl, seq_cft(sf)->private, false);
>  	return 0;
>  }
>  
> @@ -1428,40 +1423,40 @@ static struct cftype throtl_files[] = {
>  	{
>  		.name = "throttle.read_bps_device",
>  		.private = offsetof(struct throtl_grp, bps[READ]),
> -		.read_seq_string = tg_print_conf_u64,
> +		.seq_show = tg_print_conf_u64,
>  		.write_string = tg_set_conf_u64,
>  		.max_write_len = 256,
>  	},
>  	{
>  		.name = "throttle.write_bps_device",
>  		.private = offsetof(struct throtl_grp, bps[WRITE]),
> -		.read_seq_string = tg_print_conf_u64,
> +		.seq_show = tg_print_conf_u64,
>  		.write_string = tg_set_conf_u64,
>  		.max_write_len = 256,
>  	},
>  	{
>  		.name = "throttle.read_iops_device",
>  		.private = offsetof(struct throtl_grp, iops[READ]),
> -		.read_seq_string = tg_print_conf_uint,
> +		.seq_show = tg_print_conf_uint,
>  		.write_string = tg_set_conf_uint,
>  		.max_write_len = 256,
>  	},
>  	{
>  		.name = "throttle.write_iops_device",
>  		.private = offsetof(struct throtl_grp, iops[WRITE]),
> -		.read_seq_string = tg_print_conf_uint,
> +		.seq_show = tg_print_conf_uint,
>  		.write_string = tg_set_conf_uint,
>  		.max_write_len = 256,
>  	},
>  	{
>  		.name = "throttle.io_service_bytes",
>  		.private = offsetof(struct tg_stats_cpu, service_bytes),
> -		.read_seq_string = tg_print_cpu_rwstat,
> +		.seq_show = tg_print_cpu_rwstat,
>  	},
>  	{
>  		.name = "throttle.io_serviced",
>  		.private = offsetof(struct tg_stats_cpu, serviced),
> -		.read_seq_string = tg_print_cpu_rwstat,
> +		.seq_show = tg_print_cpu_rwstat,
>  	},
>  	{ }	/* terminate */
>  };
> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> index 4d5cec1..744833b 100644
> --- a/block/cfq-iosched.c
> +++ b/block/cfq-iosched.c
> @@ -1632,11 +1632,11 @@ static u64 cfqg_prfill_weight_device(struct seq_file *sf,
>  	return __blkg_prfill_u64(sf, pd, cfqg->dev_weight);
>  }
>  
> -static int cfqg_print_weight_device(struct cgroup_subsys_state *css,
> -				    struct cftype *cft, struct seq_file *sf)
> +static int cfqg_print_weight_device(struct seq_file *sf, void *v)
>  {
> -	blkcg_print_blkgs(sf, css_to_blkcg(css), cfqg_prfill_weight_device,
> -			  &blkcg_policy_cfq, 0, false);
> +	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
> +			  cfqg_prfill_weight_device, &blkcg_policy_cfq,
> +			  0, false);
>  	return 0;
>  }
>  
> @@ -1650,26 +1650,23 @@ static u64 cfqg_prfill_leaf_weight_device(struct seq_file *sf,
>  	return __blkg_prfill_u64(sf, pd, cfqg->dev_leaf_weight);
>  }
>  
> -static int cfqg_print_leaf_weight_device(struct cgroup_subsys_state *css,
> -					 struct cftype *cft,
> -					 struct seq_file *sf)
> +static int cfqg_print_leaf_weight_device(struct seq_file *sf, void *v)
>  {
> -	blkcg_print_blkgs(sf, css_to_blkcg(css), cfqg_prfill_leaf_weight_device,
> -			  &blkcg_policy_cfq, 0, false);
> +	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
> +			  cfqg_prfill_leaf_weight_device, &blkcg_policy_cfq,
> +			  0, false);
>  	return 0;
>  }
>  
> -static int cfq_print_weight(struct cgroup_subsys_state *css, struct cftype *cft,
> -			    struct seq_file *sf)
> +static int cfq_print_weight(struct seq_file *sf, void *v)
>  {
> -	seq_printf(sf, "%u\n", css_to_blkcg(css)->cfq_weight);
> +	seq_printf(sf, "%u\n", css_to_blkcg(seq_css(sf))->cfq_weight);
>  	return 0;
>  }
>  
> -static int cfq_print_leaf_weight(struct cgroup_subsys_state *css,
> -				 struct cftype *cft, struct seq_file *sf)
> +static int cfq_print_leaf_weight(struct seq_file *sf, void *v)
>  {
> -	seq_printf(sf, "%u\n", css_to_blkcg(css)->cfq_leaf_weight);
> +	seq_printf(sf, "%u\n", css_to_blkcg(seq_css(sf))->cfq_leaf_weight);
>  	return 0;
>  }
>  
> @@ -1762,23 +1759,17 @@ static int cfq_set_leaf_weight(struct cgroup_subsys_state *css,
>  	return __cfq_set_weight(css, cft, val, true);
>  }
>  
> -static int cfqg_print_stat(struct cgroup_subsys_state *css, struct cftype *cft,
> -			   struct seq_file *sf)
> +static int cfqg_print_stat(struct seq_file *sf, void *v)
>  {
> -	struct blkcg *blkcg = css_to_blkcg(css);
> -
> -	blkcg_print_blkgs(sf, blkcg, blkg_prfill_stat, &blkcg_policy_cfq,
> -			  cft->private, false);
> +	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)), blkg_prfill_stat,
> +			  &blkcg_policy_cfq, seq_cft(sf)->private, false);
>  	return 0;
>  }
>  
> -static int cfqg_print_rwstat(struct cgroup_subsys_state *css,
> -			     struct cftype *cft, struct seq_file *sf)
> +static int cfqg_print_rwstat(struct seq_file *sf, void *v)
>  {
> -	struct blkcg *blkcg = css_to_blkcg(css);
> -
> -	blkcg_print_blkgs(sf, blkcg, blkg_prfill_rwstat, &blkcg_policy_cfq,
> -			  cft->private, true);
> +	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)), blkg_prfill_rwstat,
> +			  &blkcg_policy_cfq, seq_cft(sf)->private, true);
>  	return 0;
>  }
>  
> @@ -1798,23 +1789,19 @@ static u64 cfqg_prfill_rwstat_recursive(struct seq_file *sf,
>  	return __blkg_prfill_rwstat(sf, pd, &sum);
>  }
>  
> -static int cfqg_print_stat_recursive(struct cgroup_subsys_state *css,
> -				     struct cftype *cft, struct seq_file *sf)
> +static int cfqg_print_stat_recursive(struct seq_file *sf, void *v)
>  {
> -	struct blkcg *blkcg = css_to_blkcg(css);
> -
> -	blkcg_print_blkgs(sf, blkcg, cfqg_prfill_stat_recursive,
> -			  &blkcg_policy_cfq, cft->private, false);
> +	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
> +			  cfqg_prfill_stat_recursive, &blkcg_policy_cfq,
> +			  seq_cft(sf)->private, false);
>  	return 0;
>  }
>  
> -static int cfqg_print_rwstat_recursive(struct cgroup_subsys_state *css,
> -				       struct cftype *cft, struct seq_file *sf)
> +static int cfqg_print_rwstat_recursive(struct seq_file *sf, void *v)
>  {
> -	struct blkcg *blkcg = css_to_blkcg(css);
> -
> -	blkcg_print_blkgs(sf, blkcg, cfqg_prfill_rwstat_recursive,
> -			  &blkcg_policy_cfq, cft->private, true);
> +	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
> +			  cfqg_prfill_rwstat_recursive, &blkcg_policy_cfq,
> +			  seq_cft(sf)->private, true);
>  	return 0;
>  }
>  
> @@ -1835,13 +1822,11 @@ static u64 cfqg_prfill_avg_queue_size(struct seq_file *sf,
>  }
>  
>  /* print avg_queue_size */
> -static int cfqg_print_avg_queue_size(struct cgroup_subsys_state *css,
> -				     struct cftype *cft, struct seq_file *sf)
> +static int cfqg_print_avg_queue_size(struct seq_file *sf, void *v)
>  {
> -	struct blkcg *blkcg = css_to_blkcg(css);
> -
> -	blkcg_print_blkgs(sf, blkcg, cfqg_prfill_avg_queue_size,
> -			  &blkcg_policy_cfq, 0, false);
> +	blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)),
> +			  cfqg_prfill_avg_queue_size, &blkcg_policy_cfq,
> +			  0, false);
>  	return 0;
>  }
>  #endif	/* CONFIG_DEBUG_BLK_CGROUP */
> @@ -1851,14 +1836,14 @@ static struct cftype cfq_blkcg_files[] = {
>  	{
>  		.name = "weight_device",
>  		.flags = CFTYPE_ONLY_ON_ROOT,
> -		.read_seq_string = cfqg_print_leaf_weight_device,
> +		.seq_show = cfqg_print_leaf_weight_device,
>  		.write_string = cfqg_set_leaf_weight_device,
>  		.max_write_len = 256,
>  	},
>  	{
>  		.name = "weight",
>  		.flags = CFTYPE_ONLY_ON_ROOT,
> -		.read_seq_string = cfq_print_leaf_weight,
> +		.seq_show = cfq_print_leaf_weight,
>  		.write_u64 = cfq_set_leaf_weight,
>  	},
>  
> @@ -1866,26 +1851,26 @@ static struct cftype cfq_blkcg_files[] = {
>  	{
>  		.name = "weight_device",
>  		.flags = CFTYPE_NOT_ON_ROOT,
> -		.read_seq_string = cfqg_print_weight_device,
> +		.seq_show = cfqg_print_weight_device,
>  		.write_string = cfqg_set_weight_device,
>  		.max_write_len = 256,
>  	},
>  	{
>  		.name = "weight",
>  		.flags = CFTYPE_NOT_ON_ROOT,
> -		.read_seq_string = cfq_print_weight,
> +		.seq_show = cfq_print_weight,
>  		.write_u64 = cfq_set_weight,
>  	},
>  
>  	{
>  		.name = "leaf_weight_device",
> -		.read_seq_string = cfqg_print_leaf_weight_device,
> +		.seq_show = cfqg_print_leaf_weight_device,
>  		.write_string = cfqg_set_leaf_weight_device,
>  		.max_write_len = 256,
>  	},
>  	{
>  		.name = "leaf_weight",
> -		.read_seq_string = cfq_print_leaf_weight,
> +		.seq_show = cfq_print_leaf_weight,
>  		.write_u64 = cfq_set_leaf_weight,
>  	},
>  
> @@ -1893,114 +1878,114 @@ static struct cftype cfq_blkcg_files[] = {
>  	{
>  		.name = "time",
>  		.private = offsetof(struct cfq_group, stats.time),
> -		.read_seq_string = cfqg_print_stat,
> +		.seq_show = cfqg_print_stat,
>  	},
>  	{
>  		.name = "sectors",
>  		.private = offsetof(struct cfq_group, stats.sectors),
> -		.read_seq_string = cfqg_print_stat,
> +		.seq_show = cfqg_print_stat,
>  	},
>  	{
>  		.name = "io_service_bytes",
>  		.private = offsetof(struct cfq_group, stats.service_bytes),
> -		.read_seq_string = cfqg_print_rwstat,
> +		.seq_show = cfqg_print_rwstat,
>  	},
>  	{
>  		.name = "io_serviced",
>  		.private = offsetof(struct cfq_group, stats.serviced),
> -		.read_seq_string = cfqg_print_rwstat,
> +		.seq_show = cfqg_print_rwstat,
>  	},
>  	{
>  		.name = "io_service_time",
>  		.private = offsetof(struct cfq_group, stats.service_time),
> -		.read_seq_string = cfqg_print_rwstat,
> +		.seq_show = cfqg_print_rwstat,
>  	},
>  	{
>  		.name = "io_wait_time",
>  		.private = offsetof(struct cfq_group, stats.wait_time),
> -		.read_seq_string = cfqg_print_rwstat,
> +		.seq_show = cfqg_print_rwstat,
>  	},
>  	{
>  		.name = "io_merged",
>  		.private = offsetof(struct cfq_group, stats.merged),
> -		.read_seq_string = cfqg_print_rwstat,
> +		.seq_show = cfqg_print_rwstat,
>  	},
>  	{
>  		.name = "io_queued",
>  		.private = offsetof(struct cfq_group, stats.queued),
> -		.read_seq_string = cfqg_print_rwstat,
> +		.seq_show = cfqg_print_rwstat,
>  	},
>  
>  	/* the same statictics which cover the cfqg and its descendants */
>  	{
>  		.name = "time_recursive",
>  		.private = offsetof(struct cfq_group, stats.time),
> -		.read_seq_string = cfqg_print_stat_recursive,
> +		.seq_show = cfqg_print_stat_recursive,
>  	},
>  	{
>  		.name = "sectors_recursive",
>  		.private = offsetof(struct cfq_group, stats.sectors),
> -		.read_seq_string = cfqg_print_stat_recursive,
> +		.seq_show = cfqg_print_stat_recursive,
>  	},
>  	{
>  		.name = "io_service_bytes_recursive",
>  		.private = offsetof(struct cfq_group, stats.service_bytes),
> -		.read_seq_string = cfqg_print_rwstat_recursive,
> +		.seq_show = cfqg_print_rwstat_recursive,
>  	},
>  	{
>  		.name = "io_serviced_recursive",
>  		.private = offsetof(struct cfq_group, stats.serviced),
> -		.read_seq_string = cfqg_print_rwstat_recursive,
> +		.seq_show = cfqg_print_rwstat_recursive,
>  	},
>  	{
>  		.name = "io_service_time_recursive",
>  		.private = offsetof(struct cfq_group, stats.service_time),
> -		.read_seq_string = cfqg_print_rwstat_recursive,
> +		.seq_show = cfqg_print_rwstat_recursive,
>  	},
>  	{
>  		.name = "io_wait_time_recursive",
>  		.private = offsetof(struct cfq_group, stats.wait_time),
> -		.read_seq_string = cfqg_print_rwstat_recursive,
> +		.seq_show = cfqg_print_rwstat_recursive,
>  	},
>  	{
>  		.name = "io_merged_recursive",
>  		.private = offsetof(struct cfq_group, stats.merged),
> -		.read_seq_string = cfqg_print_rwstat_recursive,
> +		.seq_show = cfqg_print_rwstat_recursive,
>  	},
>  	{
>  		.name = "io_queued_recursive",
>  		.private = offsetof(struct cfq_group, stats.queued),
> -		.read_seq_string = cfqg_print_rwstat_recursive,
> +		.seq_show = cfqg_print_rwstat_recursive,
>  	},
>  #ifdef CONFIG_DEBUG_BLK_CGROUP
>  	{
>  		.name = "avg_queue_size",
> -		.read_seq_string = cfqg_print_avg_queue_size,
> +		.seq_show = cfqg_print_avg_queue_size,
>  	},
>  	{
>  		.name = "group_wait_time",
>  		.private = offsetof(struct cfq_group, stats.group_wait_time),
> -		.read_seq_string = cfqg_print_stat,
> +		.seq_show = cfqg_print_stat,
>  	},
>  	{
>  		.name = "idle_time",
>  		.private = offsetof(struct cfq_group, stats.idle_time),
> -		.read_seq_string = cfqg_print_stat,
> +		.seq_show = cfqg_print_stat,
>  	},
>  	{
>  		.name = "empty_time",
>  		.private = offsetof(struct cfq_group, stats.empty_time),
> -		.read_seq_string = cfqg_print_stat,
> +		.seq_show = cfqg_print_stat,
>  	},
>  	{
>  		.name = "dequeue",
>  		.private = offsetof(struct cfq_group, stats.dequeue),
> -		.read_seq_string = cfqg_print_stat,
> +		.seq_show = cfqg_print_stat,
>  	},
>  	{
>  		.name = "unaccounted_time",
>  		.private = offsetof(struct cfq_group, stats.unaccounted_time),
> -		.read_seq_string = cfqg_print_stat,
> +		.seq_show = cfqg_print_stat,
>  	},
>  #endif	/* CONFIG_DEBUG_BLK_CGROUP */
>  	{ }	/* terminate */
> diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
> index c3d698a..b32a0f8 100644
> --- a/include/linux/cgroup.h
> +++ b/include/linux/cgroup.h
> @@ -444,12 +444,9 @@ struct cftype {
>  	 * read_s64() is a signed version of read_u64()
>  	 */
>  	s64 (*read_s64)(struct cgroup_subsys_state *css, struct cftype *cft);
> -	/*
> -	 * read_seq_string() is used for outputting a simple sequence
> -	 * using seqfile.
> -	 */
> -	int (*read_seq_string)(struct cgroup_subsys_state *css,
> -			       struct cftype *cft, struct seq_file *m);
> +
> +	/* generic seq_file read interface */
> +	int (*seq_show)(struct seq_file *sf, void *v);
>  
>  	/*
>  	 * write_u64() is a shortcut for the common case of accepting
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 1189de7..39fdf75 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -2212,10 +2212,9 @@ static int cgroup_release_agent_write(struct cgroup_subsys_state *css,
>  	return 0;
>  }
>  
> -static int cgroup_release_agent_show(struct cgroup_subsys_state *css,
> -				     struct cftype *cft, struct seq_file *seq)
> +static int cgroup_release_agent_show(struct seq_file *seq, void *v)
>  {
> -	struct cgroup *cgrp = css->cgroup;
> +	struct cgroup *cgrp = seq_css(seq)->cgroup;
>  
>  	if (!cgroup_lock_live_group(cgrp))
>  		return -ENODEV;
> @@ -2225,10 +2224,11 @@ static int cgroup_release_agent_show(struct cgroup_subsys_state *css,
>  	return 0;
>  }
>  
> -static int cgroup_sane_behavior_show(struct cgroup_subsys_state *css,
> -				     struct cftype *cft, struct seq_file *seq)
> +static int cgroup_sane_behavior_show(struct seq_file *seq, void *v)
>  {
> -	seq_printf(seq, "%d\n", cgroup_sane_behavior(css->cgroup));
> +	struct cgroup *cgrp = seq_css(seq)->cgroup;
> +
> +	seq_printf(seq, "%d\n", cgroup_sane_behavior(cgrp));
>  	return 0;
>  }
>  
> @@ -2291,8 +2291,8 @@ static int cgroup_seqfile_show(struct seq_file *m, void *arg)
>  	struct cftype *cft = seq_cft(m);
>  	struct cgroup_subsys_state *css = seq_css(m);
>  
> -	if (cft->read_seq_string)
> -		return cft->read_seq_string(css, cft, m);
> +	if (cft->seq_show)
> +		return cft->seq_show(m, arg);
>  
>  	if (cft->read_u64)
>  		seq_printf(m, "%llu\n", cft->read_u64(css, cft));
> @@ -2556,7 +2556,7 @@ static umode_t cgroup_file_mode(const struct cftype *cft)
>  	if (cft->mode)
>  		return cft->mode;
>  
> -	if (cft->read_u64 || cft->read_s64 || cft->read_seq_string)
> +	if (cft->read_u64 || cft->read_s64 || cft->seq_show)
>  		mode |= S_IRUGO;
>  
>  	if (cft->write_u64 || cft->write_s64 || cft->write_string ||
> @@ -3873,7 +3873,7 @@ static struct cftype cgroup_base_files[] = {
>  	{
>  		.name = "cgroup.sane_behavior",
>  		.flags = CFTYPE_ONLY_ON_ROOT,
> -		.read_seq_string = cgroup_sane_behavior_show,
> +		.seq_show = cgroup_sane_behavior_show,
>  	},
>  
>  	/*
> @@ -3898,7 +3898,7 @@ static struct cftype cgroup_base_files[] = {
>  	{
>  		.name = "release_agent",
>  		.flags = CFTYPE_INSANE | CFTYPE_ONLY_ON_ROOT,
> -		.read_seq_string = cgroup_release_agent_show,
> +		.seq_show = cgroup_release_agent_show,
>  		.write_string = cgroup_release_agent_write,
>  		.max_write_len = PATH_MAX,
>  	},
> @@ -5273,9 +5273,7 @@ static u64 current_css_set_refcount_read(struct cgroup_subsys_state *css,
>  	return count;
>  }
>  
> -static int current_css_set_cg_links_read(struct cgroup_subsys_state *css,
> -					 struct cftype *cft,
> -					 struct seq_file *seq)
> +static int current_css_set_cg_links_read(struct seq_file *seq, void *v)
>  {
>  	struct cgrp_cset_link *link;
>  	struct css_set *cset;
> @@ -5300,9 +5298,9 @@ static int current_css_set_cg_links_read(struct cgroup_subsys_state *css,
>  }
>  
>  #define MAX_TASKS_SHOWN_PER_CSS 25
> -static int cgroup_css_links_read(struct cgroup_subsys_state *css,
> -				 struct cftype *cft, struct seq_file *seq)
> +static int cgroup_css_links_read(struct seq_file *seq, void *v)
>  {
> +	struct cgroup_subsys_state *css = seq_css(seq);
>  	struct cgrp_cset_link *link;
>  
>  	read_lock(&css_set_lock);
> @@ -5348,12 +5346,12 @@ static struct cftype debug_files[] =  {
>  
>  	{
>  		.name = "current_css_set_cg_links",
> -		.read_seq_string = current_css_set_cg_links_read,
> +		.seq_show = current_css_set_cg_links_read,
>  	},
>  
>  	{
>  		.name = "cgroup_css_links",
> -		.read_seq_string = cgroup_css_links_read,
> +		.seq_show = cgroup_css_links_read,
>  	},
>  
>  	{
> diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
> index f0ff64d..6c3154e 100644
> --- a/kernel/cgroup_freezer.c
> +++ b/kernel/cgroup_freezer.c
> @@ -301,10 +301,9 @@ out_unlock:
>  	spin_unlock_irq(&freezer->lock);
>  }
>  
> -static int freezer_read(struct cgroup_subsys_state *css, struct cftype *cft,
> -			struct seq_file *m)
> +static int freezer_read(struct seq_file *m, void *v)
>  {
> -	struct cgroup_subsys_state *pos;
> +	struct cgroup_subsys_state *css = seq_css(m), *pos;
>  
>  	rcu_read_lock();
>  
> @@ -458,7 +457,7 @@ static struct cftype files[] = {
>  	{
>  		.name = "state",
>  		.flags = CFTYPE_NOT_ON_ROOT,
> -		.read_seq_string = freezer_read,
> +		.seq_show = freezer_read,
>  		.write_string = freezer_write,
>  	},
>  	{
> diff --git a/kernel/cpuset.c b/kernel/cpuset.c
> index 032929f..4410ac6 100644
> --- a/kernel/cpuset.c
> +++ b/kernel/cpuset.c
> @@ -1732,12 +1732,10 @@ out_unlock:
>   * and since these maps can change value dynamically, one could read
>   * gibberish by doing partial reads while a list was changing.
>   */
> -static int cpuset_common_read_seq_string(struct cgroup_subsys_state *css,
> -					 struct cftype *cft,
> -					 struct seq_file *sf)
> +static int cpuset_common_seq_show(struct seq_file *sf, void *v)
>  {
> -	struct cpuset *cs = css_cs(css);
> -	cpuset_filetype_t type = cft->private;
> +	struct cpuset *cs = css_cs(seq_css(sf));
> +	cpuset_filetype_t type = seq_cft(sf)->private;
>  	ssize_t count;
>  	char *buf, *s;
>  	int ret = 0;
> @@ -1824,7 +1822,7 @@ static s64 cpuset_read_s64(struct cgroup_subsys_state *css, struct cftype *cft)
>  static struct cftype files[] = {
>  	{
>  		.name = "cpus",
> -		.read_seq_string = cpuset_common_read_seq_string,
> +		.seq_show = cpuset_common_seq_show,
>  		.write_string = cpuset_write_resmask,
>  		.max_write_len = (100U + 6 * NR_CPUS),
>  		.private = FILE_CPULIST,
> @@ -1832,7 +1830,7 @@ static struct cftype files[] = {
>  
>  	{
>  		.name = "mems",
> -		.read_seq_string = cpuset_common_read_seq_string,
> +		.seq_show = cpuset_common_seq_show,
>  		.write_string = cpuset_write_resmask,
>  		.max_write_len = (100U + 6 * MAX_NUMNODES),
>  		.private = FILE_MEMLIST,
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index f28ec67..7e8cbb9 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -7256,10 +7256,9 @@ static int __cfs_schedulable(struct task_group *tg, u64 period, u64 quota)
>  	return ret;
>  }
>  
> -static int cpu_stats_show(struct cgroup_subsys_state *css, struct cftype *cft,
> -			  struct seq_file *sf)
> +static int cpu_stats_show(struct seq_file *sf, void *v)
>  {
> -	struct task_group *tg = css_tg(css);
> +	struct task_group *tg = css_tg(seq_css(sf));
>  	struct cfs_bandwidth *cfs_b = &tg->cfs_bandwidth;
>  
>  	seq_printf(sf, "nr_periods %d\n", cfs_b->nr_periods);
> @@ -7318,7 +7317,7 @@ static struct cftype cpu_files[] = {
>  	},
>  	{
>  		.name = "stat",
> -		.read_seq_string = cpu_stats_show,
> +		.seq_show = cpu_stats_show,
>  	},
>  #endif
>  #ifdef CONFIG_RT_GROUP_SCHED
> diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c
> index dd88738..622e081 100644
> --- a/kernel/sched/cpuacct.c
> +++ b/kernel/sched/cpuacct.c
> @@ -163,10 +163,9 @@ out:
>  	return err;
>  }
>  
> -static int cpuacct_percpu_seq_read(struct cgroup_subsys_state *css,
> -				   struct cftype *cft, struct seq_file *m)
> +static int cpuacct_percpu_seq_show(struct seq_file *m, void *V)
>  {
> -	struct cpuacct *ca = css_ca(css);
> +	struct cpuacct *ca = css_ca(seq_css(m));
>  	u64 percpu;
>  	int i;
>  
> @@ -183,10 +182,9 @@ static const char * const cpuacct_stat_desc[] = {
>  	[CPUACCT_STAT_SYSTEM] = "system",
>  };
>  
> -static int cpuacct_stats_show(struct cgroup_subsys_state *css,
> -			      struct cftype *cft, struct seq_file *sf)
> +static int cpuacct_stats_show(struct seq_file *sf, void *v)
>  {
> -	struct cpuacct *ca = css_ca(css);
> +	struct cpuacct *ca = css_ca(seq_css(sf));
>  	int cpu;
>  	s64 val = 0;
>  
> @@ -220,11 +218,11 @@ static struct cftype files[] = {
>  	},
>  	{
>  		.name = "usage_percpu",
> -		.read_seq_string = cpuacct_percpu_seq_read,
> +		.seq_show = cpuacct_percpu_seq_show,
>  	},
>  	{
>  		.name = "stat",
> -		.read_seq_string = cpuacct_stats_show,
> +		.seq_show = cpuacct_stats_show,
>  	},
>  	{ }	/* terminate */
>  };
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index f149521..9252219 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -3014,10 +3014,9 @@ static struct kmem_cache *memcg_params_to_cache(struct memcg_cache_params *p)
>  }
>  
>  #ifdef CONFIG_SLABINFO
> -static int mem_cgroup_slabinfo_read(struct cgroup_subsys_state *css,
> -				    struct cftype *cft, struct seq_file *m)
> +static int mem_cgroup_slabinfo_read(struct seq_file *m, void *v)
>  {
> -	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(seq_css(m));
>  	struct memcg_cache_params *params;
>  
>  	if (!memcg_can_account_kmem(memcg))
> @@ -5418,8 +5417,7 @@ static int mem_cgroup_move_charge_write(struct cgroup_subsys_state *css,
>  #endif
>  
>  #ifdef CONFIG_NUMA
> -static int memcg_numa_stat_show(struct cgroup_subsys_state *css,
> -				struct cftype *cft, struct seq_file *m)
> +static int memcg_numa_stat_show(struct seq_file *m, void *v)
>  {
>  	struct numa_stat {
>  		const char *name;
> @@ -5435,7 +5433,7 @@ static int memcg_numa_stat_show(struct cgroup_subsys_state *css,
>  	const struct numa_stat *stat;
>  	int nid;
>  	unsigned long nr;
> -	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(seq_css(m));
>  
>  	for (stat = stats; stat < stats + ARRAY_SIZE(stats); stat++) {
>  		nr = mem_cgroup_nr_lru_pages(memcg, stat->lru_mask);
> @@ -5474,10 +5472,9 @@ static inline void mem_cgroup_lru_names_not_uptodate(void)
>  	BUILD_BUG_ON(ARRAY_SIZE(mem_cgroup_lru_names) != NR_LRU_LISTS);
>  }
>  
> -static int memcg_stat_show(struct cgroup_subsys_state *css, struct cftype *cft,
> -				 struct seq_file *m)
> +static int memcg_stat_show(struct seq_file *m, void *v)
>  {
> -	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(seq_css(m));
>  	struct mem_cgroup *mi;
>  	unsigned int i;
>  
> @@ -5907,10 +5904,9 @@ static void mem_cgroup_oom_unregister_event(struct mem_cgroup *memcg,
>  	spin_unlock(&memcg_oom_lock);
>  }
>  
> -static int mem_cgroup_oom_control_read(struct cgroup_subsys_state *css,
> -				       struct cftype *cft, struct seq_file *sf)
> +static int mem_cgroup_oom_control_read(struct seq_file *sf, void *v)
>  {
> -	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(seq_css(sf));
>  
>  	seq_printf(sf, "oom_kill_disable %d\n", memcg->oom_kill_disable);
>  	seq_printf(sf, "under_oom %d\n", (bool)atomic_read(&memcg->under_oom));
> @@ -6260,7 +6256,7 @@ static struct cftype mem_cgroup_files[] = {
>  	},
>  	{
>  		.name = "stat",
> -		.read_seq_string = memcg_stat_show,
> +		.seq_show = memcg_stat_show,
>  	},
>  	{
>  		.name = "force_empty",
> @@ -6290,7 +6286,7 @@ static struct cftype mem_cgroup_files[] = {
>  	},
>  	{
>  		.name = "oom_control",
> -		.read_seq_string = mem_cgroup_oom_control_read,
> +		.seq_show = mem_cgroup_oom_control_read,
>  		.write_u64 = mem_cgroup_oom_control_write,
>  		.private = MEMFILE_PRIVATE(_OOM_TYPE, OOM_CONTROL),
>  	},
> @@ -6300,7 +6296,7 @@ static struct cftype mem_cgroup_files[] = {
>  #ifdef CONFIG_NUMA
>  	{
>  		.name = "numa_stat",
> -		.read_seq_string = memcg_numa_stat_show,
> +		.seq_show = memcg_numa_stat_show,
>  	},
>  #endif
>  #ifdef CONFIG_MEMCG_KMEM
> @@ -6330,7 +6326,7 @@ static struct cftype mem_cgroup_files[] = {
>  #ifdef CONFIG_SLABINFO
>  	{
>  		.name = "kmem.slabinfo",
> -		.read_seq_string = mem_cgroup_slabinfo_read,
> +		.seq_show = mem_cgroup_slabinfo_read,
>  	},
>  #endif
>  #endif
> diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
> index 498710d..56cbb69 100644
> --- a/net/core/netprio_cgroup.c
> +++ b/net/core/netprio_cgroup.c
> @@ -173,14 +173,14 @@ static u64 read_prioidx(struct cgroup_subsys_state *css, struct cftype *cft)
>  	return css->cgroup->id;
>  }
>  
> -static int read_priomap(struct cgroup_subsys_state *css, struct cftype *cft,
> -			struct seq_file *sf)
> +static int read_priomap(struct seq_file *sf, void *v)
>  {
>  	struct net_device *dev;
>  
>  	rcu_read_lock();
>  	for_each_netdev_rcu(&init_net, dev)
> -		seq_printf(sf, "%s %u\n", dev->name, netprio_prio(css, dev));
> +		seq_printf(sf, "%s %u\n", dev->name,
> +			   netprio_prio(seq_css(sf), dev));
>  	rcu_read_unlock();
>  	return 0;
>  }
> @@ -238,7 +238,7 @@ static struct cftype ss_files[] = {
>  	},
>  	{
>  		.name = "ifpriomap",
> -		.read_seq_string = read_priomap,
> +		.seq_show = read_priomap,
>  		.write_string = write_priomap,
>  	},
>  	{ }	/* terminate */
> diff --git a/security/device_cgroup.c b/security/device_cgroup.c
> index 7c2a0a7..d3b6d2c 100644
> --- a/security/device_cgroup.c
> +++ b/security/device_cgroup.c
> @@ -274,10 +274,9 @@ static void set_majmin(char *str, unsigned m)
>  		sprintf(str, "%u", m);
>  }
>  
> -static int devcgroup_seq_read(struct cgroup_subsys_state *css,
> -			      struct cftype *cft, struct seq_file *m)
> +static int devcgroup_seq_show(struct seq_file *m, void *v)
>  {
> -	struct dev_cgroup *devcgroup = css_to_devcgroup(css);
> +	struct dev_cgroup *devcgroup = css_to_devcgroup(seq_css(m));
>  	struct dev_exception_item *ex;
>  	char maj[MAJMINLEN], min[MAJMINLEN], acc[ACCLEN];
>  
> @@ -679,7 +678,7 @@ static struct cftype dev_cgroup_files[] = {
>  	},
>  	{
>  		.name = "list",
> -		.read_seq_string = devcgroup_seq_read,
> +		.seq_show = devcgroup_seq_show,
>  		.private = DEVCG_LIST,
>  	},
>  	{ }	/* terminate */
> -- 
> 1.8.4.2
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 04/12] netprio_cgroup: convert away from cftype->read_map()
       [not found]     ` <1385595759-17656-5-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2013-11-29  1:56       ` Neil Horman
  2013-11-29  8:52       ` Daniel Wagner
  1 sibling, 0 replies; 38+ messages in thread
From: Neil Horman @ 2013-11-29  1:56 UTC (permalink / raw)
  To: Tejun Heo
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w, mhocko-AlSwsSmVLrQ,
	hannes-druUgvl0LCNAfugRpC6u6w, cgroups-u79uwXL29TY76Z2rM5mHXA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA

On Wed, Nov 27, 2013 at 06:42:31PM -0500, Tejun Heo wrote:
> In preparation of conversion to kernfs, cgroup file handling is being
> consolidated so that it can be easily mapped to the seq_file based
> interface of kernfs.
> 
> cftype->read_map() doesn't add any value and is being replaced with
> ->read_seq_string().  Update read_priomap() to use ->read_seq_string()
> instead.
> 
> This patch doesn't make any visible behavior changes.
> 
> Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Cc: Neil Horman <nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>
> Cc: Daniel Wagner <daniel.wagner-98C5kh4wR6ohFhg+JK9F0w@public.gmane.org>
Acked-by: Neil Horman <nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>


* Re: [PATCH 04/12] netprio_cgroup: convert away from cftype->read_map()
       [not found]     ` <1385595759-17656-5-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2013-11-29  1:56       ` Neil Horman
@ 2013-11-29  8:52       ` Daniel Wagner
  1 sibling, 0 replies; 38+ messages in thread
From: Daniel Wagner @ 2013-11-29  8:52 UTC (permalink / raw)
  To: Tejun Heo, lizefan-hv44wF8Li93QT0dZR+AlfA
  Cc: nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	mhocko-AlSwsSmVLrQ, mingo-H+wXaHxf7aLQT0dZR+AlfA,
	hannes-druUgvl0LCNAfugRpC6u6w, cgroups-u79uwXL29TY76Z2rM5mHXA,
	vgoyal-H+wXaHxf7aLQT0dZR+AlfA

Hi Tejun,

On 11/28/2013 12:42 AM, Tejun Heo wrote:
> In preparation of conversion to kernfs, cgroup file handling is being
> consolidated so that it can be easily mapped to the seq_file based
> interface of kernfs.
>
> cftype->read_map() doesn't add any value and is being replaced with
> ->read_seq_string().  Update read_priomap() to use ->read_seq_string()
> instead.
>
> This patch doesn't make any visible behavior changes.
>
> Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Cc: Neil Horman <nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>
> Cc: Daniel Wagner <daniel.wagner-98C5kh4wR6ohFhg+JK9F0w@public.gmane.org>

Acked-by: Daniel Wagner <daniel.wagner-98C5kh4wR6ohFhg+JK9F0w@public.gmane.org>

Forgot to ack this patch. Sorry.

cheers,
daniel


* Re: [PATCH 07/12] cgroup: unify cgroup_write_X64() and cgroup_write_string()
       [not found]         ` <20131128111818.GG2761-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
@ 2013-11-29 20:05           ` Tejun Heo
       [not found]             ` <20131129200525.GC21755-9pTldWuhBndy/B6EtB590w@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: Tejun Heo @ 2013-11-29 20:05 UTC (permalink / raw)
  To: Michal Hocko
  Cc: nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	hannes-druUgvl0LCNAfugRpC6u6w, cgroups-u79uwXL29TY76Z2rM5mHXA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA

Hello, Michal.

On Thu, Nov 28, 2013 at 12:18:18PM +0100, Michal Hocko wrote:
> > Unify the two into cgroup_file_write() which always allocates dynamic
> > buffer for simplicity
> 
> It's true that this is simpler but the allocation might cause some
> issues with memcg. Although it is not common that userspace oom handler
> is a part of the target memcg there are users (e.g. Google) who do that.
> 
> Why is that a problem? Consider that a memcg is under OOM, handler gets
> notified and tries to solve the situation by enlarging the hard limit.
> This worked before this patch because cgroup_write_X64 used an on stack
> buffer but now it would use kmalloc which might be accounted and trip
> over the same OOM situation.
> 
> This is not limited to the OOM handling. The group might be close to OOM
> and the last thing user expects is to trigger OOM when he tries to
> enlarge the limit.
> 
> Is the additional simplicity worth breaking those usecases?

Whoa, so we support oom handler inside the memcg that it handles?
Does that work reliably?  Changing the above detail in this patch
isn't difficult (and we'll later need to update kernfs too) but
supporting such setup properly would be a *lot* of commitment and I'm
very doubtful we'd be able to achieve that by just carefully avoiding
memory allocation in the operations that the userland oom handler uses -
that set is destined to expand over time, extremely fragile and will
be hellish to maintain.

So, I'm not at all excited about committing to this guarantee.  This
one is an easy one but it looks like the first step onto dizzying
slippery slope.

Am I misunderstanding something here?  Are you and Johannes firm on
supporting this?

Thanks.

-- 
tejun


* Re: [PATCH 07/12] cgroup: unify cgroup_write_X64() and cgroup_write_string()
       [not found]             ` <20131129200525.GC21755-9pTldWuhBndy/B6EtB590w@public.gmane.org>
@ 2013-12-02  9:54               ` Michal Hocko
       [not found]                 ` <20131202095401.GA18838-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
  2013-12-02 16:44               ` Johannes Weiner
  1 sibling, 1 reply; 38+ messages in thread
From: Michal Hocko @ 2013-12-02  9:54 UTC (permalink / raw)
  To: Tejun Heo
  Cc: nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	hannes-druUgvl0LCNAfugRpC6u6w, cgroups-u79uwXL29TY76Z2rM5mHXA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA

On Fri 29-11-13 15:05:25, Tejun Heo wrote:
> Hello, Michal.
> 
> On Thu, Nov 28, 2013 at 12:18:18PM +0100, Michal Hocko wrote:
> > > Unify the two into cgroup_file_write() which always allocates dynamic
> > > buffer for simplicity
> > 
> > It's true that this is simpler but the allocation might cause some
> > issues with memcg. Although it is not common that userspace oom handler
> > is a part of the target memcg there are users (e.g. Google) who do that.
> > 
> > Why is that a problem? Consider that a memcg is under OOM, handler gets
> > notified and tries to solve the situation by enlarging the hard limit.
> > This worked before this patch because cgroup_write_X64 used an on stack
> > buffer but now it would use kmalloc which might be accounted and trip
> > over the same OOM situation.
> > 
> > This is not limited to the OOM handling. The group might be close to OOM
> > and the last thing user expects is to trigger OOM when he tries to
> > enlarge the limit.
> > 
> > Is the additional simplicity worth breaking those usecases?
> 
> Whoa, so we support oom handler inside the memcg that it handles?
> Does that work reliably?  Changing the above detail in this patch
> isn't difficult (and we'll later need to update kernfs too) but
> supporting such setup properly would be a *lot* of commitment and I'm
> very doubtful we'd be able to achieve that by just carefully avoiding
> memory allocation in the operations that the userland oom handler uses -
> that set is destined to expand over time, extremely fragile and will
> be hellish to maintain.
> 
> So, I'm not at all excited about committing to this guarantee.  This
> one is an easy one but it looks like the first step onto dizzying
> slippery slope.
> 
> Am I misunderstanding something here?  Are you and Johannes firm on
> supporting this?

Well, I am not happy about the use case as well and I will always
discourage people from doing this. I was merely pointing out that the
patch will break those even though it seems trivial to not do so. That
being said I would rather see no allocation in that path but if that
doesn't seem viable to you then I will not lose any sleep over it.
-- 
Michal Hocko
SUSE Labs


* Re: [PATCH 07/12] cgroup: unify cgroup_write_X64() and cgroup_write_string()
       [not found]                 ` <20131202095401.GA18838-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
@ 2013-12-02 13:30                   ` Tejun Heo
       [not found]                     ` <20131202133059.GA3626-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: Tejun Heo @ 2013-12-02 13:30 UTC (permalink / raw)
  To: Michal Hocko
  Cc: nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	hannes-druUgvl0LCNAfugRpC6u6w, cgroups-u79uwXL29TY76Z2rM5mHXA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA

Hello, Michal.

On Mon, Dec 02, 2013 at 10:54:01AM +0100, Michal Hocko wrote:
> Well, I am not happy about the use case as well and I will always
> discourage people from doing this. I was merely pointing out that the
> patch will break those even though it seems trivial to not do so. That
> being said I would rather see no allocation in that path but if that
> doesn't seem viable to you then I will not lose any sleep over it.

So, is it something already working reliably?  I really don't like the
implications of heading this way.  Does it also mean cgroup won't be
able to use seq_file for read paths either?

We're talking about a *lot* of extra restrictions and shackles if we
go down this path and many of them would be extremely subtle and
fragile.  If this is an one-off thing to keep the existing users
happy, I'm okay with it and will update this patch and kernfs so that
writes under certain size use on-stack buffer, but I *DO* want an
assurance that memcg isn't gonna march that way, which is likely to
bring down the rest of cgroup in the process.

Well, either that or please convince me it's something sane to
support, but you don't sound too sure either, so....

Thanks.

-- 
tejun


* Re: [PATCH 07/12] cgroup: unify cgroup_write_X64() and cgroup_write_string()
       [not found]                     ` <20131202133059.GA3626-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
@ 2013-12-02 14:12                       ` Michal Hocko
       [not found]                         ` <20131202141242.GD18838-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: Michal Hocko @ 2013-12-02 14:12 UTC (permalink / raw)
  To: Tejun Heo
  Cc: nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	hannes-druUgvl0LCNAfugRpC6u6w, cgroups-u79uwXL29TY76Z2rM5mHXA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA

On Mon 02-12-13 08:30:59, Tejun Heo wrote:
> Hello, Michal.
> 
> On Mon, Dec 02, 2013 at 10:54:01AM +0100, Michal Hocko wrote:
> > Well, I am not happy about the use case as well and I will always
> > discourage people from doing this. I was merely pointing out that the
> > patch will break those even though it seems trivial to not do so. That
> > being said I would rather see no allocation in that path but if that
> > doesn't seem viable to you then I will not lose any sleep over it.
> 
> So, is it something already working reliably?  I really don't like the
> implications of heading this way.  Does it also mean cgroup won't be
> able to use seq_file for read paths either?

One has to try hard but I guess this is doable. At least simple things
such as trying to enlarge the limit should be easy to do.

> We're talking about a *lot* of extra restrictions and shackles if we
> go down this path and many of them would be extremely subtle and
> fragile. If this is an one-off thing to keep the existing users
> happy, I'm okay with it and will update this patch and kernfs so that
> writes under certain size use on-stack buffer, but I *DO* want an
> assurance that memcg isn't gonna march that way, which is likely to
> bring down the rest of cgroup in the process.

I really do not want to add more hacks just to make this use case work.
There are some proposals for more systematic implementation (memory
reserves for oom killer etc.) but that won't interfere with the cgroup
core.
This one just looks trivial so I was thinking whether we can keep the
!allocating write as before. It is nothing I would insist on, though. So
I will leave the decision to you.

> Well, either that or please convince me it's something sane to
> support, but you don't sound too sure either, so....
> 
> Thanks.
> 
> -- 
> tejun

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 11/12] cgroup: replace cftype->read_seq_string() with cftype->seq_show()
       [not found]     ` <1385595759-17656-12-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2013-11-28  9:07       ` Daniel Wagner
  2013-11-28 11:25       ` Michal Hocko
@ 2013-12-02 14:41       ` Aristeu Rozanski
  2013-12-02 14:52       ` Vivek Goyal
  3 siblings, 0 replies; 38+ messages in thread
From: Aristeu Rozanski @ 2013-12-02 14:41 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jens Axboe, nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w, mhocko-AlSwsSmVLrQ,
	hannes-druUgvl0LCNAfugRpC6u6w, cgroups-u79uwXL29TY76Z2rM5mHXA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA

On Wed, Nov 27, 2013 at 06:42:38PM -0500, Tejun Heo wrote:
> --- a/security/device_cgroup.c
> +++ b/security/device_cgroup.c
> @@ -274,10 +274,9 @@ static void set_majmin(char *str, unsigned m)
>  		sprintf(str, "%u", m);
>  }
>  
> -static int devcgroup_seq_read(struct cgroup_subsys_state *css,
> -			      struct cftype *cft, struct seq_file *m)
> +static int devcgroup_seq_show(struct seq_file *m, void *v)
>  {
> -	struct dev_cgroup *devcgroup = css_to_devcgroup(css);
> +	struct dev_cgroup *devcgroup = css_to_devcgroup(seq_css(m));
>  	struct dev_exception_item *ex;
>  	char maj[MAJMINLEN], min[MAJMINLEN], acc[ACCLEN];
>  
> @@ -679,7 +678,7 @@ static struct cftype dev_cgroup_files[] = {
>  	},
>  	{
>  		.name = "list",
> -		.read_seq_string = devcgroup_seq_read,
> +		.seq_show = devcgroup_seq_show,
>  		.private = DEVCG_LIST,
>  	},
>  	{ }	/* terminate */

Acked-by: Aristeu Rozanski <arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

-- 
Aristeu

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 11/12] cgroup: replace cftype->read_seq_string() with cftype->seq_show()
       [not found]     ` <1385595759-17656-12-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                         ` (2 preceding siblings ...)
  2013-12-02 14:41       ` Aristeu Rozanski
@ 2013-12-02 14:52       ` Vivek Goyal
  3 siblings, 0 replies; 38+ messages in thread
From: Vivek Goyal @ 2013-12-02 14:52 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jens Axboe, nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w, mhocko-AlSwsSmVLrQ,
	hannes-druUgvl0LCNAfugRpC6u6w, cgroups-u79uwXL29TY76Z2rM5mHXA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA

On Wed, Nov 27, 2013 at 06:42:38PM -0500, Tejun Heo wrote:
> In preparation of conversion to kernfs, cgroup file handling is
> updated so that it can be easily mapped to kernfs.  This patch
> replaces cftype->read_seq_string() with cftype->seq_show() which is
> not limited to single_open() operation and will map directly to
> kernfs seq_file interface.
> 
> The conversions are mechanical.  As ->seq_show() doesn't have @css and
> @cft, the functions which make use of them are converted to use
> seq_css() and seq_cft() respectively.  On several occasions, e.g. if
> it has seq_string in its name, the function name is updated to fit the
> new method better.
> 
> This patch does not introduce any behavior changes.
> 
> Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Cc: Jens Axboe <axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>
> Cc: Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> Cc: Ingo Molnar <mingo-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> Cc: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
> Cc: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
> Cc: Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org>
> Cc: Balbir Singh <bsingharora-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
> Cc: Neil Horman <nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>
> Cc: Daniel Wagner <daniel.wagner-98C5kh4wR6ohFhg+JK9F0w@public.gmane.org>
> Cc: Aristeu Rozanski <arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> ---
>  block/blk-throttle.c      |  35 ++++++-------
>  block/cfq-iosched.c       | 131 ++++++++++++++++++++--------------------------

Looks good to me for blk-throttle and cfq-iosched part.

Acked-by: Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

Thanks
Vivek

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 07/12] cgroup: unify cgroup_write_X64() and cgroup_write_string()
       [not found]             ` <20131129200525.GC21755-9pTldWuhBndy/B6EtB590w@public.gmane.org>
  2013-12-02  9:54               ` Michal Hocko
@ 2013-12-02 16:44               ` Johannes Weiner
       [not found]                 ` <20131202164406.GP3556-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
  1 sibling, 1 reply; 38+ messages in thread
From: Johannes Weiner @ 2013-12-02 16:44 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Michal Hocko, lizefan-hv44wF8Li93QT0dZR+AlfA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, peterz-wEGCiKHe2LqWVfeAwA7xHQ,
	bsingharora-Re5JQEeQqe8AvxtiuMwx3w,
	kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A,
	nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	arozansk-H+wXaHxf7aLQT0dZR+AlfA

On Fri, Nov 29, 2013 at 03:05:25PM -0500, Tejun Heo wrote:
> Hello, Michal.
> 
> On Thu, Nov 28, 2013 at 12:18:18PM +0100, Michal Hocko wrote:
> > > Unify the two into cgroup_file_write() which always allocates dynamic
> > > buffer for simplicity
> > 
> > It's true that this is simpler but the allocation might cause some
> > issues with memcg. Although it is not common that userspace oom handler
> > is a part of the target memcg there are users (e.g. Google) who do that.
> > 
> > Why is that a problem? Consider that a memcg is under OOM, handler gets
> > notified and tries to solve the situation by enlarging the hard limit.
> > This worked before this patch because cgroup_write_X64 used an on stack
> > buffer but now it would use kmalloc which might be accounted and trip
> > over the same OOM situation.
> >
> > This is not limited to the OOM handling. The group might be close to OOM
> > and the last thing user expects is to trigger OOM when he tries to
> > enlarge the limit.
> > 
> > Is the additional simplicity worth breaking those usecases?
> 
> Whoa, so we support oom handler inside the memcg that it handles?
> Does that work reliably?  Changing the above detail in this patch
> isn't difficult (and we'll later need to update kernfs too) but
> supporting such setup properly would be a *lot* of commitment and I'm
> very doubtful we'd be able to achieve that by just carefully avoiding
> memory allocation in the operations that the userland oom handler
> uses - that set is destined to expand over time, is extremely fragile
> and will be hellish to maintain.
> 
> So, I'm not at all excited about committing to this guarantee.  This
> one is an easy one but it looks like the first step onto dizzying
> slippery slope.
> 
> Am I misunderstanding something here?  Are you and Johannes firm on
> supporting this?

Handling a memcg OOM from userspace running inside that OOM memcg is
completely crazy.  I mean, think about this for just two seconds...
Really?

I get that people are doing it right now, and if you can get away with
it for now, good for you.  But you have to be aware how crazy this is
and if it breaks you get to keep the pieces and we are not going to
accommodate this in the kernel.  Fix your crazy userspace.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 07/12] cgroup: unify cgroup_write_X64() and cgroup_write_string()
       [not found]                 ` <20131202164406.GP3556-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
@ 2013-12-03  7:35                   ` Li Zefan
  0 siblings, 0 replies; 38+ messages in thread
From: Li Zefan @ 2013-12-03  7:35 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w, Michal Hocko,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, Tejun Heo,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA

> Handling a memcg OOM from userspace running inside that OOM memcg is
> completely crazy.  I mean, think about this for just two seconds...
> Really?
> 
> I get that people are doing it right now, and if you can get away with
> it for now, good for you.  But you have to be aware how crazy this is
> and if it breaks you get to keep the pieces and we are not going to
> accommodate this in the kernel.  Fix your crazy userspace.

+1

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 07/12] cgroup: unify cgroup_write_X64() and cgroup_write_string()
       [not found]                         ` <20131202141242.GD18838-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
@ 2013-12-03 20:41                           ` Tejun Heo
       [not found]                             ` <20131203204155.GL8277-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: Tejun Heo @ 2013-12-03 20:41 UTC (permalink / raw)
  To: Michal Hocko
  Cc: nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	hannes-druUgvl0LCNAfugRpC6u6w, cgroups-u79uwXL29TY76Z2rM5mHXA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA

Hello, Michal.

On Mon, Dec 02, 2013 at 03:12:42PM +0100, Michal Hocko wrote:
> I really do not want to add more hacks just to make this use case work.
> There are some proposals for more systematic implementation (memory
> reserves for oom killer etc.) but that won't interfere with the cgroup
> core.
> This one just looks trivial so I was thinking whether we can keep the
> !allocating write as before. It is nothing I would insist on, though. So
> I will leave the decision to you.

So, I'm just gonna commit the patches as-is because I can't really see
how this is anything which can work in any reasonable way.  As it
currently stands, the userland wouldn't even be able to read any knob.
Wouldn't it at least need to do that?  Actually, the answer to that
question doesn't even matter because "no" would mean that the OOM
notification, however it's done, can't depend on the userland being
able to read *any* knob, which in turn is likely to constrain and
distort the notification mechanism itself.  These things are all
connected and this type of bad decision propagates through the whole
stack.

In case this *really* is necessary, let's please do it in a separate
patch with rationale and detailed explanation of actual usage.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 07/12] cgroup: unify cgroup_write_X64() and cgroup_write_string()
       [not found]                             ` <20131203204155.GL8277-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
@ 2013-12-03 21:04                               ` Michal Hocko
  0 siblings, 0 replies; 38+ messages in thread
From: Michal Hocko @ 2013-12-03 21:04 UTC (permalink / raw)
  To: Tejun Heo
  Cc: lizefan-hv44wF8Li93QT0dZR+AlfA,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, peterz-wEGCiKHe2LqWVfeAwA7xHQ,
	hannes-druUgvl0LCNAfugRpC6u6w, bsingharora-Re5JQEeQqe8AvxtiuMwx3w,
	kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A,
	nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	arozansk-H+wXaHxf7aLQT0dZR+AlfA

On Tue 03-12-13 15:41:55, Tejun Heo wrote:
[...]
> In case this *really* is necessary, let's please do it in a separate
> patch with rationale and detailed explanation of actual usage.

I am fine with this.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 10/12] cgroup: attach cgroup_open_file to all cgroup files
       [not found]     ` <1385595759-17656-11-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2013-12-04  6:04       ` Li Zefan
       [not found]         ` <529EC5F4.10708-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
  2013-12-04 15:09       ` [PATCH v2 " Tejun Heo
  1 sibling, 1 reply; 38+ messages in thread
From: Li Zefan @ 2013-12-04  6:04 UTC (permalink / raw)
  To: Tejun Heo
  Cc: nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w, mhocko-AlSwsSmVLrQ,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, hannes-druUgvl0LCNAfugRpC6u6w,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA

> @@ -2353,10 +2338,18 @@ static int cgroup_file_open(struct inode *inode, struct file *file)
>  	WARN_ON_ONCE(cfe->css && cfe->css != css);
>  	cfe->css = css;
>  
> -	if (cft->open)
> +	if (cft->open) {
>  		err = cft->open(inode, file);
> -	else
> -		err = single_open(file, cgroup_seqfile_show, cfe);
> +	} else {
> +		err = single_open_size(file, cgroup_seqfile_show, cfe,
> +				       sizeof(struct cgroup_open_file));

This is wrong.

I guess you wanted to use single_open_private(), which doesn't exist.

> +		if (!err) {
> +			struct seq_file *sf = file->private_data;
> +			struct cgroup_open_file *of = sf->private;
> +
> +			of->cfe = cfe;
> +		}
> +	}
>  
>  	if (css->ss && err)
>  		css_put(css);

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 12/12] cgroup: unify pidlist and other file handling
       [not found]     ` <1385595759-17656-13-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
@ 2013-12-04  6:20       ` Li Zefan
       [not found]         ` <529EC9A6.903-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
  2013-12-04 15:09       ` [PATCH v3 " Tejun Heo
  1 sibling, 1 reply; 38+ messages in thread
From: Li Zefan @ 2013-12-04  6:20 UTC (permalink / raw)
  To: Tejun Heo
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, peterz-wEGCiKHe2LqWVfeAwA7xHQ,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-AlSwsSmVLrQ,
	bsingharora-Re5JQEeQqe8AvxtiuMwx3w,
	kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A,
	nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	arozansk-H+wXaHxf7aLQT0dZR+AlfA

> +static void *cgroup_seqfile_start(struct seq_file *seq, loff_t *ppos)
> +{
> +	struct cftype *cft = seq_cft(seq);
> +
> +	if (cft->seq_start) {
> +		return cft->seq_start(seq, ppos);
> +	} else {
> +		/*
> +		 * The same behavior and code as single_open().  Returns
> +		 * !NULL if pos is at the beginning; otherwise, NULL.
> +		 */

Isn't it simpler to choose between seq_open() and single_open() depending
on whether cft->seq_start is implemented?

> +		return NULL + !*ppos;
> +	}
> +}
...
>  static int cgroup_file_open(struct inode *inode, struct file *file)
>  {
>  	struct cfent *cfe = __d_cfe(file->f_dentry);
> @@ -2338,20 +2384,17 @@ static int cgroup_file_open(struct inode *inode, struct file *file)
>  	WARN_ON_ONCE(cfe->css && cfe->css != css);
>  	cfe->css = css;
>  
> -	if (cft->open) {
> -		err = cft->open(inode, file);
> -	} else {
> -		err = single_open_size(file, cgroup_seqfile_show, cfe,
> -				       sizeof(struct cgroup_open_file));
> -		if (!err) {
> -			struct seq_file *sf = file->private_data;
> -			struct cgroup_open_file *of = sf->private;
> +	err = seq_open_private(file, &cgroup_seq_operations,
> +			       sizeof(struct cgroup_open_file));
> +	if (!err) {
> +		struct seq_file *sf = file->private_data;
> +		struct cgroup_open_file *of = sf->private;
>  
> -			of->cfe = cfe;
> -		}
> +		of->cfe = cfe;
> +		return 0;
>  	}
>  
> -	if (css->ss && err)
> +	if (css->ss)
>  		css_put(css);
>  	return err;
>  }


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 10/12] cgroup: attach cgroup_open_file to all cgroup files
       [not found]         ` <529EC5F4.10708-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
@ 2013-12-04 13:04           ` Tejun Heo
  0 siblings, 0 replies; 38+ messages in thread
From: Tejun Heo @ 2013-12-04 13:04 UTC (permalink / raw)
  To: Li Zefan
  Cc: nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w, mhocko-AlSwsSmVLrQ,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, hannes-druUgvl0LCNAfugRpC6u6w,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA

On Wed, Dec 04, 2013 at 02:04:36PM +0800, Li Zefan wrote:
> > @@ -2353,10 +2338,18 @@ static int cgroup_file_open(struct inode *inode, struct file *file)
> >  	WARN_ON_ONCE(cfe->css && cfe->css != css);
> >  	cfe->css = css;
> >  
> > -	if (cft->open)
> > +	if (cft->open) {
> >  		err = cft->open(inode, file);
> > -	else
> > -		err = single_open(file, cgroup_seqfile_show, cfe);
> > +	} else {
> > +		err = single_open_size(file, cgroup_seqfile_show, cfe,
> > +				       sizeof(struct cgroup_open_file));
> 
> This is wrong.
> 
> I guess you wanted to use single_open_private(), which doesn't exist.

Yeah, right, I got completely confused about what single_open_size()
was doing.  Hah, so, I guess I'll just introduce the single callbacks
here using seq_open_private().  More on this in the next patch.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 12/12] cgroup: unify pidlist and other file handling
       [not found]         ` <529EC9A6.903-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
@ 2013-12-04 13:08           ` Tejun Heo
  0 siblings, 0 replies; 38+ messages in thread
From: Tejun Heo @ 2013-12-04 13:08 UTC (permalink / raw)
  To: Li Zefan
  Cc: nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w, mhocko-AlSwsSmVLrQ,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, hannes-druUgvl0LCNAfugRpC6u6w,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA

On Wed, Dec 04, 2013 at 02:20:22PM +0800, Li Zefan wrote:
> > +static void *cgroup_seqfile_start(struct seq_file *seq, loff_t *ppos)
> > +{
> > +	struct cftype *cft = seq_cft(seq);
> > +
> > +	if (cft->seq_start) {
> > +		return cft->seq_start(seq, ppos);
> > +	} else {
> > +		/*
> > +		 * The same behavior and code as single_open().  Returns
> > +		 * !NULL if pos is at the beginning; otherwise, NULL.
> > +		 */
> 
> Isn't it simpler to choose between seq_open() and single_open() depending
> on if cft->seq_start is implemented?

This is primarily to duplicate and match kernfs implementation and
will be removed when cgroup transitions to kernfs.  As for whether
kernfs could have used different open paths.  Maybe.  I don't think
either way is particularly better tho especially given that
seq_open*() and single_open*() interfaces don't match each other.  It
gets confusing pretty fast once you start mixing the two variants.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 10/12] cgroup: attach cgroup_open_file to all cgroup files
       [not found]     ` <1385595759-17656-11-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2013-12-04  6:04       ` Li Zefan
@ 2013-12-04 15:09       ` Tejun Heo
  1 sibling, 0 replies; 38+ messages in thread
From: Tejun Heo @ 2013-12-04 15:09 UTC (permalink / raw)
  To: lizefan-hv44wF8Li93QT0dZR+AlfA
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, peterz-wEGCiKHe2LqWVfeAwA7xHQ,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-AlSwsSmVLrQ,
	bsingharora-Re5JQEeQqe8AvxtiuMwx3w,
	kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A,
	nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	arozansk-H+wXaHxf7aLQT0dZR+AlfA

From dfc10ab0fd9a14efe6c279989b3e65a52b03ca40 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Date: Wed, 4 Dec 2013 10:06:18 -0500

In preparation of conversion to kernfs, cgroup file handling is
updated so that it can be easily mapped to kernfs.  This patch
attaches cgroup_open_file, which used to be attached to pidlist files,
to all cgroup files, introduces seq_css/cft() accessors to determine
the cgroup_subsys_state and cftype associated with a given cgroup
seq_file, and exports them as a public interface.

This doesn't cause any behavior changes but unifies cgroup file
handling across different file types and will help converting them to
kernfs seq_show() interface.

v2: Li pointed out that the original patch was using
    single_open_size() incorrectly assuming that the size param is
    private data size.  Fix it by allocating @of separately and
    passing it to single_open() and explicitly freeing it in the
    release path.  This isn't the prettiest but this path is gonna be
    restructured by the following patches pretty soon.

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
---
git branch updated accordingly.  Thanks.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-consolidate-file-handling

 include/linux/cgroup.h | 33 +++++++++++++++++++++++++++++++++
 kernel/cgroup.c        | 50 ++++++++++++++++++++------------------------------
 2 files changed, 53 insertions(+), 30 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 53e11da..c3d698a 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -21,6 +21,7 @@
 #include <linux/xattr.h>
 #include <linux/fs.h>
 #include <linux/percpu-refcount.h>
+#include <linux/seq_file.h>
 
 #ifdef CONFIG_CGROUPS
 
@@ -490,6 +491,26 @@ struct cftype_set {
 };
 
 /*
+ * cgroupfs file entry, pointed to from leaf dentry->d_fsdata.  Don't
+ * access directly.
+ */
+struct cfent {
+	struct list_head		node;
+	struct dentry			*dentry;
+	struct cftype			*type;
+	struct cgroup_subsys_state	*css;
+
+	/* file xattrs */
+	struct simple_xattrs		xattrs;
+};
+
+/* seq_file->private points to the following, only ->priv is public */
+struct cgroup_open_file {
+	struct cfent			*cfe;
+	void				*priv;
+};
+
+/*
  * See the comment above CGRP_ROOT_SANE_BEHAVIOR for details.  This
  * function can be called as long as @cgrp is accessible.
  */
@@ -504,6 +525,18 @@ static inline const char *cgroup_name(const struct cgroup *cgrp)
 	return rcu_dereference(cgrp->name)->name;
 }
 
+static inline struct cgroup_subsys_state *seq_css(struct seq_file *seq)
+{
+	struct cgroup_open_file *of = seq->private;
+	return of->cfe->css;
+}
+
+static inline struct cftype *seq_cft(struct seq_file *seq)
+{
+	struct cgroup_open_file *of = seq->private;
+	return of->cfe->type;
+}
+
 int cgroup_add_cftypes(struct cgroup_subsys *ss, struct cftype *cfts);
 int cgroup_rm_cftypes(struct cftype *cfts);
 
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 17272893..036c05d 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -41,7 +41,6 @@
 #include <linux/rcupdate.h>
 #include <linux/sched.h>
 #include <linux/backing-dev.h>
-#include <linux/seq_file.h>
 #include <linux/slab.h>
 #include <linux/magic.h>
 #include <linux/spinlock.h>
@@ -130,19 +129,6 @@ static struct cgroupfs_root cgroup_dummy_root;
 /* dummy_top is a shorthand for the dummy hierarchy's top cgroup */
 static struct cgroup * const cgroup_dummy_top = &cgroup_dummy_root.top_cgroup;
 
-/*
- * cgroupfs file entry, pointed to from leaf dentry->d_fsdata.
- */
-struct cfent {
-	struct list_head		node;
-	struct dentry			*dentry;
-	struct cftype			*type;
-	struct cgroup_subsys_state	*css;
-
-	/* file xattrs */
-	struct simple_xattrs		xattrs;
-};
-
 /* The list of hierarchy roots */
 
 static LIST_HEAD(cgroup_roots);
@@ -2302,9 +2288,8 @@ out_free:
 
 static int cgroup_seqfile_show(struct seq_file *m, void *arg)
 {
-	struct cfent *cfe = m->private;
-	struct cftype *cft = cfe->type;
-	struct cgroup_subsys_state *css = cfe->css;
+	struct cftype *cft = seq_cft(m);
+	struct cgroup_subsys_state *css = seq_css(m);
 
 	if (cft->read_seq_string)
 		return cft->read_seq_string(css, cft, m);
@@ -2353,10 +2338,20 @@ static int cgroup_file_open(struct inode *inode, struct file *file)
 	WARN_ON_ONCE(cfe->css && cfe->css != css);
 	cfe->css = css;
 
-	if (cft->open)
+	if (cft->open) {
 		err = cft->open(inode, file);
-	else
-		err = single_open(file, cgroup_seqfile_show, cfe);
+	} else {
+		struct cgroup_open_file *of;
+
+		err = -ENOMEM;
+		of = kzalloc(sizeof(*of), GFP_KERNEL);
+		if (of) {
+			of->cfe = cfe;
+			err = single_open(file, cgroup_seqfile_show, of);
+			if (err)
+				kfree(of);
+		}
+	}
 
 	if (css->ss && err)
 		css_put(css);
@@ -2370,6 +2365,7 @@ static int cgroup_file_release(struct inode *inode, struct file *file)
 
 	if (css->ss)
 		css_put(css);
+	kfree(((struct seq_file *)file->private_data)->private);
 	return single_release(inode, file);
 }
 
@@ -3368,12 +3364,6 @@ struct cgroup_pidlist {
 	struct delayed_work destroy_dwork;
 };
 
-/* seq_file->private points to the following */
-struct cgroup_open_file {
-	struct cfent			*cfe;
-	void				*priv;
-};
-
 /*
  * The following two functions "fix" the issue where there are more pids
  * than kmalloc will give memory for; in such cases, we use vmalloc/vfree.
@@ -3689,9 +3679,9 @@ static void *cgroup_pidlist_start(struct seq_file *s, loff_t *pos)
 	 * next pid to display, if any
 	 */
 	struct cgroup_open_file *of = s->private;
-	struct cgroup *cgrp = of->cfe->css->cgroup;
+	struct cgroup *cgrp = seq_css(s)->cgroup;
 	struct cgroup_pidlist *l;
-	enum cgroup_filetype type = of->cfe->type->private;
+	enum cgroup_filetype type = seq_cft(s)->private;
 	int index = 0, pid = *pos;
 	int *iter, ret;
 
@@ -3749,7 +3739,7 @@ static void cgroup_pidlist_stop(struct seq_file *s, void *v)
 	if (l)
 		mod_delayed_work(cgroup_pidlist_destroy_wq, &l->destroy_dwork,
 				 CGROUP_PIDLIST_DESTROY_DELAY);
-	mutex_unlock(&of->cfe->css->cgroup->pidlist_mutex);
+	mutex_unlock(&seq_css(s)->cgroup->pidlist_mutex);
 }
 
 static void *cgroup_pidlist_next(struct seq_file *s, void *v, loff_t *pos)
@@ -3766,7 +3756,7 @@ static void *cgroup_pidlist_next(struct seq_file *s, void *v, loff_t *pos)
 	if (p >= end) {
 		return NULL;
 	} else {
-		*pos = cgroup_pid_fry(of->cfe->css->cgroup, *p);
+		*pos = cgroup_pid_fry(seq_css(s)->cgroup, *p);
 		return p;
 	}
 }
-- 
1.8.4.2

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v3 12/12] cgroup: unify pidlist and other file handling
       [not found]     ` <1385595759-17656-13-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
  2013-12-04  6:20       ` Li Zefan
@ 2013-12-04 15:09       ` Tejun Heo
  1 sibling, 0 replies; 38+ messages in thread
From: Tejun Heo @ 2013-12-04 15:09 UTC (permalink / raw)
  To: lizefan-hv44wF8Li93QT0dZR+AlfA
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, peterz-wEGCiKHe2LqWVfeAwA7xHQ,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-AlSwsSmVLrQ,
	bsingharora-Re5JQEeQqe8AvxtiuMwx3w,
	kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A,
	nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	arozansk-H+wXaHxf7aLQT0dZR+AlfA

From 319da8b44d61ab3452caccc46a0944755e617cbe Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Date: Wed, 4 Dec 2013 10:06:18 -0500

In preparation of conversion to kernfs, cgroup file handling is
updated so that it can be easily mapped to kernfs.  With the previous
changes, the difference between pidlist and other files are very
small.  Both are served by seq_file in a pretty standard way with the
only difference being !pidlist files use single_open().

This patch adds cftype->seq_start(), ->seq_next() and ->seq_stop() and
implements the matching cgroup_seqfile_start/next/stop() which either
emulates single_open() behavior or invokes cftype->seq_*() operations
if specified.  This allows using single seq_operations for both
pidlist and other files and makes cgroup_pidlist_operations and
cgroup_pidlist_open() no longer necessary.  As cgroup_pidlist_open()
was the only user of cftype->open(), the method is dropped altogether.

This brings cftype file interface very close to kernfs interface and
mapping shouldn't be too difficult.  Once converted to kernfs, most of
the plumbing code including cgroup_seqfile_*() will be removed as
kernfs provides those facilities.

This patch does not introduce any behavior changes.

v2: Refreshed on top of the updated "cgroup: introduce struct
    cgroup_pidlist_open_file".

v3: Refreshed on top of the updated "cgroup: attach cgroup_open_file
    to all cgroup files".

Signed-off-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
---
 include/linux/cgroup.h |   6 ++-
 kernel/cgroup.c        | 112 +++++++++++++++++++++++++++----------------------
 2 files changed, 68 insertions(+), 50 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index b32a0f8..8b9a594 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -434,7 +434,6 @@ struct cftype {
 	 */
 	struct cgroup_subsys *ss;
 
-	int (*open)(struct inode *inode, struct file *file);
 	/*
 	 * read_u64() is a shortcut for the common case of returning a
 	 * single integer. Use it in place of read()
@@ -448,6 +447,11 @@ struct cftype {
 	/* generic seq_file read interface */
 	int (*seq_show)(struct seq_file *sf, void *v);
 
+	/* optional ops, implement all or none */
+	void *(*seq_start)(struct seq_file *sf, loff_t *ppos);
+	void *(*seq_next)(struct seq_file *sf, void *v, loff_t *ppos);
+	void (*seq_stop)(struct seq_file *sf, void *v);
+
 	/*
 	 * write_u64() is a shortcut for the common case of accepting
 	 * a single integer (as parsed by simple_strtoull) from
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index c45e633..f9ae38a 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -2286,6 +2286,45 @@ out_free:
  * supports string->u64 maps, but can be extended in future.
  */
 
+static void *cgroup_seqfile_start(struct seq_file *seq, loff_t *ppos)
+{
+	struct cftype *cft = seq_cft(seq);
+
+	if (cft->seq_start) {
+		return cft->seq_start(seq, ppos);
+	} else {
+		/*
+		 * The same behavior and code as single_open().  Returns
+		 * !NULL if pos is at the beginning; otherwise, NULL.
+		 */
+		return NULL + !*ppos;
+	}
+}
+
+static void *cgroup_seqfile_next(struct seq_file *seq, void *v, loff_t *ppos)
+{
+	struct cftype *cft = seq_cft(seq);
+
+	if (cft->seq_next) {
+		return cft->seq_next(seq, v, ppos);
+	} else {
+		/*
+		 * The same behavior and code as single_open(), always
+		 * terminate after the initial read.
+		 */
+		++*ppos;
+		return NULL;
+	}
+}
+
+static void cgroup_seqfile_stop(struct seq_file *seq, void *v)
+{
+	struct cftype *cft = seq_cft(seq);
+
+	if (cft->seq_stop)
+		cft->seq_stop(seq, v);
+}
+
 static int cgroup_seqfile_show(struct seq_file *m, void *arg)
 {
 	struct cftype *cft = seq_cft(m);
@@ -2303,12 +2342,20 @@ static int cgroup_seqfile_show(struct seq_file *m, void *arg)
 	return 0;
 }
 
+static struct seq_operations cgroup_seq_operations = {
+	.start		= cgroup_seqfile_start,
+	.next		= cgroup_seqfile_next,
+	.stop		= cgroup_seqfile_stop,
+	.show		= cgroup_seqfile_show,
+};
+
 static int cgroup_file_open(struct inode *inode, struct file *file)
 {
 	struct cfent *cfe = __d_cfe(file->f_dentry);
 	struct cftype *cft = __d_cft(file->f_dentry);
 	struct cgroup *cgrp = __d_cgrp(cfe->dentry->d_parent);
 	struct cgroup_subsys_state *css;
+	struct cgroup_open_file *of;
 	int err;
 
 	err = generic_file_open(inode, file);
@@ -2338,24 +2385,16 @@ static int cgroup_file_open(struct inode *inode, struct file *file)
 	WARN_ON_ONCE(cfe->css && cfe->css != css);
 	cfe->css = css;
 
-	if (cft->open) {
-		err = cft->open(inode, file);
-	} else {
-		struct cgroup_open_file *of;
-
-		err = -ENOMEM;
-		of = kzalloc(sizeof(*of), GFP_KERNEL);
-		if (of) {
-			of->cfe = cfe;
-			err = single_open(file, cgroup_seqfile_show, of);
-			if (err)
-				kfree(of);
-		}
+	of = __seq_open_private(file, &cgroup_seq_operations,
+				sizeof(struct cgroup_open_file));
+	if (of) {
+		of->cfe = cfe;
+		return 0;
 	}
 
-	if (css->ss && err)
+	if (css->ss)
 		css_put(css);
-	return err;
+	return -ENOMEM;
 }
 
 static int cgroup_file_release(struct inode *inode, struct file *file)
@@ -2365,8 +2404,7 @@ static int cgroup_file_release(struct inode *inode, struct file *file)
 
 	if (css->ss)
 		css_put(css);
-	kfree(((struct seq_file *)file->private_data)->private);
-	return single_release(inode, file);
+	return seq_release_private(inode, file);
 }
 
 /*
@@ -3777,36 +3815,6 @@ static const struct seq_operations cgroup_pidlist_seq_operations = {
 	.show = cgroup_pidlist_show,
 };
 
-static const struct file_operations cgroup_pidlist_operations = {
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.write = cgroup_file_write,
-	.release = seq_release_private,
-};
-
-/*
- * The following functions handle opens on a file that displays a pidlist
- * (tasks or procs). Prepare an array of the process/thread IDs of whoever's
- * in the cgroup.
- */
-/* helper function for the two below it */
-static int cgroup_pidlist_open(struct inode *unused, struct file *file)
-{
-	struct cfent *cfe = __d_cfe(file->f_dentry);
-	struct cgroup_open_file *of;
-
-	/* configure file information */
-	file->f_op = &cgroup_pidlist_operations;
-
-	of = __seq_open_private(file, &cgroup_pidlist_seq_operations,
-				sizeof(*of));
-	if (!of)
-		return -ENOMEM;
-
-	of->cfe = cfe;
-	return 0;
-}
-
 static u64 cgroup_read_notify_on_release(struct cgroup_subsys_state *css,
 					 struct cftype *cft)
 {
@@ -3860,7 +3868,10 @@ static int cgroup_clone_children_write(struct cgroup_subsys_state *css,
 static struct cftype cgroup_base_files[] = {
 	{
 		.name = "cgroup.procs",
-		.open = cgroup_pidlist_open,
+		.seq_start = cgroup_pidlist_start,
+		.seq_next = cgroup_pidlist_next,
+		.seq_stop = cgroup_pidlist_stop,
+		.seq_show = cgroup_pidlist_show,
 		.private = CGROUP_FILE_PROCS,
 		.write_u64 = cgroup_procs_write,
 		.mode = S_IRUGO | S_IWUSR,
@@ -3885,7 +3896,10 @@ static struct cftype cgroup_base_files[] = {
 	{
 		.name = "tasks",
 		.flags = CFTYPE_INSANE,		/* use "procs" instead */
-		.open = cgroup_pidlist_open,
+		.seq_start = cgroup_pidlist_start,
+		.seq_next = cgroup_pidlist_next,
+		.seq_stop = cgroup_pidlist_stop,
+		.seq_show = cgroup_pidlist_show,
 		.private = CGROUP_FILE_TASKS,
 		.write_u64 = cgroup_tasks_write,
 		.mode = S_IRUGO | S_IWUSR,
-- 
1.8.4.2

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCHSET cgroup/for-3.14] cgroup: consolidate file handling
       [not found] ` <1385595759-17656-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (11 preceding siblings ...)
  2013-11-27 23:42   ` [PATCH 12/12] cgroup: unify pidlist and other file handling Tejun Heo
@ 2013-12-05  1:48   ` Li Zefan
  2013-12-05 17:26   ` Tejun Heo
  13 siblings, 0 replies; 38+ messages in thread
From: Li Zefan @ 2013-12-05  1:48 UTC (permalink / raw)
  To: Tejun Heo
  Cc: nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w, mhocko-AlSwsSmVLrQ,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, hannes-druUgvl0LCNAfugRpC6u6w,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA

> cgroup is scheduled to be converted to use kernfs, which is currently
> in the process of being separated out of sysfs, so that, among other
> things, cgroup core locking can be decoupled from vfs layer.  This
> patchset cleans up and consolidates cgroup file handling to facilitate
> such conversion.
> 
> There currently are a couple different rw paths including the ones
> which don't impose any structure.  All existing users and expected
> reasonable use cases can be served with standard seq_file interface
> and buffered writes, which is what's provided by kernfs.
> 
> This patchset updates cgroup file handling so that the interface and
> usages are more concise and there is a single path for read and a
> single path for write, both of which closely map to the interface kernfs
> provides.
> 
> This series ends up adding some amount of code which will be replaced
> by kernfs but, overall, things get more streamlined and LOC is
> reduced.
> 
> The following 12 patches are included in the series.
> 
>  0001-cgroup-sched-convert-away-from-cftype-read_map.patch
>  0002-cpuset-convert-away-from-cftype-read.patch
>  0003-memcg-convert-away-from-cftype-read-and-read_map.patch
>  0004-netprio_cgroup-convert-away-from-cftype-read_map.patch
>  0005-hugetlb_cgroup-convert-away-from-cftype-read.patch
>  0006-cgroup-remove-cftype-read-read_map-and-write.patch
>  0007-cgroup-unify-cgroup_write_X64-and-cgroup_write_strin.patch
>  0008-cgroup-unify-read-path-so-that-seq_file-is-always-us.patch
>  0009-cgroup-generalize-cgroup_pidlist_open_file.patch
>  0010-cgroup-attach-cgroup_open_file-to-all-cgroup-files.patch
>  0011-cgroup-replace-cftype-read_seq_string-with-cftype-se.patch
>  0012-cgroup-unify-pidlist-and-other-file-handling.patch
> 

Acked-by: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>


* Re: [PATCHSET cgroup/for-3.14] cgroup: consolidate file handling
       [not found] ` <1385595759-17656-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
                     ` (12 preceding siblings ...)
  2013-12-05  1:48   ` [PATCHSET cgroup/for-3.14] cgroup: consolidate " Li Zefan
@ 2013-12-05 17:26   ` Tejun Heo
  13 siblings, 0 replies; 38+ messages in thread
From: Tejun Heo @ 2013-12-05 17:26 UTC (permalink / raw)
  To: lizefan-hv44wF8Li93QT0dZR+AlfA
  Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	cgroups-u79uwXL29TY76Z2rM5mHXA, vgoyal-H+wXaHxf7aLQT0dZR+AlfA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, peterz-wEGCiKHe2LqWVfeAwA7xHQ,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-AlSwsSmVLrQ,
	bsingharora-Re5JQEeQqe8AvxtiuMwx3w,
	kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A,
	nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	daniel.wagner-98C5kh4wR6ohFhg+JK9F0w,
	arozansk-H+wXaHxf7aLQT0dZR+AlfA

Applying to cgroup/for-3.14.

Thanks.

-- 
tejun


end of thread, other threads:[~2013-12-05 17:26 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
2013-11-27 23:42 [PATCHSET cgroup/for-3.14] cgroup: consolidate file handling Tejun Heo
     [not found] ` <1385595759-17656-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2013-11-27 23:42   ` [PATCH 01/12] cgroup, sched: convert away from cftype->read_map() Tejun Heo
2013-11-27 23:42   ` [PATCH 02/12] cpuset: convert away from cftype->read() Tejun Heo
2013-11-27 23:42   ` [PATCH 03/12] memcg: convert away from cftype->read() and ->read_map() Tejun Heo
     [not found]     ` <1385595759-17656-4-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2013-11-28  8:26       ` Michal Hocko
2013-11-27 23:42   ` [PATCH 04/12] netprio_cgroup: convert away from cftype->read_map() Tejun Heo
     [not found]     ` <1385595759-17656-5-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2013-11-29  1:56       ` Neil Horman
2013-11-29  8:52       ` Daniel Wagner
2013-11-27 23:42   ` [PATCH 05/12] hugetlb_cgroup: convert away from cftype->read() Tejun Heo
     [not found]     ` <1385595759-17656-6-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2013-11-28  8:29       ` Michal Hocko
2013-11-27 23:42   ` [PATCH 06/12] cgroup: remove cftype->read(), ->read_map() and ->write() Tejun Heo
2013-11-27 23:42   ` [PATCH 07/12] cgroup: unify cgroup_write_X64() and cgroup_write_string() Tejun Heo
     [not found]     ` <1385595759-17656-8-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2013-11-28 11:18       ` Michal Hocko
     [not found]         ` <20131128111818.GG2761-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
2013-11-29 20:05           ` Tejun Heo
     [not found]             ` <20131129200525.GC21755-9pTldWuhBndy/B6EtB590w@public.gmane.org>
2013-12-02  9:54               ` Michal Hocko
     [not found]                 ` <20131202095401.GA18838-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
2013-12-02 13:30                   ` Tejun Heo
     [not found]                     ` <20131202133059.GA3626-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2013-12-02 14:12                       ` Michal Hocko
     [not found]                         ` <20131202141242.GD18838-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
2013-12-03 20:41                           ` Tejun Heo
     [not found]                             ` <20131203204155.GL8277-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2013-12-03 21:04                               ` Michal Hocko
2013-12-02 16:44               ` Johannes Weiner
     [not found]                 ` <20131202164406.GP3556-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
2013-12-03  7:35                   ` Li Zefan
2013-11-27 23:42   ` [PATCH 08/12] cgroup: unify read path so that seq_file is always used Tejun Heo
2013-11-27 23:42   ` [PATCH 09/12] cgroup: generalize cgroup_pidlist_open_file Tejun Heo
2013-11-27 23:42   ` [PATCH 10/12] cgroup: attach cgroup_open_file to all cgroup files Tejun Heo
     [not found]     ` <1385595759-17656-11-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2013-12-04  6:04       ` Li Zefan
     [not found]         ` <529EC5F4.10708-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2013-12-04 13:04           ` Tejun Heo
2013-12-04 15:09       ` [PATCH v2 " Tejun Heo
2013-11-27 23:42   ` [PATCH 11/12] cgroup: replace cftype->read_seq_string() with cftype->seq_show() Tejun Heo
     [not found]     ` <1385595759-17656-12-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2013-11-28  9:07       ` Daniel Wagner
2013-11-28 11:25       ` Michal Hocko
2013-12-02 14:41       ` Aristeu Rozanski
2013-12-02 14:52       ` Vivek Goyal
2013-11-27 23:42   ` [PATCH 12/12] cgroup: unify pidlist and other file handling Tejun Heo
     [not found]     ` <1385595759-17656-13-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2013-12-04  6:20       ` Li Zefan
     [not found]         ` <529EC9A6.903-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2013-12-04 13:08           ` Tejun Heo
2013-12-04 15:09       ` [PATCH v3 " Tejun Heo
2013-12-05  1:48   ` [PATCHSET cgroup/for-3.14] cgroup: consolidate " Li Zefan
2013-12-05 17:26   ` Tejun Heo
