* [PATCH] cgroups: Do not show inactive devices in cgroups statistics
@ 2013-06-26 19:26 Anatol Pomozov
[not found] ` <1372274770-30679-1-git-send-email-anatol.pomozov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
0 siblings, 1 reply; 12+ messages in thread
From: Anatol Pomozov @ 2013-06-26 19:26 UTC (permalink / raw)
To: tj-DgEjT+Ai2ygdnm+yROfE0A; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, Anatol Pomozov
Before 3.5 cgrups stats files shown only devices with activity.
But latest kernel shows stats for all devices. The previous implementation
looks better. Imagine a server machine with hundreds cgroups and hundreds
iSCSI block devices (where cgroup uses just a few block devices). This
generates *a lot* of useless data that requires more resources to
process/store.
Recover previous behaviour by skipping entries with zero stats.
Signed-off-by: Anatol Pomozov <anatol.pomozov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
block/blk-cgroup.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index e8918ff..915c2b0 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -548,6 +548,9 @@ u64 __blkg_prfill_u64(struct seq_file *sf, struct blkg_policy_data *pd, u64 v)
if (!dname)
return 0;
+ if (!v)
+ return 0;
+
seq_printf(sf, "%s %llu\n", dname, (unsigned long long)v);
return v;
}
@@ -571,19 +574,23 @@ u64 __blkg_prfill_rwstat(struct seq_file *sf, struct blkg_policy_data *pd,
[BLKG_RWSTAT_ASYNC] = "Async",
};
const char *dname = blkg_dev_name(pd->blkg);
- u64 v;
+ u64 total;
int i;
if (!dname)
return 0;
+ total = rwstat->cnt[BLKG_RWSTAT_READ] + rwstat->cnt[BLKG_RWSTAT_WRITE];
+ /* skip devices with no activity */
+ if (!total)
+ return 0;
+
for (i = 0; i < BLKG_RWSTAT_NR; i++)
seq_printf(sf, "%s %s %llu\n", dname, rwstr[i],
(unsigned long long)rwstat->cnt[i]);
- v = rwstat->cnt[BLKG_RWSTAT_READ] + rwstat->cnt[BLKG_RWSTAT_WRITE];
- seq_printf(sf, "%s Total %llu\n", dname, (unsigned long long)v);
- return v;
+ seq_printf(sf, "%s Total %llu\n", dname, (unsigned long long)total);
+ return total;
}
EXPORT_SYMBOL_GPL(__blkg_prfill_rwstat);
--
1.8.3
^ permalink raw reply related [flat|nested] 12+ messages in thread[parent not found: <1372274770-30679-1-git-send-email-anatol.pomozov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>]
* Re: [PATCH] cgroups: Do not show inactive devices in cgroups statistics [not found] ` <1372274770-30679-1-git-send-email-anatol.pomozov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> @ 2013-06-26 19:36 ` Anatol Pomozov 2013-06-26 20:45 ` Tejun Heo 1 sibling, 0 replies; 12+ messages in thread From: Anatol Pomozov @ 2013-06-26 19:36 UTC (permalink / raw) To: tj-DgEjT+Ai2ygdnm+yROfE0A; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, Anatol Pomozov Hi, Tejun A little bit more context about this change. At Google we actively use block crgoups and generate a lot of statistics files. We have a lot of block devices per machine and usually apps in a cgroup use only small subset of the drives. While migrating to the latest kernel we've found that those stat files became much bigger - now they are full of devices with zero stats. The previous behavior seems better to me as it emphasizes device usage. I do not know whether the change was made intentionally or accidentally during block cgroups refactoring. If latter then the patch above restores the previous behavior. On Wed, Jun 26, 2013 at 12:26 PM, Anatol Pomozov <anatol.pomozov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: > > Before 3.5 cgrups stats files shown only devices with activity. > But latest kernel shows stats for all devices. The previous implementation > looks better. Imagine a server machine with hundreds cgroups and hundreds > iSCSI block devices (where cgroup uses just a few block devices). This > generates *a lot* of useless data that requires more resources to > process/store. > > Recover previous behaviour by skipping entries with zero stats. > > Signed-off-by: Anatol Pomozov <anatol.pomozov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> > --- > block/blk-cgroup.c | 15 +++++++++++---- > 1 file changed, 11 insertions(+), 4 deletions(-) > > diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c > index e8918ff..915c2b0 100644 > --- a/block/blk-cgroup.c > +++ b/block/blk-cgroup.c > @@ -548,6 +548,9 @@ u64 __blkg_prfill_u64(struct seq_file *sf, struct blkg_policy_data *pd, u64 v) > if (!dname) > return 0; > > + if (!v) > + return 0; I am not sure about this change. It looks like previously we've shown zero values here as well. But I made "drop inactive devices" here to be consistent with the change below. Feel free to drop it if you are against it. > > + > seq_printf(sf, "%s %llu\n", dname, (unsigned long long)v); > return v; > } > @@ -571,19 +574,23 @@ u64 __blkg_prfill_rwstat(struct seq_file *sf, struct blkg_policy_data *pd, > [BLKG_RWSTAT_ASYNC] = "Async", > }; > const char *dname = blkg_dev_name(pd->blkg); > - u64 v; > + u64 total; > int i; > > if (!dname) > return 0; > > + total = rwstat->cnt[BLKG_RWSTAT_READ] + rwstat->cnt[BLKG_RWSTAT_WRITE]; > + /* skip devices with no activity */ > + if (!total) > + return 0; > + > for (i = 0; i < BLKG_RWSTAT_NR; i++) > seq_printf(sf, "%s %s %llu\n", dname, rwstr[i], > (unsigned long long)rwstat->cnt[i]); > > - v = rwstat->cnt[BLKG_RWSTAT_READ] + rwstat->cnt[BLKG_RWSTAT_WRITE]; > - seq_printf(sf, "%s Total %llu\n", dname, (unsigned long long)v); > - return v; > + seq_printf(sf, "%s Total %llu\n", dname, (unsigned long long)total); > + return total; > } > EXPORT_SYMBOL_GPL(__blkg_prfill_rwstat); > > -- > 1.8.3 > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] cgroups: Do not show inactive devices in cgroups statistics [not found] ` <1372274770-30679-1-git-send-email-anatol.pomozov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> 2013-06-26 19:36 ` Anatol Pomozov @ 2013-06-26 20:45 ` Tejun Heo [not found] ` <20130626204540.GA4536-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org> 1 sibling, 1 reply; 12+ messages in thread From: Tejun Heo @ 2013-06-26 20:45 UTC (permalink / raw) To: Anatol Pomozov; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, Vivek Goyal, Jens Axboe (cc'ing Vivek and Jens) On Wed, Jun 26, 2013 at 12:26:10PM -0700, Anatol Pomozov wrote: > @@ -548,6 +548,9 @@ u64 __blkg_prfill_u64(struct seq_file *sf, struct blkg_policy_data *pd, u64 v) > if (!dname) > return 0; > > + if (!v) > + return 0; > + I don't think it'd be a good idea to filter out 0 by default from __blkg_prfill_u64(). It'd probably be a better idea to do the filtering from the users of __blkg_prfill_u64(). Would that be a lot more churn? > @@ -571,19 +574,23 @@ u64 __blkg_prfill_rwstat(struct seq_file *sf, struct blkg_policy_data *pd, > [BLKG_RWSTAT_ASYNC] = "Async", > }; > const char *dname = blkg_dev_name(pd->blkg); > - u64 v; > + u64 total; > int i; > > if (!dname) > return 0; > > + total = rwstat->cnt[BLKG_RWSTAT_READ] + rwstat->cnt[BLKG_RWSTAT_WRITE]; > + /* skip devices with no activity */ > + if (!total) > + return 0; > + Doing it from rwstat is fine as it's always printing "stats" and suppressing 0 stats by default should be fine, I think. Can you please update the function comment accordingly tho? Vivek, any objections? Thanks. -- tejun ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <20130626204540.GA4536-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>]
* Re: [PATCH] cgroups: Do not show inactive devices in cgroups statistics [not found] ` <20130626204540.GA4536-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org> @ 2013-06-26 21:08 ` Vivek Goyal [not found] ` <20130626210820.GA17564-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 0 siblings, 1 reply; 12+ messages in thread From: Vivek Goyal @ 2013-06-26 21:08 UTC (permalink / raw) To: Tejun Heo; +Cc: Anatol Pomozov, cgroups-u79uwXL29TY76Z2rM5mHXA, Jens Axboe On Wed, Jun 26, 2013 at 01:45:40PM -0700, Tejun Heo wrote: > (cc'ing Vivek and Jens) > > On Wed, Jun 26, 2013 at 12:26:10PM -0700, Anatol Pomozov wrote: > > @@ -548,6 +548,9 @@ u64 __blkg_prfill_u64(struct seq_file *sf, struct blkg_policy_data *pd, u64 v) > > if (!dname) > > return 0; > > > > + if (!v) > > + return 0; > > + > > I don't think it'd be a good idea to filter out 0 by default from > __blkg_prfill_u64(). It'd probably be a better idea to do the > filtering from the users of __blkg_prfill_u64(). Would that be a lot > more churn? > > > @@ -571,19 +574,23 @@ u64 __blkg_prfill_rwstat(struct seq_file *sf, struct blkg_policy_data *pd, > > [BLKG_RWSTAT_ASYNC] = "Async", > > }; > > const char *dname = blkg_dev_name(pd->blkg); > > - u64 v; > > + u64 total; > > int i; > > > > if (!dname) > > return 0; > > > > + total = rwstat->cnt[BLKG_RWSTAT_READ] + rwstat->cnt[BLKG_RWSTAT_WRITE]; > > + /* skip devices with no activity */ > > + if (!total) > > + return 0; > > + > > Doing it from rwstat is fine as it's always printing "stats" and > suppressing 0 stats by default should be fine, I think. Can you > please update the function comment accordingly tho? > > Vivek, any objections? I don't have an objection to not listing stats of devices which are zero, but wondering why all the devices of system are showing in cgroup stat. Don't we add a blkg to a blkcg only if that cgroup did some IO to that particular device. If yes, then only those devices should show to which cgroup did some IO and that should be non-zero. Is it because of hierarchical support where if a child does IO to device, we will add an blkg instance to parent's cgroup? In that case hierarchical stats will still be non-zero but group local stat can be zero. I think it makes sense to remove zero stats. Hierarchical stat will anyway have that data. Thanks Vivek ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <20130626210820.GA17564-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH] cgroups: Do not show inactive devices in cgroups statistics [not found] ` <20130626210820.GA17564-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2013-06-26 21:11 ` Tejun Heo [not found] ` <CAOS58YP_Umud2DTY+3tc4m02FywGNOEALSN1An4OLXczcmsQ4Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 12+ messages in thread From: Tejun Heo @ 2013-06-26 21:11 UTC (permalink / raw) To: Vivek Goyal; +Cc: Anatol Pomozov, Cgroups, Jens Axboe On Wed, Jun 26, 2013 at 2:08 PM, Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote: > I don't have an objection to not listing stats of devices which are > zero, but wondering why all the devices of system are showing in > cgroup stat. > > Don't we add a blkg to a blkcg only if that cgroup did some IO to > that particular device. If yes, then only those devices should > show to which cgroup did some IO and that should be non-zero. > > Is it because of hierarchical support where if a child does IO > to device, we will add an blkg instance to parent's cgroup? In > that case hierarchical stats will still be non-zero but group > local stat can be zero. I wondered that too. Maybe they're configuring all combinations? Anatol? Thanks. -- tejun ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <CAOS58YP_Umud2DTY+3tc4m02FywGNOEALSN1An4OLXczcmsQ4Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH] cgroups: Do not show inactive devices in cgroups statistics [not found] ` <CAOS58YP_Umud2DTY+3tc4m02FywGNOEALSN1An4OLXczcmsQ4Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2013-06-27 0:37 ` Anatol Pomozov [not found] ` <CAOMFOmVqevNSKgWR5keT8u7vsCRpHhYROejM6OucyNeeA6j6zQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 12+ messages in thread From: Anatol Pomozov @ 2013-06-27 0:37 UTC (permalink / raw) To: Tejun Heo; +Cc: Vivek Goyal, Cgroups, Jens Axboe Hi On Wed, Jun 26, 2013 at 2:11 PM, Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote: > On Wed, Jun 26, 2013 at 2:08 PM, Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote: >> I don't have an objection to not listing stats of devices which are >> zero, but wondering why all the devices of system are showing in >> cgroup stat. >> >> Don't we add a blkg to a blkcg only if that cgroup did some IO to >> that particular device. If yes, then only those devices should >> show to which cgroup did some IO and that should be non-zero. >> >> Is it because of hierarchical support where if a child does IO >> to device, we will add an blkg instance to parent's cgroup? In >> that case hierarchical stats will still be non-zero but group >> local stat can be zero. > > I wondered that too. Maybe they're configuring all combinations? Anatol? We use an io scheduler similar to cfq. So I added a log to find place where the blkg is created. I found that the structure is created on data write to CGROUP/weight_device CFQ file. Now let's look at cfqg_set_weight_device() function (I am looking at 3.5 sources but seems HEAD has the same issue). When we set weight for a device in that group then an instance of cfq_group is created for it. Here is the codepath from write() syscall to cfq_pd_init() function: [ 119.584630] [<ffffffff805001ba>] cfq_pd_init+0x10a/0x110 [ 119.584634] [<ffffffff804fce00>] blkg_alloc+0x110/0x130 [ 119.584639] [<ffffffff804fd098>] __blkg_lookup_create+0x278/0x3c0 [ 119.584643] [<ffffffff804fd1fb>] blkg_lookup_create+0x1b/0x40 [ 119.584647] [<ffffffff804fd433>] blkg_conf_prep+0x213/0x270 [ 119.584669] [<ffffffff805008ca>] cfqg_set_weight_device+0x4a/0xd0 [ 119.584678] [<ffffffff802c1118>] cgroup_write_string.isra.17+0xc8/0x130 [ 119.584687] [<ffffffff802c12d9>] cgroup_file_write+0x159/0x1e0 [ 119.584707] [<ffffffff8038aa4f>] vfs_write+0xaf/0x160 [ 119.584711] [<ffffffff8038add6>] sys_write+0x76/0x100 [ 119.584716] [<ffffffff807da672>] system_call_fastpath+0x16/0x1b Bottom line: when we set weight for a device then it creates cfq_group and initializes device stats with zero. Later that zero stat is shown even if no activity happened on the device. Both CFQ and our custom IO scheduler have this problem. And because we write set weight for all devices in all cgroups we endup with a lot of zero useless stats. What is the right fix in this situation? ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <CAOMFOmVqevNSKgWR5keT8u7vsCRpHhYROejM6OucyNeeA6j6zQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH] cgroups: Do not show inactive devices in cgroups statistics [not found] ` <CAOMFOmVqevNSKgWR5keT8u7vsCRpHhYROejM6OucyNeeA6j6zQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2013-06-27 1:23 ` Tejun Heo [not found] ` <20130627012304.GG4536-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org> 0 siblings, 1 reply; 12+ messages in thread From: Tejun Heo @ 2013-06-27 1:23 UTC (permalink / raw) To: Anatol Pomozov; +Cc: Vivek Goyal, Cgroups, Jens Axboe Hello, On Wed, Jun 26, 2013 at 05:37:38PM -0700, Anatol Pomozov wrote: > Bottom line: when we set weight for a device then it creates cfq_group > and initializes device stats with zero. Later that zero stat is shown > even if no activity happened on the device. Both CFQ and our custom IO > scheduler have this problem. And because we write set weight for all > devices in all cgroups we endup with a lot of zero useless stats. > > What is the right fix in this situation? Yeah, I think yours is fine. You're just configuring each blkg during init. Please just update the u64 case and repost. Thanks! -- tejun ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <20130627012304.GG4536-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>]
* Re: [PATCH] cgroups: Do not show inactive devices in cgroups statistics [not found] ` <20130627012304.GG4536-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org> @ 2013-06-27 1:23 ` Tejun Heo 2013-06-27 16:37 ` Anatol Pomozov 2013-06-27 16:39 ` Anatol Pomozov 2 siblings, 0 replies; 12+ messages in thread From: Tejun Heo @ 2013-06-27 1:23 UTC (permalink / raw) To: Anatol Pomozov; +Cc: Vivek Goyal, Cgroups, Jens Axboe On Wed, Jun 26, 2013 at 06:23:04PM -0700, Tejun Heo wrote: > Hello, > > On Wed, Jun 26, 2013 at 05:37:38PM -0700, Anatol Pomozov wrote: > > Bottom line: when we set weight for a device then it creates cfq_group > > and initializes device stats with zero. Later that zero stat is shown > > even if no activity happened on the device. Both CFQ and our custom IO > > scheduler have this problem. And because we write set weight for all > > devices in all cgroups we endup with a lot of zero useless stats. > > > > What is the right fix in this situation? > > Yeah, I think yours is fine. You're just configuring each blkg during > init. Please just update the u64 case and repost. BTW, if there are a lot of users which ignore 0 u64, it may be cleaner to introduce a wrapper around the existing prfill. Thanks. -- tejun ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH] cgroups: Do not show inactive devices in cgroups statistics [not found] ` <20130627012304.GG4536-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org> 2013-06-27 1:23 ` Tejun Heo @ 2013-06-27 16:37 ` Anatol Pomozov [not found] ` <1372351046-11976-1-git-send-email-anatol.pomozov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> 2013-06-27 16:39 ` Anatol Pomozov 2 siblings, 1 reply; 12+ messages in thread From: Anatol Pomozov @ 2013-06-27 16:37 UTC (permalink / raw) To: tj-DgEjT+Ai2ygdnm+yROfE0A; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, Anatol Pomozov Before 3.5 CFQ cgroups stats was lazy initialized - the stat object was created only if the device had any I/O in that cgroup. Around 3.5 time CFQ stat structure became part of device structure. And zero stat is created when device is initialized. Initialization can happen e.g. when we configure it via 'weight_device' file. It calls cfqg_set_weight_device() and it creates cfq_group structure. But even if device is initialized the stats is still zero. If we configure all devices in all in all cgroups it generates a lot of useless stat info that takes more resources to process and store. Imagine servers with handreds cgroups and handreds iSCSI block devices. Do not print stats for inactive devices - it restores functionality that was before 3.5 Signed-off-by: Anatol Pomozov <anatol.pomozov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> --- block/blk-cgroup.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c index e8918ff..4e07e73 100644 --- a/block/blk-cgroup.c +++ b/block/blk-cgroup.c @@ -560,6 +560,7 @@ EXPORT_SYMBOL_GPL(__blkg_prfill_u64); * @rwstat: rwstat to print * * Print @rwstat to @sf for the device assocaited with @pd. + * Devices with zero activity will not be printed. */ u64 __blkg_prfill_rwstat(struct seq_file *sf, struct blkg_policy_data *pd, const struct blkg_rwstat *rwstat) @@ -571,19 +572,23 @@ u64 __blkg_prfill_rwstat(struct seq_file *sf, struct blkg_policy_data *pd, [BLKG_RWSTAT_ASYNC] = "Async", }; const char *dname = blkg_dev_name(pd->blkg); - u64 v; + u64 total; int i; if (!dname) return 0; + total = rwstat->cnt[BLKG_RWSTAT_READ] + rwstat->cnt[BLKG_RWSTAT_WRITE]; + /* skip devices with no activity */ + if (!total) + return 0; + for (i = 0; i < BLKG_RWSTAT_NR; i++) seq_printf(sf, "%s %s %llu\n", dname, rwstr[i], (unsigned long long)rwstat->cnt[i]); - v = rwstat->cnt[BLKG_RWSTAT_READ] + rwstat->cnt[BLKG_RWSTAT_WRITE]; - seq_printf(sf, "%s Total %llu\n", dname, (unsigned long long)v); - return v; + seq_printf(sf, "%s Total %llu\n", dname, (unsigned long long)total); + return total; } EXPORT_SYMBOL_GPL(__blkg_prfill_rwstat); -- 1.8.3 ^ permalink raw reply related [flat|nested] 12+ messages in thread
[parent not found: <1372351046-11976-1-git-send-email-anatol.pomozov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>]
* Re: [PATCH] cgroups: Do not show inactive devices in cgroups statistics [not found] ` <1372351046-11976-1-git-send-email-anatol.pomozov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> @ 2013-06-27 16:59 ` Tejun Heo [not found] ` <20130627165943.GA5599-9pTldWuhBndy/B6EtB590w@public.gmane.org> 0 siblings, 1 reply; 12+ messages in thread From: Tejun Heo @ 2013-06-27 16:59 UTC (permalink / raw) To: Anatol Pomozov; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, Jens Axboe (cc'ing Jens) Hello, This one probably needs -stable cc'd. Jens, do you wanna take this? If not, I'll be happy to take it through cgroup tree. Thanks. On Thu, Jun 27, 2013 at 09:37:26AM -0700, Anatol Pomozov wrote: > Before 3.5 CFQ cgroups stats was lazy initialized - the stat object was created > only if the device had any I/O in that cgroup. > > Around 3.5 time CFQ stat structure became part of device structure. And > zero stat is created when device is initialized. Initialization can happen > e.g. when we configure it via 'weight_device' file. It calls > cfqg_set_weight_device() and it creates cfq_group structure. > > But even if device is initialized the stats is still zero. > If we configure all devices in all in all cgroups it generates a lot of > useless stat info that takes more resources to process and store. Imagine > servers with handreds cgroups and handreds iSCSI block devices. > > Do not print stats for inactive devices - it restores functionality > that was before 3.5 > > Signed-off-by: Anatol Pomozov <anatol.pomozov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> Acked-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> > --- > block/blk-cgroup.c | 13 +++++++++---- > 1 file changed, 9 insertions(+), 4 deletions(-) > > diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c > index e8918ff..4e07e73 100644 > --- a/block/blk-cgroup.c > +++ b/block/blk-cgroup.c > @@ -560,6 +560,7 @@ EXPORT_SYMBOL_GPL(__blkg_prfill_u64); > * @rwstat: rwstat to print > * > * Print @rwstat to @sf for the device assocaited with @pd. > + * Devices with zero activity will not be printed. > */ > u64 __blkg_prfill_rwstat(struct seq_file *sf, struct blkg_policy_data *pd, > const struct blkg_rwstat *rwstat) > @@ -571,19 +572,23 @@ u64 __blkg_prfill_rwstat(struct seq_file *sf, struct blkg_policy_data *pd, > [BLKG_RWSTAT_ASYNC] = "Async", > }; > const char *dname = blkg_dev_name(pd->blkg); > - u64 v; > + u64 total; > int i; > > if (!dname) > return 0; > > + total = rwstat->cnt[BLKG_RWSTAT_READ] + rwstat->cnt[BLKG_RWSTAT_WRITE]; > + /* skip devices with no activity */ > + if (!total) > + return 0; > + > for (i = 0; i < BLKG_RWSTAT_NR; i++) > seq_printf(sf, "%s %s %llu\n", dname, rwstr[i], > (unsigned long long)rwstat->cnt[i]); > > - v = rwstat->cnt[BLKG_RWSTAT_READ] + rwstat->cnt[BLKG_RWSTAT_WRITE]; > - seq_printf(sf, "%s Total %llu\n", dname, (unsigned long long)v); > - return v; > + seq_printf(sf, "%s Total %llu\n", dname, (unsigned long long)total); > + return total; > } > EXPORT_SYMBOL_GPL(__blkg_prfill_rwstat); > > -- > 1.8.3 > -- tejun ^ permalink raw reply [flat|nested] 12+ messages in thread
[parent not found: <20130627165943.GA5599-9pTldWuhBndy/B6EtB590w@public.gmane.org>]
* Re: [PATCH] cgroups: Do not show inactive devices in cgroups statistics [not found] ` <20130627165943.GA5599-9pTldWuhBndy/B6EtB590w@public.gmane.org> @ 2013-06-27 17:21 ` Anatol Pomozov 0 siblings, 0 replies; 12+ messages in thread From: Anatol Pomozov @ 2013-06-27 17:21 UTC (permalink / raw) To: Tejun Heo; +Cc: Cgroups, Jens Axboe Hi On Thu, Jun 27, 2013 at 9:59 AM, Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote: > (cc'ing Jens) > > Hello, > > This one probably needs -stable cc'd. Jens, do you wanna take this? > If not, I'll be happy to take it through cgroup tree. Personally I think it should not go to -stable. The stats was changed in 3.5 kernel and nobody complained about it, so I guess nobody (except us) cares about the file format. > On Thu, Jun 27, 2013 at 09:37:26AM -0700, Anatol Pomozov wrote: >> Before 3.5 CFQ cgroups stats was lazy initialized - the stat object was created >> only if the device had any I/O in that cgroup. >> >> Around 3.5 time CFQ stat structure became part of device structure. And >> zero stat is created when device is initialized. Initialization can happen >> e.g. when we configure it via 'weight_device' file. It calls >> cfqg_set_weight_device() and it creates cfq_group structure. >> >> But even if device is initialized the stats is still zero. >> If we configure all devices in all in all cgroups it generates a lot of >> useless stat info that takes more resources to process and store. Imagine >> servers with handreds cgroups and handreds iSCSI block devices. >> >> Do not print stats for inactive devices - it restores functionality >> that was before 3.5 >> >> Signed-off-by: Anatol Pomozov <anatol.pomozov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> > > Acked-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> > >> --- >> block/blk-cgroup.c | 13 +++++++++---- >> 1 file changed, 9 insertions(+), 4 deletions(-) >> >> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c >> index e8918ff..4e07e73 100644 >> --- a/block/blk-cgroup.c >> +++ b/block/blk-cgroup.c >> @@ -560,6 +560,7 @@ EXPORT_SYMBOL_GPL(__blkg_prfill_u64); >> * @rwstat: rwstat to print >> * >> * Print @rwstat to @sf for the device assocaited with @pd. >> + * Devices with zero activity will not be printed. >> */ >> u64 __blkg_prfill_rwstat(struct seq_file *sf, struct blkg_policy_data *pd, >> const struct blkg_rwstat *rwstat) >> @@ -571,19 +572,23 @@ u64 __blkg_prfill_rwstat(struct seq_file *sf, struct blkg_policy_data *pd, >> [BLKG_RWSTAT_ASYNC] = "Async", >> }; >> const char *dname = blkg_dev_name(pd->blkg); >> - u64 v; >> + u64 total; >> int i; >> >> if (!dname) >> return 0; >> >> + total = rwstat->cnt[BLKG_RWSTAT_READ] + rwstat->cnt[BLKG_RWSTAT_WRITE]; >> + /* skip devices with no activity */ >> + if (!total) >> + return 0; >> + >> for (i = 0; i < BLKG_RWSTAT_NR; i++) >> seq_printf(sf, "%s %s %llu\n", dname, rwstr[i], >> (unsigned long long)rwstat->cnt[i]); >> >> - v = rwstat->cnt[BLKG_RWSTAT_READ] + rwstat->cnt[BLKG_RWSTAT_WRITE]; >> - seq_printf(sf, "%s Total %llu\n", dname, (unsigned long long)v); >> - return v; >> + seq_printf(sf, "%s Total %llu\n", dname, (unsigned long long)total); >> + return total; >> } >> EXPORT_SYMBOL_GPL(__blkg_prfill_rwstat); >> >> -- >> 1.8.3 >> > > -- > tejun ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] cgroups: Do not show inactive devices in cgroups statistics [not found] ` <20130627012304.GG4536-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org> 2013-06-27 1:23 ` Tejun Heo 2013-06-27 16:37 ` Anatol Pomozov @ 2013-06-27 16:39 ` Anatol Pomozov 2 siblings, 0 replies; 12+ messages in thread From: Anatol Pomozov @ 2013-06-27 16:39 UTC (permalink / raw) To: Tejun Heo; +Cc: Vivek Goyal, Cgroups, Jens Axboe Hi On Wed, Jun 26, 2013 at 6:23 PM, Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote: > Hello, > > On Wed, Jun 26, 2013 at 05:37:38PM -0700, Anatol Pomozov wrote: >> Bottom line: when we set weight for a device then it creates cfq_group >> and initializes device stats with zero. Later that zero stat is shown >> even if no activity happened on the device. Both CFQ and our custom IO >> scheduler have this problem. And because we write set weight for all >> devices in all cgroups we endup with a lot of zero useless stats. >> >> What is the right fix in this situation? > > Yeah, I think yours is fine. You're just configuring each blkg during > init. Please just update the u64 case and repost. Yep. But it looks like before 3.5 stats initialization was not tied to device initialization. ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2013-06-27 17:21 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-06-26 19:26 [PATCH] cgroups: Do not show inactive devices in cgroups statistics Anatol Pomozov
[not found] ` <1372274770-30679-1-git-send-email-anatol.pomozov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2013-06-26 19:36 ` Anatol Pomozov
2013-06-26 20:45 ` Tejun Heo
[not found] ` <20130626204540.GA4536-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2013-06-26 21:08 ` Vivek Goyal
[not found] ` <20130626210820.GA17564-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-06-26 21:11 ` Tejun Heo
[not found] ` <CAOS58YP_Umud2DTY+3tc4m02FywGNOEALSN1An4OLXczcmsQ4Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-06-27 0:37 ` Anatol Pomozov
[not found] ` <CAOMFOmVqevNSKgWR5keT8u7vsCRpHhYROejM6OucyNeeA6j6zQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-06-27 1:23 ` Tejun Heo
[not found] ` <20130627012304.GG4536-Gd/HAXX7CRxy/B6EtB590w@public.gmane.org>
2013-06-27 1:23 ` Tejun Heo
2013-06-27 16:37 ` Anatol Pomozov
[not found] ` <1372351046-11976-1-git-send-email-anatol.pomozov-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2013-06-27 16:59 ` Tejun Heo
[not found] ` <20130627165943.GA5599-9pTldWuhBndy/B6EtB590w@public.gmane.org>
2013-06-27 17:21 ` Anatol Pomozov
2013-06-27 16:39 ` Anatol Pomozov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox