* [PATCH v1 0/5] cgroup: Assign subsystem IDs during compile time
@ 2012-08-17 14:58 Daniel Wagner
[not found] ` <1345215494-9181-1-git-send-email-wagi-kQCPcA+X3s7YtjvyW6yDsg@public.gmane.org>
0 siblings, 1 reply; 15+ messages in thread
From: Daniel Wagner @ 2012-08-17 14:58 UTC (permalink / raw)
To: netdev-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA
Cc: Daniel Wagner
From: Daniel Wagner <daniel.wagner-98C5kh4wR6ohFhg+JK9F0w@public.gmane.org>
Hi,
I was able to 'fix' CGROUP_BUILTIN_SUBSYS_COUNT defition. With this
version there is no unused subsys_id.
The number of builtin subsystem are counted with gcc's predefined
__COUNTER__ macro. This is a bit fragile, because __COUNTER__
is only reset to 0 per compile unit. There is a workaround for this.
When starting to enumate we need to store the current value of
__COUNTER__ and then subtract that from all enums we define.
Not sure if that is okay or not.
cheers,
daniel
Original cover letter:
The patch #1 and #2 are there to be able to introduce (#3, #4) the
jump labels in task_cls_classid() and task_netprioidx(). The jump
labels are needed to know when it is safe to access the controller.
For example not safe means the module is not yet loaded.
All those patches are just preparation for the center piece (#5)
of these series. This one will remove the dynamic subsystem ID
generation and falls back to compile time generated IDs.
This is the first result from the discussion around on the
"cgroup cls & netprio 'cleanups'" patches.
This patches are against net-next
v1: - only use jump labels when built as module (#3, #4)
- get rid of the additional 'pointer' (#5)
v0: - initial version
Daniel Wagner (5):
cgroup: Use empty task_cls_classid() when !CONFIG_NET_CLS(_MODULE)
cgroup: Move sock_update_classid() decleration to cls_cgroup.h
cgroup: Protect access to task_cls_classid() when built as module
cgroup: Protect access to task_netprioidx() when built as module
cgroup: Assign subsystem IDs during compile time
include/linux/cgroup.h | 27 ++++++++++++++++++---------
include/linux/cgroup_subsys.h | 24 ++++++++++++------------
include/net/cls_cgroup.h | 39 ++++++++++++++++++++++++++-------------
include/net/netprio_cgroup.h | 25 +++++++++++--------------
include/net/sock.h | 8 --------
kernel/cgroup.c | 25 ++++++-------------------
net/core/netprio_cgroup.c | 21 ++++++++++-----------
net/core/sock.c | 12 ++++++------
net/sched/cls_cgroup.c | 22 +++++++++-------------
9 files changed, 98 insertions(+), 105 deletions(-)
--
1.7.12.rc1.16.g05a20c8
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v1 1/5] cgroup: Use empty task_cls_classid() when !CONFIG_NET_CLS(_MODULE)
[not found] ` <1345215494-9181-1-git-send-email-wagi-kQCPcA+X3s7YtjvyW6yDsg@public.gmane.org>
@ 2012-08-17 14:58 ` Daniel Wagner
2012-08-17 14:58 ` [PATCH v1 2/5] cgroup: Move sock_update_classid() decleration to cls_cgroup.h Daniel Wagner
` (3 subsequent siblings)
4 siblings, 0 replies; 15+ messages in thread
From: Daniel Wagner @ 2012-08-17 14:58 UTC (permalink / raw)
To: netdev-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA
Cc: Daniel Wagner, David S. Miller, Gao feng, Jamal Hadi Salim,
John Fastabend, Li Zefan, Neil Horman, Tejun Heo
From: Daniel Wagner <daniel.wagner-98C5kh4wR6ohFhg+JK9F0w@public.gmane.org>
Signed-off-by: Daniel Wagner <daniel.wagner-98C5kh4wR6ohFhg+JK9F0w@public.gmane.org>
Cc: "David S. Miller" <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
Cc: Gao feng <gaofeng-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
Cc: Jamal Hadi Salim <jhs-jkUAjuhPggJWk0Htik3J/w@public.gmane.org>
Cc: John Fastabend <john.r.fastabend-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Cc: Neil Horman <nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>
Cc: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
---
include/net/cls_cgroup.h | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/include/net/cls_cgroup.h b/include/net/cls_cgroup.h
index a4dc5b0..e2fe2b9 100644
--- a/include/net/cls_cgroup.h
+++ b/include/net/cls_cgroup.h
@@ -24,7 +24,8 @@ struct cgroup_cls_state
u32 classid;
};
-#ifdef CONFIG_NET_CLS_CGROUP
+#if IS_BUILTIN(CONFIG_NET_CLS_CGROUP)
+
static inline u32 task_cls_classid(struct task_struct *p)
{
int classid;
@@ -39,7 +40,9 @@ static inline u32 task_cls_classid(struct task_struct *p)
return classid;
}
-#else
+
+#elif IS_MODULE(CONFIG_NET_CLS_CGROUP)
+
extern int net_cls_subsys_id;
static inline u32 task_cls_classid(struct task_struct *p)
@@ -60,11 +63,16 @@ static inline u32 task_cls_classid(struct task_struct *p)
return classid;
}
-#endif
+
#else
+
static inline u32 task_cls_classid(struct task_struct *p)
{
return 0;
}
-#endif
+
+#endif /* CONFIG_NET_CLS_CGROUP */
+
+#endif /* CONFIG_CGROURPS */
+
#endif /* _NET_CLS_CGROUP_H */
--
1.7.12.rc1.16.g05a20c8
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v1 2/5] cgroup: Move sock_update_classid() decleration to cls_cgroup.h
[not found] ` <1345215494-9181-1-git-send-email-wagi-kQCPcA+X3s7YtjvyW6yDsg@public.gmane.org>
2012-08-17 14:58 ` [PATCH v1 1/5] cgroup: Use empty task_cls_classid() when !CONFIG_NET_CLS(_MODULE) Daniel Wagner
@ 2012-08-17 14:58 ` Daniel Wagner
2012-08-17 14:58 ` [PATCH v1 3/5] cgroup: Protect access to task_cls_classid() when built as module Daniel Wagner
` (2 subsequent siblings)
4 siblings, 0 replies; 15+ messages in thread
From: Daniel Wagner @ 2012-08-17 14:58 UTC (permalink / raw)
To: netdev-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA
Cc: Daniel Wagner, Gao feng, Jamal Hadi Salim, John Fastabend,
Li Zefan, Neil Horman, Tejun Heo
From: Daniel Wagner <daniel.wagner-98C5kh4wR6ohFhg+JK9F0w@public.gmane.org>
The only user of sock_update_classid() is net/socket.c which
happens to include cls_cgroup.h direclty.
Signed-off-by: Daniel Wagner <daniel.wagner-98C5kh4wR6ohFhg+JK9F0w@public.gmane.org>
Cc: Gao feng <gaofeng-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
Cc: Jamal Hadi Salim <jhs-jkUAjuhPggJWk0Htik3J/w@public.gmane.org>
Cc: John Fastabend <john.r.fastabend-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Cc: Neil Horman <nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>
Cc: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
---
include/net/cls_cgroup.h | 8 ++++++++
include/net/sock.h | 8 --------
2 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/include/net/cls_cgroup.h b/include/net/cls_cgroup.h
index e2fe2b9..401672c 100644
--- a/include/net/cls_cgroup.h
+++ b/include/net/cls_cgroup.h
@@ -24,6 +24,8 @@ struct cgroup_cls_state
u32 classid;
};
+extern void sock_update_classid(struct sock *sk);
+
#if IS_BUILTIN(CONFIG_NET_CLS_CGROUP)
static inline u32 task_cls_classid(struct task_struct *p)
@@ -73,6 +75,12 @@ static inline u32 task_cls_classid(struct task_struct *p)
#endif /* CONFIG_NET_CLS_CGROUP */
+#else /* !CONFIG_CGROUPS */
+
+static inline void sock_update_classid(struct sock *sk)
+{
+}
+
#endif /* CONFIG_CGROURPS */
#endif /* _NET_CLS_CGROUP_H */
diff --git a/include/net/sock.h b/include/net/sock.h
index 72132ae..160a680 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1486,14 +1486,6 @@ extern void *sock_kmalloc(struct sock *sk, int size,
extern void sock_kfree_s(struct sock *sk, void *mem, int size);
extern void sk_send_sigurg(struct sock *sk);
-#ifdef CONFIG_CGROUPS
-extern void sock_update_classid(struct sock *sk);
-#else
-static inline void sock_update_classid(struct sock *sk)
-{
-}
-#endif
-
/*
* Functions to fill in entries in struct proto_ops when a protocol
* does not implement a particular function.
--
1.7.12.rc1.16.g05a20c8
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v1 3/5] cgroup: Protect access to task_cls_classid() when built as module
[not found] ` <1345215494-9181-1-git-send-email-wagi-kQCPcA+X3s7YtjvyW6yDsg@public.gmane.org>
2012-08-17 14:58 ` [PATCH v1 1/5] cgroup: Use empty task_cls_classid() when !CONFIG_NET_CLS(_MODULE) Daniel Wagner
2012-08-17 14:58 ` [PATCH v1 2/5] cgroup: Move sock_update_classid() decleration to cls_cgroup.h Daniel Wagner
@ 2012-08-17 14:58 ` Daniel Wagner
[not found] ` <1345215494-9181-4-git-send-email-wagi-kQCPcA+X3s7YtjvyW6yDsg@public.gmane.org>
2012-08-17 14:58 ` [PATCH v1 4/5] cgroup: Protect access to task_netprioidx() " Daniel Wagner
2012-08-17 14:58 ` [PATCH v1 5/5] cgroup: Assign subsystem IDs during compile time Daniel Wagner
4 siblings, 1 reply; 15+ messages in thread
From: Daniel Wagner @ 2012-08-17 14:58 UTC (permalink / raw)
To: netdev-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA
Cc: Daniel Wagner, David S. Miller, Gao feng, Jamal Hadi Salim,
John Fastabend, Li Zefan, Neil Horman, Tejun Heo
From: Daniel Wagner <daniel.wagner-98C5kh4wR6ohFhg+JK9F0w@public.gmane.org>
The module version of task_cls_classid() checks if net_cls_sbusys_id
is valid to indentify when it is okay to access the controller.
Instead relying on the subusys_id to be set, make it explicit
with a jump label.
Signed-off-by: Daniel Wagner <daniel.wagner-98C5kh4wR6ohFhg+JK9F0w@public.gmane.org>
Cc: "David S. Miller" <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
Cc: Gao feng <gaofeng-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
Cc: Jamal Hadi Salim <jhs-jkUAjuhPggJWk0Htik3J/w@public.gmane.org>
Cc: John Fastabend <john.r.fastabend-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Cc: Neil Horman <nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>
Cc: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
---
include/net/cls_cgroup.h | 5 ++++-
net/core/sock.c | 5 +++++
net/sched/cls_cgroup.c | 9 +++++++++
3 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/include/net/cls_cgroup.h b/include/net/cls_cgroup.h
index 401672c..bbbd957 100644
--- a/include/net/cls_cgroup.h
+++ b/include/net/cls_cgroup.h
@@ -16,6 +16,7 @@
#include <linux/cgroup.h>
#include <linux/hardirq.h>
#include <linux/rcupdate.h>
+#include <linux/jump_label.h>
#ifdef CONFIG_CGROUPS
struct cgroup_cls_state
@@ -44,6 +45,8 @@ static inline u32 task_cls_classid(struct task_struct *p)
}
#elif IS_MODULE(CONFIG_NET_CLS_CGROUP)
+extern struct static_key cgroup_cls_enabled;
+#define clscg_enabled static_key_false(&cgroup_cls_enabled)
extern int net_cls_subsys_id;
@@ -52,7 +55,7 @@ static inline u32 task_cls_classid(struct task_struct *p)
int id;
u32 classid = 0;
- if (in_interrupt())
+ if (!clscg_enabled || in_interrupt())
return 0;
rcu_read_lock();
diff --git a/net/core/sock.c b/net/core/sock.c
index 8f67ced..8106e77 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -327,6 +327,11 @@ int __sk_backlog_rcv(struct sock *sk, struct sk_buff *skb)
EXPORT_SYMBOL(__sk_backlog_rcv);
#if defined(CONFIG_CGROUPS)
+#if IS_MODULE(CONFIG_NET_CLS_CGROUP)
+struct static_key cgroup_cls_enabled = STATIC_KEY_INIT_FALSE;
+EXPORT_SYMBOL_GPL(cgroup_cls_enabled);
+#endif
+
#if !defined(CONFIG_NET_CLS_CGROUP)
int net_cls_subsys_id = -1;
EXPORT_SYMBOL_GPL(net_cls_subsys_id);
diff --git a/net/sched/cls_cgroup.c b/net/sched/cls_cgroup.c
index 7743ea8..0635894 100644
--- a/net/sched/cls_cgroup.c
+++ b/net/sched/cls_cgroup.c
@@ -44,12 +44,21 @@ static struct cgroup_subsys_state *cgrp_create(struct cgroup *cgrp)
if (cgrp->parent)
cs->classid = cgrp_cls_state(cgrp->parent)->classid;
+#if IS_MODULE(CONFIG_NET_CLS_CGROUP)
+ else if (!clscg_enabled)
+ static_key_slow_inc(&cgroup_cls_enabled);
+#endif
return &cs->css;
}
static void cgrp_destroy(struct cgroup *cgrp)
{
+#if IS_MODULE(CONFIG_NET_CLS_CGROUP)
+ if (!cgrp->parent && clscg_enabled)
+ static_key_slow_dec(&cgroup_cls_enabled);
+#endif
+
kfree(cgrp_cls_state(cgrp));
}
--
1.7.12.rc1.16.g05a20c8
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v1 4/5] cgroup: Protect access to task_netprioidx() when built as module
[not found] ` <1345215494-9181-1-git-send-email-wagi-kQCPcA+X3s7YtjvyW6yDsg@public.gmane.org>
` (2 preceding siblings ...)
2012-08-17 14:58 ` [PATCH v1 3/5] cgroup: Protect access to task_cls_classid() when built as module Daniel Wagner
@ 2012-08-17 14:58 ` Daniel Wagner
2012-08-17 14:58 ` [PATCH v1 5/5] cgroup: Assign subsystem IDs during compile time Daniel Wagner
4 siblings, 0 replies; 15+ messages in thread
From: Daniel Wagner @ 2012-08-17 14:58 UTC (permalink / raw)
To: netdev-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA
Cc: Daniel Wagner, David S. Miller, Gao feng, Jamal Hadi Salim,
John Fastabend, Li Zefan, Neil Horman, Tejun Heo
From: Daniel Wagner <daniel.wagner-98C5kh4wR6ohFhg+JK9F0w@public.gmane.org>
The module version of task_netprioidex() checks if net_prio_subsys_id
is valid to indentify when it is okay to access the controller.
Instead relying on the net_prio_subsys_id to be set, make it explicit
with a jump label.
Signed-off-by: Daniel Wagner <daniel.wagner-98C5kh4wR6ohFhg+JK9F0w@public.gmane.org>
Cc: "David S. Miller" <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
Cc: Gao feng <gaofeng-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
Cc: Jamal Hadi Salim <jhs-jkUAjuhPggJWk0Htik3J/w@public.gmane.org>
Cc: John Fastabend <john.r.fastabend-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Cc: Neil Horman <nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>
Cc: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
---
include/net/netprio_cgroup.h | 8 +++++++-
net/core/netprio_cgroup.c | 10 ++++++++++
net/core/sock.c | 4 ++++
3 files changed, 21 insertions(+), 1 deletion(-)
diff --git a/include/net/netprio_cgroup.h b/include/net/netprio_cgroup.h
index 2719dec..9ff58e4 100644
--- a/include/net/netprio_cgroup.h
+++ b/include/net/netprio_cgroup.h
@@ -16,7 +16,7 @@
#include <linux/cgroup.h>
#include <linux/hardirq.h>
#include <linux/rcupdate.h>
-
+#include <linux/jump_label.h>
struct netprio_map {
struct rcu_head rcu;
@@ -54,12 +54,18 @@ static inline u32 task_netprioidx(struct task_struct *p)
#elif IS_MODULE(CONFIG_NETPRIO_CGROUP)
+extern struct static_key cgroup_netprio_enabled;
+#define netpriocg_enabled static_key_false(&cgroup_netprio_enabled)
+
static inline u32 task_netprioidx(struct task_struct *p)
{
struct cgroup_netprio_state *state;
int subsys_id;
u32 idx = 0;
+ if (!netpriocg_enabled)
+ return 0;
+
rcu_read_lock();
subsys_id = rcu_dereference_index_check(net_prio_subsys_id,
rcu_read_lock_held());
diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index ed0c043..94e1270 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -155,6 +155,11 @@ static struct cgroup_subsys_state *cgrp_create(struct cgroup *cgrp)
goto out;
}
+#if IS_MODULE(CONFIG_NETPRIO_CGROUP)
+ if (!netpriocg_enabled && !cgrp->parent)
+ static_key_slow_inc(&cgroup_netprio_enabled);
+#endif
+
ret = update_netdev_tables();
if (ret < 0) {
put_prioidx(cs->prioidx);
@@ -173,6 +178,11 @@ static void cgrp_destroy(struct cgroup *cgrp)
struct net_device *dev;
struct netprio_map *map;
+#if IS_MODULE(CONFIG_NETPRIO_CGROUP)
+ if (netpriocg_enabled && !cgrp->parent)
+ static_key_slow_dec(&cgroup_netprio_enabled);
+#endif
+
cs = cgrp_netprio_state(cgrp);
rtnl_lock();
for_each_netdev(&init_net, dev) {
diff --git a/net/core/sock.c b/net/core/sock.c
index 8106e77..1f119d2 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -331,6 +331,10 @@ EXPORT_SYMBOL(__sk_backlog_rcv);
struct static_key cgroup_cls_enabled = STATIC_KEY_INIT_FALSE;
EXPORT_SYMBOL_GPL(cgroup_cls_enabled);
#endif
+#if IS_MODULE(CONFIG_NETPRIO_CGROUP)
+struct static_key cgroup_netprio_enabled = STATIC_KEY_INIT_FALSE;
+EXPORT_SYMBOL_GPL(cgroup_netprio_enabled);
+#endif
#if !defined(CONFIG_NET_CLS_CGROUP)
int net_cls_subsys_id = -1;
--
1.7.12.rc1.16.g05a20c8
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v1 5/5] cgroup: Assign subsystem IDs during compile time
[not found] ` <1345215494-9181-1-git-send-email-wagi-kQCPcA+X3s7YtjvyW6yDsg@public.gmane.org>
` (3 preceding siblings ...)
2012-08-17 14:58 ` [PATCH v1 4/5] cgroup: Protect access to task_netprioidx() " Daniel Wagner
@ 2012-08-17 14:58 ` Daniel Wagner
4 siblings, 0 replies; 15+ messages in thread
From: Daniel Wagner @ 2012-08-17 14:58 UTC (permalink / raw)
To: netdev-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA
Cc: Daniel Wagner, David S. Miller, Andrew Morton, Eric Dumazet,
Gao feng, Glauber Costa, Jamal Hadi Salim, John Fastabend,
Kamezawa Hiroyuki, Li Zefan, Neil Horman, Tejun Heo
From: Daniel Wagner <daniel.wagner-98C5kh4wR6ohFhg+JK9F0w@public.gmane.org>
We are able to safe some space when we assign the subsystem
IDs at compile time. Instead of allocating per cgroup
cgroup->subsys[CGROUP_SUBSYS_COUNT] where CGROUP_SUBSYS_COUNT is
always 64, we allocate at max 12 (at this point there are 12
subsystem).
The enum is created by passing in cgroup_subsys.h twice and
redefine the IS_SUBSYS_ENABLED. In the first pass, we just select
the builtin subsystem and in the second pass only the module
subsystems.
task_cls_classid() and task_netprioidx() (when built as
module) are protected by a jump label and therefore we can
simply replace the subsystem index lookup with the enum.
Signed-off-by: Daniel Wagner <daniel.wagner-98C5kh4wR6ohFhg+JK9F0w@public.gmane.org>
Cc: "David S. Miller" <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
Cc: Andrew Morton <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
Cc: Eric Dumazet <edumazet-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
Cc: Gao feng <gaofeng-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
Cc: Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
Cc: Jamal Hadi Salim <jhs-jkUAjuhPggJWk0Htik3J/w@public.gmane.org>
Cc: John Fastabend <john.r.fastabend-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
Cc: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
Cc: Neil Horman <nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>
Cc: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
---
include/linux/cgroup.h | 27 ++++++++++++++++++---------
include/linux/cgroup_subsys.h | 24 ++++++++++++------------
include/net/cls_cgroup.h | 12 +++---------
include/net/netprio_cgroup.h | 17 ++++-------------
kernel/cgroup.c | 25 ++++++-------------------
net/core/netprio_cgroup.c | 11 -----------
net/core/sock.c | 9 ---------
net/sched/cls_cgroup.c | 13 -------------
8 files changed, 43 insertions(+), 95 deletions(-)
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index c90eaa8..1b9c8d9 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -44,18 +44,25 @@ extern void cgroup_unload_subsys(struct cgroup_subsys *ss);
extern const struct file_operations proc_cgroup_operations;
/* Define the enumeration of all builtin cgroup subsystems */
-#define SUBSYS(_x) _x ## _subsys_id,
+#define INCREASE_COUNTER() __COUNTER__
enum cgroup_subsys_id {
+
+#define SUBSYS(_x) _x ## _subsys_id = INCREASE_COUNTER(),
+#define IS_SUBSYS_ENABLED(option) IS_BUILTIN(option)
#include <linux/cgroup_subsys.h>
- CGROUP_BUILTIN_SUBSYS_COUNT
-};
#undef SUBSYS
-/*
- * This define indicates the maximum number of subsystems that can be loaded
- * at once. We limit to this many since cgroupfs_root has subsys_bits to keep
- * track of all of them.
- */
-#define CGROUP_SUBSYS_COUNT (BITS_PER_BYTE*sizeof(unsigned long))
+#undef IS_SUBSYS_ENABLED
+
+#define SUBSYS(_x) _x ## _subsys_id,
+#define IS_SUBSYS_ENABLED(option) IS_MODULE(option)
+#include <linux/cgroup_subsys.h>
+#undef SUBSYS
+#undef IS_SUBSYS_ENABLED
+
+ CGROUP_SUBSYS_COUNT,
+ CGROUP_BUILTIN_SUBSYS_COUNT = INCREASE_COUNTER()
+};
+#undef INCREASE_COUNTER
/* Per-subsystem/per-cgroup state maintained by the system. */
struct cgroup_subsys_state {
@@ -521,7 +528,9 @@ struct cgroup_subsys {
};
#define SUBSYS(_x) extern struct cgroup_subsys _x ## _subsys;
+#define IS_SUBSYS_ENABLED(option) IS_ENABLED(option)
#include <linux/cgroup_subsys.h>
+#undef IS_SUBSYS_ENABLED
#undef SUBSYS
static inline struct cgroup_subsys_state *cgroup_subsys_state(
diff --git a/include/linux/cgroup_subsys.h b/include/linux/cgroup_subsys.h
index dfae957..f204a7a 100644
--- a/include/linux/cgroup_subsys.h
+++ b/include/linux/cgroup_subsys.h
@@ -7,73 +7,73 @@
/* */
-#ifdef CONFIG_CPUSETS
+#if IS_SUBSYS_ENABLED(CONFIG_CPUSETS)
SUBSYS(cpuset)
#endif
/* */
-#ifdef CONFIG_CGROUP_DEBUG
+#if IS_SUBSYS_ENABLED(CONFIG_CGROUP_DEBUG)
SUBSYS(debug)
#endif
/* */
-#ifdef CONFIG_CGROUP_SCHED
+#if IS_SUBSYS_ENABLED(CONFIG_CGROUP_SCHED)
SUBSYS(cpu_cgroup)
#endif
/* */
-#ifdef CONFIG_CGROUP_CPUACCT
+#if IS_SUBSYS_ENABLED(CONFIG_CGROUP_CPUACCT)
SUBSYS(cpuacct)
#endif
/* */
-#ifdef CONFIG_MEMCG
+#if IS_SUBSYS_ENABLED(CONFIG_MEMCG)
SUBSYS(mem_cgroup)
#endif
/* */
-#ifdef CONFIG_CGROUP_DEVICE
+#if IS_SUBSYS_ENABLED(CONFIG_CGROUP_DEVICE)
SUBSYS(devices)
#endif
/* */
-#ifdef CONFIG_CGROUP_FREEZER
+#if IS_SUBSYS_ENABLED(CONFIG_CGROUP_FREEZER)
SUBSYS(freezer)
#endif
/* */
-#ifdef CONFIG_NET_CLS_CGROUP
+#if IS_SUBSYS_ENABLED(CONFIG_NET_CLS_CGROUP)
SUBSYS(net_cls)
#endif
/* */
-#ifdef CONFIG_BLK_CGROUP
+#if IS_SUBSYS_ENABLED(CONFIG_BLK_CGROUP)
SUBSYS(blkio)
#endif
/* */
-#ifdef CONFIG_CGROUP_PERF
+#if IS_SUBSYS_ENABLED(CONFIG_CGROUP_PERF)
SUBSYS(perf)
#endif
/* */
-#ifdef CONFIG_NETPRIO_CGROUP
+#if IS_SUBSYS_ENABLED(CONFIG_NETPRIO_CGROUP)
SUBSYS(net_prio)
#endif
/* */
-#ifdef CONFIG_CGROUP_HUGETLB
+#if IS_SUBSYS_ENABLED(CONFIG_CGROUP_HUGETLB)
SUBSYS(hugetlb)
#endif
diff --git a/include/net/cls_cgroup.h b/include/net/cls_cgroup.h
index bbbd957..037bfb1 100644
--- a/include/net/cls_cgroup.h
+++ b/include/net/cls_cgroup.h
@@ -48,22 +48,16 @@ static inline u32 task_cls_classid(struct task_struct *p)
extern struct static_key cgroup_cls_enabled;
#define clscg_enabled static_key_false(&cgroup_cls_enabled)
-extern int net_cls_subsys_id;
-
static inline u32 task_cls_classid(struct task_struct *p)
{
- int id;
- u32 classid = 0;
+ u32 classid;
if (!clscg_enabled || in_interrupt())
return 0;
rcu_read_lock();
- id = rcu_dereference_index_check(net_cls_subsys_id,
- rcu_read_lock_held());
- if (id >= 0)
- classid = container_of(task_subsys_state(p, id),
- struct cgroup_cls_state, css)->classid;
+ classid = container_of(task_subsys_state(p, net_cls_subsys_id),
+ struct cgroup_cls_state, css)->classid;
rcu_read_unlock();
return classid;
diff --git a/include/net/netprio_cgroup.h b/include/net/netprio_cgroup.h
index 9ff58e4..66241c6 100644
--- a/include/net/netprio_cgroup.h
+++ b/include/net/netprio_cgroup.h
@@ -31,10 +31,6 @@ struct cgroup_netprio_state {
u32 prioidx;
};
-#ifndef CONFIG_NETPRIO_CGROUP
-extern int net_prio_subsys_id;
-#endif
-
extern void sock_update_netprioidx(struct sock *sk, struct task_struct *task);
#if IS_BUILTIN(CONFIG_NETPRIO_CGROUP)
@@ -60,20 +56,15 @@ extern struct static_key cgroup_netprio_enabled;
static inline u32 task_netprioidx(struct task_struct *p)
{
struct cgroup_netprio_state *state;
- int subsys_id;
- u32 idx = 0;
+ u32 idx;
if (!netpriocg_enabled)
return 0;
rcu_read_lock();
- subsys_id = rcu_dereference_index_check(net_prio_subsys_id,
- rcu_read_lock_held());
- if (subsys_id >= 0) {
- state = container_of(task_subsys_state(p, subsys_id),
- struct cgroup_netprio_state, css);
- idx = state->prioidx;
- }
+ state = container_of(task_subsys_state(p, net_prio_subsys_id),
+ struct cgroup_netprio_state, css);
+ idx = state->prioidx;
rcu_read_unlock();
return idx;
}
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 7981850..b00ae8a 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -93,9 +93,12 @@ static DEFINE_MUTEX(cgroup_root_mutex);
* cgroup_mutex.
*/
#define SUBSYS(_x) &_x ## _subsys,
+#define IS_SUBSYS_ENABLED(option) IS_BUILTIN(option)
static struct cgroup_subsys *subsys[CGROUP_SUBSYS_COUNT] = {
#include <linux/cgroup_subsys.h>
};
+#undef IS_SUBSYS_ENABLED
+#undef SUBSYS
#define MAX_CGROUP_ROOT_NAMELEN 64
@@ -4330,24 +4333,8 @@ int __init_or_module cgroup_load_subsys(struct cgroup_subsys *ss)
/* init base cftset */
cgroup_init_cftsets(ss);
- /*
- * need to register a subsys id before anything else - for example,
- * init_cgroup_css needs it.
- */
mutex_lock(&cgroup_mutex);
- /* find the first empty slot in the array */
- for (i = CGROUP_BUILTIN_SUBSYS_COUNT; i < CGROUP_SUBSYS_COUNT; i++) {
- if (subsys[i] == NULL)
- break;
- }
- if (i == CGROUP_SUBSYS_COUNT) {
- /* maximum number of subsystems already registered! */
- mutex_unlock(&cgroup_mutex);
- return -EBUSY;
- }
- /* assign ourselves the subsys_id */
- ss->subsys_id = i;
- subsys[i] = ss;
+ subsys[ss->subsys_id] = ss;
/*
* no ss->create seems to need anything important in the ss struct, so
@@ -4356,7 +4343,7 @@ int __init_or_module cgroup_load_subsys(struct cgroup_subsys *ss)
css = ss->create(dummytop);
if (IS_ERR(css)) {
/* failure case - need to deassign the subsys[] slot. */
- subsys[i] = NULL;
+ subsys[ss->subsys_id] = NULL;
mutex_unlock(&cgroup_mutex);
return PTR_ERR(css);
}
@@ -4372,7 +4359,7 @@ int __init_or_module cgroup_load_subsys(struct cgroup_subsys *ss)
if (ret) {
dummytop->subsys[ss->subsys_id] = NULL;
ss->destroy(dummytop);
- subsys[i] = NULL;
+ subsys[ss->subsys_id] = NULL;
mutex_unlock(&cgroup_mutex);
return ret;
}
diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index 94e1270..3a1d5cc 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -352,9 +352,7 @@ struct cgroup_subsys net_prio_subsys = {
.create = cgrp_create,
.destroy = cgrp_destroy,
.attach = net_prio_attach,
-#ifdef CONFIG_NETPRIO_CGROUP
.subsys_id = net_prio_subsys_id,
-#endif
.base_cftypes = ss_files,
.module = THIS_MODULE
};
@@ -392,10 +390,6 @@ static int __init init_cgroup_netprio(void)
ret = cgroup_load_subsys(&net_prio_subsys);
if (ret)
goto out;
-#ifndef CONFIG_NETPRIO_CGROUP
- smp_wmb();
- net_prio_subsys_id = net_prio_subsys.subsys_id;
-#endif
register_netdevice_notifier(&netprio_device_notifier);
@@ -412,11 +406,6 @@ static void __exit exit_cgroup_netprio(void)
cgroup_unload_subsys(&net_prio_subsys);
-#ifndef CONFIG_NETPRIO_CGROUP
- net_prio_subsys_id = -1;
- synchronize_rcu();
-#endif
-
rtnl_lock();
for_each_netdev(&init_net, dev) {
old = rtnl_dereference(dev->priomap);
diff --git a/net/core/sock.c b/net/core/sock.c
index 1f119d2..aa762d9 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -335,15 +335,6 @@ EXPORT_SYMBOL_GPL(cgroup_cls_enabled);
struct static_key cgroup_netprio_enabled = STATIC_KEY_INIT_FALSE;
EXPORT_SYMBOL_GPL(cgroup_netprio_enabled);
#endif
-
-#if !defined(CONFIG_NET_CLS_CGROUP)
-int net_cls_subsys_id = -1;
-EXPORT_SYMBOL_GPL(net_cls_subsys_id);
-#endif
-#if !defined(CONFIG_NETPRIO_CGROUP)
-int net_prio_subsys_id = -1;
-EXPORT_SYMBOL_GPL(net_prio_subsys_id);
-#endif
#endif
static int sock_set_timeout(long *timeo_p, char __user *optval, int optlen)
diff --git a/net/sched/cls_cgroup.c b/net/sched/cls_cgroup.c
index 0635894..5dd6fe6 100644
--- a/net/sched/cls_cgroup.c
+++ b/net/sched/cls_cgroup.c
@@ -86,9 +86,7 @@ struct cgroup_subsys net_cls_subsys = {
.name = "net_cls",
.create = cgrp_create,
.destroy = cgrp_destroy,
-#ifdef CONFIG_NET_CLS_CGROUP
.subsys_id = net_cls_subsys_id,
-#endif
.base_cftypes = ss_files,
.module = THIS_MODULE,
};
@@ -292,12 +290,6 @@ static int __init init_cgroup_cls(void)
if (ret)
goto out;
-#ifndef CONFIG_NET_CLS_CGROUP
- /* We can't use rcu_assign_pointer because this is an int. */
- smp_wmb();
- net_cls_subsys_id = net_cls_subsys.subsys_id;
-#endif
-
ret = register_tcf_proto_ops(&cls_cgroup_ops);
if (ret)
cgroup_unload_subsys(&net_cls_subsys);
@@ -310,11 +302,6 @@ static void __exit exit_cgroup_cls(void)
{
unregister_tcf_proto_ops(&cls_cgroup_ops);
-#ifndef CONFIG_NET_CLS_CGROUP
- net_cls_subsys_id = -1;
- synchronize_rcu();
-#endif
-
cgroup_unload_subsys(&net_cls_subsys);
}
--
1.7.12.rc1.16.g05a20c8
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v1 3/5] cgroup: Protect access to task_cls_classid() when built as module
[not found] ` <1345215494-9181-4-git-send-email-wagi-kQCPcA+X3s7YtjvyW6yDsg@public.gmane.org>
@ 2012-08-17 18:28 ` Neil Horman
2012-08-20 0:59 ` Li Zefan
[not found] ` <20120817182855.GA11607-B26myB8xz7F8NnZeBjwnZQMhkBWG/bsMQH7oEaQurus@public.gmane.org>
0 siblings, 2 replies; 15+ messages in thread
From: Neil Horman @ 2012-08-17 18:28 UTC (permalink / raw)
To: Daniel Wagner
Cc: netdev-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA,
Daniel Wagner, David S. Miller, Gao feng, Jamal Hadi Salim,
John Fastabend, Li Zefan, Tejun Heo
On Fri, Aug 17, 2012 at 04:58:12PM +0200, Daniel Wagner wrote:
> From: Daniel Wagner <daniel.wagner-98C5kh4wR6ohFhg+JK9F0w@public.gmane.org>
>
> The module version of task_cls_classid() checks if net_cls_sbusys_id
> is valid to indentify when it is okay to access the controller.
>
> Instead relying on the subusys_id to be set, make it explicit
> with a jump label.
>
> Signed-off-by: Daniel Wagner <daniel.wagner-98C5kh4wR6ohFhg+JK9F0w@public.gmane.org>
> Cc: "David S. Miller" <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
> Cc: Gao feng <gaofeng-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
> Cc: Jamal Hadi Salim <jhs-jkUAjuhPggJWk0Htik3J/w@public.gmane.org>
> Cc: John Fastabend <john.r.fastabend-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> Cc: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> Cc: Neil Horman <nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>
> Cc: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Cc: cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> ---
> include/net/cls_cgroup.h | 5 ++++-
> net/core/sock.c | 5 +++++
> net/sched/cls_cgroup.c | 9 +++++++++
> 3 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/include/net/cls_cgroup.h b/include/net/cls_cgroup.h
> index 401672c..bbbd957 100644
> --- a/include/net/cls_cgroup.h
> +++ b/include/net/cls_cgroup.h
> @@ -16,6 +16,7 @@
> #include <linux/cgroup.h>
> #include <linux/hardirq.h>
> #include <linux/rcupdate.h>
> +#include <linux/jump_label.h>
>
> #ifdef CONFIG_CGROUPS
> struct cgroup_cls_state
> @@ -44,6 +45,8 @@ static inline u32 task_cls_classid(struct task_struct *p)
> }
>
> #elif IS_MODULE(CONFIG_NET_CLS_CGROUP)
> +extern struct static_key cgroup_cls_enabled;
> +#define clscg_enabled static_key_false(&cgroup_cls_enabled)
>
> extern int net_cls_subsys_id;
>
> @@ -52,7 +55,7 @@ static inline u32 task_cls_classid(struct task_struct *p)
> int id;
> u32 classid = 0;
>
> - if (in_interrupt())
> + if (!clscg_enabled || in_interrupt())
> return 0;
>
> rcu_read_lock();
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 8f67ced..8106e77 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -327,6 +327,11 @@ int __sk_backlog_rcv(struct sock *sk, struct sk_buff *skb)
> EXPORT_SYMBOL(__sk_backlog_rcv);
>
> #if defined(CONFIG_CGROUPS)
> +#if IS_MODULE(CONFIG_NET_CLS_CGROUP)
> +struct static_key cgroup_cls_enabled = STATIC_KEY_INIT_FALSE;
> +EXPORT_SYMBOL_GPL(cgroup_cls_enabled);
> +#endif
> +
> #if !defined(CONFIG_NET_CLS_CGROUP)
> int net_cls_subsys_id = -1;
> EXPORT_SYMBOL_GPL(net_cls_subsys_id);
> diff --git a/net/sched/cls_cgroup.c b/net/sched/cls_cgroup.c
> index 7743ea8..0635894 100644
> --- a/net/sched/cls_cgroup.c
> +++ b/net/sched/cls_cgroup.c
> @@ -44,12 +44,21 @@ static struct cgroup_subsys_state *cgrp_create(struct cgroup *cgrp)
>
> if (cgrp->parent)
> cs->classid = cgrp_cls_state(cgrp->parent)->classid;
> +#if IS_MODULE(CONFIG_NET_CLS_CGROUP)
> + else if (!clscg_enabled)
> + static_key_slow_inc(&cgroup_cls_enabled);
This is racy I think. The read of the static key is atomic with other reads,
but the entire conditional is not atomic. If two cpus were creating cgroups in
parallel, it would be possible for both to read the static key as being zero
(the second cpu would read the key before the first cpu could increment it).
> +#endif
>
> return &cs->css;
> }
>
> static void cgrp_destroy(struct cgroup *cgrp)
> {
> +#if IS_MODULE(CONFIG_NET_CLS_CGROUP)
> + if (!cgrp->parent && clscg_enabled)
> + static_key_slow_dec(&cgroup_cls_enabled);
Ditto here with the race above. I think what you want is one of:
1) Use static_key_slow_[inc|dec] unconditionally
2) Keep a separate internal counter to track the number of cgroup instances
so that you only inc the static key on the first create and dec it on the last
delete.
I would think (1) would be sufficent. It looks like static_key_slow_inc uses
atomic_inc_not_zero to just do an inc anyway in the event that multiple inc
events are made.
Neil
> +#endif
> +
> kfree(cgrp_cls_state(cgrp));
> }
>
> --
> 1.7.12.rc1.16.g05a20c8
>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v1 3/5] cgroup: Protect access to task_cls_classid() when built as module
2012-08-17 18:28 ` Neil Horman
@ 2012-08-20 0:59 ` Li Zefan
[not found] ` <50318BF9.2080803-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
[not found] ` <20120817182855.GA11607-B26myB8xz7F8NnZeBjwnZQMhkBWG/bsMQH7oEaQurus@public.gmane.org>
1 sibling, 1 reply; 15+ messages in thread
From: Li Zefan @ 2012-08-20 0:59 UTC (permalink / raw)
To: Neil Horman
Cc: Daniel Wagner, netdev, cgroups, Daniel Wagner, David S. Miller,
Gao feng, Jamal Hadi Salim, John Fastabend, Tejun Heo
于 2012/8/18 2:28, Neil Horman 写道:
> On Fri, Aug 17, 2012 at 04:58:12PM +0200, Daniel Wagner wrote:
>> From: Daniel Wagner <daniel.wagner@bmw-carit.de>
>>
>> The module version of task_cls_classid() checks if net_cls_sbusys_id
>> is valid to indentify when it is okay to access the controller.
>>
>> Instead relying on the subusys_id to be set, make it explicit
>> with a jump label.
>>
>> Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
>> Cc: "David S. Miller" <davem@davemloft.net>
>> Cc: Gao feng <gaofeng@cn.fujitsu.com>
>> Cc: Jamal Hadi Salim <jhs@mojatatu.com>
>> Cc: John Fastabend <john.r.fastabend@intel.com>
>> Cc: Li Zefan <lizefan@huawei.com>
>> Cc: Neil Horman <nhorman@tuxdriver.com>
>> Cc: Tejun Heo <tj@kernel.org>
>> Cc: netdev@vger.kernel.org
>> Cc: cgroups@vger.kernel.org
>> ---
>> include/net/cls_cgroup.h | 5 ++++-
>> net/core/sock.c | 5 +++++
>> net/sched/cls_cgroup.c | 9 +++++++++
>> 3 files changed, 18 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/net/cls_cgroup.h b/include/net/cls_cgroup.h
>> index 401672c..bbbd957 100644
>> --- a/include/net/cls_cgroup.h
>> +++ b/include/net/cls_cgroup.h
>> @@ -16,6 +16,7 @@
>> #include <linux/cgroup.h>
>> #include <linux/hardirq.h>
>> #include <linux/rcupdate.h>
>> +#include <linux/jump_label.h>
>>
>> #ifdef CONFIG_CGROUPS
>> struct cgroup_cls_state
>> @@ -44,6 +45,8 @@ static inline u32 task_cls_classid(struct task_struct *p)
>> }
>>
>> #elif IS_MODULE(CONFIG_NET_CLS_CGROUP)
>> +extern struct static_key cgroup_cls_enabled;
>> +#define clscg_enabled static_key_false(&cgroup_cls_enabled)
>>
>> extern int net_cls_subsys_id;
>>
>> @@ -52,7 +55,7 @@ static inline u32 task_cls_classid(struct task_struct *p)
>> int id;
>> u32 classid = 0;
>>
>> - if (in_interrupt())
>> + if (!clscg_enabled || in_interrupt())
>> return 0;
>>
>> rcu_read_lock();
>> diff --git a/net/core/sock.c b/net/core/sock.c
>> index 8f67ced..8106e77 100644
>> --- a/net/core/sock.c
>> +++ b/net/core/sock.c
>> @@ -327,6 +327,11 @@ int __sk_backlog_rcv(struct sock *sk, struct sk_buff *skb)
>> EXPORT_SYMBOL(__sk_backlog_rcv);
>>
>> #if defined(CONFIG_CGROUPS)
>> +#if IS_MODULE(CONFIG_NET_CLS_CGROUP)
>> +struct static_key cgroup_cls_enabled = STATIC_KEY_INIT_FALSE;
>> +EXPORT_SYMBOL_GPL(cgroup_cls_enabled);
>> +#endif
>> +
>> #if !defined(CONFIG_NET_CLS_CGROUP)
>> int net_cls_subsys_id = -1;
>> EXPORT_SYMBOL_GPL(net_cls_subsys_id);
>> diff --git a/net/sched/cls_cgroup.c b/net/sched/cls_cgroup.c
>> index 7743ea8..0635894 100644
>> --- a/net/sched/cls_cgroup.c
>> +++ b/net/sched/cls_cgroup.c
>> @@ -44,12 +44,21 @@ static struct cgroup_subsys_state *cgrp_create(struct cgroup *cgrp)
>>
>> if (cgrp->parent)
>> cs->classid = cgrp_cls_state(cgrp->parent)->classid;
>> +#if IS_MODULE(CONFIG_NET_CLS_CGROUP)
>> + else if (!clscg_enabled)
>> + static_key_slow_inc(&cgroup_cls_enabled);
> This is racy I think. The read of the static key is atomic with other reads,
> but the entire conditional is not atomic. If two cpus were creating cgroups in
> parallel, it would be possible for both to read the static key as being zero
> (the second cpu would read the key before the first cpu could increment it).
static_key_slow_inc() is called only when we're creating the root cgroup, and that's
module loading.
so it's safe.
>> +#endif
>>
>> return &cs->css;
>> }
>>
>> static void cgrp_destroy(struct cgroup *cgrp)
>> {
>> +#if IS_MODULE(CONFIG_NET_CLS_CGROUP)
>> + if (!cgrp->parent && clscg_enabled)
>> + static_key_slow_dec(&cgroup_cls_enabled);
> Ditto here with the race above. I think what you want is one of:
and this is called only when we're destroying the root cgroup, and that's
module unload.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v1 3/5] cgroup: Protect access to task_cls_classid() when built as module
[not found] ` <50318BF9.2080803-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
@ 2012-08-20 11:08 ` Neil Horman
0 siblings, 0 replies; 15+ messages in thread
From: Neil Horman @ 2012-08-20 11:08 UTC (permalink / raw)
To: Li Zefan
Cc: Daniel Wagner, netdev-u79uwXL29TY76Z2rM5mHXA,
cgroups-u79uwXL29TY76Z2rM5mHXA, Daniel Wagner, David S. Miller,
Gao feng, Jamal Hadi Salim, John Fastabend, Tejun Heo
On Mon, Aug 20, 2012 at 08:59:37AM +0800, Li Zefan wrote:
> 于 2012/8/18 2:28, Neil Horman 写道:
> > On Fri, Aug 17, 2012 at 04:58:12PM +0200, Daniel Wagner wrote:
> >> From: Daniel Wagner <daniel.wagner-98C5kh4wR6ohFhg+JK9F0w@public.gmane.org>
> >>
> >> The module version of task_cls_classid() checks if net_cls_sbusys_id
> >> is valid to indentify when it is okay to access the controller.
> >>
> >> Instead relying on the subusys_id to be set, make it explicit
> >> with a jump label.
> >>
> >> Signed-off-by: Daniel Wagner <daniel.wagner-98C5kh4wR6ohFhg+JK9F0w@public.gmane.org>
> >> Cc: "David S. Miller" <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
> >> Cc: Gao feng <gaofeng-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
> >> Cc: Jamal Hadi Salim <jhs-jkUAjuhPggJWk0Htik3J/w@public.gmane.org>
> >> Cc: John Fastabend <john.r.fastabend-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> >> Cc: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> >> Cc: Neil Horman <nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>
> >> Cc: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> >> Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> >> Cc: cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> >> ---
> >> include/net/cls_cgroup.h | 5 ++++-
> >> net/core/sock.c | 5 +++++
> >> net/sched/cls_cgroup.c | 9 +++++++++
> >> 3 files changed, 18 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/include/net/cls_cgroup.h b/include/net/cls_cgroup.h
> >> index 401672c..bbbd957 100644
> >> --- a/include/net/cls_cgroup.h
> >> +++ b/include/net/cls_cgroup.h
> >> @@ -16,6 +16,7 @@
> >> #include <linux/cgroup.h>
> >> #include <linux/hardirq.h>
> >> #include <linux/rcupdate.h>
> >> +#include <linux/jump_label.h>
> >>
> >> #ifdef CONFIG_CGROUPS
> >> struct cgroup_cls_state
> >> @@ -44,6 +45,8 @@ static inline u32 task_cls_classid(struct task_struct *p)
> >> }
> >>
> >> #elif IS_MODULE(CONFIG_NET_CLS_CGROUP)
> >> +extern struct static_key cgroup_cls_enabled;
> >> +#define clscg_enabled static_key_false(&cgroup_cls_enabled)
> >>
> >> extern int net_cls_subsys_id;
> >>
> >> @@ -52,7 +55,7 @@ static inline u32 task_cls_classid(struct task_struct *p)
> >> int id;
> >> u32 classid = 0;
> >>
> >> - if (in_interrupt())
> >> + if (!clscg_enabled || in_interrupt())
> >> return 0;
> >>
> >> rcu_read_lock();
> >> diff --git a/net/core/sock.c b/net/core/sock.c
> >> index 8f67ced..8106e77 100644
> >> --- a/net/core/sock.c
> >> +++ b/net/core/sock.c
> >> @@ -327,6 +327,11 @@ int __sk_backlog_rcv(struct sock *sk, struct sk_buff *skb)
> >> EXPORT_SYMBOL(__sk_backlog_rcv);
> >>
> >> #if defined(CONFIG_CGROUPS)
> >> +#if IS_MODULE(CONFIG_NET_CLS_CGROUP)
> >> +struct static_key cgroup_cls_enabled = STATIC_KEY_INIT_FALSE;
> >> +EXPORT_SYMBOL_GPL(cgroup_cls_enabled);
> >> +#endif
> >> +
> >> #if !defined(CONFIG_NET_CLS_CGROUP)
> >> int net_cls_subsys_id = -1;
> >> EXPORT_SYMBOL_GPL(net_cls_subsys_id);
> >> diff --git a/net/sched/cls_cgroup.c b/net/sched/cls_cgroup.c
> >> index 7743ea8..0635894 100644
> >> --- a/net/sched/cls_cgroup.c
> >> +++ b/net/sched/cls_cgroup.c
> >> @@ -44,12 +44,21 @@ static struct cgroup_subsys_state *cgrp_create(struct cgroup *cgrp)
> >>
> >> if (cgrp->parent)
> >> cs->classid = cgrp_cls_state(cgrp->parent)->classid;
> >> +#if IS_MODULE(CONFIG_NET_CLS_CGROUP)
> >> + else if (!clscg_enabled)
> >> + static_key_slow_inc(&cgroup_cls_enabled);
> > This is racy I think. The read of the static key is atomic with other reads,
> > but the entire conditional is not atomic. If two cpus were creating cgroups in
> > parallel, it would be possible for both to read the static key as being zero
> > (the second cpu would read the key before the first cpu could increment it).
>
> static_key_slow_inc() is called only when we're creating the root cgroup, and that's
> module loading.
>
> so it's safe.
>
What makes you say that? It appears to me that we call ss->create (and
potential ss->destroy, depending on failure conditions) from cgroup_create,
which in turn is called from cgroup_mkdir, which is called for every cgroup
instance created, not just the root cgroup.
Neil
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v1 3/5] cgroup: Protect access to task_cls_classid() when built as module
[not found] ` <20120817182855.GA11607-B26myB8xz7F8NnZeBjwnZQMhkBWG/bsMQH7oEaQurus@public.gmane.org>
@ 2012-08-20 11:29 ` Daniel Wagner
[not found] ` <20120820112938.GA22415-rjQKm2AMs/APuY3F5OKgMy7zKzJi9e1+kcYEyfhdaNw@public.gmane.org>
0 siblings, 1 reply; 15+ messages in thread
From: Daniel Wagner @ 2012-08-20 11:29 UTC (permalink / raw)
To: Neil Horman
Cc: netdev-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA,
Daniel Wagner, David S. Miller, Gao feng, Jamal Hadi Salim,
John Fastabend, Li Zefan, Tejun Heo
On Fri, Aug 17, 2012 at 02:28:55PM -0400, Neil Horman wrote:
> On Fri, Aug 17, 2012 at 04:58:12PM +0200, Daniel Wagner wrote:
> > From: Daniel Wagner <daniel.wagner-98C5kh4wR6ohFhg+JK9F0w@public.gmane.org>
> >
> > The module version of task_cls_classid() checks if net_cls_sbusys_id
> > is valid to indentify when it is okay to access the controller.
> >
> > Instead relying on the subusys_id to be set, make it explicit
> > with a jump label.
> >
> > Signed-off-by: Daniel Wagner <daniel.wagner-98C5kh4wR6ohFhg+JK9F0w@public.gmane.org>
> > Cc: "David S. Miller" <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
> > Cc: Gao feng <gaofeng-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
> > Cc: Jamal Hadi Salim <jhs-jkUAjuhPggJWk0Htik3J/w@public.gmane.org>
> > Cc: John Fastabend <john.r.fastabend-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > Cc: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> > Cc: Neil Horman <nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>
> > Cc: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> > Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > Cc: cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > ---
> > include/net/cls_cgroup.h | 5 ++++-
> > net/core/sock.c | 5 +++++
> > net/sched/cls_cgroup.c | 9 +++++++++
> > 3 files changed, 18 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/net/cls_cgroup.h b/include/net/cls_cgroup.h
> > index 401672c..bbbd957 100644
> > --- a/include/net/cls_cgroup.h
> > +++ b/include/net/cls_cgroup.h
> > @@ -16,6 +16,7 @@
> > #include <linux/cgroup.h>
> > #include <linux/hardirq.h>
> > #include <linux/rcupdate.h>
> > +#include <linux/jump_label.h>
> >
> > #ifdef CONFIG_CGROUPS
> > struct cgroup_cls_state
> > @@ -44,6 +45,8 @@ static inline u32 task_cls_classid(struct task_struct *p)
> > }
> >
> > #elif IS_MODULE(CONFIG_NET_CLS_CGROUP)
> > +extern struct static_key cgroup_cls_enabled;
> > +#define clscg_enabled static_key_false(&cgroup_cls_enabled)
> >
> > extern int net_cls_subsys_id;
> >
> > @@ -52,7 +55,7 @@ static inline u32 task_cls_classid(struct task_struct *p)
> > int id;
> > u32 classid = 0;
> >
> > - if (in_interrupt())
> > + if (!clscg_enabled || in_interrupt())
> > return 0;
> >
> > rcu_read_lock();
> > diff --git a/net/core/sock.c b/net/core/sock.c
> > index 8f67ced..8106e77 100644
> > --- a/net/core/sock.c
> > +++ b/net/core/sock.c
> > @@ -327,6 +327,11 @@ int __sk_backlog_rcv(struct sock *sk, struct sk_buff *skb)
> > EXPORT_SYMBOL(__sk_backlog_rcv);
> >
> > #if defined(CONFIG_CGROUPS)
> > +#if IS_MODULE(CONFIG_NET_CLS_CGROUP)
> > +struct static_key cgroup_cls_enabled = STATIC_KEY_INIT_FALSE;
> > +EXPORT_SYMBOL_GPL(cgroup_cls_enabled);
> > +#endif
> > +
> > #if !defined(CONFIG_NET_CLS_CGROUP)
> > int net_cls_subsys_id = -1;
> > EXPORT_SYMBOL_GPL(net_cls_subsys_id);
> > diff --git a/net/sched/cls_cgroup.c b/net/sched/cls_cgroup.c
> > index 7743ea8..0635894 100644
> > --- a/net/sched/cls_cgroup.c
> > +++ b/net/sched/cls_cgroup.c
> > @@ -44,12 +44,21 @@ static struct cgroup_subsys_state *cgrp_create(struct cgroup *cgrp)
> >
> > if (cgrp->parent)
> > cs->classid = cgrp_cls_state(cgrp->parent)->classid;
> > +#if IS_MODULE(CONFIG_NET_CLS_CGROUP)
> > + else if (!clscg_enabled)
> > + static_key_slow_inc(&cgroup_cls_enabled);
> This is racy I think. The read of the static key is atomic with other reads,
> but the entire conditional is not atomic. If two cpus were creating cgroups in
> parallel, it would be possible for both to read the static key as being zero
> (the second cpu would read the key before the first cpu could increment it).
D'oh, That is racy.
> > +#endif
> >
> > return &cs->css;
> > }
> >
> > static void cgrp_destroy(struct cgroup *cgrp)
> > {
> > +#if IS_MODULE(CONFIG_NET_CLS_CGROUP)
> > + if (!cgrp->parent && clscg_enabled)
> > + static_key_slow_dec(&cgroup_cls_enabled);
> Ditto here with the race above. I think what you want is one of:
>
> 1) Use static_key_slow_[inc|dec] unconditionally
While the static_key_slow_inc() case will work, I am not so sure about
the static_key_slow_dec(), e.g. we could still access inside
task_cls_classid() a destroyed container.
> 2) Keep a separate internal counter to track the number of cgroup instances
> so that you only inc the static key on the first create and dec it on the last
> delete.
If I got you right, than this would not be different then direclty using
static_key_slow_[inc|dec].
> I would think (1) would be sufficent. It looks like static_key_slow_inc uses
> atomic_inc_not_zero to just do an inc anyway in the event that multiple inc
> events are made.
Would something like this work?
static inline u32 task_cls_classid(struct task_struct *p)
{
u32 classid;
struct cgroup_cls_state *css;
if (!clscg_enabled || in_interrupt())
return 0;
rcu_read_lock();
css = container_of(task_subsys_state(p, net_cls_subsys_id),
struct cgroup_cls_state, css);
if (!css)
classid = css->classid;
else
classid = 0;
rcu_read_unlock();
return classid;
}
Daniel
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v1 3/5] cgroup: Protect access to task_cls_classid() when built as module
[not found] ` <20120820112938.GA22415-rjQKm2AMs/APuY3F5OKgMy7zKzJi9e1+kcYEyfhdaNw@public.gmane.org>
@ 2012-08-20 11:33 ` Glauber Costa
2012-08-20 17:03 ` Neil Horman
1 sibling, 0 replies; 15+ messages in thread
From: Glauber Costa @ 2012-08-20 11:33 UTC (permalink / raw)
To: Daniel Wagner
Cc: Neil Horman, netdev-u79uwXL29TY76Z2rM5mHXA,
cgroups-u79uwXL29TY76Z2rM5mHXA, Daniel Wagner, David S. Miller,
Gao feng, Jamal Hadi Salim, John Fastabend, Li Zefan, Tejun Heo
On 08/20/2012 03:29 PM, Daniel Wagner wrote:
>> This is racy I think. The read of the static key is atomic with other reads,
>> > but the entire conditional is not atomic. If two cpus were creating cgroups in
>> > parallel, it would be possible for both to read the static key as being zero
>> > (the second cpu would read the key before the first cpu could increment it).
> D'oh, That is racy.
>
Take a look at how we solve this particular problem in memcg. By using a
pair of bits meaning "ever active" and "currently active", we can avoid
problems with the static key increments.
Nothing bad can't happen by increasing the static key more than once;
the problem is that you need to somehow take note of that to make sure
you decrement the as many times you incremented when you release it.
Two other important things to keep in mind while dealing with static
branches:
* You can't increase/decrease while holding the cgroup_lock. lockdep may
trigger, because the cgroup_lock holds the cpu_hotplug lock, that the
static branches update path will also take. (due to cpusets). Doing a
logical hotplug will stress this path, and make the problem visible.
Take a look at disarm_sock_keys() (mm/memcontrol.c) to see how we solve
this particular problem.
* If you have code in more than one call site, the update among them is
not atomic. Not sure if this one applies to your case, but it can lead
you to some very unpleasant to debug problems. We use one of those two
bits I mentioned ("currently active") to make sure objects are not
marked before all sites are already patched.
Hope our previous experience with this can help you.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v1 3/5] cgroup: Protect access to task_cls_classid() when built as module
[not found] ` <20120820112938.GA22415-rjQKm2AMs/APuY3F5OKgMy7zKzJi9e1+kcYEyfhdaNw@public.gmane.org>
2012-08-20 11:33 ` Glauber Costa
@ 2012-08-20 17:03 ` Neil Horman
[not found] ` <20120820170342.GC1734-B26myB8xz7F8NnZeBjwnZQMhkBWG/bsMQH7oEaQurus@public.gmane.org>
1 sibling, 1 reply; 15+ messages in thread
From: Neil Horman @ 2012-08-20 17:03 UTC (permalink / raw)
To: Daniel Wagner
Cc: netdev-u79uwXL29TY76Z2rM5mHXA, cgroups-u79uwXL29TY76Z2rM5mHXA,
Daniel Wagner, David S. Miller, Gao feng, Jamal Hadi Salim,
John Fastabend, Li Zefan, Tejun Heo
On Mon, Aug 20, 2012 at 01:29:38PM +0200, Daniel Wagner wrote:
> On Fri, Aug 17, 2012 at 02:28:55PM -0400, Neil Horman wrote:
> > On Fri, Aug 17, 2012 at 04:58:12PM +0200, Daniel Wagner wrote:
> > > From: Daniel Wagner <daniel.wagner-98C5kh4wR6ohFhg+JK9F0w@public.gmane.org>
> > >
> > > The module version of task_cls_classid() checks if net_cls_sbusys_id
> > > is valid to indentify when it is okay to access the controller.
> > >
> > > Instead relying on the subusys_id to be set, make it explicit
> > > with a jump label.
> > >
> > > Signed-off-by: Daniel Wagner <daniel.wagner-98C5kh4wR6ohFhg+JK9F0w@public.gmane.org>
> > > Cc: "David S. Miller" <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
> > > Cc: Gao feng <gaofeng-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
> > > Cc: Jamal Hadi Salim <jhs-jkUAjuhPggJWk0Htik3J/w@public.gmane.org>
> > > Cc: John Fastabend <john.r.fastabend-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> > > Cc: Li Zefan <lizefan-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> > > Cc: Neil Horman <nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>
> > > Cc: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> > > Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > > Cc: cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > > ---
> > > include/net/cls_cgroup.h | 5 ++++-
> > > net/core/sock.c | 5 +++++
> > > net/sched/cls_cgroup.c | 9 +++++++++
> > > 3 files changed, 18 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/include/net/cls_cgroup.h b/include/net/cls_cgroup.h
> > > index 401672c..bbbd957 100644
> > > --- a/include/net/cls_cgroup.h
> > > +++ b/include/net/cls_cgroup.h
> > > @@ -16,6 +16,7 @@
> > > #include <linux/cgroup.h>
> > > #include <linux/hardirq.h>
> > > #include <linux/rcupdate.h>
> > > +#include <linux/jump_label.h>
> > >
> > > #ifdef CONFIG_CGROUPS
> > > struct cgroup_cls_state
> > > @@ -44,6 +45,8 @@ static inline u32 task_cls_classid(struct task_struct *p)
> > > }
> > >
> > > #elif IS_MODULE(CONFIG_NET_CLS_CGROUP)
> > > +extern struct static_key cgroup_cls_enabled;
> > > +#define clscg_enabled static_key_false(&cgroup_cls_enabled)
> > >
> > > extern int net_cls_subsys_id;
> > >
> > > @@ -52,7 +55,7 @@ static inline u32 task_cls_classid(struct task_struct *p)
> > > int id;
> > > u32 classid = 0;
> > >
> > > - if (in_interrupt())
> > > + if (!clscg_enabled || in_interrupt())
> > > return 0;
> > >
> > > rcu_read_lock();
> > > diff --git a/net/core/sock.c b/net/core/sock.c
> > > index 8f67ced..8106e77 100644
> > > --- a/net/core/sock.c
> > > +++ b/net/core/sock.c
> > > @@ -327,6 +327,11 @@ int __sk_backlog_rcv(struct sock *sk, struct sk_buff *skb)
> > > EXPORT_SYMBOL(__sk_backlog_rcv);
> > >
> > > #if defined(CONFIG_CGROUPS)
> > > +#if IS_MODULE(CONFIG_NET_CLS_CGROUP)
> > > +struct static_key cgroup_cls_enabled = STATIC_KEY_INIT_FALSE;
> > > +EXPORT_SYMBOL_GPL(cgroup_cls_enabled);
> > > +#endif
> > > +
> > > #if !defined(CONFIG_NET_CLS_CGROUP)
> > > int net_cls_subsys_id = -1;
> > > EXPORT_SYMBOL_GPL(net_cls_subsys_id);
> > > diff --git a/net/sched/cls_cgroup.c b/net/sched/cls_cgroup.c
> > > index 7743ea8..0635894 100644
> > > --- a/net/sched/cls_cgroup.c
> > > +++ b/net/sched/cls_cgroup.c
> > > @@ -44,12 +44,21 @@ static struct cgroup_subsys_state *cgrp_create(struct cgroup *cgrp)
> > >
> > > if (cgrp->parent)
> > > cs->classid = cgrp_cls_state(cgrp->parent)->classid;
> > > +#if IS_MODULE(CONFIG_NET_CLS_CGROUP)
> > > + else if (!clscg_enabled)
> > > + static_key_slow_inc(&cgroup_cls_enabled);
> > This is racy I think. The read of the static key is atomic with other reads,
> > but the entire conditional is not atomic. If two cpus were creating cgroups in
> > parallel, it would be possible for both to read the static key as being zero
> > (the second cpu would read the key before the first cpu could increment it).
>
> D'oh, That is racy.
>
> > > +#endif
> > >
> > > return &cs->css;
> > > }
> > >
> > > static void cgrp_destroy(struct cgroup *cgrp)
> > > {
> > > +#if IS_MODULE(CONFIG_NET_CLS_CGROUP)
> > > + if (!cgrp->parent && clscg_enabled)
> > > + static_key_slow_dec(&cgroup_cls_enabled);
> > Ditto here with the race above. I think what you want is one of:
> >
> > 1) Use static_key_slow_[inc|dec] unconditionally
>
> While the static_key_slow_inc() case will work, I am not so sure about
> the static_key_slow_dec(), e.g. we could still access inside
> task_cls_classid() a destroyed container.
>
Possibly, yes, I think.
> > 2) Keep a separate internal counter to track the number of cgroup instances
> > so that you only inc the static key on the first create and dec it on the last
> > delete.
>
> If I got you right, than this would not be different then direclty using
> static_key_slow_[inc|dec].
>
As long as a cgroup subsystems ->destroy method is only called when the
subsystem is being removed, then I think thats correct. I'm not 100% sure thats
the case though.
> > I would think (1) would be sufficent. It looks like static_key_slow_inc uses
> > atomic_inc_not_zero to just do an inc anyway in the event that multiple inc
> > events are made.
>
> Would something like this work?
>
I think so yes, assuming that you also make the slow_inc|dec changes
> static inline u32 task_cls_classid(struct task_struct *p)
> {
> u32 classid;
> struct cgroup_cls_state *css;
>
> if (!clscg_enabled || in_interrupt())
> return 0;
>
> rcu_read_lock();
> css = container_of(task_subsys_state(p, net_cls_subsys_id),
> struct cgroup_cls_state, css);
> if (!css)
> classid = css->classid;
> else
> classid = 0;
> rcu_read_unlock();
>
> return classid;
> }
>
> Daniel
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v1 3/5] cgroup: Protect access to task_cls_classid() when built as module
[not found] ` <20120820170342.GC1734-B26myB8xz7F8NnZeBjwnZQMhkBWG/bsMQH7oEaQurus@public.gmane.org>
@ 2012-08-21 9:12 ` Glauber Costa
[not found] ` <50335101.5030109-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
0 siblings, 1 reply; 15+ messages in thread
From: Glauber Costa @ 2012-08-21 9:12 UTC (permalink / raw)
To: Neil Horman
Cc: Daniel Wagner, netdev-u79uwXL29TY76Z2rM5mHXA,
cgroups-u79uwXL29TY76Z2rM5mHXA, Daniel Wagner, David S. Miller,
Gao feng, Jamal Hadi Salim, John Fastabend, Li Zefan, Tejun Heo
On 08/20/2012 09:03 PM, Neil Horman wrote:
>>> 2) Keep a separate internal counter to track the number of cgroup instances
>>> > > so that you only inc the static key on the first create and dec it on the last
>>> > > delete.
>> >
>> > If I got you right, than this would not be different then direclty using
>> > static_key_slow_[inc|dec].
>> >
> As long as a cgroup subsystems ->destroy method is only called when the
> subsystem is being removed, then I think thats correct. I'm not 100% sure thats
> the case though.
>
THAT is correct, but not the call itself. ->destroy() is called with the
cgroup_lock held, and there is a lockdep dependency created by cpuset
that prevents the cpu_hotplug lock, taken by static branch updates, to
be taken inside cgroup_lock.
So unless cpuset is fixed - which is a major work, we can't do
static_branch updates while holding the cgroup_lock.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v1 3/5] cgroup: Protect access to task_cls_classid() when built as module
[not found] ` <5035F3FC.202-kQCPcA+X3s7YtjvyW6yDsg@public.gmane.org>
@ 2012-08-23 9:12 ` Glauber Costa
0 siblings, 0 replies; 15+ messages in thread
From: Glauber Costa @ 2012-08-23 9:12 UTC (permalink / raw)
To: Daniel Wagner
Cc: Neil Horman, netdev-u79uwXL29TY76Z2rM5mHXA,
cgroups-u79uwXL29TY76Z2rM5mHXA, Daniel Wagner, David S. Miller,
Gao feng, Jamal Hadi Salim, John Fastabend, Li Zefan, Tejun Heo
>
> So there is no need to move the enabling/disabling of the static branch
> at this point to create/destroy.
>
That will surely save you some trouble.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v1 3/5] cgroup: Protect access to task_cls_classid() when built as module
[not found] ` <50335101.5030109-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
@ 2012-08-23 9:12 ` Daniel Wagner
[not found] ` <5035F3FC.202-kQCPcA+X3s7YtjvyW6yDsg@public.gmane.org>
0 siblings, 1 reply; 15+ messages in thread
From: Daniel Wagner @ 2012-08-23 9:12 UTC (permalink / raw)
To: Glauber Costa
Cc: Neil Horman, netdev-u79uwXL29TY76Z2rM5mHXA,
cgroups-u79uwXL29TY76Z2rM5mHXA, Daniel Wagner, David S. Miller,
Gao feng, Jamal Hadi Salim, John Fastabend, Li Zefan, Tejun Heo
On 21.08.2012 11:12, Glauber Costa wrote:
> On 08/20/2012 09:03 PM, Neil Horman wrote:
>>>> 2) Keep a separate internal counter to track the number of cgroup instances
>>>>>> so that you only inc the static key on the first create and dec it on the last
>>>>>> delete.
>>>>
>>>> If I got you right, than this would not be different then direclty using
>>>> static_key_slow_[inc|dec].
>>>>
>> As long as a cgroup subsystems ->destroy method is only called when the
>> subsystem is being removed, then I think thats correct. I'm not 100% sure thats
>> the case though.
>>
> THAT is correct, but not the call itself. ->destroy() is called with the
> cgroup_lock held, and there is a lockdep dependency created by cpuset
> that prevents the cpu_hotplug lock, taken by static branch updates, to
> be taken inside cgroup_lock.
>
> So unless cpuset is fixed - which is a major work, we can't do
> static_branch updates while holding the cgroup_lock.
Thanks a lot for the info on this problem. I have spend the last days
trying to figure out how you have solved this in memcg. It looks like
complex and fragile. So I rather avoid enabling/disabling the static
branch in cgrp_create()/cgroup_destroy() and do it on module init/exit
as it is done currently with the id assigment.
Initially I wanted to optimize the task_cls_classid() path, so that only
when a cgroup really exists we spent time in there. Though as soon the
controller is registered to the cgroup core, the code path is enabled.
That means when cls is builtin it is used all the time (even though no
one is using it). In the module case, it is the same. task_cls_classid()
path is used right after module init.
So there is no need to move the enabling/disabling of the static branch
at this point to create/destroy.
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2012-08-23 9:12 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-08-17 14:58 [PATCH v1 0/5] cgroup: Assign subsystem IDs during compile time Daniel Wagner
[not found] ` <1345215494-9181-1-git-send-email-wagi-kQCPcA+X3s7YtjvyW6yDsg@public.gmane.org>
2012-08-17 14:58 ` [PATCH v1 1/5] cgroup: Use empty task_cls_classid() when !CONFIG_NET_CLS(_MODULE) Daniel Wagner
2012-08-17 14:58 ` [PATCH v1 2/5] cgroup: Move sock_update_classid() decleration to cls_cgroup.h Daniel Wagner
2012-08-17 14:58 ` [PATCH v1 3/5] cgroup: Protect access to task_cls_classid() when built as module Daniel Wagner
[not found] ` <1345215494-9181-4-git-send-email-wagi-kQCPcA+X3s7YtjvyW6yDsg@public.gmane.org>
2012-08-17 18:28 ` Neil Horman
2012-08-20 0:59 ` Li Zefan
[not found] ` <50318BF9.2080803-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2012-08-20 11:08 ` Neil Horman
[not found] ` <20120817182855.GA11607-B26myB8xz7F8NnZeBjwnZQMhkBWG/bsMQH7oEaQurus@public.gmane.org>
2012-08-20 11:29 ` Daniel Wagner
[not found] ` <20120820112938.GA22415-rjQKm2AMs/APuY3F5OKgMy7zKzJi9e1+kcYEyfhdaNw@public.gmane.org>
2012-08-20 11:33 ` Glauber Costa
2012-08-20 17:03 ` Neil Horman
[not found] ` <20120820170342.GC1734-B26myB8xz7F8NnZeBjwnZQMhkBWG/bsMQH7oEaQurus@public.gmane.org>
2012-08-21 9:12 ` Glauber Costa
[not found] ` <50335101.5030109-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-08-23 9:12 ` Daniel Wagner
[not found] ` <5035F3FC.202-kQCPcA+X3s7YtjvyW6yDsg@public.gmane.org>
2012-08-23 9:12 ` Glauber Costa
2012-08-17 14:58 ` [PATCH v1 4/5] cgroup: Protect access to task_netprioidx() " Daniel Wagner
2012-08-17 14:58 ` [PATCH v1 5/5] cgroup: Assign subsystem IDs during compile time Daniel Wagner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).