* [PATCH] cgroups: defer free css_set
@ 2008-11-21 8:49 Lai Jiangshan
From: Lai Jiangshan @ 2008-11-21 8:49 UTC (permalink / raw)
To: Andrew Morton, Paul Menage, Linux Kernel Mailing List,
Linux Containers
We free a css_set immediately when its refcount drops to 0 (except in
cgroup_attach_task()). This destroys data that the read side may still be
accessing.

This patch uses call_rcu() to defer freeing the css_set.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 1164963..22901ff 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -178,6 +178,8 @@ struct css_set {
 	 */
 	struct list_head cg_links;
 
+	struct rcu_head rcu;
+
 	/*
 	 * Set of subsystem states, one for each subsystem. This array
 	 * is immutable after creation apart from the init_css_set
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 358e775..ddc10ac 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -252,6 +252,11 @@ static void unlink_css_set(struct css_set *cg)
 	}
 }
 
+static void rcu_free_css_set(struct rcu_head *head)
+{
+	kfree(container_of(head, struct css_set, rcu));
+}
+
 static void __put_css_set(struct css_set *cg, int taskexit)
 {
 	int i;
@@ -281,7 +286,7 @@ static void __put_css_set(struct css_set *cg, int taskexit)
 		}
 	}
 	rcu_read_unlock();
-	kfree(cg);
+	call_rcu(&cg->rcu, rcu_free_css_set);
 }
 
 /*
@@ -1267,7 +1277,6 @@ int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
 			ss->attach(ss, cgrp, oldcgrp, tsk);
 	}
 	set_bit(CGRP_RELEASABLE, &oldcgrp->flags);
-	synchronize_rcu();
 	put_css_set(cg);
 	return 0;
 }
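
For readers unfamiliar with the pattern in the patch above, here is a
minimal userspace sketch of a deferred free through an embedded callback
head. It is not kernel code: mock_rcu_head, mock_container_of(),
mock_call_rcu(), mock_grace_period_end() and struct mock_css_set are all
hypothetical stand-ins invented for this illustration, and the "grace
period" is simply an explicit function call rather than real RCU
machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the kernel's struct rcu_head. */
struct mock_rcu_head {
	struct mock_rcu_head *next;
	void (*func)(struct mock_rcu_head *head);
};

/* Same idea as the kernel's container_of(): member pointer -> object. */
#define mock_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified css_set: only the fields this sketch needs. */
struct mock_css_set {
	int refcount;
	struct mock_rcu_head rcu;
};

static struct mock_rcu_head *pending;	/* callbacks awaiting a grace period */
static int freed_count;			/* objects actually freed so far */

/* Queue the callback instead of running it now (models call_rcu()). */
static void mock_call_rcu(struct mock_rcu_head *head,
			  void (*func)(struct mock_rcu_head *head))
{
	head->func = func;
	head->next = pending;
	pending = head;
}

/* Run every queued callback (models the end of a grace period). */
static void mock_grace_period_end(void)
{
	while (pending) {
		struct mock_rcu_head *head = pending;

		pending = head->next;
		head->func(head);
	}
}

/* Callback: recover the enclosing object from its head and free it. */
static void mock_free_css_set(struct mock_rcu_head *head)
{
	free(mock_container_of(head, struct mock_css_set, rcu));
	freed_count++;
}

/* Drop one reference; defer the free once the count reaches zero. */
static void mock_put_css_set(struct mock_css_set *cg)
{
	assert(cg->refcount > 0);
	if (--cg->refcount == 0)
		mock_call_rcu(&cg->rcu, mock_free_css_set);
}
```

The point of the sketch is only the ordering: after the last
mock_put_css_set() the object is still valid memory, and it is released
only when mock_grace_period_end() runs the queued callback.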
* Re: [PATCH] cgroups: defer free css_set
@ 2008-11-21 18:28 Paul Menage
From: Paul Menage @ 2008-11-21 18:28 UTC (permalink / raw)
To: Lai Jiangshan; +Cc: Linux Containers, Andrew Morton, Linux Kernel Mailing List
On Fri, Nov 21, 2008 at 12:49 AM, Lai Jiangshan <laijs@cn.fujitsu.com> wrote:
>
> We free a css_set immediately when its refcount drops to 0 (except in
> cgroup_attach_task()). This destroys data that the read side may still be
> accessing.
>
> This patch uses call_rcu() to defer freeing the css_set.
>
> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
> ---
> diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
> index 1164963..22901ff 100644
> --- a/include/linux/cgroup.h
> +++ b/include/linux/cgroup.h
> @@ -178,6 +178,8 @@ struct css_set {
>  	 */
>  	struct list_head cg_links;
>
> +	struct rcu_head rcu;
> +
>  	/*
>  	 * Set of subsystem states, one for each subsystem. This array
>  	 * is immutable after creation apart from the init_css_set
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 358e775..ddc10ac 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -252,6 +252,11 @@ static void unlink_css_set(struct css_set *cg)
>  	}
>  }
>
> +static void rcu_free_css_set(struct rcu_head *head)
> +{
> +	kfree(container_of(head, struct css_set, rcu));
> +}
> +
>  static void __put_css_set(struct css_set *cg, int taskexit)
>  {
>  	int i;
> @@ -281,7 +286,7 @@ static void __put_css_set(struct css_set *cg, int taskexit)
>  		}
>  	}
>  	rcu_read_unlock();
> -	kfree(cg);
> +	call_rcu(&cg->rcu, rcu_free_css_set);
>  }
>
>  /*
> @@ -1267,7 +1277,6 @@ int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
>  			ss->attach(ss, cgrp, oldcgrp, tsk);
>  	}
>  	set_bit(CGRP_RELEASABLE, &oldcgrp->flags);
> -	synchronize_rcu();

I'm reluctant to remove this synchronize_rcu() call. It guarantees that
if you get a pointer to a task's cgroup under RCU protection, then even
if you race with the task moving away to a different cgroup, no other
cgroup_mutex-protected operation can start until you've finished your
RCU read-side section (since the thread that you raced with is blocking
in synchronize_rcu() while holding cgroup_mutex). I'm pretty sure that
some of the cgroups code relies on that property, although I can't find
exactly which bit I'm thinking of.

Also, using call_rcu() for freeing all css_sets seems unnecessary. The
only free that appears to be potentially broken is the one from
cgroup_exit(), since in the other cases the css_set hasn't been visible
via a task->cgroups pointer. So how about making __put_css_set() do a
call_rcu() for the case when taskexit is true, and a plain kfree()
otherwise? That would also reduce the chance of overloading the RCU
system with too many deferred frees.

Paul
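
The alternative suggested in this reply (defer only when taskexit is
true, free immediately otherwise) might look roughly like the userspace
sketch below. This is an illustration under stated assumptions, not the
patch that was eventually merged: every demo_* name is hypothetical, the
callback queue only imitates call_rcu(), and free() stands in for
kfree().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct rcu_head. */
struct demo_rcu_head {
	struct demo_rcu_head *next;
	void (*func)(struct demo_rcu_head *head);
};

/* Simplified css_set: only the fields this sketch needs. */
struct demo_css_set {
	int refcount;
	struct demo_rcu_head rcu;
};

static struct demo_rcu_head *demo_pending;	/* queued deferred frees */
static int demo_deferred;	/* frees that went through the queue */
static int demo_immediate;	/* frees done on the spot */

/* Deferred-free callback: step back from the head to the object. */
static void demo_free_cb(struct demo_rcu_head *head)
{
	free((char *)head - offsetof(struct demo_css_set, rcu));
	demo_deferred++;
}

/*
 * Drop one reference. On the last put, defer the free only on the
 * task-exit path, where readers may still hold the pointer; free
 * immediately on every other path.
 */
static void demo_put_css_set(struct demo_css_set *cg, int taskexit)
{
	assert(cg->refcount > 0);
	if (--cg->refcount > 0)
		return;

	if (taskexit) {
		/* models call_rcu(&cg->rcu, ...) */
		cg->rcu.func = demo_free_cb;
		cg->rcu.next = demo_pending;
		demo_pending = &cg->rcu;
	} else {
		/* models a plain kfree(cg) */
		free(cg);
		demo_immediate++;
	}
}

/* Run the queued callbacks (models the end of a grace period). */
static void demo_grace_period_end(void)
{
	while (demo_pending) {
		struct demo_rcu_head *head = demo_pending;

		demo_pending = head->next;
		head->func(head);
	}
}
```

The trade-off is the one named in the reply: only the exit path pays for
a deferred callback, so fewer objects sit in the queue waiting for a
grace period.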