public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] cgroup: Fix root_count when mount fails due to busy subsystem
@ 2009-01-23  0:48 Paul Menage
  2009-01-23  2:20 ` Serge E. Hallyn
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Paul Menage @ 2009-01-23  0:48 UTC (permalink / raw)
  To: akpm, serue; +Cc: linux-kernel, containers

cgroup: Fix root_count when mount fails due to busy subsystem

root_count was being incremented in cgroup_get_sb() after all error
checking was complete, but decremented in cgroup_kill_sb(), which can
be called on a superblock that we gave up on due to an error.  This
patch changes cgroup_kill_sb() to only decrement root_count if the
root was previously linked into the list of roots.

Signed-off-by: Paul Menage <menage@google.com>

---

I was actually surprised to find that list_del() doesn't crash when
run on an unattached list_head structure.

 kernel/cgroup.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index adcd0bb..9ce27e8 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1115,8 +1115,10 @@ static void cgroup_kill_sb(struct super_block *sb) {
 	}
 	write_unlock(&css_set_lock);
 
-	list_del(&root->root_list);
-	root_count--;
+	if (!list_empty(&root->root_list)) {
+		list_del(&root->root_list);
+		root_count--;
+	}
 
 	mutex_unlock(&cgroup_mutex);
 


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] cgroup: Fix root_count when mount fails due to busy subsystem
  2009-01-23  0:48 [PATCH] cgroup: Fix root_count when mount fails due to busy subsystem Paul Menage
@ 2009-01-23  2:20 ` Serge E. Hallyn
  2009-01-23 10:22 ` Ingo Molnar
  2009-01-23 18:10 ` Ingo Molnar
  2 siblings, 0 replies; 9+ messages in thread
From: Serge E. Hallyn @ 2009-01-23  2:20 UTC (permalink / raw)
  To: Paul Menage; +Cc: akpm, linux-kernel, containers

Quoting Paul Menage (menage@google.com):
> cgroup: Fix root_count when mount fails due to busy subsystem
> 
> root_count was being incremented in cgroup_get_sb() after all error
> checking was complete, but decremented in cgroup_kill_sb(), which can
> be called on a superblock that we gave up on due to an error.  This
> patch changes cgroup_kill_sb() to only decrement root_count if the
> root was previously linked into the list of roots.
> 
> Signed-off-by: Paul Menage <menage@google.com>

Tested-by: Serge Hallyn <serue@us.ibm.com>

thanks,
-serge

> 
> ---
> 
> I was actually surprised to find that list_del() doesn't crash when
> run on an unattached list_head structure.
> 
>  kernel/cgroup.c |    6 ++++--
>  1 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index adcd0bb..9ce27e8 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -1115,8 +1115,10 @@ static void cgroup_kill_sb(struct super_block *sb) {
>  	}
>  	write_unlock(&css_set_lock);
> 
> -	list_del(&root->root_list);
> -	root_count--;
> +	if (!list_empty(&root->root_list)) {
> +		list_del(&root->root_list);
> +		root_count--;
> +	}
> 
>  	mutex_unlock(&cgroup_mutex);
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] cgroup: Fix root_count when mount fails due to busy subsystem
  2009-01-23  0:48 [PATCH] cgroup: Fix root_count when mount fails due to busy subsystem Paul Menage
  2009-01-23  2:20 ` Serge E. Hallyn
@ 2009-01-23 10:22 ` Ingo Molnar
  2009-01-23 16:59   ` Paul Menage
  2009-01-23 18:10 ` Ingo Molnar
  2 siblings, 1 reply; 9+ messages in thread
From: Ingo Molnar @ 2009-01-23 10:22 UTC (permalink / raw)
  To: Paul Menage; +Cc: akpm, serue, linux-kernel, containers, Peter Zijlstra


* Paul Menage <menage@google.com> wrote:

> cgroup: Fix root_count when mount fails due to busy subsystem
> 
> root_count was being incremented in cgroup_get_sb() after all error 
> checking was complete, but decremented in cgroup_kill_sb(), which can be 
> called on a superblock that we gave up on due to an error.  This patch 
> changes cgroup_kill_sb() to only decrement root_count if the root was 
> previously linked into the list of roots.
> 
> Signed-off-by: Paul Menage <menage@google.com>
> 
> ---
> 
> I was actually surprised to find that list_del() doesn't crash when
> run on an unattached list_head structure.
> 
>  kernel/cgroup.c |    6 ++++--
>  1 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index adcd0bb..9ce27e8 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -1115,8 +1115,10 @@ static void cgroup_kill_sb(struct super_block *sb) {
>  	}
>  	write_unlock(&css_set_lock);
>  
> -	list_del(&root->root_list);
> -	root_count--;
> +	if (!list_empty(&root->root_list)) {
> +		list_del(&root->root_list);
> +		root_count--;
> +	}

That's ugly. It is _much_ cleaner to always keep the link head consistent 
- i.e. initialize it with INIT_LIST_HEAD() and then remove from it via 
list_del_init().

That way the error path will do the right thing automatically, and there's 
no need for that ugly "if !list_empty" construct either.

	Ingo

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] cgroup: Fix root_count when mount fails due to busy  subsystem
  2009-01-23 10:22 ` Ingo Molnar
@ 2009-01-23 16:59   ` Paul Menage
  2009-01-23 17:50     ` Ingo Molnar
  0 siblings, 1 reply; 9+ messages in thread
From: Paul Menage @ 2009-01-23 16:59 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: akpm, serue, linux-kernel, containers, Peter Zijlstra

On Fri, Jan 23, 2009 at 2:22 AM, Ingo Molnar <mingo@elte.hu> wrote:
>> --- a/kernel/cgroup.c
>> +++ b/kernel/cgroup.c
>> @@ -1115,8 +1115,10 @@ static void cgroup_kill_sb(struct super_block *sb) {
>>       }
>>       write_unlock(&css_set_lock);
>>
>> -     list_del(&root->root_list);
>> -     root_count--;
>> +     if (!list_empty(&root->root_list)) {
>> +             list_del(&root->root_list);
>> +             root_count--;
>> +     }
>
> That's ugly. It is _much_ cleaner to always keep the link head consistent
> - i.e. initialize it with INIT_LIST_HEAD()

It is initialized with INIT_LIST_HEAD().

> and then remove from it via
> list_del_init().

There's not much point doing list_del_init() rather than list_del()
here since we're about to delete the root.

>
> That way the error path will do the right thing automatically, and there's
> no need for that ugly "if !list_empty" construct either.

The important part here is avoiding decrementing root_count.

So the code could equally be:

if (!list_empty(&root->root_list)) {
  root_count--;
}
list_del(&root->root_list);

but what I have in this patch seems more straightforward. It's
actually how the code used to be before it was removed as a
"redundant" check by a patch that I unfortunately didn't get a chance
to read properly (or Ack) because I was too snowed under with other
work.

Paul

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] cgroup: Fix root_count when mount fails due to busy subsystem
  2009-01-23 16:59   ` Paul Menage
@ 2009-01-23 17:50     ` Ingo Molnar
  0 siblings, 0 replies; 9+ messages in thread
From: Ingo Molnar @ 2009-01-23 17:50 UTC (permalink / raw)
  To: Paul Menage; +Cc: akpm, serue, linux-kernel, containers, Peter Zijlstra


* Paul Menage <menage@google.com> wrote:

> On Fri, Jan 23, 2009 at 2:22 AM, Ingo Molnar <mingo@elte.hu> wrote:
> >> --- a/kernel/cgroup.c
> >> +++ b/kernel/cgroup.c
> >> @@ -1115,8 +1115,10 @@ static void cgroup_kill_sb(struct super_block *sb) {
> >>       }
> >>       write_unlock(&css_set_lock);
> >>
> >> -     list_del(&root->root_list);
> >> -     root_count--;
> >> +     if (!list_empty(&root->root_list)) {
> >> +             list_del(&root->root_list);
> >> +             root_count--;
> >> +     }
> >
> > That's ugly. It is _much_ cleaner to always keep the link head consistent
> > - i.e. initialize it with INIT_LIST_HEAD()
> 
> It is initialized with INIT_LIST_HEAD().
> 
> > and then remove from it via
> > list_del_init().
> 
> There's not much point doing list_del_init() rather than list_del()
> here since we're about to delete the root.
> 
> >
> > That way the error path will do the right thing automatically, and there's
> > no need for that ugly "if !list_empty" construct either.
> 
> The important part here is avoiding decrementing root_count.

that should be done via proper placement of err_* labels. An assymetric 
exit path like the one you did is always the sign of bad code structure.

	Ingo

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] cgroup: Fix root_count when mount fails due to busy subsystem
  2009-01-23  0:48 [PATCH] cgroup: Fix root_count when mount fails due to busy subsystem Paul Menage
  2009-01-23  2:20 ` Serge E. Hallyn
  2009-01-23 10:22 ` Ingo Molnar
@ 2009-01-23 18:10 ` Ingo Molnar
  2009-01-23 18:32   ` Paul Menage
  2 siblings, 1 reply; 9+ messages in thread
From: Ingo Molnar @ 2009-01-23 18:10 UTC (permalink / raw)
  To: Paul Menage; +Cc: akpm, serue, linux-kernel, containers


* Paul Menage <menage@google.com> wrote:

> cgroup: Fix root_count when mount fails due to busy subsystem
> 
> root_count was being incremented in cgroup_get_sb() after all error 
> checking was complete, but decremented in cgroup_kill_sb(), which can be 
> called on a superblock that we gave up on due to an error.  This patch 
> changes cgroup_kill_sb() to only decrement root_count if the root was 
> previously linked into the list of roots.

i'm wondering, what happens in the buggy case: does cgroup_kill_sb() get 
called twice (if yes, why?), or do we call cgroup_kill_sb() on a not yet 
added sb and hence root_count has not been elevated yet? (if yes, which 
codepath does this?)

The error handling in cgroup_get_sb() definitely seems a bit twisted - 
find below a few error path and other cleanups.

Thanks,

	Ingo

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index c298310..28d1b67 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1025,18 +1026,12 @@ static int cgroup_get_sb(struct file_system_type *fs_type,
 		 * have some link structures left over
 		 */
 		ret = allocate_cg_links(css_set_count, &tmp_cg_links);
-		if (ret) {
-			mutex_unlock(&cgroup_mutex);
-			mutex_unlock(&inode->i_mutex);
-			goto drop_new_super;
-		}
+		if (ret)
+			goto drop_new_super_unlock;
 
 		ret = rebind_subsystems(root, root->subsys_bits);
-		if (ret == -EBUSY) {
-			mutex_unlock(&cgroup_mutex);
-			mutex_unlock(&inode->i_mutex);
+		if (ret == -EBUSY)
 			goto free_cg_links;
-		}
 
 		/* EBUSY should be the only error here */
 		BUG_ON(ret);
@@ -1075,18 +1070,24 @@ static int cgroup_get_sb(struct file_system_type *fs_type,
 
  free_cg_links:
 	free_cg_links(&tmp_cg_links);
+
+ drop_new_super_unlock:
+	mutex_unlock(&cgroup_mutex);
+	mutex_unlock(&inode->i_mutex);
  drop_new_super:
 	up_write(&sb->s_umount);
 	deactivate_super(sb);
+
 	return ret;
 }
 
-static void cgroup_kill_sb(struct super_block *sb) {
+static void cgroup_kill_sb(struct super_block *sb)
+{
 	struct cgroupfs_root *root = sb->s_fs_info;
 	struct cgroup *cgrp = &root->top_cgroup;
-	int ret;
-	struct cg_cgroup_link *link;
 	struct cg_cgroup_link *saved_link;
+	struct cg_cgroup_link *link;
+	int ret;
 
 	BUG_ON(!root);
 

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] cgroup: Fix root_count when mount fails due to busy  subsystem
  2009-01-23 18:10 ` Ingo Molnar
@ 2009-01-23 18:32   ` Paul Menage
  2009-01-23 18:36     ` Ingo Molnar
  0 siblings, 1 reply; 9+ messages in thread
From: Paul Menage @ 2009-01-23 18:32 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: akpm, serue, linux-kernel, containers

On Fri, Jan 23, 2009 at 10:10 AM, Ingo Molnar <mingo@elte.hu> wrote:
>
> * Paul Menage <menage@google.com> wrote:
>
>> cgroup: Fix root_count when mount fails due to busy subsystem
>>
>> root_count was being incremented in cgroup_get_sb() after all error
>> checking was complete, but decremented in cgroup_kill_sb(), which can be
>> called on a superblock that we gave up on due to an error.  This patch
>> changes cgroup_kill_sb() to only decrement root_count if the root was
>> previously linked into the list of roots.
>
> i'm wondering, what happens in the buggy case: does cgroup_kill_sb() get
> called twice (if yes, why?),

No.

> or do we call cgroup_kill_sb() on a not yet
> added sb and hence root_count has not been elevated yet?

Right.

> (if yes, which
> codepath does this?)

It's via the call to deactivate_super().

The code could be restructured such that:

- we don't set sb->s_fs_info until we've linked the new root into the root_list
- do any necessary cleanup for a failed root in cgroup_get_sb()
- have cgroup_kill_sb() handle either no root or a fully-initialized root

But then you're replacing "only decrement root_count if root was
linked in to list" with "only do root cleanup if root was atached to
sb" in cgroup_kill_sb(). I don't see that one is much cleaner than the
other.

For 2.6.29, we should fix this by reverting the broken part of the
patch that made it into 2.6.29-rcX

Paul

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] cgroup: Fix root_count when mount fails due to busy subsystem
  2009-01-23 18:32   ` Paul Menage
@ 2009-01-23 18:36     ` Ingo Molnar
  2009-01-23 18:42       ` Paul Menage
  0 siblings, 1 reply; 9+ messages in thread
From: Ingo Molnar @ 2009-01-23 18:36 UTC (permalink / raw)
  To: Paul Menage; +Cc: akpm, serue, linux-kernel, containers


* Paul Menage <menage@google.com> wrote:

> On Fri, Jan 23, 2009 at 10:10 AM, Ingo Molnar <mingo@elte.hu> wrote:
> >
> > * Paul Menage <menage@google.com> wrote:
> >
> >> cgroup: Fix root_count when mount fails due to busy subsystem
> >>
> >> root_count was being incremented in cgroup_get_sb() after all error
> >> checking was complete, but decremented in cgroup_kill_sb(), which can be
> >> called on a superblock that we gave up on due to an error.  This patch
> >> changes cgroup_kill_sb() to only decrement root_count if the root was
> >> previously linked into the list of roots.
> >
> > i'm wondering, what happens in the buggy case: does cgroup_kill_sb() get
> > called twice (if yes, why?),
> 
> No.
> 
> > or do we call cgroup_kill_sb() on a not yet
> > added sb and hence root_count has not been elevated yet?
> 
> Right.
> 
> > (if yes, which
> > codepath does this?)
> 
> It's via the call to deactivate_super().

Which exact call chain is that?

> The code could be restructured such that:
> 
> - we don't set sb->s_fs_info until we've linked the new root into the root_list
> - do any necessary cleanup for a failed root in cgroup_get_sb()
> - have cgroup_kill_sb() handle either no root or a fully-initialized root
> 
> But then you're replacing "only decrement root_count if root was linked 
> in to list" with "only do root cleanup if root was atached to sb" in 
> cgroup_kill_sb(). I don't see that one is much cleaner than the other.

Agreed, that's not an improvement.

> For 2.6.29, we should fix this by reverting the broken part of the patch 
> that made it into 2.6.29-rcX

Agreed too - i withdraw my objection.

Nevertheless my observation remains: kernel/cgroup.c has a complex looking 
error paths which should be cleaned up. (independently of this issue)

	Ingo

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] cgroup: Fix root_count when mount fails due to busy  subsystem
  2009-01-23 18:36     ` Ingo Molnar
@ 2009-01-23 18:42       ` Paul Menage
  0 siblings, 0 replies; 9+ messages in thread
From: Paul Menage @ 2009-01-23 18:42 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: akpm, serue, linux-kernel, containers

On Fri, Jan 23, 2009 at 10:36 AM, Ingo Molnar <mingo@elte.hu> wrote:
>>
>> It's via the call to deactivate_super().
>
> Which exact call chain is that?

cgroup_get_sb() -> *failure* -> deactivate_super() -> VFS stuff
(sorry, no access to a source tree right now) -> cgroup_kill_sb()

Paul

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2009-01-23 18:42 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-01-23  0:48 [PATCH] cgroup: Fix root_count when mount fails due to busy subsystem Paul Menage
2009-01-23  2:20 ` Serge E. Hallyn
2009-01-23 10:22 ` Ingo Molnar
2009-01-23 16:59   ` Paul Menage
2009-01-23 17:50     ` Ingo Molnar
2009-01-23 18:10 ` Ingo Molnar
2009-01-23 18:32   ` Paul Menage
2009-01-23 18:36     ` Ingo Molnar
2009-01-23 18:42       ` Paul Menage

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox