All of lore.kernel.org
 help / color / mirror / Atom feed
* commit 7534432dcc3c654a8671b6b0cdffd1dbdbc73074
@ 2009-01-22 18:47 Serge E. Hallyn
       [not found] ` <20090122184754.GA17511-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Serge E. Hallyn @ 2009-01-22 18:47 UTC (permalink / raw)
  To: laijs-BthXqXjhjHXQFUHtdCDX3A; +Cc: Linux Containers, Paul Menage

The following script gives me a reliable BUG bisected to
commit 7534432dcc3c654a8671b6b0cdffd1dbdbc73074, subject
"cgroups: remove rcu_read_lock() in cgroupstats_build()".
It's not immediately clear to me why that commit should
cause this...

thanks,
-serge

==============================================================
script
==============================================================
#!/bin/bash
mount -t cgroup -o freezer none /cgroup
sleep 100 &
pid=`jobs -p`
mkdir /cgroup/1
echo $pid > /cgroup/1/tasks
umount /cgroup
mount -t cgroup -o freezer,ns none /cgroup
mount -t cgroup -o freezer none /cgroup
#kill %1
sleep 100 &
pid=`jobs -p | tail -1`
mkdir /cgroup/2
echo $pid > /cgroup/2/tasks
umount /cgroup
==============================================================

==============================================================
The BUG output:
==============================================================


------------[ cut here ]------------
kernel BUG at kernel/cgroup.c:468!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/kernel/uevent_seqnum
Modules linked in:

Pid: 2900, comm: sh Not tainted (2.6.28-07513-ge5f6a86 #231)
EIP: 0060:[<c02411bb>] EFLAGS: 00010293 CPU: 2
EIP is at cgroup_attach_task+0x281/0x3ba
EAX: dfb57ee8 EBX: dfbe2914 ECX: df8cda68 EDX: dfb57ef0
ESI: df0ff900 EDI: 00000004 EBP: dfbe2900 ESP: dfb57ec0
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process sh (pid: 2900, ti=dfb56000 task=df88ee00 task.ti=dfb56000)
Stack:
 dfb57ee8 dfaabde0 df0ff900 df294014 c0725b88 df294000 c0621244 df800150
 df809040 df8cd9c0 df9563e0 c0724b48 df8cda68 dfbe2914 dfbe2900 df0ff900
 00000005 c024135f dfb57f2c fffffff2 c0620968 c0241836 00000b5e 00000000
Call Trace:
 [<c024135f>] cgroup_tasks_write+0x6b/0x91
 [<c0241836>] cgroup_file_write+0xcd/0x1c3
 [<c02606ae>] handle_mm_fault+0x4da/0x52a
 [<c0241769>] cgroup_file_write+0x0/0x1c3
 [<c0271fa1>] vfs_write+0x83/0xf6
 [<c027245f>] sys_write+0x3c/0x63
 [<c0202d1a>] syscall_call+0x7/0xb
 [<c04f0000>] do_nanosleep+0x25/0x8c
Code: 8d 50 08 a3 48 4b 72 c0 89 68 10 8b 4d 14 c7 40 04 48 4b 72 c0 89 48 08 89 51 04 89 55 14 89 5a 04 8d 44 24 28 39 44 24 28 74 04 <0f> 0b eb fe 8d 45 1c ff 05 c8 5b 72 c0 e8 d6 ec ff ff 8d 4d 04
EIP: [<c02411bb>] cgroup_attach_task+0x281/0x3ba SS:ESP 0068:dfb57ec0
---[ end trace d624dca4c4cd0d89 ]---

==============================================================

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: commit 7534432dcc3c654a8671b6b0cdffd1dbdbc73074
       [not found] ` <20090122184754.GA17511-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2009-01-22 23:07   ` Paul Menage
       [not found]     ` <6599ad830901221507x796c154ck5fe4f95cc3eb8e0e-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2009-01-23  1:12   ` Paul Menage
  1 sibling, 1 reply; 8+ messages in thread
From: Paul Menage @ 2009-01-22 23:07 UTC (permalink / raw)
  To: Serge E. Hallyn; +Cc: Linux Containers

On Thu, Jan 22, 2009 at 10:47 AM, Serge E. Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> wrote:
> #!/bin/bash
> mount -t cgroup -o freezer none /cgroup
> sleep 100 &
> pid=`jobs -p`

You can use $! to refer to the most recently started background process.

> mkdir /cgroup/1
> echo $pid > /cgroup/1/tasks
> umount /cgroup
> mount -t cgroup -o freezer,ns none /cgroup

This command should have failed with EBUSY, since freezer is already
part of an existing hierarchy. So I think it's a red herring in this
problem.

> mount -t cgroup -o freezer none /cgroup
> #kill %1

Is there a reason for this commented-out line?

> sleep 100 &
> pid=`jobs -p | tail -1`
> mkdir /cgroup/2
> echo $pid > /cgroup/2/tasks

This is the line that crashes?

> umount /cgroup
> ==============================================================
>
> ==============================================================
> The BUG output:
> ==============================================================
>
>
> ------------[ cut here ]------------
> kernel BUG at kernel/cgroup.c:468!

What is this line in your tree?

Paul

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: commit 7534432dcc3c654a8671b6b0cdffd1dbdbc73074
       [not found]     ` <6599ad830901221507x796c154ck5fe4f95cc3eb8e0e-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2009-01-22 23:35       ` Paul Menage
       [not found]         ` <6599ad830901221535w6c723eb6xd0adced0ff6a1fb6-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2009-01-23  2:06       ` Serge E. Hallyn
  1 sibling, 1 reply; 8+ messages in thread
From: Paul Menage @ 2009-01-22 23:35 UTC (permalink / raw)
  To: Serge E. Hallyn; +Cc: Linux Containers

On Thu, Jan 22, 2009 at 3:07 PM, Paul Menage <menage-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
>> mount -t cgroup -o freezer,ns none /cgroup
>
> This command should have failed with EBUSY, since freezer is already
> part of an existing hierarchy. So I think it's a red herring in this
> problem.

But in fact taking out this line does stop the crash. So it's a bug in
the error handling when you try to mount a hierarchy with a subsystem
that's already busy elsewhere.

Paul

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: commit 7534432dcc3c654a8671b6b0cdffd1dbdbc73074
       [not found]         ` <6599ad830901221535w6c723eb6xd0adced0ff6a1fb6-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2009-01-23  0:26           ` Paul Menage
       [not found]             ` <6599ad830901221626o2fd5025fib3c1c6d182e52e63-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Paul Menage @ 2009-01-23  0:26 UTC (permalink / raw)
  To: Serge E. Hallyn; +Cc: Linux Containers

On Thu, Jan 22, 2009 at 3:35 PM, Paul Menage <menage-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
> On Thu, Jan 22, 2009 at 3:07 PM, Paul Menage <menage-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
>>> mount -t cgroup -o freezer,ns none /cgroup
>>
>> This command should have failed with EBUSY, since freezer is already
>> part of an existing hierarchy. So I think it's a red herring in this
>> problem.
>
> But in fact taking out this line does stop the crash. So it's a bug in
> the error handling when you try to mount a hierarchy with a subsystem
> that's already busy elsewhere.

It's due to root_count getting out of sync with reality. It gets
incremented in cgroup_get_sb() after we've done all error checking,
but decremented in cgroup_kill_sb, which can be called on a superblock
that we tried to create but then gave up on. I guess the fix is to
increment root_count as soon as we have the superblock.

Paul

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: commit 7534432dcc3c654a8671b6b0cdffd1dbdbc73074
       [not found] ` <20090122184754.GA17511-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
  2009-01-22 23:07   ` Paul Menage
@ 2009-01-23  1:12   ` Paul Menage
  1 sibling, 0 replies; 8+ messages in thread
From: Paul Menage @ 2009-01-23  1:12 UTC (permalink / raw)
  To: Serge E. Hallyn, lizf-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org
  Cc: Linux Containers

On Thu, Jan 22, 2009 at 10:47 AM, Serge E. Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> wrote:
> The following script gives me a reliable BUG bisected to
> commit 7534432dcc3c654a8671b6b0cdffd1dbdbc73074, subject
> "cgroups: remove rcu_read_lock() in cgroupstats_build()".
> It's not immediately clear to me why that commit should
> cause this...
>

I think the problematic commit was
e5f6a8609bab0c2d7543ab1505105e011832afd7, which removed a "redundant
check" from cgroup_kill_sb(). So the good news is this hasn't made it
into a released kernel yet. I sent a patch to put that check back in.

Paul

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: commit 7534432dcc3c654a8671b6b0cdffd1dbdbc73074
       [not found]             ` <6599ad830901221626o2fd5025fib3c1c6d182e52e63-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2009-01-23  1:26               ` Matt Helsley
  2009-01-23  1:31                 ` Paul Menage
  0 siblings, 1 reply; 8+ messages in thread
From: Matt Helsley @ 2009-01-23  1:26 UTC (permalink / raw)
  To: Paul Menage; +Cc: Linux Containers

On Thu, 2009-01-22 at 16:26 -0800, Paul Menage wrote:
> On Thu, Jan 22, 2009 at 3:35 PM, Paul Menage <menage-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
> > On Thu, Jan 22, 2009 at 3:07 PM, Paul Menage <menage-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
> >>> mount -t cgroup -o freezer,ns none /cgroup
> >>
> >> This command should have failed with EBUSY, since freezer is already
> >> part of an existing hierarchy. So I think it's a red herring in this
> >> problem.
> >
> > But in fact taking out this line does stop the crash. So it's a bug in
> > the error handling when you try to mount a hierarchy with a subsystem
> > that's already busy elsewhere.
> 
> It's due to root_count getting out of sync with reality. It gets
> incremented in cgroup_get_sb() after we've done all error checking,
> but decremented in cgroup_kill_sb, which can be called on a superblock
> that we tried to create but then gave up on. I guess the fix is to
> increment root_count as soon as we have the superblock.
> 
> Paul

	I see what you mean. I am still mystified how rcu_read_lock() around
the cgroup iterator in cgroupstats_build() hid/prevented the problem
though :/.

	Incrementing the root_count early dissociates root_count and the
addition to root list, doesn't it? How about only decrementing the count
if we're on the list as in the patch below?

Cheers,
	-Matt Helsley

Subject: [PATCH][RFC] Fix incorrect root_count decrement when recovering
from error in cgroup_get_sb()

We may be recovering from an error in cgroup_get_sb() that occured prior
to adding the new hierarchy root to the roots list. Check if root was
added to roots and only decrement if it has been.

Signed-off-by: Matt Helsley <matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
---
Totally untested. Need to check that &root->root_list is empty whenever
root is off the roots list.

 kernel/cgroup.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

Index: linux-2.6.29-rc1/kernel/cgroup.c
===================================================================
--- linux-2.6.29-rc1.orig/kernel/cgroup.c
+++ linux-2.6.29-rc1/kernel/cgroup.c
@@ -1115,8 +1115,11 @@ static void cgroup_kill_sb(struct super_
 	}
 	write_unlock(&css_set_lock);
 
-	list_del(&root->root_list);
-	root_count--;
+	if (!list_empty(&root->root_list)) {
+		list_del(&root->root_list);
+		root_count--;
+	} /* else we're recovering from an error in cgroup_get_sb() and hadn't added to
+	     the root list yet */
 
 	mutex_unlock(&cgroup_mutex);

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: commit 7534432dcc3c654a8671b6b0cdffd1dbdbc73074
  2009-01-23  1:26               ` Matt Helsley
@ 2009-01-23  1:31                 ` Paul Menage
  0 siblings, 0 replies; 8+ messages in thread
From: Paul Menage @ 2009-01-23  1:31 UTC (permalink / raw)
  To: Matt Helsley; +Cc: Linux Containers

On Thu, Jan 22, 2009 at 5:26 PM, Matt Helsley <matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> wrote:
>
>        I see what you mean. I am still mystified how rcu_read_lock() around
> the cgroup iterator in cgroupstats_build() hid/prevented the problem
> though :/.

I think the problem was in git bisect - the problematic commit was a
couple later than the one Serge pointed at.

>
>        Incrementing the root_count early dissociates root_count and the
> addition to root list, doesn't it? How about only decrementing the count
> if we're on the list as in the patch below?

That's basically the patch that I already sent out. That check was
there originally, but removed in
e5f6a8609bab0c2d7543ab1505105e011832afd7 as a "redundant check".

Paul

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: commit 7534432dcc3c654a8671b6b0cdffd1dbdbc73074
       [not found]     ` <6599ad830901221507x796c154ck5fe4f95cc3eb8e0e-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2009-01-22 23:35       ` Paul Menage
@ 2009-01-23  2:06       ` Serge E. Hallyn
  1 sibling, 0 replies; 8+ messages in thread
From: Serge E. Hallyn @ 2009-01-23  2:06 UTC (permalink / raw)
  To: Paul Menage; +Cc: Linux Containers

Quoting Paul Menage (menage-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org):
> On Thu, Jan 22, 2009 at 10:47 AM, Serge E. Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> wrote:
> > #!/bin/bash
> > mount -t cgroup -o freezer none /cgroup
> > sleep 100 &
> > pid=`jobs -p`
> 
> You can use $! to refer to the most recently started background process.

Hah - thanks, I actually looked for something like that but couldn't
find it.

> 
> > mkdir /cgroup/1
> > echo $pid > /cgroup/1/tasks
> > umount /cgroup
> > mount -t cgroup -o freezer,ns none /cgroup
> 
> This command should have failed with EBUSY, since freezer is already
> part of an existing hierarchy. So I think it's a red herring in this
> problem.
> 
> > mount -t cgroup -o freezer none /cgroup
> > #kill %1
> 
> Is there a reason for this commented-out line?

Sorry, I'd meant to look into whether that needed
to be there or not, but decided to put it off until
after git bisecting, then forgot, probably because
the bisect result was so confusing.

I see you already sent a patch for the *real*
problem, thanks!

-serge

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2009-01-23  2:06 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-01-22 18:47 commit 7534432dcc3c654a8671b6b0cdffd1dbdbc73074 Serge E. Hallyn
     [not found] ` <20090122184754.GA17511-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-01-22 23:07   ` Paul Menage
     [not found]     ` <6599ad830901221507x796c154ck5fe4f95cc3eb8e0e-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-01-22 23:35       ` Paul Menage
     [not found]         ` <6599ad830901221535w6c723eb6xd0adced0ff6a1fb6-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-01-23  0:26           ` Paul Menage
     [not found]             ` <6599ad830901221626o2fd5025fib3c1c6d182e52e63-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-01-23  1:26               ` Matt Helsley
2009-01-23  1:31                 ` Paul Menage
2009-01-23  2:06       ` Serge E. Hallyn
2009-01-23  1:12   ` Paul Menage

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.