* Hang with fair cgroup scheduler (reproducer is attached.)
@ 2007-12-14 7:18 KAMEZAWA Hiroyuki
[not found] ` <20071214161834.034e6efe.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
0 siblings, 1 reply; 25+ messages in thread
From: KAMEZAWA Hiroyuki @ 2007-12-14 7:18 UTC (permalink / raw)
To: containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org
Cc: Andrew Morton, mingo-X9Un+BFzKDI,
vatsa-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8
[-- Attachment #1: Type: text/plain, Size: 742 bytes --]
Hi,
While I was testing 2.6.24-rc5-mm1's fair group scheduler (with cgroup),
the system hangs. please confirm. it's reproducible on my box.
My test program is attached.
What happens:
the system hangs. (panic ?)
Environ:
ia64/NUMA 8CPU systems. 4 cpus per node.
How to reproduce:
Compile attached one.
# gcc -o reg reg.c
Create group as following
# mount -t cgroup none /opt/cgroup -o cpu
# mkdir /opt/cgroup/group_1
# mkdir /opt/cgroup/group_2
And run attached program
# ./reg 8 8
What 'reg' does;
usage : reg A B C...
This program forks child process and assign
A of processes to group_1
B of processes to group_2
C of processes to group_3
kick and waitpid all and repeat.
Thanks,
-Kame
[-- Attachment #2: reg.c --]
[-- Type: text/x-csrc, Size: 2301 bytes --]
#include <stdlib.h>
#include <stdio.h>
#include <strings.h>
#include <sys/types.h>
#include <unistd.h>
#include <sched.h>
#include <asm/intrinsics.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <errno.h>
#include <sys/times.h>
static char *shared;
#define MAX_PROCS 32
#define SHMSIZE (16384)
struct start_stop {
int go;
};
/* Assign PID to a group....
* work as # echo PID > /opt/cgroup/group_%d/tasks
*/
void assign_to(int pid, int group)
{
FILE *fp;
char buf[32];
memset(buf, 0, sizeof(buf));
sprintf(buf,"/opt/cgroup/group_%d/tasks",group);
fp = fopen(buf,"w");
if (fp == NULL) {
perror("fopen");
fprintf(stderr, "failed : fopen");
exit(0);
}
fprintf(fp, "%d", pid);
fclose(fp);
printf("%d to %s\n", pid, buf);
}
/*
* spin wait and go into small loop.
* # of loops are counted as score.
* This process's utime is recorded in times[id]
*/
int worker(int id)
{
struct start_stop *shared_flag;
shared_flag = (struct start_stop*)shared;
do {
sched_yield();
ia64_mf();
} while (!shared_flag->go);
}
/*
* If you want to assign..
* 2 proces to group 1, 3 procs to group 2 -># ./a.out 2 3
* 3 proces to group 1, 3 procs to group 2, 3 procs to group 3
* -># ./a.out 3 3 3
* Total 32 procs are supported.
*/
int main(int argc, char *argv[])
{
int nprocs;
int shmid, i;
struct start_stop *shared_flag;
int pids[MAX_PROCS];
int groups[MAX_PROCS];
memset(pids, 0 , sizeof(pids));
memset(groups, 0 , sizeof(groups));
again:
for (nprocs = 0, i = 1; i < argc; i++) {
int num = atoi(argv[i]);
int j;
for (j = 0; j < num; j++) {
groups[nprocs + j] = i;
}
nprocs += num;
}
shmid = shmget(IPC_PRIVATE, SHMSIZE, IPC_CREAT | 0666);
if (shmid == -1) {
perror("shmget");
exit(1);
}
shared = shmat(shmid, NULL, 0);
shared_flag = (struct start_stop *)shared;
memset(shared, 0, SHMSIZE);
shmctl(shmid, IPC_RMID, 0);
for (i = 0; i < nprocs; i++) {
int ret;
ret = fork();
if (ret == 0) {
worker(i);
exit(0);
} else if (ret == -1) {
perror("fork");
exit(0);
}
pids[i] = ret;
}
sleep(1);
for (i = 0; i < nprocs; i++)
assign_to(pids[i], groups[i]);
sleep(1);
ia64_mf();
shared_flag->go = 1;
for (i = 0; i < nprocs; i++) {
int status;
waitpid(pids[i], &status, 0);
}
goto again;
return 0;
}
[-- Attachment #3: Type: text/plain, Size: 206 bytes --]
_______________________________________________
Containers mailing list
Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
https://lists.linux-foundation.org/mailman/listinfo/containers
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Hang with fair cgroup scheduler (reproducer is attached.)
[not found] ` <20071214161834.034e6efe.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
@ 2007-12-14 8:17 ` KAMEZAWA Hiroyuki
[not found] ` <20071214171759.59f7ba57.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2007-12-14 9:48 ` Ingo Molnar
1 sibling, 1 reply; 25+ messages in thread
From: KAMEZAWA Hiroyuki @ 2007-12-14 8:17 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org, Andrew Morton,
mingo-X9Un+BFzKDI, vatsa-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8
Tested again, and got NULL access and panic.
This is my guess from stack dump. (raw stack dump is attached below.)
==
static struct task_struct *pick_next_task_fair(struct rq *rq)
{
struct cfs_rq *cfs_rq = &rq->cfs;
struct sched_entity *se;
if (unlikely(!cfs_rq->nr_running))
return NULL;
do {
se = pick_next_entity(cfs_rq); <-- se was NULL.
cfs_rq = group_cfs_rq(se); <-- se->my_q causes SEGV
} while (cfs_rq);
return task_of(se);
}
===
Seems first_fair() was NULL in
==
static struct sched_entity *pick_next_entity(struct cfs_rq *cfs_rq)
{
struct sched_entity *se = NULL;
if (first_fair(cfs_rq)) { <------------------------------(*)
se = __pick_next_entity(cfs_rq);
set_next_entity(cfs_rq, se);
}
return se;
}
==
from register information.
Thanks,
-Kame
Stack dump is here.
==
Pid: 8197, CPU 6, comm: reg
psr : 00001210085a2010 ifs : 8000000000000206 ip : [<a000000100067c01>] Not tainted
ip is at pick_next_task_fair+0x81/0xe0
unat: 0000000000000000 pfs : 0000000000000206 rsc : 0000000000000003
rnat: 0000000000000000 bsps: 0000000000000000 pr : 0000000000556959
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c0270033f
csd : 0000000000000000 ssd : 0000000000000000
b0 : a000000100067c00 b6 : a000000100076a60 b7 : a00000010000ee50
NaT consumption 2216203124768 [1]^M
Modules linked in: sunrpc binfmt_misc dm_mirror dm_mod fan sg thermal e1000 processor button conta
iner e100 eepro100 mii lpfc mptspi mptscsih mptbase ehci_hcd ohci_hcd uhci_hcd^M
^M
Pid: 8197, CPU 6, comm: reg^M
psr : 00001210085a2010 ifs : 8000000000000206 ip : [<a000000100067c01>] Not tainted^M
ip is at pick_next_task_fair+0x81/0xe0^M
unat: 0000000000000000 pfs : 0000000000000206 rsc : 0000000000000003^M
rnat: 0000000000000000 bsps: 0000000000000000 pr : 0000000000556959^M
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c0270033f^M
csd : 0000000000000000 ssd : 0000000000000000^M
b0 : a000000100067c00 b6 : a000000100076a60 b7 : a00000010000ee50^M
f6 : 000000000000000000000 f7 : 000000000000000000000^M
f8 : 1003e00000000a0000007 f9 : 1003e00000059499dd2c3^M
f10 : 1003ece02a62ae350c355 f11 : 1003e0000000000000037^M
r1 : a000000100d87a60 r2 : 000000df13538d0b r3 : 0000000000000060^M
r8 : 0000000000000000 r9 : e00001a004034b30 r10 : 0000000000000000^M
r11 : e00001a004034aa8 r12 : e00001a10397fe10 r13 : e00001a103970000^M
r14 : 00000000d594bde3 r15 : e00001a004034ab0 r16 : e00001a004034ab8^M
r17 : e00001a004034ac8 r18 : e00001a004038320 r19 : e00001a10426ff20^M
r20 : 0000000000000000 r21 : 0000000000000000 r22 : 0000000000000001^M
r23 : e00001a004034a91 r24 : e00001a004034a90 r25 : e00001a10426ff10^M
r26 : 0000000000000002 r27 : e00001a0040382f0 r28 : e00001a004038288^M
r29 : a0000001008a5468 r30 : a000000100076a60 r31 : a000000100b726e0^M
^M
Call Trace:^M
[<a000000100013bc0>] show_stack+0x40/0xa0^M
sp=e00001a10397f860 bsp=e00001a103970f18^M
[<a000000100014840>] show_regs+0x840/0x880^M
sp=e00001a10397fa30 bsp=e00001a103970ec0^M
[<a000000100036fa0>] die+0x1a0/0x2a0^M
sp=e00001a10397fa30 bsp=e00001a103970e78^M
[<a0000001000370f0>] die_if_kernel+0x50/0x80^M
sp=e00001a10397fa30 bsp=e00001a103970e48^M
[<a000000100038260>] ia64_fault+0x1140/0x1260^M
sp=e00001a10397fa30 bsp=e00001a103970de8^M
[<a00000010000ae20>] ia64_leave_kernel+0x0/0x270^M
sp=e00001a10397fc40 bsp=e00001a103970de8^M
[<a000000100067c00>] pick_next_task_fair+0x80/0xe0^M
sp=e00001a10397fe10 bsp=e00001a103970db8^M
[<a0000001006f6a60>] schedule+0x8e0/0x1280^M
sp=e00001a10397fe10 bsp=e00001a103970d08^M
[<a000000100074e20>] sys_sched_yield+0xe0/0x100^M
sp=e00001a10397fe30 bsp=e00001a103970ca8^M
[<a00000010000aca0>] ia64_ret_from_syscall+0x0/0x20^M
sp=e00001a10397fe30 bsp=e00001a103970ca8^M
[<a000000000010720>] __kernel_syscall_via_break+0x0/0x20^M
sp=e00001a103980000 bsp=e00001a103970ca8^M
Disassemble.
==
a000000100067b80 <pick_next_task_fair>:
a000000100067b80: 18 10 19 08 80 05 [MMB] alloc r34=ar.pfs,6,4,0
a000000100067b86: 20 80 83 00 42 00 adds r2=112,r32
a000000100067b8c: 00 00 00 20 nop.b 0x0
a000000100067b90: 09 20 81 41 00 21 [MMI] adds r36=96,r32
a000000100067b96: 00 00 00 02 00 20 nop.m 0x0
a000000100067b9c: 04 00 c4 00 mov r33=b0;;
a000000100067ba0: 0b 70 00 04 18 10 [MMI] ld8 r14=[r2];;
a000000100067ba6: 70 00 38 0c 72 00 cmp.eq p7,p6=0,r14
a000000100067bac: 00 00 04 00 nop.i 0x0;;
a000000100067bb0: 10 00 00 00 01 c0 [MIB] nop.m 0x0
a000000100067bb6: 81 00 00 00 c2 03 (p07) mov r8=r0
a000000100067bbc: 80 00 00 41 (p07) br.cond.spnt.few a000000100067c30 <pick_next_task_fair+0xb
0>
a000000100067bc0: 09 48 c0 48 00 21 [MMI] adds r9=48,r36
a000000100067bc6: 00 00 00 02 00 00 nop.m 0x0
a000000100067bcc: 04 00 00 84 mov r32=r0;;
a000000100067bd0: 09 00 00 00 01 00 [MMI] nop.m 0x0
a000000100067bd6: 80 00 24 30 20 00 ld8 r8=[r9]
a000000100067bdc: 00 00 04 00 nop.i 0x0;;
a000000100067be0: 03 00 00 00 01 00 [MII] nop.m 0x0
a000000100067be6: b0 00 20 14 72 05 cmp.eq p11,p10=0,r8;;
a000000100067bec: 04 47 fc 8c (p10) adds r32=-16,r8;;
a000000100067bf0: 51 29 01 40 00 21 [MIB] (p10) mov r37=r32
a000000100067bf6: 00 00 00 02 00 05 nop.i 0x0
a000000100067bfc: 58 fe ff 5a (p10) br.call.dptk.many b0=a000000100067a40 <set_next_entity>;;
a000000100067c00: 0b 18 80 41 00 21 [MMI] adds r3=96,r32;;
a000000100067c06: 40 02 0c 30 20 00 ld8 r36=[r3] <----------panic.
a000000100067c0c: 00 00 04 00 nop.i 0x0;;
a000000100067c10: 10 00 00 00 01 00 [MIB] nop.m 0x0
a000000100067c16: 90 00 90 10 72 04 cmp.eq p9,p8=0,r36
a000000100067c1c: b0 ff ff 4a (p08) br.cond.dptk.few a000000100067bc0 <pick_next_task_fair+0x4
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Hang with fair cgroup scheduler (reproducer is attached.)
[not found] ` <20071214161834.034e6efe.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2007-12-14 8:17 ` KAMEZAWA Hiroyuki
@ 2007-12-14 9:48 ` Ingo Molnar
1 sibling, 0 replies; 25+ messages in thread
From: Ingo Molnar @ 2007-12-14 9:48 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: Dhaval Giani, vatsa-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
Dmitry Adamushko,
containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org, Andrew Morton,
Peter Zijlstra
(Cc:-ed other folks as well)
* KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> wrote:
> Hi,
>
> While I was testing 2.6.24-rc5-mm1's fair group scheduler (with cgroup),
> the system hangs. please confirm. it's reproducible on my box.
>
> My test program is attached.
>
> What happens:
> the system hangs. (panic ?)
>
> Environ:
> ia64/NUMA 8CPU systems. 4 cpus per node.
>
> How to reproduce:
> Compile attached one.
> # gcc -o reg reg.c
> Create group as following
> # mount -t cgroup none /opt/cgroup -o cpu
> # mkdir /opt/cgroup/group_1
> # mkdir /opt/cgroup/group_2
>
> And run attached program
> # ./reg 8 8
>
> What 'reg' does;
> usage : reg A B C...
> This program forks child process and assign
> A of processes to group_1
> B of processes to group_2
> C of processes to group_3
> kick and waitpid all and repeat.
>
> Thanks,
> -Kame
> #include <stdlib.h>
> #include <stdio.h>
> #include <strings.h>
> #include <sys/types.h>
> #include <unistd.h>
> #include <sched.h>
> #include <asm/intrinsics.h>
> #include <sys/ipc.h>
> #include <sys/shm.h>
> #include <errno.h>
> #include <sys/times.h>
>
> static char *shared;
> #define MAX_PROCS 32
> #define SHMSIZE (16384)
>
> struct start_stop {
> int go;
> };
>
> /* Assign PID to a group....
> * work as # echo PID > /opt/cgroup/group_%d/tasks
> */
> void assign_to(int pid, int group)
> {
> FILE *fp;
> char buf[32];
>
> memset(buf, 0, sizeof(buf));
> sprintf(buf,"/opt/cgroup/group_%d/tasks",group);
> fp = fopen(buf,"w");
> if (fp == NULL) {
> perror("fopen");
> fprintf(stderr, "failed : fopen");
> exit(0);
> }
> fprintf(fp, "%d", pid);
> fclose(fp);
> printf("%d to %s\n", pid, buf);
> }
>
> /*
> * spin wait and go into small loop.
> * # of loops are counted as score.
> * This process's utime is recorded in times[id]
> */
> int worker(int id)
> {
> struct start_stop *shared_flag;
>
> shared_flag = (struct start_stop*)shared;
> do {
> sched_yield();
> ia64_mf();
> } while (!shared_flag->go);
> }
>
> /*
> * If you want to assign..
> * 2 proces to group 1, 3 procs to group 2 -># ./a.out 2 3
> * 3 proces to group 1, 3 procs to group 2, 3 procs to group 3
> * -># ./a.out 3 3 3
> * Total 32 procs are supported.
> */
>
> int main(int argc, char *argv[])
> {
> int nprocs;
> int shmid, i;
> struct start_stop *shared_flag;
> int pids[MAX_PROCS];
> int groups[MAX_PROCS];
>
> memset(pids, 0 , sizeof(pids));
> memset(groups, 0 , sizeof(groups));
>
> again:
> for (nprocs = 0, i = 1; i < argc; i++) {
> int num = atoi(argv[i]);
> int j;
> for (j = 0; j < num; j++) {
> groups[nprocs + j] = i;
> }
> nprocs += num;
> }
>
> shmid = shmget(IPC_PRIVATE, SHMSIZE, IPC_CREAT | 0666);
> if (shmid == -1) {
> perror("shmget");
> exit(1);
> }
>
> shared = shmat(shmid, NULL, 0);
> shared_flag = (struct start_stop *)shared;
>
> memset(shared, 0, SHMSIZE);
> shmctl(shmid, IPC_RMID, 0);
>
> for (i = 0; i < nprocs; i++) {
> int ret;
> ret = fork();
> if (ret == 0) {
> worker(i);
> exit(0);
> } else if (ret == -1) {
> perror("fork");
> exit(0);
> }
> pids[i] = ret;
> }
> sleep(1);
> for (i = 0; i < nprocs; i++)
> assign_to(pids[i], groups[i]);
> sleep(1);
> ia64_mf();
> shared_flag->go = 1;
>
> for (i = 0; i < nprocs; i++) {
> int status;
> waitpid(pids[i], &status, 0);
> }
> goto again;
>
> return 0;
> }
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Hang with fair cgroup scheduler (reproducer is attached.)
[not found] ` <20071214171759.59f7ba57.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
@ 2007-12-14 9:49 ` Ingo Molnar
[not found] ` <20071214094909.GG11266-X9Un+BFzKDI@public.gmane.org>
0 siblings, 1 reply; 25+ messages in thread
From: Ingo Molnar @ 2007-12-14 9:49 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: Dhaval Giani, vatsa-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
Dmitry Adamushko,
containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org, Andrew Morton,
Peter Zijlstra
(Cc:-ed other folks as well)
* KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> wrote:
> Tested again, and got NULL access and panic.
>
> This is my guess from stack dump. (raw stack dump is attached below.)
> ==
>
> static struct task_struct *pick_next_task_fair(struct rq *rq)
> {
> struct cfs_rq *cfs_rq = &rq->cfs;
> struct sched_entity *se;
>
> if (unlikely(!cfs_rq->nr_running))
> return NULL;
>
> do {
> se = pick_next_entity(cfs_rq); <-- se was NULL.
> cfs_rq = group_cfs_rq(se); <-- se->my_q causes SEGV
> } while (cfs_rq);
>
> return task_of(se);
> }
> ===
> Seems first_fair() was NULL in
> ==
> static struct sched_entity *pick_next_entity(struct cfs_rq *cfs_rq)
> {
> struct sched_entity *se = NULL;
>
> if (first_fair(cfs_rq)) { <------------------------------(*)
> se = __pick_next_entity(cfs_rq);
> set_next_entity(cfs_rq, se);
> }
>
> return se;
> }
> ==
> from register information.
>
> Thanks,
> -Kame
>
>
> Stack dump is here.
> ==
> Pid: 8197, CPU 6, comm: reg
> psr : 00001210085a2010 ifs : 8000000000000206 ip : [<a000000100067c01>] Not tainted
> ip is at pick_next_task_fair+0x81/0xe0
> unat: 0000000000000000 pfs : 0000000000000206 rsc : 0000000000000003
> rnat: 0000000000000000 bsps: 0000000000000000 pr : 0000000000556959
> ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c0270033f
> csd : 0000000000000000 ssd : 0000000000000000
> b0 : a000000100067c00 b6 : a000000100076a60 b7 : a00000010000ee50
> NaT consumption 2216203124768 [1]^M
> Modules linked in: sunrpc binfmt_misc dm_mirror dm_mod fan sg thermal e1000 processor button conta
> iner e100 eepro100 mii lpfc mptspi mptscsih mptbase ehci_hcd ohci_hcd uhci_hcd^M
> ^M
> Pid: 8197, CPU 6, comm: reg^M
> psr : 00001210085a2010 ifs : 8000000000000206 ip : [<a000000100067c01>] Not tainted^M
> ip is at pick_next_task_fair+0x81/0xe0^M
> unat: 0000000000000000 pfs : 0000000000000206 rsc : 0000000000000003^M
> rnat: 0000000000000000 bsps: 0000000000000000 pr : 0000000000556959^M
> ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c0270033f^M
> csd : 0000000000000000 ssd : 0000000000000000^M
> b0 : a000000100067c00 b6 : a000000100076a60 b7 : a00000010000ee50^M
> f6 : 000000000000000000000 f7 : 000000000000000000000^M
> f8 : 1003e00000000a0000007 f9 : 1003e00000059499dd2c3^M
> f10 : 1003ece02a62ae350c355 f11 : 1003e0000000000000037^M
> r1 : a000000100d87a60 r2 : 000000df13538d0b r3 : 0000000000000060^M
> r8 : 0000000000000000 r9 : e00001a004034b30 r10 : 0000000000000000^M
> r11 : e00001a004034aa8 r12 : e00001a10397fe10 r13 : e00001a103970000^M
> r14 : 00000000d594bde3 r15 : e00001a004034ab0 r16 : e00001a004034ab8^M
> r17 : e00001a004034ac8 r18 : e00001a004038320 r19 : e00001a10426ff20^M
> r20 : 0000000000000000 r21 : 0000000000000000 r22 : 0000000000000001^M
> r23 : e00001a004034a91 r24 : e00001a004034a90 r25 : e00001a10426ff10^M
> r26 : 0000000000000002 r27 : e00001a0040382f0 r28 : e00001a004038288^M
> r29 : a0000001008a5468 r30 : a000000100076a60 r31 : a000000100b726e0^M
> ^M
> Call Trace:^M
> [<a000000100013bc0>] show_stack+0x40/0xa0^M
> sp=e00001a10397f860 bsp=e00001a103970f18^M
> [<a000000100014840>] show_regs+0x840/0x880^M
> sp=e00001a10397fa30 bsp=e00001a103970ec0^M
> [<a000000100036fa0>] die+0x1a0/0x2a0^M
> sp=e00001a10397fa30 bsp=e00001a103970e78^M
> [<a0000001000370f0>] die_if_kernel+0x50/0x80^M
> sp=e00001a10397fa30 bsp=e00001a103970e48^M
> [<a000000100038260>] ia64_fault+0x1140/0x1260^M
> sp=e00001a10397fa30 bsp=e00001a103970de8^M
> [<a00000010000ae20>] ia64_leave_kernel+0x0/0x270^M
> sp=e00001a10397fc40 bsp=e00001a103970de8^M
> [<a000000100067c00>] pick_next_task_fair+0x80/0xe0^M
> sp=e00001a10397fe10 bsp=e00001a103970db8^M
> [<a0000001006f6a60>] schedule+0x8e0/0x1280^M
> sp=e00001a10397fe10 bsp=e00001a103970d08^M
> [<a000000100074e20>] sys_sched_yield+0xe0/0x100^M
> sp=e00001a10397fe30 bsp=e00001a103970ca8^M
> [<a00000010000aca0>] ia64_ret_from_syscall+0x0/0x20^M
> sp=e00001a10397fe30 bsp=e00001a103970ca8^M
> [<a000000000010720>] __kernel_syscall_via_break+0x0/0x20^M
> sp=e00001a103980000 bsp=e00001a103970ca8^M
>
> Disassemble.
> ==
> a000000100067b80 <pick_next_task_fair>:
> a000000100067b80: 18 10 19 08 80 05 [MMB] alloc r34=ar.pfs,6,4,0
> a000000100067b86: 20 80 83 00 42 00 adds r2=112,r32
> a000000100067b8c: 00 00 00 20 nop.b 0x0
> a000000100067b90: 09 20 81 41 00 21 [MMI] adds r36=96,r32
> a000000100067b96: 00 00 00 02 00 20 nop.m 0x0
> a000000100067b9c: 04 00 c4 00 mov r33=b0;;
> a000000100067ba0: 0b 70 00 04 18 10 [MMI] ld8 r14=[r2];;
> a000000100067ba6: 70 00 38 0c 72 00 cmp.eq p7,p6=0,r14
> a000000100067bac: 00 00 04 00 nop.i 0x0;;
> a000000100067bb0: 10 00 00 00 01 c0 [MIB] nop.m 0x0
> a000000100067bb6: 81 00 00 00 c2 03 (p07) mov r8=r0
> a000000100067bbc: 80 00 00 41 (p07) br.cond.spnt.few a000000100067c30 <pick_next_task_fair+0xb
> 0>
> a000000100067bc0: 09 48 c0 48 00 21 [MMI] adds r9=48,r36
> a000000100067bc6: 00 00 00 02 00 00 nop.m 0x0
> a000000100067bcc: 04 00 00 84 mov r32=r0;;
> a000000100067bd0: 09 00 00 00 01 00 [MMI] nop.m 0x0
> a000000100067bd6: 80 00 24 30 20 00 ld8 r8=[r9]
> a000000100067bdc: 00 00 04 00 nop.i 0x0;;
> a000000100067be0: 03 00 00 00 01 00 [MII] nop.m 0x0
> a000000100067be6: b0 00 20 14 72 05 cmp.eq p11,p10=0,r8;;
> a000000100067bec: 04 47 fc 8c (p10) adds r32=-16,r8;;
> a000000100067bf0: 51 29 01 40 00 21 [MIB] (p10) mov r37=r32
> a000000100067bf6: 00 00 00 02 00 05 nop.i 0x0
> a000000100067bfc: 58 fe ff 5a (p10) br.call.dptk.many b0=a000000100067a40 <set_next_entity>;;
> a000000100067c00: 0b 18 80 41 00 21 [MMI] adds r3=96,r32;;
> a000000100067c06: 40 02 0c 30 20 00 ld8 r36=[r3] <----------panic.
> a000000100067c0c: 00 00 04 00 nop.i 0x0;;
> a000000100067c10: 10 00 00 00 01 00 [MIB] nop.m 0x0
> a000000100067c16: 90 00 90 10 72 04 cmp.eq p9,p8=0,r36
> a000000100067c1c: b0 ff ff 4a (p08) br.cond.dptk.few a000000100067bc0 <pick_next_task_fair+0x4
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Hang with fair cgroup scheduler (reproducer is attached.)
[not found] ` <20071214094909.GG11266-X9Un+BFzKDI@public.gmane.org>
@ 2007-12-14 10:58 ` KAMEZAWA Hiroyuki
[not found] ` <20071214195837.0d3511db.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
[not found] ` <b647ffbd0712140447kfba5945ybde40f18653dd164-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 2 replies; 25+ messages in thread
From: KAMEZAWA Hiroyuki @ 2007-12-14 10:58 UTC (permalink / raw)
To: Ingo Molnar
Cc: Dhaval Giani, vatsa-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
Dmitry Adamushko,
containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org, Andrew Morton,
Peter Zijlstra
Here is much easier test.
(I'm sorry I'll be absent tomorrow.)
the number of cpus is 8. ia64/NUMA.
The hang occurs when the number of tasks is not smaller than available cpus.
Can be a hint ?
==
[root@rhel51GA testpro]# cat yield.c
#include <sched.h>
int main()
{
while (1)
sched_yield();
}
[root@rhel51GA testpro]# cat batch-test.sh
#!/bin/bash -x
mount -t cgroup none /opt/cgroup -o cpu
mkdir /opt/cgroup/group_1
mkdir /opt/cgroup/group_2
./yield &
PIDA=$!
./yield &
PIDB=$!
while true; do
echo $PIDA > /opt/cgroup/group_1/tasks
echo $PIDB > /opt/cgroup/group_1/tasks
echo $PIDA > /opt/cgroup/group_2/tasks;
echo $PIDB > /opt/cgroup/group_2/tasks
done
[root@rhel51GA testpro]#./batech-test.sh
no hang.
[root@rhel51GA testpro]#taskset 0f ./batech-test.sh
no hang
[root@rhel51GA testpro]#taskset 03 ./batech-test.sh
hang.
Pid: 8132, CPU 0, comm: yield
psr : 00001210085a2010 ifs : 8000000000000206 ip : [<a000000100067c01>] Not tainted
ip is at pick_next_task_fair+0x81/0xe0
unat: 0000000000000000 pfs : 0000000000000b1d rsc : 0000000000000003
rnat: 0000000000000000 bsps: 0000000000000000 pr : 0000000000566959
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c0270033f
csd : 0000000000000000 ssd : 0000000000000000
b0 : a0000001006f6ac0 b6 : a000000100076a60 b7 : a000000100067b80
f6 : 000000000000000000000 f7 : 000000000000000000000
f8 : 1003e00000000a0000007 f9 : 1003e0000004633b23e65
f10 : 1003ee04f68ea89dfb4c3 f11 : 1003e000000000000002b
r1 : a000000100d87a60 r2 : e0000000011082f0 r3 : 0000000000000060
r8 : 0000000000000000 r9 : e000000001108310 r10 : e000004080032018
r11 : 00000000f86ccc70 r12 : e00000408394fe10 r13 : e000004083940000
r14 : 0000000000000001 r15 : 0000000000000064 r16 : e0000000011089f0
r17 : ffffffffffffffff r18 : e000000001108360 r19 : 0000000000000000
r20 : e00000408003ef10 r21 : 0000000001e9555b r22 : 000000af762794d4
r23 : 00000015e1abc70b r24 : ffffffffffffe463 r25 : e00000408003ef10
r26 : 0000000000000002 r27 : e0000000011082f0 r28 : e000000001108288
r29 : a0000001008a5468 r30 : a000000100076a60 r31 : a000000100b726e0
Call Trace:
[<a000000100013bc0>] show_stack+0x40/0xa0
sp=e00000408394f860 bsp=e000004083940f18
[<a000000100014840>] show_regs+0x840/0x880
sp=e00000408394fa30 bsp=e000004083940ec0
[<a000000100036fa0>] die+0x1a0/0x2a0
sp=e00000408394fa30 bsp=e000004083940e78
[<a0000001000370f0>] die_if_kernel+0x50/0x80
sp=e00000408394fa30 bsp=e000004083940e48
[<a000000100038260>] ia64_fault+0x1140/0x1260
sp=e00000408394fa30 bsp=e000004083940de8
[<a00000010000ae20>] ia64_leave_kernel+0x0/0x270
sp=e00000408394fc40 bsp=e000004083940de8
[<a000000100067c00>] pick_next_task_fair+0x80/0xe0
sp=e00000408394fe10 bsp=e000004083940db8
[<a0000001006f6ac0>] schedule+0x940/0x1280
sp=e00000408394fe10 bsp=e000004083940d08
[<a000000100074e20>] sys_sched_yield+0xe0/0x100
sp=e00000408394fe30 bsp=e000004083940ca8
[<a00000010000aca0>] ia64_ret_from_syscall+0x0/0x20
sp=e00000408394fe30 bsp=e000004083940ca8
[<a000000000010720>] __kernel_syscall_via_break+0x0/0x20
sp=e000004083950000 bsp=e000004083940ca8
Thanks,
-Kame
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Hang with fair cgroup scheduler (reproducer is attached.)
[not found] ` <20071214195837.0d3511db.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
@ 2007-12-14 11:48 ` Dhaval Giani
2007-12-14 12:47 ` Dmitry Adamushko
1 sibling, 0 replies; 25+ messages in thread
From: Dhaval Giani @ 2007-12-14 11:48 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: Peter Zijlstra, vatsa-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
Dmitry Adamushko,
containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org, Ingo Molnar,
Andrew Morton
On Fri, Dec 14, 2007 at 07:58:37PM +0900, KAMEZAWA Hiroyuki wrote:
> Here is much easier test.
Thanks for the test! Let me see if I can reproduce it here.
--
regards,
Dhaval
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Hang with fair cgroup scheduler (reproducer is attached.)
[not found] ` <20071214195837.0d3511db.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2007-12-14 11:48 ` Dhaval Giani
@ 2007-12-14 12:47 ` Dmitry Adamushko
[not found] ` <20071214141528.GA6161-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
1 sibling, 1 reply; 25+ messages in thread
From: Dmitry Adamushko @ 2007-12-14 12:47 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: Dhaval Giani, vatsa-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org, Ingo Molnar,
Andrew Morton, Peter Zijlstra
On 14/12/2007, KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> wrote:
> Here is much easier test.
> (I'm sorry I'll be absent tomorrow.)
>
> the number of cpus is 8. ia64/NUMA.
>
> The hang occurs when the number of tasks is not smaller than available cpus.
> Can be a hint ?
>
> [ ... ]
>
> [root@rhel51GA testpro]#./batech-test.sh
> no hang.
>
> [root@rhel51GA testpro]#taskset 0f ./batech-test.sh
> no hang
>
> [root@rhel51GA testpro]#taskset 03 ./batech-test.sh
> hang.
have you tried :
[root@rhel51GA testpro]#taskset 01 ./batech-test.sh
hang?
just to be sure SMP does matter here (most likely yes, I guess).
TIA,
>
> Thanks,
> -Kame
>
--
Best regards,
Dmitry Adamushko
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Re: Hang with fair cgroup scheduler (reproducer is attached.)
[not found] ` <b647ffbd0712140447kfba5945ybde40f18653dd164-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2007-12-14 12:50 ` kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A
2007-12-14 14:15 ` Dhaval Giani
1 sibling, 0 replies; 25+ messages in thread
From: kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A @ 2007-12-14 12:50 UTC (permalink / raw)
To: Dmitry Adamushko
Cc: Peter Zijlstra, vatsa-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
containers-qjLDD68F18O7TbgM5vRIOg, Ingo Molnar, Andrew Morton,
Dhaval Giani
-
>have you tried :
>
>[root@rhel51GA testpro]#taskset 01 ./batech-test.sh
>
yes
>hang?
>
no.
>just to be sure SMP does matter here (most likely yes, I guess).
>
maybe. As far as I tested, there was no hang if the number of cpus is 1.
Regards,
-Kame
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Hang with fair cgroup scheduler (reproducer is attached.)
[not found] ` <b647ffbd0712140447kfba5945ybde40f18653dd164-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2007-12-14 12:50 ` kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A
@ 2007-12-14 14:15 ` Dhaval Giani
1 sibling, 0 replies; 25+ messages in thread
From: Dhaval Giani @ 2007-12-14 14:15 UTC (permalink / raw)
To: Dmitry Adamushko
Cc: Peter Zijlstra, vatsa-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org, Ingo Molnar,
Andrew Morton
On Fri, Dec 14, 2007 at 01:47:13PM +0100, Dmitry Adamushko wrote:
> On 14/12/2007, KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> wrote:
> > Here is much easier test.
> > (I'm sorry I'll be absent tomorrow.)
> >
> > the number of cpus is 8. ia64/NUMA.
> >
> > The hang occurs when the number of tasks is not smaller than available cpus.
> > Can be a hint ?
> >
> > [ ... ]
> >
> > [root@rhel51GA testpro]#./batech-test.sh
> > no hang.
> >
> > [root@rhel51GA testpro]#taskset 0f ./batech-test.sh
> > no hang
> >
> > [root@rhel51GA testpro]#taskset 03 ./batech-test.sh
> > hang.
>
> have you tried :
>
> [root@rhel51GA testpro]#taskset 01 ./batech-test.sh
>
> hang?
>
> just to be sure SMP does matter here (most likely yes, I guess).
>
NUMA? I am not able to reproduce it here locally on an x86 8 CPU box.
--
regards,
Dhaval
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Re: Hang with fair cgroup scheduler (reproducer is attached.)
[not found] ` <20071214141528.GA6161-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
@ 2007-12-14 14:24 ` kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A
[not found] ` <20442799.1197642268756.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
0 siblings, 1 reply; 25+ messages in thread
From: kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A @ 2007-12-14 14:24 UTC (permalink / raw)
To: Dhaval Giani
Cc: Peter Zijlstra, vatsa-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
Dmitry Adamushko, containers-qjLDD68F18O7TbgM5vRIOg, Ingo Molnar,
Andrew Morton
>> just to be sure SMP does matter here (most likely yes, I guess).
>>
>
>NUMA? I am not able to reproduce it here locally on an x86 8 CPU box.
>
yes. I used NUMA. 2 Nodes/4CPU x 2
Hmm..
Thanks,
-Kame
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Re: Hang with fair cgroup scheduler (reproducer is attached.)
[not found] ` <20442799.1197642268756.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
@ 2007-12-14 15:36 ` Dhaval Giani
[not found] ` <20071214153607.GB23670-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
0 siblings, 1 reply; 25+ messages in thread
From: Dhaval Giani @ 2007-12-14 15:36 UTC (permalink / raw)
To: kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A
Cc: Peter Zijlstra, vatsa-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
Dmitry Adamushko, containers-qjLDD68F18O7TbgM5vRIOg, Ingo Molnar,
Andrew Morton
On Fri, Dec 14, 2007 at 11:24:28PM +0900, kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org wrote:
> >> just to be sure SMP does matter here (most likely yes, I guess).
> >>
> >
> >NUMA? I am not able to reproduce it here locally on an x86 8 CPU box.
> >
> yes. I used NUMA. 2 Nodes/4CPU x 2
>
OK, I got hold of an IA64 box, non numa and have managed to reproduce
it.
> Hmm..
>
> Thanks,
> -Kame
--
regards,
Dhaval
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Re: Hang with fair cgroup scheduler (reproducer is attached.)
[not found] ` <20071214153607.GB23670-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
@ 2007-12-14 15:38 ` Dhaval Giani
[not found] ` <20071214153823.GC23670-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
0 siblings, 1 reply; 25+ messages in thread
From: Dhaval Giani @ 2007-12-14 15:38 UTC (permalink / raw)
To: kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A
Cc: Peter Zijlstra, vatsa-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
Dmitry Adamushko, containers-qjLDD68F18O7TbgM5vRIOg, Ingo Molnar,
Andrew Morton
On Fri, Dec 14, 2007 at 09:06:07PM +0530, Dhaval Giani wrote:
> On Fri, Dec 14, 2007 at 11:24:28PM +0900, kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org wrote:
> > >> just to be sure SMP does matter here (most likely yes, I guess).
> > >>
> > >
> > >NUMA? I am not able to reproduce it here locally on an x86 8 CPU box.
> > >
> > yes. I used NUMA. 2 Nodes/4CPU x 2
> >
>
> OK, I got hold of an IA64 box, non numa and have managed to reproduce
> it.
>
Actually no, its another bug. Thanks for the program!
reg[3330]: NaT consumption 2216203124768 [1]
Modules linked in: ipv6 button binfmt_misc nls_iso8859_1 loop dm_mod tg3
ext3 jbd fan thermal processor sg mptspi mptscsih mptbase
scsi_transport_spi via82cxxx sd_mod scsi_mod ide_disk ide_core
Pid: 3330, CPU 3, comm: reg
psr : 00001210085a2010 ifs : 8000000000000308 ip : [<a0000001002e0481>]
Not tainted
ip is at rb_erase+0x301/0x7e0
unat: 0000000000000000 pfs : 0000000000000308 rsc : 0000000000000003
rnat: 0000000000000000 bsps: 0000000000000000 pr : a5565666a9556959
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c0270033f
csd : 0000000000000000 ssd : 0000000000000000
b0 : a000000100076290 b6 : a000000100086b20 b7 : a000000100076360
f6 : 1003e0000000000000d34 f7 : 1003e000000000000000a
f8 : 1003e0000000000000000 f9 : 1003e0000000000000152
f10 : 1003e0000000000000004 f11 : 0fff2fffffffff0000000
r1 : a000000100c92030 r2 : e000000244bd0068 r3 : e000000245882000
r8 : e000000245882000 r9 : e000000241e6eda0 r10 : 0000000000000001
r11 : e0000002458f0070 r12 : e0000002458a7d80 r13 : e0000002458a0000
r14 : e000000244bd0060 r15 : e000000244bd0058 r16 : 0000000000000000
r17 : e000000245920d34 r18 : 0000000000000000 r19 : 0000000000000000
r20 : e000000245920c90 r21 : 0000000000000001 r22 : a000000100076360
r23 : a000000100a7f2f8 r24 : a000000100a7f2b0 r25 : e0000002458c0058
r26 : e000000004e05b10 r27 : 0000000000000001 r28 : 0000000000000000
r29 : a000000100a7f2e0 r30 : a000000100a7f2b0 r31 : e000000245920098
Call Trace:
[<a000000100014a80>] show_stack+0x40/0xa0
sp=e0000002458a77d0 bsp=e0000002458a1310
[<a000000100015380>] show_regs+0x840/0x880
sp=e0000002458a79a0 bsp=e0000002458a12b8
[<a0000001000384a0>] die+0x1a0/0x2a0
sp=e0000002458a79a0 bsp=e0000002458a1270
[<a0000001000385f0>] die_if_kernel+0x50/0x80
sp=e0000002458a79a0 bsp=e0000002458a1240
[<a0000001005b1a80>] ia64_fault+0x1180/0x12a0
sp=e0000002458a79a0 bsp=e0000002458a11e0
[<a00000010000b2a0>] ia64_leave_kernel+0x0/0x270
sp=e0000002458a7bb0 bsp=e0000002458a11e0
[<a0000001002e0480>] rb_erase+0x300/0x7e0
sp=e0000002458a7d80 bsp=e0000002458a11a0
[<a000000100076290>] __dequeue_entity+0x70/0xa0
sp=e0000002458a7d80 bsp=e0000002458a1170
[<a000000100076300>] set_next_entity+0x40/0xa0
sp=e0000002458a7d80 bsp=e0000002458a1148
[<a0000001000763a0>] set_curr_task_fair+0x40/0xa0
sp=e0000002458a7d80 bsp=e0000002458a1128
[<a000000100078d90>] sched_move_task+0x2d0/0x340
sp=e0000002458a7d80 bsp=e0000002458a10e8
[<a000000100078e20>] cpu_cgroup_attach+0x20/0x40
sp=e0000002458a7d90 bsp=e0000002458a10b0
[<a0000001000e9370>] attach_task+0x9b0/0xac0
sp=e0000002458a7d90 bsp=e0000002458a1058
[<a0000001000ed4e0>] cgroup_common_file_write+0x340/0x520
sp=e0000002458a7dc0 bsp=e0000002458a1010
[<a0000001000eccd0>] cgroup_file_write+0xf0/0x300
sp=e0000002458a7dd0 bsp=e0000002458a0fc0
[<a00000010017bbd0>] vfs_write+0x1d0/0x320
sp=e0000002458a7e20 bsp=e0000002458a0f70
[<a00000010017c7f0>] sys_write+0x70/0xe0
sp=e0000002458a7e20 bsp=e0000002458a0ef8
[<a00000010000b100>] ia64_ret_from_syscall+0x0/0x20
sp=e0000002458a7e30 bsp=e0000002458a0ef8
[<a000000000010720>] __kernel_syscall_via_break+0x0/0x20
sp=e0000002458a8000 bsp=e0000002458a0ef8
--
regards,
Dhaval
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Re: Hang with fair cgroup scheduler (reproducer is attached.)
[not found] ` <20071214153823.GC23670-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
@ 2007-12-14 16:25 ` Dmitry Adamushko
[not found] ` <b647ffbd0712140825h4f541be0xa7a7866e70b3af7a-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 25+ messages in thread
From: Dmitry Adamushko @ 2007-12-14 16:25 UTC (permalink / raw)
To: Dhaval Giani
Cc: Peter Zijlstra, vatsa-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
containers-qjLDD68F18O7TbgM5vRIOg, Ingo Molnar, Andrew Morton
On 14/12/2007, Dhaval Giani <dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> > >
> Actually no, its another bug. Thanks for the program!
>
Humm... this crash is very likely to be caused by the same bug. It
just reveals itself in a different place, but effectivelly the pattern
looks similar. Anyway, the rb-tree gets corrupted... and for both
cases, at the very least the 'current' must be within the tree.
I think, if you repeat your test a number of times, you'll likely get
the very same crash as was reported by Kame.
ia64 does define __ARCH_WANT_UNLOCKED_CTXSW (I checked against
2.6.23.1 that I have at hand)
x86 -- not (it's not reproducible there, right?)
so for ia64 task_running() makes use of 'p->oncpu' to determine
whether a given task is currently running
(as opposed to 'rq->curr == p' otherwise)...
But at first glance, it looks like there shouldn't be situations
leading to some sort of de-synchronization in determining the real
'current'.
Will look at it closer.
>
> --
> regards,
> Dhaval
>
--
Best regards,
Dmitry Adamushko
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Re: Hang with fair cgroup scheduler (reproducer is attached.)
[not found] ` <b647ffbd0712140825h4f541be0xa7a7866e70b3af7a-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2007-12-14 19:51 ` Dmitry Adamushko
[not found] ` <b647ffbd0712141151k697d9bbemda9a7e90515e4400-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 25+ messages in thread
From: Dmitry Adamushko @ 2007-12-14 19:51 UTC (permalink / raw)
To: Ingo Molnar, Srivatsa Vaddagiri
Cc: Peter Zijlstra, Steven Rostedt, containers-qjLDD68F18O7TbgM5vRIOg,
Andrew Morton, Dhaval Giani
> [ ... ]
>
> [<a0000001002e0480>] rb_erase+0x300/0x7e0
> [<a000000100076290>] __dequeue_entity+0x70/0xa0
> [<a000000100076300>] set_next_entity+0x40/0xa0
> [<a0000001000763a0>] set_curr_task_fair+0x40/0xa0
> [<a000000100078d90>] sched_move_task+0x2d0/0x340
> [<a000000100078e20>] cpu_cgroup_attach+0x20/0x40
>
> [ ... ]
argh... it's a consequence of the 'current is not kept within the tree" indeed.
When sched_move_task() is called for the 'current' (running on another CPU),
we get the following:
...
running = task_running(rq, tsk);
on_rq = tsk->se.on_rq;
if (on_rq) {
dequeue_task(rq, tsk, 0);
if (unlikely(running))
tsk->sched_class->put_prev_task(rq, tsk);
}
[1] tsk->sched_class->put_prev_task() actually _inserts_ 'tsk' back
into the cfs_rq of its _old_ group :
set_task_cfs_rq(tsk, task_cpu(tsk));
[2] now task.se->cfs_rq gets changed
if (on_rq) {
if (unlikely(running))
tsk->sched_class->set_curr_task(rq);
[3] and now, tsk->sched_class->set_curr_task(rq) _removes_ the
'current' from the tree... but this tree belongs to the _new_ group
(the task is still within the 'old_group->cfs_rq->rb_tree') ---> oops!
enqueue_task(rq, tsk, 0);
}
Anyway, I have to admit that this problem is a consequence of the
special-case treatment for the 'current' by
'dequeue/enqueue_task()'... it makes the interface less transparent
indeed.
/me thinking on how to get it fixed (e.g. set_task_cfs_rq() might take
care of it) or just get this special-case issue removed (have to check
whether we lose anything in this case)... sigh.
--
Best regards,
Dmitry Adamushko
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Re: Hang with fair cgroup scheduler (reproducer is attached.)
[not found] ` <b647ffbd0712141151k697d9bbemda9a7e90515e4400-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2007-12-14 21:33 ` Steven Rostedt
[not found] ` <Pine.LNX.4.58.0712141614340.22005-f9ZlEuEWxVcI6MkJdU+c8EEOCMrvLtNR@public.gmane.org>
0 siblings, 1 reply; 25+ messages in thread
From: Steven Rostedt @ 2007-12-14 21:33 UTC (permalink / raw)
To: Dmitry Adamushko
Cc: Peter Zijlstra, Srivatsa Vaddagiri,
containers-qjLDD68F18O7TbgM5vRIOg, Ingo Molnar, Andrew Morton,
Dhaval Giani
On Fri, 14 Dec 2007, Dmitry Adamushko wrote:
>
> argh... it's a consequence of the 'current is not kept within the tree" indeed.
>
Thanks Dmitry for tracking this down. Although I'm still not convinced we
hit the same bug. But I'm going to go ahead and release 2.6.24-rc5-rt1
anyway. When you have a fix, please CC me and I'll add it to -rt2.
Note: I've added a bunch of logdev (see
http://rostedt.homelinux.com/logdev/README) and I kicked off the hackbench
again. I'll let it run overnight, and if it hits the bug, it will give me
a lot more output to let me know what actually happened.
Thanks,
-- Steve
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Re: Hang with fair cgroup scheduler (reproducer is attached.)
[not found] ` <Pine.LNX.4.58.0712141614340.22005-f9ZlEuEWxVcI6MkJdU+c8EEOCMrvLtNR@public.gmane.org>
@ 2007-12-15 10:22 ` Dmitry Adamushko
[not found] ` <b647ffbd0712150222p30cac9f9i772c2a2c4e05a4a-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 25+ messages in thread
From: Dmitry Adamushko @ 2007-12-15 10:22 UTC (permalink / raw)
To: Steven Rostedt
Cc: Peter Zijlstra, Srivatsa Vaddagiri,
containers-qjLDD68F18O7TbgM5vRIOg, Ingo Molnar, Andrew Morton,
Dhaval Giani
On 14/12/2007, Steven Rostedt <rostedt-nx8X9YLhiw1AfugRpC6u6w@public.gmane.org> wrote:
>
> On Fri, 14 Dec 2007, Dmitry Adamushko wrote:
>
> >
> > argh... it's a consequence of the 'current is not kept within the tree" indeed.
> >
>
> Thanks Dmitry for tracking this down.
My analysis was flawed (hmm... me was under control of Belgium beer :-)
The task in not on the runqueue (p->on_rq == 0) at the moment when
put_prev_task_fair() and set_curr_task_fair() get its turn in
sched_move_task()... so dequeue/enqueue_entity() are not triggered,
that's good.
so back to the square #0.
> Thanks,
>
> -- Steve
--
Best regards,
Dmitry Adamushko
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Re: Hang with fair cgroup scheduler (reproducer is attached.)
[not found] ` <b647ffbd0712150222p30cac9f9i772c2a2c4e05a4a-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2007-12-15 10:50 ` Dhaval Giani
[not found] ` <20071215105036.GB26325-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2007-12-15 23:44 ` Dmitry Adamushko
1 sibling, 1 reply; 25+ messages in thread
From: Dhaval Giani @ 2007-12-15 10:50 UTC (permalink / raw)
To: Dmitry Adamushko
Cc: Peter Zijlstra, Srivatsa Vaddagiri, Steven Rostedt,
containers-qjLDD68F18O7TbgM5vRIOg, Ingo Molnar, Andrew Morton
On Sat, Dec 15, 2007 at 11:22:08AM +0100, Dmitry Adamushko wrote:
> On 14/12/2007, Steven Rostedt <rostedt-nx8X9YLhiw1AfugRpC6u6w@public.gmane.org> wrote:
> >
> > On Fri, 14 Dec 2007, Dmitry Adamushko wrote:
> >
> > >
> > > argh... it's a consequence of the 'current is not kept within the tree" indeed.
> > >
> >
> > Thanks Dmitry for tracking this down.
>
> My analysis was flawed (hmm... me was under control of Belgium beer :-)
>
> The task in not on the runqueue (p->on_rq == 0) at the moment when
> put_prev_task_fair() and set_curr_task_fair() get its turn in
> sched_move_task()... so dequeue/enqueue_entity() are not triggered,
> that's good.
>
Again, I am probably missing something, but if on_rq == 0, then how is
set_curr_task_fair() getting called?
--
regards,
Dhaval
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Re: Hang with fair cgroup scheduler (reproducer is attached.)
[not found] ` <20071215105036.GB26325-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
@ 2007-12-15 11:15 ` Dmitry Adamushko
0 siblings, 0 replies; 25+ messages in thread
From: Dmitry Adamushko @ 2007-12-15 11:15 UTC (permalink / raw)
To: Dhaval Giani
Cc: Peter Zijlstra, Srivatsa Vaddagiri, Steven Rostedt,
containers-qjLDD68F18O7TbgM5vRIOg, Ingo Molnar, Andrew Morton
On 15/12/2007, Dhaval Giani <dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> wrote:
> On Sat, Dec 15, 2007 at 11:22:08AM +0100, Dmitry Adamushko wrote:
> > On 14/12/2007, Steven Rostedt <rostedt-nx8X9YLhiw1AfugRpC6u6w@public.gmane.org> wrote:
> > >
> > > On Fri, 14 Dec 2007, Dmitry Adamushko wrote:
> > >
> > > >
> > > > argh... it's a consequence of the 'current is not kept within the tree" indeed.
> > > >
> > >
> > > Thanks Dmitry for tracking this down.
> >
> > My analysis was flawed (hmm... me was under control of Belgium beer :-)
> >
> > The task in not on the runqueue (p->on_rq == 0) at the moment when
> > put_prev_task_fair() and set_curr_task_fair() get its turn in
> > sched_move_task()... so dequeue/enqueue_entity() are not triggered,
> > that's good.
> >
>
> Again, I am probably missing something, but if on_rq == 0, then how is
> set_curr_task_fair() getting called?
>
...
running = task_running(rq, tsk);
on_rq = tsk->se.on_rq;
// let's say on_rq == 1 , i.e. the task is on the runqueue
if (on_rq) {
dequeue_task(rq, tsk, 0);
// now tsk->se.on_rq becomes 0
if (unlikely(running))
tsk->sched_class->put_prev_task(rq, tsk);
// put_prev_task() --> put_prev_entity() checks for 'tsk->se.on_rq' to
determine whether __enqueue_entity() must be done ---> and it's 0 in
our case.
[ it can be non-zero for the following path : schedule() -->
put_prev_task(..., prev) when deactivate_task(..., prev) was not
previously called in schedule(), i.e. 'prev' was preempted ]
tsk->se.on_rq will become 1 only after enqueue_task(). As a result,
tsk->se.on_rq is still 0 when set_curr_task() is executed.
does it make sense now?
> --
> regards,
> Dhaval
>
--
Best regards,
Dmitry Adamushko
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Re: Hang with fair cgroup scheduler (reproducer is attached.)
[not found] ` <b647ffbd0712150222p30cac9f9i772c2a2c4e05a4a-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2007-12-15 10:50 ` Dhaval Giani
@ 2007-12-15 23:44 ` Dmitry Adamushko
[not found] ` <b647ffbd0712151544n2dfad101r2d306d393e8550ff-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
1 sibling, 1 reply; 25+ messages in thread
From: Dmitry Adamushko @ 2007-12-15 23:44 UTC (permalink / raw)
To: Ingo Molnar
Cc: Dhaval Giani, Srivatsa Vaddagiri, Steven Rostedt,
containers-qjLDD68F18O7TbgM5vRIOg, Andrew Morton, Peter Zijlstra
On 15/12/2007, Dmitry Adamushko <dmitry.adamushko-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>
> My analysis was flawed (hmm... me was under control of Belgium beer :-)
>
ok, I've got another one (just in case... well, this late hour to be
blamed now :-/)
according to Dhaval, we have a crash on ia64 (it's also the arch for
the original report) and it's not reproducible on an otherwise similar
(wrt. # of cpus) x86.
(1) The difference that comes first in mind is that ia64 makes use of
__ARCH_WANT_UNLOCKED_CTXSW
dimm@earth:~/storage/kernel/linux-2.6$ grep -rn
__ARCH_WANT_UNLOCKED_CTXSW include/
include/linux/sched.h:947:#ifdef __ARCH_WANT_UNLOCKED_CTXSW
include/asm-mips/system.h:216:#define __ARCH_WANT_UNLOCKED_CTXSW
include/asm-ia64/system.h:259:#define __ARCH_WANT_UNLOCKED_CTXSW
(2) now, in this case (and for SMP)
task_running() effectively becomes { return p->oncpu; }
(3) consider a case of the context switch between prev --> next on CPU #0
'next' has preempted 'prev'
(4) context_swicth() :
next->oncpu becomes '1' as the result of:
[1] context_switch() --> prepare_task_switch() --> prepare_lock_switch(next) -->
next->oncpu = 1
prev->oncpu becomes '0' as the result of:
[2] context_switch() --> finish_task_switch() -->
finish_lock_switch(prev) --> prev->oncpu = 0
[1] takes place at the very _beginning_ of context_switch() _and_ one
more thing is that rq->lock gets unlocked.
[2] takes place at the very _end_ of context_switch()
Now recall what's task_running() in our case ( it's "return task->oncpu" )
As a result, between [1] and [2] we have 2 tasks on a single CPU for
which task_running() will return '1' and their runqueue is _unlocked_.
(5) now consider sched_move_task() running on another CPU #1.
due to 'UNLOCKED_CTXSW' it can successfully lock the rq of CPU #0
let's say it's called for 'prev' task (the one being scheduled out on
CPU #0 at this very moment)
as we remember, task_running() returns '1' for it (CPU #0 haven't
reached yet point [2] as described in (4) above)
'prev' is currently on the runqueue (prev->se.on_rq == 1) and within the tree.
what happens is as follows:
- dequeue_task() removes it from the tree ;
- put_prev_task() makes cfs_rq->curr = NULL ;
se == prev.se here... so e.g. __enqueue_entity() is not called for 'prev'
- set_curr_task() --> set_curr_task_fair()
and here things become interesting.
static void set_curr_task_fair(struct rq *rq)
{
struct sched_entity *se = &rq->curr->se;
for_each_sched_entity(se)
set_next_entity(cfs_rq_of(se), se);
}
so 'se' actually belongs to the 'next' on CPU #0
next->on_rq == 1 (obviously, as dequeue_task() in sched_move_task()
was done for 'prev' !)
and now, set_next_entity() does __dequeue_entity() for 'next' which is
_not_ within the tree !!!
(it's the real 'current' on CPU #0)
that's why the reported oops:
> [<a0000001002e0480>] rb_erase+0x300/0x7e0
> [<a000000100076290>] __dequeue_entity+0x70/0xa0
> [<a000000100076300>] set_next_entity+0x40/0xa0
> [<a0000001000763a0>] set_curr_task_fair+0x40/0xa0
> [<a000000100078d90>] sched_move_task+0x2d0/0x340
> [<a000000100078e20>] cpu_cgroup_attach+0x20/0x40
or maybe there is also a possibility of the rb-tree being corrupted as
a result and having a crash somewhere later (the original report had
another backtrace)
hum... does this analysis make sense to somebody else now?
--
Best regards,
Dmitry Adamushko
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Re: Hang with fair cgroup scheduler (reproducer is attached.)
[not found] ` <b647ffbd0712151544n2dfad101r2d306d393e8550ff-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2007-12-16 0:00 ` Dmitry Adamushko
[not found] ` <b647ffbd0712151600s14e3f355we5ee6348b4d484cc-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 25+ messages in thread
From: Dmitry Adamushko @ 2007-12-16 0:00 UTC (permalink / raw)
To: Dhaval Giani
Cc: Peter Zijlstra, Srivatsa Vaddagiri, Steven Rostedt,
containers-qjLDD68F18O7TbgM5vRIOg, Andrew Morton, Ingo Molnar
[-- Attachment #1: Type: text/plain, Size: 479 bytes --]
Dhaval,
so following the analysis in the previous mail... here is a test
patch. Could you please give it a try?
TIA,
(enclosed non white-space broken version)
---
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -7360,7 +7360,7 @@ void sched_move_task(struct task_struct *tsk)
update_rq_clock(rq);
- running = task_running(rq, tsk);
+ running = (rq->curr == tsk);
on_rq = tsk->se.on_rq;
if (on_rq) {
---
--
Best regards,
Dmitry Adamushko
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 01-set_task_cfs_rq.patch --]
[-- Type: text/x-patch; name=01-set_task_cfs_rq.patch, Size: 434 bytes --]
diff --git a/include/linux/sched.h b/include/linux/sched.h
diff --git a/kernel/sched.c b/kernel/sched.c
index dc6fb24..12ff60f 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -7360,7 +7360,7 @@ void sched_move_task(struct task_struct *tsk)
update_rq_clock(rq);
- running = task_running(rq, tsk);
+ running = (rq->curr == tsk);
on_rq = tsk->se.on_rq;
if (on_rq) {
diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
[-- Attachment #3: Type: text/plain, Size: 206 bytes --]
_______________________________________________
Containers mailing list
Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
https://lists.linux-foundation.org/mailman/listinfo/containers
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: Re: Hang with fair cgroup scheduler (reproducer is attached.)
[not found] ` <b647ffbd0712151600s14e3f355we5ee6348b4d484cc-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2007-12-16 4:28 ` Dhaval Giani
[not found] ` <20071216042821.GA8494-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2007-12-16 8:55 ` Ingo Molnar
1 sibling, 1 reply; 25+ messages in thread
From: Dhaval Giani @ 2007-12-16 4:28 UTC (permalink / raw)
To: Dmitry Adamushko
Cc: Peter Zijlstra, Srivatsa Vaddagiri, Steven Rostedt,
containers-qjLDD68F18O7TbgM5vRIOg, Andrew Morton, Ingo Molnar
On Sun, Dec 16, 2007 at 01:00:07AM +0100, Dmitry Adamushko wrote:
> Dhaval,
>
> so following the analysis in the previous mail... here is a test
> patch. Could you please give it a try?
>
Yep, it works!
Tested-by: Dhaval Giani <dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
thanks,
--
regards,
Dhaval
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Re: Hang with fair cgroup scheduler (reproducer is attached.)
[not found] ` <b647ffbd0712151600s14e3f355we5ee6348b4d484cc-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2007-12-16 4:28 ` Dhaval Giani
@ 2007-12-16 8:55 ` Ingo Molnar
[not found] ` <20071216085559.GB20790-X9Un+BFzKDI@public.gmane.org>
1 sibling, 1 reply; 25+ messages in thread
From: Ingo Molnar @ 2007-12-16 8:55 UTC (permalink / raw)
To: Dmitry Adamushko
Cc: Peter Zijlstra, Srivatsa Vaddagiri, Steven Rostedt,
containers-qjLDD68F18O7TbgM5vRIOg, Andrew Morton, Dhaval Giani
* Dmitry Adamushko <dmitry.adamushko-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -7360,7 +7360,7 @@ void sched_move_task(struct task_struct *tsk)
>
> update_rq_clock(rq);
>
> - running = task_running(rq, tsk);
> + running = (rq->curr == tsk);
> on_rq = tsk->se.on_rq;
thanks, i've queued this up (pending more testing).
Btw., you should be able to force the ia64 scheduling by adding this to
the very top of include/linux/sched.h:
#define __ARCH_WANT_UNLOCKED_CTXSW
#define __ARCH_WANT_INTERRUPTS_ON_CTXSW
Ingo
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Re: Hang with fair cgroup scheduler (reproducer is attached.)
[not found] ` <20071216085559.GB20790-X9Un+BFzKDI@public.gmane.org>
@ 2007-12-16 10:06 ` Dmitry Adamushko
0 siblings, 0 replies; 25+ messages in thread
From: Dmitry Adamushko @ 2007-12-16 10:06 UTC (permalink / raw)
To: Ingo Molnar
Cc: Peter Zijlstra, Srivatsa Vaddagiri, Steven Rostedt,
containers-qjLDD68F18O7TbgM5vRIOg, Andrew Morton, Dhaval Giani
On 16/12/2007, Ingo Molnar <mingo-X9Un+BFzKDI@public.gmane.org> wrote:
>
> * Dmitry Adamushko <dmitry.adamushko-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>
> > --- a/kernel/sched.c
> > +++ b/kernel/sched.c
> > @@ -7360,7 +7360,7 @@ void sched_move_task(struct task_struct *tsk)
> >
> > update_rq_clock(rq);
> >
> > - running = task_running(rq, tsk);
> > + running = (rq->curr == tsk);
> > on_rq = tsk->se.on_rq;
>
> thanks, i've queued this up (pending more testing).
btw., sched_setscheduler() and rt_mutex_setprio() are also affected
(in general, anything that may call put_prev_task/set_curr_task()
relying task_running()).
Will see, maybe we may come up with smth better than just replacing
task_running() with (rq->curr == tsk) there.
> Btw., you should be able to force the ia64 scheduling by adding this to
> the very top of include/linux/sched.h:
>
> #define __ARCH_WANT_UNLOCKED_CTXSW
> #define __ARCH_WANT_INTERRUPTS_ON_CTXSW
Yeah, with both we even get ARM behavior. Can be a good test indeed.
>
> Ingo
>
--
Best regards,
Dmitry Adamushko
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Hang with fair cgroup scheduler (reproducer is attached.)
[not found] ` <20071216042821.GA8494-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
@ 2007-12-17 1:12 ` KAMEZAWA Hiroyuki
[not found] ` <20071217101245.76562518.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
0 siblings, 1 reply; 25+ messages in thread
From: KAMEZAWA Hiroyuki @ 2007-12-17 1:12 UTC (permalink / raw)
To: Dhaval Giani
Cc: Peter Zijlstra, Srivatsa Vaddagiri, Steven Rostedt,
Dmitry Adamushko, containers-qjLDD68F18O7TbgM5vRIOg,
Andrew Morton, Ingo Molnar
On Sun, 16 Dec 2007 09:58:21 +0530
Dhaval Giani <dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> wrote:
> On Sun, Dec 16, 2007 at 01:00:07AM +0100, Dmitry Adamushko wrote:
> > Dhaval,
> >
> > so following the analysis in the previous mail... here is a test
> > patch. Could you please give it a try?
> >
>
> Yep, it works!
>
> Tested-by: Dhaval Giani <dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
>
Works for me, too !!
Thanks,
-Kame
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Hang with fair cgroup scheduler (reproducer is attached.)
[not found] ` <20071217101245.76562518.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
@ 2007-12-17 14:45 ` Ingo Molnar
0 siblings, 0 replies; 25+ messages in thread
From: Ingo Molnar @ 2007-12-17 14:45 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: Dhaval Giani, Srivatsa Vaddagiri, Steven Rostedt,
Dmitry Adamushko, containers-qjLDD68F18O7TbgM5vRIOg,
Andrew Morton, Peter Zijlstra
* KAMEZAWA Hiroyuki <kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org> wrote:
> > > so following the analysis in the previous mail... here is a test
> > > patch. Could you please give it a try?
> > >
> >
> > Yep, it works!
> >
> > Tested-by: Dhaval Giani <dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
> >
> Works for me, too !!
thanks guys, i'll push Dmitry's fix out with the next scheduler git
push.
Ingo
^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2007-12-17 14:45 UTC | newest]
Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-12-14 7:18 Hang with fair cgroup scheduler (reproducer is attached.) KAMEZAWA Hiroyuki
[not found] ` <20071214161834.034e6efe.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2007-12-14 8:17 ` KAMEZAWA Hiroyuki
[not found] ` <20071214171759.59f7ba57.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2007-12-14 9:49 ` Ingo Molnar
[not found] ` <20071214094909.GG11266-X9Un+BFzKDI@public.gmane.org>
2007-12-14 10:58 ` KAMEZAWA Hiroyuki
[not found] ` <20071214195837.0d3511db.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2007-12-14 11:48 ` Dhaval Giani
2007-12-14 12:47 ` Dmitry Adamushko
[not found] ` <20071214141528.GA6161-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2007-12-14 14:24 ` kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A
[not found] ` <20442799.1197642268756.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2007-12-14 15:36 ` Dhaval Giani
[not found] ` <20071214153607.GB23670-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2007-12-14 15:38 ` Dhaval Giani
[not found] ` <20071214153823.GC23670-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2007-12-14 16:25 ` Dmitry Adamushko
[not found] ` <b647ffbd0712140825h4f541be0xa7a7866e70b3af7a-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2007-12-14 19:51 ` Dmitry Adamushko
[not found] ` <b647ffbd0712141151k697d9bbemda9a7e90515e4400-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2007-12-14 21:33 ` Steven Rostedt
[not found] ` <Pine.LNX.4.58.0712141614340.22005-f9ZlEuEWxVcI6MkJdU+c8EEOCMrvLtNR@public.gmane.org>
2007-12-15 10:22 ` Dmitry Adamushko
[not found] ` <b647ffbd0712150222p30cac9f9i772c2a2c4e05a4a-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2007-12-15 10:50 ` Dhaval Giani
[not found] ` <20071215105036.GB26325-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2007-12-15 11:15 ` Dmitry Adamushko
2007-12-15 23:44 ` Dmitry Adamushko
[not found] ` <b647ffbd0712151544n2dfad101r2d306d393e8550ff-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2007-12-16 0:00 ` Dmitry Adamushko
[not found] ` <b647ffbd0712151600s14e3f355we5ee6348b4d484cc-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2007-12-16 4:28 ` Dhaval Giani
[not found] ` <20071216042821.GA8494-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2007-12-17 1:12 ` KAMEZAWA Hiroyuki
[not found] ` <20071217101245.76562518.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2007-12-17 14:45 ` Ingo Molnar
2007-12-16 8:55 ` Ingo Molnar
[not found] ` <20071216085559.GB20790-X9Un+BFzKDI@public.gmane.org>
2007-12-16 10:06 ` Dmitry Adamushko
[not found] ` <b647ffbd0712140447kfba5945ybde40f18653dd164-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2007-12-14 12:50 ` kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A
2007-12-14 14:15 ` Dhaval Giani
2007-12-14 9:48 ` Ingo Molnar
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.