* [BUG][PPC64] BUG in 2.6.26.5-rt9 causing Hang
@ 2008-09-25 12:32 Chirag Jog
2008-09-29 18:13 ` Gregory Haskins
` (3 more replies)
0 siblings, 4 replies; 24+ messages in thread
From: Chirag Jog @ 2008-09-25 12:32 UTC (permalink / raw)
To: ghaskins; +Cc: linux-rt-users, LKML, Steven Rostedt
Hi Gregory,
We see the following BUG followed by a hang on the latest kernel 2.6.26.5-rt9 on a Power6 blade (PPC64)
It is easily recreated by running the async_handler or sbrk_mutex (realtime tests from ltp) tests.
login: cpu 0x2: Vector: 700 (Program Check) at [c0000000e8e875d0]
pc: c00000000005110c: .pick_next_pushable_task+0x54/0x9c
lr: c000000000059f50: .push_rt_task+0x44/0x2b4
sp: c0000000e8e87850
msr: 8000000000021032
current = 0xc0000000ea5bb2e0
paca = 0xc000000000608700
pid = 2811, comm = async_handler
kernel BUG at kernel/sched_rt.c:1041! <---------------------
enter ? for help
[link register ] c000000000059f50 .push_rt_task+0x44/0x2b4
[c0000000e8e87850] c0000000e8e878f0 (unreliable)
[c0000000e8e87900] c00000000005a1dc .push_rt_tasks+0x1c/0x38
[c0000000e8e87980] c00000000005a21c .post_schedule_rt+0x24/0x44
[c0000000e8e87a10] c000000000057cbc .finish_task_switch+0xd0/0x180
[c0000000e8e87ab0] c0000000003b6e88 .__schedule+0x6e0/0x798
[c0000000e8e87b90] c0000000003b7148 .schedule+0xec/0x11c
[c0000000e8e87c10] c0000000003b7a40 .do_nanosleep+0x6c/0xcc
[c0000000e8e87c90] c000000000080738 .hrtimer_nanosleep+0x7c/0x100
[c0000000e8e87d90] c000000000080830 .sys_nanosleep+0x74/0x94
[c0000000e8e87e30] c0000000000086ac syscall_exit+0x0/0x40
--- Exception: c00 (System Call) at 0000008026449844
SP (400014185f0) is in userspace
This is generated by the BUG_ON lines in the pick_next_pushable function
introduced by the sched-only-push-once-per-queue.patch .
The -rt kernel prior to this patch didnot give such BUGes.
All this was tried with
CONFIG_GROUP_SCHED=N
CONFIG_RT_GROUP_SCHED=N
Setting the options
CONFIG_GROUP_SCHED=y
CONFIG_RT_GROUP_SCHED=Y,
seems to solve the problem.
--
-Thanks,Chirag
^ permalink raw reply [flat|nested] 24+ messages in thread* Re: [BUG][PPC64] BUG in 2.6.26.5-rt9 causing Hang 2008-09-25 12:32 [BUG][PPC64] BUG in 2.6.26.5-rt9 causing Hang Chirag Jog @ 2008-09-29 18:13 ` Gregory Haskins 2008-09-29 21:18 ` Gregory Haskins ` (2 subsequent siblings) 3 siblings, 0 replies; 24+ messages in thread From: Gregory Haskins @ 2008-09-29 18:13 UTC (permalink / raw) To: ghaskins, linux-rt-users, LKML, Steven Rostedt [-- Attachment #1: Type: text/plain, Size: 2135 bytes --] Hi Chirag Chirag Jog wrote: > Hi Gregory, > We see the following BUG followed by a hang on the latest kernel 2.6.26.5-rt9 on a Power6 blade (PPC64) > It is easily recreated by running the async_handler or sbrk_mutex (realtime tests from ltp) tests. > FYI I am looking at this now. I suspect a dequeue_pushable_task() probably found its way inside a conditional for GROUP_SCHED and inadventently gets compiled away if you disable the feature. Investigating now.. -Greg > login: cpu 0x2: Vector: 700 (Program Check) at [c0000000e8e875d0] > pc: c00000000005110c: .pick_next_pushable_task+0x54/0x9c > lr: c000000000059f50: .push_rt_task+0x44/0x2b4 > sp: c0000000e8e87850 > msr: 8000000000021032 > current = 0xc0000000ea5bb2e0 > paca = 0xc000000000608700 > pid = 2811, comm = async_handler > kernel BUG at kernel/sched_rt.c:1041! <--------------------- > enter ? for help > [link register ] c000000000059f50 .push_rt_task+0x44/0x2b4 > [c0000000e8e87850] c0000000e8e878f0 (unreliable) > [c0000000e8e87900] c00000000005a1dc .push_rt_tasks+0x1c/0x38 > [c0000000e8e87980] c00000000005a21c .post_schedule_rt+0x24/0x44 > [c0000000e8e87a10] c000000000057cbc .finish_task_switch+0xd0/0x180 > [c0000000e8e87ab0] c0000000003b6e88 .__schedule+0x6e0/0x798 > [c0000000e8e87b90] c0000000003b7148 .schedule+0xec/0x11c > [c0000000e8e87c10] c0000000003b7a40 .do_nanosleep+0x6c/0xcc > [c0000000e8e87c90] c000000000080738 .hrtimer_nanosleep+0x7c/0x100 > [c0000000e8e87d90] c000000000080830 .sys_nanosleep+0x74/0x94 > [c0000000e8e87e30] c0000000000086ac syscall_exit+0x0/0x40 > --- Exception: c00 (System Call) at 0000008026449844 > SP (400014185f0) is in userspace > > > This is generated by the BUG_ON lines in the pick_next_pushable function > introduced by the sched-only-push-once-per-queue.patch . > > The -rt kernel prior to this patch didnot give such BUGes. > > All this was tried with > CONFIG_GROUP_SCHED=N > CONFIG_RT_GROUP_SCHED=N > > > Setting the options > CONFIG_GROUP_SCHED=y > CONFIG_RT_GROUP_SCHED=Y, > seems to solve the problem. > > > > [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 257 bytes --] ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [BUG][PPC64] BUG in 2.6.26.5-rt9 causing Hang 2008-09-25 12:32 [BUG][PPC64] BUG in 2.6.26.5-rt9 causing Hang Chirag Jog 2008-09-29 18:13 ` Gregory Haskins @ 2008-09-29 21:18 ` Gregory Haskins 2008-09-29 21:34 ` Gregory Haskins 2008-10-02 11:18 ` [BUG][PPC64] BUG in 2.6.26.5-rt9 causing Hang Gilles Carry 2008-10-03 12:42 ` [RT PATCH 0/2] fix for BUG_ON crash in 26.5-rt9 Gregory Haskins 2008-10-06 15:14 ` [RT PATCH v3 0/2] Fix for "[BUG][PPC64] BUG in 2.6.26.5-rt9 causing Hang" Gregory Haskins 3 siblings, 2 replies; 24+ messages in thread From: Gregory Haskins @ 2008-09-29 21:18 UTC (permalink / raw) To: ghaskins, linux-rt-users, LKML, Steven Rostedt [-- Attachment #1: Type: text/plain, Size: 2193 bytes --] Hi Chirag Chirag Jog wrote: > Hi Gregory, > We see the following BUG followed by a hang on the latest kernel 2.6.26.5-rt9 on a Power6 blade (PPC64) > It is easily recreated by running the async_handler or sbrk_mutex (realtime tests from ltp) tests. > Call me an LTP newbie, but where can I get the sbrk_mutex/async_handler tests. I installed the LTP rpm and I dont seem to have tests by that name. I have a 26.5-rt9 kernel all configured and ready to go but no way to reproduce this issue. Please advise. Regards, -Greg > login: cpu 0x2: Vector: 700 (Program Check) at [c0000000e8e875d0] > pc: c00000000005110c: .pick_next_pushable_task+0x54/0x9c > lr: c000000000059f50: .push_rt_task+0x44/0x2b4 > sp: c0000000e8e87850 > msr: 8000000000021032 > current = 0xc0000000ea5bb2e0 > paca = 0xc000000000608700 > pid = 2811, comm = async_handler > kernel BUG at kernel/sched_rt.c:1041! <--------------------- > enter ? for help > [link register ] c000000000059f50 .push_rt_task+0x44/0x2b4 > [c0000000e8e87850] c0000000e8e878f0 (unreliable) > [c0000000e8e87900] c00000000005a1dc .push_rt_tasks+0x1c/0x38 > [c0000000e8e87980] c00000000005a21c .post_schedule_rt+0x24/0x44 > [c0000000e8e87a10] c000000000057cbc .finish_task_switch+0xd0/0x180 > [c0000000e8e87ab0] c0000000003b6e88 .__schedule+0x6e0/0x798 > [c0000000e8e87b90] c0000000003b7148 .schedule+0xec/0x11c > [c0000000e8e87c10] c0000000003b7a40 .do_nanosleep+0x6c/0xcc > [c0000000e8e87c90] c000000000080738 .hrtimer_nanosleep+0x7c/0x100 > [c0000000e8e87d90] c000000000080830 .sys_nanosleep+0x74/0x94 > [c0000000e8e87e30] c0000000000086ac syscall_exit+0x0/0x40 > --- Exception: c00 (System Call) at 0000008026449844 > SP (400014185f0) is in userspace > > > This is generated by the BUG_ON lines in the pick_next_pushable function > introduced by the sched-only-push-once-per-queue.patch . > > The -rt kernel prior to this patch didnot give such BUGes. > > All this was tried with > CONFIG_GROUP_SCHED=N > CONFIG_RT_GROUP_SCHED=N > > > Setting the options > CONFIG_GROUP_SCHED=y > CONFIG_RT_GROUP_SCHED=Y, > seems to solve the problem. > > > > [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 257 bytes --] ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [BUG][PPC64] BUG in 2.6.26.5-rt9 causing Hang 2008-09-29 21:18 ` Gregory Haskins @ 2008-09-29 21:34 ` Gregory Haskins 2008-09-29 22:00 ` Gregory Haskins 2008-10-02 11:18 ` [BUG][PPC64] BUG in 2.6.26.5-rt9 causing Hang Gilles Carry 1 sibling, 1 reply; 24+ messages in thread From: Gregory Haskins @ 2008-09-29 21:34 UTC (permalink / raw) To: linux-rt-users, LKML, Steven Rostedt, chirag [-- Attachment #1: Type: text/plain, Size: 2946 bytes --] Gregory Haskins wrote: > Hi Chirag > > Chirag Jog wrote: > >> Hi Gregory, >> We see the following BUG followed by a hang on the latest kernel 2.6.26.5-rt9 on a Power6 blade (PPC64) >> It is easily recreated by running the async_handler or sbrk_mutex (realtime tests from ltp) tests. >> >> > > Call me an LTP newbie, but where can I get the sbrk_mutex/async_handler > tests. Ok, I figured out this part. I needed a newer version of the .rpm from a different repo. However, both async_handler and sbrk_mutex seem to segfault for me. Hmm --- ghaskins@test:~> uname -a Linux test 2.6.26.5-rt9-rt #1 SMP PREEMPT RT Mon Sep 29 14:26:45 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux test:/home/ghaskins # /usr/lib64/ltp/testcases/realtime/func/async_handler/async_handler ----------------------------------- Asynchronous Event Handling Latency ----------------------------------- jvmsim disabled Running 1000000 iterations Segmentation fault --- Any ideas? -Greg > I installed the LTP rpm and I dont seem to have tests by that > name. I have a 26.5-rt9 kernel all configured and ready to go but no > way to reproduce this issue. Please advise. > > Regards, > -Greg > > > >> login: cpu 0x2: Vector: 700 (Program Check) at [c0000000e8e875d0] >> pc: c00000000005110c: .pick_next_pushable_task+0x54/0x9c >> lr: c000000000059f50: .push_rt_task+0x44/0x2b4 >> sp: c0000000e8e87850 >> msr: 8000000000021032 >> current = 0xc0000000ea5bb2e0 >> paca = 0xc000000000608700 >> pid = 2811, comm = async_handler >> kernel BUG at kernel/sched_rt.c:1041! <--------------------- >> enter ? for help >> [link register ] c000000000059f50 .push_rt_task+0x44/0x2b4 >> [c0000000e8e87850] c0000000e8e878f0 (unreliable) >> [c0000000e8e87900] c00000000005a1dc .push_rt_tasks+0x1c/0x38 >> [c0000000e8e87980] c00000000005a21c .post_schedule_rt+0x24/0x44 >> [c0000000e8e87a10] c000000000057cbc .finish_task_switch+0xd0/0x180 >> [c0000000e8e87ab0] c0000000003b6e88 .__schedule+0x6e0/0x798 >> [c0000000e8e87b90] c0000000003b7148 .schedule+0xec/0x11c >> [c0000000e8e87c10] c0000000003b7a40 .do_nanosleep+0x6c/0xcc >> [c0000000e8e87c90] c000000000080738 .hrtimer_nanosleep+0x7c/0x100 >> [c0000000e8e87d90] c000000000080830 .sys_nanosleep+0x74/0x94 >> [c0000000e8e87e30] c0000000000086ac syscall_exit+0x0/0x40 >> --- Exception: c00 (System Call) at 0000008026449844 >> SP (400014185f0) is in userspace >> >> >> This is generated by the BUG_ON lines in the pick_next_pushable function >> introduced by the sched-only-push-once-per-queue.patch . >> >> The -rt kernel prior to this patch didnot give such BUGes. >> >> All this was tried with >> CONFIG_GROUP_SCHED=N >> CONFIG_RT_GROUP_SCHED=N >> >> >> Setting the options >> CONFIG_GROUP_SCHED=y >> CONFIG_RT_GROUP_SCHED=Y, >> seems to solve the problem. >> >> >> >> >> > > > [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 257 bytes --] ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [BUG][PPC64] BUG in 2.6.26.5-rt9 causing Hang 2008-09-29 21:34 ` Gregory Haskins @ 2008-09-29 22:00 ` Gregory Haskins 2008-09-30 4:43 ` Chirag Jog 0 siblings, 1 reply; 24+ messages in thread From: Gregory Haskins @ 2008-09-29 22:00 UTC (permalink / raw) To: linux-rt-users, LKML, Steven Rostedt, chirag; +Cc: dvhltc [-- Attachment #1: Type: text/plain, Size: 3482 bytes --] Gregory Haskins wrote: > Gregory Haskins wrote: > >> Hi Chirag >> >> Chirag Jog wrote: >> >> >>> Hi Gregory, >>> We see the following BUG followed by a hang on the latest kernel 2.6.26.5-rt9 on a Power6 blade (PPC64) >>> It is easily recreated by running the async_handler or sbrk_mutex (realtime tests from ltp) tests. >>> >>> >>> >> Call me an LTP newbie, but where can I get the sbrk_mutex/async_handler >> tests. >> > Ok, I figured out this part. I needed a newer version of the .rpm from > a different repo. However, both async_handler and sbrk_mutex seem to > segfault for me. Hmm > Thanks to help from Darren I got around this issue. Unfortunately both tests pass so I cannot reproduce this issue, nor do I see the problem via code inspection. Ill keep digging but I am currently at a loss. I may need to send you some diagnostic patches to find this, if that is ok with you Chirag? -Greg > --- > > ghaskins@test:~> uname -a > Linux test 2.6.26.5-rt9-rt #1 SMP PREEMPT RT Mon Sep 29 14:26:45 EDT > 2008 x86_64 x86_64 x86_64 GNU/Linux > > test:/home/ghaskins # > /usr/lib64/ltp/testcases/realtime/func/async_handler/async_handler > > ----------------------------------- > Asynchronous Event Handling Latency > ----------------------------------- > > jvmsim disabled > Running 1000000 iterations > Segmentation fault > > --- > > Any ideas? > > -Greg > > >> I installed the LTP rpm and I dont seem to have tests by that >> name. I have a 26.5-rt9 kernel all configured and ready to go but no >> way to reproduce this issue. Please advise. >> >> Regards, >> -Greg >> >> >> >> >>> login: cpu 0x2: Vector: 700 (Program Check) at [c0000000e8e875d0] >>> pc: c00000000005110c: .pick_next_pushable_task+0x54/0x9c >>> lr: c000000000059f50: .push_rt_task+0x44/0x2b4 >>> sp: c0000000e8e87850 >>> msr: 8000000000021032 >>> current = 0xc0000000ea5bb2e0 >>> paca = 0xc000000000608700 >>> pid = 2811, comm = async_handler >>> kernel BUG at kernel/sched_rt.c:1041! <--------------------- >>> enter ? for help >>> [link register ] c000000000059f50 .push_rt_task+0x44/0x2b4 >>> [c0000000e8e87850] c0000000e8e878f0 (unreliable) >>> [c0000000e8e87900] c00000000005a1dc .push_rt_tasks+0x1c/0x38 >>> [c0000000e8e87980] c00000000005a21c .post_schedule_rt+0x24/0x44 >>> [c0000000e8e87a10] c000000000057cbc .finish_task_switch+0xd0/0x180 >>> [c0000000e8e87ab0] c0000000003b6e88 .__schedule+0x6e0/0x798 >>> [c0000000e8e87b90] c0000000003b7148 .schedule+0xec/0x11c >>> [c0000000e8e87c10] c0000000003b7a40 .do_nanosleep+0x6c/0xcc >>> [c0000000e8e87c90] c000000000080738 .hrtimer_nanosleep+0x7c/0x100 >>> [c0000000e8e87d90] c000000000080830 .sys_nanosleep+0x74/0x94 >>> [c0000000e8e87e30] c0000000000086ac syscall_exit+0x0/0x40 >>> --- Exception: c00 (System Call) at 0000008026449844 >>> SP (400014185f0) is in userspace >>> >>> >>> This is generated by the BUG_ON lines in the pick_next_pushable function >>> introduced by the sched-only-push-once-per-queue.patch . >>> >>> The -rt kernel prior to this patch didnot give such BUGes. >>> >>> All this was tried with >>> CONFIG_GROUP_SCHED=N >>> CONFIG_RT_GROUP_SCHED=N >>> >>> >>> Setting the options >>> CONFIG_GROUP_SCHED=y >>> CONFIG_RT_GROUP_SCHED=Y, >>> seems to solve the problem. >>> >>> >>> >>> >>> >>> >> >> > > > [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 257 bytes --] ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [BUG][PPC64] BUG in 2.6.26.5-rt9 causing Hang 2008-09-29 22:00 ` Gregory Haskins @ 2008-09-30 4:43 ` Chirag Jog 2008-09-30 6:47 ` Gilles Carry 2008-10-01 14:22 ` [PATCH] sched: add a stacktrace on enqueue_pushable error Gregory Haskins 0 siblings, 2 replies; 24+ messages in thread From: Chirag Jog @ 2008-09-30 4:43 UTC (permalink / raw) To: Gregory Haskins Cc: linux-rt-users, LKML, Steven Rostedt, dvhltc, Dinakar Guniguntala Hi Gregory, * Gregory Haskins <ghaskins@novell.com> [2008-09-29 18:00:01]: > Gregory Haskins wrote: > > Gregory Haskins wrote: > > > >> Hi Chirag > >> > >> Chirag Jog wrote: > >> > >> > >>> Hi Gregory, > >>> We see the following BUG followed by a hang on the latest kernel 2.6.26.5-rt9 on a Power6 blade (PPC64) > >>> It is easily recreated by running the async_handler or sbrk_mutex (realtime tests from ltp) tests. > >>> > >>> > >>> > >> Call me an LTP newbie, but where can I get the sbrk_mutex/async_handler > >> tests. > >> > > Ok, I figured out this part. I needed a newer version of the .rpm from > > a different repo. However, both async_handler and sbrk_mutex seem to > > segfault for me. Hmm > > > > Thanks to help from Darren I got around this issue. Unfortunately both > tests pass so I cannot reproduce this issue, nor do I see the problem > via code inspection. Ill keep digging but I am currently at a loss. I > may need to send you some diagnostic patches to find this, if that is ok > with you Chirag? This particular bug is not producible on the x86 boxes, i have access to. Only on ppc64. Please send the diagnostic patches across. I'll try them out! :) -- -Thanks,Chirag ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [BUG][PPC64] BUG in 2.6.26.5-rt9 causing Hang 2008-09-30 4:43 ` Chirag Jog @ 2008-09-30 6:47 ` Gilles Carry 2008-10-01 14:22 ` [PATCH] sched: add a stacktrace on enqueue_pushable error Gregory Haskins 1 sibling, 0 replies; 24+ messages in thread From: Gilles Carry @ 2008-09-30 6:47 UTC (permalink / raw) To: Chirag Jog Cc: Gregory Haskins, linux-rt-users, LKML, Steven Rostedt, dvhltc, Dinakar Guniguntala Chirag Jog wrote: > Hi Gregory, > * Gregory Haskins <ghaskins@novell.com> [2008-09-29 18:00:01]: > > >>Gregory Haskins wrote: >> >>>Gregory Haskins wrote: >>> >>> >>>>Hi Chirag >>>> >>>>Chirag Jog wrote: >>>> >>>> >>>> >>>>>Hi Gregory, >>>>>We see the following BUG followed by a hang on the latest kernel 2.6.26.5-rt9 on a Power6 blade (PPC64) >>>>>It is easily recreated by running the async_handler or sbrk_mutex (realtime tests from ltp) tests. >>>>> >>>>> >>>>> >>>> >>>>Call me an LTP newbie, but where can I get the sbrk_mutex/async_handler >>>>tests. >>>> >>> >>>Ok, I figured out this part. I needed a newer version of the .rpm from >>>a different repo. However, both async_handler and sbrk_mutex seem to >>>segfault for me. Hmm >>> >> >>Thanks to help from Darren I got around this issue. Unfortunately both >>tests pass so I cannot reproduce this issue, nor do I see the problem >>via code inspection. Ill keep digging but I am currently at a loss. I >>may need to send you some diagnostic patches to find this, if that is ok >>with you Chirag? > > This particular bug is not producible on the x86 boxes, i have access > to. Only on ppc64. > Please send the diagnostic patches across. > I'll try them out! :) > Hi, I have access to Power6 and x86_64 boxes and so far I could only reproduce the bug on PPC64. The bug arised from 2.6.26.3-rt6 since sched-only-push-if-pushable.patch and sched-only-push-once-per-queue.patch. Whereas sbrk_mutex definetly shows up the problem, it also can occur randomly, sometimes during the boot period. At the beginning, I had system hangs or this (very similar to Chirag's and not necessarly in sbrk_mutex): cpu 0x3: Vector: 700 (Program Check) at [c0000000ee30b600] pc: c0000000001b9bac: .__list_add+0x70/0xa0 lr: c0000000001b9ba8: .__list_add+0x6c/0xa0 sp: c0000000ee30b880 msr: 8000000000021032 current = 0xc0000000ee2b1830 paca = 0xc0000000005c3980 pid = 51, comm = sirq-sched/3 kernel BUG at lib/list_debug.c:33! enter ? for help [c0000000ee30b900] c0000000001b8ec0 .plist_del+0x6c/0xcc [c0000000ee30b9a0] c00000000004d500 .dequeue_pushable_task+0x24/0x3c [c0000000ee30ba20] c00000000004ec18 .push_rt_task+0x1f0/0x2c0 [c0000000ee30bae0] c00000000004ed0c .push_rt_tasks+0x24/0x44 [c0000000ee30bb70] c00000000004ed58 .post_schedule_rt+0x2c/0x50 [c0000000ee30bc00] c0000000000527c4 .finish_task_switch+0x100/0x1a8 [c0000000ee30bcb0] c0000000002cd1e0 .__schedule+0x688/0x744 [c0000000ee30bd90] c0000000002cd4ec .schedule+0xf4/0x128 [c0000000ee30be20] c000000000061634 .ksoftirqd+0x124/0x37c [c0000000ee30bf00] c000000000076cf0 .kthread+0x84/0xd4 [c0000000ee30bf90] c000000000029368 .kernel_thread+0x4c/0x68 3:mon> So I suspected a memory corruption but adding padding fields around the pointers and extra checks did not reveal any data trashing. Playing with xmon, I finally found out that when hanging, the system was stuck in an infinite loop in plist_check_list. Si I modified lib/plist.c: I supposed that no list holds more than 100 000 000 elements in the system. ;-) static void plist_check_list(struct list_head *top) { struct list_head *prev = top, *next = top->next; + unsigned long long i = 1; plist_check_prev_next(top, prev, next); while (next != top) { + BUG_ON(i++ > 100000000); prev = next; next = prev->next; plist_check_prev_next(top, prev, next); and got this: cpu 0x6: Vector: 700 (Program Check) at [c0000000eeda7530] pc: c0000000001ba498: .plist_check_list+0x68/0xb4 lr: c0000000001ba4b4: .plist_check_list+0x84/0xb4 sp: c0000000eeda77b0 msr: 8000000000021032 current = 0xc0000000ee80dfa0 paca = 0xc0000000005d3f80 pid = 2602, comm = sbrk_mutex kernel BUG at lib/plist.c:50! enter ? for help [c0000000eeda7850] c0000000001ba530 .plist_check_head+0x4c/0x64 [c0000000eeda78e0] c0000000001ba57c .plist_del+0x34/0xdc [c0000000eeda7980] c00000000004d734 .dequeue_pushable_task+0x24/0x3c [c0000000eeda7a00] c00000000004d7c4 .pick_next_task_rt+0x38/0x58 [c0000000eeda7a90] c0000000002cefb0 .__schedule+0x510/0x75c [c0000000eeda7b70] c0000000002cf44c .schedule+0xf4/0x128 [c0000000eeda7c00] c0000000002cfe4c .do_nanosleep+0x7c/0xe4 [c0000000eeda7c90] c00000000007be68 .hrtimer_nanosleep+0x84/0x10c [c0000000eeda7d90] c00000000007bf6c .sys_nanosleep+0x7c/0xa0 [c0000000eeda7e30] c0000000000086ac syscall_exit+0x0/0x40 --- Exception: c01 (System Call) at 00000080fdb85880 SP (4000843e660) is in userspace which corresponds to the BUG_ON stuff. It seems that the pushable_tasks list is corrupted: it never loops back to the first element (top). Is there a shortcut anywhere? Since the patches don't feature any arch-specific change, I'm looking for arch-specific code triggered by the modifications brought by the patches. Still searching... Also for me, using CONFIG_GROUP_SCHED stuffs hides the problem. I'm going to harden plist_check_list and see what it does. Gilles. ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH] sched: add a stacktrace on enqueue_pushable error 2008-09-30 4:43 ` Chirag Jog 2008-09-30 6:47 ` Gilles Carry @ 2008-10-01 14:22 ` Gregory Haskins 2008-10-02 9:42 ` Gilles Carry 1 sibling, 1 reply; 24+ messages in thread From: Gregory Haskins @ 2008-10-01 14:22 UTC (permalink / raw) To: Chirag Jog; +Cc: linux-rt-users, linux-kernel, rostedt, dvhltc, dino Hi Chirag, Please apply this patch to your 26.5-rt9 tree for ppc64, enable CONFIG_PROVE_LOCKING (which enables CONFIG_STACKTRACE) and give it a whirl. If you get an oops, please post the console output. Thanks! -Greg ------------------ sched: add a stacktrace on enqueue_pushable error NOT FOR INCLUSION! This is to help debug an issue discovered by Chirag Jog in the thread http://lkml.org/lkml/2008/9/25/189 Signed-off-by: Gregory Haskins <ghaskins@novell.com> --- include/linux/sched.h | 6 ++++++ kernel/sched_rt.c | 20 +++++++++++++++++++- 2 files changed, 25 insertions(+), 1 deletions(-) diff --git a/include/linux/sched.h b/include/linux/sched.h index 67da014..53e8a6a 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -87,6 +87,7 @@ struct sched_param { #include <linux/task_io_accounting.h> #include <linux/kobject.h> #include <linux/latencytop.h> +#include <linux/stacktrace.h> #include <asm/processor.h> @@ -1170,6 +1171,11 @@ struct task_struct { struct list_head tasks; struct plist_node pushable_tasks; + struct { + unsigned long data[15]; + struct stack_trace trace; + pid_t pid; + } pushable_stack; /* * ptrace_list/ptrace_children forms the list of my children diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c index 57a0c0d..0e6a88c 100644 --- a/kernel/sched_rt.c +++ b/kernel/sched_rt.c @@ -55,7 +55,25 @@ static void update_rt_migration(struct rq *rq) static void enqueue_pushable_task(struct rq *rq, struct task_struct *p) { - plist_del(&p->pushable_tasks, &rq->rt.pushable_tasks); + if (plist_node_empty(&p->pushable_tasks)) { + struct stack_trace *trace = &p->pushable_stack.trace; + + trace->nr_entries = 0; + trace->max_entries = sizeof(p->pushable_stack.data)/sizeof(p->pushable_stack.data[0]); + trace->entries = &p->pushable_stack.data[0]; + trace->skip = 0; + + save_stack_trace(trace); + + p->pushable_stack.pid = current->pid; + } else { + printk(KERN_CRIT "redundant enqueue by %d detected\n", + p->pushable_stack.pid); + print_stack_trace(&p->pushable_stack.trace, 5); + + BUG_ON(!plist_node_empty(&p->pushable_tasks)); + } + plist_node_init(&p->pushable_tasks, p->prio); plist_add(&p->pushable_tasks, &rq->rt.pushable_tasks); } ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH] sched: add a stacktrace on enqueue_pushable error 2008-10-01 14:22 ` [PATCH] sched: add a stacktrace on enqueue_pushable error Gregory Haskins @ 2008-10-02 9:42 ` Gilles Carry 0 siblings, 0 replies; 24+ messages in thread From: Gilles Carry @ 2008-10-02 9:42 UTC (permalink / raw) To: Gregory Haskins Cc: Chirag Jog, linux-rt-users, linux-kernel, rostedt, dvhltc, dino, Jean-Pierre Dion, Sébastien Dugué, Tim Chavez Hi, I could reproduce the bug on an intel architecture. I found where the problem is. I'll post the patch in a few hours. Gilles. Gregory Haskins wrote: > Hi Chirag, > Please apply this patch to your 26.5-rt9 tree for ppc64, enable > CONFIG_PROVE_LOCKING (which enables CONFIG_STACKTRACE) and give it a whirl. > If you get an oops, please post the console output. > > Thanks! > > -Greg > > ------------------ > sched: add a stacktrace on enqueue_pushable error > > NOT FOR INCLUSION! > > This is to help debug an issue discovered by Chirag Jog in the thread > > http://lkml.org/lkml/2008/9/25/189 > > Signed-off-by: Gregory Haskins <ghaskins@novell.com> > --- > > include/linux/sched.h | 6 ++++++ > kernel/sched_rt.c | 20 +++++++++++++++++++- > 2 files changed, 25 insertions(+), 1 deletions(-) > > diff --git a/include/linux/sched.h b/include/linux/sched.h > index 67da014..53e8a6a 100644 > --- a/include/linux/sched.h > +++ b/include/linux/sched.h > @@ -87,6 +87,7 @@ struct sched_param { > #include <linux/task_io_accounting.h> > #include <linux/kobject.h> > #include <linux/latencytop.h> > +#include <linux/stacktrace.h> > > #include <asm/processor.h> > > @@ -1170,6 +1171,11 @@ struct task_struct { > > struct list_head tasks; > struct plist_node pushable_tasks; > + struct { > + unsigned long data[15]; > + struct stack_trace trace; > + pid_t pid; > + } pushable_stack; > > /* > * ptrace_list/ptrace_children forms the list of my children > diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c > index 57a0c0d..0e6a88c 100644 > --- a/kernel/sched_rt.c > +++ b/kernel/sched_rt.c > @@ -55,7 +55,25 @@ static void update_rt_migration(struct rq *rq) > > static void enqueue_pushable_task(struct rq *rq, struct task_struct *p) > { > - plist_del(&p->pushable_tasks, &rq->rt.pushable_tasks); > + if (plist_node_empty(&p->pushable_tasks)) { > + struct stack_trace *trace = &p->pushable_stack.trace; > + > + trace->nr_entries = 0; > + trace->max_entries = sizeof(p->pushable_stack.data)/sizeof(p->pushable_stack.data[0]); > + trace->entries = &p->pushable_stack.data[0]; > + trace->skip = 0; > + > + save_stack_trace(trace); > + > + p->pushable_stack.pid = current->pid; > + } else { > + printk(KERN_CRIT "redundant enqueue by %d detected\n", > + p->pushable_stack.pid); > + print_stack_trace(&p->pushable_stack.trace, 5); > + > + BUG_ON(!plist_node_empty(&p->pushable_tasks)); > + } > + > plist_node_init(&p->pushable_tasks, p->prio); > plist_add(&p->pushable_tasks, &rq->rt.pushable_tasks); > } > > -- > To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > > -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Gilles.Carry Linux Project team mailto: gilles.carry@bull.net Phone: +33 (0)4 76 29 74 27 Addr.: BULL S.A. 1 rue de Provence, B.P. 208 38432 Echirolles Cedex http://www.bull.com http://www.bullopensource.org/ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [BUG][PPC64] BUG in 2.6.26.5-rt9 causing Hang 2008-09-29 21:18 ` Gregory Haskins 2008-09-29 21:34 ` Gregory Haskins @ 2008-10-02 11:18 ` Gilles Carry 1 sibling, 0 replies; 24+ messages in thread From: Gilles Carry @ 2008-10-02 11:18 UTC (permalink / raw) To: Gregory Haskins; +Cc: linux-rt-users, LKML, Steven Rostedt Hi, I could reproduce the bug on intel x86_64 with LTP's sbrk_mutex: kernel BUG at kernel/sched_rt.c:1044! invalid opcode: 0000 [1] PREEMPT SMP CPU 5 Modules linked in: mptsas scsi_transport_sas Pid: 27577, comm: sbrk_mutex Not tainted 2.6.26.5-rt9-00002-g3b27927 #23 RIP: 0010:[<ffffffff80227f95>] [<ffffffff80227f95>] pick_next_pushable_task+0x6 1/0x77 RSP: 0018:ffff81007713fd28 EFLAGS: 00010046 RAX: 0000000000000005 RBX: ffff810083a4e280 RCX: ffff81013dcee458 RDX: ffff8100771f8000 RSI: ffff81013dcee2c0 RDI: ffff810083a4e280 RBP: ffff81007713fd28 R08: ffff81007713e000 R09: 0000000000000000 R10: 000000004bbbc9e0 R11: ffff81007dc3bde8 R12: ffff81023ff7c910 R13: ffff8101bf4ad0c0 R14: 0000000000000001 R15: ffff810083a4e280 FS: 000000004d3bf940(0063) GS:ffff81013f4458c0(0000) knlGS:00000000f7f216c0 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 000000389d495770 CR3: 000000007c11a000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process sbrk_mutex (pid: 27577, threadinfo ffff81007713e000, task ffff8100771d61 c0) Stack: ffff81007713fd68 ffffffff8022b1ce ffff81007713fdc8 ffff810083a4e280 ffff81023ff7c910 ffff8101bf4ad0c0 0000000000000001 0000000000000000 ffff81007713fd88 ffffffff8022b3c3 ffff81007713fda8 ffff810083a4e280 Call Trace: [<ffffffff8022b1ce>] push_rt_task+0x26/0x207 [<ffffffff8022b3c3>] push_rt_tasks+0x14/0x1c [<ffffffff8022b3e4>] post_schedule_rt+0x19/0x25 [<ffffffff8022d7e9>] finish_task_switch+0x73/0x121 [<ffffffff805bbe3d>] thread_return+0x4f/0xdc [<ffffffff805bc066>] schedule+0xd4/0xf0 [<ffffffff805bc686>] do_nanosleep+0x5c/0x9c [<ffffffff80248350>] ? hrtimer_nanosleep+0x54/0xbd [<ffffffff80247c9d>] ? hrtimer_wakeup+0x0/0x21 [<ffffffff805bc66b>] ? do_nanosleep+0x41/0x9c [<ffffffff8022e9f4>] ? schedule_tail+0x43/0x97 [<ffffffff80248405>] ? sys_nanosleep+0x4c/0x62 [<ffffffff8020b32a>] ? system_call_after_swapgs+0x8a/0x8f Code: 42 18 74 04 0f 0b eb fe 48 39 b7 48 0e 00 00 75 04 0f 0b eb fe 83 b9 50 ff ff ff 01 7f 04 0f 0b eb fe 83 b9 e0 fe ff ff 00 75 04 <0f> 0b eb fe 83 b9 8c fe ff ff 63 7e 04 0f 0b eb fe c9 48 89 f0 RIP [<ffffffff80227f95>] pick_next_pushable_task+0x61/0x77 RSP <ffff81007713fd28> The difference with powerpc64 is that you need to be patient: it takes tens of minutes to BUG/hang on intel whereas on power it's almost immediate. I just posted the patch on this list (Fix pushable_task list corruption) Greg, please can you review this patch and comment? Thanks. Gilles. ^ permalink raw reply [flat|nested] 24+ messages in thread
* [RT PATCH 0/2] fix for BUG_ON crash in 26.5-rt9 2008-09-25 12:32 [BUG][PPC64] BUG in 2.6.26.5-rt9 causing Hang Chirag Jog 2008-09-29 18:13 ` Gregory Haskins 2008-09-29 21:18 ` Gregory Haskins @ 2008-10-03 12:42 ` Gregory Haskins 2008-10-03 12:43 ` [PATCH 1/2] RT: Remove comment that is no longer true Gregory Haskins ` (2 more replies) 2008-10-06 15:14 ` [RT PATCH v3 0/2] Fix for "[BUG][PPC64] BUG in 2.6.26.5-rt9 causing Hang" Gregory Haskins 3 siblings, 3 replies; 24+ messages in thread From: Gregory Haskins @ 2008-10-03 12:42 UTC (permalink / raw) To: Chirag Jog Cc: linux-rt-users, linux-kernel, rostedt, dvhltc, dino, Gilles.Carry Hi Chirag, Please try the following patches applied to 26.5-rt9 and let me know what you find. Hi Steve, If these look good to everyone, please consider them for inclusion in -rt10. Regards, -Greg --- Gregory Haskins (2): RT: remove "paranoid" limit in push_rt_task RT: Remove comment that is no longer true kernel/sched_rt.c | 15 +++------------ 1 files changed, 3 insertions(+), 12 deletions(-) ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH 1/2] RT: Remove comment that is no longer true 2008-10-03 12:42 ` [RT PATCH 0/2] fix for BUG_ON crash in 26.5-rt9 Gregory Haskins @ 2008-10-03 12:43 ` Gregory Haskins 2008-10-03 12:43 ` [PATCH 2/2] RT: remove "paranoid" limit in push_rt_task Gregory Haskins 2008-10-03 12:54 ` [RT PATCH 0/2] fix for BUG_ON crash in 26.5-rt9 Gregory Haskins 2 siblings, 0 replies; 24+ messages in thread From: Gregory Haskins @ 2008-10-03 12:43 UTC (permalink / raw) To: Chirag Jog Cc: linux-rt-users, linux-kernel, rostedt, dvhltc, dino, Gilles.Carry We fixed the condition noted in the comment with the "pushable_tasks" logic, but forgot to remove this comment. Lets clean it up. Signed-off-by: Gregory Haskins <ghaskins@novell.com> --- kernel/sched_rt.c | 10 ---------- 1 files changed, 0 insertions(+), 10 deletions(-) diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c index 57a0c0d..59ead84 100644 --- a/kernel/sched_rt.c +++ b/kernel/sched_rt.c @@ -1125,16 +1125,6 @@ out: return 1; } -/* - * TODO: Currently we just use the second highest prio task on - * the queue, and stop when it can't migrate (or there's - * no more RT tasks). There may be a case where a lower - * priority RT task has a different affinity than the - * higher RT task. In this case the lower RT task could - * possibly be able to migrate where as the higher priority - * RT task could not. We currently ignore this issue. - * Enhancements are welcome! - */ static void push_rt_tasks(struct rq *rq) { /* push_rt_task will return true if it moved an RT */ ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 2/2] RT: remove "paranoid" limit in push_rt_task 2008-10-03 12:42 ` [RT PATCH 0/2] fix for BUG_ON crash in 26.5-rt9 Gregory Haskins 2008-10-03 12:43 ` [PATCH 1/2] RT: Remove comment that is no longer true Gregory Haskins @ 2008-10-03 12:43 ` Gregory Haskins 2008-10-03 13:46 ` Gilles Carry 2008-10-03 12:54 ` [RT PATCH 0/2] fix for BUG_ON crash in 26.5-rt9 Gregory Haskins 2 siblings, 1 reply; 24+ messages in thread From: Gregory Haskins @ 2008-10-03 12:43 UTC (permalink / raw) To: Chirag Jog Cc: linux-rt-users, linux-kernel, rostedt, dvhltc, dino, Gilles.Carry A panic was discovered by Chirag Jog and investigated by Gilles Carry to be originating in the fact that a task being pushed away may get migrated away during a double_lock_balance. The result was that the pushable_tasks list may become corrupted. The root cause is that the "paranoid" retry limit could cause us to bail out of a retry, but still try to remove the item from the (now potentially incorrect) list. There are numerous ways to correct the condition, but the paranoid feature is no longer relevant with the new pushable logic (since pushable naturally limits the loop anyway), so lets just remove it. Reported By: Chirag Jog <chirag@linux.vnet.ibm.com> Found-by: Gilles Carry <gilles.carry@bull.net> Signed-off-by: Gregory Haskins <ghaskins@novell.com> --- kernel/sched_rt.c | 5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-) diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c index 59ead84..5a754fe 100644 --- a/kernel/sched_rt.c +++ b/kernel/sched_rt.c @@ -1056,7 +1056,6 @@ static int push_rt_task(struct rq *rq) { struct task_struct *next_task; struct rq *lowest_rq; - int paranoid = RT_MAX_TRIES; if (!rq->rt.overloaded) return 0; @@ -1094,12 +1093,14 @@ static int push_rt_task(struct rq *rq) * If it has, then try again. */ task = pick_next_pushable_task(rq); - if (unlikely(task != next_task) && task && paranoid--) { + if (unlikely(task != next_task) && task) { put_task_struct(next_task); next_task = task; goto retry; } + BUG_ON(task_cpu(next_task) != rq->cpu); + /* * Once we have failed to push this task, we will not * try again, since the other cpus will pull from us ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH 2/2] RT: remove "paranoid" limit in push_rt_task 2008-10-03 12:43 ` [PATCH 2/2] RT: remove "paranoid" limit in push_rt_task Gregory Haskins @ 2008-10-03 13:46 ` Gilles Carry 2008-10-03 15:45 ` Chirag Jog 2008-10-03 17:26 ` [RT PATCH v2 0/2] Series short description Gregory Haskins 0 siblings, 2 replies; 24+ messages in thread From: Gilles Carry @ 2008-10-03 13:46 UTC (permalink / raw) To: Gregory Haskins Cc: Chirag Jog, linux-rt-users, linux-kernel, rostedt, dvhltc, dino Sorry Greg, Neither PPC64 nor Intel64 make it with this patch. At boot time, it stops at the BUG_ON you added: 0xc00000000004eca4 is in push_rt_task (kernel/sched_rt.c:1102) I let you do more investigations. Have a good week-end in you garage ;) Gilles. PPC64: cpu 0x2: Vector: 700 (Program Check) at [c0000000ee2877b0] pc: c00000000004eca4: .push_rt_task+0x1f4/0x2d0 lr: c00000000004ec24: .push_rt_task+0x174/0x2d0 sp: c0000000ee287a30 msr: 8000000000021032 current = 0xc0000000ee276fe0 paca = 0xc0000000005c3780 pid = 36, comm = sirq-block/2 kernel BUG at kernel/sched_rt.c:1102! enter ? for help [c0000000ee287a30] c00000000004ec78 .push_rt_task+0x1c8/0x2d0 (unreliable) [c0000000ee287ae0] c00000000004eda4 .push_rt_tasks+0x24/0x44 [c0000000ee287b70] c00000000004edf0 .post_schedule_rt+0x2c/0x50 [c0000000ee287c00] c000000000052864 .finish_task_switch+0x100/0x1a8 [c0000000ee287cb0] c0000000002cdbd0 .__schedule+0x6a0/0x75c [c0000000ee287d90] c0000000002cdedc .schedule+0xf4/0x128 [c0000000ee287e20] c000000000061700 .ksoftirqd+0x124/0x37c [c0000000ee287f00] c000000000076dc0 .kthread+0x84/0xd4 [c0000000ee287f90] c000000000029368 .kernel_thread+0x4c/0x68 2:mon> Intel64: kernel BUG at kernel/sched_rt.c:1102! invalid opcode: 0000 [1] PREEMPT SMP CPU 4 Modules linked in: mptsas scsi_transport_sas Pid: 61, comm: sirq-block/4 Not tainted 2.6.26.5-rt9-00002-g3b27927-dirty #26 RIP: 0010:[<ffffffff8022b307>] [<ffffffff8022b307>] push_rt_task+0x15f/0x20b RSP: 0018:ffff81007f4d5d70 EFLAGS: 00010097 RAX: 0000000000000000 RBX: ffff81007edf09d0 RCX: 000000000822b765 RDX: 000000000822b765 RSI: 0000000000000000 RDI: ffff81000103f280 RBP: ffff81007f4d5da0 R08: ffff81007f4d4000 R09: ffff81007edcbe20 R10: 00000000ffffffff R11: ffffffff8021fa2c R12: 0000000000000000 R13: ffff810001034280 R14: ffff81007edf09e0 R15: ffff81000103f280 FS: 00007f2f26e776f0(0000) GS:ffff81007fc0ccc0(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 00000000006b9fb0 CR3: 00000001bf4c9000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process sirq-block/4 (pid: 61, threadinfo ffff81007f4d4000, task ffff81007f4d0e10) Stack: 000000007f4d5e00 ffff81000103f280 ffff81007edf09d0 ffff8101bf457540 0000000000000001 0000000000000002 ffff81007f4d5dc0 ffffffff8022b3c7 ffff81007f4d5de0 ffff81000103f280 ffff81007f4d5de0 ffffffff8022b3e8 Call Trace: [<ffffffff8022b3c7>] push_rt_tasks+0x14/0x1c [<ffffffff8022b3e8>] post_schedule_rt+0x19/0x25 [<ffffffff8022d7ee>] finish_task_switch+0x73/0x121 [<ffffffff805bbe3d>] thread_return+0x4f/0xdc [<ffffffff805bc066>] schedule+0xd4/0xf0 [<ffffffff80237eeb>] ksoftirqd+0xb3/0x260 [<ffffffff80237e38>] ? ksoftirqd+0x0/0x260 [<ffffffff80245209>] ? kthread+0x47/0x76 [<ffffffff8022e9f9>] ? schedule_tail+0x43/0x97 [<ffffffff8020c3d8>] ? child_rip+0xa/0x12 [<ffffffff802451c2>] ? kthread+0x0/0x76 [<ffffffff8020c3ce>] ? child_rip+0x0/0x12 Code: 48 c7 c6 c0 1d 23 80 e8 83 b3 03 00 e9 ee fe ff ff 4c 89 e7 e8 b1 31 39 00 eb ba 48 8b 43 08 8b 40 18 41 3b 87 90 0e 00 00 74 04 <0f> 0b eb fe 48 89 de 4c 89 ff e8 5b fe ff ff f0 41 ff 0e 0f 94 RIP [<ffffffff8022b307>] push_rt_task+0x15f/0x20b RSP <ffff81007f4d5d70> Gregory Haskins wrote: > A panic was discovered by Chirag Jog and investigated by Gilles Carry > to be originating in the fact that a task being pushed away > may get migrated away during a double_lock_balance. The result was > that the pushable_tasks list may become corrupted. > > The root cause is that the "paranoid" retry limit could cause us to > bail out of a retry, but still try to remove the item from the (now > potentially incorrect) list. There are numerous ways to correct the > condition, but the paranoid feature is no longer relevant with the new > pushable logic (since pushable naturally limits the loop anyway), so > lets just remove it. > > Reported By: Chirag Jog <chirag@linux.vnet.ibm.com> > Found-by: Gilles Carry <gilles.carry@bull.net> > Signed-off-by: Gregory Haskins <ghaskins@novell.com> > --- > > kernel/sched_rt.c | 5 +++-- > 1 files changed, 3 insertions(+), 2 deletions(-) > > diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c > index 59ead84..5a754fe 100644 > --- a/kernel/sched_rt.c > +++ b/kernel/sched_rt.c > @@ -1056,7 +1056,6 @@ static int push_rt_task(struct rq *rq) > { > struct task_struct *next_task; > struct rq *lowest_rq; > - int paranoid = RT_MAX_TRIES; > > if (!rq->rt.overloaded) > return 0; > @@ -1094,12 +1093,14 @@ static int push_rt_task(struct rq *rq) > * If it has, then try again. > */ > task = pick_next_pushable_task(rq); > - if (unlikely(task != next_task) && task && paranoid--) { > + if (unlikely(task != next_task) && task) { > put_task_struct(next_task); > next_task = task; > goto retry; > } > > + BUG_ON(task_cpu(next_task) != rq->cpu); > + > /* > * Once we have failed to push this task, we will not > * try again, since the other cpus will pull from us > > -- > To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > > ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 2/2] RT: remove "paranoid" limit in push_rt_task 2008-10-03 13:46 ` Gilles Carry @ 2008-10-03 15:45 ` Chirag Jog 2008-10-03 17:27 ` Gregory Haskins 2008-10-03 17:26 ` [RT PATCH v2 0/2] Series short description Gregory Haskins 1 sibling, 1 reply; 24+ messages in thread From: Chirag Jog @ 2008-10-03 15:45 UTC (permalink / raw) To: Gilles Carry Cc: Gregory Haskins, linux-rt-users, linux-kernel, rostedt, dvhltc, dino * Gilles Carry <Gilles.Carry@bull.net> [2008-10-03 15:46:59]: > Sorry Greg, > > Neither PPC64 nor Intel64 make it with this patch. > At boot time, it stops at the BUG_ON you added: > 0xc00000000004eca4 is in push_rt_task (kernel/sched_rt.c:1102) I am also confirming this issue gilles reported. Although, i have a question: When we enable group scheduling (CONFIG_RT_GROUP_SCHED), all the problems disappear. After skimming through the rt scheduler code, I don't feel group scheduling alters the behavior of push/pull strategies in any way. So I am wonder whether enabling group scheduling actually solves the problem or just makes it tough to recreate? > I let you do more investigations. > Have a good week-end in you garage ;) Have a great weekend :) > -- -Thanks,Chirag ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 2/2] RT: remove "paranoid" limit in push_rt_task 2008-10-03 15:45 ` Chirag Jog @ 2008-10-03 17:27 ` Gregory Haskins 0 siblings, 0 replies; 24+ messages in thread From: Gregory Haskins @ 2008-10-03 17:27 UTC (permalink / raw) To: Gilles Carry, Gregory Haskins, linux-rt-users, linux-kernel, rostedt, dvhltc, dino [-- Attachment #1: Type: text/plain, Size: 949 bytes --] Chirag Jog wrote: > * Gilles Carry <Gilles.Carry@bull.net> [2008-10-03 15:46:59]: > > >> Sorry Greg, >> >> Neither PPC64 nor Intel64 make it with this patch. >> At boot time, it stops at the BUG_ON you added: >> 0xc00000000004eca4 is in push_rt_task (kernel/sched_rt.c:1102) >> > I am also confirming this issue gilles reported. > > Although, i have a question: > When we enable group scheduling (CONFIG_RT_GROUP_SCHED), > all the problems disappear. > After skimming through the rt scheduler code, I don't feel group > scheduling alters the behavior of push/pull strategies in any way. > > So I am wonder whether enabling group scheduling > actually solves the problem or just makes it tough > to recreate? > Hi Chirag, The issue that Gilles pointed me at is a race condition. I do not suspect GROUP_SCHED itself has anything to do with the problem other than it changes the timing. HTH! -Greg [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 257 bytes --] ^ permalink raw reply [flat|nested] 24+ messages in thread
* [RT PATCH v2 0/2] Series short description 2008-10-03 13:46 ` Gilles Carry 2008-10-03 15:45 ` Chirag Jog @ 2008-10-03 17:26 ` Gregory Haskins 2008-10-03 17:26 ` [RT PATCH v2 1/2] RT: Remove comment that is no longer true Gregory Haskins 2008-10-03 17:26 ` [RT PATCH v2 2/2] RT: remove "paranoid" limit in push_rt_task Gregory Haskins 1 sibling, 2 replies; 24+ messages in thread From: Gregory Haskins @ 2008-10-03 17:26 UTC (permalink / raw) To: Chirag Jog Cc: linux-rt-users, linux-kernel, rostedt, dvhltc, dino, Gilles.Carry Gilles Carry wrote: > Sorry Greg, > > Neither PPC64 nor Intel64 make it with this patch. > At boot time, it stops at the BUG_ON you added: > 0xc00000000004eca4 is in push_rt_task (kernel/sched_rt.c:1102) > Indeed. Your report has revealed the problem to me. The issue is that there are three conditions embedded in that if(!lower_rq) code, but two are buried in the !retry case. This was the mistake I was making. We basically need to a) dequeue if the task hasnt moved b) retry if the task *has* moved AND there are more tasks left c) stop of the task *has* moved AND there are no more tasks I was missing logic to handle (c). "v2" should fix this so it is handled. Please give it a try. Thanks again, Gilles! (Again, only build-tested) Regards, -Greg --- Gregory Haskins (2): RT: remove "paranoid" limit in push_rt_task RT: Remove comment that is no longer true kernel/sched_rt.c | 44 ++++++++++++++++++++++---------------------- 1 files changed, 22 insertions(+), 22 deletions(-) -- Signature ^ permalink raw reply [flat|nested] 24+ messages in thread
* [RT PATCH v2 1/2] RT: Remove comment that is no longer true 2008-10-03 17:26 ` [RT PATCH v2 0/2] Series short description Gregory Haskins @ 2008-10-03 17:26 ` Gregory Haskins 2008-10-03 17:26 ` [RT PATCH v2 2/2] RT: remove "paranoid" limit in push_rt_task Gregory Haskins 1 sibling, 0 replies; 24+ messages in thread From: Gregory Haskins @ 2008-10-03 17:26 UTC (permalink / raw) To: Chirag Jog Cc: linux-rt-users, linux-kernel, rostedt, dvhltc, dino, Gilles.Carry We fixed the condition noted in the comment with the "pushable_tasks" logic, but forgot to remove this comment. Lets clean it up. Signed-off-by: Gregory Haskins <ghaskins@novell.com> --- kernel/sched_rt.c | 10 ---------- 1 files changed, 0 insertions(+), 10 deletions(-) diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c index 57a0c0d..59ead84 100644 --- a/kernel/sched_rt.c +++ b/kernel/sched_rt.c @@ -1125,16 +1125,6 @@ out: return 1; } -/* - * TODO: Currently we just use the second highest prio task on - * the queue, and stop when it can't migrate (or there's - * no more RT tasks). There may be a case where a lower - * priority RT task has a different affinity than the - * higher RT task. In this case the lower RT task could - * possibly be able to migrate where as the higher priority - * RT task could not. We currently ignore this issue. - * Enhancements are welcome! - */ static void push_rt_tasks(struct rq *rq) { /* push_rt_task will return true if it moved an RT */ ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [RT PATCH v2 2/2] RT: remove "paranoid" limit in push_rt_task 2008-10-03 17:26 ` [RT PATCH v2 0/2] Series short description Gregory Haskins 2008-10-03 17:26 ` [RT PATCH v2 1/2] RT: Remove comment that is no longer true Gregory Haskins @ 2008-10-03 17:26 ` Gregory Haskins 1 sibling, 0 replies; 24+ messages in thread From: Gregory Haskins @ 2008-10-03 17:26 UTC (permalink / raw) To: Chirag Jog Cc: linux-rt-users, linux-kernel, rostedt, dvhltc, dino, Gilles.Carry A panic was discovered by Chirag Jog and investigated by Gilles Carry to be originating in the fact that a task being pushed away may get migrated away during a double_lock_balance. The result was that the pushable_tasks list may become corrupted. The root cause is that the "paranoid" retry limit could cause us to bail out of a retry, but still try to remove the item from the (now potentially incorrect) list. There are numerous ways to correct the condition, but the paranoid feature is no longer relevant with the new pushable logic (since pushable naturally limits the loop anyway), so lets just remove it. Reported By: Chirag Jog <chirag@linux.vnet.ibm.com> Found-by: Gilles Carry <gilles.carry@bull.net> Signed-off-by: Gregory Haskins <ghaskins@novell.com> --- kernel/sched_rt.c | 34 ++++++++++++++++++++++------------ 1 files changed, 22 insertions(+), 12 deletions(-) diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c index 59ead84..201bd97 100644 --- a/kernel/sched_rt.c +++ b/kernel/sched_rt.c @@ -1056,7 +1056,6 @@ static int push_rt_task(struct rq *rq) { struct task_struct *next_task; struct rq *lowest_rq; - int paranoid = RT_MAX_TRIES; if (!rq->rt.overloaded) return 0; @@ -1090,23 +1089,34 @@ static int push_rt_task(struct rq *rq) struct task_struct *task; /* * find lock_lowest_rq releases rq->lock - * so it is possible that next_task has changed. - * If it has, then try again. + * so it is possible that next_task has migrated. + * + * We need to make sure that the task is still on the same + * run-queue and is also still the next task eligible for + * pushing. */ task = pick_next_pushable_task(rq); - if (unlikely(task != next_task) && task && paranoid--) { - put_task_struct(next_task); - next_task = task; - goto retry; + if (task_cpu(next_task) == rq->cpu && task == next_task) { + /* + * If we get here, the task hasnt moved it all, but + * it has failed to push. We will not try again, + * since the other cpus will pull from us when they + * are ready. + */ + dequeue_pushable_task(rq, next_task); + goto out; } + + if (!task) + /* No more tasks, just exit */ + goto out; /* - * Once we have failed to push this task, we will not - * try again, since the other cpus will pull from us - * when they are ready + * Something has shifted, try again. */ - dequeue_pushable_task(rq, next_task); - goto out; + put_task_struct(next_task); + next_task = task; + goto retry; } deactivate_task(rq, next_task, 0); ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [RT PATCH 0/2] fix for BUG_ON crash in 26.5-rt9 2008-10-03 12:42 ` [RT PATCH 0/2] fix for BUG_ON crash in 26.5-rt9 Gregory Haskins 2008-10-03 12:43 ` [PATCH 1/2] RT: Remove comment that is no longer true Gregory Haskins 2008-10-03 12:43 ` [PATCH 2/2] RT: remove "paranoid" limit in push_rt_task Gregory Haskins @ 2008-10-03 12:54 ` Gregory Haskins 2 siblings, 0 replies; 24+ messages in thread From: Gregory Haskins @ 2008-10-03 12:54 UTC (permalink / raw) To: Chirag Jog Cc: linux-rt-users, linux-kernel, rostedt, dvhltc, dino, Gilles.Carry [-- Attachment #1: Type: text/plain, Size: 421 bytes --] Gregory Haskins wrote: > Hi Chirag, > > Please try the following patches applied to 26.5-rt9 and let me know what you > find. > > Hi Steve, > If these look good to everyone, please consider them for inclusion in -rt10. > > I meant to add: this are build-tested only (now back to vacation time for me, which means slave labor in the garage!). I will catch up with you guys on Monday. -Greg [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 257 bytes --] ^ permalink raw reply [flat|nested] 24+ messages in thread
* [RT PATCH v3 0/2] Fix for "[BUG][PPC64] BUG in 2.6.26.5-rt9 causing Hang" 2008-09-25 12:32 [BUG][PPC64] BUG in 2.6.26.5-rt9 causing Hang Chirag Jog ` (2 preceding siblings ...) 2008-10-03 12:42 ` [RT PATCH 0/2] fix for BUG_ON crash in 26.5-rt9 Gregory Haskins @ 2008-10-06 15:14 ` Gregory Haskins 2008-10-06 15:14 ` [RT PATCH v3 1/2] RT: Remove comment that is no longer true Gregory Haskins ` (2 more replies) 3 siblings, 3 replies; 24+ messages in thread From: Gregory Haskins @ 2008-10-06 15:14 UTC (permalink / raw) To: Chirag Jog Cc: linux-rt-users, linux-kernel, rostedt, dvhltc, dino, Gilles.Carry Hi Steve, Chirag reported (via IRC) that v2 fixed his issue. V3 is identical to v2 except I cleaned up the patch description and fixed a comment typo. This applies to 26.5-rt9. Please consider it an urgent fix for -rt10. Regards, -Greg --- Gregory Haskins (2): RT: fix push_rt_task() to handle dequeue_pushable properly RT: Remove comment that is no longer true kernel/sched_rt.c | 44 ++++++++++++++++++++++---------------------- 1 files changed, 22 insertions(+), 22 deletions(-) -- ^ permalink raw reply [flat|nested] 24+ messages in thread
* [RT PATCH v3 1/2] RT: Remove comment that is no longer true 2008-10-06 15:14 ` [RT PATCH v3 0/2] Fix for "[BUG][PPC64] BUG in 2.6.26.5-rt9 causing Hang" Gregory Haskins @ 2008-10-06 15:14 ` Gregory Haskins 2008-10-06 15:14 ` [RT PATCH v3 2/2] RT: fix push_rt_task() to handle dequeue_pushable properly Gregory Haskins 2008-10-07 6:04 ` [RT PATCH v3 0/2] Fix for "[BUG][PPC64] BUG in 2.6.26.5-rt9 causing Hang" Gilles Carry 2 siblings, 0 replies; 24+ messages in thread From: Gregory Haskins @ 2008-10-06 15:14 UTC (permalink / raw) To: Chirag Jog Cc: linux-rt-users, linux-kernel, rostedt, dvhltc, dino, Gilles.Carry We fixed the condition noted in the comment with the "pushable_tasks" logic, but forgot to remove this comment. Lets clean it up. Signed-off-by: Gregory Haskins <ghaskins@novell.com> --- kernel/sched_rt.c | 10 ---------- 1 files changed, 0 insertions(+), 10 deletions(-) diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c index 57a0c0d..59ead84 100644 --- a/kernel/sched_rt.c +++ b/kernel/sched_rt.c @@ -1125,16 +1125,6 @@ out: return 1; } -/* - * TODO: Currently we just use the second highest prio task on - * the queue, and stop when it can't migrate (or there's - * no more RT tasks). There may be a case where a lower - * priority RT task has a different affinity than the - * higher RT task. In this case the lower RT task could - * possibly be able to migrate where as the higher priority - * RT task could not. We currently ignore this issue. - * Enhancements are welcome! - */ static void push_rt_tasks(struct rq *rq) { /* push_rt_task will return true if it moved an RT */ ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [RT PATCH v3 2/2] RT: fix push_rt_task() to handle dequeue_pushable properly 2008-10-06 15:14 ` [RT PATCH v3 0/2] Fix for "[BUG][PPC64] BUG in 2.6.26.5-rt9 causing Hang" Gregory Haskins 2008-10-06 15:14 ` [RT PATCH v3 1/2] RT: Remove comment that is no longer true Gregory Haskins @ 2008-10-06 15:14 ` Gregory Haskins 2008-10-07 6:04 ` [RT PATCH v3 0/2] Fix for "[BUG][PPC64] BUG in 2.6.26.5-rt9 causing Hang" Gilles Carry 2 siblings, 0 replies; 24+ messages in thread From: Gregory Haskins @ 2008-10-06 15:14 UTC (permalink / raw) To: Chirag Jog Cc: linux-rt-users, linux-kernel, rostedt, dvhltc, dino, Gilles.Carry A panic was discovered by Chirag Jog where a BUG_ON sanity check in the new "pushable_task" logic would trigger a panic under certain circumstances: http://lkml.org/lkml/2008/9/25/189 Gilles Carry discovered that the root cause was attributed to the pushable_tasks list getting corrupted in the push_rt_task logic. This was the result of a dropped rq lock in double_lock_balance allowing a task in the process of being pushed to potentially migrate away, and thus corrupt the pushable_tasks() list. I traced back the problem as introduced by the pushable_tasks patch that went in recently. There is a "retry" path in push_rt_task() that actually had a compound conditional to decide whether to retry or exit. I missed the meaning behind the rationale for the virtual "if(!task) goto out;" portion of the compound statement and thus did not handle it properly. The new pushable_tasks logic actually creates three distinct conditions: 1) an untouched and unpushable task should be dequeued 2) a migrated task where more pushable tasks remain should be retried 3) a migrated task where no more pushable tasks exist should exit The original logic mushed (1) and (3) together, resulting in the system dequeuing a migrated task (against an unlocked foreign run-queue nonetheless). To fix this, we get rid of the notion of "paranoid" and we support the three unique conditions properly. The paranoid feature is no longer relevant with the new pushable logic (since pushable naturally limits the loop) anyway, so lets just remove it. Reported-By: Chirag Jog <chirag@linux.vnet.ibm.com> Found-by: Gilles Carry <gilles.carry@bull.net> Signed-off-by: Gregory Haskins <ghaskins@novell.com> --- kernel/sched_rt.c | 34 ++++++++++++++++++++++------------ 1 files changed, 22 insertions(+), 12 deletions(-) diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c index 59ead84..05a1d4a 100644 --- a/kernel/sched_rt.c +++ b/kernel/sched_rt.c @@ -1056,7 +1056,6 @@ static int push_rt_task(struct rq *rq) { struct task_struct *next_task; struct rq *lowest_rq; - int paranoid = RT_MAX_TRIES; if (!rq->rt.overloaded) return 0; @@ -1090,23 +1089,34 @@ static int push_rt_task(struct rq *rq) struct task_struct *task; /* * find lock_lowest_rq releases rq->lock - * so it is possible that next_task has changed. - * If it has, then try again. + * so it is possible that next_task has migrated. + * + * We need to make sure that the task is still on the same + * run-queue and is also still the next task eligible for + * pushing. */ task = pick_next_pushable_task(rq); - if (unlikely(task != next_task) && task && paranoid--) { - put_task_struct(next_task); - next_task = task; - goto retry; + if (task_cpu(next_task) == rq->cpu && task == next_task) { + /* + * If we get here, the task hasnt moved at all, but + * it has failed to push. We will not try again, + * since the other cpus will pull from us when they + * are ready. + */ + dequeue_pushable_task(rq, next_task); + goto out; } + + if (!task) + /* No more tasks, just exit */ + goto out; /* - * Once we have failed to push this task, we will not - * try again, since the other cpus will pull from us - * when they are ready + * Something has shifted, try again. */ - dequeue_pushable_task(rq, next_task); - goto out; + put_task_struct(next_task); + next_task = task; + goto retry; } deactivate_task(rq, next_task, 0); ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [RT PATCH v3 0/2] Fix for "[BUG][PPC64] BUG in 2.6.26.5-rt9 causing Hang" 2008-10-06 15:14 ` [RT PATCH v3 0/2] Fix for "[BUG][PPC64] BUG in 2.6.26.5-rt9 causing Hang" Gregory Haskins 2008-10-06 15:14 ` [RT PATCH v3 1/2] RT: Remove comment that is no longer true Gregory Haskins 2008-10-06 15:14 ` [RT PATCH v3 2/2] RT: fix push_rt_task() to handle dequeue_pushable properly Gregory Haskins @ 2008-10-07 6:04 ` Gilles Carry 2 siblings, 0 replies; 24+ messages in thread From: Gilles Carry @ 2008-10-07 6:04 UTC (permalink / raw) To: Gregory Haskins Cc: Chirag Jog, linux-rt-users, linux-kernel, rostedt, dvhltc, dino Gregory Haskins wrote: > Hi Steve, > Chirag reported (via IRC) that v2 fixed his issue. V3 is identical to v2 > except I cleaned up the patch description and fixed a comment typo. This > applies to 26.5-rt9. Please consider it an urgent fix for -rt10. > Hi, After 24 hours of testing on PPC64, I confirm that this patch fixes the issue. Gilles. ^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2008-10-07 6:08 UTC | newest] Thread overview: 24+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-09-25 12:32 [BUG][PPC64] BUG in 2.6.26.5-rt9 causing Hang Chirag Jog 2008-09-29 18:13 ` Gregory Haskins 2008-09-29 21:18 ` Gregory Haskins 2008-09-29 21:34 ` Gregory Haskins 2008-09-29 22:00 ` Gregory Haskins 2008-09-30 4:43 ` Chirag Jog 2008-09-30 6:47 ` Gilles Carry 2008-10-01 14:22 ` [PATCH] sched: add a stacktrace on enqueue_pushable error Gregory Haskins 2008-10-02 9:42 ` Gilles Carry 2008-10-02 11:18 ` [BUG][PPC64] BUG in 2.6.26.5-rt9 causing Hang Gilles Carry 2008-10-03 12:42 ` [RT PATCH 0/2] fix for BUG_ON crash in 26.5-rt9 Gregory Haskins 2008-10-03 12:43 ` [PATCH 1/2] RT: Remove comment that is no longer true Gregory Haskins 2008-10-03 12:43 ` [PATCH 2/2] RT: remove "paranoid" limit in push_rt_task Gregory Haskins 2008-10-03 13:46 ` Gilles Carry 2008-10-03 15:45 ` Chirag Jog 2008-10-03 17:27 ` Gregory Haskins 2008-10-03 17:26 ` [RT PATCH v2 0/2] Series short description Gregory Haskins 2008-10-03 17:26 ` [RT PATCH v2 1/2] RT: Remove comment that is no longer true Gregory Haskins 2008-10-03 17:26 ` [RT PATCH v2 2/2] RT: remove "paranoid" limit in push_rt_task Gregory Haskins 2008-10-03 12:54 ` [RT PATCH 0/2] fix for BUG_ON crash in 26.5-rt9 Gregory Haskins 2008-10-06 15:14 ` [RT PATCH v3 0/2] Fix for "[BUG][PPC64] BUG in 2.6.26.5-rt9 causing Hang" Gregory Haskins 2008-10-06 15:14 ` [RT PATCH v3 1/2] RT: Remove comment that is no longer true Gregory Haskins 2008-10-06 15:14 ` [RT PATCH v3 2/2] RT: fix push_rt_task() to handle dequeue_pushable properly Gregory Haskins 2008-10-07 6:04 ` [RT PATCH v3 0/2] Fix for "[BUG][PPC64] BUG in 2.6.26.5-rt9 causing Hang" Gilles Carry
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).