* [uml-devel] Diagnosed and repeatable kernel mode panic in schedule() for 2.6!
@ 2004-02-15 18:50 BlaisorBlade
  2004-02-16  9:53 ` [uml-devel] " Ingo Molnar
  0 siblings, 1 reply; 13+ messages in thread
From: BlaisorBlade @ 2004-02-15 18:50 UTC
  To: user-mode-linux-devel; +Cc: Ingo Molnar, Jeff Dike

[-- Attachment #1: Type: text/plain, Size: 5632 bytes --]

While searching the archives, I noticed that this command:

while /bin/true ; do /bin/true ; done

caused some problems (a task_struct leak and OOM) some time ago (at least
until 2.6.0-test2, more or less). See "Re: [uml-devel] oops and memory leak
with uml-patch-2.5.67-1".

I've rerun this command under 2.6.2 with my patch collection (but with the
/proc/meminfo bug, i.e. MemTotal = 0) on a 2.4.24-skas host (which contains
Ingo Molnar's fixlet for LDT loading) and got this instead:

Kernel panic: Kernel mode fault at addr 0x1f, ip 0x400d1397
Kernel panic: kernel BUG at kernel/exit.c:793!

And then the kernel exited (even though it tried to trigger some other
panics; see the end of the attached output). If anyone is able to reproduce
the bug, they can go straight to debugging without reading the rest of this
message. I don't think I can send you the binary, since I have a 56k modem
and it is 6.7 MB even compressed (stripping the debug symbols would make it
useless for debugging). However, I could try stripping down the config and
similar things if you absolutely can't reproduce it.

The second message is interesting because the code at the BUG line reads:
  schedule();
  BUG();

which means schedule() returned in a place where it never should. The whole
report is odd, though: the kernel continues to run after a panic call
because the call suddenly exited without a return (search this message for
"sudden"), the runqueue data is inconsistent (maybe because of the memory
leak in the message cited above), and we don't know where the "array" value
comes from (a compiler bug? An error in the debugging info? A reused
variable? Do you need the gdb disassembly of schedule()?). Also, since the
problems appear in schedule(), this could be related to the hang after the
"NET: Registered protocol family 2" message, since that comes from a
non-working wait queue.

Note: at that moment about 10 to 20 processes were running.

I was able to reproduce both under gdb, and I've attached the whole
debugging session output. Here I summarize it, since that output is far too
long. This is the offending code, inside schedule():

        idx = sched_find_first_bit(array->bitmap);
        queue = array->queue + idx;
        next = list_entry(queue->next, task_t, run_list);

        if (next->activated > 0) { //THIS IS THE LINE!

With gdb, I got this:
(gdb) where
#0  panic (fmt=0xa018f8e0 "Kernel mode fault at addr 0x%lx, ip 0x%lx") at 
include/asm/thread_info.h:49
#1  0xa0018ac8 in segv (address=31, ip=1074598807, is_write=0, 
is_user=1074598807, sc=0xa1e59148)
    at arch/um/kernel/trap_kern.c:167
#2  0xa0018e97 in segv_handler (sig=11, regs=0xa1e59148) at 
arch/um/kernel/trap_user.c:67
#3  0xa001e573 in sig_handler_common_skas (sig=11, sc_ptr=0x58) at 
arch/um/kernel/skas/trap_user.c:33
#4  0xa0018fa0 in sig_handler (sig=0, sc=
      {gs = 0, __gsh = 0, fs = 0, __fsh = 0, es = 43, __esh = 0, ds = 43, 
__dsh = 0, edi = 2694560988, esi = 2716176348, ebp = 2694560956, esp = 
2694560884, ebx = 3, edx = 2687144876, ecx = 4294967267, eax = 140, trapno = 
14, err = 4, eip = 2684545927, cs = 35, __csh = 0, eflags = 66179, 
esp_at_signal = 2694560884, ss = 43, __ssh = 0, fpstate = 0x0, oldmask = 
369106944, cr2 = 31})
    at arch/um/kernel/trap_user.c:103
#5  <signal handler called>
#6  schedule () at kernel/sched.c:1677
#7  0xa0014584 in interrupt_end () at arch/um/kernel/process_kern.c:138
#8  0xa001d498 in userspace (regs=0xa1e59148) at 
arch/um/kernel/skas/process.c:174
#9  0xa001dd54 in fork_handler (sig=10) at 
arch/um/kernel/skas/process_kern.c:103
#10 <signal handler called>
#11 0xa01559dd in syscall () at include/linux/slab.h:92
#12 0xa002cd76 in os_usr1_process (pid=26614) at arch/um/os-Linux/process.c:96
#13 0xa001d53f in new_thread (stack=Cannot access memory at address 0x8
) at arch/um/kernel/skas/process.c:197
Previous frame inner to this frame (corrupt stack?)

(gdb) print next
$2 = (task_t *) 0xffffffe3
(gdb) print &next->activated
$3 = (int *) 0x1f (the address in the fault message)

(gdb) print array
$4 = (prio_array_t *) 0x3

(gdb) print rq
No symbol "rq" in current context.

Contents of this_rq are in the attached debug session log.
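
The two pointer values printed above are consistent with each other once
addresses are taken modulo 2^32. As a rough check (a sketch, not part of the
original gdb session), the helpers below redo that arithmetic. The function
names are made up for illustration, and the 32-byte run_list offset used in
the second check is an assumption borrowed from a later message in this
thread, which concerns a different build.

```c
#include <stdint.h>

/* Offset of a member inside a structure, given the structure's base address
 * and the member's address, computed modulo 2^32 as on a 32-bit host. */
uint32_t member_offset(uint32_t base, uint32_t member_addr)
{
    return member_addr - base;
}

/* Inverse of list_entry(): the list pointer that produces `entry` once
 * `list_offset` (an assumed value, see above) is subtracted from it. */
uint32_t list_ptr_for(uint32_t entry, uint32_t list_offset)
{
    return entry + list_offset;
}
```

With the gdb values, member_offset(0xffffffe3, 0x1f) is 0x3c, so "activated"
would sit 60 bytes into task_t on this build. If the run_list offset is 32
here as well, list_ptr_for(0xffffffe3, 32) is 0x3, the same value gdb prints
for array; whether that is more than a coincidence is not established here.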

Also, in the panic call, I discovered that current->pid was 99, which is the
parent of the /bin/true processes:

root [~: slack90: 6 (0)] # ps -ef
...
root        99     1  0  1997 tty1     00:00:01 -bash
root       100     1  0  1997 tty6     00:00:00 -bash
root       383    99  0  1997 tty1     00:00:00 /bin/true

About array = 0x3, I would point to someone else getting the same problem:
"[uml-devel] wait queues broken?" (but there, array was NULL).

While stepping through the panic call, all of a sudden (maybe after
local_irq_enable, which on UML is unblock_signals()) I got to this point,
and then to the BUG() call above (missing details are in the attached
output):

(gdb) next
69                      sys_sync();
(gdb) next
thread_wait (sw=0xa001d578, fb=0xa09bb554) at 
arch/um/kernel/skas/process.c:210
210     }
(gdb) where
#0  thread_wait (sw=0xa001d578, fb=0xa09bb554) at 
arch/um/kernel/skas/process.c:210
#1  0xa001dd07 in fork_handler (sig=10) at 
arch/um/kernel/skas/process_kern.c:94
#2  <signal handler called>
#3  0xa01559dd in syscall () at include/linux/slab.h:92
Previous frame inner to this frame (corrupt stack?)

So it seems that running notifier_chan_list called some function that
slept, and that the resulting calls to schedule() panicked and then tended
to loop infinitely (at least until I debugged it; when I finally did
"continue", the program exited).

Hope you have fun figuring it out!
-- 
Paolo Giarrusso, aka Blaisorblade
Linux registered user n. 292729







[-- Attachment #2: debugOutput.bz2 --]
[-- Type: application/x-bzip2, Size: 12034 bytes --]


* [uml-devel] Re: Diagnosed and repeatable kernel mode panic in schedule() for 2.6!
  2004-02-15 18:50 [uml-devel] Diagnosed and repeatable kernel mode panic in schedule() for 2.6! BlaisorBlade
@ 2004-02-16  9:53 ` Ingo Molnar
  2004-02-16 19:25   ` BlaisorBlade
  2004-02-17 15:56   ` BlaisorBlade
  0 siblings, 2 replies; 13+ messages in thread
From: Ingo Molnar @ 2004-02-16  9:53 UTC
  To: BlaisorBlade; +Cc: user-mode-linux-devel, Jeff Dike


* BlaisorBlade <blaisorblade_spam@yahoo.it> wrote:

> While searching the archives, I noticed that this command:
> 
> while /bin/true ; do /bin/true ; done

> Kernel panic: Kernel mode fault at addr 0x1f, ip 0x400d1397
> Kernel panic: kernel BUG at kernel/exit.c:793!

can you reproduce this crash even if you boot up UML via init=/bin/bash?

	Ingo


-------------------------------------------------------
SF.Net is sponsored by: Speed Start Your Linux Apps Now.
Build and deploy apps & Web services for Linux with
a free DVD software kit from IBM. Click Now!
http://ads.osdn.com/?ad_id=1356&alloc_id=3438&op=click
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel


* Re: [uml-devel] Re: Diagnosed and repeatable kernel mode panic in schedule() for 2.6!
  2004-02-16  9:53 ` [uml-devel] " Ingo Molnar
@ 2004-02-16 19:25   ` BlaisorBlade
  2004-02-16 19:27     ` Ingo Molnar
  2004-02-17  4:46     ` Jeff Dike
  2004-02-17 15:56   ` BlaisorBlade
  1 sibling, 2 replies; 13+ messages in thread
From: BlaisorBlade @ 2004-02-16 19:25 UTC
  To: user-mode-linux-devel; +Cc: Ingo Molnar

On Monday 16 February 2004 at 10:53, Ingo Molnar wrote:
> * BlaisorBlade <blaisorblade_spam@yahoo.it> wrote:
> > While searching the archives, I noticed that this command:
> >
> > while /bin/true ; do /bin/true ; done
> >
> > Kernel panic: Kernel mode fault at addr 0x1f, ip 0x400d1397
> > Kernel panic: kernel BUG at kernel/exit.c:793!
>
> can you reproduce this crash even if you boot up UML via init=/bin/bash?
Yes (test with the same binary):
bash-2.05b# while /bin/true; do /bin/true; done
Kernel panic: Kernel mode fault at addr 0x1f, ip 0x400d1397

However, I discovered that an old 2.6.0 kernel works fine. I think it had
the 2.6.0-test9 patch, so I suspect this is related to the "NET: Registered
..." hang, which was solved by the same patch. That hang does not happen in
other compilation environments, but I think that is down to chance: a stray
pointer causing failures in different places.
-- 
Paolo Giarrusso, aka Blaisorblade
Linux registered user n. 292729




* Re: [uml-devel] Re: Diagnosed and repeatable kernel mode panic in schedule() for 2.6!
  2004-02-16 19:25   ` BlaisorBlade
@ 2004-02-16 19:27     ` Ingo Molnar
  2004-02-17  0:52       ` Jeff Dike
  2004-02-17  4:46     ` Jeff Dike
  1 sibling, 1 reply; 13+ messages in thread
From: Ingo Molnar @ 2004-02-16 19:27 UTC
  To: BlaisorBlade; +Cc: user-mode-linux-devel


* BlaisorBlade <blaisorblade_spam@yahoo.it> wrote:

> However, I discovered that an old 2.6.0 kernel works well. I think
> that had 2.6.0-test9 patch; so I think that this is related with the
> NET: Registered ... hang, which was solved with the same patch. That
> does not happen on different compilation env.s, but I think that is
> random: a crazy pointer failing in different places.

i cannot reproduce using 2.6.2-rc3-mm1, so it's likely some recent
breakage.

	Ingo



* Re: [uml-devel] Re: Diagnosed and repeatable kernel mode panic in schedule() for 2.6!
  2004-02-16 19:27     ` Ingo Molnar
@ 2004-02-17  0:52       ` Jeff Dike
  0 siblings, 0 replies; 13+ messages in thread
From: Jeff Dike @ 2004-02-17  0:52 UTC
  To: Ingo Molnar; +Cc: BlaisorBlade, user-mode-linux-devel

mingo@elte.hu said:
> i cannot reproduce using 2.6.2-rc3-mm1, so it's likely some recent
> breakage. 

I can.  At least the original panic that BlaisorBlade found.  The stack shows
schedule() recursing through switch_to because of a signal arriving at the
wrong time, and the interrupt handler calling schedule():

#6  schedule () at kernel/sched.c:1680
#7  0xa00154f9 in interrupt_end () at arch/um/kernel/process_kern.c:138
#8  0xa001f5aa in userspace (regs=0xa03233b8)
    at arch/um/kernel/skas/process.c:174
#9  0xa001fd9a in fork_handler (sig=10)
    at arch/um/kernel/skas/process_kern.c:103
#10 <signal handler called>
#11 0xa016974d in syscall ()
#12 0x00000001 in ?? ()
#13 0xa7fd388c in ?? ()
#14 0xa7fd395c in ?? ()
#15 0xa0015351 in external_pid (t=0xa0323080)
    at arch/um/kernel/process_kern.c:69
#16 0xa0015487 in set_current (t=0xa0323080)
    at arch/um/kernel/process_kern.c:126
#17 0xa001fb69 in switch_to_skas (prev=0xa0a97740, next=0xa0323080)
    at arch/um/kernel/skas/process_kern.c:40
#18 0xa00154c7 in _switch_to (prev=0xa0a97740, next=0xa0323080, 
    last=0xa0a97740) at arch/um/kernel/process_kern.c:132
#19 0xa0030845 in schedule () at kernel/sched.c:919

So, this seems very wrong.  It's not obvious to me offhand what the correct
fix is, since the scheduler runs with interrupts enabled.  Time to look at what
x86 does, I guess.

				Jeff




* Re: [uml-devel] Re: Diagnosed and repeatable kernel mode panic in schedule() for 2.6!
  2004-02-16 19:25   ` BlaisorBlade
  2004-02-16 19:27     ` Ingo Molnar
@ 2004-02-17  4:46     ` Jeff Dike
  2004-02-17  7:18       ` William Stearns
  2004-02-19 19:01       ` BlaisorBlade
  1 sibling, 2 replies; 13+ messages in thread
From: Jeff Dike @ 2004-02-17  4:46 UTC
  To: BlaisorBlade; +Cc: user-mode-linux-devel, Ingo Molnar

blaisorblade_spam@yahoo.it said:
> while /bin/true ; do /bin/true ; done 

This turned out to be slightly subtle.  It was exposed by the 
longjmp -> siglongjmp change.  New processes were created with signals 
enabled, so siglongjmp to a new process context re-enabled signals in the
middle of the context switch.

Try the patch below - it seems to fix things for me.

				Jeff


--- 1.9/arch/um/kernel/skas/process_kern.c      Thu Jan  8 07:43:01 2004
+++ edited/arch/um/kernel/skas/process_kern.c   Sun Feb 15 10:42:31 2004
@@ -76,6 +76,7 @@
 
 void new_thread_proc(void *stack, void (*handler)(int sig))
 {
+       local_irq_disable();
        init_new_thread_stack(stack, handler);
        os_usr1_process(os_getpid());
 }
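
The effect described in this message can be reproduced outside UML with
plain POSIX calls. What follows is a user-space sketch, not UML code:
blocked_after_jump() is a hypothetical helper, SIGUSR1 merely stands in for
the signals UML treats as interrupts, and the savemask flag models the
difference between a jump that leaves the signal mask alone and a siglongjmp
to a context that was saved with signals enabled.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

/* Returns 1 if SIGUSR1 is currently blocked, 0 if not, -1 on error. */
static int usr1_blocked(void)
{
    sigset_t cur;

    if (sigprocmask(SIG_BLOCK, NULL, &cur) != 0)
        return -1;
    return sigismember(&cur, SIGUSR1);
}

/* Save a context while SIGUSR1 is unblocked ("interrupts enabled"), block
 * the signal, then jump back to the saved context. The return value tells
 * whether SIGUSR1 is still blocked after the jump. */
int blocked_after_jump(int savemask)
{
    sigjmp_buf env;
    sigset_t usr1, old;
    int result;

    sigemptyset(&usr1);
    sigaddset(&usr1, SIGUSR1);
    sigprocmask(SIG_UNBLOCK, &usr1, &old);    /* context starts "enabled" */

    if (sigsetjmp(env, savemask) == 0) {
        sigprocmask(SIG_BLOCK, &usr1, NULL);  /* "disable" before switching */
        siglongjmp(env, 1);                   /* the "context switch" */
    }

    result = usr1_blocked();
    sigprocmask(SIG_SETMASK, &old, NULL);     /* restore the caller's mask */
    return result;
}
```

blocked_after_jump(1) returns 0: the jump restores the mask saved with the
context, so the signal is unblocked again even though it was blocked just
before the jump. blocked_after_jump(0) returns 1, because the mask is left
as it was at the time of the jump.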




* Re: [uml-devel] Re: Diagnosed and repeatable kernel mode panic in schedule() for 2.6!
  2004-02-17  4:46     ` Jeff Dike
@ 2004-02-17  7:18       ` William Stearns
  2004-02-19 19:01       ` BlaisorBlade
  1 sibling, 0 replies; 13+ messages in thread
From: William Stearns @ 2004-02-17  7:18 UTC
  To: Jeff Dike; +Cc: BlaisorBlade, ML-uml-devel, Ingo Molnar

Good morning, Jeff,

On Mon, 16 Feb 2004, Jeff Dike wrote:

> blaisorblade_spam@yahoo.it said:
> > while /bin/true ; do /bin/true ; done 
> 
> This turned out to be slightly subtle.  It was exposed by the 
> longjmp -> siglongjmp change.  New processes were created with signals 
> enabled, so siglongjmp to a new process context re-enabled signals in the
> middle of the context switch.
> 
> Try the patch below - it seems to fix things for me.
> 
> --- 1.9/arch/um/kernel/skas/process_kern.c      Thu Jan  8 07:43:01 2004
> +++ edited/arch/um/kernel/skas/process_kern.c   Sun Feb 15 10:42:31 2004
> @@ -76,6 +76,7 @@
>  
>  void new_thread_proc(void *stack, void (*handler)(int sig))
>  {
> +       local_irq_disable();
>         init_new_thread_stack(stack, handler);
>         os_usr1_process(os_getpid());
>  }

	I've put this up for the whoopis and cache vm's on Zaphod.  We'll 
see how things look in the morning.
	Thanks again for the help, Jeff.
	Cheers,
	- Bill

---------------------------------------------------------------------------
	"Nynex.  Iroquois for Moron"
	-- A well-known Linux kernel hacker.
--------------------------------------------------------------------------
William Stearns (wstearns@pobox.com).  Mason, Buildkernel, freedups, p0f,
rsync-backup, ssh-keyinstall, dns-check, more at:   http://www.stearns.org
--------------------------------------------------------------------------




* Re: [uml-devel] Re: Diagnosed and repeatable kernel mode panic in schedule() for 2.6!
  2004-02-16  9:53 ` [uml-devel] " Ingo Molnar
  2004-02-16 19:25   ` BlaisorBlade
@ 2004-02-17 15:56   ` BlaisorBlade
  2004-02-18 21:15     ` Jeff Dike
  1 sibling, 1 reply; 13+ messages in thread
From: BlaisorBlade @ 2004-02-17 15:56 UTC
  To: user-mode-linux-devel

On Monday 16 February 2004 at 10:53, Ingo Molnar wrote:
> * BlaisorBlade <blaisorblade_spam@yahoo.it> wrote:
> > While searching the archives, I noticed that this command:
> >
> > while /bin/true ; do /bin/true ; done
> >
> > Kernel panic: Kernel mode fault at addr 0x1f, ip 0x400d1397
> > Kernel panic: kernel BUG at kernel/exit.c:793!
>
> can you reproduce this crash even if you boot up UML via init=/bin/bash?

Now, with 2.6.3-rc2-1um, I get the message below (the command was "while cat
/dev/null; do cat /dev/null; done", both with the normal init and with
init=/bin/sh). Note that I did not press SysRq, nor did I trigger it through
mconsole; it was done by UML itself (panic_exit...). However, as far as I
remember, last time the notifier chain was corrupted.

The decoding of part of the "Call Trace" follows. (Is it normal that most of
the entries are badly off, i.e. that they are just values found on the
stack rather than actual calls?)
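
One possible explanation for stale entries, offered as an assumption about
this build rather than a confirmed fact: kernels of this era commonly print
a Call Trace by scanning the stack and reporting every word that looks like
a kernel text address, so return addresses left behind by earlier calls show
up next to live ones. A minimal sketch of that kind of scan follows;
scan_stack() and its bounds parameters are illustrative names, not kernel
functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy into out[] every word of stack[] that lies in [text_start, text_end).
 * Returns the number of hits stored, which is at most max_out. */
size_t scan_stack(const uint32_t *stack, size_t nwords,
                  uint32_t text_start, uint32_t text_end,
                  uint32_t *out, size_t max_out)
{
    size_t hits = 0;

    for (size_t i = 0; i < nwords && hits < max_out; i++) {
        /* No frame information is consulted: any plausible value counts. */
        if (stack[i] >= text_start && stack[i] < text_end)
            out[hits++] = stack[i];
    }
    return hits;
}
```

A scan like this cannot tell a live return address from a leftover one,
which is why decoding each address with "info line" can produce a sequence
that does not form a sensible call chain.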

Kernel panic: Kernel mode fault at addr 0x94, ip 0x40059923
Kernel panic: kernel BUG at kernel/exit.c:793!

 <6>SysRq : Show Regs

EIP: 0023:[<4005b476>] CPU: 0 Not tainted ESP: 002b:bffffd44 EFLAGS: 00000286
    Not tainted
EAX: ffffffda EBX: 00000000 ECX: 00000000 EDX: 000018c0
ESI: ffffffff EDI: 00000000 EBP: 00000000 DS: 002b ES: 002b
Call Trace: [<a00c4021>] [<a001991f>] [<a00836ed>] [<a0040186>] [<a003240d>]
   [<a0032498>] [<a003492c>] [<a0026000>] [<a0034ba4>] [<a001e298>] [<a0017b7e>]
   [<a001e30d>] [<a001d445>] [<a001d723>] [<a001dfd4>] [<a001dfbc>] [<a012de98>]
   [<a001df1b>] [<a0149cdd>]

(gdb) info line *0xa00c4021
Line 327 of "drivers/char/sysrq.c" starts at address 0xa00c4021 
<handle_sysrq+49>
   and ends at 0xa00c4030 <__handle_sysrq_nolock>.
(gdb) info line *0xa001991f
Line 398 of "arch/um/kernel/um_arch.c" starts at address 0xa001991f 
<panic_exit+31> and ends at 0xa0019924 <panic_exit+36>.

(gdb) info line *0xa00836ed
Line 467 of "fs/fs-writeback.c" starts at address 0xa00836db <sync_inodes+75> 
and ends at 0xa00836f3 <sync_inodes+99>.
(gdb) info line *0xa0040186
Line 169 of "kernel/sys.c" starts at address 0xa0040180 
<notifier_call_chain+32>
   and ends at 0xa0040188 <notifier_call_chain+40>.
(gdb) info line *0xa003240d
Line 78 of "kernel/panic.c" starts at address 0xa003240d <panic+125> and ends 
at 0xa0032419 <panic+137>.
(gdb) info line *0xa0032498
Line 69 of "kernel/panic.c" starts at address 0xa0032493 <panic+259> and ends 
at 0xa00324a0 <print_tainted>.
(gdb) info line *0xa0018c19
Line 154 of "arch/um/kernel/trap_kern.c" starts at address 0xa0018c19 
<segv+121> and ends at 0xa0018c1c <segv+124>.
(gdb) info line *0xa0018d28
Line 161 of "arch/um/kernel/trap_kern.c" starts at address 0xa0018d28 
<segv+392> and ends at 0xa0018d3f <segv+415>.
(gdb) info line *0xa012e19a
Line 92 of "include/linux/slab.h" starts at address 0xa011fe16 
<unix_seq_open+22> and ends at 0xa01c3632 <af_unix_exit+18>.
(gdb) info line *0xa012e19a
Line 92 of "include/linux/slab.h" starts at address 0xa011fe16 
<unix_seq_open+22> and ends at 0xa01c3632 <af_unix_exit+18>.
(gdb) info line *0xa0016f1d
Line 69 of "arch/um/kernel/signal_user.c" starts at address 0xa0016f0e 
<change_signals+62>
   and ends at 0xa0016f24 <change_signals+84>.
(gdb) info line *0xa00190f7
Line 67 of "arch/um/kernel/trap_user.c" starts at address 0xa00190c2 
<segv_handler+306> and ends at 0xa00191b0 <usr2_handler>.
(gdb) info line *0xa001905f
Line 64 of "arch/um/kernel/trap_user.c" starts at address 0xa001905f 
<segv_handler+207>
   and ends at 0xa0019065 <segv_handler+213>.
(gdb) info line *0xa001e7f3
Line 35 of "arch/um/kernel/skas/trap_user.c" starts at address 0xa001e7f3 
<sig_handler_common_skas+115>
   and ends at 0xa001e7f8 <sig_handler_common_skas+120>.
(gdb) info line *0xa0019200
Line 103 of "arch/um/kernel/trap_user.c" starts at address 0xa00191f0 
<sig_handler+32> and ends at 0xa0019210 <alarm_handler>.
(gdb) info line *0xa012de98
Line 92 of "include/linux/slab.h" starts at address 0xa011fe16 
<unix_seq_open+22> and ends at 0xa01c3632 <af_unix_exit+18>.
(gdb) info line *0xa002ee65
Line 1680 of "kernel/sched.c" starts at address 0xa002ee65 <schedule+453> and 
ends at 0xa002ee70 <schedule+464>.
(gdb) info line *0xa012e19a
Line 92 of "include/linux/slab.h" starts at address 0xa011fe16 
<unix_seq_open+22> and ends at 0xa01c3632 <af_unix_exit+18>.
(gdb) info line *0xa00170a4
Line 127 of "arch/um/kernel/signal_user.c" starts at address 0xa0017099 
<set_signals+105>
   and ends at 0xa00170ab <set_signals+123>.
(gdb) info line *0xa002e55a
Line 301 of "kernel/sched.c" starts at address 0xa002e555 
<wake_up_forked_process+357>
   and ends at 0xa002e55d <wake_up_forked_process+365>.
(gdb) info line *0xa002e408
Line 780 of "kernel/sched.c" starts at address 0xa002e408 
<wake_up_forked_process+24>
   and ends at 0xa002e412 <wake_up_forked_process+34>.
(gdb) info line *0xa0031e4f
Line 1138 of "kernel/fork.c" starts at address 0xa0031e4f <do_fork+63> and 
ends at 0xa0031e52 <do_fork+66>.
(gdb) info line *0xa0031f47
Line 1157 of "kernel/fork.c" starts at address 0xa0031f40 <do_fork+304> and 
ends at 0xa0031f4c <do_fork+316>.
(gdb) info line *0xa0018a84
Line 90 of "arch/um/kernel/trap_kern.c" starts at address 0xa0018a7b 
<handle_page_fault+331>
   and ends at 0xa0018a8c <handle_page_fault+348>.
(gdb) info line *0xa001898c
Line 93 of "arch/um/kernel/trap_kern.c" starts at address 0xa001898c 
<handle_page_fault+92>
   and ends at 0xa0018997 <handle_page_fault+103>.

-- 
Paolo Giarrusso, aka Blaisorblade
Linux registered user n. 292729





* Re: [uml-devel] Re: Diagnosed and repeatable kernel mode panic in schedule() for 2.6!
  2004-02-17 15:56   ` BlaisorBlade
@ 2004-02-18 21:15     ` Jeff Dike
  2004-02-19 18:15       ` BlaisorBlade
  0 siblings, 1 reply; 13+ messages in thread
From: Jeff Dike @ 2004-02-18 21:15 UTC
  To: BlaisorBlade; +Cc: user-mode-linux-devel

blaisorblade_spam@yahoo.it said:
> Now, with 2.6.3-rc2-1um, I get this message (the command was while cat
>  /dev/null; do cat /dev/null; done; both with normal init and init=/
> bin/sh).

Is this with the patch I posted a couple of days ago?

				Jeff




* Re: [uml-devel] Re: Diagnosed and repeatable kernel mode panic in schedule() for 2.6!
  2004-02-18 21:15     ` Jeff Dike
@ 2004-02-19 18:15       ` BlaisorBlade
  0 siblings, 0 replies; 13+ messages in thread
From: BlaisorBlade @ 2004-02-19 18:15 UTC
  To: user-mode-linux-devel

On Wednesday 18 February 2004 at 22:15, Jeff Dike wrote:
> blaisorblade_spam@yahoo.it said:
> > Now, with 2.6.3-rc2-1um, I get this message (the command was while cat
> >  /dev/null; do cat /dev/null; done; both with normal init and init=/
> > bin/sh).
>
> Is this with the patch I posted a couple of days ago?

No, it was not. However, I'm going to post a report with that patch applied
(I'm getting the backtrace and examining the data).

-- 
Paolo Giarrusso, aka Blaisorblade
Linux registered user n. 292729




* Re: [uml-devel] Re: Diagnosed and repeatable kernel mode panic in schedule() for 2.6!
  2004-02-17  4:46     ` Jeff Dike
  2004-02-17  7:18       ` William Stearns
@ 2004-02-19 19:01       ` BlaisorBlade
  2004-02-20  2:06         ` Jeff Dike
  1 sibling, 1 reply; 13+ messages in thread
From: BlaisorBlade @ 2004-02-19 19:01 UTC
  To: Jeff Dike; +Cc: user-mode-linux-devel, Ingo Molnar

[-- Attachment #1: Type: text/plain, Size: 2995 bytes --]

On Tuesday 17 February 2004 at 05:46, Jeff Dike wrote:
> blaisorblade_spam@yahoo.it said:
> > while /bin/true ; do /bin/true ; done
>
> This turned out to be slightly subtle.  It was exposed by the
> longjmp -> siglongjmp change.  New processes were created with signals
> enabled, so siglongjmp to a new process context re-enabled signals in the
> middle of the context switch.

It is probably a needed fix, but it is not enough on its own. I can still
panic the kernel in exactly the same way.

By the way, would you check whether this is needed in the 2.4 kernel, too?
I've checked the 2.4.24-1 UML patch and it seems to be missing that line
(and your reasoning seems to apply to 2.4 as well).

The only change I can report is that, while with 2.6.2 schedule() recursed
and panic was called twice, with 2.6.3-rc2 this no longer happened (for some
random reason I got only one panic), and with this patch applied I still get
one panic. What is stranger is that the panic happens deterministically, and
in the same situation (same root fs and same kernel) I get it at the same
addresses! Since what happens seems to change with every kernel, but is
easily repeatable, I would say that it is not a race condition, but rather
a stray pointer somewhere outside the core code. However, it cannot be an
uninitialized pointer; it must be one initialized with less space than
needed. I'm attaching my .config (but note that I have no modules loaded
when I run the test, so there is no need to check the modules).

Also, I have a backtrace plus some inspection of the runqueue; I'm putting
those in the attached file. It is much shorter than last time. Note that I
checked, in the expired array, the first process that was executing (the -32
is the offset between the list_head and the start of the task_struct):

(gdb) print ((struct task_struct*)((void 
*)per_cpu__runqueues.expired.queue[125].next-32))

$19 = (struct task_struct *) 0xa08e4080

I dereferenced it (and it seemed valid), and I also dereferenced the
"real_parent" task_struct. I also checked the number of processes in the two
arrays (active and expired), and in the active one there were -1 processes
(which is invalid, right Ingo?). So it seems that both active.queue[140] and
active.nr_active are corrupted. Those fields are next to one another, so
it looks like genuine memory corruption.

However, even in expired.queue[125] there is a problem: the list contains
one list_head (the one at 0xa08e40a0) whose next and prev pointers point to
itself (i.e. it is an empty list), yet this "empty list" can be reached from
expired.queue[125].

Note: to find which index to give to queue when examining the runqueue, take
a look at sched_find_first_bit in asm-i386 and at the code in schedule()
that invokes it. The queue contains the "R" (runnable) processes, organized
in one list_head per priority level.
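
To make the lookup described in the note concrete, here is a simplified
user-space model of a priority array with one list per level. It is a sketch
under stated assumptions, not the kernel's code: struct task, array_init(),
enqueue(), first_set() and pick_next() are hypothetical names, the bitmap is
one byte per level for readability, and the sanity checks in pick_next() are
additions made for this example.

```c
#include <stddef.h>

#define MAX_PRIO 140  /* priority levels; first_set() returns this for "none" */

struct list_head { struct list_head *next, *prev; };

struct task {                 /* hypothetical stand-in for task_struct */
    int pid;
    struct list_head run_list;
};

struct prio_array {           /* simplified model, not the kernel layout */
    int nr_active;
    unsigned char bitmap[MAX_PRIO];    /* one flag per level, for clarity */
    struct list_head queue[MAX_PRIO];  /* one list per priority level */
};

/* Step back from an embedded list_head to the structure containing it. */
#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

void array_init(struct prio_array *a)
{
    a->nr_active = 0;
    for (int i = 0; i < MAX_PRIO; i++) {
        a->bitmap[i] = 0;
        /* An empty list is a head whose pointers refer to itself. */
        a->queue[i].next = a->queue[i].prev = &a->queue[i];
    }
}

/* Append t to the list for priority prio (0 <= prio < MAX_PRIO). */
void enqueue(struct prio_array *a, struct task *t, int prio)
{
    struct list_head *head = &a->queue[prio];

    t->run_list.next = head;
    t->run_list.prev = head->prev;
    head->prev->next = &t->run_list;
    head->prev = &t->run_list;
    a->bitmap[prio] = 1;
    a->nr_active++;
}

/* Lowest set index (highest priority), or MAX_PRIO if nothing is queued. */
int first_set(const struct prio_array *a)
{
    int i = 0;

    while (i < MAX_PRIO && !a->bitmap[i])
        i++;
    return i;
}

/* Return the next task to run, or NULL if the array is empty. */
struct task *pick_next(struct prio_array *a)
{
    int idx = first_set(a);
    struct list_head *queue;

    if (idx == MAX_PRIO)
        return NULL;
    queue = &a->queue[idx];
    if (queue->next == queue)   /* flag set but list empty: inconsistent */
        return NULL;
    return list_entry(queue->next, struct task, run_list);
}
```

The schedule() snippet quoted earlier in this thread shows no such checks
between the lookup and the dereference, so in a path like that a bad array
pointer, an out-of-range index, or a self-pointing list head would turn
directly into a wild task pointer.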
-- 
Paolo Giarrusso, aka Blaisorblade
Linux registered user n. 292729

[-- Attachment #2: MyUmlConfig --]
[-- Type: text/plain, Size: 7808 bytes --]

#
# Automatically generated make config: don't edit
#
CONFIG_USERMODE=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_RWSEM_GENERIC_SPINLOCK=y

#
# UML-specific options
#
CONFIG_MODE_TT=y
CONFIG_MODE_SKAS=y
CONFIG_NET=y
CONFIG_BINFMT_ELF=y
# CONFIG_BINFMT_MISC is not set
CONFIG_HOSTFS=m
# CONFIG_HPPFS is not set
CONFIG_MCONSOLE=y
CONFIG_MAGIC_SYSRQ=y
# CONFIG_HOST_2G_2G is not set
# CONFIG_UML_SMP is not set
# CONFIG_SMP is not set
CONFIG_NEST_LEVEL=0
CONFIG_KERNEL_HALF_GIGS=1
# CONFIG_HIGHMEM is not set
CONFIG_PROC_MM=y
CONFIG_KERNEL_STACK_ORDER=2
# CONFIG_UML_REAL_TIME_CLOCK is not set

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_CLEAN_COMPILE=y
CONFIG_STANDALONE=y
CONFIG_BROKEN_ON_SMP=y

#
# General setup
#
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_SYSCTL=y
CONFIG_LOG_BUF_SHIFT=14
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# CONFIG_EMBEDDED is not set
CONFIG_KALLSYMS=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_OBSOLETE_MODPARM=y
# CONFIG_MODVERSIONS is not set
CONFIG_KMOD=y

#
# Generic Driver Options
#

#
# Character Devices
#
CONFIG_STDIO_CONSOLE=y
CONFIG_SSL=y
CONFIG_FD_CHAN=y
CONFIG_NULL_CHAN=y
CONFIG_PORT_CHAN=y
CONFIG_PTY_CHAN=y
CONFIG_TTY_CHAN=y
CONFIG_XTERM_CHAN=y
CONFIG_CON_ZERO_CHAN="fd:0,fd:1"
CONFIG_CON_CHAN="xterm"
CONFIG_SSL_CHAN="pty"
CONFIG_UNIX98_PTYS=y
CONFIG_UNIX98_PTY_COUNT=256
# CONFIG_WATCHDOG is not set
CONFIG_UML_SOUND=m
CONFIG_SOUND=m
CONFIG_HOSTAUDIO=m

#
# Block Devices
#
CONFIG_BLK_DEV_UBD=y
CONFIG_BLK_DEV_UBD_SYNC=y
CONFIG_BLK_DEV_COW_COMMON=y
CONFIG_BLK_DEV_LOOP=m
# CONFIG_BLK_DEV_NBD is not set
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_SIZE=4096
# CONFIG_BLK_DEV_INITRD is not set
# CONFIG_MMAPPER is not set
CONFIG_NETDEVICES=y

#
# UML Network Devices
#
CONFIG_UML_NET=y
CONFIG_UML_NET_ETHERTAP=y
CONFIG_UML_NET_TUNTAP=y
CONFIG_UML_NET_SLIP=y
CONFIG_UML_NET_DAEMON=y
CONFIG_UML_NET_MCAST=y
CONFIG_UML_NET_PCAP=y
CONFIG_UML_NET_SLIRP=y

#
# Networking support
#

#
# Networking options
#
CONFIG_PACKET=m
CONFIG_PACKET_MMAP=y
# CONFIG_NETLINK_DEV is not set
CONFIG_UNIX=y
# CONFIG_NET_KEY is not set
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
# CONFIG_IP_ADVANCED_ROUTER is not set
# CONFIG_IP_PNP is not set
# CONFIG_NET_IPIP is not set
# CONFIG_NET_IPGRE is not set
# CONFIG_IP_MROUTE is not set
# CONFIG_ARPD is not set
# CONFIG_INET_ECN is not set
# CONFIG_SYN_COOKIES is not set
# CONFIG_INET_AH is not set
# CONFIG_INET_ESP is not set
# CONFIG_INET_IPCOMP is not set

#
# IP: Virtual Server Configuration
#
# CONFIG_IP_VS is not set
# CONFIG_IPV6 is not set
# CONFIG_DECNET is not set
# CONFIG_BRIDGE is not set
CONFIG_NETFILTER=y
# CONFIG_NETFILTER_DEBUG is not set

#
# IP: Netfilter Configuration
#
CONFIG_IP_NF_CONNTRACK=m
CONFIG_IP_NF_FTP=m
CONFIG_IP_NF_IRC=m
# CONFIG_IP_NF_TFTP is not set
# CONFIG_IP_NF_AMANDA is not set
# CONFIG_IP_NF_QUEUE is not set
CONFIG_IP_NF_IPTABLES=m
CONFIG_IP_NF_MATCH_LIMIT=m
CONFIG_IP_NF_MATCH_IPRANGE=m
CONFIG_IP_NF_MATCH_MAC=m
CONFIG_IP_NF_MATCH_PKTTYPE=m
CONFIG_IP_NF_MATCH_MARK=m
CONFIG_IP_NF_MATCH_MULTIPORT=m
CONFIG_IP_NF_MATCH_TOS=m
CONFIG_IP_NF_MATCH_RECENT=m
# CONFIG_IP_NF_MATCH_ECN is not set
# CONFIG_IP_NF_MATCH_DSCP is not set
# CONFIG_IP_NF_MATCH_AH_ESP is not set
CONFIG_IP_NF_MATCH_LENGTH=m
CONFIG_IP_NF_MATCH_TTL=m
CONFIG_IP_NF_MATCH_TCPMSS=m
CONFIG_IP_NF_MATCH_HELPER=m
CONFIG_IP_NF_MATCH_STATE=m
CONFIG_IP_NF_MATCH_CONNTRACK=m
CONFIG_IP_NF_MATCH_OWNER=m
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IP_NF_NAT=m
CONFIG_IP_NF_NAT_NEEDED=y
CONFIG_IP_NF_TARGET_MASQUERADE=m
CONFIG_IP_NF_TARGET_REDIRECT=m
CONFIG_IP_NF_TARGET_NETMAP=m
CONFIG_IP_NF_TARGET_SAME=m
CONFIG_IP_NF_NAT_LOCAL=y
# CONFIG_IP_NF_NAT_SNMP_BASIC is not set
CONFIG_IP_NF_NAT_IRC=m
CONFIG_IP_NF_NAT_FTP=m
CONFIG_IP_NF_MANGLE=m
CONFIG_IP_NF_TARGET_TOS=m
# CONFIG_IP_NF_TARGET_ECN is not set
# CONFIG_IP_NF_TARGET_DSCP is not set
CONFIG_IP_NF_TARGET_MARK=m
CONFIG_IP_NF_TARGET_CLASSIFY=m
CONFIG_IP_NF_TARGET_LOG=m
CONFIG_IP_NF_TARGET_ULOG=m
# CONFIG_IP_NF_TARGET_TCPMSS is not set
# CONFIG_IP_NF_ARPTABLES is not set
# CONFIG_IP_NF_COMPAT_IPCHAINS is not set
# CONFIG_IP_NF_COMPAT_IPFWADM is not set

#
# SCTP Configuration (EXPERIMENTAL)
#
CONFIG_IPV6_SCTP__=y
# CONFIG_IP_SCTP is not set
# CONFIG_ATM is not set
# CONFIG_VLAN_8021Q is not set
# CONFIG_LLC2 is not set
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_NET_DIVERT is not set
# CONFIG_ECONET is not set
# CONFIG_WAN_ROUTER is not set
# CONFIG_NET_FASTROUTE is not set
# CONFIG_NET_HW_FLOWCONTROL is not set

#
# QoS and/or fair queueing
#
# CONFIG_NET_SCHED is not set

#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
# CONFIG_DUMMY is not set
# CONFIG_BONDING is not set
# CONFIG_EQUALIZER is not set
# CONFIG_TUN is not set

#
# Ethernet (10 or 100Mbit)
#
# CONFIG_NET_ETHERNET is not set

#
# Ethernet (1000 Mbit)
#

#
# Ethernet (10000 Mbit)
#
# CONFIG_PPP is not set
# CONFIG_SLIP is not set

#
# Wireless LAN (non-hamradio)
#
# CONFIG_NET_RADIO is not set

#
# Token Ring devices
#
# CONFIG_SHAPER is not set

#
# Wan interfaces
#
# CONFIG_WAN is not set

#
# Amateur Radio support
#
# CONFIG_HAMRADIO is not set

#
# IrDA (infrared) support
#
# CONFIG_IRDA is not set

#
# Bluetooth support
#
# CONFIG_BT is not set

#
# File systems
#
CONFIG_EXT2_FS=y
# CONFIG_EXT2_FS_XATTR is not set
CONFIG_EXT3_FS=y
# CONFIG_EXT3_FS_XATTR is not set
CONFIG_JBD=y
# CONFIG_JBD_DEBUG is not set
CONFIG_REISERFS_FS=m
# CONFIG_REISERFS_CHECK is not set
# CONFIG_REISERFS_PROC_INFO is not set
# CONFIG_JFS_FS is not set
# CONFIG_XFS_FS is not set
# CONFIG_MINIX_FS is not set
# CONFIG_ROMFS_FS is not set
# CONFIG_QUOTA is not set
# CONFIG_AUTOFS_FS is not set
# CONFIG_AUTOFS4_FS is not set

#
# CD-ROM/DVD Filesystems
#
# CONFIG_ISO9660_FS is not set
# CONFIG_UDF_FS is not set

#
# DOS/FAT/NT Filesystems
#
# CONFIG_FAT_FS is not set
# CONFIG_NTFS_FS is not set

#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
# CONFIG_DEVFS_FS is not set
CONFIG_DEVPTS_FS=y
# CONFIG_DEVPTS_FS_XATTR is not set
CONFIG_TMPFS=y
# CONFIG_HUGETLB_PAGE is not set
CONFIG_RAMFS=y

#
# Miscellaneous filesystems
#
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
# CONFIG_CRAMFS is not set
# CONFIG_VXFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set

#
# Network File Systems
#
CONFIG_NFS_FS=m
CONFIG_NFS_V3=y
CONFIG_NFS_V4=y
# CONFIG_NFS_DIRECTIO is not set
CONFIG_NFSD=m
CONFIG_NFSD_V3=y
CONFIG_NFSD_V4=y
CONFIG_NFSD_TCP=y
CONFIG_LOCKD=m
CONFIG_LOCKD_V4=y
CONFIG_EXPORTFS=m
CONFIG_SUNRPC=m
# CONFIG_SUNRPC_GSS is not set
# CONFIG_SMB_FS is not set
# CONFIG_CIFS is not set
# CONFIG_NCP_FS is not set
# CONFIG_CODA_FS is not set
# CONFIG_INTERMEZZO_FS is not set
# CONFIG_AFS_FS is not set

#
# Partition Types
#
# CONFIG_PARTITION_ADVANCED is not set
CONFIG_MSDOS_PARTITION=y

#
# Native Language Support
#
# CONFIG_NLS is not set

#
# Security options
#
# CONFIG_SECURITY is not set

#
# Cryptographic options
#
# CONFIG_CRYPTO is not set

#
# Library routines
#
# CONFIG_CRC32 is not set

#
# SCSI support
#
# CONFIG_SCSI is not set

#
# Multi-device support (RAID and LVM)
#
# CONFIG_MD is not set

#
# Memory Technology Devices (MTD)
#
# CONFIG_MTD is not set

#
# Kernel hacking
#
# CONFIG_DEBUG_SLAB is not set
# CONFIG_DEBUG_SPINLOCK is not set
CONFIG_DEBUG_INFO=y
CONFIG_FRAME_POINTER=y
# CONFIG_PT_PROXY is not set
# CONFIG_GPROF is not set
# CONFIG_GCOV is not set

[-- Attachment #3: debugOutput-secondTry.bz2 --]
[-- Type: application/x-bzip2, Size: 5742 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [uml-devel] Re: Diagnosed and repeatable kernel mode panic in schedule() for 2.6!
  2004-02-19 19:01       ` BlaisorBlade
@ 2004-02-20  2:06         ` Jeff Dike
  2004-02-20  7:48           ` Ingo Molnar
  0 siblings, 1 reply; 13+ messages in thread
From: Jeff Dike @ 2004-02-20  2:06 UTC (permalink / raw)
  To: BlaisorBlade; +Cc: user-mode-linux-devel, Ingo Molnar

blaisorblade_spam@yahoo.it said:
> By the way, would you check if this is needed in the 2.4 kernel, too?
> I've  checked the 2.4.24-1 uml patch and it seems to miss that line
> (and your  reason seem to apply to 2.4 too). 

Yup, I think so too.

> What is more strange is that the panic  happens deterministically, and
> in the same situation (same root fs and same  kernel) I get it at the
> same addresses! Since what happens seem to change  with each single
> kernel, but is easily repeatable, I would say that it is not  a race
> condition, but something with a crazy pointer outside of the core
> code.

I'm seeing two things, which don't happen deterministically :
	an exited process scheduling to itself, causing the exit.c BUG - I
don't see what prevents a TASK_ZOMBIE process from being schedulable offhand.
It doesn't seem to be dequeued from any runqueues as a result of calling exit().

	a segfault in schedule() caused by there being no bits set in the
queue bitmap (except for bit 140 which I guess is there to prevent ffs from
running into nowhereland).  And bit 140 doesn't refer to a valid queue, so
the task pulled off it is bogus.

> So it seems that both active.queue[140] and  active.queue.nr_active
> are corrupted.

Repeating the above, it looks like #140 isn't supposed to be valid.
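
To make the role of bit 140 concrete, here is a minimal user-space sketch,
not the kernel's code.  It assumes MAX_PRIO is 140, so that priorities
0..139 index real queues and bit 140 only stops the bit search.  The type
and helper names (prio_bitmap, bitmap_init, pick_prio and so on) are made
up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

#define MAX_PRIO      140   /* assumed: priorities 0..139 are real queues */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_LONGS  ((MAX_PRIO + 1 + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Hypothetical stand-in for the per-array priority bitmap. */
struct prio_bitmap {
    unsigned long bits[BITMAP_LONGS];
};

void bitmap_init(struct prio_bitmap *b)
{
    memset(b->bits, 0, sizeof(b->bits));
    /* Delimiter bit: always set, never refers to a real queue. */
    b->bits[MAX_PRIO / BITS_PER_LONG] |= 1UL << (MAX_PRIO % BITS_PER_LONG);
}

void bitmap_set(struct prio_bitmap *b, int prio)
{
    assert(prio >= 0 && prio < MAX_PRIO);
    b->bits[prio / BITS_PER_LONG] |= 1UL << (prio % BITS_PER_LONG);
}

void bitmap_clear(struct prio_bitmap *b, int prio)
{
    assert(prio >= 0 && prio < MAX_PRIO);
    b->bits[prio / BITS_PER_LONG] &= ~(1UL << (prio % BITS_PER_LONG));
}

/* Lowest set bit; thanks to the delimiter this is at most MAX_PRIO. */
int find_first_bit(const struct prio_bitmap *b)
{
    for (int i = 0; i <= MAX_PRIO; i++)
        if (b->bits[i / BITS_PER_LONG] & (1UL << (i % BITS_PER_LONG)))
            return i;
    return MAX_PRIO + 1;    /* only reachable if the delimiter was lost */
}

/* Returns a usable queue index, or -1 when only the delimiter is set. */
int pick_prio(const struct prio_bitmap *b)
{
    int idx = find_first_bit(b);
    return idx < MAX_PRIO ? idx : -1;
}
```

In this model, a caller that skips the "idx < MAX_PRIO" test and indexes
the queue array with 140 reads past the last real queue, which would match
the bogus task described above.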

> However, even in expired.queue[125] there is a problem, since in the
> list  there is one list_head (the one at 0xa08e40a0) whose next and
> prev pointers  point to itself (i.e. that is an empty list); however
> this "empty list" can  be reached by expired.queue[125]. 

Yup, interesting.  What I would do is stick some code in that looks for
this condition (and if you can look for it being created, that's even better).

I was looking for something that seemed like an invariant being violated,
and came up empty.  This looks like a good one, though.  If we can figure
out what's causing it, that gets us one step closer to the bug.
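
A rough idea of what such a check could look like, as a self-contained
user-space model rather than a patch against sched.c.  The struct layout,
the bool array standing in for the bitmap, and every function name here
(check_array, enqueue, ...) are assumptions made for the sketch.  It tests
three things per array: list links are mutually consistent, a priority bit
is set exactly when its queue is non-empty, and nr_active matches the
number of queued entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PRIO 140    /* assumed number of real priority queues */

struct list_head { struct list_head *next, *prev; };

/* Simplified model of a priority array. */
struct prio_array {
    int nr_active;
    bool bit[MAX_PRIO + 1];            /* bit[MAX_PRIO] is the delimiter */
    struct list_head queue[MAX_PRIO];
};

enum { CHK_OK = 0, CHK_BAD_DELIM = -1, CHK_BAD_LINK = -2,
       CHK_BAD_BIT = -3, CHK_BAD_COUNT = -4 };

void list_init(struct list_head *h) { h->next = h->prev = h; }

void array_init(struct prio_array *a)
{
    a->nr_active = 0;
    for (int i = 0; i < MAX_PRIO; i++) {
        a->bit[i] = false;
        list_init(&a->queue[i]);
    }
    a->bit[MAX_PRIO] = true;
}

void enqueue(struct prio_array *a, struct list_head *n, int prio)
{
    struct list_head *h;

    assert(prio >= 0 && prio < MAX_PRIO);
    h = &a->queue[prio];
    n->prev = h->prev;                 /* add at the tail of the queue */
    n->next = h;
    h->prev->next = n;
    h->prev = n;
    a->bit[prio] = true;
    a->nr_active++;
}

/* Returns CHK_OK, or the first violated invariant. max_tasks bounds the walk. */
int check_array(const struct prio_array *a, int max_tasks)
{
    int total = 0;

    if (!a->bit[MAX_PRIO])
        return CHK_BAD_DELIM;
    for (int i = 0; i < MAX_PRIO; i++) {
        const struct list_head *h = &a->queue[i], *n = h;
        int len = 0;

        do {
            /* Catches an "empty list" node reachable from a queue head. */
            if (n->next == NULL || n->next->prev != n)
                return CHK_BAD_LINK;
            n = n->next;
            if (n != h && ++len > max_tasks)
                return CHK_BAD_LINK;
        } while (n != h);
        if (a->bit[i] != (len > 0))
            return CHK_BAD_BIT;
        total += len;
    }
    return total == a->nr_active ? CHK_OK : CHK_BAD_COUNT;
}
```

Calling something like check_array() at the points where the queues are
modified would be one way to narrow down where the self-pointing list_head
first becomes reachable.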

				Jeff




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [uml-devel] Re: Diagnosed and repeatable kernel mode panic in schedule() for 2.6!
  2004-02-20  2:06         ` Jeff Dike
@ 2004-02-20  7:48           ` Ingo Molnar
  0 siblings, 0 replies; 13+ messages in thread
From: Ingo Molnar @ 2004-02-20  7:48 UTC (permalink / raw)
  To: Jeff Dike; +Cc: BlaisorBlade, user-mode-linux-devel

* Jeff Dike <jdike@addtoit.com> wrote:

> I'm seeing two things, which don't happen deterministically :
> 	an exited process scheduling to itself, causing the exit.c BUG -
> I don't see what prevents a TASK_ZOMBIE process from being schedulable
> offhand. It doesn't seem to be dequeued from any runqueues as a result
> of calling exit().

a zombie task schedules away and is thus removed from the runqueue
within schedule() itself - and it's never (supposed to be) woken up.
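
As a toy illustration of that rule (a user-space model only, with made-up
names such as model_schedule and model_exit, and a singly linked runqueue
instead of the real priority arrays): a task whose state is not
TASK_RUNNING is taken off the runqueue inside the scheduling call, so an
exiting task can only be picked again if something puts it back.

```c
#include <assert.h>
#include <stddef.h>

enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE, TASK_ZOMBIE };

struct task {
    enum task_state state;
    int on_rq;              /* 1 while the task sits on the runqueue */
    struct task *next;      /* singly linked runqueue, for brevity */
};

struct runqueue { struct task *head; };

void rq_add(struct runqueue *rq, struct task *t)
{
    assert(!t->on_rq);
    t->on_rq = 1;
    t->next = rq->head;
    rq->head = t;
}

void rq_del(struct runqueue *rq, struct task *t)
{
    struct task **pp = &rq->head;

    while (*pp && *pp != t)
        pp = &(*pp)->next;
    if (*pp) {
        *pp = t->next;
        t->next = NULL;
        t->on_rq = 0;
    }
}

/* Returns the task to run next, or NULL when nothing is runnable. */
struct task *model_schedule(struct runqueue *rq, struct task *prev)
{
    if (prev->state != TASK_RUNNING)
        rq_del(rq, prev);   /* a sleeping or zombie task leaves the runqueue */
    return rq->head;
}

/* Models the tail of exit: mark as zombie, then schedule away for good. */
struct task *model_exit(struct runqueue *rq, struct task *self)
{
    self->state = TASK_ZOMBIE;
    return model_schedule(rq, self);    /* should never return self */
}
```

In the model, the only way for a zombie to be returned by model_schedule()
is a stray rq_add() after it has exited, which corresponds to the "woken
up" case that is never supposed to happen.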

	Ingo



^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2004-02-20  7:53 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-02-15 18:50 [uml-devel] Diagnosed and repeatable kernel mode panic in schedule() for 2.6! BlaisorBlade
2004-02-16  9:53 ` [uml-devel] " Ingo Molnar
2004-02-16 19:25   ` BlaisorBlade
2004-02-16 19:27     ` Ingo Molnar
2004-02-17  0:52       ` Jeff Dike
2004-02-17  4:46     ` Jeff Dike
2004-02-17  7:18       ` William Stearns
2004-02-19 19:01       ` BlaisorBlade
2004-02-20  2:06         ` Jeff Dike
2004-02-20  7:48           ` Ingo Molnar
2004-02-17 15:56   ` BlaisorBlade
2004-02-18 21:15     ` Jeff Dike
2004-02-19 18:15       ` BlaisorBlade
