[Xenomai-help] Kernel crash during queue create/destroy

* [Xenomai-help] Kernel crash during queue create/destroy
@ 2006-11-15 10:21 Stephan Zimmermann
  2006-11-15 13:19 ` Dmitry Adamushko
  0 siblings, 1 reply; 16+ messages in thread
From: Stephan Zimmermann @ 2006-11-15 10:21 UTC (permalink / raw)
  To: xenomai

Hello,
I have run into trouble with the native skin and queues: when creating and
deleting queues, my kernel sometimes (actually very often...) crashes, leaving
a frozen system, although my Xenomai program continues until it returns. I
tried to isolate and reproduce the problem, which led me to the following demo
code.

This piece of code reproducibly crashes my systems running kernel
2.6.17.14 / Xenomai 2.2.5 as well as kernel 2.6.17.6 / Xenomai 2.2.0.

<code>
#include <iostream>
#include <sys/mman.h>
#include <assert.h>
#include <errno.h>	// for EEXIST
#include "native/task.h"
#include "native/timer.h"
#include "native/queue.h"

RT_TASK maintask;

int main(void){
	std::cout << "xenomai 2.2.4 timer-test" << std::endl;
	mlockall(MCL_CURRENT | MCL_FUTURE);
	
	int err;

	err = rt_task_shadow (&maintask,"maintask",10,0);
	std::cout << "task shadow:" << err << std::endl;
	
	err = rt_timer_set_mode(1000000);
	std::cout << "timer set mode:" << err << std::endl;
	
	err = rt_task_sleep(10);
	std::cout << "task sleep:" << err << std::endl;
	
	std::cout << "testing XENOMAI q-functions" << std::endl;
		RT_QUEUE* testq;
		for(int i = 0; i < 100; i++){
			for(int j = 0; j < 10; j++){
				testq = new RT_QUEUE;
				err = rt_queue_create(testq,"testq",10240,100,Q_FIFO);
				if(err == -EEXIST){
					err = rt_queue_bind(testq,"testq",100000000);
				}
				assert(err == 0);
				rt_task_sleep(1); // commenting this out seems to make things work
				err = rt_queue_delete(testq);
				assert(err == 0);
				delete testq;
				//rt_task_sleep(10); // uncommenting this seems to make things work 
			}
			std::cout << "." << std::flush;
		}	
		std::cout << "ok" << std::endl;
	return 0;
}

</code>

The crash leaves the following information in syslog:

<syslog>

Nov 14 16:47:55 localhost kernel: BUG: unable to handle kernel NULL pointer 
dereference at virtual address 00000000
Nov 14 16:47:55 localhost kernel:  printing eip:
Nov 14 16:47:55 localhost kernel: c01a6f66
Nov 14 16:47:55 localhost kernel: *pde = 00000000
Nov 14 16:47:55 localhost kernel: Oops: 0000 [#1]
Nov 14 16:47:55 localhost kernel: PREEMPT
Nov 14 16:47:55 localhost kernel: Modules linked in: ipv6 nfs lockd sunrpc 
snd_mpu401 floppy pcspkr rtc snd_via82xx gameport snd_ac97_codec snd_ac97_bus 
snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device 
snd soundcore i2c_viapro i2c_core generic 8139cp amd64_agp agpgart tsdev 
mousedev ehci_hcd usbhid uhci_hcd usbcore via82cxxx 8139too mii psmouse 
ide_generic ide_disk ide_cd cdrom ide_core unix
Nov 14 16:47:55 localhost kernel: CPU:    0
Nov 14 16:47:55 localhost kernel: EIP:    0060:[remove_proc_entry+51/333]    
Not tainted VLI
Nov 14 16:47:55 localhost kernel: EFLAGS: 00010286   (2.6.17.14 #5)
Nov 14 16:47:55 localhost kernel: EIP is at remove_proc_entry+0x33/0x14d
Nov 14 16:47:55 localhost kernel: eax: 00000000   ebx: c03d10f4   ecx: 
ffffffff   edx: 00000000
Nov 14 16:47:55 localhost kernel: esi: c03cd7f8   edi: 00000000   ebp: 
f790f8c0   esp: c1907f00
Nov 14 16:47:55 localhost kernel: ds: 007b   es: 007b   ss: 0068
Nov 14 16:47:55 localhost kernel: Process events/0 (pid: 4, 
threadinfo=c1906000 task=c190ea90)
Nov 14 16:47:55 localhost kernel: Stack: 00000000 c03d10f4 c03cd7f8 c03464cc 
c0148919 00000000 f790f8c0 00000000
Nov 14 16:47:55 localhost kernel:        00000000 c02ff30d c02ff282 f617b440 
f6371ac0 c18e0640 c03460e0 00000200
Nov 14 16:47:55 localhost kernel:        00000000 c0126773 00000000 c0147fa5 
c1906000 c190ea90 fffffffb c18e0640
Nov 14 16:47:55 localhost kernel: Call Trace:
Nov 14 16:47:55 localhost kernel:  <c0148919> 
registry_proc_callback+0x974/0x9ec  <c0126773> run_workqueue+0xd7/0x172
Nov 14 16:47:55 localhost kernel:  <c0147fa5> registry_proc_callback+0x0/0x9ec  
<c0126906> worker_thread+0xf8/0x12a
Nov 14 16:47:55 localhost kernel:  <c0112a33> default_wake_function+0x0/0x12  
<c02e04a4> schedule+0x62e/0x64d
Nov 14 16:47:55 localhost kernel:  <c0112a33> default_wake_function+0x0/0x12  
<c012680e> worker_thread+0x0/0x12a
Nov 14 16:47:55 localhost kernel:  <c0129922> kthread+0x79/0xa3  <c01298a9> 
kthread+0x0/0xa3
Nov 14 16:47:55 localhost kernel:  <c0101385> kernel_thread_helper+0x5/0xb
Nov 14 16:47:55 localhost kernel: Code: 00 8b 54 24 14 89 14 24 75 19 89 e0 50 
8d 44 24 1c 50 52 e8 a4 f5 ff ff 83 c4 0c 85 c0 0f 85 1d 01 00 00 8b 3c 24 31 
c0 83 c9 ff <f2> ae f7 d1 49 81 3d 80 5c 34 c0 00 39 34 c0 89 ce 75 0a b8 01
Nov 14 16:47:55 localhost kernel: EIP: [remove_proc_entry+51/333] 
remove_proc_entry+0x33/0x14d SS:ESP 0068:c1907f00

</syslog>

Thanks again, 
Stephan



* Re: [Xenomai-help] Kernel crash during queue create/destroy
  2006-11-15 10:21 [Xenomai-help] Kernel crash during queue create/destroy Stephan Zimmermann
@ 2006-11-15 13:19 ` Dmitry Adamushko
  2006-11-15 14:44   ` Stephan Zimmermann
  0 siblings, 1 reply; 16+ messages in thread
From: Dmitry Adamushko @ 2006-11-15 13:19 UTC (permalink / raw)
  To: Stephan Zimmermann; +Cc: Xenomai help


[-- Attachment #1.1: Type: text/plain, Size: 889 bytes --]

Hello,

> I got some trouble with the native skin and queues, when creating / deleting
> queues, my Kernel sometimes (actually very often...) crashes, leading to a
> frozen system, with my Xenomai program continuing until it returns. I tried
> to isolate / reproduce the problem, which lead me to the following demo-code.

it looks like there is some "misunderstanding" between

(1) rt_queue_delete() -> xnregistry_remove()

and

(2) registry_proc_callback() which crashes in remove_proc_entry().

You may follow the logic in ksrc/nucleus/registry.c.

rt_queue_create() -> xnregistry_enter() -> ... -> registry_proc_callback()

rt_queue_delete() -> xnregistry_remove() -> ... registry_proc_unexport()

I don't have enough time to investigate further right now, but nevertheless,
could you apply the following patch and let us know of the outcome?


-- 
Best regards,
Dmitry Adamushko


[-- Attachment #2: registry-test.patch --]
[-- Type: text/x-patch, Size: 664 bytes --]

--- ksrc/nucleus/registry-old.c	2006-11-15 14:10:02.877744000 +0100
+++ ksrc/nucleus/registry.c	2006-11-15 14:14:44.335173000 +0100
@@ -404,7 +404,10 @@ static inline void registry_proc_export(
 
 static inline void registry_proc_unexport(xnobject_t *object)
 {
-	if (object->proc != XNOBJECT_PROC_RESERVED1) {
+	if (!object->proc || object->proc == XNOBJECT_PROC_RESERVED2)
+		printk(KERN_INFO "*** object->proc == %x ***\n", object->proc);
+
+	if ((unsigned long)object->proc > (unsigned long)XNOBJECT_PROC_RESERVED2) {
 		removeq(&registry_obj_busyq, &object->link);
 		appendq(&registry_obj_unexportq, &object->link);
 		rthal_apc_schedule(registry_proc_apc);
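
The diagnostic patch above relies on object->proc holding either NULL, one of two small sentinel values, or a real entry pointer. The following is a minimal user-space sketch of that classification, not the Xenomai code. It assumes the two sentinels are the pointer values 1 and 2, which this thread does not confirm; ProcEntry, ProcState, classify and should_queue_unexport are hypothetical names introduced for illustration only.

```cpp
#include <cassert>
#include <cstdint>

struct ProcEntry {};  // hypothetical stand-in for struct proc_dir_entry

// Assumed sentinel values; the real definitions live in the Xenomai headers.
static ProcEntry *const PROC_RESERVED1 = reinterpret_cast<ProcEntry *>(1);
static ProcEntry *const PROC_RESERVED2 = reinterpret_cast<ProcEntry *>(2);

// States inferred from the patches in this thread (an interpretation).
enum class ProcState { None, ExportPending, ExportInProgress, Exported };

ProcState classify(const ProcEntry *proc) {
    if (proc == nullptr) return ProcState::None;
    if (proc == PROC_RESERVED1) return ProcState::ExportPending;
    if (proc == PROC_RESERVED2) return ProcState::ExportInProgress;
    return ProcState::Exported;
}

// Mirrors the patched condition: only a value above RESERVED2, i.e. a
// real entry pointer, is queued for removal.
bool should_queue_unexport(const ProcEntry *proc) {
    return reinterpret_cast<std::uintptr_t>(proc) >
           reinterpret_cast<std::uintptr_t>(PROC_RESERVED2);
}
```

With this reading, the printk in the patch would fire exactly when classify() returns None or ExportInProgress, which are the two cases where an unexport request arrives for an object that has no usable /proc entry yet.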


* Re: [Xenomai-help] Kernel crash during queue create/destroy
  2006-11-15 13:19 ` Dmitry Adamushko
@ 2006-11-15 14:44   ` Stephan Zimmermann
  2006-11-15 21:01     ` Jan Kiszka
  2006-11-16 11:57     ` Philippe Gerum
  0 siblings, 2 replies; 16+ messages in thread
From: Stephan Zimmermann @ 2006-11-15 14:44 UTC (permalink / raw)
  To: xenomai

Am Mittwoch, 15. November 2006 14:19 schrieb Dmitry Adamushko:
> Hello,
>
> I got some trouble with the native skin and queues, when creating /
> deleting
>
> > queues, my Kernel sometimes (actually very often...) crashes, leading to
> > a frozen system, with my Xenomai program continuing until it returns. I
> > tried
> > to isolate / reproduce the problem, which lead me to the following
> > demo-code.
>
> it looks like there is some "misunderstanding" between
>
> (1) rt_queue_delete() -> xnregistry_remove()
>
> and
>
> (2) registry_proc_callback() which crushes in remove_proc_entry().
>
> You may follow the logic in ksrc/nucleus/registry.c.
>
> rt_queue_create() -> xnregistry_enter() -> ... -> registry_proc_callback()
>
> rt_queue_delete() -> xnregistry_remove() -> ... registry_proc_unexport()
>
> I don't have enough time to investigate further right now, but
> nevertheless, could you apply the following patch and let us know of the
> outcome?

Thanks for your fast response. I just applied the patch attached to your
last post (and recompiled everything...). It doesn't seem to change anything
for me. I have attached the syslog entries the test produced.

<syslog>
Nov 15 15:34:56 localhost kernel: BUG: unable to handle kernel NULL pointer 
dereference at virtual address 00000000
Nov 15 15:34:56 localhost kernel:  printing eip:
Nov 15 15:34:56 localhost kernel: c01a6f7e
Nov 15 15:34:56 localhost kernel: *pde = 00000000
Nov 15 15:34:56 localhost kernel: Oops: 0000 [#1]
Nov 15 15:34:56 localhost kernel: PREEMPT
Nov 15 15:34:56 localhost kernel: Modules linked in: ipv6 nfs lockd sunrpc 
snd_mpu401 floppy pcspkr rtc snd_via82xx gameport snd_ac97_codec snd_ac97_bus 
snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device 
snd soundcore i2c_viapro i2c_core generic 8139cp tsdev mousedev amd64_agp 
agpgart ehci_hcd usbhid uhci_hcd usbcore via82cxxx 8139too mii psmouse 
ide_generic ide_disk ide_cd cdrom ide_core unix
Nov 15 15:34:56 localhost kernel: CPU:    0
Nov 15 15:34:56 localhost kernel: EIP:    0060:[remove_proc_entry+51/333]    
Not tainted VLI
Nov 15 15:34:56 localhost kernel: EFLAGS: 00010286   (2.6.17.14 #1)
Nov 15 15:34:56 localhost kernel: EIP is at remove_proc_entry+0x33/0x14d
Nov 15 15:34:56 localhost kernel: eax: 00000000   ebx: c03d10f4   ecx: 
ffffffff   edx: 00000000
Nov 15 15:34:56 localhost kernel: esi: c03d0a98   edi: 00000000   ebp: 
f433e0c0   esp: c1907f00
Nov 15 15:34:56 localhost kernel: ds: 007b   es: 007b   ss: 0068
Nov 15 15:34:56 localhost kernel: Process events/0 (pid: 4, 
threadinfo=c1906000 task=c190ea90)
Nov 15 15:34:56 localhost kernel: Stack: 00000000 c03d10f4 c03d0a98 c03464cc 
c0148919 00000000 f433e0c0 00000000
Nov 15 15:34:56 localhost kernel:        00000000 c02ff32d c02ff2a2 f6153740 
f433ec40 c18e0640 c03460e0 00000200
Nov 15 15:34:56 localhost kernel:        00000000 c0126773 00000000 c0147fa5 
c1906000 c190ea90 fffffffb c18e0640
Nov 15 15:34:56 localhost kernel: Call Trace:
Nov 15 15:34:57 localhost kernel:  <c0148919> 
registry_proc_callback+0x974/0x9ec  <c0126773> run_workqueue+0xd7/0x172
Nov 15 15:34:57 localhost kernel:  <c0147fa5> registry_proc_callback+0x0/0x9ec  
<c0126906> worker_thread+0xf8/0x12a
Nov 15 15:34:57 localhost kernel:  <c0112a33> default_wake_function+0x0/0x12  
<c02e04bc> schedule+0x62e/0x64d
Nov 15 15:34:57 localhost kernel:  <c0112a33> default_wake_function+0x0/0x12  
<c012680e> worker_thread+0x0/0x12a
Nov 15 15:34:57 localhost kernel:  <c0129922> kthread+0x79/0xa3  <c01298a9> 
kthread+0x0/0xa3
Nov 15 15:34:57 localhost kernel:  <c0101385> kernel_thread_helper+0x5/0xb
Nov 15 15:34:57 localhost kernel: Code: 00 8b 54 24 14 89 14 24 75 19 89 e0 50 
8d 44 24 1c 50 52 e8 a4 f5 ff ff 83 c4 0c 85 c0 0f 85 1d 01 00 00 8b 3c 24 31 
c0 83 c9 ff <f2> ae f7 d1 49 81 3d 80 5c 34 c0 00 39 34 c0 89 ce 75 0a b8 01
Nov 15 15:34:57 localhost kernel: EIP: [remove_proc_entry+51/333] 
remove_proc_entry+0x33/0x14d SS:ESP 0068:c1907f00
</syslog>



* Re: [Xenomai-help] Kernel crash during queue create/destroy
  2006-11-15 14:44   ` Stephan Zimmermann
@ 2006-11-15 21:01     ` Jan Kiszka
  2006-11-15 22:22       ` Philippe Gerum
  2006-11-16 11:57     ` Philippe Gerum
  1 sibling, 1 reply; 16+ messages in thread
From: Jan Kiszka @ 2006-11-15 21:01 UTC (permalink / raw)
  To: Stephan Zimmermann; +Cc: xenomai

[-- Attachment #1: Type: text/plain, Size: 2465 bytes --]

Stephan Zimmermann wrote:
> Am Mittwoch, 15. November 2006 14:19 schrieb Dmitry Adamushko:
>> Hello,
>>
>> I got some trouble with the native skin and queues, when creating /
>> deleting
>>
>>> queues, my Kernel sometimes (actually very often...) crashes, leading to
>>> a frozen system, with my Xenomai program continuing until it returns. I
>>> tried
>>> to isolate / reproduce the problem, which lead me to the following
>>> demo-code.
>> it looks like there is some "misunderstanding" between
>>
>> (1) rt_queue_delete() -> xnregistry_remove()
>>
>> and
>>
>> (2) registry_proc_callback() which crushes in remove_proc_entry().
>>
>> You may follow the logic in ksrc/nucleus/registry.c.
>>
>> rt_queue_create() -> xnregistry_enter() -> ... -> registry_proc_callback()
>>
>> rt_queue_delete() -> xnregistry_remove() -> ... registry_proc_unexport()
>>
>> I don't have enough time to investigate further right now, but
>> nevertheless, could you apply the following patch and let us know of the
>> outcome?
> 
> Thanks for your fast response. I just applied (and recompiled everything...) 
> the patch you attached to your last Post. It doesn't seem to change  anything 
> for me. I attached the entries from syslog the test produced.
> 

/me unfortunately failed to reproduce your problem here. Instead, I
found a regression in SVN head - different story for a different thread.

I have another debugging suggestion:

Enable CONFIG_IPIPE_TRACE_MCOUNT (under Kernel Hacking) and patch the
kernel as follows:

--- arch/i386/mm/fault.c.orig
+++ arch/i386/mm/fault.c
@@ -515,6 +515,8 @@ no_context:

 	bust_spinlocks(1);

+	ipipe_trace_freeze(0);
+
 	if (oops_may_print()) {
 	#ifdef CONFIG_X86_PAE
 		if (error_code & 16) {

Then recompile it and let your test run. After (and if...) it crashes,
you should look at /proc/ipipe/trace/frozen. This will contain a
back-trace around the crash. Tune the output via, e.g.,

	echo 1000 > /proc/ipipe/trace/back_trace_points
	(see also our wiki for more information about the tracer)

so that at least the path from the previous rt_queue_create up to the
crash is visible. This will give a call history (not just a stack
back-trace), as the tracer records all function calls in the kernel. That
may help us understand whether we hit an unexpected function schedule here.
It all sounds like some rare race to me. I hope the trace will not make it
vanish...

Jan




* Re: [Xenomai-help] Kernel crash during queue create/destroy
  2006-11-15 21:01     ` Jan Kiszka
@ 2006-11-15 22:22       ` Philippe Gerum
  2006-11-15 22:36         ` Jan Kiszka
  2006-11-15 22:44         ` Philippe Gerum
  0 siblings, 2 replies; 16+ messages in thread
From: Philippe Gerum @ 2006-11-15 22:22 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai

On Wed, 2006-11-15 at 22:01 +0100, Jan Kiszka wrote:

> /me unfortunately failed to reproduce your problem here. Instead, I
> found a regression in SVN head - different story for a different thread.
> 

It's reproducible here. This bug triggers if you leave enough time
between queue creation and deletion for Linux to deal with its usual
business, like running workqueues... Additionally, this bug would not
trigger with different queue names passed to the creation routines. It
seems to be caused by out-of-sequence create/delete requests for /proc
entries relayed to the proc_fs subsystem, which does not perform any
sanity checks on the data submitted to it.
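
The out-of-sequence effect described above can be illustrated with a simplified user-space model. This is a sketch under stated assumptions, not the Xenomai code: Request, ProcDir, drain_two_queues and drain_one_queue are hypothetical names, and the /proc directory is modeled as a plain set of names, which is far stricter than the real proc_fs. The only thing taken from the thread is that pending exports and pending unexports sit in separate queues and that the worker drains the exports first.

```cpp
#include <cassert>
#include <deque>
#include <set>
#include <string>
#include <vector>

// Hypothetical request type for this sketch; not a Xenomai structure.
struct Request {
    bool is_export;    // true: create the /proc entry, false: remove it
    std::string name;  // entry name, e.g. "testq"
};

// Stand-in for a /proc directory: a set of names (a simplification).
using ProcDir = std::set<std::string>;

static void apply(ProcDir &dir, const Request &r) {
    if (r.is_export)
        dir.insert(r.name);
    else
        dir.erase(r.name);
}

// Two pending queues, drained exports-first: the issue order between an
// unexport and a later export is lost.
ProcDir drain_two_queues(ProcDir dir, const std::vector<Request> &issued) {
    std::deque<Request> exportq, unexportq;
    for (const Request &r : issued)
        (r.is_export ? exportq : unexportq).push_back(r);
    for (const Request &r : exportq) apply(dir, r);
    for (const Request &r : unexportq) apply(dir, r);
    return dir;
}

// One pending queue: requests are handled in the order they were issued.
ProcDir drain_one_queue(ProcDir dir, const std::vector<Request> &issued) {
    for (const Request &r : issued) apply(dir, r);
    return dir;
}
```

Starting from a directory that already holds "testq" and issuing "unexport testq" followed by "export testq", the two-queue variant ends with an empty directory although the newly created queue is alive, while the single-queue variant keeps the entry. With two different names both variants end in the same state, which is consistent with the observation that distinct queue names do not trigger the bug.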

-- 
Philippe.


* Re: [Xenomai-help] Kernel crash during queue create/destroy
  2006-11-15 22:22       ` Philippe Gerum
@ 2006-11-15 22:36         ` Jan Kiszka
  2006-11-15 22:43           ` Jan Kiszka
  2006-11-15 22:46           ` Philippe Gerum
  2006-11-15 22:44         ` Philippe Gerum
  1 sibling, 2 replies; 16+ messages in thread
From: Jan Kiszka @ 2006-11-15 22:36 UTC (permalink / raw)
  To: rpm; +Cc: xenomai

[-- Attachment #1: Type: text/plain, Size: 1823 bytes --]

Philippe Gerum wrote:
> On Wed, 2006-11-15 at 22:01 +0100, Jan Kiszka wrote:
> 
>> /me unfortunately failed to reproduce your problem here. Instead, I
>> found a regression in SVN head - different story for a different thread.
>>
> 
> It's reproducible here. This bug triggers if you leave enough time
> between queue creation and deletion for Linux to deal with its usual
> business, like running workqueues... Additionally, this bug would not
> trigger with different queue names passed to the creation routines. It
> seems to be caused by out-of-sequence create/delete requests of /proc
> entries relayed to the proc_fs subsystem, which does not perform any
> sanity checks on the data it is submitted.
> 

Yes, that delay makes the difference. I just added a huge one (100
ticks), and now I get tons of this:

> xenomai 2.2.4 timer-test
> task shadow:0
> timer set mode:0
> task sleep:0
> testing XENOMAI q-functions
> Bad page state in process 'maintask'
> page:c10e8b20 flags:0x80000400 mapping:00000000 mapcount:0 count:0
> Trying to fix it up, but a reboot is needed
> Backtrace:
>  <c0103055> show_trace+0x12/0x14  <c01035e0> dump_stack+0x1c/0x1e
>  <c0157104> bad_page+0x49/0x73  <c01577d8> free_hot_cold_page+0x68/0x13d
>  <c01578fb> free_hot_page+0xf/0x11  <c0157a17> __free_pages+0x2e/0x39
>  <c01646f2> __vunmap+0x90/0xbc  <c01647c4> vfree+0x2e/0x30
>  <c01388ca> xnheap_destroy_mapped+0xe2/0x11b  <c8887d93> rt_queue_delete+0xb5/0xf2 [xeno_native]
>  <c88836d9> __rt_queue_delete+0x4e/0x72 [xeno_native]  <c0141b20> losyscall_event+0xa3/0x145
>  <c013594f> __ipipe_dispatch_event+0xac/0x16c  <c0109fd2> __ipipe_syscall_root+0x78/0x100
>  <c02a5520> system_call+0x20/0x38 

Is this also what you see? It looks quite different at first sight. [Note:
it's on a qemu box.]




* Re: [Xenomai-help] Kernel crash during queue create/destroy
  2006-11-15 22:36         ` Jan Kiszka
@ 2006-11-15 22:43           ` Jan Kiszka
  2006-11-15 22:46           ` Philippe Gerum
  1 sibling, 0 replies; 16+ messages in thread
From: Jan Kiszka @ 2006-11-15 22:43 UTC (permalink / raw)
  To: rpm; +Cc: xenomai

[-- Attachment #1: Type: text/plain, Size: 1011 bytes --]

Jan Kiszka wrote:
> Philippe Gerum wrote:
>> On Wed, 2006-11-15 at 22:01 +0100, Jan Kiszka wrote:
>>
>>> /me unfortunately failed to reproduce your problem here. Instead, I
>>> found a regression in SVN head - different story for a different thread.
>>>
>> It's reproducible here. This bug triggers if you leave enough time
>> between queue creation and deletion for Linux to deal with its usual
>> business, like running workqueues... Additionally, this bug would not
>> trigger with different queue names passed to the creation routines. It
>> seems to be caused by out-of-sequence create/delete requests of /proc
>> entries relayed to the proc_fs subsystem, which does not perform any
>> sanity checks on the data it is submitted.
>>
> 
> Yes, that delay makes the difference. I just added a huge one (100
> ticks), and now I get tones of this:

No, the delay makes no difference for me. This one is a new "property" of
#1838 (#1832 is fine); the other issue still doesn't pop up here.




* Re: [Xenomai-help] Kernel crash during queue create/destroy
  2006-11-15 22:22       ` Philippe Gerum
  2006-11-15 22:36         ` Jan Kiszka
@ 2006-11-15 22:44         ` Philippe Gerum
  1 sibling, 0 replies; 16+ messages in thread
From: Philippe Gerum @ 2006-11-15 22:44 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai

On Wed, 2006-11-15 at 23:22 +0100, Philippe Gerum wrote:
> On Wed, 2006-11-15 at 22:01 +0100, Jan Kiszka wrote:
> 
> > /me unfortunately failed to reproduce your problem here. Instead, I
> > found a regression in SVN head - different story for a different thread.
> > 
> 
> It's reproducible here. This bug triggers if you leave enough time
> between queue creation and deletion for Linux to deal with its usual
> business, like running workqueues... Additionally, this bug would not
> trigger with different queue names passed to the creation routines. It
> seems to be caused by out-of-sequence create/delete requests of /proc
> entries relayed to the proc_fs subsystem, which does not perform any
> sanity checks on the data it is submitted.
> 

Mm, ok, got it. It's indeed an out-of-sequence issue. Will fix.

-- 
Philippe.





* Re: [Xenomai-help] Kernel crash during queue create/destroy
  2006-11-15 22:36         ` Jan Kiszka
  2006-11-15 22:43           ` Jan Kiszka
@ 2006-11-15 22:46           ` Philippe Gerum
  2006-11-15 22:52             ` Jan Kiszka
  1 sibling, 1 reply; 16+ messages in thread
From: Philippe Gerum @ 2006-11-15 22:46 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai

On Wed, 2006-11-15 at 23:36 +0100, Jan Kiszka wrote:
> Philippe Gerum wrote:
> > On Wed, 2006-11-15 at 22:01 +0100, Jan Kiszka wrote:
> > 
> >> /me unfortunately failed to reproduce your problem here. Instead, I
> >> found a regression in SVN head - different story for a different thread.
> >>
> > 
> > It's reproducible here. This bug triggers if you leave enough time
> > between queue creation and deletion for Linux to deal with its usual
> > business, like running workqueues... Additionally, this bug would not
> > trigger with different queue names passed to the creation routines. It
> > seems to be caused by out-of-sequence create/delete requests of /proc
> > entries relayed to the proc_fs subsystem, which does not perform any
> > sanity checks on the data it is submitted.
> > 
> 
> Yes, that delay makes the difference. I just added a huge one (100
> ticks), and now I get tones of this:
> 
> > xenomai 2.2.4 timer-test
> > task shadow:0
> > timer set mode:0
> > task sleep:0
> > testing XENOMAI q-functions
> > Bad page state in process 'maintask'
> > page:c10e8b20 flags:0x80000400 mapping:00000000 mapcount:0 count:0
> > Trying to fix it up, but a reboot is needed
> > Backtrace:
> >  <c0103055> show_trace+0x12/0x14  <c01035e0> dump_stack+0x1c/0x1e
> >  <c0157104> bad_page+0x49/0x73  <c01577d8> free_hot_cold_page+0x68/0x13d
> >  <c01578fb> free_hot_page+0xf/0x11  <c0157a17> __free_pages+0x2e/0x39
> >  <c01646f2> __vunmap+0x90/0xbc  <c01647c4> vfree+0x2e/0x30
> >  <c01388ca> xnheap_destroy_mapped+0xe2/0x11b  <c8887d93> rt_queue_delete+0xb5/0xf2 [xeno_native]
> >  <c88836d9> __rt_queue_delete+0x4e/0x72 [xeno_native]  <c0141b20> losyscall_event+0xa3/0x145
> >  <c013594f> __ipipe_dispatch_event+0xac/0x16c  <c0109fd2> __ipipe_syscall_root+0x78/0x100
> >  <c02a5520> system_call+0x20/0x38 
> 
> Is this also what you see? Looks quite different on first sight. [Note:
> it's on a qemu box.]
> 

Uh, no. Try removing the second part of your heap patch, regarding the
memory unreservation routine.

-- 
Philippe.





* Re: [Xenomai-help] Kernel crash during queue create/destroy
  2006-11-15 22:46           ` Philippe Gerum
@ 2006-11-15 22:52             ` Jan Kiszka
  0 siblings, 0 replies; 16+ messages in thread
From: Jan Kiszka @ 2006-11-15 22:52 UTC (permalink / raw)
  To: rpm; +Cc: xenomai

[-- Attachment #1: Type: text/plain, Size: 2164 bytes --]

Philippe Gerum wrote:
> On Wed, 2006-11-15 at 23:36 +0100, Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> On Wed, 2006-11-15 at 22:01 +0100, Jan Kiszka wrote:
>>>
>>>> /me unfortunately failed to reproduce your problem here. Instead, I
>>>> found a regression in SVN head - different story for a different thread.
>>>>
>>> It's reproducible here. This bug triggers if you leave enough time
>>> between queue creation and deletion for Linux to deal with its usual
>>> business, like running workqueues... Additionally, this bug would not
>>> trigger with different queue names passed to the creation routines. It
>>> seems to be caused by out-of-sequence create/delete requests of /proc
>>> entries relayed to the proc_fs subsystem, which does not perform any
>>> sanity checks on the data it is submitted.
>>>
>> Yes, that delay makes the difference. I just added a huge one (100
>> ticks), and now I get tones of this:
>>
>>> xenomai 2.2.4 timer-test
>>> task shadow:0
>>> timer set mode:0
>>> task sleep:0
>>> testing XENOMAI q-functions
>>> Bad page state in process 'maintask'
>>> page:c10e8b20 flags:0x80000400 mapping:00000000 mapcount:0 count:0
>>> Trying to fix it up, but a reboot is needed
>>> Backtrace:
>>>  <c0103055> show_trace+0x12/0x14  <c01035e0> dump_stack+0x1c/0x1e
>>>  <c0157104> bad_page+0x49/0x73  <c01577d8> free_hot_cold_page+0x68/0x13d
>>>  <c01578fb> free_hot_page+0xf/0x11  <c0157a17> __free_pages+0x2e/0x39
>>>  <c01646f2> __vunmap+0x90/0xbc  <c01647c4> vfree+0x2e/0x30
>>>  <c01388ca> xnheap_destroy_mapped+0xe2/0x11b  <c8887d93> rt_queue_delete+0xb5/0xf2 [xeno_native]
>>>  <c88836d9> __rt_queue_delete+0x4e/0x72 [xeno_native]  <c0141b20> losyscall_event+0xa3/0x145
>>>  <c013594f> __ipipe_dispatch_event+0xac/0x16c  <c0109fd2> __ipipe_syscall_root+0x78/0x100
>>>  <c02a5520> system_call+0x20/0x38 
>> Is this also what you see? Looks quite different on first sight. [Note:
>> it's on a qemu box.]
>>
> 
> Uh, no. Try removing the second part of your heap patch, regarding the
> memory unreservation routine.
> 

Ouch, that part of my patch was nonsense, indeed. Fine again now.

Jan




* Re: [Xenomai-help] Kernel crash during queue create/destroy
  2006-11-15 14:44   ` Stephan Zimmermann
  2006-11-15 21:01     ` Jan Kiszka
@ 2006-11-16 11:57     ` Philippe Gerum
  2006-11-16 12:59       ` Stephan Zimmermann
  2006-11-16 15:35       ` Stephan Zimmermann
  1 sibling, 2 replies; 16+ messages in thread
From: Philippe Gerum @ 2006-11-16 11:57 UTC (permalink / raw)
  To: Stephan Zimmermann; +Cc: xenomai

On Wed, 2006-11-15 at 15:44 +0100, Stephan Zimmermann wrote:
> Am Mittwoch, 15. November 2006 14:19 schrieb Dmitry Adamushko:
> > Hello,
> >
> > I got some trouble with the native skin and queues, when creating /
> > deleting
> >
> > > queues, my Kernel sometimes (actually very often...) crashes, leading to
> > > a frozen system, with my Xenomai program continuing until it returns. I
> > > tried
> > > to isolate / reproduce the problem, which lead me to the following
> > > demo-code.
> >

[...]

Cannot test it yet, but could you try out this patch? TIA,

--- ksrc/nucleus/registry.c	(revision 1838)
+++ ksrc/nucleus/registry.c	(working copy)
@@ -65,10 +65,8 @@
 
 static void registry_proc_schedule(void *cookie);
 
-static xnqueue_t registry_obj_exportq;	/* Objects waiting for /proc export. */
+static xnqueue_t registry_obj_procq;	/* Objects waiting for /proc handling. */
 
-static xnqueue_t registry_obj_unexportq;	/* Objects waiting for /proc unexport. */
-
 #ifndef CONFIG_PREEMPT_RT
 static DECLARE_WORK(registry_proc_work, &registry_proc_callback, NULL);
 #endif /* !CONFIG_PREEMPT_RT */
@@ -106,8 +104,7 @@
 		return -ENOMEM;
 	}
 
-	initq(&registry_obj_exportq);
-	initq(&registry_obj_unexportq);
+	initq(&registry_obj_procq);
 #endif /* CONFIG_XENO_EXPORT_REGISTRY */
 
 	initq(&registry_obj_freeq);
@@ -274,16 +271,20 @@
 
 	xnlock_get_irqsave(&nklock, s);
 
-	while ((holder = getq(&registry_obj_exportq)) != NULL) {
+	while ((holder = getq(&registry_obj_procq)) != NULL) {
 		object = link2xnobj(holder);
 		pnode = object->pnode;
 		type = pnode->type;
+		dir = pnode->dir;
+		rdir = pnode->root->dir;
 		root = pnode->root->name;
+
+		if (object->proc != XNOBJECT_PROC_RESERVED1)
+			goto unexport;
+
 		++pnode->entries;
 		object->proc = XNOBJECT_PROC_RESERVED2;
 		appendq(&registry_obj_busyq, holder);
-		dir = pnode->dir;
-		rdir = pnode->root->dir;
 
 		xnlock_put_irqrestore(&nklock, s);
 
@@ -334,19 +335,14 @@
 			object->pnode = NULL;
 			--pnode->entries;
 		}
-	}
 
-	while ((holder = getq(&registry_obj_unexportq)) != NULL) {
-		object = link2xnobj(holder);
-		pnode = object->pnode;
-		object->pnode = NULL;
+		continue;
+
+	unexport:
+		entries = --pnode->entries;
 		entry = object->proc;
 		object->proc = NULL;
-		type = pnode->type;
-		dir = pnode->dir;
-		rdir = pnode->root->dir;
-		root = pnode->root->name;
-		entries = --pnode->entries;
+		object->pnode = NULL;
 
 		if (entries <= 0) {
 			pnode->dir = NULL;
@@ -398,7 +394,7 @@
 	object->proc = XNOBJECT_PROC_RESERVED1;
 	object->pnode = pnode;
 	removeq(&registry_obj_busyq, &object->link);
-	appendq(&registry_obj_exportq, &object->link);
+	appendq(&registry_obj_procq, &object->link);
 	rthal_apc_schedule(registry_proc_apc);
 }
 
@@ -406,13 +402,13 @@
 {
 	if (object->proc != XNOBJECT_PROC_RESERVED1) {
 		removeq(&registry_obj_busyq, &object->link);
-		appendq(&registry_obj_unexportq, &object->link);
+		appendq(&registry_obj_procq, &object->link);
 		rthal_apc_schedule(registry_proc_apc);
 	} else {
 		/* Unexporting before the lower stage has had a chance to
 		   export. Move back the object to the busyq just like if no
 		   export had been requested. */
-		removeq(&registry_obj_exportq, &object->link);
+		removeq(&registry_obj_procq, &object->link);
 		appendq(&registry_obj_busyq, &object->link);
 		object->pnode = NULL;
 		object->proc = NULL;


-- 
Philippe.





* Re: [Xenomai-help] Kernel crash during queue create/destroy
  2006-11-16 11:57     ` Philippe Gerum
@ 2006-11-16 12:59       ` Stephan Zimmermann
  2006-11-16 13:26         ` Philippe Gerum
  2006-11-16 15:35       ` Stephan Zimmermann
  1 sibling, 1 reply; 16+ messages in thread
From: Stephan Zimmermann @ 2006-11-16 12:59 UTC (permalink / raw)
  To: xenomai

Am Donnerstag, 16. November 2006 12:57 schrieb Philippe Gerum:
> On Wed, 2006-11-15 at 15:44 +0100, Stephan Zimmermann wrote:
> > Am Mittwoch, 15. November 2006 14:19 schrieb Dmitry Adamushko:
> > > Hello,
> > >
> > > I got some trouble with the native skin and queues, when creating /
> > > deleting
> > >
> > > > queues, my Kernel sometimes (actually very often...) crashes, leading
> > > > to a frozen system, with my Xenomai program continuing until it
> > > > returns. I tried
> > > > to isolate / reproduce the problem, which lead me to the following
> > > > demo-code.
>
> [...]
>
> Cannot test it yet, but could you try out this patch? TIA,

This is the moment I have to confess that I'm not familiar enough with the
patch command...

I downloaded a fresh revision 1838 from svn using "svn co -r 1838
svn://svn.gna.org/svn/xenomai/trunk xenomai", saved your patch to a file and
tried to apply it from within the xenomai directory with "patch -p0 <
filename" (which somehow worked last time). This gives me the following output:
<patch output>
patching file ksrc/nucleus/registry.c
Hunk #1 FAILED at 65.
Hunk #2 FAILED at 104.
Hunk #3 FAILED at 271.
Hunk #4 FAILED at 335.
Hunk #5 FAILED at 394.
patch unexpectedly ends in middle of line
Hunk #6 FAILED at 402.
6 out of 6 hunks FAILED -- saving rejects to file ksrc/nucleus/registry.c.rej
</patch output>

Sorry, I don't know how to fix that...

> --- ksrc/nucleus/registry.c	(revision 1838)
> +++ ksrc/nucleus/registry.c	(working copy)
> @@ -65,10 +65,8 @@
>
>  static void registry_proc_schedule(void *cookie);
>
> -static xnqueue_t registry_obj_exportq;	/* Objects waiting for /proc export. */
> +static xnqueue_t registry_obj_procq;	/* Objects waiting for /proc handling. */
>
> -static xnqueue_t registry_obj_unexportq;	/* Objects waiting for /proc unexport. */
> -
>  #ifndef CONFIG_PREEMPT_RT
>  static DECLARE_WORK(registry_proc_work, &registry_proc_callback, NULL);
>  #endif /* !CONFIG_PREEMPT_RT */
> @@ -106,8 +104,7 @@
>  		return -ENOMEM;
>  	}
>
> -	initq(&registry_obj_exportq);
> -	initq(&registry_obj_unexportq);
> +	initq(&registry_obj_procq);
>  #endif /* CONFIG_XENO_EXPORT_REGISTRY */
>
>  	initq(&registry_obj_freeq);
> @@ -274,16 +271,20 @@
>
>  	xnlock_get_irqsave(&nklock, s);
>
> -	while ((holder = getq(&registry_obj_exportq)) != NULL) {
> +	while ((holder = getq(&registry_obj_procq)) != NULL) {
>  		object = link2xnobj(holder);
>  		pnode = object->pnode;
>  		type = pnode->type;
> +		dir = pnode->dir;
> +		rdir = pnode->root->dir;
>  		root = pnode->root->name;
> +
> +		if (object->proc != XNOBJECT_PROC_RESERVED1)
> +			goto unexport;
> +
>  		++pnode->entries;
>  		object->proc = XNOBJECT_PROC_RESERVED2;
>  		appendq(&registry_obj_busyq, holder);
> -		dir = pnode->dir;
> -		rdir = pnode->root->dir;
>
>  		xnlock_put_irqrestore(&nklock, s);
>
> @@ -334,19 +335,14 @@
>  			object->pnode = NULL;
>  			--pnode->entries;
>  		}
> -	}
>
> -	while ((holder = getq(&registry_obj_unexportq)) != NULL) {
> -		object = link2xnobj(holder);
> -		pnode = object->pnode;
> -		object->pnode = NULL;
> +		continue;
> +
> +	unexport:
> +		entries = --pnode->entries;
>  		entry = object->proc;
>  		object->proc = NULL;
> -		type = pnode->type;
> -		dir = pnode->dir;
> -		rdir = pnode->root->dir;
> -		root = pnode->root->name;
> -		entries = --pnode->entries;
> +		object->pnode = NULL;
>
>  		if (entries <= 0) {
>  			pnode->dir = NULL;
> @@ -398,7 +394,7 @@
>  	object->proc = XNOBJECT_PROC_RESERVED1;
>  	object->pnode = pnode;
>  	removeq(&registry_obj_busyq, &object->link);
> -	appendq(&registry_obj_exportq, &object->link);
> +	appendq(&registry_obj_procq, &object->link);
>  	rthal_apc_schedule(registry_proc_apc);
>  }
>
> @@ -406,13 +402,13 @@
>  {
>  	if (object->proc != XNOBJECT_PROC_RESERVED1) {
>  		removeq(&registry_obj_busyq, &object->link);
> -		appendq(&registry_obj_unexportq, &object->link);
> +		appendq(&registry_obj_procq, &object->link);
>  		rthal_apc_schedule(registry_proc_apc);
>  	} else {
>  		/* Unexporting before the lower stage has had a chance to
>  		   export. Move back the object to the busyq just like if no
>  		   export had been requested. */
> -		removeq(&registry_obj_exportq, &object->link);
> +		removeq(&registry_obj_procq, &object->link);
>  		appendq(&registry_obj_busyq, &object->link);
>  		object->pnode = NULL;
>  		object->proc = NULL;
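
As an illustration of what the quoted patch changes: it folds the separate
export and unexport queues into a single registry_obj_procq, and the drain
loop decides per object whether to export or unexport by testing object->proc
against XNOBJECT_PROC_RESERVED1. The user-space C model below sketches only
that idea. The struct layout, the PROC_* constants and the procq_* helpers
are hypothetical stand-ins, not Xenomai code, and all locking is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the states kept in object->proc. */
enum proc_state { PROC_NONE, PROC_EXPORT_PENDING, PROC_EXPORTED };

struct obj {
	const char *name;
	enum proc_state proc;
	struct obj *next;	/* link in the single pending queue */
};

struct queue {
	struct obj *head, *tail;
};

/* Append an object to the tail of the pending queue. */
static void procq_push(struct queue *q, struct obj *o)
{
	assert(q != NULL && o != NULL);
	o->next = NULL;
	if (q->tail)
		q->tail->next = o;
	else
		q->head = o;
	q->tail = o;
}

/* Remove and return the oldest pending object, or NULL if none. */
static struct obj *procq_pop(struct queue *q)
{
	struct obj *o = q->head;

	if (o) {
		q->head = o->next;
		if (!q->head)
			q->tail = NULL;
	}
	return o;
}

/*
 * Drain the queue in FIFO order. An object still marked as
 * "export pending" is exported; anything else is unexported.
 * Returns the number of objects handled.
 */
static int procq_drain(struct queue *q, int *exports, int *unexports)
{
	struct obj *o;
	int handled = 0;

	while ((o = procq_pop(q)) != NULL) {
		if (o->proc == PROC_EXPORT_PENDING) {
			o->proc = PROC_EXPORTED;
			++*exports;
		} else {
			o->proc = PROC_NONE;
			++*unexports;
		}
		++handled;
	}
	return handled;
}
```

With one queue, export and unexport requests are handled in the order they
were posted, rather than all exports first and all unexports afterwards.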



* Re: [Xenomai-help] Kernel crash during queue create/destroy
  2006-11-16 12:59       ` Stephan Zimmermann
@ 2006-11-16 13:26         ` Philippe Gerum
  0 siblings, 0 replies; 16+ messages in thread
From: Philippe Gerum @ 2006-11-16 13:26 UTC (permalink / raw)
  To: Stephan Zimmermann; +Cc: xenomai

On Thu, 2006-11-16 at 13:59 +0100, Stephan Zimmermann wrote:
> Am Donnerstag, 16. November 2006 12:57 schrieb Philippe Gerum:
> > On Wed, 2006-11-15 at 15:44 +0100, Stephan Zimmermann wrote:
> > > Am Mittwoch, 15. November 2006 14:19 schrieb Dmitry Adamushko:
> > > > Hello,
> > > >
> > > > I got some trouble with the native skin and queues, when creating /
> > > > deleting
> > > >
> > > > > queues, my Kernel sometimes (actually very often...) crashes, leading
> > > > > to a frozen system, with my Xenomai program continuing until it
> > > > > returns. I tried
> > > > > to isolate / reproduce the problem, which lead me to the following
> > > > > demo-code.
> >
> > [...]
> >
> > Cannot test it yet, but could you try out this patch? TIA,
> 
> This is the moment I have to confess that I'm not familiar enough with the
> patch command...
> 
> I downloaded a fresh copy of revision 1838 from svn using "svn co -r 1838
> svn://svn.gna.org/svn/xenomai/trunk xenomai", saved your patch to a file and
> tried to apply it from within the xenomai directory with "patch -p0 <
> filename" (which somehow worked last time). This produced the following output:
> <patch output>
> patching file ksrc/nucleus/registry.c
> Hunk #1 FAILED at 65.
> Hunk #2 FAILED at 104.
> Hunk #3 FAILED at 271.
> Hunk #4 FAILED at 335.
> Hunk #5 FAILED at 394.
> patch unexpectedly ends in middle of line
> Hunk #6 FAILED at 402.
> 6 out of 6 hunks FAILED -- saving rejects to file ksrc/nucleus/registry.c.rej
> </patch output>
> 
> Sorry, I don't know how to fix that...
> 

The copy and paste you did from the mail to get the patch code likely
failed, adding spurious line breaks and the like. Try asking your mail client
to save the mail to a file, remove the extraneous text, then apply the
patch from this file. patch -p0, as you did, is correct.
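
To see why spurious line breaks make every hunk fail, recall that each line
in the body of a unified-diff hunk must begin with ' ' (context), '-'
(removed) or '+' (added), and that the line counts must match the
"@@ -start,count +start,count @@" header. A wrapped line breaks both rules.
The following is a rough sketch of such a check, not how patch itself is
implemented. The function name hunk_body_ok is made up for this example, it
only handles headers that spell out both counts, and treating an empty line
as context is an assumption made for mails whose trailing blanks were
stripped.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Return 1 if the hunk body is consistent with its header, 0 otherwise.
 * header: a line such as "@@ -65,10 +65,8 @@"
 * lines:  the body lines of the hunk, without trailing newlines
 */
int hunk_body_ok(const char *header, const char *const *lines, size_t nlines)
{
	int old_cnt, new_cnt;
	int old_seen = 0, new_seen = 0;
	size_t i;

	/* Only the two line counts are needed; the start lines are skipped. */
	if (sscanf(header, "@@ -%*d,%d +%*d,%d @@", &old_cnt, &new_cnt) != 2)
		return 0;

	for (i = 0; i < nlines; i++) {
		switch (lines[i][0]) {
		case ' ':
		case '\0':	/* assumption: stripped blank context line */
			old_seen++;
			new_seen++;
			break;
		case '-':
			old_seen++;
			break;
		case '+':
			new_seen++;
			break;
		case '\\':	/* "\ No newline at end of file" marker */
			break;
		default:	/* e.g. the tail of a wrapped line */
			return 0;
		}
	}
	return old_seen == old_cnt && new_seen == new_cnt;
}
```

A body line such as "export. */ +static xnqueue_t registry_obj_procq;"
starts with neither ' ', '-' nor '+', so the check rejects the hunk it
belongs to.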

> > --- ksrc/nucleus/registry.c	(revision 1838)
> > +++ ksrc/nucleus/registry.c	(working copy)
> > @@ -65,10 +65,8 @@
> >
> >  static void registry_proc_schedule(void *cookie);
> >
> > -static xnqueue_t registry_obj_exportq;	/* Objects waiting for /proc export. */
> > +static xnqueue_t registry_obj_procq;	/* Objects waiting for /proc handling. */
> >
> > -static xnqueue_t registry_obj_unexportq;	/* Objects waiting for /proc unexport. */
> > -
> >  #ifndef CONFIG_PREEMPT_RT
> >  static DECLARE_WORK(registry_proc_work, &registry_proc_callback, NULL);
> >  #endif /* !CONFIG_PREEMPT_RT */
> > @@ -106,8 +104,7 @@
> >  		return -ENOMEM;
> >  	}
> >
> > -	initq(&registry_obj_exportq);
> > -	initq(&registry_obj_unexportq);
> > +	initq(&registry_obj_procq);
> >  #endif /* CONFIG_XENO_EXPORT_REGISTRY */
> >
> >  	initq(&registry_obj_freeq);
> > @@ -274,16 +271,20 @@
> >
> >  	xnlock_get_irqsave(&nklock, s);
> >
> > -	while ((holder = getq(&registry_obj_exportq)) != NULL) {
> > +	while ((holder = getq(&registry_obj_procq)) != NULL) {
> >  		object = link2xnobj(holder);
> >  		pnode = object->pnode;
> >  		type = pnode->type;
> > +		dir = pnode->dir;
> > +		rdir = pnode->root->dir;
> >  		root = pnode->root->name;
> > +
> > +		if (object->proc != XNOBJECT_PROC_RESERVED1)
> > +			goto unexport;
> > +
> >  		++pnode->entries;
> >  		object->proc = XNOBJECT_PROC_RESERVED2;
> >  		appendq(&registry_obj_busyq, holder);
> > -		dir = pnode->dir;
> > -		rdir = pnode->root->dir;
> >
> >  		xnlock_put_irqrestore(&nklock, s);
> >
> > @@ -334,19 +335,14 @@
> >  			object->pnode = NULL;
> >  			--pnode->entries;
> >  		}
> > -	}
> >
> > -	while ((holder = getq(&registry_obj_unexportq)) != NULL) {
> > -		object = link2xnobj(holder);
> > -		pnode = object->pnode;
> > -		object->pnode = NULL;
> > +		continue;
> > +
> > +	unexport:
> > +		entries = --pnode->entries;
> >  		entry = object->proc;
> >  		object->proc = NULL;
> > -		type = pnode->type;
> > -		dir = pnode->dir;
> > -		rdir = pnode->root->dir;
> > -		root = pnode->root->name;
> > -		entries = --pnode->entries;
> > +		object->pnode = NULL;
> >
> >  		if (entries <= 0) {
> >  			pnode->dir = NULL;
> > @@ -398,7 +394,7 @@
> >  	object->proc = XNOBJECT_PROC_RESERVED1;
> >  	object->pnode = pnode;
> >  	removeq(&registry_obj_busyq, &object->link);
> > -	appendq(&registry_obj_exportq, &object->link);
> > +	appendq(&registry_obj_procq, &object->link);
> >  	rthal_apc_schedule(registry_proc_apc);
> >  }
> >
> > @@ -406,13 +402,13 @@
> >  {
> >  	if (object->proc != XNOBJECT_PROC_RESERVED1) {
> >  		removeq(&registry_obj_busyq, &object->link);
> > -		appendq(&registry_obj_unexportq, &object->link);
> > +		appendq(&registry_obj_procq, &object->link);
> >  		rthal_apc_schedule(registry_proc_apc);
> >  	} else {
> >  		/* Unexporting before the lower stage has had a chance to
> >  		   export. Move back the object to the busyq just like if no
> >  		   export had been requested. */
> > -		removeq(&registry_obj_exportq, &object->link);
> > +		removeq(&registry_obj_procq, &object->link);
> >  		appendq(&registry_obj_busyq, &object->link);
> >  		object->pnode = NULL;
> >  		object->proc = NULL;
> 
-- 
Philippe.





* Re: [Xenomai-help] Kernel crash during queue create/destroy
  2006-11-16 11:57     ` Philippe Gerum
  2006-11-16 12:59       ` Stephan Zimmermann
@ 2006-11-16 15:35       ` Stephan Zimmermann
  2006-11-16 15:44         ` Jan Kiszka
  1 sibling, 1 reply; 16+ messages in thread
From: Stephan Zimmermann @ 2006-11-16 15:35 UTC (permalink / raw)
  To: xenomai

Am Donnerstag, 16. November 2006 12:57 schrieb Philippe Gerum:
> On Wed, 2006-11-15 at 15:44 +0100, Stephan Zimmermann wrote:
> > Am Mittwoch, 15. November 2006 14:19 schrieb Dmitry Adamushko:
> > > Hello,
> > >
> > > I got some trouble with the native skin and queues, when creating /
> > > deleting
> > >
> > > > queues, my Kernel sometimes (actually very often...) crashes, leading
> > > > to a frozen system, with my Xenomai program continuing until it
> > > > returns. I tried
> > > > to isolate / reproduce the problem, which lead me to the following
> > > > demo-code.
>
> [...]
>
> Cannot test it yet, but could you try out this patch? TIA,

So, I finally managed to apply your patch, thanks for the hint :). After
recompiling everything, it seems to work at first glance. But I guess your
patch is not the final solution (or we have discovered another problem here...).
When I run the test program, everything looks fine until I start to work with
the X server. Starting X, logging into KDE or shutting down X sometimes leads
to a syslog full of 'bad page' messages, as seen below. The system keeps
responding to commands, I can exit my test program by pressing CTRL-C, and X
works as usual (I am using it to type this mail...).

<syslog>
Bad page state in process 'maintask'
Nov 16 16:24:49 localhost kernel: page:c16d8b40 flags:0x80000400 mapping:00000000 mapcount:0 count:0
Nov 16 16:24:49 localhost kernel: Trying to fix it up, but a reboot is needed
Nov 16 16:24:49 localhost kernel: Backtrace:
Nov 16 16:24:49 localhost kernel:  <c01572e3> bad_page+0x43/0x6c  <c0157b32> free_hot_cold_page+0x60/0x14d
Nov 16 16:24:49 localhost kernel:  <c0165de5> __vunmap+0x91/0xc2  <c0165e33> vfree+0x1d/0x20
Nov 16 16:24:49 localhost kernel:  <c0136ba5> xnheap_destroy_mapped+0x200/0x23b  <c01513ab> rt_queue_delete+0xc2/0xe9
Nov 16 16:24:49 localhost kernel:  <c014ddb4> __rt_queue_delete+0x50/0x77  <c0143fee> losyscall_event+0xa4/0x146
Nov 16 16:24:49 localhost kernel:  <c0143f4a> losyscall_event+0x0/0x146  <c0134612> __ipipe_dispatch_event+0xa6/0x162
Nov 16 16:24:49 localhost kernel:  <c010e91e> __ipipe_syscall_root+0x6d/0xdb  <c0102e36> sysenter_past_esp+0x3b/0x67
Nov 16 16:24:49 localhost kernel: Bad page state in process 'maintask'
Nov 16 16:24:49 localhost kernel: page:c16dd6c0 flags:0x80000400 mapping:00000000 mapcount:0 count:0
Nov 16 16:24:49 localhost kernel: Trying to fix it up, but a reboot is needed
Nov 16 16:24:49 localhost kernel: Backtrace:
Nov 16 16:24:49 localhost kernel:  <c01572e3> bad_page+0x43/0x6c  <c0157b32> free_hot_cold_page+0x60/0x14d
Nov 16 16:24:49 localhost kernel:  <c0165de5> __vunmap+0x91/0xc2  <c0165e33> vfree+0x1d/0x20
Nov 16 16:24:49 localhost kernel:  <c0136ba5> xnheap_destroy_mapped+0x200/0x23b  <c01513ab> rt_queue_delete+0xc2/0xe9
Nov 16 16:24:49 localhost kernel:  <c014ddb4> __rt_queue_delete+0x50/0x77  <c0143fee> losyscall_event+0xa4/0x146
Nov 16 16:24:49 localhost kernel:  <c0143f4a> losyscall_event+0x0/0x146  <c0134612> __ipipe_dispatch_event+0xa6/0x162
Nov 16 16:24:49 localhost kernel:  <c010e91e> __ipipe_syscall_root+0x6d/0xdb  <c0102e36> sysenter_past_esp+0x3b/0x67
Nov 16 16:24:49 localhost kernel: Bad page state in process 'maintask'
Nov 16 16:24:49 localhost kernel: page:c16dd740 flags:0x80000400 mapping:00000000 mapcount:0 count:0
Nov 16 16:24:49 localhost kernel: Trying to fix it up, but a reboot is needed
Nov 16 16:24:49 localhost kernel: Backtrace:
Nov 16 16:24:49 localhost kernel:  <c01572e3> bad_page+0x43/0x6c  <c0157b32> free_hot_cold_page+0x60/0x14d
Nov 16 16:24:49 localhost kernel:  <c0165de5> __vunmap+0x91/0xc2  <c0165e33> vfree+0x1d/0x20
Nov 16 16:24:49 localhost kernel:  <c0136ba5> xnheap_destroy_mapped+0x200/0x23b  <c01513ab> rt_queue_delete+0xc2/0xe9
Nov 16 16:24:49 localhost kernel:  <c014ddb4> __rt_queue_delete+0x50/0x77  <c0143fee> losyscall_event+0xa4/0x146
Nov 16 16:24:49 localhost kernel:  <c0143f4a> losyscall_event+0x0/0x146  <c0134612> __ipipe_dispatch_event+0xa6/0x162
Nov 16 16:24:49 localhost kernel:  <c010e91e> __ipipe_syscall_root+0x6d/0xdb  <c0102e36> sysenter_past_esp+0x3b/0x67
Nov 16 16:24:49 localhost kernel: Bad page state in process 'maintask'
Nov 16 16:24:49 localhost kernel: page:c16dd760 flags:0x80000400 mapping:00000000 mapcount:0 count:0
Nov 16 16:24:49 localhost kernel: Trying to fix it up, but a reboot is needed
Nov 16 16:24:49 localhost kernel: Backtrace:
Nov 16 16:24:49 localhost kernel:  <c01572e3> bad_page+0x43/0x6c  <c0157b32> free_hot_cold_page+0x60/0x14d
Nov 16 16:24:49 localhost kernel:  <c0165de5> __vunmap+0x91/0xc2  <c0165e33> vfree+0x1d/0x20
Nov 16 16:24:49 localhost kernel:  <c0136ba5> xnheap_destroy_mapped+0x200/0x23b  <c01513ab> rt_queue_delete+0xc2/0xe9
Nov 16 16:24:49 localhost kernel:  <c014ddb4> __rt_queue_delete+0x50/0x77  <c0143fee> losyscall_event+0xa4/0x146
Nov 16 16:24:49 localhost kernel:  <c0143f4a> losyscall_event+0x0/0x146  <c0134612> __ipipe_dispatch_event+0xa6/0x162
Nov 16 16:24:49 localhost kernel:  <c010e91e> __ipipe_syscall_root+0x6d/0xdb  <c0102e36> sysenter_past_esp+0x3b/0x67
... many more here...
</syslog>



* Re: [Xenomai-help] Kernel crash during queue create/destroy
  2006-11-16 15:35       ` Stephan Zimmermann
@ 2006-11-16 15:44         ` Jan Kiszka
  2006-11-17 12:38           ` Stephan Zimmermann
  0 siblings, 1 reply; 16+ messages in thread
From: Jan Kiszka @ 2006-11-16 15:44 UTC (permalink / raw)
  To: Stephan Zimmermann; +Cc: xenomai


Stephan Zimmermann wrote:
> Am Donnerstag, 16. November 2006 12:57 schrieb Philippe Gerum:
>> On Wed, 2006-11-15 at 15:44 +0100, Stephan Zimmermann wrote:
>>> Am Mittwoch, 15. November 2006 14:19 schrieb Dmitry Adamushko:
>>>> Hello,
>>>>
>>>> I got some trouble with the native skin and queues, when creating /
>>>> deleting
>>>>
>>>>> queues, my Kernel sometimes (actually very often...) crashes, leading
>>>>> to a frozen system, with my Xenomai program continuing until it
>>>>> returns. I tried
>>>>> to isolate / reproduce the problem, which lead me to the following
>>>>> demo-code.
>> [...]
>>
>> Cannot test it yet, but could you try out this patch? TIA,
> 
> So, I finally managed to apply your patch, thanks for the hint :). After
> recompiling everything, it seems to work at first glance. But I guess your
> patch is not the final solution (or we have discovered another problem here...).
> When I run the test program, everything looks fine until I start to work with
> the X server. Starting X, logging into KDE or shutting down X sometimes leads
> to a syslog full of 'bad page' messages, as seen below. The system keeps
> responding to commands, I can exit my test program by pressing CTRL-C, and X
> works as usual (I am using it to type this mail...).

See last night's thread: this issue was due to my patch fixing too much
of a regression in SVN trunk. You should upgrade to head (#1842 by now)
and try again. "svn up" should work fine for this, merging all changes
together.

Jan




* Re: [Xenomai-help] Kernel crash during queue create/destroy
  2006-11-16 15:44         ` Jan Kiszka
@ 2006-11-17 12:38           ` Stephan Zimmermann
  0 siblings, 0 replies; 16+ messages in thread
From: Stephan Zimmermann @ 2006-11-17 12:38 UTC (permalink / raw)
  To: xenomai

Am Donnerstag, 16. November 2006 16:44 schrieb Jan Kiszka:
> Stephan Zimmermann wrote:
> > Am Donnerstag, 16. November 2006 12:57 schrieb Philippe Gerum:
> >> On Wed, 2006-11-15 at 15:44 +0100, Stephan Zimmermann wrote:
> >>> Am Mittwoch, 15. November 2006 14:19 schrieb Dmitry Adamushko:
> >>>> Hello,
> >>>>
> >>>> I got some trouble with the native skin and queues, when creating /
> >>>> deleting
> >>>>
> >>>>> queues, my Kernel sometimes (actually very often...) crashes, leading
> >>>>> to a frozen system, with my Xenomai program continuing until it
> >>>>> returns. I tried
> >>>>> to isolate / reproduce the problem, which lead me to the following
> >>>>> demo-code.
> >>
> >> [...]
> >>
> >> Cannot test it yet, but could you try out this patch? TIA,
> >
> > So, I finally managed to apply your patch, thanks for the hint :). After
> > recompiling everything, it seems to work at first glance. But I guess
> > your patch is not the final solution (or we have discovered another
> > problem here...). When I run the test program, everything looks fine until
> > I start to work with the X server. Starting X, logging into KDE or shutting
> > down X sometimes leads to a syslog full of 'bad page' messages, as seen
> > below. The system keeps responding to commands, I can exit my test program
> > by pressing CTRL-C, and X works as usual (I am using it to type this
> > mail...).
>
> See last night's thread: this issue was due to my patch fixing too much
> of a regression in SVN trunk. You should upgrade to head (#1842 by now)
> and try again. "svn up" should work fine for this, merging all changes
> together.

You're right. I upgraded to head and recompiled, and it seems to work now.
Many thanks for the fast response on this list.

Stephan



Thread overview: 16+ messages
2006-11-15 10:21 [Xenomai-help] Kernel crash during queue create/destroy Stephan Zimmermann
2006-11-15 13:19 ` Dmitry Adamushko
2006-11-15 14:44   ` Stephan Zimmermann
2006-11-15 21:01     ` Jan Kiszka
2006-11-15 22:22       ` Philippe Gerum
2006-11-15 22:36         ` Jan Kiszka
2006-11-15 22:43           ` Jan Kiszka
2006-11-15 22:46           ` Philippe Gerum
2006-11-15 22:52             ` Jan Kiszka
2006-11-15 22:44         ` Philippe Gerum
2006-11-16 11:57     ` Philippe Gerum
2006-11-16 12:59       ` Stephan Zimmermann
2006-11-16 13:26         ` Philippe Gerum
2006-11-16 15:35       ` Stephan Zimmermann
2006-11-16 15:44         ` Jan Kiszka
2006-11-17 12:38           ` Stephan Zimmermann
