All of lore.kernel.org
 help / color / mirror / Atom feed
* [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals
@ 2007-11-16 15:51 Remy Bohmer
  2007-11-16 20:20 ` Steven Rostedt
  0 siblings, 1 reply; 28+ messages in thread
From: Remy Bohmer @ 2007-11-16 15:51 UTC (permalink / raw)
  To: Steven Rostedt, Ingo Molnar, Thomas Gleixner; +Cc: RT, linux-kernel

Hello All,

I have a problem with the RT-mutex code and signals. The problem is
very easily reproducible, but I do not have found the root-cause yet.
I hope someone can help me on this one.

This is what I am doing:
* I have a simple character driver with a read call.(called spi_read()
in the logging below )
* The read call blocks on a semaphore until some condition in hardware
is reached. (in the routine wait_for_io_level(), see logging below)
* I use a down_interruptible() call on a 'struct semaphore' type
semaphore, which eventually blocks on a mutex. (the semaphore is, of
course, initialised with sema_init() )

What happens is that when a user-space RT-thread is waiting on the
semaphore through the spi_read() call, and a signal arrives during the
wait at this thread (like e.g. CTRL-C), the kernel starts oopsing
until it is as good as brain-dead.

If I do NOT sent a posix-signal the code/mutex/semaphore is working
properly for days,
So, it seems to be related by waking up from a blocking situation,
because of a pending posix-signal.

I tried also the types 'struct compat_semaphore', and mutexes; none of
them work.
In fact the real Mutex type, declared with init_MUTEX() has the same problem.

Anyone an idea?
Below the kernel oops output. (I run on ARM, but I think that should
not matter for this problem)

Kind Regards,

Remy Bohmer

[   52.113254] WARNING: at kernel/rtmutex.c:585 remove_waiter()
[   52.113254] [<c0022d88>] (dump_stack+0x0/0x14) from [<c004db84>]
(remove_waiter+0x14c/0x1e0)
[   52.113254] [<c004da38>] (remove_waiter+0x0/0x1e0) from
[<c0172ef4>] (rt_mutex_slowlock+0x278/0x2ec)
[   52.113254] [<c0172c7c>] (rt_mutex_slowlock+0x0/0x2ec) from
[<c0172b20>] (rt_mutex_lock_interruptible+0x58/0x5c)
[   52.113254] [<c0172ac8>] (rt_mutex_lock_interruptible+0x0/0x5c)
from [<c004e458>] (rt_down_interruptible+0x24/0x5c)
[   52.113254] [<c004e434>] (rt_down_interruptible+0x0/0x5c) from
[<c00fd9a4>] (wait_for_io_level+0x7c/0xcc)
[   52.113254]  r6:00000000 r5:00000024 r4:c01d5a34
[   52.113254] [<c00fd928>] (wait_for_io_level+0x0/0xcc) from
[<c00fe20c>] (spi_read+0x60/0x198)
[   52.113254]  r6:c3823f78 r5:c01d59b0 r4:00000000
[   52.113254] [<c00fe1ac>] (spi_read+0x0/0x198) from [<c006e750>]
(vfs_read+0xb4/0xf4)
[   52.113254]  r8:c001f044 r7:c3823f78 r6:40967c29 r5:c3e2ae80 r4:000001ff
[   52.113254] [<c006e69c>] (vfs_read+0x0/0xf4) from [<c006eb4c>]
(sys_read+0x4c/0x7c)
[   52.113254]  r7:00000003 r6:c3e2ae80 r5:00000000 r4:00000000
[   52.113254] [<c006eb00>] (sys_read+0x0/0x7c) from [<c001eec0>]
(ret_fast_syscall+0x0/0x2c)
[   52.113254]  r6:40042000 r5:40969490 r4:409694d8
[   52.791965] Unable to handle kernel NULL pointer dereference at
virtual address 00000000
[   52.791965] pgd = c3ff4000
[   52.791965] [00000000] *pgd=23e3b031, *pte=00000000, *ppte=00000000
[   52.791965] stopped custom tracer.
[   52.791965] Internal error: Oops: 817 [#1] PREEMPT
[   52.791965] CPU: 0    Not tainted  (2.6.23.1-rt5 #23)
[   52.791965] PC is at task_setprio+0x18/0x17c
[   52.791965] LR is at __rt_mutex_adjust_prio+0x30/0x34
[   52.791965] pc : [<c002a95c>]    lr : [<c004d488>]    psr: a0000093
[   52.791965] sp : c3823d80  ip : c3823db0  fp : c3823dac
[   52.791965] r10: ffffffff  r9 : a0000013  r8 : c3823ed0
[   52.791965] r7 : c036f544  r6 : c3823e08  r5 : a0000013  r4 : c02b6ca0
[   52.791965] r3 : 00000000  r2 : c02b6ec4  r1 : c02b6ec4  r0 : c02b6ca0
[   52.791965] Flags: NzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM
Segment user
[   52.791965] Control: c000717f  Table: 23ff4000  DAC: 00000015
[   52.791965] Process ceidrvtest (pid: 301, stack limit = 0xc3822260)
[   52.791965] Stack: (0xc3823d80 to 0xc3824000)
[   52.791965] 3d80: 00000002 00000001 c3823da4 c02b6ca0 a0000013
c3823e08 c036f544 c3823ed0
[   52.791965] 3da0: c3823dc4 c3823db0 c004d488 c002a954 c3823dd4
c3822000 c3823e04 c3823dc8
[   52.791965] 3dc0: c004dc60 c004d468 c0298000 ffffffff 00000000
c02b63a0 c3823e0c c3822000
[   52.791965] 3de0: a0000013 c036f544 00000000 c3823ed0 00000004
ffffffff c3823e64 c3823e08
[   52.791965] 3e00: c017309c c004dc28 c003c144 c003c0e0 00000000
c01b5ff0 c3823e44 c3823e28
[   52.791965] 3e20: c001eea8 c003c134 c3823e4c c3823e38 00000000
c004d128 c3823e64 c3822000
[   52.791965] 3e40: 00000000 c3823fb0 c3822000 c3823ed0 00000000
c3823f50 c3823e74 c3823e68
[   52.791965] 3e60: c0173420 c0172fc8 c3823e84 c3823e78 c0173434
c01733f0 c3823ebc c3823e88
[   52.791965] 3e80: c003ac84 c0173434 c3823fb0 c02b6ec4 c02b6f28
c3822000 00000000 c3823fb0
[   52.791965] 3ea0: c001f044 c001f044 c3822000 400298a8 c3823f9c
c3823ec0 c0021558 c003ac4c
[   52.791965] 3ec0: c3822000 400298a8 c02b6ec4 c3823ed8 c0172b20
c0172c8c c3823f04 c3823ee8
[   52.791965] 3ee0: c004e458 c0172ad8 c0171d70 c01d5a34 00000024
00000000 c3823f24 c3823f08
[   52.791965] 3f00: c00fd9a4 c004e444 c3823f54 00000000 c01d59b0
c3823f78 c3823f54 c3823f28
[   52.791965] 3f20: c00fe20c c00fd938 00000000 00000000 c001f044
000001ff c3e2ae80 40967c29
[   52.791965] 3f40: c3823f78 c001f044 c3823f74 c3823f58 c006e750
c00fe1bc 00000000 00000000
[   52.791965] 3f60: c3e2ae80 00000003 c3823fa4 c3823f78 c006eb70
409694d8 40969490 40042000
[   52.791965] 3f80: 00000003 c001f044 c3822000 400298a8 c3823fac
c3823fa0 c00219a0 c00214fc
[   52.791965] 3fa0: 00000000 c3823fb0 c001ef0c c0021984 fffffffc
40967c29 000001ff 000001ff
[   52.791965] 3fc0: 409694d8 40969490 40042000 00000003 40042000
003d0f00 400298a8 40967c0c
[   52.791965] 3fe0: 00000000 40967bd8 40033ea4 40033eb4 80000010
00000003 20002031 20002431
[   52.791965] Backtrace:
[   52.791965] [<c002a944>] (task_setprio+0x0/0x17c) from [<c004d488>]
(__rt_mutex_adjust_prio+0x30/0x34)
[   52.791965]  r8:c3823ed0 r7:c036f544 r6:c3823e08 r5:a0000013 r4:c02b6ca0
[   52.791965] [<c004d458>] (__rt_mutex_adjust_prio+0x0/0x34) from
[<c004dc60>] (task_blocks_on_rt_mutex+0x48/0x204)
[   52.791965]  r4:c3822000
[   52.791965] [<c004dc18>] (task_blocks_on_rt_mutex+0x0/0x204) from
[<c017309c>] (rt_spin_lock_slowlock+0xe4/0x1e8)
[   52.791965] [<c0172fb8>] (rt_spin_lock_slowlock+0x0/0x1e8) from
[<c0173420>] (__rt_spin_lock+0x40/0x44)
[   52.791965] [<c01733e0>] (__rt_spin_lock+0x0/0x44) from
[<c0173434>] (rt_spin_lock+0x10/0x14)
[   52.791965] [<c0173424>] (rt_spin_lock+0x0/0x14) from [<c003ac84>]
(get_signal_to_deliver+0x48/0x3e8)
[   52.791965] [<c003ac3c>] (get_signal_to_deliver+0x0/0x3e8) from
[<c0021558>] (do_signal+0x6c/0x488)
[   52.791965] [<c00214ec>] (do_signal+0x0/0x488) from [<c00219a0>]
(do_notify_resume+0x2c/0x30)
[   52.791965] [<c0021974>] (do_notify_resume+0x0/0x30) from
[<c001ef0c>] (work_pending+0x1c/0x20)
[   52.791965] Code: e24cb004 e24dd00c e351008c 83a03000 (85833000)
[   54.968721] note: ceidrvtest[301] exited with preempt_count 2
[   55.002901] Unable to handle kernel NULL pointer dereference at
virtual address 00000000
[   55.002901] pgd = c0004000
[   55.002901] [00000000] *pgd=00000000
[   55.002901] Internal error: Oops: 817 [#2] PREEMPT
[   55.002901] CPU: 0    Tainted: G      D  (2.6.23.1-rt5 #23)
[   55.002901] PC is at task_setprio+0x18/0x17c
[   55.002901] LR is at __rt_mutex_adjust_prio+0x30/0x34
[   55.002901] pc : [<c002a95c>]    lr : [<c004d488>]    psr: a0000093
[   55.002901] sp : c3823ae8  ip : c3823b18  fp : c3823b14
[   55.002901] r10: ffffffff  r9 : a0000013  r8 : 00000000
[   55.002901] r7 : c01b2000  r6 : c3823b70  r5 : a0000013  r4 : c02b6ca0
[   55.002901] r3 : 00000000  r2 : c02b6ec4  r1 : c02b6ec4  r0 : c02b6ca0
[   55.002901] Flags: NzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM
Segment user
[   55.002901] Control: c000717f  Table: 23ff4000  DAC: 00000015
[   55.002901] Process ceidrvtest (pid: 301, stack limit = 0xc3822260)
[   55.002901] Stack: (0xc3823ae8 to 0xc3824000)
[   55.002901] 3ae0:                   c3822000 000028cd 000028cd
c02b6ca0 a0000013 c3823b70
[   55.002901] 3b00: c01b2000 00000000 c3823b2c c3823b18 c004d488
c002a954 c002e7c8 c3822000
[   55.002901] 3b20: c3823b6c c3823b30 c004dc60 c004d468 00000001
00000000 00000000 c02b63a0
[   55.002901] 3b40: c004cb38 c3822000 a0000013 c01b2000 00000004
00000000 00000004 ffffffff
[   55.002901] 3b60: c3823bcc c3823b70 c017309c c004dc28 37383639
205d3132 c0033c00 00000000
[   55.002901] 3b80: c02b6ca0 409694d8 c0018674 c0018674 c02b6ca0
00000000 00000000 c0018674
[   55.002901] 3ba0: c02b6ca0 c02b6f10 c02b6ca0 c01b763c 00000000
00000000 c0018640 00000000
[   55.002901] 3bc0: c3823bdc c3823bd0 c0173420 c0172fc8 c3823bec
c3823be0 c0173618 c01733f0
[   55.002901] 3be0: c3823c0c c3823bf0 c00318e4 c0173618 c3823c0c
c3823c00 c002e0a4 c002dfe8
[   55.002901] 3c00: c3823c1c c3823c10 c0022c8c c00314f0 c3823c3c
c3823c20 c0023f9c c0022acc
[   55.002901] 3c20: ffffffff c01b422c c02b6ca0 c3823d38 c3823c84
c3823c40 c00241c8 c0023f40
[   55.002901] 3c40: c3823c64 c3823c50 c0028b08 c00289b8 c3823d38
00000817 c3823c7c ffffffff
[   55.002901] 3c60: c01b422c 00000817 c3823d38 00000000 a0000093
ffffffff c3823d34 c3823c88
[   55.002901] 3c80: c001e264 c0023fbc c3823cc4 c3823c98 c002abcc
c0171db0 c3823ca4 80000093
[   55.002901] 3ca0: c3823cec c028a040 00000000 c01ce5a8 c01b4d28
c01b4d50 c3823cd4 c3823cc8
[   55.002901] 3cc0: c002ac94 c002aad0 c3823cec c3823cd8 c00337a8
c002ac88 c01c7b80 20000093
[   55.002901] 3ce0: c3823d04 c3823cf0 c0034280 c0033740 c3823d14
c3823d00 c0051d50 c0052410
[   55.002901] 3d00: c0038120 c02b6ca0 c3823d2c c3823d18 c00382d4
ffffffff c3823d6c c3823e08
[   55.002901] 3d20: c036f544 c3823ed0 c3823dac c3823d38 c001ea80
c001e238 c02b6ca0 c02b6ec4
[   55.002901] 3d40: c02b6ec4 00000000 c02b6ca0 a0000013 c3823e08
c036f544 c3823ed0 a0000013
[   55.002901] 3d60: ffffffff c3823dac c3823db0 c3823d80 c004d488
c002a95c a0000093 ffffffff
[   55.002901] 3d80: 00000002 00000001 c3823da4 c02b6ca0 a0000013
c3823e08 c036f544 c3823ed0
[   55.002901] 3da0: c3823dc4 c3823db0 c004d488 c002a954 c3823dd4
c3822000 c3823e04 c3823dc8
[   55.002901] 3dc0: c004dc60 c004d468 c0298000 ffffffff 00000000
c02b63a0 c3823e0c c3822000
[   55.002901] 3de0: a0000013 c036f544 00000000 c3823ed0 00000004
ffffffff c3823e64 c3823e08
[   55.002901] 3e00: c017309c c004dc28 c003c144 c003c0e0 00000000
c01b5ff0 c3823e44 c3823e28
[   55.002901] 3e20: c001eea8 c003c134 c3823e4c c3823e38 00000000
c004d128 c3823e64 c3822000
[   55.002901] 3e40: 00000000 c3823fb0 c3822000 c3823ed0 00000000
c3823f50 c3823e74 c3823e68
[   55.002901] 3e60: c0173420 c0172fc8 c3823e84 c3823e78 c0173434
c01733f0 c3823ebc c3823e88
[   55.002901] 3e80: c003ac84 c0173434 c3823fb0 c02b6ec4 c02b6f28
c3822000 00000000 c3823fb0
[   55.002901] 3ea0: c001f044 c001f044 c3822000 400298a8 c3823f9c
c3823ec0 c0021558 c003ac4c
[   55.002901] 3ec0: c3822000 400298a8 c02b6ec4 c3823ed8 c0172b20
c0172c8c c3823f04 c3823ee8
[   55.002901] 3ee0: c004e458 c0172ad8 c0171d70 c01d5a34 00000024
00000000 c3823f24 c3823f08
[   55.002901] 3f00: c00fd9a4 c004e444 c3823f54 00000000 c01d59b0
c3823f78 c3823f54 c3823f28
[   55.002901] 3f20: c00fe20c c00fd938 00000000 00000000 c001f044
000001ff c3e2ae80 40967c29
[   55.002901] 3f40: c3823f78 c001f044 c3823f74 c3823f58 c006e750
c00fe1bc 00000000 00000000
[   55.002901] 3f60: c3e2ae80 00000003 c3823fa4 c3823f78 c006eb70
409694d8 40969490 40042000
[   55.002901] 3f80: 00000003 c001f044 c3822000 400298a8 c3823fac
c3823fa0 c00219a0 c00214fc
[   55.002901] 3fa0: 00000000 c3823fb0 c001ef0c c0021984 fffffffc
40967c29 000001ff 000001ff
[   55.002901] 3fc0: 409694d8 40969490 40042000 00000003 40042000
003d0f00 400298a8 40967c0c
[   55.002901] 3fe0: 00000000 40967bd8 40033ea4 40033eb4 80000010
00000003 20002031 20002431
[   55.002901] Backtrace:
[   55.002901] [<c002a944>] (task_setprio+0x0/0x17c) from [<c004d488>]
(__rt_mutex_adjust_prio+0x30/0x34)
[   55.002901]  r8:00000000 r7:c01b2000 r6:c3823b70 r5:a0000013 r4:c02b6ca0
[   55.002901] [<c004d458>] (__rt_mutex_adjust_prio+0x0/0x34) from
[<c004dc60>] (task_blocks_on_rt_mutex+0x48/0x204)
[   55.002901]  r4:c3822000
[   55.002901] [<c004dc18>] (task_blocks_on_rt_mutex+0x0/0x204) from
[<c017309c>] (rt_spin_lock_slowlock+0xe4/0x1e8)
[   55.002901] [<c0172fb8>] (rt_spin_lock_slowlock+0x0/0x1e8) from
[<c0173420>] (__rt_spin_lock+0x40/0x44)
[   55.002901] [<c01733e0>] (__rt_spin_lock+0x0/0x44) from
[<c0173618>] (rt_write_lock+0x10/0x14)
[   55.002901] [<c0173608>] (rt_write_lock+0x0/0x14) from [<c00318e4>]
(do_exit+0x404/0x7fc)
[   55.002901] [<c00314e0>] (do_exit+0x0/0x7fc) from [<c0022c8c>]
(die+0x1d0/0x21c)
[   55.002901] [<c0022abc>] (die+0x0/0x21c) from [<c0023f9c>]
(__do_kernel_fault+0x6c/0x7c)
[   55.002901] [<c0023f30>] (__do_kernel_fault+0x0/0x7c) from
[<c00241c8>] (do_page_fault+0x21c/0x238)
[   55.002901]  r7:c3823d38 r6:c02b6ca0 r5:c01b422c r4:ffffffff
[   55.002901] [<c0023fac>] (do_page_fault+0x0/0x238) from
[<c001e264>] (do_DataAbort+0x3c/0xa0)
[   55.002901] [<c001e228>] (do_DataAbort+0x0/0xa0) from [<c001ea80>]
(__dabt_svc+0x40/0x60)
[   55.002901] Exception stack(0xc3823d38 to 0xc3823d80)
[   55.002901] 3d20:
    c02b6ca0 c02b6ec4
[   55.002901] 3d40: c02b6ec4 00000000 c02b6ca0 a0000013 c3823e08
c036f544 c3823ed0 a0000013
[   55.002901] 3d60: ffffffff c3823dac c3823db0 c3823d80 c004d488
c002a95c a0000093 ffffffff
[   55.002901]  r8:c3823ed0 r7:c036f544 r6:c3823e08 r5:c3823d6c r4:ffffffff
[   55.002901] [<c002a944>] (task_setprio+0x0/0x17c) from [<c004d488>]
(__rt_mutex_adjust_prio+0x30/0x34)
[   55.002901]  r8:c3823ed0 r7:c036f544 r6:c3823e08 r5:a0000013 r4:c02b6ca0
[   55.002901] [<c004d458>] (__rt_mutex_adjust_prio+0x0/0x34) from
[<c004dc60>] (task_blocks_on_rt_mutex+0x48/0x204)
[   55.002901]  r4:c3822000
[   55.002901] [<c004dc18>] (task_blocks_on_rt_mutex+0x0/0x204) from
[<c017309c>] (rt_spin_lock_slowlock+0xe4/0x1e8)
[   55.002901] [<c0172fb8>] (rt_spin_lock_slowlock+0x0/0x1e8) from
[<c0173420>] (__rt_spin_lock+0x40/0x44)
[   55.002901] [<c01733e0>] (__rt_spin_lock+0x0/0x44) from
[<c0173434>] (rt_spin_lock+0x10/0x14)
[   55.002901] [<c0173424>] (rt_spin_lock+0x0/0x14) from [<c003ac84>]
(get_signal_to_deliver+0x48/0x3e8)
[   55.002901] [<c003ac3c>] (get_signal_to_deliver+0x0/0x3e8) from
[<c0021558>] (do_signal+0x6c/0x488)
[   55.002901] [<c00214ec>] (do_signal+0x0/0x488) from [<c00219a0>]
(do_notify_resume+0x2c/0x30)
[   55.002901] [<c0021974>] (do_notify_resume+0x0/0x30) from
[<c001ef0c>] (work_pending+0x1c/0x20)
[   55.002901] Code: e24cb004 e24dd00c e351008c 83a03000 (85833000)
[   59.099579] Fixing recursive fault but reboot is needed!
[   59.130829] BUG: scheduling while atomic: ceidrvtest/0x00000005/301, CPU#0
[   59.130829] [<c0022d88>] (dump_stack+0x0/0x14) from [<c0029bec>]
(__schedule_bug+0x3c/0x48)
[   59.130829] [<c0029bb0>] (__schedule_bug+0x0/0x48) from
[<c0171820>] (__schedule+0x90/0x44c)
[   59.130829] [<c0171790>] (__schedule+0x0/0x44c) from [<c0171fa4>]
(schedule+0xd0/0x118)
[   59.130829]  r7:00000817 r6:0000000b r5:c02b6ca0 r4:c3822000
[   59.130829] [<c0171ed4>] (schedule+0x0/0x118) from [<c00315ec>]
(do_exit+0x10c/0x7fc)
[   59.130829]  r4:c3822000
[   59.130829] [<c00314e0>] (do_exit+0x0/0x7fc) from [<c0022c8c>]
(die+0x1d0/0x21c)
[   59.130829] [<c0022abc>] (die+0x0/0x21c) from [<c0023f9c>]
(__do_kernel_fault+0x6c/0x7c)
[   59.130829] [<c0023f30>] (__do_kernel_fault+0x0/0x7c) from
[<c00241c8>] (do_page_fault+0x21c/0x238)
[   59.130829]  r7:c3823aa0 r6:c02b6ca0 r5:c01b422c r4:ffffffff
[   59.130829] [<c0023fac>] (do_page_fault+0x0/0x238) from
[<c001e264>] (do_DataAbort+0x3c/0xa0)
[   59.130829] [<c001e228>] (do_DataAbort+0x0/0xa0) from [<c001ea80>]
(__dabt_svc+0x40/0x60)
[   59.130829] Exception stack(0xc3823aa0 to 0xc3823ae8)
[   59.130829] 3aa0: c02b6ca0 c02b6ec4 c02b6ec4 00000000 c02b6ca0
a0000013 c3823b70 c01b2000
[   59.130829] 3ac0: 00000000 a0000013 ffffffff c3823b14 c3823b18
c3823ae8 c004d488 c002a95c
[   59.130829] 3ae0: a0000093 ffffffff
[   59.130829]  r8:00000000 r7:c01b2000 r6:c3823b70 r5:c3823ad4 r4:ffffffff
[   59.130829] [<c002a944>] (task_setprio+0x0/0x17c) from [<c004d488>]
(__rt_mutex_adjust_prio+0x30/0x34)
[   59.130829]  r8:00000000 r7:c01b2000 r6:c3823b70 r5:a0000013 r4:c02b6ca0
[   59.130829] [<c004d458>] (__rt_mutex_adjust_prio+0x0/0x34) from
[<c004dc60>] (task_blocks_on_rt_mutex+0x48/0x204)
[   59.130829]  r4:c3822000
[   59.130829] [<c004dc18>] (task_blocks_on_rt_mutex+0x0/0x204) from
[<c017309c>] (rt_spin_lock_slowlock+0xe4/0x1e8)
[   59.130829] [<c0172fb8>] (rt_spin_lock_slowlock+0x0/0x1e8) from
[<c0173420>] (__rt_spin_lock+0x40/0x44)
[   59.130829] [<c01733e0>] (__rt_spin_lock+0x0/0x44) from
[<c0173618>] (rt_write_lock+0x10/0x14)
[   59.130829] [<c0173608>] (rt_write_lock+0x0/0x14) from [<c00318e4>]
(do_exit+0x404/0x7fc)
[   59.130829] [<c00314e0>] (do_exit+0x0/0x7fc) from [<c0022c8c>]
(die+0x1d0/0x21c)
[   59.130829] [<c0022abc>] (die+0x0/0x21c) from [<c0023f9c>]
(__do_kernel_fault+0x6c/0x7c)
[   59.130829] [<c0023f30>] (__do_kernel_fault+0x0/0x7c) from
[<c00241c8>] (do_page_fault+0x21c/0x238)
[   59.130829]  r7:c3823d38 r6:c02b6ca0 r5:c01b422c r4:ffffffff
[   59.130829] [<c0023fac>] (do_page_fault+0x0/0x238) from
[<c001e264>] (do_DataAbort+0x3c/0xa0)
[   59.130829] [<c001e228>] (do_DataAbort+0x0/0xa0) from [<c001ea80>]
(__dabt_svc+0x40/0x60)
[   59.130829] Exception stack(0xc3823d38 to 0xc3823d80)
[   59.130829] 3d20:
    c02b6ca0 c02b6ec4
[   59.130829] 3d40: c02b6ec4 00000000 c02b6ca0 a0000013 c3823e08
c036f544 c3823ed0 a0000013
[   59.130829] 3d60: ffffffff c3823dac c3823db0 c3823d80 c004d488
c002a95c a0000093 ffffffff
[   59.130829]  r8:c3823ed0 r7:c036f544 r6:c3823e08 r5:c3823d6c r4:ffffffff
[   59.130829] [<c002a944>] (task_setprio+0x0/0x17c) from [<c004d488>]
(__rt_mutex_adjust_prio+0x30/0x34)
[   59.130829]  r8:c3823ed0 r7:c036f544 r6:c3823e08 r5:a0000013 r4:c02b6ca0
[   59.130829] [<c004d458>] (__rt_mutex_adjust_prio+0x0/0x34) from
[<c004dc60>] (task_blocks_on_rt_mutex+0x48/0x204)
[   59.130829]  r4:c3822000
[   59.130829] [<c004dc18>] (task_blocks_on_rt_mutex+0x0/0x204) from
[<c017309c>] (rt_spin_lock_slowlock+0xe4/0x1e8)
[   59.130829] [<c0172fb8>] (rt_spin_lock_slowlock+0x0/0x1e8) from
[<c0173420>] (__rt_spin_lock+0x40/0x44)
[   59.130829] [<c01733e0>] (__rt_spin_lock+0x0/0x44) from
[<c0173434>] (rt_spin_lock+0x10/0x14)
[   59.130829] [<c0173424>] (rt_spin_lock+0x0/0x14) from [<c003ac84>]
(get_signal_to_deliver+0x48/0x3e8)
[   59.130829] [<c003ac3c>] (get_signal_to_deliver+0x0/0x3e8) from
[<c0021558>] (do_signal+0x6c/0x488)
[   59.130829] [<c00214ec>] (do_signal+0x0/0x488) from [<c00219a0>]
(do_notify_resume+0x2c/0x30)
[   59.130829] [<c0021974>] (do_notify_resume+0x0/0x30) from
[<c001ef0c>] (work_pending+0x1c/0x20)

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals
  2007-11-16 15:51 [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals Remy Bohmer
@ 2007-11-16 20:20 ` Steven Rostedt
  2007-11-16 23:02   ` Remy Bohmer
  0 siblings, 1 reply; 28+ messages in thread
From: Steven Rostedt @ 2007-11-16 20:20 UTC (permalink / raw)
  To: Remy Bohmer; +Cc: Ingo Molnar, Thomas Gleixner, RT, linux-kernel



On Fri, 16 Nov 2007, Remy Bohmer wrote:

> Hello All,
>
> I have a problem with the RT-mutex code and signals. The problem is
> very easily reproducible, but I do not have found the root-cause yet.
> I hope someone can help me on this one.
>
> This is what I am doing:
> * I have a simple character driver with a read call.(called spi_read()
> in the logging below )
> * The read call blocks on a semaphore until some condition in hardware
> is reached. (in the routine wait_for_io_level(), see logging below)
> * I use a down_interruptible() call on a 'struct semaphore' type
> semaphore, which eventually blocks on a mutex. (the semaphore is, of
> course, initialised with sema_init() )

The above sounds more like you need a completion. What's used to wake up
the caller of down_interruptible?

Can you post some code to see what you are doing. That would make it much
easier to analyze.

-- Steve


>
> What happens is that when a user-space RT-thread is waiting on the
> semaphore through the spi_read() call, and a signal arrives during the
> wait at this thread (like e.g. CTRL-C), the kernel starts oopsing
> until it is as good as brain-dead.
>
> If I do NOT sent a posix-signal the code/mutex/semaphore is working
> properly for days,
> So, it seems to be related by waking up from a blocking situation,
> because of a pending posix-signal.
>
> I tried also the types 'struct compat_semaphore', and mutexes; none of
> them work.
> In fact the real Mutex type, declared with init_MUTEX() has the same problem.
>
> Anyone an idea?
> Below the kernel oops output. (I run on ARM, but I think that should
> not matter for this problem)
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals
  2007-11-16 20:20 ` Steven Rostedt
@ 2007-11-16 23:02   ` Remy Bohmer
  2007-11-16 23:37     ` Steven Rostedt
  0 siblings, 1 reply; 28+ messages in thread
From: Remy Bohmer @ 2007-11-16 23:02 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Ingo Molnar, Thomas Gleixner, RT, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3935 bytes --]

Hello Steven,

Thanks for your reply

> The above sounds more like you need a completion.
Funny, I first started with using completion structures, but that did
not work either. I get similar OOPses on all these kind of locking
mechanisms, as long as I use the _interruptible() type. I tried every
work-around I can think of, but none worked :-((
Even if I block on an ordinary rt-mutex in the same routine, wait a
_interruptible() type, I get the same problem.

> What's used to wake up the caller of down_interruptible?
A call to up() is used from inside an interrupt(thread) context, but
this is not relevant for the problem, because only blocking on a
semaphore with down_interruptible() and waking the thread by CTRL-C is
enough to get this Oops.

I saw that the code is trying to wake 'other waiters', while there is
only 1 thread waiting on the semaphore at most. I feel that the root
cause of this problem has to be searched there.

I believe that executing any PI code on semaphores is a strange
behavior anyway, because a semaphore is never 'owned' by a thread, and
it is always another thread that wakes the thread that blocks on a
semaphore, and because the waker is unknown, the PI code will always
boost the prio of the wrong thread.

Strange is also, that I get different behavior on ARM if I use
sema_init(&sema, 1) versus sema_init(&sema,0). The latter seems to
crash less, it will not crash until the first up(); while the first
will crash even without any up().

Attached I have put a sample driver I just hacked together a few
minutes ago. It is NOT the driver that has generates the oops in the
previous mail, but I have stripped a scull-driver down that much that
it will be much easier to talk about, and to keep us focussed on the
part of the code that is causing this.
Besides: I tested this driver on X86 with 2.6.23.1-rt5 and I get the
also OOPSes although slightly different than on ARM. See the attached
dummy.txt file.


Beware: The up(&sema) is NOT required to get this OOPS, I get it even
without any up(&sema) !

I hope you can look at the attached driver source and help me with this...


Kind Regards,

Remy Bohmer



2007/11/16, Steven Rostedt <rostedt@goodmis.org>:
>
>
> On Fri, 16 Nov 2007, Remy Bohmer wrote:
>
> > Hello All,
> >
> > I have a problem with the RT-mutex code and signals. The problem is
> > very easily reproducible, but I do not have found the root-cause yet.
> > I hope someone can help me on this one.
> >
> > This is what I am doing:
> > * I have a simple character driver with a read call.(called spi_read()
> > in the logging below )
> > * The read call blocks on a semaphore until some condition in hardware
> > is reached. (in the routine wait_for_io_level(), see logging below)
> > * I use a down_interruptible() call on a 'struct semaphore' type
> > semaphore, which eventually blocks on a mutex. (the semaphore is, of
> > course, initialised with sema_init() )
>
> The above sounds more like you need a completion. What's used to wake up
> the caller of down_interruptible?
>
> Can you post some code to see what you are doing. That would make it much
> easier to analyze.
>
> -- Steve
>
>
> >
> > What happens is that when a user-space RT-thread is waiting on the
> > semaphore through the spi_read() call, and a signal arrives during the
> > wait at this thread (like e.g. CTRL-C), the kernel starts oopsing
> > until it is as good as brain-dead.
> >
> > If I do NOT sent a posix-signal the code/mutex/semaphore is working
> > properly for days,
> > So, it seems to be related by waking up from a blocking situation,
> > because of a pending posix-signal.
> >
> > I tried also the types 'struct compat_semaphore', and mutexes; none of
> > them work.
> > In fact the real Mutex type, declared with init_MUTEX() has the same problem.
> >
> > Anyone an idea?
> > Below the kernel oops output. (I run on ARM, but I think that should
> > not matter for this problem)
> >
>

[-- Attachment #2: dummy.tgz --]
[-- Type: application/x-gzip, Size: 1355 bytes --]

[-- Attachment #3: dummy.txt --]
[-- Type: text/plain, Size: 30848 bytes --]

Linux version 2.6.23.1-rt5 (remy@huiskamer) (gcc version 4.1.2 20070925 (Red Hat 4.1.2-33)) #1 SMP PREEMPT RT Fri Nov 16 23:28:18 CET 2007
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e8000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000003ff30000 (usable)
 BIOS-e820: 000000003ff30000 - 000000003ff40000 (ACPI data)
 BIOS-e820: 000000003ff40000 - 000000003fff0000 (ACPI NVS)
 BIOS-e820: 000000003fff0000 - 0000000040000000 (reserved)
 BIOS-e820: 00000000ffb80000 - 0000000100000000 (reserved)
127MB HIGHMEM available.
896MB LOWMEM available.
found SMP MP-table at 000ff780
Entering add_active_range(0, 0, 261936) 0 entries of 256 used
Zone PFN ranges:
  DMA             0 ->     4096
  Normal       4096 ->   229376
  HighMem    229376 ->   261936
Movable zone start PFN for each node
early_node_map[1] active PFN ranges
    0:        0 ->   261936
On node 0 totalpages: 261936
  DMA zone: 56 pages used for memmap
  DMA zone: 0 pages reserved
  DMA zone: 4040 pages, LIFO batch:0
  Normal zone: 3080 pages used for memmap
  Normal zone: 222200 pages, LIFO batch:31
  HighMem zone: 445 pages used for memmap
  HighMem zone: 32115 pages, LIFO batch:7
  Movable zone: 0 pages used for memmap
DMI 2.3 present.
Using APIC driver default
ACPI: RSDP 000F9E30, 0021 (r2 ACPIAM)
ACPI: XSDT 3FF30100, 003C (r1 A M I  OEMXSDT  10000414 MSFT       97)
ACPI: FACP 3FF30290, 00F4 (r3 A M I  OEMFACP  10000414 MSFT       97)
ACPI: DSDT 3FF303F0, 382D (r1  P4P81 P4P81104      104 INTL  2002026)
ACPI: FACS 3FF40000, 0040
ACPI: APIC 3FF30390, 005C (r1 A M I  OEMAPIC  10000414 MSFT       97)
ACPI: OEMB 3FF40040, 003F (r1 A M I  OEMBIOS  10000414 MSFT       97)
ACPI: PM-Timer IO Port: 0x808
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 15:2 APIC version 20
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1 15:2 APIC version 20
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Enabling APIC mode:  Flat.  Using 1 I/O APICs
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 50000000 (gap: 40000000:bfb80000)
swsusp: Registered nosave memory region: 000000000009f000 - 00000000000a0000
swsusp: Registered nosave memory region: 00000000000a0000 - 00000000000e8000
swsusp: Registered nosave memory region: 00000000000e8000 - 0000000000100000
Real-Time Preemption Support (C) 2004-2007 Ingo Molnar
Built 1 zonelists in Zone order.  Total pages: 258355
Kernel command line: ro root=/dev/lvmbase/root rhgb quiet
mapped APIC to ffffb000 (fee00000)
mapped IOAPIC to ffffa000 (fec00000)
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
WARNING: experimental RCU implementation.
CPU 0 irqstacks, hard=c07b0000 soft=c0790000
PID hash table entries: 4096 (order: 12, 16384 bytes)
Detected 3198.545 MHz processor.
Console: colour VGA+ 80x25
console [tty0] enabled
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 1023700k/1047744k available (2182k kernel code, 23336k reserved, 1157k data, 276k init, 130240k highmem)
virtual kernel memory layout:
    fixmap  : 0xffc53000 - 0xfffff000   (3760 kB)
    pkmap   : 0xff800000 - 0xffc00000   (4096 kB)
    vmalloc : 0xf8800000 - 0xff7fe000   ( 111 MB)
    lowmem  : 0xc0000000 - 0xf8000000   ( 896 MB)
      .init : 0xc0748000 - 0xc078d000   ( 276 kB)
      .data : 0xc0621b0c - 0xc0743124   (1157 kB)
      .text : 0xc0400000 - 0xc0621b0c   (2182 kB)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay using timer specific routine.. 6399.57 BogoMIPS (lpj=3199788)
Security Framework v1.0.0 initialized
SELinux:  Initializing.
SELinux:  Starting in permissive mode
selinux_register_security:  Registering secondary module capability
Capability LSM initialized as secondary
Mount-cache hash table entries: 512
CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000 00000000
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: Physical Processor ID: 0
CPU: After all inits, caps: bfebfbff 00000000 00000000 0000b080 00004400 00000000 00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU0: Intel P4/Xeon Extended MCE MSRs (12) available
CPU0: Thermal monitoring enabled
Compat vDSO mapped to ffffe000.
Checking 'hlt' instruction... OK.
SMP alternatives: switching to UP code
ACPI: Core revision 20070126
CPU0: Intel(R) Pentium(R) 4 CPU 3.20GHz stepping 09
SMP alternatives: switching to SMP code
Booting processor 1/1 eip 3000
CPU 1 irqstacks, hard=c07b1000 soft=c0791000
Initializing CPU#1
Calibrating delay using timer specific routine.. 6395.97 BogoMIPS (lpj=3197987)
CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000 00000000
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: Physical Processor ID: 0
CPU: After all inits, caps: bfebfbff 00000000 00000000 0000b080 00004400 00000000 00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#1.
CPU1: Intel P4/Xeon Extended MCE MSRs (12) available
CPU1: Thermal monitoring enabled
CPU1: Intel(R) Pentium(R) 4 CPU 3.20GHz stepping 09
Total of 2 processors activated (12795.55 BogoMIPS).
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
checking TSC synchronization [CPU#0 -> CPU#1]: passed.
Brought up 2 CPUs
Booting paravirtualized kernel on bare hardware
Time: 23:44:46  Date: 10/16/107
NET: Registered protocol family 16
No dock devices found.
ACPI: bus type pci registered
PCI: PCI BIOS revision 2.10 entry at 0xf0031, last bus=3
PCI: Using configuration type 1
Setting up standard PCI resources
ACPI: EC: Look up EC in DSDT
ACPI: Interpreter enabled
ACPI: (supports S0 S1 S3 S4 S5)
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
Force enabled HPET at base address 0xfed00000
PCI quirk: region 0800-087f claimed by ICH4 ACPI/GPIO/TCO
PCI quirk: region 0480-04bf claimed by ICH4 GPIO
PCI: Transparent bridge - 0000:00:1e.0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P4._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 *7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 *5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs *3 4 5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 *5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 7 10 *11 12 14 15)
ACPI Warning (tbutils-0217): Incorrect checksum in table [OEMB] -  03, should be FA [20070126]
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
ACPI: bus type pnp registered
pnp: PnP ACPI: found 13 devices
ACPI: ACPI bus type pnp unregistered
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report
PCI: Cannot allocate resource region 0 of device 0000:00:00.0
NetLabel: Initializing
NetLabel:  domain hash size = 128
NetLabel:  protocols = UNLABELED CIPSOv4
NetLabel:  unlabeled traffic allowed by default
hpet clockevent registered
hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
hpet0: 3 64-bit timers, 14318180 Hz
pnp: 00:09: ioport range 0x680-0x6ff has been reserved
pnp: 00:09: ioport range 0x290-0x297 has been reserved
pnp: 00:0a: iomem range 0xfed20000-0xfed8ffff has been reserved
pnp: 00:0a: iomem range 0xffb00000-0xffbfffff could not be reserved
pnp: 00:0b: iomem range 0xfec00000-0xfec00fff has been reserved
pnp: 00:0b: iomem range 0xfee00000-0xfee00fff has been reserved
pnp: 00:0c: iomem range 0x0-0x9ffff could not be reserved
pnp: 00:0c: iomem range 0xc0000-0xdffff could not be reserved
pnp: 00:0c: iomem range 0xe0000-0xfffff could not be reserved
pnp: 00:0c: iomem range 0x100000-0x3ffeffff could not be reserved
PCI: Bridge: 0000:00:01.0
  IO window: d000-dfff
  MEM window: f6d00000-f6dfffff
  PREFETCH window: d0000000-f5ffffff
PCI: Bridge: 0000:02:09.0
  IO window: disabled.
  MEM window: f6f00000-fbffffff
  PREFETCH window: disabled.
PCI: Bridge: 0000:00:1e.0
  IO window: e000-efff
  MEM window: f6e00000-fbffffff
  PREFETCH window: disabled.
PCI: Setting latency timer of device 0000:00:1e.0 to 64
NET: Registered protocol family 2
IP route cache hash table entries: 32768 (order: 5, 131072 bytes)
TCP established hash table entries: 65536 (order: 9, 2883584 bytes)
TCP bind hash table entries: 65536 (order: 9, 2359296 bytes)
TCP: Hash tables configured (established 65536 bind 65536)
TCP reno registered
checking if image is initramfs... it is
Freeing initrd memory: 3581k freed
apm: BIOS version 1.2 Flags 0x03 (Driver version 1.16ac)
apm: disabled - APM is not SMP safe.
audit: initializing netlink socket (disabled)
audit(1195256685.210:1): initialized
highmem bounce pool size: 64 pages
Total HugeTLB memory allocated, 0
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
SELinux:  Registering netfilter hooks
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253)
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
Boot video device is 0000:01:00.0
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
Real Time Clock Driver v1.12ac
Non-volatile memory driver v1.2
Linux agpgart interface v0.102
agpgart: Detected an Intel 865 Chipset.
agpgart: AGP aperture is 64M @ 0x54000000
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:07: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
RAMDISK driver initialized: 16 RAM disks of 16384K size 4096 blocksize
input: Macintosh mouse button emulation as /class/input/input0
PNP: PS/2 Controller [PNP0303:PS2K,PNP0f03:PS2M] at 0x60,0x64 irq 1,12
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
input: AT Translated Set 2 keyboard as /class/input/input1
cpuidle: using governor menu
usbcore: registered new interface driver hiddev
usbcore: registered new interface driver usbhid
drivers/hid/usbhid/hid-core.c: v2.6:USB HID core driver
TCP cubic registered
Initializing XFRM netlink socket
NET: Registered protocol family 1
NET: Registered protocol family 17
Using IPI No-Shortcut mode
  Magic number: 15:358:757
Freeing unused kernel memory: 276k freed
Write protecting the kernel read-only data: 856k
logips2pp: Detected unknown logitech mouse model 127
ACPI: PCI Interrupt 0000:00:1d.7[D] -> GSI 23 (level, low) -> IRQ 16
PCI: Setting latency timer of device 0000:00:1d.7 to 64
ehci_hcd 0000:00:1d.7: EHCI Host Controller
ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 1
ehci_hcd 0000:00:1d.7: debug port 1
PCI: cache line size of 128 is not supported by device 0000:00:1d.7
ehci_hcd 0000:00:1d.7: irq 16, io mem 0xf6cffc00
ehci_hcd 0000:00:1d.7: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 8 ports detected
ohci_hcd: 2006 August 04 USB 1.1 'Open' Host Controller (OHCI) Driver
USB Universal Host Controller Interface driver v3.0
ACPI: PCI Interrupt 0000:00:1d.0[A] -> GSI 16 (level, low) -> IRQ 17
PCI: Setting latency timer of device 0000:00:1d.0 to 64
uhci_hcd 0000:00:1d.0: UHCI Host Controller
uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 2
uhci_hcd 0000:00:1d.0: irq 17, io base 0x0000c480
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
ACPI: PCI Interrupt 0000:00:1d.1[B] -> GSI 19 (level, low) -> IRQ 18
PCI: Setting latency timer of device 0000:00:1d.1 to 64
uhci_hcd 0000:00:1d.1: UHCI Host Controller
uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 3
uhci_hcd 0000:00:1d.1: irq 18, io base 0x0000c800
usb usb3: configuration #1 chosen from 1 choice
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
ACPI: PCI Interrupt 0000:00:1d.2[C] -> GSI 18 (level, low) -> IRQ 19
PCI: Setting latency timer of device 0000:00:1d.2 to 64
uhci_hcd 0000:00:1d.2: UHCI Host Controller
uhci_hcd 0000:00:1d.2: new USB bus registered, assigned bus number 4
uhci_hcd 0000:00:1d.2: irq 19, io base 0x0000c880
usb usb4: configuration #1 chosen from 1 choice
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 2 ports detected
input: ImExPS/2 Logitech Explorer Mouse as /class/input/input2
ACPI: PCI Interrupt 0000:00:1d.3[A] -> GSI 16 (level, low) -> IRQ 17
PCI: Setting latency timer of device 0000:00:1d.3 to 64
uhci_hcd 0000:00:1d.3: UHCI Host Controller
uhci_hcd 0000:00:1d.3: new USB bus registered, assigned bus number 5
uhci_hcd 0000:00:1d.3: irq 17, io base 0x0000cc00
usb usb5: configuration #1 chosen from 1 choice
hub 5-0:1.0: USB hub found
hub 5-0:1.0: 2 ports detected
usb 2-2: new full speed USB device using uhci_hcd and address 2
SCSI subsystem initialized
libata version 2.21 loaded.
ata_piix 0000:00:1f.1: version 2.12
PCI: Enabling device 0000:00:1f.1 (0005 -> 0007)
ACPI: PCI Interrupt 0000:00:1f.1[A] -> GSI 18 (level, low) -> IRQ 19
PCI: Setting latency timer of device 0000:00:1f.1 to 64
scsi0 : ata_piix
scsi1 : ata_piix
ata1: PATA max UDMA/133 cmd 0x000101f0 ctl 0x000103f6 bmdma 0x0001fc00 irq 14
ata2: PATA max UDMA/133 cmd 0x00010170 ctl 0x00010376 bmdma 0x0001fc08 irq 15
usb 2-2: configuration #1 chosen from 1 choice
hub 2-2:1.0: USB hub found
hub 2-2:1.0: 4 ports detected
ata1.00: ATAPI: SONY    DVD RW DRU-500A, 2.1a, max UDMA/33
ata1.00: configured for UDMA/33
ata2.00: ATA-6: WDC WD3200JB-00KFA0, 08.05J08, max UDMA/100
ata2.00: 625142448 sectors, multi 16: LBA48 
ata2.01: ATA-7: Maxtor 6E040L0, NAR61590, max UDMA/133
ata2.01: 80293248 sectors, multi 16: LBA 
ata2.00: configured for UDMA/100
ata2.01: configured for UDMA/133
scsi 0:0:0:0: CD-ROM            SONY     DVD RW DRU-500A  2.1a PQ: 0 ANSI: 5
scsi 1:0:0:0: Direct-Access     ATA      WDC WD3200JB-00K 08.0 PQ: 0 ANSI: 5
sd 1:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
sd 1:0:0:0: [sda] Write Protect is off
sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 1:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
sd 1:0:0:0: [sda] Write Protect is off
sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 sda: sda1 sda2 < sda5 > sda3 sda4
sd 1:0:0:0: [sda] Attached SCSI disk
scsi 1:0:1:0: Direct-Access     ATA      Maxtor 6E040L0   NAR6 PQ: 0 ANSI: 5
sd 1:0:1:0: [sdb] 80293248 512-byte hardware sectors (41110 MB)
sd 1:0:1:0: [sdb] Write Protect is off
sd 1:0:1:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 1:0:1:0: [sdb] 80293248 512-byte hardware sectors (41110 MB)
sd 1:0:1:0: [sdb] Write Protect is off
sd 1:0:1:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 sdb: sdb1
sd 1:0:1:0: [sdb] Attached SCSI disk
ata_piix 0000:00:1f.2: MAP [ P0 -- P1 -- ]
ACPI: PCI Interrupt 0000:00:1f.2[A] -> GSI 18 (level, low) -> IRQ 19
PCI: Setting latency timer of device 0000:00:1f.2 to 64
scsi2 : ata_piix
scsi3 : ata_piix
ata3: SATA max UDMA/133 cmd 0x0001bc00 ctl 0x0001b882 bmdma 0x0001b400 irq 19
ata4: SATA max UDMA/133 cmd 0x0001b800 ctl 0x0001b482 bmdma 0x0001b408 irq 19
ata3.00: ATA-7: Maxtor 6Y080M0, YAR51EW0, max UDMA/133
ata3.00: 160086528 sectors, multi 16: LBA 
ata3.00: configured for UDMA/133
scsi 2:0:0:0: Direct-Access     ATA      Maxtor 6Y080M0   YAR5 PQ: 0 ANSI: 5
sd 2:0:0:0: [sdc] 160086528 512-byte hardware sectors (81964 MB)
sd 2:0:0:0: [sdc] Write Protect is off
sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 2:0:0:0: [sdc] 160086528 512-byte hardware sectors (81964 MB)
sd 2:0:0:0: [sdc] Write Protect is off
sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 sdc: sdc1 sdc2
sd 2:0:0:0: [sdc] Attached SCSI disk
device-mapper: ioctl: 4.11.0-ioctl (2006-10-12) initialised: dm-devel@redhat.com
kjournald starting.  Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
SELinux:  Disabled at runtime.
SELinux:  Unregistering netfilter hooks
audit(1195256693.209:2): selinux=0 auid=4294967295
intel_rng: FWH not detected
sr0: scsi3-mmc drive: 32x/32x writer cd/rw xa/form2 cdda tray
Uniform CD-ROM driver Revision: 3.20
sr 0:0:0:0: Attached scsi CD-ROM sr0
iTCO_vendor_support: vendor-support=0
sr 0:0:0:0: Attached scsi generic sg0 type 5
sd 1:0:0:0: Attached scsi generic sg1 type 0
sd 1:0:1:0: Attached scsi generic sg2 type 0
sd 2:0:0:0: Attached scsi generic sg3 type 0
iTCO_wdt: Intel TCO WatchDog Timer Driver v1.02 (26-Jul-2007)
iTCO_wdt: Found a ICH5 or ICH5R TCO device (Version=1, TCOBASE=0x0860)
iTCO_wdt: initialized. heartbeat=30 sec (nowayout=0)
Floppy drive(s): fd0 is 1.44M
ACPI: PCI Interrupt 0000:00:1f.3[B] -> GSI 17 (level, low) -> IRQ 20
FDC 0 is a post-1991 82077
input: Power Button (FF) as /class/input/input3
ACPI: Power Button (FF) [PWRF]
input: Power Button (CM) as /class/input/input4
ACPI: Power Button (CM) [PWRB]
ACPI: PCI Interrupt 0000:02:05.0[A] -> GSI 22 (level, low) -> IRQ 21
skge 1.11 addr 0xf6ef8000 irq 21 chip Yukon rev 1
skge eth0: addr 00:0e:a6:2f:2d:a4
ACPI: PCI Interrupt 0000:02:03.0[A] -> GSI 20 (level, low) -> IRQ 22
firewire_ohci: Added fw-ohci device 0000:02:03.0, OHCI version 1.0
ACPI: PCI Interrupt 0000:03:0d.0[A] -> GSI 22 (level, low) -> IRQ 21
firewire_ohci: Added fw-ohci device 0000:03:0d.0, OHCI version 1.10
Linux video capture interface: v2.00
firewire_core: created new fw device fw0 (0 config rom retries, S400)
firewire_core: created new fw device fw1 (0 config rom retries, S400)
cx2388x v4l2 driver version 0.0.6 loaded
ACPI: PCI Interrupt 0000:03:0c.0[A] -> GSI 21 (level, low) -> IRQ 23
CORE cx88[0]: subsystem: 107d:6620, board: Leadtek Winfast DV2000 [card=8,autodetected]
TV tuner 38 at 0x1fe, Radio tuner -1 at 0x1fe
ACPI: PCI Interrupt 0000:00:1f.5[B] -> GSI 17 (level, low) -> IRQ 20
PCI: Setting latency timer of device 0000:00:1f.5 to 64
cx88[0]/0: found at 0000:03:0c.0, rev: 5, irq: 23, latency: 64, mmio: 0xf7000000
tuner 1-0043: chip found @ 0x86 (cx88[0])
tda9887 1-0043: tda988[5/6/7] found @ 0x43 (tuner)
tuner 1-0060: All bytes are equal. It is not a TEA5767
tuner 1-0060: chip found @ 0xc0 (cx88[0])
tuner 1-0060: type set to 38 (Philips PAL/SECAM multi (FM1216ME MK3))
tuner 1-0060: type set to 38 (Philips PAL/SECAM multi (FM1216ME MK3))
cx88[0]/0: registered device video0 [v4l2]
cx88[0]/0: registered device vbi0
cx88[0]/0: registered device radio0
AC'97 0 analog subsections not ready
intel8x0_measure_ac97_clock: measured 50076 usecs
intel8x0: clocking to 48000
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
ACPI: PCI Interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 17
device-mapper: multipath: version 1.0.5 loaded
EXT3 FS on dm-0, internal journal
kjournald starting.  Commit interval 5 seconds
EXT3 FS on sda3, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
fuse init (API version 7.8)
Adding 2097144k swap on /dev/mapper/lvmbase-swap.  Priority:-1 extents:1 across:2097144k
ip_tables: (C) 2000-2006 Netfilter Core Team
Netfilter messages via NETLINK v0.30.
nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
skge eth0: enabling interface
ADDRCONF(NETDEV_UP): eth0: link is not ready
skge eth0: Link is up at 1000 Mbps, full duplex, flow control both
ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Bluetooth: Core ver 2.11
NET: Registered protocol family 31
Bluetooth: HCI device and connection manager initialized
Bluetooth: HCI socket layer initialized
Bluetooth: L2CAP ver 2.8
Bluetooth: L2CAP socket layer initialized
Bluetooth: RFCOMM socket layer initialized
Bluetooth: RFCOMM TTY layer initialized
Bluetooth: RFCOMM ver 1.8
eth0: no IPv6 routers present
Bridge firewalling registered
virbr0: Dropping NETIF_F_UFO since no NETIF_F_HW_CSUM feature.
virbr0: no IPv6 routers present
dummy: module license 'unspecified' taints kernel.
dummy_open
dummy_read
First time decrement goes okay.
Do 1 up.
Do 2 times down.
We will block now, and if you press CTRL-C from here, we get an OOPS.
WARNING: at kernel/rtmutex.c:585 remove_waiter()
 [<c0449eb7>] remove_waiter+0x105/0x14a
 [<c061da36>] rt_mutex_slowlock+0x1af/0x20d
 [<c061d66b>] rt_mutex_lock_interruptible+0x41/0x49
 [<f9bfe080>] dummy_read+0x0/0x88 [dummy]
 [<c044a2c5>] rt_down_interruptible+0x11/0x30
 [<f9bfe080>] dummy_read+0x0/0x88 [dummy]
 [<f9bfe0f2>] dummy_read+0x72/0x88 [dummy]
 [<c04816b0>] vfs_read+0xa6/0x152
 [<c0481b12>] sys_read+0x41/0x67
 [<c040512a>] sysenter_past_esp+0x6b/0xa1
 [<c0610000>] dump_one_state+0x31b/0x3f8
 =======================
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<c061e0ce>] .... __spin_lock_irq+0x17/0x27
.....[<00000000>] ..   ( <= 0x0)

We are returning from the read().
dummy_release
dummy_open
dummy_read
First time decrement goes okay.
WARNING: at kernel/rtmutex.c:585 remove_waiter()
 [<c0449eb7>] remove_waiter+0x105/0x14a
 [<c061da36>] rt_mutex_slowlock+0x1af/0x20d
 [<c061d66b>] rt_mutex_lock_interruptible+0x41/0x49
 [<f9bfe080>] dummy_read+0x0/0x88 [dummy]
 [<c044a2c5>] rt_down_interruptible+0x11/0x30
 [<f9bfe080>] dummy_read+0x0/0x88 [dummy]
 [<f9bfe0b0>] dummy_read+0x30/0x88 [dummy]
 [<c04816b0>] vfs_read+0xa6/0x152
 [<c0481b12>] sys_read+0x41/0x67
 [<c040512a>] sysenter_past_esp+0x6b/0xa1
 [<c0610000>] dump_one_state+0x31b/0x3f8
 =======================
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<c061e0ce>] .... __spin_lock_irq+0x17/0x27
.....[<00000000>] ..   ( <= 0x0)

Do 1 up.
Do 2 times down.
We will block now, and if you press CTRL-C from here, we get an OOPS.
We are returning from the read().
dummy_release
dummy_open
dummy_read
First time decrement goes okay.
WARNING: at kernel/rtmutex.c:585 remove_waiter()
 [<c0449eb7>] remove_waiter+0x105/0x14a
 [<c061da36>] rt_mutex_slowlock+0x1af/0x20d
 [<c061d66b>] rt_mutex_lock_interruptible+0x41/0x49
 [<f9bfe080>] dummy_read+0x0/0x88 [dummy]
 [<c044a2c5>] rt_down_interruptible+0x11/0x30
 [<f9bfe080>] dummy_read+0x0/0x88 [dummy]
 [<f9bfe0b0>] dummy_read+0x30/0x88 [dummy]
 [<c04816b0>] vfs_read+0xa6/0x152
 [<c0481b12>] sys_read+0x41/0x67
 [<c040512a>] sysenter_past_esp+0x6b/0xa1
 [<c0610000>] dump_one_state+0x31b/0x3f8
 =======================
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<c061e0ce>] .... __spin_lock_irq+0x17/0x27
.....[<00000000>] ..   ( <= 0x0)

Do 1 up.
Do 2 times down.
We will block now, and if you press CTRL-C from here, we get an OOPS.
We are returning from the read().
dummy_release
dummy_open
dummy_read
First time decrement goes okay.
WARNING: at kernel/rtmutex.c:585 remove_waiter()
 [<c0449eb7>] remove_waiter+0x105/0x14a
 [<c061da36>] rt_mutex_slowlock+0x1af/0x20d
 [<c061d66b>] rt_mutex_lock_interruptible+0x41/0x49
 [<f9bfe080>] dummy_read+0x0/0x88 [dummy]
 [<c044a2c5>] rt_down_interruptible+0x11/0x30
 [<f9bfe080>] dummy_read+0x0/0x88 [dummy]
 [<f9bfe0b0>] dummy_read+0x30/0x88 [dummy]
 [<c04816b0>] vfs_read+0xa6/0x152
 [<c0481b12>] sys_read+0x41/0x67
 [<c040512a>] sysenter_past_esp+0x6b/0xa1
 [<c0610000>] dump_one_state+0x31b/0x3f8
 =======================
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<c061e0ce>] .... __spin_lock_irq+0x17/0x27
.....[<00000000>] ..   ( <= 0x0)

Do 1 up.
Do 2 times down.
We will block now, and if you press CTRL-C from here, we get an OOPS.
We are returning from the read().
dummy_release
dummy_open
dummy_read
First time decrement goes okay.
BUG: unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
c04f44ca
*pde = 00000000
Oops: 0000 [#1]
PREEMPT SMP 
Modules linked in: dummy(P) ipt_MASQUERADE iptable_nat nf_nat bridge rfcomm l2cap bluetooth autofs4 sunrpc nf_conntrack_netbios_ns nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink xt_tcpudp ipt_REJECT iptable_filter ip_tables x_tables vfat fat fuse dm_multipath ipv6 tuner snd_intel8x0 snd_ac97_codec ac97_bus snd_seq_dummy cx8800 cx88xx ir_common i2c_algo_bit video_buf snd_seq_oss snd_seq_midi_event tveeprom snd_seq videodev v4l1_compat compat_ioctl32 snd_seq_device snd_pcm_oss v4l2_common snd_mixer_oss firewire_ohci btcx_risc firewire_core snd_pcm skge crc_itu_t button snd_timer floppy i2c_i801 snd i2c_core iTCO_wdt sg iTCO_vendor_support soundcore sr_mod snd_page_alloc cdrom dm_snapshot dm_zero dm_mirror dm_mod ata_piix ata_generic libata sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd
CPU:    0
EIP:    0060:[<c04f44ca>]    Tainted: P        VLI
EFLAGS: 00010006   (2.6.23.1-rt5 #1)
EIP is at plist_add+0x4f/0x85
eax: fffffffc   ebx: fffffffc   ecx: dab12f2c   edx: 00000000
esi: dab12f20   edi: dae045bc   ebp: dae04030   esp: dab12eac
ds: 007b   es: 007b   fs: 00d8  gs: 0033  ss: 0068  preempt:00000003
Process cat (pid: 3632, ti=dab12000 task=d37660b0 task.ti=dab12000)
Stack: 000a80d2 c0734600 c06fe4a0 dae045b4 dae045bc dae045b4 dab12f0c c044a016 
       c0455b79 f9bfe080 00000000 f9bff984 dab12f0c dab12f20 00000000 f9bff984 
       00000002 00000000 c061d9b7 00000282 00000000 00000000 ffffffff 00000282 
Call Trace:
 [<c044a016>] task_blocks_on_rt_mutex+0x11a/0x17f
 [<c0455b79>] add_preempt_count+0x12/0xcc
 [<f9bfe080>] dummy_read+0x0/0x88 [dummy]
 [<c061d9b7>] rt_mutex_slowlock+0x130/0x20d
 [<c061d66b>] rt_mutex_lock_interruptible+0x41/0x49
 [<f9bfe080>] dummy_read+0x0/0x88 [dummy]
 [<c044a2c5>] rt_down_interruptible+0x11/0x30
 [<f9bfe080>] dummy_read+0x0/0x88 [dummy]
 [<f9bfe0b0>] dummy_read+0x30/0x88 [dummy]
 [<c04816b0>] vfs_read+0xa6/0x152
 [<c0481b12>] sys_read+0x41/0x67
 [<c040512a>] sysenter_past_esp+0x6b/0xa1
 [<c0610000>] dump_one_state+0x31b/0x3f8
 =======================
---------------------------
| preempt count: 00000003 ]
| 3-level deep critical section nesting:
----------------------------------------
.. [<c061e045>] .... __spin_lock_irqsave+0x24/0x5c
.....[<00000000>] ..   ( <= 0x0)
.. [<c061df53>] .... __spin_lock+0xd/0x1d
.....[<00000000>] ..   ( <= 0x0)
.. [<c061e045>] .... __spin_lock_irqsave+0x24/0x5c
.....[<00000000>] ..   ( <= 0x0)

Code: 24 04 a5 0d 6c c0 c7 04 24 7d 80 6a c0 e8 4e a0 f3 ff e8 ff 26 f1 ff 8b 1f 83 eb 04 eb 0d 8b 03 39 06 7c 15 8d 42 fc 74 32 89 c3 <8b> 53 04 0f 18 02 90 8d 43 04 39 f8 75 e5 8d 4b 04 8b 51 04 8d 
EIP: [<c04f44ca>] plist_add+0x4f/0x85 SS:ESP 0068:dab12eac
note: cat[3632] exited with preempt_count 2
BUG: sleeping function called from invalid context cat(3632) at kernel/rtmutex.c:1175
in_atomic():1 [00000002], irqs_disabled():0
 [<c061d713>] rt_mutex_lock+0x15/0x3f
 [<c044a201>] __rt_down_read+0x32/0x3a
 [<c0452d41>] acct_collect+0x38/0x13f
 [<c0430a2f>] do_exit+0x1c6/0x748
 [<c0406749>] die+0x226/0x25c
 [<c061ffcd>] do_page_fault+0x53b/0x609
 [<c04621cf>] find_lock_page+0x25/0x52
 [<c061fa92>] do_page_fault+0x0/0x609
 [<c061e5d2>] error_code+0x72/0x78
 [<c04f44ca>] plist_add+0x4f/0x85
 [<c044a016>] task_blocks_on_rt_mutex+0x11a/0x17f
 [<c0455b79>] add_preempt_count+0x12/0xcc
 [<f9bfe080>] dummy_read+0x0/0x88 [dummy]
 [<c061d9b7>] rt_mutex_slowlock+0x130/0x20d
 [<c061d66b>] rt_mutex_lock_interruptible+0x41/0x49
 [<f9bfe080>] dummy_read+0x0/0x88 [dummy]
 [<c044a2c5>] rt_down_interruptible+0x11/0x30
 [<f9bfe080>] dummy_read+0x0/0x88 [dummy]
 [<f9bfe0b0>] dummy_read+0x30/0x88 [dummy]
 [<c04816b0>] vfs_read+0xa6/0x152
 [<c0481b12>] sys_read+0x41/0x67
 [<c040512a>] sysenter_past_esp+0x6b/0xa1
 [<c0610000>] dump_one_state+0x31b/0x3f8
 =======================
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<c061df53>] .... __spin_lock+0xd/0x1d
.....[<00000000>] ..   ( <= 0x0)
.. [<c061df53>] .... __spin_lock+0xd/0x1d
.....[<00000000>] ..   ( <= 0x0)

BUG: scheduling while atomic: cat/0x10000003/3632, CPU#0
 [<c061c1a4>] __sched_text_start+0xac/0x913
 [<c0455b79>] add_preempt_count+0x12/0xcc
 [<c047ce2f>] kmem_cache_free+0x8a/0x8f
 [<c0455b79>] add_preempt_count+0x12/0xcc
 [<c0429093>] __cond_resched+0x2b/0x45
 [<c061cec7>] cond_resched+0x26/0x31
 [<c042f90e>] put_files_struct+0x64/0xa7
 [<c0430aad>] do_exit+0x244/0x748
 [<c0406749>] die+0x226/0x25c
 [<c061ffcd>] do_page_fault+0x53b/0x609
 [<c04621cf>] find_lock_page+0x25/0x52
 [<c061fa92>] do_page_fault+0x0/0x609
 [<c061e5d2>] error_code+0x72/0x78
 [<c04f44ca>] plist_add+0x4f/0x85
 [<c044a016>] task_blocks_on_rt_mutex+0x11a/0x17f
 [<c0455b79>] add_preempt_count+0x12/0xcc
 [<f9bfe080>] dummy_read+0x0/0x88 [dummy]
 [<c061d9b7>] rt_mutex_slowlock+0x130/0x20d
 [<c061d66b>] rt_mutex_lock_interruptible+0x41/0x49
 [<f9bfe080>] dummy_read+0x0/0x88 [dummy]
 [<c044a2c5>] rt_down_interruptible+0x11/0x30
 [<f9bfe080>] dummy_read+0x0/0x88 [dummy]
 [<f9bfe0b0>] dummy_read+0x30/0x88 [dummy]
 [<c04816b0>] vfs_read+0xa6/0x152
 [<c0481b12>] sys_read+0x41/0x67
 [<c040512a>] sysenter_past_esp+0x6b/0xa1
 [<c0610000>] dump_one_state+0x31b/0x3f8
 =======================
---------------------------
| preempt count: 10000003 ]
| 3-level deep critical section nesting:
----------------------------------------
.. [<c061df53>] .... __spin_lock+0xd/0x1d
.....[<00000000>] ..   ( <= 0x0)
.. [<c061df53>] .... __spin_lock+0xd/0x1d
.....[<00000000>] ..   ( <= 0x0)
.. [<c061c10b>] .... __sched_text_start+0x13/0x913
.....[<00000000>] ..   ( <= 0x0)

dummy_release

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals
  2007-11-16 23:02   ` Remy Bohmer
@ 2007-11-16 23:37     ` Steven Rostedt
  2007-11-17 11:44       ` Remy Bohmer
  0 siblings, 1 reply; 28+ messages in thread
From: Steven Rostedt @ 2007-11-16 23:37 UTC (permalink / raw)
  To: Remy Bohmer; +Cc: Ingo Molnar, Thomas Gleixner, RT, linux-kernel




On Sat, 17 Nov 2007, Remy Bohmer wrote:

> Hello Steven,
>
> Thanks for your reply
>
> > The above sounds more like you need a completion.
> Funny, I first started with using completion structures, but that did
> not work either. I get similar OOPses on all these kind of locking
> mechanisms, as long as I use the _interruptible() type. I tried every
> work-around I can think of, but none worked :-((
> Even if I block on an ordinary rt-mutex in the same routine, wait a
> _interruptible() type, I get the same problem.

The taker of a mutex must also be the one that releases it.  I don't see
how you could use a mutex for this. It really requires some kind of
completion, or a compat_semaphore.

>
> > What's used to wake up the caller of down_interruptible?
> A call to up() is used from inside an interrupt(thread) context, but
> this is not relevant for the problem, because only blocking on a
> semaphore with down_interruptible() and waking the thread by CTRL-C is
> enough to get this Oops.
>
> I saw that the code is trying to wake 'other waiters', while there is
> only 1 thread waiting on the semaphore at most. I feel that the root
> cause of this problem has to be searched there.
>
> I believe that executing any PI code on semaphores is a strange
> behavior anyway, because a semaphore is never 'owned' by a thread, and
> it is always another thread that wakes the thread that blocks on a
> semaphore, and because the waker is unknown, the PI code will always
> boost the prio of the wrong thread.

Exactly why it should be a completion or compat semaphores. The reason we
did PI on semaphores is only because they were used as mutexes before Ingo
pushed to actually get a mutex primative into the kernel. Since then,
we've been trying to remove all semaphores with either a mutex or
completion.

>
> Strange is also, that I get different behavior on ARM if I use
> sema_init(&sema, 1) versus sema_init(&sema,0). The latter seems to
> crash less, it will not crash until the first up(); while the first
> will crash even without any up().
>
> Attached I have put a sample driver I just hacked together a few
> minutes ago. It is NOT the driver that has generates the oops in the
> previous mail, but I have stripped a scull-driver down that much that
> it will be much easier to talk about, and to keep us focussed on the
> part of the code that is causing this.
> Besides: I tested this driver on X86 with 2.6.23.1-rt5 and I get the
> also OOPSes although slightly different than on ARM. See the attached
> dummy.txt file.
>
>
> Beware: The up(&sema) is NOT required to get this OOPS, I get it even
> without any up(&sema) !
>
> I hope you can look at the attached driver source and help me with this...
>

        down_interruptible(&dummy);
        printk("We will block now, and if you press CTRL-C from here, we get an OOPS.\n");
        down_interruptible(&dummy);


This double down is actually illegal with rt semaphores. Because we treat
semaphores as mutexes unless they are declared as compat_semaphores. In
which case we don't do PI.

Seems that you need to work out how to use a completion for your code. And
if that doesn't work, then use a compat_semaphore. But beware, that the
compat_semaphore can cause unbounded latencies. But then again, so can
completions.


-- Steve

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals
  2007-11-16 23:37     ` Steven Rostedt
@ 2007-11-17 11:44       ` Remy Bohmer
  2007-11-17 14:08         ` Steven Rostedt
  2007-11-17 16:22         ` Daniel Walker
  0 siblings, 2 replies; 28+ messages in thread
From: Remy Bohmer @ 2007-11-17 11:44 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Ingo Molnar, Thomas Gleixner, RT, linux-kernel

Hello Steven,

> The taker of a mutex must also be the one that releases it.  I don't see
> how you could use a mutex for this. It really requires some kind of
> completion, or a compat_semaphore.

I tried several ways of working around the bug, even tried
implementing it with kernel threads and protecting global data with
mutexes. Therefor I know that I have the same problem with mutexes. I
just created a simple example that showed the problem quickly, this
does not mean that this is the only case that does not work.

BTW: I am hacking around in the PREEMPT-RT kernel for years now, I
know the history very well, and I know what I am doing... Please, do
not find holes in the example I quickly hacked together, please focus
on the OOPS message, and help me figuring out what is causing this. I
can give you other examples of code that shows the same problem. But,
basically, every call that blocks with _inturruptible() on a rt-mutex
beneath the surface in the context of a user space process (and
receives a signal during the block) shows the same problem.

> Exactly why it should be a completion or compat semaphores. The reason we
> did PI on semaphores is only because they were used as mutexes before Ingo
> pushed to actually get a mutex primative into the kernel. Since then,
> we've been trying to remove all semaphores with either a mutex or
> completion.

Okay, sounds fair. But: the current implementation does counting the
number of up's and down's, suggesting that it really behaves like a
semaphore. It only does some special things during the transition of
the counter from 0->1 and from 1->0. If this counting is illegal use
of the mutex mechanism, it should report (compile) errors if:
* sema_init is used on 'struct semaphore' -> init_MUTEX() must be used instead.
* sema_init should only be used on 'struct compat_semaphore' types.
* calls to up() and down() in a row should report a BUG message,
* if up() is called from a different thread than the down() it should
report a BUG message. Further, the counting up's and down's are not
allowed on struct semphore types, so it should be removed from the
code.
* PI should only take place if it is for 100% sure that the 'struct
semaphore' is used as a mutex. And this is only the case when it is
initialised with init_MUTEX().

So, because all these items are not there, I doubt it is really true
that it is illegal to use 'struct semaphore' types as counting
semaphores across multiple threads. BESIDES: Everything works fine
UNTIL a signal is generated during a block on the semaphore. I think
Ingo tried to make the 'struct semaphore' type to behave like the
non-RT kernel 'struct semaphore', which actually does NOT show this
problem wtih my example driver!!!

So, this is a regression if exactly the same driver is used in both
non-preempt-rt patched kernel and preempt-rt patched kernels.

>         down_interruptible(&dummy);
>         printk("We will block now, and if you press CTRL-C from here, we get an OOPS.\n");
>         down_interruptible(&dummy);
>
> This double down is actually illegal with rt semaphores. Because we treat
> semaphores as mutexes unless they are declared as compat_semaphores. In
> which case we don't do PI.

According to code there is a counting mechanism there, which suggest
that this is allowed to do. It works fine, until a signal arrives.the
SIGNAL is the only problem here!

> Seems that you need to work out how to use a completion for your code. And
> if that doesn't work, then use a compat_semaphore. But beware, that the
> compat_semaphore can cause unbounded latencies. But then again, so can
> completions.

I hope you get my point now. Other mechanisms like ordinary rt-mutexes
show the same problem, so either case: Please help me figuring out
which bug the signal is triggering here.


Kind Regards,

Remy Bohmer


2007/11/17, Steven Rostedt <rostedt@goodmis.org>:
>
>
>
> On Sat, 17 Nov 2007, Remy Bohmer wrote:
>
> > Hello Steven,
> >
> > Thanks for your reply
> >
> > > The above sounds more like you need a completion.
> > Funny, I first started with using completion structures, but that did
> > not work either. I get similar OOPses on all these kind of locking
> > mechanisms, as long as I use the _interruptible() type. I tried every
> > work-around I can think of, but none worked :-((
> > Even if I block on an ordinary rt-mutex in the same routine, wait a
> > _interruptible() type, I get the same problem.
>
> The taker of a mutex must also be the one that releases it.  I don't see
> how you could use a mutex for this. It really requires some kind of
> completion, or a compat_semaphore.
>
> >
> > > What's used to wake up the caller of down_interruptible?
> > A call to up() is used from inside an interrupt(thread) context, but
> > this is not relevant for the problem, because only blocking on a
> > semaphore with down_interruptible() and waking the thread by CTRL-C is
> > enough to get this Oops.
> >
> > I saw that the code is trying to wake 'other waiters', while there is
> > only 1 thread waiting on the semaphore at most. I feel that the root
> > cause of this problem has to be searched there.
> >
> > I believe that executing any PI code on semaphores is a strange
> > behavior anyway, because a semaphore is never 'owned' by a thread, and
> > it is always another thread that wakes the thread that blocks on a
> > semaphore, and because the waker is unknown, the PI code will always
> > boost the prio of the wrong thread.
>
> Exactly why it should be a completion or compat semaphores. The reason we
> did PI on semaphores is only because they were used as mutexes before Ingo
> pushed to actually get a mutex primative into the kernel. Since then,
> we've been trying to remove all semaphores with either a mutex or
> completion.
>
> >
> > Strange is also, that I get different behavior on ARM if I use
> > sema_init(&sema, 1) versus sema_init(&sema,0). The latter seems to
> > crash less, it will not crash until the first up(); while the first
> > will crash even without any up().
> >
> > Attached I have put a sample driver I just hacked together a few
> > minutes ago. It is NOT the driver that has generates the oops in the
> > previous mail, but I have stripped a scull-driver down that much that
> > it will be much easier to talk about, and to keep us focussed on the
> > part of the code that is causing this.
> > Besides: I tested this driver on X86 with 2.6.23.1-rt5 and I get the
> > also OOPSes although slightly different than on ARM. See the attached
> > dummy.txt file.
> >
> >
> > Beware: The up(&sema) is NOT required to get this OOPS, I get it even
> > without any up(&sema) !
> >
> > I hope you can look at the attached driver source and help me with this...
> >
>
>         down_interruptible(&dummy);
>         printk("We will block now, and if you press CTRL-C from here, we get an OOPS.\n");
>         down_interruptible(&dummy);
>
>
> This double down is actually illegal with rt semaphores. Because we treat
> semaphores as mutexes unless they are declared as compat_semaphores. In
> which case we don't do PI.
>
> Seems that you need to work out how to use a completion for your code. And
> if that doesn't work, then use a compat_semaphore. But beware, that the
> compat_semaphore can cause unbounded latencies. But then again, so can
> completions.
>
>
> -- Steve
>
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals
  2007-11-17 11:44       ` Remy Bohmer
@ 2007-11-17 14:08         ` Steven Rostedt
  2007-11-17 15:06           ` Steven Rostedt
  2007-11-17 15:36           ` Remy Bohmer
  2007-11-17 16:22         ` Daniel Walker
  1 sibling, 2 replies; 28+ messages in thread
From: Steven Rostedt @ 2007-11-17 14:08 UTC (permalink / raw)
  To: Remy Bohmer; +Cc: Ingo Molnar, Thomas Gleixner, RT, linux-kernel



On Sat, 17 Nov 2007, Remy Bohmer wrote:

> Hello Steven,
>
> > The taker of a mutex must also be the one that releases it.  I don't see
> > how you could use a mutex for this. It really requires some kind of
> > completion, or a compat_semaphore.
>
> I tried several ways of working around the bug, even tried
> implementing it with kernel threads and protecting global data with
> mutexes. Therefor I know that I have the same problem with mutexes. I
> just created a simple example that showed the problem quickly, this
> does not mean that this is the only case that does not work.
>
> BTW: I am hacking around in the PREEMPT-RT kernel for years now, I
> know the history very well, and I know what I am doing... Please, do
> not find holes in the example I quickly hacked together, please focus
> on the OOPS message, and help me figuring out what is causing this. I
> can give you other examples of code that shows the same problem. But,
> basically, every call that blocks with _inturruptible() on a rt-mutex
> beneath the surface in the context of a user space process (and
> receives a signal during the block) shows the same problem.

Please don't think that I just took your example to find a hole in it.

Actually, the reason I pointed that out was not because of "Oh, this
example is broken", but because of the description of the problem you
are trying to solve. To me, it needs to be done with a completion or
compat_semaphore otherwise it will become very complex in solving.

>
> > Exactly why it should be a completion or compat semaphores. The reason we
> > did PI on semaphores is only because they were used as mutexes before Ingo
> > pushed to actually get a mutex primative into the kernel. Since then,
> > we've been trying to remove all semaphores with either a mutex or
> > completion.
>
> Okay, sounds fair. But: the current implementation does counting the
> number of up's and down's, suggesting that it really behaves like a
> semaphore. It only does some special things during the transition of
> the counter from 0->1 and from 1->0. If this counting is illegal use
> of the mutex mechanism, it should report (compile) errors if:
> * sema_init is used on 'struct semaphore' -> init_MUTEX() must be used instead.
> * sema_init should only be used on 'struct compat_semaphore' types.
> * calls to up() and down() in a row should report a BUG message,
> * if up() is called from a different thread than the down() it should
> report a BUG message. Further, the counting up's and down's are not
> allowed on struct semphore types, so it should be removed from the
> code.

It should print out warnings, do you have CONFIG_DEBUG_RT_MUTEXES set?

> * PI should only take place if it is for 100% sure that the 'struct
> semaphore' is used as a mutex. And this is only the case when it is
> initialised with init_MUTEX().

Well, we can't determine that with code ;-)  Remember, there are still
drivers out in the world that use semaphores as mutexes. So the PI
on semaphores is really more of a compatibility issue.

>
> So, because all these items are not there, I doubt it is really true
> that it is illegal to use 'struct semaphore' types as counting
> semaphores across multiple threads. BESIDES: Everything works fine
> UNTIL a signal is generated during a block on the semaphore. I think
> Ingo tried to make the 'struct semaphore' type to behave like the
> non-RT kernel 'struct semaphore', which actually does NOT show this
> problem wtih my example driver!!!

Right, because a non-RT semaphore _is_ a compat_semaphore. I'm saying if
you see the bug in your driver with the compat_semaphore then lets debug
that. Because that _is_ a bug!


>
> So, this is a regression if exactly the same driver is used in both
> non-preempt-rt patched kernel and preempt-rt patched kernels.

Not really.  There are things that the preempt-rt kernels require. One, is
that things that need to keep semaphores instead of using them as mutexes,
they should be converted to compat_semaphores.  Perhaps now that we have
mutexes, we can remove the PI on semaphores, and out-of-tree drivers will
need to make sure they don't use semaphores as mutexes anymore.

>
> >         down_interruptible(&dummy);
> >         printk("We will block now, and if you press CTRL-C from here, we get an OOPS.\n");
> >         down_interruptible(&dummy);
> >
> > This double down is actually illegal with rt semaphores. Because we treat
> > semaphores as mutexes unless they are declared as compat_semaphores. In
> > which case we don't do PI.
>
> According to code there is a counting mechanism there, which suggest
> that this is allowed to do. It works fine, until a signal arrives.the
> SIGNAL is the only problem here!

Yeah, that code is more of a hack to convert counting semaphores into
mutexes. But semaphores still need to have owners, and they should not
block on themselves. That may be where the bug is.

>
> > Seems that you need to work out how to use a completion for your code. And
> > if that doesn't work, then use a compat_semaphore. But beware, that the
> > compat_semaphore can cause unbounded latencies. But then again, so can
> > completions.
>
> I hope you get my point now. Other mechanisms like ordinary rt-mutexes
> show the same problem, so either case: Please help me figuring out
> which bug the signal is triggering here.

OK, I wont be able to work on this this weekend, but I'll try to get to it
on Monday.  A better example to show the bug you are looking for is simply
create a mutex and create a thread that grabs that mutex and goes to
sleep. Have your driver read grab that mutex with
mutex_lock_interruptible. And if the signal code is broken with this, then
you definitely got a point that the inerruptible code is broken.

This will keep the semantics clean and not obfuscate it with the semaphore
code.

I'll write up that example on Monday if you don't have the time.

Note, that the unloading of the module should wake up the thread that
grabbed the mutex so it can release it.

-- Steve

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals
  2007-11-17 14:08         ` Steven Rostedt
@ 2007-11-17 15:06           ` Steven Rostedt
  2007-11-17 15:36           ` Remy Bohmer
  1 sibling, 0 replies; 28+ messages in thread
From: Steven Rostedt @ 2007-11-17 15:06 UTC (permalink / raw)
  To: Remy Bohmer; +Cc: Ingo Molnar, Thomas Gleixner, RT, linux-kernel


On Sat, 17 Nov 2007, Steven Rostedt wrote:
> On Sat, 17 Nov 2007, Remy Bohmer wrote:
> > * PI should only take place if it is for 100% sure that the 'struct
> > semaphore' is used as a mutex. And this is only the case when it is
> > initialised with init_MUTEX().
>
> Well, we can't determine that with code ;-)  Remember, there are still
> drivers out in the world that use semaphores as mutexes. So the PI
> on semaphores is really more of a compatibility issue.
>

Oh, I missed your point here. I was up late last night so I'm blaming that
for my _not_so_thorough_ reading ;-)

You are saying that we should change the semaphore to a mutex only if it
was initialized by init_MUTEX(). I see your point.

Actually, if something is initialized by init_MUTEX it really needs to be
a mutex :-)

Right now the compiler determines that something is a mutex or a
semaphore. If we make it a run time option, that will add more complexity
to the code and probably make it less efficient. The more complex part
is something we can deal with. The less efficient part we can not.

find .  -name "*.c" ! -type d  | xargs grep  "init_MUTEX" | wc -l
100

Heh, nice number.

Well, I guess I could start sending patches to mainline to convert
semaphores to mutexes. I'm sure that will probably annoy akpm, but I'll do
it one at a time, with lots of thought behind each change.

BTW, drivers breaking with the RT patch is not always considered a
regression.

For example, this is fine in vanilla kernel:

  local_irq_save(flags);
  [...]
  spin_lock(&lock);

Where as in the RT patch, that would break. And it would require either
changing the local_irq_save to a local_irq_save_nort, or find a way to do
the spin_lock_irqsave(&lock, flags) instead.

-- Steve

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals
  2007-11-17 14:08         ` Steven Rostedt
  2007-11-17 15:06           ` Steven Rostedt
@ 2007-11-17 15:36           ` Remy Bohmer
  2007-11-19 12:55             ` Remy Bohmer
  1 sibling, 1 reply; 28+ messages in thread
From: Remy Bohmer @ 2007-11-17 15:36 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Ingo Molnar, Thomas Gleixner, RT, linux-kernel

Hello Steven,

> It should print out warnings, do you have CONFIG_DEBUG_RT_MUTEXES set?

Nope, not yet... I will do that on Monday also. (On ARM I have as less
as debug options enabled per default, because it eats too much
CPU-power)

> > * PI should only take place if it is for 100% sure that the 'struct
> > semaphore' is used as a mutex. And this is only the case when it is
> > initialised with init_MUTEX().
>
> Well, we can't determine that with code ;-)  Remember, there are still
> drivers out in the world that use semaphores as mutexes. So the PI
> on semaphores is really more of a compatibility issue.

Very strange, although there is now a real mutex in non-RT as in RT,
all semaphores are still converted to some
badly-implemented-recursive-but-not-recursive-callable-mutexes,
because the assumption is made that semaphores are intentionally
always used in the wrong way?!!
So, the code that is written nicely must be adapted to prevent
adaption of bad code?

I would expect a more logical solution like the introduction of raw_
types for the exceptions, just like what is done with spinlocks and so
on.

> > So, this is a regression if exactly the same driver is used in both
> > non-preempt-rt patched kernel and preempt-rt patched kernels.
>
> Not really.  There are things that the preempt-rt kernels require. One, is
> that things that need to keep semaphores instead of using them as mutexes,
> they should be converted to compat_semaphores.  Perhaps now that we have
> mutexes, we can remove the PI on semaphores, and out-of-tree drivers will
> need to make sure they don't use semaphores as mutexes anymore.

Up to this semaphore implementation the standard was always that a
driver written for a non-RT kernel should compile and work properly
without any adaption on a RT-kernel, until there are some specific
realtime requirements.
When it comes to semaphores this is thus completely not true.

And this is where the whole thing confused me completely.

> Yeah, that code is more of a hack to convert counting semaphores into
> mutexes.

Enough said...

> But semaphores still need to have owners, and they should not
> block on themselves. That may be where the bug is.

Or there should be a real recursive mutex implementation. But the need
for recursive mutexes is usually a sign for bad locking, and should
therefor be avoided.

> OK, I wont be able to work on this this weekend, but I'll try to get to it
> on Monday.  A better example to show the bug you are looking for is simply
> create a mutex and create a thread that grabs that mutex and goes to
> sleep. Have your driver read grab that mutex with
> mutex_lock_interruptible. And if the signal code is broken with this, then
> you definitely got a point that the inerruptible code is broken.

I will work on an example like this on Monday.

> This will keep the semantics clean and not obfuscate it with the semaphore
> code.

I agree.

> I'll write up that example on Monday if you don't have the time.

I will make time ;-)

> Note, that the unloading of the module should wake up the thread that
> grabbed the mutex so it can release it.

Unloading of a module is not possible as long as there is still a
handle open to it. So, it should be safe without waking up the thread.
And returning from the read call and still have a mutex locked is not
nice either.

Thanks for the explanation. I will keep you informed on Monday.


Have a nice weekend !


Remy

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals
  2007-11-17 11:44       ` Remy Bohmer
  2007-11-17 14:08         ` Steven Rostedt
@ 2007-11-17 16:22         ` Daniel Walker
  2007-11-17 17:09           ` Remy Bohmer
  1 sibling, 1 reply; 28+ messages in thread
From: Daniel Walker @ 2007-11-17 16:22 UTC (permalink / raw)
  To: Remy Bohmer
  Cc: Steven Rostedt, Ingo Molnar, Thomas Gleixner, RT, linux-kernel

On Sat, 2007-11-17 at 12:44 +0100, Remy Bohmer wrote:
> Hello Steven,
> 
> > The taker of a mutex must also be the one that releases it.  I don't see
> > how you could use a mutex for this. It really requires some kind of
> > completion, or a compat_semaphore.
> 
> I tried several ways of working around the bug, even tried
> implementing it with kernel threads and protecting global data with
> mutexes. Therefor I know that I have the same problem with mutexes. I
> just created a simple example that showed the problem quickly, this
> does not mean that this is the only case that does not work.

I tried your example and I was able to reproduce the OOPS that you
found.. Although there is one problem, you don't have the same number of
up()'s to down() calls so you end up leaving the dummy_read function
with the lock still held ..

Reviewing the OOPS and the warnings it looks like your progressively
corrupting the mutex waiter list since remove_waiter() actually leaves
the stack based waiter object on the waiter list.. (That's what it looks
like anyway)..

So I converted your code to use a compat_semaphore, and no oops
happens.. Which makes sense because compat_semaphores are designed to
work the way your using them.

Daniel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals
  2007-11-17 16:22         ` Daniel Walker
@ 2007-11-17 17:09           ` Remy Bohmer
  2007-11-17 17:29             ` Daniel Walker
  0 siblings, 1 reply; 28+ messages in thread
From: Remy Bohmer @ 2007-11-17 17:09 UTC (permalink / raw)
  To: Daniel Walker
  Cc: Steven Rostedt, Ingo Molnar, Thomas Gleixner, RT, linux-kernel

Hello Daniel,

Thanks for looking into it also.
Steven already made clear to me that the 'struct semaphore' type on
the RT-kernel should not be used as a counting-semaphore, but as some
sort of legacy-mutex... (The confusion that this will cause is clear
by now...)

I still do not understand the problems I had with the
interruptible-waits on a real rt-mutex, but I have to figure that out
again on Monday. Maybe one confusion let to another...

(Note, A completion will not work for me, because they are not
designed for reuse across several threads. The read/write runs in user
context and as such it can be called by different threads, which would
require a init of a completion before waiting on it, but that would be
racy, I could miss the awake by the init)

> So I converted your code to use a compat_semaphore, and no oops
> happens.. Which makes sense because compat_semaphores are designed to
> work the way your using them.

Actually, IMO, compat_semaphores behave like semaphores should behave,
and thus the same as they behave on a non-RT kernel, and at the
locations where the semaphores are now misused as mutexes on RT, we
should replace them by differently-named-mutex-type-semaphores, or
better: real-RT-mutexes..
IMO this wrong usage of semaphores is solved by modifying the code
that actually made proper use of the semaphores, and I think that if
the naming matches the mainline kernel, we only need to patch the
files that really need to be patched during the integration in
mainline of the RT-patch.

Kind Regards,

Remy Bohmer

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals
  2007-11-17 17:09           ` Remy Bohmer
@ 2007-11-17 17:29             ` Daniel Walker
  2007-11-17 17:46               ` Ingo Molnar
  0 siblings, 1 reply; 28+ messages in thread
From: Daniel Walker @ 2007-11-17 17:29 UTC (permalink / raw)
  To: Remy Bohmer
  Cc: Steven Rostedt, Ingo Molnar, Thomas Gleixner, RT, linux-kernel

On Sat, 2007-11-17 at 18:09 +0100, Remy Bohmer wrote:

> Actually, IMO, compat_semaphores behave like semaphores should behave,
> and thus the same as they behave on a non-RT kernel, and at the
> locations where the semaphores are now misused as mutexes on RT, we
> should replace them by differently-named-mutex-type-semaphores, or
> better: real-RT-mutexes..

The vast majority of semaphore are actually binary semaphores in the
Linux kernel .. So it's easier to mass convert semaphores to mutexes,
then address the ones that don't conform.. Usually they are converted to
the complete API in mainline..

> IMO this wrong usage of semaphores is solved by modifying the code
> that actually made proper use of the semaphores, and I think that if
> the naming matches the mainline kernel, we only need to patch the
> files that really need to be patched during the integration in
> mainline of the RT-patch.

As I say above, it's happen already.. Code is slowly getting converted
to the complete API or the code gets converted to use a binary semaphore
(or a mutex)..

Daniel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals
  2007-11-17 17:29             ` Daniel Walker
@ 2007-11-17 17:46               ` Ingo Molnar
  2007-11-17 17:55                 ` Daniel Walker
  2007-11-20 16:43                 ` Daniel Walker
  0 siblings, 2 replies; 28+ messages in thread
From: Ingo Molnar @ 2007-11-17 17:46 UTC (permalink / raw)
  To: Daniel Walker
  Cc: Remy Bohmer, Steven Rostedt, Thomas Gleixner, RT, linux-kernel


* Daniel Walker <dwalker@mvista.com> wrote:

> > Actually, IMO, compat_semaphores behave like semaphores should 
> > behave, and thus the same as they behave on a non-RT kernel, and at 
> > the locations where the semaphores are now misused as mutexes on RT, 
> > we should replace them by differently-named-mutex-type-semaphores, 
> > or better: real-RT-mutexes..
> 
> The vast majority of semaphore are actually binary semaphores in the 
> Linux kernel .. So it's easier to mass convert semaphores to mutexes, 
> then address the ones that don't conform.. Usually they are converted 
> to the complete API in mainline..

right now there are 3992 mutex_lock() critical sections in the kernel 
and only 351 down() based critical sections are left.

fixing the top 20:

      4 &vuart_bus_priv.probe_mutex
      5 &connections_lock
      5 &irq_ptr->setting_up_sema
      5 &kbd->sem
      5 &pnp_res_mutex
      5 &port->port_lock
      5 &tq_init_sem
      6 &adb_handler_sem
      6 &dev->parent->sem
      6 &driver_lock
      6 &ha->vport_sem
      7 &big_buffer_sem
      8 &dir_f->sem
      9 &c->alloc_sem
     11 &dev->sem
     11 &usbvision->lock
     12 &c->erase_free_sem
     15 &u132->scheduler_lock
     16 &zfcp_data.config_sema
     17 &f->sem

would remove 164 of them, so it would convert half of the remaining 
semaphore use in the kernel. So the job is almost finished - would 
anyone like to go for the final grand feat: complete removal of 
semaphores from the kernel? :-)

	Ingo

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals
  2007-11-17 17:46               ` Ingo Molnar
@ 2007-11-17 17:55                 ` Daniel Walker
  2007-11-17 18:04                   ` Ingo Molnar
  2007-11-19  7:25                     ` Jon Masters
  2007-11-20 16:43                 ` Daniel Walker
  1 sibling, 2 replies; 28+ messages in thread
From: Daniel Walker @ 2007-11-17 17:55 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Remy Bohmer, Steven Rostedt, Thomas Gleixner, RT, linux-kernel

On Sat, 2007-11-17 at 18:46 +0100, Ingo Molnar wrote:
> * Daniel Walker <dwalker@mvista.com> wrote:
> 
> > > Actually, IMO, compat_semaphores behave like semaphores should 
> > > behave, and thus the same as they behave on a non-RT kernel, and at 
> > > the locations where the semaphores are now misused as mutexes on RT, 
> > > we should replace them by differently-named-mutex-type-semaphores, 
> > > or better: real-RT-mutexes..
> > 
> > The vast majority of semaphore are actually binary semaphores in the 
> > Linux kernel .. So it's easier to mass convert semaphores to mutexes, 
> > then address the ones that don't conform.. Usually they are converted 
> > to the complete API in mainline..
> 
> right now there are 3992 mutex_lock() critical sections in the kernel 
> and only 351 down() based critical sections are left.
> 
> fixing the top 20:
> 
>       4 &vuart_bus_priv.probe_mutex
>       5 &connections_lock
>       5 &irq_ptr->setting_up_sema
>       5 &kbd->sem
>       5 &pnp_res_mutex
>       5 &port->port_lock
>       5 &tq_init_sem
>       6 &adb_handler_sem
>       6 &dev->parent->sem
>       6 &driver_lock
>       6 &ha->vport_sem
>       7 &big_buffer_sem
>       8 &dir_f->sem
>       9 &c->alloc_sem
>      11 &dev->sem
>      11 &usbvision->lock
>      12 &c->erase_free_sem
>      15 &u132->scheduler_lock
>      16 &zfcp_data.config_sema
>      17 &f->sem
> 
> would remove 164 of them, so it would convert half of the remaining 
> semaphore use in the kernel. So the job is almost finished - would 
> anyone like to go for the final grand feat: complete removal of 
> semaphores from the kernel? :-)

Sure, you want to split the list?

Daniel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals
  2007-11-17 17:55                 ` Daniel Walker
@ 2007-11-17 18:04                   ` Ingo Molnar
  2007-11-17 18:12                     ` Daniel Walker
  2007-11-17 22:49                     ` Remy Bohmer
  2007-11-19  7:25                     ` Jon Masters
  1 sibling, 2 replies; 28+ messages in thread
From: Ingo Molnar @ 2007-11-17 18:04 UTC (permalink / raw)
  To: Daniel Walker
  Cc: Remy Bohmer, Steven Rostedt, Thomas Gleixner, RT, linux-kernel


* Daniel Walker <dwalker@mvista.com> wrote:

> On Sat, 2007-11-17 at 18:46 +0100, Ingo Molnar wrote:
> > * Daniel Walker <dwalker@mvista.com> wrote:
> > 
> > > > Actually, IMO, compat_semaphores behave like semaphores should 
> > > > behave, and thus the same as they behave on a non-RT kernel, and at 
> > > > the locations where the semaphores are now misused as mutexes on RT, 
> > > > we should replace them by differently-named-mutex-type-semaphores, 
> > > > or better: real-RT-mutexes..
> > > 
> > > The vast majority of semaphore are actually binary semaphores in the 
> > > Linux kernel .. So it's easier to mass convert semaphores to mutexes, 
> > > then address the ones that don't conform.. Usually they are converted 
> > > to the complete API in mainline..
> > 
> > right now there are 3992 mutex_lock() critical sections in the kernel 
> > and only 351 down() based critical sections are left.
> > 
> > fixing the top 20:
> > 
> >       4 &vuart_bus_priv.probe_mutex
> >       5 &connections_lock
> >       5 &irq_ptr->setting_up_sema
> >       5 &kbd->sem
> >       5 &pnp_res_mutex
> >       5 &port->port_lock
> >       5 &tq_init_sem
> >       6 &adb_handler_sem
> >       6 &dev->parent->sem
> >       6 &driver_lock
> >       6 &ha->vport_sem
> >       7 &big_buffer_sem
> >       8 &dir_f->sem
> >       9 &c->alloc_sem
> >      11 &dev->sem
> >      11 &usbvision->lock
> >      12 &c->erase_free_sem
> >      15 &u132->scheduler_lock
> >      16 &zfcp_data.config_sema
> >      17 &f->sem
> > 
> > would remove 164 of them, so it would convert half of the remaining 
> > semaphore use in the kernel. So the job is almost finished - would 
> > anyone like to go for the final grand feat: complete removal of 
> > semaphores from the kernel? :-)
> 
> Sure, you want to split the list?

split the list with you? Feel free to take any of those :-) dev->sem is 
nontrivial and probably not possible right now - and some of the others 
might be problematic too. But there might be fixable ones in the list. 
This shouldnt become like the BKL conversion - never truly finished.

	Ingo

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals
  2007-11-17 18:04                   ` Ingo Molnar
@ 2007-11-17 18:12                     ` Daniel Walker
  2007-11-18 12:33                       ` Peter Zijlstra
  2007-11-17 22:49                     ` Remy Bohmer
  1 sibling, 1 reply; 28+ messages in thread
From: Daniel Walker @ 2007-11-17 18:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Remy Bohmer, Steven Rostedt, Thomas Gleixner, RT, linux-kernel

On Sat, 2007-11-17 at 19:04 +0100, Ingo Molnar wrote:

> 
> split the list with you? Feel free to take any of those :-) dev->sem is 
> nontrivial and probably not possible right now - and some of the others 
> might be problematic too. But there might be fixable ones in the list. 
> This shouldnt become like the BKL conversion - never truly finished.

If I said I was going to do some set, at least it's more likely that the
work isn't duplicated..

What specifically is wrong with dev->sem ?

Daniel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals
  2007-11-17 18:04                   ` Ingo Molnar
  2007-11-17 18:12                     ` Daniel Walker
@ 2007-11-17 22:49                     ` Remy Bohmer
  1 sibling, 0 replies; 28+ messages in thread
From: Remy Bohmer @ 2007-11-17 22:49 UTC (permalink / raw)
  To: Ingo Molnar, Daniel Walker
  Cc: Steven Rostedt, Thomas Gleixner, RT, linux-kernel

> > Sure, you want to split the list?
>
> split the list with you? Feel free to take any of those :-) dev->sem is
> nontrivial and probably not possible right now - and some of the others
> might be problematic too. But there might be fixable ones in the list.
> This shouldnt become like the BKL conversion - never truly finished.

Hey Ingo and Daniel, Leave some of the fun open for me :-)

I just looked at the list and I found a few that seem doable.


Remy

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals
  2007-11-17 18:12                     ` Daniel Walker
@ 2007-11-18 12:33                       ` Peter Zijlstra
  2007-11-18 22:26                         ` David Chinner
  0 siblings, 1 reply; 28+ messages in thread
From: Peter Zijlstra @ 2007-11-18 12:33 UTC (permalink / raw)
  To: Daniel Walker
  Cc: Ingo Molnar, Remy Bohmer, Steven Rostedt, Thomas Gleixner, RT,
	linux-kernel, Dave Chinner


On Sat, 2007-11-17 at 10:12 -0800, Daniel Walker wrote:

> What specifically is wrong with dev->sem ?

Nothing really, other than that they use semaphores to avoid lockdep :-/

I think I know how to annotate this, after Alan Stern explained all the
use cases, but I haven't come around to implementing it. Hope to do that
soonish.

Another real semaphore user is XFS, they really use the down/up
asymmetry that semaphores allow, last time I spoke with Dave Chinner he
didn't know a way around this.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals
  2007-11-18 12:33                       ` Peter Zijlstra
@ 2007-11-18 22:26                         ` David Chinner
  0 siblings, 0 replies; 28+ messages in thread
From: David Chinner @ 2007-11-18 22:26 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Daniel Walker, Ingo Molnar, Remy Bohmer, Steven Rostedt,
	Thomas Gleixner, RT, linux-kernel, Dave Chinner

On Sun, Nov 18, 2007 at 01:33:50PM +0100, Peter Zijlstra wrote:
> 
> On Sat, 2007-11-17 at 10:12 -0800, Daniel Walker wrote:
> 
> > What specifically is wrong with dev->sem ?
> 
> Nothing really, other than that they use semaphores to avoid lockdep :-/
> 
> I think I know how to annotate this, after Alan Stern explained all the
> use cases, but I haven't come around to implementing it. Hope to do that
> soonish.
> 
> Another real semaphore user is XFS, they really use the down/up
> asymmetry that semaphores allow, last time I spoke with Dave Chinner he
> didn't know a way around this.

Hasn't changed recently. We could convert some to completions, but
others are much more difficult.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals
  2007-11-17 17:55                 ` Daniel Walker
@ 2007-11-19  7:25                     ` Jon Masters
  2007-11-19  7:25                     ` Jon Masters
  1 sibling, 0 replies; 28+ messages in thread
From: Jon Masters @ 2007-11-19  7:24 UTC (permalink / raw)
  To: Daniel Walker
  Cc: Ingo Molnar, Remy Bohmer, Steven Rostedt, Thomas Gleixner, RT,
	linux-kernel

On Sat, 2007-11-17 at 09:55 -0800, Daniel Walker wrote:

> Sure, you want to split the list?

I'm happy to grab a few of these too. Let me know if either of you or
Ingo is working on the whole lot and about to dump it on us ;-)

Jon.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals
@ 2007-11-19  7:25                     ` Jon Masters
  0 siblings, 0 replies; 28+ messages in thread
From: Jon Masters @ 2007-11-19  7:25 UTC (permalink / raw)
  To: linux-kernel

On Sat, 2007-11-17 at 09:55 -0800, Daniel Walker wrote:

> Sure, you want to split the list?

I'm happy to grab a few of these too. Let me know if either of you or
Ingo is working on the whole lot and about to dump it on us ;-)

Jon.




^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals
  2007-11-17 15:36           ` Remy Bohmer
@ 2007-11-19 12:55             ` Remy Bohmer
  2007-11-19 13:54               ` Steven Rostedt
  0 siblings, 1 reply; 28+ messages in thread
From: Remy Bohmer @ 2007-11-19 12:55 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Ingo Molnar, Thomas Gleixner, RT, linux-kernel, Daniel Walker

Hello Steven,

> > OK, I wont be able to work on this this weekend, but I'll try to get to it
> > on Monday.  A better example to show the bug you are looking for is simply
> > create a mutex and create a thread that grabs that mutex and goes to
> > sleep. Have your driver read grab that mutex with
> > mutex_lock_interruptible. And if the signal code is broken with this, then
> > you definitely got a point that the inerruptible code is broken.

I removed the 'struct semaphore' completely from my driver, using real
mutexes now instead, replace the signalling semaphores by 'struct
completion' mechanisms and got rid of the possible race I saw with the
completions in a different way, and now the problem is completely
gone!

Posix Signals work properly now (no OOPS anymore), so the problem was
likely related to the way I used the 'struct semaphore' types, which
is thus different compared to the non-RT kernel and therefor quite
confusing.

So, thank you (and Daniel) for pointing me into the right direction.

Now lets get rid of the 'struct semaphore' completely in the kernel :-))


Kind Regards,

Remy Bohmer

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals
  2007-11-19 12:55             ` Remy Bohmer
@ 2007-11-19 13:54               ` Steven Rostedt
  0 siblings, 0 replies; 28+ messages in thread
From: Steven Rostedt @ 2007-11-19 13:54 UTC (permalink / raw)
  To: Remy Bohmer; +Cc: Ingo Molnar, Thomas Gleixner, RT, linux-kernel, Daniel Walker


On Mon, 19 Nov 2007, Remy Bohmer wrote:

> Hello Steven,
>
> > > OK, I wont be able to work on this this weekend, but I'll try to get to it
> > > on Monday.  A better example to show the bug you are looking for is simply
> > > create a mutex and create a thread that grabs that mutex and goes to
> > > sleep. Have your driver read grab that mutex with
> > > mutex_lock_interruptible. And if the signal code is broken with this, then
> > > you definitely got a point that the inerruptible code is broken.
>
> I removed the 'struct semaphore' completely from my driver, using real
> mutexes now instead, replace the signalling semaphores by 'struct
> completion' mechanisms and got rid of the possible race I saw with the
> completions in a different way, and now the problem is completely
> gone!
>
> Posix Signals work properly now (no OOPS anymore), so the problem was
> likely related to the way I used the 'struct semaphore' types, which
> is thus different compared to the non-RT kernel and therefor quite
> confusing.
>
> So, thank you (and Daniel) for pointing me into the right direction.
>
> Now lets get rid of the 'struct semaphore' completely in the kernel :-))

Remy,

Thanks a lot for looking further into this. I'd like to join the fun in
removing the rest of the semaphores in the kernel, but with you, Ingo,
Daniel and Jon going to do that, one more cook will just spoil the stew.

Once we get rid of all the semaphores in the kernel that are being used as
mutexes, then we can change the code in the -rt patch to keep semaphores
default as compat_semaphores.  And any out of tree driver would just need
to be fixed to use mutexes.

Well, read/write semaphores might have to stay special.

Thanks,

-- Steve

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals
  2007-11-19  7:25                     ` Jon Masters
  (?)
@ 2007-11-19 15:31                     ` Daniel Walker
  2007-11-19 15:51                       ` Remy Bohmer
  -1 siblings, 1 reply; 28+ messages in thread
From: Daniel Walker @ 2007-11-19 15:31 UTC (permalink / raw)
  To: Jon Masters
  Cc: Ingo Molnar, Remy Bohmer, Steven Rostedt, Thomas Gleixner, RT,
	linux-kernel

On Mon, 2007-11-19 at 02:24 -0500, Jon Masters wrote:
> On Sat, 2007-11-17 at 09:55 -0800, Daniel Walker wrote:
> 
> > Sure, you want to split the list?
> 
> I'm happy to grab a few of these too. Let me know if either of you or
> Ingo is working on the whole lot and about to dump it on us ;-)

I was planning to just try a few, so I think your safe.. 

Daniel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals
  2007-11-19 15:31                     ` Daniel Walker
@ 2007-11-19 15:51                       ` Remy Bohmer
  2007-11-19 16:11                         ` Daniel Walker
  0 siblings, 1 reply; 28+ messages in thread
From: Remy Bohmer @ 2007-11-19 15:51 UTC (permalink / raw)
  To: Daniel Walker
  Cc: Jon Masters, Ingo Molnar, Steven Rostedt, Thomas Gleixner, RT,
	linux-kernel

Hello Daniel,

Can we coordinate who is doing what somehow, to prevent double work?

Remy

2007/11/19, Daniel Walker <dwalker@mvista.com>:
> On Mon, 2007-11-19 at 02:24 -0500, Jon Masters wrote:
> > On Sat, 2007-11-17 at 09:55 -0800, Daniel Walker wrote:
> >
> > > Sure, you want to split the list?
> >
> > I'm happy to grab a few of these too. Let me know if either of you or
> > Ingo is working on the whole lot and about to dump it on us ;-)
>
> I was planning to just try a few, so I think your safe..
>
> Daniel
>
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals
  2007-11-19 15:51                       ` Remy Bohmer
@ 2007-11-19 16:11                         ` Daniel Walker
  0 siblings, 0 replies; 28+ messages in thread
From: Daniel Walker @ 2007-11-19 16:11 UTC (permalink / raw)
  To: Remy Bohmer
  Cc: Jon Masters, Ingo Molnar, Steven Rostedt, Thomas Gleixner, RT,
	linux-kernel

On Mon, 2007-11-19 at 16:51 +0100, Remy Bohmer wrote:
> Hello Daniel,
> 
> Can we coordinate who is doing what somehow, to prevent double work?

I actually haven't looked at any on Ingo's list (i8042tregs).. Feel free
to call out the ones your looking at tho..

Daniel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals
  2007-11-17 17:46               ` Ingo Molnar
  2007-11-17 17:55                 ` Daniel Walker
@ 2007-11-20 16:43                 ` Daniel Walker
  2007-11-20 20:37                   ` Ingo Molnar
  1 sibling, 1 reply; 28+ messages in thread
From: Daniel Walker @ 2007-11-20 16:43 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Remy Bohmer, Steven Rostedt, Thomas Gleixner, RT, linux-kernel

On Sat, 2007-11-17 at 18:46 +0100, Ingo Molnar wrote:

> fixing the top 20:
> 

There are about 25 DECLARE_MUTEX() semaphores remaining .. One is the
BKL which I would guess can't be converted. The others I've looked at
appear to be trivial find/replace changes to get them to use the mutex
type.. Any reason those haven't been converted over?

Daniel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals
  2007-11-20 16:43                 ` Daniel Walker
@ 2007-11-20 20:37                   ` Ingo Molnar
  2007-11-20 20:54                     ` Daniel Walker
  0 siblings, 1 reply; 28+ messages in thread
From: Ingo Molnar @ 2007-11-20 20:37 UTC (permalink / raw)
  To: Daniel Walker
  Cc: Remy Bohmer, Steven Rostedt, Thomas Gleixner, RT, linux-kernel


* Daniel Walker <dwalker@mvista.com> wrote:

> On Sat, 2007-11-17 at 18:46 +0100, Ingo Molnar wrote:
> 
> > fixing the top 20:
> 
> There are about 25 DECLARE_MUTEX() semaphores remaining .. One is the 
> BKL which I would guess can't be converted. The others I've looked at 
> appear to be trivial find/replace changes to get them to use the mutex 
> type.. Any reason those haven't been converted over?

i guess talking about it instead of posting patches is a factor ;-)

	Ingo

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals
  2007-11-20 20:37                   ` Ingo Molnar
@ 2007-11-20 20:54                     ` Daniel Walker
  0 siblings, 0 replies; 28+ messages in thread
From: Daniel Walker @ 2007-11-20 20:54 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Remy Bohmer, Steven Rostedt, Thomas Gleixner, RT, linux-kernel

On Tue, 2007-11-20 at 21:37 +0100, Ingo Molnar wrote:
> * Daniel Walker <dwalker@mvista.com> wrote:
> 
> > On Sat, 2007-11-17 at 18:46 +0100, Ingo Molnar wrote:
> > 
> > > fixing the top 20:
> > 
> > There are about 25 DECLARE_MUTEX() semaphores remaining .. One is the 
> > BKL which I would guess can't be converted. The others I've looked at 
> > appear to be trivial find/replace changes to get them to use the mutex 
> > type.. Any reason those haven't been converted over?
> 
> i guess talking about it instead of posting patches is a factor ;-)

There's several using init_MUTEX() (~70 or so), reviewing a few they
look fairly trivial ..

Daniel

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2007-11-20 20:59 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-11-16 15:51 [BUG on PREEMPT_RT, 2.6.23.1-rt5] in rt-mutex code and signals Remy Bohmer
2007-11-16 20:20 ` Steven Rostedt
2007-11-16 23:02   ` Remy Bohmer
2007-11-16 23:37     ` Steven Rostedt
2007-11-17 11:44       ` Remy Bohmer
2007-11-17 14:08         ` Steven Rostedt
2007-11-17 15:06           ` Steven Rostedt
2007-11-17 15:36           ` Remy Bohmer
2007-11-19 12:55             ` Remy Bohmer
2007-11-19 13:54               ` Steven Rostedt
2007-11-17 16:22         ` Daniel Walker
2007-11-17 17:09           ` Remy Bohmer
2007-11-17 17:29             ` Daniel Walker
2007-11-17 17:46               ` Ingo Molnar
2007-11-17 17:55                 ` Daniel Walker
2007-11-17 18:04                   ` Ingo Molnar
2007-11-17 18:12                     ` Daniel Walker
2007-11-18 12:33                       ` Peter Zijlstra
2007-11-18 22:26                         ` David Chinner
2007-11-17 22:49                     ` Remy Bohmer
2007-11-19  7:24                   ` Jon Masters
2007-11-19  7:25                     ` Jon Masters
2007-11-19 15:31                     ` Daniel Walker
2007-11-19 15:51                       ` Remy Bohmer
2007-11-19 16:11                         ` Daniel Walker
2007-11-20 16:43                 ` Daniel Walker
2007-11-20 20:37                   ` Ingo Molnar
2007-11-20 20:54                     ` Daniel Walker

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.