* possible problem with sem_post
@ 2012-02-13 13:42 Tim Blechmann
From: Tim Blechmann @ 2012-02-13 13:42 UTC (permalink / raw)
To: linux-rt-users
hi all,
i am experiencing a strange issue with lockups of my application. i have multiple
high-priority real-time threads (as many threads as there are physical cpus), and
one of the threads seems to lock up inside sem_post(). these lockups only occur
very rarely, after stressing the application (and the semaphore) for a rather
long time.
sem_post seems to call sys_futex with FUTEX_WAKE. this issue only started after
installing the 3.0 rt kernel (currently 3.0.20-rt35); i haven't seen this
behavior on any non-rt kernel (currently running another stress-test).
the machine is a thinkpad t410, x86_64.
if this is a problem with the rt kernel, is there any way to debug it? or is it
in general unsafe to call sem_post from real-time threads?
thanks, tim
* Re: possible problem with sem_post
From: Tim Blechmann @ 2012-02-13 19:38 UTC (permalink / raw)
To: linux-rt-users
> i am experiencing a strange issue with lockups of my application. i have
> multiple high-priority real-time threads (as many threads as there are
> physical cpus), and one of the threads seems to lock up inside sem_post().
> these lockups only occur very rarely, after stressing the application (and
> the semaphore) for a rather long time.
>
> sem_post seems to call sys_futex with FUTEX_WAKE. this issue only occurred
> recently after installing the 3.0 rt kernel (currently 3.0.20-rt35). but
> haven't seen this behavior on any non-rt kernel (currently running another
> stress-test). the machine is a thinkpad t410, x86_64.
>
> if this is a problem of the rt-kernel, is there any way to debug it? or is it
> in general unsafe to call sem_post from real-time threads?
ok, i ran the same test on a stock ubuntu kernel for a few hours without any
problem.
the situation: 2 cpus, 2 high-priority SCHED_FIFO threads, and several
low-priority threads, one of which waits on a semaphore that is posted by the rt
threads.
my guess is that a low-priority thread acquires a spinlock and then gets
preempted while a high-priority thread is spinning on that same spinlock ... is
this possible?
thanks, tim
* Re: possible problem with sem_post
From: Tim Blechmann @ 2012-02-14 18:51 UTC (permalink / raw)
To: linux-rt-users
>> i am experiencing a strange issue with lockups of my application. i have
>> multiple high-priority real-time threads (as many threads as there are
>> physical cpus), and one of the threads seems to lock up inside sem_post().
>> these lockups only occur very rarely, after stressing the application (and
>> the semaphore) for a rather long time.
>>
>> sem_post seems to call sys_futex with FUTEX_WAKE. this issue only occurred
>> recently after installing the 3.0 rt kernel (currently 3.0.20-rt35). but
>> haven't seen this behavior on any non-rt kernel (currently running another
>> stress-test). the machine is a thinkpad t410, x86_64.
>>
>> if this is a problem of the rt-kernel, is there any way to debug it? or is it
>> in general unsafe to call sem_post from real-time threads?
>
> ok, i ran the same test on a stock ubuntu kernel for a few hours without any
> problem.
>
> the situation: 2 cpus, 2 high-priority SCHED_FIFO threads, and several
> low-priority threads, one of which waits on a semaphore that is posted by
> the rt threads. my guess is that a low-priority thread acquires a spinlock
> and then gets preempted while a high-priority thread is spinning on that
> same spinlock ... is this possible?
for the record:
[ 999.660730] BUG: sleeping function called from invalid context at kernel/rtmutex.c:646
[ 999.660735] in_atomic(): 1, irqs_disabled(): 1, pid: 22, name: irq/9-acpi
[ 999.660738] 1 lock held by irq/9-acpi/22:
[ 999.660739] #0: (acpi_gbl_gpe_lock){......}, at: [<ffffffff812c2c1c>] acpi_ev_gpe_detect+0x2c/0x108
[ 999.660752] irq event stamp: 84018
[ 999.660753] hardirqs last enabled at (84017): [<ffffffff8158bb5b>] _raw_spin_unlock_irq+0x2b/0x60
[ 999.660761] hardirqs last disabled at (84018): [<ffffffff8158b974>] _raw_spin_lock_irqsave+0x24/0x70
[ 999.660765] softirqs last enabled at (0): [<ffffffff8104e1f4>] copy_process+0x6c4/0x1680
[ 999.660772] softirqs last disabled at (0): [< (null)>] (null)
[ 999.660776] Pid: 22, comm: irq/9-acpi Not tainted 3.0.20-rt36+ #76
[ 999.660778] Call Trace:
[ 999.660787] [<ffffffff81085330>] ? print_irqtrace_events+0xd0/0xe0
[ 999.660791] [<ffffffff8103ec7a>] __might_sleep+0xea/0x120
[ 999.660795] [<ffffffff8158b1af>] rt_spin_lock+0x1f/0x60
[ 999.660802] [<ffffffff81108b13>] kmem_cache_alloc+0x83/0x210
[ 999.660807] [<ffffffff812b7cef>] ? acpi_ec_sync_query+0xbf/0xbf
[ 999.660813] [<ffffffff812b26db>] __acpi_os_execute+0x2c/0x10c
[ 999.660817] [<ffffffff812b27c6>] acpi_os_execute+0xb/0xd
[ 999.660820] [<ffffffff812b8389>] acpi_ec_gpe_handler+0x69/0x72
[ 999.660824] [<ffffffff812c2b77>] acpi_ev_gpe_dispatch+0xc0/0x139
[ 999.660827] [<ffffffff812c2ca0>] acpi_ev_gpe_detect+0xb0/0x108
[ 999.660834] [<ffffffff810ba070>] ? irq_thread_fn+0x50/0x50
[ 999.660838] [<ffffffff812c137d>] acpi_ev_sci_xrupt_handler+0x1d/0x26
[ 999.660841] [<ffffffff812b2831>] acpi_irq+0x11/0x2c
[ 999.660845] [<ffffffff810ba099>] irq_forced_thread_fn+0x29/0x70
[ 999.660848] [<ffffffff810b9fa2>] irq_thread+0x172/0x1f0
[ 999.660853] [<ffffffff810b9e30>] ? irq_finalize_oneshot+0x120/0x120
[ 999.660858] [<ffffffff810706fc>] kthread+0x9c/0xb0
[ 999.660865] [<ffffffff81592964>] kernel_thread_helper+0x4/0x10
[ 999.660869] [<ffffffff8103ea67>] ? finish_task_switch+0x87/0x110
[ 999.660873] [<ffffffff8158bfd8>] ? retint_restore_args+0x13/0x13
[ 999.660877] [<ffffffff81070660>] ? __init_kthread_worker+0xa0/0xa0
[ 999.660881] [<ffffffff81592960>] ? gs_change+0x13/0x13
thnx, tim