* 2.6.33.1-rt11 BUG: sleeping function called from invalid context at kernel/rtmutex.c:684
@ 2010-04-04 9:14 Udo van den Heuvel
2010-04-04 9:17 ` Udo van den Heuvel
2010-04-06 8:07 ` Thomas Gleixner
0 siblings, 2 replies; 7+ messages in thread
From: Udo van den Heuvel @ 2010-04-04 9:14 UTC
To: RT
Hello,
I see a load of these after booting into 2.6.33.1-rt11:
BUG: sleeping function called from invalid context at kernel/rtmutex.c:684
pcnt: 1 0 in_atomic(): 1, irqs_disabled(): 0, pid: 1507, name: md1_raid5
Pid: 1507, comm: md1_raid5 Not tainted 2.6.33.1-rt11 #1
Call Trace:
[<ffffffff8138fb8c>] ? rt_spin_lock+0x2c/0x70
[<ffffffff812becc4>] ? __raid_run_ops+0x304/0xc60
[<ffffffff812c0ccd>] ? handle_stripe+0x6bd/0x1a70
[<ffffffff8104b460>] ? mod_timer+0x150/0x200
[<ffffffff812c23f6>] ? raid5d+0x376/0x4f0
[<ffffffff8138e5bd>] ? schedule_timeout+0x22d/0x2b0
[<ffffffff8138fb8c>] ? rt_spin_lock+0x2c/0x70
[<ffffffff812cd0f3>] ? md_thread+0x53/0x120
[<ffffffff810573a0>] ? autoremove_wake_function+0x0/0x30
[<ffffffff812cd0a0>] ? md_thread+0x0/0x120
[<ffffffff81057016>] ? kthread+0x96/0xa0
[<ffffffff81037908>] ? finish_task_switch+0x58/0xd0
[<ffffffff810032d4>] ? kernel_thread_helper+0x4/0x10
[<ffffffff81056f80>] ? kthread+0x0/0xa0
[<ffffffff810032d0>] ? kernel_thread_helper+0x0/0x10
As these appear to be touching my raid array I am quite eager to learn
how I can fix the BUGs.
Please have a look and explain.
Kind regards,
Udo
* Re: 2.6.33.1-rt11 BUG: sleeping function called from invalid context at kernel/rtmutex.c:684
From: Udo van den Heuvel @ 2010-04-04 9:17 UTC
To: RT
Hello,
On 2010-04-04 11:14, Udo van den Heuvel wrote:
> I see a load of these after booting into 2.6.33.1-rt11:
>
> BUG: sleeping function called from invalid context at kernel/rtmutex.c:684
> pcnt: 1 0 in_atomic(): 1, irqs_disabled(): 0, pid: 1507, name: md1_raid5
> Pid: 1507, comm: md1_raid5 Not tainted 2.6.33.1-rt11 #1
> Call Trace:
> [<ffffffff8138fb8c>] ? rt_spin_lock+0x2c/0x70
> [<ffffffff812becc4>] ? __raid_run_ops+0x304/0xc60
> [<ffffffff812c0ccd>] ? handle_stripe+0x6bd/0x1a70
> [<ffffffff8104b460>] ? mod_timer+0x150/0x200
> [<ffffffff812c23f6>] ? raid5d+0x376/0x4f0
> [<ffffffff8138e5bd>] ? schedule_timeout+0x22d/0x2b0
> [<ffffffff8138fb8c>] ? rt_spin_lock+0x2c/0x70
> [<ffffffff812cd0f3>] ? md_thread+0x53/0x120
> [<ffffffff810573a0>] ? autoremove_wake_function+0x0/0x30
> [<ffffffff812cd0a0>] ? md_thread+0x0/0x120
> [<ffffffff81057016>] ? kthread+0x96/0xa0
> [<ffffffff81037908>] ? finish_task_switch+0x58/0xd0
> [<ffffffff810032d4>] ? kernel_thread_helper+0x4/0x10
> [<ffffffff81056f80>] ? kthread+0x0/0xa0
> [<ffffffff810032d0>] ? kernel_thread_helper+0x0/0x10
>
I also see this one happening in 2.6.33.1-rt11, in between a number of the
BUGs I mentioned above:
BUG: scheduling while atomic: md1_raid5/0x00000001/1507, CPU#2
Modules linked in: radeon ttm drm_kms_helper drm fb i2c_algo_bit
cfbcopyarea cfbimgblt cfbfillrect nfsd nfs_acl exportfs eeprom it87
hwmon_vid lockd sunrpc cpufreq_ondemand powernow_k8 ipt_REJECT
nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables
nf_conntrack_netbios_ns ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 xt_state
nf_conntrack ip6table_filter ip6_tables x_tables ipv6 microcode
binfmt_misc ext4 jbd2 crc16 snd_hda_codec_realtek snd_ice1712
snd_ice17xx_ak4xxx snd_ak4xxx_adda snd_cs8427 snd_ac97_codec ac97_bus
snd_i2c snd_mpu401_uart snd_hda_intel snd_hda_codec snd_seq snd_rawmidi
snd_pcm snd_seq_device snd_timer i2c_nforce2 ppdev snd ehci_hcd
parport_pc ohci1394 ohci_hcd parport ieee1394 sg snd_page_alloc evdev
k10temp button floppy [last unloaded: scsi_wait_scan]
Pid: 1507, comm: md1_raid5 Not tainted 2.6.33.1-rt11 #1
Call Trace:
[<ffffffff8138da5a>] ? __schedule+0x38a/0x920
[<ffffffff8138fb8c>] ? rt_spin_lock+0x2c/0x70
[<ffffffff8138fe30>] ? _raw_spin_unlock+0x10/0x40
[<ffffffff8106a54f>] ? task_blocks_on_rt_mutex+0x17f/0x1e0
[<ffffffff8138e110>] ? schedule+0x10/0x20
[<ffffffff8138f22f>] ? rt_spin_lock_slowlock+0x1ef/0x2c0
[<ffffffff812bcf38>] ? release_stripe+0x28/0x50
[<ffffffff81196d4e>] ? async_xor+0x13e/0x170
[<ffffffff812bee4a>] ? __raid_run_ops+0x48a/0xc60
[<ffffffff812bd140>] ? ops_complete_reconstruct+0x0/0xa0
[<ffffffff812c0ccd>] ? handle_stripe+0x6bd/0x1a70
[<ffffffff812c23f6>] ? raid5d+0x376/0x4f0
[<ffffffff812cd0f3>] ? md_thread+0x53/0x120
[<ffffffff810573a0>] ? autoremove_wake_function+0x0/0x30
[<ffffffff812cd0a0>] ? md_thread+0x0/0x120
[<ffffffff81057016>] ? kthread+0x96/0xa0
[<ffffffff81037908>] ? finish_task_switch+0x58/0xd0
[<ffffffff810032d4>] ? kernel_thread_helper+0x4/0x10
[<ffffffff81056f80>] ? kthread+0x0/0xa0
[<ffffffff810032d0>] ? kernel_thread_helper+0x0/0x10
Udo
* Re: 2.6.33.1-rt11 BUG: sleeping function called from invalid context at kernel/rtmutex.c:684
From: Thomas Gleixner @ 2010-04-06 8:07 UTC
To: Udo van den Heuvel; +Cc: RT
On Sun, 4 Apr 2010, Udo van den Heuvel wrote:
> Hello,
>
> I see a load of these after booting into 2.6.33.1-rt11:
>
> BUG: sleeping function called from invalid context at kernel/rtmutex.c:684
> pcnt: 1 0 in_atomic(): 1, irqs_disabled(): 0, pid: 1507, name: md1_raid5
> Pid: 1507, comm: md1_raid5 Not tainted 2.6.33.1-rt11 #1
> Call Trace:
> [<ffffffff8138fb8c>] ? rt_spin_lock+0x2c/0x70
> [<ffffffff812becc4>] ? __raid_run_ops+0x304/0xc60
> [<ffffffff812c0ccd>] ? handle_stripe+0x6bd/0x1a70
> [<ffffffff8104b460>] ? mod_timer+0x150/0x200
> [<ffffffff812c23f6>] ? raid5d+0x376/0x4f0
> [<ffffffff8138e5bd>] ? schedule_timeout+0x22d/0x2b0
> [<ffffffff8138fb8c>] ? rt_spin_lock+0x2c/0x70
> [<ffffffff812cd0f3>] ? md_thread+0x53/0x120
> [<ffffffff810573a0>] ? autoremove_wake_function+0x0/0x30
> [<ffffffff812cd0a0>] ? md_thread+0x0/0x120
> [<ffffffff81057016>] ? kthread+0x96/0xa0
> [<ffffffff81037908>] ? finish_task_switch+0x58/0xd0
> [<ffffffff810032d4>] ? kernel_thread_helper+0x4/0x10
> [<ffffffff81056f80>] ? kthread+0x0/0xa0
> [<ffffffff810032d0>] ? kernel_thread_helper+0x0/0x10
>
> As these appear to be touching my raid array I am quite eager to learn
> how I can fix the BUGs.
>
> Please have a look and explain.
That's caused by the get_cpu()/put_cpu() preempt disabled region. Can
you try the following (untested) patch ?
Thanks,
tglx
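To illustrate the explanation above: get_cpu() disables preemption, and on
-rt a spinlock_t is a sleeping lock, so taking one inside the
get_cpu()/put_cpu() region trips the "sleeping function called from invalid
context" check. What follows is a minimal userspace sketch of that
interaction, not kernel code. Every name ending in _model is hypothetical and
exists only for this illustration.

```c
#include <stdio.h>

/* Toy model, not kernel code: every *_model name is hypothetical. */
static int preempt_count_model;	/* > 0 means "atomic": sleeping is forbidden */
static int bug_reports_model;	/* number of times the check has fired */

/* Like get_cpu(): disables preemption, then returns a CPU number. */
static int get_cpu_model(void)
{
	preempt_count_model++;
	return 0;	/* pretend we always run on CPU 0 */
}

/* Like put_cpu(): re-enables preemption. */
static void put_cpu_model(void)
{
	preempt_count_model--;
}

/* On -rt a spinlock_t may sleep, so acquiring one is checked against
 * the atomic state, similar in spirit to might_sleep(). */
static void rt_spin_lock_model(void)
{
	if (preempt_count_model > 0) {
		bug_reports_model++;
		printf("BUG: sleeping function called from invalid context "
		       "(in_atomic(): 1)\n");
	}
}

/* Shape of the reported code path: a lock is taken inside the
 * get_cpu()/put_cpu() region. Returns the number of new BUG reports. */
static int run_ops_with_get_cpu_model(void)
{
	int before = bug_reports_model;

	get_cpu_model();
	rt_spin_lock_model();
	put_cpu_model();
	return bug_reports_model - before;
}

/* Shape without the preempt-disabled region around the lock. */
static int run_ops_without_get_cpu_model(void)
{
	int before = bug_reports_model;

	rt_spin_lock_model();
	return bug_reports_model - before;
}
```

In this model, run_ops_with_get_cpu_model() produces one report and
run_ops_without_get_cpu_model() produces none, which mirrors the
in_atomic(): 1 value shown in the original trace.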
---
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index ceb24af..b61eaa6 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -1149,8 +1149,9 @@ static void __raid_run_ops(struct stripe_head *sh, unsigned long ops_request)
 	struct raid5_percpu *percpu;
 	unsigned long cpu;
 
-	cpu = get_cpu();
+	cpu = raw_smp_processor_id();
 	percpu = per_cpu_ptr(conf->percpu, cpu);
+	spin_lock(&percpu->lock);
 	if (test_bit(STRIPE_OP_BIOFILL, &ops_request)) {
 		ops_run_biofill(sh);
 		overlap_clear++;
@@ -1202,7 +1203,7 @@ static void __raid_run_ops(struct stripe_head *sh, unsigned long ops_request)
 			if (test_and_clear_bit(R5_Overlap, &dev->flags))
 				wake_up(&sh->raid_conf->wait_for_overlap);
 		}
-	put_cpu();
+	spin_unlock(&percpu->lock);
 }
 
 #ifdef CONFIG_MULTICORE_RAID456
diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
index dd70835..2db71cd 100644
--- a/drivers/md/raid5.h
+++ b/drivers/md/raid5.h
@@ -400,6 +400,7 @@ struct raid5_private_data {
 	 */
 	/* per cpu variables */
 	struct raid5_percpu {
+		spinlock_t	lock;	     /* Protection for -RT */
 		struct page	*spare_page; /* Used when checking P/Q in raid6 */
 		void		*scribble;   /* space for constructing buffer
 					      * lists and performing address
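
The patch above replaces "stay on this CPU with preemption off" with "pick a
per-CPU slot and hold that slot's lock". A rough userspace analogue of the
pattern, assuming pthread mutexes in place of spinlock_t and a fixed-size
array in place of real per-CPU data, might look like this. All *_model names
are hypothetical.

```c
#include <pthread.h>

#define NR_SLOTS_MODEL 4	/* hypothetical stand-in for the number of CPUs */

/* Userspace analogue of struct raid5_percpu with the added lock. */
struct percpu_model {
	pthread_mutex_t lock;	/* plays the role of the new spinlock_t */
	long scratch;		/* stands in for spare_page / scribble */
};

static struct percpu_model slots_model[NR_SLOTS_MODEL];

/* Initialise every slot's lock before first use. */
static void slots_init_model(void)
{
	for (int i = 0; i < NR_SLOTS_MODEL; i++) {
		pthread_mutex_init(&slots_model[i].lock, NULL);
		slots_model[i].scratch = 0;
	}
}

/* The slot index is only a hint, like raw_smp_processor_id(): the caller
 * may be moved elsewhere afterwards. The lock, not the index, is what
 * guarantees exclusive use of the scratch area. */
static long use_slot_model(unsigned int cpu_hint, long value)
{
	struct percpu_model *p = &slots_model[cpu_hint % NR_SLOTS_MODEL];
	long result;

	pthread_mutex_lock(&p->lock);
	p->scratch = value;
	result = p->scratch * 2;
	pthread_mutex_unlock(&p->lock);
	return result;
}
```

The posted hunks do not show where the new lock is initialised. The model
initialises each lock explicitly in slots_init_model(); a complete kernel
patch would presumably need a matching spin_lock_init() for every per-CPU
instance.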
* Re: 2.6.33.1-rt11 BUG: sleeping function called from invalid context at kernel/rtmutex.c:684
From: Udo van den Heuvel @ 2010-04-06 13:34 UTC
To: RT
Hello Thomas,
On 2010-04-06 10:07, Thomas Gleixner wrote:
> On Sun, 4 Apr 2010, Udo van den Heuvel wrote:
>> I see a load of these after booting into 2.6.33.1-rt11:
>>
>> BUG: sleeping function called from invalid context at kernel/rtmutex.c:684
>> pcnt: 1 0 in_atomic(): 1, irqs_disabled(): 0, pid: 1507, name: md1_raid5
>> Pid: 1507, comm: md1_raid5 Not tainted 2.6.33.1-rt11 #1
>> Call Trace:
>> [<ffffffff8138fb8c>] ? rt_spin_lock+0x2c/0x70
>> [<ffffffff812becc4>] ? __raid_run_ops+0x304/0xc60
>> [<ffffffff812c0ccd>] ? handle_stripe+0x6bd/0x1a70
>> [<ffffffff8104b460>] ? mod_timer+0x150/0x200
>> [<ffffffff812c23f6>] ? raid5d+0x376/0x4f0
>> [<ffffffff8138e5bd>] ? schedule_timeout+0x22d/0x2b0
>> [<ffffffff8138fb8c>] ? rt_spin_lock+0x2c/0x70
>> [<ffffffff812cd0f3>] ? md_thread+0x53/0x120
>> [<ffffffff810573a0>] ? autoremove_wake_function+0x0/0x30
>> [<ffffffff812cd0a0>] ? md_thread+0x0/0x120
>> [<ffffffff81057016>] ? kthread+0x96/0xa0
>> [<ffffffff81037908>] ? finish_task_switch+0x58/0xd0
>> [<ffffffff810032d4>] ? kernel_thread_helper+0x4/0x10
>> [<ffffffff81056f80>] ? kthread+0x0/0xa0
>> [<ffffffff810032d0>] ? kernel_thread_helper+0x0/0x10
>>
>> As these appear to be touching my raid array I am quite eager to learn
>> how I can fix the BUGs.
>>
>> Please have a look and explain.
>
> That's caused by the get_cpu()/put_cpu() preempt disabled region. Can
> you try the following (untested) patch ?
After applying, compiling and rebooting, the BUG messages appear to be
gone (after a quick test).
Thanks!
Only issue remaining is this message:
Apr 6 15:25:02 nawdew rtkit-daemon[3484]: Failed to make ourselves RT:
Operation not permitted
(repeated a number of times)
Is this also -rt related?
Kind regards,
Udo
* Re: 2.6.33.1-rt11 BUG: sleeping function called from invalid context at kernel/rtmutex.c:684
From: Luis Claudio R. Goncalves @ 2010-04-06 13:43 UTC
To: Udo van den Heuvel; +Cc: RT
On Tue, Apr 06, 2010 at 03:34:29PM +0200, Udo van den Heuvel wrote:
| Hello Thomas,
|
| On 2010-04-06 10:07, Thomas Gleixner wrote:
| > On Sun, 4 Apr 2010, Udo van den Heuvel wrote:
| >> I see a load of these after booting into 2.6.33.1-rt11:
| >>
| >> BUG: sleeping function called from invalid context at kernel/rtmutex.c:684
| >> pcnt: 1 0 in_atomic(): 1, irqs_disabled(): 0, pid: 1507, name: md1_raid5
| >> Pid: 1507, comm: md1_raid5 Not tainted 2.6.33.1-rt11 #1
| >> Call Trace:
| >> [<ffffffff8138fb8c>] ? rt_spin_lock+0x2c/0x70
| >> [<ffffffff812becc4>] ? __raid_run_ops+0x304/0xc60
| >> [<ffffffff812c0ccd>] ? handle_stripe+0x6bd/0x1a70
| >> [<ffffffff8104b460>] ? mod_timer+0x150/0x200
| >> [<ffffffff812c23f6>] ? raid5d+0x376/0x4f0
| >> [<ffffffff8138e5bd>] ? schedule_timeout+0x22d/0x2b0
| >> [<ffffffff8138fb8c>] ? rt_spin_lock+0x2c/0x70
| >> [<ffffffff812cd0f3>] ? md_thread+0x53/0x120
| >> [<ffffffff810573a0>] ? autoremove_wake_function+0x0/0x30
| >> [<ffffffff812cd0a0>] ? md_thread+0x0/0x120
| >> [<ffffffff81057016>] ? kthread+0x96/0xa0
| >> [<ffffffff81037908>] ? finish_task_switch+0x58/0xd0
| >> [<ffffffff810032d4>] ? kernel_thread_helper+0x4/0x10
| >> [<ffffffff81056f80>] ? kthread+0x0/0xa0
| >> [<ffffffff810032d0>] ? kernel_thread_helper+0x0/0x10
| >>
| >> As these appear to be touching my raid array I am quite eager to learn
| >> how I can fix the BUGs.
| >>
| >> Please have a look and explain.
| >
| > That's caused by the get_cpu()/put_cpu() preempt disabled region. Can
| > you try the following (untested) patch ?
|
| After applying, compiling and rebooting, the BUG messages appear to be
| gone (after a quick test).
| Thanks!
|
| Only issue remaining is this message:
|
| Apr 6 15:25:02 nawdew rtkit-daemon[3484]: Failed to make ourselves RT:
| Operation not permitted
| (repeated a number of times)
|
| Is this also -rt related?
Here's the description of rtkit:
Summary : Realtime Policy and Watchdog Daemon
Description :
RealtimeKit is a D-Bus system service that changes the
scheduling policy of user processes/threads to SCHED_RR (i.e. realtime
scheduling mode) on request. It is intended to be used as a secure
mechanism to allow real-time scheduling to be used by normal user
processes.
One famous user of rtkit is pulseaudio. In fact, it is the only user of
rtkit on my system. The message suggests that either your pulseaudio
config forbids the use of RT priorities or the user running pulseaudio has
no rights to use higher priorities (check with ulimit).
Luis
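
The ulimit check mentioned above can also be done from a program. The
following is a small sketch, assuming Linux and glibc, that reads the soft
RLIMIT_RTPRIO limit with getrlimit(); this is the same limit that bash
reports with ulimit -r. The helper name rtprio_soft_limit() and the choice to
report an unlimited setting as 99 are assumptions made for this example.

```c
#include <sys/resource.h>

/* Return the soft RLIMIT_RTPRIO value for the calling process, or -1 if
 * getrlimit() fails. A value of 0 means the process may not request any
 * realtime priority unless it is otherwise privileged. */
static long rtprio_soft_limit(void)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_RTPRIO, &rl) != 0)
		return -1;
	if (rl.rlim_cur == RLIM_INFINITY)
		return 99;	/* assumption: report "unlimited" as the top RT priority */
	return (long)rl.rlim_cur;
}
```

A result of 0 for the user running pulseaudio would be consistent with the
"Operation not permitted" message quoted above, although it does not prove
that the limit is the cause.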
|
| Kind regards,
| Udo
---end quoted text---
--
[ Luis Claudio R. Goncalves Bass - Gospel - RT ]
[ Fingerprint: 4FDD B8C4 3C59 34BD 8BE9 2696 7203 D980 A448 C8F8 ]
* Re: 2.6.33.1-rt11 BUG: sleeping function called from invalid context at kernel/rtmutex.c:684
From: Udo van den Heuvel @ 2010-04-06 14:01 UTC
To: RT
On 2010-04-06 15:43, Luis Claudio R. Goncalves wrote:
> | After applying, compiling and rebooting, the BUG messages appear to be
> | gone (after a quick test).
> | Thanks!
> |
> | Only issue remaining is this message:
> |
> | Apr 6 15:25:02 nawdew rtkit-daemon[3484]: Failed to make ourselves RT:
> | Operation not permitted
> | (repeated a number of times)
> |
> | Is this also -rt related?
>
> here's the description of rtkit:
>
> Summary : Realtime Policy and Watchdog Daemon
(...)
> One famous user of rtkit is pulseaudio. In fact, it is the only user of
> rtkit on my system. And the message reflects that either your pulseaudio
> config is forbidding usage of RT prios or the user running pulseaudio has
> no rights to use higher priorities (check with ulimit)
At the time, rtkit-daemon was running as the gdm user, since no user
was logged in through the GUI.
So perhaps I can fix this in /etc/security/limits.conf?
Hmm.
Putting the gdm user in the pulse-rt group does fix the rtkit-daemon
messages, but not the pulseaudio error.
I will seek help for this elsewhere, as it is off topic here.
Thanks for the patch!
Udo
* Re: 2.6.33.1-rt11 BUG: sleeping function called from invalid context at kernel/rtmutex.c:684
From: Thomas Gleixner @ 2010-04-06 14:39 UTC
To: Udo van den Heuvel; +Cc: RT
On Tue, 6 Apr 2010, Udo van den Heuvel wrote:
> Only issue remaining is this message:
>
> Apr 6 15:25:02 nawdew rtkit-daemon[3484]: Failed to make ourselves RT:
> Operation not permitted
> (repeated a number of times)
>
> Is this also -rt related?
Don't think so. RT does not change the rules for sched_setscheduler().
Thanks,
tglx
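
For reference, the failure mode under discussion can be reproduced with a
direct sched_setscheduler() call. This is only a sketch, assuming Linux; the
helper name try_make_rt() is made up for the example. It asks for SCHED_RR
for the calling process and returns the errno value on failure.

```c
#include <errno.h>
#include <sched.h>

/* Try to switch the calling process to SCHED_RR at the given priority.
 * Returns 0 on success, or the errno value from sched_setscheduler()
 * on failure. Note that on success the caller really does become a
 * realtime task. */
static int try_make_rt(int priority)
{
	struct sched_param sp = { .sched_priority = priority };

	if (sched_setscheduler(0, SCHED_RR, &sp) != 0)
		return errno;
	return 0;
}
```

When the caller lacks the needed privilege or rtprio limit, the call fails
with EPERM, whose standard message text is "Operation not permitted", the
same text seen in the rtkit-daemon log line.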