* sun x4500 soft lockup during raid creation
@ 2009-01-28 20:30 Vladimir Ivashchenko
2009-01-28 21:33 ` Joe Landman
` (4 more replies)
0 siblings, 5 replies; 13+ messages in thread
From: Vladimir Ivashchenko @ 2009-01-28 20:30 UTC (permalink / raw)
To: linux-raid
Hi,
We've got these new Sun X4500 servers. The system I'm playing with now
has 48 x 250 GB SATA HDDs.
Right now I'm creating two RAID6 arrays, 24 and 22 drives each:
mdadm --verbose --create /dev/md3 --level=6 \
--raid-devices=24 /dev/sda /dev/sdaa /dev/sdab /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai /dev/sdaj /dev/sdak /dev/sdal /dev/sdam /dev/sdan /dev/sdao /dev/sdap /dev/sdaq /dev/sdar /dev/sdas /dev/sdat /dev/sdau /dev/sdav /dev/sdb /dev/sdc
mdadm --verbose --create /dev/md4 --level=6 \
--raid-devices=22 /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx /dev/sdz
mdadm --detail is reporting that everything is going smoothly, however
my /var/log/messages is full of "BUG: soft lockup - CPU#X stuck for
10s!" errors appearing every 1-3 minutes.
CentOS 5.2, 2.6.18-92.1.22.el5PAE, sata_mv. Two dual-core Opterons @ 2.8
GHz, 16 GB RAM.
The system does not crash and otherwise seems to be healthy. Arrays are
still under construction and I don't know if they will actually work
yet.
What I noticed is that at first it was complaining about lockups in the
md3 process, but once I started creating md4, the complaints were
exclusively about the md4 process.
Any stability assurances or workarounds are highly appreciated. :)
Jan 28 21:31:32 SunSTG kernel: BUG: soft lockup - CPU#0 stuck for 10s! [md3_raid5:5672]
Jan 28 21:31:32 SunSTG kernel:
Jan 28 21:31:32 SunSTG kernel: Pid: 5672, comm: md3_raid5
Jan 28 21:31:32 SunSTG kernel: EIP: 0060:[<f8d68162>] CPU: 0
Jan 28 21:31:32 SunSTG kernel: EIP is at raid6_sse22_gen_syndrome+0x10a/0x1b6 [raid456]
Jan 28 21:31:32 SunSTG kernel: EFLAGS: 00000202 Not tainted (2.6.18-92.1.22.el5PAE #1)
Jan 28 21:31:32 SunSTG kernel: EAX: ea0774e0 EBX: 000004e0 ECX: ead0ad30 EDX: ea077000
Jan 28 21:31:32 SunSTG kernel: ESI: ead0ade0 EDI: 00000004 EBP: ead0add0 DS: 007b ES: 007b
Jan 28 21:31:32 SunSTG kernel: CR0: 80050033 CR2: 0806e000 CR3: 373239e0 CR4: 000006f0
Jan 28 21:31:32 SunSTG kernel: [<f8d63562>] compute_parity6+0x21c/0x28a [raid456]
Jan 28 21:31:32 SunSTG kernel: [<f8d6452e>] handle_stripe+0xc8b/0x215e [raid456]
Jan 28 21:31:32 SunSTG kernel: [<c041fdb3>] enqueue_task+0x29/0x39
Jan 28 21:31:32 SunSTG kernel: [<c0420629>] try_to_wake_up+0x371/0x37b
Jan 28 21:31:32 SunSTG kernel: [<c041edec>] __wake_up_common+0x2f/0x53
Jan 28 21:31:32 SunSTG kernel: [<c041fbe6>] __wake_up+0x2a/0x3d
Jan 28 21:31:32 SunSTG kernel: [<f8d61744>] release_stripe+0x21/0x2e [raid456]
Jan 28 21:31:33 SunSTG kernel: [<f8d65b0c>] raid5d+0x10b/0x130 [raid456]
Jan 28 21:31:33 SunSTG kernel: [<c059aca8>] md_thread+0xdf/0xf5
Jan 28 21:31:33 SunSTG kernel: [<c0436347>] autoremove_wake_function+0x0/0x2d
Jan 28 21:31:33 SunSTG kernel: [<c059abc9>] md_thread+0x0/0xf5
Jan 28 21:31:33 SunSTG kernel: [<c0436285>] kthread+0xc0/0xeb
Jan 28 21:31:33 SunSTG kernel: [<c04361c5>] kthread+0x0/0xeb
Jan 28 21:31:33 SunSTG kernel: [<c0405c3b>] kernel_thread_helper+0x7/0x10
Jan 28 21:31:33 SunSTG kernel: =======================
Jan 28 21:32:26 SunSTG kernel: BUG: soft lockup - CPU#2 stuck for 10s! [md3_raid5:5672]
Jan 28 21:32:26 SunSTG kernel:
Jan 28 21:32:26 SunSTG kernel: Pid: 5672, comm: md3_raid5
Jan 28 21:32:26 SunSTG kernel: EIP: 0060:[<f8d68170>] CPU: 2
Jan 28 21:32:26 SunSTG kernel: EIP is at raid6_sse22_gen_syndrome+0x118/0x1b6 [raid456]
Jan 28 21:32:26 SunSTG kernel: EFLAGS: 00000202 Not tainted (2.6.18-92.1.22.el5PAE #1)
Jan 28 21:32:26 SunSTG kernel: EAX: ea784040 EBX: 00000040 ECX: ead0ad30 EDX: ea784000
Jan 28 21:32:26 SunSTG kernel: ESI: ead0adf0 EDI: 00000008 EBP: ead0add0 DS: 007b ES: 007b
Jan 28 21:32:26 SunSTG kernel: CR0: 80050033 CR2: b7f6f000 CR3: 3714e920 CR4: 000006f0
Jan 28 21:32:26 SunSTG kernel: [<f8d63562>] compute_parity6+0x21c/0x28a [raid456]
Jan 28 21:32:26 SunSTG kernel: [<f8d6452e>] handle_stripe+0xc8b/0x215e [raid456]
Jan 28 21:32:26 SunSTG kernel: [<c041f34b>] find_busiest_group+0x177/0x462
Jan 28 21:32:26 SunSTG kernel: [<c041fc53>] task_rq_lock+0x31/0x58
Jan 28 21:32:26 SunSTG kernel: [<c0420629>] try_to_wake_up+0x371/0x37b
Jan 28 21:32:26 SunSTG kernel: [<f8d6171e>] __release_stripe+0xfc/0x101 [raid456]
Jan 28 21:32:26 SunSTG kernel: [<f8d61744>] release_stripe+0x21/0x2e [raid456]
Jan 28 21:32:26 SunSTG kernel: [<f8d65b0c>] raid5d+0x10b/0x130 [raid456]
Jan 28 21:32:26 SunSTG kernel: [<c059aca8>] md_thread+0xdf/0xf5
Jan 28 21:32:26 SunSTG kernel: [<c0436347>] autoremove_wake_function+0x0/0x2d
Jan 28 21:32:26 SunSTG kernel: [<c059abc9>] md_thread+0x0/0xf5
Jan 28 21:32:26 SunSTG kernel: [<c0436285>] kthread+0xc0/0xeb
Jan 28 21:32:26 SunSTG kernel: [<c04361c5>] kthread+0x0/0xeb
Jan 28 21:32:26 SunSTG kernel: [<c0405c3b>] kernel_thread_helper+0x7/0x10
Jan 28 21:32:26 SunSTG kernel: =======================
<somewhere here I issue commands to create md4>
Jan 28 21:32:43 SunSTG kernel: md: syncing RAID array md4
Jan 28 21:32:43 SunSTG kernel: md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
Jan 28 21:32:43 SunSTG kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reconstruction.
Jan 28 21:32:43 SunSTG kernel: md: using 128k window, over a total of 244195200 blocks.
Jan 28 21:33:20 SunSTG kernel: BUG: soft lockup - CPU#3 stuck for 10s! [md4_raid5:5694]
Jan 28 21:33:20 SunSTG kernel:
Jan 28 21:33:20 SunSTG kernel: Pid: 5694, comm: md4_raid5
Jan 28 21:33:20 SunSTG kernel: EIP: 0060:[<f8d63aff>] CPU: 3
Jan 28 21:33:20 SunSTG kernel: EIP is at handle_stripe+0x25c/0x215e [raid456]
Jan 28 21:33:20 SunSTG kernel: EFLAGS: 00000282 Not tainted (2.6.18-92.1.22.el5PAE #1)
Jan 28 21:33:20 SunSTG kernel: EAX: f6a2b404 EBX: 00000001 ECX: f53d17c0 EDX: e8c532c0
Jan 28 21:33:20 SunSTG kernel: ESI: e8c532c4 EDI: 00000016 EBP: e8c52b64 DS: 007b ES: 007b
Jan 28 21:33:20 SunSTG kernel: CR0: 8005003b CR2: b7cfc000 CR3: 3714ef00 CR4: 000006f0
Jan 28 21:33:20 SunSTG kernel: [<c041f34b>] find_busiest_group+0x177/0x462
Jan 28 21:33:20 SunSTG kernel: [<c041fc53>] task_rq_lock+0x31/0x58
Jan 28 21:33:20 SunSTG kernel: [<c041fdb3>] enqueue_task+0x29/0x39
Jan 28 21:33:20 SunSTG kernel: [<c0420629>] try_to_wake_up+0x371/0x37b
Jan 28 21:33:20 SunSTG kernel: [<c041edec>] __wake_up_common+0x2f/0x53
Jan 28 21:33:20 SunSTG kernel: [<c041fbe6>] __wake_up+0x2a/0x3d
Jan 28 21:33:20 SunSTG kernel: [<f8d61744>] release_stripe+0x21/0x2e [raid456]
Jan 28 21:33:20 SunSTG kernel: [<f8d65b0c>] raid5d+0x10b/0x130 [raid456]
Jan 28 21:33:20 SunSTG kernel: [<c059aca8>] md_thread+0xdf/0xf5
Jan 28 21:33:20 SunSTG kernel: [<c0436347>] autoremove_wake_function+0x0/0x2d
Jan 28 21:33:20 SunSTG kernel: [<c059abc9>] md_thread+0x0/0xf5
Jan 28 21:33:21 SunSTG kernel: [<c0436285>] kthread+0xc0/0xeb
Jan 28 21:33:21 SunSTG kernel: [<c04361c5>] kthread+0x0/0xeb
Jan 28 21:33:21 SunSTG kernel: [<c0405c3b>] kernel_thread_helper+0x7/0x10
Jan 28 21:33:21 SunSTG kernel: =======================
Jan 28 21:33:50 SunSTG kernel: BUG: soft lockup - CPU#3 stuck for 10s! [md4_raid5:5694]
Jan 28 21:33:50 SunSTG kernel:
Jan 28 21:33:50 SunSTG kernel: Pid: 5694, comm: md4_raid5
Jan 28 21:33:50 SunSTG kernel: EIP: 0060:[<f8bf9813>] CPU: 3
Jan 28 21:33:50 SunSTG kernel: EIP is at xor_sse_5+0xa0/0x3b5 [xor]
Jan 28 21:33:50 SunSTG kernel: EFLAGS: 00000202 Not tainted (2.6.18-92.1.22.el5PAE #1)
Jan 28 21:33:50 SunSTG kernel: EAX: 0000000b EBX: e8e66500 ECX: e8e69500 EDX: e8e6e500
Jan 28 21:33:50 SunSTG kernel: ESI: e8e67500 EDI: e8e68500 EBP: e96b5dd4 DS: 007b ES: 007b
Jan 28 21:33:50 SunSTG kernel: CR0: 80050033 CR2: b7cfc000 CR3: 3714ef00 CR4: 000006f0
Jan 28 21:33:50 SunSTG kernel: [<f8bfa200>] xor_block+0x74/0x7d [xor]
Jan 28 21:33:50 SunSTG kernel: [<f8d636b3>] compute_block_1+0xe3/0x13a [raid456]
Jan 28 21:33:50 SunSTG kernel: [<f8d644ba>] handle_stripe+0xc17/0x215e [raid456]
Jan 28 21:33:50 SunSTG kernel: [<c041f34b>] find_busiest_group+0x177/0x462
Jan 28 21:33:50 SunSTG kernel: [<c041fdb3>] enqueue_task+0x29/0x39
Jan 28 21:33:50 SunSTG kernel: [<c0420629>] try_to_wake_up+0x371/0x37b
Jan 28 21:33:50 SunSTG kernel: [<c041edec>] __wake_up_common+0x2f/0x53
Jan 28 21:33:50 SunSTG kernel: [<c041fbe6>] __wake_up+0x2a/0x3d
Jan 28 21:33:50 SunSTG kernel: [<f8d61744>] release_stripe+0x21/0x2e [raid456]
Jan 28 21:33:50 SunSTG kernel: [<f8d65b0c>] raid5d+0x10b/0x130 [raid456]
Jan 28 21:33:50 SunSTG kernel: [<c059aca8>] md_thread+0xdf/0xf5
Jan 28 21:33:50 SunSTG kernel: [<c0436347>] autoremove_wake_function+0x0/0x2d
Jan 28 21:33:50 SunSTG kernel: [<c059abc9>] md_thread+0x0/0xf5
Jan 28 21:33:51 SunSTG kernel: [<c0436285>] kthread+0xc0/0xeb
Jan 28 21:33:51 SunSTG kernel: [<c04361c5>] kthread+0x0/0xeb
Jan 28 21:33:51 SunSTG kernel: [<c0405c3b>] kernel_thread_helper+0x7/0x10
Jan 28 21:33:51 SunSTG kernel: =======================
... and it goes on complaining about md4_raid5:5694.
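[Editor's sketch, not part of the original post.] To tally these reports per md thread instead of reading them by eye, something like the following Python could be used. It assumes the message wording shown in this log ("BUG: soft lockup - CPU#N stuck for Ns!" followed by "[thread:pid]", possibly wrapped onto the next line). The name count_soft_lockups is made up for illustration, and the sample text is a shortened copy of the lines above.

```python
import re
from collections import Counter

# Matches the message format quoted in this thread. The "[thread:pid]" part
# may be wrapped onto the following line, so \s+ is allowed to span a newline.
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(\d+) stuck for (\d+)s!\s+\[([^:\]]+):(\d+)\]"
)

def count_soft_lockups(log_text):
    """Return a Counter mapping thread name -> number of soft lockup reports."""
    return Counter(m.group(3) for m in LOCKUP_RE.finditer(log_text))

# Shortened sample; on a real system the text would come from the syslog file.
sample = """\
Jan 28 21:31:32 SunSTG kernel: BUG: soft lockup - CPU#0 stuck for 10s!
[md3_raid5:5672]
Jan 28 21:33:20 SunSTG kernel: BUG: soft lockup - CPU#3 stuck for 10s! [md4_raid5:5694]
Jan 28 21:33:50 SunSTG kernel: BUG: soft lockup - CPU#3 stuck for 10s!
[md4_raid5:5694]
"""

counts = count_soft_lockups(sample)
for thread, n in sorted(counts.items()):
    print(thread, n)
```

Grouping by thread name rather than by CPU mirrors the observation above that the reports moved from md3 to md4 once the second array started syncing.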
[root@SunSTG ~]# mdadm --detail /dev/md3
/dev/md3:
Version : 00.90.03
Creation Time : Wed Jan 28 21:30:50 2009
Raid Level : raid6
Array Size : 5372294400 (5123.42 GiB 5501.23 GB)
Used Dev Size : 244195200 (232.88 GiB 250.06 GB)
Raid Devices : 24
Total Devices : 24
Preferred Minor : 3
Persistence : Superblock is persistent
Update Time : Wed Jan 28 21:30:50 2009
State : clean, resyncing
Active Devices : 24
Working Devices : 24
Failed Devices : 0
Spare Devices : 0
Chunk Size : 64K
Rebuild Status : 15% complete
UUID : d8c2b5ce:576a117b:f2494cd1:626a774c
Events : 0.1
Number Major Minor RaidDevice State
0 8 0 0 active sync /dev/sda
1 65 160 1 active sync /dev/sdaa
2 65 176 2 active sync /dev/sdab
3 65 208 3 active sync /dev/sdad
4 65 224 4 active sync /dev/sdae
5 65 240 5 active sync /dev/sdaf
6 66 0 6 active sync /dev/sdag
7 66 16 7 active sync /dev/sdah
8 66 32 8 active sync /dev/sdai
9 66 48 9 active sync /dev/sdaj
10 66 64 10 active sync /dev/sdak
11 66 80 11 active sync /dev/sdal
12 66 96 12 active sync /dev/sdam
13 66 112 13 active sync /dev/sdan
14 66 128 14 active sync /dev/sdao
15 66 144 15 active sync /dev/sdap
16 66 160 16 active sync /dev/sdaq
17 66 176 17 active sync /dev/sdar
18 66 192 18 active sync /dev/sdas
19 66 208 19 active sync /dev/sdat
20 66 224 20 active sync /dev/sdau
21 66 240 21 active sync /dev/sdav
22 8 16 22 active sync /dev/sdb
23 8 32 23 active sync /dev/sdc
[root@SunSTG ~]# mdadm --detail /dev/md4
/dev/md4:
Version : 00.90.03
Creation Time : Wed Jan 28 21:32:39 2009
Raid Level : raid6
Array Size : 4883904000 (4657.65 GiB 5001.12 GB)
Used Dev Size : 244195200 (232.88 GiB 250.06 GB)
Raid Devices : 22
Total Devices : 22
Preferred Minor : 4
Persistence : Superblock is persistent
Update Time : Wed Jan 28 21:32:39 2009
State : clean, resyncing
Active Devices : 22
Working Devices : 22
Failed Devices : 0
Spare Devices : 0
Chunk Size : 64K
Rebuild Status : 17% complete
UUID : 7e2c7f35:f51c9047:40130c15:63a7cfa6
Events : 0.1
Number Major Minor RaidDevice State
0 8 48 0 active sync /dev/sdd
1 8 64 1 active sync /dev/sde
2 8 80 2 active sync /dev/sdf
3 8 96 3 active sync /dev/sdg
4 8 112 4 active sync /dev/sdh
5 8 128 5 active sync /dev/sdi
6 8 144 6 active sync /dev/sdj
7 8 160 7 active sync /dev/sdk
8 8 176 8 active sync /dev/sdl
9 8 192 9 active sync /dev/sdm
10 8 208 10 active sync /dev/sdn
11 8 224 11 active sync /dev/sdo
12 8 240 12 active sync /dev/sdp
13 65 0 13 active sync /dev/sdq
14 65 16 14 active sync /dev/sdr
15 65 32 15 active sync /dev/sds
16 65 48 16 active sync /dev/sdt
17 65 64 17 active sync /dev/sdu
18 65 80 18 active sync /dev/sdv
19 65 96 19 active sync /dev/sdw
20 65 112 20 active sync /dev/sdx
21 65 144 21 active sync /dev/sdz
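[Editor's sketch, not part of the original post.] The two Array Size figures above are consistent with RAID6 keeping two devices' worth of parity, that is, usable space of (n - 2) times the per-device size. A minimal Python check, with the numbers copied from the mdadm --detail output and a helper name (raid6_array_size_kib) that is purely illustrative:

```python
# Per-device size copied from "Used Dev Size" above (mdadm reports 1 KiB blocks).
USED_DEV_SIZE_KIB = 244195200

def raid6_array_size_kib(raid_devices, used_dev_size_kib):
    """RAID6 spends two devices' worth of space on parity: usable = (n - 2) devices."""
    if raid_devices < 4:
        raise ValueError("RAID6 needs at least 4 devices")
    return (raid_devices - 2) * used_dev_size_kib

md3_size = raid6_array_size_kib(24, USED_DEV_SIZE_KIB)
md4_size = raid6_array_size_kib(22, USED_DEV_SIZE_KIB)
print(md3_size, md4_size)
```

Both results match the Array Size lines reported for md3 and md4, so the arrays were at least assembled with the expected geometry.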
--
Best Regards,
Vladimir Ivashchenko
Chief Technology Officer
PrimeTel PLC, Cyprus - www.prime-tel.com
Tel: +357 25 100100 Fax: +357 2210 2211
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: sun x4500 soft lockup during raid creation
2009-01-28 20:30 sun x4500 soft lockup during raid creation Vladimir Ivashchenko
@ 2009-01-28 21:33 ` Joe Landman
2009-01-28 21:37 ` Vladimir Ivashchenko
2009-01-28 22:17 ` Richard Scobie
2009-01-28 22:31 ` Bill Davidsen
` (3 subsequent siblings)
4 siblings, 2 replies; 13+ messages in thread
From: Joe Landman @ 2009-01-28 21:33 UTC (permalink / raw)
To: Vladimir Ivashchenko; +Cc: linux-raid
Vladimir Ivashchenko wrote:
> Any stability assurances or workarounds are highly appreciated. :)
>
> Jan 28 21:31:32 SunSTG kernel: BUG: soft lockup - CPU#0 stuck for 10s!
[...]
> Jan 28 21:31:32 SunSTG kernel: [<f8d63562>] compute_parity6+0x21c/0x28a
> [raid456]
> Jan 28 21:31:32 SunSTG kernel: [<f8d6452e>] handle_stripe+0xc8b/0x215e
> [raid456]
> Jan 28 21:31:32 SunSTG kernel: [<c041fdb3>] enqueue_task+0x29/0x39
> Jan 28 21:31:32 SunSTG kernel: [<c0420629>] try_to_wake_up+0x371/0x37b
> Jan 28 21:31:32 SunSTG kernel: [<c041edec>] __wake_up_common+0x2f/0x53
> Jan 28 21:31:32 SunSTG kernel: [<c041fbe6>] __wake_up+0x2a/0x3d
> Jan 28 21:31:32 SunSTG kernel: [<f8d61744>] release_stripe+0x21/0x2e
> [raid456]
> Jan 28 21:31:33 SunSTG kernel: [<f8d65b0c>] raid5d+0x10b/0x130
> [raid456]
> Jan 28 21:31:33 SunSTG kernel: [<c059aca8>] md_thread+0xdf/0xf5
> Jan 28 21:31:33 SunSTG kernel: [<c0436347>] autoremove_wake_function
> +0x0/0x2d
> Jan 28 21:31:33 SunSTG kernel: [<c059abc9>] md_thread+0x0/0xf5
> Jan 28 21:31:33 SunSTG kernel: [<c0436285>] kthread+0xc0/0xeb
> Jan 28 21:31:33 SunSTG kernel: [<c04361c5>] kthread+0x0/0xeb
> Jan 28 21:31:33 SunSTG kernel: [<c0405c3b>] kernel_thread_helper
> +0x7/0x10
Are you able to update the kernel to something more modern, or are you
required to keep the kernel at the 2.6.18 level? Out of curiosity,
could you post the output of
cat /proc/interrupts
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman@scalableinformatics.com
web : http://www.scalableinformatics.com
http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423 x121
fax : +1 866 888 3112
cell : +1 734 612 4615
* Re: sun x4500 soft lockup during raid creation
2009-01-28 21:33 ` Joe Landman
@ 2009-01-28 21:37 ` Vladimir Ivashchenko
2009-01-28 22:17 ` Richard Scobie
1 sibling, 0 replies; 13+ messages in thread
From: Vladimir Ivashchenko @ 2009-01-28 21:37 UTC (permalink / raw)
To: Joe Landman; +Cc: linux-raid
I don't have a problem upgrading to a more recent kernel. I'll do it and try again.
           CPU0       CPU1       CPU2       CPU3
  0:      35796     219508    1367509  714846322  IO-APIC-edge   timer
  1:          0          0          0          2  IO-APIC-edge   i8042
  8:          0          0          1          0  IO-APIC-edge   rtc
  9:          0          0          0          1  IO-APIC-level  acpi
 12:          0          0          0          4  IO-APIC-edge   i8042
 50:          0    6685568          2        400  IO-APIC-level  eth0
169:         43       4556         32        101  IO-APIC-level  ehci_hcd:usb1, ohci_hcd:usb2, ohci_hcd:usb3
177:   26565935      88022      65347      67440  IO-APIC-level  ohci_hcd:usb4
185:      26477   12697949      31539      31684  IO-APIC-level  ohci_hcd:usb5
193:   18125889   55786346   52018220   19481451  IO-APIC-level  sata_mv
201:   41459172   34509549   20492998   45295899  IO-APIC-level  sata_mv
209:   33309776   34688917   26929934   44091250  IO-APIC-level  sata_mv
217:   30980862   21002267   29723861   21791804  IO-APIC-level  sata_mv
225:   40838685   22136210   38218071   21081955  IO-APIC-level  sata_mv
233:   30862641   22239226   33000041   31993705  IO-APIC-level  sata_mv
NMI:          0          0          0          0
LOC:  716498535  716498534  716498533  716498532
ERR:          1
MIS:          0
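[Editor's sketch, not part of the original message.] One thing output like this is usually checked for is how evenly a driver's interrupts are spread over the CPUs. A minimal Python helper for that, assuming the classic /proc/interrupts layout shown above (a header row of CPU names, then one row per IRQ); per_cpu_totals is a hypothetical name, and the sample contains only the first two sata_mv rows:

```python
def per_cpu_totals(interrupts_text, device):
    """Sum interrupt counts per CPU for rows whose description names `device`."""
    lines = interrupts_text.strip().splitlines()
    cpus = lines[0].split()                      # header row: CPU0 CPU1 ...
    totals = [0] * len(cpus)
    for line in lines[1:]:
        fields = line.split()
        counts = fields[1:1 + len(cpus)]
        description = fields[1 + len(cpus):]
        # Skip short rows such as "ERR: 1" and rows that belong to other devices.
        if len(counts) < len(cpus) or device not in description:
            continue
        for i, value in enumerate(counts):
            totals[i] += int(value)
    return dict(zip(cpus, totals))

# Two of the sata_mv rows from the output above, plus rows that must be ignored.
sample = """\
CPU0 CPU1 CPU2 CPU3
50: 0 6685568 2 400 IO-APIC-level eth0
193: 18125889 55786346 52018220 19481451 IO-APIC-level sata_mv
201: 41459172 34509549 20492998 45295899 IO-APIC-level sata_mv
NMI: 0 0 0 0
ERR: 1
"""

totals = per_cpu_totals(sample, "sata_mv")
print(totals)
```

On a live system the text would be read from /proc/interrupts instead of the inline sample.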
On Wed, Jan 28, 2009 at 04:33:42PM -0500, Joe Landman wrote:
>> Any stability assurances or workarounds are highly appreciated. :)
>> Jan 28 21:31:32 SunSTG kernel: BUG: soft lockup - CPU#0 stuck for 10s!
>
> [...]
>
>> Jan 28 21:31:32 SunSTG kernel: [<f8d63562>] compute_parity6+0x21c/0x28a
>> [raid456]
>> Jan 28 21:31:32 SunSTG kernel: [<f8d6452e>] handle_stripe+0xc8b/0x215e
>> [raid456]
>> Jan 28 21:31:32 SunSTG kernel: [<c041fdb3>] enqueue_task+0x29/0x39
>> Jan 28 21:31:32 SunSTG kernel: [<c0420629>] try_to_wake_up+0x371/0x37b
>> Jan 28 21:31:32 SunSTG kernel: [<c041edec>] __wake_up_common+0x2f/0x53
>> Jan 28 21:31:32 SunSTG kernel: [<c041fbe6>] __wake_up+0x2a/0x3d
>> Jan 28 21:31:32 SunSTG kernel: [<f8d61744>] release_stripe+0x21/0x2e
>> [raid456]
>> Jan 28 21:31:33 SunSTG kernel: [<f8d65b0c>] raid5d+0x10b/0x130
>> [raid456]
>> Jan 28 21:31:33 SunSTG kernel: [<c059aca8>] md_thread+0xdf/0xf5
>> Jan 28 21:31:33 SunSTG kernel: [<c0436347>] autoremove_wake_function
>> +0x0/0x2d
>> Jan 28 21:31:33 SunSTG kernel: [<c059abc9>] md_thread+0x0/0xf5
>> Jan 28 21:31:33 SunSTG kernel: [<c0436285>] kthread+0xc0/0xeb
>> Jan 28 21:31:33 SunSTG kernel: [<c04361c5>] kthread+0x0/0xeb
>> Jan 28 21:31:33 SunSTG kernel: [<c0405c3b>] kernel_thread_helper
>> +0x7/0x10
>
> Are you able to update the kernel to something more modern, or are you
> required to keep the kernel at the 2.6.18 level? Out of curiosity, could
> you post the output of
>
> cat /proc/interrupts
>
>
> --
> Joseph Landman, Ph.D
> Founder and CEO
> Scalable Informatics LLC,
> email: landman@scalableinformatics.com
> web : http://www.scalableinformatics.com
> http://jackrabbit.scalableinformatics.com
> phone: +1 734 786 8423 x121
> fax : +1 866 888 3112
> cell : +1 734 612 4615
--
Best Regards
Vladimir Ivashchenko
Chief Technology Officer
PrimeTel, Cyprus - www.prime-tel.com
* Re: sun x4500 soft lockup during raid creation
2009-01-28 21:33 ` Joe Landman
2009-01-28 21:37 ` Vladimir Ivashchenko
@ 2009-01-28 22:17 ` Richard Scobie
1 sibling, 0 replies; 13+ messages in thread
From: Richard Scobie @ 2009-01-28 22:17 UTC (permalink / raw)
To: landman; +Cc: Vladimir Ivashchenko, linux-raid
Joe Landman wrote:
> Vladimir Ivashchenko wrote:
>
>> Any stability assurances or workarounds are highly appreciated. :)
>>
>> Jan 28 21:31:32 SunSTG kernel: BUG: soft lockup - CPU#0 stuck for 10s!
I think it is possibly the same error as was discussed here:
http://marc.info/?l=linux-raid&m=123264525708803&w=2
Regards,
Richard
* Re: sun x4500 soft lockup during raid creation
2009-01-28 20:30 sun x4500 soft lockup during raid creation Vladimir Ivashchenko
2009-01-28 21:33 ` Joe Landman
@ 2009-01-28 22:31 ` Bill Davidsen
2009-01-28 22:33 ` Tru Huynh
` (2 subsequent siblings)
4 siblings, 0 replies; 13+ messages in thread
From: Bill Davidsen @ 2009-01-28 22:31 UTC (permalink / raw)
To: Vladimir Ivashchenko; +Cc: linux-raid
Vladimir Ivashchenko wrote:
> Hi,
>
> We've got these new Sun X4500 servers. The system I'm playing with now
> has 48 x 250 GB SATA HDDs.
>
> Right now I'm creating two RAID6 arrays, 24 and 22 drives each:
>
> mdadm --verbose --create /dev/md3 --level=6
> --raid-devices=24 /dev/sda /dev/sdaa /dev/sdab /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai /dev/sdaj /dev/sdak /dev/sdal /dev/sdam /dev/sdan /dev/sdao /dev/sdap /dev/sdaq /dev/sdar /dev/sdas /dev/sdat /dev/sdau /dev/sdav /dev/sdb /dev/sdc
>
> mdadm --verbose --create /dev/md4 --level=6
> --raid-devices=22 /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx /dev/sdz
>
> mdadm --detail is reporting that everything is going smoothly, however
> my /var/log/messages is full of "BUG: soft lockup - CPU#X stuck for
> 10s!" errors appearing every 1-3 minutes.
>
> CentOS 5.2, 2.6.18-92.1.22.el5PAE, sata_mv. Two dual-core Opterons @ 2.8
> Ghz, 16 GB RAM.
>
> The system does not crash and otherwise seems to be healthy. Arrays are
> still under construction and I don't know if they will actually work
> yet.
>
> What I noticed is that at first it was complaining about lockups on md3
> process, but once I started creating md4, complaints were exclusively
> for md4 process only.
>
> Any stability assurances or workarounds are highly appreciated. :)
>
Recently comments about soft lockups in md init have popped up on
several lists, and the consensus seems to be that some of the internal
operations are keeping one or more CPUs waiting, but that's not a
failure. I'm guessing that a more recent kernel might not do this, but
it probably doesn't indicate a functional problem.
My read on a newer kernel is this:
- you went with CentOS instead of Fedora, you got stable instead of
cutting edge
- CentOS 5.3 is coming out soon, RHEL 5.3 just came out
- it's not a functional problem
I'm planning to go to CentOS 5.3 on some machines, and I run Fedora on
the rest. I don't see any joy between "most recent" and "most stable" on
my systems. I would ignore the warning unless it happens during normal
operation.
--
Bill Davidsen <davidsen@tmr.com>
"Woe unto the statesman who makes war without a reason that will still
be valid when the war is over..." Otto von Bismarck
* Re: sun x4500 soft lockup during raid creation
2009-01-28 20:30 sun x4500 soft lockup during raid creation Vladimir Ivashchenko
2009-01-28 21:33 ` Joe Landman
2009-01-28 22:31 ` Bill Davidsen
@ 2009-01-28 22:33 ` Tru Huynh
2009-01-28 23:08 ` Vladimir Ivashchenko
2009-01-29 22:54 ` Jody McIntyre
2009-02-05 16:10 ` Vladimir Ivashchenko
4 siblings, 1 reply; 13+ messages in thread
From: Tru Huynh @ 2009-01-28 22:33 UTC (permalink / raw)
To: Vladimir Ivashchenko; +Cc: linux-raid
On Wed, Jan 28, 2009 at 10:30:33PM +0200, Vladimir Ivashchenko wrote:
> Hi,
>
> We've got these new Sun X4500 servers. The system I'm playing with now
> has 48 x 250 GB SATA HDDs.
>
> Right now I'm creating two RAID6 arrays, 24 and 22 drives each:
> ...
> CentOS 5.2, 2.6.18-92.1.22.el5PAE, sata_mv. Two dual-core Opterons @ 2.8
> Ghz, 16 GB RAM.
Any reason for using the 32-bit version instead of the 64-bit one?
You should also be aware of http://kbase.redhat.com/faq/docs/DOC-15593
Just my 2 cents.
Tru
* Re: sun x4500 soft lockup during raid creation
2009-01-28 22:33 ` Tru Huynh
@ 2009-01-28 23:08 ` Vladimir Ivashchenko
2009-01-30 15:28 ` Bill Davidsen
0 siblings, 1 reply; 13+ messages in thread
From: Vladimir Ivashchenko @ 2009-01-28 23:08 UTC (permalink / raw)
To: Tru Huynh; +Cc: linux-raid
On Wed, Jan 28, 2009 at 11:33:30PM +0100, Tru Huynh wrote:
> > CentOS 5.2, 2.6.18-92.1.22.el5PAE, sata_mv. Two dual-core Opterons @ 2.8
> > Ghz, 16 GB RAM.
> any reason for using the 32 bits version instead of the 64 bits?
>
> you must also be aware of http://kbase.redhat.com/faq/docs/DOC-15593
>
> just my .2 cents
Always welcome :)
According to http://epubs.cclrc.ac.uk/bitstream/2943/ThumperReport.pdf, the x4500 was shown to be unstable under CentOS/RHEL 4.x (the author didn't
use mv_sata, though). In any case, CentOS 4.x is way too old.
I changed the kernel to 2.6.27.12-78.2.8.fc9.i686 and so far it is stable.
x64 will be the next step. i686 is what our guys install by default, and I didn't bother to reinstall it.
--
Best Regards
Vladimir Ivashchenko
Chief Technology Officer
PrimeTel, Cyprus - www.prime-tel.com
* Re: sun x4500 soft lockup during raid creation
2009-01-28 20:30 sun x4500 soft lockup during raid creation Vladimir Ivashchenko
` (2 preceding siblings ...)
2009-01-28 22:33 ` Tru Huynh
@ 2009-01-29 22:54 ` Jody McIntyre
2009-02-05 16:10 ` Vladimir Ivashchenko
4 siblings, 0 replies; 13+ messages in thread
From: Jody McIntyre @ 2009-01-29 22:54 UTC (permalink / raw)
To: Vladimir Ivashchenko; +Cc: linux-raid
On Wed, Jan 28, 2009 at 10:30:33PM +0200, Vladimir Ivashchenko wrote:
> CentOS 5.2, 2.6.18-92.1.22.el5PAE, sata_mv. Two dual-core Opterons @ 2.8
> Ghz, 16 GB RAM.
You should really be running the EL 5.3 kernel - sata_mv in EL 5.2 has
known issues according to the x4500 team but they are happy with the
version in EL 5.3.
> Any stability assurances or workarounds are highly appreciated. :)
It's just a lockup, not a crash. The system will be fine. We've seen a
lot of these, and there's a workaround patch attached to this bug:
https://bugzilla.lustre.org/show_bug.cgi?id=17084
It's probably the same bug seen here, as pointed out by Richard Scobie:
http://marc.info/?l=linux-raid&m=123264525708803&w=2
The problem is not specific to the x4500 - I've seen it with many
configurations, including on non-Sun hardware, generally when lots of
disks are involved in a rebuild. I have not seen it with any mainline
kernel in the past 6 months (they are much more recent than EL 5) but it
may still exist.
As a complete side note, you'll likely see better performance if you
stagger disks across controllers (the x4500 has 6) rather than creating
arrays with most disks from 3 controllers.
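[Editor's sketch, not part of the original message.] The staggering idea can be illustrated with a short round-robin interleave in Python. The controller-to-disk mapping below is entirely hypothetical (placeholder names such as c0d0, 6 controllers with 8 disks each); the real /dev/sdX-to-controller mapping on an x4500 has to be looked up on the machine, and the function names are made up for illustration.

```python
from collections import Counter
from itertools import chain, zip_longest

def stagger(disks_by_controller):
    """Interleave disks round-robin: take one disk from each controller in turn."""
    rounds = zip_longest(*disks_by_controller.values())
    return [d for d in chain.from_iterable(rounds) if d is not None]

def split_arrays(disks, sizes):
    """Cut the staggered list into consecutive arrays of the given sizes."""
    arrays, start = [], 0
    for size in sizes:
        arrays.append(disks[start:start + size])
        start += size
    return arrays

# Hypothetical layout: 6 controllers with 8 disks each, placeholder names only.
layout = {f"c{c}": [f"c{c}d{d}" for d in range(8)] for c in range(6)}

md3_disks, md4_disks = split_arrays(stagger(layout), [24, 22])

# Count how many members of each array sit on each controller.
md3_spread = Counter(name[:2] for name in md3_disks)
md4_spread = Counter(name[:2] for name in md4_disks)
print(md3_spread)
print(md4_spread)
```

With this placeholder layout every controller serves about the same number of members of each array, so resync I/O is spread over all six controllers instead of concentrating on three.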
Note: I don't work for Sun support or the x4500 product team and nothing
in this message is necessarily an official Sun position.
Cheers,
Jody
> Jan 28 21:31:32 SunSTG kernel: BUG: soft lockup - CPU#0 stuck for 10s!
> [md3_raid5:5672]
> Jan 28 21:31:32 SunSTG kernel:
> Jan 28 21:31:32 SunSTG kernel: Pid: 5672, comm: md3_raid5
> Jan 28 21:31:32 SunSTG kernel: EIP: 0060:[<f8d68162>] CPU: 0
> Jan 28 21:31:32 SunSTG kernel: EIP is at raid6_sse22_gen_syndrome
> +0x10a/0x1b6 [raid456]
> Jan 28 21:31:32 SunSTG kernel: EFLAGS: 00000202 Not tainted
> (2.6.18-92.1.22.el5PAE #1)
> Jan 28 21:31:32 SunSTG kernel: EAX: ea0774e0 EBX: 000004e0 ECX: ead0ad30
> EDX: ea077000
> Jan 28 21:31:32 SunSTG kernel: ESI: ead0ade0 EDI: 00000004 EBP: ead0add0
> DS: 007b ES: 007b
> Jan 28 21:31:32 SunSTG kernel: CR0: 80050033 CR2: 0806e000 CR3: 373239e0
> CR4: 000006f0
> Jan 28 21:31:32 SunSTG kernel: [<f8d63562>] compute_parity6+0x21c/0x28a
> [raid456]
> Jan 28 21:31:32 SunSTG kernel: [<f8d6452e>] handle_stripe+0xc8b/0x215e
> [raid456]
> Jan 28 21:31:32 SunSTG kernel: [<c041fdb3>] enqueue_task+0x29/0x39
> Jan 28 21:31:32 SunSTG kernel: [<c0420629>] try_to_wake_up+0x371/0x37b
> Jan 28 21:31:32 SunSTG kernel: [<c041edec>] __wake_up_common+0x2f/0x53
> Jan 28 21:31:32 SunSTG kernel: [<c041fbe6>] __wake_up+0x2a/0x3d
> Jan 28 21:31:32 SunSTG kernel: [<f8d61744>] release_stripe+0x21/0x2e
> [raid456]
> Jan 28 21:31:33 SunSTG kernel: [<f8d65b0c>] raid5d+0x10b/0x130
> [raid456]
> Jan 28 21:31:33 SunSTG kernel: [<c059aca8>] md_thread+0xdf/0xf5
> Jan 28 21:31:33 SunSTG kernel: [<c0436347>] autoremove_wake_function
> +0x0/0x2d
> Jan 28 21:31:33 SunSTG kernel: [<c059abc9>] md_thread+0x0/0xf5
> Jan 28 21:31:33 SunSTG kernel: [<c0436285>] kthread+0xc0/0xeb
> Jan 28 21:31:33 SunSTG kernel: [<c04361c5>] kthread+0x0/0xeb
> Jan 28 21:31:33 SunSTG kernel: [<c0405c3b>] kernel_thread_helper
> +0x7/0x10
>
> Jan 28 21:31:33 SunSTG kernel: =======================
> Jan 28 21:32:26 SunSTG kernel: BUG: soft lockup - CPU#2 stuck for 10s!
> [md3_raid5:5672]
> Jan 28 21:32:26 SunSTG kernel:
> Jan 28 21:32:26 SunSTG kernel: Pid: 5672, comm: md3_raid5
> Jan 28 21:32:26 SunSTG kernel: EIP: 0060:[<f8d68170>] CPU: 2
> Jan 28 21:32:26 SunSTG kernel: EIP is at raid6_sse22_gen_syndrome
> +0x118/0x1b6 [raid456]
> Jan 28 21:32:26 SunSTG kernel: EFLAGS: 00000202 Not tainted
> (2.6.18-92.1.22.el5PAE #1)
> Jan 28 21:32:26 SunSTG kernel: EAX: ea784040 EBX: 00000040 ECX: ead0ad30
> EDX: ea784000
> Jan 28 21:32:26 SunSTG kernel: ESI: ead0adf0 EDI: 00000008 EBP: ead0add0
> DS: 007b ES: 007b
> Jan 28 21:32:26 SunSTG kernel: CR0: 80050033 CR2: b7f6f000 CR3: 3714e920
> CR4: 000006f0
> Jan 28 21:32:26 SunSTG kernel: [<f8d63562>] compute_parity6+0x21c/0x28a
> [raid456]
> Jan 28 21:32:26 SunSTG kernel: [<f8d6452e>] handle_stripe+0xc8b/0x215e
> [raid456]
> Jan 28 21:32:26 SunSTG kernel: [<c041f34b>] find_busiest_group
> +0x177/0x462
> Jan 28 21:32:26 SunSTG kernel: [<c041fc53>] task_rq_lock+0x31/0x58
> Jan 28 21:32:26 SunSTG kernel: [<c0420629>] try_to_wake_up+0x371/0x37b
> Jan 28 21:32:26 SunSTG kernel: [<f8d6171e>] __release_stripe+0xfc/0x101
> [raid456]
> Jan 28 21:32:26 SunSTG kernel: [<f8d61744>] release_stripe+0x21/0x2e
> [raid456]
> Jan 28 21:32:26 SunSTG kernel: [<f8d65b0c>] raid5d+0x10b/0x130
> [raid456]
> Jan 28 21:32:26 SunSTG kernel: [<c059aca8>] md_thread+0xdf/0xf5
> Jan 28 21:32:26 SunSTG kernel: [<c0436347>] autoremove_wake_function
> +0x0/0x2d
> Jan 28 21:32:26 SunSTG kernel: [<c059abc9>] md_thread+0x0/0xf5
> Jan 28 21:32:26 SunSTG kernel: [<c0436285>] kthread+0xc0/0xeb
> Jan 28 21:32:26 SunSTG kernel: [<c04361c5>] kthread+0x0/0xeb
> Jan 28 21:32:26 SunSTG kernel: [<c0405c3b>] kernel_thread_helper
> +0x7/0x10
> Jan 28 21:32:26 SunSTG kernel: =======================
>
> <somewhere here I issue commands to create md4>
>
> Jan 28 21:32:43 SunSTG kernel: md: syncing RAID array md4
> Jan 28 21:32:43 SunSTG kernel: md: minimum _guaranteed_ reconstruction
> speed: 1000 KB/sec/disc.
> Jan 28 21:32:43 SunSTG kernel: md: using maximum available idle IO
> bandwidth (but not more than 200000 KB/sec) for reconstruction.
> Jan 28 21:32:43 SunSTG kernel: md: using 128k window, over a total of
> 244195200 blocks.
> Jan 28 21:33:20 SunSTG kernel: BUG: soft lockup - CPU#3 stuck for 10s!
> [md4_raid5:5694]
> Jan 28 21:33:20 SunSTG kernel:
> Jan 28 21:33:20 SunSTG kernel: Pid: 5694, comm: md4_raid5
> Jan 28 21:33:20 SunSTG kernel: EIP: 0060:[<f8d63aff>] CPU: 3
> Jan 28 21:33:20 SunSTG kernel: EIP is at handle_stripe+0x25c/0x215e
> [raid456]
> Jan 28 21:33:20 SunSTG kernel: EFLAGS: 00000282 Not tainted
> (2.6.18-92.1.22.el5PAE #1)
> Jan 28 21:33:20 SunSTG kernel: EAX: f6a2b404 EBX: 00000001 ECX: f53d17c0
> EDX: e8c532c0
> Jan 28 21:33:20 SunSTG kernel: ESI: e8c532c4 EDI: 00000016 EBP: e8c52b64
> DS: 007b ES: 007b
> Jan 28 21:33:20 SunSTG kernel: CR0: 8005003b CR2: b7cfc000 CR3: 3714ef00
> CR4: 000006f0
> Jan 28 21:33:20 SunSTG kernel: [<c041f34b>] find_busiest_group
> +0x177/0x462
> Jan 28 21:33:20 SunSTG kernel: [<c041fc53>] task_rq_lock+0x31/0x58
> Jan 28 21:33:20 SunSTG kernel: [<c041fdb3>] enqueue_task+0x29/0x39
> Jan 28 21:33:20 SunSTG kernel: [<c0420629>] try_to_wake_up+0x371/0x37b
> Jan 28 21:33:20 SunSTG kernel: [<c041edec>] __wake_up_common+0x2f/0x53
> Jan 28 21:33:20 SunSTG kernel: [<c041fbe6>] __wake_up+0x2a/0x3d
> Jan 28 21:33:20 SunSTG kernel: [<f8d61744>] release_stripe+0x21/0x2e
> [raid456]
> Jan 28 21:33:20 SunSTG kernel: [<f8d65b0c>] raid5d+0x10b/0x130
> [raid456]
> Jan 28 21:33:20 SunSTG kernel: [<c059aca8>] md_thread+0xdf/0xf5
> Jan 28 21:33:20 SunSTG kernel: [<c0436347>] autoremove_wake_function
> +0x0/0x2d
> Jan 28 21:33:20 SunSTG kernel: [<c059abc9>] md_thread+0x0/0xf5
> Jan 28 21:33:21 SunSTG kernel: [<c0436285>] kthread+0xc0/0xeb
> Jan 28 21:33:21 SunSTG kernel: [<c04361c5>] kthread+0x0/0xeb
> Jan 28 21:33:21 SunSTG kernel: [<c0405c3b>] kernel_thread_helper
> +0x7/0x10
> Jan 28 21:33:21 SunSTG kernel: =======================
> Jan 28 21:33:50 SunSTG kernel: BUG: soft lockup - CPU#3 stuck for 10s!
> [md4_raid5:5694]
> Jan 28 21:33:50 SunSTG kernel:
> Jan 28 21:33:50 SunSTG kernel: Pid: 5694, comm: md4_raid5
> Jan 28 21:33:50 SunSTG kernel: EIP: 0060:[<f8bf9813>] CPU: 3
> Jan 28 21:33:50 SunSTG kernel: EIP is at xor_sse_5+0xa0/0x3b5 [xor]
> Jan 28 21:33:50 SunSTG kernel: EFLAGS: 00000202 Not tainted
> (2.6.18-92.1.22.el5PAE #1)
> Jan 28 21:33:50 SunSTG kernel: EAX: 0000000b EBX: e8e66500 ECX: e8e69500
> EDX: e8e6e500
> Jan 28 21:33:50 SunSTG kernel: ESI: e8e67500 EDI: e8e68500 EBP: e96b5dd4
> DS: 007b ES: 007b
> Jan 28 21:33:50 SunSTG kernel: CR0: 80050033 CR2: b7cfc000 CR3: 3714ef00
> CR4: 000006f0
> Jan 28 21:33:50 SunSTG kernel: [<f8bfa200>] xor_block+0x74/0x7d [xor]
> Jan 28 21:33:50 SunSTG kernel: [<f8d636b3>] compute_block_1+0xe3/0x13a
> [raid456]
> Jan 28 21:33:50 SunSTG kernel: [<f8d644ba>] handle_stripe+0xc17/0x215e
> [raid456]
> Jan 28 21:33:50 SunSTG kernel: [<c041f34b>] find_busiest_group
> +0x177/0x462
> Jan 28 21:33:50 SunSTG kernel: [<c041fdb3>] enqueue_task+0x29/0x39
> Jan 28 21:33:50 SunSTG kernel: [<c0420629>] try_to_wake_up+0x371/0x37b
> Jan 28 21:33:50 SunSTG kernel: [<c041edec>] __wake_up_common+0x2f/0x53
> Jan 28 21:33:50 SunSTG kernel: [<c041fbe6>] __wake_up+0x2a/0x3d
> Jan 28 21:33:50 SunSTG kernel: [<f8d61744>] release_stripe+0x21/0x2e
> [raid456]
> Jan 28 21:33:50 SunSTG kernel: [<f8d65b0c>] raid5d+0x10b/0x130
> [raid456]
> Jan 28 21:33:50 SunSTG kernel: [<c059aca8>] md_thread+0xdf/0xf5
> Jan 28 21:33:50 SunSTG kernel: [<c0436347>] autoremove_wake_function
> +0x0/0x2d
> Jan 28 21:33:50 SunSTG kernel: [<c059abc9>] md_thread+0x0/0xf5
> Jan 28 21:33:51 SunSTG kernel: [<c0436285>] kthread+0xc0/0xeb
> Jan 28 21:33:51 SunSTG kernel: [<c04361c5>] kthread+0x0/0xeb
> Jan 28 21:33:51 SunSTG kernel: [<c0405c3b>] kernel_thread_helper
> +0x7/0x10
> Jan 28 21:33:51 SunSTG kernel: =======================
> ... and it goes on complaining about md4_raid5:5694.
>
> [root@SunSTG ~]# mdadm --detail /dev/md3
> /dev/md3:
> Version : 00.90.03
> Creation Time : Wed Jan 28 21:30:50 2009
> Raid Level : raid6
> Array Size : 5372294400 (5123.42 GiB 5501.23 GB)
> Used Dev Size : 244195200 (232.88 GiB 250.06 GB)
> Raid Devices : 24
> Total Devices : 24
> Preferred Minor : 3
> Persistence : Superblock is persistent
>
> Update Time : Wed Jan 28 21:30:50 2009
> State : clean, resyncing
> Active Devices : 24
> Working Devices : 24
> Failed Devices : 0
> Spare Devices : 0
>
> Chunk Size : 64K
>
> Rebuild Status : 15% complete
>
> UUID : d8c2b5ce:576a117b:f2494cd1:626a774c
> Events : 0.1
>
> Number Major Minor RaidDevice State
> 0 8 0 0 active sync /dev/sda
> 1 65 160 1 active sync /dev/sdaa
> 2 65 176 2 active sync /dev/sdab
> 3 65 208 3 active sync /dev/sdad
> 4 65 224 4 active sync /dev/sdae
> 5 65 240 5 active sync /dev/sdaf
> 6 66 0 6 active sync /dev/sdag
> 7 66 16 7 active sync /dev/sdah
> 8 66 32 8 active sync /dev/sdai
> 9 66 48 9 active sync /dev/sdaj
> 10 66 64 10 active sync /dev/sdak
> 11 66 80 11 active sync /dev/sdal
> 12 66 96 12 active sync /dev/sdam
> 13 66 112 13 active sync /dev/sdan
> 14 66 128 14 active sync /dev/sdao
> 15 66 144 15 active sync /dev/sdap
> 16 66 160 16 active sync /dev/sdaq
> 17 66 176 17 active sync /dev/sdar
> 18 66 192 18 active sync /dev/sdas
> 19 66 208 19 active sync /dev/sdat
> 20 66 224 20 active sync /dev/sdau
> 21 66 240 21 active sync /dev/sdav
> 22 8 16 22 active sync /dev/sdb
> 23 8 32 23 active sync /dev/sdc
> [root@SunSTG ~]# mdadm --detail /dev/md4
> /dev/md4:
> Version : 00.90.03
> Creation Time : Wed Jan 28 21:32:39 2009
> Raid Level : raid6
> Array Size : 4883904000 (4657.65 GiB 5001.12 GB)
> Used Dev Size : 244195200 (232.88 GiB 250.06 GB)
> Raid Devices : 22
> Total Devices : 22
> Preferred Minor : 4
> Persistence : Superblock is persistent
>
> Update Time : Wed Jan 28 21:32:39 2009
> State : clean, resyncing
> Active Devices : 22
> Working Devices : 22
> Failed Devices : 0
> Spare Devices : 0
>
> Chunk Size : 64K
>
> Rebuild Status : 17% complete
>
> UUID : 7e2c7f35:f51c9047:40130c15:63a7cfa6
> Events : 0.1
>
> Number Major Minor RaidDevice State
> 0 8 48 0 active sync /dev/sdd
> 1 8 64 1 active sync /dev/sde
> 2 8 80 2 active sync /dev/sdf
> 3 8 96 3 active sync /dev/sdg
> 4 8 112 4 active sync /dev/sdh
> 5 8 128 5 active sync /dev/sdi
> 6 8 144 6 active sync /dev/sdj
> 7 8 160 7 active sync /dev/sdk
> 8 8 176 8 active sync /dev/sdl
> 9 8 192 9 active sync /dev/sdm
> 10 8 208 10 active sync /dev/sdn
> 11 8 224 11 active sync /dev/sdo
> 12 8 240 12 active sync /dev/sdp
> 13 65 0 13 active sync /dev/sdq
> 14 65 16 14 active sync /dev/sdr
> 15 65 32 15 active sync /dev/sds
> 16 65 48 16 active sync /dev/sdt
> 17 65 64 17 active sync /dev/sdu
> 18 65 80 18 active sync /dev/sdv
> 19 65 96 19 active sync /dev/sdw
> 20 65 112 20 active sync /dev/sdx
> 21 65 144 21 active sync /dev/sdz
>
>
> --
> Best Regards,
> Vladimir Ivashchenko
> Chief Technology Officer
> PrimeTel PLC, Cyprus - www.prime-tel.com
> Tel: +357 25 100100 Fax: +357 2210 2211
>
>
* Re: sun x4500 soft lockup during raid creation
2009-01-28 23:08 ` Vladimir Ivashchenko
@ 2009-01-30 15:28 ` Bill Davidsen
2009-01-30 19:38 ` Vladimir Ivashchenko
0 siblings, 1 reply; 13+ messages in thread
From: Bill Davidsen @ 2009-01-30 15:28 UTC (permalink / raw)
To: Vladimir Ivashchenko; +Cc: Tru Huynh, linux-raid
Vladimir Ivashchenko wrote:
> On Wed, Jan 28, 2009 at 11:33:30PM +0100, Tru Huynh wrote:
>
>
>>> CentOS 5.2, 2.6.18-92.1.22.el5PAE, sata_mv. Two dual-core Opterons @ 2.8
>>> Ghz, 16 GB RAM.
>>>
>> any reason for using the 32 bits version instead of the 64 bits?
>>
>> you must also be aware of http://kbase.redhat.com/faq/docs/DOC-15593
>>
>> just my .2 cents
>>
>
> Always welcome :)
>
> According to http://epubs.cclrc.ac.uk/bitstream/2943/ThumperReport.pdf, x4500 was shown to be unstable under centos/rhel 4.x (he didn't
> use mv_sata though). In any case, centos 4.x is way too old.
>
> I changed the kernel to 2.6.27.12-78.2.8.fc9.i686 and so far it is stable.
>
> x64 will be the next step. i686 is what our guys install by default, I didn't bother to reinstall it.
>
>
In spite of the theoretical benefits of 64 bit, I find that the
advantages are "measurable but not noticeable" for most things. The lack
of 64 bit versions of some applications was a problem for me, but may
not be for you. I did find that even building from source not all
applications worked right, or worked at all, or in some cases compiled. :-(
--
Bill Davidsen <davidsen@tmr.com>
"Woe unto the statesman who makes war without a reason that will still
be valid when the war is over..." Otto von Bismarck
* Re: sun x4500 soft lockup during raid creation
2009-01-30 15:28 ` Bill Davidsen
@ 2009-01-30 19:38 ` Vladimir Ivashchenko
2009-01-30 22:28 ` Keld Jørn Simonsen
0 siblings, 1 reply; 13+ messages in thread
From: Vladimir Ivashchenko @ 2009-01-30 19:38 UTC (permalink / raw)
To: Bill Davidsen; +Cc: Tru Huynh, linux-raid
On Fri, 2009-01-30 at 10:28 -0500, Bill Davidsen wrote:
> > I changed the kernel to 2.6.27.12-78.2.8.fc9.i686 and so far it is stable.
> >
> > x64 will be the next step. i686 is what our guys install by default, I didn't bother to reinstall it.
> >
>
> In spite of the theoretical benefits of 64 bit, I find that the
> advantages are "measurable but not noticeable" for most things. The lack
> of 64 bit versions of some applications was a problem for me, but may
> not be for you. I did find that even building from source not all
> applications worked right, or worked at all, or in some cases compiled. :-(
More or less this is our experience also, but this box will only be used
as a file-server.
Does anybody know if software RAID benefits when being run in 64-bit ?
--
Best Regards,
Vladimir Ivashchenko
Chief Technology Officer
PrimeTel PLC, Cyprus - www.prime-tel.com
Tel: +357 25 100100 Fax: +357 2210 2211
* Re: sun x4500 soft lockup during raid creation
2009-01-30 19:38 ` Vladimir Ivashchenko
@ 2009-01-30 22:28 ` Keld Jørn Simonsen
0 siblings, 0 replies; 13+ messages in thread
From: Keld Jørn Simonsen @ 2009-01-30 22:28 UTC (permalink / raw)
To: Vladimir Ivashchenko; +Cc: Bill Davidsen, Tru Huynh, linux-raid
On Fri, Jan 30, 2009 at 09:38:06PM +0200, Vladimir Ivashchenko wrote:
> On Fri, 2009-01-30 at 10:28 -0500, Bill Davidsen wrote:
>
> > > I changed the kernel to 2.6.27.12-78.2.8.fc9.i686 and so far it is stable.
> > >
> > > x64 will be the next step. i686 is what our guys install by default, I didn't bother to reinstall it.
> > >
> >
> > In spite of the theoretical benefits of 64 bit, I find that the
> > advantages are "measurable but not noticeable" for most things. The lack
> > of 64 bit versions of some applications was a problem for me, but may
> > not be for you. I did find that even building from source not all
> > applications worked right, or worked at all, or in some cases compiled. :-(
>
> More or less this is our experience also, but this box will only be used
> as a file-server.
>
> Does anybody know if software RAID benefits when being run in 64-bit ?
I think it may. IO buffer copying may be twice as fast.
Some statistics on IO, including network traffic, can be measured
when it goes beyond about 100 Mbit/s - there are some counters that
would overflow 32 bit.
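To put a rough number on the counter remark, here is a small sketch. It assumes the counter in question counts bytes in an unsigned 32-bit field; the post does not say which counters are meant, so that is an assumption, and wrap_seconds is a hypothetical helper name.

```python
def wrap_seconds(bits_per_second, counter_bits=32):
    """Seconds until an unsigned byte counter of the given width wraps."""
    bytes_per_second = bits_per_second / 8
    return 2 ** counter_bits / bytes_per_second

# Compare a 32-bit and a 64-bit byte counter at two line rates.
for mbit in (100, 1000):
    rate = mbit * 1_000_000
    print(f"{mbit} Mbit/s: 32-bit wraps after {wrap_seconds(rate):.0f} s, "
          f"64-bit after {wrap_seconds(rate, 64):.3g} s")
```

Under that assumption a 32-bit byte counter wraps in a little under six minutes at 100 Mbit/s, so anything that polls it less often cannot tell how many times it wrapped.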
best regards
keld
* Re: sun x4500 soft lockup during raid creation
2009-01-28 20:30 sun x4500 soft lockup during raid creation Vladimir Ivashchenko
` (3 preceding siblings ...)
2009-01-29 22:54 ` Jody McIntyre
@ 2009-02-05 16:10 ` Vladimir Ivashchenko
2009-02-20 18:57 ` Vladimir Ivashchenko
4 siblings, 1 reply; 13+ messages in thread
From: Vladimir Ivashchenko @ 2009-02-05 16:10 UTC (permalink / raw)
To: linux-raid
Ok, further updates:
I have installed 64-bit CentOS 5 and put the x86_64 2.6.26.8-57.fc8
Fedora kernel on it.
The RAID creation was mostly quiet, apart from a few soft lockups as
described below.
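For anyone wanting to count such reports rather than read them one by one, a minimal sketch follows. It assumes the watchdog line has exactly the form seen in the logs in this thread; summarize_lockups and the sample string are illustrative, not part of any existing tool.

```python
import re
from collections import Counter

# Matches the watchdog line as it appears in this thread, e.g.
# "BUG: soft lockup - CPU#2 stuck for 61s! [md4_raid5:13198]".
# \s* also tolerates the line being wrapped before the "[task:pid]" part.
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s!"
    r"\s*\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)

def summarize_lockups(text):
    """Count soft lockup reports per (task name, CPU) pair."""
    counts = Counter()
    for m in LOCKUP_RE.finditer(text):
        counts[(m.group("comm"), int(m.group("cpu")))] += 1
    return counts

sample = (
    "Feb 5 02:36:51 SunSTG kernel: BUG: soft lockup - CPU#2 stuck for 61s! [md4_raid5:13198]\n"
    "Feb 5 02:37:55 SunSTG kernel: BUG: soft lockup - CPU#2 stuck for 61s! [md4_raid5:13198]\n"
    "Feb 5 02:43:30 SunSTG kernel: BUG: soft lockup - CPU#3 stuck for 61s! [md4_raid5:13198]\n"
)
for (comm, cpu), n in sorted(summarize_lockups(sample).items()):
    print(f"{comm} on CPU#{cpu}: {n} report(s)")
```

In practice the text would come from /var/log/messages instead of the sample string.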
Then we tried removing and re-inserting a HDD. As expected, it didn't
fully work, but at least the machine has not crashed. The arrays
didn't have any load, though. The disk went from being /dev/sdat
to /dev/sdax. For some reason mdadm was reporting the array and the
disk itself as healthy, but the device entry for the removed hard
drive #19 was empty, with wrong major/minor numbers.
Reading about the sata_mv driver, it seems that hotplug is known to be
problematic, so we're going to try OpenSolaris. However, I have another
X4500 for a few days, and if any developers would like me to check
something, I will try to do it.
*** HOT PLUG ***
Feb 5 15:48:21 SunSTG kernel: ata46: exception Emask 0x10 SAct 0x0 SErr
0x180000 action 0x6 frozen
Feb 5 15:48:21 SunSTG kernel: ata46: edma_err_cause=02000020
pp_flags=00000002, SError=00180000
Feb 5 15:48:21 SunSTG kernel: ata46: SError: { 10B8B Dispar }
Feb 5 15:48:21 SunSTG kernel: ata46: hard resetting link
Feb 5 15:48:21 SunSTG kernel: ata46: SATA link down (SStatus 0 SControl
300)
Feb 5 15:48:21 SunSTG kernel: ata46: failed to recover some devices,
retrying in 5 secs
Feb 5 15:48:26 SunSTG kernel: ata46: hard resetting link
Feb 5 15:48:27 SunSTG kernel: ata46: SATA link down (SStatus 0 SControl
300)
Feb 5 15:48:27 SunSTG kernel: ata46: failed to recover some devices,
retrying in 5 secs
Feb 5 15:48:32 SunSTG kernel: ata46: hard resetting link
Feb 5 15:48:32 SunSTG kernel: ata46: SATA link down (SStatus 0 SControl
300)
Feb 5 15:48:32 SunSTG kernel: ata46.00: disabled
Feb 5 15:48:32 SunSTG kernel: ata46: EH complete
Feb 5 15:48:32 SunSTG kernel: ata46.00: detaching (SCSI 45:0:0:0)
Feb 5 15:48:32 SunSTG kernel: sd 45:0:0:0: [sdat] Stopping disk
Feb 5 15:48:32 SunSTG kernel: sd 45:0:0:0: [sdat] START_STOP FAILED
Feb 5 15:48:32 SunSTG kernel: sd 45:0:0:0: [sdat] Result:
hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
Feb 5 15:51:05 SunSTG kernel: ata46: exception Emask 0x10 SAct 0x0 SErr
0x4010000 action 0xe frozen
Feb 5 15:51:05 SunSTG kernel: ata46: edma_err_cause=00000010
pp_flags=00000002, dev connect
Feb 5 15:51:05 SunSTG kernel: ata46: SError: { PHYRdyChg DevExch }
Feb 5 15:51:05 SunSTG kernel: ata46: hard resetting link
Feb 5 15:51:11 SunSTG kernel: ata46: link is slow to respond, please be
patient (ready=0)
Feb 5 15:51:12 SunSTG kernel: ata46: SATA link up 3.0 Gbps (SStatus 123
SControl 300)
Feb 5 15:51:12 SunSTG kernel: ata46.00: HPA detected: current
488390625, native 488397168
Feb 5 15:51:12 SunSTG kernel: ata46.00: ATA-7: SEAGATE ST32500NSSUN250G
0830B85CNR, 3AZQ, max UDMA/133
Feb 5 15:51:12 SunSTG kernel: ata46.00: 488390625 sectors, multi 0:
LBA48 NCQ (depth 31/32)
Feb 5 15:51:12 SunSTG kernel: ata46.00: max_sectors limited to 256 for
NCQ
Feb 5 15:51:12 SunSTG kernel: ata46.00: max_sectors limited to 256 for
NCQ
Feb 5 15:51:12 SunSTG kernel: ata46.00: configured for UDMA/133
Feb 5 15:51:12 SunSTG kernel: ata46: EH complete
Feb 5 15:51:12 SunSTG kernel: scsi 45:0:0:0: Direct-Access ATA
SEAGATE ST32500N n/a PQ: 0 ANSI: 5
Feb 5 15:51:12 SunSTG kernel: sd 45:0:0:0: [sdax] 488390625 512-byte
hardware sectors (250056 MB)
Feb 5 15:51:12 SunSTG kernel: sd 45:0:0:0: [sdax] Write Protect is off
Feb 5 15:51:12 SunSTG kernel: sd 45:0:0:0: [sdax] Write cache:
disabled, read cache: enabled, doesn't support DPO or FUA
Feb 5 15:51:12 SunSTG kernel: sd 45:0:0:0: [sdax] 488390625 512-byte
hardware sectors (250056 MB)
Feb 5 15:51:12 SunSTG kernel: sd 45:0:0:0: [sdax] Write Protect is off
Feb 5 15:51:12 SunSTG kernel: sd 45:0:0:0: [sdax] Write cache:
disabled, read cache: enabled, doesn't support DPO or FUA
Feb 5 15:51:12 SunSTG kernel: sdax:
Feb 5 15:51:12 SunSTG kernel: sd 45:0:0:0: [sdax] Attached SCSI disk
Feb 5 15:51:12 SunSTG kernel: sd 45:0:0:0: Attached scsi generic sg45
type 0
Feb 5 16:08:49 SunSTG smartd[12928]: Device: /dev/sdat, No such device,
open() failed
Feb 5 16:08:49 SunSTG smartd[12928]: Sending warning via mail to
root ...
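The rename from sdat to sdax can be picked out of a log like the one above mechanically. A minimal sketch, assuming unwrapped log lines and the "sd H:C:I:L: [name]" prefix printed by the sd driver here; find_renames is a hypothetical helper name.

```python
import re

# "sd <host>:<channel>:<id>:<lun>: [<name>] ..." as printed in the log above.
SD_RE = re.compile(r"\bsd (\d+:\d+:\d+:\d+): \[(sd[a-z]+)\]")

def find_renames(log_text):
    """Return {scsi_address: [names in order of first appearance]} for
    addresses that show up under more than one device name."""
    seen = {}
    for addr, name in SD_RE.findall(log_text):
        names = seen.setdefault(addr, [])
        if name not in names:
            names.append(name)
    return {addr: names for addr, names in seen.items() if len(names) > 1}

sample = """\
Feb 5 15:48:32 SunSTG kernel: sd 45:0:0:0: [sdat] Stopping disk
Feb 5 15:51:12 SunSTG kernel: sd 45:0:0:0: [sdax] Write Protect is off
Feb 5 15:51:12 SunSTG kernel: sd 45:0:0:0: [sdax] Attached SCSI disk
"""
print(find_renames(sample))
```

Keying on the SCSI address rather than the name is the design choice here: the address 45:0:0:0 stays the same across the hot plug while the name does not.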
mdadm output after the event:
[root@SunSTG ~]# mdadm --detail /dev/md3
/dev/md3:
Version : 00.90.03
Creation Time : Wed Feb 4 21:43:12 2009
Raid Level : raid6
Array Size : 5372294400 (5123.42 GiB 5501.23 GB)
Used Dev Size : 244195200 (232.88 GiB 250.06 GB)
Raid Devices : 24
Total Devices : 24
Preferred Minor : 3
Persistence : Superblock is persistent
Update Time : Thu Feb 5 03:22:46 2009
State : active
Active Devices : 24
Working Devices : 24
Failed Devices : 0
Spare Devices : 0
Chunk Size : 64K
UUID : 5f7531f9:6a512ed6:b82261e1:e67c5c29
Events : 0.7
Number Major Minor RaidDevice State
0 8 0 0 active sync /dev/sda
1 65 160 1 active sync /dev/sdaa
2 65 176 2 active sync /dev/sdab
3 65 208 3 active sync /dev/sdad
4 65 224 4 active sync /dev/sdae
5 65 240 5 active sync /dev/sdaf
6 8 160 6 active sync /dev/sdk
7 8 176 7 active sync /dev/sdl
8 8 192 8 active sync /dev/sdm
9 8 208 9 active sync /dev/sdn
10 66 0 10 active sync /dev/sdag
11 66 16 11 active sync /dev/sdah
12 66 32 12 active sync /dev/sdai
13 66 112 13 active sync /dev/sdan
14 66 128 14 active sync /dev/sdao
15 66 144 15 active sync /dev/sdap
16 66 160 16 active sync /dev/sdaq
17 66 176 17 active sync /dev/sdar
18 66 192 18 active sync /dev/sdas
19 66 208 19 active sync
20 65 96 20 active sync /dev/sdw
21 65 112 21 active sync /dev/sdx
22 65 144 22 active sync /dev/sdz
23 66 240 23 active sync /dev/sdav
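The row without a device path can be found with a short script, and the default kernel name for a major:minor pair can be computed to see which disk the numbers belong to. This is a sketch under assumptions: it only handles "active sync" rows as printed above, it assumes the default sd naming (udev rules could differ), and sd_name and rows_without_device are hypothetical helper names.

```python
import re

# Block majors used for SCSI disks on Linux. Each major covers 16 disks and
# each disk owns 16 minors (the whole disk plus 15 partitions).
SD_MAJORS = [8] + list(range(65, 72)) + list(range(128, 136))

def sd_name(major, minor):
    """Default kernel name of the whole disk at major:minor."""
    index = SD_MAJORS.index(major) * 16 + minor // 16
    name = ""
    index += 1                      # bijective base 26: a..z, aa..zz, ...
    while index > 0:
        index, rem = divmod(index - 1, 26)
        name = chr(ord("a") + rem) + name
    return "sd" + name

# Number, Major, Minor, RaidDevice, state, then an optional device path.
ROW_RE = re.compile(
    r"^[ \t]*(\d+)[ \t]+(\d+)[ \t]+(\d+)[ \t]+(\d+)[ \t]+active sync"
    r"(?:[ \t]+(\S+))?[ \t]*$",
    re.MULTILINE,
)

def rows_without_device(detail_text):
    """Yield (raid_device, major, minor, default_name) for member rows
    that list no /dev path."""
    for _num, major, minor, raid_dev, path in ROW_RE.findall(detail_text):
        if not path:
            major, minor = int(major), int(minor)
            yield int(raid_dev), major, minor, sd_name(major, minor)

sample = """\
18 66 192 18 active sync /dev/sdas
19 66 208 19 active sync
20 65 96 20 active sync /dev/sdw
"""
for raid_dev, major, minor, name in rows_without_device(sample):
    print(f"RaidDevice {raid_dev}: {major}:{minor} has no path "
          f"(default name for that number: {name})")
```

By this arithmetic 66:208 is the number that /dev/sdat carries, and /dev/sdax would be 67:16.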
*** SOFT LOCKUPS ***
Feb 5 02:36:51 SunSTG kernel: BUG: soft lockup - CPU#2 stuck for 61s!
[md4_raid5:13198]
Feb 5 02:36:51 SunSTG kernel: Modules linked in: raid456 async_xor
async_memcpy async_tx xor autofs4 hidp rfcomm l2cap bluetooth sunrpc
dm_mirror dm_log dm_multipath dm_mod wmi video output sbs sbshc battery
ac ipv6 parport_pc lp parport sr_mod cdrom joydev sg e1000 serio_raw
pata_amd pata_acpi i2c_amd8111 i2c_amd756 pcspkr ata_generic i2c_core
shpchp k8temp amd_rng hwmon usb_storage sata_mv libata sd_mod scsi_mod
raid1 ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded:
freq_table]
Feb 5 02:36:51 SunSTG kernel: CPU 2:
Feb 5 02:36:51 SunSTG kernel: Modules linked in: raid456 async_xor
async_memcpy async_tx xor autofs4 hidp rfcomm l2cap bluetooth sunrpc
dm_mirror dm_log dm_multipath dm_mod wmi video output sbs sbshc battery
ac ipv6 parport_pc lp parport sr_mod cdrom joydev sg e1000 serio_raw
pata_amd pata_acpi i2c_amd8111 i2c_amd756 pcspkr ata_generic i2c_core
shpchp k8temp amd_rng hwmon usb_storage sata_mv libata sd_mod scsi_mod
raid1 ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded:
freq_table]
Feb 5 02:36:51 SunSTG kernel: Pid: 13198, comm: md4_raid5 Not tainted
2.6.26.8-57.fc8 #1
Feb 5 02:36:51 SunSTG kernel: RIP: 0010:[<ffffffffa0299468>]
[<ffffffffa0299468>] :raid456:raid6_sse24_gen_syndrome+0x184/0x210
Feb 5 02:36:51 SunSTG kernel: RSP: 0018:ffff8101f2881bd8 EFLAGS:
00000286
Feb 5 02:36:51 SunSTG kernel: RAX: ffff8101f2c7b000 RBX:
ffff8101f2881c10 RCX: ffff8101f2c7baa0
Feb 5 02:36:51 SunSTG kernel: RDX: ffff8101f2c7ba80 RSI:
0000000000000a80 RDI: ffff8101f2c7aa80
Feb 5 02:36:51 SunSTG kernel: RBP: 000000008005003b R08:
ffff8101f2c79a80 R09: 00000000ffffffff
Feb 5 02:36:51 SunSTG kernel: R10: ffff8101f2881c18 R11:
000000008005003b R12: ffff8101f2881bc8
Feb 5 02:36:51 SunSTG kernel: R13: ffffffff8107e9d7 R14:
ffff8101f2881b40 R15: ffff8103fd859eb0
Feb 5 02:36:51 SunSTG kernel: FS: 00007f7d08c696e0(0000)
GS:ffff8103ff039300(0000) knlGS:0000000000000000
Feb 5 02:36:51 SunSTG kernel: CS: 0010 DS: 0018 ES: 0018 CR0:
0000000080050033
Feb 5 02:36:51 SunSTG kernel: CR2: 00007fc464d4e000 CR3:
00000003fdd03000 CR4: 00000000000006e0
Feb 5 02:36:51 SunSTG kernel: DR0: 0000000000000000 DR1:
0000000000000000 DR2: 0000000000000000
Feb 5 02:36:51 SunSTG kernel: DR3: 0000000000000000 DR6:
00000000ffff0ff0 DR7: 0000000000000400
Feb 5 02:36:51 SunSTG kernel:
Feb 5 02:36:51 SunSTG kernel: Call Trace:
Feb 5 02:36:51 SunSTG kernel:
[<ffffffffa0299362>] ? :raid456:raid6_sse24_gen_syndrome+0x7e/0x210
Feb 5 02:36:51 SunSTG kernel:
[<ffffffffa0295f6a>] ? :raid456:compute_parity6+0x24f/0x2e2
Feb 5 02:36:51 SunSTG kernel:
[<ffffffffa0296146>] ? :raid456:compute_block_1+0x149/0x1b2
Feb 5 02:36:51 SunSTG kernel:
[<ffffffffa0296d8f>] ? :raid456:handle_stripe+0x9eb/0xf1b
Feb 5 02:36:51 SunSTG kernel: [<ffffffff811ee616>] ? md_wakeup_thread
+0x24/0x26
Feb 5 02:36:51 SunSTG kernel: [<ffffffffa029769d>] ? :raid456:raid5d
+0x3de/0x3ee
Feb 5 02:36:51 SunSTG kernel: [<ffffffff81297b98>] ? schedule_timeout
+0x22/0xb4
Feb 5 02:36:51 SunSTG kernel: [<ffffffff811f687a>] ? md_thread
+0xd6/0xee
Feb 5 02:36:51 SunSTG kernel: [<ffffffff810492dc>] ?
autoremove_wake_function+0x0/0x38
Feb 5 02:36:51 SunSTG kernel: [<ffffffff811f67a4>] ? md_thread
+0x0/0xee
Feb 5 02:36:52 SunSTG kernel: [<ffffffff810491a5>] ? kthread+0x49/0x78
Feb 5 02:36:52 SunSTG kernel: [<ffffffff8100d188>] ? child_rip
+0xa/0x12
Feb 5 02:36:52 SunSTG kernel: [<ffffffff8104915c>] ? kthread+0x0/0x78
Feb 5 02:36:52 SunSTG kernel: [<ffffffff8100d17e>] ? child_rip
+0x0/0x12
Feb 5 02:36:52 SunSTG kernel:
Feb 5 02:37:55 SunSTG kernel: BUG: soft lockup - CPU#2 stuck for 61s!
[md4_raid5:13198]
Feb 5 02:37:55 SunSTG kernel: Modules linked in: raid456 async_xor
async_memcpy async_tx xor autofs4 hidp rfcomm l2cap bluetooth sunrpc
dm_mirror dm_log dm_multipath dm_mod wmi video output sbs sbshc battery
ac ipv6 parport_pc lp parport sr_mod cdrom joydev sg e1000 serio_raw
pata_amd pata_acpi i2c_amd8111 i2c_amd756 pcspkr ata_generic i2c_core
shpchp k8temp amd_rng hwmon usb_storage sata_mv libata sd_mod scsi_mod
raid1 ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded:
freq_table]
Feb 5 02:37:55 SunSTG kernel: CPU 2:
Feb 5 02:37:55 SunSTG kernel: Modules linked in: raid456 async_xor
async_memcpy async_tx xor autofs4 hidp rfcomm l2cap bluetooth sunrpc
dm_mirror dm_log dm_multipath dm_mod wmi video output sbs sbshc battery
ac ipv6 parport_pc lp parport sr_mod cdrom joydev sg e1000 serio_raw
pata_amd pata_acpi i2c_amd8111 i2c_amd756 pcspkr ata_generic i2c_core
shpchp k8temp amd_rng hwmon usb_storage sata_mv libata sd_mod scsi_mod
raid1 ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded:
freq_table]
Feb 5 02:37:55 SunSTG kernel: Pid: 13198, comm: md4_raid5 Not tainted
2.6.26.8-57.fc8 #1
Feb 5 02:37:55 SunSTG kernel: RIP: 0010:[<ffffffffa027eb63>]
[<ffffffffa027eb63>] :xor:xor_sse_5+0x3d0/0x3d7
Feb 5 02:37:55 SunSTG kernel: RSP: 0018:ffff8101f2881c70 EFLAGS:
00000246
Feb 5 02:37:55 SunSTG kernel: RAX: 0000000000000100 RBX:
ffff8101f2881cc0 RCX: 0000000000000000
Feb 5 02:37:55 SunSTG kernel: RDX: ffff8101f3401000 RSI:
ffff8101f33fd000 RDI: 0000000000000010
Feb 5 02:37:55 SunSTG kernel: RBP: 000000000000000f R08:
ffff8101f33ff000 R09: ffff8101f33fc000
Feb 5 02:37:55 SunSTG kernel: R10: ffff8101f2881c70 R11:
000000008005003b R12: 0000000000000003
Feb 5 02:37:55 SunSTG kernel: R13: 0000000000001000 R14:
ffffffffa02994e7 R15: ffff8101f2881c10
Feb 5 02:37:55 SunSTG kernel: FS: 00007f7d08c696e0(0000)
GS:ffff8103ff039300(0000) knlGS:0000000000000000
Feb 5 02:37:56 SunSTG kernel: CS: 0010 DS: 0018 ES: 0018 CR0:
000000008005003b
Feb 5 02:37:56 SunSTG kernel: CR2: 00007fc464d4e000 CR3:
00000003fdd03000 CR4: 00000000000006e0
Feb 5 02:37:56 SunSTG kernel: DR0: 0000000000000000 DR1:
0000000000000000 DR2: 0000000000000000
Feb 5 02:37:56 SunSTG kernel: DR3: 0000000000000000 DR6:
00000000ffff0ff0 DR7: 0000000000000400
Feb 5 02:37:56 SunSTG kernel:
Feb 5 02:37:56 SunSTG kernel: Call Trace:
Feb 5 02:37:56 SunSTG kernel: [<ffffffffa027ebd3>] ? :xor:xor_blocks
+0x69/0x6b
Feb 5 02:37:56 SunSTG kernel:
[<ffffffffa0296146>] ? :raid456:compute_block_1+0x149/0x1b2
Feb 5 02:37:56 SunSTG kernel:
[<ffffffffa0296c8c>] ? :raid456:handle_stripe+0x8e8/0xf1b
Feb 5 02:37:56 SunSTG kernel: [<ffffffffa029769d>] ? :raid456:raid5d
+0x3de/0x3ee
Feb 5 02:37:56 SunSTG kernel: [<ffffffff81297b98>] ? schedule_timeout
+0x22/0xb4
Feb 5 02:37:56 SunSTG kernel: [<ffffffff811f687a>] ? md_thread
+0xd6/0xee
Feb 5 02:37:56 SunSTG kernel: [<ffffffff810492dc>] ?
autoremove_wake_function+0x0/0x38
Feb 5 02:37:56 SunSTG kernel: [<ffffffff811f67a4>] ? md_thread
+0x0/0xee
Feb 5 02:37:56 SunSTG kernel: [<ffffffff810491a5>] ? kthread+0x49/0x78
Feb 5 02:37:56 SunSTG kernel: [<ffffffff8100d188>] ? child_rip
+0xa/0x12
Feb 5 02:37:56 SunSTG kernel: [<ffffffff8104915c>] ? kthread+0x0/0x78
Feb 5 02:37:56 SunSTG kernel: [<ffffffff8100d17e>] ? child_rip
+0x0/0x12
Feb 5 02:37:56 SunSTG kernel:
Feb 5 02:38:09 SunSTG yum-updatesd-helper: error getting update info:
Cannot retrieve repository metadata (repomd.xml) for repository: base.
Please verify its path and try again
Feb 5 02:43:30 SunSTG kernel: BUG: soft lockup - CPU#3 stuck for 61s!
[md4_raid5:13198]
Feb 5 02:43:30 SunSTG kernel: Modules linked in: raid456 async_xor
async_memcpy async_tx xor autofs4 hidp rfcomm l2cap bluetooth sunrpc
dm_mirror dm_log dm_multipath dm_mod wmi video output sbs sbshc battery
ac ipv6 parport_pc lp parport sr_mod cdrom joydev sg e1000 serio_raw
pata_amd pata_acpi i2c_amd8111 i2c_amd756 pcspkr ata_generic i2c_core
shpchp k8temp amd_rng hwmon usb_storage sata_mv libata sd_mod scsi_mod
raid1 ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded:
freq_table]
Feb 5 02:43:30 SunSTG kernel: CPU 3:
Feb 5 02:43:30 SunSTG kernel: Modules linked in: raid456 async_xor
async_memcpy async_tx xor autofs4 hidp rfcomm l2cap bluetooth sunrpc
dm_mirror dm_log dm_multipath dm_mod wmi video output sbs sbshc battery
ac ipv6 parport_pc lp parport sr_mod cdrom joydev sg e1000 serio_raw
pata_amd pata_acpi i2c_amd8111 i2c_amd756 pcspkr ata_generic i2c_core
shpchp k8temp amd_rng hwmon usb_storage sata_mv libata sd_mod scsi_mod
raid1 ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded:
freq_table]
Feb 5 02:43:30 SunSTG kernel: Pid: 13198, comm: md4_raid5 Not tainted
2.6.26.8-57.fc8 #1
Feb 5 02:43:30 SunSTG kernel: RIP: 0010:[<ffffffff810204c1>]
[<ffffffff810204c1>] native_read_cr0+0x0/0x9
Feb 5 02:43:30 SunSTG kernel: RSP: 0018:ffff8101f2881bd0 EFLAGS:
00000246
Feb 5 02:43:30 SunSTG kernel: RAX: ffff8101f315c000 RBX:
ffff8101f2881c10 RCX: ffff8101f315cfe0
Feb 5 02:43:30 SunSTG kernel: RDX: ffff8101f315cfc0 RSI:
0000000000001000 RDI: ffff8101f315c000
Feb 5 02:43:30 SunSTG kernel: RBP: 000000008005003b R08:
ffff8101f315b000 R09: 00000000ffffffff
Feb 5 02:43:30 SunSTG kernel: R10: ffff8101f2881c18 R11:
000000008005003b R12: ffff8101f2881bc8
Feb 5 02:43:30 SunSTG kernel: R13: ffffffff8107e9d7 R14:
ffff8101f2881b40 R15: ffff8103fd858330
Feb 5 02:43:30 SunSTG kernel: FS: 00007fdaeea3a6e0(0000)
GS:ffff8103ff039700(0000) knlGS:0000000000000000
Feb 5 02:43:30 SunSTG kernel: CS: 0010 DS: 0018 ES: 0018 CR0:
0000000080050033
Feb 5 02:43:30 SunSTG kernel: CR2: 00007fadb1a4d170 CR3:
00000003f996e000 CR4: 00000000000006e0
Feb 5 02:43:30 SunSTG kernel: DR0: 0000000000000000 DR1:
0000000000000000 DR2: 0000000000000000
Feb 5 02:43:30 SunSTG kernel: DR3: 0000000000000000 DR6:
00000000ffff0ff0 DR7: 0000000000000400
Feb 5 02:43:30 SunSTG kernel:
Feb 5 02:43:30 SunSTG kernel: Call Trace:
Feb 5 02:43:30 SunSTG kernel:
[<ffffffffa02994d7>] ? :raid456:raid6_sse24_gen_syndrome+0x1f3/0x210
Feb 5 02:43:30 SunSTG kernel:
[<ffffffffa0295f6a>] ? :raid456:compute_parity6+0x24f/0x2e2
Feb 5 02:43:31 SunSTG kernel:
[<ffffffffa0296146>] ? :raid456:compute_block_1+0x149/0x1b2
Feb 5 02:43:31 SunSTG kernel:
[<ffffffffa0296d8f>] ? :raid456:handle_stripe+0x9eb/0xf1b
Feb 5 02:43:31 SunSTG kernel: [<ffffffff811ee616>] ? md_wakeup_thread
+0x24/0x26
Feb 5 02:43:31 SunSTG kernel: [<ffffffffa029769d>] ? :raid456:raid5d
+0x3de/0x3ee
Feb 5 02:43:31 SunSTG kernel: [<ffffffff81297b98>] ? schedule_timeout
+0x22/0xb4
Feb 5 02:43:31 SunSTG kernel: [<ffffffff811f687a>] ? md_thread
+0xd6/0xee
Feb 5 02:43:31 SunSTG kernel: [<ffffffff810492dc>] ?
autoremove_wake_function+0x0/0x38
Feb 5 02:43:31 SunSTG kernel: [<ffffffff811f67a4>] ? md_thread
+0x0/0xee
Feb 5 02:43:31 SunSTG kernel: [<ffffffff810491a5>] ? kthread+0x49/0x78
Feb 5 02:43:31 SunSTG kernel: [<ffffffff8100d188>] ? child_rip
+0xa/0x12
Feb 5 02:43:31 SunSTG kernel: [<ffffffff8104915c>] ? kthread+0x0/0x78
Feb 5 02:43:31 SunSTG kernel: [<ffffffff8100d17e>] ? child_rip
+0x0/0x12
Feb 5 02:43:31 SunSTG kernel:
Feb 5 02:44:36 SunSTG kernel: BUG: soft lockup - CPU#3 stuck for 61s!
[md4_raid5:13198]
Feb 5 02:44:36 SunSTG kernel: Modules linked in: raid456 async_xor
async_memcpy async_tx xor autofs4 hidp rfcomm l2cap bluetooth sunrpc
dm_mirror dm_log dm_multipath dm_mod wmi video output sbs sbshc battery
ac ipv6 parport_pc lp parport sr_mod cdrom joydev sg e1000 serio_raw
pata_amd pata_acpi i2c_amd8111 i2c_amd756 pcspkr ata_generic i2c_core
shpchp k8temp amd_rng hwmon usb_storage sata_mv libata sd_mod scsi_mod
raid1 ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded:
freq_table]
Feb 5 02:44:36 SunSTG kernel: CPU 3:
Feb 5 02:44:36 SunSTG kernel: Modules linked in: raid456 async_xor
async_memcpy async_tx xor autofs4 hidp rfcomm l2cap bluetooth sunrpc
dm_mirror dm_log dm_multipath dm_mod wmi video output sbs sbshc battery
ac ipv6 parport_pc lp parport sr_mod cdrom joydev sg e1000 serio_raw
pata_amd pata_acpi i2c_amd8111 i2c_amd756 pcspkr ata_generic i2c_core
shpchp k8temp amd_rng hwmon usb_storage sata_mv libata sd_mod scsi_mod
raid1 ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded:
freq_table]
Feb 5 02:44:36 SunSTG kernel: Pid: 13198, comm: md4_raid5 Not tainted
2.6.26.8-57.fc8 #1
Feb 5 02:44:36 SunSTG kernel: RIP: 0010:[<ffffffffa027e953>]
[<ffffffffa027e953>] :xor:xor_sse_5+0x1c0/0x3d7
Feb 5 02:44:36 SunSTG kernel: RSP: 0018:ffff8101f2881c70 EFLAGS:
00000202
Feb 5 02:44:36 SunSTG kernel: RAX: 0000000000000100 RBX:
ffff8101f2881cc0 RCX: 0000000000000008
Feb 5 02:44:36 SunSTG kernel: RDX: ffff8101f3342800 RSI:
ffff8101f3347800 RDI: 0000000000000010
Feb 5 02:44:36 SunSTG kernel: RBP: 0000000000000012 R08:
ffff8101f3340800 R09: ffff8101f333f800
Feb 5 02:44:36 SunSTG kernel: R10: ffff8101f2881c70 R11:
000000008005003b R12: 0000000000000003
Feb 5 02:44:36 SunSTG kernel: R13: 0000000000001000 R14:
ffffffffa02994e7 R15: ffff8101f2881c10
Feb 5 02:44:36 SunSTG kernel: FS: 00007fdaeea3a6e0(0000)
GS:ffff8103ff039700(0000) knlGS:0000000000000000
Feb 5 02:44:36 SunSTG kernel: CS: 0010 DS: 0018 ES: 0018 CR0:
0000000080050033
Feb 5 02:44:36 SunSTG kernel: CR2: 00007fadb1a4d170 CR3:
00000003f996e000 CR4: 00000000000006e0
Feb 5 02:44:36 SunSTG kernel: DR0: 0000000000000000 DR1:
0000000000000000 DR2: 0000000000000000
Feb 5 02:44:36 SunSTG kernel: DR3: 0000000000000000 DR6:
00000000ffff0ff0 DR7: 0000000000000400
Feb 5 02:44:36 SunSTG kernel:
Feb 5 02:44:36 SunSTG kernel: Call Trace:
Feb 5 02:44:36 SunSTG kernel: [<ffffffffa027ebd3>] ? :xor:xor_blocks
+0x69/0x6b
Feb 5 02:44:36 SunSTG kernel:
[<ffffffffa0296146>] ? :raid456:compute_block_1+0x149/0x1b2
Feb 5 02:44:36 SunSTG kernel:
[<ffffffffa0296c8c>] ? :raid456:handle_stripe+0x8e8/0xf1b
Feb 5 02:44:36 SunSTG kernel: [<ffffffff811ee616>] ? md_wakeup_thread
+0x24/0x26
Feb 5 02:44:36 SunSTG kernel: [<ffffffffa029769d>] ? :raid456:raid5d
+0x3de/0x3ee
Feb 5 02:44:36 SunSTG kernel: [<ffffffff81297b98>] ? schedule_timeout
+0x22/0xb4
Feb 5 02:44:36 SunSTG kernel: [<ffffffff811f687a>] ? md_thread
+0xd6/0xee
Feb 5 02:44:36 SunSTG kernel: [<ffffffff810492dc>] ?
autoremove_wake_function+0x0/0x38
Feb 5 02:44:36 SunSTG kernel: [<ffffffff811f67a4>] ? md_thread
+0x0/0xee
Feb 5 02:44:36 SunSTG kernel: [<ffffffff810491a5>] ? kthread+0x49/0x78
Feb 5 02:44:36 SunSTG kernel: [<ffffffff8100d188>] ? child_rip
+0xa/0x12
Feb 5 02:44:36 SunSTG kernel: [<ffffffff8104915c>] ? kthread+0x0/0x78
Feb 5 02:44:36 SunSTG kernel: [<ffffffff8100d17e>] ? child_rip
+0x0/0x12
Feb 5 02:44:36 SunSTG kernel:
Feb 5 02:45:41 SunSTG kernel: BUG: soft lockup - CPU#3 stuck for 61s!
[md4_raid5:13198]
Feb 5 02:45:41 SunSTG kernel: Modules linked in: raid456 async_xor
async_memcpy async_tx xor autofs4 hidp rfcomm l2cap bluetooth sunrpc
dm_mirror dm_log dm_multipath dm_mod wmi video output sbs sbshc battery
ac ipv6 parport_pc lp parport sr_mod cdrom joydev sg e1000 serio_raw
pata_amd pata_acpi i2c_amd8111 i2c_amd756 pcspkr ata_generic i2c_core
shpchp k8temp amd_rng hwmon usb_storage sata_mv libata sd_mod scsi_mod
raid1 ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded:
freq_table]
Feb 5 02:45:41 SunSTG kernel: CPU 3:
Feb 5 02:45:41 SunSTG kernel: Modules linked in: raid456 async_xor
async_memcpy async_tx xor autofs4 hidp rfcomm l2cap bluetooth sunrpc
dm_mirror dm_log dm_multipath dm_mod wmi video output sbs sbshc battery
ac ipv6 parport_pc lp parport sr_mod cdrom joydev sg e1000 serio_raw
pata_amd pata_acpi i2c_amd8111 i2c_amd756 pcspkr ata_generic i2c_core
shpchp k8temp amd_rng hwmon usb_storage sata_mv libata sd_mod scsi_mod
raid1 ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded:
freq_table]
Feb 5 02:45:41 SunSTG kernel: Pid: 13198, comm: md4_raid5 Not tainted
2.6.26.8-57.fc8 #1
Feb 5 02:45:41 SunSTG kernel: RIP: 0010:[<ffffffffa00b4465>]
[<ffffffffa00b4465>] :sata_mv:mv_process_crpb_entries+0x60/0x15e
Feb 5 02:45:41 SunSTG kernel: RSP: 0018:ffff8101ff10be58 EFLAGS:
00000202
Feb 5 02:45:41 SunSTG kernel: RAX: 0000000000000002 RBX:
ffff8101ff10be88 RCX: ffffc200025a2000
Feb 5 02:45:41 SunSTG kernel: RDX: 0000000000002000 RSI:
ffff8101fe1c0828 RDI: ffff8101fe290000
Feb 5 02:45:41 SunSTG kernel: RBP: ffff8101ff10bdd0 R08:
0000000000000202 R09: 0000000000000008
Feb 5 02:45:41 SunSTG kernel: R10: 0000000000000002 R11:
000000008005003b R12: ffffffff8100cf52
Feb 5 02:45:41 SunSTG kernel: R13: ffff8101ff10bdd0 R14:
ffff8101fe290000 R15: 0000000000000018
Feb 5 02:45:41 SunSTG kernel: FS: 00007fdaeea3a6e0(0000)
GS:ffff8103ff039700(0000) knlGS:0000000000000000
Feb 5 02:45:41 SunSTG kernel: CS: 0010 DS: 0018 ES: 0018 CR0:
0000000080050033
Feb 5 02:45:41 SunSTG kernel: CR2: 00007fadb1a4d170 CR3:
00000003f996e000 CR4: 00000000000006e0
Feb 5 02:45:41 SunSTG kernel: DR0: 0000000000000000 DR1:
0000000000000000 DR2: 0000000000000000
Feb 5 02:45:41 SunSTG kernel: DR3: 0000000000000000 DR6:
00000000ffff0ff0 DR7: 0000000000000400
Feb 5 02:45:41 SunSTG kernel:
Feb 5 02:45:41 SunSTG kernel: Call Trace:
Feb 5 02:45:41 SunSTG kernel: <IRQ>
[<ffffffffa00b4fdf>] ? :sata_mv:mv_interrupt+0x28b/0x64b
Feb 5 02:45:41 SunSTG kernel: [<ffffffff8112d332>] ? blk_done_softirq
+0x71/0x80
Feb 5 02:45:41 SunSTG kernel: [<ffffffff81075689>] ? handle_IRQ_event
+0x2e/0x65
Feb 5 02:45:41 SunSTG kernel: [<ffffffff81076cce>] ?
handle_fasteoi_irq+0x95/0xd0
Feb 5 02:45:41 SunSTG kernel: [<ffffffff8100f00f>] ? do_IRQ+0xf7/0x16c
Feb 5 02:45:41 SunSTG kernel: [<ffffffff8100c6cd>] ? ret_from_intr
+0x0/0x19
Feb 5 02:45:42 SunSTG kernel: <EOI>
[<ffffffffa027e8d8>] ? :xor:xor_sse_5+0x145/0x3d7
Feb 5 02:45:42 SunSTG kernel: [<ffffffffa027ebd3>] ? :xor:xor_blocks
+0x69/0x6b
Feb 5 02:45:42 SunSTG kernel:
[<ffffffffa0296146>] ? :raid456:compute_block_1+0x149/0x1b2
Feb 5 02:45:42 SunSTG kernel:
[<ffffffffa0296c8c>] ? :raid456:handle_stripe+0x8e8/0xf1b
Feb 5 02:45:42 SunSTG kernel: [<ffffffffa029769d>] ? :raid456:raid5d
+0x3de/0x3ee
Feb 5 02:45:42 SunSTG kernel: [<ffffffff81297b98>] ? schedule_timeout
+0x22/0xb4
Feb 5 02:45:42 SunSTG kernel: [<ffffffff811f687a>] ? md_thread
+0xd6/0xee
Feb 5 02:45:42 SunSTG kernel: [<ffffffff810492dc>] ?
autoremove_wake_function+0x0/0x38
Feb 5 02:45:42 SunSTG kernel: [<ffffffff811f67a4>] ? md_thread
+0x0/0xee
Feb 5 02:45:42 SunSTG kernel: [<ffffffff810491a5>] ? kthread+0x49/0x78
Feb 5 02:45:42 SunSTG kernel: [<ffffffff8100d188>] ? child_rip
+0xa/0x12
Feb 5 02:45:42 SunSTG kernel: [<ffffffff8104915c>] ? kthread+0x0/0x78
Feb 5 02:45:42 SunSTG kernel: [<ffffffff8100d17e>] ? child_rip
+0x0/0x12
Feb 5 02:45:42 SunSTG kernel:
Feb 5 02:46:47 SunSTG kernel: BUG: soft lockup - CPU#3 stuck for 61s!
[md4_raid5:13198]
Feb 5 02:46:47 SunSTG kernel: Modules linked in: raid456 async_xor
async_memcpy async_tx xor autofs4 hidp rfcomm l2cap bluetooth sunrpc
dm_mirror dm_log dm_multipath dm_mod wmi video output sbs sbshc battery
ac ipv6 parport_pc lp parport sr_mod cdrom joydev sg e1000 serio_raw
pata_amd pata_acpi i2c_amd8111 i2c_amd756 pcspkr ata_generic i2c_core
shpchp k8temp amd_rng hwmon usb_storage sata_mv libata sd_mod scsi_mod
raid1 ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded:
freq_table]
Feb 5 02:46:47 SunSTG kernel: CPU 3:
Feb 5 02:46:47 SunSTG kernel: Modules linked in: raid456 async_xor
async_memcpy async_tx xor autofs4 hidp rfcomm l2cap bluetooth sunrpc
dm_mirror dm_log dm_multipath dm_mod wmi video output sbs sbshc battery
ac ipv6 parport_pc lp parport sr_mod cdrom joydev sg e1000 serio_raw
pata_amd pata_acpi i2c_amd8111 i2c_amd756 pcspkr ata_generic i2c_core
shpchp k8temp amd_rng hwmon usb_storage sata_mv libata sd_mod scsi_mod
raid1 ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded:
freq_table]
Feb 5 02:46:47 SunSTG kernel: Pid: 13198, comm: md4_raid5 Not tainted
2.6.26.8-57.fc8 #1
Feb 5 02:46:47 SunSTG kernel: RIP: 0010:[<ffffffffa027e7f5>]
[<ffffffffa027e7f5>] :xor:xor_sse_5+0x62/0x3d7
Feb 5 02:46:47 SunSTG kernel: RSP: 0018:ffff8101f2881c70 EFLAGS:
00000202
Feb 5 02:46:47 SunSTG kernel: RAX: 0000000000000100 RBX:
ffff8101f2881cc0 RCX: 000000000000000d
Feb 5 02:46:47 SunSTG kernel: RDX: ffff8101ef5a2300 RSI:
ffff8101ef5a9300 RDI: 0000000000000010
Feb 5 02:46:47 SunSTG kernel: RBP: 000000000000000c R08:
ffff8101ef5a0300 R09: ffff8101ef59f300
Feb 5 02:46:47 SunSTG kernel: R10: ffff8101f2881c70 R11:
000000008005003b R12: 0000000000000003
Feb 5 02:46:47 SunSTG kernel: R13: 0000000000001000 R14:
ffffffffa02994e7 R15: ffff8101f2881c10
Feb 5 02:46:47 SunSTG kernel: FS: 00007fdaeea3a6e0(0000)
GS:ffff8103ff039700(0000) knlGS:0000000000000000
Feb 5 02:46:47 SunSTG kernel: CS: 0010 DS: 0018 ES: 0018 CR0:
0000000080050033
Feb 5 02:46:47 SunSTG kernel: CR2: 00007fadb1a4d170 CR3:
00000003f996e000 CR4: 00000000000006e0
Feb 5 02:46:47 SunSTG kernel: DR0: 0000000000000000 DR1:
0000000000000000 DR2: 0000000000000000
Feb 5 02:46:47 SunSTG kernel: DR3: 0000000000000000 DR6:
00000000ffff0ff0 DR7: 0000000000000400
Feb 5 02:46:47 SunSTG kernel:
Feb 5 02:46:47 SunSTG kernel: Call Trace:
Feb 5 02:46:47 SunSTG kernel: [<ffffffffa027ebd3>] ? :xor:xor_blocks
+0x69/0x6b
Feb 5 02:46:47 SunSTG kernel:
[<ffffffffa0296146>] ? :raid456:compute_block_1+0x149/0x1b2
Feb 5 02:46:47 SunSTG kernel:
[<ffffffffa0296c8c>] ? :raid456:handle_stripe+0x8e8/0xf1b
Feb 5 02:46:47 SunSTG kernel: [<ffffffff811ee616>] ? md_wakeup_thread
+0x24/0x26
Feb 5 02:46:47 SunSTG kernel: [<ffffffffa029769d>] ? :raid456:raid5d
+0x3de/0x3ee
Feb 5 02:46:47 SunSTG kernel: [<ffffffff81297b98>] ? schedule_timeout
+0x22/0xb4
Feb 5 02:46:47 SunSTG kernel: [<ffffffff811f687a>] ? md_thread
+0xd6/0xee
Feb 5 02:46:47 SunSTG kernel: [<ffffffff810492dc>] ?
autoremove_wake_function+0x0/0x38
Feb 5 02:46:47 SunSTG kernel: [<ffffffff811f67a4>] ? md_thread
+0x0/0xee
Feb 5 02:46:47 SunSTG kernel: [<ffffffff810491a5>] ? kthread+0x49/0x78
Feb 5 02:46:47 SunSTG kernel: [<ffffffff8100d188>] ? child_rip
+0xa/0x12
Feb 5 02:46:47 SunSTG kernel: [<ffffffff8104915c>] ? kthread+0x0/0x78
Feb 5 02:46:47 SunSTG kernel: [<ffffffff8100d17e>] ? child_rip
+0x0/0x12
Feb 5 02:46:47 SunSTG kernel:
Feb 5 02:48:52 SunSTG kernel: BUG: soft lockup - CPU#3 stuck for 61s!
[md4_raid5:13198]
Feb 5 02:48:52 SunSTG kernel: Modules linked in: raid456 async_xor
async_memcpy async_tx xor autofs4 hidp rfcomm l2cap bluetooth sunrpc
dm_mirror dm_log dm_multipath dm_mod wmi video output sbs sbshc battery
ac ipv6 parport_pc lp parport sr_mod cdrom joydev sg e1000 serio_raw
pata_amd pata_acpi i2c_amd8111 i2c_amd756 pcspkr ata_generic i2c_core
shpchp k8temp amd_rng hwmon usb_storage sata_mv libata sd_mod scsi_mod
raid1 ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded:
freq_table]
Feb 5 02:48:52 SunSTG kernel: CPU 3:
Feb 5 02:48:52 SunSTG kernel: Modules linked in: raid456 async_xor
async_memcpy async_tx xor autofs4 hidp rfcomm l2cap bluetooth sunrpc
dm_mirror dm_log dm_multipath dm_mod wmi video output sbs sbshc battery
ac ipv6 parport_pc lp parport sr_mod cdrom joydev sg e1000 serio_raw
pata_amd pata_acpi i2c_amd8111 i2c_amd756 pcspkr ata_generic i2c_core
shpchp k8temp amd_rng hwmon usb_storage sata_mv libata sd_mod scsi_mod
raid1 ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded:
freq_table]
Feb 5 02:48:52 SunSTG kernel: Pid: 13198, comm: md4_raid5 Not tainted
2.6.26.8-57.fc8 #1
Feb 5 02:48:52 SunSTG kernel: RIP: 0010:[<ffffffffa0299416>]
[<ffffffffa0299416>] :raid456:raid6_sse24_gen_syndrome+0x132/0x210
Feb 5 02:48:52 SunSTG kernel: RSP: 0018:ffff8101f2881bd8 EFLAGS:
00000202
Feb 5 02:48:52 SunSTG kernel: RAX: ffff8101f78c0000 RBX:
ffff8101f2881c10 RCX: ffff8101f78c0260
Feb 5 02:48:52 SunSTG kernel: RDX: ffff8101f78c0240 RSI:
0000000000000240 RDI: ffff8101f78af240
Feb 5 02:48:52 SunSTG kernel: RBP: 000000008005003b R08:
ffff8101f78ae240 R09: 0000000000000010
Feb 5 02:48:52 SunSTG kernel: R10: ffff8101f2881ca0 R11:
000000008005003b R12: ffff8101f2881bc8
Feb 5 02:48:52 SunSTG kernel: R13: ffffffff8107e9d7 R14:
ffff8101f2881b40 R15: ffff8103fd858970
Feb 5 02:48:52 SunSTG kernel: FS: 00007fdaeea3a6e0(0000)
GS:ffff8103ff039700(0000) knlGS:0000000000000000
Feb 5 02:48:52 SunSTG kernel: CS: 0010 DS: 0018 ES: 0018 CR0:
0000000080050033
Feb 5 02:48:52 SunSTG kernel: CR2: 00007fadb1a4d170 CR3:
00000003f996e000 CR4: 00000000000006e0
Feb 5 02:48:52 SunSTG kernel: DR0: 0000000000000000 DR1:
0000000000000000 DR2: 0000000000000000
Feb 5 02:48:52 SunSTG kernel: DR3: 0000000000000000 DR6:
00000000ffff0ff0 DR7: 0000000000000400
Feb 5 02:48:52 SunSTG kernel:
Feb 5 02:48:52 SunSTG kernel: Call Trace:
Feb 5 02:48:52 SunSTG kernel:
[<ffffffffa0299362>] ? :raid456:raid6_sse24_gen_syndrome+0x7e/0x210
Feb 5 02:48:52 SunSTG kernel:
[<ffffffffa0295f6a>] ? :raid456:compute_parity6+0x24f/0x2e2
Feb 5 02:48:52 SunSTG kernel:
[<ffffffffa0296146>] ? :raid456:compute_block_1+0x149/0x1b2
Feb 5 02:48:52 SunSTG kernel:
[<ffffffffa0296d8f>] ? :raid456:handle_stripe+0x9eb/0xf1b
Feb 5 02:48:52 SunSTG kernel: [<ffffffffa029769d>] ? :raid456:raid5d
+0x3de/0x3ee
Feb 5 02:48:52 SunSTG kernel: [<ffffffff81297b98>] ? schedule_timeout
+0x22/0xb4
Feb 5 02:48:52 SunSTG kernel: [<ffffffff811f687a>] ? md_thread
+0xd6/0xee
Feb 5 02:48:52 SunSTG kernel: [<ffffffff810492dc>] ?
autoremove_wake_function+0x0/0x38
Feb 5 02:48:52 SunSTG kernel: [<ffffffff811f67a4>] ? md_thread
+0x0/0xee
Feb 5 02:48:52 SunSTG kernel: [<ffffffff810491a5>] ? kthread+0x49/0x78
Feb 5 02:48:52 SunSTG kernel: [<ffffffff8100d188>] ? child_rip
+0xa/0x12
Feb 5 02:48:52 SunSTG kernel: [<ffffffff8104915c>] ? kthread+0x0/0x78
Feb 5 02:48:52 SunSTG kernel: [<ffffffff8100d17e>] ? child_rip
+0x0/0x12
Feb 5 02:48:52 SunSTG kernel:
Feb 5 02:49:55 SunSTG kernel: BUG: soft lockup - CPU#3 stuck for 61s!
[md4_raid5:13198]
Feb 5 02:49:55 SunSTG kernel: Modules linked in: raid456 async_xor
async_memcpy async_tx xor autofs4 hidp rfcomm l2cap bluetooth sunrpc
dm_mirror dm_log dm_multipath dm_mod wmi video output sbs sbshc battery
ac ipv6 parport_pc lp parport sr_mod cdrom joydev sg e1000 serio_raw
pata_amd pata_acpi i2c_amd8111 i2c_amd756 pcspkr ata_generic i2c_core
shpchp k8temp amd_rng hwmon usb_storage sata_mv libata sd_mod scsi_mod
raid1 ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded:
freq_table]
Feb 5 02:49:55 SunSTG kernel: CPU 3:
Feb 5 02:49:55 SunSTG kernel: Modules linked in: raid456 async_xor
async_memcpy async_tx xor autofs4 hidp rfcomm l2cap bluetooth sunrpc
dm_mirror dm_log dm_multipath dm_mod wmi video output sbs sbshc battery
ac ipv6 parport_pc lp parport sr_mod cdrom joydev sg e1000 serio_raw
pata_amd pata_acpi i2c_amd8111 i2c_amd756 pcspkr ata_generic i2c_core
shpchp k8temp amd_rng hwmon usb_storage sata_mv libata sd_mod scsi_mod
raid1 ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded:
freq_table]
Feb 5 02:49:55 SunSTG kernel: Pid: 13198, comm: md4_raid5 Not tainted
2.6.26.8-57.fc8 #1
Feb 5 02:49:55 SunSTG kernel: RIP: 0010:[<ffffffffa008ced4>]
[<ffffffffa008ced4>] :libata:__ata_qc_complete+0x53/0xb1
Feb 5 02:49:55 SunSTG kernel: RSP: 0018:ffff8101ff10bdf8 EFLAGS:
00000246
Feb 5 02:49:55 SunSTG kernel: RAX: 0000000000000000 RBX:
ffff8101ff10be18 RCX: 0000000000000000
Feb 5 02:49:55 SunSTG kernel: RDX: 0000000000001000 RSI:
ffffffffa006031d RDI: ffff8101f2838140
Feb 5 02:49:55 SunSTG kernel: RBP: ffff8101ff10bd70 R08:
0000000000000202 R09: 0000000000000008
Feb 5 02:49:55 SunSTG kernel: R10: 0000000000000002 R11:
000000008005003b R12: ffffffff8100cf52
Feb 5 02:49:55 SunSTG kernel: R13: ffff8101ff10bd70 R14:
ffff8101fe290000 R15: ffff8101fe2900d0
Feb 5 02:49:55 SunSTG kernel: FS: 00007fdaeea3a6e0(0000)
GS:ffff8103ff039700(0000) knlGS:0000000000000000
Feb 5 02:49:56 SunSTG kernel: CS: 0010 DS: 0018 ES: 0018 CR0:
0000000080050033
Feb 5 02:49:56 SunSTG kernel: CR2: 00007fadb1a4d170 CR3:
00000003f996e000 CR4: 00000000000006e0
Feb 5 02:49:56 SunSTG kernel: DR0: 0000000000000000 DR1:
0000000000000000 DR2: 0000000000000000
Feb 5 02:49:56 SunSTG kernel: DR3: 0000000000000000 DR6:
00000000ffff0ff0 DR7: 0000000000000400
Feb 5 02:49:56 SunSTG kernel:
Feb 5 02:49:56 SunSTG kernel: Call Trace:
Feb 5 02:49:56 SunSTG kernel: <IRQ>
[<ffffffffa008cec0>] ? :libata:__ata_qc_complete+0x3f/0xb1
Feb 5 02:49:56 SunSTG kernel:
[<ffffffffa008de32>] ? :libata:ata_qc_complete+0x12f/0x143
Feb 5 02:49:56 SunSTG kernel:
[<ffffffffa00b4554>] ? :sata_mv:mv_process_crpb_entries+0x14f/0x15e
Feb 5 02:49:56 SunSTG kernel:
[<ffffffffa00b4fdf>] ? :sata_mv:mv_interrupt+0x28b/0x64b
Feb 5 02:49:56 SunSTG kernel: [<ffffffff8112d332>] ? blk_done_softirq
+0x71/0x80
Feb 5 02:49:56 SunSTG kernel: [<ffffffff81075689>] ? handle_IRQ_event
+0x2e/0x65
Feb 5 02:49:56 SunSTG kernel: [<ffffffff81076cce>] ?
handle_fasteoi_irq+0x95/0xd0
Feb 5 02:49:56 SunSTG kernel: [<ffffffff8100f00f>] ? do_IRQ+0xf7/0x16c
Feb 5 02:49:56 SunSTG kernel: [<ffffffff8100c6cd>] ? ret_from_intr
+0x0/0x19
Feb 5 02:49:56 SunSTG kernel: <EOI>
[<ffffffffa027e82a>] ? :xor:xor_sse_5+0x97/0x3d7
Feb 5 02:49:56 SunSTG kernel: [<ffffffffa027ebd3>] ? :xor:xor_blocks
+0x69/0x6b
Feb 5 02:49:56 SunSTG kernel:
[<ffffffffa0296146>] ? :raid456:compute_block_1+0x149/0x1b2
Feb 5 02:49:56 SunSTG kernel:
[<ffffffffa0296c8c>] ? :raid456:handle_stripe+0x8e8/0xf1b
Feb 5 02:49:56 SunSTG kernel: [<ffffffffa029769d>] ? :raid456:raid5d
+0x3de/0x3ee
Feb 5 02:49:56 SunSTG kernel: [<ffffffff81297b98>] ? schedule_timeout
+0x22/0xb4
Feb 5 02:49:56 SunSTG kernel: [<ffffffff811f687a>] ? md_thread
+0xd6/0xee
Feb 5 02:49:56 SunSTG kernel: [<ffffffff810492dc>] ?
autoremove_wake_function+0x0/0x38
Feb 5 02:49:56 SunSTG kernel: [<ffffffff811f67a4>] ? md_thread
+0x0/0xee
Feb 5 02:49:56 SunSTG kernel: [<ffffffff810491a5>] ? kthread+0x49/0x78
Feb 5 02:49:56 SunSTG kernel: [<ffffffff8100d188>] ? child_rip
+0xa/0x12
Feb 5 02:49:56 SunSTG kernel: [<ffffffff8104915c>] ? kthread+0x0/0x78
Feb 5 02:49:57 SunSTG kernel: [<ffffffff8100d17e>] ? child_rip
+0x0/0x12
Feb 5 02:49:57 SunSTG kernel:
Feb 5 02:51:01 SunSTG kernel: BUG: soft lockup - CPU#3 stuck for 61s!
[md4_raid5:13198]
Feb 5 02:51:01 SunSTG kernel: Modules linked in: raid456 async_xor
async_memcpy async_tx xor autofs4 hidp rfcomm l2cap bluetooth sunrpc
dm_mirror dm_log dm_multipath dm_mod wmi video output sbs sbshc battery
ac ipv6 parport_pc lp parport sr_mod cdrom joydev sg e1000 serio_raw
pata_amd pata_acpi i2c_amd8111 i2c_amd756 pcspkr ata_generic i2c_core
shpchp k8temp amd_rng hwmon usb_storage sata_mv libata sd_mod scsi_mod
raid1 ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded:
freq_table]
Feb 5 02:51:01 SunSTG kernel: CPU 3:
Feb 5 02:51:01 SunSTG kernel: Modules linked in: raid456 async_xor
async_memcpy async_tx xor autofs4 hidp rfcomm l2cap bluetooth sunrpc
dm_mirror dm_log dm_multipath dm_mod wmi video output sbs sbshc battery
ac ipv6 parport_pc lp parport sr_mod cdrom joydev sg e1000 serio_raw
pata_amd pata_acpi i2c_amd8111 i2c_amd756 pcspkr ata_generic i2c_core
shpchp k8temp amd_rng hwmon usb_storage sata_mv libata sd_mod scsi_mod
raid1 ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded:
freq_table]
Feb 5 02:51:01 SunSTG kernel: Pid: 13198, comm: md4_raid5 Not tainted
2.6.26.8-57.fc8 #1
Feb 5 02:51:01 SunSTG kernel: RIP: 0010:[<ffffffffa0061d58>]
[<ffffffffa0061d58>] :scsi_mod:scsi_softirq_done+0xf4/0xfa
Feb 5 02:51:01 SunSTG kernel: RSP: 0018:ffff8101ff10bed0 EFLAGS:
00000282
Feb 5 02:51:01 SunSTG kernel: RAX: 0000000000000000 RBX:
ffff8101ff10bee0 RCX: ffff8101ff10bca0
Feb 5 02:51:01 SunSTG kernel: RDX: 0000000000000000 RSI:
ffff8101ff0e31e0 RDI: ffff8101fe3de218
Feb 5 02:51:01 SunSTG kernel: RBP: ffff8101ff10be50 R08:
0000000000000001 R09: 00000000ef6e7000
Feb 5 02:51:01 SunSTG kernel: R10: 0000000000001000 R11:
0000000000002002 R12: ffffffff8100cf52
Feb 5 02:51:01 SunSTG kernel: R13: ffff8101ff10be50 R14:
ffffffff8141b140 R15: 0000000000000001
Feb 5 02:51:01 SunSTG kernel: FS: 00007fdaeea3a6e0(0000)
GS:ffff8103ff039700(0000) knlGS:0000000000000000
Feb 5 02:51:01 SunSTG kernel: CS: 0010 DS: 0018 ES: 0018 CR0:
0000000080050033
Feb 5 02:51:01 SunSTG kernel: CR2: 00007fadb1a4d170 CR3:
00000003f996e000 CR4: 00000000000006e0
Feb 5 02:51:01 SunSTG kernel: DR0: 0000000000000000 DR1:
0000000000000000 DR2: 0000000000000000
Feb 5 02:51:01 SunSTG kernel: DR3: 0000000000000000 DR6:
00000000ffff0ff0 DR7: 0000000000000400
Feb 5 02:51:01 SunSTG kernel:
Feb 5 02:51:01 SunSTG kernel: Call Trace:
Feb 5 02:51:01 SunSTG kernel: <IRQ> [<ffffffff8112d332>] ?
blk_done_softirq+0x71/0x80
Feb 5 02:51:01 SunSTG kernel: [<ffffffff8103b31f>] ? __do_softirq
+0x5e/0xd5
Feb 5 02:51:01 SunSTG kernel: [<ffffffff8100d52c>] ? call_softirq
+0x1c/0x28
Feb 5 02:51:01 SunSTG kernel: [<ffffffff8100ed5e>] ? do_softirq
+0x44/0x8b
Feb 5 02:51:01 SunSTG kernel: [<ffffffff8103b280>] ? irq_exit
+0x3f/0x80
Feb 5 02:51:01 SunSTG kernel: [<ffffffff8100f05f>] ? do_IRQ
+0x147/0x16c
Feb 5 02:51:01 SunSTG kernel: [<ffffffff8100c6cd>] ? ret_from_intr
+0x0/0x19
Feb 5 02:51:01 SunSTG kernel: <EOI>
[<ffffffffa029940b>] ? :raid456:raid6_sse24_gen_syndrome+0x127/0x210
Feb 5 02:51:02 SunSTG kernel:
[<ffffffffa0299362>] ? :raid456:raid6_sse24_gen_syndrome+0x7e/0x210
Feb 5 02:51:02 SunSTG kernel:
[<ffffffffa0295f6a>] ? :raid456:compute_parity6+0x24f/0x2e2
Feb 5 02:51:02 SunSTG kernel:
[<ffffffffa0296146>] ? :raid456:compute_block_1+0x149/0x1b2
Feb 5 02:51:02 SunSTG kernel:
[<ffffffffa0296d8f>] ? :raid456:handle_stripe+0x9eb/0xf1b
Feb 5 02:51:02 SunSTG kernel: [<ffffffffa029769d>] ? :raid456:raid5d
+0x3de/0x3ee
Feb 5 02:51:02 SunSTG kernel: [<ffffffff81297b98>] ? schedule_timeout
+0x22/0xb4
Feb 5 02:51:02 SunSTG kernel: [<ffffffff811f687a>] ? md_thread
+0xd6/0xee
Feb 5 02:51:02 SunSTG kernel: [<ffffffff810492dc>] ?
autoremove_wake_function+0x0/0x38
Feb 5 02:51:02 SunSTG kernel: [<ffffffff811f67a4>] ? md_thread
+0x0/0xee
Feb 5 02:51:02 SunSTG kernel: [<ffffffff810491a5>] ? kthread+0x49/0x78
Feb 5 02:51:02 SunSTG kernel: [<ffffffff8100d188>] ? child_rip
+0xa/0x12
Feb 5 02:51:02 SunSTG kernel: [<ffffffff8104915c>] ? kthread+0x0/0x78
Feb 5 02:51:02 SunSTG kernel: [<ffffffff8100d17e>] ? child_rip
+0x0/0x12
Feb 5 02:51:02 SunSTG kernel:
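For anyone skimming traces like the ones above, a short script can tally the reports by task and CPU. This is only a sketch: the function name summarize_lockups and the regular expression are illustrative assumptions rather than anything used in this thread, and the pattern simply mirrors the "BUG: soft lockup" lines as they appear in these logs.

```python
import re
from collections import Counter

# Matches e.g. "BUG: soft lockup - CPU#3 stuck for 61s! [md4_raid5:13198]".
# [\s>]* also spans a line break and a "> " quote marker, because the task
# name is wrapped onto the following line in mailed copies of the log.
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(\d+) stuck for (\d+)s![\s>]*\[([^:\]]+):(\d+)\]"
)


def summarize_lockups(log_text):
    """Count soft lockup reports per (task name, CPU number)."""
    counts = Counter()
    for cpu, _seconds, task, _pid in LOCKUP_RE.findall(log_text):
        counts[(task, int(cpu))] += 1
    return counts
```

Passing the contents of /var/log/messages to summarize_lockups() would give a count per (task, CPU) pair, which makes it easier to see whether the reports follow one md thread or one CPU.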
On Wed, 2009-01-28 at 22:30 +0200, Vladimir Ivashchenko wrote:
> Hi,
>
> We've got these new Sun X4500 servers. The system I'm playing with now
> has 48 x 250 GB SATA HDDs.
>
> Right now I'm creating two RAID6 arrays, 24 and 22 drives each:
>
> mdadm --verbose --create /dev/md3 --level=6
> --raid-devices=24 /dev/sda /dev/sdaa /dev/sdab /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai /dev/sdaj /dev/sdak /dev/sdal /dev/sdam /dev/sdan /dev/sdao /dev/sdap /dev/sdaq /dev/sdar /dev/sdas /dev/sdat /dev/sdau /dev/sdav /dev/sdb /dev/sdc
>
> mdadm --verbose --create /dev/md4 --level=6
> --raid-devices=22 /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx /dev/sdz
>
> mdadm --detail is reporting that everything is going smoothly, however
> my /var/log/messages is full of "BUG: soft lockup - CPU#X stuck for
> 10s!" errors appearing every 1-3 minutes.
>
> CentOS 5.2, 2.6.18-92.1.22.el5PAE, sata_mv. Two dual-core Opterons @ 2.8
> GHz, 16 GB RAM.
>
> The system does not crash and otherwise seems to be healthy. Arrays are
> still under construction and I don't know if they will actually work
> yet.
>
> What I noticed is that at first it was complaining about lockups in the md3
> process, but once I started creating md4, the complaints were exclusively
> about the md4 process.
>
> Any stability assurances or workarounds are highly appreciated. :)
>
> Jan 28 21:31:32 SunSTG kernel: BUG: soft lockup - CPU#0 stuck for 10s!
> [md3_raid5:5672]
> Jan 28 21:31:32 SunSTG kernel:
> Jan 28 21:31:32 SunSTG kernel: Pid: 5672, comm: md3_raid5
> Jan 28 21:31:32 SunSTG kernel: EIP: 0060:[<f8d68162>] CPU: 0
> Jan 28 21:31:32 SunSTG kernel: EIP is at raid6_sse22_gen_syndrome
> +0x10a/0x1b6 [raid456]
> Jan 28 21:31:32 SunSTG kernel: EFLAGS: 00000202 Not tainted
> (2.6.18-92.1.22.el5PAE #1)
> Jan 28 21:31:32 SunSTG kernel: EAX: ea0774e0 EBX: 000004e0 ECX: ead0ad30
> EDX: ea077000
> Jan 28 21:31:32 SunSTG kernel: ESI: ead0ade0 EDI: 00000004 EBP: ead0add0
> DS: 007b ES: 007b
> Jan 28 21:31:32 SunSTG kernel: CR0: 80050033 CR2: 0806e000 CR3: 373239e0
> CR4: 000006f0
> Jan 28 21:31:32 SunSTG kernel: [<f8d63562>] compute_parity6+0x21c/0x28a
> [raid456]
> Jan 28 21:31:32 SunSTG kernel: [<f8d6452e>] handle_stripe+0xc8b/0x215e
> [raid456]
> Jan 28 21:31:32 SunSTG kernel: [<c041fdb3>] enqueue_task+0x29/0x39
> Jan 28 21:31:32 SunSTG kernel: [<c0420629>] try_to_wake_up+0x371/0x37b
> Jan 28 21:31:32 SunSTG kernel: [<c041edec>] __wake_up_common+0x2f/0x53
> Jan 28 21:31:32 SunSTG kernel: [<c041fbe6>] __wake_up+0x2a/0x3d
> Jan 28 21:31:32 SunSTG kernel: [<f8d61744>] release_stripe+0x21/0x2e
> [raid456]
> Jan 28 21:31:33 SunSTG kernel: [<f8d65b0c>] raid5d+0x10b/0x130
> [raid456]
> Jan 28 21:31:33 SunSTG kernel: [<c059aca8>] md_thread+0xdf/0xf5
> Jan 28 21:31:33 SunSTG kernel: [<c0436347>] autoremove_wake_function
> +0x0/0x2d
> Jan 28 21:31:33 SunSTG kernel: [<c059abc9>] md_thread+0x0/0xf5
> Jan 28 21:31:33 SunSTG kernel: [<c0436285>] kthread+0xc0/0xeb
> Jan 28 21:31:33 SunSTG kernel: [<c04361c5>] kthread+0x0/0xeb
> Jan 28 21:31:33 SunSTG kernel: [<c0405c3b>] kernel_thread_helper
> +0x7/0x10
>
> Jan 28 21:31:33 SunSTG kernel: =======================
> Jan 28 21:32:26 SunSTG kernel: BUG: soft lockup - CPU#2 stuck for 10s!
> [md3_raid5:5672]
> Jan 28 21:32:26 SunSTG kernel:
> Jan 28 21:32:26 SunSTG kernel: Pid: 5672, comm: md3_raid5
> Jan 28 21:32:26 SunSTG kernel: EIP: 0060:[<f8d68170>] CPU: 2
> Jan 28 21:32:26 SunSTG kernel: EIP is at raid6_sse22_gen_syndrome
> +0x118/0x1b6 [raid456]
> Jan 28 21:32:26 SunSTG kernel: EFLAGS: 00000202 Not tainted
> (2.6.18-92.1.22.el5PAE #1)
> Jan 28 21:32:26 SunSTG kernel: EAX: ea784040 EBX: 00000040 ECX: ead0ad30
> EDX: ea784000
> Jan 28 21:32:26 SunSTG kernel: ESI: ead0adf0 EDI: 00000008 EBP: ead0add0
> DS: 007b ES: 007b
> Jan 28 21:32:26 SunSTG kernel: CR0: 80050033 CR2: b7f6f000 CR3: 3714e920
> CR4: 000006f0
> Jan 28 21:32:26 SunSTG kernel: [<f8d63562>] compute_parity6+0x21c/0x28a
> [raid456]
> Jan 28 21:32:26 SunSTG kernel: [<f8d6452e>] handle_stripe+0xc8b/0x215e
> [raid456]
> Jan 28 21:32:26 SunSTG kernel: [<c041f34b>] find_busiest_group
> +0x177/0x462
> Jan 28 21:32:26 SunSTG kernel: [<c041fc53>] task_rq_lock+0x31/0x58
> Jan 28 21:32:26 SunSTG kernel: [<c0420629>] try_to_wake_up+0x371/0x37b
> Jan 28 21:32:26 SunSTG kernel: [<f8d6171e>] __release_stripe+0xfc/0x101
> [raid456]
> Jan 28 21:32:26 SunSTG kernel: [<f8d61744>] release_stripe+0x21/0x2e
> [raid456]
> Jan 28 21:32:26 SunSTG kernel: [<f8d65b0c>] raid5d+0x10b/0x130
> [raid456]
> Jan 28 21:32:26 SunSTG kernel: [<c059aca8>] md_thread+0xdf/0xf5
> Jan 28 21:32:26 SunSTG kernel: [<c0436347>] autoremove_wake_function
> +0x0/0x2d
> Jan 28 21:32:26 SunSTG kernel: [<c059abc9>] md_thread+0x0/0xf5
> Jan 28 21:32:26 SunSTG kernel: [<c0436285>] kthread+0xc0/0xeb
> Jan 28 21:32:26 SunSTG kernel: [<c04361c5>] kthread+0x0/0xeb
> Jan 28 21:32:26 SunSTG kernel: [<c0405c3b>] kernel_thread_helper
> +0x7/0x10
> Jan 28 21:32:26 SunSTG kernel: =======================
>
> <somewhere here I issue commands to create md4>
>
> Jan 28 21:32:43 SunSTG kernel: md: syncing RAID array md4
> Jan 28 21:32:43 SunSTG kernel: md: minimum _guaranteed_ reconstruction
> speed: 1000 KB/sec/disc.
> Jan 28 21:32:43 SunSTG kernel: md: using maximum available idle IO
> bandwidth (but not more than 200000 KB/sec) for reconstruction.
> Jan 28 21:32:43 SunSTG kernel: md: using 128k window, over a total of
> 244195200 blocks.
> Jan 28 21:33:20 SunSTG kernel: BUG: soft lockup - CPU#3 stuck for 10s!
> [md4_raid5:5694]
> Jan 28 21:33:20 SunSTG kernel:
> Jan 28 21:33:20 SunSTG kernel: Pid: 5694, comm: md4_raid5
> Jan 28 21:33:20 SunSTG kernel: EIP: 0060:[<f8d63aff>] CPU: 3
> Jan 28 21:33:20 SunSTG kernel: EIP is at handle_stripe+0x25c/0x215e
> [raid456]
> Jan 28 21:33:20 SunSTG kernel: EFLAGS: 00000282 Not tainted
> (2.6.18-92.1.22.el5PAE #1)
> Jan 28 21:33:20 SunSTG kernel: EAX: f6a2b404 EBX: 00000001 ECX: f53d17c0
> EDX: e8c532c0
> Jan 28 21:33:20 SunSTG kernel: ESI: e8c532c4 EDI: 00000016 EBP: e8c52b64
> DS: 007b ES: 007b
> Jan 28 21:33:20 SunSTG kernel: CR0: 8005003b CR2: b7cfc000 CR3: 3714ef00
> CR4: 000006f0
> Jan 28 21:33:20 SunSTG kernel: [<c041f34b>] find_busiest_group
> +0x177/0x462
> Jan 28 21:33:20 SunSTG kernel: [<c041fc53>] task_rq_lock+0x31/0x58
> Jan 28 21:33:20 SunSTG kernel: [<c041fdb3>] enqueue_task+0x29/0x39
> Jan 28 21:33:20 SunSTG kernel: [<c0420629>] try_to_wake_up+0x371/0x37b
> Jan 28 21:33:20 SunSTG kernel: [<c041edec>] __wake_up_common+0x2f/0x53
> Jan 28 21:33:20 SunSTG kernel: [<c041fbe6>] __wake_up+0x2a/0x3d
> Jan 28 21:33:20 SunSTG kernel: [<f8d61744>] release_stripe+0x21/0x2e
> [raid456]
> Jan 28 21:33:20 SunSTG kernel: [<f8d65b0c>] raid5d+0x10b/0x130
> [raid456]
> Jan 28 21:33:20 SunSTG kernel: [<c059aca8>] md_thread+0xdf/0xf5
> Jan 28 21:33:20 SunSTG kernel: [<c0436347>] autoremove_wake_function
> +0x0/0x2d
> Jan 28 21:33:20 SunSTG kernel: [<c059abc9>] md_thread+0x0/0xf5
> Jan 28 21:33:21 SunSTG kernel: [<c0436285>] kthread+0xc0/0xeb
> Jan 28 21:33:21 SunSTG kernel: [<c04361c5>] kthread+0x0/0xeb
> Jan 28 21:33:21 SunSTG kernel: [<c0405c3b>] kernel_thread_helper
> +0x7/0x10
> Jan 28 21:33:21 SunSTG kernel: =======================
> Jan 28 21:33:50 SunSTG kernel: BUG: soft lockup - CPU#3 stuck for 10s!
> [md4_raid5:5694]
> Jan 28 21:33:50 SunSTG kernel:
> Jan 28 21:33:50 SunSTG kernel: Pid: 5694, comm: md4_raid5
> Jan 28 21:33:50 SunSTG kernel: EIP: 0060:[<f8bf9813>] CPU: 3
> Jan 28 21:33:50 SunSTG kernel: EIP is at xor_sse_5+0xa0/0x3b5 [xor]
> Jan 28 21:33:50 SunSTG kernel: EFLAGS: 00000202 Not tainted
> (2.6.18-92.1.22.el5PAE #1)
> Jan 28 21:33:50 SunSTG kernel: EAX: 0000000b EBX: e8e66500 ECX: e8e69500
> EDX: e8e6e500
> Jan 28 21:33:50 SunSTG kernel: ESI: e8e67500 EDI: e8e68500 EBP: e96b5dd4
> DS: 007b ES: 007b
> Jan 28 21:33:50 SunSTG kernel: CR0: 80050033 CR2: b7cfc000 CR3: 3714ef00
> CR4: 000006f0
> Jan 28 21:33:50 SunSTG kernel: [<f8bfa200>] xor_block+0x74/0x7d [xor]
> Jan 28 21:33:50 SunSTG kernel: [<f8d636b3>] compute_block_1+0xe3/0x13a
> [raid456]
> Jan 28 21:33:50 SunSTG kernel: [<f8d644ba>] handle_stripe+0xc17/0x215e
> [raid456]
> Jan 28 21:33:50 SunSTG kernel: [<c041f34b>] find_busiest_group
> +0x177/0x462
> Jan 28 21:33:50 SunSTG kernel: [<c041fdb3>] enqueue_task+0x29/0x39
> Jan 28 21:33:50 SunSTG kernel: [<c0420629>] try_to_wake_up+0x371/0x37b
> Jan 28 21:33:50 SunSTG kernel: [<c041edec>] __wake_up_common+0x2f/0x53
> Jan 28 21:33:50 SunSTG kernel: [<c041fbe6>] __wake_up+0x2a/0x3d
> Jan 28 21:33:50 SunSTG kernel: [<f8d61744>] release_stripe+0x21/0x2e
> [raid456]
> Jan 28 21:33:50 SunSTG kernel: [<f8d65b0c>] raid5d+0x10b/0x130
> [raid456]
> Jan 28 21:33:50 SunSTG kernel: [<c059aca8>] md_thread+0xdf/0xf5
> Jan 28 21:33:50 SunSTG kernel: [<c0436347>] autoremove_wake_function
> +0x0/0x2d
> Jan 28 21:33:50 SunSTG kernel: [<c059abc9>] md_thread+0x0/0xf5
> Jan 28 21:33:51 SunSTG kernel: [<c0436285>] kthread+0xc0/0xeb
> Jan 28 21:33:51 SunSTG kernel: [<c04361c5>] kthread+0x0/0xeb
> Jan 28 21:33:51 SunSTG kernel: [<c0405c3b>] kernel_thread_helper
> +0x7/0x10
> Jan 28 21:33:51 SunSTG kernel: =======================
> ... and it goes on complaining about md4_raid5:5694.
>
> [root@SunSTG ~]# mdadm --detail /dev/md3
> /dev/md3:
> Version : 00.90.03
> Creation Time : Wed Jan 28 21:30:50 2009
> Raid Level : raid6
> Array Size : 5372294400 (5123.42 GiB 5501.23 GB)
> Used Dev Size : 244195200 (232.88 GiB 250.06 GB)
> Raid Devices : 24
> Total Devices : 24
> Preferred Minor : 3
> Persistence : Superblock is persistent
>
> Update Time : Wed Jan 28 21:30:50 2009
> State : clean, resyncing
> Active Devices : 24
> Working Devices : 24
> Failed Devices : 0
> Spare Devices : 0
>
> Chunk Size : 64K
>
> Rebuild Status : 15% complete
>
> UUID : d8c2b5ce:576a117b:f2494cd1:626a774c
> Events : 0.1
>
> Number  Major  Minor  RaidDevice  State
>      0      8      0           0  active sync  /dev/sda
>      1     65    160           1  active sync  /dev/sdaa
>      2     65    176           2  active sync  /dev/sdab
>      3     65    208           3  active sync  /dev/sdad
>      4     65    224           4  active sync  /dev/sdae
>      5     65    240           5  active sync  /dev/sdaf
>      6     66      0           6  active sync  /dev/sdag
>      7     66     16           7  active sync  /dev/sdah
>      8     66     32           8  active sync  /dev/sdai
>      9     66     48           9  active sync  /dev/sdaj
>     10     66     64          10  active sync  /dev/sdak
>     11     66     80          11  active sync  /dev/sdal
>     12     66     96          12  active sync  /dev/sdam
>     13     66    112          13  active sync  /dev/sdan
>     14     66    128          14  active sync  /dev/sdao
>     15     66    144          15  active sync  /dev/sdap
>     16     66    160          16  active sync  /dev/sdaq
>     17     66    176          17  active sync  /dev/sdar
>     18     66    192          18  active sync  /dev/sdas
>     19     66    208          19  active sync  /dev/sdat
>     20     66    224          20  active sync  /dev/sdau
>     21     66    240          21  active sync  /dev/sdav
>     22      8     16          22  active sync  /dev/sdb
>     23      8     32          23  active sync  /dev/sdc
> [root@SunSTG ~]# mdadm --detail /dev/md4
> /dev/md4:
> Version : 00.90.03
> Creation Time : Wed Jan 28 21:32:39 2009
> Raid Level : raid6
> Array Size : 4883904000 (4657.65 GiB 5001.12 GB)
> Used Dev Size : 244195200 (232.88 GiB 250.06 GB)
> Raid Devices : 22
> Total Devices : 22
> Preferred Minor : 4
> Persistence : Superblock is persistent
>
> Update Time : Wed Jan 28 21:32:39 2009
> State : clean, resyncing
> Active Devices : 22
> Working Devices : 22
> Failed Devices : 0
> Spare Devices : 0
>
> Chunk Size : 64K
>
> Rebuild Status : 17% complete
>
> UUID : 7e2c7f35:f51c9047:40130c15:63a7cfa6
> Events : 0.1
>
> Number  Major  Minor  RaidDevice  State
>      0      8     48           0  active sync  /dev/sdd
>      1      8     64           1  active sync  /dev/sde
>      2      8     80           2  active sync  /dev/sdf
>      3      8     96           3  active sync  /dev/sdg
>      4      8    112           4  active sync  /dev/sdh
>      5      8    128           5  active sync  /dev/sdi
>      6      8    144           6  active sync  /dev/sdj
>      7      8    160           7  active sync  /dev/sdk
>      8      8    176           8  active sync  /dev/sdl
>      9      8    192           9  active sync  /dev/sdm
>     10      8    208          10  active sync  /dev/sdn
>     11      8    224          11  active sync  /dev/sdo
>     12      8    240          12  active sync  /dev/sdp
>     13     65      0          13  active sync  /dev/sdq
>     14     65     16          14  active sync  /dev/sdr
>     15     65     32          15  active sync  /dev/sds
>     16     65     48          16  active sync  /dev/sdt
>     17     65     64          17  active sync  /dev/sdu
>     18     65     80          18  active sync  /dev/sdv
>     19     65     96          19  active sync  /dev/sdw
>     20     65    112          20  active sync  /dev/sdx
>     21     65    144          21  active sync  /dev/sdz
>
>
--
Best Regards,
Vladimir Ivashchenko
Chief Technology Officer
PrimeTel PLC, Cyprus - www.prime-tel.com
Tel: +357 25 100100 Fax: +357 2210 2211
* Re: sun x4500 soft lockup during raid creation
2009-02-05 16:10 ` Vladimir Ivashchenko
@ 2009-02-20 18:57 ` Vladimir Ivashchenko
0 siblings, 0 replies; 13+ messages in thread
From: Vladimir Ivashchenko @ 2009-02-20 18:57 UTC (permalink / raw)
To: linux-raid; +Cc: Mark Lord
Hi All,
Final update. I contacted Mark Lord, who gave me one patch for
2.6.29 and advised that hotplug should be stable. In my tests so far,
it is indeed stable. We removed and added HDDs on our Sun X4500 during I/O
and seek activity without any errors or crashes.
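For anyone repeating a hotplug test like this, one way to confirm that no array lost a member is to compare the [expected/active] device counts reported in /proc/mdstat. The following is a minimal sketch, not something I ran as part of these tests: the function name degraded_arrays is hypothetical, and it assumes the usual mdstat layout in which the "mdN : ..." line is followed by a line containing "[n/m]".

```python
import re

# Start of an array entry in /proc/mdstat, e.g. "md4 : active raid6 ...".
HEADER_RE = re.compile(r"(md\d+) : ")
# The "[n/m]" field: n devices expected, m devices currently active.
COUNT_RE = re.compile(r"\[(\d+)/(\d+)\]")


def degraded_arrays(mdstat_text):
    """Return {array: (expected, active)} for arrays missing members."""
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        header = HEADER_RE.match(line)
        if header:
            current = header.group(1)
            continue
        counts = COUNT_RE.search(line)
        if current and counts:
            expected, active = int(counts.group(1)), int(counts.group(2))
            if active < expected:
                result[current] = (expected, active)
            # Only the first "[n/m]" after a header belongs to that array.
            current = None
    return result
```

An empty result after pulling and re-inserting a drive (and letting the rebuild finish) would suggest that every array has its full set of members again.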
On Thu, 2009-02-05 at 18:10 +0200, Vladimir Ivashchenko wrote:
> Ok, further updates:
>
> I have installed a 64-bit CentOS5 and put x86_64 2.6.26.8-57.fc8 Fedora
> kernel on it.
>
> The RAID creation was mostly quiet, apart from a few soft lockups as
> described below.
>
> Then we tried inserting and removing an HDD. As expected, it didn't
> fully work, but at least the machine did not crash. The arrays
> were not under any load, though. The disk went from being /dev/sdat
> to /dev/sdax. For some reason mdadm reported the array and the
> disk itself as healthy, but the device entry for the removed hard
> drive #19 was empty, with wrong major/minor numbers.
>
> From what I have read about the sata_mv driver, hotplug is known to be
> problematic, so we're going to try OpenSolaris. However, I have another
> X4500 for a few days, and if any developers would like me to check
> something, I will try to do it.
--
Best Regards,
Vladimir Ivashchenko
Chief Technology Officer
PrimeTel PLC, Cyprus - www.prime-tel.com
Tel: +357 25 100100 Fax: +357 2210 2211
Thread overview: 13+ messages
2009-01-28 20:30 sun x4500 soft lockup during raid creation Vladimir Ivashchenko
2009-01-28 21:33 ` Joe Landman
2009-01-28 21:37 ` Vladimir Ivashchenko
2009-01-28 22:17 ` Richard Scobie
2009-01-28 22:31 ` Bill Davidsen
2009-01-28 22:33 ` Tru Huynh
2009-01-28 23:08 ` Vladimir Ivashchenko
2009-01-30 15:28 ` Bill Davidsen
2009-01-30 19:38 ` Vladimir Ivashchenko
2009-01-30 22:28 ` Keld Jørn Simonsen
2009-01-29 22:54 ` Jody McIntyre
2009-02-05 16:10 ` Vladimir Ivashchenko
2009-02-20 18:57 ` Vladimir Ivashchenko