* cpu stuck when raid5 was in recovery
@ 2013-03-27 7:43 hanguozhong
2013-03-27 15:54 ` Roy Sigurd Karlsbakk
0 siblings, 1 reply; 2+ messages in thread
From: hanguozhong @ 2013-03-27 7:43 UTC (permalink / raw)
To: linux-raid
Hello, everyone:
I created a 16*2T RAID5 array yesterday, just for testing, using kernel 2.6.38.
While the array was in recovery, dmesg showed a lot of "kernel bug" messages.
The output was as follows:
# BUG: soft lockup - CPU#9 stuck for 67s! [kworker/u:11:1193]
Modules linked in: raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx bonding [last unloaded: scsi_wait_scan]
Pid: 1193, comm: kworker/u:11, CPU: 9
r0 : 0x0000000000000244 r1 : 0xfffffe41f1506e80 r2 : 0xfffffe41f1436ec0
r3 : 0xfffffe41f1426ec0 r4 : 0xfffffe41f1416ec0 r5 : 0xfffffe41f1506ea8
r6 : 0xfffffe41f1506ea0 r7 : 0xfffffe41f1506e98 r8 : 0xfffffe41f1506e90
r9 : 0xfffffe41f1506e88 r10: 0xfffffe41f1506eb8 r11: 0xfffffe41f1506eb0
r12: 0xfffffe41f1416eb8 r13: 0xfffffe41f1426eb8 r14: 0x0000000000000000
BUG: soft lockup - CPU#35 stuck for 67s! [kworker/u:16:1198]
Modules linked in: raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx bonding [last unloaded: scsi_wait_scan]
Pid: 1198, comm: kworker/u:16, CPU: 35
r0 : 0x000000000000015c r1 : 0xfffffe41f18aa8c0 r2 : 0xfffffe41f183a8c0
r3 : 0xfffffe41f182a8c0 r4 : 0xfffffe41f181a8c0 r5 : 0xfffffe41f18aa8e8
r6 : 0xfffffe41f18aa8e0 r7 : 0xfffffe41f18aa8d8 r8 : 0xfffffe41f18aa8d0
r9 : 0xfffffe41f18aa8c8 r10: 0xfffffe41f18aa8f8 r11: 0xfffffe41f18aa8f0
r12: 0xfffffe41f183a8c8 r13: 0xfffffe41f181a8d0 r14: 0x483158ac59313149
r15: 0xfffffe41f183a8d0 r16: 0xfffffe41f182a8d0 r17: 0x0000000000000000
r18: 0xef05894a64ff6660 r19: 0x0f23dc5939e9b8ba r20: 0xfffffe41f182a8c8
r21: 0x3bc955b8012dd7f3 r22: 0xe02655135d16deda r23: 0x7cfb03994e2c49d9
r24: 0x5b7fc5d225715c11 r25: 0x26c199892b1172bb r26: 0xc4d0c69121e38d22
r27: 0x85c40e017cf99e36 r28: 0x85c40e017cf99e36 r29: 0xfffffe41f181a8c8
r30: 0x834676b803e35c33 r31: 0xe2115f180af2ff99 r32: 0xfffffe01f63f8c90
r33: 0x000000000000000f r34: 0x0000000000000000 r35: 0x0000000000000000
r36: 0x0000000000000000 r37: 0x0000000000000000 r38: 0xffffffffffffffff
r39: 0xfffffe0000a74348 r40: 0x000000000001f4da r41: 0xfffffe0000a71c80
r42: 0x00000000007d3680 r43: 0x0000000000002740 r44: 0x0000000000000000
r45: 0x1000000000000000 r46: 0x0000000000000000 r47: 0x0000000000000000
r48: 0x0000000000000000 r49: 0x0000000000000000 r50: 0x0000000000000000
r51: 0x0000000000000000 r52: 0xfffffe00008e3c80 tp : 0x000001f4ff950000
sp : 0xfffffe01f36efc60 lr : 0x7dbe5c5b0e602eaa
pc : 0xfffffff710281188 ex1: 1 faultnum: 22
Starting stack dump of tid 1198, pid 1198 (kworker/u:16) on cpu 35 at cycle 899347866419
frame 0: 0xfffffff710281188 xor_32regs_p_4.cold+0x80/0x1f0 [xor] (sp 0xfffffe01f36efc60)
frame 1: 0xfffffff710280e00 xor_blocks.cold+0xc0/0x148 [xor] (sp 0xfffffe01f36efc70)
frame 2: 0xfffffff7102e0238 async_xor.cold+0x238/0x340 [async_xor] (sp 0xfffffe01f36efc80)
frame 3: 0xfffffff7102e0408 async_xor_val.cold+0xc8/0x278 [async_xor] (sp 0xfffffe01f36efcd0)
frame 4: 0xfffffff7103a74d8 __raid_run_ops.cold+0x1180/0x1a78 [raid456] (sp 0xfffffe01f36efd28)
frame 5: 0xfffffff7103a7e48 async_run_ops+0x78/0xa0 [raid456] (sp 0xfffffe01f36efde8)
frame 6: 0xfffffff7000b4b90 async_run_entry_fn+0xd8/0x1f8 (sp 0xfffffe01f36efe08)
frame 7: 0xfffffff7002999e8 process_one_work+0x1e8/0x538 (sp 0xfffffe01f36efe48)
frame 8: 0xfffffff700274f78 worker_thread+0x378/0x898 (sp 0xfffffe01f36efea0)
frame 9: 0xfffffff7000f0530 kthread+0xe0/0xe8 (sp 0xfffffe01f36eff80)
frame 10: 0xfffffff7000bab38 start_kernel_thread+0x18/0x20 (sp 0xfffffe01f36effe8)
Stack dump complete
hrtimer: interrupt took 26799238 ns
r15: 0x0000000000000000 r16: 0x4d03b72156442e8b r17: 0x0000000000000000
r18: 0xe84615a32a3bbb31 r19: 0xfffffe41f1416ea8 r20: 0x0000000000000000
r21: 0xe19a6fbcb2784276 r22: 0x5073bbff19c23f2b r23: 0x0000000000000000
r24: 0xa6d8f2e28cc4eb4c r25: 0x0000000000000000 r26: 0x81b7047f7bf7e509
r27: 0x1f2bb3f9c2f23efe r28: 0x1f2bb3f9c2f23efe r29: 0x4f580806db3001d5
r30: 0x44dfcd3ece07d7cc r31: 0xac780ca019d138c6 r32: 0xfffffe01f63f5890
r33: 0x000000000000000f r34: 0x0000000000000000 r35: 0x0000000000000000
r36: 0x0000000000000000 r37: 0x0000000000000000 r38: 0xffffffffffffffff
r39: 0xfffffe0000a76a88 r40: 0x000000000001f16e r41: 0xfffffe0000a743c0
r42: 0x00000000007c5b80 r43: 0x0000000000002740 r44: 0x0000000000000001
r45: 0x5000000000000000 r46: 0x0000000000000000 r47: 0x0000000000000000
r48: 0x0000000000000000 r49: 0x0000000000000000 r50: 0x0000000000000000
r51: 0x0000000000000000 r52: 0xfffffe00008e3c80 tp : 0x000001f4ff7b0000
sp : 0xfffffe01f373fc60 lr : 0x8b17fa3deee23683
pc : 0xfffffff710281270 ex1: 1 faultnum: 22
Starting stack dump of tid 1193, pid 1193 (kworker/u:11) on cpu 9 at cycle 899797964239
frame 0: 0xfffffff710281270 xor_32regs_p_4.cold+0x168/0x1f0 [xor] (sp 0xfffffe01f373fc60)
frame 1: 0xfffffff710280e00 xor_blocks.cold+0xc0/0x148 [xor] (sp 0xfffffe01f373fc70)
frame 2: 0xfffffff7102e0238 async_xor.cold+0x238/0x340 [async_xor] (sp 0xfffffe01f373fc80)
frame 3: 0xfffffff7102e0408 async_xor_val.cold+0xc8/0x278 [async_xor] (sp 0xfffffe01f373fcd0)
frame 4: 0xfffffff7103a74d8 __raid_run_ops.cold+0x1180/0x1a78 [raid456] (sp 0xfffffe01f373fd28)
frame 5: 0xfffffff7103a7e48 async_run_ops+0x78/0xa0 [raid456] (sp 0xfffffe01f373fde8)
frame 6: 0xfffffff7000b4b90 async_run_entry_fn+0xd8/0x1f8 (sp 0xfffffe01f373fe08)
frame 7: 0xfffffff7002999e8 process_one_work+0x1e8/0x538 (sp 0xfffffe01f373fe48)
frame 8: 0xfffffff700274f78 worker_thread+0x378/0x898 (sp 0xfffffe01f373fea0)
frame 9: 0xfffffff7000f0530 kthread+0xe0/0xe8 (sp 0xfffffe01f373ff80)
frame 10: 0xfffffff7000bab38 start_kernel_thread+0x18/0x20 (sp 0xfffffe01f373ffe8)
Stack dump complete
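For reference, the first line of each soft-lockup report in the dump above carries the CPU number, the stall time in seconds, and the task name and PID. Below is a minimal Python sketch for pulling those fields out of saved dmesg text. The function name `parse_soft_lockups` and the regular expression are illustrative assumptions based only on the two report lines shown here; this is not a general dmesg parser.

```python
import re

# Matches watchdog lines of the form seen in this report, e.g.:
#   BUG: soft lockup - CPU#9 stuck for 67s! [kworker/u:11:1193]
# The pattern is an assumption derived from these two samples only.
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>.+):(?P<pid>\d+)\]"
)

def parse_soft_lockups(log_text):
    """Return one dict per soft-lockup report line found in log_text."""
    reports = []
    for match in SOFT_LOCKUP.finditer(log_text):
        reports.append({
            "cpu": int(match.group("cpu")),
            "seconds": int(match.group("secs")),
            "comm": match.group("comm"),   # task name, may itself contain ':'
            "pid": int(match.group("pid")),
        })
    return reports

# The two report lines copied from the dmesg output above.
sample = """\
BUG: soft lockup - CPU#9 stuck for 67s! [kworker/u:11:1193]
BUG: soft lockup - CPU#35 stuck for 67s! [kworker/u:16:1198]
"""

for report in parse_soft_lockups(sample):
    print(report)
```

Grouping the parsed reports by `cpu` or `comm` is one way to see whether the same workers keep stalling the same cores.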
It seems that CPU 9 was occupied by the process "kworker/u:11", and CPU 35 by the
process "kworker/u:16". I am using a 36-core Tilera CPU, and each core runs at 1.0 GHz. When one of
the cores is stuck, any process bound to that core stops responding until the lockup is resolved.
Is there any way to lower the priority of processes such as "kworker/u:11"? I would like programs bound to
a specific core to be able to respond immediately. Can anyone help me?
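To make "bound to a specific core" concrete, here is a minimal sketch of how a user-space program can pin itself to one CPU on Linux, using Python's `os.sched_setaffinity`. The helper name `pin_to_cpu` and the choice of target CPU are hypothetical. Note the limit of this approach: pinning only restricts where this process may run. It does not by itself keep kernel worker threads off that core, so it is not a fix for the lockups reported above.

```python
import os

def pin_to_cpu(cpu, pid=0):
    """Restrict process `pid` (0 means the calling process) to one CPU.

    Returns the affinity set the kernel reports afterwards.
    """
    os.sched_setaffinity(pid, {cpu})
    return os.sched_getaffinity(pid)

# CPUs this process is currently allowed to run on.
allowed = sorted(os.sched_getaffinity(0))

# Hypothetical choice for illustration: the highest-numbered allowed CPU.
target = allowed[-1]

print(pin_to_cpu(target))
```

A process pinned this way still competes with whatever else is scheduled on the same core, which is the situation described in the message.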
^ permalink raw reply [flat|nested] 2+ messages in thread
* Re: cpu stuck when raid5 was in recovery
2013-03-27 7:43 cpu stuck when raid5 was in recovery hanguozhong
@ 2013-03-27 15:54 ` Roy Sigurd Karlsbakk
0 siblings, 0 replies; 2+ messages in thread
From: Roy Sigurd Karlsbakk @ 2013-03-27 15:54 UTC (permalink / raw)
To: hanguozhong; +Cc: linux-raid
----- Original message -----
> Hello, everyone:
> I Created a 16*2T raid5 array yesterday just for test, the kernel
> 2.6.38 was used.
> And when the array was in recovery, there were lots of "kernel bugs"
> outputs of dmesg.
First of all, 16 drives in a single RAID-5 is like BASE-jumping with a small umbrella, but for a test, hey, go on.
But then, are there any errors from the drives in there?
Vennlige hilsener / Best regards
roy
--
Roy Sigurd Karlsbakk
(+47) 98013356
roy@karlsbakk.net
http://blogg.karlsbakk.net/
GPG Public key: http://karlsbakk.net/roysigurdkarlsbakk.pubkey.txt
--
In all pedagogy it is essential that the curriculum is presented intelligibly. It is an elementary imperative for all pedagogues to avoid excessive use of idioms of xenotypic etymology. In most cases, adequate and relevant synonyms exist in Norwegian.