From: Lee Howard <faxguy@howardsilvan.com>
To: "Majed B." <majedb@gmail.com>
Cc: Steven Haigh <netwiz@crc.id.au>, linux-raid@vger.kernel.org
Subject: Re: BUG: soft lockup - CPU#0 stuck for 10s! [md2_raid1:358]
Date: Tue, 20 Oct 2009 22:24:32 -0700
Message-ID: <4ADE9B10.2030204@howardsilvan.com>
In-Reply-To: <70ed7c3e0910202202y53231834y639db36af6e964db@mail.gmail.com>
I've been deliberately monitoring the kernel via the git web interfaces,
and I can't yet see the patch committed that supposedly fixed this.
(Please correct me if it was actually committed.)
While a single 10s stuck CPU may not be serious, it *is* serious when it
happens over and over and over again consecutively (like it does in my
case).
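To put a number on how often the lockup repeats, a rough sketch along these lines could be run against the system log. The function name count_soft_lockups and the default path /var/log/messages are assumptions for illustration, not something taken from this thread; adjust the path to wherever kernel messages land on your system.

```shell
# Count "BUG: soft lockup" reports in a syslog-style file.
# Assumption: kernel messages are logged to /var/log/messages unless
# another file is passed as the first argument.
count_soft_lockups() {
    logfile="${1:-/var/log/messages}"
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'BUG: soft lockup' "$logfile"
}
```

Calling count_soft_lockups with no argument reads the assumed default log; passing a file name reads that file instead. A count that keeps climbing during a single check run would show the lockups recurring rather than a one-off.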
Thanks,
Lee.
Majed B. wrote:
> And it's not serious.
>
> On Wed, Oct 21, 2009 at 8:01 AM, Majed B. <majedb@gmail.com> wrote:
>
>> Hello,
>>
>> I believe this has been fixed in 2.6.30 or 2.6.31.
>>
>> On Wed, Oct 21, 2009 at 5:46 AM, Steven Haigh <netwiz@crc.id.au> wrote:
>>
>>> When trying to run a check using:
>>> echo check > /sys/block/md2/md/sync_action
>>>
>>> I got the following errors printed to the console:
>>>
>>> Oct 21 13:31:03 wireless kernel: md: syncing RAID array md2
>>> Oct 21 13:31:03 wireless kernel: md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
>>> Oct 21 13:31:03 wireless kernel: md: using maximum available idle IO bandwidth (but not more than 20000 KB/sec) for reconstruction.
>>> Oct 21 13:31:03 wireless kernel: md: using 128k window, over a total of 300511808 blocks.
>>> BUG: soft lockup - CPU#0 stuck for 10s! [md2_raid1:358]
>>>
>>> Pid: 358, comm: md2_raid1
>>> EIP: 0060:[<c04ec1bc>] CPU: 0
>>> EIP is at memcmp+0xd/0x22
>>> EFLAGS: 00000202 Not tainted (2.6.18-164.el5 #1)
>>> EAX: 00000000 EBX: e2826fe0 ECX: d15f3fe0 EDX: 00000000
>>> ESI: 00000020 EDI: 00000090 EBP: f70b8e40 DS: 007b ES: 007b
>>> CR0: 8005003b CR2: 0806af70 CR3: 37872000 CR4: 000006d0
>>> [<f8843c64>] raid1d+0x270/0xbea [raid1]
>>> [<c0616870>] schedule+0x9cc/0xa55
>>> [<c0616f33>] schedule_timeout+0x13/0x8c
>>> [<c05a6b5e>] md_thread+0xdf/0xf5
>>> [<c0434907>] autoremove_wake_function+0x0/0x2d
>>> [<c05a6a7f>] md_thread+0x0/0xf5
>>> [<c0434845>] kthread+0xc0/0xeb
>>> [<c0434785>] kthread+0x0/0xeb
>>> [<c0405c53>] kernel_thread_helper+0x7/0x10
>>> =======================
>>> Oct 21 13:37:50 wireless kernel: BUG: soft lockup - CPU#0 stuck for 10s! [md2_raid1:358]
>>> Oct 21 13:37:50 wireless kernel:
>>> Oct 21 13:37:50 wireless kernel: Pid: 358, comm: md2_raid1
>>> Oct 21 13:37:50 wireless kernel: EIP: 0060:[<c04ec1bc>] CPU: 0
>>> Oct 21 13:37:50 wireless kernel: EIP is at memcmp+0xd/0x22
>>> Oct 21 13:37:50 wireless kernel: EFLAGS: 00000202 Not tainted (2.6.18-164.el5 #1)
>>> Oct 21 13:37:50 wireless kernel: EAX: 00000000 EBX: e2826fe0 ECX: d15f3fe0 EDX: 00000000
>>> Oct 21 13:37:50 wireless kernel: ESI: 00000020 EDI: 00000090 EBP: f70b8e40 DS: 007b ES: 007b
>>> Oct 21 13:37:50 wireless kernel: CR0: 8005003b CR2: 0806af70 CR3: 37872000 CR4: 000006d0
>>> Oct 21 13:37:50 wireless kernel: [<f8843c64>] raid1d+0x270/0xbea [raid1]
>>> Oct 21 13:37:50 wireless kernel: [<c0616870>] schedule+0x9cc/0xa55
>>> Oct 21 13:37:50 wireless kernel: [<c0616f33>] schedule_timeout+0x13/0x8c
>>> Oct 21 13:37:50 wireless kernel: [<c05a6b5e>] md_thread+0xdf/0xf5
>>> Oct 21 13:37:51 wireless kernel: [<c0434907>] autoremove_wake_function+0x0/0x2d
>>> Oct 21 13:37:51 wireless kernel: [<c05a6a7f>] md_thread+0x0/0xf5
>>> Oct 21 13:37:51 wireless kernel: [<c0434845>] kthread+0xc0/0xeb
>>> Oct 21 13:37:51 wireless kernel: [<c0434785>] kthread+0x0/0xeb
>>> Oct 21 13:37:51 wireless kernel: [<c0405c53>] kernel_thread_helper+0x7/0x10
>>> Oct 21 13:37:51 wireless kernel: =======================
>>>
>>> This is using CentOS 5.3 with Kernel 2.6.18-164.el5 on an i686.
>>>
>>> Is this a serious type of error? Is there anything else I can supply to
>>> help diagnose it?
>>>
>>> # mdadm --detail /dev/md2
>>> /dev/md2:
>>> Version : 00.90.03
>>> Creation Time : Mon Feb 23 17:15:41 2009
>>> Raid Level : raid1
>>> Array Size : 300511808 (286.59 GiB 307.72 GB)
>>> Used Dev Size : 300511808 (286.59 GiB 307.72 GB)
>>> Raid Devices : 2
>>> Total Devices : 2
>>> Preferred Minor : 2
>>> Persistence : Superblock is persistent
>>>
>>> Update Time : Wed Oct 21 13:46:28 2009
>>> State : clean, resyncing
>>> Active Devices : 2
>>> Working Devices : 2
>>> Failed Devices : 0
>>> Spare Devices : 0
>>>
>>> Rebuild Status : 5% complete
>>>
>>> UUID : fed99e3d:d08fdcc9:b9593a45:2cc09736
>>> Events : 0.30584
>>>
>>>     Number   Major   Minor   RaidDevice   State
>>>        0       3       3         0        active sync   /dev/hda3
>>>        1      22       3         1        active sync   /dev/hdc3
>>>
>>>
>>> --
>>> Steven Haigh
>>>
>>> Email: netwiz@crc.id.au
>>> Web: http://www.crc.id.au
>>> Phone: (03) 9001 6090 - 0412 935 897
>>
>> --
>> Majed B.
>>
>>