From: "K.Tanaka" <k-tanaka@ce.jp.nec.com>
To: Neil Brown <neilb@suse.de>
Cc: linux-raid@vger.kernel.org, dm-devel@redhat.com,
linux-scsi@vger.kernel.org
Subject: Re: [BUG] The kernel thread for md RAID1 could cause a md RAID1 array deadlock
Date: Wed, 30 Jan 2008 11:02:16 +0900
Message-ID: <479FDAA8.3030409@ce.jp.nec.com>
In-Reply-To: <47985B37.9000503@ce.jp.nec.com>
Hi,
>Also, md raid10 seems to have the same problem.
>I will test raid10 applying this patch as well.
Sorry for the late response. I had trouble reproducing the problem;
it turns out that the 2.6.24 kernel needs the latest (possibly testing)
version of systemtap (systemtap-0.6.1-1) to run the fault injection tool.
I have now reproduced the stall on both raid1 and raid10 using 2.6.24.
I have also tested the patch applied to 2.6.24 and confirmed that
it fixes the stall in both cases.
K.Tanaka wrote:
> Hi,
>
> Thank you for the patch.
> I have applied the patch to 2.6.23.14 and it works well.
>
> - In case of 2.6.23.14, the problem is reproduced.
> - In case of 2.6.23.14 with this patch, raid1 works well so far.
> The fault injection script continues to run, and it doesn't deadlock.
> I will keep it running for a while.
>
> Also, md raid10 seems to have the same problem.
> I will test raid10 applying this patch as well.
>
>
> Neil Brown wrote:
>> On Tuesday January 15, k-tanaka@ce.jp.nec.com wrote:
>>> This message describes the details about md-RAID1 issue found by
>>> testing the md RAID1 using the SCSI fault injection framework.
>>>
>>> Abstract:
>>> Both the error handler for md RAID1 and write access requests to the md RAID1
>>> array use the raid1d kernel thread. The nr_pending flag could cause a race
>>> condition in raid1d, resulting in a raid1d deadlock.
>> Thanks for finding and reporting this.
>>
>> I believe the following patch should fix the deadlock.
>>
>> If you are able to repeat your test and confirm this I would
>> appreciate it.
>>
>> Thanks,
>> NeilBrown
>>
>>
>>
>> Fix deadlock in md/raid1 when handling a read error.
>>
>> When handling a read error, we freeze the array to stop any other
>> IO while attempting to over-write with correct data.
>>
--
---------------------------------------------------------
Kenichi TANAKA | Open Source Software Platform Development Division
| Computers Software Operations Unit, NEC Corporation
| k-tanaka@ce.jp.nec.com