From: NeilBrown <neilb@suse.de>
To: "Kwolek, Adam" <adam.kwolek@intel.com>
Cc: "linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>,
"Ciechanowski, Ed" <ed.ciechanowski@intel.com>,
"Labun, Marcin" <Marcin.Labun@intel.com>,
"Williams, Dan J" <dan.j.williams@intel.com>
Subject: Re: [PATCH] md: Add ability for disable bad block management
Date: Thu, 8 Dec 2011 15:02:22 +1100
Message-ID: <20111208150222.2ce2ac16@notabene.brown>
In-Reply-To: <79556383A0E1384DB3A3903742AAC04A055D88@IRSMSX101.ger.corp.intel.com>
On Wed, 7 Dec 2011 11:10:06 +0000 "Kwolek, Adam" <adam.kwolek@intel.com>
wrote:
>
>
> > -----Original Message-----
> > From: NeilBrown [mailto:neilb@suse.de]
> > I cannot reproduce this.
> > I didn't physically remove devices, but I used
> > echo 1 > /sys/block/sdc/device/delete
> > which should be nearly identical from the perspective of md and mdadm.
>
> I've checked that when I delete the device using sysfs, everything works perfectly.
> When the device is physically pulled out, the reshape stops in md/mdstat.
>
> > If you could give me the exact set of steps that you follow to produce the
> > problem that would help - maybe a script? Just a description is OK.
>
>
> #used disks sdb, sdc, sdd, sde
> export IMSM_NO_PLATFORM=1
> #create container
> mdadm -C /dev/md/imsm0 -amd -e imsm -n 3 /dev/sdb /dev/sdc /dev/sde -R
> #create vol
> mdadm -C /dev/md/raid5vol_0 -amd -l 5 --chunk 32 --size 104850 -n 3 /dev/sdb /dev/sdc /dev/sde -R
> #add spare
> mdadm --add /dev/md/imsm0 /dev/sdd
> #run OLCE
> mdadm --grow /dev/md/imsm0 --raid-devices 4
> #when the reshape starts, I (physically) pull a device out
>
> > Also you say it is blocking in md_do_sync. Is that at the
> >
> > wait_event(mddev->recovery_wait, !atomic_read(&mddev-
> > >recovery_active));
> >
> > call just after the "out:" label?
>
> Neither of those two places.
> It enters the sync_request() function, and md_error() is called.
> More is visible in the thread stack information below (md_wait_for_blocked_rdev()).
>
>
> >
> > What is the raid thread doing at this point?
> > cat /proc/PID/stack
> > might help.
>
> [md126_raid5]
> [<ffffffff8121d843>] md_wait_for_blocked_rdev+0xbc/0x10f
> [<ffffffffa01d87ce>] handle_stripe+0x1c5c/0x2c99 [raid456]
> [<ffffffffa01d9d0d>] raid5d+0x502/0x564 [raid456]
> [<ffffffff8121eca5>] md_thread+0x101/0x11f
> [<ffffffff81049e0e>] kthread+0x81/0x89
> [<ffffffff812cc4f4>] kernel_thread_helper+0x4/0x10
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> [md126_reshape]
> [<ffffffffa02455a2>] sync_request+0x90a/0xbfb [raid456]
> [<ffffffff8121e151>] md_do_sync+0x7aa/0xc40
> [<ffffffff8121ecb3>] md_thread+0x101/0x11f
> [<ffffffff81049e0e>] kthread+0x81/0x89
> [<ffffffff812cc4f4>] kernel_thread_helper+0x4/0x10
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> >
> > What are the contents of all the sysfs files?
> > grep . /sys/block/mdXXX/md/*
> array_state ->active
> degraded ->1
> max_read_errors ->20
> reshape_position ->12288
> resync_start ->none
> sync_completed ->4096 / 209664
>
>
> > grep . /sys/block/mdXXX/md/dev-*/*
>
> When sdd is the removed device, /sys/block/mdXXX/md/dev-sdd/* shows:
> bad_blocks ->4096 512
> ->4608 128
> ->4736 384
> block ->MISSING link is not valid
> errors ->0
> offset ->0
> recovery_start ->4096
> size ->104832
> slot ->3
> state ->faulty,write_error
> unacknowledged_bad_blocks ->4096 512
> ->4608 128
> ->4736 384
>
> I hope this helps.
Yes it does, thanks.
Can you try with this patch as well, please?
Thanks,
NeilBrown
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index ea6dce9..6cf0f6a 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3175,6 +3175,8 @@ static void analyse_stripe(struct stripe_head *sh, struct stripe_head_state *s)
 			rdev = rcu_dereference(conf->disks[i].rdev);
 			clear_bit(R5_ReadRepl, &dev->flags);
 		}
+		if (rdev && test_bit(Faulty, &rdev->flags))
+			rdev = NULL;
 		if (rdev) {
 			is_bad = is_badblock(rdev, sh->sector, STRIPE_SECTORS,
 					     &first_bad, &bad_sectors);
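
For readers following the thread, here is a rough user-space sketch of what the added check appears to do. It is an interpretation, not kernel code. The reported state shows sdd as faulty with unacknowledged bad blocks, and the raid5 thread is sitting in md_wait_for_blocked_rdev(). The sketch assumes that an unacknowledged bad block on a device causes the stripe to be treated as blocked on that device, and that the patch avoids this by ignoring a faulty device before its bad block list is consulted. All names below (model_rdev, bad_range, model_is_badblock, model_analyse, the skip_faulty switch) are hypothetical stand-ins, and the return convention of model_is_badblock (0 none, 1 acknowledged only, -1 unacknowledged) is an assumption of this model.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one bad block entry: start sector, length,
 * and whether the entry has been acknowledged. */
struct bad_range {
	unsigned long long start;
	unsigned long long len;
	bool acked;
};

/* Hypothetical stand-in for a member device. */
struct model_rdev {
	bool faulty;
	const struct bad_range *bad;
	size_t nbad;
};

enum dev_verdict { DEV_ABSENT, DEV_OK, DEV_BAD_ACKED, DEV_BLOCKED };

/* Assumed convention for this model: 0 if the range is clean, 1 if it
 * overlaps only acknowledged bad blocks, -1 if it overlaps an
 * unacknowledged one. */
static int model_is_badblock(const struct model_rdev *rdev,
			     unsigned long long sector,
			     unsigned long long nsect)
{
	int result = 0;

	for (size_t i = 0; i < rdev->nbad; i++) {
		const struct bad_range *b = &rdev->bad[i];

		/* Half-open interval overlap test. */
		if (b->start < sector + nsect && sector < b->start + b->len) {
			if (!b->acked)
				return -1;
			result = 1;
		}
	}
	return result;
}

/* skip_faulty = true models the behaviour with the patch applied. */
static enum dev_verdict model_analyse(const struct model_rdev *rdev,
				      unsigned long long sector,
				      unsigned long long nsect,
				      bool skip_faulty)
{
	int is_bad;

	if (rdev && skip_faulty && rdev->faulty)
		rdev = NULL;	/* mirrors the two added lines */
	if (!rdev)
		return DEV_ABSENT;

	is_bad = model_is_badblock(rdev, sector, nsect);
	if (is_bad < 0)
		return DEV_BLOCKED;	/* stripe would wait on this device */
	return is_bad ? DEV_BAD_ACKED : DEV_OK;
}
```

Fed with the values reported above for dev-sdd (faulty, unacknowledged ranges 4096+512, 4608+128 and 4736+384), a stripe at sector 4096 comes out as DEV_BLOCKED without the check and as DEV_ABSENT with it, so in this model the stripe is handled as degraded rather than waiting on a device that has already failed.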