From: NeilBrown <neilb@suse.de>
To: "Kwolek, Adam" <adam.kwolek@intel.com>
Cc: "linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>,
	"Ciechanowski, Ed" <ed.ciechanowski@intel.com>,
	"Labun, Marcin" <Marcin.Labun@intel.com>,
	"Williams, Dan J" <dan.j.williams@intel.com>
Subject: Re: [PATCH] md: Add ability for disable bad block management
Date: Thu, 8 Dec 2011 15:02:22 +1100
Message-ID: <20111208150222.2ce2ac16@notabene.brown>
In-Reply-To: <79556383A0E1384DB3A3903742AAC04A055D88@IRSMSX101.ger.corp.intel.com>

On Wed, 7 Dec 2011 11:10:06 +0000 "Kwolek, Adam" <adam.kwolek@intel.com>
wrote:

> 
> 
> > -----Original Message-----
> > From: NeilBrown [mailto:neilb@suse.de]

> > I cannot reproduce this.
> > I didn't physically remove devices, but I used
> >    echo 1 > /sys/block/sdc/device/delete
> > which should be nearly identical from the perspective of md and mdadm.
> 
> I've checked that when I delete the device using sysfs, everything works perfectly.
> When the device is pulled out, the reshape stops in md/mdstat.
> 
> > If you could give me the exact set of steps that you follow to produce the
> > problem that would help - maybe a script?  Just a description is OK.
> 
> 
> #used disks sdb, sdc, sdd, sde
> export IMSM_NO_PLATFORM=1
> #create container
> mdadm -C /dev/md/imsm0 -amd -e imsm -n 3 /dev/sdb /dev/sdc /dev/sde -R
> #create vol
> mdadm -C /dev/md/raid5vol_0 -amd -l 5 --chunk 32 --size 104850 -n 3 /dev/sdb /dev/sdc /dev/sde -R
> #add spare
> mdadm --add /dev/md/imsm0 /dev/sdd
> #run OLCE
> mdadm --grow /dev/md/imsm0 --raid-devices 4
> #when reshape starts, I'm (physically) pulling device out
> 
> > Also you say it is blocking in md_do_sync.  Is that at the
> > 
> > 	wait_event(mddev->recovery_wait, !atomic_read(&mddev->recovery_active));
> > 
> > call just after the "out:" label?
> 
> Neither of those two places.
> It enters the sync_request() function, and md_error() is called.
> More is visible in the thread stack information below (md_wait_for_blocked_rdev()).
> 
> 
> > 
> > What is the raid thread doing at this point?
> >    cat /proc/PID/stack
> > might help.
> 
> [md126_raid5]
> [<ffffffff8121d843>] md_wait_for_blocked_rdev+0xbc/0x10f
> [<ffffffffa01d87ce>] handle_stripe+0x1c5c/0x2c99 [raid456]
> [<ffffffffa01d9d0d>] raid5d+0x502/0x564 [raid456]
> [<ffffffff8121eca5>] md_thread+0x101/0x11f
> [<ffffffff81049e0e>] kthread+0x81/0x89
> [<ffffffff812cc4f4>] kernel_thread_helper+0x4/0x10
> [<ffffffffffffffff>] 0xffffffffffffffff
> 
> [md126_reshape]
> [<ffffffffa02455a2>] sync_request+0x90a/0xbfb [raid456]
> [<ffffffff8121e151>] md_do_sync+0x7aa/0xc40
> [<ffffffff8121ecb3>] md_thread+0x101/0x11f
> [<ffffffff81049e0e>] kthread+0x81/0x89
> [<ffffffff812cc4f4>] kernel_thread_helper+0x4/0x10
> [<ffffffffffffffff>] 0xffffffffffffffff
> 
> > 
> > What are the contents of all the sysfs files?
> >    grep . /sys/block/mdXXX/md/*
> array_state		->active
> degraded		->1
> max_read_errors	->20
> reshape_position	->12288
> resync_start		->none
> sync_completed	->4096 / 209664
> 
> 
> >    grep . /sys/block/mdXXX/md/dev-*/*
> 
> With sdd as the removed device (/sys/block/mdXXX/md/dev-sdd/*):
> bad_blocks		->4096 512
> 			->4608 128
> 			->4736 384
> block			->MISSING link is not valid
> errors			->0
> offset			->0
> recovery_start		->4096
> size			->104832
> slot			->3
> state			->faulty,write_error
> unacknowledged_bad_blocks	->4096 512
> 				->4608 128
> 				->4736 384
> 
> I hope this helps.

Yes it does, thanks.

Can you try with this patch as well, please?

Thanks,
NeilBrown


diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index ea6dce9..6cf0f6a 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3175,6 +3175,8 @@ static void analyse_stripe(struct stripe_head *sh, struct stripe_head_state *s)
 			rdev = rcu_dereference(conf->disks[i].rdev);
 			clear_bit(R5_ReadRepl, &dev->flags);
 		}
+		if (rdev && test_bit(Faulty, &rdev->flags))
+			rdev = NULL;
 		if (rdev) {
 			is_bad = is_badblock(rdev, sh->sector, STRIPE_SECTORS,
 					     &first_bad, &bad_sectors);
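
The following is a minimal user-space sketch of what the two added lines appear to do; it is not kernel code. The stack trace above shows the raid5 thread in md_wait_for_blocked_rdev(), and dev-sdd reports unacknowledged bad blocks on a device that is already marked faulty. The sketch assumes that an unacknowledged bad block is what makes a stripe wait on a device. The names struct model_rdev, has_unacked_badblock, stripe_blocks_on() and skip_faulty are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's per-device state (not struct md_rdev). */
struct model_rdev {
	bool faulty;               /* models the Faulty bit in rdev->flags */
	bool has_unacked_badblock; /* assumption: models an unacknowledged bad block */
};

/*
 * Returns true if, in this simplified model, stripe handling would have
 * to wait on the device. With skip_faulty set, a faulty device is treated
 * as absent before any bad-block lookup, mirroring the patch's
 * "if (rdev && test_bit(Faulty, ...)) rdev = NULL;".
 */
static bool stripe_blocks_on(const struct model_rdev *rdev, bool skip_faulty)
{
	if (skip_faulty && rdev && rdev->faulty)
		rdev = NULL;    /* the two lines added by the patch */
	if (!rdev)
		return false;   /* no device, so nothing to wait for */
	return rdev->has_unacked_badblock;
}
```

In this model, a pulled device (faulty, with unacknowledged bad blocks) blocks the stripe when skip_faulty is false and does not block it when skip_faulty is true. A healthy device with an unacknowledged bad block still blocks in both cases.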

