linux-raid.vger.kernel.org archive mirror
From: NeilBrown <neilb@suse.de>
To: "Michał Sawicz" <michal@sawicz.net>
Cc: linux-raid <linux-raid@vger.kernel.org>
Subject: Re: Failed, but "md: cannot remove active disk..."
Date: Mon, 14 May 2012 21:36:23 +1000
Message-ID: <20120514213623.3bc1bfa5@notabene.brown>
In-Reply-To: <1336992780.6722.18.camel@localhost>


On Mon, 14 May 2012 12:53:00 +0200 Michał Sawicz <michal@sawicz.net> wrote:

> On Mon, 2012-05-14 at 20:22 +1000, NeilBrown wrote:
> > On Sun, 13 May 2012 20:21:48 +0200 Michał Sawicz <michal@sawicz.net> wrote:
> > 
> > > Hey,
> > > 
> > > I've a weird issue with a RAID6 setup, /proc/mdstat says:
> > > 
> > > > md126 : active raid6 sda1[3] sdh1[6] sdg1[0](F) sdf1[5] sdi1[1] sdc[8] sdb[7]
> > > >       9767559680 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/6] [_UUUUUU]
> > > 
> > > So sdg1 is (F)ailed, yet `mdadm --remove` yields:
> > > 
> > > > md: cannot remove active disk sdg1 from md126 ...
> > 
> > There is a period of time between when a device fails and when the raid456
> > module finally lets go of it so it can be removed.  You seem to be in this
> > period of time.
> > Normally it is very short.  It needs to wait for any requests that have
> > already been sent to the device to complete (probably with failure) and
> > very shortly after that it should be released.  So this is normally much less
> > than one second but could be several seconds if some excessive retry is
> > happening.
> > 
> > But I'm guessing you have waited more than a few seconds.
> 
> Yup :)
> 
> > I vaguely recall a bug in the not too distant past whereby RAID456 wouldn't
> > let go of a device quite as soon as it should.  Unfortunately I don't
> > remember the details.  You might be able to trigger it to release the drive
> > by adding a spare - if you have one - or maybe by just
> >   echo sync > /sys/block/md126/md/sync_action
> > it won't actually do a sync, but it might check things enough to make
> > progress.
> 
> # echo sync > /sys/block/md126/md/sync_action
> -bash: echo: write error: Device or resource busy

Hmmm....

Looks like MD_RECOVERY_NEEDED is already set.
But remove_and_add_spares() isn't removing the failed device
from the array.
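
One way to confirm that from user space is to look for member devices that are marked faulty but still hold a slot in the array. The following is only a rough sketch: the function name md_stuck_faulty is made up for illustration, the default path is simply the array from this thread, and it assumes the per-device "state" and "slot" attributes behave as the md sysfs documentation describes (state is a comma-separated flag list, slot reads "none" once the device has been released).

```shell
#!/bin/sh
# Hypothetical helper (not part of mdadm or the kernel): list member devices
# that are flagged faulty but still occupy a raid slot.
# Assumption: $1 is an md sysfs directory such as /sys/block/md126/md.
md_stuck_faulty() {
    md_dir=${1:-/sys/block/md126/md}
    for dev in "$md_dir"/dev-*; do
        # Skip entries that do not expose both attributes.
        [ -r "$dev/state" ] && [ -r "$dev/slot" ] || continue
        state=$(cat "$dev/state")
        slot=$(cat "$dev/slot")
        # "state" is a comma-separated list, e.g. "faulty" or "faulty,blocked".
        case ",$state," in
            *,faulty,*)
                if [ "$slot" != none ]; then
                    echo "${dev##*/dev-} slot=$slot"
                fi
                ;;
        esac
    done
    return 0
}
```

Running md_stuck_faulty with no argument checks md126; any device it prints is one that the personality has presumably not let go of yet.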

I cannot find anything since 2.6.38 that looks like your symptoms.

Is the array still functioning?
Are there any interesting messages appearing in the kernel logs?

What does
  grep . /sys/block/md126/md/dev*/*
show?
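
If the full grep output is too noisy, a condensed per-device summary may be easier to read. This is a minimal sketch, assuming each dev-* directory exposes "state", "slot" and "errors" as described in the md sysfs documentation; the function name md_dev_report is hypothetical and the default path is just the array discussed here.

```shell
#!/bin/sh
# Hypothetical helper: print one summary line per member device of an md array.
# Assumption: $1 is an md sysfs directory; defaults to the array in this thread.
md_dev_report() {
    md_dir=${1:-/sys/block/md126/md}
    for dev in "$md_dir"/dev-*; do
        [ -d "$dev" ] || continue
        # Attributes that cannot be read are shown as "?".
        state=$(cat "$dev/state" 2>/dev/null)
        slot=$(cat "$dev/slot" 2>/dev/null)
        errors=$(cat "$dev/errors" 2>/dev/null)
        printf '%s state=%s slot=%s errors=%s\n' \
            "${dev##*/dev-}" "${state:-?}" "${slot:-?}" "${errors:-?}"
    done
    return 0
}
```

Calling md_dev_report /sys/block/md126/md prints one line per device, which should make a faulty device that still has a numeric slot stand out.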

NeilBrown


> 
> eh?
> 
> > What kernel are you using?
> 
> # uname -a
> Linux media 2.6.38-gentoo-r6 #2 SMP Tue Sep 13 19:13:42 CEST 2011 x86_64
> AMD Athlon(tm) 64 X2 Dual Core Processor 4200+ AuthenticAMD GNU/Linux
> 
> Thanks,

