From: "NeilBrown" <neilb@suse.de>
To: "Darius S. Naqvi" <dnaqvi@datagardens.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: "mdadm --remove" fails if it is too soon after "mdadm --fail"
Date: Thu, 15 Oct 2009 12:49:12 +1100
Message-ID: <eebda61d9dea7affc2934fd8d887b752.squirrel@neil.brown.name>
In-Reply-To: <alpine.DEB.1.00.0910141236260.14710@darius>
On Thu, October 15, 2009 5:38 am, Darius S. Naqvi wrote:
> On Wed, 14 Oct 2009, Darius S. Naqvi wrote:
>
>> I.e., it seems that the ioctl invoked by --fail doesn't directly set
>> up the device to be ready for --remove, but some other kernel thread
>> completes that state change. I'm wondering if it could be the case
>> that when the system is very, very busy, it could take long enough for
>> that kernel thread to run that it would cause what I see, i.e.,
>> --remove fails with EBUSY, even though I've already waited about 20
>> seconds for the device to be ready to be removed. If this is so, what
>> shall I do? Here are the options I can think of:
>
> Sorry to reply to my own posting. It turns out that in this case,
> I've only waited 2.5 seconds. This may affect the probability of my
> hunch being correct.

2.5 seconds certainly seems more believable than 20 seconds.

Waiting for the kernel thread to run is not the only cause for delay.
If there are any pending IO requests, you have to wait for all of those
to complete before the device can be removed from the array.

As error handling can take an arbitrarily long time, there can be
an arbitrary delay between a device being marked faulty and it being
able to be removed from the array.

So probably the best bet is simply to wait and retry as you are doing.
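
A minimal sketch of the wait-and-retry approach, in Python. The function
name, the timeout and interval defaults, and the injectable `run` parameter
are illustrative assumptions, not part of mdadm. The sketch retries on any
non-zero exit status, since it makes no attempt to tell EBUSY apart from
other failures; a real script would probably want to inspect stderr too.

```python
import subprocess
import time


def remove_with_retry(array, device, timeout=60.0, interval=0.5,
                      run=subprocess.run):
    """Retry `mdadm ARRAY --remove DEVICE` until it succeeds or time runs out.

    Returns True once mdadm exits with status 0, False if the deadline
    passes first. `run` is injectable (hypothetical convenience) so the
    loop can be exercised without touching a real array.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = run(["mdadm", array, "--remove", device])
        if result.returncode == 0:
            return True
        if time.monotonic() >= deadline:
            return False
        # The delay after --fail is unbounded in principle (pending IO,
        # error handling), so just sleep and try again.
        time.sleep(interval)
```

For example, `remove_with_retry("/dev/md0", "/dev/sdb1")` would keep trying
for up to a minute; the device names here are placeholders.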
If I were to make it more deterministic, I would probably allow you to
'poll' or 'select' on the sysfs file /sys/block/mdX/md/dev-YYY/slot
and once that becomes 'none', the device can be removed.
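
Until something like that exists, the same condition can be approximated by
re-reading the slot file at intervals. The sketch below does only that; it
is not the 'poll'/'select' mechanism described above, and the function name
and defaults are assumptions made for illustration.

```python
import time
from pathlib import Path


def wait_for_slot_none(slot_path, timeout=60.0, interval=0.2):
    """Re-read an md 'slot' sysfs file until its content is 'none'.

    slot_path is expected to look like /sys/block/mdX/md/dev-YYY/slot.
    Returns True once the file reads 'none', False on timeout.
    """
    path = Path(slot_path)
    deadline = time.monotonic() + timeout
    while True:
        if path.read_text().strip() == "none":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Once this returns True, the --remove should be expected to succeed; keeping
a retry around it anyway costs little.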
NeilBrown
>
> --
> Darius S. Naqvi
> dnaqvi@datagardens.com
> http://www.datagardens.com