From: Lars Marowsky-Bree <lmb@suse.de>
To: Oktay Akbal <oktay.akbal@s-tec.de>, linux-kernel@vger.kernel.org
Subject: Re: Possible Bug with MD multipath and raid1 on top
Date: Sun, 15 Sep 2002 01:07:53 +0200 [thread overview]
Message-ID: <20020914230753.GA3781@marowsky-bree.de> (raw)
In-Reply-To: <Pine.LNX.4.44.0209142014080.21833-100000@omega.s-tec.de>
On 2002-09-14T20:33:07,
Oktay Akbal <oktay.akbal@s-tec.de> said:
> I found a very strange effect when using a raid1 on top of multipathing
> with Kernel 2.4.18 (Suse-version of it) with a 2-Port qlogic HBA
> connecting two arrays.
Is this with or without the patch I recently posted to linux-kernel?
If so, please use the patch at http://lars.marowsky-bree.de/dl/md-mp instead,
which is slightly newer and fixes one important (affecting raid0 use) and two
minor issues. Please be aware that you are beta-testing code for the time
being ;-) (Which is highly appreciated!)
> When I now pull out one of the cables two disks are missing and the
> multipath driver correctly uses the second path to the disks and
> continues to work. After plugging out the second cable all drives
> are marked as failed (mdstat), but the raid1 (md2) is still reported
> as functional with one device (md0) missing.
So far this sounds OK. (Even though the updated md-mp patch will _never_ fail
the last path but instead return the error to the layer upwards; this protects
against certain scenarios in 2.4 where a device error can't be distinguished
from a failed path and we don't want that to lead to an inaccessible device)
> All Processes using the raid1-device get stuck and this situation
> does not recover. Even some simple process testing the disk-access
> got stuck (I think ps showed state L<D).
That's not OK, obviously ;-)
I will try to reproduce this on Monday. As I don't have the hardware, but
instead use a loop device (which I can make fail on demand), if I can't
reproduce it, it might in fact be the FC driver which gets stuck somehow.
> Even if I'm quite sure that this is a bug, how should I test disk access
> without ending in "uninterruptible sleep" ?
Uhm, essentially, you should never get stuck in uninterruptible sleep. All
errors should "eventually" time out.
Please compile the kernel with magic-sysrq enabled and check where the
processes are stuck using magic-sysrq t. It might help if you piped the
results through ksymoops.
Sincerely,
Lars Marowsky-Brée <lmb@suse.de>
--
Immortality is an adequate definition of high availability for me.
--- Gregory F. Pfister
next prev parent reply other threads:[~2002-09-14 23:02 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2002-09-14 18:33 Possible Bug with MD multipath and raid1 on top Oktay Akbal
2002-09-14 23:07 ` Lars Marowsky-Bree [this message]
2002-09-15 5:29 ` Oktay Akbal
2002-09-15 21:12 ` Lars Marowsky-Bree
2002-09-15 7:31 ` Nachtrag: " Oktay Akbal
2002-09-15 7:39 ` Oktay Akbal
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20020914230753.GA3781@marowsky-bree.de \
--to=lmb@suse.de \
--cc=linux-kernel@vger.kernel.org \
--cc=oktay.akbal@s-tec.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox