From: NeilBrown <neilb@suse.de>
To: Denis Golovan <denis.golovan@gmail.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: RAID5 hard freeze
Date: Tue, 25 Feb 2014 13:58:09 +1100
Message-ID: <20140225135809.0b1afc69@notabene.brown>
In-Reply-To: <CAF8Fvyz2PnEd=t=E-xUuG6anoLhm1zJCn0txD=P_aE_5AmoVUw@mail.gmail.com>
On Tue, 25 Feb 2014 00:01:42 +0200 Denis Golovan <denis.golovan@gmail.com>
wrote:
> Hi all
>
> I am struggling to diagnose a strange freeze of a software RAID5 array.
> My RAID5 consists of 4 Toshiba SATA drives and has an ext4 filesystem on top of it.
>
> It works fine unless I start several processes writing intensively to it.
> At first, it looks like the system is under high pressure, then the
> system starts lagging a lot and a hard freeze always follows after
> several minutes.
>
> No errors in system log, nothing is emitted to console. Just hard
> freeze with HDD light always on. I tried enabling kernel network
> logging to another machine and again no information when hanging.
> After reboot, my array starts reconstruction and finishes without
> errors.
>
> I tried disabling quotas and barriers for ext4.
> After disabling barriers, it almost seemed to work, but after some
> time the same hard freeze happens.
>
> I tested the same hardware configuration under Linux v3.10, 3.11, 3.12
> and now 3.13.5 (all x86 arch); all behave the same way. The same issue
> can be reproduced easily.
>
> So now I tested everything Google suggests on the matter.
> Could you give a hint on how to debug this issue?
>
The most useful thing for debugging a hard freeze is the alt-sysrq-T output
when it is frozen. Typing that magic sequence should always produce some
output unless it is hard-frozen with interrupts disabled.
So make sure you can produce the output when the system is working properly
(to a log file via the network console would be ideal), then when it hangs,
produce the output again.
You probably need to have a text console rather than a graphical console for
it to work.
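
As a rough sketch of the "check it works first" step: the keyboard sequence
only works if the kernel.sysrq setting permits it, and the same dump can be
requested by writing "t" to /proc/sysrq-trigger. The function names and the
default log path below are illustrative assumptions, not part of the
original advice.

```shell
# Succeeds if a given kernel.sysrq value permits the keyboard task dump
# (Alt-SysRq-T). A value of 1 enables every function; otherwise bit 8
# ("debugging dumps of processes") must be set.
sysrq_allows_task_dump() {
    value=$1
    [ "$value" -eq 1 ] || [ $(( $value & 8 )) -ne 0 ]
}

# Request the task dump without the keyboard and save the kernel log.
# Needs root. The output path is a placeholder; on a machine that is about
# to hang, a remote log is more likely to survive than a local file.
dump_tasks() {
    out=${1:-/tmp/sysrq-t.log}
    echo t > /proc/sysrq-trigger
    dmesg > "$out"
}
```

For example, `sysrq_allows_task_dump "$(cat /proc/sys/kernel/sysrq)"` reports
whether the key combination is currently permitted, and as root
`echo 1 > /proc/sys/kernel/sysrq` enables all functions. Writing to
/proc/sysrq-trigger as root works regardless of that setting, but it needs a
working shell, so on a frozen machine only the keyboard sequence is likely
to help.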
If it is hard-hanging with interrupts disabled, then it gets tricky. I
thought there was some NMI-based lockup detector which would warn if that
happened, but I cannot find it just now.
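
The NMI-based detector referred to here is most likely the kernel's
hard-lockup detector (the "NMI watchdog"). Assuming the kernel was built
with lockup-detector support, its state is exposed in
/proc/sys/kernel/nmi_watchdog. A minimal sketch for checking it follows; the
function name is made up for illustration.

```shell
# Print whether the NMI watchdog is enabled, based on the sysctl file.
# An alternative path can be passed as the first argument.
nmi_watchdog_status() {
    file=${1:-/proc/sys/kernel/nmi_watchdog}
    if [ ! -r "$file" ]; then
        # Kernel built without the lockup detector, or file not readable.
        echo "unavailable"
        return 0
    fi
    case "$(cat "$file")" in
        1) echo "enabled" ;;
        0) echo "disabled" ;;
        *) echo "unknown" ;;
    esac
}
```

If it reports "disabled", `echo 1 > /proc/sys/kernel/nmi_watchdog` as root, or
booting with `nmi_watchdog=1`, should turn it on. Whether it catches this
particular hang depends on the hardware and on where the CPU is stuck.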
NeilBrown