linux-raid.vger.kernel.org archive mirror
From: Johannes Truschnigg <johannes@truschnigg.info>
To: Phil Turmel <philip@turmel.org>
Cc: linux-raid@vger.kernel.org
Subject: Re: What just happened to my disks/RAID5 array?
Date: Wed, 14 Sep 2011 20:17:12 +0200
Message-ID: <4E70EFA8.4080706@truschnigg.info>
In-Reply-To: <4E7092F1.5010209@turmel.org>

Hello again Phil (and of course also possible bystanders :))!

On 09/14/2011 01:41 PM, Phil Turmel wrote:
> Good Morning Johannes,
> 
> Sorry about the delay...  worked late yesterday.

Really no need to be sorry about anything; I'm perfectly aware that I'm
not entitled to any kind of support from you, and I greatly appreciate
it whenever you volunteer to share your insights with me. So let me say
thank you very, very much for getting back to me about this!

>> The controller seems alive still - lsdrv (output attached) lists 
>> the kernel still having registered some of the component devices.
> 
> Actually, it doesn't.  None of the /dev/md0 components are present. 
> Ditto for the "mdadm -D" report.

You are right; none of the disks were present once I got to the machine.
The LVM and fs on top seemed rather confused about what had happened, so
I killed all processes with file handles open on the fs in question,
unmounted the fs, and rebooted. The board's BIOS took an unusually long
time scanning for SATA devices on the SB's ports, but in the end showed
all of them on the POST screen. After booting the kernel, one of the
drives dropped out rather early in the process (about two or three
seconds after the kernel picked it up), and all subsequent reboots (even
with the failed drive and/or all but one of the drives disconnected)
have made the box hang indefinitely during POST while it scans the SATA
controller. My guess is that the board/controller is fried.

> [...] "--assemble" is safe in all known cases.  Use it first.  With 
> the whole controller gone, you probably have consistent event counts 
> after all, and --assemble should just work.  "--assemble --force" is 
> somewhat less safe, but I wouldn't hesitate to use it in a situation 
> where the drives truly dropped out together.  You'll likely find some
> problems with fsck if files were actively being written when the 
> array dropped out, but the vast majority of your filesystem(s) should
> be safe.

Thanks, I will try that as soon as I can get my hands on a machine
with enough free SATA ports - I might have to replace the whole system
(at least board, CPU and RAM) and will have to do some research before
settling on specific hardware. I can do without that part of my data
for a few days, probably even weeks, but losing it forever would still
be hard to swallow.
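
For reference, the "consistent event counts" check mentioned above could
be sketched roughly as follows. This is only an illustration under
stated assumptions: it assumes that the `mdadm --examine <device>`
report for each member contains an `Events : <value>` line, and the
names `event_count` and `counts_consistent` are made up for this sketch.
It only compares text reports and never touches a device.

```python
import re

# Assumption: each member's `mdadm --examine` report contains a line such as
# "Events : 4711" (older 0.90 superblocks may print a dotted value).
EVENTS_RE = re.compile(r"^\s*Events\s*:\s*([0-9.]+)\s*$", re.MULTILINE)


def event_count(examine_text):
    """Return the event count string from one report, or None if absent."""
    match = EVENTS_RE.search(examine_text)
    return match.group(1) if match else None


def counts_consistent(reports):
    """True if every report (device name -> text) carries the same event count."""
    counts = [event_count(text) for text in reports.values()]
    return bool(counts) and None not in counts and len(set(counts)) == 1
```

If `counts_consistent` is true for all members, that is the case in
which plain --assemble is expected to just work; if it is not, posting
fresh reports to the list first seems wiser than reaching for --force.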

> Other procedures are progressively less safe.  I prefer to not offer 
> specifics until you've hooked your drives back up, and generated 
> fresh "lsdrv" and "mdadm" reports.

I promise I'll get back to the list if --assemble doesn't do the job
right away once I have put together a system that can handle all of the
array's member devices.

Again, thank you very much for your time and sharing your expertise!

-- 
with best regards:
- Johannes Truschnigg ( johannes@truschnigg.info )

www: http://johannes.truschnigg.info/
phone: +43 650 2 133337
xmpp: johannes@truschnigg.info

Please do not bother me with HTML-eMail or attachments. Thank you.
