Re: Daily crashes, incorrect RAID behaviour

From: Michael Tokarev <mjt@tls.msk.ru>
To: Carsten Otto <carsten.otto@gmail.com>
Cc: linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: Daily crashes, incorrect RAID behaviour
Date: Tue, 15 Aug 2006 16:33:45 +0400
Message-ID: <44E1BF29.8010405@tls.msk.ru>
In-Reply-To: <13e988610608150436y6812f623p9919b2d5b1989427@mail.gmail.com>

Carsten Otto wrote:
> Hello!
> 
> System specs below (iCH7R, software raid 5)
> 
> My problems continue, even with a new and good power supply.
> 1) The system loses a disk about every week, only a hard reboot solves that

We've seen this in a lot of cases in the past.  The issue was in a single
batch of Seagate 9gig drives (yes, old) - from time to time, one disk
just disappears from the system completely, and only a power-off-on cycle
forces it to reappear.  This happens without any pattern, i.e., randomly -
sometimes a disk can disappear several minutes after power-on,
without any system load; and sometimes it works just fine for several
months.

We tried to replace (RMA) the bad drives one by one, with the same
scenario every time: they test the drive for a day, and call us back
saying everything's OK; we grab the drive, and return it the next
day (because we *know* it's NOT OK), and they send it for replacement.
The replaced drives (even refurbished ones) all work OK (we replaced
about 20 drives in total, all from the same batch).

I talked with Seagate techs about this issue, but there was no conclusion
(one of them said it's "typical mishandling", like static electricity etc.,
but that does not match the behaviour at all).  And since the drives are very
old now (though quite a few of them are still in production ;), and were already
quite old when the problem started happening (about 6 years ago)... it's
simpler to trash them and replace them with more modern drives.

That was only one batch of drives.  And the drives were excellent (for their
age anyway): not a single disk failure in many years, not even a single bad
block on about 50 drives!  Not counting those sporadic disappearances, of course ;)
And the Seagate guys say this is something they've never heard of before, too.

All that to say: sometimes disk drives do strange things.  Rarely, very rarely,
but it happens... ;)

/mjt

Thread overview: 12+ messages
2006-08-15 11:36 Daily crashes, incorrect RAID behaviour Carsten Otto
2006-08-15 12:33 ` Michael Tokarev [this message]
2006-08-15 12:57 ` Alan Cox
2006-08-15 12:42   ` Carsten Otto
2006-08-15 13:08     ` Jan Engelhardt
2006-08-15 13:45 ` Ralf Müller
2006-08-19 11:37   ` Andrew Baker
2006-08-19 11:47     ` Justin Piszcz
2006-08-19 18:53       ` Andrew Baker
2006-08-15 15:31 ` Carsten Otto
2006-08-15 18:28   ` Mike Dresser
2006-08-15 19:27     ` Alistair John Strachan
