Re: Odd --examine output

From: "Vanhorn, Mike" <michael.vanhorn@wright.edu>
To: Phil Turmel <philip@turmel.org>, linux-raid <linux-raid@vger.kernel.org>
Subject: Re: Odd --examine output
Date: Fri, 12 Apr 2013 16:47:53 +0000
Message-ID: <CD8DAF4D.412EF%michael.vanhorn@wright.edu>
In-Reply-To: <51681FB2.8060803@turmel.org>

On 4/12/13 10:52 AM, "Phil Turmel" <philip@turmel.org> wrote:

[snip]

>As noted above, the partition tables aren't wiped.  Just the device
>nodes are missing.  You could try a "blockdev --rereadpt /dev/sdX" on
>affected drives to see if it is a transient issue.

That did it! I was able to run blockdev for all of the drives that had
missing devices for the partitions, and then was able to

 mdadm --assemble --force /dev/md0 /dev/sd[cdefghi]1

and it assembled using all of the disks except, for some reason, sde1 and
sdf1. I think sde1 got left out because it had been dropped before the
raid actually stopped, and I think I could have added it back in with

 mdadm /dev/md0 --re-add /dev/sde1

(since /dev/sde actually seems to be fine). However, once I got the
filesystem mounted, my first priority was to get the data off, so I didn't
try to re-add that disk.

I don't know why sdf1 got left out.
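
For anyone following along, the blockdev step above can be sketched as a
small loop. This is only an illustration: the function name, the DRY_RUN
switch, and the device names in the example are placeholders I made up,
not the exact commands or drives from this array.

```shell
#!/bin/sh
# Sketch: re-read the partition table on each drive passed as an argument.
# DRY_RUN is a hypothetical switch; with DRY_RUN=1 the commands are only
# printed, not executed.
reread_partitions() {
    for dev in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "blockdev --rereadpt $dev"
        else
            blockdev --rereadpt "$dev"
        fi
    done
}

# Example (prints the commands only; device names are placeholders):
#   DRY_RUN=1 reread_partitions /dev/sdc /dev/sdd
```

Running it for real needs root, and only makes sense on drives whose
partition device nodes are actually missing.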

[snip]

>If the partition is *not* aligned, each large chunk written will have at
>least two R-M-W cycles.

I snipped most of that explanation, but thank you for it; it really helps
me understand what was going on with my partitions.

>I guess "lsdrv" didn't work for you.  I'm naturally curious how it
>failed....

I don't have an lsdrv command, so I did the 'ls -l' that you suggested.

>Anyways, your detailed smartctl reports show big problems:
>
>1)  You have multiple drives with many dozens of pending relocations.
>This suggests that your regular scrubs are not happening on schedule.  A
>"check" scrub turns pending relocations into either real relocations, or
>no error at all (successful rewrite).  Typically the latter.

I've got a raid-check script that runs from cron.weekly. I really did
think it was working, because every week I would check and the array was
re-building.
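
As a rough sketch of what a "check" scrub amounts to at the sysfs level,
assuming the array is md0 and the kernel exposes the usual md files
(sync_action and mismatch_cnt): I have not confirmed that my raid-check
script does exactly this, and the function names and the SYSFS_ROOT
override are made up for illustration.

```shell
#!/bin/sh
# Sketch: start a "check" scrub on one md array and read back its state.
# SYSFS_ROOT is a hypothetical override so the functions can be tried
# against a scratch directory; by default they use /sys/block.
md_start_check() {
    echo check > "${SYSFS_ROOT:-/sys/block}/$1/md/sync_action"
}

md_scrub_status() {
    md="${SYSFS_ROOT:-/sys/block}/$1/md"
    printf 'sync_action=%s mismatch_cnt=%s\n' \
        "$(cat "$md/sync_action")" "$(cat "$md/mismatch_cnt")"
}

# Example (as root, on a real array):
#   md_start_check md0
#   md_scrub_status md0
```

While a scrub is running, sync_action reads "check"; a rebuild shows up
as "recover" or "resync" instead, so the status line is one way to tell
the two apart.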

>2) All of your self-test log entries show "short offline".  That isn't
>rigorous enough.  You need "long offline" self-tests occasionally, too.
> Or just use the long self-test every time.

I will take this into account, and begin using the long test.
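
A minimal sketch of how I expect to kick those off, assuming smartctl
from smartmontools is installed; the function name, the DRY_RUN switch,
and the device names are placeholders for illustration.

```shell
#!/bin/sh
# Sketch: queue a long (extended) self-test on each drive given.
# DRY_RUN is a hypothetical switch; with DRY_RUN=1 the commands are only
# printed, not executed.
start_long_tests() {
    for dev in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "smartctl -t long $dev"
        else
            smartctl -t long "$dev"
        fi
    done
}

# The results appear later in each drive's self-test log, e.g.:
#   smartctl -l selftest /dev/sdc
```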

>3) You have a drive that entirely failed its SMART assessment
>{WD-WMAUR0381532 ==> /dev/sdj} due to excessive actual relocations.
>Replace this drive immediately.

I will. I have a spare disk on the shelf ready to go, once I feel safe
that the data is copied.

[snip]

>NOT a guess.  Back up what you can, while you can, and start over.  Use
>"fdisk -u" so you can ensure partitions start on multiples of eight (8)
>sectors.  (Modern fdisk uses 1MB alignment by default.  Highly
>recommended.)

That is exactly what I'm going to do. I feel like an idiot that there
seem to have been so many things wrong and I didn't realize it. Now,
thanks to your help, I am much more enlightened.
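
For the new partitions, the multiple-of-eight rule can be checked with a
one-line test. This is a sketch that assumes 512-byte logical sectors
(so eight sectors is 4096 bytes); the function name is made up, and the
device path in the comment is only an example.

```shell
#!/bin/sh
# Sketch: succeed if a partition's start sector is a multiple of 8
# (8 x 512-byte sectors = 4096 bytes).
is_aligned() {
    [ $(( $1 % 8 )) -eq 0 ]
}

# The start sector of an existing partition can be read from sysfs, e.g.:
#   is_aligned "$(cat /sys/block/sdc/sdc1/start)" && echo aligned
```

A start sector of 2048 (the 1MB default) passes, while the old default
of 63 does not.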

Thanks!

---
Mike VanHorn
Senior Computer Systems Administrator
College of Engineering and Computer Science
Wright State University
265 Russ Engineering Center
937-775-5157
michael.vanhorn@wright.edu
http://www.cecs.wright.edu/~mvanhorn/
