* [linux-lvm] HDD failure - please help!
@ 2010-09-01 16:57 Patterson, James
2010-09-01 17:14 ` Stuart D. Gathman
0 siblings, 1 reply; 4+ messages in thread
From: Patterson, James @ 2010-09-01 16:57 UTC (permalink / raw)
To: linux-lvm
Greetings.
I am more of a general user than a guru, so please be gentle...I'll do
my best to describe the issue here.
I have a system running FC11 with five HDDs, ~2 TB of space, all in a
logical volume. It was my presumption that the LVM system functioned
sort of like a raid system (I know you can run striped, for instance).
Everything has been running fine, up until last Monday - I noticed that
I couldn't write to my samba drive (on the machine in question) - it was
read only.
I thought: "Hmm". Whatever...I rebooted the system, but walked away
(busy), so I didn't watch the reboot process. I came back and used the
shared samba drive later - but on Tuesday it went read-only again.
OK, so I started looking around, and from checking dmesg it appeared
that I had a disk failure. Indeed, I could not write anything - it was
read-only even for root - because SELinux said so...locked up. So...I
rebooted again.
One of the five drives refused to come up - and made a rather ominous
clicking noise - would not initialize. The system would not boot at all.
Would get thru BIOS, pass the ATA card, and then stop.
I pulled out the bad drive, downloaded and burned a boot CD-ROM,
booted off of that, and tried to do a restore.
I got an error message: "Error processing LVM. There is inconsistent
LVM data on logical volume _____________. You can reinitialize all
related PVs ___ which will erase the LVM metadata, or ignore which will
preserve the contents."
I tried ignore first, and it then stated that there were no partitions
to mount (or something - I didn't write this one down).
I rebooted, mounted read-only this time (I didn't want to erase any
metadata) and tried reinitialize - got the same error message again.
I am at a loss as to what to do - I'm thinking that there must be a way
to recover the data on the remaining four drives. Surely this LVM system
is not going to cause me to lose the data on all five drives because one
of them failed?
I would greatly appreciate any help you could give me - I've searched
quite a bit and can't seem to find much in relation to this. It seems
like it would be something that had been addressed...somewhere.
Kind regards,
James
* Re: [linux-lvm] HDD failure - please help!
2010-09-01 16:57 Patterson, James
@ 2010-09-01 17:14 ` Stuart D. Gathman
0 siblings, 0 replies; 4+ messages in thread
From: Stuart D. Gathman @ 2010-09-01 17:14 UTC (permalink / raw)
To: LVM general discussion and development
On Wed, 1 Sep 2010, Patterson, James wrote:
> I have a system running FC11 with five HDDs, ~2 TB of space, all in a
> logical volume. It was my presumption that the LVM system functioned
> sort of like a raid system (I know you can run striped, for instance).
The only RAID-like redundancy that LVM supports is mirroring (like RAID1), and
only if you create the LVs as mirrored. Striping (like RAID0) actually
makes your data *more* vulnerable (but faster).
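To illustrate the difference Stuart describes, here is a sketch of the relevant lvcreate invocations. The volume group name "vg0" and the sizes are hypothetical, for illustration only:

```shell
# Default: a linear (jbod-style) LV -- no redundancy; losing any
# underlying PV destroys whatever extents lived on it.
lvcreate -L 100G -n data vg0

# Mirrored LV (RAID1-like): two copies of every extent, so a single
# failed PV does not lose data.
lvcreate -m 1 -L 100G -n data_mirror vg0

# Inspect the layout of existing LVs (segment type and devices):
lvs -o lv_name,segtype,devices vg0
```

These commands require root and real PVs in the VG, so treat them as a sketch rather than something to paste blindly.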
If you wanted RAID5, your best bet on Linux is the md driver. Or else
a hardware RAID controller.
> I am at a loss as to what to do - I'm thinking that there must be a way
> to recover the data on the remaining four drives. Surely this LVM system
> is not going to cause me to lose the data on all five drives because one
> of them failed?
If the LVs were striped, then yep, that is the nature of striping. And
even if you had properly configured RAID 5 or mirroring and were immune
to hard drive failure, there is always the fat-finger failure ("rm -rf /"
and its more subtle ilk).
Since you don't have any details on exactly what kind of LVs you
created, your first step is getting a copy of the metadata. There
should be a copy at the beginning of each drive.
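One hedged way to get at that on-disk copy (assuming a device name such as /dev/sdb, which you would substitute for each of your drives): the LVM metadata area sits just after the PV label in the first part of each PV, and it is stored as plain text, so dumping the head of the disk and filtering for printable strings usually reveals the volume group description.

```shell
# Dump the first megabyte of a PV and look for the text metadata:
dd if=/dev/sdb of=/tmp/sdb-head.bin bs=1M count=1
strings /tmp/sdb-head.bin | less

# The host also keeps text backups of the same metadata, if the
# original root filesystem is still reachable:
ls /etc/lvm/backup /etc/lvm/archive
```

The /etc/lvm copies are the easiest to use with vgcfgrestore if you can still read the old root filesystem.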
If you had a single logical volume, and you were configured as jbod (Just
a Bunch of Disks), then you should look back a month or so in the archives
where another user made the same mistake, but got a significant portion
of his data back from the good drives. I won't repeat the steps here.
(You'll need an external USB adapter, a freezer, a new drive to replace
the failed one, etc.)
Hopefully, you've learned your lesson and won't presume (or assume) next time.
--
Stuart D. Gathman <stuart@bmsi.com>
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Confutatis maledictis, flammis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.
* Re: [linux-lvm] HDD failure - please help!
[not found] <mailman.25972.1283360318.9817.linux-lvm@redhat.com>
@ 2010-09-02 11:50 ` Patterson, James
2010-09-02 15:22 ` Stuart D Gathman
0 siblings, 1 reply; 4+ messages in thread
From: Patterson, James @ 2010-09-02 11:50 UTC (permalink / raw)
To: linux-lvm
> Striping (like RAID0) actually makes your data *more* vulnerable (but faster).
Stuart, I just mentioned that in passing - I did NOT use striping.
> If you wanted RAID5, your best bet on linux is the md driver.
> Or else a hardware RAID controller.
I don't/didn't want RAID.
> Since you don't have any details on exactly what kind of LVs you created
The default FC11 creation - whatever that may be.
> your first step is getting a copy of the metadata.
> There should be a copy at the beginning of each drive.
Yes. How do I access it? None of the drives will mount. I am thinking here that I should create a special boot disk with the LVM tools on it (they are not present on the FC11 boot iso, afaik).
> If you had a single logical volume, and you were configured as
> jbod (Just a Bunch of Disks)
I'm reasonably certain that's what I had.
> then you should look back a month or so in the archives
> where another user made the same mistake, but got a significant portion
> of his data back from the good drives. I won't repeat the steps here.
> (You'll need an external USB adapter, freezer, new drive to replace failed,
I looked...could you please be a bit more specific? I didn't see anything.
> Hopefully, you've learned your lesson and won't presume (or assume) next time.
Well, truly, the only thing I've learned is never to use LVM if it's going to cause me to lose data on all 5 drives when one goes down. The logic behind its use appears to be to just make life "easier" by presenting one larger disk. The reality of the situation is that I don't have one large disk - I have 5 disks. If LVM can't figure out how to restore from the four remaining disks (or alternatively I can't figure out how to use the LVM tools to do so), then it's less than useful for me.
James
* Re: [linux-lvm] HDD failure - please help!
2010-09-02 11:50 ` [linux-lvm] HDD failure - please help! Patterson, James
@ 2010-09-02 15:22 ` Stuart D Gathman
0 siblings, 0 replies; 4+ messages in thread
From: Stuart D Gathman @ 2010-09-02 15:22 UTC (permalink / raw)
To: LVM general discussion and development
On 09/02/2010 07:50 AM, Patterson, James wrote:
>> If you wanted RAID5, your best bet on linux is the md driver.
>> Or else a hardware RAID controller.
>>
> I don't/didn't want RAID.
>
Based on your expectations, I think you *do* want at least RAID1. RAID 1
is simple to administer and understand.
>> your first step is getting a copy of the metadata.
>> There should be a copy at the beginning of each drive.
>>
> Yes. How do I access it? None of the drives will mount. I am thinking here that I should create a special boot disk with the LVM tools on it (they are not present on the FC11 boot iso, afaik).
>
You don't mount the PVs. Use the metadata extraction tool; I don't
remember the name at the moment.
If this was your boot filesystem, then you will need a LiveCD or a new
install. Since you will need a new disk anyway, I suggest you get a
new disk that is *bigger* than the failing drive and install to it (but
*not* overwriting the others), leaving a partition big enough to
contain the PV from the failing drive. Remove the failing drive, and
access it via USB - even if you have another drive slot. By taking
steps to keep it as cold as possible during recovery, you can coax a few
more sectors out of it.
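A rough sketch of the recovery sequence described above, under loudly stated assumptions: the VG is called "vg0", the replacement partition is /dev/sdf1, and a text metadata backup survives in /etc/lvm/backup - all hypothetical names you would substitute from your own system.

```shell
# 1. See what LVM can still find, even with a PV missing:
pvscan
vgchange -ay --partial vg0   # activate what remains of the VG

# 2. If a metadata backup exists, recreate the lost PV's identity on
#    the replacement partition (UUID comes from the backup file or the
#    "missing PV" error output), then restore the VG description:
pvcreate --uuid "<uuid-of-missing-pv>" \
         --restorefile /etc/lvm/backup/vg0 /dev/sdf1
vgcfgrestore -f /etc/lvm/backup/vg0 vg0

# 3. Check and mount the filesystem read-only, then copy data off:
fsck -n /dev/vg0/lvol0
mount -o ro /dev/vg0/lvol0 /mnt/recovery
```

Extents that lived on the dead drive will come back as garbage, which is why the fsck and read-only mount matter: you are salvaging what remains, not repairing in place.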
>> then you should look back a month or so in the archives
>>
> I looked...could you please be a bit more specific? I didn't see anything.
>
This should get you started:
https://www.redhat.com/archives/linux-lvm/2010-July/msg00057.html
> Well, truly, the only thing I've learned is never to use LVM if it's
> going to cause me to lose data on all 5 drives when one goes down. The
> logic behind it's use appears to be to just make life "easier"
With jbod (which you likely have), the failure scenario is exactly the
same whether you have 1 disk or 5. Part of your filesystem gets
trashed, and you have to use low level tools to recover what remains if
you don't have backups.
What having 5 disks *does* do is make failure more likely. Suppose the
probability of 1 disk *not* failing in a given year is .999 (3 sigmas).
With jbod, the LV fails if *any* of the disks fail. The probability
that none of them fail in a given year would then be .999^5 ~= .995.
Your array is less reliable.
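The arithmetic above can be checked with a one-liner (the 0.999 per-disk survival probability is Stuart's illustrative figure, and the calculation assumes independent failures):

```shell
# P(all 5 independent disks survive the year) = 0.999^5
awk 'BEGIN { printf "%.5f\n", 0.999 ^ 5 }'
# prints 0.99501
```

So the five-disk jbod is about five times as likely to fail in a year as any single disk.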
By using RAID, you can make the array more reliable. RAID works by
keeping redundant copies of the data (or parity from which it can be
rebuilt), so that a single drive failure doesn't lose anything.