* [linux-lvm] pv_move_pe() error again :/
@ 2001-09-06 15:28 FEJF
2001-09-06 20:05 ` Ragnar Kjørstad
0 siblings, 1 reply; 19+ messages in thread
From: FEJF @ 2001-09-06 15:28 UTC (permalink / raw)
To: linux-lvm
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
hi,
i searched through the mailing list and found this question a few times... and i
have this problem too atm and didn't find a solution, so i want to ask if
there is one now ? or would installing lvm1.0.1-rc2 help (currently
using 0.9.1_beta7) ?
root@bolm:[/x] # pvmove /dev/hdh1
pvmove -- moving physical extents in active volume group "vg01"
pvmove -- WARNING: if you lose power during the move you may need to restore
your LVM metadata from backup!
pvmove -- do you want to continue? [y/n] y
pvmove -- ERROR reading input physical volume "/dev/hdh1" (still 65536 bytes
to read)
pvmove -- ERROR "pv_move_pe(): read input PV" pv_move_pe
pvmove -- ERROR "pv_move_pe(): read input PV" moving physical extents
mfg, Florian E.J. Fruth
- --
Backups are usefull. Most often when you don't have one ;)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org
iD8DBQE7l5YE7Xtp66ctWuIRAvbXAJwPE9rdTBmtIz/stPOxlPIXS5PbwwCfQx41
oCGlQD5AHCV0S9v/UDfog9o=
=KYdJ
-----END PGP SIGNATURE-----
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [linux-lvm] pv_move_pe() error again :/
2001-09-06 15:28 [linux-lvm] pv_move_pe() error again :/ FEJF
@ 2001-09-06 20:05 ` Ragnar Kjørstad
2001-09-06 23:41 ` FEJF
0 siblings, 1 reply; 19+ messages in thread
From: Ragnar Kjørstad @ 2001-09-06 20:05 UTC (permalink / raw)
To: linux-lvm
On Thu, Sep 06, 2001 at 05:28:04PM +0200, FEJF wrote:
> hi,
> i search through the mailing list and found this question few times... and i
> have also this problem atm and i didn't find a solution, i want to ask if
> there is one now ? or perhaps will installing of lvm1.0.1-rc2 help (currently
> using 0.9.1_beta7) ?
>
> root@bolm:[/x] # pvmove /dev/hdh1
> pvmove -- moving physical extents in active volume group "vg01"
> pvmove -- WARNING: if you lose power during the move you may need to restore
> your LVM metadata from backup!
> pvmove -- do you want to continue? [y/n] y
> pvmove -- ERROR reading input physical volume "/dev/hdh1" (still 65536 bytes
> to read)
>
> pvmove -- ERROR "pv_move_pe(): read input PV" pv_move_pe
>
> pvmove -- ERROR "pv_move_pe(): read input PV" moving physical extents
This could be because of a disk-error. Do you have io-errors in
/var/log/messages?
If so, you will have to modify pvmove to not give up after read-errors.
Maybe a '--ignore-read-errors' option should be added?
--
Ragnar Kjørstad
Big Storage
* Re: [linux-lvm] pv_move_pe() error again :/
2001-09-06 20:05 ` Ragnar Kjørstad
@ 2001-09-06 23:41 ` FEJF
2001-09-07 9:43 ` Ragnar Kjørstad
2001-09-07 10:46 ` Heinz J . Mauelshagen
0 siblings, 2 replies; 19+ messages in thread
From: FEJF @ 2001-09-06 23:41 UTC (permalink / raw)
To: linux-lvm
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Ragnar Kjørstad, on Donnerstag, 6. September 2001 22:05 wrote:
> On Thu, Sep 06, 2001 at 05:28:04PM +0200, FEJF wrote:
> > hi,
> > i search through the mailing list and found this question few times...
> > and i have also this problem atm and i didn't find a solution, i want to
> > ask if there is one now ? or perhaps will installing of lvm1.0.1-rc2 help
> > (currently using 0.9.1_beta7) ?
> >
> > root@bolm:[/x] # pvmove /dev/hdh1
> > pvmove -- moving physical extents in active volume group "vg01"
> > pvmove -- WARNING: if you lose power during the move you may need to
> > restore your LVM metadata from backup!
> > pvmove -- do you want to continue? [y/n] y
> > pvmove -- ERROR reading input physical volume "/dev/hdh1" (still 65536
> > bytes to read)
> >
> > pvmove -- ERROR "pv_move_pe(): read input PV" pv_move_pe
> >
> > pvmove -- ERROR "pv_move_pe(): read input PV" moving physical extents
>
> This could be because of a disk-error. Do you have io-errors in
> /var/log/messages?
there are no io-errors... but the hd makes really scary noises when pvmove
tries to move the remaining bytes, so i think the hd is damaged.
all i want is to remove the damaged hd,
but vgreduce says i have to use pvmove to get rid of the remaining data and
pvmove gives the errors...
so i can't remove it :/ - is there a way to get rid of the hd without
destroying the rest of the data ? sth. like vgreduce --force ?
> If so, you will have to modify pvmove to not give up after read-errors.
how can i do this ? i'm not a coder but if someone has too much time... ;)
> Maybe a '--ignore-read-errors' option should be added?
if there's no other way to get rid of the damaged hd... that would be also a
way to solve my problem.
btw: could there be a problem with using reiserfs on the lvm ?
mfg, Florian E.J. Fruth
ps: at the moment i try a "dd if=/dev/hdh1 of=/dev/null" to see if it also
complains about errors...
- --
Backups are usefull. Most often when you don't have one ;)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org
iD8DBQE7mAmk7Xtp66ctWuIRAqdSAJwLKC2Z517hbQow9HX/dKGe+sn9CgCgpWaV
ZRo4gIxtkRLdZAfttoKuJDo=
=yDmS
-----END PGP SIGNATURE-----
* Re: [linux-lvm] pv_move_pe() error again :/
2001-09-06 23:41 ` FEJF
@ 2001-09-07 9:43 ` Ragnar Kjørstad
2001-09-07 11:36 ` FEJF
2001-09-07 10:46 ` Heinz J . Mauelshagen
1 sibling, 1 reply; 19+ messages in thread
From: Ragnar Kjørstad @ 2001-09-07 9:43 UTC (permalink / raw)
To: linux-lvm
On Fri, Sep 07, 2001 at 01:41:22AM +0200, FEJF wrote:
> there are no io-errors... but as the hd makes really scary noise when pvmove
> tries to move the remaining bytes. so i think the hd is damaged.
> but all i want is to remove the damaged hd.
> but pvreduce says i have to use pvmove to get rid of the remaining data and
> pvmove gives the errors...
> so i can't remove it :/ - is there a way to get rid of the hd without
> destroying the rest of the data ? sth. like pvreduce --force ?
No, don't think so.
I have a patch, but I'm not so sure I can recommend that you use it.
I only wrote it to copy off a single block of a bad disk - I'm not sure
exactly how it will behave when moving multiple blocks (if the
non-corrupted blocks will end up in the right spot, or if it writes too
little data, so it will be some bytes off).
--
Ragnar Kjørstad
Big Storage
* Re: [linux-lvm] pv_move_pe() error again :/
2001-09-06 23:41 ` FEJF
2001-09-07 9:43 ` Ragnar Kjørstad
@ 2001-09-07 10:46 ` Heinz J . Mauelshagen
2001-09-07 11:45 ` FEJF
1 sibling, 1 reply; 19+ messages in thread
From: Heinz J . Mauelshagen @ 2001-09-07 10:46 UTC (permalink / raw)
To: linux-lvm
On Fri, Sep 07, 2001 at 01:41:22AM +0200, FEJF wrote:
> Ragnar Kjørstad, on Donnerstag, 6. September 2001 22:05 wrote:
> > On Thu, Sep 06, 2001 at 05:28:04PM +0200, FEJF wrote:
> > > hi,
> > > i search through the mailing list and found this question few times...
> > > and i have also this problem atm and i didn't find a solution, i want to
> > > ask if there is one now ? or perhaps will installing of lvm1.0.1-rc2 help
> > > (currently using 0.9.1_beta7) ?
> > >
> > > root@bolm:[/x] # pvmove /dev/hdh1
> > > pvmove -- moving physical extents in active volume group "vg01"
> > > pvmove -- WARNING: if you lose power during the move you may need to
> > > restore your LVM metadata from backup!
> > > pvmove -- do you want to continue? [y/n] y
> > > pvmove -- ERROR reading input physical volume "/dev/hdh1" (still 65536
> > > bytes to read)
> > >
> > > pvmove -- ERROR "pv_move_pe(): read input PV" pv_move_pe
> > >
> > > pvmove -- ERROR "pv_move_pe(): read input PV" moving physical extents
> >
> > This could be because of a disk-error. Do you have io-errors in
> > /var/log/messages?
>
> there are no io-errors... but as the hd makes really scary noise when pvmove
> tries to move the remaining bytes. so i think the hd is damaged.
> but all i want is to remove the damaged hd.
> but pvreduce says i have to use pvmove to get rid of the remaining data and
> pvmove gives the errors...
> so i can't remove it :/ - is there a way to get rid of the hd without
> destroying the rest of the data ? sth. like pvreduce --force ?
>
> > If so, you will have to modify pvmove to not give up after read-errors.
>
> how can i do this ? i'm not a coder but if someone has too much time... ;)
>
> > Maybe a '--ignore-read-errors' option should be added?
>
> if there's no other way to get rid of the damaged hd... that would be also a
> way to solve my problem.
> btw: could there be a problem with using reiserfs on the lvm ?
Well, assuming your disk has a flaw, the only way to work around your problem
is patching pv_move_pe() in order to ignore read errors (this takes place
around line 520 in LVM 1.0) which will cause a copy of the data to some other
device with probably flaky data in it.
In case you've got some filesystem data in there, fsck will complain accordingly
and some file or metadata of the filesystem will be gone which will cause
more or less filesystem data loss.
>
> mfg, Florian E.J. Fruth
>
> ps: at the moment i try a "dd if=/dev/hdh1 of=/dev/null" to see if it also
> complains about errors...
>
> --
> Backups are usefull. Most often when you don't have one ;)
>
> _______________________________________________
> linux-lvm mailing list
> linux-lvm@sistina.com
> http://lists.sistina.com/mailman/listinfo/linux-lvm
> read the LVM HOW-TO at http://www.sistina.com/lvm/Pages/howto.html
--
Regards,
Heinz -- The LVM Guy --
*** Software bugs are stupid.
Nevertheless it needs not so stupid people to solve them ***
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Heinz Mauelshagen Sistina Software Inc.
Senior Consultant/Developer Am Sonnenhang 11
56242 Marienrachdorf
Germany
Mauelshagen@Sistina.com +49 2626 141200
FAX 924446
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
* Re: [linux-lvm] pv_move_pe() error again :/
2001-09-07 9:43 ` Ragnar Kjørstad
@ 2001-09-07 11:36 ` FEJF
2001-09-09 22:16 ` Ragnar Kjørstad
0 siblings, 1 reply; 19+ messages in thread
From: FEJF @ 2001-09-07 11:36 UTC (permalink / raw)
To: linux-lvm
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Ragnar Kjørstad, on Freitag, 7. September 2001 11:43 wrote:
> On Fri, Sep 07, 2001 at 01:41:22AM +0200, FEJF wrote:
> > there are no io-errors... but as the hd makes really scary noise when
> > pvmove tries to move the remaining bytes. so i think the hd is damaged.
> > but all i want is to remove the damaged hd.
> > but pvreduce says i have to use pvmove to get rid of the remaining data
> > and pvmove gives the errors...
> > so i can't remove it :/ - is there a way to get rid of the hd without
> > destroying the rest of the data ? sth. like pvreduce --force ?
>
> No, don't think so.
>
> I have a patch, but I'm not so sure I can recommend that you use it.
>
> I only wrote it to copy off a single block of a bad disk - I'm not sure
> exactly how it will behave when moving multiple blocks (if the
> non-corrupted blocks will end up in the right spot, or if it writes too
> little data, so it will be some bytes off).
the only thing i fear at the moment is that the hd gets totally damaged sooner
or later (if it runs several hours i can't access it...) and if it does i
will lose the whole lvm (about 360gigs) and so i say if there are some files
corrupted it won't matter that much...
fejf
- --
Backups are usefull. Most often when you don't have one ;)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org
iD8DBQE7mLEz7Xtp66ctWuIRAnNKAKDSYdnAEGSC9w+7SeVu5dWqFs8tVACbBsK9
PS4wZR9k653HkL/YQzocYqQ=
=XlFo
-----END PGP SIGNATURE-----
* Re: [linux-lvm] pv_move_pe() error again :/
2001-09-07 10:46 ` Heinz J . Mauelshagen
@ 2001-09-07 11:45 ` FEJF
0 siblings, 0 replies; 19+ messages in thread
From: FEJF @ 2001-09-07 11:45 UTC (permalink / raw)
To: linux-lvm
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Heinz J . Mauelshagen, on Freitag, 7. September 2001 12:46 wrote:
[cut]
> Well, assuming your disk has a flaw, the only way to work around your
> problem is patching pv_move_pe() in order to ignore read errors (this takes
> place around line 520 in LVM 1.0) which will cause a copy of the data to
> some other device with probably flaky data in it.
search a bit through the file... but i actually don't know what to change
there... would a single
ret:=1;
just before
pe_moved += ret;
help ?
> In case you've got some filesystem data in there, fsck will complain
> accordingly and some file or metadata of the filesystem will be gone which
> will cause more or less filesystem data loss.
losing some bytes is nothing compared to the whole lvm... ;)
fejf
- --
Backups are usefull. Most often when you don't have one ;)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org
iD8DBQE7mLNS7Xtp66ctWuIRAgRGAKDL9Us+S5NUHUhOhymfTidbzJYOTwCgg9ti
WFlbMA5xs29zsHYyOLiA0EM=
=uSq4
-----END PGP SIGNATURE-----
* Re: [linux-lvm] pv_move_pe() error again :/
2001-09-07 11:36 ` FEJF
@ 2001-09-09 22:16 ` Ragnar Kjørstad
2001-09-09 23:51 ` FEJF
0 siblings, 1 reply; 19+ messages in thread
From: Ragnar Kjørstad @ 2001-09-09 22:16 UTC (permalink / raw)
To: linux-lvm
[-- Attachment #1: Type: text/plain, Size: 1355 bytes --]
OK, this is the patch.
With this patch pe_move will not even try to lock the extent before
moving it. (because it would fail). I'm not sure exactly what
implications that has, but I would avoid using LVM when you're running
pe_move. I also removed the code that unlocks the extent - but only for
the standard execution path - if any other errors occur pe_move will try
to unlock the extent.
Another problem is that the file position in the source is undefined
after a failed read - so if you move several extents at the same time,
chances are that pe_move will read the broken extent over and over
again, instead of reading first the broken extent and then the other
extents - it will in other words destroy your data.
This patch was written specifically to fix a problem with LVM on one of
our workstations, and it is not intended for general use! If you still
choose to try it, I recommend first moving all the extents you can move
with the regular pe_move command - then install this one and move the
remaining extents one by one.
Be aware that you have to do make install to use it (or use
LD_LIBRARY_PATH), as the library is linked in dynamically - if you do make
install - be sure to install the proper utilities afterwards so you
don't risk running this modified pe_move by accident later.
You have been warned....
--
Ragnar Kjorstad
Big Storage
[-- Attachment #2: LVM_ignore_read_errors.patch2 --]
[-- Type: text/plain, Size: 1553 bytes --]
diff -u -r LVM/1.0.1-rc2/tools/lib/pv_move.c LVM_ignore_read_errors/1.0.1-rc2/tools/lib/pv_move.c
--- LVM/1.0.1-rc2/tools/lib/pv_move.c Thu Jul 19 17:19:01 2001
+++ LVM_ignore_read_errors/1.0.1-rc2/tools/lib/pv_move.c Mon Sep 10 00:06:47 2001
@@ -494,12 +494,14 @@
if ( opt_t == 0) {
int lv_num = vg->pv[dst_pv_index]->pe[pe_dest].lv_num;
+/* Don't even try to lock
if ( ( ret = pe_lock ( vg->vg_name, vg->pv[src_pv_index]->pv_dev,
le_remap_req.old_pe, vg->vg_number,
lv_num, vg->lv[lv_num-1]->lv_dev)) < 0) {
ret = -LVM_EPV_MOVE_PE_LOCK;
- goto pv_move_pe_end;
+ goto pv_move_pe_end;
}
+*/
}
@@ -521,9 +523,13 @@
fprintf ( stderr, "%s -- ERROR reading input "
"physical volume \"%s\" (still %d bytes to read)\n\n",
cmd, vg->pv[src_pv_index]->pv_name, size);
+ fprintf( stderr, " -- IGNORED \n");
+ /*
pe_unlock ( vg->vg_name);
ret = -LVM_EPV_MOVE_PE_READ_IN;
goto pv_move_pe_end;
+ */
+ red=to_read;
}
to_write = red;
}
@@ -570,8 +576,9 @@
cmd, lvm_error ( ret),
pe_source,
vg->pv[src_pv_index]->pv_name);
- ret = -LVM_EPV_MOVE_PE_UNLOCK;
- goto pv_move_pe_end;
+ fprintf( stderr, "IGNORED\n");
+ /* ret = -LVM_EPV_MOVE_PE_UNLOCK;
+ goto pv_move_pe_end; */
}
}
* Re: [linux-lvm] pv_move_pe() error again :/
2001-09-09 22:16 ` Ragnar Kjørstad
@ 2001-09-09 23:51 ` FEJF
2001-09-10 8:39 ` Ragnar Kjørstad
0 siblings, 1 reply; 19+ messages in thread
From: FEJF @ 2001-09-09 23:51 UTC (permalink / raw)
To: linux-lvm
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Ragnar Kjørstad, on Montag, 10. September 2001 00:16 wrote:
> OK, this is the patch.
thx, for your help, but meanwhile i got one from Holger Grothe.
sorry, but i haven't had time to post it earlier. i do it now because it has
some advantages...
tools/lib/pv_move.c:
replace:
fprintf ( stderr, "%s -- ERROR reading input "
"physical volume \"%s\" (still %d bytes to read)\n\n",
cmd, vg->pv[src_pv_index]->pv_name, size);
pe_unlock ( vg->vg_name);
ret = -LVM_EPV_MOVE_PE_READ_IN;
goto pv_move_pe_end;
with:
fprintf ( stderr, "read: %ld, to_read %ld\n", red, to_read);
memset(buffer,170,to_read);
red=to_read;
with 170 you can choose which byte value the bad block should be replaced with.
(you can search your files for it later if you want - and have enough time ;)
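On searching for the damage afterwards: since the patch overwrites the unreadable block with byte 170 (0xAA), a replaced block should show up as a long run of that byte. A minimal sketch in C of such a search over a buffer that has already been read from a file. The function name find_marker_run and the run-length threshold are made up for this illustration and are not part of LVM.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of LVM): return the offset of the first
 * run of at least min_run consecutive marker bytes in buf, or -1 if
 * there is no such run.
 */
static long find_marker_run(const unsigned char *buf, size_t len,
                            unsigned char marker, size_t min_run)
{
    size_t run = 0;
    size_t i;

    assert(buf != NULL && min_run > 0);

    for (i = 0; i < len; i++) {
        if (buf[i] != marker) {
            run = 0;                     /* run broken, start counting again */
            continue;
        }
        if (++run >= min_run)
            return (long)(i + 1 - run);  /* offset where the run starts */
    }
    return -1;
}
```

Called with marker 0xAA and min_run 512 it reports where a replaced 512-byte block starts. Legitimate data can contain 0xAA bytes too, so a hit is only a hint.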
tools/pvmove.c:
replace:
int buffer_size = 64*1024;
with:
int buffer_size = 512;
this is an advantage over your patch, because pvmove then copies only 512
byte-blocks and if there's only one block damaged you don't lose 64kb data.
this has one disadvantage: it's SLOW... and slower than that ;)
so change the source and do a static compile:
./configure --enable-static_link ; make
you can then use the normal pvmove to move your partitions and if there is an
error you can use the (patched) tools/pvmove program for bad-block moving.
worked fine for me and i saw that only one 512 byte block was damaged.
fejf
ps: i used lvm 0.9.1b7 but it should look and work similar with other
versions but i think a diff is not the best for it ;)
- --
Backups are usefull. Most often when you don't have one ;)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org
iD8DBQE7nACM7Xtp66ctWuIRAgg7AKCBCVB/SPRCLjP9i5oHuf6vgx1u9wCeOmzP
ky+sZgdG6iVcPrN9Ufgl5L8=
=Mfig
-----END PGP SIGNATURE-----
* Re: [linux-lvm] pv_move_pe() error again :/
2001-09-09 23:51 ` FEJF
@ 2001-09-10 8:39 ` Ragnar Kjørstad
2001-09-10 11:27 ` Heinz J . Mauelshagen
2001-09-10 11:53 ` FEJF
0 siblings, 2 replies; 19+ messages in thread
From: Ragnar Kjørstad @ 2001-09-10 8:39 UTC (permalink / raw)
To: linux-lvm
On Mon, Sep 10, 2001 at 01:51:37AM +0200, FEJF wrote:
> Ragnar Kjørstad, on Montag, 10. September 2001 00:16 wrote:
> > OK, this is the patch.
>
> thx, for your help, but meanwhile i got one from Holger Grothe.
> sorry, but i haven't had time to post it earlier. i do it now because it has
> some advantages...
>
> tools/lib/pv_move.c:
>
> replace:
> fprintf ( stderr, "%s -- ERROR reading input "
> "physical volume \"%s\" (still %d bytes to read)\n\n",
> cmd, vg->pv[src_pv_index]->pv_name, size);
> pe_unlock ( vg->vg_name);
> ret = -LVM_EPV_MOVE_PE_READ_IN;
> goto pv_move_pe_end;
> with:
> fprintf ( stderr, "read: %ld, to_read %ld\n", red, to_read);
> memset(buffer,170,to_read);
> red=to_read;
>
> with 170 u can chosse with which chars the bad block should be replaced with.
> (you can search filez for them later if u want - and have enough time ;)
It's better than my patch, but still not "correct":
* In my case pe_lock() failed, so it had to be "removed" as well. Was
that not the case for you?
* red==-1 and red<to_read should probably be handled differently.
* For red>0 && red<to_read the regular execution path could be followed.
* When red==-1 there should be a seek on pv[src_pv_index], so the
position of the filehandle is set correctly. (now it's undefined)
* maybe red=SECTOR_SIZE; would be better? see next comment.
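The cases in the list above could be handled along these lines. This is only a sketch against plain POSIX read()/lseek(), not the actual pv_move_pe() code: the function name, the explicit offset parameter and the 0xAA fill byte (taken from the patch quoted above) are assumptions for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define FILL_BYTE 0xAA  /* 170, the marker value used in the quoted patch */

/*
 * Hypothetical helper: read len bytes starting at offset into buf.
 * - short read (0 < red < want): keep going, the loop simply continues
 * - failed read (red == -1): fill up to the next sector boundary and skip it
 * - end of file (red == 0): fill the remainder
 * Returns the number of gaps that were filled, or -1 if seeking fails.
 */
static int read_extent_tolerant(int fd, off_t offset, unsigned char *buf,
                                size_t len, size_t sector_size)
{
    size_t done = 0;
    int filled = 0;

    assert(buf != NULL && sector_size > 0 && offset >= 0);

    while (done < len) {
        off_t pos = offset + (off_t)done;
        size_t want = len - done;
        size_t skip;
        ssize_t red;

        /* Position explicitly every time: after a failed read the
           file offset cannot be relied on. */
        if (lseek(fd, pos, SEEK_SET) == (off_t)-1)
            return -1;

        red = read(fd, buf + done, want);
        if (red > 0) {
            done += (size_t)red;
            continue;
        }
        if (red == -1 && errno == EINTR)
            continue;  /* interrupted, just retry */

        skip = (red == 0) ? want
                          : sector_size - (size_t)(pos % (off_t)sector_size);
        if (skip > want)
            skip = want;

        fprintf(stderr, "filled %zu bytes at offset %lld\n",
                skip, (long long)pos);
        memset(buf + done, FILL_BYTE, skip);
        done += skip;
        filled++;
    }
    return filled;
}
```

With this shape the buffer size does not decide how much data is lost: a large read that stops short still delivers the good sectors in front of the bad one, and only the sector that actually fails is replaced.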
> tools/pvmove.c:
>
> replace:
> int buffer_size = 64*1024;
> with:
> int buffer_size = 512;
>
> this is an advantage to your patch, because pvmove then copys only 512
> byte-blocks and if there's only one block damaged u don't loose 64kb data.
> this has one disadvantage: it's SLOW... and slower than that ;)
This is not needed if the read loop is allowed to continue.
It would be great if someone took this patch, made the proposed changes,
and integrated it into the standard tools with an "ignore-read-errors"
flag... hint hint.
--
Ragnar Kjørstad
Big Storage
* Re: [linux-lvm] pv_move_pe() error again :/
2001-09-10 8:39 ` Ragnar Kjørstad
@ 2001-09-10 11:27 ` Heinz J . Mauelshagen
2001-09-10 11:45 ` FEJF
2001-09-10 15:38 ` Ragnar Kjørstad
2001-09-10 11:53 ` FEJF
1 sibling, 2 replies; 19+ messages in thread
From: Heinz J . Mauelshagen @ 2001-09-10 11:27 UTC (permalink / raw)
To: linux-lvm
Ragnar,
typically a dying source device will cause read to fail (~line 520
in pv_move.c). This could be easily addressed by an 'ignore read errors'
option and a fallback to BLOCK_SIZEed I/O in order to avoid as much data loss
as possible.
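The fallback to block-sized I/O could look roughly like this: try the large read first and only re-read in small blocks when it fails, so the healthy parts of the disk are still copied at full speed. A sketch using POSIX pread(); the function name, the parameters and the fill byte are illustrative assumptions, not existing LVM code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define FILL_BYTE 0xAA  /* assumption: same marker as in the posted patch */

/*
 * Hypothetical helper: read chunk bytes at offset into buf. If the large
 * read does not deliver everything, re-read the chunk block by block and
 * fill every block that cannot be read completely with FILL_BYTE.
 * Returns the number of filled blocks (0 means a clean read).
 * EINTR handling is left out to keep the sketch short.
 */
static int read_chunk_with_fallback(int fd, off_t offset, unsigned char *buf,
                                    size_t chunk, size_t block)
{
    size_t done;
    int bad = 0;

    assert(buf != NULL && block > 0);

    /* Fast path: the whole chunk in one read. */
    if (pread(fd, buf, chunk, offset) == (ssize_t)chunk)
        return 0;

    /* Slow path: one block at a time. */
    for (done = 0; done < chunk; done += block) {
        size_t n = (chunk - done < block) ? chunk - done : block;
        off_t pos = offset + (off_t)done;

        if (pread(fd, buf + done, n, pos) != (ssize_t)n) {
            fprintf(stderr, "block at offset %lld unreadable, filled\n",
                    (long long)pos);
            memset(buf + done, FILL_BYTE, n);
            bad++;
        }
    }
    return bad;
}
```

Compared with building pvmove with a 512-byte buffer, the small reads only happen inside a chunk that already failed once.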
But I wonder why the locking of the PE failed in your case. Are you able to
reproduce your case and provide the error code?
Regards,
Heinz -- The LVM Guy --
On Mon, Sep 10, 2001 at 10:39:52AM +0200, Ragnar Kjørstad wrote:
> On Mon, Sep 10, 2001 at 01:51:37AM +0200, FEJF wrote:
> > Ragnar Kjørstad, on Montag, 10. September 2001 00:16 wrote:
> > > OK, this is the patch.
> >
> > thx, for your help, but meanwhile i got one from Holger Grothe.
> > sorry, but i haven't had time to post it earlier. i do it now because it has
> > some advantages...
> >
> > tools/lib/pv_move.c:
> >
> > replace:
> > fprintf ( stderr, "%s -- ERROR reading input "
> > "physical volume \"%s\" (still %d bytes to read)\n\n",
> > cmd, vg->pv[src_pv_index]->pv_name, size);
> > pe_unlock ( vg->vg_name);
> > ret = -LVM_EPV_MOVE_PE_READ_IN;
> > goto pv_move_pe_end;
> > with:
> > fprintf ( stderr, "read: %ld, to_read %ld\n", red, to_read);
> > memset(buffer,170,to_read);
> > red=to_read;
> >
> > with 170 u can chosse with which chars the bad block should be replaced with.
> > (you can search filez for them later if u want - and have enough time ;)
>
> It's better than my patch, but still not "correct":
> * In my case pe_lock() failed, so it had to be "removed" as well. Was
> that not the case for you?
> * ret==-1 and ret<to_read should probably be handled differently.
> * For ret>0 && ret<to_read the regular execution path could be followed.
> * When red==-1 there should be a seek on pv[src_pv_index], so the
> possition of the filehandle is set correctly. (now it's undefined)
> * maybe red=SECTOR_SIZE; would be better? see next comment.
>
> > tools/pvmove.c:
> >
> > replace:
> > int buffer_size = 64*1024;
> > with:
> > int buffer_size = 512;
> >
> > this is an advantage to your patch, because pvmove then copys only 512
> > byte-blocks and if there's only one block damaged u don't loose 64kb data.
> > this has one disadvantage: it's SLOW... and slower than that ;)
>
> This is not needed if the read loop is allowed to continue.
>
>
> It would be great if someone took this patch, made the proposed changes,
> and integrated it into the standard tools with an "ingore-read-errors"
> flag... hint hint.
>
>
>
> --
> Ragnar Kjørstad
> Big Storage
>
* Re: [linux-lvm] pv_move_pe() error again :/
2001-09-10 11:27 ` Heinz J . Mauelshagen
@ 2001-09-10 11:45 ` FEJF
2001-09-10 13:43 ` Heinz J . Mauelshagen
2001-09-10 15:38 ` Ragnar Kjørstad
1 sibling, 1 reply; 19+ messages in thread
From: FEJF @ 2001-09-10 11:45 UTC (permalink / raw)
To: linux-lvm
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Heinz J . Mauelshagen, on Montag, 10. September 2001 13:27 wrote:
> Ragnar,
>
> typically a dying source device will cause read to fail (~line 520
> in pv_move_pe.c). This could be easily addressed by an 'ignore read errors'
> option and a fallback to BLOCK_SIZEed I/O in order to avoid as much data
> losses as possible.
>
> But I wonder, why in your case the locking of the PE failed. Are you able
> to reporduce your case and provide the error code?
the harddisc makes scary noises and i "lose" it after some hours without
a reboot, and have to wait a day or so to get it back. i think it's a
serious harddisc problem combined with some heat problem or so...
reproduce... hmm... perhaps try to throw your harddisc on the ground
several times ? ;)
fejf
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org
iD8DBQE7nKfB7Xtp66ctWuIRAmAuAKCdtKp3c3cADDSV+wZkzvYsrfHdxQCfdEyJ
54i0hhl+SAQb5tXKLJhxH+s=
=H8ud
-----END PGP SIGNATURE-----
* Re: [linux-lvm] pv_move_pe() error again :/
2001-09-10 8:39 ` Ragnar Kjørstad
2001-09-10 11:27 ` Heinz J . Mauelshagen
@ 2001-09-10 11:53 ` FEJF
1 sibling, 0 replies; 19+ messages in thread
From: FEJF @ 2001-09-10 11:53 UTC (permalink / raw)
To: linux-lvm
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Ragnar Kjørstad, on Montag, 10. September 2001 10:39 wrote:
[cut]
> It's better than my patch, but still not "correct":
> * In my case pe_lock() failed, so it had to be "removed" as well. Was
> that not the case for you?
> * ret==-1 and ret<to_read should probably be handled differently.
> * For ret>0 && ret<to_read the regular execution path could be followed.
> * When red==-1 there should be a seek on pv[src_pv_index], so the
> possition of the filehandle is set correctly. (now it's undefined)
> * maybe red=SECTOR_SIZE; would be better? see next comment.
after this msg i search through the syslog:
Aug 29 13:29:20 bolm kernel: hdh: read_intr: error=0x40 { UncorrectableError }, LBAsect=95091302, sector=95091239
Aug 29 13:29:20 bolm kernel: end_request: I/O error, dev 22:41 (hdh), sector 95091239
what does this locking do ? i mean copying these few remaining bytes took
about 10minutes... so perhaps it had problems with locking but skipped it ?
or would it abort in such a case ?
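For reading the two syslog lines above: on 2.4 kernels the IDE driver appears to report LBAsect as the absolute sector on the disk and sector relative to the device named in the end_request line (22:41, which should be the first partition of hdh), so the difference would be the start sector of the partition. Assuming 512-byte sectors, the numbers work out as in this small C sketch. The helper names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512u  /* assumption: classic 512-byte sectors */

/* Start sector of the partition, derived from one kernel error line. */
static uint64_t partition_start_sector(uint64_t lba_sect, uint64_t rel_sector)
{
    assert(lba_sect >= rel_sector);
    return lba_sect - rel_sector;
}

/* Byte offset of a sector, relative to the start of its device. */
static uint64_t sector_to_byte_offset(uint64_t sector)
{
    return sector * SECTOR_SIZE;
}
```

With the values from the log this gives a partition start at sector 63, and the unreadable sector sits at byte offset 48686714368 within /dev/hdh1. Which physical extent that falls into depends on the PE size and on where the first PE starts on the PV, neither of which is given in this thread.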
> > tools/pvmove.c:
> >
> > replace:
> > int buffer_size = 64*1024;
> > with:
> > int buffer_size = 512;
> >
> > this is an advantage to your patch, because pvmove then copys only 512
> > byte-blocks and if there's only one block damaged u don't loose 64kb
> > data. this has one disadvantage: it's SLOW... and slower than that ;)
>
> This is not needed if the read loop is allowed to continue.
and does it then also copy the whole 64kb, with only 512 bytes of crap in it,
if only one 512 byte block is damaged ?
> It would be great if someone took this patch, made the proposed changes,
> and integrated it into the standard tools with an "ingore-read-errors"
> flag... hint hint.
hehe ;)
fejf
- --
Backups are usefull. Most often when you don't have one ;)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org
iD8DBQE7nKnR7Xtp66ctWuIRAubiAJ9YHdZez6QGOUut63knCXcc0kf5DQCgrehc
Zyntw0mfYTTKn7se0VJZ7O8=
=+9/r
-----END PGP SIGNATURE-----
* Re: [linux-lvm] pv_move_pe() error again :/
2001-09-10 11:45 ` FEJF
@ 2001-09-10 13:43 ` Heinz J . Mauelshagen
2001-09-10 13:49 ` FEJF
0 siblings, 1 reply; 19+ messages in thread
From: Heinz J . Mauelshagen @ 2001-09-10 13:43 UTC (permalink / raw)
To: linux-lvm
On Mon, Sep 10, 2001 at 01:45:02PM +0200, FEJF wrote:
> Heinz J . Mauelshagen, on Montag, 10. September 2001 13:27 wrote:
> > Ragnar,
> >
> > typically a dying source device will cause read to fail (~line 520
> > in pv_move_pe.c). This could be easily addressed by an 'ignore read errors'
> > option and a fallback to BLOCK_SIZEed I/O in order to avoid as much data
> > losses as possible.
> >
> > But I wonder, why in your case the locking of the PE failed. Are you able
> > to reporduce your case and provide the error code?
>
> as the harddisc makes scary noise and i "loose" it after some hours without
> and reboot. and have to wait a day or so to get it back. i think it's a
> serious harddisc problem combined with some heat problem or so...
Maybe switching to operator beer cooling might help ;-)
> reproduce... hmm... prehaps try to throw your harddisc several times on the
> ground ? ;)
;)
> fejf
>
--
Regards,
Heinz -- The LVM Guy --
* Re: [linux-lvm] pv_move_pe() error again :/
2001-09-10 13:43 ` Heinz J . Mauelshagen
@ 2001-09-10 13:49 ` FEJF
0 siblings, 0 replies; 19+ messages in thread
From: FEJF @ 2001-09-10 13:49 UTC (permalink / raw)
To: linux-lvm
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Heinz J . Mauelshagen, on Montag, 10. September 2001 15:43 wrote:
> On Mon, Sep 10, 2001 at 01:45:02PM +0200, FEJF wrote:
> > Heinz J . Mauelshagen, on Montag, 10. September 2001 13:27 wrote:
> > > Ragnar,
> > >
> > > typically a dying source device will cause read to fail (~line 520
> > > in pv_move_pe.c). This could be easily addressed by an 'ignore read
> > > errors' option and a fallback to BLOCK_SIZEed I/O in order to avoid as
> > > much data losses as possible.
> > >
> > > But I wonder, why in your case the locking of the PE failed. Are you
> > > able to reporduce your case and provide the error code?
> >
> > as the harddisc makes scary noise and i "loose" it after some hours
> > without and reboot. and have to wait a day or so to get it back. i think
> > it's a serious harddisc problem combined with some heat problem or so...
>
> Maybe switching to operator beer cooling might help ;-)
bought an extra cooler for the harddisc... but it didn't help... but now i have
a fan for the operator while getting the beer out of the fridge :)
fejf
- --
Backups are usefull. Most often when you don't have one ;)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org
iD8DBQE7nMTR7Xtp66ctWuIRAp3OAKDM0dCKEcMySeK5OC9Ygu4tZd5ziACcCQiM
B9/m5t2DOCEP08pACc8VR4Q=
=pj6f
-----END PGP SIGNATURE-----
* Re: [linux-lvm] pv_move_pe() error again :/
2001-09-10 11:27 ` Heinz J . Mauelshagen
2001-09-10 11:45 ` FEJF
@ 2001-09-10 15:38 ` Ragnar Kjørstad
2001-09-10 16:13 ` FEJF
2001-09-11 14:31 ` Heinz J . Mauelshagen
1 sibling, 2 replies; 19+ messages in thread
From: Ragnar Kjørstad @ 2001-09-10 15:38 UTC (permalink / raw)
To: linux-lvm
On Mon, Sep 10, 2001 at 01:27:09PM +0200, Heinz J . Mauelshagen wrote:
> But I wonder, why in your case the locking of the PE failed. Are you able to
> reproduce your case and provide the error code?
It was reproducible at the time - now the disk in question has been
replaced. If it's interesting enough, I can make a phone call and see if
I can get the broken disk back.
The error occurred in the pe_lock() call at the beginning of the
move operation - the one that I commented out in the patch.
This is the kernel-log from the time (well, I think it was from the time
I ran pe_move):
Sep 6 18:10:01 argus kernel: hdd: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Sep 6 18:10:01 argus kernel: hdd: dma_intr: error=0x10 { SectorIdNotFound }, LBAsect=155058176, sector=0
Sep 6 18:10:01 argus kernel: lvm -- lvm_blk_ioctl: unknown command 587
Sep 6 18:10:01 argus kernel: hdd: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Sep 6 18:10:01 argus kernel: hdd: dma_intr: error=0x10 { SectorIdNotFound }, LBAsect=155058176, sector=0
Sep 6 18:10:01 argus kernel: lvm -- lvm_blk_ioctl: unknown command 587
and:
Sep 6 18:15:51 argus kernel: lvm -- lvm_blk_ioctl: unknown command 587
Sep 6 18:15:55 argus last message repeated 213 times
Sep 6 18:15:55 argus kernel: hdd: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
Sep 6 18:15:55 argus kernel: hdd: read_intr: error=0x40 { UncorrectableError }, LBAsect=118014332, sector=118014332
Sep 6 18:15:55 argus kernel: end_request: I/O error, dev 16:40 (hdd), sector 118014332
The kernel is standard 2.4.8 without the LVM patches. The userspace
tools are version 1.0.1-rc2.
--
Ragnar Kjørstad
Big Storage
* Re: [linux-lvm] pv_move_pe() error again :/
2001-09-10 15:38 ` Ragnar Kjørstad
@ 2001-09-10 16:13 ` FEJF
2001-09-11 14:31 ` Heinz J . Mauelshagen
1 sibling, 0 replies; 19+ messages in thread
From: FEJF @ 2001-09-10 16:13 UTC (permalink / raw)
To: linux-lvm
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Ragnar Kjørstad, on Monday, 10 September 2001 17:38 wrote:
> On Mon, Sep 10, 2001 at 01:27:09PM +0200, Heinz J . Mauelshagen wrote:
> > But I wonder, why in your case the locking of the PE failed. Are you able
> > to reproduce your case and provide the error code?
[cut]
> Sep 6 18:15:51 argus kernel: lvm -- lvm_blk_ioctl: unknown command 587
> Sep 6 18:15:55 argus last message repeated 213 times
> Sep 6 18:15:55 argus kernel: hdd: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
> Sep 6 18:15:55 argus kernel: hdd: read_intr: error=0x40 { UncorrectableError }, LBAsect=118014332, sector=118014332
> Sep 6 18:15:55 argus kernel: end_request: I/O error, dev 16:40 (hdd), sector 118014332
>
> The kernel is standard 2.4.8 without the LVM patches. The userspace
> tools are version 1.0.1-rc2.
i got similar errors - but in my box the damaged hd was the slave... so i got the
uncorrectable errors on the slave (hdh) and the dma "DriveReady
SeekComplete" errors on the master disk (hdg)
kernel 2.4.7-ac7 without lvm-patches and lvm 0.9.1b7 tools
fejf
- --
Backups are useful. Most often when you don't have one ;)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org
iD8DBQE7nOaY7Xtp66ctWuIRAr/rAKDbwC5DMea+yksFvJWZ9ubIvrm7LQCfTHMa
+aAlDpsQ/RUoakNRe+iUyo8=
=93ns
-----END PGP SIGNATURE-----
* Re: [linux-lvm] pv_move_pe() error again :/
2001-09-10 15:38 ` Ragnar Kjørstad
2001-09-10 16:13 ` FEJF
@ 2001-09-11 14:31 ` Heinz J . Mauelshagen
2001-09-11 17:26 ` Ragnar Kjørstad
1 sibling, 1 reply; 19+ messages in thread
From: Heinz J . Mauelshagen @ 2001-09-11 14:31 UTC (permalink / raw)
To: linux-lvm
On Mon, Sep 10, 2001 at 05:38:51PM +0200, Ragnar Kjørstad wrote:
> On Mon, Sep 10, 2001 at 01:27:09PM +0200, Heinz J . Mauelshagen wrote:
> > But I wonder, why in your case the locking of the PE failed. Are you able to
> > reproduce your case and provide the error code?
>
> It was reproducible at the time - now the disk in question has been
> replaced. If it's interesting enough, I can make a phone call and see if
> I can get the broken disk back.
>
> The error occurred in the pe_lock() call at the beginning of the
> move operation - the one that I commented out in the patch.
>
> This is the kernel-log from the time (well, I think it was from the time
> I ran pe_move):
> Sep 6 18:10:01 argus kernel: hdd: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> Sep 6 18:10:01 argus kernel: hdd: dma_intr: error=0x10 { SectorIdNotFound }, LBAsect=155058176, sector=0
> Sep 6 18:10:01 argus kernel: lvm -- lvm_blk_ioctl: unknown command 587
> Sep 6 18:10:01 argus kernel: hdd: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> Sep 6 18:10:01 argus kernel: hdd: dma_intr: error=0x10 { SectorIdNotFound }, LBAsect=155058176, sector=0
> Sep 6 18:10:01 argus kernel: lvm -- lvm_blk_ioctl: unknown command 587
>
> and:
>
> Sep 6 18:15:51 argus kernel: lvm -- lvm_blk_ioctl: unknown command 587
> Sep 6 18:15:55 argus last message repeated 213 times
> Sep 6 18:15:55 argus kernel: hdd: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
> Sep 6 18:15:55 argus kernel: hdd: read_intr: error=0x40 { UncorrectableError }, LBAsect=118014332, sector=118014332
> Sep 6 18:15:55 argus kernel: end_request: I/O error, dev 16:40 (hdd), sector 118014332
Ragnar, the block ioctl error the lvm driver shows is not related to the locking
of a physical extent, because that is achieved by the PE_LOCK_UNLOCK ioctl
(0x50 BTW) using the character ioctl function.
I wasn't able to find that ioctl (587) by grepping the kernel sources.
Could it be some application checking devices regularly, like a desktop
CD-ROM tool or something?
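One way to narrow down where "unknown command 587" comes from is to split the number into the fields that the kernel's ioctl macros pack together. The sketch below is only an illustration: `decode_ioctl` and `struct ioctl_fields` are names made up here, not part of LVM, and the bit layout is assumed to be the generic one used on x86 (some other architectures use different field widths).

```c
#include <assert.h>
#include <stddef.h>

/* Assumed bit layout: the generic Linux _IOC() encoding as used on x86.
 * Some architectures use a 13-bit size and a 3-bit direction field. */
#define IOC_NR_BITS    8
#define IOC_TYPE_BITS  8
#define IOC_SIZE_BITS 14
#define IOC_DIR_BITS   2

/* Hypothetical helper type for this example only. */
struct ioctl_fields {
    unsigned int nr;    /* command number within the driver's group */
    unsigned int type;  /* group ("magic") byte chosen by the driver */
    unsigned int size;  /* argument size in bytes, 0 for plain _IO() */
    unsigned int dir;   /* 0 = no data transfer, otherwise read/write bits */
};

/* Split a raw ioctl command value into its four fields. */
static void decode_ioctl(unsigned int cmd, struct ioctl_fields *out)
{
    assert(out != NULL);
    out->nr   = cmd & ((1u << IOC_NR_BITS) - 1u);
    out->type = (cmd >> IOC_NR_BITS) & ((1u << IOC_TYPE_BITS) - 1u);
    out->size = (cmd >> (IOC_NR_BITS + IOC_TYPE_BITS))
                & ((1u << IOC_SIZE_BITS) - 1u);
    out->dir  = (cmd >> (IOC_NR_BITS + IOC_TYPE_BITS + IOC_SIZE_BITS))
                & ((1u << IOC_DIR_BITS) - 1u);
}
```

Under that assumption, 587 is 0x24b, so `decode_ioctl(587, &f)` yields type 0x02 and number 0x4b with no size or direction bits set. That number is not the 0x50 used by PE_LOCK_UNLOCK, and it appears to match FDFLUSH from `<linux/fd.h>`, a flush request that some userspace tools send to any block device they open.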
I guess your problem was just the dying disk and therefore temporarily avoiding
the read() check in pv_move_pe() should have caught this one as well.
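The 'ignore read errors' idea with a fallback to BLOCK_SIZE-sized I/O, mentioned earlier in this thread, could look roughly like the sketch below. It is not the code in pv_move_pe.c: `read_with_fallback` and its parameters are hypothetical, and zero-filling the blocks that cannot be read is an assumption about what "ignore" would mean in practice.

```c
#define _XOPEN_SOURCE 700   /* for pread() under strict compiler modes */
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Read len bytes at offset from fd into buf.
 * First try one large read. If that fails or comes up short, retry the
 * whole range block by block and zero-fill every block that still cannot
 * be read completely (a short read counts as unreadable here).
 * Returns the number of zero-filled blocks (0 if the large read worked),
 * or -1 if block_size is 0. */
static long read_with_fallback(int fd, char *buf, size_t len,
                               off_t offset, size_t block_size)
{
    long bad_blocks = 0;
    size_t done;
    ssize_t n;

    assert(buf != NULL);
    if (block_size == 0)
        return -1;

    /* Fast path: the whole chunk in one request. */
    n = pread(fd, buf, len, offset);
    if (n == (ssize_t)len)
        return 0;

    /* Slow path: salvage as many individual blocks as possible. */
    for (done = 0; done < len; done += block_size) {
        size_t want = len - done < block_size ? len - done : block_size;

        n = pread(fd, buf + done, want, offset + (off_t)done);
        if (n != (ssize_t)want) {
            memset(buf + done, 0, want);  /* data in this block is lost */
            bad_blocks++;
        }
    }
    return bad_blocks;
}
```

A caller could treat a non-zero return value as a warning instead of aborting the move, so that only the unreadable blocks are lost and not the whole extent.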
Regards,
Heinz -- The LVM Guy --
>
> The kernel is standard 2.4.8 without the LVM patches. The userspace
> tools are version 1.0.1-rc2.
*** Software bugs are stupid.
Nevertheless it needs not so stupid people to solve them ***
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Heinz Mauelshagen Sistina Software Inc.
Senior Consultant/Developer Am Sonnenhang 11
56242 Marienrachdorf
Germany
Mauelshagen@Sistina.com +49 2626 141200
FAX 924446
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
* Re: [linux-lvm] pv_move_pe() error again :/
2001-09-11 14:31 ` Heinz J . Mauelshagen
@ 2001-09-11 17:26 ` Ragnar Kjørstad
0 siblings, 0 replies; 19+ messages in thread
From: Ragnar Kjørstad @ 2001-09-11 17:26 UTC (permalink / raw)
To: linux-lvm
On Tue, Sep 11, 2001 at 04:31:20PM +0200, Heinz J . Mauelshagen wrote:
> > Sep 6 18:15:51 argus kernel: lvm -- lvm_blk_ioctl: unknown command 587
> > Sep 6 18:15:55 argus last message repeated 213 times
> > Sep 6 18:15:55 argus kernel: hdd: read_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
> > Sep 6 18:15:55 argus kernel: hdd: read_intr: error=0x40 { UncorrectableError }, LBAsect=118014332, sector=118014332
> > Sep 6 18:15:55 argus kernel: end_request: I/O error, dev 16:40 (hdd), sector 118014332
>
> Ragnar, the block ioctl error the lvm driver shows is not related to the locking
> of a physical extent, because that is achieved by the PE_LOCK_UNLOCK ioctl
> (0x50 BTW) using the character ioctl function.
> Wasn't able to find that ioctl grepping the kernel sources.
> Could it be some application checking devices regularly like a desktop
> CD-ROM tool or something?
Could it be badblocks?
> I guess your problem was just the dying disk and therefore temporarily avoiding
> the read() check in pv_move_pe() should have caught this one as well.
I ran pv_move with debugging enabled, and it showed that it died in
lock_pe(). I'll try to reproduce this next time I find a broken disk.
--
Ragnar Kjørstad
Big Storage
Thread overview: 19+ messages
2001-09-06 15:28 [linux-lvm] pv_move_pe() error again :/ FEJF
2001-09-06 20:05 ` Ragnar Kjørstad
2001-09-06 23:41 ` FEJF
2001-09-07 9:43 ` Ragnar Kjørstad
2001-09-07 11:36 ` FEJF
2001-09-09 22:16 ` Ragnar Kjørstad
2001-09-09 23:51 ` FEJF
2001-09-10 8:39 ` Ragnar Kjørstad
2001-09-10 11:27 ` Heinz J . Mauelshagen
2001-09-10 11:45 ` FEJF
2001-09-10 13:43 ` Heinz J . Mauelshagen
2001-09-10 13:49 ` FEJF
2001-09-10 15:38 ` Ragnar Kjørstad
2001-09-10 16:13 ` FEJF
2001-09-11 14:31 ` Heinz J . Mauelshagen
2001-09-11 17:26 ` Ragnar Kjørstad
2001-09-10 11:53 ` FEJF
2001-09-07 10:46 ` Heinz J . Mauelshagen
2001-09-07 11:45 ` FEJF