* [linux-lvm] LVM2/DM data corruption during write with 2.6.11.12
@ 2005-09-08 10:31 Ludovic Drolez
0 siblings, 0 replies; only message in thread
From: Ludovic Drolez @ 2005-09-08 10:31 UTC (permalink / raw)
To: linux-lvm
Hi !
We are developing a (GPLed) disk cloning software similar to partimage: it's an
intelligent 'dd' which backups only used sectors. Project info and SVN available
at http://lrs.linbox.org
Recently I added LVM1/2 support to it, and sometimes we saw LVM restorations
failing randomly (Disk images from RHES are not corrupted, but the result of the
restoration can be lead to a corrupted filesystem). If a restoration fails, just
try another one and it will work...
How the restoration program works:
- I restore the LVM2 administrative data (384 sectors, most of the time),
- I 'vgscan', 'vgchange',
- open for writing the '/dev/dm-xxxx',
- read a compressed file over NFS,
- and put the sectors in place, so it's a succession of '_llseek()' and
'write()' to the DM device.
But, *sometimes*, for example, the current seek position is at 9GB, and some
data is written to sector 0 ! It happens randomly.
Here is a typical strace of a restoration:
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 512) = 512
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 512) = 512
_llseek(5, 20963328, [37604032512], SEEK_CUR) = 0
_llseek(5, 0, [37604032512], SEEK_CUR) = 0
_llseek(5, 2097152, [37606129664], SEEK_CUR) = 0
write(5, "\1\335E\0\f\0\1\2.\0\0\0\2\0\0\0\364\17\2\2..\0\0\0\0\0"..., 512) = 51
2
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 512) = 512
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 512) = 512
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 512) = 512
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 512) = 512
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 512) = 512
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 512) = 512
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 512) = 512
_llseek(5, 88076288, [37694210048], SEEK_CUR) = 0
_llseek(5, 0, [37694210048], SEEK_CUR) = 0
_llseek(5, 20971520, [37715181568], SEEK_CUR) = 0
write(5, "\377\377\377\377\377\377\377\377\377\377\377\377\377\377"..., 512) = 5
12
....
....
As you can see, there are no seeks to sector 0, but something randomly write
some data to sector 0 !
I could reproduce these random problems on different kind of PCs.
But, the strace above comes from an improved version, which aggregates
'_llseek's. A previous version, which did *many* 512 bytes seeks had much more
problems. Aggregating seeks made the corruption to appears very rarely... And I
more likely to happen, for a 40GB restoration than for a 10GB one.
So less system calls to the DM/LVM2 layer seems to give less corruption probability.
Any ideas ? Newer kernel releases could have fixed such a problem ?
--
Ludovic DROLEZ Linbox / Free&ALter Soft
www.linbox.com www.linbox.org
^ permalink raw reply [flat|nested] only message in thread
only message in thread, other threads:[~2005-09-08 10:31 UTC | newest]
Thread overview: (only message) (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-09-08 10:31 [linux-lvm] LVM2/DM data corruption during write with 2.6.11.12 Ludovic Drolez
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.