* [linux-lvm] Page cache corruption when creating a snapshot
@ 2008-02-29 17:32 ghudson
2008-02-29 18:31 ` Alasdair G Kergon
2008-02-29 19:53 ` Stuart D. Gathman
0 siblings, 2 replies; 9+ messages in thread
From: ghudson @ 2008-02-29 17:32 UTC (permalink / raw)
To: linux-lvm
We have observed an apparent kernel memory corruption bug when
creating an LVM snapshot. This has been reproduced on two different
machines, so it does not appear to be a memory hardware issue.
The reproduction recipe looks like:
rm -rf /tmp/test
mkdir /tmp/test
# Put around 60MB of files into /tmp/test
find /tmp/test -type f | xargs md5sum > /tmp/sum.pre
lvcreate --size 2G --snapshot /dev/dink/gutsy-i386-sbuild --name testsnapshot
find /tmp/test -type f | xargs md5sum > /tmp/sum.post
lvremove -f /dev/dink/testsnapshot
diff -u /tmp/sum.pre /tmp/sum.post
Line 5 naturally needs to be adjusted for the LVM configuration of the
test machine. On my machine, /dev/dink/gutsy-i386-sbuild is an
unmounted 2GB logical volume containing a build chroot; it lives in a
different volume group from the one /tmp's filesystem is located in.
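A minimal sketch of the recipe above as reusable shell helpers. The names checksum_tree and check_around are hypothetical and not part of the original report. It assumes GNU find, sort and xargs (for the NUL-separated options), and the snapshot command is passed in as arguments rather than hard-coded.

```shell
#!/bin/sh
# Hypothetical helpers (names are not from the original report).

# Print an md5sum line for every regular file under directory $1,
# in a stable order so two runs can be compared with diff.
checksum_tree() {
    find "$1" -type f -print0 | sort -z | xargs -0 -r md5sum
}

# check_around DIR CMD [ARGS...]
# Checksum DIR, run CMD, checksum DIR again, and show any difference.
# Returns diff's exit status: 0 if nothing changed, 1 if something did.
check_around() {
    dir=$1
    shift
    pre=$(mktemp) || return 2
    post=$(mktemp) || { rm -f "$pre"; return 2; }
    checksum_tree "$dir" > "$pre"
    "$@"
    checksum_tree "$dir" > "$post"
    diff -u "$pre" "$post"
    rc=$?
    rm -f "$pre" "$post"
    return "$rc"
}
```

With these, the middle of the recipe becomes "check_around /tmp/test lvcreate --size 2G --snapshot /dev/dink/gutsy-i386-sbuild --name testsnapshot", followed by the lvremove. A nonzero status means some file's checksum changed across the snapshot creation.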
Intermittently, one of the files in /tmp/test has a different md5sum
after I do this. The difference is always a single byte at offset 156
within a 1K block (a different block each time), and the incorrect
value of that byte is always one less than the correct value. For
example:
@@ -471431,7 +471431,7 @@
0731860: 4d46 6ae3 0252 6864 e634 15eb 7ac1 f0ee MFj..Rhd.4..z...
0731870: 9f2b 8d82 33e3 138b 31a2 8da5 4594 5648 .+..3...1...E.VH
0731880: 74fd 00e0 bc48 fe09 d557 f501 70a8 7dfd t....H...W..p.}.
-0731890: ea8f 5010 b963 e2ec 7b84 8ef7 e851 fdfa ..P..c..{....Q..
+0731890: ea8f 5010 b963 e2ec 7b84 8ef7 e751 fdfa ..P..c..{....Q..
07318a0: 6031 670b cd54 fe01 20d6 f3fb c662 dfc3 `1g..T.. ....b..
07318b0: 7605 acd2 1be6 3fee 54ff e15b bc60 77fa v.....?.T..[.`w.
07318c0: 368e 99f9 60a0 a1a2 fbdf ef0d 4bca a201 6...`.......K...
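The changed byte in the dump above is the 13th byte of the line starting at 0x0731890, i.e. file offset 0x73189c, and 0x73189c mod 1024 is 156. A rough sketch of how such differences could be located and summarized automatically, assuming a known-good copy of the file is available for comparison. byte_diffs is a hypothetical helper name; it relies on "cmp -l", which prints the 1-based position and the two octal values of each differing byte.

```shell
#!/bin/sh
# Hypothetical helper: byte_diffs GOOD SUSPECT
# For each differing byte, print its 0-based file offset, its offset
# within the enclosing 1K block, and the change in value (suspect - good).
byte_diffs() {
    # cmp -l prints: <1-based byte number> <octal in GOOD> <octal in SUSPECT>
    cmp -l "$1" "$2" | while read -r pos good bad; do
        off=$((pos - 1))
        # The leading 0 makes the shell parse cmp's values as octal.
        echo "offset=$off in_block=$((off % 1024)) delta=$((0$bad - 0$good))"
    done
}
```

For the example above this would report in_block=156 and delta=-1 (0xe7 minus 0xe8).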
If the machine is rebooted (after moving /tmp/test to another location
so it doesn't get blown away by init scripts), the apparently modified
file reverts to the correct contents. Thus, the issue appears to be
page cache corruption, not actual filesystem corruption.
Version information:
root@linux-build-10:~# uname -a
Linux linux-build-10 2.6.22-14-server #1 SMP Thu Jan 31 23:57:25 UTC 2008 x86_64 GNU/Linux
root@linux-build-10:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 7.10
Release: 7.10
Codename: gutsy
root@linux-build-10:~# dpkg -s lvm2 | grep Version
Version: 2.02.26-1ubuntu4
root@linux-build-10:~# pvscan
PV /dev/sdb VG dink lvm2 [136.73 GB / 110.73 GB free]
PV /dev/sda5 VG LINUX-BUILD-10.mit.edu lvm2 [68.12 GB / 0 free]
Total: 2 [204.85 GB] / in use: 2 [204.85 GB] / in no VG: 0 [0 ]
root@linux-build-10:~# vgscan
Reading all physical volumes. This may take a while...
Found volume group "dink" using metadata type lvm2
Found volume group "LINUX-BUILD-10.mit.edu" using metadata type lvm2
(I sent a slightly different variant of this yesterday without
subscribing to the list, which I think was black-holed. Apologies if
this shows up twice. Also, I filed a similar bug report with Ubuntu
which can be seen at:
https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.22/+bug/196784
)
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [linux-lvm] Page cache corruption when creating a snapshot
2008-02-29 17:32 [linux-lvm] Page cache corruption when creating a snapshot ghudson
@ 2008-02-29 18:31 ` Alasdair G Kergon
2008-02-29 19:11 ` Greg Hudson
2008-02-29 19:53 ` Stuart D. Gathman
1 sibling, 1 reply; 9+ messages in thread
From: Alasdair G Kergon @ 2008-02-29 18:31 UTC (permalink / raw)
To: ghudson; +Cc: linux-lvm
On Fri, Feb 29, 2008 at 12:32:41PM -0500, ghudson@MIT.EDU wrote:
> The reproduction recipe looks like:
> rm -rf /tmp/test
> mkdir /tmp/test
> # Put around 60MB of files into /tmp/test
> find /tmp/test -type f | xargs md5sum > /tmp/sum.pre
> lvcreate --size 2G --snapshot /dev/dink/gutsy-i386-sbuild --name testsnapshot
> find /tmp/test -type f | xargs md5sum > /tmp/sum.post
Can you do that twice?
find /tmp/test -type f | xargs md5sum > /tmp/sum.post2
and check the two post files are the same?
And add some syncs/blockdev --flushbufs at different places
in the script to see if you can make it go away.
Alasdair
--
agk@redhat.com
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [linux-lvm] Page cache corruption when creating a snapshot
2008-02-29 18:31 ` Alasdair G Kergon
@ 2008-02-29 19:11 ` Greg Hudson
2008-02-29 19:10 ` Bryn M. Reeves
2008-02-29 19:29 ` malahal
0 siblings, 2 replies; 9+ messages in thread
From: Greg Hudson @ 2008-02-29 19:11 UTC (permalink / raw)
To: Alasdair G Kergon; +Cc: linux-lvm
On Fri, 2008-02-29 at 18:31 +0000, Alasdair G Kergon wrote:
> On Fri, Feb 29, 2008 at 12:32:41PM -0500, ghudson@MIT.EDU wrote:
> > The reproduction recipe looks like:
> > rm -rf /tmp/test
> > mkdir /tmp/test
> > # Put around 60MB of files into /tmp/test
> > find /tmp/test -type f | xargs md5sum > /tmp/sum.pre
> > lvcreate --size 2G --snapshot /dev/dink/gutsy-i386-sbuild --name testsnapshot
> > find /tmp/test -type f | xargs md5sum > /tmp/sum.post
>
> Can you do that twice?
> find /tmp/test -type f | xargs md5sum > /tmp/sum.post2
> and check the two post files are the same?
In three reproductions of the page cache corruption, sum.post2 was
always the same as sum.post.
In my experiences with this problem in general, the page cache
corruption is not particularly transient; once it happens, the file
continues to appear modified (with the same incorrect contents) for the
indefinite future, until the machine is rebooted.
> And add some syncs/blockdev --flushbufs at different places
> in the script to see if you can make it go away.
Nope, that never made it go away. I'm not sure in what situations
flushing write buffers would have any effect. If I had a way to throw
away the read-only page cache and force a file reload from disk, I would
expect that to eliminate the visible effect of the corruption; at the
moment the only reliable way I know how to do that is to reboot. (I
could churn the page cache into oblivion with a bazillion reads of
different files, but I'd have no way of knowing when I had succeeded in
reusing the corrupted cache page.)
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [linux-lvm] Page cache corruption when creating a snapshot
2008-02-29 19:11 ` Greg Hudson
@ 2008-02-29 19:10 ` Bryn M. Reeves
2008-02-29 19:35 ` Greg Hudson
2008-02-29 19:29 ` malahal
1 sibling, 1 reply; 9+ messages in thread
From: Bryn M. Reeves @ 2008-02-29 19:10 UTC (permalink / raw)
To: LVM general discussion and development
Greg Hudson wrote:
> Nope, that never made it go away. I'm not sure in what situations
> flushing write buffers would have any effect. If I had a way to throw
> away the read-only page cache and force a file reload from disk, I would
> expect that to eliminate the visible effect of the corruption; at the
> moment the only reliable way I know how to do that is to reboot. (I
> could churn the page cache into oblivion with a bazillion reads of
> different files, but I'd have no way of knowing when I had succeeded in
> reusing the corrupted cache page.)
You should be able to achieve that via the /proc/sys/vm/drop_caches sysctl:
Documentation/filesystems/proc.txt:
drop_caches
-----------
Writing to this will cause the kernel to drop clean caches, dentries and
inodes from memory, causing that memory to become free.
To free pagecache:
echo 1 > /proc/sys/vm/drop_caches
To free dentries and inodes:
echo 2 > /proc/sys/vm/drop_caches
To free pagecache, dentries and inodes:
echo 3 > /proc/sys/vm/drop_caches
I guess a 1 would be a good first try & if that doesn't clear it, a 3
should force the entire inode to be re-read from disk.
Regards,
Bryn.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [linux-lvm] Page cache corruption when creating a snapshot
2008-02-29 19:10 ` Bryn M. Reeves
@ 2008-02-29 19:35 ` Greg Hudson
2008-02-29 20:14 ` Alasdair G Kergon
0 siblings, 1 reply; 9+ messages in thread
From: Greg Hudson @ 2008-02-29 19:35 UTC (permalink / raw)
To: LVM general discussion and development
On Fri, 2008-02-29 at 19:10 +0000, Bryn M. Reeves wrote:
> Greg Hudson wrote:
> > Nope, that never made it go away. I'm not sure in what situations
> > flushing write buffers would have any effect. If I had a way to throw
> > away the read-only page cache and force a file reload from disk, I would
> > expect that to eliminate the visible effect of the corruption; at the
> > moment the only reliable way I know how to do that is to reboot.
> To free pagecache:
> echo 1 > /proc/sys/vm/drop_caches
Thank you! Verified that freeing the page cache causes the
apparently-corrupted file's md5sum to revert to the correct value:
[Abbreviated diff output:]
--- /tmp/sum.pre 2008-02-29 14:30:29.000000000 -0500
+++ /tmp/sum.post 2008-02-29 14:30:32.000000000 -0500
-3e497fef20ce899c5621dcfbcbfec9a3 /tmp/test/krb5-1.6.dfsg/src/windows/cns/kerbnet.hlp
+e9725c38b630515dc304a26bde28fa51 /tmp/test/krb5-1.6.dfsg/src/windows/cns/kerbnet.hlp
root@linux-build-10:~# md5sum /tmp/test/krb5-1.6.dfsg/src/windows/cns/kerbnet.hlp
e9725c38b630515dc304a26bde28fa51 /tmp/test/krb5-1.6.dfsg/src/windows/cns/kerbnet.hlp
root@linux-build-10:~# echo 1 > /proc/sys/vm/drop_caches
root@linux-build-10:~# md5sum /tmp/test/krb5-1.6.dfsg/src/windows/cns/kerbnet.hlp
3e497fef20ce899c5621dcfbcbfec9a3 /tmp/test/krb5-1.6.dfsg/src/windows/cns/kerbnet.hlp
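The check performed above can be sketched as a small helper, under some assumptions: the name recheck_after_drop is hypothetical, writing to /proc/sys/vm/drop_caches needs root, and the optional second argument exists only so the control file can be overridden (for example when trying the function without root).

```shell
#!/bin/sh
# Hypothetical helper: recheck_after_drop FILE [CONTROL_FILE]
# Checksum FILE, ask the kernel to drop clean page cache, then checksum
# FILE again so that the second read has to come from disk.
recheck_after_drop() {
    file=$1
    ctl=${2:-/proc/sys/vm/drop_caches}
    before=$(md5sum < "$file") || return 2
    sync                         # write back dirty pages; only clean ones are dropped
    echo 1 > "$ctl" || return 2  # 1 = free pagecache only
    after=$(md5sum < "$file") || return 2
    if [ "$before" = "$after" ]; then
        echo "unchanged"
    else
        echo "changed: the cached copy differed from what is on disk"
        return 1
    fi
}
```

A "changed" result on a file that nothing has written to matches the behaviour shown in the transcript above: the stale checksum came from the cache, not from the disk.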
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [linux-lvm] Page cache corruption when creating a snapshot
2008-02-29 19:35 ` Greg Hudson
@ 2008-02-29 20:14 ` Alasdair G Kergon
2008-03-04 23:57 ` Greg Hudson
0 siblings, 1 reply; 9+ messages in thread
From: Alasdair G Kergon @ 2008-02-29 20:14 UTC (permalink / raw)
To: LVM general discussion and development
On Fri, Feb 29, 2008 at 02:35:16PM -0500, Greg Hudson wrote:
> On Fri, 2008-02-29 at 19:10 +0000, Bryn M. Reeves wrote:
> > Greg Hudson wrote:
> Thank you! Verified that freeing the page cache causes the
> apparently-corrupted file's md5sum to revert to the correct value:
I notice this is ubuntu: Can you eliminate the distribution, by
reproducing on a different distribution, or using a current upstream
development kernel like 2.6.25-rc3?
What is /tmp ?
Alasdair
--
agk@redhat.com
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [linux-lvm] Page cache corruption when creating a snapshot
2008-02-29 20:14 ` Alasdair G Kergon
@ 2008-03-04 23:57 ` Greg Hudson
0 siblings, 0 replies; 9+ messages in thread
From: Greg Hudson @ 2008-03-04 23:57 UTC (permalink / raw)
To: LVM general discussion and development
On Fri, 2008-02-29 at 20:14 +0000, Alasdair G Kergon wrote:
> I notice this is ubuntu: Can you eliminate the distribution, by
> reproducing on a different distribution, or using a current upstream
> development kernel like 2.6.25-rc3?
Apologies for the delay. I was unable to reproduce this bug in many
trials on 2.6.25-rc3, so it may have been either introduced by Ubuntu in
gutsy or (more likely) fixed in the mainline kernel source since 2.6.22.
Anders Kaseorg has also seen this problem on Gutsy but was unable to
reproduce it on Hardy, which lends credence to the second theory.
> What is /tmp ?
A regular old directory on the root filesystem, which is ext3fs on a
different LVM volume group on a different disk from the volume group
containing the LVs I'm snapshotting.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [linux-lvm] Page cache corruption when creating a snapshot
2008-02-29 19:11 ` Greg Hudson
2008-02-29 19:10 ` Bryn M. Reeves
@ 2008-02-29 19:29 ` malahal
1 sibling, 0 replies; 9+ messages in thread
From: malahal @ 2008-02-29 19:29 UTC (permalink / raw)
To: linux-lvm
Greg Hudson [ghudson@MIT.EDU] wrote:
> On Fri, 2008-02-29 at 18:31 +0000, Alasdair G Kergon wrote:
> > On Fri, Feb 29, 2008 at 12:32:41PM -0500, ghudson@MIT.EDU wrote:
> > > The reproduction recipe looks like:
> > > rm -rf /tmp/test
> > > mkdir /tmp/test
> > > # Put around 60MB of files into /tmp/test
> > > find /tmp/test -type f | xargs md5sum > /tmp/sum.pre
> > > lvcreate --size 2G --snapshot /dev/dink/gutsy-i386-sbuild --name testsnapshot
> > > find /tmp/test -type f | xargs md5sum > /tmp/sum.post
> >
> > Can you do that twice?
> > find /tmp/test -type f | xargs md5sum > /tmp/sum.post2
> > and check the two post files are the same?
>
> In three reproductions of the page cache corruption, sum.post2 was
> always the same as sum.post.
>
> In my experiences with this problem in general, the page cache
> corruption is not particularly transient; once it happens, the file
> continues to appear modified (with the same incorrect contents) for the
> indefinite future, until the machine is rebooted.
>
> > And add some syncs/blockdev --flushbufs at different places
> > in the script to see if you can make it go away.
>
> Nope, that never made it go away. I'm not sure in what situations
> flushing write buffers would have any effect. If I had a way to throw
> away the read-only page cache and force a file reload from disk, I would
> expect that to eliminate the visible effect of the corruption; at the
> moment the only reliable way I know how to do that is to reboot.
Not an expert on O_DIRECT, but it is supposed to read from the disk
without creating page cache. I don't really know what it does if page
cache already exists! The "dd" command has O_DIRECT support, so see
if you notice any difference in the corrupted file when you read it
with "dd" with and without O_DIRECT.
--Malahal.
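A sketch of the comparison suggested above, assuming GNU dd, whose iflag=direct opens the input with O_DIRECT. sum_via_dd is a hypothetical helper name. Some filesystems do not support O_DIRECT, in which case the direct read fails and the helper returns nonzero instead of printing a checksum.

```shell
#!/bin/sh
# Hypothetical helper: sum_via_dd FILE [EXTRA_DD_OPERANDS...]
# Copy FILE through dd into a scratch file and print the md5 of what dd
# actually read. Returns nonzero if dd itself fails.
sum_via_dd() {
    file=$1
    shift
    out=$(mktemp) || return 1
    if dd if="$file" of="$out" bs=1M "$@" 2>/dev/null; then
        md5sum < "$out" | cut -d' ' -f1
        rc=0
    else
        rc=1
    fi
    rm -f "$out"
    return "$rc"
}
```

Running "sum_via_dd FILE" gives the checksum of an ordinary read, which may be served from the page cache, and "sum_via_dd FILE iflag=direct" gives the checksum of an O_DIRECT read. If the two differ, the cached copy and the on-disk copy disagree.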
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [linux-lvm] Page cache corruption when creating a snapshot
2008-02-29 17:32 [linux-lvm] Page cache corruption when creating a snapshot ghudson
2008-02-29 18:31 ` Alasdair G Kergon
@ 2008-02-29 19:53 ` Stuart D. Gathman
1 sibling, 0 replies; 9+ messages in thread
From: Stuart D. Gathman @ 2008-02-29 19:53 UTC (permalink / raw)
To: LVM general discussion and development
On Fri, 29 Feb 2008 ghudson@MIT.EDU wrote:
> rm -rf /tmp/test
> mkdir /tmp/test
> # Put around 60MB of files into /tmp/test
> find /tmp/test -type f | xargs md5sum > /tmp/sum.pre
Ran the last four lines several times on my CentOS 5 system, with
/tmp and the snapshot origin in different volume groups (/tmp on rootvg).
# du /tmp/test
84688 /tmp/test
# lvs backvg2/DFL
LV VG Attr LSize Origin Snap% Move Log Copy%
DFL backvg2 -wi-a- 10.00G
> lvcreate --size 2G --snapshot /dev/backvg2/DFL --name DFL_snap
> find /tmp/test -type f | xargs md5sum > /tmp/sum.post
> lvremove -f /dev/backvg2/DFL_snap
> diff -u /tmp/sum.pre /tmp/sum.
Doesn't seem to fail for us. (Sighs with relief.)
--
Stuart D. Gathman <stuart@bmsi.com>
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Confutatis maledictis, flammis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.
^ permalink raw reply [flat|nested] 9+ messages in thread