* xfs_fsr - problem with open files possible?
@ 2007-06-15 16:04 Martin Steigerwald
2007-06-15 16:58 ` Eric Sandeen
0 siblings, 1 reply; 5+ messages in thread
From: Martin Steigerwald @ 2007-06-15 16:04 UTC (permalink / raw)
To: linux-xfs
Hi!
I ran xfs_fsr on my laptops some time ago, for example on my partition
which holds /home. This seemed to work quite well:
---------------------------------------------------------------------
shambala:~> xfs_db -c frag -f /dev/sda2
actual 777855, ideal 737681, fragmentation factor 5.16%
shambala:~> xfs_fsr /dev/sda2
/home start inode=0
shambala:~> xfs_db -c frag -f /dev/sda2
actual 744809, ideal 737681, fragmentation factor 0.96%
shambala:~> xfs_fsr /dev/sda2
/home start inode=0
shambala:~> xfs_db -c frag -f /dev/sda2
actual 744809, ideal 737681, fragmentation factor 0.96%
---------------------------------------------------------------------
As discussed before here there is a limit in what xfs_fsr can do and it
didn't come below 0.96% fragmentation even though the partition has
several GB of free space.
As I restarted KMail after fragmenting it however told me about some
broken indexes. KMail stores index for folders in files. It was able to
restore the indexes.
I can not pinpoint this problem to the defragmentation with xfs_fsr, but
it seemed more than a coincidence to me that it happened directly after I
restarted KMail after the defragmentation.
KMail was running while the fragmentation was taking place and well it
likely had some index files opened. At least it does now that I am
writing that mail:
---------------------------------------------------------------------
shambala:~> lsof +D /home/martin/Mail
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
kontact 2787 martin mem REG 8,2 280837
223046853 /home/martin/Mail/.Lichtvoll.directory/.KDE.index
kontact 2787 martin mem REG 8,2 33
119177542 /home/martin/Mail/.templates.index
kontact 2787 martin mem REG 8,2 33
120665290 /home/martin/Mail/.drafts.index
kontact 2787 martin 18u REG 8,2 280837
223046853 /home/martin/Mail/.Lichtvoll.directory/.KDE.index
kontact 2787 martin 19u REG 8,2 33
120665290 /home/martin/Mail/.drafts.index
kontact 2787 martin 20u REG 8,2 33
119177542 /home/martin/Mail/.templates.index
---------------------------------------------------------------------
Could data loss happen when running xfs_fsr on files that are opened by an
application?
I did not came across any other corrupted files except a Mercurial
repository. I can not pinpoint this problem to XFS at all and have no
idea how and when it got corrupted. At least in my backup from some weeks
ago the repository has been okay. Unfortunately I do not know anymore
whether I made a commit to that repository while xfs_fsr was running or
not. But I think I didn't.
The filesystem itself was okay after fragmentation, I checked it via
xfs_check!
I can try to reproduce the problem. Would be handy tough to have an
xfs_fsr that can be limited only to operate on a certain directory and
its files and sub directories. This way I could create a new mailfolder,
copy some mails in there, have it opened so that KMail accesses the index
file and let xfs_fsr only run on this mailfolder.
I can also open a bug report, but I first wanted to ask here.
Regards,
--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: xfs_fsr - problem with open files possible?
2007-06-15 16:04 xfs_fsr - problem with open files possible? Martin Steigerwald
@ 2007-06-15 16:58 ` Eric Sandeen
2007-06-15 18:13 ` Martin Steigerwald
2007-06-23 15:23 ` xfs_fsr and null byte areas in files (was: Re: xfs_fsr - problem with open files possible?) Martin Steigerwald
0 siblings, 2 replies; 5+ messages in thread
From: Eric Sandeen @ 2007-06-15 16:58 UTC (permalink / raw)
To: Martin Steigerwald; +Cc: linux-xfs
Martin Steigerwald wrote:
> Could data loss happen when running xfs_fsr on files that are opened by an
> application?
It should not; fsr performs a lot of safety checks and aborts under
problematic circumstances. It will skip files if:
* mandatory locks are present
* file is marked immutable, append-only, or nodefrag
* filesystem is shut down
* change/modify times have been altered since defrag started
* original file is mmapped
If you can clearly recreate it with xfs_fsr, it would be interesting to
compare the good & bad index files to see how they differ, it might
offer a clue as to what/why/how it changed.
> I did not came across any other corrupted files except a Mercurial
> repository. I can not pinpoint this problem to XFS at all and have no
> idea how and when it got corrupted. At least in my backup from some weeks
> ago the repository has been okay. Unfortunately I do not know anymore
> whether I made a commit to that repository while xfs_fsr was running or
> not. But I think I didn't.
>
> The filesystem itself was okay after fragmentation, I checked it via
> xfs_check!
>
> I can try to reproduce the problem. Would be handy tough to have an
> xfs_fsr that can be limited only to operate on a certain directory and
> its files and sub directories. This way I could create a new mailfolder,
> copy some mails in there, have it opened so that KMail accesses the index
> file and let xfs_fsr only run on this mailfolder.
>From the man page:
Files marked as no-defrag will be skipped. The xfs_io(8) chattr command
with the f attribute can be used to set or clear this flag. Files and
directories created in a directory with the no-defrag flag will inherit
the attribute.
so you can flag your mail directory or index files as no-defrag if you
like. (I know this is the converse of what you wanted... but maybe helpful)
-Eric
> I can also open a bug report, but I first wanted to ask here.
>
> Regards,
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: xfs_fsr - problem with open files possible?
2007-06-15 16:58 ` Eric Sandeen
@ 2007-06-15 18:13 ` Martin Steigerwald
2007-06-15 20:04 ` Martin Steigerwald
2007-06-23 15:23 ` xfs_fsr and null byte areas in files (was: Re: xfs_fsr - problem with open files possible?) Martin Steigerwald
1 sibling, 1 reply; 5+ messages in thread
From: Martin Steigerwald @ 2007-06-15 18:13 UTC (permalink / raw)
To: linux-xfs
Am Freitag 15 Juni 2007 schrieb Eric Sandeen:
> > I can try to reproduce the problem. Would be handy tough to have an
> > xfs_fsr that can be limited only to operate on a certain directory
> > and its files and sub directories. This way I could create a new
> > mailfolder, copy some mails in there, have it opened so that KMail
> > accesses the index file and let xfs_fsr only run on this mailfolder.
>
> From the man page:
>
> Files marked as no-defrag will be skipped. The xfs_io(8) chattr
> command with the f attribute can be used to set or clear this flag.
> Files and directories created in a directory with the no-defrag flag
> will inherit the attribute.
>
> so you can flag your mail directory or index files as no-defrag if you
> like. (I know this is the converse of what you wanted... but maybe
> helpful)
Hello Eric!
Well also from the man page as I just found out after sending my mail:
"xfs_fsr can be called with one or more arguments naming filesystems
(block device name), and files to reorganize. In this mode xfs_fsr does
not read or write /var/tmp/.fsrlast_xfs nor does it run for a fixed time
interval. It makes one pass through each specified regular file and all
regular files in each specified filesystem. A command line name
referring to a symbolic link (except to a file system device), FIFO, or
UNIX domain socket generates a warning message, but is otherwise
ignored. While traversing the filesystem these types of files are
silently skipped."
So maybe something along the lines of:
xfs_fsr ~/Mail/.trash.index
while it is open by KMail should be enough. I think I will give this a try
soon.
Regards,
--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: xfs_fsr - problem with open files possible?
2007-06-15 18:13 ` Martin Steigerwald
@ 2007-06-15 20:04 ` Martin Steigerwald
0 siblings, 0 replies; 5+ messages in thread
From: Martin Steigerwald @ 2007-06-15 20:04 UTC (permalink / raw)
To: linux-xfs
Am Freitag 15 Juni 2007 schrieb Martin Steigerwald:
> Hello Eric!
>
> Well also from the man page as I just found out after sending my mail:
>
> "xfs_fsr can be called with one or more arguments naming filesystems
> (block device name), and files to reorganize. In this mode xfs_fsr
> does not read or write /var/tmp/.fsrlast_xfs nor does it run for a
> fixed time interval. It makes one pass through each specified regular
> file and all regular files in each specified filesystem. A command
> line name referring to a symbolic link (except to a file system
> device), FIFO, or UNIX domain socket generates a warning message, but
> is otherwise ignored. While traversing the filesystem these types of
> files are silently skipped."
>
> So maybe something along the lines of:
>
> xfs_fsr ~/Mail/.trash.index
>
> while it is open by KMail should be enough. I think I will give this a
> try soon.
Hello!
Hmmm, could not reproduce it with a single file. xfs_fsr seems to be
behaving correctly:
shambala:Mail/.Lichtvoll.directory/.Linux.directory>
xfs_bmap -v .Kernel.index
.Kernel.index:
EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
0: [0..167]: 83985960..83986127 15 (1586760..1586927) 168
1: [168..175]: 83985824..83985831 15 (1586624..1586631) 8
shambala:Mail/.Lichtvoll.directory/.Linux.directory> md5sum .Kernel.index
9d1fdd1138e297fd97548a3a4696e308 .Kernel.index
Started Kontact (KMail)...
shambala:Mail/.Lichtvoll.directory/.Linux.directory> md5sum .Kernel.index
9d1fdd1138e297fd97548a3a4696e308 .Kernel.index
shambala:Mail/.Lichtvoll.directory/.Linux.directory> lsof .Kernel.index
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
kontact 15985 martin mem REG 8,2 86497 254828298 .Kernel.index
kontact 15985 martin 23u REG 8,2 86497 254828298 .Kernel.index
shambala:Mail/.Lichtvoll.directory/.Linux.directory>
cp -a .Kernel.index .Kernel.index.old
shambala:Mail/.Lichtvoll.directory/.Linux.directory>
md5sum .Kernel.index.old
9d1fdd1138e297fd97548a3a4696e308 .Kernel.index.old
shambala:Mail/.Lichtvoll.directory/.Linux.directory> md5sum .Kernel.index
9d1fdd1138e297fd97548a3a4696e308 .Kernel.index
Trying to defragment while file is open: xfs_fsr seems to leave the file
alone completely like it should.
shambala:Mail/.Lichtvoll.directory/.Linux.directory> xfs_fsr .Kernel.index
shambala:Mail/.Lichtvoll.directory/.Linux.directory>
xfs_bmap -v .Kernel.index
.Kernel.index:
EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
0: [0..167]: 83985960..83986127 15 (1586760..1586927) 168
1: [168..175]: 83985824..83985831 15 (1586624..1586631) 8
shambala:Mail/.Lichtvoll.directory/.Linux.directory> md5sum .Kernel.index
9d1fdd1138e297fd97548a3a4696e308 .Kernel.index
Also on a second try
shambala:Mail/.Lichtvoll.directory/.Linux.directory> xfs_fsr .Kernel.index
shambala:Mail/.Lichtvoll.directory/.Linux.directory>
xfs_bmap -v .Kernel.index
.Kernel.index:
EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
0: [0..167]: 83985960..83986127 15 (1586760..1586927) 168
1: [168..175]: 83985824..83985831 15 (1586624..1586631) 8
shambala:Mail/.Lichtvoll.directory/.Linux.directory> md5sum .Kernel.index
9d1fdd1138e297fd97548a3a4696e308 .Kernel.index
Stopping Kontact (KMail) and trying to defragment then... well now it
defragments. md5sum stays the fine.
shambala:Mail/.Lichtvoll.directory/.Linux.directory> md5sum .Kernel.index
9d1fdd1138e297fd97548a3a4696e308 .Kernel.index
shambala:Mail/.Lichtvoll.directory/.Linux.directory> xfs_fsr .Kernel.index
shambala:Mail/.Lichtvoll.directory/.Linux.directory>
xfs_bmap -v .Kernel.index
.Kernel.index:
EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
0: [0..175]: 86612448..86612623 15 (4213248..4213423) 176
shambala:Mail/.Lichtvoll.directory/.Linux.directory> md5sum .Kernel.index
9d1fdd1138e297fd97548a3a4696e308 .Kernel.index
Hmmm, maybe there is some race condition that is not triggered always. If
I find time I let xfs_fsr run von ~/Mail or folder with several sub
folders so that more index files are touched. I can make a backup, let it
run and if Kontact / KMail complains look at the differences.
Regards,
--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7
^ permalink raw reply [flat|nested] 5+ messages in thread
* xfs_fsr and null byte areas in files (was: Re: xfs_fsr - problem with open files possible?)
2007-06-15 16:58 ` Eric Sandeen
2007-06-15 18:13 ` Martin Steigerwald
@ 2007-06-23 15:23 ` Martin Steigerwald
1 sibling, 0 replies; 5+ messages in thread
From: Martin Steigerwald @ 2007-06-23 15:23 UTC (permalink / raw)
To: linux-xfs
Am Freitag 15 Juni 2007 schrieb Eric Sandeen:
> Martin Steigerwald wrote:
> > Could data loss happen when running xfs_fsr on files that are opened
> > by an application?
>
> It should not; fsr performs a lot of safety checks and aborts under
> problematic circumstances. It will skip files if:
>
> * mandatory locks are present
> * file is marked immutable, append-only, or nodefrag
> * filesystem is shut down
> * change/modify times have been altered since defrag started
> * original file is mmapped
>
> If you can clearly recreate it with xfs_fsr, it would be interesting to
> compare the good & bad index files to see how they differ, it might
> offer a clue as to what/why/how it changed.
>
> > I did not came across any other corrupted files except a Mercurial
> > repository. I can not pinpoint this problem to XFS at all and have no
> > idea how and when it got corrupted. At least in my backup from some
> > weeks ago the repository has been okay. Unfortunately I do not know
> > anymore whether I made a commit to that repository while xfs_fsr was
> > running or not. But I think I didn't.
Hello Eric!
Well I now found lots of corrupted Bazaar repositories as well. For all
except one I had a working backup. Those were definately not in use while
xfs_fsr ran. I can't prove the corruption to xfs_fsr, but except for
another XFS failure - which I doubt happened - I can't think of much of
else. Granted I used some beta versions of Bazaar, but not on all of
those corrupted repositories.
I tried to reproduce this with my good repositories from the backup, but
couldn't. I bet that they were not fragmented. xfs_fsr **/* (which
recursively matches all files on Z-Shell) in the directories of them
returned rather quickly.
It also happened on a second notebook. I have a T42 and a T23 and it
happened on the T23 too.
I had a closer look about the kind of corruption. Only the file content
seemed to have changed, not the date of the file or its size. Thus I was
in luck cause rsync simply didn't copy the corrupted repositories to my
backup disk. I found out about this as I tried to rsync the corrected
repositories from my T42 to the T23. rsync would not copy them. But when
I used rsync -c it copies some files with differing checksum.
One example:
---------------------------------------------------------------------
martin@shambala:~/.crm114> rsync -acnv .bzr/ .bzr-broken/
building file list ... done
./
branch-lock/
branch/
branch/lock/
checkout/
checkout/lock/
repository/
repository/knits/
repository/knits/22/
repository/knits/36/
repository/knits/36/mailtrainer.crm-20070329201046-91b62hipeixaywh9-5.knit
repository/knits/3d/
repository/knits/4a/
repository/knits/4a/mailfilter.cf-20070329201046-91b62hipeixaywh9-1.knit
repository/knits/76/
repository/knits/76/maillib.crm-20070329201046-91b62hipeixaywh9-3.knit
repository/knits/8c/
repository/knits/8c/shuffle.crm-20070329201046-91b62hipeixaywh9-6.knit
repository/knits/9c/
repository/knits/a9/
repository/knits/c5/
repository/knits/c5/mailreaver.crm-20070329201046-91b62hipeixaywh9-4.knit
repository/lock/
repository/revision-store/
sent 2335 bytes received 368 bytes 5406.00 bytes/sec
total size is 86288 speedup is 31.92
martin@shambala:~/.crm114>
---------------------------------------------------------------------
Those files are differing - one example:
---------------------------------------------------------------------
martin@shambala:~/.crm114> LANG=C
cmp .bzr/repository/knits/76/maillib.crm-20070329201046-91b62hipeixaywh9-3.knit .bzr-broken/repository/knits/76/maillib.crm-20070329201046-91b62hipeixaywh9-3.knit
.bzr/repository/knits/76/maillib.crm-20070329201046-91b62hipeixaywh9-3.knit .bzr-broken/repository/knits/76/maillib.crm-20070329201046-91b62hipeixaywh9-3.knit
differ: char 8193, line 22
---------------------------------------------------------------------
The good one:
---------------------------------------------------------------------
martin@shambala:~/.crm114>
head -c9183 .bzr/repository/knits/76/maillib.crm-20070329201046-91b62hipeixaywh9-3.knit |
tail -c1000 | od -h
0000000 d46d 2b8b 9b50 5ce8 e3f7 964c 6b3f 728b
0000020 8541 e12e f328 92a5 7061 2981 6232 7463
0000040 6583 0421 b568 481f f5db ce4c 6924 8b55
[... lots of data that looks like the first three lines ...]
0001700 a28d 2b64 d792 6dcc 96cb 15c7 f960 7e26
0001720 dc22 1d2d 6a0e 6ce0 3648 a050 c7ec 371c
0001740 743c 2502 e5e4 e1d1
0001750
---------------------------------------------------------------------
The broken one - this really looks like the null files problem!:
---------------------------------------------------------------------
martin@shambala:~/.crm114>
head -c9183 .bzr-broken/repository/knits/76/maillib.crm-20070329201046-91b62hipeixaywh9-3.knit |
tail -c1000 | od -h
0000000 d46d 2b8b 9b50 5ce8 00f7 0000 0000 0000
0000020 0000 0000 0000 0000 0000 0000 0000 0000
*
0001740 0000 0000 0000 0000
0001750
---------------------------------------------------------------------
I checked at least one of those files with xfs_bmap and it was no sparse
file, these are real zeros.
I am quite a bit confused by that since I cannot remember that I had any
abnormal write interruption while the xfs_fsr command was running or
after it has finished. As I found out about corrupted files I just
rebooted cleanly to my SUSE installation and ran xfs_check on the
partitions I xfs_fsr'ed which turned out to be okay. Also certainly not
all of these broken repositories had been in use while any abnormal write
termination happened.
I am doing checks for other broken files now:
---------------------------------------------------------------------
find -type f -and -iname "*" -print0 | xargs -0 -I{} cmp
{} /mnt/backup/home/martin/{}
---------------------------------------------------------------------
Better with that one:
---------------------------------------------------------------------
rsync -acnv /mnt/backup/home/martin/ /home/martin/
---------------------------------------------------------------------
Oh well, there are other files differing which should certainly be the
same - since I didn't touch them in the meanwhile. And they contain zeros
where there should be data as well as far as I looked. So whatever I
faced here its a real corruption issue! And I know I will continue to use
rsync without -c in my backup script, cause that prevented the corrupted
files from being copied to my backup!
Well thus maybe is a bad interaction between xfs_fsr and the null files
problem or a problem in xfs_fsr...
I for my part will be very cautious with xfs_fsr in the future. If I
manage to take the time I likely will create a XFS partition in that one
GB that is left on my harddisk copy certain directories in it
simultaneously so that it hopefully creates some fragmentation and then
run xfs_fsr... I hope I will be able to track this down, cause its really
bad behavior.
Is there a xfs QA test which tests xfs_fsr? I think there should be one,
especially as this tool does not seem to be used that often out in the
wild. Preferably one that copies lots of data to a partition in a way that
it gets fragmented, md5sums it, runs xfs_fsr and then compares the
md5sums of the files.
If anyone else has time I suggest tests on that topic. It seems that
something is going really wrong here even when I cannot prove it right
now. But maybe it was just a rare race condition. Well see. I hope I will
be able to reproduce it.
I ran xfs_fsr
---------------------------------------------------------------------
shambala:~> LANG=C apt-cache policy xfsdump
xfsdump:
Installed: 2.2.38-1
Candidate: 2.2.38-1
Version table:
2.2.45-1 0
500 http://debian.n-ix.net lenny/main Packages
500 http://debian.n-ix.net sid/main Packages
*** 2.2.38-1 0
990 http://debian.n-ix.net etch/main Packages
100 /var/lib/dpkg/status
---------------------------------------------------------------------
on Linux Kernel 2.6.21.3 on my XFS /home partition back then.
Since images, music, mail files, movies and stuff are not automatically
checked for corruption and since not many files seem to be affected I
think I didn't notice other files to be affected. Will likely take quite
some time as I changed to many files since my last backup...
OK, now on to recovery:
- xfs_check xfs_fsr involved partitions - they are fine
- xfs_check the backup partitions - they are fine tough
The most automatic way to get things recovered as good as possible is to
rsync my current data to the backup partition *without* determining
changed files by its checksum. Then rsync the updated backup back to the
work place *with* determining changes files by its checksum:
---------------------------------------------------------------------
shambala:~ #
rsync -axv --acls --sparse --delete /mnt/debian/ /mnt/backup/debian/
shambala:~ #
rsync -axv --acls --sparse --delete /mnt/debian/ /mnt/backup/debian/
building file list ... done
sent 10306228 bytes received 20 bytes 38818.26 bytes/sec
total size is 8376658632 speedup is 812.77
shambala:~ #
rsync -acxv --acls --sparse --del /mnt/backup/debian/ /mnt/debian/ |
tee /home/martin/XFS-Probleme/2007-07-23/rsync-acxv-debian.txt
---------------------------------------------------------------------
Now look what is in "rsync-acxv-debian.txt":
---------------------------------------------------------------------
building file list ... done
.kde/share/apps/kconf_update/log/update.log
[ ... above one shouldn't be here at / at all of course, however I managed
it to be there ;-) ... ]
boot/grub/default
lost+found/8937921
lost+found/8937923
usr/bin/nslookup
usr/games/gnubg
usr/lib/libGLU.a
usr/lib/libclucene.so.0.0.0
usr/lib/libdns.so.22.1.0
usr/lib/libkeximain.so.2.0.0
usr/lib/libkoproperty.so.2.0.0
usr/lib/libkwordprivate.so.4.0.0
usr/lib/libpoppler.so.0.0.0
usr/lib/flashplugin-nonfree/libflashplayer.so
usr/lib/gcc/i486-linux-gnu/4.1.2/cc1
usr/lib/jvm/java-1.5.0-sun-1.5.0.10/jre/lib/charsets.jar
usr/lib/jvm/java-1.5.0-sun-1.5.0.10/jre/lib/i386/client/libjvm.so
usr/lib/jvm/java-1.5.0-sun-1.5.0.10/jre/lib/i386/motif21/libmawt.so
usr/lib/mono/gac/System.Data/1.0.5000.0__b77a5c561934e089/System.Data.dll.mdb
usr/lib/python2.3/site-packages/PyQt4/QtGui.so
usr/lib/python2.4/site-packages/PyQt4/QtGui.so
usr/share/X11/doc/hardcopy/XKB/XKBlib.ps.gz
usr/share/doc/aircrack-ng/injection-patches/linux-wlan-0.2.3.packet.injection.patch.gz
usr/share/doc/dahb-html/html/bilder/allgemein/debian-devel-earth.png
usr/share/doc/dahb-html/html/bilder/allgemein/openlogo-nd.png
usr/share/doc/dahb-html/html/bilder/sarge/sarge1.png
usr/share/doc/dahb-html/html/bilder/sarge/sargebasec17.png
usr/share/doc/dahb-html/html/bilder/sarge/sargebasesecurity.png
usr/share/doc/dahb-html/html/bilder/sarge/sargedomainname.png
usr/share/doc/dahb-html/html/bilder/sarge/sargehostname.png
usr/share/doc/dahb-html/html/bilder/sarge/sargepart1.png
[... it goes on like this ...]
---------------------------------------------------------------------
This is nasty. Some of those would have been very difficult to find. Heck
that flash player even worked OK yesterday on YouTube.
Luckily not that many files seem to have been affected (aside from those
broken repositories and KMail index files I alread fixed):
martin@shambala:Shambala/XFS-Probleme/2007-07-23> wc -l
rsync-acxv-debian.txt
117 rsync-acxv-debian.txt
martin@shambala:Shambala/XFS-Probleme/2007-07-23> wc -l
rsync-acxv-debian-home.txt
232 rsync-acxv-debian-home.txt
Well at least for me T42 I recovered almost all important data. One Bazaar
repository got lost, but I can get the version history from someone else.
And there is a slight risk that some of the files that I already changed
in the meanwhile have zero byte areas as well that didn't come to my
attention yet.
Well that shows me that a checksumming filesystem would really be good! It
could have told me that file contents have been corrupted during check.
And I learn quite a lot:
- if there is any slight chance of corruption (that corrupted KMail index
files) check it out before doing anything else. Just do not assume that
its only those files that have been affected, but check with rsync -c
- if there are broken files, restore the backup before doing anything
else. Except that I have spare storage where I could store the
filesystems with the broken files first for further analysis.
That would have helped me to have next to zero data loss and it would have
been way quicker to restore everything to normal.
And another one:
- do not try an online defragmentation on valuable data before testing it
first on test data
I still can't prove it, but I am quite sure that it has been xfs_fsr maybe
in combination with some XFS problem. It was the only tool that could
have accessed all those files. I know for sure I didn't do write accesses
on many of them.
Some more information about my T42... Kernel version and Debian
installation on my T23 are pretty similar only drivers differ a bit:
martin@shambala:~> cat /proc/version
Linux version 2.6.21.3-tp42-cfs-v15-sws2-2.2.10 (martin@shambala) (gcc
version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #1 PREEMPT Sun Jun
3 21:43:50 CEST 2007
shambala:~> xfs_info /
meta-data=/dev/root isize=256 agcount=16, agsize=228894
blks
= sectsz=512 attr=1
data = bsize=4096 blocks=3662304, imaxpct=25
= sunit=0 swidth=0 blks, unwritten=1
naming =version 2 bsize=4096
log =internal bsize=4096 blocks=2560, version=1
= sectsz=512 sunit=0 blks, lazy-count=0
realtime =none extsz=65536 blocks=0, rtextents=0
shambala:~> xfs_info /home
meta-data=/dev/sda2 isize=256 agcount=19, agsize=686660
blks
= sectsz=512 attr=0
data = bsize=4096 blocks=12695508, imaxpct=25
= sunit=0 swidth=0 blks, unwritten=1
naming =version 2 bsize=4096
log =internal bsize=4096 blocks=5364, version=1
= sectsz=512 sunit=0 blks, lazy-count=0
realtime =none extsz=65536 blocks=0, rtextents=0
Ok, I am quite exhausted now. And there is still the T23 to do... (in
progress at the moment).
Regards,
--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2007-06-25 7:53 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-06-15 16:04 xfs_fsr - problem with open files possible? Martin Steigerwald
2007-06-15 16:58 ` Eric Sandeen
2007-06-15 18:13 ` Martin Steigerwald
2007-06-15 20:04 ` Martin Steigerwald
2007-06-23 15:23 ` xfs_fsr and null byte areas in files (was: Re: xfs_fsr - problem with open files possible?) Martin Steigerwald
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox