* cleanup after a small data loss on incorrect shutdown.
@ 2009-06-11 12:44 Michael Raskin
2009-06-12 10:53 ` Chris Mason
0 siblings, 1 reply; 7+ messages in thread
From: Michael Raskin @ 2009-06-11 12:44 UTC (permalink / raw)
To: linux-btrfs
Hello.
I am continuing my tests of BtrFS under a practical workload. Recently
an incorrect poweroff (or maybe a small bug in BtrFS) caused a small
data loss. The actual damage was non-existent.
I used an old branch, so maybe the relevant code has already been improved.
1. Why does btrfsck say "bad block" on that partition? What does it mean?
My first reaction was to use badblocks. It found no bad blocks in its own
sense, so I assume btrfsck means something else. It would be nice to
explain that to the user. Maybe "damaged FS data block"?
2. I found a file which is listed in the directory, but stat on it
returns "No such file or directory". Certainly, rm and unlink cannot
remove it. The partition has 14G in use. What can I do to provide a
useful piece of FS structure information? How can I remove the file
afterwards.
3. On a 30G partition with 14G used btrfsck was left overnight. It has
neither finished nor printed any meaningful request for interaction. Is
it normal?
Michael Raskin
* Re: cleanup after a small data loss on incorrect shutdown.
From: Chris Mason @ 2009-06-12 10:53 UTC (permalink / raw)
To: Michael Raskin; +Cc: linux-btrfs
On Thu, Jun 11, 2009 at 04:44:30PM +0400, Michael Raskin wrote:
> Hello.
>
> I am continuing my tests of BtrFS under a practical workload. Recently
> an incorrect poweroff (or maybe a small bug in BtrFS) caused a small
> data loss. The actual damage was non-existent.
> I used an old branch, so maybe the relevant code has already been improved.
>
> 1. Why does btrfsck say "bad block" on that partition? What does it mean?
> My first reaction was to use badblocks. It found no bad blocks in its own
> sense, so I assume btrfsck means something else. It would be nice to
> explain that to the user. Maybe "damaged FS data block"?
Yes, it would make sense to make these more informative.
>
> 2. I found a file which is listed in the directory, but stat on it
> returns "No such file or directory". Certainly, rm and unlink cannot
> remove it. The partition has 14G in use. What can I do to provide a
> useful piece of FS structure information? How can I remove the file
> afterwards.
I'd say to send us the btrfsck output, it will help answer these
questions.
>
> 3. On a 30G partition with 14G used btrfsck was left overnight. It has
> neither finished nor printed any meaningful request for interaction. Is
> it normal?
Definitely not ;) You can check with vmstat to see if btrfsck is
actually doing anything, but it sounds like you hit a bug. Which
version of the kernel and tools are you using?
-chris
* Re: cleanup after a small data loss on incorrect shutdown.
From: Michael Raskin @ 2009-06-12 11:08 UTC (permalink / raw)
To: Chris Mason, Michael Raskin, linux-btrfs
Chris Mason wrote:
>> 2. I found a file which is listed in the directory, but stat on it
>> returns "No such file or directory". Certainly, rm and unlink cannot
>> remove it. The partition has 14G in use. What can I do to provide a
>> useful piece of FS structure information? How can I remove the file
>> afterwards.
>
> I'd say to send us the btrfsck output, it will help answer these
> questions.
Oh, easily. "Bad block <number way beyond partition block count>".
That's all. Reading one of the damaged files actually returned
"Input/output error" - probably it tried to read beyond end-of-device. I
had to kill this file (practical testing means that to continue to use
my notebook normally I had to nuke the damaged file and get intact
copies). The "no such file except in readdir" is still there right now.
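A directory entry that readdir returns but stat rejects, as described above, can be searched for with a short script. This is only a minimal sketch in Python, not a btrfs tool; the function name find_ghost_entries and its lstat parameter are made up for illustration, and the parameter exists only so the check can be exercised without a damaged filesystem.

```python
import errno
import os


def find_ghost_entries(directory, lstat=os.lstat):
    """Return names that the directory listing reports but lstat cannot find."""
    ghosts = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        try:
            lstat(path)
        except OSError as exc:
            # ENOENT for a name we have just listed means the directory
            # entry exists but the lookup behind it fails.
            if exc.errno == errno.ENOENT:
                ghosts.append(name)
            else:
                raise
    return ghosts
```

Calling find_ghost_entries("/path/to/dir") on an otherwise idle directory lists the suspicious names. On a busy directory a file deleted between the listing and the lstat call would show up as a false positive, so the result is a hint rather than proof of damage.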
>> 3. On a 30G partition with 14G used btrfsck was left overnight. It has
>> neither finished nor printed any meaningful request for interaction. Is
>> it normal?
>
> Definitely not ;) You can check with vmstat to see if btrfsck is
> actually doing anything, but it sounds like you hit a bug. Which
> version of the kernel and tools are you using?
v0.18 release tools. 2.6.30-rc8 kernel. (It seems to include everything
before newformat patches). "top" said 99% of single-core Celeron was
used by btrfsck. I didn't run vmstat, but I remember that e2fsck random
reads would make much more noise (and for linear reading it took way too
much time).
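In the spirit of the vmstat suggestion, one way to tell a CPU-bound loop from slow disk access is to sample the I/O counters of the process twice and compare them. The sketch below assumes a Linux kernel that exposes /proc/<pid>/io; the helper names are hypothetical and nothing here is specific to btrfsck.

```python
def parse_proc_io(text):
    """Parse the 'key: value' lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            counters[key.strip()] = int(value)
    return counters


def read_bytes_delta(before, after):
    """Bytes fetched from storage between two samples."""
    return after["read_bytes"] - before["read_bytes"]


def sample(pid):
    """Take one snapshot of the I/O counters of a running process."""
    with open(f"/proc/{pid}/io") as handle:
        return parse_proc_io(handle.read())
```

Taking sample(pid) twice a few seconds apart and passing both results to read_bytes_delta shows whether the process is still reading from disk. A delta of zero together with 99% CPU in top would suggest the checker is spinning in memory rather than waiting on the device.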
* Re: cleanup after a small data loss on incorrect shutdown.
From: Chris Mason @ 2009-06-12 11:42 UTC (permalink / raw)
To: Michael Raskin; +Cc: linux-btrfs
On Fri, Jun 12, 2009 at 03:08:58PM +0400, Michael Raskin wrote:
> Chris Mason wrote:
> >> 2. I found a file which is listed in the directory, but stat on it
> >> returns "No such file or directory". Certainly, rm and unlink cannot
> >> remove it. The partition has 14G in use. What can I do to provide a
> >> useful piece of FS structure information? How can I remove the file
> >> afterwards.
> >
> > I'd say to send us the btrfsck output, it will help answer these
> > questions.
>
> Oh, easily. "Bad block <number way beyond partition block count>".
Btrfs deals in byte numbers not block numbers ;)
> That's all. Reading one of the damaged files actually returned
> "Input/output error" - probably it tried to read beyond end-of-device. I
> had to kill this file (practical testing means that to continue to use
> my notebook normally I had to nuke the damaged file and get intact
> copies). The "no such file except in readdir" is still there right now.
Ok, btrfsck will give us more output when it finishes, but it hasn't
finished. It would help to use btrfs-image to send us a copy of the
metadata so we can fix the btrfsck bug.
-chris
* Re: cleanup after a small data loss on incorrect shutdown.
From: Michael Raskin @ 2009-06-12 11:48 UTC (permalink / raw)
To: Chris Mason, Michael Raskin, linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 613 bytes --]
Chris Mason wrote:
>> 2. I found a file which is listed in the directory, but stat on it
>> returns "No such file or directory". Certainly, rm and unlink cannot
>> remove it. The partition has 14G in use. What can I do to provide a
>> useful piece of FS structure information? How can I remove the file
>> afterwards.
>
> I'd say to send us the btrfsck output, it will help answer these
> questions.
OK, after cleaning the bad files off the FS I managed to run btrfsck. It
reports unresolved references. The files I cleaned were related to
libattr, so I do not believe a kernel header would be a hardlink to them.
[-- Attachment #2: btrfsck.log.cleaned --]
[-- Type: text/plain, Size: 2278 bytes --]
root 5 inode 273 errors 0
unresolved ref dir 270 index 2 namelen 4 name attr filetype 0 error 3
root 5 inode 274 errors 0
unresolved ref dir 271 index 2 namelen 2 name cs filetype 0 error 3
root 5 inode 275 errors 0
unresolved ref dir 271 index 3 namelen 2 name de filetype 0 error 3
root 5 inode 276 errors 0
unresolved ref dir 271 index 4 namelen 2 name es filetype 0 error 3
root 5 inode 277 errors 0
unresolved ref dir 271 index 5 namelen 2 name fr filetype 0 error 3
root 5 inode 278 errors 0
unresolved ref dir 271 index 6 namelen 2 name gl filetype 0 error 3
root 5 inode 279 errors 0
unresolved ref dir 271 index 7 namelen 2 name nl filetype 0 error 3
root 5 inode 280 errors 0
unresolved ref dir 271 index 8 namelen 2 name pl filetype 0 error 3
root 5 inode 281 errors 0
unresolved ref dir 271 index 9 namelen 2 name sv filetype 0 error 3
root 5 inode 290 errors 0
unresolved ref dir 272 index 2 namelen 4 name man1 filetype 0 error 3
root 5 inode 291 errors 0
unresolved ref dir 272 index 3 namelen 4 name man2 filetype 0 error 3
root 5 inode 292 errors 0
unresolved ref dir 272 index 4 namelen 4 name man3 filetype 0 error 3
root 5 inode 293 errors 0
unresolved ref dir 272 index 5 namelen 4 name man5 filetype 0 error 3
root 5 inode 316 errors 0
unresolved ref dir 353285 index 1288 namelen 28 name mwave.h.tmp-31838-1822528541 filetype 1 error 1
root 5 inode 353285 errors 200
found 14247617559 bytes used err is 1
total csum bytes: 13316736
total tree bytes: 627621888
btree space waste bytes: 174539968
file data blocks allocated: 13776285696
referenced 13445578752
Btrfs Btrfs v0.18
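A log like the one above is easier to triage once the "unresolved ref" lines are grouped by directory. The following is a minimal sketch that assumes the line format stays exactly as printed in this attachment; the regular expression and function name are illustrative and not part of btrfs-progs, and no meaning is assigned to the error codes.

```python
import re
from collections import Counter

# Matches lines such as:
# unresolved ref dir 270 index 2 namelen 4 name attr filetype 0 error 3
UNRESOLVED = re.compile(
    r"unresolved ref dir (?P<dir>\d+) index (?P<index>\d+) "
    r"namelen (?P<namelen>\d+) name (?P<name>.+) "
    r"filetype (?P<filetype>\d+) error (?P<error>\d+)"
)


def parse_unresolved_refs(log_text):
    """Return one dict per 'unresolved ref' line; numeric fields become ints."""
    refs = []
    for line in log_text.splitlines():
        match = UNRESOLVED.search(line)
        if match:
            fields = match.groupdict()
            name = fields.pop("name")
            ref = {key: int(value) for key, value in fields.items()}
            ref["name"] = name
            refs.append(ref)
    return refs


# A few lines copied from the attachment above.
SAMPLE = """\
root 5 inode 273 errors 0
unresolved ref dir 270 index 2 namelen 4 name attr filetype 0 error 3
root 5 inode 274 errors 0
unresolved ref dir 271 index 2 namelen 2 name cs filetype 0 error 3
root 5 inode 316 errors 0
unresolved ref dir 353285 index 1288 namelen 28 name mwave.h.tmp-31838-1822528541 filetype 1 error 1
root 5 inode 353285 errors 200
"""

refs = parse_unresolved_refs(SAMPLE)
by_dir = Counter(ref["dir"] for ref in refs)
print(by_dir)
```

Grouping by the dir field separates the entries under directories 270 to 272 from the single temporary file under directory 353285, which is the one reported with a different error value.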
* Re: cleanup after a small data loss on incorrect shutdown.
From: Michael Raskin @ 2009-06-12 11:56 UTC (permalink / raw)
To: Chris Mason, Michael Raskin, linux-btrfs
Chris Mason wrote:
>>> I'd say to send us the btrfsck output, it will help answer these
>>> questions.
>> Oh, easily. "Bad block <number way beyond partition block count>".
>
> Btrfs deals in byte numbers not block numbers ;)
Interesting to know. Maybe just adding "at" in the message would reduce
confusion.
It doesn't look like it is a canonical bad block anyway.
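The confusion between the two units is easy to reproduce with arithmetic. The sketch below assumes a 30 GiB partition and a 4096-byte block size; both figures and the sample number are illustrative, and it does not model how Btrfs maps its byte numbers onto devices.

```python
GIB = 1024 ** 3


def plausible_units(number, device_bytes, block_size=4096):
    """Say whether `number` fits on the device as a block number or byte offset."""
    total_blocks = device_bytes // block_size
    return {
        "as_block_number": number < total_blocks,
        "as_byte_offset": number < device_bytes,
    }


device_bytes = 30 * GIB      # assumed partition size
reported = 14_000_000_000    # hypothetical number from a "bad block" message
print(plausible_units(reported, device_bytes))
```

Read as a block number, a value of this size lies far beyond the end of the partition, while read as a byte count it is well inside it, which matches the misreading discussed here.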
>> That's all. Reading one of the damaged files actually returned
>> "Input/output error" - probably it tried to read beyond end-of-device. I
>> had to kill this file (practical testing means that to continue to use
>> my notebook normally I had to nuke the damaged file and get intact
>> copies). The "no such file except in readdir" is still there right now.
>
> Ok, btrfsck will give us more output when it finishes, but it hasn't
> finished. It would help to use btrfs-image to send us a copy of the
> metadata so we can fix the btrfsck bug.
Well, as the partition fills up at ~25 G of 30 G used, I guess that
average metadata size is >=1G for that partition. And now I destroyed
the evidence to make the notebook boot. The disappearing file, though,
is a minor annoyance so I can keep it and do whatever is needed with
btrfs-image.
* Re: cleanup after a small data loss on incorrect shutdown.
From: Michael Raskin @ 2009-06-12 18:52 UTC (permalink / raw)
To: Chris Mason, Michael Raskin, linux-btrfs
Chris Mason wrote:
>> That's all. Reading one of the damaged files actually returned
>> "Input/output error" - probably it tried to read beyond end-of-device. I
>> had to kill this file (practical testing means that to continue to use
>> my notebook normally I had to nuke the damaged file and get intact
>> copies). The "no such file except in readdir" is still there right now.
>
> Ok, btrfsck will give us more output when it finishes, but it hasn't
> finished. It would help to use btrfs-image to send us a copy of the
> metadata so we can fix the btrfsck bug.
I have a 74M compressed btrfs-image of a partition with a ghost file (I
sent btrfsck logs earlier). Would they be of any use in debugging
handling of such situations? If yes - how should I transmit the image
file? How can I kill the ghost file?