From: Anand Jain <Anand.Jain@oracle.com>
To: Christian Volkmann <haveaniceday@cv-sv.de>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: btrfsck crashes
Date: Tue, 10 Jul 2012 14:30:50 +0800
Message-ID: <4FFBCC1A.8020800@oracle.com>
In-Reply-To: <4FFB4BB4.4080408@cv-sv.de>


Christian,

  The line numbers are still confusing to me as well. The patch was
  meant to avoid the seg fault when the csum_root node is NULL, which
  may not be the case here after all.

  (That assumes the original stack trace, quoted below, is still
  the same.)
---------
>>> (gdb) bt
>>> #0 0x0000000000402379 in btrfs_header_nritems (eb=0x0) at ctree.h:1426
>>> #1 0x0000000000408c14 in run_next_block (root=0x73fb40, bits=0x740d50, bits_nr=1024, last=0x7fffffffd948, pending=0x7fffffffda40,
>>> seen=0x7fffffffda50, reada=0x7fffffffda30, nodes=0x7fffffffda20, extent_cache=0x7fffffffda60) at btrfsck.c:2512
>>> #2 0x00000000004099e2 in check_extents (root=0x73fb40) at btrfsck.c:2792
>>> #3 0x0000000000409bec in main (ac=1, av=0x7fffffffdbe8) at btrfsck.c:2853
----------
>>> What I have seen: buf is "0", after read_tree_block.
>>>
>>> btrfsck.c:2511 buf = read_tree_block(root, bytenr, size, 0);
>>> 2512 nritems = btrfs_header_nritems(buf);
----------
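
   To illustrate why frame #0 faults with eb=0x0: the generated accessor
   reads through eb without testing it first. Below is a minimal
   stand-alone sketch of that pattern next to a hardened variant. All
   demo_* names and the simplified structs are hypothetical stand-ins,
   not the btrfs-progs definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins (assumptions); the real types live in btrfs-progs. */
struct demo_header {
	uint32_t nritems;	/* stored little-endian in the block */
	uint8_t level;
};

struct demo_extent_buffer {
	uint64_t start;
	uint32_t len;
	unsigned char data[64];	/* raw block contents, header first */
};

/* Decode a little-endian 32-bit value independent of host byte order. */
static uint32_t demo_le32_to_cpu(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Unchecked accessor, same shape as the macro-generated one: eb is used
 * without a NULL test, so calling it with eb == NULL is undefined
 * behaviour and in practice ends in a segmentation fault.
 */
static uint32_t demo_header_nritems(const struct demo_extent_buffer *eb)
{
	return demo_le32_to_cpu(eb->data +
				offsetof(struct demo_header, nritems));
}

/*
 * Hypothetical hardened variant: report failure to the caller instead
 * of reading through a NULL buffer.
 */
static int demo_header_nritems_checked(const struct demo_extent_buffer *eb,
				       uint32_t *out)
{
	if (!eb)
		return -1;
	*out = demo_header_nritems(eb);
	return 0;
}
```

   Whether the check belongs in the accessor or in each caller is a
   design choice; the sketch only shows that one of them has to do it.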


   Looking again (ignore the line numbers), the latest code already
   checks buf with extent_buffer_uptodate(), so buf can't be NULL when
   btrfs_header_nritems() is called. That contradicts the stack trace
   above if you are using the latest code, which is shown below.
  
http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-progs.git;a=blob;f=btrfsck.c;h=088b9f427339cde70dd6b1a457aeba5cf190ce34;hb=HEAD

-------
2526 static int run_next_block(struct btrfs_root *root,
::
2585         buf = read_tree_block(root, bytenr, size, 0);
2586         if (!extent_buffer_uptodate(buf)) {
2587                 record_bad_block_io(root->fs_info,
2588                                     extent_cache, bytenr, size);
2589                 free_extent_buffer(buf);
2590                 goto out;
2591         }
2592
2593         nritems = btrfs_header_nritems(buf);  <-- Seg fault ??
-------
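
   The control flow above can be sketched in isolation as follows. This
   is only an illustration, under the assumption that the up-to-date
   check treats a NULL buffer as not up to date. The demo_* names are
   hypothetical, and the counter merely stands in for
   record_bad_block_io().

```c
#include <stddef.h>

#define DEMO_UPTODATE 1u

/* Simplified stand-in for an extent buffer (assumption, not the real type). */
struct demo_buf {
	unsigned int flags;
	unsigned int nritems;
};

/* Assumption: a NULL buffer counts as "not up to date". */
static int demo_buffer_uptodate(const struct demo_buf *eb)
{
	return eb != NULL && (eb->flags & DEMO_UPTODATE);
}

/*
 * Hypothetical stand-in for the block-processing step.
 * Returns the item count, or -1 when the block is recorded as bad.
 */
static int demo_process_block(const struct demo_buf *buf, int *bad_blocks)
{
	if (!demo_buffer_uptodate(buf)) {
		(*bad_blocks)++;	/* stands in for record_bad_block_io() */
		return -1;		/* stands in for "goto out" */
	}
	/* Only reached with a non-NULL, up-to-date buffer. */
	return (int)buf->nritems;
}
```

   With a guard of this shape, a failed read ends up in the bad-block
   path and never reaches the header accessor.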

   

Thanks,
-Anand




On 10/07/12 05:23, Christian Volkmann wrote:
> Anand Jain wrote:
>  >
>  >> What I have seen: buf is "0", after read_tree_block.
>  >
>  > Yes, since we are not checking extent_buffer_uptodate for the
>  > csum_root_tree, the NULL buf gets passed along. The following patch
>  > avoids passing the NULL buffer:
>  > https://patchwork.kernel.org/patch/1148831/
>  >
>  > However, whether --init-csum-tree will build a good csum tree will,
>  > I think, still depend on the corruption.
>  >
>  > -Anand
>  >
>
> .)
> The patch does not help.
> This is false: !extent_buffer_uptodate(info->csum_root->node)
>
> .)
> Output of btrfsck built from git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git ,
> patched at line 3552.
>
> speedy:/tmp/btrfs/btrfs-progs # gdb ./btrfsck
> GNU gdb (GDB) SUSE (7.3-41.1.2)
> Copyright (C) 2011 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-suse-linux".
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>...
> Reading symbols from /tmp/btrfs/btrfs-progs/btrfsck...done.
> (gdb) r /dev/md3
> Starting program: /tmp/btrfs/btrfs-progs/btrfsck /dev/md3
> Missing separate debuginfo for /lib64/ld-linux-x86-64.so.2
> Try: zypper install -C "debuginfo(build-id)=f20c99249f5a5776e1377d3bd728502e3f455a3f"
> Missing separate debuginfo for /lib64/libuuid.so.1
> Try: zypper install -C "debuginfo(build-id)=24ae727f9cd5fb29f81b0f965859d3cf4668bf17"
> Missing separate debuginfo for /lib64/libc.so.6
> Try: zypper install -C "debuginfo(build-id)=7b169b1db50384b70e3e4b4884cd56432d5de796"
> checking extents
> checksum verify failed on 2327654400 wanted 89AAEA38 found 72
> checksum verify failed on 2327654400 wanted 89AAEA38 found 72
> checksum verify failed on 2327654400 wanted 73CDE79C found 72
> checksum verify failed on 2327654400 wanted 89AAEA38 found 72
> Csum didn't match
> owner ref check failed [2327654400 4096]
> ref mismatch on [101138354176 98304] extent item 1, found 0
> Incorrect local backref count on 101138354176 root 5 owner 1867898 offset 0 found 0 wanted 1 back 0x182ebd20
> backpointer mismatch on [101138354176 98304]
> owner ref check failed [101138354176 98304]
> ref mismatch on [101138452480 106496] extent item 1, found 0
> Incorrect local backref count on 101138452480 root 5 owner 1867899 offset 0 found 0 wanted 1 back 0xefb8d0
> backpointer mismatch on [101138452480 106496]
> owner ref check failed [101138452480 106496]
> ref mismatch on [101138558976 8192] extent item 1, found 0
> Incorrect local backref count on 101138558976 root 5 owner 1867901 offset 0 found 0 wanted 1 back 0x5a22350
> backpointer mismatch on [101138558976 8192]
> owner ref check failed [101138558976 8192]
> ref mismatch on [101138567168 16384] extent item 1, found 0
> Incorrect local backref count on 101138567168 root 5 owner 1867902 offset 0 found 0 wanted 1 back 0x5a22390
> backpointer mismatch on [101138567168 16384]
> owner ref check failed [101138567168 16384]
> ref mismatch on [101138583552 16384] extent item 1, found 0
> Incorrect local backref count on 101138583552 root 5 owner 1867903 offset 0 found 0 wanted 1 back 0x19dfaae0
> backpointer mismatch on [101138583552 16384]
> owner ref check failed [101138583552 16384]
> Errors found in extent allocation tree
> checking fs roots
> checksum verify failed on 2327654400 wanted 89AAEA38 found 72
> checksum verify failed on 2327654400 wanted 89AAEA38 found 72
> checksum verify failed on 2327654400 wanted 73CDE79C found 72
> checksum verify failed on 2327654400 wanted 89AAEA38 found 72
> Csum didn't match
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x0000000000402264 in btrfs_header_level (eb=0x0) at ctree.h:1540
> 1540 BTRFS_SETGET_HEADER_FUNCS(header_level, struct btrfs_header, level, 8);
> (gdb)
>
>
> .)
> Which git repository should I normally patch against?
> The one linked from the wiki does not seem to be up to date:
> http://git.darksatanic.net/repo/btrfs-progs-unstable.git
>
> The line numbers in this repository do not match:
> git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git
>
> .)
> Strange to me: why does the same "number" 2327654400 seem to
> want a different checksum?
>
> checksum verify failed on 2327654400 wanted 89AAEA38 found 72
> checksum verify failed on 2327654400 wanted 73CDE79C found 72
>
>
> Thanks & regards,
> Christian
>
>
>>
>> On 09/07/12 00:08, Christian Volkmann wrote:
>>> Hi there,
>>>
>>> I have a corrupted filesystem. This filesystem crashes btrfsck.
>>>
>>> A gdb analysis showed me:
>>> (gdb) bt
>>> #0 0x0000000000402379 in btrfs_header_nritems (eb=0x0) at ctree.h:1426
>>> #1 0x0000000000408c14 in run_next_block (root=0x73fb40, bits=0x740d50, bits_nr=1024, last=0x7fffffffd948, pending=0x7fffffffda40,
>>> seen=0x7fffffffda50, reada=0x7fffffffda30, nodes=0x7fffffffda20, extent_cache=0x7fffffffda60) at btrfsck.c:2512
>>> #2 0x00000000004099e2 in check_extents (root=0x73fb40) at btrfsck.c:2792
>>> #3 0x0000000000409bec in main (ac=1, av=0x7fffffffdbe8) at btrfsck.c:2853
>>>
>>> What I have seen: buf is "0", after read_tree_block.
>>>
>>> btrfsck.c:2511 buf = read_tree_block(root, bytenr, size, 0);
>>> 2512 nritems = btrfs_header_nritems(buf);
>>>
>>> So ctree.h crashes here with btrfs_header_nritems(buf)
>>> ...
>>> static inline u##bits btrfs_##name(struct extent_buffer *eb) \
>>> { \
>>> struct btrfs_header *h = (struct btrfs_header *)eb->data; \
>>> return le##bits##_to_cpu(h->member); \
>>> } \
>>> ...
>>>
>>> I expect the case "eb == 0" is not handled by ctree.h.
>>> Maybe another fix is required, e.g. hardening btrfsck against "0".
>>>
>>> The file system crashes the kernel on some accesses. I did not follow this up,
>>> because the file system is corrupt. (Using openSUSE Tumbleweed 3.4.4-31-desktop.)
>>> Maybe the kernel code also needs checks for this?
>>>
>>> Please contact me if I should run further tests on this file system
>>> or try some tools to test a fix. (I have developer knowledge.)
>>>
>>> Another minor issue: btrfsck uses a lot of memory (> 800MB).
>>> But this might be normal.
>>>
>>> Best regards,
>>> Christian
>>>
>>>
>>>
>>> PS: In case anyone is interested:
>>> - History and what I tried: the openSUSE btrfsck showed the messages below on the first run.
>>> - /sbin/btrfsck /dev/md3 --repair removed some of the messages, but not the checksum ones.
>>> - The file system is mounted with:
>>> /backup btrfs defaults,compress=zlib,noatime 1 2
>>> - The file system is used to back up some Unix systems, with heavy use of:
>>> rsync -aH .... --link-dest=...
>>> So each file should normally have multiple hard links.
>>>
>>> ===
>>> Is there anybody interested in fixing this file system with me,
>>> to check btrfsck?
>>>
>>> speedy:/home/cv # /sbin/btrfsck /dev/md3
>>> checking extents
>>> checksum verify failed on 2327654400 wanted 73CDE79C found 72
>>> checksum verify failed on 2327654400 wanted 73CDE79C found 72
>>> checksum verify failed on 2327654400 wanted 73CDE79C found 72
>>> checksum verify failed on 2327654400 wanted 73CDE79C found 72
>>> Csum didn't match
>>> owner ref check failed [2327654400 4096]
>>> ref mismatch on [101138354176 98304] extent item 1, found 0
>>> Incorrect local backref count on 101138354176 root 5 owner 1867898 offset 0 found 0 wanted 1 back 0x1f076d0
>>> backpointer mismatch on [101138354176 98304]
>>> owner ref check failed [101138354176 98304]
>>> ref mismatch on [101138452480 106496] extent item 1, found 0
>>> Incorrect local backref count on 101138452480 root 5 owner 1867899 offset 0 found 0 wanted 1 back 0x6aa85d0
>>> backpointer mismatch on [101138452480 106496]
>>> owner ref check failed [101138452480 106496]
>>> ref mismatch on [101138558976 8192] extent item 1, found 0
>>> Incorrect local backref count on 101138558976 root 5 owner 1867901 offset 0 found 0 wanted 1 back 0x6aa8610
>>> backpointer mismatch on [101138558976 8192]
>>> owner ref check failed [101138558976 8192]
>>> ref mismatch on [101138567168 16384] extent item 1, found 0
>>> Incorrect local backref count on 101138567168 root 5 owner 1867902 offset 0 found 0 wanted 1 back 0x1f8fa80
>>> backpointer mismatch on [101138567168 16384]
>>> owner ref check failed [101138567168 16384]
>>> ref mismatch on [101138583552 16384] extent item 1, found 0
>>> Incorrect local backref count on 101138583552 root 5 owner 1867903 offset 0 found 0 wanted 1 back 0x1f8fac0
>>> backpointer mismatch on [101138583552 16384]
>>> owner ref check failed [101138583552 16384]
>>> Errors found in extent allocation tree
>>> checking fs roots
>>> checksum verify failed on 2327654400 wanted 73CDE79C found 72
>>> checksum verify failed on 2327654400 wanted 73CDE79C found 72
>>> checksum verify failed on 2327654400 wanted 73CDE79C found 72
>>> checksum verify failed on 2327654400 wanted 73CDE79C found 72
>>> Csum didn't match
>>> Segmentation fault
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>
