From: Edward Shishkin <edward.shishkin@gmail.com>
To: Marti Raudsepp <marti@juffo.org>
Cc: reiserfs-devel <reiserfs-devel@vger.kernel.org>
Subject: Re: reiser4 on 2.6.24: corruption, hang on read()
Date: Thu, 10 Apr 2008 02:04:36 +0400
Message-ID: <47FD3D74.8080408@gmail.com>
In-Reply-To: <54b33ccd0804091310j25978ab3p6621e8896ae2b493@mail.gmail.com>

Hello.

There are two pending patches against reiser4-for-2.6.24.
They fix some bugs that may be related to your corruption:

http://marc.info/?l=reiserfs-devel&m=120498592129461&q=p3
http://marc.info/?l=reiserfs-devel&m=120527307032124&q=p3

Please apply them and report any problems.

Thanks,
Edward.

Marti Raudsepp wrote:

>Hello,
>
>I recently found my computer consuming 100% of CPU in system time; some
>investigation revealed that this was caused by some zombie processes
>attempting to read a particular file, not returning from the syscall.
>Kill had no effect on the processes. Metadata operations (stat,
>rename, etc) still succeeded, but according to strace, processes
>reading the file froze after the second read() to the given file.
>There were no relevant messages in dmesg.
>
>Apparently the problematic file has been truncated; I am not sure if
>that happened during normal operation or was part of the malfunction.
>
>When the problem re-appeared after a reboot, I decided to run fsck on
>the file system, which found several problems, including 1 fatal
>corruption. I made a backup copy of the entire partition (in case more
>analysis is necessary) and ran fsck --build-fs on it. After the
>rebuild, the file system appears to be performing normally.
>
>This file system had been subject to moderate, but constant
>multithreaded load for over a week now. As far as I know, this file
>system has not had to tolerate unexpected resets or power loss. The
>file system is located on a LVM volume, which sits on top of software
>RAID0, on two identical SATA disks.
>
>uname -a: Linux hez 2.6.24-gentoo-r4 #1 SMP Wed Apr 9 18:47:14 UTC
>2008 x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 5600+
>AuthenticAMD GNU/Linux
>(this kernel was built after the problem occurred; the corruption
>happened with the initial vanilla 2.6.24 release)
>
>Here's the fsck output:
>--------------------------------------------------------------------------
>***** fsck.reiser4 started at Wed Apr  9 19:07:37 2008
>Reiser4 fs was detected on /dev/mapper/plain-freenet.
>Master super block (16):
>magic:          ReIsEr4
>blksize:        4096
>format:         0x0 (format40)
>uuid:           e70223e5-e538-4491-ab8a-98509c426814
>label:          <none>
>
>Format super block (17):
>plugin:         format40
>description:    Disk-format plugin.
>version:        0
>magic:          ReIsEr40FoRmAt
>mkfs id:        0x2ea688e7
>flushes:        0
>blocks:         2096640
>free blocks:    284099
>root block:     87
>tail policy:    0x2 (smart)
>next oid:       0x80db5
>file count:     14470
>tree height:    5
>key policy:     LARGE
>
>
>CHECKING THE STORAGE TREE
>        Read nodes 5588
>        Nodes left in the tree 5588
>                Leaves of them 2328, Twigs of them 3153
>        Time interval: Wed Apr  9 19:07:38 2008 - Wed Apr  9 19:08:32 2008
>CHECKING EXTENT REGIONS.
>FSCK: extent40_repair.c: 96: extent40_check_layout: Node (1395911),
>item (5), unit (9),
>[11d61:4(FB):174656d702d3761:77f11:0]: points out of the fs, region
>[2096637..2096639].
>        Read twigs 3153
>        Invaid extent pointers 1
>        Time interval: Wed Apr  9 19:08:32 2008 - Wed Apr  9 19:08:32 2008
>CHECKING THE SEMANTIC TREE
>FSCK: obj40_repair.c: 350: obj40_stat_lw_check: Node (611499), item
>(24), [10004:727470726f7073:80b2d]
>(stat40): wrong size (15697), Should be (12288).
>        Found 14470 objects (some could be encountered more then
>once).
>        Time interval: Wed Apr  9 19:08:32 2008 - Wed Apr  9 19:08:33 2008
>FSCK: repair.c: 550: repair_sem_fini: On-disk used block bitmap and
>really used block bitmap differ.
>***** fsck.reiser4 finished at Wed Apr  9 19:08:33 2008
>Closing fs...done
>
>1 fatal corruptions were detected in FileSystem. Run with --build-fs
>option to fix them.
>--------------------------------------------------------------------------
>
>Output of fsck.reiser4 --build-fs:
>--------------------------------------------------------------------------
>CHECKING THE STORAGE TREE
>        Read nodes 5588
>        Nodes left in the tree 5588
>                Leaves of them 2328, Twigs of them 3153
>        Time interval: Wed Apr  9 19:40:52 2008 - Wed Apr  9 19:41:55
>2008
>CHECKING EXTENT REGIONS.
>FSCK: extent40_repair.c: 96: extent40_check_layout: Node (1395911),
>item (5), unit (9),
>[11d61:4(FB):174656d702d3761:77f11:0]: points out of the fs, region
>[2096637..2096639]. Zeroed.
>        Read twigs 3153
>        Corrected nodes 1
>        Fixed invalid extent pointers 1
>        Time interval: Wed Apr  9 19:41:55 2008 - Wed Apr  9 19:41:55
>2008
>LOOKING FOR UNCONNECTED NODES
>        Read nodes 3
>        Good nodes 0
>                Leaves of them 0, Twigs of them 0
>        Time interval: Wed Apr  9 19:41:55 2008 - Wed Apr  9 19:41:55
>2008
>CHECKING EXTENT REGIONS.
>        Read twigs 0
>        Time interval: Wed Apr  9 19:41:55 2008 - Wed Apr  9 19:41:55
>2008
>INSERTING UNCONNECTED NODES
>1. Twigs: done
>2. Twigs by item: done
>3. Leaves: done
>4. Leaves by item: done
>        Twigs: read 0, inserted 0, by item 0, empty 0
>        Leaves: read 0, inserted 0, by item 0
>        Time interval: Wed Apr  9 19:41:55 2008 - Wed Apr  9 19:41:55
>2008
>CHECKING THE SEMANTIC TREE
>FSCK: semantic.c: 705: repair_semantic_lost_prepare: No 'lost+found'
>entry found. Building a new object with the key 2a:0:ffff.
>FSCK: semantic.c: 573: repair_semantic_dir_open: Failed to recognize
>the plugin for the directory [2a:0:ffff].
>FSCK: semantic.c: 581: repair_semantic_dir_open: Trying to recover the
>directory [2a:0:ffff] with the default  plugin--dir40.
>FSCK: obj40_repair.c: 576: obj40_prepare_stat: The file [2a:0:ffff]
>does not have a StatData item. Creating a new one. Plugin dir40.
>FSCK: dir40_repair.c: 40: dir40_dot: Directory [2a:0:ffff]: The entry
>"." is not found. Insert a new one. Plugin (dir40).
>FSCK: obj40_repair.c: 223: obj40_stat_unix_check: Node (7634), item
>(2), [2a:0:ffff] (stat40): wrong bytes (0), Fixed to (50).
>FSCK: obj40_repair.c: 350: obj40_stat_lw_check: Node (7634), item (2),
>[2a:0:ffff] (stat40): wrong size (0), Fixed to (1).
>FSCK: obj40_repair.c: 350: obj40_stat_lw_check: Node (611500), item
>(23), [10004:727470726f7073:80b2d]
>(stat40): wrong size (15697), Fixed to (12288).
>FSCK: obj40_repair.c: 223: obj40_stat_unix_check: Node (1260934), item
>(37), [11d61:174656d702d3761:77f11]
>(stat40): wrong bytes (528384), Fixed to (516096).
>        Found 14471 objects.
>        Time interval: Wed Apr  9 19:41:55 2008 - Wed Apr  9 19:41:56 2008
>CLEANING UP THE STORAGE TREE
>        Removed items 57
>        Time interval: Wed Apr  9 19:41:56 2008 - Wed Apr  9 19:41:56 2008
>FSCK: repair.c: 677: repair_update: File count 14470 is wrong. Fixed to 14471.
>***** fsck.reiser4 finished at Wed Apr  9 19:41:56 2008
>--------------------------------------------------------------------------
>
>
>Regards,
>Marti Raudsepp
>
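The numbers in the quoted fsck output can be cross-checked with a little
arithmetic. The sketch below is only an illustration, not something
fsck.reiser4 does: it assumes the reported region [2096637..2096639] is an
inclusive block range, and that the "bytes" fix on object
[11d61:174656d702d3761:77f11] is a consequence of that extent being zeroed
(the keys suggest the same object, but the log does not say so explicitly).
All variable names are made up for the example.

```python
# Values copied from the quoted fsck.reiser4 output.
BLOCK_SIZE = 4096  # "blksize" in the master super block

# Extent region reported as pointing out of the fs and then zeroed.
# Assumption: both ends of the range are inclusive.
region_first, region_last = 2096637, 2096639
region_blocks = region_last - region_first + 1
region_bytes = region_blocks * BLOCK_SIZE

# "wrong bytes (528384), Fixed to (516096)" on the stat data of the
# object whose key matches the zeroed extent.
bytes_before, bytes_after = 528384, 516096
bytes_dropped = bytes_before - bytes_after

print("blocks in zeroed region:", region_blocks)
print("bytes in zeroed region: ", region_bytes)
print("bytes dropped from stat:", bytes_dropped)
print("consistent:", bytes_dropped == region_bytes)
```

Under those assumptions the zeroed region and the "bytes" correction both
come to 12288 bytes (3 blocks of 4096), so the file most likely lost
exactly the data covered by the bad extent.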

