From: Li Zefan <lizefan@huawei.com>
To: Jan Kara <jack@suse.cz>
Cc: qixuan wu <wuqixuan@gmail.com>, Tao Ma <tm@tao.ma>,
"Theodore Ts'o" <tytso@mit.edu>,
Eric Sandeen <sandeen@redhat.com>,
Yafang Shao <laoar.shao@gmail.com>,
<linux-fsdevel@vger.kernel.org>, <linux-ext4@vger.kernel.org>,
<wuqixuan@huawei.com>
Subject: Re: help about ext3 read-only issue on ext3(2.6.16.30)
Date: Fri, 7 Dec 2012 18:03:49 +0800
Message-ID: <50C1BF05.6020605@huawei.com>
In-Reply-To: <20121206170913.GC21029@quack.suse.cz>
On 2012/12/7 1:09, Jan Kara wrote:
> On Fri 07-12-12 00:21:25, qixuan wu wrote:
>> Hi Kara,
>>
>> On Thu, Dec 6, 2012 at 8:37 PM, Jan Kara <jack@suse.cz> wrote:
>>> On Thu 06-12-12 09:13:45, Li Zefan wrote:
>>>>>> I found this in one log:
>>>>>>
>>>>>> Nov 14 05:26:55 kernel: EXT3-fs error (device sda7): ext3_readdir: bad entry in directory #7225391: rec_len is smaller than minimal - offset=3952, inode=0, rec_len=0, name_len=0
>>>>>> Nov 14 13:42:40 kernel: EXT3-fs error (device sda7): ext3_readdir: bad entry in directory #7225391: rec_len is smaller than minimal - offset=4024, inode=0, rec_len=0, name_len=0
>>>>>> Nov 16 17:29:40 kernel: EXT3-fs error (device sda7): ext3_readdir: bad entry in directory #7225391: rec_len is smaller than minimal - offset=4084, inode=0, rec_len=0, name_len=0
>>>>>> Nov 23 19:42:44 kernel: EXT3-fs error (device sda7): ext3_readdir: bad entry in directory #7225391: rec_len is smaller than minimal - offset=3952, inode=0, rec_len=0, name_len=0
>>> Sorry for posting here in the thread but I got unsubscribed from the
>>> list so I don't have the beginning of the thread in my inbox.
>>>
>>> The ext3 directory format is such that the last directory entry in a block
>>> should have a length that exactly fills up the rest of the block. Apparently
>>> the length got trimmed for some reason, so we ended up before the end of the
>>> directory block, looked for another directory entry there, and didn't find
>>> anything. I will also make one observation regarding the offsets. They are
>>> 3952, 4024, and 4084. If we subtract these from 4096 (the block size), we get
>>> differences (in binary) of 10010000, 01001000, 00001100. Interestingly, these
>>> always have two bits set. Might be luck but need not...
>>
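The subtraction above is easy to double-check. Here is a quick sketch in Python; the function and variable names are mine, chosen only for illustration, and the offsets are the three distinct values from the log:

```python
BLOCK_SIZE = 4096  # block size stated in the quoted analysis

# The three distinct offsets reported by ext3_readdir in the log
offsets = [3952, 4024, 4084]

def distance_to_block_end(offset, block_size=BLOCK_SIZE):
    """Bytes between the reported offset and the end of the directory block."""
    return block_size - offset

for off in offsets:
    d = distance_to_block_end(off)
    # Show the remainder in binary together with its number of set bits
    print(f"offset={off} remaining={d} binary={d:08b} bits_set={bin(d).count('1')}")
```

This only confirms the arithmetic; whether the two-bit pattern means anything is still an open question.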
>> Yes, we also found something interesting: the offsets seen on many
>> boards are as follows:
>> 1) 3952
>> 2) 3988 (3952+36)
>> 3) 4024 (3988+36)
>> 4) 4048 (4024+24)
>> 5) 4084 (same as the rec_len of the ".." entry if there isn't any file).
>>
>> I should describe the pattern of the files in the dir, for example:
>> .
>> ..
>> current_log.txt (name length is 15, rec_len is 24 when there is a file after it;
>> I think the value 24 is related to offset 4048)
>> 20120526124556.865213.txt (name length is 25, rec_len is 36 when there is a file after it).
>> 20120526124984.239475.txt (name length is 25, rec_len is 36 when there is a file after it).
>> ....
>> Because the rec_len is 36, it seems related to those offset
>> values (the differences between those values are multiples of 36).
>> One more thing I should mention: the customer's app invokes opendir/readdir
>> very frequently, more than 1000 times every second (the value
>> needs to be confirmed).
>>
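The rec_len values quoted above follow from the ext3 on-disk rule that an entry occupies an 8-byte header plus the name, rounded up to a multiple of 4. A minimal sketch of that arithmetic, assuming this standard layout (the Python names are made up for illustration):

```python
DIRENT_HEADER = 8    # inode (4) + rec_len (2) + name_len (1) + file_type (1)
EXT3_DIR_ROUND = 3   # entries are padded to a 4-byte boundary

def min_rec_len(name_len):
    """Smallest rec_len that can hold a name of the given length."""
    return (name_len + DIRENT_HEADER + EXT3_DIR_ROUND) & ~EXT3_DIR_ROUND

for name in ["current_log.txt", "20120526124556.865213.txt"]:
    print(name, len(name), min_rec_len(len(name)))

# Offsets reported from the boards, and the gaps between neighbours
offsets = [3952, 3988, 4024, 4048, 4084]
gaps = [b - a for a, b in zip(offsets, offsets[1:])]
print(gaps)
```

The gaps come out as 36 or 24, which match the minimal rec_len of the two kinds of file names in the directory.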
>>> Anyway it would be interesting to get the dump of the corrupted directory
>>> before e2fsck is run. You can do that by running:
>>> debugfs -R "dump_inode <7225391> /tmp/corrupted_dir" /dev/sda7
>>>
>>> Then you can send the dump of the corrupted directory here.
>>
>> We have already dumped the data with debugfs. The data looks fine,
>> without any error. We did the dump before running fsck, and even fsck
>> did not report any error. I want to know whether fsck can modify disk
>> data without reporting any error?
> Ah, OK. So it seems that the directory block is OK, just f_pos gets corrupted
> somehow. There are guards in ext3_readdir() to rescan the dir block when the
> directory is modified but maybe that's not working correctly. I don't want
> to burn too much time on this since this is such an ancient kernel but I'd be
> looking in that direction...
>
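To illustrate that hypothesis outside the kernel, here is a minimal userspace sketch. It assumes the standard ext3 entry layout (4-byte inode, 2-byte rec_len, 1-byte name_len, 1-byte file_type, then the name). It builds a synthetic, valid 4096-byte directory block and then starts reading it from a stale offset. All helper names are hypothetical, the checks only loosely mirror the kernel's sanity tests, and this is not the ext3_readdir() code.

```python
import struct

BLOCK_SIZE = 4096
HEADER = struct.Struct("<IHBB")  # inode, rec_len, name_len, file_type

def min_rec_len(name_len):
    """Header plus name, rounded up to a multiple of 4."""
    return (name_len + HEADER.size + 3) & ~3

def make_entry(inode, name, file_type, rec_len=None):
    """Build one on-disk entry, zero-padded to rec_len bytes."""
    raw = name.encode()
    if rec_len is None:
        rec_len = min_rec_len(len(raw))
    return (HEADER.pack(inode, rec_len, len(raw), file_type) + raw).ljust(rec_len, b"\0")

def check_entry(block, offset):
    """Return an error string for a bad entry, or None if it looks sane."""
    _inode, rec_len, name_len, _ftype = HEADER.unpack_from(block, offset)
    if rec_len < min_rec_len(1):
        return "rec_len is smaller than minimal"
    if rec_len % 4:
        return "rec_len % 4 != 0"
    if rec_len < min_rec_len(name_len):
        return "rec_len is too small for name_len"
    if offset + rec_len > len(block):
        return "directory entry across blocks"
    return None

def walk(block, start=0):
    """Walk entries from `start`; return (names, None) or (names, (offset, error))."""
    names, offset = [], start
    while offset < len(block):
        err = check_entry(block, offset)
        if err:
            return names, (offset, err)
        inode, rec_len, name_len, _ftype = HEADER.unpack_from(block, offset)
        if inode:  # inode 0 marks an unused entry
            names.append(block[offset + 8:offset + 8 + name_len].decode())
        offset += rec_len
    return names, None

# A valid block: ".", "..", and one file whose rec_len fills the rest of the block
head = make_entry(2, ".", 2) + make_entry(2, "..", 2)
block = head + make_entry(12, "current_log.txt", 1, rec_len=BLOCK_SIZE - len(head))

print(walk(block))        # walking from offset 0 sees a healthy block
print(walk(block, 3952))  # a stale offset lands in the zero padding
```

In this sketch, a walk that starts at offset 3952 reads inode=0, rec_len=0, name_len=0 from the padding of the last entry, although the block itself is intact. The same walk() could in principle be run over 4096-byte chunks of a debugfs dump such as /tmp/corrupted_dir.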
I've added some debug code to ext3, which does the following:
- dump the dir block
- print the current and last f_pos and offset
- dump_stack() to see which process triggers the bug
Hopefully we can trigger the bug in our labs (we did see it happen twice this week
in a lab), though we can't patch the kernel in the products.
I compared ext3_readdir() with the latest ext3, and saw no difference except some
API changes. I'll dig deeper. Thanks for the suggestion!
Regards
Li Zefan