From: Tao Ma <tm@tao.ma>
To: Li Zefan <lizefan@huawei.com>
Cc: Theodore Ts'o <tytso@mit.edu>, Eric Sandeen <sandeen@redhat.com>,
	Yafang Shao <laoar.shao@gmail.com>,
	linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
	wuqixuan@huawei.com, wuqixuan@gmail.com
Subject: Re: help about ext3 read-only issue on ext3(2.6.16.30)
Date: Wed, 05 Dec 2012 22:26:05 +0800
Message-ID: <50BF597D.3040704@tao.ma>
In-Reply-To: <50BF2537.6070809@huawei.com>

On 12/05/2012 06:43 PM, Li Zefan wrote:
> On 2012/12/4 23:09, Theodore Ts'o wrote:
>> On Tue, Dec 04, 2012 at 09:54:05PM +0800, Li Zefan wrote:
>>>
>>> I've collected some logs in different machines, and the error was always
>>> triggered in ext3_readdir:
>>>
>>> EXT3-fs error (device sda7): ext3_readdir: bad entry in directory #6685458: rec_len is smaller than minimal - offset=3860, inode=0, rec_len=0, name_len=0
>>> EXT3-fs error (device sda7): ext3_readdir: bad entry in directory #9650541: rec_len is smaller than minimal - offset=3960, inode=0, rec_len=0, name_len=0
>>> EXT3-fs error (device sda7): ext3_readdir: bad entry in directory #11124783: rec_len is smaller than minimal - offset=4072, inode=0, rec_len=0, name_len=0
>>> EXT3-fs error (device sda7): ext3_readdir: bad entry in directory #52740880: rec_len is smaller than minimal - offset=4024, inode=0, rec_len=0, name_len=0
>>> EXT3-fs error (device sda7): ext3_readdir: bad entry in directory #52740880: rec_len is smaller than minimal - offset=4084, inode=0, rec_len=0, name_len=0
>>
>> This looks like the last part of the inode was zapped.  It might be
> 
> I don't think so. See below...
> 
>> worth adding a kernel patch which dumps out the entire directory block
>> as a hex dump when this triggers --- and then compare it to what you
>> get if you dump the directory back out after the machine reboot.  That
>> might give you a hint if something is corrupting the directory block
>> in memory.  (especially if you set the remount read-only option).
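[Editor's sketch] The check that produces "rec_len is smaller than minimal" can be approximated in user space, which may help when comparing a dumped directory block before and after a reboot. The following is a minimal sketch, not the kernel code: it assumes a 4 KiB block and the classic ext2/ext3 entry layout (4-byte inode, 2-byte rec_len, 1-byte name_len, 1-byte file_type, then the name). The names check_dir_block, hexdump and make_entry are hypothetical helpers, and every message string other than the one seen in the logs is only an illustrative label.

```python
import struct

BLOCK_SIZE = 4096   # assumption: 4 KiB blocks (all logged offsets are below 4096)
HEADER = 8          # inode (4) + rec_len (2) + name_len (1) + file_type (1)

def dir_rec_len(name_len):
    """Smallest valid entry size: header plus name, rounded up to 4 bytes."""
    return (HEADER + name_len + 3) & ~3

def check_dir_block(block):
    """Walk one directory block; return (offset, problem) for the first
    bad entry, or None if every entry passes these simplified checks."""
    offset = 0
    while offset < len(block):
        if offset + HEADER > len(block):
            return offset, "directory entry across blocks"
        inode, rec_len, name_len, _ftype = struct.unpack_from("<IHBB", block, offset)
        if rec_len < dir_rec_len(1):
            return offset, "rec_len is smaller than minimal"
        if rec_len % 4:
            return offset, "rec_len % 4 != 0"
        if rec_len < dir_rec_len(name_len):
            return offset, "rec_len is too small for name_len"
        if offset + rec_len > len(block):
            return offset, "directory entry across blocks"
        offset += rec_len
    return None

def hexdump(block, start=0, length=64):
    """Format part of a block as hex, 16 bytes per line, for side-by-side diffs."""
    lines = []
    for off in range(start, min(start + length, len(block)), 16):
        chunk = block[off:off + 16]
        lines.append("%04d: %s" % (off, " ".join("%02x" % b for b in chunk)))
    return "\n".join(lines)

def make_entry(inode, name, rec_len):
    """Build one synthetic entry (file_type fixed to 1 for simplicity)."""
    raw = struct.pack("<IHBB", inode, rec_len, len(name), 1) + name
    return raw.ljust(rec_len, b"\0")

# Healthy block: the last entry's rec_len stretches to the end of the block.
good = (make_entry(2, b".", 12) + make_entry(2, b"..", 12)
        + make_entry(11, b"a" * 25, BLOCK_SIZE - 24))

# Broken block: the last entry stops short, so the walk lands on zeroed bytes.
bad = (make_entry(2, b".", 12) + make_entry(2, b"..", 12)
       + make_entry(11, b"a" * 25, 36)).ljust(BLOCK_SIZE, b"\0")

print(check_dir_block(good))
print(check_dir_block(bad))
print(hexdump(bad, start=48, length=32))
```

In the synthetic broken block the walk steps into zero-filled space, which gives the same inode=0, rec_len=0, name_len=0 signature as the log lines. This only shows one way such a report can arise; it does not establish what happened on the affected machines.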
>>
>>> The last two errors happened on the same machine, and the same inode! One
>>> happened in 11/22 (I was told they had run fsck later on), and one in 12/01.
>>
>> If it's always the same inode, you might want to correlate based on
>> the pathname.  Is there any commonality across multiple machines in
>> terms of the directory name, and what application(s) might be touching
>> that directory?
>>
> 
> I found this in one log:
> 
> Nov 14 05:26:55 kernel: EXT3-fs error (device sda7): ext3_readdir: bad entry in directory #7225391: rec_len is smaller than minimal - offset=3952, inode=0, rec_len=0, name_len=0
> Nov 14 13:42:40 kernel: EXT3-fs error (device sda7): ext3_readdir: bad entry in directory #7225391: rec_len is smaller than minimal - offset=4024, inode=0, rec_len=0, name_len=0
> Nov 16 17:29:40 kernel: EXT3-fs error (device sda7): ext3_readdir: bad entry in directory #7225391: rec_len is smaller than minimal - offset=4084, inode=0, rec_len=0, name_len=0
> Nov 23 19:42:44 kernel: EXT3-fs error (device sda7): ext3_readdir: bad entry in directory #7225391: rec_len is smaller than minimal - offset=3952, inode=0, rec_len=0, name_len=0
> 
> This happened 4 times, on the same inode, at different offsets. Another log
> showed the same pattern.
> 
> They said they ran fsck every time this happened. Many machines hit this problem,
> but they recall that most of the time fsck didn't report any errors. (*)
> 
> I've checked the pathnames, and they all point to log dirs. There are 2 kinds
> of log dirs with different loggers, but they seem to work similarly.
> 
> Except for one bug report, all the others point to exactly the same log dir.
> 
> Two processes touch this dir. One is a monitor, which deletes old logs if
> they occupy too much space, but normally this shouldn't happen.
> 
> The other is the logger. When it wants to log something, it scans the
> directory, and if there are more than 100 log files, it deletes the oldest one.
> After writing to the current log file, if the file is larger than 8M, it is
> renamed as a backup log. I haven't read the code yet. But it sounds pretty
> simple, right?
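[Editor's sketch] The logger behaviour described above (scan, prune past 100 files, append, rename past 8M) could be mimicked for a reproducer roughly as follows. This is only a guess at the shape of the workload, since the real logger's code was not available in the thread. The function name log_message, the file names current.log and backup-NNNN.log, and the choice to pick the oldest file by mtime are all assumptions.

```python
import os

def log_message(log_dir, message, current="current.log",
                max_logs=100, max_size=8 * 1024 * 1024):
    """Append one message; return the backup path if the file was rotated."""
    # 1. Scan the directory (the readdir step where the error was reported).
    backups = [n for n in os.listdir(log_dir) if n != current]

    # 2. More than max_logs files: delete the oldest one (assumed: by mtime).
    if len(backups) > max_logs:
        oldest = min(backups,
                     key=lambda n: os.path.getmtime(os.path.join(log_dir, n)))
        os.unlink(os.path.join(log_dir, oldest))

    # 3. Append to the current log file.
    path = os.path.join(log_dir, current)
    with open(path, "a") as f:
        f.write(message + "\n")

    # 4. Larger than max_size: rename it to the first unused backup name.
    if os.path.getsize(path) > max_size:
        n = 0
        while os.path.exists(os.path.join(log_dir, "backup-%04d.log" % n)):
            n += 1
        backup = os.path.join(log_dir, "backup-%04d.log" % n)
        os.rename(path, backup)
        return backup
    return None
```

Running this in a loop against a scratch ext3 filesystem, with small max_logs and max_size values, would exercise the same readdir, unlink, append and rename sequence on one directory block.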
> 
> The length of the file name is 25. There were 35 logs dating from 2012/11/02
> to 2012/11/23, and no pending deleted files. Thus the remaining ~2.8K of the
> dir block is never used, so I don't think something zeroed it because it
> has always been zero.
Only 35 files? So there should be no rename. And the only possible
action we do to this dir is "create a new log file", right? Then, I
really don't think ext3 will error in such a simple test case. :(
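[Editor's sketch] The "~2.8K never used" figure quoted above can be checked with a little arithmetic. This assumes a 4096-byte block, the usual 8-byte entry header with names padded to a multiple of 4, the "." and ".." entries, and no gaps left by deleted entries (the quoted text says there were no pending deletions).

```python
BLOCK_SIZE = 4096  # assumption: 4 KiB blocks

def dir_rec_len(name_len):
    """On-disk size of one entry: 8-byte header plus name, rounded up to 4."""
    return (8 + name_len + 3) & ~3

# "." and ".." plus 35 log files with 25-character names.
used = dir_rec_len(1) + dir_rec_len(2) + 35 * dir_rec_len(25)
free = BLOCK_SIZE - used

# Offsets taken from the error messages quoted in this thread.
reported_offsets = [3860, 3960, 4072, 4024, 4084, 3952]
in_unused_tail = all(off >= used for off in reported_offsets)

print(used, free, in_unused_tail)  # 1284 2812 True
```

Under these assumptions the entries occupy 1284 bytes and leave 2812 bytes free, which matches the ~2.8K estimate, and every reported offset falls inside that unused tail of the block.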

> 
> This log dir is new in this version, while the other one also exists in
> the old version, with less I/O.
You mean the kernel version? Sorry, but what do you want to tell us here?

Thanks
Tao
> 
> (*) They have machines in different spots. In another spot, 5 out of ~30
> machines met this problem after upgrading, and fsck reported errors in
> all of them. However there were just a few errors, and they didn't seem to
> relate to the directory, which means the directory seems intact. Adding
> that the fs was created nearly 1 year ago and never fscked, those errors
> might have nothing to do with this bug?
> 
> btw, the version of e2fsprogs is: e2fsck 1.38 (30-Jun-2005)
> 
> Regards
> Li Zefan
> 