From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S261442AbVE0CYv (ORCPT );
	Thu, 26 May 2005 22:24:51 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S261449AbVE0CYv (ORCPT );
	Thu, 26 May 2005 22:24:51 -0400
Received: from postfix4-2.free.fr ([213.228.0.176]:12520 "EHLO postfix4-2.free.fr")
	by vger.kernel.org with ESMTP id S261442AbVE0CYg (ORCPT );
	Thu, 26 May 2005 22:24:36 -0400
Message-ID: <4296847C.6010602@guerrier.com>
Date: Fri, 27 May 2005 04:22:52 +0200
From: Olivier Guerrier
User-Agent: Debian Thunderbird 1.0.2 (X11/20050331)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: huiac@internode.on.net
CC: linux-kernel@vger.kernel.org
Subject: Re: Fake ext3 corruption on raid5 in 2.6.11.9 smp
References: <42959820.7090309@guerrier.com> <20050527012909.GA32085@oasissystems.com.au>
In-Reply-To: <20050527012909.GA32085@oasissystems.com.au>
X-Enigmail-Version: 0.86.1.0
X-Enigmail-Supports: pgp-inline, pgp-mime
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

First, I just realized the typo in my previous message's subject:
Should read 2.6.*11.*9 smp, instead of 2.6_.9. Sorry :|

jpearson wrote:
> Hi,
>
> I saw this exact same error (EXT3-fs error (device dm-x): ext3_readdir:
> bad entry in directory #nnnnnnn: rec_len % 4 != 0 - offset=0, inode=xxxxxxxx,
> rec_len=...
>
> from time to time in my non-SMP RAID system with 512Mb RAM, with ext3 on LVM on top of RAID5.
>
> Never caused actual corruption - run FSCK, no errors, remount rw
> successfully until next time; error rarely in the same place, but always
> in a directory and rec_len % 4 != 0. Looks like an 'in-kernel' thing,
> because (e.g.) running find on the volume after remounting rw produced
> no issues, so presumably the on-disk directory wasn't *really* the
> issue.

I confirm this here too: random place, always a dir, always
'rec_len % 4 != 0', no fs issue or data loss (so far...)

> Filesystems between about 8 and 50 Gb, and not what I'd characterise as a
> heavy load.

By heavy load, I mean a system load between 10 and 15 for 3 hours
(before the error). The processes running were several instances of
mkisofs (reading from and writing to the faulty partition).

> This was with about 2.6.4 - 2.6.7. I'm running 2.6.11 now and haven't
> seen it in some time; so it was either fixed by 2.6.11, or mounting ro
> by default has just reduced my exposure.

As my kernel is 2.6.11.9, it is not fixed so far.

I will reformat when possible. This time I will use lvm over raid5, so I
can use xfs for my useful data, and keep a medium ext3 partition to run
tests if needed (I just need to know what to test).

Thanks
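
For reference, here is a minimal user-space sketch of the kind of sanity check that produces the 'rec_len % 4 != 0' message. It is not the kernel code. It assumes the usual ext2/ext3 directory entry header (little-endian 32-bit inode, 16-bit rec_len, 8-bit name_len, 8-bit file_type, then the name); the function name check_dir_entry is hypothetical, and the error strings other than 'rec_len % 4 != 0' are approximations.

```python
import struct

# Assumed on-disk header of one directory entry:
# le32 inode, le16 rec_len, u8 name_len, u8 file_type, followed by the name.
HEADER = "<IHBB"
HEADER_SIZE = struct.calcsize(HEADER)  # 8 bytes


def check_dir_entry(block, offset=0):
    """Return an error string for the entry at `offset`, or None if it looks sane.

    Hypothetical helper that only mirrors the style of the kernel's checks.
    """
    inode, rec_len, name_len, _file_type = struct.unpack_from(HEADER, block, offset)

    # Smallest legal record: header plus a 1-byte name, rounded up to 4 bytes.
    minimal = (HEADER_SIZE + 1 + 3) & ~3
    # Space this particular name needs, also rounded up to 4 bytes.
    needed = (HEADER_SIZE + name_len + 3) & ~3

    if rec_len < minimal:
        return "rec_len is smaller than minimal"
    if rec_len % 4 != 0:
        return "rec_len % 4 != 0"
    if rec_len < needed:
        return "rec_len is too small for name_len"
    if offset + rec_len > len(block):
        return "directory entry across blocks"
    return None


def make_entry(inode, rec_len, name):
    """Build a raw entry for experiments (file_type 2 assumed to mean directory)."""
    raw = struct.pack(HEADER, inode, rec_len, len(name), 2) + name
    return raw.ljust(max(rec_len, len(raw)), b"\0")
```

If the on-disk block passes such a check when read back later (as fsck and find suggest here), the bad rec_len was presumably only present in the in-memory copy the kernel looked at, which fits the 'in-kernel' theory above.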