Date: Mon, 23 Mar 2009 08:17:09 -0400
From: Theodore Tso
To: Richard
Cc: linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org
Subject: Re: Severe data corruption with ext4
Message-ID: <20090323121709.GD13368@mit.edu>
References: <20090320030121.1fa8e6d3.akpm@linux-foundation.org> <20090323020522.GF29466@mit.edu>

On Mon, Mar 23, 2009 at 10:10:43AM +0100, Richard wrote:
> > That's another indication of data corruption in inode 1022.  This
> > could be hardware induced corruption; or it could be a software
> > induced error.  There's been one other user with a RAID that had
> > reported a strange corruption near the beginning of the filesystem, in
> > the inode table.  How big is your filesystem, exactly?
>
> 5,158,556 K.

OK, so about 5 gigs; not all that big at all.  I was starting to worry
that maybe we had some 32-bit signed/unsigned problem, but that would
be showing up in the 8+ TB range.

> Attached, as well as the itable image.

I've analyzed the itable image, and it looks valid; in particular, I
didn't see any evidence of corruption in inode 1022.

>
> By the way, yesterday's fsck on another file system (/home) placed
> almost 8,500 (!) files and directories in lost+found. I have not a
> single error message regarding this device in my log files. All
> files/directories were originally placed in the same parent directory.

There is something very wrong going on here, and I'm at a loss why no
one else is reporting anything like what you are seeing.  Are you able
to run a stock, unmodified mainline kernel on your system?  At this
point I'd really like to see if the problems you are seeing can be
replicated with a stock 2.6.29-rc8 kernel.

						- Ted