linux-ext4.vger.kernel.org archive mirror
From: Alex Tomas <bzzz@sun.com>
To: Theodore Tso <tytso@mit.edu>
Cc: Kevin Shanahan <kmshanah@ucwb.org.au>,
	Andreas Dilger <adilger@sun.com>,
	linux-ext4@vger.kernel.org
Subject: Re: More ext4 acl/xattr corruption - 4th occurence now
Date: Fri, 15 May 2009 07:57:14 +0400
Message-ID: <4A0CE81A.1060006@sun.com>
In-Reply-To: <20090514161254.GJ11352@mit.edu>

When the cache was introduced, a single exclusive spinlock protected
the whole of ext3_ext_get_blocks, so there was no concurrency at all.
So I guess your theory is correct.

thanks, Alex

Theodore Tso wrote:
> On Fri, May 15, 2009 at 12:00:15AM +0930, Kevin Shanahan wrote:
>>> debugfs: stat <759>
>> hermes:~# debugfs /dev/dm-0
>> debugfs 1.41.3 (12-Oct-2008)
>> debugfs:  stat <759>
>>
>> Inode: 759   Type: regular    Mode:  0660   Flags: 0x80000
>> Generation: 3979120103    Version: 0x00000000:00000001
>> User:     0   Group: 10140   Size: 14615630848
>> File ACL: 0    Directory ACL: 0
>> Links: 1   Blockcount: 28546168
>> Fragment:  Address: 0    Number: 0    Size: 0
>>  ctime: 0x4a0acdb5:2a88cbec -- Wed May 13 23:10:05 2009
>>  atime: 0x4a0ac45b:10899618 -- Wed May 13 22:30:11 2009
>>  mtime: 0x4a0acdb5:2a88cbec -- Wed May 13 23:10:05 2009
>> crtime: 0x4a0ac45b:10899618 -- Wed May 13 22:30:11 2009
> 
>> Inode   Pathname
>> 759     /local/dumps/exchange/exchange-2000-UCWB-KVM-18.bkf
> 
> Do you know how the system was likely writing into
> /local/dumps/exchange/exchange-2000-UCWB-KVM-18.bkf?  Was this a
> backup via rsync or tar?  Was this some application writing into a
> pre-existing file via NFS, or via local disk access?
> 
> Given the ctime/atime fields, I'm inclined to guess the latter, but it
> would be good to know.
> 
> The stat dump for the inode 759 does *not* show logical block 1741329
> getting mapped to physical block 529.  So the question is how did that
> happen?
> 
> I've started looking, and one thing jumped out at me.  I need to check
> in with the Lustre folks who originally donated the code, but I don't
> see any spinlocks or mutexes protecting the inode's extent cache.  So
> if you are on an SMP machine, this could potentially have caused the
> problem.  How many CPUs or cores do you have?  What does
> /proc/cpuinfo report?  Also, would it be correct to assume this file
> is getting served up via Samba?  My theory is that we might be running
> into problems when two threads are simultaneously trying to read and
> write a single file.
> 
> Hmm, what is accessing your files on this system?  Are you just doing
> backups?  Is it just a backup server?  Or are you serving up files
> using Samba and there are clients which are accessing those files?
> 
> So if this is the problem, the following experiment should be able to
> confirm it, by seeing whether the problem goes away when we
> short-circuit the inode's extent cache.  In fs/ext4/extents.c,
> try inserting a "return" statement in ext4_ext_put_in_cache():
> 
> static void
> ext4_ext_put_in_cache(struct inode *inode, ext4_lblk_t block,
> 			__u32 len, ext4_fsblk_t start, int type)
> {
> 	struct ext4_ext_cache *cex;
> 
> 	return;		/* <---- insert this line */
> 	BUG_ON(len == 0);
> 	cex = &EXT4_I(inode)->i_cached_extent;
> 	cex->ec_type = type;
> 	cex->ec_block = block;
> 	cex->ec_len = len;
> 	cex->ec_start = start;
> }
> 
> This should short circuit the i_cached_extent cache, and this may be
> enough to make your problem go away.  (If this theory is correct,
> using mount -o nodelalloc probably won't make a difference, although
> it might change the timing enough to make the bug harder to see.)
> 
> If that solves the problem, the right long-term fix will be to drop in
> a spinlock to protect i_cached_extent.
> 
> 						- Ted
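
For illustration only, here is a rough user-space sketch of the kind of
locking discussed above: a single-entry extent cache whose fields are
always written and read under one lock, so a reader can never combine
the logical start of one extent with the physical start of another.
This is not the ext4 code and not the eventual fix.  The names
(struct ext_cache, ext_cache_put, ext_cache_lookup) are hypothetical,
and a pthread mutex stands in for a kernel spinlock.

```c
/* User-space sketch only; all names here are hypothetical, not ext4 API. */
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

struct ext_cache {
	pthread_mutex_t lock;	/* stands in for a kernel spinlock */
	uint32_t block;		/* first logical block covered */
	uint32_t len;		/* number of blocks; 0 means empty */
	uint64_t start;		/* first physical block */
};

static void ext_cache_init(struct ext_cache *c)
{
	pthread_mutex_init(&c->lock, NULL);
	c->block = 0;
	c->len = 0;
	c->start = 0;
}

/* Update every field under the lock so readers never see a mix of
 * old and new values. */
static void ext_cache_put(struct ext_cache *c, uint32_t block,
			  uint32_t len, uint64_t start)
{
	pthread_mutex_lock(&c->lock);
	c->block = block;
	c->len = len;
	c->start = start;
	pthread_mutex_unlock(&c->lock);
}

/* Map a logical block through the cache; return false on a miss. */
static bool ext_cache_lookup(struct ext_cache *c, uint32_t lblk,
			     uint64_t *pblk)
{
	bool hit = false;

	pthread_mutex_lock(&c->lock);
	if (c->len != 0 && lblk >= c->block && lblk - c->block < c->len) {
		*pblk = c->start + (lblk - c->block);
		hit = true;
	}
	pthread_mutex_unlock(&c->lock);
	return hit;
}
```

Without the lock, a lookup running between the stores in
ext_cache_put() could pair the new logical range with the old physical
start, which is one way a logical block could end up mapped to an
unrelated physical block.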

