From: Jan Kara <jack@suse.cz>
To: Chris Mason <chris.mason@oracle.com>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
tytso@mit.edu, Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH RFC] Add locking to ext3_do_update_inode
Date: Tue, 8 Sep 2009 00:14:54 +0200 [thread overview]
Message-ID: <20090907221453.GB11748@duck.suse.cz> (raw)
In-Reply-To: <20090904200613.GJ17033@think>
On Fri 04-09-09 16:06:13, Chris Mason wrote:
> Hello everyone,
>
> I've been struggling with this off and on while I've been testing the
> data=guarded work. The symptom is corrupted orphan lists and inodes
> with the wrong i_size stored on disk. I was convinced the
> data=guarded code was just missing a call to ext3_mark_inode_dirty, but
> tracing showed the i_disksize I was sending to ext3_mark_inode_dirty
> wasn't actually making it to the drive.
>
> ext3_mark_inode_dirty can be called without locks held (atime updates
> and a few others), so the data=guarded code uses locks while updating
> the in-memory inode, and then calls ext3_mark_inode_dirty
> without any locks held.
>
> But, ext3_mark_inode_dirty has no internal locking to make sure that
> only one CPU is updating the buffer head at a time. Generally this
> works out ok because everyone that changes the inode then calls
> ext3_mark_inode_dirty themselves. Even though it races, eventually
> someone updates the buffer heads and things move on.
>
> But there is still a risk of the wrong values getting in, and the
> data=guarded code seems to hit the race very often.
>
> Since everyone that changes the inode also logs it, it should be
> possible to fix this with some memory barriers. I'll leave that as an
> exercise to the reader and lock the buffer head instead.
>
> It it probably a good idea to have a different patch series for lockless
> bit flipping on the ext3 i_state field. ext3_do_update_inode &= clears
> EXT3_STATE_NEW without any locks held.
>
> Signed-off-by: Chris Mason <chris.mason@oracle.com>
The patch looks good. I've added it to my tree...
Honza
> diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
> index 00f5dc1..6a0a056 100644
> --- a/fs/ext3/inode.c
> +++ b/fs/ext3/inode.c
> @@ -3466,6 +3479,10 @@ static int ext3_do_update_inode(handle_t *handle,
> struct buffer_head *bh = iloc->bh;
> int err = 0, rc, block;
>
> +again:
> + /* we can't allow multiple procs in here at once, its a bit racey */
> + lock_buffer(bh);
> +
> /* For fields not not tracking in the in-memory inode,
> * initialise them to zero for new inodes. */
> if (ei->i_state & EXT3_STATE_NEW)
> @@ -3525,16 +3542,20 @@ static int ext3_do_update_inode(handle_t *handle,
> /* If this is the first large file
> * created, add a flag to the superblock.
> */
> + unlock_buffer(bh);
> err = ext3_journal_get_write_access(handle,
> EXT3_SB(sb)->s_sbh);
> if (err)
> goto out_brelse;
> +
> ext3_update_dynamic_rev(sb);
> EXT3_SET_RO_COMPAT_FEATURE(sb,
> EXT3_FEATURE_RO_COMPAT_LARGE_FILE);
> handle->h_sync = 1;
> err = ext3_journal_dirty_metadata(handle,
> EXT3_SB(sb)->s_sbh);
> + /* get our lock and start over */
> + goto again;
> }
> }
> }
> @@ -3557,6 +3578,7 @@ static int ext3_do_update_inode(handle_t *handle,
> raw_inode->i_extra_isize = cpu_to_le16(ei->i_extra_isize);
>
> BUFFER_TRACE(bh, "call ext3_journal_dirty_metadata");
> + unlock_buffer(bh);
> rc = ext3_journal_dirty_metadata(handle, bh);
> if (!err)
> err = rc;
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
next prev parent reply other threads:[~2009-09-07 22:14 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-09-04 20:06 [PATCH RFC] Add locking to ext3_do_update_inode Chris Mason
2009-09-07 22:14 ` Jan Kara [this message]
2009-09-07 22:30 ` Jan Kara
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20090907221453.GB11748@duck.suse.cz \
--to=jack@suse.cz \
--cc=akpm@linux-foundation.org \
--cc=chris.mason@oracle.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).