From: Chris Mason <chris.mason@oracle.com>
To: Jan Kara <jack@suse.cz>
Cc: tytso@mit.edu, linux-kernel@vger.kernel.org,
linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 2/2] Ext3: data=guarded mode
Date: Wed, 16 Sep 2009 10:37:56 -0400 [thread overview]
Message-ID: <20090916143756.GC2641@think> (raw)
In-Reply-To: <20090916140913.GN26030@duck.suse.cz>
On Wed, Sep 16, 2009 at 04:09:13PM +0200, Jan Kara wrote:
> On Tue 15-09-09 14:39:06, Chris Mason wrote:
[ ... ]
> > > The code here still looks suspicious.
> > > 1) Inodes can be on orphan list with i_nlink > 0 when a write failed for
> > > some reason and we have to truncate blocks instantiated beyond i_size.
> > > Those places (similarly as truncate) expect that while they hold i_mutex
> > > they are safe doing what they want with the orphan list. This code would
> > > happily remove the inode from orphan list...
> >
> > The only risky place for this is the work thread doing the ordered
> > writes. Truncate gets around it by waiting for the ordered completions.
> > I'll add the wait to the error handlers as well.
> You probably mean guarded writes. I agree.
Sorry, guarded in ext3 is ordered in btrfs and I'm easily confused.
>
> > > 2) Cannot it happen that:
> > > CPU1
> > > orphan_del()
> > > if (inode->i_nlink && list_empty(ordered_list)) {
> > > ext3_ordered_unlock(inode);
> > > lock_super(inode->i_sb);
> > > smp_mb();
> > > if (inode->i_nlink == 0 || !list_empty(ordered_list)) {
> > >
> > > CPU2
> > > journal_dirty_data_guarded_fn()
> > > ret = ext3_add_ordered_extent(inode, offset, bh);
> > > if (ret == 0 && buffer_dataguarded(bh) &&
> > > list_empty(&EXT3_I(inode)->i_orphan) &&
> > > !(EXT3_SB(inode->i_sb)->s_mount_state & EXT3_ORPHAN_FS)) - list isn't
> > > empty yet so we don't add inode to orphan list, but on CPU1, we go ahead
> > > and remove inode from the orphan list...
> >
> > This used to have a check after the orphan_del to re-add the orphan if
> > we raced with the end_io handlers. I removed it because I thought it
> > was over-paranoid, but I see that you're right. So, I'll put that one
> > back in.
> Hmm, that will probably work but it's ugly :(. The ugliness is localized
> in the guarded mode code so probably we can bear it for a while but I'll
> certainly try to look into what we can do to get rid of it :).
;)
>
> > > > diff --git a/include/linux/ext3_fs_i.h b/include/linux/ext3_fs_i.h
> > > > index ca1bfe9..a6cf26d 100644
> > > > --- a/include/linux/ext3_fs_i.h
> > > > +++ b/include/linux/ext3_fs_i.h
> > > > @@ -137,6 +180,8 @@ struct ext3_inode_info {
> > > > * by other means, so we have truncate_mutex.
> > > > */
> > > > struct mutex truncate_mutex;
> > > > +
> > > > + struct ext3_ordered_buffers ordered_buffers;
> > > > struct inode vfs_inode;
> > > > };
> > > Hmm, how hard would it be to hide especially this behind
> > > CONFIG_EXT3_GUARDED_DATA so that we can avoid increasing inode size for
> > > users which are not interested in the new guarded mode?
> >
> > I'm not too picky, but it would litter the code with #ifdefs around the
> > guarded functions. I'd rather not.
> Looking into the code, it needn't be too bad if we define a a few
> functions as empty in !guarded case. I'll have a look at it for the next
> version of your patch.
Fair enough, I plan on hammering out the next version today or tomorrow.
-chris
next prev parent reply other threads:[~2009-09-16 14:38 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-09-08 15:09 [PATCH RFC] Ext3 data=guarded Chris Mason
2009-09-08 15:09 ` [PATCH 1/2] Ext3: Fix race in ext3_mark_inode_dirty Chris Mason
2009-09-08 15:09 ` [PATCH 2/2] Ext3: data=guarded mode Chris Mason
2009-09-15 17:29 ` Jan Kara
2009-09-15 18:39 ` Chris Mason
2009-09-16 14:09 ` Jan Kara
2009-09-16 14:37 ` Chris Mason [this message]
2009-09-21 16:29 ` Chris Mason
2009-09-24 16:16 ` Chris Mason
2009-09-17 21:53 ` [PATCH RFC] Ext3 data=guarded Jamie Lokier
2009-09-17 22:19 ` Chris Mason
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20090916143756.GC2641@think \
--to=chris.mason@oracle.com \
--cc=jack@suse.cz \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.