From: Namjae Jeon <namjae.jeon@samsung.com>
To: 'Jan Kara' <jack@suse.cz>
Cc: "'Dave Chinner'" <david@fromorbit.com>,
"'Theodore Ts'o'" <tytso@mit.edu>,
"'Alexander Viro'" <viro@zeniv.linux.org.uk>,
"'Brian Foster'" <bfoster@redhat.com>,
"'Dmitry Monakhov'" <dmonakhov@openvz.org>,
"'Lukáš Czerner'" <lczerner@redhat.com>,
linux-fsdevel@vger.kernel.org,
"'Ashish Sangwan'" <a.sangwan@samsung.com>,
linux-kernel@vger.kernel.org
Subject: RE: [RFC PATCH] fs: file freeze support
Date: Mon, 19 Jan 2015 21:34:06 +0900
Message-ID: <005501d033e4$37149440$a53dbcc0$@samsung.com>
In-Reply-To: <20150116105712.GF25884@quack.suse.cz>
> Hello,
Hi Jan.
>
> On Fri 16-01-15 15:48:04, Namjae Jeon wrote:
> > > > +int file_write_unfreeze(struct inode *inode)
> > > > +{
> > > > +	struct super_block *sb = inode->i_sb;
> > > > +
> > > > +	if (!S_ISREG(inode->i_mode))
> > > > +		return -EINVAL;
> > > > +
> > > > +	spin_lock(&inode->i_lock);
> > > > +
> > > > +	if (!(inode->i_state & I_WRITE_FREEZED)) {
> > > > +		spin_unlock(&inode->i_lock);
> > > > +		return -EINVAL;
> > > > +	}
> > > > +
> > > > +	inode->i_state &= ~I_WRITE_FREEZED;
> > > > +	smp_wmb();
> > > > +	wake_up(&sb->s_writers.wait_unfrozen);
> > > > +	spin_unlock(&inode->i_lock);
> > > > +	return 0;
> > > > +}
> > > > +EXPORT_SYMBOL(file_write_unfreeze);
> > > So I was looking at the implementation and I have a few comments:
> > > 1) The trick with freezing superblock looks nice but I'm somewhat worried
> > > that if we wanted to heavily use per-inode freezing to defrag the whole
> > > filesystem it may be too slow to freeze the whole fs, mark one inode as
> > > > frozen and then unfreeze the fs. But I guess we'll see that once we have some
> > > reasonably working implementation.
> > Dmitry has given a good idea to avoid multiple freeze fs and unfreeze fs
> > calls.
> >
> > ioctl(sb,FIFREEZE)
> > while (f = pop(files_list))
> >   ioctl(f,FS_IOC_FWFREEZE)
> > ioctl(sb,FITHAW)
> >
> > In file_write_freeze, we could first check if the fs is already frozen;
> > if yes, then we can directly set the inode write-freeze state after taking
> > the relevant lock to prevent fs_thaw while the inode state is being set.
> Well, doing fs-wide freezing from userspace makes sense as Dmitry pointed
> out. We can then just fail FS_IOC_FWFREEZE with error when the whole fs isn't
> frozen. I'm just somewhat worried whether the fs-wide freezing won't be too
> fragile. E.g. consider a situation when you are running a defrag program
> which is freezing and unfreezing the filesystem and then some background
> work kicks in which will want to snapshot the filesystem, so it will freeze &
> unfreeze the fs as well. Now, depending on how exactly defrag and snapshot
> race, one of the FIFREEZE ioctls will return EBUSY and the process
> (hopefully gracefully) fails.
>
> This isn't a new situation - if you ran two snapshots at once, you'd see
> the same failure. But the more fs-wide freezing gets used in different
> places the stranger and less expected failures you'll see...
Yes, right. Thanks for your opinion. I will consider your point.
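
For illustration, the batched sequence quoted above could be driven from
userspace roughly as follows. This is only a sketch under assumptions:
FS_IOC_FWFREEZE exists only in this RFC, so the request number is a
placeholder, and freeze_files_batched(), real_ioctl() and trace_ioctl() are
hypothetical names. The ioctl call goes through a function pointer so the
ordering can be dry-run without root or a real filesystem.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <linux/fs.h>	/* FIFREEZE, FITHAW */

/*
 * ASSUMPTION: FS_IOC_FWFREEZE exists only in this RFC. The request number
 * below is a placeholder so the sketch compiles; it is not a real ABI value.
 */
#ifndef FS_IOC_FWFREEZE
#define FS_IOC_FWFREEZE	_IOWR('X', 200, int)
#endif

/* Indirection so the sequence can be dry-run without root or a real fs. */
typedef int (*ioctl_fn)(int fd, unsigned long request);

/* Thin wrapper around the real ioctl(2) for actual use. */
int real_ioctl(int fd, unsigned long request)
{
	return ioctl(fd, request, 0);
}

/* Dry-run stand-in: records each request and always succeeds. */
#define TRACE_MAX 16
unsigned long trace_req[TRACE_MAX];
size_t trace_len;

int trace_ioctl(int fd, unsigned long request)
{
	(void)fd;
	assert(trace_len < TRACE_MAX);
	trace_req[trace_len++] = request;
	return 0;
}

/*
 * Freeze the fs once, mark every file, thaw once.
 * Returns 0 on success or a negative errno.
 */
int freeze_files_batched(int sb_fd, const int *fds, size_t n, ioctl_fn do_ioctl)
{
	int ret = 0;
	size_t i;

	/* May fail, e.g. with EBUSY if something else already froze the fs. */
	if (do_ioctl(sb_fd, FIFREEZE) < 0)
		return -errno;

	for (i = 0; i < n; i++) {
		if (do_ioctl(fds[i], FS_IOC_FWFREEZE) < 0) {
			ret = -errno;
			break;
		}
	}

	/* Always thaw, even if marking a file failed. */
	if (do_ioctl(sb_fd, FITHAW) < 0 && ret == 0)
		ret = -errno;

	return ret;
}
```

A caller that gets -EBUSY back from the first step is in exactly the
defrag-vs-snapshot race described above and has to decide whether to retry
or give up.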
>
> > > 2) The tests you are currently doing are racy. If
> > > things happen as:
> > >   CPU1                                  CPU2
> > > inode_start_write()
> > >                                         file_write_freeze()
> > > sb_start_pagefault()
> > > Do modifications.
> > >
> > > Then you have a CPU modifying a file while file_write_freeze() has
> > > succeeded so it should be frozen.
> > >
> > > If you swap inode_start_write() with sb_start_pagefault() the above race
> > > doesn't happen but userspace program has to be really careful not to hit a
> > > deadlock. E.g. if you tried to freeze two inodes the following could happen:
> > >   CPU1                                  CPU2
> > > file_write_freeze(inode1)
> > >                                         fault on inode1:
> > >                                           sb_start_pagefault()
> > >                                           inode_start_write() -> blocks
> > > file_write_freeze(inode2)
> > >   blocks in freeze_super()
> > >
> > > So I don't think this is a good scheme for inode freezing...
> > To solve this race, we can fold inode_start_write with sb_start_write and use
> > an approach similar to __sb_start_write.
> > How about the scheme below?
> >
> > void inode_start_write(struct inode *inode)
> > {
> > 	struct super_block *sb = inode->i_sb;
> >
> > retry:
> >
> > 	if (unlikely(inode->i_state & I_WRITE_FREEZED)) {
> > 		DEFINE_WAIT(wait);
> >
> > 		prepare_to_wait(&sb->s_writers.wait_unfrozen, &wait,
> > 				TASK_UNINTERRUPTIBLE);
> > 		schedule();
> > 		finish_wait(&sb->s_writers.wait_unfrozen, &wait);
> >
> > 		goto retry;
> > 	}
> >
> > 	sb_start_write(sb);
> >
> > 	/* check if file_write_freeze race with us */
> > 	if (unlikely(inode->i_state & I_WRITE_FREEZED)) {
> > 		sb_end_write(sb);
> > 		goto retry;
> > 	}
> > }
> Yes, this should work. You'll need a similar wrapper for page faults but
> that's easy enough.
Okay, Thanks :)
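
For illustration, the page fault wrapper mentioned above might follow the
same check / acquire / recheck shape. The sketch below is a userspace model
under assumptions, not the patch itself: struct super_block, struct inode,
sb_start_pagefault() and sb_end_pagefault() are minimal stand-ins for the
kernel ones, the I_WRITE_FREEZED bit value is a placeholder, and
inode_start_pagefault() and wait_for_inode_unfreeze() are hypothetical names.

```c
#include <assert.h>

/* Userspace stand-ins so the sketch compiles outside the kernel. */
struct super_block { int pagefault_writers; };
struct inode { struct super_block *i_sb; unsigned long i_state; };

/* ASSUMPTION: placeholder bit value; the RFC patch defines the real flag. */
#define I_WRITE_FREEZED (1UL << 0)

/* Stand-ins for the real helpers, which may sleep while the fs is frozen. */
void sb_start_pagefault(struct super_block *sb) { sb->pagefault_writers++; }
void sb_end_pagefault(struct super_block *sb)
{
	assert(sb->pagefault_writers > 0);
	sb->pagefault_writers--;
}

/*
 * Stand-in for sleeping on sb->s_writers.wait_unfrozen. In this model the
 * "unfreeze" happens right here instead of in another task.
 */
void wait_for_inode_unfreeze(struct inode *inode)
{
	inode->i_state &= ~I_WRITE_FREEZED;
}

/* Hypothetical name: page fault counterpart of inode_start_write(). */
void inode_start_pagefault(struct inode *inode)
{
	struct super_block *sb = inode->i_sb;

	for (;;) {
		if (inode->i_state & I_WRITE_FREEZED) {
			wait_for_inode_unfreeze(inode);
			continue;
		}

		sb_start_pagefault(sb);

		/* Recheck: file_write_freeze() may have raced with us. */
		if (!(inode->i_state & I_WRITE_FREEZED))
			return;

		sb_end_pagefault(sb);
	}
}
```

The recheck after sb_start_pagefault() is the important part: the inode flag
can be set while the caller is blocked on a frozen fs, so the first test
alone is not enough.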
>
> Honza
>
> --
> Jan Kara <jack@suse.cz>
> SUSE Labs, CR
Thread overview: 12+ messages
2015-01-15 11:36 [RFC PATCH] fs: file freeze support Namjae Jeon
2015-01-15 15:19 ` Dmitry Monakhov
2015-01-16 5:54 ` Namjae Jeon
2015-01-15 16:17 ` Jan Kara
2015-01-16 6:48 ` Namjae Jeon
2015-01-16 10:57 ` Jan Kara
2015-01-19 12:34 ` Namjae Jeon [this message]
2015-01-18 23:33 ` Dave Chinner
2015-01-19 13:07 ` Namjae Jeon
2015-01-20 11:21 ` Jan Kara
2015-01-20 22:22 ` Dave Chinner
2015-01-21 0:15 ` Namjae Jeon