linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Theodore Ts'o <tytso@mit.edu>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	xfs@oss.sgi.com, Christoph Hellwig <hch@infradead.org>,
	Jan Kara <jack@suse.cz>,
	"linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>,
	Tim Chen <tim.c.chen@linux.intel.com>
Subject: Re: [PATCH v3 2/5] fs: Add inode_update_time_writable
Date: Tue, 20 Aug 2013 18:10:55 +0200	[thread overview]
Message-ID: <20130820161055.GD2862@quack.suse.cz> (raw)
In-Reply-To: <CALCETrVrSiXCt5+2801C+QA6B1jzb0K3VHT6w8sVf_VXrz16Bw@mail.gmail.com>

On Mon 19-08-13 21:07:49, Andy Lutomirski wrote:
> On Mon, Aug 19, 2013 at 8:33 PM, Dave Chinner <david@fromorbit.com> wrote:
> > On Mon, Aug 19, 2013 at 08:20:12PM -0700, Andy Lutomirski wrote:
> >> On Mon, Aug 19, 2013 at 7:28 PM, Dave Chinner <david@fromorbit.com> wrote:
> >> > On Fri, Aug 16, 2013 at 04:22:09PM -0700, Andy Lutomirski wrote:
> >> >> This is like file_update_time, except that it acts on a struct inode *
> >> >> instead of a struct file *.
> >> >>
> >> >> Signed-off-by: Andy Lutomirski <luto@amacapital.net>
> >> >> ---
> >> >>  fs/inode.c         | 72 ++++++++++++++++++++++++++++++++++++++++++------------
> >> >>  include/linux/fs.h |  1 +
> >> >>  2 files changed, 58 insertions(+), 15 deletions(-)
> >> >>
> >>
> >> [...]
> >>
> >> >> +
> >> >> +int inode_update_time_writable(struct inode *inode)
> >> >> +{
> >> >> +     struct timespec now;
> >> >> +     int sync_it = prepare_update_cmtime(inode, &now);
> >> >> +     int ret;
> >> >> +
> >> >> +     if (!sync_it)
> >> >> +             return 0;
> >> >> +
> >> >> +     /* sb_start_pagefault and update_time can both sleep. */
> >> >> +     sb_start_pagefault(inode->i_sb);
> >> >> +     ret = update_time(inode, &now, sync_it);
> >> >> +     sb_end_pagefault(inode->i_sb);
> >> >
> >> > This gets called from the writeback path - you can't use
> >> > sb_start_pagefault/sb_end_pagefault in that path.
> >>
> >> The race I'm worried about is:
> >>
> >>  - mmap
> >>  - write to the mapping
> >>  - remount ro
> >>  - flush_cmtime -> inode_update_time_writable
> >
> > sb_start_pagefault() is for filesystem freeze protection, not
> > remount-ro protection. If you freeze the filesystem, then we stop
> > writes and pagefaults by making sb_start_pagefault/sb_start_write
> > block, and then run writeback to clean all the pages.  If writeback
> > then blocks on sb_start_pagefault(), we've got a deadlock.
> >
> >> This may be impossible, in which case I'm okay, but it's nice to have
> >> a sanity check.  I'll see if I can figure out how to do that.
> >
> > The process of remount-ro should flush the dirty pages - the inode
> > and page has been marked dirty by page_mkwrite(), after all.
> 
> Hmm.  We can land in here from writeback, in which case the time
> should be updated unconditionally.  We can also land in here from
> msync(MS_ASYNC) or munmap.  munmap at least shouldn't block.
> 
> The nasty case is if a page is dirtied, then the frozen level is set
> to SB_FREEZE_PAGEFAULT, and then userspace calls munmap or msync
> *before* writepages gets called.  In this case, blocking until the fs
> is unfrozen is probably impolite, and returning without updating the
> time is questionable.
> 
> Removing the check entirely may add a new race, though: what if
> .flush_cmtime has called mapping_test_clear_cmtime but hasn't gotten
> to updating the time yet when freezing finishes?  This could be
> prevented by changing generic_flush_cmtime to do __sb_start_write(sb,
> SB_FREEZE_FS, false) and doing nothing if the fs is already frozen.
  I think it should be a responsibility of the caller of .flush_cmtime (as
is the case of update_time()) to handle freeze protection if needed. As
Dave told you, writeback path mustn't actually take it. OTOH things like
munmap() or msync() need to get it because we must avoid changing the
filesystem when it is frozen and these calls can happen when fs is frozen.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  reply	other threads:[~2013-08-20 16:10 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-08-16 23:22 [PATCH v3 0/5] Rework mtime and ctime updates on mmaped Andy Lutomirski
2013-08-16 23:22 ` [PATCH v3 1/5] mm: Track mappings that have been written via ptes Andy Lutomirski
2013-08-16 23:22 ` [PATCH v3 2/5] fs: Add inode_update_time_writable Andy Lutomirski
2013-08-20  2:28   ` Dave Chinner
2013-08-20  3:20     ` Andy Lutomirski
2013-08-20  3:33       ` Dave Chinner
2013-08-20  4:07         ` Andy Lutomirski
2013-08-20 16:10           ` Jan Kara [this message]
2013-08-16 23:22 ` [PATCH v3 3/5] mm: Notify filesystems when it's time to apply a deferred cmtime update Andy Lutomirski
2013-08-20  2:36   ` Dave Chinner
2013-08-20  3:28     ` Andy Lutomirski
2013-08-20  4:08       ` Dave Chinner
2013-08-20  4:14         ` Andy Lutomirski
2013-08-20 16:00           ` Jan Kara
2013-08-20 16:42             ` Andy Lutomirski
2013-08-20 19:27               ` Andy Lutomirski
2013-08-20 21:48               ` Dave Chinner
2013-08-20 21:54                 ` Andy Lutomirski
2013-08-20 22:43                   ` Dave Chinner
2013-08-21  0:47                     ` Andy Lutomirski
2013-08-21  1:33                       ` Dave Chinner
2013-08-16 23:22 ` [PATCH v3 4/5] mm: Scan for dirty ptes and update cmtime on MS_ASYNC Andy Lutomirski
2013-08-16 23:22 ` [PATCH v3 5/5] ext4: Defer mmap cmtime update until writeback Andy Lutomirski
2013-08-20  2:38   ` Dave Chinner
2013-08-20  3:30     ` Andy Lutomirski
2013-08-20  4:08       ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130820161055.GD2862@quack.suse.cz \
    --to=jack@suse.cz \
    --cc=dave.hansen@linux.intel.com \
    --cc=hch@infradead.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=tim.c.chen@linux.intel.com \
    --cc=tytso@mit.edu \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).