set_page_dirty races (was: Re: [patch 2/4] vfs: add set_page_dirty_notag)

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Peter Zijlstra <peterz@infradead.org>
To: Nick Piggin <npiggin@suse.de>
Cc: Edward Shishkin <edward.shishkin@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Ryan Hope <rmh3093@gmail.com>,
	Randy Dunlap <randy.dunlap@oracle.com>,
	linux-kernel@vger.kernel.org,
	ReiserFS Mailing List <reiserfs-devel@vger.kernel.org>
Subject: set_page_dirty races (was: Re: [patch 2/4] vfs: add set_page_dirty_notag)
Date: Tue, 17 Feb 2009 11:40:00 +0100	[thread overview]
Message-ID: <1234867200.4744.65.camel@laptop> (raw)
In-Reply-To: <20090217102443.GA26402@wotan.suse.de>

On Tue, 2009-02-17 at 11:24 +0100, Nick Piggin wrote:
> On Tue, Feb 17, 2009 at 11:05:16AM +0100, Peter Zijlstra wrote:
> > On Tue, 2009-02-17 at 10:38 +0100, Nick Piggin wrote:
> > 
> > > It is a great shame that filesystems are not properly notified
> > > that a page may become dirty before the actual set_page_dirty
> > > event (which is not allowed to fail and is called after the
> > > page is already dirty).
> > 
> > Not quite true, for example the set_page_dirty() done by the write fault
> > code is done _before_ the page becomes dirty.
> > 
> > This before/after thing was the reason for that horrid file corruption
> > bug that dragged on for a few weeks back in .19 (IIRC).
> 
> Yeah, there are actually races though. The page can become cleaned
> before set_page_dirty is reached, and there are also nasty races with
> truncate.

Hmm, so you're saying that never got properly fixed?
 
> > > This is a big problem I have with fsblock simply in trying to
> > > make the memory allocation robust. page_mkwrite unfortunately
> > > is racy and I've fixed problems there... the big problem though
> > > is get_user_pages. Fixing that properly seems to require fixing
> > > callers so it is not really realistic in the short term.
> > 
> > Right, I'm just not sure what we can do, even with a
> > prepage_page_dirty() function, what are you going to do, fail the fault?
> 
> Oh, for regular page fault functions using page_mkwrite, they
> definitely want to fail the fault with a SIGBUS, and actually XFS
> already does that (for fsblock robust memory allocations you
> would also want to fail OOM on metadata allocation failure). What
> is the other option? Silently fail the write?

OK, agreed.

> For XFS purpose (ie. -ENOSPC handling), the current code is reasonable
> although there could be some truncate races with block allocation. But
> mostly probably works. For something like fsblock it can be much more
> common to have the metadata refcount reach 0 and freed before spd is
> called. In that case the code actually goes into a bug situation so it
> is a bit more critical.
> 
> But no that's the "easy" part. The hard part is get_user_pages
> because the caller can hold onto the page indefinitely simply with a
> refcount, and go along happily dirtying it at any stage (actually
> writing to the page memory) before actually calling set_page_dirty.

Should a gup user not specify .write=1 if it wants to dirty the page, at
which point the follow_page() will do the dirty-fault thingy.

Ah, but then we can clean it because we're not holding the page-lock. I
see.

> The "cleanest" way to fix this from VM point of view is probably to
> force gup callers to hold the page locked for the duration to
> prevent truncation or writeout after the filesystem notification.
> Don't know if that would be very popular, however.

Right, so you'd want to keep the page locked over gup(.write=1)
sections.

So should we extend the gup() with put_user_page()?

next prev parent reply	other threads:[~2009-02-17 10:40 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-02-13 11:56 [patch 2/4] vfs: add set_page_dirty_notag Edward Shishkin
2009-02-13 13:08 ` Peter Zijlstra
2009-02-13 13:57   ` Edward Shishkin
2009-02-13 14:09     ` Peter Zijlstra
2009-02-14 13:11       ` Edward Shishkin
2009-02-14 21:11         ` Peter Zijlstra
2009-02-16 22:43           ` Edward Shishkin
2009-02-17  9:09             ` Peter Zijlstra
2009-02-17  9:38               ` Nick Piggin
2009-02-17 10:05                 ` Peter Zijlstra
2009-02-17 10:24                   ` Nick Piggin
2009-02-17 10:40                     ` Peter Zijlstra [this message]
2009-02-17 11:25                       ` set_page_dirty races (was: Re: [patch 2/4] vfs: add set_page_dirty_notag) Nick Piggin
2009-02-17 11:39                         ` Peter Zijlstra
2009-02-17 11:55                           ` Nick Piggin
2009-02-17 12:05                             ` Peter Zijlstra
2009-02-17 12:30                               ` Nick Piggin
2009-02-17 22:35             ` [patch 2/4] vfs: add set_page_dirty_notag Andrew Morton
2009-02-18  0:26               ` Edward Shishkin
2009-02-18  0:38                 ` Andrew Morton
2009-02-18 13:27                   ` [patch 1/2] vfs: add/use update_page_accounting Edward Shishkin
2009-02-18 14:06                     ` Nick Piggin
2009-02-18 18:23                       ` Andrew Morton
2009-02-18 13:27                   ` [patch 2/2] vfs: (take 2)add set_page_dirty_notag Edward Shishkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1234867200.4744.65.camel@laptop \
    --to=peterz@infradead.org \
    --cc=akpm@linux-foundation.org \
    --cc=edward.shishkin@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=npiggin@suse.de \
    --cc=randy.dunlap@oracle.com \
    --cc=reiserfs-devel@vger.kernel.org \
    --cc=rmh3093@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox