All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Nick Piggin <npiggin@suse.de>
Cc: Edward Shishkin <edward.shishkin@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Ryan Hope <rmh3093@gmail.com>,
	Randy Dunlap <randy.dunlap@oracle.com>,
	linux-kernel@vger.kernel.org,
	ReiserFS Mailing List <reiserfs-devel@vger.kernel.org>
Subject: set_page_dirty races (was: Re: [patch 2/4] vfs: add set_page_dirty_notag)
Date: Tue, 17 Feb 2009 11:40:00 +0100	[thread overview]
Message-ID: <1234867200.4744.65.camel@laptop> (raw)
In-Reply-To: <20090217102443.GA26402@wotan.suse.de>

On Tue, 2009-02-17 at 11:24 +0100, Nick Piggin wrote:
> On Tue, Feb 17, 2009 at 11:05:16AM +0100, Peter Zijlstra wrote:
> > On Tue, 2009-02-17 at 10:38 +0100, Nick Piggin wrote:
> > 
> > > It is a great shame that filesystems are not properly notified
> > > that a page may become dirty before the actual set_page_dirty
> > > event (which is not allowed to fail and is called after the
> > > page is already dirty).
> > 
> > Not quite true, for example the set_page_dirty() done by the write fault
> > code is done _before_ the page becomes dirty.
> > 
> > This before/after thing was the reason for that horrid file corruption
> > bug that dragged on for a few weeks back in .19 (IIRC).
> 
> Yeah, there are actually races though. The page can become cleaned
> before set_page_dirty is reached, and there are also nasty races with
> truncate.

Hmm, so you're saying that never got properly fixed?
 
> > > This is a big problem I have with fsblock simply in trying to
> > > make the memory allocation robust. page_mkwrite unfortunately
> > > is racy and I've fixed problems there... the big problem though
> > > is get_user_pages. Fixing that properly seems to require fixing
> > > callers so it is not really realistic in the short term.
> > 
> > Right, I'm just not sure what we can do, even with a
> > prepage_page_dirty() function, what are you going to do, fail the fault?
> 
> Oh, for regular page fault functions using page_mkwrite, they
> definitely want to fail the fault with a SIGBUS, and actually XFS
> already does that (for fsblock robust memory allocations you
> would also want to fail OOM on metadata allocation failure). What
> is the other option? Silently fail the write?

OK, agreed.

> For XFS purpose (ie. -ENOSPC handling), the current code is reasonable
> although there could be some truncate races with block allocation. But
> mostly probably works. For something like fsblock it can be much more
> common to have the metadata refcount reach 0 and freed before spd is
> called. In that case the code actually goes into a bug situation so it
> is a bit more critical.
> 
> But no that's the "easy" part. The hard part is get_user_pages
> because the caller can hold onto the page indefinitely simply with a
> refcount, and go along happily dirtying it at any stage (actually
> writing to the page memory) before actually calling set_page_dirty.

Should a gup user not specify .write=1 if it wants to dirty the page, at
which point the follow_page() will do the dirty-fault thingy.

Ah, but then we can clean it because we're not holding the page-lock. I
see.

> The "cleanest" way to fix this from VM point of view is probably to
> force gup callers to hold the page locked for the duration to
> prevent truncation or writeout after the filesystem notification.
> Don't know if that would be very popular, however.

Right, so you'd want to keep the page locked over gup(.write=1)
sections.

So should we extend the gup() with put_user_page()?


  reply	other threads:[~2009-02-17 10:40 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-02-13 11:56 [patch 2/4] vfs: add set_page_dirty_notag Edward Shishkin
2009-02-13 13:08 ` Peter Zijlstra
2009-02-13 13:57   ` Edward Shishkin
2009-02-13 14:09     ` Peter Zijlstra
2009-02-14 13:11       ` Edward Shishkin
2009-02-14 21:11         ` Peter Zijlstra
2009-02-16 22:43           ` Edward Shishkin
2009-02-17  9:09             ` Peter Zijlstra
2009-02-17  9:38               ` Nick Piggin
2009-02-17 10:05                 ` Peter Zijlstra
2009-02-17 10:24                   ` Nick Piggin
2009-02-17 10:40                     ` Peter Zijlstra [this message]
2009-02-17 11:25                       ` set_page_dirty races (was: Re: [patch 2/4] vfs: add set_page_dirty_notag) Nick Piggin
2009-02-17 11:39                         ` Peter Zijlstra
2009-02-17 11:55                           ` Nick Piggin
2009-02-17 12:05                             ` Peter Zijlstra
2009-02-17 12:30                               ` Nick Piggin
2009-02-17 22:35             ` [patch 2/4] vfs: add set_page_dirty_notag Andrew Morton
2009-02-17 22:35               ` Andrew Morton
2009-02-18  0:26               ` Edward Shishkin
2009-02-18  0:38                 ` Andrew Morton
2009-02-18 13:27                   ` [patch 1/2] vfs: add/use update_page_accounting Edward Shishkin
2009-02-18 14:06                     ` Nick Piggin
2009-02-18 18:23                       ` Andrew Morton
2009-02-18 13:27                   ` [patch 2/2] vfs: (take 2)add set_page_dirty_notag Edward Shishkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1234867200.4744.65.camel@laptop \
    --to=peterz@infradead.org \
    --cc=akpm@linux-foundation.org \
    --cc=edward.shishkin@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=npiggin@suse.de \
    --cc=randy.dunlap@oracle.com \
    --cc=reiserfs-devel@vger.kernel.org \
    --cc=rmh3093@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.