From: Nick Piggin <npiggin@suse.de>
To: Chris Mason <chris.mason@oracle.com>,
	Trond Myklebust <trond.myklebust@fys.uio.no>,
	Miklos Szeredi <miklos@szeredi.hu>,
	holt@sgi.com, linux-nfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: Why doesn't zap_pte_range() call page_mkwrite()
Date: Tue, 8 Sep 2009 17:41:32 +0200
Message-ID: <20090908154132.GC29902@wotan.suse.de>
In-Reply-To: <20090908153007.GB2513@think>

On Tue, Sep 08, 2009 at 11:30:07AM -0400, Chris Mason wrote:
> > > As I said, I think I can fix the NFS problem by simply unmapping the
> > > page inside ->writepage() whenever we know the write request was
> > > originally set up by a page fault.
> > 
> > The biggest outstanding problem we have remaining is get_user_pages.
> > Callers are only required to hold a ref on the page and then they
> > can call set_page_dirty at any point after that.
> > 
> > I have a half-done patch somewhere to add a put_user_pages, and then
> > we could probably go from there to pinning the fs metadata (whether
> > by using the page lock or something else, I don't quite know).
> 
> Hi everyone,
> 
> Sorry for digging up an old thread, but is there any reason we can't
> just use page_mkwrite here?  I'd love to get rid of the btrfs code to
> detect places that use set_page_dirty without a page_mkwrite.

It is because page_mkwrite must be called before the page is dirtied
(it may fail, and it theoretically may do something crazy with the
previous clean page data). And in several places I think set_page_dirty
gets called from a nasty context.
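
A minimal userspace sketch of that ordering constraint, not kernel code: every name here (toy_page, toy_dirty_page, the two sample hooks) is hypothetical and only models the idea that the filesystem hook runs first and may fail, in which case the page must stay clean.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page: only the dirty bit matters here. */
struct toy_page {
	bool dirty;
};

/* Hypothetical filesystem hook, modelled loosely on ->page_mkwrite. */
typedef int (*toy_mkwrite_fn)(struct toy_page *page);

/* Sample hook that succeeds (e.g. space was reserved). */
static int toy_mkwrite_ok(struct toy_page *page)
{
	(void)page;
	return 0;
}

/* Sample hook that fails (e.g. the filesystem is out of space). */
static int toy_mkwrite_enospc(struct toy_page *page)
{
	(void)page;
	return -ENOSPC;
}

/*
 * Call the hook before dirtying. If the hook fails, return its error
 * and leave the page clean. A NULL hook means the filesystem does not
 * care, so the page is dirtied directly.
 */
static int toy_dirty_page(struct toy_page *page, toy_mkwrite_fn mkwrite)
{
	if (mkwrite) {
		int err = mkwrite(page);

		if (err)
			return err;
	}
	page->dirty = true;
	return 0;
}
```

A caller that dirties the page first and only then asks the filesystem has nothing sensible to do when the hook fails, which is roughly why set_page_dirty alone cannot stand in for the hook.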

It hasn't fallen completely off my radar. fsblock has the same issue
(although I've just been ignoring gup writes into fsblock fs for the
time being).

I have a basic idea of what to do... It would be nice to change the calling
convention of get_user_pages and take the page lock. Database people might
scream, in which case we could take the page lock only for filesystems that
define ->page_mkwrite (so shared mem segments avoid the overhead). Lock
ordering might get a bit interesting, but if we can have callers ensure they
always submit and release partially fulfilled requests, then we can always
trylock them.
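
The trylock idea can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: toy_page, toy_get_pages_trylock and toy_put_pages are hypothetical names, not the real get_user_pages interface, and the lock is a plain flag rather than a real page lock.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page: a lock flag and a dirty flag. */
struct toy_page {
	bool locked;
	bool dirty;
};

/* Non-blocking lock attempt; returns true if the lock was taken. */
static bool toy_trylock_page(struct toy_page *page)
{
	if (page->locked)
		return false;
	page->locked = true;
	return true;
}

static void toy_unlock_page(struct toy_page *page)
{
	page->locked = false;
}

/*
 * Hypothetical gup-like helper: lock pages in order and stop at the
 * first one that is already locked. It never blocks, so lock ordering
 * cannot deadlock. Returns how many pages were locked, which may be
 * fewer than requested.
 */
static size_t toy_get_pages_trylock(struct toy_page *pages, size_t nr)
{
	size_t got = 0;

	while (got < nr && toy_trylock_page(&pages[got]))
		got++;
	return got;
}

/*
 * Hypothetical put_user_pages counterpart: optionally dirty each page
 * while its lock is still held, then unlock it.
 */
static void toy_put_pages(struct toy_page *pages, size_t nr, bool dirty)
{
	for (size_t i = 0; i < nr; i++) {
		if (dirty)
			pages[i].dirty = true;
		toy_unlock_page(&pages[i]);
	}
}
```

A caller asking for four pages when the third is locked elsewhere would get two back, submit and release those two with toy_put_pages, and only then retry from the third page, so it never holds some page locks while waiting on another.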

