From: "Paul E. McKenney" <paulmck@us.ibm.com>
To: Daniel Phillips <phillips@arcor.de>
Cc: Andrew Morton <akpm@osdl.org>,
	sct@redhat.com, hch@infradead.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [RFC] Distributed mmap API
Date: Thu, 4 Mar 2004 10:55:01 -0800
Message-ID: <20040304185501.GH1384@us.ibm.com>
In-Reply-To: <200403030800.35612.phillips@arcor.de>

This matches what we are after here!

						Thanx, Paul

On Wed, Mar 03, 2004 at 08:06:20AM -0500, Daniel Phillips wrote:
> On Tuesday 02 March 2004 22:15, Andrew Morton wrote:
> > Daniel Phillips <phillips@arcor.de> wrote:
> > > Here is a rearranged zap_pte_range that avoids any operations for
> > > out-of-range pfns.
> >
> > Please remind us why Linux needs this patch?
> 
> This is purely to support mmap, including MAP_PRIVATE, accurately on 
> distributed filesystems, where "accurately" is defined as "with local 
> filesystem semantics".
> 
> If the same file region is mmapped by more than one node, only one of them is 
> allowed to have a given page of the mmap valid in the page tables at any 
> time.  When a memory write occurs on one of the other nodes, it must fault so 
> that the distributed filesystem can arrange for exclusive ownership of the 
> file page (or as GFS currently implements it, the whole file) to change from 
> one node to the other.  At this time, any pages already faulted in must be 
> unmapped so that future memory accesses will properly fault.  This unmapping 
> is done by zap_page_range, which has nearly the semantics we want except that 
> it will also unmap private pages of a MAP_PRIVATE mapping, destroying the 
> only copy of that data.  A user would observe the privately written data 
> spontaneously revert to the current file contents.  The purpose of this patch 
> is to fix that.
> 
> This patch allows a distributed filesystem to unmap file-backed memory without 
> unmapping anonymous pages or deleting swap cache, avoiding the above data 
> destruction.  Since zap_page_range is the only function that knows how to 
> unmap memory, it needs to be taught how to skip anonymous pages.
> 
> An alternative to this patch is simply to export zap_page_range, so that the 
> distributed filesystem can walk the lists of mmapped vmas itself, skipping 
> any that are MAP_PRIVATE.  This achieves POSIX local filesystem semantics, 
> but not Linux local filesystem semantics, because updates to the mmap from 
> other nodes become visible unpredictably.  Earlier this year, Linus said that 
> he wants tighter semantics for distributed MAP_PRIVATE.
> 
> This patch presses zap_page_range into service in a way that was not 
> originally intended, that is, for invalidation as opposed to destruction of 
> memory regions.  The requirements are identical except for the MAP_PRIVATE 
> detail.  Forking the whole zap_ chain would be even more distasteful than 
> grafting on this option flag.  It's also impractical to implement a zap_ 
> variant within a dfs module because of the heavy use of per-arch APIs.  As
> far as I can see, this patch is the minimum cost of having accurate semantics
> for distributed MAP_PRIVATE mmap.
> 
> I'll take the opportunity to beat my chest once again about the fact that 
> this doesn't benefit anything other than distributed filesystems.  On the 
> other hand, the cost is minuscule: 54 bytes, a little stack and likely no 
> measurable cpu.
> 
> > I forget what `all' does?  anon+swapcache as well as pagecache?
> 
> Yes
> 
> > A bit of API documentation here would be appropriate.
> 
> Oops, sorry:
> 
> /**
>  * zap_page_range - remove user pages in a given range
>  * @vma: vm_area_struct holding the applicable pages
>  * @address: starting address of pages to zap
>  * @size: number of bytes to zap
>  * @all: also unmap anonymous pages
>  */
> void zap_page_range(struct vm_area_struct *vma,
>                     unsigned long address, unsigned long size, int all)
> 
> Regards,
> 
> Daniel
> 
> 
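
For readers following the thread, the effect of the proposed "all" flag
can be sketched with a small userspace model.  This is not the kernel
code and not the patch itself: the toy_pte struct, the page_kind enum
and toy_zap_range() are hypothetical names invented for illustration,
and a flat array stands in for the real page tables.  It only shows the
rule described above: with all == 0, file-backed entries are unmapped
while anonymous (privately written) entries are left alone.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for what a single page-table entry maps. */
enum page_kind { PAGE_NONE, PAGE_FILE, PAGE_ANON };

/* Hypothetical toy page-table entry; the real pte is per-arch. */
struct toy_pte {
	enum page_kind kind;
};

/*
 * Toy analogue of zap_page_range(vma, address, size, all).
 * Clears the entries in [first, first + count), clamped to npte.
 * When all == 0, anonymous entries (the private copies of a
 * MAP_PRIVATE mapping) are skipped so their data survives.
 * Returns the number of entries actually unmapped.
 */
static size_t toy_zap_range(struct toy_pte *ptes, size_t npte,
			    size_t first, size_t count, int all)
{
	size_t end = first + count;
	size_t zapped = 0;
	size_t i;

	assert(ptes != NULL);

	if (first >= npte)
		return 0;
	if (end > npte || end < first)	/* clamp, also on overflow */
		end = npte;

	for (i = first; i < end; i++) {
		if (ptes[i].kind == PAGE_NONE)
			continue;	/* nothing mapped here */
		if (!all && ptes[i].kind == PAGE_ANON)
			continue;	/* invalidation: keep private data */
		ptes[i].kind = PAGE_NONE;
		zapped++;
	}
	return zapped;
}
```

In this model, a distributed filesystem giving up ownership of a file
region would call toy_zap_range(..., 0), so later accesses to the file
pages fault again, while munmap-style teardown would pass 1.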
