Re: [LSF/MM TOPIC] Sharing file backed pages

From: Jerome Glisse <jglisse@redhat.com>
To: Jan Kara <jack@suse.cz>
Cc: Amir Goldstein <amir73il@gmail.com>,
	lsf-pc@lists.linux-foundation.org,
	Al Viro <viro@zeniv.linux.org.uk>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	Dave Chinner <david@fromorbit.com>,
	Matthew Wilcox <willy@infradead.org>, Chris Mason <clm@fb.com>,
	Miklos Szeredi <miklos@szeredi.hu>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>
Subject: Re: [LSF/MM TOPIC] Sharing file backed pages
Date: Wed, 23 Jan 2019 10:12:29 -0500
Message-ID: <20190123151228.GA3097@redhat.com>
In-Reply-To: <20190123145434.GK13149@quack2.suse.cz>

On Wed, Jan 23, 2019 at 03:54:34PM +0100, Jan Kara wrote:
> On Wed 23-01-19 10:48:58, Amir Goldstein wrote:
> > In his session about "reflink" in LSF/MM 2016 [1], Darrick Wong brought
> > up the subject of sharing pages between cloned files and the general vibe
> > in the room was that it could be done.
> > 
> > In his talk about XFS subvolumes and snapshots [2], Dave Chinner said
> > that Matthew Wilcox was "working on that problem".
> > 
> > I have started working on a new overlayfs address space implementation
> > that could also benefit from being able to share pages even for filesystems
> > that do not support clones (for copy up anticipation state).
> > 
> > To simplify the problem, we can start with sharing only uptodate clean
> > pages that map the same offset in their respective files. While the same offset
> > requirement somewhat limits the use cases that benefit from shared file
> > pages, there is still a vast majority of use cases (i.e. clone full
> > image), where sharing pages of similar offset will bring a lot of
> > benefit.
> > 
> > At first glance, this requires dropping the assumption that for an
> > uptodate clean page, vmf->vma->vm_file->f_inode == page->mapping->host.
> > Is there really such an assumption in common vfs/mm code?  and what will
> > it take to drop it?
> 
> There definitely is such an assumption. Take for example page reclaim as one
> such place that will be non-trivial to deal with. You need to remove the
> page from page cache of all inodes that contain it without having any file
> context whatsoever. So you will need to create some way for this page->page
> caches mapping to happen. Jerome in his talk at LSF/MM last year [1] actually
> nicely summarized what it would take to get rid of page->mapping
> dereferences. He even had some preliminary patches. To sum it up, it's a
> lot of intrusive work but in principle it is possible.
> 
> [1] https://lwn.net/Articles/752564/
> 

I intend to post a v2 of my patchset doing that sometime soon. For
various reasons this was pushed to the bottom of my todo list last
year. It is now almost at the top and it will stay there, so I will
be resuming work on it.

I wanted to propose this topic again as a joint session with mm so
here is my proposal:


I would like to discuss removing the dependency on the page mapping
field from most kernel code paths so that we can overload that field
for generic page write protection (KSM) for file-backed pages. The
whole idea behind this is that we almost always have the mapping a
page belongs to within the call stack, because any function that
operates on a file or on a vma has it:
    - syscall/kernel on a file (file -> inode -> mapping)
    - syscall/kernel on virtual address (vma -> file -> mapping)
    - write back for a given mapping
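
To make the first two cases concrete, here is a rough userspace sketch
of how a caller can reach the mapping from context it already holds,
without ever looking at the page. The structs are simplified stand-ins
that only borrow the kernel field names, and the helpers
mapping_from_file() and mapping_from_vma() are hypothetical names used
for illustration; they are not taken from the patchset.

```c
/* Userspace model only: simplified stand-ins, not the kernel definitions. */
#include <assert.h>
#include <stddef.h>

struct address_space { int id; };
struct inode { struct address_space *i_mapping; };
struct file { struct inode *f_inode; };
struct vm_area_struct { struct file *vm_file; };

/* Hypothetical helper: syscall/kernel on a file (file -> inode -> mapping). */
static struct address_space *mapping_from_file(const struct file *file)
{
	assert(file != NULL && file->f_inode != NULL);
	return file->f_inode->i_mapping;
}

/* Hypothetical helper: syscall/kernel on a virtual address
 * (vma -> file -> mapping). Returns NULL for a vma with no backing file. */
static struct address_space *mapping_from_vma(const struct vm_area_struct *vma)
{
	assert(vma != NULL);
	if (vma->vm_file == NULL)
		return NULL;
	return mapping_from_file(vma->vm_file);
}
```

In the third case (write back for a given mapping) the mapping is the
starting point, so there is nothing to derive.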

Note that the plan is not to free up the mapping field in struct page
but to reduce the number of places that need to get the mapping from
the page to as few as possible. The few exceptions are:
    - page reclaim
    - memory compaction
    - set_page_dirty() on GUPed (get_user_pages*()) pages

For page reclaim and memory compaction we do not care about the
mapping exactly but about being able to unmap/migrate a page. So any
overloading of the mapping field needs to keep providing helpers to
handle those cases.

For set_page_dirty() on GUPed pages we can take a slow path if the
page has an overloaded mapping field.
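
As a rough illustration of that fast/slow split, the userspace model
below tags an overloaded mapping field with a low pointer bit and
branches on it when dirtying a page. Everything here is an assumption
made for the sketch: the HYP_MAPPING_OVERLOADED bit, the helper names
and the struct layouts are hypothetical, and what the slow path would
actually have to do is deliberately left out.

```c
/* Userspace model only: hypothetical names, not the kernel definitions. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct address_space { int nr_dirty; };
struct page { uintptr_t mapping; int dirty; };

/* Hypothetical tag: low bit set means the field does not hold a plain
 * address_space pointer (the struct is at least 4-byte aligned, so the
 * low bit of a real pointer is always clear). */
#define HYP_MAPPING_OVERLOADED ((uintptr_t)1)

static int page_mapping_overloaded(const struct page *page)
{
	return (page->mapping & HYP_MAPPING_OVERLOADED) != 0;
}

static struct address_space *page_plain_mapping(const struct page *page)
{
	assert(!page_mapping_overloaded(page));
	return (struct address_space *)page->mapping;
}

/* Returns 0 if the fast path was taken, 1 if the slow path was taken. */
static int model_set_page_dirty(struct page *page)
{
	page->dirty = 1;
	if (page_mapping_overloaded(page)) {
		/* Slow path: the real handling is not modeled here. */
		return 1;
	}
	/* Fast path: the field is a plain mapping pointer. */
	page_plain_mapping(page)->nr_dirty++;
	return 0;
}
```

The only point of the sketch is that the common case pays for a single
bit test, while pages with an overloaded field are diverted.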


Previous patchset:
https://lore.kernel.org/lkml/20180404191831.5378-1-jglisse@redhat.com/

Cheers,
Jérôme
