From: Matthew Wilcox <willy@infradead.org>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: How can we share page cache pages for reflinked files?
Date: Fri, 11 Aug 2017 10:08:47 -0700 [thread overview]
Message-ID: <20170811170847.GK31390@bombadil.infradead.org> (raw)
In-Reply-To: <20170811042519.GS21024@dastard>
On Fri, Aug 11, 2017 at 02:25:19PM +1000, Dave Chinner wrote:
> On Thu, Aug 10, 2017 at 09:11:59AM -0700, Matthew Wilcox wrote:
> > On Thu, Aug 10, 2017 at 02:28:49PM +1000, Dave Chinner wrote:
> > > If we scale this up to a container host which is using reflink trees
> > > it's shared root images, there might be hundreds of copies of the
> > > same data held in cache (i.e. one page per container). Given that
> > > the filesystem knows that the underlying data extent is shared when
> > > we go to read it, it's relatively easy to add mechanisms to the
> > > filesystem to return the same page for all attempts to read the
> > > from a shared extent from all inodes that share it.
> >
> > I agree the problem exists. Should we try to fix this problem, or
> > should we steer people towards solutions which don't have this problem?
> > The solutions I've been seeing use COW block devices instead of COW
> > filesystems, and DAX to share the common pages between the host and
> > each guest.
>
> That's one possible solution for people using hardware
> virutalisation, but not everyone is doing that. It also relies on
> block devices, which rules out a whole bunch of interesting stuff we
> can do with filesystems...
Assuming there's something fun we can do with filesystems that's
interesting to this type of user, what do you think to this:
Create a block device (maybe it's a loop device, maybe it's dm-raid0)
which supports DAX and uses the page cache to cache the physical pages
of the block device it's fronting.
Use XFS+reflink+DAX on top of this loop device. Now there's only one
copy of each page in RAM.
We'd need to be able to shoot down all mapped pages when evicting pages
from the loop device's page cache, but we have the right data structures
in place for that; we just need to use them.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2017-08-11 17:08 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-08-10 4:28 How can we share page cache pages for reflinked files? Dave Chinner
2017-08-10 5:57 ` Kirill A. Shutemov
2017-08-10 9:01 ` Dave Chinner
2017-08-10 13:31 ` Kirill A. Shutemov
2017-08-11 3:59 ` Dave Chinner
2017-08-11 12:57 ` Kirill A. Shutemov
2017-08-10 16:11 ` Matthew Wilcox
2017-08-10 19:17 ` Vivek Goyal
2017-08-10 21:20 ` Matthew Wilcox
2017-08-11 4:25 ` Dave Chinner
2017-08-11 17:08 ` Matthew Wilcox [this message]
2017-08-11 18:04 ` Christoph Hellwig
2017-08-14 6:48 ` Dave Chinner
2017-08-14 18:14 ` Christopher Lameter
2017-08-14 21:09 ` Kirill A. Shutemov
2017-08-15 15:11 ` Christopher Lameter
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170811170847.GK31390@bombadil.infradead.org \
--to=willy@infradead.org \
--cc=david@fromorbit.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-mm@kvack.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).