From: Matthew Wilcox <willy@infradead.org>
To: Vito Caputo <vcaputo@pengaru.com>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
linux-fsdevel@vger.kernel.org, Dave Chinner <david@fromorbit.com>
Subject: Re: [QUESTION] Sharing a `struct page` across multiple `struct address_space` instances
Date: Sat, 25 Jul 2020 04:11:58 +0100
Message-ID: <20200725031158.GD23808@casper.infradead.org>
In-Reply-To: <20200725002221.dszdahfhqrbz43cz@shells.gnugeneration.com>

On Fri, Jul 24, 2020 at 05:22:21PM -0700, Vito Caputo wrote:
> Prior to looking at the code, conceptually I was envisioning the pages
> in the reflink source inode's address_space would simply get their
> refcounts bumped as they were added to the dest inode's address_space,
> with some CoW flag set to prevent writes.
>
> But there seems to be a fundamental assumption that a `struct page`
> would only belong to a single `struct address_space` at a time, as it
> has single `mapping` and `index` members for reverse mapping the page
> to its address_space.
>
> Am I completely lost here or does it really look like a rather
> invasive modification to support this feature?
>
> I have vague memories of Dave Chinner mentioning work towards sharing
> pages across address spaces in the interests of getting reflink copies
> more competitive with overlayfs in terms of page cache utilization.

It's invasive. Dave and I have chatted about this in the past. I've done
no work towards it (... a little busy right now with THPs in the page
cache ...) but I have a design in mind.

The fundamental idea is to use the DAX support to refer to pages which
actually belong to a separate address space. DAX entries are effectively
PFN entries. So there would be a clear distinction between "I looked
up a page which actually belongs to this address space" and "I looked
up a page which is shared with a different address space". My thinking
has been that if files A and B are reflinked, both A and B would see
DAX entries in their respective page caches. The page would belong to
a third address space which might be the block device's address space,
or maybe there would be an address space per shared fragment (since
files can share fragments that are at different offsets from each other).
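
The distinction between the two kinds of lookup result can be modelled
in userspace as a tagged entry. This is only a sketch: `page_model`,
`mk_shared_entry()`, `entry_is_shared()`, `entry_to_pfn()` and
`lookup_pfn()` are hypothetical names, not kernel API, and the low-bit
tag is an assumption borrowed from the way XArray value entries are
encoded.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct page; not the kernel definition. */
struct page_model {
	unsigned long pfn;
};

/* Encode a PFN as a tagged entry: shift left and set the low bit. */
static inline void *mk_shared_entry(unsigned long pfn)
{
	return (void *)(uintptr_t)((pfn << 1) | 1UL);
}

/* Real pointers are at least 2-byte aligned, so their low bit is clear. */
static inline bool entry_is_shared(const void *entry)
{
	return ((uintptr_t)entry & 1UL) != 0;
}

static inline unsigned long entry_to_pfn(const void *entry)
{
	return (unsigned long)((uintptr_t)entry >> 1);
}

/*
 * Resolve a looked-up entry to a PFN and report which case was hit:
 * a page owned by this address space, or a PFN-style entry for a page
 * that belongs to some other address space.
 */
static unsigned long lookup_pfn(const void *entry, bool *shared)
{
	*shared = entry_is_shared(entry);
	if (*shared)
		return entry_to_pfn(entry);
	return ((const struct page_model *)entry)->pfn;
}
```

In this model, files A and B would both store `mk_shared_entry(pfn)`
for a reflinked range, while the owning address space would store the
page pointer itself.
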

There are a lot of details to get right around this approach.
Importantly, there _shouldn't_ be a refcount from each of file A and
B on the page. Instead the refcount from files A and B should be on
the fragment. When the fragment's refcount goes to zero, we know there
are no more references to the fragment and all its pages can be freed.
That means that if we reflink B to C, we don't have to walk every page
in the file and increase its refcount again.
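
A minimal userspace sketch of refcounting the fragment rather than each
page might look like the following. The `fragment` structure and the
`fragment_*()` helpers are hypothetical, the counter is not atomic, and
locking is ignored entirely; the only point is that taking another
reference is O(1) regardless of how many pages the fragment holds.

```c
#include <stdlib.h>

/* Hypothetical model of a shared fragment; not a kernel structure. */
struct fragment {
	unsigned int refcount;	/* one reference per file sharing it */
	size_t nr_pages;
	void **pages;		/* backing pages, owned by the fragment */
};

static void fragment_free(struct fragment *frag)
{
	for (size_t i = 0; i < frag->nr_pages; i++)
		free(frag->pages[i]);	/* free(NULL) is a no-op */
	free(frag->pages);
	free(frag);
}

static struct fragment *fragment_alloc(size_t nr_pages, size_t page_size)
{
	struct fragment *frag = calloc(1, sizeof(*frag));

	if (!frag)
		return NULL;
	frag->pages = calloc(nr_pages, sizeof(*frag->pages));
	if (!frag->pages) {
		free(frag);
		return NULL;
	}
	frag->nr_pages = nr_pages;
	for (size_t i = 0; i < nr_pages; i++) {
		frag->pages[i] = calloc(1, page_size);
		if (!frag->pages[i]) {
			fragment_free(frag);
			return NULL;
		}
	}
	frag->refcount = 1;	/* held by the first file */
	return frag;
}

/* Reflink another file to the fragment: no walk over the pages. */
static void fragment_get(struct fragment *frag)
{
	frag->refcount++;
}

/* Returns 1 if this dropped the last reference and freed everything. */
static int fragment_put(struct fragment *frag)
{
	if (--frag->refcount > 0)
		return 0;
	fragment_free(frag);
	return 1;
}
```

Reflinking B to C is then a single `fragment_get()`, and the pages are
only freed when the last of A, B and C calls `fragment_put()`.
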

So, are you prepared to do a lot of work, or were you thinking this
would be a quick hack? Because I'm willing to advise on a big project,
but if you're thinking this will be quick, and don't have time for a
big project, it's probably time to stop here.
---
Something that did occur to me while writing this is that if you just want
read-only duplicates of files to work, you could make inode->i_mapping
point to a different address_space instead of &inode->i_data. There's
probably a quick hack solution there.
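
As a rough userspace model of that idea (the `*_model` types and both
helpers are hypothetical, and lifetime, locking and any write path are
deliberately left out), the duplicate inode keeps its own embedded
i_data but has i_mapping redirected to the source's address space:

```c
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct address_space_model {
	void *host;		/* owning inode */
	unsigned long nrpages;	/* number of cached pages */
};

struct inode_model {
	struct address_space_model *i_mapping;	/* mapping actually used */
	struct address_space_model i_data;	/* inode's own mapping */
};

/* Default setup: an inode's i_mapping points at its own i_data. */
static void inode_init(struct inode_model *inode)
{
	inode->i_data.host = inode;
	inode->i_data.nrpages = 0;
	inode->i_mapping = &inode->i_data;
}

/*
 * Read-only duplicate: point dup at whatever mapping src uses, so both
 * inodes resolve lookups through the same page cache.
 */
static void inode_share_mapping(struct inode_model *dup,
				const struct inode_model *src)
{
	dup->i_mapping = src->i_mapping;
}
```

After `inode_share_mapping(&b, &a)`, lookups through `b.i_mapping` see
the pages cached for `a`, and `b.i_data` is simply unused.
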