linux-fsdevel.vger.kernel.org archive mirror
From: Jingbo Xu <jefflexu@linux.alibaba.com>
To: xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org
Cc: huyue2@coolpad.com, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org
Subject: [PATCH v2 6/7] erofs: add helper checking if page cache sharing shall be enabled
Date: Wed, 11 Jan 2023 16:31:57 +0800
Message-ID: <20230111083158.23462-7-jefflexu@linux.alibaba.com>
In-Reply-To: <20230111083158.23462-1-jefflexu@linux.alibaba.com>

EROFS supports chunk deduplication to reduce disk usage.  Furthermore, we
can make inodes share the page cache of these deduplicated chunks to
reduce memory usage.  This is particularly useful in container scenarios,
as deduplication is essential for container images.

This can be achieved by managing the page cache of deduplicated chunks in
the blob's address space.  In this way, all inodes sharing a deduplicated
chunk refer to and share the page cache in the blob's address space.

Currently there are some restrictions on enabling this feature.

The first restriction stems from .mmap() support.  The reverse mapping
requires that a vma cannot be shared among inodes and can be linked to
only one inode.  As the vma is ultimately linked to the blob's address
space when page cache sharing is enabled, this restriction of the reverse
mapping effectively requires that the mapped file area cannot be mapped
to multiple blobs.  Thus page cache sharing can only be enabled for
files that are mapped to a single blob.

The chunk-based data layout guarantees that a chunk never crosses a
device (blob) boundary.  Thus, in the chunk-based data layout, a file no
larger than its chunk size is guaranteed to be mapped to one blob.  As
the chunk size is tunable on a per-file basis, this restriction can be
relaxed at image build time.  As long as we ensure that the file cannot
be deduplicated, the file's chunk size can be set to a reasonable value
larger than the file size, so that page cache sharing can be enabled for
this file later.
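
As a rough userspace sketch of the build-time choice described above
(not code from this series or from mkfs.erofs), the following picks the
smallest chunk size that holds the whole file in one chunk.  The names
pick_chunkbits() and fits_in_one_chunk() and the SKETCH_* bounds are
hypothetical and chosen only for illustration; the size test mirrors the
i_size check in the helper added by this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bounds for illustration only; not taken from EROFS. */
#define SKETCH_MIN_CHUNKBITS 12u	/* 4 KiB */
#define SKETCH_MAX_CHUNKBITS 30u	/* 1 GiB */

/* True if a file of file_size bytes fits in a single chunk. */
static bool fits_in_one_chunk(uint64_t file_size, unsigned int chunkbits)
{
	assert(chunkbits < 64);	/* keep the shift well defined */
	return file_size <= (UINT64_C(1) << chunkbits);
}

/*
 * Return the smallest chunkbits within the sketch bounds such that the
 * whole file fits in one chunk, or 0 if no such value exists.
 */
static unsigned int pick_chunkbits(uint64_t file_size)
{
	unsigned int bits;

	for (bits = SKETCH_MIN_CHUNKBITS; bits <= SKETCH_MAX_CHUNKBITS; bits++)
		if (fits_in_one_chunk(file_size, bits))
			return bits;
	return 0;
}
```

Under these assumptions a 3000000-byte file would get chunkbits 22
(4 MiB), while a file larger than the upper bound gets 0, meaning it
cannot be kept in a single chunk and would stay ineligible for sharing.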

The second restriction is that EROFS_BLKSIZ must be a multiple of
PAGE_SIZE to avoid data leakage.  Otherwise unrelated data may be
exposed at the end of the last page, since a file's data is arranged in
units of EROFS_BLKSIZ in the image.
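
The arithmetic behind this restriction can be sketched in userspace as
follows.  This is only an illustration under stated assumptions: the
file's data is assumed to start at a page-aligned offset in the blob,
and exposed_tail_bytes() and round_up_u64() are hypothetical names, not
kernel helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to the next multiple of align (align must be non-zero). */
static uint64_t round_up_u64(uint64_t x, uint64_t align)
{
	assert(align != 0);
	return (x + align - 1) / align * align;
}

/*
 * Number of bytes in the file's last page that lie beyond the end of
 * the file's last block.  In a shared blob page cache these bytes may
 * belong to unrelated data.  Assumes the file data starts page-aligned.
 */
static uint64_t exposed_tail_bytes(uint64_t file_size, uint64_t blksz,
				   uint64_t pagesz)
{
	uint64_t blk_end = round_up_u64(file_size, blksz);
	uint64_t page_end = round_up_u64(file_size, pagesz);

	return page_end > blk_end ? page_end - blk_end : 0;
}
```

For instance, with a 4096-byte block and a 16384-byte page, a 5000-byte
file ends its last block at 8192, so the remaining 8192 bytes of the
page fall outside the file's blocks.  When the block size is a multiple
of the page size, the result is always 0.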

Considering all these restrictions, add a helper that checks whether
page cache sharing can be enabled for a specific file.

Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>
---
 fs/erofs/internal.h | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index 7c6a7a2d9acf..adf6be08b47c 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -368,6 +368,29 @@ static inline unsigned int erofs_inode_datalayout(unsigned int value)
 			      EROFS_I_DATALAYOUT_BITS);
 }
 
+static inline bool erofs_can_share_page(struct inode *inode)
+{
+	struct erofs_inode *vi = EROFS_I(inode);
+	struct erofs_sb_info *sbi = EROFS_SB(inode->i_sb);
+
+	/* enable page cache sharing only in share domain mode */
+	if (!erofs_is_fscache_mode(inode->i_sb) || !sbi->domain_id)
+		return false;
+
+	if (vi->datalayout != EROFS_INODE_CHUNK_BASED)
+		return false;
+
+	/* avoid crossing multiple devices/blobs */
+	if (inode->i_size > 1UL << vi->chunkbits)
+		return false;
+
+	/* avoid data leakage in mmap routine */
+	if (EROFS_BLKSIZ % PAGE_SIZE)
+		return false;
+
+	return true;
+}
+
 /*
  * Different from grab_cache_page_nowait(), reclaiming is never triggered
  * when allocating new pages.
-- 
2.19.1.6.gb485710b


Thread overview: 8+ messages
2023-01-11  8:31 [PATCH v2 0/7] erofs: support page cache sharing between EROFS images in fscache mode Jingbo Xu
2023-01-11  8:31 ` [PATCH v2 1/7] erofs: remove unused device mapping in the meta routine Jingbo Xu
2023-01-11  8:31 ` [PATCH v2 2/7] erofs: unify anonymous inodes for blob Jingbo Xu
2023-01-11  8:31 ` [PATCH v2 3/7] erofs: allocate anonymous file of blob for page cache sharing Jingbo Xu
2023-01-11  8:31 ` [PATCH v2 4/7] erofs: implement .read_iter " Jingbo Xu
2023-01-11  8:31 ` [PATCH v2 5/7] erofs: implement .mmap " Jingbo Xu
2023-01-11  8:31 ` Jingbo Xu [this message]
2023-01-11  8:31 ` [PATCH v2 7/7] erofs: introduce 'sharecache' mount option Jingbo Xu
