* [PATCH v10 00/10] erofs: Introduce page cache sharing feature
@ 2025-12-23 1:56 Hongbo Li
2025-12-23 1:56 ` [PATCH v10 01/10] iomap: stash iomap read ctx in the private field of iomap_iter Hongbo Li
` (8 more replies)
0 siblings, 9 replies; 22+ messages in thread
From: Hongbo Li @ 2025-12-23 1:56 UTC (permalink / raw)
To: hsiangkao, chao, brauner, djwong, amir73il, hch
Cc: linux-fsdevel, linux-erofs, linux-kernel, lihongbo22
Enabling page cache sharing in container scenarios has become increasingly
crucial, as it can significantly reduce memory usage. In previous efforts,
Hongzhen has done substantial work to push this feature into the EROFS
mainline. Due to other commitments, he hasn't been able to continue his
work recently, and I'm very pleased to build upon his work and continue
to refine this implementation.
This patch series is based on Hongzhen's original EROFS shared pagecache
implementation which was posted about half a year ago:
https://lore.kernel.org/all/20250301145002.2420830-1-hongzhen@linux.alibaba.com/T/#u
In addition to the forward-port, I have also fixed several bugs, resolved
some prerequisite dependencies and performed some minor cleanup.
(A recap of Hongzhen's original cover letter is below, edited slightly
for this series:)
Background
==============
Currently, reading files with different paths (or names) but the same
content can consume multiple copies of the page cache, even if the
content of these caches is identical. For example, reading identical
files (e.g., *.so files) from two different minor versions of container
images can result in multiple copies of the same page cache, since
different containers have different mount points. Therefore, sharing
the page cache for files with the same content can save memory.
Proposal
==============
1. determining file identity
----------------------------
First, a way needs to be found to check whether the content of two files
is the same. Here, the xattr values associated with the file
fingerprints are assessed for consistency. When creating the EROFS
image, users can specify the name of the xattr for file fingerprints,
and the corresponding name will be stored in the packfile. The on-disk
`ishare_xattr_prefix_id` field indicates the index of the xattr name within
the prefix xattrs:
```
struct erofs_super_block {
__u8 xattr_filter_reserved; /* reserved for xattr name filter */
- __u8 reserved[3];
+ __u8 ishare_xattr_prefix_id; /* indexes the ishare key in prefix xattrs */
+ __u8 reserved[2];
};
```
For example, users can specify the first long prefix as the name for the
file fingerprint as follows:
```
mkfs.erofs --ishare_key=trusted.erofs.fingerprint erofs.img ./dir
```
In this way, `trusted.erofs.fingerprint` serves as the name of the xattr
for the file fingerprint. The relevant patch for erofs-utils has been posted
in:
v2: https://lore.kernel.org/all/20251118015849.228939-1-lihongbo22@huawei.com/
At the same time, for security reasons, this patch series only shares
files within the same domain, which is achieved by adding
"-o domain_id=xxxx" during the mounting process:
```
mount -t erofs -o domain_id=trusted.erofs.fingerprint erofs.img /mnt
```
If no domain ID is specified, the mount falls back to the mode without
page cache sharing.
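The sharing rule described above can be sketched in userspace as follows.
This is only an illustrative model, not code from the series: the names
`struct ishare_key` and `ishare_may_share()` are hypothetical, and the
fingerprint is assumed to be a NUL-terminated string for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model; none of these names come from the patches. */
struct ishare_key {
	const char *domain_id;   /* value of "-o domain_id=", NULL if not given */
	const char *fingerprint; /* value of the fingerprint xattr, NULL if absent */
};

/* Two files may share page cache only within one domain and one fingerprint. */
static bool ishare_may_share(const struct ishare_key *a,
			     const struct ishare_key *b)
{
	assert(a && b);
	if (!a->domain_id || !b->domain_id)
		return false;	/* no domain ID: non-sharing mode */
	if (!a->fingerprint || !b->fingerprint)
		return false;	/* no fingerprint xattr: nothing to compare */
	return !strcmp(a->domain_id, b->domain_id) &&
	       !strcmp(a->fingerprint, b->fingerprint);
}
```

In this model, two files with the same fingerprint but mounted with
different domain IDs never share, which matches the security intent above.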
2. Implementation
==================
2.1. file open & close
----------------------
When a file is opened (say, file A or file B with identical content), its
->private_data field is set to point to an internal deduplicated file.
When the actual read occurs, the page cache of this deduplicated file
will be accessed.
When the file is opened, if the corresponding erofs inode is newly
created, then perform the following actions:
1. add the erofs inode to the backing list of the deduplicated inode;
2. increase the reference count of the deduplicated inode.
The purpose of step 1 above is to ensure that when a real I/O operation
occurs, the deduplicated inode can locate one of the disk devices
(as the deduplicated inode itself is not bound to a specific device).
Step 2 is for managing the lifecycle of the deduplicated inode.
When the erofs inode is destroyed, the opposite actions mentioned above
will be taken.
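The two steps above, and their reversal on inode destruction, can be
modelled in userspace roughly as below. This is a sketch under stated
assumptions: `struct dedup_inode`, `struct model_inode`, `dedup_attach()`
and `dedup_detach()` are hypothetical names, the list is a plain singly
linked list, and no locking is shown.

```c
#include <assert.h>
#include <stddef.h>

struct model_inode;

/* Hypothetical model of the deduplicated inode described above. */
struct dedup_inode {
	struct model_inode *backing_head; /* backing list of erofs inodes */
	int refcount;                     /* one reference per listed inode */
};

/* Hypothetical model of an erofs inode that joins a backing list. */
struct model_inode {
	struct dedup_inode *dedup;
	struct model_inode *next;
};

/* Steps 1 and 2: link the new inode in and take a reference. */
static void dedup_attach(struct dedup_inode *d, struct model_inode *ei)
{
	ei->dedup = d;
	ei->next = d->backing_head;
	d->backing_head = ei;
	d->refcount++;
}

/* Inode destruction: undo both steps; returns 1 when the last user is gone. */
static int dedup_detach(struct model_inode *ei)
{
	struct dedup_inode *d = ei->dedup;
	struct model_inode **pp = &d->backing_head;

	while (*pp && *pp != ei)
		pp = &(*pp)->next;
	assert(*pp == ei);	/* the inode must have been attached */
	*pp = ei->next;
	ei->next = NULL;
	ei->dedup = NULL;
	return --d->refcount == 0;
}
```

When `dedup_detach()` returns 1 in this model, the deduplicated inode has
no backing inode left and could be released.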
2.2. file reading
-----------------
Assuming the deduplicated inode's page cache is PGCache_dedup, there
are two possible scenarios when reading a file:
1) the content being read is already present in PGCache_dedup;
2) the content being read is not present in PGCache_dedup.
The second scenario involves iomap operations to read from the disk.
2.2.1. reading existing data in PGCache_dedup
-------------------------------------------
In this case, the overall read flowchart is as follows (take ksys_read()
for example):
ksys_read
│
│
▼
...
│
│
▼
erofs_ishare_file_read_iter (switch to backing deduplicated file)
│
│
▼
read PGCache_dedup & return
At this point, the content in PGCache_dedup will be read directly and
returned.
2.2.2. reading non-existent content in PGCache_dedup
---------------------------------------------------
In this case, disk I/O operations will be involved. Taking the reading
of an uncompressed file as an example, here is the reading process:
ksys_read
│
│
▼
...
│
│
▼
erofs_ishare_file_read_iter (switch to backing deduplicated file)
│
│
▼
... (allocate pages)
│
│
▼
erofs_read_folio/erofs_readahead
│
│
▼
... (iomap)
│
│
▼
erofs_iomap_begin
│
│
▼
...
Iomap and the layers below will involve disk I/O operations. As
described in 2.1, the deduplicated inode itself is not bound to a
specific device. The deduplicated inode will select an erofs inode from
the backing list (by default, the first one) to complete the
corresponding iomap operation.
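The hit and miss paths from 2.2.1 and 2.2.2 can be summarised with a small
userspace model. Everything here is hypothetical (`struct model_dedup`,
`struct model_backing`, `model_read_page()`); the counter merely stands in
for the iomap and disk I/O path, and a fixed-size array stands in for the
backing list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_PAGES 4

/* Hypothetical stand-in for an erofs inode bound to one device. */
struct model_backing {
	int dev_id;
	int disk_reads;	/* number of simulated device reads */
};

/* Hypothetical stand-in for the deduplicated inode and PGCache_dedup. */
struct model_dedup {
	bool cached[MODEL_NR_PAGES];
	struct model_backing *backing[2]; /* backing list; index 0 is "first" */
	size_t nr_backing;
};

/* Returns true on a cache hit, false if a device read was needed. */
static bool model_read_page(struct model_dedup *d, size_t index)
{
	struct model_backing *b;

	assert(index < MODEL_NR_PAGES);
	if (d->cached[index])
		return true;		/* 2.2.1: served from PGCache_dedup */

	assert(d->nr_backing > 0);
	b = d->backing[0];		/* 2.2.2: pick the first backing inode */
	b->disk_reads++;		/* stands in for the iomap + disk path */
	d->cached[index] = true;
	return false;
}
```

In this model the first read of a page goes to the first backing inode's
device, and any later read of the same page, through any file sharing the
deduplicated inode, is a cache hit.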
2.3. release page cache
-----------------------
Similar to overlayfs, when dropping the page cache via .fadvise, erofs
locates the deduplicated file and applies vfs_fadvise to that specific
file.
Effect
==================
I conducted experiments on two aspects across two different minor
versions of container images:
1. reading all files in two different minor versions of container images
2. run workloads or use the default entrypoint within the containers^[1]
Below is the memory usage for reading all files in two different minor
versions of container images:
+-------------------+------------------+-------------+---------------+
| Image | Page Cache Share | Memory (MB) | Memory |
| | | | Reduction (%) |
+-------------------+------------------+-------------+---------------+
| | No | 241 | - |
| redis +------------------+-------------+---------------+
| 7.2.4 & 7.2.5 | Yes | 163 | 33% |
+-------------------+------------------+-------------+---------------+
| | No | 872 | - |
| postgres +------------------+-------------+---------------+
| 16.1 & 16.2 | Yes | 630 | 28% |
+-------------------+------------------+-------------+---------------+
| | No | 2771 | - |
| tensorflow +------------------+-------------+---------------+
| 2.11.0 & 2.11.1 | Yes | 2340 | 16% |
+-------------------+------------------+-------------+---------------+
| | No | 926 | - |
| mysql +------------------+-------------+---------------+
| 8.0.11 & 8.0.12 | Yes | 735 | 21% |
+-------------------+------------------+-------------+---------------+
| | No | 390 | - |
| nginx +------------------+-------------+---------------+
| 7.2.4 & 7.2.5 | Yes | 219 | 44% |
+-------------------+------------------+-------------+---------------+
| tomcat | No | 924 | - |
| 10.1.25 & 10.1.26 +------------------+-------------+---------------+
| | Yes | 474 | 49% |
+-------------------+------------------+-------------+---------------+
Additionally, the table below shows the runtime memory usage of the
container:
+-------------------+------------------+-------------+---------------+
| Image | Page Cache Share | Memory (MB) | Memory |
| | | | Reduction (%) |
+-------------------+------------------+-------------+---------------+
| | No | 34.9 | - |
| redis +------------------+-------------+---------------+
| 7.2.4 & 7.2.5 | Yes | 33.6 | 4% |
+-------------------+------------------+-------------+---------------+
| | No | 149.1 | - |
| postgres +------------------+-------------+---------------+
| 16.1 & 16.2 | Yes | 95 | 37% |
+-------------------+------------------+-------------+---------------+
| | No | 1027.9 | - |
| tensorflow +------------------+-------------+---------------+
| 2.11.0 & 2.11.1 | Yes | 934.3 | 10% |
+-------------------+------------------+-------------+---------------+
| | No | 155.0 | - |
| mysql +------------------+-------------+---------------+
| 8.0.11 & 8.0.12 | Yes | 139.1 | 11% |
+-------------------+------------------+-------------+---------------+
| | No | 25.4 | - |
| nginx +------------------+-------------+---------------+
| 7.2.4 & 7.2.5 | Yes | 18.8 | 26% |
+-------------------+------------------+-------------+---------------+
| tomcat | No | 186 | - |
| 10.1.25 & 10.1.26 +------------------+-------------+---------------+
| | Yes | 99 | 47% |
+-------------------+------------------+-------------+---------------+
It can be observed that when reading all the files in the image, the
reduced memory usage varies from 16% to 49%, depending on the specific
image. Additionally, the container's runtime memory usage reduction
ranges from 4% to 47%.
[1] Below are the workloads for these images:
- redis: redis-benchmark
- postgres: sysbench
- tensorflow: app.py of tensorflow.python.platform
- mysql: sysbench
- nginx: wrk
- tomcat: default entrypoint
Changes from v9:
- make shared page cache a compatible feature.
- refine code style as suggested by Xiang.
- init ishare mnt during the module init as suggested by Xiang.
- rebase onto the latest mainline and fix the comments in cover letter.
Changes from v8:
- add Reviewed-by in patch 1 and patch 10.
- do some clean up in patch 2 and patch 4,6,9 as suggested by Xiang.
- add new patch 3 to export alloc_empty_backing_file.
- patch 5 only use xattr prefix id to record the ishare info, changed
config to EROFS_FS_PAGE_CACHE_SHARE and make it compatible.
- patch 7 use backing file helpers to alloc file when ishare file is
opened as suggested by Xiang.
- patch 8 remove erofs_read_{begin,end} as suggested by Xiang.
v9: https://lore.kernel.org/all/20251117132537.227116-1-lihongbo22@huawei.com/
v8: https://lore.kernel.org/all/20251114095516.207555-1-lihongbo22@huawei.com/
v7: https://lore.kernel.org/all/20251021104815.70662-1-lihongbo22@huawei.com/
v6: https://lore.kernel.org/all/20250301145002.2420830-1-hongzhen@linux.alibaba.com/T/#u
v5: https://lore.kernel.org/all/20250105151208.3797385-1-hongzhen@linux.alibaba.com/
v4: https://lore.kernel.org/all/20240902110620.2202586-1-hongzhen@linux.alibaba.com/
v3: https://lore.kernel.org/all/20240828111959.3677011-1-hongzhen@linux.alibaba.com/
v2: https://lore.kernel.org/all/20240731080704.678259-1-hongzhen@linux.alibaba.com/
v1: https://lore.kernel.org/all/20240722065355.1396365-1-hongzhen@linux.alibaba.com/
Diffstat:
fs/erofs/Kconfig | 9 ++
fs/erofs/Makefile | 1 +
fs/erofs/data.c | 89 +++++++++++----
fs/erofs/erofs_fs.h | 5 +-
fs/erofs/fscache.c | 13 ---
fs/erofs/inode.c | 4 +
fs/erofs/internal.h | 50 ++++++++
fs/erofs/ishare.c | 253 +++++++++++++++++++++++++++++++++++++++++
fs/erofs/super.c | 52 ++++++++-
fs/erofs/xattr.c | 13 +++
fs/erofs/zdata.c | 42 +++++--
fs/file_table.c | 1 +
fs/fuse/file.c | 4 +-
fs/iomap/buffered-io.c | 6 +-
include/linux/iomap.h | 8 +-
15 files changed, 492 insertions(+), 58 deletions(-)
create mode 100644 fs/erofs/ishare.c
--
2.22.0
* [PATCH v10 01/10] iomap: stash iomap read ctx in the private field of iomap_iter
2025-12-23 1:56 [PATCH v10 00/10] erofs: Introduce page cache sharing feature Hongbo Li
@ 2025-12-23 1:56 ` Hongbo Li
2025-12-23 2:32 ` Gao Xiang
2025-12-23 1:56 ` [PATCH v10 02/10] erofs: hold read context in iomap_iter if needed Hongbo Li
` (7 subsequent siblings)
8 siblings, 1 reply; 22+ messages in thread
From: Hongbo Li @ 2025-12-23 1:56 UTC (permalink / raw)
To: hsiangkao, chao, brauner, djwong, amir73il, hch
Cc: linux-fsdevel, linux-erofs, linux-kernel, lihongbo22
It's useful to get filesystem-specific information using the
existing private field in the @iomap_iter passed to iomap_{begin,end}
for advanced usage for iomap buffered reads, which is much like the
current iomap DIO.
For example, EROFS needs it to:
- implement an efficient page cache sharing feature, since iomap
needs to apply to anon inode page cache but we'd like to get the
backing inode/fs instead, so filesystem-specific private data is
needed to keep such information;
- pass in both struct page * and void * for inline data to avoid
kmap_to_page() usage (which is bogus).
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
---
fs/fuse/file.c | 4 ++--
fs/iomap/buffered-io.c | 6 ++++--
include/linux/iomap.h | 8 ++++----
3 files changed, 10 insertions(+), 8 deletions(-)
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 01bc894e9c2b..f5d8887c1922 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -979,7 +979,7 @@ static int fuse_read_folio(struct file *file, struct folio *folio)
return -EIO;
}
- iomap_read_folio(&fuse_iomap_ops, &ctx);
+ iomap_read_folio(&fuse_iomap_ops, &ctx, NULL);
fuse_invalidate_atime(inode);
return 0;
}
@@ -1081,7 +1081,7 @@ static void fuse_readahead(struct readahead_control *rac)
if (fuse_is_bad(inode))
return;
- iomap_readahead(&fuse_iomap_ops, &ctx);
+ iomap_readahead(&fuse_iomap_ops, &ctx, NULL);
}
static ssize_t fuse_cache_read_iter(struct kiocb *iocb, struct iov_iter *to)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index e5c1ca440d93..5f7dcbabbda3 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -555,13 +555,14 @@ static int iomap_read_folio_iter(struct iomap_iter *iter,
}
void iomap_read_folio(const struct iomap_ops *ops,
- struct iomap_read_folio_ctx *ctx)
+ struct iomap_read_folio_ctx *ctx, void *private)
{
struct folio *folio = ctx->cur_folio;
struct iomap_iter iter = {
.inode = folio->mapping->host,
.pos = folio_pos(folio),
.len = folio_size(folio),
+ .private = private,
};
size_t bytes_submitted = 0;
int ret;
@@ -620,13 +621,14 @@ static int iomap_readahead_iter(struct iomap_iter *iter,
* the filesystem to be reentered.
*/
void iomap_readahead(const struct iomap_ops *ops,
- struct iomap_read_folio_ctx *ctx)
+ struct iomap_read_folio_ctx *ctx, void *private)
{
struct readahead_control *rac = ctx->rac;
struct iomap_iter iter = {
.inode = rac->mapping->host,
.pos = readahead_pos(rac),
.len = readahead_length(rac),
+ .private = private,
};
size_t cur_bytes_submitted;
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 520e967cb501..441d614e9fdf 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -341,9 +341,9 @@ ssize_t iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *from,
const struct iomap_ops *ops,
const struct iomap_write_ops *write_ops, void *private);
void iomap_read_folio(const struct iomap_ops *ops,
- struct iomap_read_folio_ctx *ctx);
+ struct iomap_read_folio_ctx *ctx, void *private);
void iomap_readahead(const struct iomap_ops *ops,
- struct iomap_read_folio_ctx *ctx);
+ struct iomap_read_folio_ctx *ctx, void *private);
bool iomap_is_partially_uptodate(struct folio *, size_t from, size_t count);
struct folio *iomap_get_folio(struct iomap_iter *iter, loff_t pos, size_t len);
bool iomap_release_folio(struct folio *folio, gfp_t gfp_flags);
@@ -595,7 +595,7 @@ static inline void iomap_bio_read_folio(struct folio *folio,
.cur_folio = folio,
};
- iomap_read_folio(ops, &ctx);
+ iomap_read_folio(ops, &ctx, NULL);
}
static inline void iomap_bio_readahead(struct readahead_control *rac,
@@ -606,7 +606,7 @@ static inline void iomap_bio_readahead(struct readahead_control *rac,
.rac = rac,
};
- iomap_readahead(ops, &ctx);
+ iomap_readahead(ops, &ctx, NULL);
}
#endif /* CONFIG_BLOCK */
--
2.22.0
* [PATCH v10 02/10] erofs: hold read context in iomap_iter if needed
2025-12-23 1:56 [PATCH v10 00/10] erofs: Introduce page cache sharing feature Hongbo Li
2025-12-23 1:56 ` [PATCH v10 01/10] iomap: stash iomap read ctx in the private field of iomap_iter Hongbo Li
@ 2025-12-23 1:56 ` Hongbo Li
2025-12-23 2:32 ` Gao Xiang
2025-12-23 1:56 ` [PATCH v10 03/10] fs: Export alloc_empty_backing_file Hongbo Li
` (6 subsequent siblings)
8 siblings, 1 reply; 22+ messages in thread
From: Hongbo Li @ 2025-12-23 1:56 UTC (permalink / raw)
To: hsiangkao, chao, brauner, djwong, amir73il, hch
Cc: linux-fsdevel, linux-erofs, linux-kernel, lihongbo22
Introduce `struct erofs_iomap_iter_ctx` to hold both `struct page *`
and `void *base`, avoiding bogus use of `kmap_to_page()` in
`erofs_iomap_end()`.
With this change, fiemap and bmap no longer need to read inline data.
Additionally, the upcoming page cache sharing mechanism requires
passing the backing inode pointer to `erofs_iomap_{begin,end}()`, as
I/O accesses must apply to backing inodes rather than anon inodes.
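For readers unfamiliar with the pattern, the way the callbacks recover the
context from the `struct iomap *` they are handed can be illustrated in
userspace as below. This is only a sketch: `model_container_of()`,
`struct model_iter`, `struct model_iomap` and `struct model_ctx` are
hypothetical, reduced stand-ins, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace re-creation of the kernel's container_of() idea. */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct model_iomap {
	int type;
};

/* Hypothetical, reduced stand-in for struct iomap_iter. */
struct model_iter {
	long pos;
	struct model_iomap iomap;	/* embedded, as in the kernel struct */
	void *private;
};

/* Hypothetical stand-in for struct erofs_iomap_iter_ctx. */
struct model_ctx {
	void *page;
	void *base;
};

/* Mirrors how a begin/end-style callback can reach iter->private. */
static struct model_ctx *model_ctx_from_iomap(struct model_iomap *iomap)
{
	struct model_iter *iter;

	assert(iomap != NULL);
	iter = model_container_of(iomap, struct model_iter, iomap);
	return iter->private;	/* NULL when the caller passed no context */
}
```

A NULL result corresponds to callers such as fiemap and bmap that pass no
context, which is why they no longer need to read inline data.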
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
---
fs/erofs/data.c | 67 +++++++++++++++++++++++++++++++++----------------
1 file changed, 46 insertions(+), 21 deletions(-)
diff --git a/fs/erofs/data.c b/fs/erofs/data.c
index bb13c4cb8455..71e23d91123d 100644
--- a/fs/erofs/data.c
+++ b/fs/erofs/data.c
@@ -266,13 +266,20 @@ void erofs_onlinefolio_end(struct folio *folio, int err, bool dirty)
folio_end_read(folio, !(v & BIT(EROFS_ONLINEFOLIO_EIO)));
}
+struct erofs_iomap_iter_ctx {
+ struct page *page;
+ void *base;
+};
+
static int erofs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
unsigned int flags, struct iomap *iomap, struct iomap *srcmap)
{
- int ret;
+ struct iomap_iter *iter = container_of(iomap, struct iomap_iter, iomap);
+ struct erofs_iomap_iter_ctx *ctx = iter->private;
struct super_block *sb = inode->i_sb;
struct erofs_map_blocks map;
struct erofs_map_dev mdev;
+ int ret;
map.m_la = offset;
map.m_llen = length;
@@ -283,7 +290,6 @@ static int erofs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
iomap->offset = map.m_la;
iomap->length = map.m_llen;
iomap->flags = 0;
- iomap->private = NULL;
iomap->addr = IOMAP_NULL_ADDR;
if (!(map.m_flags & EROFS_MAP_MAPPED)) {
iomap->type = IOMAP_HOLE;
@@ -309,16 +315,20 @@ static int erofs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
}
if (map.m_flags & EROFS_MAP_META) {
- void *ptr;
- struct erofs_buf buf = __EROFS_BUF_INITIALIZER;
-
iomap->type = IOMAP_INLINE;
- ptr = erofs_read_metabuf(&buf, sb, map.m_pa,
- erofs_inode_in_metabox(inode));
- if (IS_ERR(ptr))
- return PTR_ERR(ptr);
- iomap->inline_data = ptr;
- iomap->private = buf.base;
+ /* read context should read the inlined data */
+ if (ctx) {
+ struct erofs_buf buf = __EROFS_BUF_INITIALIZER;
+ void *ptr;
+
+ ptr = erofs_read_metabuf(&buf, sb, map.m_pa,
+ erofs_inode_in_metabox(inode));
+ if (IS_ERR(ptr))
+ return PTR_ERR(ptr);
+ iomap->inline_data = ptr;
+ ctx->page = buf.page;
+ ctx->base = buf.base;
+ }
} else {
iomap->type = IOMAP_MAPPED;
}
@@ -328,18 +338,18 @@ static int erofs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
static int erofs_iomap_end(struct inode *inode, loff_t pos, loff_t length,
ssize_t written, unsigned int flags, struct iomap *iomap)
{
- void *ptr = iomap->private;
+ struct iomap_iter *iter = container_of(iomap, struct iomap_iter, iomap);
+ struct erofs_iomap_iter_ctx *ctx = iter->private;
- if (ptr) {
+ if (ctx && ctx->base) {
struct erofs_buf buf = {
- .page = kmap_to_page(ptr),
- .base = ptr,
+ .page = ctx->page,
+ .base = ctx->base,
};
DBG_BUGON(iomap->type != IOMAP_INLINE);
erofs_put_metabuf(&buf);
- } else {
- DBG_BUGON(iomap->type == IOMAP_INLINE);
+ ctx->base = NULL;
}
return written;
}
@@ -369,18 +379,30 @@ int erofs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
*/
static int erofs_read_folio(struct file *file, struct folio *folio)
{
+ struct iomap_read_folio_ctx read_ctx = {
+ .ops = &iomap_bio_read_ops,
+ .cur_folio = folio,
+ };
+ struct erofs_iomap_iter_ctx iter_ctx = {};
+
trace_erofs_read_folio(folio, true);
- iomap_bio_read_folio(folio, &erofs_iomap_ops);
+ iomap_read_folio(&erofs_iomap_ops, &read_ctx, &iter_ctx);
return 0;
}
static void erofs_readahead(struct readahead_control *rac)
{
+ struct iomap_read_folio_ctx read_ctx = {
+ .ops = &iomap_bio_read_ops,
+ .rac = rac,
+ };
+ struct erofs_iomap_iter_ctx iter_ctx = {};
+
trace_erofs_readahead(rac->mapping->host, readahead_index(rac),
readahead_count(rac), true);
- iomap_bio_readahead(rac, &erofs_iomap_ops);
+ iomap_readahead(&erofs_iomap_ops, &read_ctx, &iter_ctx);
}
static sector_t erofs_bmap(struct address_space *mapping, sector_t block)
@@ -400,9 +422,12 @@ static ssize_t erofs_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
if (IS_DAX(inode))
return dax_iomap_rw(iocb, to, &erofs_iomap_ops);
#endif
- if ((iocb->ki_flags & IOCB_DIRECT) && inode->i_sb->s_bdev)
+ if ((iocb->ki_flags & IOCB_DIRECT) && inode->i_sb->s_bdev) {
+ struct erofs_iomap_iter_ctx iter_ctx = {};
+
return iomap_dio_rw(iocb, to, &erofs_iomap_ops,
- NULL, 0, NULL, 0);
+ NULL, 0, &iter_ctx, 0);
+ }
return filemap_read(iocb, to, 0);
}
--
2.22.0
* [PATCH v10 03/10] fs: Export alloc_empty_backing_file
2025-12-23 1:56 [PATCH v10 00/10] erofs: Introduce page cache sharing feature Hongbo Li
2025-12-23 1:56 ` [PATCH v10 01/10] iomap: stash iomap read ctx in the private field of iomap_iter Hongbo Li
2025-12-23 1:56 ` [PATCH v10 02/10] erofs: hold read context in iomap_iter if needed Hongbo Li
@ 2025-12-23 1:56 ` Hongbo Li
2025-12-23 8:31 ` Gao Xiang
2025-12-23 1:56 ` [PATCH v10 04/10] erofs: move `struct erofs_anon_fs_type` to super.c Hongbo Li
` (5 subsequent siblings)
8 siblings, 1 reply; 22+ messages in thread
From: Hongbo Li @ 2025-12-23 1:56 UTC (permalink / raw)
To: hsiangkao, chao, brauner, djwong, amir73il, hch
Cc: linux-fsdevel, linux-erofs, linux-kernel, lihongbo22
There is no need to open nonexistent real files if backing files
couldn't be backed by real files (e.g., EROFS page cache sharing
doesn't need typical real files to open again).
Therefore, we export the alloc_empty_backing_file() helper, allowing
filesystems to dynamically set the backing file without real file
open. This is particularly useful for obtaining the correct @path
and @inode when calling file_user_path() and file_user_inode().
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
---
fs/file_table.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/file_table.c b/fs/file_table.c
index cd4a3db4659a..476edfe7d8f5 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -308,6 +308,7 @@ struct file *alloc_empty_backing_file(int flags, const struct cred *cred)
ff->file.f_mode |= FMODE_BACKING | FMODE_NOACCOUNT;
return &ff->file;
}
+EXPORT_SYMBOL_GPL(alloc_empty_backing_file);
/**
* file_init_path - initialize a 'struct file' based on path
--
2.22.0
* [PATCH v10 04/10] erofs: move `struct erofs_anon_fs_type` to super.c
2025-12-23 1:56 [PATCH v10 00/10] erofs: Introduce page cache sharing feature Hongbo Li
` (2 preceding siblings ...)
2025-12-23 1:56 ` [PATCH v10 03/10] fs: Export alloc_empty_backing_file Hongbo Li
@ 2025-12-23 1:56 ` Hongbo Li
2025-12-23 8:30 ` Gao Xiang
2025-12-23 1:56 ` [PATCH v10 05/10] erofs: support user-defined fingerprint name Hongbo Li
` (4 subsequent siblings)
8 siblings, 1 reply; 22+ messages in thread
From: Hongbo Li @ 2025-12-23 1:56 UTC (permalink / raw)
To: hsiangkao, chao, brauner, djwong, amir73il, hch
Cc: linux-fsdevel, linux-erofs, linux-kernel, lihongbo22
From: Hongzhen Luo <hongzhen@linux.alibaba.com>
Move the `struct erofs_anon_fs_type` to the super.c and
expose it in preparation for the upcoming page cache share
feature.
Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
---
fs/erofs/fscache.c | 13 -------------
fs/erofs/internal.h | 2 ++
fs/erofs/super.c | 15 +++++++++++++++
3 files changed, 17 insertions(+), 13 deletions(-)
diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c
index 7a346e20f7b7..f4937b025038 100644
--- a/fs/erofs/fscache.c
+++ b/fs/erofs/fscache.c
@@ -3,7 +3,6 @@
* Copyright (C) 2022, Alibaba Cloud
* Copyright (C) 2022, Bytedance Inc. All rights reserved.
*/
-#include <linux/pseudo_fs.h>
#include <linux/fscache.h>
#include "internal.h"
@@ -13,18 +12,6 @@ static LIST_HEAD(erofs_domain_list);
static LIST_HEAD(erofs_domain_cookies_list);
static struct vfsmount *erofs_pseudo_mnt;
-static int erofs_anon_init_fs_context(struct fs_context *fc)
-{
- return init_pseudo(fc, EROFS_SUPER_MAGIC) ? 0 : -ENOMEM;
-}
-
-static struct file_system_type erofs_anon_fs_type = {
- .owner = THIS_MODULE,
- .name = "pseudo_erofs",
- .init_fs_context = erofs_anon_init_fs_context,
- .kill_sb = kill_anon_super,
-};
-
struct erofs_fscache_io {
struct netfs_cache_resources cres;
struct iov_iter iter;
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index f7f622836198..98fe652aea33 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -188,6 +188,8 @@ static inline bool erofs_is_fileio_mode(struct erofs_sb_info *sbi)
return IS_ENABLED(CONFIG_EROFS_FS_BACKED_BY_FILE) && sbi->dif0.file;
}
+extern struct file_system_type erofs_anon_fs_type;
+
static inline bool erofs_is_fscache_mode(struct super_block *sb)
{
return IS_ENABLED(CONFIG_EROFS_FS_ONDEMAND) &&
diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index 937a215f626c..2a44c4e5af4f 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -11,6 +11,7 @@
#include <linux/fs_parser.h>
#include <linux/exportfs.h>
#include <linux/backing-dev.h>
+#include <linux/pseudo_fs.h>
#include "xattr.h"
#define CREATE_TRACE_POINTS
@@ -936,6 +937,20 @@ static struct file_system_type erofs_fs_type = {
};
MODULE_ALIAS_FS("erofs");
+#if defined(CONFIG_EROFS_FS_ONDEMAND)
+static int erofs_anon_init_fs_context(struct fs_context *fc)
+{
+ return init_pseudo(fc, EROFS_SUPER_MAGIC) ? 0 : -ENOMEM;
+}
+
+struct file_system_type erofs_anon_fs_type = {
+ .owner = THIS_MODULE,
+ .name = "pseudo_erofs",
+ .init_fs_context = erofs_anon_init_fs_context,
+ .kill_sb = kill_anon_super,
+};
+#endif
+
static int __init erofs_module_init(void)
{
int err;
--
2.22.0
* [PATCH v10 05/10] erofs: support user-defined fingerprint name
2025-12-23 1:56 [PATCH v10 00/10] erofs: Introduce page cache sharing feature Hongbo Li
` (3 preceding siblings ...)
2025-12-23 1:56 ` [PATCH v10 04/10] erofs: move `struct erofs_anon_fs_type` to super.c Hongbo Li
@ 2025-12-23 1:56 ` Hongbo Li
2025-12-23 7:22 ` Gao Xiang
2025-12-23 1:56 ` [PATCH v10 06/10] erofs: support domain-specific page cache share Hongbo Li
` (3 subsequent siblings)
8 siblings, 1 reply; 22+ messages in thread
From: Hongbo Li @ 2025-12-23 1:56 UTC (permalink / raw)
To: hsiangkao, chao, brauner, djwong, amir73il, hch
Cc: linux-fsdevel, linux-erofs, linux-kernel, lihongbo22
From: Hongzhen Luo <hongzhen@linux.alibaba.com>
When creating the EROFS image, users can specify the fingerprint name.
This is to prepare for the upcoming inode page cache share.
Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
---
fs/erofs/Kconfig | 9 +++++++++
fs/erofs/erofs_fs.h | 5 +++--
fs/erofs/internal.h | 2 ++
fs/erofs/super.c | 3 +++
fs/erofs/xattr.c | 13 +++++++++++++
5 files changed, 30 insertions(+), 2 deletions(-)
diff --git a/fs/erofs/Kconfig b/fs/erofs/Kconfig
index d81f3318417d..c88b6d0714a4 100644
--- a/fs/erofs/Kconfig
+++ b/fs/erofs/Kconfig
@@ -194,3 +194,12 @@ config EROFS_FS_PCPU_KTHREAD_HIPRI
at higher priority.
If unsure, say N.
+
+config EROFS_FS_PAGE_CACHE_SHARE
+ bool "EROFS page cache share support (experimental)"
+ depends on EROFS_FS && EROFS_FS_XATTR && !EROFS_FS_ONDEMAND
+ help
+ This enables page cache sharing among inodes with identical
+ content fingerprints on the same device.
+
+ If unsure, say N.
diff --git a/fs/erofs/erofs_fs.h b/fs/erofs/erofs_fs.h
index e24268acdd62..20515d2462af 100644
--- a/fs/erofs/erofs_fs.h
+++ b/fs/erofs/erofs_fs.h
@@ -17,7 +17,7 @@
#define EROFS_FEATURE_COMPAT_XATTR_FILTER 0x00000004
#define EROFS_FEATURE_COMPAT_SHARED_EA_IN_METABOX 0x00000008
#define EROFS_FEATURE_COMPAT_PLAIN_XATTR_PFX 0x00000010
-
+#define EROFS_FEATURE_COMPAT_ISHARE_XATTRS 0x00000020
/*
* Any bits that aren't in EROFS_ALL_FEATURE_INCOMPAT should
@@ -83,7 +83,8 @@ struct erofs_super_block {
__le32 xattr_prefix_start; /* start of long xattr prefixes */
__le64 packed_nid; /* nid of the special packed inode */
__u8 xattr_filter_reserved; /* reserved for xattr name filter */
- __u8 reserved[3];
+ __u8 ishare_xattr_prefix_id; /* indexes the ishare key in prefix xattrs */
+ __u8 reserved[2];
__le32 build_time; /* seconds added to epoch for mkfs time */
__le64 rootnid_8b; /* (48BIT on) nid of root directory */
__le64 reserved2;
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index 98fe652aea33..99e2857173c3 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -134,6 +134,7 @@ struct erofs_sb_info {
u32 xattr_blkaddr;
u32 xattr_prefix_start;
u8 xattr_prefix_count;
+ u8 ishare_xattr_pfx;
struct erofs_xattr_prefix_item *xattr_prefixes;
unsigned int xattr_filter_reserved;
#endif
@@ -238,6 +239,7 @@ EROFS_FEATURE_FUNCS(sb_chksum, compat, COMPAT_SB_CHKSUM)
EROFS_FEATURE_FUNCS(xattr_filter, compat, COMPAT_XATTR_FILTER)
EROFS_FEATURE_FUNCS(shared_ea_in_metabox, compat, COMPAT_SHARED_EA_IN_METABOX)
EROFS_FEATURE_FUNCS(plain_xattr_pfx, compat, COMPAT_PLAIN_XATTR_PFX)
+EROFS_FEATURE_FUNCS(ishare_xattrs, compat, COMPAT_ISHARE_XATTRS)
static inline u64 erofs_nid_to_ino64(struct erofs_sb_info *sbi, erofs_nid_t nid)
{
diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index 2a44c4e5af4f..68480f10e69d 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -298,6 +298,9 @@ static int erofs_read_superblock(struct super_block *sb)
if (ret)
goto out;
}
+ if (erofs_sb_has_ishare_xattrs(sbi))
+ sbi->ishare_xattr_pfx =
+ dsb->ishare_xattr_prefix_id & EROFS_XATTR_LONG_PREFIX_MASK;
ret = -EINVAL;
sbi->feature_incompat = le32_to_cpu(dsb->feature_incompat);
diff --git a/fs/erofs/xattr.c b/fs/erofs/xattr.c
index 396536d9a862..969e77efd038 100644
--- a/fs/erofs/xattr.c
+++ b/fs/erofs/xattr.c
@@ -519,6 +519,19 @@ int erofs_xattr_prefixes_init(struct super_block *sb)
}
erofs_put_metabuf(&buf);
+ if (!ret && erofs_sb_has_ishare_xattrs(sbi)) {
+ struct erofs_xattr_prefix_item *pf = pfs + sbi->ishare_xattr_pfx;
+ struct erofs_xattr_long_prefix *newpfx;
+
+ newpfx = krealloc(pf->prefix,
+ sizeof(*newpfx) + pf->infix_len + 1, GFP_KERNEL);
+ if (newpfx) {
+ newpfx->infix[pf->infix_len] = '\0';
+ pf->prefix = newpfx;
+ } else {
+ ret = -ENOMEM;
+ }
+ }
sbi->xattr_prefixes = pfs;
if (ret)
erofs_xattr_prefixes_cleanup(sb);
--
2.22.0
* [PATCH v10 06/10] erofs: support domain-specific page cache share
2025-12-23 1:56 [PATCH v10 00/10] erofs: Introduce page cache sharing feature Hongbo Li
` (4 preceding siblings ...)
2025-12-23 1:56 ` [PATCH v10 05/10] erofs: support user-defined fingerprint name Hongbo Li
@ 2025-12-23 1:56 ` Hongbo Li
2025-12-23 7:25 ` Gao Xiang
2025-12-23 1:56 ` [PATCH v10 07/10] erofs: introduce the page cache share feature Hongbo Li
` (2 subsequent siblings)
8 siblings, 1 reply; 22+ messages in thread
From: Hongbo Li @ 2025-12-23 1:56 UTC (permalink / raw)
To: hsiangkao, chao, brauner, djwong, amir73il, hch
Cc: linux-fsdevel, linux-erofs, linux-kernel, lihongbo22
From: Hongzhen Luo <hongzhen@linux.alibaba.com>
Only files in the same domain will share the page cache. Also modify
the sysfs related content in preparation for the upcoming page cache
share feature.
Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
---
fs/erofs/super.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index 68480f10e69d..be9f96252c6c 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -518,6 +518,8 @@ static int erofs_fc_parse_param(struct fs_context *fc,
if (!sbi->fsid)
return -ENOMEM;
break;
+#endif
+#if defined(CONFIG_EROFS_FS_ONDEMAND) || defined(CONFIG_EROFS_FS_PAGE_CACHE_SHARE)
case Opt_domain_id:
kfree(sbi->domain_id);
sbi->domain_id = kstrdup(param->string, GFP_KERNEL);
@@ -618,7 +620,7 @@ static void erofs_set_sysfs_name(struct super_block *sb)
{
struct erofs_sb_info *sbi = EROFS_SB(sb);
- if (sbi->domain_id)
+ if (sbi->domain_id && !erofs_sb_has_ishare_xattrs(sbi))
super_set_sysfs_name_generic(sb, "%s,%s", sbi->domain_id,
sbi->fsid);
else if (sbi->fsid)
@@ -1052,6 +1054,8 @@ static int erofs_show_options(struct seq_file *seq, struct dentry *root)
#ifdef CONFIG_EROFS_FS_ONDEMAND
if (sbi->fsid)
seq_printf(seq, ",fsid=%s", sbi->fsid);
+#endif
+#if defined(CONFIG_EROFS_FS_ONDEMAND) || defined(CONFIG_EROFS_FS_PAGE_CACHE_SHARE)
if (sbi->domain_id)
seq_printf(seq, ",domain_id=%s", sbi->domain_id);
#endif
--
2.22.0
* [PATCH v10 07/10] erofs: introduce the page cache share feature
2025-12-23 1:56 [PATCH v10 00/10] erofs: Introduce page cache sharing feature Hongbo Li
` (5 preceding siblings ...)
2025-12-23 1:56 ` [PATCH v10 06/10] erofs: support domain-specific page cache share Hongbo Li
@ 2025-12-23 1:56 ` Hongbo Li
2025-12-23 8:11 ` Gao Xiang
2025-12-23 1:56 ` [PATCH v10 08/10] erofs: support unencoded inodes for page cache share Hongbo Li
2025-12-23 1:56 ` [PATCH v10 09/10] erofs: support compressed " Hongbo Li
8 siblings, 1 reply; 22+ messages in thread
From: Hongbo Li @ 2025-12-23 1:56 UTC (permalink / raw)
To: hsiangkao, chao, brauner, djwong, amir73il, hch
Cc: linux-fsdevel, linux-erofs, linux-kernel, lihongbo22
From: Hongzhen Luo <hongzhen@linux.alibaba.com>
Currently, reading files with different paths (or names) but the same
content will consume multiple copies of the page cache, even if the
content of these page caches is the same. For example, reading
identical files (e.g., *.so files) from two different minor versions of
container images will cost multiple copies of the same page cache,
since different containers have different mount points. Therefore,
sharing the page cache for files with the same content can save memory.
This introduces the page cache share feature in erofs. It allocates a
deduplicated inode and uses its page cache as the shared one. Reads for files
with identical content will ultimately be routed to the page cache of
the deduplicated inode. In this way, a single page cache satisfies
multiple read requests for different files with the same contents.
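As a rough user-space illustration of the lookup key that decides which files map to the same deduplicated inode, the sketch below builds a "length + (xattr value + domain_id)" buffer, following the layout comment in erofs_ishare_fill_inode() in this patch. The names `struct ishare_key`, `ishare_key_make()` and `ishare_key_equal()` are hypothetical and exist only for this example; the kernel code uses a raw buffer, xxh32() and iget5_locked() instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space stand-in for the fingerprint buffer in the patch. */
struct ishare_key {
	size_t len;		/* bytes in data[]: value length + domain_id length */
	unsigned char data[];	/* xattr value followed by domain_id, no NUL */
};

/* Build "length + (xattr value + domain_id)"; returns NULL if malloc fails. */
static struct ishare_key *ishare_key_make(const void *value, size_t value_len,
					  const char *domain_id)
{
	size_t dlen;
	struct ishare_key *k;

	assert(value && domain_id);
	dlen = strlen(domain_id);
	k = malloc(sizeof(*k) + value_len + dlen);
	if (!k)
		return NULL;
	k->len = value_len + dlen;
	memcpy(k->data, value, value_len);
	memcpy(k->data + value_len, domain_id, dlen);
	return k;
}

/* Two files may share a page cache only if the whole key matches. */
static bool ishare_key_equal(const struct ishare_key *a,
			     const struct ishare_key *b)
{
	return a->len == b->len && memcmp(a->data, b->data, a->len) == 0;
}
```

With this layout, two files carrying the same fingerprint xattr value compare equal only when they are also mounted with the same domain_id, which is the property the series relies on to keep sharing domain-specific.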
Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
---
fs/erofs/Makefile | 1 +
fs/erofs/internal.h | 29 ++++++
fs/erofs/ishare.c | 211 ++++++++++++++++++++++++++++++++++++++++++++
fs/erofs/super.c | 34 ++++++-
4 files changed, 272 insertions(+), 3 deletions(-)
create mode 100644 fs/erofs/ishare.c
diff --git a/fs/erofs/Makefile b/fs/erofs/Makefile
index 549abc424763..a80e1762b607 100644
--- a/fs/erofs/Makefile
+++ b/fs/erofs/Makefile
@@ -10,3 +10,4 @@ erofs-$(CONFIG_EROFS_FS_ZIP_ZSTD) += decompressor_zstd.o
erofs-$(CONFIG_EROFS_FS_ZIP_ACCEL) += decompressor_crypto.o
erofs-$(CONFIG_EROFS_FS_BACKED_BY_FILE) += fileio.o
erofs-$(CONFIG_EROFS_FS_ONDEMAND) += fscache.o
+erofs-$(CONFIG_EROFS_FS_PAGE_CACHE_SHARE) += ishare.o
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index 99e2857173c3..ae9560434324 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -304,6 +304,22 @@ struct erofs_inode {
};
#endif /* CONFIG_EROFS_FS_ZIP */
};
+#ifdef CONFIG_EROFS_FS_PAGE_CACHE_SHARE
+ union {
+ /* internal dedup inode */
+ struct {
+ char *fingerprint;
+ spinlock_t lock;
+ /* all backing inodes */
+ struct list_head backing_head;
+ };
+
+ struct {
+ struct inode *ishare;
+ struct list_head backing_link;
+ };
+ };
+#endif
/* the corresponding vfs inode */
struct inode vfs_inode;
};
@@ -410,6 +426,7 @@ extern const struct inode_operations erofs_dir_iops;
extern const struct file_operations erofs_file_fops;
extern const struct file_operations erofs_dir_fops;
+extern const struct file_operations erofs_ishare_fops;
extern const struct iomap_ops z_erofs_iomap_report_ops;
@@ -541,6 +558,18 @@ static inline struct bio *erofs_fscache_bio_alloc(struct erofs_map_dev *mdev) {
static inline void erofs_fscache_submit_bio(struct bio *bio) {}
#endif
+#ifdef CONFIG_EROFS_FS_PAGE_CACHE_SHARE
+int __init erofs_init_ishare(void);
+void erofs_exit_ishare(void);
+bool erofs_ishare_fill_inode(struct inode *inode);
+void erofs_ishare_free_inode(struct inode *inode);
+#else
+static inline int erofs_init_ishare(void) { return 0; }
+static inline void erofs_exit_ishare(void) {}
+static inline bool erofs_ishare_fill_inode(struct inode *inode) { return false; }
+static inline void erofs_ishare_free_inode(struct inode *inode) {}
+#endif // CONFIG_EROFS_FS_PAGE_CACHE_SHARE
+
long erofs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg);
long erofs_compat_ioctl(struct file *filp, unsigned int cmd,
unsigned long arg);
diff --git a/fs/erofs/ishare.c b/fs/erofs/ishare.c
new file mode 100644
index 000000000000..4b46016bcd03
--- /dev/null
+++ b/fs/erofs/ishare.c
@@ -0,0 +1,211 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2024, Alibaba Cloud
+ */
+#include <linux/xxhash.h>
+#include <linux/refcount.h>
+#include <linux/mount.h>
+#include <linux/mutex.h>
+#include <linux/ramfs.h>
+#include "internal.h"
+#include "xattr.h"
+
+#include "../internal.h"
+
+static struct vfsmount *erofs_ishare_mnt;
+
+static int erofs_ishare_iget5_eq(struct inode *inode, void *data)
+{
+ struct erofs_inode *vi = EROFS_I(inode);
+
+ return vi->fingerprint && memcmp(vi->fingerprint, data,
+ sizeof(size_t) + *(size_t *)data) == 0;
+}
+
+static int erofs_ishare_iget5_set(struct inode *inode, void *data)
+{
+ struct erofs_inode *vi = EROFS_I(inode);
+
+ vi->fingerprint = data;
+ INIT_LIST_HEAD(&vi->backing_head);
+ spin_lock_init(&vi->lock);
+ return 0;
+}
+
+bool erofs_ishare_fill_inode(struct inode *inode)
+{
+ struct erofs_sb_info *sbi = EROFS_SB(inode->i_sb);
+ struct erofs_xattr_prefix_item *ishare_prefix;
+ struct erofs_inode *vi = EROFS_I(inode);
+ struct inode *idedup;
+ /*
+ * fingerprint layout:
+ * fingerprint length + fingerprint content (xattr_value + domain_id)
+ */
+ char *ishare_key, *fingerprint;
+ ssize_t ishare_vlen;
+ unsigned long hash;
+ int key_idx;
+
+ if (!sbi->domain_id || !erofs_sb_has_ishare_xattrs(sbi))
+ return false;
+
+ ishare_prefix = sbi->xattr_prefixes + sbi->ishare_xattr_pfx;
+ ishare_key = ishare_prefix->prefix->infix;
+ key_idx = ishare_prefix->prefix->base_index;
+ ishare_vlen = erofs_getxattr(inode, key_idx, ishare_key, NULL, 0);
+ if (ishare_vlen <= 0 || ishare_vlen > (1 << sbi->blkszbits))
+ return false;
+
+ fingerprint = kmalloc(sizeof(ssize_t) + ishare_vlen +
+ strlen(sbi->domain_id), GFP_KERNEL);
+ if (!fingerprint)
+ return false;
+
+ *(ssize_t *)fingerprint = ishare_vlen + strlen(sbi->domain_id);
+ if (ishare_vlen != erofs_getxattr(inode, key_idx, ishare_key,
+ fingerprint + sizeof(ssize_t),
+ ishare_vlen)) {
+ kfree(fingerprint);
+ return false;
+ }
+
+ memcpy(fingerprint + sizeof(ssize_t) + ishare_vlen,
+ sbi->domain_id, strlen(sbi->domain_id));
+ hash = xxh32(fingerprint + sizeof(ssize_t),
+ ishare_vlen + strlen(sbi->domain_id), 0);
+ idedup = iget5_locked(erofs_ishare_mnt->mnt_sb, hash,
+ erofs_ishare_iget5_eq, erofs_ishare_iget5_set,
+ fingerprint);
+ if (!idedup) {
+ kfree(fingerprint);
+ return false;
+ }
+
+ INIT_LIST_HEAD(&vi->backing_link);
+ vi->ishare = idedup;
+ spin_lock(&EROFS_I(idedup)->lock);
+ list_add(&vi->backing_link, &EROFS_I(idedup)->backing_head);
+ spin_unlock(&EROFS_I(idedup)->lock);
+ if (!(inode_state_read_once(idedup) & I_NEW)) {
+ kfree(fingerprint);
+ return true;
+ }
+ if (erofs_inode_is_data_compressed(vi->datalayout))
+ idedup->i_mapping->a_ops = &z_erofs_aops;
+ else
+ idedup->i_mapping->a_ops = &erofs_aops;
+ idedup->i_mode = vi->vfs_inode.i_mode;
+ i_size_write(idedup, vi->vfs_inode.i_size);
+ unlock_new_inode(idedup);
+ return true;
+}
+
+void erofs_ishare_free_inode(struct inode *inode)
+{
+ struct erofs_inode *vi = EROFS_I(inode);
+ struct inode *idedup = vi->ishare;
+
+ if (!idedup)
+ return;
+ spin_lock(&EROFS_I(idedup)->lock);
+ list_del(&vi->backing_link);
+ spin_unlock(&EROFS_I(idedup)->lock);
+ iput(idedup);
+ vi->ishare = NULL;
+}
+
+static int erofs_ishare_file_open(struct inode *inode, struct file *file)
+{
+ struct file *realfile;
+ struct inode *dedup;
+
+ dedup = EROFS_I(inode)->ishare;
+ if (!dedup)
+ return -EINVAL;
+
+ realfile = alloc_empty_backing_file(O_RDONLY|O_NOATIME, current_cred());
+ if (IS_ERR(realfile))
+ return PTR_ERR(realfile);
+ ihold(dedup);
+ realfile->f_op = &erofs_file_fops;
+ realfile->f_inode = dedup;
+ realfile->f_mapping = dedup->i_mapping;
+ path_get(&file->f_path);
+ backing_file_set_user_path(realfile, &file->f_path);
+
+ file_ra_state_init(&realfile->f_ra, file->f_mapping);
+ realfile->private_data = EROFS_I(inode);
+ file->private_data = realfile;
+ return 0;
+}
+
+static int erofs_ishare_file_release(struct inode *inode, struct file *file)
+{
+ struct file *realfile = file->private_data;
+
+ if (!realfile)
+ return -EINVAL;
+ iput(realfile->f_inode);
+ fput(realfile);
+ file->private_data = NULL;
+ return 0;
+}
+
+static ssize_t erofs_ishare_file_read_iter(struct kiocb *iocb,
+ struct iov_iter *to)
+{
+ struct file *realfile = iocb->ki_filp->private_data;
+ struct inode *inode = file_inode(iocb->ki_filp);
+ struct kiocb dedup_iocb;
+ ssize_t nread;
+
+ if (!realfile)
+ return -EINVAL;
+ if (!iov_iter_count(to))
+ return 0;
+
+ /* fall back to the original file in DAX or DIRECT mode */
+ if (IS_DAX(inode) || (iocb->ki_flags & IOCB_DIRECT))
+ realfile = iocb->ki_filp;
+
+ kiocb_clone(&dedup_iocb, iocb, realfile);
+ nread = filemap_read(&dedup_iocb, to, 0);
+ iocb->ki_pos = dedup_iocb.ki_pos;
+ file_accessed(iocb->ki_filp);
+ return nread;
+}
+
+static int erofs_ishare_mmap(struct file *file, struct vm_area_struct *vma)
+{
+ struct file *realfile = file->private_data;
+
+ if (!realfile)
+ return -EINVAL;
+
+ vma_set_file(vma, realfile);
+ return generic_file_readonly_mmap(file, vma);
+}
+
+const struct file_operations erofs_ishare_fops = {
+ .open = erofs_ishare_file_open,
+ .llseek = generic_file_llseek,
+ .read_iter = erofs_ishare_file_read_iter,
+ .mmap = erofs_ishare_mmap,
+ .release = erofs_ishare_file_release,
+ .get_unmapped_area = thp_get_unmapped_area,
+ .splice_read = filemap_splice_read,
+};
+
+int __init erofs_init_ishare(void)
+{
+ erofs_ishare_mnt = kern_mount(&erofs_anon_fs_type);
+ if (IS_ERR(erofs_ishare_mnt))
+ return PTR_ERR(erofs_ishare_mnt);
+ return 0;
+}
+
+void erofs_exit_ishare(void)
+{
+ kern_unmount(erofs_ishare_mnt);
+}
diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index be9f96252c6c..ecce491871af 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -942,14 +942,34 @@ static struct file_system_type erofs_fs_type = {
};
MODULE_ALIAS_FS("erofs");
-#if defined(CONFIG_EROFS_FS_ONDEMAND)
+#if defined(CONFIG_EROFS_FS_ONDEMAND) || defined(CONFIG_EROFS_FS_PAGE_CACHE_SHARE)
+static void erofs_free_anon_inode(struct inode *inode)
+{
+ struct erofs_inode *vi = EROFS_I(inode);
+
+#ifdef CONFIG_EROFS_FS_PAGE_CACHE_SHARE
+ kfree(vi->fingerprint);
+#endif
+ kmem_cache_free(erofs_inode_cachep, vi);
+}
+
+static const struct super_operations erofs_anon_sops = {
+ .alloc_inode = erofs_alloc_inode,
+ .free_inode = erofs_free_anon_inode,
+};
+
static int erofs_anon_init_fs_context(struct fs_context *fc)
{
- return init_pseudo(fc, EROFS_SUPER_MAGIC) ? 0 : -ENOMEM;
+ struct pseudo_fs_context *ctx;
+
+ ctx = init_pseudo(fc, EROFS_SUPER_MAGIC);
+ if (ctx)
+ ctx->ops = &erofs_anon_sops;
+
+ return ctx ? 0 : -ENOMEM;
}
struct file_system_type erofs_anon_fs_type = {
- .owner = THIS_MODULE,
.name = "pseudo_erofs",
.init_fs_context = erofs_anon_init_fs_context,
.kill_sb = kill_anon_super,
@@ -981,6 +1001,10 @@ static int __init erofs_module_init(void)
if (err)
goto sysfs_err;
+ err = erofs_init_ishare();
+ if (err)
+ goto ishare_err;
+
err = register_filesystem(&erofs_fs_type);
if (err)
goto fs_err;
@@ -988,6 +1012,8 @@ static int __init erofs_module_init(void)
return 0;
fs_err:
+ erofs_exit_ishare();
+ishare_err:
erofs_exit_sysfs();
sysfs_err:
z_erofs_exit_subsystem();
@@ -1005,6 +1031,7 @@ static void __exit erofs_module_exit(void)
/* Ensure all RCU free inodes / pclusters are safe to be destroyed. */
rcu_barrier();
+ erofs_exit_ishare();
erofs_exit_sysfs();
z_erofs_exit_subsystem();
erofs_exit_shrinker();
@@ -1071,6 +1098,7 @@ static void erofs_evict_inode(struct inode *inode)
dax_break_layout_final(inode);
#endif
+ erofs_ishare_free_inode(inode);
truncate_inode_pages_final(&inode->i_data);
clear_inode(inode);
}
--
2.22.0
* [PATCH v10 08/10] erofs: support unencoded inodes for page cache share
2025-12-23 1:56 [PATCH v10 00/10] erofs: Introduce page cache sharing feature Hongbo Li
` (6 preceding siblings ...)
2025-12-23 1:56 ` [PATCH v10 07/10] erofs: introduce the page cache share feature Hongbo Li
@ 2025-12-23 1:56 ` Hongbo Li
2025-12-23 8:15 ` Gao Xiang
2025-12-23 1:56 ` [PATCH v10 09/10] erofs: support compressed " Hongbo Li
8 siblings, 1 reply; 22+ messages in thread
From: Hongbo Li @ 2025-12-23 1:56 UTC (permalink / raw)
To: hsiangkao, chao, brauner, djwong, amir73il, hch
Cc: linux-fsdevel, linux-erofs, linux-kernel, lihongbo22
This patch adds inode page cache sharing functionality for unencoded
files.
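The read path in this patch has to map the anonymous deduplicated inode back to a real backing inode before any block mapping can happen. A minimal user-space sketch of that selection step is shown here, assuming a simple singly linked list and a `dying` flag in place of the kernel's inode state; `struct backing`, `backing_tryget()`, `pick_backing()` and `backing_put()` are hypothetical names, and the spinlock that erofs_ishare_iget() holds around the walk is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a backing inode linked to one dedup inode. */
struct backing {
	int refcount;
	bool dying;		/* models an inode under eviction, where igrab() fails */
	struct backing *next;
};

/* Take a reference unless the inode is going away, loosely like igrab(). */
static struct backing *backing_tryget(struct backing *b)
{
	if (b->dying)
		return NULL;
	b->refcount++;
	return b;
}

/* Walk the list and return the first backing inode that can still be pinned. */
static struct backing *pick_backing(struct backing *head)
{
	struct backing *found = NULL;

	for (struct backing *b = head; b; b = b->next) {
		found = backing_tryget(b);
		if (found)
			break;
	}
	return found;	/* NULL if the list is empty or every entry is dying */
}

/* Drop the reference taken by pick_backing(), loosely like iput(). */
static void backing_put(struct backing *b)
{
	assert(b->refcount > 0);
	b->refcount--;
}
```

A caller would pin one backing inode with pick_backing(), perform the read against it, and then release it with backing_put(), mirroring the erofs_ishare_iget() / erofs_ishare_iput() pairing in the diff below.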
I ran experiments in a container environment. The table below shows the
memory usage when reading all files in two different minor versions
of each container image:
+-------------------+------------------+-------------+---------------+
| Image             | Page Cache Share | Memory (MB) | Memory        |
|                   |                  |             | Reduction (%) |
+-------------------+------------------+-------------+---------------+
|                   | No               | 241         | -             |
| redis             +------------------+-------------+---------------+
| 7.2.4 & 7.2.5     | Yes              | 163         | 33%           |
+-------------------+------------------+-------------+---------------+
|                   | No               | 872         | -             |
| postgres          +------------------+-------------+---------------+
| 16.1 & 16.2       | Yes              | 630         | 28%           |
+-------------------+------------------+-------------+---------------+
|                   | No               | 2771        | -             |
| tensorflow        +------------------+-------------+---------------+
| 2.11.0 & 2.11.1   | Yes              | 2340        | 16%           |
+-------------------+------------------+-------------+---------------+
|                   | No               | 926         | -             |
| mysql             +------------------+-------------+---------------+
| 8.0.11 & 8.0.12   | Yes              | 735         | 21%           |
+-------------------+------------------+-------------+---------------+
|                   | No               | 390         | -             |
| nginx             +------------------+-------------+---------------+
| 7.2.4 & 7.2.5     | Yes              | 219         | 44%           |
+-------------------+------------------+-------------+---------------+
|                   | No               | 924         | -             |
| tomcat            +------------------+-------------+---------------+
| 10.1.25 & 10.1.26 | Yes              | 474         | 49%           |
+-------------------+------------------+-------------+---------------+
Additionally, the table below shows the runtime memory usage of the
containers:
+-------------------+------------------+-------------+---------------+
| Image             | Page Cache Share | Memory (MB) | Memory        |
|                   |                  |             | Reduction (%) |
+-------------------+------------------+-------------+---------------+
|                   | No               | 35          | -             |
| redis             +------------------+-------------+---------------+
| 7.2.4 & 7.2.5     | Yes              | 28          | 20%           |
+-------------------+------------------+-------------+---------------+
|                   | No               | 149         | -             |
| postgres          +------------------+-------------+---------------+
| 16.1 & 16.2       | Yes              | 95          | 37%           |
+-------------------+------------------+-------------+---------------+
|                   | No               | 1028        | -             |
| tensorflow        +------------------+-------------+---------------+
| 2.11.0 & 2.11.1   | Yes              | 930         | 10%           |
+-------------------+------------------+-------------+---------------+
|                   | No               | 155         | -             |
| mysql             +------------------+-------------+---------------+
| 8.0.11 & 8.0.12   | Yes              | 132         | 15%           |
+-------------------+------------------+-------------+---------------+
|                   | No               | 25          | -             |
| nginx             +------------------+-------------+---------------+
| 7.2.4 & 7.2.5     | Yes              | 20          | 20%           |
+-------------------+------------------+-------------+---------------+
|                   | No               | 186         | -             |
| tomcat            +------------------+-------------+---------------+
| 10.1.25 & 10.1.26 | Yes              | 98          | 48%           |
+-------------------+------------------+-------------+---------------+
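For readers who want to re-derive the "Memory Reduction (%)" column, a small sketch of the arithmetic follows. It assumes the percentage is (before - after) / before, rounded up to a whole percent; the rounding mode is an inference from the figures in the two tables above, not something stated in the patch, and `reduction_pct()` is a hypothetical helper.

```c
#include <assert.h>

/*
 * Percentage saved when usage drops from before_mb to after_mb, rounded up
 * to a whole percent. Rounding up is an assumption inferred from the tables.
 */
static unsigned int reduction_pct(unsigned int before_mb, unsigned int after_mb)
{
	unsigned int saved;

	assert(before_mb > 0 && after_mb <= before_mb);
	saved = before_mb - after_mb;
	/* integer ceiling of saved * 100 / before_mb */
	return (saved * 100 + before_mb - 1) / before_mb;
}
```

For example, the redis row of the first table gives (241 - 163) / 241, roughly 32.4%, which this helper reports as 33.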
Co-developed-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
---
fs/erofs/data.c | 30 +++++++++++++++++++++++-------
fs/erofs/inode.c | 4 ++++
fs/erofs/internal.h | 17 +++++++++++++++++
fs/erofs/ishare.c | 31 +++++++++++++++++++++++++++++++
4 files changed, 75 insertions(+), 7 deletions(-)
diff --git a/fs/erofs/data.c b/fs/erofs/data.c
index 71e23d91123d..862df0c7ceb7 100644
--- a/fs/erofs/data.c
+++ b/fs/erofs/data.c
@@ -269,6 +269,7 @@ void erofs_onlinefolio_end(struct folio *folio, int err, bool dirty)
struct erofs_iomap_iter_ctx {
struct page *page;
void *base;
+ struct inode *realinode;
};
static int erofs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
@@ -276,14 +277,15 @@ static int erofs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
{
struct iomap_iter *iter = container_of(iomap, struct iomap_iter, iomap);
struct erofs_iomap_iter_ctx *ctx = iter->private;
- struct super_block *sb = inode->i_sb;
+ struct inode *realinode = ctx ? ctx->realinode : inode;
+ struct super_block *sb = realinode->i_sb;
struct erofs_map_blocks map;
struct erofs_map_dev mdev;
int ret;
map.m_la = offset;
map.m_llen = length;
- ret = erofs_map_blocks(inode, &map);
+ ret = erofs_map_blocks(realinode, &map);
if (ret < 0)
return ret;
@@ -296,7 +298,7 @@ static int erofs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
return 0;
}
- if (!(map.m_flags & EROFS_MAP_META) || !erofs_inode_in_metabox(inode)) {
+ if (!(map.m_flags & EROFS_MAP_META) || !erofs_inode_in_metabox(realinode)) {
mdev = (struct erofs_map_dev) {
.m_deviceid = map.m_deviceid,
.m_pa = map.m_pa,
@@ -322,7 +324,7 @@ static int erofs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
void *ptr;
ptr = erofs_read_metabuf(&buf, sb, map.m_pa,
- erofs_inode_in_metabox(inode));
+ erofs_inode_in_metabox(realinode));
if (IS_ERR(ptr))
return PTR_ERR(ptr);
iomap->inline_data = ptr;
@@ -379,30 +381,42 @@ int erofs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
*/
static int erofs_read_folio(struct file *file, struct folio *folio)
{
+ struct inode *inode = folio_inode(folio);
struct iomap_read_folio_ctx read_ctx = {
.ops = &iomap_bio_read_ops,
.cur_folio = folio,
};
- struct erofs_iomap_iter_ctx iter_ctx = {};
+ struct erofs_iomap_iter_ctx iter_ctx = {
+ .realinode = erofs_ishare_iget(inode),
+ };
+ if (!iter_ctx.realinode)
+ return -EIO;
trace_erofs_read_folio(folio, true);
iomap_read_folio(&erofs_iomap_ops, &read_ctx, &iter_ctx);
+ erofs_ishare_iput(iter_ctx.realinode);
return 0;
}
static void erofs_readahead(struct readahead_control *rac)
{
+ struct inode *inode = rac->mapping->host;
struct iomap_read_folio_ctx read_ctx = {
.ops = &iomap_bio_read_ops,
.rac = rac,
};
- struct erofs_iomap_iter_ctx iter_ctx = {};
+ struct erofs_iomap_iter_ctx iter_ctx = {
+ .realinode = erofs_ishare_iget(inode),
+ };
+ if (!iter_ctx.realinode)
+ return;
trace_erofs_readahead(rac->mapping->host, readahead_index(rac),
readahead_count(rac), true);
iomap_readahead(&erofs_iomap_ops, &read_ctx, &iter_ctx);
+ erofs_ishare_iput(iter_ctx.realinode);
}
static sector_t erofs_bmap(struct address_space *mapping, sector_t block)
@@ -423,7 +437,9 @@ static ssize_t erofs_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
return dax_iomap_rw(iocb, to, &erofs_iomap_ops);
#endif
if ((iocb->ki_flags & IOCB_DIRECT) && inode->i_sb->s_bdev) {
- struct erofs_iomap_iter_ctx iter_ctx = {};
+ struct erofs_iomap_iter_ctx iter_ctx = {
+ .realinode = inode,
+ };
return iomap_dio_rw(iocb, to, &erofs_iomap_ops,
NULL, 0, &iter_ctx, 0);
diff --git a/fs/erofs/inode.c b/fs/erofs/inode.c
index bce98c845a18..8116738fe432 100644
--- a/fs/erofs/inode.c
+++ b/fs/erofs/inode.c
@@ -215,6 +215,10 @@ static int erofs_fill_inode(struct inode *inode)
case S_IFREG:
inode->i_op = &erofs_generic_iops;
inode->i_fop = &erofs_file_fops;
+#ifdef CONFIG_EROFS_FS_PAGE_CACHE_SHARE
+ if (erofs_ishare_fill_inode(inode))
+ inode->i_fop = &erofs_ishare_fops;
+#endif
break;
case S_IFDIR:
inode->i_op = &erofs_dir_iops;
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index ae9560434324..c35e6857d563 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -197,6 +197,19 @@ static inline bool erofs_is_fscache_mode(struct super_block *sb)
!erofs_is_fileio_mode(EROFS_SB(sb)) && !sb->s_bdev;
}
+#if defined(CONFIG_EROFS_FS_PAGE_CACHE_SHARE)
+static inline bool erofs_is_ishare_inode(struct inode *inode)
+{
+ /* EROFS_FS_PAGE_CACHE_SHARE depends on !EROFS_FS_ONDEMAND (see Kconfig) */
+ return inode->i_sb->s_type == &erofs_anon_fs_type;
+}
+#else
+static inline bool erofs_is_ishare_inode(struct inode *inode)
+{
+ return false;
+}
+#endif
+
enum {
EROFS_ZIP_CACHE_DISABLED,
EROFS_ZIP_CACHE_READAHEAD,
@@ -563,11 +576,15 @@ int __init erofs_init_ishare(void);
void erofs_exit_ishare(void);
bool erofs_ishare_fill_inode(struct inode *inode);
void erofs_ishare_free_inode(struct inode *inode);
+struct inode *erofs_ishare_iget(struct inode *inode);
+void erofs_ishare_iput(struct inode *realinode);
#else
static inline int erofs_init_ishare(void) { return 0; }
static inline void erofs_exit_ishare(void) {}
static inline bool erofs_ishare_fill_inode(struct inode *inode) { return false; }
static inline void erofs_ishare_free_inode(struct inode *inode) {}
+static inline struct inode *erofs_ishare_iget(struct inode *inode) { return inode; }
+static inline void erofs_ishare_iput(struct inode *realinode) {}
#endif // CONFIG_EROFS_FS_PAGE_CACHE_SHARE
long erofs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg);
diff --git a/fs/erofs/ishare.c b/fs/erofs/ishare.c
index 4b46016bcd03..269b53b3ed79 100644
--- a/fs/erofs/ishare.c
+++ b/fs/erofs/ishare.c
@@ -197,6 +197,37 @@ const struct file_operations erofs_ishare_fops = {
.splice_read = filemap_splice_read,
};
+/*
+ * erofs_ishare_iget - find the backing inode.
+ */
+struct inode *erofs_ishare_iget(struct inode *inode)
+{
+ struct erofs_inode *vi, *vi_dedup;
+ struct inode *realinode = NULL;
+
+ if (!erofs_is_ishare_inode(inode))
+ return igrab(inode);
+
+ vi_dedup = EROFS_I(inode);
+ spin_lock(&vi_dedup->lock);
+ /* try each backing inode until one can be grabbed */
+ DBG_BUGON(list_empty(&vi_dedup->backing_head));
+ list_for_each_entry(vi, &vi_dedup->backing_head, backing_link) {
+ realinode = igrab(&vi->vfs_inode);
+ if (realinode)
+ break;
+ }
+ spin_unlock(&vi_dedup->lock);
+
+ DBG_BUGON(!realinode);
+ return realinode;
+}
+
+void erofs_ishare_iput(struct inode *realinode)
+{
+ iput(realinode);
+}
+
int __init erofs_init_ishare(void)
{
erofs_ishare_mnt = kern_mount(&erofs_anon_fs_type);
--
2.22.0
* [PATCH v10 09/10] erofs: support compressed inodes for page cache share
2025-12-23 1:56 [PATCH v10 00/10] erofs: Introduce page cache sharing feature Hongbo Li
` (7 preceding siblings ...)
2025-12-23 1:56 ` [PATCH v10 08/10] erofs: support unencoded inodes for page cache share Hongbo Li
@ 2025-12-23 1:56 ` Hongbo Li
2025-12-23 8:18 ` Gao Xiang
8 siblings, 1 reply; 22+ messages in thread
From: Hongbo Li @ 2025-12-23 1:56 UTC (permalink / raw)
To: hsiangkao, chao, brauner, djwong, amir73il, hch
Cc: linux-fsdevel, linux-erofs, linux-kernel, lihongbo22
From: Hongzhen Luo <hongzhen@linux.alibaba.com>
This patch adds page cache sharing functionality for compressed inodes.
Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
---
fs/erofs/zdata.c | 42 ++++++++++++++++++++++++++++++++----------
1 file changed, 32 insertions(+), 10 deletions(-)
diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
index 65da21504632..465918093984 100644
--- a/fs/erofs/zdata.c
+++ b/fs/erofs/zdata.c
@@ -493,7 +493,7 @@ enum z_erofs_pclustermode {
};
struct z_erofs_frontend {
- struct inode *const inode;
+ struct inode *inode;
struct erofs_map_blocks map;
struct z_erofs_bvec_iter biter;
@@ -1883,10 +1883,18 @@ static void z_erofs_pcluster_readmore(struct z_erofs_frontend *f,
static int z_erofs_read_folio(struct file *file, struct folio *folio)
{
- struct inode *const inode = folio->mapping->host;
- Z_EROFS_DEFINE_FRONTEND(f, inode, folio_pos(folio));
+ struct inode *const inode = folio->mapping->host, *realinode;
+ Z_EROFS_DEFINE_FRONTEND(f, NULL, folio_pos(folio));
int err;
+ if (erofs_is_ishare_inode(inode))
+ realinode = erofs_ishare_iget(inode);
+ else
+ realinode = inode;
+
+ if (!realinode)
+ return -EIO;
+ f.inode = realinode;
trace_erofs_read_folio(folio, false);
z_erofs_pcluster_readmore(&f, NULL, true);
err = z_erofs_scan_folio(&f, folio, false);
@@ -1896,23 +1904,34 @@ static int z_erofs_read_folio(struct file *file, struct folio *folio)
/* if some pclusters are ready, need submit them anyway */
err = z_erofs_runqueue(&f, 0) ?: err;
if (err && err != -EINTR)
- erofs_err(inode->i_sb, "read error %d @ %lu of nid %llu",
- err, folio->index, EROFS_I(inode)->nid);
+ erofs_err(realinode->i_sb, "read error %d @ %lu of nid %llu",
+ err, folio->index, EROFS_I(realinode)->nid);
erofs_put_metabuf(&f.map.buf);
erofs_release_pages(&f.pagepool);
+
+ if (erofs_is_ishare_inode(inode))
+ erofs_ishare_iput(realinode);
return err;
}
static void z_erofs_readahead(struct readahead_control *rac)
{
- struct inode *const inode = rac->mapping->host;
- Z_EROFS_DEFINE_FRONTEND(f, inode, readahead_pos(rac));
+ struct inode *const inode = rac->mapping->host, *realinode;
+ Z_EROFS_DEFINE_FRONTEND(f, NULL, readahead_pos(rac));
unsigned int nrpages = readahead_count(rac);
struct folio *head = NULL, *folio;
int err;
- trace_erofs_readahead(inode, readahead_index(rac), nrpages, false);
+ if (erofs_is_ishare_inode(inode))
+ realinode = erofs_ishare_iget(inode);
+ else
+ realinode = inode;
+
+ if (!realinode)
+ return;
+ f.inode = realinode;
+ trace_erofs_readahead(realinode, readahead_index(rac), nrpages, false);
z_erofs_pcluster_readmore(&f, rac, true);
while ((folio = readahead_folio(rac))) {
folio->private = head;
@@ -1926,8 +1945,8 @@ static void z_erofs_readahead(struct readahead_control *rac)
err = z_erofs_scan_folio(&f, folio, true);
if (err && err != -EINTR)
- erofs_err(inode->i_sb, "readahead error at folio %lu @ nid %llu",
- folio->index, EROFS_I(inode)->nid);
+ erofs_err(realinode->i_sb, "readahead error at folio %lu @ nid %llu",
+ folio->index, EROFS_I(realinode)->nid);
}
z_erofs_pcluster_readmore(&f, rac, false);
z_erofs_pcluster_end(&f);
@@ -1935,6 +1954,9 @@ static void z_erofs_readahead(struct readahead_control *rac)
(void)z_erofs_runqueue(&f, nrpages);
erofs_put_metabuf(&f.map.buf);
erofs_release_pages(&f.pagepool);
+
+ if (erofs_is_ishare_inode(inode))
+ erofs_ishare_iput(realinode);
}
const struct address_space_operations z_erofs_aops = {
--
2.22.0
* Re: [PATCH v10 01/10] iomap: stash iomap read ctx in the private field of iomap_iter
2025-12-23 1:56 ` [PATCH v10 01/10] iomap: stash iomap read ctx in the private field of iomap_iter Hongbo Li
@ 2025-12-23 2:32 ` Gao Xiang
0 siblings, 0 replies; 22+ messages in thread
From: Gao Xiang @ 2025-12-23 2:32 UTC (permalink / raw)
To: Hongbo Li, chao, brauner, djwong, amir73il, hch
Cc: linux-fsdevel, linux-erofs, linux-kernel
On 2025/12/23 09:56, Hongbo Li wrote:
> It's useful to get filesystem-specific information using the
> existing private field in the @iomap_iter passed to iomap_{begin,end}
> for advanced usage for iomap buffered reads, which is much like the
> current iomap DIO.
>
> For example, EROFS needs it to:
>
> - implement an efficient page cache sharing feature, since iomap
> needs to apply to anon inode page cache but we'd like to get the
> backing inode/fs instead, so filesystem-specific private data is
> needed to keep such information;
>
> - pass in both struct page * and void * for inline data to avoid
> kmap_to_page() usage (which is bogus).
>
> Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Thanks,
Gao Xiang
* Re: [PATCH v10 02/10] erofs: hold read context in iomap_iter if needed
2025-12-23 1:56 ` [PATCH v10 02/10] erofs: hold read context in iomap_iter if needed Hongbo Li
@ 2025-12-23 2:32 ` Gao Xiang
0 siblings, 0 replies; 22+ messages in thread
From: Gao Xiang @ 2025-12-23 2:32 UTC (permalink / raw)
To: Hongbo Li, chao, brauner, djwong, amir73il, hch
Cc: linux-fsdevel, linux-erofs, linux-kernel
On 2025/12/23 09:56, Hongbo Li wrote:
> Introduce `struct erofs_iomap_iter_ctx` to hold both `struct page *`
> and `void *base`, avoiding bogus use of `kmap_to_page()` in
> `erofs_iomap_end()`.
>
> With this change, fiemap and bmap no longer need to read inline data.
>
> Additionally, the upcoming page cache sharing mechanism requires
> passing the backing inode pointer to `erofs_iomap_{begin,end}()`, as
> I/O accesses must apply to backing inodes rather than anon inodes.
>
> Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Thanks,
Gao Xiang
* Re: [PATCH v10 05/10] erofs: support user-defined fingerprint name
2025-12-23 1:56 ` [PATCH v10 05/10] erofs: support user-defined fingerprint name Hongbo Li
@ 2025-12-23 7:22 ` Gao Xiang
0 siblings, 0 replies; 22+ messages in thread
From: Gao Xiang @ 2025-12-23 7:22 UTC (permalink / raw)
To: Hongbo Li
Cc: linux-fsdevel, linux-erofs, linux-kernel, Chao Yu,
Christian Brauner, Darrick J. Wong, Amir Goldstein,
Christoph Hellwig
On 2025/12/23 09:56, Hongbo Li wrote:
> From: Hongzhen Luo <hongzhen@linux.alibaba.com>
>
> When creating the EROFS image, users can specify the fingerprint name.
> This is to prepare for the upcoming inode page cache share.
>
> Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
> Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
> ---
> fs/erofs/Kconfig | 9 +++++++++
> fs/erofs/erofs_fs.h | 5 +++--
> fs/erofs/internal.h | 2 ++
> fs/erofs/super.c | 3 +++
> fs/erofs/xattr.c | 13 +++++++++++++
> 5 files changed, 30 insertions(+), 2 deletions(-)
>
> diff --git a/fs/erofs/Kconfig b/fs/erofs/Kconfig
> index d81f3318417d..c88b6d0714a4 100644
> --- a/fs/erofs/Kconfig
> +++ b/fs/erofs/Kconfig
> @@ -194,3 +194,12 @@ config EROFS_FS_PCPU_KTHREAD_HIPRI
> at higher priority.
>
> If unsure, say N.
> +
> +config EROFS_FS_PAGE_CACHE_SHARE
> + bool "EROFS page cache share support (experimental)"
> + depends on EROFS_FS && EROFS_FS_XATTR && !EROFS_FS_ONDEMAND
> + help
> + This enables page cache sharing among inodes with identical
> + content fingerprints on the same device.
`on the same device` seems ambiguous because of the word `device`;
maybe just say `on the same machine`.
> +
> + If unsure, say N.
> diff --git a/fs/erofs/erofs_fs.h b/fs/erofs/erofs_fs.h
> index e24268acdd62..20515d2462af 100644
> --- a/fs/erofs/erofs_fs.h
> +++ b/fs/erofs/erofs_fs.h
> @@ -17,7 +17,7 @@
> #define EROFS_FEATURE_COMPAT_XATTR_FILTER 0x00000004
> #define EROFS_FEATURE_COMPAT_SHARED_EA_IN_METABOX 0x00000008
> #define EROFS_FEATURE_COMPAT_PLAIN_XATTR_PFX 0x00000010
> -
> +#define EROFS_FEATURE_COMPAT_ISHARE_XATTRS 0x00000020
>
> /*
> * Any bits that aren't in EROFS_ALL_FEATURE_INCOMPAT should
> @@ -83,7 +83,8 @@ struct erofs_super_block {
> __le32 xattr_prefix_start; /* start of long xattr prefixes */
> __le64 packed_nid; /* nid of the special packed inode */
> __u8 xattr_filter_reserved; /* reserved for xattr name filter */
> - __u8 reserved[3];
> + __u8 ishare_xattr_prefix_id; /* indexes the ishare key in prefix xattrs */
> + __u8 reserved[2];
> __le32 build_time; /* seconds added to epoch for mkfs time */
> __le64 rootnid_8b; /* (48BIT on) nid of root directory */
> __le64 reserved2;
> diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
> index 98fe652aea33..99e2857173c3 100644
> --- a/fs/erofs/internal.h
> +++ b/fs/erofs/internal.h
> @@ -134,6 +134,7 @@ struct erofs_sb_info {
> u32 xattr_blkaddr;
> u32 xattr_prefix_start;
> u8 xattr_prefix_count;
> + u8 ishare_xattr_pfx;
> struct erofs_xattr_prefix_item *xattr_prefixes;
> unsigned int xattr_filter_reserved;
> #endif
> @@ -238,6 +239,7 @@ EROFS_FEATURE_FUNCS(sb_chksum, compat, COMPAT_SB_CHKSUM)
> EROFS_FEATURE_FUNCS(xattr_filter, compat, COMPAT_XATTR_FILTER)
> EROFS_FEATURE_FUNCS(shared_ea_in_metabox, compat, COMPAT_SHARED_EA_IN_METABOX)
> EROFS_FEATURE_FUNCS(plain_xattr_pfx, compat, COMPAT_PLAIN_XATTR_PFX)
> +EROFS_FEATURE_FUNCS(ishare_xattrs, compat, COMPAT_ISHARE_XATTRS)
>
> static inline u64 erofs_nid_to_ino64(struct erofs_sb_info *sbi, erofs_nid_t nid)
> {
> diff --git a/fs/erofs/super.c b/fs/erofs/super.c
> index 2a44c4e5af4f..68480f10e69d 100644
> --- a/fs/erofs/super.c
> +++ b/fs/erofs/super.c
> @@ -298,6 +298,9 @@ static int erofs_read_superblock(struct super_block *sb)
> if (ret)
> goto out;
> }
> + if (erofs_sb_has_ishare_xattrs(sbi))
> + sbi->ishare_xattr_pfx =
> + dsb->ishare_xattr_prefix_id & EROFS_XATTR_LONG_PREFIX_MASK;
Is it possible to move this one into erofs_xattr_prefixes_init()?
We could pass dsb into erofs_xattr_prefixes_init() too.
>
> ret = -EINVAL;
> sbi->feature_incompat = le32_to_cpu(dsb->feature_incompat);
> diff --git a/fs/erofs/xattr.c b/fs/erofs/xattr.c
> index 396536d9a862..969e77efd038 100644
> --- a/fs/erofs/xattr.c
> +++ b/fs/erofs/xattr.c
> @@ -519,6 +519,19 @@ int erofs_xattr_prefixes_init(struct super_block *sb)
> }
>
> erofs_put_metabuf(&buf);
> + if (!ret && erofs_sb_has_ishare_xattrs(sbi)) {
> + struct erofs_xattr_prefix_item *pf = pfs + sbi->ishare_xattr_pfx;
> + struct erofs_xattr_long_prefix *newpfx;
then:
sbi->ishare_xattr_pfx =
dsb->ishare_xattr_prefix_id & EROFS_XATTR_LONG_PREFIX_MASK;
...
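As a rough userspace illustration of the masking step (not kernel code): the on-disk byte is reduced to a table index by clearing the long-prefix flag bit. The macro values, the helper name `ishare_prefix_index()` and the range check against the prefix count are assumptions made for this sketch and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for this sketch: top bit flags a long prefix, low 7 bits index it. */
#define LONG_PREFIX_FLAG 0x80u
#define LONG_PREFIX_MASK 0x7fu

/*
 * Turn the one-byte on-disk prefix id into an index into the prefix table.
 * Returns -1 when the index is out of range; this range check is an
 * illustrative addition and is not part of the reviewed patch.
 */
static int ishare_prefix_index(uint8_t ondisk_id, unsigned int prefix_count)
{
	unsigned int idx = ondisk_id & LONG_PREFIX_MASK;

	return idx < prefix_count ? (int)idx : -1;
}
```

With these assumed values, an on-disk id of 0x82 and a plain 0x02 would both select entry 2 of the table.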
Thanks,
Gao Xiang
* Re: [PATCH v10 06/10] erofs: support domain-specific page cache share
2025-12-23 1:56 ` [PATCH v10 06/10] erofs: support domain-specific page cache share Hongbo Li
@ 2025-12-23 7:25 ` Gao Xiang
0 siblings, 0 replies; 22+ messages in thread
From: Gao Xiang @ 2025-12-23 7:25 UTC (permalink / raw)
To: Hongbo Li
Cc: linux-fsdevel, linux-erofs, linux-kernel, Chao Yu,
Christian Brauner, Darrick J. Wong, Amir Goldstein,
Christoph Hellwig
On 2025/12/23 09:56, Hongbo Li wrote:
> From: Hongzhen Luo <hongzhen@linux.alibaba.com>
>
> Only files in the same domain will share the page cache. Also modify
> the sysfs related content in preparation for the upcoming page cache
> share feature.
>
> Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
> Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
> ---
> fs/erofs/super.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/fs/erofs/super.c b/fs/erofs/super.c
> index 68480f10e69d..be9f96252c6c 100644
> --- a/fs/erofs/super.c
> +++ b/fs/erofs/super.c
> @@ -518,6 +518,8 @@ static int erofs_fc_parse_param(struct fs_context *fc,
> if (!sbi->fsid)
> return -ENOMEM;
> break;
> +#endif
> +#if defined(CONFIG_EROFS_FS_ONDEMAND) || defined(CONFIG_EROFS_FS_PAGE_CACHE_SHARE)
> case Opt_domain_id:
> kfree(sbi->domain_id);
> sbi->domain_id = kstrdup(param->string, GFP_KERNEL);
> @@ -618,7 +620,7 @@ static void erofs_set_sysfs_name(struct super_block *sb)
> {
> struct erofs_sb_info *sbi = EROFS_SB(sb);
>
> - if (sbi->domain_id)
> + if (sbi->domain_id && !erofs_sb_has_ishare_xattrs(sbi))
> super_set_sysfs_name_generic(sb, "%s,%s", sbi->domain_id,
> sbi->fsid);
> else if (sbi->fsid)
> @@ -1052,6 +1054,8 @@ static int erofs_show_options(struct seq_file *seq, struct dentry *root)
> #ifdef CONFIG_EROFS_FS_ONDEMAND
here.
> if (sbi->fsid)
> seq_printf(seq, ",fsid=%s", sbi->fsid);
> +#endif
> +#if defined(CONFIG_EROFS_FS_ONDEMAND) || defined(CONFIG_EROFS_FS_PAGE_CACHE_SHARE)
I think we could just kill these `#if` entirely since
`sbi->domain_id` and `sbi->fsid` are defined
unconditionally.
Otherwise it looks good to me:
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Thanks,
Gao Xiang
* Re: [PATCH v10 07/10] erofs: introduce the page cache share feature
2025-12-23 1:56 ` [PATCH v10 07/10] erofs: introduce the page cache share feature Hongbo Li
@ 2025-12-23 8:11 ` Gao Xiang
0 siblings, 0 replies; 22+ messages in thread
From: Gao Xiang @ 2025-12-23 8:11 UTC (permalink / raw)
To: Hongbo Li
Cc: linux-fsdevel, linux-erofs, linux-kernel, Chao Yu,
Christian Brauner, Darrick J. Wong, Amir Goldstein,
Christoph Hellwig
On 2025/12/23 09:56, Hongbo Li wrote:
> From: Hongzhen Luo <hongzhen@linux.alibaba.com>
>
> Currently, reading files with different paths (or names) but the same
> content will consume multiple copies of the page cache, even if the
> content of these page caches is the same. For example, reading
> identical files (e.g., *.so files) from two different minor versions of
> container images will cost multiple copies of the same page cache,
> since different containers have different mount points. Therefore,
> sharing the page cache for files with the same content can save memory.
>
> This introduces the page cache share feature in erofs. It allocates a
> deduplicated inode and uses its page cache as the shared one. Reads for files
> with identical content will ultimately be routed to the page cache of
> the deduplicated inode. In this way, a single page cache satisfies
> multiple read requests for different files with the same contents.
>
> Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
> Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
> ---
> fs/erofs/Makefile | 1 +
> fs/erofs/internal.h | 29 ++++++
> fs/erofs/ishare.c | 211 ++++++++++++++++++++++++++++++++++++++++++++
> fs/erofs/super.c | 34 ++++++-
> 4 files changed, 272 insertions(+), 3 deletions(-)
> create mode 100644 fs/erofs/ishare.c
>
> diff --git a/fs/erofs/Makefile b/fs/erofs/Makefile
> index 549abc424763..a80e1762b607 100644
> --- a/fs/erofs/Makefile
> +++ b/fs/erofs/Makefile
> @@ -10,3 +10,4 @@ erofs-$(CONFIG_EROFS_FS_ZIP_ZSTD) += decompressor_zstd.o
> erofs-$(CONFIG_EROFS_FS_ZIP_ACCEL) += decompressor_crypto.o
> erofs-$(CONFIG_EROFS_FS_BACKED_BY_FILE) += fileio.o
> erofs-$(CONFIG_EROFS_FS_ONDEMAND) += fscache.o
> +erofs-$(CONFIG_EROFS_FS_PAGE_CACHE_SHARE) += ishare.o
> diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
> index 99e2857173c3..ae9560434324 100644
> --- a/fs/erofs/internal.h
> +++ b/fs/erofs/internal.h
> @@ -304,6 +304,22 @@ struct erofs_inode {
> };
> #endif /* CONFIG_EROFS_FS_ZIP */
> };
> +#ifdef CONFIG_EROFS_FS_PAGE_CACHE_SHARE
> + union {
> + /* internal dedup inode */
> + struct {
> + char *fingerprint;
> + spinlock_t lock;
> + /* all backing inodes */
> + struct list_head backing_head;
> + };
> +
> + struct {
> + struct inode *ishare;
> + struct list_head backing_link;
> + };
I think it would be better to rework this as below:
struct erofs_inode_fingerprint {
u8 *opaque;
int size;
};
struct list_head ishare_list;
union {
struct {
struct erofs_inode_fingerprint fingerprint;
spinlock_t ishare_lock;
};
struct inode *realinode;
};
> + };
> +#endif
> /* the corresponding vfs inode */
> struct inode vfs_inode;
> };
> @@ -410,6 +426,7 @@ extern const struct inode_operations erofs_dir_iops;
>
> extern const struct file_operations erofs_file_fops;
> extern const struct file_operations erofs_dir_fops;
> +extern const struct file_operations erofs_ishare_fops;
>
> extern const struct iomap_ops z_erofs_iomap_report_ops;
>
> @@ -541,6 +558,18 @@ static inline struct bio *erofs_fscache_bio_alloc(struct erofs_map_dev *mdev) {
> static inline void erofs_fscache_submit_bio(struct bio *bio) {}
> #endif
>
> +#ifdef CONFIG_EROFS_FS_PAGE_CACHE_SHARE
> +int __init erofs_init_ishare(void);
> +void erofs_exit_ishare(void);
> +bool erofs_ishare_fill_inode(struct inode *inode);
> +void erofs_ishare_free_inode(struct inode *inode);
> +#else
> +static inline int erofs_init_ishare(void) { return 0; }
> +static inline void erofs_exit_ishare(void) {}
> +static inline bool erofs_ishare_fill_inode(struct inode *inode) { return false; }
> +static inline void erofs_ishare_free_inode(struct inode *inode) {}
> +#endif // CONFIG_EROFS_FS_PAGE_CACHE_SHARE
> +
> long erofs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg);
> long erofs_compat_ioctl(struct file *filp, unsigned int cmd,
> unsigned long arg);
> diff --git a/fs/erofs/ishare.c b/fs/erofs/ishare.c
> new file mode 100644
> index 000000000000..4b46016bcd03
> --- /dev/null
> +++ b/fs/erofs/ishare.c
> @@ -0,0 +1,211 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * Copyright (C) 2024, Alibaba Cloud
> + */
> +#include <linux/xxhash.h>
> +#include <linux/refcount.h>
> +#include <linux/mount.h>
> +#include <linux/mutex.h>
> +#include <linux/ramfs.h>
> +#include "internal.h"
> +#include "xattr.h"
> +
> +#include "../internal.h"
> +
> +static struct vfsmount *erofs_ishare_mnt;
> +
> +static int erofs_ishare_iget5_eq(struct inode *inode, void *data)
> +{
> + struct erofs_inode *vi = EROFS_I(inode);
struct erofs_inode_fingerprint *fp1 = &EROFS_I(inode)->fingerprint;
struct erofs_inode_fingerprint *fp2 = data;
return fp1->size == fp2->size &&
!memcmp(fp1->opaque, fp2->opaque, fp2->size);
> +
> + return vi->fingerprint && memcmp(vi->fingerprint, data,
> + sizeof(size_t) + *(size_t *)data) == 0;
> +}
> +
> +static int erofs_ishare_iget5_set(struct inode *inode, void *data)
> +{
> + struct erofs_inode *vi = EROFS_I(inode);
> +
> + vi->fingerprint = data;
vi->fingerprint = *(struct erofs_inode_fingerprint *)data;
> + INIT_LIST_HEAD(&vi->backing_head);
> + spin_lock_init(&vi->lock);
> + return 0;
> +}
> +
> +bool erofs_ishare_fill_inode(struct inode *inode)
> +{
> + struct erofs_sb_info *sbi = EROFS_SB(inode->i_sb);
> + struct erofs_xattr_prefix_item *ishare_prefix;
Just calling it
struct erofs_xattr_prefix_item *prefix;
is fine, since it's unambiguous.
> + struct erofs_inode *vi = EROFS_I(inode);
> + struct inode *idedup;
> + /*
> + * fingerprint layout:
> + * fingerprint length + fingerprint content (xattr_value + domain_id)
> + */
That is too hard to follow; just convert it to what I mentioned above:
struct erofs_inode_fingerprint fp;
> + char *ishare_key, *fingerprint;
char *infix;
> + ssize_t ishare_vlen;
size_t valuelen;
> + unsigned long hash;
> + int key_idx;
int base_index;
> +
> + if (!sbi->domain_id || !erofs_sb_has_ishare_xattrs(sbi))
> + return false;
> +
> + ishare_prefix = sbi->xattr_prefixes + sbi->ishare_xattr_pfx;
> + ishare_key = ishare_prefix->prefix->infix;
> + key_idx = ishare_prefix->prefix->base_index;
> + ishare_vlen = erofs_getxattr(inode, key_idx, ishare_key, NULL, 0);
> + if (ishare_vlen <= 0 || ishare_vlen > (1 << sbi->blkszbits))
> + return false;
> +
Then:
fp.size = valuelen + strlen(sbi->domain_id);
fp.opaque = kmalloc(fp.size, GFP_KERNEL);
And fix the remaining logic.
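A minimal userspace sketch of the size-plus-opaque-buffer idea suggested above, assuming the fingerprint is simply the xattr value followed by the domain id bytes. The names `struct fingerprint`, `fingerprint_init()`, `fingerprint_equal()` and `fingerprint_free()` are hypothetical; `malloc()` stands in for `kmalloc()`, and `size_t` is used for the length where the suggestion uses `int`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-in for the suggested struct erofs_inode_fingerprint. */
struct fingerprint {
	unsigned char *opaque;	/* xattr value followed by domain_id bytes */
	size_t size;		/* number of valid bytes in opaque */
};

/*
 * Build a fingerprint from an xattr value and a domain id (hypothetical
 * helper). Returns 0 on success, -1 on allocation failure.
 */
static int fingerprint_init(struct fingerprint *fp, const void *value,
			    size_t valuelen, const char *domain_id)
{
	size_t domlen = strlen(domain_id);

	fp->size = valuelen + domlen;
	fp->opaque = malloc(fp->size ? fp->size : 1);
	if (!fp->opaque)
		return -1;
	memcpy(fp->opaque, value, valuelen);
	memcpy(fp->opaque + valuelen, domain_id, domlen);
	return 0;
}

/* Same shape as the suggested "eq" callback: compare sizes, then bytes. */
static bool fingerprint_equal(const struct fingerprint *a,
			      const struct fingerprint *b)
{
	return a->size == b->size && !memcmp(a->opaque, b->opaque, b->size);
}

/* Release the buffer and reset the fingerprint to an empty state. */
static void fingerprint_free(struct fingerprint *fp)
{
	free(fp->opaque);
	fp->opaque = NULL;
	fp->size = 0;
}
```

Keeping the length in its own field means the comparison no longer has to read a length prefix out of the buffer, which is the part of the original layout that was hard to follow.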
Thanks,
Gao Xiang
* Re: [PATCH v10 08/10] erofs: support unencoded inodes for page cache share
2025-12-23 1:56 ` [PATCH v10 08/10] erofs: support unencoded inodes for page cache share Hongbo Li
@ 2025-12-23 8:15 ` Gao Xiang
2025-12-23 8:34 ` Gao Xiang
0 siblings, 1 reply; 22+ messages in thread
From: Gao Xiang @ 2025-12-23 8:15 UTC (permalink / raw)
To: Hongbo Li
Cc: linux-fsdevel, linux-erofs, linux-kernel, Chao Yu,
Christian Brauner, Darrick J. Wong, Amir Goldstein,
Christoph Hellwig
On 2025/12/23 09:56, Hongbo Li wrote:
> This patch adds inode page cache sharing functionality for unencoded
> files.
>
> I conducted experiments in the container environment. Below is the
> memory usage for reading all files in two different minor versions
> of container images:
>
> +-------------------+------------------+-------------+---------------+
> | Image | Page Cache Share | Memory (MB) | Memory |
> | | | | Reduction (%) |
> +-------------------+------------------+-------------+---------------+
> | | No | 241 | - |
> | redis +------------------+-------------+---------------+
> | 7.2.4 & 7.2.5 | Yes | 163 | 33% |
> +-------------------+------------------+-------------+---------------+
> | | No | 872 | - |
> | postgres +------------------+-------------+---------------+
> | 16.1 & 16.2 | Yes | 630 | 28% |
> +-------------------+------------------+-------------+---------------+
> | | No | 2771 | - |
> | tensorflow +------------------+-------------+---------------+
> | 2.11.0 & 2.11.1 | Yes | 2340 | 16% |
> +-------------------+------------------+-------------+---------------+
> | | No | 926 | - |
> | mysql +------------------+-------------+---------------+
> | 8.0.11 & 8.0.12 | Yes | 735 | 21% |
> +-------------------+------------------+-------------+---------------+
> | | No | 390 | - |
> | nginx +------------------+-------------+---------------+
> | 7.2.4 & 7.2.5 | Yes | 219 | 44% |
> +-------------------+------------------+-------------+---------------+
> | tomcat | No | 924 | - |
> | 10.1.25 & 10.1.26 +------------------+-------------+---------------+
> | | Yes | 474 | 49% |
> +-------------------+------------------+-------------+---------------+
>
> Additionally, the table below shows the runtime memory usage of the
> container:
>
> +-------------------+------------------+-------------+---------------+
> | Image | Page Cache Share | Memory (MB) | Memory |
> | | | | Reduction (%) |
> +-------------------+------------------+-------------+---------------+
> | | No | 35 | - |
> | redis +------------------+-------------+---------------+
> | 7.2.4 & 7.2.5 | Yes | 28 | 20% |
> +-------------------+------------------+-------------+---------------+
> | | No | 149 | - |
> | postgres +------------------+-------------+---------------+
> | 16.1 & 16.2 | Yes | 95 | 37% |
> +-------------------+------------------+-------------+---------------+
> | | No | 1028 | - |
> | tensorflow +------------------+-------------+---------------+
> | 2.11.0 & 2.11.1 | Yes | 930 | 10% |
> +-------------------+------------------+-------------+---------------+
> | | No | 155 | - |
> | mysql +------------------+-------------+---------------+
> | 8.0.11 & 8.0.12 | Yes | 132 | 15% |
> +-------------------+------------------+-------------+---------------+
> | | No | 25 | - |
> | nginx +------------------+-------------+---------------+
> | 7.2.4 & 7.2.5 | Yes | 20 | 20% |
> +-------------------+------------------+-------------+---------------+
> | tomcat | No | 186 | - |
> | 10.1.25 & 10.1.26 +------------------+-------------+---------------+
> | | Yes | 98 | 48% |
> +-------------------+------------------+-------------+---------------+
>
> Co-developed-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
> Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
> Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
> ---
...
> index 4b46016bcd03..269b53b3ed79 100644
> --- a/fs/erofs/ishare.c
> +++ b/fs/erofs/ishare.c
> @@ -197,6 +197,37 @@ const struct file_operations erofs_ishare_fops = {
> .splice_read = filemap_splice_read,
> };
>
> +/*
> + * erofs_ishare_iget - find the backing inode.
> + */
> +struct inode *erofs_ishare_iget(struct inode *inode)
Just:
struct inode *erofs_get_real_inode(struct inode *inode)
`ishare_` prefix seems useless here.
> +{
> + struct erofs_inode *vi, *vi_dedup;
> + struct inode *realinode;
> +
> + if (!erofs_is_ishare_inode(inode))
> + return igrab(inode);
> +
> + vi_dedup = EROFS_I(inode);
> + spin_lock(&vi_dedup->lock);
> + /* fall back to all backing inodes */
> + DBG_BUGON(list_empty(&vi_dedup->backing_head));
> + list_for_each_entry(vi, &vi_dedup->backing_head, backing_link) {
> + realinode = igrab(&vi->vfs_inode);
> + if (realinode)
> + break;
> + }
> + spin_unlock(&vi_dedup->lock);
> +
> + DBG_BUGON(!realinode);
> + return realinode;
> +}
> +
> +void erofs_ishare_iput(struct inode *realinode)
Just:
erofs_put_real_inode().
Thanks,
Gao Xiang
* Re: [PATCH v10 09/10] erofs: support compressed inodes for page cache share
2025-12-23 1:56 ` [PATCH v10 09/10] erofs: support compressed " Hongbo Li
@ 2025-12-23 8:18 ` Gao Xiang
0 siblings, 0 replies; 22+ messages in thread
From: Gao Xiang @ 2025-12-23 8:18 UTC (permalink / raw)
To: Hongbo Li
Cc: linux-fsdevel, linux-erofs, linux-kernel, Chao Yu,
Christian Brauner, djwong, Amir Goldstein, Christoph Hellwig
On 2025/12/23 09:56, Hongbo Li wrote:
> From: Hongzhen Luo <hongzhen@linux.alibaba.com>
>
> This patch adds page cache sharing functionality for compressed inodes.
>
> Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
> Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
> ---
> fs/erofs/zdata.c | 42 ++++++++++++++++++++++++++++++++----------
> 1 file changed, 32 insertions(+), 10 deletions(-)
>
> diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
> index 65da21504632..465918093984 100644
> --- a/fs/erofs/zdata.c
> +++ b/fs/erofs/zdata.c
> @@ -493,7 +493,7 @@ enum z_erofs_pclustermode {
> };
>
> struct z_erofs_frontend {
> - struct inode *const inode;
> + struct inode *inode;
> struct erofs_map_blocks map;
> struct z_erofs_bvec_iter biter;
>
> @@ -1883,10 +1883,18 @@ static void z_erofs_pcluster_readmore(struct z_erofs_frontend *f,
>
> static int z_erofs_read_folio(struct file *file, struct folio *folio)
> {
> - struct inode *const inode = folio->mapping->host;
> - Z_EROFS_DEFINE_FRONTEND(f, inode, folio_pos(folio));
> + struct inode *const inode = folio->mapping->host, *realinode;
> + Z_EROFS_DEFINE_FRONTEND(f, NULL, folio_pos(folio));
> int err;
>
> + if (erofs_is_ishare_inode(inode))
> + realinode = erofs_ishare_iget(inode);
> + else
> + realinode = inode;
I don't think it makes any sense to distinguish those two cases, just
struct inode *inode = folio->mapping->host;
struct inode *realinode = erofs_get_real_inode(inode);
Z_EROFS_DEFINE_FRONTEND(f, realinode, folio_pos(folio));
...
> +
> + if (!realinode)
> + return -EIO;
That is an impossible case, just `DBG_BUGON(!realinode);`
> + f.inode = realinode;
> trace_erofs_read_folio(folio, false);
> z_erofs_pcluster_readmore(&f, NULL, true);
> err = z_erofs_scan_folio(&f, folio, false);
> @@ -1896,23 +1904,34 @@ static int z_erofs_read_folio(struct file *file, struct folio *folio)
> /* if some pclusters are ready, need submit them anyway */
> err = z_erofs_runqueue(&f, 0) ?: err;
> if (err && err != -EINTR)
> - erofs_err(inode->i_sb, "read error %d @ %lu of nid %llu",
> - err, folio->index, EROFS_I(inode)->nid);
> + erofs_err(realinode->i_sb, "read error %d @ %lu of nid %llu",
> + err, folio->index, EROFS_I(realinode)->nid);
>
> erofs_put_metabuf(&f.map.buf);
> erofs_release_pages(&f.pagepool);
> +
> + if (erofs_is_ishare_inode(inode))
> + erofs_ishare_iput(realinode);
erofs_put_real_inode(realinode);
> return err;
> }
>
> static void z_erofs_readahead(struct readahead_control *rac)
> {
> - struct inode *const inode = rac->mapping->host;
> - Z_EROFS_DEFINE_FRONTEND(f, inode, readahead_pos(rac));
> + struct inode *const inode = rac->mapping->host, *realinode;
> + Z_EROFS_DEFINE_FRONTEND(f, NULL, readahead_pos(rac));
> unsigned int nrpages = readahead_count(rac);
> struct folio *head = NULL, *folio;
> int err;
>
> - trace_erofs_readahead(inode, readahead_index(rac), nrpages, false);
> + if (erofs_is_ishare_inode(inode))
> + realinode = erofs_ishare_iget(inode);
> + else
> + realinode = inode;
> +
> + if (!realinode)
> + return;
> + f.inode = realinode;
Same here.
Thanks,
Gao Xiang
* Re: [PATCH v10 04/10] erofs: move `struct erofs_anon_fs_type` to super.c
2025-12-23 1:56 ` [PATCH v10 04/10] erofs: move `struct erofs_anon_fs_type` to super.c Hongbo Li
@ 2025-12-23 8:30 ` Gao Xiang
2025-12-23 9:28 ` Hongbo Li
0 siblings, 1 reply; 22+ messages in thread
From: Gao Xiang @ 2025-12-23 8:30 UTC (permalink / raw)
To: Hongbo Li
Cc: linux-fsdevel, linux-erofs, linux-kernel, Chao Yu,
Christian Brauner, Darrick J. Wong, Amir Goldstein,
Christoph Hellwig
On 2025/12/23 09:56, Hongbo Li wrote:
> From: Hongzhen Luo <hongzhen@linux.alibaba.com>
>
> Move the `struct erofs_anon_fs_type` to the super.c and
> expose it in preparation for the upcoming page cache share
> feature.
>
> Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
> Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Can you just replace this one with the following patch?
From: Gao Xiang <hsiangkao@linux.alibaba.com>
Date: Tue, 23 Dec 2025 16:27:17 +0800
Subject: [PATCH] erofs: decouple `struct erofs_anon_fs_type`
- Move the `struct erofs_anon_fs_type` to super.c and expose it
in preparation for the upcoming page cache share feature;
- Remove the `.owner` field, as they are all internal mounts and
fully managed by EROFS. Retaining `.owner` would unnecessarily
increment module reference counts, preventing the EROFS kernel
module from being unloaded.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
---
fs/erofs/fscache.c | 13 -------------
fs/erofs/internal.h | 2 ++
fs/erofs/super.c | 14 ++++++++++++++
3 files changed, 16 insertions(+), 13 deletions(-)
diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c
index 7a346e20f7b7..f4937b025038 100644
--- a/fs/erofs/fscache.c
+++ b/fs/erofs/fscache.c
@@ -3,7 +3,6 @@
* Copyright (C) 2022, Alibaba Cloud
* Copyright (C) 2022, Bytedance Inc. All rights reserved.
*/
-#include <linux/pseudo_fs.h>
#include <linux/fscache.h>
#include "internal.h"
@@ -13,18 +12,6 @@ static LIST_HEAD(erofs_domain_list);
static LIST_HEAD(erofs_domain_cookies_list);
static struct vfsmount *erofs_pseudo_mnt;
-static int erofs_anon_init_fs_context(struct fs_context *fc)
-{
- return init_pseudo(fc, EROFS_SUPER_MAGIC) ? 0 : -ENOMEM;
-}
-
-static struct file_system_type erofs_anon_fs_type = {
- .owner = THIS_MODULE,
- .name = "pseudo_erofs",
- .init_fs_context = erofs_anon_init_fs_context,
- .kill_sb = kill_anon_super,
-};
-
struct erofs_fscache_io {
struct netfs_cache_resources cres;
struct iov_iter iter;
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index f7f622836198..98fe652aea33 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -188,6 +188,8 @@ static inline bool erofs_is_fileio_mode(struct erofs_sb_info *sbi)
return IS_ENABLED(CONFIG_EROFS_FS_BACKED_BY_FILE) && sbi->dif0.file;
}
+extern struct file_system_type erofs_anon_fs_type;
+
static inline bool erofs_is_fscache_mode(struct super_block *sb)
{
return IS_ENABLED(CONFIG_EROFS_FS_ONDEMAND) &&
diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index 937a215f626c..f18f43b78fca 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -11,6 +11,7 @@
#include <linux/fs_parser.h>
#include <linux/exportfs.h>
#include <linux/backing-dev.h>
+#include <linux/pseudo_fs.h>
#include "xattr.h"
#define CREATE_TRACE_POINTS
@@ -936,6 +937,19 @@ static struct file_system_type erofs_fs_type = {
};
MODULE_ALIAS_FS("erofs");
+#if defined(CONFIG_EROFS_FS_ONDEMAND)
+static int erofs_anon_init_fs_context(struct fs_context *fc)
+{
+ return init_pseudo(fc, EROFS_SUPER_MAGIC) ? 0 : -ENOMEM;
+}
+
+struct file_system_type erofs_anon_fs_type = {
+ .name = "pseudo_erofs",
+ .init_fs_context = erofs_anon_init_fs_context,
+ .kill_sb = kill_anon_super,
+};
+#endif
+
static int __init erofs_module_init(void)
{
int err;
--
2.43.5
* Re: [PATCH v10 03/10] fs: Export alloc_empty_backing_file
2025-12-23 1:56 ` [PATCH v10 03/10] fs: Export alloc_empty_backing_file Hongbo Li
@ 2025-12-23 8:31 ` Gao Xiang
2025-12-23 12:40 ` Amir Goldstein
0 siblings, 1 reply; 22+ messages in thread
From: Gao Xiang @ 2025-12-23 8:31 UTC (permalink / raw)
To: Hongbo Li, Amir Goldstein
Cc: linux-fsdevel, linux-erofs, linux-kernel, Chao Yu,
Christian Brauner, Darrick J. Wong, Christoph Hellwig
On 2025/12/23 09:56, Hongbo Li wrote:
> There is no need to open nonexistent real files if backing files
> couldn't be backed by real files (e.g., EROFS page cache sharing
> doesn't need typical real files to open again).
>
> Therefore, we export the alloc_empty_backing_file() helper, allowing
> filesystems to dynamically set the backing file without real file
> open. This is particularly useful for obtaining the correct @path
> and @inode when calling file_user_path() and file_user_inode().
>
> Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
(I hope Amir could ack this particular patch too..)
Thanks,
Gao Xiang
* Re: [PATCH v10 08/10] erofs: support unencoded inodes for page cache share
2025-12-23 8:15 ` Gao Xiang
@ 2025-12-23 8:34 ` Gao Xiang
0 siblings, 0 replies; 22+ messages in thread
From: Gao Xiang @ 2025-12-23 8:34 UTC (permalink / raw)
To: Hongbo Li
Cc: linux-fsdevel, linux-erofs, linux-kernel, Chao Yu,
Christian Brauner, Darrick J. Wong, Amir Goldstein,
Christoph Hellwig
On 2025/12/23 16:15, Gao Xiang wrote:
>
>
> On 2025/12/23 09:56, Hongbo Li wrote:
>> This patch adds inode page cache sharing functionality for unencoded
>> files.
>>
>> I conducted experiments in the container environment. Below is the
>> memory usage for reading all files in two different minor versions
>> of container images:
>>
>> +-------------------+------------------+-------------+---------------+
>> | Image | Page Cache Share | Memory (MB) | Memory |
>> | | | | Reduction (%) |
>> +-------------------+------------------+-------------+---------------+
>> | | No | 241 | - |
>> | redis +------------------+-------------+---------------+
>> | 7.2.4 & 7.2.5 | Yes | 163 | 33% |
>> +-------------------+------------------+-------------+---------------+
>> | | No | 872 | - |
>> | postgres +------------------+-------------+---------------+
>> | 16.1 & 16.2 | Yes | 630 | 28% |
>> +-------------------+------------------+-------------+---------------+
>> | | No | 2771 | - |
>> | tensorflow +------------------+-------------+---------------+
>> | 2.11.0 & 2.11.1 | Yes | 2340 | 16% |
>> +-------------------+------------------+-------------+---------------+
>> | | No | 926 | - |
>> | mysql +------------------+-------------+---------------+
>> | 8.0.11 & 8.0.12 | Yes | 735 | 21% |
>> +-------------------+------------------+-------------+---------------+
>> | | No | 390 | - |
>> | nginx +------------------+-------------+---------------+
>> | 7.2.4 & 7.2.5 | Yes | 219 | 44% |
>> +-------------------+------------------+-------------+---------------+
>> | tomcat | No | 924 | - |
>> | 10.1.25 & 10.1.26 +------------------+-------------+---------------+
>> | | Yes | 474 | 49% |
>> +-------------------+------------------+-------------+---------------+
>>
>> Additionally, the table below shows the runtime memory usage of the
>> container:
>>
>> +-------------------+------------------+-------------+---------------+
>> | Image | Page Cache Share | Memory (MB) | Memory |
>> | | | | Reduction (%) |
>> +-------------------+------------------+-------------+---------------+
>> | | No | 35 | - |
>> | redis +------------------+-------------+---------------+
>> | 7.2.4 & 7.2.5 | Yes | 28 | 20% |
>> +-------------------+------------------+-------------+---------------+
>> | | No | 149 | - |
>> | postgres +------------------+-------------+---------------+
>> | 16.1 & 16.2 | Yes | 95 | 37% |
>> +-------------------+------------------+-------------+---------------+
>> | | No | 1028 | - |
>> | tensorflow +------------------+-------------+---------------+
>> | 2.11.0 & 2.11.1 | Yes | 930 | 10% |
>> +-------------------+------------------+-------------+---------------+
>> | | No | 155 | - |
>> | mysql +------------------+-------------+---------------+
>> | 8.0.11 & 8.0.12 | Yes | 132 | 15% |
>> +-------------------+------------------+-------------+---------------+
>> | | No | 25 | - |
>> | nginx +------------------+-------------+---------------+
>> | 7.2.4 & 7.2.5 | Yes | 20 | 20% |
>> +-------------------+------------------+-------------+---------------+
>> | tomcat | No | 186 | - |
>> | 10.1.25 & 10.1.26 +------------------+-------------+---------------+
>> | | Yes | 98 | 48% |
>> +-------------------+------------------+-------------+---------------+
>>
>> Co-developed-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
>> Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
>> Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
>> ---
>
> ...
>
>> index 4b46016bcd03..269b53b3ed79 100644
>> --- a/fs/erofs/ishare.c
>> +++ b/fs/erofs/ishare.c
>> @@ -197,6 +197,37 @@ const struct file_operations erofs_ishare_fops = {
>> .splice_read = filemap_splice_read,
>> };
>> +/*
>> + * erofs_ishare_iget - find the backing inode.
>> + */
>> +struct inode *erofs_ishare_iget(struct inode *inode)
>
> Just:
>
> struct inode *erofs_get_real_inode(struct inode *inode)
>
> `ishare_` prefix seems useless here.
>
>> +{
>> + struct erofs_inode *vi, *vi_dedup;
>> + struct inode *realinode;
>> +
>> + if (!erofs_is_ishare_inode(inode))
>> + return igrab(inode);
Also, please `return inode;` directly if `erofs_is_ishare_inode()`
returns false.
There is no need to bump the inode reference when ishare is off.
>> +
>> + vi_dedup = EROFS_I(inode);
>> + spin_lock(&vi_dedup->lock);
>> + /* fall back to all backing inodes */
>> + DBG_BUGON(list_empty(&vi_dedup->backing_head));
>> + list_for_each_entry(vi, &vi_dedup->backing_head, backing_link) {
>> + realinode = igrab(&vi->vfs_inode);
>> + if (realinode)
>> + break;
>> + }
>> + spin_unlock(&vi_dedup->lock);
>> +
>> + DBG_BUGON(!realinode);
>> + return realinode;
>> +}
>> +
>> +void erofs_ishare_iput(struct inode *realinode)
>
> Just:
>
> erofs_put_real_inode().
>
> Thanks,
> Gao Xiang
* Re: [PATCH v10 04/10] erofs: move `struct erofs_anon_fs_type` to super.c
2025-12-23 8:30 ` Gao Xiang
@ 2025-12-23 9:28 ` Hongbo Li
0 siblings, 0 replies; 22+ messages in thread
From: Hongbo Li @ 2025-12-23 9:28 UTC (permalink / raw)
To: Gao Xiang
Cc: linux-fsdevel, linux-erofs, linux-kernel, Chao Yu,
Christian Brauner, Darrick J. Wong, Amir Goldstein,
Christoph Hellwig
On 2025/12/23 16:30, Gao Xiang wrote:
>
>
> On 2025/12/23 09:56, Hongbo Li wrote:
>> From: Hongzhen Luo <hongzhen@linux.alibaba.com>
>>
>> Move the `struct erofs_anon_fs_type` to the super.c and
>> expose it in preparation for the upcoming page cache share
>> feature.
>>
>> Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
>> Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
>
> Can you just replace this one with the following patch?
>
Sure, I will add this in the next version.
Thanks,
Hongbo
>
> From: Gao Xiang <hsiangkao@linux.alibaba.com>
> Date: Tue, 23 Dec 2025 16:27:17 +0800
> Subject: [PATCH] erofs: decouple `struct erofs_anon_fs_type`
>
> - Move the `struct erofs_anon_fs_type` to super.c and expose it
> in preparation for the upcoming page cache share feature;
>
> - Remove the `.owner` field, as they are all internal mounts and
> fully managed by EROFS. Retaining `.owner` would unnecessarily
> increment module reference counts, preventing the EROFS kernel
> module from being unloaded.
>
> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
> ---
> fs/erofs/fscache.c | 13 -------------
> fs/erofs/internal.h | 2 ++
> fs/erofs/super.c | 14 ++++++++++++++
> 3 files changed, 16 insertions(+), 13 deletions(-)
>
> diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c
> index 7a346e20f7b7..f4937b025038 100644
> --- a/fs/erofs/fscache.c
> +++ b/fs/erofs/fscache.c
> @@ -3,7 +3,6 @@
> * Copyright (C) 2022, Alibaba Cloud
> * Copyright (C) 2022, Bytedance Inc. All rights reserved.
> */
> -#include <linux/pseudo_fs.h>
> #include <linux/fscache.h>
> #include "internal.h"
>
> @@ -13,18 +12,6 @@ static LIST_HEAD(erofs_domain_list);
> static LIST_HEAD(erofs_domain_cookies_list);
> static struct vfsmount *erofs_pseudo_mnt;
>
> -static int erofs_anon_init_fs_context(struct fs_context *fc)
> -{
> -	return init_pseudo(fc, EROFS_SUPER_MAGIC) ? 0 : -ENOMEM;
> -}
> -
> -static struct file_system_type erofs_anon_fs_type = {
> -	.owner = THIS_MODULE,
> -	.name = "pseudo_erofs",
> -	.init_fs_context = erofs_anon_init_fs_context,
> -	.kill_sb = kill_anon_super,
> -};
> -
> struct erofs_fscache_io {
> struct netfs_cache_resources cres;
> struct iov_iter iter;
> diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
> index f7f622836198..98fe652aea33 100644
> --- a/fs/erofs/internal.h
> +++ b/fs/erofs/internal.h
> @@ -188,6 +188,8 @@ static inline bool erofs_is_fileio_mode(struct erofs_sb_info *sbi)
> return IS_ENABLED(CONFIG_EROFS_FS_BACKED_BY_FILE) && sbi->dif0.file;
> }
>
> +extern struct file_system_type erofs_anon_fs_type;
> +
> static inline bool erofs_is_fscache_mode(struct super_block *sb)
> {
> return IS_ENABLED(CONFIG_EROFS_FS_ONDEMAND) &&
> diff --git a/fs/erofs/super.c b/fs/erofs/super.c
> index 937a215f626c..f18f43b78fca 100644
> --- a/fs/erofs/super.c
> +++ b/fs/erofs/super.c
> @@ -11,6 +11,7 @@
> #include <linux/fs_parser.h>
> #include <linux/exportfs.h>
> #include <linux/backing-dev.h>
> +#include <linux/pseudo_fs.h>
> #include "xattr.h"
>
> #define CREATE_TRACE_POINTS
> @@ -936,6 +937,19 @@ static struct file_system_type erofs_fs_type = {
> };
> MODULE_ALIAS_FS("erofs");
>
> +#if defined(CONFIG_EROFS_FS_ONDEMAND)
> +static int erofs_anon_init_fs_context(struct fs_context *fc)
> +{
> +	return init_pseudo(fc, EROFS_SUPER_MAGIC) ? 0 : -ENOMEM;
> +}
> +
> +struct file_system_type erofs_anon_fs_type = {
> +	.name = "pseudo_erofs",
> +	.init_fs_context = erofs_anon_init_fs_context,
> +	.kill_sb = kill_anon_super,
> +};
> +#endif
> +
> static int __init erofs_module_init(void)
> {
> int err;
> --
> 2.43.5
>
* Re: [PATCH v10 03/10] fs: Export alloc_empty_backing_file
2025-12-23 8:31 ` Gao Xiang
@ 2025-12-23 12:40 ` Amir Goldstein
0 siblings, 0 replies; 22+ messages in thread
From: Amir Goldstein @ 2025-12-23 12:40 UTC (permalink / raw)
To: Gao Xiang
Cc: Hongbo Li, linux-fsdevel, linux-erofs, linux-kernel, Chao Yu,
Christian Brauner, Darrick J. Wong, Christoph Hellwig
On Tue, Dec 23, 2025 at 10:31 AM Gao Xiang <hsiangkao@linux.alibaba.com> wrote:
>
>
>
> On 2025/12/23 09:56, Hongbo Li wrote:
> > There is no need to open nonexistent real files when backing files
> > are not backed by real files (e.g., EROFS page cache sharing
> > doesn't need to open typical real files again).
> >
> > Therefore, we export the alloc_empty_backing_file() helper, allowing
> > filesystems to dynamically set the backing file without real file
> > open. This is particularly useful for obtaining the correct @path
> > and @inode when calling file_user_path() and file_user_inode().
> >
> > Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
>
> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
>
> (I hope Amir could ack this particular patch too..)
As long as Christian is ok with this, I don't mind.
Acked-by: Amir Goldstein <amir73il@gmail.com>
Thanks,
Amir.
end of thread, other threads:[~2025-12-23 12:40 UTC | newest]
Thread overview: 22+ messages
2025-12-23 1:56 [PATCH v10 00/10] erofs: Introduce page cache sharing feature Hongbo Li
2025-12-23 1:56 ` [PATCH v10 01/10] iomap: stash iomap read ctx in the private field of iomap_iter Hongbo Li
2025-12-23 2:32 ` Gao Xiang
2025-12-23 1:56 ` [PATCH v10 02/10] erofs: hold read context in iomap_iter if needed Hongbo Li
2025-12-23 2:32 ` Gao Xiang
2025-12-23 1:56 ` [PATCH v10 03/10] fs: Export alloc_empty_backing_file Hongbo Li
2025-12-23 8:31 ` Gao Xiang
2025-12-23 12:40 ` Amir Goldstein
2025-12-23 1:56 ` [PATCH v10 04/10] erofs: move `struct erofs_anon_fs_type` to super.c Hongbo Li
2025-12-23 8:30 ` Gao Xiang
2025-12-23 9:28 ` Hongbo Li
2025-12-23 1:56 ` [PATCH v10 05/10] erofs: support user-defined fingerprint name Hongbo Li
2025-12-23 7:22 ` Gao Xiang
2025-12-23 1:56 ` [PATCH v10 06/10] erofs: support domain-specific page cache share Hongbo Li
2025-12-23 7:25 ` Gao Xiang
2025-12-23 1:56 ` [PATCH v10 07/10] erofs: introduce the page cache share feature Hongbo Li
2025-12-23 8:11 ` Gao Xiang
2025-12-23 1:56 ` [PATCH v10 08/10] erofs: support unencoded inodes for page cache share Hongbo Li
2025-12-23 8:15 ` Gao Xiang
2025-12-23 8:34 ` Gao Xiang
2025-12-23 1:56 ` [PATCH v10 09/10] erofs: support compressed " Hongbo Li
2025-12-23 8:18 ` Gao Xiang