public inbox for linux-fsdevel@vger.kernel.org
From: Gao Xiang <hsiangkao@linux.alibaba.com>
To: Christoph Hellwig <hch@lst.de>
Cc: Hongbo Li <lihongbo22@huawei.com>,
	chao@kernel.org, brauner@kernel.org, djwong@kernel.org,
	amir73il@gmail.com, linux-fsdevel@vger.kernel.org,
	linux-erofs@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v15 5/9] erofs: introduce the page cache share feature
Date: Mon, 19 Jan 2026 16:12:28 +0800
Message-ID: <be558d13-6b41-48b7-9f5c-5da0f1ca1fce@linux.alibaba.com>
In-Reply-To: <8e30bc4b-c97f-4ab2-a7ce-27f399ae7462@linux.alibaba.com>



On 2026/1/19 15:53, Gao Xiang wrote:
> 
> 
> On 2026/1/19 15:29, Christoph Hellwig wrote:
>> On Sat, Jan 17, 2026 at 12:21:16AM +0800, Gao Xiang wrote:
>>> Hi Christoph,
>>>
>>> On 2026/1/16 23:46, Christoph Hellwig wrote:
>>>> I don't really understand the fingerprint idea.  Files with the
>>>> same content will point to the same physical disk blocks, so that
>>>> should be a much better indicator than a finger print?  Also how does
>>>
>>> Page cache sharing should apply to different EROFS
>>> filesystem images on the same machine too, so the
>>> physical disk block number idea cannot be applied
>>> to this.
>>
>> Oh.  That's kinda unexpected and adds another twist to the whole scheme.
>> So in that case the on-disk data actually is duplicated in each image
>> and then de-duplicated in memory only?  Ewwww...
> 
> On-disk deduplication is decoupled from this feature:

First of all:

  - Data within a single EROFS image is deduplicated, of
    course (for example, EROFS supports extent-based
    chunks);

> 
> - EROFS can share the same blocks in blobs (multiple
> devices) among different images, so that on-disk data

   This is similar to Docker layers: common data/layers
can be kept in separate blobs;

> can be shared by referring to the same blobs;

Both deduplication methods above apply to the
golden images that are transferred over the wire.

> 
> - Even if on-disk data isn't deduplicated in the images,
> if reflink is enabled on the backing filesystems,
> userspace mounters can trigger background GCs to
> deduplicate identical blocks.

And this method is applied at runtime if the underlying
filesystem supports reflink.

> 
> I was just trying to say that EROFS doesn't restrict
> what `fingerprint` actually means (it can be a serialized
> integer defined by a specific image publisher, for
> example, or a specific secure hash; currently,
> "mkfs.erofs" generates a sha256 for each file), but
> leaves that to the image builders:
> 
> 
> 1) If `fingerprint` is distributed as an on-disk part of
> signed images, as I said, it can be shared within a
> trusted domain_id (usually the same image builder) --
> that is the top-priority case, using dm-verity;
> 
> Or
> 
> 2) If `fingerprint` is not distributed in the image
> or the images are untrusted (e.g. unknown signatures),
> image fetchers can scan each inode in the golden
> images to generate an extra minimal EROFS
> metadata-only image with locally calculated
> `fingerprint` values too, which is very similar to the
> current ostree approach (parse remote files and
> calculate digests).
> 
> Thanks,
> Gao Xiang
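
To make the two options above more concrete, here is a rough userspace sketch in Python. It is only an illustration and not the kernel or mkfs.erofs implementation: the helper names `file_fingerprint`, `scan_tree` and `SharedCacheIndex` are hypothetical, and it assumes sha256 over the file content as the fingerprint and a (domain_id, fingerprint) pair as the sharing key.

```python
import hashlib
from pathlib import Path


def file_fingerprint(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file's content (assumed fingerprint)."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan_tree(root):
    """Map each regular file under root (by relative path) to its fingerprint.

    This mimics option 2: a fetcher walking an unpacked image and
    calculating fingerprints locally instead of trusting shipped ones.
    """
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_fingerprint(p)
        for p in sorted(root.rglob("*"))
        if p.is_file() and not p.is_symlink()
    }


class SharedCacheIndex:
    """Hypothetical index: one shared object per (domain_id, fingerprint)."""

    def __init__(self):
        self._entries = {}

    def get(self, domain_id, fingerprint):
        # The same domain and fingerprint always return the same object, so
        # two images carrying identical content share a single entry.
        # A different domain_id never shares, even for equal fingerprints.
        return self._entries.setdefault((domain_id, fingerprint), {"pages": {}})
```

With this sketch, two unpacked images that carry the same file content produce the same fingerprint regardless of where the data lives on disk, and lookups within one domain resolve to one shared object, while lookups from another domain stay separate.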

