From: Gao Xiang <hsiangkao@linux.alibaba.com>
To: Christoph Hellwig <hch@lst.de>
Cc: Hongbo Li <lihongbo22@huawei.com>,
	chao@kernel.org, brauner@kernel.org, djwong@kernel.org,
	amir73il@gmail.com, linux-fsdevel@vger.kernel.org,
	linux-erofs@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v15 5/9] erofs: introduce the page cache share feature
Date: Mon, 19 Jan 2026 17:38:33 +0800
Message-ID: <73f2c243-e029-4f95-aa8e-285c7affacac@linux.alibaba.com>
In-Reply-To: <20260119092220.GA9140@lst.de>



On 2026/1/19 17:22, Christoph Hellwig wrote:
> On Mon, Jan 19, 2026 at 04:52:54PM +0800, Gao Xiang wrote:
>>> To me this sounds pretty scary, as we have code in the kernel's trust
>>> domain that heavily depends on arbitrary userspace policy decisions.
>>
>> For example, overlayfs metacopy can also point to
>> arbitrary files; what's the difference between them?
>> https://docs.kernel.org/filesystems/overlayfs.html#metadata-only-copy-up
>>
>> By using metacopy, overlayfs can access arbitrary files
>> as long as the metacopy has the pointer, so it should
>> be a privileged operation, which is similar to this feature.
> 
> Sounds scary too.  But overlayfs' job is to combine underlying files, so
> it is expected.  I think it's the mix of erofs being a disk based file

But you could still point to an arbitrary page cache
if metacopy is used.

> system, and reaching out beyond the device(s) assigned to the file system
> instance that makes me feel rather uneasy.

You mean the page cache can be shared from other
filesystems even if not backed by these devices/files?

I admit that yes, there could be a difference, but that
is why the new "inode_share" and "domain_id"
mount options are used.

I think they should be regarded as a single super
filesystem if "domain_id" is the same: from the
security perspective, they are much like subvolumes
of a single super filesystem.

And mounting a new filesystem within a "domain_id"
can be regarded as importing data into the super
"domain_id" filesystem, and I think only trusted
data within the single domain can be mounted/shared.

> 
>>>
>>> Similarly the sharing of blocks between different file system
>>> instances opens a lot of questions about trust boundaries and life
>>> time rules.  I don't really have good answers, but writing up the
>>
>> Could you give more details about these? You
>> raised the questions, but I have no idea where the
>> threats really come from.
> 
> Right now by default we don't allow any unprivileged mounts.  Now
> if people think that, say, erofs is safe enough and opt into that,
> it needs to be clear what the boundaries of that are.  For a file
> system limited to a single block device those boundaries are
> pretty clear.  For file systems reaching out to the entire system
> (or some kind of domain), the scope is much wider.

Why would multiple devices make a difference for an immutable
fs? No filesystem instance can change the primary or
external devices/blobs. All data is immutable.

> 
>> As for the lifetime: the blobs themselves are immutable files,
>> so what do the lifetime rules mean?
> 
> What happens if the blob gets removed, intentionally or accidentally?

The extra device/blob reference is held during
the whole mount lifetime, much like the primary
(block) device.

And EROFS is an immutable filesystem, so
inner blocks within the blob won't go away
because of some fs instance either.

> 
>> And how do you define trust boundaries?  You mean users
>> have no right to access the data?
>>
>> I think it's similar: for blockdevice-based filesystems,
>> you mount the filesystem with a given source, and the
>> mounter should have permission to access it.
> 
> Yes.
> 
>> For multiple-blob EROFS filesystems, you mount the
>> filesystem with multiple data sources, and the mounters
>> should have permission to access the blockdevices
>> and/or backing files too.
> 
> And what prevents others from modifying them, or sneaking
> unexpected data including unexpected comparison blobs in?

I don't think it's different from filesystems with a single
device.

First, EROFS instances never modify any underlying
devices/blobs.

If you mean that some other program modifies the device data, yes,
it can be changed externally, but I think it's just like
trusting FUSE daemons: an untrusted FUSE daemon can return
arbitrary (meta)data at random times too.

Thanks,
Gao Xiang



