* Re: [PATCH] erofs: use the opener's credential when verifing metadata accesses
[not found] <20260505155615.2719500-1-hsiangkao@linux.alibaba.com>
@ 2026-05-08 8:20 ` Christoph Hellwig
2026-05-08 8:25 ` Tatsuyuki Ishi
[not found] ` <CABqzrSOaCMPD_QrSq_y_6bXLC3ecm3FZsE_ACrdNbTHG8baMCw@mail.gmail.com>
0 siblings, 2 replies; 9+ messages in thread
From: Christoph Hellwig @ 2026-05-08 8:20 UTC (permalink / raw)
To: Gao Xiang
Cc: linux-erofs, Chao Yu, LKML, oliver.yang, Carlos Llamas,
Sandeep Dhavale, Tatsuyuki Ishi, Christian Brauner, linux-fsdevel
On Tue, May 05, 2026 at 11:56:15PM +0800, Gao Xiang wrote:
> Similar to commit 905eeb2b7c33 ("erofs: impersonate the opener's
> credentials when accessing backing file"), rw_verify_area() needs
> the same too.
Two things here:
- rw_verify_area is a helper for use inside the VFS and file system
read/write method implementation. Erofs as a user of the VFS should
not use it at all.
- using the opener credentials when accessing the backing file seems
wrong. The entity accessing it is the file system, so it should
have system or mounter credentials, not that of someone causing
metadata / fs data access. And this applies to all access by
a file system backed by a backing file.
* Re: [PATCH] erofs: use the opener's credential when verifing metadata accesses
2026-05-08 8:20 ` [PATCH] erofs: use the opener's credential when verifing metadata accesses Christoph Hellwig
@ 2026-05-08 8:25 ` Tatsuyuki Ishi
[not found] ` <CABqzrSOaCMPD_QrSq_y_6bXLC3ecm3FZsE_ACrdNbTHG8baMCw@mail.gmail.com>
1 sibling, 0 replies; 9+ messages in thread
From: Tatsuyuki Ishi @ 2026-05-08 8:25 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Gao Xiang, linux-erofs, Chao Yu, LKML, oliver.yang, Carlos Llamas,
Sandeep Dhavale, Christian Brauner, linux-fsdevel
> - using the opener credentials when accessing the backing file seems
> wrong. The entity accessing it is the file system, so it should
> have system or mounter credentials, not that of someone causing
> metadata / fs data access. And this applies to all access by
> a file system backed by a backing file.
I think there's probably some confusion of terminology here. buf->file
is opened with the mounter's credentials, so we are impersonating the
mounter here. Perhaps the commit message could describe that more
clearly. Same for the previous patches mentioned.
[resend: previous mail was rejected due to HTML]
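As a rough illustration of the terminology point above (a userspace toy model, not code from the patch): the check runs under the credentials captured when the backing file was opened, which for a file-backed mount are the mounter's, and the caller's credentials are restored afterwards. All `toy_*` names are hypothetical stand-ins; the kernel counterparts discussed in this thread are `file->f_cred`, `override_creds()`/`revert_creds()` and `rw_verify_area()`.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace stand-ins; these are NOT the kernel's struct cred/file. */
struct toy_cred {
	const char *name;
	int uid;
};

struct toy_file {
	const struct toy_cred *f_cred;	/* creds captured when the file was opened */
	int owner_uid;			/* uid allowed to access the file */
};

/* Stand-in for the calling task's current credentials. */
static const struct toy_cred *toy_current;

/* Install new creds and hand back the old ones so they can be restored. */
static const struct toy_cred *toy_override_creds(const struct toy_cred *new_cred)
{
	const struct toy_cred *old = toy_current;

	toy_current = new_cred;
	return old;
}

static void toy_revert_creds(const struct toy_cred *old)
{
	toy_current = old;
}

/* Hypothetical check: passes only for uid 0 or the file's owner. */
static int toy_verify_area(const struct toy_file *file)
{
	assert(toy_current != NULL);
	if (toy_current->uid == 0 || toy_current->uid == file->owner_uid)
		return 0;
	return -1;
}

/* Run the check as whoever opened the backing file (here: the mounter). */
static int toy_verify_as_opener(const struct toy_file *backing)
{
	const struct toy_cred *old = toy_override_creds(backing->f_cred);
	int ret = toy_verify_area(backing);

	toy_revert_creds(old);
	return ret;
}
```

With `toy_current` pointing at an unrelated reader, `toy_verify_area()` fails while `toy_verify_as_opener()` succeeds, and `toy_current` is back to the reader afterwards. The model only shows the override/revert bracketing; it says nothing about whether that is the right policy.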
* Re: [PATCH] erofs: use the opener's credential when verifing metadata accesses
[not found] ` <CABqzrSOaCMPD_QrSq_y_6bXLC3ecm3FZsE_ACrdNbTHG8baMCw@mail.gmail.com>
@ 2026-05-08 8:39 ` Gao Xiang
2026-05-08 8:51 ` Christoph Hellwig
2026-05-11 13:51 ` Christian Brauner
0 siblings, 2 replies; 9+ messages in thread
From: Gao Xiang @ 2026-05-08 8:39 UTC (permalink / raw)
To: Christoph Hellwig
Cc: linux-erofs, Chao Yu, LKML, oliver.yang, Carlos Llamas,
Sandeep Dhavale, Christian Brauner, linux-fsdevel, Tatsuyuki Ishi
Hi Christoph,
On 2026/5/8 16:24, Tatsuyuki Ishi wrote:
> On Fri, May 8, 2026 at 5:20 PM Christoph Hellwig <hch@infradead.org> wrote:
>
>> On Tue, May 05, 2026 at 11:56:15PM +0800, Gao Xiang wrote:
>>> Similar to commit 905eeb2b7c33 ("erofs: impersonate the opener's
>>> credentials when accessing backing file"), rw_verify_area() needs
>>> the same too.
>>
>> Two things here:
Let me use Tatsuyuki's reply to address your two comments.
>>
>> - rw_verify_area is a helper for use inside the VFS and file system
>> read/write method implementation. Erofs as a user of the VFS should
>> not use it at all.
Currently, EROFS file-backed mounts read metadata directly from the
underlay fs page cache. This is mainly used for composefs, etc., so that
different EROFS instances don't each keep their own EROFS page cache for
the same underlay backing file, and to avoid unnecessary copies into them.
--- That is also what composefs once did in their codebase.
Since EROFS just reads the underlay fs page cache and does _not_
touch anything inside the underlay page cache itself, I guess
it's fine?
On the other hand, we talked a bit about commit f2fed441c69b ("loop:
stop using vfs_iter_{read,write} for buffered I/O") in another
private thread related to fanotify, which lacks proper
rw_verify_area() as well, since it calls into the raw read/write
iter methods instead of using the previous vfs_iter_{read,write}.
>> - using the opener credentials when accessing the backing file seems
>> wrong. The entity accessing it is the file system, so it should
>> have system or mounter credentials, not that of someone causing
>> metadata / fs data access. And this applies to all access by
>> a file system backed by a backing file.
>>
>
> I think there's probably some confusion of terminology here. buf->file is
> opened with the mounter's credentials, so we are impersonating the mounter
> here. Perhaps the commit message could describe that more clearly. Same for
> the previous patches mentioned.
Here "opener" means the mounter, as Tatsuyuki mentioned. I just
followed Tatsuyuki's term, but it indeed just means the mounter's
credentials.
Thanks,
Gao Xiang
>
* Re: [PATCH] erofs: use the opener's credential when verifing metadata accesses
2026-05-08 8:39 ` Gao Xiang
@ 2026-05-08 8:51 ` Christoph Hellwig
2026-05-08 9:10 ` Gao Xiang
2026-05-11 13:51 ` Christian Brauner
1 sibling, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2026-05-08 8:51 UTC (permalink / raw)
To: Gao Xiang
Cc: Christoph Hellwig, linux-erofs, Chao Yu, LKML, oliver.yang,
Carlos Llamas, Sandeep Dhavale, Christian Brauner, linux-fsdevel,
Tatsuyuki Ishi, Matthew Wilcox
On Fri, May 08, 2026 at 04:39:15PM +0800, Gao Xiang wrote:
> Currently EROFS file-backed mount metadata is directly using underlay
> fs page cache, which is mainly used for composefs, etc. to avoid
> different EROFS instances have their own EROFS page cache for the
> same underlay backing file and avoid unnecessary copies into them.
> --- That is also what composefs once did in their codebase.
>
> Since EROFS just read the underlayfs page cache and does _not_
> touch anything inside the underlay page cache itself, so I guess
> it's fine?
At the micro-level this does mean erofs needs to do the checks itself.
OTOH it means this whole scheme is completely broken. The page cache
is owned by the file system, so erofs can't simply poke into it.
Now for reads it mostly works on the most common disk-based file systems,
but it does create lots of problems for slightly more complex ones like
network/clustered or synthetic file systems. It also really breaks
layering, so we need to fix it. Not sure what would be best, but I'd be
tempted to have a cross-instance cache maintained by erofs and filled
using in-kernel direct I/O. IFF the page policies work great for you
that even could be a synthetic inode/mapping.
> On the other hand, we talked a bit commit f2fed441c69b ("loop:
> stop using vfs_iter_{read,write} for buffered I/O") in another
> private thread related to fanotify, which lacks proper
> rw_verify_area() as well, since it called into raw read/write
> iter methods instead of using the previous vfs_iter_{read,write}.
Note that this does not add the bypass, just extends it to both I/O
types. But yes, this breaks fanotify. We actually have quite a few
raw ->read_iter/->write_iter calls, so this might need more structured
treatment.
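To make the bypass concrete, here is a small userspace toy model (not kernel code; every `toy_*` name is hypothetical). A wrapper in the style of vfs_iter_read() runs a verification step before dispatching to the file's read method, whereas a raw call to the method pointer skips that step, so whatever hooks live there never run.

```c
#include <assert.h>
#include <stddef.h>

struct toy_file;
typedef long (*toy_read_iter_fn)(struct toy_file *f, char *buf, size_t len);

struct toy_file {
	toy_read_iter_fn read_iter;	/* stands in for f_op->read_iter */
	int verify_calls;		/* how many times the hook ran */
	int deny;			/* nonzero: the hook rejects access */
};

/* Stands in for rw_verify_area(): the place where checks and hooks run. */
static int toy_verify_area(struct toy_file *f)
{
	f->verify_calls++;
	return f->deny ? -1 : 0;
}

/* Wrapper path: verify first, then dispatch to the method. */
static long toy_vfs_read(struct toy_file *f, char *buf, size_t len)
{
	int ret;

	assert(f->read_iter != NULL);
	ret = toy_verify_area(f);
	if (ret)
		return ret;
	return f->read_iter(f, buf, len);
}

/* Raw path: calls the method directly, so the hook is never consulted. */
static long toy_raw_read(struct toy_file *f, char *buf, size_t len)
{
	assert(f->read_iter != NULL);
	return f->read_iter(f, buf, len);
}

/* Trivial read method used for the demonstration. */
static long toy_fill_zero(struct toy_file *f, char *buf, size_t len)
{
	(void)f;
	for (size_t i = 0; i < len; i++)
		buf[i] = 0;
	return (long)len;
}
```

With `deny` set, `toy_vfs_read()` returns the error and bumps `verify_calls`, while `toy_raw_read()` succeeds and leaves the counter untouched. A more structured treatment would presumably route the raw callers through one common helper, but the shape of such a helper is not settled in this thread.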
* Re: [PATCH] erofs: use the opener's credential when verifing metadata accesses
2026-05-08 8:51 ` Christoph Hellwig
@ 2026-05-08 9:10 ` Gao Xiang
2026-05-11 6:18 ` Christoph Hellwig
0 siblings, 1 reply; 9+ messages in thread
From: Gao Xiang @ 2026-05-08 9:10 UTC (permalink / raw)
To: Christoph Hellwig
Cc: linux-erofs, Chao Yu, LKML, oliver.yang, Carlos Llamas,
Sandeep Dhavale, Christian Brauner, linux-fsdevel, Tatsuyuki Ishi,
Matthew Wilcox
On 2026/5/8 16:51, Christoph Hellwig wrote:
> On Fri, May 08, 2026 at 04:39:15PM +0800, Gao Xiang wrote:
>> Currently EROFS file-backed mount metadata is directly using underlay
>> fs page cache, which is mainly used for composefs, etc. to avoid
>> different EROFS instances have their own EROFS page cache for the
>> same underlay backing file and avoid unnecessary copies into them.
>> --- That is also what composefs once did in their codebase.
>>
>> Since EROFS just read the underlayfs page cache and does _not_
>> touch anything inside the underlay page cache itself, so I guess
>> it's fine?
>
> At the micro-level this does mean erofs needs to do the checks itself.
> OTOH it means this whole scheme is completely broken. The page cache
> is owned by the file system, so erofs can't simply poke into it.
The page cache is indeed owned by the underlay file system, but erofs
doesn't poke into it: it just needs temporary read access to metadata
without allocating extra buffers.
For one thing, I hope there could be some interface for such
temporary usage rather than just the single vfs_iter_read model.
>
> Now for reads it mostly works on the most common disk-based file systems,
> but it does create lots of problem for slightly more complex ones like
> network/clustered or synthetic file systems. It also really breaks
Just out of curiosity, could you point out one specific path
so I can look into that?
> layering, so we need to fix it. Not sure what would be best, but I'd be
> tempted to have a cross-instance cache maintained by erofs and filled
> using in-kernel direct I/O. IFF the page policies work great for you
Direct I/O may be unsuitable in many cases, since users will have just
downloaded the images from remotes using buffered I/O, and direct I/O
only makes it worse (invalidating the cache and rereading from disk),
with double caching if the underlay file is also read.
> that even could be a synthetic inode/mapping.
I expected similar comments. If we really need to work out such a
cross-instance cache, I'm fine with implementing it for Linux 7.2. It will
increase the complexity of the codebase, and it also won't share the
cache with the underlay fs.
But could we just fix this issue first for previous Linux versions?
>
>> On the other hand, we talked a bit commit f2fed441c69b ("loop:
>> stop using vfs_iter_{read,write} for buffered I/O") in another
>> private thread related to fanotify, which lacks proper
>> rw_verify_area() as well, since it called into raw read/write
>> iter methods instead of using the previous vfs_iter_{read,write}.
>
> Note that this does not add the bypass, just extends it to both I/O
> types. But yes, this breaks fanotify. We actually have quite a few
> raw ->read_iter/->write_iter calls, so this might need more structured
> treatment.
It also bypasses the security hooks I think.
Thanks,
Gao Xiang
* Re: [PATCH] erofs: use the opener's credential when verifing metadata accesses
2026-05-08 9:10 ` Gao Xiang
@ 2026-05-11 6:18 ` Christoph Hellwig
2026-05-11 6:52 ` Gao Xiang
0 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2026-05-11 6:18 UTC (permalink / raw)
To: Gao Xiang
Cc: Christoph Hellwig, linux-erofs, Chao Yu, LKML, oliver.yang,
Carlos Llamas, Sandeep Dhavale, Christian Brauner, linux-fsdevel,
Tatsuyuki Ishi, Matthew Wilcox
On Fri, May 08, 2026 at 05:10:21PM +0800, Gao Xiang wrote:
> On the one side, I hope if there could be some interface for
> such temporary usage rather than just one vfs_iter_read model.
As in an in-kernel mmap? While not entirely impossible, the locking
model for that sounds horrible.
> > Now for reads it mostly works on the most common disk-based file systems,
> > but it does create lots of problem for slightly more complex ones like
> > network/clustered or synthetic file systems. It also really breaks
>
> Just out of curiousity, could you point out one specific path
> so I can look into that.
File systems might require their own locking, e.g. cluster locks for
cluster file systems, and at least in the past direct page cache access
also caused problems with NFS data invalidation semantics. Last but not
least, ->read_folio has a file parameter that isn't really a file but a
file system specific cookie. So calling this with something not managed
by the file system can cause problems, and has caused crashes in the past,
although the offender at that time (the old smbfs) is now gone.
> But could we just fix this issue first for previous linux versions?
I just pointed out another issue. You'll have to fix the credentials
either way.
* Re: [PATCH] erofs: use the opener's credential when verifing metadata accesses
2026-05-11 6:18 ` Christoph Hellwig
@ 2026-05-11 6:52 ` Gao Xiang
0 siblings, 0 replies; 9+ messages in thread
From: Gao Xiang @ 2026-05-11 6:52 UTC (permalink / raw)
To: Christoph Hellwig
Cc: linux-erofs, Chao Yu, LKML, oliver.yang, Carlos Llamas,
Sandeep Dhavale, Christian Brauner, linux-fsdevel, Tatsuyuki Ishi,
Matthew Wilcox
On 2026/5/11 14:18, Christoph Hellwig wrote:
> On Fri, May 08, 2026 at 05:10:21PM +0800, Gao Xiang wrote:
>> On the one side, I hope if there could be some interface for
>> such temporary usage rather than just one vfs_iter_read model.
>
> As in a in-kernel mmap? While not entirely impossible, the locking
> model for that sounds horrible.
I don't think it needs a full in-kernel mmap; it just works on
some uptodate folios.
Which locking model? For the page cache, it's expected that folios
don't have uptodate cleared randomly at any time.
At least for erofs use cases, we only care about uptodate folios, no
matter whether they are being invalidated/truncated or not (mapping == NULL).
Maybe it's not suitable for other stricter cases, but for immutable
fs models, that is enough and efficient.
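A tiny userspace toy model of that rule, under the assumption that "uptodate" is the only property the reader checks (the `toy_*` names are hypothetical and this is not the erofs implementation): the data is used whenever the folio is uptodate, even if its mapping pointer has already been cleared by truncation or invalidation.

```c
#include <assert.h>
#include <stddef.h>

struct toy_folio {
	int uptodate;		/* contents are valid */
	const void *mapping;	/* NULL once truncated/invalidated */
	const char *data;
};

/*
 * Return the folio's data for temporary read-only use, or NULL if the
 * caller has to (re)read it. Only the uptodate flag is consulted; the
 * mapping pointer is deliberately ignored.
 */
static const char *toy_meta_peek(const struct toy_folio *folio)
{
	if (folio == NULL || !folio->uptodate)
		return NULL;
	assert(folio->data != NULL);
	return folio->data;
}
```

A stricter consumer would also have to look at `mapping` and take whatever locks the owning file system requires, which is the part of the scheme being questioned in this thread.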
>
>>> Now for reads it mostly works on the most common disk-based file systems,
>>> but it does create lots of problem for slightly more complex ones like
>>> network/clustered or synthetic file systems. It also really breaks
>>
>> Just out of curiousity, could you point out one specific path
>> so I can look into that.
>
> file system might require their own locking, e.g. cluster locks for
> cluster file systems, and at least in the path direct page cache access
> also caused problems with NFS data invalidation semantics. Last but not
> least ->read_folio has a file paramater that isn't really a file but a
> file system specific cookie. So calling this with something not managed
> by the file system can cause problems as has caused crashes in the past,
> although the offender at that time (the old smbfs) is now gone.
file is indeed a cookie, but I did some research on the codebase,
and I no longer see any case where it is anything other than a real
"struct file *".
I agree such usage is kind of a gray area, but I've seen no risk in
practice as long as the underlay fs supports a proper ->read_folio
callback (and erofs restricts that).
>
>> But could we just fix this issue first for previous linux versions?
>
> I just pointed out another issue. You'll have to fix the credentials
> either way.
I really hope Matthew could give some opinion on this too, because
this way, the underlay cache can be directly used for temporary use,
and it should be a RO access and won't impact any fs-owned state.
Anyway, I could work out an alternative, but that makes the metadata
access less efficient.
Thanks,
Gao Xiang
* Re: [PATCH] erofs: use the opener's credential when verifing metadata accesses
2026-05-08 8:39 ` Gao Xiang
2026-05-08 8:51 ` Christoph Hellwig
@ 2026-05-11 13:51 ` Christian Brauner
2026-05-11 14:42 ` Gao Xiang
1 sibling, 1 reply; 9+ messages in thread
From: Christian Brauner @ 2026-05-11 13:51 UTC (permalink / raw)
To: Gao Xiang
Cc: Christoph Hellwig, linux-erofs, Chao Yu, LKML, oliver.yang,
Carlos Llamas, Sandeep Dhavale, linux-fsdevel, Tatsuyuki Ishi
On Fri, May 08, 2026 at 04:39:15PM +0800, Gao Xiang wrote:
> Hi Christiph,
>
> On 2026/5/8 16:24, Tatsuyuki Ishi wrote:
> > On Fri, May 8, 2026 at 5:20 PM Christoph Hellwig <hch@infradead.org> wrote:
> >
> > > On Tue, May 05, 2026 at 11:56:15PM +0800, Gao Xiang wrote:
> > > > Similar to commit 905eeb2b7c33 ("erofs: impersonate the opener's
> > > > credentials when accessing backing file"), rw_verify_area() needs
> > > > the same too.
> > >
> > > Two things here:
>
> Let me use Tatsuyuki's reply to address your two comments.
>
> > >
> > > - rw_verify_area is a helper for use inside the VFS and file system
> > > read/write method implementation. Erofs as a user of the VFS should
> > > not use it at all.
>
> Currently EROFS file-backed mount metadata is directly using underlay
> fs page cache, which is mainly used for composefs, etc. to avoid
> different EROFS instances have their own EROFS page cache for the
> same underlay backing file and avoid unnecessary copies into them.
> --- That is also what composefs once did in their codebase.
>
> Since EROFS just read the underlayfs page cache and does _not_
> touch anything inside the underlay page cache itself, so I guess
> it's fine?
>
> On the other hand, we talked a bit commit f2fed441c69b ("loop:
> stop using vfs_iter_{read,write} for buffered I/O") in another
> private thread related to fanotify, which lacks proper
> rw_verify_area() as well, since it called into raw read/write
> iter methods instead of using the previous vfs_iter_{read,write}.
>
> > > - using the opener credentials when accessing the backing file seems
> > > wrong. The entity accessing it is the file system, so it should
> > > have system or mounter credentials, not that of someone causing
> > > metadata / fs data access. And this applies to all access by
> > > a file system backed by a backing file.
> > >
> >
> > I think there's probably some confusion of terminology here. buf->file is
> > opened with the mounter's credentials, so we are impersonating the mounter
> > here. Perhaps the commit message could describe that more clearly. Same for
> > the previous patches mentioned.
>
> Here "opener" means the mounter as Tatsuyuki mentioned, I just
> follows Tatsuyuki's term, but it just means mounter credentials
> indeed.
We're slowly reinventing overlayfs I see. ;) I think it's probably fine
but it's also rather sketchy to mess around with permissions like that.
Mainly because I don't think we have any actual page cache permission
model. It's inherently shared between everyone and this kinda tries to
bolt permissions on top to not make it so. Probably fine here but also a
bit wonky.
* Re: [PATCH] erofs: use the opener's credential when verifing metadata accesses
2026-05-11 13:51 ` Christian Brauner
@ 2026-05-11 14:42 ` Gao Xiang
0 siblings, 0 replies; 9+ messages in thread
From: Gao Xiang @ 2026-05-11 14:42 UTC (permalink / raw)
To: Christian Brauner
Cc: Christoph Hellwig, linux-erofs, Chao Yu, LKML, oliver.yang,
Carlos Llamas, Sandeep Dhavale, linux-fsdevel, Tatsuyuki Ishi
Hi Christian,
On 2026/5/11 21:51, Christian Brauner wrote:
> On Fri, May 08, 2026 at 04:39:15PM +0800, Gao Xiang wrote:
>> Hi Christiph,
>>
>> On 2026/5/8 16:24, Tatsuyuki Ishi wrote:
>>> On Fri, May 8, 2026 at 5:20 PM Christoph Hellwig <hch@infradead.org> wrote:
>>>
>>>> On Tue, May 05, 2026 at 11:56:15PM +0800, Gao Xiang wrote:
>>>>> Similar to commit 905eeb2b7c33 ("erofs: impersonate the opener's
>>>>> credentials when accessing backing file"), rw_verify_area() needs
>>>>> the same too.
>>>>
>>>> Two things here:
>>
>> Let me use Tatsuyuki's reply to address your two comments.
>>
>>>>
>>>> - rw_verify_area is a helper for use inside the VFS and file system
>>>> read/write method implementation. Erofs as a user of the VFS should
>>>> not use it at all.
>>
>> Currently EROFS file-backed mount metadata is directly using underlay
>> fs page cache, which is mainly used for composefs, etc. to avoid
>> different EROFS instances have their own EROFS page cache for the
>> same underlay backing file and avoid unnecessary copies into them.
>> --- That is also what composefs once did in their codebase.
>>
>> Since EROFS just read the underlayfs page cache and does _not_
>> touch anything inside the underlay page cache itself, so I guess
>> it's fine?
>>
>> On the other hand, we talked a bit commit f2fed441c69b ("loop:
>> stop using vfs_iter_{read,write} for buffered I/O") in another
>> private thread related to fanotify, which lacks proper
>> rw_verify_area() as well, since it called into raw read/write
>> iter methods instead of using the previous vfs_iter_{read,write}.
>>
>>>> - using the opener credentials when accessing the backing file seems
>>>> wrong. The entity accessing it is the file system, so it should
>>>> have system or mounter credentials, not that of someone causing
>>>> metadata / fs data access. And this applies to all access by
>>>> a file system backed by a backing file.
>>>>
>>>
>>> I think there's probably some confusion of terminology here. buf->file is
>>> opened with the mounter's credentials, so we are impersonating the mounter
>>> here. Perhaps the commit message could describe that more clearly. Same for
>>> the previous patches mentioned.
>>
>> Here "opener" means the mounter as Tatsuyuki mentioned, I just
>> follows Tatsuyuki's term, but it just means mounter credentials
>> indeed.
>
> We're slowly reinventing overlayfs I see. ;) I think it's probably fine
> but it's also rather sketchy to mess around with permissions like that.
> Mainly because I don't think we have any actual page cache permission
> model. It's inherently shared beetween everyone and this kinda tries to
> bolt permissions on top to not make it so. Probably fine here but also a
> bit wonky.
Loop devices purely use the kernel cred instead; I think using
the mounter cred is more reasonable and safer than the kernel
cred.
Anyway, I think this cred part is less controversial. The main
issue raised by Christoph is still the metadata path: I tend to use
the underlay inode page cache for temporary RO access since it's
efficient and cache-friendly; and for immutable models we
shouldn't care too much about invalidation, etc., since there
is no need to rely on locking to keep the underlay data
strictly consistent.
Thanks,
Gao Xiang