* Re: [PATCH] erofs: use the opener's credential when verifing metadata accesses
       [not found] <20260505155615.2719500-1-hsiangkao@linux.alibaba.com>
From: Christoph Hellwig @ 2026-05-08 8:20 UTC
To: Gao Xiang
Cc: linux-erofs, Chao Yu, LKML, oliver.yang, Carlos Llamas, Sandeep Dhavale, Tatsuyuki Ishi, Christian Brauner, linux-fsdevel

On Tue, May 05, 2026 at 11:56:15PM +0800, Gao Xiang wrote:
> Similar to commit 905eeb2b7c33 ("erofs: impersonate the opener's
> credentials when accessing backing file"), rw_verify_area() needs
> the same too.

Two things here:

 - rw_verify_area is a helper for use inside the VFS and file system
   read/write method implementation. Erofs as a user of the VFS should
   not use it at all.
 - using the opener credentials when accessing the backing file seems
   wrong. The entity accessing it is the file system, so it should
   have system or mounter credentials, not that of someone causing
   metadata / fs data access. And this applies to all access by
   a file system backed by a backing file.
* Re: [PATCH] erofs: use the opener's credential when verifing metadata accesses
From: Tatsuyuki Ishi @ 2026-05-08 8:25 UTC
To: Christoph Hellwig
Cc: Gao Xiang, linux-erofs, Chao Yu, LKML, oliver.yang, Carlos Llamas, Sandeep Dhavale, Christian Brauner, linux-fsdevel

> - using the opener credentials when accessing the backing file seems
>   wrong. The entity accessing it is the file system, so it should
>   have system or mounter credentials, not that of someone causing
>   metadata / fs data access. And this applies to all access by
>   a file system backed by a backing file.

I think there's probably some confusion of terminology here. buf->file is
opened with the mounter's credentials, so we are impersonating the mounter
here. Perhaps the commit message could describe that more clearly. Same for
the previous patches mentioned.

[resend: previous mail was rejected due to HTML]
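[Editor's sketch] The "opener" versus "task that triggers the access" distinction above can be modeled in user space. The toy C code below is not kernel code and is not taken from the patch; every `toy_` name is hypothetical. It only mirrors the general pattern that the kernel expresses with `override_creds()` and `revert_creds()` around an access, using the credentials recorded in the file at open time (for a file-backed mount, presumably the mounter's).

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for kernel objects; every toy_ name is hypothetical. */
struct toy_cred { int uid; };

struct toy_file {
    const struct toy_cred *f_cred;  /* captured once, when the file is opened */
};

/* Credentials of the task that is currently running. */
static const struct toy_cred *toy_current_cred;

/* Open a file: remember who opened it (for a file-backed mount, the mounter). */
static struct toy_file toy_open(void)
{
    struct toy_file f = { .f_cred = toy_current_cred };
    return f;
}

/* Install new credentials and hand back the previous ones. */
static const struct toy_cred *toy_override_creds(const struct toy_cred *new_cred)
{
    const struct toy_cred *old = toy_current_cred;
    toy_current_cred = new_cred;
    return old;
}

/* Put the previously active credentials back. */
static void toy_revert_creds(const struct toy_cred *old)
{
    toy_current_cred = old;
}

/* Stand-in for a permission hook: reports which uid the check would see. */
static int toy_check_uid(void)
{
    return toy_current_cred->uid;
}

/* Access the backing file while impersonating whoever opened it. */
static int toy_backing_access(const struct toy_file *backing)
{
    const struct toy_cred *old = toy_override_creds(backing->f_cred);
    int seen = toy_check_uid();
    toy_revert_creds(old);
    return seen;
}

/* Usage: the mounter opens the file, an unrelated task triggers the access. */
int toy_cred_demo(void)
{
    static const struct toy_cred mounter = { .uid = 0 };
    static const struct toy_cred reader = { .uid = 1000 };

    toy_current_cred = &mounter;                 /* mount time */
    struct toy_file backing = toy_open();

    toy_current_cred = &reader;                  /* later: another task reads */
    assert(toy_backing_access(&backing) == 0);   /* the check saw the mounter */
    assert(toy_check_uid() == 1000);             /* reader's creds are restored */
    return 0;
}
```

In this model the check always sees the credentials stored at open time, regardless of which task caused the metadata access, which is the behaviour Tatsuyuki describes as "impersonating the mounter".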
* Re: [PATCH] erofs: use the opener's credential when verifing metadata accesses
From: Gao Xiang @ 2026-05-08 8:39 UTC
To: Christoph Hellwig
Cc: linux-erofs, Chao Yu, LKML, oliver.yang, Carlos Llamas, Sandeep Dhavale, Christian Brauner, linux-fsdevel, Tatsuyuki Ishi

Hi Christoph,

On 2026/5/8 16:24, Tatsuyuki Ishi wrote:
> On Fri, May 8, 2026 at 5:20 PM Christoph Hellwig <hch@infradead.org> wrote:
>
>> On Tue, May 05, 2026 at 11:56:15PM +0800, Gao Xiang wrote:
>>> Similar to commit 905eeb2b7c33 ("erofs: impersonate the opener's
>>> credentials when accessing backing file"), rw_verify_area() needs
>>> the same too.
>>
>> Two things here:

Let me use Tatsuyuki's reply to address your two comments.

>>
>> - rw_verify_area is a helper for use inside the VFS and file system
>>   read/write method implementation. Erofs as a user of the VFS should
>>   not use it at all.

Currently EROFS file-backed mount metadata is directly using underlay
fs page cache, which is mainly used for composefs, etc. to avoid
different EROFS instances having their own EROFS page cache for the
same underlay backing file and avoid unnecessary copies into them.
--- That is also what composefs once did in their codebase.

Since EROFS just reads the underlay fs page cache and does _not_
touch anything inside the underlay page cache itself, I guess
it's fine?

On the other hand, we talked a bit about commit f2fed441c69b ("loop:
stop using vfs_iter_{read,write} for buffered I/O") in another
private thread related to fanotify, which lacks proper
rw_verify_area() as well, since it called into raw read/write
iter methods instead of using the previous vfs_iter_{read,write}.

>> - using the opener credentials when accessing the backing file seems
>>   wrong. The entity accessing it is the file system, so it should
>>   have system or mounter credentials, not that of someone causing
>>   metadata / fs data access. And this applies to all access by
>>   a file system backed by a backing file.
>>
>
> I think there's probably some confusion of terminology here. buf->file is
> opened with the mounter's credentials, so we are impersonating the mounter
> here. Perhaps the commit message could describe that more clearly. Same for
> the previous patches mentioned.

Here "opener" means the mounter as Tatsuyuki mentioned, I just
follow Tatsuyuki's term, but it just means mounter credentials
indeed.

Thanks,
Gao Xiang
* Re: [PATCH] erofs: use the opener's credential when verifing metadata accesses
From: Christoph Hellwig @ 2026-05-08 8:51 UTC
To: Gao Xiang
Cc: Christoph Hellwig, linux-erofs, Chao Yu, LKML, oliver.yang, Carlos Llamas, Sandeep Dhavale, Christian Brauner, linux-fsdevel, Tatsuyuki Ishi, Matthew Wilcox

On Fri, May 08, 2026 at 04:39:15PM +0800, Gao Xiang wrote:
> Currently EROFS file-backed mount metadata is directly using underlay
> fs page cache, which is mainly used for composefs, etc. to avoid
> different EROFS instances having their own EROFS page cache for the
> same underlay backing file and avoid unnecessary copies into them.
> --- That is also what composefs once did in their codebase.
>
> Since EROFS just reads the underlay fs page cache and does _not_
> touch anything inside the underlay page cache itself, I guess
> it's fine?

At the micro-level this does mean erofs needs to do the checks itself.
OTOH it means this whole scheme is completely broken. The page cache
is owned by the file system, so erofs can't simply poke into it.

Now for reads it mostly works on the most common disk-based file systems,
but it does create lots of problems for slightly more complex ones like
network/clustered or synthetic file systems. It also really breaks
layering, so we need to fix it. Not sure what would be best, but I'd be
tempted to have a cross-instance cache maintained by erofs and filled
using in-kernel direct I/O. IFF the page policies work great for you
that even could be a synthetic inode/mapping.

> On the other hand, we talked a bit about commit f2fed441c69b ("loop:
> stop using vfs_iter_{read,write} for buffered I/O") in another
> private thread related to fanotify, which lacks proper
> rw_verify_area() as well, since it called into raw read/write
> iter methods instead of using the previous vfs_iter_{read,write}.

Note that this does not add the bypass, just extends it to both I/O
types. But yes, this breaks fanotify. We actually have quite a few
raw ->read_iter/->write_iter calls, so this might need more structured
treatment.
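[Editor's sketch] The "bypass" discussed above, where a caller invokes a raw read method instead of going through a checked entry point, can be illustrated with a small user-space model. This is an assumption-laden toy, not the kernel's `rw_verify_area()`: every `toy_` name is hypothetical, the range checks are simplified, and a counter stands in for the hooks (fanotify, security) that the thread says get skipped.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Toy model only: every toy_ name is hypothetical, not a kernel API. */
struct toy_file {
    long long size;     /* bytes available in the toy file */
    int hook_calls;     /* how often the permission/notify hook ran */
};

/* Stand-in for the hooks that are reached only from the checked path. */
static int toy_permission_hook(struct toy_file *f)
{
    f->hook_calls++;
    return 0;
}

/* Simplified range sanity checks followed by the hook call. */
static int toy_verify_area(struct toy_file *f, long long pos, size_t count)
{
    if (count > (size_t)LLONG_MAX)
        return -EINVAL;
    if (pos < 0)
        return -EINVAL;
    if (pos > LLONG_MAX - (long long)count)  /* pos + count would overflow */
        return -EINVAL;
    return toy_permission_hook(f);
}

/* The file's own read method: returns bytes readable, never calls the hook. */
static long long toy_read_iter(struct toy_file *f, long long pos, size_t count)
{
    if (pos < 0 || pos >= f->size)
        return 0;
    size_t left = (size_t)(f->size - pos);
    return (long long)(count < left ? count : left);
}

/* Checked entry point: verify first, then call the method. */
static long long toy_vfs_read(struct toy_file *f, long long pos, size_t count)
{
    int err = toy_verify_area(f, pos, count);
    if (err)
        return err;
    return toy_read_iter(f, pos, count);
}

/* Usage: the raw call returns the same data but leaves the hook untouched. */
int toy_verify_demo(void)
{
    struct toy_file f = { .size = 4096, .hook_calls = 0 };

    assert(toy_vfs_read(&f, 0, 512) == 512);       /* checked path runs the hook */
    assert(f.hook_calls == 1);
    assert(toy_read_iter(&f, 0, 512) == 512);      /* raw call: no hook */
    assert(f.hook_calls == 1);
    assert(toy_vfs_read(&f, -1, 512) == -EINVAL);  /* bad range rejected up front */
    assert(f.hook_calls == 1);
    return 0;
}
```

The point of the model is only that both paths return the same bytes, so the missing hook call is invisible to the reader of the data; that is why a caller of the raw method has to perform the checks itself, as noted at the start of the message above.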
* Re: [PATCH] erofs: use the opener's credential when verifing metadata accesses
From: Gao Xiang @ 2026-05-08 9:10 UTC
To: Christoph Hellwig
Cc: linux-erofs, Chao Yu, LKML, oliver.yang, Carlos Llamas, Sandeep Dhavale, Christian Brauner, linux-fsdevel, Tatsuyuki Ishi, Matthew Wilcox

On 2026/5/8 16:51, Christoph Hellwig wrote:
> On Fri, May 08, 2026 at 04:39:15PM +0800, Gao Xiang wrote:
>> Currently EROFS file-backed mount metadata is directly using underlay
>> fs page cache, which is mainly used for composefs, etc. to avoid
>> different EROFS instances having their own EROFS page cache for the
>> same underlay backing file and avoid unnecessary copies into them.
>> --- That is also what composefs once did in their codebase.
>>
>> Since EROFS just reads the underlay fs page cache and does _not_
>> touch anything inside the underlay page cache itself, I guess
>> it's fine?
>
> At the micro-level this does mean erofs needs to do the checks itself.
> OTOH it means this whole scheme is completely broken. The page cache
> is owned by the file system, so erofs can't simply poke into it.

The page cache is indeed owned by the underlay file system,
but erofs doesn't poke into it: it just needs some temporary
metadata read usage without extra allocated buffers.

On the one hand, I hope there could be some interface for
such temporary usage rather than just one vfs_iter_read model.

>
> Now for reads it mostly works on the most common disk-based file systems,
> but it does create lots of problems for slightly more complex ones like
> network/clustered or synthetic file systems. It also really breaks

Just out of curiosity, could you point out one specific path
so I can look into that?

> layering, so we need to fix it. Not sure what would be best, but I'd be
> tempted to have a cross-instance cache maintained by erofs and filled
> using in-kernel direct I/O. IFF the page policies work great for you

Direct I/O may be improper for many cases, since users will use
buffered I/O to download the images from remotes just now, and
direct I/O just makes it worse (it invalidates the cache and rereads
from disk) and causes double caching if the underlay file is also read.

> that even could be a synthetic inode/mapping.

I expected similar comments. If we really need to work out such a
cross-instance cache, I'm fine to implement it for Linux 7.2. It will
increase the complexity of the codebase and also it won't share
the cache with the underlay fs.

But could we just fix this issue first for previous linux versions?

>
>> On the other hand, we talked a bit about commit f2fed441c69b ("loop:
>> stop using vfs_iter_{read,write} for buffered I/O") in another
>> private thread related to fanotify, which lacks proper
>> rw_verify_area() as well, since it called into raw read/write
>> iter methods instead of using the previous vfs_iter_{read,write}.
>
> Note that this does not add the bypass, just extends it to both I/O
> types. But yes, this breaks fanotify. We actually have quite a few
> raw ->read_iter/->write_iter calls, so this might need more structured
> treatment.

It also bypasses the security hooks I think.

Thanks,
Gao Xiang
* Re: [PATCH] erofs: use the opener's credential when verifing metadata accesses
From: Christoph Hellwig @ 2026-05-11 6:18 UTC
To: Gao Xiang
Cc: Christoph Hellwig, linux-erofs, Chao Yu, LKML, oliver.yang, Carlos Llamas, Sandeep Dhavale, Christian Brauner, linux-fsdevel, Tatsuyuki Ishi, Matthew Wilcox

On Fri, May 08, 2026 at 05:10:21PM +0800, Gao Xiang wrote:
> On the one hand, I hope there could be some interface for
> such temporary usage rather than just one vfs_iter_read model.

As in an in-kernel mmap? While not entirely impossible, the locking
model for that sounds horrible.

> > Now for reads it mostly works on the most common disk-based file systems,
> > but it does create lots of problems for slightly more complex ones like
> > network/clustered or synthetic file systems. It also really breaks
>
> Just out of curiosity, could you point out one specific path
> so I can look into that?

File systems might require their own locking, e.g. cluster locks for
cluster file systems, and at least in the past direct page cache access
also caused problems with NFS data invalidation semantics. Last but not
least ->read_folio has a file parameter that isn't really a file but a
file system specific cookie. So calling this with something not managed
by the file system can cause problems and has caused crashes in the past,
although the offender at that time (the old smbfs) is now gone.

> But could we just fix this issue first for previous linux versions?

I just pointed out another issue. You'll have to fix the credentials
either way.
* Re: [PATCH] erofs: use the opener's credential when verifing metadata accesses
From: Gao Xiang @ 2026-05-11 6:52 UTC
To: Christoph Hellwig
Cc: linux-erofs, Chao Yu, LKML, oliver.yang, Carlos Llamas, Sandeep Dhavale, Christian Brauner, linux-fsdevel, Tatsuyuki Ishi, Matthew Wilcox

On 2026/5/11 14:18, Christoph Hellwig wrote:
> On Fri, May 08, 2026 at 05:10:21PM +0800, Gao Xiang wrote:
>> On the one hand, I hope there could be some interface for
>> such temporary usage rather than just one vfs_iter_read model.
>
> As in an in-kernel mmap? While not entirely impossible, the locking
> model for that sounds horrible.

I don't think it needs a full in-kernel mmap, it just works on some
uptodate folios.

Which locking model? For page cache, it's expected that all
folios shouldn't clear uptodate randomly at any time. At least
for erofs use cases, we only care about uptodate folios, no matter
whether they are being invalidated/truncated or not (mapping == NULL).

Maybe it's not suitable for other stricter cases, but for
immutable fs models, that is enough and efficient.

>
>>> Now for reads it mostly works on the most common disk-based file systems,
>>> but it does create lots of problems for slightly more complex ones like
>>> network/clustered or synthetic file systems. It also really breaks
>>
>> Just out of curiosity, could you point out one specific path
>> so I can look into that?
>
> File systems might require their own locking, e.g. cluster locks for
> cluster file systems, and at least in the past direct page cache access
> also caused problems with NFS data invalidation semantics. Last but not
> least ->read_folio has a file parameter that isn't really a file but a
> file system specific cookie. So calling this with something not managed
> by the file system can cause problems and has caused crashes in the past,
> although the offender at that time (the old smbfs) is now gone.

file is indeed a cookie, but I did some research on the codebase,
and I've seen no odd cases other than a real "struct file *"
anymore.

I agree such usage is kind of a gray area, but I've seen no risk
in practice as long as the underlay fs supports a proper ->read_folio
callback (and erofs restricts that.)

>
>> But could we just fix this issue first for previous linux versions?
>
> I just pointed out another issue. You'll have to fix the credentials
> either way.

I really hope Matthew could give some opinion on this too, because
this way, the underlay cache can be directly used for temporary
use, and it should be a RO access and won't impact any fs-owned
state.

Anyway, I could work out an alternative, but that makes the metadata
access less efficient.

Thanks,
Gao Xiang
* Re: [PATCH] erofs: use the opener's credential when verifing metadata accesses
From: Christian Brauner @ 2026-05-11 13:51 UTC
To: Gao Xiang
Cc: Christoph Hellwig, linux-erofs, Chao Yu, LKML, oliver.yang, Carlos Llamas, Sandeep Dhavale, linux-fsdevel, Tatsuyuki Ishi

On Fri, May 08, 2026 at 04:39:15PM +0800, Gao Xiang wrote:
> Hi Christoph,
>
> On 2026/5/8 16:24, Tatsuyuki Ishi wrote:
> > On Fri, May 8, 2026 at 5:20 PM Christoph Hellwig <hch@infradead.org> wrote:
> >
> > > On Tue, May 05, 2026 at 11:56:15PM +0800, Gao Xiang wrote:
> > > > Similar to commit 905eeb2b7c33 ("erofs: impersonate the opener's
> > > > credentials when accessing backing file"), rw_verify_area() needs
> > > > the same too.
> > >
> > > Two things here:
>
> Let me use Tatsuyuki's reply to address your two comments.
>
> > >
> > > - rw_verify_area is a helper for use inside the VFS and file system
> > >   read/write method implementation. Erofs as a user of the VFS should
> > >   not use it at all.
>
> Currently EROFS file-backed mount metadata is directly using underlay
> fs page cache, which is mainly used for composefs, etc. to avoid
> different EROFS instances having their own EROFS page cache for the
> same underlay backing file and avoid unnecessary copies into them.
> --- That is also what composefs once did in their codebase.
>
> Since EROFS just reads the underlay fs page cache and does _not_
> touch anything inside the underlay page cache itself, I guess
> it's fine?
>
> On the other hand, we talked a bit about commit f2fed441c69b ("loop:
> stop using vfs_iter_{read,write} for buffered I/O") in another
> private thread related to fanotify, which lacks proper
> rw_verify_area() as well, since it called into raw read/write
> iter methods instead of using the previous vfs_iter_{read,write}.
>
> > > - using the opener credentials when accessing the backing file seems
> > >   wrong. The entity accessing it is the file system, so it should
> > >   have system or mounter credentials, not that of someone causing
> > >   metadata / fs data access. And this applies to all access by
> > >   a file system backed by a backing file.
> > >
> >
> > I think there's probably some confusion of terminology here. buf->file is
> > opened with the mounter's credentials, so we are impersonating the mounter
> > here. Perhaps the commit message could describe that more clearly. Same for
> > the previous patches mentioned.
>
> Here "opener" means the mounter as Tatsuyuki mentioned, I just
> follow Tatsuyuki's term, but it just means mounter credentials
> indeed.

We're slowly reinventing overlayfs I see. ;) I think it's probably fine
but it's also rather sketchy to mess around with permissions like that.
Mainly because I don't think we have any actual page cache permission
model. It's inherently shared between everyone and this kinda tries to
bolt permissions on top to not make it so. Probably fine here but also a
bit wonky.
* Re: [PATCH] erofs: use the opener's credential when verifing metadata accesses
From: Gao Xiang @ 2026-05-11 14:42 UTC
To: Christian Brauner
Cc: Christoph Hellwig, linux-erofs, Chao Yu, LKML, oliver.yang, Carlos Llamas, Sandeep Dhavale, linux-fsdevel, Tatsuyuki Ishi

Hi Christian,

On 2026/5/11 21:51, Christian Brauner wrote:
> On Fri, May 08, 2026 at 04:39:15PM +0800, Gao Xiang wrote:
>> Hi Christoph,
>>
>> On 2026/5/8 16:24, Tatsuyuki Ishi wrote:
>>> On Fri, May 8, 2026 at 5:20 PM Christoph Hellwig <hch@infradead.org> wrote:
>>>
>>>> On Tue, May 05, 2026 at 11:56:15PM +0800, Gao Xiang wrote:
>>>>> Similar to commit 905eeb2b7c33 ("erofs: impersonate the opener's
>>>>> credentials when accessing backing file"), rw_verify_area() needs
>>>>> the same too.
>>>>
>>>> Two things here:
>>
>> Let me use Tatsuyuki's reply to address your two comments.
>>
>>>>
>>>> - rw_verify_area is a helper for use inside the VFS and file system
>>>>   read/write method implementation. Erofs as a user of the VFS should
>>>>   not use it at all.
>>
>> Currently EROFS file-backed mount metadata is directly using underlay
>> fs page cache, which is mainly used for composefs, etc. to avoid
>> different EROFS instances having their own EROFS page cache for the
>> same underlay backing file and avoid unnecessary copies into them.
>> --- That is also what composefs once did in their codebase.
>>
>> Since EROFS just reads the underlay fs page cache and does _not_
>> touch anything inside the underlay page cache itself, I guess
>> it's fine?
>>
>> On the other hand, we talked a bit about commit f2fed441c69b ("loop:
>> stop using vfs_iter_{read,write} for buffered I/O") in another
>> private thread related to fanotify, which lacks proper
>> rw_verify_area() as well, since it called into raw read/write
>> iter methods instead of using the previous vfs_iter_{read,write}.
>>
>>>> - using the opener credentials when accessing the backing file seems
>>>>   wrong. The entity accessing it is the file system, so it should
>>>>   have system or mounter credentials, not that of someone causing
>>>>   metadata / fs data access. And this applies to all access by
>>>>   a file system backed by a backing file.
>>>>
>>>
>>> I think there's probably some confusion of terminology here. buf->file is
>>> opened with the mounter's credentials, so we are impersonating the mounter
>>> here. Perhaps the commit message could describe that more clearly. Same for
>>> the previous patches mentioned.
>>
>> Here "opener" means the mounter as Tatsuyuki mentioned, I just
>> follow Tatsuyuki's term, but it just means mounter credentials
>> indeed.
>
> We're slowly reinventing overlayfs I see. ;) I think it's probably fine
> but it's also rather sketchy to mess around with permissions like that.
> Mainly because I don't think we have any actual page cache permission
> model. It's inherently shared between everyone and this kinda tries to
> bolt permissions on top to not make it so. Probably fine here but also a
> bit wonky.

Loop devices just purely use the kernel cred instead. I think using
the mounter cred is more reasonable and safer than the kernel
cred.

Anyway, I think this cred part is less controversial.

The main issue raised by Christoph is still the metadata path:
I tend to use the underlay inode page cache for temporary RO
access since it's efficient and cache-friendly; and for immutable
models we shouldn't care too much about the invalidation, etc.
since there is no need to rely on the locking to keep the underlay
data in a strict way.

Thanks,
Gao Xiang