* [RFC PATCH] ovl: keep merged and impure readdir caches separate
@ 2026-05-11 6:20 Nirmoy Das
2026-05-11 20:54 ` Amir Goldstein
0 siblings, 1 reply; 4+ messages in thread
From: Nirmoy Das @ 2026-05-11 6:20 UTC (permalink / raw)
To: Miklos Szeredi, Amir Goldstein
Cc: linux-unionfs, linux-kernel, Nirmoy Das,
syzbot+a16fb0cce329a320661c
Overlayfs uses one inode cache slot for two readdir cache users with
different lifetime rules. Merged directory iteration pins the cache from
open directory files with cache->refcount. Impure real-directory iteration
uses the inode cache as an unrefcounted lookup table.
Those caches cannot be reused interchangeably. If merged iteration finds
an impure cache in the inode slot, it can pin and seek through a cache
that was not built for merged iteration. If impure iteration finds a merged
cache, it can walk an object whose lifetime is controlled by open directory
files. Either direction can leave ovl_iterate() using stale cache entries.
Add ovl_dir_cache_drop() to detach the inode cache before freeing it. Keep
refcounted merged caches alive until ovl_cache_put(), stop publishing new
merged caches through the inode slot, and let impure iteration reuse only
unrefcounted caches.
Fixes: 4edb83bb1041 ("ovl: constant d_ino for non-merge dirs")
Reported-by: syzbot+a16fb0cce329a320661c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=a16fb0cce329a320661c
Assisted-by: Codex:GPT-5 [lore] [checkpatch]
Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
---
fs/overlayfs/readdir.c | 38 ++++++++++++++++++++++++++------------
1 file changed, 26 insertions(+), 12 deletions(-)
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index 1dcc75b3a90f9..326d8ad173881 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -292,6 +292,27 @@ void ovl_dir_cache_free(struct inode *inode)
}
}
+static void ovl_dir_cache_drop(struct inode *inode)
+{
+ struct ovl_dir_cache *cache = ovl_dir_cache(inode);
+
+ if (!cache)
+ return;
+
+ ovl_set_dir_cache(inode, NULL);
+
+ /*
+ * Merged dir caches are refcounted by open directory files. If the
+ * inode cache is replaced while such a file still references it, keep
+ * the old cache alive until ovl_cache_put().
+ */
+ if (cache->refcount)
+ return;
+
+ ovl_cache_free(&cache->entries);
+ kfree(cache);
+}
+
static void ovl_cache_put(struct ovl_dir_file *od, struct inode *inode)
{
struct ovl_dir_cache *cache = od->cache;
@@ -485,13 +506,7 @@ static struct ovl_dir_cache *ovl_cache_get(struct dentry *dentry)
struct ovl_dir_cache *cache;
struct inode *inode = d_inode(dentry);
- cache = ovl_dir_cache(inode);
- if (cache && ovl_inode_version_get(inode) == cache->version) {
- WARN_ON(!cache->refcount);
- cache->refcount++;
- return cache;
- }
- ovl_set_dir_cache(d_inode(dentry), NULL);
+ ovl_dir_cache_drop(inode);
cache = kzalloc_obj(struct ovl_dir_cache);
if (!cache)
@@ -509,7 +524,6 @@ static struct ovl_dir_cache *ovl_cache_get(struct dentry *dentry)
}
cache->version = ovl_inode_version_get(inode);
- ovl_set_dir_cache(inode, cache);
return cache;
}
@@ -699,12 +713,12 @@ static struct ovl_dir_cache *ovl_cache_get_impure(const struct path *path)
struct ovl_dir_cache *cache;
cache = ovl_dir_cache(inode);
- if (cache && ovl_inode_version_get(inode) == cache->version)
+ if (cache && !cache->refcount &&
+ ovl_inode_version_get(inode) == cache->version)
return cache;
- /* Impure cache is not refcounted, free it here */
- ovl_dir_cache_free(inode);
- ovl_set_dir_cache(inode, NULL);
+ /* Drop stale or incompatible inode cache before building impure cache */
+ ovl_dir_cache_drop(inode);
cache = kzalloc_obj(struct ovl_dir_cache);
if (!cache)
base-commit: e98d21c170b01ddef366f023bbfcf6b31509fa83
--
2.43.0
^ permalink raw reply related [flat|nested] 4+ messages in thread* Re: [RFC PATCH] ovl: keep merged and impure readdir caches separate
2026-05-11 6:20 [RFC PATCH] ovl: keep merged and impure readdir caches separate Nirmoy Das
@ 2026-05-11 20:54 ` Amir Goldstein
2026-05-12 18:27 ` Nirmoy Das
0 siblings, 1 reply; 4+ messages in thread
From: Amir Goldstein @ 2026-05-11 20:54 UTC (permalink / raw)
To: Nirmoy Das
Cc: Miklos Szeredi, linux-unionfs, linux-kernel,
syzbot+a16fb0cce329a320661c
[-- Attachment #1: Type: text/plain, Size: 2081 bytes --]
Hi Nirmoy,
Thanks for the patch.
On Mon, May 11, 2026 at 8:21 AM Nirmoy Das <nirmoyd@nvidia.com> wrote:
>
> Overlayfs uses one inode cache slot for two readdir cache users with
> different lifetime rules. Merged directory iteration pins the cache from
> open directory files with cache->refcount. Impure real-directory iteration
> uses the inode cache as an unrefcounted lookup table.
>
> Those caches cannot be reused interchangeably. If merged iteration finds
> an impure cache in the inode slot, it can pin and seek through a cache
> that was not built for merged iteration. If impure iteration finds a merged
> cache, it can walk an object whose lifetime is controlled by open directory
> files. Either direction can leave ovl_iterate() using stale cache entries.
>
> Add ovl_dir_cache_drop() to detach the inode cache before freeing it. Keep
> refcounted merged caches alive until ovl_cache_put(), stop publishing new
> merged caches through the inode slot,
This is unacceptable - invalidating merged dir cache to paper over
another root bug, which was not really fixed.
> and let impure iteration reuse only
> unrefcounted caches.
All those guards are nice, but how does a directory inode change from
being merged to impure or vice versa?
This should never happen.
It took a lot of arguing with Sonnet about wrong leads to find the real bug.
Real bug was introduced by:
b79e05aaa1667 ("ovl: no direct iteration for dir with origin xattr")
It changed the test from:
od->is_real = !OVL_TYPE_MERGE(type);
od->is_upper = OVL_TYPE_UPPER(type);
to:
od->is_real = !ovl_test_flag(OVL_WHITEOUTS, d_inode(dir));
od->is_upper = OVL_TYPE_UPPER(type);
But there is a race window in copy up of a directory where
upper is set and published before the OVL_WHITEOUTS flag
is set.
an opendir() observing this state inside the race window will
wrongly start to iterate_real and another opendir later will
observe the flag and start iterate_merged - boom!
Attached is what I think is a correct fix.
WDYT?
Thanks,
Amir.
[-- Attachment #2: 0001-ovl-fix-race-between-copy-up-and-open-of-a-directory.patch --]
[-- Type: application/x-patch, Size: 2436 bytes --]
^ permalink raw reply [flat|nested] 4+ messages in thread* Re: [RFC PATCH] ovl: keep merged and impure readdir caches separate
2026-05-11 20:54 ` Amir Goldstein
@ 2026-05-12 18:27 ` Nirmoy Das
2026-05-12 19:21 ` Amir Goldstein
0 siblings, 1 reply; 4+ messages in thread
From: Nirmoy Das @ 2026-05-12 18:27 UTC (permalink / raw)
To: Amir Goldstein
Cc: Miklos Szeredi, linux-unionfs, linux-kernel,
syzbot+a16fb0cce329a320661c
Hi Amir,
On 11.05.26 23:54, Amir Goldstein wrote:
> Hi Nirmoy,
>
> Thanks for the patch.
>
> On Mon, May 11, 2026 at 8:21 AM Nirmoy Das <nirmoyd@nvidia.com> wrote:
>> Overlayfs uses one inode cache slot for two readdir cache users with
>> different lifetime rules. Merged directory iteration pins the cache from
>> open directory files with cache->refcount. Impure real-directory iteration
>> uses the inode cache as an unrefcounted lookup table.
>>
>> Those caches cannot be reused interchangeably. If merged iteration finds
>> an impure cache in the inode slot, it can pin and seek through a cache
>> that was not built for merged iteration. If impure iteration finds a merged
>> cache, it can walk an object whose lifetime is controlled by open directory
>> files. Either direction can leave ovl_iterate() using stale cache entries.
>>
>> Add ovl_dir_cache_drop() to detach the inode cache before freeing it. Keep
>> refcounted merged caches alive until ovl_cache_put(), stop publishing new
>> merged caches through the inode slot,
> This is unacceptable - invalidating merged dir cache to paper over
> another root bug, which was not really fixed.
>
>> and let impure iteration reuse only
>> unrefcounted caches.
> All those guards are nice, but how does a directory inode change from
> being merged to impure or vice versa?
> This should never happen.
>
> It took a lot of arguing with Sonnet about wrong leads to find the real bug.
>
> Real bug was introduced by:
> b79e05aaa1667 ("ovl: no direct iteration for dir with origin xattr")
>
> It changed the test from:
>
> od->is_real = !OVL_TYPE_MERGE(type);
> od->is_upper = OVL_TYPE_UPPER(type);
>
> to:
>
> od->is_real = !ovl_test_flag(OVL_WHITEOUTS, d_inode(dir));
> od->is_upper = OVL_TYPE_UPPER(type);
>
> But there is a race window in copy up of a directory where
> upper is set and published before the OVL_WHITEOUTS flag
> is set.
>
> an opendir() observing this state inside the race window will
> wrongly start to iterate_real and another opendir later will
> observe the flag and start iterate_merged - boom!
>
> Attached is what I think is a correct fix.
> WDYT?
This looks correct with my limited understanding of the overlay code. I
still see the issue when running with
the syzkaller-derived C reproducer in a loop inside an arm64 virtme/qemu
VM with a KASAN/debug kernel.
Let me try to debug it more and get back.
Regards,
Nirmoy
>
> Thanks,
> Amir.
^ permalink raw reply [flat|nested] 4+ messages in thread* Re: [RFC PATCH] ovl: keep merged and impure readdir caches separate
2026-05-12 18:27 ` Nirmoy Das
@ 2026-05-12 19:21 ` Amir Goldstein
0 siblings, 0 replies; 4+ messages in thread
From: Amir Goldstein @ 2026-05-12 19:21 UTC (permalink / raw)
To: Nirmoy Das
Cc: Miklos Szeredi, linux-unionfs, linux-kernel,
syzbot+a16fb0cce329a320661c
On Tue, May 12, 2026 at 8:28 PM Nirmoy Das <nirmoyd@nvidia.com> wrote:
>
> Hi Amir,
>
> On 11.05.26 23:54, Amir Goldstein wrote:
> > Hi Nirmoy,
> >
> > Thanks for the patch.
> >
> > On Mon, May 11, 2026 at 8:21 AM Nirmoy Das <nirmoyd@nvidia.com> wrote:
> >> Overlayfs uses one inode cache slot for two readdir cache users with
> >> different lifetime rules. Merged directory iteration pins the cache from
> >> open directory files with cache->refcount. Impure real-directory iteration
> >> uses the inode cache as an unrefcounted lookup table.
> >>
> >> Those caches cannot be reused interchangeably. If merged iteration finds
> >> an impure cache in the inode slot, it can pin and seek through a cache
> >> that was not built for merged iteration. If impure iteration finds a merged
> >> cache, it can walk an object whose lifetime is controlled by open directory
> >> files. Either direction can leave ovl_iterate() using stale cache entries.
> >>
> >> Add ovl_dir_cache_drop() to detach the inode cache before freeing it. Keep
> >> refcounted merged caches alive until ovl_cache_put(), stop publishing new
> >> merged caches through the inode slot,
> > This is unacceptable - invalidating merged dir cache to paper over
> > another root bug, which was not really fixed.
> >
> >> and let impure iteration reuse only
> >> unrefcounted caches.
> > All those guards are nice, but how does a directory inode change from
> > being merged to impure or vice versa?
> > This should never happen.
> >
> > It took a lot of arguing with Sonnet about wrong leads to find the real bug.
> >
> > Real bug was introduced by:
> > b79e05aaa1667 ("ovl: no direct iteration for dir with origin xattr")
> >
> > It changed the test from:
> >
> > od->is_real = !OVL_TYPE_MERGE(type);
> > od->is_upper = OVL_TYPE_UPPER(type);
> >
> > to:
> >
> > od->is_real = !ovl_test_flag(OVL_WHITEOUTS, d_inode(dir));
> > od->is_upper = OVL_TYPE_UPPER(type);
> >
> > But there is a race window in copy up of a directory where
> > upper is set and published before the OVL_WHITEOUTS flag
> > is set.
> >
> > an opendir() observing this state inside the race window will
> > wrongly start to iterate_real and another opendir later will
> > observe the flag and start iterate_merged - boom!
> >
> > Attached is what I think is a correct fix.
> > WDYT?
>
> This looks correct with my limited understanding of the overlay code. I
> still see the issue when running with
>
> the syzkaller-derived C reproducer in a loop inside an arm64 virtme/qemu
> VM with a KASAN/debug kernel.
>
>
> Let me try to debug it more and get back.
Thanks! I appreciate it.
I wasn't sure how reliably the reproducer works.
If you do find a fix to the root cause, do feel free to use my patch as is
or modified and post with changes and proper commit message.
b.t.w I am not against adding WARN_ON() assertions in readdir
code to make sure we do not hit a problem like this again,
but as assertions, not as code that looks like it should really
work in this situation (i.e. no dropping of wrong cache just error).
FYI, I had contemplated an alternative fix, but went with the
smallest one:
implement ovl_i_path_type(struct inode *inode)
and make ovl_path_type(struct dentry *dentry) a wrapper
static inline bool ovl_dir_is_real(struct inode *dir)
{
return !ovl_test_flag(OVL_WHITEOUTS, dir) &&
!OVL_TYPE_MERGE(ovl_i_path_type(dir));
}
This fix is a bit nicer to understand and the OVL_WHITEOUTS
flags becomes an optimization for OVL_TYPE_MERGE()
in addition to handling the remnant whiteouts case.
If this happens to survive the reproducer for some reason,
do feel free to post it instead of my patch.
Thanks,
Amir.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-05-12 19:21 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-11 6:20 [RFC PATCH] ovl: keep merged and impure readdir caches separate Nirmoy Das
2026-05-11 20:54 ` Amir Goldstein
2026-05-12 18:27 ` Nirmoy Das
2026-05-12 19:21 ` Amir Goldstein
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox