* [PATCH] fuse: fix race between inode/dentry invalidation and readdir
@ 2026-04-24 13:49 Luis Henriques
  2026-04-24 19:35 ` Joanne Koong
  0 siblings, 1 reply; 5+ messages in thread
From: Luis Henriques @ 2026-04-24 13:49 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: fuse-devel, linux-fsdevel, linux-kernel, Matt Harvey, kernel-dev,
	Luis Henriques

When there's a readdir in progress, doing a FUSE_NOTIFY_INVAL_{INODE,ENTRY}
on an inode or dentry may result in stale directory info being cached.  This
is because the invalidation does not reset the readdir cache.

This patch fixes this issue by adding a call to fuse_rdc_reset() (modified
to include the required locking) to these two operations, allowing the
readdir cache to be invalidated while it's being filled-in.

Assisted-by: Claude:claude-opus-4-5
Signed-off-by: Luis Henriques <luis@igalia.com>
---
 fs/fuse/dir.c     |  5 +++--
 fs/fuse/fuse_i.h  | 13 +++++++++++++
 fs/fuse/inode.c   |  1 +
 fs/fuse/readdir.c |  6 +++---
 4 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 7ac6b232ef12..6e5851de3613 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -1615,6 +1615,7 @@ int fuse_reverse_inval_entry(struct fuse_conn *fc, u64 parent_nodeid,
 	if (!(flags & FUSE_EXPIRE_ONLY))
 		d_invalidate(entry);
 	fuse_invalidate_entry_cache(entry);
+	fuse_rdc_reset(entry->d_inode);
 
 	if (child_nodeid != 0) {
 		inode_lock(d_inode(entry));
@@ -1637,7 +1638,7 @@ int fuse_reverse_inval_entry(struct fuse_conn *fc, u64 parent_nodeid,
 		dont_mount(entry);
 		clear_nlink(d_inode(entry));
 		err = 0;
- badentry:
+badentry:
 		inode_unlock(d_inode(entry));
 		if (!err)
 			d_delete(entry);
@@ -1646,7 +1647,7 @@ int fuse_reverse_inval_entry(struct fuse_conn *fc, u64 parent_nodeid,
 	}
 
 	end_removing(entry);
- put_parent:
+put_parent:
 	dput(dir);
 	iput(parent);
 	return err;
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 7f16049387d1..0f31065f8046 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -1494,6 +1494,19 @@ int fuse_set_acl(struct mnt_idmap *, struct dentry *dentry,
 
 /* readdir.c */
 int fuse_readdir(struct file *file, struct dir_context *ctx);
+void __fuse_rdc_reset(struct inode *inode);
+
+static inline void fuse_rdc_reset(struct inode *inode)
+{
+	struct fuse_inode *fi;
+
+	if (S_ISDIR(inode->i_mode)) {
+		fi = get_fuse_inode(inode);
+		spin_lock(&fi->rdc.lock);
+		__fuse_rdc_reset(inode);
+		spin_unlock(&fi->rdc.lock);
+	}
+}
 
 /**
  * Return the number of bytes in an arguments list
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index c795abe47a4f..4d8220f573f2 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -570,6 +570,7 @@ int fuse_reverse_inval_inode(struct fuse_conn *fc, u64 nodeid,
 	fi->attr_version = atomic64_inc_return(&fc->attr_version);
 	spin_unlock(&fi->lock);
 
+	fuse_rdc_reset(inode);
 	fuse_invalidate_attr(inode);
 	forget_all_cached_acls(inode);
 	if (offset >= 0) {
diff --git a/fs/fuse/readdir.c b/fs/fuse/readdir.c
index c2aae2eef086..e7e1f051e45c 100644
--- a/fs/fuse/readdir.c
+++ b/fs/fuse/readdir.c
@@ -430,7 +430,7 @@ static enum fuse_parse_result fuse_parse_cache(struct fuse_file *ff,
 	return res;
 }
 
-static void fuse_rdc_reset(struct inode *inode)
+void __fuse_rdc_reset(struct inode *inode)
 {
 	struct fuse_inode *fi = get_fuse_inode(inode);
 
@@ -493,7 +493,7 @@ static int fuse_readdir_cached(struct file *file, struct dir_context *ctx)
 
 		if (inode_peek_iversion(inode) != fi->rdc.iversion ||
 		    !timespec64_equal(&fi->rdc.mtime, &mtime)) {
-			fuse_rdc_reset(inode);
+			__fuse_rdc_reset(inode);
 			goto retry_locked;
 		}
 	}
@@ -541,7 +541,7 @@ static int fuse_readdir_cached(struct file *file, struct dir_context *ctx)
 		 * Uh-oh: page gone missing, cache is useless
 		 */
 		if (fi->rdc.version == ff->readdir.version)
-			fuse_rdc_reset(inode);
+			__fuse_rdc_reset(inode);
 		goto retry_locked;
 	}
 


* Re: [PATCH] fuse: fix race between inode/dentry invalidation and readdir
  2026-04-24 13:49 [PATCH] fuse: fix race between inode/dentry invalidation and readdir Luis Henriques
@ 2026-04-24 19:35 ` Joanne Koong
  2026-04-27  9:23   ` Luis Henriques
  0 siblings, 1 reply; 5+ messages in thread
From: Joanne Koong @ 2026-04-24 19:35 UTC (permalink / raw)
  To: Luis Henriques
  Cc: Miklos Szeredi, fuse-devel, linux-fsdevel, linux-kernel,
	Matt Harvey, kernel-dev

On Fri, Apr 24, 2026 at 6:53 AM Luis Henriques <luis@igalia.com> wrote:
>
> When there's a readdir in progress, doing a FUSE_NOTIFY_INVAL_{INODE,ENTRY}
> on an inode or dentry may result in stale directory info being cached.  This
> is because the invalidation does not reset the readdir cache.
>
> This patch fixes this issue by adding a call to fuse_rdc_reset() (modified
> to include the required locking) to these two operations, allowing the
> readdir cache to be invalidated while it's being filled-in.

Hi Luis,

Just curious, are you hitting this issue in practice or is this mostly
theoretical?

afaict for fuse_notify_inval_entry(), it calls into
fuse_reverse_inval_entry() -> fuse_dir_changed(parent), which calls
inode_maybe_inc_iversion(). afaict, this actually increments i_version
(since I_VERSION_QUERIED flag was set when the cache's iversion was
initialized with inode_query_iversion() in fuse_readdir_cached()),
which means the next readdir call will detect this version change and
call fuse_rdc_reset() (in fuse_readdir_cached()). I'm not sure I see
how this leads to stale directory info lingering in the cache after a
concurrent fuse_notify_inval_entry()?
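
The i_version behaviour described above can be illustrated with a small user-space model. This is only a sketch, not kernel code: the model_* names are hypothetical, the real helpers use atomic operations, and the layout (low bit as the "queried" flag, remaining bits as the counter) is assumed to mirror include/linux/iversion.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model only: low bit is the "queried" flag, the rest is the counter. */
#define MODEL_QUERIED   1ULL
#define MODEL_INCREMENT 2ULL

struct model_inode {
	uint64_t i_version;
};

/* Read the counter without marking it as queried. */
static uint64_t model_peek_iversion(const struct model_inode *inode)
{
	return inode->i_version >> 1;
}

/* Read the counter and record that somebody has looked at it. */
static uint64_t model_query_iversion(struct model_inode *inode)
{
	inode->i_version |= MODEL_QUERIED;
	return inode->i_version >> 1;
}

/* Bump the counter only if it was queried since the last bump, or if forced. */
static bool model_maybe_inc_iversion(struct model_inode *inode, bool force)
{
	if (!force && !(inode->i_version & MODEL_QUERIED))
		return false;
	inode->i_version = (inode->i_version & ~MODEL_QUERIED) + MODEL_INCREMENT;
	return true;
}

/* Returns true if a cache that recorded the version notices the later bump. */
static bool model_cache_detects_change(void)
{
	struct model_inode dir = { .i_version = 0 };
	uint64_t cached = model_query_iversion(&dir);   /* cache records the version */
	bool bumped = model_maybe_inc_iversion(&dir, false); /* invalidation arrives */

	assert(bumped); /* the flag was set by the query, so the bump happens */
	return model_peek_iversion(&dir) != cached;
}
```

In this model, model_cache_detects_change() returns true because the query sets the flag before the non-forced increment runs. Without a prior query, the same increment would be skipped.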

For the fuse_notify_inval_inode() case, which I'm assuming is the case
you're running into where the directory is the inode being
invalidated, I see the call to fuse_reverse_inval_inode() which calls
invalidate_inode_pages2_range() if the offset was non-negative, which
will invalidate the readdir cache's pages, which means on the next
readdir call, will already call fuse_rdc_reset() when it detects the
missing page in the cache (in fuse_readdir_cached()). So I'm not
really seeing how this can happen either for the
fuse_notify_inval_inode() case? Unless you are passing a negative
offset, but as I understand it, passing a negative offset is used only
if the server wants attributes invalidated [1], not any data.

afaics, the only stale directory info returned would be in the case of a
concurrent readdir that has already passed the pos == 0 iversion/mtime
check when the invalidation arrives, but that seems like a server
synchronization issue, e.g. if the server wants up-to-date data when doing
a concurrent readdir and invalidation, it has to order that itself. Any
fresh lookup after that though, I think would always return
fresh/non-stale data for the reasons mentioned above.
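
The window described above can be sketched in user space. This is a simplified model under stated assumptions, not the fuse implementation: the model_* names and the single version counter are hypothetical, and the only property modelled is that the cache is validated when a readdir starts at pos == 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct model_dir {
	uint64_t iversion;     /* bumped by an invalidation */
	uint64_t rdc_iversion; /* version recorded when the cache was filled */
	size_t rdc_size;       /* number of cached entries, 0 means empty */
	unsigned int resets;   /* how many times the cache was thrown away */
};

/* Fill the cache and remember which directory version it describes. */
static void model_fill(struct model_dir *d, size_t nentries)
{
	assert(nentries > 0);
	d->rdc_iversion = d->iversion;
	d->rdc_size = nentries;
}

/* An invalidation only bumps the version; it does not touch the cache. */
static void model_invalidate(struct model_dir *d)
{
	d->iversion++;
}

/* Returns true if the entry at pos is served from the cache. */
static bool model_readdir_cached(struct model_dir *d, size_t pos)
{
	/* The version is compared only when reading starts from the beginning. */
	if (pos == 0 && d->rdc_size && d->rdc_iversion != d->iversion) {
		d->rdc_size = 0;
		d->resets++;
	}
	return pos < d->rdc_size;
}

/* Counts reads served from a cache that no longer matches the directory. */
static unsigned int model_stale_reads_in_window(void)
{
	struct model_dir d = { 0 };
	unsigned int stale = 0;

	model_fill(&d, 4);
	model_readdir_cached(&d, 0);     /* passes the pos == 0 check */
	model_invalidate(&d);            /* invalidation arrives mid-stream */
	if (model_readdir_cached(&d, 2)) /* continues from the old cache */
		stale++;
	if (model_readdir_cached(&d, 0)) /* fresh readdir: cache is reset */
		stale++;
	return stale;
}
```

In this model only the in-flight read at pos 2 is served from the outdated cache. The next readdir starting at pos 0 sees the version mismatch and resets the cache.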

Does this align with what you're seeing in the code or am I missing
something here?

>
> Assisted-by: Claude:claude-opus-4-5
> Signed-off-by: Luis Henriques <luis@igalia.com>
> ---
>  fs/fuse/dir.c     |  5 +++--
>  fs/fuse/fuse_i.h  | 13 +++++++++++++
>  fs/fuse/inode.c   |  1 +
>  fs/fuse/readdir.c |  6 +++---
>  4 files changed, 20 insertions(+), 5 deletions(-)
>
> diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
> index 7ac6b232ef12..6e5851de3613 100644
> --- a/fs/fuse/dir.c
> +++ b/fs/fuse/dir.c
> @@ -1615,6 +1615,7 @@ int fuse_reverse_inval_entry(struct fuse_conn *fc, u64 parent_nodeid,
>         if (!(flags & FUSE_EXPIRE_ONLY))
>                 d_invalidate(entry);
>         fuse_invalidate_entry_cache(entry);
> +       fuse_rdc_reset(entry->d_inode);

Hmm... I think this resets the child's readdir cache but it's the
parent's readdir cache that would have to be invalidated, so would
this have to be fuse_rdc_reset(parent)?

Thanks,
Joanne

[1] https://libfuse.github.io/doxygen/fuse__lowlevel_8h.html#a9cb974af9745294ff446d11cba2422f1

>
>         if (child_nodeid != 0) {
>                 inode_lock(d_inode(entry));
> @@ -1637,7 +1638,7 @@ int fuse_reverse_inval_entry(struct fuse_conn *fc, u64 parent_nodeid,
>                 dont_mount(entry);
>                 clear_nlink(d_inode(entry));
>                 err = 0;
> - badentry:
> +badentry:
>                 inode_unlock(d_inode(entry));
>                 if (!err)
>                         d_delete(entry);


* Re: [PATCH] fuse: fix race between inode/dentry invalidation and readdir
  2026-04-24 19:35 ` Joanne Koong
@ 2026-04-27  9:23   ` Luis Henriques
  2026-04-27 12:48     ` Joanne Koong
  0 siblings, 1 reply; 5+ messages in thread
From: Luis Henriques @ 2026-04-27  9:23 UTC (permalink / raw)
  To: Joanne Koong
  Cc: Miklos Szeredi, fuse-devel, linux-fsdevel, linux-kernel,
	Matt Harvey, kernel-dev

Hi Joanne!

On Fri, Apr 24 2026, Joanne Koong wrote:

> On Fri, Apr 24, 2026 at 6:53 AM Luis Henriques <luis@igalia.com> wrote:
>>
>> When there's a readdir in progress, doing a FUSE_NOTIFY_INVAL_{INODE,ENTRY}
>> on an inode or dentry may result in stale directory info being cached.  This
>> is because the invalidation does not reset the readdir cache.
>>
>> This patch fixes this issue by adding a call to fuse_rdc_reset() (modified
>> to include the required locking) to these two operations, allowing the
>> readdir cache to be invalidated while it's being filled-in.
>
> Hi Luis,
>
> Just curious, are you hitting this issue in practice or is this mostly
> theoretical?
>
> afaict for fuse_notify_inval_entry(), it calls into
> fuse_reverse_inval_entry() -> fuse_dir_changed(parent), which calls
> inode_maybe_inc_iversion(). afaict, this actually increments i_version
> (since I_VERSION_QUERIED flag was set when the cache's iversion was
> initialized with inode_query_iversion() in fuse_readdir_cached()),
> which means the next readdir call will detect this version change and
> call fuse_rdc_reset() (in fuse_readdir_cached()). I'm not sure I see
> how this leads to stale directory info lingering in the cache after a
> concurrent fuse_notify_inval_entry()?
>
> For the fuse_notify_inval_inode() case, which I'm assuming is the case
> you're running into where the directory is the inode being
> invalidated, I see the call to fuse_reverse_inval_inode() which calls
> invalidate_inode_pages2_range() if the offset was non-negative, which
> will invalidate the readdir cache's pages, which means on the next
> readdir call, will already call fuse_rdc_reset() when it detects the
> missing page in the cache (in fuse_readdir_cached()). So I'm not
> really seeing how this can happen either for the
> fuse_notify_inval_inode() case? Unless you are passing a negative
> offset, but as I understand it, passing a negative offset is used only
> if the server wants attributes invalidated [1], not any data.
>
> afaics, the only stale directory info returned would be in the case of a
> concurrent readdir that has already passed the pos == 0 iversion/mtime
> check when the invalidation arrives, but that seems like a server
> synchronization issue, e.g. if the server wants up-to-date data when doing
> a concurrent readdir and invalidation, it has to order that itself. Any
> fresh lookup after that though, I think would always return
> fresh/non-stale data for the reasons mentioned above.
>
> Does this align with what you're seeing in the code or am I missing
> something here?

First of all, thanks a lot for looking into this and for doing such a
great description of the issue.

So, I did have a report regarding a possible race between a readdir and
invalidation when using keep_cache and cache_readdir.  But, unfortunately,
I don't have a lot of information regarding the actual issue, and it isn't
something reproducible.

Then, looking at the code (and, for full-disclosure, I've also looked at a
claude analysis that was handed over to me) I could see a race that I'm
trying to fix with this patch.  But I believe it's the race that you claim
above that it's a server synchronisation problem.  For example, with a
NOTIFY_INVAL_INODE operation, when fuse_reverse_inval_inode() is called
while fuse_add_dirent_to_cache() is being executed in parallel, the
iversion/mtime update could be missed.

It is possible to hit this small race by instrumenting the code, and I
could occasionally (and momentarily) see stale data while running readdir
in such instrumented testing environment.  Do you think that's something
inherent to the usage of the INVAL_INODE op, and this race will need to be
handled by user-space?

In fact, the report I got seemed to indicate that the issue was not going
away with a fresh lookup (though an 'echo 1 > /proc/sys/vm/drop_caches'
would fix it).  But maybe that's another indication that this is a problem
in the user-space server.

Cheers,
-- 
Luís


* Re: [PATCH] fuse: fix race between inode/dentry invalidation and readdir
  2026-04-27  9:23   ` Luis Henriques
@ 2026-04-27 12:48     ` Joanne Koong
  2026-04-27 17:06       ` Luis Henriques
  0 siblings, 1 reply; 5+ messages in thread
From: Joanne Koong @ 2026-04-27 12:48 UTC (permalink / raw)
  To: Luis Henriques
  Cc: Miklos Szeredi, fuse-devel, linux-fsdevel, linux-kernel,
	Matt Harvey, kernel-dev

On Mon, Apr 27, 2026 at 10:23 AM Luis Henriques <luis@igalia.com> wrote:

Hi Luis,

>
> First of all, thanks a lot for looking into this and for doing such a
> great description of the issue.
>
> So, I did have a report regarding a possible race between a readdir and
> invalidation when using keep_cache and cache_readdir.  But, unfortunately,
> I don't have a lot of information regarding the actual issue, and it isn't
> something reproducible.
>
> Then, looking at the code (and, for full-disclosure, I've also looked at a
> claude analysis that was handed over to me) I could see a race that I'm
> trying to fix with this patch.  But I believe it's the race that you claim
> above that it's a server synchronisation problem.  For example, with a
> NOTIFY_INVAL_INODE operation, when fuse_reverse_inval_inode() is called
> while fuse_add_dirent_to_cache() is being executed in parallel, the
> iversion/mtime update could be missed.
>
> It is possible to hit this small race by instrumenting the code, and I
> could occasionally (and momentarily) see stale data while running readdir
> in such instrumented testing environment.  Do you think that's something
> inherent to the usage of the INVAL_INODE op, and this race will need to be
> handled by user-space?

imo yes, that is not a bug in the kernel and userspace is responsible
for synchronizing/coordinating that. I think the kernel is just
responsible for ensuring that any subsequent readdirs are not stale,
but afaict the existing code handles that.

>
> In fact, the report I got seemed to indicate that the issue was not going
> away with a fresh lookup (though an 'echo 1 > /proc/sys/vm/drop_caches'
> would fix it).  But maybe that's another indication that this is a problem
> in the user-space server.

that seems weird to me, maybe there's something else at play here in
addition to the concurrent race? Is there a repro for where the stale
data survives a fresh lookup?

Thanks,
Joanne


* Re: [PATCH] fuse: fix race between inode/dentry invalidation and readdir
  2026-04-27 12:48     ` Joanne Koong
@ 2026-04-27 17:06       ` Luis Henriques
  0 siblings, 0 replies; 5+ messages in thread
From: Luis Henriques @ 2026-04-27 17:06 UTC (permalink / raw)
  To: Joanne Koong
  Cc: Miklos Szeredi, fuse-devel, linux-fsdevel, linux-kernel,
	Matt Harvey, kernel-dev

On Mon, Apr 27 2026, Joanne Koong wrote:

> On Mon, Apr 27, 2026 at 10:23 AM Luis Henriques <luis@igalia.com> wrote:
>
> Hi Luis,
>
>>
>> First of all, thanks a lot for looking into this and for doing such a
>> great description of the issue.
>>
>> So, I did have a report regarding a possible race between a readdir and
>> invalidation when using keep_cache and cache_readdir.  But, unfortunately,
>> I don't have a lot of information regarding the actual issue, and it isn't
>> something reproducible.
>>
>> Then, looking at the code (and, for full-disclosure, I've also looked at a
>> claude analysis that was handed over to me) I could see a race that I'm
>> trying to fix with this patch.  But I believe it's the race that you claim
>> above that it's a server synchronisation problem.  For example, with a
>> NOTIFY_INVAL_INODE operation, when fuse_reverse_inval_inode() is called
>> while fuse_add_dirent_to_cache() is being executed in parallel, the
>> iversion/mtime update could be missed.
>>
>> It is possible to hit this small race by instrumenting the code, and I
>> could occasionally (and momentarily) see stale data while running readdir
>> in such instrumented testing environment.  Do you think that's something
>> inherent to the usage of the INVAL_INODE op, and this race will need to be
>> handled by user-space?
>
> imo yes, that is not a bug in the kernel and userspace is responsible
> for synchronizing/coordinating that. I think the kernel is just
> responsible for ensuring that any subsequent readdirs are not stale,
> but afaict the existing code handles that.

Ack, thanks.

>> In fact, the report I got seemed to indicate that the issue was not going
>> away with a fresh lookup (though an 'echo 1 > /proc/sys/vm/drop_caches'
>> would fix it).  But maybe that's another indication that this is a problem
>> in the user-space server.
>
> that seems weird to me, maybe there's something else at play here in
> addition to the concurrent race? Is there a repro for where the stale
> data survives a fresh lookup?

Unfortunately no, I do not have any reproducer.  And from looking at the
code I couldn't find anything else.  I'll have to look closer into the
user-space code doing the invalidation to try to understand what else
could be at play here.  And again, thank you for your feedback, Joanne.

Cheers,
-- 
Luís
