linux-fsdevel.vger.kernel.org archive mirror
* Questions about FUSE_NOTIFY_INVAL_ENTRY
@ 2025-08-19 23:34 Jim Harris
  2025-08-20  8:55 ` Miklos Szeredi
  0 siblings, 1 reply; 5+ messages in thread
From: Jim Harris @ 2025-08-19 23:34 UTC
  To: linux-fsdevel@vger.kernel.org, virtualization@lists.linux.dev,
	kvm@vger.kernel.org, miklos@szeredi.hu, stefanha@redhat.com
  Cc: Max Gurtovoy, Idan Zach, Roman Spiegelman, Ben Walker, Oren Duer

Hi,

We have a case where our virtio-fs FUSE device (running on a BlueField DPU) needs to invalidate some of its internal file objects so that it can free memory for the file objects of newly looked-up files. We cannot rely on the host to invalidate entries on its own, since the host typically has far more memory for its caches than a real hardware device has for caching file objects.

We would like to use the FUSE_NOTIFY_INVAL_ENTRY notification to ask the host to invalidate inodes, triggering FUSE FORGET commands that let the device free the file objects associated with those inodes. We cannot find any documentation that explicitly says FUSE_NOTIFY_INVAL_ENTRY can be used for this purpose, but initial testing and code inspection indicate that it does trigger the FORGET commands that allow us to free some of the file objects in device memory.

Can we safely depend on the FUSE_NOTIFY_INVAL_ENTRY notifications to trigger FORGET commands for the associated inodes? If not, can we consider adding a new FUSE_NOTIFY_DROP_ENTRY notification that would ask the kernel to release the inode and send a FORGET command when memory pressure or clean-up is needed by the device?
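
For readers unfamiliar with the wire format, here is a minimal sketch of how such a notification could be assembled on the device side. The two structs are local mirrors of struct fuse_out_header and struct fuse_notify_inval_entry_out from include/uapi/linux/fuse.h; the struct names and the build_inval_entry() helper are made up for this illustration, and how the buffer is delivered to the host is device specific and left out. Note that the entry is identified by parent nodeid plus name, so the device has to keep the name around.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Local mirror of struct fuse_out_header. */
struct out_header {
    uint32_t len;      /* total message length, header included */
    int32_t  error;    /* for a notification: the notify code */
    uint64_t unique;   /* 0 marks the message as a notification */
};

/* Local mirror of struct fuse_notify_inval_entry_out. */
struct inval_entry_out {
    uint64_t parent;   /* nodeid of the parent directory */
    uint32_t namelen;  /* strlen(name); the trailing NUL is not counted */
    uint32_t flags;    /* 0 or EXPIRE_ONLY */
};

enum { NOTIFY_INVAL_ENTRY = 3 };   /* value of FUSE_NOTIFY_INVAL_ENTRY */
#define EXPIRE_ONLY (1u << 0)      /* value of FUSE_EXPIRE_ONLY */

/*
 * Hypothetical helper: builds one notification in a malloc'd buffer.
 * Layout: header, argument struct, then the NUL-terminated name.
 * Returns NULL on allocation failure; the caller frees the buffer.
 */
static uint8_t *build_inval_entry(uint64_t parent, const char *name,
                                  uint32_t flags, size_t *out_len)
{
    assert(name != NULL && out_len != NULL);

    size_t namelen = strlen(name);
    size_t total = sizeof(struct out_header) +
                   sizeof(struct inval_entry_out) + namelen + 1;
    uint8_t *buf = calloc(1, total);
    if (buf == NULL)
        return NULL;

    struct out_header hdr = {
        .len = (uint32_t)total,
        .error = NOTIFY_INVAL_ENTRY,
        .unique = 0,
    };
    struct inval_entry_out arg = {
        .parent = parent,
        .namelen = (uint32_t)namelen,
        .flags = flags,
    };

    memcpy(buf, &hdr, sizeof hdr);
    memcpy(buf + sizeof hdr, &arg, sizeof arg);
    memcpy(buf + sizeof hdr + sizeof arg, name, namelen + 1);

    *out_len = total;
    return buf;
}
```

Calling build_inval_entry(parent_nodeid, "foo.txt", 0, &len) would give a plain invalidation for that entry; passing EXPIRE_ONLY instead of 0 sets the flag bit in the same message.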

Best regards,

Jim Harris


* Re: Questions about FUSE_NOTIFY_INVAL_ENTRY
  2025-08-19 23:34 Questions about FUSE_NOTIFY_INVAL_ENTRY Jim Harris
@ 2025-08-20  8:55 ` Miklos Szeredi
  2025-08-20 20:42   ` Jim Harris
  0 siblings, 1 reply; 5+ messages in thread
From: Miklos Szeredi @ 2025-08-20  8:55 UTC
  To: Jim Harris
  Cc: linux-fsdevel@vger.kernel.org, virtualization@lists.linux.dev,
	kvm@vger.kernel.org, stefanha@redhat.com, Max Gurtovoy, Idan Zach,
	Roman Spiegelman, Ben Walker, Oren Duer

On Wed, 20 Aug 2025 at 01:35, Jim Harris <jiharris@nvidia.com> wrote:

> Can we safely depend on the FUSE_NOTIFY_INVAL_ENTRY notifications to trigger FORGET commands for the associated inodes? If not, can we consider adding a new FUSE_NOTIFY_DROP_ENTRY notification that would ask the kernel to release the inode and send a FORGET command when memory pressure or clean-up is needed by the device?

As far as I understand, what you want is to drop the entry from the
cache *if it is unused*.  Plain FUSE_NOTIFY_INVAL_ENTRY will unhash the
dentry regardless of its refcount; of course, FORGET will be sent only
after the last reference is released.

FUSE_NOTIFY_INVAL_ENTRY with FUSE_EXPIRE_ONLY will do something like
your desired FUSE_NOTIFY_DROP_ENTRY operation, at least on virtiofs
(where fc->delete_stale is on).  I notice there's a fuse_dir_changed()
call regardless of FUSE_EXPIRE_ONLY, which is not appropriate for the
drop case; this can probably be moved inside the !FUSE_EXPIRE_ONLY
branch.
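
The difference between the two modes can be pictured with a small toy model. This is not kernel code: struct toy_entry and the toy_* functions are invented for illustration, one entry stands in for both the dentry and its inode, and the rule that an expired entry is dropped when its last user goes away is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a cached entry and its inode (hypothetical). */
struct toy_entry {
    unsigned refcount;   /* users currently holding the entry */
    bool hashed;         /* still findable by name lookup */
    bool expired;        /* cache timeout forced to zero */
    bool forget_sent;    /* FORGET has been delivered to the device */
};

/* Drop the entry, and send FORGET, only once nobody uses it and it has
 * been either unhashed or expired. */
static void toy_maybe_drop(struct toy_entry *e)
{
    if (e->refcount == 0 && (!e->hashed || e->expired) && !e->forget_sent) {
        e->hashed = false;
        e->forget_sent = true;
    }
}

/* Plain invalidation unhashes regardless of refcount; the expire-only
 * variant only marks the entry as expired. */
static void toy_notify_inval_entry(struct toy_entry *e, bool expire_only)
{
    if (expire_only)
        e->expired = true;
    else
        e->hashed = false;
    toy_maybe_drop(e);
}

/* Release one reference; the last release may trigger the drop. */
static void toy_put(struct toy_entry *e)
{
    assert(e->refcount > 0);
    e->refcount--;
    toy_maybe_drop(e);
}
```

In this model an in-use entry hit by a plain invalidation disappears from lookup at once but yields FORGET only after toy_put() releases the last reference, while an unused entry hit by the expire-only variant yields FORGET immediately.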

The other question is whether something more efficient should be
added. E.g. FUSE_NOTIFY_SHRINK_LOOKUP_CACHE with a num_drop argument
that tells fuse to try to drop this many unused entries?

Thanks,
Miklos


* Re: Questions about FUSE_NOTIFY_INVAL_ENTRY
  2025-08-20  8:55 ` Miklos Szeredi
@ 2025-08-20 20:42   ` Jim Harris
  2025-08-27 13:05     ` Miklos Szeredi
  0 siblings, 1 reply; 5+ messages in thread
From: Jim Harris @ 2025-08-20 20:42 UTC
  To: Miklos Szeredi
  Cc: linux-fsdevel@vger.kernel.org, virtualization@lists.linux.dev,
	kvm@vger.kernel.org, stefanha@redhat.com, Max Gurtovoy, Idan Zach,
	Roman Spiegelman, Ben Walker, Oren Duer

> On Aug 20, 2025, at 1:55 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> 
> On Wed, 20 Aug 2025 at 01:35, Jim Harris <jiharris@nvidia.com> wrote:
> 
>> Can we safely depend on the FUSE_NOTIFY_INVAL_ENTRY notifications to trigger FORGET commands for the associated inodes? If not, can we consider adding a new FUSE_NOTIFY_DROP_ENTRY notification that would ask the kernel to release the inode and send a FORGET command when memory pressure or clean-up is needed by the device?
> 
> As far as I understand, what you want is to drop the entry from the
> cache *if it is unused*.  Plain FUSE_NOTIFY_INVAL_ENTRY will unhash the
> dentry regardless of its refcount; of course, FORGET will be sent only
> after the last reference is released.
> 
> FUSE_NOTIFY_INVAL_ENTRY with FUSE_EXPIRE_ONLY will do something like
> your desired FUSE_NOTIFY_DROP_ENTRY operation, at least on virtiofs
> (where fc->delete_stale is on).  I notice there's a fuse_dir_changed()
> call regardless of FUSE_EXPIRE_ONLY, which is not appropriate for the
> drop case; this can probably be moved inside the !FUSE_EXPIRE_ONLY
> branch.

Thanks for the clarification.

For that extra fuse_dir_changed() call - is this a required fix for correctness or just an optimization to avoid unnecessarily invalidating the parent directory’s attributes?

> 
> The other question is whether something more efficient should be
> added. E.g. FUSE_NOTIFY_SHRINK_LOOKUP_CACHE with a num_drop argument
> that tells fuse to try to drop this many unused entries?

Absolutely something like this would be more efficient. Using FUSE_NOTIFY_INVAL_ENTRY requires saving filenames which isn’t ideal.

> Thanks,
> Miklos



* Re: Questions about FUSE_NOTIFY_INVAL_ENTRY
  2025-08-20 20:42   ` Jim Harris
@ 2025-08-27 13:05     ` Miklos Szeredi
  2025-08-27 17:45       ` Jim Harris
  0 siblings, 1 reply; 5+ messages in thread
From: Miklos Szeredi @ 2025-08-27 13:05 UTC
  To: Jim Harris
  Cc: linux-fsdevel@vger.kernel.org, virtualization@lists.linux.dev,
	kvm@vger.kernel.org, stefanha@redhat.com, Max Gurtovoy, Idan Zach,
	Roman Spiegelman, Ben Walker, Oren Duer

On Wed, 20 Aug 2025 at 22:42, Jim Harris <jiharris@nvidia.com> wrote:
>
>
>
> > On Aug 20, 2025, at 1:55 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:

> > FUSE_NOTIFY_INVAL_ENTRY with FUSE_EXPIRE_ONLY will do something like
> > your desired FUSE_NOTIFY_DROP_ENTRY operation, at least on virtiofs
> > (where fc->delete_stale is on).  I notice there's a fuse_dir_changed()
> > call regardless of FUSE_EXPIRE_ONLY, which is not appropriate for the
> > drop case; this can probably be moved inside the !FUSE_EXPIRE_ONLY
> > branch.
>
> Thanks for the clarification.
>
> For that extra fuse_dir_changed() call - is this a required fix for correctness or just an optimization to avoid unnecessarily invalidating the parent directory’s attributes?

You see it correctly: it would be an optimization.



> > The other question is whether something more efficient should be
> > added. E.g. FUSE_NOTIFY_SHRINK_LOOKUP_CACHE with a num_drop argument
> > that tells fuse to try to drop this many unused entries?
>
> Absolutely something like this would be more efficient. Using FUSE_NOTIFY_INVAL_ENTRY requires saving filenames which isn’t ideal.

Okay, I suspect an interface that supplies an array of nodeids would
be best: it would give the filesystem control over which inodes it
wants to give up, would allow batching the operation, and would not
require supplying the name.
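
As a rough illustration of the device side of such an interface, the sketch below picks which nodeids to give up and collects them into one array for a single batched request. Everything in it is hypothetical: struct file_obj, the last_used tick, the least-recently-used policy and pick_nodeids_to_drop() are assumptions made for the example, and no wire format is implied, since the interface has not been defined in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical device-side cache entry. */
struct file_obj {
    uint64_t nodeid;     /* nodeid handed out at LOOKUP time */
    uint64_t last_used;  /* hypothetical monotonic tick of last access */
};

/* qsort comparator: oldest last_used first. */
static int by_last_used(const void *a, const void *b)
{
    const struct file_obj *x = a, *y = b;
    return (x->last_used > y->last_used) - (x->last_used < y->last_used);
}

/*
 * Fills out[] with up to max_out nodeids of the least recently used
 * objects and returns how many were written. The input table is left
 * untouched; a temporary copy is sorted instead. Returns 0 if the
 * temporary copy cannot be allocated.
 */
static size_t pick_nodeids_to_drop(const struct file_obj *objs, size_t n,
                                   uint64_t *out, size_t max_out)
{
    if (n == 0 || max_out == 0)
        return 0;
    assert(objs != NULL && out != NULL);

    struct file_obj *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return 0;
    memcpy(tmp, objs, n * sizeof *tmp);
    qsort(tmp, n, sizeof *tmp, by_last_used);

    size_t k = n < max_out ? n : max_out;
    for (size_t i = 0; i < k; i++)
        out[i] = tmp[i].nodeid;

    free(tmp);
    return k;
}
```

The point of the sketch is only that the device needs nothing but nodeids to express its choice, so no filenames have to be stored, and that many inodes can be named in one request.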

Will work on this.

Thanks,
Miklos


* Re: Questions about FUSE_NOTIFY_INVAL_ENTRY
  2025-08-27 13:05     ` Miklos Szeredi
@ 2025-08-27 17:45       ` Jim Harris
  0 siblings, 0 replies; 5+ messages in thread
From: Jim Harris @ 2025-08-27 17:45 UTC
  To: Miklos Szeredi
  Cc: linux-fsdevel@vger.kernel.org, virtualization@lists.linux.dev,
	kvm@vger.kernel.org, stefanha@redhat.com, Max Gurtovoy, Idan Zach,
	Roman Spiegelman, Ben Walker, Oren Duer


> On Aug 27, 2025, at 6:05 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> 
> On Wed, 20 Aug 2025 at 22:42, Jim Harris <jiharris@nvidia.com> wrote:
>> 
>> 
>> 
>>> On Aug 20, 2025, at 1:55 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> 

<snip>

>>> The other question is whether something more efficient should be
>>> added. E.g. FUSE_NOTIFY_SHRINK_LOOKUP_CACHE with a num_drop argument
>>> that tells fuse to try to drop this many unused entries?
>> 
>> Absolutely something like this would be more efficient. Using FUSE_NOTIFY_INVAL_ENTRY requires saving filenames which isn’t ideal.
> 
> Okay, I suspect an interface that supplies an array of nodeids would
> be best: it would give the filesystem control over which inodes it
> wants to give up, would allow batching the operation, and would not
> require supplying the name.

I agree, this would be the perfect interface. Better to let the filesystem decide which inodes it wants to give up.

> 
> Will work on this.

Thanks!

-Jim



