The Linux Kernel Mailing List
 help / color / mirror / Atom feed
From: Luis Henriques <luis@igalia.com>
To: Jan Kara <jack@suse.cz>
Cc: Bernd Schubert <bschubert@ddn.com>,
	 Miklos Szeredi <miklos@szeredi.hu>,
	Dave Chinner <david@fromorbit.com>,
	 Alexander Viro <viro@zeniv.linux.org.uk>,
	 Christian Brauner <brauner@kernel.org>,
	 Matt Harvey <mharvey@jumptrading.com>,
	 "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	 "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v5 2/2] fuse: add new function to invalidate cache for all inodes
Date: Tue, 18 Feb 2025 13:28:59 +0000	[thread overview]
Message-ID: <877c5n8jqc.fsf@igalia.com> (raw)
In-Reply-To: <bpevrif4k5h2l4pscsnsj3flwmwdw6w5nge5n7ji2yshk5pz6z@tngefti6stld> (Jan Kara's message of "Tue, 18 Feb 2025 12:57:28 +0100")

On Tue, Feb 18 2025, Jan Kara wrote:

> On Mon 17-02-25 11:47:09, Luis Henriques wrote:
>> On Mon, Feb 17 2025, Bernd Schubert wrote:
>> > On 2/17/25 11:07, Luis Henriques wrote:
>> >> On Mon, Feb 17 2025, Bernd Schubert wrote:
>> >> 
>> >>> On 2/16/25 17:50, Luis Henriques wrote:
>> >>>> Currently userspace is able to notify the kernel to invalidate the cache
>> >>>> for an inode.  This means that, if all the inodes in a filesystem need to
>> >>>> be invalidated, then userspace needs to iterate through all of them and do
>> >>>> this kernel notification separately.
>> >>>>
>> >>>> This patch adds a new option that allows userspace to invalidate all the
>> >>>> inodes with a single notification operation.  In addition to invalidate
>> >>>> all the inodes, it also shrinks the sb dcache.
>> >>>>
>> >>>> Signed-off-by: Luis Henriques <luis@igalia.com>
>> >>>> ---
>> >>>>  fs/fuse/inode.c           | 33 +++++++++++++++++++++++++++++++++
>> >>>>  include/uapi/linux/fuse.h |  3 +++
>> >>>>  2 files changed, 36 insertions(+)
>> >>>>
>> >>>> diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
>> >>>> index e9db2cb8c150..01a4dc5677ae 100644
>> >>>> --- a/fs/fuse/inode.c
>> >>>> +++ b/fs/fuse/inode.c
>> >>>> @@ -547,6 +547,36 @@ struct inode *fuse_ilookup(struct fuse_conn *fc, u64 nodeid,
>> >>>>  	return NULL;
>> >>>>  }
>> >>>>  
>> >>>> +static int fuse_reverse_inval_all(struct fuse_conn *fc)
>> >>>> +{
>> >>>> +	struct fuse_mount *fm;
>> >>>> +	struct inode *inode;
>> >>>> +
>> >>>> +	inode = fuse_ilookup(fc, FUSE_ROOT_ID, &fm);
>> >>>> +	if (!inode || !fm)
>> >>>> +		return -ENOENT;
>> >>>> +
>> >>>> +	/* Remove all possible active references to cached inodes */
>> >>>> +	shrink_dcache_sb(fm->sb);
>> >>>> +
>> >>>> +	/* Remove all unreferenced inodes from cache */
>> >>>> +	invalidate_inodes(fm->sb);
>> >>>> +
>> >>>> +	return 0;
>> >>>> +}
>> >>>> +
>> >>>> +/*
>> >>>> + * Notify to invalidate inodes cache.  It can be called with @nodeid set to
>> >>>> + * either:
>> >>>> + *
>> >>>> + * - An inode number - Any pending writebacks within the rage [@offset @len]
>> >>>> + *   will be triggered and the inode will be validated.  To invalidate the whole
>> >>>> + *   cache @offset has to be set to '0' and @len needs to be <= '0'; if @offset
>> >>>> + *   is negative, only the inode attributes are invalidated.
>> >>>> + *
>> >>>> + * - FUSE_INVAL_ALL_INODES - All the inodes in the superblock are invalidated
>> >>>> + *   and the whole dcache is shrinked.
>> >>>> + */
>> >>>>  int fuse_reverse_inval_inode(struct fuse_conn *fc, u64 nodeid,
>> >>>>  			     loff_t offset, loff_t len)
>> >>>>  {
>> >>>> @@ -555,6 +585,9 @@ int fuse_reverse_inval_inode(struct fuse_conn *fc, u64 nodeid,
>> >>>>  	pgoff_t pg_start;
>> >>>>  	pgoff_t pg_end;
>> >>>>  
>> >>>> +	if (nodeid == FUSE_INVAL_ALL_INODES)
>> >>>> +		return fuse_reverse_inval_all(fc);
>> >>>> +
>> >>>>  	inode = fuse_ilookup(fc, nodeid, NULL);
>> >>>>  	if (!inode)
>> >>>>  		return -ENOENT;
>> >>>> diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
>> >>>> index 5e0eb41d967e..e5852b63f99f 100644
>> >>>> --- a/include/uapi/linux/fuse.h
>> >>>> +++ b/include/uapi/linux/fuse.h
>> >>>> @@ -669,6 +669,9 @@ enum fuse_notify_code {
>> >>>>  	FUSE_NOTIFY_CODE_MAX,
>> >>>>  };
>> >>>>  
>> >>>> +/* The nodeid to request to invalidate all inodes */
>> >>>> +#define FUSE_INVAL_ALL_INODES 0
>> >>>> +
>> >>>>  /* The read buffer is required to be at least 8k, but may be much larger */
>> >>>>  #define FUSE_MIN_READ_BUFFER 8192
>> >>>>  
>> >>>
>> >>>
>> >>> I think this version might end up in 
>> >>>
>> >>> static void fuse_evict_inode(struct inode *inode)
>> >>> {
>> >>> 	struct fuse_inode *fi = get_fuse_inode(inode);
>> >>>
>> >>> 	/* Will write inode on close/munmap and in all other dirtiers */
>> >>> 	WARN_ON(inode->i_state & I_DIRTY_INODE);
>> >>>
>> >>>
>> >>> if the fuse connection has writeback cache enabled.
>> >>>
>> >>>
>> >>> Without having it tested, reproducer would probably be to run
>> >>> something like passthrough_hp (without --direct-io), opening
>> >>> and writing to a file and then sending FUSE_INVAL_ALL_INODES.
>> >> 
>> >> Thanks, Bernd.  So far I couldn't trigger this warning.  But I just found
>> >> that there's a stupid bug in the code: a missing iput() after doing the
>> >> fuse_ilookup().
>> >> 
>> >> I'll spend some more time trying to understand how (or if) the warning you
>> >> mentioned can triggered before sending a new revision.
>> >> 
>> >
>> > Maybe I'm wrong, but it calls 
>> >
>> >    invalidate_inodes()
>> >       dispose_list()
>> >         evict(inode)
>> >            fuse_evict_inode()
>> >
>> > and if at the same time something writes to inode page cache, the
>> > warning would be triggered? 
>> > There are some conditions in evict, like inode_wait_for_writeback()
>> > that might protect us, but what is if it waited and then just
>> > in the right time the another write comes and dirties the inode
>> > again?
>> 
>> Right, I have looked into that too but my understanding is that this can
>> not happen because, before doing that wait, the code does:
>> 
>> 	inode_sb_list_del(inode);
>> 
>> and the inode state will include I_FREEING.
>> 
>> Thus, before writing to it again, the inode will need to get added back to
>> the sb list.  Also, reading the comments on evict(), if something writes
>> into the inode at that point that's likely a bug.  But this is just my
>> understanding, and I may be missing something.
>
> Yes. invalidate_inodes() checks i_count == 0 and sets I_FREEING. Once
> I_FREEING is set nobody can acquire inode reference until the inode is
> fully destroyed. So nobody should be writing to the inode or anything like
> that.

Awesome, it's good to have that confirmed.  Thank you, Jan!

Cheers,
-- 
Luís

      reply	other threads:[~2025-02-18 13:29 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-16 16:50 [PATCH v5 0/2] fuse: allow notify_inval for all inodes Luis Henriques
2025-02-16 16:50 ` [PATCH v5 1/2] vfs: export invalidate_inodes() Luis Henriques
2025-02-18 11:59   ` Jan Kara
2025-02-18 13:35     ` Luis Henriques
2025-02-16 16:50 ` [PATCH v5 2/2] fuse: add new function to invalidate cache for all inodes Luis Henriques
2025-02-17  0:40   ` Bernd Schubert
2025-02-17 10:07     ` Luis Henriques
2025-02-17 10:29       ` Bernd Schubert
2025-02-17 11:47         ` Luis Henriques
2025-02-18 11:57           ` Jan Kara
2025-02-18 13:28             ` Luis Henriques [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=877c5n8jqc.fsf@igalia.com \
    --to=luis@igalia.com \
    --cc=brauner@kernel.org \
    --cc=bschubert@ddn.com \
    --cc=david@fromorbit.com \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mharvey@jumptrading.com \
    --cc=miklos@szeredi.hu \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox