public inbox for linux-kernel@vger.kernel.org
From: Luis Henriques <luis@igalia.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>,
	 Bernd Schubert <bschubert@ddn.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	 Christian Brauner <brauner@kernel.org>,  Jan Kara <jack@suse.cz>,
	 Matt Harvey <mharvey@jumptrading.com>,
	 linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	 Valentin Volkl <valentin.volkl@cern.ch>,
	Laura Promberger <laura.promberger@cern.ch>
Subject: Re: [PATCH v6 2/2] fuse: add new function to invalidate cache for all inodes
Date: Wed, 19 Feb 2025 11:23:06 +0000
Message-ID: <87eczu41r9.fsf@igalia.com>
In-Reply-To: <Z7UED8Gh7Uo-Yj6K@dread.disaster.area> (Dave Chinner's message of "Wed, 19 Feb 2025 09:05:03 +1100")

On Wed, Feb 19 2025, Dave Chinner wrote:

> On Tue, Feb 18, 2025 at 06:11:17PM +0000, Luis Henriques wrote:
>> On Tue, Feb 18 2025, Miklos Szeredi wrote:
>> 
>> > On Tue, 18 Feb 2025 at 12:51, Luis Henriques <luis@igalia.com> wrote:
>> >>
>> >> On Tue, Feb 18 2025, Miklos Szeredi wrote:
>> >>
>> >> > On Tue, 18 Feb 2025 at 11:04, Luis Henriques <luis@igalia.com> wrote:
>> >> >
>> >> >> The problem I'm trying to solve is that, if a filesystem wants to ask the
>> >> >> kernel to get rid of all inodes, it has to request the kernel to forget
>> >> >> each one, individually.  The specific filesystem I'm looking at is CVMFS,
>> >> >> which is a read-only filesystem that needs to be able to update the full
>> >> >> set of filesystem objects when a new generation snapshot becomes
>> >> >> available.
>> >> >
>> >> > Yeah, we talked about this use case.  As I remember there was a
>> >> > proposal to set an epoch, marking all objects for "revalidate needed",
>> >> > which I think is a better solution to the CVMFS problem, than just
>> >> > getting rid of unused objects.
>> >>
>> >> OK, so I think I'm missing some context here.  And, obviously, I also miss
>> >> some more knowledge on the filesystem itself.  But, if I understand it
>> >> correctly, the concept of 'inode' in CVMFS is very loose: when a new
>> >> snapshot generation is available (you mentioned 'epoch', which is, I
>> >> guess, the same thing) the inodes are all renewed -- the inode numbers
>> >> aren't kept between generations/epochs.
>> >>
>> >> Do you have any links for such discussions, or any details on how this
>> >> proposal is being implemented?  This would probably be done mostly in
>> >> user-space I guess, but it would still need a way to get rid of the unused
>> >> inodes from old snapshots, right?  (inodes from old snapshots still in use
>> >> would obviously be kept around).
>> >
>> > I don't have links.  Adding Valentin Volkl and Laura Promberger to the
>> > Cc list, maybe they can help with clarification.
>> >
>> > As far as I understand it would work by incrementing fc->epoch on
>> > FUSE_INVALIDATE_ALL. When an object is looked up/created the current
>> > epoch is copied to e.g. dentry->d_time.  fuse_dentry_revalidate() then
>> > compares d_time with fc->epoch and forces an invalidate on mismatch.
>> 
>> OK, so hopefully Valentin or Laura will be able to help providing some
>> more details.  But, from your description, we would still require this
>> FUSE_INVALIDATE_ALL operation to exist in order to increment the epoch.
>> And this new operation could do that *and* also already invalidate those
>> unused objects.
>
> I think you are still looking at this from the wrong direction.
>
> Invalidation is -not the operation- that is being requested. The
> CVMFS fuse server needs to update some global state in the kernel
> side fuse mount (i.e. the snapshot ID/epoch), and the need to evict
> cached inodes from previous IDs is a CVMFS implementation
> optimisation related to changing the global state.
>
>> > Only problem with this is that it seems very CVMFS specific, but I
>> > guess so is your proposal.
>> >
>> > Implementing the LRU purge is more generally useful, but I'm not sure
>> > if that helps CVMFS, since it would only get rid of unused objects.
>> 
>> The LRU inodes purge can indeed work for me as well, because my patch is
>> also only getting rid of unused objects, right?  Any inode still being
>> referenced will be kept around.
>> 
>> So, based on your reply, let me try to summarize a possible alternative
>> solution, that I think would be useful for CVMFS but also generic enough
>> for other filesystems:
>> 
>> - Add a new operation FUSE_INVAL_LRU_INODES, which would get rid of, at
>>   most, 'N' unused inodes.
>>
>> - This operation would have an argument 'N' with the maximum number of
>>   inodes to invalidate.
>>
>> - In addition, it would also increment this new fuse_connection attribute
>>   'epoch', to be used in the dentry revalidation as you suggested above
>
> As per above: invalidation is an implementation optimisation for the
> CVMFS epoch update. Invalidation, OTOH, does not imply that any fuse
> mount/connector global state (e.g. the epoch) needs to change...
>
> i.e. the operation should be FUSE_UPDATE_EPOCH, not
> FUSE_INVAL_LRU_INODES...
>
>> 
>> - This 'N' could also be set to a pre-#define'ed value that would mean
>>   *all* (unused) inodes.
>
> Saying "only invalidate N inodes" makes no sense to me - it is
> fundamentally impossible for userspace to get right. Either the
> epoch update should evict all unreferenced inodes immediately, or it
> should leave them all behind to be purged by memory pressure or
> other periodic garbage collection mechanisms.

So, below I've a patch that is totally untested (not even compile-tested).
It's unlikely to be fully correct, but I just wanted to make sure I got
the main idea right.

What I'm trying to do there is to initialize this new 'epoch'
counter, both in the fuse connection and in every new dentry.  Then,
->d_revalidate() simply invalidates a dentry if the epochs don't
match.  Finally, there's the new fuse notify operation to increment the
epoch and shrink the dcache (I dropped the call to
{evict,invalidate}_inodes() as Miklos suggested elsewhere).

Does this look reasonable?

(I may be missing other places where epoch should be checked or
initialized.)
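
To make the intended semantics concrete, here is a minimal userspace
model of the epoch check.  It is only a sketch under assumptions: the
model_* names are made up for illustration and are not kernel APIs, and
the real code works on struct fuse_conn and struct dentry instead.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the 'epoch' field added to struct fuse_conn. */
struct model_conn {
	atomic_uint epoch;
};

/* Hypothetical stand-in for the epoch stored in a dentry at lookup time. */
struct model_dentry {
	unsigned int epoch;
};

/* The connection starts at epoch 1, mirroring fuse_conn_init() in the patch. */
static void model_conn_init(struct model_conn *fc)
{
	atomic_init(&fc->epoch, 1);
}

/* On lookup/create, the dentry records the connection's current epoch. */
static void model_lookup(struct model_conn *fc, struct model_dentry *d)
{
	d->epoch = atomic_load(&fc->epoch);
}

/* Revalidation fails for any dentry stamped before the current epoch. */
static bool model_revalidate(struct model_conn *fc,
			     const struct model_dentry *d)
{
	return d->epoch >= atomic_load(&fc->epoch);
}

/* The notify operation only bumps the epoch; dcache shrinking is omitted. */
static void model_update_epoch(struct model_conn *fc)
{
	atomic_fetch_add(&fc->epoch, 1);
}
```

In this model, a dentry looked up before model_update_epoch() fails
revalidation afterwards, while one looked up after the bump passes.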

Cheers,
-- 
Luís

diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 5b5f789b37eb..f560d1bc327e 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -1902,6 +1902,22 @@ static int fuse_notify_resend(struct fuse_conn *fc)
 	return 0;
 }
 
+static int fuse_notify_update_epoch(struct fuse_conn *fc)
+{
+	struct fuse_mount *fm;
+	struct inode *inode;
+
+	inode = fuse_ilookup(fc, FUSE_ROOT_ID, &fm);
+	if (!inode || !fm)
+		return -ENOENT;
+
+	iput(inode);
+	atomic_inc(&fc->epoch);
+	shrink_dcache_sb(fm->sb);
+
+	return 0;
+}
+
 static int fuse_notify(struct fuse_conn *fc, enum fuse_notify_code code,
 		       unsigned int size, struct fuse_copy_state *cs)
 {
@@ -1930,6 +1946,9 @@ static int fuse_notify(struct fuse_conn *fc, enum fuse_notify_code code,
 	case FUSE_NOTIFY_RESEND:
 		return fuse_notify_resend(fc);
 
+	case FUSE_NOTIFY_UPDATE_EPOCH:
+		return fuse_notify_update_epoch(fc);
+
 	default:
 		fuse_copy_finish(cs);
 		return -EINVAL;
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 198862b086ff..d4d58b169c57 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -204,6 +204,12 @@ static int fuse_dentry_revalidate(struct inode *dir, const struct qstr *name,
 	int ret;
 
 	inode = d_inode_rcu(entry);
+	if (inode) {
+		fm = get_fuse_mount(inode);
+		if (entry->d_time < atomic_read(&fm->fc->epoch))
+			goto invalid;
+	}
+
 	if (inode && fuse_is_bad(inode))
 		goto invalid;
 	else if (time_before64(fuse_dentry_time(entry), get_jiffies_64()) ||
@@ -446,6 +452,12 @@ static struct dentry *fuse_lookup(struct inode *dir, struct dentry *entry,
 		goto out_err;
 
 	entry = newent ? newent : entry;
+	if (inode) {
+		struct fuse_mount *fm = get_fuse_mount(inode);
+		entry->d_time = atomic_read(&fm->fc->epoch);
+	} else {
+		entry->d_time = 0;
+	}
 	if (outarg_valid)
 		fuse_change_entry_timeout(entry, &outarg);
 	else
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index fee96fe7887b..bb6b1ebaa42d 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -611,6 +611,8 @@ struct fuse_conn {
 	/** Number of fuse_dev's */
 	atomic_t dev_count;
 
+	atomic_t epoch;
+
 	struct rcu_head rcu;
 
 	/** The user id for this mount */
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index e9db2cb8c150..5d2d29fad658 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -959,6 +959,7 @@ void fuse_conn_init(struct fuse_conn *fc, struct fuse_mount *fm,
 	init_rwsem(&fc->killsb);
 	refcount_set(&fc->count, 1);
 	atomic_set(&fc->dev_count, 1);
+	atomic_set(&fc->epoch, 1);
 	init_waitqueue_head(&fc->blocked_waitq);
 	fuse_iqueue_init(&fc->iq, fiq_ops, fiq_priv);
 	INIT_LIST_HEAD(&fc->bg_queue);
diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
index 5e0eb41d967e..62cc60e61cca 100644
--- a/include/uapi/linux/fuse.h
+++ b/include/uapi/linux/fuse.h
@@ -666,6 +666,7 @@ enum fuse_notify_code {
 	FUSE_NOTIFY_RETRIEVE = 5,
 	FUSE_NOTIFY_DELETE = 6,
 	FUSE_NOTIFY_RESEND = 7,
+	FUSE_NOTIFY_UPDATE_EPOCH = 8,
 	FUSE_NOTIFY_CODE_MAX,
 };
 
