From: "Darrick J. Wong" <djwong@kernel.org>
To: Theodore Ts'o <tytso@mit.edu>
Cc: Luis Henriques <luis@igalia.com>,
Miklos Szeredi <miklos@szeredi.hu>,
Bernd Schubert <bschubert@ddn.com>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC] Another take at restarting FUSE servers
Date: Thu, 31 Jul 2025 10:38:58 -0700
Message-ID: <20250731173858.GE2672029@frogsfrogsfrogs>
In-Reply-To: <20250731130458.GE273706@mit.edu>
On Thu, Jul 31, 2025 at 09:04:58AM -0400, Theodore Ts'o wrote:
> On Tue, Jul 29, 2025 at 04:38:54PM -0700, Darrick J. Wong wrote:
> >
> > Just speaking for fuse2fs here -- that would be kinda nifty if libfuse
> > could restart itself. It's unclear if doing so will actually enable us
> > to clear the condition that caused the failure in the first place, but I
> > suppose fuse2fs /does/ have e2fsck -fy at hand. So maybe restarts
> > aren't totally crazy.
>
> I'm trying to understand what the failure scenario is here. Is this
> if the userspace fuse server (i.e., fuse2fs) has crashed? If so, what
> is supposed to happen with respect to open files, metadata and data
> modifications which were in transit, etc.? Sure, fuse2fs could run
> e2fsck -fy, but if there are dirty inodes on the system, that's
> potentially going to be out of sync, right?
>
> What are the recovery semantics that we hope to be able to provide?
<echoing what we said on the ext4 call this morning>
With iomap, most of the dirty state is in the kernel, so I think the new
fuse2fs instance would poke the kernel with FUSE_NOTIFY_RESTARTED, which
would initiate GETATTR requests on all the cached inodes to validate
that they still exist; and then resend all the unacknowledged requests
that were pending at the time. It might be the case that you have to
do that in the reverse order; I only know enough about the design of
fuse to suspect that to be true.
Anyhow once those are complete, I think we can resume operations with
the surviving inodes. The ones that fail the GETATTR revalidation are
fuse_make_bad'd, which effectively revokes them.
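To make that sequence concrete, here is a toy model in Python. It is only a sketch of the idea, not kernel code: ToyKernel, CachedInode, recover() and the two server callbacks are hypothetical names invented for illustration, and the step order simply follows the description above (which, as noted, may need to be reversed). It also assumes that requests aimed at a revoked inode are failed rather than resent.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class CachedInode:
    ino: int
    bad: bool = False  # models the effect of fuse_make_bad(): inode is revoked


@dataclass
class ToyKernel:
    # Hypothetical stand-ins for the kernel's inode cache and the queue of
    # requests that the crashed server never acknowledged.
    inodes: Dict[int, CachedInode] = field(default_factory=dict)
    pending: List[Tuple[int, str]] = field(default_factory=list)


def recover(kernel: ToyKernel,
            server_getattr: Callable[[int], bool],
            server_handle: Callable[[int, str], None]):
    """Run the toy recovery that a restart notification would trigger."""
    # Step 1: revalidate every cached inode with a GETATTR-like query.
    for inode in kernel.inodes.values():
        if not server_getattr(inode.ino):
            inode.bad = True  # failed revalidation, so revoke it

    # Step 2: resend unacknowledged requests to the new server instance.
    resent, failed = [], []
    for ino, op in kernel.pending:
        inode = kernel.inodes.get(ino)
        if inode is None or inode.bad:
            failed.append((ino, op))  # assumption: fail instead of resend
            continue
        server_handle(ino, op)
        resent.append((ino, op))
    kernel.pending.clear()
    return resent, failed
```

With two cached inodes where only inode 1 survives revalidation, a pending write to inode 1 is resent and a pending write to inode 2 is failed.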
All of this of course relies on fuse2fs maintaining as little volatile
state of its own as possible. I think that means disabling the block
cache in the unix io manager. If we ever implemented delalloc, then
either we'd have to save the reservations somewhere, or I guess you
could immediately syncfs the whole filesystem to try to push all the
dirty data to disk before we start allowing new free space allocations
for new changes.
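The syncfs option can be sketched the same way. This is a toy accounting model under assumed semantics, not anything fuse2fs does today: ToyAllocator and its fields are hypothetical, and "blocks" are just counters. The point it illustrates is that a restarted server which lost its delalloc reservations must give the old dirty data real allocations before it hands out free space to anyone else.

```python
class ToyAllocator:
    """Toy free-space accounting for a restarted server (hypothetical)."""

    def __init__(self, free_blocks: int, dirty_blocks: int):
        self.free_blocks = free_blocks    # blocks not yet allocated on disk
        self.dirty_blocks = dirty_blocks  # delalloc data still lacking blocks
        self.synced = False               # reservations were lost in the crash

    def syncfs(self) -> None:
        # Flush everything: each dirty block receives a real allocation.
        if self.dirty_blocks > self.free_blocks:
            raise OSError("ENOSPC while flushing dirty data")
        self.free_blocks -= self.dirty_blocks
        self.dirty_blocks = 0
        self.synced = True

    def allocate(self, nblocks: int) -> bool:
        # Refuse new allocations until the old dirty data has its space.
        if not self.synced:
            raise RuntimeError("flush dirty data before allocating")
        if nblocks > self.free_blocks:
            return False
        self.free_blocks -= nblocks
        return True
```

With 10 free blocks and 7 dirty ones, letting a new 4-block allocation through before the flush would leave only 6 blocks for the 7 that are already owed, which is the overcommit the gate is meant to prevent.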
--D
> - Ted
>