From: Antonio SJ Musumeci <trapexit@spawn.link>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: Bernd Schubert <bernd.schubert@fastmail.fm>,
Amir Goldstein <amir73il@gmail.com>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
fuse-devel <fuse-devel@lists.sourceforge.net>
Subject: Re: [fuse-devel] Proxmox + NFS w/ exported FUSE = EIO
Date: Sun, 25 Feb 2024 20:58:13 +0000
Message-ID: <9b9aab6f-ee29-441b-960d-a95d99ba90d8@spawn.link>
In-Reply-To: <f70732f8-4d67-474a-a4b8-320f78c3394d@spawn.link>
On 2/24/24 18:18, Antonio SJ Musumeci wrote:
> On 2/22/24 05:09, Miklos Szeredi wrote:
>> On Thu, 22 Feb 2024 at 02:26, Antonio SJ Musumeci <trapexit@spawn.link> wrote:
>>
>>> I'll try it when I get some cycles in the next week or so but... I'm not
>>> sure I see how this would address it. Is this not still marking the
>>> inode bad. So while it won't forget it perhaps it will still error out.
>>> How does this keep ".." of root being looked up?
>>>
>>> I don't know the code well but I'd have thought the reason for the
>>> forget was because the lookup of the parent fails.
>>
>> It shouldn't be looking up the parent of root. Root should always be
>> there, and the only way I see root disappearing is by marking it bad.
>>
>> If the patch makes a difference, then you need to find out why the
>> root is marked bad, since the filesystem will still fail in that case.
>> But at least the kernel won't do stupid things.
>>
>> I think the patch is correct and is needed regardless of the outcome
>> of your test. But there might be other kernel bugs involved, so
>> definitely need to see what happens.
>>
>> Thanks,
>> Miklos
>
> With the patch it no longer issues forget(nodeid=1), nor does it request
> the parent of nodeid=1.
>
> However, I'm seeing different issues.
>
> I instrumented FUSE to print when it tags an inode bad.
>
> After it gets into the bad state I'm seeing nfsd hammering the mount
> even when I've umounted the nfs share and killed the FUSE server. nfsd
> is pegging a CPU core and the kernel log is filled with
> fuse_stale_inode(nodeid=1) fuse_make_bad(nodeid=1) calls. Have to reboot.
>
> What triggers flagging the inode as bad seems to be the
> fuse_stale_inode() check in fuse_iget(). inode->i_generation is 0 while
> the generation value in the reply is what I originally set.
>
> From the FUSE server I see:
>
> lookup(nodeid=3,name=".")
> lookup(nodeid=3,name="..") which returns ino=1 gen=expected_val
> getattr(nodeid=2)   (nodeid=2 is the file I'm reading in a loop)
> forget(nodeid=2)
>
> after which point it's no longer functional.
>
>
I've resolved the issue, and I believe I know why I couldn't reproduce it
with the current libfuse examples. In those examples the root node
implicitly has a generation of 0, so when a lookup of ".." on a child of
root comes in, the reply carries generation 0. In my server, however, I
start the generation of every node at a different non-zero value per
server instance, because at one point I read that ensuring distinct
nodeid + gen pairs across filesystems was better, or even needed, for
NFS support. The kernel's root inode has i_generation 0, so my non-zero
reply for root fails the fuse_stale_inode() check and root gets marked
bad. I'm guessing the increase in reports I've had was a happenstance of
people upgrading to kernels past 5.14.

In retrospect it makes sense that root's nodeid and gen are assumed to be
1 and 0 respectively and never change, but given the symptoms I was
seeing it didn't click until I saw the stale check.
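
To make the mismatch concrete, here is a minimal sketch in C. It is a
simplified model, not the kernel or libfuse code: the struct, the
constants, and the function names are hypothetical, and only the
generation comparison is modelled (the real check also looks at the file
type). It assumes, as described above, that the kernel holds root as
nodeid 1 with generation 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants for this sketch: root is nodeid 1, generation 0. */
#define ROOT_NODEID     1u
#define ROOT_GENERATION 0u

/* Simplified stand-in for the (nodeid, generation) pair in a lookup reply. */
struct entry_id {
    uint64_t nodeid;
    uint64_t generation;
};

/*
 * Model of the generation half of the stale check: a cached inode is
 * treated as stale when its generation differs from the one in the reply.
 */
static bool generation_is_stale(uint64_t cached_generation,
                                uint64_t reply_generation)
{
    return cached_generation != reply_generation;
}

/*
 * Server-side helper: use the per-instance generation for ordinary nodes,
 * but always answer with generation 0 when the reply refers to root.
 */
static struct entry_id make_entry_id(uint64_t nodeid,
                                     uint64_t instance_generation)
{
    struct entry_id e;
    e.nodeid = nodeid;
    e.generation = (nodeid == ROOT_NODEID) ? ROOT_GENERATION
                                           : instance_generation;
    return e;
}

/* Usage example: a ".." lookup that resolves to root. */
void demo_root_generation(void)
{
    const uint64_t instance_generation = 12345; /* arbitrary non-zero value */

    /* Replying with the per-instance value for root trips the check. */
    assert(generation_is_stale(ROOT_GENERATION, instance_generation));

    /* Special-casing root keeps the reply consistent with the cached inode. */
    struct entry_id root = make_entry_id(ROOT_NODEID, instance_generation);
    assert(!generation_is_stale(ROOT_GENERATION, root.generation));
}
```

Non-root nodes keep the per-instance generation in this sketch; only a
reply that resolves to nodeid 1 is forced to 0.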

I'm not sure any change to the kernel code would make sense. A log entry
indicating that root was tagged as bad, and why, would have helped, but
I'm not sure it needs more than a note in some docs, which I'll likely
add to libfuse.

Thanks for everyone's help. Sorry for the goose chase.