From: Chuck Lever <chuck.lever@oracle.com>
To: Rick Macklem <rick.macklem@gmail.com>, J David <j.david.lists@gmail.com>
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: knfsd server bug when GETATTR follows READDIR
Date: Sun, 22 Dec 2024 15:17:18 -0500
Message-ID: <3998d739-c042-46b4-8166-dbd6c5f0e804@oracle.com>
In-Reply-To: <CAM5tNy6dFBgAhkX_mBzXyRyb+WfukZT0egpt75XRgCYKHPsP3Q@mail.gmail.com>
On 12/21/24 6:53 PM, Rick Macklem wrote:
> On Sat, Dec 21, 2024 at 3:27 PM Rick Macklem <rick.macklem@gmail.com> wrote:
>>
>> On Sat, Dec 21, 2024 at 9:34 AM Chuck Lever <chuck.lever@oracle.com> wrote:
>>>
>>> On 12/20/24 9:16 PM, J David wrote:
>>>> Hello,
>>>>
>>>> On Tue, Dec 17, 2024 at 8:51 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>>>>> If they can reproduce
>>>>> this issue with an "in tree" file system contained in a recent upstream
>>>>> Linux kernel, then we can take a look. (Or you and J. David can give it
>>>>> a try).
>>>>
>>>> Yes, I reproduced this behavior on ext4 with 6.11.5+bpo-amd64 from
>>>> Debian backports on completely different hardware.
>>>>
>>>> Then I set up another NFS server on Arch (running kernel 6.12.4), and
>>>> reproduced the issue there as well.
>>>>
>>>> Then, just to be sure, I went and found the instructions for building
>>>> the Linux kernel from source, built and tested both 6.12.6 and
>>>> 6.13-rc3 as downloaded directly from www.kernel.org, and the issue
>>>> occurs with those as well.
>>>
>>> Reproducing on v6.13-rc with ext4 is all that was necessary, thank you!
>>>
>>>
>>>> Additionally, I have tested every combination of FreeBSD, Linux and
>>>> OpenIndiana as client and server to confirm that FreeBSD client with
>>>> Linux server is the only case where this problem occurs.
>>>
>>> Interesting.
>>>
>>>
>>>> Does this count as reproducing the issue with an "in tree" file system
>>>> contained in a recent upstream Linux kernel? I'm asking sincerely; I'm
>>>> so far out of my depth that I'm pretty sure there are sea monsters
>>>> swimming around down there. So I can't rule out the possibility that
>>>> I've done something wrong either in setup or testing.
>>>>
>>>> During the course of this, I've gotten the reproduction down to
>>>> extracting a 2k tar file and then running "du" on the resulting
>>>> directory from the client. Doesn't matter if the file is untarred on
>>>> the FreeBSD client, the server, or another client. The tar file
>>>> contains a directory with a handful of random Javascript files from
>>>> Drupal. As far as I can tell, it has something to do with the number,
>>>> size, or names of the files. The Drupal project has three separate
>>>> directories all structured like this with the same filenames, but the
>>>> file contents vary. The issue occurs with all of them.
>>>>
>>>> The Linux /etc/exports file is just:
>>>>
>>>> /data 192.168.201.0/24(rw,sync)
>>>>
>>>> (The production case also uses crossmnt and no_subtree_check, anonuid,
>>>> and anongid, but I eliminated those one by one to make sure they
>>>> weren't responsible.)
>>>>
>>>> The corresponding fstab entry on the FreeBSD 14.2-RELEASE client is:
>>>>
>>>> 192.168.201.200:/data /data nfs rw,tcp,nfsv4,minorversion=2 0 0
>>>
>>> Out of curiosity, do you see the problem recur with nfsv3 or the other
>>> NFSv4 minor versions?
>>>
>>>
>>>> One additional thing I noticed that really blew my mind is that I can
>>>> shutdown both the client and the server, wait, power them back on, and
>>>> the issue is still there. So it's not something in RAM. That prompted
>>>> me to try "touch x" in the directory to create a new 0-length file.
>>>> The issue then goes away. Then I can "rm x" and the issue comes back.
>>>> By contrast, I can write megabytes from /dev/random into one of the
>>>> files without affecting anything; the issue stays the same.
>>>>
>>>> I then tried it with all empty files using the same filenames. The
>>>> issue still occurred. Add or remove one file and the issue goes away.
>>>> I then renamed one of the files to zz.js. Issue still occurs. Renamed
>>>> it to zzz.js. Problem still occurs. Kept going until I got to
>>>> zzzzzz.js and it worked.
>>>>
>>>> Finally, I got it to the point where running this in an empty mounted
>>>> directory will create the issue:
>>>>
>>>> rm *.xx; for a in a b c d e f g h ; do for b in 1 2 3 4 5 6 7 ; do
>>>> touch $a$b.xx ; done; done; for a in 1 2 3 4 5; do touch x$a-xx.xx;
>>>> done; touch y0-xxxxxx.xx
>>>>
>>>> and this will not:
>>>>
>>>> rm *.xx; for a in a b c d e f g h ; do for b in 1 2 3 4 5 6 7 ; do
>>>> touch $a$b.xx ; done; done; for a in 1 2 3 4 5; do touch x$a-xx.xx;
>>>> done; touch y0-xxxxxxx.xx
>>>>
>>>> (The difference being one extra x in the last filename.)
>>>>
>>>> It works in the other direction as well. This causes the issue:
>>>>
>>>> rm *.xx; for a in a b c d e f g h ; do for b in 1 2 3 4 5 6 7 ; do
>>>> touch $a$b.xx ; done; done; for a in 1 2 3 4 5; do touch x$a-xx.xx;
>>>> done; touch y0-xxx.xx
>>>>
>>>> This does not:
>>>>
>>>> rm *.xx; for a in a b c d e f g h ; do for b in 1 2 3 4 5 6 7 ; do
>>>> touch $a$b.xx ; done; done; for a in 1 2 3 4 5; do touch x$a-xx.xx;
>>>> done; touch y0-xx.xx
>>>>
>>>> There's a four-character window involving the length of the filenames
>>>> where 62 files in a directory causes this issue. There's a little more
>>>> to it than that; it doesn't look like you can just create 61
>>>> two-letter filenames and then one really long one and get the issue.
>>>>
>>>> So I haven't found the specifics yet, but perhaps due to pure chance
>>>> this directory structure is exactly right to provoke an incredibly
>>>> obscure edge case?
>>>
>>> Well it's likely that this is a problem with READDIR, so file content
>>> is not going to be an issue. The file name lengths are the problem.
>>>
>>> Also, I'm wondering what the FreeBSD client's directory readdir
>>> arguments are (how much does it request, what are the maximum limits it
>>> negotiates, and so on). Rick?
>> As you'll see in the packet trace:
>> Sequence: cache this: No
>> Putfh: directory fh
>> Readdir:
>> cookie: 0
>> cookie_verf: 0
>> dircount: 8706
>> maxcount: 8706
>> attr: type, RDattr_error, fileid, mounted_on_fileid
>> Getattr: same attributes as requested for a previous GETATTR, mainly
>> to keep the directory's attribute cache up to date.
>>
>> The session negotiates a max request/reply size of just over 1Mbyte and a
>> maximum of something like 20 ops. (Can't recall, but definitely more than 4.)
>>
>> If you are wondering where the 8706 comes from, it was an estimate of how
>> much would be needed to fill an 8K buffer with the XDR translated to UFS dirents
>> by adding 512 to 8K.
>>
>> I have not yet had a chance to see if I can reproduce the problem with
>> J. David's reproducer. I will try that soon, and if I can reproduce it,
>> I will poke at it to try and figure out what is going on.
> Just fyi, I have reproduced it. Once you use J. David's little shell script to
> create the files in the directory, the Readdir RPC gets the junk reply to
> GETATTR (the count of words for the attribute bitmap in the reply is 0
> instead of 2).
> You can unmount/remount it and still get the failure, assuming you do not
> mess with the directory contents.
>
> Good work finding the reproducer, J. David!
>
> I will start to poke around to see if I can figure out what the knfsd server is
> doing.
>
> Chuck, I suspect any fairly recent FreeBSD client will be sufficient to
> reproduce this, just in case you are inspired to cross over to the dark
> side and install FreeBSD somewhere.
I see the same malformed GETATTR result in the attachments.

Linux doesn't trip on this issue because its NFS client doesn't ever
append a GETATTR operation after a READDIR.

So I've installed a small FreeBSD 14.2 guest, and copied the reproducer
script over to it. I see the extra GETATTR now, and I'm trying to
figure out what is causing the corrupted reply. At first glance, I
can see the problem involves a particularly placed page boundary in
the XDR encoding buffer, but it isn't a problem with GETATTR encoding
per se.
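
For anyone following along, the failing one-liner quoted earlier in this
thread can be unrolled into a small script. This is only a sketch: the
function name make_repro_dir and its two arguments are hypothetical, and
it assumes a POSIX shell and an empty target directory on the NFS mount
(the original one-liner ran "rm *.xx" first instead).

```shell
#!/bin/sh
# Sketch of J. David's reproducer layout: 62 empty files in one directory.
# make_repro_dir is a hypothetical helper name, not part of any tool.
#   $1 = target directory (assumed to exist and be empty)
#   $2 = run of x's in the last file name; "xxxxxx" is the failing case
#        quoted in the thread, "xxxxxxx" the passing one
make_repro_dir() {
    dir=$1
    tail=${2:-xxxxxx}

    # 8 letters x 7 digits = 56 short names: a1.xx ... h7.xx
    for a in a b c d e f g h; do
        for b in 1 2 3 4 5 6 7; do
            touch "$dir/$a$b.xx"
        done
    done

    # 5 medium-length names: x1-xx.xx ... x5-xx.xx
    for n in 1 2 3 4 5; do
        touch "$dir/x$n-xx.xx"
    done

    # 1 final name; per the thread, its length decides whether the
    # malformed GETATTR reply shows up
    touch "$dir/y0-$tail.xx"
}
```

Calling make_repro_dir on a directory under the export and then running
"du" on that directory from the FreeBSD client corresponds to the steps
J. David described; the directory path itself is up to the tester.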
> I'll post when I have more info, rick
>
>>
>> rick
>>
>>>
>>> Since this isn't reproducible (yet) with a Linux client, let's try
>>> another set of network captures, and you can send these to me
>>> privately.
>>>
>>> Start the capture
>>> Mount
>>> Run one of the reproducers above
>>> Unmount
>>> Stop the capture
>>>
>>> I'd like to see one with v6.13-rc3 and ext4 that works as expected, and
>>> one with the same configuration that fails.
>>>
>>> --
>>> Chuck Lever
--
Chuck Lever