From: Wolfgang Walter <linux@stwm.de>
To: Chuck Lever <cel@kernel.org>
Cc: stable@vger.kernel.org,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, Jeff Layton <jlayton@kernel.org>,
	Alexandr Alexandrov <alexandr.alexandrov@oracle.com>,
	Yang Erkun <yangerkun@huawei.com>,
	linux-nfs@vger.kernel.org
Subject: Re: 6.18.37 has problems with nfs4 (server), 6.18.36 works
Date: Fri, 03 Jul 2026 20:30:38 +0200
Message-ID: <3d80d1812ab903dbc831fef122d3cc75@stwm.de>
In-Reply-To: <20260703160306.1651327-1-cel@kernel.org>
Hello Chuck,
On 2026-07-03 18:03, Chuck Lever wrote:
> Hi Wolfgang, and stable@ --
>
> Short version for stable@: 6.18.37 does not need a revert of
> 95f9eb19d5e6 ("Revert 'NFSD: Defer sub-object cleanup in export
> put callbacks'"). That commit is correct for 6.18, and it is
> not the cause of Wolfgang's crash. Please leave it in place.
OK. Just for the record: I have been running v6.18.37 with the patch
reverted for about a day. But according to your analysis, that is just
a coincidence.
>
> The reasoning: 95f9eb19d5e6 touches only fs/nfsd/export.c,
> export.h, and nfsctl.c. Wolfgang's oops is in
> remove_blocked_locks() -> __destroy_client() ->
> nfsd4_destroy_clientid(), entirely within fs/nfsd/nfs4state.c,
> which the revert does not modify. That path is byte-for-byte
> identical across 6.18.36, 6.18.37, and current mainline, so the
> revert cannot have introduced the bug and no missing backport
> repairs it. The 6.18.36-good / 6.18.37-bad split is a timing
> coincidence; I believe the same latent bug is present in both.
>
> Because the defect is present upstream as well, the fix belongs
> in mainline first and is then backported to 6.18.y and the other
> affected trees.
>
> Wolfgang - to confirm this and capture the allocation and free
> stacks, a KASAN-enabled kernel would settle it. On a v6.18.37
> tree:
>
> 1. Add to your .config (keep your usual CONFIG_DEBUG_INFO so
> symbols resolve):
>
> CONFIG_KASAN=y
> CONFIG_KASAN_GENERIC=y
> CONFIG_KASAN_INLINE=y
> CONFIG_STACKTRACE=y
>
> 2. Build and boot that kernel. Stay on 6.18.37 -- you do not
> need the revert-the-revert build I suggested earlier; that
> experiment no longer tells us anything.
>
> 3. When it trips, KASAN prints a "BUG: KASAN: use-after-free"
> report with "Allocated by" and "Freed by" call stacks.
> That report, in full, is what I need -- it should land in
> /var/log/messages just as the last oops did.
>
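For step 1, a quick way to double-check the .config before building
might look like the sketch below. It is only an illustration: the
helper name missing_options() is made up here, and it assumes the
usual "CONFIG_FOO=y" line format of a kernel .config. It reports which
of the four options listed above are not set to "y".

```python
import re

# The four options listed in step 1 of the quoted instructions.
REQUIRED = (
    "CONFIG_KASAN",
    "CONFIG_KASAN_GENERIC",
    "CONFIG_KASAN_INLINE",
    "CONFIG_STACKTRACE",
)


def missing_options(config_text, required=REQUIRED):
    """Return the required options that are not set to 'y' in config_text.

    Hypothetical helper: it only recognizes lines of the form
    'CONFIG_FOO=y'. Lines such as '# CONFIG_FOO is not set' or
    'CONFIG_FOO=m' count as not enabled.
    """
    enabled = set()
    for line in config_text.splitlines():
        match = re.match(r"(CONFIG_[A-Za-z0-9_]+)=y\s*$", line)
        if match:
            enabled.add(match.group(1))
    return [opt for opt in required if opt not in enabled]


# Small self-contained demonstration with an inline sample.
sample = (
    "CONFIG_KASAN=y\n"
    "CONFIG_KASAN_GENERIC=y\n"
    "# CONFIG_KASAN_INLINE is not set\n"
    "CONFIG_STACKTRACE=y\n"
)
print(missing_options(sample))  # prints ['CONFIG_KASAN_INLINE']
```

For a real tree, the same function can be fed the contents of the
.config file, e.g. missing_options(open(".config").read()); an empty
list means all four options are enabled.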
> One caveat: KASAN roughly doubles memory use and adds CPU cost,
> so weigh that before running it on the production server. If
> that is not practical, a full log from the first stall line
> onward, with all CPU backtraces, captured over netconsole or
> serial, is a useful second best.
>
> I will draft a candidate upstream fix from the analysis so far
> and send it separately. If KASAN on the production box is not
> an option, testing that patch may be the least disruptive way
> to confirm.
>
I think the memory usage should not be a problem, nor should the higher
CPU usage.
But since it is a coincidence, the probability of catching that error is
probably very low. We have been running v6.18 kernels on that fileserver
since v6.18.1 and this error never occurred before.
Or do you think it happens more often, but without symptoms, and KASAN
would detect it?
So I will try running v6.18.37 with your patch applied. This of course
cannot prove that the patch fixes the problem, because the problem almost
never happens, but it would probably reveal whether the patch has side
effects.
Regards,
--
Wolfgang Walter
Studierendenwerk München Oberbayern
Anstalt des öffentlichen Rechts