From: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
To: "ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>,
"nix155nix@gmail.com" <nix155nix@gmail.com>
Cc: "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: Re: [REGRESSION] CephFS kernel client crash (NULL deref in strcmp) since Linux 6.17.8
Date: Wed, 26 Nov 2025 23:22:38 +0000
Message-ID: <53f37729a201ab5878559f178fedb55ef01551a7.camel@ibm.com>
In-Reply-To: <CABS1u1DX+YB+Vz1_1gZ0byPLjk7Qhv9x9X+xJYGxx3uWTWyiLw@mail.gmail.com>
On Thu, 2025-11-27 at 02:12 +0300, Уолтер О'Дим wrote:
>
> Subject: [REGRESSION] CephFS kernel client crash (NULL deref in strcmp) since Linux 6.17.8
> To: ceph-devel@vger.kernel.org
> Cc: linux-fsdevel@vger.kernel.org
>
> Hi,
>
> I would like to report a regression in the in-kernel CephFS client which appeared between Linux 6.17.7 and 6.17.8. The issue is fully reproducible on my hardware and completely prevents accessing CephFS.
>
> The same CephFS cluster works fine from Ubuntu and Debian kernel clients, so this appears to be a kernel-side regression in the CephFS client codepath.
>
> ======================================================
> Summary
> ======================================================
>
> Starting with Linux 6.17.8, running "ls /mnt/cephfs" triggers an immediate kernel crash (NULL pointer dereference in strcmp), inside:
>
> ceph_mds_check_access()
> ceph_open()
>
> CephFS becomes unusable: any attempt to open files or directories on the mount kills the calling process.
>
> Rolling back to 6.17.7 fixes the issue.
>
> ======================================================
> Environment
> ======================================================
>
> Distro: Arch Linux (rolling)
> Kernel (bad): 6.17.8.arch1-1
> Kernel (good): 6.17.7.arch1-1
> Architecture: x86_64
>
> Hardware:
> Dell Latitude 7490
> BIOS 1.39.0 (2024-07-04)
>
> Ceph modules:
> ceph.ko srcversion 8A90DA7BD7115993B7D91C5
> libceph.ko srcversion 451CE8A92FEA7625419462C
>
> CephFS mount:
> 172.27.0.71:6789,172.27.1.51:6789,172.27.5.25:6789:/ /mnt/cephfs
> -t ceph
> -o name=cephfs,secret=...,noatime,_netdev,x-systemd.automount
>
> ======================================================
> Regression window
> ======================================================
>
> Last known good: 6.17.7
> First bad: 6.17.8
> Also bad: 6.17.9
> Also affected: linux-lts 6.12.x (same crash on this machine)
>
> ======================================================
> Reproducer
> ======================================================
>
> 1. Boot kernel 6.17.8 or newer.
> 2. Mount CephFS.
> 3. Run: ls /mnt/cephfs
> 4. Kernel immediately BUGs with a NULL dereference and kills the process.
>
> This is 100% reproducible.
>
> ======================================================
> Crash excerpt (full dmesg attached)
> ======================================================
>
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> #PF: supervisor read access in kernel mode
> Oops: 0000 [#1] SMP PTI
> CPU: 1 PID: 5365 Comm: ls
>
> RIP: 0010:strcmp+0x2c/0x50
> RAX: 0000000000000000
> RSI: 0000000000000000
> RDI: ffff8a16d6da87c8
>
> Call Trace:
> ceph_mds_check_access+0x103/0x840 [ceph]
> __touch_cap+0x30/0x180 [ceph]
> ceph_open+0x17a/0x620 [ceph]
> do_dentry_open+0x23d/0x480
> vfs_open
> path_openat
> do_filp_open
> do_sys_openat2
> __x64_sys_openat
> do_syscall_64
> entry_SYSCALL_64_after_hwframe
>
> Second ls run produces an identical crash.
>
> ======================================================
> Notes
> ======================================================
>
> * The issue occurs before any user operations.
> * The CephFS cluster is unchanged between tests.
> * Other Linux clients (Ubuntu, Debian kernels) work fine.
> * I can test patches or help bisect.
>
> Full logs are attached.
>
>
Thanks for the report. I believe we are talking about the same issue as in [1].
Please check that patch as a current workaround.
Thanks,
Slava.
[1] https://lore.kernel.org/ceph-devel/9534e58061c7832826bbd3500b9da9479e8a8244.camel@ibm.com/T/#t
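
A note on the trace quoted above: the fault address is 0 and RSI is 0 inside
strcmp(), which is consistent with one of the two strings passed to strcmp()
being NULL. The userspace sketch below only illustrates the general
guard-before-compare pattern for that class of bug. It is not the patch in [1]
and is not taken from the ceph sources. The helper name fs_name_matches(), its
parameters, and the "missing name means no restriction" policy are all
assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not from the ceph sources or from patch [1].
 *
 * strcmp() dereferences both of its arguments, so a NULL on either
 * side has to be handled before the call.
 *
 * Assumed policy for this sketch only: a missing name on either side
 * means "no restriction" and is treated as a match.
 */
bool fs_name_matches(const char *cap_fs_name, const char *mount_fs_name)
{
	if (!cap_fs_name || !mount_fs_name)
		return true;

	return strcmp(cap_fs_name, mount_fs_name) == 0;
}

/* Small self-check; call it from main() in a scratch program. */
void fs_name_matches_selfcheck(void)
{
	assert(fs_name_matches(NULL, "cephfs"));      /* no name in the cap */
	assert(fs_name_matches("cephfs", NULL));      /* no name on the mount */
	assert(fs_name_matches("cephfs", "cephfs"));  /* equal names */
	assert(!fs_name_matches("cephfs", "other"));  /* different names */
}
```

Whether a missing name should count as a match or as a mismatch is a policy
decision for the real code; the only point here is that the NULL check has to
come before strcmp().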