linux-fsdevel.vger.kernel.org archive mirror
From: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
To: "ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>,
	"nix155nix@gmail.com" <nix155nix@gmail.com>
Cc: "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: Re: [REGRESSION] CephFS kernel client crash (NULL deref in strcmp) since Linux 6.17.8
Date: Wed, 26 Nov 2025 23:22:38 +0000
Message-ID: <53f37729a201ab5878559f178fedb55ef01551a7.camel@ibm.com>
In-Reply-To: <CABS1u1DX+YB+Vz1_1gZ0byPLjk7Qhv9x9X+xJYGxx3uWTWyiLw@mail.gmail.com>

On Thu, 2025-11-27 at 02:12 +0300, Уолтер О'Дим wrote:
> 
> Subject: [REGRESSION] CephFS kernel client crash (NULL deref in strcmp) since Linux 6.17.8
> To: ceph-devel@vger.kernel.org
> Cc: linux-fsdevel@vger.kernel.org
> 
> Hi,
> 
> I would like to report a regression in the in-kernel CephFS client which appeared between Linux 6.17.7 and 6.17.8. The issue is fully reproducible on my hardware and completely prevents accessing CephFS.
> 
> The same CephFS cluster works fine from Ubuntu and Debian kernel clients, so this appears to be a kernel-side regression in the CephFS client codepath.
> 
> ======================================================
> Summary
> ======================================================
> 
> Starting with Linux 6.17.8, running "ls /mnt/cephfs" triggers an immediate kernel crash (NULL pointer dereference in strcmp), inside:
> 
>   ceph_mds_check_access()
>   ceph_open()
> 
> CephFS becomes unusable: any attempt to open files or directories on the mount kills the calling process.
> 
> Rolling back to 6.17.7 fixes the issue.
> 
> ======================================================
> Environment
> ======================================================
> 
> Distro: Arch Linux (rolling)
> Kernel (bad): 6.17.8.arch1-1
> Kernel (good): 6.17.7.arch1-1
> Architecture: x86_64
> 
> Hardware:
>   Dell Latitude 7490
>   BIOS 1.39.0 (2024-07-04)
> 
> Ceph modules:
>   ceph.ko     srcversion 8A90DA7BD7115993B7D91C5
>   libceph.ko  srcversion 451CE8A92FEA7625419462C
> 
> CephFS mount:
>   172.27.0.71:6789,172.27.1.51:6789,172.27.5.25:6789:/ /mnt/cephfs
>     -t ceph
>     -o name=cephfs,secret=...,noatime,_netdev,x-systemd.automount
> 
> ======================================================
> Regression window
> ======================================================
> 
> Last known good: 6.17.7
> First bad:       6.17.8
> Also bad:        6.17.9
> Also affected:   linux-lts 6.12.x (same crash on this machine)
> 
> ======================================================
> Reproducer
> ======================================================
> 
> 1. Boot kernel 6.17.8 or newer.
> 2. Mount CephFS.
> 3. Run: ls /mnt/cephfs
> 4. Kernel immediately BUGs with a NULL dereference and kills the process.
> 
> This is 100% reproducible.
> 
> ======================================================
> Crash excerpt (full dmesg attached)
> ======================================================
> 
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> #PF: supervisor read access in kernel mode
> Oops: 0000 [#1] SMP PTI
> CPU: 1 PID: 5365 Comm: ls
> 
> RIP: 0010:strcmp+0x2c/0x50
> RAX: 0000000000000000
> RSI: 0000000000000000
> RDI: ffff8a16d6da87c8
> 
> Call Trace:
>   ceph_mds_check_access+0x103/0x840 [ceph]
>   __touch_cap+0x30/0x180 [ceph]
>   ceph_open+0x17a/0x620 [ceph]
>   do_dentry_open+0x23d/0x480
>   vfs_open
>   path_openat
>   do_filp_open
>   do_sys_openat2
>   __x64_sys_openat
>   do_syscall_64
>   entry_SYSCALL_64_after_hwframe
> 
> Second ls run produces an identical crash.
> 
> ======================================================
> Notes
> ======================================================
> 
> * The issue occurs before any user operations.
> * The CephFS cluster is unchanged between tests.
> * Other Linux clients (Ubuntu, Debian kernels) work fine.
> * I can test patches or help bisect.
> 
> Full logs are attached.
> 
> 
Thanks for the report. I believe we are talking about the same issue. Please
check this patch [1] as a current workaround.

Thanks,
Slava.

[1]
https://lore.kernel.org/ceph-devel/9534e58061c7832826bbd3500b9da9479e8a8244.camel@ibm.com/T/#t
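
For readers following the trace: the register dump in the quoted report is
consistent with the second argument to strcmp() being NULL. Under the x86-64
System V calling convention RSI carries the second argument, and it reads 0
while RDI holds a non-zero pointer. Below is a minimal userspace sketch of the
usual defensive pattern for such a comparison. The helper name names_match()
and its policy for NULL inputs are assumptions made for illustration only; they
are not taken from the kernel source or from the patch in [1].

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not kernel code): compare a required name against
 * an actual name without ever passing NULL to strcmp().
 *
 * Assumed policy, for illustration only:
 *   - wanted == NULL: no restriction was requested, so anything matches.
 *   - actual == NULL: a restriction exists but there is nothing to compare
 *     against, so report a mismatch instead of dereferencing NULL.
 */
bool names_match(const char *wanted, const char *actual)
{
	if (!wanted)
		return true;
	if (!actual)
		return false;
	return strcmp(wanted, actual) == 0;
}
```

With this guard, a call such as names_match("cephfs", NULL) returns false
rather than faulting. Whether a missing name should count as a match or a
mismatch is a policy decision that the real fix has to make; the sketch only
shows where the NULL check belongs.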
