ceph-devel.vger.kernel.org archive mirror
From: caskd <caskd@redxen.eu>
To: ceph-devel@vger.kernel.org
Subject: [bug report] listing cephfs snapshots causes general protection fault
Date: Mon, 13 Oct 2025 20:35:22 +0000
Message-ID: <2XRKV3P9QNXDI.37IHX7CVTJ6SD@unix.is.love.unix.is.life>


[-- Attachment #1.1: Type: text/plain, Size: 232 bytes --]

Hello Ceph developers,

I've encountered a bug on Squid 19.2.3 with the CephFS kernel client.

Listing the snapshots of any directory causes the kernel to access an invalid/unexpected address.
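As a reading aid for the oops in klog.txt, here is a minimal sketch of what "non-canonical address" means on x86-64. It assumes 4-level paging (48-bit virtual addresses), which is an assumption and not something stated in the report; the helper name `is_canonical` is hypothetical. The register values are copied from the log.

```python
MASK64 = (1 << 64) - 1

def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """Return True if addr is a canonical x86-64 virtual address.

    With va_bits-bit virtual addressing, bits 63..(va_bits - 1) must all be
    equal: all zeros (lower half) or all ones (upper half, kernel space).
    """
    top = (addr & MASK64) >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones

# Values taken from the register dump in klog.txt.
RDX = 0x5d2ecc0680000000  # address reported in the "Oops:" line
R10 = 0xffff8f5d2ecc0680  # looks like an ordinary kernel pointer

for name, value in (("RDX", RDX), ("R10", R10)):
    print(f"{name} = {value:#018x} canonical={is_canonical(value)}")

# Arithmetic observation only, not a diagnosis: RDX equals R10 shifted
# left by 24 bits (3 bytes) and truncated to 64 bits.
print("RDX == (R10 << 24) mod 2**64:", ((R10 << 24) & MASK64) == RDX)
```

RDX fails the check while R10 passes it, which matches the kernel's "probably for non-canonical address" message. Whether the 3-byte shift relationship between the two values is meaningful is not established here.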

# Logs and further information:


[-- Attachment #1.2: klog.txt --]
[-- Type: text/plain, Size: 5399 bytes --]

kern.warn: [581015.841007] Oops: general protection fault, probably for non-canonical address 0x5d2ecc0680000000: 0000 [#1] PREEMPT SMP PTI
kern.warn: [581015.844081] CPU: 1 UID: 0 PID: 3825461 Comm: kworker/1:3 Not tainted 6.12.43-0-virt #1-Alpine
kern.warn: [581015.846230] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-stable202408-prebuilt.qemu.org 08/13/2024
kern.warn: [581015.849810] Workqueue: ceph-msgr ceph_con_workfn [libceph]
kern.warn: [581015.852352] RIP: 0010:rb_insert_color+0xa4/0x170
kern.warn: [581015.854292] Code: 31 f6 31 ff 45 31 c0 c3 cc cc cc cc 48 89 06 31 c0 31 d2 31 c9 31 f6 31 ff 45 31 c0 c3 cc cc cc cc 48 8b 51 10 48 85 d2 74 05 <f6> 02 01 74 1b 48 8b 50 10 48 39 fa 74 75 48 89 c7 48 89 51 08 48
kern.warn: [581015.861182] RSP: 0000:ffffbab5cbf9fa08 EFLAGS: 00010206
kern.warn: [581015.863069] RAX: ffff8f5d2ecc0c40 RBX: ffff8f5d2ecc0000 RCX: ffff8f5d2ecc0ad0
kern.warn: [581015.865096] RDX: 5d2ecc0680000000 RSI: ffff8f5d47e64da0 RDI: ffff8f5d2ecc0380
kern.warn: [581015.866517] RBP: 0000000000005595 R08: 0000000000000000 R09: 0000000000000000
kern.warn: [581015.867908] R10: ffff8f5d2ecc0680 R11: 0000000000000000 R12: ffff8f5d47e64d98
kern.warn: [581015.869309] R13: ffff8f5d47d30000 R14: ffff8f5d2ecc0680 R15: ffff8f5d47e64da0
kern.warn: [581015.871406] FS:  0000000000000000(0000) GS:ffff8f645fc40000(0000) knlGS:0000000000000000
kern.warn: [581015.872992] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kern.warn: [581015.874178] CR2: 00007fe0accea180 CR3: 00000002e5ab0006 CR4: 0000000000172eb0
kern.warn: [581015.875475] Call Trace:
kern.warn: [581015.875948]  <TASK>
kern.warn: [581015.876366]  ceph_get_snapid_map+0x125/0x300 [ceph]
kern.warn: [581015.877432]  ceph_fill_inode+0x1015/0x14a0 [ceph]
kern.warn: [581015.878288]  ceph_readdir_prepopulate+0x34d/0xe40 [ceph]
kern.warn: [581015.879220]  mds_dispatch+0x1b2f/0x1ec0 [ceph]
kern.warn: [581015.880024]  ? gcm_decrypt_aesni_avx+0x1e7/0x270 [aesni_intel]
kern.warn: [581015.881054]  ceph_con_process_message+0x72/0x140 [libceph]
kern.warn: [581015.882022]  process_message+0x9/0x110 [libceph]
kern.warn: [581015.882856]  ceph_con_v2_try_read+0xb3b/0x1830 [libceph]
kern.warn: [581015.883820]  ? set_next_entity+0xf1/0x200
kern.warn: [581015.884552]  ? __schedule+0x3c3/0x1540
kern.warn: [581015.885250]  ceph_con_workfn+0x140/0x650 [libceph]
kern.warn: [581015.886107]  process_one_work+0x173/0x330
kern.warn: [581015.886857]  worker_thread+0x259/0x390
kern.warn: [581015.887514]  ? __pfx_worker_thread+0x10/0x10
kern.warn: [581015.888264]  kthread+0xcd/0x100
kern.warn: [581015.888865]  ? __pfx_kthread+0x10/0x10
kern.warn: [581015.889521]  ret_from_fork+0x2f/0x50
kern.warn: [581015.890195]  ? __pfx_kthread+0x10/0x10
kern.warn: [581015.890858]  ret_from_fork_asm+0x1a/0x30
kern.warn: [581015.891565]  </TASK>
kern.warn: [581015.891983] Modules linked in: overlay rbd nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_chain_nat nf_nat mousedev nft_fib_ipv4 kvm_intel tiny_power_button psmouse evdev efi_pstore qemu_fw_cfg button nft_ct nls_utf8 nf_conntrack nls_cp437 nf_defrag_ipv6 nf_defrag_ipv4 nft_fib_ipv6 nft_fib nf_tables kvm af_packet irqbypass sch_fq_codel ext4 wireguard ceph curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel crc16 libceph udp_tunnel mbcache bridge libcurve25519_generic nfnetlink vfat netfs jbd2 fuse stp tun libchacha llc fat dm_crypt encrypted_keys dm_mod virtio_balloon virtio_gpu virtio_dma_buf virtio_blk virtio_rng rng_core virtio_net net_failover failover crct10dif_pclmul crc32_pclmul ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 sha1_generic aesni_intel gf128mul crypto_simd cryptd xhci_pci xhci_hcd usbcore usb_common simpledrm drm_shmem_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm_kms_helper drm i2c_core drm_panel_orientation_quirks fb loop btrfs
kern.warn: [581015.892392]  blake2b_generic libcrc32c crc32c_generic crc32c_intel xor raid6_pq
kern.warn: [581015.907981] ---[ end trace 0000000000000000 ]---
kern.warn: [581017.966077] clocksource: Long readout interval, skipping watchdog check: cs_nsec: 2118959053 wd_nsec: 2118958425
kern.warn: [581017.969353] RIP: 0010:rb_insert_color+0xa4/0x170
kern.warn: [581017.972504] Code: 31 f6 31 ff 45 31 c0 c3 cc cc cc cc 48 89 06 31 c0 31 d2 31 c9 31 f6 31 ff 45 31 c0 c3 cc cc cc cc 48 8b 51 10 48 85 d2 74 05 <f6> 02 01 74 1b 48 8b 50 10 48 39 fa 74 75 48 89 c7 48 89 51 08 48
kern.warn: [581017.976473] RSP: 0000:ffffbab5cbf9fa08 EFLAGS: 00010206
kern.warn: [581017.977392] RAX: ffff8f5d2ecc0c40 RBX: ffff8f5d2ecc0000 RCX: ffff8f5d2ecc0ad0
kern.warn: [581017.978583] RDX: 5d2ecc0680000000 RSI: ffff8f5d47e64da0 RDI: ffff8f5d2ecc0380
kern.warn: [581017.979912] RBP: 0000000000005595 R08: 0000000000000000 R09: 0000000000000000
kern.warn: [581017.981156] R10: ffff8f5d2ecc0680 R11: 0000000000000000 R12: ffff8f5d47e64d98
kern.warn: [581017.982354] R13: ffff8f5d47d30000 R14: ffff8f5d2ecc0680 R15: ffff8f5d47e64da0
kern.warn: [581017.983666] FS:  0000000000000000(0000) GS:ffff8f645fc40000(0000) knlGS:0000000000000000
kern.warn: [581017.985059] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kern.warn: [581017.986020] CR2: 00007fe0accea180 CR3: 00000002e5ab0006 CR4: 0000000000172eb0
kern.info: [581017.987213] note: kworker/1:3[3825461] exited with preempt_count 1

[-- Attachment #1.3: mds-metadata.txt --]
[-- Type: text/plain, Size: 638 bytes --]

{
    "addr": "v2:REDACTED:6801/REDACTED",
    "arch": "x86_64",
    "ceph_release": "squid",
    "ceph_version": "ceph version 19.2.3 (c92aebb279828e9c3c1f5d24613efca272649e62) squid (stable)",
    "ceph_version_short": "19.2.3",
    "cpu": "Intel(R) Xeon(R) CPU E5-2470 v2 @ 2.40GHz",
    "distro": "nnd",
    "distro_description": "nonamedistribution 1.0",
    "distro_version": "1.0",
    "hostname": "la-orilla.mexico",
    "kernel_description": "#1-Alpine SMP PREEMPT_DYNAMIC 2025-10-02 13:09:51",
    "kernel_version": "6.12.50-0-virt",
    "mem_swap_kb": "0",
    "mem_total_kb": "164821204",
    "os": "Linux"
}

[-- Attachment #1.4: fs-metadata.txt --]
[-- Type: text/plain, Size: 1473 bytes --]

e1041500
btime 2025-10-13T19:11:24:543289+0000
enable_multiple, ever_enabled_multiple: 1,1
default compat: compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
legacy client fscid: -1
 
#### 2 FILESYSTEMS REDACTED #### 
 
Filesystem 'caskd' (16)
fs_name	caskd
epoch	1041500
flags	12 joinable allow_snaps allow_multimds_snaps
created	2025-10-03T18:06:16.720334+0000
modified	2025-10-13T19:11:24.492967+0000
tableserver	0
root	0
session_timeout	60
session_autoclose	300
max_file_size	1099511627776
max_xattr_size	65536
required_client_features	{}
last_failure	0
last_failure_osd_epoch	0
compat	compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2,11=minor log segments,12=quiesce subvolumes}
max_mds	1
in	0
up	{0=368484109}
failed	
damaged	
stopped	
data_pools	[106,105]
metadata_pool	107
inline_data	disabled
balancer	
bal_rank_mask	-1
standby_count_wanted	1
qdb_cluster	leader: 368484109 members: 368484109
[mds.la-orilla.mexico.2{0:368484109} state up:active seq 217880 addr v2:REDACTED:6801/REDACTED compat {c=[1],r=[1],i=[1fff]}]

[-- Attachment #1.5: Type: text/plain, Size: 1333 bytes --]


Mount spec:
admin@REDACTED.caskd=/ REDACTED ceph rw,relatime,name=admin,secret=<hidden>,ms_mode=secure,acl,mon_addr=REDACTED:3300/REDACTED:3300,recover_session=clean 0 0

- snap-schedule is set up on / and the filesystem is heavily used
  - not many concurrent accesses to the same dir/inodes
- encountered on both 6.12 and 6.16 kernels on multiple machines, including QEMU guests
- I can reproduce this every time with the kernel client but not with the FUSE client

# What I have tried so far:

- rebuilt the filesystem from scratch to rule out potentially inconsistent metadata
  - the bug returned after a few days on the new fs

# Ways to replicate (UNCONFIRMED):

Create a CephFS filesystem
Set up snap-schedule with retention
Do heavy RW on the filesystem while a snapshot gets deleted
List .snap/
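The last two steps could be sketched roughly as below. This is an unconfirmed sketch under stated assumptions, not a verified reproducer: the function names, the worker layout, and the example mount point are hypothetical, and it does not configure snap-schedule or trigger snapshot deletion, which have to happen on the cluster side while it runs.

```python
import os
import threading

def churn_small_files(directory, rounds=100, size=64):
    """Create, read back and delete small files to generate RW and metadata load."""
    for i in range(rounds):
        path = os.path.join(directory, f"f{i}")
        with open(path, "wb") as fh:
            fh.write(os.urandom(size))
        with open(path, "rb") as fh:
            fh.read()
        os.unlink(path)

def list_snapshots(directory, snapdir=".snap"):
    """List the snapshot directory; this is the step that faults in the report."""
    return sorted(os.listdir(os.path.join(directory, snapdir)))

def reproduce(directory, workers=4, rounds=100):
    """Run the file churn in background threads and list .snap/ concurrently."""
    threads = []
    for n in range(workers):
        sub = os.path.join(directory, f"worker{n}")
        os.makedirs(sub, exist_ok=True)
        t = threading.Thread(target=churn_small_files, args=(sub, rounds))
        t.start()
        threads.append(t)
    snaps = list_snapshots(directory)
    for t in threads:
        t.join()
    return snaps
```

It would be called as `reproduce("/mnt/cephfs")`, where `/mnt/cephfs` stands in for a kernel-client mount of the affected filesystem. Each worker uses its own subdirectory, in line with the note above that there are not many concurrent accesses to the same directories or inodes.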

# Misc info

- The previous filesystem had its main pool 3x replicated; the new one is erasure-coded (6-2, jerasure).
  - This seems to play no role in this bug.
- I am willing to narrow the issue down to a minimal reproducible example if this information is not enough.
- This filesystem contains a significant number of small files and directories, in case that plays any role.

Please let me know if you need any further information.

-- 
Alex D.
RedXen System & Infrastructure Administration
https://redxen.eu/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 858 bytes --]

Thread overview: 2+ messages
2025-10-13 20:35 caskd [this message]
2025-10-27 22:10 ` [bug report] listing cephfs snapshots causes general protection fault Viacheslav Dubeyko