Linux Btrfs filesystem development
* BUG in raid6_pq while running fstest btrfs/286
@ 2023-06-15 17:58 Jeff Layton
From: Jeff Layton @ 2023-06-15 17:58 UTC (permalink / raw)
  To: Alexander Gordeev, Song Liu, Heiko Carstens, Giulio Benetti; +Cc: linux-btrfs

I hit this today, while doing some testing with kdevops. Test btrfs/286
was running when it failed:

[ 4759.230216] run fstests btrfs/286 at 2023-06-15 16:11:41
[ 4759.636322] BTRFS: device fsid 8d197804-9964-4b3f-bbea-3ef33869b564 devid 1 transid 484 /dev/loop16 scanned by mount (893879)
[ 4759.641190] BTRFS info (device loop16): using crc32c (crc32c-intel) checksum algorithm
[ 4759.644817] BTRFS info (device loop16): using free space tree
[ 4759.650706] BTRFS info (device loop16): enabling ssd optimizations
[ 4759.652720] BTRFS info (device loop16): auto enabling async discard
[ 4760.484561] BTRFS: device fsid 2a451aed-b7b6-4498-ba17-0e28a2e1a26b devid 1 transid 6 /dev/loop5 scanned by (udev-worker) (894101)
[ 4760.494221] BTRFS: device fsid 2a451aed-b7b6-4498-ba17-0e28a2e1a26b devid 2 transid 6 /dev/loop6 scanned by (udev-worker) (892207)
[ 4760.497373] BTRFS: device fsid 2a451aed-b7b6-4498-ba17-0e28a2e1a26b devid 3 transid 6 /dev/loop7 scanned by (udev-worker) (892535)
[ 4760.502687] BTRFS: device fsid 2a451aed-b7b6-4498-ba17-0e28a2e1a26b devid 4 transid 6 /dev/loop8 scanned by mkfs.btrfs (894095)
[ 4760.515672] BTRFS info (device loop5): using crc32c (crc32c-intel) checksum algorithm
[ 4760.519412] BTRFS info (device loop5): setting nodatasum
[ 4760.521777] BTRFS info (device loop5): using free space tree
[ 4760.527120] BTRFS info (device loop5): enabling ssd optimizations
[ 4760.528861] BTRFS info (device loop5): auto enabling async discard
[ 4760.532184] BTRFS info (device loop5): checking UUID tree
[ 4762.658754] BTRFS info (device loop5): using crc32c (crc32c-intel) checksum algorithm
[ 4762.662098] BTRFS info (device loop5): allowing degraded mounts
[ 4762.664749] BTRFS info (device loop5): setting nodatasum
[ 4762.667347] BTRFS info (device loop5): using free space tree
[ 4762.672306] BTRFS warning (device loop5): devid 2 uuid de8712ab-ca85-4414-93a7-213060d1831d is missing
[ 4762.676977] BTRFS info (device loop5): enabling ssd optimizations
[ 4762.679852] BTRFS info (device loop5): auto enabling async discard
[ 4763.355404] BTRFS info (device loop5): dev_replace from <missing disk> (devid 2) to /dev/loop9 started
[ 4763.595633] BTRFS info (device loop5): dev_replace from <missing disk> (devid 2) to /dev/loop9 finished
[ 4764.044660] 286 (893750): drop_caches: 3
[ 4765.384814] BTRFS: device fsid 7acce38c-63c2-4365-a338-e1f6c0fd484b devid 1 transid 6 /dev/loop5 scanned by (udev-worker) (894101)
[ 4765.392235] BTRFS: device fsid 7acce38c-63c2-4365-a338-e1f6c0fd484b devid 2 transid 6 /dev/loop6 scanned by (udev-worker) (892207)
[ 4765.404469] BTRFS: device fsid 7acce38c-63c2-4365-a338-e1f6c0fd484b devid 3 transid 6 /dev/loop7 scanned by (udev-worker) (894101)
[ 4765.412107] BTRFS: device fsid 7acce38c-63c2-4365-a338-e1f6c0fd484b devid 4 transid 6 /dev/loop8 scanned by mkfs.btrfs (894169)
[ 4765.429084] BTRFS info (device loop5): using crc32c (crc32c-intel) checksum algorithm
[ 4765.433332] BTRFS info (device loop5): setting nodatasum
[ 4765.435506] BTRFS info (device loop5): using free space tree
[ 4765.440808] BTRFS info (device loop5): enabling ssd optimizations
[ 4765.442402] BTRFS info (device loop5): auto enabling async discard
[ 4765.444752] BTRFS info (device loop5): checking UUID tree
[ 4767.634901] BTRFS info (device loop5): using crc32c (crc32c-intel) checksum algorithm
[ 4767.637985] BTRFS info (device loop5): allowing degraded mounts
[ 4767.640216] BTRFS info (device loop5): setting nodatasum
[ 4767.642221] BTRFS info (device loop5): using free space tree
[ 4767.646646] BTRFS warning (device loop5): devid 2 uuid 6240c286-893c-4d19-bbf5-f1d2fecc6b96 is missing
[ 4767.650311] BTRFS warning (device loop5): devid 2 uuid 6240c286-893c-4d19-bbf5-f1d2fecc6b96 is missing
[ 4767.655256] BTRFS info (device loop5): enabling ssd optimizations
[ 4767.658073] BTRFS info (device loop5): auto enabling async discard
[ 4768.343633] BTRFS info (device loop5): dev_replace from <missing disk> (devid 2) to /dev/loop9 started
[ 4768.608799] BTRFS info (device loop5): dev_replace from <missing disk> (devid 2) to /dev/loop9 finished
[ 4768.750345] 286 (893750): drop_caches: 3
[ 4769.993871] BTRFS: device fsid 965cdb50-095a-4fd9-bcda-2c17bd80c3ad devid 1 transid 6 /dev/loop5 scanned by (udev-worker) (894101)
[ 4770.002879] BTRFS: device fsid 965cdb50-095a-4fd9-bcda-2c17bd80c3ad devid 2 transid 6 /dev/loop6 scanned by (udev-worker) (892207)
[ 4770.015617] BTRFS: device fsid 965cdb50-095a-4fd9-bcda-2c17bd80c3ad devid 3 transid 6 /dev/loop7 scanned by (udev-worker) (894101)
[ 4770.021936] BTRFS: device fsid 965cdb50-095a-4fd9-bcda-2c17bd80c3ad devid 4 transid 6 /dev/loop8 scanned by mkfs.btrfs (894243)
[ 4770.041357] BTRFS info (device loop5): using crc32c (crc32c-intel) checksum algorithm
[ 4770.043426] BTRFS info (device loop5): setting nodatasum
[ 4770.045340] BTRFS info (device loop5): using free space tree
[ 4770.050615] BTRFS info (device loop5): enabling ssd optimizations
[ 4770.053473] BTRFS info (device loop5): auto enabling async discard
[ 4770.056311] BTRFS info (device loop5): checking UUID tree
[ 4772.692223] BTRFS info (device loop5): using crc32c (crc32c-intel) checksum algorithm
[ 4772.695043] BTRFS info (device loop5): allowing degraded mounts
[ 4772.697901] BTRFS info (device loop5): setting nodatasum
[ 4772.700355] BTRFS info (device loop5): using free space tree
[ 4772.704900] BTRFS warning (device loop5): devid 2 uuid 5fa35bdf-8f54-4652-ba28-7c302a265f8d is missing
[ 4772.708151] BTRFS warning (device loop5): devid 2 uuid 5fa35bdf-8f54-4652-ba28-7c302a265f8d is missing
[ 4772.713703] BTRFS info (device loop5): enabling ssd optimizations
[ 4772.716270] BTRFS info (device loop5): auto enabling async discard
[ 4773.735253] BTRFS info (device loop5): dev_replace from <missing disk> (devid 2) to /dev/loop9 started
[ 4774.089640] BTRFS info (device loop5): dev_replace from <missing disk> (devid 2) to /dev/loop9 finished
[ 4774.269606] 286 (893750): drop_caches: 3
[ 4775.897236] BTRFS: device fsid 0552fbf6-2877-4ab3-b5a2-da5db268e1c3 devid 1 transid 6 /dev/loop5 scanned by (udev-worker) (894101)
[ 4775.905939] BTRFS: device fsid 0552fbf6-2877-4ab3-b5a2-da5db268e1c3 devid 2 transid 6 /dev/loop6 scanned by mkfs.btrfs (894317)
[ 4775.909603] BTRFS: device fsid 0552fbf6-2877-4ab3-b5a2-da5db268e1c3 devid 3 transid 6 /dev/loop7 scanned by mkfs.btrfs (894317)
[ 4775.913080] BTRFS: device fsid 0552fbf6-2877-4ab3-b5a2-da5db268e1c3 devid 4 transid 6 /dev/loop8 scanned by mkfs.btrfs (894317)
[ 4775.928177] BTRFS info (device loop5): using crc32c (crc32c-intel) checksum algorithm
[ 4775.930566] BTRFS info (device loop5): setting nodatasum
[ 4775.932930] BTRFS info (device loop5): using free space tree
[ 4775.937296] BTRFS info (device loop5): enabling ssd optimizations
[ 4775.938306] BTRFS info (device loop5): auto enabling async discard
[ 4775.940084] BTRFS info (device loop5): checking UUID tree
[ 4779.204728] BTRFS info (device loop5): using crc32c (crc32c-intel) checksum algorithm
[ 4779.207351] BTRFS info (device loop5): allowing degraded mounts
[ 4779.210284] BTRFS info (device loop5): setting nodatasum
[ 4779.212740] BTRFS info (device loop5): using free space tree
[ 4779.218547] BTRFS warning (device loop5): devid 2 uuid 9a9f7178-0caa-4c5f-8f92-034e72257005 is missing
[ 4779.221982] BTRFS warning (device loop5): devid 2 uuid 9a9f7178-0caa-4c5f-8f92-034e72257005 is missing
[ 4779.227912] BTRFS info (device loop5): enabling ssd optimizations
[ 4779.230483] BTRFS info (device loop5): auto enabling async discard
[ 4780.128223] BTRFS info (device loop5): dev_replace from <missing disk> (devid 2) to /dev/loop9 started
[ 4780.422390] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 4780.423934] #PF: supervisor read access in kernel mode
[ 4780.425584] #PF: error_code(0x0000) - not-present page
[ 4780.427234] PGD 0 P4D 0 
[ 4780.428293] Oops: 0000 [#1] PREEMPT SMP PTI
[ 4780.429722] CPU: 3 PID: 761699 Comm: kworker/u16:4 Not tainted 6.4.0-rc6+ #6
[ 4780.431582] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc38 04/01/2014
[ 4780.433897] Workqueue: btrfs-rmw rmw_rbio_work [btrfs]
[ 4780.435655] RIP: 0010:raid6_sse21_gen_syndrome+0x9e/0x130 [raid6_pq]
[ 4780.437518] Code: 4d 8d 54 05 00 44 89 c0 48 c1 e0 03 48 29 c6 49 8b 03 48 01 d0 0f 18 00 66 0f 6f 10 49 8b 01 0f 18 04 10 66 0f 6f e2 49 8b 01 <66> 0f 6f 34 10 4c 89 d0 45 85 c0 78 34 48 8b 08 0f 18 04 11 66 0f
[ 4780.442488] RSP: 0018:ffffb66f0296fdc8 EFLAGS: 00010286
[ 4780.444147] RAX: 0000000000000000 RBX: 0000000000001000 RCX: ffffa0ff4cfa3248
[ 4780.446192] RDX: 0000000000000000 RSI: ffffa0f74cfa3238 RDI: 0000000000000000
[ 4780.448278] RBP: ffffa0ff4e72a000 R08: 00000000fffffffe R09: ffffa0ff4cfa3238
[ 4780.450387] R10: ffffa0ff4cfa3230 R11: ffffa0ff4cfa3240 R12: ffffa0fe8bdf3000
[ 4780.452515] R13: ffffa0ff4cfa3240 R14: 0000000000000003 R15: 0000000000000000
[ 4780.454638] FS:  0000000000000000(0000) GS:ffffa0ff77cc0000(0000) knlGS:0000000000000000
[ 4780.456956] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4780.458778] CR2: 0000000000000000 CR3: 000000015eb0a001 CR4: 0000000000060ee0
[ 4780.460789] Call Trace:
[ 4780.461832]  <TASK>
[ 4780.462804]  ? __die+0x1f/0x70
[ 4780.463915]  ? page_fault_oops+0x159/0x450
[ 4780.465207]  ? fixup_exception+0x22/0x310
[ 4780.466484]  ? exc_page_fault+0x7a/0x180
[ 4780.467666]  ? asm_exc_page_fault+0x22/0x30
[ 4780.468879]  ? raid6_sse21_gen_syndrome+0x9e/0x130 [raid6_pq]
[ 4780.470372]  ? raid6_sse21_gen_syndrome+0x38/0x130 [raid6_pq]
[ 4780.471801]  rmw_rbio+0x5c8/0xa80 [btrfs]
[ 4780.472987]  ? preempt_count_add+0x6a/0xa0
[ 4780.474061]  ? lock_stripe_add+0xe1/0x290 [btrfs]
[ 4780.475288]  process_one_work+0x1c7/0x3d0
[ 4780.476304]  worker_thread+0x4d/0x380
[ 4780.477232]  ? __pfx_worker_thread+0x10/0x10
[ 4780.478241]  kthread+0xf3/0x120
[ 4780.479071]  ? __pfx_kthread+0x10/0x10
[ 4780.479982]  ret_from_fork+0x2c/0x50
[ 4780.480843]  </TASK>
[ 4780.481488] Modules linked in: dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio dm_log_writes dm_flakey nls_iso8859_1 nls_cp437 vfat fat ext4 9p crc16 joydev kvm_intel netfs virtio_net mbcache cirrus kvm psmouse pcspkr net_failover failover xfs irqbypass drm_shmem_helper virtio_balloon jbd2 evdev button 9pnet_virtio drm_kms_helper loop drm dm_mod zram zsmalloc crct10dif_pclmul crc32_pclmul ghash_clmulni_intel sha512_ssse3 sha512_generic aesni_intel nvme virtio_blk crypto_simd nvme_core virtio_pci cryptd t10_pi virtio i6300esb virtio_pci_legacy_dev crc64_rocksoft_generic virtio_pci_modern_dev crc64_rocksoft crc64 virtio_ring serio_raw btrfs blake2b_generic libcrc32c crc32c_generic crc32c_intel xor raid6_pq autofs4
[ 4780.492421] CR2: 0000000000000000
[ 4780.493185] ---[ end trace 0000000000000000 ]---
[ 4780.494099] RIP: 0010:raid6_sse21_gen_syndrome+0x9e/0x130 [raid6_pq]
[ 4780.495217] Code: 4d 8d 54 05 00 44 89 c0 48 c1 e0 03 48 29 c6 49 8b 03 48 01 d0 0f 18 00 66 0f 6f 10 49 8b 01 0f 18 04 10 66 0f 6f e2 49 8b 01 <66> 0f 6f 34 10 4c 89 d0 45 85 c0 78 34 48 8b 08 0f 18 04 11 66 0f
[ 4780.498186] RSP: 0018:ffffb66f0296fdc8 EFLAGS: 00010286
[ 4780.499138] RAX: 0000000000000000 RBX: 0000000000001000 RCX: ffffa0ff4cfa3248
[ 4780.500327] RDX: 0000000000000000 RSI: ffffa0f74cfa3238 RDI: 0000000000000000
[ 4780.501533] RBP: ffffa0ff4e72a000 R08: 00000000fffffffe R09: ffffa0ff4cfa3238
[ 4780.502683] R10: ffffa0ff4cfa3230 R11: ffffa0ff4cfa3240 R12: ffffa0fe8bdf3000
[ 4780.503827] R13: ffffa0ff4cfa3240 R14: 0000000000000003 R15: 0000000000000000
[ 4780.504971] FS:  0000000000000000(0000) GS:ffffa0ff77cc0000(0000) knlGS:0000000000000000
[ 4780.506207] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4780.507143] CR2: 0000000000000000 CR3: 000000015eb0a001 CR4: 0000000000060ee0
[ 4780.508242] note: kworker/u16:4[761699] exited with irqs disabled
[ 4780.509242] note: kworker/u16:4[761699] exited with preempt_count 1
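
The faulting bytes marked in the Code: line above (<66> 0f 6f 34 10) can be decoded by hand. What follows is a minimal sketch, not a general disassembler: the function names are illustrative and not taken from the kernel or any tool, and it handles only this one encoding shape (opcode 66 0f 6f, i.e. a movdqa load, with a ModRM byte that selects a SIB operand and no REX prefix).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Legacy (no REX prefix) 64-bit register names, indexed by 3-bit number. */
static const char *const gpr[8] = {
	"rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"
};

/*
 * Format the operands of "66 0f 6f /r" (movdqa load) in AT&T syntax for the
 * simple case seen in the oops: ModRM.mod == 0, ModRM.rm == 4 (a SIB byte
 * follows), SIB.base != 5, SIB.index != 4, no REX prefix.
 * Returns 0 on success, -1 for encodings this sketch does not handle.
 */
static int format_movdqa_sib(uint8_t modrm, uint8_t sib, char *out, size_t len)
{
	unsigned mod = modrm >> 6, reg = (modrm >> 3) & 7, rm = modrm & 7;
	unsigned scale = sib >> 6, index = (sib >> 3) & 7, base = sib & 7;

	if (mod != 0 || rm != 4 || base == 5 || index == 4)
		return -1;
	snprintf(out, len, "movdqa (%%%s,%%%s,%u),%%xmm%u",
		 gpr[base], gpr[index], 1u << scale, reg);
	return 0;
}

/* Effective address of that operand: base + index * 2^scale. */
static uint64_t sib_address(uint64_t base_val, uint64_t index_val, uint8_t sib)
{
	return base_val + (index_val << (sib >> 6));
}
```

With modrm = 0x34 and sib = 0x10 this formats as movdqa (%rax,%rdx,1),%xmm6. Plugging in RAX = 0 and RDX = 0 from the register dump gives an effective address of 0, which agrees with CR2 and with the "NULL pointer dereference, address: 0000000000000000" line.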


Looks like the movdqa load (an aligned 128-bit move) at sse2.c:56 faulted? I'm not well-versed in SSE asm, I'm afraid:

$ ./scripts/faddr2line --list ./lib/raid6/raid6_pq.ko raid6_sse21_gen_syndrome+0x9e/0x130
raid6_sse21_gen_syndrome+0x9e/0x130:

raid6_sse21_gen_syndrome at /home/jlayton/git/kdevops/linux/lib/raid6/sse2.c:56
 51 		for ( d = 0 ; d < bytes ; d += 16 ) {
 52 			asm volatile("prefetchnta %0" : : "m" (dptr[z0][d]));
 53 			asm volatile("movdqa %0,%%xmm2" : : "m" (dptr[z0][d])); /* P[0] */
 54 			asm volatile("prefetchnta %0" : : "m" (dptr[z0-1][d]));
 55 			asm volatile("movdqa %xmm2,%xmm4"); /* Q[0] */
>56<			asm volatile("movdqa %0,%%xmm6" : : "m" (dptr[z0-1][d]));
 57 			for ( z = z0-2 ; z >= 0 ; z-- ) {
 58 				asm volatile("prefetchnta %0" : : "m" (dptr[z][d]));
 59 				asm volatile("pcmpgtb %xmm4,%xmm5");
 60 				asm volatile("paddb %xmm4,%xmm4");
 61 				asm volatile("pand %xmm0,%xmm5");
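
The listing shows that each 16-byte step reads dptr[z0][d] and dptr[z0-1][d] unconditionally, before the inner loop gets a chance to check z >= 0. The sketch below models only that index arithmetic in plain C. It assumes z0 = disks - 3 (the highest data disk index), which is how lib/raid6 defines it but which is not visible in the excerpt; the struct and function names are hypothetical.

```c
/* Indices into dptr[] that the loop head reads on every 16-byte step. */
struct head_reads {
	int first;	/* index used by lines 52-53: dptr[z0][d]   */
	int second;	/* index used by lines 54-56: dptr[z0-1][d] */
};

/* Assumption: z0 = disks - 3, with P and Q stored at z0 + 1 and z0 + 2. */
static struct head_reads gen_syndrome_head_reads(int disks)
{
	int z0 = disks - 3;
	struct head_reads r = { z0, z0 - 1 };

	return r;
}

/* Nonzero when both unconditional reads use a non-negative data index. */
static int head_reads_in_range(int disks)
{
	struct head_reads r = gen_syndrome_head_reads(disks);

	return r.first >= 0 && r.second >= 0;
}
```

Under that assumption, disks = 4 reads indices 1 and 0, while disks = 3 reads indices 0 and -1, so the line 56 load would go through whatever sits just before the pointer array. The register dump may be consistent with the second case (R08 = 00000000fffffffe would be z0 - 2 = -2 if R08 holds the inner loop's start index, and R14 = 3 could be the disk count), but that is an inference from the dump, not something the report establishes.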


This machine is running v6.4.0-rc5 with some ctime handling patches on
top (nothing that should affect anything at this level). The Kconfig is
config-next-20230530 from the kdevops tree:

https://github.com/linux-kdevops/kdevops/blob/master/playbooks/roles/bootlinux/templates/config-next-20230530

Let me know if you need other info!
-- 
Jeff Layton <jlayton@kernel.org>
