Linux kernel -stable discussions
* [regression 6.1.y] discard/TRIM through RAID10 blocking (was: Re: Bug#1104460: linux-image-6.1.0-34-powerpc64le: Discard broken) with RAID10: BUG: kernel tried to execute user page (0) - exploit attempt?
       [not found] <174602441004.174814.6400502946223473449.reportbug@talos.vermwa.re>
@ 2025-04-30 15:55 ` Salvatore Bonaccorso
  2025-05-05 11:47   ` Moritz Mühlenhoff
                     ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Salvatore Bonaccorso @ 2025-04-30 15:55 UTC
  To: Melvin Vermeeren, Yu Kuai, Greg Kroah-Hartman
  Cc: 1104460, Coly Li, Sasha Levin, stable, regressions

Hi

We got a regression report in Debian after the update from 6.1.133 to
6.1.135. Melvin is reporting that discard/TRIM through a RAID10 array
stalls indefinitely. The full report is inlined below and originates
from https://bugs.debian.org/1104460 .

On Wed, Apr 30, 2025 at 04:46:50PM +0200, Melvin Vermeeren wrote:
> Package: src:linux
> Version: 6.1.135-1
> Severity: important
> Tags: upstream
> X-Debbugs-Cc: vermeeren@vermwa.re
> 
> Dear Maintainer,
> 
> Upgrading from linux-image-6.1.0-33-powerpc64le (6.1.133-1) to
> linux-image-6.1.0-34-powerpc64le (6.1.135-1) it appears there is a
> serious regression bug related to discard/TRIM through a RAID10 array.
> This only affects RAID10; a RAID1 array on the same SSD device is not
> affected. The array in question is a fairly standard RAID10 in 2far layout.
> 
> md127 : active raid10 dm-1[2] dm-0[0]
>       1872188416 blocks super 1.2 512K chunks 2 far-copies [2/2] [UU]
>       bitmap: 1/1 pages [64KB], 65536KB chunk
> 
> Any discard operation will result in quite a long kernel error. The
> calling process will either segfault (swapon) or, more likely, be stuck
> forever (Qemu, fstrim) in the D state per htop. The iostat utility
> reports a %util of 100% for any device (directly or indirectly) on top
> of the RAID10 device, despite there being no read or write requests to
> the devices or any other activity.
> 
> Stuck processes cannot be terminated or killed. Attempting to reboot
> normally will result in a stuck machine on shutdown, so only a
> REISUB-style reboot will work via procfs sysrq.
> 
> I have briefly diffed and inspected commits between the two kernel
> versions and I suspect the commit below may be at fault. Do keep in mind
> I have not verified this in any way, so I may be wrong.
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=4a05f7ae33716d996c5ce56478a36a3ede1d76f2
> 
> Considering this is shipped as part of a stable security update I
> consider it quite a serious bug. Affected hosts will not boot up
> cleanly, may not have swap, processes will freeze upon discard, and a clean
> reboot is also not possible.
> 
> More logs available upon request.
> 
> Many thanks,
> 
> Melvin Vermeeren.
> 
> -- Package-specific info:
> ** Version:
> Linux version 6.1.0-34-powerpc64le (debian-kernel@lists.debian.org) (gcc-12 (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP Debian 6.1.135-1 (2025-04-25)
> 
> ** Command line:
> root=/dev/mapper/...-root ro quiet
> 
> ** Not tainted
> 
> ** Kernel log:
> # /etc/fstab entry
> /dev/.../swap none swap sw,discard=once 0 0
> 
> ~# swapon -va
> swapon: /dev/mapper/...-swap: found signature [pagesize=65536, signature=swap]
> swapon: /dev/mapper/...-swap: pagesize=65536, swapsize=17179869184, devsize=17179869184
> swapon /dev/mapper/...-swap
> Segmentation fault
> 
> ~# dmesg
> ...
> [  223.017257] kernel tried to execute user page (0) - exploit attempt? (uid: 0)
> [  223.017287] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
> [  223.017301] Faulting instruction address: 0x00000000
> [  223.017326] Oops: Kernel access of bad area, sig: 11 [#1]
> [  223.017338] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
> [  223.017365] Modules linked in: bridge stp llc binfmt_misc nft_connlimit nf_conncount ast drm_vram_helper drm_ttm_helper ofpart ipmi_powernv ttm ipmi_devintf powernv_flash at24 mtd ipmi_msghandler opal_prd regmap_i2c drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops i2c_algo_bit sg nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nf_tables nfnetlink drm loop fuse drm_panel_orientation_quirks configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 dm_crypt dm_integrity dm_bufio dm_mod macvlan raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sd_mod t10_pi crc64_rocksoft_generic crc64_rocksoft crc_t10dif crct10dif_generic crc64 crct10dif_common xhci_pci xts ecb xhci_hcd ctr vmx_crypto gf128mul crc32c_vpmsum tg3 mpt3sas usbcore raid_class libphy scsi_transport_sas usb_common
> [  223.017812] CPU: 8 PID: 10609 Comm: swapon Not tainted 6.1.0-34-powerpc64le #1  Debian 6.1.135-1
> [  223.017844] Hardware name: T2P9D01 REV 1.01 POWER9 0x4e1202 opal:skiboot-bc106a0 PowerNV
> [  223.017879] NIP:  0000000000000000 LR: c0000000003efe70 CTR: 0000000000000000
> [  223.017926] REGS: c0000000276cf200 TRAP: 0400   Not tainted  (6.1.0-34-powerpc64le Debian 6.1.135-1)
> [  223.017979] MSR:  900000004280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 24004480  XER: 00000004
> [  223.018060] CFAR: c0000000003efe6c IRQMASK: 0
>                GPR00: c0000000003efec4 c0000000276cf4a0 c000000001148100 0000000000092800
>                GPR04: 0000000000000000 0000000000000003 0000000000000c00 c00000000296e700
>                GPR08: c00000000c0e9700 00000c0000090800 0000000000000000 0000000000002000
>                GPR12: 0000000000000000 c000001ffffd9800 c0000000446b8c00 0000000000000000
>                GPR16: 0000000000000400 0000000000000000 0000000000000001 000000000000c812
>                GPR20: 000000000000c911 c0000000170c5700 c00000000296e718 c00000000296e3f0
>                GPR24: 0000000000000000 00000000000003ff 0000000000000000 0000000000000c00
>                GPR28: c000200009e2dd00 c00000000296e718 00000c0000092800 0000000000092c00
> [  223.018372] NIP [0000000000000000] 0x0
> [  223.018397] LR [c0000000003efe70] mempool_alloc+0xa0/0x210
> [  223.018435] Call Trace:
> [  223.018453] [c0000000276cf4a0] [c0000000003efec4] mempool_alloc+0xf4/0x210 (unreliable)
> [  223.018507] [c0000000276cf520] [c000000000743bf8] bio_alloc_bioset+0x368/0x510
> [  223.018552] [c0000000276cf5a0] [c000000000743e74] bio_alloc_clone+0x44/0xa0
> [  223.018601] [c0000000276cf5e0] [c008000015793adc] md_account_bio+0x54/0xb0 [md_mod]
> [  223.018655] [c0000000276cf610] [c00800001567778c] raid10_make_request+0xc54/0x1040 [raid10]
> [  223.018687] [c0000000276cf770] [c00800001579a290] md_handle_request+0x198/0x380 [md_mod]
> [  223.018735] [c0000000276cf800] [c00000000074c32c] __submit_bio+0x9c/0x250
> [  223.018773] [c0000000276cf840] [c00000000074ca88] submit_bio_noacct_nocheck+0x178/0x3f0
> [  223.018825] [c0000000276cf8b0] [c000000000743e08] blk_next_bio+0x68/0x90
> [  223.018863] [c0000000276cf8e0] [c000000000758c60] __blkdev_issue_discard+0x180/0x280
> [  223.018898] [c0000000276cf980] [c000000000758de8] blkdev_issue_discard+0x88/0x120
> [  223.018927] [c0000000276cfa00] [c0000000004a9e8c] sys_swapon+0x11dc/0x18a0
> [  223.018971] [c0000000276cfb50] [c00000000002b038] system_call_exception+0x138/0x260
> [  223.019015] [c0000000276cfe10] [c00000000000c0f0] system_call_vectored_common+0xf0/0x280
> [  223.019058] --- interrupt: 3000 at 0x7fff95146770
> [  223.019095] NIP:  00007fff95146770 LR: 00007fff95146770 CTR: 0000000000000000
> [  223.019132] REGS: c0000000276cfe80 TRAP: 3000   Not tainted  (6.1.0-34-powerpc64le Debian 6.1.135-1)
> [  223.019182] MSR:  900000000280f033 <SF,HV,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 48002481  XER: 00000000
> [  223.019267] IRQMASK: 0
>                GPR00: 0000000000000057 00007fffdca2ace0 00007fff95256f00 00000001220a1c20
>                GPR04: 0000000000030000 000000000000001e 000000000000000a 000000000000000a
>                GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>                GPR12: 0000000000000000 00007fff955dcbc0 0000000000000000 0000000000000000
>                GPR16: 0000000000000000 00000001104066b0 00007fffdca2afc8 000000011040cbd0
>                GPR20: 000000011040cbd8 0000000000000000 0000000000010000 00007fffdca2aff0
>                GPR24: 00007fffdca2afd0 0000000000000003 0000000000030000 0000000400000000
>                GPR28: 00000001220a1c20 000000000000fff6 00000001220a30a0 0000000000100000
> [  223.019542] NIP [00007fff95146770] 0x7fff95146770
> [  223.019568] LR [00007fff95146770] 0x7fff95146770
> [  223.019595] --- interrupt: 3000
> [  223.019604] Instruction dump:
> [  223.019626] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> [  223.019665] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> [  223.019712] ---[ end trace 0000000000000000 ]---
> 
> [  224.623456] note: swapon[10609] exited with irqs disabled
> [  224.623483] ------------[ cut here ]------------
> [  224.623502] WARNING: CPU: 8 PID: 10609 at kernel/exit.c:816 do_exit+0x94/0xbc0
> [  224.623516] Modules linked in: bridge stp llc binfmt_misc nft_connlimit nf_conncount ast drm_vram_helper drm_ttm_helper ofpart ipmi_powernv ttm ipmi_devintf powernv_flash at24 mtd ipmi_msghandler opal_prd regmap_i2c drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops i2c_algo_bit sg nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nf_tables nfnetlink drm loop fuse drm_panel_orientation_quirks configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 dm_crypt dm_integrity dm_bufio dm_mod macvlan raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sd_mod t10_pi crc64_rocksoft_generic crc64_rocksoft crc_t10dif crct10dif_generic crc64 crct10dif_common xhci_pci xts ecb xhci_hcd ctr vmx_crypto gf128mul crc32c_vpmsum tg3 mpt3sas usbcore raid_class libphy scsi_transport_sas usb_common
> [  224.623825] CPU: 8 PID: 10609 Comm: swapon Tainted: G      D            6.1.0-34-powerpc64le #1  Debian 6.1.135-1
> [  224.623860] Hardware name: T2P9D01 REV 1.01 POWER9 0x4e1202 opal:skiboot-bc106a0 PowerNV
> [  224.623892] NIP:  c000000000140fa4 LR: c000000000140fa0 CTR: 0000000000000000
> [  224.623935] REGS: c0000000276cecb0 TRAP: 0700   Tainted: G      D             (6.1.0-34-powerpc64le Debian 6.1.135-1)
> [  224.623969] MSR:  9000000002029033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE>  CR: 24004222  XER: 00000004
> [  224.624012] CFAR: c00000000013ea68 IRQMASK: 0
>                GPR00: c000000000140fa0 c0000000276cef50 c000000001148100 0000000000000000
>                GPR04: 0000000000000000 c0000000276cee20 c0000000276cee18 0000001ffb000000
>                GPR08: 0000000000000027 c0000000276cf9b0 0000000000000000 0000000000004000
>                GPR12: 0000000031c40000 c000001ffffd9800 c0000000446b8c00 0000000000000000
>                GPR16: 0000000000000400 0000000000000000 0000000000000001 000000000000c812
>                GPR20: 000000000000c911 c0000000170c5700 c00000000296e718 c00000000296e3f0
>                GPR24: 0000000000000000 00000000000003ff 0000000000000000 0000000000000c00
>                GPR28: 000000000000000b c00000001ce25d80 c000000078409c00 c000000026529d80
> [  224.624208] NIP [c000000000140fa4] do_exit+0x94/0xbc0
> [  224.624239] LR [c000000000140fa0] do_exit+0x90/0xbc0
> [  224.624269] Call Trace:
> [  224.624274] [c0000000276cef50] [c000000000140fa0] do_exit+0x90/0xbc0 (unreliable)
> [  224.624308] [c0000000276cf020] [c000000000141b80] make_task_dead+0xb0/0x1f0
> [  224.624320] [c0000000276cf0a0] [c000000000025718] oops_end+0x188/0x1c0
> [  224.624341] [c0000000276cf120] [c00000000007f72c] __bad_page_fault+0x18c/0x1b0
> [  224.624375] [c0000000276cf190] [c000000000008cd4] instruction_access_common_virt+0x194/0x1a0
> [  224.624421] --- interrupt: 400 at 0x0
> [  224.624438] NIP:  0000000000000000 LR: c0000000003efe70 CTR: 0000000000000000
> [  224.624471] REGS: c0000000276cf200 TRAP: 0400   Tainted: G      D             (6.1.0-34-powerpc64le Debian 6.1.135-1)
> [  224.624507] MSR:  900000004280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 24004480  XER: 00000004
> [  224.624544] CFAR: c0000000003efe6c IRQMASK: 0
>                GPR00: c0000000003efec4 c0000000276cf4a0 c000000001148100 0000000000092800
>                GPR04: 0000000000000000 0000000000000003 0000000000000c00 c00000000296e700
>                GPR08: c00000000c0e9700 00000c0000090800 0000000000000000 0000000000002000
>                GPR12: 0000000000000000 c000001ffffd9800 c0000000446b8c00 0000000000000000
>                GPR16: 0000000000000400 0000000000000000 0000000000000001 000000000000c812
>                GPR20: 000000000000c911 c0000000170c5700 c00000000296e718 c00000000296e3f0
>                GPR24: 0000000000000000 00000000000003ff 0000000000000000 0000000000000c00
>                GPR28: c000200009e2dd00 c00000000296e718 00000c0000092800 0000000000092c00
> [  224.624732] NIP [0000000000000000] 0x0
> [  224.624749] LR [c0000000003efe70] mempool_alloc+0xa0/0x210
> [  224.624771] --- interrupt: 400
> [  224.624789] [c0000000276cf4a0] [c0000000003efec4] mempool_alloc+0xf4/0x210 (unreliable)
> [  224.624823] [c0000000276cf520] [c000000000743bf8] bio_alloc_bioset+0x368/0x510
> [  224.624859] [c0000000276cf5a0] [c000000000743e74] bio_alloc_clone+0x44/0xa0
> [  224.624892] [c0000000276cf5e0] [c008000015793adc] md_account_bio+0x54/0xb0 [md_mod]
> [  224.624930] [c0000000276cf610] [c00800001567778c] raid10_make_request+0xc54/0x1040 [raid10]
> [  224.624964] [c0000000276cf770] [c00800001579a290] md_handle_request+0x198/0x380 [md_mod]
> [  224.624997] [c0000000276cf800] [c00000000074c32c] __submit_bio+0x9c/0x250
> [  224.625018] [c0000000276cf840] [c00000000074ca88] submit_bio_noacct_nocheck+0x178/0x3f0
> [  224.625043] [c0000000276cf8b0] [c000000000743e08] blk_next_bio+0x68/0x90
> [  224.625066] [c0000000276cf8e0] [c000000000758c60] __blkdev_issue_discard+0x180/0x280
> [  224.625091] [c0000000276cf980] [c000000000758de8] blkdev_issue_discard+0x88/0x120
> [  224.625115] [c0000000276cfa00] [c0000000004a9e8c] sys_swapon+0x11dc/0x18a0
> [  224.625139] [c0000000276cfb50] [c00000000002b038] system_call_exception+0x138/0x260
> [  224.625164] [c0000000276cfe10] [c00000000000c0f0] system_call_vectored_common+0xf0/0x280
> [  224.625201] --- interrupt: 3000 at 0x7fff95146770
> [  224.625270] NIP:  00007fff95146770 LR: 00007fff95146770 CTR: 0000000000000000
> [  224.625367] REGS: c0000000276cfe80 TRAP: 3000   Tainted: G      D             (6.1.0-34-powerpc64le Debian 6.1.135-1)
> [  224.625458] MSR:  900000000000f033 <SF,HV,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 48002481  XER: 00000000
> [  224.625570] IRQMASK: 0
>                GPR00: 0000000000000057 00007fffdca2ace0 00007fff95256f00 00000001220a1c20
>                GPR04: 0000000000030000 000000000000001e 000000000000000a 000000000000000a
>                GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>                GPR12: 0000000000000000 00007fff955dcbc0 0000000000000000 0000000000000000
>                GPR16: 0000000000000000 00000001104066b0 00007fffdca2afc8 000000011040cbd0
>                GPR20: 000000011040cbd8 0000000000000000 0000000000010000 00007fffdca2aff0
>                GPR24: 00007fffdca2afd0 0000000000000003 0000000000030000 0000000400000000
>                GPR28: 00000001220a1c20 000000000000fff6 00000001220a30a0 0000000000100000
> [  224.626325] NIP [00007fff95146770] 0x7fff95146770
> [  224.626388] LR [00007fff95146770] 0x7fff95146770
> [  224.626522] --- interrupt: 3000
> [  224.626568] Instruction dump:
> [  224.626587] 60000000 813f000c 3929ffff 2c090000 913f000c 40820010 813f0074 71290004
> [  224.626680] 4182074c 7fa3eb78 4bffda7d e93e0b10 <0b090000> e87e0a48 48c7dd0d 60000000
> [  224.626786] ---[ end trace 0000000000000000 ]---

Does this ring a bell?

Melvin, the same change also went into other stable series (6.6.88,
6.12.25, 6.14.4). Can you test e.g. 6.12.25-1 from Debian unstable to
see if the regression is there as well?

Might you be able to bisect the upstream stable series between 6.1.133
and 6.1.135 to confirm that the mentioned commit is the one breaking things?
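
For reference, a bisect between the two releases could be started
roughly as follows. This is only a sketch: it assumes a clone of the
linux-stable tree in the current directory with the v6.1.133 and
v6.1.135 tags available, and the `run` and `bisect_raid10_regression`
helpers are hypothetical names introduced here for illustration. With
DRY_RUN=1 the command is only printed, not executed.

```shell
# Run a command, or just print it when DRY_RUN=1 is set.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Start a bisect with the first bad tag and the last known good tag.
# Assumption: both tags exist in the local clone.
bisect_raid10_regression() {
    run git bisect start v6.1.135 v6.1.133
    # After this, every candidate revision has to be built and booted,
    # a discard triggered on the RAID10 array, and the outcome recorded
    # with "git bisect good" or "git bisect bad".
}
```

Calling `DRY_RUN=1 bisect_raid10_regression` shows the command first;
without DRY_RUN it starts the bisect in the current repository.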

Regards,
Salvatore


* Re: [regression 6.1.y] discard/TRIM through RAID10 blocking (was: Re: Bug#1104460: linux-image-6.1.0-34-powerpc64le: Discard broken) with RAID10: BUG: kernel tried to execute user page (0) - exploit attempt?
  2025-04-30 15:55 ` [regression 6.1.y] discard/TRIM through RAID10 blocking (was: Re: Bug#1104460: linux-image-6.1.0-34-powerpc64le: Discard broken) with RAID10: BUG: kernel tried to execute user page (0) - exploit attempt? Salvatore Bonaccorso
@ 2025-05-05 11:47   ` Moritz Mühlenhoff
  2025-05-05 14:00     ` Salvatore Bonaccorso
  2025-05-06  1:11   ` Yu Kuai
  2025-05-06 15:16   ` [regression 6.1.y] discard/TRIM through RAID10 blocking (was: Re: Bug#1104460: linux-image-6.1.0-34-powerpc64le: Discard broken) with RAID10: BUG: kernel tried to execute user page (0) - exploit attempt? Melvin Vermeeren
  2 siblings, 1 reply; 13+ messages in thread
From: Moritz Mühlenhoff @ 2025-05-05 11:47 UTC
  To: Yu Kuai
  Cc: Melvin Vermeeren, Greg Kroah-Hartman, 1104460, Coly Li,
	Sasha Levin, stable, regressions, Salvatore Bonaccorso

Am Wed, Apr 30, 2025 at 05:55:20PM +0200 schrieb Salvatore Bonaccorso:
> Hi
> 
> We got a regression report in Debian after the update from 6.1.133 to
> 6.1.135. Melvin is reporting that discard/TRIM through a RAID10 array
> stalls indefinitely. The full report is inlined below and originates
> from https://bugs.debian.org/1104460 .

JFTR, we ran into the same problem with a few Wikimedia servers running
6.1.135 and RAID 10: The servers started to lock up once fstrim.service
got started. Full oops messages are available at
https://phabricator.wikimedia.org/P75746

Cheers,
        Moritz


* Re: [regression 6.1.y] discard/TRIM through RAID10 blocking (was: Re: Bug#1104460: linux-image-6.1.0-34-powerpc64le: Discard broken) with RAID10: BUG: kernel tried to execute user page (0) - exploit attempt?
  2025-05-05 11:47   ` Moritz Mühlenhoff
@ 2025-05-05 14:00     ` Salvatore Bonaccorso
  2025-05-05 16:02       ` Salvatore Bonaccorso
  0 siblings, 1 reply; 13+ messages in thread
From: Salvatore Bonaccorso @ 2025-05-05 14:00 UTC
  To: Moritz Mühlenhoff
  Cc: Yu Kuai, Melvin Vermeeren, Greg Kroah-Hartman, 1104460, Coly Li,
	Sasha Levin, stable, regressions

Hi Moritz,

On Mon, May 05, 2025 at 01:47:15PM +0200, Moritz Mühlenhoff wrote:
> Am Wed, Apr 30, 2025 at 05:55:20PM +0200 schrieb Salvatore Bonaccorso:
> > Hi
> > 
> > We got a regression report in Debian after the update from 6.1.133 to
> > 6.1.135. Melvin is reporting that discard/TRIM through a RAID10 array
> > stalls indefinitely. The full report is inlined below and originates
> > from https://bugs.debian.org/1104460 .
> 
> JFTR, we ran into the same problem with a few Wikimedia servers running
> 6.1.135 and RAID 10: The servers started to lock up once fstrim.service
> got started. Full oops messages are available at
> https://phabricator.wikimedia.org/P75746

Thanks for these additional data points. Assuming you won't be able to
test the other stable series where the commit d05af90d6218
("md/raid10: fix missing discard IO accounting") went in, might you at
least be able to test the 6.1.y branch with the commit reverted again
and manually trigger the issue?

If needed I can provide a test Debian package of 6.1.135 (or 6.1.137)
with the patch reverted. 

Regards,
Salvatore


* Re: [regression 6.1.y] discard/TRIM through RAID10 blocking (was: Re: Bug#1104460: linux-image-6.1.0-34-powerpc64le: Discard broken) with RAID10: BUG: kernel tried to execute user page (0) - exploit attempt?
  2025-05-05 14:00     ` Salvatore Bonaccorso
@ 2025-05-05 16:02       ` Salvatore Bonaccorso
  2025-05-05 18:50         ` Bug#1104460: " Antoine Beaupré
  0 siblings, 1 reply; 13+ messages in thread
From: Salvatore Bonaccorso @ 2025-05-05 16:02 UTC
  To: Moritz Mühlenhoff, Yu Kuai
  Cc: Melvin Vermeeren, Greg Kroah-Hartman, 1104460, Coly Li,
	Sasha Levin, stable, regressions

On Mon, May 05, 2025 at 04:00:31PM +0200, Salvatore Bonaccorso wrote:
> Hi Moritz,
> 
> On Mon, May 05, 2025 at 01:47:15PM +0200, Moritz Mühlenhoff wrote:
> > Am Wed, Apr 30, 2025 at 05:55:20PM +0200 schrieb Salvatore Bonaccorso:
> > > Hi
> > > 
> > > We got a regression report in Debian after the update from 6.1.133 to
> > > 6.1.135. Melvin is reporting that discard/TRIM through a RAID10 array
> > > stalls indefinitely. The full report is inlined below and originates
> > > from https://bugs.debian.org/1104460 .
> > 
> > JFTR, we ran into the same problem with a few Wikimedia servers running
> > 6.1.135 and RAID 10: The servers started to lock up once fstrim.service
> > got started. Full oops messages are available at
> > https://phabricator.wikimedia.org/P75746
> 
> Thanks for these additional data points. Assuming you won't be able to
> test the other stable series where the commit d05af90d6218
> ("md/raid10: fix missing discard IO accounting") went in, might you at
> least be able to test the 6.1.y branch with the commit reverted again
> and manually trigger the issue?
> 
> If needed I can provide a test Debian package of 6.1.135 (or 6.1.137)
> with the patch reverted. 

So one additional data point, as several Debian users were reporting
back being affected: one user did upgrade to 6.12.25 (where the
commit was backported as well) and is not able to reproduce the issue
there.

This indicates we might be missing some prerequisites in the 6.1.y series?

The user is now also trying 6.1.135 with the patch reverted.

Regards,
Salvatore


* Re: Bug#1104460: [regression 6.1.y] discard/TRIM through RAID10 blocking (was: Re: Bug#1104460: linux-image-6.1.0-34-powerpc64le: Discard broken) with RAID10: BUG: kernel tried to execute user page (0) - exploit attempt?
  2025-05-05 16:02       ` Salvatore Bonaccorso
@ 2025-05-05 18:50         ` Antoine Beaupré
  2025-05-05 20:36           ` Salvatore Bonaccorso
  0 siblings, 1 reply; 13+ messages in thread
From: Antoine Beaupré @ 2025-05-05 18:50 UTC
  To: Salvatore Bonaccorso, 1104460, Moritz Mühlenhoff, Yu Kuai
  Cc: Melvin Vermeeren, Greg Kroah-Hartman, 1104460, Coly Li,
	Sasha Levin, stable, regressions

On 2025-05-05 18:02:37, Salvatore Bonaccorso wrote:
> On Mon, May 05, 2025 at 04:00:31PM +0200, Salvatore Bonaccorso wrote:
>> Hi Moritz,
>> 
>> On Mon, May 05, 2025 at 01:47:15PM +0200, Moritz Mühlenhoff wrote:
>> > Am Wed, Apr 30, 2025 at 05:55:20PM +0200 schrieb Salvatore Bonaccorso:
>> > > Hi
>> > > 
>> > > We got a regression report in Debian after the update from 6.1.133 to
>> > > 6.1.135. Melvin is reporting that discard/TRIM through a RAID10 array
>> > > stalls indefinitely. The full report is inlined below and originates
>> > > from https://bugs.debian.org/1104460 .
>> > 
>> > JFTR, we ran into the same problem with a few Wikimedia servers running
>> > 6.1.135 and RAID 10: The servers started to lock up once fstrim.service
>> > got started. Full oops messages are available at
>> > https://phabricator.wikimedia.org/P75746
>> 
>> Thanks for these additional data points. Assuming you won't be able to
>> test the other stable series where the commit d05af90d6218
>> ("md/raid10: fix missing discard IO accounting") went in, might you at
>> least be able to test the 6.1.y branch with the commit reverted again
>> and manually trigger the issue?
>> 
>> If needed I can provide a test Debian package of 6.1.135 (or 6.1.137)
>> with the patch reverted. 
>
> So one additional data point as several Debian users were reporting
> back being affected: One user did upgrade to 6.12.25 (where the
> commit was backported as well) and is not able to reproduce the issue
> there.

That would be me.

I can reproduce the issue as outlined by Moritz above fairly reliably in
6.1.135 (debian package 6.1.0-34-amd64). The reproducer is simple, on a
RAID-10 host:

 1. reboot
 2. systemctl start fstrim.service
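
To check beforehand whether a host has a RAID10 array at all, the md
personality can be read from /proc/mdstat. A minimal sketch, assuming
the usual "mdX : active raid10 ..." line format seen in the original
report; `list_raid10_arrays` is a hypothetical helper name, not an
existing tool:

```shell
# Print the names of md arrays whose personality is raid10.
# Reads /proc/mdstat-formatted text on stdin.
list_raid10_arrays() {
    awk '$1 ~ /^md/ && $2 == ":" {
        # The personality follows "active", possibly after a state
        # such as "(auto-read-only)", so scan the remaining fields.
        for (i = 3; i <= NF; i++)
            if ($i == "raid10") { print $1; break }
    }'
}
```

On a live system it would be fed with `list_raid10_arrays < /proc/mdstat`;
an empty result suggests the reproducer does not apply to that host.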

We're tracking the issue internally in:

https://gitlab.torproject.org/tpo/tpa/team/-/issues/42146

I've managed to work around the issue by upgrading to the Debian package
from testing/unstable (6.12.25), as Salvatore indicated above. There,
fstrim doesn't cause any crash and completes successfully. In stable, it
just hangs there forever. The kernel doesn't completely panic and the
machine is otherwise somewhat still functional: my existing SSH
connection keeps working, for example, but new ones fail. And an `apt
install` of another kernel hangs forever.
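
Tasks hung this way show up in uninterruptible sleep (state D, as noted
in the original report). A small sketch to list them from the output of
`ps -eo state,pid,comm`; `list_d_state` is a hypothetical helper name
used only for illustration:

```shell
# Print the lines of "ps -eo state,pid,comm" output whose state column
# starts with D (uninterruptible sleep). The header line is skipped
# implicitly because its first field is "S".
list_d_state() {
    awk '$1 ~ /^D/'
}
```

Typical use would be `ps -eo state,pid,comm | list_d_state`; on an
affected host one would expect fstrim or similar to appear there.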

> This indicates we might be missing some prerequisites in the 6.1.y series?
>
> The user is now also trying 6.1.135 with the patch reverted.

I am embarrassed to say I couldn't figure out how to build a Debian
package of the Linux kernel at the moment. I would be happy to test a
built package, that said. I got stuck on various snags: the
`debian/bin/test-patches` script seems to require a flavor (worked around
with `-f amd64`) and in the end the build failed with:

[...]

  ld -r -m elf_x86_64 -z noexecstack --no-warn-rwx-segments --build-id=sha1  -T scripts/module.lds -o virt/lib/irqbypass.ko virt/lib/irqbypass.o virt/lib/irqbypass.mod.o;  true
debian/bin/buildcheck.py debian/build/build_amd64_none_amd64 amd64 none amd64
Can't read ABI reference.  ABI not checked!
make[2]: Leaving directory '/home/anarcat/dist/linux-6.1.135'
/usr/bin/make -f debian/rules.real build_kbuild ABINAME='6.1.0-0.a.test' ARCH='amd64' DESTDIR='/home/anarcat/dist/linux-6.1.135/debian/linux-kbuild-6.1' DH_OPTIONS='-plinux-kbuild-6.1' KERNEL_ARCH='x86' PACKAGE_NAME='linux-kbuild-6.1' SOURCEVERSION='6.1.135-1a~test' SOURCE_BASENAME='linux' SOURCE_SUFFIX='' UPSTREAMVERSION='6.1' VERSION='6.1'
make[2]: Entering directory '/home/anarcat/dist/linux-6.1.135'
mkdir -p debian/build/build-tools/headers-tools
/usr/bin/make ARCH=x86 O=debian/build/build-tools/headers-tools \
	INSTALL_HDR_PATH=/home/anarcat/dist/linux-6.1.135/debian/build/build-tools \
	headers_install
make[3]: Entering directory '/home/anarcat/dist/linux-6.1.135'
***
*** Configuration file ".config" not found!
***
*** Please run some configurator (e.g. "make oldconfig" or
*** "make menuconfig" or "make xconfig").
***
/home/anarcat/dist/linux-6.1.135/Makefile:792: include/config/auto.conf.cmd: No such file or directory
make[4]: *** [/home/anarcat/dist/linux-6.1.135/Makefile:801: .config] Error 1
make[3]: *** [Makefile:250: __sub-make] Error 2
make[3]: Leaving directory '/home/anarcat/dist/linux-6.1.135'
make[2]: *** [debian/rules.real:530: debian/stamps/build-tools-headers] Error 2
make[2]: Leaving directory '/home/anarcat/dist/linux-6.1.135'
make[1]: *** [debian/rules.gen:1471: build-arch_amd64_real_kbuild] Error 2
make[1]: Leaving directory '/home/anarcat/dist/linux-6.1.135'
make: *** [debian/rules:40: build-arch] Error 2
dpkg-buildpackage: error: debian/rules binary subprocess returned exit status 2

It's been a while since I compiled linux, amazingly... It might be
because I'm trying to compile the Debian 12 kernel on Debian 13. Here
are the steps I took:

curl -o 4a05f7ae33716d996c5ce56478a36a3ede1d76f2.patch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/patch/?id=4a05f7ae33716d996c5ce56478a36a3ede1d76f2
# (reverse the patch)
sudo apt-get build-dep linux
apt source -t bookworm-security linux
./debian/bin/test-patches -f amd64 ../4a05f7ae33716d996c5ce56478a36a3ede1d76f2.patch

a.

-- 
Life is like riding a bicycle. To keep your balance you must keep moving.
                       - Albert Einstein


* Re: Bug#1104460: [regression 6.1.y] discard/TRIM through RAID10 blocking (was: Re: Bug#1104460: linux-image-6.1.0-34-powerpc64le: Discard broken) with RAID10: BUG: kernel tried to execute user page (0) - exploit attempt?
  2025-05-05 18:50         ` Bug#1104460: " Antoine Beaupré
@ 2025-05-05 20:36           ` Salvatore Bonaccorso
  2025-05-05 20:59             ` Antoine Beaupré
  0 siblings, 1 reply; 13+ messages in thread
From: Salvatore Bonaccorso @ 2025-05-05 20:36 UTC
  To: Antoine Beaupré, 1104460
  Cc: Moritz Mühlenhoff, Yu Kuai, Melvin Vermeeren,
	Greg Kroah-Hartman, Coly Li, Sasha Levin, stable, regressions

Hi Antoine,

On Mon, May 05, 2025 at 02:50:32PM -0400, Antoine Beaupré wrote:
> On 2025-05-05 18:02:37, Salvatore Bonaccorso wrote:
> > On Mon, May 05, 2025 at 04:00:31PM +0200, Salvatore Bonaccorso wrote:
> >> Hi Moritz,
> >> 
> >> On Mon, May 05, 2025 at 01:47:15PM +0200, Moritz Mühlenhoff wrote:
> >> > Am Wed, Apr 30, 2025 at 05:55:20PM +0200 schrieb Salvatore Bonaccorso:
> >> > > Hi
> >> > > 
> >> > > We got a regression report in Debian after the update from 6.1.133 to
> >> > > 6.1.135. Melvin is reporting that discard/TRIM through a RAID10 array
> >> > > stalls indefinitely. The full report is inlined below and originates
> >> > > from https://bugs.debian.org/1104460 .
> >> > 
> >> > JFTR, we ran into the same problem with a few Wikimedia servers running
> >> > 6.1.135 and RAID 10: The servers started to lock up once fstrim.service
> >> > got started. Full oops messages are available at
> >> > https://phabricator.wikimedia.org/P75746
> >> 
> >> Thanks for these additional data points. Assuming you won't be able to
> >> test the other stable series where the commit d05af90d6218
> >> ("md/raid10: fix missing discard IO accounting") went in, might you at
> >> least be able to test the 6.1.y branch with the commit reverted again
> >> and manually trigger the issue?
> >> 
> >> If needed I can provide a test Debian package of 6.1.135 (or 6.1.137)
> >> with the patch reverted. 
> >
> > So one additional data point as several Debian users were reporting
> > back being affected: One user did upgrade to 6.12.25 (where the
> > commit was backported as well) and is not able to reproduce the issue
> > there.
> 
> That would be me.
> 
> I can reproduce the issue as outlined by Moritz above fairly reliably in
> 6.1.135 (debian package 6.1.0-34-amd64). The reproducer is simple, on a
> RAID-10 host:
> 
>  1. reboot
>  2. systemctl start fstrim.service
> 
> We're tracking the issue internally in:
> 
> https://gitlab.torproject.org/tpo/tpa/team/-/issues/42146
> 
> I've managed to work around the issue by upgrading to the Debian package
> from testing/unstable (6.12.25), as Salvatore indicated above. There,
> fstrim doesn't cause any crash and completes successfully. In stable, it
> just hangs there forever. The kernel doesn't completely panic and the
> machine is otherwise somewhat still functional: my existing SSH
> connection keeps working, for example, but new ones fail. And an `apt
> install` of another kernel hangs forever.

So likely at least in 6.1.y there are missing pre-requisites causing
the behaviour.

If you can test 6.1.135-1 with the commit
4a05f7ae33716d996c5ce56478a36a3ede1d76f2 reverted then you can fetch
built packages at:

https://people.debian.org/~carnil/tmp/linux/1104460/

Regards,
Salvatore


* Re: Bug#1104460: [regression 6.1.y] discard/TRIM through RAID10 blocking (was: Re: Bug#1104460: linux-image-6.1.0-34-powerpc64le: Discard broken) with RAID10: BUG: kernel tried to execute user page (0) - exploit attempt?
  2025-05-05 20:36           ` Salvatore Bonaccorso
@ 2025-05-05 20:59             ` Antoine Beaupré
  2025-05-06  1:25               ` Bug#1104460: [regression 6.1.y] discard/TRIM through RAID10 blocking Yu Kuai
  0 siblings, 1 reply; 13+ messages in thread
From: Antoine Beaupré @ 2025-05-05 20:59 UTC (permalink / raw)
  To: Salvatore Bonaccorso, 1104460
  Cc: Moritz Mühlenhoff, Yu Kuai, Melvin Vermeeren,
	Greg Kroah-Hartman, Coly Li, Sasha Levin, stable, regressions

On 2025-05-05 22:36:07, Salvatore Bonaccorso wrote:
> Hi Antoine,
>
> On Mon, May 05, 2025 at 02:50:32PM -0400, Antoine Beaupré wrote:
>> On 2025-05-05 18:02:37, Salvatore Bonaccorso wrote:
>> > On Mon, May 05, 2025 at 04:00:31PM +0200, Salvatore Bonaccorso wrote:
>> >> Hi Moritz,
>> >> 
>> >> On Mon, May 05, 2025 at 01:47:15PM +0200, Moritz Mühlenhoff wrote:
>> >> > On Wed, Apr 30, 2025 at 05:55:20PM +0200, Salvatore Bonaccorso wrote:
>> >> > > Hi
>> >> > > 
>> >> > > We got a regression report in Debian after the update from 6.1.133 to
>> >> > > 6.1.135. Melvin is reporting that discard/trim through a RAID10 array
>> >> > > stalls indefinitely. The full report is inlined below and originates
>> >> > > from https://bugs.debian.org/1104460 .
>> >> > 
>> >> > JFTR, we ran into the same problem with a few Wikimedia servers running
>> >> > 6.1.135 and RAID 10: The servers started to lock up once fstrim.service
>> >> > got started. Full oops messages are available at
>> >> > https://phabricator.wikimedia.org/P75746
>> >> 
>> >> Thanks for these additional data points. Assuming you won't be able to
>> >> test the other stable series where the commit d05af90d6218
>> >> ("md/raid10: fix missing discard IO accounting") went in, might you at
>> >> least be able to test the 6.1.y branch with the commit reverted again
>> >> and manually trigger the issue?
>> >> 
>> >> If needed I can provide a test Debian package of 6.1.135 (or 6.1.137)
>> >> with the patch reverted. 
>> >
>> > So one additional data point as several Debian users were reporting
>> > back being affected: One user did upgrade to 6.12.25 (where the
>> > commit was backported as well) and is not able to reproduce the issue
>> > there.
>> 
>> That would be me.
>> 
>> I can reproduce the issue as outlined by Moritz above fairly reliably in
>> 6.1.135 (debian package 6.1.0-34-amd64). The reproducer is simple, on a
>> RAID-10 host:
>> 
>>  1. reboot
>>  2. systemctl start fstrim.service
>> 
>> We're tracking the issue internally in:
>> 
>> https://gitlab.torproject.org/tpo/tpa/team/-/issues/42146
>> 
>> I've managed to work around the issue by upgrading to the Debian package
>> from testing/unstable (6.12.25), as Salvatore indicated above. There,
>> fstrim doesn't cause any crash and completes successfully. In stable, it
>> just hangs there forever. The kernel doesn't completely panic and the
>> machine is otherwise somewhat still functional: my existing SSH
>> connection keeps working, for example, but new ones fail. And an `apt
>> install` of another kernel hangs forever.
>
> So likely at least in 6.1.y there are missing pre-requisites causing
> the behaviour.
>
> If you can test 6.1.135-1 with the commit
> 4a05f7ae33716d996c5ce56478a36a3ede1d76f2 reverted then you can fetch
> built packages at:
>
> https://people.debian.org/~carnil/tmp/linux/1104460/

I can confirm this kernel does not crash when running fstrim.service,
which seems to confirm the bisect.
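
For anyone sorting out which hosts need the test kernel, here is a minimal
sketch (not part of the original report) that lists RAID10 arrays from the
text of /proc/mdstat. The function name is made up for illustration, and the
"Personalities" line and the md0 entry in the sample are hypothetical; only
the md127 lines come from Melvin's report.

```python
import re

# Array lines look like "md127 : active raid10 dm-1[2] dm-0[0]".
# An optional "(auto-read-only)"-style flag may sit before the personality.
ARRAY_LINE = re.compile(r"^(md\S+)\s*:\s*active\s+(?:\([^)]*\)\s+)?(\S+)")


def raid10_arrays(mdstat_text):
    """Return the names of active md arrays whose personality is raid10."""
    names = []
    for line in mdstat_text.splitlines():
        match = ARRAY_LINE.match(line)
        if match and match.group(2) == "raid10":
            names.append(match.group(1))
    return names


# Sample input: md127 is taken from the report, the rest is illustrative.
sample = """\
Personalities : [raid1] [raid10]
md127 : active raid10 dm-1[2] dm-0[0]
      1872188416 blocks super 1.2 512K chunks 2 far-copies [2/2] [UU]
      bitmap: 1/1 pages [64KB], 65536KB chunk
md0 : active raid1 sda1[0] sdb1[1]
"""

print(raid10_arrays(sample))
```

On a real host the input would be the contents of /proc/mdstat. A RAID10
array being present only means the host is a candidate for the test; it does
not by itself show the regression.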

A.

-- 
Drowning people
Sometimes die
Fighting their rescuers.
                        - Octavia Butler

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [regression 6.1.y] discard/TRIM through RAID10 blocking
  2025-04-30 15:55 ` [regression 6.1.y] discard/TRIM through RAID10 blocking (was: Re: Bug#1104460: linux-image-6.1.0-34-powerpc64le: Discard broken) with RAID10: BUG: kernel tried to execute user page (0) - exploit attempt? Salvatore Bonaccorso
  2025-05-05 11:47   ` Moritz Mühlenhoff
@ 2025-05-06  1:11   ` Yu Kuai
  2025-05-06  1:19     ` Yu Kuai
  2025-05-06 15:16   ` [regression 6.1.y] discard/TRIM through RAID10 blocking (was: Re: Bug#1104460: linux-image-6.1.0-34-powerpc64le: Discard broken) with RAID10: BUG: kernel tried to execute user page (0) - exploit attempt? Melvin Vermeeren
  2 siblings, 1 reply; 13+ messages in thread
From: Yu Kuai @ 2025-05-06  1:11 UTC (permalink / raw)
  To: Salvatore Bonaccorso, Melvin Vermeeren, Greg Kroah-Hartman
  Cc: 1104460, Coly Li, Sasha Levin, stable, regressions, yukuai (C)

Hi,

On 2025/04/30 23:55, Salvatore Bonaccorso wrote:
> Hi
> 
> We got a regression report in Debian after the update from 6.1.133 to
> 6.1.135. Melvin is reporting that discard/trim through a RAID10 array
> stalls indefinitely. The full report is inlined below and originates
> from https://bugs.debian.org/1104460 .
> 
> On Wed, Apr 30, 2025 at 04:46:50PM +0200, Melvin Vermeeren wrote:
>> Package: src:linux
>> Version: 6.1.135-1
>> Severity: important
>> Tags: upstream
>> X-Debbugs-Cc: vermeeren@vermwa.re
>>
>> Dear Maintainer,
>>
>> Upgrading from linux-image-6.1.0-33-powerpc64le (6.1.133-1) to
>> linux-image-6.1.0-34-powerpc64le (6.1.135-1) it appears there is a
>> serious regression bug related to discard/TRIM through a RAID10 array.
>> This only affects RAID10, RAID1 array on the same SSD device is not
>> affected. Array in question is a fairly standard RAID10 in 2far layout.
>>
>> md127 : active raid10 dm-1[2] dm-0[0]
>>        1872188416 blocks super 1.2 512K chunks 2 far-copies [2/2] [UU]
>>        bitmap: 1/1 pages [64KB], 65536KB chunk
>>
>> Any discard operation will result in quite a long kernel error. The
>> calling process will either segfault (swapon) or, more likely, be stuck
>> forever (Qemu, fstrim) in the D state per htop. The iostat utility
>> reports a %util of 100% for any device on top of (directly or
>> indirectly) of the RAID10 device, despite there being no read or write
>> requests to the devices or any other activity.
>>
>> Stuck processes cannot be terminated or killed. Attempting to reboot
>> normally will result in a stuck machine on shutdown, so only a
>> REISUB-style reboot will work via procfs sysrq.
>>
>> I have briefly diffed and inspected commits between the two kernel
>> versions and I suspect the commit below may be at fault. Do keep in mind
>> I have not verified this in any way, so I may be wrong.
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=4a05f7ae33716d996c5ce56478a36a3ede1d76f2
>>

Thanks for the report. The commit relies on another commit,
820455238366 ("md/raid10: switch to use md_account_bio() for io
accounting"), and backporting it to v6.1 was wrong. I'll send a revert soon.

Thanks,
Kuai

>> Considering this is shipped as part of a stable security update I
>> consider it quite a serious bug. Affected hosts will not boot up
>> cleanly, may not have swap, processes will freeze upon discard and clean
>> reboot is also not possible.
>>
>> More logs available upon request.
>>
>> Many thanks,
>>
>> Melvin Vermeeren.
>>
>> -- Package-specific info:
>> ** Version:
>> Linux version 6.1.0-34-powerpc64le (debian-kernel@lists.debian.org) (gcc-12 (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP Debian 6.1.135-1 (2025-04-25)
>>
>> ** Command line:
>> root=/dev/mapper/...-root ro quiet
>>
>> ** Not tainted
>>
>> ** Kernel log:
>> # /etc/fstab entry
>> /dev/.../swap none swap sw,discard=once 0 0
>>
>> ~# swapon -va
>> swapon: /dev/mapper/...-swap: found signature [pagesize=65536, signature=swap]
>> swapon: /dev/mapper/...-swap: pagesize=65536, swapsize=17179869184, devsize=17179869184
>> swapon /dev/mapper/...-swap
>> Segmentation fault
>>
>> ~# dmesg
>> ...
>> [  223.017257] kernel tried to execute user page (0) - exploit attempt? (uid: 0)
>> [  223.017287] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
>> [  223.017301] Faulting instruction address: 0x00000000
>> [  223.017326] Oops: Kernel access of bad area, sig: 11 [#1]
>> [  223.017338] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
>> [  223.017365] Modules linked in: bridge stp llc binfmt_misc nft_connlimit nf_conncount ast drm_vram_helper drm_ttm_helper ofpart ipmi_powernv ttm ipmi_devintf powernv_flash at24 mtd ipmi_msghandler opal_prd regmap_i2c drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops i2c_algo_bit sg nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nf_tables nfnetlink drm loop fuse drm_panel_orientation_quirks configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 dm_crypt dm_integrity dm_bufio dm_mod macvlan raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sd_mod t10_pi crc64_rocksoft_generic crc64_rocksoft crc_t10dif crct10dif_generic crc64 crct10dif_common xhci_pci xts ecb xhci_hcd ctr vmx_crypto gf128mul crc32c_vpmsum tg3 mpt3sas usbcore raid_class libphy scsi_transport_sas usb_common
>> [  223.017812] CPU: 8 PID: 10609 Comm: swapon Not tainted 6.1.0-34-powerpc64le #1  Debian 6.1.135-1
>> [  223.017844] Hardware name: T2P9D01 REV 1.01 POWER9 0x4e1202 opal:skiboot-bc106a0 PowerNV
>> [  223.017879] NIP:  0000000000000000 LR: c0000000003efe70 CTR: 0000000000000000
>> [  223.017926] REGS: c0000000276cf200 TRAP: 0400   Not tainted  (6.1.0-34-powerpc64le Debian 6.1.135-1)
>> [  223.017979] MSR:  900000004280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 24004480  XER: 00000004
>> [  223.018060] CFAR: c0000000003efe6c IRQMASK: 0
>>                 GPR00: c0000000003efec4 c0000000276cf4a0 c000000001148100 0000000000092800
>>                 GPR04: 0000000000000000 0000000000000003 0000000000000c00 c00000000296e700
>>                 GPR08: c00000000c0e9700 00000c0000090800 0000000000000000 0000000000002000
>>                 GPR12: 0000000000000000 c000001ffffd9800 c0000000446b8c00 0000000000000000
>>                 GPR16: 0000000000000400 0000000000000000 0000000000000001 000000000000c812
>>                 GPR20: 000000000000c911 c0000000170c5700 c00000000296e718 c00000000296e3f0
>>                 GPR24: 0000000000000000 00000000000003ff 0000000000000000 0000000000000c00
>>                 GPR28: c000200009e2dd00 c00000000296e718 00000c0000092800 0000000000092c00
>> [  223.018372] NIP [0000000000000000] 0x0
>> [  223.018397] LR [c0000000003efe70] mempool_alloc+0xa0/0x210
>> [  223.018435] Call Trace:
>> [  223.018453] [c0000000276cf4a0] [c0000000003efec4] mempool_alloc+0xf4/0x210 (unreliable)
>> [  223.018507] [c0000000276cf520] [c000000000743bf8] bio_alloc_bioset+0x368/0x510
>> [  223.018552] [c0000000276cf5a0] [c000000000743e74] bio_alloc_clone+0x44/0xa0
>> [  223.018601] [c0000000276cf5e0] [c008000015793adc] md_account_bio+0x54/0xb0 [md_mod]
>> [  223.018655] [c0000000276cf610] [c00800001567778c] raid10_make_request+0xc54/0x1040 [raid10]
>> [  223.018687] [c0000000276cf770] [c00800001579a290] md_handle_request+0x198/0x380 [md_mod]
>> [  223.018735] [c0000000276cf800] [c00000000074c32c] __submit_bio+0x9c/0x250
>> [  223.018773] [c0000000276cf840] [c00000000074ca88] submit_bio_noacct_nocheck+0x178/0x3f0
>> [  223.018825] [c0000000276cf8b0] [c000000000743e08] blk_next_bio+0x68/0x90
>> [  223.018863] [c0000000276cf8e0] [c000000000758c60] __blkdev_issue_discard+0x180/0x280
>> [  223.018898] [c0000000276cf980] [c000000000758de8] blkdev_issue_discard+0x88/0x120
>> [  223.018927] [c0000000276cfa00] [c0000000004a9e8c] sys_swapon+0x11dc/0x18a0
>> [  223.018971] [c0000000276cfb50] [c00000000002b038] system_call_exception+0x138/0x260
>> [  223.019015] [c0000000276cfe10] [c00000000000c0f0] system_call_vectored_common+0xf0/0x280
>> [  223.019058] --- interrupt: 3000 at 0x7fff95146770
>> [  223.019095] NIP:  00007fff95146770 LR: 00007fff95146770 CTR: 0000000000000000
>> [  223.019132] REGS: c0000000276cfe80 TRAP: 3000   Not tainted  (6.1.0-34-powerpc64le Debian 6.1.135-1)
>> [  223.019182] MSR:  900000000280f033 <SF,HV,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 48002481  XER: 00000000
>> [  223.019267] IRQMASK: 0
>>                 GPR00: 0000000000000057 00007fffdca2ace0 00007fff95256f00 00000001220a1c20
>>                 GPR04: 0000000000030000 000000000000001e 000000000000000a 000000000000000a
>>                 GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>                 GPR12: 0000000000000000 00007fff955dcbc0 0000000000000000 0000000000000000
>>                 GPR16: 0000000000000000 00000001104066b0 00007fffdca2afc8 000000011040cbd0
>>                 GPR20: 000000011040cbd8 0000000000000000 0000000000010000 00007fffdca2aff0
>>                 GPR24: 00007fffdca2afd0 0000000000000003 0000000000030000 0000000400000000
>>                 GPR28: 00000001220a1c20 000000000000fff6 00000001220a30a0 0000000000100000
>> [  223.019542] NIP [00007fff95146770] 0x7fff95146770
>> [  223.019568] LR [00007fff95146770] 0x7fff95146770
>> [  223.019595] --- interrupt: 3000
>> [  223.019604] Instruction dump:
>> [  223.019626] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>> [  223.019665] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>> [  223.019712] ---[ end trace 0000000000000000 ]---
>>
>> [  224.623456] note: swapon[10609] exited with irqs disabled
>> [  224.623483] ------------[ cut here ]------------
>> [  224.623502] WARNING: CPU: 8 PID: 10609 at kernel/exit.c:816 do_exit+0x94/0xbc0
>> [  224.623516] Modules linked in: bridge stp llc binfmt_misc nft_connlimit nf_conncount ast drm_vram_helper drm_ttm_helper ofpart ipmi_powernv ttm ipmi_devintf powernv_flash at24 mtd ipmi_msghandler opal_prd regmap_i2c drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops i2c_algo_bit sg nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nf_tables nfnetlink drm loop fuse drm_panel_orientation_quirks configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 dm_crypt dm_integrity dm_bufio dm_mod macvlan raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sd_mod t10_pi crc64_rocksoft_generic crc64_rocksoft crc_t10dif crct10dif_generic crc64 crct10dif_common xhci_pci xts ecb xhci_hcd ctr vmx_crypto gf128mul crc32c_vpmsum tg3 mpt3sas usbcore raid_class libphy scsi_transport_sas usb_common
>> [  224.623825] CPU: 8 PID: 10609 Comm: swapon Tainted: G      D            6.1.0-34-powerpc64le #1  Debian 6.1.135-1
>> [  224.623860] Hardware name: T2P9D01 REV 1.01 POWER9 0x4e1202 opal:skiboot-bc106a0 PowerNV
>> [  224.623892] NIP:  c000000000140fa4 LR: c000000000140fa0 CTR: 0000000000000000
>> [  224.623935] REGS: c0000000276cecb0 TRAP: 0700   Tainted: G      D             (6.1.0-34-powerpc64le Debian 6.1.135-1)
>> [  224.623969] MSR:  9000000002029033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE>  CR: 24004222  XER: 00000004
>> [  224.624012] CFAR: c00000000013ea68 IRQMASK: 0
>>                 GPR00: c000000000140fa0 c0000000276cef50 c000000001148100 0000000000000000
>>                 GPR04: 0000000000000000 c0000000276cee20 c0000000276cee18 0000001ffb000000
>>                 GPR08: 0000000000000027 c0000000276cf9b0 0000000000000000 0000000000004000
>>                 GPR12: 0000000031c40000 c000001ffffd9800 c0000000446b8c00 0000000000000000
>>                 GPR16: 0000000000000400 0000000000000000 0000000000000001 000000000000c812
>>                 GPR20: 000000000000c911 c0000000170c5700 c00000000296e718 c00000000296e3f0
>>                 GPR24: 0000000000000000 00000000000003ff 0000000000000000 0000000000000c00
>>                 GPR28: 000000000000000b c00000001ce25d80 c000000078409c00 c000000026529d80
>> [  224.624208] NIP [c000000000140fa4] do_exit+0x94/0xbc0
>> [  224.624239] LR [c000000000140fa0] do_exit+0x90/0xbc0
>> [  224.624269] Call Trace:
>> [  224.624274] [c0000000276cef50] [c000000000140fa0] do_exit+0x90/0xbc0 (unreliable)
>> [  224.624308] [c0000000276cf020] [c000000000141b80] make_task_dead+0xb0/0x1f0
>> [  224.624320] [c0000000276cf0a0] [c000000000025718] oops_end+0x188/0x1c0
>> [  224.624341] [c0000000276cf120] [c00000000007f72c] __bad_page_fault+0x18c/0x1b0
>> [  224.624375] [c0000000276cf190] [c000000000008cd4] instruction_access_common_virt+0x194/0x1a0
>> [  224.624421] --- interrupt: 400 at 0x0
>> [  224.624438] NIP:  0000000000000000 LR: c0000000003efe70 CTR: 0000000000000000
>> [  224.624471] REGS: c0000000276cf200 TRAP: 0400   Tainted: G      D             (6.1.0-34-powerpc64le Debian 6.1.135-1)
>> [  224.624507] MSR:  900000004280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 24004480  XER: 00000004
>> [  224.624544] CFAR: c0000000003efe6c IRQMASK: 0
>>                 GPR00: c0000000003efec4 c0000000276cf4a0 c000000001148100 0000000000092800
>>                 GPR04: 0000000000000000 0000000000000003 0000000000000c00 c00000000296e700
>>                 GPR08: c00000000c0e9700 00000c0000090800 0000000000000000 0000000000002000
>>                 GPR12: 0000000000000000 c000001ffffd9800 c0000000446b8c00 0000000000000000
>>                 GPR16: 0000000000000400 0000000000000000 0000000000000001 000000000000c812
>>                 GPR20: 000000000000c911 c0000000170c5700 c00000000296e718 c00000000296e3f0
>>                 GPR24: 0000000000000000 00000000000003ff 0000000000000000 0000000000000c00
>>                 GPR28: c000200009e2dd00 c00000000296e718 00000c0000092800 0000000000092c00
>> [  224.624732] NIP [0000000000000000] 0x0
>> [  224.624749] LR [c0000000003efe70] mempool_alloc+0xa0/0x210
>> [  224.624771] --- interrupt: 400
>> [  224.624789] [c0000000276cf4a0] [c0000000003efec4] mempool_alloc+0xf4/0x210 (unreliable)
>> [  224.624823] [c0000000276cf520] [c000000000743bf8] bio_alloc_bioset+0x368/0x510
>> [  224.624859] [c0000000276cf5a0] [c000000000743e74] bio_alloc_clone+0x44/0xa0
>> [  224.624892] [c0000000276cf5e0] [c008000015793adc] md_account_bio+0x54/0xb0 [md_mod]
>> [  224.624930] [c0000000276cf610] [c00800001567778c] raid10_make_request+0xc54/0x1040 [raid10]
>> [  224.624964] [c0000000276cf770] [c00800001579a290] md_handle_request+0x198/0x380 [md_mod]
>> [  224.624997] [c0000000276cf800] [c00000000074c32c] __submit_bio+0x9c/0x250
>> [  224.625018] [c0000000276cf840] [c00000000074ca88] submit_bio_noacct_nocheck+0x178/0x3f0
>> [  224.625043] [c0000000276cf8b0] [c000000000743e08] blk_next_bio+0x68/0x90
>> [  224.625066] [c0000000276cf8e0] [c000000000758c60] __blkdev_issue_discard+0x180/0x280
>> [  224.625091] [c0000000276cf980] [c000000000758de8] blkdev_issue_discard+0x88/0x120
>> [  224.625115] [c0000000276cfa00] [c0000000004a9e8c] sys_swapon+0x11dc/0x18a0
>> [  224.625139] [c0000000276cfb50] [c00000000002b038] system_call_exception+0x138/0x260
>> [  224.625164] [c0000000276cfe10] [c00000000000c0f0] system_call_vectored_common+0xf0/0x280
>> [  224.625201] --- interrupt: 3000 at 0x7fff95146770
>> [  224.625270] NIP:  00007fff95146770 LR: 00007fff95146770 CTR: 0000000000000000
>> [  224.625367] REGS: c0000000276cfe80 TRAP: 3000   Tainted: G      D             (6.1.0-34-powerpc64le Debian 6.1.135-1)
>> [  224.625458] MSR:  900000000000f033 <SF,HV,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 48002481  XER: 00000000
>> [  224.625570] IRQMASK: 0
>>                 GPR00: 0000000000000057 00007fffdca2ace0 00007fff95256f00 00000001220a1c20
>>                 GPR04: 0000000000030000 000000000000001e 000000000000000a 000000000000000a
>>                 GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>                 GPR12: 0000000000000000 00007fff955dcbc0 0000000000000000 0000000000000000
>>                 GPR16: 0000000000000000 00000001104066b0 00007fffdca2afc8 000000011040cbd0
>>                 GPR20: 000000011040cbd8 0000000000000000 0000000000010000 00007fffdca2aff0
>>                 GPR24: 00007fffdca2afd0 0000000000000003 0000000000030000 0000000400000000
>>                 GPR28: 00000001220a1c20 000000000000fff6 00000001220a30a0 0000000000100000
>> [  224.626325] NIP [00007fff95146770] 0x7fff95146770
>> [  224.626388] LR [00007fff95146770] 0x7fff95146770
>> [  224.626522] --- interrupt: 3000
>> [  224.626568] Instruction dump:
>> [  224.626587] 60000000 813f000c 3929ffff 2c090000 913f000c 40820010 813f0074 71290004
>> [  224.626680] 4182074c 7fa3eb78 4bffda7d e93e0b10 <0b090000> e87e0a48 48c7dd0d 60000000
>> [  224.626786] ---[ end trace 0000000000000000 ]---
> 
> Does this ring a bell?
> 
> Melvin, the same change went as well in other stable series, 6.6.88,
> 6.12.25, 6.14.4, can you test e.g. 6.12.25-1 in Debian as well from
> unstable to see if the regression is there as well?
> 
> Might you be able to bisect the upstream stable series between 6.1.133
> to 6.1.135 to really confirm the mentioned commit is the one breaking?
> 
> Regards,
> Salvatore
> 
> .
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [regression 6.1.y] discard/TRIM through RAID10 blocking
  2025-05-06  1:11   ` Yu Kuai
@ 2025-05-06  1:19     ` Yu Kuai
  0 siblings, 0 replies; 13+ messages in thread
From: Yu Kuai @ 2025-05-06  1:19 UTC (permalink / raw)
  To: Yu Kuai, Salvatore Bonaccorso, Melvin Vermeeren,
	Greg Kroah-Hartman
  Cc: 1104460, Coly Li, Sasha Levin, stable, regressions, yukuai (C)

Hi,

On 2025/05/06 9:11, Yu Kuai wrote:
> Hi,
> 
> On 2025/04/30 23:55, Salvatore Bonaccorso wrote:
>> Hi
>>
>> We got a regression report in Debian after the update from 6.1.133 to
>> 6.1.135. Melvin is reporting that discard/trim through a RAID10 array
>> stalls indefinitely. The full report is inlined below and originates
>> from https://bugs.debian.org/1104460 .
>>
>> On Wed, Apr 30, 2025 at 04:46:50PM +0200, Melvin Vermeeren wrote:
>>> Package: src:linux
>>> Version: 6.1.135-1
>>> Severity: important
>>> Tags: upstream
>>> X-Debbugs-Cc: vermeeren@vermwa.re
>>>
>>> Dear Maintainer,
>>>
>>> Upgrading from linux-image-6.1.0-33-powerpc64le (6.1.133-1) to
>>> linux-image-6.1.0-34-powerpc64le (6.1.135-1) it appears there is a
>>> serious regression bug related to discard/TRIM through a RAID10 array.
>>> This only affects RAID10, RAID1 array on the same SSD device is not
>>> affected. Array in question is a fairly standard RAID10 in 2far layout.
>>>
>>> md127 : active raid10 dm-1[2] dm-0[0]
>>>        1872188416 blocks super 1.2 512K chunks 2 far-copies [2/2] [UU]
>>>        bitmap: 1/1 pages [64KB], 65536KB chunk
>>>
>>> Any discard operation will result in quite a long kernel error. The
>>> calling process will either segfault (swapon) or, more likely, be stuck
>>> forever (Qemu, fstrim) in the D state per htop. The iostat utility
>>> reports a %util of 100% for any device on top of (directly or
>>> indirectly) of the RAID10 device, despite there being no read or write
>>> requests to the devices or any other activity.
>>>
>>> Stuck processes cannot be terminated or killed. Attempting to reboot
>>> normally will result in a stuck machine on shutdown, so only a
>>> REISUB-style reboot will work via procfs sysrq.
>>>
>>> I have briefly diffed and inspected commits between the two kernel
>>> versions and I suspect the commit below may be at fault. Do keep in mind
>>> I have not verified this in any way, so I may be wrong.
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=4a05f7ae33716d996c5ce56478a36a3ede1d76f2 
>>>
>>>
> 
> Thanks for the report. The commit relies on another commit,
> 820455238366 ("md/raid10: switch to use md_account_bio() for io
> accounting"), and backporting it to v6.1 was wrong. I'll send a revert soon.

Looking at the stack in the report, it seems the patch it relies on is actually
https://lore.kernel.org/all/20230621165110.1498313-2-yukuai1@huaweicloud.com/
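
To make it easier to compare other reports against this stack, here is a
rough sketch that pulls the function names out of a "Call Trace:" section and
checks for the md_account_bio() call made from raid10_make_request(). It is
only an illustration: the helper names are invented, and the regular
expression is written against the powerpc trace format quoted in this thread,
so traces from other architectures may need adjusting.

```python
import re

# Matches "symbol+0xoff/0xsize", optionally followed by " [module]".
FRAME = re.compile(
    r"\]\s+([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def trace_frames(dmesg_text):
    """Return (function, module) pairs from the first call trace found."""
    frames = []
    in_trace = False
    for line in dmesg_text.splitlines():
        if not in_trace:
            in_trace = "Call Trace:" in line
            continue
        match = FRAME.search(line)
        if not match:
            break  # first non-frame line ends the trace
        frames.append((match.group(1), match.group(2)))
    return frames


def matches_reported_stack(frames):
    """True if md_account_bio appears above raid10_make_request."""
    names = [name for name, _module in frames]
    if "md_account_bio" not in names or "raid10_make_request" not in names:
        return False
    return names.index("md_account_bio") < names.index("raid10_make_request")


# Excerpt of the trace from the original report.
sample = """\
[  223.018435] Call Trace:
[  223.018453] [c0000000276cf4a0] [c0000000003efec4] mempool_alloc+0xf4/0x210 (unreliable)
[  223.018507] [c0000000276cf520] [c000000000743bf8] bio_alloc_bioset+0x368/0x510
[  223.018552] [c0000000276cf5a0] [c000000000743e74] bio_alloc_clone+0x44/0xa0
[  223.018601] [c0000000276cf5e0] [c008000015793adc] md_account_bio+0x54/0xb0 [md_mod]
[  223.018655] [c0000000276cf610] [c00800001567778c] raid10_make_request+0xc54/0x1040 [raid10]
[  223.019058] --- interrupt: 3000 at 0x7fff95146770
"""

frames = trace_frames(sample)
print(frames)
print(matches_reported_stack(frames))
```

A match only says the oops went through the same code path as the one
reported here; it is not a substitute for testing with the commit reverted.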

Thanks,
Kuai

> 
> Thanks,
> Kuai
> 
>>> Considering this is shipped as part of a stable security update I
>>> consider it quite a serious bug. Affected hosts will not boot up
>>> cleanly, may not have swap, processes will freeze upon discard and clean
>>> reboot is also not possible.
>>>
>>> More logs available upon request.
>>>
>>> Many thanks,
>>>
>>> Melvin Vermeeren.
>>>
>>> -- Package-specific info:
>>> ** Version:
>>> Linux version 6.1.0-34-powerpc64le (debian-kernel@lists.debian.org) 
>>> (gcc-12 (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 
>>> 2.40) #1 SMP Debian 6.1.135-1 (2025-04-25)
>>>
>>> ** Command line:
>>> root=/dev/mapper/...-root ro quiet
>>>
>>> ** Not tainted
>>>
>>> ** Kernel log:
>>> # /etc/fstab entry
>>> /dev/.../swap none swap sw,discard=once 0 0
>>>
>>> ~# swapon -va
>>> swapon: /dev/mapper/...-swap: found signature [pagesize=65536, 
>>> signature=swap]
>>> swapon: /dev/mapper/...-swap: pagesize=65536, swapsize=17179869184, 
>>> devsize=17179869184
>>> swapon /dev/mapper/...-swap
>>> Segmentation fault
>>>
>>> ~# dmesg
>>> ...
>>> [  223.017257] kernel tried to execute user page (0) - exploit 
>>> attempt? (uid: 0)
>>> [  223.017287] BUG: Unable to handle kernel instruction fetch (NULL 
>>> pointer?)
>>> [  223.017301] Faulting instruction address: 0x00000000
>>> [  223.017326] Oops: Kernel access of bad area, sig: 11 [#1]
>>> [  223.017338] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
>>> [  223.017365] Modules linked in: bridge stp llc binfmt_misc 
>>> nft_connlimit nf_conncount ast drm_vram_helper drm_ttm_helper ofpart 
>>> ipmi_powernv ttm ipmi_devintf powernv_flash at24 mtd ipmi_msghandler 
>>> opal_prd regmap_i2c drm_kms_helper syscopyarea sysfillrect sysimgblt 
>>> fb_sys_fops i2c_algo_bit sg nft_reject_inet nf_reject_ipv4 
>>> nf_reject_ipv6 nft_reject nft_ct nf_conntrack nf_defrag_ipv6 
>>> nf_defrag_ipv4 nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib 
>>> nf_tables nfnetlink drm loop fuse drm_panel_orientation_quirks 
>>> configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 dm_crypt 
>>> dm_integrity dm_bufio dm_mod macvlan raid10 raid456 async_raid6_recov 
>>> async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c 
>>> crc32c_generic raid1 raid0 multipath linear md_mod sd_mod t10_pi 
>>> crc64_rocksoft_generic crc64_rocksoft crc_t10dif crct10dif_generic 
>>> crc64 crct10dif_common xhci_pci xts ecb xhci_hcd ctr vmx_crypto 
>>> gf128mul crc32c_vpmsum tg3 mpt3sas usbcore raid_class libphy 
>>> scsi_transport_sas usb_common
>>> [  223.017812] CPU: 8 PID: 10609 Comm: swapon Not tainted 
>>> 6.1.0-34-powerpc64le #1  Debian 6.1.135-1
>>> [  223.017844] Hardware name: T2P9D01 REV 1.01 POWER9 0x4e1202 
>>> opal:skiboot-bc106a0 PowerNV
>>> [  223.017879] NIP:  0000000000000000 LR: c0000000003efe70 CTR: 
>>> 0000000000000000
>>> [  223.017926] REGS: c0000000276cf200 TRAP: 0400   Not tainted  
>>> (6.1.0-34-powerpc64le Debian 6.1.135-1)
>>> [  223.017979] MSR:  900000004280b033 
>>> <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 24004480  XER: 00000004
>>> [  223.018060] CFAR: c0000000003efe6c IRQMASK: 0
>>>                 GPR00: c0000000003efec4 c0000000276cf4a0 
>>> c000000001148100 0000000000092800
>>>                 GPR04: 0000000000000000 0000000000000003 
>>> 0000000000000c00 c00000000296e700
>>>                 GPR08: c00000000c0e9700 00000c0000090800 
>>> 0000000000000000 0000000000002000
>>>                 GPR12: 0000000000000000 c000001ffffd9800 
>>> c0000000446b8c00 0000000000000000
>>>                 GPR16: 0000000000000400 0000000000000000 
>>> 0000000000000001 000000000000c812
>>>                 GPR20: 000000000000c911 c0000000170c5700 
>>> c00000000296e718 c00000000296e3f0
>>>                 GPR24: 0000000000000000 00000000000003ff 
>>> 0000000000000000 0000000000000c00
>>>                 GPR28: c000200009e2dd00 c00000000296e718 
>>> 00000c0000092800 0000000000092c00
>>> [  223.018372] NIP [0000000000000000] 0x0
>>> [  223.018397] LR [c0000000003efe70] mempool_alloc+0xa0/0x210
>>> [  223.018435] Call Trace:
>>> [  223.018453] [c0000000276cf4a0] [c0000000003efec4] 
>>> mempool_alloc+0xf4/0x210 (unreliable)
>>> [  223.018507] [c0000000276cf520] [c000000000743bf8] 
>>> bio_alloc_bioset+0x368/0x510
>>> [  223.018552] [c0000000276cf5a0] [c000000000743e74] 
>>> bio_alloc_clone+0x44/0xa0
>>> [  223.018601] [c0000000276cf5e0] [c008000015793adc] 
>>> md_account_bio+0x54/0xb0 [md_mod]
>>> [  223.018655] [c0000000276cf610] [c00800001567778c] 
>>> raid10_make_request+0xc54/0x1040 [raid10]
>>> [  223.018687] [c0000000276cf770] [c00800001579a290] 
>>> md_handle_request+0x198/0x380 [md_mod]
>>> [  223.018735] [c0000000276cf800] [c00000000074c32c] 
>>> __submit_bio+0x9c/0x250
>>> [  223.018773] [c0000000276cf840] [c00000000074ca88] 
>>> submit_bio_noacct_nocheck+0x178/0x3f0
>>> [  223.018825] [c0000000276cf8b0] [c000000000743e08] 
>>> blk_next_bio+0x68/0x90
>>> [  223.018863] [c0000000276cf8e0] [c000000000758c60] 
>>> __blkdev_issue_discard+0x180/0x280
>>> [  223.018898] [c0000000276cf980] [c000000000758de8] 
>>> blkdev_issue_discard+0x88/0x120
>>> [  223.018927] [c0000000276cfa00] [c0000000004a9e8c] 
>>> sys_swapon+0x11dc/0x18a0
>>> [  223.018971] [c0000000276cfb50] [c00000000002b038] 
>>> system_call_exception+0x138/0x260
>>> [  223.019015] [c0000000276cfe10] [c00000000000c0f0] 
>>> system_call_vectored_common+0xf0/0x280
>>> [  223.019058] --- interrupt: 3000 at 0x7fff95146770
>>> [  223.019095] NIP:  00007fff95146770 LR: 00007fff95146770 CTR: 
>>> 0000000000000000
>>> [  223.019132] REGS: c0000000276cfe80 TRAP: 3000   Not tainted  
>>> (6.1.0-34-powerpc64le Debian 6.1.135-1)
>>> [  223.019182] MSR:  900000000280f033 
>>> <SF,HV,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 48002481  XER: 00000000
>>> [  223.019267] IRQMASK: 0
>>>                 GPR00: 0000000000000057 00007fffdca2ace0 
>>> 00007fff95256f00 00000001220a1c20
>>>                 GPR04: 0000000000030000 000000000000001e 
>>> 000000000000000a 000000000000000a
>>>                 GPR08: 0000000000000000 0000000000000000 
>>> 0000000000000000 0000000000000000
>>>                 GPR12: 0000000000000000 00007fff955dcbc0 
>>> 0000000000000000 0000000000000000
>>>                 GPR16: 0000000000000000 00000001104066b0 
>>> 00007fffdca2afc8 000000011040cbd0
>>>                 GPR20: 000000011040cbd8 0000000000000000 
>>> 0000000000010000 00007fffdca2aff0
>>>                 GPR24: 00007fffdca2afd0 0000000000000003 
>>> 0000000000030000 0000000400000000
>>>                 GPR28: 00000001220a1c20 000000000000fff6 
>>> 00000001220a30a0 0000000000100000
>>> [  223.019542] NIP [00007fff95146770] 0x7fff95146770
>>> [  223.019568] LR [00007fff95146770] 0x7fff95146770
>>> [  223.019595] --- interrupt: 3000
>>> [  223.019604] Instruction dump:
>>> [  223.019626] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX 
>>> XXXXXXXX XXXXXXXX
>>> [  223.019665] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX 
>>> XXXXXXXX XXXXXXXX
>>> [  223.019712] ---[ end trace 0000000000000000 ]---
>>>
>>> [  224.623456] note: swapon[10609] exited with irqs disabled
>>> [  224.623483] ------------[ cut here ]------------
>>> [  224.623502] WARNING: CPU: 8 PID: 10609 at kernel/exit.c:816 
>>> do_exit+0x94/0xbc0
>>> [  224.623516] Modules linked in: bridge stp llc binfmt_misc 
>>> nft_connlimit nf_conncount ast drm_vram_helper drm_ttm_helper ofpart 
>>> ipmi_powernv ttm ipmi_devintf powernv_flash at24 mtd ipmi_msghandler 
>>> opal_prd regmap_i2c drm_kms_helper syscopyarea sysfillrect sysimgblt 
>>> fb_sys_fops i2c_algo_bit sg nft_reject_inet nf_reject_ipv4 
>>> nf_reject_ipv6 nft_reject nft_ct nf_conntrack nf_defrag_ipv6 
>>> nf_defrag_ipv4 nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib 
>>> nf_tables nfnetlink drm loop fuse drm_panel_orientation_quirks 
>>> configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 dm_crypt 
>>> dm_integrity dm_bufio dm_mod macvlan raid10 raid456 async_raid6_recov 
>>> async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c 
>>> crc32c_generic raid1 raid0 multipath linear md_mod sd_mod t10_pi 
>>> crc64_rocksoft_generic crc64_rocksoft crc_t10dif crct10dif_generic 
>>> crc64 crct10dif_common xhci_pci xts ecb xhci_hcd ctr vmx_crypto 
>>> gf128mul crc32c_vpmsum tg3 mpt3sas usbcore raid_class libphy 
>>> scsi_transport_sas usb_common
>>> [  224.623825] CPU: 8 PID: 10609 Comm: swapon Tainted: G      
>>> D            6.1.0-34-powerpc64le #1  Debian 6.1.135-1
>>> [  224.623860] Hardware name: T2P9D01 REV 1.01 POWER9 0x4e1202 
>>> opal:skiboot-bc106a0 PowerNV
>>> [  224.623892] NIP:  c000000000140fa4 LR: c000000000140fa0 CTR: 
>>> 0000000000000000
>>> [  224.623935] REGS: c0000000276cecb0 TRAP: 0700   Tainted: G      
>>> D             (6.1.0-34-powerpc64le Debian 6.1.135-1)
>>> [  224.623969] MSR:  9000000002029033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE>  
>>> CR: 24004222  XER: 00000004
>>> [  224.624012] CFAR: c00000000013ea68 IRQMASK: 0
>>>                 GPR00: c000000000140fa0 c0000000276cef50 
>>> c000000001148100 0000000000000000
>>>                 GPR04: 0000000000000000 c0000000276cee20 
>>> c0000000276cee18 0000001ffb000000
>>>                 GPR08: 0000000000000027 c0000000276cf9b0 
>>> 0000000000000000 0000000000004000
>>>                 GPR12: 0000000031c40000 c000001ffffd9800 
>>> c0000000446b8c00 0000000000000000
>>>                 GPR16: 0000000000000400 0000000000000000 
>>> 0000000000000001 000000000000c812
>>>                 GPR20: 000000000000c911 c0000000170c5700 
>>> c00000000296e718 c00000000296e3f0
>>>                 GPR24: 0000000000000000 00000000000003ff 
>>> 0000000000000000 0000000000000c00
>>>                 GPR28: 000000000000000b c00000001ce25d80 
>>> c000000078409c00 c000000026529d80
>>> [  224.624208] NIP [c000000000140fa4] do_exit+0x94/0xbc0
>>> [  224.624239] LR [c000000000140fa0] do_exit+0x90/0xbc0
>>> [  224.624269] Call Trace:
>>> [  224.624274] [c0000000276cef50] [c000000000140fa0] 
>>> do_exit+0x90/0xbc0 (unreliable)
>>> [  224.624308] [c0000000276cf020] [c000000000141b80] 
>>> make_task_dead+0xb0/0x1f0
>>> [  224.624320] [c0000000276cf0a0] [c000000000025718] 
>>> oops_end+0x188/0x1c0
>>> [  224.624341] [c0000000276cf120] [c00000000007f72c] 
>>> __bad_page_fault+0x18c/0x1b0
>>> [  224.624375] [c0000000276cf190] [c000000000008cd4] 
>>> instruction_access_common_virt+0x194/0x1a0
>>> [  224.624421] --- interrupt: 400 at 0x0
>>> [  224.624438] NIP:  0000000000000000 LR: c0000000003efe70 CTR: 
>>> 0000000000000000
>>> [  224.624471] REGS: c0000000276cf200 TRAP: 0400   Tainted: G      
>>> D             (6.1.0-34-powerpc64le Debian 6.1.135-1)
>>> [  224.624507] MSR:  900000004280b033 
>>> <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 24004480  XER: 00000004
>>> [  224.624544] CFAR: c0000000003efe6c IRQMASK: 0
>>>                 GPR00: c0000000003efec4 c0000000276cf4a0 
>>> c000000001148100 0000000000092800
>>>                 GPR04: 0000000000000000 0000000000000003 
>>> 0000000000000c00 c00000000296e700
>>>                 GPR08: c00000000c0e9700 00000c0000090800 
>>> 0000000000000000 0000000000002000
>>>                 GPR12: 0000000000000000 c000001ffffd9800 
>>> c0000000446b8c00 0000000000000000
>>>                 GPR16: 0000000000000400 0000000000000000 
>>> 0000000000000001 000000000000c812
>>>                 GPR20: 000000000000c911 c0000000170c5700 
>>> c00000000296e718 c00000000296e3f0
>>>                 GPR24: 0000000000000000 00000000000003ff 
>>> 0000000000000000 0000000000000c00
>>>                 GPR28: c000200009e2dd00 c00000000296e718 
>>> 00000c0000092800 0000000000092c00
>>> [  224.624732] NIP [0000000000000000] 0x0
>>> [  224.624749] LR [c0000000003efe70] mempool_alloc+0xa0/0x210
>>> [  224.624771] --- interrupt: 400
>>> [  224.624789] [c0000000276cf4a0] [c0000000003efec4] 
>>> mempool_alloc+0xf4/0x210 (unreliable)
>>> [  224.624823] [c0000000276cf520] [c000000000743bf8] 
>>> bio_alloc_bioset+0x368/0x510
>>> [  224.624859] [c0000000276cf5a0] [c000000000743e74] 
>>> bio_alloc_clone+0x44/0xa0
>>> [  224.624892] [c0000000276cf5e0] [c008000015793adc] 
>>> md_account_bio+0x54/0xb0 [md_mod]
>>> [  224.624930] [c0000000276cf610] [c00800001567778c] 
>>> raid10_make_request+0xc54/0x1040 [raid10]
>>> [  224.624964] [c0000000276cf770] [c00800001579a290] 
>>> md_handle_request+0x198/0x380 [md_mod]
>>> [  224.624997] [c0000000276cf800] [c00000000074c32c] 
>>> __submit_bio+0x9c/0x250
>>> [  224.625018] [c0000000276cf840] [c00000000074ca88] 
>>> submit_bio_noacct_nocheck+0x178/0x3f0
>>> [  224.625043] [c0000000276cf8b0] [c000000000743e08] 
>>> blk_next_bio+0x68/0x90
>>> [  224.625066] [c0000000276cf8e0] [c000000000758c60] 
>>> __blkdev_issue_discard+0x180/0x280
>>> [  224.625091] [c0000000276cf980] [c000000000758de8] 
>>> blkdev_issue_discard+0x88/0x120
>>> [  224.625115] [c0000000276cfa00] [c0000000004a9e8c] 
>>> sys_swapon+0x11dc/0x18a0
>>> [  224.625139] [c0000000276cfb50] [c00000000002b038] 
>>> system_call_exception+0x138/0x260
>>> [  224.625164] [c0000000276cfe10] [c00000000000c0f0] 
>>> system_call_vectored_common+0xf0/0x280
>>> [  224.625201] --- interrupt: 3000 at 0x7fff95146770
>>> [  224.625270] NIP:  00007fff95146770 LR: 00007fff95146770 CTR: 
>>> 0000000000000000
>>> [  224.625367] REGS: c0000000276cfe80 TRAP: 3000   Tainted: G      
>>> D             (6.1.0-34-powerpc64le Debian 6.1.135-1)
>>> [  224.625458] MSR:  900000000000f033 
>>> <SF,HV,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 48002481  XER: 00000000
>>> [  224.625570] IRQMASK: 0
>>>                 GPR00: 0000000000000057 00007fffdca2ace0 
>>> 00007fff95256f00 00000001220a1c20
>>>                 GPR04: 0000000000030000 000000000000001e 
>>> 000000000000000a 000000000000000a
>>>                 GPR08: 0000000000000000 0000000000000000 
>>> 0000000000000000 0000000000000000
>>>                 GPR12: 0000000000000000 00007fff955dcbc0 
>>> 0000000000000000 0000000000000000
>>>                 GPR16: 0000000000000000 00000001104066b0 
>>> 00007fffdca2afc8 000000011040cbd0
>>>                 GPR20: 000000011040cbd8 0000000000000000 
>>> 0000000000010000 00007fffdca2aff0
>>>                 GPR24: 00007fffdca2afd0 0000000000000003 
>>> 0000000000030000 0000000400000000
>>>                 GPR28: 00000001220a1c20 000000000000fff6 
>>> 00000001220a30a0 0000000000100000
>>> [  224.626325] NIP [00007fff95146770] 0x7fff95146770
>>> [  224.626388] LR [00007fff95146770] 0x7fff95146770
>>> [  224.626522] --- interrupt: 3000
>>> [  224.626568] Instruction dump:
>>> [  224.626587] 60000000 813f000c 3929ffff 2c090000 913f000c 40820010 
>>> 813f0074 71290004
>>> [  224.626680] 4182074c 7fa3eb78 4bffda7d e93e0b10 <0b090000> 
>>> e87e0a48 48c7dd0d 60000000
>>> [  224.626786] ---[ end trace 0000000000000000 ]---
>>
>> Does this ring a bell?
>>
>> Melvin, the same change went as well in other stable series, 6.6.88,
>> 6.12.25, 6.14.4, can you test e.g. 6.12.25-1 in Debian as well from
>> unstable to see if the regression is there as well?
>>
>> Might you be able to bisect the upstream stable series between 6.1.133
>> to 6.1.135 to really confirm the mentioned commit is the one breaking?
>>
>> Regards,
>> Salvatore
>>
>> .
>>



* Re: Bug#1104460: [regression 6.1.y] discard/TRIM through RAID10 blocking
  2025-05-05 20:59             ` Antoine Beaupré
@ 2025-05-06  1:25               ` Yu Kuai
  2025-05-06  6:00                 ` Salvatore Bonaccorso
  0 siblings, 1 reply; 13+ messages in thread
From: Yu Kuai @ 2025-05-06  1:25 UTC (permalink / raw)
  To: Antoine Beaupré, Salvatore Bonaccorso, 1104460
  Cc: Moritz Mühlenhoff, Melvin Vermeeren, Greg Kroah-Hartman,
	Coly Li, Sasha Levin, stable, regressions, yukuai (C)

Hi,

On 2025/05/06 4:59, Antoine Beaupré wrote:
> On 2025-05-05 22:36:07, Salvatore Bonaccorso wrote:
>> Hi Antoine,
>>
>> On Mon, May 05, 2025 at 02:50:32PM -0400, Antoine Beaupré wrote:
>>> On 2025-05-05 18:02:37, Salvatore Bonaccorso wrote:
>>>> On Mon, May 05, 2025 at 04:00:31PM +0200, Salvatore Bonaccorso wrote:
>>>>> Hi Moritz,
>>>>>
>>>>> On Mon, May 05, 2025 at 01:47:15PM +0200, Moritz Mühlenhoff wrote:
>>>>>> Am Wed, Apr 30, 2025 at 05:55:20PM +0200 schrieb Salvatore Bonaccorso:
>>>>>>> Hi
>>>>>>>
>>>>>>> We got a regression report in Debian after the update from 6.1.133 to
>>>>>>> 6.1.135. Melvin is reporting that discard/trim through a RAID10 array
>>>>>>> stalls indefinitely. The full report is inlined below and originates
>>>>>>> from https://bugs.debian.org/1104460 .
>>>>>>
>>>>>> JFTR, we ran into the same problem with a few Wikimedia servers running
>>>>>> 6.1.135 and RAID 10: The servers started to lock up once fstrim.service
>>>>>> got started. Full oops messages are available at
>>>>>> https://phabricator.wikimedia.org/P75746
>>>>>
>>>>> Thanks for these additional data points. Assuming you won't be able to
>>>>> test the other stable series where the commit d05af90d6218
>>>>> ("md/raid10: fix missing discard IO accounting") went in, might you at
>>>>> least be able to test the 6.1.y branch with the commit reverted again
>>>>> and manually trigger the issue?
>>>>>
>>>>> If needed I can provide a test Debian package of 6.1.135 (or 6.1.137)
>>>>> with the patch reverted.
>>>>
>>>> So one additional data point as several Debian users were reporting
>>>> back being affected: One user did upgrade to 6.12.25 (where the
>>>> commit was backported as well) and is not able to reproduce the issue
>>>> there.
>>>
>>> That would be me.
>>>
>>> I can reproduce the issue as outlined by Moritz above fairly reliably in
>>> 6.1.135 (debian package 6.1.0-34-amd64). The reproducer is simple, on a
>>> RAID-10 host:
>>>
>>>   1. reboot
>>>   2. systemctl start fstrim.service
>>>
>>> We're tracking the issue internally in:
>>>
>>> https://gitlab.torproject.org/tpo/tpa/team/-/issues/42146
>>>
>>> I've managed to work around the issue by upgrading to the Debian package
>>> from testing/unstable (6.12.25), as Salvatore indicated above. There,
>>> fstrim doesn't cause any crash and completes successfully. In stable, it
>>> just hangs there forever. The kernel doesn't completely panic and the
>>> machine is otherwise somewhat still functional: my existing SSH
>>> connection keeps working, for example, but new ones fail. And an `apt
>>> install` of another kernel hangs forever.
>>
>> So likely at least in 6.1.y there are missing pre-requisites causing
>> the behaviour.
>>
>> If you can test 6.1.135-1 with the commit
>> 4a05f7ae33716d996c5ce56478a36a3ede1d76f2 reverted then you can fetch
>> built packages at:
>>
>> https://people.debian.org/~carnil/tmp/linux/1104460/

Can you also test with 4a05f7ae33716d996c5ce56478a36a3ede1d76f2 not
reverted, and also cherry-pick c567c86b90d4715081adfe5eb812141a5b6b4883?
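
For anyone building this combination themselves, the request above could be sketched roughly as follows. This is only a sketch under assumptions: it presumes a local clone of the linux-stable tree that has a `v6.1.135` tag, and the branch name, the `run` helper and the dry-run default are made up for illustration. Only the two commit IDs come from this thread, and whether the cherry-pick applies cleanly on 6.1.y is not something the thread states.

```shell
#!/bin/sh
# Sketch only: tag name, branch name and dry-run default are assumptions.
# The two commit IDs are the ones quoted in this thread.

BASE_TAG="v6.1.135"                                # assumed tag in the local clone
KEPT="4a05f7ae33716d996c5ce56478a36a3ede1d76f2"    # must stay applied (not reverted)
EXTRA="c567c86b90d4715081adfe5eb812141a5b6b4883"   # commit to cherry-pick on top

# Print the command when DRY_RUN is 1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create a test branch, check that KEPT is part of its history, add EXTRA.
prepare_test_tree() {
    run git checkout -b raid10-discard-test "$BASE_TAG" &&
    run git merge-base --is-ancestor "$KEPT" HEAD &&
    run git cherry-pick -x "$EXTRA"
}
```

Calling `prepare_test_tree` as-is only prints the three git commands; `DRY_RUN=0 prepare_test_tree` inside the clone would actually run them, stopping at the first step that fails.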

Thanks,
Kuai

> 
> I can confirm this kernel does not crash when running fstrim.service,
> which seems to confirm the bisect.
> 
> A.
> 



* Re: Bug#1104460: [regression 6.1.y] discard/TRIM through RAID10 blocking
  2025-05-06  1:25               ` Bug#1104460: [regression 6.1.y] discard/TRIM through RAID10 blocking Yu Kuai
@ 2025-05-06  6:00                 ` Salvatore Bonaccorso
  2025-05-06 13:12                   ` Antoine Beaupré
  0 siblings, 1 reply; 13+ messages in thread
From: Salvatore Bonaccorso @ 2025-05-06  6:00 UTC (permalink / raw)
  To: Yu Kuai, 1104460
  Cc: Antoine Beaupré, Moritz Mühlenhoff, Melvin Vermeeren,
	Greg Kroah-Hartman, Coly Li, Sasha Levin, stable, regressions,
	yukuai (C)

Hi Yu,

Thanks for your followups.

On Tue, May 06, 2025 at 09:25:50AM +0800, Yu Kuai wrote:
> Hi,
> 
> On 2025/05/06 4:59, Antoine Beaupré wrote:
> > On 2025-05-05 22:36:07, Salvatore Bonaccorso wrote:
> > > Hi Antoine,
> > > 
> > > On Mon, May 05, 2025 at 02:50:32PM -0400, Antoine Beaupré wrote:
> > > > On 2025-05-05 18:02:37, Salvatore Bonaccorso wrote:
> > > > > On Mon, May 05, 2025 at 04:00:31PM +0200, Salvatore Bonaccorso wrote:
> > > > > > Hi Moritz,
> > > > > > 
> > > > > > On Mon, May 05, 2025 at 01:47:15PM +0200, Moritz Mühlenhoff wrote:
> > > > > > > Am Wed, Apr 30, 2025 at 05:55:20PM +0200 schrieb Salvatore Bonaccorso:
> > > > > > > > Hi
> > > > > > > > 
> > > > > > > > We got a regression report in Debian after the update from 6.1.133 to
> > > > > > > > 6.1.135. Melvin is reporting that discard/trim through a RAID10 array
> > > > > > > > stalls indefinitely. The full report is inlined below and originates
> > > > > > > > from https://bugs.debian.org/1104460 .
> > > > > > > 
> > > > > > > JFTR, we ran into the same problem with a few Wikimedia servers running
> > > > > > > 6.1.135 and RAID 10: The servers started to lock up once fstrim.service
> > > > > > > got started. Full oops messages are available at
> > > > > > > https://phabricator.wikimedia.org/P75746
> > > > > > 
> > > > > > Thanks for these additional data points. Assuming you won't be able to
> > > > > > test the other stable series where the commit d05af90d6218
> > > > > > ("md/raid10: fix missing discard IO accounting") went in, might you at
> > > > > > least be able to test the 6.1.y branch with the commit reverted again
> > > > > > and manually trigger the issue?
> > > > > > 
> > > > > > If needed I can provide a test Debian package of 6.1.135 (or 6.1.137)
> > > > > > with the patch reverted.
> > > > > 
> > > > > So one additional data point as several Debian users were reporting
> > > > > back being affected: One user did upgrade to 6.12.25 (where the
> > > > > commit was backported as well) and is not able to reproduce the issue
> > > > > there.
> > > > 
> > > > That would be me.
> > > > 
> > > > I can reproduce the issue as outlined by Moritz above fairly reliably in
> > > > 6.1.135 (debian package 6.1.0-34-amd64). The reproducer is simple, on a
> > > > RAID-10 host:
> > > > 
> > > >   1. reboot
> > > >   2. systemctl start fstrim.service
> > > > 
> > > > We're tracking the issue internally in:
> > > > 
> > > > https://gitlab.torproject.org/tpo/tpa/team/-/issues/42146
> > > > 
> > > > I've managed to work around the issue by upgrading to the Debian package
> > > > from testing/unstable (6.12.25), as Salvatore indicated above. There,
> > > > fstrim doesn't cause any crash and completes successfully. In stable, it
> > > > just hangs there forever. The kernel doesn't completely panic and the
> > > > machine is otherwise somewhat still functional: my existing SSH
> > > > connection keeps working, for example, but new ones fail. And an `apt
> > > > install` of another kernel hangs forever.
> > > 
> > > So likely at least in 6.1.y there are missing pre-requisites causing
> > > the behaviour.
> > > 
> > > If you can test 6.1.135-1 with the commit
> > > 4a05f7ae33716d996c5ce56478a36a3ede1d76f2 reverted then you can fetch
> > > built packages at:
> > > 
> > > https://people.debian.org/~carnil/tmp/linux/1104460/
> 
> Can you also test with 4a05f7ae33716d996c5ce56478a36a3ede1d76f2 not
> reverted, and also cherry-pick c567c86b90d4715081adfe5eb812141a5b6b4883?

Thank you.

Antoine, Moritz,
https://people.debian.org/~carnil/tmp/linux/1104460-2/ contains a
build with 4a05f7ae33716d996c5ce56478a36a3ede1d76f2 *not* reverted and
with c567c86b90d4715081adfe5eb812141a5b6b4883 cherry-picked, can you
test this one as well?

Regards,
Salvatore


* Re: Bug#1104460: [regression 6.1.y] discard/TRIM through RAID10 blocking
  2025-05-06  6:00                 ` Salvatore Bonaccorso
@ 2025-05-06 13:12                   ` Antoine Beaupré
  0 siblings, 0 replies; 13+ messages in thread
From: Antoine Beaupré @ 2025-05-06 13:12 UTC (permalink / raw)
  To: Salvatore Bonaccorso, Yu Kuai, 1104460
  Cc: Moritz Mühlenhoff, Melvin Vermeeren, Greg Kroah-Hartman,
	Coly Li, Sasha Levin, stable, regressions, yukuai (C)

On 2025-05-06 08:00:34, Salvatore Bonaccorso wrote:
> Hi Yu,
>
> Thanks for your followups.
>
> On Tue, May 06, 2025 at 09:25:50AM +0800, Yu Kuai wrote:
>> Hi,
>> 
>> On 2025/05/06 4:59, Antoine Beaupré wrote:
>> > On 2025-05-05 22:36:07, Salvatore Bonaccorso wrote:
>> > > Hi Antoine,
>> > > 
>> > > On Mon, May 05, 2025 at 02:50:32PM -0400, Antoine Beaupré wrote:
>> > > > On 2025-05-05 18:02:37, Salvatore Bonaccorso wrote:
>> > > > > On Mon, May 05, 2025 at 04:00:31PM +0200, Salvatore Bonaccorso wrote:
>> > > > > > Hi Moritz,
>> > > > > > 
>> > > > > > On Mon, May 05, 2025 at 01:47:15PM +0200, Moritz Mühlenhoff wrote:
>> > > > > > > Am Wed, Apr 30, 2025 at 05:55:20PM +0200 schrieb Salvatore Bonaccorso:
>> > > > > > > > Hi
>> > > > > > > > 
>> > > > > > > > We got a regression report in Debian after the update from 6.1.133 to
>> > > > > > > > 6.1.135. Melvin is reporting that discard/trim through a RAID10 array
>> > > > > > > > stalls indefinitely. The full report is inlined below and originates
>> > > > > > > > from https://bugs.debian.org/1104460 .
>> > > > > > > 
>> > > > > > > JFTR, we ran into the same problem with a few Wikimedia servers running
>> > > > > > > 6.1.135 and RAID 10: The servers started to lock up once fstrim.service
>> > > > > > > got started. Full oops messages are available at
>> > > > > > > https://phabricator.wikimedia.org/P75746
>> > > > > > 
>> > > > > > Thanks for these additional data points. Assuming you won't be able to
>> > > > > > test the other stable series where the commit d05af90d6218
>> > > > > > ("md/raid10: fix missing discard IO accounting") went in, might you at
>> > > > > > least be able to test the 6.1.y branch with the commit reverted again
>> > > > > > and manually trigger the issue?
>> > > > > > 
>> > > > > > If needed I can provide a test Debian package of 6.1.135 (or 6.1.137)
>> > > > > > with the patch reverted.
>> > > > > 
>> > > > > So one additional data point as several Debian users were reporting
>> > > > > back being affected: One user did upgrade to 6.12.25 (where the
>> > > > > commit was backported as well) and is not able to reproduce the issue
>> > > > > there.
>> > > > 
>> > > > That would be me.
>> > > > 
>> > > > I can reproduce the issue as outlined by Moritz above fairly reliably in
>> > > > 6.1.135 (debian package 6.1.0-34-amd64). The reproducer is simple, on a
>> > > > RAID-10 host:
>> > > > 
>> > > >   1. reboot
>> > > >   2. systemctl start fstrim.service
>> > > > 
>> > > > We're tracking the issue internally in:
>> > > > 
>> > > > https://gitlab.torproject.org/tpo/tpa/team/-/issues/42146
>> > > > 
>> > > > I've managed to work around the issue by upgrading to the Debian package
>> > > > from testing/unstable (6.12.25), as Salvatore indicated above. There,
>> > > > fstrim doesn't cause any crash and completes successfully. In stable, it
>> > > > just hangs there forever. The kernel doesn't completely panic and the
>> > > > machine is otherwise somewhat still functional: my existing SSH
>> > > > connection keeps working, for example, but new ones fail. And an `apt
>> > > > install` of another kernel hangs forever.
>> > > 
>> > > So likely at least in 6.1.y there are missing pre-requisites causing
>> > > the behaviour.
>> > > 
>> > > If you can test 6.1.135-1 with the commit
>> > > 4a05f7ae33716d996c5ce56478a36a3ede1d76f2 reverted then you can fetch
>> > > built packages at:
>> > > 
>> > > https://people.debian.org/~carnil/tmp/linux/1104460/
>> 
>> Can you also test with 4a05f7ae33716d996c5ce56478a36a3ede1d76f2 not
>> reverted, and also cherry-pick c567c86b90d4715081adfe5eb812141a5b6b4883?
>
> Thank you.
>
> Antoine, Moritz,
> https://people.debian.org/~carnil/tmp/linux/1104460-2/ contains a
> build with 4a05f7ae33716d996c5ce56478a36a3ede1d76f2 *not* reverted and
> with c567c86b90d4715081adfe5eb812141a5b6b4883 cherry-picked, can you
> test this one as well?

I tested this one, and could successfully run fstrim.service without
problems.
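
A check of this kind could be scripted roughly as below. This is a hedged sketch, not the procedure actually used here: the function names and the 600-second limit are made up, `timeout` is assumed to be the coreutils one and `ps` the procps one, and starting the service needs root. It starts fstrim.service with a time limit and then lists tasks in uninterruptible sleep (state D), which is where the original report saw stuck processes.

```shell
#!/bin/sh
# Sketch only: function names and the 600 s limit are assumptions.

# Filter "STATE PID COMM" lines (as printed by ps below) and keep only
# tasks in uninterruptible sleep, printing "PID COMM".
d_state_tasks() {
    awk '$1 ~ /^D/ { print $2, $3 }'
}

# Start the trim with a time limit, then report any D-state tasks.
# Needs root; timeout(1) exits with 124 if the limit is hit.
check_fstrim() {
    timeout 600 systemctl start fstrim.service
    status=$?
    ps -eo state=,pid=,comm= | d_state_tasks
    return "$status"
}
```

On a healthy kernel `check_fstrim` would be expected to return 0 with little or no output; a non-zero status together with fstrim listed in state D would match the hang described earlier in the thread.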

A.

-- 
The trouble with the great human family is that everyone wants to be
its father.
                        - Mafalda


* Re: [regression 6.1.y] discard/TRIM through RAID10 blocking (was: Re: Bug#1104460: linux-image-6.1.0-34-powerpc64le: Discard broken) with RAID10: BUG: kernel tried to execute user page (0) - exploit attempt?
  2025-04-30 15:55 ` [regression 6.1.y] discard/TRIM through RAID10 blocking (was: Re: Bug#1104460: linux-image-6.1.0-34-powerpc64le: Discard broken) with RAID10: BUG: kernel tried to execute user page (0) - exploit attempt? Salvatore Bonaccorso
  2025-05-05 11:47   ` Moritz Mühlenhoff
  2025-05-06  1:11   ` Yu Kuai
@ 2025-05-06 15:16   ` Melvin Vermeeren
  2 siblings, 0 replies; 13+ messages in thread
From: Melvin Vermeeren @ 2025-05-06 15:16 UTC (permalink / raw)
  To: Yu Kuai, Greg Kroah-Hartman, Salvatore Bonaccorso
  Cc: 1104460, Coly Li, Sasha Levin, stable, regressions

Hi Salvatore,

I was unexpectedly busy this past week and have only just caught up on all the
mails. Many thanks to everyone involved, and for the additional information
from several people; I am happy to see it.

On Wednesday, 30 April 2025 17:55:20 Central European Summer Time Salvatore 
Bonaccorso wrote:
> Melvin, the same change went as well in other stable series, 6.6.88,
> 6.12.25, 6.14.4, can you test e.g. 6.12.25-1 in Debian as well from
> unstable to see if the regression is there as well?

Specifically for this, I did just now test this with Debian testing's 
6.12.25-1, albeit on amd64 instead of ppc64le, with an identical storage 
layout and can confirm the issue does *not* exist there.

This confirms what others have already discovered. I agree with the findings
and have nothing specific to add.

Thanks again to all,

-- 
Melvin Vermeeren
Systems engineer

