* NULL pointer dereference with MD write-back journal, where journal device is RAID-1
@ 2023-08-06 22:48 Corey Hickey
2023-08-07 1:02 ` Yu Kuai
0 siblings, 1 reply; 9+ messages in thread
From: Corey Hickey @ 2023-08-06 22:48 UTC (permalink / raw)
To: 'Linux RAID'
Hello,
I have encountered a reproducible NULL pointer dereference when using
the write-back journal feature for RAID-5. This _seems_ to happen
only when the journal device is itself a RAID-1.
https://docs.kernel.org/driver-api/md/raid5-cache.html
This report supersedes a report I sent to Debian earlier:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1043078
Steps to reproduce, including example commands:
1. Create a RAID-1 for the journal device.
$ sudo mdadm --create /dev/md101 -n 2 -l 1 /dev/disk/by-id/ata-Samsung_SSD_850_PRO_256GB_S251NX0H60631*
2. Create a RAID-5 with the journal included. I'm using '-z 10G' for
testing in order to reduce the initial sync time.
$ sudo mdadm --create /dev/md10 -n 3 -l 5 -z 10G --write-journal /dev/md101 /dev/disk/by-id/ata-ST32000645NS_Z1K0*
3. Enable write-back (completes once re-sync is finished).
$ until echo write-back | sudo tee /sys/block/md10/md/journal_mode ; do sleep 5 ; done
4. Write to the disk (may take a few attempts).
$ sudo dd if=/dev/zero of=/dev/md10 iflag=fullblock bs=1M count=10240
Notes:
* The bug does not always manifest immediately, but for me it nearly
always manifests on the first or second 'dd' run.
* The bug is not limited to buffered I/O: writes via 'oflag=direct'
can cause the bug as well.
* I was not able to reproduce the bug on 10 attempts when I used a
single non-RAID SSD as the journal.
* The bug can manifest while the journal RAID-1 is resyncing or not;
the resync does not seem relevant.
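Since the bug may take a few attempts to appear, the write step can be wrapped in a retry loop. The following is only a minimal sketch, not the procedure used for this report: the function names `has_null_deref` and `repro_loop` are hypothetical, and it assumes the kernel log holds no earlier oops since boot and that `sudo dmesg` is permitted.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the text on stdin contains the oops
# signature seen in this report.
has_null_deref() {
    grep -q 'BUG: kernel NULL pointer dereference'
}

# Hypothetical helper: run the step-4 'dd' up to $2 times against device $1,
# stopping as soon as the kernel log shows the oops.
# WARNING: this overwrites the target device, exactly as step 4 does.
repro_loop() {
    dev=$1
    attempts=$2
    i=1
    while [ "$i" -le "$attempts" ]; do
        echo "attempt $i of $attempts"
        sudo dd if=/dev/zero of="$dev" iflag=fullblock bs=1M count=10240
        if sudo dmesg | has_null_deref; then
            echo "oops found after attempt $i"
            return 0
        fi
        i=$((i + 1))
    done
    echo "no oops after $attempts attempts"
    return 1
}
```

It could be invoked as `repro_loop /dev/md10 10`. One caveat with this design: if the md thread dies during a run, the 'dd' itself may never return, in which case the kernel log has to be checked from another terminal.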
My SSDs are attached to an onboard SATA controller:
$ lspci | grep 06:00
06:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9235 PCIe 2.0 x2 4-port SATA 6 Gb/s Controller (rev 11)
My hard disks are attached to an external SATA-->USB enclosure,
but I think this is not relevant--I had the same problem with hard disks
attached to internal SATA controllers in earlier tests.
I'm using Debian Sid on Linux 6.4.8. The kernel is compiled locally
and installed via:
--------------------------------------------------------------------
wget https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.4.8.tar.xz
tar xf linux-6.4.8.tar.xz
cd linux-6.4.8
cp -p "/boot/config-$(uname -r)" .config
make oldconfig # and accept all defaults
make -j 12 bindeb-pkg
sudo dpkg -i linux-image-6.4.8_6.4.8-3_amd64.deb
--------------------------------------------------------------------
Here are the errors reported by the kernel:
--------------------------------------------------------------------
[ 2566.222104] BUG: kernel NULL pointer dereference, address: 0000000000000157
[ 2566.222111] #PF: supervisor read access in kernel mode
[ 2566.222114] #PF: error_code(0x0000) - not-present page
[ 2566.222117] PGD 0 P4D 0
[ 2566.222121] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 2566.222125] CPU: 1 PID: 5415 Comm: md10_raid5 Not tainted 6.4.8 #3
[ 2566.222129] Hardware name: ASUS System Product Name/ROG CROSSHAIR VII HERO (WI-FI), BIOS 4603 09/13/2021
[ 2566.222132] RIP: 0010:submit_bio_noacct+0x182/0x5c0
[ 2566.222139] Code: ff ff ff 4c 8b 63 48 4d 85 e4 74 0f 48 63 05 e5 ef 41 01 4d 8b a4 c4 d0 00 00 00 41 89 ed 41 83 e5 01 0f 1f 44 00 00 49 63 c5 <41> 80 bc 04 56 01 00 00 00 0f 85 fc 00 00 00 41 80 bc 04 54 01 00
[ 2566.222142] RSP: 0018:ffffa41d46e5bd00 EFLAGS: 00010202
[ 2566.222146] RAX: 0000000000000001 RBX: ffff93275b6668b8 RCX: 0000000000000000
[ 2566.222148] RDX: ffff932741380640 RSI: ffffffffb323f686 RDI: 00000000ffffffff
[ 2566.222151] RBP: 0000000000040001 R08: 0000000000000000 R09: 0000000000000000
[ 2566.222153] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[ 2566.222155] R13: 0000000000000001 R14: 000000001dcb2a80 R15: 0000000000000000
[ 2566.222157] FS: 0000000000000000(0000) GS:ffff93363ea40000(0000) knlGS:0000000000000000
[ 2566.222160] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2566.222162] CR2: 0000000000000157 CR3: 0000000140b8e000 CR4: 00000000003506e0
[ 2566.222165] Call Trace:
[ 2566.222167] <TASK>
[ 2566.222171] ? __die+0x23/0x70
[ 2566.222176] ? page_fault_oops+0x17d/0x4c0
[ 2566.222180] ? update_load_avg+0x7e/0x780
[ 2566.222185] ? exc_page_fault+0x7f/0x180
[ 2566.222190] ? asm_exc_page_fault+0x26/0x30
[ 2566.222196] ? submit_bio_noacct+0x182/0x5c0
[ 2566.222201] handle_active_stripes.isra.0+0x377/0x550 [raid456]
[ 2566.222220] raid5d+0x487/0x750 [raid456]
[ 2566.222234] ? __schedule+0x3e7/0xb80
[ 2566.222240] ? _raw_spin_lock_irqsave+0x27/0x60
[ 2566.222245] ? preempt_count_add+0x6e/0xa0
[ 2566.222248] ? _raw_spin_lock_irqsave+0x27/0x60
[ 2566.222254] ? __pfx_md_thread+0x10/0x10 [md_mod]
[ 2566.222273] md_thread+0xae/0x190 [md_mod]
[ 2566.222293] ? __pfx_autoremove_wake_function+0x10/0x10
[ 2566.222299] kthread+0xf7/0x130
[ 2566.222304] ? __pfx_kthread+0x10/0x10
[ 2566.222309] ret_from_fork+0x2c/0x50
[ 2566.222316] </TASK>
[ 2566.222318] Modules linked in: twofish_generic twofish_avx_x86_64 twofish_x86_64_3way twofish_x86_64 twofish_common essiv authenc dm_crypt cpufreq_conservative cpufreq_userspace cpufreq_powersave rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache netfs bridge stp llc binfmt_misc amdgpu eeepc_wmi intel_rapl_msr asus_wmi intel_rapl_common battery edac_mce_amd hid_pl sparse_keymap platform_profile hid_dr snd_hda_codec_realtek sp5100_tco drm_buddy rfkill ff_memless gpu_sched drm_suballoc_helper kvm_amd snd_hda_codec_generic drm_display_helper ledtrig_audio snd_hda_codec_hdmi cec rc_core drm_ttm_helper kvm snd_hda_intel snd_intel_dspcfg ttm snd_intel_sdw_acpi asus_wmi_sensors irqbypass drm_kms_helper snd_hda_codec rapl video acpi_cpufreq snd_hda_core mxm_wmi pcspkr wmi_bmof k10temp watchdog ccp snd_hwdep rng_core button sg cpufreq_ondemand lm90 snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore evdev nfsd psmouse i2c_dev sidewinder gameport joydev auth_rpcgss parport_pc nfs_acl ppdev
[ 2566.222390] lockd lp grace parport drm fuse loop efi_pstore dm_mod configfs sunrpc ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 btrfs efivarfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic multipath linear hid_generic raid0 bcache raid1 md_mod uas usb_storage sd_mod usbhid crc32_pclmul crc32c_intel t10_pi hid crc64_rocksoft_generic crc64_rocksoft crc_t10dif crct10dif_generic crct10dif_pclmul crc64 ghash_clmulni_intel crct10dif_common sha512_ssse3 sha512_generic ahci xhci_pci libahci xhci_hcd aesni_intel crypto_simd libata cryptd usbcore igb e1000e i2c_piix4 scsi_mod i2c_algo_bit dca usb_common scsi_common gpio_amdpt wmi gpio_generic
[ 2566.222451] CR2: 0000000000000157
[ 2566.222454] ---[ end trace 0000000000000000 ]---
[ 2566.436029] RIP: 0010:submit_bio_noacct+0x182/0x5c0
[ 2566.436038] Code: ff ff ff 4c 8b 63 48 4d 85 e4 74 0f 48 63 05 e5 ef 41 01 4d 8b a4 c4 d0 00 00 00 41 89 ed 41 83 e5 01 0f 1f 44 00 00 49 63 c5 <41> 80 bc 04 56 01 00 00 00 0f 85 fc 00 00 00 41 80 bc 04 54 01 00
[ 2566.436041] RSP: 0018:ffffa41d46e5bd00 EFLAGS: 00010202
[ 2566.436044] RAX: 0000000000000001 RBX: ffff93275b6668b8 RCX: 0000000000000000
[ 2566.436047] RDX: ffff932741380640 RSI: ffffffffb323f686 RDI: 00000000ffffffff
[ 2566.436049] RBP: 0000000000040001 R08: 0000000000000000 R09: 0000000000000000
[ 2566.436051] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[ 2566.436053] R13: 0000000000000001 R14: 000000001dcb2a80 R15: 0000000000000000
[ 2566.436055] FS: 0000000000000000(0000) GS:ffff93363ea40000(0000) knlGS:0000000000000000
[ 2566.436058] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2566.436060] CR2: 0000000000000157 CR3: 0000000140b8e000 CR4: 00000000003506e0
[ 2566.436063] note: md10_raid5[5415] exited with irqs disabled
[ 2566.436109] ------------[ cut here ]------------
[ 2566.436112] WARNING: CPU: 1 PID: 5415 at kernel/exit.c:818 do_exit+0x8ef/0xb20
[ 2566.436119] Modules linked in: twofish_generic twofish_avx_x86_64 twofish_x86_64_3way twofish_x86_64 twofish_common essiv authenc dm_crypt cpufreq_conservative cpufreq_userspace cpufreq_powersave rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache netfs bridge stp llc binfmt_misc amdgpu eeepc_wmi intel_rapl_msr asus_wmi intel_rapl_common battery edac_mce_amd hid_pl sparse_keymap platform_profile hid_dr snd_hda_codec_realtek sp5100_tco drm_buddy rfkill ff_memless gpu_sched drm_suballoc_helper kvm_amd snd_hda_codec_generic drm_display_helper ledtrig_audio snd_hda_codec_hdmi cec rc_core drm_ttm_helper kvm snd_hda_intel snd_intel_dspcfg ttm snd_intel_sdw_acpi asus_wmi_sensors irqbypass drm_kms_helper snd_hda_codec rapl video acpi_cpufreq snd_hda_core mxm_wmi pcspkr wmi_bmof k10temp watchdog ccp snd_hwdep rng_core button sg cpufreq_ondemand lm90 snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore evdev nfsd psmouse i2c_dev sidewinder gameport joydev auth_rpcgss parport_pc nfs_acl ppdev
[ 2566.436188] lockd lp grace parport drm fuse loop efi_pstore dm_mod configfs sunrpc ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 btrfs efivarfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic multipath linear hid_generic raid0 bcache raid1 md_mod uas usb_storage sd_mod usbhid crc32_pclmul crc32c_intel t10_pi hid crc64_rocksoft_generic crc64_rocksoft crc_t10dif crct10dif_generic crct10dif_pclmul crc64 ghash_clmulni_intel crct10dif_common sha512_ssse3 sha512_generic ahci xhci_pci libahci xhci_hcd aesni_intel crypto_simd libata cryptd usbcore igb e1000e i2c_piix4 scsi_mod i2c_algo_bit dca usb_common scsi_common gpio_amdpt wmi gpio_generic
[ 2566.436250] CPU: 1 PID: 5415 Comm: md10_raid5 Tainted: G D 6.4.8 #3
[ 2566.436254] Hardware name: ASUS System Product Name/ROG CROSSHAIR VII HERO (WI-FI), BIOS 4603 09/13/2021
[ 2566.436256] RIP: 0010:do_exit+0x8ef/0xb20
[ 2566.436260] Code: e9 12 ff ff ff 48 8b bb 98 09 00 00 31 f6 e8 88 d9 ff ff e9 a0 fd ff ff 4c 89 e6 bf 05 06 00 00 e8 f6 0b 01 00 e9 59 f8 ff ff <0f> 0b e9 88 f7 ff ff 0f 0b e9 45 f7 ff ff 48 89 df e8 fb e0 11 00
[ 2566.436263] RSP: 0018:ffffa41d46e5bed8 EFLAGS: 00010286
[ 2566.436266] RAX: 0000000000000000 RBX: ffff9327df5a6600 RCX: 0000000000000000
[ 2566.436269] RDX: 0000000000000001 RSI: 0000000000002710 RDI: 00000000ffffffff
[ 2566.436271] RBP: ffff9327c0afb600 R08: 0000000000000000 R09: ffffa41d46e5bde0
[ 2566.436273] R10: 0000000000000003 R11: ffff93363f2f7fe8 R12: 0000000000000009
[ 2566.436275] R13: ffff9327df4deb40 R14: 0000000000000000 R15: 0000000000000000
[ 2566.436277] FS: 0000000000000000(0000) GS:ffff93363ea40000(0000) knlGS:0000000000000000
[ 2566.436280] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2566.436282] CR2: 0000000000000157 CR3: 0000000140b8e000 CR4: 00000000003506e0
[ 2566.436284] Call Trace:
[ 2566.436287] <TASK>
[ 2566.436288] ? do_exit+0x8ef/0xb20
[ 2566.436292] ? __warn+0x81/0x130
[ 2566.436298] ? do_exit+0x8ef/0xb20
[ 2566.436301] ? report_bug+0x191/0x1c0
[ 2566.436308] ? handle_bug+0x3c/0x80
[ 2566.436312] ? exc_invalid_op+0x17/0x70
[ 2566.436316] ? asm_exc_invalid_op+0x1a/0x20
[ 2566.436321] ? do_exit+0x8ef/0xb20
[ 2566.436325] ? do_exit+0x70/0xb20
[ 2566.436329] make_task_dead+0x81/0x170
[ 2566.436333] rewind_stack_and_make_dead+0x17/0x20
[ 2566.436338] RIP: 0000:0x0
[ 2566.436344] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
[ 2566.436346] RSP: 0000:0000000000000000 EFLAGS: 00000000 ORIG_RAX: 0000000000000000
[ 2566.436349] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 2566.436350] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 2566.436352] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 2566.436354] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 2566.436355] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 2566.436359] </TASK>
[ 2566.436361] ---[ end trace 0000000000000000 ]---
--------------------------------------------------------------------
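As a cross-check on the first oops, the faulting address can be reconstructed from the register dump. The decoding here is an assumption, not something stated in the report: the marked instruction bytes `<41> 80 bc 04 56 01 00 00 00` appear to decode as `cmpb $0x0,0x156(%r12,%rax,1)`, so the address would be R12 + RAX + 0x156. The function name `fault_addr` is hypothetical.

```shell
# Hypothetical helper: effective address of a base + index + displacement
# operand. Arguments are hex strings as printed in the register dump.
fault_addr() {
    printf '0x%x\n' $(( $1 + $2 + $3 ))
}

# R12 (base) = 0, RAX (index) = 1, displacement = 0x156 (assumed decoding)
fault_addr 0x0000000000000000 0x0000000000000001 0x156
# prints 0x157, which matches CR2 in the log above
```

With R12 equal to zero, the result is consistent with a read at a small offset from a NULL base pointer. Which structure that pointer belongs to is not determined by this arithmetic.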
Thank you,
Corey
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: NULL pointer dereference with MD write-back journal, where journal device is RAID-1
2023-08-06 22:48 NULL pointer dereference with MD write-back journal, where journal device is RAID-1 Corey Hickey
@ 2023-08-07 1:02 ` Yu Kuai
2023-08-07 2:09 ` Corey Hickey
0 siblings, 1 reply; 9+ messages in thread
From: Yu Kuai @ 2023-08-07 1:02 UTC (permalink / raw)
To: Corey Hickey, 'Linux RAID'; +Cc: yukuai (C)
Hi,
On 2023/08/07 6:48, Corey Hickey wrote:
> Hello,
>
> I have encountered a reproducible NULL pointer dereference when using
> the write-back journal feature for RAID-5. This _seems_ to happen
> only when the journal device is itself a RAID-1.
>
> https://docs.kernel.org/driver-api/md/raid5-cache.html
>
> This report supersedes a report I sent to Debian earlier:
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1043078
>
> Steps to reproduce, including example commands:
>
> 1. Create a RAID-1 for the journal device.
> $ sudo mdadm --create /dev/md101 -n 2 -l 1
> /dev/disk/by-id/ata-Samsung_SSD_850_PRO_256GB_S251NX0H60631*
>
> 2. Create a RAID-5 with the journal included. I'm using '-z 10G' for
> testing in order to reduce the initial sync time.
> $ sudo mdadm --create /dev/md10 -n 3 -l 5 -z 10G --write-journal
> /dev/md101 /dev/disk/by-id/ata-ST32000645NS_Z1K0*
>
> 3. Enable write-back (completes once re-sync is finished).
> $ until echo write-back | sudo tee /sys/block/md10/md/journal_mode ; do
> sleep 5 ; done
>
> 4. Write to the disk (may take a few attempts).
> $ sudo dd if=/dev/zero of=/dev/md10 iflag=fullblock bs=1M count=10240
>
> Notes:
> * The bug does not always manifest immediately but for me, it nearly
> always manifests on the first or second 'dd' run.
> * The bug is not limited to buffered I/O: writes via 'oflag=direct'
> can cause the bug as well.
> * I was not able to reproduce the bug on 10 attempts when I used a
> single non-RAID SSD as the journal.
> * The bug can manifest while the journal RAID-1 is resyncing or not;
> the resync does not seem relevant.
>
> My SSDs are attached to an onboard SATA controller:
>
> $ lspci | grep 06:00
> 06:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9235 PCIe 2.0
> x2 4-port SATA 6 Gb/s Controller (rev 11)
>
> My hard disks are attached to an external SATA-->USB enclosure,
> but I think this is not relevant--I had the same problem with hard disks
> attached to internal SATA controllers in earlier tests.
>
> I'm using Debian Sid on Linux 6.4.8. The kernel is compiled locally
> and installed via:
> --------------------------------------------------------------------
> wget https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.4.8.tar.xz
> tar xf linux-6.4.8.tar.xz
> cd linux-6.4.8
> cp -p "/boot/config-$(uname -r)" .config
> make oldconfig # and accept all defaults
> make -j 12 bindeb-pkg
> sudo dpkg -i linux-image-6.4.8_6.4.8-3_amd64.deb
> --------------------------------------------------------------------
>
> Here are the errors reported by the kernel:
> --------------------------------------------------------------------
> [ 2566.222104] BUG: kernel NULL pointer dereference, address:
> 0000000000000157
> [ 2566.222111] #PF: supervisor read access in kernel mode
> [ 2566.222114] #PF: error_code(0x0000) - not-present page
> [ 2566.222117] PGD 0 P4D 0
> [ 2566.222121] Oops: 0000 [#1] PREEMPT SMP NOPTI
> [ 2566.222125] CPU: 1 PID: 5415 Comm: md10_raid5 Not tainted 6.4.8 #3
> [ 2566.222129] Hardware name: ASUS System Product Name/ROG CROSSHAIR VII
> HERO (WI-FI), BIOS 4603 09/13/2021
> [ 2566.222132] RIP: 0010:submit_bio_noacct+0x182/0x5c0
Can you provide addr2line result? This will be helpful to locate the
problem.
Thanks,
Kuai
> [ 2566.222139] Code: ff ff ff 4c 8b 63 48 4d 85 e4 74 0f 48 63 05 e5 ef
> 41 01 4d 8b a4 c4 d0 00 00 00 41 89 ed 41 83 e5 01 0f 1f 44 00 00 49 63
> c5 <41> 80 bc 04 56 01 00 00 00 0f 85 fc 00 00 00 41 80 bc 04 54 01 00
> [ 2566.222142] RSP: 0018:ffffa41d46e5bd00 EFLAGS: 00010202
> [ 2566.222146] RAX: 0000000000000001 RBX: ffff93275b6668b8 RCX:
> 0000000000000000
> [ 2566.222148] RDX: ffff932741380640 RSI: ffffffffb323f686 RDI:
> 00000000ffffffff
> [ 2566.222151] RBP: 0000000000040001 R08: 0000000000000000 R09:
> 0000000000000000
> [ 2566.222153] R10: 0000000000000001 R11: 0000000000000000 R12:
> 0000000000000000
> [ 2566.222155] R13: 0000000000000001 R14: 000000001dcb2a80 R15:
> 0000000000000000
> [ 2566.222157] FS: 0000000000000000(0000) GS:ffff93363ea40000(0000)
> knlGS:0000000000000000
> [ 2566.222160] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 2566.222162] CR2: 0000000000000157 CR3: 0000000140b8e000 CR4:
> 00000000003506e0
> [ 2566.222165] Call Trace:
> [ 2566.222167] <TASK>
> [ 2566.222171] ? __die+0x23/0x70
> [ 2566.222176] ? page_fault_oops+0x17d/0x4c0
> [ 2566.222180] ? update_load_avg+0x7e/0x780
> [ 2566.222185] ? exc_page_fault+0x7f/0x180
> [ 2566.222190] ? asm_exc_page_fault+0x26/0x30
> [ 2566.222196] ? submit_bio_noacct+0x182/0x5c0
> [ 2566.222201] handle_active_stripes.isra.0+0x377/0x550 [raid456]
> [ 2566.222220] raid5d+0x487/0x750 [raid456]
> [ 2566.222234] ? __schedule+0x3e7/0xb80
> [ 2566.222240] ? _raw_spin_lock_irqsave+0x27/0x60
> [ 2566.222245] ? preempt_count_add+0x6e/0xa0
> [ 2566.222248] ? _raw_spin_lock_irqsave+0x27/0x60
> [ 2566.222254] ? __pfx_md_thread+0x10/0x10 [md_mod]
> [ 2566.222273] md_thread+0xae/0x190 [md_mod]
> [ 2566.222293] ? __pfx_autoremove_wake_function+0x10/0x10
> [ 2566.222299] kthread+0xf7/0x130
> [ 2566.222304] ? __pfx_kthread+0x10/0x10
> [ 2566.222309] ret_from_fork+0x2c/0x50
> [ 2566.222316] </TASK>
> [ 2566.222318] Modules linked in: twofish_generic twofish_avx_x86_64
> twofish_x86_64_3way twofish_x86_64 twofish_common essiv authenc dm_crypt
> cpufreq_conservative cpufreq_userspace cpufreq_powersave rpcsec_gss_krb5
> nfsv4 dns_resolver nfs fscache netfs bridge stp llc binfmt_misc amdgpu
> eeepc_wmi intel_rapl_msr asus_wmi intel_rapl_common battery edac_mce_amd
> hid_pl sparse_keymap platform_profile hid_dr snd_hda_codec_realtek
> sp5100_tco drm_buddy rfkill ff_memless gpu_sched drm_suballoc_helper
> kvm_amd snd_hda_codec_generic drm_display_helper ledtrig_audio
> snd_hda_codec_hdmi cec rc_core drm_ttm_helper kvm snd_hda_intel
> snd_intel_dspcfg ttm snd_intel_sdw_acpi asus_wmi_sensors irqbypass
> drm_kms_helper snd_hda_codec rapl video acpi_cpufreq snd_hda_core
> mxm_wmi pcspkr wmi_bmof k10temp watchdog ccp snd_hwdep rng_core button
> sg cpufreq_ondemand lm90 snd_intel8x0 snd_ac97_codec ac97_bus
> snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore evdev nfsd
> psmouse i2c_dev sidewinder gameport joydev auth_rpcgss parport_pc
> nfs_acl ppdev
> [ 2566.222390] lockd lp grace parport drm fuse loop efi_pstore dm_mod
> configfs sunrpc ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 btrfs
> efivarfs raid10 raid456 async_raid6_recov async_memcpy async_pq
> async_xor async_tx xor raid6_pq libcrc32c crc32c_generic multipath
> linear hid_generic raid0 bcache raid1 md_mod uas usb_storage sd_mod
> usbhid crc32_pclmul crc32c_intel t10_pi hid crc64_rocksoft_generic
> crc64_rocksoft crc_t10dif crct10dif_generic crct10dif_pclmul crc64
> ghash_clmulni_intel crct10dif_common sha512_ssse3 sha512_generic ahci
> xhci_pci libahci xhci_hcd aesni_intel crypto_simd libata cryptd usbcore
> igb e1000e i2c_piix4 scsi_mod i2c_algo_bit dca usb_common scsi_common
> gpio_amdpt wmi gpio_generic
> [ 2566.222451] CR2: 0000000000000157
> [ 2566.222454] ---[ end trace 0000000000000000 ]---
> [ 2566.436029] RIP: 0010:submit_bio_noacct+0x182/0x5c0
> [ 2566.436038] Code: ff ff ff 4c 8b 63 48 4d 85 e4 74 0f 48 63 05 e5 ef
> 41 01 4d 8b a4 c4 d0 00 00 00 41 89 ed 41 83 e5 01 0f 1f 44 00 00 49 63
> c5 <41> 80 bc 04 56 01 00 00 00 0f 85 fc 00 00 00 41 80 bc 04 54 01 00
> [ 2566.436041] RSP: 0018:ffffa41d46e5bd00 EFLAGS: 00010202
> [ 2566.436044] RAX: 0000000000000001 RBX: ffff93275b6668b8 RCX:
> 0000000000000000
> [ 2566.436047] RDX: ffff932741380640 RSI: ffffffffb323f686 RDI:
> 00000000ffffffff
> [ 2566.436049] RBP: 0000000000040001 R08: 0000000000000000 R09:
> 0000000000000000
> [ 2566.436051] R10: 0000000000000001 R11: 0000000000000000 R12:
> 0000000000000000
> [ 2566.436053] R13: 0000000000000001 R14: 000000001dcb2a80 R15:
> 0000000000000000
> [ 2566.436055] FS: 0000000000000000(0000) GS:ffff93363ea40000(0000)
> knlGS:0000000000000000
> [ 2566.436058] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 2566.436060] CR2: 0000000000000157 CR3: 0000000140b8e000 CR4:
> 00000000003506e0
> [ 2566.436063] note: md10_raid5[5415] exited with irqs disabled
> [ 2566.436109] ------------[ cut here ]------------
> [ 2566.436112] WARNING: CPU: 1 PID: 5415 at kernel/exit.c:818
> do_exit+0x8ef/0xb20
> [ 2566.436119] Modules linked in: twofish_generic twofish_avx_x86_64
> twofish_x86_64_3way twofish_x86_64 twofish_common essiv authenc dm_crypt
> cpufreq_conservative cpufreq_userspace cpufreq_powersave rpcsec_gss_krb5
> nfsv4 dns_resolver nfs fscache netfs bridge stp llc binfmt_misc amdgpu
> eeepc_wmi intel_rapl_msr asus_wmi intel_rapl_common battery edac_mce_amd
> hid_pl sparse_keymap platform_profile hid_dr snd_hda_codec_realtek
> sp5100_tco drm_buddy rfkill ff_memless gpu_sched drm_suballoc_helper
> kvm_amd snd_hda_codec_generic drm_display_helper ledtrig_audio
> snd_hda_codec_hdmi cec rc_core drm_ttm_helper kvm snd_hda_intel
> snd_intel_dspcfg ttm snd_intel_sdw_acpi asus_wmi_sensors irqbypass
> drm_kms_helper snd_hda_codec rapl video acpi_cpufreq snd_hda_core
> mxm_wmi pcspkr wmi_bmof k10temp watchdog ccp snd_hwdep rng_core button
> sg cpufreq_ondemand lm90 snd_intel8x0 snd_ac97_codec ac97_bus
> snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore evdev nfsd
> psmouse i2c_dev sidewinder gameport joydev auth_rpcgss parport_pc
> nfs_acl ppdev
> [ 2566.436188] lockd lp grace parport drm fuse loop efi_pstore dm_mod
> configfs sunrpc ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 btrfs
> efivarfs raid10 raid456 async_raid6_recov async_memcpy async_pq
> async_xor async_tx xor raid6_pq libcrc32c crc32c_generic multipath
> linear hid_generic raid0 bcache raid1 md_mod uas usb_storage sd_mod
> usbhid crc32_pclmul crc32c_intel t10_pi hid crc64_rocksoft_generic
> crc64_rocksoft crc_t10dif crct10dif_generic crct10dif_pclmul crc64
> ghash_clmulni_intel crct10dif_common sha512_ssse3 sha512_generic ahci
> xhci_pci libahci xhci_hcd aesni_intel crypto_simd libata cryptd usbcore
> igb e1000e i2c_piix4 scsi_mod i2c_algo_bit dca usb_common scsi_common
> gpio_amdpt wmi gpio_generic
> [ 2566.436250] CPU: 1 PID: 5415 Comm: md10_raid5 Tainted: G
> D 6.4.8 #3
> [ 2566.436254] Hardware name: ASUS System Product Name/ROG CROSSHAIR VII
> HERO (WI-FI), BIOS 4603 09/13/2021
> [ 2566.436256] RIP: 0010:do_exit+0x8ef/0xb20
> [ 2566.436260] Code: e9 12 ff ff ff 48 8b bb 98 09 00 00 31 f6 e8 88 d9
> ff ff e9 a0 fd ff ff 4c 89 e6 bf 05 06 00 00 e8 f6 0b 01 00 e9 59 f8 ff
> ff <0f> 0b e9 88 f7 ff ff 0f 0b e9 45 f7 ff ff 48 89 df e8 fb e0 11 00
> [ 2566.436263] RSP: 0018:ffffa41d46e5bed8 EFLAGS: 00010286
> [ 2566.436266] RAX: 0000000000000000 RBX: ffff9327df5a6600 RCX:
> 0000000000000000
> [ 2566.436269] RDX: 0000000000000001 RSI: 0000000000002710 RDI:
> 00000000ffffffff
> [ 2566.436271] RBP: ffff9327c0afb600 R08: 0000000000000000 R09:
> ffffa41d46e5bde0
> [ 2566.436273] R10: 0000000000000003 R11: ffff93363f2f7fe8 R12:
> 0000000000000009
> [ 2566.436275] R13: ffff9327df4deb40 R14: 0000000000000000 R15:
> 0000000000000000
> [ 2566.436277] FS: 0000000000000000(0000) GS:ffff93363ea40000(0000)
> knlGS:0000000000000000
> [ 2566.436280] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 2566.436282] CR2: 0000000000000157 CR3: 0000000140b8e000 CR4:
> 00000000003506e0
> [ 2566.436284] Call Trace:
> [ 2566.436287] <TASK>
> [ 2566.436288] ? do_exit+0x8ef/0xb20
> [ 2566.436292] ? __warn+0x81/0x130
> [ 2566.436298] ? do_exit+0x8ef/0xb20
> [ 2566.436301] ? report_bug+0x191/0x1c0
> [ 2566.436308] ? handle_bug+0x3c/0x80
> [ 2566.436312] ? exc_invalid_op+0x17/0x70
> [ 2566.436316] ? asm_exc_invalid_op+0x1a/0x20
> [ 2566.436321] ? do_exit+0x8ef/0xb20
> [ 2566.436325] ? do_exit+0x70/0xb20
> [ 2566.436329] make_task_dead+0x81/0x170
> [ 2566.436333] rewind_stack_and_make_dead+0x17/0x20
> [ 2566.436338] RIP: 0000:0x0
> [ 2566.436344] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
> [ 2566.436346] RSP: 0000:0000000000000000 EFLAGS: 00000000 ORIG_RAX:
> 0000000000000000
> [ 2566.436349] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
> 0000000000000000
> [ 2566.436350] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
> 0000000000000000
> [ 2566.436352] RBP: 0000000000000000 R08: 0000000000000000 R09:
> 0000000000000000
> [ 2566.436354] R10: 0000000000000000 R11: 0000000000000000 R12:
> 0000000000000000
> [ 2566.436355] R13: 0000000000000000 R14: 0000000000000000 R15:
> 0000000000000000
> [ 2566.436359] </TASK>
> [ 2566.436361] ---[ end trace 0000000000000000 ]---
> --------------------------------------------------------------------
>
> Thank you,
> Corey
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: NULL pointer dereference with MD write-back journal, where journal device is RAID-1
2023-08-07 1:02 ` Yu Kuai
@ 2023-08-07 2:09 ` Corey Hickey
2023-08-07 2:15 ` Yu Kuai
0 siblings, 1 reply; 9+ messages in thread
From: Corey Hickey @ 2023-08-07 2:09 UTC (permalink / raw)
To: Yu Kuai, 'Linux RAID'; +Cc: yukuai (C)
On 2023-08-06 18:02, Yu Kuai wrote:
>> Here are the errors reported by the kernel:
>> --------------------------------------------------------------------
>> [ 2566.222104] BUG: kernel NULL pointer dereference, address:
>> 0000000000000157
>> [ 2566.222111] #PF: supervisor read access in kernel mode
>> [ 2566.222114] #PF: error_code(0x0000) - not-present page
>> [ 2566.222117] PGD 0 P4D 0
>> [ 2566.222121] Oops: 0000 [#1] PREEMPT SMP NOPTI
>> [ 2566.222125] CPU: 1 PID: 5415 Comm: md10_raid5 Not tainted 6.4.8 #3
>> [ 2566.222129] Hardware name: ASUS System Product Name/ROG CROSSHAIR VII
>> HERO (WI-FI), BIOS 4603 09/13/2021
>> [ 2566.222132] RIP: 0010:submit_bio_noacct+0x182/0x5c0
>
> Can you provide addr2line result? This will be helpful to locate the
> problem.
I have not done this before; I struggled a bit until I found this:
https://lwn.net/Articles/592724/
These are run within the kernel source tree, which I have not
modified since the original compilation.
$ scripts/decode_stacktrace.sh vmlinux < /tmp/trace1
[ 2566.222171] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
[ 2566.222176] ? page_fault_oops (arch/x86/mm/fault.c:707)
[ 2566.222180] ? update_load_avg (kernel/sched/fair.c:3920 kernel/sched/fair.c:4255)
[ 2566.222185] ? exc_page_fault (./arch/x86/include/asm/paravirt.h:695 arch/x86/mm/fault.c:1494 arch/x86/mm/fault.c:1542)
[ 2566.222190] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:570)
[ 2566.222196] ? submit_bio_noacct (block/blk-throttle.h:198 block/blk-throttle.h:210 block/blk-core.c:800)
[ 2566.222201] handle_active_stripes.isra.0 (drivers/md/raid5.c:6709 (discriminator 1)) raid456
[ 2566.222220] raid5d (drivers/md/raid5.c:6821) raid456
[ 2566.222234] ? __schedule (kernel/sched/core.c:6677)
[ 2566.222240] ? _raw_spin_lock_irqsave (./arch/x86/include/asm/atomic.h:202 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:543 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:186 (discriminator 4) ./include/linux/spinlock_api_smp.h:111 (discriminator 4) kernel/locking/spinlock.c:162 (discriminator 4))
[ 2566.222245] ? preempt_count_add (./include/linux/ftrace.h:976 kernel/sched/core.c:5793 kernel/sched/core.c:5790 kernel/sched/core.c:5818)
[ 2566.222248] ? _raw_spin_lock_irqsave (./arch/x86/include/asm/atomic.h:202 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:543 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:186 (discriminator 4) ./include/linux/spinlock_api_smp.h:111 (discriminator 4) kernel/locking/spinlock.c:162 (discriminator 4))
[ 2566.222254] ? __pfx_md_thread (drivers/md/md.c:7862) md_mod
[ 2566.222273] md_thread (drivers/md/md.c:7898) md_mod
[ 2566.222293] ? __pfx_autoremove_wake_function (kernel/sched/wait.c:418)
[ 2566.222299] kthread (kernel/kthread.c:379)
[ 2566.222304] ? __pfx_kthread (kernel/kthread.c:332)
[ 2566.222309] ret_from_fork (arch/x86/entry/entry_64.S:314)
$ scripts/decode_stacktrace.sh vmlinux < /tmp/trace2
[ 2566.436288] ? do_exit (kernel/exit.c:818 (discriminator 1))
[ 2566.436292] ? __warn (kernel/panic.c:673)
[ 2566.436298] ? do_exit (kernel/exit.c:818 (discriminator 1))
[ 2566.436301] ? report_bug (lib/bug.c:180 lib/bug.c:219)
[ 2566.436308] ? handle_bug (arch/x86/kernel/traps.c:303)
[ 2566.436312] ? exc_invalid_op (arch/x86/kernel/traps.c:345 (discriminator 1))
[ 2566.436316] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568)
[ 2566.436321] ? do_exit (kernel/exit.c:818 (discriminator 1))
[ 2566.436325] ? do_exit (kernel/exit.c:818 (discriminator 1))
[ 2566.436329] make_task_dead (kernel/exit.c:972)
[ 2566.436333] rewind_stack_and_make_dead (??:?)
Is that what you are looking for?
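For a single location, such as the RIP line of the first oops, the kernel tree also ships `scripts/faddr2line`, which accepts a `function+offset/size` string. This is a minimal sketch, assuming it is run from the same unmodified source tree and that `vmlinux` was built with debug info; the helper name `rip_symbol` is hypothetical.

```shell
# Hypothetical helper: pull the 'function+offset/size' token out of an
# oops 'RIP:' line read from stdin.
rip_symbol() {
    sed -n 's/.*RIP: [0-9a-f]*:\([^ ]*+0x[0-9a-f]*\/0x[0-9a-f]*\).*/\1/p'
}

sym=$(echo '[ 2566.222132] RIP: 0010:submit_bio_noacct+0x182/0x5c0' | rip_symbol)
echo "$sym"

# Only attempt the lookup when run from a kernel tree with a built vmlinux.
if [ -x scripts/faddr2line ] && [ -f vmlinux ]; then
    scripts/faddr2line vmlinux "$sym"
fi
```

The guard around the last command is there so the snippet does nothing harmful outside a kernel tree; inside one, the lookup should report the source location for that offset.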
Thanks,
Corey
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: NULL pointer dereference with MD write-back journal, where journal device is RAID-1
2023-08-07 2:09 ` Corey Hickey
@ 2023-08-07 2:15 ` Yu Kuai
2023-08-07 2:46 ` Yu Kuai
2023-08-07 3:22 ` Corey Hickey
0 siblings, 2 replies; 9+ messages in thread
From: Yu Kuai @ 2023-08-07 2:15 UTC (permalink / raw)
To: Corey Hickey, Yu Kuai, 'Linux RAID', yukuai (C)
Hi,
On 2023/08/07 10:09, Corey Hickey wrote:
> On 2023-08-06 18:02, Yu Kuai wrote:
>>> Here are the errors reported by the kernel:
>>> --------------------------------------------------------------------
>>> [ 2566.222104] BUG: kernel NULL pointer dereference, address:
>>> 0000000000000157
>>> [ 2566.222111] #PF: supervisor read access in kernel mode
>>> [ 2566.222114] #PF: error_code(0x0000) - not-present page
>>> [ 2566.222117] PGD 0 P4D 0
>>> [ 2566.222121] Oops: 0000 [#1] PREEMPT SMP NOPTI
>>> [ 2566.222125] CPU: 1 PID: 5415 Comm: md10_raid5 Not tainted 6.4.8 #3
>>> [ 2566.222129] Hardware name: ASUS System Product Name/ROG CROSSHAIR VII
>>> HERO (WI-FI), BIOS 4603 09/13/2021
>>> [ 2566.222132] RIP: 0010:submit_bio_noacct+0x182/0x5c0
>>
>> Can you provide addr2line result? This will be helpful to locate the
>> problem.
>
> I have not done this before; I struggled a bit until I found this:
> https://lwn.net/Articles/592724/
>
> These are run within the kernel source tree, which I have not
> modified since the original compilation.
>
>
> $ scripts/decode_stacktrace.sh vmlinux < /tmp/trace1
> [ 2566.222171] ? __die (arch/x86/kernel/dumpstack.c:421
> arch/x86/kernel/dumpstack.c:434)
> [ 2566.222176] ? page_fault_oops (arch/x86/mm/fault.c:707)
> [ 2566.222180] ? update_load_avg (kernel/sched/fair.c:3920
> kernel/sched/fair.c:4255)
> [ 2566.222185] ? exc_page_fault (./arch/x86/include/asm/paravirt.h:695
> arch/x86/mm/fault.c:1494 arch/x86/mm/fault.c:1542)
> [ 2566.222190] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:570)
> [ 2566.222196] ? submit_bio_noacct (block/blk-throttle.h:198
> block/blk-throttle.h:210 block/blk-core.c:800)
> [ 2566.222201] handle_active_stripes.isra.0 (drivers/md/raid5.c:6709
> (discriminator 1)) raid456
> [ 2566.222220] raid5d (drivers/md/raid5.c:6821) raid456
> [ 2566.222234] ? __schedule (kernel/sched/core.c:6677)
> [ 2566.222240] ? _raw_spin_lock_irqsave
> (./arch/x86/include/asm/atomic.h:202 (discriminator 4)
> ./include/linux/atomic/atomic-instrumented.h:543 (discriminator 4)
> ./include/asm-generic/qspinlock.h:111 (discriminator 4)
> ./include/linux/spinlock.h:186 (discriminator 4)
> ./include/linux/spinlock_api_smp.h:111 (discriminator 4)
> kernel/locking/spinlock.c:162 (discriminator 4))
> [ 2566.222245] ? preempt_count_add (./include/linux/ftrace.h:976
> kernel/sched/core.c:5793 kernel/sched/core.c:5790 kernel/sched/core.c:5818)
> [ 2566.222248] ? _raw_spin_lock_irqsave
> (./arch/x86/include/asm/atomic.h:202 (discriminator 4)
> ./include/linux/atomic/atomic-instrumented.h:543 (discriminator 4)
> ./include/asm-generic/qspinlock.h:111 (discriminator 4)
> ./include/linux/spinlock.h:186 (discriminator 4)
> ./include/linux/spinlock_api_smp.h:111 (discriminator 4)
> kernel/locking/spinlock.c:162 (discriminator 4))
> [ 2566.222254] ? __pfx_md_thread (drivers/md/md.c:7862) md_mod
> [ 2566.222273] md_thread (drivers/md/md.c:7898) md_mod
> [ 2566.222293] ? __pfx_autoremove_wake_function (kernel/sched/wait.c:418)
> [ 2566.222299] kthread (kernel/kthread.c:379)
> [ 2566.222304] ? __pfx_kthread (kernel/kthread.c:332)
> [ 2566.222309] ret_from_fork (arch/x86/entry/entry_64.S:314)
>
>
>
> $ scripts/decode_stacktrace.sh vmlinux < /tmp/trace2
> [ 2566.436288] ? do_exit (kernel/exit.c:818 (discriminator 1))
> [ 2566.436292] ? __warn (kernel/panic.c:673)
> [ 2566.436298] ? do_exit (kernel/exit.c:818 (discriminator 1))
> [ 2566.436301] ? report_bug (lib/bug.c:180 lib/bug.c:219)
> [ 2566.436308] ? handle_bug (arch/x86/kernel/traps.c:303)
> [ 2566.436312] ? exc_invalid_op (arch/x86/kernel/traps.c:345
> (discriminator 1))
> [ 2566.436316] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568)
> [ 2566.436321] ? do_exit (kernel/exit.c:818 (discriminator 1))
> [ 2566.436325] ? do_exit (kernel/exit.c:818 (discriminator 1))
> [ 2566.436329] make_task_dead (kernel/exit.c:972)
> [ 2566.436333] rewind_stack_and_make_dead (??:?)
>
>
>
> Is that what you are looking for?
Yes, and can you tell me which commit you are testing?
Thanks,
Kuai
>
> Thanks,
> Corey
> .
>
* Re: NULL pointer dereference with MD write-back journal, where journal device is RAID-1
2023-08-07 2:15 ` Yu Kuai
@ 2023-08-07 2:46 ` Yu Kuai
2023-08-07 4:51 ` Corey Hickey
2023-08-07 3:22 ` Corey Hickey
1 sibling, 1 reply; 9+ messages in thread
From: Yu Kuai @ 2023-08-07 2:46 UTC (permalink / raw)
To: Yu Kuai, Corey Hickey, 'Linux RAID', yukuai (C)
Cc: yangerkun@huawei.com
Hi,
On 2023/08/07 10:15, Yu Kuai wrote:
> Hi,
>
> On 2023/08/07 10:09, Corey Hickey wrote:
>> On 2023-08-06 18:02, Yu Kuai wrote:
>>>> Here are the errors reported by the kernel:
>>>> --------------------------------------------------------------------
>>>> [ 2566.222104] BUG: kernel NULL pointer dereference, address:
>>>> 0000000000000157
>>>> [ 2566.222111] #PF: supervisor read access in kernel mode
>>>> [ 2566.222114] #PF: error_code(0x0000) - not-present page
>>>> [ 2566.222117] PGD 0 P4D 0
>>>> [ 2566.222121] Oops: 0000 [#1] PREEMPT SMP NOPTI
>>>> [ 2566.222125] CPU: 1 PID: 5415 Comm: md10_raid5 Not tainted 6.4.8 #3
>>>> [ 2566.222129] Hardware name: ASUS System Product Name/ROG CROSSHAIR
>>>> VII
>>>> HERO (WI-FI), BIOS 4603 09/13/2021
>>>> [ 2566.222132] RIP: 0010:submit_bio_noacct+0x182/0x5c0
>>>
>>> Can you provide the addr2line result? It will help locate the
>>> problem.
>>
>> I have not done this before; I struggled a bit until I found this:
>> https://lwn.net/Articles/592724/
>>
>> These are run within the kernel source tree, which I have not
>> modified since the original compilation.
>>
>>
>> $ scripts/decode_stacktrace.sh vmlinux < /tmp/trace1
>> [ 2566.222171] ? __die (arch/x86/kernel/dumpstack.c:421
>> arch/x86/kernel/dumpstack.c:434)
>> [ 2566.222176] ? page_fault_oops (arch/x86/mm/fault.c:707)
>> [ 2566.222180] ? update_load_avg (kernel/sched/fair.c:3920
>> kernel/sched/fair.c:4255)
>> [ 2566.222185] ? exc_page_fault (./arch/x86/include/asm/paravirt.h:695
>> arch/x86/mm/fault.c:1494 arch/x86/mm/fault.c:1542)
>> [ 2566.222190] ? asm_exc_page_fault
>> (./arch/x86/include/asm/idtentry.h:570)
>> [ 2566.222196] ? submit_bio_noacct (block/blk-throttle.h:198
>> block/blk-throttle.h:210 block/blk-core.c:800)
>> [ 2566.222201] handle_active_stripes.isra.0 (drivers/md/raid5.c:6709
>> (discriminator 1)) raid456
I'm not sure yet where this I/O comes from; however, based on your
test, I think it comes from:

raid5d
 handle_active_stripes
  r5l_flush_stripe_to_raid
   submit_bio

After a quick look, I found a problem here:

t1: submit flush io
raid5d
 handle_active_stripes
  r5l_flush_stripe_to_raid
   bio_init
   submit_bio
   // io1

t2: io1 is done
r5l_log_flush_endio
 list_splice_tail_init
 // new flush io can be dispatched

t3: submit new flush io
...
r5l_flush_stripe_to_raid
 bio_init
                        bio_uninit
                        // clear bio->bi_blkg
 submit_bio
 // null-ptr-deref

This is definitely a problem; however, I'm not sure whether it is what
you are hitting. Can you test the following patch?
diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 51a68fbc241c..a85ea19fcf14 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -1266,9 +1266,8 @@ static void r5l_log_flush_endio(struct bio *bio)
list_for_each_entry(io, &log->flushing_ios, log_sibling)
r5l_io_run_stripes(io);
list_splice_tail_init(&log->flushing_ios, &log->finished_ios);
- spin_unlock_irqrestore(&log->io_list_lock, flags);
-
bio_uninit(bio);
+ spin_unlock_irqrestore(&log->io_list_lock, flags);
}
/*
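
To make the ordering in the t1/t2/t3 timeline above concrete, here is a
minimal userspace sketch. It is not kernel code: struct fake_bio, the
fake_* helpers and the replay_* functions are hypothetical names made up
for illustration, and it assumes (as the timeline implies) that the same
bio is reused for every flush and that a new flush cannot be dispatched
until the lock is dropped. Each function replays one interleaving
sequentially and reports whether the final submit would see a valid
blkg pointer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the log's single, reused flush bio. */
struct fake_bio {
	const void *blkg;	/* models bio->bi_blkg; NULL once uninitialized */
};

static const int fake_blkg = 1;	/* any non-NULL address will do */

static void fake_bio_init(struct fake_bio *bio)
{
	bio->blkg = &fake_blkg;
}

static void fake_bio_uninit(struct fake_bio *bio)
{
	bio->blkg = NULL;
}

/* Models the pointer that the submit path dereferences. */
static bool fake_submit_is_safe(const struct fake_bio *bio)
{
	return bio->blkg != NULL;
}

/* Ordering before the patch: unlock first, uninit afterwards. */
static bool replay_unlock_then_uninit(void)
{
	struct fake_bio bio;

	fake_bio_init(&bio);		/* t1: first flush submitted */
	/* t2: endio drops the lock, so a new flush may be dispatched */
	fake_bio_init(&bio);		/* t3: new flush reuses the bio */
	fake_bio_uninit(&bio);		/* t2: late uninit clears blkg */
	return fake_submit_is_safe(&bio);	/* t3: submit sees NULL */
}

/* Ordering with the patch: uninit while the lock is still held. */
static bool replay_uninit_then_unlock(void)
{
	struct fake_bio bio;

	fake_bio_init(&bio);		/* t1: first flush submitted */
	fake_bio_uninit(&bio);		/* t2: uninit under the lock */
	/* t2: lock dropped only now (modeled assumption: t3 waits for it) */
	fake_bio_init(&bio);		/* t3: new flush reuses the bio */
	return fake_submit_is_safe(&bio);	/* t3: blkg is valid */
}
```

Calling both helpers from a small main() shows the difference: the
first replay ends with a NULL blkg at submit time, the second does not.
The real race depends on timing between the completion path and raid5d,
which this sequential replay deliberately leaves out.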
Thanks,
Kuai
>> [ 2566.222220] raid5d (drivers/md/raid5.c:6821) raid456
>> [ 2566.222234] ? __schedule (kernel/sched/core.c:6677)
>> [ 2566.222240] ? _raw_spin_lock_irqsave
>> (./arch/x86/include/asm/atomic.h:202 (discriminator 4)
>> ./include/linux/atomic/atomic-instrumented.h:543 (discriminator 4)
>> ./include/asm-generic/qspinlock.h:111 (discriminator 4)
>> ./include/linux/spinlock.h:186 (discriminator 4)
>> ./include/linux/spinlock_api_smp.h:111 (discriminator 4)
>> kernel/locking/spinlock.c:162 (discriminator 4))
>> [ 2566.222245] ? preempt_count_add (./include/linux/ftrace.h:976
>> kernel/sched/core.c:5793 kernel/sched/core.c:5790
>> kernel/sched/core.c:5818)
>> [ 2566.222248] ? _raw_spin_lock_irqsave
>> (./arch/x86/include/asm/atomic.h:202 (discriminator 4)
>> ./include/linux/atomic/atomic-instrumented.h:543 (discriminator 4)
>> ./include/asm-generic/qspinlock.h:111 (discriminator 4)
>> ./include/linux/spinlock.h:186 (discriminator 4)
>> ./include/linux/spinlock_api_smp.h:111 (discriminator 4)
>> kernel/locking/spinlock.c:162 (discriminator 4))
>> [ 2566.222254] ? __pfx_md_thread (drivers/md/md.c:7862) md_mod
>> [ 2566.222273] md_thread (drivers/md/md.c:7898) md_mod
>> [ 2566.222293] ? __pfx_autoremove_wake_function (kernel/sched/wait.c:418)
>> [ 2566.222299] kthread (kernel/kthread.c:379)
>> [ 2566.222304] ? __pfx_kthread (kernel/kthread.c:332)
>> [ 2566.222309] ret_from_fork (arch/x86/entry/entry_64.S:314)
>>
>>
>>
>> $ scripts/decode_stacktrace.sh vmlinux < /tmp/trace2
>> [ 2566.436288] ? do_exit (kernel/exit.c:818 (discriminator 1))
>> [ 2566.436292] ? __warn (kernel/panic.c:673)
>> [ 2566.436298] ? do_exit (kernel/exit.c:818 (discriminator 1))
>> [ 2566.436301] ? report_bug (lib/bug.c:180 lib/bug.c:219)
>> [ 2566.436308] ? handle_bug (arch/x86/kernel/traps.c:303)
>> [ 2566.436312] ? exc_invalid_op (arch/x86/kernel/traps.c:345
>> (discriminator 1))
>> [ 2566.436316] ? asm_exc_invalid_op
>> (./arch/x86/include/asm/idtentry.h:568)
>> [ 2566.436321] ? do_exit (kernel/exit.c:818 (discriminator 1))
>> [ 2566.436325] ? do_exit (kernel/exit.c:818 (discriminator 1))
>> [ 2566.436329] make_task_dead (kernel/exit.c:972)
>> [ 2566.436333] rewind_stack_and_make_dead (??:?)
>>
>>
>>
>> Is that what you are looking for?
>
> Yes, and can you tell me which commit you are testing?
>
> Thanks,
> Kuai
>>
>> Thanks,
>> Corey
>> .
>>
>
> .
>
* Re: NULL pointer dereference with MD write-back journal, where journal device is RAID-1
2023-08-07 2:15 ` Yu Kuai
2023-08-07 2:46 ` Yu Kuai
@ 2023-08-07 3:22 ` Corey Hickey
1 sibling, 0 replies; 9+ messages in thread
From: Corey Hickey @ 2023-08-07 3:22 UTC (permalink / raw)
To: Yu Kuai, 'Linux RAID', yukuai (C)
On 2023-08-06 19:15, Yu Kuai wrote:
>> Is that what you are looking for?
>
> Yes, and can you tell me which commit you are testing?
This is 6.4.8 from the release tarball:
https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.4.8.tar.xz
Thank you,
Corey
* Re: NULL pointer dereference with MD write-back journal, where journal device is RAID-1
2023-08-07 2:46 ` Yu Kuai
@ 2023-08-07 4:51 ` Corey Hickey
2023-08-07 6:08 ` Yu Kuai
0 siblings, 1 reply; 9+ messages in thread
From: Corey Hickey @ 2023-08-07 4:51 UTC (permalink / raw)
To: Yu Kuai, 'Linux RAID', yukuai (C); +Cc: yangerkun@huawei.com
On 2023-08-06 19:46, Yu Kuai wrote:
> can you test the following patch?
>
> diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
> index 51a68fbc241c..a85ea19fcf14 100644
> --- a/drivers/md/raid5-cache.c
> +++ b/drivers/md/raid5-cache.c
> @@ -1266,9 +1266,8 @@ static void r5l_log_flush_endio(struct bio *bio)
> list_for_each_entry(io, &log->flushing_ios, log_sibling)
> r5l_io_run_stripes(io);
> list_splice_tail_init(&log->flushing_ios, &log->finished_ios);
> - spin_unlock_irqrestore(&log->io_list_lock, flags);
> -
> bio_uninit(bio);
> + spin_unlock_irqrestore(&log->io_list_lock, flags);
> }
>
> /*
My patch utility didn't like it for some reason, but I applied the
changes manually to get what I think is the same thing. I'll paste the
diff here just in case.
--- drivers/md/raid5-cache.c.orig 2023-08-06 20:26:10.386665042 -0700
+++ drivers/md/raid5-cache.c 2023-08-06 20:31:33.290688590 -0700
@@ -1265,9 +1265,8 @@
list_for_each_entry(io, &log->flushing_ios, log_sibling)
r5l_io_run_stripes(io);
list_splice_tail_init(&log->flushing_ios, &log->finished_ios);
- spin_unlock_irqrestore(&log->io_list_lock, flags);
-
bio_uninit(bio);
+ spin_unlock_irqrestore(&log->io_list_lock, flags);
}
/*
With a new kernel including this change, I can no longer reproduce
the problem; 12 successful runs seem pretty definitive given the
failure rate I was seeing before.
This was on a newly-recreated RAID-5, and I double-checked that I did
indeed re-enable write-back.
Thank you for this! I wasn't expecting such a fast response, especially
on the weekend.
-Corey
* Re: NULL pointer dereference with MD write-back journal, where journal device is RAID-1
2023-08-07 4:51 ` Corey Hickey
@ 2023-08-07 6:08 ` Yu Kuai
2023-08-08 0:03 ` Corey Hickey
0 siblings, 1 reply; 9+ messages in thread
From: Yu Kuai @ 2023-08-07 6:08 UTC (permalink / raw)
To: Corey Hickey, Yu Kuai, 'Linux RAID'
Cc: yangerkun@huawei.com, yukuai (C)
Hi,
On 2023/08/07 12:51, Corey Hickey wrote:
> On 2023-08-06 19:46, Yu Kuai wrote:
>> can you test the following patch?
>>
>> diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
>> index 51a68fbc241c..a85ea19fcf14 100644
>> --- a/drivers/md/raid5-cache.c
>> +++ b/drivers/md/raid5-cache.c
>> @@ -1266,9 +1266,8 @@ static void r5l_log_flush_endio(struct bio *bio)
>> list_for_each_entry(io, &log->flushing_ios, log_sibling)
>> r5l_io_run_stripes(io);
>> list_splice_tail_init(&log->flushing_ios, &log->finished_ios);
>> - spin_unlock_irqrestore(&log->io_list_lock, flags);
>> -
>> bio_uninit(bio);
>> + spin_unlock_irqrestore(&log->io_list_lock, flags);
>> }
>>
>> /*
>
> My patch utility didn't like it for some reason, but I applied the
> changes manually to get what I think is the same thing. I'll paste the
> diff here just in case.
>
> --- drivers/md/raid5-cache.c.orig 2023-08-06 20:26:10.386665042 -0700
> +++ drivers/md/raid5-cache.c 2023-08-06 20:31:33.290688590 -0700
> @@ -1265,9 +1265,8 @@
> list_for_each_entry(io, &log->flushing_ios, log_sibling)
> r5l_io_run_stripes(io);
> list_splice_tail_init(&log->flushing_ios, &log->finished_ios);
> - spin_unlock_irqrestore(&log->io_list_lock, flags);
> -
> bio_uninit(bio);
> + spin_unlock_irqrestore(&log->io_list_lock, flags);
> }
Yes, this is what I expected.
>
> /*
>
>
> With a new kernel including this change, I can no longer reproduce
> the problem; 12 successful runs seem pretty definitive given the
> failure rate I was seeing before.
>
> This was on a newly-recreated RAID-5, and I double-checked that I did
> indeed re-enable write-back.
Thanks for testing; I'll send a patch with your Tested-by tag soon.
>
> Thank you for this! I wasn't expecting such a fast response, especially
> on the weekend.
It's Monday for us, actually 😄
Thanks,
Kuai
>
> -Corey
> .
>
* Re: NULL pointer dereference with MD write-back journal, where journal device is RAID-1
2023-08-07 6:08 ` Yu Kuai
@ 2023-08-08 0:03 ` Corey Hickey
0 siblings, 0 replies; 9+ messages in thread
From: Corey Hickey @ 2023-08-08 0:03 UTC (permalink / raw)
To: Yu Kuai, 'Linux RAID'; +Cc: yangerkun@huawei.com, yukuai (C)
On 2023-08-06 23:08, Yu Kuai wrote:
>> Thank you for this! I wasn't expecting such a fast response, especially
>> on the weekend.
>
> It's Monday for us, actually 😄
Oh, I should have realized. Thank you all the same.
-Corey