linux-raid.vger.kernel.org archive mirror
* NULL pointer dereference with MD write-back journal, where journal device is RAID-1
@ 2023-08-06 22:48 Corey Hickey
  2023-08-07  1:02 ` Yu Kuai
  0 siblings, 1 reply; 9+ messages in thread
From: Corey Hickey @ 2023-08-06 22:48 UTC (permalink / raw)
  To: 'Linux RAID'

Hello,

I have encountered a reproducible NULL pointer dereference when using
the write-back journal feature for RAID-5. This _seems_ to happen
only when the journal device is itself a RAID-1.

https://docs.kernel.org/driver-api/md/raid5-cache.html

This report supersedes a report I sent to Debian earlier:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1043078

Steps to reproduce, including example commands:

1. Create a RAID-1 for the journal device.
$ sudo mdadm --create /dev/md101 -n 2 -l 1 /dev/disk/by-id/ata-Samsung_SSD_850_PRO_256GB_S251NX0H60631*

2. Create a RAID-5 with the journal included. I'm using '-z 10G' for
testing in order to reduce the initial sync time.
$ sudo mdadm --create /dev/md10 -n 3 -l 5 -z 10G --write-journal /dev/md101 /dev/disk/by-id/ata-ST32000645NS_Z1K0*

3. Enable write-back (completes once re-sync is finished).
$ until echo write-back | sudo tee /sys/block/md10/md/journal_mode ; do sleep 5 ; done

4. Write to the disk (may take a few attempts).
$ sudo dd if=/dev/zero of=/dev/md10 iflag=fullblock bs=1M count=10240

Notes:
* The bug does not always manifest immediately, but for me it nearly
   always manifests on the first or second 'dd' run.
* The bug is not limited to buffered I/O: writes via 'oflag=direct'
   can cause the bug as well.
* I was not able to reproduce the bug on 10 attempts when I used a
   single non-RAID SSD as the journal.
* The bug can manifest while the journal RAID-1 is resyncing or not;
   the resync does not seem relevant.
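
Steps 3 and 4 together with the notes above can be sketched as a retry
loop. This is only an illustration: the function names oops_seen and
repro_loop are made up here, and the write and log commands are passed
in as strings so the loop can be tried with harmless stand-ins first.

```shell
# oops_seen: succeed if the kernel log text on stdin contains the
# oops signature from this report.
oops_seen() {
    grep -q 'BUG: kernel NULL pointer dereference'
}

# repro_loop MAX WRITE_CMD LOG_CMD
# Run WRITE_CMD up to MAX times and check LOG_CMD output after each run.
# Returns 0 as soon as the oops signature shows up, 1 otherwise.
repro_loop() {
    max=$1
    write_cmd=$2
    log_cmd=$3
    run=1
    while [ "$run" -le "$max" ]; do
        sh -c "$write_cmd" >/dev/null 2>&1
        if sh -c "$log_cmd" | oops_seen; then
            echo "oops seen after run $run"
            return 0
        fi
        run=$((run + 1))
    done
    echo "no oops in $max runs"
    return 1
}

# Intended use (as root, with the arrays from steps 1 to 3 in place):
#   repro_loop 10 'dd if=/dev/zero of=/dev/md10 iflag=fullblock bs=1M count=10240' 'dmesg'
```

Once the md10_raid5 thread has died, a 'dd' that is still in flight may
never return, so in practice the log check may have to be run from a
second terminal.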

My SSDs are attached to an onboard SATA controller:

$ lspci | grep 06:00
06:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9235 PCIe 2.0 x2 4-port SATA 6 Gb/s Controller (rev 11)

My hard disks are attached to an external SATA-->USB enclosure,
but I think this is not relevant--I had the same problem with hard disks
attached to internal SATA controllers in earlier tests.

I'm using Debian Sid on Linux 6.4.8. The kernel is compiled locally
and installed via:
--------------------------------------------------------------------
wget https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.4.8.tar.xz
tar xf linux-6.4.8.tar.xz
cd linux-6.4.8
cp -p "/boot/config-$(uname -r)" .config
make oldconfig # and accept all defaults
make -j 12 bindeb-pkg
sudo dpkg -i linux-image-6.4.8_6.4.8-3_amd64.deb
--------------------------------------------------------------------

Here are the errors reported by the kernel:
--------------------------------------------------------------------
[ 2566.222104] BUG: kernel NULL pointer dereference, address: 0000000000000157
[ 2566.222111] #PF: supervisor read access in kernel mode
[ 2566.222114] #PF: error_code(0x0000) - not-present page
[ 2566.222117] PGD 0 P4D 0
[ 2566.222121] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 2566.222125] CPU: 1 PID: 5415 Comm: md10_raid5 Not tainted 6.4.8 #3
[ 2566.222129] Hardware name: ASUS System Product Name/ROG CROSSHAIR VII HERO (WI-FI), BIOS 4603 09/13/2021
[ 2566.222132] RIP: 0010:submit_bio_noacct+0x182/0x5c0
[ 2566.222139] Code: ff ff ff 4c 8b 63 48 4d 85 e4 74 0f 48 63 05 e5 ef 41 01 4d 8b a4 c4 d0 00 00 00 41 89 ed 41 83 e5 01 0f 1f 44 00 00 49 63 c5 <41> 80 bc 04 56 01 00 00 00 0f 85 fc 00 00 00 41 80 bc 04 54 01 00
[ 2566.222142] RSP: 0018:ffffa41d46e5bd00 EFLAGS: 00010202
[ 2566.222146] RAX: 0000000000000001 RBX: ffff93275b6668b8 RCX: 0000000000000000
[ 2566.222148] RDX: ffff932741380640 RSI: ffffffffb323f686 RDI: 00000000ffffffff
[ 2566.222151] RBP: 0000000000040001 R08: 0000000000000000 R09: 0000000000000000
[ 2566.222153] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[ 2566.222155] R13: 0000000000000001 R14: 000000001dcb2a80 R15: 0000000000000000
[ 2566.222157] FS:  0000000000000000(0000) GS:ffff93363ea40000(0000) knlGS:0000000000000000
[ 2566.222160] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2566.222162] CR2: 0000000000000157 CR3: 0000000140b8e000 CR4: 00000000003506e0
[ 2566.222165] Call Trace:
[ 2566.222167]  <TASK>
[ 2566.222171]  ? __die+0x23/0x70
[ 2566.222176]  ? page_fault_oops+0x17d/0x4c0
[ 2566.222180]  ? update_load_avg+0x7e/0x780
[ 2566.222185]  ? exc_page_fault+0x7f/0x180
[ 2566.222190]  ? asm_exc_page_fault+0x26/0x30
[ 2566.222196]  ? submit_bio_noacct+0x182/0x5c0
[ 2566.222201]  handle_active_stripes.isra.0+0x377/0x550 [raid456]
[ 2566.222220]  raid5d+0x487/0x750 [raid456]
[ 2566.222234]  ? __schedule+0x3e7/0xb80
[ 2566.222240]  ? _raw_spin_lock_irqsave+0x27/0x60
[ 2566.222245]  ? preempt_count_add+0x6e/0xa0
[ 2566.222248]  ? _raw_spin_lock_irqsave+0x27/0x60
[ 2566.222254]  ? __pfx_md_thread+0x10/0x10 [md_mod]
[ 2566.222273]  md_thread+0xae/0x190 [md_mod]
[ 2566.222293]  ? __pfx_autoremove_wake_function+0x10/0x10
[ 2566.222299]  kthread+0xf7/0x130
[ 2566.222304]  ? __pfx_kthread+0x10/0x10
[ 2566.222309]  ret_from_fork+0x2c/0x50
[ 2566.222316]  </TASK>
[ 2566.222318] Modules linked in: twofish_generic twofish_avx_x86_64 twofish_x86_64_3way twofish_x86_64 twofish_common essiv authenc dm_crypt cpufreq_conservative cpufreq_userspace cpufreq_powersave rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache netfs bridge stp llc binfmt_misc amdgpu eeepc_wmi intel_rapl_msr asus_wmi intel_rapl_common battery edac_mce_amd hid_pl sparse_keymap platform_profile hid_dr snd_hda_codec_realtek sp5100_tco drm_buddy rfkill ff_memless gpu_sched drm_suballoc_helper kvm_amd snd_hda_codec_generic drm_display_helper ledtrig_audio snd_hda_codec_hdmi cec rc_core drm_ttm_helper kvm snd_hda_intel snd_intel_dspcfg ttm snd_intel_sdw_acpi asus_wmi_sensors irqbypass drm_kms_helper snd_hda_codec rapl video acpi_cpufreq snd_hda_core mxm_wmi pcspkr wmi_bmof k10temp watchdog ccp snd_hwdep rng_core button sg cpufreq_ondemand lm90 snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore evdev nfsd psmouse i2c_dev sidewinder gameport joydev auth_rpcgss parport_pc nfs_acl ppdev
[ 2566.222390]  lockd lp grace parport drm fuse loop efi_pstore dm_mod configfs sunrpc ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 btrfs efivarfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic multipath linear hid_generic raid0 bcache raid1 md_mod uas usb_storage sd_mod usbhid crc32_pclmul crc32c_intel t10_pi hid crc64_rocksoft_generic crc64_rocksoft crc_t10dif crct10dif_generic crct10dif_pclmul crc64 ghash_clmulni_intel crct10dif_common sha512_ssse3 sha512_generic ahci xhci_pci libahci xhci_hcd aesni_intel crypto_simd libata cryptd usbcore igb e1000e i2c_piix4 scsi_mod i2c_algo_bit dca usb_common scsi_common gpio_amdpt wmi gpio_generic
[ 2566.222451] CR2: 0000000000000157
[ 2566.222454] ---[ end trace 0000000000000000 ]---
[ 2566.436029] RIP: 0010:submit_bio_noacct+0x182/0x5c0
[ 2566.436038] Code: ff ff ff 4c 8b 63 48 4d 85 e4 74 0f 48 63 05 e5 ef 41 01 4d 8b a4 c4 d0 00 00 00 41 89 ed 41 83 e5 01 0f 1f 44 00 00 49 63 c5 <41> 80 bc 04 56 01 00 00 00 0f 85 fc 00 00 00 41 80 bc 04 54 01 00
[ 2566.436041] RSP: 0018:ffffa41d46e5bd00 EFLAGS: 00010202
[ 2566.436044] RAX: 0000000000000001 RBX: ffff93275b6668b8 RCX: 0000000000000000
[ 2566.436047] RDX: ffff932741380640 RSI: ffffffffb323f686 RDI: 00000000ffffffff
[ 2566.436049] RBP: 0000000000040001 R08: 0000000000000000 R09: 0000000000000000
[ 2566.436051] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[ 2566.436053] R13: 0000000000000001 R14: 000000001dcb2a80 R15: 0000000000000000
[ 2566.436055] FS:  0000000000000000(0000) GS:ffff93363ea40000(0000) knlGS:0000000000000000
[ 2566.436058] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2566.436060] CR2: 0000000000000157 CR3: 0000000140b8e000 CR4: 00000000003506e0
[ 2566.436063] note: md10_raid5[5415] exited with irqs disabled
[ 2566.436109] ------------[ cut here ]------------
[ 2566.436112] WARNING: CPU: 1 PID: 5415 at kernel/exit.c:818 do_exit+0x8ef/0xb20
[ 2566.436119] Modules linked in: twofish_generic twofish_avx_x86_64 twofish_x86_64_3way twofish_x86_64 twofish_common essiv authenc dm_crypt cpufreq_conservative cpufreq_userspace cpufreq_powersave rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache netfs bridge stp llc binfmt_misc amdgpu eeepc_wmi intel_rapl_msr asus_wmi intel_rapl_common battery edac_mce_amd hid_pl sparse_keymap platform_profile hid_dr snd_hda_codec_realtek sp5100_tco drm_buddy rfkill ff_memless gpu_sched drm_suballoc_helper kvm_amd snd_hda_codec_generic drm_display_helper ledtrig_audio snd_hda_codec_hdmi cec rc_core drm_ttm_helper kvm snd_hda_intel snd_intel_dspcfg ttm snd_intel_sdw_acpi asus_wmi_sensors irqbypass drm_kms_helper snd_hda_codec rapl video acpi_cpufreq snd_hda_core mxm_wmi pcspkr wmi_bmof k10temp watchdog ccp snd_hwdep rng_core button sg cpufreq_ondemand lm90 snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore evdev nfsd psmouse i2c_dev sidewinder gameport joydev auth_rpcgss parport_pc nfs_acl ppdev
[ 2566.436188]  lockd lp grace parport drm fuse loop efi_pstore dm_mod configfs sunrpc ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 btrfs efivarfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic multipath linear hid_generic raid0 bcache raid1 md_mod uas usb_storage sd_mod usbhid crc32_pclmul crc32c_intel t10_pi hid crc64_rocksoft_generic crc64_rocksoft crc_t10dif crct10dif_generic crct10dif_pclmul crc64 ghash_clmulni_intel crct10dif_common sha512_ssse3 sha512_generic ahci xhci_pci libahci xhci_hcd aesni_intel crypto_simd libata cryptd usbcore igb e1000e i2c_piix4 scsi_mod i2c_algo_bit dca usb_common scsi_common gpio_amdpt wmi gpio_generic
[ 2566.436250] CPU: 1 PID: 5415 Comm: md10_raid5 Tainted: G      D            6.4.8 #3
[ 2566.436254] Hardware name: ASUS System Product Name/ROG CROSSHAIR VII HERO (WI-FI), BIOS 4603 09/13/2021
[ 2566.436256] RIP: 0010:do_exit+0x8ef/0xb20
[ 2566.436260] Code: e9 12 ff ff ff 48 8b bb 98 09 00 00 31 f6 e8 88 d9 ff ff e9 a0 fd ff ff 4c 89 e6 bf 05 06 00 00 e8 f6 0b 01 00 e9 59 f8 ff ff <0f> 0b e9 88 f7 ff ff 0f 0b e9 45 f7 ff ff 48 89 df e8 fb e0 11 00
[ 2566.436263] RSP: 0018:ffffa41d46e5bed8 EFLAGS: 00010286
[ 2566.436266] RAX: 0000000000000000 RBX: ffff9327df5a6600 RCX: 0000000000000000
[ 2566.436269] RDX: 0000000000000001 RSI: 0000000000002710 RDI: 00000000ffffffff
[ 2566.436271] RBP: ffff9327c0afb600 R08: 0000000000000000 R09: ffffa41d46e5bde0
[ 2566.436273] R10: 0000000000000003 R11: ffff93363f2f7fe8 R12: 0000000000000009
[ 2566.436275] R13: ffff9327df4deb40 R14: 0000000000000000 R15: 0000000000000000
[ 2566.436277] FS:  0000000000000000(0000) GS:ffff93363ea40000(0000) knlGS:0000000000000000
[ 2566.436280] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2566.436282] CR2: 0000000000000157 CR3: 0000000140b8e000 CR4: 00000000003506e0
[ 2566.436284] Call Trace:
[ 2566.436287]  <TASK>
[ 2566.436288]  ? do_exit+0x8ef/0xb20
[ 2566.436292]  ? __warn+0x81/0x130
[ 2566.436298]  ? do_exit+0x8ef/0xb20
[ 2566.436301]  ? report_bug+0x191/0x1c0
[ 2566.436308]  ? handle_bug+0x3c/0x80
[ 2566.436312]  ? exc_invalid_op+0x17/0x70
[ 2566.436316]  ? asm_exc_invalid_op+0x1a/0x20
[ 2566.436321]  ? do_exit+0x8ef/0xb20
[ 2566.436325]  ? do_exit+0x70/0xb20
[ 2566.436329]  make_task_dead+0x81/0x170
[ 2566.436333]  rewind_stack_and_make_dead+0x17/0x20
[ 2566.436338] RIP: 0000:0x0
[ 2566.436344] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
[ 2566.436346] RSP: 0000:0000000000000000 EFLAGS: 00000000 ORIG_RAX: 0000000000000000
[ 2566.436349] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 2566.436350] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 2566.436352] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 2566.436354] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 2566.436355] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 2566.436359]  </TASK>
[ 2566.436361] ---[ end trace 0000000000000000 ]---
--------------------------------------------------------------------
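
A note on reading the first trace. Decoded by hand (an interpretation,
not part of the kernel output), the bytes at the marked position in the
"Code:" line, 41 80 bc 04 56 01 00 00 00, appear to encode
"cmpb $0x0,0x156(%r12,%rax,1)". With R12 = 0 and RAX = 1 taken from the
register dump, the effective address can be checked against CR2. The
helper name effective_addr is made up for this sketch.

```shell
# Effective address of a "disp(base,index,scale)" memory operand.
# Arguments: displacement, base register value, index register value, scale.
effective_addr() {
    printf '0x%x\n' "$(( $1 + $2 + $3 * $4 ))"
}

# disp = 0x156, base = R12 = 0x0, index = RAX = 0x1, scale = 1
effective_addr 0x156 0x0 0x1 1    # prints 0x157, the value in CR2
```

If the decoding is right, this is a one-byte read at a small offset from
a NULL base pointer. scripts/decodecode in the kernel tree can be used
to check the hand decoding.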

Thank you,
Corey


* Re: NULL pointer dereference with MD write-back journal, where journal device is RAID-1
  2023-08-06 22:48 NULL pointer dereference with MD write-back journal, where journal device is RAID-1 Corey Hickey
@ 2023-08-07  1:02 ` Yu Kuai
  2023-08-07  2:09   ` Corey Hickey
  0 siblings, 1 reply; 9+ messages in thread
From: Yu Kuai @ 2023-08-07  1:02 UTC (permalink / raw)
  To: Corey Hickey, 'Linux RAID'; +Cc: yukuai (C)

Hi,

On 2023/08/07 6:48, Corey Hickey wrote:
> Hello,
> 
> I have encountered a reproducible NULL pointer dereference when using
> the write-back journal feature for RAID-5. This _seems_ to happen
> only when the journal device is itself a RAID-1.
> 
> https://docs.kernel.org/driver-api/md/raid5-cache.html
> 
> This report supersedes a report I sent to Debian earlier:
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1043078
> 
> Steps to reproduce, including example commands:
> 
> 1. Create a RAID-1 for the journal device.
> $ sudo mdadm --create /dev/md101 -n 2 -l 1 /dev/disk/by-id/ata-Samsung_SSD_850_PRO_256GB_S251NX0H60631*
> 
> 2. Create a RAID-5 with the journal included. I'm using '-z 10G' for
> testing in order to reduce the initial sync time.
> $ sudo mdadm --create /dev/md10 -n 3 -l 5 -z 10G --write-journal /dev/md101 /dev/disk/by-id/ata-ST32000645NS_Z1K0*
> 
> 3. Enable write-back (completes once re-sync is finished).
> $ until echo write-back | sudo tee /sys/block/md10/md/journal_mode ; do sleep 5 ; done
> 
> 4. Write to the disk (may take a few attempts).
> $ sudo dd if=/dev/zero of=/dev/md10 iflag=fullblock bs=1M count=10240
> 
> Notes:
> * The bug does not always manifest immediately but for me, it nearly
>    always manifests on the first or second 'dd' run.
> * The bug is not limited to buffered I/O: writes via 'oflag=direct'
>    can cause the bug as well.
> * I was not able to reproduce the bug on 10 attempts when I used a
>    single non-RAID SSD as the journal.
> * The bug can manifest while the journal RAID-1 is resyncing or not;
>    the resync does not seem relevant.
> 
> My SSDs are attached to an onboard SATA controller:
> 
> $ lspci | grep 06:00
> 06:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9235 PCIe 2.0 x2 4-port SATA 6 Gb/s Controller (rev 11)
> 
> My hard disks are attached to an external SATA-->USB enclosure,
> but I think this is not relevant--I had the same problem with hard disks
> attached to internal SATA controllers in earlier tests.
> 
> I'm using Debian Sid on Linux 6.4.8. The kernel is compiled locally
> and installed via:
> --------------------------------------------------------------------
> wget https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.4.8.tar.xz
> tar xf linux-6.4.8.tar.xz
> cd linux-6.4.8
> cp -p "/boot/config-$(uname -r)" .config
> make oldconfig # and accept all defaults
> make -j 12 bindeb-pkg
> sudo dpkg -i linux-image-6.4.8_6.4.8-3_amd64.deb
> --------------------------------------------------------------------
> 
> Here are the errors reported by the kernel:
> --------------------------------------------------------------------
> [ 2566.222104] BUG: kernel NULL pointer dereference, address: 0000000000000157
> [ 2566.222111] #PF: supervisor read access in kernel mode
> [ 2566.222114] #PF: error_code(0x0000) - not-present page
> [ 2566.222117] PGD 0 P4D 0
> [ 2566.222121] Oops: 0000 [#1] PREEMPT SMP NOPTI
> [ 2566.222125] CPU: 1 PID: 5415 Comm: md10_raid5 Not tainted 6.4.8 #3
> [ 2566.222129] Hardware name: ASUS System Product Name/ROG CROSSHAIR VII HERO (WI-FI), BIOS 4603 09/13/2021
> [ 2566.222132] RIP: 0010:submit_bio_noacct+0x182/0x5c0

Can you provide addr2line result? This will be helpful to locate the
problem.

Thanks,
Kuai
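
One way to get such a result, sketched under assumptions: the kernel
tree ships scripts/faddr2line, which takes an object file and a
"func+offset/size" string as printed in the oops. The helper name
rip_symbol is made up here, and the command is only printed because
vmlinux has to come from the same build as the running kernel.

```shell
# Extract "symbol+0xoff/0xsize" from the first matching oops "RIP:" line
# read on stdin.
rip_symbol() {
    sed -n 's/.*RIP: [0-9a-f]*:\([A-Za-z0-9_.]*+0x[0-9a-f]*\/0x[0-9a-f]*\).*/\1/p' | head -n 1
}

sym=$(echo '[ 2566.222132] RIP: 0010:submit_bio_noacct+0x182/0x5c0' | rip_symbol)

# Print the command instead of running it; run it from the top of the
# kernel build tree, with debug info available in vmlinux.
echo "scripts/faddr2line vmlinux $sym"
```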




* Re: NULL pointer dereference with MD write-back journal, where journal device is RAID-1
  2023-08-07  1:02 ` Yu Kuai
@ 2023-08-07  2:09   ` Corey Hickey
  2023-08-07  2:15     ` Yu Kuai
  0 siblings, 1 reply; 9+ messages in thread
From: Corey Hickey @ 2023-08-07  2:09 UTC (permalink / raw)
  To: Yu Kuai, 'Linux RAID'; +Cc: yukuai (C)

On 2023-08-06 18:02, Yu Kuai wrote:
>> Here are the errors reported by the kernel:
>> --------------------------------------------------------------------
>> [ 2566.222104] BUG: kernel NULL pointer dereference, address: 0000000000000157
>> [ 2566.222111] #PF: supervisor read access in kernel mode
>> [ 2566.222114] #PF: error_code(0x0000) - not-present page
>> [ 2566.222117] PGD 0 P4D 0
>> [ 2566.222121] Oops: 0000 [#1] PREEMPT SMP NOPTI
>> [ 2566.222125] CPU: 1 PID: 5415 Comm: md10_raid5 Not tainted 6.4.8 #3
>> [ 2566.222129] Hardware name: ASUS System Product Name/ROG CROSSHAIR VII HERO (WI-FI), BIOS 4603 09/13/2021
>> [ 2566.222132] RIP: 0010:submit_bio_noacct+0x182/0x5c0
> 
> Can you provide addr2line result? This will be helpful to locate the
> problem.

I have not done this before; I struggled a bit until I found this:
https://lwn.net/Articles/592724/

These are run within the kernel source tree, which I have not
modified since the original compilation.


$ scripts/decode_stacktrace.sh vmlinux < /tmp/trace1
[ 2566.222171] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
[ 2566.222176] ? page_fault_oops (arch/x86/mm/fault.c:707)
[ 2566.222180] ? update_load_avg (kernel/sched/fair.c:3920 kernel/sched/fair.c:4255)
[ 2566.222185] ? exc_page_fault (./arch/x86/include/asm/paravirt.h:695 arch/x86/mm/fault.c:1494 arch/x86/mm/fault.c:1542)
[ 2566.222190] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:570)
[ 2566.222196] ? submit_bio_noacct (block/blk-throttle.h:198 block/blk-throttle.h:210 block/blk-core.c:800)
[ 2566.222201] handle_active_stripes.isra.0 (drivers/md/raid5.c:6709 (discriminator 1)) raid456
[ 2566.222220] raid5d (drivers/md/raid5.c:6821) raid456
[ 2566.222234] ? __schedule (kernel/sched/core.c:6677)
[ 2566.222240] ? _raw_spin_lock_irqsave (./arch/x86/include/asm/atomic.h:202 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:543 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:186 (discriminator 4) ./include/linux/spinlock_api_smp.h:111 (discriminator 4) kernel/locking/spinlock.c:162 (discriminator 4))
[ 2566.222245] ? preempt_count_add (./include/linux/ftrace.h:976 kernel/sched/core.c:5793 kernel/sched/core.c:5790 kernel/sched/core.c:5818)
[ 2566.222248] ? _raw_spin_lock_irqsave (./arch/x86/include/asm/atomic.h:202 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:543 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:186 (discriminator 4) ./include/linux/spinlock_api_smp.h:111 (discriminator 4) kernel/locking/spinlock.c:162 (discriminator 4))
[ 2566.222254] ? __pfx_md_thread (drivers/md/md.c:7862) md_mod
[ 2566.222273] md_thread (drivers/md/md.c:7898) md_mod
[ 2566.222293] ? __pfx_autoremove_wake_function (kernel/sched/wait.c:418)
[ 2566.222299] kthread (kernel/kthread.c:379)
[ 2566.222304] ? __pfx_kthread (kernel/kthread.c:332)
[ 2566.222309] ret_from_fork (arch/x86/entry/entry_64.S:314)



$ scripts/decode_stacktrace.sh vmlinux < /tmp/trace2
[ 2566.436288] ? do_exit (kernel/exit.c:818 (discriminator 1))
[ 2566.436292] ? __warn (kernel/panic.c:673)
[ 2566.436298] ? do_exit (kernel/exit.c:818 (discriminator 1))
[ 2566.436301] ? report_bug (lib/bug.c:180 lib/bug.c:219)
[ 2566.436308] ? handle_bug (arch/x86/kernel/traps.c:303)
[ 2566.436312] ? exc_invalid_op (arch/x86/kernel/traps.c:345 (discriminator 1))
[ 2566.436316] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568)
[ 2566.436321] ? do_exit (kernel/exit.c:818 (discriminator 1))
[ 2566.436325] ? do_exit (kernel/exit.c:818 (discriminator 1))
[ 2566.436329] make_task_dead (kernel/exit.c:972)
[ 2566.436333] rewind_stack_and_make_dead (??:?)
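
The input files /tmp/trace1 and /tmp/trace2 are not shown above. As a
sketch, assuming each one holds just the lines between a "<TASK>" and
"</TASK>" marker, files like these could be cut out of a saved kernel
log as follows. The function name split_traces is made up for this
example.

```shell
# Write the lines between each "<TASK>" and "</TASK>" marker on stdin
# to ${1}1, ${1}2, ... (one file per call trace).
split_traces() {
    awk -v prefix="$1" '
        /<\/TASK>/ { if (out != "") close(out); out = ""; next }
        /<TASK>/   { n++; out = prefix n; next }
        out != ""  { print > out }
    '
}

# Intended use, followed by the decode step:
#   sudo dmesg | split_traces /tmp/trace
#   scripts/decode_stacktrace.sh vmlinux < /tmp/trace1
```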



Is that what you are looking for?

Thanks,
Corey


* Re: NULL pointer dereference with MD write-back journal, where journal device is RAID-1
  2023-08-07  2:09   ` Corey Hickey
@ 2023-08-07  2:15     ` Yu Kuai
  2023-08-07  2:46       ` Yu Kuai
  2023-08-07  3:22       ` Corey Hickey
  0 siblings, 2 replies; 9+ messages in thread
From: Yu Kuai @ 2023-08-07  2:15 UTC (permalink / raw)
  To: Corey Hickey, Yu Kuai, 'Linux RAID', yukuai (C)

Hi,

On 2023/08/07 10:09, Corey Hickey wrote:
> On 2023-08-06 18:02, Yu Kuai wrote:
>>> Here are the errors reported by the kernel:
>>> --------------------------------------------------------------------
>>> [ 2566.222104] BUG: kernel NULL pointer dereference, address: 0000000000000157
>>> [ 2566.222111] #PF: supervisor read access in kernel mode
>>> [ 2566.222114] #PF: error_code(0x0000) - not-present page
>>> [ 2566.222117] PGD 0 P4D 0
>>> [ 2566.222121] Oops: 0000 [#1] PREEMPT SMP NOPTI
>>> [ 2566.222125] CPU: 1 PID: 5415 Comm: md10_raid5 Not tainted 6.4.8 #3
>>> [ 2566.222129] Hardware name: ASUS System Product Name/ROG CROSSHAIR VII HERO (WI-FI), BIOS 4603 09/13/2021
>>> [ 2566.222132] RIP: 0010:submit_bio_noacct+0x182/0x5c0
>>
>> Can you provide addr2line result? This will be helpful to locate the
>> problem.
> 
> I have not done this before; I struggled a bit until I found this:
> https://lwn.net/Articles/592724/
> 
> These are run within the kernel source tree, which I have not
> modified since the original compilation.
> 
> 
> $ scripts/decode_stacktrace.sh vmlinux < /tmp/trace1
> [ 2566.222171] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
> [ 2566.222176] ? page_fault_oops (arch/x86/mm/fault.c:707)
> [ 2566.222180] ? update_load_avg (kernel/sched/fair.c:3920 kernel/sched/fair.c:4255)
> [ 2566.222185] ? exc_page_fault (./arch/x86/include/asm/paravirt.h:695 arch/x86/mm/fault.c:1494 arch/x86/mm/fault.c:1542)
> [ 2566.222190] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:570)
> [ 2566.222196] ? submit_bio_noacct (block/blk-throttle.h:198 block/blk-throttle.h:210 block/blk-core.c:800)
> [ 2566.222201] handle_active_stripes.isra.0 (drivers/md/raid5.c:6709 (discriminator 1)) raid456
> [ 2566.222220] raid5d (drivers/md/raid5.c:6821) raid456
> [ 2566.222234] ? __schedule (kernel/sched/core.c:6677)
> [ 2566.222240] ? _raw_spin_lock_irqsave 
> (./arch/x86/include/asm/atomic.h:202 (discriminator 4) 
> ./include/linux/atomic/atomic-instrumented.h:543 (discriminator 4) 
> ./include/asm-generic/qspinlock.h:111 (discriminator 4) 
> ./include/linux/spinlock.h:186 (discriminator 4) 
> ./include/linux/spinlock_api_smp.h:111 (discriminator 4) 
> kernel/locking/spinlock.c:162 (discriminator 4))
> [ 2566.222245] ? preempt_count_add (./include/linux/ftrace.h:976 
> kernel/sched/core.c:5793 kernel/sched/core.c:5790 kernel/sched/core.c:5818)
> [ 2566.222248] ? _raw_spin_lock_irqsave 
> (./arch/x86/include/asm/atomic.h:202 (discriminator 4) 
> ./include/linux/atomic/atomic-instrumented.h:543 (discriminator 4) 
> ./include/asm-generic/qspinlock.h:111 (discriminator 4) 
> ./include/linux/spinlock.h:186 (discriminator 4) 
> ./include/linux/spinlock_api_smp.h:111 (discriminator 4) 
> kernel/locking/spinlock.c:162 (discriminator 4))
> [ 2566.222254] ? __pfx_md_thread (drivers/md/md.c:7862) md_mod
> [ 2566.222273] md_thread (drivers/md/md.c:7898) md_mod
> [ 2566.222293] ? __pfx_autoremove_wake_function (kernel/sched/wait.c:418)
> [ 2566.222299] kthread (kernel/kthread.c:379)
> [ 2566.222304] ? __pfx_kthread (kernel/kthread.c:332)
> [ 2566.222309] ret_from_fork (arch/x86/entry/entry_64.S:314)
> 
> 
> 
> $ scripts/decode_stacktrace.sh vmlinux < /tmp/trace2
> [ 2566.436288] ? do_exit (kernel/exit.c:818 (discriminator 1))
> [ 2566.436292] ? __warn (kernel/panic.c:673)
> [ 2566.436298] ? do_exit (kernel/exit.c:818 (discriminator 1))
> [ 2566.436301] ? report_bug (lib/bug.c:180 lib/bug.c:219)
> [ 2566.436308] ? handle_bug (arch/x86/kernel/traps.c:303)
> [ 2566.436312] ? exc_invalid_op (arch/x86/kernel/traps.c:345 
> (discriminator 1))
> [ 2566.436316] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568)
> [ 2566.436321] ? do_exit (kernel/exit.c:818 (discriminator 1))
> [ 2566.436325] ? do_exit (kernel/exit.c:818 (discriminator 1))
> [ 2566.436329] make_task_dead (kernel/exit.c:972)
> [ 2566.436333] rewind_stack_and_make_dead (??:?)
> 
> 
> 
> Is that what you are looking for?

Yes, and can you tell me which commit you are testing?

Thanks,
Kuai
> 
> Thanks,
> Corey
> .
> 



* Re: NULL pointer dereference with MD write-back journal, where journal device is RAID-1
  2023-08-07  2:15     ` Yu Kuai
@ 2023-08-07  2:46       ` Yu Kuai
  2023-08-07  4:51         ` Corey Hickey
  2023-08-07  3:22       ` Corey Hickey
  1 sibling, 1 reply; 9+ messages in thread
From: Yu Kuai @ 2023-08-07  2:46 UTC (permalink / raw)
  To: Yu Kuai, Corey Hickey, 'Linux RAID', yukuai (C)
  Cc: yangerkun@huawei.com

Hi,

On 2023/08/07 10:15, Yu Kuai wrote:
> Hi,
> 
> On 2023/08/07 10:09, Corey Hickey wrote:
>> On 2023-08-06 18:02, Yu Kuai wrote:
>>>> Here are the errors reported by the kernel:
>>>> --------------------------------------------------------------------
>>>> [ 2566.222104] BUG: kernel NULL pointer dereference, address:
>>>> 0000000000000157
>>>> [ 2566.222111] #PF: supervisor read access in kernel mode
>>>> [ 2566.222114] #PF: error_code(0x0000) - not-present page
>>>> [ 2566.222117] PGD 0 P4D 0
>>>> [ 2566.222121] Oops: 0000 [#1] PREEMPT SMP NOPTI
>>>> [ 2566.222125] CPU: 1 PID: 5415 Comm: md10_raid5 Not tainted 6.4.8 #3
>>>> [ 2566.222129] Hardware name: ASUS System Product Name/ROG CROSSHAIR 
>>>> VII
>>>> HERO (WI-FI), BIOS 4603 09/13/2021
>>>> [ 2566.222132] RIP: 0010:submit_bio_noacct+0x182/0x5c0
>>>
>>> Can you provide addr2line result? This will be helpful to locate the
>>> problem.
>>
>> I have not done this before; I struggled a bit until I found this:
>> https://lwn.net/Articles/592724/
>>
>> These are run within the kernel source tree, which I have not
>> modified since the original compilation.
>>
>>
>> $ scripts/decode_stacktrace.sh vmlinux < /tmp/trace1
>> [ 2566.222171] ? __die (arch/x86/kernel/dumpstack.c:421 
>> arch/x86/kernel/dumpstack.c:434)
>> [ 2566.222176] ? page_fault_oops (arch/x86/mm/fault.c:707)
>> [ 2566.222180] ? update_load_avg (kernel/sched/fair.c:3920 
>> kernel/sched/fair.c:4255)
>> [ 2566.222185] ? exc_page_fault (./arch/x86/include/asm/paravirt.h:695 
>> arch/x86/mm/fault.c:1494 arch/x86/mm/fault.c:1542)
>> [ 2566.222190] ? asm_exc_page_fault 
>> (./arch/x86/include/asm/idtentry.h:570)
>> [ 2566.222196] ? submit_bio_noacct (block/blk-throttle.h:198 
>> block/blk-throttle.h:210 block/blk-core.c:800)
>> [ 2566.222201] handle_active_stripes.isra.0 (drivers/md/raid5.c:6709 
>> (discriminator 1)) raid456

I'm not sure yet where this io comes from; however, based on your
test, I think it comes from

raid5d
  handle_active_stripes
   r5l_flush_stripe_to_raid
    submit_bio

And I found a problem after a quick look here:

t1: submit flush io
raid5d
  handle_active_stripes
   r5l_flush_stripe_to_raid
    bio_init
    submit_bio
    // io1

t2: io1 is done
r5l_log_flush_endio
  list_splice_tail_init
  // new flush io can be dispatched

			t3: submit new flush io
			...
			r5l_flush_stripe_to_raid
			 bio_init
  bio_uninit
  // clear bio->bi_blkg
			 submit_bio
			 // null-ptr-deref

This is definitely a problem; however, I'm not sure whether it is what you
are hitting. Can you test the following patch?

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 51a68fbc241c..a85ea19fcf14 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -1266,9 +1266,8 @@ static void r5l_log_flush_endio(struct bio *bio)
         list_for_each_entry(io, &log->flushing_ios, log_sibling)
                 r5l_io_run_stripes(io);
         list_splice_tail_init(&log->flushing_ios, &log->finished_ios);
-       spin_unlock_irqrestore(&log->io_list_lock, flags);
-
         bio_uninit(bio);
+       spin_unlock_irqrestore(&log->io_list_lock, flags);
  }

  /*
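
As a rough illustration of the t1/t2/t3 ordering above, here is a minimal
single-threaded C model of the interleaving. It is a sketch, not kernel
code: model_bio, model_log, submitter_try_init and flush_bio_lost_blkg are
hypothetical names, the spinlock is reduced to a boolean, and the flushing
list is reduced to an "empty" flag. It assumes the submitter may start a
new flush only when the lock is free and the flushing list is empty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects; not real kernel APIs. */
struct model_bio {
	const char *blkg;		/* plays the role of bio->bi_blkg */
};

struct model_log {
	bool locked;			/* plays the role of log->io_list_lock */
	bool flushing_empty;		/* true once flushing_ios was spliced away */
	struct model_bio flush_bio;	/* the single, reused flush bio */
};

static void model_bio_init(struct model_bio *bio)
{
	bio->blkg = "blkg";
}

static void model_bio_uninit(struct model_bio *bio)
{
	bio->blkg = NULL;
}

/* Submitter side: start a flush only if the lock is free and no flush is pending. */
static bool submitter_try_init(struct model_log *log)
{
	if (log->locked || !log->flushing_empty)
		return false;
	log->flushing_empty = false;
	model_bio_init(&log->flush_bio);
	return true;
}

/* Returns true if the bio that t3 is about to submit has lost its blkg. */
static bool flush_bio_lost_blkg(bool uninit_under_lock)
{
	struct model_log log = { .locked = false, .flushing_empty = true };
	bool t3_started;

	/* t1: the first flush io is initialised and submitted. */
	submitter_try_init(&log);

	/* t2: io1 completes; the handler takes the lock and empties the list. */
	log.locked = true;
	log.flushing_empty = true;

	/* t3 tries now, but the lock is held, so it cannot start yet. */
	t3_started = submitter_try_init(&log);
	assert(!t3_started);

	if (uninit_under_lock)
		model_bio_uninit(&log.flush_bio);	/* patched order */
	log.locked = false;

	/* t3: lock is free and the list is empty, so the bio is re-initialised. */
	t3_started = submitter_try_init(&log);
	assert(t3_started);

	if (!uninit_under_lock)
		model_bio_uninit(&log.flush_bio);	/* original order: after t3's init */

	/* t3 submits; a NULL blkg here corresponds to the reported oops. */
	return log.flush_bio.blkg == NULL;
}
```

In this model, flush_bio_lost_blkg(false) returns true (the original
ordering lets the late uninit wipe the freshly initialised bio), and
flush_bio_lost_blkg(true) returns false (the patched ordering finishes the
uninit before the next flush can start).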

Thanks,
Kuai
>> [ 2566.222220] raid5d (drivers/md/raid5.c:6821) raid456
>> [ 2566.222234] ? __schedule (kernel/sched/core.c:6677)
>> [ 2566.222240] ? _raw_spin_lock_irqsave 
>> (./arch/x86/include/asm/atomic.h:202 (discriminator 4) 
>> ./include/linux/atomic/atomic-instrumented.h:543 (discriminator 4) 
>> ./include/asm-generic/qspinlock.h:111 (discriminator 4) 
>> ./include/linux/spinlock.h:186 (discriminator 4) 
>> ./include/linux/spinlock_api_smp.h:111 (discriminator 4) 
>> kernel/locking/spinlock.c:162 (discriminator 4))
>> [ 2566.222245] ? preempt_count_add (./include/linux/ftrace.h:976 
>> kernel/sched/core.c:5793 kernel/sched/core.c:5790 
>> kernel/sched/core.c:5818)
>> [ 2566.222248] ? _raw_spin_lock_irqsave 
>> (./arch/x86/include/asm/atomic.h:202 (discriminator 4) 
>> ./include/linux/atomic/atomic-instrumented.h:543 (discriminator 4) 
>> ./include/asm-generic/qspinlock.h:111 (discriminator 4) 
>> ./include/linux/spinlock.h:186 (discriminator 4) 
>> ./include/linux/spinlock_api_smp.h:111 (discriminator 4) 
>> kernel/locking/spinlock.c:162 (discriminator 4))
>> [ 2566.222254] ? __pfx_md_thread (drivers/md/md.c:7862) md_mod
>> [ 2566.222273] md_thread (drivers/md/md.c:7898) md_mod
>> [ 2566.222293] ? __pfx_autoremove_wake_function (kernel/sched/wait.c:418)
>> [ 2566.222299] kthread (kernel/kthread.c:379)
>> [ 2566.222304] ? __pfx_kthread (kernel/kthread.c:332)
>> [ 2566.222309] ret_from_fork (arch/x86/entry/entry_64.S:314)
>>
>>
>>
>> $ scripts/decode_stacktrace.sh vmlinux < /tmp/trace2
>> [ 2566.436288] ? do_exit (kernel/exit.c:818 (discriminator 1))
>> [ 2566.436292] ? __warn (kernel/panic.c:673)
>> [ 2566.436298] ? do_exit (kernel/exit.c:818 (discriminator 1))
>> [ 2566.436301] ? report_bug (lib/bug.c:180 lib/bug.c:219)
>> [ 2566.436308] ? handle_bug (arch/x86/kernel/traps.c:303)
>> [ 2566.436312] ? exc_invalid_op (arch/x86/kernel/traps.c:345 
>> (discriminator 1))
>> [ 2566.436316] ? asm_exc_invalid_op 
>> (./arch/x86/include/asm/idtentry.h:568)
>> [ 2566.436321] ? do_exit (kernel/exit.c:818 (discriminator 1))
>> [ 2566.436325] ? do_exit (kernel/exit.c:818 (discriminator 1))
>> [ 2566.436329] make_task_dead (kernel/exit.c:972)
>> [ 2566.436333] rewind_stack_and_make_dead (??:?)
>>
>>
>>
>> Is that what you are looking for?
> 
> Yes, and can you tell me which commit you are testing?
> 
> Thanks,
> Kuai
>>
>> Thanks,
>> Corey
>> .
>>
> 
> .
> 



* Re: NULL pointer dereference with MD write-back journal, where journal device is RAID-1
  2023-08-07  2:15     ` Yu Kuai
  2023-08-07  2:46       ` Yu Kuai
@ 2023-08-07  3:22       ` Corey Hickey
  1 sibling, 0 replies; 9+ messages in thread
From: Corey Hickey @ 2023-08-07  3:22 UTC (permalink / raw)
  To: Yu Kuai, 'Linux RAID', yukuai (C)

On 2023-08-06 19:15, Yu Kuai wrote:
>> Is that what you are looking for?
> 
> Yes, and can you tell me which commit you are testing?

This is 6.4.8 from the release tarball:
https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.4.8.tar.xz

Thank you,
Corey


* Re: NULL pointer dereference with MD write-back journal, where journal device is RAID-1
  2023-08-07  2:46       ` Yu Kuai
@ 2023-08-07  4:51         ` Corey Hickey
  2023-08-07  6:08           ` Yu Kuai
  0 siblings, 1 reply; 9+ messages in thread
From: Corey Hickey @ 2023-08-07  4:51 UTC (permalink / raw)
  To: Yu Kuai, 'Linux RAID', yukuai (C); +Cc: yangerkun@huawei.com

On 2023-08-06 19:46, Yu Kuai wrote:
> can you test the following patch?
> 
> diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
> index 51a68fbc241c..a85ea19fcf14 100644
> --- a/drivers/md/raid5-cache.c
> +++ b/drivers/md/raid5-cache.c
> @@ -1266,9 +1266,8 @@ static void r5l_log_flush_endio(struct bio *bio)
>           list_for_each_entry(io, &log->flushing_ios, log_sibling)
>                   r5l_io_run_stripes(io);
>           list_splice_tail_init(&log->flushing_ios, &log->finished_ios);
> -       spin_unlock_irqrestore(&log->io_list_lock, flags);
> -
>           bio_uninit(bio);
> +       spin_unlock_irqrestore(&log->io_list_lock, flags);
>    }
> 
>    /*

My patch utility didn't like it for some reason, but I applied the 
changes manually to get what I think is the same thing. I'll paste the 
diff here just in case.

--- drivers/md/raid5-cache.c.orig	2023-08-06 20:26:10.386665042 -0700
+++ drivers/md/raid5-cache.c	2023-08-06 20:31:33.290688590 -0700
@@ -1265,9 +1265,8 @@
  	list_for_each_entry(io, &log->flushing_ios, log_sibling)
  		r5l_io_run_stripes(io);
  	list_splice_tail_init(&log->flushing_ios, &log->finished_ios);
-	spin_unlock_irqrestore(&log->io_list_lock, flags);
-
  	bio_uninit(bio);
+	spin_unlock_irqrestore(&log->io_list_lock, flags);
  }

  /*


With a new kernel including this change, I can now no longer reproduce 
the problem; 12 successful runs seems pretty definitive given the 
failure rate I was seeing before.

This was on a newly-recreated RAID-5, and I double-checked that I did 
indeed re-enable write-back.

Thank you for this! I wasn't expecting such a fast response, especially 
on the weekend.

-Corey


* Re: NULL pointer dereference with MD write-back journal, where journal device is RAID-1
  2023-08-07  4:51         ` Corey Hickey
@ 2023-08-07  6:08           ` Yu Kuai
  2023-08-08  0:03             ` Corey Hickey
  0 siblings, 1 reply; 9+ messages in thread
From: Yu Kuai @ 2023-08-07  6:08 UTC (permalink / raw)
  To: Corey Hickey, Yu Kuai, 'Linux RAID'
  Cc: yangerkun@huawei.com, yukuai (C)

Hi,

On 2023/08/07 12:51, Corey Hickey wrote:
> On 2023-08-06 19:46, Yu Kuai wrote:
>> can you test the following patch?
>>
>> diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
>> index 51a68fbc241c..a85ea19fcf14 100644
>> --- a/drivers/md/raid5-cache.c
>> +++ b/drivers/md/raid5-cache.c
>> @@ -1266,9 +1266,8 @@ static void r5l_log_flush_endio(struct bio *bio)
>>           list_for_each_entry(io, &log->flushing_ios, log_sibling)
>>                   r5l_io_run_stripes(io);
>>           list_splice_tail_init(&log->flushing_ios, &log->finished_ios);
>> -       spin_unlock_irqrestore(&log->io_list_lock, flags);
>> -
>>           bio_uninit(bio);
>> +       spin_unlock_irqrestore(&log->io_list_lock, flags);
>>    }
>>
>>    /*
> 
> My patch utility didn't like it for some reason, but I applied the 
> changes manually to get what I think is the same thing. I'll paste the 
> diff here just in case.
> 
> --- drivers/md/raid5-cache.c.orig    2023-08-06 20:26:10.386665042 -0700
> +++ drivers/md/raid5-cache.c    2023-08-06 20:31:33.290688590 -0700
> @@ -1265,9 +1265,8 @@
>       list_for_each_entry(io, &log->flushing_ios, log_sibling)
>           r5l_io_run_stripes(io);
>       list_splice_tail_init(&log->flushing_ios, &log->finished_ios);
> -    spin_unlock_irqrestore(&log->io_list_lock, flags);
> -
>       bio_uninit(bio);
> +    spin_unlock_irqrestore(&log->io_list_lock, flags);
>   }

Yes, this is what I expected.
> 
>   /*
> 
> 
> With a new kernel including this change, I can now no longer reproduce 
> the problem; 12 successful runs seems pretty definitive given the 
> failure rate I was seeing before.
> 
> This was on a newly-recreated RAID-5, and I double-checked that I did 
> indeed re-enable write-back.

Thanks for the test. I'll send a patch with your Tested-by tag soon.
> 
> Thank you for this! I wasn't expecting such a fast response, especially 
> on the weekend.

It's Monday for us, actually 😄

Thanks,
Kuai

> 
> -Corey
> .
> 



* Re: NULL pointer dereference with MD write-back journal, where journal device is RAID-1
  2023-08-07  6:08           ` Yu Kuai
@ 2023-08-08  0:03             ` Corey Hickey
  0 siblings, 0 replies; 9+ messages in thread
From: Corey Hickey @ 2023-08-08  0:03 UTC (permalink / raw)
  To: Yu Kuai, 'Linux RAID'; +Cc: yangerkun@huawei.com, yukuai (C)

On 2023-08-06 23:08, Yu Kuai wrote:
>> Thank you for this! I wasn't expecting such a fast response, especially
>> on the weekend.
> 
> It's Monday for us, actually 😄

Oh, I should have realized. Thank you all the same.

-Corey



Thread overview: 9+ messages
2023-08-06 22:48 NULL pointer dereference with MD write-back journal, where journal device is RAID-1 Corey Hickey
2023-08-07  1:02 ` Yu Kuai
2023-08-07  2:09   ` Corey Hickey
2023-08-07  2:15     ` Yu Kuai
2023-08-07  2:46       ` Yu Kuai
2023-08-07  4:51         ` Corey Hickey
2023-08-07  6:08           ` Yu Kuai
2023-08-08  0:03             ` Corey Hickey
2023-08-07  3:22       ` Corey Hickey
