public inbox for linux-block@vger.kernel.org
* [bug report] BUG: kernel NULL pointer dereference, address: 0000000000000060
@ 2025-07-01  1:55 Changhui Zhong
  2025-07-01  3:08 ` Ming Lei
  0 siblings, 1 reply; 3+ messages in thread
From: Changhui Zhong @ 2025-07-01  1:55 UTC (permalink / raw)
  To: Linux Block Devices; +Cc: Ming Lei

Hello,

The following kernel panic was triggered while running the ubdsrv tests
('make test T=generic'). Please take a look, and let me know if you need
any further information or testing. Thanks.

repo: https://github.com/torvalds/linux.git
branch: master
INFO: HEAD of cloned kernel:
commit d0b3b7b22dfa1f4b515fd3a295b3fd958f9e81af
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Sun Jun 29 13:09:04 2025 -0700

    Linux 6.16-rc4

dmesg log:
[ 3431.347957] BUG: kernel NULL pointer dereference, address: 0000000000000060
[ 3431.355744] #PF: supervisor read access in kernel mode
[ 3431.361484] #PF: error_code(0x0000) - not-present page
[ 3431.367224] PGD 119ffa067 P4D 0
[ 3431.370830] Oops: Oops: 0000 [#1] SMP NOPTI
[ 3431.375503] CPU: 22 UID: 0 PID: 397273 Comm: fio Tainted: G S            6.16.0-rc4 #1 PREEMPT(voluntary)
[ 3431.386864] Tainted: [S]=CPU_OUT_OF_SPEC
[ 3431.391243] Hardware name: Lenovo ThinkSystem SR650 V2/7Z73CTO1WW, BIOS AFE118M-1.32 06/29/2022
[ 3431.400954] RIP: 0010:ublk_queue_rqs+0x7d/0x1c0 [ublk_drv]
[ 3431.407085] Code: 00 00 4c 8b b8 c8 00 00 00 48 63 43 20 48 c1 e0 05 4d 8d 6c 07 30 48 85 d2 0f 84 c1 00 00 00 4c 01 f8 48 8b 7a 10 48 8b 70 40 <48> 8b 4e 60 48 39 4f 60 0f 84 9a 00 00 00 4d 85 e4 0f 84 9f 00 00
[ 3431.428035] RSP: 0018:ff711b900c5379d8 EFLAGS: 00010282
[ 3431.433869] RAX: ff300f05d5c534b0 RBX: ff300f05ea3b2940 RCX: ff300f05d5c53090
[ 3431.441834] RDX: ff300f05d5c534c0 RSI: 0000000000000000 RDI: 0000000000000000
[ 3431.449799] RBP: ff300f05ea3b2800 R08: 0000000000000000 R09: 0000000000000028
[ 3431.457766] R10: ff711b900c537a60 R11: ff300f062256ec20 R12: 0000000000000000
[ 3431.465723] R13: ff300f05d5c534e0 R14: ff711b900c537af0 R15: ff300f05d5c53090
[ 3431.473687] FS:  00007fddab6cd080(0000) GS:ff300f08df445000(0000) knlGS:0000000000000000
[ 3431.482720] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3431.489137] CR2: 0000000000000060 CR3: 0000000122ff0001 CR4: 0000000000773ef0
[ 3431.497095] PKRU: 55555554
[ 3431.500108] Call Trace:
[ 3431.502829]  <TASK>
[ 3431.505172]  blk_mq_dispatch_queue_requests+0x15a/0x190
[ 3431.511011]  blk_mq_flush_plug_list+0x78/0x190
[ 3431.515971]  ? io_submit_one+0xee/0x370
[ 3431.520257]  __blk_flush_plug+0xf2/0x150
[ 3431.524639]  blk_finish_plug+0x28/0x40
[ 3431.528826]  __x64_sys_io_submit+0xd5/0x1e0
[ 3431.533500]  do_syscall_64+0x7f/0x980
[ 3431.537591]  ? blk_mq_start_request+0x48/0x190
[ 3431.542554]  ? __io_req_task_work_add+0x35/0x1f0
[ 3431.547711]  ? ublk_queue_rqs+0x103/0x1c0 [ublk_drv]
[ 3431.553255]  ? blk_mq_dispatch_queue_requests+0x162/0x190
[ 3431.559283]  ? blk_mq_flush_plug_list+0x78/0x190
[ 3431.564435]  ? io_submit_one+0xee/0x370
[ 3431.568717]  ? __blk_flush_plug+0xf2/0x150
[ 3431.573292]  ? rseq_get_rseq_cs.isra.0+0x16/0x210
[ 3431.578546]  ? rseq_ip_fixup+0x90/0x1d0
[ 3431.582820]  ? __rseq_handle_notify_resume+0x35/0x60
[ 3431.588363]  ? arch_exit_to_user_mode_prepare.isra.0+0x82/0xb0
[ 3431.594877]  ? do_syscall_64+0xb1/0x980
[ 3431.599160]  ? blk_mq_dispatch_queue_requests+0x162/0x190
[ 3431.605189]  ? blk_mq_flush_plug_list+0x78/0x190
[ 3431.610344]  ? io_submit_one+0xee/0x370
[ 3431.614625]  ? __blk_flush_plug+0xf2/0x150
[ 3431.619201]  ? blk_finish_plug+0x28/0x40
[ 3431.623581]  ? __x64_sys_io_submit+0x104/0x1e0
[ 3431.628542]  ? syscall_exit_work+0x108/0x140
[ 3431.633302]  ? clear_bhb_loop+0x50/0xa0
[ 3431.637585]  ? clear_bhb_loop+0x50/0xa0
[ 3431.641860]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 3431.647499] RIP: 0033:0x7fddab7d1a3d
[ 3431.651490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a3 b3 0c 00 f7 d8 64 89 01 48
[ 3431.672448] RSP: 002b:00007ffd378a7a08 EFLAGS: 00000246 ORIG_RAX: 00000000000000d1
[ 3431.680900] RAX: ffffffffffffffda RBX: 00007fddab6ccff8 RCX: 00007fddab7d1a3d
[ 3431.688867] RDX: 000055ff413149c0 RSI: 0000000000000010 RDI: 00007fdda3226000
[ 3431.696824] RBP: 00007fdda3226000 R08: 00007fdda3240000 R09: 0000000000000280
[ 3431.704789] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000010
[ 3431.712755] R13: 0000000000000000 R14: 000055ff413149c0 R15: 000055ff4131b740
[ 3431.720723]  </TASK>
[ 3431.723160] Modules linked in: ublk_drv raid10 raid1 raid0 dm_raid raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq nf_tables rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace nfs_localio netfs sunrpc rfkill intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common i10nm_edac skx_edac_common nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm ipmi_ssif iTCO_wdt irqbypass dax_hmem rapl cxl_acpi mgag200 cdc_ether isst_if_mbox_pci iTCO_vendor_support cxl_port intel_cstate isst_if_mmio cxl_core tg3 usbnet intel_th_gth i2c_algo_bit i2c_i801 mei_me ioatdma intel_uncore mii intel_th_pci einj mei pcspkr acpi_power_meter isst_if_common i2c_smbus intel_vsec intel_pch_thermal intel_th dca ipmi_si acpi_ipmi ipmi_devintf ipmi_msghandler acpi_pad sg fuse loop nfnetlink xfs sd_mod ahci libahci libata ghash_clmulni_intel wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: null_blk]
[ 3431.818860] CR2: 0000000000000060
[ 3431.822559] ---[ end trace 0000000000000000 ]---
[ 3431.856214] RIP: 0010:ublk_queue_rqs+0x7d/0x1c0 [ublk_drv]
[ 3431.862342] Code: 00 00 4c 8b b8 c8 00 00 00 48 63 43 20 48 c1 e0 05 4d 8d 6c 07 30 48 85 d2 0f 84 c1 00 00 00 4c 01 f8 48 8b 7a 10 48 8b 70 40 <48> 8b 4e 60 48 39 4f 60 0f 84 9a 00 00 00 4d 85 e4 0f 84 9f 00 00
[ 3431.883302] RSP: 0018:ff711b900c5379d8 EFLAGS: 00010282
[ 3431.889136] RAX: ff300f05d5c534b0 RBX: ff300f05ea3b2940 RCX: ff300f05d5c53090
[ 3431.897094] RDX: ff300f05d5c534c0 RSI: 0000000000000000 RDI: 0000000000000000
[ 3431.905052] RBP: ff300f05ea3b2800 R08: 0000000000000000 R09: 0000000000000028
[ 3431.913009] R10: ff711b900c537a60 R11: ff300f062256ec20 R12: 0000000000000000
[ 3431.920974] R13: ff300f05d5c534e0 R14: ff711b900c537af0 R15: ff300f05d5c53090
[ 3431.928940] FS:  00007fddab6cd080(0000) GS:ff300f08df445000(0000) knlGS:0000000000000000
[ 3431.937973] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3431.944388] CR2: 0000000000000060 CR3: 0000000122ff0001 CR4: 0000000000773ef0
[ 3431.952354] PKRU: 55555554
[ 3431.955374] Kernel panic - not syncing: Fatal exception
[ 3431.961292] Kernel Offset: 0xd000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 3432.002335] ---[ end Kernel panic - not syncing: Fatal exception ]---
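
As a reading aid for the oops above: the byte marked <48> in the first
Code: line starts the faulting instruction, 48 8b 4e 60. The sketch below
decodes its ModRM byte (0x4e) by hand. It is a minimal illustration, not a
disassembler: the helper names are made up for this note, and it only
handles the mod=01 (base register plus 8-bit displacement) form without a
SIB byte or REX register extensions, which is all this instruction needs.

```c
#include <assert.h>
#include <stdint.h>

/* Registers selected by a 3-bit ModRM field when REX.R/REX.B are clear. */
static const char *const reg_names[8] = {
    "rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"
};

/* The three fields packed into one ModRM byte. */
struct modrm {
    unsigned mod; /* addressing form: 1 means [base + disp8] */
    unsigned reg; /* register operand */
    unsigned rm;  /* base register of the memory operand */
};

/* Split a ModRM byte into its mod (2 bits), reg (3 bits), rm (3 bits). */
static struct modrm decode_modrm(uint8_t byte)
{
    struct modrm m = { byte >> 6, (byte >> 3) & 7u, byte & 7u };
    return m;
}

/*
 * Effective address for the only form this sketch supports:
 * mod == 1 and rm != 4 (rm == 4 would mean a SIB byte follows).
 */
static uint64_t effective_address(struct modrm m, uint64_t base, int8_t disp8)
{
    assert(m.mod == 1 && m.rm != 4);
    return base + (uint64_t)(int64_t)disp8;
}
```

With opcode 8b (a 64-bit load because of the 48 prefix), ModRM 0x4e gives
mod=1, reg=rcx, rm=rsi, so the instruction reads from 0x60(%rsi). RSI is 0
in the register dump, so effective_address(m, 0, 0x60) is 0x60, which
matches the CR2 value.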

Best Regards,
Changhui



* Re: [bug report] BUG: kernel NULL pointer dereference, address: 0000000000000060
  2025-07-01  1:55 [bug report] BUG: kernel NULL pointer dereference, address: 0000000000000060 Changhui Zhong
@ 2025-07-01  3:08 ` Ming Lei
  2025-07-01  8:05   ` Changhui Zhong
  0 siblings, 1 reply; 3+ messages in thread
From: Ming Lei @ 2025-07-01  3:08 UTC (permalink / raw)
  To: Changhui Zhong; +Cc: Linux Block Devices

Hi Changhui,

Thanks for the report!

On Tue, Jul 01, 2025 at 09:55:23AM +0800, Changhui Zhong wrote:
> Hello,
> 
> the following kernel panic was triggered by 'ubdsrv  make test T=generic' tests,
> please help check and let me know if you need any info/test, thanks.
> 
> repo: https://github.com/torvalds/linux.git
> branch: master
> INFO: HEAD of cloned kernel:
> commit d0b3b7b22dfa1f4b515fd3a295b3fd958f9e81af
> Author: Linus Torvalds <torvalds@linux-foundation.org>
> Date:   Sun Jun 29 13:09:04 2025 -0700
> 
>     Linux 6.16-rc4
> 
> dmesg log:
> [ 3431.347957] BUG: kernel NULL pointer dereference, address: 0000000000000060
> [ 3431.355744] #PF: supervisor read access in kernel mode
> [ 3431.361484] #PF: error_code(0x0000) - not-present page
> [ 3431.367224] PGD 119ffa067 P4D 0
> [ 3431.370830] Oops: Oops: 0000 [#1] SMP NOPTI
> [ 3431.375503] CPU: 22 UID: 0 PID: 397273 Comm: fio Tainted: G S
>            6.16.0-rc4 #1 PREEMPT(voluntary)
> [ 3431.386864] Tainted: [S]=CPU_OUT_OF_SPEC
> [ 3431.391243] Hardware name: Lenovo ThinkSystem SR650 V2/7Z73CTO1WW,
> BIOS AFE118M-1.32 06/29/2022
> [ 3431.400954] RIP: 0010:ublk_queue_rqs+0x7d/0x1c0 [ublk_drv]

This is a regression from commit 524346e9d79f ("ublk: build batch from IOs in same io_ring_ctx and io task").

io->cmd can't be dereferenced unless the uring cmd is live, so the following
patch should fix the oops:

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index c3e3c3b65a6d..99894d712c1f 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -1442,15 +1442,14 @@ static void ublk_queue_rqs(struct rq_list *rqlist)
 		struct ublk_queue *this_q = req->mq_hctx->driver_data;
 		struct ublk_io *this_io = &this_q->ios[req->tag];
 
-		if (io && !ublk_belong_to_same_batch(io, this_io) &&
-				!rq_list_empty(&submit_list))
-			ublk_queue_cmd_list(io, &submit_list);
-		io = this_io;
-
-		if (ublk_prep_req(this_q, req, true) == BLK_STS_OK)
+		if (ublk_prep_req(this_q, req, true) == BLK_STS_OK) {
+			if (io && !ublk_belong_to_same_batch(io, this_io) &&
+					!rq_list_empty(&submit_list))
+				ublk_queue_cmd_list(io, &submit_list);
 			rq_list_add_tail(&submit_list, req);
-		else
+		} else
 			rq_list_add_tail(&requeue_list, req);
+		io = this_io;
 	}
 
 	if (!rq_list_empty(&submit_list))
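
The ordering issue can be shown outside the kernel with a small
user-space model. This is only a sketch under stated assumptions: struct
cmd, struct io, prep_ok(), same_batch() and queue_rqs() are hypothetical
stand-ins, not the driver's types or functions, and the batch check here
compares a single context id. The point it illustrates is that the
comparison which dereferences cmd runs only after the prep check has
confirmed a live command. Unlike the diff above, the model remembers only
the last io that passed the prep check; that is a choice made for the
sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real ublk structures. */
struct cmd { int ctx; };          /* owner of a live command */
struct io  { struct cmd *cmd; };  /* cmd is NULL when no command is live */

/* Models the prep step: succeeds only if a command is live. */
static int prep_ok(const struct io *io)
{
    return io->cmd != NULL;
}

/* Models the batch check: it dereferences both cmd pointers. */
static int same_batch(const struct io *a, const struct io *b)
{
    assert(a->cmd != NULL && b->cmd != NULL); /* a kernel would oops here */
    return a->cmd->ctx == b->cmd->ctx;
}

struct result { int batches; int submitted; int requeued; };

/* Walk the requests, flushing a batch whenever the owner changes. */
static struct result queue_rqs(struct io *const *ios, size_t n)
{
    struct result r = { 0, 0, 0 };
    const struct io *last = NULL; /* last io that passed prep_ok() */
    int pending = 0;              /* size of the batch being built */

    for (size_t i = 0; i < n; i++) {
        const struct io *cur = ios[i];

        if (!prep_ok(cur)) {      /* no live command: requeue, never compare */
            r.requeued++;
            continue;
        }
        if (last && pending > 0 && !same_batch(last, cur)) {
            r.batches++;          /* flush the batch built so far */
            r.submitted += pending;
            pending = 0;
        }
        pending++;
        last = cur;
    }
    if (pending > 0) {            /* flush the final batch */
        r.batches++;
        r.submitted += pending;
    }
    return r;
}
```

Feeding it four requests (live in context 1, not live, live in context 1,
live in context 2) requeues the second request and submits the other
three in two batches, without same_batch() ever seeing a NULL cmd.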


Thanks,
Ming



* Re: [bug report] BUG: kernel NULL pointer dereference, address: 0000000000000060
  2025-07-01  3:08 ` Ming Lei
@ 2025-07-01  8:05   ` Changhui Zhong
  0 siblings, 0 replies; 3+ messages in thread
From: Changhui Zhong @ 2025-07-01  8:05 UTC (permalink / raw)
  To: Ming Lei; +Cc: Linux Block Devices

On Tue, Jul 1, 2025 at 11:08 AM Ming Lei <ming.lei@redhat.com> wrote:
>
> Hi Changhui,
>
> Thanks for the report!
>
> On Tue, Jul 01, 2025 at 09:55:23AM +0800, Changhui Zhong wrote:
> > Hello,
> >
> > the following kernel panic was triggered by 'ubdsrv  make test T=generic' tests,
> > please help check and let me know if you need any info/test, thanks.
> >
> > repo: https://github.com/torvalds/linux.git
> > branch: master
> > INFO: HEAD of cloned kernel:
> > commit d0b3b7b22dfa1f4b515fd3a295b3fd958f9e81af
> > Author: Linus Torvalds <torvalds@linux-foundation.org>
> > Date:   Sun Jun 29 13:09:04 2025 -0700
> >
> >     Linux 6.16-rc4
> >
> > dmesg log:
> > [ 3431.347957] BUG: kernel NULL pointer dereference, address: 0000000000000060
> > [ 3431.355744] #PF: supervisor read access in kernel mode
> > [ 3431.361484] #PF: error_code(0x0000) - not-present page
> > [ 3431.367224] PGD 119ffa067 P4D 0
> > [ 3431.370830] Oops: Oops: 0000 [#1] SMP NOPTI
> > [ 3431.375503] CPU: 22 UID: 0 PID: 397273 Comm: fio Tainted: G S
> >            6.16.0-rc4 #1 PREEMPT(voluntary)
> > [ 3431.386864] Tainted: [S]=CPU_OUT_OF_SPEC
> > [ 3431.391243] Hardware name: Lenovo ThinkSystem SR650 V2/7Z73CTO1WW,
> > BIOS AFE118M-1.32 06/29/2022
> > [ 3431.400954] RIP: 0010:ublk_queue_rqs+0x7d/0x1c0 [ublk_drv]
>
> It is one regression of commit 524346e9d79f ("ublk: build batch from IOs in same io_ring_ctx and io task").
>
> io->cmd can't be derefered unless the uring cmd is live, and the following patch
> should fix the oops:
>
> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> index c3e3c3b65a6d..99894d712c1f 100644
> --- a/drivers/block/ublk_drv.c
> +++ b/drivers/block/ublk_drv.c
> @@ -1442,15 +1442,14 @@ static void ublk_queue_rqs(struct rq_list *rqlist)
>                 struct ublk_queue *this_q = req->mq_hctx->driver_data;
>                 struct ublk_io *this_io = &this_q->ios[req->tag];
>
> -               if (io && !ublk_belong_to_same_batch(io, this_io) &&
> -                               !rq_list_empty(&submit_list))
> -                       ublk_queue_cmd_list(io, &submit_list);
> -               io = this_io;
> -
> -               if (ublk_prep_req(this_q, req, true) == BLK_STS_OK)
> +               if (ublk_prep_req(this_q, req, true) == BLK_STS_OK) {
> +                       if (io && !ublk_belong_to_same_batch(io, this_io) &&
> +                                       !rq_list_empty(&submit_list))
> +                               ublk_queue_cmd_list(io, &submit_list);
>                         rq_list_add_tail(&submit_list, req);
> -               else
> +               } else
>                         rq_list_add_tail(&requeue_list, req);
> +               io = this_io;
>         }
>
>         if (!rq_list_empty(&submit_list))
>
>
> Thanks,
> Ming
>

Hi Ming,

Thanks for the fix. I ran the test 30 times with your patch applied and
did not hit this issue again.
I see you have sent a new patch:
https://lore.kernel.org/linux-block/20250701072325.1458109-1-ming.lei@redhat.com/T/#u
I will re-run the tests with the new patch.

Thanks,


