* FAILED: patch "[PATCH] dm: fix NULL pointer dereference in __dm_suspend()" failed to apply to 5.10-stable tree
@ 2025-10-13 9:12 gregkh
2025-10-14 3:03 ` [PATCH 5.10.y] dm: fix NULL pointer dereference in __dm_suspend() Sasha Levin
0 siblings, 1 reply; 13+ messages in thread
From: gregkh @ 2025-10-13 9:12 UTC (permalink / raw)
To: zhengqixing, mpatocka; +Cc: stable
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 8d33a030c566e1f105cd5bf27f37940b6367f3be
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2025101316-favored-pulsate-a811@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8d33a030c566e1f105cd5bf27f37940b6367f3be Mon Sep 17 00:00:00 2001
From: Zheng Qixing <zhengqixing@huawei.com>
Date: Tue, 26 Aug 2025 15:42:04 +0800
Subject: [PATCH] dm: fix NULL pointer dereference in __dm_suspend()
There is a race condition between dm device suspend and table load that
can lead to null pointer dereference. The issue occurs when suspend is
invoked before table load completes:
BUG: kernel NULL pointer dereference, address: 0000000000000054
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 6 PID: 6798 Comm: dmsetup Not tainted 6.6.0-g7e52f5f0ca9b #62
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014
RIP: 0010:blk_mq_wait_quiesce_done+0x0/0x50
Call Trace:
<TASK>
blk_mq_quiesce_queue+0x2c/0x50
dm_stop_queue+0xd/0x20
__dm_suspend+0x130/0x330
dm_suspend+0x11a/0x180
dev_suspend+0x27e/0x560
ctl_ioctl+0x4cf/0x850
dm_ctl_ioctl+0xd/0x20
vfs_ioctl+0x1d/0x50
__se_sys_ioctl+0x9b/0xc0
__x64_sys_ioctl+0x19/0x30
x64_sys_call+0x2c4a/0x4620
do_syscall_64+0x9e/0x1b0
The issue can be triggered as below:
T1 T2
dm_suspend table_load
__dm_suspend dm_setup_md_queue
dm_mq_init_request_queue
blk_mq_init_allocated_queue
=> q->mq_ops = set->ops; (1)
dm_stop_queue / dm_wait_for_completion
=> q->tag_set NULL pointer! (2)
=> q->tag_set = set; (3)
Fix this by checking if a valid table (map) exists before performing
request-based suspend and waiting for target I/O. When map is NULL,
skip these table-dependent suspend steps.
Even when map is NULL, no I/O can reach any target because there is
no table loaded; I/O submitted in this state will fail early in the
DM layer. Skipping the table-dependent suspend logic in this case
is safe and avoids NULL pointer dereferences.
Fixes: c4576aed8d85 ("dm: fix request-based dm's use of dm_wait_for_completion")
Cc: stable@vger.kernel.org
Signed-off-by: Zheng Qixing <zhengqixing@huawei.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 7222f20c1a83..66dd5f6ce778 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2908,7 +2908,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map,
{
bool do_lockfs = suspend_flags & DM_SUSPEND_LOCKFS_FLAG;
bool noflush = suspend_flags & DM_SUSPEND_NOFLUSH_FLAG;
- int r;
+ int r = 0;
lockdep_assert_held(&md->suspend_lock);
@@ -2960,7 +2960,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map,
* Stop md->queue before flushing md->wq in case request-based
* dm defers requests to md->wq from md->queue.
*/
- if (dm_request_based(md)) {
+ if (map && dm_request_based(md)) {
dm_stop_queue(md->queue);
set_bit(DMF_QUEUE_STOPPED, &md->flags);
}
@@ -2972,7 +2972,8 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map,
* We call dm_wait_for_completion to wait for all existing requests
* to finish.
*/
- r = dm_wait_for_completion(md, task_state);
+ if (map)
+ r = dm_wait_for_completion(md, task_state);
if (!r)
set_bit(dmf_suspended_flag, &md->flags);
^ permalink raw reply related [flat|nested] 13+ messages in thread* [PATCH 5.10.y] dm: fix NULL pointer dereference in __dm_suspend() 2025-10-13 9:12 FAILED: patch "[PATCH] dm: fix NULL pointer dereference in __dm_suspend()" failed to apply to 5.10-stable tree gregkh @ 2025-10-14 3:03 ` Sasha Levin 2025-10-22 8:01 ` Li Lingfeng 0 siblings, 1 reply; 13+ messages in thread From: Sasha Levin @ 2025-10-14 3:03 UTC (permalink / raw) To: stable; +Cc: Zheng Qixing, Mikulas Patocka, Sasha Levin From: Zheng Qixing <zhengqixing@huawei.com> [ Upstream commit 8d33a030c566e1f105cd5bf27f37940b6367f3be ] There is a race condition between dm device suspend and table load that can lead to null pointer dereference. The issue occurs when suspend is invoked before table load completes: BUG: kernel NULL pointer dereference, address: 0000000000000054 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 6 PID: 6798 Comm: dmsetup Not tainted 6.6.0-g7e52f5f0ca9b #62 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014 RIP: 0010:blk_mq_wait_quiesce_done+0x0/0x50 Call Trace: <TASK> blk_mq_quiesce_queue+0x2c/0x50 dm_stop_queue+0xd/0x20 __dm_suspend+0x130/0x330 dm_suspend+0x11a/0x180 dev_suspend+0x27e/0x560 ctl_ioctl+0x4cf/0x850 dm_ctl_ioctl+0xd/0x20 vfs_ioctl+0x1d/0x50 __se_sys_ioctl+0x9b/0xc0 __x64_sys_ioctl+0x19/0x30 x64_sys_call+0x2c4a/0x4620 do_syscall_64+0x9e/0x1b0 The issue can be triggered as below: T1 T2 dm_suspend table_load __dm_suspend dm_setup_md_queue dm_mq_init_request_queue blk_mq_init_allocated_queue => q->mq_ops = set->ops; (1) dm_stop_queue / dm_wait_for_completion => q->tag_set NULL pointer! (2) => q->tag_set = set; (3) Fix this by checking if a valid table (map) exists before performing request-based suspend and waiting for target I/O. When map is NULL, skip these table-dependent suspend steps. Even when map is NULL, no I/O can reach any target because there is no table loaded; I/O submitted in this state will fail early in the DM layer. Skipping the table-dependent suspend logic in this case is safe and avoids NULL pointer dereferences. Fixes: c4576aed8d85 ("dm: fix request-based dm's use of dm_wait_for_completion") Cc: stable@vger.kernel.org Signed-off-by: Zheng Qixing <zhengqixing@huawei.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> [ omitted DMF_QUEUE_STOPPED flag setting and braces absent in 5.15 ] Signed-off-by: Sasha Levin <sashal@kernel.org> --- drivers/md/dm.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/md/dm.c b/drivers/md/dm.c index 0868358a7a8d2..b51281ce15d4c 100644 --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -2457,7 +2457,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, { bool do_lockfs = suspend_flags & DM_SUSPEND_LOCKFS_FLAG; bool noflush = suspend_flags & DM_SUSPEND_NOFLUSH_FLAG; - int r; + int r = 0; lockdep_assert_held(&md->suspend_lock); @@ -2509,7 +2509,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, * Stop md->queue before flushing md->wq in case request-based * dm defers requests to md->wq from md->queue. */ - if (dm_request_based(md)) + if (map && dm_request_based(md)) dm_stop_queue(md->queue); flush_workqueue(md->wq); @@ -2519,7 +2519,8 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, * We call dm_wait_for_completion to wait for all existing requests * to finish. */ - r = dm_wait_for_completion(md, task_state); + if (map) + r = dm_wait_for_completion(md, task_state); if (!r) set_bit(dmf_suspended_flag, &md->flags); -- 2.51.0 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 5.10.y] dm: fix NULL pointer dereference in __dm_suspend() 2025-10-14 3:03 ` [PATCH 5.10.y] dm: fix NULL pointer dereference in __dm_suspend() Sasha Levin @ 2025-10-22 8:01 ` Li Lingfeng 2025-10-22 8:10 ` Greg KH 0 siblings, 1 reply; 13+ messages in thread From: Li Lingfeng @ 2025-10-22 8:01 UTC (permalink / raw) To: Sasha Levin, stable Cc: Zheng Qixing, Mikulas Patocka, dm-devel, yangerkun, zhangyi (F), Hou Tao Hi, 在 2025/10/14 11:03, Sasha Levin 写道: > From: Zheng Qixing <zhengqixing@huawei.com> > > [ Upstream commit 8d33a030c566e1f105cd5bf27f37940b6367f3be ] > > There is a race condition between dm device suspend and table load that > can lead to null pointer dereference. The issue occurs when suspend is > invoked before table load completes: > > BUG: kernel NULL pointer dereference, address: 0000000000000054 > Oops: 0000 [#1] PREEMPT SMP PTI > CPU: 6 PID: 6798 Comm: dmsetup Not tainted 6.6.0-g7e52f5f0ca9b #62 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014 > RIP: 0010:blk_mq_wait_quiesce_done+0x0/0x50 > Call Trace: > <TASK> > blk_mq_quiesce_queue+0x2c/0x50 > dm_stop_queue+0xd/0x20 > __dm_suspend+0x130/0x330 > dm_suspend+0x11a/0x180 > dev_suspend+0x27e/0x560 > ctl_ioctl+0x4cf/0x850 > dm_ctl_ioctl+0xd/0x20 > vfs_ioctl+0x1d/0x50 > __se_sys_ioctl+0x9b/0xc0 > __x64_sys_ioctl+0x19/0x30 > x64_sys_call+0x2c4a/0x4620 > do_syscall_64+0x9e/0x1b0 > > The issue can be triggered as below: > > T1 T2 > dm_suspend table_load > __dm_suspend dm_setup_md_queue > dm_mq_init_request_queue > blk_mq_init_allocated_queue > => q->mq_ops = set->ops; (1) > dm_stop_queue / dm_wait_for_completion > => q->tag_set NULL pointer! (2) > => q->tag_set = set; (3) > > Fix this by checking if a valid table (map) exists before performing > request-based suspend and waiting for target I/O. When map is NULL, > skip these table-dependent suspend steps. > > Even when map is NULL, no I/O can reach any target because there is > no table loaded; I/O submitted in this state will fail early in the > DM layer. Skipping the table-dependent suspend logic in this case > is safe and avoids NULL pointer dereferences. > > Fixes: c4576aed8d85 ("dm: fix request-based dm's use of dm_wait_for_completion") > Cc: stable@vger.kernel.org > Signed-off-by: Zheng Qixing <zhengqixing@huawei.com> > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> > [ omitted DMF_QUEUE_STOPPED flag setting and braces absent in 5.15 ] > Signed-off-by: Sasha Levin <sashal@kernel.org> > --- > drivers/md/dm.c | 7 ++++--- > 1 file changed, 4 insertions(+), 3 deletions(-) > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c > index 0868358a7a8d2..b51281ce15d4c 100644 > --- a/drivers/md/dm.c > +++ b/drivers/md/dm.c > @@ -2457,7 +2457,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, > { > bool do_lockfs = suspend_flags & DM_SUSPEND_LOCKFS_FLAG; > bool noflush = suspend_flags & DM_SUSPEND_NOFLUSH_FLAG; > - int r; > + int r = 0; > > lockdep_assert_held(&md->suspend_lock); > > @@ -2509,7 +2509,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, > * Stop md->queue before flushing md->wq in case request-based > * dm defers requests to md->wq from md->queue. > */ > - if (dm_request_based(md)) > + if (map && dm_request_based(md)) > dm_stop_queue(md->queue); > It seems that before commit 80bd4a7aab4c ("blk-mq: move the srcu_struct used for quiescing to the tagset") was merged, blk_mq_wait_quiesce_done() would not attempt to access q->tag_set, so in dm_stop_queue() this kind of race condition would not cause a null pointer dereference. The change may not be necessary for 5.10, but adding it doesn’t appear to cause any issues either. > flush_workqueue(md->wq); > @@ -2519,7 +2519,8 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, > * We call dm_wait_for_completion to wait for all existing requests > * to finish. > */ > - r = dm_wait_for_completion(md, task_state); > + if (map) > + r = dm_wait_for_completion(md, task_state); The fix tag c4576aed8d85 ("dm: fix request-based dm's use of dm_wait_for_completion") seems to correspond to the modification of dm_wait_for_completion(). > if (!r) > set_bit(dmf_suspended_flag, &md->flags); > Perhaps adding another fix tag would be more appropriate? Thanks, Lingfeng ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 5.10.y] dm: fix NULL pointer dereference in __dm_suspend() 2025-10-22 8:01 ` Li Lingfeng @ 2025-10-22 8:10 ` Greg KH 2025-10-22 8:21 ` Li Lingfeng 0 siblings, 1 reply; 13+ messages in thread From: Greg KH @ 2025-10-22 8:10 UTC (permalink / raw) To: Li Lingfeng Cc: Sasha Levin, stable, Zheng Qixing, Mikulas Patocka, dm-devel, yangerkun, zhangyi (F), Hou Tao On Wed, Oct 22, 2025 at 04:01:04PM +0800, Li Lingfeng wrote: > Hi, > > 在 2025/10/14 11:03, Sasha Levin 写道: > > From: Zheng Qixing <zhengqixing@huawei.com> > > > > [ Upstream commit 8d33a030c566e1f105cd5bf27f37940b6367f3be ] > > > > There is a race condition between dm device suspend and table load that > > can lead to null pointer dereference. The issue occurs when suspend is > > invoked before table load completes: > > > > BUG: kernel NULL pointer dereference, address: 0000000000000054 > > Oops: 0000 [#1] PREEMPT SMP PTI > > CPU: 6 PID: 6798 Comm: dmsetup Not tainted 6.6.0-g7e52f5f0ca9b #62 > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014 > > RIP: 0010:blk_mq_wait_quiesce_done+0x0/0x50 > > Call Trace: > > <TASK> > > blk_mq_quiesce_queue+0x2c/0x50 > > dm_stop_queue+0xd/0x20 > > __dm_suspend+0x130/0x330 > > dm_suspend+0x11a/0x180 > > dev_suspend+0x27e/0x560 > > ctl_ioctl+0x4cf/0x850 > > dm_ctl_ioctl+0xd/0x20 > > vfs_ioctl+0x1d/0x50 > > __se_sys_ioctl+0x9b/0xc0 > > __x64_sys_ioctl+0x19/0x30 > > x64_sys_call+0x2c4a/0x4620 > > do_syscall_64+0x9e/0x1b0 > > > > The issue can be triggered as below: > > > > T1 T2 > > dm_suspend table_load > > __dm_suspend dm_setup_md_queue > > dm_mq_init_request_queue > > blk_mq_init_allocated_queue > > => q->mq_ops = set->ops; (1) > > dm_stop_queue / dm_wait_for_completion > > => q->tag_set NULL pointer! (2) > > => q->tag_set = set; (3) > > > > Fix this by checking if a valid table (map) exists before performing > > request-based suspend and waiting for target I/O. When map is NULL, > > skip these table-dependent suspend steps. > > > > Even when map is NULL, no I/O can reach any target because there is > > no table loaded; I/O submitted in this state will fail early in the > > DM layer. Skipping the table-dependent suspend logic in this case > > is safe and avoids NULL pointer dereferences. > > > > Fixes: c4576aed8d85 ("dm: fix request-based dm's use of dm_wait_for_completion") > > Cc: stable@vger.kernel.org > > Signed-off-by: Zheng Qixing <zhengqixing@huawei.com> > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> > > [ omitted DMF_QUEUE_STOPPED flag setting and braces absent in 5.15 ] > > Signed-off-by: Sasha Levin <sashal@kernel.org> > > --- > > drivers/md/dm.c | 7 ++++--- > > 1 file changed, 4 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c > > index 0868358a7a8d2..b51281ce15d4c 100644 > > --- a/drivers/md/dm.c > > +++ b/drivers/md/dm.c > > @@ -2457,7 +2457,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, > > { > > bool do_lockfs = suspend_flags & DM_SUSPEND_LOCKFS_FLAG; > > bool noflush = suspend_flags & DM_SUSPEND_NOFLUSH_FLAG; > > - int r; > > + int r = 0; > > lockdep_assert_held(&md->suspend_lock); > > @@ -2509,7 +2509,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, > > * Stop md->queue before flushing md->wq in case request-based > > * dm defers requests to md->wq from md->queue. > > */ > > - if (dm_request_based(md)) > > + if (map && dm_request_based(md)) > > dm_stop_queue(md->queue); > It seems that before commit 80bd4a7aab4c ("blk-mq: move the srcu_struct > used for quiescing to the tagset") was merged, blk_mq_wait_quiesce_done() > would not attempt to access q->tag_set, so in dm_stop_queue() this kind > of race condition would not cause a null pointer dereference. The change > may not be necessary for 5.10, but adding it doesn’t appear to cause any > issues either. > > flush_workqueue(md->wq); > > @@ -2519,7 +2519,8 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, > > * We call dm_wait_for_completion to wait for all existing requests > > * to finish. > > */ > > - r = dm_wait_for_completion(md, task_state); > > + if (map) > > + r = dm_wait_for_completion(md, task_state); > The fix tag c4576aed8d85 ("dm: fix request-based dm's use of > dm_wait_for_completion") seems to correspond to the modification of > dm_wait_for_completion(). > > if (!r) > > set_bit(dmf_suspended_flag, &md->flags); > > Perhaps adding another fix tag would be more appropriate? Those tags come directly from the original commit. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 5.10.y] dm: fix NULL pointer dereference in __dm_suspend() 2025-10-22 8:10 ` Greg KH @ 2025-10-22 8:21 ` Li Lingfeng 2025-10-22 8:45 ` Greg KH 0 siblings, 1 reply; 13+ messages in thread From: Li Lingfeng @ 2025-10-22 8:21 UTC (permalink / raw) To: Greg KH Cc: Sasha Levin, stable, Zheng Qixing, Mikulas Patocka, dm-devel, yangerkun, zhangyi (F), Hou Tao Hi, 在 2025/10/22 16:10, Greg KH 写道: > On Wed, Oct 22, 2025 at 04:01:04PM +0800, Li Lingfeng wrote: >> Hi, >> >> 在 2025/10/14 11:03, Sasha Levin 写道: >>> From: Zheng Qixing <zhengqixing@huawei.com> >>> >>> [ Upstream commit 8d33a030c566e1f105cd5bf27f37940b6367f3be ] >>> >>> There is a race condition between dm device suspend and table load that >>> can lead to null pointer dereference. The issue occurs when suspend is >>> invoked before table load completes: >>> >>> BUG: kernel NULL pointer dereference, address: 0000000000000054 >>> Oops: 0000 [#1] PREEMPT SMP PTI >>> CPU: 6 PID: 6798 Comm: dmsetup Not tainted 6.6.0-g7e52f5f0ca9b #62 >>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014 >>> RIP: 0010:blk_mq_wait_quiesce_done+0x0/0x50 >>> Call Trace: >>> <TASK> >>> blk_mq_quiesce_queue+0x2c/0x50 >>> dm_stop_queue+0xd/0x20 >>> __dm_suspend+0x130/0x330 >>> dm_suspend+0x11a/0x180 >>> dev_suspend+0x27e/0x560 >>> ctl_ioctl+0x4cf/0x850 >>> dm_ctl_ioctl+0xd/0x20 >>> vfs_ioctl+0x1d/0x50 >>> __se_sys_ioctl+0x9b/0xc0 >>> __x64_sys_ioctl+0x19/0x30 >>> x64_sys_call+0x2c4a/0x4620 >>> do_syscall_64+0x9e/0x1b0 >>> >>> The issue can be triggered as below: >>> >>> T1 T2 >>> dm_suspend table_load >>> __dm_suspend dm_setup_md_queue >>> dm_mq_init_request_queue >>> blk_mq_init_allocated_queue >>> => q->mq_ops = set->ops; (1) >>> dm_stop_queue / dm_wait_for_completion >>> => q->tag_set NULL pointer! (2) >>> => q->tag_set = set; (3) >>> >>> Fix this by checking if a valid table (map) exists before performing >>> request-based suspend and waiting for target I/O. When map is NULL, >>> skip these table-dependent suspend steps. >>> >>> Even when map is NULL, no I/O can reach any target because there is >>> no table loaded; I/O submitted in this state will fail early in the >>> DM layer. Skipping the table-dependent suspend logic in this case >>> is safe and avoids NULL pointer dereferences. >>> >>> Fixes: c4576aed8d85 ("dm: fix request-based dm's use of dm_wait_for_completion") >>> Cc: stable@vger.kernel.org >>> Signed-off-by: Zheng Qixing <zhengqixing@huawei.com> >>> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> >>> [ omitted DMF_QUEUE_STOPPED flag setting and braces absent in 5.15 ] >>> Signed-off-by: Sasha Levin <sashal@kernel.org> >>> --- >>> drivers/md/dm.c | 7 ++++--- >>> 1 file changed, 4 insertions(+), 3 deletions(-) >>> >>> diff --git a/drivers/md/dm.c b/drivers/md/dm.c >>> index 0868358a7a8d2..b51281ce15d4c 100644 >>> --- a/drivers/md/dm.c >>> +++ b/drivers/md/dm.c >>> @@ -2457,7 +2457,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, >>> { >>> bool do_lockfs = suspend_flags & DM_SUSPEND_LOCKFS_FLAG; >>> bool noflush = suspend_flags & DM_SUSPEND_NOFLUSH_FLAG; >>> - int r; >>> + int r = 0; >>> lockdep_assert_held(&md->suspend_lock); >>> @@ -2509,7 +2509,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, >>> * Stop md->queue before flushing md->wq in case request-based >>> * dm defers requests to md->wq from md->queue. >>> */ >>> - if (dm_request_based(md)) >>> + if (map && dm_request_based(md)) >>> dm_stop_queue(md->queue); >> It seems that before commit 80bd4a7aab4c ("blk-mq: move the srcu_struct >> used for quiescing to the tagset") was merged, blk_mq_wait_quiesce_done() >> would not attempt to access q->tag_set, so in dm_stop_queue() this kind >> of race condition would not cause a null pointer dereference. The change >> may not be necessary for 5.10, but adding it doesn’t appear to cause any >> issues either. >>> flush_workqueue(md->wq); >>> @@ -2519,7 +2519,8 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, >>> * We call dm_wait_for_completion to wait for all existing requests >>> * to finish. >>> */ >>> - r = dm_wait_for_completion(md, task_state); >>> + if (map) >>> + r = dm_wait_for_completion(md, task_state); >> The fix tag c4576aed8d85 ("dm: fix request-based dm's use of >> dm_wait_for_completion") seems to correspond to the modification of >> dm_wait_for_completion(). >>> if (!r) >>> set_bit(dmf_suspended_flag, &md->flags); >> Perhaps adding another fix tag would be more appropriate? > Those tags come directly from the original commit. Thanks for the clarification. Since the patch has already been merged into the mainline tree, we can't update the commit there. I was just wondering if it would be acceptable to add an additional “Fixes” tag when applying it to the stable branch, before merging. Thanks, Lingfeng. > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 5.10.y] dm: fix NULL pointer dereference in __dm_suspend() 2025-10-22 8:21 ` Li Lingfeng @ 2025-10-22 8:45 ` Greg KH 2025-10-22 9:10 ` Li Lingfeng 0 siblings, 1 reply; 13+ messages in thread From: Greg KH @ 2025-10-22 8:45 UTC (permalink / raw) To: Li Lingfeng Cc: Sasha Levin, stable, Zheng Qixing, Mikulas Patocka, dm-devel, yangerkun, zhangyi (F), Hou Tao On Wed, Oct 22, 2025 at 04:21:33PM +0800, Li Lingfeng wrote: > Hi, > > 在 2025/10/22 16:10, Greg KH 写道: > > On Wed, Oct 22, 2025 at 04:01:04PM +0800, Li Lingfeng wrote: > > > Hi, > > > > > > 在 2025/10/14 11:03, Sasha Levin 写道: > > > > From: Zheng Qixing <zhengqixing@huawei.com> > > > > > > > > [ Upstream commit 8d33a030c566e1f105cd5bf27f37940b6367f3be ] > > > > > > > > There is a race condition between dm device suspend and table load that > > > > can lead to null pointer dereference. The issue occurs when suspend is > > > > invoked before table load completes: > > > > > > > > BUG: kernel NULL pointer dereference, address: 0000000000000054 > > > > Oops: 0000 [#1] PREEMPT SMP PTI > > > > CPU: 6 PID: 6798 Comm: dmsetup Not tainted 6.6.0-g7e52f5f0ca9b #62 > > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014 > > > > RIP: 0010:blk_mq_wait_quiesce_done+0x0/0x50 > > > > Call Trace: > > > > <TASK> > > > > blk_mq_quiesce_queue+0x2c/0x50 > > > > dm_stop_queue+0xd/0x20 > > > > __dm_suspend+0x130/0x330 > > > > dm_suspend+0x11a/0x180 > > > > dev_suspend+0x27e/0x560 > > > > ctl_ioctl+0x4cf/0x850 > > > > dm_ctl_ioctl+0xd/0x20 > > > > vfs_ioctl+0x1d/0x50 > > > > __se_sys_ioctl+0x9b/0xc0 > > > > __x64_sys_ioctl+0x19/0x30 > > > > x64_sys_call+0x2c4a/0x4620 > > > > do_syscall_64+0x9e/0x1b0 > > > > > > > > The issue can be triggered as below: > > > > > > > > T1 T2 > > > > dm_suspend table_load > > > > __dm_suspend dm_setup_md_queue > > > > dm_mq_init_request_queue > > > > blk_mq_init_allocated_queue > > > > => q->mq_ops = set->ops; (1) > > > > dm_stop_queue / dm_wait_for_completion > > > > => q->tag_set NULL pointer! (2) > > > > => q->tag_set = set; (3) > > > > > > > > Fix this by checking if a valid table (map) exists before performing > > > > request-based suspend and waiting for target I/O. When map is NULL, > > > > skip these table-dependent suspend steps. > > > > > > > > Even when map is NULL, no I/O can reach any target because there is > > > > no table loaded; I/O submitted in this state will fail early in the > > > > DM layer. Skipping the table-dependent suspend logic in this case > > > > is safe and avoids NULL pointer dereferences. > > > > > > > > Fixes: c4576aed8d85 ("dm: fix request-based dm's use of dm_wait_for_completion") > > > > Cc: stable@vger.kernel.org > > > > Signed-off-by: Zheng Qixing <zhengqixing@huawei.com> > > > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> > > > > [ omitted DMF_QUEUE_STOPPED flag setting and braces absent in 5.15 ] > > > > Signed-off-by: Sasha Levin <sashal@kernel.org> > > > > --- > > > > drivers/md/dm.c | 7 ++++--- > > > > 1 file changed, 4 insertions(+), 3 deletions(-) > > > > > > > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c > > > > index 0868358a7a8d2..b51281ce15d4c 100644 > > > > --- a/drivers/md/dm.c > > > > +++ b/drivers/md/dm.c > > > > @@ -2457,7 +2457,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, > > > > { > > > > bool do_lockfs = suspend_flags & DM_SUSPEND_LOCKFS_FLAG; > > > > bool noflush = suspend_flags & DM_SUSPEND_NOFLUSH_FLAG; > > > > - int r; > > > > + int r = 0; > > > > lockdep_assert_held(&md->suspend_lock); > > > > @@ -2509,7 +2509,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, > > > > * Stop md->queue before flushing md->wq in case request-based > > > > * dm defers requests to md->wq from md->queue. > > > > */ > > > > - if (dm_request_based(md)) > > > > + if (map && dm_request_based(md)) > > > > dm_stop_queue(md->queue); > > > It seems that before commit 80bd4a7aab4c ("blk-mq: move the srcu_struct > > > used for quiescing to the tagset") was merged, blk_mq_wait_quiesce_done() > > > would not attempt to access q->tag_set, so in dm_stop_queue() this kind > > > of race condition would not cause a null pointer dereference. The change > > > may not be necessary for 5.10, but adding it doesn’t appear to cause any > > > issues either. > > > > flush_workqueue(md->wq); > > > > @@ -2519,7 +2519,8 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, > > > > * We call dm_wait_for_completion to wait for all existing requests > > > > * to finish. > > > > */ > > > > - r = dm_wait_for_completion(md, task_state); > > > > + if (map) > > > > + r = dm_wait_for_completion(md, task_state); > > > The fix tag c4576aed8d85 ("dm: fix request-based dm's use of > > > dm_wait_for_completion") seems to correspond to the modification of > > > dm_wait_for_completion(). > > > > if (!r) > > > > set_bit(dmf_suspended_flag, &md->flags); > > > Perhaps adding another fix tag would be more appropriate? > > Those tags come directly from the original commit. > Thanks for the clarification. Since the patch has already been merged into > the mainline tree, we can't update the commit there. I was just wondering > if it would be acceptable to add an additional “Fixes” tag when applying > it to the stable branch, before merging. I can, if needed, can you please provide that line? thanks, greg k-h ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 5.10.y] dm: fix NULL pointer dereference in __dm_suspend() 2025-10-22 8:45 ` Greg KH @ 2025-10-22 9:10 ` Li Lingfeng 2025-10-22 9:14 ` Zheng Qixing 2025-10-22 9:20 ` Greg KH 0 siblings, 2 replies; 13+ messages in thread From: Li Lingfeng @ 2025-10-22 9:10 UTC (permalink / raw) To: Greg KH Cc: Sasha Levin, stable, Zheng Qixing, Mikulas Patocka, dm-devel, yangerkun, zhangyi (F), Hou Tao 在 2025/10/22 16:45, Greg KH 写道: > On Wed, Oct 22, 2025 at 04:21:33PM +0800, Li Lingfeng wrote: >> Hi, >> >> 在 2025/10/22 16:10, Greg KH 写道: >>> On Wed, Oct 22, 2025 at 04:01:04PM +0800, Li Lingfeng wrote: >>>> Hi, >>>> >>>> 在 2025/10/14 11:03, Sasha Levin 写道: >>>>> From: Zheng Qixing <zhengqixing@huawei.com> >>>>> >>>>> [ Upstream commit 8d33a030c566e1f105cd5bf27f37940b6367f3be ] >>>>> >>>>> There is a race condition between dm device suspend and table load that >>>>> can lead to null pointer dereference. The issue occurs when suspend is >>>>> invoked before table load completes: >>>>> >>>>> BUG: kernel NULL pointer dereference, address: 0000000000000054 >>>>> Oops: 0000 [#1] PREEMPT SMP PTI >>>>> CPU: 6 PID: 6798 Comm: dmsetup Not tainted 6.6.0-g7e52f5f0ca9b #62 >>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014 >>>>> RIP: 0010:blk_mq_wait_quiesce_done+0x0/0x50 >>>>> Call Trace: >>>>> <TASK> >>>>> blk_mq_quiesce_queue+0x2c/0x50 >>>>> dm_stop_queue+0xd/0x20 >>>>> __dm_suspend+0x130/0x330 >>>>> dm_suspend+0x11a/0x180 >>>>> dev_suspend+0x27e/0x560 >>>>> ctl_ioctl+0x4cf/0x850 >>>>> dm_ctl_ioctl+0xd/0x20 >>>>> vfs_ioctl+0x1d/0x50 >>>>> __se_sys_ioctl+0x9b/0xc0 >>>>> __x64_sys_ioctl+0x19/0x30 >>>>> x64_sys_call+0x2c4a/0x4620 >>>>> do_syscall_64+0x9e/0x1b0 >>>>> >>>>> The issue can be triggered as below: >>>>> >>>>> T1 T2 >>>>> dm_suspend table_load >>>>> __dm_suspend dm_setup_md_queue >>>>> dm_mq_init_request_queue >>>>> blk_mq_init_allocated_queue >>>>> => q->mq_ops = set->ops; (1) >>>>> dm_stop_queue / dm_wait_for_completion >>>>> => q->tag_set NULL pointer! (2) >>>>> => q->tag_set = set; (3) >>>>> >>>>> Fix this by checking if a valid table (map) exists before performing >>>>> request-based suspend and waiting for target I/O. When map is NULL, >>>>> skip these table-dependent suspend steps. >>>>> >>>>> Even when map is NULL, no I/O can reach any target because there is >>>>> no table loaded; I/O submitted in this state will fail early in the >>>>> DM layer. Skipping the table-dependent suspend logic in this case >>>>> is safe and avoids NULL pointer dereferences. >>>>> >>>>> Fixes: c4576aed8d85 ("dm: fix request-based dm's use of dm_wait_for_completion") >>>>> Cc: stable@vger.kernel.org >>>>> Signed-off-by: Zheng Qixing <zhengqixing@huawei.com> >>>>> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> >>>>> [ omitted DMF_QUEUE_STOPPED flag setting and braces absent in 5.15 ] >>>>> Signed-off-by: Sasha Levin <sashal@kernel.org> >>>>> --- >>>>> drivers/md/dm.c | 7 ++++--- >>>>> 1 file changed, 4 insertions(+), 3 deletions(-) >>>>> >>>>> diff --git a/drivers/md/dm.c b/drivers/md/dm.c >>>>> index 0868358a7a8d2..b51281ce15d4c 100644 >>>>> --- a/drivers/md/dm.c >>>>> +++ b/drivers/md/dm.c >>>>> @@ -2457,7 +2457,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, >>>>> { >>>>> bool do_lockfs = suspend_flags & DM_SUSPEND_LOCKFS_FLAG; >>>>> bool noflush = suspend_flags & DM_SUSPEND_NOFLUSH_FLAG; >>>>> - int r; >>>>> + int r = 0; >>>>> lockdep_assert_held(&md->suspend_lock); >>>>> @@ -2509,7 +2509,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, >>>>> * Stop md->queue before flushing md->wq in case request-based >>>>> * dm defers requests to md->wq from md->queue. >>>>> */ >>>>> - if (dm_request_based(md)) >>>>> + if (map && dm_request_based(md)) >>>>> dm_stop_queue(md->queue); >>>> It seems that before commit 80bd4a7aab4c ("blk-mq: move the srcu_struct >>>> used for quiescing to the tagset") was merged, blk_mq_wait_quiesce_done() >>>> would not attempt to access q->tag_set, so in dm_stop_queue() this kind >>>> of race condition would not cause a null pointer dereference. The change >>>> may not be necessary for 5.10, but adding it doesn’t appear to cause any >>>> issues either. >>>>> flush_workqueue(md->wq); >>>>> @@ -2519,7 +2519,8 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, >>>>> * We call dm_wait_for_completion to wait for all existing requests >>>>> * to finish. >>>>> */ >>>>> - r = dm_wait_for_completion(md, task_state); >>>>> + if (map) >>>>> + r = dm_wait_for_completion(md, task_state); >>>> The fix tag c4576aed8d85 ("dm: fix request-based dm's use of >>>> dm_wait_for_completion") seems to correspond to the modification of >>>> dm_wait_for_completion(). >>>>> if (!r) >>>>> set_bit(dmf_suspended_flag, &md->flags); >>>> Perhaps adding another fix tag would be more appropriate? >>> Those tags come directly from the original commit. >> Thanks for the clarification. Since the patch has already been merged into >> the mainline tree, we can't update the commit there. I was just wondering >> if it would be acceptable to add an additional “Fixes” tag when applying >> it to the stable branch, before merging. > I can, if needed, can you please provide that line? Fixes: 80bd4a7aab4c ("blk-mq: move the srcu_struct used for quiescing to the tagset") Thanks, Lingfeng > > thanks, > > greg k-h > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 5.10.y] dm: fix NULL pointer dereference in __dm_suspend() 2025-10-22 9:10 ` Li Lingfeng @ 2025-10-22 9:14 ` Zheng Qixing 2025-10-22 9:20 ` Greg KH 1 sibling, 0 replies; 13+ messages in thread From: Zheng Qixing @ 2025-10-22 9:14 UTC (permalink / raw) To: Li Lingfeng, Greg KH Cc: Sasha Levin, stable, Mikulas Patocka, dm-devel, yangerkun, zhangyi (F), Hou Tao Hi, 在 2025/10/22 17:10, Li Lingfeng 写道: > > 在 2025/10/22 16:45, Greg KH 写道: >> On Wed, Oct 22, 2025 at 04:21:33PM +0800, Li Lingfeng wrote: >>> Hi, >>> >>> 在 2025/10/22 16:10, Greg KH 写道: >>>> On Wed, Oct 22, 2025 at 04:01:04PM +0800, Li Lingfeng wrote: >>>>> Hi, >>>>> >>>>> 在 2025/10/14 11:03, Sasha Levin 写道: >>>>>> From: Zheng Qixing <zhengqixing@huawei.com> >>>>>> >>>>>> [ Upstream commit 8d33a030c566e1f105cd5bf27f37940b6367f3be ] >>>>>> >>>>>> There is a race condition between dm device suspend and table >>>>>> load that >>>>>> can lead to null pointer dereference. The issue occurs when >>>>>> suspend is >>>>>> invoked before table load completes: >>>>>> >>>>>> BUG: kernel NULL pointer dereference, address: 0000000000000054 >>>>>> Oops: 0000 [#1] PREEMPT SMP PTI >>>>>> CPU: 6 PID: 6798 Comm: dmsetup Not tainted 6.6.0-g7e52f5f0ca9b #62 >>>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS >>>>>> 1.16.1-2.fc37 04/01/2014 >>>>>> RIP: 0010:blk_mq_wait_quiesce_done+0x0/0x50 >>>>>> Call Trace: >>>>>> <TASK> >>>>>> blk_mq_quiesce_queue+0x2c/0x50 >>>>>> dm_stop_queue+0xd/0x20 >>>>>> __dm_suspend+0x130/0x330 >>>>>> dm_suspend+0x11a/0x180 >>>>>> dev_suspend+0x27e/0x560 >>>>>> ctl_ioctl+0x4cf/0x850 >>>>>> dm_ctl_ioctl+0xd/0x20 >>>>>> vfs_ioctl+0x1d/0x50 >>>>>> __se_sys_ioctl+0x9b/0xc0 >>>>>> __x64_sys_ioctl+0x19/0x30 >>>>>> x64_sys_call+0x2c4a/0x4620 >>>>>> do_syscall_64+0x9e/0x1b0 >>>>>> >>>>>> The issue can be triggered as below: >>>>>> >>>>>> T1 T2 >>>>>> dm_suspend table_load >>>>>> __dm_suspend dm_setup_md_queue >>>>>> dm_mq_init_request_queue >>>>>> blk_mq_init_allocated_queue >>>>>> => q->mq_ops = set->ops; (1) >>>>>> dm_stop_queue / dm_wait_for_completion >>>>>> => q->tag_set NULL pointer! (2) >>>>>> => q->tag_set = set; (3) >>>>>> >>>>>> Fix this by checking if a valid table (map) exists before performing >>>>>> request-based suspend and waiting for target I/O. When map is NULL, >>>>>> skip these table-dependent suspend steps. >>>>>> >>>>>> Even when map is NULL, no I/O can reach any target because there is >>>>>> no table loaded; I/O submitted in this state will fail early in the >>>>>> DM layer. Skipping the table-dependent suspend logic in this case >>>>>> is safe and avoids NULL pointer dereferences. >>>>>> >>>>>> Fixes: c4576aed8d85 ("dm: fix request-based dm's use of >>>>>> dm_wait_for_completion") >>>>>> Cc: stable@vger.kernel.org >>>>>> Signed-off-by: Zheng Qixing <zhengqixing@huawei.com> >>>>>> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> >>>>>> [ omitted DMF_QUEUE_STOPPED flag setting and braces absent in 5.15 ] >>>>>> Signed-off-by: Sasha Levin <sashal@kernel.org> >>>>>> --- >>>>>> drivers/md/dm.c | 7 ++++--- >>>>>> 1 file changed, 4 insertions(+), 3 deletions(-) >>>>>> >>>>>> diff --git a/drivers/md/dm.c b/drivers/md/dm.c >>>>>> index 0868358a7a8d2..b51281ce15d4c 100644 >>>>>> --- a/drivers/md/dm.c >>>>>> +++ b/drivers/md/dm.c >>>>>> @@ -2457,7 +2457,7 @@ static int __dm_suspend(struct >>>>>> mapped_device *md, struct dm_table *map, >>>>>> { >>>>>> bool do_lockfs = suspend_flags & DM_SUSPEND_LOCKFS_FLAG; >>>>>> bool noflush = suspend_flags & DM_SUSPEND_NOFLUSH_FLAG; >>>>>> - int r; >>>>>> + int r = 0; >>>>>> lockdep_assert_held(&md->suspend_lock); >>>>>> @@ -2509,7 +2509,7 @@ static int __dm_suspend(struct >>>>>> mapped_device *md, struct dm_table *map, >>>>>> * Stop md->queue before flushing md->wq in case >>>>>> request-based >>>>>> * dm defers requests to md->wq from md->queue. >>>>>> */ >>>>>> - if (dm_request_based(md)) >>>>>> + if (map && dm_request_based(md)) >>>>>> dm_stop_queue(md->queue); >>>>> It seems that before commit 80bd4a7aab4c ("blk-mq: move the >>>>> srcu_struct >>>>> used for quiescing to the tagset") was merged, >>>>> blk_mq_wait_quiesce_done() >>>>> would not attempt to access q->tag_set, so in dm_stop_queue() this >>>>> kind >>>>> of race condition would not cause a null pointer dereference. The >>>>> change >>>>> may not be necessary for 5.10, but adding it doesn’t appear to >>>>> cause any >>>>> issues either. >>>>>> flush_workqueue(md->wq); >>>>>> @@ -2519,7 +2519,8 @@ static int __dm_suspend(struct >>>>>> mapped_device *md, struct dm_table *map, >>>>>> * We call dm_wait_for_completion to wait for all >>>>>> existing requests >>>>>> * to finish. >>>>>> */ >>>>>> - r = dm_wait_for_completion(md, task_state); >>>>>> + if (map) >>>>>> + r = dm_wait_for_completion(md, task_state); >>>>> The fix tag c4576aed8d85 ("dm: fix request-based dm's use of >>>>> dm_wait_for_completion") seems to correspond to the modification of >>>>> dm_wait_for_completion(). >>>>>> if (!r) >>>>>> set_bit(dmf_suspended_flag, &md->flags); >>>>> Perhaps adding another fix tag would be more appropriate? >>>> Those tags come directly from the original commit. >>> Thanks for the clarification. Since the patch has already been >>> merged into >>> the mainline tree, we can't update the commit there. I was just >>> wondering >>> if it would be acceptable to add an additional “Fixes” tag when >>> applying >>> it to the stable branch, before merging. >> I can, if needed, can you please provide that line? > > Fixes: 80bd4a7aab4c ("blk-mq: move the srcu_struct used for quiescing > to the tagset") > > Thanks, > Lingfeng > Thank you for adding a fix tag. Qixing >> >> thanks, >> >> greg k-h >> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 5.10.y] dm: fix NULL pointer dereference in __dm_suspend() 2025-10-22 9:10 ` Li Lingfeng 2025-10-22 9:14 ` Zheng Qixing @ 2025-10-22 9:20 ` Greg KH 2025-10-23 1:12 ` Li Lingfeng 1 sibling, 1 reply; 13+ messages in thread From: Greg KH @ 2025-10-22 9:20 UTC (permalink / raw) To: Li Lingfeng Cc: Sasha Levin, stable, Zheng Qixing, Mikulas Patocka, dm-devel, yangerkun, zhangyi (F), Hou Tao On Wed, Oct 22, 2025 at 05:10:36PM +0800, Li Lingfeng wrote: > > 在 2025/10/22 16:45, Greg KH 写道: > > On Wed, Oct 22, 2025 at 04:21:33PM +0800, Li Lingfeng wrote: > > > Hi, > > > > > > 在 2025/10/22 16:10, Greg KH 写道: > > > > On Wed, Oct 22, 2025 at 04:01:04PM +0800, Li Lingfeng wrote: > > > > > Hi, > > > > > > > > > > 在 2025/10/14 11:03, Sasha Levin 写道: > > > > > > From: Zheng Qixing <zhengqixing@huawei.com> > > > > > > > > > > > > [ Upstream commit 8d33a030c566e1f105cd5bf27f37940b6367f3be ] > > > > > > > > > > > > There is a race condition between dm device suspend and table load that > > > > > > can lead to null pointer dereference. The issue occurs when suspend is > > > > > > invoked before table load completes: > > > > > > > > > > > > BUG: kernel NULL pointer dereference, address: 0000000000000054 > > > > > > Oops: 0000 [#1] PREEMPT SMP PTI > > > > > > CPU: 6 PID: 6798 Comm: dmsetup Not tainted 6.6.0-g7e52f5f0ca9b #62 > > > > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014 > > > > > > RIP: 0010:blk_mq_wait_quiesce_done+0x0/0x50 > > > > > > Call Trace: > > > > > > <TASK> > > > > > > blk_mq_quiesce_queue+0x2c/0x50 > > > > > > dm_stop_queue+0xd/0x20 > > > > > > __dm_suspend+0x130/0x330 > > > > > > dm_suspend+0x11a/0x180 > > > > > > dev_suspend+0x27e/0x560 > > > > > > ctl_ioctl+0x4cf/0x850 > > > > > > dm_ctl_ioctl+0xd/0x20 > > > > > > vfs_ioctl+0x1d/0x50 > > > > > > __se_sys_ioctl+0x9b/0xc0 > > > > > > __x64_sys_ioctl+0x19/0x30 > > > > > > x64_sys_call+0x2c4a/0x4620 > > > > > > do_syscall_64+0x9e/0x1b0 > > > > > > > > > > > > The issue can be triggered as below: > > > > > > > > > > > > T1 T2 > > > > > > dm_suspend table_load > > > > > > __dm_suspend dm_setup_md_queue > > > > > > dm_mq_init_request_queue > > > > > > blk_mq_init_allocated_queue > > > > > > => q->mq_ops = set->ops; (1) > > > > > > dm_stop_queue / dm_wait_for_completion > > > > > > => q->tag_set NULL pointer! (2) > > > > > > => q->tag_set = set; (3) > > > > > > > > > > > > Fix this by checking if a valid table (map) exists before performing > > > > > > request-based suspend and waiting for target I/O. When map is NULL, > > > > > > skip these table-dependent suspend steps. > > > > > > > > > > > > Even when map is NULL, no I/O can reach any target because there is > > > > > > no table loaded; I/O submitted in this state will fail early in the > > > > > > DM layer. Skipping the table-dependent suspend logic in this case > > > > > > is safe and avoids NULL pointer dereferences. > > > > > > > > > > > > Fixes: c4576aed8d85 ("dm: fix request-based dm's use of dm_wait_for_completion") > > > > > > Cc: stable@vger.kernel.org > > > > > > Signed-off-by: Zheng Qixing <zhengqixing@huawei.com> > > > > > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> > > > > > > [ omitted DMF_QUEUE_STOPPED flag setting and braces absent in 5.15 ] > > > > > > Signed-off-by: Sasha Levin <sashal@kernel.org> > > > > > > --- > > > > > > drivers/md/dm.c | 7 ++++--- > > > > > > 1 file changed, 4 insertions(+), 3 deletions(-) > > > > > > > > > > > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c > > > > > > index 0868358a7a8d2..b51281ce15d4c 100644 > > > > > > --- a/drivers/md/dm.c > > > > > > +++ b/drivers/md/dm.c > > > > > > @@ -2457,7 +2457,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, > > > > > > { > > > > > > bool do_lockfs = suspend_flags & DM_SUSPEND_LOCKFS_FLAG; > > > > > > bool noflush = suspend_flags & DM_SUSPEND_NOFLUSH_FLAG; > > > > > > - int r; > > > > > > + int r = 0; > > > > > > lockdep_assert_held(&md->suspend_lock); > > > > > > @@ -2509,7 +2509,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, > > > > > > * Stop md->queue before flushing md->wq in case request-based > > > > > > * dm defers requests to md->wq from md->queue. > > > > > > */ > > > > > > - if (dm_request_based(md)) > > > > > > + if (map && dm_request_based(md)) > > > > > > dm_stop_queue(md->queue); > > > > > It seems that before commit 80bd4a7aab4c ("blk-mq: move the srcu_struct > > > > > used for quiescing to the tagset") was merged, blk_mq_wait_quiesce_done() > > > > > would not attempt to access q->tag_set, so in dm_stop_queue() this kind > > > > > of race condition would not cause a null pointer dereference. The change > > > > > may not be necessary for 5.10, but adding it doesn’t appear to cause any > > > > > issues either. > > > > > > flush_workqueue(md->wq); > > > > > > @@ -2519,7 +2519,8 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, > > > > > > * We call dm_wait_for_completion to wait for all existing requests > > > > > > * to finish. > > > > > > */ > > > > > > - r = dm_wait_for_completion(md, task_state); > > > > > > + if (map) > > > > > > + r = dm_wait_for_completion(md, task_state); > > > > > The fix tag c4576aed8d85 ("dm: fix request-based dm's use of > > > > > dm_wait_for_completion") seems to correspond to the modification of > > > > > dm_wait_for_completion(). > > > > > > if (!r) > > > > > > set_bit(dmf_suspended_flag, &md->flags); > > > > > Perhaps adding another fix tag would be more appropriate? > > > > Those tags come directly from the original commit. > > > Thanks for the clarification. Since the patch has already been merged into > > > the mainline tree, we can't update the commit there. I was just wondering > > > if it would be acceptable to add an additional “Fixes” tag when applying > > > it to the stable branch, before merging. > > I can, if needed, can you please provide that line? > > Fixes: 80bd4a7aab4c ("blk-mq: move the srcu_struct used for quiescing to the > tagset") This is odd, that is a 6.2 commit, yet this is being asked to be added to 5.10.y. So what help is this going to provide? confused, greg k-h ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 5.10.y] dm: fix NULL pointer dereference in __dm_suspend() 2025-10-22 9:20 ` Greg KH @ 2025-10-23 1:12 ` Li Lingfeng 2025-10-23 5:20 ` Greg KH 0 siblings, 1 reply; 13+ messages in thread From: Li Lingfeng @ 2025-10-23 1:12 UTC (permalink / raw) To: Greg KH Cc: Sasha Levin, stable, Zheng Qixing, Mikulas Patocka, dm-devel, yangerkun, zhangyi (F), Hou Tao 在 2025/10/22 17:20, Greg KH 写道: > On Wed, Oct 22, 2025 at 05:10:36PM +0800, Li Lingfeng wrote: >> 在 2025/10/22 16:45, Greg KH 写道: >>> On Wed, Oct 22, 2025 at 04:21:33PM +0800, Li Lingfeng wrote: >>>> Hi, >>>> >>>> 在 2025/10/22 16:10, Greg KH 写道: >>>>> On Wed, Oct 22, 2025 at 04:01:04PM +0800, Li Lingfeng wrote: >>>>>> Hi, >>>>>> >>>>>> 在 2025/10/14 11:03, Sasha Levin 写道: >>>>>>> From: Zheng Qixing <zhengqixing@huawei.com> >>>>>>> >>>>>>> [ Upstream commit 8d33a030c566e1f105cd5bf27f37940b6367f3be ] >>>>>>> >>>>>>> There is a race condition between dm device suspend and table load that >>>>>>> can lead to null pointer dereference. The issue occurs when suspend is >>>>>>> invoked before table load completes: >>>>>>> >>>>>>> BUG: kernel NULL pointer dereference, address: 0000000000000054 >>>>>>> Oops: 0000 [#1] PREEMPT SMP PTI >>>>>>> CPU: 6 PID: 6798 Comm: dmsetup Not tainted 6.6.0-g7e52f5f0ca9b #62 >>>>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014 >>>>>>> RIP: 0010:blk_mq_wait_quiesce_done+0x0/0x50 >>>>>>> Call Trace: >>>>>>> <TASK> >>>>>>> blk_mq_quiesce_queue+0x2c/0x50 >>>>>>> dm_stop_queue+0xd/0x20 >>>>>>> __dm_suspend+0x130/0x330 >>>>>>> dm_suspend+0x11a/0x180 >>>>>>> dev_suspend+0x27e/0x560 >>>>>>> ctl_ioctl+0x4cf/0x850 >>>>>>> dm_ctl_ioctl+0xd/0x20 >>>>>>> vfs_ioctl+0x1d/0x50 >>>>>>> __se_sys_ioctl+0x9b/0xc0 >>>>>>> __x64_sys_ioctl+0x19/0x30 >>>>>>> x64_sys_call+0x2c4a/0x4620 >>>>>>> do_syscall_64+0x9e/0x1b0 >>>>>>> >>>>>>> The issue can be triggered as below: >>>>>>> >>>>>>> T1 T2 >>>>>>> dm_suspend table_load >>>>>>> __dm_suspend dm_setup_md_queue >>>>>>> dm_mq_init_request_queue >>>>>>> blk_mq_init_allocated_queue >>>>>>> => q->mq_ops = set->ops; (1) >>>>>>> dm_stop_queue / dm_wait_for_completion >>>>>>> => q->tag_set NULL pointer! (2) >>>>>>> => q->tag_set = set; (3) >>>>>>> >>>>>>> Fix this by checking if a valid table (map) exists before performing >>>>>>> request-based suspend and waiting for target I/O. When map is NULL, >>>>>>> skip these table-dependent suspend steps. >>>>>>> >>>>>>> Even when map is NULL, no I/O can reach any target because there is >>>>>>> no table loaded; I/O submitted in this state will fail early in the >>>>>>> DM layer. Skipping the table-dependent suspend logic in this case >>>>>>> is safe and avoids NULL pointer dereferences. >>>>>>> >>>>>>> Fixes: c4576aed8d85 ("dm: fix request-based dm's use of dm_wait_for_completion") >>>>>>> Cc: stable@vger.kernel.org >>>>>>> Signed-off-by: Zheng Qixing <zhengqixing@huawei.com> >>>>>>> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> >>>>>>> [ omitted DMF_QUEUE_STOPPED flag setting and braces absent in 5.15 ] >>>>>>> Signed-off-by: Sasha Levin <sashal@kernel.org> >>>>>>> --- >>>>>>> drivers/md/dm.c | 7 ++++--- >>>>>>> 1 file changed, 4 insertions(+), 3 deletions(-) >>>>>>> >>>>>>> diff --git a/drivers/md/dm.c b/drivers/md/dm.c >>>>>>> index 0868358a7a8d2..b51281ce15d4c 100644 >>>>>>> --- a/drivers/md/dm.c >>>>>>> +++ b/drivers/md/dm.c >>>>>>> @@ -2457,7 +2457,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, >>>>>>> { >>>>>>> bool do_lockfs = suspend_flags & DM_SUSPEND_LOCKFS_FLAG; >>>>>>> bool noflush = suspend_flags & DM_SUSPEND_NOFLUSH_FLAG; >>>>>>> - int r; >>>>>>> + int r = 0; >>>>>>> lockdep_assert_held(&md->suspend_lock); >>>>>>> @@ -2509,7 +2509,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, >>>>>>> * Stop md->queue before flushing md->wq in case request-based >>>>>>> * dm defers requests to md->wq from md->queue. >>>>>>> */ >>>>>>> - if (dm_request_based(md)) >>>>>>> + if (map && dm_request_based(md)) >>>>>>> dm_stop_queue(md->queue); >>>>>> It seems that before commit 80bd4a7aab4c ("blk-mq: move the srcu_struct >>>>>> used for quiescing to the tagset") was merged, blk_mq_wait_quiesce_done() >>>>>> would not attempt to access q->tag_set, so in dm_stop_queue() this kind >>>>>> of race condition would not cause a null pointer dereference. The change >>>>>> may not be necessary for 5.10, but adding it doesn’t appear to cause any >>>>>> issues either. >>>>>>> flush_workqueue(md->wq); >>>>>>> @@ -2519,7 +2519,8 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, >>>>>>> * We call dm_wait_for_completion to wait for all existing requests >>>>>>> * to finish. >>>>>>> */ >>>>>>> - r = dm_wait_for_completion(md, task_state); >>>>>>> + if (map) >>>>>>> + r = dm_wait_for_completion(md, task_state); >>>>>> The fix tag c4576aed8d85 ("dm: fix request-based dm's use of >>>>>> dm_wait_for_completion") seems to correspond to the modification of >>>>>> dm_wait_for_completion(). >>>>>>> if (!r) >>>>>>> set_bit(dmf_suspended_flag, &md->flags); >>>>>> Perhaps adding another fix tag would be more appropriate? >>>>> Those tags come directly from the original commit. >>>> Thanks for the clarification. Since the patch has already been merged into >>>> the mainline tree, we can't update the commit there. I was just wondering >>>> if it would be acceptable to add an additional “Fixes” tag when applying >>>> it to the stable branch, before merging. >>> I can, if needed, can you please provide that line? >> Fixes: 80bd4a7aab4c ("blk-mq: move the srcu_struct used for quiescing to the >> tagset") > This is odd, that is a 6.2 commit, yet this is being asked to be added > to 5.10.y. So what help is this going to provide? > > confused, > > greg k-h Apologies, I didn't notice the version of this patch. Indeed, this patch was not included in 5.10, and 5.10 is not affected by this issue. Perhaps we could add the new fix tag to the stable branch patches starting from 6.2? Thanks, Lingfeng ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 5.10.y] dm: fix NULL pointer dereference in __dm_suspend() 2025-10-23 1:12 ` Li Lingfeng @ 2025-10-23 5:20 ` Greg KH 2025-10-23 7:07 ` Li Lingfeng 0 siblings, 1 reply; 13+ messages in thread From: Greg KH @ 2025-10-23 5:20 UTC (permalink / raw) To: Li Lingfeng Cc: Sasha Levin, stable, Zheng Qixing, Mikulas Patocka, dm-devel, yangerkun, zhangyi (F), Hou Tao On Thu, Oct 23, 2025 at 09:12:13AM +0800, Li Lingfeng wrote: > > 在 2025/10/22 17:20, Greg KH 写道: > > On Wed, Oct 22, 2025 at 05:10:36PM +0800, Li Lingfeng wrote: > > > 在 2025/10/22 16:45, Greg KH 写道: > > > > On Wed, Oct 22, 2025 at 04:21:33PM +0800, Li Lingfeng wrote: > > > > > Hi, > > > > > > > > > > 在 2025/10/22 16:10, Greg KH 写道: > > > > > > On Wed, Oct 22, 2025 at 04:01:04PM +0800, Li Lingfeng wrote: > > > > > > > Hi, > > > > > > > > > > > > > > 在 2025/10/14 11:03, Sasha Levin 写道: > > > > > > > > From: Zheng Qixing <zhengqixing@huawei.com> > > > > > > > > > > > > > > > > [ Upstream commit 8d33a030c566e1f105cd5bf27f37940b6367f3be ] > > > > > > > > > > > > > > > > There is a race condition between dm device suspend and table load that > > > > > > > > can lead to null pointer dereference. The issue occurs when suspend is > > > > > > > > invoked before table load completes: > > > > > > > > > > > > > > > > BUG: kernel NULL pointer dereference, address: 0000000000000054 > > > > > > > > Oops: 0000 [#1] PREEMPT SMP PTI > > > > > > > > CPU: 6 PID: 6798 Comm: dmsetup Not tainted 6.6.0-g7e52f5f0ca9b #62 > > > > > > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014 > > > > > > > > RIP: 0010:blk_mq_wait_quiesce_done+0x0/0x50 > > > > > > > > Call Trace: > > > > > > > > <TASK> > > > > > > > > blk_mq_quiesce_queue+0x2c/0x50 > > > > > > > > dm_stop_queue+0xd/0x20 > > > > > > > > __dm_suspend+0x130/0x330 > > > > > > > > dm_suspend+0x11a/0x180 > > > > > > > > dev_suspend+0x27e/0x560 > > > > > > > > ctl_ioctl+0x4cf/0x850 > > > > > > > > dm_ctl_ioctl+0xd/0x20 > > > > > > > > vfs_ioctl+0x1d/0x50 > > > > > > > > __se_sys_ioctl+0x9b/0xc0 > > > > > > > > __x64_sys_ioctl+0x19/0x30 > > > > > > > > x64_sys_call+0x2c4a/0x4620 > > > > > > > > do_syscall_64+0x9e/0x1b0 > > > > > > > > > > > > > > > > The issue can be triggered as below: > > > > > > > > > > > > > > > > T1 T2 > > > > > > > > dm_suspend table_load > > > > > > > > __dm_suspend dm_setup_md_queue > > > > > > > > dm_mq_init_request_queue > > > > > > > > blk_mq_init_allocated_queue > > > > > > > > => q->mq_ops = set->ops; (1) > > > > > > > > dm_stop_queue / dm_wait_for_completion > > > > > > > > => q->tag_set NULL pointer! (2) > > > > > > > > => q->tag_set = set; (3) > > > > > > > > > > > > > > > > Fix this by checking if a valid table (map) exists before performing > > > > > > > > request-based suspend and waiting for target I/O. When map is NULL, > > > > > > > > skip these table-dependent suspend steps. > > > > > > > > > > > > > > > > Even when map is NULL, no I/O can reach any target because there is > > > > > > > > no table loaded; I/O submitted in this state will fail early in the > > > > > > > > DM layer. Skipping the table-dependent suspend logic in this case > > > > > > > > is safe and avoids NULL pointer dereferences. > > > > > > > > > > > > > > > > Fixes: c4576aed8d85 ("dm: fix request-based dm's use of dm_wait_for_completion") > > > > > > > > Cc: stable@vger.kernel.org > > > > > > > > Signed-off-by: Zheng Qixing <zhengqixing@huawei.com> > > > > > > > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> > > > > > > > > [ omitted DMF_QUEUE_STOPPED flag setting and braces absent in 5.15 ] > > > > > > > > Signed-off-by: Sasha Levin <sashal@kernel.org> > > > > > > > > --- > > > > > > > > drivers/md/dm.c | 7 ++++--- > > > > > > > > 1 file changed, 4 insertions(+), 3 deletions(-) > > > > > > > > > > > > > > > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c > > > > > > > > index 0868358a7a8d2..b51281ce15d4c 100644 > > > > > > > > --- a/drivers/md/dm.c > > > > > > > > +++ b/drivers/md/dm.c > > > > > > > > @@ -2457,7 +2457,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, > > > > > > > > { > > > > > > > > bool do_lockfs = suspend_flags & DM_SUSPEND_LOCKFS_FLAG; > > > > > > > > bool noflush = suspend_flags & DM_SUSPEND_NOFLUSH_FLAG; > > > > > > > > - int r; > > > > > > > > + int r = 0; > > > > > > > > lockdep_assert_held(&md->suspend_lock); > > > > > > > > @@ -2509,7 +2509,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, > > > > > > > > * Stop md->queue before flushing md->wq in case request-based > > > > > > > > * dm defers requests to md->wq from md->queue. > > > > > > > > */ > > > > > > > > - if (dm_request_based(md)) > > > > > > > > + if (map && dm_request_based(md)) > > > > > > > > dm_stop_queue(md->queue); > > > > > > > It seems that before commit 80bd4a7aab4c ("blk-mq: move the srcu_struct > > > > > > > used for quiescing to the tagset") was merged, blk_mq_wait_quiesce_done() > > > > > > > would not attempt to access q->tag_set, so in dm_stop_queue() this kind > > > > > > > of race condition would not cause a null pointer dereference. The change > > > > > > > may not be necessary for 5.10, but adding it doesn’t appear to cause any > > > > > > > issues either. > > > > > > > > flush_workqueue(md->wq); > > > > > > > > @@ -2519,7 +2519,8 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, > > > > > > > > * We call dm_wait_for_completion to wait for all existing requests > > > > > > > > * to finish. > > > > > > > > */ > > > > > > > > - r = dm_wait_for_completion(md, task_state); > > > > > > > > + if (map) > > > > > > > > + r = dm_wait_for_completion(md, task_state); > > > > > > > The fix tag c4576aed8d85 ("dm: fix request-based dm's use of > > > > > > > dm_wait_for_completion") seems to correspond to the modification of > > > > > > > dm_wait_for_completion(). > > > > > > > > if (!r) > > > > > > > > set_bit(dmf_suspended_flag, &md->flags); > > > > > > > Perhaps adding another fix tag would be more appropriate? > > > > > > Those tags come directly from the original commit. > > > > > Thanks for the clarification. Since the patch has already been merged into > > > > > the mainline tree, we can't update the commit there. I was just wondering > > > > > if it would be acceptable to add an additional “Fixes” tag when applying > > > > > it to the stable branch, before merging. > > > > I can, if needed, can you please provide that line? > > > Fixes: 80bd4a7aab4c ("blk-mq: move the srcu_struct used for quiescing to the > > > tagset") > > This is odd, that is a 6.2 commit, yet this is being asked to be added > > to 5.10.y. So what help is this going to provide? > > > > confused, > > > > greg k-h > Apologies, I didn't notice the version of this patch. Indeed, this patch > was not included in 5.10, and 5.10 is not affected by this issue. The original Fixes: tag points to a commit from 5.0, which is why this was queued up for here. > Perhaps we could add the new fix tag to the stable branch patches > starting from 6.2? I do not understand, sorry. Let's start over, should this patch be included in the 5.10.y tree? The 5.4.y tree? If not, why not? And as this is already in the 5.15.y and 6.1.y releases, should it be reverted from there? thanks, greg k-h ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 5.10.y] dm: fix NULL pointer dereference in __dm_suspend() 2025-10-23 5:20 ` Greg KH @ 2025-10-23 7:07 ` Li Lingfeng 2025-10-23 8:43 ` Greg KH 0 siblings, 1 reply; 13+ messages in thread From: Li Lingfeng @ 2025-10-23 7:07 UTC (permalink / raw) To: Greg KH Cc: Sasha Levin, stable, Zheng Qixing, Mikulas Patocka, dm-devel, yangerkun, zhangyi (F), Hou Tao 在 2025/10/23 13:20, Greg KH 写道: > On Thu, Oct 23, 2025 at 09:12:13AM +0800, Li Lingfeng wrote: >> 在 2025/10/22 17:20, Greg KH 写道: >>> On Wed, Oct 22, 2025 at 05:10:36PM +0800, Li Lingfeng wrote: >>>> 在 2025/10/22 16:45, Greg KH 写道: >>>>> On Wed, Oct 22, 2025 at 04:21:33PM +0800, Li Lingfeng wrote: >>>>>> Hi, >>>>>> >>>>>> 在 2025/10/22 16:10, Greg KH 写道: >>>>>>> On Wed, Oct 22, 2025 at 04:01:04PM +0800, Li Lingfeng wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> 在 2025/10/14 11:03, Sasha Levin 写道: >>>>>>>>> From: Zheng Qixing <zhengqixing@huawei.com> >>>>>>>>> >>>>>>>>> [ Upstream commit 8d33a030c566e1f105cd5bf27f37940b6367f3be ] >>>>>>>>> >>>>>>>>> There is a race condition between dm device suspend and table load that >>>>>>>>> can lead to null pointer dereference. The issue occurs when suspend is >>>>>>>>> invoked before table load completes: >>>>>>>>> >>>>>>>>> BUG: kernel NULL pointer dereference, address: 0000000000000054 >>>>>>>>> Oops: 0000 [#1] PREEMPT SMP PTI >>>>>>>>> CPU: 6 PID: 6798 Comm: dmsetup Not tainted 6.6.0-g7e52f5f0ca9b #62 >>>>>>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014 >>>>>>>>> RIP: 0010:blk_mq_wait_quiesce_done+0x0/0x50 >>>>>>>>> Call Trace: >>>>>>>>> <TASK> >>>>>>>>> blk_mq_quiesce_queue+0x2c/0x50 >>>>>>>>> dm_stop_queue+0xd/0x20 >>>>>>>>> __dm_suspend+0x130/0x330 >>>>>>>>> dm_suspend+0x11a/0x180 >>>>>>>>> dev_suspend+0x27e/0x560 >>>>>>>>> ctl_ioctl+0x4cf/0x850 >>>>>>>>> dm_ctl_ioctl+0xd/0x20 >>>>>>>>> vfs_ioctl+0x1d/0x50 >>>>>>>>> __se_sys_ioctl+0x9b/0xc0 >>>>>>>>> __x64_sys_ioctl+0x19/0x30 >>>>>>>>> x64_sys_call+0x2c4a/0x4620 >>>>>>>>> do_syscall_64+0x9e/0x1b0 >>>>>>>>> >>>>>>>>> The issue can be triggered as below: >>>>>>>>> >>>>>>>>> T1 T2 >>>>>>>>> dm_suspend table_load >>>>>>>>> __dm_suspend dm_setup_md_queue >>>>>>>>> dm_mq_init_request_queue >>>>>>>>> blk_mq_init_allocated_queue >>>>>>>>> => q->mq_ops = set->ops; (1) >>>>>>>>> dm_stop_queue / dm_wait_for_completion >>>>>>>>> => q->tag_set NULL pointer! (2) >>>>>>>>> => q->tag_set = set; (3) >>>>>>>>> >>>>>>>>> Fix this by checking if a valid table (map) exists before performing >>>>>>>>> request-based suspend and waiting for target I/O. When map is NULL, >>>>>>>>> skip these table-dependent suspend steps. >>>>>>>>> >>>>>>>>> Even when map is NULL, no I/O can reach any target because there is >>>>>>>>> no table loaded; I/O submitted in this state will fail early in the >>>>>>>>> DM layer. Skipping the table-dependent suspend logic in this case >>>>>>>>> is safe and avoids NULL pointer dereferences. >>>>>>>>> >>>>>>>>> Fixes: c4576aed8d85 ("dm: fix request-based dm's use of dm_wait_for_completion") >>>>>>>>> Cc: stable@vger.kernel.org >>>>>>>>> Signed-off-by: Zheng Qixing <zhengqixing@huawei.com> >>>>>>>>> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> >>>>>>>>> [ omitted DMF_QUEUE_STOPPED flag setting and braces absent in 5.15 ] >>>>>>>>> Signed-off-by: Sasha Levin <sashal@kernel.org> >>>>>>>>> --- >>>>>>>>> drivers/md/dm.c | 7 ++++--- >>>>>>>>> 1 file changed, 4 insertions(+), 3 deletions(-) >>>>>>>>> >>>>>>>>> diff --git a/drivers/md/dm.c b/drivers/md/dm.c >>>>>>>>> index 0868358a7a8d2..b51281ce15d4c 100644 >>>>>>>>> --- a/drivers/md/dm.c >>>>>>>>> +++ b/drivers/md/dm.c >>>>>>>>> @@ -2457,7 +2457,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, >>>>>>>>> { >>>>>>>>> bool do_lockfs = suspend_flags & DM_SUSPEND_LOCKFS_FLAG; >>>>>>>>> bool noflush = suspend_flags & DM_SUSPEND_NOFLUSH_FLAG; >>>>>>>>> - int r; >>>>>>>>> + int r = 0; >>>>>>>>> lockdep_assert_held(&md->suspend_lock); >>>>>>>>> @@ -2509,7 +2509,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, >>>>>>>>> * Stop md->queue before flushing md->wq in case request-based >>>>>>>>> * dm defers requests to md->wq from md->queue. >>>>>>>>> */ >>>>>>>>> - if (dm_request_based(md)) >>>>>>>>> + if (map && dm_request_based(md)) >>>>>>>>> dm_stop_queue(md->queue); >>>>>>>> It seems that before commit 80bd4a7aab4c ("blk-mq: move the srcu_struct >>>>>>>> used for quiescing to the tagset") was merged, blk_mq_wait_quiesce_done() >>>>>>>> would not attempt to access q->tag_set, so in dm_stop_queue() this kind >>>>>>>> of race condition would not cause a null pointer dereference. The change >>>>>>>> may not be necessary for 5.10, but adding it doesn’t appear to cause any >>>>>>>> issues either. >>>>>>>>> flush_workqueue(md->wq); >>>>>>>>> @@ -2519,7 +2519,8 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, >>>>>>>>> * We call dm_wait_for_completion to wait for all existing requests >>>>>>>>> * to finish. >>>>>>>>> */ >>>>>>>>> - r = dm_wait_for_completion(md, task_state); >>>>>>>>> + if (map) >>>>>>>>> + r = dm_wait_for_completion(md, task_state); >>>>>>>> The fix tag c4576aed8d85 ("dm: fix request-based dm's use of >>>>>>>> dm_wait_for_completion") seems to correspond to the modification of >>>>>>>> dm_wait_for_completion(). >>>>>>>>> if (!r) >>>>>>>>> set_bit(dmf_suspended_flag, &md->flags); >>>>>>>> Perhaps adding another fix tag would be more appropriate? >>>>>>> Those tags come directly from the original commit. >>>>>> Thanks for the clarification. Since the patch has already been merged into >>>>>> the mainline tree, we can't update the commit there. I was just wondering >>>>>> if it would be acceptable to add an additional “Fixes” tag when applying >>>>>> it to the stable branch, before merging. >>>>> I can, if needed, can you please provide that line? >>>> Fixes: 80bd4a7aab4c ("blk-mq: move the srcu_struct used for quiescing to the >>>> tagset") >>> This is odd, that is a 6.2 commit, yet this is being asked to be added >>> to 5.10.y. So what help is this going to provide? >>> >>> confused, >>> >>> greg k-h >> Apologies, I didn't notice the version of this patch. Indeed, this patch >> was not included in 5.10, and 5.10 is not affected by this issue. > The original Fixes: tag points to a commit from 5.0, which is why this > was queued up for here. > >> Perhaps we could add the new fix tag to the stable branch patches >> starting from 6.2? > I do not understand, sorry. > > Let's start over, should this patch be included in the 5.10.y tree? The > 5.4.y tree? If not, why not? > > And as this is already in the 5.15.y and 6.1.y releases, should it be > reverted from there? > > thanks, > > greg k-h The original patch made two changes: 1. @@ -2960,7 +2960,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, * Stop md->queue before flushing md->wq in case request-based * dm defers requests to md->wq from md->queue. */ - if (dm_request_based(md)) { + if (map && dm_request_based(md)) { dm_stop_queue(md->queue); set_bit(DMF_QUEUE_STOPPED, &md->flags); } After commit 80bd4a7aab4c ("blk-mq: move the srcu_struct used for quiescing to the tagset") was merged in v6.2, dm_request_based() started to trigger access to tag_set, so a null check on map became necessary to avoid a potential null pointer dereference. The original patch merged into mainline was missing this Fixes tag. 2. @@ -2972,7 +2972,8 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, * We call dm_wait_for_completion to wait for all existing requests * to finish. */ - r = dm_wait_for_completion(md, task_state); + if (map) + r = dm_wait_for_completion(md, task_state); if (!r) set_bit(dmf_suspended_flag, &md->flags); After commit c4576aed8d85 ("dm: fix request-based dm's use of dm_wait_for_completion") was merged in v5.0, dm_wait_for_completion() started to access tag_set, so the additional null check on map was also needed to prevent a null pointer dereference. In summary: For kernels v6.2 and later, there should be two Fixes tags: Fixes: 80bd4a7aab4c ("blk-mq: move the srcu_struct used for quiescing to the tagset") Fixes: c4576aed8d85 ("dm: fix request-based dm's use of dm_wait_for_completion") For kernels v5.0 to v6.2, only the second Fixes tag is needed: Fixes: c4576aed8d85 ("dm: fix request-based dm's use of dm_wait_for_completion") Although the first change isn't actually required for kernels between v5.0 and v6.2, it doesn't introduce any new issues, so there's no need to revert it. My suggestion is: For stable kernels v6.2 and later, if this patch hasn't been applied yet, it would be better to include the extra Fixes tag for 80bd4a7aab4c when merging. If it has already been merged, there's no need to revert it, since the first change is harmless anyway. Sorry for not explaining this clearly earlier. Thanks, Lingfeng ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 5.10.y] dm: fix NULL pointer dereference in __dm_suspend() 2025-10-23 7:07 ` Li Lingfeng @ 2025-10-23 8:43 ` Greg KH 0 siblings, 0 replies; 13+ messages in thread From: Greg KH @ 2025-10-23 8:43 UTC (permalink / raw) To: Li Lingfeng Cc: Sasha Levin, stable, Zheng Qixing, Mikulas Patocka, dm-devel, yangerkun, zhangyi (F), Hou Tao On Thu, Oct 23, 2025 at 03:07:28PM +0800, Li Lingfeng wrote: > > 在 2025/10/23 13:20, Greg KH 写道: > > On Thu, Oct 23, 2025 at 09:12:13AM +0800, Li Lingfeng wrote: > > > 在 2025/10/22 17:20, Greg KH 写道: > > > > On Wed, Oct 22, 2025 at 05:10:36PM +0800, Li Lingfeng wrote: > > > > > 在 2025/10/22 16:45, Greg KH 写道: > > > > > > On Wed, Oct 22, 2025 at 04:21:33PM +0800, Li Lingfeng wrote: > > > > > > > Hi, > > > > > > > > > > > > > > 在 2025/10/22 16:10, Greg KH 写道: > > > > > > > > On Wed, Oct 22, 2025 at 04:01:04PM +0800, Li Lingfeng wrote: > > > > > > > > > Hi, > > > > > > > > > > > > > > > > > > 在 2025/10/14 11:03, Sasha Levin 写道: > > > > > > > > > > From: Zheng Qixing <zhengqixing@huawei.com> > > > > > > > > > > > > > > > > > > > > [ Upstream commit 8d33a030c566e1f105cd5bf27f37940b6367f3be ] > > > > > > > > > > > > > > > > > > > > There is a race condition between dm device suspend and table load that > > > > > > > > > > can lead to null pointer dereference. The issue occurs when suspend is > > > > > > > > > > invoked before table load completes: > > > > > > > > > > > > > > > > > > > > BUG: kernel NULL pointer dereference, address: 0000000000000054 > > > > > > > > > > Oops: 0000 [#1] PREEMPT SMP PTI > > > > > > > > > > CPU: 6 PID: 6798 Comm: dmsetup Not tainted 6.6.0-g7e52f5f0ca9b #62 > > > > > > > > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014 > > > > > > > > > > RIP: 0010:blk_mq_wait_quiesce_done+0x0/0x50 > > > > > > > > > > Call Trace: > > > > > > > > > > <TASK> > > > > > > > > > > blk_mq_quiesce_queue+0x2c/0x50 > > > > > > > > > > dm_stop_queue+0xd/0x20 > > > > > > > > > > __dm_suspend+0x130/0x330 > > > > > > > > > > dm_suspend+0x11a/0x180 > > > > > > > > > > dev_suspend+0x27e/0x560 > > > > > > > > > > ctl_ioctl+0x4cf/0x850 > > > > > > > > > > dm_ctl_ioctl+0xd/0x20 > > > > > > > > > > vfs_ioctl+0x1d/0x50 > > > > > > > > > > __se_sys_ioctl+0x9b/0xc0 > > > > > > > > > > __x64_sys_ioctl+0x19/0x30 > > > > > > > > > > x64_sys_call+0x2c4a/0x4620 > > > > > > > > > > do_syscall_64+0x9e/0x1b0 > > > > > > > > > > > > > > > > > > > > The issue can be triggered as below: > > > > > > > > > > > > > > > > > > > > T1 T2 > > > > > > > > > > dm_suspend table_load > > > > > > > > > > __dm_suspend dm_setup_md_queue > > > > > > > > > > dm_mq_init_request_queue > > > > > > > > > > blk_mq_init_allocated_queue > > > > > > > > > > => q->mq_ops = set->ops; (1) > > > > > > > > > > dm_stop_queue / dm_wait_for_completion > > > > > > > > > > => q->tag_set NULL pointer! (2) > > > > > > > > > > => q->tag_set = set; (3) > > > > > > > > > > > > > > > > > > > > Fix this by checking if a valid table (map) exists before performing > > > > > > > > > > request-based suspend and waiting for target I/O. When map is NULL, > > > > > > > > > > skip these table-dependent suspend steps. > > > > > > > > > > > > > > > > > > > > Even when map is NULL, no I/O can reach any target because there is > > > > > > > > > > no table loaded; I/O submitted in this state will fail early in the > > > > > > > > > > DM layer. Skipping the table-dependent suspend logic in this case > > > > > > > > > > is safe and avoids NULL pointer dereferences. > > > > > > > > > > > > > > > > > > > > Fixes: c4576aed8d85 ("dm: fix request-based dm's use of dm_wait_for_completion") > > > > > > > > > > Cc: stable@vger.kernel.org > > > > > > > > > > Signed-off-by: Zheng Qixing <zhengqixing@huawei.com> > > > > > > > > > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> > > > > > > > > > > [ omitted DMF_QUEUE_STOPPED flag setting and braces absent in 5.15 ] > > > > > > > > > > Signed-off-by: Sasha Levin <sashal@kernel.org> > > > > > > > > > > --- > > > > > > > > > > drivers/md/dm.c | 7 ++++--- > > > > > > > > > > 1 file changed, 4 insertions(+), 3 deletions(-) > > > > > > > > > > > > > > > > > > > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c > > > > > > > > > > index 0868358a7a8d2..b51281ce15d4c 100644 > > > > > > > > > > --- a/drivers/md/dm.c > > > > > > > > > > +++ b/drivers/md/dm.c > > > > > > > > > > @@ -2457,7 +2457,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, > > > > > > > > > > { > > > > > > > > > > bool do_lockfs = suspend_flags & DM_SUSPEND_LOCKFS_FLAG; > > > > > > > > > > bool noflush = suspend_flags & DM_SUSPEND_NOFLUSH_FLAG; > > > > > > > > > > - int r; > > > > > > > > > > + int r = 0; > > > > > > > > > > lockdep_assert_held(&md->suspend_lock); > > > > > > > > > > @@ -2509,7 +2509,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, > > > > > > > > > > * Stop md->queue before flushing md->wq in case request-based > > > > > > > > > > * dm defers requests to md->wq from md->queue. > > > > > > > > > > */ > > > > > > > > > > - if (dm_request_based(md)) > > > > > > > > > > + if (map && dm_request_based(md)) > > > > > > > > > > dm_stop_queue(md->queue); > > > > > > > > > It seems that before commit 80bd4a7aab4c ("blk-mq: move the srcu_struct > > > > > > > > > used for quiescing to the tagset") was merged, blk_mq_wait_quiesce_done() > > > > > > > > > would not attempt to access q->tag_set, so in dm_stop_queue() this kind > > > > > > > > > of race condition would not cause a null pointer dereference. The change > > > > > > > > > may not be necessary for 5.10, but adding it doesn’t appear to cause any > > > > > > > > > issues either. > > > > > > > > > > flush_workqueue(md->wq); > > > > > > > > > > @@ -2519,7 +2519,8 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map, > > > > > > > > > > * We call dm_wait_for_completion to wait for all existing requests > > > > > > > > > > * to finish. > > > > > > > > > > */ > > > > > > > > > > - r = dm_wait_for_completion(md, task_state); > > > > > > > > > > + if (map) > > > > > > > > > > + r = dm_wait_for_completion(md, task_state); > > > > > > > > > The fix tag c4576aed8d85 ("dm: fix request-based dm's use of > > > > > > > > > dm_wait_for_completion") seems to correspond to the modification of > > > > > > > > > dm_wait_for_completion(). > > > > > > > > > > if (!r) > > > > > > > > > > set_bit(dmf_suspended_flag, &md->flags); > > > > > > > > > Perhaps adding another fix tag would be more appropriate? > > > > > > > > Those tags come directly from the original commit. > > > > > > > Thanks for the clarification. Since the patch has already been merged into > > > > > > > the mainline tree, we can't update the commit there. I was just wondering > > > > > > > if it would be acceptable to add an additional “Fixes” tag when applying > > > > > > > it to the stable branch, before merging. > > > > > > I can, if needed, can you please provide that line? > > > > > Fixes: 80bd4a7aab4c ("blk-mq: move the srcu_struct used for quiescing to the > > > > > tagset") > > > > This is odd, that is a 6.2 commit, yet this is being asked to be added > > > > to 5.10.y. So what help is this going to provide? > > > > > > > > confused, > > > > > > > > greg k-h > > > Apologies, I didn't notice the version of this patch. Indeed, this patch > > > was not included in 5.10, and 5.10 is not affected by this issue. > > The original Fixes: tag points to a commit from 5.0, which is why this > > was queued up for here. > > > > > Perhaps we could add the new fix tag to the stable branch patches > > > starting from 6.2? > > I do not understand, sorry. > > > > Let's start over, should this patch be included in the 5.10.y tree? The > > 5.4.y tree? If not, why not? > > > > And as this is already in the 5.15.y and 6.1.y releases, should it be > > reverted from there? > > > > thanks, > > > > greg k-h > The original patch made two changes: > 1. > @@ -2960,7 +2960,7 @@ static int __dm_suspend(struct mapped_device *md, > struct dm_table *map, > * Stop md->queue before flushing md->wq in case request-based > * dm defers requests to md->wq from md->queue. > */ > - if (dm_request_based(md)) { > + if (map && dm_request_based(md)) { > dm_stop_queue(md->queue); > set_bit(DMF_QUEUE_STOPPED, &md->flags); > } > After commit 80bd4a7aab4c ("blk-mq: move the srcu_struct used for quiescing > to the tagset") was merged in v6.2, dm_request_based() started to trigger > access to tag_set, so a null check on map became necessary to avoid a > potential null pointer dereference. > The original patch merged into mainline was missing this Fixes tag. > > 2. > @@ -2972,7 +2972,8 @@ static int __dm_suspend(struct mapped_device *md, > struct dm_table *map, > * We call dm_wait_for_completion to wait for all existing requests > * to finish. > */ > - r = dm_wait_for_completion(md, task_state); > + if (map) > + r = dm_wait_for_completion(md, task_state); > if (!r) > set_bit(dmf_suspended_flag, &md->flags); > After commit c4576aed8d85 ("dm: fix request-based dm's use of > dm_wait_for_completion") > was merged in v5.0, dm_wait_for_completion() started to access tag_set, so > the additional null check on map was also needed to prevent a null pointer > dereference. > > In summary: > For kernels v6.2 and later, there should be two Fixes tags: > Fixes: 80bd4a7aab4c ("blk-mq: move the srcu_struct used for quiescing to the > tagset") > Fixes: c4576aed8d85 ("dm: fix request-based dm's use of > dm_wait_for_completion") > > For kernels v5.0 to v6.2, only the second Fixes tag is needed: > Fixes: c4576aed8d85 ("dm: fix request-based dm's use of > dm_wait_for_completion") > > Although the first change isn't actually required for kernels between v5.0 > and v6.2, it doesn't introduce any new issues, so there's no need to > revert it. > > My suggestion is: > For stable kernels v6.2 and later, if this patch hasn't been applied yet, > it would be better to include the extra Fixes tag for 80bd4a7aab4c when > merging. It has already been merged. > If it has already been merged, there's no need to revert it, since the > first change is harmless anyway. Great, so there's nothing I need to do here, I like that :) thanks for spelling it all out, makes more sense now. greg k-h ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2025-10-23 8:44 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-10-13 9:12 FAILED: patch "[PATCH] dm: fix NULL pointer dereference in __dm_suspend()" failed to apply to 5.10-stable tree gregkh 2025-10-14 3:03 ` [PATCH 5.10.y] dm: fix NULL pointer dereference in __dm_suspend() Sasha Levin 2025-10-22 8:01 ` Li Lingfeng 2025-10-22 8:10 ` Greg KH 2025-10-22 8:21 ` Li Lingfeng 2025-10-22 8:45 ` Greg KH 2025-10-22 9:10 ` Li Lingfeng 2025-10-22 9:14 ` Zheng Qixing 2025-10-22 9:20 ` Greg KH 2025-10-23 1:12 ` Li Lingfeng 2025-10-23 5:20 ` Greg KH 2025-10-23 7:07 ` Li Lingfeng 2025-10-23 8:43 ` Greg KH
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).