From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev, Omar Sandoval <osandov@fb.com>,
Tejun Heo <tj@kernel.org>,
Johannes Thumshirn <johannes.thumshirn@wdc.com>,
Jens Axboe <axboe@kernel.dk>
Subject: [PATCH 5.10 22/52] blk-rq-qos: fix crash on rq_qos_wait vs. rq_qos_wake_function race
Date: Mon, 21 Oct 2024 12:25:43 +0200
Message-ID: <20241021102242.494130971@linuxfoundation.org>
In-Reply-To: <20241021102241.624153108@linuxfoundation.org>
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Omar Sandoval <osandov@fb.com>
commit e972b08b91ef48488bae9789f03cfedb148667fb upstream.
We're seeing crashes from rq_qos_wake_function that look like this:
BUG: unable to handle page fault for address: ffffafe180a40084
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 100000067 P4D 100000067 PUD 10027c067 PMD 10115d067 PTE 0
Oops: Oops: 0002 [#1] PREEMPT SMP PTI
CPU: 17 UID: 0 PID: 0 Comm: swapper/17 Not tainted 6.12.0-rc3-00013-geca631b8fe80 #11
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:_raw_spin_lock_irqsave+0x1d/0x40
Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 54 9c 41 5c fa 65 ff 05 62 97 30 4c 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 0a 4c 89 e0 41 5c c3 cc cc cc cc 89 c6 e8 2c 0b 00
RSP: 0018:ffffafe180580ca0 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffffafe180a3f7a8 RCX: 0000000000000011
RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffffafe180a40084
RBP: 0000000000000000 R08: 00000000001e7240 R09: 0000000000000011
R10: 0000000000000028 R11: 0000000000000888 R12: 0000000000000002
R13: ffffafe180a40084 R14: 0000000000000000 R15: 0000000000000003
FS: 0000000000000000(0000) GS:ffff9aaf1f280000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffafe180a40084 CR3: 000000010e428002 CR4: 0000000000770ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<IRQ>
try_to_wake_up+0x5a/0x6a0
rq_qos_wake_function+0x71/0x80
__wake_up_common+0x75/0xa0
__wake_up+0x36/0x60
scale_up.part.0+0x50/0x110
wb_timer_fn+0x227/0x450
...
So rq_qos_wake_function() calls wake_up_process(data->task), which calls
try_to_wake_up(), which faults in raw_spin_lock_irqsave(&p->pi_lock).
p comes from data->task, and data comes from the waitqueue entry, which
is stored on the waiter's stack in rq_qos_wait(). Analyzing the core
dump with drgn, I found that the waiter had already woken up and moved
on to a completely unrelated code path, clobbering what was previously
data->task. Meanwhile, the waker was passing the clobbered garbage in
data->task to wake_up_process(), leading to the crash.
What's happening is that in between rq_qos_wake_function() deleting the
waitqueue entry and calling wake_up_process(), rq_qos_wait() is finding
that it already got a token and returning. The race looks like this:
rq_qos_wait()                           rq_qos_wake_function()
==============================================================
prepare_to_wait_exclusive()
                                        data->got_token = true;
                                        list_del_init(&curr->entry);
if (data.got_token)
        break;
finish_wait(&rqw->wait, &data.wq);
  ^- returns immediately because
     list_empty_careful(&wq_entry->entry)
     is true
... return, go do something else ...
                                        wake_up_process(data->task)
                                          (NO LONGER VALID!)-^
Normally, finish_wait() is supposed to synchronize against the waker.
But, as noted above, it is returning immediately because the waitqueue
entry has already been removed from the waitqueue.
The bug is that rq_qos_wake_function() is accessing the waitqueue entry
AFTER deleting it. Note that autoremove_wake_function() wakes the waiter
and THEN deletes the waitqueue entry, which is the proper order.
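The ordering argument above can be replayed deterministically in userspace. The following is only a sketch under stated assumptions: struct task, struct wait_data, waiter_may_return() and replay() are hypothetical stand-ins, not kernel definitions, and the "waiter" is modelled as a single check that runs inside the waker's window. It assumes, as described above, that the waiter can only return once it has a token AND its entry is off the waitqueue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct task {
	int woken;
};

struct wait_data {
	bool on_queue;		/* models !list_empty_careful(&wq_entry->entry) */
	bool got_token;		/* models data->got_token */
	struct task *task;	/* models data->task, living on the waiter's stack */
};

enum order { DELETE_THEN_WAKE, WAKE_THEN_DELETE };

/*
 * Assumption of this model: the waiter leaves rq_qos_wait() (and lets its
 * stack frame be reused) only if it has a token and finish_wait() sees the
 * entry already removed from the waitqueue.
 */
static bool waiter_may_return(const struct wait_data *d)
{
	return d->got_token && !d->on_queue;
}

/* Replays the interleaving; returns true if the waker used a stale pointer. */
static bool replay(enum order order)
{
	struct task real = { 0 };
	struct task garbage = { 0 };	/* whatever later occupies the stack slot */
	struct wait_data d = { .on_queue = true, .got_token = false, .task = &real };

	/* Waker: hand over the token. */
	d.got_token = true;
	if (order == DELETE_THEN_WAKE)
		d.on_queue = false;	/* old order: entry deleted first */

	/* Waiter runs inside the waker's window. */
	if (waiter_may_return(&d))
		d.task = &garbage;	/* frame reused: data->task is clobbered */

	/* Waker: wake_up_process(data->task). */
	struct task *p = d.task;
	assert(p != NULL);
	p->woken = 1;
	bool stale = (p != &real);

	if (order == WAKE_THEN_DELETE)
		d.on_queue = false;	/* new order: entry deleted last */

	return stale;
}
```

In this model, replay(DELETE_THEN_WAKE) reports a stale access while replay(WAKE_THEN_DELETE) does not, because the waiter is never allowed to return before the waker has finished reading data->task.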
Fix it by swapping the order. We also need to use
list_del_init_careful() to match the list_empty_careful() in
finish_wait().
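As a rough userspace illustration of why the "careful" variants are used as a pair, the sketch below mimics the pattern with C11 atomics: the delete side publishes the emptied entry with a release store, and the empty check reads it with an acquire load. This is an analogue written for this note, not the kernel implementation; struct node and the node_*() helpers are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace analogue of a list entry; not struct list_head. */
struct node {
	struct node *_Atomic next;
	struct node *_Atomic prev;
};

static void node_init(struct node *n)
{
	atomic_store_explicit(&n->next, n, memory_order_relaxed);
	atomic_store_explicit(&n->prev, n, memory_order_relaxed);
}

/* Insert n right after head (caller is assumed to hold the list lock). */
static void node_add(struct node *n, struct node *head)
{
	struct node *first = atomic_load_explicit(&head->next, memory_order_relaxed);

	atomic_store_explicit(&n->prev, head, memory_order_relaxed);
	atomic_store_explicit(&n->next, first, memory_order_relaxed);
	atomic_store_explicit(&first->prev, n, memory_order_relaxed);
	atomic_store_explicit(&head->next, n, memory_order_relaxed);
}

/*
 * Unlink n and reinitialise it. The final store is a release, so every
 * write the deleting thread made before it is visible to a thread that
 * observes the entry as empty through node_empty_careful().
 */
static void node_del_init_careful(struct node *n)
{
	struct node *next = atomic_load_explicit(&n->next, memory_order_relaxed);
	struct node *prev = atomic_load_explicit(&n->prev, memory_order_relaxed);

	atomic_store_explicit(&next->prev, prev, memory_order_relaxed);
	atomic_store_explicit(&prev->next, next, memory_order_relaxed);
	atomic_store_explicit(&n->prev, n, memory_order_relaxed);
	atomic_store_explicit(&n->next, n, memory_order_release);
}

/* Lockless check that pairs with the release store above. */
static bool node_empty_careful(struct node *n)
{
	struct node *next = atomic_load_explicit(&n->next, memory_order_acquire);

	return next == n &&
	       next == atomic_load_explicit(&n->prev, memory_order_relaxed);
}
```

With this pairing, a checker that sees the entry as empty is also guaranteed to see everything the deleting side wrote beforehand, which is the property the lockless fast path in finish_wait() relies on.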
Fixes: 38cfb5a45ee0 ("blk-wbt: improve waking of tasks")
Cc: stable@vger.kernel.org
Signed-off-by: Omar Sandoval <osandov@fb.com>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/d3bee2463a67b1ee597211823bf7ad3721c26e41.1729014591.git.osandov@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
block/blk-rq-qos.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/block/blk-rq-qos.c
+++ b/block/blk-rq-qos.c
@@ -225,8 +225,8 @@ static int rq_qos_wake_function(struct w
 	data->got_token = true;
 	smp_wmb();
-	list_del_init(&curr->entry);
 	wake_up_process(data->task);
+	list_del_init_careful(&curr->entry);
 	return 1;
 }