From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, Omar Sandoval <osandov@fb.com>,
	Tejun Heo <tj@kernel.org>,
	Johannes Thumshirn <johannes.thumshirn@wdc.com>,
	Jens Axboe <axboe@kernel.dk>
Subject: [PATCH 5.15 47/82] blk-rq-qos: fix crash on rq_qos_wait vs. rq_qos_wake_function race
Date: Mon, 21 Oct 2024 12:25:28 +0200
Message-ID: <20241021102249.099416605@linuxfoundation.org>
In-Reply-To: <20241021102247.209765070@linuxfoundation.org>

5.15-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Omar Sandoval <osandov@fb.com>

commit e972b08b91ef48488bae9789f03cfedb148667fb upstream.

We're seeing crashes from rq_qos_wake_function that look like this:

  BUG: unable to handle page fault for address: ffffafe180a40084
  #PF: supervisor write access in kernel mode
  #PF: error_code(0x0002) - not-present page
  PGD 100000067 P4D 100000067 PUD 10027c067 PMD 10115d067 PTE 0
  Oops: Oops: 0002 [#1] PREEMPT SMP PTI
  CPU: 17 UID: 0 PID: 0 Comm: swapper/17 Not tainted 6.12.0-rc3-00013-geca631b8fe80 #11
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
  RIP: 0010:_raw_spin_lock_irqsave+0x1d/0x40
  Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 54 9c 41 5c fa 65 ff 05 62 97 30 4c 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 0a 4c 89 e0 41 5c c3 cc cc cc cc 89 c6 e8 2c 0b 00
  RSP: 0018:ffffafe180580ca0 EFLAGS: 00010046
  RAX: 0000000000000000 RBX: ffffafe180a3f7a8 RCX: 0000000000000011
  RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffffafe180a40084
  RBP: 0000000000000000 R08: 00000000001e7240 R09: 0000000000000011
  R10: 0000000000000028 R11: 0000000000000888 R12: 0000000000000002
  R13: ffffafe180a40084 R14: 0000000000000000 R15: 0000000000000003
  FS:  0000000000000000(0000) GS:ffff9aaf1f280000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: ffffafe180a40084 CR3: 000000010e428002 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  PKRU: 55555554
  Call Trace:
   <IRQ>
   try_to_wake_up+0x5a/0x6a0
   rq_qos_wake_function+0x71/0x80
   __wake_up_common+0x75/0xa0
   __wake_up+0x36/0x60
   scale_up.part.0+0x50/0x110
   wb_timer_fn+0x227/0x450
   ...

So rq_qos_wake_function() calls wake_up_process(data->task), which calls
try_to_wake_up(), which faults in raw_spin_lock_irqsave(&p->pi_lock).

p comes from data->task, and data comes from the waitqueue entry, which
is stored on the waiter's stack in rq_qos_wait(). Analyzing the core
dump with drgn, I found that the waiter had already woken up and moved
on to a completely unrelated code path, clobbering what was previously
data->task. Meanwhile, the waker was passing the clobbered garbage in
data->task to wake_up_process(), leading to the crash.

What's happening is that in between rq_qos_wake_function() deleting the
waitqueue entry and calling wake_up_process(), rq_qos_wait() is finding
that it already got a token and returning. The race looks like this:

rq_qos_wait()                           rq_qos_wake_function()
==============================================================
prepare_to_wait_exclusive()
                                        data->got_token = true;
                                        list_del_init(&curr->entry);
if (data.got_token)
        break;
finish_wait(&rqw->wait, &data.wq);
  ^- returns immediately because
     list_empty_careful(&wq_entry->entry)
     is true
... return, go do something else ...
                                        wake_up_process(data->task)
                                          (NO LONGER VALID!)-^

Normally, finish_wait() is supposed to synchronize against the waker.
But, as noted above, it is returning immediately because the waitqueue
entry has already been removed from the waitqueue.

The bug is that rq_qos_wake_function() is accessing the waitqueue entry
AFTER deleting it. Note that autoremove_wake_function() wakes the waiter
and THEN deletes the waitqueue entry, which is the proper order.
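
To make the ordering argument concrete, here is a minimal single-threaded
model in userspace C. It is only a sketch, not kernel code: struct
model_entry, enum waker_step and every function name are hypothetical. It
assumes the worst case, where the waiter gets to run after each waker step,
and it assumes the waiter can only return through the finish_wait() fast
path, that is, once got_token is set and the entry is off the queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the waiter's on-stack state; not a kernel type. */
struct model_entry {
	bool on_queue;   /* entry is still linked on the waitqueue */
	bool got_token;  /* set by the waker */
	bool frame_live; /* false once the waiter has returned */
};

/* The three things the wake function does, in some order. */
enum waker_step { STEP_SET_TOKEN, STEP_UNLINK, STEP_WAKE };

/*
 * The waiter leaves only when it sees the token and the entry is already
 * unlinked. Otherwise it is assumed to block until the waker has finished.
 */
static void waiter_try_return(struct model_entry *e)
{
	if (e->got_token && !e->on_queue)
		e->frame_live = false;
}

/* Returns true if the waker reads the entry after the waiter has left. */
static bool waker_touches_dead_frame(const enum waker_step *steps, size_t n)
{
	struct model_entry e = { .on_queue = true, .frame_live = true };
	bool touched_dead = false;

	for (size_t i = 0; i < n; i++) {
		switch (steps[i]) {
		case STEP_SET_TOKEN:
			e.got_token = true;
			break;
		case STEP_UNLINK:
			e.on_queue = false;
			break;
		case STEP_WAKE:
			/* Stands in for reading data->task. */
			if (!e.frame_live)
				touched_dead = true;
			break;
		}
		/* Worst case: the waiter runs after every waker step. */
		waiter_try_return(&e);
	}
	return touched_dead;
}

/* Unlink first, then wake. */
static bool unlink_then_wake_races(void)
{
	const enum waker_step s[] = { STEP_SET_TOKEN, STEP_UNLINK, STEP_WAKE };
	return waker_touches_dead_frame(s, 3);
}

/* Wake first, then unlink, as autoremove_wake_function() does. */
static bool wake_then_unlink_races(void)
{
	const enum waker_step s[] = { STEP_SET_TOKEN, STEP_WAKE, STEP_UNLINK };
	return waker_touches_dead_frame(s, 3);
}
```

In this model the unlink-then-wake order lets the waiter leave before the
wake step reads the entry, while the wake-then-unlink order does not.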

Fix it by swapping the order. We also need to use
list_del_init_careful() to match the list_empty_careful() in
finish_wait().
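
As a rough illustration of why the two "careful" helpers belong together,
the sketch below mimics their release/acquire pairing with C11 atomics in
userspace. It is an analogy under stated assumptions, not the kernel
implementation: struct node and the node_*() functions are hypothetical
stand-ins for struct list_head, list_del_init_careful() and
list_empty_careful().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct list_head. */
struct node {
	struct node *_Atomic next;
	struct node *prev;
};

static void node_init(struct node *n)
{
	atomic_store_explicit(&n->next, n, memory_order_relaxed);
	n->prev = n;
}

static void node_add_tail(struct node *n, struct node *head)
{
	struct node *last = head->prev;

	n->prev = last;
	atomic_store_explicit(&n->next, head, memory_order_relaxed);
	atomic_store_explicit(&last->next, n, memory_order_relaxed);
	head->prev = n;
}

/*
 * Waker side: unlink the node, then publish the self-linked state with a
 * release store, so everything the waker did earlier is ordered before it.
 */
static void node_del_init_careful(struct node *n)
{
	struct node *next = atomic_load_explicit(&n->next, memory_order_relaxed);
	struct node *prev = n->prev;

	next->prev = prev;
	atomic_store_explicit(&prev->next, next, memory_order_relaxed);
	n->prev = n;
	atomic_store_explicit(&n->next, n, memory_order_release);
}

/*
 * Waiter side: the acquire load pairs with the release store above. If the
 * node reads as empty, the waker's earlier writes are visible as well.
 */
static bool node_empty_careful(struct node *n)
{
	struct node *next = atomic_load_explicit(&n->next, memory_order_acquire);

	return next == n && next == n->prev;
}
```

With the unlink as the last store the waker makes to the entry, a waiter
that observes the entry as empty can safely treat the waker as finished
with it.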

Fixes: 38cfb5a45ee0 ("blk-wbt: improve waking of tasks")
Cc: stable@vger.kernel.org
Signed-off-by: Omar Sandoval <osandov@fb.com>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/d3bee2463a67b1ee597211823bf7ad3721c26e41.1729014591.git.osandov@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 block/blk-rq-qos.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/block/blk-rq-qos.c
+++ b/block/blk-rq-qos.c
@@ -225,8 +225,8 @@ static int rq_qos_wake_function(struct w
 
 	data->got_token = true;
 	smp_wmb();
-	list_del_init(&curr->entry);
 	wake_up_process(data->task);
+	list_del_init_careful(&curr->entry);
 	return 1;
 }
 


