From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Padmanabha Srinivasaiah <treasure4paddy@gmail.com>,
"Paul E . McKenney" <paulmck@kernel.org>,
Sasha Levin <sashal@kernel.org>,
frederic@kernel.org, quic_neeraju@quicinc.com,
josh@joshtriplett.org, rcu@vger.kernel.org
Subject: [PATCH AUTOSEL 5.10 13/76] rcu-tasks: Fix race in schedule and flush work
Date: Mon, 30 May 2022 09:43:03 -0400
Message-ID: <20220530134406.1934928-13-sashal@kernel.org>
In-Reply-To: <20220530134406.1934928-1-sashal@kernel.org>
From: Padmanabha Srinivasaiah <treasure4paddy@gmail.com>
[ Upstream commit f75fd4b9221d93177c50dcfde671b2e907f53e86 ]
While booting secondary CPUs, cpus_read_[lock/unlock] does not keep the
online cpumask stable. The transient online mask results in the call
trace below.
[ 0.324121] CPU1: Booted secondary processor 0x0000000001 [0x410fd083]
[ 0.346652] Detected PIPT I-cache on CPU2
[ 0.347212] CPU2: Booted secondary processor 0x0000000002 [0x410fd083]
[ 0.377255] Detected PIPT I-cache on CPU3
[ 0.377823] CPU3: Booted secondary processor 0x0000000003 [0x410fd083]
[ 0.379040] ------------[ cut here ]------------
[ 0.383662] WARNING: CPU: 0 PID: 10 at kernel/workqueue.c:3084 __flush_work+0x12c/0x138
[ 0.384850] Modules linked in:
[ 0.385403] CPU: 0 PID: 10 Comm: rcu_tasks_rude_ Not tainted 5.17.0-rc3-v8+ #13
[ 0.386473] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT)
[ 0.387289] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 0.388308] pc : __flush_work+0x12c/0x138
[ 0.388970] lr : __flush_work+0x80/0x138
[ 0.389620] sp : ffffffc00aaf3c60
[ 0.390139] x29: ffffffc00aaf3d20 x28: ffffffc009c16af0 x27: ffffff80f761df48
[ 0.391316] x26: 0000000000000004 x25: 0000000000000003 x24: 0000000000000100
[ 0.392493] x23: ffffffffffffffff x22: ffffffc009c16b10 x21: ffffffc009c16b28
[ 0.393668] x20: ffffffc009e53861 x19: ffffff80f77fbf40 x18: 00000000d744fcc9
[ 0.394842] x17: 000000000000000b x16: 00000000000001c2 x15: ffffffc009e57550
[ 0.396016] x14: 0000000000000000 x13: ffffffffffffffff x12: 0000000100000000
[ 0.397190] x11: 0000000000000462 x10: ffffff8040258008 x9 : 0000000100000000
[ 0.398364] x8 : 0000000000000000 x7 : ffffffc0093c8bf4 x6 : 0000000000000000
[ 0.399538] x5 : 0000000000000000 x4 : ffffffc00a976e40 x3 : ffffffc00810444c
[ 0.400711] x2 : 0000000000000004 x1 : 0000000000000000 x0 : 0000000000000000
[ 0.401886] Call trace:
[ 0.402309] __flush_work+0x12c/0x138
[ 0.402941] schedule_on_each_cpu+0x228/0x278
[ 0.403693] rcu_tasks_rude_wait_gp+0x130/0x144
[ 0.404502] rcu_tasks_kthread+0x220/0x254
[ 0.405264] kthread+0x174/0x1ac
[ 0.405837] ret_from_fork+0x10/0x20
[ 0.406456] irq event stamp: 102
[ 0.406966] hardirqs last enabled at (101): [<ffffffc0093c8468>] _raw_spin_unlock_irq+0x78/0xb4
[ 0.408304] hardirqs last disabled at (102): [<ffffffc0093b8270>] el1_dbg+0x24/0x5c
[ 0.409410] softirqs last enabled at (54): [<ffffffc0081b80c8>] local_bh_enable+0xc/0x2c
[ 0.410645] softirqs last disabled at (50): [<ffffffc0081b809c>] local_bh_disable+0xc/0x2c
[ 0.411890] ---[ end trace 0000000000000000 ]---
[ 0.413000] smp: Brought up 1 node, 4 CPUs
[ 0.413762] SMP: Total of 4 processors activated.
[ 0.414566] CPU features: detected: 32-bit EL0 Support
[ 0.415414] CPU features: detected: 32-bit EL1 Support
[ 0.416278] CPU features: detected: CRC32 instructions
[ 0.447021] Callback from call_rcu_tasks_rude() invoked.
[ 0.506693] Callback from call_rcu_tasks() invoked.
This commit therefore fixes this issue by applying a single-CPU
optimization to the RCU Tasks Rude grace-period process. The key point
here is that the purpose of this RCU flavor is to force a schedule on
each online CPU since some past event. But the rcu_tasks_rude_wait_gp()
function runs in the context of the RCU Tasks Rude's grace-period kthread,
so there must already have been a context switch on the current CPU since
the call to either synchronize_rcu_tasks_rude() or call_rcu_tasks_rude().
So if there is only a single CPU online, RCU Tasks Rude's grace-period
kthread does not need to do anything at all.
It turns out that the rcu_tasks_rude_wait_gp() function's call to
schedule_on_each_cpu() causes problems during early boot. During that
time, there is only one online CPU, namely the boot CPU. Therefore,
applying this single-CPU optimization fixes early-boot instances of
this problem.
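For illustration only, the fastpath described above can be modeled in
user space roughly as follows. This is a sketch, not kernel code:
struct rude_model, its n_scheduled field, and rude_wait_gp_model() are
hypothetical names, and the online_cpus parameter stands in for both
num_online_cpus() and cpumask_weight(cpu_online_mask).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; not the kernel's struct rcu_tasks. */
struct rude_model {
	unsigned long n_ipis;      /* mirrors rtp->n_ipis in the patch */
	unsigned long n_scheduled; /* model-only: work items "queued" */
};

/*
 * Model of rcu_tasks_rude_wait_gp() with the fastpath applied.
 * online_cpus is an assumption standing in for num_online_cpus().
 */
static void rude_wait_gp_model(struct rude_model *m, unsigned int online_cpus)
{
	assert(m != NULL);

	if (online_cpus <= 1)
		return; /* Fastpath for only one CPU: nothing is queued. */

	m->n_ipis += online_cpus;
	m->n_scheduled += online_cpus; /* stands in for schedule_on_each_cpu() */
}
```

With one CPU online the model returns before touching any state, which
corresponds to never reaching schedule_on_each_cpu() and __flush_work()
during early boot.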
Link: https://lore.kernel.org/lkml/20220210184319.25009-1-treasure4paddy@gmail.com/T/
Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Padmanabha Srinivasaiah <treasure4paddy@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
kernel/rcu/tasks.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 7c05c5ab7865..14af29fe1377 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -620,6 +620,9 @@ static void rcu_tasks_be_rude(struct work_struct *work)
// Wait for one rude RCU-tasks grace period.
static void rcu_tasks_rude_wait_gp(struct rcu_tasks *rtp)
{
+ if (num_online_cpus() <= 1)
+ return; // Fastpath for only one CPU.
+
rtp->n_ipis += cpumask_weight(cpu_online_mask);
schedule_on_each_cpu(rcu_tasks_be_rude);
}
--
2.35.1