public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, Petr Pavlu <petr.pavlu@suse.com>,
	"Steven Rostedt (Google)" <rostedt@goodmis.org>
Subject: [PATCH 5.4 38/67] tracing: Fix incomplete locking when disabling buffered events
Date: Mon, 11 Dec 2023 19:22:22 +0100	[thread overview]
Message-ID: <20231211182016.700348285@linuxfoundation.org> (raw)
In-Reply-To: <20231211182015.049134368@linuxfoundation.org>

5.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Petr Pavlu <petr.pavlu@suse.com>

commit 7fed14f7ac9cf5e38c693836fe4a874720141845 upstream.

The following warning appears when using buffered events:

[  203.556451] WARNING: CPU: 53 PID: 10220 at kernel/trace/ring_buffer.c:3912 ring_buffer_discard_commit+0x2eb/0x420
[...]
[  203.670690] CPU: 53 PID: 10220 Comm: stress-ng-sysin Tainted: G            E      6.7.0-rc2-default #4 56e6d0fcf5581e6e51eaaecbdaec2a2338c80f3a
[  203.670704] Hardware name: Intel Corp. GROVEPORT/GROVEPORT, BIOS GVPRCRB1.86B.0016.D04.1705030402 05/03/2017
[  203.670709] RIP: 0010:ring_buffer_discard_commit+0x2eb/0x420
[  203.735721] Code: 4c 8b 4a 50 48 8b 42 48 49 39 c1 0f 84 b3 00 00 00 49 83 e8 01 75 b1 48 8b 42 10 f0 ff 40 08 0f 0b e9 fc fe ff ff f0 ff 47 08 <0f> 0b e9 77 fd ff ff 48 8b 42 10 f0 ff 40 08 0f 0b e9 f5 fe ff ff
[  203.735734] RSP: 0018:ffffb4ae4f7b7d80 EFLAGS: 00010202
[  203.735745] RAX: 0000000000000000 RBX: ffffb4ae4f7b7de0 RCX: ffff8ac10662c000
[  203.735754] RDX: ffff8ac0c750be00 RSI: ffff8ac10662c000 RDI: ffff8ac0c004d400
[  203.781832] RBP: ffff8ac0c039cea0 R08: 0000000000000000 R09: 0000000000000000
[  203.781839] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[  203.781842] R13: ffff8ac10662c000 R14: ffff8ac0c004d400 R15: ffff8ac10662c008
[  203.781846] FS:  00007f4cd8a67740(0000) GS:ffff8ad798880000(0000) knlGS:0000000000000000
[  203.781851] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  203.781855] CR2: 0000559766a74028 CR3: 00000001804c4000 CR4: 00000000001506f0
[  203.781862] Call Trace:
[  203.781870]  <TASK>
[  203.851949]  trace_event_buffer_commit+0x1ea/0x250
[  203.851967]  trace_event_raw_event_sys_enter+0x83/0xe0
[  203.851983]  syscall_trace_enter.isra.0+0x182/0x1a0
[  203.851990]  do_syscall_64+0x3a/0xe0
[  203.852075]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
[  203.852090] RIP: 0033:0x7f4cd870fa77
[  203.982920] Code: 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 90 b8 89 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e9 43 0e 00 f7 d8 64 89 01 48
[  203.982932] RSP: 002b:00007fff99717dd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000089
[  203.982942] RAX: ffffffffffffffda RBX: 0000558ea1d7b6f0 RCX: 00007f4cd870fa77
[  203.982948] RDX: 0000000000000000 RSI: 00007fff99717de0 RDI: 0000558ea1d7b6f0
[  203.982957] RBP: 00007fff99717de0 R08: 00007fff997180e0 R09: 00007fff997180e0
[  203.982962] R10: 00007fff997180e0 R11: 0000000000000246 R12: 00007fff99717f40
[  204.049239] R13: 00007fff99718590 R14: 0000558e9f2127a8 R15: 00007fff997180b0
[  204.049256]  </TASK>

For instance, it can be triggered by running these two commands in
parallel:

 $ while true; do
    echo hist:key=id.syscall:val=hitcount > \
      /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger;
  done
 $ stress-ng --sysinfo $(nproc)

The warning indicates that the current ring_buffer_per_cpu is not in the
committing state. It happens because the active ring_buffer_event
doesn't actually come from the ring_buffer_per_cpu but is allocated from
trace_buffered_event.

The bug is in function trace_buffered_event_disable() where the
following normally happens:

* The code invokes disable_trace_buffered_event() via
  smp_call_function_many() and follows it by synchronize_rcu(). This
  increments the per-CPU variable trace_buffered_event_cnt on each
  target CPU and grants trace_buffered_event_disable() the exclusive
  access to the per-CPU variable trace_buffered_event.

* Maintenance is performed on trace_buffered_event, all per-CPU event
  buffers get freed.

* The code invokes enable_trace_buffered_event() via
  smp_call_function_many(). This decrements trace_buffered_event_cnt and
  releases the access to trace_buffered_event.

A problem is that smp_call_function_many() runs a given function on all
target CPUs except on the current one. The following can then occur:

* Task X executing trace_buffered_event_disable() runs on CPU 0.

* The control reaches synchronize_rcu() and the task gets rescheduled on
  another CPU 1.

* The RCU synchronization finishes. At this point,
  trace_buffered_event_disable() has the exclusive access to all
  trace_buffered_event variables except trace_buffered_event[CPU0]
  because trace_buffered_event_cnt[CPU0] is never incremented and if the
  buffer is currently unused, remains set to 0.

* A different task Y is scheduled on CPU 0 and hits a trace event. The
  code in trace_event_buffer_lock_reserve() sees that
  trace_buffered_event_cnt[CPU0] is set to 0 and decides the use the
  buffer provided by trace_buffered_event[CPU0].

* Task X continues its execution in trace_buffered_event_disable(). The
  code incorrectly frees the event buffer pointed by
  trace_buffered_event[CPU0] and resets the variable to NULL.

* Task Y writes event data to the now freed buffer and later detects the
  created inconsistency.

The issue is observable since commit dea499781a11 ("tracing: Fix warning
in trace_buffered_event_disable()") which moved the call of
trace_buffered_event_disable() in __ftrace_event_enable_disable()
earlier, prior to invoking call->class->reg(.. TRACE_REG_UNREGISTER ..).
The underlying problem in trace_buffered_event_disable() is however
present since the original implementation in commit 0fc1b09ff1ff
("tracing: Use temp buffer when filtering events").

Fix the problem by replacing the two smp_call_function_many() calls with
on_each_cpu_mask() which invokes a given callback on all CPUs.

Link: https://lore.kernel.org/all/20231127151248.7232-2-petr.pavlu@suse.com/
Link: https://lkml.kernel.org/r/20231205161736.19663-2-petr.pavlu@suse.com

Cc: stable@vger.kernel.org
Fixes: 0fc1b09ff1ff ("tracing: Use temp buffer when filtering events")
Fixes: dea499781a11 ("tracing: Fix warning in trace_buffered_event_disable()")
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/trace/trace.c |   12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2490,11 +2490,9 @@ void trace_buffered_event_disable(void)
 	if (--trace_buffered_event_ref)
 		return;
 
-	preempt_disable();
 	/* For each CPU, set the buffer as used. */
-	smp_call_function_many(tracing_buffer_mask,
-			       disable_trace_buffered_event, NULL, 1);
-	preempt_enable();
+	on_each_cpu_mask(tracing_buffer_mask, disable_trace_buffered_event,
+			 NULL, true);
 
 	/* Wait for all current users to finish */
 	synchronize_rcu();
@@ -2509,11 +2507,9 @@ void trace_buffered_event_disable(void)
 	 */
 	smp_wmb();
 
-	preempt_disable();
 	/* Do the work on each cpu */
-	smp_call_function_many(tracing_buffer_mask,
-			       enable_trace_buffered_event, NULL, 1);
-	preempt_enable();
+	on_each_cpu_mask(tracing_buffer_mask, enable_trace_buffered_event, NULL,
+			 true);
 }
 
 static struct ring_buffer *temp_buffer;



  parent reply	other threads:[~2023-12-11 18:43 UTC|newest]

Thread overview: 74+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-11 18:21 [PATCH 5.4 00/67] 5.4.264-rc1 review Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 01/67] hrtimers: Push pending hrtimers away from outgoing CPU earlier Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 02/67] netfilter: ipset: fix race condition between swap/destroy and kernel side add/del/test Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 03/67] tg3: Move the [rt]x_dropped counters to tg3_napi Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 04/67] tg3: Increment tx_dropped in tg3_tso_bug() Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 05/67] kconfig: fix memory leak from range properties Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 06/67] drm/amdgpu: correct chunk_ptr to a pointer to chunk Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 07/67] of: base: Add of_get_cpu_state_node() to get idle states for a CPU node Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 08/67] ACPI/IORT: Make iort_get_device_domain IRQ domain agnostic Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 09/67] ACPI/IORT: Make iort_msi_map_rid() PCI agnostic Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 10/67] of/iommu: Make of_map_rid() " Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 11/67] of/irq: make of_msi_map_get_device_domain() bus agnostic Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 12/67] of/irq: Make of_msi_map_rid() PCI " Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 13/67] of: base: Fix some formatting issues and provide missing descriptions Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 14/67] of: Fix kerneldoc output formatting Greg Kroah-Hartman
2023-12-11 18:21 ` [PATCH 5.4 15/67] of: Add missing Return section in kerneldoc comments Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 16/67] of: dynamic: Fix of_reconfig_get_state_change() return value documentation Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 17/67] ipv6: fix potential NULL deref in fib6_add() Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 18/67] hv_netvsc: rndis_filter needs to select NLS Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 19/67] net: arcnet: Fix RESET flag handling Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 20/67] net: arcnet: com20020 fix error handling Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 21/67] arcnet: restoring support for multiple Sohard Arcnet cards Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 22/67] ipv4: ip_gre: Avoid skb_pull() failure in ipgre_xmit() Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 23/67] net: hns: fix fake link up on xge port Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 24/67] netfilter: xt_owner: Fix for unsafe access of sk->sk_socket Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 25/67] tcp: do not accept ACK of bytes we never sent Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 26/67] bpf: sockmap, updating the sg structure should also update curr Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 27/67] RDMA/bnxt_re: Correct module description string Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 28/67] hwmon: (acpi_power_meter) Fix 4.29 MW bug Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 29/67] ASoC: wm_adsp: fix memleak in wm_adsp_buffer_populate Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 30/67] tracing: Fix a warning when allocating buffered events fails Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 31/67] scsi: be2iscsi: Fix a memleak in beiscsi_init_wrb_handle() Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 32/67] ARM: imx: Check return value of devm_kasprintf in imx_mmdc_perf_init Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 33/67] ARM: dts: imx: make gpt node name generic Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 34/67] ARM: dts: imx7: Declare timers compatible with fsl,imx6dl-gpt Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 35/67] ALSA: pcm: fix out-of-bounds in snd_pcm_state_names Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 36/67] nilfs2: prevent WARNING in nilfs_sufile_set_segment_usage() Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 37/67] tracing: Always update snapshot buffer size Greg Kroah-Hartman
2023-12-11 18:22 ` Greg Kroah-Hartman [this message]
2023-12-11 18:22 ` [PATCH 5.4 39/67] tracing: Fix a possible race when disabling buffered events Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 40/67] packet: Move reference count in packet_sock to atomic_long_t Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 41/67] arm64: dts: mediatek: mt7622: fix memory node warning check Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 42/67] arm64: dts: mediatek: mt8173-evb: Fix regulator-fixed node names Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 43/67] perf/core: Add a new read format to get a number of lost samples Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 44/67] perf: Fix perf_event_validate_size() Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 45/67] gpiolib: sysfs: Fix error handling on failed export Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 46/67] mmc: core: add helpers mmc_regulator_enable/disable_vqmmc Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 47/67] mmc: sdhci-sprd: Fix vqmmc not shutting down after the card was pulled Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 48/67] usb: gadget: f_hid: fix report descriptor allocation Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 49/67] parport: Add support for Brainboxes IX/UC/PX parallel cards Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 50/67] usb: typec: class: fix typec_altmode_put_partner to put plugs Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 51/67] ARM: PL011: Fix DMA support Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 52/67] serial: sc16is7xx: address RX timeout interrupt errata Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 53/67] serial: 8250_omap: Add earlycon support for the AM654 UART controller Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 54/67] x86/CPU/AMD: Check vendor in the AMD microcode callback Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 55/67] KVM: s390/mm: Properly reset no-dat Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 56/67] nilfs2: fix missing error check for sb_set_blocksize call Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 57/67] io_uring/af_unix: disable sending io_uring over sockets Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 58/67] netlink: dont call ->netlink_bind with table lock held Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 59/67] genetlink: add CAP_NET_ADMIN test for multicast bind Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 60/67] psample: Require CAP_NET_ADMIN when joining "packets" group Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 61/67] drop_monitor: Require CAP_SYS_ADMIN when joining "events" group Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 62/67] tools headers UAPI: Sync linux/perf_event.h with the kernel sources Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 63/67] Revert "btrfs: add dmesg output for first mount and last unmount of a filesystem" Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 64/67] cifs: Fix non-availability of dedup breaking generic/304 Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 65/67] smb: client: fix potential NULL deref in parse_dfs_referrals() Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 66/67] devcoredump : Serialize devcd_del work Greg Kroah-Hartman
2023-12-11 18:22 ` [PATCH 5.4 67/67] devcoredump: Send uevent once devcd is ready Greg Kroah-Hartman
2023-12-11 19:04 ` [PATCH 5.4 00/67] 5.4.264-rc1 review Florian Fainelli
2023-12-12 13:23 ` Harshit Mogalapalli
2023-12-12 16:13 ` Shuah Khan
2023-12-12 17:00 ` Guenter Roeck
2023-12-12 17:05 ` Naresh Kamboju
2023-12-12 22:19 ` Jon Hunter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231211182016.700348285@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=patches@lists.linux.dev \
    --cc=petr.pavlu@suse.com \
    --cc=rostedt@goodmis.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox