Archive-only list for patches
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev,
	"Steven Rostedt (Google)" <rostedt@goodmis.org>,
	Zheng Yejian <zhengyejian1@huawei.com>
Subject: [PATCH 5.15 90/93] ring-buffer: Fix race while reader and writer are on the same page
Date: Wed, 12 Apr 2023 10:34:31 +0200	[thread overview]
Message-ID: <20230412082826.943222173@linuxfoundation.org> (raw)
In-Reply-To: <20230412082823.045155996@linuxfoundation.org>

From: Zheng Yejian <zhengyejian1@huawei.com>

commit 6455b6163d8c680366663cdb8c679514d55fc30c upstream.

When user reads file 'trace_pipe', kernel keeps printing following logs
that warn at "cpu_buffer->reader_page->read > rb_page_size(reader)" in
rb_get_reader_page(). It just looks like there's an infinite loop in
tracing_read_pipe(). This problem occurs several times on arm64 platform
when testing v5.10 and below.

  Call trace:
   rb_get_reader_page+0x248/0x1300
   rb_buffer_peek+0x34/0x160
   ring_buffer_peek+0xbc/0x224
   peek_next_entry+0x98/0xbc
   __find_next_entry+0xc4/0x1c0
   trace_find_next_entry_inc+0x30/0x94
   tracing_read_pipe+0x198/0x304
   vfs_read+0xb4/0x1e0
   ksys_read+0x74/0x100
   __arm64_sys_read+0x24/0x30
   el0_svc_common.constprop.0+0x7c/0x1bc
   do_el0_svc+0x2c/0x94
   el0_svc+0x20/0x30
   el0_sync_handler+0xb0/0xb4
   el0_sync+0x160/0x180

Then I dump the vmcore and look into the problematic per_cpu ring_buffer,
I found that tail_page/commit_page/reader_page are on the same page while
reader_page->read is obviously abnormal:
  tail_page == commit_page == reader_page == {
    .write = 0x100d20,
    .read = 0x8f9f4805,  // Far greater than 0xd20, obviously abnormal!!!
    .entries = 0x10004c,
    .real_end = 0x0,
    .page = {
      .time_stamp = 0x857257416af0,
      .commit = 0xd20,  // This page hasn't been full filled.
      // .data[0...0xd20] seems normal.
    }
 }

The root cause is most likely the race that reader and writer are on the
same page while reader saw an event that not fully committed by writer.

To fix this, add memory barriers to make sure the reader can see the
content of what is committed. Since commit a0fcaaed0c46 ("ring-buffer: Fix
race between reset page and reading page") has added the read barrier in
rb_get_reader_page(), here we just need to add the write barrier.

Link: https://lore.kernel.org/linux-trace-kernel/20230325021247.2923907-1-zhengyejian1@huawei.com

Cc: stable@vger.kernel.org
Fixes: 77ae365eca89 ("ring-buffer: make lockless")
Suggested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 kernel/trace/ring_buffer.c |   13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -3041,6 +3041,10 @@ rb_set_commit_to_write(struct ring_buffe
 		if (RB_WARN_ON(cpu_buffer,
 			       rb_is_reader_page(cpu_buffer->tail_page)))
 			return;
+		/*
+		 * No need for a memory barrier here, as the update
+		 * of the tail_page did it for this page.
+		 */
 		local_set(&cpu_buffer->commit_page->page->commit,
 			  rb_page_write(cpu_buffer->commit_page));
 		rb_inc_page(&cpu_buffer->commit_page);
@@ -3050,6 +3054,8 @@ rb_set_commit_to_write(struct ring_buffe
 	while (rb_commit_index(cpu_buffer) !=
 	       rb_page_write(cpu_buffer->commit_page)) {
 
+		/* Make sure the readers see the content of what is committed. */
+		smp_wmb();
 		local_set(&cpu_buffer->commit_page->page->commit,
 			  rb_page_write(cpu_buffer->commit_page));
 		RB_WARN_ON(cpu_buffer,
@@ -4632,7 +4638,12 @@ rb_get_reader_page(struct ring_buffer_pe
 
 	/*
 	 * Make sure we see any padding after the write update
-	 * (see rb_reset_tail())
+	 * (see rb_reset_tail()).
+	 *
+	 * In addition, a writer may be writing on the reader page
+	 * if the page has not been fully filled, so the read barrier
+	 * is also needed to make sure we see the content of what is
+	 * committed by the writer (see rb_set_commit_to_write()).
 	 */
 	smp_rmb();
 



  parent reply	other threads:[~2023-04-12  8:39 UTC|newest]

Thread overview: 106+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-12  8:33 [PATCH 5.15 00/93] 5.15.107-rc1 review Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 01/93] soc: sifive: ccache: Rename SiFive L2 cache to Composable cache Greg Kroah-Hartman
2023-04-12  9:36   ` Conor Dooley
2023-04-12 10:14     ` Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 02/93] soc: sifive: ccache: determine the cache level from dts Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 03/93] soc: sifive: ccache: reduce printing on init Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 04/93] soc: sifive: ccache: use pr_fmt() to remove CCACHE: prefixes Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 05/93] soc: sifive: ccache: fix missing iounmap() in error path in sifive_ccache_init() Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 06/93] soc: sifive: ccache: fix missing free_irq() " Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 07/93] soc: sifive: ccache: fix missing of_node_put() " Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 08/93] ocfs2: ocfs2_mount_volume does cleanup job before return error Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 09/93] ocfs2: rewrite error handling of ocfs2_fill_super Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 10/93] ocfs2: fix memory leak in ocfs2_mount_volume() Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 11/93] NFSD: Fix sparse warning Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 12/93] NFSD: pass range end to vfs_fsync_range() instead of count Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 13/93] RDMA/irdma: Do not request 2-level PBLEs for CQ alloc Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 14/93] platform/x86: int3472: Split into 2 drivers Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 15/93] platform/x86: int3472/discrete: Ensure the clk/power enable pins are in output mode Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 16/93] iavf: return errno code instead of status code Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 17/93] iavf/iavf_main: actually log ->src mask when talking about it Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 18/93] serial: 8250_exar: derive nr_ports from PCI ID for Acces I/O cards Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 19/93] serial: exar: Add support for Sealevel 7xxxC serial cards Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 20/93] drm/amdgpu: Prevent race between late signaled fences and GPU reset Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 21/93] drm/amdgpu: fix amdgpu_job_free_resources v2 Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 22/93] bpf: hash map, avoid deadlock with suitable hash mask Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 23/93] gpio: GPIO_REGMAP: select REGMAP instead of depending on it Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 24/93] Drivers: vmbus: Check for channel allocation before looking up relids Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 25/93] pwm: cros-ec: Explicitly set .polarity in .get_state() Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 26/93] pwm: sprd: " Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 27/93] KVM: s390: pv: fix external interruption loop not always detected Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 28/93] wifi: mac80211: fix invalid drv_sta_pre_rcu_remove calls for non-uploaded sta Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 29/93] net: qrtr: combine nameservice into main module Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 30/93] net: qrtr: Fix a refcount bug in qrtr_recvmsg() Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 31/93] NFSD: Avoid calling OPDESC() with ops->opnum == OP_ILLEGAL Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 32/93] icmp: guard against too small mtu Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 33/93] net: dont let netpoll invoke NAPI if in xmit context Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 34/93] net: dsa: mv88e6xxx: Reset mv88e6393x force WD event bit Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 35/93] sctp: check send stream number after wait_for_sndbuf Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 36/93] net: qrtr: Do not do DEL_SERVER broadcast after DEL_CLIENT Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 37/93] ipv6: Fix an uninit variable access bug in __ip6_make_skb() Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 38/93] platform/x86: think-lmi: Fix memory leak when showing current settings Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 39/93] platform/x86: think-lmi: Fix memory leaks when parsing ThinkStation WMI strings Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 40/93] platform/x86: think-lmi: Clean up display of current_value on Thinkstation Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 41/93] gpio: davinci: Add irq chip flag to skip set wake Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 42/93] net: ethernet: ti: am65-cpsw: Fix mdio cleanup in probe Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 43/93] net: stmmac: fix up RX flow hash indirection table when setting channels Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 44/93] sunrpc: only free unix grouplist after RCU settles Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 45/93] NFSD: callback request does not use correct credential for AUTH_SYS Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 46/93] ice: fix wrong fallback logic for FDIR Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 47/93] ice: Reset FDIR counter in FDIR init stage Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 48/93] ethtool: reset #lanes when lanes is omitted Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 49/93] gve: Secure enough bytes in the first TX desc for all TCP pkts Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 50/93] kbuild: refactor single builds of *.ko Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 51/93] usb: xhci: tegra: fix sleep in atomic call Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 52/93] xhci: also avoid the XHCI_ZERO_64B_REGS quirk with a passthrough iommu Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 53/93] usb: cdnsp: Fixes error: uninitialized symbol len Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 54/93] usb: dwc3: pci: add support for the Intel Meteor Lake-S Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 55/93] USB: serial: cp210x: add Silicon Labs IFS-USB-DATACABLE IDs Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 56/93] usb: typec: altmodes/displayport: Fix configure initial pin assignment Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 57/93] USB: serial: option: add Telit FE990 compositions Greg Kroah-Hartman
2023-04-12  8:33 ` [PATCH 5.15 58/93] USB: serial: option: add Quectel RM500U-CN modem Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 59/93] iio: adis16480: select CONFIG_CRC32 Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 60/93] iio: adc: ti-ads7950: Set `can_sleep` flag for GPIO chip Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 61/93] iio: dac: cio-dac: Fix max DAC write value check for 12-bit Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 62/93] iio: light: cm32181: Unregister second I2C client if present Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 63/93] tty: serial: sh-sci: Fix transmit end interrupt handler Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 64/93] tty: serial: sh-sci: Fix Rx on RZ/G2L SCI Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 65/93] tty: serial: fsl_lpuart: avoid checking for transfer complete when UARTCTRL_SBK is asserted in lpuart32_tx_empty Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 66/93] nilfs2: fix potential UAF of struct nilfs_sc_info in nilfs_segctor_thread() Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 67/93] nilfs2: fix sysfs interface lifetime Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 68/93] dt-bindings: serial: renesas,scif: Fix 4th IRQ for 4-IRQ SCIFs Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 69/93] ksmbd: do not call kvmalloc() with __GFP_NORETRY | __GFP_NO_WARN Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 70/93] ALSA: hda/realtek: Add quirk for Clevo X370SNW Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 71/93] coresight: etm4x: Do not access TRCIDR1 for identification Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 72/93] coresight-etm4: Fix for() loop drvdata->nr_addr_cmp range bug Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 73/93] iio: adc: ad7791: fix IRQ flags Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 74/93] scsi: qla2xxx: Fix memory leak in qla2x00_probe_one() Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 75/93] scsi: iscsi_tcp: Check that sock is valid before iscsi_set_param() Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 76/93] smb3: allow deferred close timeout to be configurable Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 77/93] smb3: lower default deferred close timeout to address perf regression Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 78/93] cifs: sanitize paths in cifs_update_super_prepath Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 79/93] perf/core: Fix the same task check in perf_event_set_output Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 80/93] ftrace: Mark get_lock_parent_ip() __always_inline Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 81/93] ftrace: Fix issue that direct->addr not restored in modify_ftrace_direct() Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 82/93] fs: drop peer group ids under namespace lock Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 83/93] can: j1939: j1939_tp_tx_dat_new(): fix out-of-bounds memory access Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 84/93] can: isotp: isotp_ops: fix poll() to not report false EPOLLOUT events Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 85/93] tracing: Free error logs of tracing instances Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 86/93] ASoC: hdac_hdmi: use set_stream() instead of set_tdm_slots() Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 87/93] mm: vmalloc: avoid warn_alloc noise caused by fatal signal Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 88/93] drm/panfrost: Fix the panfrost_mmu_map_fault_addr() error path Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 89/93] drm/nouveau/disp: Support more modes by checking with lower bpc Greg Kroah-Hartman
2023-04-12  8:34 ` Greg Kroah-Hartman [this message]
2023-04-12  8:34 ` [PATCH 5.15 91/93] mm/swap: fix swap_info_struct race between swapoff and get_swap_pages() Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 92/93] drm/bridge: lt9611: Fix PLL being unable to lock Greg Kroah-Hartman
2023-04-12  8:34 ` [PATCH 5.15 93/93] mm: take a page reference when removing device exclusive entries Greg Kroah-Hartman
2023-04-12 16:53 ` [PATCH 5.15 00/93] 5.15.107-rc1 review Florian Fainelli
2023-04-12 19:41 ` Shuah Khan
2023-04-12 20:41 ` Guenter Roeck
2023-04-12 21:47 ` [PATCH 5.15 00/93] 5.15.107-rc1 review (possible amdgpu regression) Eddie Chapman
2023-04-13 14:46   ` Greg Kroah-Hartman
2023-06-07 22:24     ` Eddie Chapman
2023-04-13  2:04 ` [PATCH 5.15 00/93] 5.15.107-rc1 review Bagas Sanjaya
2023-04-13 13:28 ` Ron Economos
2023-04-13 14:18 ` Naresh Kamboju
2023-04-13 14:51 ` Harshit Mogalapalli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230412082826.943222173@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=patches@lists.linux.dev \
    --cc=rostedt@goodmis.org \
    --cc=stable@vger.kernel.org \
    --cc=zhengyejian1@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox