All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Victor Kaplansky <victork@il.ibm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>,
	Paul McKenney <paulmck@linux.vnet.ibm.com>,
	Michael Neuling <mikey@neuling.org>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Ingo Molnar <mingo@kernel.org>,
	michael@ellerman.id.au, anton@samba.org,
	benh@kernel.crashing.org
Subject: [PATCH 3.11 18/25] perf: Fix perf ring buffer memory ordering
Date: Mon, 18 Nov 2013 10:40:47 -0800	[thread overview]
Message-ID: <20131118184037.718430418@linuxfoundation.org> (raw)
In-Reply-To: <20131118184032.248465920@linuxfoundation.org>

3.11-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Peter Zijlstra <peterz@infradead.org>

commit bf378d341e4873ed928dc3c636252e6895a21f50 upstream.

The PPC64 people noticed a missing memory barrier and crufty old
comments in the perf ring buffer code. So update all the comments and
add the missing barrier.

When the architecture implements local_t using atomic_long_t there
will be double barriers issued; but short of introducing more
conditional barrier primitives this is the best we can do.

Reported-by: Victor Kaplansky <victork@il.ibm.com>
Tested-by: Victor Kaplansky <victork@il.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: michael@ellerman.id.au
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: anton@samba.org
Cc: benh@kernel.crashing.org
Link: http://lkml.kernel.org/r/20131025173749.GG19466@laptop.lan
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Michael Neuling <mikey@neuling.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 include/uapi/linux/perf_event.h |   12 +++++++-----
 kernel/events/ring_buffer.c     |   31 +++++++++++++++++++++++++++----
 2 files changed, 34 insertions(+), 9 deletions(-)

--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -428,13 +428,15 @@ struct perf_event_mmap_page {
 	/*
 	 * Control data for the mmap() data buffer.
 	 *
-	 * User-space reading the @data_head value should issue an rmb(), on
-	 * SMP capable platforms, after reading this value -- see
-	 * perf_event_wakeup().
+	 * User-space reading the @data_head value should issue an smp_rmb(),
+	 * after reading this value.
 	 *
 	 * When the mapping is PROT_WRITE the @data_tail value should be
-	 * written by userspace to reflect the last read data. In this case
-	 * the kernel will not over-write unread data.
+	 * written by userspace to reflect the last read data, after issueing
+	 * an smp_mb() to separate the data read from the ->data_tail store.
+	 * In this case the kernel will not over-write unread data.
+	 *
+	 * See perf_output_put_handle() for the data ordering.
 	 */
 	__u64   data_head;		/* head in the data section */
 	__u64	data_tail;		/* user-space written tail */
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -87,10 +87,31 @@ again:
 		goto out;
 
 	/*
-	 * Publish the known good head. Rely on the full barrier implied
-	 * by atomic_dec_and_test() order the rb->head read and this
-	 * write.
+	 * Since the mmap() consumer (userspace) can run on a different CPU:
+	 *
+	 *   kernel				user
+	 *
+	 *   READ ->data_tail			READ ->data_head
+	 *   smp_mb()	(A)			smp_rmb()	(C)
+	 *   WRITE $data			READ $data
+	 *   smp_wmb()	(B)			smp_mb()	(D)
+	 *   STORE ->data_head			WRITE ->data_tail
+	 *
+	 * Where A pairs with D, and B pairs with C.
+	 *
+	 * I don't think A needs to be a full barrier because we won't in fact
+	 * write data until we see the store from userspace. So we simply don't
+	 * issue the data WRITE until we observe it. Be conservative for now.
+	 *
+	 * OTOH, D needs to be a full barrier since it separates the data READ
+	 * from the tail WRITE.
+	 *
+	 * For B a WMB is sufficient since it separates two WRITEs, and for C
+	 * an RMB is sufficient since it separates two READs.
+	 *
+	 * See perf_output_begin().
 	 */
+	smp_wmb();
 	rb->user_page->data_head = head;
 
 	/*
@@ -154,9 +175,11 @@ int perf_output_begin(struct perf_output
 		 * Userspace could choose to issue a mb() before updating the
 		 * tail pointer. So that all reads will be completed before the
 		 * write is issued.
+		 *
+		 * See perf_output_put_handle().
 		 */
 		tail = ACCESS_ONCE(rb->user_page->data_tail);
-		smp_rmb();
+		smp_mb();
 		offset = head = local_read(&rb->head);
 		head += size;
 		if (unlikely(!perf_output_space(rb, tail, offset, head)))



  parent reply	other threads:[~2013-11-18 18:55 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-11-18 18:40 [PATCH 3.11 00/25] 3.11.9-stable review Greg Kroah-Hartman
2013-11-18 18:40 ` [PATCH 3.11 01/25] net/mlx4_core: Fix call to __mlx4_unregister_mac Greg Kroah-Hartman
2013-11-18 18:40 ` [PATCH 3.11 02/25] net: sctp: do not trigger BUG_ON in sctp_cmd_delete_tcb Greg Kroah-Hartman
2013-11-18 18:40 ` [PATCH 3.11 03/25] net: flow_dissector: fail on evil iph->ihl Greg Kroah-Hartman
2013-11-18 18:40 ` [PATCH 3.11 04/25] virtio-net: correctly handle cpu hotplug notifier during resuming Greg Kroah-Hartman
2013-11-18 18:40 ` [PATCH 3.11 05/25] xen-netback: use jiffies_64 value to calculate credit timeout Greg Kroah-Hartman
2013-11-18 18:40 ` [PATCH 3.11 06/25] cxgb3: Fix length calculation in write_ofld_wr() on 32-bit architectures Greg Kroah-Hartman
2013-11-18 18:40 ` [PATCH 3.11 07/25] tcp: gso: fix truesize tracking Greg Kroah-Hartman
2013-11-18 18:40 ` [PATCH 3.11 08/25] ipv6: ip6_dst_check needs to check for expired dst_entries Greg Kroah-Hartman
2013-11-18 18:40 ` [PATCH 3.11 09/25] ipv6: reset dst.expires value when clearing expire flag Greg Kroah-Hartman
2013-11-18 18:40 ` [PATCH 3.11 10/25] xen-netback: Handle backend state transitions in a more robust way Greg Kroah-Hartman
2013-11-18 18:40 ` [PATCH 3.11 11/25] xen-netback: transition to CLOSED when removing a VIF Greg Kroah-Hartman
2013-11-18 18:40 ` [PATCH 3.11 12/25] Thermal: x86_pkg_temp: change spin lock Greg Kroah-Hartman
2013-11-18 18:40 ` [PATCH 3.11 13/25] hyperv-fb: add pci stub Greg Kroah-Hartman
2013-11-18 18:40 ` [PATCH 3.11 14/25] USB: add new zte 3g-dongles pid to option.c Greg Kroah-Hartman
2013-11-18 18:40 ` [PATCH 3.11 15/25] ALSA: hda - hdmi: Fix reported channel map on common default layouts Greg Kroah-Hartman
2013-11-18 18:40 ` [PATCH 3.11 16/25] tracing: Fix potential out-of-bounds in trace_get_user() Greg Kroah-Hartman
2013-11-18 18:40 ` [PATCH 3.11 17/25] drm/i915/dp: workaround BIOS eDP bpp clamping issue Greg Kroah-Hartman
2013-11-18 18:40 ` Greg Kroah-Hartman [this message]
2013-11-18 18:40 ` [PATCH 3.11 19/25] iwlwifi: pcie: add new SKUs for 7000 & 3160 NIC series Greg Kroah-Hartman
2013-11-18 18:40 ` [PATCH 3.11 20/25] misc: atmel_pwm: add deferred-probing support Greg Kroah-Hartman
2013-11-18 18:40 ` [PATCH 3.11 21/25] backlight: atmel-pwm-bl: fix deferred probe from __init Greg Kroah-Hartman
2013-11-18 18:40 ` [PATCH 3.11 22/25] usb: fix cleanup after failure in hub_configure() Greg Kroah-Hartman
2013-11-18 18:40 ` [PATCH 3.11 23/25] usb: fail on usb_hub_create_port_device() errors Greg Kroah-Hartman
2013-11-18 18:40 ` [PATCH 3.11 24/25] usbcore: set lpm_capable field for LPM capable root hubs Greg Kroah-Hartman
2013-11-18 18:40 ` [PATCH 3.11 25/25] media: sh_vou: almost forever loop in sh_vou_try_fmt_vid_out() Greg Kroah-Hartman
2013-11-19  3:09 ` [PATCH 3.11 00/25] 3.11.9-stable review Guenter Roeck
2013-11-20 11:06 ` Satoru Takeuchi
2013-11-20 15:26 ` Shuah Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20131118184037.718430418@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=anton@samba.org \
    --cc=benh@kernel.crashing.org \
    --cc=fweisbec@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@polymtl.ca \
    --cc=michael@ellerman.id.au \
    --cc=mikey@neuling.org \
    --cc=mingo@kernel.org \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=stable@vger.kernel.org \
    --cc=victork@il.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.