linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Vince Weaver <vince@deater.net>,
	hpa@zytor.com, linux-kernel@vger.kernel.org, tglx@linutronix.de,
	linux-tip-commits@vger.kernel.org, eranian@googlemail.com
Subject: [PATCH, v4] perf: Fix capabilities bitfield compatibility in 'struct perf_event_mmap_page'
Date: Thu, 19 Sep 2013 13:42:54 +0200	[thread overview]
Message-ID: <20130919114254.GA22807@gmail.com> (raw)
In-Reply-To: <523ADD8D.1030607@intel.com>


* Adrian Hunter <adrian.hunter@intel.com> wrote:

> >  		struct {
> > -			__u64	cap_usr_time		: 1,
> > -				cap_usr_rdpmc		: 1,
> > -				cap_usr_time_zero	: 1,
> > -				cap_____res		: 61;
> > +			__u64	cap_bit0		: 1, /* Always 0, deprecated, see commit 860f085b74e9 */
> > +				cap_bit0_is_deprecated	: 1, /* Always 1, signals that bit 0 is zero */
> > +
> > +				cap_user_rdpmc		: 1, /* The RDPMC instruction can be used to read counts */
> > +				cap_user_time		: 1, /* The time_* fields are used */
> > +				cap_user_time_zero	: 1, /* The time_zero field is used */
> > +				cap_____res		: 59;
> >  		};

> Please consider adding:

indeed - added. I also build-tested it all and added a changelog - see the 
updated patch below.

Thanks,

	Ingo

--------------------->
Subject: perf: Fix capabilities bitfield compatibility in 'struct perf_event_mmap_page'
From: Peter Zijlstra <peterz@infradead.org>
Date: Thu, 19 Sep 2013 10:16:42 +0200

Solve the problems around the broken definition of perf_event_mmap_page::
cap_usr_time and cap_usr_rdpmc fields which used to overlap, partially
fixed by:

  860f085b74e9 ("perf: Fix broken union in 'struct perf_event_mmap_page'")

The problem with the fix (merged in v3.12-rc1 and not yet released
officially), noticed by Vince Weaver is that the new behavior is
not detectable by new user-space, and that due to the reuse of the
field names it's easy to mis-compile a binary if old headers are used
on a new kernel or new headers are used on an old kernel.

To solve all that make this change explicit, detectable and self-contained,
by iterating the ABI the following way:

 - Always clear bit 0, and rename it to usrpage->cap_bit0, to at least not
   confuse old user-space binaries. RDPMC will be marked as unavailable
   to old binaries but that's within the ABI, this is a capability bit.

 - Rename bit 1 to ->cap_bit0_is_deprecated and always set it to 1, so new
   libraries can reliably detect that bit 0 is deprecated and perma-zero
   without having to check the kernel version.

 - Use bits 2, 3, 4 for the newly defined, correct functionality:

	cap_user_rdpmc		: 1, /* The RDPMC instruction can be used to read counts */
	cap_user_time		: 1, /* The time_* fields are used */
	cap_user_time_zero	: 1, /* The time_zero field is used */

 - Rename all the bitfield names in perf_event.h to be different from the
   old names, to make sure it's not possible to mis-compile it
   accidentally with old assumptions.

The 'size' field can then be used in the future to add new fields and it
will act as a natural ABI version indicator as well.

Also adjust tools/perf/ userspace for the new definitions, noticed by
Adrian Hunter.

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Also-Fixed-by: Adrian Hunter <adrian.hunter@intel.com>
Link: http://lkml.kernel.org/n/tip-zr03yxjrpXesOzzupszqglbv@git.kernel.org
[ Restructured it, using Peter's original patches. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/cpu/perf_event.c |   10 +++++-----
 include/uapi/linux/perf_event.h  |   14 +++++++++-----
 kernel/events/core.c             |   21 +++++++++++++++++++++
 tools/perf/arch/x86/util/tsc.c   |    6 +++---
 4 files changed, 38 insertions(+), 13 deletions(-)

Index: tip/arch/x86/kernel/cpu/perf_event.c
===================================================================
--- tip.orig/arch/x86/kernel/cpu/perf_event.c
+++ tip/arch/x86/kernel/cpu/perf_event.c
@@ -1883,9 +1883,9 @@ static struct pmu pmu = {
 
 void arch_perf_update_userpage(struct perf_event_mmap_page *userpg, u64 now)
 {
-	userpg->cap_usr_time = 0;
-	userpg->cap_usr_time_zero = 0;
-	userpg->cap_usr_rdpmc = x86_pmu.attr_rdpmc;
+	userpg->cap_user_time = 0;
+	userpg->cap_user_time_zero = 0;
+	userpg->cap_user_rdpmc = x86_pmu.attr_rdpmc;
 	userpg->pmc_width = x86_pmu.cntval_bits;
 
 	if (!boot_cpu_has(X86_FEATURE_CONSTANT_TSC))
@@ -1894,13 +1894,13 @@ void arch_perf_update_userpage(struct pe
 	if (!boot_cpu_has(X86_FEATURE_NONSTOP_TSC))
 		return;
 
-	userpg->cap_usr_time = 1;
+	userpg->cap_user_time = 1;
 	userpg->time_mult = this_cpu_read(cyc2ns);
 	userpg->time_shift = CYC2NS_SCALE_FACTOR;
 	userpg->time_offset = this_cpu_read(cyc2ns_offset) - now;
 
 	if (sched_clock_stable && !check_tsc_disabled()) {
-		userpg->cap_usr_time_zero = 1;
+		userpg->cap_user_time_zero = 1;
 		userpg->time_zero = this_cpu_read(cyc2ns_offset);
 	}
 }
Index: tip/include/uapi/linux/perf_event.h
===================================================================
--- tip.orig/include/uapi/linux/perf_event.h
+++ tip/include/uapi/linux/perf_event.h
@@ -380,10 +380,13 @@ struct perf_event_mmap_page {
 	union {
 		__u64	capabilities;
 		struct {
-			__u64	cap_usr_time		: 1,
-				cap_usr_rdpmc		: 1,
-				cap_usr_time_zero	: 1,
-				cap_____res		: 61;
+			__u64	cap_bit0		: 1, /* Always 0, deprecated, see commit 860f085b74e9 */
+				cap_bit0_is_deprecated	: 1, /* Always 1, signals that bit 0 is zero */
+
+				cap_user_rdpmc		: 1, /* The RDPMC instruction can be used to read counts */
+				cap_user_time		: 1, /* The time_* fields are used */
+				cap_user_time_zero	: 1, /* The time_zero field is used */
+				cap_____res		: 59;
 		};
 	};
 
@@ -442,12 +445,13 @@ struct perf_event_mmap_page {
 	 *               ((rem * time_mult) >> time_shift);
 	 */
 	__u64	time_zero;
+	__u32	size;			/* Header size up to __reserved[] fields. */
 
 		/*
 		 * Hole for extension of the self monitor capabilities
 		 */
 
-	__u64	__reserved[119];	/* align to 1k */
+	__u8	__reserved[118*8+4];	/* align to 1k. */
 
 	/*
 	 * Control data for the mmap() data buffer.
Index: tip/kernel/events/core.c
===================================================================
--- tip.orig/kernel/events/core.c
+++ tip/kernel/events/core.c
@@ -3660,6 +3660,26 @@ static void calc_timer_values(struct per
 	*running = ctx_time - event->tstamp_running;
 }
 
+static void perf_event_init_userpage(struct perf_event *event)
+{
+	struct perf_event_mmap_page *userpg;
+	struct ring_buffer *rb;
+
+	rcu_read_lock();
+	rb = rcu_dereference(event->rb);
+	if (!rb)
+		goto unlock;
+
+	userpg = rb->user_page;
+
+	/* Allow new userspace to detect that bit 0 is deprecated */
+	userpg->cap_bit0_is_deprecated = 1;
+	userpg->size = offsetof(struct perf_event_mmap_page, __reserved);
+
+unlock:
+	rcu_read_unlock();
+}
+
 void __weak arch_perf_update_userpage(struct perf_event_mmap_page *userpg, u64 now)
 {
 }
@@ -4044,6 +4064,7 @@ again:
 	ring_buffer_attach(event, rb);
 	rcu_assign_pointer(event->rb, rb);
 
+	perf_event_init_userpage(event);
 	perf_event_update_userpage(event);
 
 unlock:
Index: tip/tools/perf/arch/x86/util/tsc.c
===================================================================
--- tip.orig/tools/perf/arch/x86/util/tsc.c
+++ tip/tools/perf/arch/x86/util/tsc.c
@@ -32,7 +32,7 @@ u64 tsc_to_perf_time(u64 cyc, struct per
 int perf_read_tsc_conversion(const struct perf_event_mmap_page *pc,
 			     struct perf_tsc_conversion *tc)
 {
-	bool cap_usr_time_zero;
+	bool cap_user_time_zero;
 	u32 seq;
 	int i = 0;
 
@@ -42,7 +42,7 @@ int perf_read_tsc_conversion(const struc
 		tc->time_mult = pc->time_mult;
 		tc->time_shift = pc->time_shift;
 		tc->time_zero = pc->time_zero;
-		cap_usr_time_zero = pc->cap_usr_time_zero;
+		cap_user_time_zero = pc->cap_user_time_zero;
 		rmb();
 		if (pc->lock == seq && !(seq & 1))
 			break;
@@ -52,7 +52,7 @@ int perf_read_tsc_conversion(const struc
 		}
 	}
 
-	if (!cap_usr_time_zero)
+	if (!cap_user_time_zero)
 		return -EOPNOTSUPP;
 
 	return 0;

  reply	other threads:[~2013-09-19 11:42 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-06-28 13:22 [PATCH 0/5] perf: add two new features Adrian Hunter
2013-06-28 13:22 ` [PATCH 1/5] perf: fix broken union in perf_event_mmap_page Adrian Hunter
2013-06-28 15:22   ` Peter Zijlstra
2013-07-16 11:51     ` H. Peter Anvin
2013-07-24  3:56   ` [tip:perf/core] perf: Fix broken union in ' struct perf_event_mmap_page' tip-bot for Adrian Hunter
2013-09-17 20:23     ` Vince Weaver
2013-09-17 20:35       ` Vince Weaver
2013-09-19  8:42         ` Ingo Molnar
2013-09-18  8:57       ` Peter Zijlstra
2013-09-18 14:19         ` Vince Weaver
2013-09-18 15:42           ` Peter Zijlstra
2013-09-18 18:33             ` Stephane Eranian
2013-09-19  8:43               ` Peter Zijlstra
2013-09-19  8:55                 ` Stephane Eranian
2013-09-19  9:16                   ` Ingo Molnar
2013-09-18 20:07             ` Vince Weaver
2013-09-19  8:16               ` Peter Zijlstra
2013-09-19  9:14                 ` [PATCH] perf: Always set bit 0 in the capabilities field of 'struct perf_event_mmap_page' to 0, to maintain the ABI Ingo Molnar
2013-09-19 10:12                   ` Peter Zijlstra
2013-09-19 10:28                     ` Ingo Molnar
2013-09-19 10:35                       ` Peter Zijlstra
2013-09-19 10:40                         ` [PATCH, v3] " Ingo Molnar
2013-09-19 11:18                           ` Adrian Hunter
2013-09-19 11:42                             ` Ingo Molnar [this message]
2013-09-19 17:40                               ` [PATCH, v4] perf: Fix capabilities bitfield compatibility in 'struct perf_event_mmap_page' Vince Weaver
2013-09-20  7:44                                 ` Ingo Molnar
2013-09-18  9:13       ` [tip:perf/core] perf: Fix broken union in ' struct perf_event_mmap_page' Adrian Hunter
2013-09-18 14:10         ` Vince Weaver
2013-06-28 13:22 ` [PATCH 2/5] x86: add ability to calculate TSC from perf sample timestamps Adrian Hunter
2013-07-24  3:56   ` [tip:perf/core] perf/x86: Add " tip-bot for Adrian Hunter
2013-06-28 13:22 ` [PATCH 3/5] perf tools: add test for converting perf time to/from TSC Adrian Hunter
2013-07-24  3:56   ` [tip:perf/core] perf tools: Add test for converting perf time to/ from TSC tip-bot for Adrian Hunter
2013-06-28 13:22 ` [PATCH 4/5] perf: add 'keep tracking' flag to PERF_EVENT_IOC_DISABLE Adrian Hunter
2013-06-28 13:22 ` [PATCH 5/5] perf tools: add 'keep tracking' test Adrian Hunter
2013-06-28 15:27 ` [PATCH 0/5] perf: add two new features Peter Zijlstra
2013-06-28 19:22   ` Adrian Hunter
2013-07-16  6:22     ` Adrian Hunter
2013-07-16 14:34       ` Peter Zijlstra
2013-07-17 11:28         ` Adrian Hunter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130919114254.GA22807@gmail.com \
    --to=mingo@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=eranian@googlemail.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-tip-commits@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=vince@deater.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).