From: Ingo Molnar <mingo@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Vince Weaver <vince@deater.net>,
hpa@zytor.com, linux-kernel@vger.kernel.org,
adrian.hunter@intel.com, tglx@linutronix.de,
linux-tip-commits@vger.kernel.org, eranian@googlemail.com
Subject: Re: [PATCH] perf: Always set bit 0 in the capabilities field of 'struct perf_event_mmap_page' to 0, to maintain the ABI
Date: Thu, 19 Sep 2013 12:28:18 +0200
Message-ID: <20130919102818.GA23487@gmail.com>
In-Reply-To: <20130919101240.GN9326@twins.programming.kicks-ass.net>
* Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, Sep 19, 2013 at 11:14:53AM +0200, Ingo Molnar wrote:
> > @@ -442,12 +445,14 @@ struct perf_event_mmap_page {
> > * ((rem * time_mult) >> time_shift);
> > */
> > __u64 time_zero;
> > + __u32 size; /* Header size up to this point */
> > + __u32 __reserved0; /* 4 byte hole */
> >
> > /*
> > * Hole for extension of the self monitor capabilities
> > */
> >
> > - __u64 __reserved[119]; /* align to 1k */
> > + __u64 __reserved[118]; /* align to 1k */
> >
> > /*
> > * Control data for the mmap() data buffer.
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index dd236b6..27d339f 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -3660,6 +3660,26 @@ static void calc_timer_values(struct perf_event *event,
> > *running = ctx_time - event->tstamp_running;
> > }
> >
> > +static void perf_event_init_userpage(struct perf_event *event)
> > +{
> > + struct perf_event_mmap_page *userpg;
> > + struct ring_buffer *rb;
> > +
> > + rcu_read_lock();
> > + rb = rcu_dereference(event->rb);
> > + if (!rb)
> > + goto unlock;
> > +
> > + userpg = rb->user_page;
> > +
> > + /* Allow new userspace to detect that bit 0 is deprecated */
> > + userpg->cap_bit0_is_deprecated = 1;
> > + userpg->size = offsetof(struct perf_event_mmap_page, size);
>
> This is fragile and I'm 100% sure we'll forget to update it.
>
> userpg->size = offsetof(struct perf_event_mmap_page, __reserved);
>
> Will auto update and mostly do the right thing.
Ah, yes, agreed 100% - that was my intention, just implemented it badly.
One detail: I think we want to track size with u8 granularity, to be able
to detect when a new u32 (or u16) field gets added, right? Updated patch
attached below.
Note that this way of writing the array size:
+ __u8 __reserved[118*8+4]; /* align to 1k. */
Makes sure that we are aware of the current word alignment situation
and makes it harder to break alignment in the future.
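To make the arithmetic behind 118*8+4 concrete, here is a minimal sketch that checks the padding at compile time. The struct is a mock, not the real uapi header: the field order and widths before time_zero are assumed from the existing layout, and the names mock_mmap_page, reserved and mock_header_size() are hypothetical.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/*
 * Mock of the start of the mmap page. NOT the real uapi struct:
 * the fields ahead of time_zero are an assumption about the layout.
 */
struct mock_mmap_page {
	uint32_t version;
	uint32_t compat_version;
	uint32_t lock;
	uint32_t index;
	int64_t  offset;
	uint64_t time_enabled;
	uint64_t time_running;
	uint64_t capabilities;
	uint16_t pmc_width;
	uint16_t time_shift;
	uint32_t time_mult;
	uint64_t time_offset;
	uint64_t time_zero;
	uint32_t size;                   /* header size up to reserved[] */
	uint8_t  reserved[118 * 8 + 4];  /* pads the header out to 1k */
	uint64_t data_head;              /* first field of the control data */
};

/* The header ends right after the u32 size field at offset 72. */
static_assert(offsetof(struct mock_mmap_page, reserved) == 76,
	      "header size changed");

/* 76 + (118*8 + 4) == 1024, so the control data stays at the 1k mark. */
static_assert(offsetof(struct mock_mmap_page, data_head) == 1024,
	      "control data is no longer 1k aligned");

/* The value a kernel built from this mock layout would store in size. */
size_t mock_header_size(void)
{
	return offsetof(struct mock_mmap_page, reserved);
}
```

With byte granularity, adding another u32 field later would mean shrinking the array by 4 again, and a check of this kind fails at build time if that is forgotten.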
> Also, how will userspace know there's a valid size field way out there?
On old kernels the size field reads as 0, so a valid size field is
easily detected by being nonzero.
> Shouldn't we bump the version field to indicate so? :-) After all,
> running on old userspace this field will be 0.
Well, on old kernels this will be 0, right?
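From the userspace side, the detection discussed above might look roughly like the sketch below. It is illustrative only: the MOCK_* masks assume the bitfield in this patch is allocated from the least significant bit upwards (which is what GCC does on little-endian targets), and struct mock_caps and mock_decode() are hypothetical names, not an existing API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, following the bitfield order in the patch. */
#define MOCK_CAP_BIT0               (1ULL << 0) /* deprecated, always 0 */
#define MOCK_CAP_BIT0_IS_DEPRECATED (1ULL << 1) /* always 1 on new kernels */
#define MOCK_CAP_USR_RDPMC          (1ULL << 2)
#define MOCK_CAP_USR_TIME           (1ULL << 3)
#define MOCK_CAP_USR_TIME_ZERO      (1ULL << 4)

struct mock_caps {
	bool new_layout; /* kernel advertises the new bit layout */
	bool has_size;   /* size field is valid (nonzero) */
	bool rdpmc;
	bool time;
	bool time_zero;
};

/* Decode the raw 'capabilities' word and 'size' field read from the page. */
struct mock_caps mock_decode(uint64_t capabilities, uint32_t size)
{
	struct mock_caps c = { 0 };

	c.has_size = (size != 0);
	c.new_layout = (capabilities & MOCK_CAP_BIT0_IS_DEPRECATED) != 0;

	if (c.new_layout) {
		c.rdpmc     = (capabilities & MOCK_CAP_USR_RDPMC) != 0;
		c.time      = (capabilities & MOCK_CAP_USR_TIME) != 0;
		c.time_zero = (capabilities & MOCK_CAP_USR_TIME_ZERO) != 0;
	}
	/*
	 * Without the marker bit this sketch conservatively reports no
	 * capabilities; a real tool would need its own fallback policy
	 * for older kernels.
	 */
	return c;
}
```

Working on the raw __u64 rather than on the bitfield members keeps such a check independent of the names in whichever header the tool was built against.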
Thanks,
Ingo
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 8355c84..3ab624c 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1883,9 +1883,9 @@ static struct pmu pmu = {
void arch_perf_update_userpage(struct perf_event_mmap_page *userpg, u64 now)
{
- userpg->cap_usr_time = 0;
- userpg->cap_usr_time_zero = 0;
- userpg->cap_usr_rdpmc = x86_pmu.attr_rdpmc;
+ userpg->cap_usr_time_used = 0;
+ userpg->cap_usr_time_zero_used = 0;
+ userpg->cap_usr_rdpmc_available = x86_pmu.attr_rdpmc;
userpg->pmc_width = x86_pmu.cntval_bits;
if (!boot_cpu_has(X86_FEATURE_CONSTANT_TSC))
@@ -1894,13 +1894,13 @@ void arch_perf_update_userpage(struct perf_event_mmap_page *userpg, u64 now)
if (!boot_cpu_has(X86_FEATURE_NONSTOP_TSC))
return;
- userpg->cap_usr_time = 1;
+ userpg->cap_usr_time_used = 1;
userpg->time_mult = this_cpu_read(cyc2ns);
userpg->time_shift = CYC2NS_SCALE_FACTOR;
userpg->time_offset = this_cpu_read(cyc2ns_offset) - now;
if (sched_clock_stable && !check_tsc_disabled()) {
- userpg->cap_usr_time_zero = 1;
+ userpg->cap_usr_time_zero_used = 1;
userpg->time_zero = this_cpu_read(cyc2ns_offset);
}
}
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 40a1fb8..e3514d1 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -380,10 +380,13 @@ struct perf_event_mmap_page {
union {
__u64 capabilities;
struct {
- __u64 cap_usr_time : 1,
- cap_usr_rdpmc : 1,
- cap_usr_time_zero : 1,
- cap_____res : 61;
+ __u64 cap_bit0 : 1, /* Deprecated, always zero, see commit 860f085b74e9 */
+ cap_bit0_is_deprecated : 1, /* Always 1, signals that bit 0 is zero */
+
+ cap_usr_rdpmc_available : 1, /* The RDPMC instruction can be used to read counts */
+ cap_usr_time_used : 1, /* The time_* fields are used */
+ cap_usr_time_zero_used : 1, /* The time_zero field is used */
+ cap_____res : 59;
};
};
@@ -442,12 +445,13 @@ struct perf_event_mmap_page {
* ((rem * time_mult) >> time_shift);
*/
__u64 time_zero;
+ __u32 size; /* Header size up to __reserved[] fields. */
/*
* Hole for extension of the self monitor capabilities
*/
- __u64 __reserved[119]; /* align to 1k */
+ __u8 __reserved[118*8+4]; /* align to 1k. */
/*
* Control data for the mmap() data buffer.
diff --git a/kernel/events/core.c b/kernel/events/core.c
index dd236b6..cb4238e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3660,6 +3660,26 @@ static void calc_timer_values(struct perf_event *event,
*running = ctx_time - event->tstamp_running;
}
+static void perf_event_init_userpage(struct perf_event *event)
+{
+ struct perf_event_mmap_page *userpg;
+ struct ring_buffer *rb;
+
+ rcu_read_lock();
+ rb = rcu_dereference(event->rb);
+ if (!rb)
+ goto unlock;
+
+ userpg = rb->user_page;
+
+ /* Allow new userspace to detect that bit 0 is deprecated */
+ userpg->cap_bit0_is_deprecated = 1;
+ userpg->size = offsetof(struct perf_event_mmap_page, __reserved);
+
+unlock:
+ rcu_read_unlock();
+}
+
void __weak arch_perf_update_userpage(struct perf_event_mmap_page *userpg, u64 now)
{
}
@@ -4044,6 +4064,7 @@ again:
ring_buffer_attach(event, rb);
rcu_assign_pointer(event->rb, rb);
+ perf_event_init_userpage(event);
perf_event_update_userpage(event);
unlock: