linux-kernel.vger.kernel.org archive mirror
* [PATCH] perf: Differentiate exec() and non-exec() comm events
@ 2014-05-28  8:45 Adrian Hunter
  2014-05-28  8:55 ` Peter Zijlstra
  0 siblings, 1 reply; 6+ messages in thread
From: Adrian Hunter @ 2014-05-28  8:45 UTC
  To: Peter Zijlstra
  Cc: Ingo Molnar, Dave Jones, Arnaldo Carvalho de Melo, linux-kernel,
	David Ahern, Jiri Olsa, Paul Mackerras

perf tools like 'perf report' can aggregate samples by comm
strings, which generally works.  However, there are other
potential use-cases.  For example, to pair up 'calls'
with 'returns' accurately (from branch events like Intel BTS)
it is necessary to identify whether the process has exec'd.
Although a comm event is generated when an 'exec' happens
it is also generated whenever the comm string is changed
on a whim (e.g. by prctl PR_SET_NAME).  This patch adds a
flag to the comm event to differentiate one case from the
other.

In order to determine whether the kernel supports the new
flag, a selection bit named 'exec' is added to struct
perf_event_attr.  The bit has no effect on which events are
generated.  It is carved out of __reserved_1, so
perf_event_open() will fail if the bit is set on kernels
that do not have it defined.

Cc: Dave Jones <davej@redhat.com>
Link: http://lkml.kernel.org/r/537D9EBE.7030806@intel.com
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 fs/exec.c                       | 6 +++---
 include/linux/perf_event.h      | 4 ++--
 include/linux/sched.h           | 6 +++++-
 include/uapi/linux/perf_event.h | 9 +++++++--
 kernel/events/core.c            | 4 ++--
 5 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index eb83064..a9a9591 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1046,13 +1046,13 @@ EXPORT_SYMBOL_GPL(get_task_comm);
  * so that a new one can be started
  */
 
-void set_task_comm(struct task_struct *tsk, const char *buf)
+void __set_task_comm(struct task_struct *tsk, const char *buf, bool exec)
 {
 	task_lock(tsk);
 	trace_task_rename(tsk, buf);
 	strlcpy(tsk->comm, buf, sizeof(tsk->comm));
 	task_unlock(tsk);
-	perf_event_comm(tsk);
+	perf_event_comm(tsk, exec);
 }
 
 int flush_old_exec(struct linux_binprm * bprm)
@@ -1111,7 +1111,7 @@ void setup_new_exec(struct linux_binprm * bprm)
 		set_dumpable(current->mm, suid_dumpable);
 
 	perf_event_exec();
-	set_task_comm(current, kbasename(bprm->filename));
+	__set_task_comm(current, kbasename(bprm->filename), true);
 
 	/* Set the new mm task size. We have to do that late because it may
 	 * depend on TIF_32BIT which is only updated in flush_thread() on
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index ec2e29f..3f736a5 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -707,7 +707,7 @@ extern int perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *
 extern int perf_unregister_guest_info_callbacks(struct perf_guest_info_callbacks *callbacks);
 
 extern void perf_event_exec(void);
-extern void perf_event_comm(struct task_struct *tsk);
+extern void perf_event_comm(struct task_struct *tsk, bool exec);
 extern void perf_event_fork(struct task_struct *tsk);
 
 /* Callchains */
@@ -814,7 +814,7 @@ static inline int perf_unregister_guest_info_callbacks
 (struct perf_guest_info_callbacks *callbacks)				{ return 0; }
 
 static inline void perf_event_mmap(struct vm_area_struct *vma)		{ }
-static inline void perf_event_comm(struct task_struct *tsk)		{ }
+static inline void perf_event_comm(struct task_struct *tsk, bool exec)	{ }
 static inline void perf_event_fork(struct task_struct *tsk)		{ }
 static inline void perf_event_init(void)				{ }
 static inline int  perf_swevent_get_recursion_context(void)		{ return -1; }
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 25f54c7..cadcf38 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2376,7 +2376,11 @@ extern long do_fork(unsigned long, unsigned long, unsigned long, int __user *, i
 struct task_struct *fork_idle(int);
 extern pid_t kernel_thread(int (*fn)(void *), void *arg, unsigned long flags);
 
-extern void set_task_comm(struct task_struct *tsk, const char *from);
+extern void __set_task_comm(struct task_struct *tsk, const char *from, bool exec);
+static inline void set_task_comm(struct task_struct *tsk, const char *from)
+{
+	__set_task_comm(tsk, from, false);
+}
 extern char *get_task_comm(char *to, struct task_struct *tsk);
 
 #ifdef CONFIG_SMP
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index d9cd853..7e55160 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -302,8 +302,8 @@ struct perf_event_attr {
 				exclude_callchain_kernel : 1, /* exclude kernel callchains */
 				exclude_callchain_user   : 1, /* exclude user callchains */
 				mmap2          :  1, /* include mmap with inode data     */
-
-				__reserved_1   : 40;
+				exec           :  1, /* flag comm events that are due to an exec */
+				__reserved_1   : 39;
 
 	union {
 		__u32		wakeup_events;	  /* wakeup every n events */
@@ -502,7 +502,12 @@ struct perf_event_mmap_page {
 #define PERF_RECORD_MISC_GUEST_KERNEL		(4 << 0)
 #define PERF_RECORD_MISC_GUEST_USER		(5 << 0)
 
+/*
+ * PERF_RECORD_MISC_MMAP_DATA and PERF_RECORD_MISC_COMM_EXEC are used on
+ * different events so can reuse the same bit position.
+ */
 #define PERF_RECORD_MISC_MMAP_DATA		(1 << 13)
+#define PERF_RECORD_MISC_COMM_EXEC		(1 << 13)
 /*
  * Indicates that the content of PERF_SAMPLE_IP points to
  * the actual instruction that triggered the event. See also
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 6a5e064..9efb1e7 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5090,7 +5090,7 @@ static void perf_event_comm_event(struct perf_comm_event *comm_event)
 		       NULL);
 }
 
-void perf_event_comm(struct task_struct *task)
+void perf_event_comm(struct task_struct *task, bool exec)
 {
 	struct perf_comm_event comm_event;
 
@@ -5104,7 +5104,7 @@ void perf_event_comm(struct task_struct *task)
 		.event_id  = {
 			.header = {
 				.type = PERF_RECORD_COMM,
-				.misc = 0,
+				.misc = exec ? PERF_RECORD_MISC_COMM_EXEC : 0,
 				/* .size */
 			},
 			/* .pid */
-- 
1.8.3.2
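
For readers of the archive: the flag added above is meant to be tested by
whatever consumes the event stream. The following is a minimal sketch of that
check, not code from the patch. The struct `sample_header` and the function
`comm_event_is_exec` are names invented for this illustration; the struct
copies the field layout of struct perf_event_header, and the two constants are
redefined locally with the values used in include/uapi/linux/perf_event.h so
that the fragment builds without kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as in include/uapi/linux/perf_event.h with the patch applied. */
#define PERF_RECORD_COMM            3
#define PERF_RECORD_MISC_COMM_EXEC  (1 << 13)

/* Illustrative stand-in for struct perf_event_header (same field layout). */
struct sample_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/* Return true only for a comm record that the kernel flagged as exec-driven. */
bool comm_event_is_exec(const struct sample_header *hdr)
{
	/*
	 * Bit 13 is shared with PERF_RECORD_MISC_MMAP_DATA, so the record
	 * type has to be checked before the bit is interpreted.
	 */
	if (hdr->type != PERF_RECORD_COMM)
		return false;
	return (hdr->misc & PERF_RECORD_MISC_COMM_EXEC) != 0;
}
```

The type check comes first because the patch deliberately reuses the bit
position of PERF_RECORD_MISC_MMAP_DATA; on any other record type the same bit
carries a different meaning.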



* Re: [PATCH] perf: Differentiate exec() and non-exec() comm events
  2014-05-28  8:45 [PATCH] perf: Differentiate exec() and non-exec() comm events Adrian Hunter
@ 2014-05-28  8:55 ` Peter Zijlstra
  2014-05-28  9:08   ` Adrian Hunter
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Zijlstra @ 2014-05-28  8:55 UTC
  To: Adrian Hunter
  Cc: Ingo Molnar, Dave Jones, Arnaldo Carvalho de Melo, linux-kernel,
	David Ahern, Jiri Olsa, Paul Mackerras

On Wed, May 28, 2014 at 11:45:04AM +0300, Adrian Hunter wrote:
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -302,8 +302,8 @@ struct perf_event_attr {
>  				exclude_callchain_kernel : 1, /* exclude kernel callchains */
>  				exclude_callchain_user   : 1, /* exclude user callchains */
>  				mmap2          :  1, /* include mmap with inode data     */
> -
> -				__reserved_1   : 40;
> +				exec           :  1, /* flag comm events that are due to an exec */
> +				__reserved_1   : 39;
>  

Yah.. that's just sad :-(

the only capabilities mask we have is in the mmap() page, so without
mmap()ing we have no way to test that.

Would it make sense to call it comm_exec?
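
For readers of the archive: since there is no capabilities mask available
before mmap(), the attr bit itself serves as the probe. A userspace tool can
set the bit, and if the open fails with EINVAL, clear it and retry. The
following is a hedged sketch of that retry logic only. Everything prefixed
`demo_` is hypothetical: `struct demo_attr` is a cut-down stand-in for struct
perf_event_attr, and the opener is passed in as a function pointer so the
fragment can run without calling perf_event_open(). The field is named
comm_exec as suggested above; the patch as posted calls it exec. EINVAL can
also have unrelated causes, so a real tool would treat the retry as a
heuristic.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, cut-down stand-in for struct perf_event_attr. */
struct demo_attr {
	unsigned int comm      : 1;	/* request comm events */
	unsigned int comm_exec : 1;	/* request the exec flag on comm events */
};

/* Hypothetical opener: returns an fd >= 0, or -1 with errno set. */
typedef int (*demo_open_fn)(const struct demo_attr *attr);

/*
 * Try with comm_exec set; if the attr is rejected with EINVAL, clear the
 * bit and retry.  *have_comm_exec reports whether the bit was accepted.
 */
int demo_open_with_fallback(struct demo_attr *attr, demo_open_fn open_fn,
			    bool *have_comm_exec)
{
	int fd;

	attr->comm_exec = 1;
	fd = open_fn(attr);
	if (fd >= 0) {
		*have_comm_exec = true;
		return fd;
	}
	if (errno != EINVAL)
		return -1;	/* unrelated failure, do not mask it */

	attr->comm_exec = 0;
	*have_comm_exec = false;
	return open_fn(attr);
}

/* Mock of an older kernel: an attr with comm_exec set is invalid. */
int demo_old_kernel_open(const struct demo_attr *attr)
{
	if (attr->comm_exec) {
		errno = EINVAL;
		return -1;
	}
	return 3;
}

/* Mock of a kernel that knows the bit. */
int demo_new_kernel_open(const struct demo_attr *attr)
{
	(void)attr;
	return 4;
}
```

When the fallback path is taken, the tool knows that comm records will never
carry PERF_RECORD_MISC_COMM_EXEC and has to fall back to treating every comm
event as a possible exec.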


* Re: [PATCH] perf: Differentiate exec() and non-exec() comm events
  2014-05-28  8:55 ` Peter Zijlstra
@ 2014-05-28  9:08   ` Adrian Hunter
  2014-05-28  9:53     ` Peter Zijlstra
  0 siblings, 1 reply; 6+ messages in thread
From: Adrian Hunter @ 2014-05-28  9:08 UTC
  To: Peter Zijlstra
  Cc: Ingo Molnar, Dave Jones, Arnaldo Carvalho de Melo, linux-kernel,
	David Ahern, Jiri Olsa, Paul Mackerras

On 05/28/2014 11:55 AM, Peter Zijlstra wrote:
> Would it make sense to call it comm_exec?

Yes, that is better.  Do you want me to resend the patch?



* Re: [PATCH] perf: Differentiate exec() and non-exec() comm events
  2014-05-28  9:08   ` Adrian Hunter
@ 2014-05-28  9:53     ` Peter Zijlstra
  2014-06-05  9:56       ` Ingo Molnar
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Zijlstra @ 2014-05-28  9:53 UTC
  To: Adrian Hunter
  Cc: Ingo Molnar, Dave Jones, Arnaldo Carvalho de Melo, linux-kernel,
	David Ahern, Jiri Olsa, Paul Mackerras

On Wed, May 28, 2014 at 12:08:57PM +0300, Adrian Hunter wrote:
> On 05/28/2014 11:55 AM, Peter Zijlstra wrote:
> > Would it make sense to call it comm_exec?
> 
> Yes, that is better.  Do you want me to resend the patch?

Nah, I'll frob it. Thanks!


* Re: [PATCH] perf: Differentiate exec() and non-exec() comm events
  2014-05-28  9:53     ` Peter Zijlstra
@ 2014-06-05  9:56       ` Ingo Molnar
  2014-06-05  9:58         ` Ingo Molnar
  0 siblings, 1 reply; 6+ messages in thread
From: Ingo Molnar @ 2014-06-05  9:56 UTC
  To: Peter Zijlstra
  Cc: Adrian Hunter, Ingo Molnar, Dave Jones, Arnaldo Carvalho de Melo,
	linux-kernel, David Ahern, Jiri Olsa, Paul Mackerras


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Wed, May 28, 2014 at 12:08:57PM +0300, Adrian Hunter wrote:
> > On 05/28/2014 11:55 AM, Peter Zijlstra wrote:
> > > Would it make sense to call it comm_exec?
> > 
> > Yes, that is better.  Do you want me to resend the patch?
> 
> Nah, I'll frob it. Thanks!

FYI, this patch breaks pretty much every non-x86 architecture:

/home/mingo/tip/fs/exec.c: In function 'setup_new_exec':
/home/mingo/tip/fs/exec.c:1113: error: implicit declaration of function 'perf_event_exec'
make[2]: *** [fs/exec.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [fs] Error 2
make[1]: *** Waiting for unfinished jobs....

Thanks,

	Ingo


* Re: [PATCH] perf: Differentiate exec() and non-exec() comm events
  2014-06-05  9:56       ` Ingo Molnar
@ 2014-06-05  9:58         ` Ingo Molnar
  0 siblings, 0 replies; 6+ messages in thread
From: Ingo Molnar @ 2014-06-05  9:58 UTC
  To: Peter Zijlstra
  Cc: Adrian Hunter, Ingo Molnar, Dave Jones, Arnaldo Carvalho de Melo,
	linux-kernel, David Ahern, Jiri Olsa, Paul Mackerras


* Ingo Molnar <mingo@kernel.org> wrote:

> > Nah, I'll frob it. Thanks!
> 
> FYI, this patch breaks pretty much every non-x86 architecture:
> 
> /home/mingo/tip/fs/exec.c: In function 'setup_new_exec':
> /home/mingo/tip/fs/exec.c:1113: error: implicit declaration of function 'perf_event_exec'
> make[2]: *** [fs/exec.o] Error 1
> make[2]: *** Waiting for unfinished jobs....
> make[1]: *** [fs] Error 2
> make[1]: *** Waiting for unfinished jobs....

Sorry, it was another patch that broke things:

Author: Peter Zijlstra <peterz@infradead.org>
Date:   Wed May 21 17:32:19 2014 +0200

    perf: Fix perf_event_comm() vs. exec() assumption

Thanks,

	Ingo
