linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* perf: PERF_FLAG_FD_OUTPUT has been broken since 2.6.35
@ 2015-01-08 15:22 Vince Weaver
  2015-01-09 16:00 ` Peter Zijlstra
  0 siblings, 1 reply; 6+ messages in thread
From: Vince Weaver @ 2015-01-08 15:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo

Hello,

I was working on improving the manpage by working out how the 
perf_event_open() PERF_FLAG_FD_OUTPUT flag works.

It turns out it doesn't.

In kernel/events/core.c when opening a file this is done:

        if (output_event) {
           err = perf_event_set_output(event, output_event);
           if (err)
              goto err_context;
        }

	and perf_event_set_output() does:

                if (output_event->cpu == -1 && output_event->ctx != event->ctx) {
                   goto out;
                }

	But event->ctx is always NULL; it does not get set to ctx until
	after the call to perf_event_set_output().

	It looks like this was broken back in 2.6.35 days with the change:

		commit ac9721f3f54b27a16c7e1afb2481e7ee95a70318
		Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
		Date:   Thu May 27 12:54:41 2010 +0200

		perf_events: Fix races and clean up perf_event and perf_mmap_data interaction

So is this worth fixing seeing as apparently no one uses this feature?

As an aside, added error reporting is *really* needed as I had to sprinkle
all of events/core.c with printk() calls to even begin to understand why
I was getting the EINVAL result.

Vince

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: perf: PERF_FLAG_FD_OUTPUT has been broken since 2.6.35
  2015-01-08 15:22 perf: PERF_FLAG_FD_OUTPUT has been broken since 2.6.35 Vince Weaver
@ 2015-01-09 16:00 ` Peter Zijlstra
  2015-01-09 16:26   ` Vince Weaver
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Zijlstra @ 2015-01-09 16:00 UTC (permalink / raw)
  To: Vince Weaver
  Cc: linux-kernel, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo

On Thu, Jan 08, 2015 at 10:22:37AM -0500, Vince Weaver wrote:
> Hello,
> 
> I was working on improving the manpage by working out how the 
> perf_event_open() PERF_FLAG_FD_OUTPUT flag works.
> 
> It turns out it doesn't.
> 
> In kernel/events/core.c when opening a file this is done:
> 
>         if (output_event) {
>            err = perf_event_set_output(event, output_event);
>            if (err)
>               goto err_context;
>         }
> 
> 	and perf_event_set_output() does:
> 
>                 if (output_event->cpu == -1 && output_event->ctx != event->ctx) {
>                    goto out;
>                 }
> 
> 	But event->ctx is always NULL; it does not get set to ctx until
> 	after the call to perf_event_set_output().
> 
> 	It looks like this was broken back in 2.6.35 days with the change:
> 
> 		commit ac9721f3f54b27a16c7e1afb2481e7ee95a70318
> 		Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
> 		Date:   Thu May 27 12:54:41 2010 +0200
> 
> 		perf_events: Fix races and clean up perf_event and perf_mmap_data interaction
> 
> So is this worth fixing seeing as apparently no one uses this feature?

I think there's a fair argument for removing it, Ingo, Acme?

> As an aside, added error reporting is *really* needed as I had to sprinkle
> all of events/core.c with printk() calls to even begin to understand why
> I was getting the EINVAL result.

Yes, let me see if I can find someone to work on that :/

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: perf: PERF_FLAG_FD_OUTPUT has been broken since 2.6.35
  2015-01-09 16:00 ` Peter Zijlstra
@ 2015-01-09 16:26   ` Vince Weaver
  2015-01-09 23:17     ` Arnaldo Carvalho de Melo
  2015-01-13  9:36     ` Peter Zijlstra
  0 siblings, 2 replies; 6+ messages in thread
From: Vince Weaver @ 2015-01-09 16:26 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Vince Weaver, linux-kernel, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo


On Fri, 9 Jan 2015, Peter Zijlstra wrote:
> > So is this worth fixing seeing as apparently no one uses this feature?
> 
> I think there's a fair argument for removing it, Ingo, Acme?

could the functionality be replaced with a subsequent call to
	ioctl(PERF_EVENT_IOC_SET_OUTPUT)
?

Although I suppose there's a possibility for losing a small amount of data 
or some other reason that PERF_FLAG_FD_OUTPUT was introduced in the first 
place.

In addition, if we remove PERF_FLAG_FD_OUTPUT would there then be any 
reason to keep PERF_FLAG_FD_NO_GROUP around?

Vince

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: perf: PERF_FLAG_FD_OUTPUT has been broken since 2.6.35
  2015-01-09 16:26   ` Vince Weaver
@ 2015-01-09 23:17     ` Arnaldo Carvalho de Melo
  2015-01-13  9:36     ` Peter Zijlstra
  1 sibling, 0 replies; 6+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-01-09 23:17 UTC (permalink / raw)
  To: Vince Weaver; +Cc: Peter Zijlstra, linux-kernel, Paul Mackerras, Ingo Molnar

Em Fri, Jan 09, 2015 at 11:26:54AM -0500, Vince Weaver escreveu:
> 
> On Fri, 9 Jan 2015, Peter Zijlstra wrote:
> > > So is this worth fixing seeing as apparently no one uses this feature?
> > 
> > I think there's a fair argument for removing it, Ingo, Acme?
> 
> could the functionality be replaced with a subsequent call to
> 	ioctl(PERF_EVENT_IOC_SET_OUTPUT)
> ?

That is the only thing tools/perf uses:

[acme@zoo linux]$ find tools/perf -name "*.[chly]" | xargs grep PERF_EVENT_IOC_SET_OUTPUT 
tools/perf/util/evlist.h: * @refcnt - e.g. code using PERF_EVENT_IOC_SET_OUTPUT to share this
tools/perf/util/evlist.c:			if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, *output) != 0)
tools/perf/tests/perf-record.c:	 * (using ioctl(PERF_EVENT_IOC_SET_OUTPUT)).
[acme@zoo linux]$ find tools/perf -name "*.[chly]" | xargs grep PERF_FLAG_FD_OUTPUT
[acme@zoo linux]$
 
> Although I suppose there's a possibility for losing a small amount of data 
> or some other reason that PERF_FLAG_FD_OUTPUT was introduced in the first 
> place.

Humm, IIRC tools/perf starts with the event disabled and then asks for
enable_on_exec when starting workloads but yes, when you're attaching to
something that is already running you'd take a bit longer to start getting
samples.
 
> In addition, if we remove PERF_FLAG_FD_OUTPUT would there then be any 
> reason to keep PERF_FLAG_FD_NO_GROUP around?
> 
> Vince

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: perf: PERF_FLAG_FD_OUTPUT has been broken since 2.6.35
  2015-01-09 16:26   ` Vince Weaver
  2015-01-09 23:17     ` Arnaldo Carvalho de Melo
@ 2015-01-13  9:36     ` Peter Zijlstra
  2015-01-13 11:56       ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 6+ messages in thread
From: Peter Zijlstra @ 2015-01-13  9:36 UTC (permalink / raw)
  To: Vince Weaver
  Cc: linux-kernel, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo

On Fri, Jan 09, 2015 at 11:26:54AM -0500, Vince Weaver wrote:
> 
> On Fri, 9 Jan 2015, Peter Zijlstra wrote:
> > > So is this worth fixing seeing as apparently no one uses this feature?
> > 
> > I think there's a fair argument for removing it, Ingo, Acme?
> 
> could the functionality be replaced with a subsequent call to
> 	ioctl(PERF_EVENT_IOC_SET_OUTPUT)

Yes.

> Although I suppose there's a possibility for losing a small amount of data 
> or some other reason that PERF_FLAG_FD_OUTPUT was introduced in the first 
> place.

That's a natural race without solution. That is, because there is no
serialization between the sys_perf_event_open() and any possible acts of
data generation, we can't even talk about loosing data.

Even if it were all in a single syscall, there is no saying how long
that syscall would take to complete -- imagine its stuck on memory
allocation or whatnot. Similarly you could have issued the syscall a wee
bit later or whatnot.

> In addition, if we remove PERF_FLAG_FD_OUTPUT would there then be any 
> reason to keep PERF_FLAG_FD_NO_GROUP around?

Good point, I think there was another potential user of that proposed
recently, but I can't quite remember where or what. Let me try and go
find that.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: perf: PERF_FLAG_FD_OUTPUT has been broken since 2.6.35
  2015-01-13  9:36     ` Peter Zijlstra
@ 2015-01-13 11:56       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 6+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-01-13 11:56 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Vince Weaver, linux-kernel, Paul Mackerras, Ingo Molnar

Em Tue, Jan 13, 2015 at 10:36:03AM +0100, Peter Zijlstra escreveu:
> On Fri, Jan 09, 2015 at 11:26:54AM -0500, Vince Weaver wrote:
> > On Fri, 9 Jan 2015, Peter Zijlstra wrote:
> > > > So is this worth fixing seeing as apparently no one uses this feature?

> > > I think there's a fair argument for removing it, Ingo, Acme?

> > could the functionality be replaced with a subsequent call to
> > 	ioctl(PERF_EVENT_IOC_SET_OUTPUT)
 
> Yes.
 
> > Although I suppose there's a possibility for losing a small amount of data 
> > or some other reason that PERF_FLAG_FD_OUTPUT was introduced in the first 
> > place.
 
> That's a natural race without solution. That is, because there is no
> serialization between the sys_perf_event_open() and any possible acts of
> data generation, we can't even talk about loosing data.

> Even if it were all in a single syscall, there is no saying how long
> that syscall would take to complete -- imagine its stuck on memory
> allocation or whatnot. Similarly you could have issued the syscall a wee
> bit later or whatnot.

That is exactly my thinking, the world goes on while we set up perf to
observe it :-)

- Arnaldo

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2015-01-13 11:56 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-01-08 15:22 perf: PERF_FLAG_FD_OUTPUT has been broken since 2.6.35 Vince Weaver
2015-01-09 16:00 ` Peter Zijlstra
2015-01-09 16:26   ` Vince Weaver
2015-01-09 23:17     ` Arnaldo Carvalho de Melo
2015-01-13  9:36     ` Peter Zijlstra
2015-01-13 11:56       ` Arnaldo Carvalho de Melo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).