perf: regression with PERF_EVENT_IOC

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* perf: regression with PERF_EVENT_IOC_REFRESH
@ 2011-05-23 20:04 Vince Weaver
  2011-05-23 20:22 ` Peter Zijlstra
  0 siblings, 1 reply; 25+ messages in thread
From: Vince Weaver @ 2011-05-23 20:04 UTC (permalink / raw)
  To: linux-kernel; +Cc: fbuihuu, a.p.zijlstra, mingo, paulus, acme

Hello

the changeset 2e939d1d   perf: Limit event refresh to sampling event

changes the behavior of
  ioctl( , PERF_EVENT_IOC_REFRESH, )

before the changeset, you could have a counter group where only one of the 
subevents was sampling.  PERF_EVENT_IOC_REFRESH would correctly enable 
sampling for only that subevent.

With the changeset applied, this fails with EINVALID.  Now you can only 
sample in a group leader.

Was this an intended change?  It breaks various of our PAPI regression 
tests that worked fine before the change was committed.

Vince
vweaver1@eecs.utk.edu

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: perf: regression with PERF_EVENT_IOC_REFRESH
  2011-05-23 20:04 perf: regression with PERF_EVENT_IOC_REFRESH Vince Weaver
@ 2011-05-23 20:22 ` Peter Zijlstra
  2011-05-24  6:20   ` Vince Weaver
  0 siblings, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2011-05-23 20:22 UTC (permalink / raw)
  To: Vince Weaver; +Cc: linux-kernel, fbuihuu, mingo, paulus, acme

On Mon, 2011-05-23 at 16:04 -0400, Vince Weaver wrote:
> Hello
> 
> the changeset 2e939d1d   perf: Limit event refresh to sampling event
> 
> changes the behavior of
>   ioctl( , PERF_EVENT_IOC_REFRESH, )
> 
> before the changeset, you could have a counter group where only one of the 
> subevents was sampling.  PERF_EVENT_IOC_REFRESH would correctly enable 
> sampling for only that subevent.

But how? it adds to the event_limit of the event you call the ioctl()
on, not on any of the group siblings.

> With the changeset applied, this fails with EINVALID.  Now you can only 
> sample in a group leader.

I'm not quite seeing how group-leaders are relevant here, you can only
call this ioctl() on sampling events, regardless of if they're they
group leader or a sibling.

> Was this an intended change?  It breaks various of our PAPI regression 
> tests that worked fine before the change was committed.

I'm not quite seeing what's wrong yet, the change seemed simple enough
in that calling that ioctl() on an event that wasn't capable of
generating samples doesn't make sense, since it will increase the event
limit for the event you call it on.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: perf: regression with PERF_EVENT_IOC_REFRESH
  2011-05-23 20:22 ` Peter Zijlstra
@ 2011-05-24  6:20   ` Vince Weaver
  2011-05-24 10:30     ` Peter Zijlstra
  0 siblings, 1 reply; 25+ messages in thread
From: Vince Weaver @ 2011-05-24  6:20 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel, fbuihuu, mingo, paulus, acme

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1451 bytes --]

On Mon, 23 May 2011, Peter Zijlstra wrote:

> > before the changeset, you could have a counter group where only one of the 
> > subevents was sampling.  PERF_EVENT_IOC_REFRESH would correctly enable 
> > sampling for only that subevent.
> 
> But how? it adds to the event_limit of the event you call the ioctl()
> on, not on any of the group siblings.

the old behavior did.

> > With the changeset applied, this fails with EINVALID.  Now you can only 
> > sample in a group leader.
> 
> I'm not quite seeing how group-leaders are relevant here, you can only
> call this ioctl() on sampling events, regardless of if they're they
> group leader or a sibling.

previous behavior was that if you called it on a group leader that wasn't 
sampling, a sibling event that was sampling would be activated.

> > Was this an intended change?  It breaks various of our PAPI regression 
> > tests that worked fine before the change was committed.
> 
> I'm not quite seeing what's wrong yet, the change seemed simple enough
> in that calling that ioctl() on an event that wasn't capable of
> generating samples doesn't make sense, since it will increase the event
> limit for the event you call it on.

attached is some code that will return a sampled count on older kernels
but gives EINVAL on current kernels.

The old behavior might have been unintentional, but PAPI unfortunately 
depends on it (not code I originaly wrote, but code I have to maintain).

Vince

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Type: TEXT/x-csrc; name=sampled_notleader_refresh.c, Size: 3040 bytes --]

/* sampled_notleader_refresh.c  */
/* by Vince Weaver   vweaver1 _at_ eecs.utk.edu */

/* Compile with gcc -O2 -Wall -o sampled_notleader_refresh sampled_notleader_refresh.c */

#define _GNU_SOURCE 1

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <unistd.h>
#include <fcntl.h>

#include <errno.h>

#include <signal.h>

#include <sys/mman.h>

#include <sys/ioctl.h>
#include <asm/unistd.h>
#include <sys/prctl.h>
#include <linux/perf_event.h>



static int count=0;

static void our_handler(int signum,siginfo_t *oh, void *blah) {
  count++;
}

int perf_event_open(struct perf_event_attr *hw_event_uptr,
		    pid_t pid, int cpu, int group_fd, unsigned long flags) {
 
   return syscall(__NR_perf_event_open, hw_event_uptr, pid, cpu, group_fd,flags);
}

double busywork(int count) {
 
   int i;
   double sum=0.0012;
   
   for(i=0;i<count;i++) {
      sum+=0.01;
   }
   return sum;
   
}


int main(int argc, char** argv) {
   
  int ret,fd1,fd2;   
  double result;

   struct perf_event_attr pe;

   struct sigaction sa;

   memset(&sa, 0, sizeof(struct sigaction));
   sa.sa_sigaction = our_handler;
   sa.sa_flags = SA_SIGINFO;

   if (sigaction( SIGIO, &sa, NULL) < 0) {
     fprintf(stderr,"Error setting up signal handler\n");
     exit(1);
   }
   
   memset(&pe,0,sizeof(struct perf_event_attr));

   pe.type=PERF_TYPE_HARDWARE;
   pe.size=sizeof(struct perf_event_attr);
   pe.config=PERF_COUNT_HW_INSTRUCTIONS;
   pe.sample_period=000000;
   pe.sample_type=0;
   pe.read_format=PERF_FORMAT_GROUP|PERF_FORMAT_ID;
   pe.disabled=1;
   pe.pinned=0;
   pe.exclude_kernel=1;
   pe.exclude_hv=1;
   pe.wakeup_events=1;

   fd1=perf_event_open(&pe,0,-1,-1,0);
   if (fd1<0) {
      fprintf(stderr,"Error opening leader %llx\n",pe.config);
      exit(1);
   }

   pe.type=PERF_TYPE_HARDWARE;
   pe.config=PERF_COUNT_HW_INSTRUCTIONS;
   pe.sample_period=1000000;
   pe.sample_type=PERF_SAMPLE_IP;
   pe.read_format=PERF_FORMAT_GROUP|PERF_FORMAT_ID;
   pe.disabled=0;
   pe.pinned=0;
   pe.exclude_kernel=1;
   pe.exclude_hv=1;
   pe.wakeup_events=1;

   fd2=perf_event_open(&pe,0,-1,fd1,0);
   if (fd2<0) {
      fprintf(stderr,"Error opening %llx\n",pe.config);
      exit(1);
   }

   
   void *blargh;
   
   /* important! */
   blargh=mmap(NULL, (1+1)*4096, 
         PROT_READ|PROT_WRITE, MAP_SHARED, fd2, 0);
   
   fcntl(fd2, F_SETFL, O_RDWR|O_NONBLOCK|O_ASYNC);
   fcntl(fd2, F_SETSIG, SIGIO);
   fcntl(fd2,F_SETOWN,getpid());
   
   ioctl(fd1, PERF_EVENT_IOC_RESET, 0);   
   ioctl(fd2, PERF_EVENT_IOC_RESET, 0);   

   ret=ioctl(fd1, PERF_EVENT_IOC_REFRESH,0);
      if (ret<0) {
     fprintf(stderr,"Error with PERF_EVENT_IOC_REFRESH of group leader: "
	     "%d %s\n",errno,strerror(errno));
     exit(1);
   }

   result=busywork(10000000);
   
   ret=ioctl(fd2, PERF_EVENT_IOC_DISABLE,0);
   
   
   printf("Count: %d %lf\n",count,result);

   
   return 0;
}


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: perf: regression with PERF_EVENT_IOC_REFRESH
  2011-05-24  6:20   ` Vince Weaver
@ 2011-05-24 10:30     ` Peter Zijlstra
  2011-05-24 15:04       ` Vince Weaver
  2011-05-28  3:38       ` Vince Weaver
  0 siblings, 2 replies; 25+ messages in thread
From: Peter Zijlstra @ 2011-05-24 10:30 UTC (permalink / raw)
  To: Vince Weaver; +Cc: linux-kernel, fbuihuu, mingo, paulus, acme

On Tue, 2011-05-24 at 02:20 -0400, Vince Weaver wrote:
> On Mon, 23 May 2011, Peter Zijlstra wrote:
> 
> > > before the changeset, you could have a counter group where only one of the 
> > > subevents was sampling.  PERF_EVENT_IOC_REFRESH would correctly enable 
> > > sampling for only that subevent.
> > 
> > But how? it adds to the event_limit of the event you call the ioctl()
> > on, not on any of the group siblings.
> 
> the old behavior did.
> 
> > > With the changeset applied, this fails with EINVALID.  Now you can only 
> > > sample in a group leader.
> > 
> > I'm not quite seeing how group-leaders are relevant here, you can only
> > call this ioctl() on sampling events, regardless of if they're they
> > group leader or a sibling.
> 
> previous behavior was that if you called it on a group leader that wasn't 
> sampling, a sibling event that was sampling would be activated.
> 
> > > Was this an intended change?  It breaks various of our PAPI regression 
> > > tests that worked fine before the change was committed.
> > 
> > I'm not quite seeing what's wrong yet, the change seemed simple enough
> > in that calling that ioctl() on an event that wasn't capable of
> > generating samples doesn't make sense, since it will increase the event
> > limit for the event you call it on.
> 
> attached is some code that will return a sampled count on older kernels
> but gives EINVAL on current kernels.
> 
> The old behavior might have been unintentional, but PAPI unfortunately 
> depends on it (not code I originaly wrote, but code I have to maintain).

Right, so what the code does is create a group of which the leader is
disabled, which effectively has the whole group disabled. It then abuses
REFRESH,0 to enable the leader.

The code should use ENABLE (surprise, surprise!) to enable the leader
and get things going.

This is very clear abuse of the API and I'm not really inclined to fix
it, in fact, I'd be tempted to add an additional test to verify the
argument to REFRESH > 0.

Then again, I do appreciate you're having a problem there, how soon can
you push this trivial fix into all maintained PAPI branches? We could
revert this for one release and then properly shut this abuse down the
next release.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: perf: regression with PERF_EVENT_IOC_REFRESH
  2011-05-24 10:30     ` Peter Zijlstra
@ 2011-05-24 15:04       ` Vince Weaver
  2011-05-24 15:10         ` Peter Zijlstra
  2011-05-24 15:11         ` Peter Zijlstra
  2011-05-28  3:38       ` Vince Weaver
  1 sibling, 2 replies; 25+ messages in thread
From: Vince Weaver @ 2011-05-24 15:04 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel, fbuihuu, mingo, paulus, acme

On Tue, 24 May 2011, Peter Zijlstra wrote:

> > The old behavior might have been unintentional, but PAPI unfortunately 
> > depends on it (not code I originaly wrote, but code I have to maintain).
> 
> Right, so what the code does is create a group of which the leader is
> disabled, which effectively has the whole group disabled. It then abuses
> REFRESH,0 to enable the leader.

yes.  This is currently what PAPI does.

> The code should use ENABLE (surprise, surprise!) to enable the leader
> and get things going.

It was unclear from the documentation that enabling a group leader that 
had siblings with overflow setup would start them triggering
without some sort of REFRESH first.  It does seem to work though.
But then again, so did the original behavior.

> This is very clear abuse of the API and I'm not really inclined to fix
> it, in fact, I'd be tempted to add an additional test to verify the
> argument to REFRESH > 0.

Well, when there's no documentation then people write to the code.
As I said before, I didn't write this code.  I just get to pick up the 
pieces when it breaks.

As an aside, I also wasted 6 hours last night finding out that you don't 
get signaled on overflow if you don't have a ring-buffer mmap()ed, even
if you never read from the buffer and you only are interested in counting
the number of overflows.  I should stop complaining though or you'll
start telling me to fix the documentation.  Which maybe I would do if I 
didn't spend all my time writing code to reproduce problems in the perf 
ABI for bisection purposes.

> Then again, I do appreciate you're having a problem there, how soon can
> you push this trivial fix into all maintained PAPI branches? We could
> revert this for one release and then properly shut this abuse down the
> next release.

well since you apparently aren't going to retroactively revert it for 
2.6.37, 2.6.38, or 2.6.39 then it would be silly to do it just for 2.6.40.

I'll audit the PAPI code to see how widespread it is.

To warn you though, we still have people complain within hours when we 
break support for 2.6.16 kernels, so it's not like the people who use 
PAPI are the kind to run out and install a minor stable release.  They're
going to upgrade their distro or move their binary to a newer machine and 
suddenly start geting EINVAL returns on their previously working code and 
then start noisily complaining.

Vince

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: perf: regression with PERF_EVENT_IOC_REFRESH
  2011-05-24 15:04       ` Vince Weaver
@ 2011-05-24 15:10         ` Peter Zijlstra
  2011-05-24 15:40           ` Ingo Molnar
  2011-05-24 17:53           ` Vince Weaver
  2011-05-24 15:11         ` Peter Zijlstra
  1 sibling, 2 replies; 25+ messages in thread
From: Peter Zijlstra @ 2011-05-24 15:10 UTC (permalink / raw)
  To: Vince Weaver; +Cc: linux-kernel, fbuihuu, mingo, paulus, acme

On Tue, 2011-05-24 at 11:04 -0400, Vince Weaver wrote:
> 2.6.37, 2.6.38, or 2.6.39 then it would be silly to do it just for
> 2.6.40.

Oh, I assumed it was recent and .39/.40 would suffice.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: perf: regression with PERF_EVENT_IOC_REFRESH
  2011-05-24 15:10         ` Peter Zijlstra
@ 2011-05-24 15:40           ` Ingo Molnar
  2011-05-24 20:31             ` Vince Weaver
  2011-05-24 17:53           ` Vince Weaver
  1 sibling, 1 reply; 25+ messages in thread
From: Ingo Molnar @ 2011-05-24 15:40 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Vince Weaver, linux-kernel, fbuihuu, paulus, acme

* Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:

> On Tue, 2011-05-24 at 11:04 -0400, Vince Weaver wrote:
>
> > 2.6.37, 2.6.38, or 2.6.39 then it would be silly to do it just for
> > 2.6.40.

No, this commit was added in v2.6.38 so v2.6.37 should be fine.

> Oh, I assumed it was recent and .39/.40 would suffice.

Btw., how did it happen that the PAPI tests did not get run against upstream 
over the course of about half a year, two full stable kernels released:

 Date:   Mon Mar 14 18:20:32 2011 -0700    Linux 2.6.38
 Date:   Wed May 18 21:06:34 2011 -0700    Linux 2.6.39

?

I'd suggest periodically running the PAPI tests on the perf development tree:

  http://people.redhat.com/mingo/tip.git/README

doing that would have caught this problem 6 months ago.

The upstream policy is that regressions are generally recognized before the 
next kernel gets released: i.e. in the stabilization period after -rc1, the 
roughly two months until the final kernel gets released. That is the window 
when we can still fix regressions relatively cheaply.

Yes, there are exceptions, but if a piece of user-space code did not get tested 
with upstream over months and months then that moves into the 'fix it if we 
can' category - not a regression per se.

So the upstream message is: we can only care about you if you care testing 
upstream.

So if it's easy to fix we can certainly fix this bug and mark it for a -stable 
backport, but this is not a regression that got reported to us in any timely 
manner.

Btw., to get such assumptions tested more frequently than twice a year i'd 
suggest moving these usecases into 'perf test' or so - that it gets run every 
day:

 $ perf test
 1: vmlinux symtab matches kallsyms: FAILED!

 2: detect open syscall event: Ok
 3: detect open syscall event on all cpus: Ok
 4: read samples using the mmap interface: Ok

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: perf: regression with PERF_EVENT_IOC_REFRESH
  2011-05-24 15:40           ` Ingo Molnar
@ 2011-05-24 20:31             ` Vince Weaver
  2011-05-25 10:39               ` Ingo Molnar
  0 siblings, 1 reply; 25+ messages in thread
From: Vince Weaver @ 2011-05-24 20:31 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Peter Zijlstra, linux-kernel, fbuihuu, paulus, acme

On Tue, 24 May 2011, Ingo Molnar wrote:
> 
> Btw., how did it happen that the PAPI tests did not get run against upstream 
> over the course of about half a year, two full stable kernels released:

we run regresion tests nightly.  There was a bug in our 
  "create two events and sample on the second"
test, where it was actualy sampling on the first counter by mistake.
When I fixed the test to do what it claimed to do it found the bug.

PAPI runs on at least 5 different operating systems, 3 different
Linux perf counter implementations, and on kernels dating back to 2.4.
Plus numerous architectures.  While we try to test against recent
Linux perf_events, we just don't have the hardware to be comprehensive.

It doesn't help that our test machines are primarily used for GPGPU work
during the day, so we're limited to what kernels we can have installed
due to driver issues.

> suggest moving these usecases into 'perf test' or so - that it gets run every 
> day:

you can feel free to install PAPI on your test machines and give it a run 
daily too.  It's open source

  setenv CVSROOT :pserver:anonymous@cvs.eecs.utk.edu:/cvs/homes/papi 
  cvs login
  cvs co all

  cd papi
  ./configure
  make
  ./run_tests

There are often false negatives on the tests that can be a pain
to track down.  Welcome to my world.  We gladly accept patches.

Vince

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: perf: regression with PERF_EVENT_IOC_REFRESH
  2011-05-24 20:31             ` Vince Weaver
@ 2011-05-25 10:39               ` Ingo Molnar
  2011-05-25 21:24                 ` Vince Weaver
  0 siblings, 1 reply; 25+ messages in thread
From: Ingo Molnar @ 2011-05-25 10:39 UTC (permalink / raw)
  To: Vince Weaver; +Cc: Peter Zijlstra, linux-kernel, fbuihuu, paulus, acme


* Vince Weaver <vweaver1@eecs.utk.edu> wrote:

> > suggest moving these usecases into 'perf test' or so - that it 
> > gets run every day:
> 
> you can feel free to install PAPI on your test machines and give it 
> a run daily too.  It's open source

Well, i gave you a suggestion of how to prevent such regressions in 
the future, should you be worried about such regressions.

The standing upstream policy is that while we try to do a good effort 
and obviously fix everything we find ourselves or which gets reported 
to us, we cannot test everything so if you want us to fix bugs you 
need to pick one or more of these options:

 - run our code

 - or help us build better tests, either to 'perf test' (or to LTP)

 - or get your users to test recent upstream kernels for you

 - or ignore us and deal with regressions whenever you get hit by them

Your choice really.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: perf: regression with PERF_EVENT_IOC_REFRESH
  2011-05-25 10:39               ` Ingo Molnar
@ 2011-05-25 21:24                 ` Vince Weaver
  0 siblings, 0 replies; 25+ messages in thread
From: Vince Weaver @ 2011-05-25 21:24 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Peter Zijlstra, linux-kernel, fbuihuu, paulus, acme

On Wed, 25 May 2011, Ingo Molnar wrote:
> * Vince Weaver <vweaver1@eecs.utk.edu> wrote:
> 
> Well, i gave you a suggestion of how to prevent such regressions in 
> the future, should you be worried about such regressions.
> 
>  - or help us build better tests, either to 'perf test' (or to LTP)

I do have a perf_event validation test suite that I maintain to help when 
debugging PAPI issues.

   http://web.eecs.utk.edu/~vweaver1/projects/perf-events/validation.html

it's only about 12 tests right now, but feel free to use them if you'd 
like (they're GPL).  I really don't have any interest in merging them into 
the perf tree though.

Vince

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: perf: regression with PERF_EVENT_IOC_REFRESH
  2011-05-24 15:10         ` Peter Zijlstra
  2011-05-24 15:40           ` Ingo Molnar
@ 2011-05-24 17:53           ` Vince Weaver
  1 sibling, 0 replies; 25+ messages in thread
From: Vince Weaver @ 2011-05-24 17:53 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel, fbuihuu, mingo, paulus, acme

On Tue, 24 May 2011, Peter Zijlstra wrote:

> On Tue, 2011-05-24 at 11:04 -0400, Vince Weaver wrote:
> > 2.6.37, 2.6.38, or 2.6.39 then it would be silly to do it just for
> > 2.6.40.
> 
> Oh, I assumed it was recent and .39/.40 would suffice.

It turns out the problem was only introduced in PAPI in November ( I had 
assumed it dated back further).  So maybe not as dire as it seemed at 
first.  

The fix to PAPI does seem to be trivial, though the whole set of 
bad code was introduced to work around a different problem so I need
to verify the fix doesn't break other things.

Thanks,

Vince
vweaver1@eecs.utk.edu

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: perf: regression with PERF_EVENT_IOC_REFRESH
  2011-05-24 15:04       ` Vince Weaver
  2011-05-24 15:10         ` Peter Zijlstra
@ 2011-05-24 15:11         ` Peter Zijlstra
  2011-05-24 15:18           ` Peter Zijlstra
  1 sibling, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2011-05-24 15:11 UTC (permalink / raw)
  To: Vince Weaver; +Cc: linux-kernel, fbuihuu, mingo, paulus, acme

On Tue, 2011-05-24 at 11:04 -0400, Vince Weaver wrote:
> As an aside, I also wasted 6 hours last night finding out that you don't 
> get signaled on overflow if you don't have a ring-buffer mmap()ed, even
> if you never read from the buffer and you only are interested in counting
> the number of overflows. 

That sounds like something we could fix, let me investigate.

>  I should stop complaining though or you'll
> start telling me to fix the documentation.  Which maybe I would do if I 
> didn't spend all my time writing code to reproduce problems in the perf 
> ABI for bisection purposes. 

Both are appreciated.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: perf: regression with PERF_EVENT_IOC_REFRESH
  2011-05-24 15:11         ` Peter Zijlstra
@ 2011-05-24 15:18           ` Peter Zijlstra
  2011-05-24 21:48             ` Vince Weaver
  0 siblings, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2011-05-24 15:18 UTC (permalink / raw)
  To: Vince Weaver; +Cc: linux-kernel, fbuihuu, mingo, paulus, acme

On Tue, 2011-05-24 at 17:11 +0200, Peter Zijlstra wrote:
> On Tue, 2011-05-24 at 11:04 -0400, Vince Weaver wrote:
> > As an aside, I also wasted 6 hours last night finding out that you don't 
> > get signaled on overflow if you don't have a ring-buffer mmap()ed, even
> > if you never read from the buffer and you only are interested in counting
> > the number of overflows. 
> 
> That sounds like something we could fix, let me investigate.

Does the below cure this?

---
 kernel/events/core.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index c09767f..bd1ba5e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5018,9 +5018,12 @@ static int __perf_event_overflow(struct perf_event *event, int nmi,
 		event->pending_kill = POLL_HUP;
 		if (nmi) {
 			event->pending_disable = 1;
+			event->pending_wakeup = 1;
 			irq_work_queue(&event->pending);
-		} else
+		} else {
 			perf_event_disable(event);
+			perf_event_wakeup(event);
+		}
 	}
 
 	if (event->overflow_handler)


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: perf: regression with PERF_EVENT_IOC_REFRESH
  2011-05-24 15:18           ` Peter Zijlstra
@ 2011-05-24 21:48             ` Vince Weaver
  0 siblings, 0 replies; 25+ messages in thread
From: Vince Weaver @ 2011-05-24 21:48 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel, fbuihuu, mingo, paulus, acme

On Tue, 24 May 2011, Peter Zijlstra wrote:

> On Tue, 2011-05-24 at 17:11 +0200, Peter Zijlstra wrote:
> > On Tue, 2011-05-24 at 11:04 -0400, Vince Weaver wrote:
> > > As an aside, I also wasted 6 hours last night finding out that you don't 
> > > get signaled on overflow if you don't have a ring-buffer mmap()ed, even
> > > if you never read from the buffer and you only are interested in counting
> > > the number of overflows. 
> > 
> > That sounds like something we could fix, let me investigate.
> 
> Does the below cure this?
> 

well that was an interesting interaction between debian-unstable and 
linus-git.  Luckily a udev update seemed to help.

Anyway, your patch did not fix the problem.  I still don't get overflow 
signals if the fd doesn't have a ring-buffer mmap()ed to it.

For a test case you can take my previous test case and comment out the 
mmap call.  Since this test case is probably the only code in the world 
trying to do this, maybe it's not that important.

I can test further patches, but my fastest build machine here at home 
takes 3 hours to build a kernel so there will be some latency in the 
response.

Vince

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: perf: regression with PERF_EVENT_IOC_REFRESH
  2011-05-24 10:30     ` Peter Zijlstra
  2011-05-24 15:04       ` Vince Weaver
@ 2011-05-28  3:38       ` Vince Weaver
  2011-05-28 10:22         ` Peter Zijlstra
  1 sibling, 1 reply; 25+ messages in thread
From: Vince Weaver @ 2011-05-28  3:38 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel, mingo, paulus, acme

On Tue, 24 May 2011, Peter Zijlstra wrote:
> 
> This is very clear abuse of the API and I'm not really inclined to fix
> it, in fact, I'd be tempted to add an additional test to verify the
> argument to REFRESH > 0.

on that note (and while trying to document exactly what the ioctls do) it 
seems that a PERF_EVENT_IOC_REFRESH with an argument of anything higher 
than one does not work on kernels 2.6.36 and newer.  The behavior acts
as if 1 was passed, even if you pass in, say, 3.

Is it worth bisecting this, or has this become the new official behavior 
since no one noticed until now?

To reproduce you can grab my tests from here:
   http://web.eecs.utk.edu/~vweaver1/projects/perf-events/validation.html

and run the
    ./validation/simple_overflow_leader
test

Vince

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: perf: regression with PERF_EVENT_IOC_REFRESH
  2011-05-28  3:38       ` Vince Weaver
@ 2011-05-28 10:22         ` Peter Zijlstra
  2011-05-28 13:26           ` perf: definition of a "regression" Vince Weaver
  2011-05-29 16:54           ` perf: regression with PERF_EVENT_IOC_REFRESH Vince Weaver
  0 siblings, 2 replies; 25+ messages in thread
From: Peter Zijlstra @ 2011-05-28 10:22 UTC (permalink / raw)
  To: Vince Weaver; +Cc: linux-kernel, mingo, paulus, acme

On Fri, 2011-05-27 at 23:38 -0400, Vince Weaver wrote:
> on that note (and while trying to document exactly what the ioctls do) it 
> seems that a PERF_EVENT_IOC_REFRESH with an argument of anything higher 
> than one does not work on kernels 2.6.36 and newer.  The behavior acts
> as if 1 was passed, even if you pass in, say, 3.

Urgh, no that should definitely work. Thanks for the test-case, I'll
work on that (probably not until Monday though, but who knows).

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: perf: definition of a "regression"
  2011-05-28 10:22         ` Peter Zijlstra
@ 2011-05-28 13:26           ` Vince Weaver
  2011-06-02  7:45             ` Ingo Molnar
  2011-05-29 16:54           ` perf: regression with PERF_EVENT_IOC_REFRESH Vince Weaver
  1 sibling, 1 reply; 25+ messages in thread
From: Vince Weaver @ 2011-05-28 13:26 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel, mingo, paulus, acme, torvalds

On Sat, 28 May 2011, Peter Zijlstra wrote:

> On Fri, 2011-05-27 at 23:38 -0400, Vince Weaver wrote:
> > on that note (and while trying to document exactly what the ioctls do) it 
> > seems that a PERF_EVENT_IOC_REFRESH with an argument of anything higher 
> > than one does not work on kernels 2.6.36 and newer.  The behavior acts
> > as if 1 was passed, even if you pass in, say, 3.
> 
> Urgh, no that should definitely work. Thanks for the test-case, I'll
> work on that (probably not until Monday though, but who knows).

So wait, the two regressions I found in 2.6.37 are WONTFIX because they 
are too old, even though they break existing userspace code?

And this older regression in 2.6.36 is going to be fixed, even though
perf, PAPI, and libpfm4 don't trigger the buggy functionality at all?

I think it's time to redefine the PERF_EVENT_IOC_REFRESH ioctl to just
refresh once (as that's what it actually does on 2.6.36 - 2.6.39) and
if we need to refresh multiple we should add a new 
PERF_EVENT_IOC_REFRESH_COUNT ioctl.

I know I am being difficult, but the perf-event ABI is a mess to program 
for in a backward compatible fashion.

Vince

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: perf: definition of a "regression"
  2011-05-28 13:26           ` perf: definition of a "regression" Vince Weaver
@ 2011-06-02  7:45             ` Ingo Molnar
  0 siblings, 0 replies; 25+ messages in thread
From: Ingo Molnar @ 2011-06-02  7:45 UTC (permalink / raw)
  To: Vince Weaver; +Cc: Peter Zijlstra, linux-kernel, paulus, acme, torvalds


* Vince Weaver <vweaver1@eecs.utk.edu> wrote:

> On Sat, 28 May 2011, Peter Zijlstra wrote:
> 
> > On Fri, 2011-05-27 at 23:38 -0400, Vince Weaver wrote:
> > > on that note (and while trying to document exactly what the ioctls do) it 
> > > seems that a PERF_EVENT_IOC_REFRESH with an argument of anything higher 
> > > than one does not work on kernels 2.6.36 and newer.  The behavior acts
> > > as if 1 was passed, even if you pass in, say, 3.
> > 
> > Urgh, no that should definitely work. Thanks for the test-case, I'll
> > work on that (probably not until Monday though, but who knows).
> 
> So wait, the two regressions I found in 2.6.37 are WONTFIX because 
> they are too old, even though they break existing userspace code?
> 
> And this older regression in 2.6.36 is going to be fixed, even 
> though perf, PAPI, and libpfm4 don't trigger the buggy 
> functionality at all?

Btw., these considerations are flexible and we can reconsider and 
change the WONTFIX if there's a patch available and doesn't look 
horrible to backport. We can also mark fixes that havent been marked 
-stable originally as -stable later on, etc.

So please don't feel needlessly bitter about past decisions: when 
there's some good technical solution to a problem (or we were plain 
out wrong about a decision) we try hard not to stand in the way.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: perf: regression with PERF_EVENT_IOC_REFRESH
  2011-05-28 10:22         ` Peter Zijlstra
  2011-05-28 13:26           ` perf: definition of a "regression" Vince Weaver
@ 2011-05-29 16:54           ` Vince Weaver
  2011-05-31  1:33             ` perf: [patch] " Vince Weaver
  1 sibling, 1 reply; 25+ messages in thread
From: Vince Weaver @ 2011-05-29 16:54 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel, mingo, paulus, acme

On Sat, 28 May 2011, Peter Zijlstra wrote:

> On Fri, 2011-05-27 at 23:38 -0400, Vince Weaver wrote:
> > on that note (and while trying to document exactly what the ioctls do) it 
> > seems that a PERF_EVENT_IOC_REFRESH with an argument of anything higher 
> > than one does not work on kernels 2.6.36 and newer.  The behavior acts
> > as if 1 was passed, even if you pass in, say, 3.
> 
> Urgh, no that should definitely work. Thanks for the test-case, I'll
> work on that (probably not until Monday though, but who knows).
> 

after a painfully long bisection, it turns out that this problem was in
theory introduced by the following commit:

  d57e34fdd60be7ffd0b1d86bfa1a553df86b7172

  perf: Simplify the ring-buffer logic: make perf_buffer_alloc() do everything needed

I'll see if I can come up with a patch, but it's a bit non-obvious why
this commit is affecting the REFRESH value at all.

Vince
vweaver1@eecs.utk.edu

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: perf: [patch] regression with PERF_EVENT_IOC_REFRESH
  2011-05-29 16:54           ` perf: regression with PERF_EVENT_IOC_REFRESH Vince Weaver
@ 2011-05-31  1:33             ` Vince Weaver
  2011-05-31  7:17               ` Peter Zijlstra
  2011-05-31  7:23               ` Peter Zijlstra
  0 siblings, 2 replies; 25+ messages in thread
From: Vince Weaver @ 2011-05-31  1:33 UTC (permalink / raw)
  To: Vince Weaver; +Cc: Peter Zijlstra, linux-kernel, mingo, paulus, acme

On Sun, 29 May 2011, Vince Weaver wrote:

> On Sat, 28 May 2011, Peter Zijlstra wrote:
> 
> > On Fri, 2011-05-27 at 23:38 -0400, Vince Weaver wrote:
> > > on that note (and while trying to document exactly what the ioctls do) it 
> > > seems that a PERF_EVENT_IOC_REFRESH with an argument of anything higher 
> > > than one does not work on kernels 2.6.36 and newer.  The behavior acts
> > > as if 1 was passed, even if you pass in, say, 3.
> > 
> > Urgh, no that should definitely work. Thanks for the test-case, I'll
> > work on that (probably not until Monday though, but who knows).
> > 
> 
> after a painfully long bisection, it turns out that this problem was in
> theory introduced by the following commit:
> 
>   d57e34fdd60be7ffd0b1d86bfa1a553df86b7172
> 
>   perf: Simplify the ring-buffer logic: make perf_buffer_alloc() do everything needed
> 
> I'll see if I can come up with a patch, but it's a bit non-obvious why
> this commit is affecting the REFRESH value at all.

the problem was the mentioned commit tried to optimize the use of 
watermark and wakeup_watermark without taking into account that 
wakeup_watermark is a union with wakeup_events.

The patch below *should* fix it, but something unrelated has broken 
overflow support between 2.6.39 and 3.0-rc1 which I haven't had time to 
investigate.  The overflow count is suddenly about 10x what it should be 
though.  So the below is semi-untested and I possibly need to do another 
bisect.  *sigh*

Vince

Signed-off-by: Vince Weaver <vweaver1@cl320.eecs.utk.edu>


diff --git a/kernel/events/core.c b/kernel/events/core.c
index d863b3c..e3ff283 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3403,12 +3403,14 @@ unlock:
 static unsigned long perf_data_size(struct perf_buffer *buffer);
 
 static void
-perf_buffer_init(struct perf_buffer *buffer, long watermark, int flags)
+perf_buffer_init(struct perf_buffer *buffer, 
+		 long watermark, 
+		 long wakeup_watermark, int flags)
 {
 	long max_size = perf_data_size(buffer);
 
 	if (watermark)
-		buffer->watermark = min(max_size, watermark);
+		buffer->watermark = min(max_size, wakeup_watermark);
 
 	if (!buffer->watermark)
 		buffer->watermark = max_size / 2;
@@ -3451,7 +3453,8 @@ static void *perf_mmap_alloc_page(int cpu)
 }
 
 static struct perf_buffer *
-perf_buffer_alloc(int nr_pages, long watermark, int cpu, int flags)
+perf_buffer_alloc(int nr_pages, long watermark, 
+		  long wakeup_watermark, int cpu, int flags)
 {
 	struct perf_buffer *buffer;
 	unsigned long size;
@@ -3476,7 +3479,7 @@ perf_buffer_alloc(int nr_pages, long watermark, int cpu, int flags)
 
 	buffer->nr_pages = nr_pages;
 
-	perf_buffer_init(buffer, watermark, flags);
+	perf_buffer_init(buffer, watermark, wakeup_watermark, flags);
 
 	return buffer;
 
@@ -3568,7 +3571,8 @@ static void perf_buffer_free(struct perf_buffer *buffer)
 }
 
 static struct perf_buffer *
-perf_buffer_alloc(int nr_pages, long watermark, int cpu, int flags)
+perf_buffer_alloc(int nr_pages, long watermark, 
+		  long wakeup_watermark, int cpu, int flags)
 {
 	struct perf_buffer *buffer;
 	unsigned long size;
@@ -3592,7 +3596,7 @@ perf_buffer_alloc(int nr_pages, long watermark, int cpu, int flags)
 	buffer->page_order = ilog2(nr_pages);
 	buffer->nr_pages = 1;
 
-	perf_buffer_init(buffer, watermark, flags);
+	perf_buffer_init(buffer, watermark, wakeup_watermark, flags);
 
 	return buffer;
 
@@ -3787,7 +3791,9 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma)
 	if (vma->vm_flags & VM_WRITE)
 		flags |= PERF_BUFFER_WRITABLE;
 
-	buffer = perf_buffer_alloc(nr_pages, event->attr.wakeup_watermark,
+	buffer = perf_buffer_alloc(nr_pages, 
+				   event->attr.watermark,
+				   event->attr.wakeup_watermark,
 				   event->cpu, flags);
 	if (!buffer) {
 		ret = -ENOMEM;

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: perf: [patch] regression with PERF_EVENT_IOC_REFRESH
  2011-05-31  1:33             ` perf: [patch] " Vince Weaver
@ 2011-05-31  7:17               ` Peter Zijlstra
  2011-05-31  7:23               ` Peter Zijlstra
  1 sibling, 0 replies; 25+ messages in thread
From: Peter Zijlstra @ 2011-05-31  7:17 UTC (permalink / raw)
  To: Vince Weaver; +Cc: Vince Weaver, linux-kernel, mingo, paulus, acme

On Mon, 2011-05-30 at 21:33 -0400, Vince Weaver wrote:
> On Sun, 29 May 2011, Vince Weaver wrote:
> 
> > On Sat, 28 May 2011, Peter Zijlstra wrote:
> > 
> > > On Fri, 2011-05-27 at 23:38 -0400, Vince Weaver wrote:
> > > > on that note (and while trying to document exactly what the ioctls do) it 
> > > > seems that a PERF_EVENT_IOC_REFRESH with an argument of anything higher 
> > > > than one does not work on kernels 2.6.36 and newer.  The behavior acts
> > > > as if 1 was passed, even if you pass in, say, 3.
> > > 
> > > Urgh, no that should definitely work. Thanks for the test-case, I'll
> > > work on that (probably not until Monday though, but who knows).
> > > 
> > 
> > after a painfully long bisection, it turns out that this problem was in
> > theory introduced by the following commit:
> > 
> >   d57e34fdd60be7ffd0b1d86bfa1a553df86b7172
> > 
> >   perf: Simplify the ring-buffer logic: make perf_buffer_alloc() do everything needed
> > 
> > I'll see if I can come up with a patch, but it's a bit non-obvious why
> > this commit is affecting the REFRESH value at all.
> 
> the problem was the mentioned commit tried to optimize the use of 
> watermark and wakeup_watermark without taking into account that 
> wakeup_watermark is a union with wakeup_events.
> 
> The patch below *should* fix it,

Awesome thanks!

>  but something unrelated has broken 
> overflow support between 2.6.39 and 3.0-rc1 which I haven't had time to 
> investigate.  The overflow count is suddenly about 10x what it should be 
> though.  So the below is semi-untested and I possibly need to do another 
> bisect.  *sigh*

Yeah, I noticed, I was hunting that as well..

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: perf: [patch] regression with PERF_EVENT_IOC_REFRESH
  2011-05-31  1:33             ` perf: [patch] " Vince Weaver
  2011-05-31  7:17               ` Peter Zijlstra
@ 2011-05-31  7:23               ` Peter Zijlstra
  2011-05-31 13:49                 ` Vince Weaver
  1 sibling, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2011-05-31  7:23 UTC (permalink / raw)
  To: Vince Weaver; +Cc: Vince Weaver, linux-kernel, mingo, paulus, acme

On Mon, 2011-05-30 at 21:33 -0400, Vince Weaver wrote:
> the problem was the mentioned commit tried to optimize the use of 
> watermark and wakeup_watermark without taking into account that 
> wakeup_watermark is a union with wakeup_events. 

Note that wake_events isn't related to IOC_REFRESH, wake_events is how
much events to buffer in the mmap-buffer before issuing a wakeup.

IOC_REFRESH increments event_limit, which is how many events to run
before disabling yourself.

What I gather is that due to that SIGIO bug (fixed by f506b3dc0e), you
had to have both an mmap and a wakeup in order for that signal to
arrive.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: perf: [patch] regression with PERF_EVENT_IOC_REFRESH
  2011-05-31  7:23               ` Peter Zijlstra
@ 2011-05-31 13:49                 ` Vince Weaver
  2011-05-31 15:52                   ` Peter Zijlstra
  0 siblings, 1 reply; 25+ messages in thread
From: Vince Weaver @ 2011-05-31 13:49 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Vince Weaver, linux-kernel, mingo, paulus, acme

On Tue, 31 May 2011, Peter Zijlstra wrote:

> On Mon, 2011-05-30 at 21:33 -0400, Vince Weaver wrote:
> > the problem was the mentioned commit tried to optimize the use of 
> > watermark and wakeup_watermark without taking into account that 
> > wakeup_watermark is a union with wakeup_events. 
> 
> Note that wake_events isn't related to IOC_REFRESH, wake_events is how
> much events to buffer in the mmap-buffer before issuing a wakeup.
> 
> IOC_REFRESH increments event_limit, which is how many events to run
> before disabling yourself.
> 
> What I gather is that due to that SIGIO bug (fixed by f506b3dc0e), you
> had to have both an mmap and a wakeup in order for that signal to
> arrive.

yes, but due to a bug in the mentioned changeset, the buffer watermark 
value was being set to a low value even if *watermark* was 0.  So if you 
were using IOC_REFRESH to set the *wakeup_events* value, it was also 
setting the *wakeup_watermark* value (it's a union) and the buffer setup 
was then unconditionally setting the buffer watermark to the value of the 
supposedly unrelated *wakeup_watermark*.  Normally the wakeup watermark 
would default to something like 2048, but if you were trying to set the 
wakeup_events value to something like 3 then wakeup_watermark would be set 
to that too, causing a lot more overflow events.

I verified all the above painfully using a lot of printks.

I agree this does seem to be a combination of bugs, as even with a 
properlyu set value on affected kernels you'd get spurious watermark 
overflow events if you weren't consuming the ring buffer.

In any case, I can provide a cleaner patch than the one before that isn't 
as intrusive.

I'm also bisecting the other problem I mentioned, the one where overflows 
are 10x too large on 3.0-rc1.  I'm at work with a Nehalem machine so the 
bisect should go faster than the bisect I had to do on an atom machine 
this weekend.  

A power outage over the weekend has taken part of the 
network down here though so my e-mail access is a bit limited, so I 
apologize if I've been missing comments sent to my other e-mail address.

Vince

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: perf: [patch] regression with PERF_EVENT_IOC_REFRESH
  2011-05-31 13:49                 ` Vince Weaver
@ 2011-05-31 15:52                   ` Peter Zijlstra
  2011-05-31 16:39                     ` Vince Weaver
  0 siblings, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2011-05-31 15:52 UTC (permalink / raw)
  To: Vince Weaver; +Cc: Vince Weaver, linux-kernel, mingo, paulus, acme

On Tue, 2011-05-31 at 09:49 -0400, Vince Weaver wrote:
> On Tue, 31 May 2011, Peter Zijlstra wrote:
> 
> > On Mon, 2011-05-30 at 21:33 -0400, Vince Weaver wrote:
> > > the problem was the mentioned commit tried to optimize the use of 
> > > watermark and wakeup_watermark without taking into account that 
> > > wakeup_watermark is a union with wakeup_events. 
> > 
> > Note that wake_events isn't related to IOC_REFRESH, wake_events is how
> > much events to buffer in the mmap-buffer before issuing a wakeup.
> > 
> > IOC_REFRESH increments event_limit, which is how many events to run
> > before disabling yourself.
> > 
> > What I gather is that due to that SIGIO bug (fixed by f506b3dc0e), you
> > had to have both an mmap and a wakeup in order for that signal to
> > arrive.
> 
> yes, but due to a bug in the mentioned changeset, the buffer watermark 
> value was being set to a low value even if *watermark* was 0.  So if you 
> were using IOC_REFRESH to set the *wakeup_events* value,

IOC_REFRESH sets event->event_limit, not wakeup_events.

>  it was also 
> setting the *wakeup_watermark* value (it's a union) and the buffer setup 
> was then unconditionally setting the buffer watermark to the value of the 
> supposedly unrelated *wakeup_watermark*.  Normally the wakeup watermark 
> would default to something like 2048, but if you were trying to set the 
> wakeup_events value to something like 3 then wakeup_watermark would be set 
> to that too, causing a lot more overflow events.

poll() wakeups, which were inadvertly linked to SIGIOs

> I verified all the above painfully using a lot of printks.

I prefer to use trace_printk() and /debug/tracing/, that doesn't slow
stuff down as much.

> I agree this does seem to be a combination of bugs, as even with a 
> properlyu set value on affected kernels you'd get spurious watermark 
> overflow events if you weren't consuming the ring buffer.

*nod*

> In any case, I can provide a cleaner patch than the one before that isn't 
> as intrusive.

Appreciated.

> I'm also bisecting the other problem I mentioned, the one where overflows 
> are 10x too large on 3.0-rc1.  I'm at work with a Nehalem machine so the 
> bisect should go faster than the bisect I had to do on an atom machine 
> this weekend.  

It wouldn't be the SIGIO fix would it?, with that every overflow
generates a SIGIO, not only the poll() wakeups. And ouch at bisecting
(or even building a kernel) on an Atom, those things are horridly slow.

> A power outage over the weekend has taken part of the 
> network down here though so my e-mail access is a bit limited, so I 
> apologize if I've been missing comments sent to my other e-mail address.

I'm afraid not, I've been mostly tied up with fixing some scheduler
regressions.

Also, it looks like I just broke stuff even worse in -tip, am bisecting
that now.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: perf: [patch] regression with PERF_EVENT_IOC_REFRESH
  2011-05-31 15:52                   ` Peter Zijlstra
@ 2011-05-31 16:39                     ` Vince Weaver
  0 siblings, 0 replies; 25+ messages in thread
From: Vince Weaver @ 2011-05-31 16:39 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Vince Weaver, linux-kernel, mingo, paulus, acme

On Tue, 31 May 2011, Peter Zijlstra wrote:
> 
> IOC_REFRESH sets event->event_limit, not wakeup_events.

ahh, yes. 

So it's a userspace "bug".

The test code called a "IOC_REFRESH, 3" whenever it got signalled.
It didn't distinguish between whether it was a plain overflow or if it was 
a ring-buffer overflow (can it distinguish?).

Thus the watermark bug was confusing the user-space code into refreshing 
when it was not necessary.

> > I'm also bisecting the other problem I mentioned, the one where overflows 
> > are 10x too large on 3.0-rc1.  I'm at work with a Nehalem machine so the 
> > bisect should go faster than the bisect I had to do on an atom machine 
> > this weekend.  
> 
> It wouldn't be the SIGIO fix would it?, with that every overflow
> generates a SIGIO, not only the poll() wakeups. And ouch at bisecting
> (or even building a kernel) on an Atom, those things are horridly slow.

Oh, it could be the SIGIO fix.  I hadn't realized that got merged already.

And yes, bisect on atom is painful, but my alternatives were a 1.7GHz P4 
(with small disk, so would have been over NFS), a 600MHz G3 iBook, or an 
armv6 machine.  So atom it was.

Vince
vweaver1@eecs.utk.edu

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2011-06-02  7:45 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-05-23 20:04 perf: regression with PERF_EVENT_IOC_REFRESH Vince Weaver
2011-05-23 20:22 ` Peter Zijlstra
2011-05-24  6:20   ` Vince Weaver
2011-05-24 10:30     ` Peter Zijlstra
2011-05-24 15:04       ` Vince Weaver
2011-05-24 15:10         ` Peter Zijlstra
2011-05-24 15:40           ` Ingo Molnar
2011-05-24 20:31             ` Vince Weaver
2011-05-25 10:39               ` Ingo Molnar
2011-05-25 21:24                 ` Vince Weaver
2011-05-24 17:53           ` Vince Weaver
2011-05-24 15:11         ` Peter Zijlstra
2011-05-24 15:18           ` Peter Zijlstra
2011-05-24 21:48             ` Vince Weaver
2011-05-28  3:38       ` Vince Weaver
2011-05-28 10:22         ` Peter Zijlstra
2011-05-28 13:26           ` perf: definition of a "regression" Vince Weaver
2011-06-02  7:45             ` Ingo Molnar
2011-05-29 16:54           ` perf: regression with PERF_EVENT_IOC_REFRESH Vince Weaver
2011-05-31  1:33             ` perf: [patch] " Vince Weaver
2011-05-31  7:17               ` Peter Zijlstra
2011-05-31  7:23               ` Peter Zijlstra
2011-05-31 13:49                 ` Vince Weaver
2011-05-31 15:52                   ` Peter Zijlstra
2011-05-31 16:39                     ` Vince Weaver

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox