* perf script: rwtop: SIGALRM and pipe read race
@ 2012-09-14 15:39 Andrew Jones
2012-09-14 16:05 ` David Ahern
0 siblings, 1 reply; 11+ messages in thread
From: Andrew Jones @ 2012-09-14 15:39 UTC (permalink / raw)
To: linux-kernel; +Cc: a.p.zijlstra, paulus, mingo, acme, tzanussi
I recently tried 'perf script rwtop', and it immediately failed with
'failed to read event header'. Running it through strace I found that the
when rwtop.pl is reading from the pipe, and gets one of it's alarms, that
the ERESTARTSYS seems to confuse it - causing it to fail. It also appears
that the problem only happens early in execution, or not at all. If I get
lucky and don't hit the problem right away, then rwtop will run fine as
long as I want, without any ERESTARTSYS's in its trace. I also found that
I can avoid hitting the problem by throwing a 'pv -q' in front of the perf
command in tools/perf/scripts/perl/bin/rwtop-report. Which I guess slows
things down in the reader enough to always avoid the race.
Sorry I don't have a solution (patch). I'll look at it more as time
permits, but I thought I'd get it reported for starters though.
Drew
Partial strace below
16774 read(0, <unfinished ...>
16772 <... getdents resumed> /* 0 entries */, 32768) = 0
16774 <... read resumed>
"\206A\0\0\206A\0\0perf\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 48) =
48
16772 close(70 <unfinished ...>
16774 read(0, <unfinished ...>
16772 <... close resumed> ) = 0
16772 open("/proc/16774/maps", O_RDONLY) = 70
16772 fstat(70, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
16772 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
-1, 0) = 0x7f76c6b5b000
16772 read(70, "00400000-00582000 r-xp 00000000 "..., 1024) = 1024
16774 <... read resumed> 0x28598e0, 8) = ? ERESTARTSYS (To be restarted
if SA_RESTART is set)
16772 write(1, "\1\0\0\0\2\0p\0\206A\0\0\206A\0\0\0\0@\0\0\0\0\0\0
\30\0\0\0\0\0"..., 112 <unfinished ...>
16774 --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
16772 <... write resumed> ) = 112
16774 rt_sigreturn( <unfinished ...>
16772 write(1,
"\1\0\0\0\2\0`\0\206A\0\0\206A\0\0\0\0\240\0072\0\0\0\0\0\2\0\0\0\0\0"...,
96 <unfinished ...>
16774 <... rt_sigreturn resumed> ) = -1 EINTR (Interrupted system
call)
16772 <... write resumed> ) = 96
16774 write(2, "failed to read event header\n", 28 <unfinished ...>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: perf script: rwtop: SIGALRM and pipe read race
2012-09-14 15:39 perf script: rwtop: SIGALRM and pipe read race Andrew Jones
@ 2012-09-14 16:05 ` David Ahern
2012-09-14 18:10 ` Andrew Jones
0 siblings, 1 reply; 11+ messages in thread
From: David Ahern @ 2012-09-14 16:05 UTC (permalink / raw)
To: Andrew Jones; +Cc: linux-kernel, a.p.zijlstra, paulus, mingo, acme, tzanussi
On 9/14/12 9:39 AM, Andrew Jones wrote:
>
> I recently tried 'perf script rwtop', and it immediately failed with
> 'failed to read event header'. Running it through strace I found that the
> when rwtop.pl is reading from the pipe, and gets one of it's alarms, that
> the ERESTARTSYS seems to confuse it - causing it to fail. It also appears
> that the problem only happens early in execution, or not at all. If I get
> lucky and don't hit the problem right away, then rwtop will run fine as
> long as I want, without any ERESTARTSYS's in its trace. I also found that
> I can avoid hitting the problem by throwing a 'pv -q' in front of the perf
> command in tools/perf/scripts/perl/bin/rwtop-report. Which I guess slows
> things down in the reader enough to always avoid the race.
>
> Sorry I don't have a solution (patch). I'll look at it more as time
> permits, but I thought I'd get it reported for starters though.
This fixes the run-time problem:
diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index 1b8775c..a4371ae 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -142,6 +142,9 @@ int readn(int fd, void *buf, size_t n)
while (n) {
int ret = read(fd, buf, n);
+ if ((ret < 0) && (errno == EINTR))
+ continue;
+
if (ret <= 0)
return ret;
The only problem you will find with rwtop is that bytes_read will be
really whacky. I traced it to:
if ($ret > 0) {
printf("comm %s bytes_read %d\n", $common_comm, $ret);
$reads{$common_pid}{bytes_read} += $ret;
Somehow the $ret > 0 is passing when in fact it is negative. I do not
know much about perl to fix it.
David
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: perf script: rwtop: SIGALRM and pipe read race
2012-09-14 16:05 ` David Ahern
@ 2012-09-14 18:10 ` Andrew Jones
2012-09-17 14:55 ` David Ahern
0 siblings, 1 reply; 11+ messages in thread
From: Andrew Jones @ 2012-09-14 18:10 UTC (permalink / raw)
To: David Ahern; +Cc: linux-kernel, a.p.zijlstra, paulus, mingo, acme, tzanussi
On Fri, Sep 14, 2012 at 10:05:03AM -0600, David Ahern wrote:
> On 9/14/12 9:39 AM, Andrew Jones wrote:
> >
> >I recently tried 'perf script rwtop', and it immediately failed with
> >'failed to read event header'. Running it through strace I found that the
> >when rwtop.pl is reading from the pipe, and gets one of it's alarms, that
> >the ERESTARTSYS seems to confuse it - causing it to fail. It also appears
> >that the problem only happens early in execution, or not at all. If I get
> >lucky and don't hit the problem right away, then rwtop will run fine as
> >long as I want, without any ERESTARTSYS's in its trace. I also found that
> >I can avoid hitting the problem by throwing a 'pv -q' in front of the perf
> >command in tools/perf/scripts/perl/bin/rwtop-report. Which I guess slows
> >things down in the reader enough to always avoid the race.
> >
> >Sorry I don't have a solution (patch). I'll look at it more as time
> >permits, but I thought I'd get it reported for starters though.
>
>
> This fixes the run-time problem:
>
> diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
> index 1b8775c..a4371ae 100644
> --- a/tools/perf/util/util.c
> +++ b/tools/perf/util/util.c
> @@ -142,6 +142,9 @@ int readn(int fd, void *buf, size_t n)
> while (n) {
> int ret = read(fd, buf, n);
>
> + if ((ret < 0) && (errno == EINTR))
> + continue;
> +
> if (ret <= 0)
> return ret;
>
>
>
> The only problem you will find with rwtop is that bytes_read will be
> really whacky. I traced it to:
>
> if ($ret > 0) {
> printf("comm %s bytes_read %d\n", $common_comm, $ret);
> $reads{$common_pid}{bytes_read} += $ret;
>
> Somehow the $ret > 0 is passing when in fact it is negative. I do
> not know much about perl to fix it.
>
This actually appears to be an issue with how perl sighandlers are
supposed to work.
http://perldoc.perl.org/perlipc.html#Restartable-system-calls
I tried the below patch though, and while it gets me past the read failure
it still doesn't solve the problem. With it the script stops processing
events after the first one.
Drew
diff --git a/tools/perf/scripts/perl/rwtop.pl
b/tools/perf/scripts/perl/rwtop.pl
index 4bb3ecd..8b20787 100644
--- a/tools/perf/scripts/perl/rwtop.pl
+++ b/tools/perf/scripts/perl/rwtop.pl
@@ -17,6 +17,7 @@ use lib
"$ENV{'PERF_EXEC_PATH'}/scripts/perl/Perf-Trace-Util/l
use lib "./Perf-Trace-Util/lib";
use Perf::Trace::Core;
use Perf::Trace::Util;
+use POSIX qw/SIGALRM SA_RESTART/;
my $default_interval = 3;
my $nlines = 20;
@@ -90,7 +91,10 @@ sub syscalls::sys_enter_write
sub trace_begin
{
- $SIG{ALRM} = \&set_print_pending;
+ my $sa = POSIX::SigAction->new(\&set_print_pending);
+ $sa->flags(SA_RESTART);
+ $sa->safe(1);
+ POSIX::sigaction(SIGALRM, $sa) or die "Can't set SIGALRM handler:
$!\n";
alarm 1;
}
Drew
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: perf script: rwtop: SIGALRM and pipe read race
2012-09-14 18:10 ` Andrew Jones
@ 2012-09-17 14:55 ` David Ahern
2012-09-17 15:16 ` David Ahern
2012-09-18 9:05 ` Andrew Jones
0 siblings, 2 replies; 11+ messages in thread
From: David Ahern @ 2012-09-17 14:55 UTC (permalink / raw)
To: Andrew Jones, acme; +Cc: linux-kernel, a.p.zijlstra, paulus, mingo, tzanussi
On 9/14/12 12:10 PM, Andrew Jones wrote:
> On Fri, Sep 14, 2012 at 10:05:03AM -0600, David Ahern wrote:
>> On 9/14/12 9:39 AM, Andrew Jones wrote:
>>>
>>> I recently tried 'perf script rwtop', and it immediately failed with
>>> 'failed to read event header'. Running it through strace I found that the
>>> when rwtop.pl is reading from the pipe, and gets one of it's alarms, that
>>> the ERESTARTSYS seems to confuse it - causing it to fail. It also appears
>>> that the problem only happens early in execution, or not at all. If I get
>>> lucky and don't hit the problem right away, then rwtop will run fine as
>>> long as I want, without any ERESTARTSYS's in its trace. I also found that
>>> I can avoid hitting the problem by throwing a 'pv -q' in front of the perf
>>> command in tools/perf/scripts/perl/bin/rwtop-report. Which I guess slows
>>> things down in the reader enough to always avoid the race.
>>>
>>> Sorry I don't have a solution (patch). I'll look at it more as time
>>> permits, but I thought I'd get it reported for starters though.
>>
>>
>> This fixes the run-time problem:
>>
>> diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
>> index 1b8775c..a4371ae 100644
>> --- a/tools/perf/util/util.c
>> +++ b/tools/perf/util/util.c
>> @@ -142,6 +142,9 @@ int readn(int fd, void *buf, size_t n)
>> while (n) {
>> int ret = read(fd, buf, n);
>>
>> + if ((ret < 0) && (errno == EINTR))
>> + continue;
>> +
>> if (ret <= 0)
>> return ret;
>>
>>
>>
>> The only problem you will find with rwtop is that bytes_read will be
>> really whacky. I traced it to:
>>
>> if ($ret > 0) {
>> printf("comm %s bytes_read %d\n", $common_comm, $ret);
>> $reads{$common_pid}{bytes_read} += $ret;
>>
>> Somehow the $ret > 0 is passing when in fact it is negative. I do
>> not know much about perl to fix it.
>>
>
> This actually appears to be an issue with how perl sighandlers are
> supposed to work.
>
> http://perldoc.perl.org/perlipc.html#Restartable-system-calls
>
> I tried the below patch though, and while it gets me past the read failure
> it still doesn't solve the problem. With it the script stops processing
> events after the first one.
>
> Drew
>
> diff --git a/tools/perf/scripts/perl/rwtop.pl
> b/tools/perf/scripts/perl/rwtop.pl
> index 4bb3ecd..8b20787 100644
> --- a/tools/perf/scripts/perl/rwtop.pl
> +++ b/tools/perf/scripts/perl/rwtop.pl
> @@ -17,6 +17,7 @@ use lib
> "$ENV{'PERF_EXEC_PATH'}/scripts/perl/Perf-Trace-Util/l
> use lib "./Perf-Trace-Util/lib";
> use Perf::Trace::Core;
> use Perf::Trace::Util;
> +use POSIX qw/SIGALRM SA_RESTART/;
>
> my $default_interval = 3;
> my $nlines = 20;
> @@ -90,7 +91,10 @@ sub syscalls::sys_enter_write
>
> sub trace_begin
> {
> - $SIG{ALRM} = \&set_print_pending;
> + my $sa = POSIX::SigAction->new(\&set_print_pending);
> + $sa->flags(SA_RESTART);
> + $sa->safe(1);
> + POSIX::sigaction(SIGALRM, $sa) or die "Can't set SIGALRM handler:
> $!\n";
> alarm 1;
> }
>
> Drew
>
Not sure why you want to change the signal handling. The display routine
appears to be working.
Arnaldo had pinged me about rwtop a couple of weeks ago -- thinking one
of my patches broke it. After looking into it I see 3 problems with the
rwtop scripts:
1. readn can fail with EINTR and that needs to be handled. The earlier
patch fixes that.
2. the rwtop.pl script is not handling negative return values ($ret < 0)
properly -- the '$ret > 0' check is succeeding even though $ret is
negative (e.g., -EAGAIN) leading to astronomical read values
3. record by default uses the watermark so it does not push records to
the report script until the watermark is reached. This has 2 side effects:
a. for systems doing few reads and writes it can take a while for enough
records to be generated causing an apparent hang when the command is
launched. disabling delay (-D argument to perf-record in rwtop-record)
handles that, but that's not the right answer for busy systems.
b. the display will not be accurate (up to date) given that there are
buffered records that have not been processed.
Really we need a periodic mode for record that would push all records
every N timer interval. This way everything is pushed to the processing
script in a timely manner.
Arnaldo: I believe 3 explains the hang you saw.
David
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: perf script: rwtop: SIGALRM and pipe read race
2012-09-17 14:55 ` David Ahern
@ 2012-09-17 15:16 ` David Ahern
2012-09-17 16:02 ` Arnaldo Carvalho de Melo
2012-09-18 9:05 ` Andrew Jones
1 sibling, 1 reply; 11+ messages in thread
From: David Ahern @ 2012-09-17 15:16 UTC (permalink / raw)
To: Andrew Jones, acme; +Cc: linux-kernel, a.p.zijlstra, paulus, mingo, tzanussi
On 9/17/12 8:55 AM, David Ahern wrote:
> 2. the rwtop.pl script is not handling negative return values ($ret < 0)
> properly -- the '$ret > 0' check is succeeding even though $ret is
> negative (e.g., -EAGAIN) leading to astronomical read values
I think perl is treating $ret as an unsigned integer.
Again, I know little about perl, but this change to
./scripts/perl/rwtop.pl makes it behave properly:
my $n = sprintf("%d", $ret);
if ($n > 0) {
...
David
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: perf script: rwtop: SIGALRM and pipe read race
2012-09-17 15:16 ` David Ahern
@ 2012-09-17 16:02 ` Arnaldo Carvalho de Melo
2012-09-17 16:32 ` David Ahern
0 siblings, 1 reply; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-09-17 16:02 UTC (permalink / raw)
To: David Ahern
Cc: Andrew Jones, linux-kernel, a.p.zijlstra, paulus, mingo, tzanussi
Em Mon, Sep 17, 2012 at 09:16:19AM -0600, David Ahern escreveu:
> On 9/17/12 8:55 AM, David Ahern wrote:
> >2. the rwtop.pl script is not handling negative return values ($ret < 0)
> >properly -- the '$ret > 0' check is succeeding even though $ret is
> >negative (e.g., -EAGAIN) leading to astronomical read values
>
> I think perl is treating $ret as an unsigned integer.
>
> Again, I know little about perl, but this change to
> ./scripts/perl/rwtop.pl makes it behave properly:
>
> my $n = sprintf("%d", $ret);
>
> if ($n > 0) {
> ...
I think what you figured out makes sense, its the best we have and you
found it to get it back working, could you please send the two patches
properly signed-off, etc?
If Andrew could please test the patchset so that I could add his
Tested-by, that would be great,
Thanks,
- Arnaldo
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: perf script: rwtop: SIGALRM and pipe read race
2012-09-17 16:02 ` Arnaldo Carvalho de Melo
@ 2012-09-17 16:32 ` David Ahern
2012-09-17 17:12 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 11+ messages in thread
From: David Ahern @ 2012-09-17 16:32 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Andrew Jones, linux-kernel, a.p.zijlstra, paulus, mingo, tzanussi
On 9/17/12 10:02 AM, Arnaldo Carvalho de Melo wrote:
> Em Mon, Sep 17, 2012 at 09:16:19AM -0600, David Ahern escreveu:
>> On 9/17/12 8:55 AM, David Ahern wrote:
>>> 2. the rwtop.pl script is not handling negative return values ($ret < 0)
>>> properly -- the '$ret > 0' check is succeeding even though $ret is
>>> negative (e.g., -EAGAIN) leading to astronomical read values
>>
>> I think perl is treating $ret as an unsigned integer.
>>
>> Again, I know little about perl, but this change to
>> ./scripts/perl/rwtop.pl makes it behave properly:
>>
>> my $n = sprintf("%d", $ret);
>>
>> if ($n > 0) {
>> ...
>
> I think what you figured out makes sense, its the best we have and you
> found it to get it back working, could you please send the two patches
> properly signed-off, etc?
I'll do that for the readn.
The above change seems like a workaround - not a proper fix. Why is perl
treating it like an unsigned int? The trace format for the read syscall
shows long. How is the binding done for perl? I can't make heads or
tails of it scanning the files under perl. If we are forced to do a hack
like the above is there a better way? I tried (int) $ret and perl did
not like that. Any other (better) options?
David
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: perf script: rwtop: SIGALRM and pipe read race
2012-09-17 16:32 ` David Ahern
@ 2012-09-17 17:12 ` Arnaldo Carvalho de Melo
2012-09-17 20:10 ` David Ahern
0 siblings, 1 reply; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-09-17 17:12 UTC (permalink / raw)
To: David Ahern
Cc: Andrew Jones, linux-kernel, a.p.zijlstra, paulus, mingo, tzanussi
Em Mon, Sep 17, 2012 at 10:32:36AM -0600, David Ahern escreveu:
> On 9/17/12 10:02 AM, Arnaldo Carvalho de Melo wrote:
> >Em Mon, Sep 17, 2012 at 09:16:19AM -0600, David Ahern escreveu:
> >>On 9/17/12 8:55 AM, David Ahern wrote:
> >>>2. the rwtop.pl script is not handling negative return values ($ret < 0)
> >>>properly -- the '$ret > 0' check is succeeding even though $ret is
> >>>negative (e.g., -EAGAIN) leading to astronomical read values
> >>
> >>I think perl is treating $ret as an unsigned integer.
> >>
> >>Again, I know little about perl, but this change to
> >>./scripts/perl/rwtop.pl makes it behave properly:
> >>
> >> my $n = sprintf("%d", $ret);
> >>
> >> if ($n > 0) {
> >
> >I think what you figured out makes sense, its the best we have and you
> >found it to get it back working, could you please send the two patches
> >properly signed-off, etc?
>
> I'll do that for the readn.
>
> The above change seems like a workaround - not a proper fix. Why is
> perl treating it like an unsigned int? The trace format for the read
> syscall shows long. How is the binding done for perl? I can't make
> heads or tails of it scanning the files under perl. If we are forced
> to do a hack like the above is there a better way? I tried (int)
> $ret and perl did not like that. Any other (better) options?
Its possible that changes made to support non tracepoint events
introduced this problem, so perhaps bisecting it, as this wasn't present
some time ago, i.e. those big numbers :-\
- Arnaldo
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: perf script: rwtop: SIGALRM and pipe read race
2012-09-17 17:12 ` Arnaldo Carvalho de Melo
@ 2012-09-17 20:10 ` David Ahern
0 siblings, 0 replies; 11+ messages in thread
From: David Ahern @ 2012-09-17 20:10 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Steven Rostedt
Cc: Andrew Jones, linux-kernel, a.p.zijlstra, paulus, mingo, tzanussi
On 9/17/12 11:12 AM, Arnaldo Carvalho de Melo wrote:
>
> Its possible that changes made to support non tracepoint events
> introduced this problem, so perhaps bisecting it, as this wasn't present
> some time ago, i.e. those big numbers :-\
git bisect failed miserably for me. But, pondering a bit on changes to
the tracepoint processing and it's rather obvious - libtraceevent.
commit before libtraceevent - 44d1a3e -- works fine
first usable commit after libtraveevent - 8784eb7 -- fails
David
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: perf script: rwtop: SIGALRM and pipe read race
2012-09-17 14:55 ` David Ahern
2012-09-17 15:16 ` David Ahern
@ 2012-09-18 9:05 ` Andrew Jones
2012-09-18 9:30 ` Andrew Jones
1 sibling, 1 reply; 11+ messages in thread
From: Andrew Jones @ 2012-09-18 9:05 UTC (permalink / raw)
To: David Ahern; +Cc: acme, linux-kernel, a.p.zijlstra, paulus, mingo, tzanussi
On Mon, Sep 17, 2012 at 08:55:43AM -0600, David Ahern wrote:
> On 9/14/12 12:10 PM, Andrew Jones wrote:
> >On Fri, Sep 14, 2012 at 10:05:03AM -0600, David Ahern wrote:
> >>On 9/14/12 9:39 AM, Andrew Jones wrote:
> >>>
> >>>I recently tried 'perf script rwtop', and it immediately failed with
> >>>'failed to read event header'. Running it through strace I found that the
> >>>when rwtop.pl is reading from the pipe, and gets one of it's alarms, that
> >>>the ERESTARTSYS seems to confuse it - causing it to fail. It also appears
> >>>that the problem only happens early in execution, or not at all. If I get
> >>>lucky and don't hit the problem right away, then rwtop will run fine as
> >>>long as I want, without any ERESTARTSYS's in its trace. I also found that
> >>>I can avoid hitting the problem by throwing a 'pv -q' in front of the perf
> >>>command in tools/perf/scripts/perl/bin/rwtop-report. Which I guess slows
> >>>things down in the reader enough to always avoid the race.
> >>>
> >>>Sorry I don't have a solution (patch). I'll look at it more as time
> >>>permits, but I thought I'd get it reported for starters though.
> >>
> >>
> >>This fixes the run-time problem:
> >>
> >>diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
> >>index 1b8775c..a4371ae 100644
> >>--- a/tools/perf/util/util.c
> >>+++ b/tools/perf/util/util.c
> >>@@ -142,6 +142,9 @@ int readn(int fd, void *buf, size_t n)
> >> while (n) {
> >> int ret = read(fd, buf, n);
> >>
> >>+ if ((ret < 0) && (errno == EINTR))
> >>+ continue;
> >>+
> >> if (ret <= 0)
> >> return ret;
> >>
> >>
> >>
> >>The only problem you will find with rwtop is that bytes_read will be
> >>really whacky. I traced it to:
> >>
> >> if ($ret > 0) {
> >>printf("comm %s bytes_read %d\n", $common_comm, $ret);
> >> $reads{$common_pid}{bytes_read} += $ret;
> >>
> >>Somehow the $ret > 0 is passing when in fact it is negative. I do
> >>not know much about perl to fix it.
> >>
> >
> >This actually appears to be an issue with how perl sighandlers are
> >supposed to work.
> >
> >http://perldoc.perl.org/perlipc.html#Restartable-system-calls
> >
> >I tried the below patch though, and while it gets me past the read failure
> >it still doesn't solve the problem. With it the script stops processing
> >events after the first one.
> >
> >Drew
> >
> >diff --git a/tools/perf/scripts/perl/rwtop.pl
> >b/tools/perf/scripts/perl/rwtop.pl
> >index 4bb3ecd..8b20787 100644
> >--- a/tools/perf/scripts/perl/rwtop.pl
> >+++ b/tools/perf/scripts/perl/rwtop.pl
> >@@ -17,6 +17,7 @@ use lib
> >"$ENV{'PERF_EXEC_PATH'}/scripts/perl/Perf-Trace-Util/l
> > use lib "./Perf-Trace-Util/lib";
> > use Perf::Trace::Core;
> > use Perf::Trace::Util;
> >+use POSIX qw/SIGALRM SA_RESTART/;
> >
> > my $default_interval = 3;
> > my $nlines = 20;
> >@@ -90,7 +91,10 @@ sub syscalls::sys_enter_write
> >
> > sub trace_begin
> > {
> >- $SIG{ALRM} = \&set_print_pending;
> >+ my $sa = POSIX::SigAction->new(\&set_print_pending);
> >+ $sa->flags(SA_RESTART);
> >+ $sa->safe(1);
> >+ POSIX::sigaction(SIGALRM, $sa) or die "Can't set SIGALRM handler:
> >$!\n";
> > alarm 1;
> > }
> >
> >Drew
> >
>
> Not sure why you want to change the signal handling. The display
> routine appears to be working.
Please read the link I posted to the Perl documentation. The standard
Perl signal handling doesn't support SA_RESTART. The Perl developers
found it could lead to data corruption. The patch above attempts to
replace the standard signal handler with a safe one that supports
SA_RESTART. It has nothing to do with the display routine, and
everything to do with avoiding the need for your EINTR patch.
>
> Arnaldo had pinged me about rwtop a couple of weeks ago -- thinking
> one of my patches broke it. After looking into it I see 3 problems
> with the rwtop scripts:
>
> 1. readn can fail with EINTR and that needs to be handled. The
> earlier patch fixes that.
This patch is a hack that shouldn't be necessary with proper
automatic ERESTARTSYS handling (SA_RESTART). Actually, I'd argue that
the patch is also wrong because it handles EINTR at all. The errors
should be propagated. What if some caller in builtin-* would prefer
to not use SA_RESTART, and then to make its own decisions on EINTR?
>
> 2. the rwtop.pl script is not handling negative return values ($ret
> < 0) properly -- the '$ret > 0' check is succeeding even though $ret
> is negative (e.g., -EAGAIN) leading to astronomical read values
Yeah, I see this problem too, and don't know what's up with it. The
real magic happens in the macro craziness in /usr/lib/perl5/CORE/,
but as far a perf goes, it looks like it should be right
perl_process_tracepoint() has
} else { /* FIELD_IS_NUMERIC */
val = read_size(pevent, data + field->offset,
field->size);
if (field->flags & FIELD_IS_SIGNED) {
XPUSHs(sv_2mortal(newSViv(val)));
} else {
XPUSHs(sv_2mortal(newSVuv(val)));
}
}
>
> 3. record by default uses the watermark so it does not push records
> to the report script until the watermark is reached. This has 2 side
> effects:
>
> a. for systems doing few reads and writes it can take a while for
> enough records to be generated causing an apparent hang when the
> command is launched. disabling delay (-D argument to perf-record in
> rwtop-record) handles that, but that's not the right answer for busy
> systems.
Adding -D fixed my problem with rwtop only processing the first event.
>
> b. the display will not be accurate (up to date) given that there
> are buffered records that have not been processed.
>
> Really we need a periodic mode for record that would push all
> records every N timer interval. This way everything is pushed to the
> processing script in a timely manner.
>
> Arnaldo: I believe 3 explains the hang you saw.
>
> David
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: perf script: rwtop: SIGALRM and pipe read race
2012-09-18 9:05 ` Andrew Jones
@ 2012-09-18 9:30 ` Andrew Jones
0 siblings, 0 replies; 11+ messages in thread
From: Andrew Jones @ 2012-09-18 9:30 UTC (permalink / raw)
To: David Ahern; +Cc: acme, linux-kernel, a.p.zijlstra, paulus, mingo, tzanussi
On Tue, Sep 18, 2012 at 11:05:42AM +0200, Andrew Jones wrote:
> Please read the link I posted to the Perl documentation. The standard
> Perl signal handling doesn't support SA_RESTART. The Perl developers
> found it could lead to data corruption. The patch above attempts to
> replace the standard signal handler with a safe one that supports
> SA_RESTART. It has nothing to do with the display routine, and
> everything to do with avoiding the need for your EINTR patch.
Oh, I should state that it *does* avoid your EINTR patch. With the -D
added to the record script, then besides the unsigned $ret issue,
rwtop works fine for me now with with this SA_RESTART patch.
Below is a v2 where I've also updated the minimum version of Perl
needed. My understanding is that < 5.8.0 $SIG might work fine.
>= 5.8.0 it won't, and 5.8.2 or later is needed for ->safe().
diff --git a/tools/perf/scripts/perl/rwtop.pl b/tools/perf/scripts/perl/rwtop.pl
index 4bb3ecd..617a4d5 100644
--- a/tools/perf/scripts/perl/rwtop.pl
+++ b/tools/perf/scripts/perl/rwtop.pl
@@ -9,7 +9,7 @@
# refreshed every [interval] seconds. The default interval is 3
# seconds.
-use 5.010000;
+use 5.8.2;
use strict;
use warnings;
@@ -17,6 +17,7 @@ use lib "$ENV{'PERF_EXEC_PATH'}/scripts/perl/Perf-Trace-Util/lib";
use lib "./Perf-Trace-Util/lib";
use Perf::Trace::Core;
use Perf::Trace::Util;
+use POSIX qw/SIGALRM SA_RESTART/;
my $default_interval = 3;
my $nlines = 20;
@@ -90,7 +91,10 @@ sub syscalls::sys_enter_write
sub trace_begin
{
- $SIG{ALRM} = \&set_print_pending;
+ my $sa = POSIX::SigAction->new(\&set_print_pending);
+ $sa->flags(SA_RESTART);
+ $sa->safe(1);
+ POSIX::sigaction(SIGALRM, $sa) or die "Can't set SIGALRM handler: $!\n";
alarm 1;
}
^ permalink raw reply related [flat|nested] 11+ messages in thread
end of thread, other threads:[~2012-09-18 9:30 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-09-14 15:39 perf script: rwtop: SIGALRM and pipe read race Andrew Jones
2012-09-14 16:05 ` David Ahern
2012-09-14 18:10 ` Andrew Jones
2012-09-17 14:55 ` David Ahern
2012-09-17 15:16 ` David Ahern
2012-09-17 16:02 ` Arnaldo Carvalho de Melo
2012-09-17 16:32 ` David Ahern
2012-09-17 17:12 ` Arnaldo Carvalho de Melo
2012-09-17 20:10 ` David Ahern
2012-09-18 9:05 ` Andrew Jones
2012-09-18 9:30 ` Andrew Jones
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox