* [PATCH] trace-cmd: use nonblocking reads for streaming
@ 2015-12-17 17:01 Josef Bacik
2016-02-23 23:17 ` Steven Rostedt
0 siblings, 1 reply; 3+ messages in thread
From: Josef Bacik @ 2015-12-17 17:01 UTC (permalink / raw)
To: linux-kernel, kernel-team, rostedt
I noticed while using the streaming infrastructure in trace-cmd that I was
seemingly missing events. Using other tracing methods I got these events and
record->missed_events was never being set. This is because the streaming
infrastructure uses blocking reads on the per cpu trace pipe's, which means
we'll wait for an entire pages worth of data to be ready before passing it along
to the recorder. This makes it impossible to do long term tracing that requires
coupling two different events that could occur on different CPU's, and I imagine
it has been what is screwing up my trace-cmd profile runs on our giant 40 cpu
boxes. Fix trace-cmd instead to use a nonblocking read with select to wait for
data on the pipe so we don't burn CPU unnecessarily. With this patch I'm no
longer seeing missed events in my app. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
---
trace-cmd.h | 1 +
trace-record.c | 9 +++++++--
trace-recorder.c | 30 ++++++++++++++++++++++++++++++
3 files changed, 38 insertions(+), 2 deletions(-)
diff --git a/trace-cmd.h b/trace-cmd.h
index 3a230b4..c94f784 100644
--- a/trace-cmd.h
+++ b/trace-cmd.h
@@ -282,6 +282,7 @@ struct tracecmd_recorder *tracecmd_create_buffer_recorder(const char *file, int
struct tracecmd_recorder *tracecmd_create_buffer_recorder_maxkb(const char *file, int cpu, unsigned flags, const char *buffer, int maxkb);
int tracecmd_start_recording(struct tracecmd_recorder *recorder, unsigned long sleep);
+long tracecmd_stream_recording(struct tracecmd_recorder *recorder, unsigned long sleep);
void tracecmd_stop_recording(struct tracecmd_recorder *recorder);
void tracecmd_stat_cpu(struct trace_seq *s, int cpu);
long tracecmd_flush_recording(struct tracecmd_recorder *recorder);
diff --git a/trace-record.c b/trace-record.c
index 49a8280..7981932 100644
--- a/trace-record.c
+++ b/trace-record.c
@@ -2450,8 +2450,13 @@ static int create_recorder(struct buffer_instance *instance, int cpu,
}
while (!finished) {
- if (tracecmd_start_recording(recorder, sleep_time) < 0)
- break;
+ if (type == TRACE_TYPE_STREAM) {
+ if (tracecmd_stream_recording(recorder, sleep_time) < 0)
+ break;
+ } else {
+ if (tracecmd_start_recording(recorder, sleep_time) < 0)
+ break;
+ }
}
tracecmd_free_recorder(recorder);
diff --git a/trace-recorder.c b/trace-recorder.c
index 16d1be0..13e2e9f 100644
--- a/trace-recorder.c
+++ b/trace-recorder.c
@@ -440,6 +440,36 @@ long tracecmd_flush_recording(struct tracecmd_recorder *recorder)
return total;
}
+#define USECS_IN_SEC 1000000UL
+long tracecmd_stream_recording(struct tracecmd_recorder *recorder, unsigned long sleep)
+{
+ static int nonblock = 0;
+ fd_set rfds;
+ struct timeval tv;
+ long total = 0;
+ long ret;
+
+ if (!nonblock)
+ set_nonblock(recorder);
+
+ FD_ZERO(&rfds);
+ FD_SET(recorder->trace_fd, &rfds);
+ tv.tv_sec = sleep / USECS_IN_SEC;
+ tv.tv_usec = sleep % USECS_IN_SEC;
+ ret = select(FD_SETSIZE, &rfds, NULL, NULL, &tv);
+ if (ret <= 0)
+ return 0;
+
+ do {
+ ret = read_data(recorder);
+ if (ret < 0)
+ return ret;
+ total += ret;
+ } while (ret);
+
+ return total;
+}
+
int tracecmd_start_recording(struct tracecmd_recorder *recorder, unsigned long sleep)
{
struct timespec req;
--
2.5.0
^ permalink raw reply related [flat|nested] 3+ messages in thread* Re: [PATCH] trace-cmd: use nonblocking reads for streaming
2015-12-17 17:01 [PATCH] trace-cmd: use nonblocking reads for streaming Josef Bacik
@ 2016-02-23 23:17 ` Steven Rostedt
2016-02-24 18:27 ` Josef Bacik
0 siblings, 1 reply; 3+ messages in thread
From: Steven Rostedt @ 2016-02-23 23:17 UTC (permalink / raw)
To: Josef Bacik; +Cc: linux-kernel, kernel-team
On Thu, 17 Dec 2015 12:01:52 -0500
Josef Bacik <jbacik@fb.com> wrote:
> I noticed while using the streaming infrastructure in trace-cmd that I was
> seemingly missing events. Using other tracing methods I got these events and
> record->missed_events was never being set. This is because the streaming
> infrastructure uses blocking reads on the per cpu trace pipe's, which means
> we'll wait for an entire pages worth of data to be ready before passing it along
> to the recorder. This makes it impossible to do long term tracing that requires
> coupling two different events that could occur on different CPU's, and I imagine
> it has been what is screwing up my trace-cmd profile runs on our giant 40 cpu
> boxes. Fix trace-cmd instead to use a nonblocking read with select to wait for
> data on the pipe so we don't burn CPU unnecessarily. With this patch I'm no
> longer seeing missed events in my app. Thanks,
I just want to make sure I understand what is happening here.
This wasn't trace-cmd's default code right? This was your own app. And
I'm guessing you were matching events perhaps. That is, after seeing
some event, you looked for the other event. But if that event happened
on a CPU that isn't very active, it would wait forever, as the read
was waiting for a full page?
Or is there something else.
I don't have a problem with the patch. I just want to understand the
issue.
Thanks!
-- Steve
>
> Signed-off-by: Josef Bacik <jbacik@fb.com>
> ---
> trace-cmd.h | 1 +
> trace-record.c | 9 +++++++--
> trace-recorder.c | 30 ++++++++++++++++++++++++++++++
> 3 files changed, 38 insertions(+), 2 deletions(-)
>
> diff --git a/trace-cmd.h b/trace-cmd.h
> index 3a230b4..c94f784 100644
> --- a/trace-cmd.h
> +++ b/trace-cmd.h
> @@ -282,6 +282,7 @@ struct tracecmd_recorder *tracecmd_create_buffer_recorder(const char *file, int
> struct tracecmd_recorder *tracecmd_create_buffer_recorder_maxkb(const char *file, int cpu, unsigned flags, const char *buffer, int maxkb);
>
> int tracecmd_start_recording(struct tracecmd_recorder *recorder, unsigned long sleep);
> +long tracecmd_stream_recording(struct tracecmd_recorder *recorder, unsigned long sleep);
> void tracecmd_stop_recording(struct tracecmd_recorder *recorder);
> void tracecmd_stat_cpu(struct trace_seq *s, int cpu);
> long tracecmd_flush_recording(struct tracecmd_recorder *recorder);
> diff --git a/trace-record.c b/trace-record.c
> index 49a8280..7981932 100644
> --- a/trace-record.c
> +++ b/trace-record.c
> @@ -2450,8 +2450,13 @@ static int create_recorder(struct buffer_instance *instance, int cpu,
> }
>
> while (!finished) {
> - if (tracecmd_start_recording(recorder, sleep_time) < 0)
> - break;
> + if (type == TRACE_TYPE_STREAM) {
> + if (tracecmd_stream_recording(recorder, sleep_time) < 0)
> + break;
> + } else {
> + if (tracecmd_start_recording(recorder, sleep_time) < 0)
> + break;
> + }
> }
> tracecmd_free_recorder(recorder);
>
> diff --git a/trace-recorder.c b/trace-recorder.c
> index 16d1be0..13e2e9f 100644
> --- a/trace-recorder.c
> +++ b/trace-recorder.c
> @@ -440,6 +440,36 @@ long tracecmd_flush_recording(struct tracecmd_recorder *recorder)
> return total;
> }
>
> +#define USECS_IN_SEC 1000000UL
> +long tracecmd_stream_recording(struct tracecmd_recorder *recorder, unsigned long sleep)
> +{
> + static int nonblock = 0;
> + fd_set rfds;
> + struct timeval tv;
> + long total = 0;
> + long ret;
> +
> + if (!nonblock)
> + set_nonblock(recorder);
> +
> + FD_ZERO(&rfds);
> + FD_SET(recorder->trace_fd, &rfds);
> + tv.tv_sec = sleep / USECS_IN_SEC;
> + tv.tv_usec = sleep % USECS_IN_SEC;
> + ret = select(FD_SETSIZE, &rfds, NULL, NULL, &tv);
> + if (ret <= 0)
> + return 0;
> +
> + do {
> + ret = read_data(recorder);
> + if (ret < 0)
> + return ret;
> + total += ret;
> + } while (ret);
> +
> + return total;
> +}
> +
> int tracecmd_start_recording(struct tracecmd_recorder *recorder, unsigned long sleep)
> {
> struct timespec req;
^ permalink raw reply [flat|nested] 3+ messages in thread* Re: [PATCH] trace-cmd: use nonblocking reads for streaming
2016-02-23 23:17 ` Steven Rostedt
@ 2016-02-24 18:27 ` Josef Bacik
0 siblings, 0 replies; 3+ messages in thread
From: Josef Bacik @ 2016-02-24 18:27 UTC (permalink / raw)
To: Steven Rostedt; +Cc: linux-kernel, kernel-team
On 02/23/2016 06:17 PM, Steven Rostedt wrote:
> On Thu, 17 Dec 2015 12:01:52 -0500
> Josef Bacik <jbacik@fb.com> wrote:
>
>> I noticed while using the streaming infrastructure in trace-cmd that I was
>> seemingly missing events. Using other tracing methods I got these events and
>> record->missed_events was never being set. This is because the streaming
>> infrastructure uses blocking reads on the per cpu trace pipe's, which means
>> we'll wait for an entire pages worth of data to be ready before passing it along
>> to the recorder. This makes it impossible to do long term tracing that requires
>> coupling two different events that could occur on different CPU's, and I imagine
>> it has been what is screwing up my trace-cmd profile runs on our giant 40 cpu
>> boxes. Fix trace-cmd instead to use a nonblocking read with select to wait for
>> data on the pipe so we don't burn CPU unnecessarily. With this patch I'm no
>> longer seeing missed events in my app. Thanks,
>
> I just want to make sure I understand what is happening here.
>
> This wasn't trace-cmd's default code right? This was your own app. And
> I'm guessing you were matching events perhaps. That is, after seeing
> some event, you looked for the other event. But if that event happened
> on a CPU that isn't very active, it would wait forever, as the read
> was waiting for a full page?
>
> Or is there something else.
>
> I don't have a problem with the patch. I just want to understand the
> issue.
Yup I had an app that was watching block request issue and completion
events, and occasionally a completion event would happen on some mostly
idle cpu, so I wouldn't get the completion request until several hours
later (the app runs all the time) when we finally had a full page to
read from that cpu's buffer. Thanks,
Josef
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2016-02-24 18:27 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-12-17 17:01 [PATCH] trace-cmd: use nonblocking reads for streaming Josef Bacik
2016-02-23 23:17 ` Steven Rostedt
2016-02-24 18:27 ` Josef Bacik
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox