* [PATCH v1 1/8] perf report: Name events in stats for pipe mode
2024-08-29 15:01 [PATCH v1 0/8] Correct inject's handling of pipe files on disk Ian Rogers
@ 2024-08-29 15:01 ` Ian Rogers
2024-08-29 15:01 ` [PATCH v1 2/8] perf session: Document struct and constify auxtrace Ian Rogers
` (7 subsequent siblings)
8 siblings, 0 replies; 18+ messages in thread
From: Ian Rogers @ 2024-08-29 15:01 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Kan Liang, Nick Terrell, Yanteng Si,
Yicong Yang, James Clark, linux-perf-users, linux-kernel
In stats mode PERF_RECORD_EVENT_UPDATE isn't being handled meaning the
evsels aren't named when handling pipe mode output.
Before:
```
$ perf record -e inst_retired.any -a -o - sleep 0.1|perf report --stats -i -
...
Aggregated stats:
TOTAL events: 23358
COMM events: 2608 (11.2%)
EXIT events: 1 ( 0.0%)
FORK events: 2607 (11.2%)
SAMPLE events: 174 ( 0.7%)
MMAP2 events: 17936 (76.8%)
ATTR events: 2 ( 0.0%)
FINISHED_ROUND events: 2 ( 0.0%)
ID_INDEX events: 1 ( 0.0%)
THREAD_MAP events: 1 ( 0.0%)
CPU_MAP events: 1 ( 0.0%)
EVENT_UPDATE events: 3 ( 0.0%)
TIME_CONV events: 1 ( 0.0%)
FEATURE events: 20 ( 0.1%)
FINISHED_INIT events: 1 ( 0.0%)
raw 0xc0 stats:
SAMPLE events: 174
```
After:
```
$ perf record -e inst_retired.any -a -o - sleep 0.1|perf report --stats -i -
...
Aggregated stats:
TOTAL events: 23742
COMM events: 2620 (11.0%)
EXIT events: 2 ( 0.0%)
FORK events: 2619 (11.0%)
SAMPLE events: 165 ( 0.7%)
MMAP2 events: 18304 (77.1%)
ATTR events: 2 ( 0.0%)
FINISHED_ROUND events: 2 ( 0.0%)
ID_INDEX events: 1 ( 0.0%)
THREAD_MAP events: 1 ( 0.0%)
CPU_MAP events: 1 ( 0.0%)
EVENT_UPDATE events: 3 ( 0.0%)
TIME_CONV events: 1 ( 0.0%)
FEATURE events: 20 ( 0.1%)
FINISHED_INIT events: 1 ( 0.0%)
inst_retired.any stats:
SAMPLE events: 165
```
This makes the pipe output match the regular output.
Signed-off-by: Ian Rogers <irogers@google.com>
---
This bug pre-dates commit 113f614c6dd0 ("perf report: Use
perf_tool__init()")
---
tools/perf/builtin-report.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 1643113616f4..5c21ca33ca08 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -811,6 +811,7 @@ static void stats_setup(struct report *rep)
rep->tool.attr = process_attr;
rep->tool.sample = count_sample_event;
rep->tool.lost_samples = count_lost_samples_event;
+ rep->tool.event_update = perf_event__process_event_update;
rep->tool.no_warn = true;
}
--
2.46.0.295.g3b9ea8a38a-goog
^ permalink raw reply related [flat|nested] 18+ messages in thread* [PATCH v1 2/8] perf session: Document struct and constify auxtrace
2024-08-29 15:01 [PATCH v1 0/8] Correct inject's handling of pipe files on disk Ian Rogers
2024-08-29 15:01 ` [PATCH v1 1/8] perf report: Name events in stats for pipe mode Ian Rogers
@ 2024-08-29 15:01 ` Ian Rogers
2024-08-29 15:01 ` [PATCH v1 3/8] perf header: Add kerneldoc to perf_file_header Ian Rogers
` (6 subsequent siblings)
8 siblings, 0 replies; 18+ messages in thread
From: Ian Rogers @ 2024-08-29 15:01 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Kan Liang, Nick Terrell, Yanteng Si,
Yicong Yang, James Clark, linux-perf-users, linux-kernel
perf_session is a central data structure to the tool so let's comment
it. The auxtrace callbacks are never modified in session so constify.
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/util/session.h | 48 ++++++++++++++++++++++++++++++++++++++-
1 file changed, 47 insertions(+), 1 deletion(-)
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index 7c8dd6956330..e56518639711 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -26,26 +26,72 @@ struct decomp_data {
struct zstd_data *zstd_decomp;
};
+/**
+ * struct perf_session- A Perf session holds the main state when the program is
+ * working with live perf events or reading data from an input file.
+ *
+ * The rough organization of a perf_session is:
+ * ```
+ * +--------------+ +-----------+ +------------+
+ * | Session |1..* ----->| Machine |1..* ----->| Thread |
+ * +--------------+ +-----------+ +------------+
+ * ```
+ */
struct perf_session {
+ /**
+ * @header: The read version of a perf_file_header, or captures global
+ * information from a live session.
+ */
struct perf_header header;
+ /** @machines: Machines within the session a host and 0 or more guests. */
struct machines machines;
+ /** @evlist: List of evsels/events of the session. */
struct evlist *evlist;
- struct auxtrace *auxtrace;
+ /** @auxtrace: callbacks to allow AUX area data decoding. */
+ const struct auxtrace *auxtrace;
+ /** @itrace_synth_opts: AUX area tracing synthesis options. */
struct itrace_synth_opts *itrace_synth_opts;
+ /** @auxtrace_index: index of AUX area tracing events within a perf.data file. */
struct list_head auxtrace_index;
#ifdef HAVE_LIBTRACEEVENT
+ /** @tevent: handles for libtraceevent and plugins. */
struct trace_event tevent;
#endif
+ /** @time_conv: Holds contents of last PERF_RECORD_TIME_CONV event. */
struct perf_record_time_conv time_conv;
+ /**
+ * @repipe: When set causes certain reading (header and trace events) to
+ * also write events. The written file descriptor must be provided for
+ * the header but is implicitly stdout for trace events.
+ */
bool repipe;
+ /**
+ * @one_mmap: The reader will use a single mmap by default. There may be
+ * multiple data files in particular for aux events. If this is true
+ * then the single big mmap for the data file can be assumed.
+ */
bool one_mmap;
+ /** @one_mmap_addr: Address of initial perf data file reader mmap. */
void *one_mmap_addr;
+ /** @one_mmap_offset: File offset in perf.data file when mapped. */
u64 one_mmap_offset;
+ /** @ordered_events: Used to turn unordered events into ordered ones. */
struct ordered_events ordered_events;
+ /** @data: Optional perf data file being read from. */
struct perf_data *data;
+ /** @tool: callbacks for event handling. */
const struct perf_tool *tool;
+ /**
+ * @bytes_transferred: Used by perf record to count written bytes before
+ * compression.
+ */
u64 bytes_transferred;
+ /**
+ * @bytes_compressed: Used by perf record to count written bytes after
+ * compression.
+ */
u64 bytes_compressed;
+ /** @zstd_data: Owner of global compression state, buffers, etc. */
struct zstd_data zstd_data;
struct decomp_data decomp_data;
struct decomp_data *active_decomp;
--
2.46.0.295.g3b9ea8a38a-goog
^ permalink raw reply related [flat|nested] 18+ messages in thread* [PATCH v1 3/8] perf header: Add kerneldoc to perf_file_header
2024-08-29 15:01 [PATCH v1 0/8] Correct inject's handling of pipe files on disk Ian Rogers
2024-08-29 15:01 ` [PATCH v1 1/8] perf report: Name events in stats for pipe mode Ian Rogers
2024-08-29 15:01 ` [PATCH v1 2/8] perf session: Document struct and constify auxtrace Ian Rogers
@ 2024-08-29 15:01 ` Ian Rogers
2024-08-29 15:01 ` [PATCH v1 4/8] perf header: Fail read if header sections overlap Ian Rogers
` (5 subsequent siblings)
8 siblings, 0 replies; 18+ messages in thread
From: Ian Rogers @ 2024-08-29 15:01 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Kan Liang, Nick Terrell, Yanteng Si,
Yicong Yang, James Clark, linux-perf-users, linux-kernel
Some of the values are a little strange so add documentation to
resolve ambiguity.
Signed-off-by: Ian Rogers <irogers@google.com>
---
| 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
--git a/tools/perf/util/header.h b/tools/perf/util/header.h
index 07ff647197ff..3285981948d7 100644
--- a/tools/perf/util/header.h
+++ b/tools/perf/util/header.h
@@ -61,14 +61,28 @@ struct perf_file_section {
u64 size;
};
+/**
+ * struct perf_file_header: Header representation on disk.
+ */
struct perf_file_header {
+ /** @magic: Holds "PERFILE2". */
u64 magic;
+ /** @size: Size of this header - sizeof(struct perf_file_header). */
u64 size;
+ /**
+ * @attr_size: Size of attrs entries - sizeof(struct perf_event_attr) +
+ * sizeof(struct perf_file_section).
+ */
u64 attr_size;
+ /** @attrs: Offset and size of file section holding attributes. */
struct perf_file_section attrs;
+ /** @data: Offset and size of file section holding regular event data. */
struct perf_file_section data;
- /* event_types is ignored */
+ /** @event_types: Ignored. */
struct perf_file_section event_types;
+ /**
+ * @adds_features: Bitmap of features. The features are immediately after the data section.
+ */
DECLARE_BITMAP(adds_features, HEADER_FEAT_BITS);
};
--
2.46.0.295.g3b9ea8a38a-goog
^ permalink raw reply related [flat|nested] 18+ messages in thread* [PATCH v1 4/8] perf header: Fail read if header sections overlap
2024-08-29 15:01 [PATCH v1 0/8] Correct inject's handling of pipe files on disk Ian Rogers
` (2 preceding siblings ...)
2024-08-29 15:01 ` [PATCH v1 3/8] perf header: Add kerneldoc to perf_file_header Ian Rogers
@ 2024-08-29 15:01 ` Ian Rogers
2024-08-29 15:01 ` [PATCH v1 5/8] perf header: Allow attributes to be written after data Ian Rogers
` (4 subsequent siblings)
8 siblings, 0 replies; 18+ messages in thread
From: Ian Rogers @ 2024-08-29 15:01 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Kan Liang, Nick Terrell, Yanteng Si,
Yicong Yang, James Clark, linux-perf-users, linux-kernel
Buggy perf.data files can have the attributes and data
overlapping. For example, when processing pipe data the attributes
aren't known and so file offset header calculations can consider them
not present. Later this can cause the attributes to overwrite the
data. This can be seen in:
```
$ perf record -o - true > a.data
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.059 MB - ]
$ perf inject -i a.data -o b.data
$ perf report --stats -i b.data
0x68 [0]: failed to process type: 510379 [Invalid argument]
Error:
failed to process sample
```
This change makes reading the corrupt file fail:
```
$ perf report --stats -i b.data
Perf file header corrupt: Attributes and data overlap
incompatible file format (rerun with -v to learn more)
```
Which is more informative.
Signed-off-by: Ian Rogers <irogers@google.com>
---
| 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
--git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 3309fe7f1d12..65c9086610cb 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -3986,6 +3986,24 @@ int perf_file_header__read(struct perf_file_header *header,
adds_features));
}
+ if (header->size > header->attrs.offset) {
+ pr_err("Perf file header corrupt: header overlaps attrs\n");
+ return -1;
+ }
+
+ if (header->size > header->data.offset) {
+ pr_err("Perf file header corrupt: header overlaps data\n");
+ return -1;
+ }
+
+ if ((header->attrs.offset <= header->data.offset &&
+ header->attrs.offset + header->attrs.size > header->data.offset) ||
+ (header->attrs.offset > header->data.offset &&
+ header->data.offset + header->data.size > header->attrs.offset)) {
+ pr_err("Perf file header corrupt: Attributes and data overlap\n");
+ return -1;
+ }
+
if (header->size != sizeof(*header)) {
/* Support the previous format */
if (header->size == offsetof(typeof(*header), adds_features))
--
2.46.0.295.g3b9ea8a38a-goog
^ permalink raw reply related [flat|nested] 18+ messages in thread* [PATCH v1 5/8] perf header: Allow attributes to be written after data
2024-08-29 15:01 [PATCH v1 0/8] Correct inject's handling of pipe files on disk Ian Rogers
` (3 preceding siblings ...)
2024-08-29 15:01 ` [PATCH v1 4/8] perf header: Fail read if header sections overlap Ian Rogers
@ 2024-08-29 15:01 ` Ian Rogers
2024-08-29 19:31 ` Arnaldo Carvalho de Melo
2024-08-29 15:01 ` [PATCH v1 6/8] perf inject: Overhaul handling of pipe files Ian Rogers
` (3 subsequent siblings)
8 siblings, 1 reply; 18+ messages in thread
From: Ian Rogers @ 2024-08-29 15:01 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Kan Liang, Nick Terrell, Yanteng Si,
Yicong Yang, James Clark, linux-perf-users, linux-kernel
With a file, to write data an offset needs to be known. Typically data
follows the event attributes in a file. However, if processing a pipe
the number of event attributes may not be known. It is convenient in
that case to write the attributes after the data. Expand
perf_session__do_write_header to allow this when the data offset and
size are known.
This approach may be useful for more than just taking a pipe file to
write into a data file, `perf inject --itrace` will reserve and
additional 8kb for attributes, which would be unnecessary if the
attributes were written after the data.
Signed-off-by: Ian Rogers <irogers@google.com>
---
| 106 +++++++++++++++++++++++++--------------
1 file changed, 67 insertions(+), 39 deletions(-)
--git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 65c9086610cb..4eb39463067e 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -3676,32 +3676,50 @@ int perf_header__write_pipe(int fd)
static int perf_session__do_write_header(struct perf_session *session,
struct evlist *evlist,
int fd, bool at_exit,
- struct feat_copier *fc)
+ struct feat_copier *fc,
+ bool write_attrs_after_data)
{
struct perf_file_header f_header;
- struct perf_file_attr f_attr;
struct perf_header *header = &session->header;
struct evsel *evsel;
struct feat_fd ff = {
.fd = fd,
};
- u64 attr_offset;
+ u64 attr_offset = sizeof(f_header), attr_size = 0;
int err;
- lseek(fd, sizeof(f_header), SEEK_SET);
+ if (write_attrs_after_data && at_exit) {
+ /*
+ * Write features at the end of the file first so that
+ * attributes may come after them.
+ */
+ if (!header->data_offset && header->data_size) {
+ pr_err("File contains data but offset unknown\n");
+ err = -1;
+ goto err_out;
+ }
+ header->feat_offset = header->data_offset + header->data_size;
+ err = perf_header__adds_write(header, evlist, fd, fc);
+ if (err < 0)
+ goto err_out;
+ attr_offset = lseek(fd, 0, SEEK_CUR);
+ } else {
+ lseek(fd, attr_offset, SEEK_SET);
+ }
evlist__for_each_entry(session->evlist, evsel) {
- evsel->id_offset = lseek(fd, 0, SEEK_CUR);
- err = do_write(&ff, evsel->core.id, evsel->core.ids * sizeof(u64));
- if (err < 0) {
- pr_debug("failed to write perf header\n");
- free(ff.buf);
- return err;
+ evsel->id_offset = attr_offset;
+ /* Avoid writing at the end of the file until the session is exiting. */
+ if (!write_attrs_after_data || at_exit) {
+ err = do_write(&ff, evsel->core.id, evsel->core.ids * sizeof(u64));
+ if (err < 0) {
+ pr_debug("failed to write perf header\n");
+ goto err_out;
+ }
}
+ attr_offset += evsel->core.ids * sizeof(u64);
}
- attr_offset = lseek(ff.fd, 0, SEEK_CUR);
-
evlist__for_each_entry(evlist, evsel) {
if (evsel->core.attr.size < sizeof(evsel->core.attr)) {
/*
@@ -3711,40 +3729,46 @@ static int perf_session__do_write_header(struct perf_session *session,
*/
evsel->core.attr.size = sizeof(evsel->core.attr);
}
- f_attr = (struct perf_file_attr){
- .attr = evsel->core.attr,
- .ids = {
- .offset = evsel->id_offset,
- .size = evsel->core.ids * sizeof(u64),
+ /* Avoid writing at the end of the file until the session is exiting. */
+ if (!write_attrs_after_data || at_exit) {
+ struct perf_file_attr f_attr = {
+ .attr = evsel->core.attr,
+ .ids = {
+ .offset = evsel->id_offset,
+ .size = evsel->core.ids * sizeof(u64),
+ }
+ };
+ err = do_write(&ff, &f_attr, sizeof(f_attr));
+ if (err < 0) {
+ pr_debug("failed to write perf header attribute\n");
+ goto err_out;
}
- };
- err = do_write(&ff, &f_attr, sizeof(f_attr));
- if (err < 0) {
- pr_debug("failed to write perf header attribute\n");
- free(ff.buf);
- return err;
}
+ attr_size += sizeof(struct perf_file_attr);
}
- if (!header->data_offset)
- header->data_offset = lseek(fd, 0, SEEK_CUR);
+ if (!header->data_offset) {
+ if (write_attrs_after_data)
+ header->data_offset = sizeof(f_header);
+ else
+ header->data_offset = attr_offset + attr_size;
+ }
header->feat_offset = header->data_offset + header->data_size;
- if (at_exit) {
+ if (!write_attrs_after_data && at_exit) {
+ /* Write features now feat_offset is known. */
err = perf_header__adds_write(header, evlist, fd, fc);
- if (err < 0) {
- free(ff.buf);
- return err;
- }
+ if (err < 0)
+ goto err_out;
}
f_header = (struct perf_file_header){
.magic = PERF_MAGIC,
.size = sizeof(f_header),
- .attr_size = sizeof(f_attr),
+ .attr_size = sizeof(struct perf_file_attr),
.attrs = {
.offset = attr_offset,
- .size = evlist->core.nr_entries * sizeof(f_attr),
+ .size = attr_size,
},
.data = {
.offset = header->data_offset,
@@ -3757,21 +3781,24 @@ static int perf_session__do_write_header(struct perf_session *session,
lseek(fd, 0, SEEK_SET);
err = do_write(&ff, &f_header, sizeof(f_header));
- free(ff.buf);
if (err < 0) {
pr_debug("failed to write perf header\n");
- return err;
+ goto err_out;
+ } else {
+ lseek(fd, 0, SEEK_END);
+ err = 0;
}
- lseek(fd, header->data_offset + header->data_size, SEEK_SET);
-
- return 0;
+err_out:
+ free(ff.buf);
+ return err;
}
int perf_session__write_header(struct perf_session *session,
struct evlist *evlist,
int fd, bool at_exit)
{
- return perf_session__do_write_header(session, evlist, fd, at_exit, NULL);
+ return perf_session__do_write_header(session, evlist, fd, at_exit, /*fc=*/NULL,
+ /*write_attrs_after_data=*/false);
}
size_t perf_session__data_offset(const struct evlist *evlist)
@@ -3793,7 +3820,8 @@ int perf_session__inject_header(struct perf_session *session,
int fd,
struct feat_copier *fc)
{
- return perf_session__do_write_header(session, evlist, fd, true, fc);
+ return perf_session__do_write_header(session, evlist, fd, true, fc,
+ /*write_attrs_after_data=*/false);
}
static int perf_header__getbuffer64(struct perf_header *header,
--
2.46.0.295.g3b9ea8a38a-goog
^ permalink raw reply related [flat|nested] 18+ messages in thread* Re: [PATCH v1 5/8] perf header: Allow attributes to be written after data
2024-08-29 15:01 ` [PATCH v1 5/8] perf header: Allow attributes to be written after data Ian Rogers
@ 2024-08-29 19:31 ` Arnaldo Carvalho de Melo
2024-08-29 20:12 ` Ian Rogers
0 siblings, 1 reply; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-08-29 19:31 UTC (permalink / raw)
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Adrian Hunter, Kan Liang,
Nick Terrell, Yanteng Si, Yicong Yang, James Clark,
linux-perf-users, linux-kernel
On Thu, Aug 29, 2024 at 08:01:51AM -0700, Ian Rogers wrote:
> With a file, to write data an offset needs to be known. Typically data
> follows the event attributes in a file. However, if processing a pipe
> the number of event attributes may not be known. It is convenient in
> that case to write the attributes after the data. Expand
> perf_session__do_write_header to allow this when the data offset and
> size are known.
>
> This approach may be useful for more than just taking a pipe file to
> write into a data file, `perf inject --itrace` will reserve and
> additional 8kb for attributes, which would be unnecessary if the
> attributes were written after the data.
>
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
> tools/perf/util/header.c | 106 +++++++++++++++++++++++++--------------
> 1 file changed, 67 insertions(+), 39 deletions(-)
>
> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> index 65c9086610cb..4eb39463067e 100644
> --- a/tools/perf/util/header.c
> +++ b/tools/perf/util/header.c
> @@ -3676,32 +3676,50 @@ int perf_header__write_pipe(int fd)
> static int perf_session__do_write_header(struct perf_session *session,
> struct evlist *evlist,
> int fd, bool at_exit,
> - struct feat_copier *fc)
> + struct feat_copier *fc,
> + bool write_attrs_after_data)
> {
> struct perf_file_header f_header;
> - struct perf_file_attr f_attr;
> struct perf_header *header = &session->header;
> struct evsel *evsel;
> struct feat_fd ff = {
> .fd = fd,
> };
> - u64 attr_offset;
> + u64 attr_offset = sizeof(f_header), attr_size = 0;
> int err;
>
> - lseek(fd, sizeof(f_header), SEEK_SET);
> + if (write_attrs_after_data && at_exit) {
> + /*
> + * Write features at the end of the file first so that
> + * attributes may come after them.
> + */
> + if (!header->data_offset && header->data_size) {
> + pr_err("File contains data but offset unknown\n");
> + err = -1;
> + goto err_out;
> + }
> + header->feat_offset = header->data_offset + header->data_size;
> + err = perf_header__adds_write(header, evlist, fd, fc);
> + if (err < 0)
> + goto err_out;
> + attr_offset = lseek(fd, 0, SEEK_CUR);
> + } else {
> + lseek(fd, attr_offset, SEEK_SET);
> + }
>
> evlist__for_each_entry(session->evlist, evsel) {
> - evsel->id_offset = lseek(fd, 0, SEEK_CUR);
> - err = do_write(&ff, evsel->core.id, evsel->core.ids * sizeof(u64));
> - if (err < 0) {
> - pr_debug("failed to write perf header\n");
> - free(ff.buf);
> - return err;
> + evsel->id_offset = attr_offset;
> + /* Avoid writing at the end of the file until the session is exiting. */
> + if (!write_attrs_after_data || at_exit) {
> + err = do_write(&ff, evsel->core.id, evsel->core.ids * sizeof(u64));
> + if (err < 0) {
> + pr_debug("failed to write perf header\n");
> + goto err_out;
> + }
> }
> + attr_offset += evsel->core.ids * sizeof(u64);
So in the past we were using lseek(fd, 0, SEEK_CUR) to set the
evsel->id_offset, now you're assuming that do_write will honour the size
parameter, i.e. write evsel->core.ids * sizeof(u64), but:
/* Return: 0 if succeeded, -ERR if failed. */
int do_write(struct feat_fd *ff, const void *buf, size_t size)
{
if (!ff->buf)
return __do_write_fd(ff, buf, size);
return __do_write_buf(ff, buf, size);
}
And then:
static int __do_write_fd(struct feat_fd *ff, const void *buf, size_t size)
{
ssize_t ret = writen(ff->fd, buf, size);
if (ret != (ssize_t)size)
return ret < 0 ? (int)ret : -1;
return 0;
}
I see that writen calls ion that even has a BUG_ON() if it doesn't write
exactly the requested size bytes, I got distracted with __do_write_fd
extra check that ret != size returning ret if not negative...
I.e. your code _should_ be equivalent due to the check in ion(), and
taking that as an assumption you reduce the number of lseek syscalls,
which is a good thing...
I was just trying to see that the !write_attrs_after_data case was
_exactly_ the same as before, which it doesn't look like it is :-\
- Arnaldo
> }
>
> - attr_offset = lseek(ff.fd, 0, SEEK_CUR);
> -
> evlist__for_each_entry(evlist, evsel) {
> if (evsel->core.attr.size < sizeof(evsel->core.attr)) {
> /*
> @@ -3711,40 +3729,46 @@ static int perf_session__do_write_header(struct perf_session *session,
> */
> evsel->core.attr.size = sizeof(evsel->core.attr);
> }
> - f_attr = (struct perf_file_attr){
> - .attr = evsel->core.attr,
> - .ids = {
> - .offset = evsel->id_offset,
> - .size = evsel->core.ids * sizeof(u64),
> + /* Avoid writing at the end of the file until the session is exiting. */
> + if (!write_attrs_after_data || at_exit) {
> + struct perf_file_attr f_attr = {
> + .attr = evsel->core.attr,
> + .ids = {
> + .offset = evsel->id_offset,
> + .size = evsel->core.ids * sizeof(u64),
> + }
> + };
> + err = do_write(&ff, &f_attr, sizeof(f_attr));
> + if (err < 0) {
> + pr_debug("failed to write perf header attribute\n");
> + goto err_out;
> }
> - };
> - err = do_write(&ff, &f_attr, sizeof(f_attr));
> - if (err < 0) {
> - pr_debug("failed to write perf header attribute\n");
> - free(ff.buf);
> - return err;
> }
> + attr_size += sizeof(struct perf_file_attr);
> }
>
> - if (!header->data_offset)
> - header->data_offset = lseek(fd, 0, SEEK_CUR);
> + if (!header->data_offset) {
> + if (write_attrs_after_data)
> + header->data_offset = sizeof(f_header);
> + else
> + header->data_offset = attr_offset + attr_size;
> + }
> header->feat_offset = header->data_offset + header->data_size;
>
> - if (at_exit) {
> + if (!write_attrs_after_data && at_exit) {
> + /* Write features now feat_offset is known. */
> err = perf_header__adds_write(header, evlist, fd, fc);
> - if (err < 0) {
> - free(ff.buf);
> - return err;
> - }
> + if (err < 0)
> + goto err_out;
> }
>
> f_header = (struct perf_file_header){
> .magic = PERF_MAGIC,
> .size = sizeof(f_header),
> - .attr_size = sizeof(f_attr),
> + .attr_size = sizeof(struct perf_file_attr),
> .attrs = {
> .offset = attr_offset,
> - .size = evlist->core.nr_entries * sizeof(f_attr),
> + .size = attr_size,
> },
> .data = {
> .offset = header->data_offset,
> @@ -3757,21 +3781,24 @@ static int perf_session__do_write_header(struct perf_session *session,
>
> lseek(fd, 0, SEEK_SET);
> err = do_write(&ff, &f_header, sizeof(f_header));
> - free(ff.buf);
> if (err < 0) {
> pr_debug("failed to write perf header\n");
> - return err;
> + goto err_out;
> + } else {
> + lseek(fd, 0, SEEK_END);
> + err = 0;
> }
> - lseek(fd, header->data_offset + header->data_size, SEEK_SET);
> -
> - return 0;
> +err_out:
> + free(ff.buf);
> + return err;
> }
>
> int perf_session__write_header(struct perf_session *session,
> struct evlist *evlist,
> int fd, bool at_exit)
> {
> - return perf_session__do_write_header(session, evlist, fd, at_exit, NULL);
> + return perf_session__do_write_header(session, evlist, fd, at_exit, /*fc=*/NULL,
> + /*write_attrs_after_data=*/false);
> }
>
> size_t perf_session__data_offset(const struct evlist *evlist)
> @@ -3793,7 +3820,8 @@ int perf_session__inject_header(struct perf_session *session,
> int fd,
> struct feat_copier *fc)
> {
> - return perf_session__do_write_header(session, evlist, fd, true, fc);
> + return perf_session__do_write_header(session, evlist, fd, true, fc,
> + /*write_attrs_after_data=*/false);
> }
>
> static int perf_header__getbuffer64(struct perf_header *header,
> --
> 2.46.0.295.g3b9ea8a38a-goog
^ permalink raw reply [flat|nested] 18+ messages in thread* Re: [PATCH v1 5/8] perf header: Allow attributes to be written after data
2024-08-29 19:31 ` Arnaldo Carvalho de Melo
@ 2024-08-29 20:12 ` Ian Rogers
2024-08-29 20:58 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 18+ messages in thread
From: Ian Rogers @ 2024-08-29 20:12 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Peter Zijlstra, Ingo Molnar, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Adrian Hunter, Kan Liang,
Nick Terrell, Yanteng Si, Yicong Yang, James Clark,
linux-perf-users, linux-kernel
On Thu, Aug 29, 2024 at 12:31 PM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> On Thu, Aug 29, 2024 at 08:01:51AM -0700, Ian Rogers wrote:
> > With a file, to write data an offset needs to be known. Typically data
> > follows the event attributes in a file. However, if processing a pipe
> > the number of event attributes may not be known. It is convenient in
> > that case to write the attributes after the data. Expand
> > perf_session__do_write_header to allow this when the data offset and
> > size are known.
> >
> > This approach may be useful for more than just taking a pipe file to
> > write into a data file, `perf inject --itrace` will reserve and
> > additional 8kb for attributes, which would be unnecessary if the
> > attributes were written after the data.
> >
> > Signed-off-by: Ian Rogers <irogers@google.com>
> > ---
> > tools/perf/util/header.c | 106 +++++++++++++++++++++++++--------------
> > 1 file changed, 67 insertions(+), 39 deletions(-)
> >
> > diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> > index 65c9086610cb..4eb39463067e 100644
> > --- a/tools/perf/util/header.c
> > +++ b/tools/perf/util/header.c
> > @@ -3676,32 +3676,50 @@ int perf_header__write_pipe(int fd)
> > static int perf_session__do_write_header(struct perf_session *session,
> > struct evlist *evlist,
> > int fd, bool at_exit,
> > - struct feat_copier *fc)
> > + struct feat_copier *fc,
> > + bool write_attrs_after_data)
> > {
> > struct perf_file_header f_header;
> > - struct perf_file_attr f_attr;
> > struct perf_header *header = &session->header;
> > struct evsel *evsel;
> > struct feat_fd ff = {
> > .fd = fd,
> > };
> > - u64 attr_offset;
> > + u64 attr_offset = sizeof(f_header), attr_size = 0;
> > int err;
> >
> > - lseek(fd, sizeof(f_header), SEEK_SET);
> > + if (write_attrs_after_data && at_exit) {
> > + /*
> > + * Write features at the end of the file first so that
> > + * attributes may come after them.
> > + */
> > + if (!header->data_offset && header->data_size) {
> > + pr_err("File contains data but offset unknown\n");
> > + err = -1;
> > + goto err_out;
> > + }
> > + header->feat_offset = header->data_offset + header->data_size;
> > + err = perf_header__adds_write(header, evlist, fd, fc);
> > + if (err < 0)
> > + goto err_out;
> > + attr_offset = lseek(fd, 0, SEEK_CUR);
> > + } else {
> > + lseek(fd, attr_offset, SEEK_SET);
> > + }
> >
> > evlist__for_each_entry(session->evlist, evsel) {
> > - evsel->id_offset = lseek(fd, 0, SEEK_CUR);
> > - err = do_write(&ff, evsel->core.id, evsel->core.ids * sizeof(u64));
> > - if (err < 0) {
> > - pr_debug("failed to write perf header\n");
> > - free(ff.buf);
> > - return err;
> > + evsel->id_offset = attr_offset;
> > + /* Avoid writing at the end of the file until the session is exiting. */
> > + if (!write_attrs_after_data || at_exit) {
> > + err = do_write(&ff, evsel->core.id, evsel->core.ids * sizeof(u64));
> > + if (err < 0) {
> > + pr_debug("failed to write perf header\n");
> > + goto err_out;
> > + }
> > }
> > + attr_offset += evsel->core.ids * sizeof(u64);
>
> So in the past we were using lseek(fd, 0, SEEK_CUR) to set the
> evsel->id_offset, now you're assuming that do_write will honour the size
> parameter, i.e. write evsel->core.ids * sizeof(u64), but:
>
> /* Return: 0 if succeeded, -ERR if failed. */
> int do_write(struct feat_fd *ff, const void *buf, size_t size)
> {
> if (!ff->buf)
> return __do_write_fd(ff, buf, size);
> return __do_write_buf(ff, buf, size);
> }
>
> And then:
>
> static int __do_write_fd(struct feat_fd *ff, const void *buf, size_t size)
> {
> ssize_t ret = writen(ff->fd, buf, size);
>
> if (ret != (ssize_t)size)
> return ret < 0 ? (int)ret : -1;
> return 0;
> }
>
> I see that writen calls ion that even has a BUG_ON() if it doesn't write
> exactly the requested size bytes, I got distracted with __do_write_fd
> extra check that ret != size returning ret if not negative...
>
> I.e. your code _should_ be equivalent due to the check in ion(), and
> taking that as an assumption you reduce the number of lseek syscalls,
> which is a good thing...
>
> I was just trying to see that the !write_attrs_after_data case was
> _exactly_ the same as before, which it doesn't look like it is :-\
I'm not seeing the difference. Before:
```
lseek(fd, sizeof(f_header), SEEK_SET);
evlist__for_each_entry(session->evlist, evsel) {
evsel->id_offset = lseek(fd, 0, SEEK_CUR); // Initially at sizeof(f_header)
...
err = do_write(&ff, evsel->core.id, evsel->core.ids *
sizeof(u64)); // offset has advanced by "evsel->core.ids *
sizeof(u64)" bytes
...
}
```
After:
```
u64 attr_offset = sizeof(f_header)
...
else {
lseek(fd, attr_offset, SEEK_SET);
}
evlist__for_each_entry(session->evlist, evsel) {
evsel->id_offset = attr_offset;
...
err = do_write(&ff, evsel->core.id, evsel->core.ids * sizeof(u64));
...
attr_offset += evsel->core.ids * sizeof(u64);
}
```
The reason for using math rather than lseek (apart from reducing
syscalls) is that if we're writing the header at the beginning of
generating a file (usually done more to compute the
header->data_offset), we can't use lseek if writing at the end as we
don't know where the end of the data will be yet and don't write out
the bytes.
Thanks,
Ian
> - Arnaldo
>
> > }
> >
> > - attr_offset = lseek(ff.fd, 0, SEEK_CUR);
> > -
> > evlist__for_each_entry(evlist, evsel) {
> > if (evsel->core.attr.size < sizeof(evsel->core.attr)) {
> > /*
> > @@ -3711,40 +3729,46 @@ static int perf_session__do_write_header(struct perf_session *session,
> > */
> > evsel->core.attr.size = sizeof(evsel->core.attr);
> > }
> > - f_attr = (struct perf_file_attr){
> > - .attr = evsel->core.attr,
> > - .ids = {
> > - .offset = evsel->id_offset,
> > - .size = evsel->core.ids * sizeof(u64),
> > + /* Avoid writing at the end of the file until the session is exiting. */
> > + if (!write_attrs_after_data || at_exit) {
> > + struct perf_file_attr f_attr = {
> > + .attr = evsel->core.attr,
> > + .ids = {
> > + .offset = evsel->id_offset,
> > + .size = evsel->core.ids * sizeof(u64),
> > + }
> > + };
> > + err = do_write(&ff, &f_attr, sizeof(f_attr));
> > + if (err < 0) {
> > + pr_debug("failed to write perf header attribute\n");
> > + goto err_out;
> > }
> > - };
> > - err = do_write(&ff, &f_attr, sizeof(f_attr));
> > - if (err < 0) {
> > - pr_debug("failed to write perf header attribute\n");
> > - free(ff.buf);
> > - return err;
> > }
> > + attr_size += sizeof(struct perf_file_attr);
> > }
> >
> > - if (!header->data_offset)
> > - header->data_offset = lseek(fd, 0, SEEK_CUR);
> > + if (!header->data_offset) {
> > + if (write_attrs_after_data)
> > + header->data_offset = sizeof(f_header);
> > + else
> > + header->data_offset = attr_offset + attr_size;
> > + }
> > header->feat_offset = header->data_offset + header->data_size;
> >
> > - if (at_exit) {
> > + if (!write_attrs_after_data && at_exit) {
> > + /* Write features now feat_offset is known. */
> > err = perf_header__adds_write(header, evlist, fd, fc);
> > - if (err < 0) {
> > - free(ff.buf);
> > - return err;
> > - }
> > + if (err < 0)
> > + goto err_out;
> > }
> >
> > f_header = (struct perf_file_header){
> > .magic = PERF_MAGIC,
> > .size = sizeof(f_header),
> > - .attr_size = sizeof(f_attr),
> > + .attr_size = sizeof(struct perf_file_attr),
> > .attrs = {
> > .offset = attr_offset,
> > - .size = evlist->core.nr_entries * sizeof(f_attr),
> > + .size = attr_size,
> > },
> > .data = {
> > .offset = header->data_offset,
> > @@ -3757,21 +3781,24 @@ static int perf_session__do_write_header(struct perf_session *session,
> >
> > lseek(fd, 0, SEEK_SET);
> > err = do_write(&ff, &f_header, sizeof(f_header));
> > - free(ff.buf);
> > if (err < 0) {
> > pr_debug("failed to write perf header\n");
> > - return err;
> > + goto err_out;
> > + } else {
> > + lseek(fd, 0, SEEK_END);
> > + err = 0;
> > }
> > - lseek(fd, header->data_offset + header->data_size, SEEK_SET);
> > -
> > - return 0;
> > +err_out:
> > + free(ff.buf);
> > + return err;
> > }
> >
> > int perf_session__write_header(struct perf_session *session,
> > struct evlist *evlist,
> > int fd, bool at_exit)
> > {
> > - return perf_session__do_write_header(session, evlist, fd, at_exit, NULL);
> > + return perf_session__do_write_header(session, evlist, fd, at_exit, /*fc=*/NULL,
> > + /*write_attrs_after_data=*/false);
> > }
> >
> > size_t perf_session__data_offset(const struct evlist *evlist)
> > @@ -3793,7 +3820,8 @@ int perf_session__inject_header(struct perf_session *session,
> > int fd,
> > struct feat_copier *fc)
> > {
> > - return perf_session__do_write_header(session, evlist, fd, true, fc);
> > + return perf_session__do_write_header(session, evlist, fd, true, fc,
> > + /*write_attrs_after_data=*/false);
> > }
> >
> > static int perf_header__getbuffer64(struct perf_header *header,
> > --
> > 2.46.0.295.g3b9ea8a38a-goog
^ permalink raw reply [flat|nested] 18+ messages in thread* Re: [PATCH v1 5/8] perf header: Allow attributes to be written after data
2024-08-29 20:12 ` Ian Rogers
@ 2024-08-29 20:58 ` Arnaldo Carvalho de Melo
2024-08-29 21:42 ` Ian Rogers
0 siblings, 1 reply; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-08-29 20:58 UTC (permalink / raw)
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Adrian Hunter, Kan Liang,
Nick Terrell, Yanteng Si, Yicong Yang, James Clark,
linux-perf-users, linux-kernel
On Thu, Aug 29, 2024 at 01:12:32PM -0700, Ian Rogers wrote:
> On Thu, Aug 29, 2024 at 12:31 PM Arnaldo Carvalho de Melo
> <acme@kernel.org> wrote:
> >
> > On Thu, Aug 29, 2024 at 08:01:51AM -0700, Ian Rogers wrote:
> > > With a file, to write data an offset needs to be known. Typically data
> > > follows the event attributes in a file. However, if processing a pipe
> > > the number of event attributes may not be known. It is convenient in
> > > that case to write the attributes after the data. Expand
> > > perf_session__do_write_header to allow this when the data offset and
> > > size are known.
> > >
> > > This approach may be useful for more than just taking a pipe file to
> > > write into a data file, `perf inject --itrace` will reserve and
> > > additional 8kb for attributes, which would be unnecessary if the
> > > attributes were written after the data.
> > >
> > > Signed-off-by: Ian Rogers <irogers@google.com>
> > > ---
> > > tools/perf/util/header.c | 106 +++++++++++++++++++++++++--------------
> > > 1 file changed, 67 insertions(+), 39 deletions(-)
> > >
> > > diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> > > index 65c9086610cb..4eb39463067e 100644
> > > --- a/tools/perf/util/header.c
> > > +++ b/tools/perf/util/header.c
> > > @@ -3676,32 +3676,50 @@ int perf_header__write_pipe(int fd)
> > > static int perf_session__do_write_header(struct perf_session *session,
> > > struct evlist *evlist,
> > > int fd, bool at_exit,
> > > - struct feat_copier *fc)
> > > + struct feat_copier *fc,
> > > + bool write_attrs_after_data)
> > > {
> > > struct perf_file_header f_header;
> > > - struct perf_file_attr f_attr;
> > > struct perf_header *header = &session->header;
> > > struct evsel *evsel;
> > > struct feat_fd ff = {
> > > .fd = fd,
> > > };
> > > - u64 attr_offset;
> > > + u64 attr_offset = sizeof(f_header), attr_size = 0;
> > > int err;
> > >
> > > - lseek(fd, sizeof(f_header), SEEK_SET);
> > > + if (write_attrs_after_data && at_exit) {
> > > + /*
> > > + * Write features at the end of the file first so that
> > > + * attributes may come after them.
> > > + */
> > > + if (!header->data_offset && header->data_size) {
> > > + pr_err("File contains data but offset unknown\n");
> > > + err = -1;
> > > + goto err_out;
> > > + }
> > > + header->feat_offset = header->data_offset + header->data_size;
> > > + err = perf_header__adds_write(header, evlist, fd, fc);
> > > + if (err < 0)
> > > + goto err_out;
> > > + attr_offset = lseek(fd, 0, SEEK_CUR);
> > > + } else {
> > > + lseek(fd, attr_offset, SEEK_SET);
> > > + }
> > >
> > > evlist__for_each_entry(session->evlist, evsel) {
> > > - evsel->id_offset = lseek(fd, 0, SEEK_CUR);
> > > - err = do_write(&ff, evsel->core.id, evsel->core.ids * sizeof(u64));
> > > - if (err < 0) {
> > > - pr_debug("failed to write perf header\n");
> > > - free(ff.buf);
> > > - return err;
> > > + evsel->id_offset = attr_offset;
> > > + /* Avoid writing at the end of the file until the session is exiting. */
> > > + if (!write_attrs_after_data || at_exit) {
> > > + err = do_write(&ff, evsel->core.id, evsel->core.ids * sizeof(u64));
> > > + if (err < 0) {
> > > + pr_debug("failed to write perf header\n");
> > > + goto err_out;
> > > + }
> > > }
> > > + attr_offset += evsel->core.ids * sizeof(u64);
> >
> > So in the past we were using lseek(fd, 0, SEEK_CUR) to set the
> > evsel->id_offset, now you're assuming that do_write will honour the size
> > parameter, i.e. write evsel->core.ids * sizeof(u64), but:
> >
> > /* Return: 0 if succeeded, -ERR if failed. */
> > int do_write(struct feat_fd *ff, const void *buf, size_t size)
> > {
> > if (!ff->buf)
> > return __do_write_fd(ff, buf, size);
> > return __do_write_buf(ff, buf, size);
> > }
> >
> > And then:
> >
> > static int __do_write_fd(struct feat_fd *ff, const void *buf, size_t size)
> > {
> > ssize_t ret = writen(ff->fd, buf, size);
> >
> > if (ret != (ssize_t)size)
> > return ret < 0 ? (int)ret : -1;
> > return 0;
> > }
> >
> > I see that writen calls ion that even has a BUG_ON() if it doesn't write
> > exactly the requested size bytes, I got distracted with __do_write_fd
> > extra check that ret != size returning ret if not negative...
> >
> > I.e. your code _should_ be equivalent due to the check in ion(), and
> > taking that as an assumption you reduce the number of lseek syscalls,
> > which is a good thing...
> >
> > I was just trying to see that the !write_attrs_after_data case was
> > _exactly_ the same as before, which it doesn't look like it is :-\
>
> I'm not seeing the difference. Before:
You noticed the difference: before we used lseek to get the current
offset to use, afterwards we moved to doing plain math.
So I had to check if we could assume that, and with the current code
structure, yes, we can assume that, so seems safe, but it is different
and if the assumption somehow breaks, as the code in __do_write_fd()
guard against (unneeded at the moment as ion has even a BUG_ON for that
not to happen), then the offset will not be where the data is.
Using lseek() is more costly (syscalls) but it is the ultimate answer to
get where in the file the current offset is.
So that is the difference I noticed.
Doing multiple things in the same patch causes these reviewing delays,
doubts, its something we discussed multiple times in the past, and that
continue to cause these discussions.
- Arnaldo
> ```
> lseek(fd, sizeof(f_header), SEEK_SET);
> evlist__for_each_entry(session->evlist, evsel) {
> evsel->id_offset = lseek(fd, 0, SEEK_CUR); // Initially at sizeof(f_header)
> ...
> err = do_write(&ff, evsel->core.id, evsel->core.ids *
> sizeof(u64)); // offset has advanced by "evsel->core.ids *
> sizeof(u64)" bytes
> ...
> }
> ```
> After:
> ```
> u64 attr_offset = sizeof(f_header)
> ...
> else {
> lseek(fd, attr_offset, SEEK_SET);
> }
>
> evlist__for_each_entry(session->evlist, evsel) {
> evsel->id_offset = attr_offset;
> ...
> err = do_write(&ff, evsel->core.id, evsel->core.ids * sizeof(u64));
> ...
> attr_offset += evsel->core.ids * sizeof(u64);
> }
> ```
>
> The reason for using math rather than lseek (apart from reducing
> syscalls) is that if we're writing the header at the beginning of
> generating a file (usually done more to compute the
> header->data_offset), we can't use lseek if writing at the end as we
> don't know where the end of the data will be yet and don't write out
> the bytes.
>
> Thanks,
> Ian
>
> > - Arnaldo
> >
> > > }
> > >
> > > - attr_offset = lseek(ff.fd, 0, SEEK_CUR);
> > > -
> > > evlist__for_each_entry(evlist, evsel) {
> > > if (evsel->core.attr.size < sizeof(evsel->core.attr)) {
> > > /*
> > > @@ -3711,40 +3729,46 @@ static int perf_session__do_write_header(struct perf_session *session,
> > > */
> > > evsel->core.attr.size = sizeof(evsel->core.attr);
> > > }
> > > - f_attr = (struct perf_file_attr){
> > > - .attr = evsel->core.attr,
> > > - .ids = {
> > > - .offset = evsel->id_offset,
> > > - .size = evsel->core.ids * sizeof(u64),
> > > + /* Avoid writing at the end of the file until the session is exiting. */
> > > + if (!write_attrs_after_data || at_exit) {
> > > + struct perf_file_attr f_attr = {
> > > + .attr = evsel->core.attr,
> > > + .ids = {
> > > + .offset = evsel->id_offset,
> > > + .size = evsel->core.ids * sizeof(u64),
> > > + }
> > > + };
> > > + err = do_write(&ff, &f_attr, sizeof(f_attr));
> > > + if (err < 0) {
> > > + pr_debug("failed to write perf header attribute\n");
> > > + goto err_out;
> > > }
> > > - };
> > > - err = do_write(&ff, &f_attr, sizeof(f_attr));
> > > - if (err < 0) {
> > > - pr_debug("failed to write perf header attribute\n");
> > > - free(ff.buf);
> > > - return err;
> > > }
> > > + attr_size += sizeof(struct perf_file_attr);
> > > }
> > >
> > > - if (!header->data_offset)
> > > - header->data_offset = lseek(fd, 0, SEEK_CUR);
> > > + if (!header->data_offset) {
> > > + if (write_attrs_after_data)
> > > + header->data_offset = sizeof(f_header);
> > > + else
> > > + header->data_offset = attr_offset + attr_size;
> > > + }
> > > header->feat_offset = header->data_offset + header->data_size;
> > >
> > > - if (at_exit) {
> > > + if (!write_attrs_after_data && at_exit) {
> > > + /* Write features now feat_offset is known. */
> > > err = perf_header__adds_write(header, evlist, fd, fc);
> > > - if (err < 0) {
> > > - free(ff.buf);
> > > - return err;
> > > - }
> > > + if (err < 0)
> > > + goto err_out;
> > > }
> > >
> > > f_header = (struct perf_file_header){
> > > .magic = PERF_MAGIC,
> > > .size = sizeof(f_header),
> > > - .attr_size = sizeof(f_attr),
> > > + .attr_size = sizeof(struct perf_file_attr),
> > > .attrs = {
> > > .offset = attr_offset,
> > > - .size = evlist->core.nr_entries * sizeof(f_attr),
> > > + .size = attr_size,
> > > },
> > > .data = {
> > > .offset = header->data_offset,
> > > @@ -3757,21 +3781,24 @@ static int perf_session__do_write_header(struct perf_session *session,
> > >
> > > lseek(fd, 0, SEEK_SET);
> > > err = do_write(&ff, &f_header, sizeof(f_header));
> > > - free(ff.buf);
> > > if (err < 0) {
> > > pr_debug("failed to write perf header\n");
> > > - return err;
> > > + goto err_out;
> > > + } else {
> > > + lseek(fd, 0, SEEK_END);
> > > + err = 0;
> > > }
> > > - lseek(fd, header->data_offset + header->data_size, SEEK_SET);
> > > -
> > > - return 0;
> > > +err_out:
> > > + free(ff.buf);
> > > + return err;
> > > }
> > >
> > > int perf_session__write_header(struct perf_session *session,
> > > struct evlist *evlist,
> > > int fd, bool at_exit)
> > > {
> > > - return perf_session__do_write_header(session, evlist, fd, at_exit, NULL);
> > > + return perf_session__do_write_header(session, evlist, fd, at_exit, /*fc=*/NULL,
> > > + /*write_attrs_after_data=*/false);
> > > }
> > >
> > > size_t perf_session__data_offset(const struct evlist *evlist)
> > > @@ -3793,7 +3820,8 @@ int perf_session__inject_header(struct perf_session *session,
> > > int fd,
> > > struct feat_copier *fc)
> > > {
> > > - return perf_session__do_write_header(session, evlist, fd, true, fc);
> > > + return perf_session__do_write_header(session, evlist, fd, true, fc,
> > > + /*write_attrs_after_data=*/false);
> > > }
> > >
> > > static int perf_header__getbuffer64(struct perf_header *header,
> > > --
> > > 2.46.0.295.g3b9ea8a38a-goog
^ permalink raw reply [flat|nested] 18+ messages in thread* Re: [PATCH v1 5/8] perf header: Allow attributes to be written after data
2024-08-29 20:58 ` Arnaldo Carvalho de Melo
@ 2024-08-29 21:42 ` Ian Rogers
2024-08-30 4:39 ` Namhyung Kim
2024-08-30 12:19 ` Arnaldo Carvalho de Melo
0 siblings, 2 replies; 18+ messages in thread
From: Ian Rogers @ 2024-08-29 21:42 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Peter Zijlstra, Ingo Molnar, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Adrian Hunter, Kan Liang,
Nick Terrell, Yanteng Si, Yicong Yang, James Clark,
linux-perf-users, linux-kernel
On Thu, Aug 29, 2024 at 1:58 PM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> On Thu, Aug 29, 2024 at 01:12:32PM -0700, Ian Rogers wrote:
> > On Thu, Aug 29, 2024 at 12:31 PM Arnaldo Carvalho de Melo
> > <acme@kernel.org> wrote:
> > >
> > > On Thu, Aug 29, 2024 at 08:01:51AM -0700, Ian Rogers wrote:
> > > > With a file, to write data an offset needs to be known. Typically data
> > > > follows the event attributes in a file. However, if processing a pipe
> > > > the number of event attributes may not be known. It is convenient in
> > > > that case to write the attributes after the data. Expand
> > > > perf_session__do_write_header to allow this when the data offset and
> > > > size are known.
> > > >
> > > > This approach may be useful for more than just taking a pipe file to
> > > > write into a data file, `perf inject --itrace` will reserve and
> > > > additional 8kb for attributes, which would be unnecessary if the
> > > > attributes were written after the data.
> > > >
> > > > Signed-off-by: Ian Rogers <irogers@google.com>
> > > > ---
> > > > tools/perf/util/header.c | 106 +++++++++++++++++++++++++--------------
> > > > 1 file changed, 67 insertions(+), 39 deletions(-)
> > > >
> > > > diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> > > > index 65c9086610cb..4eb39463067e 100644
> > > > --- a/tools/perf/util/header.c
> > > > +++ b/tools/perf/util/header.c
> > > > @@ -3676,32 +3676,50 @@ int perf_header__write_pipe(int fd)
> > > > static int perf_session__do_write_header(struct perf_session *session,
> > > > struct evlist *evlist,
> > > > int fd, bool at_exit,
> > > > - struct feat_copier *fc)
> > > > + struct feat_copier *fc,
> > > > + bool write_attrs_after_data)
> > > > {
> > > > struct perf_file_header f_header;
> > > > - struct perf_file_attr f_attr;
> > > > struct perf_header *header = &session->header;
> > > > struct evsel *evsel;
> > > > struct feat_fd ff = {
> > > > .fd = fd,
> > > > };
> > > > - u64 attr_offset;
> > > > + u64 attr_offset = sizeof(f_header), attr_size = 0;
> > > > int err;
> > > >
> > > > - lseek(fd, sizeof(f_header), SEEK_SET);
> > > > + if (write_attrs_after_data && at_exit) {
> > > > + /*
> > > > + * Write features at the end of the file first so that
> > > > + * attributes may come after them.
> > > > + */
> > > > + if (!header->data_offset && header->data_size) {
> > > > + pr_err("File contains data but offset unknown\n");
> > > > + err = -1;
> > > > + goto err_out;
> > > > + }
> > > > + header->feat_offset = header->data_offset + header->data_size;
> > > > + err = perf_header__adds_write(header, evlist, fd, fc);
> > > > + if (err < 0)
> > > > + goto err_out;
> > > > + attr_offset = lseek(fd, 0, SEEK_CUR);
> > > > + } else {
> > > > + lseek(fd, attr_offset, SEEK_SET);
> > > > + }
> > > >
> > > > evlist__for_each_entry(session->evlist, evsel) {
> > > > - evsel->id_offset = lseek(fd, 0, SEEK_CUR);
> > > > - err = do_write(&ff, evsel->core.id, evsel->core.ids * sizeof(u64));
> > > > - if (err < 0) {
> > > > - pr_debug("failed to write perf header\n");
> > > > - free(ff.buf);
> > > > - return err;
> > > > + evsel->id_offset = attr_offset;
> > > > + /* Avoid writing at the end of the file until the session is exiting. */
> > > > + if (!write_attrs_after_data || at_exit) {
> > > > + err = do_write(&ff, evsel->core.id, evsel->core.ids * sizeof(u64));
> > > > + if (err < 0) {
> > > > + pr_debug("failed to write perf header\n");
> > > > + goto err_out;
> > > > + }
> > > > }
> > > > + attr_offset += evsel->core.ids * sizeof(u64);
> > >
> > > So in the past we were using lseek(fd, 0, SEEK_CUR) to set the
> > > evsel->id_offset, now you're assuming that do_write will honour the size
> > > parameter, i.e. write evsel->core.ids * sizeof(u64), but:
> > >
> > > /* Return: 0 if succeeded, -ERR if failed. */
> > > int do_write(struct feat_fd *ff, const void *buf, size_t size)
> > > {
> > > if (!ff->buf)
> > > return __do_write_fd(ff, buf, size);
> > > return __do_write_buf(ff, buf, size);
> > > }
> > >
> > > And then:
> > >
> > > static int __do_write_fd(struct feat_fd *ff, const void *buf, size_t size)
> > > {
> > > ssize_t ret = writen(ff->fd, buf, size);
> > >
> > > if (ret != (ssize_t)size)
> > > return ret < 0 ? (int)ret : -1;
> > > return 0;
> > > }
> > >
> > > I see that writen calls ion that even has a BUG_ON() if it doesn't write
> > > exactly the requested size bytes, I got distracted with __do_write_fd
> > > extra check that ret != size returning ret if not negative...
> > >
> > > I.e. your code _should_ be equivalent due to the check in ion(), and
> > > taking that as an assumption you reduce the number of lseek syscalls,
> > > which is a good thing...
> > >
> > > I was just trying to see that the !write_attrs_after_data case was
> > > _exactly_ the same as before, which it doesn't look like it is :-\
> >
> > I'm not seeing the difference. Before:
>
> You noticed the difference: before we used lseek to get the current
> offset to use, afterwards we moved to doing plain math.
>
> So I had to check if we could assume that, and with the current code
> structure, yes, we can assume that, so seems safe, but it is different
> and if the assumption somehow breaks, as the code in __do_write_fd()
> guard against (unneeded at the moment as ion has even a BUG_ON for that
> not to happen), then the offset will not be where the data is.
>
> Using lseek() is more costly (syscalls) but it is the ultimate answer to
> get where in the file the current offset is.
>
> So that is the difference I noticed.
>
> Doing multiple things in the same patch causes these reviewing delays,
> doubts, its something we discussed multiple times in the past, and that
> continue to cause these discussions.
Right, but it is something of an unfortunate coincidence of how the
code is structured. The fact that writing the header updates
data_offset which is a thing that other things depend upon while
depending on its value itself, etc. - ie the function does more than
just a write, it also sometimes computes the layout, has inbuilt
assumptions on the values lseek will return, and so on. To get to this
final structure took a fair few iterations and I've separated this
change out from the bulk in the next change to keep the patch size
down. I could have done a patch switching from lseeks to math, then a
patch to add write_attrs_after_data. It probably would have yielded
about 4 lines of shared code, more lines that would have been deleted,
while creating quite a bit of work for me. Ideally when these
functions were created there would have been far more liberal use of
things like immutability, so side-effects are minimized. Yes I could
refactor everything, but time..
Thanks,
Ian
> - Arnaldo
>
> > ```
> > lseek(fd, sizeof(f_header), SEEK_SET);
> > evlist__for_each_entry(session->evlist, evsel) {
> > evsel->id_offset = lseek(fd, 0, SEEK_CUR); // Initially at sizeof(f_header)
> > ...
> > err = do_write(&ff, evsel->core.id, evsel->core.ids *
> > sizeof(u64)); // offset has advanced by "evsel->core.ids *
> > sizeof(u64)" bytes
> > ...
> > }
> > ```
> > After:
> > ```
> > u64 attr_offset = sizeof(f_header)
> > ...
> > else {
> > lseek(fd, attr_offset, SEEK_SET);
> > }
> >
> > evlist__for_each_entry(session->evlist, evsel) {
> > evsel->id_offset = attr_offset;
> > ...
> > err = do_write(&ff, evsel->core.id, evsel->core.ids * sizeof(u64));
> > ...
> > attr_offset += evsel->core.ids * sizeof(u64);
> > }
> > ```
> >
> > The reason for using math rather than lseek (apart from reducing
> > syscalls) is that if we're writing the header at the beginning of
> > generating a file (usually done more to compute the
> > header->data_offset), we can't use lseek if writing at the end as we
> > don't know where the end of the data will be yet and don't write out
> > the bytes.
> >
> > Thanks,
> > Ian
> >
> > > - Arnaldo
> > >
> > > > }
> > > >
> > > > - attr_offset = lseek(ff.fd, 0, SEEK_CUR);
> > > > -
> > > > evlist__for_each_entry(evlist, evsel) {
> > > > if (evsel->core.attr.size < sizeof(evsel->core.attr)) {
> > > > /*
> > > > @@ -3711,40 +3729,46 @@ static int perf_session__do_write_header(struct perf_session *session,
> > > > */
> > > > evsel->core.attr.size = sizeof(evsel->core.attr);
> > > > }
> > > > - f_attr = (struct perf_file_attr){
> > > > - .attr = evsel->core.attr,
> > > > - .ids = {
> > > > - .offset = evsel->id_offset,
> > > > - .size = evsel->core.ids * sizeof(u64),
> > > > + /* Avoid writing at the end of the file until the session is exiting. */
> > > > + if (!write_attrs_after_data || at_exit) {
> > > > + struct perf_file_attr f_attr = {
> > > > + .attr = evsel->core.attr,
> > > > + .ids = {
> > > > + .offset = evsel->id_offset,
> > > > + .size = evsel->core.ids * sizeof(u64),
> > > > + }
> > > > + };
> > > > + err = do_write(&ff, &f_attr, sizeof(f_attr));
> > > > + if (err < 0) {
> > > > + pr_debug("failed to write perf header attribute\n");
> > > > + goto err_out;
> > > > }
> > > > - };
> > > > - err = do_write(&ff, &f_attr, sizeof(f_attr));
> > > > - if (err < 0) {
> > > > - pr_debug("failed to write perf header attribute\n");
> > > > - free(ff.buf);
> > > > - return err;
> > > > }
> > > > + attr_size += sizeof(struct perf_file_attr);
> > > > }
> > > >
> > > > - if (!header->data_offset)
> > > > - header->data_offset = lseek(fd, 0, SEEK_CUR);
> > > > + if (!header->data_offset) {
> > > > + if (write_attrs_after_data)
> > > > + header->data_offset = sizeof(f_header);
> > > > + else
> > > > + header->data_offset = attr_offset + attr_size;
> > > > + }
> > > > header->feat_offset = header->data_offset + header->data_size;
> > > >
> > > > - if (at_exit) {
> > > > + if (!write_attrs_after_data && at_exit) {
> > > > + /* Write features now feat_offset is known. */
> > > > err = perf_header__adds_write(header, evlist, fd, fc);
> > > > - if (err < 0) {
> > > > - free(ff.buf);
> > > > - return err;
> > > > - }
> > > > + if (err < 0)
> > > > + goto err_out;
> > > > }
> > > >
> > > > f_header = (struct perf_file_header){
> > > > .magic = PERF_MAGIC,
> > > > .size = sizeof(f_header),
> > > > - .attr_size = sizeof(f_attr),
> > > > + .attr_size = sizeof(struct perf_file_attr),
> > > > .attrs = {
> > > > .offset = attr_offset,
> > > > - .size = evlist->core.nr_entries * sizeof(f_attr),
> > > > + .size = attr_size,
> > > > },
> > > > .data = {
> > > > .offset = header->data_offset,
> > > > @@ -3757,21 +3781,24 @@ static int perf_session__do_write_header(struct perf_session *session,
> > > >
> > > > lseek(fd, 0, SEEK_SET);
> > > > err = do_write(&ff, &f_header, sizeof(f_header));
> > > > - free(ff.buf);
> > > > if (err < 0) {
> > > > pr_debug("failed to write perf header\n");
> > > > - return err;
> > > > + goto err_out;
> > > > + } else {
> > > > + lseek(fd, 0, SEEK_END);
> > > > + err = 0;
> > > > }
> > > > - lseek(fd, header->data_offset + header->data_size, SEEK_SET);
> > > > -
> > > > - return 0;
> > > > +err_out:
> > > > + free(ff.buf);
> > > > + return err;
> > > > }
> > > >
> > > > int perf_session__write_header(struct perf_session *session,
> > > > struct evlist *evlist,
> > > > int fd, bool at_exit)
> > > > {
> > > > - return perf_session__do_write_header(session, evlist, fd, at_exit, NULL);
> > > > + return perf_session__do_write_header(session, evlist, fd, at_exit, /*fc=*/NULL,
> > > > + /*write_attrs_after_data=*/false);
> > > > }
> > > >
> > > > size_t perf_session__data_offset(const struct evlist *evlist)
> > > > @@ -3793,7 +3820,8 @@ int perf_session__inject_header(struct perf_session *session,
> > > > int fd,
> > > > struct feat_copier *fc)
> > > > {
> > > > - return perf_session__do_write_header(session, evlist, fd, true, fc);
> > > > + return perf_session__do_write_header(session, evlist, fd, true, fc,
> > > > + /*write_attrs_after_data=*/false);
> > > > }
> > > >
> > > > static int perf_header__getbuffer64(struct perf_header *header,
> > > > --
> > > > 2.46.0.295.g3b9ea8a38a-goog
^ permalink raw reply [flat|nested] 18+ messages in thread* Re: [PATCH v1 5/8] perf header: Allow attributes to be written after data
2024-08-29 21:42 ` Ian Rogers
@ 2024-08-30 4:39 ` Namhyung Kim
2024-08-30 5:03 ` Ian Rogers
2024-08-30 12:19 ` Arnaldo Carvalho de Melo
1 sibling, 1 reply; 18+ messages in thread
From: Namhyung Kim @ 2024-08-30 4:39 UTC (permalink / raw)
To: Ian Rogers
Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
Kan Liang, Nick Terrell, Yanteng Si, Yicong Yang, James Clark,
linux-perf-users, linux-kernel
On Thu, Aug 29, 2024 at 02:42:38PM -0700, Ian Rogers wrote:
> On Thu, Aug 29, 2024 at 1:58 PM Arnaldo Carvalho de Melo
> <acme@kernel.org> wrote:
> >
> > On Thu, Aug 29, 2024 at 01:12:32PM -0700, Ian Rogers wrote:
> > > On Thu, Aug 29, 2024 at 12:31 PM Arnaldo Carvalho de Melo
> > > <acme@kernel.org> wrote:
> > > >
> > > > On Thu, Aug 29, 2024 at 08:01:51AM -0700, Ian Rogers wrote:
> > > > > With a file, to write data an offset needs to be known. Typically data
> > > > > follows the event attributes in a file. However, if processing a pipe
> > > > > the number of event attributes may not be known. It is convenient in
> > > > > that case to write the attributes after the data. Expand
> > > > > perf_session__do_write_header to allow this when the data offset and
> > > > > size are known.
> > > > >
> > > > > This approach may be useful for more than just taking a pipe file to
> > > > > write into a data file, `perf inject --itrace` will reserve and
> > > > > additional 8kb for attributes, which would be unnecessary if the
> > > > > attributes were written after the data.
> > > > >
> > > > > Signed-off-by: Ian Rogers <irogers@google.com>
> > > > > ---
> > > > > tools/perf/util/header.c | 106 +++++++++++++++++++++++++--------------
> > > > > 1 file changed, 67 insertions(+), 39 deletions(-)
> > > > >
> > > > > diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> > > > > index 65c9086610cb..4eb39463067e 100644
> > > > > --- a/tools/perf/util/header.c
> > > > > +++ b/tools/perf/util/header.c
> > > > > @@ -3676,32 +3676,50 @@ int perf_header__write_pipe(int fd)
> > > > > static int perf_session__do_write_header(struct perf_session *session,
> > > > > struct evlist *evlist,
> > > > > int fd, bool at_exit,
> > > > > - struct feat_copier *fc)
> > > > > + struct feat_copier *fc,
> > > > > + bool write_attrs_after_data)
> > > > > {
> > > > > struct perf_file_header f_header;
> > > > > - struct perf_file_attr f_attr;
> > > > > struct perf_header *header = &session->header;
> > > > > struct evsel *evsel;
> > > > > struct feat_fd ff = {
> > > > > .fd = fd,
> > > > > };
> > > > > - u64 attr_offset;
> > > > > + u64 attr_offset = sizeof(f_header), attr_size = 0;
> > > > > int err;
> > > > >
> > > > > - lseek(fd, sizeof(f_header), SEEK_SET);
> > > > > + if (write_attrs_after_data && at_exit) {
> > > > > + /*
> > > > > + * Write features at the end of the file first so that
> > > > > + * attributes may come after them.
> > > > > + */
> > > > > + if (!header->data_offset && header->data_size) {
> > > > > + pr_err("File contains data but offset unknown\n");
> > > > > + err = -1;
> > > > > + goto err_out;
> > > > > + }
> > > > > + header->feat_offset = header->data_offset + header->data_size;
> > > > > + err = perf_header__adds_write(header, evlist, fd, fc);
> > > > > + if (err < 0)
> > > > > + goto err_out;
> > > > > + attr_offset = lseek(fd, 0, SEEK_CUR);
> > > > > + } else {
> > > > > + lseek(fd, attr_offset, SEEK_SET);
> > > > > + }
> > > > >
> > > > > evlist__for_each_entry(session->evlist, evsel) {
> > > > > - evsel->id_offset = lseek(fd, 0, SEEK_CUR);
> > > > > - err = do_write(&ff, evsel->core.id, evsel->core.ids * sizeof(u64));
> > > > > - if (err < 0) {
> > > > > - pr_debug("failed to write perf header\n");
> > > > > - free(ff.buf);
> > > > > - return err;
> > > > > + evsel->id_offset = attr_offset;
> > > > > + /* Avoid writing at the end of the file until the session is exiting. */
> > > > > + if (!write_attrs_after_data || at_exit) {
> > > > > + err = do_write(&ff, evsel->core.id, evsel->core.ids * sizeof(u64));
> > > > > + if (err < 0) {
> > > > > + pr_debug("failed to write perf header\n");
> > > > > + goto err_out;
> > > > > + }
> > > > > }
> > > > > + attr_offset += evsel->core.ids * sizeof(u64);
> > > >
> > > > So in the past we were using lseek(fd, 0, SEEK_CUR) to set the
> > > > evsel->id_offset, now you're assuming that do_write will honour the size
> > > > parameter, i.e. write evsel->core.ids * sizeof(u64), but:
> > > >
> > > > /* Return: 0 if succeeded, -ERR if failed. */
> > > > int do_write(struct feat_fd *ff, const void *buf, size_t size)
> > > > {
> > > > if (!ff->buf)
> > > > return __do_write_fd(ff, buf, size);
> > > > return __do_write_buf(ff, buf, size);
> > > > }
> > > >
> > > > And then:
> > > >
> > > > static int __do_write_fd(struct feat_fd *ff, const void *buf, size_t size)
> > > > {
> > > > ssize_t ret = writen(ff->fd, buf, size);
> > > >
> > > > if (ret != (ssize_t)size)
> > > > return ret < 0 ? (int)ret : -1;
> > > > return 0;
> > > > }
> > > >
> > > > I see that writen calls ion that even has a BUG_ON() if it doesn't write
> > > > exactly the requested size bytes, I got distracted with __do_write_fd
> > > > extra check that ret != size returning ret if not negative...
> > > >
> > > > I.e. your code _should_ be equivalent due to the check in ion(), and
> > > > taking that as an assumption you reduce the number of lseek syscalls,
> > > > which is a good thing...
> > > >
> > > > I was just trying to see that the !write_attrs_after_data case was
> > > > _exactly_ the same as before, which it doesn't look like it is :-\
> > >
> > > I'm not seeing the difference. Before:
> >
> > You noticed the difference: before we used lseek to get the current
> > offset to use, afterwards we moved to doing plain math.
> >
> > So I had to check if we could assume that, and with the current code
> > structure, yes, we can assume that, so seems safe, but it is different
> > and if the assumption somehow breaks, as the code in __do_write_fd()
> > guard against (unneeded at the moment as ion has even a BUG_ON for that
> > not to happen), then the offset will not be where the data is.
> >
> > Using lseek() is more costly (syscalls) but it is the ultimate answer to
> > get where in the file the current offset is.
> >
> > So that is the difference I noticed.
> >
> > Doing multiple things in the same patch causes these reviewing delays,
> > doubts, its something we discussed multiple times in the past, and that
> > continue to cause these discussions.
>
> Right, but it is something of an unfortunate coincidence of how the
> code is structured. The fact that writing the header updates
> data_offset which is a thing that other things depend upon while
> depending on its value itself, etc. - ie the function does more than
> just a write, it also sometimes computes the layout, has inbuilt
> assumptions on the values lseek will return, and so on. To get to this
> final structure took a fair few iterations and I've separated this
> change out from the bulk in the next change to keep the patch size
> down. I could have done a patch switching from lseeks to math, then a
> patch to add write_attrs_after_data. It probably would have yielded
> about 4 lines of shared code, more lines that would have been deleted,
> while creating quite a bit of work for me. Ideally when these
> functions were created there would have been far more liberal use of
> things like immutability, so side-effects are minimized. Yes I could
> refactor everything, but time..
Maybe I'm too naive but can we skip header updates on pipe data? I'm
curious if this makes sense..
Thanks,
Namhyung
diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index a7c859db2e15..b36f84f29295 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -2341,6 +2341,9 @@ int cmd_inject(int argc, const char **argv)
if (ret)
goto out_delete;
+ if (data.is_pipe)
+ inject.is_pipe = true;
+
if (!data.is_pipe && inject.output.is_pipe) {
ret = perf_header__write_pipe(perf_data__fd(&inject.output));
if (ret < 0) {
^ permalink raw reply related [flat|nested] 18+ messages in thread* Re: [PATCH v1 5/8] perf header: Allow attributes to be written after data
2024-08-30 4:39 ` Namhyung Kim
@ 2024-08-30 5:03 ` Ian Rogers
2024-08-30 16:04 ` Namhyung Kim
0 siblings, 1 reply; 18+ messages in thread
From: Ian Rogers @ 2024-08-30 5:03 UTC (permalink / raw)
To: Namhyung Kim
Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
Kan Liang, Nick Terrell, Yanteng Si, Yicong Yang, James Clark,
linux-perf-users, linux-kernel
On Thu, Aug 29, 2024 at 9:39 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Thu, Aug 29, 2024 at 02:42:38PM -0700, Ian Rogers wrote:
> > On Thu, Aug 29, 2024 at 1:58 PM Arnaldo Carvalho de Melo
> > <acme@kernel.org> wrote:
> > >
> > > On Thu, Aug 29, 2024 at 01:12:32PM -0700, Ian Rogers wrote:
> > > > On Thu, Aug 29, 2024 at 12:31 PM Arnaldo Carvalho de Melo
> > > > <acme@kernel.org> wrote:
> > > > >
> > > > > On Thu, Aug 29, 2024 at 08:01:51AM -0700, Ian Rogers wrote:
> > > > > > With a file, to write data an offset needs to be known. Typically data
> > > > > > follows the event attributes in a file. However, if processing a pipe
> > > > > > the number of event attributes may not be known. It is convenient in
> > > > > > that case to write the attributes after the data. Expand
> > > > > > perf_session__do_write_header to allow this when the data offset and
> > > > > > size are known.
> > > > > >
> > > > > > This approach may be useful for more than just taking a pipe file to
> > > > > > write into a data file, `perf inject --itrace` will reserve and
> > > > > > additional 8kb for attributes, which would be unnecessary if the
> > > > > > attributes were written after the data.
> > > > > >
> > > > > > Signed-off-by: Ian Rogers <irogers@google.com>
> > > > > > ---
> > > > > > tools/perf/util/header.c | 106 +++++++++++++++++++++++++--------------
> > > > > > 1 file changed, 67 insertions(+), 39 deletions(-)
> > > > > >
> > > > > > diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> > > > > > index 65c9086610cb..4eb39463067e 100644
> > > > > > --- a/tools/perf/util/header.c
> > > > > > +++ b/tools/perf/util/header.c
> > > > > > @@ -3676,32 +3676,50 @@ int perf_header__write_pipe(int fd)
> > > > > > static int perf_session__do_write_header(struct perf_session *session,
> > > > > > struct evlist *evlist,
> > > > > > int fd, bool at_exit,
> > > > > > - struct feat_copier *fc)
> > > > > > + struct feat_copier *fc,
> > > > > > + bool write_attrs_after_data)
> > > > > > {
> > > > > > struct perf_file_header f_header;
> > > > > > - struct perf_file_attr f_attr;
> > > > > > struct perf_header *header = &session->header;
> > > > > > struct evsel *evsel;
> > > > > > struct feat_fd ff = {
> > > > > > .fd = fd,
> > > > > > };
> > > > > > - u64 attr_offset;
> > > > > > + u64 attr_offset = sizeof(f_header), attr_size = 0;
> > > > > > int err;
> > > > > >
> > > > > > - lseek(fd, sizeof(f_header), SEEK_SET);
> > > > > > + if (write_attrs_after_data && at_exit) {
> > > > > > + /*
> > > > > > + * Write features at the end of the file first so that
> > > > > > + * attributes may come after them.
> > > > > > + */
> > > > > > + if (!header->data_offset && header->data_size) {
> > > > > > + pr_err("File contains data but offset unknown\n");
> > > > > > + err = -1;
> > > > > > + goto err_out;
> > > > > > + }
> > > > > > + header->feat_offset = header->data_offset + header->data_size;
> > > > > > + err = perf_header__adds_write(header, evlist, fd, fc);
> > > > > > + if (err < 0)
> > > > > > + goto err_out;
> > > > > > + attr_offset = lseek(fd, 0, SEEK_CUR);
> > > > > > + } else {
> > > > > > + lseek(fd, attr_offset, SEEK_SET);
> > > > > > + }
> > > > > >
> > > > > > evlist__for_each_entry(session->evlist, evsel) {
> > > > > > - evsel->id_offset = lseek(fd, 0, SEEK_CUR);
> > > > > > - err = do_write(&ff, evsel->core.id, evsel->core.ids * sizeof(u64));
> > > > > > - if (err < 0) {
> > > > > > - pr_debug("failed to write perf header\n");
> > > > > > - free(ff.buf);
> > > > > > - return err;
> > > > > > + evsel->id_offset = attr_offset;
> > > > > > + /* Avoid writing at the end of the file until the session is exiting. */
> > > > > > + if (!write_attrs_after_data || at_exit) {
> > > > > > + err = do_write(&ff, evsel->core.id, evsel->core.ids * sizeof(u64));
> > > > > > + if (err < 0) {
> > > > > > + pr_debug("failed to write perf header\n");
> > > > > > + goto err_out;
> > > > > > + }
> > > > > > }
> > > > > > + attr_offset += evsel->core.ids * sizeof(u64);
> > > > >
> > > > > So in the past we were using lseek(fd, 0, SEEK_CUR) to set the
> > > > > evsel->id_offset, now you're assuming that do_write will honour the size
> > > > > parameter, i.e. write evsel->core.ids * sizeof(u64), but:
> > > > >
> > > > > /* Return: 0 if succeeded, -ERR if failed. */
> > > > > int do_write(struct feat_fd *ff, const void *buf, size_t size)
> > > > > {
> > > > > if (!ff->buf)
> > > > > return __do_write_fd(ff, buf, size);
> > > > > return __do_write_buf(ff, buf, size);
> > > > > }
> > > > >
> > > > > And then:
> > > > >
> > > > > static int __do_write_fd(struct feat_fd *ff, const void *buf, size_t size)
> > > > > {
> > > > > ssize_t ret = writen(ff->fd, buf, size);
> > > > >
> > > > > if (ret != (ssize_t)size)
> > > > > return ret < 0 ? (int)ret : -1;
> > > > > return 0;
> > > > > }
> > > > >
> > > > > I see that writen calls ion that even has a BUG_ON() if it doesn't write
> > > > > exactly the requested size bytes, I got distracted with __do_write_fd
> > > > > extra check that ret != size returning ret if not negative...
> > > > >
> > > > > I.e. your code _should_ be equivalent due to the check in ion(), and
> > > > > taking that as an assumption you reduce the number of lseek syscalls,
> > > > > which is a good thing...
> > > > >
> > > > > I was just trying to see that the !write_attrs_after_data case was
> > > > > _exactly_ the same as before, which it doesn't look like it is :-\
> > > >
> > > > I'm not seeing the difference. Before:
> > >
> > > You noticed the difference: before we used lseek to get the current
> > > offset to use, afterwards we moved to doing plain math.
> > >
> > > So I had to check if we could assume that, and with the current code
> > > structure, yes, we can assume that, so seems safe, but it is different
> > > and if the assumption somehow breaks, as the code in __do_write_fd()
> > > guard against (unneeded at the moment as ion has even a BUG_ON for that
> > > not to happen), then the offset will not be where the data is.
> > >
> > > Using lseek() is more costly (syscalls) but it is the ultimate answer to
> > > get where in the file the current offset is.
> > >
> > > So that is the difference I noticed.
> > >
> > > Doing multiple things in the same patch causes these reviewing delays,
> > > doubts, its something we discussed multiple times in the past, and that
> > > continue to cause these discussions.
> >
> > Right, but it is something of an unfortunate coincidence of how the
> > code is structured. The fact that writing the header updates
> > data_offset which is a thing that other things depend upon while
> > depending on its value itself, etc. - ie the function does more than
> > just a write, it also sometimes computes the layout, has inbuilt
> > assumptions on the values lseek will return, and so on. To get to this
> > final structure took a fair few iterations and I've separated this
> > change out from the bulk in the next change to keep the patch size
> > down. I could have done a patch switching from lseeks to math, then a
> > patch to add write_attrs_after_data. It probably would have yielded
> > about 4 lines of shared code, more lines that would have been deleted,
> > while creating quite a bit of work for me. Ideally when these
> > functions were created there would have been far more liberal use of
> > things like immutability, so side-effects are minimized. Yes I could
> > refactor everything, but time..
>
> Maybe I'm too naive but can we skip header updates on pipe data? I'm
> curious if this makes sense..
>
> Thanks,
> Namhyung
>
>
> diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
> index a7c859db2e15..b36f84f29295 100644
> --- a/tools/perf/builtin-inject.c
> +++ b/tools/perf/builtin-inject.c
> @@ -2341,6 +2341,9 @@ int cmd_inject(int argc, const char **argv)
> if (ret)
> goto out_delete;
>
> + if (data.is_pipe)
> + inject.is_pipe = true;
> +
I'm not sure what you are saying. We can't know definitively if the
input is a pipe style file or pipe until the header is read, which is
part of session__new and something we pass whether we want to repipe
the header or not. So we've made a decision or not to repipe but
opening the header may change the decision that was already made. As
you say we can opportunistically just copy/repipe the header if we
know the input and output types match, but:
1) generating the header isn't that much work,
2) if the header needs to change for extra attributes, such as with
some of the auxiliary flags, then the repiped header was no good
anyway.
Trying to keep header repiping alive for inject, the only use, is
weird given all the gotchas. I think it is simpler to open, know what
we're dealing with, then generate the output header accordingly -
possibly synthesizing events for the attributes in the case of file to
pipe.
Thanks,
Ian
> if (!data.is_pipe && inject.output.is_pipe) {
> ret = perf_header__write_pipe(perf_data__fd(&inject.output));
> if (ret < 0) {
>
^ permalink raw reply [flat|nested] 18+ messages in thread* Re: [PATCH v1 5/8] perf header: Allow attributes to be written after data
2024-08-30 5:03 ` Ian Rogers
@ 2024-08-30 16:04 ` Namhyung Kim
0 siblings, 0 replies; 18+ messages in thread
From: Namhyung Kim @ 2024-08-30 16:04 UTC (permalink / raw)
To: Ian Rogers
Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
Kan Liang, Nick Terrell, Yanteng Si, Yicong Yang, James Clark,
linux-perf-users, linux-kernel
On Thu, Aug 29, 2024 at 10:03 PM Ian Rogers <irogers@google.com> wrote:
>
> On Thu, Aug 29, 2024 at 9:39 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > Maybe I'm too naive but can we skip header updates on pipe data? I'm
> > curious if this makes sense..
> >
> > Thanks,
> > Namhyung
> >
> >
> > diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
> > index a7c859db2e15..b36f84f29295 100644
> > --- a/tools/perf/builtin-inject.c
> > +++ b/tools/perf/builtin-inject.c
> > @@ -2341,6 +2341,9 @@ int cmd_inject(int argc, const char **argv)
> > if (ret)
> > goto out_delete;
> >
> > + if (data.is_pipe)
> > + inject.is_pipe = true;
> > +
>
> I'm not sure what you are saying. We can't know definitively if the
> input is a pipe style file or pipe until the header is read, which is
> part of session__new and something we pass whether we want to repipe
> the header or not. So we've made a decision or not to repipe but
> opening the header may change the decision that was already made. As
> you say we can opportunistically just copy/repipe the header if we
> know the input and output types match, but:
> 1) generating the header isn't that much work,
> 2) if the header needs to change for extra attributes, such as with
> some of the auxiliary flags, then the repiped header was no good
> anyway.
> Trying to keep header repiping alive for inject, the only use, is
> weird given all the gotchas. I think it is simpler to open, know what
> we're dealing with, then generate the output header accordingly -
> possibly synthesizing events for the attributes in the case of file to
> pipe.
I'm ok with removing repipe in session__new. What I want is
not to overwrite the file header for a data file containing pipe
header. In your example, 'perf record -o- > a.data' should have
the pipe header in a.data. Then b.data from perf inject should
have the pipe header as well, right? Then we don't need to
worry about the rewrite IIUC.
Thanks,
Namhyung
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v1 5/8] perf header: Allow attributes to be written after data
2024-08-29 21:42 ` Ian Rogers
2024-08-30 4:39 ` Namhyung Kim
@ 2024-08-30 12:19 ` Arnaldo Carvalho de Melo
1 sibling, 0 replies; 18+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-08-30 12:19 UTC (permalink / raw)
To: Ian Rogers
Cc: Peter Zijlstra, Ingo Molnar, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Adrian Hunter, Kan Liang,
Nick Terrell, Yanteng Si, Yicong Yang, James Clark,
linux-perf-users, linux-kernel
On Thu, Aug 29, 2024 at 02:42:38PM -0700, Ian Rogers wrote:
> On Thu, Aug 29, 2024 at 1:58 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > On Thu, Aug 29, 2024 at 01:12:32PM -0700, Ian Rogers wrote:
> > > On Thu, Aug 29, 2024 at 12:31 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > > On Thu, Aug 29, 2024 at 08:01:51AM -0700, Ian Rogers wrote:
> > > I'm not seeing the difference. Before:
> > You noticed the difference: before we used lseek to get the current
> > offset to use, afterwards we moved to doing plain math.
> > So I had to check if we could assume that, and with the current code
> > structure, yes, we can assume that, so seems safe, but it is different
> > and if the assumption somehow breaks, as the code in __do_write_fd()
> > guard against (unneeded at the moment as ion has even a BUG_ON for that
> > not to happen), then the offset will not be where the data is.
> > Using lseek() is more costly (syscalls) but it is the ultimate answer to
> > get where in the file the current offset is.
> > So that is the difference I noticed.
> > Doing multiple things in the same patch causes these reviewing delays,
> > doubts, its something we discussed multiple times in the past, and that
> > continue to cause these discussions.
> Right, but it is something of an unfortunate coincidence of how the
> code is structured. The fact that writing the header updates
> data_offset which is a thing that other things depend upon while
> depending on its value itself, etc. - ie the function does more than
> just a write, it also sometimes computes the layout, has inbuilt
> assumptions on the values lseek will return, and so on. To get to this
I share your frustrations, code gets complex over time, that is why I at
least try to ask these questions, encourage more granular patches, that
do just one thing, etc, to avoid having this conversation again years
from now, when some other person tries to understand the codebase do
bisects, refactor it, etc, just like you're doing now.
> final structure took a fair few iterations and I've separated this
> change out from the bulk in the next change to keep the patch size
> down. I could have done a patch switching from lseeks to math, then a
> patch to add write_attrs_after_data.
> It probably would have yielded about 4 lines of shared code, more
> lines that would have been deleted, while creating quite a bit of work
> for me.
> Ideally when these functions were created there would have been far
> more liberal use of things like immutability, so side-effects are
> minimized. Yes I could refactor everything, but time..
As I said, I think your patch is safe as-is, its just that it took more
time than needed for reviewing, i.e. it will cost more one side or the
other, and as I have to review everything I merge, doing it on my side
slows things down the overall process of collecting patches from lots of
people.
So far I collected 234 patches for v6.12, and there are way more to
process, like Howard's perf trace BTF work, that I merged partially,
but where I'll have to do work on splitting patches as agreed with him
in another thread, etc.
- Arnaldo
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v1 6/8] perf inject: Overhaul handling of pipe files
2024-08-29 15:01 [PATCH v1 0/8] Correct inject's handling of pipe files on disk Ian Rogers
` (4 preceding siblings ...)
2024-08-29 15:01 ` [PATCH v1 5/8] perf header: Allow attributes to be written after data Ian Rogers
@ 2024-08-29 15:01 ` Ian Rogers
2024-08-29 15:01 ` [PATCH v1 7/8] perf header: Remove repipe option Ian Rogers
` (2 subsequent siblings)
8 siblings, 0 replies; 18+ messages in thread
From: Ian Rogers @ 2024-08-29 15:01 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Kan Liang, Nick Terrell, Yanteng Si,
Yicong Yang, James Clark, linux-perf-users, linux-kernel
Previously inject->is_pipe was set if the input or output were a
pipe. Determining the input was a pipe had to be done prior to
starting the session and opening the file. This was done by comparing
the input file name with '-' but it fails if the pipe file is written
to disk. Opening a pipe file from disk will correcrtly set
perf_data.is_pipe, but this is too late for perf inject and results in
a broken file. A workaround is 'cat pipe_perf|perf inject -i - ...'.
This change removes inject->is_pipe and changes the dependent
conditions to use the is_pipe flag on the input
(inject->session->data) and output files (inject->output). This
ensures the is_pipe condition reflects things like the header being
read.
The change removes the use of perf file header repiping, that is
writing the file header out while reading it in. The case of input
pipe and output file cannot repipe as the attributes for the file are
unknown. To resolve this, write the file header when writing to disk
and as the attributes may be unknown, write them after the data.
Update sessions repipe variable to be trace_event_repipe as those are
the only events now impacted by it. Update __perf_session__new as the
repipe_fd no longer needs passing. Fully removing repipe from session
header reading will be done in a later change.
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/builtin-inject.c | 60 +++++++++++++++++++------------------
| 12 ++++----
| 3 +-
tools/perf/util/session.c | 8 ++---
tools/perf/util/session.h | 14 ++++-----
5 files changed, 48 insertions(+), 49 deletions(-)
diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index a7c859db2e15..0ccf80fe8399 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -119,7 +119,6 @@ struct perf_inject {
bool jit_mode;
bool in_place_update;
bool in_place_update_dry_run;
- bool is_pipe;
bool copy_kcore_dir;
const char *input_name;
struct perf_data output;
@@ -205,7 +204,8 @@ static int perf_event__repipe_attr(const struct perf_tool *tool,
if (ret)
return ret;
- if (!inject->is_pipe)
+ /* If the output isn't a pipe then the attributes will be written as part of the header. */
+ if (!inject->output.is_pipe)
return 0;
return perf_event__repipe_synth(tool, event);
@@ -1966,7 +1966,13 @@ static int __cmd_inject(struct perf_inject *inject)
struct guest_session *gs = &inject->guest_session;
struct perf_session *session = inject->session;
int fd = output_fd(inject);
- u64 output_data_offset;
+ u64 output_data_offset = perf_session__data_offset(session->evlist);
+ /*
+ * Pipe input hasn't loaded the attributes and will handle them as
+ * events. So that the attributes don't overlap the data, write the
+ * attributes after the data.
+ */
+ bool write_attrs_after_data = !inject->output.is_pipe && inject->session->data->is_pipe;
signal(SIGINT, sig_handler);
@@ -1980,8 +1986,6 @@ static int __cmd_inject(struct perf_inject *inject)
#endif
}
- output_data_offset = perf_session__data_offset(session->evlist);
-
if (inject->build_id_style == BID_RWS__INJECT_HEADER_LAZY) {
inject->tool.sample = perf_event__inject_buildid;
} else if (inject->sched_stat) {
@@ -2075,7 +2079,7 @@ static int __cmd_inject(struct perf_inject *inject)
if (!inject->itrace_synth_opts.set)
auxtrace_index__free(&session->auxtrace_index);
- if (!inject->is_pipe && !inject->in_place_update)
+ if (!inject->output.is_pipe && !inject->in_place_update)
lseek(fd, output_data_offset, SEEK_SET);
ret = perf_session__process_events(session);
@@ -2094,7 +2098,7 @@ static int __cmd_inject(struct perf_inject *inject)
}
}
- if (!inject->is_pipe && !inject->in_place_update) {
+ if (!inject->output.is_pipe && !inject->in_place_update) {
struct inject_fc inj_fc = {
.fc.copy = feat_copy_cb,
.inject = inject,
@@ -2124,7 +2128,8 @@ static int __cmd_inject(struct perf_inject *inject)
}
session->header.data_offset = output_data_offset;
session->header.data_size = inject->bytes_written;
- perf_session__inject_header(session, session->evlist, fd, &inj_fc.fc);
+ perf_session__inject_header(session, session->evlist, fd, &inj_fc.fc,
+ write_attrs_after_data);
if (inject->copy_kcore_dir) {
ret = copy_kcore_dir(inject);
@@ -2161,7 +2166,6 @@ int cmd_inject(int argc, const char **argv)
.use_stdio = true,
};
int ret;
- bool repipe = true;
const char *known_build_ids = NULL;
bool build_ids;
bool build_id_all;
@@ -2273,16 +2277,7 @@ int cmd_inject(int argc, const char **argv)
inject.build_id_style = BID_RWS__INJECT_HEADER_ALL;
data.path = inject.input_name;
- if (!strcmp(inject.input_name, "-") || inject.output.is_pipe) {
- inject.is_pipe = true;
- /*
- * Do not repipe header when input is a regular file
- * since either it can rewrite the header at the end
- * or write a new pipe header.
- */
- if (strcmp(inject.input_name, "-"))
- repipe = false;
- }
+
ordered_events = inject.jit_mode || inject.sched_stat ||
(inject.build_id_style == BID_RWS__INJECT_HEADER_LAZY);
perf_tool__init(&inject.tool, ordered_events);
@@ -2325,9 +2320,9 @@ int cmd_inject(int argc, const char **argv)
inject.tool.compressed = perf_event__repipe_op4_synth;
inject.tool.auxtrace = perf_event__repipe_auxtrace;
inject.tool.dont_split_sample_group = true;
- inject.session = __perf_session__new(&data, repipe,
- output_fd(&inject),
- &inject.tool);
+ inject.session = __perf_session__new(&data, &inject.tool,
+ /*trace_event_repipe=*/inject.output.is_pipe);
+
if (IS_ERR(inject.session)) {
ret = PTR_ERR(inject.session);
goto out_close_output;
@@ -2341,19 +2336,26 @@ int cmd_inject(int argc, const char **argv)
if (ret)
goto out_delete;
- if (!data.is_pipe && inject.output.is_pipe) {
+ if (inject.output.is_pipe) {
ret = perf_header__write_pipe(perf_data__fd(&inject.output));
if (ret < 0) {
pr_err("Couldn't write a new pipe header.\n");
goto out_delete;
}
- ret = perf_event__synthesize_for_pipe(&inject.tool,
- inject.session,
- &inject.output,
- perf_event__repipe);
- if (ret < 0)
- goto out_delete;
+ /*
+ * If the input is already a pipe then the features and
+ * attributes don't need synthesizing, they will be present in
+ * the input.
+ */
+ if (!data.is_pipe) {
+ ret = perf_event__synthesize_for_pipe(&inject.tool,
+ inject.session,
+ &inject.output,
+ perf_event__repipe);
+ if (ret < 0)
+ goto out_delete;
+ }
}
if (inject.build_id_style == BID_RWS__INJECT_HEADER_LAZY) {
--git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 4eb39463067e..59e2f37c1cb4 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -3818,10 +3818,11 @@ size_t perf_session__data_offset(const struct evlist *evlist)
int perf_session__inject_header(struct perf_session *session,
struct evlist *evlist,
int fd,
- struct feat_copier *fc)
+ struct feat_copier *fc,
+ bool write_attrs_after_data)
{
return perf_session__do_write_header(session, evlist, fd, true, fc,
- /*write_attrs_after_data=*/false);
+ write_attrs_after_data);
}
static int perf_header__getbuffer64(struct perf_header *header,
@@ -4145,7 +4146,7 @@ static int perf_header__read_pipe(struct perf_session *session, int repipe_fd)
struct perf_pipe_file_header f_header;
if (perf_file_header__read_pipe(&f_header, header, session->data,
- session->repipe, repipe_fd) < 0) {
+ /*repipe=*/false, repipe_fd) < 0) {
pr_debug("incompatible file format\n");
return -EINVAL;
}
@@ -4560,15 +4561,14 @@ int perf_event__process_tracing_data(struct perf_session *session,
SEEK_SET);
}
- size_read = trace_report(fd, &session->tevent,
- session->repipe);
+ size_read = trace_report(fd, &session->tevent, session->trace_event_repipe);
padding = PERF_ALIGN(size_read, sizeof(u64)) - size_read;
if (readn(fd, buf, padding) < 0) {
pr_err("%s: reading input file", __func__);
return -1;
}
- if (session->repipe) {
+ if (session->trace_event_repipe) {
int retw = write(STDOUT_FILENO, buf, padding);
if (retw <= 0 || retw != padding) {
pr_err("%s: repiping tracing data padding", __func__);
--git a/tools/perf/util/header.h b/tools/perf/util/header.h
index 3285981948d7..7137509cf6d8 100644
--- a/tools/perf/util/header.h
+++ b/tools/perf/util/header.h
@@ -150,7 +150,8 @@ struct feat_copier {
int perf_session__inject_header(struct perf_session *session,
struct evlist *evlist,
int fd,
- struct feat_copier *fc);
+ struct feat_copier *fc,
+ bool write_attrs_after_data);
size_t perf_session__data_offset(const struct evlist *evlist);
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index d2bd563119bc..23449d01093a 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -135,8 +135,8 @@ static int ordered_events__deliver_event(struct ordered_events *oe,
}
struct perf_session *__perf_session__new(struct perf_data *data,
- bool repipe, int repipe_fd,
- struct perf_tool *tool)
+ struct perf_tool *tool,
+ bool trace_event_repipe)
{
int ret = -ENOMEM;
struct perf_session *session = zalloc(sizeof(*session));
@@ -144,7 +144,7 @@ struct perf_session *__perf_session__new(struct perf_data *data,
if (!session)
goto out;
- session->repipe = repipe;
+ session->trace_event_repipe = trace_event_repipe;
session->tool = tool;
session->decomp_data.zstd_decomp = &session->zstd_data;
session->active_decomp = &session->decomp_data;
@@ -162,7 +162,7 @@ struct perf_session *__perf_session__new(struct perf_data *data,
session->data = data;
if (perf_data__is_read(data)) {
- ret = perf_session__open(session, repipe_fd);
+ ret = perf_session__open(session, /*repipe_fd=*/-1);
if (ret < 0)
goto out_delete;
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index e56518639711..bcf1bcf06959 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -59,12 +59,8 @@ struct perf_session {
#endif
/** @time_conv: Holds contents of last PERF_RECORD_TIME_CONV event. */
struct perf_record_time_conv time_conv;
- /**
- * @repipe: When set causes certain reading (header and trace events) to
- * also write events. The written file descriptor must be provided for
- * the header but is implicitly stdout for trace events.
- */
- bool repipe;
+ /** @trace_event_repipe: When set causes read trace events to be written to stdout. */
+ bool trace_event_repipe;
/**
* @one_mmap: The reader will use a single mmap by default. There may be
* multiple data files in particular for aux events. If this is true
@@ -110,13 +106,13 @@ struct decomp {
struct perf_tool;
struct perf_session *__perf_session__new(struct perf_data *data,
- bool repipe, int repipe_fd,
- struct perf_tool *tool);
+ struct perf_tool *tool,
+ bool trace_event_repipe);
static inline struct perf_session *perf_session__new(struct perf_data *data,
struct perf_tool *tool)
{
- return __perf_session__new(data, false, -1, tool);
+ return __perf_session__new(data, tool, /*trace_event_repipe=*/false);
}
void perf_session__delete(struct perf_session *session);
--
2.46.0.295.g3b9ea8a38a-goog
^ permalink raw reply related [flat|nested] 18+ messages in thread* [PATCH v1 7/8] perf header: Remove repipe option
2024-08-29 15:01 [PATCH v1 0/8] Correct inject's handling of pipe files on disk Ian Rogers
` (5 preceding siblings ...)
2024-08-29 15:01 ` [PATCH v1 6/8] perf inject: Overhaul handling of pipe files Ian Rogers
@ 2024-08-29 15:01 ` Ian Rogers
2024-08-29 15:01 ` [PATCH v1 8/8] perf test: Additional pipe tests with pipe output written to a file Ian Rogers
2024-08-29 15:22 ` [PATCH v1 0/8] Correct inject's handling of pipe files on disk Ian Rogers
8 siblings, 0 replies; 18+ messages in thread
From: Ian Rogers @ 2024-08-29 15:01 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Kan Liang, Nick Terrell, Yanteng Si,
Yicong Yang, James Clark, linux-perf-users, linux-kernel
No longer used by `perf inject` the repipe_fd is always -1 and repipe
is always false. Remove the options and associated code knowing the
constant values of the removed variables.
Signed-off-by: Ian Rogers <irogers@google.com>
---
| 19 +++++--------------
| 2 +-
tools/perf/util/session.c | 6 +++---
3 files changed, 9 insertions(+), 18 deletions(-)
--git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 59e2f37c1cb4..a6386d12afd7 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -4113,13 +4113,8 @@ static int perf_file_section__process(struct perf_file_section *section,
static int perf_file_header__read_pipe(struct perf_pipe_file_header *header,
struct perf_header *ph,
- struct perf_data* data,
- bool repipe, int repipe_fd)
+ struct perf_data *data)
{
- struct feat_fd ff = {
- .fd = repipe_fd,
- .ph = ph,
- };
ssize_t ret;
ret = perf_data__read(data, header, sizeof(*header));
@@ -4134,19 +4129,15 @@ static int perf_file_header__read_pipe(struct perf_pipe_file_header *header,
if (ph->needs_swap)
header->size = bswap_64(header->size);
- if (repipe && do_write(&ff, header, sizeof(*header)) < 0)
- return -1;
-
return 0;
}
-static int perf_header__read_pipe(struct perf_session *session, int repipe_fd)
+static int perf_header__read_pipe(struct perf_session *session)
{
struct perf_header *header = &session->header;
struct perf_pipe_file_header f_header;
- if (perf_file_header__read_pipe(&f_header, header, session->data,
- /*repipe=*/false, repipe_fd) < 0) {
+ if (perf_file_header__read_pipe(&f_header, header, session->data) < 0) {
pr_debug("incompatible file format\n");
return -EINVAL;
}
@@ -4246,7 +4237,7 @@ static int evlist__prepare_tracepoint_events(struct evlist *evlist, struct tep_h
}
#endif
-int perf_session__read_header(struct perf_session *session, int repipe_fd)
+int perf_session__read_header(struct perf_session *session)
{
struct perf_data *data = session->data;
struct perf_header *header = &session->header;
@@ -4267,7 +4258,7 @@ int perf_session__read_header(struct perf_session *session, int repipe_fd)
* We can read 'pipe' data event from regular file,
* check for the pipe header regardless of source.
*/
- err = perf_header__read_pipe(session, repipe_fd);
+ err = perf_header__read_pipe(session);
if (!err || perf_data__is_pipe(data)) {
data->is_pipe = true;
return err;
--git a/tools/perf/util/header.h b/tools/perf/util/header.h
index 7137509cf6d8..a63a361f20f4 100644
--- a/tools/perf/util/header.h
+++ b/tools/perf/util/header.h
@@ -131,7 +131,7 @@ union perf_event;
extern const char perf_version_string[];
-int perf_session__read_header(struct perf_session *session, int repipe_fd);
+int perf_session__read_header(struct perf_session *session);
int perf_session__write_header(struct perf_session *session,
struct evlist *evlist,
int fd, bool at_exit);
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 23449d01093a..578010be046b 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -45,11 +45,11 @@ static int perf_session__deliver_event(struct perf_session *session,
u64 file_offset,
const char *file_path);
-static int perf_session__open(struct perf_session *session, int repipe_fd)
+static int perf_session__open(struct perf_session *session)
{
struct perf_data *data = session->data;
- if (perf_session__read_header(session, repipe_fd) < 0) {
+ if (perf_session__read_header(session) < 0) {
pr_err("incompatible file format (rerun with -v to learn more)\n");
return -1;
}
@@ -162,7 +162,7 @@ struct perf_session *__perf_session__new(struct perf_data *data,
session->data = data;
if (perf_data__is_read(data)) {
- ret = perf_session__open(session, /*repipe_fd=*/-1);
+ ret = perf_session__open(session);
if (ret < 0)
goto out_delete;
--
2.46.0.295.g3b9ea8a38a-goog
^ permalink raw reply related [flat|nested] 18+ messages in thread* [PATCH v1 8/8] perf test: Additional pipe tests with pipe output written to a file
2024-08-29 15:01 [PATCH v1 0/8] Correct inject's handling of pipe files on disk Ian Rogers
` (6 preceding siblings ...)
2024-08-29 15:01 ` [PATCH v1 7/8] perf header: Remove repipe option Ian Rogers
@ 2024-08-29 15:01 ` Ian Rogers
2024-08-29 15:22 ` [PATCH v1 0/8] Correct inject's handling of pipe files on disk Ian Rogers
8 siblings, 0 replies; 18+ messages in thread
From: Ian Rogers @ 2024-08-29 15:01 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Kan Liang, Nick Terrell, Yanteng Si,
Yicong Yang, James Clark, linux-perf-users, linux-kernel
Additional pipe tests where piped files are written to disk. This
means that spotting a file name of "-" isn't a sufficient "is pipe?"
test.
Signed-off-by: Ian Rogers <irogers@google.com>
---
tools/perf/tests/shell/pipe_test.sh | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)
diff --git a/tools/perf/tests/shell/pipe_test.sh b/tools/perf/tests/shell/pipe_test.sh
index ad10012fdc29..a3c94b4182c2 100755
--- a/tools/perf/tests/shell/pipe_test.sh
+++ b/tools/perf/tests/shell/pipe_test.sh
@@ -11,6 +11,7 @@ sym="noploop"
skip_test_missing_symbol ${sym}
data=$(mktemp /tmp/perf.data.XXXXXX)
+data2=$(mktemp /tmp/perf.data2.XXXXXX)
prog="perf test -w noploop"
err=0
@@ -19,6 +20,8 @@ set -e
cleanup() {
rm -rf "${data}"
rm -rf "${data}".old
+ rm -rf "${data2}"
+ rm -rf "${data2}".old
trap - EXIT TERM INT
}
@@ -49,6 +52,14 @@ test_record_report() {
return
fi
+ perf record -g -e task-clock:u -o - ${prog} > ${data}
+ if ! perf report -i ${data} --task | grep -q ${task}
+ then
+ echo "Record+report pipe test [Failed - cannot find the test file in the perf report #3]"
+ err=1
+ return
+ fi
+
echo "Record+report pipe test [Success]"
}
@@ -86,6 +97,21 @@ test_inject_bids() {
return
fi
+ perf record -e task-clock:u -o - ${prog} > ${data}
+ if ! perf inject ${inject_opt} -i ${data} | perf report -i - | grep -q ${sym}; then
+ echo "Inject ${inject_opt} build-ids test [Failed - cannot find noploop function in pipe #5]"
+ err=1
+ return
+ fi
+
+ perf record -e task-clock:u -o - ${prog} > ${data}
+ perf inject ${inject_opt} -i ${data} -o ${data2}
+ if ! perf report -i ${data2} | grep -q ${sym}; then
+ echo "Inject ${inject_opt} build-ids test [Failed - cannot find noploop function in pipe #6]"
+ err=1
+ return
+ fi
+
echo "Inject ${inject_opt} build-ids test [Success]"
}
--
2.46.0.295.g3b9ea8a38a-goog
^ permalink raw reply related [flat|nested] 18+ messages in thread* Re: [PATCH v1 0/8] Correct inject's handling of pipe files on disk
2024-08-29 15:01 [PATCH v1 0/8] Correct inject's handling of pipe files on disk Ian Rogers
` (7 preceding siblings ...)
2024-08-29 15:01 ` [PATCH v1 8/8] perf test: Additional pipe tests with pipe output written to a file Ian Rogers
@ 2024-08-29 15:22 ` Ian Rogers
8 siblings, 0 replies; 18+ messages in thread
From: Ian Rogers @ 2024-08-29 15:22 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Ian Rogers, Adrian Hunter, Kan Liang, Nick Terrell, Yanteng Si,
Yicong Yang, James Clark, linux-perf-users, linux-kernel
On Thu, Aug 29, 2024 at 8:02 AM Ian Rogers <irogers@google.com> wrote:
>
> Perf inject tried to repipe the header, but this isn't possible if the
> input is a pipe and the output a file, as the attributes are events in
> pipe mode. Add an ability to write the attributes after the data so
> writing the header doesn't need a possibly too large or small region
> reserved for the attributes. Add testing for the case of a pipe mode
> file on disk, add checks that the perf file's header isn't obviously
> corrupt by having the header, data or attribute sections overlap. Add
> more comments.
>
> Ian Rogers (8):
> perf report: Name events in stats for pipe mode
> perf session: Document struct and constify auxtrace
> perf header: Add kerneldoc to perf_file_header
> perf header: Fail read if header sections overlap
> perf header: Allow attributes to be written after data
> perf inject: Overhaul handling of pipe files
> perf header: Remove repipe option
> perf test: Additional pipe tests with pipe output written to a file
Sorry, patch 1 was a resend of a patch merged yesterday. Please ignore.
Thanks,
Ian
> tools/perf/builtin-inject.c | 60 +++++------
> tools/perf/builtin-report.c | 1 +
> tools/perf/tests/shell/pipe_test.sh | 26 +++++
> tools/perf/util/header.c | 151 +++++++++++++++++-----------
> tools/perf/util/header.h | 21 +++-
> tools/perf/util/session.c | 12 +--
> tools/perf/util/session.h | 52 +++++++++-
> 7 files changed, 223 insertions(+), 100 deletions(-)
>
> --
> 2.46.0.295.g3b9ea8a38a-goog
>
^ permalink raw reply [flat|nested] 18+ messages in thread