* [igt-dev] [PATCH i-g-t 0/4] Add perf OA tools for GPUvis
From: Umesh Nerlige Ramappa @ 2020-02-15 1:11 UTC
To: igt-dev, Joonas Lahtinen, Ashutosh Dixit, Lionel G Landwerlin
The tools provided here capture performance metrics from the i915 driver
and are meant to be used together with the GPUvis software:
https://github.com/mikesart/gpuvis
The changes required in GPUvis are a work in progress and will be posted once
these tools are merged.
For more information, see tools/i915-perf/README in this patch series.
Lionel Landwerlin (4):
lib/i915/perf: Add i915_perf library
lib/i915/perf: Add support for loading perf configurations
tools/i915/perf: Add i915 perf recorder tool
lib/i915/perf: Add i915 perf data reader
lib/i915-perf.pc.in | 10 +
lib/i915/perf-configs/README.md | 115 +
lib/i915/perf-configs/codegen.py | 33 +
lib/i915/perf-configs/guids.xml | 282 +
lib/i915/perf-configs/mdapi-xml-convert.py | 1000 +
lib/i915/perf-configs/oa-bdw.xml | 15653 ++++++++++++++++
lib/i915/perf-configs/oa-bxt.xml | 9595 ++++++++++
lib/i915/perf-configs/oa-cflgt2.xml | 10866 +++++++++++
lib/i915/perf-configs/oa-cflgt3.xml | 10933 +++++++++++
lib/i915/perf-configs/oa-chv.xml | 9757 ++++++++++
lib/i915/perf-configs/oa-cnl.xml | 10411 ++++++++++
lib/i915/perf-configs/oa-glk.xml | 9346 +++++++++
lib/i915/perf-configs/oa-hsw.xml | 4615 +++++
lib/i915/perf-configs/oa-icl.xml | 11899 ++++++++++++
lib/i915/perf-configs/oa-kblgt2.xml | 10866 +++++++++++
lib/i915/perf-configs/oa-kblgt3.xml | 10933 +++++++++++
lib/i915/perf-configs/oa-sklgt2.xml | 11895 ++++++++++++
lib/i915/perf-configs/oa-sklgt3.xml | 10933 +++++++++++
lib/i915/perf-configs/oa-sklgt4.xml | 10956 +++++++++++
lib/i915/perf-configs/perf-codegen.py | 854 +
lib/i915/perf-configs/update-guids.py | 230 +
lib/i915/perf.c | 332 +
lib/i915/perf.h | 240 +
lib/i915/perf_data.h | 88 +
lib/i915/perf_data_reader.c | 330 +
lib/i915/perf_data_reader.h | 103 +
lib/meson.build | 67 +
tools/i915-perf/README | 70 +
tools/i915-perf/i915_perf_configs.c | 277 +
tools/i915-perf/i915_perf_control.c | 133 +
tools/i915-perf/i915_perf_recorder.c | 931 +
tools/i915-perf/i915_perf_recorder_commands.h | 39 +
tools/i915-perf/meson.build | 17 +
tools/meson.build | 1 +
34 files changed, 153810 insertions(+)
create mode 100644 lib/i915-perf.pc.in
create mode 100644 lib/i915/perf-configs/README.md
create mode 100644 lib/i915/perf-configs/codegen.py
create mode 100644 lib/i915/perf-configs/guids.xml
create mode 100755 lib/i915/perf-configs/mdapi-xml-convert.py
create mode 100644 lib/i915/perf-configs/oa-bdw.xml
create mode 100644 lib/i915/perf-configs/oa-bxt.xml
create mode 100644 lib/i915/perf-configs/oa-cflgt2.xml
create mode 100644 lib/i915/perf-configs/oa-cflgt3.xml
create mode 100644 lib/i915/perf-configs/oa-chv.xml
create mode 100644 lib/i915/perf-configs/oa-cnl.xml
create mode 100644 lib/i915/perf-configs/oa-glk.xml
create mode 100644 lib/i915/perf-configs/oa-hsw.xml
create mode 100644 lib/i915/perf-configs/oa-icl.xml
create mode 100644 lib/i915/perf-configs/oa-kblgt2.xml
create mode 100644 lib/i915/perf-configs/oa-kblgt3.xml
create mode 100644 lib/i915/perf-configs/oa-sklgt2.xml
create mode 100644 lib/i915/perf-configs/oa-sklgt3.xml
create mode 100644 lib/i915/perf-configs/oa-sklgt4.xml
create mode 100755 lib/i915/perf-configs/perf-codegen.py
create mode 100755 lib/i915/perf-configs/update-guids.py
create mode 100644 lib/i915/perf.c
create mode 100644 lib/i915/perf.h
create mode 100644 lib/i915/perf_data.h
create mode 100644 lib/i915/perf_data_reader.c
create mode 100644 lib/i915/perf_data_reader.h
create mode 100644 tools/i915-perf/README
create mode 100644 tools/i915-perf/i915_perf_configs.c
create mode 100644 tools/i915-perf/i915_perf_control.c
create mode 100644 tools/i915-perf/i915_perf_recorder.c
create mode 100644 tools/i915-perf/i915_perf_recorder_commands.h
create mode 100644 tools/i915-perf/meson.build
--
2.20.1
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
* [igt-dev] [PATCH i-g-t 2/4] lib/i915/perf: Add support for loading perf configurations
From: Umesh Nerlige Ramappa @ 2020-02-15 1:11 UTC
To: igt-dev, Joonas Lahtinen, Ashutosh Dixit, Lionel G Landwerlin
From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Add support for loading perf configurations used by gpuvis.
v2: rebase fixes for igt list (Umesh)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
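Note (not part of the commit message): the sysfs lookup this patch relies
on reads a per-config id from "<metrics dir>/<guid>/id". The sketch below
shows that lookup in isolation. The helper names read_uint64_file() and
lookup_metric_id() are hypothetical and do not exist in the patch; unlike
the patch's read_file_uint64(), this variant is assumed to also reject
empty files and non-numeric content.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: parse one unsigned integer from a small text file. */
static bool
read_uint64_file(const char *path, uint64_t *value)
{
	char buf[32], *end;
	ssize_t n;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return false;
	n = read(fd, buf, sizeof(buf) - 1);
	close(fd);
	if (n <= 0)		/* read error or empty file */
		return false;

	buf[n] = '\0';
	errno = 0;
	*value = strtoull(buf, &end, 0);

	/* Reject overflow and input that contains no digits at all. */
	return errno == 0 && end != buf;
}

/* Hypothetical helper: read "<metrics_dir>/<guid>/id". */
static bool
lookup_metric_id(const char *metrics_dir, const char *guid, uint64_t *id)
{
	char path[512];
	int len;

	assert(metrics_dir && guid && id);

	len = snprintf(path, sizeof(path), "%s/%s/id", metrics_dir, guid);
	if (len < 0 || (size_t) len >= sizeof(path))
		return false;	/* path would have been truncated */

	return read_uint64_file(path, id);
}
```

A caller would pass something like "/sys/class/drm/card0/metrics" as
metrics_dir and treat a false return as "config not loaded yet", which is
the case where the patch falls back to DRM_IOCTL_I915_PERF_ADD_CONFIG.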
lib/i915/perf.c | 138 ++++++++++++++++++++++++++++++++++++++++++++++++
lib/i915/perf.h | 2 +
2 files changed, 140 insertions(+)
diff --git a/lib/i915/perf.c b/lib/i915/perf.c
index 1627f102..ae786701 100644
--- a/lib/i915/perf.c
+++ b/lib/i915/perf.c
@@ -20,8 +20,18 @@
* SOFTWARE.
*/
+#include <assert.h>
+#include <errno.h>
+#include <stdio.h>
#include <stdlib.h>
#include <string.h>
+#include <dirent.h>
+#include <fcntl.h>
+#include <sys/ioctl.h>
+#include <sys/stat.h>
+#include <sys/sysmacros.h>
+#include <sys/types.h>
+#include <unistd.h>
#include "intel_chipset.h"
#include "perf.h"
@@ -192,3 +202,131 @@ intel_perf_add_metric_set(struct intel_perf *perf,
{
igt_list_add_tail(&metric_set->link, &perf->metric_sets);
}
+
+static bool
+read_file_uint64(const char *file, uint64_t *value)
+{
+ char buf[32];
+ int fd, n;
+
+ fd = open(file, 0);
+ if (fd < 0)
+ return false;
+ n = read(fd, buf, sizeof (buf) - 1);
+ close(fd);
+ if (n < 0)
+ return false;
+
+ buf[n] = '\0';
+ *value = strtoull(buf, 0, 0);
+
+ return true;
+}
+
+static int
+get_card_for_fd(int fd)
+{
+ struct stat sb;
+ int mjr, mnr;
+ char buffer[128];
+ DIR *drm_dir;
+ struct dirent *entry;
+ int retval = -1;
+
+ if (fstat(fd, &sb))
+ return -1;
+
+ mjr = major(sb.st_rdev);
+ mnr = minor(sb.st_rdev);
+
+ snprintf(buffer, sizeof(buffer), "/sys/dev/char/%d:%d/device/drm", mjr, mnr);
+
+ drm_dir = opendir(buffer);
+ assert(drm_dir != NULL);
+
+ while ((entry = readdir(drm_dir))) {
+ if (entry->d_type == DT_DIR && strncmp(entry->d_name, "card", 4) == 0) {
+ retval = strtoull(entry->d_name + 4, NULL, 10);
+ break;
+ }
+ }
+
+ closedir(drm_dir);
+
+ return retval;
+}
+
+static void
+load_metric_set_config(struct intel_perf_metric_set *metric_set, int drm_fd)
+{
+ struct drm_i915_perf_oa_config config;
+ int ret;
+
+ memset(&config, 0, sizeof(config));
+
+ memcpy(config.uuid, metric_set->hw_config_guid, sizeof(config.uuid));
+
+ config.n_mux_regs = metric_set->n_mux_regs;
+ config.mux_regs_ptr = (uintptr_t) metric_set->mux_regs;
+
+ config.n_boolean_regs = metric_set->n_b_counter_regs;
+ config.boolean_regs_ptr = (uintptr_t) metric_set->b_counter_regs;
+
+ config.n_flex_regs = metric_set->n_flex_regs;
+ config.flex_regs_ptr = (uintptr_t) metric_set->flex_regs;
+
+ while ((ret = ioctl(drm_fd, DRM_IOCTL_I915_PERF_ADD_CONFIG, &config)) < 0 &&
+ (errno == EAGAIN || errno == EINTR));
+ /* On success the ioctl returns the new config id; 0 means "not loaded". */
+ metric_set->perf_oa_metrics_set = ret > 0 ? ret : 0;
+}
+
+void
+intel_perf_load_perf_configs(struct intel_perf *perf, int drm_fd)
+{
+ int drm_card = get_card_for_fd(drm_fd);
+ struct dirent *entry;
+ char metrics_path[128];
+ DIR *metrics_dir;
+ struct intel_perf_metric_set *metric_set;
+
+ snprintf(metrics_path, sizeof(metrics_path),
+ "/sys/class/drm/card%d/metrics", drm_card);
+ metrics_dir = opendir(metrics_path);
+ if (!metrics_dir)
+ return;
+
+ while ((entry = readdir(metrics_dir))) {
+ char *metric_id_path;
+ uint64_t metric_id;
+
+ if (entry->d_type != DT_DIR)
+ continue;
+
+ asprintf(&metric_id_path, "%s/%s/id",
+ metrics_path, entry->d_name);
+
+ if (!read_file_uint64(metric_id_path, &metric_id)) {
+ free(metric_id_path);
+ continue;
+ }
+
+ free(metric_id_path);
+
+ igt_list_for_each_entry(metric_set, &perf->metric_sets, link) {
+ if (!strcmp(metric_set->hw_config_guid, entry->d_name)) {
+ metric_set->perf_oa_metrics_set = metric_id;
+ break;
+ }
+ }
+ }
+
+ closedir(metrics_dir);
+
+ igt_list_for_each_entry(metric_set, &perf->metric_sets, link) {
+ if (metric_set->perf_oa_metrics_set)
+ continue;
+
+ load_metric_set_config(metric_set, drm_fd);
+ }
+}
diff --git a/lib/i915/perf.h b/lib/i915/perf.h
index 5a091c46..0b66efe1 100644
--- a/lib/i915/perf.h
+++ b/lib/i915/perf.h
@@ -231,6 +231,8 @@ void intel_perf_add_logical_counter(struct intel_perf *perf,
void intel_perf_add_metric_set(struct intel_perf *perf,
struct intel_perf_metric_set *metric_set);
+void intel_perf_load_perf_configs(struct intel_perf *perf, int drm_fd);
+
#ifdef __cplusplus
};
#endif
--
2.20.1
* [igt-dev] [PATCH i-g-t 3/4] tools/i915/perf: Add i915 perf recorder tool
From: Umesh Nerlige Ramappa @ 2020-02-15 1:11 UTC
To: igt-dev, Joonas Lahtinen, Ashutosh Dixit, Lionel G Landwerlin
From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
The i915 perf recorder tool captures OA perf data for a specific metric set
in a circular buffer of a specified size. The i915 perf control tool is
used to dump the data captured in the circular buffer to a trace file.
The captured data is used to view relevant events in gpuvis.
v2: (Umesh)
- rebase fixes for igt_list apis
- memset circular_buffer to 0 to initialize size, beginpos and endpos
- _FORTIFY_SOURCE=2 caused snprintf to go through __snprintf_chk, which
falsely flagged a buffer overflow and sent an abort signal to
i915_perf_control when capturing traces. Undefine _FORTIFY_SOURCE
selectively for the i915 perf control tool.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
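Note (not part of the commit message): the recorder stores the stream in a
circular buffer, so any read or write that crosses the end of the
allocation has to be split into two contiguous pieces. The sketch below
shows only that wrap-around handling. The struct and function names
(ring, span, ring_spans, ring_write, ring_read) are hypothetical and
simplified; the eviction of the oldest records that the recorder performs
when the buffer is full is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified ring buffer (not the patch's struct). */
struct ring {
	char *data;
	size_t capacity;	/* bytes allocated for data */
	size_t used;		/* bytes currently stored */
	size_t head;		/* next byte to read */
	size_t tail;		/* next byte to write */
};

struct span {
	char *ptr;
	size_t len;
};

/* Split [offset, offset + len) into at most two contiguous spans. */
static void
ring_spans(struct ring *r, size_t offset, size_t len, struct span out[2])
{
	size_t first = r->capacity - offset;

	if (first > len)
		first = len;

	out[0].ptr = r->data + offset;
	out[0].len = first;
	out[1].ptr = r->data;		/* wrapped part, may be empty */
	out[1].len = len - first;
}

/* Append len bytes; the caller must ensure there is enough free space. */
static void
ring_write(struct ring *r, const char *src, size_t len)
{
	struct span s[2];

	assert(len <= r->capacity - r->used);
	ring_spans(r, r->tail, len, s);
	memcpy(s[0].ptr, src, s[0].len);
	memcpy(s[1].ptr, src + s[0].len, s[1].len);
	r->tail = (r->tail + len) % r->capacity;
	r->used += len;
}

/* Remove len bytes from the front; returns 0 if fewer are stored. */
static size_t
ring_read(struct ring *r, char *dst, size_t len)
{
	struct span s[2];

	if (len > r->used)
		return 0;
	ring_spans(r, r->head, len, s);
	memcpy(dst, s[0].ptr, s[0].len);
	memcpy(dst + s[0].len, s[1].ptr, s[1].len);
	r->head = (r->head + len) % r->capacity;
	r->used -= len;
	return len;
}
```

In this sketch the second span always points at the start of the
allocation, even when its length is zero, so memcpy() is never handed a
NULL pointer.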
lib/i915/perf_data.h | 88 ++
lib/meson.build | 1 +
tools/i915-perf/i915_perf_control.c | 133 +++
tools/i915-perf/i915_perf_recorder.c | 931 ++++++++++++++++++
tools/i915-perf/i915_perf_recorder_commands.h | 39 +
tools/i915-perf/meson.build | 12 +
6 files changed, 1204 insertions(+)
create mode 100644 lib/i915/perf_data.h
create mode 100644 tools/i915-perf/i915_perf_control.c
create mode 100644 tools/i915-perf/i915_perf_recorder.c
create mode 100644 tools/i915-perf/i915_perf_recorder_commands.h
diff --git a/lib/i915/perf_data.h b/lib/i915/perf_data.h
new file mode 100644
index 00000000..13791187
--- /dev/null
+++ b/lib/i915/perf_data.h
@@ -0,0 +1,88 @@
+/*
+ * Copyright (C) 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef PERF_DATA_H
+#define PERF_DATA_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* The structures below are embedded in the i915-perf stream so as to
+ * provide metadata. The types used in the
+ * drm_i915_perf_record_header.type are defined in
+ * intel_perf_record_type.
+ *
+ * Once defined, those structures cannot change. If you need to add
+ * new data, just define a new structure & record_type.
+ */
+
+#include <stdint.h>
+
+enum intel_perf_record_type {
+ /* Start at 65536, which is pretty safe since after 3 years the
+ * kernel hasn't defined more than 3 entries.
+ */
+
+ /* intel_perf_record_device_info */
+ INTEL_PERF_RECORD_TYPE_DEVICE_INFO = 1 << 16,
+
+ /* intel_perf_record_device_topology */
+ INTEL_PERF_RECORD_TYPE_DEVICE_TOPOLOGY,
+
+ /* intel_perf_record_timestamp_correlation */
+ INTEL_PERF_RECORD_TYPE_TIMESTAMP_CORRELATION,
+};
+
+struct intel_perf_record_device_info {
+ /* Frequency of the timestamps in the records. */
+ uint64_t timestamp_frequency;
+
+ /* PCI ID */
+ uint32_t device_id;
+
+ /* enum drm_i915_oa_format */
+ uint32_t oa_format;
+
+ /* Configuration identifier */
+ char uuid[40];
+};
+
+/* Topology as reported by i915. */
+struct intel_perf_record_device_topology {
+ struct drm_i915_query_topology_info topology;
+};
+
+/* Timestamp correlation between CPU/GPU. */
+struct intel_perf_record_timestamp_correlation {
+ /* In CLOCK_MONOTONIC */
+ uint64_t cpu_timestamp;
+
+ /* Engine timestamp associated with the OA unit */
+ uint64_t gpu_timestamp;
+};
+
+#ifdef __cplusplus
+};
+#endif
+
+#endif /* PERF_DATA_H */
diff --git a/lib/meson.build b/lib/meson.build
index edff8a67..6e935d45 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -217,6 +217,7 @@ install_headers(
'igt_list.h',
'intel_chipset.h',
'i915/perf.h',
+ 'i915/perf_data.h',
subdir : 'i915-perf'
)
diff --git a/tools/i915-perf/i915_perf_control.c b/tools/i915-perf/i915_perf_control.c
new file mode 100644
index 00000000..a8d0d30f
--- /dev/null
+++ b/tools/i915-perf/i915_perf_control.c
@@ -0,0 +1,133 @@
+/*
+ * Copyright (C) 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <getopt.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+
+#include "i915_perf_recorder_commands.h"
+
+static void
+usage(const char *name)
+{
+ fprintf(stdout,
+ "Usage: %s [options]\n"
+ "\n"
+ " --help, -h Print this screen\n"
+ " --command-fifo, -f <path> Path to a command fifo\n"
+ " --dump, -d <path> Write the content of the circular buffer to path\n",
+ name);
+}
+
+int
+main(int argc, char *argv[])
+{
+ const struct option long_options[] = {
+ {"help", no_argument, 0, 'h'},
+ {"dump", required_argument, 0, 'd'},
+ {"command-fifo", required_argument, 0, 'f'},
+ {"quit", no_argument, 0, 'q'},
+ {0, 0, 0, 0}
+ };
+ const char *command_fifo = I915_PERF_RECORD_FIFO_PATH, *dump_file = NULL;
+ FILE *command_fifo_file;
+ int opt;
+ bool quit = false;
+
+ while ((opt = getopt_long(argc, argv, "hd:f:q", long_options, NULL)) != -1) {
+ switch (opt) {
+ case 'h':
+ usage(argv[0]);
+ return EXIT_SUCCESS;
+ case 'd':
+ dump_file = optarg;
+ break;
+ case 'f':
+ command_fifo = optarg;
+ break;
+ case 'q':
+ quit = true;
+ break;
+ default:
+ fprintf(stderr, "Internal error: "
+ "unexpected getopt value: %d\n", opt);
+ usage(argv[0]);
+ return EXIT_FAILURE;
+ }
+ }
+
+ if (!command_fifo)
+ return EXIT_FAILURE;
+
+ command_fifo_file = fopen(command_fifo, "r+");
+ if (!command_fifo_file) {
+ fprintf(stderr, "Unable to open command file\n");
+ return EXIT_FAILURE;
+ }
+
+ if (dump_file) {
+ if (dump_file[0] == '/') {
+ uint32_t total_len =
+ sizeof(struct recorder_command_base) + strlen(dump_file) + 1;
+ struct {
+ struct recorder_command_base base;
+ struct recorder_command_dump dump;
+ } *data = malloc(total_len);
+
+ data->base.command = RECORDER_COMMAND_DUMP;
+ data->base.size = total_len;
+ snprintf((char *) data->dump.path, strlen(dump_file) + 1, "%s", dump_file);
+
+ fwrite(data, total_len, 1, command_fifo_file);
+ } else {
+ char *cwd = get_current_dir_name();
+ uint32_t path_len = strlen(cwd) + 1 + strlen(dump_file) + 1;
+ uint32_t total_len = sizeof(struct recorder_command_base) + path_len;
+ struct {
+ struct recorder_command_base base;
+ struct recorder_command_dump dump;
+ } *data = malloc(total_len);
+
+ data->base.command = RECORDER_COMMAND_DUMP;
+ data->base.size = total_len;
+ snprintf((char *) data->dump.path, path_len, "%s/%s", cwd, dump_file);
+
+ fwrite(data, total_len, 1, command_fifo_file);
+ }
+ }
+
+ if (quit) {
+ struct recorder_command_base base = {
+ .command = RECORDER_COMMAND_QUIT,
+ .size = sizeof(base),
+ };
+
+ fwrite(&base, sizeof(base), 1, command_fifo_file);
+ }
+
+ fclose(command_fifo_file);
+
+ return EXIT_SUCCESS;
+}
diff --git a/tools/i915-perf/i915_perf_recorder.c b/tools/i915-perf/i915_perf_recorder.c
new file mode 100644
index 00000000..61bde5ba
--- /dev/null
+++ b/tools/i915-perf/i915_perf_recorder.c
@@ -0,0 +1,931 @@
+/*
+ * Copyright (C) 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <assert.h>
+#include <dirent.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <getopt.h>
+#include <inttypes.h>
+#include <poll.h>
+#include <signal.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/ioctl.h>
+#include <sys/stat.h>
+#include <sys/sysmacros.h>
+#include <sys/time.h>
+#include <sys/types.h>
+#include <time.h>
+#include <unistd.h>
+
+#include <i915_drm.h>
+
+#include "igt_core.h"
+#include "intel_chipset.h"
+#include "i915/perf.h"
+#include "i915/perf_data.h"
+
+#include "i915_perf_recorder_commands.h"
+
+#define ALIGN(v, a) (((v) + (a)-1) & ~((a)-1))
+#define ARRAY_SIZE(arr) (sizeof(arr)/sizeof((arr)[0]))
+#define MAX(a,b) ((a) > (b) ? (a) : (b))
+#define MIN(a,b) ((a) < (b) ? (a) : (b))
+
+struct circular_buffer {
+ char *data;
+ size_t allocated_size;
+ size_t size;
+ size_t beginpos;
+ size_t endpos;
+};
+
+struct chunk {
+ char *data;
+ size_t len;
+};
+
+static size_t
+circular_available_size(const struct circular_buffer *buffer)
+{
+ assert(buffer->size <= buffer->allocated_size);
+ return buffer->allocated_size - buffer->size;
+}
+
+static void
+get_chunks(struct chunk *chunks, struct circular_buffer *buffer, bool write, size_t len)
+{
+ size_t offset = write ? buffer->endpos : buffer->beginpos;
+
+ if (write)
+ assert(circular_available_size(buffer) >= len);
+ else
+ assert(buffer->size >= len);
+
+ chunks[0].data = &buffer->data[offset];
+
+ if ((offset + len) > buffer->allocated_size) {
+ chunks[0].len = buffer->allocated_size - offset;
+ chunks[1].data = buffer->data;
+ chunks[1].len = len - (buffer->allocated_size - offset);
+ } else {
+ chunks[0].len = len;
+ chunks[1].data = NULL;
+ chunks[1].len = 0;
+ }
+}
+
+static ssize_t
+circular_buffer_read(void *c, char *buf, size_t size)
+{
+ struct circular_buffer *buffer = c;
+ struct chunk chunks[2];
+
+ if (buffer->size < size)
+ return -1;
+
+ get_chunks(chunks, buffer, false, size);
+
+ memcpy(buf, chunks[0].data, chunks[0].len);
+ memcpy(buf + chunks[0].len, chunks[1].data, chunks[1].len);
+ buffer->beginpos = (buffer->beginpos + size) % buffer->allocated_size;
+ buffer->size -= size;
+
+ return size;
+}
+
+static size_t
+peek_item_size(struct circular_buffer *buffer)
+{
+ struct drm_i915_perf_record_header header;
+ struct chunk chunks[2];
+
+ if (!buffer->size)
+ return 0;
+
+ assert(buffer->size >= sizeof(header));
+
+ get_chunks(chunks, buffer, false, sizeof(header));
+ memcpy(&header, chunks[0].data, chunks[0].len);
+ memcpy((char *) &header + chunks[0].len, chunks[1].data, chunks[1].len);
+
+ return header.size;
+}
+
+static void
+circular_shrink(struct circular_buffer *buffer, size_t size)
+{
+ size_t shrank = 0, item_size;
+
+ assert(size <= buffer->allocated_size);
+
+ while (shrank < size && buffer->size > (item_size = peek_item_size(buffer))) {
+ assert(item_size > 0 && item_size <= buffer->allocated_size);
+
+ buffer->beginpos = (buffer->beginpos + item_size) % buffer->allocated_size;
+ buffer->size -= item_size;
+
+ shrank += item_size;
+ }
+}
+
+static ssize_t
+circular_buffer_write(void *c, const char *buf, size_t _size)
+{
+ struct circular_buffer *buffer = c;
+ size_t size = _size;
+
+ while (size) {
+ size_t avail = circular_available_size(buffer), item_size;
+ struct chunk chunks[2];
+
+ /* Make space in the buffer if there is too much data. */
+ if (avail < size)
+ circular_shrink(buffer, size - avail);
+
+ item_size = MIN(circular_available_size(buffer), size);
+
+ get_chunks(chunks, buffer, true, item_size);
+
+ memcpy(chunks[0].data, buf, chunks[0].len);
+ memcpy(chunks[1].data, buf + chunks[0].len, chunks[1].len);
+
+ buf += item_size;
+ size -= item_size;
+
+ buffer->endpos = (buffer->endpos + item_size) % buffer->allocated_size;
+ buffer->size += item_size;
+ }
+
+ return _size;
+}
+
+static int
+circular_buffer_seek(void *c, off64_t *offset, int whence)
+{
+ return -1;
+}
+
+static int
+circular_buffer_close(void *c)
+{
+ return 0;
+}
+
+cookie_io_functions_t circular_buffer_functions = {
+ .read = circular_buffer_read,
+ .write = circular_buffer_write,
+ .seek = circular_buffer_seek,
+ .close = circular_buffer_close,
+};
+
+
+static bool
+read_file_uint64(const char *file, uint64_t *value)
+{
+ char buf[32];
+ int fd, n;
+
+ fd = open(file, 0);
+ if (fd < 0)
+ return false;
+ n = read(fd, buf, sizeof (buf) - 1);
+ close(fd);
+ if (n < 0)
+ return false;
+
+ buf[n] = '\0';
+ *value = strtoull(buf, 0, 0);
+
+ return true;
+}
+
+static uint32_t
+read_device_param(const char *stem, int id, const char *param)
+{
+ char *name;
+ int ret = asprintf(&name, "/sys/class/drm/%s%u/device/%s", stem, id, param);
+ uint64_t value;
+ bool success;
+
+ assert(ret != -1);
+
+ success = read_file_uint64(name, &value);
+ free(name);
+
+ return success ? value : 0;
+}
+
+static int
+find_intel_render_node(void)
+{
+ for (int i = 128; i < (128 + 16); i++) {
+ if (read_device_param("renderD", i, "vendor") == 0x8086)
+ return i;
+ }
+
+ return -1;
+}
+
+static int
+open_render_node(uint32_t *devid)
+{
+ char *name;
+ int ret;
+ int fd;
+
+ int render = find_intel_render_node();
+ if (render < 0)
+ return -1;
+
+ ret = asprintf(&name, "/dev/dri/renderD%u", render);
+ assert(ret != -1);
+
+ *devid = read_device_param("renderD", render, "device");
+
+ fd = open(name, O_RDWR);
+ free(name);
+
+ return fd;
+}
+
+static uint32_t
+oa_exponent_for_period(uint64_t device_timestamp_frequency, double period)
+{
+ uint64_t period_ns = 1000 * 1000 * 1000 * period;
+ uint64_t device_periods[32];
+
+ for (uint32_t i = 0; i < ARRAY_SIZE(device_periods); i++)
+ device_periods[i] = 1000000000ull * (1u << i) / device_timestamp_frequency;
+
+ for (uint32_t i = 1; i < ARRAY_SIZE(device_periods); i++) {
+ if (period_ns >= device_periods[i - 1] &&
+ period_ns < device_periods[i]) {
+ if ((device_periods[i] - period_ns) >
+ (period_ns - device_periods[i - 1]))
+ return i - 1;
+ return i;
+ }
+ }
+
+ return -1;
+}
+
+static int
+perf_ioctl(int fd, unsigned long request, void *arg)
+{
+ int ret;
+
+ do {
+ ret = ioctl(fd, request, arg);
+ } while (ret == -1 && (errno == EINTR || errno == EAGAIN));
+
+ return ret;
+}
+
+static uint64_t
+get_device_timestamp_frequency(const struct intel_device_info *devinfo, int drm_fd)
+{
+ drm_i915_getparam_t gp;
+ int timestamp_frequency;
+
+ gp.param = I915_PARAM_CS_TIMESTAMP_FREQUENCY;
+ gp.value = &timestamp_frequency;
+ if (perf_ioctl(drm_fd, DRM_IOCTL_I915_GETPARAM, &gp) == 0)
+ return timestamp_frequency;
+
+ if (devinfo->gen > 9) {
+ fprintf(stderr, "Unable to query timestamp frequency from i915, please update kernel.\n");
+ return 0;
+ }
+
+ fprintf(stderr, "Warning: unable to query timestamp frequency from i915, guessing values...\n");
+
+ if (devinfo->gen <= 8)
+ return 12500000;
+ if (devinfo->is_broxton)
+ return 19200000;
+ return 12000000;
+}
+
+static int
+perf_open(int drm_fd,
+ const struct intel_device_info *devinfo,
+ uint32_t oa_exponent,
+ const struct intel_perf_metric_set *metric_set)
+{
+ uint64_t properties[DRM_I915_PERF_PROP_MAX * 2];
+ struct drm_i915_perf_open_param param;
+ int p = 0, stream_fd;
+
+ properties[p++] = DRM_I915_PERF_PROP_SAMPLE_OA;
+ properties[p++] = true;
+
+ properties[p++] = DRM_I915_PERF_PROP_OA_METRICS_SET;
+ properties[p++] = metric_set->perf_oa_metrics_set;
+
+ properties[p++] = DRM_I915_PERF_PROP_OA_FORMAT;
+ properties[p++] = metric_set->perf_oa_format;
+
+ properties[p++] = DRM_I915_PERF_PROP_OA_EXPONENT;
+ properties[p++] = oa_exponent;
+
+ memset(&param, 0, sizeof(param));
+ param.flags = 0;
+ param.flags |= I915_PERF_FLAG_FD_CLOEXEC | I915_PERF_FLAG_FD_NONBLOCK;
+ param.properties_ptr = (uintptr_t)properties;
+ param.num_properties = p / 2;
+
+ stream_fd = perf_ioctl(drm_fd, DRM_IOCTL_I915_PERF_OPEN, &param);
+ return stream_fd;
+}
+
+static bool quit = false;
+
+static void
+sigint_handler(int val)
+{
+ quit = true;
+}
+
+static bool
+write_header(FILE *output,
+ uint32_t device_id,
+ uint64_t timestamp_frequency,
+ const struct intel_perf_metric_set *metric_set)
+{
+ struct intel_perf_record_device_info info = {
+ .timestamp_frequency = timestamp_frequency,
+ .device_id = device_id,
+ .oa_format = metric_set->perf_oa_format,
+ };
+ struct drm_i915_perf_record_header header = {
+ .type = INTEL_PERF_RECORD_TYPE_DEVICE_INFO,
+ .size = sizeof(header) + sizeof(info),
+ };
+
+ snprintf(info.uuid, sizeof(info.uuid), "%s", metric_set->hw_config_guid);
+
+ if (fwrite(&header, sizeof(header), 1, output) != 1)
+ return false;
+
+ if (fwrite(&info, sizeof(info), 1, output) != 1)
+ return false;
+
+ return true;
+}
+
+static bool
+write_topology(FILE *output, int drm_fd)
+{
+ struct drm_i915_perf_record_header header = {
+ .type = INTEL_PERF_RECORD_TYPE_DEVICE_TOPOLOGY,
+ };
+ struct drm_i915_query query = {};
+ struct drm_i915_query_topology_info *topo_info;
+ struct drm_i915_query_item item = {
+ .query_id = DRM_I915_QUERY_TOPOLOGY_INFO,
+ };
+ int ret;
+
+ query.num_items = 1;
+ query.items_ptr = (uintptr_t) &item;
+
+ /* May not be available on older kernels. */
+ ret = perf_ioctl(drm_fd, DRM_IOCTL_I915_QUERY, &query);
+ if (ret < 0)
+ return true;
+
+ assert(item.length > 0);
+ topo_info = malloc(item.length);
+ item.data_ptr = (uintptr_t) topo_info;
+
+ ret = perf_ioctl(drm_fd, DRM_IOCTL_I915_QUERY, &query);
+ assert(ret == 0);
+
+ header.size = sizeof(header) + item.length;
+ if (fwrite(&header, sizeof(header), 1, output) != 1)
+ return false;
+
+ if (fwrite(topo_info, item.length, 1, output) != 1)
+ return false;
+
+ return true;
+}
+
+static bool
+write_i915_perf_data(FILE *output, int perf_fd)
+{
+ ssize_t ret;
+ char data[4096];
+
+ while ((ret = read(perf_fd, data, sizeof(data))) > 0 ||
+ (ret < 0 && errno == EINTR)) {
+ if (ret > 0 && fwrite(data, ret, 1, output) != 1)
+ return false;
+ }
+
+ return true;
+}
+
+static uint64_t timespec_diff(struct timespec *begin,
+ struct timespec *end)
+{
+ return 1000000000ull * (end->tv_sec - begin->tv_sec) + end->tv_nsec - begin->tv_nsec;
+}
+
+static clockid_t correlation_clock_id = CLOCK_MONOTONIC;
+
+static bool
+get_correlation_timestamps(struct intel_perf_record_timestamp_correlation *corr, int drm_fd)
+{
+ struct drm_i915_reg_read reg_read;
+ struct {
+ struct timespec cpu_ts_begin;
+ struct timespec cpu_ts_end;
+ uint64_t gpu_ts;
+ } attempts[3];
+ uint32_t best = 0;
+
+#define RENDER_RING_TIMESTAMP 0x2358
+
+ reg_read.offset = RENDER_RING_TIMESTAMP | I915_REG_READ_8B_WA;
+
+ /* Gather 3 correlations. */
+ for (uint32_t i = 0; i < ARRAY_SIZE(attempts); i++) {
+ clock_gettime(correlation_clock_id, &attempts[i].cpu_ts_begin);
+ if (perf_ioctl(drm_fd, DRM_IOCTL_I915_REG_READ, &reg_read) < 0)
+ return false;
+ clock_gettime(correlation_clock_id, &attempts[i].cpu_ts_end);
+
+ attempts[i].gpu_ts = reg_read.val;
+ }
+
+ /* Now select the best. */
+ for (uint32_t i = 1; i < ARRAY_SIZE(attempts); i++) {
+ if (timespec_diff(&attempts[i].cpu_ts_begin,
+ &attempts[i].cpu_ts_end) <
+ timespec_diff(&attempts[best].cpu_ts_begin,
+ &attempts[best].cpu_ts_end))
+ best = i;
+ }
+
+ corr->cpu_timestamp =
+ (attempts[best].cpu_ts_begin.tv_sec * 1000000000ull +
+ attempts[best].cpu_ts_begin.tv_nsec) +
+ timespec_diff(&attempts[best].cpu_ts_begin,
+ &attempts[best].cpu_ts_end) / 2;
+ corr->gpu_timestamp = attempts[best].gpu_ts;
+
+ return true;
+}
+
+static bool
+write_saved_correlation_timestamps(FILE *output,
+ const struct intel_perf_record_timestamp_correlation *corr)
+{
+ struct drm_i915_perf_record_header header = {
+ .type = INTEL_PERF_RECORD_TYPE_TIMESTAMP_CORRELATION,
+ .size = sizeof(header) + sizeof(*corr),
+ };
+
+ if (fwrite(&header, sizeof(header), 1, output) != 1)
+ return false;
+
+ if (fwrite(corr, sizeof(*corr), 1, output) != 1)
+ return false;
+
+ return true;
+}
+
+static bool
+write_correlation_timestamps(FILE *output, int drm_fd)
+{
+ struct intel_perf_record_timestamp_correlation corr;
+
+ if (!get_correlation_timestamps(&corr, drm_fd))
+ return false;
+
+ return write_saved_correlation_timestamps(output, &corr);
+}
+
+static void
+read_command_file(int command_fd, FILE *output_stream, struct circular_buffer *buffer,
+ int drm_fd, uint32_t devid, uint64_t timestamp_frequency,
+ struct intel_perf_metric_set *metric_set)
+{
+ struct recorder_command_base header;
+ ssize_t ret = read(command_fd, &header, sizeof(header));
+
+ if (ret != sizeof(header))
+ return;
+
+ switch (header.command) {
+ case RECORDER_COMMAND_DUMP: {
+ uint32_t len = header.size - sizeof(header), offset = 0;
+ struct recorder_command_dump *dump = malloc(len);
+ FILE *file;
+
+ while (offset < len &&
+ ((ret = read(command_fd, (void *) dump + offset, len - offset)) > 0
+ || errno == EAGAIN)) {
+ if (ret > 0)
+ offset += ret;
+ }
+
+ fprintf(stdout, "Writing circular buffer to %s\n", dump->path);
+
+ file = fopen((const char *) dump->path, "w+");
+ if (file) {
+ struct chunk chunks[2];
+
+ fflush(output_stream);
+ get_chunks(chunks, buffer, false, buffer->size);
+
+ if (!write_header(file, devid, timestamp_frequency, metric_set) ||
+ !write_topology(file, drm_fd) ||
+ fwrite(chunks[0].data, chunks[0].len, 1, file) != 1 ||
+ (chunks[1].len > 0 &&
+ fwrite(chunks[1].data, chunks[1].len, 1, file) != 1) ||
+ !write_correlation_timestamps(file, drm_fd)) {
+ fprintf(stderr, "Unable to write circular buffer data in file '%s'\n",
+ dump->path);
+ }
+ fclose(file);
+ } else
+ fprintf(stderr, "Unable to write dump file '%s'\n", dump->path);
+
+ free(dump);
+ break;
+ }
+ case RECORDER_COMMAND_QUIT:
+ quit = true;
+ break;
+ default:
+ fprintf(stderr, "Unknown command 0x%x\n", header.command);
+ break;
+ }
+}
+
+static void
+print_metric_sets(const struct intel_perf *perf)
+{
+ struct intel_perf_metric_set *metric_set;
+ uint32_t longest_name = 0;
+
+ igt_list_for_each_entry(metric_set, &perf->metric_sets, link) {
+ longest_name = MAX(longest_name, strlen(metric_set->symbol_name));
+ }
+
+ igt_list_for_each_entry(metric_set, &perf->metric_sets, link) {
+ fprintf(stdout, "%s:%*s%s\t\n",
+ metric_set->symbol_name,
+ (int) (longest_name - strlen(metric_set->symbol_name) + 1), " ",
+ metric_set->name);
+ }
+}
+
+static void
+print_metric_set_counters(const struct intel_perf_metric_set *metric_set)
+{
+ uint32_t longest_name = 0;
+ for (uint32_t i = 0; i < metric_set->n_counters; i++) {
+ longest_name = MAX(longest_name, strlen(metric_set->counters[i].name));
+ }
+
+ fprintf(stdout, "Metric set %s:\n", metric_set->name);
+ for (uint32_t i = 0; i < metric_set->n_counters; i++) {
+ struct intel_perf_logical_counter *counter = &metric_set->counters[i];
+
+ fprintf(stdout, "%s:%*s%s\n",
+ counter->name,
+ (int)(longest_name - strlen(counter->name) + 1), " ",
+ counter->desc);
+ }
+}
+
+static void
+usage(const char *name)
+{
+ fprintf(stdout,
+ "Usage: %s [options]\n"
+ "\n"
+ " --help, -h Print this screen\n"
+ " --correlation-period, -c <value> Time period of timestamp correlation in seconds\n"
+ " (default = 1.0)\n"
+ " --perf-period, -p <value> Time period of i915-perf reports in seconds\n"
+ " (default = 0.001)\n"
+ " --metric, -m <value> i915 metric to sample with\n"
+ " --counters, -C List counters for a given metric and exit\n"
+ " --size, -s <value> Size of circular buffer to use in kilobytes\n"
+ " If specified, a maximum amount of <value> data will\n"
+ " be recorded.\n"
+ " --command-fifo, -f <path> Path to a command fifo, implies circular buffer\n"
+ " (To use with i915-perf-control)\n"
+ " --output, -o <path> Output file (default = i915_perf.record)\n"
+ " --cpu-clock, -k <path> Cpu clock to use for correlations\n"
+ " Values: boot, mono, mono_raw (default = mono)\n",
+ name);
+}
+
+int
+main(int argc, char *argv[])
+{
+ const struct option long_options[] = {
+ {"help", no_argument, 0, 'h'},
+ {"correlation-period", required_argument, 0, 'c'},
+ {"perf-period", required_argument, 0, 'p'},
+ {"metric", required_argument, 0, 'm'},
+ {"counters", no_argument, 0, 'C'},
+ {"output", required_argument, 0, 'o'},
+ {"size", required_argument, 0, 's'},
+ {"command-fifo", required_argument, 0, 'f'},
+ {"cpu-clock", required_argument, 0, 'k'},
+ {0, 0, 0, 0}
+ };
+ const struct {
+ clockid_t id;
+ const char *name;
+ } clock_names[] = {
+ { CLOCK_BOOTTIME, "boot" },
+ { CLOCK_MONOTONIC, "mono" },
+ { CLOCK_MONOTONIC_RAW, "mono_raw" },
+ };
+ const struct intel_device_info *devinfo;
+ double corr_period = 1.0, perf_period = 0.001;
+ const char *metric_name = NULL, *output_file = "i915_perf.record", *command_fifo = I915_PERF_RECORD_FIFO_PATH;
+ struct intel_perf *perf;
+ struct intel_perf_metric_set *metric_set, *selected_metric_set = NULL;
+ struct intel_perf_record_timestamp_correlation initial_correlation;
+ struct circular_buffer circular_buffer;
+ struct timespec now;
+ uint64_t corr_period_ns, poll_time_ns, timestamp_frequency;
+ uint32_t devid = 0, oa_exponent;
+ uint32_t circular_size = 0;
+ int drm_fd, perf_fd, command_fifo_fd = -1;
+ int opt;
+ bool list_counters = false;
+ FILE *output = NULL, *output_stream;
+
+ while ((opt = getopt_long(argc, argv, "hc:p:m:Co:s:f:k:", long_options, NULL)) != -1) {
+ switch (opt) {
+ case 'h':
+ usage(argv[0]);
+ return EXIT_SUCCESS;
+ case 'c':
+ corr_period = atof(optarg);
+ break;
+ case 'p':
+ perf_period = atof(optarg);
+ break;
+ case 'm':
+ metric_name = optarg;
+ break;
+ case 'C':
+ list_counters = true;
+ break;
+ case 'o':
+ output_file = optarg;
+ break;
+ case 's':
+ circular_size = MAX(8, atoi(optarg)) * 1024;
+ break;
+ case 'f':
+ command_fifo = optarg;
+ circular_size = 8 * 1024 * 1024;
+ break;
+ case 'k': {
+ bool found = false;
+ for (uint32_t i = 0; i < ARRAY_SIZE(clock_names); i++) {
+ if (!strcmp(clock_names[i].name, optarg)) {
+ correlation_clock_id = clock_names[i].id;
+ found = true;
+ break;
+ }
+ }
+ if (!found) {
+ fprintf(stderr, "Unknown clock name '%s'\n", optarg);
+ return EXIT_FAILURE;
+ }
+ break;
+ }
+ default:
+ fprintf(stderr, "Internal error: "
+ "unexpected getopt value: %d\n", opt);
+ usage(argv[0]);
+ return EXIT_FAILURE;
+ }
+ }
+
+ drm_fd = open_render_node(&devid);
+
+ devinfo = intel_get_device_info(devid);
+ if (!devinfo) {
+ fprintf(stderr, "No device info found.\n");
+ return EXIT_FAILURE;
+ }
+
+ fprintf(stdout, "Device name=%s gen=%i gt=%i id=0x%x\n",
+ devinfo->codename, devinfo->gen, devinfo->gt, devid);
+
+ perf = intel_perf_for_devinfo(devinfo);
+ if (!perf) {
+ fprintf(stderr, "No perf data found.\n");
+ return EXIT_FAILURE;
+ }
+
+ if (!metric_name) {
+ print_metric_sets(perf);
+ return EXIT_FAILURE;
+ }
+
+ igt_list_for_each_entry(metric_set, &perf->metric_sets, link) {
+ if (!strcasecmp(metric_set->symbol_name, metric_name)) {
+ selected_metric_set = metric_set;
+ break;
+ }
+ }
+
+ if (!selected_metric_set) {
+ fprintf(stderr, "Unknown metric set '%s'\n", metric_name);
+ print_metric_sets(perf);
+ return EXIT_FAILURE;
+ }
+
+ if (list_counters) {
+ print_metric_set_counters(selected_metric_set);
+ return EXIT_SUCCESS;
+ }
+
+ intel_perf_load_perf_configs(perf, drm_fd);
+
+ timestamp_frequency = get_device_timestamp_frequency(devinfo, drm_fd);
+
+ signal(SIGINT, sigint_handler);
+
+ if (command_fifo) {
+ if (mkfifo(command_fifo, S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH) != 0) {
+ fprintf(stderr, "Unable to create command fifo '%s': %s\n",
+ command_fifo, strerror(errno));
+ return EXIT_FAILURE;
+ }
+
+ command_fifo_fd = open(command_fifo, O_RDWR);
+ if (command_fifo_fd < 0) {
+ fprintf(stderr, "Unable to open command fifo '%s': %s\n",
+ command_fifo, strerror(errno));
+ return EXIT_FAILURE;
+ }
+ } else {
+ output = fopen(output_file, "w+");
+ if (!output) {
+ fprintf(stderr, "Unable to open output file '%s'\n",
+ output_file);
+ return EXIT_FAILURE;
+ }
+ }
+
+ if (circular_size) {
+ memset(&circular_buffer, 0, sizeof(circular_buffer));
+ circular_buffer.allocated_size = circular_size;
+ circular_buffer.data = malloc(circular_size);
+ if (!circular_buffer.data) {
+ fprintf(stderr, "Unable to allocate circular buffer\n");
+ return EXIT_FAILURE;
+ }
+
+ output_stream = fopencookie(&circular_buffer, "w+",
+ circular_buffer_functions);
+ if (!output_stream) {
+ fprintf(stderr, "Unable to create circular buffer\n");
+ return EXIT_FAILURE;
+ }
+
+ if (!get_correlation_timestamps(&initial_correlation, drm_fd)) {
+ fprintf(stderr, "Unable to get correlation timestamps\n");
+ return EXIT_FAILURE;
+ }
+
+ write_correlation_timestamps(output_stream, drm_fd);
+ } else {
+ if (!write_header(output, devid, timestamp_frequency, selected_metric_set) ||
+ !write_topology(output, drm_fd) ||
+ !write_correlation_timestamps(output, drm_fd)) {
+ fprintf(stderr, "Unable to write header in file '%s'\n",
+ output_file);
+ return EXIT_FAILURE;
+ }
+
+ output_stream = output;
+ }
+
+ if (selected_metric_set->perf_oa_metrics_set == 0) {
+ fprintf(stderr,
+ "Unable to load performance configuration, consider running:\n"
+ " sysctl dev.i915.perf_stream_paranoid=0\n");
+ return EXIT_FAILURE;
+ }
+
+ oa_exponent = oa_exponent_for_period(timestamp_frequency, perf_period);
+ fprintf(stdout, "Opening perf stream with metric_id=%lu oa_exponent=%u\n",
+ selected_metric_set->perf_oa_metrics_set, oa_exponent);
+
+ perf_fd = perf_open(drm_fd, devinfo, oa_exponent, selected_metric_set);
+ if (perf_fd < 0) {
+ fprintf(stderr, "Unable to open i915 perf stream: %s\n",
+ strerror(errno));
+ return EXIT_FAILURE;
+ }
+
+ corr_period_ns = corr_period * 1000000000ul;
+ poll_time_ns = corr_period_ns;
+
+ while (!quit) {
+ struct pollfd pollfd[2] = {
+ { perf_fd, POLLIN, 0 },
+ { command_fifo_fd, POLLIN, 0 },
+ };
+ uint64_t elapsed_ns;
+ int ret;
+
+ igt_gettime(&now);
+ ret = poll(pollfd, command_fifo_fd != -1 ? 2 : 1, poll_time_ns / 1000000);
+ if (ret < 0 && errno != EINTR) {
+ fprintf(stderr, "Failed to poll i915-perf stream: %s\n",
+ strerror(errno));
+ break;
+ }
+
+ if (ret > 0) {
+ if (pollfd[0].revents & POLLIN) {
+ if (!write_i915_perf_data(output_stream, perf_fd)) {
+ fprintf(stderr, "Failed to write i915-perf data: %s\n",
+ strerror(errno));
+ break;
+ }
+ }
+
+ if (pollfd[1].revents & POLLIN) {
+ read_command_file(command_fifo_fd, output_stream,
+ &circular_buffer,
+ drm_fd, devid, timestamp_frequency,
+ selected_metric_set);
+ }
+ }
+
+ elapsed_ns = igt_nsec_elapsed(&now);
+ if (elapsed_ns > poll_time_ns) {
+ poll_time_ns = corr_period_ns;
+ if (!write_correlation_timestamps(output_stream, drm_fd)) {
+ fprintf(stderr,
+ "Failed to write i915 timestamp correlation data: %s\n",
+ strerror(errno));
+ break;
+ }
+ } else {
+ poll_time_ns -= elapsed_ns;
+ }
+ }
+
+ fprintf(stdout, "Exiting...\n");
+
+ if (!write_correlation_timestamps(output_stream, drm_fd)) {
+ fprintf(stderr,
+ "Failed to write final i915 timestamp correlation data: %s\n",
+ strerror(errno));
+ }
+
+ fclose(output_stream);
+
+ if (command_fifo)
+ unlink(command_fifo);
+
+ free(circular_buffer.data);
+
+ close(drm_fd);
+
+ return EXIT_SUCCESS;
+}
diff --git a/tools/i915-perf/i915_perf_recorder_commands.h b/tools/i915-perf/i915_perf_recorder_commands.h
new file mode 100644
index 00000000..4855d80f
--- /dev/null
+++ b/tools/i915-perf/i915_perf_recorder_commands.h
@@ -0,0 +1,39 @@
+/*
+ * Copyright (C) 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <stdint.h>
+
+#define I915_PERF_RECORD_FIFO_PATH "/tmp/.i915-perf-record"
+
+enum recorder_command {
+ RECORDER_COMMAND_DUMP = 1,
+ RECORDER_COMMAND_QUIT,
+};
+
+struct recorder_command_base {
+ uint32_t command;
+ uint32_t size;
+};
+
+struct recorder_command_dump {
+ uint8_t path[0];
+};
diff --git a/tools/i915-perf/meson.build b/tools/i915-perf/meson.build
index 0ebdd185..1be3ab22 100644
--- a/tools/i915-perf/meson.build
+++ b/tools/i915-perf/meson.build
@@ -3,3 +3,15 @@ executable('i915-perf-configs',
include_directories: inc,
dependencies: [lib_igt_chipset, lib_igt_i915_perf],
install: true)
+
+executable('i915-perf-recorder',
+ [ 'i915_perf_recorder.c' ],
+ include_directories: inc,
+ dependencies: [lib_igt, lib_igt_i915_perf],
+ install: true)
+
+executable('i915-perf-control',
+ [ 'i915_perf_control.c' ],
+ c_args: '-U_FORTIFY_SOURCE',
+ include_directories: inc,
+ install: true)
--
2.20.1
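
For readers following get_correlation_timestamps() in this patch: it brackets each GPU register read with two CPU clock reads, keeps the attempt with the shortest bracket, and pairs the GPU value with the midpoint of that bracket. A minimal sketch of only that selection step follows, with the clock and ioctl calls left out. The names struct sample, ts_to_ns(), bracket_ns(), pick_tightest() and midpoint_ns() are illustrative and not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for one correlation attempt. */
struct sample {
	struct timespec begin;	/* CPU clock read before the GPU read */
	struct timespec end;	/* CPU clock read after the GPU read */
	uint64_t gpu_ts;	/* GPU timestamp register value */
};

static uint64_t
ts_to_ns(const struct timespec *ts)
{
	return (uint64_t) ts->tv_sec * 1000000000ull + (uint64_t) ts->tv_nsec;
}

/* Width of the CPU-side bracket around the GPU read, in nanoseconds. */
static uint64_t
bracket_ns(const struct sample *s)
{
	return ts_to_ns(&s->end) - ts_to_ns(&s->begin);
}

/* Index of the attempt with the narrowest bracket (least uncertainty). */
static size_t
pick_tightest(const struct sample *s, size_t n)
{
	size_t best = 0;

	assert(n > 0);
	for (size_t i = 1; i < n; i++) {
		if (bracket_ns(&s[i]) < bracket_ns(&s[best]))
			best = i;
	}
	return best;
}

/* CPU time assumed to correspond to the GPU read: middle of the bracket. */
static uint64_t
midpoint_ns(const struct sample *s)
{
	return ts_to_ns(&s->begin) + bracket_ns(s) / 2;
}
```

A narrower bracket bounds the error of the midpoint estimate more tightly, which is presumably why the shortest of the three attempts is kept.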
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
* [igt-dev] [PATCH i-g-t 4/4] lib/i915/perf: Add i915 perf data reader
2020-02-15 1:11 [igt-dev] [PATCH i-g-t 0/4] Add perf OA tools for GPUvis Umesh Nerlige Ramappa
2020-02-15 1:11 ` [igt-dev] [PATCH i-g-t 2/4] lib/i915/perf: Add support for loading perf configurations Umesh Nerlige Ramappa
2020-02-15 1:11 ` [igt-dev] [PATCH i-g-t 3/4] tools/i915/perf: Add i915 perf recorder tool Umesh Nerlige Ramappa
@ 2020-02-15 1:11 ` Umesh Nerlige Ramappa
2020-02-17 13:42 ` [igt-dev] [PATCH i-g-t 0/4] Add perf OA tools for GPUvis Lionel Landwerlin
3 siblings, 0 replies; 5+ messages in thread
From: Umesh Nerlige Ramappa @ 2020-02-15 1:11 UTC (permalink / raw)
To: igt-dev, Joonas Lahtinen, Ashutosh Dixit, Lionel G Landwerlin
From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Read perf OA records and correlate timestamps between the GPU and CPU.
v2: (Umesh)
- Add README on usage
- rebase fixes for igt_list
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
---
lib/i915/perf_data_reader.c | 330 ++++++++++++++++++++++++++++++++++++
lib/i915/perf_data_reader.h | 103 +++++++++++
lib/meson.build | 2 +
tools/i915-perf/README | 70 ++++++++
4 files changed, 505 insertions(+)
create mode 100644 lib/i915/perf_data_reader.c
create mode 100644 lib/i915/perf_data_reader.h
create mode 100644 tools/i915-perf/README
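
As a reading aid for correlate_gpu_timestamp() below: the conversion from a GPU timestamp to a CPU timestamp is a linear interpolation between two neighbouring correlation points. The sketch below is simplified and ignores the 32-bit truncation of OA report timestamps, which the real code handles with correlation chunks. struct corr_point and interpolate_cpu_ns() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one (CPU, GPU) correlation point. */
struct corr_point {
	uint64_t cpu_ns;	/* CPU clock, in nanoseconds */
	uint64_t gpu_ticks;	/* GPU timestamp, in timestamp ticks */
};

/*
 * Map a GPU timestamp lying between two correlation points onto the
 * CPU clock by linear interpolation.
 *
 * The product below stays well within 64 bits for the default setup
 * (correlation points about one second apart), but could overflow for
 * very long intervals.
 */
static uint64_t
interpolate_cpu_ns(const struct corr_point *a, const struct corr_point *b,
		   uint64_t gpu_ticks)
{
	assert(b->gpu_ticks > a->gpu_ticks);
	assert(gpu_ticks >= a->gpu_ticks && gpu_ticks <= b->gpu_ticks);

	return a->cpu_ns +
		(gpu_ticks - a->gpu_ticks) * (b->cpu_ns - a->cpu_ns) /
		(b->gpu_ticks - a->gpu_ticks);
}
```

For example, with points (1000 ns, 100 ticks) and (2000 ns, 200 ticks), a GPU timestamp of 150 ticks maps to 1500 ns.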
diff --git a/lib/i915/perf_data_reader.c b/lib/i915/perf_data_reader.c
new file mode 100644
index 00000000..43683331
--- /dev/null
+++ b/lib/i915/perf_data_reader.c
@@ -0,0 +1,330 @@
+/*
+ * Copyright (C) 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <assert.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <unistd.h>
+
+#include "intel_chipset.h"
+#include "perf.h"
+#include "perf_data_reader.h"
+
+#define MAX(a,b) ((a) > (b) ? (a) : (b))
+#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
+
+static inline bool
+oa_report_ctx_is_valid(const struct intel_perf_devinfo *devinfo,
+ const uint8_t *_report)
+{
+ const uint32_t *report = (const uint32_t *) _report;
+
+ if (devinfo->gen < 8) {
+ return false; /* TODO */
+ } else if (devinfo->gen == 8) {
+ return report[0] & (1ul << 25);
+ } else if (devinfo->gen > 8) {
+ return report[0] & (1ul << 16);
+ }
+
+ return false;
+}
+
+static uint32_t
+oa_report_ctx_id(const struct intel_perf_devinfo *devinfo, const uint8_t *report)
+{
+ if (!oa_report_ctx_is_valid(devinfo, report))
+ return 0xffffffff;
+ return ((const uint32_t *) report)[2];
+}
+
+static inline uint64_t
+oa_report_timestamp(const uint8_t *report)
+{
+ return ((const uint32_t *)report)[1];
+}
+
+static void
+append_record(struct intel_perf_data_reader *reader,
+ const struct drm_i915_perf_record_header *header)
+{
+ if (reader->n_records >= reader->n_allocated_records) {
+ reader->n_allocated_records = MAX(100, 2 * reader->n_allocated_records);
+ reader->records =
+ (const struct drm_i915_perf_record_header **)
+ realloc((void *) reader->records,
+ reader->n_allocated_records *
+ sizeof(*reader->records));
+ assert(reader->records);
+ }
+
+ reader->records[reader->n_records++] = header;
+}
+
+static void
+append_timestamp_correlation(struct intel_perf_data_reader *reader,
+ const struct intel_perf_record_timestamp_correlation *corr)
+{
+ if (reader->n_correlations >= reader->n_allocated_correlations) {
+ reader->n_allocated_correlations = MAX(100, 2 * reader->n_allocated_correlations);
+ reader->correlations =
+ (const struct intel_perf_record_timestamp_correlation **)
+ realloc((void *) reader->correlations,
+ reader->n_allocated_correlations *
+ sizeof(*reader->correlations));
+ assert(reader->correlations);
+ }
+
+ reader->correlations[reader->n_correlations++] = corr;
+}
+
+static struct intel_perf_metric_set *
+find_metric_set(struct intel_perf *perf, const char *uuid)
+{
+ struct intel_perf_metric_set *metric_set;
+
+ igt_list_for_each_entry(metric_set, &perf->metric_sets, link) {
+ if (!strcmp(uuid, metric_set->hw_config_guid))
+ return metric_set;
+ }
+
+ return NULL;
+}
+
+static void
+init_devinfo(struct intel_perf_devinfo *perf_devinfo,
+ const struct intel_device_info *devinfo,
+ uint32_t devid,
+ uint64_t timestamp_frequency)
+{
+ perf_devinfo->devid = devid;
+ perf_devinfo->gen = devinfo->gen;
+ perf_devinfo->timestamp_frequency = timestamp_frequency;
+}
+
+static bool
+parse_data(struct intel_perf_data_reader *reader)
+{
+ const uint8_t *end = reader->mmap_data + reader->mmap_size;
+ const uint8_t *iter = reader->mmap_data;
+ while (iter < end) {
+ const struct drm_i915_perf_record_header *header =
+ (const struct drm_i915_perf_record_header *) iter;
+
+ switch (header->type) {
+ case DRM_I915_PERF_RECORD_SAMPLE:
+ append_record(reader, header);
+ break;
+
+ case DRM_I915_PERF_RECORD_OA_REPORT_LOST:
+ case DRM_I915_PERF_RECORD_OA_BUFFER_LOST:
+ assert(header->size == sizeof(*header));
+ break;
+
+ case INTEL_PERF_RECORD_TYPE_DEVICE_INFO: {
+ const struct intel_device_info *devinfo;
+
+ reader->record_info =
+ (const struct intel_perf_record_device_info *) (header + 1);
+ assert(header->size == (sizeof(*reader->record_info) + sizeof(*header)));
+ devinfo = intel_get_device_info(reader->record_info->device_id);
+ if (!devinfo)
+ return false;
+ init_devinfo(&reader->devinfo, devinfo,
+ reader->record_info->device_id,
+ reader->record_info->timestamp_frequency);
+ reader->perf = intel_perf_for_devinfo(devinfo);
+ reader->metric_set = find_metric_set(reader->perf, reader->record_info->uuid);
+ break;
+ }
+
+ case INTEL_PERF_RECORD_TYPE_TIMESTAMP_CORRELATION: {
+ append_timestamp_correlation(reader,
+ (const struct intel_perf_record_timestamp_correlation *) (header + 1));
+ break;
+ }
+ }
+
+ iter += header->size;
+ }
+
+ return true;
+}
+
+static uint64_t
+correlate_gpu_timestamp(struct intel_perf_data_reader *reader,
+ uint64_t gpu_ts)
+{
+ /* OA reports only have the lower 32bits of the timestamp
+ * register, while our correlation data has the whole 36bits.
+ * Try to figure what portion of the correlation data the
+ * 32bit timestamp belongs to.
+ */
+ uint64_t mask = 0xffffffff;
+ int corr_idx = -1;
+
+ for (uint32_t i = 0; i < reader->n_correlation_chunks; i++) {
+ if (gpu_ts >= (reader->correlation_chunks[i].gpu_ts_begin & mask) &&
+ gpu_ts <= (reader->correlation_chunks[i].gpu_ts_end & mask)) {
+ corr_idx = reader->correlation_chunks[i].idx;
+ break;
+ }
+ }
+
+ /* Not found? Assume prior to the first timestamp correlation.
+ */
+ if (corr_idx < 0) {
+ return reader->correlations[0]->cpu_timestamp -
+ ((reader->correlations[0]->gpu_timestamp & mask) - gpu_ts) *
+ (reader->correlations[1]->cpu_timestamp - reader->correlations[0]->cpu_timestamp) /
+ (reader->correlations[1]->gpu_timestamp - reader->correlations[0]->gpu_timestamp);
+ }
+
+ for (uint32_t i = corr_idx; i < (reader->n_correlations - 1); i++) {
+ if (gpu_ts >= (reader->correlations[i]->gpu_timestamp & mask) &&
+ gpu_ts < (reader->correlations[i + 1]->gpu_timestamp & mask)) {
+ return reader->correlations[i]->cpu_timestamp +
+ (gpu_ts - (reader->correlations[i]->gpu_timestamp & mask)) *
+ (reader->correlations[i + 1]->cpu_timestamp - reader->correlations[i]->cpu_timestamp) /
+ (reader->correlations[i + 1]->gpu_timestamp - reader->correlations[i]->gpu_timestamp);
+ }
+ }
+
+ /* This is a bit harsh, but the recording tool should ensure we have
+ * sampling points on either side of the bag of OA reports.
+ */
+ assert(0);
+}
+
+static void
+append_timeline_event(struct intel_perf_data_reader *reader,
+ uint64_t ts_start, uint64_t ts_end,
+ uint32_t record_start, uint32_t record_end,
+ uint32_t hw_id)
+{
+ if (reader->n_timelines >= reader->n_allocated_timelines) {
+ reader->n_allocated_timelines = MAX(100, 2 * reader->n_allocated_timelines);
+ reader->timelines =
+ (struct intel_perf_timeline_item *)
+ realloc((void *) reader->timelines,
+ reader->n_allocated_timelines *
+ sizeof(*reader->timelines));
+ assert(reader->timelines);
+ }
+
+ reader->timelines[reader->n_timelines].ts_start = ts_start;
+ reader->timelines[reader->n_timelines].ts_end = ts_end;
+ reader->timelines[reader->n_timelines].cpu_ts_start =
+ correlate_gpu_timestamp(reader, ts_start);
+ reader->timelines[reader->n_timelines].cpu_ts_end =
+ correlate_gpu_timestamp(reader, ts_end);
+ reader->timelines[reader->n_timelines].record_start = record_start;
+ reader->timelines[reader->n_timelines].record_end = record_end;
+ reader->timelines[reader->n_timelines].hw_id = hw_id;
+ reader->n_timelines++;
+}
+
+static void
+generate_cpu_events(struct intel_perf_data_reader *reader)
+{
+ uint32_t last_header_idx = 0;
+ const struct drm_i915_perf_record_header *last_header = reader->records[0];
+
+ for (uint32_t i = 1; i < reader->n_records; i++) {
+ const struct drm_i915_perf_record_header *current_header =
+ reader->records[i];
+ const uint8_t *start_report = (const uint8_t *) (last_header + 1),
+ *end_report = (const uint8_t *) (current_header + 1);
+ uint32_t last_ctx_id = oa_report_ctx_id(&reader->devinfo, start_report),
+ current_ctx_id = oa_report_ctx_id(&reader->devinfo, end_report);
+ uint64_t gpu_ts_start = oa_report_timestamp(start_report),
+ gpu_ts_end = oa_report_timestamp(end_report);
+
+ if (last_ctx_id == current_ctx_id)
+ continue;
+
+ append_timeline_event(reader, gpu_ts_start, gpu_ts_end, last_header_idx, i, last_ctx_id);
+
+ last_header = current_header;
+ last_header_idx = i;
+ }
+}
+
+static void
+compute_correlation_chunks(struct intel_perf_data_reader *reader)
+{
+ uint64_t mask = ~(0xffffffff);
+ uint32_t last_idx = 0;
+ uint64_t last_ts = reader->correlations[last_idx]->gpu_timestamp;
+
+ for (uint32_t i = 0; i < reader->n_correlations; i++) {
+ if (!reader->n_correlation_chunks ||
+ (last_ts & mask) != (reader->correlations[i]->gpu_timestamp & mask)) {
+ assert(reader->n_correlation_chunks < ARRAY_SIZE(reader->correlation_chunks));
+ reader->correlation_chunks[reader->n_correlation_chunks].gpu_ts_begin = last_ts;
+ reader->correlation_chunks[reader->n_correlation_chunks].gpu_ts_end = last_ts | ~mask;
+ reader->correlation_chunks[reader->n_correlation_chunks].idx = last_idx;
+ last_ts = reader->correlation_chunks[reader->n_correlation_chunks].gpu_ts_end + 1;
+ last_idx = i;
+ reader->n_correlation_chunks++;
+ }
+ }
+}
+
+bool
+intel_perf_data_reader_init(struct intel_perf_data_reader *reader,
+ int perf_file_fd)
+{
+ struct stat st;
+ if (fstat(perf_file_fd, &st) != 0)
+ return false;
+
+ memset(reader, 0, sizeof(*reader));
+
+ reader->mmap_size = st.st_size;
+ reader->mmap_data = (const uint8_t *) mmap(NULL, st.st_size,
+ PROT_READ, MAP_PRIVATE,
+ perf_file_fd, 0);
+ if (reader->mmap_data == MAP_FAILED)
+ return false;
+
+ if (!parse_data(reader))
+ return false;
+
+ compute_correlation_chunks(reader);
+ generate_cpu_events(reader);
+
+ return true;
+}
+
+void
+intel_perf_data_reader_fini(struct intel_perf_data_reader *reader)
+{
+ intel_perf_free(reader->perf);
+ free(reader->records);
+ free(reader->timelines);
+ free(reader->correlations);
+ munmap((void *)reader->mmap_data, reader->mmap_size);
+}
diff --git a/lib/i915/perf_data_reader.h b/lib/i915/perf_data_reader.h
new file mode 100644
index 00000000..f75e96dd
--- /dev/null
+++ b/lib/i915/perf_data_reader.h
@@ -0,0 +1,103 @@
+/*
+ * Copyright (C) 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef PERF_DATA_READER_H
+#define PERF_DATA_READER_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* Helper to read an i915-perf recording. */
+
+#include <stdbool.h>
+#include <stdint.h>
+
+#include <i915_drm.h>
+
+#include "perf.h"
+#include "perf_data.h"
+
+struct intel_device_info;
+
+struct intel_perf_timeline_item {
+ uint64_t ts_start;
+ uint64_t ts_end;
+ uint64_t cpu_ts_start;
+ uint64_t cpu_ts_end;
+
+ /* Offsets into intel_perf_data_reader.records */
+ uint32_t record_start;
+ uint32_t record_end;
+
+ uint32_t hw_id;
+
+ /* User associated data with a given item on the i915 perf
+ * timeline.
+ */
+ void *user_data;
+};
+
+struct intel_perf_data_reader {
+ /* Array of pointers into the mmapped i915 perf file. */
+ const struct drm_i915_perf_record_header **records;
+ uint32_t n_records;
+ uint32_t n_allocated_records;
+
+ /**/
+ struct intel_perf_timeline_item *timelines;
+ uint32_t n_timelines;
+ uint32_t n_allocated_timelines;
+
+ /**/
+ const struct intel_perf_record_timestamp_correlation **correlations;
+ uint32_t n_correlations;
+ uint32_t n_allocated_correlations;
+
+ struct {
+ uint64_t gpu_ts_begin;
+ uint64_t gpu_ts_end;
+ uint32_t idx;
+ } correlation_chunks[4];
+ uint32_t n_correlation_chunks;
+
+ /**/
+ const struct intel_perf_record_device_info *record_info;
+
+ struct intel_perf_devinfo devinfo;
+
+ struct intel_perf *perf;
+ struct intel_perf_metric_set *metric_set;
+
+ const uint8_t *mmap_data;
+ size_t mmap_size;
+};
+
+bool intel_perf_data_reader_init(struct intel_perf_data_reader *reader,
+ int perf_file_fd);
+void intel_perf_data_reader_fini(struct intel_perf_data_reader *reader);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* PERF_DATA_READER_H */
diff --git a/lib/meson.build b/lib/meson.build
index 6e935d45..f241bff7 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -173,6 +173,7 @@ lib_igt_perf = declare_dependency(link_with : lib_igt_perf_build,
i915_perf_files = [
'i915/perf.c',
+ 'i915/perf_data_reader.c',
]
i915_perf_hardware = [
@@ -218,6 +219,7 @@ install_headers(
'intel_chipset.h',
'i915/perf.h',
'i915/perf_data.h',
+ 'i915/perf_data_reader.h',
subdir : 'i915-perf'
)
diff --git a/tools/i915-perf/README b/tools/i915-perf/README
new file mode 100644
index 00000000..e9822345
--- /dev/null
+++ b/tools/i915-perf/README
@@ -0,0 +1,70 @@
+======================
+i915 perf tools for OA
+======================
+
+The tools provided here enable capturing performance metrics from the i915
+driver and are used in conjunction with the GPUvis software here -
+
+https://github.com/mikesart/gpuvis
+
+Tools in IGT
+------------
+
+The following tools are generated in build/tools/i915-perf
+
+i915-perf-configs
+i915-perf-control
+i915-perf-recorder
+
+Usage in IGT
+------------
+
+Launching i915-perf-recorder with no arguments lists all available metric
+sets. Once installed, the recorder can capture metrics into a circular buffer.
+The example below captures RenderBasic metrics with an 8 MB circular buffer
+(the -s option takes a size in kilobytes):
+
+i915-perf-recorder -m RenderBasic -s 8192
+
+The circular buffer can be dumped to a given location from another terminal
+using the i915-perf-control tool:
+
+i915-perf-control -d /tmp/recording.perf
+
+Integration with GPUvis
+-----------------------
+
+GPUvis provides sample scripts in the gpuvis/sample directory that can be
+modified and used to capture the metrics required.
+
+1. Set up the recording by launching the following scripts from the
+ gpuvis/sample directory:
+
+ trace-cmd-setup.sh
+ trace-cmd-start-tracing.sh
+
+This sets up a recording in a circular buffer.
+
+2. Start using the system for a specific task you want to record.
+
+3. Once the task is completed, save the circular buffer into a capture file with
+ the following script:
+
+ trace-cmd-capture.sh
+
+4. Once finished, tear down the circular buffer recording with:
+
+ trace-cmd-stop-tracing.sh
+
+Inspecting data captured in GPUvis
+----------------------------------
+
+The capture script generates two files, for instance:
+
+ trace_09-26-2019_01-22-40.dat
+ trace_09-26-2019_01-22-40.i915-dat
+
+The first contains ftrace data, the second i915-perf data. To inspect the
+data, launch gpuvis with both files as arguments:
+
+ gpuvis trace_09-26-2019_01-22-40.dat trace_09-26-2019_01-22-40.i915-dat
--
2.20.1
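
The README step "i915-perf-control -d <path>" talks to the recorder over the command FIFO. Judging only from read_command_file() and i915_perf_recorder_commands.h in patch 3/4, a dump request appears to be a recorder_command_base header whose size field covers the header plus the payload, followed by the destination path as a NUL-terminated string. A hedged sketch of building such a message follows. struct cmd_header, CMD_DUMP and build_dump_command() are illustrative names, and the actual i915_perf_control.c is not shown here and may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Mirrors RECORDER_COMMAND_DUMP from the patch (assumed value 1). */
enum { CMD_DUMP = 1 };

/* Same layout as struct recorder_command_base in the patch. */
struct cmd_header {
	uint32_t command;
	uint32_t size;	/* header + payload, in bytes */
};

/*
 * Build a dump request: header followed by the NUL-terminated path.
 * Returns a malloc()ed buffer the caller must free, or NULL on
 * allocation failure. *total_size receives the number of bytes to
 * write to the FIFO.
 */
static uint8_t *
build_dump_command(const char *path, uint32_t *total_size)
{
	size_t path_len;
	struct cmd_header hdr;
	uint8_t *msg;

	assert(path && total_size);

	path_len = strlen(path) + 1;	/* include the trailing NUL */
	hdr.command = CMD_DUMP;
	hdr.size = (uint32_t) (sizeof(hdr) + path_len);

	msg = malloc(hdr.size);
	if (!msg)
		return NULL;

	memcpy(msg, &hdr, sizeof(hdr));
	memcpy(msg + sizeof(hdr), path, path_len);
	*total_size = hdr.size;
	return msg;
}
```

Including the NUL terminator in the payload matters because the recorder side passes the received bytes straight to fopen() and fprintf() as a C string.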
* Re: [igt-dev] [PATCH i-g-t 0/4] Add perf OA tools for GPUvis
2020-02-15 1:11 [igt-dev] [PATCH i-g-t 0/4] Add perf OA tools for GPUvis Umesh Nerlige Ramappa
` (2 preceding siblings ...)
2020-02-15 1:11 ` [igt-dev] [PATCH i-g-t 4/4] lib/i915/perf: Add i915 perf data reader Umesh Nerlige Ramappa
@ 2020-02-17 13:42 ` Lionel Landwerlin
3 siblings, 0 replies; 5+ messages in thread
From: Lionel Landwerlin @ 2020-02-17 13:42 UTC (permalink / raw)
To: Umesh Nerlige Ramappa, igt-dev, Joonas Lahtinen, Ashutosh Dixit
There was a small communication hiccup. Umesh thought I did not work on
this stuff anymore, but I actually just picked up the stuff again last week.
Sending an update with more changes/updates.
Sorry for the confusion.
-Lionel
On 15/02/2020 03:11, Umesh Nerlige Ramappa wrote:
> The tools provided here enable capturing performance metrics from the i915
> driver and are used in conjunction with the GPUvis software here -
>
> https://github.com/mikesart/gpuvis
>
> The changes required in GPUvis are wip and will be posted following the merge of
> these tools.
>
> For more information, view tools/i915-perf/README in this patch series
>
> Lionel Landwerlin (4):
> lib/i915/perf: Add i915_perf library
> lib/i915/perf: Add support for loading perf configurations
> tools/i915/perf: Add i915 perf recorder tool
> lib/i915/perf: Add i915 perf data reader
>
> lib/i915-perf.pc.in | 10 +
> lib/i915/perf-configs/README.md | 115 +
> lib/i915/perf-configs/codegen.py | 33 +
> lib/i915/perf-configs/guids.xml | 282 +
> lib/i915/perf-configs/mdapi-xml-convert.py | 1000 +
> lib/i915/perf-configs/oa-bdw.xml | 15653 ++++++++++++++++
> lib/i915/perf-configs/oa-bxt.xml | 9595 ++++++++++
> lib/i915/perf-configs/oa-cflgt2.xml | 10866 +++++++++++
> lib/i915/perf-configs/oa-cflgt3.xml | 10933 +++++++++++
> lib/i915/perf-configs/oa-chv.xml | 9757 ++++++++++
> lib/i915/perf-configs/oa-cnl.xml | 10411 ++++++++++
> lib/i915/perf-configs/oa-glk.xml | 9346 +++++++++
> lib/i915/perf-configs/oa-hsw.xml | 4615 +++++
> lib/i915/perf-configs/oa-icl.xml | 11899 ++++++++++++
> lib/i915/perf-configs/oa-kblgt2.xml | 10866 +++++++++++
> lib/i915/perf-configs/oa-kblgt3.xml | 10933 +++++++++++
> lib/i915/perf-configs/oa-sklgt2.xml | 11895 ++++++++++++
> lib/i915/perf-configs/oa-sklgt3.xml | 10933 +++++++++++
> lib/i915/perf-configs/oa-sklgt4.xml | 10956 +++++++++++
> lib/i915/perf-configs/perf-codegen.py | 854 +
> lib/i915/perf-configs/update-guids.py | 230 +
> lib/i915/perf.c | 332 +
> lib/i915/perf.h | 240 +
> lib/i915/perf_data.h | 88 +
> lib/i915/perf_data_reader.c | 330 +
> lib/i915/perf_data_reader.h | 103 +
> lib/meson.build | 67 +
> tools/i915-perf/README | 70 +
> tools/i915-perf/i915_perf_configs.c | 277 +
> tools/i915-perf/i915_perf_control.c | 133 +
> tools/i915-perf/i915_perf_recorder.c | 931 +
> tools/i915-perf/i915_perf_recorder_commands.h | 39 +
> tools/i915-perf/meson.build | 17 +
> tools/meson.build | 1 +
> 34 files changed, 153810 insertions(+)
> create mode 100644 lib/i915-perf.pc.in
> create mode 100644 lib/i915/perf-configs/README.md
> create mode 100644 lib/i915/perf-configs/codegen.py
> create mode 100644 lib/i915/perf-configs/guids.xml
> create mode 100755 lib/i915/perf-configs/mdapi-xml-convert.py
> create mode 100644 lib/i915/perf-configs/oa-bdw.xml
> create mode 100644 lib/i915/perf-configs/oa-bxt.xml
> create mode 100644 lib/i915/perf-configs/oa-cflgt2.xml
> create mode 100644 lib/i915/perf-configs/oa-cflgt3.xml
> create mode 100644 lib/i915/perf-configs/oa-chv.xml
> create mode 100644 lib/i915/perf-configs/oa-cnl.xml
> create mode 100644 lib/i915/perf-configs/oa-glk.xml
> create mode 100644 lib/i915/perf-configs/oa-hsw.xml
> create mode 100644 lib/i915/perf-configs/oa-icl.xml
> create mode 100644 lib/i915/perf-configs/oa-kblgt2.xml
> create mode 100644 lib/i915/perf-configs/oa-kblgt3.xml
> create mode 100644 lib/i915/perf-configs/oa-sklgt2.xml
> create mode 100644 lib/i915/perf-configs/oa-sklgt3.xml
> create mode 100644 lib/i915/perf-configs/oa-sklgt4.xml
> create mode 100755 lib/i915/perf-configs/perf-codegen.py
> create mode 100755 lib/i915/perf-configs/update-guids.py
> create mode 100644 lib/i915/perf.c
> create mode 100644 lib/i915/perf.h
> create mode 100644 lib/i915/perf_data.h
> create mode 100644 lib/i915/perf_data_reader.c
> create mode 100644 lib/i915/perf_data_reader.h
> create mode 100644 tools/i915-perf/README
> create mode 100644 tools/i915-perf/i915_perf_configs.c
> create mode 100644 tools/i915-perf/i915_perf_control.c
> create mode 100644 tools/i915-perf/i915_perf_recorder.c
> create mode 100644 tools/i915-perf/i915_perf_recorder_commands.h
> create mode 100644 tools/i915-perf/meson.build
>