* [RFC v1 0/3] C and Rust support for perf script
@ 2024-09-19 21:51 Stefan
2024-09-19 21:51 ` [RFC v1 1/3] " Stefan
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: Stefan @ 2024-09-19 21:51 UTC (permalink / raw)
To: peterz, mingo, acme, namhyung, mark.rutland, alexander.shishkin,
jolsa, irogers, adrian.hunter, kan.liang, linux-perf-users,
linux-kernel
Cc: vinicius.gomes, stefan.ene, stef_an_ene
From: Stefan Ene <stefan.ene@intel.com>
============================COVER=LETTER============================
This proposal addresses the usability and performance of the scripting languages available for custom perf event data processing in the perf toolset, specifically in the perf script command.
With the perf-script custom event processing functionality for C and Rust, we measured a 2x to 5x speedup over the existing Python and Perl scripting methods.
The proposed workflow is as follows: start from the C or Rust script template we provide, add your custom event processing in the preferred language, compile the script into a dynamic library, then pass the resulting .so file as a parameter to the newly implemented perf script option.
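As a rough sketch of that workflow, a script that tallies records per type instead of printing them might look like the following. The struct new_script layout is copied from patch 1/3; the count_* callbacks, MAX_TYPES, and the event_header mirror struct are hypothetical names chosen for illustration and are not part of the series.

```c
#include <stdint.h>
#include <stdio.h>

/* Callback table, copied from tools/perf/util/new_script.h in patch 1/3. */
struct new_script {
	char *file;
	void *handle;

	void (*begin)(void *data, void *ctx);
	void (*end)(void *data, void *ctx);

	int (*process_file_header)(void *data, void *ctx);
	int (*process_event_header)(void *data, void *ctx);
	int (*process_event_raw_data)(void *data, const int size, void *ctx);
};

/* Hypothetical mirror of the event header layout (type, misc, size). */
struct event_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

#define MAX_TYPES 128 /* assumed upper bound on tracked record types */

static unsigned long type_counts[MAX_TYPES];
static unsigned long total_bytes;

/* Reset the counters before the first record is delivered. */
static void count_begin(void *data, void *ctx)
{
	(void)data;
	(void)ctx;
	for (int i = 0; i < MAX_TYPES; i++)
		type_counts[i] = 0;
	total_bytes = 0;
}

/* The file header is not needed for counting; only reject a NULL pointer. */
static int count_file_header(void *data, void *ctx)
{
	(void)ctx;
	return data ? 0 : -1;
}

/* Count one record of the given type. */
static int count_event_header(void *data, void *ctx)
{
	const struct event_header *h = data;

	(void)ctx;
	if (!h)
		return -1;
	if (h->type < MAX_TYPES)
		type_counts[h->type]++;
	return 0;
}

/* Accumulate the number of raw bytes seen. */
static int count_raw_data(void *data, const int size, void *ctx)
{
	(void)ctx;
	if (!data || size < 0)
		return -1;
	total_bytes += (unsigned long)size;
	return 0;
}

/* Print the summary once all records have been processed. */
static void count_end(void *data, void *ctx)
{
	(void)data;
	(void)ctx;
	for (int i = 0; i < MAX_TYPES; i++) {
		if (type_counts[i])
			printf("type %d: %lu records\n", i, type_counts[i]);
	}
	printf("total: %lu bytes\n", total_bytes);
}

/* Entry point that perf looks up in the .so file. */
void initialize_new_script(struct new_script *s)
{
	s->begin = count_begin;
	s->end = count_end;
	s->process_file_header = count_file_header;
	s->process_event_header = count_event_header;
	s->process_event_raw_data = count_raw_data;
}
```

Such a file would be built with the two gcc commands from the README (gcc -c -fpic, then gcc -shared) and passed to perf script --new_script.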
List of functionality additions:
1/ Added a new "--new_script" option to the perf script command that takes a .so file as its parameter. The code was added in tools/perf/builtin-script.c.
2/ The functional code for the newly implemented option was added to the tools/perf/util/new_script.c and tools/perf/util/new_script.h files.
3/ Folder added at tools/perf/new_script_templates, containing C and Rust script templates for the new option, along with compilation instructions.
4/ Finally, a short bash script for updating the perf tool set within the kernel code was provided in the base-level file update_perf_tools.sh.
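One detail behind the batch reader in item 2/ is worth spelling out: every record starts with a header whose size field covers the whole record, so a reader can validate each size before advancing. A minimal sketch of that walk follows, assuming the usual event header layout (u32 type, u16 misc, u16 size); walk_records, record_fn, count_record, and the local event_header mirror are hypothetical names and not functions from the series.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the event header layout (type, misc, size). */
struct event_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size; /* size of the whole record, header included */
};

typedef int (*record_fn)(const struct event_header *hdr,
			 const unsigned char *record, void *ctx);

/*
 * Walk the complete records in buf[0..len).
 * Returns the number of bytes consumed, or -1 on a malformed record
 * or when the callback reports an error.
 */
static long walk_records(const unsigned char *buf, size_t len,
			 record_fn fn, void *ctx)
{
	size_t off = 0;

	while (len - off >= sizeof(struct event_header)) {
		struct event_header hdr;

		/* Copy the header out to avoid unaligned access. */
		memcpy(&hdr, buf + off, sizeof(hdr));

		/* A size below the header size would never advance. */
		if (hdr.size < sizeof(hdr))
			return -1;

		/* Record continues past this batch: stop before it. */
		if (hdr.size > len - off)
			break;

		if (fn && fn(&hdr, buf + off, ctx) < 0)
			return -1;

		off += hdr.size;
	}
	return (long)off;
}

/* Example callback: count the records through an int passed as ctx. */
static int count_record(const struct event_header *hdr,
			const unsigned char *record, void *ctx)
{
	(void)hdr;
	(void)record;
	(*(int *)ctx)++;
	return 0;
}
```

A return value smaller than len means the tail holds a partial record, which the caller would re-read with the next batch. A header size below 8 bytes is rejected so that a corrupt file cannot stall the loop.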
Common Questions
* How can I use the new toolset?
The new implementations for the perf script have a detailed usage guide inside of the tools/perf/new_script_templates/README file, along with some script templates for C and Rust!
* Why a new option instead of expanding dlfilter?
The new option let us reuse the fast dynamic library approach taken by dlfilter, as opposed to implementing another interpreter. This allows for scalability, with the potential to support any other language that can expose dynamic library calls to the base C code.
* Why use C and Rust instead?
As of kernel version 6.11, the perf tool has a large overhead for data processing using Python and Perl, because both languages go through their respective interpreters built into perf. Furthermore, while Python is widely used in the development community, as of 2024, Perl is used by only 2.5% of developers worldwide, while C and Rust are more common, with 20.3% and 12.6% usage, respectively (Source: statista.com).
* What are the actual performance improvements?
As last tested, the C and Rust approaches are between 2 and 5 times faster than the existing Python and Perl scripting methods, with Rust being the fastest across the board!
Acknowledgements:
This code was completed as part of an Intel summer internship project, under the mentoring of Vinicius Gomes, Intel Linux Kernel Team.
=========================END=COVER=LETTER===========================
Stefan Ene (3):
add the new perf script option (--new_script) and related changes
added the C sample script
added the Rust sample script
tools/perf/builtin-script.c | 22 +-
tools/perf/new_script_templates/README | 65 ++++
tools/perf/new_script_templates/lib.rs | 108 +++++++
tools/perf/new_script_templates/script.c | 113 +++++++
tools/perf/util/Build | 1 +
tools/perf/util/new_script.c | 376 +++++++++++++++++++++++
tools/perf/util/new_script.h | 54 ++++
tools/perf/util/new_script_rs_lib.h | 35 +++
8 files changed, 773 insertions(+), 1 deletion(-)
create mode 100644 tools/perf/new_script_templates/README
create mode 100644 tools/perf/new_script_templates/lib.rs
create mode 100644 tools/perf/new_script_templates/script.c
create mode 100644 tools/perf/util/new_script.c
create mode 100644 tools/perf/util/new_script.h
create mode 100644 tools/perf/util/new_script_rs_lib.h
--
2.46.0
* [RFC v1 1/3] C and Rust support for perf script
2024-09-19 21:51 [RFC v1 0/3] C and Rust support for perf script Stefan
@ 2024-09-19 21:51 ` Stefan
2024-09-19 21:51 ` [RFC v1 2/3] " Stefan
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Stefan @ 2024-09-19 21:51 UTC (permalink / raw)
To: peterz, mingo, acme, namhyung, mark.rutland, alexander.shishkin,
jolsa, irogers, adrian.hunter, kan.liang, linux-perf-users,
linux-kernel
Cc: vinicius.gomes, stefan.ene, stef_an_ene
From: Stefan Ene <stefan.ene@intel.com>
[PATCH 1/3] add the new perf script option (--new_script) and related
changes
---
tools/perf/builtin-script.c | 22 +-
tools/perf/util/Build | 1 +
tools/perf/util/new_script.c | 376 +++++++++++++++++++++++++++++++++++
tools/perf/util/new_script.h | 54 +++++
4 files changed, 452 insertions(+), 1 deletion(-)
create mode 100644 tools/perf/util/new_script.c
create mode 100644 tools/perf/util/new_script.h
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index c16224b1fef3..e91a1e2481bb 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -63,6 +63,7 @@
#include "util/util.h"
#include "util/cgroup.h"
#include "perf.h"
+#include "util/new_script.h"
#include <linux/ctype.h>
#ifdef HAVE_LIBTRACEEVENT
@@ -88,6 +89,7 @@ static struct perf_stat_config stat_config;
static int max_blocks;
static bool native_arch;
static struct dlfilter *dlfilter;
+static struct new_script *new_struct;
static int dlargc;
static char **dlargv;
@@ -3898,6 +3900,7 @@ int cmd_script(int argc, const char **argv)
struct utsname uts;
char *script_path = NULL;
const char *dlfilter_file = NULL;
+ const char *new_script_file = NULL;
const char **__argv;
int i, j, err = 0;
struct perf_script script = {
@@ -3954,6 +3957,7 @@ int cmd_script(int argc, const char **argv)
OPT_STRING('g', "gen-script", &generate_script_lang, "lang",
"generate perf-script.xx script in specified language"),
OPT_STRING(0, "dlfilter", &dlfilter_file, "file", "filter .so file name"),
+ OPT_STRING(0, "new_script", &new_script_file, "file", "specify .so file name"),
OPT_CALLBACK(0, "dlarg", NULL, "argument", "filter argument",
add_dlarg),
OPT_STRING('i', "input", &input_name, "file", "input file name"),
@@ -4355,6 +4359,21 @@ int cmd_script(int argc, const char **argv)
goto out_delete;
}
#endif
+
+ if (new_script_file) {
+ new_struct = new_script__new(new_script_file);
+
+		if (!new_struct) {
+			pr_err("cannot create new_script object\n");
+			goto out_delete;
+		}
+		if (new_script__start(new_struct, &data) < 0) {
+			pr_err("cannot start new_script object\n");
+			goto out_delete;
+		}
+ goto out_delete;
+ }
+
if (generate_script_lang) {
struct stat perf_stat;
int input;
@@ -4458,7 +4477,8 @@ int cmd_script(int argc, const char **argv)
if (script_started)
cleanup_scripting();
dlfilter__cleanup(dlfilter);
+ new_script__cleanup(new_struct);
free_dlarg();
out:
return err;
-}
+}
\ No newline at end of file
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 0f18fe81ef0b..386cdd0fe13c 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -140,6 +140,7 @@ perf-util-y += parse-sublevel-options.o
perf-util-y += term.o
perf-util-y += help-unknown-cmd.o
perf-util-y += dlfilter.o
+perf-util-y += new_script.o
perf-util-y += mem-events.o
perf-util-y += mem-info.o
perf-util-y += vsprintf.o
diff --git a/tools/perf/util/new_script.c b/tools/perf/util/new_script.c
new file mode 100644
index 000000000000..ff3234b20738
--- /dev/null
+++ b/tools/perf/util/new_script.c
@@ -0,0 +1,376 @@
+// Stefan Ene's code, under Intel - Linux Kernel Team
+/*
+ * new_script.c: code for new scripting object allowing C and Rust event processing
+ *
+ */
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <fcntl.h>
+#include <dlfcn.h>
+#include <errno.h>
+
+#include <linux/types.h>
+#include <perf/event.h>
+
+#include "data.h"
+#include "sample.h"
+#include "header.h"
+#include "new_script.h"
+#include "ordered-events.h"
+
+// rust lib
+#include "new_script_rs_lib.h"
+
+
+#define PATH_MAX 4096
+#define MMAP_SIZE (32 * 1024 * 1024ULL)
+#define EINVAL 22 /* Invalid argument */
+
+
+static char *
+find_new_script(const char* file)
+{
+ char path[PATH_MAX];
+
+ if (strchr(file, '/'))
+ goto out;
+
+ if (!access(file, R_OK)) {
+ snprintf(path, sizeof(path), "./%s", file);
+ file = path;
+ }
+
+out:
+ return strdup(file);
+}
+
+
+static int
+new_script_open(struct new_script* s)
+{
+ void (*initialize)(struct new_script*);
+
+ s->handle = dlopen(s->file, RTLD_NOW);
+ if (!s->handle)
+ return -1;
+
+ initialize = (void (*)(struct new_script*))dlsym(s->handle, "initialize_new_script");
+
+ if (!initialize) {
+ dlclose(s->handle);
+ return -1;
+ }
+ initialize(s);
+
+ return 0;
+}
+
+
+static int
+new_script__init(struct new_script *s, const char *file)
+{
+ memset(s, 0, sizeof(*s));
+ s->file = find_new_script(file); // not freed since file string used somewhere else
+ if (!s->file)
+ return -1;
+
+ return 0;
+}
+
+
+struct new_script *
+new_script__new(const char *file)
+{
+ struct new_script *s = malloc(sizeof(*s));
+ if(!s)
+ return NULL;
+
+ if (strcmp(file, "langs") == 0) {
+ printf("\n");
+ printf("Languages dynamic library can be based on:\n\n");
+ printf("\tC\t\t\t\t[file.c ]\n");
+ printf("\tRust\t\t\t\t[file.rs]\n\n");
+ fflush(stdout);
+ return NULL;
+ }
+
+ if (new_script__init(s, file) < 0) {
+ printf("Error finding .so file\n");
+ return NULL;
+ }
+
+ if (new_script_open(s) < 0) {
+ printf("Error opening dynamic library\n");
+ return NULL;
+ }
+
+ return s;
+
+}
+
+
+static int
+new_script__validate_data(struct perf_data *data)
+{
+ if (!data) {
+ printf("> no data\n");
+ return -1;
+ }
+ printf("> has data at %s \n", data->path);
+ fflush(stdout);
+
+ return 0;
+
+}
+
+
+static long
+process_file_header(int fd, struct new_script *s)
+{
+ struct perf_file_header fh;
+ u64 data[3];
+ void *ctx = NULL;
+ size_t e;
+
+	e = read(fd, &fh, sizeof(struct perf_file_header));
+	if (e != sizeof(struct perf_file_header)) {
+		printf("Error reading perf file header\n");
+		return -1;
+	}
+
+ data[0] = fh.size;
+ data[1] = fh.data.size;
+ data[2] = fh.data.offset;
+
+ if (fh.attr_size != sizeof(struct perf_file_attr)) {
+ printf("Error matching file attributes size\n");
+ return -1;
+ }
+
+ if (s->process_file_header((void *)data, ctx) < 0)
+ return -1;
+
+ lseek(fd, fh.data.offset, SEEK_SET);
+
+ return fh.data.size;
+}
+
+
+static int
+process_event_header(struct perf_event_header *header, struct new_script *s)
+{
+ void *ctx = NULL;
+
+ if (!header)
+ return -1;
+
+ if (s->process_event_header((void *)header, ctx) < 0)
+ return -1;
+
+ return 0;
+}
+
+
+static void
+process_sample_mmap(void *buffer)
+{
+ struct perf_record_mmap mmap;
+
+ memcpy(&mmap, buffer, sizeof(struct perf_record_mmap));
+
+ printf("\tMMAP data: %u %llx %s\n", mmap.pid, mmap.start, mmap.filename);
+ fflush(stdout);
+
+}
+
+
+static void
+process_sample_data(void *buffer, size_t size)
+{
+	const char *payload = (const char *)buffer + sizeof(struct perf_event_header);
+	long count = size / sizeof(u64);
+
+	printf("\tSample data: ");
+	for (long i = 0; i < count; i++) {
+		u64 value;
+
+		memcpy(&value, payload + i * sizeof(u64), sizeof(value));
+		printf(" %llu", (unsigned long long)value);
+	}
+	printf("\n");
+	fflush(stdout);
+}
+
+
+static int
+process_sample(struct perf_event_header *header, void *buffer)
+{
+ switch(header->type) {
+ case PERF_RECORD_MMAP:
+ process_sample_mmap(buffer);
+ break;
+ // case PERF_RECORD_COMM:
+ // case PERF_RECORD_FORK:
+ // case PERF_RECORD_EXIT:
+ case PERF_RECORD_SAMPLE:
+ process_sample_data(buffer, header->size - sizeof(*header));
+ break;
+ case PERF_RECORD_FINISHED_ROUND:
+ printf("\tEvent: PERF_RECORD_FINISHED_ROUND\n");
+ fflush(stdout);
+ break;
+ default: // skip event
+ printf("\tSkipped event\n");
+ fflush(stdout);
+ break;
+ }
+ return 0;
+}
+
+
+
+static int
+process_event_data(struct new_script *s, struct perf_event_header *header, void* buffer, size_t processed)
+{
+ void *ctx = NULL;
+ const size_t size = header->size;
+
+ s->process_event_raw_data(buffer + processed, size, ctx);
+
+ process_sample(header, buffer + processed);
+
+ return 0;
+}
+
+
+static int
+new_perf__process_all_data(struct perf_data *data, struct new_script *s)
+{
+ long data_size = 0;
+ long curr_size = 0;
+ size_t batch_size = 1024 * 1024;
+ char *buffer;
+ size_t processed = 0;
+ size_t event_size = 0;
+
+ int fd = open(data->path, O_RDONLY);
+	if (fd == -1) {
+		/* open() failed, so there is no descriptor to close */
+		printf("Error opening perf data file\n");
+		return -1;
+	}
+
+
+ if ((data_size = process_file_header(fd, s)) < 0) {
+ printf("Error processing file header\n");
+ close(fd);
+ return -1;
+ }
+
+ buffer = malloc(batch_size);
+ if (!buffer) {
+ printf("Error allocating memory for file buffer\n");
+ close(fd);
+ return -1;
+ }
+
+ while (curr_size < data_size) {
+ size_t bytes_read = 0;
+
+ // final batch case
+ if ((curr_size + (long)batch_size) > data_size) {
+ batch_size = data_size - curr_size;
+ }
+
+ bytes_read = read(fd, buffer, batch_size);
+
+ processed = 0;
+ while (processed < bytes_read) {
+ struct perf_event_header *header;
+
+ if ((bytes_read - processed) >= sizeof(*header)) {
+ header = (struct perf_event_header *)(buffer + processed);
+ } else {
+ lseek(fd, -(bytes_read - processed), SEEK_CUR);
+ break;
+ }
+
+ event_size = header->size;
+
+ if ((processed + event_size) > bytes_read) { // split event case
+ lseek(fd, -(bytes_read - processed), SEEK_CUR);
+ break;
+ }
+
+ if (process_event_header(header, s) < 0) {
+ close(fd);
+ return -1;
+ }
+
+ if (process_event_data(s, header, buffer, processed) < 0) {
+ printf("Error processing event data\n");
+ close(fd);
+ return -1;
+ }
+
+ processed += event_size;
+ curr_size += event_size;
+ }
+ }
+
+ close(fd);
+ return 0;
+}
+
+
+int
+new_script__start(struct new_script *s, struct perf_data *data)
+{
+ int ret;
+ void *d = NULL, *ctx = NULL;
+
+ if (!s)
+ goto out;
+
+ if (new_script__validate_data(data) < 0)
+ goto out;
+
+ printf("\n");
+ fflush(stdout);
+
+ s->begin(d, ctx);
+
+ ret = new_perf__process_all_data(data, s);
+ if (ret < 0) {
+ printf("Error processing data in new_script\n");
+ goto out;
+ }
+
+ s->end(d, ctx);
+
+ printf("\n");
+ fflush(stdout);
+ return 0;
+
+out:
+ return -1;
+}
+
+
+static int
+new_script__close(struct new_script *s)
+{
+ return dlclose(s->handle);
+}
+
+
+void
+new_script__cleanup(struct new_script *s)
+{
+ if (s)
+ new_script__close(s);
+ free(s);
+}
\ No newline at end of file
diff --git a/tools/perf/util/new_script.h b/tools/perf/util/new_script.h
new file mode 100644
index 000000000000..9dc4760b3a33
--- /dev/null
+++ b/tools/perf/util/new_script.h
@@ -0,0 +1,54 @@
+// Stefan Ene's code, under Intel - Linux Kernel Team
+/*
+ * new_script.h: header file for new scripting object allowing C and Rust event processing
+ *
+ */
+#ifndef __PERF_NEW_SCRIPT_H
+#define __PERF_NEW_SCRIPT_H
+
+
+struct new_script {
+ char *file;
+ void *handle;
+
+ void (*begin)(void *data, void *ctx);
+ void (*end)(void *data, void *ctx);
+
+ int (*process_file_header)(void *data, void *ctx);
+ int (*process_event_header)(void *data, void *ctx);
+ int (*process_event_raw_data)(void *data, const int size, void *ctx);
+
+};
+
+struct perf_event_header;
+
+struct perf_data;
+
+union perf_event;
+
+struct perf_file_section;
+
+struct perf_file_header;
+
+struct perf_event_attr;
+
+struct perf_file_attr {
+ struct perf_event_attr attr;
+ struct perf_file_section ids;
+};
+
+
+struct perf_sample;
+
+struct ip_callchain;
+
+struct perf_record_mmap;
+
+struct new_script *new_script__new(const char *);
+
+void new_script__cleanup(struct new_script *);
+
+int new_script__start(struct new_script *, struct perf_data *);
+
+
+#endif /* __PERF_NEW_SCRIPT_H */
\ No newline at end of file
--
2.46.0
* [RFC v1 2/3] C and Rust support for perf script
2024-09-19 21:51 [RFC v1 0/3] C and Rust support for perf script Stefan
2024-09-19 21:51 ` [RFC v1 1/3] " Stefan
@ 2024-09-19 21:51 ` Stefan
2024-09-19 21:51 ` [RFC v1 3/3] " Stefan
2024-09-26 0:03 ` [RFC v1 0/3] " Namhyung Kim
3 siblings, 0 replies; 5+ messages in thread
From: Stefan @ 2024-09-19 21:51 UTC (permalink / raw)
To: peterz, mingo, acme, namhyung, mark.rutland, alexander.shishkin,
jolsa, irogers, adrian.hunter, kan.liang, linux-perf-users,
linux-kernel
Cc: vinicius.gomes, stefan.ene, stef_an_ene
From: Stefan Ene <stefan.ene@intel.com>
[PATCH 2/3] added the C sample script
---
tools/perf/new_script_templates/README | 16 ++++
tools/perf/new_script_templates/script.c | 107 +++++++++++++++++++++++
2 files changed, 123 insertions(+)
create mode 100644 tools/perf/new_script_templates/README
create mode 100644 tools/perf/new_script_templates/script.c
diff --git a/tools/perf/new_script_templates/README b/tools/perf/new_script_templates/README
new file mode 100644
index 000000000000..c2a55a65d444
--- /dev/null
+++ b/tools/perf/new_script_templates/README
@@ -0,0 +1,16 @@
+Linux kernel additions for C and Rust script support inside the perf-script tool set.
+
+Steps to use new feature:
+
+ First, use the provided update_perf_tools.sh script to make sure your perf toolset is up to date with the latest implementation:
+ $ bash update_perf_tools.sh
+
+ a) For C scripts:
+ 1. Use the default C template test.c to write your own custom perf event processing
+
+ 2. Compile the C script into a dynamic library using the following two commands:
+ $ gcc -c -I ~/include -fpic new_script_templates/script.c
+ $ gcc -shared -o test.so new_script_templates/script.o
+
+ 3. Call the new perf script option to use the newly created .so file using the command:
+ $ sudo perf script --new_script new_script_templates/script.so
\ No newline at end of file
diff --git a/tools/perf/new_script_templates/script.c b/tools/perf/new_script_templates/script.c
new file mode 100644
index 000000000000..e4942ba689db
--- /dev/null
+++ b/tools/perf/new_script_templates/script.c
@@ -0,0 +1,107 @@
+#include <stdlib.h>
+#include <stdio.h>
+
+// =================== Needed stuff begin, DO NOT CHANGE ===================
+
+#include <linux/types.h>
+
+struct new_script {
+ char *file;
+ void *handle;
+
+ void (*begin)(void *data, void *ctx);
+ void (*end)(void *data, void *ctx);
+
+ int (*process_file_header)(void *data, void *ctx);
+ int (*process_event_header)(void *data, void *ctx);
+ int (*process_event_raw_data)(void *data, const int size, void *ctx);
+
+};
+
+struct processed_file_header {
+ __u64 size;
+ __u64 data_size;
+ __u64 data_offset;
+};
+
+struct processed_event_header {
+ __u32 type;
+ __u16 misc;
+ __u16 size;
+};
+
+// =================== Editable functions begin ===================
+
+void
+print_begin(void *data, void *ctx)
+{
+ printf(">> in trace_begin\n");
+}
+
+
+int
+process_file_header(void* data, void *ctx)
+{
+ if (!data) {
+ printf("> Error dynamically processing file header\n");
+ return -1;
+ }
+
+ struct processed_file_header *fh = (struct processed_file_header *)data;
+
+	printf("\nFile header: size=%llx, data.size=%llu, data.offset=%llu\n", (unsigned long long)fh->size, (unsigned long long)fh->data_size, (unsigned long long)fh->data_offset);
+
+ return 0;
+}
+
+
+int
+process_event_header(void* data, void *ctx)
+{
+ if (!data) {
+ printf("> Error dynamically processing event header\n");
+ return -1;
+ }
+
+ struct processed_event_header *evh = (struct processed_event_header *)data;
+
+ printf("\nEvent header: size=%u, type=%u, misc=%u\n", evh->size, evh->type, evh->misc);
+
+ return 0;
+}
+
+
+int
+process_event_raw_data(void* data, const int size, void *ctx)
+{
+	unsigned char *byte_data = (unsigned char *)data;
+
+	if (!data || size < 0)
+		return -1;
+	for (int i = 0; i < size; i++) {
+		if ((i % 16) == 0)
+			printf("\n");
+		printf("%02x ", byte_data[i]);
+	}
+	printf("\n");
+	return 0;
+}
+
+
+void
+print_end(void *data, void *ctx)
+{
+ printf("\n>> in trace_end\n");
+}
+
+// =================== Needed stuff begin, DO NOT CHANGE ===================
+
+void
+initialize_new_script(struct new_script* s)
+{
+ s->begin = print_begin;
+ s->end = print_end;
+ s->process_file_header = process_file_header;
+ s->process_event_header = process_event_header;
+ s->process_event_raw_data = process_event_raw_data;
+}
\ No newline at end of file
--
2.46.0
* [RFC v1 3/3] C and Rust support for perf script
2024-09-19 21:51 [RFC v1 0/3] C and Rust support for perf script Stefan
2024-09-19 21:51 ` [RFC v1 1/3] " Stefan
2024-09-19 21:51 ` [RFC v1 2/3] " Stefan
@ 2024-09-19 21:51 ` Stefan
2024-09-26 0:03 ` [RFC v1 0/3] " Namhyung Kim
3 siblings, 0 replies; 5+ messages in thread
From: Stefan @ 2024-09-19 21:51 UTC (permalink / raw)
To: peterz, mingo, acme, namhyung, mark.rutland, alexander.shishkin,
jolsa, irogers, adrian.hunter, kan.liang, linux-perf-users,
linux-kernel
Cc: vinicius.gomes, stefan.ene, stef_an_ene
From: Stefan Ene <stefan.ene@intel.com>
[PATCH 3/3] added the Rust sample script
---
tools/perf/new_script_templates/README | 61 +++++++++++--
tools/perf/new_script_templates/lib.rs | 108 +++++++++++++++++++++++
tools/perf/new_script_templates/script.c | 6 ++
tools/perf/util/new_script_rs_lib.h | 35 ++++++++
4 files changed, 204 insertions(+), 6 deletions(-)
create mode 100644 tools/perf/new_script_templates/lib.rs
create mode 100644 tools/perf/util/new_script_rs_lib.h
diff --git a/tools/perf/new_script_templates/README b/tools/perf/new_script_templates/README
index c2a55a65d444..f16a9759d37e 100644
--- a/tools/perf/new_script_templates/README
+++ b/tools/perf/new_script_templates/README
@@ -2,15 +2,64 @@ Linux kernel additions for C and Rust script support inside the perf-script tool
Steps to use new feature:
- First, use the provided update_perf_tools.sh script to make sure your perf toolset is up to date with the latest implementation:
- $ bash update_perf_tools.sh
+ First, run the following commands to make sure your perf toolset is up to date with the latest implementation:
+ $ cd ./tools/perf/
+ $ make
+ $ sudo cp perf /usr/bin
+ $ perf --version
+ Now, you can make use of our new features!
+
a) For C scripts:
- 1. Use the default C template test.c to write your own custom perf event processing
+ 1. Use the default C template in tools/perf/new_script_templates/script.c to write your own custom perf event processing
2. Compile the C script into a dynamic library using the following two commands:
- $ gcc -c -I ~/include -fpic new_script_templates/script.c
- $ gcc -shared -o test.so new_script_templates/script.o
+ $ gcc -c -I ~/include -fpic script.c
+ $ gcc -shared -o script.so script.o
3. Call the new perf script option to use the newly created .so file using the command:
- $ sudo perf script --new_script new_script_templates/script.so
\ No newline at end of file
+ $ sudo perf script --new_script script.so
+
+
+ b) For Rust scripts:
+ 1. Create a new Rust project using Cargo:
+ $ cargo new rslib_script --lib
+ $ cd rslib_script
+
+ 2. In the Cargo.toml file, specify the crate type as a dynamic library as follows:
+ [lib]
+ crate-type = ["cdylib"]
+
+ 3. Use the default Rust template in tools/perf/new_script_templates/lib.rs to write your own custom perf event processing, and store it into src/lib.rs inside of the Rust project folder.
+
+ 4. Compile the Rust project using Cargo:
+ $ cargo build --release
+
+ 5. Call the new perf script option with the newly created target/release/librslib_script.so inside the Rust project, using the following command:
+ $ sudo perf script --new_script target/release/librslib_script.so
+
+Enjoy using the new scripting languages for an added bonus of usability and performance over the existent Python and Perl options in the upstream kernel code.
+
+
+<!-- ============================= -->
+
+This section contains some initial benchmark results on the Intel Lab machine using the perf script tools on a 4026869-byte perf.data file.
+
+Process ----- Time
+
+> Perf tool vanilla:
+perf script raw: 1.425s
+perf script names: 0.915s
+perf script Python: 1.334s
+perf script Perl: 2.210s
+
+
+> Pre-optimizations:
+new_script raw C: 0.284s
+new_script raw Rust: 0.607s
+
+> Post-optimizations:
+new_script raw C batch=1MB: 0.306s
+new_script raw C batch=2MB: 0.301s
+new_script raw C batch=4MB: 0.296s
+new_script raw Rust batch=1MB: 0.262s
\ No newline at end of file
diff --git a/tools/perf/new_script_templates/lib.rs b/tools/perf/new_script_templates/lib.rs
new file mode 100644
index 000000000000..1249d9e9099b
--- /dev/null
+++ b/tools/perf/new_script_templates/lib.rs
@@ -0,0 +1,108 @@
+// Stefan Ene's code, under Intel - Linux Kernel Team
+/*
+ * code for new perf script custom Rust script template
+ *
+ */
+
+ use std::os::raw::{c_char, c_int, c_void};
+ use std::slice;
+ use std::fmt::Write;
+
+ // =================== Needed stuff, DO NOT CHANGE ===================
+
+ #[repr(C)]
+ struct NewScript {
+ file: *mut c_char,
+ handle: *mut c_void,
+
+ begin: extern "C" fn(*mut c_void, *mut c_void),
+ end: extern "C" fn(*mut c_void, *mut c_void),
+
+ process_file_header: extern "C" fn(*mut c_void, *mut c_void) -> c_int,
+ process_event_header: extern "C" fn(*mut c_void, *mut c_void) -> c_int,
+ process_event_raw_data: extern "C" fn(*mut c_void, c_int, *mut c_void) -> c_int,
+ }
+
+ #[repr(C)]
+ struct ProcessedFileHeader {
+ size: u64,
+ data_size: u64,
+ data_offset: u64,
+ }
+
+ #[repr(C)]
+ struct ProcessedEventHeader {
+ event_type: u32,
+ misc: u16,
+ size: u16,
+ }
+
+ // =================== Editable functions begin ===================
+
+ #[no_mangle]
+ pub extern "C" fn print_begin(_data: *mut c_void, _ctx: *mut c_void) {
+ println!(">> in trace_begin with Rust");
+ }
+
+ #[no_mangle]
+ pub extern "C" fn process_file_header(data: *mut c_void, _ctx: *mut c_void) -> c_int {
+ if data.is_null() {
+ println!("> Error dynamically processing file header");
+ return -1;
+ }
+
+ let fh = unsafe { &*(data as *const ProcessedFileHeader) };
+
+ println!("\nFile header: size={:x}, data.size={}, data.offset={}", fh.size, fh.data_size, fh.data_offset);
+
+ 0
+ }
+
+ #[no_mangle]
+ pub extern "C" fn process_event_header(data: *mut c_void, _ctx: *mut c_void) -> c_int {
+ if data.is_null() {
+ println!("> Error dynamically processing event header");
+ return -1;
+ }
+
+ let evh = unsafe { &*(data as *const ProcessedEventHeader) };
+
+ println!("\nRS Event header: size={}, type={}, misc={}", evh.size, evh.event_type, evh.misc);
+
+ 0
+ }
+
+ #[no_mangle]
+ pub extern "C" fn process_event_raw_data(data: *mut c_void, size: c_int, _ctx: *mut c_void) -> c_int {
+ let byte_data = unsafe { slice::from_raw_parts(data as *const u8, size as usize) };
+ let mut output = String::new();
+
+ for (i, &byte) in byte_data.iter().enumerate() {
+ if i % 16 == 0 {
+ output.push('\n');
+ }
+ write!(&mut output, "{:02x} ", byte).unwrap();
+ }
+
+ println!("{}", output);
+ 0
+ }
+
+ #[no_mangle]
+ pub extern "C" fn print_end(_data: *mut c_void, _ctx: *mut c_void) {
+ println!("\n>> in trace_end");
+ }
+
+ // =================== Needed stuff begin, DO NOT CHANGE ===================
+
+ #[no_mangle]
+ pub extern "C" fn initialize_new_script(s: *mut NewScript) {
+ if !s.is_null() {
+ let script = unsafe { &mut *s };
+ script.begin = print_begin;
+ script.end = print_end;
+ script.process_file_header = process_file_header;
+ script.process_event_header = process_event_header;
+ script.process_event_raw_data = process_event_raw_data;
+ }
+ }
\ No newline at end of file
diff --git a/tools/perf/new_script_templates/script.c b/tools/perf/new_script_templates/script.c
index e4942ba689db..787f67424d5e 100644
--- a/tools/perf/new_script_templates/script.c
+++ b/tools/perf/new_script_templates/script.c
@@ -1,3 +1,9 @@
+// Stefan Ene's code, under Intel - Linux Kernel Team
+/*
+ * code for new perf script custom C script template
+ *
+ */
+
#include <stdlib.h>
#include <stdio.h>
diff --git a/tools/perf/util/new_script_rs_lib.h b/tools/perf/util/new_script_rs_lib.h
new file mode 100644
index 000000000000..4a8d6f71b122
--- /dev/null
+++ b/tools/perf/util/new_script_rs_lib.h
@@ -0,0 +1,35 @@
+// Stefan Ene's code, under Intel - Linux Kernel Team
+/*
+ * new_script_lib_rs.h: header file for new_script Rust FFI
+ *
+ */
+
+#ifndef NEW_SCRIPT_LIB_RS_H
+#define NEW_SCRIPT_LIB_RS_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+
+// Match the struct definitions with the Rust code
+struct NewScript {
+ char *file;
+ void *handle;
+
+ void (*begin)(void *data, void *ctx);
+ void (*end)(void *data, void *ctx);
+
+ int (*process_file_header)(void *data, void *ctx);
+ int (*process_event_header)(void *data, void *ctx);
+ int (*process_event_raw_data)(void *data, const int size, void *ctx);
+};
+
+void initialize_new_script(struct NewScript* s);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif // NEW_SCRIPT_LIB_RS_H
\ No newline at end of file
--
2.46.0
* Re: [RFC v1 0/3] C and Rust support for perf script
2024-09-19 21:51 [RFC v1 0/3] C and Rust support for perf script Stefan
` (2 preceding siblings ...)
2024-09-19 21:51 ` [RFC v1 3/3] " Stefan
@ 2024-09-26 0:03 ` Namhyung Kim
3 siblings, 0 replies; 5+ messages in thread
From: Namhyung Kim @ 2024-09-26 0:03 UTC (permalink / raw)
To: StefanEne
Cc: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
irogers, adrian.hunter, kan.liang, linux-perf-users, linux-kernel,
vinicius.gomes, stefan.ene, stef_an_ene
Hello,
On Thu, Sep 19, 2024 at 02:51:01PM -0700, StefanEne wrote:
> From: Stefan Ene <stefan.ene@intel.com>
>
> ============================COVER=LETTER============================
>
> This proposal is addressing the usability and performance of the available scripting languages for custom perf event data processing inside of the perf toolset, specifically with the perf script command.
>
> With the perf-script custom event processing functionality for C and Rust, we noticed between 2x to 5x speed improvement with our new methods compared to the existent Python and Perl scripting methods.
>
> To explain the proposed method, you begin with the C or Rust script we’ve templatized, then just add changes for custom event processing in the preferred language using this provided template, compile the respective script into a dynamic library, then give the resulting .so file as a parameter to the newly implemented perf script option.
>
> List of functionality additions:
>
> 1/ Added new "--new_script" option inside the perf script command that takes in as parameter .so files. Code addition in tools/perf/builtin-script.c
>
> 2/ The functional code for the newly implemented option was added to the tools/perf/util/new_script.c and tools/perf/util/new_script.h files.
>
> 3/ Folder added at tools/perf/new_script_templates, containing C and Rust script templates for the new option, along with compilation instructions.
>
> 4/ Finally, a short bash script for updating the perf tool set within the kernel code was provided in the base-level file update_perf_tools.sh.
>
>
> Common Questions
>
> * How can I use the new toolset?
>
> The new implementations for the perf script have a detailed usage guide inside of the tools/perf/new_script_templates/README file, along with some script templates for C and Rust!
>
> * Why a new option instead of expanding dlfilter?
>
> The new option gave us the flexibility to make use of the fast dlfilter dynamic library approach, as opposed to implementing another interpreting methodology. This allows for scalability, with great potential to other languages supporting dynamic library calls from the base C code.
>
> * Why use C and Rust instead?
>
> As of kernel version 6.11, the perf tool has a large overhead for data processing using Python and Perl, given the languages having to use their respective perf built-in interpreters. Furthermore, while Python is widely used in the development community, as of 2024, Perl is only used by 2.5% of developers worldwide, while C and Rust are more common, with 20.3% and 12.6% usage, respectively (Source: statista.com).
>
> * What are the actual performance improvements?
>
> As last tested, the C and Rust approach are anywhere between 2 to 5 times faster than the existent Python and Perl scripting methods, with Rust being the fastest all across!
>
>
> Acknowledgements:
>
> This code was completed as part of an Intel summer internship project, under the mentoring of Vinicius Gomes, Intel Linux Kernel Team.
>
> =========================END=COVER=LETTER===========================
>
>
> Stefan Ene (3):
> add the new perf script option (--new_script) and related changes
> added the C sample script
> added the Rust sample script
Thanks for sharing your work. But I think there is much more work needed to
support scripting in native languages properly. For example, you need
to resolve symbols and callchains before passing them to the user. And we
might need to think about safety, as a script can modify the internal state or
something.
Thanks,
Namhyung
>
> tools/perf/builtin-script.c | 22 +-
> tools/perf/new_script_templates/README | 65 ++++
> tools/perf/new_script_templates/lib.rs | 108 +++++++
> tools/perf/new_script_templates/script.c | 113 +++++++
> tools/perf/util/Build | 1 +
> tools/perf/util/new_script.c | 376 +++++++++++++++++++++++
> tools/perf/util/new_script.h | 54 ++++
> tools/perf/util/new_script_rs_lib.h | 35 +++
> 8 files changed, 773 insertions(+), 1 deletion(-)
> create mode 100644 tools/perf/new_script_templates/README
> create mode 100644 tools/perf/new_script_templates/lib.rs
> create mode 100644 tools/perf/new_script_templates/script.c
> create mode 100644 tools/perf/util/new_script.c
> create mode 100644 tools/perf/util/new_script.h
> create mode 100644 tools/perf/util/new_script_rs_lib.h
>
> --
> 2.46.0
>