qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
@ 2024-05-27  8:14 Zhao Liu
  2024-05-27  8:14 ` [RFC 1/6] scripts/simpletrace-rust: Add the basic cargo framework Zhao Liu
                   ` (9 more replies)
  0 siblings, 10 replies; 30+ messages in thread
From: Zhao Liu @ 2024-05-27  8:14 UTC (permalink / raw)
  To: Stefan Hajnoczi, Mads Ynddal, Paolo Bonzini,
	Philippe Mathieu-Daudé, Peter Maydell, Alex Bennée,
	Daniel P . Berrangé, Thomas Huth, Markus Armbruster
  Cc: qemu-devel, Zhao Liu

Hi maintainers and list,

This RFC series attempts to re-implement simpletrace.py with Rust, which
is the 1st task of Paolo's GSoC 2024 proposal.

There are two motivations for this work:
1. This is an open chance to discuss how to integrate Rust into QEMU.
2. Rust delivers faster parsing.


Introduction
============

Code framework
--------------

I choose "cargo" to organize the code, because the current
implementation depends on external crates (Rust's library), such as
"backtrace" for getting frameinfo, "clap" for parsing the cli, "rex" for
regular matching, and so on. (Meson's support for external crates is
still incomplete. [2])

The simpletrace-rust created in this series is not yet integrated into
the QEMU compilation chain, so it can only be compiled independently, e.g.
under ./scripts/simpletrace/, compile it be:

    cargo build --release

The code tree for the entire simpletrace-rust is as follows:

$ script/simpletrace-rust .
.
├── Cargo.toml
└── src
    └── main.rs   // The simpletrace logic (similar to simpletrace.py).
    └── trace.rs  // The Argument and Event abstraction (refer to
                  // tracetool/__init__.py).

My question about meson v.s. cargo, I put it at the end of the cover
letter (the section "Opens on Rust Support").

The following two sections are lessons I've learned from this Rust
practice.


Performance
-----------

I did the performance comparison using the rust-simpletrace prototype with
the python one:

* On the i7-10700 CPU @ 2.90GHz machine, parsing and outputting a 35M
trace binary file for 10 times on each item:

                      AVE (ms)       Rust v.s. Python
Rust   (stdout)       12687.16            114.46%
Python (stdout)       14521.85

Rust   (file)          1422.44            264.99%
Python (file)          3769.37

- The "stdout" lines represent output to the screen.
- The "file" lines represent output to a file (via "> file").

This Rust version contains some optimizations (including print, regular
matching, etc.), but there should be plenty of room for optimization.

The current performance bottleneck is the reading binary trace file,
since I am parsing headers and event payloads one after the other, so
that the IO read overhead accounts for 33%, which can be further
optimized in the future.


Security
--------

This is an example.

Rust is very strict about type-checking, and it found timestamp reversal
issue in simpletrace-rust [3] (sorry, haven't gotten around to digging
deeper with more time)...in this RFC, I workingaround it by allowing
negative values. And the python version, just silently covered this
issue up.


Opens on Rust Support
=====================

Meson v.s. Cargo
----------------

The first question is whether all Rust code (including under scripts)
must be integrated into meson?

If so, because of [2] then I have to discard the external crates and
build some more Rust wheels of my own to replace the previous external
crates.

For the main part of the QEMU code, I think the answer must be Yes, but
for the tools in the scripts directory, would it be possible to allow
the use of cargo to build small tools/program for flexibility and
migrate to meson later (as meson's support for rust becomes more
mature)?


External crates
---------------

This is an additional question that naturally follows from the above
question, do we have requirements for Rust's external crate? Is only std
allowed?

Welcome your feedback!


[1]: https://wiki.qemu.org/Google_Summer_of_Code_2024
[2]: https://github.com/mesonbuild/meson/issues/2173
[3]: https://lore.kernel.org/qemu-devel/20240509134712.GA515599@fedora.redhat.com/

Thanks and Best Regards,
Zhao

---
Zhao Liu (6):
  scripts/simpletrace-rust: Add the basic cargo framework
  scripts/simpletrace-rust: Support Event & Arguments in trace module
  scripts/simpletrace-rust: Add helpers to parse trace file
  scripts/simpletrace-rust: Parse and check trace recode file
  scripts/simpletrace-rust: Format simple trace output
  docs/tracing: Add simpletrace-rust section

 docs/devel/tracing.rst                 |  35 ++
 scripts/simpletrace-rust/.gitignore    |   1 +
 scripts/simpletrace-rust/.rustfmt.toml |   9 +
 scripts/simpletrace-rust/Cargo.lock    | 370 +++++++++++++++
 scripts/simpletrace-rust/Cargo.toml    |  17 +
 scripts/simpletrace-rust/src/main.rs   | 633 +++++++++++++++++++++++++
 scripts/simpletrace-rust/src/trace.rs  | 339 +++++++++++++
 7 files changed, 1404 insertions(+)
 create mode 100644 scripts/simpletrace-rust/.gitignore
 create mode 100644 scripts/simpletrace-rust/.rustfmt.toml
 create mode 100644 scripts/simpletrace-rust/Cargo.lock
 create mode 100644 scripts/simpletrace-rust/Cargo.toml
 create mode 100644 scripts/simpletrace-rust/src/main.rs
 create mode 100644 scripts/simpletrace-rust/src/trace.rs

-- 
2.34.1



^ permalink raw reply	[flat|nested] 30+ messages in thread

* [RFC 1/6] scripts/simpletrace-rust: Add the basic cargo framework
  2024-05-27  8:14 [RFC 0/6] scripts: Rewrite simpletrace printer in Rust Zhao Liu
@ 2024-05-27  8:14 ` Zhao Liu
  2024-05-27 20:05   ` Stefan Hajnoczi
  2024-05-27  8:14 ` [RFC 2/6] scripts/simpletrace-rust: Support Event & Arguments in trace module Zhao Liu
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: Zhao Liu @ 2024-05-27  8:14 UTC (permalink / raw)
  To: Stefan Hajnoczi, Mads Ynddal, Paolo Bonzini,
	Philippe Mathieu-Daudé, Peter Maydell, Alex Bennée,
	Daniel P . Berrangé, Thomas Huth, Markus Armbruster
  Cc: qemu-devel, Zhao Liu

Define the basic cargo framework to support compiling simpletrace-rust
via cargo, and add the Rust code style (with some nightly features)
check items to make Rust style as close to the QEMU C code as possible.

With the base cargo package, define the basic code framework for
simpletrace-rust, approximating the Python version, and also abstract
Analyzer operations for simpletrace-rust. Event and other future
trace-related structures are placed in the trace module.

Additionally, support basic command line parsing for simpletrace-rust as
a start.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 scripts/simpletrace-rust/.gitignore    |   1 +
 scripts/simpletrace-rust/.rustfmt.toml |   9 +
 scripts/simpletrace-rust/Cargo.lock    | 239 +++++++++++++++++++++++++
 scripts/simpletrace-rust/Cargo.toml    |  11 ++
 scripts/simpletrace-rust/src/main.rs   | 173 ++++++++++++++++++
 scripts/simpletrace-rust/src/trace.rs  |  11 ++
 6 files changed, 444 insertions(+)
 create mode 100644 scripts/simpletrace-rust/.gitignore
 create mode 100644 scripts/simpletrace-rust/.rustfmt.toml
 create mode 100644 scripts/simpletrace-rust/Cargo.lock
 create mode 100644 scripts/simpletrace-rust/Cargo.toml
 create mode 100644 scripts/simpletrace-rust/src/main.rs
 create mode 100644 scripts/simpletrace-rust/src/trace.rs

diff --git a/scripts/simpletrace-rust/.gitignore b/scripts/simpletrace-rust/.gitignore
new file mode 100644
index 000000000000..2f7896d1d136
--- /dev/null
+++ b/scripts/simpletrace-rust/.gitignore
@@ -0,0 +1 @@
+target/
diff --git a/scripts/simpletrace-rust/.rustfmt.toml b/scripts/simpletrace-rust/.rustfmt.toml
new file mode 100644
index 000000000000..97a97c24ebfb
--- /dev/null
+++ b/scripts/simpletrace-rust/.rustfmt.toml
@@ -0,0 +1,9 @@
+brace_style = "AlwaysNextLine"
+comment_width = 80
+edition = "2021"
+group_imports = "StdExternalCrate"
+imports_granularity = "item"
+max_width = 80
+use_field_init_shorthand = true
+use_try_shorthand = true
+wrap_comments = true
diff --git a/scripts/simpletrace-rust/Cargo.lock b/scripts/simpletrace-rust/Cargo.lock
new file mode 100644
index 000000000000..4a0ff8092dcb
--- /dev/null
+++ b/scripts/simpletrace-rust/Cargo.lock
@@ -0,0 +1,239 @@
+# This file is automatically @generated by Cargo.
+# It is not intended for manual editing.
+version = 3
+
+[[package]]
+name = "anstream"
+version = "0.6.14"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "418c75fa768af9c03be99d17643f93f79bbba589895012a80e3452a19ddda15b"
+dependencies = [
+ "anstyle",
+ "anstyle-parse",
+ "anstyle-query",
+ "anstyle-wincon",
+ "colorchoice",
+ "is_terminal_polyfill",
+ "utf8parse",
+]
+
+[[package]]
+name = "anstyle"
+version = "1.0.7"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "038dfcf04a5feb68e9c60b21c9625a54c2c0616e79b72b0fd87075a056ae1d1b"
+
+[[package]]
+name = "anstyle-parse"
+version = "0.2.4"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "c03a11a9034d92058ceb6ee011ce58af4a9bf61491aa7e1e59ecd24bd40d22d4"
+dependencies = [
+ "utf8parse",
+]
+
+[[package]]
+name = "anstyle-query"
+version = "1.0.3"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "a64c907d4e79225ac72e2a354c9ce84d50ebb4586dee56c82b3ee73004f537f5"
+dependencies = [
+ "windows-sys",
+]
+
+[[package]]
+name = "anstyle-wincon"
+version = "3.0.3"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "61a38449feb7068f52bb06c12759005cf459ee52bb4adc1d5a7c4322d716fb19"
+dependencies = [
+ "anstyle",
+ "windows-sys",
+]
+
+[[package]]
+name = "clap"
+version = "4.5.4"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "90bc066a67923782aa8515dbaea16946c5bcc5addbd668bb80af688e53e548a0"
+dependencies = [
+ "clap_builder",
+]
+
+[[package]]
+name = "clap_builder"
+version = "4.5.2"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "ae129e2e766ae0ec03484e609954119f123cc1fe650337e155d03b022f24f7b4"
+dependencies = [
+ "anstream",
+ "anstyle",
+ "clap_lex",
+ "strsim",
+]
+
+[[package]]
+name = "clap_lex"
+version = "0.7.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "98cc8fbded0c607b7ba9dd60cd98df59af97e84d24e49c8557331cfc26d301ce"
+
+[[package]]
+name = "colorchoice"
+version = "1.0.1"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "0b6a852b24ab71dffc585bcb46eaf7959d175cb865a7152e35b348d1b2960422"
+
+[[package]]
+name = "is_terminal_polyfill"
+version = "1.70.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "f8478577c03552c21db0e2724ffb8986a5ce7af88107e6be5d2ee6e158c12800"
+
+[[package]]
+name = "proc-macro2"
+version = "1.0.83"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "0b33eb56c327dec362a9e55b3ad14f9d2f0904fb5a5b03b513ab5465399e9f43"
+dependencies = [
+ "unicode-ident",
+]
+
+[[package]]
+name = "quote"
+version = "1.0.36"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "0fa76aaf39101c457836aec0ce2316dbdc3ab723cdda1c6bd4e6ad4208acaca7"
+dependencies = [
+ "proc-macro2",
+]
+
+[[package]]
+name = "simpletrace-rust"
+version = "0.1.0"
+dependencies = [
+ "clap",
+ "thiserror",
+]
+
+[[package]]
+name = "strsim"
+version = "0.11.1"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "7da8b5736845d9f2fcb837ea5d9e2628564b3b043a70948a3f0b778838c5fb4f"
+
+[[package]]
+name = "syn"
+version = "2.0.66"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "c42f3f41a2de00b01c0aaad383c5a45241efc8b2d1eda5661812fda5f3cdcff5"
+dependencies = [
+ "proc-macro2",
+ "quote",
+ "unicode-ident",
+]
+
+[[package]]
+name = "thiserror"
+version = "1.0.61"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "c546c80d6be4bc6a00c0f01730c08df82eaa7a7a61f11d656526506112cc1709"
+dependencies = [
+ "thiserror-impl",
+]
+
+[[package]]
+name = "thiserror-impl"
+version = "1.0.61"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "46c3384250002a6d5af4d114f2845d37b57521033f30d5c3f46c4d70e1197533"
+dependencies = [
+ "proc-macro2",
+ "quote",
+ "syn",
+]
+
+[[package]]
+name = "unicode-ident"
+version = "1.0.12"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "3354b9ac3fae1ff6755cb6db53683adb661634f67557942dea4facebec0fee4b"
+
+[[package]]
+name = "utf8parse"
+version = "0.2.1"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "711b9620af191e0cdc7468a8d14e709c3dcdb115b36f838e601583af800a370a"
+
+[[package]]
+name = "windows-sys"
+version = "0.52.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "282be5f36a8ce781fad8c8ae18fa3f9beff57ec1b52cb3de0789201425d9a33d"
+dependencies = [
+ "windows-targets",
+]
+
+[[package]]
+name = "windows-targets"
+version = "0.52.5"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "6f0713a46559409d202e70e28227288446bf7841d3211583a4b53e3f6d96e7eb"
+dependencies = [
+ "windows_aarch64_gnullvm",
+ "windows_aarch64_msvc",
+ "windows_i686_gnu",
+ "windows_i686_gnullvm",
+ "windows_i686_msvc",
+ "windows_x86_64_gnu",
+ "windows_x86_64_gnullvm",
+ "windows_x86_64_msvc",
+]
+
+[[package]]
+name = "windows_aarch64_gnullvm"
+version = "0.52.5"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "7088eed71e8b8dda258ecc8bac5fb1153c5cffaf2578fc8ff5d61e23578d3263"
+
+[[package]]
+name = "windows_aarch64_msvc"
+version = "0.52.5"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "9985fd1504e250c615ca5f281c3f7a6da76213ebd5ccc9561496568a2752afb6"
+
+[[package]]
+name = "windows_i686_gnu"
+version = "0.52.5"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "88ba073cf16d5372720ec942a8ccbf61626074c6d4dd2e745299726ce8b89670"
+
+[[package]]
+name = "windows_i686_gnullvm"
+version = "0.52.5"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "87f4261229030a858f36b459e748ae97545d6f1ec60e5e0d6a3d32e0dc232ee9"
+
+[[package]]
+name = "windows_i686_msvc"
+version = "0.52.5"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "db3c2bf3d13d5b658be73463284eaf12830ac9a26a90c717b7f771dfe97487bf"
+
+[[package]]
+name = "windows_x86_64_gnu"
+version = "0.52.5"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "4e4246f76bdeff09eb48875a0fd3e2af6aada79d409d33011886d3e1581517d9"
+
+[[package]]
+name = "windows_x86_64_gnullvm"
+version = "0.52.5"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "852298e482cd67c356ddd9570386e2862b5673c85bd5f88df9ab6802b334c596"
+
+[[package]]
+name = "windows_x86_64_msvc"
+version = "0.52.5"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "bec47e5bfd1bff0eeaf6d8b485cc1074891a197ab4225d504cb7a1ab88b02bf0"
diff --git a/scripts/simpletrace-rust/Cargo.toml b/scripts/simpletrace-rust/Cargo.toml
new file mode 100644
index 000000000000..b44ba1569dad
--- /dev/null
+++ b/scripts/simpletrace-rust/Cargo.toml
@@ -0,0 +1,11 @@
+[package]
+name = "simpletrace-rust"
+description = "Pretty-printer for simple trace backend binary trace files (Rust version)"
+version = "0.1.0"
+edition = "2021"
+authors = ["Zhao Liu <zhao1.liu@intel.com>"]
+license = "GPL-2.0-or-later"
+
+[dependencies]
+clap = "4.5.4"
+thiserror = "1.0.20"
diff --git a/scripts/simpletrace-rust/src/main.rs b/scripts/simpletrace-rust/src/main.rs
new file mode 100644
index 000000000000..2d2926b7658d
--- /dev/null
+++ b/scripts/simpletrace-rust/src/main.rs
@@ -0,0 +1,173 @@
+/*
+ * Pretty-printer for simple trace backend binary trace files (Rust version)
+ *
+ * Copyright (C) 2024 Intel Corporation.
+ *
+ * Authors: Zhao Liu <zhao1.liu@intel.com>
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#![allow(dead_code)]
+#![allow(unused_variables)]
+
+mod trace;
+
+use std::env;
+
+use clap::Arg;
+use clap::Command;
+use thiserror::Error;
+use trace::Event;
+
+#[derive(Error, Debug)]
+pub enum Error
+{
+    #[error("usage: {0} [--no-header] <trace-events> <trace-file>")]
+    CliOptionUnmatch(String),
+}
+
+pub type Result<T> = std::result::Result<T, Error>;
+
+pub struct EventArgPayload {}
+
+trait Analyzer
+{
+    /* Called at the start of the trace. */
+    fn begin(&self) {}
+
+    /* Called if no specific method for processing a trace event. */
+    fn catchall(
+        &mut self,
+        rec_args: &[EventArgPayload],
+        event: &Event,
+        timestamp_ns: u64,
+        pid: u32,
+        event_id: u64,
+    ) -> Result<String>;
+
+    /* Called at the end of the trace. */
+    fn end(&self) {}
+
+    /*
+     * TODO: Support "variable" parameters (i.e. variants of process_event()
+     * with different parameters, like **kwargs in python), when we need a
+     * simpletrace rust module.
+     */
+    fn process_event(
+        &mut self,
+        rec_args: &[EventArgPayload],
+        event: &Event,
+        event_id: u64,
+        timestamp_ns: u64,
+        pid: u32,
+    ) -> Result<String>
+    {
+        self.catchall(rec_args, event, timestamp_ns, pid, event_id)
+
+        /*
+         * TODO: Support custom function hooks (like getattr() in python),
+         * when we need a simpletrace rust module.
+         */
+    }
+}
+
+struct Formatter
+{
+    last_timestamp_ns: Option<u64>,
+}
+
+impl Formatter
+{
+    fn new() -> Self
+    {
+        Formatter {
+            last_timestamp_ns: None,
+        }
+    }
+}
+
+impl Analyzer for Formatter
+{
+    fn catchall(
+        &mut self,
+        rec_args: &[EventArgPayload],
+        event: &Event,
+        timestamp_ns: u64,
+        pid: u32,
+        event_id: u64,
+    ) -> Result<String>
+    {
+        let fmt_str = String::new();
+
+        Ok(fmt_str)
+    }
+}
+
+fn process(
+    event_path: &str,
+    trace_path: &str,
+    analyzer: &mut Formatter,
+    read_header: bool,
+) -> Result<()>
+{
+    analyzer.begin();
+    analyzer.end();
+
+    Ok(())
+}
+
+/*
+ * Execute an analyzer on a trace file given on the command-line.
+ * This function is useful as a driver for simple analysis scripts.  More
+ * advanced scripts will want to call process() instead.
+ */
+fn run(analyzer: &mut Formatter) -> Result<()>
+{
+    let matches = Command::new("simple trace")
+        .arg(
+            Arg::new("no-header")
+                .required(false)
+                .long("no-header")
+                .action(clap::ArgAction::SetTrue)
+                .help("Disable header parsing"),
+        )
+        .arg(
+            Arg::new("trace-events")
+                .required(true)
+                .action(clap::ArgAction::Set)
+                .help("Path to trace events file"),
+        )
+        .arg(
+            Arg::new("trace-file")
+                .required(true)
+                .action(clap::ArgAction::Set)
+                .help("Path to trace file"),
+        )
+        .try_get_matches()
+        .map_err(|_| {
+            Error::CliOptionUnmatch(
+                env::current_exe()
+                    .unwrap_or_else(|_| "simpletrace".into())
+                    .to_string_lossy()
+                    .to_string(),
+            )
+        })?;
+
+    let no_header = matches.get_flag("no-header");
+    let event_path = matches.get_one::<String>("trace-events").unwrap();
+    let trace_path = matches.get_one::<String>("trace-file").unwrap();
+
+    process(event_path, trace_path, analyzer, !no_header)?;
+
+    Ok(())
+}
+
+fn main()
+{
+    let mut fmt = Formatter::new();
+
+    if let Err(e) = run(&mut fmt) {
+        println!("{:?}", e.to_string());
+    }
+}
diff --git a/scripts/simpletrace-rust/src/trace.rs b/scripts/simpletrace-rust/src/trace.rs
new file mode 100644
index 000000000000..3a4b06435b8b
--- /dev/null
+++ b/scripts/simpletrace-rust/src/trace.rs
@@ -0,0 +1,11 @@
+/*
+ * Machinery for generating tracing-related intermediate files (Rust version)
+ *
+ * Copyright (C) 2024 Intel Corporation.
+ *
+ * Authors: Zhao Liu <zhao1.liu@intel.com>
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+pub struct Event {}
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC 2/6] scripts/simpletrace-rust: Support Event & Arguments in trace module
  2024-05-27  8:14 [RFC 0/6] scripts: Rewrite simpletrace printer in Rust Zhao Liu
  2024-05-27  8:14 ` [RFC 1/6] scripts/simpletrace-rust: Add the basic cargo framework Zhao Liu
@ 2024-05-27  8:14 ` Zhao Liu
  2024-05-27 20:33   ` Stefan Hajnoczi
  2024-05-27  8:14 ` [RFC 3/6] scripts/simpletrace-rust: Add helpers to parse trace file Zhao Liu
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: Zhao Liu @ 2024-05-27  8:14 UTC (permalink / raw)
  To: Stefan Hajnoczi, Mads Ynddal, Paolo Bonzini,
	Philippe Mathieu-Daudé, Peter Maydell, Alex Bennée,
	Daniel P . Berrangé, Thomas Huth, Markus Armbruster
  Cc: qemu-devel, Zhao Liu

Refer to scripts/tracetool/__init__.py, add Event & Arguments
abstractions in trace module.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 scripts/simpletrace-rust/Cargo.lock   |  52 ++++
 scripts/simpletrace-rust/Cargo.toml   |   2 +
 scripts/simpletrace-rust/src/trace.rs | 330 +++++++++++++++++++++++++-
 3 files changed, 383 insertions(+), 1 deletion(-)

diff --git a/scripts/simpletrace-rust/Cargo.lock b/scripts/simpletrace-rust/Cargo.lock
index 4a0ff8092dcb..3d815014eb44 100644
--- a/scripts/simpletrace-rust/Cargo.lock
+++ b/scripts/simpletrace-rust/Cargo.lock
@@ -2,6 +2,15 @@
 # It is not intended for manual editing.
 version = 3
 
+[[package]]
+name = "aho-corasick"
+version = "1.1.3"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "8e60d3430d3a69478ad0993f19238d2df97c507009a52b3c10addcd7f6bcb916"
+dependencies = [
+ "memchr",
+]
+
 [[package]]
 name = "anstream"
 version = "0.6.14"
@@ -90,6 +99,18 @@ version = "1.70.0"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "f8478577c03552c21db0e2724ffb8986a5ce7af88107e6be5d2ee6e158c12800"
 
+[[package]]
+name = "memchr"
+version = "2.7.2"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "6c8640c5d730cb13ebd907d8d04b52f55ac9a2eec55b440c8892f40d56c76c1d"
+
+[[package]]
+name = "once_cell"
+version = "1.19.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "3fdb12b2476b595f9358c5161aa467c2438859caa136dec86c26fdd2efe17b92"
+
 [[package]]
 name = "proc-macro2"
 version = "1.0.83"
@@ -108,11 +129,42 @@ dependencies = [
  "proc-macro2",
 ]
 
+[[package]]
+name = "regex"
+version = "1.10.4"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "c117dbdfde9c8308975b6a18d71f3f385c89461f7b3fb054288ecf2a2058ba4c"
+dependencies = [
+ "aho-corasick",
+ "memchr",
+ "regex-automata",
+ "regex-syntax",
+]
+
+[[package]]
+name = "regex-automata"
+version = "0.4.6"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "86b83b8b9847f9bf95ef68afb0b8e6cdb80f498442f5179a29fad448fcc1eaea"
+dependencies = [
+ "aho-corasick",
+ "memchr",
+ "regex-syntax",
+]
+
+[[package]]
+name = "regex-syntax"
+version = "0.8.3"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "adad44e29e4c806119491a7f06f03de4d1af22c3a680dd47f1e6e179439d1f56"
+
 [[package]]
 name = "simpletrace-rust"
 version = "0.1.0"
 dependencies = [
  "clap",
+ "once_cell",
+ "regex",
  "thiserror",
 ]
 
diff --git a/scripts/simpletrace-rust/Cargo.toml b/scripts/simpletrace-rust/Cargo.toml
index b44ba1569dad..24a79d04e566 100644
--- a/scripts/simpletrace-rust/Cargo.toml
+++ b/scripts/simpletrace-rust/Cargo.toml
@@ -8,4 +8,6 @@ license = "GPL-2.0-or-later"
 
 [dependencies]
 clap = "4.5.4"
+once_cell = "1.19.0"
+regex = "1.10.4"
 thiserror = "1.0.20"
diff --git a/scripts/simpletrace-rust/src/trace.rs b/scripts/simpletrace-rust/src/trace.rs
index 3a4b06435b8b..f41d6e0b5bc3 100644
--- a/scripts/simpletrace-rust/src/trace.rs
+++ b/scripts/simpletrace-rust/src/trace.rs
@@ -8,4 +8,332 @@
  * SPDX-License-Identifier: GPL-2.0-or-later
  */
 
-pub struct Event {}
+#![allow(dead_code)]
+
+use std::fs::read_to_string;
+
+use once_cell::sync::Lazy;
+use regex::Regex;
+use thiserror::Error;
+
+#[derive(Error, Debug)]
+pub enum Error
+{
+    #[error("Empty argument (did you forget to use 'void'?)")]
+    EmptyArg,
+    #[error("Event '{0}' has more than maximum permitted argument count")]
+    InvalidArgCnt(String),
+    #[error("{0} does not end with a new line")]
+    InvalidEventFile(String),
+    #[error("Invalid format: {0}")]
+    InvalidFormat(String),
+    #[error(
+        "Argument type '{0}' is not allowed. \
+        Only standard C types and fixed size integer \
+        types should be used. struct, union, and \
+        other complex pointer types should be \
+        declared as 'void *'"
+    )]
+    InvalidType(String),
+    #[error("Error at {0}:{1}: {2}")]
+    ReadEventFail(String, usize, String),
+    #[error("Unknown event: {0}")]
+    UnknownEvent(String),
+    #[error("Unknown properties: {0}")]
+    UnknownProp(String),
+}
+
+pub type Result<T> = std::result::Result<T, Error>;
+
+/*
+ * Refer to the description of ALLOWED_TYPES in
+ * scripts/tracetool/__init__.py.
+ */
+const ALLOWED_TYPES: [&str; 20] = [
+    "int",
+    "long",
+    "short",
+    "char",
+    "bool",
+    "unsigned",
+    "signed",
+    "int8_t",
+    "uint8_t",
+    "int16_t",
+    "uint16_t",
+    "int32_t",
+    "uint32_t",
+    "int64_t",
+    "uint64_t",
+    "void",
+    "size_t",
+    "ssize_t",
+    "uintptr_t",
+    "ptrdiff_t",
+];
+
+const STRING_TYPES: [&str; 4] =
+    ["const char*", "char*", "const char *", "char *"];
+
+/* TODO: Support 'vcpu' property. */
+const VALID_PROPS: [&str; 1] = ["disable"];
+
+fn validate_c_type(arg_type: &str) -> Result<()>
+{
+    static RE_TYPE: Lazy<Regex> = Lazy::new(|| Regex::new(r"\*").unwrap());
+    let bits: Vec<String> =
+        arg_type.split_whitespace().map(|s| s.to_string()).collect();
+    for bit in bits {
+        let res = RE_TYPE.replace_all(&bit, "");
+        if res.is_empty() {
+            continue;
+        }
+        if res == "const" {
+            continue;
+        }
+        if !ALLOWED_TYPES.contains(&res.as_ref()) {
+            return Err(Error::InvalidType(res.to_string()));
+        }
+    }
+    Ok(())
+}
+
+pub fn read_events(fname: &str) -> Result<Vec<Event>>
+{
+    let fstr = read_to_string(fname).unwrap();
+    let lines = fstr.lines().map(|s| s.trim()).collect::<Vec<_>>();
+    let mut events = Vec::new();
+
+    /*
+     * lines() in Rust: Line terminators are not included in the lines
+     * returned by the iterator, so check whether line_str is empty.
+     */
+    for (lineno, line_str) in lines.iter().enumerate() {
+        if line_str.is_empty() || line_str.starts_with('#') {
+            continue;
+        }
+
+        let event = Event::build(line_str, lineno as u32 + 1, fname);
+        if let Err(e) = event {
+            return Err(Error::ReadEventFail(
+                fname.to_owned(),
+                lineno,
+                e.to_string(),
+            ));
+        } else {
+            events.push(event.unwrap());
+        }
+    }
+
+    Ok(events)
+}
+
+#[derive(Clone)]
+pub struct ArgProperty
+{
+    pub c_type: String,
+    pub name: String,
+}
+
+impl ArgProperty
+{
+    fn new(c_type: &str, name: &str) -> Self
+    {
+        ArgProperty {
+            c_type: c_type.to_string(),
+            name: name.to_string(),
+        }
+    }
+
+    pub fn is_string(&self) -> bool
+    {
+        let arg_strip = self.c_type.trim_start();
+        STRING_TYPES.iter().any(|&s| arg_strip.starts_with(s))
+            && arg_strip.matches('*').count() == 1
+    }
+}
+
+#[derive(Default, Clone)]
+pub struct Arguments
+{
+    /* List of (type, name) tuples or arguments properties. */
+    pub props: Vec<ArgProperty>,
+}
+
+impl Arguments
+{
+    pub fn new() -> Self
+    {
+        Arguments { props: Vec::new() }
+    }
+
+    pub fn len(&self) -> usize
+    {
+        self.props.len()
+    }
+
+    pub fn build(arg_str: &str) -> Result<Arguments>
+    {
+        let mut args = Arguments::new();
+        for arg in arg_str.split(',').map(|s| s.trim()) {
+            if arg.is_empty() {
+                return Err(Error::EmptyArg);
+            }
+
+            if arg == "void" {
+                continue;
+            }
+
+            let (arg_type, identifier) = if arg.contains('*') {
+                /* FIXME: Implement rsplit_inclusive(). */
+                let p = arg.rfind('*').unwrap();
+                (
+                    /* Safe because arg contains "*" and p exists. */
+                    unsafe { arg.get_unchecked(..p + 1) },
+                    /* Safe because arg contains "*" and p exists. */
+                    unsafe { arg.get_unchecked(p + 1..) },
+                )
+            } else {
+                arg.rsplit_once(' ').unwrap()
+            };
+
+            validate_c_type(arg_type)?;
+            args.props.push(ArgProperty::new(arg_type, identifier));
+        }
+        Ok(args)
+    }
+}
+
+/* TODO: Support original, event_trans, event_exec if needed. */
+#[derive(Default, Clone)]
+pub struct Event
+{
+    /* The event name. */
+    pub name: String,
+    /* Properties of the event. */
+    pub properties: Vec<String>,
+    /* The event format string. */
+    pub fmt: Vec<String>,
+    /* The event arguments. */
+    pub args: Arguments,
+    /* The line number in the input file. */
+    pub lineno: u32,
+    /* The path to the input file. */
+    pub filename: String,
+}
+
+impl Event
+{
+    #[allow(clippy::too_many_arguments)]
+    pub fn new(
+        name: &str,
+        mut props: Vec<String>,
+        fmt: Vec<String>,
+        args: Arguments,
+        lineno: u32,
+        filename: &str,
+    ) -> Result<Self>
+    {
+        let mut event = Event {
+            name: name.to_string(),
+            fmt: fmt.clone(),
+            args,
+            lineno,
+            filename: filename.to_string(),
+            ..Default::default()
+        };
+
+        event.properties.append(&mut props);
+
+        if event.args.len() > 10 {
+            return Err(Error::InvalidArgCnt(event.name));
+        }
+
+        let unknown_props: Vec<String> = event
+            .properties
+            .iter()
+            .filter_map(|p| {
+                if !VALID_PROPS.contains(&p.as_str()) {
+                    Some(p.to_string())
+                } else {
+                    None
+                }
+            })
+            .collect();
+        if !unknown_props.is_empty() {
+            return Err(Error::UnknownProp(format!("{:?}", unknown_props)));
+        }
+
+        if event.fmt.len() > 2 {
+            return Err(Error::InvalidFormat(
+                format!("too many arguments ({})", event.fmt.len()).to_string(),
+            ));
+        }
+
+        Ok(event)
+    }
+
+    pub fn build(line_str: &str, lineno: u32, filename: &str) -> Result<Event>
+    {
+        static RE: Lazy<Regex> = Lazy::new(|| {
+            Regex::new(
+                r#"(?x)
+                ((?P<props>[\w\s]+)\s+)?
+                (?P<name>\w+)
+                \((?P<args>[^)]*)\)
+                \s*
+                (?:(?:(?P<fmt_trans>".+),)?\s*(?P<fmt>".+))?
+                \s*"#,
+            )
+            .unwrap()
+        });
+
+        let caps_res = RE.captures(line_str);
+        if caps_res.is_none() {
+            return Err(Error::UnknownEvent(line_str.to_owned()));
+        }
+        let caps = caps_res.unwrap();
+        let name = caps.name("name").map_or("", |m| m.as_str());
+        let props: Vec<String> = if caps.name("props").is_some() {
+            caps.name("props")
+                .unwrap()
+                .as_str()
+                .split_whitespace()
+                .map(|s| s.to_string())
+                .collect()
+        } else {
+            Vec::new()
+        };
+        let fmt: String =
+            caps.name("fmt").map_or("", |m| m.as_str()).to_string();
+        let fmt_trans: String = caps
+            .name("fmt_trans")
+            .map_or("", |m| m.as_str())
+            .to_string();
+
+        if fmt.contains("%m") || fmt_trans.contains("%m") {
+            return Err(Error::InvalidFormat(
+                "Event format '%m' is forbidden, pass the error 
+                as an explicit trace argument"
+                    .to_string(),
+            ));
+        }
+        if fmt.ends_with(r"\n") {
+            return Err(Error::InvalidFormat(
+                "Event format must not end with a newline 
+                character"
+                    .to_string(),
+            ));
+        }
+        let mut fmt_vec = vec![fmt];
+        if !fmt_trans.is_empty() {
+            fmt_vec.push(fmt_trans);
+        }
+
+        let args =
+            Arguments::build(caps.name("args").map_or("", |m| m.as_str()))?;
+        let event = Event::new(name, props, fmt_vec, args, lineno, filename)?;
+
+        Ok(event)
+    }
+}
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC 3/6] scripts/simpletrace-rust: Add helpers to parse trace file
  2024-05-27  8:14 [RFC 0/6] scripts: Rewrite simpletrace printer in Rust Zhao Liu
  2024-05-27  8:14 ` [RFC 1/6] scripts/simpletrace-rust: Add the basic cargo framework Zhao Liu
  2024-05-27  8:14 ` [RFC 2/6] scripts/simpletrace-rust: Support Event & Arguments in trace module Zhao Liu
@ 2024-05-27  8:14 ` Zhao Liu
  2024-05-27 20:39   ` Stefan Hajnoczi
  2024-05-27  8:14 ` [RFC 4/6] scripts/simpletrace-rust: Parse and check trace recode file Zhao Liu
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: Zhao Liu @ 2024-05-27  8:14 UTC (permalink / raw)
  To: Stefan Hajnoczi, Mads Ynddal, Paolo Bonzini,
	Philippe Mathieu-Daudé, Peter Maydell, Alex Bennée,
	Daniel P . Berrangé, Thomas Huth, Markus Armbruster
  Cc: qemu-devel, Zhao Liu

Refer to scripts/simpletrace.py, add the helpers to read the trace file
and parse the record type field, record header and log header.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 scripts/simpletrace-rust/src/main.rs | 151 +++++++++++++++++++++++++++
 1 file changed, 151 insertions(+)

diff --git a/scripts/simpletrace-rust/src/main.rs b/scripts/simpletrace-rust/src/main.rs
index 2d2926b7658d..b3b8baee7c66 100644
--- a/scripts/simpletrace-rust/src/main.rs
+++ b/scripts/simpletrace-rust/src/main.rs
@@ -14,21 +14,172 @@
 mod trace;
 
 use std::env;
+use std::fs::File;
+use std::io::Error as IOError;
+use std::io::ErrorKind;
+use std::io::Read;
 
 use clap::Arg;
 use clap::Command;
 use thiserror::Error;
 use trace::Event;
 
+const RECORD_TYPE_MAPPING: u64 = 0;
+const RECORD_TYPE_EVENT: u64 = 1;
+
 #[derive(Error, Debug)]
 pub enum Error
 {
     #[error("usage: {0} [--no-header] <trace-events> <trace-file>")]
     CliOptionUnmatch(String),
+    #[error("Failed to read file: {0}")]
+    ReadFile(IOError),
+    #[error("Unknown record type ({0})")]
+    UnknownRecType(u64),
 }
 
 pub type Result<T> = std::result::Result<T, Error>;
 
+enum RecordType
+{
+    Empty,
+    Mapping,
+    Event,
+}
+
+#[repr(C)]
+#[derive(Clone, Copy, Default)]
+struct RecordRawType
+{
+    rtype: u64,
+}
+
+impl RecordType
+{
+    fn read_type(mut fobj: &File) -> Result<RecordType>
+    {
+        let mut tbuf = [0u8; 8];
+        if let Err(e) = fobj.read_exact(&mut tbuf) {
+            if e.kind() == ErrorKind::UnexpectedEof {
+                return Ok(RecordType::Empty);
+            } else {
+                return Err(Error::ReadFile(e));
+            }
+        }
+
+        /*
+         * Safe because the layout of the trace record requires us to parse
+         * the type first, and then there is a check on the validity of the
+         * record type.
+         */
+        let raw_t =
+            unsafe { std::mem::transmute::<[u8; 8], RecordRawType>(tbuf) };
+        match raw_t.rtype {
+            RECORD_TYPE_MAPPING => Ok(RecordType::Mapping),
+            RECORD_TYPE_EVENT => Ok(RecordType::Event),
+            _ => Err(Error::UnknownRecType(raw_t.rtype)),
+        }
+    }
+}
+
+trait ReadHeader
+{
+    fn read_header(fobj: &File) -> Result<Self>
+    where
+        Self: Sized;
+}
+
+#[repr(C)]
+#[derive(Clone, Copy)]
+struct LogHeader
+{
+    event_id: u64,
+    magic: u64,
+    version: u64,
+}
+
+impl ReadHeader for LogHeader
+{
+    fn read_header(mut fobj: &File) -> Result<Self>
+    {
+        let mut raw_hdr = [0u8; 24];
+        fobj.read_exact(&mut raw_hdr).map_err(Error::ReadFile)?;
+
+        /*
+         * Safe because the size of log header (struct LogHeader)
+         * is 24 bytes, which is ensured by simple trace backend.
+         */
+        let hdr =
+            unsafe { std::mem::transmute::<[u8; 24], LogHeader>(raw_hdr) };
+        Ok(hdr)
+    }
+}
+
+#[derive(Default)]
+struct RecordInfo
+{
+    event_id: u64,
+    timestamp_ns: u64,
+    record_pid: u32,
+    args_payload: Vec<u8>,
+}
+
+impl RecordInfo
+{
+    fn new() -> Self
+    {
+        Default::default()
+    }
+}
+
+#[repr(C)]
+#[derive(Clone, Copy)]
+struct RecordHeader
+{
+    event_id: u64,
+    timestamp_ns: u64,
+    record_length: u32,
+    record_pid: u32,
+}
+
+impl RecordHeader
+{
+    fn extract_record(&self, mut fobj: &File) -> Result<RecordInfo>
+    {
+        let mut info = RecordInfo::new();
+
+        info.event_id = self.event_id;
+        info.timestamp_ns = self.timestamp_ns;
+        info.record_pid = self.record_pid;
+        info.args_payload = vec![
+            0u8;
+            self.record_length as usize
+                - std::mem::size_of::<RecordHeader>()
+        ];
+        fobj.read_exact(&mut info.args_payload)
+            .map_err(Error::ReadFile)?;
+
+        Ok(info)
+    }
+}
+
+impl ReadHeader for RecordHeader
+{
+    fn read_header(mut fobj: &File) -> Result<Self>
+    {
+        let mut raw_hdr = [0u8; 24];
+        fobj.read_exact(&mut raw_hdr).map_err(Error::ReadFile)?;
+
+        /*
+         * Safe because the size of record header (struct RecordHeader)
+         * is 24 bytes, which is ensured by simple trace backend.
+         */
+        let hdr: RecordHeader =
+            unsafe { std::mem::transmute::<[u8; 24], RecordHeader>(raw_hdr) };
+        Ok(hdr)
+    }
+}
+
 pub struct EventArgPayload {}
 
 trait Analyzer
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC 4/6] scripts/simpletrace-rust: Parse and check trace recode file
  2024-05-27  8:14 [RFC 0/6] scripts: Rewrite simpletrace printer in Rust Zhao Liu
                   ` (2 preceding siblings ...)
  2024-05-27  8:14 ` [RFC 3/6] scripts/simpletrace-rust: Add helpers to parse trace file Zhao Liu
@ 2024-05-27  8:14 ` Zhao Liu
  2024-05-27 20:44   ` Stefan Hajnoczi
  2024-05-27  8:14 ` [RFC 5/6] scripts/simpletrace-rust: Format simple trace output Zhao Liu
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: Zhao Liu @ 2024-05-27  8:14 UTC (permalink / raw)
  To: Stefan Hajnoczi, Mads Ynddal, Paolo Bonzini,
	Philippe Mathieu-Daudé, Peter Maydell, Alex Bennée,
	Daniel P . Berrangé, Thomas Huth, Markus Armbruster
  Cc: qemu-devel, Zhao Liu

Refer to scripts/simpletrace.py, parse and check the simple trace
backend binary trace file.

Note, in order to keep certain backtrace information to get frame,
adjust the cargo debug level for release version to "line-tables-only",
which slows down the program, but is necessary.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 scripts/simpletrace-rust/Cargo.lock  |  79 +++++++++
 scripts/simpletrace-rust/Cargo.toml  |   4 +
 scripts/simpletrace-rust/src/main.rs | 253 ++++++++++++++++++++++++++-
 3 files changed, 329 insertions(+), 7 deletions(-)

diff --git a/scripts/simpletrace-rust/Cargo.lock b/scripts/simpletrace-rust/Cargo.lock
index 3d815014eb44..37d80974ffe7 100644
--- a/scripts/simpletrace-rust/Cargo.lock
+++ b/scripts/simpletrace-rust/Cargo.lock
@@ -2,6 +2,21 @@
 # It is not intended for manual editing.
 version = 3
 
+[[package]]
+name = "addr2line"
+version = "0.21.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "8a30b2e23b9e17a9f90641c7ab1549cd9b44f296d3ccbf309d2863cfe398a0cb"
+dependencies = [
+ "gimli",
+]
+
+[[package]]
+name = "adler"
+version = "1.0.2"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "f26201604c87b1e01bd3d98f8d5d9a8fcbb815e8cedb41ffccbeb4bf593a35fe"
+
 [[package]]
 name = "aho-corasick"
 version = "1.1.3"
@@ -60,6 +75,33 @@ dependencies = [
  "windows-sys",
 ]
 
+[[package]]
+name = "backtrace"
+version = "0.3.71"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "26b05800d2e817c8b3b4b54abd461726265fa9789ae34330622f2db9ee696f9d"
+dependencies = [
+ "addr2line",
+ "cc",
+ "cfg-if",
+ "libc",
+ "miniz_oxide",
+ "object",
+ "rustc-demangle",
+]
+
+[[package]]
+name = "cc"
+version = "1.0.98"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "41c270e7540d725e65ac7f1b212ac8ce349719624d7bcff99f8e2e488e8cf03f"
+
+[[package]]
+name = "cfg-if"
+version = "1.0.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "baf1de4339761588bc0619e3cbc0120ee582ebb74b53b4efbf79117bd2da40fd"
+
 [[package]]
 name = "clap"
 version = "4.5.4"
@@ -93,18 +135,48 @@ version = "1.0.1"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "0b6a852b24ab71dffc585bcb46eaf7959d175cb865a7152e35b348d1b2960422"
 
+[[package]]
+name = "gimli"
+version = "0.28.1"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "4271d37baee1b8c7e4b708028c57d816cf9d2434acb33a549475f78c181f6253"
+
 [[package]]
 name = "is_terminal_polyfill"
 version = "1.70.0"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "f8478577c03552c21db0e2724ffb8986a5ce7af88107e6be5d2ee6e158c12800"
 
+[[package]]
+name = "libc"
+version = "0.2.155"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "97b3888a4aecf77e811145cadf6eef5901f4782c53886191b2f693f24761847c"
+
 [[package]]
 name = "memchr"
 version = "2.7.2"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "6c8640c5d730cb13ebd907d8d04b52f55ac9a2eec55b440c8892f40d56c76c1d"
 
+[[package]]
+name = "miniz_oxide"
+version = "0.7.3"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "87dfd01fe195c66b572b37921ad8803d010623c0aca821bea2302239d155cdae"
+dependencies = [
+ "adler",
+]
+
+[[package]]
+name = "object"
+version = "0.32.2"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "a6a622008b6e321afc04970976f62ee297fdbaa6f95318ca343e3eebb9648441"
+dependencies = [
+ "memchr",
+]
+
 [[package]]
 name = "once_cell"
 version = "1.19.0"
@@ -158,10 +230,17 @@ version = "0.8.3"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "adad44e29e4c806119491a7f06f03de4d1af22c3a680dd47f1e6e179439d1f56"
 
+[[package]]
+name = "rustc-demangle"
+version = "0.1.24"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "719b953e2095829ee67db738b3bfa9fa368c94900df327b3f07fe6e794d2fe1f"
+
 [[package]]
 name = "simpletrace-rust"
 version = "0.1.0"
 dependencies = [
+ "backtrace",
  "clap",
  "once_cell",
  "regex",
diff --git a/scripts/simpletrace-rust/Cargo.toml b/scripts/simpletrace-rust/Cargo.toml
index 24a79d04e566..23a3179de01c 100644
--- a/scripts/simpletrace-rust/Cargo.toml
+++ b/scripts/simpletrace-rust/Cargo.toml
@@ -7,7 +7,11 @@ authors = ["Zhao Liu <zhao1.liu@intel.com>"]
 license = "GPL-2.0-or-later"
 
 [dependencies]
+backtrace = "0.3"
 clap = "4.5.4"
 once_cell = "1.19.0"
 regex = "1.10.4"
 thiserror = "1.0.20"
+
+[profile.release]
+debug = "line-tables-only"
diff --git a/scripts/simpletrace-rust/src/main.rs b/scripts/simpletrace-rust/src/main.rs
index b3b8baee7c66..f9a71d8dc243 100644
--- a/scripts/simpletrace-rust/src/main.rs
+++ b/scripts/simpletrace-rust/src/main.rs
@@ -13,17 +13,24 @@
 
 mod trace;
 
+use std::collections::HashMap;
 use std::env;
 use std::fs::File;
 use std::io::Error as IOError;
 use std::io::ErrorKind;
 use std::io::Read;
+use std::mem::size_of;
 
+use backtrace::Backtrace;
 use clap::Arg;
 use clap::Command;
 use thiserror::Error;
 use trace::Event;
 
+const DROPPED_EVENT_ID: u64 = 0xfffffffffffffffe;
+const HEADER_MAGIC: u64 = 0xf2b177cb0aa429b4;
+const HEADER_EVENT_ID: u64 = 0xffffffffffffffff;
+
 const RECORD_TYPE_MAPPING: u64 = 0;
 const RECORD_TYPE_EVENT: u64 = 1;
 
@@ -32,10 +39,25 @@ pub enum Error
 {
     #[error("usage: {0} [--no-header] <trace-events> <trace-file>")]
     CliOptionUnmatch(String),
+    #[error("Invalid event name ({0})")]
+    InvalidEventName(String),
+    #[error("Not a valid trace file, header id {0} != {1}")]
+    InvalidHeaderId(u64, u64),
+    #[error("Not a valid trace file, header magic {0} != {1}")]
+    InvalidHeaderMagic(u64, u64),
     #[error("Failed to read file: {0}")]
     ReadFile(IOError),
+    #[error(
+        "event {0} is logged but is not declared in the trace events \
+        file, try using trace-events-all instead."
+    )]
+    UnknownEvent(String),
     #[error("Unknown record type ({0})")]
     UnknownRecType(u64),
+    #[error("Unknown version {0} of tracelog format!")]
+    UnknownVersion(u64),
+    #[error("Log format {0} not supported with this QEMU release!")]
+    UnsupportedVersion(u64),
 }
 
 pub type Result<T> = std::result::Result<T, Error>;
@@ -98,19 +120,22 @@ struct LogHeader
     version: u64,
 }
 
+const LOG_HDR_LEN: usize = size_of::<LogHeader>();
+
 impl ReadHeader for LogHeader
 {
     fn read_header(mut fobj: &File) -> Result<Self>
     {
-        let mut raw_hdr = [0u8; 24];
+        let mut raw_hdr = [0u8; LOG_HDR_LEN];
         fobj.read_exact(&mut raw_hdr).map_err(Error::ReadFile)?;
 
         /*
          * Safe because the size of log header (struct LogHeader)
          * is 24 bytes, which is ensured by simple trace backend.
          */
-        let hdr =
-            unsafe { std::mem::transmute::<[u8; 24], LogHeader>(raw_hdr) };
+        let hdr = unsafe {
+            std::mem::transmute::<[u8; LOG_HDR_LEN], LogHeader>(raw_hdr)
+        };
         Ok(hdr)
     }
 }
@@ -142,6 +167,8 @@ struct RecordHeader
     record_pid: u32,
 }
 
+const REC_HDR_LEN: usize = size_of::<RecordHeader>();
+
 impl RecordHeader
 {
     fn extract_record(&self, mut fobj: &File) -> Result<RecordInfo>
@@ -167,20 +194,232 @@ impl ReadHeader for RecordHeader
 {
     fn read_header(mut fobj: &File) -> Result<Self>
     {
-        let mut raw_hdr = [0u8; 24];
+        let mut raw_hdr = [0u8; REC_HDR_LEN];
         fobj.read_exact(&mut raw_hdr).map_err(Error::ReadFile)?;
 
         /*
          * Safe because the size of record header (struct RecordHeader)
          * is 24 bytes, which is ensured by simple trace backend.
          */
-        let hdr: RecordHeader =
-            unsafe { std::mem::transmute::<[u8; 24], RecordHeader>(raw_hdr) };
+        let hdr: RecordHeader = unsafe {
+            std::mem::transmute::<[u8; REC_HDR_LEN], RecordHeader>(raw_hdr)
+        };
         Ok(hdr)
     }
 }
 
-pub struct EventArgPayload {}
+#[derive(Clone)]
+pub struct EventArgPayload
+{
+    raw_val: Option<u64>,
+    raw_str: Option<String>,
+}
+
+impl EventArgPayload
+{
+    fn new(raw_val: Option<u64>, raw_str: Option<String>) -> Self
+    {
+        EventArgPayload { raw_val, raw_str }
+    }
+
+    fn get_payload_str(
+        offset: &mut usize,
+        args_payload: &[u8],
+    ) -> Result<EventArgPayload>
+    {
+        let length = u32::from_le_bytes(
+            args_payload[*offset..(*offset + 4)].try_into().unwrap(),
+        );
+        *offset += 4;
+        let raw_str: &[u8] = args_payload
+            .get(*offset..(*offset + length as usize))
+            .unwrap();
+        *offset += length as usize;
+        Ok(EventArgPayload::new(
+            None,
+            Some(String::from_utf8_lossy(raw_str).into_owned()),
+        ))
+    }
+
+    fn get_payload_val(
+        offset: &mut usize,
+        args_payload: &[u8],
+    ) -> Result<EventArgPayload>
+    {
+        let raw_val = u64::from_le_bytes(
+            args_payload[*offset..(*offset + 8)].try_into().unwrap(),
+        );
+        *offset += 8;
+        Ok(EventArgPayload::new(Some(raw_val), None))
+    }
+}
+
+#[derive(Clone)]
+struct EventEntry
+{
+    event: Event,
+    event_id: u64,
+    timestamp_ns: u64,
+    record_pid: u32,
+    rec_args: Vec<EventArgPayload>,
+}
+
+impl EventEntry
+{
+    fn new(
+        event: &Event,
+        event_id: u64,
+        timestamp_ns: u64,
+        record_pid: u32,
+    ) -> Self
+    {
+        EventEntry {
+            event: event.clone(),
+            event_id,
+            timestamp_ns,
+            record_pid,
+            rec_args: Vec::new(),
+        }
+    }
+}
+
+fn get_mapping(mut fobj: &File) -> Result<(u64, String)>
+{
+    let mut event_id_buf = [0u8; 8];
+    fobj.read_exact(&mut event_id_buf)
+        .map_err(Error::ReadFile)?;
+    let event_id = u64::from_le_bytes(event_id_buf);
+
+    let mut len_buf = [0u8; 4];
+    fobj.read_exact(&mut len_buf).map_err(Error::ReadFile)?;
+
+    let len = u32::from_le_bytes(len_buf);
+    let mut name_buf = vec![0u8; len as usize];
+    fobj.read_exact(&mut name_buf).map_err(Error::ReadFile)?;
+    let name = String::from_utf8(name_buf.clone())
+        .map_err(|_| Error::InvalidEventName(format!("{:?}", name_buf)))?;
+
+    Ok((event_id, name))
+}
+
+fn read_record(fobj: &File) -> Result<RecordInfo>
+{
+    let hdr = RecordHeader::read_header(fobj)?;
+    let info = hdr.extract_record(fobj)?;
+    Ok(info)
+}
+
+fn read_trace_header(fobj: &File) -> Result<()>
+{
+    let log_hdr = LogHeader::read_header(fobj)?;
+    if log_hdr.event_id != HEADER_EVENT_ID {
+        return Err(Error::InvalidHeaderId(log_hdr.event_id, HEADER_EVENT_ID));
+    }
+    if log_hdr.magic != HEADER_MAGIC {
+        return Err(Error::InvalidHeaderMagic(log_hdr.magic, HEADER_MAGIC));
+    }
+    if ![0, 2, 3, 4].contains(&(log_hdr.version as i64)) {
+        return Err(Error::UnknownVersion(log_hdr.version));
+    }
+    if log_hdr.version != 4 {
+        return Err(Error::UnsupportedVersion(log_hdr.version));
+    }
+    Ok(())
+}
+
+fn read_trace_records(
+    events: &Vec<Event>,
+    fobj: &File,
+    analyzer: &mut Formatter,
+    read_header: bool,
+) -> Result<Vec<String>>
+{
+    /* backtrace::Backtrace needs this env variable. */
+    env::set_var("RUST_BACKTRACE", "1");
+    let bt = Backtrace::new();
+    let raw_frame = bt.frames().first().unwrap();
+    let frameinfo = raw_frame.symbols().first().unwrap();
+
+    let dropped_event = Event::build(
+        "Dropped_Event(uint64_t num_events_dropped)",
+        frameinfo.lineno().unwrap() + 1,
+        frameinfo.filename().unwrap().to_str().unwrap(),
+    )
+    .unwrap();
+
+    let mut event_mapping = HashMap::new();
+    for e in events {
+        event_mapping.insert(e.name.clone(), e.clone());
+    }
+
+    let drop_str = "dropped".to_string();
+    event_mapping.insert(drop_str.clone(), dropped_event.clone());
+    let mut event_id_to_name: HashMap<u64, String> = HashMap::new();
+    event_id_to_name.insert(DROPPED_EVENT_ID, drop_str.clone());
+
+    if !read_header {
+        for (event_id, event) in events.iter().enumerate() {
+            event_id_to_name.insert(event_id as u64, event.name.clone());
+        }
+    }
+
+    let mut fmt_strs = Vec::new();
+    loop {
+        let rtype = RecordType::read_type(fobj)?;
+        match rtype {
+            RecordType::Empty => {
+                break;
+            }
+            RecordType::Mapping => {
+                let (event_id, event_name) =
+                    get_mapping(fobj).expect("Error reading mapping");
+                event_id_to_name.insert(event_id, event_name);
+                continue;
+            }
+            RecordType::Event => {
+                let rec = read_record(fobj).expect("Error reading record");
+                let event_name =
+                    event_id_to_name.get(&rec.event_id).unwrap().to_string();
+                let event = event_mapping
+                    .get(&event_name)
+                    .ok_or(Error::UnknownEvent(event_name))?;
+
+                let mut entry = EventEntry::new(
+                    event,
+                    rec.event_id,
+                    rec.timestamp_ns,
+                    rec.record_pid,
+                );
+                let mut offset = 0;
+
+                for arg in &event.args.props {
+                    let payload = if arg.is_string() {
+                        EventArgPayload::get_payload_str(
+                            &mut offset,
+                            &rec.args_payload,
+                        )?
+                    } else {
+                        EventArgPayload::get_payload_val(
+                            &mut offset,
+                            &rec.args_payload,
+                        )?
+                    };
+                    entry.rec_args.push(payload);
+                }
+
+                let fmt = analyzer.process_event(
+                    &entry.rec_args,
+                    &entry.event,
+                    entry.event_id,
+                    entry.timestamp_ns,
+                    entry.record_pid,
+                )?;
+                fmt_strs.push(fmt);
+            }
+        }
+    }
+    Ok(fmt_strs)
+}
 
 trait Analyzer
 {
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC 5/6] scripts/simpletrace-rust: Format simple trace output
  2024-05-27  8:14 [RFC 0/6] scripts: Rewrite simpletrace printer in Rust Zhao Liu
                   ` (3 preceding siblings ...)
  2024-05-27  8:14 ` [RFC 4/6] scripts/simpletrace-rust: Parse and check trace recode file Zhao Liu
@ 2024-05-27  8:14 ` Zhao Liu
  2024-05-27  8:14 ` [RFC 6/6] docs/tracing: Add simpletrace-rust section Zhao Liu
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 30+ messages in thread
From: Zhao Liu @ 2024-05-27  8:14 UTC (permalink / raw)
  To: Stefan Hajnoczi, Mads Ynddal, Paolo Bonzini,
	Philippe Mathieu-Daudé, Peter Maydell, Alex Bennée,
	Daniel P . Berrangé, Thomas Huth, Markus Armbruster
  Cc: qemu-devel, Zhao Liu

Format simple trace output, as in the Python version. Further, complete
the trace file input and trace log output.

Additionally, remove `#![allow(dead_code)]` and
`#![allow(unused_variables)]` to allow rustc to do related checks.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 scripts/simpletrace-rust/src/main.rs | 80 ++++++++++++++++++++++++++--
 1 file changed, 75 insertions(+), 5 deletions(-)

diff --git a/scripts/simpletrace-rust/src/main.rs b/scripts/simpletrace-rust/src/main.rs
index f9a71d8dc243..ba4b96efb066 100644
--- a/scripts/simpletrace-rust/src/main.rs
+++ b/scripts/simpletrace-rust/src/main.rs
@@ -8,23 +8,25 @@
  * SPDX-License-Identifier: GPL-2.0-or-later
  */
 
-#![allow(dead_code)]
-#![allow(unused_variables)]
-
 mod trace;
 
 use std::collections::HashMap;
 use std::env;
+use std::fmt;
 use std::fs::File;
+use std::io::stdout;
 use std::io::Error as IOError;
 use std::io::ErrorKind;
 use std::io::Read;
+use std::io::Write;
 use std::mem::size_of;
 
 use backtrace::Backtrace;
 use clap::Arg;
 use clap::Command;
 use thiserror::Error;
+use trace::read_events;
+use trace::Error as TraceError;
 use trace::Event;
 
 const DROPPED_EVENT_ID: u64 = 0xfffffffffffffffe;
@@ -45,8 +47,12 @@ pub enum Error
     InvalidHeaderId(u64, u64),
     #[error("Not a valid trace file, header magic {0} != {1}")]
     InvalidHeaderMagic(u64, u64),
+    #[error("Failed to open file: {0}")]
+    OpenFile(IOError),
     #[error("Failed to read file: {0}")]
     ReadFile(IOError),
+    #[error("Failed to read trace: {0}")]
+    ReadTrace(TraceError),
     #[error(
         "event {0} is logged but is not declared in the trace events \
         file, try using trace-events-all instead."
@@ -58,6 +64,8 @@ pub enum Error
     UnknownVersion(u64),
     #[error("Log format {0} not supported with this QEMU release!")]
     UnsupportedVersion(u64),
+    #[error("Failed to write trace: {0}")]
+    WriteTrace(IOError),
 }
 
 pub type Result<T> = std::result::Result<T, Error>;
@@ -254,6 +262,30 @@ impl EventArgPayload
     }
 }
 
+impl fmt::Display for EventArgPayload
+{
+    fn fmt(&self, fmt: &mut fmt::Formatter) -> fmt::Result
+    {
+        if let Some(s) = &self.raw_str {
+            write!(fmt, "{}", s)
+        } else {
+            write!(fmt, "")
+        }
+    }
+}
+
+impl fmt::LowerHex for EventArgPayload
+{
+    fn fmt(&self, fmt: &mut fmt::Formatter) -> fmt::Result
+    {
+        if let Some(v) = self.raw_val {
+            write!(fmt, "{:x}", v)
+        } else {
+            write!(fmt, "")
+        }
+    }
+}
+
 #[derive(Clone)]
 struct EventEntry
 {
@@ -485,10 +517,35 @@ impl Analyzer for Formatter
         event: &Event,
         timestamp_ns: u64,
         pid: u32,
-        event_id: u64,
+        _event_id: u64,
     ) -> Result<String>
     {
-        let fmt_str = String::new();
+        if self.last_timestamp_ns.is_none() {
+            self.last_timestamp_ns = Some(timestamp_ns);
+        }
+
+        let fields: Vec<String> = rec_args
+            .iter()
+            .zip(event.args.props.iter())
+            .map(|(arg, prop)| {
+                if prop.is_string() {
+                    format!("{}={}", prop.name, arg)
+                } else {
+                    format!("{}=0x{:x}", prop.name, arg)
+                }
+            })
+            .collect();
+
+        let delta_ns =
+            timestamp_ns as f64 - self.last_timestamp_ns.unwrap() as f64;
+        self.last_timestamp_ns = Some(timestamp_ns);
+        let fmt_str = format!(
+            "{} {:.3} pid={} {}\n",
+            event.name,
+            delta_ns / 1000.0,
+            pid,
+            fields.join(" "),
+        );
 
         Ok(fmt_str)
     }
@@ -501,7 +558,20 @@ fn process(
     read_header: bool,
 ) -> Result<()>
 {
+    let events = read_events(event_path).map_err(Error::ReadTrace)?;
+    let trace_fobj = File::open(trace_path).map_err(Error::OpenFile)?;
+    if read_header {
+        read_trace_header(&trace_fobj)?;
+    }
+
     analyzer.begin();
+
+    let rec_strs =
+        read_trace_records(&events, &trace_fobj, analyzer, read_header)?;
+    let mut lock = stdout().lock();
+    lock.write_all(rec_strs.join("").as_ref())
+        .map_err(Error::WriteTrace)?;
+
     analyzer.end();
 
     Ok(())
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [RFC 6/6] docs/tracing: Add simpletrace-rust section
  2024-05-27  8:14 [RFC 0/6] scripts: Rewrite simpletrace printer in Rust Zhao Liu
                   ` (4 preceding siblings ...)
  2024-05-27  8:14 ` [RFC 5/6] scripts/simpletrace-rust: Format simple trace output Zhao Liu
@ 2024-05-27  8:14 ` Zhao Liu
  2024-05-27 10:29 ` [RFC 0/6] scripts: Rewrite simpletrace printer in Rust Philippe Mathieu-Daudé
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 30+ messages in thread
From: Zhao Liu @ 2024-05-27  8:14 UTC (permalink / raw)
  To: Stefan Hajnoczi, Mads Ynddal, Paolo Bonzini,
	Philippe Mathieu-Daudé, Peter Maydell, Alex Bennée,
	Daniel P . Berrangé, Thomas Huth, Markus Armbruster
  Cc: qemu-devel, Zhao Liu

Describe how to compile and use this Rust version program.

And also define the Rust code contribution requirements.

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
 docs/devel/tracing.rst | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/docs/devel/tracing.rst b/docs/devel/tracing.rst
index 043bed7fd0fc..9f31fa38f9bc 100644
--- a/docs/devel/tracing.rst
+++ b/docs/devel/tracing.rst
@@ -273,6 +273,41 @@ You must ensure that the same "trace-events-all" file was used to build QEMU,
 otherwise trace event declarations may have changed and output will not be
 consistent.
 
+Simpletrace-rust
+----------------
+
+Simpletrace-rust (scripts/Simpletrace-rust) is a Rust implementation of
+simpletrace.py, with the same command line arguments as the Python script.
+
+Simpletrace-rust has faster trace parsing compared to the Python version.
+
+The script is compiled by::
+
+    cargo build --manifest-path ./scripts/simpletrace-rust/Cargo.toml --release
+
+Or under scripts/simpletrace-rust, just use::
+
+    cargo build --release
+
+The script also takes the "trace-events-all" file and the binary trace::
+
+    ./scripts/simpletrace-rust/target/release/simpletrace-rust trace-events-all \
+                                                               trace-12345
+
+Any contribution to Simpletrace-rust needs to do the following checks:
+
+Compilation check::
+
+    cargo build
+
+Clippy check::
+
+    cargo clippy
+
+Code style check::
+
+    cargo +nightly fmt --check
+
 Ftrace
 ------
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
  2024-05-27  8:14 [RFC 0/6] scripts: Rewrite simpletrace printer in Rust Zhao Liu
                   ` (5 preceding siblings ...)
  2024-05-27  8:14 ` [RFC 6/6] docs/tracing: Add simpletrace-rust section Zhao Liu
@ 2024-05-27 10:29 ` Philippe Mathieu-Daudé
  2024-05-27 10:49 ` Mads Ynddal
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 30+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-05-27 10:29 UTC (permalink / raw)
  To: Zhao Liu, Manos Pitsidianakis, Gustavo Romero, Pierrick Bouvier
  Cc: qemu-devel, Stefan Hajnoczi, Mads Ynddal, Paolo Bonzini,
	Peter Maydell, Alex Bennée, Daniel P . Berrangé,
	Thomas Huth, Markus Armbruster

Cc'ing a few more Rust integration reviewers :)

On 27/5/24 10:14, Zhao Liu wrote:
> Hi maintainers and list,
> 
> This RFC series attempts to re-implement simpletrace.py with Rust, which
> is the 1st task of Paolo's GSoC 2024 proposal.
> 
> There are two motivations for this work:
> 1. This is an open chance to discuss how to integrate Rust into QEMU.
> 2. Rust delivers faster parsing.
> 
> 
> Introduction
> ============
> 
> Code framework
> --------------
> 
> I choose "cargo" to organize the code, because the current
> implementation depends on external crates (Rust's library), such as
> "backtrace" for getting frameinfo, "clap" for parsing the cli, "rex" for
> regular matching, and so on. (Meson's support for external crates is
> still incomplete. [2])
> 
> The simpletrace-rust created in this series is not yet integrated into
> the QEMU compilation chain, so it can only be compiled independently, e.g.
> under ./scripts/simpletrace/, compile it be:
> 
>      cargo build --release
> 
> The code tree for the entire simpletrace-rust is as follows:
> 
> $ script/simpletrace-rust .
> .
> ├── Cargo.toml
> └── src
>      └── main.rs   // The simpletrace logic (similar to simpletrace.py).
>      └── trace.rs  // The Argument and Event abstraction (refer to
>                    // tracetool/__init__.py).
> 
> My question about meson v.s. cargo, I put it at the end of the cover
> letter (the section "Opens on Rust Support").
> 
> The following two sections are lessons I've learned from this Rust
> practice.
> 
> 
> Performance
> -----------
> 
> I did the performance comparison using the rust-simpletrace prototype with
> the python one:
> 
> * On the i7-10700 CPU @ 2.90GHz machine, parsing and outputting a 35M
> trace binary file for 10 times on each item:
> 
>                        AVE (ms)       Rust v.s. Python
> Rust   (stdout)       12687.16            114.46%
> Python (stdout)       14521.85
> 
> Rust   (file)          1422.44            264.99%
> Python (file)          3769.37
> 
> - The "stdout" lines represent output to the screen.
> - The "file" lines represent output to a file (via "> file").
> 
> This Rust version contains some optimizations (including print, regular
> matching, etc.), but there should be plenty of room for optimization.
> 
> The current performance bottleneck is the reading binary trace file,
> since I am parsing headers and event payloads one after the other, so
> that the IO read overhead accounts for 33%, which can be further
> optimized in the future.
> 
> 
> Security
> --------
> 
> This is an example.
> 
> Rust is very strict about type-checking, and it found timestamp reversal
> issue in simpletrace-rust [3] (sorry, haven't gotten around to digging
> deeper with more time)...in this RFC, I workingaround it by allowing
> negative values. And the python version, just silently covered this
> issue up.
> 
> 
> Opens on Rust Support
> =====================
> 
> Meson v.s. Cargo
> ----------------
> 
> The first question is whether all Rust code (including under scripts)
> must be integrated into meson?
> 
> If so, because of [2] then I have to discard the external crates and
> build some more Rust wheels of my own to replace the previous external
> crates.
> 
> For the main part of the QEMU code, I think the answer must be Yes, but
> for the tools in the scripts directory, would it be possible to allow
> the use of cargo to build small tools/program for flexibility and
> migrate to meson later (as meson's support for rust becomes more
> mature)?
> 
> 
> External crates
> ---------------
> 
> This is an additional question that naturally follows from the above
> question, do we have requirements for Rust's external crate? Is only std
> allowed?
> 
> Welcome your feedback!
> 
> 
> [1]: https://wiki.qemu.org/Google_Summer_of_Code_2024
> [2]: https://github.com/mesonbuild/meson/issues/2173
> [3]: https://lore.kernel.org/qemu-devel/20240509134712.GA515599@fedora.redhat.com/
> 
> Thanks and Best Regards,
> Zhao
> 
> ---
> Zhao Liu (6):
>    scripts/simpletrace-rust: Add the basic cargo framework
>    scripts/simpletrace-rust: Support Event & Arguments in trace module
>    scripts/simpletrace-rust: Add helpers to parse trace file
>    scripts/simpletrace-rust: Parse and check trace recode file
>    scripts/simpletrace-rust: Format simple trace output
>    docs/tracing: Add simpletrace-rust section
> 
>   docs/devel/tracing.rst                 |  35 ++
>   scripts/simpletrace-rust/.gitignore    |   1 +
>   scripts/simpletrace-rust/.rustfmt.toml |   9 +
>   scripts/simpletrace-rust/Cargo.lock    | 370 +++++++++++++++
>   scripts/simpletrace-rust/Cargo.toml    |  17 +
>   scripts/simpletrace-rust/src/main.rs   | 633 +++++++++++++++++++++++++
>   scripts/simpletrace-rust/src/trace.rs  | 339 +++++++++++++
>   7 files changed, 1404 insertions(+)
>   create mode 100644 scripts/simpletrace-rust/.gitignore
>   create mode 100644 scripts/simpletrace-rust/.rustfmt.toml
>   create mode 100644 scripts/simpletrace-rust/Cargo.lock
>   create mode 100644 scripts/simpletrace-rust/Cargo.toml
>   create mode 100644 scripts/simpletrace-rust/src/main.rs
>   create mode 100644 scripts/simpletrace-rust/src/trace.rs
> 



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
  2024-05-27  8:14 [RFC 0/6] scripts: Rewrite simpletrace printer in Rust Zhao Liu
                   ` (6 preceding siblings ...)
  2024-05-27 10:29 ` [RFC 0/6] scripts: Rewrite simpletrace printer in Rust Philippe Mathieu-Daudé
@ 2024-05-27 10:49 ` Mads Ynddal
  2024-05-28  6:15   ` Zhao Liu
  2024-05-27 19:59 ` Stefan Hajnoczi
  2024-05-31 12:27 ` Daniel P. Berrangé
  9 siblings, 1 reply; 30+ messages in thread
From: Mads Ynddal @ 2024-05-27 10:49 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Stefan Hajnoczi, Paolo Bonzini, Philippe Mathieu-Daudé,
	Peter Maydell, Alex Bennée,
	"Daniel P . Berrangé", Thomas Huth,
	Markus Armbruster, qemu-devel

Hi,

Interesting work. I don't have any particular comments for the code, but I
wanted to address a few of the points here.

> 2. Rust delivers faster parsing.

For me, the point of simpletrace.py is not to be the fastest at parsing, but
rather to open the door for using Python libraries like numpy, matplotlib, etc.
for analysis.

There might be room for improvement in the Python version, especially in
minimizing memory usage, when parsing large traces.


> Security
> --------
> 
> This is an example.
> 
> Rust is very strict about type-checking, and it found timestamp reversal
> issue in simpletrace-rust [3] (sorry, haven't gotten around to digging
> deeper with more time)...in this RFC, I workingaround it by allowing
> negative values. And the python version, just silently covered this
> issue up.

I'm not particularly worried about the security of the Python version. We're not
doing anything obviously exploitable.

—
Mads Ynddal



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
  2024-05-27  8:14 [RFC 0/6] scripts: Rewrite simpletrace printer in Rust Zhao Liu
                   ` (7 preceding siblings ...)
  2024-05-27 10:49 ` Mads Ynddal
@ 2024-05-27 19:59 ` Stefan Hajnoczi
  2024-05-28  6:48   ` Zhao Liu
  2024-05-31 12:27 ` Daniel P. Berrangé
  9 siblings, 1 reply; 30+ messages in thread
From: Stefan Hajnoczi @ 2024-05-27 19:59 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Mads Ynddal, Paolo Bonzini, Philippe Mathieu-Daudé,
	Peter Maydell, Alex Bennée, Daniel P . Berrangé,
	Thomas Huth, Markus Armbruster, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 7545 bytes --]

On Mon, May 27, 2024 at 04:14:15PM +0800, Zhao Liu wrote:
> Hi maintainers and list,
> 
> This RFC series attempts to re-implement simpletrace.py with Rust, which
> is the 1st task of Paolo's GSoC 2024 proposal.
> 
> There are two motivations for this work:
> 1. This is an open chance to discuss how to integrate Rust into QEMU.
> 2. Rust delivers faster parsing.
> 
> 
> Introduction
> ============
> 
> Code framework
> --------------
> 
> I choose "cargo" to organize the code, because the current
> implementation depends on external crates (Rust's library), such as
> "backtrace" for getting frameinfo, "clap" for parsing the cli, "rex" for
> regular matching, and so on. (Meson's support for external crates is
> still incomplete. [2])
> 
> The simpletrace-rust created in this series is not yet integrated into
> the QEMU compilation chain, so it can only be compiled independently, e.g.
> under ./scripts/simpletrace/, compile it be:
> 
>     cargo build --release

Please make sure it's built by .gitlab-ci.d/ so that the continuous
integration system prevents bitrot. You can add a job that runs the
cargo build.

> 
> The code tree for the entire simpletrace-rust is as follows:
> 
> $ script/simpletrace-rust .
> .
> ├── Cargo.toml
> └── src
>     └── main.rs   // The simpletrace logic (similar to simpletrace.py).
>     └── trace.rs  // The Argument and Event abstraction (refer to
>                   // tracetool/__init__.py).
> 
> My question about meson v.s. cargo, I put it at the end of the cover
> letter (the section "Opens on Rust Support").
> 
> The following two sections are lessons I've learned from this Rust
> practice.
> 
> 
> Performance
> -----------
> 
> I did the performance comparison using the rust-simpletrace prototype with
> the python one:
> 
> * On the i7-10700 CPU @ 2.90GHz machine, parsing and outputting a 35M
> trace binary file for 10 times on each item:
> 
>                       AVE (ms)       Rust v.s. Python
> Rust   (stdout)       12687.16            114.46%
> Python (stdout)       14521.85
> 
> Rust   (file)          1422.44            264.99%
> Python (file)          3769.37
> 
> - The "stdout" lines represent output to the screen.
> - The "file" lines represent output to a file (via "> file").
> 
> This Rust version contains some optimizations (including print, regular
> matching, etc.), but there should be plenty of room for optimization.
> 
> The current performance bottleneck is the reading binary trace file,
> since I am parsing headers and event payloads one after the other, so
> that the IO read overhead accounts for 33%, which can be further
> optimized in the future.

Performance will become more important when large amounts of TCG data is
captured, as described in the project idea:
https://wiki.qemu.org/Internships/ProjectIdeas/TCGBinaryTracing

While I can't think of a time in the past where simpletrace.py's
performance bothered me, improving performance is still welcome. Just
don't spend too much time on performance (and making the code more
complex) unless there is a real need.

> Security
> --------
> 
> This is an example.
> 
> Rust is very strict about type-checking, and it found timestamp reversal
> issue in simpletrace-rust [3] (sorry, haven't gotten around to digging
> deeper with more time)...in this RFC, I workingaround it by allowing
> negative values. And the python version, just silently covered this
> issue up.
>
> Opens on Rust Support
> =====================
> 
> Meson v.s. Cargo
> ----------------
> 
> The first question is whether all Rust code (including under scripts)
> must be integrated into meson?
> 
> If so, because of [2] then I have to discard the external crates and
> build some more Rust wheels of my own to replace the previous external
> crates.
> 
> For the main part of the QEMU code, I think the answer must be Yes, but
> for the tools in the scripts directory, would it be possible to allow
> the use of cargo to build small tools/program for flexibility and
> migrate to meson later (as meson's support for rust becomes more
> mature)?

I have not seen a satisfying way to natively build Rust code using
meson. I remember reading about a tool that converts Cargo.toml files to
meson wrap files or something similar. That still doesn't feel great
because upstream works with Cargo and duplicating build information in
meson is a drag.

Calling cargo from meson is not ideal either, but it works and avoids
duplicating build information. This is the approach I would use for now
unless someone can point to an example of native Rust support in meson
that is clean.

Here is how libblkio calls cargo from meson:
https://gitlab.com/libblkio/libblkio/-/blob/main/src/meson.build
https://gitlab.com/libblkio/libblkio/-/blob/main/src/cargo-build.sh

> 
> 
> External crates
> ---------------
> 
> This is an additional question that naturally follows from the above
> question, do we have requirements for Rust's external crate? Is only std
> allowed?

There is no policy. My suggestion:

If you need a third-party crate then it's okay to use it, but try to
minimize dependencies. Avoid adding dependening for niceties that are
not strictly needed. Third-party crates are a burden for package
maintainers and anyone building from source. They increase the risk that
the code will fail to build. They can also be a security risk.

> 
> Welcome your feedback!

It would be easier to give feedback if you implement some examples of
TCG binary tracing before rewriting simpletrace.py. It's unclear to me
why simpletrace needs to be rewritten at this point. If you are
extending the simpletrace file format in ways that are not suitable for
Python or can demonstrate that Python performance is a problem, then
focussing on a Rust simpletrace implementation is more convincing.

Could you use simpletrace.py to develop TCG binary tracing first?

Stefan

> 
> 
> [1]: https://wiki.qemu.org/Google_Summer_of_Code_2024
> [2]: https://github.com/mesonbuild/meson/issues/2173
> [3]: https://lore.kernel.org/qemu-devel/20240509134712.GA515599@fedora.redhat.com/
> 
> Thanks and Best Regards,
> Zhao
> 
> ---
> Zhao Liu (6):
>   scripts/simpletrace-rust: Add the basic cargo framework
>   scripts/simpletrace-rust: Support Event & Arguments in trace module
>   scripts/simpletrace-rust: Add helpers to parse trace file
>   scripts/simpletrace-rust: Parse and check trace recode file
>   scripts/simpletrace-rust: Format simple trace output
>   docs/tracing: Add simpletrace-rust section
> 
>  docs/devel/tracing.rst                 |  35 ++
>  scripts/simpletrace-rust/.gitignore    |   1 +
>  scripts/simpletrace-rust/.rustfmt.toml |   9 +
>  scripts/simpletrace-rust/Cargo.lock    | 370 +++++++++++++++
>  scripts/simpletrace-rust/Cargo.toml    |  17 +
>  scripts/simpletrace-rust/src/main.rs   | 633 +++++++++++++++++++++++++
>  scripts/simpletrace-rust/src/trace.rs  | 339 +++++++++++++
>  7 files changed, 1404 insertions(+)
>  create mode 100644 scripts/simpletrace-rust/.gitignore
>  create mode 100644 scripts/simpletrace-rust/.rustfmt.toml
>  create mode 100644 scripts/simpletrace-rust/Cargo.lock
>  create mode 100644 scripts/simpletrace-rust/Cargo.toml
>  create mode 100644 scripts/simpletrace-rust/src/main.rs
>  create mode 100644 scripts/simpletrace-rust/src/trace.rs
> 
> -- 
> 2.34.1
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 1/6] scripts/simpletrace-rust: Add the basic cargo framework
  2024-05-27  8:14 ` [RFC 1/6] scripts/simpletrace-rust: Add the basic cargo framework Zhao Liu
@ 2024-05-27 20:05   ` Stefan Hajnoczi
  2024-05-28  7:53     ` Zhao Liu
  0 siblings, 1 reply; 30+ messages in thread
From: Stefan Hajnoczi @ 2024-05-27 20:05 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Mads Ynddal, Paolo Bonzini, Philippe Mathieu-Daudé,
	Peter Maydell, Alex Bennée, Daniel P . Berrangé,
	Thomas Huth, Markus Armbruster, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 16269 bytes --]

On Mon, May 27, 2024 at 04:14:16PM +0800, Zhao Liu wrote:
> Define the basic cargo framework to support compiling simpletrace-rust
> via cargo, and add the Rust code style (with some nightly features)
> check items to make Rust style as close to the QEMU C code as possible.
> 
> With the base cargo package, define the basic code framework for
> simpletrace-rust, approximating the Python version, and also abstract
> Analyzer operations for simpletrace-rust. Event and other future
> trace-related structures are placed in the trace module.
> 
> Additionally, support basic command line parsing for simpletrace-rust as
> a start.
> 
> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
>  scripts/simpletrace-rust/.gitignore    |   1 +
>  scripts/simpletrace-rust/.rustfmt.toml |   9 +
>  scripts/simpletrace-rust/Cargo.lock    | 239 +++++++++++++++++++++++++
>  scripts/simpletrace-rust/Cargo.toml    |  11 ++
>  scripts/simpletrace-rust/src/main.rs   | 173 ++++++++++++++++++
>  scripts/simpletrace-rust/src/trace.rs  |  11 ++
>  6 files changed, 444 insertions(+)
>  create mode 100644 scripts/simpletrace-rust/.gitignore
>  create mode 100644 scripts/simpletrace-rust/.rustfmt.toml
>  create mode 100644 scripts/simpletrace-rust/Cargo.lock
>  create mode 100644 scripts/simpletrace-rust/Cargo.toml
>  create mode 100644 scripts/simpletrace-rust/src/main.rs
>  create mode 100644 scripts/simpletrace-rust/src/trace.rs
> 
> diff --git a/scripts/simpletrace-rust/.gitignore b/scripts/simpletrace-rust/.gitignore
> new file mode 100644
> index 000000000000..2f7896d1d136
> --- /dev/null
> +++ b/scripts/simpletrace-rust/.gitignore
> @@ -0,0 +1 @@
> +target/
> diff --git a/scripts/simpletrace-rust/.rustfmt.toml b/scripts/simpletrace-rust/.rustfmt.toml
> new file mode 100644
> index 000000000000..97a97c24ebfb
> --- /dev/null
> +++ b/scripts/simpletrace-rust/.rustfmt.toml
> @@ -0,0 +1,9 @@
> +brace_style = "AlwaysNextLine"
> +comment_width = 80
> +edition = "2021"
> +group_imports = "StdExternalCrate"
> +imports_granularity = "item"
> +max_width = 80
> +use_field_init_shorthand = true
> +use_try_shorthand = true
> +wrap_comments = true

There should be QEMU-wide policy. That said, why is it necessary to customize rustfmt?

> diff --git a/scripts/simpletrace-rust/Cargo.lock b/scripts/simpletrace-rust/Cargo.lock
> new file mode 100644
> index 000000000000..4a0ff8092dcb
> --- /dev/null
> +++ b/scripts/simpletrace-rust/Cargo.lock
> @@ -0,0 +1,239 @@
> +# This file is automatically @generated by Cargo.
> +# It is not intended for manual editing.
> +version = 3
> +
> +[[package]]
> +name = "anstream"
> +version = "0.6.14"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "418c75fa768af9c03be99d17643f93f79bbba589895012a80e3452a19ddda15b"
> +dependencies = [
> + "anstyle",
> + "anstyle-parse",
> + "anstyle-query",
> + "anstyle-wincon",
> + "colorchoice",
> + "is_terminal_polyfill",
> + "utf8parse",
> +]
> +
> +[[package]]
> +name = "anstyle"
> +version = "1.0.7"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "038dfcf04a5feb68e9c60b21c9625a54c2c0616e79b72b0fd87075a056ae1d1b"
> +
> +[[package]]
> +name = "anstyle-parse"
> +version = "0.2.4"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "c03a11a9034d92058ceb6ee011ce58af4a9bf61491aa7e1e59ecd24bd40d22d4"
> +dependencies = [
> + "utf8parse",
> +]
> +
> +[[package]]
> +name = "anstyle-query"
> +version = "1.0.3"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "a64c907d4e79225ac72e2a354c9ce84d50ebb4586dee56c82b3ee73004f537f5"
> +dependencies = [
> + "windows-sys",
> +]
> +
> +[[package]]
> +name = "anstyle-wincon"
> +version = "3.0.3"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "61a38449feb7068f52bb06c12759005cf459ee52bb4adc1d5a7c4322d716fb19"
> +dependencies = [
> + "anstyle",
> + "windows-sys",
> +]
> +
> +[[package]]
> +name = "clap"
> +version = "4.5.4"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "90bc066a67923782aa8515dbaea16946c5bcc5addbd668bb80af688e53e548a0"
> +dependencies = [
> + "clap_builder",
> +]
> +
> +[[package]]
> +name = "clap_builder"
> +version = "4.5.2"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "ae129e2e766ae0ec03484e609954119f123cc1fe650337e155d03b022f24f7b4"
> +dependencies = [
> + "anstream",
> + "anstyle",
> + "clap_lex",
> + "strsim",
> +]
> +
> +[[package]]
> +name = "clap_lex"
> +version = "0.7.0"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "98cc8fbded0c607b7ba9dd60cd98df59af97e84d24e49c8557331cfc26d301ce"
> +
> +[[package]]
> +name = "colorchoice"
> +version = "1.0.1"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "0b6a852b24ab71dffc585bcb46eaf7959d175cb865a7152e35b348d1b2960422"
> +
> +[[package]]
> +name = "is_terminal_polyfill"
> +version = "1.70.0"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "f8478577c03552c21db0e2724ffb8986a5ce7af88107e6be5d2ee6e158c12800"
> +
> +[[package]]
> +name = "proc-macro2"
> +version = "1.0.83"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "0b33eb56c327dec362a9e55b3ad14f9d2f0904fb5a5b03b513ab5465399e9f43"
> +dependencies = [
> + "unicode-ident",
> +]
> +
> +[[package]]
> +name = "quote"
> +version = "1.0.36"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "0fa76aaf39101c457836aec0ce2316dbdc3ab723cdda1c6bd4e6ad4208acaca7"
> +dependencies = [
> + "proc-macro2",
> +]
> +
> +[[package]]
> +name = "simpletrace-rust"
> +version = "0.1.0"
> +dependencies = [
> + "clap",
> + "thiserror",
> +]
> +
> +[[package]]
> +name = "strsim"
> +version = "0.11.1"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "7da8b5736845d9f2fcb837ea5d9e2628564b3b043a70948a3f0b778838c5fb4f"
> +
> +[[package]]
> +name = "syn"
> +version = "2.0.66"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "c42f3f41a2de00b01c0aaad383c5a45241efc8b2d1eda5661812fda5f3cdcff5"
> +dependencies = [
> + "proc-macro2",
> + "quote",
> + "unicode-ident",
> +]
> +
> +[[package]]
> +name = "thiserror"
> +version = "1.0.61"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "c546c80d6be4bc6a00c0f01730c08df82eaa7a7a61f11d656526506112cc1709"
> +dependencies = [
> + "thiserror-impl",
> +]
> +
> +[[package]]
> +name = "thiserror-impl"
> +version = "1.0.61"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "46c3384250002a6d5af4d114f2845d37b57521033f30d5c3f46c4d70e1197533"
> +dependencies = [
> + "proc-macro2",
> + "quote",
> + "syn",
> +]
> +
> +[[package]]
> +name = "unicode-ident"
> +version = "1.0.12"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "3354b9ac3fae1ff6755cb6db53683adb661634f67557942dea4facebec0fee4b"
> +
> +[[package]]
> +name = "utf8parse"
> +version = "0.2.1"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "711b9620af191e0cdc7468a8d14e709c3dcdb115b36f838e601583af800a370a"
> +
> +[[package]]
> +name = "windows-sys"
> +version = "0.52.0"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "282be5f36a8ce781fad8c8ae18fa3f9beff57ec1b52cb3de0789201425d9a33d"
> +dependencies = [
> + "windows-targets",
> +]
> +
> +[[package]]
> +name = "windows-targets"
> +version = "0.52.5"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "6f0713a46559409d202e70e28227288446bf7841d3211583a4b53e3f6d96e7eb"
> +dependencies = [
> + "windows_aarch64_gnullvm",
> + "windows_aarch64_msvc",
> + "windows_i686_gnu",
> + "windows_i686_gnullvm",
> + "windows_i686_msvc",
> + "windows_x86_64_gnu",
> + "windows_x86_64_gnullvm",
> + "windows_x86_64_msvc",
> +]
> +
> +[[package]]
> +name = "windows_aarch64_gnullvm"
> +version = "0.52.5"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "7088eed71e8b8dda258ecc8bac5fb1153c5cffaf2578fc8ff5d61e23578d3263"
> +
> +[[package]]
> +name = "windows_aarch64_msvc"
> +version = "0.52.5"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "9985fd1504e250c615ca5f281c3f7a6da76213ebd5ccc9561496568a2752afb6"
> +
> +[[package]]
> +name = "windows_i686_gnu"
> +version = "0.52.5"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "88ba073cf16d5372720ec942a8ccbf61626074c6d4dd2e745299726ce8b89670"
> +
> +[[package]]
> +name = "windows_i686_gnullvm"
> +version = "0.52.5"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "87f4261229030a858f36b459e748ae97545d6f1ec60e5e0d6a3d32e0dc232ee9"
> +
> +[[package]]
> +name = "windows_i686_msvc"
> +version = "0.52.5"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "db3c2bf3d13d5b658be73463284eaf12830ac9a26a90c717b7f771dfe97487bf"
> +
> +[[package]]
> +name = "windows_x86_64_gnu"
> +version = "0.52.5"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "4e4246f76bdeff09eb48875a0fd3e2af6aada79d409d33011886d3e1581517d9"
> +
> +[[package]]
> +name = "windows_x86_64_gnullvm"
> +version = "0.52.5"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "852298e482cd67c356ddd9570386e2862b5673c85bd5f88df9ab6802b334c596"
> +
> +[[package]]
> +name = "windows_x86_64_msvc"
> +version = "0.52.5"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "bec47e5bfd1bff0eeaf6d8b485cc1074891a197ab4225d504cb7a1ab88b02bf0"
> diff --git a/scripts/simpletrace-rust/Cargo.toml b/scripts/simpletrace-rust/Cargo.toml
> new file mode 100644
> index 000000000000..b44ba1569dad
> --- /dev/null
> +++ b/scripts/simpletrace-rust/Cargo.toml
> @@ -0,0 +1,11 @@
> +[package]
> +name = "simpletrace-rust"
> +description = "Pretty-printer for simple trace backend binary trace files (Rust version)"
> +version = "0.1.0"
> +edition = "2021"
> +authors = ["Zhao Liu <zhao1.liu@intel.com>"]
> +license = "GPL-2.0-or-later"
> +
> +[dependencies]
> +clap = "4.5.4"
> +thiserror = "1.0.20"
> diff --git a/scripts/simpletrace-rust/src/main.rs b/scripts/simpletrace-rust/src/main.rs
> new file mode 100644
> index 000000000000..2d2926b7658d
> --- /dev/null
> +++ b/scripts/simpletrace-rust/src/main.rs
> @@ -0,0 +1,173 @@
> +/*
> + * Pretty-printer for simple trace backend binary trace files (Rust version)
> + *
> + * Copyright (C) 2024 Intel Corporation.
> + *
> + * Authors: Zhao Liu <zhao1.liu@intel.com>
> + *
> + * SPDX-License-Identifier: GPL-2.0-or-later
> + */
> +
> +#![allow(dead_code)]
> +#![allow(unused_variables)]
> +
> +mod trace;
> +
> +use std::env;
> +
> +use clap::Arg;
> +use clap::Command;
> +use thiserror::Error;
> +use trace::Event;
> +
> +#[derive(Error, Debug)]
> +pub enum Error
> +{
> +    #[error("usage: {0} [--no-header] <trace-events> <trace-file>")]
> +    CliOptionUnmatch(String),
> +}
> +
> +pub type Result<T> = std::result::Result<T, Error>;
> +
> +pub struct EventArgPayload {}
> +
> +trait Analyzer
> +{

The Python version treats this as an API so that users can write trace
analysis scripts. I see below that you're using non-doc comments. That
suggests you don't see this as a public API that people can write trace
analysis scripts against?

> +    /* Called at the start of the trace. */
> +    fn begin(&self) {}
> +
> +    /* Called if no specific method for processing a trace event. */
> +    fn catchall(
> +        &mut self,
> +        rec_args: &[EventArgPayload],
> +        event: &Event,
> +        timestamp_ns: u64,
> +        pid: u32,
> +        event_id: u64,
> +    ) -> Result<String>;
> +
> +    /* Called at the end of the trace. */
> +    fn end(&self) {}
> +
> +    /*
> +     * TODO: Support "variable" parameters (i.e. variants of process_event()
> +     * with different parameters, like **kwargs in python), when we need a
> +     * simpletrace rust module.
> +     */
> +    fn process_event(
> +        &mut self,
> +        rec_args: &[EventArgPayload],
> +        event: &Event,
> +        event_id: u64,
> +        timestamp_ns: u64,
> +        pid: u32,
> +    ) -> Result<String>
> +    {
> +        self.catchall(rec_args, event, timestamp_ns, pid, event_id)
> +
> +        /*
> +         * TODO: Support custom function hooks (like getattr() in python),
> +         * when we need a simpletrace rust module.
> +         */
> +    }
> +}
> +
> +struct Formatter
> +{
> +    last_timestamp_ns: Option<u64>,
> +}
> +
> +impl Formatter
> +{
> +    fn new() -> Self
> +    {
> +        Formatter {
> +            last_timestamp_ns: None,
> +        }
> +    }
> +}
> +
> +impl Analyzer for Formatter
> +{
> +    fn catchall(
> +        &mut self,
> +        rec_args: &[EventArgPayload],
> +        event: &Event,
> +        timestamp_ns: u64,
> +        pid: u32,
> +        event_id: u64,
> +    ) -> Result<String>
> +    {
> +        let fmt_str = String::new();
> +
> +        Ok(fmt_str)
> +    }
> +}
> +
> +fn process(
> +    event_path: &str,
> +    trace_path: &str,
> +    analyzer: &mut Formatter,
> +    read_header: bool,
> +) -> Result<()>
> +{
> +    analyzer.begin();
> +    analyzer.end();
> +
> +    Ok(())
> +}
> +
> +/*
> + * Execute an analyzer on a trace file given on the command-line.
> + * This function is useful as a driver for simple analysis scripts.  More
> + * advanced scripts will want to call process() instead.
> + */
> +fn run(analyzer: &mut Formatter) -> Result<()>
> +{
> +    let matches = Command::new("simple trace")
> +        .arg(
> +            Arg::new("no-header")
> +                .required(false)
> +                .long("no-header")
> +                .action(clap::ArgAction::SetTrue)
> +                .help("Disable header parsing"),
> +        )
> +        .arg(
> +            Arg::new("trace-events")
> +                .required(true)
> +                .action(clap::ArgAction::Set)
> +                .help("Path to trace events file"),
> +        )
> +        .arg(
> +            Arg::new("trace-file")
> +                .required(true)
> +                .action(clap::ArgAction::Set)
> +                .help("Path to trace file"),
> +        )
> +        .try_get_matches()
> +        .map_err(|_| {
> +            Error::CliOptionUnmatch(
> +                env::current_exe()
> +                    .unwrap_or_else(|_| "simpletrace".into())
> +                    .to_string_lossy()
> +                    .to_string(),
> +            )
> +        })?;
> +
> +    let no_header = matches.get_flag("no-header");
> +    let event_path = matches.get_one::<String>("trace-events").unwrap();
> +    let trace_path = matches.get_one::<String>("trace-file").unwrap();
> +
> +    process(event_path, trace_path, analyzer, !no_header)?;
> +
> +    Ok(())
> +}
> +
> +fn main()
> +{
> +    let mut fmt = Formatter::new();
> +
> +    if let Err(e) = run(&mut fmt) {
> +        println!("{:?}", e.to_string());
> +    }
> +}
> diff --git a/scripts/simpletrace-rust/src/trace.rs b/scripts/simpletrace-rust/src/trace.rs
> new file mode 100644
> index 000000000000..3a4b06435b8b
> --- /dev/null
> +++ b/scripts/simpletrace-rust/src/trace.rs
> @@ -0,0 +1,11 @@
> +/*
> + * Machinery for generating tracing-related intermediate files (Rust version)
> + *
> + * Copyright (C) 2024 Intel Corporation.
> + *
> + * Authors: Zhao Liu <zhao1.liu@intel.com>
> + *
> + * SPDX-License-Identifier: GPL-2.0-or-later
> + */
> +
> +pub struct Event {}
> -- 
> 2.34.1
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 2/6] scripts/simpletrace-rust: Support Event & Arguments in trace module
  2024-05-27  8:14 ` [RFC 2/6] scripts/simpletrace-rust: Support Event & Arguments in trace module Zhao Liu
@ 2024-05-27 20:33   ` Stefan Hajnoczi
  2024-05-28  8:32     ` Zhao Liu
  0 siblings, 1 reply; 30+ messages in thread
From: Stefan Hajnoczi @ 2024-05-27 20:33 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Mads Ynddal, Paolo Bonzini, Philippe Mathieu-Daudé,
	Peter Maydell, Alex Bennée, Daniel P . Berrangé,
	Thomas Huth, Markus Armbruster, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 14997 bytes --]

On Mon, May 27, 2024 at 04:14:17PM +0800, Zhao Liu wrote:
> Refer to scripts/tracetool/__init__.py, add Event & Arguments
> abstractions in trace module.
> 
> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
>  scripts/simpletrace-rust/Cargo.lock   |  52 ++++
>  scripts/simpletrace-rust/Cargo.toml   |   2 +
>  scripts/simpletrace-rust/src/trace.rs | 330 +++++++++++++++++++++++++-
>  3 files changed, 383 insertions(+), 1 deletion(-)
> 
> diff --git a/scripts/simpletrace-rust/Cargo.lock b/scripts/simpletrace-rust/Cargo.lock
> index 4a0ff8092dcb..3d815014eb44 100644
> --- a/scripts/simpletrace-rust/Cargo.lock
> +++ b/scripts/simpletrace-rust/Cargo.lock
> @@ -2,6 +2,15 @@
>  # It is not intended for manual editing.
>  version = 3
>  
> +[[package]]
> +name = "aho-corasick"
> +version = "1.1.3"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "8e60d3430d3a69478ad0993f19238d2df97c507009a52b3c10addcd7f6bcb916"
> +dependencies = [
> + "memchr",
> +]
> +
>  [[package]]
>  name = "anstream"
>  version = "0.6.14"
> @@ -90,6 +99,18 @@ version = "1.70.0"
>  source = "registry+https://github.com/rust-lang/crates.io-index"
>  checksum = "f8478577c03552c21db0e2724ffb8986a5ce7af88107e6be5d2ee6e158c12800"
>  
> +[[package]]
> +name = "memchr"
> +version = "2.7.2"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "6c8640c5d730cb13ebd907d8d04b52f55ac9a2eec55b440c8892f40d56c76c1d"
> +
> +[[package]]
> +name = "once_cell"
> +version = "1.19.0"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "3fdb12b2476b595f9358c5161aa467c2438859caa136dec86c26fdd2efe17b92"
> +
>  [[package]]
>  name = "proc-macro2"
>  version = "1.0.83"
> @@ -108,11 +129,42 @@ dependencies = [
>   "proc-macro2",
>  ]
>  
> +[[package]]
> +name = "regex"
> +version = "1.10.4"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "c117dbdfde9c8308975b6a18d71f3f385c89461f7b3fb054288ecf2a2058ba4c"
> +dependencies = [
> + "aho-corasick",
> + "memchr",
> + "regex-automata",
> + "regex-syntax",
> +]
> +
> +[[package]]
> +name = "regex-automata"
> +version = "0.4.6"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "86b83b8b9847f9bf95ef68afb0b8e6cdb80f498442f5179a29fad448fcc1eaea"
> +dependencies = [
> + "aho-corasick",
> + "memchr",
> + "regex-syntax",
> +]
> +
> +[[package]]
> +name = "regex-syntax"
> +version = "0.8.3"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "adad44e29e4c806119491a7f06f03de4d1af22c3a680dd47f1e6e179439d1f56"
> +
>  [[package]]
>  name = "simpletrace-rust"
>  version = "0.1.0"
>  dependencies = [
>   "clap",
> + "once_cell",
> + "regex",
>   "thiserror",
>  ]
>  
> diff --git a/scripts/simpletrace-rust/Cargo.toml b/scripts/simpletrace-rust/Cargo.toml
> index b44ba1569dad..24a79d04e566 100644
> --- a/scripts/simpletrace-rust/Cargo.toml
> +++ b/scripts/simpletrace-rust/Cargo.toml
> @@ -8,4 +8,6 @@ license = "GPL-2.0-or-later"
>  
>  [dependencies]
>  clap = "4.5.4"
> +once_cell = "1.19.0"
> +regex = "1.10.4"
>  thiserror = "1.0.20"
> diff --git a/scripts/simpletrace-rust/src/trace.rs b/scripts/simpletrace-rust/src/trace.rs
> index 3a4b06435b8b..f41d6e0b5bc3 100644
> --- a/scripts/simpletrace-rust/src/trace.rs
> +++ b/scripts/simpletrace-rust/src/trace.rs
> @@ -8,4 +8,332 @@
>   * SPDX-License-Identifier: GPL-2.0-or-later
>   */
>  
> -pub struct Event {}
> +#![allow(dead_code)]
> +
> +use std::fs::read_to_string;
> +
> +use once_cell::sync::Lazy;
> +use regex::Regex;
> +use thiserror::Error;
> +
> +#[derive(Error, Debug)]
> +pub enum Error
> +{
> +    #[error("Empty argument (did you forget to use 'void'?)")]
> +    EmptyArg,
> +    #[error("Event '{0}' has more than maximum permitted argument count")]
> +    InvalidArgCnt(String),
> +    #[error("{0} does not end with a new line")]
> +    InvalidEventFile(String),
> +    #[error("Invalid format: {0}")]
> +    InvalidFormat(String),
> +    #[error(
> +        "Argument type '{0}' is not allowed. \
> +        Only standard C types and fixed size integer \
> +        types should be used. struct, union, and \
> +        other complex pointer types should be \
> +        declared as 'void *'"
> +    )]
> +    InvalidType(String),
> +    #[error("Error at {0}:{1}: {2}")]
> +    ReadEventFail(String, usize, String),
> +    #[error("Unknown event: {0}")]
> +    UnknownEvent(String),
> +    #[error("Unknown properties: {0}")]
> +    UnknownProp(String),
> +}
> +
> +pub type Result<T> = std::result::Result<T, Error>;
> +
> +/*
> + * Refer to the description of ALLOWED_TYPES in
> + * scripts/tracetool/__init__.py.

Please don't reference the Python implementation because this will not
age well. It may bitrot if the Python code changes or if the Python
implementation is deprecated then the source file will go away
altogether. Make the Rust implementation self-contained. If there are
common file format concerns shared by implementations, then move that
information to a separate document in docs/interop/ (i.e. a simpletrace
file format specification).

> + */
> +const ALLOWED_TYPES: [&str; 20] = [
> +    "int",
> +    "long",
> +    "short",
> +    "char",
> +    "bool",
> +    "unsigned",
> +    "signed",
> +    "int8_t",
> +    "uint8_t",
> +    "int16_t",
> +    "uint16_t",
> +    "int32_t",
> +    "uint32_t",
> +    "int64_t",
> +    "uint64_t",
> +    "void",
> +    "size_t",
> +    "ssize_t",
> +    "uintptr_t",
> +    "ptrdiff_t",
> +];
> +
> +const STRING_TYPES: [&str; 4] =
> +    ["const char*", "char*", "const char *", "char *"];
> +
> +/* TODO: Support 'vcpu' property. */

The vcpu property was removed in commit d9a6bad542cd ("docs: remove
references to TCG tracing"). Is this comment outdated or are you
planning to bring it back?

> +const VALID_PROPS: [&str; 1] = ["disable"];
> +
> +fn validate_c_type(arg_type: &str) -> Result<()>
> +{
> +    static RE_TYPE: Lazy<Regex> = Lazy::new(|| Regex::new(r"\*").unwrap());
> +    let bits: Vec<String> =
> +        arg_type.split_whitespace().map(|s| s.to_string()).collect();
> +    for bit in bits {
> +        let res = RE_TYPE.replace_all(&bit, "");
> +        if res.is_empty() {
> +            continue;
> +        }
> +        if res == "const" {
> +            continue;
> +        }
> +        if !ALLOWED_TYPES.contains(&res.as_ref()) {
> +            return Err(Error::InvalidType(res.to_string()));
> +        }
> +    }
> +    Ok(())
> +}
> +
> +pub fn read_events(fname: &str) -> Result<Vec<Event>>
> +{
> +    let fstr = read_to_string(fname).unwrap();
> +    let lines = fstr.lines().map(|s| s.trim()).collect::<Vec<_>>();
> +    let mut events = Vec::new();
> +
> +    /*
> +     * lines() in Rust: Line terminators are not included in the lines
> +     * returned by the iterator, so check whether line_str is empty.
> +     */
> +    for (lineno, line_str) in lines.iter().enumerate() {
> +        if line_str.is_empty() || line_str.starts_with('#') {
> +            continue;
> +        }
> +
> +        let event = Event::build(line_str, lineno as u32 + 1, fname);
> +        if let Err(e) = event {
> +            return Err(Error::ReadEventFail(
> +                fname.to_owned(),
> +                lineno,
> +                e.to_string(),
> +            ));
> +        } else {
> +            events.push(event.unwrap());
> +        }
> +    }
> +
> +    Ok(events)
> +}
> +
> +#[derive(Clone)]
> +pub struct ArgProperty
> +{
> +    pub c_type: String,
> +    pub name: String,
> +}
> +
> +impl ArgProperty
> +{
> +    fn new(c_type: &str, name: &str) -> Self
> +    {
> +        ArgProperty {
> +            c_type: c_type.to_string(),
> +            name: name.to_string(),
> +        }
> +    }
> +
> +    pub fn is_string(&self) -> bool
> +    {
> +        let arg_strip = self.c_type.trim_start();
> +        STRING_TYPES.iter().any(|&s| arg_strip.starts_with(s))
> +            && arg_strip.matches('*').count() == 1
> +    }
> +}
> +
> +#[derive(Default, Clone)]
> +pub struct Arguments
> +{
> +    /* List of (type, name) tuples or arguments properties. */
> +    pub props: Vec<ArgProperty>,
> +}
> +
> +impl Arguments
> +{
> +    pub fn new() -> Self
> +    {
> +        Arguments { props: Vec::new() }
> +    }
> +
> +    pub fn len(&self) -> usize
> +    {
> +        self.props.len()
> +    }
> +
> +    pub fn build(arg_str: &str) -> Result<Arguments>
> +    {
> +        let mut args = Arguments::new();
> +        for arg in arg_str.split(',').map(|s| s.trim()) {
> +            if arg.is_empty() {
> +                return Err(Error::EmptyArg);
> +            }
> +
> +            if arg == "void" {
> +                continue;
> +            }
> +
> +            let (arg_type, identifier) = if arg.contains('*') {
> +                /* FIXME: Implement rsplit_inclusive(). */
> +                let p = arg.rfind('*').unwrap();
> +                (
> +                    /* Safe because arg contains "*" and p exists. */
> +                    unsafe { arg.get_unchecked(..p + 1) },
> +                    /* Safe because arg contains "*" and p exists. */
> +                    unsafe { arg.get_unchecked(p + 1..) },
> +                )
> +            } else {
> +                arg.rsplit_once(' ').unwrap()
> +            };

Can you write this without unsafe? Maybe rsplit_once(' ') followed by a
check for (_, '*identifier'). If the identifier starts with '*', then
arg_type += ' *' and identifier = identifier[1:].

> +
> +            validate_c_type(arg_type)?;
> +            args.props.push(ArgProperty::new(arg_type, identifier));
> +        }
> +        Ok(args)
> +    }
> +}
> +
> +/* TODO: Support original, event_trans, event_exec if needed. */
> +#[derive(Default, Clone)]
> +pub struct Event
> +{
> +    /* The event name. */
> +    pub name: String,
> +    /* Properties of the event. */
> +    pub properties: Vec<String>,
> +    /* The event format string. */
> +    pub fmt: Vec<String>,
> +    /* The event arguments. */
> +    pub args: Arguments,
> +    /* The line number in the input file. */
> +    pub lineno: u32,
> +    /* The path to the input file. */
> +    pub filename: String,
> +}
> +
> +impl Event
> +{
> +    #[allow(clippy::too_many_arguments)]
> +    pub fn new(
> +        name: &str,
> +        mut props: Vec<String>,
> +        fmt: Vec<String>,
> +        args: Arguments,
> +        lineno: u32,
> +        filename: &str,
> +    ) -> Result<Self>
> +    {
> +        let mut event = Event {
> +            name: name.to_string(),
> +            fmt: fmt.clone(),
> +            args,
> +            lineno,
> +            filename: filename.to_string(),
> +            ..Default::default()
> +        };
> +
> +        event.properties.append(&mut props);
> +
> +        if event.args.len() > 10 {
> +            return Err(Error::InvalidArgCnt(event.name));
> +        }
> +
> +        let unknown_props: Vec<String> = event
> +            .properties
> +            .iter()
> +            .filter_map(|p| {
> +                if !VALID_PROPS.contains(&p.as_str()) {
> +                    Some(p.to_string())
> +                } else {
> +                    None
> +                }
> +            })
> +            .collect();
> +        if !unknown_props.is_empty() {
> +            return Err(Error::UnknownProp(format!("{:?}", unknown_props)));
> +        }
> +
> +        if event.fmt.len() > 2 {
> +            return Err(Error::InvalidFormat(
> +                format!("too many arguments ({})", event.fmt.len()).to_string(),
> +            ));
> +        }
> +
> +        Ok(event)
> +    }
> +
> +    pub fn build(line_str: &str, lineno: u32, filename: &str) -> Result<Event>
> +    {
> +        static RE: Lazy<Regex> = Lazy::new(|| {
> +            Regex::new(
> +                r#"(?x)
> +                ((?P<props>[\w\s]+)\s+)?
> +                (?P<name>\w+)
> +                \((?P<args>[^)]*)\)
> +                \s*
> +                (?:(?:(?P<fmt_trans>".+),)?\s*(?P<fmt>".+))?

What is the purpose of fmt_trans?

> +                \s*"#,
> +            )
> +            .unwrap()

I wonder if regular expressions help here. It's not easy to read this
regex and there is a bunch of logic that takes apart the matches
afterwards. It might even be clearer to use string methods to split
fields.

Please add a comment showing the format that's being parsed:

 // [disable] <name>(<type1> <arg1>[, <type2> <arg2>] ...) "<format-string>"


> +        });
> +
> +        let caps_res = RE.captures(line_str);
> +        if caps_res.is_none() {
> +            return Err(Error::UnknownEvent(line_str.to_owned()));
> +        }
> +        let caps = caps_res.unwrap();
> +        let name = caps.name("name").map_or("", |m| m.as_str());
> +        let props: Vec<String> = if caps.name("props").is_some() {
> +            caps.name("props")
> +                .unwrap()
> +                .as_str()
> +                .split_whitespace()
> +                .map(|s| s.to_string())
> +                .collect()
> +        } else {
> +            Vec::new()
> +        };
> +        let fmt: String =
> +            caps.name("fmt").map_or("", |m| m.as_str()).to_string();
> +        let fmt_trans: String = caps
> +            .name("fmt_trans")
> +            .map_or("", |m| m.as_str())
> +            .to_string();
> +
> +        if fmt.contains("%m") || fmt_trans.contains("%m") {
> +            return Err(Error::InvalidFormat(
> +                "Event format '%m' is forbidden, pass the error 
> +                as an explicit trace argument"
> +                    .to_string(),
> +            ));
> +        }

I'm not sure simpletrace needs to check this. That's a job for tracetool
the build-time tool that generates code from trace-events files.

> +        if fmt.ends_with(r"\n") {
> +            return Err(Error::InvalidFormat(
> +                "Event format must not end with a newline 
> +                character"
> +                    .to_string(),
> +            ));
> +        }
> +        let mut fmt_vec = vec![fmt];
> +        if !fmt_trans.is_empty() {
> +            fmt_vec.push(fmt_trans);
> +        }
> +
> +        let args =
> +            Arguments::build(caps.name("args").map_or("", |m| m.as_str()))?;
> +        let event = Event::new(name, props, fmt_vec, args, lineno, filename)?;
> +
> +        Ok(event)
> +    }
> +}
> -- 
> 2.34.1
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 3/6] scripts/simpletrace-rust: Add helpers to parse trace file
  2024-05-27  8:14 ` [RFC 3/6] scripts/simpletrace-rust: Add helpers to parse trace file Zhao Liu
@ 2024-05-27 20:39   ` Stefan Hajnoczi
  2024-05-28  8:37     ` Zhao Liu
  0 siblings, 1 reply; 30+ messages in thread
From: Stefan Hajnoczi @ 2024-05-27 20:39 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Mads Ynddal, Paolo Bonzini, Philippe Mathieu-Daudé,
	Peter Maydell, Alex Bennée, Daniel P . Berrangé,
	Thomas Huth, Markus Armbruster, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 5323 bytes --]

On Mon, May 27, 2024 at 04:14:18PM +0800, Zhao Liu wrote:
> Refer to scripts/simpletrace.py, add the helpers to read the trace file
> and parse the record type field, record header and log header.
> 
> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
>  scripts/simpletrace-rust/src/main.rs | 151 +++++++++++++++++++++++++++
>  1 file changed, 151 insertions(+)
> 
> diff --git a/scripts/simpletrace-rust/src/main.rs b/scripts/simpletrace-rust/src/main.rs
> index 2d2926b7658d..b3b8baee7c66 100644
> --- a/scripts/simpletrace-rust/src/main.rs
> +++ b/scripts/simpletrace-rust/src/main.rs
> @@ -14,21 +14,172 @@
>  mod trace;
>  
>  use std::env;
> +use std::fs::File;
> +use std::io::Error as IOError;
> +use std::io::ErrorKind;
> +use std::io::Read;
>  
>  use clap::Arg;
>  use clap::Command;
>  use thiserror::Error;
>  use trace::Event;
>  
> +const RECORD_TYPE_MAPPING: u64 = 0;
> +const RECORD_TYPE_EVENT: u64 = 1;
> +
>  #[derive(Error, Debug)]
>  pub enum Error
>  {
>      #[error("usage: {0} [--no-header] <trace-events> <trace-file>")]
>      CliOptionUnmatch(String),
> +    #[error("Failed to read file: {0}")]
> +    ReadFile(IOError),
> +    #[error("Unknown record type ({0})")]
> +    UnknownRecType(u64),
>  }
>  
>  pub type Result<T> = std::result::Result<T, Error>;
>  
> +enum RecordType
> +{
> +    Empty,
> +    Mapping,
> +    Event,
> +}
> +
> +#[repr(C)]
> +#[derive(Clone, Copy, Default)]
> +struct RecordRawType
> +{
> +    rtype: u64,
> +}
> +
> +impl RecordType
> +{
> +    fn read_type(mut fobj: &File) -> Result<RecordType>
> +    {
> +        let mut tbuf = [0u8; 8];
> +        if let Err(e) = fobj.read_exact(&mut tbuf) {
> +            if e.kind() == ErrorKind::UnexpectedEof {
> +                return Ok(RecordType::Empty);
> +            } else {
> +                return Err(Error::ReadFile(e));
> +            }
> +        }
> +
> +        /*
> +         * Safe because the layout of the trace record requires us to parse
> +         * the type first, and then there is a check on the validity of the
> +         * record type.
> +         */
> +        let raw_t =
> +            unsafe { std::mem::transmute::<[u8; 8], RecordRawType>(tbuf) };

A safe alternative: https://doc.rust-lang.org/std/primitive.u64.html#method.from_ne_bytes?

> +        match raw_t.rtype {
> +            RECORD_TYPE_MAPPING => Ok(RecordType::Mapping),
> +            RECORD_TYPE_EVENT => Ok(RecordType::Event),
> +            _ => Err(Error::UnknownRecType(raw_t.rtype)),
> +        }
> +    }
> +}
> +
> +trait ReadHeader
> +{
> +    fn read_header(fobj: &File) -> Result<Self>
> +    where
> +        Self: Sized;
> +}
> +
> +#[repr(C)]
> +#[derive(Clone, Copy)]
> +struct LogHeader
> +{
> +    event_id: u64,
> +    magic: u64,
> +    version: u64,
> +}
> +
> +impl ReadHeader for LogHeader
> +{
> +    fn read_header(mut fobj: &File) -> Result<Self>
> +    {
> +        let mut raw_hdr = [0u8; 24];
> +        fobj.read_exact(&mut raw_hdr).map_err(Error::ReadFile)?;
> +
> +        /*
> +         * Safe because the size of log header (struct LogHeader)
> +         * is 24 bytes, which is ensured by simple trace backend.
> +         */
> +        let hdr =
> +            unsafe { std::mem::transmute::<[u8; 24], LogHeader>(raw_hdr) };

Or u64::from_ne_bytes() for each field.

> +        Ok(hdr)
> +    }
> +}
> +
> +#[derive(Default)]
> +struct RecordInfo
> +{
> +    event_id: u64,
> +    timestamp_ns: u64,
> +    record_pid: u32,
> +    args_payload: Vec<u8>,
> +}
> +
> +impl RecordInfo
> +{
> +    fn new() -> Self
> +    {
> +        Default::default()
> +    }
> +}
> +
> +#[repr(C)]
> +#[derive(Clone, Copy)]
> +struct RecordHeader
> +{
> +    event_id: u64,
> +    timestamp_ns: u64,
> +    record_length: u32,
> +    record_pid: u32,
> +}
> +
> +impl RecordHeader
> +{
> +    fn extract_record(&self, mut fobj: &File) -> Result<RecordInfo>
> +    {
> +        let mut info = RecordInfo::new();
> +
> +        info.event_id = self.event_id;
> +        info.timestamp_ns = self.timestamp_ns;
> +        info.record_pid = self.record_pid;
> +        info.args_payload = vec![
> +            0u8;
> +            self.record_length as usize
> +                - std::mem::size_of::<RecordHeader>()
> +        ];
> +        fobj.read_exact(&mut info.args_payload)
> +            .map_err(Error::ReadFile)?;
> +
> +        Ok(info)
> +    }
> +}
> +
> +impl ReadHeader for RecordHeader
> +{
> +    fn read_header(mut fobj: &File) -> Result<Self>
> +    {
> +        let mut raw_hdr = [0u8; 24];
> +        fobj.read_exact(&mut raw_hdr).map_err(Error::ReadFile)?;
> +
> +        /*
> +         * Safe because the size of record header (struct RecordHeader)
> +         * is 24 bytes, which is ensured by simple trace backend.
> +         */
> +        let hdr: RecordHeader =
> +            unsafe { std::mem::transmute::<[u8; 24], RecordHeader>(raw_hdr) };

Or u64::from_ne_bytes() and u32::from_ne_bytes() for all fields.

> +        Ok(hdr)
> +    }
> +}
> +
>  pub struct EventArgPayload {}
>  
>  trait Analyzer
> -- 
> 2.34.1
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 4/6] scripts/simpletrace-rust: Parse and check trace recode file
  2024-05-27  8:14 ` [RFC 4/6] scripts/simpletrace-rust: Parse and check trace recode file Zhao Liu
@ 2024-05-27 20:44   ` Stefan Hajnoczi
  2024-05-28  9:30     ` Zhao Liu
  0 siblings, 1 reply; 30+ messages in thread
From: Stefan Hajnoczi @ 2024-05-27 20:44 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Mads Ynddal, Paolo Bonzini, Philippe Mathieu-Daudé,
	Peter Maydell, Alex Bennée, Daniel P . Berrangé,
	Thomas Huth, Markus Armbruster, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 16245 bytes --]

On Mon, May 27, 2024 at 04:14:19PM +0800, Zhao Liu wrote:
> Refer to scripts/simpletrace.py, parse and check the simple trace
> backend binary trace file.
> 
> Note, in order to keep certain backtrace information to get frame,
> adjust the cargo debug level for release version to "line-tables-only",
> which slows down the program, but is necessary.
> 
> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> ---
>  scripts/simpletrace-rust/Cargo.lock  |  79 +++++++++
>  scripts/simpletrace-rust/Cargo.toml  |   4 +
>  scripts/simpletrace-rust/src/main.rs | 253 ++++++++++++++++++++++++++-
>  3 files changed, 329 insertions(+), 7 deletions(-)
> 
> diff --git a/scripts/simpletrace-rust/Cargo.lock b/scripts/simpletrace-rust/Cargo.lock
> index 3d815014eb44..37d80974ffe7 100644
> --- a/scripts/simpletrace-rust/Cargo.lock
> +++ b/scripts/simpletrace-rust/Cargo.lock
> @@ -2,6 +2,21 @@
>  # It is not intended for manual editing.
>  version = 3
>  
> +[[package]]
> +name = "addr2line"
> +version = "0.21.0"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "8a30b2e23b9e17a9f90641c7ab1549cd9b44f296d3ccbf309d2863cfe398a0cb"
> +dependencies = [
> + "gimli",
> +]
> +
> +[[package]]
> +name = "adler"
> +version = "1.0.2"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "f26201604c87b1e01bd3d98f8d5d9a8fcbb815e8cedb41ffccbeb4bf593a35fe"
> +
>  [[package]]
>  name = "aho-corasick"
>  version = "1.1.3"
> @@ -60,6 +75,33 @@ dependencies = [
>   "windows-sys",
>  ]
>  
> +[[package]]
> +name = "backtrace"
> +version = "0.3.71"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "26b05800d2e817c8b3b4b54abd461726265fa9789ae34330622f2db9ee696f9d"
> +dependencies = [
> + "addr2line",
> + "cc",
> + "cfg-if",
> + "libc",
> + "miniz_oxide",
> + "object",
> + "rustc-demangle",
> +]
> +
> +[[package]]
> +name = "cc"
> +version = "1.0.98"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "41c270e7540d725e65ac7f1b212ac8ce349719624d7bcff99f8e2e488e8cf03f"
> +
> +[[package]]
> +name = "cfg-if"
> +version = "1.0.0"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "baf1de4339761588bc0619e3cbc0120ee582ebb74b53b4efbf79117bd2da40fd"
> +
>  [[package]]
>  name = "clap"
>  version = "4.5.4"
> @@ -93,18 +135,48 @@ version = "1.0.1"
>  source = "registry+https://github.com/rust-lang/crates.io-index"
>  checksum = "0b6a852b24ab71dffc585bcb46eaf7959d175cb865a7152e35b348d1b2960422"
>  
> +[[package]]
> +name = "gimli"
> +version = "0.28.1"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "4271d37baee1b8c7e4b708028c57d816cf9d2434acb33a549475f78c181f6253"
> +
>  [[package]]
>  name = "is_terminal_polyfill"
>  version = "1.70.0"
>  source = "registry+https://github.com/rust-lang/crates.io-index"
>  checksum = "f8478577c03552c21db0e2724ffb8986a5ce7af88107e6be5d2ee6e158c12800"
>  
> +[[package]]
> +name = "libc"
> +version = "0.2.155"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "97b3888a4aecf77e811145cadf6eef5901f4782c53886191b2f693f24761847c"
> +
>  [[package]]
>  name = "memchr"
>  version = "2.7.2"
>  source = "registry+https://github.com/rust-lang/crates.io-index"
>  checksum = "6c8640c5d730cb13ebd907d8d04b52f55ac9a2eec55b440c8892f40d56c76c1d"
>  
> +[[package]]
> +name = "miniz_oxide"
> +version = "0.7.3"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "87dfd01fe195c66b572b37921ad8803d010623c0aca821bea2302239d155cdae"
> +dependencies = [
> + "adler",
> +]
> +
> +[[package]]
> +name = "object"
> +version = "0.32.2"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "a6a622008b6e321afc04970976f62ee297fdbaa6f95318ca343e3eebb9648441"
> +dependencies = [
> + "memchr",
> +]
> +
>  [[package]]
>  name = "once_cell"
>  version = "1.19.0"
> @@ -158,10 +230,17 @@ version = "0.8.3"
>  source = "registry+https://github.com/rust-lang/crates.io-index"
>  checksum = "adad44e29e4c806119491a7f06f03de4d1af22c3a680dd47f1e6e179439d1f56"
>  
> +[[package]]
> +name = "rustc-demangle"
> +version = "0.1.24"
> +source = "registry+https://github.com/rust-lang/crates.io-index"
> +checksum = "719b953e2095829ee67db738b3bfa9fa368c94900df327b3f07fe6e794d2fe1f"
> +
>  [[package]]
>  name = "simpletrace-rust"
>  version = "0.1.0"
>  dependencies = [
> + "backtrace",
>   "clap",
>   "once_cell",
>   "regex",
> diff --git a/scripts/simpletrace-rust/Cargo.toml b/scripts/simpletrace-rust/Cargo.toml
> index 24a79d04e566..23a3179de01c 100644
> --- a/scripts/simpletrace-rust/Cargo.toml
> +++ b/scripts/simpletrace-rust/Cargo.toml
> @@ -7,7 +7,11 @@ authors = ["Zhao Liu <zhao1.liu@intel.com>"]
>  license = "GPL-2.0-or-later"
>  
>  [dependencies]
> +backtrace = "0.3"
>  clap = "4.5.4"
>  once_cell = "1.19.0"
>  regex = "1.10.4"
>  thiserror = "1.0.20"
> +
> +[profile.release]
> +debug = "line-tables-only"
> diff --git a/scripts/simpletrace-rust/src/main.rs b/scripts/simpletrace-rust/src/main.rs
> index b3b8baee7c66..f9a71d8dc243 100644
> --- a/scripts/simpletrace-rust/src/main.rs
> +++ b/scripts/simpletrace-rust/src/main.rs
> @@ -13,17 +13,24 @@
>  
>  mod trace;
>  
> +use std::collections::HashMap;
>  use std::env;
>  use std::fs::File;
>  use std::io::Error as IOError;
>  use std::io::ErrorKind;
>  use std::io::Read;
> +use std::mem::size_of;
>  
> +use backtrace::Backtrace;
>  use clap::Arg;
>  use clap::Command;
>  use thiserror::Error;
>  use trace::Event;
>  
> +const DROPPED_EVENT_ID: u64 = 0xfffffffffffffffe;
> +const HEADER_MAGIC: u64 = 0xf2b177cb0aa429b4;
> +const HEADER_EVENT_ID: u64 = 0xffffffffffffffff;
> +
>  const RECORD_TYPE_MAPPING: u64 = 0;
>  const RECORD_TYPE_EVENT: u64 = 1;
>  
> @@ -32,10 +39,25 @@ pub enum Error
>  {
>      #[error("usage: {0} [--no-header] <trace-events> <trace-file>")]
>      CliOptionUnmatch(String),
> +    #[error("Invalid event name ({0})")]
> +    InvalidEventName(String),
> +    #[error("Not a valid trace file, header id {0} != {1}")]
> +    InvalidHeaderId(u64, u64),
> +    #[error("Not a valid trace file, header magic {0} != {1}")]
> +    InvalidHeaderMagic(u64, u64),
>      #[error("Failed to read file: {0}")]
>      ReadFile(IOError),
> +    #[error(
> +        "event {0} is logged but is not declared in the trace events \
> +        file, try using trace-events-all instead."
> +    )]
> +    UnknownEvent(String),
>      #[error("Unknown record type ({0})")]
>      UnknownRecType(u64),
> +    #[error("Unknown version {0} of tracelog format!")]
> +    UnknownVersion(u64),
> +    #[error("Log format {0} not supported with this QEMU release!")]
> +    UnsupportedVersion(u64),
>  }
>  
>  pub type Result<T> = std::result::Result<T, Error>;
> @@ -98,19 +120,22 @@ struct LogHeader
>      version: u64,
>  }
>  
> +const LOG_HDR_LEN: usize = size_of::<LogHeader>();
> +
>  impl ReadHeader for LogHeader
>  {
>      fn read_header(mut fobj: &File) -> Result<Self>
>      {
> -        let mut raw_hdr = [0u8; 24];
> +        let mut raw_hdr = [0u8; LOG_HDR_LEN];
>          fobj.read_exact(&mut raw_hdr).map_err(Error::ReadFile)?;
>  
>          /*
>           * Safe because the size of log header (struct LogHeader)
>           * is 24 bytes, which is ensured by simple trace backend.
>           */
> -        let hdr =
> -            unsafe { std::mem::transmute::<[u8; 24], LogHeader>(raw_hdr) };
> +        let hdr = unsafe {
> +            std::mem::transmute::<[u8; LOG_HDR_LEN], LogHeader>(raw_hdr)
> +        };
>          Ok(hdr)
>      }
>  }
> @@ -142,6 +167,8 @@ struct RecordHeader
>      record_pid: u32,
>  }
>  
> +const REC_HDR_LEN: usize = size_of::<RecordHeader>();
> +
>  impl RecordHeader
>  {
>      fn extract_record(&self, mut fobj: &File) -> Result<RecordInfo>
> @@ -167,20 +194,232 @@ impl ReadHeader for RecordHeader
>  {
>      fn read_header(mut fobj: &File) -> Result<Self>
>      {
> -        let mut raw_hdr = [0u8; 24];
> +        let mut raw_hdr = [0u8; REC_HDR_LEN];
>          fobj.read_exact(&mut raw_hdr).map_err(Error::ReadFile)?;
>  
>          /*
>           * Safe because the size of record header (struct RecordHeader)
>           * is 24 bytes, which is ensured by simple trace backend.
>           */
> -        let hdr: RecordHeader =
> -            unsafe { std::mem::transmute::<[u8; 24], RecordHeader>(raw_hdr) };
> +        let hdr: RecordHeader = unsafe {
> +            std::mem::transmute::<[u8; REC_HDR_LEN], RecordHeader>(raw_hdr)
> +        };
>          Ok(hdr)
>      }
>  }
>  
> -pub struct EventArgPayload {}
> +#[derive(Clone)]
> +pub struct EventArgPayload
> +{
> +    raw_val: Option<u64>,
> +    raw_str: Option<String>,
> +}
> +
> +impl EventArgPayload
> +{
> +    fn new(raw_val: Option<u64>, raw_str: Option<String>) -> Self
> +    {
> +        EventArgPayload { raw_val, raw_str }
> +    }
> +
> +    fn get_payload_str(
> +        offset: &mut usize,
> +        args_payload: &[u8],
> +    ) -> Result<EventArgPayload>
> +    {
> +        let length = u32::from_le_bytes(
> +            args_payload[*offset..(*offset + 4)].try_into().unwrap(),
> +        );
> +        *offset += 4;
> +        let raw_str: &[u8] = args_payload
> +            .get(*offset..(*offset + length as usize))
> +            .unwrap();
> +        *offset += length as usize;
> +        Ok(EventArgPayload::new(
> +            None,
> +            Some(String::from_utf8_lossy(raw_str).into_owned()),
> +        ))
> +    }
> +
> +    fn get_payload_val(
> +        offset: &mut usize,
> +        args_payload: &[u8],
> +    ) -> Result<EventArgPayload>
> +    {
> +        let raw_val = u64::from_le_bytes(
> +            args_payload[*offset..(*offset + 8)].try_into().unwrap(),
> +        );
> +        *offset += 8;
> +        Ok(EventArgPayload::new(Some(raw_val), None))
> +    }
> +}
> +
> +#[derive(Clone)]
> +struct EventEntry
> +{
> +    event: Event,
> +    event_id: u64,
> +    timestamp_ns: u64,
> +    record_pid: u32,
> +    rec_args: Vec<EventArgPayload>,
> +}
> +
> +impl EventEntry
> +{
> +    fn new(
> +        event: &Event,
> +        event_id: u64,
> +        timestamp_ns: u64,
> +        record_pid: u32,
> +    ) -> Self
> +    {
> +        EventEntry {
> +            event: event.clone(),
> +            event_id,
> +            timestamp_ns,
> +            record_pid,
> +            rec_args: Vec::new(),
> +        }
> +    }
> +}
> +
> +fn get_mapping(mut fobj: &File) -> Result<(u64, String)>
> +{
> +    let mut event_id_buf = [0u8; 8];
> +    fobj.read_exact(&mut event_id_buf)
> +        .map_err(Error::ReadFile)?;
> +    let event_id = u64::from_le_bytes(event_id_buf);
> +
> +    let mut len_buf = [0u8; 4];
> +    fobj.read_exact(&mut len_buf).map_err(Error::ReadFile)?;
> +
> +    let len = u32::from_le_bytes(len_buf);
> +    let mut name_buf = vec![0u8; len as usize];
> +    fobj.read_exact(&mut name_buf).map_err(Error::ReadFile)?;
> +    let name = String::from_utf8(name_buf.clone())
> +        .map_err(|_| Error::InvalidEventName(format!("{:?}", name_buf)))?;
> +
> +    Ok((event_id, name))
> +}
> +
> +fn read_record(fobj: &File) -> Result<RecordInfo>
> +{
> +    let hdr = RecordHeader::read_header(fobj)?;
> +    let info = hdr.extract_record(fobj)?;
> +    Ok(info)
> +}
> +
> +fn read_trace_header(fobj: &File) -> Result<()>
> +{
> +    let log_hdr = LogHeader::read_header(fobj)?;
> +    if log_hdr.event_id != HEADER_EVENT_ID {
> +        return Err(Error::InvalidHeaderId(log_hdr.event_id, HEADER_EVENT_ID));
> +    }
> +    if log_hdr.magic != HEADER_MAGIC {
> +        return Err(Error::InvalidHeaderMagic(log_hdr.magic, HEADER_MAGIC));
> +    }
> +    if ![0, 2, 3, 4].contains(&(log_hdr.version as i64)) {
> +        return Err(Error::UnknownVersion(log_hdr.version));
> +    }
> +    if log_hdr.version != 4 {
> +        return Err(Error::UnsupportedVersion(log_hdr.version));
> +    }
> +    Ok(())
> +}
> +
> +fn read_trace_records(
> +    events: &Vec<Event>,
> +    fobj: &File,
> +    analyzer: &mut Formatter,
> +    read_header: bool,
> +) -> Result<Vec<String>>
> +{
> +    /* backtrace::Backtrace needs this env variable. */
> +    env::set_var("RUST_BACKTRACE", "1");
> +    let bt = Backtrace::new();
> +    let raw_frame = bt.frames().first().unwrap();
> +    let frameinfo = raw_frame.symbols().first().unwrap();
> +
> +    let dropped_event = Event::build(
> +        "Dropped_Event(uint64_t num_events_dropped)",
> +        frameinfo.lineno().unwrap() + 1,
> +        frameinfo.filename().unwrap().to_str().unwrap(),
> +    )
> +    .unwrap();

Is the backtrace necessary? This trick was used in Python where it's
part of the standard library, but I don't think there is any need for
Dropped_Event to refer to this line in the source code.

Maybe Rust has a way to do this at compile-time or you can hardcode
values instead.

> +
> +    let mut event_mapping = HashMap::new();
> +    for e in events {
> +        event_mapping.insert(e.name.clone(), e.clone());
> +    }
> +
> +    let drop_str = "dropped".to_string();
> +    event_mapping.insert(drop_str.clone(), dropped_event.clone());
> +    let mut event_id_to_name: HashMap<u64, String> = HashMap::new();
> +    event_id_to_name.insert(DROPPED_EVENT_ID, drop_str.clone());
> +
> +    if !read_header {
> +        for (event_id, event) in events.iter().enumerate() {
> +            event_id_to_name.insert(event_id as u64, event.name.clone());
> +        }
> +    }
> +
> +    let mut fmt_strs = Vec::new();
> +    loop {
> +        let rtype = RecordType::read_type(fobj)?;
> +        match rtype {
> +            RecordType::Empty => {
> +                break;
> +            }
> +            RecordType::Mapping => {
> +                let (event_id, event_name) =
> +                    get_mapping(fobj).expect("Error reading mapping");
> +                event_id_to_name.insert(event_id, event_name);
> +                continue;
> +            }
> +            RecordType::Event => {
> +                let rec = read_record(fobj).expect("Error reading record");
> +                let event_name =
> +                    event_id_to_name.get(&rec.event_id).unwrap().to_string();
> +                let event = event_mapping
> +                    .get(&event_name)
> +                    .ok_or(Error::UnknownEvent(event_name))?;
> +
> +                let mut entry = EventEntry::new(
> +                    event,
> +                    rec.event_id,
> +                    rec.timestamp_ns,
> +                    rec.record_pid,
> +                );
> +                let mut offset = 0;
> +
> +                for arg in &event.args.props {
> +                    let payload = if arg.is_string() {
> +                        EventArgPayload::get_payload_str(
> +                            &mut offset,
> +                            &rec.args_payload,
> +                        )?
> +                    } else {
> +                        EventArgPayload::get_payload_val(
> +                            &mut offset,
> +                            &rec.args_payload,
> +                        )?
> +                    };
> +                    entry.rec_args.push(payload);
> +                }
> +
> +                let fmt = analyzer.process_event(
> +                    &entry.rec_args,
> +                    &entry.event,
> +                    entry.event_id,
> +                    entry.timestamp_ns,
> +                    entry.record_pid,
> +                )?;
> +                fmt_strs.push(fmt);
> +            }
> +        }
> +    }
> +    Ok(fmt_strs)
> +}
>  
>  trait Analyzer
>  {
> -- 
> 2.34.1
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
  2024-05-27 10:49 ` Mads Ynddal
@ 2024-05-28  6:15   ` Zhao Liu
  0 siblings, 0 replies; 30+ messages in thread
From: Zhao Liu @ 2024-05-28  6:15 UTC (permalink / raw)
  To: Mads Ynddal
  Cc: Stefan Hajnoczi, Paolo Bonzini, Philippe Mathieu-Daudé,
	Peter Maydell, Alex Bennée,
	"Daniel P . Berrangé", Thomas Huth,
	Markus Armbruster, qemu-devel

Hi Mads,

On Mon, May 27, 2024 at 12:49:06PM +0200, Mads Ynddal wrote:
> Date: Mon, 27 May 2024 12:49:06 +0200
> From: Mads Ynddal <mads@ynddal.dk>
> Subject: Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
> X-Mailer: Apple Mail (2.3774.600.62)
> 
> Hi,
> 
> Interesting work. I don't have any particular comments for the code, but I
> wanted to address a few of the points here.
> 
> > 2. Rust delivers faster parsing.
> 
> For me, the point of simpletrace.py is not to be the fastest at parsing, but
> rather to open the door for using Python libraries like numpy, matplotlib, etc.
> for analysis.
> 
> There might be room for improvement in the Python version, especially in
> minimizing memory usage, when parsing large traces.

Thanks for pointing this out, the Python version is also very extensible
and easy to develop.

Perhaps ease of scalability vs. performance could be the difference that
the two versions focus on?

> > Security
> > --------
> > 
> > This is an example.
> > 
> > Rust is very strict about type-checking, and it found timestamp reversal
> > issue in simpletrace-rust [3] (sorry, haven't gotten around to digging
> > deeper with more time)...in this RFC, I workingaround it by allowing
> > negative values. And the python version, just silently covered this
> > issue up.
> 
> I'm not particularly worried about the security of the Python version. We're not
> doing anything obviously exploitable.

I agree with this, this tool is mainly for parsing. I think one of the
starting points for providing a Rust version was also to explore whether
this could be an opportunity to integrate Rust into QEMU.

Thanks,
Zhao




^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
  2024-05-27 19:59 ` Stefan Hajnoczi
@ 2024-05-28  6:48   ` Zhao Liu
  2024-05-28 13:05     ` Stefan Hajnoczi
  0 siblings, 1 reply; 30+ messages in thread
From: Zhao Liu @ 2024-05-28  6:48 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Mads Ynddal, Paolo Bonzini, Philippe Mathieu-Daud�,
	Peter Maydell, Alex Benn�e, Daniel P . Berrang�,
	Thomas Huth, Markus Armbruster, qemu-devel

Hi Stefan,

On Mon, May 27, 2024 at 03:59:44PM -0400, Stefan Hajnoczi wrote:
> Date: Mon, 27 May 2024 15:59:44 -0400
> From: Stefan Hajnoczi <stefanha@redhat.com>
> Subject: Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
> 
> On Mon, May 27, 2024 at 04:14:15PM +0800, Zhao Liu wrote:
> > Hi maintainers and list,
> > 
> > This RFC series attempts to re-implement simpletrace.py with Rust, which
> > is the 1st task of Paolo's GSoC 2024 proposal.
> > 
> > There are two motivations for this work:
> > 1. This is an open chance to discuss how to integrate Rust into QEMU.
> > 2. Rust delivers faster parsing.
> > 
> > 
> > Introduction
> > ============
> > 
> > Code framework
> > --------------
> > 
> > I choose "cargo" to organize the code, because the current
> > implementation depends on external crates (Rust's library), such as
> > "backtrace" for getting frameinfo, "clap" for parsing the cli, "rex" for
> > regular matching, and so on. (Meson's support for external crates is
> > still incomplete. [2])
> > 
> > The simpletrace-rust created in this series is not yet integrated into
> > the QEMU compilation chain, so it can only be compiled independently, e.g.
> > under ./scripts/simpletrace/, compile it be:
> > 
> >     cargo build --release
> 
> Please make sure it's built by .gitlab-ci.d/ so that the continuous
> integration system prevents bitrot. You can add a job that runs the
> cargo build.

Thanks! I'll do this.

> > 
> > The code tree for the entire simpletrace-rust is as follows:
> > 
> > $ script/simpletrace-rust .
> > .
> > ├── Cargo.toml
> > └── src
> >     └── main.rs   // The simpletrace logic (similar to simpletrace.py).
> >     └── trace.rs  // The Argument and Event abstraction (refer to
> >                   // tracetool/__init__.py).
> > 
> > My question about meson v.s. cargo, I put it at the end of the cover
> > letter (the section "Opens on Rust Support").
> > 
> > The following two sections are lessons I've learned from this Rust
> > practice.
> > 
> > 
> > Performance
> > -----------
> > 
> > I did the performance comparison using the rust-simpletrace prototype with
> > the python one:
> > 
> > * On the i7-10700 CPU @ 2.90GHz machine, parsing and outputting a 35M
> > trace binary file for 10 times on each item:
> > 
> >                       AVE (ms)       Rust v.s. Python
> > Rust   (stdout)       12687.16            114.46%
> > Python (stdout)       14521.85
> > 
> > Rust   (file)          1422.44            264.99%
> > Python (file)          3769.37
> > 
> > - The "stdout" lines represent output to the screen.
> > - The "file" lines represent output to a file (via "> file").
> > 
> > This Rust version contains some optimizations (including print, regular
> > matching, etc.), but there should be plenty of room for optimization.
> > 
> > The current performance bottleneck is the reading binary trace file,
> > since I am parsing headers and event payloads one after the other, so
> > that the IO read overhead accounts for 33%, which can be further
> > optimized in the future.
> 
> Performance will become more important when large amounts of TCG data is
> captured, as described in the project idea:
> https://wiki.qemu.org/Internships/ProjectIdeas/TCGBinaryTracing
> 
> While I can't think of a time in the past where simpletrace.py's
> performance bothered me, improving performance is still welcome. Just
> don't spend too much time on performance (and making the code more
> complex) unless there is a real need.

Yes, I agree that it shouldn't be over-optimized.

The logic in the current Rust version is pretty much a carbon copy of
the Python version, without additional complex logic introduced, but the
improvements in x2.6 were obtained by optimizing IO:

* reading the event configuration file, where I called the buffered
  interface,
* and the output formatted trace log, which I output all via std_out.lock()
  followed by write_all().

So, just the simple tweak of the interfaces brings much benefits. :-)

> > Security
> > --------
> > 
> > This is an example.
> > 
> > Rust is very strict about type-checking, and it found timestamp reversal
> > issue in simpletrace-rust [3] (sorry, haven't gotten around to digging
> > deeper with more time)...in this RFC, I workingaround it by allowing
> > negative values. And the python version, just silently covered this
> > issue up.
> >
> > Opens on Rust Support
> > =====================
> > 
> > Meson v.s. Cargo
> > ----------------
> > 
> > The first question is whether all Rust code (including under scripts)
> > must be integrated into meson?
> > 
> > If so, because of [2] then I have to discard the external crates and
> > build some more Rust wheels of my own to replace the previous external
> > crates.
> > 
> > For the main part of the QEMU code, I think the answer must be Yes, but
> > for the tools in the scripts directory, would it be possible to allow
> > the use of cargo to build small tools/program for flexibility and
> > migrate to meson later (as meson's support for rust becomes more
> > mature)?
> 
> I have not seen a satisfying way to natively build Rust code using
> meson. I remember reading about a tool that converts Cargo.toml files to
> meson wrap files or something similar. That still doesn't feel great
> because upstream works with Cargo and duplicating build information in
> meson is a drag.
> 
> Calling cargo from meson is not ideal either, but it works and avoids
> duplicating build information. This is the approach I would use for now
> unless someone can point to an example of native Rust support in meson
> that is clean.
> 
> Here is how libblkio calls cargo from meson:
> https://gitlab.com/libblkio/libblkio/-/blob/main/src/meson.build
> https://gitlab.com/libblkio/libblkio/-/blob/main/src/cargo-build.sh

Many thanks! This is a good example and I'll try to build similarly to
it.

> > 
> > 
> > External crates
> > ---------------
> > 
> > This is an additional question that naturally follows from the above
> > question, do we have requirements for Rust's external crate? Is only std
> > allowed?
> 
> There is no policy. My suggestion:
> 
> If you need a third-party crate then it's okay to use it, but try to
> minimize dependencies. Avoid adding dependening for niceties that are
> not strictly needed. Third-party crates are a burden for package
> maintainers and anyone building from source. They increase the risk that
> the code will fail to build. They can also be a security risk.

Thanks for the suggestion, that's clear to me, I'll try to control the
third party dependencies.

> > 
> > Welcome your feedback!
> 
> It would be easier to give feedback if you implement some examples of
> TCG binary tracing before rewriting simpletrace.py. It's unclear to me
> why simpletrace needs to be rewritten at this point. If you are
> extending the simpletrace file format in ways that are not suitable for
> Python or can demonstrate that Python performance is a problem, then
> focussing on a Rust simpletrace implementation is more convincing.
> 
> Could you use simpletrace.py to develop TCG binary tracing first?

Yes, I can. :-)

Rewriting in Rust does sound duplicative, but I wonder if this could be
viewed as an open opportunity to add Rust support for QEMU, looking at
the scripts directory to start exploring Rust support/build would be
a relatively easy place to start.

I think the exploration of Rust's build of the simpletrace tool under
scripts parts can help with subsequent work on supporting Rust in other
QEMU core parts.

From this point, may I ask your opinion?

Maybe later, Rust-simpletrace and python-simpletrace can differ, e.g.
the former goes for performance and the latter for scalability.

Thanks,
Zhao




^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 1/6] scripts/simpletrace-rust: Add the basic cargo framework
  2024-05-27 20:05   ` Stefan Hajnoczi
@ 2024-05-28  7:53     ` Zhao Liu
  2024-05-28 14:14       ` Stefan Hajnoczi
  0 siblings, 1 reply; 30+ messages in thread
From: Zhao Liu @ 2024-05-28  7:53 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Mads Ynddal, Paolo Bonzini, Philippe Mathieu-Daud�,
	Peter Maydell, Alex Benn�e, Daniel P . Berrang�,
	Thomas Huth, Markus Armbruster, qemu-devel

Hi Stefan,

[snip]

> > diff --git a/scripts/simpletrace-rust/.rustfmt.toml b/scripts/simpletrace-rust/.rustfmt.toml
> > new file mode 100644
> > index 000000000000..97a97c24ebfb
> > --- /dev/null
> > +++ b/scripts/simpletrace-rust/.rustfmt.toml
> > @@ -0,0 +1,9 @@
> > +brace_style = "AlwaysNextLine"
> > +comment_width = 80
> > +edition = "2021"
> > +group_imports = "StdExternalCrate"
> > +imports_granularity = "item"
> > +max_width = 80
> > +use_field_init_shorthand = true
> > +use_try_shorthand = true
> > +wrap_comments = true
> 
> There should be QEMU-wide policy. That said, why is it necessary to customize rustfmt?

Indeed, but QEMU's style for Rust is currently undefined, so I'm trying
to add this to make it easier to check the style...I will separate it
out as a style policy proposal.

[snip]

> > +trait Analyzer
> > +{
> 
> The Python version treats this as an API so that users can write trace
> analysis scripts. I see below that you're using non-doc comments. That
> suggests you don't see this as a public API that people can write trace
> analysis scripts against?

Yes, for the initial version, I'm not exposing it for now. It could be
exposed later if needed and then we can support multiple script builds
by defining multiple workspaces in cargo.

Thanks,
Zhao

> > +    /* Called at the start of the trace. */
> > +    fn begin(&self) {}
> > +
> > +    /* Called if no specific method for processing a trace event. */
> > +    fn catchall(
> > +        &mut self,
> > +        rec_args: &[EventArgPayload],
> > +        event: &Event,
> > +        timestamp_ns: u64,
> > +        pid: u32,
> > +        event_id: u64,
> > +    ) -> Result<String>;
> > +
> > +    /* Called at the end of the trace. */
> > +    fn end(&self) {}
> > +
> > +    /*
> > +     * TODO: Support "variable" parameters (i.e. variants of process_event()
> > +     * with different parameters, like **kwargs in python), when we need a
> > +     * simpletrace rust module.
> > +     */
> > +    fn process_event(
> > +        &mut self,
> > +        rec_args: &[EventArgPayload],
> > +        event: &Event,
> > +        event_id: u64,
> > +        timestamp_ns: u64,
> > +        pid: u32,
> > +    ) -> Result<String>
> > +    {
> > +        self.catchall(rec_args, event, timestamp_ns, pid, event_id)
> > +
> > +        /*
> > +         * TODO: Support custom function hooks (like getattr() in python),
> > +         * when we need a simpletrace rust module.
> > +         */
> > +    }
> > +}


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 2/6] scripts/simpletrace-rust: Support Event & Arguments in trace module
  2024-05-27 20:33   ` Stefan Hajnoczi
@ 2024-05-28  8:32     ` Zhao Liu
  0 siblings, 0 replies; 30+ messages in thread
From: Zhao Liu @ 2024-05-28  8:32 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Mads Ynddal, Paolo Bonzini, Philippe Mathieu-Daud�,
	Peter Maydell, Alex Benn�e, Daniel P . Berrang�,
	Thomas Huth, Markus Armbruster, qemu-devel

> > +/*
> > + * Refer to the description of ALLOWED_TYPES in
> > + * scripts/tracetool/__init__.py.
> 
> Please don't reference the Python implementation because this will not
> age well. It may bitrot if the Python code changes or if the Python
> implementation is deprecated then the source file will go away
> altogether. Make the Rust implementation self-contained. If there are
> common file format concerns shared by implementations, then move that
> information to a separate document in docs/interop/ (i.e. a simpletrace
> file format specification).

Thanks for your guidance, will do.

> > + */
> > +const ALLOWED_TYPES: [&str; 20] = [
> > +    "int",
> > +    "long",
> > +    "short",
> > +    "char",
> > +    "bool",
> > +    "unsigned",
> > +    "signed",
> > +    "int8_t",
> > +    "uint8_t",
> > +    "int16_t",
> > +    "uint16_t",
> > +    "int32_t",
> > +    "uint32_t",
> > +    "int64_t",
> > +    "uint64_t",
> > +    "void",
> > +    "size_t",
> > +    "ssize_t",
> > +    "uintptr_t",
> > +    "ptrdiff_t",
> > +];
> > +
> > +const STRING_TYPES: [&str; 4] =
> > +    ["const char*", "char*", "const char *", "char *"];
> > +
> > +/* TODO: Support 'vcpu' property. */
> 
> The vcpu property was removed in commit d9a6bad542cd ("docs: remove
> references to TCG tracing"). Is this comment outdated or are you
> planning to bring it back?

Thanks! I have no plan for this, I just follow _VALID_PROPS[] in
scripts/tracetool/__init__.py. As you commented above, I think I should
just ignore it. ;-)

> > +const VALID_PROPS: [&str; 1] = ["disable"];

[snip]

> > +    pub fn build(arg_str: &str) -> Result<Arguments>
> > +    {
> > +        let mut args = Arguments::new();
> > +        for arg in arg_str.split(',').map(|s| s.trim()) {
> > +            if arg.is_empty() {
> > +                return Err(Error::EmptyArg);
> > +            }
> > +
> > +            if arg == "void" {
> > +                continue;
> > +            }
> > +
> > +            let (arg_type, identifier) = if arg.contains('*') {
> > +                /* FIXME: Implement rsplit_inclusive(). */
> > +                let p = arg.rfind('*').unwrap();
> > +                (
> > +                    /* Safe because arg contains "*" and p exists. */
> > +                    unsafe { arg.get_unchecked(..p + 1) },
> > +                    /* Safe because arg contains "*" and p exists. */
> > +                    unsafe { arg.get_unchecked(p + 1..) },
> > +                )
> > +            } else {
> > +                arg.rsplit_once(' ').unwrap()
> > +            };
> 
> Can you write this without unsafe? Maybe rsplit_once(' ') followed by a
> check for (_, '*identifier'). If the identifier starts with '*', then
> arg_type += ' *' and identifier = identifier[1:].

Clever idea! It should work, will try this way.

> > +
> > +            validate_c_type(arg_type)?;
> > +            args.props.push(ArgProperty::new(arg_type, identifier));
> > +        }
> > +        Ok(args)
> > +    }
> > +}
> > +

[snip]

> > +    pub fn build(line_str: &str, lineno: u32, filename: &str) -> Result<Event>
> > +    {
> > +        static RE: Lazy<Regex> = Lazy::new(|| {
> > +            Regex::new(
> > +                r#"(?x)
> > +                ((?P<props>[\w\s]+)\s+)?
> > +                (?P<name>\w+)
> > +                \((?P<args>[^)]*)\)
> > +                \s*
> > +                (?:(?:(?P<fmt_trans>".+),)?\s*(?P<fmt>".+))?
> 
> What is the purpose of fmt_trans?
> 
> > +                \s*"#,
> > +            )
> > +            .unwrap()
> 
> I wonder if regular expressions help here. It's not easy to read this
> regex and there is a bunch of logic that takes apart the matches
> afterwards. It might even be clearer to use string methods to split
> fields.

Yes, regular matching is a burden here (it's a "lazy simplification" on
my part), and I'll think if it's possible to avoid regular matching with
string methods.

> Please add a comment showing the format that's being parsed:
> 
>  // [disable] <name>(<type1> <arg1>[, <type2> <arg2>] ...) "<format-string>"
> 

OK.

> > +        });
> > +
> > +        let caps_res = RE.captures(line_str);
> > +        if caps_res.is_none() {
> > +            return Err(Error::UnknownEvent(line_str.to_owned()));
> > +        }
> > +        let caps = caps_res.unwrap();
> > +        let name = caps.name("name").map_or("", |m| m.as_str());
> > +        let props: Vec<String> = if caps.name("props").is_some() {
> > +            caps.name("props")
> > +                .unwrap()
> > +                .as_str()
> > +                .split_whitespace()
> > +                .map(|s| s.to_string())
> > +                .collect()
> > +        } else {
> > +            Vec::new()
> > +        };
> > +        let fmt: String =
> > +            caps.name("fmt").map_or("", |m| m.as_str()).to_string();
> > +        let fmt_trans: String = caps
> > +            .name("fmt_trans")
> > +            .map_or("", |m| m.as_str())
> > +            .to_string();
> > +
> > +        if fmt.contains("%m") || fmt_trans.contains("%m") {
> > +            return Err(Error::InvalidFormat(
> > +                "Event format '%m' is forbidden, pass the error 
> > +                as an explicit trace argument"
> > +                    .to_string(),
> > +            ));
> > +        }
> 
> I'm not sure simpletrace needs to check this. That's a job for tracetool
> the build-time tool that generates code from trace-events files.

Thanks for the clarification, this item has bothered me before, I also
noticed that simpletrace doesn't use it, but don't feel confident about
deleting it completely, I'll clean it up!

> > +        if fmt.ends_with(r"\n") {
> > +            return Err(Error::InvalidFormat(
> > +                "Event format must not end with a newline 
> > +                character"
> > +                    .to_string(),
> > +            ));
> > +        }

Thanks,
Zhao



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 3/6] scripts/simpletrace-rust: Add helpers to parse trace file
  2024-05-27 20:39   ` Stefan Hajnoczi
@ 2024-05-28  8:37     ` Zhao Liu
  0 siblings, 0 replies; 30+ messages in thread
From: Zhao Liu @ 2024-05-28  8:37 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Mads Ynddal, Paolo Bonzini, Philippe Mathieu-Daud�,
	Peter Maydell, Alex Benn�e, Daniel P . Berrang�,
	Thomas Huth, Markus Armbruster, qemu-devel

> > +    fn read_type(mut fobj: &File) -> Result<RecordType>
> > +    {
> > +        let mut tbuf = [0u8; 8];
> > +        if let Err(e) = fobj.read_exact(&mut tbuf) {
> > +            if e.kind() == ErrorKind::UnexpectedEof {
> > +                return Ok(RecordType::Empty);
> > +            } else {
> > +                return Err(Error::ReadFile(e));
> > +            }
> > +        }
> > +
> > +        /*
> > +         * Safe because the layout of the trace record requires us to parse
> > +         * the type first, and then there is a check on the validity of the
> > +         * record type.
> > +         */
> > +        let raw_t =
> > +            unsafe { std::mem::transmute::<[u8; 8], RecordRawType>(tbuf) };
> 
> A safe alternative: https://doc.rust-lang.org/std/primitive.u64.html#method.from_ne_bytes?

Thanks! Will use it.

> > +        match raw_t.rtype {
> > +            RECORD_TYPE_MAPPING => Ok(RecordType::Mapping),
> > +            RECORD_TYPE_EVENT => Ok(RecordType::Event),
> > +            _ => Err(Error::UnknownRecType(raw_t.rtype)),
> > +        }
> > +    }
> > +}

[snip]

> > +{
> > +    fn read_header(mut fobj: &File) -> Result<Self>
> > +    {
> > +        let mut raw_hdr = [0u8; 24];
> > +        fobj.read_exact(&mut raw_hdr).map_err(Error::ReadFile)?;
> > +
> > +        /*
> > +         * Safe because the size of log header (struct LogHeader)
> > +         * is 24 bytes, which is ensured by simple trace backend.
> > +         */
> > +        let hdr =
> > +            unsafe { std::mem::transmute::<[u8; 24], LogHeader>(raw_hdr) };
> 
> Or u64::from_ne_bytes() for each field.

Will do.

> > +        Ok(hdr)
> > +    }
> > +}

[snip]

> > +impl ReadHeader for RecordHeader
> > +{
> > +    fn read_header(mut fobj: &File) -> Result<Self>
> > +    {
> > +        let mut raw_hdr = [0u8; 24];
> > +        fobj.read_exact(&mut raw_hdr).map_err(Error::ReadFile)?;
> > +
> > +        /*
> > +         * Safe because the size of record header (struct RecordHeader)
> > +         * is 24 bytes, which is ensured by simple trace backend.
> > +         */
> > +        let hdr: RecordHeader =
> > +            unsafe { std::mem::transmute::<[u8; 24], RecordHeader>(raw_hdr) };
> 
> Or u64::from_ne_bytes() and u32::from_ne_bytes() for all fields.

Will do.

Thanks,
Zhao



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 4/6] scripts/simpletrace-rust: Parse and check trace recode file
  2024-05-27 20:44   ` Stefan Hajnoczi
@ 2024-05-28  9:30     ` Zhao Liu
  0 siblings, 0 replies; 30+ messages in thread
From: Zhao Liu @ 2024-05-28  9:30 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Mads Ynddal, Paolo Bonzini, Philippe Mathieu-Daud�,
	Peter Maydell, Alex Benn�e, Daniel P . Berrang�,
	Thomas Huth, Markus Armbruster, qemu-devel

> > +fn read_trace_records(
> > +    events: &Vec<Event>,
> > +    fobj: &File,
> > +    analyzer: &mut Formatter,
> > +    read_header: bool,
> > +) -> Result<Vec<String>>
> > +{
> > +    /* backtrace::Backtrace needs this env variable. */
> > +    env::set_var("RUST_BACKTRACE", "1");
> > +    let bt = Backtrace::new();
> > +    let raw_frame = bt.frames().first().unwrap();
> > +    let frameinfo = raw_frame.symbols().first().unwrap();
> > +
> > +    let dropped_event = Event::build(
> > +        "Dropped_Event(uint64_t num_events_dropped)",
> > +        frameinfo.lineno().unwrap() + 1,
> > +        frameinfo.filename().unwrap().to_str().unwrap(),
> > +    )
> > +    .unwrap();
> 
> Is the backtrace necessary? This trick was used in Python where it's
> part of the standard library, but I don't think there is any need for
> Dropped_Event to refer to this line in the source code.
>
> Maybe Rust has a way to do this at compile-time or you can hardcode
> values instead.

Thanks your reminder, I'll think about the hardcode approach as the
easiest solution...removing the backtrace can help a lot here.

Regards,
Zhao



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
  2024-05-28  6:48   ` Zhao Liu
@ 2024-05-28 13:05     ` Stefan Hajnoczi
  2024-05-29  9:33       ` Mads Ynddal
  0 siblings, 1 reply; 30+ messages in thread
From: Stefan Hajnoczi @ 2024-05-28 13:05 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Mads Ynddal, Paolo Bonzini, Philippe Mathieu-Daud�,
	Peter Maydell, Alex Benn�e, Daniel P . Berrang�,
	Thomas Huth, Markus Armbruster, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 9921 bytes --]

On Tue, May 28, 2024 at 02:48:42PM +0800, Zhao Liu wrote:
> Hi Stefan,
> 
> On Mon, May 27, 2024 at 03:59:44PM -0400, Stefan Hajnoczi wrote:
> > Date: Mon, 27 May 2024 15:59:44 -0400
> > From: Stefan Hajnoczi <stefanha@redhat.com>
> > Subject: Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
> > 
> > On Mon, May 27, 2024 at 04:14:15PM +0800, Zhao Liu wrote:
> > > Hi maintainers and list,
> > > 
> > > This RFC series attempts to re-implement simpletrace.py with Rust, which
> > > is the 1st task of Paolo's GSoC 2024 proposal.
> > > 
> > > There are two motivations for this work:
> > > 1. This is an open chance to discuss how to integrate Rust into QEMU.
> > > 2. Rust delivers faster parsing.
> > > 
> > > 
> > > Introduction
> > > ============
> > > 
> > > Code framework
> > > --------------
> > > 
> > > I choose "cargo" to organize the code, because the current
> > > implementation depends on external crates (Rust's library), such as
> > > "backtrace" for getting frameinfo, "clap" for parsing the cli, "rex" for
> > > regular matching, and so on. (Meson's support for external crates is
> > > still incomplete. [2])
> > > 
> > > The simpletrace-rust created in this series is not yet integrated into
> > > the QEMU compilation chain, so it can only be compiled independently, e.g.
> > > under ./scripts/simpletrace/, compile it be:
> > > 
> > >     cargo build --release
> > 
> > Please make sure it's built by .gitlab-ci.d/ so that the continuous
> > integration system prevents bitrot. You can add a job that runs the
> > cargo build.
> 
> Thanks! I'll do this.
> 
> > > 
> > > The code tree for the entire simpletrace-rust is as follows:
> > > 
> > > $ script/simpletrace-rust .
> > > .
> > > ├── Cargo.toml
> > > └── src
> > >     └── main.rs   // The simpletrace logic (similar to simpletrace.py).
> > >     └── trace.rs  // The Argument and Event abstraction (refer to
> > >                   // tracetool/__init__.py).
> > > 
> > > My question about meson v.s. cargo, I put it at the end of the cover
> > > letter (the section "Opens on Rust Support").
> > > 
> > > The following two sections are lessons I've learned from this Rust
> > > practice.
> > > 
> > > 
> > > Performance
> > > -----------
> > > 
> > > I did the performance comparison using the rust-simpletrace prototype with
> > > the python one:
> > > 
> > > * On the i7-10700 CPU @ 2.90GHz machine, parsing and outputting a 35M
> > > trace binary file for 10 times on each item:
> > > 
> > >                       AVE (ms)       Rust v.s. Python
> > > Rust   (stdout)       12687.16            114.46%
> > > Python (stdout)       14521.85
> > > 
> > > Rust   (file)          1422.44            264.99%
> > > Python (file)          3769.37
> > > 
> > > - The "stdout" lines represent output to the screen.
> > > - The "file" lines represent output to a file (via "> file").
> > > 
> > > This Rust version contains some optimizations (including print, regular
> > > matching, etc.), but there should be plenty of room for optimization.
> > > 
> > > The current performance bottleneck is the reading binary trace file,
> > > since I am parsing headers and event payloads one after the other, so
> > > that the IO read overhead accounts for 33%, which can be further
> > > optimized in the future.
> > 
> > Performance will become more important when large amounts of TCG data is
> > captured, as described in the project idea:
> > https://wiki.qemu.org/Internships/ProjectIdeas/TCGBinaryTracing
> > 
> > While I can't think of a time in the past where simpletrace.py's
> > performance bothered me, improving performance is still welcome. Just
> > don't spend too much time on performance (and making the code more
> > complex) unless there is a real need.
> 
> Yes, I agree that it shouldn't be over-optimized.
> 
> The logic in the current Rust version is pretty much a carbon copy of
> the Python version, without additional complex logic introduced, but the
> improvements in x2.6 were obtained by optimizing IO:
> 
> * reading the event configuration file, where I called the buffered
>   interface,
> * and the output formatted trace log, which I output all via std_out.lock()
>   followed by write_all().
> 
> So, just the simple tweak of the interfaces brings much benefits. :-)
> 
> > > Security
> > > --------
> > > 
> > > This is an example.
> > > 
> > > Rust is very strict about type-checking, and it found timestamp reversal
> > > issue in simpletrace-rust [3] (sorry, haven't gotten around to digging
> > > deeper with more time)...in this RFC, I workingaround it by allowing
> > > negative values. And the python version, just silently covered this
> > > issue up.
> > >
> > > Opens on Rust Support
> > > =====================
> > > 
> > > Meson v.s. Cargo
> > > ----------------
> > > 
> > > The first question is whether all Rust code (including under scripts)
> > > must be integrated into meson?
> > > 
> > > If so, because of [2] then I have to discard the external crates and
> > > build some more Rust wheels of my own to replace the previous external
> > > crates.
> > > 
> > > For the main part of the QEMU code, I think the answer must be Yes, but
> > > for the tools in the scripts directory, would it be possible to allow
> > > the use of cargo to build small tools/program for flexibility and
> > > migrate to meson later (as meson's support for rust becomes more
> > > mature)?
> > 
> > I have not seen a satisfying way to natively build Rust code using
> > meson. I remember reading about a tool that converts Cargo.toml files to
> > meson wrap files or something similar. That still doesn't feel great
> > because upstream works with Cargo and duplicating build information in
> > meson is a drag.
> > 
> > Calling cargo from meson is not ideal either, but it works and avoids
> > duplicating build information. This is the approach I would use for now
> > unless someone can point to an example of native Rust support in meson
> > that is clean.
> > 
> > Here is how libblkio calls cargo from meson:
> > https://gitlab.com/libblkio/libblkio/-/blob/main/src/meson.build
> > https://gitlab.com/libblkio/libblkio/-/blob/main/src/cargo-build.sh
> 
> Many thanks! This is a good example and I'll try to build similarly to
> it.
> 
> > > 
> > > 
> > > External crates
> > > ---------------
> > > 
> > > This is an additional question that naturally follows from the above
> > > question, do we have requirements for Rust's external crate? Is only std
> > > allowed?
> > 
> > There is no policy. My suggestion:
> > 
> > If you need a third-party crate then it's okay to use it, but try to
> > minimize dependencies. Avoid adding dependening for niceties that are
> > not strictly needed. Third-party crates are a burden for package
> > maintainers and anyone building from source. They increase the risk that
> > the code will fail to build. They can also be a security risk.
> 
> Thanks for the suggestion, that's clear to me, I'll try to control the
> third party dependencies.
> 
> > > 
> > > Welcome your feedback!
> > 
> > It would be easier to give feedback if you implement some examples of
> > TCG binary tracing before rewriting simpletrace.py. It's unclear to me
> > why simpletrace needs to be rewritten at this point. If you are
> > extending the simpletrace file format in ways that are not suitable for
> > Python or can demonstrate that Python performance is a problem, then
> > focussing on a Rust simpletrace implementation is more convincing.
> > 
> > Could you use simpletrace.py to develop TCG binary tracing first?
> 
> Yes, I can. :-)
> 
> Rewriting in Rust does sound duplicative, but I wonder if this could be
> viewed as an open opportunity to add Rust support for QEMU, looking at
> the scripts directory to start exploring Rust support/build would be
> a relatively easy place to start.
> 
> I think the exploration of Rust's build of the simpletrace tool under
> scripts parts can help with subsequent work on supporting Rust in other
> QEMU core parts.
> 
> From this point, may I ask your opinion?
> 
> Maybe later, Rust-simpletrace and python-simpletrace can differ, e.g.
> the former goes for performance and the latter for scalability.

Rewriting an existing, maintained component without buy-in from the
maintainers is risky. Mads is the maintainer of simpletrace.py and I am
the overall tracing maintainer. While the performance improvement is
nice, I'm a skeptical about the need for this and wonder whether thought
was put into how simpletrace should evolve.

There are disadvantages to maintaining multiple implementations:
- File format changes need to be coordinated across implementations to
  prevent compatibility issues. In other words, changing the
  trace-events format becomes harder and discourages future work.
- Multiple implementations makes life harder for users because they need
  to decide between implementations and understand the trade-offs.
- There is more maintenance overall.

I think we should have a single simpletrace implementation to avoid
these issues. The Python implementation is more convenient for
user-written trace analysis scripts. The Rust implementation has better
performance (although I'm not aware of efforts to improve the Python
implementation's performance, so who knows).

I'm ambivalent about why a reimplementation is necessary. What I would
like to see first is the TCG binary tracing functionality. Find the
limits of the Python simpletrace implementation and then it will be
clear whether a Rust reimplementation makes sense.

If Mads agrees, I am happy with a Rust reimplementation, but please
demonstrate why a reimplementation is necessary first.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 1/6] scripts/simpletrace-rust: Add the basic cargo framework
  2024-05-28  7:53     ` Zhao Liu
@ 2024-05-28 14:14       ` Stefan Hajnoczi
  2024-05-29 14:30         ` Zhao Liu
  0 siblings, 1 reply; 30+ messages in thread
From: Stefan Hajnoczi @ 2024-05-28 14:14 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Mads Ynddal, Paolo Bonzini, Philippe Mathieu-Daud�,
	Peter Maydell, Alex Benn�e, Daniel P . Berrang�,
	Thomas Huth, Markus Armbruster, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 1021 bytes --]

On Tue, May 28, 2024 at 03:53:55PM +0800, Zhao Liu wrote:
> Hi Stefan,
> 
> [snip]
> 
> > > diff --git a/scripts/simpletrace-rust/.rustfmt.toml b/scripts/simpletrace-rust/.rustfmt.toml
> > > new file mode 100644
> > > index 000000000000..97a97c24ebfb
> > > --- /dev/null
> > > +++ b/scripts/simpletrace-rust/.rustfmt.toml
> > > @@ -0,0 +1,9 @@
> > > +brace_style = "AlwaysNextLine"
> > > +comment_width = 80
> > > +edition = "2021"
> > > +group_imports = "StdExternalCrate"
> > > +imports_granularity = "item"
> > > +max_width = 80
> > > +use_field_init_shorthand = true
> > > +use_try_shorthand = true
> > > +wrap_comments = true
> > 
> > There should be QEMU-wide policy. That said, why is it necessary to customize rustfmt?
> 
> Indeed, but QEMU's style for Rust is currently undefined, so I'm trying
> to add this to make it easier to check the style...I will separate it
> out as a style policy proposal.

Why is a config file necessary? QEMU should use the default Rust style.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
  2024-05-28 13:05     ` Stefan Hajnoczi
@ 2024-05-29  9:33       ` Mads Ynddal
  2024-05-29 14:10         ` Zhao Liu
  0 siblings, 1 reply; 30+ messages in thread
From: Mads Ynddal @ 2024-05-29  9:33 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Zhao Liu, Paolo Bonzini, Philippe Mathieu-Daud�,
	Peter Maydell, Alex Benn�e,
	"Daniel P . Berrang�", Thomas Huth,
	Markus Armbruster, qemu-devel


>> Maybe later, Rust-simpletrace and python-simpletrace can differ, e.g.
>> the former goes for performance and the latter for scalability.
> 
> Rewriting an existing, maintained component without buy-in from the
> maintainers is risky. Mads is the maintainer of simpletrace.py and I am
> the overall tracing maintainer. While the performance improvement is
> nice, I'm a skeptical about the need for this and wonder whether thought
> was put into how simpletrace should evolve.
> 
> There are disadvantages to maintaining multiple implementations:
> - File format changes need to be coordinated across implementations to
>  prevent compatibility issues. In other words, changing the
>  trace-events format becomes harder and discourages future work.
> - Multiple implementations makes life harder for users because they need
>  to decide between implementations and understand the trade-offs.
> - There is more maintenance overall.
> 
> I think we should have a single simpletrace implementation to avoid
> these issues. The Python implementation is more convenient for
> user-written trace analysis scripts. The Rust implementation has better
> performance (although I'm not aware of efforts to improve the Python
> implementation's performance, so who knows).
> 
> I'm ambivalent about why a reimplementation is necessary. What I would
> like to see first is the TCG binary tracing functionality. Find the
> limits of the Python simpletrace implementation and then it will be
> clear whether a Rust reimplementation makes sense.
> 
> If Mads agrees, I am happy with a Rust reimplementation, but please
> demonstrate why a reimplementation is necessary first.
> 
> Stefan

I didn't want to shoot down the idea, since it seemed like somebody had a plan
with GSoC. But I actually agree, that I'm not quite convinced.

I think I'd need to see some data that showed the Python version is inadequate.
I know Python is not the fastest, but is it so prohibitively slow, that we
cannot make the TCG analysis? I'm not saying it can't be true, but it'd be nice
to see it demonstrated before making decisions.

Because, as you point out, there's a lot of downsides to having two versions. So
the benefits have to clearly outweigh the additional work.

I have a lot of other questions, but let's maybe start with the core idea first.

—
Mads Ynddal



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
  2024-05-29  9:33       ` Mads Ynddal
@ 2024-05-29 14:10         ` Zhao Liu
  2024-05-29 18:44           ` Stefan Hajnoczi
  0 siblings, 1 reply; 30+ messages in thread
From: Zhao Liu @ 2024-05-29 14:10 UTC (permalink / raw)
  To: Stefan Hajnoczi, Mads Ynddal
  Cc: Paolo Bonzini, Philippe Mathieu-Daud�, Peter Maydell,
	Alex Benn�e, "Daniel P . Berrang�",
	Thomas Huth, Markus Armbruster, qemu-devel

Hi Stefan and Mads,

On Wed, May 29, 2024 at 11:33:42AM +0200, Mads Ynddal wrote:
> Date: Wed, 29 May 2024 11:33:42 +0200
> From: Mads Ynddal <mads@ynddal.dk>
> Subject: Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
> X-Mailer: Apple Mail (2.3774.600.62)
> 
> 
> >> Maybe later, Rust-simpletrace and python-simpletrace can differ, e.g.
> >> the former goes for performance and the latter for scalability.
> > 
> > Rewriting an existing, maintained component without buy-in from the
> > maintainers is risky. Mads is the maintainer of simpletrace.py and I am
> > the overall tracing maintainer. While the performance improvement is
> > nice, I'm a skeptical about the need for this and wonder whether thought
> > was put into how simpletrace should evolve.
> > 
> > There are disadvantages to maintaining multiple implementations:
> > - File format changes need to be coordinated across implementations to
> >  prevent compatibility issues. In other words, changing the
> >  trace-events format becomes harder and discourages future work.
> > - Multiple implementations makes life harder for users because they need
> >  to decide between implementations and understand the trade-offs.
> > - There is more maintenance overall.
> > 
> > I think we should have a single simpletrace implementation to avoid
> > these issues. The Python implementation is more convenient for
> > user-written trace analysis scripts. The Rust implementation has better
> > performance (although I'm not aware of efforts to improve the Python
> > implementation's performance, so who knows).
> > 
> > I'm ambivalent about why a reimplementation is necessary. What I would
> > like to see first is the TCG binary tracing functionality. Find the
> > limits of the Python simpletrace implementation and then it will be
> > clear whether a Rust reimplementation makes sense.
> > 
> > If Mads agrees, I am happy with a Rust reimplementation, but please
> > demonstrate why a reimplementation is necessary first.
> > 
> > Stefan
> 
> I didn't want to shoot down the idea, since it seemed like somebody had a plan
> with GSoC. But I actually agree, that I'm not quite convinced.
> 
> I think I'd need to see some data that showed the Python version is inadequate.
> I know Python is not the fastest, but is it so prohibitively slow, that we
> cannot make the TCG analysis? I'm not saying it can't be true, but it'd be nice
> to see it demonstrated before making decisions.
> 
> Because, as you point out, there's a lot of downsides to having two versions. So
> the benefits have to clearly outweigh the additional work.
> 
> I have a lot of other questions, but let's maybe start with the core idea first.
> 
> —
> Mads Ynddal
>

I really appreciate your patience and explanations, and your feedback
and review has helped me a lot!

Yes, simple repetition creates much maintenance burden (though I'm happy
to help with), and the argument for current performance isn't convincing
enough.

Getting back to the project itself, as you have said, the core is still
further support for TCG-related traces, and I'll continue to work on it,
and then look back based on such work to see what issues there are with
traces or what improvements can be made.

Thanks again!

Regards,
Zhao



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 1/6] scripts/simpletrace-rust: Add the basic cargo framework
  2024-05-28 14:14       ` Stefan Hajnoczi
@ 2024-05-29 14:30         ` Zhao Liu
  2024-05-29 18:41           ` Stefan Hajnoczi
  0 siblings, 1 reply; 30+ messages in thread
From: Zhao Liu @ 2024-05-29 14:30 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Mads Ynddal, Paolo Bonzini, Philippe Mathieu-Daud�,
	Peter Maydell, Alex Benn�e, Daniel P . Berrang�,
	Thomas Huth, Markus Armbruster, qemu-devel

Hi Stefan,

On Tue, May 28, 2024 at 10:14:01AM -0400, Stefan Hajnoczi wrote:
> Date: Tue, 28 May 2024 10:14:01 -0400
> From: Stefan Hajnoczi <stefanha@redhat.com>
> Subject: Re: [RFC 1/6] scripts/simpletrace-rust: Add the basic cargo
>  framework
> 
> On Tue, May 28, 2024 at 03:53:55PM +0800, Zhao Liu wrote:
> > Hi Stefan,
> > 
> > [snip]
> > 
> > > > diff --git a/scripts/simpletrace-rust/.rustfmt.toml b/scripts/simpletrace-rust/.rustfmt.toml
> > > > new file mode 100644
> > > > index 000000000000..97a97c24ebfb
> > > > --- /dev/null
> > > > +++ b/scripts/simpletrace-rust/.rustfmt.toml
> > > > @@ -0,0 +1,9 @@
> > > > +brace_style = "AlwaysNextLine"
> > > > +comment_width = 80
> > > > +edition = "2021"
> > > > +group_imports = "StdExternalCrate"
> > > > +imports_granularity = "item"
> > > > +max_width = 80
> > > > +use_field_init_shorthand = true
> > > > +use_try_shorthand = true
> > > > +wrap_comments = true
> > > 
> > > There should be QEMU-wide policy. That said, why is it necessary to customize rustfmt?
> > 
> > Indeed, but QEMU's style for Rust is currently undefined, so I'm trying
> > to add this to make it easier to check the style...I will separate it
> > out as a style policy proposal.
> 
> Why is a config file necessary? QEMU should use the default Rust style.
> 

There are some that may be overdone, but I think some basic may still
be necessary, like "comment_width = 80", "max_width = 80",
"wrap_comments". Is it necessary to specify the width? As C.

And, "group_imports" and "imports_granularity" (refered from crosvm),
can also be used to standardize including styles and improve
readability, since importing can be done in many different styles.

This fmt config is something like ./script/check_patch.pl for QEMU/linux.
Different programs have different practices, so I feel like that's an
open too!

This certainly also depends on the maintainer's/your preferences, ;-)
in what way looks more comfortable/convenient that is the best,
completely according to the default is also good.



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 1/6] scripts/simpletrace-rust: Add the basic cargo framework
  2024-05-29 14:30         ` Zhao Liu
@ 2024-05-29 18:41           ` Stefan Hajnoczi
  2024-05-31 12:22             ` Daniel P. Berrangé
  0 siblings, 1 reply; 30+ messages in thread
From: Stefan Hajnoczi @ 2024-05-29 18:41 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Mads Ynddal, Paolo Bonzini, Philippe Mathieu-Daud�,
	Peter Maydell, Alex Benn�e, Daniel P . Berrang�,
	Thomas Huth, Markus Armbruster, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 2752 bytes --]

On Wed, May 29, 2024 at 10:30:13PM +0800, Zhao Liu wrote:
> Hi Stefan,
> 
> On Tue, May 28, 2024 at 10:14:01AM -0400, Stefan Hajnoczi wrote:
> > Date: Tue, 28 May 2024 10:14:01 -0400
> > From: Stefan Hajnoczi <stefanha@redhat.com>
> > Subject: Re: [RFC 1/6] scripts/simpletrace-rust: Add the basic cargo
> >  framework
> > 
> > On Tue, May 28, 2024 at 03:53:55PM +0800, Zhao Liu wrote:
> > > Hi Stefan,
> > > 
> > > [snip]
> > > 
> > > > > diff --git a/scripts/simpletrace-rust/.rustfmt.toml b/scripts/simpletrace-rust/.rustfmt.toml
> > > > > new file mode 100644
> > > > > index 000000000000..97a97c24ebfb
> > > > > --- /dev/null
> > > > > +++ b/scripts/simpletrace-rust/.rustfmt.toml
> > > > > @@ -0,0 +1,9 @@
> > > > > +brace_style = "AlwaysNextLine"
> > > > > +comment_width = 80
> > > > > +edition = "2021"
> > > > > +group_imports = "StdExternalCrate"
> > > > > +imports_granularity = "item"
> > > > > +max_width = 80
> > > > > +use_field_init_shorthand = true
> > > > > +use_try_shorthand = true
> > > > > +wrap_comments = true
> > > > 
> > > > There should be QEMU-wide policy. That said, why is it necessary to customize rustfmt?
> > > 
> > > Indeed, but QEMU's style for Rust is currently undefined, so I'm trying
> > > to add this to make it easier to check the style...I will separate it
> > > out as a style policy proposal.
> > 
> > Why is a config file necessary? QEMU should use the default Rust style.
> > 
> 
> There are some that may be overdone, but I think some basic may still
> be necessary, like "comment_width = 80", "max_width = 80",
> "wrap_comments". Is it necessary to specify the width? As C.

Let's agree to follow the Rust coding style from the start, then the
problem is solved. My view is that deviating from the standard Rust
coding style in order to make QEMU Rust code resemble QEMU C code is
less helpful than following Rust conventions so our Rust code looks like
Rust.

> 
> And, "group_imports" and "imports_granularity" (refered from crosvm),
> can also be used to standardize including styles and improve
> readability, since importing can be done in many different styles.
> 
> This fmt config is something like ./script/check_patch.pl for QEMU/linux.
> Different programs have different practices, so I feel like that's an
> open too!

In languages like Rust that have a standard, let's follow the standard
instead of inventing our own way of formatting code.

> This certainly also depends on the maintainer's/your preferences, ;-)
> in what way looks more comfortable/convenient that is the best,
> completely according to the default is also good.

This will probably affect all Rust code in QEMU so everyone's opinion
counts.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
  2024-05-29 14:10         ` Zhao Liu
@ 2024-05-29 18:44           ` Stefan Hajnoczi
  0 siblings, 0 replies; 30+ messages in thread
From: Stefan Hajnoczi @ 2024-05-29 18:44 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Mads Ynddal, Paolo Bonzini, Philippe Mathieu-Daud�,
	Peter Maydell, Alex Benn�e,
	"Daniel P . Berrang�", Thomas Huth,
	Markus Armbruster, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 3592 bytes --]

On Wed, May 29, 2024 at 10:10:00PM +0800, Zhao Liu wrote:
> Hi Stefan and Mads,
> 
> On Wed, May 29, 2024 at 11:33:42AM +0200, Mads Ynddal wrote:
> > Date: Wed, 29 May 2024 11:33:42 +0200
> > From: Mads Ynddal <mads@ynddal.dk>
> > Subject: Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
> > X-Mailer: Apple Mail (2.3774.600.62)
> > 
> > 
> > >> Maybe later, Rust-simpletrace and python-simpletrace can differ, e.g.
> > >> the former goes for performance and the latter for scalability.
> > > 
> > > Rewriting an existing, maintained component without buy-in from the
> > > maintainers is risky. Mads is the maintainer of simpletrace.py and I am
> > > the overall tracing maintainer. While the performance improvement is
> > > nice, I'm a skeptical about the need for this and wonder whether thought
> > > was put into how simpletrace should evolve.
> > > 
> > > There are disadvantages to maintaining multiple implementations:
> > > - File format changes need to be coordinated across implementations to
> > >  prevent compatibility issues. In other words, changing the
> > >  trace-events format becomes harder and discourages future work.
> > > - Multiple implementations makes life harder for users because they need
> > >  to decide between implementations and understand the trade-offs.
> > > - There is more maintenance overall.
> > > 
> > > I think we should have a single simpletrace implementation to avoid
> > > these issues. The Python implementation is more convenient for
> > > user-written trace analysis scripts. The Rust implementation has better
> > > performance (although I'm not aware of efforts to improve the Python
> > > implementation's performance, so who knows).
> > > 
> > > I'm ambivalent about why a reimplementation is necessary. What I would
> > > like to see first is the TCG binary tracing functionality. Find the
> > > limits of the Python simpletrace implementation and then it will be
> > > clear whether a Rust reimplementation makes sense.
> > > 
> > > If Mads agrees, I am happy with a Rust reimplementation, but please
> > > demonstrate why a reimplementation is necessary first.
> > > 
> > > Stefan
> > 
> > I didn't want to shoot down the idea, since it seemed like somebody had a plan
> > with GSoC. But I actually agree, that I'm not quite convinced.
> > 
> > I think I'd need to see some data that showed the Python version is inadequate.
> > I know Python is not the fastest, but is it so prohibitively slow, that we
> > cannot make the TCG analysis? I'm not saying it can't be true, but it'd be nice
> > to see it demonstrated before making decisions.
> > 
> > Because, as you point out, there's a lot of downsides to having two versions. So
> > the benefits have to clearly outweigh the additional work.
> > 
> > I have a lot of other questions, but let's maybe start with the core idea first.
> > 
> > —
> > Mads Ynddal
> >
> 
> I really appreciate your patience and explanations, and your feedback
> and review has helped me a lot!
> 
> Yes, simple repetition creates much maintenance burden (though I'm happy
> to help with), and the argument for current performance isn't convincing
> enough.
> 
> Getting back to the project itself, as you have said, the core is still
> further support for TCG-related traces, and I'll continue to work on it,
> and then look back based on such work to see what issues there are with
> traces or what improvements can be made.

Thanks for doing that and sorry for holding up the work you have already
done!

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 1/6] scripts/simpletrace-rust: Add the basic cargo framework
  2024-05-29 18:41           ` Stefan Hajnoczi
@ 2024-05-31 12:22             ` Daniel P. Berrangé
  0 siblings, 0 replies; 30+ messages in thread
From: Daniel P. Berrangé @ 2024-05-31 12:22 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Zhao Liu, Mads Ynddal, Paolo Bonzini,
	Philippe Mathieu-Daud�, Peter Maydell, Alex Benn�e,
	Thomas Huth, Markus Armbruster, qemu-devel

On Wed, May 29, 2024 at 02:41:03PM -0400, Stefan Hajnoczi wrote:
> On Wed, May 29, 2024 at 10:30:13PM +0800, Zhao Liu wrote:
> > Hi Stefan,
> > 
> > On Tue, May 28, 2024 at 10:14:01AM -0400, Stefan Hajnoczi wrote:
> > > Date: Tue, 28 May 2024 10:14:01 -0400
> > > From: Stefan Hajnoczi <stefanha@redhat.com>
> > > Subject: Re: [RFC 1/6] scripts/simpletrace-rust: Add the basic cargo
> > >  framework
> > > 
> > > On Tue, May 28, 2024 at 03:53:55PM +0800, Zhao Liu wrote:
> > > > Hi Stefan,
> > > > 
> > > > [snip]
> > > > 
> > > > > > diff --git a/scripts/simpletrace-rust/.rustfmt.toml b/scripts/simpletrace-rust/.rustfmt.toml
> > > > > > new file mode 100644
> > > > > > index 000000000000..97a97c24ebfb
> > > > > > --- /dev/null
> > > > > > +++ b/scripts/simpletrace-rust/.rustfmt.toml
> > > > > > @@ -0,0 +1,9 @@
> > > > > > +brace_style = "AlwaysNextLine"
> > > > > > +comment_width = 80
> > > > > > +edition = "2021"
> > > > > > +group_imports = "StdExternalCrate"
> > > > > > +imports_granularity = "item"
> > > > > > +max_width = 80
> > > > > > +use_field_init_shorthand = true
> > > > > > +use_try_shorthand = true
> > > > > > +wrap_comments = true
> > > > > 
> > > > > There should be QEMU-wide policy. That said, why is it necessary to customize rustfmt?
> > > > 
> > > > Indeed, but QEMU's style for Rust is currently undefined, so I'm trying
> > > > to add this to make it easier to check the style...I will separate it
> > > > out as a style policy proposal.
> > > 
> > > Why is a config file necessary? QEMU should use the default Rust style.
> > > 
> > 
> > There are some that may be overdone, but I think some basic may still
> > be necessary, like "comment_width = 80", "max_width = 80",
> > "wrap_comments". Is it necessary to specify the width? As C.
> 
> Let's agree to follow the Rust coding style from the start, then the
> problem is solved. My view is that deviating from the standard Rust
> coding style in order to make QEMU Rust code resemble QEMU C code is
> less helpful than following Rust conventions so our Rust code looks like
> Rust.

Agreed. The value of a language wide standard is undermined if apps
diverge from it.

All code style rules come down to bike shedding, and by simply adopting
the Rust community defaults, we avoid endless debates as to what style
is best, and stay aligned with the rest of the Rust community who mostly
won't override defaults for rustfmt.


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
  2024-05-27  8:14 [RFC 0/6] scripts: Rewrite simpletrace printer in Rust Zhao Liu
                   ` (8 preceding siblings ...)
  2024-05-27 19:59 ` Stefan Hajnoczi
@ 2024-05-31 12:27 ` Daniel P. Berrangé
  2024-05-31 14:55   ` Alex Bennée
  9 siblings, 1 reply; 30+ messages in thread
From: Daniel P. Berrangé @ 2024-05-31 12:27 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Stefan Hajnoczi, Mads Ynddal, Paolo Bonzini,
	Philippe Mathieu-Daudé, Peter Maydell, Alex Bennée,
	Thomas Huth, Markus Armbruster, qemu-devel

On Mon, May 27, 2024 at 04:14:15PM +0800, Zhao Liu wrote:
> Hi maintainers and list,
> 
> This RFC series attempts to re-implement simpletrace.py with Rust, which
> is the 1st task of Paolo's GSoC 2024 proposal.
> 
> There are two motivations for this work:
> 1. This is an open chance to discuss how to integrate Rust into QEMU.

I don't think this proposal really triggers that discussion to any
great extent, because 'simpletrace.py' is not a part of the QEMU
codebase in any meaningul way. It is a standalone python script
that just happens to live in the qemu.git repository.

The difficult questions around Rust integration will arise when we
want to have Rust used to /replace/ some existing non-optional
functionality. IOW, if Rust were to become a mandatory dep of QEMU
that could not be avoided.

In fact I kinda wonder whether this Rust simpletrace code could
simply be its own git repo under gitlab.com/qemu-project/....,
rather than put into the monolithic QEMU repo ? That just makes
it a "normal" Rust project and no questions around integration
with QEMU's traditional build system would arise.


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
  2024-05-31 12:27 ` Daniel P. Berrangé
@ 2024-05-31 14:55   ` Alex Bennée
  0 siblings, 0 replies; 30+ messages in thread
From: Alex Bennée @ 2024-05-31 14:55 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Zhao Liu, Stefan Hajnoczi, Mads Ynddal, Paolo Bonzini,
	Philippe Mathieu-Daudé, Peter Maydell, Thomas Huth,
	Markus Armbruster, qemu-devel, Manos Pitsidianakis

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Mon, May 27, 2024 at 04:14:15PM +0800, Zhao Liu wrote:
>> Hi maintainers and list,
>> 
>> This RFC series attempts to re-implement simpletrace.py with Rust, which
>> is the 1st task of Paolo's GSoC 2024 proposal.
>> 
>> There are two motivations for this work:
>> 1. This is an open chance to discuss how to integrate Rust into QEMU.
>
> I don't think this proposal really triggers that discussion to any
> great extent, because 'simpletrace.py' is not a part of the QEMU
> codebase in any meaningul way. It is a standalone python script
> that just happens to live in the qemu.git repository.
>
> The difficult questions around Rust integration will arise when we
> want to have Rust used to /replace/ some existing non-optional
> functionality. IOW, if Rust were to become a mandatory dep of QEMU
> that could not be avoided.

We hope to post some PoC device models in Rust soon with exactly that
debate in mind.

> In fact I kinda wonder whether this Rust simpletrace code could
> simply be its own git repo under gitlab.com/qemu-project/....,
> rather than put into the monolithic QEMU repo ? That just makes
> it a "normal" Rust project and no questions around integration
> with QEMU's traditional build system would arise.

Do we export the necessary bits for external projects to use? I don't
think we've had the equivalent of a qemu-devel package yet and doing so
start implying externally visible versioning and deprecation policies
for QEMU APIs and data formats.

TCG plugins have a header based API but currently everyone builds
against their local checkout and we are fairly liberal about bumping the
API versions.


>
>
> With regards,
> Daniel

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2024-05-31 14:56 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-27  8:14 [RFC 0/6] scripts: Rewrite simpletrace printer in Rust Zhao Liu
2024-05-27  8:14 ` [RFC 1/6] scripts/simpletrace-rust: Add the basic cargo framework Zhao Liu
2024-05-27 20:05   ` Stefan Hajnoczi
2024-05-28  7:53     ` Zhao Liu
2024-05-28 14:14       ` Stefan Hajnoczi
2024-05-29 14:30         ` Zhao Liu
2024-05-29 18:41           ` Stefan Hajnoczi
2024-05-31 12:22             ` Daniel P. Berrangé
2024-05-27  8:14 ` [RFC 2/6] scripts/simpletrace-rust: Support Event & Arguments in trace module Zhao Liu
2024-05-27 20:33   ` Stefan Hajnoczi
2024-05-28  8:32     ` Zhao Liu
2024-05-27  8:14 ` [RFC 3/6] scripts/simpletrace-rust: Add helpers to parse trace file Zhao Liu
2024-05-27 20:39   ` Stefan Hajnoczi
2024-05-28  8:37     ` Zhao Liu
2024-05-27  8:14 ` [RFC 4/6] scripts/simpletrace-rust: Parse and check trace recode file Zhao Liu
2024-05-27 20:44   ` Stefan Hajnoczi
2024-05-28  9:30     ` Zhao Liu
2024-05-27  8:14 ` [RFC 5/6] scripts/simpletrace-rust: Format simple trace output Zhao Liu
2024-05-27  8:14 ` [RFC 6/6] docs/tracing: Add simpletrace-rust section Zhao Liu
2024-05-27 10:29 ` [RFC 0/6] scripts: Rewrite simpletrace printer in Rust Philippe Mathieu-Daudé
2024-05-27 10:49 ` Mads Ynddal
2024-05-28  6:15   ` Zhao Liu
2024-05-27 19:59 ` Stefan Hajnoczi
2024-05-28  6:48   ` Zhao Liu
2024-05-28 13:05     ` Stefan Hajnoczi
2024-05-29  9:33       ` Mads Ynddal
2024-05-29 14:10         ` Zhao Liu
2024-05-29 18:44           ` Stefan Hajnoczi
2024-05-31 12:27 ` Daniel P. Berrangé
2024-05-31 14:55   ` Alex Bennée

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).