All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stefan Hajnoczi <stefanha@redhat.com>
To: Zhao Liu <zhao1.liu@intel.com>
Cc: "Mads Ynddal" <mads@ynddal.dk>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Philippe Mathieu-Daud�" <philmd@linaro.org>,
	"Peter Maydell" <peter.maydell@linaro.org>,
	"Alex Benn�e" <alex.bennee@linaro.org>,
	"Daniel P . Berrang�" <berrange@redhat.com>,
	"Thomas Huth" <thuth@redhat.com>,
	"Markus Armbruster" <armbru@redhat.com>,
	qemu-devel@nongnu.org
Subject: Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
Date: Tue, 28 May 2024 09:05:18 -0400	[thread overview]
Message-ID: <20240528130518.GA993828@fedora.redhat.com> (raw)
In-Reply-To: <ZlV+Su4hziCFymVt@intel.com>

[-- Attachment #1: Type: text/plain, Size: 9921 bytes --]

On Tue, May 28, 2024 at 02:48:42PM +0800, Zhao Liu wrote:
> Hi Stefan,
> 
> On Mon, May 27, 2024 at 03:59:44PM -0400, Stefan Hajnoczi wrote:
> > Date: Mon, 27 May 2024 15:59:44 -0400
> > From: Stefan Hajnoczi <stefanha@redhat.com>
> > Subject: Re: [RFC 0/6] scripts: Rewrite simpletrace printer in Rust
> > 
> > On Mon, May 27, 2024 at 04:14:15PM +0800, Zhao Liu wrote:
> > > Hi maintainers and list,
> > > 
> > > This RFC series attempts to re-implement simpletrace.py with Rust, which
> > > is the 1st task of Paolo's GSoC 2024 proposal.
> > > 
> > > There are two motivations for this work:
> > > 1. This is an open chance to discuss how to integrate Rust into QEMU.
> > > 2. Rust delivers faster parsing.
> > > 
> > > 
> > > Introduction
> > > ============
> > > 
> > > Code framework
> > > --------------
> > > 
> > > I choose "cargo" to organize the code, because the current
> > > implementation depends on external crates (Rust's library), such as
> > > "backtrace" for getting frameinfo, "clap" for parsing the cli, "rex" for
> > > regular matching, and so on. (Meson's support for external crates is
> > > still incomplete. [2])
> > > 
> > > The simpletrace-rust created in this series is not yet integrated into
> > > the QEMU compilation chain, so it can only be compiled independently, e.g.
> > > under ./scripts/simpletrace/, compile it be:
> > > 
> > >     cargo build --release
> > 
> > Please make sure it's built by .gitlab-ci.d/ so that the continuous
> > integration system prevents bitrot. You can add a job that runs the
> > cargo build.
> 
> Thanks! I'll do this.
> 
> > > 
> > > The code tree for the entire simpletrace-rust is as follows:
> > > 
> > > $ script/simpletrace-rust .
> > > .
> > > ├── Cargo.toml
> > > └── src
> > >     └── main.rs   // The simpletrace logic (similar to simpletrace.py).
> > >     └── trace.rs  // The Argument and Event abstraction (refer to
> > >                   // tracetool/__init__.py).
> > > 
> > > My question about meson v.s. cargo, I put it at the end of the cover
> > > letter (the section "Opens on Rust Support").
> > > 
> > > The following two sections are lessons I've learned from this Rust
> > > practice.
> > > 
> > > 
> > > Performance
> > > -----------
> > > 
> > > I did the performance comparison using the rust-simpletrace prototype with
> > > the python one:
> > > 
> > > * On the i7-10700 CPU @ 2.90GHz machine, parsing and outputting a 35M
> > > trace binary file for 10 times on each item:
> > > 
> > >                       AVE (ms)       Rust v.s. Python
> > > Rust   (stdout)       12687.16            114.46%
> > > Python (stdout)       14521.85
> > > 
> > > Rust   (file)          1422.44            264.99%
> > > Python (file)          3769.37
> > > 
> > > - The "stdout" lines represent output to the screen.
> > > - The "file" lines represent output to a file (via "> file").
> > > 
> > > This Rust version contains some optimizations (including print, regular
> > > matching, etc.), but there should be plenty of room for optimization.
> > > 
> > > The current performance bottleneck is the reading binary trace file,
> > > since I am parsing headers and event payloads one after the other, so
> > > that the IO read overhead accounts for 33%, which can be further
> > > optimized in the future.
> > 
> > Performance will become more important when large amounts of TCG data is
> > captured, as described in the project idea:
> > https://wiki.qemu.org/Internships/ProjectIdeas/TCGBinaryTracing
> > 
> > While I can't think of a time in the past where simpletrace.py's
> > performance bothered me, improving performance is still welcome. Just
> > don't spend too much time on performance (and making the code more
> > complex) unless there is a real need.
> 
> Yes, I agree that it shouldn't be over-optimized.
> 
> The logic in the current Rust version is pretty much a carbon copy of
> the Python version, without additional complex logic introduced, but the
> improvements in x2.6 were obtained by optimizing IO:
> 
> * reading the event configuration file, where I called the buffered
>   interface,
> * and the output formatted trace log, which I output all via std_out.lock()
>   followed by write_all().
> 
> So, just the simple tweak of the interfaces brings much benefits. :-)
> 
> > > Security
> > > --------
> > > 
> > > This is an example.
> > > 
> > > Rust is very strict about type-checking, and it found timestamp reversal
> > > issue in simpletrace-rust [3] (sorry, haven't gotten around to digging
> > > deeper with more time)...in this RFC, I workingaround it by allowing
> > > negative values. And the python version, just silently covered this
> > > issue up.
> > >
> > > Opens on Rust Support
> > > =====================
> > > 
> > > Meson v.s. Cargo
> > > ----------------
> > > 
> > > The first question is whether all Rust code (including under scripts)
> > > must be integrated into meson?
> > > 
> > > If so, because of [2] then I have to discard the external crates and
> > > build some more Rust wheels of my own to replace the previous external
> > > crates.
> > > 
> > > For the main part of the QEMU code, I think the answer must be Yes, but
> > > for the tools in the scripts directory, would it be possible to allow
> > > the use of cargo to build small tools/program for flexibility and
> > > migrate to meson later (as meson's support for rust becomes more
> > > mature)?
> > 
> > I have not seen a satisfying way to natively build Rust code using
> > meson. I remember reading about a tool that converts Cargo.toml files to
> > meson wrap files or something similar. That still doesn't feel great
> > because upstream works with Cargo and duplicating build information in
> > meson is a drag.
> > 
> > Calling cargo from meson is not ideal either, but it works and avoids
> > duplicating build information. This is the approach I would use for now
> > unless someone can point to an example of native Rust support in meson
> > that is clean.
> > 
> > Here is how libblkio calls cargo from meson:
> > https://gitlab.com/libblkio/libblkio/-/blob/main/src/meson.build
> > https://gitlab.com/libblkio/libblkio/-/blob/main/src/cargo-build.sh
> 
> Many thanks! This is a good example and I'll try to build similarly to
> it.
> 
> > > 
> > > 
> > > External crates
> > > ---------------
> > > 
> > > This is an additional question that naturally follows from the above
> > > question, do we have requirements for Rust's external crate? Is only std
> > > allowed?
> > 
> > There is no policy. My suggestion:
> > 
> > If you need a third-party crate then it's okay to use it, but try to
> > minimize dependencies. Avoid adding dependening for niceties that are
> > not strictly needed. Third-party crates are a burden for package
> > maintainers and anyone building from source. They increase the risk that
> > the code will fail to build. They can also be a security risk.
> 
> Thanks for the suggestion, that's clear to me, I'll try to control the
> third party dependencies.
> 
> > > 
> > > Welcome your feedback!
> > 
> > It would be easier to give feedback if you implement some examples of
> > TCG binary tracing before rewriting simpletrace.py. It's unclear to me
> > why simpletrace needs to be rewritten at this point. If you are
> > extending the simpletrace file format in ways that are not suitable for
> > Python or can demonstrate that Python performance is a problem, then
> > focussing on a Rust simpletrace implementation is more convincing.
> > 
> > Could you use simpletrace.py to develop TCG binary tracing first?
> 
> Yes, I can. :-)
> 
> Rewriting in Rust does sound duplicative, but I wonder if this could be
> viewed as an open opportunity to add Rust support for QEMU, looking at
> the scripts directory to start exploring Rust support/build would be
> a relatively easy place to start.
> 
> I think the exploration of Rust's build of the simpletrace tool under
> scripts parts can help with subsequent work on supporting Rust in other
> QEMU core parts.
> 
> From this point, may I ask your opinion?
> 
> Maybe later, Rust-simpletrace and python-simpletrace can differ, e.g.
> the former goes for performance and the latter for scalability.

Rewriting an existing, maintained component without buy-in from the
maintainers is risky. Mads is the maintainer of simpletrace.py and I am
the overall tracing maintainer. While the performance improvement is
nice, I'm a skeptical about the need for this and wonder whether thought
was put into how simpletrace should evolve.

There are disadvantages to maintaining multiple implementations:
- File format changes need to be coordinated across implementations to
  prevent compatibility issues. In other words, changing the
  trace-events format becomes harder and discourages future work.
- Multiple implementations makes life harder for users because they need
  to decide between implementations and understand the trade-offs.
- There is more maintenance overall.

I think we should have a single simpletrace implementation to avoid
these issues. The Python implementation is more convenient for
user-written trace analysis scripts. The Rust implementation has better
performance (although I'm not aware of efforts to improve the Python
implementation's performance, so who knows).

I'm ambivalent about why a reimplementation is necessary. What I would
like to see first is the TCG binary tracing functionality. Find the
limits of the Python simpletrace implementation and then it will be
clear whether a Rust reimplementation makes sense.

If Mads agrees, I am happy with a Rust reimplementation, but please
demonstrate why a reimplementation is necessary first.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

  reply	other threads:[~2024-05-28 13:06 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-27  8:14 [RFC 0/6] scripts: Rewrite simpletrace printer in Rust Zhao Liu
2024-05-27  8:14 ` [RFC 1/6] scripts/simpletrace-rust: Add the basic cargo framework Zhao Liu
2024-05-27 20:05   ` Stefan Hajnoczi
2024-05-28  7:53     ` Zhao Liu
2024-05-28 14:14       ` Stefan Hajnoczi
2024-05-29 14:30         ` Zhao Liu
2024-05-29 18:41           ` Stefan Hajnoczi
2024-05-31 12:22             ` Daniel P. Berrangé
2024-05-27  8:14 ` [RFC 2/6] scripts/simpletrace-rust: Support Event & Arguments in trace module Zhao Liu
2024-05-27 20:33   ` Stefan Hajnoczi
2024-05-28  8:32     ` Zhao Liu
2024-05-27  8:14 ` [RFC 3/6] scripts/simpletrace-rust: Add helpers to parse trace file Zhao Liu
2024-05-27 20:39   ` Stefan Hajnoczi
2024-05-28  8:37     ` Zhao Liu
2024-05-27  8:14 ` [RFC 4/6] scripts/simpletrace-rust: Parse and check trace recode file Zhao Liu
2024-05-27 20:44   ` Stefan Hajnoczi
2024-05-28  9:30     ` Zhao Liu
2024-05-27  8:14 ` [RFC 5/6] scripts/simpletrace-rust: Format simple trace output Zhao Liu
2024-05-27  8:14 ` [RFC 6/6] docs/tracing: Add simpletrace-rust section Zhao Liu
2024-05-27 10:29 ` [RFC 0/6] scripts: Rewrite simpletrace printer in Rust Philippe Mathieu-Daudé
2024-05-27 10:49 ` Mads Ynddal
2024-05-28  6:15   ` Zhao Liu
2024-05-27 19:59 ` Stefan Hajnoczi
2024-05-28  6:48   ` Zhao Liu
2024-05-28 13:05     ` Stefan Hajnoczi [this message]
2024-05-29  9:33       ` Mads Ynddal
2024-05-29 14:10         ` Zhao Liu
2024-05-29 18:44           ` Stefan Hajnoczi
2024-05-31 12:27 ` Daniel P. Berrangé
2024-05-31 14:55   ` Alex Bennée

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240528130518.GA993828@fedora.redhat.com \
    --to=stefanha@redhat.com \
    --cc=alex.bennee@linaro.org \
    --cc=armbru@redhat.com \
    --cc=berrange@redhat.com \
    --cc=mads@ynddal.dk \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=philmd@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=thuth@redhat.com \
    --cc=zhao1.liu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.