[PULL 0/7] Rust, TCG, x86 patches for 2025-02-07

qemu-devel.nongnu.org archive mirror

* [PULL 0/7] Rust, TCG, x86 patches for 2025-02-07
@ 2025-02-07 10:27 Paolo Bonzini
  2025-02-07 10:27 ` [PULL 1/7] rust: remove unnecessary Cargo.toml metadata Paolo Bonzini
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: Paolo Bonzini @ 2025-02-07 10:27 UTC
  To: qemu-devel

The following changes since commit 131c58469f6fb68c89b38fee6aba8bbb20c7f4bf:

  rust: add --rust-target option for bindgen (2025-02-06 13:51:46 -0500)

are available in the Git repository at:

  https://gitlab.com/bonzini/qemu.git tags/for-upstream

for you to fetch changes up to 07f858bd1bf7cafc201789d7e379b2e95dc76946:

  tcg/optimize: optimize TSTNE using smask and zmask (2025-02-07 11:16:39 +0100)

----------------------------------------------------------------
* tcg/optimize: optimize TSTNE using smask and zmask
* target/i386: fix exceptions for 0 * Inf + QNaN
* rust: cleanups for configuration
* rust: add developer docs

----------------------------------------------------------------
Paolo Bonzini (5):
      rust: remove unnecessary Cargo.toml metadata
      rust: include rust_version in Cargo.toml
      rust: add docs
      rust: add clippy configuration file
      tcg/optimize: optimize TSTNE using smask and zmask

Peter Maydell (2):
      target/i386: Do not raise Invalid for 0 * Inf + QNaN
      tests/tcg/x86_64/fma: Test some x86 fused-multiply-add cases

 docs/devel/index-process.rst           |   1 +
 docs/devel/rust.rst                    | 425 +++++++++++++++++++++++++++++++++
 include/fpu/softfloat-types.h          |  16 +-
 target/i386/tcg/fpu_helper.c           |   5 +-
 tcg/optimize.c                         |  13 +-
 tests/tcg/x86_64/fma.c                 | 109 +++++++++
 fpu/softfloat-parts.c.inc              |   5 +-
 rust/clippy.toml                       |   2 +
 rust/hw/char/pl011/Cargo.toml          |   4 +-
 rust/hw/char/pl011/README.md           |  31 ---
 rust/hw/char/pl011/src/device_class.rs |   1 -
 rust/hw/char/pl011/src/lib.rs          |  14 +-
 rust/qemu-api-macros/Cargo.toml        |   4 +-
 rust/qemu-api-macros/README.md         |   1 -
 rust/qemu-api/Cargo.toml               |   1 +
 tests/lcitool/libvirt-ci               |   2 +-
 tests/tcg/x86_64/Makefile.target       |   1 +
 17 files changed, 576 insertions(+), 59 deletions(-)
 create mode 100644 docs/devel/rust.rst
 create mode 100644 tests/tcg/x86_64/fma.c
 create mode 100644 rust/clippy.toml
 delete mode 100644 rust/hw/char/pl011/README.md
 delete mode 100644 rust/qemu-api-macros/README.md
-- 
2.48.1




* [PULL 1/7] rust: remove unnecessary Cargo.toml metadata
  2025-02-07 10:27 [PULL 0/7] Rust, TCG, x86 patches for 2025-02-07 Paolo Bonzini
@ 2025-02-07 10:27 ` Paolo Bonzini
  2025-02-07 10:27 ` [PULL 2/7] rust: include rust_version in Cargo.toml Paolo Bonzini
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Paolo Bonzini @ 2025-02-07 10:27 UTC
  To: qemu-devel

Some items of Cargo.toml (readme, homepage, repository) are
only present because clippy::cargo warnings are enabled in
rust/hw/char/pl011/src/lib.rs.  These items are not particularly
useful and would be identical in every Cargo.toml file in the
QEMU workspace.  Clean them up and drop the clippy::cargo lint group.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 rust/hw/char/pl011/Cargo.toml   |  3 ---
 rust/hw/char/pl011/README.md    | 31 -------------------------------
 rust/hw/char/pl011/src/lib.rs   | 14 ++++++--------
 rust/qemu-api-macros/Cargo.toml |  3 ---
 rust/qemu-api-macros/README.md  |  1 -
 5 files changed, 6 insertions(+), 46 deletions(-)
 delete mode 100644 rust/hw/char/pl011/README.md
 delete mode 100644 rust/qemu-api-macros/README.md

diff --git a/rust/hw/char/pl011/Cargo.toml b/rust/hw/char/pl011/Cargo.toml
index 58f3e859f7e..2b4097864df 100644
--- a/rust/hw/char/pl011/Cargo.toml
+++ b/rust/hw/char/pl011/Cargo.toml
@@ -4,10 +4,7 @@ version = "0.1.0"
 edition = "2021"
 authors = ["Manos Pitsidianakis <manos.pitsidianakis@linaro.org>"]
 license = "GPL-2.0-or-later"
-readme = "README.md"
-homepage = "https://www.qemu.org"
 description = "pl011 device model for QEMU"
-repository = "https://gitlab.com/epilys/rust-for-qemu"
 resolver = "2"
 publish = false
 keywords = []
diff --git a/rust/hw/char/pl011/README.md b/rust/hw/char/pl011/README.md
deleted file mode 100644
index cd7dea31634..00000000000
--- a/rust/hw/char/pl011/README.md
+++ /dev/null
@@ -1,31 +0,0 @@
-# PL011 QEMU Device Model
-
-This library implements a device model for the PrimeCell® UART (PL011)
-device in QEMU.
-
-## Build static lib
-
-Host build target must be explicitly specified:
-
-```sh
-cargo build --target x86_64-unknown-linux-gnu
-```
-
-Replace host target triplet if necessary.
-
-## Generate Rust documentation
-
-To generate docs for this crate, including private items:
-
-```sh
-cargo doc --no-deps --document-private-items --target x86_64-unknown-linux-gnu
-```
-
-To include direct dependencies like `bilge` (bitmaps for register types):
-
-```sh
-cargo tree --depth 1 -e normal --prefix none \
- | cut -d' ' -f1 \
- | xargs printf -- '-p %s\n' \
- | xargs cargo doc --no-deps --document-private-items --target x86_64-unknown-linux-gnu
-```
diff --git a/rust/hw/char/pl011/src/lib.rs b/rust/hw/char/pl011/src/lib.rs
index e2df4586bcc..e704daf6e3e 100644
--- a/rust/hw/char/pl011/src/lib.rs
+++ b/rust/hw/char/pl011/src/lib.rs
@@ -1,13 +1,12 @@
 // Copyright 2024, Linaro Limited
 // Author(s): Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
 // SPDX-License-Identifier: GPL-2.0-or-later
-//
-// PL011 QEMU Device Model
-//
-// This library implements a device model for the PrimeCell® UART (PL011)
-// device in QEMU.
-//
-#![doc = include_str!("../README.md")]
+
+//! PL011 QEMU Device Model
+//!
+//! This library implements a device model for the PrimeCell® UART (PL011)
+//! device in QEMU.
+//!
 //! # Library crate
 //!
 //! See [`PL011State`](crate::device::PL011State) for the device model type and
@@ -18,7 +17,6 @@
     clippy::suspicious,
     clippy::complexity,
     clippy::perf,
-    clippy::cargo,
     clippy::nursery,
     clippy::style
 )]
diff --git a/rust/qemu-api-macros/Cargo.toml b/rust/qemu-api-macros/Cargo.toml
index 5a27b52ee6e..b9b4baecddb 100644
--- a/rust/qemu-api-macros/Cargo.toml
+++ b/rust/qemu-api-macros/Cargo.toml
@@ -4,10 +4,7 @@ version = "0.1.0"
 edition = "2021"
 authors = ["Manos Pitsidianakis <manos.pitsidianakis@linaro.org>"]
 license = "GPL-2.0-or-later"
-readme = "README.md"
-homepage = "https://www.qemu.org"
 description = "Rust bindings for QEMU - Utility macros"
-repository = "https://gitlab.com/qemu-project/qemu/"
 resolver = "2"
 publish = false
 keywords = []
diff --git a/rust/qemu-api-macros/README.md b/rust/qemu-api-macros/README.md
deleted file mode 100644
index f60f54ac4be..00000000000
--- a/rust/qemu-api-macros/README.md
+++ /dev/null
@@ -1 +0,0 @@
-# `qemu-api-macros` - Utility macros for defining QEMU devices
-- 
2.48.1




* [PULL 2/7] rust: include rust_version in Cargo.toml
  2025-02-07 10:27 [PULL 0/7] Rust, TCG, x86 patches for 2025-02-07 Paolo Bonzini
  2025-02-07 10:27 ` [PULL 1/7] rust: remove unnecessary Cargo.toml metadata Paolo Bonzini
@ 2025-02-07 10:27 ` Paolo Bonzini
  2025-02-07 10:27 ` [PULL 3/7] rust: add docs Paolo Bonzini
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Paolo Bonzini @ 2025-02-07 10:27 UTC
  To: qemu-devel

Tell clippy the minimum supported Rust version for QEMU.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 rust/hw/char/pl011/Cargo.toml          | 1 +
 rust/hw/char/pl011/src/device_class.rs | 1 -
 rust/qemu-api-macros/Cargo.toml        | 1 +
 rust/qemu-api/Cargo.toml               | 1 +
 4 files changed, 3 insertions(+), 1 deletion(-)
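
For illustration only, a minimal sketch of what a 1.63.0 floor means when
writing code, assuming nothing beyond the standard library.  The ``Regs``
struct and both function names are hypothetical and do not come from the
QEMU tree.  ``let ... else`` (1.65.0) is spelled with ``if let``, and
``addr_of_mut!`` stands in for ``&raw mut`` (1.82.0):

```rust
use std::ptr::addr_of_mut;

/// Hypothetical register block, used only for this illustration.
#[repr(C)]
pub struct Regs {
    pub flags: u32,
    pub data: u32,
}

/// Parse a `0x`-prefixed hexadecimal value.
pub fn parse_reg(s: &str) -> Option<u32> {
    // With Rust 1.65+ this could be written as:
    //     let Some(hex) = s.strip_prefix("0x") else { return None };
    let hex = if let Some(hex) = s.strip_prefix("0x") {
        hex
    } else {
        return None;
    };
    u32::from_str_radix(hex, 16).ok()
}

/// Store `value` into the `data` field through a raw pointer, without
/// creating an intermediate `&mut` reference to the field.
///
/// # Safety
///
/// `regs` must be non-null, properly aligned and valid for writes.
pub unsafe fn write_data(regs: *mut Regs, value: u32) {
    // With Rust 1.82+ the field pointer could be spelled `&raw mut (*regs).data`.
    unsafe { addr_of_mut!((*regs).data).write(value) }
}
```

Both functions compile on newer toolchains as well; the comments show the
newer spelling that becomes available once the minimum version is raised.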

diff --git a/rust/hw/char/pl011/Cargo.toml b/rust/hw/char/pl011/Cargo.toml
index 2b4097864df..f2296cad58b 100644
--- a/rust/hw/char/pl011/Cargo.toml
+++ b/rust/hw/char/pl011/Cargo.toml
@@ -9,6 +9,7 @@ resolver = "2"
 publish = false
 keywords = []
 categories = []
+rust-version = "1.63.0"
 
 [lib]
 crate-type = ["staticlib"]
diff --git a/rust/hw/char/pl011/src/device_class.rs b/rust/hw/char/pl011/src/device_class.rs
index 8a157a663fb..dbef93f6cb3 100644
--- a/rust/hw/char/pl011/src/device_class.rs
+++ b/rust/hw/char/pl011/src/device_class.rs
@@ -12,7 +12,6 @@
 
 use crate::device::{PL011Registers, PL011State};
 
-#[allow(clippy::missing_const_for_fn)]
 extern "C" fn pl011_clock_needed(opaque: *mut c_void) -> bool {
     let state = NonNull::new(opaque).unwrap().cast::<PL011State>();
     unsafe { state.as_ref().migrate_clock }
diff --git a/rust/qemu-api-macros/Cargo.toml b/rust/qemu-api-macros/Cargo.toml
index b9b4baecddb..89dee1cfb39 100644
--- a/rust/qemu-api-macros/Cargo.toml
+++ b/rust/qemu-api-macros/Cargo.toml
@@ -9,6 +9,7 @@ resolver = "2"
 publish = false
 keywords = []
 categories = []
+rust-version = "1.63.0"
 
 [lib]
 proc-macro = true
diff --git a/rust/qemu-api/Cargo.toml b/rust/qemu-api/Cargo.toml
index 4aa22f31986..a51dd142852 100644
--- a/rust/qemu-api/Cargo.toml
+++ b/rust/qemu-api/Cargo.toml
@@ -12,6 +12,7 @@ resolver = "2"
 publish = false
 keywords = []
 categories = []
+rust-version = "1.63.0"
 
 [dependencies]
 qemu_api_macros = { path = "../qemu-api-macros" }
-- 
2.48.1




* [PULL 3/7] rust: add docs
  2025-02-07 10:27 [PULL 0/7] Rust, TCG, x86 patches for 2025-02-07 Paolo Bonzini
  2025-02-07 10:27 ` [PULL 1/7] rust: remove unnecessary Cargo.toml metadata Paolo Bonzini
  2025-02-07 10:27 ` [PULL 2/7] rust: include rust_version in Cargo.toml Paolo Bonzini
@ 2025-02-07 10:27 ` Paolo Bonzini
  2025-02-07 10:27 ` [PULL 4/7] rust: add clippy configuration file Paolo Bonzini
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Paolo Bonzini @ 2025-02-07 10:27 UTC
  To: qemu-devel

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
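The wrap file naming rule described in the new documentation
(``NAME-SEMVER-rs.wrap``, with ``SEMVER`` being the version up to and
including its first non-zero number) can be sketched as follows.  This is
an illustration only; ``wrap_semver`` and ``wrap_file_name`` are
hypothetical helpers, not part of QEMU or Meson:

```rust
/// Truncate a version string after its first non-zero component,
/// e.g. "0.2.3" becomes "0.2" and "1.0.84" becomes "1".
pub fn wrap_semver(version: &str) -> String {
    let mut parts = Vec::new();
    for part in version.split('.') {
        parts.push(part);
        // Stop once a non-zero component has been included; a component
        // that is not a plain number also ends the scan.
        if part.parse::<u64>().map_or(true, |n| n != 0) {
            break;
        }
    }
    parts.join(".")
}

/// Build the wrap file name for a crate called `name`.
pub fn wrap_file_name(name: &str, version: &str) -> String {
    format!("{}-{}-rs.wrap", name, wrap_semver(version))
}
```

For a made-up crate ``foo`` at version ``0.2.3``, ``wrap_file_name``
returns ``foo-0.2-rs.wrap``.
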
 docs/devel/index-process.rst |   1 +
 docs/devel/rust.rst          | 425 +++++++++++++++++++++++++++++++++++
 tests/lcitool/libvirt-ci     |   2 +-
 3 files changed, 427 insertions(+), 1 deletion(-)
 create mode 100644 docs/devel/rust.rst
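
The "Common pitfalls" section of the new document recommends ``&self``
methods plus interior mutability for QOM objects.  A minimal sketch of
that pattern follows, assuming the standard ``Cell`` and ``RefCell`` as
stand-ins for QEMU's ``BqlCell`` and ``BqlRefCell`` (which additionally
enforce the Big QEMU Lock rules and are understood by ``vmstate``).
``ToyUart`` is a hypothetical type, not the pl011 device:

```rust
use std::cell::{Cell, RefCell};
use std::collections::VecDeque;

/// Hypothetical device state.  `Cell`/`RefCell` are stand-ins for
/// QEMU's `BqlCell`/`BqlRefCell` in this sketch.
pub struct ToyUart {
    irq_level: Cell<u32>,
    fifo: RefCell<VecDeque<u8>>,
}

impl ToyUart {
    pub fn new() -> Self {
        ToyUart {
            irq_level: Cell::new(0),
            fifo: RefCell::new(VecDeque::new()),
        }
    }

    /// Takes `&self`, not `&mut self`: other shared references to the
    /// device may exist while this runs.
    pub fn write(&self, byte: u8) {
        self.fifo.borrow_mut().push_back(byte);
        self.irq_level.set(1);
    }

    /// Pop one byte; lower the interrupt line once the FIFO is empty.
    pub fn read(&self) -> Option<u8> {
        let mut fifo = self.fifo.borrow_mut();
        let byte = fifo.pop_front();
        if fifo.is_empty() {
            self.irq_level.set(0);
        }
        byte
    }

    pub fn irq_level(&self) -> u32 {
        self.irq_level.get()
    }
}
```

Two shared references to the same ``ToyUart`` can call ``write`` and
``read`` in turn without any ``&mut`` ever being created, which is the
property the document asks for.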

diff --git a/docs/devel/index-process.rst b/docs/devel/index-process.rst
index 362f97ee300..cb7c6640fd2 100644
--- a/docs/devel/index-process.rst
+++ b/docs/devel/index-process.rst
@@ -17,3 +17,4 @@ Notes about how to interact with the community and how and where to submit patch
    stable-process
    submitting-a-pull-request
    secure-coding-practices
+   rust
diff --git a/docs/devel/rust.rst b/docs/devel/rust.rst
new file mode 100644
index 00000000000..7d67ac591f9
--- /dev/null
+++ b/docs/devel/rust.rst
@@ -0,0 +1,425 @@
+.. |msrv| replace:: 1.63.0
+
+Rust in QEMU
+============
+
+Rust in QEMU is a project to enable using the Rust programming language
+to add new functionality to QEMU.
+
+Right now, the focus is on making it possible to write devices that inherit
+from ``SysBusDevice`` in `*safe*`__ Rust.  Later, it may become possible
+to write other kinds of devices (e.g. PCI devices that can do DMA),
+complete boards, or backends (e.g. block device formats).
+
+__ https://doc.rust-lang.org/nomicon/meet-safe-and-unsafe.html
+
+Building the Rust in QEMU code
+------------------------------
+
+The Rust in QEMU code is included in the emulators via Meson.  Meson
+invokes rustc directly, building static libraries that are then linked
+together with the C code.  This is completely automatic when you run
+``make`` or ``ninja``.
+
+However, QEMU's build system also tries to be easy to use for people who
+are accustomed to the more "normal" Cargo-based development workflow.
+In particular:
+
+* the set of warnings and lints that are used to build QEMU always
+  comes from the ``rust/Cargo.toml`` workspace file
+
+* it is also possible to use ``cargo`` for common Rust-specific coding
+  tasks, in particular to invoke ``clippy``, ``rustfmt`` and ``rustdoc``.
+
+To this end, QEMU includes a ``build.rs`` build script that picks up
+generated sources from QEMU's build directory and puts them in Cargo's
+output directory (typically ``rust/target/``).  A vanilla invocation
+of Cargo will complain that it cannot find the generated sources,
+which can be fixed in different ways:
+
+* by using special shorthand targets in the QEMU build directory::
+
+    make clippy
+    make rustfmt
+    make rustdoc
+
+* by invoking ``cargo`` through the Meson `development environment`__
+  feature::
+
+    pyvenv/bin/meson devenv -w ../rust cargo clippy --tests
+    pyvenv/bin/meson devenv -w ../rust cargo fmt
+
+  If you are going to use ``cargo`` repeatedly, ``pyvenv/bin/meson devenv``
+  will enter a shell where commands like ``cargo clippy`` just work.
+
+__ https://mesonbuild.com/Commands.html#devenv
+
+* by pointing the ``MESON_BUILD_ROOT`` environment variable to the top
+  of your QEMU build tree.  This third method is useful if you are using
+  ``rust-analyzer``; you can set the environment variable through the
+  ``rust-analyzer.cargo.extraEnv`` setting.
+
+As shown above, you can use the ``--tests`` option as usual to operate on test
+code.  Note however that you cannot *build* or run tests via ``cargo``, because
+they need support C code from QEMU that Cargo does not know about.  Tests can
+be run via ``meson test`` or ``make``::
+
+   make check-rust
+
+Building Rust code with ``--enable-modules`` is not supported yet.
+
+Supported tools
+'''''''''''''''
+
+QEMU supports rustc version 1.63.0 and newer.  Notably, the following features
+are missing:
+
+* ``core::ffi`` (1.64.0).  Use ``std::os::raw`` and ``std::ffi`` instead.
+
+* ``cast_mut()``/``cast_const()`` (1.65.0).  Use ``as`` instead.
+
+* "let ... else" (1.65.0).  Use ``if let`` instead.  This is currently patched
+  in QEMU's vendored copy of the bilge crate.
+
+* Generic Associated Types (1.65.0)
+
+* ``CStr::from_bytes_with_nul()`` as a ``const`` function (1.72.0).
+
+* "Return position ``impl Trait`` in Traits" (1.75.0, blocker for including
+  the pinned-init crate).
+
+* ``MaybeUninit::zeroed()`` as a ``const`` function (1.75.0).  QEMU's
+  ``Zeroable`` trait can be implemented without ``MaybeUninit::zeroed()``,
+  so this would be just a cleanup.
+
+* ``c""`` literals (stable in 1.77.0).  QEMU provides a ``c_str!()`` macro
+  to define ``CStr`` constants easily.
+
+* ``offset_of!`` (stable in 1.77.0).  QEMU uses ``offset_of!()`` heavily; it
+  provides a replacement in the ``qemu_api`` crate, but it does not support
+  lifetime parameters and therefore ``&'a Something`` fields in the struct
+  may have to be replaced by ``NonNull<Something>``.  *Nested* ``offset_of!``
+  was only stabilized in Rust 1.82.0, but it is not used.
+
+* inline const expression (stable in 1.79.0), currently worked around with
+  associated constants in the ``FnCall`` trait.
+
+* associated constants have to be explicitly marked ``'static`` (`changed in
+  1.81.0`__)
+
+* ``&raw`` (stable in 1.82.0).  Use ``addr_of!`` and ``addr_of_mut!`` instead,
+  though hopefully the need for raw pointers will go down over time.
+
+* ``new_uninit`` (stable in 1.82.0).  This is used internally by the ``pinned_init``
+  crate, which is planned for inclusion in QEMU, but it can be easily patched
+  out.
+
+* referencing statics in constants (stable in 1.83.0).  For now use a const
+  function; this is an important limitation for QEMU's migration stream
+  architecture (VMState).  Right now, VMState lacks type safety because
+  it is hard to place the ``VMStateField`` definitions in traits.
+
+* associated const equality would be nice to have for some users of
+  ``callbacks::FnCall``, but is still experimental.  ``ASSERT_IS_SOME``
+  replaces it.
+
+__ https://github.com/rust-lang/rust/pull/125258
+
+It is expected that QEMU will advance its minimum supported version of
+rustc to 1.77.0 as soon as possible; as of January 2025, the blockers
+for that are Debian bookworm and 32-bit MIPS processors.
+This unfortunately means that references to statics in constants will
+remain an issue.
+
+QEMU also supports version 0.60.x of bindgen, which is missing option
+``--generate-cstr``.  This option requires version 0.66.x and will
+be adopted as soon as supporting these older versions is not necessary
+anymore.
+
+Writing Rust code in QEMU
+-------------------------
+
+Right now QEMU includes three crates:
+
+* ``qemu_api`` for bindings to C code and useful functionality
+
+* ``qemu_api_macros`` defines several procedural macros that are useful when
+  writing Rust code for QEMU
+
+* ``pl011`` (under ``rust/hw/char/pl011``) is the sample device that is being
+  used to further develop ``qemu_api`` and ``qemu_api_macros``.  It is a functional
+  replacement for the ``hw/char/pl011.c`` file.
+
+This section explains how to work with them.
+
+Status
+''''''
+
+Modules of ``qemu_api`` can be defined as:
+
+- *complete*: ready for use in new devices; if applicable, the API supports the
+  full functionality available in C
+
+- *stable*: ready for production use, the API is safe and should not undergo
+  major changes
+
+- *proof of concept*: the API is subject to change but allows working with safe
+  Rust
+
+- *initial*: the API is in its initial stages; it requires a large amount of
+  unsafe code; it might have soundness or type-safety issues
+
+The status of the modules is:
+
+================ ======================
+module           status
+================ ======================
+``assertions``   stable
+``bitops``       complete
+``callbacks``    complete
+``cell``         stable
+``c_str``        complete
+``irq``          complete
+``module``       complete
+``offset_of``    stable
+``qdev``         stable
+``qom``          stable
+``sysbus``       stable
+``vmstate``      proof of concept
+``zeroable``     stable
+================ ======================
+
+Common pitfalls
+'''''''''''''''
+
+Rust has very strict rules with respect to how you get an exclusive (``&mut``)
+reference; failure to respect those rules is a source of undefined behavior.
+In particular, even if a value is loaded from a raw mutable pointer (``*mut``),
+it *cannot* be cast to ``&mut`` unless the value was stored to the ``*mut``
+from a mutable reference.  Furthermore, it is undefined behavior if any
+shared reference was created between the store to the ``*mut`` and the load::
+
+    let mut p: u32 = 42;
+    let p_mut = &mut p;                              // 1
+    let p_raw = p_mut as *mut u32;                   // 2
+
+    // p_raw keeps the mutable reference "alive"
+
+    let p_shared = &p;                               // 3
+    println!("access from &u32: {}", *p_shared);
+
+    // Bring back the mutable reference, its lifetime overlaps
+    // with that of a shared reference.
+    let p_mut = unsafe { &mut *p_raw };              // 4
+    println!("access from &mut u32: {}", *p_mut);
+
+    println!("access from &u32: {}", *p_shared);     // 5
+
+These rules can be tested with `MIRI`__, for example.
+
+__ https://github.com/rust-lang/miri
+
+Almost all Rust code in QEMU will involve QOM objects, and pointers to these
+objects are *shared*, for example because they are part of the QOM composition
+tree.  This creates exactly the above scenario:
+
+1. a QOM object is created
+
+2. a ``*mut`` is created, for example as the opaque value for a ``MemoryRegion``
+
+3. the QOM object is placed in the composition tree
+
+4. a memory access dereferences the opaque value to a ``&mut``
+
+5. but the shared reference is still present in the composition tree
+
+Because of this, QOM objects should almost always use ``&self`` instead
+of ``&mut self``; access to internal fields must use *interior mutability*
+to go from a shared reference to a ``&mut``.
+
+Whenever C code provides you with an opaque ``void *``, avoid converting it
+to a Rust mutable reference, and use a shared reference instead.  Rust code
+will then have to use QEMU's ``BqlRefCell`` and ``BqlCell`` types, which
+enforce that locking rules for the "Big QEMU Lock" are respected.  These cell
+types are also known to the ``vmstate`` module, which is able to "look inside"
+them when building an in-memory representation of a ``struct``'s layout.
+Note that the same is not true of a ``RefCell`` or ``Mutex``.
+
+In the future, similar cell types might also be provided for
+``AioContext``-based locking.
+
+Writing bindings to C code
+''''''''''''''''''''''''''
+
+Here are some things to keep in mind when working on the ``qemu_api`` crate.
+
+**Look at existing code**
+  Very often, similar idioms in C code correspond to similar tricks in
+  Rust bindings.  If the C code uses ``offsetof``, look at qdev properties
+  or ``vmstate``.  If the C code has a complex const struct, look at
+  ``MemoryRegion``.  Reuse existing patterns for handling lifetimes;
+  for example use ``&T`` for QOM objects that do not need a reference
+  count (including those that can be embedded in other objects) and
+  ``Owned<T>`` for those that need it.
+
+**Use the type system**
+  Bindings will often need to access information that is specific to a type
+  (either a builtin one or a user-defined one) in order to pass it to C
+  functions.  Put it in a trait and access it through generic parameters.
+  The ``vmstate`` module has examples of how to retrieve type information
+  for the fields of a Rust ``struct``.
+
+**Prefer unsafe traits to unsafe functions**
+  Unsafe traits are much easier to prove correct than unsafe functions.
+  They are an excellent place to store metadata that can later be accessed
+  by generic functions.  C code usually places metadata in global variables;
+  in Rust, they can be stored in traits and then turned into ``static``
+  variables.  Often, unsafe traits can be generated by procedural macros.
+
+**Document limitations due to old Rust versions**
+  If you need to settle for an inferior solution because of the currently
+  supported set of Rust versions, document it in the source and in this
+  file.  This ensures that it can be fixed when the minimum supported
+  version is bumped.
+
+**Keep locking in mind**
+  When marking a type ``Sync``, be careful of whether it needs the big
+  QEMU lock.  Use ``BqlCell`` and ``BqlRefCell`` for interior data,
+  or assert ``bql_locked()``.
+
+**Don't be afraid of complexity, but document and isolate it**
+  It's okay to be tricky; device code is written more often than bindings
+  code and it's important that it is idiomatic.  However, you should strive
+  to isolate any tricks in a place (for example a ``struct``, a trait
+  or a macro) where it can be documented and tested.  If needed, include
+  toy versions of the code in the documentation.
+
+Writing procedural macros
+'''''''''''''''''''''''''
+
+By convention, procedural macros are split into two functions, one
+returning ``Result<proc_macro2::TokenStream, MacroError>`` with the body of
+the procedural macro, and the second returning ``proc_macro::TokenStream``
+which is the actual procedural macro.  The former's name is the same as
+the latter with the ``_or_error`` suffix.  The code for the latter follows
+a template that is fixed apart from the type after ``as`` in the
+invocation of ``parse_macro_input!``::
+
+    #[proc_macro_derive(Object)]
+    pub fn derive_object(input: TokenStream) -> TokenStream {
+        let input = parse_macro_input!(input as DeriveInput);
+        let expanded = derive_object_or_error(input).unwrap_or_else(Into::into);
+
+        TokenStream::from(expanded)
+    }
+
+The ``qemu_api_macros`` crate has utility functions to examine a
+``DeriveInput`` and perform common checks (e.g. looking for a struct
+with named fields).  These functions return ``Result<..., MacroError>``
+and can be used easily in the procedural macro function::
+
+    fn derive_object_or_error(input: DeriveInput) ->
+        Result<proc_macro2::TokenStream, MacroError>
+    {
+        is_c_repr(&input, "#[derive(Object)]")?;
+
+        let name = &input.ident;
+        let parent = &get_fields(&input, "#[derive(Object)]")?[0].ident;
+        ...
+    }
+
+Use procedural macros with care.  They are mostly useful for two purposes:
+
+* Performing consistency checks; for example ``#[derive(Object)]`` checks
+  that the structure has ``#[repr(C)]`` and that the type of the first field
+  is consistent with the ``ObjectType`` declaration.
+
+* Extracting information from Rust source code into traits, typically based
+  on types and attributes.  For example, ``#[derive(TryInto)]`` builds an
+  implementation of ``TryFrom``, and it uses the ``#[repr(...)]`` attribute
+  as the ``TryFrom`` source and error types.
+
+Procedural macros can be hard to debug and test; if the code generation
+exceeds a few lines of code, it may be worthwhile to delegate work to
+"regular" declarative (``macro_rules!``) macros and write unit tests for
+those instead.
+
+
+Coding style
+''''''''''''
+
+Code should pass clippy and be formatted with rustfmt.
+
+Right now, only the nightly version of ``rustfmt`` is supported.  This
+might change in the future.  While CI checks for correct formatting via
+``cargo fmt --check``, maintainers can fix this for you when applying patches.
+
+It is expected that ``qemu_api`` provides full ``rustdoc`` documentation for
+bindings that are in their final shape or close.
+
+Adding dependencies
+-------------------
+
+Generally, the set of dependent crates is kept small.  Think twice before
+adding a new external crate, especially if it comes with a large set of
+dependencies itself.  Sometimes QEMU only needs a small subset of the
+functionality; see for example QEMU's ``assertions`` or ``c_str`` modules.
+
+On top of this recommendation, adding external crates to QEMU is a
+slightly complicated process, mostly due to the need to teach Meson how
+to build them.  While Meson has initial support for parsing ``Cargo.lock``
+files, it is still highly experimental and is therefore not used.
+
+First of all, the crate must be added to the relevant ``Cargo.toml`` files.
+In addition, external crates must be added as subprojects so that Meson
+learns how to build them, as described below.  Because the
+``rust/`` directory forms a Cargo `workspace`__, there is a single
+``rust/Cargo.lock`` file for the whole build.
+
+__ https://doc.rust-lang.org/cargo/reference/workspaces.html#virtual-workspace
+
+Choose a version of the crate that works with QEMU's minimum supported
+Rust version (|msrv|).
+
+Second, a new ``wrap`` file must be added to teach Meson how to download the
+crate.  The wrap file must be named ``NAME-SEMVER-rs.wrap``, where ``NAME``
+is the name of the crate and ``SEMVER`` is the version up to and including the
+first non-zero number.  For example, a crate with version ``0.2.3`` will use
+``0.2`` for its ``SEMVER``, while a crate with version ``1.0.84`` will use ``1``.
+
+Third, the Meson rules to build the crate must be added at
+``subprojects/NAME-SEMVER-rs/meson.build``.  Generally this includes:
+
+* ``subproject`` and ``dependency`` lines for all dependent crates
+
+* a ``static_library`` or ``rust.proc_macro`` line to perform the actual build
+
+* ``declare_dependency`` and ``meson.override_dependency`` lines to expose
+  the result to QEMU and to other subprojects
+
+Remember to add ``native: true`` to ``dependency``, ``static_library`` and
+``meson.override_dependency`` for dependencies of procedural macros.
+If a crate is needed in both procedural macros and QEMU binaries, everything
+apart from ``subproject`` must be duplicated to build both native and
+non-native versions of the crate.
+
+It's important to specify the right compiler options.  These include:
+
+* the language edition (which can be found in the ``Cargo.toml`` file)
+
+* the ``--cfg`` options (which have to be "reverse engineered" from the
+  ``build.rs`` file of the crate).
+
+* usually, a ``--cap-lints allow`` argument to hide warnings from rustc
+  or clippy.
+
+After every change to the ``meson.build`` file you have to update the patched
+version with ``meson subprojects update --reset NAME-SEMVER-rs``.  This might
+be automated in the future.
+
+Also, after every change to the ``meson.build`` file it is strongly suggested to
+do a dummy change to the ``.wrap`` file (for example adding a comment like
+``# version 2``), which will help Meson notice that the subproject is out of date.
+
+As a last step, add the new subproject to ``scripts/archive-source.sh``,
+``scripts/make-release`` and ``subprojects/.gitignore``.
diff --git a/tests/lcitool/libvirt-ci b/tests/lcitool/libvirt-ci
index b6a65806bc9..9ad3f70bde9 160000
--- a/tests/lcitool/libvirt-ci
+++ b/tests/lcitool/libvirt-ci
@@ -1 +1 @@
-Subproject commit b6a65806bc9b2b56985f5e97c936b77c7e7a99fc
+Subproject commit 9ad3f70bde9865d5ad18f36d256d472e72b5cbf3
-- 
2.48.1




* [PULL 4/7] rust: add clippy configuration file
  2025-02-07 10:27 [PULL 0/7] Rust, TCG, x86 patches for 2025-02-07 Paolo Bonzini
                   ` (2 preceding siblings ...)
  2025-02-07 10:27 ` [PULL 3/7] rust: add docs Paolo Bonzini
@ 2025-02-07 10:27 ` Paolo Bonzini
  2025-02-07 10:27 ` [PULL 5/7] target/i386: Do not raise Invalid for 0 * Inf + QNaN Paolo Bonzini
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Paolo Bonzini @ 2025-02-07 10:27 UTC
  To: qemu-devel

Configure the minimum supported Rust version (though strictly speaking
that's redundant with Cargo.toml), and the list of CamelCase identifiers
that are not Rust types.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 rust/clippy.toml | 2 ++
 1 file changed, 2 insertions(+)
 create mode 100644 rust/clippy.toml

diff --git a/rust/clippy.toml b/rust/clippy.toml
new file mode 100644
index 00000000000..f42154e95ec
--- /dev/null
+++ b/rust/clippy.toml
@@ -0,0 +1,2 @@
+doc-valid-idents = ["PrimeCell", ".."]
+msrv = "1.63.0"
-- 
2.48.1




* [PULL 5/7] target/i386: Do not raise Invalid for 0 * Inf + QNaN
  2025-02-07 10:27 [PULL 0/7] Rust, TCG, x86 patches for 2025-02-07 Paolo Bonzini
                   ` (3 preceding siblings ...)
  2025-02-07 10:27 ` [PULL 4/7] rust: add clippy configuration file Paolo Bonzini
@ 2025-02-07 10:27 ` Paolo Bonzini
  2025-02-07 11:53   ` Michael Tokarev
  2025-02-07 10:28 ` [PULL 6/7] tests/tcg/x86_64/fma: Test some x86 fused-multiply-add cases Paolo Bonzini
  2025-02-07 10:28 ` [PULL 7/7] tcg/optimize: optimize TSTNE using smask and zmask Paolo Bonzini
  6 siblings, 1 reply; 10+ messages in thread
From: Paolo Bonzini @ 2025-02-07 10:27 UTC
  To: qemu-devel; +Cc: Peter Maydell, qemu-stable, Richard Henderson

From: Peter Maydell <peter.maydell@linaro.org>

In commit 8adcff4ae7 ("fpu: handle raising Invalid for infzero in
pick_nan_muladd") we changed the handling of 0 * Inf + QNaN to always
raise the Invalid exception regardless of target architecture.  (This
was a change affecting hppa, i386, sh4 and tricore.) However, this
was incorrect for i386, which documents in the SDM section 14.5.2
that for the 0 * Inf + NaN case it will only raise the Invalid
exception when the input is an SNaN.  (This is permitted by the IEEE
754-2008 specification, which documents that whether we raise Invalid
for 0 * Inf + QNaN is implementation defined.)

Adjust the softfloat pick_nan_muladd code to allow the target to
suppress the raising of Invalid for the inf * zero + NaN case (as an
extra flag orthogonal to its choice for when to use the default NaN),
and enable that for x86.

We do not revert here the behaviour change for hppa, sh4 or tricore:
 * The sh4 manual is clear that it should signal Invalid
 * The tricore manual is a bit vague but doesn't say it shouldn't
 * The hppa manual doesn't talk about fused multiply-add corner
   cases at all

Cc: qemu-stable@nongnu.org
Fixes: 8adcff4ae7 (""fpu: handle raising Invalid for infzero in pick_nan_muladd")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20250116112536.4117889-2-peter.maydell@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/fpu/softfloat-types.h | 16 +++++++++++++---
 target/i386/tcg/fpu_helper.c  |  5 ++++-
 fpu/softfloat-parts.c.inc     |  5 +++--
 3 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h
index 616c290145f..2e43d1dd9e6 100644
--- a/include/fpu/softfloat-types.h
+++ b/include/fpu/softfloat-types.h
@@ -280,11 +280,21 @@ typedef enum __attribute__((__packed__)) {
     /* No propagation rule specified */
     float_infzeronan_none = 0,
     /* Result is never the default NaN (so always the input NaN) */
-    float_infzeronan_dnan_never,
+    float_infzeronan_dnan_never = 1,
     /* Result is always the default NaN */
-    float_infzeronan_dnan_always,
+    float_infzeronan_dnan_always = 2,
     /* Result is the default NaN if the input NaN is quiet */
-    float_infzeronan_dnan_if_qnan,
+    float_infzeronan_dnan_if_qnan = 3,
+    /*
+     * Don't raise Invalid for 0 * Inf + NaN. Default is to raise.
+     * IEEE 754-2008 section 7.2 makes it implementation defined whether
+     * 0 * Inf + QNaN raises Invalid or not. Note that 0 * Inf + SNaN will
+     * raise the Invalid flag for the SNaN anyway.
+     *
+     * This is a flag which can be ORed in with any of the above
+     * DNaN behaviour options.
+     */
+    float_infzeronan_suppress_invalid = (1 << 7),
 } FloatInfZeroNaNRule;
 
 /*
diff --git a/target/i386/tcg/fpu_helper.c b/target/i386/tcg/fpu_helper.c
index 3d764bc138d..de6d0b252ec 100644
--- a/target/i386/tcg/fpu_helper.c
+++ b/target/i386/tcg/fpu_helper.c
@@ -178,8 +178,11 @@ void cpu_init_fp_statuses(CPUX86State *env)
      * "Fused-Multiply-ADD (FMA) Numeric Behavior" the NaN handling is
      * specified -- for 0 * inf + NaN the input NaN is selected, and if
      * there are multiple input NaNs they are selected in the order a, b, c.
+     * We also do not raise Invalid for the 0 * inf + (Q)NaN case.
      */
-    set_float_infzeronan_rule(float_infzeronan_dnan_never, &env->sse_status);
+    set_float_infzeronan_rule(float_infzeronan_dnan_never |
+                              float_infzeronan_suppress_invalid,
+                              &env->sse_status);
     set_float_3nan_prop_rule(float_3nan_prop_abc, &env->sse_status);
     /* Default NaN: sign bit set, most significant frac bit set */
     set_float_default_nan_pattern(0b11000000, &env->fp_status);
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
index fee05d0a863..73621f4a970 100644
--- a/fpu/softfloat-parts.c.inc
+++ b/fpu/softfloat-parts.c.inc
@@ -126,7 +126,8 @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
         float_raise(float_flag_invalid | float_flag_invalid_snan, s);
     }
 
-    if (infzero) {
+    if (infzero &&
+        !(s->float_infzeronan_rule & float_infzeronan_suppress_invalid)) {
         /* This is (0 * inf) + NaN or (inf * 0) + NaN */
         float_raise(float_flag_invalid | float_flag_invalid_imz, s);
     }
@@ -144,7 +145,7 @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b,
          * Inf * 0 + NaN -- some implementations return the
          * default NaN here, and some return the input NaN.
          */
-        switch (s->float_infzeronan_rule) {
+        switch (s->float_infzeronan_rule & ~float_infzeronan_suppress_invalid) {
         case float_infzeronan_dnan_never:
             break;
         case float_infzeronan_dnan_always:
-- 
2.48.1
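
The flag arrangement described in the patch above can be sketched in
isolation. This is only an illustration, not the softfloat code: the
RULE_* names and both helper functions are hypothetical stand-ins, and
only the numeric values (0..3 for the default-NaN choice, 1 << 7 for
the suppress flag) are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the FloatInfZeroNaNRule values in the patch. */
enum {
    RULE_NONE = 0,
    RULE_DNAN_NEVER = 1,
    RULE_DNAN_ALWAYS = 2,
    RULE_DNAN_IF_QNAN = 3,
    /* Flag bit, kept well clear of the small base values above. */
    RULE_SUPPRESS_INVALID = 1 << 7,
};

/* Recover the default-NaN choice by masking the flag bit out. */
static unsigned dnan_choice(unsigned rule)
{
    unsigned choice = rule & ~(unsigned)RULE_SUPPRESS_INVALID;

    /* Only the four base values are meaningful once the flag is removed. */
    assert(choice <= RULE_DNAN_IF_QNAN);
    return choice;
}

/* Whether 0 * Inf + QNaN raises Invalid under this rule. */
static bool infzero_raises_invalid(unsigned rule)
{
    return !(rule & RULE_SUPPRESS_INVALID);
}
```

With these helpers, a rule of RULE_DNAN_NEVER | RULE_SUPPRESS_INVALID
still selects the "never default NaN" behaviour, and a rule without the
flag keeps raising Invalid, which matches the stated default.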




* Re: [PULL 5/7] target/i386: Do not raise Invalid for 0 * Inf + QNaN
  2025-02-07 10:27 ` [PULL 5/7] target/i386: Do not raise Invalid for 0 * Inf + QNaN Paolo Bonzini
@ 2025-02-07 11:53   ` Michael Tokarev
  2025-02-07 13:43     ` Peter Maydell
  0 siblings, 1 reply; 10+ messages in thread
From: Michael Tokarev @ 2025-02-07 11:53 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel; +Cc: Peter Maydell, qemu-stable, Richard Henderson

07.02.2025 13:27, Paolo Bonzini wrote:
> From: Peter Maydell <peter.maydell@linaro.org>
> 
> In commit 8adcff4ae7 ("fpu: handle raising Invalid for infzero in
> pick_nan_muladd") we changed the handling of 0 * Inf + QNaN to always
> raise the Invalid exception regardless of target architecture.  (This
> was a change affecting hppa, i386, sh4 and tricore.) However, this
> was incorrect for i386, which documents in the SDM section 14.5.2
> that for the 0 * Inf + NaN case that it will only raise the Invalid
> exception when the input is an SNaN.  (This is permitted by the IEEE
> 754-2008 specification, which documents that whether we raise Invalid
> for 0 * Inf + QNaN is implementation defined.)
> 
> Adjust the softfloat pick_nan_muladd code to allow the target to
> suppress the raising of Invalid for the inf * zero + NaN case (as an
> extra flag orthogonal to its choice for when to use the default NaN),
> and enable that for x86.
> 
> We do not revert here the behaviour change for hppa, sh4 or tricore:
>   * The sh4 manual is clear that it should signal Invalid
>   * The tricore manual is a bit vague but doesn't say it shouldn't
>   * The hppa manual doesn't talk about fused multiply-add corner
>     cases at all
> 
> Cc: qemu-stable@nongnu.org
> Fixes: 8adcff4ae7 (""fpu: handle raising Invalid for infzero in pick_nan_muladd")

A nitpick: double double-quote.

8adcff4ae7 is v9.2.0-7-g8adcff4ae7 - which is 7 commits *after* the latest
released version, -- hopefully this fix should not go to any stable series,
unless 8adcff4ae7 itself has to be picked up for 9.2 too.

Thanks,

/mjt



* Re: [PULL 5/7] target/i386: Do not raise Invalid for 0 * Inf + QNaN
  2025-02-07 11:53   ` Michael Tokarev
@ 2025-02-07 13:43     ` Peter Maydell
  0 siblings, 0 replies; 10+ messages in thread
From: Peter Maydell @ 2025-02-07 13:43 UTC (permalink / raw)
  To: Michael Tokarev; +Cc: Paolo Bonzini, qemu-devel, qemu-stable, Richard Henderson

On Fri, 7 Feb 2025 at 11:53, Michael Tokarev <mjt@tls.msk.ru> wrote:
>
> 07.02.2025 13:27, Paolo Bonzini wrote:
> > From: Peter Maydell <peter.maydell@linaro.org>
> >
> > In commit 8adcff4ae7 ("fpu: handle raising Invalid for infzero in
> > pick_nan_muladd") we changed the handling of 0 * Inf + QNaN to always
> > raise the Invalid exception regardless of target architecture.  (This
> > was a change affecting hppa, i386, sh4 and tricore.) However, this
> > was incorrect for i386, which documents in the SDM section 14.5.2
> > that for the 0 * Inf + NaN case that it will only raise the Invalid
> > exception when the input is an SNaN.  (This is permitted by the IEEE
> > 754-2008 specification, which documents that whether we raise Invalid
> > for 0 * Inf + QNaN is implementation defined.)
> >
> > Adjust the softfloat pick_nan_muladd code to allow the target to
> > suppress the raising of Invalid for the inf * zero + NaN case (as an
> > extra flag orthogonal to its choice for when to use the default NaN),
> > and enable that for x86.
> >
> > We do not revert here the behaviour change for hppa, sh4 or tricore:
> >   * The sh4 manual is clear that it should signal Invalid
> >   * The tricore manual is a bit vague but doesn't say it shouldn't
> >   * The hppa manual doesn't talk about fused multiply-add corner
> >     cases at all
> >
> > Cc: qemu-stable@nongnu.org
> > Fixes: 8adcff4ae7 (""fpu: handle raising Invalid for infzero in pick_nan_muladd")
>
> A nitpick: double double-quote.
>
> 8adcff4ae7 is v9.2.0-7-g8adcff4ae7 - which is 7 commits *after* the latest
> released version, -- hopefully this fix should not go to any stable series,
> unless 8adcff4ae7 itself has to be picked up for 9.2 too.

Ah, yes, I think I assumed based on date that 8adcff4ae7 had
made it into a release already. That commit was a refactoring
so it doesn't need to be backported anywhere.

-- PMM



* [PULL 6/7] tests/tcg/x86_64/fma: Test some x86 fused-multiply-add cases
  2025-02-07 10:27 [PULL 0/7] Rust, TCG, x86 patches for 2025-02-07 Paolo Bonzini
                   ` (4 preceding siblings ...)
  2025-02-07 10:27 ` [PULL 5/7] target/i386: Do not raise Invalid for 0 * Inf + QNaN Paolo Bonzini
@ 2025-02-07 10:28 ` Paolo Bonzini
  2025-02-07 10:28 ` [PULL 7/7] tcg/optimize: optimize TSTNE using smask and zmask Paolo Bonzini
  6 siblings, 0 replies; 10+ messages in thread
From: Paolo Bonzini @ 2025-02-07 10:28 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, Richard Henderson

From: Peter Maydell <peter.maydell@linaro.org>

Add a test case which tests some corner case behaviour of
fused-multiply-add on x86:
 * 0 * Inf + SNaN should raise Invalid
 * 0 * Inf + QNaN should not raise Invalid
 * tininess should be detected after rounding

There is also one currently-disabled test case:
 * flush-to-zero should be done after rounding

This is disabled because QEMU's emulation currently does this
incorrectly (and so would fail the test).  The test case is kept in
but disabled, because it is the reason the test harness supports
running each case both with and without FTZ set.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20250116112536.4117889-3-peter.maydell@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tests/tcg/x86_64/fma.c           | 109 +++++++++++++++++++++++++++++++
 tests/tcg/x86_64/Makefile.target |   1 +
 2 files changed, 110 insertions(+)
 create mode 100644 tests/tcg/x86_64/fma.c

diff --git a/tests/tcg/x86_64/fma.c b/tests/tcg/x86_64/fma.c
new file mode 100644
index 00000000000..09c622ebc00
--- /dev/null
+++ b/tests/tcg/x86_64/fma.c
@@ -0,0 +1,109 @@
+/*
+ * Test some fused multiply add corner cases.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+#include <stdio.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <inttypes.h>
+
+#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
+
+/*
+ * Perform one "n * m + a" operation using the vfmadd insn and return
+ * the result; on return *mxcsr_p is set to the bottom 6 bits of MXCSR
+ * (the Flag bits). If ftz is true then we set MXCSR.FTZ while doing
+ * the operation.
+ * We print the operation and its results to stdout.
+ */
+static uint64_t do_fmadd(uint64_t n, uint64_t m, uint64_t a,
+                         bool ftz, uint32_t *mxcsr_p)
+{
+    uint64_t r;
+    uint32_t mxcsr = 0;
+    uint32_t ftz_bit = ftz ? (1 << 15) : 0;
+    uint32_t saved_mxcsr = 0;
+
+    asm volatile("stmxcsr %[saved_mxcsr]\n"
+                 "stmxcsr %[mxcsr]\n"
+                 "andl $0xffff7fc0, %[mxcsr]\n"
+                 "orl %[ftz_bit], %[mxcsr]\n"
+                 "ldmxcsr %[mxcsr]\n"
+                 "movq %[a], %%xmm0\n"
+                 "movq %[m], %%xmm1\n"
+                 "movq %[n], %%xmm2\n"
+                 /* xmm0 = xmm0 + xmm2 * xmm1 */
+                 "vfmadd231sd %%xmm1, %%xmm2, %%xmm0\n"
+                 "movq %%xmm0, %[r]\n"
+                 "stmxcsr %[mxcsr]\n"
+                 "ldmxcsr %[saved_mxcsr]\n"
+                 : [r] "=r" (r), [mxcsr] "=m" (mxcsr),
+                   [saved_mxcsr] "=m" (saved_mxcsr)
+                 : [n] "r" (n), [m] "r" (m), [a] "r" (a),
+                   [ftz_bit] "r" (ftz_bit)
+                 : "xmm0", "xmm1", "xmm2");
+    *mxcsr_p = mxcsr & 0x3f;
+    printf("vfmadd231sd 0x%" PRIx64 " 0x%" PRIx64 " 0x%" PRIx64
+           " = 0x%" PRIx64 " MXCSR flags 0x%" PRIx32 "\n",
+           n, m, a, r, *mxcsr_p);
+    return r;
+}
+
+typedef struct testdata {
+    /* Input n, m, a */
+    uint64_t n;
+    uint64_t m;
+    uint64_t a;
+    bool ftz;
+    /* Expected result */
+    uint64_t expected_r;
+    /* Expected low 6 bits of MXCSR (the Flag bits) */
+    uint32_t expected_mxcsr;
+} testdata;
+
+static testdata tests[] = {
+    { 0, 0x7ff0000000000000, 0x7ff000000000aaaa, false, /* 0 * Inf + SNaN */
+      0x7ff800000000aaaa, 1 }, /* Should be QNaN and does raise Invalid */
+    { 0, 0x7ff0000000000000, 0x7ff800000000aaaa, false, /* 0 * Inf + QNaN */
+      0x7ff800000000aaaa, 0 }, /* Should be QNaN and does *not* raise Invalid */
+    /*
+     * These inputs give a result which is tiny before rounding but which
+     * becomes non-tiny after rounding. x86 is a "detect tininess after
+     * rounding" architecture, so it should give a non-denormal result and
+     * not set the Underflow flag (only the Precision flag for an inexact
+     * result).
+     */
+    { 0x3fdfffffffffffff, 0x001fffffffffffff, 0x801fffffffffffff, false,
+      0x8010000000000000, 0x20 },
+    /*
+     * Flushing of denormal outputs to zero should also happen after
+     * rounding, so setting FTZ should not affect the result or the flags.
+     * QEMU currently does not emulate this correctly because we do the
+     * flush-to-zero check before rounding, so we incorrectly produce a
+     * zero result and set Underflow as well as Precision.
+     */
+#ifdef ENABLE_FAILING_TESTS
+    { 0x3fdfffffffffffff, 0x001fffffffffffff, 0x801fffffffffffff, true,
+      0x8010000000000000, 0x20 }, /* Enabling FTZ shouldn't change flags */
+#endif
+};
+
+int main(void)
+{
+    bool passed = true;
+    for (int i = 0; i < ARRAY_SIZE(tests); i++) {
+        uint32_t mxcsr;
+        uint64_t r = do_fmadd(tests[i].n, tests[i].m, tests[i].a,
+                              tests[i].ftz, &mxcsr);
+        if (r != tests[i].expected_r) {
+            printf("expected result 0x%" PRIx64 "\n", tests[i].expected_r);
+            passed = false;
+        }
+        if (mxcsr != tests[i].expected_mxcsr) {
+            printf("expected MXCSR flags 0x%x\n", tests[i].expected_mxcsr);
+            passed = false;
+        }
+    }
+    return passed ? 0 : 1;
+}
diff --git a/tests/tcg/x86_64/Makefile.target b/tests/tcg/x86_64/Makefile.target
index d6dff559c7d..be20fc64e88 100644
--- a/tests/tcg/x86_64/Makefile.target
+++ b/tests/tcg/x86_64/Makefile.target
@@ -18,6 +18,7 @@ X86_64_TESTS += adox
 X86_64_TESTS += test-1648
 X86_64_TESTS += test-2175
 X86_64_TESTS += cross-modifying-code
+X86_64_TESTS += fma
 TESTS=$(MULTIARCH_TESTS) $(X86_64_TESTS) test-x86_64
 else
 TESTS=$(MULTIARCH_TESTS)
-- 
2.48.1
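
The constants in the test table above are easier to read once decoded.
The following is a small portable sketch, not part of the patch: all
function and macro names are hypothetical, while the bit layouts (IEEE
binary64 fields, MXCSR flag bits 0..5, FTZ at bit 15) are the
architectural ones the test relies on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define F64_EXP_MASK   UINT64_C(0x7ff0000000000000)
#define F64_FRAC_MASK  UINT64_C(0x000fffffffffffff)
#define F64_QUIET_BIT  UINT64_C(0x0008000000000000)

/* MXCSR exception flags (low 6 bits) and the flush-to-zero control bit. */
enum {
    MXCSR_IE = 1 << 0,   /* Invalid */
    MXCSR_DE = 1 << 1,   /* Denormal operand */
    MXCSR_ZE = 1 << 2,   /* Divide by zero */
    MXCSR_OE = 1 << 3,   /* Overflow */
    MXCSR_UE = 1 << 4,   /* Underflow */
    MXCSR_PE = 1 << 5,   /* Precision (inexact) */
    MXCSR_FLAGS = 0x3f,
    MXCSR_FTZ = 1 << 15,
};

static bool f64_is_inf(uint64_t b)
{
    return (b & (F64_EXP_MASK | F64_FRAC_MASK)) == F64_EXP_MASK;
}

static bool f64_is_nan(uint64_t b)
{
    return (b & F64_EXP_MASK) == F64_EXP_MASK && (b & F64_FRAC_MASK) != 0;
}

static bool f64_is_qnan(uint64_t b)
{
    return f64_is_nan(b) && (b & F64_QUIET_BIT) != 0;
}

static bool f64_is_snan(uint64_t b)
{
    return f64_is_nan(b) && (b & F64_QUIET_BIT) == 0;
}

/* Mirror of the "andl $0xffff7fc0" / "orl ftz_bit" steps in the asm. */
static uint32_t mxcsr_prepare(uint32_t mxcsr, bool ftz)
{
    mxcsr &= ~(uint32_t)(MXCSR_FTZ | MXCSR_FLAGS);
    return mxcsr | (ftz ? MXCSR_FTZ : 0);
}

/* Classify the inputs and expected flag values used in the test table. */
static void check_test_vectors(void)
{
    assert(f64_is_inf(UINT64_C(0x7ff0000000000000)));
    assert(f64_is_snan(UINT64_C(0x7ff000000000aaaa)));
    assert(f64_is_qnan(UINT64_C(0x7ff800000000aaaa)));
    assert(MXCSR_IE == 1);     /* expected flags for the SNaN case */
    assert(MXCSR_PE == 0x20);  /* expected flags for the tininess case */
}
```

Read this way, the first two table entries differ only in the quiet
bit of the addend, and the third expects Precision alone, with no
Underflow.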




* [PULL 7/7] tcg/optimize: optimize TSTNE using smask and zmask
  2025-02-07 10:27 [PULL 0/7] Rust, TCG, x86 patches for 2025-02-07 Paolo Bonzini
                   ` (5 preceding siblings ...)
  2025-02-07 10:28 ` [PULL 6/7] tests/tcg/x86_64/fma: Test some x86 fused-multiply-add cases Paolo Bonzini
@ 2025-02-07 10:28 ` Paolo Bonzini
  6 siblings, 0 replies; 10+ messages in thread
From: Paolo Bonzini @ 2025-02-07 10:28 UTC (permalink / raw)
  To: qemu-devel

Generalize the existing optimization of "TSTNE x,sign" and "TSTNE x,-1".
This can be useful for example in the i386 frontend, which will generate
tests of zero-extended registers against 0xffffffff.

Ironically, on x86 hosts this is a very slight pessimization in the very
case it's meant to optimize because

 brcond_i64 cc_dst,$0xffffffff,tsteq,$L1

(test %ebx, %ebx) is 1 byte smaller than

 brcond_i64 cc_dst,$0x0,eq,$L1

(test %rbx, %rbx).  However, in general it is an improvement, especially
if it avoids placing a large immediate in the constant pool.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 tcg/optimize.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index 8c6303e3afa..bca11cc427b 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -766,6 +766,7 @@ static int do_constant_folding_cond1(OptContext *ctx, TCGOp *op, TCGArg dest,
                                      TCGArg *p1, TCGArg *p2, TCGArg *pcond)
 {
     TCGCond cond;
+    TempOptInfo *i1;
     bool swap;
     int r;
 
@@ -783,19 +784,21 @@ static int do_constant_folding_cond1(OptContext *ctx, TCGOp *op, TCGArg dest,
         return -1;
     }
 
+    i1 = arg_info(*p1);
+
     /*
      * TSTNE x,x -> NE x,0
-     * TSTNE x,-1 -> NE x,0
+     * TSTNE x,i -> NE x,0 if i includes all nonzero bits of x
      */
-    if (args_are_copies(*p1, *p2) || arg_is_const_val(*p2, -1)) {
+    if (args_are_copies(*p1, *p2) ||
+        (arg_is_const(*p2) && (i1->z_mask & ~arg_info(*p2)->val) == 0)) {
         *p2 = arg_new_constant(ctx, 0);
         *pcond = tcg_tst_eqne_cond(cond);
         return -1;
     }
 
-    /* TSTNE x,sign -> LT x,0 */
-    if (arg_is_const_val(*p2, (ctx->type == TCG_TYPE_I32
-                               ? INT32_MIN : INT64_MIN))) {
+    /* TSTNE x,i -> LT x,0 if i only includes sign bit copies */
+    if (arg_is_const(*p2) && (arg_info(*p2)->val & ~i1->s_mask) == 0) {
         *p2 = arg_new_constant(ctx, 0);
         *pcond = tcg_tst_ltge_cond(cond);
         return -1;
-- 
2.48.1
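
The two rewrite conditions in the patch above can be checked on plain
integers. This is a sketch under stated assumptions, not the TCG
optimizer: all function names are hypothetical, z_mask is taken to
have a bit set wherever x may be nonzero, and s_mask is assumed to
have a bit set wherever x is known to equal its sign bit (sign bit
included). The sketch also adds an explicit imm != 0 guard, since a
zero immediate can never make TSTNE true; how the real code excludes
that case is not visible in the hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* TSTNE x,imm is true when x and imm share at least one set bit. */
static bool tstne(uint64_t x, uint64_t imm)
{
    return (x & imm) != 0;
}

/* TSTNE x,imm -> NE x,0 when imm covers every possibly-set bit of x. */
static bool can_rewrite_to_ne_zero(uint64_t z_mask, uint64_t imm)
{
    return (z_mask & ~imm) == 0;
}

/* TSTNE x,imm -> LT x,0 when imm only selects copies of the sign bit. */
static bool can_rewrite_to_lt_zero(uint64_t s_mask, uint64_t imm)
{
    return imm != 0 && (imm & ~s_mask) == 0;
}

/* For an x that honours z_mask, the NE rewrite must not change the result. */
static void check_ne_rewrite(uint64_t x, uint64_t z_mask, uint64_t imm)
{
    assert((x & ~z_mask) == 0);
    if (can_rewrite_to_ne_zero(z_mask, imm)) {
        assert(tstne(x, imm) == (x != 0));
    }
}

/* The caller must pass an x whose bits in s_mask all equal its sign bit. */
static void check_lt_rewrite(int64_t x, uint64_t s_mask, uint64_t imm)
{
    if (can_rewrite_to_lt_zero(s_mask, imm)) {
        assert(tstne((uint64_t)x, imm) == (x < 0));
    }
}
```

The i386 case from the commit message corresponds to a zero-extended
value (z_mask 0xffffffff) tested against 0xffffffff, which satisfies
the first condition.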



