Re: [PATCH 04/13] target/hexagon: add v68 HVX IEEE float arithmetic insns

From: Taylor Simpson <ltaylorsimpson@gmail.com>
To: Matheus Tavares Bernardino <matheus.bernardino@oss.qualcomm.com>
Cc: qemu-devel@nongnu.org, brian.cain@oss.qualcomm.com, ale@rev.ng,
	 anjo@rev.ng, marco.liebel@oss.qualcomm.com, philmd@linaro.org,
	 quic_mburton@quicinc.com, sid.manning@oss.qualcomm.com
Subject: Re: [PATCH 04/13] target/hexagon: add v68 HVX IEEE float arithmetic insns
Date: Mon, 23 Mar 2026 14:28:48 -0600
Message-ID: <CAATN3NrZVn6W0pAyshZ7mhYVcY+gtRk9tJwz5WLLzxoPTDupwA@mail.gmail.com>
In-Reply-To: <831949008a7266559a6f313f99a394cd68cc9846.1774271525.git.matheus.bernardino@oss.qualcomm.com>

On Mon, Mar 23, 2026 at 7:15 AM Matheus Tavares Bernardino <matheus.bernardino@oss.qualcomm.com> wrote:

> Add HVX IEEE floating-point arithmetic instructions:
> - vmpy_sf_sf, vmpy_sf_hf, vmpy_hf_hf: multiply operations
> - vdmpy_sf_hf: dot-product multiply
> - vmpy_sf_hf_acc, vmpy_hf_hf_acc, vdmpy_sf_hf_acc: multiply-accumulate
> - vadd_sf_sf, vsub_sf_sf, vadd_sf_hf, vsub_sf_hf: add/sub with sf output
> - vadd_hf_hf, vsub_hf_hf: add/sub with hf output
>
> Signed-off-by: Matheus Tavares Bernardino <matheus.bernardino@oss.qualcomm.com>
> ---
>  target/hexagon/mmvec/kvx_ieee.h              | 47 ++++++++++
>  target/hexagon/mmvec/macros.h                |  1 +
>  target/hexagon/mmvec/mmvec.h                 |  2 +
>  target/hexagon/attribs_def.h.inc             |  4 +
>  target/hexagon/mmvec/kvx_ieee.c              | 87 ++++++++++++++++++
>  target/hexagon/hex_common.py                 |  1 +
>  target/hexagon/imported/mmvec/encode_ext.def | 18 ++++
>  target/hexagon/imported/mmvec/ext.idef       | 93 ++++++++++++++++++++
>  target/hexagon/meson.build                   |  1 +
>  9 files changed, 254 insertions(+)
>  create mode 100644 target/hexagon/mmvec/kvx_ieee.h
>  create mode 100644 target/hexagon/mmvec/kvx_ieee.c
>

I'm curious why the prefix is kvx instead of hvx.


>
> diff --git a/target/hexagon/mmvec/kvx_ieee.h b/target/hexagon/mmvec/kvx_ieee.h
> new file mode 100644
> index 0000000000..e92ddebeb9
> --- /dev/null
> +++ b/target/hexagon/mmvec/kvx_ieee.h
> @@ -0,0 +1,47 @@
> +/*
> + *  Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
> + *
> + *  SPDX-License-Identifier: GPL-2.0-or-later
> + */
> +
> +#ifndef HEXAGON_KVX_IEEE_H
> +#define HEXAGON_KVX_IEEE_H
> +
> +#include "fpu/softfloat.h"
> +
> +/* Hexagon canonical NaN */
> +#define FP32_DEF_NAN      0x7FFFFFFF
> +#define FP16_DEF_NAN      0x7FFF
>

These are the same as the scalar core, right?  If so, there's already a
call to set_float_default_nan_pattern in hexagon_cpu_reset_hold.

If the patterns are different, you'll need to call
set_float_default_nan_pattern before each scalar FP instruction and before
each HVX FP instruction.
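
For reference, here is a small standalone sketch of how an 8-bit default-NaN
pattern would expand into the two constants above. It assumes the convention
that bit 7 is the sign, bits 6..0 are the top fraction bits, and bit 0 is
replicated into the remaining fraction bits. expand_dnan, dnan32 and dnan16
are hypothetical helpers written for this mail, not QEMU functions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of QEMU): expand an 8-bit default-NaN
 * pattern into an IEEE bit pattern.
 * Assumed convention: bit 7 = sign, bits 6..0 = top seven fraction bits,
 * bit 0 replicated into all remaining fraction bits.
 */
static uint32_t expand_dnan(uint8_t pattern, int exp_bits, int frac_bits)
{
    assert(frac_bits >= 7 && exp_bits + frac_bits < 32);

    uint32_t sign = (pattern >> 7) & 1u;
    uint32_t exp = (1u << exp_bits) - 1u;          /* all-ones exponent */
    uint32_t frac = (uint32_t)(pattern & 0x7fu) << (frac_bits - 7);

    if (pattern & 1u) {
        frac |= (1u << (frac_bits - 7)) - 1u;      /* replicate bit 0 */
    }
    return (sign << (exp_bits + frac_bits)) | (exp << frac_bits) | frac;
}

/* float32: 8 exponent bits, 23 fraction bits */
static uint32_t dnan32(uint8_t pattern)
{
    return expand_dnan(pattern, 8, 23);
}

/* float16: 5 exponent bits, 10 fraction bits */
static uint16_t dnan16(uint8_t pattern)
{
    return (uint16_t)expand_dnan(pattern, 5, 10);
}
```

Under that assumed convention, the pattern 0x7f yields 0x7FFFFFFF and 0x7FFF
(the values in this header), while 0xff yields the sign-set variants
0xFFFFFFFF and 0xFFFF, so it is worth double-checking which one each unit
expects.
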


> +
> +/*
> + * IEEE - FP ADD/SUB/MPY instructions
> + */
> +uint32_t fp_mult_sf_sf(uint32_t a1, uint32_t a2, float_status *fp_status);
> +uint32_t fp_add_sf_sf(uint32_t a1, uint32_t a2, float_status *fp_status);
> +uint32_t fp_sub_sf_sf(uint32_t a1, uint32_t a2, float_status *fp_status);
> +
> +uint16_t fp_mult_hf_hf(uint16_t a1, uint16_t a2, float_status *fp_status);
> +uint16_t fp_add_hf_hf(uint16_t a1, uint16_t a2, float_status *fp_status);
> +uint16_t fp_sub_hf_hf(uint16_t a1, uint16_t a2, float_status *fp_status);
> +
> +uint32_t fp_mult_sf_hf(uint16_t a1, uint16_t a2, float_status *fp_status);
> +uint32_t fp_add_sf_hf(uint16_t a1, uint16_t a2, float_status *fp_status);
> +uint32_t fp_sub_sf_hf(uint16_t a1, uint16_t a2, float_status *fp_status);
> +
> +/*
> + * IEEE - FP Accumulate instructions
> + */
> +uint16_t fp_mult_hf_hf_acc(uint16_t a1, uint16_t a2, uint16_t acc,
> +                           float_status *fp_status);
> +uint32_t fp_mult_sf_hf_acc(uint16_t a1, uint16_t a2, uint32_t acc,
> +                           float_status *fp_status);
> +
> +/*
> + * IEEE - FP Reduce instructions
> + */
> +uint32_t fp_vdmpy(uint16_t a1, uint16_t a2, uint16_t a3, uint16_t a4,
> +                  float_status *fp_status);
> +uint32_t fp_vdmpy_acc(uint32_t acc, uint16_t a1, uint16_t a2, uint16_t a3,
> +                      uint16_t a4, float_status *fp_status);
> +
>

Consider using macros similar to the ones in the .c file to create these
protos.
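
Something along these lines is what I have in mind. This is only a sketch:
DECL_FP_INSN_2 and DECL_FP_INSN_3 are made-up names that mirror
DEF_FP_INSN_2/DEF_FP_INSN_3, and the float_status typedef is a stand-in so
the snippet compiles without "fpu/softfloat.h".

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Stand-in so the sketch compiles alone; the real type is in softfloat.h */
typedef struct float_status float_status;

/* Hypothetical prototype generators, mirroring DEF_FP_INSN_2/3 */
#define DECL_FP_INSN_2(name, rt, a1t, a2t) \
    uint##rt##_t fp_##name(uint##a1t##_t a1, uint##a2t##_t a2, \
                           float_status *fp_status);

#define DECL_FP_INSN_3(name, rt, a1t, a2t, a3t) \
    uint##rt##_t fp_##name(uint##a1t##_t a1, uint##a2t##_t a2, \
                           uint##a3t##_t a3, float_status *fp_status);

/* IEEE - FP ADD/SUB/MPY instructions */
DECL_FP_INSN_2(mult_sf_sf, 32, 32, 32)
DECL_FP_INSN_2(add_sf_sf, 32, 32, 32)
DECL_FP_INSN_2(sub_sf_sf, 32, 32, 32)

DECL_FP_INSN_2(mult_hf_hf, 16, 16, 16)
DECL_FP_INSN_2(add_hf_hf, 16, 16, 16)
DECL_FP_INSN_2(sub_hf_hf, 16, 16, 16)

DECL_FP_INSN_2(mult_sf_hf, 32, 16, 16)
DECL_FP_INSN_2(add_sf_hf, 32, 16, 16)
DECL_FP_INSN_2(sub_sf_hf, 32, 16, 16)

/* IEEE - FP Accumulate instructions */
DECL_FP_INSN_3(mult_hf_hf_acc, 16, 16, 16, 16)
DECL_FP_INSN_3(mult_sf_hf_acc, 32, 16, 16, 32)

/* Compile-time check that the generated return types have the right width */
static_assert(sizeof(fp_mult_sf_hf(0, 0, 0)) == sizeof(uint32_t),
              "hf x hf -> sf returns 32 bits");
```

fp_vdmpy and fp_vdmpy_acc take four hf inputs, so they would either stay as
hand-written prototypes or need one more macro.
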


> +#endif
> diff --git a/target/hexagon/mmvec/kvx_ieee.c b/target/hexagon/mmvec/kvx_ieee.c
> new file mode 100644
> index 0000000000..b763899aa3
> --- /dev/null
> +++ b/target/hexagon/mmvec/kvx_ieee.c
> @@ -0,0 +1,87 @@
> +/*
> + *  Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
> + *
> + *  SPDX-License-Identifier: GPL-2.0-or-later
> + */
> +
> +#include "qemu/osdep.h"
> +#include "kvx_ieee.h"
> +
> +#define DEF_FP_INSN_2(name, rt, a1t, a2t, op) \
> +    uint##rt##_t fp_##name(uint##a1t##_t a1, uint##a2t##_t a2, \
> +                           float_status *fp_status) { \
> +        float##a1t f1 = make_float##a1t(a1); \
> +        float##a2t f2 = make_float##a2t(a2); \
> +        \
> +        if (float##a1t##_is_any_nan(f1) || float##a2t##_is_any_nan(f2)) { \
> +            return FP##rt##_DEF_NAN; \
> +        } \
>

These nan checks shouldn't be needed if you're using QEMU softfloat
properly.
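
To illustrate with a toy model (none of this is QEMU code; toy_status,
toy_f32_mul and TOY_FP_INSN_2 are made-up stand-ins, and host float
arithmetic replaces softfloat): once the arithmetic layer itself returns the
default NaN for any NaN result, the wrapper needs no NaN checks of its own.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for a softfloat status word (hypothetical, not QEMU's) */
typedef struct {
    uint32_t default_nan32;   /* pattern returned for any NaN result */
} toy_status;

static float bits_to_f32(uint32_t b)
{
    float f;
    memcpy(&f, &b, sizeof(f));
    return f;
}

static uint32_t f32_to_bits(float f)
{
    uint32_t b;
    memcpy(&b, &f, sizeof(b));
    return b;
}

/* Toy multiply: host arithmetic plus NaN canonicalization in ONE place */
static uint32_t toy_f32_mul(uint32_t a, uint32_t b, const toy_status *s)
{
    assert(s != NULL);
    float r = bits_to_f32(a) * bits_to_f32(b);
    /* r != r is true only for NaN (NaN input or invalid operation) */
    return (r != r) ? s->default_nan32 : f32_to_bits(r);
}

/* Wrapper in the style of DEF_FP_INSN_2, with no explicit NaN checks */
#define TOY_FP_INSN_2(name, op) \
    static uint32_t toy_fp_##name(uint32_t a1, uint32_t a2, \
                                  const toy_status *s) \
    { \
        return op(a1, a2, s); \
    }

TOY_FP_INSN_2(mult_sf_sf, toy_f32_mul)
```

With default_nan32 set to 0x7FFFFFFF, a NaN operand and an invalid operation
such as inf * 0 both come back as 0x7FFFFFFF without the wrapper ever
testing for NaN.
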


> +        float##rt result = op; \
> +        \
> +        if (float##rt##_is_any_nan(result)) { \
> +            return FP##rt##_DEF_NAN; \
> +        } \
>

Ditto


> +        return result; \
> +    }
> +
> +#define DEF_FP_INSN_3(name, rt, a1t, a2t, a3t, op) \
> +    uint##rt##_t fp_##name(uint##a1t##_t a1, uint##a2t##_t a2, \
> +                           uint##a3t##_t a3, float_status *fp_status) { \
> +        float##a1t f1 = make_float##a1t(a1); \
> +        float##a2t f2 = make_float##a2t(a2); \
> +        float##a3t f3 = make_float##a3t(a3); \
> +        \
> +        if (float##a1t##_is_any_nan(f1) || float##a2t##_is_any_nan(f2) || \
> +            float##a3t##_is_any_nan(f3)) \
> +            return FP##rt##_DEF_NAN; \
>

Ditto


> +        \
> +        float##rt result = op; \
> +        \
> +        if (float##rt##_is_any_nan(result)) \
> +            return FP##rt##_DEF_NAN; \
>

Ditto


> +        return result; \
> +    }
> +
> diff --git a/target/hexagon/imported/mmvec/ext.idef b/target/hexagon/imported/mmvec/ext.idef
> index 03d31f6181..3f0d8e366e 100644
> --- a/target/hexagon/imported/mmvec/ext.idef
> +++ b/target/hexagon/imported/mmvec/ext.idef
> @@ -2895,9 +2895,102 @@ EXTINSN(V6_vprefixqw,"Vd32.w=prefixsum(Qv4)",
>  ATTRIBS(A_EXTENSION,A_CVI,A_CVI_
>      }
>      } )
>
> +/* KVX - IEEE FP Instructions */
>
> +/* Single pipe, 32-bit output */
> +#define ITERATOR_INSN_IEEE_FP_32(WIDTH,TAG,SYNTAX,DESCR,CODE) \
> +EXTINSN(V6_##TAG, SYNTAX, \
> +ATTRIBS(A_EXTENSION,A_HVX_IEEE_FP,A_CVI,A_CVI_VX,A_HVX_IEEE_FP_OUT_32), \
> +DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
>
> +/* Single pipe, 16-bit output */
> +#define ITERATOR_INSN_IEEE_FP_16(WIDTH,TAG,SYNTAX,DESCR,CODE) \
> +EXTINSN(V6_##TAG, SYNTAX, \
> +ATTRIBS(A_EXTENSION,A_HVX_IEEE_FP,A_CVI,A_CVI_VX,A_HVX_IEEE_FP_OUT_16), \
> +DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
>
> +/* Two pipes: P2 & P3, single output: P2, 32-bit output */
> +#define ITERATOR_INSN_IEEE_FP_DOUBLE_SINGLE_32(WIDTH,TAG,SYNTAX,DESCR,CODE) \
> +EXTINSN(V6_##TAG, SYNTAX, \
> +ATTRIBS(A_EXTENSION,A_HVX_IEEE_FP,A_CVI,A_CVI_VX_DV,A_HVX_IEEE_FP_OUT_32), \
> +DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
> +
> +/* Two pipes: P2 & P3, two outputs, 32-bit output */
> +#define ITERATOR_INSN_IEEE_FP_DOUBLE_32(WIDTH,TAG,SYNTAX,DESCR,CODE) \
> +EXTINSN(V6_##TAG, SYNTAX, \
> +ATTRIBS(A_EXTENSION,A_HVX_IEEE_FP,A_CVI,A_CVI_VX_DV,A_HVX_IEEE_FP_OUT_32), \
> +DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
> +
> +/*
> + * single pipe, accumulate instruction, produces 16-bit output, requires 16-bit
> + * accumulate input
> + */
> +#define ITERATOR_INSN_IEEE_FP_ACC_16(WIDTH,TAG,SYNTAX,DESCR,CODE) \
> +EXTINSN(V6_##TAG, SYNTAX, \
> +ATTRIBS(A_EXTENSION,A_HVX_IEEE_FP,A_CVI,A_CVI_VX,A_HVX_IEEE_FP_ACC,A_HVX_IEEE_FP_OUT_16,A_CVI_VX_NO_TMP_LD), \
> +DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
> +
> +/*
> + * single pipe, accumulate instruction, produces 32-bit output, requires 32-bit
> + * accumulate input
> + */
> +#define ITERATOR_INSN_IEEE_FP_ACC_32(WIDTH,TAG,SYNTAX,DESCR,CODE) \
> +EXTINSN(V6_##TAG, SYNTAX, \
> +ATTRIBS(A_EXTENSION,A_HVX_IEEE_FP,A_CVI,A_CVI_VX,A_HVX_IEEE_FP_ACC,A_HVX_IEEE_FP_OUT_32,A_CVI_VX_NO_TMP_LD), \
> +DESCR, DO_FOR_EACH_CODE(WIDTH, CODE))
> +
> +/* IEEE FP multiply instructions */
> +ITERATOR_INSN_IEEE_FP_DOUBLE_SINGLE_32(32, vmpy_sf_sf,
> +    "Vd32.sf=vmpy(Vu32.sf,Vv32.sf)", "Vector IEEE mul: sf",
> +    VdV.sf[i] = fp_mult_sf_sf(VuV.sf[i], VvV.sf[i], &env->fp_status))
>

Do these instructions interact with the FP bits in USR (e.g., rounding
mode, FP exceptions)?

If so, you'll need something similar to arch_fpop_start/arch_fpop_end at
the start/end of each helper.  This can be done in gen_helper_funcs.py.
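
Roughly, the bracketing I mean looks like the toy model below. Everything
here is hypothetical: the toy_env fields, the bit layout of "usr", and the
halving operation are invented for illustration and do not match the real
USR or the real arch_fpop_start/arch_fpop_end. The only point is the shape:
load the rounding mode and clear per-op flags on entry, fold the flags back
into the sticky register on exit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: names and bit layout are made up, not Hexagon's USR */
enum { TOY_FLAG_INEXACT = 1u << 0 };
enum { TOY_ROUND_DOWN = 0, TOY_ROUND_UP = 1 };

typedef struct {
    uint32_t usr;          /* bit 0: sticky inexact, bits 8..9: rounding */
    uint32_t fp_flags;     /* per-operation flags */
    uint32_t fp_rounding;  /* rounding mode used by the operation */
} toy_env;

/* Entry: clear per-op flags, load rounding mode from the control bits */
static void toy_fpop_start(toy_env *env)
{
    assert(env != NULL);
    env->fp_flags = 0;
    env->fp_rounding = (env->usr >> 8) & 3u;
}

/* Exit: fold per-op flags into the sticky bits */
static void toy_fpop_end(toy_env *env)
{
    assert(env != NULL);
    env->usr |= env->fp_flags & TOY_FLAG_INEXACT;
}

/* Example helper body: halve an integer, honoring the rounding mode */
static uint32_t toy_helper_half(toy_env *env, uint32_t a)
{
    toy_fpop_start(env);
    uint32_t result = a / 2;
    if (a & 1u) {
        env->fp_flags |= TOY_FLAG_INEXACT;   /* result was rounded */
        if (env->fp_rounding == TOY_ROUND_UP) {
            result += 1;
        }
    }
    toy_fpop_end(env);
    return result;
}
```

If the HVX instructions ignore USR entirely, none of this applies and the
helpers can stay as they are.
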

Thanks,
Taylor
