From: Richard Henderson <richard.henderson@linaro.org>
To: Paolo Bonzini <pbonzini@redhat.com>, qemu-devel@nongnu.org
Subject: Re: [PATCH for-9.1 09/19] target/i386: move 60-BF opcodes to new decoder
Date: Wed, 10 Apr 2024 18:12:35 -1000
Message-ID: <f211d5d7-9d0f-455a-97c5-d2c09d600bcb@linaro.org>
In-Reply-To: <20240409164323.776660-10-pbonzini@redhat.com>
On 4/9/24 06:43, Paolo Bonzini wrote:
> +static void gen_ARPL(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
> +{
> + TCGLabel *label1 = gen_new_label();
> + TCGv rpl_adj = tcg_temp_new();
> + TCGv flags = tcg_temp_new();
> +
> + gen_mov_eflags(s, flags);
> + tcg_gen_andi_tl(flags, flags, ~CC_Z);
> +
> + /* Compute dest[rpl] - src[rpl], adjust if result <0. */
> + tcg_gen_andi_tl(rpl_adj, s->T0, 3);
> + tcg_gen_andi_tl(s->T1, s->T1, 3);
> + tcg_gen_sub_tl(rpl_adj, rpl_adj, s->T1);
> +
> + tcg_gen_brcondi_tl(TCG_COND_LT, rpl_adj, 0, label1);
Comment is right, but branch condition is wrong.
I think this might be better as:

    /* SRC = DST with SRC[RPL] */
    tcg_gen_deposit_tl(s->T1, s->T0, s->T1, 0, 2);
    /* Z flag set if DST < SRC */
    tcg_gen_setcond_tl(TCG_COND_LTU, tmp, s->T0, s->T1);
    /* Install Z */
    tcg_gen_deposit_tl(flags, flags, tmp, ctz(CC_Z), 1);
    /* DST with maximum RPL */
    tcg_gen_umax_tl(s->T0, s->T0, s->T1);

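For illustration only, here is a plain C model of the deposit / setcond(LTU) / umax sequence above, checked against the usual description of ARPL (if the RPL of DST is below the RPL of SRC, raise it and set ZF). The function names and the self-check helper are hypothetical and are not QEMU code; the TCG ops are only mimicked with ordinary integer arithmetic on 16-bit selectors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Reference semantics: if DST.RPL < SRC.RPL, copy SRC.RPL into DST, set ZF. */
static uint16_t arpl_ref(uint16_t dst, uint16_t src, bool *zf)
{
    if ((dst & 3) < (src & 3)) {
        *zf = true;
        return (uint16_t)((dst & ~3u) | (src & 3u));
    }
    *zf = false;
    return dst;
}

/* Branch-free model of the suggested sequence (an assumption-laden sketch). */
static uint16_t arpl_branchless(uint16_t dst, uint16_t src, bool *zf)
{
    /* deposit: T1 = DST with its low two bits replaced by SRC[RPL] */
    uint16_t t1 = (uint16_t)((dst & ~3u) | (src & 3u));
    /* setcond LTU: upper bits are equal, so this compares only the RPLs */
    *zf = dst < t1;
    /* umax: DST with the larger of the two RPLs */
    return dst > t1 ? dst : t1;
}

/* Hypothetical self-check: count disagreements between the two models. */
static unsigned arpl_mismatches(void)
{
    static const uint16_t srcs[] = {
        0, 1, 2, 3, 0xfffc, 0xfffd, 0xfffe, 0xffff
    };
    unsigned bad = 0;

    for (uint32_t d = 0; d <= 0xffff; d++) {
        for (unsigned i = 0; i < sizeof(srcs) / sizeof(srcs[0]); i++) {
            bool z1, z2;
            uint16_t r1 = arpl_ref((uint16_t)d, srcs[i], &z1);
            uint16_t r2 = arpl_branchless((uint16_t)d, srcs[i], &z2);
            bad += (r1 != r2) || (z1 != z2);
        }
    }
    return bad;
}
```

Because the deposit makes every bit above the RPL identical in both operands, a single unsigned comparison is enough to decide ZF, and the unsigned maximum picks the adjusted selector without a branch.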
> + case MO_32:
> +#ifdef TARGET_X86_64
> + /*
> + * This could also use the same algorithm as MO_16. It produces fewer
> + * TCG ops and better code if flags are needed, but it requires a 64-bit
> + * multiply even if they are not (and thus the high part of the multiply
> + * is dead).
> + */
Is 64-bit multiply ever slower these days?
My intuition says "slow" multiply is at least a decade out of date.
> + tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
> + tcg_gen_trunc_tl_i32(s->tmp3_i32, s->T1);
Avoid s->tmp*, especially in new code.
> + tcg_gen_muls2_i32(s->tmp2_i32, s->tmp3_i32,
> + s->tmp2_i32, s->tmp3_i32);
> + tcg_gen_extu_i32_tl(s->T0, s->tmp2_i32);
> +
> + cc_src_rhs = tcg_temp_new();
> + tcg_gen_extu_i32_tl(cc_src_rhs, s->tmp3_i32);
> + /* Compare the high part to the sign bit of the truncated result */
> + tcg_gen_negsetcondi_i32(TCG_COND_LT, s->tmp2_i32, s->tmp2_i32, 0);
This seems like something the optimizer should handle, but doesn't.
I'd write this as

    tcg_gen_sari_i32(tmp, tmp, 31);

or

    tcg_gen_sextract_i32(tmp, tmp, 31, 1);

which I know will expand to the same thing.
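As a sanity check of that equivalence, the sketch below models the three forms (negated "less than zero" setcond, arithmetic shift right by 31, and a 1-bit signed extract at bit 31) in plain C. The helper names are hypothetical and only imitate what the TCG ops compute; results are returned as uint32_t bit patterns to avoid relying on implementation-defined signed shifts.

```c
#include <stdint.h>

/* Model of negsetcond(LT, x, 0): all ones if x is negative, else zero. */
static uint32_t neg_setcond_lt0(int32_t x)
{
    return x < 0 ? UINT32_MAX : 0;
}

/* Model of an arithmetic shift right by n, for 0 <= n <= 31. */
static uint32_t sar32_model(int32_t x, unsigned n)
{
    uint32_t u = (uint32_t)x;
    /* Replicate the sign bit into the n vacated high bits. */
    uint32_t fill = (u >> 31) ? ~(UINT32_MAX >> n) : 0;
    return (u >> n) | fill;
}

/* Model of a signed bitfield extract; requires len >= 1, pos + len <= 32. */
static uint32_t sextract32_model(int32_t x, unsigned pos, unsigned len)
{
    uint32_t u = (uint32_t)x;
    uint32_t mask = len == 32 ? UINT32_MAX : (UINT32_C(1) << len) - 1;
    uint32_t field = (u >> pos) & mask;
    uint32_t sign = UINT32_C(1) << (len - 1);
    /* Standard xor/subtract trick to sign-extend a len-bit field. */
    return (field ^ sign) - sign;
}
```

For every 32-bit input, neg_setcond_lt0(x), sar32_model(x, 31) and sextract32_model(x, 31, 1) produce the same all-zeros or all-ones mask.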
> + case MO_64:
> +#endif
> + cc_src_rhs = tcg_temp_new();
> + tcg_gen_muls2_tl(s->T0, cc_src_rhs, s->T0, s->T1);
> + /* Compare the high part to the sign bit of the truncated result */
> + tcg_gen_negsetcondi_tl(TCG_COND_LT, s->T1, s->T0, 0);
Similarly.
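
For readers following along, the check both quoted hunks perform can be sketched in plain C, as I read them: take the double-width signed product, then compare its high half with the sign fill of the truncated low half; the two differ exactly when the product does not fit in the narrow type. The function name and the output parameter are hypothetical, and this is only a model of the 32-bit case, not QEMU code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of a 32x32 signed multiply with a truncation check.
 * Stores the low 32 bits of the product in *lo_out and returns true
 * when the full product does not fit in a signed 32-bit value.
 */
static bool imul32_overflows(int32_t a, int32_t b, uint32_t *lo_out)
{
    int64_t wide = (int64_t)a * (int64_t)b;
    uint32_t lo = (uint32_t)(uint64_t)wide;
    uint32_t hi = (uint32_t)((uint64_t)wide >> 32);
    /* Sign of the truncated result, replicated across all 32 bits. */
    uint32_t sign_fill = (lo >> 31) ? UINT32_MAX : 0;

    *lo_out = lo;
    /* No overflow iff the high half is just the sign extension of lo. */
    return hi != sign_fill;
}
```

For example, 3 * 4 and -1 * 1 leave the high half equal to the sign fill, while 0x10000 * 0x10000 and INT32_MIN * -1 do not.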
r~