From: WANG Xuerui <i.qemu@xen0n.name>
To: Richard Henderson <richard.henderson@linaro.org>, qemu-devel@nongnu.org
Cc: "Peter Maydell" <peter.maydell@linaro.org>,
"Philippe Mathieu-Daudé" <f4bug@amsat.org>,
"Laurent Vivier" <laurent@vivier.eu>
Subject: Re: [PATCH v3 09/30] tcg/loongarch64: Implement tcg_out_mov and tcg_out_movi
Date: Fri, 24 Sep 2021 23:08:05 +0800
Message-ID: <7ca2e822-839f-96ab-9dc9-276565d03478@xen0n.name>
In-Reply-To: <5ace7b10-b7de-46e2-2021-01129024ffe2@linaro.org>

Hi Richard,

On 9/24/21 00:50, Richard Henderson wrote:
> On 9/22/21 11:09 AM, WANG Xuerui wrote:
>
> Following up on previous, I suggest:
>
>> +static void tcg_out_movi(TCGContext *s, TCGType type, TCGReg rd,
>> +                         tcg_target_long val)
>> +{
>> +    if (type == TCG_TYPE_I32) {
>> +        val = (int32_t)val;
>> +    }
>> +
>> +    /* Single-instruction cases. */
>> +    tcg_target_long low = sextreg(val, 0, 12);
>> +    if (low == val) {
>> +        /* val fits in simm12: addi.w rd, zero, val */
>> +        tcg_out_opc_addi_w(s, rd, TCG_REG_ZERO, val);
>> +        return;
>> +    }
>> +    if (0x800 <= val && val <= 0xfff) {
>> +        /* val fits in uimm12: ori rd, zero, val */
>> +        tcg_out_opc_ori(s, rd, TCG_REG_ZERO, val);
>> +        return;
>> +    }
>
>> +    /* Test for PC-relative values that can be loaded faster. */
>> +    intptr_t pc_offset = tcg_pcrel_diff(s, (void *)val);
>> +    if (pc_offset == sextreg(pc_offset, 0, 22) && (pc_offset & 3) == 0) {
>> +        tcg_out_opc_pcaddu2i(s, rd, pc_offset >> 2);
>> +        return;
>> +    }
>
>     /* Handle all 32-bit constants. */
>     if (val == (int32_t)val) {
>         tcg_out_opc_lu12i_w(s, rd, val >> 12);
>         if (low) {
>             tcg_out_opc_ori(s, rd, rd, val & 0xfff);
>         }
>         return;
>     }
>
>     /* Handle pc-relative values requiring 2 instructions. */
>     intptr_t pc_lo = sextract64(pc_offset, 0, 12);
>     intptr_t pc_hi = pc_offset - pc_lo;
>     if (pc_hi == (int32_t)pc_hi) {
>         tcg_out_opc_pcaddu12i(s, rd, pc_hi >> 12);
>         tcg_out_opc_addi_d(s, rd, rd, pc_lo);
>         return;
>     }
>
>     /*
>      * Choose signed low part if bit 12 is also set (that is, when
>      * bits 11 and 12 are both set, matching the 0x1800 mask), which
>      * gives us a chance of making more zeros.
>      * Otherwise, let low be unsigned.
>      */
>     if ((val & 0x1800) != 0x1800) {
>         low = val & 0xfff;
>     }
>     val -= low;
>
>     tcg_target_long hi20 = sextract64(val, 12, 20);
>     tcg_target_long hi32 = sextract64(val, 32, 20);
>     tcg_target_long hi52 = sextract64(val, 52, 12);
>
>     /*
>      * If we can use the sign-extension of a previous
>      * operation, suppress higher -1.
>      */
>     if (hi32 < 0 && hi52 == -1) {
>         hi52 = 0;
>     }
>     if (hi20 < 0 && hi32 == -1) {
>         hi32 = 0;
>     }
>
>     /* Initialize RD with the least non-zero component. */
>     if (hi20) {
>         /* hi20 is already the extracted 20-bit field, so no shift. */
>         tcg_out_opc_lu12i_w(s, rd, hi20);
>     } else if (hi32) {
>         /* CU32I_D is modify in place, so RD must be initialized. */
>         if (low < 0) {
>             tcg_out_opc_addi_w(s, rd, TCG_REG_ZERO, low);
>         } else {
>             tcg_out_opc_ori(s, rd, TCG_REG_ZERO, low);
>         }
>         low = 0;
>     } else {
>         tcg_out_opc_cu52i_d(s, rd, TCG_REG_ZERO, hi52);
>         hi52 = 0;
>     }
>
>     /* Assume that lu12i + ori are fusable */
>     if (low > 0) {
>         tcg_out_opc_ori(s, rd, rd, low);
>     }
>
>     /* Set the high 32 bits */
>     if (hi32) {
>         tcg_out_opc_cu32i_d(s, rd, hi32);
>     }
>     if (hi52) {
>         tcg_out_opc_cu52i_d(s, rd, rd, hi52);
>     }
>
>     /*
>      * Note that any subtraction must come last,
>      * because cu32i and cu52i overwrite high bits,
>      * and we have computed them as val - low.
>      */
>     if (low < 0) {
>         tcg_out_opc_addi_d(s, rd, rd, low);
>     }
>
> Untested, and all bugs are mine, of course.
>
> Try "qemu-system-ppc64 -D z -d in_asm,op_opt,out_asm".
> You should see some masking constants like
>
> ---- 000000001daf2898
> and_i64 CA,r9,$0x7fffffffffffffff dead: 2 pref=0xffff
>
> cu52i.d rd, zero, 0x800
> addi.d rd, rd, -1
>
> ---- 000000001db0775c
> mov_i64 r26,$0x300000002 sync: 0 dead: 0 1 pref=0xffff
>
> ori rd, zero, 2
> cu32i rd, 3
>
Oops, for some reason I only received this at about 8 pm. I'll take
advantage of Saturday to compare the generated code for these cases,
and hopefully incorporate some of the ideas you presented here.

Thanks for the detailed reply!
>
> r~