qemu-devel.nongnu.org archive mirror
From: WANG Xuerui <i.qemu@xen0n.name>
To: Richard Henderson <richard.henderson@linaro.org>, qemu-devel@nongnu.org
Cc: "Peter Maydell" <peter.maydell@linaro.org>,
	"Philippe Mathieu-Daudé" <f4bug@amsat.org>,
	"Laurent Vivier" <laurent@vivier.eu>
Subject: Re: [PATCH v3 09/30] tcg/loongarch64: Implement tcg_out_mov and tcg_out_movi
Date: Fri, 24 Sep 2021 23:08:05 +0800
Message-ID: <7ca2e822-839f-96ab-9dc9-276565d03478@xen0n.name>
In-Reply-To: <5ace7b10-b7de-46e2-2021-01129024ffe2@linaro.org>

Hi Richard,

On 9/24/21 00:50, Richard Henderson wrote:
> On 9/22/21 11:09 AM, WANG Xuerui wrote:
>
> Following up on previous, I suggest:
>
>> +static void tcg_out_movi(TCGContext *s, TCGType type, TCGReg rd,
>> +                         tcg_target_long val)
>> +{
>> +    if (type == TCG_TYPE_I32) {
>> +        val = (int32_t)val;
>> +    }
>> +
>> +    /* Single-instruction cases.  */
>> +    tcg_target_long low = sextreg(val, 0, 12);
>> +    if (low == val) {
>> +        /* val fits in simm12: addi.w rd, zero, val */
>> +        tcg_out_opc_addi_w(s, rd, TCG_REG_ZERO, val);
>> +        return;
>> +    }
>> +    if (0x800 <= val && val <= 0xfff) {
>> +        /* val fits in uimm12: ori rd, zero, val */
>> +        tcg_out_opc_ori(s, rd, TCG_REG_ZERO, val);
>> +        return;
>> +    }
>
>> +    /* Test for PC-relative values that can be loaded faster.  */
>> +    intptr_t pc_offset = tcg_pcrel_diff(s, (void *)val);
>> +    if (pc_offset == sextreg(pc_offset, 0, 22) && (pc_offset & 3) == 0) {
>> +        tcg_out_opc_pcaddu2i(s, rd, pc_offset >> 2);
>> +        return;
>> +    }
>
>     /* Handle all 32-bit constants. */
>     if (val == (int32_t)val) {
>         tcg_out_opc_lu12i_w(s, rd, val >> 12);
>         if (low) {
>             tcg_out_opc_ori(s, rd, rd, val & 0xfff);
>         }
>         return;
>     }
>
>     /* Handle pc-relative values requiring 2 instructions. */
>     intptr_t pc_lo = sextract64(pc_offset, 0, 12);
>     intptr_t pc_hi = pc_offset - pc_lo;
>     if (pc_hi == (int32_t)pc_hi) {
>         tcg_out_opc_pcaddu12i(s, rd, pc_hi >> 12);
>         tcg_out_opc_addi_d(s, rd, rd, pc_lo);
>         return;
>     }
>
>     /*
>      * Choose signed low part if bit 13 is also set,
>      * which gives us a chance of making more zeros.
>      * Otherwise, let low be unsigned.
>      */
>     if ((val & 0x1800) != 0x1800) {
>         low = val & 0xfff;
>     }
>     val -= low;
>
>     tcg_target_long hi20 = sextract64(val, 12, 20);
>     tcg_target_long hi32 = sextract64(val, 32, 20);
>     tcg_target_long hi52 = sextract64(val, 52, 12);
>
>     /*
>      * If we can use the sign-extension of a previous
>      * operation, suppress higher -1.
>      */
>     if (hi32 < 0 && hi52 == -1) {
>         hi52 = 0;
>     }
>     if (hi20 < 0 && hi32 == -1) {
>         hi32 = 0;
>     }
>
>     /* Initialize RD with the least non-zero component. */
>     if (hi20) {
>         tcg_out_opc_lu12i_w(s, rd, hi20);
>     } else if (hi32) {
>         /* CU32I_D is modify in place, so RD must be initialized. */
>         if (low < 0) {
>             tcg_out_opc_addi_w(s, rd, TCG_REG_ZERO, low);
>         } else {
>             tcg_out_opc_ori(s, rd, TCG_REG_ZERO, low);
>         }
>         low = 0;
>     } else {
>         tcg_out_opc_cu52i_d(s, rd, TCG_REG_ZERO, hi52);
>         hi52 = 0;
>     }
>
>     /* Assume that lu12i + ori are fusable */
>     if (low > 0) {
>         tcg_out_opc_ori(s, rd, rd, low);
>     }
>
>     /* Set the high 32 bits */
>     if (hi32) {
>         tcg_out_opc_cu32i_d(s, rd, hi32);
>     }
>     if (hi52) {
>         tcg_out_opc_cu52i_d(s, rd, rd, hi52);
>     }
>
>     /*
>      * Note that any subtraction must come last,
>      * because cu32i and cu52i overwrite high bits,
>      * and we have computed them as val - low.
>      */
>     if (low < 0) {
>         tcg_out_opc_addi_d(s, rd, rd, low);
>     }
>
> Untested, and all bugs are mine, of course.
>
> Try "qemu-system-ppc64 -D z -d in_asm,op_opt,out_asm".
> You should see some masking constants like
>
>  ---- 000000001daf2898
>  and_i64 CA,r9,$0x7fffffffffffffff        dead: 2  pref=0xffff
>
>   cu52i.d rd, zero, 0x800
>   addi.d  rd, rd, -1
>
>  ---- 000000001db0775c
>  mov_i64 r26,$0x300000002                 sync: 0  dead: 0 1 pref=0xffff
>
>   ori     rd, zero, 2
>   cu32i.d rd, 3
>
Oops, for some reason I only received this at about 8 pm... I'll of 
course take advantage of the Saturday and compare the generated code for 
the cases, hopefully incorporating some of your ideas presented here. 
Thanks for the detailed reply!
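
The two sample sequences quoted above can be replayed on any host with a small model of the instructions involved. This is only a sketch: the `model_*` names are made up for illustration and are not QEMU helpers. The semantics are assumed from the comments in the quoted code, namely that cu32i.d modifies rd in place (keeps bits 31..0, sets bits 63..32 to the sign-extended 20-bit immediate) and that cu52i.d keeps bits 51..0 of its source and sets bits 63..52.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side models; names are illustrative, not QEMU APIs. */

/* ori rd, rj, ui12: OR in a zero-extended 12-bit immediate. */
static uint64_t model_ori(uint64_t rj, uint32_t ui12)
{
    return rj | (ui12 & 0xfffu);
}

/* addi.d rd, rj, si12: 64-bit add of a sign-extended immediate. */
static uint64_t model_addi_d(uint64_t rj, int64_t si12)
{
    return rj + (uint64_t)si12;
}

/* cu32i.d rd, si20 (assumed): keep rd[31:0], set rd[63:32] = sext(si20). */
static uint64_t model_cu32i_d(uint64_t rd, int64_t si20)
{
    return ((uint64_t)si20 << 32) | (rd & 0xffffffffull);
}

/* cu52i.d rd, rj, si12 (assumed): keep rj[51:0], set rd[63:52] = si12. */
static uint64_t model_cu52i_d(uint64_t rj, int64_t si12)
{
    return ((uint64_t)si12 << 52) | (rj & 0x000fffffffffffffull);
}

/* Replays the two sequences from the quoted message. */
static void check_quoted_sequences(void)
{
    uint64_t rd;

    /* cu52i.d rd, zero, 0x800 ; addi.d rd, rd, -1 */
    rd = model_cu52i_d(0, 0x800);
    rd = model_addi_d(rd, -1);
    assert(rd == 0x7fffffffffffffffull);

    /* ori rd, zero, 2 ; cu32i.d rd, 3 */
    rd = model_ori(0, 2);
    rd = model_cu32i_d(rd, 3);
    assert(rd == 0x300000002ull);
}
```

Calling `check_quoted_sequences()` from a test program passes both assertions under these assumed semantics, which matches the constants shown in the op_opt dump.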
>
> r~
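
For comparing candidate sequences, the "reuse the sign-extension of a previous operation" idea can also be modelled directly on the host. The sketch below is a simplified variant, not the code quoted above: it always builds the sign-extended low 32 bits first, then applies cu32i.d and cu52i.d only when the corresponding part differs from what is already in the upper bits. All names are hypothetical, the cu32i.d/cu52i.d semantics are assumed as described in the quoted comments, and the pc-relative forms are left out. A value such as 0x80000000 is worth including in any comparison, since lu12i.w sign-extends into the upper half even though both upper parts of that constant are zero.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the len-bit field of v starting at bit pos (pos + len <= 64).
 * Relies on arithmetic right shift of signed values, as QEMU itself does. */
static int64_t sext_field(uint64_t v, unsigned pos, unsigned len)
{
    return (int64_t)(v << (64 - pos - len)) >> (64 - len);
}

/* Sign-extend a 32-bit result to 64 bits, as the .w instructions do. */
static uint64_t sext32(uint32_t w)
{
    return (uint64_t)(int64_t)(int32_t)w;
}

/*
 * Hypothetical model: computes the register value a movi sequence would
 * produce for val and returns how many instructions it used.
 */
static int movi_model(int64_t val, uint64_t *out)
{
    uint64_t u = (uint64_t)val, rd;
    int32_t w = (int32_t)u;                 /* low 32 bits of the constant */
    int64_t hi32 = sext_field(u, 32, 20);   /* bits 51..32 */
    int64_t hi52 = sext_field(u, 52, 12);   /* bits 63..52 */
    int n = 1;

    if (w == sext_field(u, 0, 12)) {
        rd = sext32((uint32_t)w);               /* addi.w rd, zero, simm12 */
    } else if (((uint32_t)w >> 12) == 0) {
        rd = (uint32_t)w;                       /* ori rd, zero, uimm12 */
    } else {
        rd = sext32((uint32_t)w & ~0xfffu);     /* lu12i.w rd, hi20 */
        if (w & 0xfff) {
            rd |= (uint32_t)w & 0xfffu;         /* ori rd, rd, lo12 */
            n++;
        }
    }

    /* At this point bits 63..32 of rd are copies of bit 31. */
    int64_t upper = w < 0 ? -1 : 0;

    if (hi32 != upper) {
        rd = ((uint64_t)hi32 << 32) | (rd & 0xffffffffull);    /* cu32i.d */
        upper = hi32 < 0 ? -1 : 0;      /* bits 63..52 now follow hi32 */
        n++;
    }
    if (hi52 != upper) {
        rd = ((uint64_t)hi52 << 52) | (rd & 0x000fffffffffffffull); /* cu52i.d */
        n++;
    }

    *out = rd;
    return n;
}
```

With this model, `movi_model(0x80000000, &r)` leaves 0x80000000 in `r` using two modelled instructions (lu12i.w followed by cu32i.d), and the two constants from the dump above each take two.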

