Re: [Qemu-devel] [PATCH 3/3] tcg/aarch64: implement new TCG target for aarch64

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Claudio Fontana <claudio.fontana@huawei.com>
To: Richard Henderson <rth@twiddle.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	qemu-devel@nongnu.org, Peter Maydell <peter.maydell@linaro.org>
Subject: Re: [Qemu-devel] [PATCH 3/3] tcg/aarch64: implement new TCG target for aarch64
Date: Tue, 14 May 2013 16:05:43 +0200	[thread overview]
Message-ID: <519244B7.4090207@huawei.com> (raw)
In-Reply-To: <519143B8.1030109@twiddle.net>

On 13.05.2013 21:49, Richard Henderson wrote:
> On 05/13/2013 06:33 AM, Claudio Fontana wrote:
>> +enum aarch64_cond_code {
>> +    COND_EQ = 0x0,
>> +    COND_NE = 0x1,
>> +    COND_CS = 0x2,	/* Unsigned greater or equal */
>> +    COND_HS = 0x2,      /* ALIAS greater or equal */
> 
> Clearer to define aliases as COND_HS = COND_CS.

I agree, will change.

>> +static inline void tcg_out_movi64(TCGContext *s, int rd, uint64_t value)
>> +{
>> +    uint32_t half, base, movk = 0, shift = 0;
>> +    if (!value) {
>> +        tcg_out_movr(s, 1, rd, TCG_REG_XZR);
>> +        return;
>> +    }
>> +    /* construct halfwords of the immediate with MOVZ with LSL */
>> +    /* using MOVZ 0x52800000 | extended reg.. */
>> +    base = 0xd2800000;
>> +
>> +    while (value) {
>> +        half = value & 0xffff;
>> +        if (half) {
>> +            /* Op can be MOVZ or MOVK */
>> +            tcg_out32(s, base | movk | shift | half << 5 | rd);
>> +            if (!movk)
>> +                movk = 0x20000000; /* morph next MOVZs into MOVKs */
>> +        }
>> +        value >>= 16;
>> +        shift += 0x00200000;
> 
> You'll almost certainly want to try ADP+ADD before decomposing into 3-4 mov[zk]
> instructions.

I don't know the ADP instruction; but this movi can be improved.
Right now it needs 1 to 4 mov[zk] instructions, depending on the value:
it depends on how many 0000h 16 bit holes there are in the 64bit value.
I was thinking a successive patch to experiment with this.

> 
>> +void aarch64_tb_set_jmp_target(uintptr_t jmp_addr, uintptr_t addr)
>> +{
>> +    tcg_target_long target, offset;
>> +    target = (tcg_target_long)addr;
>> +    offset = (target - (tcg_target_long)jmp_addr) / 4;
>> +
>> +    if (offset <= -0x02000000 || offset >= 0x02000000) {
>> +        /* out of 26bit range */
>> +        tcg_abort();
>> +    }
> 
> See MAX_CODE_GEN_BUFFER_SIZE in translate-all.c.  Set this value to 128MB and
> then all cross-TB branches will be in range, and the abort won't trigger.

That's great, I missed that. I will add a change to that end in translate-all.c .

> 
>> +static inline void tcg_out_goto_label_cond(TCGContext *s, TCGCond c, int label_index)
>> +{
>> +    tcg_target_long offset;
>> +    /* backward conditional jump never seems to happen in practice,
>> +       so just always use the branch trampoline */
>> +    c = tcg_invert_cond(c);
>> +    offset = 2; /* skip current instr and the next */
>> +    tcg_out_goto_cond(s, c, offset);
>> +    tcg_out_goto_label(s, label_index); /* emit 26bit jump */
>> +}
> 
> Conditional branch range is +-1MB.  You'll never see a TB that large.  You
> don't need to emit a branch-across-branch.

Is there maybe a way to do it right even in the corner case where we have
a huge list of hundreds of thousands of instructions without jumps and then a conditional jump?
Are we _guaranteed_ to never see that large a TB with some kind of define,
similarly to MAX_CODE_GEN_BUFFER_SIZE?

I know, it's a corner case that would only trigger in a very strange program, but
I would propose to add this topic to the TODO list for successive patches.

> 
>> +    /* Should generate something like the following:
>> +     *  shr x8, addr_reg, #TARGET_PAGE_BITS
>> +     *  and x0, x8, #(CPU_TLB_SIZE - 1)   @ Assumption: CPU_TLB_BITS <= 8
>> +     *  add x0, env, x0 lsl #CPU_TLB_ENTRY_BITS
>> +     */
>> +#  if CPU_TLB_BITS > 8
>> +#   error "CPU_TLB_BITS too large"
>> +#  endif
> 
> I wonder if using UBFM to extract the TLB bits and BFM with XZR to clear the
> middle bits wouldn't be better, as you wouldn't be restricted on the size of
> CPU_TLB_BITS.  AFAICS it would be the same number of instructions.

Hmm..

> 
>> +    case INDEX_op_mov_i64: ext = 1;
>> +    case INDEX_op_mov_i32:
>> +        tcg_out_movr(s, ext, args[0], args[1]);
>> +        break;
> 
> See how the i386 backend uses macros to reduce the typing with these sorts of
> paired opcodes.

I saw the glue thing, and I try to stay away from that kind of preprocessor use,
as it makes it more difficult for newcomers to dig in, since it breaks gid / grepping
for symbols among other things.
I used instead the editor and regexps to generate the list.
Maybe this could be patched up later if that seems to be the consensus?

> 
>> +    case INDEX_op_rotl_i64: ext = 1;
>> +    case INDEX_op_rotl_i32:     /* same as rotate right by (32 - m) */
>> +        if (const_args[2])      /* ROR / EXTR Wd, Wm, Wm, 32 - m */
>> +            tcg_out_rotl(s, ext, args[0], args[1], args[2]);
>> +        else { /* no RSB in aarch64 unfortunately. */
>> +            /* XXX UNTESTED */
>> +            tcg_out_movi32(s, ext, TCG_REG_X8, ext ? 64 : 32);
> 
> But A64 does have shift counts that truncate to the width of the operation.
> Which means that the high bits may contain garbage, which means that you can
> compute this merely as ROR = -ROL, ignoring the 32/64.

I see.

> 
>> +    case INDEX_op_setcond_i64: ext = 1;
>> +    case INDEX_op_setcond_i32:
>> +        tcg_out_movi32(s, ext, TCG_REG_X8, 0x01);
>> +        tcg_out_cmp(s, ext, args[1], args[2]);
>> +        tcg_out_csel(s, ext, args[0], TCG_REG_X8, TCG_REG_XZR,
>> +                     tcg_cond_to_aarch64_cond[args[3]]);
> 
> See CSINC Wd,Wzr,Wzr,cond.  No need for the initial movi.

Yes, that was mentioned also by Peter, will change.

> 
>> +    tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I32], 0, 0xffff);
>> +    tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I64], 0, 0xffff);
> 
> Only half of your registers are marked available.

Oops, will fix. Thanks,

-- 
Claudio Fontana

next prev parent reply	other threads:[~2013-05-14 14:06 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-03-14 15:57 [Qemu-devel] QEMU aarch64 TCG target Claudio Fontana
2013-03-14 16:16 ` Peter Maydell
2013-05-06 12:56   ` [Qemu-devel] QEMU aarch64 TCG target - testing question about x86-64 Claudio Fontana
2013-05-06 13:27     ` Paolo Bonzini
2013-05-13 13:22       ` [Qemu-devel] [PATCH 0/3] ARM aarch64 TCG target Claudio Fontana
2013-05-13 13:28         ` [Qemu-devel] [PATCH 1/3] configure: permit compilation on arm aarch64 Claudio Fontana
2013-05-13 18:29           ` Peter Maydell
2013-05-14  8:19             ` Claudio Fontana
2013-05-13 13:31         ` [Qemu-devel] [PATCH 2/3] include/elf.h: add aarch64 ELF machine and relocs Claudio Fontana
2013-05-13 18:34           ` Peter Maydell
2013-05-14  8:24             ` Claudio Fontana
2013-05-13 13:33         ` [Qemu-devel] [PATCH 3/3] tcg/aarch64: implement new TCG target for aarch64 Claudio Fontana
2013-05-13 18:28           ` Peter Maydell
2013-05-14 12:01             ` Claudio Fontana
2013-05-14 12:25               ` Peter Maydell
2013-05-14 15:19                 ` Richard Henderson
2013-05-16 14:39                   ` Claudio Fontana
2013-05-14 12:41               ` Laurent Desnogues
2013-05-13 19:49           ` Richard Henderson
2013-05-14 14:05             ` Claudio Fontana [this message]
2013-05-14 15:16               ` Richard Henderson
2013-05-14 16:26                 ` Richard Henderson
2013-05-06 13:42     ` [Qemu-devel] QEMU aarch64 TCG target - testing question about x86-64 Peter Maydell
2013-05-23  8:09 ` [Qemu-devel] [PATCH 0/4] ARM aarch64 TCG target VERSION 2 Claudio Fontana
2013-05-23  8:14   ` [Qemu-devel] [PATCH 1/4] include/elf.h: add aarch64 ELF machine and relocs Claudio Fontana
2013-05-23 13:18     ` Peter Maydell
2013-05-28  8:09     ` Laurent Desnogues
2013-05-23  8:18   ` [Qemu-devel] [PATCH 2/4] tcg/aarch64: implement new TCG target for aarch64 Claudio Fontana
2013-05-23 16:29     ` Richard Henderson
2013-05-24  8:53       ` Claudio Fontana
2013-05-24 17:02         ` Richard Henderson
2013-05-24 17:08           ` Peter Maydell
2013-05-24 17:17             ` Richard Henderson
2013-05-24 17:28               ` Peter Maydell
2013-05-24 17:54                 ` Richard Henderson
2013-05-27 11:43           ` Claudio Fontana
2013-05-27 18:47             ` Richard Henderson
2013-05-27 21:14               ` [Qemu-devel] [PATCH 3/3] " Laurent Desnogues
2013-05-28 13:01                 ` Claudio Fontana
2013-05-28 13:09                   ` Laurent Desnogues
2013-05-28  7:17               ` [Qemu-devel] [PATCH 2/4] " Claudio Fontana
2013-05-28 14:52                 ` Richard Henderson
2013-05-23 16:39     ` Peter Maydell
2013-05-24  8:51       ` Claudio Fontana
2013-05-27  9:10         ` Claudio Fontana
2013-05-27 10:40           ` Peter Maydell
2013-05-27 17:05           ` Richard Henderson
2013-05-27  9:47     ` Laurent Desnogues
2013-05-27 10:13       ` Claudio Fontana
2013-05-27 10:28         ` Laurent Desnogues
2013-05-28 13:14     ` Laurent Desnogues
2013-05-28 14:37       ` Claudio Fontana
2013-05-23  8:19   ` [Qemu-devel] [PATCH 3/4] configure: permit compilation on arm aarch64 Claudio Fontana
2013-05-23 13:24     ` Peter Maydell
2013-05-23  8:22   ` [Qemu-devel] [PATCH 4/4] tcg/aarch64: more ops in preparation of tlb lookup Claudio Fontana
2013-05-23 12:37   ` [Qemu-devel] [PATCH 0/4] ARM aarch64 TCG target VERSION 2 Andreas Färber
2013-05-23 12:50     ` Peter Maydell
2013-05-23 12:53       ` Andreas Färber
2013-05-23 13:03         ` Peter Maydell
2013-05-23 13:27           ` Claudio Fontana

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=519244B7.4090207@huawei.com \
    --to=claudio.fontana@huawei.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=rth@twiddle.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).