Re: [Qemu-devel] [PATCH 09/10 v11] target-tilegx: Generate tcg instructions to finish "Hello world"

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Richard Henderson <rth@twiddle.net>
To: "Chen Gang" <xili_gchen_5257@hotmail.com>,
	"Peter Maydell" <peter.maydell@linaro.org>,
	"Andreas Färber" <afaerber@suse.de>,
	"Chris Metcalf" <cmetcalf@ezchip.com>
Cc: "walt@tilera.com" <walt@tilera.com>,
	Riku Voipio <riku.voipio@iki.fi>,
	qemu-devel <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] [PATCH 09/10 v11] target-tilegx: Generate tcg instructions to finish "Hello world"
Date: Mon, 01 Jun 2015 11:40:29 -0700	[thread overview]
Message-ID: <556CA71D.4090208@twiddle.net> (raw)
In-Reply-To: <BLU436-SMTP68C6610B03B00ADDE5679FB9C80@phx.gbl>

First, what happened to the decoding skeleton patch?  You seem to have merged
it with patch 9 here.  That said, see the bottom of this message.

On 05/30/2015 02:18 PM, Chen Gang wrote:
> +/* mfspr can be only in X1 pipe, so it doesn't need to be bufferd */
> +static void gen_mfspr(struct DisasContext *dc, uint8_t rdst, uint16_t imm14)

I'm not keen on this as a comment.  Clearly it could be buffered, with what is
implemented here now.  But there are plenty of SPRs for which produce side
effects, and *cannot* be buffered.

Adjust the comment to

/* Many SPR reads have side effects and cannot be buffered.  However, they are
   all in the X1 pipe, which we are executing last, therefore we need not do
   additional buffering.  */

> +/* mtspr can be only in X1 pipe, so it doesn't need to be bufferd */

Same, but s/reads/writes/.

> +#if 1

Do not include this.

> +/*
> + * uint64_t output = 0;
> + * uint32_t counter;
> + * for (counter = 0; counter < (WORD_SIZE / BYTE_SIZE); counter++)
> + * {
> + *     int8_t srca = getByte (rf[SrcA], counter);
> + *     int8_t srcb = signExtend8 (Imm8);
> + *     output = setByte (output, counter, ((srca == srcb) ? 1 : 0));
> + * }
> + * rf[Dest] = output;
> + */
> +static void gen_v1cmpeqi(struct DisasContext *dc,
> +                         uint8_t rdst, uint8_t rsrc, int8_t imm8)

Pass in the condition to use, since you'll eventually need to implement
v1cmpltsi, v1cmpltui.

> +static void gen_v1cmpeq(struct DisasContext *dc,
> +                        uint8_t rdst, uint8_t rsrc, uint8_t rsrcb)

Likewise for v1cmples, v1cmpleu, v1cmplts, v1cmpltu, v1cmpne.

> +    tcg_gen_movi_i64(vdst, 0); /* or Assertion `ts->val_type == TEMP_VAL_REG' */

These comments are unnecessary.  Of course it's illegal to use an uninitialized
temporary.

> +static void gen_v4int_l(struct DisasContext *dc,
> +                        uint8_t rdst, uint8_t rsrc, uint8_t rsrcb)
> +{
> +    qemu_log_mask(CPU_LOG_TB_IN_ASM, "v4int_l r%d, r%d, r%d\n",
> +                  rdst, rsrc, rsrcb);
> +    tcg_gen_deposit_i64(dest_gr(dc, rdst), load_gr(dc, rsrc),
> +                        load_gr(dc, rsrcb), 0, 32);

This is incorrect.  This produces { A1, B0 }, not { A0, B0 }.

As I said, you did want "32, 32" as the field insert, but you have the source
operands in the wrong order.

> +static void gen_addx(struct DisasContext *dc,
> +                     uint8_t rdst, uint8_t rsrc, uint8_t rsrcb)
> +{
> +    TCGv vdst = dest_gr(dc, rdst);
> +
> +    /* High bits have no effect with low bits, so addx and addxsc are merged. */
> +    qemu_log_mask(CPU_LOG_TB_IN_ASM, "addx(sc) r%d, r%d, r%d\n",
> +                  rdst, rsrc, rsrcb);

Um, no, addxsc does signed saturation before sign extension.

> +static void gen_mul_u_u(struct DisasContext *dc,
> +                        uint8_t rdst, uint8_t rsrc, uint8_t rsrcb,
> +                        int8 high, int8 highb, int8 add, const char *code)

A better name for this function is warranted, since it does much more than
mul_u_u.  The add parameter should be type bool.

Given the existence of mul_hs_hs, mul_hu_ls, etc, you're probably better off
passing in extraction functions.  E.g.

static void ext32s_high(TCGv d, TCGv s)
{
    tcg_gen_sari_i64(d, s, 32);
}

static void ext32u_high(TCGv d, TCGv s)
{
    tcg_gen_shri_i64(d, s, 32);
}

  gen_mul(dc, rdst, rsrc, rsrcb, ext32s_high, ext32s_high,
          false, "mul_hs_hs");
  gen_mul(dc, rdst, rsrc, rsrcb, ext32s_high, ext32u_high,
          false, "mul_hs_hu");
  gen_mul(dc, rdst, rsrc, rsrcb, ext32s_high, tcg_gen_ext32s_i64,
          false, "mul_hs_ls");
  gen_mul(dc, rdst, rsrc, rsrcb, ext32s_high, tcg_gen_ext32u_i64,
          false, "mul_hs_lu");

  gen_mul(dc, rdst, rsrc, rsrcb, ext32u_high, ext32u_high,
          false, "mul_hu_hu");
  gen_mul(dc, rdst, rsrc, rsrcb, ext32u_high, tcg_gen_ext32s_i64,
          false, "mul_hu_ls");
  gen_mul(dc, rdst, rsrc, rsrcb, ext32u_high, tcg_gen_ext32u_i64,
          false, "mul_hu_lu");

  gen_mul(dc, rdst, rsrc, rsrcb, tcg_gen_ext32s_i64, tcg_gen_ext32s_i64,
          false, "mul_ls_ls");
  gen_mul(dc, rdst, rsrc, rsrcb, tcg_gen_ext32s_i64, tcg_gen_ext32u_i64,
          false, "mul_ls_lu");

  gen_mul(dc, rdst, rsrc, rsrcb, tcg_gen_ext32u_i64, tcg_gen_ext32u_i64,
          false, "mul_lu_lu");


and of course the same for the mula insns with true instead of false for the
"add" parameter.

> +static void gen_shladd(struct DisasContext *dc,
> +                       uint8_t rdst, uint8_t rsrc, uint8_t rsrcb,
> +                       uint8_t shift, uint8_t cast)

cast should be bool.

> +static void gen_dblalign(struct DisasContext *dc,
> +                         uint8_t rdst, uint8_t rsrc, uint8_t rsrcb)
> +{
> +    TCGv vdst = dest_gr(dc, rdst);
> +    TCGv mask = tcg_temp_new_i64();
> +    TCGv tmp = tcg_temp_new_i64();
> +
> +    qemu_log_mask(CPU_LOG_TB_IN_ASM, "dblalign r%d, r%d, r%d\n",
> +                  rdst, rsrc, rsrcb);
> +
> +    tcg_gen_andi_i64(mask, load_gr(dc, rsrcb), 7);
> +    tcg_gen_muli_i64(mask, mask, 8);

tcg_gen_shli_i64(mask, mask, 3);

> +    tcg_gen_shr_i64(vdst, load_gr(dc, rdst), mask);
> +
> +    tcg_gen_movi_i64(tmp, 64);
> +    tcg_gen_sub_i64(mask, tmp, mask);
> +    tcg_gen_shl_i64(mask, load_gr(dc, rsrc), mask);
> +
> +    tcg_gen_or_i64(vdst, vdst, mask);

Does not produce the correct results for mask == 0.

What you want is when mask == 0, you shift A by 64 bits, i.e. produce a zero.
But you can't do that in TCG (or C for that matter).  Best is to do two shifts:

  tcg_gen_xori_i64(mask, mask, 63); /* compute 1's compliment of the shift */
  tcg_gen_shl_i64(mask, load_gr(dc, rsrc), mask);
  tcg_gen_shli_i64(mask, mask, 1); /* one more to produce 2's compliment */

> +static void gen_ld_add(struct DisasContext *dc,
> +                       uint8_t rdst, uint8_t rsrc, int8_t imm8,
> +                       TCGMemOp ops, const char *code)
> +{
> +    qemu_log_mask(CPU_LOG_TB_IN_ASM, "%s r%d, r%d, %d\n",
> +                  code, rdst, rsrc, imm8);
> +
> +    tcg_gen_qemu_ld_i64(dest_gr(dc, rdst), load_gr(dc, rsrc),
> +                        MMU_USER_IDX, ops);
> +    /*
> +     * Each pipe only have one temp val which is already used, and it is only
> +     * for pipe X1, so can use real register
> +     */
> +    if (rsrc < TILEGX_R_COUNT) {
> +        tcg_gen_addi_i64(cpu_regs[rsrc], load_gr(dc, rsrc), imm8);
> +    }
> +}

This is a poor comment.  Clearly each pipe can have two outputs, so this
limitation is simply of your own design.

Further, the < TILEGX_R_COUNT restriction is also incorrect.  True, you don't
actually implement the top 7 special registers, but that doesn't matter, you
should still be incrementing them.

> +
> +    return;

Do not add bare return statments at the ends of functions.

> +static int gen_blb(struct DisasContext *dc, uint8_t rsrc, int32_t off,
> +                   TCGCond cond, const char *code)

Unused return value.  What were you intending?

> +static void decode_rrr_1_opcode_y0(struct DisasContext *dc,
> +                                   tilegx_bundle_bits bundle)
> +{
> +    uint8_t rsrc = get_SrcA_Y0(bundle);
> +    uint8_t rsrcb = get_SrcB_Y0(bundle);
> +    uint8_t rdst = get_Dest_Y0(bundle);
> +
> +    switch (get_RRROpcodeExtension_Y0(bundle)) {
> +    case UNARY_RRR_1_OPCODE_Y0:
> +        switch (get_UnaryOpcodeExtension_Y0(bundle)) {
> +        case CNTLZ_UNARY_OPCODE_Y0:
> +            gen_cntlz(dc, rdst, rsrc);
> +            return;
> +        case CNTTZ_UNARY_OPCODE_Y0:
> +            gen_cnttz(dc, rdst, rsrc);
> +            return;
> +        case NOP_UNARY_OPCODE_Y0:
> +        case  FNOP_UNARY_OPCODE_Y0:
> +            if (!rsrc && !rdst) {
> +                qemu_log_mask(CPU_LOG_TB_IN_ASM, "(f)nop\n");
> +                return;
> +            }
> +            break;
> +        case FSINGLE_PACK1_UNARY_OPCODE_Y0:
> +        case PCNT_UNARY_OPCODE_Y0:
> +        case REVBITS_UNARY_OPCODE_Y0:
> +        case REVBYTES_UNARY_OPCODE_Y0:
> +        case TBLIDXB0_UNARY_OPCODE_Y0:
> +        case TBLIDXB1_UNARY_OPCODE_Y0:
> +        case TBLIDXB2_UNARY_OPCODE_Y0:
> +        case TBLIDXB3_UNARY_OPCODE_Y0:
> +        default:
> +            break;
> +        }
> +        break;
> +    case SHL1ADD_RRR_1_OPCODE_Y0:
> +        gen_shladd(dc, rdst, rsrc, rsrcb, 1, 0);
> +        return;
> +    case SHL2ADD_RRR_1_OPCODE_Y0:
> +        gen_shladd(dc, rdst, rsrc, rsrcb, 2, 0);
> +        return;
> +    case SHL3ADD_RRR_1_OPCODE_Y0:
> +        gen_shladd(dc, rdst, rsrc, rsrcb, 3, 0);
> +        return;
> +    default:
> +        break;
> +    }
> +
> +    qemu_log_mask(LOG_UNIMP, "UNIMP rrr_1_opcode_y0, [" FMT64X "]\n", bundle);
> +}
> +

I can't help thinking, as I read all of these decode functions, that it would
be better if the output disassembly, i.e. qemu_log_mask(CPU_LOG_TB_IN_ASM, *),
were to happen here, instead of being spread across 99 other functions.

This has a side effect of reducing many of your functions to a single
statement, invoking another tcg generator, at which point it's worth inlining them.

For example:

static void decode_rrr_1_unary_y0(struct DisasContext *dc,
                                  tilegx_bundle_bits bundle,
                                  uint8_t rdst, uint8_t rsrc)
{
    unsigned ext = get_UnaryOpcodeExtension_Y0(bundle);
    const char *mnemonic;
    TCGv vdst, vsrc;

    if (ext == NOP_UNARY_OPCODE_Y0 || ext == FNOP_UNARY_OPCODE_Y0) {
        if (rsrc != 0 || rdst != 0) {
            goto unimplemented;
        }
        qemu_log_mask(CPU_LOG_TB_IN_ASM, "(f)nop\n");
        return;
    }

    vdst = dest_gr(dc, rdst);
    vsrc = load_gr(dc, rsrc);

    switch (ext) {
    case CNTLZ_UNARY_OPCODE_Y0:
        gen_helper_cntlz(vdst, vsrc);
        mnemonic = "cntlz";
        break;
    case CNTTZ_UNARY_OPCODE_Y0:
        gen_helper_cnttz(vdst, vsrc);
        mnemonic = "cnttz";
        break;
    case FSINGLE_PACK1_UNARY_OPCODE_Y0:
    case PCNT_UNARY_OPCODE_Y0:
    case REVBITS_UNARY_OPCODE_Y0:
    case REVBYTES_UNARY_OPCODE_Y0:
    case TBLIDXB0_UNARY_OPCODE_Y0:
    case TBLIDXB1_UNARY_OPCODE_Y0:
    case TBLIDXB2_UNARY_OPCODE_Y0:
    case TBLIDXB3_UNARY_OPCODE_Y0:
    default:
    unimplemented:
        qemu_log_mask(LOG_UNIMP, "UNIMP rrr_1_unary_y0, [" FMT64X "]\n",
                      bundle);
        dc->exception = TILEGX_EXCP_OPCODE_UNIMPLEMENTED;
        return;
    }

    qemu_log_mask(CPU_LOG_TB_IN_ASM, "%s r%d,r%d\n",
                  mnemonic, rdst, rsrc);
}

static void decode_rrr_1_opcode_y0(struct DisasContext *dc,
                                   tilegx_bundle_bits bundle)
{
    unsigned ext = get_RRROpcodeExtension_Y0(bundle);
    uint8_t rsrca = get_SrcA_Y0(bundle);
    uint8_t rsrcb = get_SrcB_Y0(bundle);
    uint8_t rdst = get_Dest_Y0(bundle);
    const char *mnemonic;
    TCGv vdst, vsrca, vsrcb;

    if (ext == UNARY_RRR_1_OPCODE_Y0) {
        decode_rrr_1_unary_y0(dc, bundle, rdst, rsrc);
        return;
    }

    vdst = dest_gr(dc, rdst);
    vsrca = load_gr(dc, rsrca);
    vsrcb = load_gr(dc, rsrcb);

    switch (ext) {
    case SHL1ADD_RRR_1_OPCODE_Y0:
        gen_shladd(vdst, vsrca, vsrcb, 1, 0);
        mnemonic = "shl1add";
        break;
    case SHL2ADD_RRR_1_OPCODE_Y0:
        gen_shladd(vdst, vsrca, vsrcb, 2, 0);
        mnemonic = "shl2add";
        break;
    case SHL3ADD_RRR_1_OPCODE_Y0:
        gen_shladd(vdst, vsrca, vsrcb, 3, 0);
        mnemonic = "shl3add";
        break;
    default:
        qemu_log_mask(LOG_UNIMP, "UNIMP rrr_1_opcode_y0, [" FMT64X "]\n",
                      bundle);
        dc->exception = TILEGX_EXCP_OPCODE_UNIMPLEMENTED;
        return;
    }
    qemu_log_mask(CPU_LOG_TB_IN_ASM, "%s r%d,r%d,r%d\n",
                  mnemonic, rdst, rsrca, rsrcb);
}


r~

next prev parent reply	other threads:[~2015-06-01 18:40 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-30 21:07 [Qemu-devel] [PATCH 00/10 v11] tilegx: Firstly add tilegx target for linux-user Chen Gang
2015-05-30 21:10 ` [Qemu-devel] [PATCH 01/10 v11] linux-user: tilegx: Firstly add architecture related features Chen Gang
2015-06-02 17:06   ` Peter Maydell
2015-06-03 13:06     ` Chen Gang
2015-06-03 14:28       ` Peter Maydell
2015-05-30 21:10 ` [Qemu-devel] [PATCH 02/10 v11] linux-user: Support tilegx architecture in linux-user Chen Gang
2015-06-02 17:40   ` Peter Maydell
2015-06-03 12:30     ` Chen Gang
2015-06-03 12:34       ` Peter Maydell
2015-06-03 12:47         ` Chen Gang
2015-06-03 15:10           ` Chris Metcalf
2015-06-03 15:19             ` Peter Maydell
2015-06-03 15:20               ` Chris Metcalf
2015-06-04 12:25                 ` Chen Gang
2015-06-04 12:29                   ` Peter Maydell
2015-06-04 12:39                     ` Chen Gang
2015-06-03 15:47         ` Richard Henderson
2015-06-04 12:32           ` Chen Gang
2015-07-07  0:19             ` Chris Metcalf
2015-07-07 20:47               ` Chen Gang
2015-07-07 20:50               ` Chen Gang
2015-07-08 19:24             ` Chris Metcalf
     [not found]               ` <559DC775.9080204@hotmail.com>
2015-07-09  1:09                 ` gchen gchen
2015-05-30 21:12 ` [Qemu-devel] [PATCH 03/10 v11] linux-user/syscall.c: conditionalize syscalls which are not defined in tilegx Chen Gang
2015-06-02 17:41   ` Peter Maydell
2015-05-30 21:13 ` [Qemu-devel] [PATCH 04/10 v11] target-tilegx: Add opcode basic implementation from Tilera Corporation Chen Gang
2015-06-02 17:42   ` Peter Maydell
2015-05-30 21:14 ` [Qemu-devel] [PATCH 05/10 v11] arget-tilegx/opcode_tilegx.h: Modify it to fit qemu using Chen Gang
2015-06-02 17:43   ` Peter Maydell
2015-06-02 23:39     ` Andreas Färber
2015-06-03 11:59       ` Chen Gang
2015-05-30 21:15 ` [Qemu-devel] [PATCH 06/10 v11] target-tilegx: Add special register information from Tilera Corporation Chen Gang
2015-06-02 17:44   ` Peter Maydell
2015-06-02 20:52     ` Chen Gang
2015-06-02 20:53     ` Chen Gang
2015-05-30 21:15 ` [Qemu-devel] [PATCH 07/10 v11] target-tilegx: Add cpu basic features for linux-user Chen Gang
2015-06-02 17:51   ` Peter Maydell
2015-06-02 20:47     ` Chen Gang
2015-05-30 21:17 ` [Qemu-devel] [PATCH 08/10 v11] target-tilegx: Add several helpers for instructions translation Chen Gang
2015-06-01 16:02   ` Richard Henderson
2015-06-01 18:47     ` Chen Gang
2015-05-30 21:18 ` [Qemu-devel] [PATCH 09/10 v11] target-tilegx: Generate tcg instructions to finish "Hello world" Chen Gang
2015-06-01 18:40   ` Richard Henderson [this message]
2015-06-01 20:54     ` Chen Gang
2015-06-02 16:32       ` Richard Henderson
2015-06-02 21:30         ` Chen Gang
2015-06-07 22:20       ` Chen Gang
2015-06-02 17:54   ` Peter Maydell
2015-06-02 20:25     ` Chen Gang
2015-05-30 21:19 ` [Qemu-devel] [PATCH 10/10 v11] target-tilegx: Add TILE-Gx building files Chen Gang
2015-06-02 17:52   ` Peter Maydell
2015-06-02 20:26     ` Chen Gang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=556CA71D.4090208@twiddle.net \
    --to=rth@twiddle.net \
    --cc=afaerber@suse.de \
    --cc=cmetcalf@ezchip.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=riku.voipio@iki.fi \
    --cc=walt@tilera.com \
    --cc=xili_gchen_5257@hotmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.