From: Richard Henderson <rth@twiddle.net>
To: Aurelien Jarno <aurelien@aurel32.net>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 2/4] tcg-hppa: Finish the port.
Date: Thu, 08 Apr 2010 09:32:41 -0700 [thread overview]
Message-ID: <4BBE0529.7000404@twiddle.net> (raw)
In-Reply-To: <20100408095612.GC17138@volta.aurel32.net>
On 04/08/2010 02:56 AM, Aurelien Jarno wrote:
> I have applied the patch. I have some comments though, it would be nice
> if you can address them with additional patches.
Sure.
>> +static void tcg_out_ori(TCGContext *s, int ret, int arg, tcg_target_ulong m)
>> +{
>> + if (m == 0) {
>> + tcg_out_mov(s, ret, arg);
>> + } else if (m == -1) {
>> + tcg_out_movi(s, TCG_TYPE_I32, ret, -1);
>
> Those cases are already eliminated in tcg/tcg-op.h. This code looks
> redundant.
The cases eliminated in tcg-op.h are with immediate constants.
There is no generic code in tcg.c to eliminate these cases
after constant propagation. However, I can remove them with...
>> + } else {
>> + tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_R1, m);
>> + tcg_out_arith(s, ret, arg, TCG_REG_R1, INSN_OR);
>
> Do we really want a movi here? It would be better to leave the tcg code
> load the constant itself, so that if the same constant is used twice, it
> is only loaded once.
I've never caught TCG properly re-using constants, but I take your
point -- there's no reason why tcg.c can't be improved, and this
port would miss out on that improvement. I'll invent a constraint
that matches or_mask_p.
>> +static void tcg_out_andi(TCGContext *s, int ret, int arg, tcg_target_ulong m)
...
>> + tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_R1, m);
>> + tcg_out_arith(s, ret, arg, TCG_REG_R1, INSN_AND);
>
> Same.
ANDI slightly different case. This function is used by tcg_out_tlb_read
with constants that may or may not satisfy and_mask_p. I think it's cleaner
to handle the arbitrary case here, rather than open code the same test in
the tlb read function.
I will of course add a constraint to match and_mask_p, for ANDs that
originate within the opcode stream.
>> + tcg_out_reloc(s, s->code_ptr, R_PARISC_PCREL17F, label_index, 0);
>> + tcg_out32(s, op);
>
> This breaks partial retranslation. The bits corresponding to the offset
> should be preserved.
I don't recall ever hearing about re-translation. Can you point me
at the bits that do it, so I can figure out what's going on? This
sounds like something that ought to be documented properly...
I rather assumed that the "addend" parameter to patch_reloc would
hold whatever is really needed to be preserved. What else is that
field for, anyway?
>> - if (opc == 3)
>> - data_reg2 = *args++;
>> - else
>> - data_reg2 = 0; /* suppress warning */
>> + data_reg2 = (opc == 3 ? *args++ : TCG_REG_R0);
>
> I am not sure TCG_REG_R0 is really correct here, and I find it confusing.
> While it's value is zero, the assignment there is just to make GCC
> happy, it won't be used after
Correct. I don't see what else I can really do though. I think it's
worse to mix types: integer-as-register-number (i.e. *args) and
integer-as-filler (i.e. 0). Better to at least have them be the same
type as it clarifies that *args must be a register number.
Perhaps just a comment here?
>> #if defined(CONFIG_SOFTMMU)
>> - tcg_out_mov(s, r1, addr_reg);
>> + lab1 = gen_new_label();
>> + lab2 = gen_new_label();
>
> Do you really want to use label here? load/store are the most common
> instructions, I am not really sure of the resulting performance.
I think the code is *so* much more readable re-using the usual branch
and relocate code. I'd almost rather spend the time speeding up the
use of temporary labels than uglifying the code here.
>> + /* These three correspond exactly to the fallback implementation.
>> + But by including them we reduce the number of TCG ops that
>> + need to be generated, and these opcodes are fairly common. */
>
> Are you sure it really makes a difference?
Not quantifiably, but the reasoning is sound. I can remove them if you insist.
>> + { INDEX_op_add_i32, { "r", "rZ", "ri" } },
>> + { INDEX_op_sub_i32, { "r", "rI", "ri" } },
>> + { INDEX_op_and_i32, { "r", "rZ", "ri" } },
>> + { INDEX_op_or_i32, { "r", "rZ", "ri" } },
>
> Already commented for "and" and "or", but the same apply for add and
> sub. Do we really need a "i" contraints here if the constant is going
> to be loaded with a movi.
ADD and SUB are not going to use movi. They will use one or both of
ADDIL (21-bit constant << 11) and LDO (14-bit constant). As a pair
these insns can perform a full 32-bit constant addition.
I suppose technically there's a subset of 32-bit constants that could
benefit from generic code loading constants into registers. The only
valid output register for ADDIL is R1. So at the moment for
R3 = R4 + 0x10000;
we generate
addil 0x10000, r4, r1
copy r1, r3
where we could equivalently generate
ldil 0x10000, r5
add r4, r5, r3
However I don't think this is worth worrying about in the short term.
>> + if (GUEST_BASE != 0) {
>> + tcg_out_movi(s, TCG_TYPE_PTR, TCG_GUEST_BASE_REG, GUEST_BASE);
>> + }
>
> The final GUEST_BASE value is computed after the prologue has been
> generated. The value is modified in two cases:
> - The user specify a non-aligned base address.
> - /proc/sys/vm/mmap_min_addr is different than 0, which is now the
> in default configuration for more than one year.
>
> When it happens, the guest crashes almost immediately.
To be fair, mmap_min_addr only affects GUEST_BASE if the executable
image we've loaded overlaps. Which is uncommon, but certainly possible.
Hmm. I wonder which is better: one extra instruction needed per qemu_ld
vs having one more call-saved register available. At the moment we don't
even come close to using all of the call-saved registers, and it would be
easy enough to have the prologue read the actual guest_base variable rather
than embed the constant.
r~
next prev parent reply other threads:[~2010-04-08 16:34 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-03-23 22:33 [Qemu-devel] [PATCH 0/2] tcg-hppa finish, v3 Richard Henderson
2010-03-12 14:58 ` [Qemu-devel] [PATCH 1/2] tcg-hppa: Compute is_write in cpu_signal_handler Richard Henderson
2010-03-23 22:06 ` [Qemu-devel] [PATCH 2/2] tcg-hppa: Finish the port Richard Henderson
2010-04-07 11:45 ` [Qemu-devel] [PATCH 0/2] tcg-hppa finish, v3 Richard Henderson
2010-04-07 11:56 ` Aurelien Jarno
2010-04-07 11:56 ` [Qemu-devel] [PATCH 2/4] tcg-hppa: Finish the port Richard Henderson
2010-04-08 9:56 ` Aurelien Jarno
2010-04-08 16:32 ` Richard Henderson [this message]
2010-04-08 21:48 ` Richard Henderson
2010-04-08 23:01 ` Aurelien Jarno
2010-04-07 14:46 ` [Qemu-devel] [PATCH 3/4] tcg-hppa: Fix in/out register overlap in add2/sub2 Richard Henderson
2010-04-08 9:57 ` Aurelien Jarno
2010-04-07 23:24 ` [Qemu-devel] [PATCH 4/4] tcg-hppa: Don't try to calls to non-constant addresses Richard Henderson
2010-04-08 9:58 ` Aurelien Jarno
2010-04-07 23:29 ` [Qemu-devel] [PATCH 0/4] tcg-hppa finish, v4 Richard Henderson
2010-04-08 9:36 ` Aurelien Jarno
2010-04-07 23:42 ` [Qemu-devel] [PATCH 1/4] tcg-hppa: Compute is_write in cpu_signal_handler Richard Henderson
2010-04-08 9:36 ` Aurelien Jarno
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4BBE0529.7000404@twiddle.net \
--to=rth@twiddle.net \
--cc=aurelien@aurel32.net \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).