From: Richard Henderson <richard.henderson@linaro.org>
To: Michael Clark <michael@anarch128.org>,
qemu-devel@nongnu.org, Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [PATCH] tcg: refactor pool data for simplicity and comprehension
Date: Sun, 16 Feb 2025 10:01:20 -0800
Message-ID: <05f94ee7-a0ea-4e8f-bf84-0674a98cdb96@linaro.org>
In-Reply-To: <aedcfd05-96fc-4e8a-9fcb-3763e30a6663@anarch128.org>

On 2/16/25 00:00, Michael Clark wrote:
> On 2/16/25 06:58, Richard Henderson wrote:
>>
>>> the label member is merely a pointer to the instruction text to
>>> be updated with the relative address of the constant, the primary
>>> data is the constant data pool at the end of translation blocks.
>>> this relates more closely to .data sections in offline codegen
>>> if we were to imagine a translation block has .text and .data.
>>
>> No, it doesn't. It relates most closely to data emitted within .text, accessed via pc-
>> relative instructions with limited offsets.
>>
>> This isn't a thing you'd have ever seen on x86 or x86_64, but it is quite common for
>> arm32 (12-bit offsets), sh4 (8-bit offsets), m68k (16-bit offsets) and such. Because
>> the offsets are so small, they could even be placed *within* functions not just between
>> them.
>
> I mentioned before I like the idea and have thought about architectures with constant
> streams and constant branch units.
>
> say for argument's sake we considered it 'TCData' with embedded label and reloc (the
> purpose is the constant after all, just it is not a TCGTemp, it's an explicitly
> reified constant in the codegen emitters). wondering if we could add a "disposition" field
> to control placement. TCG_DISP_TEXT_TB, TCG_DISP_DATA, etc. this way you could ask the
> code generator to do something more conventional while still supporting the short relative
> constant islands. "disposition" might be better than section as a name. also a DATA
> section could be mmap R without X perms to lessen the risk of injecting code as constants.

I don't think there's any point in doing anything different from what we currently do:
place the data at the end of the TB.

(1) The host architectures that currently use the constant pool have
    relatively large displacements: aarch64 (21 bit), x86_64 (32 bit),
    ppc (16 bit, or 34 bit on power10 only), riscv (32 bit), s390x (34 bit).

(2) The size of a TB pretty generally maxes out at 3-4k, but is firmly capped at 64k
    by uint16_t TranslationBlock.jmp_reset_offset.

(3) The 16- and 21-bit offsets are not large enough to stretch to a read-only mapping.

(4) Memory management of TranslationBlocks becomes *much* more complicated.

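A minimal sketch of the arithmetic behind points (1) to (3), assuming each
displacement is a plain signed byte offset (real encodings may scale or split
the field, so treat the numbers as rough). The helper names are hypothetical
and are not part of QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a QEMU API: forward reach in bytes of a signed
 * displacement field of `bits` bits, treated as a plain byte offset.
 * A signed N-bit field spans [-2^(N-1), 2^(N-1) - 1].
 */
static inline uint64_t disp_forward_reach(unsigned bits)
{
    assert(bits >= 1 && bits <= 64);
    return (UINT64_C(1) << (bits - 1)) - 1;
}

/*
 * Hypothetical helper: true if data placed at the end of a TB spanning
 * tb_size bytes is reachable from the first instruction of that TB,
 * under the byte-offset assumption above.
 */
static inline bool pool_reachable(unsigned bits, uint64_t tb_size)
{
    return disp_forward_reach(bits) >= tb_size;
}
```

Under that assumption a 16-bit field reaches just under 32 KiB, which covers a
typical 3-4k TB, and a 21-bit field reaches just under 1 MiB, which covers the
64k cap; those are the distances point (3) is weighing against a separate
read-only mapping.
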
> TCGConstant is another alternative I would consider as okay. distinct from TCGTemp of type
> TEMP_CONST which is heavier weight. it makes one wonder about reification of large
> implicit constants as opposed to the explicitly emitted ones we are talking about here.

TCGConstant isn't bad, but I think I prefer TCGPoolData as mooted before.

> i'm looking at a TCG source-compatible code generator as an option so I may experiment
> locally. it is a private interface at the moment anyhow. that just seemed inconsistent as
> most structure definitions are in the header. but I understand it is a private interface.

The organization of tcg.h is from antiquity. I am actively trying to reduce the size of
the exported API.

r~