* [Qemu-devel] [PULL v2] Queued TCG improvements
From: Richard Henderson @ 2015-08-18 14:59 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell
This pull includes three independent patch sets, which were
all posted during the 2.4 freeze.
The first is algorithmic improvements to tcg/optimize, both
improving its runtime and its tracking of constants.
The second is improvements to the representation of 32<->64-bit
size changing operations. Still to do here is investigate how
these might be best applied to each tcg host.
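As a rough illustration of what the size-changing ops in this set compute, here is a plain-C model. The model_* names are made up for this sketch and are not QEMU functions; the semantics assumed are sign extension for ext_i32_i64, zero extension for extu_i32_i64, and the low and high 32-bit halves for the two extr[lh]_i64_i32 ops.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative models only: the model_* names are hypothetical and do not
   exist in QEMU. Each mirrors one size-changing op named in this series. */

/* ext_i32_i64: sign-extend a 32-bit value to 64 bits. */
int64_t model_ext_i32_i64(int32_t v)
{
    return (int64_t)v;
}

/* extu_i32_i64: zero-extend a 32-bit value to 64 bits. */
uint64_t model_extu_i32_i64(uint32_t v)
{
    return (uint64_t)v;
}

/* extrl_i64_i32: keep the low 32 bits of a 64-bit value. */
uint32_t model_extrl_i64_i32(uint64_t v)
{
    return (uint32_t)v;
}

/* extrh_i64_i32: keep the high 32 bits of a 64-bit value. */
uint32_t model_extrh_i64_i32(uint64_t v)
{
    return (uint32_t)(v >> 32);
}

/* Small self-check: splitting a value into halves and zero-extending
   the halves again must reproduce the original value. */
void model_self_check(uint64_t v)
{
    uint64_t lo = model_extu_i32_i64(model_extrl_i64_i32(v));
    uint64_t hi = model_extu_i32_i64(model_extrh_i64_i32(v));
    assert(((hi << 32) | lo) == v);
}
```

The old trunc_shr op folded a shift count into the truncation; with a separate op per half, the model needs no shift parameter at all.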
The third is improvements to how guest unaligned accesses are
implemented in softmmu mode, for the 4 supported host processors
that themselves implement unaligned accesses.
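For readers unfamiliar with the idea, the sketch below shows one way a TLB fast-path check can accept unaligned accesses. It is an illustration under stated assumptions, not code from these patches: the function names, the 4 KiB page size and the tlb_page parameter (the page address stored in the TLB entry selected by addr) are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 4 KiB guest page, chosen only for this sketch. */
#define SKETCH_PAGE_BITS 12
#define SKETCH_PAGE_MASK (~(((uint64_t)1 << SKETCH_PAGE_BITS) - 1))

/* Strict check: the low alignment bits stay in the comparison, so any
   misaligned address fails and is sent to the slow path. */
bool fast_path_aligned_only(uint64_t addr, unsigned size, uint64_t tlb_page)
{
    assert(size != 0 && (size & (size - 1)) == 0); /* power of two */
    return (addr & (SKETCH_PAGE_MASK | (size - 1))) == tlb_page;
}

/* Relaxed check: compare the page of the LAST byte accessed. A misaligned
   access that stays inside the page still hits; only an access that
   crosses into the next page fails and is sent to the slow path. */
bool fast_path_unaligned_ok(uint64_t addr, unsigned size, uint64_t tlb_page)
{
    assert(size != 0 && (size & (size - 1)) == 0); /* power of two */
    uint64_t last = addr + size - 1;
    return (last & SKETCH_PAGE_MASK) == tlb_page;
}
```

The relaxed form only makes sense when the host can perform the unaligned load or store itself, which is why the description limits it to hosts that implement unaligned accesses.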
Change v1-v2:
* Removed a patch that Aurelien self-nack'ed. I guess I'd
gotten the set of patches confused along the way.
r~
The following changes since commit 074a9925e1cfd659d5376dcaccd1436d3840e611:
Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging (2015-08-14 16:52:34 +0100)
are available in the git repository at:
git://github.com/rth7680/qemu.git tags/pull-tcg-20150818
for you to fetch changes up to 2e58c34d4c9c61f311b5468f05b0ad63b77645c1:
tcg/aarch64: Use softmmu fast path for unaligned accesses (2015-08-18 07:50:19 -0700)
----------------------------------------------------------------
queued tcg patches
----------------------------------------------------------------
Aurelien Jarno (11):
      tcg/optimize: fix constant signedness
      tcg/optimize: optimize temps tracking
      tcg/optimize: add temp_is_const and temp_is_copy functions
      tcg/optimize: track const/copy status separately
      tcg/optimize: allow constant to have copies
      tcg: rename trunc_shr_i32 into trunc_shr_i64_i32
      tcg: don't abuse TCG type in tcg_gen_trunc_shr_i64_i32
      tcg: implement real ext_i32_i64 and extu_i32_i64 ops
      tcg/optimize: add optimizations for ext_i32_i64 and extu_i32_i64 ops
      tcg: update README about size changing ops
      tcg/i386: use softmmu fast path for unaligned accesses

Benjamin Herrenschmidt (1):
      tcg/ppc: Improve unaligned load/store handling on 64-bit backend

Richard Henderson (4):
      tcg: Split trunc_shr_i32 opcode into extr[lh]_i64_i32
      tcg: Remove tcg_gen_trunc_i64_i32
      tcg/s390: Use softmmu fast path for unaligned accesses
      tcg/aarch64: Use softmmu fast path for unaligned accesses

target-alpha/translate.c | 4 +-
target-arm/translate-a64.c | 60 +++++------
target-arm/translate.c | 46 ++++----
target-cris/translate.c | 4 +-
target-m68k/translate.c | 2 +-
target-microblaze/translate.c | 8 +-
target-mips/translate.c | 4 +-
target-openrisc/translate.c | 22 ++--
target-s390x/translate.c | 30 +++---
target-sh4/translate.c | 4 +-
target-sparc/translate.c | 14 +--
target-tricore/translate.c | 32 +++---
target-xtensa/translate.c | 2 +-
tcg/README | 32 ++++--
tcg/aarch64/tcg-target.c | 41 ++++---
tcg/aarch64/tcg-target.h | 3 +-
tcg/i386/tcg-target.c | 27 +++--
tcg/i386/tcg-target.h | 3 +-
tcg/ia64/tcg-target.c | 4 +
tcg/ia64/tcg-target.h | 3 +-
tcg/optimize.c | 246 +++++++++++++++++++++---------------------
tcg/ppc/tcg-target.c | 47 ++++++--
tcg/ppc/tcg-target.h | 3 +-
tcg/s390/tcg-target.c | 31 +++++-
tcg/s390/tcg-target.h | 3 +-
tcg/sparc/tcg-target.c | 22 ++--
tcg/sparc/tcg-target.h | 3 +-
tcg/tcg-op.c | 48 ++++-----
tcg/tcg-op.h | 12 +--
tcg/tcg-opc.h | 10 +-
tcg/tcg.h | 3 +-
tcg/tci/tcg-target.c | 4 +
tcg/tci/tcg-target.h | 3 +-
tci.c | 6 +-
34 files changed, 447 insertions(+), 339 deletions(-)
* Re: [Qemu-devel] [PULL v2] Queued TCG improvements
From: Peter Maydell @ 2015-08-18 23:23 UTC (permalink / raw)
To: Richard Henderson; +Cc: QEMU Developers
On 18 August 2015 at 15:59, Richard Henderson <rth@twiddle.net> wrote:
> This pull includes three independent patch sets, which were
> all posted during the 2.4 freeze.
>
> The first is algorithmic improvements to tcg/optimize, both
> improving its runtime and its tracking of constants.
>
> The second is improvements to the representation of 32<->64-bit
> size changing operations. Still to do here is investigate how
> these might be best applied to each tcg host.
>
> The third is improvements to how guest unaligned accesses are
> implemented in softmmu mode, for the 4 supported host processors
> that themselves implement unaligned accesses.
>
> Change v1-v2:
> * Removed a patch that Aurelien self-nack'ed. I guess I'd
> gotten the set of patches confused along the way.
>
>
> r~
>
>
> The following changes since commit 074a9925e1cfd659d5376dcaccd1436d3840e611:
>
> Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging (2015-08-14 16:52:34 +0100)
>
> are available in the git repository at:
>
> git://github.com/rth7680/qemu.git tags/pull-tcg-20150818
>
> for you to fetch changes up to 2e58c34d4c9c61f311b5468f05b0ad63b77645c1:
>
> tcg/aarch64: Use softmmu fast path for unaligned accesses (2015-08-18 07:50:19 -0700)
>
> ----------------------------------------------------------------
> queued tcg patches
>
Hi. I'm afraid this fails 'make check' on 32-bit ARM for me:
QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64
QTEST_QEMU_IMG=qemu-img
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$((RANDOM % 255 + 1))}
gtester -k --verbose -m=quick
tests/endianness-test tests/fdc-test tests/ide-test tests/ahci-test
tests/hd-geo-test tests/boot-order-test tests/bios-tables-test
tests/rtc-test tests/i440fx-test tests/fw_cfg-test tests/drive_del-test
tests/wdt_ib700-test tests/tco-test tests/e1000-test tests/rtl8139-test
tests/pcnet-test tests/eepro100-test tests/ne2000-test tests/nvme-test
tests/ac97-test tests/es1370-test tests/virtio-net-test
tests/virtio-balloon-test tests/virtio-blk-test tests/virtio-rng-test
tests/virtio-scsi-test tests/virtio-9p-test tests/virtio-serial-test
tests/virtio-console-test tests/tpci200-test tests/ipoctal232-test
tests/display-vga-test tests/intel-hda-test tests/vmxnet3-test
tests/pvpanic-test tests/i82801b11-test tests/ioh3420-test
tests/usb-hcd-ohci-test tests/usb-hcd-uhci-test tests/usb-hcd-ehci-test
tests/usb-hcd-xhci-test tests/pc-cpu-test tests/q35-test
tests/vhost-user-test tests/qom-test
[snip...]
TEST: tests/bios-tables-test... (pid=15865)
/x86_64/acpi/piix4/tcg: FAIL
GTester: last random seed: R02S16b73b222ae7e15b567b3f8c378584b0
(pid=15870)
/x86_64/acpi/piix4/tcg/bridge: FAIL
GTester: last random seed: R02S6234d11ab3f559ebcd1267cc71046b7f
(pid=15875)
/x86_64/acpi/q35/tcg: FAIL
GTester: last random seed: R02Sc765d52188a4d61bbb5a9294c9429e13
(pid=15880)
/x86_64/acpi/q35/tcg/bridge: FAIL
GTester: last random seed: R02S7c5238b347bb71adf72465b0653a793a
(pid=15885)
FAIL: tests/bios-tables-test
These are the tests which try to actually run guest code, so
usually this means that TCG on ARM has broken. Indeed:
# ./build/all/x86_64-softmmu/qemu-system-x86_64
Segmentation fault
(i386-softmmu doesn't segv, so probably it's a 64-bit-ops-on-32-bit
thing.)
thanks
-- PMM
* Re: [Qemu-devel] [PULL v2] Queued TCG improvements
From: Richard Henderson @ 2015-08-19 6:15 UTC (permalink / raw)
To: Peter Maydell; +Cc: QEMU Developers
On 08/18/2015 04:23 PM, Peter Maydell wrote:
> Hi. I'm afraid this fails 'make check' on 32-bit ARM for me:
...
> (i386-softmmu doesn't segv, so probably it's a 64-bit-ops-on-32-bit
> thing.)
Sadly, this doesn't fail on 32-bit x86 host. I've started a build on an arm
host, but it may be a while before I get results.
r~
* Re: [Qemu-devel] [PULL v2] Queued TCG improvements
From: Richard Henderson @ 2015-08-19 15:49 UTC (permalink / raw)
To: Aurelien Jarno; +Cc: Peter Maydell, QEMU Developers
On 08/18/2015 04:23 PM, Peter Maydell wrote:
> Hi. I'm afraid this fails 'make check' on 32-bit ARM for me:
Found it. The problem is in the temps tracking patch, where we weren't
ignoring TCG_CALL_DUMMY_ARG (-1). This isn't used on x86 of course, which is
why we didn't see this failure there.
The following fixes the problem. I chose to split the initialization so that
non-call opcodes don't need to check for <dummy>.
Can I get an RB for squashing this into the original patch?
r~
diff --git a/tcg/optimize.c b/tcg/optimize.c
index 2693168..10795ec 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -597,17 +597,24 @@ void tcg_optimize(TCGContext *s)
         const TCGOpDef *def = &tcg_op_defs[opc];

         oi_next = op->next;
+
+        /* Count the arguments, and initialize the temps that are
+           going to be used */
         if (opc == INDEX_op_call) {
             nb_oargs = op->callo;
             nb_iargs = op->calli;
+            for (i = 0; i < nb_oargs + nb_iargs; i++) {
+                tmp = args[i];
+                if (tmp != TCG_CALL_DUMMY_ARG) {
+                    init_temp_info(tmp);
+                }
+            }
         } else {
             nb_oargs = def->nb_oargs;
             nb_iargs = def->nb_iargs;
-        }
-
-        /* Initialize the temps that are going to be used */
-        for (i = 0; i < nb_oargs + nb_iargs; i++) {
-            init_temp_info(args[i]);
+            for (i = 0; i < nb_oargs + nb_iargs; i++) {
+                init_temp_info(args[i]);
+            }
         }

         /* Do copy propagation */
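
For anyone reading this later, the failure mode can be reproduced in miniature. The sketch below is a standalone model, not the optimizer code: NB_TEMPS, DUMMY_ARG, temps[] and both function names are invented for the illustration, and it assumes that the per-temp state is looked up by using the argument as an array index, so that a sentinel of -1 would land outside the array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for this sketch; none of these are QEMU names. */
#define NB_TEMPS 8
#define DUMMY_ARG ((size_t)-1)   /* plays the role of TCG_CALL_DUMMY_ARG */

struct temp_info {
    bool initialized;
};

static struct temp_info temps[NB_TEMPS];

/* Initialize the tracking state of one temp. An out-of-range index is a
   bug in the caller; the assert makes it fail loudly instead of
   scribbling over unrelated memory. */
static void init_temp(size_t idx)
{
    assert(idx < NB_TEMPS);
    temps[idx].initialized = true;
}

/* Initialize every temp named by a call's argument list, skipping the
   padding sentinel. Returns how many temps were initialized. */
size_t init_call_args(const size_t *args, size_t nb_args)
{
    size_t count = 0;

    for (size_t i = 0; i < nb_args; i++) {
        if (args[i] == DUMMY_ARG) {
            continue;   /* padding slot, not a real temp */
        }
        init_temp(args[i]);
        count++;
    }
    return count;
}
```

Keeping the sentinel test inside the call-only loop mirrors the choice in the patch: arguments of ordinary opcodes never carry the sentinel, so they do not pay for the extra comparison.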
* Re: [Qemu-devel] [PULL v2] Queued TCG improvements
From: Aurelien Jarno @ 2015-09-10 19:31 UTC (permalink / raw)
To: Richard Henderson; +Cc: Peter Maydell, QEMU Developers
On 2015-08-19 08:49, Richard Henderson wrote:
> On 08/18/2015 04:23 PM, Peter Maydell wrote:
> > Hi. I'm afraid this fails 'make check' on 32-bit ARM for me:
>
> Found it. The problem is in the temps tracking patch, where we weren't
> ignoring TCG_CALL_DUMMY_ARG (-1). This isn't used on x86 of course, which is
> why we didn't see this failure there.
>
> The following fixes the problem. I chose to split the initialization so that
> non-call opcodes don't need to check for <dummy>.
>
> Can I get an RB for squashing this into the original patch?
Sorry for answering so late while it's already merged. This is indeed
correct. Thanks for fixing that.
> diff --git a/tcg/optimize.c b/tcg/optimize.c
> index 2693168..10795ec 100644
> --- a/tcg/optimize.c
> +++ b/tcg/optimize.c
> @@ -597,17 +597,24 @@ void tcg_optimize(TCGContext *s)
>          const TCGOpDef *def = &tcg_op_defs[opc];
>
>          oi_next = op->next;
> +
> +        /* Count the arguments, and initialize the temps that are
> +           going to be used */
>          if (opc == INDEX_op_call) {
>              nb_oargs = op->callo;
>              nb_iargs = op->calli;
> +            for (i = 0; i < nb_oargs + nb_iargs; i++) {
> +                tmp = args[i];
> +                if (tmp != TCG_CALL_DUMMY_ARG) {
> +                    init_temp_info(tmp);
> +                }
> +            }
>          } else {
>              nb_oargs = def->nb_oargs;
>              nb_iargs = def->nb_iargs;
> -        }
> -
> -        /* Initialize the temps that are going to be used */
> -        for (i = 0; i < nb_oargs + nb_iargs; i++) {
> -            init_temp_info(args[i]);
> +            for (i = 0; i < nb_oargs + nb_iargs; i++) {
> +                init_temp_info(args[i]);
> +            }
>          }
>
>          /* Do copy propagation */
>
>
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net