From: "Alex Bennée" <alex.bennee@linaro.org>
To: Kirill Batuzov <batuzovk@ispras.ru>
Cc: qemu-devel@nongnu.org, Peter Maydell <peter.maydell@linaro.org>,
Peter Crosthwaite <crosthwaite.peter@gmail.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Richard Henderson <rth@twiddle.net>
Subject: Re: [Qemu-devel] [PATCH 00/18] Emulate guest vector operations with host vector operations
Date: Fri, 27 Jan 2017 14:55:39 +0000
Message-ID: <87r33o8sd0.fsf@linaro.org>
In-Reply-To: <1484644078-21312-1-git-send-email-batuzovk@ispras.ru>
Kirill Batuzov <batuzovk@ispras.ru> writes:
> The goal of this patch series is to set up an infrastructure to emulate
> guest vector operations using host vector operations. Preliminary
> experiments show that simply translating loads and stores increases
> the performance of the x264 video codec by 10%. The performance of a gcc
> vectorized for loop increased 2x.
>
> To be able to emulate guest vector operations using host vector operations,
> several things need to be done.
I see rth has already done a bunch of review, so I'll pass on this cycle,
but please feel free to add me to the CC list for the next iteration.
>
> 1. Corresponding vector types should be added to TCG. This series adds
> TCG_v128 and TCG_v64. I've made TCG_v64 a different type than TCG_i64
> because it usually needs to be allocated to different registers and
> supports different operations.
>
> 2. Load/store operations for these new types need to be implemented.
>
> 3. For a seamless transition from the current model to the new one, we
> need to handle cases where memory occupied by a global variable can be
> accessed via a pointer to the CPUArchState structure. A very simple
> conservative alias analysis has been added to do it. This analysis tracks
> memory loads and
> stores that overlap with fields of CPUArchState and provides this
> information to the register allocator. The allocator then spills and
> reloads affected globals when needed.
>
> 4. Allow overlapping globals. For scalar registers this is a rare case, and
> overlapping registers can be handled as a single one (ah, al, ax, eax,
> rax). In ARM, every Q-register consists of two D-registers, each consisting
> of two S-registers. Handling 4 S-registers as one because they are parts of
> the same Q-register is way too inefficient.
>
> 5. Add a new memory addressing mode to the MMU code for large accesses and
> create the needed helpers. Only 128-bit vectors have been handled for now.
>
> 6. Create TCG opcodes for vector operations. Only addition has been handled
> in this series. Each operation has a wrapper that checks whether the
> backend supports the corresponding operation. If it does, the vector opcode
> is generated; otherwise the operation is emulated with scalar
> operations. The emulation code is generated inline for performance reasons
> (there is a huge performance difference between inline generation
> and calling a helper). As a positive side effect, this will eventually allow
> merging similar emulation code for vector instructions from different
> frontends into a target-independent implementation.
>
> 7. Use new operations in the frontend (ARM was used in this series).
>
> 8. Support new operations in the backend (x86_64 was used in this series).
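
As an illustration of points 3 and 4 in the quoted list (a minimal sketch, not
code from the series): the alias analysis can be thought of as a byte-range
overlap test between a memory access and the slice of the CPU state backing
each global. The names GlobalRange, ranges_overlap and affected_globals are
hypothetical, and the offsets in the usage note are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical descriptor: a global backed by bytes
 * [offset, offset + size) of the CPU state structure. */
typedef struct {
    const char *name;
    size_t offset;
    size_t size;
} GlobalRange;

/* True when two half-open byte ranges share at least one byte. */
static bool ranges_overlap(size_t off_a, size_t size_a,
                           size_t off_b, size_t size_b)
{
    return off_a < off_b + size_b && off_b < off_a + size_a;
}

/* Store in out[] the indices of all globals touched by an access to
 * [access_off, access_off + access_size) and return how many there are.
 * out[] must have room for n entries. */
static size_t affected_globals(const GlobalRange *globals, size_t n,
                               size_t access_off, size_t access_size,
                               size_t *out)
{
    size_t count = 0;

    assert(globals != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (ranges_overlap(globals[i].offset, globals[i].size,
                           access_off, access_size)) {
            out[count++] = i;
        }
    }
    return count;
}
```

With one Q register at bytes 0..15, its D halves at 0..7 and 8..15, and four
S registers of 4 bytes each, an 8-byte store at offset 8 touches only the Q
register, the upper D register and the upper two S registers. Those are the
globals a register allocator would need to spill or reload; the lower D and S
registers are left alone, which is the point of not treating the whole
Q register as a single global.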
>
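
The wrapper described in point 6 could be pictured roughly as follows. This is
a toy model under stated assumptions, not the series' actual code: Opcode, Op,
OpBuffer, emit and gen_add_i32x4 are all hypothetical names, and the
backend_has_add_i32x4 flag stands in for whatever mechanism the backend uses
to advertise support.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy opcode set: one whole-vector add and one 32-bit scalar add. */
typedef enum { OP_ADD_I32X4, OP_ADD_I32 } Opcode;

/* One emitted operation; lane is -1 for a whole-vector op. */
typedef struct {
    Opcode opc;
    int lane;
} Op;

/* Fixed-size buffer of emitted operations. */
typedef struct {
    Op ops[32];
    size_t len;
} OpBuffer;

static void emit(OpBuffer *buf, Opcode opc, int lane)
{
    assert(buf->len < sizeof(buf->ops) / sizeof(buf->ops[0]));
    buf->ops[buf->len].opc = opc;
    buf->ops[buf->len].lane = lane;
    buf->len++;
}

/* Wrapper: emit a single vector opcode when the backend supports it,
 * otherwise expand inline into one scalar add per 32-bit lane. */
static void gen_add_i32x4(OpBuffer *buf, bool backend_has_add_i32x4)
{
    if (backend_has_add_i32x4) {
        emit(buf, OP_ADD_I32X4, -1);
        return;
    }
    for (int lane = 0; lane < 4; lane++) {
        emit(buf, OP_ADD_I32, lane);
    }
}
```

A frontend would call only gen_add_i32x4() and never check backend support
itself, so the scalar expansion lives in one target-independent place. The
fallback emits operations inline rather than a call to a helper, matching the
performance argument made in the quoted text.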
> For experiments I have used an ARM guest on an x86_64 host. I wanted a pair
> of different architectures that both have vector extensions. The ARM and
> x86_64 pair fits well.
>
> Kirill Batuzov (18):
> tcg: add support for 128bit vector type
> tcg: add support for 64bit vector type
> tcg: add ld_v128, ld_v64, st_v128 and st_v64 opcodes
> tcg: add simple alias analysis
> tcg: use results of alias analysis in liveness analysis
> tcg: allow globals to overlap
> tcg: add vector addition operations
> target/arm: support access to vector guest registers as globals
> target/arm: use vector opcode to handle vadd.<size> instruction
> tcg/i386: add support for vector opcodes
> tcg/i386: support 64-bit vector operations
> tcg/i386: support remaining vector addition operations
> tcg: do not relay on exact values of MO_BSWAP or MO_SIGN in backend
> tcg: introduce new TCGMemOp - MO_128
> tcg: introduce qemu_ld_v128 and qemu_st_v128 opcodes
> softmmu: create helpers for vector loads
> tcg/i386: add support for qemu_ld_v128/qemu_st_v128 ops
> target/arm: load two consecutive 64-bits vector regs as a 128-bit
> vector reg
>
> cputlb.c | 4 +
> softmmu_template_vector.h | 266 +++++++++++++++++++++++++++++++++++++++++++
> target/arm/translate.c | 89 ++++++++++++++-
> tcg/aarch64/tcg-target.inc.c | 4 +-
> tcg/arm/tcg-target.inc.c | 4 +-
> tcg/i386/tcg-target.h | 35 +++++-
> tcg/i386/tcg-target.inc.c | 245 ++++++++++++++++++++++++++++++++++++---
> tcg/mips/tcg-target.inc.c | 4 +-
> tcg/optimize.c | 146 ++++++++++++++++++++++++
> tcg/ppc/tcg-target.inc.c | 4 +-
> tcg/s390/tcg-target.inc.c | 4 +-
> tcg/sparc/tcg-target.inc.c | 12 +-
> tcg/tcg-op.c | 20 +++-
> tcg/tcg-op.h | 262 ++++++++++++++++++++++++++++++++++++++++++
> tcg/tcg-opc.h | 34 ++++++
> tcg/tcg.c | 146 ++++++++++++++++++++++++
> tcg/tcg.h | 147 +++++++++++++++++++++++-
> 17 files changed, 1385 insertions(+), 41 deletions(-)
> create mode 100644 softmmu_template_vector.h
--
Alex Bennée
Thread overview: 36+ messages
2017-01-17 9:07 [Qemu-devel] [PATCH 00/18] Emulate guest vector operations with host vector operations Kirill Batuzov
2017-01-17 9:07 ` [Qemu-devel] [PATCH 01/18] tcg: add support for 128bit vector type Kirill Batuzov
2017-01-18 18:29 ` Richard Henderson
2017-01-19 13:04 ` Kirill Batuzov
2017-01-19 15:09 ` Richard Henderson
2017-01-19 16:54 ` Kirill Batuzov
2017-01-22 7:00 ` Richard Henderson
2017-01-23 10:30 ` Kirill Batuzov
2017-01-23 18:43 ` Richard Henderson
2017-01-24 14:29 ` Kirill Batuzov
2017-01-17 9:07 ` [Qemu-devel] [PATCH 02/18] tcg: add support for 64bit " Kirill Batuzov
2017-01-17 9:07 ` [Qemu-devel] [PATCH 03/18] tcg: add ld_v128, ld_v64, st_v128 and st_v64 opcodes Kirill Batuzov
2017-01-17 9:07 ` [Qemu-devel] [PATCH 04/18] tcg: add simple alias analysis Kirill Batuzov
2017-01-17 9:07 ` [Qemu-devel] [PATCH 05/18] tcg: use results of alias analysis in liveness analysis Kirill Batuzov
2017-01-17 9:07 ` [Qemu-devel] [PATCH 06/18] tcg: allow globals to overlap Kirill Batuzov
2017-01-17 19:50 ` Richard Henderson
2017-01-17 9:07 ` [Qemu-devel] [PATCH 07/18] tcg: add vector addition operations Kirill Batuzov
2017-01-17 21:56 ` Richard Henderson
2017-01-17 9:07 ` [Qemu-devel] [PATCH 08/18] target/arm: support access to vector guest registers as globals Kirill Batuzov
2017-01-17 20:07 ` Richard Henderson
2017-01-17 9:07 ` [Qemu-devel] [PATCH 09/18] target/arm: use vector opcode to handle vadd.<size> instruction Kirill Batuzov
2017-01-17 9:07 ` [Qemu-devel] [PATCH 10/18] tcg/i386: add support for vector opcodes Kirill Batuzov
2017-01-17 20:19 ` Richard Henderson
2017-01-18 13:05 ` Kirill Batuzov
2017-01-18 18:22 ` Richard Henderson
2017-01-27 14:51 ` Alex Bennée
2017-01-17 9:07 ` [Qemu-devel] [PATCH 11/18] tcg/i386: support 64-bit vector operations Kirill Batuzov
2017-01-17 9:07 ` [Qemu-devel] [PATCH 12/18] tcg/i386: support remaining vector addition operations Kirill Batuzov
2017-01-17 21:49 ` Richard Henderson
2017-01-17 9:07 ` [Qemu-devel] [PATCH 13/18] tcg: do not relay on exact values of MO_BSWAP or MO_SIGN in backend Kirill Batuzov
2017-01-17 9:07 ` [Qemu-devel] [PATCH 14/18] tcg: introduce new TCGMemOp - MO_128 Kirill Batuzov
2017-01-17 9:07 ` [Qemu-devel] [PATCH 15/18] tcg: introduce qemu_ld_v128 and qemu_st_v128 opcodes Kirill Batuzov
2017-01-17 9:07 ` [Qemu-devel] [PATCH 16/18] softmmu: create helpers for vector loads Kirill Batuzov
2017-01-17 9:07 ` [Qemu-devel] [PATCH 17/18] tcg/i386: add support for qemu_ld_v128/qemu_st_v128 ops Kirill Batuzov
2017-01-17 9:07 ` [Qemu-devel] [PATCH 18/18] target/arm: load two consecutive 64-bits vector regs as a 128-bit vector reg Kirill Batuzov
2017-01-27 14:55 ` Alex Bennée [this message]