From: David Gibson <david@gibson.dropbear.id.au>
To: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Cc: qemu-ppc@nongnu.org, rth@twiddle.net, qemu-devel@nongnu.org,
bharata@linux.vnet.ibm.com
Subject: Re: [Qemu-devel] [PATCH v3 03/10] target/ppc: support for 32-bit carry and overflow
Date: Thu, 23 Feb 2017 14:21:18 +1100
Message-ID: <20170223032118.GD12577@umbus.fritz.box>
In-Reply-To: <1487763883-4877-4-git-send-email-nikunj@linux.vnet.ibm.com>
On Wed, Feb 22, 2017 at 05:14:36PM +0530, Nikunj A Dadhania wrote:
> POWER ISA 3.0 adds CA32 and OV32 status in 64-bit mode. Add the flags
> and corresponding defines.
>
> Moreover, CA32 is updated when CA is updated and OV32 is updated when OV
> is updated.
>
> Arithmetic instructions:
> * Addition and Substractions:
>
> addic, addic., subfic, addc, subfc, adde, subfe, addme, subfme,
> addze, and subfze always updates CA and CA32.
>
> => CA reflects the carry out of bit 0 in 64-bit mode and out of
> bit 32 in 32-bit mode.
> => CA32 reflects the carry out of bit 32 independent of the
> mode.
>
> => SO and OV reflects overflow of the 64-bit result in 64-bit
> mode and overflow of the low-order 32-bit result in 32-bit
> mode
> => OV32 reflects overflow of the low-order 32-bit independent of
> the mode
>
> * Multiply Low and Divide:
>
> For mulld, divd, divde, divdu and divdeu: SO, OV, and OV32 bits
> reflects overflow of the 64-bit result
>
> For mullw, divw, divwe, divwu and divweu: SO, OV, and OV32 bits
> reflects overflow of the 32-bit result
>
> * Negate with OE=1 (nego)
>
> For 64-bit mode if the register RA contains
> 0x8000_0000_0000_0000, OV and OV32 are set to 1.
>
> For 32-bit mode if the register RA contains 0x8000_0000, OV and
> OV32 are set to 1.
>
> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
> ---
> target/ppc/cpu.c | 19 +++++++++++++++++--
> target/ppc/cpu.h | 7 +++++++
> target/ppc/translate.c | 29 ++++++++++++++++++++++++-----
> target/ppc/translate_init.c | 4 ++--
> 4 files changed, 50 insertions(+), 9 deletions(-)
>
> diff --git a/target/ppc/cpu.c b/target/ppc/cpu.c
> index de3004b..89c1ccb 100644
> --- a/target/ppc/cpu.c
> +++ b/target/ppc/cpu.c
> @@ -23,8 +23,15 @@
>
> target_ulong cpu_read_xer(CPUPPCState *env)
> {
> - return env->xer | (env->so << XER_SO) | (env->ov << XER_OV) |
> + target_ulong xer;
> +
> + xer = env->xer | (env->so << XER_SO) | (env->ov << XER_OV) |
> (env->ca << XER_CA);
> +
> + if (is_isa300(env)) {
> + xer |= (env->ov32 << XER_OV32) | (env->ca32 << XER_CA32);
> + }
> + return xer;
> }
>
> void cpu_write_xer(CPUPPCState *env, target_ulong xer)
> @@ -32,5 +39,13 @@ void cpu_write_xer(CPUPPCState *env, target_ulong xer)
> env->so = (xer >> XER_SO) & 1;
> env->ov = (xer >> XER_OV) & 1;
> env->ca = (xer >> XER_CA) & 1;
> - env->xer = xer & ~((1u << XER_SO) | (1u << XER_OV) | (1u << XER_CA));
> + if (is_isa300(env)) {
> + env->ov32 = (xer >> XER_OV32) & 1;
> + env->ca32 = (xer >> XER_CA32) & 1;
I think these might as well be unconditional - as long as
cpu_read_xer() doesn't read the bits back, the guest won't care that
we track them in internal state.

I'm also wondering if it might be worth adding an xer_mask to the env,
instead of explicitly checking is_isa300() all over the place.
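To make the xer_mask idea concrete, here is an untested, standalone sketch of the read side in plain C. It is not QEMU code: XerReadModel stands in for the relevant CPUPPCState fields, and xer_mask, XER_MASK_BASE and XER_MASK_ISA300 are hypothetical names. The mask would be set once when the CPU model is initialised, so the read path needs no ISA check.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions, copied from the cpu.h hunk of the patch. */
#define XER_SO   31
#define XER_OV   30
#define XER_CA   29
#define XER_OV32 19
#define XER_CA32 18

/* Hypothetical masks: the split-out flag bits each CPU family exposes. */
#define XER_MASK_BASE \
    ((1ull << XER_SO) | (1ull << XER_OV) | (1ull << XER_CA))
#define XER_MASK_ISA300 \
    (XER_MASK_BASE | (1ull << XER_OV32) | (1ull << XER_CA32))

/* Hypothetical stand-in for the relevant CPUPPCState fields. */
typedef struct {
    uint64_t xer;                     /* remaining XER bits */
    uint64_t so, ov, ca, ov32, ca32;  /* split-out flags, each 0 or 1 */
    uint64_t xer_mask;                /* assumed new field, set at init */
} XerReadModel;

/* Reassemble the XER; flags outside xer_mask are never shown to the guest. */
static uint64_t model_read_xer(const XerReadModel *env)
{
    uint64_t flags;

    assert((env->so | env->ov | env->ca | env->ov32 | env->ca32) <= 1);

    flags = (env->so << XER_SO) | (env->ov << XER_OV) |
            (env->ca << XER_CA) |
            (env->ov32 << XER_OV32) | (env->ca32 << XER_CA32);
    return env->xer | (flags & env->xer_mask);
}
```

With xer_mask = XER_MASK_BASE, OV32 and CA32 can be tracked internally on every CPU without ever leaking into a guest-visible read.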
> + env->xer = xer & ~((1ul << XER_SO) |
> + (1ul << XER_OV) | (1ul << XER_CA) |
> + (1ul << XER_OV32) | (1ul << XER_CA32));
> + } else {
> + env->xer = xer & ~((1u << XER_SO) | (1u << XER_OV) | (1u << XER_CA));
> + }
And you can definitely use the stricter mask for both archs. If it's
ISA300, you've stashed them elsewhere; if it's not, those bits are
invalid anyway.
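Roughly what I have in mind for the write side, as an untested plain-C model rather than a drop-in replacement. XerWriteModel and XER_SPLIT_BITS are hypothetical names; the bit positions are the ones from your cpu.h hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions, copied from the cpu.h hunk of the patch. */
#define XER_SO   31
#define XER_OV   30
#define XER_CA   29
#define XER_OV32 19
#define XER_CA32 18

/* Hypothetical name for the stricter mask: every bit kept outside env->xer. */
#define XER_SPLIT_BITS \
    ((1ull << XER_SO) | (1ull << XER_OV) | (1ull << XER_CA) | \
     (1ull << XER_OV32) | (1ull << XER_CA32))

/* Hypothetical stand-in for the relevant CPUPPCState fields. */
typedef struct {
    uint64_t xer;                     /* remaining XER bits */
    uint64_t so, ov, ca, ov32, ca32;  /* split-out flags, each 0 or 1 */
} XerWriteModel;

/* One code path for all CPUs: always stash the flags, always strip them. */
static void model_write_xer(XerWriteModel *env, uint64_t xer)
{
    env->so   = (xer >> XER_SO) & 1;
    env->ov   = (xer >> XER_OV) & 1;
    env->ca   = (xer >> XER_CA) & 1;
    env->ov32 = (xer >> XER_OV32) & 1;
    env->ca32 = (xer >> XER_CA32) & 1;
    env->xer  = xer & ~XER_SPLIT_BITS;

    /* No split-out bit may survive in the residual XER value. */
    assert((env->xer & XER_SPLIT_BITS) == 0);
}
```

On a pre-3.0 CPU the stashed ov32/ca32 values are simply never read back, so the guest can't tell the difference.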
(Incidentally, given the modern balance between the cost of
instructions and cachelines, I wonder if all these split out bits of
the XER are a good idea in any case, but that would be a big change,
out of scope for what you're attempting here.)
> }
> diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> index b559b67..ee2eb45 100644
> --- a/target/ppc/cpu.h
> +++ b/target/ppc/cpu.h
> @@ -965,6 +965,8 @@ struct CPUPPCState {
> target_ulong so;
> target_ulong ov;
> target_ulong ca;
> + target_ulong ov32;
> + target_ulong ca32;
> /* Reservation address */
> target_ulong reserve_addr;
> /* Reservation value */
> @@ -1372,11 +1374,15 @@ int ppc_compat_max_threads(PowerPCCPU *cpu);
> #define XER_SO 31
> #define XER_OV 30
> #define XER_CA 29
> +#define XER_OV32 19
> +#define XER_CA32 18
> #define XER_CMP 8
> #define XER_BC 0
> #define xer_so (env->so)
> #define xer_ov (env->ov)
> #define xer_ca (env->ca)
> +#define xer_ov32 (env->ov)
> +#define xer_ca32 (env->ca)
> #define xer_cmp ((env->xer >> XER_CMP) & 0xFF)
> #define xer_bc ((env->xer >> XER_BC) & 0x7F)
>
> @@ -2343,6 +2349,7 @@ enum {
>
> /*****************************************************************************/
>
> +#define is_isa300(ctx) (!!(ctx->insns_flags2 & PPC2_ISA300))
> target_ulong cpu_read_xer(CPUPPCState *env);
> void cpu_write_xer(CPUPPCState *env, target_ulong xer);
>
> diff --git a/target/ppc/translate.c b/target/ppc/translate.c
> index b09e16f..c9f6768 100644
> --- a/target/ppc/translate.c
> +++ b/target/ppc/translate.c
> @@ -71,7 +71,7 @@ static TCGv cpu_lr;
> #if defined(TARGET_PPC64)
> static TCGv cpu_cfar;
> #endif
> -static TCGv cpu_xer, cpu_so, cpu_ov, cpu_ca;
> +static TCGv cpu_xer, cpu_so, cpu_ov, cpu_ca, cpu_ov32, cpu_ca32;
> static TCGv cpu_reserve;
> static TCGv cpu_fpscr;
> static TCGv_i32 cpu_access_type;
> @@ -173,6 +173,10 @@ void ppc_translate_init(void)
> offsetof(CPUPPCState, ov), "OV");
> cpu_ca = tcg_global_mem_new(cpu_env,
> offsetof(CPUPPCState, ca), "CA");
> + cpu_ov32 = tcg_global_mem_new(cpu_env,
> + offsetof(CPUPPCState, ov32), "OV32");
> + cpu_ca32 = tcg_global_mem_new(cpu_env,
> + offsetof(CPUPPCState, ca32), "CA32");
>
> cpu_reserve = tcg_global_mem_new(cpu_env,
> offsetof(CPUPPCState, reserve_addr),
> @@ -3703,7 +3707,7 @@ static void gen_tdi(DisasContext *ctx)
>
> /*** Processor control ***/
>
> -static void gen_read_xer(TCGv dst)
> +static void gen_read_xer(DisasContext *ctx, TCGv dst)
> {
> TCGv t0 = tcg_temp_new();
> TCGv t1 = tcg_temp_new();
> @@ -3715,15 +3719,30 @@ static void gen_read_xer(TCGv dst)
> tcg_gen_or_tl(t0, t0, t1);
> tcg_gen_or_tl(dst, dst, t2);
> tcg_gen_or_tl(dst, dst, t0);
> + if (is_isa300(ctx)) {
> + tcg_gen_shli_tl(t0, cpu_ov32, XER_OV32);
> + tcg_gen_or_tl(dst, dst, t0);
> + tcg_gen_shli_tl(t0, cpu_ca32, XER_CA32);
> + tcg_gen_or_tl(dst, dst, t0);
Could you use 2 deposits here, instead of 2 shifts and 2 ors?
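For reference, a plain-C model of what I mean, untested and not TCG code. deposit64_model is a hypothetical helper that mimics a deposit op; in the translator I'd expect each call to become a tcg_gen_deposit_tl(dst, dst, flag, ofs, 1). The two forms agree here because bits 18 and 19 of dst are already clear and each flag is 0 or 1.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions, copied from the cpu.h hunk of the patch. */
#define XER_OV32 19
#define XER_CA32 18

/* Replace len bits of dst, starting at bit ofs, with the low bits of field. */
static uint64_t deposit64_model(uint64_t dst, uint64_t field,
                                unsigned ofs, unsigned len)
{
    uint64_t mask;

    assert(len > 0 && len < 64 && ofs + len <= 64);
    mask = ((1ull << len) - 1) << ofs;
    return (dst & ~mask) | ((field << ofs) & mask);
}

/* The form in the patch: one shift and one OR per flag. */
static uint64_t merge_shift_or(uint64_t dst, uint64_t ov32, uint64_t ca32)
{
    dst |= ov32 << XER_OV32;
    dst |= ca32 << XER_CA32;
    return dst;
}

/* The suggested form: one deposit per flag, no scratch temporary needed. */
static uint64_t merge_deposit(uint64_t dst, uint64_t ov32, uint64_t ca32)
{
    dst = deposit64_model(dst, ov32, XER_OV32, 1);
    dst = deposit64_model(dst, ca32, XER_CA32, 1);
    return dst;
}
```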
> + }
> tcg_temp_free(t0);
> tcg_temp_free(t1);
> tcg_temp_free(t2);
> }
>
> -static void gen_write_xer(TCGv src)
> +static void gen_write_xer(DisasContext *ctx, TCGv src)
> {
> - tcg_gen_andi_tl(cpu_xer, src,
> - ~((1u << XER_SO) | (1u << XER_OV) | (1u << XER_CA)));
> + if (is_isa300(ctx)) {
> + tcg_gen_andi_tl(cpu_xer, src,
> + ~((1u << XER_SO) |
> + (1u << XER_OV) | (1u << XER_OV32) |
> + (1u << XER_CA) | (1u << XER_CA32)));
> + tcg_gen_extract_tl(cpu_ov32, src, XER_OV32, 1);
> + tcg_gen_extract_tl(cpu_ca32, src, XER_CA32, 1);
> + } else {
> + tcg_gen_andi_tl(cpu_xer, src,
> + ~((1u << XER_SO) | (1u << XER_OV) | (1u << XER_CA)));
> + }
> tcg_gen_extract_tl(cpu_so, src, XER_SO, 1);
> tcg_gen_extract_tl(cpu_ov, src, XER_OV, 1);
> tcg_gen_extract_tl(cpu_ca, src, XER_CA, 1);
> diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
> index be35cbd..eb667bb 100644
> --- a/target/ppc/translate_init.c
> +++ b/target/ppc/translate_init.c
> @@ -107,12 +107,12 @@ static void spr_access_nop(DisasContext *ctx, int sprn, int gprn)
> /* XER */
> static void spr_read_xer (DisasContext *ctx, int gprn, int sprn)
> {
> - gen_read_xer(cpu_gpr[gprn]);
> + gen_read_xer(ctx, cpu_gpr[gprn]);
> }
>
> static void spr_write_xer (DisasContext *ctx, int sprn, int gprn)
> {
> - gen_write_xer(cpu_gpr[gprn]);
> + gen_write_xer(ctx, cpu_gpr[gprn]);
> }
>
> /* LR */
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson