qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH for-1.2] target-ppc: fix altivec instructions
@ 2012-08-26 14:14 Aurelien Jarno
  2012-08-26 15:25 ` Peter Maydell
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Aurelien Jarno @ 2012-08-26 14:14 UTC
  To: qemu-devel
  Cc: Blue Swirl, Alexander Graf, Aurelien Jarno, Andreas Färber

Altivec instructions are not working anymore in PowerPC emulation,
following commit d15f74fb, which inverted two registers in the call
to helper. Fix that.

Cc: Blue Swirl <blauwirbel@gmail.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Andreas Färber <afaerber@suse.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-ppc/translate.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 91eb7a0..ac915cc 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -6530,7 +6530,7 @@ static void glue(gen_, name)(DisasContext *ctx)                         \
     ra = gen_avr_ptr(rA(ctx->opcode));                                  \
     rb = gen_avr_ptr(rB(ctx->opcode));                                  \
     rd = gen_avr_ptr(rD(ctx->opcode));                                  \
-    gen_helper_##name(rd, cpu_env, ra, rb);                             \
+    gen_helper_##name(cpu_env, rd, ra, rb);                             \
     tcg_temp_free_ptr(ra);                                              \
     tcg_temp_free_ptr(rb);                                              \
     tcg_temp_free_ptr(rd);                                              \
-- 
1.7.10.4


* Re: [Qemu-devel] [PATCH for-1.2] target-ppc: fix altivec instructions
  2012-08-26 14:14 [Qemu-devel] [PATCH for-1.2] target-ppc: fix altivec instructions Aurelien Jarno
@ 2012-08-26 15:25 ` Peter Maydell
  2012-08-26 15:27 ` Andreas Färber
  2012-08-26 17:56 ` Blue Swirl
  2 siblings, 0 replies; 6+ messages in thread
From: Peter Maydell @ 2012-08-26 15:25 UTC
  To: Aurelien Jarno
  Cc: Blue Swirl, qemu-devel, Andreas Färber, Alexander Graf

On 26 August 2012 15:14, Aurelien Jarno <aurelien@aurel32.net> wrote:
> Altivec instructions are not working anymore in PowerPC emulation,
> following commit d15f74fb, which inverted two registers in the call
> to helper. Fix that.
>
> Cc: Blue Swirl <blauwirbel@gmail.com>
> Cc: Alexander Graf <agraf@suse.de>
> Cc: Andreas Färber <afaerber@suse.de>
> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
> ---
>  target-ppc/translate.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/target-ppc/translate.c b/target-ppc/translate.c
> index 91eb7a0..ac915cc 100644
> --- a/target-ppc/translate.c
> +++ b/target-ppc/translate.c
> @@ -6530,7 +6530,7 @@ static void glue(gen_, name)(DisasContext *ctx)                         \
>      ra = gen_avr_ptr(rA(ctx->opcode));                                  \
>      rb = gen_avr_ptr(rB(ctx->opcode));                                  \
>      rd = gen_avr_ptr(rD(ctx->opcode));                                  \
> -    gen_helper_##name(rd, cpu_env, ra, rb);                             \
> +    gen_helper_##name(cpu_env, rd, ra, rb);                             \
>      tcg_temp_free_ptr(ra);                                              \
>      tcg_temp_free_ptr(rb);                                              \
>      tcg_temp_free_ptr(rd);                                              \

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

(For these helpers, rd is an input to the helper, not an output,
which is why cpu_env has to go first, unlike e.g. gen_helper_mulldo().)

-- PMM


* Re: [Qemu-devel] [PATCH for-1.2] target-ppc: fix altivec instructions
  2012-08-26 14:14 [Qemu-devel] [PATCH for-1.2] target-ppc: fix altivec instructions Aurelien Jarno
  2012-08-26 15:25 ` Peter Maydell
@ 2012-08-26 15:27 ` Andreas Färber
  2012-08-26 15:46   ` Aurelien Jarno
  2012-08-26 17:56 ` Blue Swirl
  2 siblings, 1 reply; 6+ messages in thread
From: Andreas Färber @ 2012-08-26 15:27 UTC
  To: Aurelien Jarno; +Cc: Blue Swirl, qemu-devel, Alexander Graf

On 26.08.2012 16:14, Aurelien Jarno wrote:
> Altivec instructions are not working anymore in PowerPC emulation,
> following commit d15f74fb, which inverted two registers in the call
> to helper. Fix that.
> 
> Cc: Blue Swirl <blauwirbel@gmail.com>
> Cc: Alexander Graf <agraf@suse.de>
> Cc: Andreas Färber <afaerber@suse.de>
> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>

Reviewed-by: Andreas Färber <afaerber@suse.de>

This looks right, but do you have a particular test case I can check?

Andreas

> ---
>  target-ppc/translate.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target-ppc/translate.c b/target-ppc/translate.c
> index 91eb7a0..ac915cc 100644
> --- a/target-ppc/translate.c
> +++ b/target-ppc/translate.c
> @@ -6530,7 +6530,7 @@ static void glue(gen_, name)(DisasContext *ctx)                         \
>      ra = gen_avr_ptr(rA(ctx->opcode));                                  \
>      rb = gen_avr_ptr(rB(ctx->opcode));                                  \
>      rd = gen_avr_ptr(rD(ctx->opcode));                                  \
> -    gen_helper_##name(rd, cpu_env, ra, rb);                             \
> +    gen_helper_##name(cpu_env, rd, ra, rb);                             \
>      tcg_temp_free_ptr(ra);                                              \
>      tcg_temp_free_ptr(rb);                                              \
>      tcg_temp_free_ptr(rd);                                              \
> 


-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg


* Re: [Qemu-devel] [PATCH for-1.2] target-ppc: fix altivec instructions
  2012-08-26 15:27 ` Andreas Färber
@ 2012-08-26 15:46   ` Aurelien Jarno
  0 siblings, 0 replies; 6+ messages in thread
From: Aurelien Jarno @ 2012-08-26 15:46 UTC
  To: Andreas Färber; +Cc: Blue Swirl, qemu-devel, Alexander Graf

[-- Attachment #1: Type: text/plain, Size: 764 bytes --]

On Sun, Aug 26, 2012 at 05:27:59PM +0200, Andreas Färber wrote:
> On 26.08.2012 16:14, Aurelien Jarno wrote:
> > Altivec instructions are not working anymore in PowerPC emulation,
> > following commit d15f74fb, which inverted two registers in the call
> > to helper. Fix that.
> > 
> > Cc: Blue Swirl <blauwirbel@gmail.com>
> > Cc: Alexander Graf <agraf@suse.de>
> > Cc: Andreas Färber <afaerber@suse.de>
> > Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
> 
> Reviewed-by: Andreas Färber <afaerber@suse.de>
> 
> This looks right, but do you have a particular test case I can check?
> 

The Gwenole Beauchesne testsuite (attached).

-- 
Aurelien Jarno	                        GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net

[-- Attachment #2: test-powerpc.cpp --]
[-- Type: text/x-c++src, Size: 64744 bytes --]

/*
 *  test-powerpc.cpp - PowerPC regression testing
 *
 *  Kheperix (C) 2003-2005 Gwenole Beauchesne
 *
 *  This program is free software; you can redistribute it and/or modify
 *  it under the terms of the GNU General Public License as published by
 *  the Free Software Foundation; either version 2 of the License, or
 *  (at your option) any later version.
 *
 *  This program is distributed in the hope that it will be useful,
 *  but WITHOUT ANY WARRANTY; without even the implied warranty of
 *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 *  GNU General Public License for more details.
 *
 *  You should have received a copy of the GNU General Public License
 *  along with this program; if not, write to the Free Software
 *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 */

// NOTE: Results file md5sum: 3e29432abb6e21e625a2eef8cf2f0840 ($Revision: 1.30 $)

#include <vector>
#include <limits>
#include <altivec.h>
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <ctype.h>
#include <math.h>
#include <stdint.h>
#include <string.h>
#ifndef UINT_MAX
#define UINT_MAX UINT32_MAX
#endif

// Define units to test (in-order: ALU, FPU, VMX)
#define TEST_ALU_OPS	1
#define TEST_FPU_OPS	1
#define TEST_VMX_OPS	1

// Define units to skip during testing
#define SKIP_ALU_OPS	0
#define SKIP_FPU_OPS	0
#define SKIP_VMX_OPS	0

// Define instructions to test
#define TEST_ADD		1
#define TEST_SUB		1
#define TEST_MUL		1
#define TEST_DIV		1
#define TEST_SHIFT		1
#define TEST_ROTATE		1
#define TEST_MISC		1
#define TEST_LOGICAL	1
#define TEST_COMPARE	1
#define TEST_CR_LOGICAL	1
#define TEST_VMX_LOADSH	1
#define TEST_VMX_LOAD	1
#define TEST_VMX_ARITH	1

// Partial PowerPC runtime assembler from GNU lightning
#undef  _I
#define _I(X)			((uint32_t)(X))
#define _UL(X)			((uint32_t)(X))
#define _MASK(N)		((uint32_t)((1<<(N)))-1)
#define _ck_s(W,I)		(_UL(I) & _MASK(W))
#define _ck_u(W,I)    	(_UL(I) & _MASK(W))
#define _ck_su(W,I)    	(_UL(I) & _MASK(W))
#define _u1(I)          _ck_u( 1,I)
#define _u5(I)          _ck_u( 5,I)
#define _u6(I)          _ck_u( 6,I)
#define _u9(I)          _ck_u( 9,I)
#define _u10(I)         _ck_u(10,I)
#define _u11(I)			_ck_u(11,I)
#define _s16(I)         _ck_s(16,I)

#undef  _D
#define _D(   OP,RD,RA,         DD )  	_I((_u6(OP)<<26)|(_u5(RD)<<21)|(_u5(RA)<<16)|                _s16(DD)                          )
#undef  _X
#define _X(   OP,RD,RA,RB,   XO,RC )  	_I((_u6(OP)<<26)|(_u5(RD)<<21)|(_u5(RA)<<16)|( _u5(RB)<<11)|              (_u10(XO)<<1)|_u1(RC))
#define _XO(  OP,RD,RA,RB,OE,XO,RC )  	_I((_u6(OP)<<26)|(_u5(RD)<<21)|(_u5(RA)<<16)|( _u5(RB)<<11)|(_u1(OE)<<10)|( _u9(XO)<<1)|_u1(RC))
#define _M(   OP,RS,RA,SH,MB,ME,RC )  	_I((_u6(OP)<<26)|(_u5(RS)<<21)|(_u5(RA)<<16)|( _u5(SH)<<11)|(_u5(MB)<< 6)|( _u5(ME)<<1)|_u1(RC))
#define _VX(  OP,VD,VA,VB,   XO    )	_I((_u6(OP)<<26)|(_u5(VD)<<21)|(_u5(VA)<<16)|( _u5(VB)<<11)|               _u11(XO)            )
#define _VXR( OP,VD,VA,VB,   XO,RC )	_I((_u6(OP)<<26)|(_u5(VD)<<21)|(_u5(VA)<<16)|( _u5(VB)<<11)|              (_u1(RC)<<10)|_u10(XO))
#define _VA(  OP,VD,VA,VB,VC,XO    )	_I((_u6(OP)<<26)|(_u5(VD)<<21)|(_u5(VA)<<16)|( _u5(VB)<<11)|(_u5(VC)<< 6)|  _u6(XO)            )

// PowerPC opcodes
static inline uint32_t POWERPC_LI(int RD, uint32_t v) { return _D(14,RD,00,(v&0xffff)); }
static inline uint32_t POWERPC_MR(int RD, int RA) { return _X(31,RA,RD,RA,444,0); }
static inline uint32_t POWERPC_MFCR(int RD) { return _X(31,RD,00,00,19,0); }
static inline uint32_t POWERPC_LVX(int vD, int rA, int rB) { return _X(31,vD,rA,rB,103,0); }
static inline uint32_t POWERPC_STVX(int vS, int rA, int rB) { return _X(31,vS,rA,rB,231,0); }
static inline uint32_t POWERPC_MFSPR(int rD, int SPR) { return _X(31,rD,(SPR&0x1f),((SPR>>5)&0x1f),339,0); }
static inline uint32_t POWERPC_MTSPR(int rS, int SPR) { return _X(31,rS,(SPR&0x1f),((SPR>>5)&0x1f),467,0); }
const uint32_t POWERPC_NOP = 0x60000000;
const uint32_t POWERPC_BLR = 0x4e800020;
const uint32_t POWERPC_BLRL = 0x4e800021;
const uint32_t POWERPC_ILLEGAL = 0x00000000;
const uint32_t POWERPC_EMUL_OP = 0x18000000;

// Invalidate test cache
static inline void ppc_flush_icache_range(uint32_t *start_p, uint32_t length)
{
	const int MIN_CACHE_LINE_SIZE = 8; /* conservative value */

	unsigned long start = (unsigned long)start_p;
	unsigned long stop  = start + length;
    unsigned long p;

    /* Align start down to a cache line; both loops below begin at start. */
    start = start & ~(MIN_CACHE_LINE_SIZE - 1);
    stop = (stop + MIN_CACHE_LINE_SIZE - 1) & ~(MIN_CACHE_LINE_SIZE - 1);
    
    for (p = start; p < stop; p += MIN_CACHE_LINE_SIZE) {
        asm volatile ("dcbst 0,%0" : : "r"(p) : "memory");
    }
    asm volatile ("sync" : : : "memory");
    for (p = start; p < stop; p += MIN_CACHE_LINE_SIZE) {
        asm volatile ("icbi 0,%0" : : "r"(p) : "memory");
    }
    asm volatile ("sync" : : : "memory");
    asm volatile ("isync" : : : "memory");
}


// Define bit-fields
#if !EMU_KHEPERIX
template< int FB, int FE >
struct static_mask {
	enum { value = (0xffffffff >> FB) ^ (0xffffffff >> (FE + 1)) };
};

template< int FB >
struct static_mask<FB, 31> {
	enum { value  = 0xffffffff >> FB };
};

template< int FB, int FE >
struct bit_field {
	static inline uint32_t mask() {
		return static_mask<FB, FE>::value;
	}
	static inline bool test(uint32_t value) {
		return value & mask();
	}
	static inline uint32_t extract(uint32_t value) {
		const uint32_t m = mask() >> (31 - FE);
		return (value >> (31 - FE)) & m;
	}
	static inline void insert(uint32_t & data, uint32_t value) {
		const uint32_t m = mask();
		data = (data & ~m) | ((value << (31 - FE)) & m);
	}
};

// General purpose registers
typedef bit_field< 11, 15 > rA_field;
typedef bit_field< 16, 20 > rB_field;
typedef bit_field<  6, 10 > rD_field;
typedef bit_field<  6, 10 > rS_field;

// Vector registers
typedef bit_field< 11, 15 > vA_field;
typedef bit_field< 16, 20 > vB_field;
typedef bit_field< 21, 25 > vC_field;
typedef bit_field<  6, 10 > vD_field;
typedef bit_field<  6, 10 > vS_field;
typedef bit_field< 22, 25 > vSH_field;

// Condition registers
typedef bit_field< 11, 15 > crbA_field;
typedef bit_field< 16, 20 > crbB_field;
typedef bit_field<  6, 10 > crbD_field;
typedef bit_field<  6,  8 > crfD_field;
typedef bit_field< 11, 13 > crfS_field;

// CR register fields
template< int CRn > struct CR_field : bit_field< 4*CRn+0, 4*CRn+3 > { };
template< int CRn > struct CR_LT_field : bit_field< 4*CRn+0, 4*CRn+0 > { };
template< int CRn > struct CR_GT_field : bit_field< 4*CRn+1, 4*CRn+1 > { };
template< int CRn > struct CR_EQ_field : bit_field< 4*CRn+2, 4*CRn+2 > { };
template< int CRn > struct CR_SO_field : bit_field< 4*CRn+3, 4*CRn+3 > { };
template< int CRn > struct CR_UN_field : bit_field< 4*CRn+3, 4*CRn+3 > { };

// Immediates
typedef bit_field< 16, 31 > UIMM_field;
typedef bit_field< 21, 25 > MB_field;
typedef bit_field< 26, 30 > ME_field;
typedef bit_field< 16, 20 > SH_field;

// XER register fields
typedef bit_field<  0,  0 > XER_SO_field;
typedef bit_field<  1,  1 > XER_OV_field;
typedef bit_field<  2,  2 > XER_CA_field;
#endif
#undef  CA
#define CA XER_CA_field::mask()
#undef  OV
#define OV XER_OV_field::mask()
#undef  SO
#define SO XER_SO_field::mask()

// Flag: does the host support AltiVec instructions?
static bool has_altivec = true;

// A 128-bit AltiVec register
typedef uint8_t vector_t[16];

class aligned_vector_t {
	struct {
		vector_t v;
		uint8_t pad[16];
	} vs;
public:
	aligned_vector_t()
		{ clear(); }
	void clear()
		{ memset(addr(), 0, sizeof(vector_t)); }
	void copy(vector_t const & vi, int n = sizeof(vector_t))
		{ clear(); memcpy(addr(), &vi, n); }
	vector_t *addr() const
		{ return (vector_t *)(((char *)&vs.v) + (16 - (((uintptr_t)&vs.v) % 16))); }
	vector_t const & value() const
		{ return *addr(); }
	vector_t & value()
		{ return *addr(); }
};

union vector_helper_t {
	vector_t v;
	uint8_t	b[16];
	uint16_t	h[8];
	uint32_t	w[4];
	float	f[4];
};

static void print_vector(vector_t const & v, char type = 'b')
{
	vector_helper_t x;
	memcpy(&x.b, &v, sizeof(vector_t));

	printf("{");
	switch (type) {
	case 'b':
	default:
		for (int i = 0; i < 16; i++) {
			if (i != 0)
				printf(",");
			printf(" %02x", x.b[i]);
		}
		break;
	case 'h':
		for (int i = 0; i < 8; i++) {
			if (i != 0)
				printf(",");
			printf(" %04x", x.h[i]);
		}
		break;
	case 'w':
		for (int i = 0; i < 4; i++) {
			if (i != 0)
				printf(",");
			printf(" %08x", x.w[i]);
		}
		break;
	case 'f':
	case 'e': // estimate result
	case 'l': // estimate log2 result
		for (int i = 0; i < 4; i++) {
			x.w[i] = (x.w[i]);
			if (i != 0)
				printf(",");
			printf(" %g", x.f[i]);
		}
		break;
	}
	printf(" }");
}

static inline bool do_float_equals(float a, float b, float tolerance)
{
	if (a == b)
		return true;

	if (isnan(a) && isnan(b))
		return true;

	if (isinf(a) && isinf(b) && signbit(a) == signbit(b))
		return true;

	if ((b < (a + tolerance)) && (b > (a - tolerance)))
		return true;

	return false;
}

static inline bool float_equals(float a, float b)
{
	return do_float_equals(a, b, 3 * std::numeric_limits<float>::epsilon());
}

static bool vector_equals(char type, vector_t const & a, vector_t const & b)
{
	// the vector is in ppc big endian format
	float tolerance;
	switch (type) {
	case 'f':
		tolerance = 3 * std::numeric_limits<float>::epsilon();
		goto do_compare;
	case 'l': // FIXME: this does not handle |x-1|<=1/8 case
		tolerance = 1. / 32.;
		goto do_compare;
	case 'e':
		tolerance = 1. / 4096.;
	  do_compare:
		for (int i = 0; i < 4; i++) {
			union { float f; uint32_t i; } u, v;
			u.i = (((uint32_t *)&a)[i]);
			v.i = (((uint32_t *)&b)[i]);
			if (!do_float_equals(u.f, v.f, tolerance))
				return false;
		}
		return true;
	}

	return memcmp(&a, &b, sizeof(vector_t)) == 0;
}

static bool vector_all_eq(char type, vector_t const & b)
{
	uint32_t v;
	vector_helper_t x;
	memcpy(&x.v, &b, sizeof(vector_t));

	bool all_eq = true;
	switch (type) {
	case 'b':
	default:
		v = x.b[0];
		for (int i = 1; all_eq && i < 16; i++)
			if (x.b[i] != v)
				all_eq = false;
		break;
	case 'h':
		v = x.h[0];
		for (int i = 1; all_eq && i < 8; i++)
			if (x.h[i] != v)
				all_eq = false;
		break;
	case 'w':
	case 'f':
		v = x.w[0];
		for (int i = 1; all_eq && i < 4; i++)
			if (x.w[i] != v)
				all_eq = false;
		break;
	}
	return all_eq;
}

// Define PowerPC tester
class powerpc_test_cpu
{
	uint32_t native_get_xer() const
		{ uint32_t xer; asm volatile ("mfxer %0" : "=r" (xer)); return xer; }

	void native_set_xer(uint32_t xer) const
		{ asm volatile ("mtxer %0" : : "r" (xer)); }

	uint32_t native_get_cr() const
		{ uint32_t cr; asm volatile ("mfcr %0" : "=r" (cr)); return cr; }

	void native_set_cr(uint32_t cr) const
		{ asm volatile ("mtcr %0" :  : "r" (cr)); }

	void flush_icache_range(uint32_t *start, uint32_t size)
		{ ppc_flush_icache_range(start, size); }

	void print_xer_flags(uint32_t xer) const;
	void print_flags(uint32_t cr, uint32_t xer, int crf = 0) const;

public:

  powerpc_test_cpu(FILE *, FILE *);
  ~powerpc_test_cpu();

  bool test(void);

private:

  static const bool verbose = false;
  uint32_t tests, errors;
  enum testing_mode {
    MODE_GENERATING,
    MODE_COMPARING
  } test_mode;

  FILE *results_file;
  FILE *reference_file;
  uint32_t get32(FILE *f);
  void put32(FILE *f, uint32_t v);
  void get_vector(FILE *f, vector_t & v);
  void put_vector(FILE *f, vector_t const & v);

	// Initial CR0, XER states
	uint32_t init_cr;
	uint32_t init_xer;

	// XER preset values to test with
	std::vector<uint32_t> xer_values;
	void gen_xer_values(uint32_t use_mask, uint32_t set_mask);

	// Emulated registers IDs
	enum {
		RD = 3,
		RA = 4,
		RB = 5,
		RC = 6,
		VSCR = 7,
	};

	// Operands
	enum {
		__,
		vD, vS, vA, vB, vC, vI, vN,
		rD, rS, rA, rB, rC,
	};

	struct vector_test_t {
		uint8_t	name[14];
		char	type;
		char	op_type;
		uint32_t	opcode;
		uint8_t	operands[4];
	};

	struct vector_value_t {
		char type;
		vector_t v;
	};

	static const uint32_t reg_values[];
	static const uint32_t imm_values[];
	static const uint32_t msk_values[];
	static const vector_value_t vector_values[];
	static const vector_value_t vector_fp_values[];

	void test_one_1(uint32_t *code, const char *insn, uint32_t a1, uint32_t a2, uint32_t a3, uint32_t a0 = 0);
	void test_one(uint32_t *code, const char *insn, uint32_t a1, uint32_t a2, uint32_t a3, uint32_t a0 = 0);
	void test_instruction_CNTLZ(const char *insn, uint32_t opcode);
	void test_instruction_RR___(const char *insn, uint32_t opcode);
	void test_instruction_RRI__(const char *insn, uint32_t opcode);
#define  test_instruction_RRK__ test_instruction_RRI__
	void test_instruction_RRS__(const char *insn, uint32_t opcode);
	void test_instruction_RRR__(const char *insn, uint32_t opcode);
	void test_instruction_RRRSH(const char *insn, uint32_t opcode);
	void test_instruction_RRIII(const char *insn, uint32_t opcode);
	void test_instruction_RRRII(const char *insn, uint32_t opcode);
	void test_instruction_CRR__(const char *insn, uint32_t opcode);
	void test_instruction_CRI__(const char *insn, uint32_t opcode);
#define  test_instruction_CRK__ test_instruction_CRI__
	void test_instruction_CCC__(const char *insn, uint32_t opcode);

	void test_add(void);
	void test_sub(void);
	void test_mul(void);
	void test_div(void);
	void test_shift(void);
	void test_rotate(void);
	void test_logical(void);
	void test_compare(void);
	void test_cr_logical(void);

	void test_one_vector(uint32_t *code, vector_test_t const & vt, uint8_t *rA, uint8_t *rB = 0, uint8_t *rC = 0);
	void test_one_vector(uint32_t *code, vector_test_t const & vt, vector_t const *vA = 0, vector_t const *vB = 0, vector_t const *vC = 0)
		{ test_one_vector(code, vt, (uint8_t *)vA, (uint8_t *)vB, (uint8_t *)vC); }
	void test_vector_load(void);
	void test_vector_load_for_shift(void);
	void test_vector_arith(void);
};

powerpc_test_cpu::powerpc_test_cpu(FILE *result, FILE *reference)
  : results_file(result), reference_file(reference)
{
  if (reference_file == NULL)
    test_mode = MODE_GENERATING;
  else
    test_mode = MODE_COMPARING;
}

powerpc_test_cpu::~powerpc_test_cpu()
{
}

uint32_t powerpc_test_cpu::get32(FILE *f)
{
	uint32_t v;
	if (fread(&v, sizeof(v), 1, f) != 1) {
		fprintf(stderr, "ERROR: unexpected end of results file\n");
		exit(EXIT_FAILURE);
	}
	return (v);
}

void powerpc_test_cpu::put32(FILE *f, uint32_t v)
{
	uint32_t out = (v);
	if (fwrite(&out, sizeof(out), 1, f) != 1) {
		fprintf(stderr, "could not write item to results file\n");
		exit(EXIT_FAILURE);
	}
}

void powerpc_test_cpu::get_vector(FILE *f, vector_t & v)
{
	if (fread(&v, sizeof(v), 1, f) != 1) {
		fprintf(stderr, "ERROR: unexpected end of results file\n");
		exit(EXIT_FAILURE);
	}
}

void powerpc_test_cpu::put_vector(FILE *f, vector_t const & v)
{
	if (fwrite(&v, sizeof(v), 1, f) != 1) {
		fprintf(stderr, "could not write vector to results file\n");
		exit(EXIT_FAILURE);
	}
}

void powerpc_test_cpu::gen_xer_values(uint32_t use_mask, uint32_t set_mask)
{
	const uint32_t mask = use_mask | set_mask;

	// Always test with XER=0
	xer_values.clear();
	xer_values.push_back(0);

	// Iterate over XER fields, only handle CA, OV, SO
	for (uint32_t m = 0x80000000; m != 0; m >>= 1) {
		if (m & (CA | OV | SO) & mask) {
			const int n_xer_values = xer_values.size();
			for (int i = 0; i < n_xer_values; i++)
				xer_values.push_back(xer_values[i] | m);
		}
	}
}

void powerpc_test_cpu::print_xer_flags(uint32_t xer) const
{
	printf("%s,%s,%s",
		   (xer & XER_CA_field::mask() ? "CA" : "__"),
		   (xer & XER_OV_field::mask() ? "OV" : "__"),
		   (xer & XER_SO_field::mask() ? "SO" : "__"));
}

void powerpc_test_cpu::print_flags(uint32_t cr, uint32_t xer, int crf) const
{
	cr = cr << (4 * crf);
	printf("%s,%s,%s,%s,%s,%s",
		   (cr & CR_LT_field<0>::mask() ? "LT" : "__"),
		   (cr & CR_GT_field<0>::mask() ? "GT" : "__"),
		   (cr & CR_EQ_field<0>::mask() ? "EQ" : "__"),
		   (cr & CR_SO_field<0>::mask() ? "SO" : "__"),
		   (xer & XER_OV_field::mask()  ? "OV" : "__"),
		   (xer & XER_CA_field::mask()  ? "CA" : "__"));
}

#define TEST_INSTRUCTION(FORMAT, NATIVE_OP, EMUL_OP) do {	\
	printf("Testing " NATIVE_OP "\n");						\
	test_instruction_##FORMAT(NATIVE_OP, EMUL_OP);			\
} while (0)

void powerpc_test_cpu::test_one(uint32_t *code, const char *insn, uint32_t a1, uint32_t a2, uint32_t a3, uint32_t a0)
{
	// Iterate over test XER values as input
	const int n_xer_values = xer_values.size();
	for (int i = 0; i < n_xer_values; i++) {
		init_xer = xer_values[i];
		test_one_1(code, insn, a1, a2, a3, a0);
	}
	init_xer = 0;
}

void powerpc_test_cpu::test_one_1(uint32_t *code, const char *insn, uint32_t a1, uint32_t a2, uint32_t a3, uint32_t a0)
{
  if (test_mode == MODE_GENERATING)
    {
      // Invoke native code
      const uint32_t save_xer = native_get_xer();
      const uint32_t save_cr = native_get_cr();
      native_set_xer(init_xer);
      native_set_cr(init_cr);
      typedef uint32_t (*func_t)(uint32_t, uint32_t, uint32_t);
      func_t func = (func_t)code;
      const uint32_t native_rd = func(a0, a1, a2);
      const uint32_t native_xer = native_get_xer();
      const uint32_t native_cr = native_get_cr();
      native_set_xer(save_xer);
      native_set_cr(save_cr);
      if (results_file)
        {
          put32(results_file, native_rd);
          put32(results_file, native_xer);
          put32(results_file, native_cr);
        }
    }
  else if (test_mode == MODE_COMPARING)
    {
      const uint32_t native_rd = get32 (results_file);
      const uint32_t native_xer = get32 (results_file);
      const uint32_t native_cr = get32 (results_file);
      const uint32_t emul_rd = get32(reference_file);
      const uint32_t emul_xer = get32(reference_file);
      const uint32_t emul_cr = get32(reference_file);
    
      ++tests;

      bool ok = (native_rd == emul_rd
                 && native_xer == emul_xer
                 && native_cr == emul_cr);

      if (code[0] == POWERPC_MR(0, RA))
        code++;

      if (!ok) {
        printf("FAIL: %s [%08x]\n", insn, code[0]);
        errors++;
      }
      else if (verbose) {
        printf("PASS: %s [%08x]\n", insn, code[0]);
      }

      if (!ok || verbose) {
#define PRINT_OPERANDS(PREFIX) do {                     \
          printf(" a0:%08x, a1:%08x, a2:%08x => %08x [",	\
                 a0, a1, a2, PREFIX##_rd);		\
          print_flags(PREFIX##_cr, PREFIX##_xer);       \
          printf("]\n");                                \
        } while (0)
        PRINT_OPERANDS(native);
        PRINT_OPERANDS(emul);
#undef  PRINT_OPERANDS
      }
    }
}

const uint32_t powerpc_test_cpu::reg_values[] = {
	0x00000000, 0x10000000, 0x20000000,
	0x30000000, 0x40000000, 0x50000000,
	0x60000000, 0x70000000, 0x80000000,
	0x90000000, 0xa0000000, 0xb0000000,
	0xc0000000, 0xd0000000, 0xe0000000,
	0xf0000000, 0xfffffffd, 0xfffffffe,
	0xffffffff, 0x00000001, 0x00000002,
	0x00000003, 0x11111111, 0x22222222,
	0x33333333, 0x44444444, 0x55555555,
	0x66666666, 0x77777777, 0x88888888,
	0x99999999, 0xaaaaaaaa, 0xbbbbbbbb,
	0xcccccccc, 0xdddddddd, 0xeeeeeeee
};

const uint32_t powerpc_test_cpu::imm_values[] = {
	0x0000, 0x1000, 0x2000,
	0x3000, 0x4000, 0x5000,
	0x6000, 0x7000, 0x8000,
	0x9000, 0xa000, 0xb000,
	0xc000, 0xd000, 0xe000,
	0xf000, 0xfffd, 0xfffe,
	0xffff, 0x0001, 0x0002,
	0x0003, 0x1111, 0x2222,
	0x3333, 0x4444, 0x5555,
	0x6666, 0x7777, 0x8888,
	0x9999, 0xaaaa, 0xbbbb,
	0xcccc, 0xdddd, 0xeeee
};

const uint32_t powerpc_test_cpu::msk_values[] = {
	0, 1,
//	15, 16, 17,
	30, 31
};

void powerpc_test_cpu::test_instruction_CNTLZ(const char *insn, uint32_t opcode)
{
	// Test code
	static uint32_t code[] = {
		POWERPC_ILLEGAL, POWERPC_BLR,
		POWERPC_MR(0, RA), POWERPC_ILLEGAL, POWERPC_BLR
	};

	// Input values
	const int n_values = sizeof(reg_values)/sizeof(reg_values[0]);

	code[0] = code[3] = opcode;			// <op> RD,RA,RB
	rA_field::insert(code[3], 0);		// <op> RD,R0,RB
	flush_icache_range(code, sizeof(code));

	for (uint32_t mask = 0x80000000; mask != 0; mask >>= 1) {
		uint32_t ra = mask;
		test_one(&code[0], insn, ra, 0, 0);
		test_one(&code[2], insn, ra, 0, 0);
	}
	// random values (including zero)
	for (int i = 0; i < n_values; i++) {
		uint32_t ra = reg_values[i];
		test_one(&code[0], insn, ra, 0, 0);
		test_one(&code[2], insn, ra, 0, 0);
	}
}

void powerpc_test_cpu::test_instruction_RR___(const char *insn, uint32_t opcode)
{
	// Test code
	static uint32_t code[] = {
		POWERPC_ILLEGAL, POWERPC_BLR,
		POWERPC_MR(0, RA), POWERPC_ILLEGAL, POWERPC_BLR
	};

	// Input values
	const int n_values = sizeof(reg_values)/sizeof(reg_values[0]);

	code[0] = code[3] = opcode;			// <op> RD,RA,RB
	rA_field::insert(code[3], 0);		// <op> RD,R0,RB
	flush_icache_range(code, sizeof(code));

	for (int i = 0; i < n_values; i++) {
		uint32_t ra = reg_values[i];
		test_one(&code[0], insn, ra, 0, 0);
		test_one(&code[2], insn, ra, 0, 0);
	}
}

void powerpc_test_cpu::test_instruction_RRI__(const char *insn, uint32_t opcode)
{
	// Test code
	static uint32_t code[] = {
		POWERPC_ILLEGAL, POWERPC_BLR,
		POWERPC_MR(0, RA), POWERPC_ILLEGAL, POWERPC_BLR
	};

	// Input values
	const int n_reg_values = sizeof(reg_values)/sizeof(reg_values[0]);
	const int n_imm_values = sizeof(imm_values)/sizeof(imm_values[0]);

	for (int j = 0; j < n_imm_values; j++) {
		const uint32_t im = imm_values[j];
		uint32_t op = opcode;
		UIMM_field::insert(op, im);
		code[0] = code[3] = op;				// <op> RD,RA,IM
		rA_field::insert(code[3], 0);		// <op> RD,R0,IM
		flush_icache_range(code, sizeof(code));
		for (int i = 0; i < n_reg_values; i++) {
			const uint32_t ra = reg_values[i];
			test_one(&code[0], insn, ra, im, 0);
			test_one(&code[2], insn, ra, im, 0);
		}
	}
}

void powerpc_test_cpu::test_instruction_RRS__(const char *insn, uint32_t opcode)
{
	// Test code
	static uint32_t code[] = {
		POWERPC_ILLEGAL, POWERPC_BLR,
		POWERPC_MR(0, RA), POWERPC_ILLEGAL, POWERPC_BLR
	};

	// Input values
	const int n_values = sizeof(reg_values)/sizeof(reg_values[0]);

	for (int j = 0; j < 32; j++) {
		const uint32_t sh = j;
		SH_field::insert(opcode, sh);
		code[0] = code[3] = opcode;
		rA_field::insert(code[3], 0);
		flush_icache_range(code, sizeof(code));
		for (int i = 0; i < n_values; i++) {
			const uint32_t ra = reg_values[i];
			test_one(&code[0], insn, ra, sh, 0);
		}
	}
}

void powerpc_test_cpu::test_instruction_RRR__(const char *insn, uint32_t opcode)
{
	// Test code
	static uint32_t code[] = {
		POWERPC_ILLEGAL, POWERPC_BLR,
		POWERPC_MR(0, RA), POWERPC_ILLEGAL, POWERPC_BLR
	};

	// Input values
	const int n_values = sizeof(reg_values)/sizeof(reg_values[0]);

	code[0] = code[3] = opcode;			// <op> RD,RA,RB
	rA_field::insert(code[3], 0);		// <op> RD,R0,RB
	flush_icache_range(code, sizeof(code));

	for (int i = 0; i < n_values; i++) {
		const uint32_t ra = reg_values[i];
		for (int j = 0; j < n_values; j++) {
			const uint32_t rb = reg_values[j];
			test_one(&code[0], insn, ra, rb, 0);
			test_one(&code[2], insn, ra, rb, 0);
		}
	}
}

void powerpc_test_cpu::test_instruction_RRRSH(const char *insn, uint32_t opcode)
{
	// Test code
	static uint32_t code[] = {
		POWERPC_ILLEGAL, POWERPC_BLR,
		POWERPC_MR(0, RA), POWERPC_ILLEGAL, POWERPC_BLR
	};

	// Input values
	const int n_values = sizeof(reg_values)/sizeof(reg_values[0]);

	code[0] = code[3] = opcode;			// <op> RD,RA,RB
	rA_field::insert(code[3], 0);		// <op> RD,R0,RB
	flush_icache_range(code, sizeof(code));

	for (int i = 0; i < n_values; i++) {
		const uint32_t ra = reg_values[i];
		for (int j = 0; j <= 64; j++) {
			const uint32_t rb = j;
			test_one(&code[0], insn, ra, rb, 0);
			test_one(&code[2], insn, ra, rb, 0);
		}
	}
}

void powerpc_test_cpu::test_instruction_RRIII(const char *insn, uint32_t opcode)
{
	// Test code
	static uint32_t code[] = {
		POWERPC_ILLEGAL, POWERPC_BLR,
		POWERPC_MR(0, RA), POWERPC_ILLEGAL, POWERPC_BLR
	};

	// Input values
	const int n_reg_values = sizeof(reg_values)/sizeof(reg_values[0]);
	const int n_msk_values = sizeof(msk_values)/sizeof(msk_values[0]);

	for (int sh = 0; sh < 32; sh++) {
		for (int i_mb = 0; i_mb < n_msk_values; i_mb++) {
			const uint32_t mb = msk_values[i_mb];
			for (int i_me = 0; i_me < n_msk_values; i_me++) {
				const uint32_t me = msk_values[i_me];
				SH_field::insert(opcode, sh);
				MB_field::insert(opcode, mb);
				ME_field::insert(opcode, me);
				code[0] = opcode;
				code[3] = opcode;
				rA_field::insert(code[3], 0);
				flush_icache_range(code, sizeof(code));
				for (int i = 0; i < n_reg_values; i++) {
					const uint32_t ra = reg_values[i];
					test_one(&code[0], insn, ra, sh, 0, 0);
					test_one(&code[2], insn, ra, sh, 0, 0);
				}
			}
		}
	}
}

void powerpc_test_cpu::test_instruction_RRRII(const char *insn, uint32_t opcode)
{
	// Test code
	static uint32_t code[] = {
		POWERPC_ILLEGAL, POWERPC_BLR,
		POWERPC_MR(0, RA), POWERPC_ILLEGAL, POWERPC_BLR
	};

	// Input values
	const int n_reg_values = sizeof(reg_values)/sizeof(reg_values[0]);
	const int n_msk_values = sizeof(msk_values)/sizeof(msk_values[0]);

	for (int i_mb = 0; i_mb < n_msk_values; i_mb++) {
		const uint32_t mb = msk_values[i_mb];
		for (int i_me = 0; i_me < n_msk_values; i_me++) {
			const uint32_t me = msk_values[i_me];
			MB_field::insert(opcode, mb);
			ME_field::insert(opcode, me);
			code[0] = opcode;
			code[3] = opcode;
			rA_field::insert(code[3], 0);
			flush_icache_range(code, sizeof(code));
			for (int i = 0; i < n_reg_values; i++) {
				const uint32_t ra = reg_values[i];
				// Rotate counts -1..33 wrap around both ends of the 5-bit range used by rlwnm
				for (int j = -1; j <= 33; j++) {
					const uint32_t rb = j;
					test_one(&code[0], insn, ra, rb, 0, 0);
					test_one(&code[2], insn, ra, rb, 0, 0);
				}
			}
		}
	}
}

void powerpc_test_cpu::test_instruction_CRR__(const char *insn, uint32_t opcode)
{
	// Test code
	static uint32_t code[] = {
		POWERPC_ILLEGAL, POWERPC_BLR,
		POWERPC_MR(0, RA), POWERPC_ILLEGAL, POWERPC_BLR
	};

	// Input values
	const int n_values = sizeof(reg_values)/sizeof(reg_values[0]);

	for (int k = 0; k < 8; k++) {
		crfD_field::insert(opcode, k);
		code[0] = code[3] = opcode;			// <op> crfD,RA,RB
		rA_field::insert(code[3], 0);		// <op> crfD,R0,RB
		flush_icache_range(code, sizeof(code));
		for (int i = 0; i < n_values; i++) {
			const uint32_t ra = reg_values[i];
			for (int j = 0; j < n_values; j++) {
				const uint32_t rb = reg_values[j];
				test_one(&code[0], insn, ra, rb, 0);
				test_one(&code[2], insn, ra, rb, 0);
			}
		}
	}
}

void powerpc_test_cpu::test_instruction_CRI__(const char *insn, uint32_t opcode)
{
	// Test code
	static uint32_t code[] = {
		POWERPC_ILLEGAL, POWERPC_BLR,
		POWERPC_MR(0, RA), POWERPC_ILLEGAL, POWERPC_BLR
	};

	// Input values
	const int n_reg_values = sizeof(reg_values)/sizeof(reg_values[0]);
	const int n_imm_values = sizeof(imm_values)/sizeof(imm_values[0]);

	for (int k = 0; k < 8; k++) {
		crfD_field::insert(opcode, k);
		for (int j = 0; j < n_imm_values; j++) {
			const uint32_t im = imm_values[j];
			UIMM_field::insert(opcode, im);
			code[0] = code[3] = opcode;			// <op> crfD,RA,SIMM
			rA_field::insert(code[3], 0);		// <op> crfD,R0,SIMM
			flush_icache_range(code, sizeof(code));
			for (int i = 0; i < n_reg_values; i++) {
				const uint32_t ra = reg_values[i];
				test_one(&code[0], insn, ra, im, 0);
				test_one(&code[2], insn, ra, im, 0);
			}
		}
	}
}

void powerpc_test_cpu::test_instruction_CCC__(const char *insn, uint32_t opcode)
{
	// Test code
	static uint32_t code[] = {
		POWERPC_ILLEGAL, POWERPC_MFCR(RD), POWERPC_BLR,
	};

	const uint32_t saved_cr = init_cr;
	crbD_field::insert(opcode, 0);

	// Loop over crbA=[4-7] (crf1), crbB=[28-31] (crf7)
	for (int crbA = 4; crbA <= 7; crbA++) {
		crbA_field::insert(opcode, crbA);
		for (int crbB = 28; crbB <= 31; crbB++) {
			crbB_field::insert(opcode, crbB);
			code[0] = opcode;
			flush_icache_range(code, sizeof(code));
			// Generate CR values for (crf1, crf7)
			uint32_t cr = 0;
			for (int i = 0; i < 16; i++) {
				CR_field<1>::insert(cr, i);
				for (int j = 0; j < 16; j++) {
					CR_field<7>::insert(cr, j);
					init_cr = cr;
					test_one(&code[0], insn, init_cr, 0, 0);
				}
			}
		}
	}
	init_cr = saved_cr;
}

void powerpc_test_cpu::test_add(void)
{
#if TEST_ADD
	gen_xer_values(0, 0);
	TEST_INSTRUCTION(RRI__,"addi",		_D (14,RD,RA,00));
	TEST_INSTRUCTION(RRI__,"addis",		_D (15,RD,RA,00));
	gen_xer_values(0, SO);
	TEST_INSTRUCTION(RRR__,"add",		_XO(31,RD,RA,RB,0,266,0));
	TEST_INSTRUCTION(RRR__,"add.",		_XO(31,RD,RA,RB,0,266,1));
	gen_xer_values(0, SO|OV);
	TEST_INSTRUCTION(RRR__,"addo",		_XO(31,RD,RA,RB,1,266,0));
	TEST_INSTRUCTION(RRR__,"addo." ,	_XO(31,RD,RA,RB,1,266,1));
	gen_xer_values(0, SO|CA);
	TEST_INSTRUCTION(RRR__,"addc",		_XO(31,RD,RA,RB,0, 10,0));
	TEST_INSTRUCTION(RRR__,"addc.",		_XO(31,RD,RA,RB,0, 10,1));
	TEST_INSTRUCTION(RRI__,"addic",		_D (12,RD,RA,00));
	TEST_INSTRUCTION(RRI__,"addic.",	_D (13,RD,RA,00));
	gen_xer_values(0, SO|CA|OV);
	TEST_INSTRUCTION(RRR__,"addco",		_XO(31,RD,RA,RB,1, 10,0));
	TEST_INSTRUCTION(RRR__,"addco.",	_XO(31,RD,RA,RB,1, 10,1));
	gen_xer_values(CA, SO|CA);
	TEST_INSTRUCTION(RRR__,"adde",		_XO(31,RD,RA,RB,0,138,0));
	TEST_INSTRUCTION(RRR__,"adde.",		_XO(31,RD,RA,RB,0,138,1));
	TEST_INSTRUCTION(RR___,"addme",		_XO(31,RD,RA,00,0,234,0));
	TEST_INSTRUCTION(RR___,"addme.",	_XO(31,RD,RA,00,0,234,1));
	TEST_INSTRUCTION(RR___,"addze",		_XO(31,RD,RA,00,0,202,0));
	TEST_INSTRUCTION(RR___,"addze.",	_XO(31,RD,RA,00,0,202,1));
	gen_xer_values(CA, SO|CA|OV);
	TEST_INSTRUCTION(RRR__,"addeo",		_XO(31,RD,RA,RB,1,138,0));
	TEST_INSTRUCTION(RRR__,"addeo.",	_XO(31,RD,RA,RB,1,138,1));
	TEST_INSTRUCTION(RR___,"addmeo",	_XO(31,RD,RA,00,1,234,0));
	TEST_INSTRUCTION(RR___,"addmeo.",	_XO(31,RD,RA,00,1,234,1));
	TEST_INSTRUCTION(RR___,"addzeo",	_XO(31,RD,RA,00,1,202,0));
	TEST_INSTRUCTION(RR___,"addzeo.",	_XO(31,RD,RA,00,1,202,1));
#endif
}

void powerpc_test_cpu::test_sub(void)
{
#if TEST_SUB
	gen_xer_values(0, SO);
	TEST_INSTRUCTION(RRR__,"subf",		_XO(31,RD,RA,RB,0, 40,0));
	TEST_INSTRUCTION(RRR__,"subf.",		_XO(31,RD,RA,RB,0, 40,1));
	gen_xer_values(0, SO|OV);
	TEST_INSTRUCTION(RRR__,"subfo",		_XO(31,RD,RA,RB,1, 40,0));
	TEST_INSTRUCTION(RRR__,"subfo.",	_XO(31,RD,RA,RB,1, 40,1));
	gen_xer_values(0, SO|CA);
	TEST_INSTRUCTION(RRR__,"subfc",		_XO(31,RD,RA,RB,0,  8,0));
	TEST_INSTRUCTION(RRR__,"subfc.",	_XO(31,RD,RA,RB,0,  8,1));
	gen_xer_values(0, SO|CA|OV);
	TEST_INSTRUCTION(RRR__,"subfco",	_XO(31,RD,RA,RB,1,  8,0));
	TEST_INSTRUCTION(RRR__,"subfco.",	_XO(31,RD,RA,RB,1,  8,1));
	gen_xer_values(0, CA);
	TEST_INSTRUCTION(RRI__,"subfic",	_D ( 8,RD,RA,00));
	gen_xer_values(CA, SO|CA);
	TEST_INSTRUCTION(RRR__,"subfe",		_XO(31,RD,RA,RB,0,136,0));
	TEST_INSTRUCTION(RRR__,"subfe.",	_XO(31,RD,RA,RB,0,136,1));
	TEST_INSTRUCTION(RR___,"subfme",	_XO(31,RD,RA,00,0,232,0));
	TEST_INSTRUCTION(RR___,"subfme.",	_XO(31,RD,RA,00,0,232,1));
	TEST_INSTRUCTION(RR___,"subfze",	_XO(31,RD,RA,00,0,200,0));
	TEST_INSTRUCTION(RR___,"subfze.",	_XO(31,RD,RA,00,0,200,1));
	gen_xer_values(CA, SO|CA|OV);
	TEST_INSTRUCTION(RRR__,"subfeo",	_XO(31,RD,RA,RB,1,136,0));
	TEST_INSTRUCTION(RRR__,"subfeo.",	_XO(31,RD,RA,RB,1,136,1));
	TEST_INSTRUCTION(RR___,"subfmeo",	_XO(31,RD,RA,00,1,232,0));
	TEST_INSTRUCTION(RR___,"subfmeo.",	_XO(31,RD,RA,00,1,232,1));
	TEST_INSTRUCTION(RR___,"subfzeo",	_XO(31,RD,RA,00,1,200,0));
	TEST_INSTRUCTION(RR___,"subfzeo.",	_XO(31,RD,RA,00,1,200,1));
#endif
}

void powerpc_test_cpu::test_mul(void)
{
#if TEST_MUL
	gen_xer_values(0, SO);
	TEST_INSTRUCTION(RRR__,"mulhw",		_XO(31,RD,RA,RB,0, 75,0));
	TEST_INSTRUCTION(RRR__,"mulhw.",	_XO(31,RD,RA,RB,0, 75,1));
	TEST_INSTRUCTION(RRR__,"mulhwu",	_XO(31,RD,RA,RB,0, 11,0));
	TEST_INSTRUCTION(RRR__,"mulhwu.",	_XO(31,RD,RA,RB,0, 11,1));
	TEST_INSTRUCTION(RRI__,"mulli",		_D ( 7,RD,RA,00));
	TEST_INSTRUCTION(RRR__,"mullw",		_XO(31,RD,RA,RB,0,235,0));
	TEST_INSTRUCTION(RRR__,"mullw.",	_XO(31,RD,RA,RB,0,235,1));
	gen_xer_values(0, SO|OV);
	TEST_INSTRUCTION(RRR__,"mullwo",	_XO(31,RD,RA,RB,1,235,0));
	TEST_INSTRUCTION(RRR__,"mullwo.",	_XO(31,RD,RA,RB,1,235,1));
#endif
}

void powerpc_test_cpu::test_div(void)
{
#if TEST_DIV
	gen_xer_values(0, SO);
	TEST_INSTRUCTION(RRR__,"divw",		_XO(31,RD,RA,RB,0,491,0));
	TEST_INSTRUCTION(RRR__,"divw.",		_XO(31,RD,RA,RB,0,491,1));
	TEST_INSTRUCTION(RRR__,"divwu",		_XO(31,RD,RA,RB,0,459,0));
	TEST_INSTRUCTION(RRR__,"divwu.",	_XO(31,RD,RA,RB,0,459,1));
	gen_xer_values(0, SO|OV);
	TEST_INSTRUCTION(RRR__,"divwo",		_XO(31,RD,RA,RB,1,491,0));
	TEST_INSTRUCTION(RRR__,"divwo.",	_XO(31,RD,RA,RB,1,491,1));
	TEST_INSTRUCTION(RRR__,"divwuo",	_XO(31,RD,RA,RB,1,459,0));
	TEST_INSTRUCTION(RRR__,"divwuo.",	_XO(31,RD,RA,RB,1,459,1));
#endif
}

void powerpc_test_cpu::test_logical(void)
{
#if TEST_LOGICAL
	gen_xer_values(0, SO);
	TEST_INSTRUCTION(RRR__,"and",		_X (31,RA,RD,RB,28,0));
	TEST_INSTRUCTION(RRR__,"and.",		_X (31,RA,RD,RB,28,1));
	TEST_INSTRUCTION(RRR__,"andc",		_X (31,RA,RD,RB,60,0));
	TEST_INSTRUCTION(RRR__,"andc.",		_X (31,RA,RD,RB,60,1));
	TEST_INSTRUCTION(RRK__,"andi.",		_D (28,RA,RD,00));
	TEST_INSTRUCTION(RRK__,"andis.",	_D (29,RA,RD,00));
	TEST_INSTRUCTION(CNTLZ,"cntlzw",	_X (31,RA,RD,00,26,0));
	TEST_INSTRUCTION(CNTLZ,"cntlzw.",	_X (31,RA,RD,00,26,1));
	TEST_INSTRUCTION(RRR__,"eqv",		_X (31,RA,RD,RB,284,0));
	TEST_INSTRUCTION(RRR__,"eqv.",		_X (31,RA,RD,RB,284,1));
	TEST_INSTRUCTION(RR___,"extsb",		_X (31,RA,RD,00,954,0));
	TEST_INSTRUCTION(RR___,"extsb.",	_X (31,RA,RD,00,954,1));
	TEST_INSTRUCTION(RR___,"extsh",		_X (31,RA,RD,00,922,0));
	TEST_INSTRUCTION(RR___,"extsh.",	_X (31,RA,RD,00,922,1));
	TEST_INSTRUCTION(RRR__,"nand",		_X (31,RA,RD,RB,476,0));
	TEST_INSTRUCTION(RRR__,"nand.",		_X (31,RA,RD,RB,476,1));
	TEST_INSTRUCTION(RR___,"neg",		_XO(31,RD,RA,00,0,104,0));
	TEST_INSTRUCTION(RR___,"neg.",		_XO(31,RD,RA,00,0,104,1));
	TEST_INSTRUCTION(RRR__,"nor",		_X (31,RA,RD,RB,124,0));
	TEST_INSTRUCTION(RRR__,"nor.",		_X (31,RA,RD,RB,124,1));
	TEST_INSTRUCTION(RRR__,"or",		_X (31,RA,RD,RB,444,0));
	TEST_INSTRUCTION(RRR__,"or.",		_X (31,RA,RD,RB,444,1));
	TEST_INSTRUCTION(RRR__,"orc",		_X (31,RA,RD,RB,412,0));
	TEST_INSTRUCTION(RRR__,"orc.",		_X (31,RA,RD,RB,412,1));
	TEST_INSTRUCTION(RRK__,"ori",		_D (24,RA,RD,00));
	TEST_INSTRUCTION(RRK__,"oris",		_D (25,RA,RD,00));
	TEST_INSTRUCTION(RRR__,"xor",		_X (31,RA,RD,RB,316,0));
	TEST_INSTRUCTION(RRR__,"xor.",		_X (31,RA,RD,RB,316,1));
	TEST_INSTRUCTION(RRK__,"xori",		_D (26,RA,RD,00));
	TEST_INSTRUCTION(RRK__,"xoris",		_D (27,RA,RD,00));
	gen_xer_values(0, SO|OV);
	TEST_INSTRUCTION(RR___,"nego",		_XO(31,RD,RA,00,1,104,0));
	TEST_INSTRUCTION(RR___,"nego.",		_XO(31,RD,RA,00,1,104,1));
#endif
}

void powerpc_test_cpu::test_shift(void)
{
#if TEST_SHIFT
	gen_xer_values(0, SO);
	TEST_INSTRUCTION(RRRSH,"slw",		_X (31,RA,RD,RB, 24,0));
	TEST_INSTRUCTION(RRRSH,"slw.",		_X (31,RA,RD,RB, 24,1));
	TEST_INSTRUCTION(RRRSH,"sraw",		_X (31,RA,RD,RB,792,0));
	TEST_INSTRUCTION(RRRSH,"sraw.",		_X (31,RA,RD,RB,792,1));
	TEST_INSTRUCTION(RRS__,"srawi",		_X (31,RA,RD,00,824,0));
	TEST_INSTRUCTION(RRS__,"srawi.",	_X (31,RA,RD,00,824,1));
	TEST_INSTRUCTION(RRRSH,"srw",		_X (31,RA,RD,RB,536,0));
	TEST_INSTRUCTION(RRRSH,"srw.",		_X (31,RA,RD,RB,536,1));
#endif
}

void powerpc_test_cpu::test_rotate(void)
{
#if TEST_ROTATE
	gen_xer_values(0, SO);
	TEST_INSTRUCTION(RRIII,"rlwimi",	_M (20,RA,RD,00,00,00,0));
	TEST_INSTRUCTION(RRIII,"rlwimi.",	_M (20,RA,RD,00,00,00,1));
	TEST_INSTRUCTION(RRIII,"rlwinm",	_M (21,RA,RD,00,00,00,0));
	TEST_INSTRUCTION(RRIII,"rlwinm.",	_M (21,RA,RD,00,00,00,1));
	TEST_INSTRUCTION(RRRII,"rlwnm",		_M (23,RA,RD,RB,00,00,0));
	TEST_INSTRUCTION(RRRII,"rlwnm.",	_M (23,RA,RD,RB,00,00,1));
#endif
}

void powerpc_test_cpu::test_compare(void)
{
#if TEST_COMPARE
	gen_xer_values(0, SO);
	TEST_INSTRUCTION(CRR__,"cmp",		_X (31,00,RA,RB,000,0));
	TEST_INSTRUCTION(CRI__,"cmpi",		_D (11,00,RA,00));
	TEST_INSTRUCTION(CRR__,"cmpl",		_X (31,00,RA,RB, 32,0));
	TEST_INSTRUCTION(CRK__,"cmpli",		_D (10,00,RA,00));
#endif
}

void powerpc_test_cpu::test_cr_logical(void)
{
#if TEST_CR_LOGICAL
	gen_xer_values(0, SO);
	TEST_INSTRUCTION(CCC__,"crand",		_X (19,00,00,00,257,0));
	TEST_INSTRUCTION(CCC__,"crandc",	_X (19,00,00,00,129,0));
	TEST_INSTRUCTION(CCC__,"creqv",		_X (19,00,00,00,289,0));
	TEST_INSTRUCTION(CCC__,"crnand",	_X (19,00,00,00,225,0));
	TEST_INSTRUCTION(CCC__,"crnor",		_X (19,00,00,00, 33,0));
	TEST_INSTRUCTION(CCC__,"cror",		_X (19,00,00,00,449,0));
	TEST_INSTRUCTION(CCC__,"crorc",		_X (19,00,00,00,417,0));
	TEST_INSTRUCTION(CCC__,"crxor",		_X (19,00,00,00,193,0));
#endif
}

// Template-generated vector values
const powerpc_test_cpu::vector_value_t powerpc_test_cpu::vector_values[] = {
	{'w',{0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00}},
	{'w',{0x01,0x01,0x01,0x01,0x01,0x01,0x01,0x01,0x01,0x01,0x01,0x01,0x01,0x01,0x01,0x01}},
	{'w',{0x02,0x02,0x02,0x02,0x02,0x02,0x02,0x02,0x02,0x02,0x02,0x02,0x02,0x02,0x02,0x02}},
	{'w',{0x03,0x03,0x03,0x03,0x03,0x03,0x03,0x03,0x03,0x03,0x03,0x03,0x03,0x03,0x03,0x03}},
	{'w',{0x04,0x04,0x04,0x04,0x04,0x04,0x04,0x04,0x04,0x04,0x04,0x04,0x04,0x04,0x04,0x04}},
	{'w',{0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05,0x05}},
	{'w',{0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06,0x06}},
	{'w',{0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07,0x07}},
	{'w',{0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08,0x08}},
	{'w',{0x10,0x10,0x10,0x10,0x10,0x10,0x10,0x10,0x10,0x10,0x10,0x10,0x10,0x10,0x10,0x10}},
	{'w',{0x18,0x18,0x18,0x18,0x18,0x18,0x18,0x18,0x18,0x18,0x18,0x18,0x18,0x18,0x18,0x18}},
	{'w',{0x20,0x20,0x20,0x20,0x20,0x20,0x20,0x20,0x20,0x20,0x20,0x20,0x20,0x20,0x20,0x20}},
	{'w',{0x28,0x28,0x28,0x28,0x28,0x28,0x28,0x28,0x28,0x28,0x28,0x28,0x28,0x28,0x28,0x28}},
	{'w',{0x30,0x30,0x30,0x30,0x30,0x30,0x30,0x30,0x30,0x30,0x30,0x30,0x30,0x30,0x30,0x30}},
	{'w',{0x38,0x38,0x38,0x38,0x38,0x38,0x38,0x38,0x38,0x38,0x38,0x38,0x38,0x38,0x38,0x38}},
	{'w',{0x40,0x40,0x40,0x40,0x40,0x40,0x40,0x40,0x40,0x40,0x40,0x40,0x40,0x40,0x40,0x40}},
	{'w',{0x48,0x48,0x48,0x48,0x48,0x48,0x48,0x48,0x48,0x48,0x48,0x48,0x48,0x48,0x48,0x48}},
	{'w',{0x50,0x50,0x50,0x50,0x50,0x50,0x50,0x50,0x50,0x50,0x50,0x50,0x50,0x50,0x50,0x50}},
	{'w',{0x58,0x58,0x58,0x58,0x58,0x58,0x58,0x58,0x58,0x58,0x58,0x58,0x58,0x58,0x58,0x58}},
	{'w',{0x60,0x60,0x60,0x60,0x60,0x60,0x60,0x60,0x60,0x60,0x60,0x60,0x60,0x60,0x60,0x60}},
	{'w',{0x68,0x68,0x68,0x68,0x68,0x68,0x68,0x68,0x68,0x68,0x68,0x68,0x68,0x68,0x68,0x68}},
	{'w',{0x70,0x70,0x70,0x70,0x70,0x70,0x70,0x70,0x70,0x70,0x70,0x70,0x70,0x70,0x70,0x70}},
	{'w',{0x78,0x78,0x78,0x78,0x78,0x78,0x78,0x78,0x78,0x78,0x78,0x78,0x78,0x78,0x78,0x78}},
	{'w',{0x00,0x00,0x01,0x00,0x00,0x00,0x01,0x00,0x00,0x00,0x01,0x00,0x00,0x00,0x01,0x00}},
	{'w',{0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x04}},
	{'w',{0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x10}},
	{'w',{0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff,0xff}},
	{'w',{0x11,0x11,0x11,0x11,0x22,0x22,0x22,0x22,0x33,0x33,0x33,0x33,0x44,0x44,0x44,0x44}},
	{'w',{0x88,0x88,0x88,0x88,0x77,0x77,0x77,0x77,0x66,0x66,0x66,0x66,0x55,0x55,0x55,0x55}},
	{'w',{0x99,0x99,0x99,0x99,0xaa,0xaa,0xaa,0xaa,0xbb,0xbb,0xbb,0xbb,0xcc,0xcc,0xcc,0xcc}},
	{'w',{0x00,0x00,0x00,0x00,0xff,0xff,0xff,0xff,0xee,0xee,0xee,0xee,0xdd,0xdd,0xdd,0xdd}},
	{'w',{0x00,0x11,0x22,0x33,0x44,0x55,0x66,0x77,0x88,0x99,0xaa,0xbb,0xcc,0xdd,0xee,0xff}},
	{'h',{0x00,0x00,0x11,0x11,0x22,0x22,0x33,0x33,0x44,0x44,0x55,0x55,0x66,0x66,0x77,0x77}},
	{'h',{0x00,0x01,0x00,0x02,0x00,0x03,0x00,0x04,0x00,0x05,0x00,0x06,0x00,0x07,0x00,0x08}},
	{'h',{0x00,0x16,0x00,0x15,0x00,0x14,0x00,0x13,0x00,0x12,0x00,0x10,0x00,0x10,0x00,0x09}},
	{'b',{0x00,0x11,0x22,0x33,0x44,0x55,0x66,0x77,0x88,0x99,0xaa,0xbb,0xcc,0xdd,0xee,0xff}},
	{'b',{0xff,0xee,0xdd,0xcc,0xbb,0xaa,0x99,0x88,0x77,0x66,0x55,0x44,0x33,0x22,0x11,0x00}},
	{'b',{0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09,0x0a,0x0b,0x0c,0x0d,0x0e,0x0f}},
	{'b',{0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17,0x18,0x19,0x1a,0x1b,0x1c,0x1d,0x1e,0x1f}},
	{'b',{0x2f,0x2e,0x2d,0x2c,0x2b,0x2a,0x29,0x28,0x27,0x26,0x25,0x24,0x23,0x22,0x21,0x20}}
};

const powerpc_test_cpu::vector_value_t powerpc_test_cpu::vector_fp_values[] = {
	{'f',{0x80,0x00,0x00,0x00,0x80,0x00,0x00,0x00,0x80,0x00,0x00,0x00,0x80,0x00,0x00,0x00}}, // -0, -0, -0, -0
	{'f',{0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00}}, // 0, 0, 0, 0
	{'f',{0xbf,0x80,0x00,0x00,0xbf,0x80,0x00,0x00,0xbf,0x80,0x00,0x00,0xbf,0x80,0x00,0x00}}, // -1, -1, -1, -1
	{'f',{0x3f,0x80,0x00,0x00,0x3f,0x80,0x00,0x00,0x3f,0x80,0x00,0x00,0x3f,0x80,0x00,0x00}}, // 1, 1, 1, 1
	{'f',{0xc0,0x00,0x00,0x00,0xc0,0x00,0x00,0x00,0xc0,0x00,0x00,0x00,0xc0,0x00,0x00,0x00}}, // -2, -2, -2, -2
	{'f',{0x40,0x00,0x00,0x00,0x40,0x00,0x00,0x00,0x40,0x00,0x00,0x00,0x40,0x00,0x00,0x00}}, // 2, 2, 2, 2
	{'f',{0xc0,0x00,0x00,0x00,0xbf,0x80,0x00,0x00,0x3f,0x80,0x00,0x00,0x40,0x00,0x00,0x00}}, // -2, -1, 1, 2
	{'f',{0xc0,0x40,0x00,0x00,0x80,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x40,0x40,0x00,0x00}}, // -3, -0, 0, 3
	{'f',{0x40,0x00,0x00,0x00,0x3f,0x80,0x00,0x00,0xbf,0x80,0x00,0x00,0xc0,0x00,0x00,0x00}}  // 2, 1, -1, -2
};

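// Run (or check) one vector test case. In MODE_GENERATING the code is executed
// natively and vD, CR and VSCR are written to results_file; otherwise the native
// values are read back from results_file and compared with those in reference_file.
void powerpc_test_cpu::test_one_vector(uint32_t *code, vector_test_t const & vt, uint8_t *rAp, uint8_t *rBp, uint8_t *rCp)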
{
#if TEST_VMX_OPS
  static vector_t native_vD;
  memset(&native_vD, 0, sizeof(native_vD));
  static vector_helper_t native_vSCR;
  memset(&native_vSCR, 0, sizeof(native_vSCR));
  static aligned_vector_t dummy_vector;
  dummy_vector.clear();
  if (!rAp) rAp = (uint8_t *)dummy_vector.addr();
  if (!rBp) rBp = (uint8_t *)dummy_vector.addr();
  if (!rCp) rCp = (uint8_t *)dummy_vector.addr();
  if (test_mode == MODE_GENERATING)
    {
      // Invoke native code
      const uint32_t save_cr = native_get_cr();
      native_set_cr(init_cr);
      native_vSCR.w[3] = 0;
      typedef void (*func_t)(uint8_t *, uint8_t *, uint8_t *, uint8_t *, uint8_t *);
      func_t func = (func_t)code;
      func((uint8_t *)&native_vD, rAp, rBp, rCp, native_vSCR.b);
      const uint32_t native_cr = native_get_cr();
      const uint32_t native_vscr = native_vSCR.w[3];
      native_set_cr(save_cr);
      if (results_file) {
        put_vector(results_file, native_vD);
        put32(results_file, native_cr);
        put32(results_file, native_vscr);
      }
    }
  else
    {
      get_vector(results_file, native_vD);
      const uint32_t native_cr = get32(results_file);
      const uint32_t native_vscr = get32(results_file);

      static vector_t emul_vD;
      get_vector(reference_file, emul_vD);
      const uint32_t emul_cr = get32(reference_file);
      const uint32_t emul_vscr = get32(reference_file);

      ++tests;

      bool ok = (vector_equals(vt.type, native_vD, emul_vD)
                 && native_cr == emul_cr
                 && native_vscr == emul_vscr);

      if (!ok)
        {
          printf("FAIL: %s [%08x]\n", vt.name, vt.opcode);
          errors++;
        }
      else if (verbose)
        {
          printf("PASS: %s [%08x]\n", vt.name, vt.opcode);
        }

      if (!ok || verbose) {
        char op_type = tolower(vt.op_type);
        if (!op_type)
          op_type = vt.type;
#define PRINT_OPERAND(N, vX, rX)                                        \
        switch (vt.operands[N]) {                                       \
        case vX:                                                        \
          printf(#vX " = ");                                            \
          print_vector(*((vector_t *)rX##p));                           \
          printf("\n");                                                 \
          break;                                                        \
        case vI:                                                        \
        case vN:                                                        \
          printf(#vX " = %d\n", vX##_field::extract(vt.opcode));        \
          break;                                                        \
        case rX:                                                        \
          printf(#rX " = %p", (void *)rX##p);                           \
          if (rX##p) switch (op_type) {                                 \
            case 'b': printf(" [%02x]", *rX##p); break;                 \
            case 'h': printf(" [%04x]", *((uint16_t *)rX##p)); break;   \
            case 'w': printf(" [%08x]", *((uint32_t *)rX##p)); break;   \
            }                                                           \
          printf("\n");                                                 \
          break;                                                        \
        }
        PRINT_OPERAND(1, vA, rA);
        PRINT_OPERAND(2, vB, rB);
        PRINT_OPERAND(3, vC, rC);
#undef  PRINT_OPERAND
        printf("vD.N = ");
        print_vector(native_vD, vt.type);
        printf("\n");
        printf("vD.E = ");
        print_vector(emul_vD, vt.type);
        printf("\n");
        printf("CR.N = %08x ; VSCR.N = %08x\n", native_cr, native_vscr);
        printf("CR.E = %08x ; VSCR.E = %08x\n", emul_cr, emul_vscr);
      }
    }
#endif
}

void powerpc_test_cpu::test_vector_load_for_shift(void)
{
#if TEST_VMX_LOADSH
	// Tested instructions
	static const vector_test_t tests[] = {
		{ "lvsl",  'b', 0, _X (31,00,00,00,  6,0), { vD, rA, rB } },
		{ "lvsr",  'b', 0, _X (31,00,00,00, 38,0), { vD, rA, rB } },
	};

	// Code template
	static uint32_t code[] = {
		POWERPC_MFSPR(12, 256),			// mfvrsave r12
		_D(15,0,0,0x1000),				// lis r0,0x1000 ([v3])
		POWERPC_MTSPR(0, 256),			// mtvrsave r0
		POWERPC_LI(RA, 0),				// li RA,<val> (RA serves as the rB operand)
		0,								// <insn>
		POWERPC_STVX(RD, 0, RD),		// stvx v3,r3(0)
		POWERPC_MTSPR(12, 256),			// mtvrsave r12
		POWERPC_BLR						// blr
	};

	int i_opcode = -1;
	const int n_instructions = sizeof(code) / sizeof(code[0]);
	for (int i = 0; i < n_instructions; i++) {
		if (code[i] == 0) {
			i_opcode = i;
			break;
		}
	}
	assert(i_opcode != -1);

	const int n_elements = sizeof(tests) / sizeof(tests[0]);
	for (int i = 0; i < n_elements; i++) {
		vector_test_t const & vt = tests[i];
		code[i_opcode] = vt.opcode;
		vD_field::insert(code[i_opcode], RD);
		rA_field::insert(code[i_opcode], 00);
		rB_field::insert(code[i_opcode], RA);

		printf("Testing %s\n", vt.name);
		for (int j = 0; j < 32; j++) {
			UIMM_field::insert(code[i_opcode - 1], j);
			flush_icache_range(code, sizeof(code));
			test_one_vector(code, vt, (uint8_t *)NULL);
		}
	}
#endif
}

void powerpc_test_cpu::test_vector_load(void)
{
#if TEST_VMX_LOAD
	// Tested instructions
	static const vector_test_t tests[] = {
		{ "lvebx",  'b', 0, _X (31,00,00,00,  7,0), { vD, rA, rB } },
		{ "lvehx",  'h', 0, _X (31,00,00,00, 39,0), { vD, rA, rB } },
		{ "lvewx",  'w', 0, _X (31,00,00,00, 71,0), { vD, rA, rB } }
	};

	// Code template
	static uint32_t code[] = {
		POWERPC_MFSPR(12, 256),			// mfvrsave r12
		_D(15,0,0,0x1000),				// lis r0,0x1000 ([v3])
		POWERPC_MTSPR(0, 256),			// mtvrsave r0
		POWERPC_LVX(RD, 0, RD),			// lvx v3,r3(0)
		0,								// <insn>
		POWERPC_STVX(RD, 0, RD),		// stvx v3,r3(0)
		POWERPC_MTSPR(12, 256),			// mtvrsave r12
		POWERPC_BLR						// blr
	};

	int i_opcode = -1;
	const int n_instructions = sizeof(code) / sizeof(code[0]);
	for (int i = 0; i < n_instructions; i++) {
		if (code[i] == 0) {
			i_opcode = i;
			break;
		}
	}
	assert(i_opcode != -1);

	const int n_elements = sizeof(tests) / sizeof(tests[0]);
	for (int i = 0; i < n_elements; i++) {
		vector_test_t const & vt = tests[i];
		code[i_opcode] = vt.opcode;
		vD_field::insert(code[i_opcode], RD);
		rA_field::insert(code[i_opcode], 00);
		rB_field::insert(code[i_opcode], RA);
		flush_icache_range(code, sizeof(code));

		printf("Testing %s\n", vt.name);
		const int n_vector_values = sizeof(vector_values)/sizeof(vector_values[0]);
		for (int j = 0; j < n_vector_values; j++) {
			static aligned_vector_t av;
			switch (vt.type) {
			case 'b':
				for (int k = 0; k < 16; k++) {
					av.copy(*(vector_t *)((uint8_t *)(&vector_values[j].v) + 1 * k), 16 - 1 * k);
					test_one_vector(code, vt, av.addr());
				}
				break;
			case 'h':
				for (int k = 0; k < 8; k++) {
					av.copy(*(vector_t *)((uint8_t *)(&vector_values[j].v) + 2 * k), 16 - 2 * k);
					test_one_vector(code, vt, av.addr());
				}
				break;
			case 'w':
				for (int k = 0; k < 4; k++) {
					av.copy(*(vector_t *)((uint8_t *)(&vector_values[j].v) + 4 * k), 16 - 4 * k);
					test_one_vector(code, vt, av.addr());
				}
				break;
			}
		}
	}
#endif
}

void powerpc_test_cpu::test_vector_arith(void)
{
#if TEST_VMX_ARITH
	// Tested instructions
	static const vector_test_t tests[] = {
		{ "vaddcuw",	'w',  0 , _VX(04,RD,RA,RB, 384), { vD, vA, vB } },
		{ "vaddfp",		'f',  0 , _VX(04,RD,RA,RB,  10), { vD, vA, vB } },
		{ "vaddsbs",	'b',  0 , _VX(04,RD,RA,RB, 768), { vD, vA, vB } },
		{ "vaddshs",	'h',  0 , _VX(04,RD,RA,RB, 832), { vD, vA, vB } },
		{ "vaddsws",	'w',  0 , _VX(04,RD,RA,RB, 896), { vD, vA, vB } },
		{ "vaddubm",	'b',  0 , _VX(04,RD,RA,RB,   0), { vD, vA, vB } },
		{ "vaddubs",	'b',  0 , _VX(04,RD,RA,RB, 512), { vD, vA, vB } },
		{ "vadduhm",	'h',  0 , _VX(04,RD,RA,RB,  64), { vD, vA, vB } },
		{ "vadduhs",	'h',  0 , _VX(04,RD,RA,RB, 576), { vD, vA, vB } },
		{ "vadduwm",	'w',  0 , _VX(04,RD,RA,RB, 128), { vD, vA, vB } },
		{ "vadduws",	'w',  0 , _VX(04,RD,RA,RB, 640), { vD, vA, vB } },
		{ "vand",		'w',  0 , _VX(04,RD,RA,RB,1028), { vD, vA, vB } },
		{ "vandc",		'w',  0 , _VX(04,RD,RA,RB,1092), { vD, vA, vB } },
		{ "vavgsb",		'b',  0 , _VX(04,RD,RA,RB,1282), { vD, vA, vB } },
		{ "vavgsh",		'h',  0 , _VX(04,RD,RA,RB,1346), { vD, vA, vB } },
		{ "vavgsw",		'w',  0 , _VX(04,RD,RA,RB,1410), { vD, vA, vB } },
		{ "vavgub",		'b',  0 , _VX(04,RD,RA,RB,1026), { vD, vA, vB } },
		{ "vavguh",		'h',  0 , _VX(04,RD,RA,RB,1090), { vD, vA, vB } },
		{ "vavguw",		'w',  0 , _VX(04,RD,RA,RB,1154), { vD, vA, vB } },
		{ "vcfsx",		'f', 'w', _VX(04,RD,00,RB, 842), { vD, vI, vB } },
		{ "vcfux",		'f', 'w', _VX(04,RD,00,RB, 778), { vD, vI, vB } },
		{ "vcmpbfp",	'w', 'f', _VXR(04,RD,RA,RB,966,0), { vD, vA, vB } },
		{ "vcmpbfp.",	'w', 'f', _VXR(04,RD,RA,RB,966,1), { vD, vA, vB } },
		{ "vcmpeqfp",	'w', 'f', _VXR(04,RD,RA,RB,198,0), { vD, vA, vB } },
		{ "vcmpeqfp.",	'w', 'f', _VXR(04,RD,RA,RB,198,1), { vD, vA, vB } },
		{ "vcmpequb",	'b',  0 , _VXR(04,RD,RA,RB,  6,0), { vD, vA, vB } },
		{ "vcmpequb.",	'b',  0 , _VXR(04,RD,RA,RB,  6,1), { vD, vA, vB } },
		{ "vcmpequh",	'h',  0 , _VXR(04,RD,RA,RB, 70,0), { vD, vA, vB } },
		{ "vcmpequh.",	'h',  0 , _VXR(04,RD,RA,RB, 70,1), { vD, vA, vB } },
		{ "vcmpequw",	'w',  0 , _VXR(04,RD,RA,RB,134,0), { vD, vA, vB } },
		{ "vcmpequw.",	'w',  0 , _VXR(04,RD,RA,RB,134,1), { vD, vA, vB } },
		{ "vcmpgefp",	'w', 'f', _VXR(04,RD,RA,RB,454,0), { vD, vA, vB } },
		{ "vcmpgefp.",	'w', 'f', _VXR(04,RD,RA,RB,454,1), { vD, vA, vB } },
		{ "vcmpgtfp",	'w', 'f', _VXR(04,RD,RA,RB,710,0), { vD, vA, vB } },
		{ "vcmpgtfp.",	'w', 'f', _VXR(04,RD,RA,RB,710,1), { vD, vA, vB } },
		{ "vcmpgtsb",	'b',  0 , _VXR(04,RD,RA,RB,774,0), { vD, vA, vB } },
		{ "vcmpgtsb.",	'b',  0 , _VXR(04,RD,RA,RB,774,1), { vD, vA, vB } },
		{ "vcmpgtsh",	'h',  0 , _VXR(04,RD,RA,RB,838,0), { vD, vA, vB } },
		{ "vcmpgtsh.",	'h',  0 , _VXR(04,RD,RA,RB,838,1), { vD, vA, vB } },
		{ "vcmpgtsw",	'w',  0 , _VXR(04,RD,RA,RB,902,0), { vD, vA, vB } },
		{ "vcmpgtsw.",	'w',  0 , _VXR(04,RD,RA,RB,902,1), { vD, vA, vB } },
		{ "vcmpgtub",	'b',  0 , _VXR(04,RD,RA,RB,518,0), { vD, vA, vB } },
		{ "vcmpgtub.",	'b',  0 , _VXR(04,RD,RA,RB,518,1), { vD, vA, vB } },
		{ "vcmpgtuh",	'h',  0 , _VXR(04,RD,RA,RB,582,0), { vD, vA, vB } },
		{ "vcmpgtuh.",	'h',  0 , _VXR(04,RD,RA,RB,582,1), { vD, vA, vB } },
		{ "vcmpgtuw",	'w',  0 , _VXR(04,RD,RA,RB,646,0), { vD, vA, vB } },
		{ "vcmpgtuw.",	'w',  0 , _VXR(04,RD,RA,RB,646,1), { vD, vA, vB } },
		{ "vctsxs",		'w', 'f', _VX(04,RD,00,RB, 970), { vD, vI, vB } },
		{ "vctuxs",		'w', 'f', _VX(04,RD,00,RB, 906), { vD, vI, vB } },
		{ "vexptefp",	'f',  0 , _VX(04,RD,00,RB, 394), { vD, __, vB } },
		{ "vlogefp",	'l', 'f', _VX(04,RD,00,RB, 458), { vD, __, vB } },
		{ "vmaddfp",	'f',  0 , _VA(04,RD,RA,RB,RC,46),{ vD, vA, vB, vC } },
		{ "vmaxfp",		'f',  0 , _VX(04,RD,RA,RB,1034), { vD, vA, vB } },
		{ "vmaxsb",		'b',  0 , _VX(04,RD,RA,RB, 258), { vD, vA, vB } },
		{ "vmaxsh",		'h',  0 , _VX(04,RD,RA,RB, 322), { vD, vA, vB } },
		{ "vmaxsw",		'w',  0 , _VX(04,RD,RA,RB, 386), { vD, vA, vB } },
		{ "vmaxub",		'b',  0 , _VX(04,RD,RA,RB,   2), { vD, vA, vB } },
		{ "vmaxuh",		'h',  0 , _VX(04,RD,RA,RB,  66), { vD, vA, vB } },
		{ "vmaxuw",		'w',  0 , _VX(04,RD,RA,RB, 130), { vD, vA, vB } },
		{ "vmhaddshs",	'h',  0 , _VA(04,RD,RA,RB,RC,32),{ vD, vA, vB, vC } },
		{ "vmhraddshs",	'h',  0 , _VA(04,RD,RA,RB,RC,33),{ vD, vA, vB, vC } },
		{ "vminfp",		'f',  0 , _VX(04,RD,RA,RB,1098), { vD, vA, vB } },
		{ "vminsb",		'b',  0 , _VX(04,RD,RA,RB, 770), { vD, vA, vB } },
		{ "vminsh",		'h',  0 , _VX(04,RD,RA,RB, 834), { vD, vA, vB } },
		{ "vminsw",		'w',  0 , _VX(04,RD,RA,RB, 898), { vD, vA, vB } },
		{ "vminub",		'b',  0 , _VX(04,RD,RA,RB, 514), { vD, vA, vB } },
		{ "vminuh",		'h',  0 , _VX(04,RD,RA,RB, 578), { vD, vA, vB } },
		{ "vminuw",		'w',  0 , _VX(04,RD,RA,RB, 642), { vD, vA, vB } },
		{ "vmladduhm",	'h',  0 , _VA(04,RD,RA,RB,RC,34),{ vD, vA, vB, vC } },
		{ "vmrghb",		'b',  0 , _VX(04,RD,RA,RB,  12), { vD, vA, vB } },
		{ "vmrghh",		'h',  0 , _VX(04,RD,RA,RB,  76), { vD, vA, vB } },
		{ "vmrghw",		'w',  0 , _VX(04,RD,RA,RB, 140), { vD, vA, vB } },
		{ "vmrglb",		'b',  0 , _VX(04,RD,RA,RB, 268), { vD, vA, vB } },
		{ "vmrglh",		'h',  0 , _VX(04,RD,RA,RB, 332), { vD, vA, vB } },
		{ "vmrglw",		'w',  0 , _VX(04,RD,RA,RB, 396), { vD, vA, vB } },
		{ "vmsummbm",	'b',  0 , _VA(04,RD,RA,RB,RC,37),{ vD, vA, vB, vC } },
		{ "vmsumshm",	'h',  0 , _VA(04,RD,RA,RB,RC,40),{ vD, vA, vB, vC } },
		{ "vmsumshs",	'h',  0 , _VA(04,RD,RA,RB,RC,41),{ vD, vA, vB, vC } },
		{ "vmsumubm",	'b',  0 , _VA(04,RD,RA,RB,RC,36),{ vD, vA, vB, vC } },
		{ "vmsumuhm",	'h',  0 , _VA(04,RD,RA,RB,RC,38),{ vD, vA, vB, vC } },
		{ "vmsumuhs",	'h',  0 , _VA(04,RD,RA,RB,RC,39),{ vD, vA, vB, vC } },
		{ "vmulesb",	'b',  0 , _VX(04,RD,RA,RB, 776), { vD, vA, vB } },
		{ "vmulesh",	'h',  0 , _VX(04,RD,RA,RB, 840), { vD, vA, vB } },
		{ "vmuleub",	'b',  0 , _VX(04,RD,RA,RB, 520), { vD, vA, vB } },
		{ "vmuleuh",	'h',  0 , _VX(04,RD,RA,RB, 584), { vD, vA, vB } },
		{ "vmulosb",	'b',  0 , _VX(04,RD,RA,RB, 264), { vD, vA, vB } },
		{ "vmulosh",	'h',  0 , _VX(04,RD,RA,RB, 328), { vD, vA, vB } },
		{ "vmuloub",	'b',  0 , _VX(04,RD,RA,RB,   8), { vD, vA, vB } },
		{ "vmulouh",	'h',  0 , _VX(04,RD,RA,RB,  72), { vD, vA, vB } },
		{ "vnmsubfp",	'f',  0 , _VA(04,RD,RA,RB,RC,47),{ vD, vA, vB, vC } },
		{ "vnor",		'w',  0 , _VX(04,RD,RA,RB,1284), { vD, vA, vB } },
		{ "vor",		'w',  0 , _VX(04,RD,RA,RB,1156), { vD, vA, vB } },
		{ "vperm",		'b',  0 , _VA(04,RD,RA,RB,RC,43),{ vD, vA, vB, vC } },
		{ "vpkpx",		'h',  0 , _VX(04,RD,RA,RB, 782), { vD, vA, vB } },
		{ "vpkshss",	'b',  0 , _VX(04,RD,RA,RB, 398), { vD, vA, vB } },
		{ "vpkshus",	'b',  0 , _VX(04,RD,RA,RB, 270), { vD, vA, vB } },
		{ "vpkswss",	'h',  0 , _VX(04,RD,RA,RB, 462), { vD, vA, vB } },
		{ "vpkswus",	'h',  0 , _VX(04,RD,RA,RB, 334), { vD, vA, vB } },
		{ "vpkuhum",	'b',  0 , _VX(04,RD,RA,RB,  14), { vD, vA, vB } },
		{ "vpkuhus",	'b',  0 , _VX(04,RD,RA,RB, 142), { vD, vA, vB } },
		{ "vpkuwum",	'h',  0 , _VX(04,RD,RA,RB,  78), { vD, vA, vB } },
		{ "vpkuwus",	'h',  0 , _VX(04,RD,RA,RB, 206), { vD, vA, vB } },
		{ "vrefp",		'e', 'f', _VX(04,RD,00,RB, 266), { vD, __, vB } },
		{ "vrfim",		'f',  0 , _VX(04,RD,00,RB, 714), { vD, __, vB } },
		{ "vrfin",		'f',  0 , _VX(04,RD,00,RB, 522), { vD, __, vB } },
		{ "vrfip",		'f',  0 , _VX(04,RD,00,RB, 650), { vD, __, vB } },
		{ "vrfiz",		'f',  0 , _VX(04,RD,00,RB, 586), { vD, __, vB } },
		{ "vrlb",		'b',  0 , _VX(04,RD,RA,RB,   4), { vD, vA, vB } },
		{ "vrlh",		'h',  0 , _VX(04,RD,RA,RB,  68), { vD, vA, vB } },
		{ "vrlw",		'w',  0 , _VX(04,RD,RA,RB, 132), { vD, vA, vB } },
		{ "vrsqrtefp",	'e', 'f', _VX(04,RD,00,RB, 330), { vD, __, vB } },
		{ "vsel",		'b',  0 , _VA(04,RD,RA,RB,RC,42),{ vD, vA, vB, vC } },
		{ "vsl",		'b', 'B', _VX(04,RD,RA,RB, 452), { vD, vA, vB } },
		{ "vslb",		'b',  0 , _VX(04,RD,RA,RB, 260), { vD, vA, vB } },
		{ "vsldoi",		'b',  0 , _VA(04,RD,RA,RB,00,44),{ vD, vA, vB, vN } },
		{ "vslh",		'h',  0 , _VX(04,RD,RA,RB, 324), { vD, vA, vB } },
		{ "vslo",		'b',  0 , _VX(04,RD,RA,RB,1036), { vD, vA, vB } },
		{ "vslw",		'w',  0 , _VX(04,RD,RA,RB, 388), { vD, vA, vB } },
		{ "vspltb",		'b',  0 , _VX(04,RD,00,RB, 524), { vD, vI, vB } },
		{ "vsplth",		'h',  0 , _VX(04,RD,00,RB, 588), { vD, vI, vB } },
		{ "vspltisb",	'b',  0 , _VX(04,RD,00,00, 780), { vD, vI } },
		{ "vspltish",	'h',  0 , _VX(04,RD,00,00, 844), { vD, vI } },
		{ "vspltisw",	'w',  0 , _VX(04,RD,00,00, 908), { vD, vI } },
		{ "vspltw",		'w',  0 , _VX(04,RD,00,RB, 652), { vD, vI, vB } },
		{ "vsr",		'b', 'B', _VX(04,RD,RA,RB, 708), { vD, vA, vB } },
		{ "vsrab",		'b',  0 , _VX(04,RD,RA,RB, 772), { vD, vA, vB } },
		{ "vsrah",		'h',  0 , _VX(04,RD,RA,RB, 836), { vD, vA, vB } },
		{ "vsraw",		'w',  0 , _VX(04,RD,RA,RB, 900), { vD, vA, vB } },
		{ "vsrb",		'b',  0 , _VX(04,RD,RA,RB, 516), { vD, vA, vB } },
		{ "vsrh",		'h',  0 , _VX(04,RD,RA,RB, 580), { vD, vA, vB } },
		{ "vsro",		'b',  0 , _VX(04,RD,RA,RB,1100), { vD, vA, vB } },
		{ "vsrw",		'w',  0 , _VX(04,RD,RA,RB, 644), { vD, vA, vB } },
		{ "vsubcuw",	'w',  0 , _VX(04,RD,RA,RB,1408), { vD, vA, vB } },
		{ "vsubfp",		'f',  0 , _VX(04,RD,RA,RB,  74), { vD, vA, vB } },
		{ "vsubsbs",	'b',  0 , _VX(04,RD,RA,RB,1792), { vD, vA, vB } },
		{ "vsubshs",	'h',  0 , _VX(04,RD,RA,RB,1856), { vD, vA, vB } },
		{ "vsubsws",	'w',  0 , _VX(04,RD,RA,RB,1920), { vD, vA, vB } },
		{ "vsububm",	'b',  0 , _VX(04,RD,RA,RB,1024), { vD, vA, vB } },
		{ "vsububs",	'b',  0 , _VX(04,RD,RA,RB,1536), { vD, vA, vB } },
		{ "vsubuhm",	'h',  0 , _VX(04,RD,RA,RB,1088), { vD, vA, vB } },
		{ "vsubuhs",	'h',  0 , _VX(04,RD,RA,RB,1600), { vD, vA, vB } },
		{ "vsubuwm",	'w',  0 , _VX(04,RD,RA,RB,1152), { vD, vA, vB } },
		{ "vsubuws",	'w',  0 , _VX(04,RD,RA,RB,1664), { vD, vA, vB } },
		{ "vsum2sws",	'w',  0 , _VX(04,RD,RA,RB,1672), { vD, vA, vB } },
		{ "vsum4sbs",	'w',  0 , _VX(04,RD,RA,RB,1800), { vD, vA, vB } },
		{ "vsum4shs",	'w',  0 , _VX(04,RD,RA,RB,1608), { vD, vA, vB } },
		{ "vsum4ubs",	'w',  0 , _VX(04,RD,RA,RB,1544), { vD, vA, vB } },
		{ "vsumsws",	'w',  0 , _VX(04,RD,RA,RB,1928), { vD, vA, vB } },
		{ "vupkhpx",	'w',  0 , _VX(04,RD,00,RB, 846), { vD, __, vB } },
		{ "vupkhsb",	'h',  0 , _VX(04,RD,00,RB, 526), { vD, __, vB } },
		{ "vupkhsh",	'w',  0 , _VX(04,RD,00,RB, 590), { vD, __, vB } },
		{ "vupklpx",	'w',  0 , _VX(04,RD,00,RB, 974), { vD, __, vB } },
		{ "vupklsb",	'h',  0 , _VX(04,RD,00,RB, 654), { vD, __, vB } },
		{ "vupklsh",	'w',  0 , _VX(04,RD,00,RB, 718), { vD, __, vB } },
		{ "vxor",		'w',  0 , _VX(04,RD,RA,RB,1220), { vD, vA, vB } },
	};

	// Code template
	static uint32_t code[] = {
		POWERPC_MFSPR(12, 256),			// mfvrsave	r12
		_D(15,0,0,0x9e00),				// lis		r0,0x9e00 ([v0;v3-v6])
		POWERPC_MTSPR(0, 256),			// mtvrsave	r0
		POWERPC_LVX(RA, 0, RA),			// lvx		v4,r4(0)
		POWERPC_LVX(RB, 0, RB),			// lvx		v5,r5(0)
		POWERPC_LVX(RC, 0, RC),			// lvx		v6,r6(0)
		POWERPC_LVX(0, 0, VSCR),		// lvx		v0,r7(0)
		_VX(04,00,00,00,1604),			// mtvscr	v0
		0,								// <op>		v3,v4,v5
		_VX(04,00,00,00,1540),			// mfvscr	v0
		POWERPC_STVX(0, 0, VSCR),		// stvx		v0,r7(0)
		POWERPC_STVX(RD, 0, RD),		// stvx		v3,r3(0)
		POWERPC_MTSPR(12, 256),			// mtvrsave	r12
		POWERPC_BLR						// blr
	};

	int i_opcode = -1;
	const int n_instructions = sizeof(code) / sizeof(code[0]);
	for (int i = 0; i < n_instructions; i++) {
		if (code[i] == 0) {
			i_opcode = i;
			break;
		}
	}
	assert(i_opcode != -1);

	const int n_elements = sizeof(tests) / sizeof(tests[0]);
	for (int n = 0; n < n_elements; n++) {
		vector_test_t vt = tests[n];
		code[i_opcode] = vt.opcode;
		flush_icache_range(code, sizeof(code));

		// Operand type
		char op_type = vt.op_type;
		if (!op_type)
			op_type = vt.type;

		// Operand values
		int n_vector_values;
		const vector_value_t *vvp;
		if (op_type == 'f') {
			n_vector_values = sizeof(vector_fp_values)/sizeof(vector_fp_values[0]);
			vvp = vector_fp_values;
		}
		else {
			n_vector_values = sizeof(vector_values)/sizeof(vector_values[0]);
			vvp = vector_values;
		}

		printf("Testing %s\n", vt.name);
		static aligned_vector_t avi, avj, avk;
		if (vt.operands[1] == vA && vt.operands[2] == vB && vt.operands[3] == vC) {
			for (int i = 0; i < n_vector_values; i++) {
				avi.copy(vvp[i].v);
				for (int j = 0; j < n_vector_values; j++) {
					avj.copy(vvp[j].v);
					for (int k = 0; k < n_vector_values; k++) {
						avk.copy(vvp[k].v);
						test_one_vector(code, vt, avi.addr(), avj.addr(), avk.addr());
					}
				}
			}
		}
		else if (vt.operands[1] == vA && vt.operands[2] == vB && vt.operands[3] == vN) {
			for (int i = 0; i < 16; i++) {
				vSH_field::insert(vt.opcode, i);
				code[i_opcode] = vt.opcode;
				flush_icache_range(code, sizeof(code));
				avi.copy(vvp[i].v);
				for (int j = 0; j < n_vector_values; j++) {
					avj.copy(vvp[j].v);
					for (int k = 0; k < n_vector_values; k++)
						test_one_vector(code, vt, avi.addr(), avj.addr());
				}
			}
		}
		else if (vt.operands[1] == vA && vt.operands[2] == vB) {
			for (int i = 0; i < n_vector_values; i++) {
				avi.copy(vvp[i].v);
				for (int j = 0; j < n_vector_values; j++) {
					if (op_type == 'B') {
						if (!vector_all_eq('b', vvp[j].v))
							continue;
					}
					avj.copy(vvp[j].v);
					test_one_vector(code, vt, avi.addr(), avj.addr());
				}
			}
		}
		else if (vt.operands[1] == vI && vt.operands[2] == vB) {
			for (int i = 0; i < 32; i++) {
				rA_field::insert(vt.opcode, i);
				code[i_opcode] = vt.opcode;
				flush_icache_range(code, sizeof(code));
				for (int j = 0; j < n_vector_values; j++) {
					avj.copy(vvp[j].v);
					test_one_vector(code, vt, NULL, avj.addr());
				}
			}
		}
		else if (vt.operands[1] == vI) {
			for (int i = 0; i < 32; i++) {
				rA_field::insert(vt.opcode, i);
				code[i_opcode] = vt.opcode;
				flush_icache_range(code, sizeof(code));
				test_one_vector(code, vt);
			}
		}
		else if (vt.operands[1] == __ && vt.operands[2] == vB) {
			for (int i = 0; i < n_vector_values; i++) {
				avi.copy(vvp[i].v);
				test_one_vector(code, vt, NULL, avi.addr());
			}
		}
		else {
			printf("ERROR: unhandled test case\n");
			abort();
		}
	}
#endif
}

// SIGILL handler used to detect a host without AltiVec support
#ifdef NATIVE_POWERPC
static sigjmp_buf env;

static void sigill_handler(int sig)
{
	has_altivec = false;
	siglongjmp(env, 1);
}
#endif

bool powerpc_test_cpu::test(void)
{
	// Tests initialization
	tests = errors = 0;
	init_cr = init_xer = 0;

	// Execute ALU tests
#if TEST_ALU_OPS
	test_add();
	test_sub();
	test_mul();
	test_div();
	test_shift();
	test_rotate();
	test_logical();
	test_compare();
	test_cr_logical();
#endif

	// Execute VMX tests
#if TEST_VMX_OPS
	if (has_altivec) {
		test_vector_load_for_shift();
		test_vector_load();
		test_vector_arith();
	}
#endif

	printf("%u errors out of %u tests\n", errors, tests);
	return errors == 0;
}

int main(int argc, char *argv[])
{
	FILE *result = NULL;
	FILE *reference = NULL;

	if (argc < 2 || argc > 3) {
		fprintf(stderr, "usage: %s result-file [reference-file]\n", argv[0]);
		return EXIT_FAILURE;
	}
	else if (argc == 2) {
		const char *file = argv[1];
		if ((result = fopen(file, "wb")) == NULL) {
			fprintf(stderr, "ERROR: can't open %s for writing\n", file);
			return EXIT_FAILURE;
		}
	}
	else if (argc == 3) {
		const char *resfile = argv[1];
		const char *reffile = argv[2];

		if ((result = fopen(resfile, "rb")) == NULL) {
			fprintf(stderr, "ERROR: can't open %s for reading\n", resfile);
			return EXIT_FAILURE;
		}
		if ((reference = fopen(reffile, "rb")) == NULL) {
			fprintf(stderr, "ERROR: can't open %s for reading\n", reffile);
			return EXIT_FAILURE;
		}
	}

	powerpc_test_cpu *ppc = new powerpc_test_cpu (result, reference);

	// Check if host CPU supports AltiVec instructions
	has_altivec = true;
#ifdef NATIVE_POWERPC
	signal(SIGILL, sigill_handler);
	if (!sigsetjmp(env, 1))
		asm volatile(".long 0x10000484"); // vor v0,v0,v0
	signal(SIGILL, SIG_DFL);
#endif

	bool ok = ppc->test();
	if (result)
		fclose(result);
	if (reference)
		fclose(reference);
	delete ppc;
	return !ok;
}
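
For anyone checking the hand-assembled words in the listing above, here is a
minimal sketch of how two of its constants can be derived. The helper names
encode_vx and vrsave_mask are hypothetical and are not part of the test
program. The sketch assumes the standard VX-form layout (6-bit primary opcode,
three 5-bit register fields, 11-bit extended opcode) and that VRSAVE's most
significant bit corresponds to v0.

```cpp
#include <cstdint>

// Hypothetical helper (not from the listing): pack a VX-form instruction word.
constexpr std::uint32_t encode_vx(std::uint32_t op, std::uint32_t vd,
                                  std::uint32_t va, std::uint32_t vb,
                                  std::uint32_t xo)
{
    return (op & 0x3f) << 26 | (vd & 0x1f) << 21 | (va & 0x1f) << 16 |
           (vb & 0x1f) << 11 | (xo & 0x7ff);
}

// Hypothetical helper: VRSAVE mask covering vector registers first..last
// (inclusive), with v0 mapped to the most significant bit.
constexpr std::uint32_t vrsave_mask(unsigned first, unsigned last)
{
    std::uint32_t mask = 0;
    for (unsigned n = first; n <= last; n++)
        mask |= UINT32_C(0x80000000) >> n;
    return mask;
}

// "vor v0,v0,v0" is primary opcode 4 with extended opcode 1156,
// which is the word emitted by the AltiVec probe in main().
static_assert(encode_vx(4, 0, 0, 0, 1156) == 0x10000484,
              "vor v0,v0,v0 encoding");

// v0 plus v3-v6, as named in the comment on the "lis" in the code template.
static_assert((vrsave_mask(0, 0) | vrsave_mask(3, 6)) == 0x9e000000,
              "VRSAVE mask for v0 and v3-v6");
```

The checks run at compile time (C++14 or later), so building the snippet is
enough to confirm both values.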

* Re: [Qemu-devel] [PATCH for-1.2] target-ppc: fix altivec instructions
  2012-08-26 14:14 [Qemu-devel] [PATCH for-1.2] target-ppc: fix altivec instructions Aurelien Jarno
  2012-08-26 15:25 ` Peter Maydell
  2012-08-26 15:27 ` Andreas Färber
@ 2012-08-26 17:56 ` Blue Swirl
  2012-08-26 18:17   ` Peter Maydell
  2 siblings, 1 reply; 6+ messages in thread
From: Blue Swirl @ 2012-08-26 17:56 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: qemu-devel, Andreas Färber, Alexander Graf

On Sun, Aug 26, 2012 at 2:14 PM, Aurelien Jarno <aurelien@aurel32.net> wrote:
> Altivec instructions are not working anymore in PowerPC emulation,
> following commit d15f74fb, which inverted two registers in the call
> to helper. Fix that.
>
> Cc: Blue Swirl <blauwirbel@gmail.com>

Acked-by: Blue Swirl <blauwirbel@gmail.com>

I wonder why TCG debug did not catch this.

> Cc: Alexander Graf <agraf@suse.de>
> Cc: Andreas Färber <afaerber@suse.de>
> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
> ---
>  target-ppc/translate.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/target-ppc/translate.c b/target-ppc/translate.c
> index 91eb7a0..ac915cc 100644
> --- a/target-ppc/translate.c
> +++ b/target-ppc/translate.c
> @@ -6530,7 +6530,7 @@ static void glue(gen_, name)(DisasContext *ctx)                         \
>      ra = gen_avr_ptr(rA(ctx->opcode));                                  \
>      rb = gen_avr_ptr(rB(ctx->opcode));                                  \
>      rd = gen_avr_ptr(rD(ctx->opcode));                                  \
> -    gen_helper_##name(rd, cpu_env, ra, rb);                             \
> +    gen_helper_##name(cpu_env, rd, ra, rb);                             \
>      tcg_temp_free_ptr(ra);                                              \
>      tcg_temp_free_ptr(rb);                                              \
>      tcg_temp_free_ptr(rd);                                              \
> --
> 1.7.10.4
>

* Re: [Qemu-devel] [PATCH for-1.2] target-ppc: fix altivec instructions
  2012-08-26 17:56 ` Blue Swirl
@ 2012-08-26 18:17   ` Peter Maydell
  0 siblings, 0 replies; 6+ messages in thread
From: Peter Maydell @ 2012-08-26 18:17 UTC (permalink / raw)
  To: Blue Swirl
  Cc: Alexander Graf, qemu-devel, Aurelien Jarno, Andreas Färber

On 26 August 2012 18:56, Blue Swirl <blauwirbel@gmail.com> wrote:
> On Sun, Aug 26, 2012 at 2:14 PM, Aurelien Jarno <aurelien@aurel32.net> wrote:
>> Altivec instructions are not working anymore in PowerPC emulation,
>> following commit d15f74fb, which inverted two registers in the call
>> to helper. Fix that.

> I wonder why TCG debug did not catch this.

Because all of ra, rb, rd and cpu_env are TCGv_ptr. Debug only
catches mismatches between _i32, _i64 and _ptr. It might be
possible to add support for enforcing that you pass a cpu_env
in where your DEF_HELPER_* had an 'env' parameter, but it would
be slightly different from the current checks because you want
to support passing a cpu_env TCGv in where a TCGv_ptr is OK
as well as the places which require exactly a TCGv_env.
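
To illustrate the point, here is a sketch only. It assumes the debug check
amounts to giving each kind of handle its own distinct type, and every name
in it (Val32, Val64, ValPtr, Call, gen_helper_demo) is hypothetical rather
than one of QEMU's actual definitions:

```cpp
#include <type_traits>

// Hypothetical stand-ins for distinct debug-mode handle types.
// These are NOT QEMU's real definitions.
struct Val32  { int idx; };
struct Val64  { int idx; };
struct ValPtr { int idx; };

// Records which handle ended up in which parameter slot.
struct Call { int env, rd, ra, rb; };

// Hypothetical helper generator: env is expected first, but env and rd
// have the same type, so the compiler cannot tell them apart.
inline Call gen_helper_demo(ValPtr env, ValPtr rd, ValPtr ra, ValPtr rb)
{
    return Call{env.idx, rd.idx, ra.idx, rb.idx};
}

// A 32-bit or 64-bit handle in a pointer slot is rejected at compile time...
static_assert(!std::is_invocable_v<decltype(&gen_helper_demo),
                                   Val32, ValPtr, ValPtr, ValPtr>,
              "i32 handle must not be accepted as a pointer handle");
static_assert(!std::is_invocable_v<decltype(&gen_helper_demo),
                                   Val64, ValPtr, ValPtr, ValPtr>,
              "i64 handle must not be accepted as a pointer handle");

// ...but two pointer handles in the wrong order type-check fine.
inline Call call_fixed()
{
    ValPtr cpu_env{0}, rd{1}, ra{2}, rb{3};
    return gen_helper_demo(cpu_env, rd, ra, rb);   // order after the patch
}

inline Call call_swapped()
{
    ValPtr cpu_env{0}, rd{1}, ra{2}, rb{3};
    return gen_helper_demo(rd, cpu_env, ra, rb);   // order before the patch
}
```

Both call_fixed() and call_swapped() compile (C++17); only inspecting the
recorded slots shows that the swapped call put rd where env belongs.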

-- PMM

Thread overview: 6+ messages
2012-08-26 14:14 [Qemu-devel] [PATCH for-1.2] target-ppc: fix altivec instructions Aurelien Jarno
2012-08-26 15:25 ` Peter Maydell
2012-08-26 15:27 ` Andreas Färber
2012-08-26 15:46   ` Aurelien Jarno
2012-08-26 17:56 ` Blue Swirl
2012-08-26 18:17   ` Peter Maydell
