LinuxPPC-Dev Archive on lore.kernel.org
* Re: [PATCH 5/9][v5] powerpc: implement is_instr_load_store().
From: Michael Ellerman @ 2013-10-09  1:03 UTC
  To: Sukadev Bhattiprolu
  Cc: linux-kernel, Stephane Eranian, linuxppc-dev, Paul Mackerras,
	Arnaldo Carvalho de Melo, Anshuman Khandual
In-Reply-To: <20131008193117.GA699@us.ibm.com>

On Tue, 2013-10-08 at 12:31 -0700, Sukadev Bhattiprolu wrote:
> Michael Ellerman [michael@ellerman.id.au] wrote:
> | bool is_load_store(int ext_opcode)
> | {
> |         upper = ext_opcode >> 5;
> |         lower = ext_opcode & 0x1f;
> | 
> |         /* Short circuit as many misses as we can */
> |         if (lower < 3 || lower > 23)
> |             return false;
> 
> I see some loads/stores like these which are not covered by
> the above check. Is it ok to ignore them ?
> 
> 	lower == 29: ldepx, stdepx, eviddepx, evstddepx
> 
> 	lower == 31: lwepx, lbepx, lfdepx, stfdepx,

Those are the external process ID instructions, which I've never heard
of anyone using. I think we can ignore them.
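
To make the effect of the short circuit concrete, here is a minimal
sketch of just that pre-filter. The function names are hypothetical,
and the extended opcode values (lwzx = 23, stwx = 151, ldepx = 29,
lwepx = 31, all under primary opcode 31) are my reading of the Power
ISA opcode tables rather than anything taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true if the extended opcode survives the
   short circuit on its low five bits.  Surviving the filter does not
   by itself mean the instruction is a load or store.  */
bool
survives_short_circuit (unsigned int ext_opcode)
{
  unsigned int lower = ext_opcode & 0x1f;

  return !(lower < 3 || lower > 23);
}

/* Extended opcodes assumed from the ISA opcode tables.  */
void
check_short_circuit (void)
{
  assert (survives_short_circuit (23));    /* lwzx: lower == 23 */
  assert (survives_short_circuit (151));   /* stwx: lower == 23 */
  assert (!survives_short_circuit (29));   /* ldepx: lower == 29 */
  assert (!survives_short_circuit (31));   /* lwepx: lower == 31 */
}
```

So the external process ID forms fall out at the first test, which is
the behaviour being asked about.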

> Looking through the opcode maps, I also see these for primary
> op code 4:
> 
> 	evldd, evlddx, evldwx, evldw, evldh, evldhx.
> 
> Should we include those also ?

Yes, I think so. I didn't check any of the other opcodes for you.

cheers


* Re: [PATCH] powerpc/powernv: Reduce panic timeout from 180s to 10s
From: Anton Blanchard @ 2013-10-08 23:56 UTC
  To: Scott Wood; +Cc: linuxppc-dev, paulus
In-Reply-To: <1381269153.7979.281.camel@snotra.buserror.net>


> > > We made this change to pseries in 2011 and I think it makes
> > > sense to do the same on powernv.
> > 
> > I'd vote we set it to 10s for all 64-bit machines in
> > arch/powerpc/kernel/setup_64.c.
> 
> Why is 64-bit relevant?  And wouldn't such a short delay be a problem
> if the crash is displayed on a monitor?

That is why we made it pseries specific in the past. Almost all our
boxes are on a virtual console and the 3 minutes of pausing just hurt
our uptimes.

If other platform maintainers prefer to keep the 3 minute pause, then
we just change the PowerNV platform.

Anton


* Re: [PATCH] powerpc: fix e500 SPE float to integer and fixed-point conversions
From: Joseph S. Myers @ 2013-10-08 23:43 UTC
  To: linuxppc-dev; +Cc: Shan Hai, linux-kernel
In-Reply-To: <Pine.LNX.4.64.1310082337080.23637@digraph.polyomino.org.uk>

On Tue, 8 Oct 2013, Joseph S. Myers wrote:

> I'll send as a followup the testcase I used for verifying that the
> instructions (other than the theoretical conversions to 64-bit
> integers) produce the correct results.  In addition, this has been
> tested with the glibc testsuite (with the e500 port as posted at
> <https://sourceware.org/ml/libc-alpha/2013-10/msg00195.html>), where it
> improves the libm test results.

Here is that testcase.

#include <stdio.h>
#include <stdlib.h>

#define INFF __builtin_inff ()
#define INFD __builtin_inf ()
#define NANF __builtin_nanf ("")
#define NAND __builtin_nan ("")

/* e500 rounding modes: 0 = nearest, 1 = zero, 2 = up, 3 = down.  */

static inline void
set_rm (unsigned int mode)
{
  unsigned int spefscr;
  asm volatile ("mfspefscr %0" : "=r" (spefscr));
  spefscr = (spefscr & ~3) | mode;
  asm volatile ("mtspefscr %0" : : "r" (spefscr));
}

static int success_count, failure_count;

struct float_test_data
{
  float input;
  unsigned int expected[4];
};

struct double_test_data
{
  double input;
  unsigned int expected[4];
};

typedef float vfloat __attribute__ ((vector_size (8)));
typedef unsigned int vuint __attribute__ ((vector_size (8)));

union vfloat_union
{
  vfloat vf;
  float f[2];
};

union vuint_union
{
  vuint vui;
  unsigned int ui[2];
};

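/* T: input A with expected results B..E for rounding modes 0..3.
   TZ: input A with the same expected result B in every mode.  */
#define T(A, B, C, D, E) { (A), { (B), (C), (D), (E) } }
#define TZ(A, B) T (A, B, B, B, B)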

static void
check_result (const char *insn, double input, unsigned int rm,
	      unsigned int expected, unsigned int res)
{
  if (res == expected)
    success_count++;
  else
    {
      failure_count++;
      printf ("%s %a mode %u expected 0x%x (%d) got 0x%x (%d)\n",
	      insn, input, rm, expected, (int) expected, res, (int) res);
    }
}

#define RUN_FLOAT_TESTS(INSN)						\
static void								\
test_##INSN (void)							\
{									\
  size_t i;								\
  for (i = 0;								\
       i < sizeof (INSN##_test_data) / sizeof (INSN##_test_data[0]);	\
       i++)								\
    {									\
      unsigned int rm;							\
      for (rm = 0; rm <= 3; rm++)					\
	{								\
	  set_rm (rm);							\
	  unsigned int res;						\
	  asm volatile (#INSN " %0, %1"					\
			: "=&r" (res)					\
			: "r" (INSN##_test_data[i].input));		\
	  check_result (#INSN, INSN##_test_data[i].input, rm,		\
			INSN##_test_data[i].expected[rm], res);		\
	}								\
    }									\
}

#define RUN_VFLOAT_TESTS(INSN, TINSN)					\
static void								\
test_##INSN (void)							\
{									\
  size_t i;								\
  for (i = 0;								\
       i < sizeof (TINSN##_test_data) / sizeof (TINSN##_test_data[0]);	\
       i++)								\
    {									\
      unsigned int rm;							\
      for (rm = 0; rm <= 3; rm++)					\
	{								\
	  set_rm (rm);							\
	  union vfloat_union varg;					\
	  union vuint_union vres;					\
	  varg.f[0] = TINSN##_test_data[i].input;			\
	  varg.f[1] = 0;						\
	  asm volatile (#INSN " %0, %1"					\
			: "=&r" (vres.vui)				\
			: "r" (varg.vf));				\
	  check_result (#INSN " (high)", TINSN##_test_data[i].input,	\
			rm, TINSN##_test_data[i].expected[rm],		\
			vres.ui[0]);					\
	  check_result (#INSN " (low 0)", TINSN##_test_data[i].input,	\
			rm, 0, vres.ui[1]);				\
	  varg.f[1] = TINSN##_test_data[i].input;			\
	  varg.f[0] = 0;						\
	  asm volatile (#INSN " %0, %1"					\
			: "=&r" (vres.vui)				\
			: "r" (varg.vf));				\
	  check_result (#INSN " (low)", TINSN##_test_data[i].input,	\
			rm, TINSN##_test_data[i].expected[rm],		\
			vres.ui[1]);					\
	  check_result (#INSN " (high 0)", TINSN##_test_data[i].input,	\
			rm, 0, vres.ui[0]);				\
	}								\
    }									\
}

static const struct float_test_data efsctsiz_test_data[] =
  {
    TZ (NANF, 0),
    TZ (INFF, 0x7fffffff),
    TZ (0x1.fffffep127f, 0x7fffffff),
    TZ (0x1p31f, 0x7fffffff),
    TZ (0x1.fffffep30f, 0x7fffff80),
    TZ (1.6f, 1),
    TZ (1.5f, 1),
    TZ (1.4f, 1),
    TZ (1.0f, 1),
    TZ (0.6f, 0),
    TZ (0.5f, 0),
    TZ (0.4f, 0),
    TZ (0x1p-149f, 0),
    TZ (0.0f, 0),
    TZ (-0.0f, 0),
    TZ (-0x1p-149f, 0),
    TZ (-0.4f, 0),
    TZ (-0.5f, 0),
    TZ (-0.6f, 0),
    TZ (-1.0f, -1),
    TZ (-1.4f, -1),
    TZ (-1.5f, -1),
    TZ (-1.6f, -1),
    TZ (-0x1.fffffep30f, 0x80000080),
    TZ (-0x1p31f, 0x80000000),
    TZ (-0x1.fffffep127f, 0x80000000),
    TZ (-INFF, 0x80000000),
    TZ (-NANF, 0),
  };

static const struct float_test_data efsctuiz_test_data[] =
  {
    TZ (NANF, 0),
    TZ (INFF, 0xffffffff),
    TZ (0x1.fffffep127f, 0xffffffff),
    TZ (0x1p32f, 0xffffffff),
    TZ (0x1.fffffep31f, 0xffffff00),
    TZ (1.6f, 1),
    TZ (1.5f, 1),
    TZ (1.4f, 1),
    TZ (1.0f, 1),
    TZ (0.6f, 0),
    TZ (0.5f, 0),
    TZ (0.4f, 0),
    TZ (0x1p-149f, 0),
    TZ (0.0f, 0),
    TZ (-0.0f, 0),
    TZ (-0x1p-149f, 0),
    TZ (-0.4f, 0),
    TZ (-0.5f, 0),
    TZ (-0.6f, 0),
    TZ (-1.0f, 0),
    TZ (-1.4f, 0),
    TZ (-1.5f, 0),
    TZ (-1.6f, 0),
    TZ (-0x1.fffffep127f, 0),
    TZ (-INFF, 0),
    TZ (-NANF, 0),
  };

static const struct double_test_data efdctsiz_test_data[] =
  {
    TZ (NAND, 0),
    TZ (INFD, 0x7fffffff),
    TZ (0x1.fffffffffffffp1023, 0x7fffffff),
    TZ (0x1.0000000000001p31, 0x7fffffff),
    TZ (0x1p31, 0x7fffffff),
    TZ (0x1.fffffffffffffp30, 0x7fffffff),
    TZ (0x1.fffffffcp30, 0x7fffffff),
    TZ (1.6, 1),
    TZ (1.5, 1),
    TZ (1.4, 1),
    TZ (1.0, 1),
    TZ (0.6, 0),
    TZ (0.5, 0),
    TZ (0.4, 0),
    TZ (0x1p-1074, 0),
    TZ (0.0, 0),
    TZ (-0.0, 0),
    TZ (-0x1p-1074, 0),
    TZ (-0.4, 0),
    TZ (-0.5, 0),
    TZ (-0.6, 0),
    TZ (-1.0, -1),
    TZ (-1.4, -1),
    TZ (-1.5, -1),
    TZ (-1.6, -1),
    TZ (-0x1.fffffffcp30, 0x80000001),
    TZ (-0x1.fffffffffffffp30, 0x80000001),
    TZ (-0x1p31, 0x80000000),
    TZ (-0x1.0000000000001p31, 0x80000000),
    TZ (-0x1.fffffffffffffp1023, 0x80000000),
    TZ (-INFD, 0x80000000),
    TZ (-NAND, 0),
  };

static const struct double_test_data efdctuiz_test_data[] =
  {
    TZ (NAND, 0),
    TZ (INFD, 0xffffffff),
    TZ (0x1.fffffffffffffp1023, 0xffffffff),
    TZ (0x1.0000000000001p32, 0xffffffff),
    TZ (0x1p32, 0xffffffff),
    TZ (0x1.fffffffffffffp31, 0xffffffff),
    TZ (1.6, 1),
    TZ (1.5, 1),
    TZ (1.4, 1),
    TZ (1.0, 1),
    TZ (0.6, 0),
    TZ (0.5, 0),
    TZ (0.4, 0),
    TZ (0x1p-1074, 0),
    TZ (0.0, 0),
    TZ (-0.0, 0),
    TZ (-0x1p-1074, 0),
    TZ (-0.4, 0),
    TZ (-0.5, 0),
    TZ (-0.6, 0),
    TZ (-1.0, 0),
    TZ (-1.4, 0),
    TZ (-1.5, 0),
    TZ (-1.6, 0),
    TZ (-0x1.fffffffffffffp1023, 0),
    TZ (-INFD, 0),
    TZ (-NAND, 0),
  };

static const struct float_test_data efsctsi_test_data[] =
  {
    TZ (NANF, 0),
    TZ (INFF, 0x7fffffff),
    TZ (0x1.fffffep127f, 0x7fffffff),
    TZ (0x1p31f, 0x7fffffff),
    TZ (0x1.fffffep30f, 0x7fffff80),
    T (1.6f, 2, 1, 2, 1),
    T (1.5f, 2, 1, 2, 1),
    T (1.4f, 1, 1, 2, 1),
    TZ (1.0f, 1),
    T (0.6f, 1, 0, 1, 0),
    T (0.5f, 0, 0, 1, 0),
    T (0.4f, 0, 0, 1, 0),
    T (0x1p-149f, 0, 0, 1, 0),
    TZ (0.0f, 0),
    TZ (-0.0f, 0),
    T (-0x1p-149f, 0, 0, 0, -1),
    T (-0.4f, 0, 0, 0, -1),
    T (-0.5f, 0, 0, 0, -1),
    T (-0.6f, -1, 0, 0, -1),
    TZ (-1.0f, -1),
    T (-1.4f, -1, -1, -1, -2),
    T (-1.5f, -2, -1, -1, -2),
    T (-1.6f, -2, -1, -1, -2),
    TZ (-0x1.fffffep30f, 0x80000080),
    TZ (-0x1p31f, 0x80000000),
    TZ (-0x1.fffffep127f, 0x80000000),
    TZ (-INFF, 0x80000000),
    TZ (-NANF, 0),
  };

static const struct float_test_data efsctui_test_data[] =
  {
    TZ (NANF, 0),
    TZ (INFF, 0xffffffff),
    TZ (0x1.fffffep127f, 0xffffffff),
    TZ (0x1p32f, 0xffffffff),
    TZ (0x1.fffffep31f, 0xffffff00),
    T (1.6f, 2, 1, 2, 1),
    T (1.5f, 2, 1, 2, 1),
    T (1.4f, 1, 1, 2, 1),
    TZ (1.0f, 1),
    T (0.6f, 1, 0, 1, 0),
    T (0.5f, 0, 0, 1, 0),
    T (0.4f, 0, 0, 1, 0),
    T (0x1p-149f, 0, 0, 1, 0),
    TZ (0.0f, 0),
    TZ (-0.0f, 0),
    TZ (-0x1p-149f, 0),
    TZ (-0.4f, 0),
    TZ (-0.5f, 0),
    TZ (-0.6f, 0),
    TZ (-1.0f, 0),
    TZ (-1.4f, 0),
    TZ (-1.5f, 0),
    TZ (-1.6f, 0),
    TZ (-0x1.fffffep127f, 0),
    TZ (-INFF, 0),
    TZ (-NANF, 0),
  };

static const struct double_test_data efdctsi_test_data[] =
  {
    TZ (NAND, 0),
    TZ (INFD, 0x7fffffff),
    TZ (0x1.fffffffffffffp1023, 0x7fffffff),
    TZ (0x1.0000000000001p31, 0x7fffffff),
    TZ (0x1p31, 0x7fffffff),
    TZ (0x1.fffffffffffffp30, 0x7fffffff),
    TZ (0x1.fffffffcp30, 0x7fffffff),
    T (1.6, 2, 1, 2, 1),
    T (1.5, 2, 1, 2, 1),
    T (1.4, 1, 1, 2, 1),
    TZ (1.0, 1),
    T (0.6, 1, 0, 1, 0),
    T (0.5, 0, 0, 1, 0),
    T (0.4, 0, 0, 1, 0),
    T (0x1p-1074, 0, 0, 1, 0),
    TZ (0.0, 0),
    TZ (-0.0, 0),
    T (-0x1p-1074, 0, 0, 0, -1),
    T (-0.4, 0, 0, 0, -1),
    T (-0.5, 0, 0, 0, -1),
    T (-0.6, -1, 0, 0, -1),
    TZ (-1.0, -1),
    T (-1.4, -1, -1, -1, -2),
    T (-1.5, -2, -1, -1, -2),
    T (-1.6, -2, -1, -1, -2),
    TZ (-0x1.fffffffcp30, 0x80000001),
    T (-0x1.fffffffffffffp30, 0x80000000, 0x80000001, 0x80000001, 0x80000000),
    TZ (-0x1p31, 0x80000000),
    TZ (-0x1.0000000000001p31, 0x80000000),
    TZ (-0x1.fffffffffffffp1023, 0x80000000),
    TZ (-INFD, 0x80000000),
    TZ (-NAND, 0),
  };

static const struct double_test_data efdctui_test_data[] =
  {
    TZ (NAND, 0),
    TZ (INFD, 0xffffffff),
    TZ (0x1.fffffffffffffp1023, 0xffffffff),
    TZ (0x1.0000000000001p32, 0xffffffff),
    TZ (0x1p32, 0xffffffff),
    TZ (0x1.fffffffffffffp31, 0xffffffff),
    T (1.6, 2, 1, 2, 1),
    T (1.5, 2, 1, 2, 1),
    T (1.4, 1, 1, 2, 1),
    TZ (1.0, 1),
    T (0.6, 1, 0, 1, 0),
    T (0.5, 0, 0, 1, 0),
    T (0.4, 0, 0, 1, 0),
    T (0x1p-1074, 0, 0, 1, 0),
    TZ (0.0, 0),
    TZ (-0.0, 0),
    TZ (-0x1p-1074, 0),
    TZ (-0.4, 0),
    TZ (-0.5, 0),
    TZ (-0.6, 0),
    TZ (-1.0, 0),
    TZ (-1.4, 0),
    TZ (-1.5, 0),
    TZ (-1.6, 0),
    TZ (-0x1.fffffffffffffp1023, 0),
    TZ (-INFD, 0),
    TZ (-NAND, 0),
  };

static const struct float_test_data efsctsf_test_data[] =
  {
    TZ (NANF, 0),
    TZ (INFF, 0x7fffffff),
    TZ (0x1.fffffep127f, 0x7fffffff),
    TZ (0x1.000002p0f, 0x7fffffff),
    TZ (1.0f, 0x7fffffff),
    TZ (0x1.fffffep-1f, 0x7fffff80),
    TZ (0xffffff.0p-31f, 0xffffff),
    T (0x7fffff.8p-31f, 0x800000, 0x7fffff, 0x800000, 0x7fffff),
    T (0x7ffffe.8p-31f, 0x7ffffe, 0x7ffffe, 0x7fffff, 0x7ffffe),
    T (0x1.9p-31f, 2, 1, 2, 1),
    T (0x1.8p-31f, 2, 1, 2, 1),
    T (0x1.7p-31f, 1, 1, 2, 1),
    TZ (0x1p-31f, 1),
    T (0x0.9p-31f, 1, 0, 1, 0),
    T (0x0.8p-31f, 0, 0, 1, 0),
    T (0x0.7p-31f, 0, 0, 1, 0),
    T (0x1p-149f, 0, 0, 1, 0),
    TZ (0.0f, 0),
    TZ (-0.0f, 0),
    T (-0x1p-149f, 0, 0, 0, -1),
    T (-0x0.7p-31f, 0, 0, 0, -1),
    T (-0x0.8p-31f, 0, 0, 0, -1),
    T (-0x0.9p-31f, -1, 0, 0, -1),
    TZ (-0x1p-31f, -1),
    T (-0x1.7p-31f, -1, -1, -1, -2),
    T (-0x1.8p-31f, -2, -1, -1, -2),
    T (-0x1.9p-31f, -2, -1, -1, -2),
    T (-0x7ffffe.8p-31f, -0x7ffffe, -0x7ffffe, -0x7ffffe, -0x7fffff),
    T (-0x7fffff.8p-31f, -0x800000, -0x7fffff, -0x7fffff, -0x800000),
    TZ (-0xffffff.0p-31f, -0xffffff),
    TZ (-0x1.fffffep-1f, -0x7fffff80),
    TZ (-1.0f, 0x80000000),
    TZ (-0x1.000002p0f, 0x80000000),
    TZ (-0x1.fffffep127f, 0x80000000),
    TZ (-INFF, 0x80000000),
    TZ (-NANF, 0),
  };

static const struct float_test_data efsctuf_test_data[] =
  {
    TZ (NANF, 0),
    TZ (INFF, 0xffffffff),
    TZ (0x1.fffffep127f, 0xffffffff),
    TZ (0x1.000002p0f, 0xffffffff),
    TZ (1.0f, 0xffffffff),
    TZ (0x1.fffffep-1f, 0xffffff00),
    TZ (0xffffff.0p-32f, 0xffffff),
    T (0x7fffff.8p-32f, 0x800000, 0x7fffff, 0x800000, 0x7fffff),
    T (0x7ffffe.8p-32f, 0x7ffffe, 0x7ffffe, 0x7fffff, 0x7ffffe),
    T (0x1.9p-32f, 2, 1, 2, 1),
    T (0x1.8p-32f, 2, 1, 2, 1),
    T (0x1.7p-32f, 1, 1, 2, 1),
    TZ (0x1p-32f, 1),
    T (0x0.9p-32f, 1, 0, 1, 0),
    T (0x0.8p-32f, 0, 0, 1, 0),
    T (0x0.7p-32f, 0, 0, 1, 0),
    T (0x1p-149f, 0, 0, 1, 0),
    TZ (0.0f, 0),
    TZ (-0.0f, 0),
    TZ (-0x1p-149f, 0),
    TZ (-0x0.7p-32f, 0),
    TZ (-0x0.8p-32f, 0),
    TZ (-0x0.9p-32f, 0),
    TZ (-0x1p-32f, 0),
    TZ (-0x1.7p-32f, 0),
    TZ (-0x1.8p-32f, 0),
    TZ (-0x1.9p-32f, 0),
    TZ (-0x7ffffe.8p-32f, 0),
    TZ (-0x7fffff.8p-32f, 0),
    TZ (-0xffffff.0p-32f, 0),
    TZ (-0x1.fffffep-1f, 0),
    TZ (-1.0f, 0),
    TZ (-0x1.000002p0f, 0),
    TZ (-0x1.fffffep127f, 0),
    TZ (-INFF, 0),
    TZ (-NANF, 0),
  };

static const struct double_test_data efdctsf_test_data[] =
  {
    TZ (NAND, 0),
    TZ (INFD, 0x7fffffff),
    TZ (0x1.fffffffffffffp1023, 0x7fffffff),
    TZ (0x1.0000000000001p0, 0x7fffffff),
    TZ (1.0, 0x7fffffff),
    TZ (0x7fffffffp-31, 0x7fffffff),
    T (0x7fffff.8p-31, 0x800000, 0x7fffff, 0x800000, 0x7fffff),
    T (0x7ffffe.8p-31, 0x7ffffe, 0x7ffffe, 0x7fffff, 0x7ffffe),
    T (0x1.9p-31, 2, 1, 2, 1),
    T (0x1.8p-31, 2, 1, 2, 1),
    T (0x1.7p-31, 1, 1, 2, 1),
    TZ (0x1p-31, 1),
    T (0x0.9p-31, 1, 0, 1, 0),
    T (0x0.8p-31, 0, 0, 1, 0),
    T (0x0.7p-31, 0, 0, 1, 0),
    T (0x1p-1074, 0, 0, 1, 0),
    TZ (0.0, 0),
    TZ (-0.0, 0),
    T (-0x1p-1074, 0, 0, 0, -1),
    T (-0x0.7p-31, 0, 0, 0, -1),
    T (-0x0.8p-31, 0, 0, 0, -1),
    T (-0x0.9p-31, -1, 0, 0, -1),
    TZ (-0x1p-31, -1),
    T (-0x1.7p-31, -1, -1, -1, -2),
    T (-0x1.8p-31, -2, -1, -1, -2),
    T (-0x1.9p-31, -2, -1, -1, -2),
    T (-0x7ffffe.8p-31, -0x7ffffe, -0x7ffffe, -0x7ffffe, -0x7fffff),
    T (-0x7fffff.8p-31, -0x800000, -0x7fffff, -0x7fffff, -0x800000),
    TZ (-0x7fffffffp-31, -0x7fffffff),
    TZ (-1.0, 0x80000000),
    TZ (-0x1.0000000000001p0, 0x80000000),
    TZ (-0x1.fffffffffffffp1023, 0x80000000),
    TZ (-INFD, 0x80000000),
    TZ (-NAND, 0),
  };

static const struct double_test_data efdctuf_test_data[] =
  {
    TZ (NAND, 0),
    TZ (INFD, 0xffffffff),
    TZ (0x1.fffffffffffffp1023, 0xffffffff),
    TZ (0x1.0000000000001p0, 0xffffffff),
    TZ (1.0, 0xffffffff),
    TZ (0xffffffffp-32, 0xffffffff),
    T (0xfffffffe.9p-32, 0xffffffff, 0xfffffffe, 0xffffffff, 0xfffffffe),
    T (0xfffffffe.8p-32, 0xfffffffe, 0xfffffffe, 0xffffffff, 0xfffffffe),
    T (0xfffffffe.7p-32, 0xfffffffe, 0xfffffffe, 0xffffffff, 0xfffffffe),
    T (0xfffffffd.9p-32, 0xfffffffe, 0xfffffffd, 0xfffffffe, 0xfffffffd),
    T (0xfffffffd.8p-32, 0xfffffffe, 0xfffffffd, 0xfffffffe, 0xfffffffd),
    T (0xfffffffd.7p-32, 0xfffffffd, 0xfffffffd, 0xfffffffe, 0xfffffffd),
    T (0x7fffff.8p-32, 0x800000, 0x7fffff, 0x800000, 0x7fffff),
    T (0x7ffffe.8p-32, 0x7ffffe, 0x7ffffe, 0x7fffff, 0x7ffffe),
    T (0x1.9p-32, 2, 1, 2, 1),
    T (0x1.8p-32, 2, 1, 2, 1),
    T (0x1.7p-32, 1, 1, 2, 1),
    TZ (0x1p-32, 1),
    T (0x0.9p-32, 1, 0, 1, 0),
    T (0x0.8p-32, 0, 0, 1, 0),
    T (0x0.7p-32, 0, 0, 1, 0),
    T (0x1p-1074, 0, 0, 1, 0),
    TZ (0.0, 0),
    TZ (-0.0, 0),
    TZ (-0x1p-1074, 0),
    TZ (-0x0.7p-32, 0),
    TZ (-0x0.8p-32, 0),
    TZ (-0x0.9p-32, 0),
    TZ (-0x1p-32, 0),
    TZ (-0x1.7p-32, 0),
    TZ (-0x1.8p-32, 0),
    TZ (-0x1.9p-32, 0),
    TZ (-0x7ffffe.8p-32, 0),
    TZ (-0x7fffff.8p-32, 0),
    TZ (-0xfffffffd.7p-32, 0),
    TZ (-0xfffffffd.8p-32, 0),
    TZ (-0xfffffffd.9p-32, 0),
    TZ (-0xfffffffe.7p-32, 0),
    TZ (-0xfffffffe.8p-32, 0),
    TZ (-0xfffffffe.9p-32, 0),
    TZ (-0xffffffffp-32, 0),
    TZ (-1.0, 0),
    TZ (-0x1.0000000000001p0, 0),
    TZ (-0x1.fffffffffffffp1023, 0),
    TZ (-INFD, 0),
    TZ (-NAND, 0),
  };

RUN_FLOAT_TESTS (efsctsiz)
RUN_VFLOAT_TESTS (evfsctsiz, efsctsiz)
RUN_FLOAT_TESTS (efsctuiz)
RUN_VFLOAT_TESTS (evfsctuiz, efsctuiz)
RUN_FLOAT_TESTS (efdctsiz)
RUN_FLOAT_TESTS (efdctuiz)

RUN_FLOAT_TESTS (efsctsi)
RUN_VFLOAT_TESTS (evfsctsi, efsctsi)
RUN_FLOAT_TESTS (efsctui)
RUN_VFLOAT_TESTS (evfsctui, efsctui)
RUN_FLOAT_TESTS (efdctsi)
RUN_FLOAT_TESTS (efdctui)

RUN_FLOAT_TESTS (efsctsf)
RUN_VFLOAT_TESTS (evfsctsf, efsctsf)
RUN_FLOAT_TESTS (efsctuf)
RUN_VFLOAT_TESTS (evfsctuf, efsctuf)
RUN_FLOAT_TESTS (efdctsf)
RUN_FLOAT_TESTS (efdctuf)

int
main (void)
{
  test_efsctsiz ();
  test_evfsctsiz ();
  test_efsctuiz ();
  test_evfsctuiz ();
  test_efdctsiz ();
  test_efdctuiz ();
  test_efsctsi ();
  test_evfsctsi ();
  test_efsctui ();
  test_evfsctui ();
  test_efdctsi ();
  test_efdctui ();
  test_efsctsf ();
  test_evfsctsf ();
  test_efsctuf ();
  test_evfsctuf ();
  test_efdctsf ();
  test_efdctuf ();
  printf ("%d tests passed, %d tests failed\n", success_count, failure_count);
  exit (failure_count != 0 ? EXIT_FAILURE : EXIT_SUCCESS);
}

-- 
Joseph S. Myers
joseph@codesourcery.com


* [PATCH] powerpc: fix e500 SPE float to integer and fixed-point conversions
From: Joseph S. Myers @ 2013-10-08 23:41 UTC
  To: linuxppc-dev; +Cc: Shan Hai, linux-kernel

From: Joseph Myers <joseph@codesourcery.com>

The e500 SPE floating-point emulation code has several problems in how
it handles conversions to integer and fixed-point fractional types.

There are the following 20 relevant instructions.  These can convert
to signed or unsigned 32-bit integers, either rounding towards zero
(as correct for C casts from floating-point to integer) or according
to the current rounding mode, or to signed or unsigned 32-bit
fixed-point values (values in the range [-1, 1) or [0, 1)).  For
conversion from double precision there are also instructions to
convert to 64-bit integers, rounding towards zero, although as far as
I know those instructions are completely theoretical (they are only
defined for implementations that support both SPE and classic 64-bit,
and I'm not aware of any such hardware even though the architecture
definition permits that combination).

#define EFSCTUI		0x2d4
#define EFSCTSI		0x2d5
#define EFSCTUF		0x2d6
#define EFSCTSF		0x2d7
#define EFSCTUIZ	0x2d8
#define EFSCTSIZ	0x2da

#define EVFSCTUI	0x294
#define EVFSCTSI	0x295
#define EVFSCTUF	0x296
#define EVFSCTSF	0x297
#define EVFSCTUIZ	0x298
#define EVFSCTSIZ	0x29a

#define EFDCTUIDZ	0x2ea
#define EFDCTSIDZ	0x2eb

#define EFDCTUI		0x2f4
#define EFDCTSI		0x2f5
#define EFDCTUF		0x2f6
#define EFDCTSF		0x2f7
#define EFDCTUIZ	0x2f8
#define EFDCTSIZ	0x2fa

The emulation code, for the instructions that come in variants
rounding either towards zero or according to the current rounding
direction, uses "if (func & 0x4)" as a condition for using _FP_ROUND
(otherwise _FP_ROUND_ZERO is used).  The condition is correct, but the
code it controls isn't.  Whether _FP_ROUND or _FP_ROUND_ZERO is used
makes no difference, as the effect of those soft-fp macros is to round
an intermediate floating-point result using the low three bits (the
last one sticky) of the working format.  As these operations are
dealing with a freshly unpacked floating-point input, those low bits
are zero and no rounding occurs.  The emulation code then uses the
FP_TO_INT_* macros for the actual integer conversion, with the effect
of always rounding towards zero; for rounding according to the current
rounding direction, it should be using FP_TO_INT_ROUND_*.
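
As an illustration of the difference between the two families of
conversions, here is a minimal sketch in plain C. It is not the
soft-fp code: to_int_mode is a hypothetical helper, it only handles
finite inputs that fit in an int, and the mode numbering (0 = nearest,
1 = zero, 2 = up, 3 = down) is borrowed from the e500 encoding.

```c
#include <assert.h>

/* Convert X to int under rounding mode MODE (0 = nearest, 1 = zero,
   2 = up, 3 = down).  Sketch only: X must be finite and in range.  */
int
to_int_mode (double x, unsigned int mode)
{
  int t = (int) x;            /* C cast: truncates toward zero */
  double frac = x - t;        /* same sign as x, magnitude below 1 */
  double afrac = frac < 0 ? -frac : frac;

  if (frac == 0)
    return t;                 /* exact: every mode agrees */
  switch (mode)
    {
    case 1:
      return t;
    case 2:
      return frac > 0 ? t + 1 : t;
    case 3:
      return frac < 0 ? t - 1 : t;
    default:
      /* Nearest, ties to even.  */
      if (afrac > 0.5 || (afrac == 0.5 && (t & 1)))
	return frac > 0 ? t + 1 : t - 1;
      return t;
    }
}

void
check_to_int_mode (void)
{
  /* 1.5 gives 2, 1, 2, 1 across modes 0..3.  */
  assert (to_int_mode (1.5, 0) == 2);
  assert (to_int_mode (1.5, 1) == 1);
  assert (to_int_mode (1.5, 2) == 2);
  assert (to_int_mode (1.5, 3) == 1);
}
```

Only mode 1 matches what a C cast does; an emulation that always
truncates gives the mode 1 answer in all four columns.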

The instructions in question have semantics defined (in the Power ISA
documents) for out-of-range values and NaNs: out-of-range values
saturate and NaNs are converted to zero.  The emulation does nothing
to follow those semantics for NaNs (the soft-fp handling is to treat
them as infinities), and messes up the saturation semantics.  For
single-precision conversion to integers, (((func & 0x3) != 0) || SB_s)
is the condition used for doing a signed conversion.  The first part
is correct, but the second isn't: negative numbers should result in
saturation to 0 when converted to unsigned.  Double-precision
conversion to 64-bit integers correctly uses ((func & 0x1) == 0).
Double-precision conversion to 32-bit integers uses (((func & 0x3) !=
0) || DB_s), with correct first part and incorrect second part.  And
vector float conversion to integers uses (((func & 0x3) != 0) ||
SB0_s) (and similar for the other vector element), where the sign bit
check is again wrong.
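
The semantics described above (out-of-range values saturate, NaNs
become zero) can be sketched in plain C as follows. These helpers are
hypothetical illustrations of the truncating conversions, not the
emulation code, which works on unpacked soft-fp values.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Truncating conversion to unsigned 32-bit with saturation.  */
uint32_t
sat_to_u32_trunc (double x)
{
  if (x != x)
    return 0;                 /* NaN converts to zero */
  if (x <= 0)
    return 0;                 /* negative values saturate to 0 */
  if (x >= 4294967296.0)
    return UINT32_MAX;        /* 2^32 and above saturate */
  return (uint32_t) x;        /* in range: truncate toward zero */
}

/* Truncating conversion to signed 32-bit with saturation.  */
int32_t
sat_to_s32_trunc (double x)
{
  if (x != x)
    return 0;
  if (x >= 2147483648.0)
    return INT32_MAX;
  if (x <= -2147483648.0)
    return INT32_MIN;
  return (int32_t) x;
}

void
check_saturation (void)
{
  assert (sat_to_u32_trunc (-1.0) == 0);          /* not 0xffffffff */
  assert (sat_to_u32_trunc (INFINITY) == UINT32_MAX);
  assert (sat_to_u32_trunc (NAN) == 0);
  assert (sat_to_s32_trunc (-INFINITY) == INT32_MIN);
}
```

The case the "|| SB_s" test gets wrong is the first assertion: a
negative input to an unsigned conversion must come out as 0.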

The incorrect handling of negative numbers converted to unsigned was
introduced in commit afc0a07d4a283599ac3a6a31d7454e9baaeccca0.  The
rationale given there was a C testcase with cast from float to
unsigned int.  Conversion of out-of-range floating-point numbers to
integer types in C is undefined behavior in the base standard, defined
in Annex F to produce an unspecified value.  That is, the C testcase
used to justify that patch is incorrect - there is no ISO C
requirement for a particular value resulting from this conversion -
and in any case, the correct semantics for such emulation are the
semantics for the instruction (unsigned saturation, which is what it
does in hardware when the emulation is disabled).

The conversion to fixed-point values has its own problems.  That code
doesn't try to do a full emulation; it relies on the trap handler only
being called for arguments that are infinities, NaNs, subnormal or out
of range.  That's fine, but the logic ((vb.wp[1] >> 23) == 0xff &&
((vb.wp[1] & 0x7fffff) > 0)) for NaN detection won't detect negative
NaNs as being NaNs (the same applies for the double-precision case),
and subnormals are mapped to 0 rather than respecting the rounding
mode; the code should also explicitly raise the "invalid" exception.
The code for vectors works by executing the scalar float instruction
with the trapping disabled, meaning at least subnormals won't be
handled correctly.
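
To show why the quoted NaN test misses negative NaNs, here is a small
sketch operating on raw single-precision bit patterns. Both function
names are hypothetical; the masked variant is only one possible
correction for illustration, whereas the patch itself relies on the
classification produced when the operand is unpacked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The test as quoted: after the shift the sign bit sits in bit 8,
   so a negative NaN yields 0x1ff and the comparison fails.  */
bool
is_nan_unmasked (uint32_t w)
{
  return (w >> 23) == 0xff && (w & 0x7fffff) > 0;
}

/* Illustrative variant: mask the sign off before comparing.  */
bool
is_nan_masked (uint32_t w)
{
  return ((w >> 23) & 0xff) == 0xff && (w & 0x7fffff) != 0;
}

void
check_nan_detection (void)
{
  assert (is_nan_unmasked (0x7fc00000u));    /* positive NaN: found */
  assert (!is_nan_unmasked (0xffc00000u));   /* negative NaN: missed */
  assert (is_nan_masked (0xffc00000u));      /* found once masked */
  assert (!is_nan_masked (0xff800000u));     /* -infinity is not a NaN */
}
```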

As well as all those problems in the main emulation code, the rounding
handler - used to emulate rounding upward and downward when not
supported in hardware and when no higher priority exception occurred -
has its own problems.

* It gets called in some cases even for the instructions rounding to
  zero, and then acts according to the current rounding mode when it
  should just leave alone the truncated result provided by hardware.

* It presumes that the result is a single-precision, double-precision
  or single-precision vector as appropriate for the instruction type,
  determines the sign of the result accordingly, and then adjusts the
  result based on that sign and the rounding mode.

  - In the single-precision cases at least the sign determination for
    an integer result is the same as for a floating-point result; in
    the double-precision case, converted to 32-bit integer or fixed
    point, the sign of a double-precision value is in the high part of
    the register but it's the low part of the register that has the
    result of the conversion.

  - If the result is unsigned fixed-point, its sign may be wrongly
    determined as negative (does not actually cause problems, because
    inexact unsigned fixed-point results with the high bit set can
    only appear when converting from double, in which case the sign
    determination is instead wrongly using the high part of the
    register).

  - If the sign of the result is correctly determined as negative, any
    adjustment required to change the truncated result to one correct
    for the rounding mode should be in the opposite direction for
    two's-complement integers as for sign-magnitude floating-point
    values.

  - And if the integer result is zero, the correct sign can only be
    determined by examining the original operand, and not at all (as
    far as I can tell) if the operand and result are the same
    register.
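
The point about the direction of the adjustment can be checked on bit
patterns directly. The following sketch assumes IEEE 754 single
precision for float; float_from_bits is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float (assumes IEEE binary32).  */
float
float_from_bits (uint32_t bits)
{
  float f;

  memcpy (&f, &bits, sizeof f);
  return f;
}

void
check_adjust_direction (void)
{
  int32_t t = -2;

  /* Sign-magnitude float: adding 1 to the bit pattern of a negative
     value increases its magnitude, moving it toward -infinity.  */
  assert (float_from_bits (0xbf800000u) == -1.0f);
  assert (float_from_bits (0xbf800001u) < -1.0f);

  /* Two's-complement integer: adding 1 to the bit pattern of a
     negative value moves it toward zero (-2 becomes -1), so rounding
     a negative result down has to subtract 1 instead.  */
  assert ((uint32_t) t + 1u == (uint32_t) -1);
}
```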

This patch fixes all these problems (as far as possible, given the
inability to determine the correct sign in the rounding handler when
the truncated result is 0, the conversion is to a signed type and the
truncated result has overwritten the original operand).  Conversion to
fixed-point now uses full emulation, and does not use "asm" in the
vector case; the semantics are exactly those of converting to integer
according to the current rounding direction, once the exponent has
been adjusted, so the code makes such an adjustment then uses the
FP_TO_INT_ROUND macros.
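
The exponent adjustment amounts to scaling by 2^31 (signed) or 2^32
(unsigned) before an ordinary integer conversion. Here is a minimal
sketch of that idea in plain C. The helper names are hypothetical, and
truncation stands in for the mode-dependent rounding, so the results
correspond to the round-towards-zero case only.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Signed fractional: 31 fraction bits, range [-1, 1).  */
int32_t
to_sfrac_trunc (double x)
{
  double scaled;

  if (x != x)
    return 0;                       /* NaN converts to zero */
  scaled = x * 2147483648.0;        /* add 31 to the exponent */
  if (scaled >= 2147483648.0)
    return INT32_MAX;               /* saturate at the top */
  if (scaled <= -2147483648.0)
    return INT32_MIN;               /* saturate at the bottom */
  return (int32_t) scaled;          /* integer conversion, truncating */
}

/* Unsigned fractional: 32 fraction bits, range [0, 1).  */
uint32_t
to_ufrac_trunc (double x)
{
  double scaled;

  if (x != x || x <= 0)
    return 0;                       /* NaN and negatives give zero */
  scaled = x * 4294967296.0;        /* add 32 to the exponent */
  if (scaled >= 4294967296.0)
    return UINT32_MAX;
  return (uint32_t) scaled;
}

void
check_fractional (void)
{
  assert (to_sfrac_trunc (0.5) == 0x40000000);
  assert (to_sfrac_trunc (1.0) == INT32_MAX);
  assert (to_sfrac_trunc (-1.0) == INT32_MIN);
  assert (to_ufrac_trunc (0.5) == 0x80000000u);
  assert (to_ufrac_trunc (-0.5) == 0);
  assert (to_sfrac_trunc (NAN) == 0);
}
```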

Signed-off-by: Joseph Myers <joseph@codesourcery.com>

---

I'll send as a followup the testcase I used for verifying that the
instructions (other than the theoretical conversions to 64-bit
integers) produce the correct results.  In addition, this has been
tested with the glibc testsuite (with the e500 port as posted at
<https://sourceware.org/ml/libc-alpha/2013-10/msg00195.html>), where it
improves the libm test results.

The patch depends on my previous patch
<http://lkml.org/lkml/2013/10/4/497> to fix inexactness detection in
the rounding handler.  It does not depend on
<http://lkml.org/lkml/2013/10/4/495> (fix exception clearing),
<http://lkml.org/lkml/2013/10/8/694> (math-emu: fix floating-point to
integer unsigned saturation) or <http://lkml.org/lkml/2013/10/8/700>
(math-emu: fix floating-point to integer overflow detection), in that
I believe it can be applied independently of those other patches
without causing problems, but my testing has been in conjunction with
all those other patches and it may not fully fix all the affected
cases unless they are applied as well.

diff --git a/arch/powerpc/math-emu/math_efp.c b/arch/powerpc/math-emu/math_efp.c
index ecdf35d..01a0abb 100644
--- a/arch/powerpc/math-emu/math_efp.c
+++ b/arch/powerpc/math-emu/math_efp.c
@@ -275,21 +275,13 @@ int do_spe_mathemu(struct pt_regs *regs)
 
 		case EFSCTSF:
 		case EFSCTUF:
-			if (!((vb.wp[1] >> 23) == 0xff && ((vb.wp[1] & 0x7fffff) > 0))) {
-				/* NaN */
-				if (((vb.wp[1] >> 23) & 0xff) == 0) {
-					/* denorm */
-					vc.wp[1] = 0x0;
-				} else if ((vb.wp[1] >> 31) == 0) {
-					/* positive normal */
-					vc.wp[1] = (func == EFSCTSF) ?
-						0x7fffffff : 0xffffffff;
-				} else { /* negative normal */
-					vc.wp[1] = (func == EFSCTSF) ?
-						0x80000000 : 0x0;
-				}
-			} else { /* rB is NaN */
-				vc.wp[1] = 0x0;
+			if (SB_c == FP_CLS_NAN) {
+				vc.wp[1] = 0;
+				FP_SET_EXCEPTION(FP_EX_INVALID);
+			} else {
+				SB_e += (func == EFSCTSF ? 31 : 32);
+				FP_TO_INT_ROUND_S(vc.wp[1], SB, 32,
+						(func == EFSCTSF));
 			}
 			goto update_regs;
 
@@ -306,16 +298,25 @@ int do_spe_mathemu(struct pt_regs *regs)
 		}
 
 		case EFSCTSI:
-		case EFSCTSIZ:
 		case EFSCTUI:
+			if (SB_c == FP_CLS_NAN) {
+				vc.wp[1] = 0;
+				FP_SET_EXCEPTION(FP_EX_INVALID);
+			} else {
+				FP_TO_INT_ROUND_S(vc.wp[1], SB, 32,
+						((func & 0x3) != 0));
+			}
+			goto update_regs;
+
+		case EFSCTSIZ:
 		case EFSCTUIZ:
-			if (func & 0x4) {
-				_FP_ROUND(1, SB);
+			if (SB_c == FP_CLS_NAN) {
+				vc.wp[1] = 0;
+				FP_SET_EXCEPTION(FP_EX_INVALID);
 			} else {
-				_FP_ROUND_ZERO(1, SB);
+				FP_TO_INT_S(vc.wp[1], SB, 32,
+						((func & 0x3) != 0));
 			}
-			FP_TO_INT_S(vc.wp[1], SB, 32,
-					(((func & 0x3) != 0) || SB_s));
 			goto update_regs;
 
 		default:
@@ -404,22 +405,13 @@ cmp_s:
 
 		case EFDCTSF:
 		case EFDCTUF:
-			if (!((vb.wp[0] >> 20) == 0x7ff &&
-			   ((vb.wp[0] & 0xfffff) > 0 || (vb.wp[1] > 0)))) {
-				/* not a NaN */
-				if (((vb.wp[0] >> 20) & 0x7ff) == 0) {
-					/* denorm */
-					vc.wp[1] = 0x0;
-				} else if ((vb.wp[0] >> 31) == 0) {
-					/* positive normal */
-					vc.wp[1] = (func == EFDCTSF) ?
-						0x7fffffff : 0xffffffff;
-				} else { /* negative normal */
-					vc.wp[1] = (func == EFDCTSF) ?
-						0x80000000 : 0x0;
-				}
-			} else { /* NaN */
-				vc.wp[1] = 0x0;
+			if (DB_c == FP_CLS_NAN) {
+				vc.wp[1] = 0;
+				FP_SET_EXCEPTION(FP_EX_INVALID);
+			} else {
+				DB_e += (func == EFDCTSF ? 31 : 32);
+				FP_TO_INT_ROUND_D(vc.wp[1], DB, 32,
+						(func == EFDCTSF));
 			}
 			goto update_regs;
 
@@ -437,21 +429,35 @@ cmp_s:
 
 		case EFDCTUIDZ:
 		case EFDCTSIDZ:
-			_FP_ROUND_ZERO(2, DB);
-			FP_TO_INT_D(vc.dp[0], DB, 64, ((func & 0x1) == 0));
+			if (DB_c == FP_CLS_NAN) {
+				vc.dp[0] = 0;
+				FP_SET_EXCEPTION(FP_EX_INVALID);
+			} else {
+				FP_TO_INT_D(vc.dp[0], DB, 64,
+						((func & 0x1) == 0));
+			}
 			goto update_regs;
 
 		case EFDCTUI:
 		case EFDCTSI:
+			if (DB_c == FP_CLS_NAN) {
+				vc.wp[1] = 0;
+				FP_SET_EXCEPTION(FP_EX_INVALID);
+			} else {
+				FP_TO_INT_ROUND_D(vc.wp[1], DB, 32,
+						((func & 0x3) != 0));
+			}
+			goto update_regs;
+
 		case EFDCTUIZ:
 		case EFDCTSIZ:
-			if (func & 0x4) {
-				_FP_ROUND(2, DB);
+			if (DB_c == FP_CLS_NAN) {
+				vc.wp[1] = 0;
+				FP_SET_EXCEPTION(FP_EX_INVALID);
 			} else {
-				_FP_ROUND_ZERO(2, DB);
+				FP_TO_INT_D(vc.wp[1], DB, 32,
+						((func & 0x3) != 0));
 			}
-			FP_TO_INT_D(vc.wp[1], DB, 32,
-					(((func & 0x3) != 0) || DB_s));
 			goto update_regs;
 
 		default:
@@ -556,37 +562,60 @@ cmp_d:
 			cmp = -1;
 			goto cmp_vs;
 
-		case EVFSCTSF:
-			__asm__ __volatile__ ("mtspr 512, %4\n"
-				"efsctsf %0, %2\n"
-				"efsctsf %1, %3\n"
-				: "=r" (vc.wp[0]), "=r" (vc.wp[1])
-				: "r" (vb.wp[0]), "r" (vb.wp[1]), "r" (0));
-			goto update_regs;
-
 		case EVFSCTUF:
-			__asm__ __volatile__ ("mtspr 512, %4\n"
-				"efsctuf %0, %2\n"
-				"efsctuf %1, %3\n"
-				: "=r" (vc.wp[0]), "=r" (vc.wp[1])
-				: "r" (vb.wp[0]), "r" (vb.wp[1]), "r" (0));
+		case EVFSCTSF:
+			if (SB0_c == FP_CLS_NAN) {
+				vc.wp[0] = 0;
+				FP_SET_EXCEPTION(FP_EX_INVALID);
+			} else {
+				SB0_e += (func == EVFSCTSF ? 31 : 32);
+				FP_TO_INT_ROUND_S(vc.wp[0], SB0, 32,
+						(func == EVFSCTSF));
+			}
+			if (SB1_c == FP_CLS_NAN) {
+				vc.wp[1] = 0;
+				FP_SET_EXCEPTION(FP_EX_INVALID);
+			} else {
+				SB1_e += (func == EVFSCTSF ? 31 : 32);
+				FP_TO_INT_ROUND_S(vc.wp[1], SB1, 32,
+						(func == EVFSCTSF));
+			}
 			goto update_regs;
 
 		case EVFSCTUI:
 		case EVFSCTSI:
+			if (SB0_c == FP_CLS_NAN) {
+				vc.wp[0] = 0;
+				FP_SET_EXCEPTION(FP_EX_INVALID);
+			} else {
+				FP_TO_INT_ROUND_S(vc.wp[0], SB0, 32,
+						((func & 0x3) != 0));
+			}
+			if (SB1_c == FP_CLS_NAN) {
+				vc.wp[1] = 0;
+				FP_SET_EXCEPTION(FP_EX_INVALID);
+			} else {
+				FP_TO_INT_ROUND_S(vc.wp[1], SB1, 32,
+						((func & 0x3) != 0));
+			}
+			goto update_regs;
+
 		case EVFSCTUIZ:
 		case EVFSCTSIZ:
-			if (func & 0x4) {
-				_FP_ROUND(1, SB0);
-				_FP_ROUND(1, SB1);
+			if (SB0_c == FP_CLS_NAN) {
+				vc.wp[0] = 0;
+				FP_SET_EXCEPTION(FP_EX_INVALID);
 			} else {
-				_FP_ROUND_ZERO(1, SB0);
-				_FP_ROUND_ZERO(1, SB1);
+				FP_TO_INT_S(vc.wp[0], SB0, 32,
+						((func & 0x3) != 0));
+			}
+			if (SB1_c == FP_CLS_NAN) {
+				vc.wp[1] = 0;
+				FP_SET_EXCEPTION(FP_EX_INVALID);
+			} else {
+				FP_TO_INT_S(vc.wp[1], SB1, 32,
+						((func & 0x3) != 0));
 			}
-			FP_TO_INT_S(vc.wp[0], SB0, 32,
-					(((func & 0x3) != 0) || SB0_s));
-			FP_TO_INT_S(vc.wp[1], SB1, 32,
-					(((func & 0x3) != 0) || SB1_s));
 			goto update_regs;
 
 		default:
@@ -681,14 +710,16 @@ int speround_handler(struct pt_regs *regs)
 	union dw_union fgpr;
 	int s_lo, s_hi;
 	int lo_inexact, hi_inexact;
-	unsigned long speinsn, type, fc, fptype;
+	int fp_result;
+	unsigned long speinsn, type, fb, fc, fptype, func;
 
 	if (get_user(speinsn, (unsigned int __user *) regs->nip))
 		return -EFAULT;
 	if ((speinsn >> 26) != 4)
 		return -EINVAL;         /* not an spe instruction */
 
-	type = insn_type(speinsn & 0x7ff);
+	func = speinsn & 0x7ff;
+	type = insn_type(func);
 	if (type == XCR) return -ENOSYS;
 
 	__FPU_FPSCR = mfspr(SPRN_SPEFSCR);
@@ -708,6 +739,65 @@ int speround_handler(struct pt_regs *regs)
 	fgpr.wp[0] = current->thread.evr[fc];
 	fgpr.wp[1] = regs->gpr[fc];
 
+	fb = (speinsn >> 11) & 0x1f;
+	switch (func) {
+	case EFSCTUIZ:
+	case EFSCTSIZ:
+	case EVFSCTUIZ:
+	case EVFSCTSIZ:
+	case EFDCTUIDZ:
+	case EFDCTSIDZ:
+	case EFDCTUIZ:
+	case EFDCTSIZ:
+		/*
+		 * These instructions always round to zero,
+		 * independent of the rounding mode.
+		 */
+		return 0;
+
+	case EFSCTUI:
+	case EFSCTUF:
+	case EVFSCTUI:
+	case EVFSCTUF:
+	case EFDCTUI:
+	case EFDCTUF:
+		fp_result = 0;
+		s_lo = 0;
+		s_hi = 0;
+		break;
+
+	case EFSCTSI:
+	case EFSCTSF:
+		fp_result = 0;
+		/* Recover the sign of a zero result if possible.  */
+		if (fgpr.wp[1] == 0)
+			s_lo = regs->gpr[fb] & SIGN_BIT_S;
+		break;
+
+	case EVFSCTSI:
+	case EVFSCTSF:
+		fp_result = 0;
+		/* Recover the sign of a zero result if possible.  */
+		if (fgpr.wp[1] == 0)
+			s_lo = regs->gpr[fb] & SIGN_BIT_S;
+		if (fgpr.wp[0] == 0)
+			s_hi = current->thread.evr[fb] & SIGN_BIT_S;
+		break;
+
+	case EFDCTSI:
+	case EFDCTSF:
+		fp_result = 0;
+		s_hi = s_lo;
+		/* Recover the sign of a zero result if possible.  */
+		if (fgpr.wp[1] == 0)
+			s_hi = current->thread.evr[fb] & SIGN_BIT_S;
+		break;
+
+	default:
+		fp_result = 1;
+		break;
+	}
+
 	pr_debug("round fgpr: %08x  %08x\n", fgpr.wp[0], fgpr.wp[1]);
 
 	switch (fptype) {
@@ -719,15 +809,30 @@ int speround_handler(struct pt_regs *regs)
 		if ((FP_ROUNDMODE) == FP_RND_PINF) {
 			if (!s_lo) fgpr.wp[1]++; /* Z > 0, choose Z1 */
 		} else { /* round to -Inf */
-			if (s_lo) fgpr.wp[1]++; /* Z < 0, choose Z2 */
+			if (s_lo) {
+				if (fp_result)
+					fgpr.wp[1]++; /* Z < 0, choose Z2 */
+				else
+					fgpr.wp[1]--; /* Z < 0, choose Z2 */
+			}
 		}
 		break;
 
 	case DPFP:
 		if (FP_ROUNDMODE == FP_RND_PINF) {
-			if (!s_hi) fgpr.dp[0]++; /* Z > 0, choose Z1 */
+			if (!s_hi) {
+				if (fp_result)
+					fgpr.dp[0]++; /* Z > 0, choose Z1 */
+				else
+					fgpr.wp[1]++; /* Z > 0, choose Z1 */
+			}
 		} else { /* round to -Inf */
-			if (s_hi) fgpr.dp[0]++; /* Z < 0, choose Z2 */
+			if (s_hi) {
+				if (fp_result)
+					fgpr.dp[0]++; /* Z < 0, choose Z2 */
+				else
+					fgpr.wp[1]--; /* Z < 0, choose Z2 */
+			}
 		}
 		break;
 
@@ -738,10 +843,18 @@ int speround_handler(struct pt_regs *regs)
 			if (hi_inexact && !s_hi)
 				fgpr.wp[0]++; /* Z_high word > 0, choose Z1 */
 		} else { /* round to -Inf */
-			if (lo_inexact && s_lo)
-				fgpr.wp[1]++; /* Z_low < 0, choose Z2 */
-			if (hi_inexact && s_hi)
-				fgpr.wp[0]++; /* Z_high < 0, choose Z2 */
+			if (lo_inexact && s_lo) {
+				if (fp_result)
+					fgpr.wp[1]++; /* Z_low < 0, choose Z2 */
+				else
+					fgpr.wp[1]--; /* Z_low < 0, choose Z2 */
+			}
+			if (hi_inexact && s_hi) {
+				if (fp_result)
+					fgpr.wp[0]++; /* Z_high < 0, choose Z2 */
+				else
+					fgpr.wp[0]--; /* Z_high < 0, choose Z2 */
+			}
 		}
 		break;
 

-- 
Joseph S. Myers
joseph@codesourcery.com
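
For readers following the round-to-minus-infinity hunks above, here is a
rough userspace sketch of why an integer result (fp_result == 0) is
decremented while a floating-point result is incremented. The helper names
are hypothetical and are not part of the patch; the sketch assumes a 32-bit
IEEE single-precision float.

```c
#include <stdint.h>
#include <string.h>

/* Two's-complement integer: moving toward -Inf means subtracting 1. */
static int32_t int_toward_minus_inf(int32_t truncated)
{
	return truncated - 1;
}

/*
 * IEEE single (sign-magnitude): for a negative value, adding 1 to the
 * raw bit pattern grows the magnitude, which also moves toward -Inf.
 */
static float float_toward_minus_inf(float truncated)
{
	uint32_t bits;

	memcpy(&bits, &truncated, sizeof(bits));
	bits++;
	memcpy(&truncated, &bits, sizeof(bits));
	return truncated;
}
```

Both helpers only make sense for a negative, finite, inexact result, which
is the case the s_lo/s_hi checks in the handler select.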

^ permalink raw reply related

* Re: [PATCH 1/7] powerpc: Add interface to get msi region information
From: Scott Wood @ 2013-10-08 23:35 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Alexander Graf, Joerg Roedel, linux-pci@vger.kernel.org,
	linux-kernel@vger.kernel.org, open list:INTEL IOMMU (VT-d),
	Bharat Bhushan, alex.williamson@redhat.com, Bharat Bhushan,
	linuxppc-dev
In-Reply-To: <CAErSpo7+7SHVcOJkHkW5cjb6pN+bqYeRdPBA=MVX0RLz=-pP6Q@mail.gmail.com>

On Tue, 2013-10-08 at 17:25 -0600, Bjorn Helgaas wrote:
> >> -     u32 msiir_offset; /* Offset of MSIIR, relative to start of CCSR */
> >> +     dma_addr_t msiir; /* MSIIR Address in CCSR */
> >
> > Are you sure dma_addr_t is right here, versus phys_addr_t?  It implies
> > that it's the output of the DMA API, but I don't think the DMA API is
> > used in the MSI driver.  Perhaps it should be, but we still want the raw
> > physical address to pass on to VFIO.
> 
> I don't know what "msiir" is used for, but if it's an address you
> program into a PCI device, then it's a dma_addr_t even if you didn't
> get it from the DMA API.  Maybe "bus_addr_t" would have been a more
> suggestive name than "dma_addr_t".  That said, I have no idea how this
> relates to VFIO.

It's a bit awkward because it gets used both as something to program
into a PCI device (and it's probably a bug that the DMA API doesn't get
used), and also (if I understand the current plans correctly) as a
physical address to give to VFIO to be a destination address in an IOMMU
mapping.  So I think the value we keep here should be a phys_addr_t (it
comes straight from the MMIO address in the device tree), which gets
trivially turned into a dma_addr_t by the non-VFIO code path because
there's currently no translation there.

-Scott

^ permalink raw reply

* Re: [PATCH 1/2][v2] pci: fsl: derive the common PCI driver to drivers/pci/host
From: Benjamin Herrenschmidt @ 2013-10-08 23:31 UTC (permalink / raw)
  To: Scott Wood
  Cc: linux-pci@vger.kernel.org, Zang Roy-R61911, Minghuan Lian,
	Paul Mackerras, Bjorn Helgaas, linuxppc-dev
In-Reply-To: <1381274440.7979.309.camel@snotra.buserror.net>

On Tue, 2013-10-08 at 18:20 -0500, Scott Wood wrote:
> > So I'll apply these given an ack from the powerpc folks.
> 
> ACK this patch.  The second one I'd like to see broken up into
> digestible chunks so I can better review it.

Bjorn, for such FSL-only stuff, Scott's ack is enough; don't wait for
mine :-)

Cheers,
Ben.

^ permalink raw reply

* Re: [PATCH 1/7] powerpc: Add interface to get msi region information
From: Bjorn Helgaas @ 2013-10-08 23:25 UTC (permalink / raw)
  To: Scott Wood
  Cc: Alexander Graf, Joerg Roedel, linux-pci@vger.kernel.org,
	linux-kernel@vger.kernel.org, open list:INTEL IOMMU (VT-d),
	Bharat Bhushan, alex.williamson@redhat.com, Bharat Bhushan,
	linuxppc-dev
In-Reply-To: <1381273037.7979.298.camel@snotra.buserror.net>

>> -     u32 msiir_offset; /* Offset of MSIIR, relative to start of CCSR */
>> +     dma_addr_t msiir; /* MSIIR Address in CCSR */
>
> Are you sure dma_addr_t is right here, versus phys_addr_t?  It implies
> that it's the output of the DMA API, but I don't think the DMA API is
> used in the MSI driver.  Perhaps it should be, but we still want the raw
> physical address to pass on to VFIO.

I don't know what "msiir" is used for, but if it's an address you
program into a PCI device, then it's a dma_addr_t even if you didn't
get it from the DMA API.  Maybe "bus_addr_t" would have been a more
suggestive name than "dma_addr_t".  That said, I have no idea how this
relates to VFIO.

Bjorn

^ permalink raw reply

* [PATCH] math-emu: fix floating-point to integer overflow detection
From: Joseph S. Myers @ 2013-10-08 23:24 UTC (permalink / raw)
  To: linux-kernel; +Cc: linuxppc-dev

From: Joseph Myers <joseph@codesourcery.com>

On overflow, the math-emu macro _FP_TO_INT_ROUND tries to saturate its
result (subject to the value of rsigned specifying the desired
overflow semantics).  However, if the rounding step has the effect of
increasing the exponent so as to cause overflow (if the rounded result
is 1 larger than the largest positive value with the given number of
bits, allowing for signedness), the overflow does not get detected,
meaning that for unsigned results 0 is produced instead of the maximum
unsigned integer with the given number of bits, without an exception
being raised for overflow, and that for signed results the minimum
(negative) value is produced instead of the maximum (positive) value,
again without an exception.  This patch makes the code check for
rounding increasing the exponent and adjusts the exponent value as
needed for the overflow check.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>

---

This macro is not present in the glibc/libgcc version of the code.
This patch is independent of my separate patch
<http://lkml.org/lkml/2013/10/8/694> to fix the results for unsigned
saturation, although you need both patches together to get the correct
results for the affected unsigned overflow case.  It remains the case
both before and after this patch that the conversions wrongly treat a
signed result of the most negative integer as an overflow, when
actually only that integer minus 1 or smaller should be an overflow,
although this only means an incorrect exception rather than affecting
the value returned; that was one of the bugs I fixed in the
glibc/libgcc version of this code in 2006 (as part of a major overhaul
of the code including various interface changes, so not trivially
backportable to the kernel version).
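
As an illustration only, the leading-zero comparison can be modelled in
userspace like this. The names below are hypothetical, WORKBITS is an
assumed stand-in for _FP_WORKBITS, and the rounding is simplified to
"add half an integer unit"; the real macro uses _FP_ROUND and the
_FP_FRAC_CLZ_* helpers on the unpacked fraction.

```c
#include <stdint.h>

#define WORKBITS 3	/* assumed number of guard bits below the integer part */

/* Count leading zeros of a nonzero 64-bit value. */
static int clz64(uint64_t v)
{
	int n = 0;

	while (!(v & (UINT64_C(1) << 63))) {
		v <<= 1;
		n++;
	}
	return n;
}

/*
 * Return 1 if rounding carried into a new top bit, i.e. the case in
 * which the exponent has to be bumped before the overflow check.
 * frac holds the integer part shifted left by WORKBITS, must be
 * nonzero, and must be small enough that the addition cannot wrap.
 */
static int rounding_grew(uint64_t frac)
{
	int lz0 = clz64(frac);
	uint64_t rounded = frac + (UINT64_C(1) << (WORKBITS - 1));
	int lz1 = clz64(rounded);

	return lz1 < lz0;
}
```

With an input modelling 4294967295.5 (0xffffffff plus a half), rounding
carries into bit 32 of the integer part, which is the case that previously
escaped detection for 32-bit results.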

diff --git a/include/math-emu/op-common.h b/include/math-emu/op-common.h
index 9696a5e..6bdf8c6 100644
--- a/include/math-emu/op-common.h
+++ b/include/math-emu/op-common.h
@@ -743,12 +743,17 @@ do {									\
 	  }									\
 	else									\
 	  {									\
+	    int _lz0, _lz1;							\
 	    if (X##_e <= -_FP_WORKBITS - 1)					\
 	      _FP_FRAC_SET_##wc(X, _FP_MINFRAC_##wc);				\
 	    else								\
 	      _FP_FRAC_SRS_##wc(X, _FP_FRACBITS_##fs - 1 - X##_e,		\
 				_FP_WFRACBITS_##fs);				\
+	    _FP_FRAC_CLZ_##wc(_lz0, X);						\
 	    _FP_ROUND(wc, X);							\
+	    _FP_FRAC_CLZ_##wc(_lz1, X);						\
+	    if (_lz1 < _lz0)							\
+	      X##_e++; /* For overflow detection.  */				\
 	    _FP_FRAC_SRL_##wc(X, _FP_WORKBITS);					\
 	    _FP_FRAC_ASSEMBLE_##wc(r, X, rsize);				\
 	  }									\

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply related

* Re: [PATCH 1/2][v2] pci: fsl: derive the common PCI driver to drivers/pci/host
From: Scott Wood @ 2013-10-08 23:20 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci@vger.kernel.org, Zang Roy-R61911, Minghuan Lian,
	Paul Mackerras, linuxppc-dev
In-Reply-To: <CAErSpo76_w+wK76m73kkdYS1PouhYcGfZus_FYF5E2FHJTA6gQ@mail.gmail.com>

On Tue, 2013-10-08 at 17:09 -0600, Bjorn Helgaas wrote:
> On Tue, Oct 8, 2013 at 4:46 PM, Scott Wood <scottwood@freescale.com> wrote:
> > On Tue, 2013-10-08 at 13:13 -0600, Bjorn Helgaas wrote:
> >> [+cc Ben, Paul, linuxppc-dev]
> >>
> >> On Mon, Sep 30, 2013 at 04:52:54PM +0800, Minghuan Lian wrote:
> >> > The Freescale's Layerscape series processors will use ARM cores.
> >> > The LS1's PCIe controllers is the same as T4240's. So it's better
> >> > the PCIe controller driver can support PowerPC and ARM
> >> > simultaneously. This patch is for this purpose. It derives
> >> > the common functions from arch/powerpc/sysdev/fsl_pci.c to
> >> > drivers/pci/host/pci-fsl-common.c and leaves the architecture
> >> > specific functions which should be implemented in arch related files.
> >> >
> >> > Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com>
> >>
> >> I cc'd the powerpc maintainers so we can work out which tree this
> >> should go through.
> >>
> >> > ---
> >> > change log:
> >> > v1-v2:
> >> > 1. rename pci.h to pci-common.h
> >> > 2. rename pci-fsl.c to pci-fsl-common.c
> >> >
> >> > Based on upstream master.
> >> > Based on the discussion of RFC version here
> >> > http://patchwork.ozlabs.org/patch/274487/
> >> >
> >> >  arch/powerpc/sysdev/fsl_pci.c                      | 521 +-----------------
> >> >  arch/powerpc/sysdev/fsl_pci.h                      |  89 ----
> >> >  .../fsl_pci.c => drivers/pci/host/pci-fsl-common.c | 591 +--------------------
> >> >  .../fsl_pci.h => include/linux/fsl/pci-common.h    |  45 +-
> >>
> >> Is there any way to avoid putting this file in include/linux?  I know
> >> you want to share it beyond PowerPC, and I know there are similar
> >> examples there already, but this is all arch-specific or
> >> chipset-specific stuff that seems like it should be in some
> >> not-so-public place.  It doesn't seem scalable to add an include/linux
> >> subdirectory for every chipset that might be shared across
> >> architectures.
> >
> > What specifically is the problem with it, as long as it's properly
> > namespaced?
> 
> Well, as I said above, it doesn't seem scalable,

I'm not sure what scaling problems you're picturing, assuming proper
namespacing and organization within include/linux/.

>  and it doesn't seem to be the common existing practice. 
>
> Possibly this is just because sharing chipsets across arches isn't very common yet.
> 
> I hadn't noticed that include/linux/fsl exists already; I thought you
> were adding it.  Given that it *does* exist already, I guess I'm OK
> with putting more stuff in it.

I see other existing practice as well.  Besides plenty of
"include/linux/fsl*" that ought to be moved to "include/linux/fsl/", I
see things like include/linux/amba/, include/linux/scx200*,
include/linux/clksrc-dbx500-prcmu.h, include/linux/com202020.h, etc.
These are just a few random examples out of many.

> So I'll apply these given an ack from the powerpc folks.

ACK this patch.  The second one I'd like to see broken up into
digestible chunks so I can better review it.

-Scott

^ permalink raw reply

* [PATCH] math-emu: fix floating-point to integer unsigned saturation
From: Joseph S. Myers @ 2013-10-08 23:12 UTC (permalink / raw)
  To: linux-kernel; +Cc: linuxppc-dev

From: Joseph Myers <joseph@codesourcery.com>

The math-emu macros _FP_TO_INT and _FP_TO_INT_ROUND are supposed to
saturate their results for out-of-range arguments, except in the case
rsigned == 2 (when instead the low bits of the result are taken).
However, in the case rsigned == 0 (converting to unsigned integers),
they mistakenly produce 0 for positive results and the maximum
unsigned integer for negative results, the opposite of correct
unsigned saturation.  This patch fixes the logic.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>

---

I intend to make the corresponding changes to the glibc/libgcc copy of
this code, given that it would be desirable to resync the Linux and
glibc/libgcc copies (the latter has had many enhancements and bug
fixes since it was copied into Linux), although strictly this
incorrect saturation is only a bug when trying to emulate particular
instruction semantics, not when used in userspace to implement C
operations where the results of out-of-range conversions are
unspecified or undefined.
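
For reference, a minimal userspace sketch of the intended unsigned
saturation. The helper sat_to_u32 is hypothetical and not part of
math-emu; it takes a double instead of the unpacked sign and exponent
fields that the macros operate on, and it returns 0 for NaN purely as a
choice made for this sketch.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert to a 32-bit unsigned integer with the
 * saturation expected for rsigned == 0.
 */
static uint32_t sat_to_u32(double x)
{
	if (x != x)			/* NaN: this sketch returns 0 */
		return 0;
	if (x <= -1.0)			/* negative out of range: saturate to 0 */
		return 0;
	if (x >= 4294967296.0)		/* positive out of range: saturate to max */
		return UINT32_MAX;
	return (uint32_t)x;		/* in range: truncate toward zero */
}
```

Before the fix, the two out-of-range cases were effectively swapped.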

diff --git a/include/math-emu/op-common.h b/include/math-emu/op-common.h
index 9696a5e..70fe5e9 100644
--- a/include/math-emu/op-common.h
+++ b/include/math-emu/op-common.h
@@ -685,7 +685,7 @@ do {									\
 	    else								\
 	      {									\
 		r = 0;								\
-		if (X##_s)							\
+		if (!X##_s)							\
 		  r = ~r;							\
 	      }									\
 	    FP_SET_EXCEPTION(FP_EX_INVALID);					\
@@ -762,7 +762,7 @@ do {									\
 	    if (!rsigned)							\
 	      {									\
 		r = 0;								\
-		if (X##_s)							\
+		if (!X##_s)							\
 		  r = ~r;							\
 	      }									\
 	    else if (rsigned != 2)						\

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply related

* Re: [PATCH 1/2][v2] pci: fsl: derive the common PCI driver to drivers/pci/host
From: Bjorn Helgaas @ 2013-10-08 23:09 UTC (permalink / raw)
  To: Scott Wood
  Cc: linux-pci@vger.kernel.org, Zang Roy-R61911, Minghuan Lian,
	Paul Mackerras, linuxppc-dev
In-Reply-To: <1381272382.7979.292.camel@snotra.buserror.net>

On Tue, Oct 8, 2013 at 4:46 PM, Scott Wood <scottwood@freescale.com> wrote:
> On Tue, 2013-10-08 at 13:13 -0600, Bjorn Helgaas wrote:
>> [+cc Ben, Paul, linuxppc-dev]
>>
>> On Mon, Sep 30, 2013 at 04:52:54PM +0800, Minghuan Lian wrote:
>> > The Freescale's Layerscape series processors will use ARM cores.
>> > The LS1's PCIe controllers is the same as T4240's. So it's better
>> > the PCIe controller driver can support PowerPC and ARM
>> > simultaneously. This patch is for this purpose. It derives
>> > the common functions from arch/powerpc/sysdev/fsl_pci.c to
>> > drivers/pci/host/pci-fsl-common.c and leaves the architecture
>> > specific functions which should be implemented in arch related files.
>> >
>> > Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com>
>>
>> I cc'd the powerpc maintainers so we can work out which tree this
>> should go through.
>>
>> > ---
>> > change log:
>> > v1-v2:
>> > 1. rename pci.h to pci-common.h
>> > 2. rename pci-fsl.c to pci-fsl-common.c
>> >
>> > Based on upstream master.
>> > Based on the discussion of RFC version here
>> > http://patchwork.ozlabs.org/patch/274487/
>> >
>> >  arch/powerpc/sysdev/fsl_pci.c                      | 521 +-----------------
>> >  arch/powerpc/sysdev/fsl_pci.h                      |  89 ----
>> >  .../fsl_pci.c => drivers/pci/host/pci-fsl-common.c | 591 +--------------------
>> >  .../fsl_pci.h => include/linux/fsl/pci-common.h    |  45 +-
>>
>> Is there any way to avoid putting this file in include/linux?  I know
>> you want to share it beyond PowerPC, and I know there are similar
>> examples there already, but this is all arch-specific or
>> chipset-specific stuff that seems like it should be in some
>> not-so-public place.  It doesn't seem scalable to add an include/linux
>> subdirectory for every chipset that might be shared across
>> architectures.
>
> What specifically is the problem with it, as long as it's properly
> namespaced?

Well, as I said above, it doesn't seem scalable, and it doesn't seem
to be the common existing practice.  Possibly this is just because
sharing chipsets across arches isn't very common yet.

I hadn't noticed that include/linux/fsl exists already; I thought you
were adding it.  Given that it *does* exist already, I guess I'm OK
with putting more stuff in it.

So I'll apply these given an ack from the powerpc folks.

Bjorn

^ permalink raw reply

* RE: [PATCH RFC 63/77] qlcnic: Update MSI/MSI-X interrupts enablement code
From: Himanshu Madhani @ 2013-10-08 22:46 UTC (permalink / raw)
  To: Alexander Gordeev, linux-kernel
  Cc: linux-mips@linux-mips.org, VMware, Inc.,
	linux-nvme@lists.infradead.org, linux-ide@vger.kernel.org,
	linux-s390@vger.kernel.org, Andy King, linux-scsi,
	linux-rdma@vger.kernel.org, x86@kernel.org, Ingo Molnar,
	linux-pci, iss_storagedev@hp.com, Dept-Eng Linux Driver,
	Tejun Heo, Bjorn Helgaas, Dan Williams, Jon Mason,
	Solarflare linux maintainers, netdev, Ralf Baechle,
	e1000-devel@lists.sourceforge.net, Martin Schwidefsky,
	linux390@de.ibm.com, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <c92efbde96541d08f37510422c096d543bb01279.1380703263.git.agordeev@redhat.com>

> -----Original Message-----
> From: Alexander Gordeev [mailto:agordeev@redhat.com]
> Sent: Wednesday, October 02, 2013 3:49 AM
> To: linux-kernel
> Cc: Alexander Gordeev; Bjorn Helgaas; Ralf Baechle; Michael Ellerman;
> Benjamin Herrenschmidt; Martin Schwidefsky; Ingo Molnar; Tejun Heo; Dan
> Williams; Andy King; Jon Mason; Matt Porter; linux-pci; linux-mips@linux-
> mips.org; linuxppc-dev@lists.ozlabs.org; linux390@de.ibm.com; linux-
> s390@vger.kernel.org; x86@kernel.org; linux-ide@vger.kernel.org;
> iss_storagedev@hp.com; linux-nvme@lists.infradead.org; linux-
> rdma@vger.kernel.org; netdev; e1000-devel@lists.sourceforge.net; Dept-
> Eng Linux Driver; Solarflare linux maintainers; VMware, Inc.; linux-scsi
> Subject: [PATCH RFC 63/77] qlcnic: Update MSI/MSI-X interrupts enablement
> code
>
> As result of recent re-design of the MSI/MSI-X interrupts enabling pattern
> this driver has to be updated to use the new technique to obtain a optimal
> number of MSI/MSI-X interrupts required.
>
 "We will test this change for the driver and provide feedback."

> Signed-off-by: Alexander Gordeev <agordeev@redhat.com>

Thanks,
Himanshu

^ permalink raw reply

* Re: [PATCH 1/7] powerpc: Add interface to get msi region information
From: Scott Wood @ 2013-10-08 22:57 UTC (permalink / raw)
  To: Bharat Bhushan
  Cc: agraf, joro, linux-kernel, iommu, Bharat Bhushan, alex.williamson,
	linux-pci, linuxppc-dev
In-Reply-To: <1379575763-2091-2-git-send-email-Bharat.Bhushan@freescale.com>

On Thu, 2013-09-19 at 12:59 +0530, Bharat Bhushan wrote:
> @@ -376,6 +405,7 @@ static int fsl_of_msi_probe(struct platform_device *dev)
>  	int len;
>  	u32 offset;
>  	static const u32 all_avail[] = { 0, NR_MSI_IRQS };
> +	static int bank_index;
>  
>  	match = of_match_device(fsl_of_msi_ids, &dev->dev);
>  	if (!match)
> @@ -419,8 +449,8 @@ static int fsl_of_msi_probe(struct platform_device *dev)
>  				dev->dev.of_node->full_name);
>  			goto error_out;
>  		}
> -		msi->msiir_offset =
> -			features->msiir_offset + (res.start & 0xfffff);
> +		msi->msiir = res.start + features->msiir_offset;
> +		printk("msi->msiir = %llx\n", msi->msiir);

dev_dbg or remove

>  	}
>  
>  	msi->feature = features->fsl_pic_ip;
> @@ -470,6 +500,7 @@ static int fsl_of_msi_probe(struct platform_device *dev)
>  		}
>  	}
>  
> +	msi->bank_index = bank_index++;

What if multiple MSIs are being probed in parallel?  bank_index is not
atomic.
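
As a rough userspace analogue of what an atomic counter would look like
(C11 atomics here; in the kernel one would presumably use atomic_t with
atomic_inc_return() or take a lock instead). The names are hypothetical
and not from the patch.

```c
#include <stdatomic.h>

/* Stand-in for the static counter in the probe routine. */
static atomic_int next_bank_index;

/* Hand out 0, 1, 2, ... without losing updates under concurrency. */
static int alloc_bank_index(void)
{
	return atomic_fetch_add(&next_bank_index, 1);
}
```

atomic_fetch_add returns the value held before the increment, so the first
caller gets 0 and no two callers can get the same index.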

> diff --git a/arch/powerpc/sysdev/fsl_msi.h b/arch/powerpc/sysdev/fsl_msi.h
> index 8225f86..6bd5cfc 100644
> --- a/arch/powerpc/sysdev/fsl_msi.h
> +++ b/arch/powerpc/sysdev/fsl_msi.h
> @@ -29,12 +29,19 @@ struct fsl_msi {
>  	struct irq_domain *irqhost;
>  
>  	unsigned long cascade_irq;
> -
> -	u32 msiir_offset; /* Offset of MSIIR, relative to start of CCSR */
> +	dma_addr_t msiir; /* MSIIR Address in CCSR */

Are you sure dma_addr_t is right here, versus phys_addr_t?  It implies
that it's the output of the DMA API, but I don't think the DMA API is
used in the MSI driver.  Perhaps it should be, but we still want the raw
physical address to pass on to VFIO.

>  	void __iomem *msi_regs;
>  	u32 feature;
>  	int msi_virqs[NR_MSI_REG];
>  
> +	/*
> +	 * During probe each bank is assigned a index number.
> +	 * index number ranges from 0 to 2^32.
> +	 * Example  MSI bank 1 = 0
> +	 * MSI bank 2 = 1, and so on.
> +	 */
> +	int bank_index;

2^32 doesn't fit in "int" (nor does 2^32 - 1).

Just say that indices start at 0.

-Scott

^ permalink raw reply

* Re: [PATCH 1/2][v2] pci: fsl: derive the common PCI driver to drivers/pci/host
From: Scott Wood @ 2013-10-08 22:46 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci@vger.kernel.org, Zang Roy-R61911, Minghuan Lian,
	Paul Mackerras, linuxppc-dev
In-Reply-To: <CAErSpo7LXqXmrw1gPwKHsJkKaAcDr2W_+tcC8b9T_5Mab1Arnw@mail.gmail.com>

On Tue, 2013-10-08 at 13:13 -0600, Bjorn Helgaas wrote:
> [+cc Ben, Paul, linuxppc-dev]
> 
> On Mon, Sep 30, 2013 at 04:52:54PM +0800, Minghuan Lian wrote:
> > The Freescale's Layerscape series processors will use ARM cores.
> > The LS1's PCIe controllers is the same as T4240's. So it's better
> > the PCIe controller driver can support PowerPC and ARM
> > simultaneously. This patch is for this purpose. It derives
> > the common functions from arch/powerpc/sysdev/fsl_pci.c to
> > drivers/pci/host/pci-fsl-common.c and leaves the architecture
> > specific functions which should be implemented in arch related files.
> >
> > Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com>
> 
> I cc'd the powerpc maintainers so we can work out which tree this
> should go through.
> 
> > ---
> > change log:
> > v1-v2:
> > 1. rename pci.h to pci-common.h
> > 2. rename pci-fsl.c to pci-fsl-common.c
> >
> > Based on upstream master.
> > Based on the discussion of RFC version here
> > http://patchwork.ozlabs.org/patch/274487/
> >
> >  arch/powerpc/sysdev/fsl_pci.c                      | 521 +-----------------
> >  arch/powerpc/sysdev/fsl_pci.h                      |  89 ----
> >  .../fsl_pci.c => drivers/pci/host/pci-fsl-common.c | 591 +--------------------
> >  .../fsl_pci.h => include/linux/fsl/pci-common.h    |  45 +-
> 
> Is there any way to avoid putting this file in include/linux?  I know
> you want to share it beyond PowerPC, and I know there are similar
> examples there already, but this is all arch-specific or
> chipset-specific stuff that seems like it should be in some
> not-so-public place.  It doesn't seem scalable to add an include/linux
> subdirectory for every chipset that might be shared across
> architectures.

What specifically is the problem with it, as long as it's properly
namespaced?

-Scott

^ permalink raw reply

* Re: Elbc device driver
From: Scott Wood @ 2013-10-08 22:34 UTC (permalink / raw)
  To: Mercier Ivan; +Cc: linuxppc-dev
In-Reply-To: <CAMc2ieqYDBE8HYzmEvfKxw8o7AdYVGN_ew6tvsn_TsRcpEgL8g@mail.gmail.com>

On Tue, 2013-10-08 at 16:06 +0200, Mercier Ivan wrote:
> Hi,
> 
> I'm working on a powerpc qoriq p3041 and trying to communicate with a
> device by elbc bus in gpmc mode.
> 
> I 've integrated CONFIG_FSL_LBC in Linux which provide the basic functions.
> 
> Now I'm wondering how can I do read and write operations on the
> bus.Where is mapped my device?

You'll need to use ioremap() or of_iomap() to map it.

> Should I code .read and .write driver functions?How can I start?
> 
> How integrates my device in the device tree?

See Documentation/devicetree/bindings/powerpc/fsl/lbc.txt and examples
such as "board-control" in various device trees.

-Scott

^ permalink raw reply

* Re: Gianfar driver crashes in Kernel v3.10
From: Scott Wood @ 2013-10-08 22:09 UTC (permalink / raw)
  To: Thomas Hühn
  Cc: linuxppc-dev@lists.ozlabs.org, claudiu.manoil@freescale.com
In-Reply-To: <8EF35D6A-A132-458C-A3B4-80D4D2C5BA4C@dai-labor.de>

On Fri, 2013-10-04 at 12:03 +0000, Thomas Hühn wrote:
> [code]
> [ 2671.841927] Oops: Exception in kernel mode, sig: 5 [#1]
> [ 2671.847141] Freescale P1014
> [ 2671.849925] Modules linked in: ath9k pppoe ppp_async iptable_nat ath9k_common pppox p
> e xt_tcpudp xt_tcpmss xt_string xt_statistic xt_state xt_recent xt_quota xt_pkttype xt_o
> mark xt_connbytes xt_comment xt_addrtype xt_TCPMSS xt_REDIRECT xt_NETMAP xt_LOG xt_IPMAR
> ms_datafab ums_cypress ums_alauda slhc nf_nat_tftp nf_nat_snmp_basic nf_nat_sip nf_nat_r
> ntrack_sip nf_conntrack_rtsp nf_conntrack_proto_gre nf_conntrack_irc nf_conntrack_h323 n
>  compat_xtables compat ath sch_teql sch_tbf sch_sfq sch_red sch_prio sch_htb sch_gred sc
> skbedit act_mirred em_u32 cls_u32 cls_tcindex cls_flow cls_route cls_fw sch_hfsc sch_ing
> r usb_storage leds_gpio ohci_hcd ehci_platform ehci_hcd sd_mod scsi_mod fsl_mph_dr_of gp
> [ 2671.988946] CPU: 0 PID: 5209 Comm: iftop Not tainted 3.10.13 #2
> [ 2671.994859] task: c4b22220 ti: c7ff8000 task.ti: c477e000
> [ 2672.000250] NIP: c018c7a0 LR: c018c794 CTR: c000b070
> [ 2672.005206] REGS: c7ff9f10 TRAP: 3202   Not tainted  (3.10.13)

Trap 0x3202 is a watchdog timer.

Did you get a "Bad trap at..." line before the above dump?  Do you have
any idea why the watchdog would have been armed without CONFIG_BOOKE_WDT
being set?  Is CONFIG_BOOKE_WDT set?

-Scott

^ permalink raw reply

* Re: [PATCH] powerpc/powernv: Reduce panic timeout from 180s to 10s
From: Scott Wood @ 2013-10-08 21:52 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev, paulus, Anton Blanchard
In-Reply-To: <20131001083918.GD27484@concordia>

On Tue, 2013-10-01 at 18:39 +1000, Michael Ellerman wrote:
> On Thu, Sep 26, 2013 at 09:17:19PM +1000, Anton Blanchard wrote:
> > 
> > We made this change to pseries in 2011 and I think it makes
> > sense to do the same on powernv.
> 
> I'd vote we set it to 10s for all 64-bit machines in
> arch/powerpc/kernel/setup_64.c.

Why is 64-bit relevant?  And wouldn't such a short delay be a problem if
the crash is displayed on a monitor?

-Scott

^ permalink raw reply

* Re: [PATCH v1] powerpc/mpc512x: silence build warning upon disabled DIU
From: Anatolij Gustschin @ 2013-10-08 21:42 UTC (permalink / raw)
  To: Gerhard Sittig; +Cc: linuxppc-dev
In-Reply-To: <1380295718-10700-1-git-send-email-gsi@denx.de>

On Fri, 27 Sep 2013 17:28:38 +0200
Gerhard Sittig <gsi@denx.de> wrote:

> a disabled Kconfig option results in a reference to a not implemented
> routine when the IS_ENABLED() macro is used for both conditional
> implementation of the routine as well as a C language source code test
> at the call site -- the "if (0) func();" construct only gets eliminated
> later by the optimizer, while the compiler already has emitted its
> warning about "func()" being undeclared

applied, thanks!
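
For context, one common way to avoid this class of warning is to provide
a stub when the option is off, so the call site always sees a declaration.
The patch itself is not quoted here, so this is only a sketch of the
general pattern and not necessarily what the patch does; all names are
hypothetical, and DEMO_FEATURE_ENABLED stands in for an
IS_ENABLED(CONFIG_...) test.

```c
/* Assumed stand-in for a Kconfig symbol; 0 means "option disabled". */
#define DEMO_FEATURE_ENABLED 0

#if DEMO_FEATURE_ENABLED
int demo_feature_init(void);	/* real implementation lives elsewhere */
#else
/* Stub keeps the call site compiling when the feature is off. */
static inline int demo_feature_init(void)
{
	return 0;
}
#endif

static int demo_board_setup(void)
{
	/* The compiler still parses this call even when the test is 0. */
	if (DEMO_FEATURE_ENABLED)
		return demo_feature_init() + 1;
	return 0;
}
```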

^ permalink raw reply

* Re: [PATCH] Kind of revert "powerpc: 52xx: provide a default in mpc52xx_irqhost_map()"
From: Anatolij Gustschin @ 2013-10-08 21:44 UTC (permalink / raw)
  To: Wolfram Sang; +Cc: linuxppc-dev, Sebastian Andrzej Siewior, linux-rt-users
In-Reply-To: <1380901029-3548-1-git-send-email-wsa@the-dreams.de>

On Fri,  4 Oct 2013 17:37:09 +0200
Wolfram Sang <wsa@the-dreams.de> wrote:

> This more or less reverts commit 6391f697d4892a6f233501beea553e13f7745a23.
> Instead of adding an unneeded 'default', mark the variable to prevent
> the false positive 'uninitialized var'. The other change (fixing the
> printout) needs revert, too. We want to know WHICH critical irq failed,
> not which level it had.
> 
> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Cc: Anatolij Gustschin <agust@denx.de>

applied, thanks!

^ permalink raw reply

* Re: Linux 2.6.32 PowerPC MTD partition mounted at boot
From: Scott Wood @ 2013-10-08 21:47 UTC (permalink / raw)
  To: Dorin D; +Cc: linuxppc-dev@lists.ozlabs.org
In-Reply-To: <BAY176-W43D0B5AD1BD2454470D8AFA6210@phx.gbl>

On Wed, 2013-09-18 at 22:43 -0400, Dorin D wrote:
> I am working on bringing up two Linux systems, both based on Freescale
> PowerPC devices, one is a MPC8349, the other a P1020. I was able to
> build, install and boot the kernel on both cards. The kernel is 2.6.32
> and the  toolchains  are coming from the LTIBs packages from
> Freescale. 
> 
> Both cards have a 32 MByte NOR flash memory (AMD) boot flash. I have
> Uboot, kernel, RAM disk image and DTB in the boot flash and I want to
> use the spare space (about 20 MBytes) as flash file system. 
> 
> I have the following problem : the P1020 board boots fine using the
> RAM disk with the flash in the device tree , shows the flash device
> partitions (JFFS2) and DOESN'T try to mount a flash partition as root.
> The MPC8349 boots fine from the RAM disk but, after identifying the
> flash partitions, the kernel panics because is looking for a flash
> partition to mount as root partition and none of them is usable (not
> formatted). If I remove the flash from the device tree, the card boots
> fine using the RAM disk.
> 
> 
> I am not too familiar with Linux boot scripts and I didn't figure out
> where I can disable this tentative of mounting the MTD partition. I
> want the boards to boot and mount the RAM disk, as the P1020 board
> does. The flash partition will be initialized and mounted at a later
> time, but not as root partition.

Compare the kernel command line and kernel config for the two boards;
there's probably a relevant difference there (especially with the
"root=" parameter).
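
As a rough illustration of what to look for, here is a minimal C sketch
that pulls the "root=" value out of a command-line string so the two
boards can be compared. The helper name find_root_arg() is made up for
this example and is not a kernel function, and it assumes that a later
"root=" argument overrides an earlier one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel API): copy the value of the last
 * "root=" argument found in cmdline into buf, truncating if needed.
 * Returns 1 if a "root=" argument was found, 0 otherwise.
 */
static int find_root_arg(const char *cmdline, char *buf, size_t buflen)
{
	const char *p = cmdline;
	int found = 0;

	assert(buflen > 0);	/* need room for at least the terminator */

	while (*p) {
		const char *end;
		size_t len;

		/* Skip the spaces separating arguments. */
		while (*p == ' ')
			p++;

		/* Find the end of the current argument. */
		end = p;
		while (*end && *end != ' ')
			end++;
		len = (size_t)(end - p);

		/* "rootfstype=" and friends do not match the 5-byte prefix. */
		if (len >= 5 && strncmp(p, "root=", 5) == 0) {
			size_t n = len - 5;

			if (n >= buflen)
				n = buflen - 1;
			memcpy(buf, p + 5, n);
			buf[n] = '\0';
			found = 1;
		}
		p = end;
	}

	return found;
}
```

Feeding it the command line each board actually boots with (for example
the contents of /proc/cmdline) should show whether one board is asked for
a RAM disk root and the other for an MTD block device.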

-Scott


* Re: [PATCH 5/9][v5] powerpc: implement is_instr_load_store().
From: Sukadev Bhattiprolu @ 2013-10-08 19:31 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Michael Ellerman, linux-kernel, Stephane Eranian, linuxppc-dev,
	Paul Mackerras, Arnaldo Carvalho de Melo, Anshuman Khandual
In-Reply-To: <20131003053519.GC17237@concordia>

Michael Ellerman [michael@ellerman.id.au] wrote:
| bool is_load_store(int ext_opcode)
| {
|         upper = ext_opcode >> 5;
|         lower = ext_opcode & 0x1f;
| 
|         /* Short circuit as many misses as we can */
|         if (lower < 3 || lower > 23)
|             return false;

I see some loads/stores like these which are not covered by
the above check. Is it OK to ignore them?

	lower == 29: ldepx, stdepx, evlddepx, evstddepx

	lower == 31: lwepx, lbepx, lfdepx, stfdepx

Looking through the opcode maps, I also see these for primary
opcode 4:

	evldd, evlddx, evldwx, evldw, evldh, evldhx

Should we include those as well?
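
To make the short-circuit concern concrete, here is a minimal sketch of
just the quoted range test. The names passes_short_circuit() and
check_short_circuit() are hypothetical and not part of the patch, and the
opcode values are arbitrary, chosen only for their low five bits.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not part of the patch: reproduces only the quoted
 * short-circuit test on the low five bits of the extended opcode.
 */
static bool passes_short_circuit(unsigned int ext_opcode)
{
	unsigned int lower = ext_opcode & 0x1f;

	/* Same range as the quoted code: reject lower < 3 or lower > 23. */
	return lower >= 3 && lower <= 23;
}

/* Illustrative checks; the upper bits (4 here) are arbitrary. */
static void check_short_circuit(void)
{
	assert(passes_short_circuit((4u << 5) | 3));	/* lower bound kept */
	assert(passes_short_circuit((4u << 5) | 23));	/* upper bound kept */
	assert(!passes_short_circuit((4u << 5) | 29));	/* rejected early */
	assert(!passes_short_circuit((4u << 5) | 31));	/* rejected early */
}
```

Any extended opcode whose low five bits are 29 or 31 is therefore
reported as "not a load/store" before the rest of the function runs.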

Sukadev


* Re: [PATCH 1/2][v2] pci: fsl: derive the common PCI driver to drivers/pci/host
From: Bjorn Helgaas @ 2013-10-08 19:13 UTC (permalink / raw)
  To: Minghuan Lian
  Cc: linux-pci@vger.kernel.org, Zang Roy-R61911, Paul Mackerras,
	Scott Wood, linuxppc-dev
In-Reply-To: <1380531175-14836-1-git-send-email-Minghuan.Lian@freescale.com>

[+cc Ben, Paul, linuxppc-dev]

On Mon, Sep 30, 2013 at 04:52:54PM +0800, Minghuan Lian wrote:
> Freescale's Layerscape series processors will use ARM cores.
> The LS1's PCIe controller is the same as the T4240's, so it is better
> if the PCIe controller driver can support both PowerPC and ARM.
> To that end, this patch extracts the common functions from
> arch/powerpc/sysdev/fsl_pci.c into drivers/pci/host/pci-fsl-common.c
> and leaves the architecture-specific functions to be implemented
> in arch-related files.
>
> Signed-off-by: Minghuan Lian <Minghuan.Lian@freescale.com>

I cc'd the powerpc maintainers so we can work out which tree this
should go through.

> ---
> change log:
> v1-v2:
> 1. rename pci.h to pci-common.h
> 2. rename pci-fsl.c to pci-fsl-common.c
>
> Based on upstream master.
> Based on the discussion of RFC version here
> http://patchwork.ozlabs.org/patch/274487/
>
>  arch/powerpc/sysdev/fsl_pci.c                      | 521 +-----------------
>  arch/powerpc/sysdev/fsl_pci.h                      |  89 ----
>  .../fsl_pci.c => drivers/pci/host/pci-fsl-common.c | 591 +--------------------
>  .../fsl_pci.h => include/linux/fsl/pci-common.h    |  45 +-

Is there any way to avoid putting this file in include/linux?  I know
you want to share it beyond PowerPC, and I know there are similar
examples there already, but this is all arch-specific or
chipset-specific stuff that seems like it should be in some
not-so-public place.  It doesn't seem scalable to add an include/linux
subdirectory for every chipset that might be shared across
architectures.

I assume this patch basically just moves code around, so the only
question I really care about is where it ends up.

Bjorn

>  4 files changed, 7 insertions(+), 1239 deletions(-)
>  copy arch/powerpc/sysdev/fsl_pci.c => drivers/pci/host/pci-fsl-common.c (54%)
>  copy arch/powerpc/sysdev/fsl_pci.h => include/linux/fsl/pci-common.h (79%)
>
> diff --git a/arch/powerpc/sysdev/fsl_pci.c b/arch/powerpc/sysdev/fsl_pci.c
> index ccfb50d..26039e3 100644
> --- a/arch/powerpc/sysdev/fsl_pci.c
> +++ b/arch/powerpc/sysdev/fsl_pci.c
> @@ -27,6 +27,7 @@
>  #include <linux/log2.h>
>  #include <linux/slab.h>
>  #include <linux/uaccess.h>
> +#include <linux/fsl/pci-common.h>
>
>  #include <asm/io.h>
>  #include <asm/prom.h>
> @@ -58,57 +59,8 @@ static void quirk_fsl_pcie_header(struct pci_dev *dev)
>   return;
>  }
>
> -static int fsl_indirect_read_config(struct pci_bus *, unsigned int,
> -    int, int, u32 *);
> -
> -static int fsl_pcie_check_link(struct pci_controller *hose)
> -{
> - u32 val = 0;
> -
> - if (hose->indirect_type & PPC_INDIRECT_TYPE_FSL_CFG_REG_LINK) {
> - if (hose->ops->read == fsl_indirect_read_config) {
> - struct pci_bus bus;
> - bus.number = hose->first_busno;
> - bus.sysdata = hose;
> - bus.ops = hose->ops;
> - indirect_read_config(&bus, 0, PCIE_LTSSM, 4, &val);
> - } else
> - early_read_config_dword(hose, 0, 0, PCIE_LTSSM, &val);
> - if (val < PCIE_LTSSM_L0)
> - return 1;
> - } else {
> - struct ccsr_pci __iomem *pci = hose->private_data;
> - /* for PCIe IP rev 3.0 or greater use CSR0 for link state */
> - val = (in_be32(&pci->pex_csr0) & PEX_CSR0_LTSSM_MASK)
> - >> PEX_CSR0_LTSSM_SHIFT;
> - if (val != PEX_CSR0_LTSSM_L0)
> - return 1;
> - }
> -
> - return 0;
> -}
> -
> -static int fsl_indirect_read_config(struct pci_bus *bus, unsigned int devfn,
> -    int offset, int len, u32 *val)
> -{
> - struct pci_controller *hose = pci_bus_to_host(bus);
> -
> - if (fsl_pcie_check_link(hose))
> - hose->indirect_type |= PPC_INDIRECT_TYPE_NO_PCIE_LINK;
> - else
> - hose->indirect_type &= ~PPC_INDIRECT_TYPE_NO_PCIE_LINK;
> -
> - return indirect_read_config(bus, devfn, offset, len, val);
> -}
> -
>  #if defined(CONFIG_FSL_SOC_BOOKE) || defined(CONFIG_PPC_86xx)
>
> -static struct pci_ops fsl_indirect_pcie_ops =
> -{
> - .read = fsl_indirect_read_config,
> - .write = indirect_write_config,
> -};
> -
>  #define MAX_PHYS_ADDR_BITS 40
>  static u64 pci64_dma_offset = 1ull << MAX_PHYS_ADDR_BITS;
>
> @@ -132,291 +84,6 @@ static int fsl_pci_dma_set_mask(struct device *dev, u64 dma_mask)
>   return 0;
>  }
>
> -static int setup_one_atmu(struct ccsr_pci __iomem *pci,
> - unsigned int index, const struct resource *res,
> - resource_size_t offset)
> -{
> - resource_size_t pci_addr = res->start - offset;
> - resource_size_t phys_addr = res->start;
> - resource_size_t size = resource_size(res);
> - u32 flags = 0x80044000; /* enable & mem R/W */
> - unsigned int i;
> -
> - pr_debug("PCI MEM resource start 0x%016llx, size 0x%016llx.\n",
> - (u64)res->start, (u64)size);
> -
> - if (res->flags & IORESOURCE_PREFETCH)
> - flags |= 0x10000000; /* enable relaxed ordering */
> -
> - for (i = 0; size > 0; i++) {
> - unsigned int bits = min(ilog2(size),
> - __ffs(pci_addr | phys_addr));
> -
> - if (index + i >= 5)
> - return -1;
> -
> - out_be32(&pci->pow[index + i].potar, pci_addr >> 12);
> - out_be32(&pci->pow[index + i].potear, (u64)pci_addr >> 44);
> - out_be32(&pci->pow[index + i].powbar, phys_addr >> 12);
> - out_be32(&pci->pow[index + i].powar, flags | (bits - 1));
> -
> - pci_addr += (resource_size_t)1U << bits;
> - phys_addr += (resource_size_t)1U << bits;
> - size -= (resource_size_t)1U << bits;
> - }
> -
> - return i;
> -}
> -
> -/* atmu setup for fsl pci/pcie controller */
> -static void setup_pci_atmu(struct pci_controller *hose)
> -{
> - struct ccsr_pci __iomem *pci = hose->private_data;
> - int i, j, n, mem_log, win_idx = 3, start_idx = 1, end_idx = 4;
> - u64 mem, sz, paddr_hi = 0;
> - u64 offset = 0, paddr_lo = ULLONG_MAX;
> - u32 pcicsrbar = 0, pcicsrbar_sz;
> - u32 piwar = PIWAR_EN | PIWAR_PF | PIWAR_TGI_LOCAL |
> - PIWAR_READ_SNOOP | PIWAR_WRITE_SNOOP;
> - const char *name = hose->dn->full_name;
> - const u64 *reg;
> - int len;
> -
> - if (early_find_capability(hose, 0, 0, PCI_CAP_ID_EXP)) {
> - if (in_be32(&pci->block_rev1) >= PCIE_IP_REV_2_2) {
> - win_idx = 2;
> - start_idx = 0;
> - end_idx = 3;
> - }
> - }
> -
> - /* Disable all windows (except powar0 since it's ignored) */
> - for(i = 1; i < 5; i++)
> - out_be32(&pci->pow[i].powar, 0);
> - for (i = start_idx; i < end_idx; i++)
> - out_be32(&pci->piw[i].piwar, 0);
> -
> - /* Setup outbound MEM window */
> - for(i = 0, j = 1; i < 3; i++) {
> - if (!(hose->mem_resources[i].flags & IORESOURCE_MEM))
> - continue;
> -
> - paddr_lo = min(paddr_lo, (u64)hose->mem_resources[i].start);
> - paddr_hi = max(paddr_hi, (u64)hose->mem_resources[i].end);
> -
> - /* We assume all memory resources have the same offset */
> - offset = hose->mem_offset[i];
> - n = setup_one_atmu(pci, j, &hose->mem_resources[i], offset);
> -
> - if (n < 0 || j >= 5) {
> - pr_err("Ran out of outbound PCI ATMUs for resource %d!\n", i);
> - hose->mem_resources[i].flags |= IORESOURCE_DISABLED;
> - } else
> - j += n;
> - }
> -
> - /* Setup outbound IO window */
> - if (hose->io_resource.flags & IORESOURCE_IO) {
> - if (j >= 5) {
> - pr_err("Ran out of outbound PCI ATMUs for IO resource\n");
> - } else {
> - pr_debug("PCI IO resource start 0x%016llx, size 0x%016llx, "
> - "phy base 0x%016llx.\n",
> - (u64)hose->io_resource.start,
> - (u64)resource_size(&hose->io_resource),
> - (u64)hose->io_base_phys);
> - out_be32(&pci->pow[j].potar, (hose->io_resource.start >> 12));
> - out_be32(&pci->pow[j].potear, 0);
> - out_be32(&pci->pow[j].powbar, (hose->io_base_phys >> 12));
> - /* Enable, IO R/W */
> - out_be32(&pci->pow[j].powar, 0x80088000
> - | (ilog2(hose->io_resource.end
> - - hose->io_resource.start + 1) - 1));
> - }
> - }
> -
> - /* convert to pci address space */
> - paddr_hi -= offset;
> - paddr_lo -= offset;
> -
> - if (paddr_hi == paddr_lo) {
> - pr_err("%s: No outbound window space\n", name);
> - return;
> - }
> -
> - if (paddr_lo == 0) {
> - pr_err("%s: No space for inbound window\n", name);
> - return;
> - }
> -
> - /* setup PCSRBAR/PEXCSRBAR */
> - early_write_config_dword(hose, 0, 0, PCI_BASE_ADDRESS_0, 0xffffffff);
> - early_read_config_dword(hose, 0, 0, PCI_BASE_ADDRESS_0, &pcicsrbar_sz);
> - pcicsrbar_sz = ~pcicsrbar_sz + 1;
> -
> - if (paddr_hi < (0x100000000ull - pcicsrbar_sz) ||
> - (paddr_lo > 0x100000000ull))
> - pcicsrbar = 0x100000000ull - pcicsrbar_sz;
> - else
> - pcicsrbar = (paddr_lo - pcicsrbar_sz) & -pcicsrbar_sz;
> - early_write_config_dword(hose, 0, 0, PCI_BASE_ADDRESS_0, pcicsrbar);
> -
> - paddr_lo = min(paddr_lo, (u64)pcicsrbar);
> -
> - pr_info("%s: PCICSRBAR @ 0x%x\n", name, pcicsrbar);
> -
> - /* Setup inbound mem window */
> - mem = memblock_end_of_DRAM();
> -
> - /*
> - * The msi-address-64 property, if it exists, indicates the physical
> - * address of the MSIIR register.  Normally, this register is located
> - * inside CCSR, so the ATMU that covers all of CCSR is used. But if
> - * this property exists, then we normally need to create a new ATMU
> - * for it.  For now, however, we cheat.  The only entity that creates
> - * this property is the Freescale hypervisor, and the address is
> - * specified in the partition configuration.  Typically, the address
> - * is located in the page immediately after the end of DDR.  If so, we
> - * can avoid allocating a new ATMU by extending the DDR ATMU by one
> - * page.
> - */
> - reg = of_get_property(hose->dn, "msi-address-64", &len);
> - if (reg && (len == sizeof(u64))) {
> - u64 address = be64_to_cpup(reg);
> -
> - if ((address >= mem) && (address < (mem + PAGE_SIZE))) {
> - pr_info("%s: extending DDR ATMU to cover MSIIR", name);
> - mem += PAGE_SIZE;
> - } else {
> - /* TODO: Create a new ATMU for MSIIR */
> - pr_warn("%s: msi-address-64 address of %llx is "
> - "unsupported\n", name, address);
> - }
> - }
> -
> - sz = min(mem, paddr_lo);
> - mem_log = ilog2(sz);
> -
> - /* PCIe can overmap inbound & outbound since RX & TX are separated */
> - if (early_find_capability(hose, 0, 0, PCI_CAP_ID_EXP)) {
> - /* Size window to exact size if power-of-two or one size up */
> - if ((1ull << mem_log) != mem) {
> - mem_log++;
> - if ((1ull << mem_log) > mem)
> - pr_info("%s: Setting PCI inbound window "
> - "greater than memory size\n", name);
> - }
> -
> - piwar |= ((mem_log - 1) & PIWAR_SZ_MASK);
> -
> - /* Setup inbound memory window */
> - out_be32(&pci->piw[win_idx].pitar,  0x00000000);
> - out_be32(&pci->piw[win_idx].piwbar, 0x00000000);
> - out_be32(&pci->piw[win_idx].piwar,  piwar);
> - win_idx--;
> -
> - hose->dma_window_base_cur = 0x00000000;
> - hose->dma_window_size = (resource_size_t)sz;
> -
> - /*
> - * if we have >4G of memory setup second PCI inbound window to
> - * let devices that are 64-bit address capable to work w/o
> - * SWIOTLB and access the full range of memory
> - */
> - if (sz != mem) {
> - mem_log = ilog2(mem);
> -
> - /* Size window up if we dont fit in exact power-of-2 */
> - if ((1ull << mem_log) != mem)
> - mem_log++;
> -
> - piwar = (piwar & ~PIWAR_SZ_MASK) | (mem_log - 1);
> -
> - /* Setup inbound memory window */
> - out_be32(&pci->piw[win_idx].pitar,  0x00000000);
> - out_be32(&pci->piw[win_idx].piwbear,
> - pci64_dma_offset >> 44);
> - out_be32(&pci->piw[win_idx].piwbar,
> - pci64_dma_offset >> 12);
> - out_be32(&pci->piw[win_idx].piwar,  piwar);
> -
> - /*
> - * install our own dma_set_mask handler to fixup dma_ops
> - * and dma_offset
> - */
> - ppc_md.dma_set_mask = fsl_pci_dma_set_mask;
> -
> - pr_info("%s: Setup 64-bit PCI DMA window\n", name);
> - }
> - } else {
> - u64 paddr = 0;
> -
> - /* Setup inbound memory window */
> - out_be32(&pci->piw[win_idx].pitar,  paddr >> 12);
> - out_be32(&pci->piw[win_idx].piwbar, paddr >> 12);
> - out_be32(&pci->piw[win_idx].piwar,  (piwar | (mem_log - 1)));
> - win_idx--;
> -
> - paddr += 1ull << mem_log;
> - sz -= 1ull << mem_log;
> -
> - if (sz) {
> - mem_log = ilog2(sz);
> - piwar |= (mem_log - 1);
> -
> - out_be32(&pci->piw[win_idx].pitar,  paddr >> 12);
> - out_be32(&pci->piw[win_idx].piwbar, paddr >> 12);
> - out_be32(&pci->piw[win_idx].piwar,  piwar);
> - win_idx--;
> -
> - paddr += 1ull << mem_log;
> - }
> -
> - hose->dma_window_base_cur = 0x00000000;
> - hose->dma_window_size = (resource_size_t)paddr;
> - }
> -
> - if (hose->dma_window_size < mem) {
> -#ifdef CONFIG_SWIOTLB
> - ppc_swiotlb_enable = 1;
> -#else
> - pr_err("%s: ERROR: Memory size exceeds PCI ATMU ability to "
> - "map - enable CONFIG_SWIOTLB to avoid dma errors.\n",
> - name);
> -#endif
> - /* adjusting outbound windows could reclaim space in mem map */
> - if (paddr_hi < 0xffffffffull)
> - pr_warning("%s: WARNING: Outbound window cfg leaves "
> - "gaps in memory map. Adjusting the memory map "
> - "could reduce unnecessary bounce buffering.\n",
> - name);
> -
> - pr_info("%s: DMA window size is 0x%llx\n", name,
> - (u64)hose->dma_window_size);
> - }
> -}
> -
> -static void __init setup_pci_cmd(struct pci_controller *hose)
> -{
> - u16 cmd;
> - int cap_x;
> -
> - early_read_config_word(hose, 0, 0, PCI_COMMAND, &cmd);
> - cmd |= PCI_COMMAND_SERR | PCI_COMMAND_MASTER | PCI_COMMAND_MEMORY
> - | PCI_COMMAND_IO;
> - early_write_config_word(hose, 0, 0, PCI_COMMAND, cmd);
> -
> - cap_x = early_find_capability(hose, 0, 0, PCI_CAP_ID_PCIX);
> - if (cap_x) {
> - int pci_x_cmd = cap_x + PCI_X_CMD;
> - cmd = PCI_X_CMD_MAX_SPLIT | PCI_X_CMD_MAX_READ
> - | PCI_X_CMD_ERO | PCI_X_CMD_DPERR_E;
> - early_write_config_word(hose, 0, 0, pci_x_cmd, cmd);
> - } else {
> - early_write_config_byte(hose, 0, 0, PCI_LATENCY_TIMER, 0x80);
> - }
> -}
> -
>  void fsl_pcibios_fixup_bus(struct pci_bus *bus)
>  {
>   struct pci_controller *hose = pci_bus_to_host(bus);
> @@ -454,112 +121,6 @@ void fsl_pcibios_fixup_bus(struct pci_bus *bus)
>   }
>  }
>
> -int __init fsl_add_bridge(struct platform_device *pdev, int is_primary)
> -{
> - int len;
> - struct pci_controller *hose;
> - struct resource rsrc;
> - const int *bus_range;
> - u8 hdr_type, progif;
> - struct device_node *dev;
> - struct ccsr_pci __iomem *pci;
> -
> - dev = pdev->dev.of_node;
> -
> - if (!of_device_is_available(dev)) {
> - pr_warning("%s: disabled\n", dev->full_name);
> - return -ENODEV;
> - }
> -
> - pr_debug("Adding PCI host bridge %s\n", dev->full_name);
> -
> - /* Fetch host bridge registers address */
> - if (of_address_to_resource(dev, 0, &rsrc)) {
> - printk(KERN_WARNING "Can't get pci register base!");
> - return -ENOMEM;
> - }
> -
> - /* Get bus range if any */
> - bus_range = of_get_property(dev, "bus-range", &len);
> - if (bus_range == NULL || len < 2 * sizeof(int))
> - printk(KERN_WARNING "Can't get bus-range for %s, assume"
> - " bus 0\n", dev->full_name);
> -
> - pci_add_flags(PCI_REASSIGN_ALL_BUS);
> - hose = pcibios_alloc_controller(dev);
> - if (!hose)
> - return -ENOMEM;
> -
> - /* set platform device as the parent */
> - hose->parent = &pdev->dev;
> - hose->first_busno = bus_range ? bus_range[0] : 0x0;
> - hose->last_busno = bus_range ? bus_range[1] : 0xff;
> -
> - pr_debug("PCI memory map start 0x%016llx, size 0x%016llx\n",
> - (u64)rsrc.start, (u64)resource_size(&rsrc));
> -
> - pci = hose->private_data = ioremap(rsrc.start, resource_size(&rsrc));
> - if (!hose->private_data)
> - goto no_bridge;
> -
> - setup_indirect_pci(hose, rsrc.start, rsrc.start + 0x4,
> -   PPC_INDIRECT_TYPE_BIG_ENDIAN);
> -
> - if (in_be32(&pci->block_rev1) < PCIE_IP_REV_3_0)
> - hose->indirect_type |= PPC_INDIRECT_TYPE_FSL_CFG_REG_LINK;
> -
> - if (early_find_capability(hose, 0, 0, PCI_CAP_ID_EXP)) {
> - /* use fsl_indirect_read_config for PCIe */
> - hose->ops = &fsl_indirect_pcie_ops;
> - /* For PCIE read HEADER_TYPE to identify controler mode */
> - early_read_config_byte(hose, 0, 0, PCI_HEADER_TYPE, &hdr_type);
> - if ((hdr_type & 0x7f) != PCI_HEADER_TYPE_BRIDGE)
> - goto no_bridge;
> -
> - } else {
> - /* For PCI read PROG to identify controller mode */
> - early_read_config_byte(hose, 0, 0, PCI_CLASS_PROG, &progif);
> - if ((progif & 1) == 1)
> - goto no_bridge;
> - }
> -
> - setup_pci_cmd(hose);
> -
> - /* check PCI express link status */
> - if (early_find_capability(hose, 0, 0, PCI_CAP_ID_EXP)) {
> - hose->indirect_type |= PPC_INDIRECT_TYPE_EXT_REG |
> - PPC_INDIRECT_TYPE_SURPRESS_PRIMARY_BUS;
> - if (fsl_pcie_check_link(hose))
> - hose->indirect_type |= PPC_INDIRECT_TYPE_NO_PCIE_LINK;
> - }
> -
> - printk(KERN_INFO "Found FSL PCI host bridge at 0x%016llx. "
> - "Firmware bus number: %d->%d\n",
> - (unsigned long long)rsrc.start, hose->first_busno,
> - hose->last_busno);
> -
> - pr_debug(" ->Hose at 0x%p, cfg_addr=0x%p,cfg_data=0x%p\n",
> - hose, hose->cfg_addr, hose->cfg_data);
> -
> - /* Interpret the "ranges" property */
> - /* This also maps the I/O region and sets isa_io/mem_base */
> - pci_process_bridge_OF_ranges(hose, dev, is_primary);
> -
> - /* Setup PEX window registers */
> - setup_pci_atmu(hose);
> -
> - return 0;
> -
> -no_bridge:
> - iounmap(hose->private_data);
> - /* unmap cfg_data & cfg_addr separately if not on same page */
> - if (((unsigned long)hose->cfg_data & PAGE_MASK) !=
> -    ((unsigned long)hose->cfg_addr & PAGE_MASK))
> - iounmap(hose->cfg_data);
> - iounmap(hose->cfg_addr);
> - pcibios_free_controller(hose);
> - return -ENODEV;
> -}
>  #endif /* CONFIG_FSL_SOC_BOOKE || CONFIG_PPC_86xx */
>
>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_FREESCALE, PCI_ANY_ID, quirk_fsl_pcie_header);
> @@ -1029,26 +590,6 @@ int fsl_pci_mcheck_exception(struct pt_regs *regs)
>  #endif
>
>  #if defined(CONFIG_FSL_SOC_BOOKE) || defined(CONFIG_PPC_86xx)
> -static const struct of_device_id pci_ids[] = {
> - { .compatible = "fsl,mpc8540-pci", },
> - { .compatible = "fsl,mpc8548-pcie", },
> - { .compatible = "fsl,mpc8610-pci", },
> - { .compatible = "fsl,mpc8641-pcie", },
> - { .compatible = "fsl,qoriq-pcie-v2.1", },
> - { .compatible = "fsl,qoriq-pcie-v2.2", },
> - { .compatible = "fsl,qoriq-pcie-v2.3", },
> - { .compatible = "fsl,qoriq-pcie-v2.4", },
> - { .compatible = "fsl,qoriq-pcie-v3.0", },
> -
> - /*
> - * The following entries are for compatibility with older device
> - * trees.
> - */
> - { .compatible = "fsl,p1022-pcie", },
> - { .compatible = "fsl,p4080-pcie", },
> -
> - {},
> -};
>
>  struct device_node *fsl_pci_primary;
>
> @@ -1083,64 +624,4 @@ void fsl_pci_assign_primary(void)
>   }
>   }
>  }
> -
> -static int fsl_pci_probe(struct platform_device *pdev)
> -{
> - int ret;
> - struct device_node *node;
> -
> - node = pdev->dev.of_node;
> - ret = fsl_add_bridge(pdev, fsl_pci_primary == node);
> -
> - mpc85xx_pci_err_probe(pdev);
> -
> - return 0;
> -}
> -
> -#ifdef CONFIG_PM
> -static int fsl_pci_resume(struct device *dev)
> -{
> - struct pci_controller *hose;
> - struct resource pci_rsrc;
> -
> - hose = pci_find_hose_for_OF_device(dev->of_node);
> - if (!hose)
> - return -ENODEV;
> -
> - if (of_address_to_resource(dev->of_node, 0, &pci_rsrc)) {
> - dev_err(dev, "Get pci register base failed.");
> - return -ENODEV;
> - }
> -
> - setup_pci_atmu(hose);
> -
> - return 0;
> -}
> -
> -static const struct dev_pm_ops pci_pm_ops = {
> - .resume = fsl_pci_resume,
> -};
> -
> -#define PCI_PM_OPS (&pci_pm_ops)
> -
> -#else
> -
> -#define PCI_PM_OPS NULL
> -
> -#endif
> -
> -static struct platform_driver fsl_pci_driver = {
> - .driver = {
> - .name = "fsl-pci",
> - .pm = PCI_PM_OPS,
> - .of_match_table = pci_ids,
> - },
> - .probe = fsl_pci_probe,
> -};
> -
> -static int __init fsl_pci_init(void)
> -{
> - return platform_driver_register(&fsl_pci_driver);
> -}
> -arch_initcall(fsl_pci_init);
>  #endif
> diff --git a/arch/powerpc/sysdev/fsl_pci.h b/arch/powerpc/sysdev/fsl_pci.h
> index 8d455df..ce77aad 100644
> --- a/arch/powerpc/sysdev/fsl_pci.h
> +++ b/arch/powerpc/sysdev/fsl_pci.h
> @@ -21,95 +21,6 @@ struct platform_device;
>  #define PCI_FSL_BRR1      0xbf8
>  #define PCI_FSL_BRR1_VER 0xffff
>
> -#define PCIE_LTSSM 0x0404 /* PCIE Link Training and Status */
> -#define PCIE_LTSSM_L0 0x16 /* L0 state */
> -#define PCIE_IP_REV_2_2 0x02080202 /* PCIE IP block version Rev2.2 */
> -#define PCIE_IP_REV_3_0 0x02080300 /* PCIE IP block version Rev3.0 */
> -#define PIWAR_EN 0x80000000 /* Enable */
> -#define PIWAR_PF 0x20000000 /* prefetch */
> -#define PIWAR_TGI_LOCAL 0x00f00000 /* target - local memory */
> -#define PIWAR_READ_SNOOP 0x00050000
> -#define PIWAR_WRITE_SNOOP 0x00005000
> -#define PIWAR_SZ_MASK          0x0000003f
> -
> -/* PCI/PCI Express outbound window reg */
> -struct pci_outbound_window_regs {
> - __be32 potar; /* 0x.0 - Outbound translation address register */
> - __be32 potear; /* 0x.4 - Outbound translation extended address register */
> - __be32 powbar; /* 0x.8 - Outbound window base address register */
> - u8 res1[4];
> - __be32 powar; /* 0x.10 - Outbound window attributes register */
> - u8 res2[12];
> -};
> -
> -/* PCI/PCI Express inbound window reg */
> -struct pci_inbound_window_regs {
> - __be32 pitar; /* 0x.0 - Inbound translation address register */
> - u8 res1[4];
> - __be32 piwbar; /* 0x.8 - Inbound window base address register */
> - __be32 piwbear; /* 0x.c - Inbound window base extended address register */
> - __be32 piwar; /* 0x.10 - Inbound window attributes register */
> - u8 res2[12];
> -};
> -
> -/* PCI/PCI Express IO block registers for 85xx/86xx */
> -struct ccsr_pci {
> - __be32 config_addr; /* 0x.000 - PCI/PCIE Configuration Address Register */
> - __be32 config_data; /* 0x.004 - PCI/PCIE Configuration Data Register */
> - __be32 int_ack; /* 0x.008 - PCI Interrupt Acknowledge Register */
> - __be32 pex_otb_cpl_tor; /* 0x.00c - PCIE Outbound completion timeout register */
> - __be32 pex_conf_tor; /* 0x.010 - PCIE configuration timeout register */
> - __be32 pex_config; /* 0x.014 - PCIE CONFIG Register */
> - __be32 pex_int_status; /* 0x.018 - PCIE interrupt status */
> - u8 res2[4];
> - __be32 pex_pme_mes_dr; /* 0x.020 - PCIE PME and message detect register */
> - __be32 pex_pme_mes_disr; /* 0x.024 - PCIE PME and message disable register */
> - __be32 pex_pme_mes_ier; /* 0x.028 - PCIE PME and message interrupt enable register */
> - __be32 pex_pmcr; /* 0x.02c - PCIE power management command register */
> - u8 res3[3016];
> - __be32 block_rev1; /* 0x.bf8 - PCIE Block Revision register 1 */
> - __be32 block_rev2; /* 0x.bfc - PCIE Block Revision register 2 */
> -
> -/* PCI/PCI Express outbound window 0-4
> - * Window 0 is the default window and is the only window enabled upon reset.
> - * The default outbound register set is used when a transaction misses
> - * in all of the other outbound windows.
> - */
> - struct pci_outbound_window_regs pow[5];
> - u8 res14[96];
> - struct pci_inbound_window_regs pmit; /* 0xd00 - 0xd9c Inbound MSI */
> - u8 res6[96];
> -/* PCI/PCI Express inbound window 3-0
> - * inbound window 1 supports only a 32-bit base address and does not
> - * define an inbound window base extended address register.
> - */
> - struct pci_inbound_window_regs piw[4];
> -
> - __be32 pex_err_dr; /* 0x.e00 - PCI/PCIE error detect register */
> - u8 res21[4];
> - __be32 pex_err_en; /* 0x.e08 - PCI/PCIE error interrupt enable register */
> - u8 res22[4];
> - __be32 pex_err_disr; /* 0x.e10 - PCI/PCIE error disable register */
> - u8 res23[12];
> - __be32 pex_err_cap_stat; /* 0x.e20 - PCI/PCIE error capture status register */
> - u8 res24[4];
> - __be32 pex_err_cap_r0; /* 0x.e28 - PCIE error capture register 0 */
> - __be32 pex_err_cap_r1; /* 0x.e2c - PCIE error capture register 0 */
> - __be32 pex_err_cap_r2; /* 0x.e30 - PCIE error capture register 0 */
> - __be32 pex_err_cap_r3; /* 0x.e34 - PCIE error capture register 0 */
> - u8 res_e38[200];
> - __be32 pdb_stat; /* 0x.f00 - PCIE Debug Status */
> - u8 res_f04[16];
> - __be32 pex_csr0; /* 0x.f14 - PEX Control/Status register 0*/
> -#define PEX_CSR0_LTSSM_MASK 0xFC
> -#define PEX_CSR0_LTSSM_SHIFT 2
> -#define PEX_CSR0_LTSSM_L0 0x11
> - __be32 pex_csr1; /* 0x.f18 - PEX Control/Status register 1*/
> - u8 res_f1c[228];
> -
> -};
> -
> -extern int fsl_add_bridge(struct platform_device *pdev, int is_primary);
>  extern void fsl_pcibios_fixup_bus(struct pci_bus *bus);
>  extern int mpc83xx_add_bridge(struct device_node *dev);
>  u64 fsl_pci_immrbar_base(struct pci_controller *hose);
> diff --git a/arch/powerpc/sysdev/fsl_pci.c b/drivers/pci/host/pci-fsl-common.c
> similarity index 54%
> copy from arch/powerpc/sysdev/fsl_pci.c
> copy to drivers/pci/host/pci-fsl-common.c
> index ccfb50d..69d338b 100644
> --- a/arch/powerpc/sysdev/fsl_pci.c
> +++ b/drivers/pci/host/pci-fsl-common.c
> @@ -1,5 +1,5 @@
>  /*
> - * MPC83xx/85xx/86xx PCI/PCIE support routing.
> + * 85xx/86xx/LS PCI/PCIE support routing.
>   *
>   * Copyright 2007-2012 Freescale Semiconductor, Inc.
>   * Copyright 2008-2009 MontaVista Software, Inc.
> @@ -8,9 +8,6 @@
>   * Recode: ZHANG WEI <wei.zhang@freescale.com>
>   * Rewrite the routing for Frescale PCI and PCI Express
>   * Roy Zang <tie-fei.zang@freescale.com>
> - * MPC83xx PCI-Express support:
> - * Tony Li <tony.li@freescale.com>
> - * Anton Vorontsov <avorontsov@ru.mvista.com>
>   *
>   * This program is free software; you can redistribute  it and/or modify it
>   * under  the terms of  the GNU General  Public License as published by the
> @@ -38,29 +35,6 @@
>  #include <sysdev/fsl_soc.h>
>  #include <sysdev/fsl_pci.h>
>
> -static int fsl_pcie_bus_fixup, is_mpc83xx_pci;
> -
> -static void quirk_fsl_pcie_header(struct pci_dev *dev)
> -{
> - u8 hdr_type;
> -
> - /* if we aren't a PCIe don't bother */
> - if (!pci_find_capability(dev, PCI_CAP_ID_EXP))
> - return;
> -
> - /* if we aren't in host mode don't bother */
> - pci_read_config_byte(dev, PCI_HEADER_TYPE, &hdr_type);
> - if ((hdr_type & 0x7f) != PCI_HEADER_TYPE_BRIDGE)
> - return;
> -
> - dev->class = PCI_CLASS_BRIDGE_PCI << 8;
> - fsl_pcie_bus_fixup = 1;
> - return;
> -}
> -
> -static int fsl_indirect_read_config(struct pci_bus *, unsigned int,
> -    int, int, u32 *);
> -
>  static int fsl_pcie_check_link(struct pci_controller *hose)
>  {
>   u32 val = 0;
> @@ -109,29 +83,6 @@ static struct pci_ops fsl_indirect_pcie_ops =
>   .write = indirect_write_config,
>  };
>
> -#define MAX_PHYS_ADDR_BITS 40
> -static u64 pci64_dma_offset = 1ull << MAX_PHYS_ADDR_BITS;
> -
> -static int fsl_pci_dma_set_mask(struct device *dev, u64 dma_mask)
> -{
> - if (!dev->dma_mask || !dma_supported(dev, dma_mask))
> - return -EIO;
> -
> - /*
> - * Fixup PCI devices that are able to DMA to above the physical
> - * address width of the SoC such that we can address any internal
> - * SoC address from across PCI if needed
> - */
> - if ((dev->bus == &pci_bus_type) &&
> -    dma_mask >= DMA_BIT_MASK(MAX_PHYS_ADDR_BITS)) {
> - set_dma_ops(dev, &dma_direct_ops);
> - set_dma_offset(dev, pci64_dma_offset);
> - }
> -
> - *dev->dma_mask = dma_mask;
> - return 0;
> -}
> -
>  static int setup_one_atmu(struct ccsr_pci __iomem *pci,
>   unsigned int index, const struct resource *res,
>   resource_size_t offset)
> @@ -417,43 +368,6 @@ static void __init setup_pci_cmd(struct pci_controller *hose)
>   }
>  }
>
> -void fsl_pcibios_fixup_bus(struct pci_bus *bus)
> -{
> - struct pci_controller *hose = pci_bus_to_host(bus);
> - int i, is_pcie = 0, no_link;
> -
> - /* The root complex bridge comes up with bogus resources,
> - * we copy the PHB ones in.
> - *
> - * With the current generic PCI code, the PHB bus no longer
> - * has bus->resource[0..4] set, so things are a bit more
> - * tricky.
> - */
> -
> - if (fsl_pcie_bus_fixup)
> - is_pcie = early_find_capability(hose, 0, 0, PCI_CAP_ID_EXP);
> - no_link = !!(hose->indirect_type & PPC_INDIRECT_TYPE_NO_PCIE_LINK);
> -
> - if (bus->parent == hose->bus && (is_pcie || no_link)) {
> - for (i = 0; i < PCI_BRIDGE_RESOURCE_NUM; ++i) {
> - struct resource *res = bus->resource[i];
> - struct resource *par;
> -
> - if (!res)
> - continue;
> - if (i == 0)
> - par = &hose->io_resource;
> - else if (i < 4)
> - par = &hose->mem_resources[i-1];
> - else par = NULL;
> -
> - res->start = par ? par->start : 0;
> - res->end   = par ? par->end   : 0;
> - res->flags = par ? par->flags : 0;
> - }
> - }
> -}
> -
>  int __init fsl_add_bridge(struct platform_device *pdev, int is_primary)
>  {
>   int len;
> @@ -560,475 +474,7 @@ no_bridge:
>   pcibios_free_controller(hose);
>   return -ENODEV;
>  }
> -#endif /* CONFIG_FSL_SOC_BOOKE || CONFIG_PPC_86xx */
> -
> -DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_FREESCALE, PCI_ANY_ID, quirk_fsl_pcie_header);
> -
> -#if defined(CONFIG_PPC_83xx) || defined(CONFIG_PPC_MPC512x)
> -struct mpc83xx_pcie_priv {
> - void __iomem *cfg_type0;
> - void __iomem *cfg_type1;
> - u32 dev_base;
> -};
> -
> -struct pex_inbound_window {
> - u32 ar;
> - u32 tar;
> - u32 barl;
> - u32 barh;
> -};
> -
> -/*
> - * With the convention of u-boot, the PCIE outbound window 0 serves
> - * as configuration transactions outbound.
> - */
> -#define PEX_OUTWIN0_BAR 0xCA4
> -#define PEX_OUTWIN0_TAL 0xCA8
> -#define PEX_OUTWIN0_TAH 0xCAC
> -#define PEX_RC_INWIN_BASE 0xE60
> -#define PEX_RCIWARn_EN 0x1
> -
> -static int mpc83xx_pcie_exclude_device(struct pci_bus *bus, unsigned int devfn)
> -{
> - struct pci_controller *hose = pci_bus_to_host(bus);
> -
> - if (hose->indirect_type & PPC_INDIRECT_TYPE_NO_PCIE_LINK)
> - return PCIBIOS_DEVICE_NOT_FOUND;
> - /*
> - * Workaround for the HW bug: for Type 0 configure transactions the
> - * PCI-E controller does not check the device number bits and just
> - * assumes that the device number bits are 0.
> - */
> - if (bus->number == hose->first_busno ||
> - bus->primary == hose->first_busno) {
> - if (devfn & 0xf8)
> - return PCIBIOS_DEVICE_NOT_FOUND;
> - }
> -
> - if (ppc_md.pci_exclude_device) {
> - if (ppc_md.pci_exclude_device(hose, bus->number, devfn))
> - return PCIBIOS_DEVICE_NOT_FOUND;
> - }
> -
> - return PCIBIOS_SUCCESSFUL;
> -}
> -
> -static void __iomem *mpc83xx_pcie_remap_cfg(struct pci_bus *bus,
> -    unsigned int devfn, int offset)
> -{
> - struct pci_controller *hose = pci_bus_to_host(bus);
> - struct mpc83xx_pcie_priv *pcie = hose->dn->data;
> - u32 dev_base = bus->number << 24 | devfn << 16;
> - int ret;
> -
> - ret = mpc83xx_pcie_exclude_device(bus, devfn);
> - if (ret)
> - return NULL;
> -
> - offset &= 0xfff;
> -
> - /* Type 0 */
> - if (bus->number == hose->first_busno)
> - return pcie->cfg_type0 + offset;
> -
> - if (pcie->dev_base == dev_base)
> - goto mapped;
> -
> - out_le32(pcie->cfg_type0 + PEX_OUTWIN0_TAL, dev_base);
> -
> - pcie->dev_base = dev_base;
> -mapped:
> - return pcie->cfg_type1 + offset;
> -}
> -
> -static int mpc83xx_pcie_read_config(struct pci_bus *bus, unsigned int devfn,
> -    int offset, int len, u32 *val)
> -{
> - void __iomem *cfg_addr;
> -
> - cfg_addr = mpc83xx_pcie_remap_cfg(bus, devfn, offset);
> - if (!cfg_addr)
> - return PCIBIOS_DEVICE_NOT_FOUND;
> -
> - switch (len) {
> - case 1:
> - *val = in_8(cfg_addr);
> - break;
> - case 2:
> - *val = in_le16(cfg_addr);
> - break;
> - default:
> - *val = in_le32(cfg_addr);
> - break;
> - }
> -
> - return PCIBIOS_SUCCESSFUL;
> -}
> -
> -static int mpc83xx_pcie_write_config(struct pci_bus *bus, unsigned int devfn,
> -     int offset, int len, u32 val)
> -{
> - struct pci_controller *hose = pci_bus_to_host(bus);
> - void __iomem *cfg_addr;
> -
> - cfg_addr = mpc83xx_pcie_remap_cfg(bus, devfn, offset);
> - if (!cfg_addr)
> - return PCIBIOS_DEVICE_NOT_FOUND;
> -
> - /* PPC_INDIRECT_TYPE_SURPRESS_PRIMARY_BUS */
> - if (offset == PCI_PRIMARY_BUS && bus->number == hose->first_busno)
> - val &= 0xffffff00;
> -
> - switch (len) {
> - case 1:
> - out_8(cfg_addr, val);
> - break;
> - case 2:
> - out_le16(cfg_addr, val);
> - break;
> - default:
> - out_le32(cfg_addr, val);
> - break;
> - }
> -
> - return PCIBIOS_SUCCESSFUL;
> -}
> -
> -static struct pci_ops mpc83xx_pcie_ops = {
> - .read = mpc83xx_pcie_read_config,
> - .write = mpc83xx_pcie_write_config,
> -};
> -
> -static int __init mpc83xx_pcie_setup(struct pci_controller *hose,
> -     struct resource *reg)
> -{
> - struct mpc83xx_pcie_priv *pcie;
> - u32 cfg_bar;
> - int ret = -ENOMEM;
> -
> - pcie = zalloc_maybe_bootmem(sizeof(*pcie), GFP_KERNEL);
> - if (!pcie)
> - return ret;
> -
> - pcie->cfg_type0 = ioremap(reg->start, resource_size(reg));
> - if (!pcie->cfg_type0)
> - goto err0;
> -
> - cfg_bar = in_le32(pcie->cfg_type0 + PEX_OUTWIN0_BAR);
> - if (!cfg_bar) {
> - /* PCI-E isn't configured. */
> - ret = -ENODEV;
> - goto err1;
> - }
> -
> - pcie->cfg_type1 = ioremap(cfg_bar, 0x1000);
> - if (!pcie->cfg_type1)
> - goto err1;
> -
> - WARN_ON(hose->dn->data);
> - hose->dn->data = pcie;
> - hose->ops = &mpc83xx_pcie_ops;
> - hose->indirect_type |= PPC_INDIRECT_TYPE_FSL_CFG_REG_LINK;
> -
> - out_le32(pcie->cfg_type0 + PEX_OUTWIN0_TAH, 0);
> - out_le32(pcie->cfg_type0 + PEX_OUTWIN0_TAL, 0);
> -
> - if (fsl_pcie_check_link(hose))
> - hose->indirect_type |= PPC_INDIRECT_TYPE_NO_PCIE_LINK;
> -
> - return 0;
> -err1:
> - iounmap(pcie->cfg_type0);
> -err0:
> - kfree(pcie);
> - return ret;
> -
> -}
> -
> -int __init mpc83xx_add_bridge(struct device_node *dev)
> -{
> - int ret;
> - int len;
> - struct pci_controller *hose;
> - struct resource rsrc_reg;
> - struct resource rsrc_cfg;
> - const int *bus_range;
> - int primary;
> -
> - is_mpc83xx_pci = 1;
> -
> - if (!of_device_is_available(dev)) {
> - pr_warning("%s: disabled by the firmware.\n",
> -   dev->full_name);
> - return -ENODEV;
> - }
> - pr_debug("Adding PCI host bridge %s\n", dev->full_name);
> -
> - /* Fetch host bridge registers address */
> - if (of_address_to_resource(dev, 0, &rsrc_reg)) {
> - printk(KERN_WARNING "Can't get pci register base!\n");
> - return -ENOMEM;
> - }
> -
> - memset(&rsrc_cfg, 0, sizeof(rsrc_cfg));
> -
> - if (of_address_to_resource(dev, 1, &rsrc_cfg)) {
> - printk(KERN_WARNING
> - "No pci config register base in dev tree, "
> - "using default\n");
> - /*
> - * MPC83xx supports up to two host controllers
> - * one at 0x8500 has config space registers at 0x8300
> - * one at 0x8600 has config space registers at 0x8380
> - */
> - if ((rsrc_reg.start & 0xfffff) == 0x8500)
> - rsrc_cfg.start = (rsrc_reg.start & 0xfff00000) + 0x8300;
> - else if ((rsrc_reg.start & 0xfffff) == 0x8600)
> - rsrc_cfg.start = (rsrc_reg.start & 0xfff00000) + 0x8380;
> - }
> - /*
> - * Controller at offset 0x8500 is primary
> - */
> - if ((rsrc_reg.start & 0xfffff) == 0x8500)
> - primary = 1;
> - else
> - primary = 0;
> -
> - /* Get bus range if any */
> - bus_range = of_get_property(dev, "bus-range", &len);
> - if (bus_range == NULL || len < 2 * sizeof(int)) {
> - printk(KERN_WARNING "Can't get bus-range for %s, assume"
> -       " bus 0\n", dev->full_name);
> - }
> -
> - pci_add_flags(PCI_REASSIGN_ALL_BUS);
> - hose = pcibios_alloc_controller(dev);
> - if (!hose)
> - return -ENOMEM;
> -
> - hose->first_busno = bus_range ? bus_range[0] : 0;
> - hose->last_busno = bus_range ? bus_range[1] : 0xff;
> -
> - if (of_device_is_compatible(dev, "fsl,mpc8314-pcie")) {
> - ret = mpc83xx_pcie_setup(hose, &rsrc_reg);
> - if (ret)
> - goto err0;
> - } else {
> - setup_indirect_pci(hose, rsrc_cfg.start,
> -   rsrc_cfg.start + 4, 0);
> - }
> -
> - printk(KERN_INFO "Found FSL PCI host bridge at 0x%016llx. "
> -       "Firmware bus number: %d->%d\n",
> -       (unsigned long long)rsrc_reg.start, hose->first_busno,
> -       hose->last_busno);
> -
> - pr_debug(" ->Hose at 0x%p, cfg_addr=0x%p,cfg_data=0x%p\n",
> -    hose, hose->cfg_addr, hose->cfg_data);
> -
> - /* Interpret the "ranges" property */
> - /* This also maps the I/O region and sets isa_io/mem_base */
> - pci_process_bridge_OF_ranges(hose, dev, primary);
> -
> - return 0;
> -err0:
> - pcibios_free_controller(hose);
> - return ret;
> -}
> -#endif /* CONFIG_PPC_83xx */
> -
> -u64 fsl_pci_immrbar_base(struct pci_controller *hose)
> -{
> -#ifdef CONFIG_PPC_83xx
> - if (is_mpc83xx_pci) {
> - struct mpc83xx_pcie_priv *pcie = hose->dn->data;
> - struct pex_inbound_window *in;
> - int i;
> -
> - /* Walk the Root Complex Inbound windows to match IMMR base */
> - in = pcie->cfg_type0 + PEX_RC_INWIN_BASE;
> - for (i = 0; i < 4; i++) {
> - /* not enabled, skip */
> - if (!in_le32(&in[i].ar) & PEX_RCIWARn_EN)
> - continue;
> -
> - if (get_immrbase() == in_le32(&in[i].tar))
> - return (u64)in_le32(&in[i].barh) << 32 |
> -    in_le32(&in[i].barl);
> - }
> -
> - printk(KERN_WARNING "could not find PCI BAR matching IMMR\n");
> - }
> -#endif
> -
> -#if defined(CONFIG_FSL_SOC_BOOKE) || defined(CONFIG_PPC_86xx)
> - if (!is_mpc83xx_pci) {
> - u32 base;
> -
> - pci_bus_read_config_dword(hose->bus,
> - PCI_DEVFN(0, 0), PCI_BASE_ADDRESS_0, &base);
> - return base;
> - }
> -#endif
> -
> - return 0;
> -}
>
> -#ifdef CONFIG_E500
> -static int mcheck_handle_load(struct pt_regs *regs, u32 inst)
> -{
> - unsigned int rd, ra, rb, d;
> -
> - rd = get_rt(inst);
> - ra = get_ra(inst);
> - rb = get_rb(inst);
> - d = get_d(inst);
> -
> - switch (get_op(inst)) {
> - case 31:
> - switch (get_xop(inst)) {
> - case OP_31_XOP_LWZX:
> - case OP_31_XOP_LWBRX:
> - regs->gpr[rd] = 0xffffffff;
> - break;
> -
> - case OP_31_XOP_LWZUX:
> - regs->gpr[rd] = 0xffffffff;
> - regs->gpr[ra] += regs->gpr[rb];
> - break;
> -
> - case OP_31_XOP_LBZX:
> - regs->gpr[rd] = 0xff;
> - break;
> -
> - case OP_31_XOP_LBZUX:
> - regs->gpr[rd] = 0xff;
> - regs->gpr[ra] += regs->gpr[rb];
> - break;
> -
> - case OP_31_XOP_LHZX:
> - case OP_31_XOP_LHBRX:
> - regs->gpr[rd] = 0xffff;
> - break;
> -
> - case OP_31_XOP_LHZUX:
> - regs->gpr[rd] = 0xffff;
> - regs->gpr[ra] += regs->gpr[rb];
> - break;
> -
> - case OP_31_XOP_LHAX:
> - regs->gpr[rd] = ~0UL;
> - break;
> -
> - case OP_31_XOP_LHAUX:
> - regs->gpr[rd] = ~0UL;
> - regs->gpr[ra] += regs->gpr[rb];
> - break;
> -
> - default:
> - return 0;
> - }
> - break;
> -
> - case OP_LWZ:
> - regs->gpr[rd] = 0xffffffff;
> - break;
> -
> - case OP_LWZU:
> - regs->gpr[rd] = 0xffffffff;
> - regs->gpr[ra] += (s16)d;
> - break;
> -
> - case OP_LBZ:
> - regs->gpr[rd] = 0xff;
> - break;
> -
> - case OP_LBZU:
> - regs->gpr[rd] = 0xff;
> - regs->gpr[ra] += (s16)d;
> - break;
> -
> - case OP_LHZ:
> - regs->gpr[rd] = 0xffff;
> - break;
> -
> - case OP_LHZU:
> - regs->gpr[rd] = 0xffff;
> - regs->gpr[ra] += (s16)d;
> - break;
> -
> - case OP_LHA:
> - regs->gpr[rd] = ~0UL;
> - break;
> -
> - case OP_LHAU:
> - regs->gpr[rd] = ~0UL;
> - regs->gpr[ra] += (s16)d;
> - break;
> -
> - default:
> - return 0;
> - }
> -
> - return 1;
> -}
> -
> -static int is_in_pci_mem_space(phys_addr_t addr)
> -{
> - struct pci_controller *hose;
> - struct resource *res;
> - int i;
> -
> - list_for_each_entry(hose, &hose_list, list_node) {
> - if (!(hose->indirect_type & PPC_INDIRECT_TYPE_EXT_REG))
> - continue;
> -
> - for (i = 0; i < 3; i++) {
> - res = &hose->mem_resources[i];
> - if ((res->flags & IORESOURCE_MEM) &&
> - addr >= res->start && addr <= res->end)
> - return 1;
> - }
> - }
> - return 0;
> -}
> -
> -int fsl_pci_mcheck_exception(struct pt_regs *regs)
> -{
> - u32 inst;
> - int ret;
> - phys_addr_t addr = 0;
> -
> - /* Let KVM/QEMU deal with the exception */
> - if (regs->msr & MSR_GS)
> - return 0;
> -
> -#ifdef CONFIG_PHYS_64BIT
> - addr = mfspr(SPRN_MCARU);
> - addr <<= 32;
> -#endif
> - addr += mfspr(SPRN_MCAR);
> -
> - if (is_in_pci_mem_space(addr)) {
> - if (user_mode(regs)) {
> - pagefault_disable();
> - ret = get_user(regs->nip, &inst);
> - pagefault_enable();
> - } else {
> - ret = probe_kernel_address(regs->nip, inst);
> - }
> -
> - if (mcheck_handle_load(regs, inst)) {
> - regs->nip += 4;
> - return 1;
> - }
> - }
> -
> - return 0;
> -}
> -#endif
> -
> -#if defined(CONFIG_FSL_SOC_BOOKE) || defined(CONFIG_PPC_86xx)
>  static const struct of_device_id pci_ids[] = {
>   { .compatible = "fsl,mpc8540-pci", },
>   { .compatible = "fsl,mpc8548-pcie", },
> @@ -1050,40 +496,6 @@ static const struct of_device_id pci_ids[] = {
>   {},
>  };
>
> -struct device_node *fsl_pci_primary;
> -
> -void fsl_pci_assign_primary(void)
> -{
> - struct device_node *np;
> -
> - /* Callers can specify the primary bus using other means. */
> - if (fsl_pci_primary)
> - return;
> -
> - /* If a PCI host bridge contains an ISA node, it's primary. */
> - np = of_find_node_by_type(NULL, "isa");
> - while ((fsl_pci_primary = of_get_parent(np))) {
> - of_node_put(np);
> - np = fsl_pci_primary;
> -
> - if (of_match_node(pci_ids, np) && of_device_is_available(np))
> - return;
> - }
> -
> - /*
> - * If there's no PCI host bridge with ISA, arbitrarily
> - * designate one as primary.  This can go away once
> - * various bugs with primary-less systems are fixed.
> - */
> - for_each_matching_node(np, pci_ids) {
> - if (of_device_is_available(np)) {
> - fsl_pci_primary = np;
> - of_node_put(np);
> - return;
> - }
> - }
> -}
> -
>  static int fsl_pci_probe(struct platform_device *pdev)
>  {
>   int ret;
> @@ -1143,4 +555,3 @@ static int __init fsl_pci_init(void)
>   return platform_driver_register(&fsl_pci_driver);
>  }
>  arch_initcall(fsl_pci_init);
> -#endif
> diff --git a/arch/powerpc/sysdev/fsl_pci.h b/include/linux/fsl/pci-common.h
> similarity index 79%
> copy from arch/powerpc/sysdev/fsl_pci.h
> copy to include/linux/fsl/pci-common.h
> index 8d455df..5e4f683 100644
> --- a/arch/powerpc/sysdev/fsl_pci.h
> +++ b/include/linux/fsl/pci-common.h
> @@ -1,5 +1,5 @@
>  /*
> - * MPC85xx/86xx PCI Express structure define
> + * MPC85xx/86xx/LS PCI Express structure define
>   *
>   * Copyright 2007,2011 Freescale Semiconductor, Inc
>   *
> @@ -11,15 +11,8 @@
>   */
>
>  #ifdef __KERNEL__
> -#ifndef __POWERPC_FSL_PCI_H
> -#define __POWERPC_FSL_PCI_H
> -
> -struct platform_device;
> -
> -
> -/* FSL PCI controller BRR1 register */
> -#define PCI_FSL_BRR1      0xbf8
> -#define PCI_FSL_BRR1_VER 0xffff
> +#ifndef __PCI_COMMON_H
> +#define __PCI_COMMON_H
>
>  #define PCIE_LTSSM 0x0404 /* PCIE Link Training and Status */
>  #define PCIE_LTSSM_L0 0x16 /* L0 state */
> @@ -52,7 +45,7 @@ struct pci_inbound_window_regs {
>   u8 res2[12];
>  };
>
> -/* PCI/PCI Express IO block registers for 85xx/86xx */
> +/* PCI/PCI Express IO block registers for 85xx/86xx/LS */
>  struct ccsr_pci {
>   __be32 config_addr; /* 0x.000 - PCI/PCIE Configuration Address Register */
>   __be32 config_data; /* 0x.004 - PCI/PCIE Configuration Data Register */
> @@ -109,33 +102,5 @@ struct ccsr_pci {
>
>  };
>
> -extern int fsl_add_bridge(struct platform_device *pdev, int is_primary);
> -extern void fsl_pcibios_fixup_bus(struct pci_bus *bus);
> -extern int mpc83xx_add_bridge(struct device_node *dev);
> -u64 fsl_pci_immrbar_base(struct pci_controller *hose);
> -
> -extern struct device_node *fsl_pci_primary;
> -
> -#ifdef CONFIG_PCI
> -void fsl_pci_assign_primary(void);
> -#else
> -static inline void fsl_pci_assign_primary(void) {}
> -#endif
> -
> -#ifdef CONFIG_EDAC_MPC85XX
> -int mpc85xx_pci_err_probe(struct platform_device *op);
> -#else
> -static inline int mpc85xx_pci_err_probe(struct platform_device *op)
> -{
> - return -ENOTSUPP;
> -}
> -#endif
> -
> -#ifdef CONFIG_FSL_PCI
> -extern int fsl_pci_mcheck_exception(struct pt_regs *);
> -#else
> -static inline int fsl_pci_mcheck_exception(struct pt_regs *regs) {return 0; }
> -#endif
> -
> -#endif /* __POWERPC_FSL_PCI_H */
> +#endif /* __PCI_COMMON_H */
>  #endif /* __KERNEL__ */
> --
> 1.8.1.2
>
>

^ permalink raw reply

* RE: [PATCH 1/7] powerpc: Add interface to get msi region information
From: Bhushan Bharat-R65777 @ 2013-10-08 17:09 UTC (permalink / raw)
  To: joro@8bytes.org, Bjorn Helgaas
  Cc: agraf@suse.de, Wood Scott-B07421, linux-pci@vger.kernel.org,
	iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org,
	alex.williamson@redhat.com, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20131008170228.GD17455@8bytes.org>



> -----Original Message-----
> From: joro@8bytes.org [mailto:joro@8bytes.org]
> Sent: Tuesday, October 08, 2013 10:32 PM
> To: Bjorn Helgaas
> Cc: Bhushan Bharat-R65777; alex.williamson@redhat.com;
> benh@kernel.crashing.org; galak@kernel.crashing.org;
> linux-kernel@vger.kernel.org; linuxppc-dev@lists.ozlabs.org;
> linux-pci@vger.kernel.org; agraf@suse.de; Wood Scott-B07421;
> iommu@lists.linux-foundation.org
> Subject: Re: [PATCH 1/7] powerpc: Add interface to get msi region information
>
> On Tue, Oct 08, 2013 at 10:47:49AM -0600, Bjorn Helgaas wrote:
> > I still have no idea what an "aperture type IOMMU" is, other than that
> > it is "different."
>
> An aperture based IOMMU is basically any GART-like IOMMU which can only remap a
> small window (the aperture) of the DMA address space. DMA outside of that window
> is either blocked completely or passed through untranslated.

It is completely blocked for Freescale PAMU.
So for this type of IOMMU we have to create an MSI mapping just after the
guest physical address range. For example, if the guest has 512M of memory,
we create a window of 1G (because of the power-of-2 requirement) and then
have to fit the MSI mapping just after the guest's 512M.
For that we need:
	1) To know the physical address of the MSIs in the interrupt controller
	   (that is what this patch is about).

	2) When the guest enables an MSI interrupt, we write the MSI address and
	   MSI data to the device. The discussion with Alex Williamson is about
	   that interface.
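
The window arithmetic above can be sketched in C roughly as follows. This is
only an illustration under stated assumptions, not code from the patch: it
assumes guest RAM is mapped starting at I/O virtual address 0, that the
window size only has to be a power of two, and that the MSI region goes
directly after guest RAM. The names roundup_pow2(), plan_msi_window() and
struct msi_layout are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next power of two. Valid for 0 < v <= 2^63. */
static uint64_t roundup_pow2(uint64_t v)
{
	uint64_t p = 1;

	assert(v != 0 && v <= (1ULL << 63));
	while (p < v)
		p <<= 1;
	return p;
}

/* Hypothetical result type: where things end up inside the window. */
struct msi_layout {
	uint64_t window_size;	/* power-of-two window covering RAM + MSI */
	uint64_t msi_iova;	/* I/O virtual address chosen for the MSI region */
};

/*
 * Hypothetical helper: place the MSI region just after guest RAM and size
 * the window to the next power of two that covers both.
 * Assumes guest RAM starts at I/O virtual address 0.
 * Returns 0 on success, -1 on invalid or oversized input.
 */
static int plan_msi_window(uint64_t guest_ram, uint64_t msi_size,
			   struct msi_layout *out)
{
	const uint64_t max = 1ULL << 63;

	if (guest_ram == 0 || msi_size == 0)
		return -1;
	if (msi_size > max || guest_ram > max - msi_size)
		return -1;

	out->window_size = roundup_pow2(guest_ram + msi_size);
	out->msi_iova = guest_ram;
	return 0;
}
```

With guest_ram = 512M and a 4K MSI region, this yields a 1G window with the
MSI region at offset 512M, which matches the example above. The real
placement also depends on other mappings, which this sketch does not model.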

Thanks
-Bharat

>
>
> 	Joerg
>
>

^ permalink raw reply

* Re: [PATCH 1/7] powerpc: Add interface to get msi region information
From: joro @ 2013-10-08 17:02 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: agraf@suse.de, Wood Scott-B07421, linux-pci@vger.kernel.org,
	iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org,
	alex.williamson@redhat.com, Bhushan Bharat-R65777,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <CAErSpo742BOxzxRaFQn+UnsNday4_8LM6+3G6=cfp9DHecPxDg@mail.gmail.com>

On Tue, Oct 08, 2013 at 10:47:49AM -0600, Bjorn Helgaas wrote:
> I still have no idea what an "aperture type IOMMU" is,
> other than that it is "different."

An aperture based IOMMU is basically any GART-like IOMMU which can only
remap a small window (the aperture) of the DMA address space. DMA
outside of that window is either blocked completely or passed through
untranslated.
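
As a rough illustration of that behaviour, here is a minimal sketch in C. It
is a simplified model, not any real driver's code: it assumes a single
aperture that maps linearly onto one physical range (a real GART remaps page
by page inside the aperture), and struct aperture, enum outside_policy and
aperture_translate() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* What happens to DMA that falls outside the aperture. */
enum outside_policy {
	OUTSIDE_BLOCK,		/* access is rejected */
	OUTSIDE_PASSTHROUGH,	/* address is used untranslated */
};

/* Hypothetical model of a single-window (aperture) IOMMU. */
struct aperture {
	uint64_t iova_base;	/* start of the aperture in DMA address space */
	uint64_t size;		/* aperture size in bytes */
	uint64_t phys_base;	/* physical address the aperture maps to */
	enum outside_policy policy;
};

/*
 * Translate a DMA address. Returns true and stores the physical address
 * in *pa if the access is allowed, false if it is blocked.
 */
static bool aperture_translate(const struct aperture *ap, uint64_t iova,
			       uint64_t *pa)
{
	assert(ap && pa);

	/* Inside the aperture: remap (linear offset in this simplified model). */
	if (iova >= ap->iova_base && iova - ap->iova_base < ap->size) {
		*pa = ap->phys_base + (iova - ap->iova_base);
		return true;
	}

	/* Outside the aperture: either pass through untranslated or block. */
	if (ap->policy == OUTSIDE_PASSTHROUGH) {
		*pa = iova;
		return true;
	}
	return false;
}
```

A paging IOMMU, by contrast, can map any page of the DMA address space
independently, so it has no "outside the window" case to handle.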


	Joerg

^ permalink raw reply

* Re: [PATCH 1/7] powerpc: Add interface to get msi region information
From: Scott Wood @ 2013-10-08 17:09 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: agraf@suse.de, Wood Scott-B07421, joro@8bytes.org,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	iommu@lists.linux-foundation.org, alex.williamson@redhat.com,
	Bhushan Bharat-R65777, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <CAErSpo742BOxzxRaFQn+UnsNday4_8LM6+3G6=cfp9DHecPxDg@mail.gmail.com>

On Tue, 2013-10-08 at 10:47 -0600, Bjorn Helgaas wrote:
> On Thu, Oct 3, 2013 at 11:19 PM, Bhushan Bharat-R65777
> <R65777@freescale.com> wrote:
> 
> >> I don't know enough about VFIO to understand why these new interfaces are
> >> needed.  Is this the first VFIO IOMMU driver?  I see vfio_iommu_spapr_tce.c and
> >> vfio_iommu_type1.c but I don't know if they're comparable to the Freescale PAMU.
> >> Do other VFIO IOMMU implementations support MSI?  If so, do they handle the
> >> problem of mapping the MSI regions in a different way?
> >
> > > PAMU is an aperture type of IOMMU while the others are paging type, so they are completely different from PAMU and handle this differently.
> 
> This is not an explanation or a justification for adding new
> interfaces.  I still have no idea what an "aperture type IOMMU" is,
> other than that it is "different."  But I see that Alex is working on
> this issue with you in a different thread, so I'm sure you guys will
> sort it out.

PAMU is a very constrained IOMMU that cannot do arbitrary page mappings.
Due to these constraints, we cannot map the MSI I/O page at its normal
address while also mapping RAM at the address we want.  The address we
can map it at depends on the addresses of other mappings, so it can't be
hidden in the IOMMU driver -- the user needs to be in control.

Another difference is that (if I understand correctly) PCs handle MSIs
specially, via interrupt remapping, rather than being translated as a
normal memory access through the IOMMU.

-Scott

^ permalink raw reply

