LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH] irqdomain: Allow quiet failure mode
From: Grant Likely @ 2013-05-06 11:07 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev@lists.ozlabs.org
In-Reply-To: <1367822494.15842.12.camel@pasglop>

On Mon, May 6, 2013 at 7:41 AM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> Some interrupt controllers refuse to map interrupts marked as
> "protected" by firwmare. Since we try to map everyting in the
> device-tree on some platforms, we end up with a lot of nasty
> WARN's in the boot log for what is a normal situation on those
> machines.
>
> This defines a specific return code (-EPERM) from the host map()
> callback which cause irqdomain to fail silently.
>
> MPIC is updated to return this when hitting a protected source
> printing only a single line message for diagnostic purposes.
>
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>
>
> Grant, I'm happy to merge that myself provided I have your ack,
> it fixes an annoying problem on Cell introduced by previous changes
> to irqdomain (not recent but users started to notice).

Yes, please merge it since powerpc needs it.

Acked-by: Grant Likely <grant.likely@linaro.org>

g.

^ permalink raw reply

* Re: [PATCH] powerpc, perf: Fix processing conditions for invalid BHRB entries
From: Michael Neuling @ 2013-05-06 11:11 UTC (permalink / raw)
  To: Anshuman Khandual; +Cc: linuxppc-dev, linux-kernel
In-Reply-To: <1367831206-16331-1-git-send-email-khandual@linux.vnet.ibm.com>

Anshuman Khandual <khandual@linux.vnet.ibm.com> wrote:

> Fixing some conditions during BHRB entry processing.

I think we can simplify this a lot more... something like the below.

Also, this marks the "to" address as all 1s, which is better poison
value since it's possible to branch to/from 0x0.

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index c627843..d410d65 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -1463,65 +1463,45 @@ void power_pmu_bhrb_read(struct cpu_hw_events *cpuhw)
 {
 	u64 val;
 	u64 addr;
-	int r_index, u_index, target, pred;
+	int r_index, u_index, pred;
 
 	r_index = 0;
 	u_index = 0;
 	while (r_index < ppmu->bhrb_nr) {
 		/* Assembly read function */
-		val = read_bhrb(r_index);
+		val = read_bhrb(r_index++);
 
 		/* Terminal marker: End of valid BHRB entries */
-		if (val == 0) {
+		if (!val) {
 			break;
 		} else {
 			/* BHRB field break up */
 			addr = val & BHRB_EA;
 			pred = val & BHRB_PREDICTION;
-			target = val & BHRB_TARGET;
 
-			/* Probable Missed entry: Not applicable for POWER8 */
-			if ((addr == 0) && (target == 0) && (pred == 1)) {
-				r_index++;
+			if (!addr)
+				/* invalid entry */
 				continue;
-			}
-
-			/* Real Missed entry: Power8 based missed entry */
-			if ((addr == 0) && (target == 1) && (pred == 1)) {
-				r_index++;
-				continue;
-			}
 
-			/* Reserved condition: Not a valid entry  */
-			if ((addr == 0) && (target == 1) && (pred == 0)) {
-				r_index++;
-				continue;
-			}
-
-			/* Is a target address */
 			if (val & BHRB_TARGET) {
 				/* First address cannot be a target address */
-				if (r_index == 0) {
-					r_index++;
+				if (r_index == 0)
 					continue;
-				}
 
 				/* Update target address for the previous entry */
 				cpuhw->bhrb_entries[u_index - 1].to = addr;
 				cpuhw->bhrb_entries[u_index - 1].mispred = pred;
 				cpuhw->bhrb_entries[u_index - 1].predicted = ~pred;
-
-				/* Dont increment u_index */
-				r_index++;
 			} else {
 				/* Update address, flags for current entry */
 				cpuhw->bhrb_entries[u_index].from = addr;
+				cpuhw->bhrb_entries[u_index].to =
+					0xffffffffffffffff;
 				cpuhw->bhrb_entries[u_index].mispred = pred;
 				cpuhw->bhrb_entries[u_index].predicted = ~pred;
 
 				/* Successfully popullated one entry */
 				u_index++;
-				r_index++;
 			}
 		}
 	}

^ permalink raw reply related

* [PATCH] powerpc: Fix single step emulation of 32bit overflowed branches
From: Michael Neuling @ 2013-05-06 11:32 UTC (permalink / raw)
  To: benh; +Cc: Linux PPC dev, Paul Mackerras

Check truncate_if_32bit() on final write to nip.

Signed-off-by: Michael Neuling <mikey@neuling.org>

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index e15c521..99c7fc1 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -580,7 +580,7 @@ int __kprobes emulate_step(struct pt_regs *regs, unsigned int instr)
 		if (instr & 1)
 			regs->link = regs->nip;
 		if (branch_taken(instr, regs))
-			regs->nip = imm;
+			regs->nip = truncate_if_32bit(regs->msr, imm);
 		return 1;
 #ifdef CONFIG_PPC64
 	case 17:	/* sc */

^ permalink raw reply related

* Re: [PATCH] powerpc, perf: Clear out branch entries to avoid any previous stale values
From: Michael Ellerman @ 2013-05-06 12:53 UTC (permalink / raw)
  To: Anshuman Khandual; +Cc: linuxppc-dev, mikey, linux-kernel
In-Reply-To: <1367827480-5375-1-git-send-email-khandual@linux.vnet.ibm.com>

On Mon, 2013-05-06 at 13:34 +0530, Anshuman Khandual wrote:
> The 'to' field inside branch entries might contain stale values from previous
> PMU interrupt instances which had indirect branches. So clear all the values
> before reading a fresh set of BHRB entries after a PMU interrupt.
> 
> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
> ---
>  arch/powerpc/perf/core-book3s.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index c627843..09db68d 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -1590,6 +1590,8 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
>  		if (event->attr.sample_type & PERF_SAMPLE_BRANCH_STACK) {
>  			struct cpu_hw_events *cpuhw;
>  			cpuhw = &__get_cpu_var(cpu_hw_events);
> +			memset(cpuhw->bhrb_entries, 0,
> +				sizeof(struct perf_branch_entry) * BHRB_MAX_ENTRIES);
>  			power_pmu_bhrb_read(cpuhw);
>  			data.br_stack = &cpuhw->bhrb_stack;
>  		}

Wouldn't it be just as effective, and less overhead, to set .to = 0; in
the else branch in power_pmu_bhrb_read() ?

eg:

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index c627843..30af11a4 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -1516,6 +1516,7 @@ void power_pmu_bhrb_read(struct cpu_hw_events *cpuhw)
                        } else {
                                /* Update address, flags for current entry */
                                cpuhw->bhrb_entries[u_index].from = addr;
+                               cpuhw->bhrb_entries[u_index].to = 0;
                                cpuhw->bhrb_entries[u_index].mispred = pred;
                                cpuhw->bhrb_entries[u_index].predicted = ~pred;
 

cheers

^ permalink raw reply related

* [PATCH 1/3] PCI: PCI_BUS_FLAGS_NO_IO flag for PCI bus
From: Gavin Shan @ 2013-05-06 13:44 UTC (permalink / raw)
  To: linuxppc-dev, linux-pci; +Cc: bhelgaas, yinghai, Gavin Shan

There has some PCI hosts (e.g. PHB3) that don't support IO. So the
patch intends to introduce the flag to indicate the special case.
In turn, we won't do IO resource assignment and enable that.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 drivers/pci/pci.c       |    3 +++
 drivers/pci/setup-bus.c |   25 ++++++++++++++++---------
 include/linux/pci.h     |    1 +
 3 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index a899d8b..2548bad 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1218,6 +1218,9 @@ int pci_enable_device_mem(struct pci_dev *dev)
  */
 int pci_enable_device(struct pci_dev *dev)
 {
+	if (dev->bus->bus_flags & PCI_BUS_FLAGS_NO_IO)
+		return pci_enable_device_flags(dev, IORESOURCE_MEM);
+
 	return pci_enable_device_flags(dev, IORESOURCE_MEM | IORESOURCE_IO);
 }
 
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 16abaaa..c8293e90 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -133,6 +133,10 @@ static void pdev_sort_resources(struct pci_dev *dev, struct list_head *head)
 
 		r = &dev->resource[i];
 
+		if ((dev->bus->bus_flags & PCI_BUS_FLAGS_NO_IO) &&
+		    (r->flags & IORESOURCE_IO))
+			continue;
+
 		if (r->flags & IORESOURCE_PCI_FIXED)
 			continue;
 
@@ -1083,17 +1087,20 @@ static void __ref __pci_bus_size_bridges(struct pci_bus *bus,
 			additional_io_size  = pci_hotplug_io_size;
 			additional_mem_size = pci_hotplug_mem_size;
 		}
+
+		/* Follow thru */
+	default:
+		if (!(bus->bus_flags & PCI_BUS_FLAGS_NO_IO))
+			pbus_size_io(bus,
+				     realloc_head ? 0 : additional_io_size,
+				     additional_io_size, realloc_head);
 		/*
-		 * Follow thru
+		 * If the bridge supports prefetchable range, size it
+		 * separately. If it doesn't, or its prefetchable window
+		 * has already been allocated by arch code, try
+		 * non-prefetchable range for both types of PCI memory
+		 * resources.
 		 */
-	default:
-		pbus_size_io(bus, realloc_head ? 0 : additional_io_size,
-			     additional_io_size, realloc_head);
-		/* If the bridge supports prefetchable range, size it
-		   separately. If it doesn't, or its prefetchable window
-		   has already been allocated by arch code, try
-		   non-prefetchable range for both types of PCI memory
-		   resources. */
 		mask = IORESOURCE_MEM;
 		prefmask = IORESOURCE_MEM | IORESOURCE_PREFETCH;
 		if (pbus_size_mem(bus, prefmask, prefmask,
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 3a24e4f..53b595a 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -181,6 +181,7 @@ typedef unsigned short __bitwise pci_bus_flags_t;
 enum pci_bus_flags {
 	PCI_BUS_FLAGS_NO_MSI   = (__force pci_bus_flags_t) 1,
 	PCI_BUS_FLAGS_NO_MMRBC = (__force pci_bus_flags_t) 2,
+	PCI_BUS_FLAGS_NO_IO    = (__force pci_bus_flags_t) 4
 };
 
 /* Based on the PCI Hotplug Spec, but some values are made up by us */
-- 
1.7.5.4

^ permalink raw reply related

* [PATCH 3/3] powerpc/powernv: Don't configure IO window on PHB3
From: Gavin Shan @ 2013-05-06 13:44 UTC (permalink / raw)
  To: linuxppc-dev, linux-pci; +Cc: bhelgaas, yinghai, Gavin Shan
In-Reply-To: <1367847858-6506-1-git-send-email-shangw@linux.vnet.ibm.com>

We needn't configure IO windows for the corresponding PEs on PHB3
since that doesn't support IO.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 0c3fa29..b4f3edb 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -894,7 +894,9 @@ static void pnv_ioda_setup_pe_seg(struct pci_controller *hose,
 		    res->start > res->end)
 			continue;
 
-		if (res->flags & IORESOURCE_IO) {
+		/* We needn't setup IO windows for PHB3 */
+		if (!(pe->pbus->bus_flags & PCI_BUS_FLAGS_NO_IO) &&
+		    res->flags & IORESOURCE_IO) {
 			region.start = res->start - phb->ioda.io_pci_base;
 			region.end   = res->end - phb->ioda.io_pci_base;
 			index = region.start / phb->ioda.io_segsize;
-- 
1.7.5.4

^ permalink raw reply related

* [PATCH 2/3] powerpc/powernv: Disable IO space for PCI buses
From: Gavin Shan @ 2013-05-06 13:44 UTC (permalink / raw)
  To: linuxppc-dev, linux-pci; +Cc: bhelgaas, yinghai, Gavin Shan
In-Reply-To: <1367847858-6506-1-git-send-email-shangw@linux.vnet.ibm.com>

The patch intends to set the special flag (PCI_BUS_FLAGS_NO_IO) for
root buses so PCI core will skip assignment for IO stuff. Besides,
we also clear the IO resources on all PCI devices for PHB3.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/machdep.h        |    3 ++
 arch/powerpc/kernel/pci-common.c          |    7 +++++
 arch/powerpc/platforms/powernv/pci-ioda.c |   38 +++++++++++++++++++++++++++++
 3 files changed, 48 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index 3f3f691..ebc2ffd 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -220,6 +220,9 @@ struct machdep_calls {
 	/* Called during PCI resource reassignment */
 	resource_size_t (*pcibios_window_alignment)(struct pci_bus *, unsigned long type);
 
+	/* Called when adding PCI bus */
+	void (*pcibios_add_bus)(struct pci_bus *);
+
 	/* Called to shutdown machine specific hardware not already controlled
 	 * by other drivers.
 	 */
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index f325dc9..7b8a6f1 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -120,6 +120,13 @@ resource_size_t pcibios_window_alignment(struct pci_bus *bus,
 	return 1;
 }
 
+/* The function will be called while adding PCI bus to system */
+void pcibios_add_bus(struct pci_bus *bus)
+{
+	if (ppc_md.pcibios_add_bus)
+		ppc_md.pcibios_add_bus(bus);
+}
+
 static resource_size_t pcibios_io_size(const struct pci_controller *hose)
 {
 #ifdef CONFIG_PPC64
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 8c6c9cf..0c3fa29 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1015,6 +1015,40 @@ static resource_size_t pnv_pci_window_alignment(struct pci_bus *bus,
 	return phb->ioda.io_segsize;
 }
 
+/*
+ * The function will be called while adding PCI bus to the
+ * system. In turn, we should set flag to indicate that the
+ * root bus doesn't have IO resources.
+ */
+static void pnv_pci_add_bus(struct pci_bus *bus)
+{
+	struct pci_controller *hose = pci_bus_to_host(bus);
+	struct pnv_phb *phb = hose->private_data;
+
+	/*
+	 * We only need set the flag for root bus since the
+	 * bus flags are copied over from parent to children
+	 */
+	if (pci_is_root_bus(bus) &&
+	    phb->model == PNV_PHB_MODEL_PHB3)
+		bus->bus_flags |= PCI_BUS_FLAGS_NO_IO;
+}
+
+static void pnv_pci_fixup_resources(struct pci_dev *dev)
+{
+	int i;
+	struct resource *res;
+
+	if (!(dev->bus->bus_flags & PCI_BUS_FLAGS_NO_IO))
+		return;
+
+	for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) {
+		res = dev->resource + i;
+		if (res->flags & IORESOURCE_IO)
+			res->flags = 0;
+	}
+}
+
 /* Prevent enabling devices for which we couldn't properly
  * assign a PE
  */
@@ -1189,6 +1223,10 @@ void __init pnv_pci_init_ioda_phb(struct device_node *np, int ioda_type)
 	ppc_md.pcibios_fixup = pnv_pci_ioda_fixup;
 	ppc_md.pcibios_enable_device_hook = pnv_pci_enable_device_hook;
 	ppc_md.pcibios_window_alignment = pnv_pci_window_alignment;
+	if (ioda_type == PNV_PHB_IODA2) {
+		ppc_md.pcibios_add_bus = pnv_pci_add_bus;
+		ppc_md.pcibios_fixup_resources = pnv_pci_fixup_resources;
+	}
 	pci_add_flags(PCI_REASSIGN_ALL_RSRC);
 
 	/* Reset IODA tables to a clean state */
-- 
1.7.5.4

^ permalink raw reply related

* Re: [PATCHv5 0/2] Speed Cap fixes for ppc64
From: Alex Deucher @ 2013-05-06 14:32 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: dri-devel, Kleber Sacilotto de Souza, Bjorn Helgaas,
	Jerome Glisse, Thadeu Lima de Souza Cascardo, Brian King,
	Alex Deucher, linuxppc-dev
In-Reply-To: <1367622108.4389.129.camel@pasglop>

On Fri, May 3, 2013 at 7:01 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Fri, 2013-05-03 at 19:43 -0300, Kleber Sacilotto de Souza wrote:
>
>> This patch series does:
>>   1. max_bus_speed is used to set the device to gen2 speeds
>>   2. on power there's no longer a conflict between the pseries call and other
>> architectures, because the overwrite is done via a ppc_md hook
>>   3. radeon is using bus->max_bus_speed instead of drm_pcie_get_speed_cap_mask
>> for gen2 capability detection
>>
>> The first patch consists of some architecture changes, such as adding a hook on
>> powerpc for pci_root_bridge_prepare, so that pseries will initialize it to a
>> function, while all other architectures get a NULL pointer. So that whenever
>> pci_create_root_bus is called, we'll get max_bus_speed properly setup from
>> OpenFirmware.
>>
>> The second patch consists of simple radeon changes not to call
>> drm_get_pcie_speed_cap_mask anymore. I assume that on x86 machines,
>> the max_bus_speed property will be properly set already.
>
> So I'm ok with the approach now and I might even put the powerpc patch
> in for 3.10 since arguably we are fixing a nasty bug (uninitialized
> max_bus_speed).
>
> David, what's your feeling about the radeon change ? It would be nice if
> that could go in soon for various distro targets :-) On the other hand
> I'm not going to be pushy if you are not comfortable with it.

FWIW, the radeon change looks fine to me.

Alex

^ permalink raw reply

* Re: [PATCHv5 0/2] Speed Cap fixes for ppc64
From: Jerome Glisse @ 2013-05-06 14:33 UTC (permalink / raw)
  To: Alex Deucher
  Cc: dri-devel, Brian King, Jerome Glisse,
	Thadeu Lima de Souza Cascardo, Bjorn Helgaas, Alex Deucher,
	linuxppc-dev
In-Reply-To: <CADnq5_M9n0q4CucEhNqrJBjHsTh19ASsYr=nsUAJA_-mRr8qvQ@mail.gmail.com>

n Mon, May 6, 2013 at 10:32 AM, Alex Deucher <alexdeucher@gmail.com> wrote:
> On Fri, May 3, 2013 at 7:01 PM, Benjamin Herrenschmidt
> <benh@kernel.crashing.org> wrote:
>> On Fri, 2013-05-03 at 19:43 -0300, Kleber Sacilotto de Souza wrote:
>>
>>> This patch series does:
>>>   1. max_bus_speed is used to set the device to gen2 speeds
>>>   2. on power there's no longer a conflict between the pseries call and other
>>> architectures, because the overwrite is done via a ppc_md hook
>>>   3. radeon is using bus->max_bus_speed instead of drm_pcie_get_speed_cap_mask
>>> for gen2 capability detection
>>>
>>> The first patch consists of some architecture changes, such as adding a hook on
>>> powerpc for pci_root_bridge_prepare, so that pseries will initialize it to a
>>> function, while all other architectures get a NULL pointer. So that whenever
>>> pci_create_root_bus is called, we'll get max_bus_speed properly setup from
>>> OpenFirmware.
>>>
>>> The second patch consists of simple radeon changes not to call
>>> drm_get_pcie_speed_cap_mask anymore. I assume that on x86 machines,
>>> the max_bus_speed property will be properly set already.
>>
>> So I'm ok with the approach now and I might even put the powerpc patch
>> in for 3.10 since arguably we are fixing a nasty bug (uninitialized
>> max_bus_speed).
>>
>> David, what's your feeling about the radeon change ? It would be nice if
>> that could go in soon for various distro targets :-) On the other hand
>> I'm not going to be pushy if you are not comfortable with it.
>
> FWIW, the radeon change looks fine to me.
>
> Alex

As said previously, looks fine to me too.

Cheers,
Jerome

^ permalink raw reply

* Re: [PATCH] arch/powerpc: advertise ISA2.07, HTM, DSCR, EBB and ISEL bits in HWCAP2
From: Ryan Arnold @ 2013-05-06 14:38 UTC (permalink / raw)
  To: Nishanth Aravamudan
  Cc: Michael R Meissner, Steve Munroe, Michael Neuling, Peter Bergner,
	linuxppc-dev
In-Reply-To: <20130503234019.GE8561@linux.vnet.ibm.com>

[-- Attachment #1: Type: text/plain, Size: 2443 bytes --]


Nishanth Aravamudan <nacc@linux.vnet.ibm.com> wrote on 05/03/2013 06:40:19
PM:

> Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
> 05/03/2013 06:40 PM
>
> To
>
> Benjamin Herrenschmidt <benh@kernel.crashing.org>
>
> cc
>
> Michael Neuling <michael.neuling@au1.ibm.com>, Michael R Meissner/
> Cambridge/IBM@IBMUS, Steve Munroe/Rochester/IBM@IBMUS, Peter
> Bergner/Rochester/IBM@IBMUS, Ryan Arnold/Rochester/IBM@IBMUS,
> linuxppc-dev@lists.ozlabs.org
>
> Subject
>
> Re: [PATCH] arch/powerpc: advertise ISA2.07, HTM, DSCR, EBB and ISEL
> bits in HWCAP2
>
> On 04.05.2013 [09:23:51 +1000], Benjamin Herrenschmidt wrote:
> > On Fri, 2013-05-03 at 16:19 -0700, Nishanth Aravamudan wrote:
> > > +/* in AT_HWCAP2 */
> > > +#define PPC_FEATURE2_ARCH_2_07         0x80000000
> > > +#define PPC_FEATURE2_HTM               0x40000000
> > > +#define PPC_FEATURE2_DSCR              0x20000000
> > > +#define PPC_FEATURE2_EBB               0x10000000
> > > +#define PPC_FEATURE2_ISEL              0x08000000
> >
> > Should we "adjust" (ie filter out) some of these based
> > on CONFIG_ options (such as transactional memory enabled,
> > EBB supported by the hypervisor, etc...) ?
>
> Err, yeah, that seems reasonable :) However, it seems like glibc uses
> these values rather directly so it knows what bits to check for each
> feature. Therefore, it seems like it would be better to do the
> ifdeffery/checking in the user in cputable.c, but that seems like it
> could get quite complicated.
>
> Would it be ok (I guess I'm asking Ryan & co. here) to have an #ifdef in
> the definition that may or may not mean the bit is set in the aux
> vector, but the bit, if set, would always be the same bit?

My understanding was that these bits being 'on' is an indication of what
features the hardware supports (or what the kernel emulates) and a not an
indication of whether that facility is currently enabled or not.  If the
hardware supports a particular feature but it is not enabled I'd expect
that user-space usage of that feature would cause the kernel to trap on a
facility availability exception (which is how Altivec/VMX is implemented,
being defaulted to turned off).

Otherwise there's no way I could know whether an ISA [optional] feature is
actually available on a particular machine.

And yes, the bits can't change.  My usage of hwcap.h has to coincide with
the kernel's asm/cputable.h
Ryan

[-- Attachment #2: Type: text/html, Size: 3457 bytes --]

^ permalink raw reply

* Re: [PATCH] powerpc, perf: Fix processing conditions for invalid BHRB entries
From: Anshuman Khandual @ 2013-05-06 12:25 UTC (permalink / raw)
  To: Michael Neuling; +Cc: linuxppc-dev, linux-kernel
In-Reply-To: <24261.1367838705@ale.ozlabs.ibm.com>

On 05/06/2013 04:41 PM, Michael Neuling wrote:
> Anshuman Khandual <khandual@linux.vnet.ibm.com> wrote:
> 
>> Fixing some conditions during BHRB entry processing.
> 
> I think we can simplify this a lot more... something like the below.
> 

I feel that the conditional handling of the invalid BHRB entries should be
present which would help us during the debug and also could be used for more
granular branch classification or error handling later on.

> Also, this marks the "to" address as all 1s, which is better poison
> value since it's possible to branch to/from 0x0.
>

Agreed.


> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index c627843..d410d65 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -1463,65 +1463,45 @@ void power_pmu_bhrb_read(struct cpu_hw_events *cpuhw)
>  {
>  	u64 val;
>  	u64 addr;
> -	int r_index, u_index, target, pred;
> +	int r_index, u_index, pred;
> 
>  	r_index = 0;
>  	u_index = 0;
>  	while (r_index < ppmu->bhrb_nr) {
>  		/* Assembly read function */
> -		val = read_bhrb(r_index);
> +		val = read_bhrb(r_index++);
> 
>  		/* Terminal marker: End of valid BHRB entries */
> -		if (val == 0) {
> +		if (!val) {
>  			break;
>  		} else {
>  			/* BHRB field break up */
>  			addr = val & BHRB_EA;
>  			pred = val & BHRB_PREDICTION;
> -			target = val & BHRB_TARGET;
> 
> -			/* Probable Missed entry: Not applicable for POWER8 */
> -			if ((addr == 0) && (target == 0) && (pred == 1)) {
> -				r_index++;
> +			if (!addr)
> +				/* invalid entry */
>  				continue;
> -			}
> -
> -			/* Real Missed entry: Power8 based missed entry */
> -			if ((addr == 0) && (target == 1) && (pred == 1)) {
> -				r_index++;
> -				continue;
> -			}
> 
> -			/* Reserved condition: Not a valid entry  */
> -			if ((addr == 0) && (target == 1) && (pred == 0)) {
> -				r_index++;
> -				continue;
> -			}
> -
> -			/* Is a target address */
>  			if (val & BHRB_TARGET) {
>  				/* First address cannot be a target address */
> -				if (r_index == 0) {
> -					r_index++;
> +				if (r_index == 0)
>  					continue;
> -				}
> 
>  				/* Update target address for the previous entry */
>  				cpuhw->bhrb_entries[u_index - 1].to = addr;
>  				cpuhw->bhrb_entries[u_index - 1].mispred = pred;
>  				cpuhw->bhrb_entries[u_index - 1].predicted = ~pred;
> -
> -				/* Dont increment u_index */
> -				r_index++;
>  			} else {
>  				/* Update address, flags for current entry */
>  				cpuhw->bhrb_entries[u_index].from = addr;
> +				cpuhw->bhrb_entries[u_index].to =
> +					0xffffffffffffffff;
>  				cpuhw->bhrb_entries[u_index].mispred = pred;
>  				cpuhw->bhrb_entries[u_index].predicted = ~pred;
> 
>  				/* Successfully popullated one entry */
>  				u_index++;
> -				r_index++;
>  			}
>  		}
>  	}
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
> 

^ permalink raw reply

* Re: [v1][KVM][PATCH 1/1] kvm:ppc: enable doorbell exception with E500MC
From: Alexander Graf @ 2013-05-06 14:58 UTC (permalink / raw)
  To: Tiejun Chen; +Cc: linuxppc-dev, kvm, kvm-ppc
In-Reply-To: <1367808836-28635-1-git-send-email-tiejun.chen@windriver.com>

On 05/06/2013 04:53 AM, Tiejun Chen wrote:
> Actually E500MC also support doorbell exception, and CONFIG_PPC_E500MC
> can cover BOOK3E/BOOK3E_64 as well.
>
> Signed-off-by: Tiejun Chen<tiejun.chen@windriver.com>
> ---
>   arch/powerpc/kvm/booke.c |    2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index 1020119..dc1f590 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -795,7 +795,7 @@ static void kvmppc_restart_interrupt(struct kvm_vcpu *vcpu,
>   		kvmppc_fill_pt_regs(&regs);
>   		timer_interrupt(&regs);
>   		break;
> -#if defined(CONFIG_PPC_FSL_BOOK3E) || defined(CONFIG_PPC_BOOK3E_64)
> +#if defined(CONFIG_PPC_E500MC)

I suppose you mean CONFIG_KVM_E500MC here? Why didn't this work for you 
before? The ifdef above should cover the same range of CPUs.


Alex

^ permalink raw reply

* Re: [PATCH] powerpc/pci: Support per-aperture memory offset
From: Bjorn Helgaas @ 2013-05-06 17:55 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Thomas Petazzoni, linux-pci@vger.kernel.org, Andrew Murray,
	Scott Wood, Sethi Varun-B16395, linuxppc-dev
In-Reply-To: <1367823016.15842.22.camel@pasglop>

On Sun, May 5, 2013 at 11:50 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> The PCI core supports an offset per aperture nowadays but our arch
> code still has a single offset per host bridge representing the
> difference betwen CPU memory addresses and PCI MMIO addresses.
>
> This is a problem as new machines and hypervisor versions are
> coming out where the 64-bit windows will have a different offset
> (basically mapped 1:1) from the 32-bit windows.
>
> This fixes it by using separate offsets. In the long run, we probably
> want to get rid of that intermediary struct pci_controller and have
> those directly stored into the pci_host_bridge as they are parsed
> but this will be a more invasive change.
>
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

This doesn't touch the PCI core, but it looks good to me, for whatever
that's worth.

> ---
>
> Now, this is a big one but I'd like to still merge it for 3.10 because
> we're having new machine coming up (and new versions of pHyp on existing
> machines) that are going to expose MMIO windows with different offsets
> (basically our 64-bit windows are 1:1 and our 32-bit windows remapped).
>
> I'm not expecting any major issue with the patch, I've tested it on some
> of our machines here and will test it more during the next couple of days
> the notable thing is the removal of a "workaround" / fallback on 32-bit
> that I suspect only ever mattered for machines long unsupported (PReP ?)
> as I don't think we have anything that doesn't populate the bridge
> resources and expects to work nowadays (other stuff would break anyway).
>
> This is also why I'm NAK'ing the patch making pci_process_bridge_OF_ranges()
> generic since I need to change it for powerpc and this isn't the right
> long term approach (we should "merge" pci_controller & pci_host_bridge
> instead and directly populate the pci_host_bridge apertures).
>
> If I see no major issue with the patch during the next few days, I'll send
> it to Linus with my next pull request, probably at -rc1.
>
> Kumar, Scott, Sethi, please test on FSL HW. I'll take care of macs and 4xx
> in addition to the various IBM ppc64 platforms.
>
> Ben.
>
>  arch/powerpc/include/asm/pci-bridge.h       |    6 +-
>  arch/powerpc/kernel/pci-common.c            |   97 ++++++++-------------------
>  arch/powerpc/kernel/pci_32.c                |    2 +-
>  arch/powerpc/kernel/pci_64.c                |    2 +-
>  arch/powerpc/platforms/embedded6xx/mpc10x.h |   11 ---
>  arch/powerpc/platforms/powermac/pci.c       |    2 +-
>  arch/powerpc/platforms/powernv/pci-ioda.c   |   10 +--
>  arch/powerpc/platforms/wsp/wsp_pci.c        |    2 +-
>  arch/powerpc/sysdev/fsl_pci.c               |   11 +--
>  arch/powerpc/sysdev/ppc4xx_pci.c            |   15 +++--
>  10 files changed, 54 insertions(+), 104 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
> index 0694f73..8b11b5b 100644
> --- a/arch/powerpc/include/asm/pci-bridge.h
> +++ b/arch/powerpc/include/asm/pci-bridge.h
> @@ -39,11 +39,6 @@ struct pci_controller {
>         resource_size_t io_base_phys;
>         resource_size_t pci_io_size;
>
> -       /* Some machines (PReP) have a non 1:1 mapping of
> -        * the PCI memory space in the CPU bus space
> -        */
> -       resource_size_t pci_mem_offset;
> -
>         /* Some machines have a special region to forward the ISA
>          * "memory" cycles such as VGA memory regions. Left to 0
>          * if unsupported
> @@ -86,6 +81,7 @@ struct pci_controller {
>          */
>         struct resource io_resource;
>         struct resource mem_resources[3];
> +       resource_size_t mem_offset[3];
>         int global_number;              /* PCI domain number */
>
>         resource_size_t dma_window_base_cur;
> diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
> index cf00588..f5c5c90 100644
> --- a/arch/powerpc/kernel/pci-common.c
> +++ b/arch/powerpc/kernel/pci-common.c
> @@ -786,22 +786,8 @@ void pci_process_bridge_OF_ranges(struct pci_controller *hose,
>                                 hose->isa_mem_size = size;
>                         }
>
> -                       /* We get the PCI/Mem offset from the first range or
> -                        * the, current one if the offset came from an ISA
> -                        * hole. If they don't match, bugger.
> -                        */
> -                       if (memno == 0 ||
> -                           (isa_hole >= 0 && pci_addr != 0 &&
> -                            hose->pci_mem_offset == isa_mb))
> -                               hose->pci_mem_offset = cpu_addr - pci_addr;
> -                       else if (pci_addr != 0 &&
> -                                hose->pci_mem_offset != cpu_addr - pci_addr) {
> -                               printk(KERN_INFO
> -                                      " \\--> Skipped (offset mismatch) !\n");
> -                               continue;
> -                       }
> -
>                         /* Build resource */
> +                       hose->mem_offset[memno] = cpu_addr - pci_addr;
>                         res = &hose->mem_resources[memno++];
>                         res->flags = IORESOURCE_MEM;
>                         if (pci_space & 0x40000000)
> @@ -817,20 +803,6 @@ void pci_process_bridge_OF_ranges(struct pci_controller *hose,
>                         res->child = NULL;
>                 }
>         }
> -
> -       /* If there's an ISA hole and the pci_mem_offset is -not- matching
> -        * the ISA hole offset, then we need to remove the ISA hole from
> -        * the resource list for that brige
> -        */
> -       if (isa_hole >= 0 && hose->pci_mem_offset != isa_mb) {
> -               unsigned int next = isa_hole + 1;
> -               printk(KERN_INFO " Removing ISA hole at 0x%016llx\n", isa_mb);
> -               if (next < memno)
> -                       memmove(&hose->mem_resources[isa_hole],
> -                               &hose->mem_resources[next],
> -                               sizeof(struct resource) * (memno - next));
> -               hose->mem_resources[--memno].flags = 0;
> -       }
>  }
>
>  /* Decide whether to display the domain number in /proc */
> @@ -916,6 +888,7 @@ static int pcibios_uninitialized_bridge_resource(struct pci_bus *bus,
>         struct pci_controller *hose = pci_bus_to_host(bus);
>         struct pci_dev *dev = bus->self;
>         resource_size_t offset;
> +       struct pci_bus_region region;
>         u16 command;
>         int i;
>
> @@ -925,10 +898,10 @@ static int pcibios_uninitialized_bridge_resource(struct pci_bus *bus,
>
>         /* Job is a bit different between memory and IO */
>         if (res->flags & IORESOURCE_MEM) {
> -               /* If the BAR is non-0 (res != pci_mem_offset) then it's probably been
> -                * initialized by somebody
> -                */
> -               if (res->start != hose->pci_mem_offset)
> +               pcibios_resource_to_bus(dev, &region, res);
> +
> +               /* If the BAR is non-0 then it's probably been initialized */
> +               if (region.start != 0)
>                         return 0;
>
>                 /* The BAR is 0, let's check if memory decoding is enabled on
> @@ -940,11 +913,11 @@ static int pcibios_uninitialized_bridge_resource(struct pci_bus *bus,
>
>                 /* Memory decoding is enabled and the BAR is 0. If any of the bridge
>                  * resources covers that starting address (0 then it's good enough for
> -                * us for memory
> +                * us for memory space)
>                  */
>                 for (i = 0; i < 3; i++) {
>                         if ((hose->mem_resources[i].flags & IORESOURCE_MEM) &&
> -                           hose->mem_resources[i].start == hose->pci_mem_offset)
> +                           hose->mem_resources[i].start == hose->mem_offset[i])
>                                 return 0;
>                 }
>
> @@ -1381,10 +1354,9 @@ static void __init pcibios_reserve_legacy_regions(struct pci_bus *bus)
>
>   no_io:
>         /* Check for memory */
> -       offset = hose->pci_mem_offset;
> -       pr_debug("hose mem offset: %016llx\n", (unsigned long long)offset);
>         for (i = 0; i < 3; i++) {
>                 pres = &hose->mem_resources[i];
> +               offset = hose->mem_offset[i];
>                 if (!(pres->flags & IORESOURCE_MEM))
>                         continue;
>                 pr_debug("hose mem res: %pR\n", pres);
> @@ -1524,6 +1496,7 @@ static void pcibios_setup_phb_resources(struct pci_controller *hose,
>                                         struct list_head *resources)
>  {
>         struct resource *res;
> +       resource_size_t offset;
>         int i;
>
>         /* Hookup PHB IO resource */
> @@ -1533,51 +1506,37 @@ static void pcibios_setup_phb_resources(struct pci_controller *hose,
>                 printk(KERN_WARNING "PCI: I/O resource not set for host"
>                        " bridge %s (domain %d)\n",
>                        hose->dn->full_name, hose->global_number);
> -#ifdef CONFIG_PPC32
> -               /* Workaround for lack of IO resource only on 32-bit */
> -               res->start = (unsigned long)hose->io_base_virt - isa_io_base;
> -               res->end = res->start + IO_SPACE_LIMIT;
> -               res->flags = IORESOURCE_IO;
> -#endif /* CONFIG_PPC32 */
> -       }
> -       if (res->flags) {
> -               pr_debug("PCI: PHB IO resource    = %016llx-%016llx [%lx]\n",
> +       } else {
> +               offset = pcibios_io_space_offset(hose);
> +
> +               pr_debug("PCI: PHB IO resource    = %08llx-%08llx [%lx] off 0x%08llx\n",
>                          (unsigned long long)res->start,
>                          (unsigned long long)res->end,
> -                        (unsigned long)res->flags);
> -               pci_add_resource_offset(resources, res, pcibios_io_space_offset(hose));
> -
> -               pr_debug("PCI: PHB IO  offset     = %08lx\n",
> -                        (unsigned long)hose->io_base_virt - _IO_BASE);
> +                        (unsigned long)res->flags,
> +                        (unsigned long long)offset);

If you use %pR, these will match any similar messages from the PCI core.

> +               pci_add_resource_offset(resources, res, offset);
>         }
>
>         /* Hookup PHB Memory resources */
>         for (i = 0; i < 3; ++i) {
>                 res = &hose->mem_resources[i];
>                 if (!res->flags) {
> -                       if (i > 0)
> -                               continue;
>                         printk(KERN_ERR "PCI: Memory resource 0 not set for "
>                                "host bridge %s (domain %d)\n",
>                                hose->dn->full_name, hose->global_number);

I don't understand what's going on here, but the message no longer
matches the code (we refer to "Memory resource 0," but it might be 0,
1, or 2).

> -#ifdef CONFIG_PPC32
> -                       /* Workaround for lack of MEM resource only on 32-bit */
> -                       res->start = hose->pci_mem_offset;
> -                       res->end = (resource_size_t)-1LL;
> -                       res->flags = IORESOURCE_MEM;
> -#endif /* CONFIG_PPC32 */
> -               }
> -               if (res->flags) {
> -                       pr_debug("PCI: PHB MEM resource %d = %016llx-%016llx [%lx]\n", i,
> -                                (unsigned long long)res->start,
> -                                (unsigned long long)res->end,
> -                                (unsigned long)res->flags);
> -                       pci_add_resource_offset(resources, res, hose->pci_mem_offset);
> +                       continue;
>                 }
> -       }
> +               offset = hose->mem_offset[i];
>
> -       pr_debug("PCI: PHB MEM offset     = %016llx\n",
> -                (unsigned long long)hose->pci_mem_offset);
> +
> +               pr_debug("PCI: PHB MEM resource %d = %08llx-%08llx [%lx] off 0x%08llx\n", i,
> +                        (unsigned long long)res->start,
> +                        (unsigned long long)res->end,
> +                        (unsigned long)res->flags,
> +                        (unsigned long long)offset);
> +
> +               pci_add_resource_offset(resources, res, offset);
> +       }
>  }
>
>  /*
> diff --git a/arch/powerpc/kernel/pci_32.c b/arch/powerpc/kernel/pci_32.c
> index e37c215..432459c 100644
> --- a/arch/powerpc/kernel/pci_32.c
> +++ b/arch/powerpc/kernel/pci_32.c
> @@ -295,7 +295,7 @@ long sys_pciconfig_iobase(long which, unsigned long bus, unsigned long devfn)
>         case IOBASE_BRIDGE_NUMBER:
>                 return (long)hose->first_busno;
>         case IOBASE_MEMORY:
> -               return (long)hose->pci_mem_offset;
> +               return (long)hose->mem_offset[0];
>         case IOBASE_IO:
>                 return (long)hose->io_base_phys;
>         case IOBASE_ISA_IO:
> diff --git a/arch/powerpc/kernel/pci_64.c b/arch/powerpc/kernel/pci_64.c
> index 51a133a..873050d 100644
> --- a/arch/powerpc/kernel/pci_64.c
> +++ b/arch/powerpc/kernel/pci_64.c
> @@ -246,7 +246,7 @@ long sys_pciconfig_iobase(long which, unsigned long in_bus,
>         case IOBASE_BRIDGE_NUMBER:
>                 return (long)hose->first_busno;
>         case IOBASE_MEMORY:
> -               return (long)hose->pci_mem_offset;
> +               return (long)hose->mem_offset[0];
>         case IOBASE_IO:
>                 return (long)hose->io_base_phys;
>         case IOBASE_ISA_IO:
> diff --git a/arch/powerpc/platforms/embedded6xx/mpc10x.h b/arch/powerpc/platforms/embedded6xx/mpc10x.h
> index b30a6a3..b290b63 100644
> --- a/arch/powerpc/platforms/embedded6xx/mpc10x.h
> +++ b/arch/powerpc/platforms/embedded6xx/mpc10x.h
> @@ -81,17 +81,6 @@
>  #define        MPC10X_MAPB_PCI_MEM_OFFSET      (MPC10X_MAPB_ISA_MEM_BASE -     \
>                                          MPC10X_MAPB_PCI_MEM_START)
>
> -/* Set hose members to values appropriate for the mem map used */
> -#define        MPC10X_SETUP_HOSE(hose, map) {                                  \
> -       (hose)->pci_mem_offset = MPC10X_MAP##map##_PCI_MEM_OFFSET;      \
> -       (hose)->io_space.start = MPC10X_MAP##map##_PCI_IO_START;        \
> -       (hose)->io_space.end = MPC10X_MAP##map##_PCI_IO_END;            \
> -       (hose)->mem_space.start = MPC10X_MAP##map##_PCI_MEM_START;      \
> -       (hose)->mem_space.end = MPC10X_MAP##map##_PCI_MEM_END;          \
> -       (hose)->io_base_virt = (void *)MPC10X_MAP##map##_ISA_IO_BASE;   \
> -}

I guess this was previously unused; I don't see any corresponding
changes to a user of MPC10X_SETUP_HOSE().

> -
> -
>  /* Miscellaneous Configuration register offsets */
>  #define        MPC10X_CFG_PIR_REG              0x09
>  #define        MPC10X_CFG_PIR_HOST_BRIDGE      0x00
> diff --git a/arch/powerpc/platforms/powermac/pci.c b/arch/powerpc/platforms/powermac/pci.c
> index 2b8af75..cf7009b 100644
> --- a/arch/powerpc/platforms/powermac/pci.c
> +++ b/arch/powerpc/platforms/powermac/pci.c
> @@ -824,6 +824,7 @@ static void __init parse_region_decode(struct pci_controller *hose,
>                         hose->mem_resources[cur].name = hose->dn->full_name;
>                         hose->mem_resources[cur].start = base;
>                         hose->mem_resources[cur].end = end;
> +                       hose->mem_offset[cur] = 0;
>                         DBG("  %d: 0x%08lx-0x%08lx\n", cur, base, end);
>                 } else {
>                         DBG("   :           -0x%08lx\n", end);
> @@ -866,7 +867,6 @@ static void __init setup_u3_ht(struct pci_controller* hose)
>         hose->io_resource.start = 0;
>         hose->io_resource.end = 0x003fffff;
>         hose->io_resource.flags = IORESOURCE_IO;
> -       hose->pci_mem_offset = 0;
>         hose->first_busno = 0;
>         hose->last_busno = 0xef;
>
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 97b08fc..1da578b 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -915,11 +915,14 @@ static void pnv_ioda_setup_pe_seg(struct pci_controller *hose,
>                                 index++;
>                         }
>                 } else if (res->flags & IORESOURCE_MEM) {
> +                       /* WARNING: Assumes M32 is mem region 0 in PHB. We need to
> +                        * harden that algorithm when we start supporting M64
> +                        */
>                         region.start = res->start -
> -                                      hose->pci_mem_offset -
> +                                      hose->mem_offset[0] -
>                                        phb->ioda.m32_pci_base;
>                         region.end   = res->end -
> -                                      hose->pci_mem_offset -
> +                                      hose->mem_offset[0] -
>                                        phb->ioda.m32_pci_base;
>                         index = region.start / phb->ioda.m32_segsize;
>
> @@ -1115,8 +1118,7 @@ void __init pnv_pci_init_ioda_phb(struct device_node *np, int ioda_type)
>         phb->ioda.m32_size += 0x10000;
>
>         phb->ioda.m32_segsize = phb->ioda.m32_size / phb->ioda.total_pe;
> -       phb->ioda.m32_pci_base = hose->mem_resources[0].start -
> -               hose->pci_mem_offset;
> +       phb->ioda.m32_pci_base = hose->mem_resources[0].start - hose->mem_offset[0];
>         phb->ioda.io_size = hose->pci_io_size;
>         phb->ioda.io_segsize = phb->ioda.io_size / phb->ioda.total_pe;
>         phb->ioda.io_pci_base = 0; /* XXX calculate this ? */
> diff --git a/arch/powerpc/platforms/wsp/wsp_pci.c b/arch/powerpc/platforms/wsp/wsp_pci.c
> index 8e22f56..62cb527 100644
> --- a/arch/powerpc/platforms/wsp/wsp_pci.c
> +++ b/arch/powerpc/platforms/wsp/wsp_pci.c
> @@ -502,7 +502,7 @@ static void __init wsp_pcie_configure_hw(struct pci_controller *hose)
>                  (~(hose->mem_resources[0].end -
>                     hose->mem_resources[0].start)) & 0x3ffffff0000ul);
>         out_be64(hose->cfg_data + PCIE_REG_M32A_START_ADDR,
> -                (hose->mem_resources[0].start - hose->pci_mem_offset) | 1);
> +                (hose->mem_resources[0].start - hose->mem_offset[0]) | 1);
>
>         /* Clear all TVT entries
>          *
> diff --git a/arch/powerpc/sysdev/fsl_pci.c b/arch/powerpc/sysdev/fsl_pci.c
> index cffe7ed..028ac1f 100644
> --- a/arch/powerpc/sysdev/fsl_pci.c
> +++ b/arch/powerpc/sysdev/fsl_pci.c
> @@ -178,7 +178,7 @@ static void setup_pci_atmu(struct pci_controller *hose)
>         struct ccsr_pci __iomem *pci = hose->private_data;
>         int i, j, n, mem_log, win_idx = 3, start_idx = 1, end_idx = 4;
>         u64 mem, sz, paddr_hi = 0;
> -       u64 paddr_lo = ULLONG_MAX;
> +       u64 offset = 0, paddr_lo = ULLONG_MAX;
>         u32 pcicsrbar = 0, pcicsrbar_sz;
>         u32 piwar = PIWAR_EN | PIWAR_PF | PIWAR_TGI_LOCAL |
>                         PIWAR_READ_SNOOP | PIWAR_WRITE_SNOOP;
> @@ -208,8 +208,9 @@ static void setup_pci_atmu(struct pci_controller *hose)
>                 paddr_lo = min(paddr_lo, (u64)hose->mem_resources[i].start);
>                 paddr_hi = max(paddr_hi, (u64)hose->mem_resources[i].end);
>
> -               n = setup_one_atmu(pci, j, &hose->mem_resources[i],
> -                                  hose->pci_mem_offset);
> +               /* We assume all memory resources have the same offset */
> +               offset = hose->mem_offset[i];

This code doesn't *look* like you're assuming all resources have the
same offset, although maybe elsewhere you set every
hose->mem_offset[i] to the same value.

> +               n = setup_one_atmu(pci, j, &hose->mem_resources[i], offset);
>
>                 if (n < 0 || j >= 5) {
>                         pr_err("Ran out of outbound PCI ATMUs for resource %d!\n", i);
> @@ -239,8 +240,8 @@ static void setup_pci_atmu(struct pci_controller *hose)
>         }
>
>         /* convert to pci address space */
> -       paddr_hi -= hose->pci_mem_offset;
> -       paddr_lo -= hose->pci_mem_offset;
> +       paddr_hi -= offset;
> +       paddr_lo -= offset;
>
>         if (paddr_hi == paddr_lo) {
>                 pr_err("%s: No outbound window space\n", name);
> diff --git a/arch/powerpc/sysdev/ppc4xx_pci.c b/arch/powerpc/sysdev/ppc4xx_pci.c
> index 56e8b3c..64603a1 100644
> --- a/arch/powerpc/sysdev/ppc4xx_pci.c
> +++ b/arch/powerpc/sysdev/ppc4xx_pci.c
> @@ -257,6 +257,7 @@ static void __init ppc4xx_configure_pci_PMMs(struct pci_controller *hose,
>         /* Setup outbound memory windows */
>         for (i = j = 0; i < 3; i++) {
>                 struct resource *res = &hose->mem_resources[i];
> +               resource_size_t offset = hose->mem_offset[i];
>
>                 /* we only care about memory windows */
>                 if (!(res->flags & IORESOURCE_MEM))
> @@ -270,7 +271,7 @@ static void __init ppc4xx_configure_pci_PMMs(struct pci_controller *hose,
>                 /* Configure the resource */
>                 if (ppc4xx_setup_one_pci_PMM(hose, reg,
>                                              res->start,
> -                                            res->start - hose->pci_mem_offset,
> +                                            res->start - offset,
>                                              resource_size(res),
>                                              res->flags,
>                                              j) == 0) {
> @@ -279,7 +280,7 @@ static void __init ppc4xx_configure_pci_PMMs(struct pci_controller *hose,
>                         /* If the resource PCI address is 0 then we have our
>                          * ISA memory hole
>                          */
> -                       if (res->start == hose->pci_mem_offset)
> +                       if (res->start == offset)
>                                 found_isa_hole = 1;
>                 }
>         }
> @@ -457,6 +458,7 @@ static void __init ppc4xx_configure_pcix_POMs(struct pci_controller *hose,
>         /* Setup outbound memory windows */
>         for (i = j = 0; i < 3; i++) {
>                 struct resource *res = &hose->mem_resources[i];
> +               resource_size_t offset = hose->mem_offset[i];
>
>                 /* we only care about memory windows */
>                 if (!(res->flags & IORESOURCE_MEM))
> @@ -470,7 +472,7 @@ static void __init ppc4xx_configure_pcix_POMs(struct pci_controller *hose,
>                 /* Configure the resource */
>                 if (ppc4xx_setup_one_pcix_POM(hose, reg,
>                                               res->start,
> -                                             res->start - hose->pci_mem_offset,
> +                                             res->start - offset,
>                                               resource_size(res),
>                                               res->flags,
>                                               j) == 0) {
> @@ -479,7 +481,7 @@ static void __init ppc4xx_configure_pcix_POMs(struct pci_controller *hose,
>                         /* If the resource PCI address is 0 then we have our
>                          * ISA memory hole
>                          */
> -                       if (res->start == hose->pci_mem_offset)
> +                       if (res->start == offset)
>                                 found_isa_hole = 1;
>                 }
>         }
> @@ -1792,6 +1794,7 @@ static void __init ppc4xx_configure_pciex_POMs(struct ppc4xx_pciex_port *port,
>         /* Setup outbound memory windows */
>         for (i = j = 0; i < 3; i++) {
>                 struct resource *res = &hose->mem_resources[i];
> +               resource_size_t offset = hose->mem_offset[i];
>
>                 /* we only care about memory windows */
>                 if (!(res->flags & IORESOURCE_MEM))
> @@ -1805,7 +1808,7 @@ static void __init ppc4xx_configure_pciex_POMs(struct ppc4xx_pciex_port *port,
>                 /* Configure the resource */
>                 if (ppc4xx_setup_one_pciex_POM(port, hose, mbase,
>                                                res->start,
> -                                              res->start - hose->pci_mem_offset,
> +                                              res->start - offset,
>                                                resource_size(res),
>                                                res->flags,
>                                                j) == 0) {
> @@ -1814,7 +1817,7 @@ static void __init ppc4xx_configure_pciex_POMs(struct ppc4xx_pciex_port *port,
>                         /* If the resource PCI address is 0 then we have our
>                          * ISA memory hole
>                          */
> -                       if (res->start == hose->pci_mem_offset)
> +                       if (res->start == offset)
>                                 found_isa_hole = 1;
>                 }
>         }
>
>

^ permalink raw reply

* Re: [PATCH v2 1/4] powerpc/cputable: reserve bits in HWCAP2 for new features
From: Ryan Arnold @ 2013-05-06 19:07 UTC (permalink / raw)
  To: Nishanth Aravamudan
  Cc: Michael Neuling, Michael R Meissner, Steve Munroe, Peter Bergner,
	linuxppc-dev
In-Reply-To: <20130504004756.GA3532@linux.vnet.ibm.com>

[-- Attachment #1: Type: text/plain, Size: 939 bytes --]

Hi Nish,

Nishanth Aravamudan <nacc@linux.vnet.ibm.com> wrote on 05/03/2013 07:47:56
PM:
> +/* in AT_HWCAP2 */
> +#define PPC_FEATURE2_ARCH_2_07      0x80000000
> +#define PPC_FEATURE2_HTM      0x40000000
> +#define PPC_FEATURE2_DSCR      0x20000000
> +#define PPC_FEATURE2_EBB      0x10000000
> +#define PPC_FEATURE2_ISEL      0x08000000
> +#define PPC_FEATURE2_TAR      0x04000000
> +
>  #endif /* _UAPI__ASM_POWERPC_CPUTABLE_H */

Following the existing naming convention in cputable.h, the features should
probably be amended to the following:

#define PPC_FEATURE2_ARCH_2_07      0x80000000
#define PPC_FEATURE2_HAS_HTM        0x40000000
#define PPC_FEATURE2_HAS_DSC        0x20000000
#define PPC_FEATURE2_HAS_EBB        0x10000000
#define PPC_FEATURE2_HAS_ISEL       0x08000000
#define PPC_FEATURE2_HAS_TAR        0x04000000

Notice that I changed DSCR to DSC.  The 'R' wasn't descriptive.

Ryan S. Arnold
IBM Linux Technology Center

[-- Attachment #2: Type: text/html, Size: 1690 bytes --]

^ permalink raw reply

* [PATCH] powerpc: Fix build errors STRICT_MM_TYPECHECKS
From: Aneesh Kumar K.V @ 2013-05-06 20:51 UTC (permalink / raw)
  To: benh; +Cc: linuxppc-dev, Aneesh Kumar K.V

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/pte-hash64-64k.h | 2 +-
 arch/powerpc/kernel/pci-common.c          | 5 ++---
 arch/powerpc/mm/init_64.c                 | 3 ++-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/pte-hash64-64k.h b/arch/powerpc/include/asm/pte-hash64-64k.h
index 7ad8ef1..98e27be 100644
--- a/arch/powerpc/include/asm/pte-hash64-64k.h
+++ b/arch/powerpc/include/asm/pte-hash64-64k.h
@@ -58,7 +58,7 @@
  * generic accessors and iterators here
  */
 #define __real_pte(e,p) 	((real_pte_t) { \
-			(e), ((e) & _PAGE_COMBO) ? \
+			(e), (pte_val(e) & _PAGE_COMBO) ? \
 				(pte_val(*((p) + PTRS_PER_PTE))) : 0 })
 #define __rpte_to_hidx(r,index)	((pte_val((r).pte) & _PAGE_COMBO) ? \
         (((r).hidx >> ((index)<<2)) & 0xf) : ((pte_val((r).pte) >> 12) & 0xf))
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index f325dc9..fd4b917 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -359,7 +359,6 @@ static pgprot_t __pci_mmap_set_pgprot(struct pci_dev *dev, struct resource *rp,
 				      enum pci_mmap_state mmap_state,
 				      int write_combine)
 {
-	unsigned long prot = pgprot_val(protection);
 
 	/* Write combine is always 0 on non-memory space mappings. On
 	 * memory space, if the user didn't pass 1, we check for a
@@ -376,9 +375,9 @@ static pgprot_t __pci_mmap_set_pgprot(struct pci_dev *dev, struct resource *rp,
 
 	/* XXX would be nice to have a way to ask for write-through */
 	if (write_combine)
-		return pgprot_noncached_wc(prot);
+		return pgprot_noncached_wc(protection);
 	else
-		return pgprot_noncached(prot);
+		return pgprot_noncached(protection);
 }
 
 /*
diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index c9ec657..d0cd9e4 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -218,7 +218,8 @@ static void __meminit vmemmap_create_mapping(unsigned long start,
 					     unsigned long phys)
 {
 	int  mapped = htab_bolt_mapping(start, start + page_size, phys,
-					PAGE_KERNEL, mmu_vmemmap_psize,
+					pgprot_val(PAGE_KERNEL),
+					mmu_vmemmap_psize,
 					mmu_kernel_ssize);
 	BUG_ON(mapped < 0);
 }
-- 
1.8.1.2

^ permalink raw reply related

* Re: [PATCH 2/5] KVM: PPC: iommu: Add missing kvm_iommu_map_pages/kvm_iommu_unmap_pages
From: Alex Williamson @ 2013-05-06 21:07 UTC (permalink / raw)
  To: aik
  Cc: kvm, Alexander Graf, kvm-ppc, linux-kernel, Paul Mackerras,
	linux-pci, linuxppc-dev, David Gibson
In-Reply-To: <51875a36.25ac440a.50d9.ffffe90d@mx.google.com>

On Mon, 2013-05-06 at 17:21 +1000, aik@ozlabs.ru wrote:
> From: Alexey Kardashevskiy <aik@ozlabs.ru>
> 
> The IOMMU API implements groups creating/deletion, device binding
> and IOMMU map/unmap operations.
> 
> The PowerPC implementation uses most of the API except map/unmap
> operations, which are implemented on POWER using hypercalls.
> 
> However, in order to link a kernel with the CONFIG_IOMMU_API enabled,
> the empty kvm_iommu_map_pages/kvm_iommu_unmap_pages have to be
> defined, so this defines them.
> 
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> Cc: David Gibson <david@gibson.dropbear.id.au>
> Signed-off-by: Paul Mackerras <paulus@samba.org>
> ---
>  arch/powerpc/include/asm/kvm_host.h |   14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
> index b6a047e..c025d91 100644
> --- a/arch/powerpc/include/asm/kvm_host.h
> +++ b/arch/powerpc/include/asm/kvm_host.h
> @@ -603,4 +603,18 @@ struct kvm_vcpu_arch {
>  
>  #define __KVM_HAVE_ARCH_WQP
>  
> +#ifdef CONFIG_IOMMU_API
> +/* POWERPC does not use IOMMU API for mapping/unmapping */
> +static inline int kvm_iommu_map_pages(struct kvm *kvm,
> +		struct kvm_memory_slot *slot)
> +{
> +	return 0;
> +}
> +
> +static inline void kvm_iommu_unmap_pages(struct kvm *kvm,
> +		struct kvm_memory_slot *slot)
> +{
> +}
> +#endif /* CONFIG_IOMMU_API */
> +
>  #endif /* __POWERPC_KVM_HOST_H__ */

This is no longer needed, Gleb applied my patch for 3.10 that make all
of KVM device assignment dependent on a build config option and the top
level kvm_host.h now includes this when that is not set.  Thanks,

Alex

^ permalink raw reply

* Re: [PATCH] powerpc, perf: Fix processing conditions for invalid BHRB entries
From: Michael Neuling @ 2013-05-06 21:22 UTC (permalink / raw)
  To: Anshuman Khandual; +Cc: linuxppc-dev, linux-kernel
In-Reply-To: <5187A144.9080307@linux.vnet.ibm.com>

Anshuman Khandual <khandual@linux.vnet.ibm.com> wrote:

> On 05/06/2013 04:41 PM, Michael Neuling wrote:
> > Anshuman Khandual <khandual@linux.vnet.ibm.com> wrote:
> > 
> >> Fixing some conditions during BHRB entry processing.
> > 
> > I think we can simplify this a lot more... something like the below.
> > 
> 
> I feel that the conditional handling of the invalid BHRB entries should be
> present which would help us during the debug and also could be used for more
> granular branch classification or error handling later on.

If we need that, then we can add it later on.  You'd still need to
modify the code to do any debug, branch classification or error handling
as the code is now.  It's just an unnecessary place holder.  

The code is unnecessarily verbose now.  This patch removes about 20
lines of code and does the same thing.

Mikey

> 
> > Also, this marks the "to" address as all 1s, which is better poison
> > value since it's possible to branch to/from 0x0.
> >
> 
> Agreed.
> 
> 
> > diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> > index c627843..d410d65 100644
> > --- a/arch/powerpc/perf/core-book3s.c
> > +++ b/arch/powerpc/perf/core-book3s.c
> > @@ -1463,65 +1463,45 @@ void power_pmu_bhrb_read(struct cpu_hw_events *cpuhw)
> >  {
> >  	u64 val;
> >  	u64 addr;
> > -	int r_index, u_index, target, pred;
> > +	int r_index, u_index, pred;
> > 
> >  	r_index = 0;
> >  	u_index = 0;
> >  	while (r_index < ppmu->bhrb_nr) {
> >  		/* Assembly read function */
> > -		val = read_bhrb(r_index);
> > +		val = read_bhrb(r_index++);
> > 
> >  		/* Terminal marker: End of valid BHRB entries */
> > -		if (val == 0) {
> > +		if (!val) {
> >  			break;
> >  		} else {
> >  			/* BHRB field break up */
> >  			addr = val & BHRB_EA;
> >  			pred = val & BHRB_PREDICTION;
> > -			target = val & BHRB_TARGET;
> > 
> > -			/* Probable Missed entry: Not applicable for POWER8 */
> > -			if ((addr == 0) && (target == 0) && (pred == 1)) {
> > -				r_index++;
> > +			if (!addr)
> > +				/* invalid entry */
> >  				continue;
> > -			}
> > -
> > -			/* Real Missed entry: Power8 based missed entry */
> > -			if ((addr == 0) && (target == 1) && (pred == 1)) {
> > -				r_index++;
> > -				continue;
> > -			}
> > 
> > -			/* Reserved condition: Not a valid entry  */
> > -			if ((addr == 0) && (target == 1) && (pred == 0)) {
> > -				r_index++;
> > -				continue;
> > -			}
> > -
> > -			/* Is a target address */
> >  			if (val & BHRB_TARGET) {
> >  				/* First address cannot be a target address */
> > -				if (r_index == 0) {
> > -					r_index++;
> > +				if (r_index == 0)
> >  					continue;
> > -				}
> > 
> >  				/* Update target address for the previous entry */
> >  				cpuhw->bhrb_entries[u_index - 1].to = addr;
> >  				cpuhw->bhrb_entries[u_index - 1].mispred = pred;
> >  				cpuhw->bhrb_entries[u_index - 1].predicted = ~pred;
> > -
> > -				/* Dont increment u_index */
> > -				r_index++;
> >  			} else {
> >  				/* Update address, flags for current entry */
> >  				cpuhw->bhrb_entries[u_index].from = addr;
> > +				cpuhw->bhrb_entries[u_index].to =
> > +					0xffffffffffffffff;
> >  				cpuhw->bhrb_entries[u_index].mispred = pred;
> >  				cpuhw->bhrb_entries[u_index].predicted = ~pred;
> > 
> >  				/* Successfully popullated one entry */
> >  				u_index++;
> > -				r_index++;
> >  			}
> >  		}
> >  	}
> > _______________________________________________
> > Linuxppc-dev mailing list
> > Linuxppc-dev@lists.ozlabs.org
> > https://lists.ozlabs.org/listinfo/linuxppc-dev
> > 
> 

^ permalink raw reply

* Re: [PATCH 2/3] powerpc/powernv: Disable IO space for PCI buses
From: Benjamin Herrenschmidt @ 2013-05-06 21:33 UTC (permalink / raw)
  To: Gavin Shan; +Cc: yinghai, linux-pci, bhelgaas, linuxppc-dev
In-Reply-To: <1367847858-6506-2-git-send-email-shangw@linux.vnet.ibm.com>

On Mon, 2013-05-06 at 21:44 +0800, Gavin Shan wrote:
> The patch intends to set the special flag (PCI_BUS_FLAGS_NO_IO) for
> root buses so PCI core will skip assignment for IO stuff. Besides,
> we also clear the IO resources on all PCI devices for PHB3.

Why the new hook ? Can't this be detected simply because there is
no aperture in the pci_host_bridge with IORESOURCE_IO set in the flags ?

Ben.

> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
> ---
>  arch/powerpc/include/asm/machdep.h        |    3 ++
>  arch/powerpc/kernel/pci-common.c          |    7 +++++
>  arch/powerpc/platforms/powernv/pci-ioda.c |   38 +++++++++++++++++++++++++++++
>  3 files changed, 48 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
> index 3f3f691..ebc2ffd 100644
> --- a/arch/powerpc/include/asm/machdep.h
> +++ b/arch/powerpc/include/asm/machdep.h
> @@ -220,6 +220,9 @@ struct machdep_calls {
>  	/* Called during PCI resource reassignment */
>  	resource_size_t (*pcibios_window_alignment)(struct pci_bus *, unsigned long type);
>  
> +	/* Called when adding PCI bus */
> +	void (*pcibios_add_bus)(struct pci_bus *);
> +
>  	/* Called to shutdown machine specific hardware not already controlled
>  	 * by other drivers.
>  	 */
> diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
> index f325dc9..7b8a6f1 100644
> --- a/arch/powerpc/kernel/pci-common.c
> +++ b/arch/powerpc/kernel/pci-common.c
> @@ -120,6 +120,13 @@ resource_size_t pcibios_window_alignment(struct pci_bus *bus,
>  	return 1;
>  }
>  
> +/* The function will be called while adding PCI bus to system */
> +void pcibios_add_bus(struct pci_bus *bus)
> +{
> +	if (ppc_md.pcibios_add_bus)
> +		ppc_md.pcibios_add_bus(bus);
> +}
> +
>  static resource_size_t pcibios_io_size(const struct pci_controller *hose)
>  {
>  #ifdef CONFIG_PPC64
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 8c6c9cf..0c3fa29 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -1015,6 +1015,40 @@ static resource_size_t pnv_pci_window_alignment(struct pci_bus *bus,
>  	return phb->ioda.io_segsize;
>  }
>  
> +/*
> + * The function will be called while adding PCI bus to the
> + * system. In turn, we should set flag to indicate that the
> + * root bus doesn't have IO resources.
> + */
> +static void pnv_pci_add_bus(struct pci_bus *bus)
> +{
> +	struct pci_controller *hose = pci_bus_to_host(bus);
> +	struct pnv_phb *phb = hose->private_data;
> +
> +	/*
> +	 * We only need set the flag for root bus since the
> +	 * bus flags are copied over from parent to children
> +	 */
> +	if (pci_is_root_bus(bus) &&
> +	    phb->model == PNV_PHB_MODEL_PHB3)
> +		bus->bus_flags |= PCI_BUS_FLAGS_NO_IO;
> +}
> +
> +static void pnv_pci_fixup_resources(struct pci_dev *dev)
> +{
> +	int i;
> +	struct resource *res;
> +
> +	if (!(dev->bus->bus_flags & PCI_BUS_FLAGS_NO_IO))
> +		return;
> +
> +	for (i = 0; i < DEVICE_COUNT_RESOURCE; i++) {
> +		res = dev->resource + i;
> +		if (res->flags & IORESOURCE_IO)
> +			res->flags = 0;
> +	}
> +}
> +
>  /* Prevent enabling devices for which we couldn't properly
>   * assign a PE
>   */
> @@ -1189,6 +1223,10 @@ void __init pnv_pci_init_ioda_phb(struct device_node *np, int ioda_type)
>  	ppc_md.pcibios_fixup = pnv_pci_ioda_fixup;
>  	ppc_md.pcibios_enable_device_hook = pnv_pci_enable_device_hook;
>  	ppc_md.pcibios_window_alignment = pnv_pci_window_alignment;
> +	if (ioda_type == PNV_PHB_IODA2) {
> +		ppc_md.pcibios_add_bus = pnv_pci_add_bus;
> +		ppc_md.pcibios_fixup_resources = pnv_pci_fixup_resources;
> +	}
>  	pci_add_flags(PCI_REASSIGN_ALL_RSRC);
>  
>  	/* Reset IODA tables to a clean state */

^ permalink raw reply

* Re: [PATCH 3/3] powerpc/powernv: Don't configure IO window on PHB3
From: Benjamin Herrenschmidt @ 2013-05-06 21:34 UTC (permalink / raw)
  To: Gavin Shan; +Cc: yinghai, linux-pci, bhelgaas, linuxppc-dev
In-Reply-To: <1367847858-6506-3-git-send-email-shangw@linux.vnet.ibm.com>

On Mon, 2013-05-06 at 21:44 +0800, Gavin Shan wrote:
> We needn't configure IO windows for the corresponding PEs on PHB3
> since that doesn't support IO.

Here too, no need for such a flag, just check that
pci_controller->io_resource.flags is 0.

BTW. Please work on top of the patch I sent already that avoids adding
bogus resources to pci_host_bridge when their flags are 0. I'll send
it to Linus today.

Cheers,
Ben.

> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c |    4 +++-
>  1 files changed, 3 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 0c3fa29..b4f3edb 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -894,7 +894,9 @@ static void pnv_ioda_setup_pe_seg(struct pci_controller *hose,
>  		    res->start > res->end)
>  			continue;
>  
> -		if (res->flags & IORESOURCE_IO) {
> +		/* We needn't setup IO windows for PHB3 */
> +		if (!(pe->pbus->bus_flags & PCI_BUS_FLAGS_NO_IO) &&
> +		    res->flags & IORESOURCE_IO) {
>  			region.start = res->start - phb->ioda.io_pci_base;
>  			region.end   = res->end - phb->ioda.io_pci_base;
>  			index = region.start / phb->ioda.io_segsize;

^ permalink raw reply

* Re: [PATCH] arch/powerpc: advertise ISA2.07, HTM, DSCR, EBB and ISEL bits in HWCAP2
From: Benjamin Herrenschmidt @ 2013-05-06 21:37 UTC (permalink / raw)
  To: Ryan Arnold
  Cc: linuxppc-dev, Nishanth Aravamudan, Steve Munroe, Peter Bergner,
	Michael Neuling, Michael R Meissner
In-Reply-To: <OFBCA2593E.F64367AB-ON86257B63.004ED7A8-86257B63.00506FC6@us.ibm.com>

On Mon, 2013-05-06 at 09:38 -0500, Ryan Arnold wrote:
> My understanding was that these bits being 'on' is an indication of
> what features the hardware supports (or what the kernel emulates) and
> a not an indication of whether that facility is currently enabled or
> not.  If the hardware supports a particular feature but it is not
> enabled I'd expect that user-space usage of that feature would cause
> the kernel to trap on a facility availability exception (which is how
> Altivec/VMX is implemented, being defaulted to turned off).

Right but the discussion is about whether we should expose the bits
when the kernel doesn't have the ability to handle the feature :-)

IE. We need to remove the HTM feature if the kernel is compiled without
transactional memory support.

Similarily, Nish, you may need to check that we remove those bits if
pHyp has the partition in a mode that doesn't support them (P7
compatibility for example) for migration purposes.

Cheers,
Ben.

> Otherwise there's no way I could know whether an ISA [optional]
> feature is actually available on a particular machine.
> 
> And yes, the bits can't change.  My usage of hwcap.h has to coincide
> with the kernel's asm/cputable.h

^ permalink raw reply

* Re: [PATCH v2 1/4] powerpc/cputable: reserve bits in HWCAP2 for new features
From: Benjamin Herrenschmidt @ 2013-05-06 21:41 UTC (permalink / raw)
  To: Ryan Arnold
  Cc: Michael Neuling, Nishanth Aravamudan, Steve Munroe, Peter Bergner,
	linuxppc-dev, Michael R Meissner
In-Reply-To: <OFA8304853.25190807-ON86257B63.0067EDA3-86257B63.00690C34@us.ibm.com>

On Mon, 2013-05-06 at 14:07 -0500, Ryan Arnold wrote:
> Notice that I changed DSCR to DSC.  The 'R' wasn't descriptive.

The "R" is the name of the register for which we are exposing the
availability to userspace... it's also the name of the sysfs entry so
I'd rather keep it for consistency.

Cheers,
Ben.

^ permalink raw reply

* Re: [RFC][KVM][PATCH 1/1] kvm:ppc:booke-64: soft-disable interrupts
From: Scott Wood @ 2013-05-06 23:50 UTC (permalink / raw)
  To: tiejun.chen; +Cc: linuxppc-dev, agraf, kvm-ppc, kvm
In-Reply-To: <51871FCD.9070900@windriver.com>

On 05/05/2013 10:13:17 PM, tiejun.chen wrote:
> On 05/06/2013 11:10 AM, Tiejun Chen wrote:
>> For the external interrupt, the decrementer exception and the =20
>> doorbell
>> excpetion, we also need to soft-disable interrupts while doing as =20
>> host
>> interrupt handlers since the DO_KVM hook is always performed to skip
>> EXCEPTION_COMMON then miss this original chance with the 'ints' =20
>> (INTS_DISABLE).

http://patchwork.ozlabs.org/patch/241344/
http://patchwork.ozlabs.org/patch/241412/

:-)

>> Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
>> ---
>>   arch/powerpc/kvm/bookehv_interrupts.S |    9 +++++++++
>>   1 file changed, 9 insertions(+)
>>=20
>> diff --git a/arch/powerpc/kvm/bookehv_interrupts.S =20
>> b/arch/powerpc/kvm/bookehv_interrupts.S
>> index e8ed7d6..2fd62bf 100644
>> --- a/arch/powerpc/kvm/bookehv_interrupts.S
>> +++ b/arch/powerpc/kvm/bookehv_interrupts.S
>> @@ -33,6 +33,8 @@
>>=20
>>   #ifdef CONFIG_64BIT
>>   #include <asm/exception-64e.h>
>> +#include <asm/hw_irq.h>
>> +#include <asm/irqflags.h>
>>   #else
>>   #include "../kernel/head_booke.h" /* for THREAD_NORMSAVE() */
>>   #endif
>> @@ -469,6 +471,13 @@ _GLOBAL(kvmppc_resume_host)
>>   	PPC_LL	r3, HOST_RUN(r1)
>>   	mr	r5, r14 /* intno */
>>   	mr	r14, r4 /* Save vcpu pointer. */
>> +#ifdef CONFIG_64BIT
>> +	/* Should we soft-disable interrupts? */
>> +	andi.	r6, r5, BOOKE_INTERRUPT_EXTERNAL | =20
>> BOOKE_INTERRUPT_DECREMENTER | BOOKE_INTERRUPT_DOORBELL
>> +	beq	skip_soft_dis
>> +	SOFT_DISABLE_INTS(r7,r8)
>> +skip_soft_dis:
>> +#endif

Why wouldn't we always disable them?  kvmppc_handle_exit() will enable =20
interrupts when it's ready.

-Scott=

^ permalink raw reply

* Re: [PATCH] kvm/ppc/booke64: Hard disable interrupts when entering the guest
From: Scott Wood @ 2013-05-06 23:53 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: kvm, Alexander Graf, kvm-ppc, Mihai Caraman, Paul Mackerras,
	linuxppc-dev
In-Reply-To: <1367787788.11982.58.camel@pasglop>

On 05/05/2013 04:03:08 PM, Benjamin Herrenschmidt wrote:
> On Fri, 2013-05-03 at 18:45 -0500, Scott Wood wrote:
> > kvmppc_lazy_ee_enable() was causing interrupts to be soft-enabled
> > (albeit hard-disabled) in kvmppc_restart_interrupt().  This led to
> > warnings, and possibly breakage if the interrupt state was later =20
> saved
> > and then restored (leading to interrupts being hard-and-soft enabled
> > when they should be at least soft-disabled).
> >
> > Simply removing kvmppc_lazy_ee_enable() leaves interrupts only
> > soft-disabled when we enter the guest, but they will be =20
> hard-disabled
> > when we exit the guest -- without PACA_IRQ_HARD_DIS ever being set, =20
> so
> > the local_irq_enable() fails to hard-enable.
> >
> > While we could just set PACA_IRQ_HARD_DIS after an exit to =20
> compensate,
> > instead hard-disable interrupts before entering the guest.  This =20
> way,
> > we won't have to worry about interactions if we take an interrupt
> > during the guest entry code.  While I don't see any obvious
> > interactions, it could change in the future (e.g. it would be bad if
> > the non-hv code were used on 64-bit or if 32-bit guest lazy =20
> interrupt
> > disabling, since the non-hv code changes IVPR among other things).
>=20
> Shouldn't the interrupts be marked soft-enabled (even if hard =20
> disabled)
> when entering the guest ?
>=20
> Ie. The last stage of entry will hard enable, so they should be
> soft-enabled too... if not, latency trackers will consider the whole
> guest periods as "interrupt disabled"...

OK... I guess we already have that problem on 32-bit as well?

> Now, kvmppc_lazy_ee_enable() seems to be clearly bogus to me. It will
> unconditionally set soft_enabled and clear irq_happened from a
> soft-disabled state, thus potentially losing a pending event.
>=20
> Book3S "HV" seems to be keeping interrupts fully enabled all the way
> until the asm hard disables, which would be fine except that I'm =20
> worried
> we are racy vs. need_resched & signals.
>=20
> One thing you may be able to do is call prep_irq_for_idle(). This will
> tell you if something happened, giving you a chance to abort/re-enable
> before you go the guest.

As long as we go straight from IRQs fully enabled to hard-disabled, =20
before we check for signals and such, I don't think we need that (and =20
using it would raise the question of what to do on 32-bit).

What if we just take this patch, and add trace_hardirqs_on() just =20
before entering the guest?  This would be similar to what the 32-bit =20
non-KVM exception return code does (except it would be in C code).  =20
Perhaps we could set soft_enabled as well, but then we'd have to clear =20
it again before calling kvmppc_restart_interrupt() -- since the KVM =20
exception handlers don't actually care about soft_enabled (it would =20
just be for consistency), I'd rather just leave soft_enabled off.

We also don't want PACA_IRQ_HARD_DIS to be cleared the way =20
prep_irq_for_idle() does, because that's what lets the =20
local_irq_enable() do the hard-enabling after we exit the guest.

-Scott=

^ permalink raw reply

* Re: [PATCH] kvm/ppc/booke64: Hard disable interrupts when entering the guest
From: Benjamin Herrenschmidt @ 2013-05-07  0:03 UTC (permalink / raw)
  To: Scott Wood
  Cc: kvm, Alexander Graf, kvm-ppc, Mihai Caraman, Paul Mackerras,
	linuxppc-dev
In-Reply-To: <1367884418.3398.10@snotra>

On Mon, 2013-05-06 at 18:53 -0500, Scott Wood wrote:
> 
> > Ie. The last stage of entry will hard enable, so they should be
> > soft-enabled too... if not, latency trackers will consider the whole
> > guest periods as "interrupt disabled"...
> 
> OK... I guess we already have that problem on 32-bit as well?

32-bit doesn't do lazy disable, so the situation is a lot easier there.

> > Now, kvmppc_lazy_ee_enable() seems to be clearly bogus to me. It will
> > unconditionally set soft_enabled and clear irq_happened from a
> > soft-disabled state, thus potentially losing a pending event.
> > 
> > Book3S "HV" seems to be keeping interrupts fully enabled all the way
> > until the asm hard disables, which would be fine except that I'm  
> > worried
> > we are racy vs. need_resched & signals.
> > 
> > One thing you may be able to do is call prep_irq_for_idle(). This will
> > tell you if something happened, giving you a chance to abort/re-enable
> > before you go the guest.
> 
> As long as we go straight from IRQs fully enabled to hard-disabled,  
> before we check for signals and such, I don't think we need that (and  
> using it would raise the question of what to do on 32-bit).

Except that you have to mark them as "soft enabled" before you enter the
guest with interrupts on...

But yes, I see your point. If interrupts are fully enabled and you call
hard_irq_disable(), there should be no chance for anything to mess
around with irq_happened.

However if you set soft-enabled later on before the rfid that returns to
the guest and sets EE, you *must* also clear PACA_IRQ_HARD_DIS in
irq_happened. If you get that out of sync bad things will happen later
on...

To be sure all is well, you might want to
WARN_ON(get_paca()->irq_happened == PACA_IRQ_HARD_DIS); (with a comment
explaining why so).

Another problem is that hard_irq_disable() doesn't call
trace_hardirqs_off()... We might want to fix that:

static inline void hard_irq_disable(void)
{
	__hard_irq_disable();
	if (get_paca()->soft_enabled)
		trace_hardirqs_off();
	get_paca()->soft_enabled = 0;
	get_paca()->irq_happened |= PACA_IRQ_HARD_DIS;
}

> What if we just take this patch, and add trace_hardirqs_on() just  
> before entering the guest?

You still want to set soft_enabled I'd say ... though I can see how you
may get away without it as long as you call trace_hardirqs_off() right
on the way back from the guest, but beware some lockdep bits will choke
if they ever spot the discrepancy between the traced irq state and
soft_enabled. I'd recommend you just keep it in sync.

>   This would be similar to what the 32-bit  
> non-KVM exception return code does (except it would be in C code).   
> Perhaps we could set soft_enabled as well, but then we'd have to clear  
> it again before calling kvmppc_restart_interrupt() -- since the KVM  
> exception handlers don't actually care about soft_enabled (it would  
> just be for consistency), I'd rather just leave soft_enabled off.
> 
> We also don't want PACA_IRQ_HARD_DIS to be cleared the way  
> prep_irq_for_idle() does, because that's what lets the  
> local_irq_enable() do the hard-enabling after we exit the guest.

Then set it again. Don't leave the kernel in a state where soft_enabled
is 1 and irq_happened is non-zero. It might work in the specific KVM
case we are looking at now because we know we are coming back via KVM
exit and putting things right again but it's fragile, somebody will come
back and break it, etc...

If necessary, create (or improve existing) helpers that do the right
state adjustement. The cost of a couple of byte stores is negligible,
I'd rather you make sure everything remains in sync at all times.

Cheers,
Ben.

^ permalink raw reply

* Re: [PATCH 2/5] KVM: PPC: iommu: Add missing kvm_iommu_map_pages/kvm_iommu_unmap_pages
From: Alexey Kardashevskiy @ 2013-05-07  0:49 UTC (permalink / raw)
  To: Alex Williamson
  Cc: kvm, Alexander Graf, kvm-ppc, linux-kernel, Paul Mackerras,
	linux-pci, linuxppc-dev, David Gibson
In-Reply-To: <1367874436.2868.90.camel@ul30vt.home>

On 05/07/2013 07:07 AM, Alex Williamson wrote:
> On Mon, 2013-05-06 at 17:21 +1000, aik@ozlabs.ru wrote:
>> From: Alexey Kardashevskiy <aik@ozlabs.ru>
>>
>> The IOMMU API implements groups creating/deletion, device binding
>> and IOMMU map/unmap operations.
>>
>> The PowerPC implementation uses most of the API except map/unmap
>> operations, which are implemented on POWER using hypercalls.
>>
>> However, in order to link a kernel with the CONFIG_IOMMU_API enabled,
>> the empty kvm_iommu_map_pages/kvm_iommu_unmap_pages have to be
>> defined, so this defines them.
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> Cc: David Gibson <david@gibson.dropbear.id.au>
>> Signed-off-by: Paul Mackerras <paulus@samba.org>
>> ---
>>  arch/powerpc/include/asm/kvm_host.h |   14 ++++++++++++++
>>  1 file changed, 14 insertions(+)
>>
>> diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
>> index b6a047e..c025d91 100644
>> --- a/arch/powerpc/include/asm/kvm_host.h
>> +++ b/arch/powerpc/include/asm/kvm_host.h
>> @@ -603,4 +603,18 @@ struct kvm_vcpu_arch {
>>  
>>  #define __KVM_HAVE_ARCH_WQP
>>  
>> +#ifdef CONFIG_IOMMU_API
>> +/* POWERPC does not use IOMMU API for mapping/unmapping */
>> +static inline int kvm_iommu_map_pages(struct kvm *kvm,
>> +		struct kvm_memory_slot *slot)
>> +{
>> +	return 0;
>> +}
>> +
>> +static inline void kvm_iommu_unmap_pages(struct kvm *kvm,
>> +		struct kvm_memory_slot *slot)
>> +{
>> +}
>> +#endif /* CONFIG_IOMMU_API */
>> +
>>  #endif /* __POWERPC_KVM_HOST_H__ */
> 
> This is no longer needed, Gleb applied my patch for 3.10 that make all
> of KVM device assignment dependent on a build config option and the top
> level kvm_host.h now includes this when that is not set.  Thanks,

Cannot find it, could you point me please where it is on github or
git.kernel.org? Thanks.


-- 
Alexey

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox