LinuxPPC-Dev Archive on lore.kernel.org
* Re: [PATCH] arch/powerpc/kernel: using %12.12s instead of %12s for avoiding memory overflow.
From: Chen Gang @ 2013-04-24  7:45 UTC (permalink / raw)
  To: Vasant Hegde
  Cc: sfr@canb.auug.org.au, Michael Neuling,
	linux-kernel@vger.kernel.org, paulus, linuxppc-dev
In-Reply-To: <514FD2E0.5030805@asianux.com>

Hello Vasant Hegde:

How about this patch, is it OK ?

Thanks.


On 2013-03-25 12:30, Chen Gang wrote:
> Hello Maintainers:
> 
>   could you help check this patch whether is ok ?
> 
>   thanks.
> 
> 
> On 2013-02-17 12:00, Chen Gang wrote:
>> Hello relative members:
>>
>>   please give a glance to this patch, when you have time.
>>
>>   thanks.
>>
>>   :-)
>>
>> gchen.
>>
>>
>> On 2013-01-24 12:14, Chen Gang wrote:
>>>
>>>   for tmp_part->header.name:
>>>     the header says "Terminating null required only for names < 12 chars",
>>>     so printk must limit the output with %.12s.
>>>
>>>   additional info:
>>>
>>>     %12s  sets only the minimum field width, not the maximum output length:
>>>           if the name is longer than 12 chars, it is still printed in full;
>>>           if the name is shorter than 12 chars, it is left-padded with ' '.
>>>
>>>     %.12s sets the precision, which truly limits the output to 12 chars.
>>>
>>>
>>> Signed-off-by: Chen Gang <gang.chen@asianux.com>
>>> ---
>>>  arch/powerpc/kernel/nvram_64.c |    2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/arch/powerpc/kernel/nvram_64.c b/arch/powerpc/kernel/nvram_64.c
>>> index bec1e93..57bf6d2 100644
>>> --- a/arch/powerpc/kernel/nvram_64.c
>>> +++ b/arch/powerpc/kernel/nvram_64.c
>>> @@ -202,7 +202,7 @@ static void __init nvram_print_partitions(char * label)
>>>  	printk(KERN_WARNING "--------%s---------\n", label);
>>>  	printk(KERN_WARNING "indx\t\tsig\tchks\tlen\tname\n");
>>>  	list_for_each_entry(tmp_part, &nvram_partitions, partition) {
>>> -		printk(KERN_WARNING "%4d    \t%02x\t%02x\t%d\t%12s\n",
>>> +		printk(KERN_WARNING "%4d    \t%02x\t%02x\t%d\t%12.12s\n",
>>>  		       tmp_part->index, tmp_part->header.signature,
>>>  		       tmp_part->header.checksum, tmp_part->header.length,
>>>  		       tmp_part->header.name);
>>>
>>
>>
> 
> 


-- 
Chen Gang

Asianux Corporation


* Re: [PATCH] arch/powerpc/kernel: using %12.12s instead of %12s for avoiding memory overflow.
From: Vasant Hegde @ 2013-04-24  8:15 UTC (permalink / raw)
  To: Chen Gang
  Cc: sfr@canb.auug.org.au, Michael Neuling, linuxppc-dev, paulus,
	linux-kernel@vger.kernel.org
In-Reply-To: <51778D97.4080409@asianux.com>

On 04/24/2013 01:15 PM, Chen Gang wrote:
> Hello Vasant Hegde:
> 
> How about this patch, is it OK ?
> 
> Thanks.
> 
> 
> On 2013-03-25 12:30, Chen Gang wrote:
>> Hello Maintainers:
>>
>>    could you help check this patch whether is ok ?
>>
>>    thanks.
>>
>>
>> On 2013-02-17 12:00, Chen Gang wrote:
>>> Hello relative members:
>>>
>>>    please give a glance to this patch, when you have time.
>>>
>>>    thanks.
>>>
>>>    :-)
>>>
>>> gchen.
>>>
>>>
>>> On 2013-01-24 12:14, Chen Gang wrote:
>>>>
>>>>    for tmp_part->header.name:
>>>>      it is "Terminating null required only for names<  12 chars".
>>>>      so need to limit the %.12s for it in printk
>>>>
>>>>    additional info:
>>>>
>>>>      %12s  limit the width, not for the original string output length
>>>>            if name length is more than 12, it still can be fully displayed.
>>>>            if name length is less than 12, the ' ' will be filled before name.
>>>>
>>>>      %.12s truly limit the original string output length (precision)
>>>>
>>>>
>>>> Signed-off-by: Chen Gang<gang.chen@asianux.com>
>>>> ---
>>>>   arch/powerpc/kernel/nvram_64.c |    2 +-
>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/powerpc/kernel/nvram_64.c b/arch/powerpc/kernel/nvram_64.c
>>>> index bec1e93..57bf6d2 100644
>>>> --- a/arch/powerpc/kernel/nvram_64.c
>>>> +++ b/arch/powerpc/kernel/nvram_64.c
>>>> @@ -202,7 +202,7 @@ static void __init nvram_print_partitions(char * label)
>>>>   	printk(KERN_WARNING "--------%s---------\n", label);
>>>>   	printk(KERN_WARNING "indx\t\tsig\tchks\tlen\tname\n");
>>>>   	list_for_each_entry(tmp_part,&nvram_partitions, partition) {
>>>> -		printk(KERN_WARNING "%4d    \t%02x\t%02x\t%d\t%12s\n",
>>>> +		printk(KERN_WARNING "%4d    \t%02x\t%02x\t%d\t%12.12s\n",

First, this code is inside NVRAM_DEBUG, which is used only for debugging, and
AFAIK all partition names are less than 20 characters. So I don't think we need
this patch.

-Vasant

>>>>   		       tmp_part->index, tmp_part->header.signature,
>>>>   		       tmp_part->header.checksum, tmp_part->header.length,
>>>>   		       tmp_part->header.name);
>>>>
>>>
>>>
>>
>>
> 
> 


* Re: [PATCH] arch/powerpc/kernel: using %12.12s instead of %12s for avoiding memory overflow.
From: Vasant Hegde @ 2013-04-24  8:19 UTC (permalink / raw)
  To: Chen Gang
  Cc: sfr@canb.auug.org.au, Michael Neuling, linuxppc-dev,
	linux-kernel@vger.kernel.org, paulus
In-Reply-To: <51779491.10307@linux.vnet.ibm.com>

On 04/24/2013 01:45 PM, Vasant Hegde wrote:
> On 04/24/2013 01:15 PM, Chen Gang wrote:
>> Hello Vasant Hegde:
>>
>> How about this patch, is it OK ?
>>
>> Thanks.
>>
>>
>> On 2013-03-25 12:30, Chen Gang wrote:
>>> Hello Maintainers:
>>>
>>>     could you help check this patch whether is ok ?
>>>
>>>     thanks.
>>>
>>>
>>> On 2013-02-17 12:00, Chen Gang wrote:
>>>> Hello relative members:
>>>>
>>>>     please give a glance to this patch, when you have time.
>>>>
>>>>     thanks.
>>>>
>>>>     :-)
>>>>
>>>> gchen.
>>>>
>>>>
>>>> On 2013-01-24 12:14, Chen Gang wrote:
>>>>>
>>>>>     for tmp_part->header.name:
>>>>>       it is "Terminating null required only for names<   12 chars".
>>>>>       so need to limit the %.12s for it in printk
>>>>>
>>>>>     additional info:
>>>>>
>>>>>       %12s  limit the width, not for the original string output length
>>>>>             if name length is more than 12, it still can be fully displayed.
>>>>>             if name length is less than 12, the ' ' will be filled before name.
>>>>>
>>>>>       %.12s truly limit the original string output length (precision)
>>>>>
>>>>>
>>>>> Signed-off-by: Chen Gang<gang.chen@asianux.com>
>>>>> ---
>>>>>    arch/powerpc/kernel/nvram_64.c |    2 +-
>>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/arch/powerpc/kernel/nvram_64.c b/arch/powerpc/kernel/nvram_64.c
>>>>> index bec1e93..57bf6d2 100644
>>>>> --- a/arch/powerpc/kernel/nvram_64.c
>>>>> +++ b/arch/powerpc/kernel/nvram_64.c
>>>>> @@ -202,7 +202,7 @@ static void __init nvram_print_partitions(char * label)
>>>>>    	printk(KERN_WARNING "--------%s---------\n", label);
>>>>>    	printk(KERN_WARNING "indx\t\tsig\tchks\tlen\tname\n");
>>>>>    	list_for_each_entry(tmp_part,&nvram_partitions, partition) {
>>>>> -		printk(KERN_WARNING "%4d    \t%02x\t%02x\t%d\t%12s\n",
>>>>> +		printk(KERN_WARNING "%4d    \t%02x\t%02x\t%d\t%12.12s\n",
> 
> First, this code in inside NVRAM_DEBUG which is used only for debug purpose and
> AFAIK, all partition names are less than 20 character. So I don't think we need

Sorry, I meant 12 characters.

-Vasant


> this patch.
> 
> -Vasant
> 
>>>>>    		tmp_part->index, tmp_part->header.signature,
>>>>>    		       tmp_part->header.checksum, tmp_part->header.length,
>>>>>    		       tmp_part->header.name);
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
> 


* "attempt to move .org backwards" still show up
From: Mike Qiu @ 2013-04-24  8:22 UTC (permalink / raw)
  To: linuxppc-dev, gang.chen
  Cc: sfr, mikey, matt, linux-kernel, paulus, Aneesh Kumar K.V

Hi all

I get an error message when I compile the newest upstream kernel on a
Power7 platform.

[root@feng linux]# make -j60
CHK include/generated/uapi/linux/version.h
CHK include/generated/utsrelease.h
CC scripts/mod/devicetable-offsets.s
GEN scripts/mod/devicetable-offsets.h
HOSTCC scripts/mod/file2alias.o
CALL scripts/checksyscalls.sh
HOSTLD scripts/mod/modpost
CHK include/generated/compile.h
CALL arch/powerpc/kernel/systbl_chk.sh
CALL arch/powerpc/kernel/prom_init_check.sh
AS arch/powerpc/kernel/head_64.o
arch/powerpc/kernel/exceptions-64s.S: Assembler messages:
arch/powerpc/kernel/exceptions-64s.S:258: Error: attempt to move .org backwards
make[1]: *** [arch/powerpc/kernel/head_64.o] Error 1
make: *** [arch/powerpc/kernel] Error 2
make: *** Waiting for unfinished jobs....

I see that this should have been fixed by commit
087aa036eb79f24b856893190359ba812b460f45

But the build still fails on my P7 machine.

the kernel source code info:
git tree : git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
[root@feng linux]# git log
commit 824282ca7d250bd7c301f221c3cd902ce906d731
Merge: f83b293 3b5e50e
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date: Mon Apr 22 15:00:59 2013 -0700

Merge branch 'upstream' of
git://git.linux-mips.org/pub/scm/ralf/upstream-linus

Pull MIPS fix from Ralf Baechle:
"Revert the change of the definition of PAGE_MASK which was prettier
but broke a few relativly rare platforms"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
Revert "MIPS: page.h: Provide more readable definition for PAGE_MASK."

commit 3b5e50edaf500f392f4a372296afc0b99ffa7e70
Author: Ralf Baechle <ralf@linux-mips.org>
Date: Mon Apr 22 17:57:54 2013 +0200

[root@feng linux]# git branch
* master
[root@feng linux]# git diff
[root@feng linux]#

That means I have not modified the kernel source.

Thanks
Mike


* Re: "attempt to move .org backwards" still show up
From: Michael Ellerman @ 2013-04-24  8:31 UTC (permalink / raw)
  To: Mike Qiu
  Cc: sfr, mikey, matt, gang.chen, linux-kernel, paulus,
	Aneesh Kumar K.V, linuxppc-dev
In-Reply-To: <5177965D.9090406@linux.vnet.ibm.com>

On Wed, Apr 24, 2013 at 04:22:53PM +0800, Mike Qiu wrote:
> Hi all
> 
> I get an error message when I compile the source code in Power7 platform
> use the newest upstream kernel.

Hi Mike,

It depends on what your .config is. What defconfig are you building?

cheers


* Re: [PATCH] arch/powerpc/kernel: using %12.12s instead of %12s for avoiding memory overflow.
From: Chen Gang @ 2013-04-24  8:31 UTC (permalink / raw)
  To: Vasant Hegde
  Cc: sfr@canb.auug.org.au, Michael Neuling, linuxppc-dev,
	linux-kernel@vger.kernel.org, paulus
In-Reply-To: <51779588.1070203@linux.vnet.ibm.com>

On 2013-04-24 16:19, Vasant Hegde wrote:
>>>>>>     for tmp_part->header.name:
>>>>>> >>>>>       it is "Terminating null required only for names<   12 chars".
>>>>>> >>>>>       so need to limit the %.12s for it in printk
>>>>>> >>>>>
>>>>>> >>>>>     additional info:
>>>>>> >>>>>
>>>>>> >>>>>       %12s  limit the width, not for the original string output length
>>>>>> >>>>>             if name length is more than 12, it still can be fully displayed.
>>>>>> >>>>>             if name length is less than 12, the ' ' will be filled before name.
>>>>>> >>>>>
>>>>>> >>>>>       %.12s truly limit the original string output length (precision)
>>>>>> >>>>>
>>>>>> >>>>>
>>>>>> >>>>> Signed-off-by: Chen Gang<gang.chen@asianux.com>
>>>>>> >>>>> ---
>>>>>> >>>>>    arch/powerpc/kernel/nvram_64.c |    2 +-
>>>>>> >>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>> >>>>>
>>>>>> >>>>> diff --git a/arch/powerpc/kernel/nvram_64.c b/arch/powerpc/kernel/nvram_64.c
>>>>>> >>>>> index bec1e93..57bf6d2 100644
>>>>>> >>>>> --- a/arch/powerpc/kernel/nvram_64.c
>>>>>> >>>>> +++ b/arch/powerpc/kernel/nvram_64.c
>>>>>> >>>>> @@ -202,7 +202,7 @@ static void __init nvram_print_partitions(char * label)
>>>>>> >>>>>    	printk(KERN_WARNING "--------%s---------\n", label);
>>>>>> >>>>>    	printk(KERN_WARNING "indx\t\tsig\tchks\tlen\tname\n");
>>>>>> >>>>>    	list_for_each_entry(tmp_part,&nvram_partitions, partition) {
>>>>>> >>>>> -		printk(KERN_WARNING "%4d    \t%02x\t%02x\t%d\t%12s\n",
>>>>>> >>>>> +		printk(KERN_WARNING "%4d    \t%02x\t%02x\t%d\t%12.12s\n",
>> > 
>> > First, this code in inside NVRAM_DEBUG which is used only for debug purpose and
>> > AFAIK, all partition names are less than 20 character. So I don't think we need
> Sorry.. I meant 12 character.

Please see line 283:
  "strncpy(part->header.name, "wwwwwwwwwwww", 12);"
  (the result is not a NUL-terminated string, and its length is 12)

Also, can we be sure that all partition names are shorter than 12
characters?

Altogether, I think we still need %12.12s to protect the memory.
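
The difference between width and precision can be checked in user space.
Below is a minimal sketch, assuming a hypothetical struct demo_header that
mimics a 12-byte name field with no terminating NUL. It is not the kernel's
nvram header, and format_name() and make_full_header() are illustrative
helpers only.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a header whose name field may lack a NUL. */
struct demo_header {
	char name[12];
	char next_field[4];	/* bytes that follow the name in memory */
};

/*
 * Format the name with "%12.12s": the width pads short names to 12
 * columns, and the precision stops the read after at most 12 bytes.
 * Returns the number of characters the format produces.
 */
static int format_name(char *buf, size_t len, const struct demo_header *h)
{
	return snprintf(buf, len, "%12.12s", h->name);
}

/* Build a header with a full 12-byte name and no terminator. */
static struct demo_header make_full_header(void)
{
	struct demo_header h;

	memcpy(h.name, "wwwwwwwwwwww", 12);	/* exactly 12 bytes, no NUL */
	memcpy(h.next_field, "XYZ", 4);		/* "XYZ" plus its NUL */
	return h;
}
```

Calling format_name() on the header from make_full_header() yields exactly
the 12 'w' characters. With a plain %12s the read would not stop at the end
of name; it would continue until it happens to find a NUL byte.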

Thanks.

-- 
Chen Gang

Asianux Corporation


* Re: "attempt to move .org backwards" still show up
From: Mike Qiu @ 2013-04-24  8:35 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: sfr, mikey, matt, gang.chen, linux-kernel, paulus,
	Aneesh Kumar K.V, linuxppc-dev
In-Reply-To: <20130424083142.GB26834@concordia>

On 2013/4/24 16:31, Michael Ellerman wrote:
> On Wed, Apr 24, 2013 at 04:22:53PM +0800, Mike Qiu wrote:
>> Hi all
>>
>> I get an error message when I compile the source code in Power7 platform
>> use the newest upstream kernel.
> Hi Mike,
>
> It depends on what your .config is. What defconfig are you building?
I just copied the config file from /boot/config.* to .config, ran make
menuconfig, changed nothing manually, and saved.
> cheers
>


* Re: "attempt to move .org backwards" still show up
From: Mike Qiu @ 2013-04-24  8:36 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: sfr, mikey, matt, gang.chen, linux-kernel, paulus,
	Aneesh Kumar K.V, linuxppc-dev
In-Reply-To: <20130424083142.GB26834@concordia>

On 2013/4/24 16:31, Michael Ellerman wrote:
> On Wed, Apr 24, 2013 at 04:22:53PM +0800, Mike Qiu wrote:
>> Hi all
>>
>> I get an error message when I compile the source code in Power7 platform
>> use the newest upstream kernel.
> Hi Mike,
>
> It depends on what your .config is. What defconfig are you building?
>
> cheers
>
And I do know how to build the source code in this machine . . .

Thanks


* Re: [PATCH -V6 18/27] mm/THP: withdraw the pgtable after pmdp related operations
From: Aneesh Kumar K.V @ 2013-04-24  9:08 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: paulus, linuxppc-dev, David Gibson
In-Reply-To: <20130422154901.GC13442@redhat.com>

Andrea Arcangeli <aarcange@redhat.com> writes:

> Hi,
>
> On Mon, Apr 22, 2013 at 03:30:52PM +0530, Aneesh Kumar K.V wrote:
>> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>> 
>> For architectures like ppc64 we look at deposited pgtable when
>> calling pmdp_get_and_clear. So do the pgtable_trans_huge_withdraw
>> after finishing pmdp related operations.
>> 
>> Cc: Andrea Arcangeli <aarcange@redhat.com>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>> ---
>>  mm/huge_memory.c |    3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>> 
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 84f3180..2a43782 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -1363,9 +1363,10 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>>  		struct page *page;
>>  		pgtable_t pgtable;
>>  		pmd_t orig_pmd;
>> -		pgtable = pgtable_trans_huge_withdraw(tlb->mm, pmd);
>> +
>>  		orig_pmd = pmdp_get_and_clear(tlb->mm, addr, pmd);
>>  		tlb_remove_pmd_tlb_entry(tlb, pmd, addr);
>> +		pgtable = pgtable_trans_huge_withdraw(tlb->mm, pmd);
>>  		if (is_huge_zero_pmd(orig_pmd)) {
>>  			tlb->mm->nr_ptes--;
>>  			spin_unlock(&tlb->mm->page_table_lock);
>
> I think here a comment inline (not only in the commit msg) is in
> order. Otherwise it's hard to imagine others to be aware of this arch
> detail when they will read the code later. So it would be prone to
> break later without a comment.

How about this?

From 7444a5eda33c00eea465b51c405cb830c57513b7 Mon Sep 17 00:00:00 2001
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Date: Wed, 6 Mar 2013 12:50:37 +0530
Subject: [PATCH] mm/THP: withdraw the pgtable after pmdp related operations

For architectures like ppc64 we look at deposited pgtable when
calling pmdp_get_and_clear. So do the pgtable_trans_huge_withdraw
after finishing pmdp related operations.

Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 mm/huge_memory.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 84f3180..21c5ebd 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1363,9 +1363,15 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 		struct page *page;
 		pgtable_t pgtable;
 		pmd_t orig_pmd;
-		pgtable = pgtable_trans_huge_withdraw(tlb->mm, pmd);
+		/*
+		 * For architectures like ppc64 we look at deposited pgtable
+		 * when calling pmdp_get_and_clear. So do the
+		 * pgtable_trans_huge_withdraw after finishing pmdp related
+		 * operations.
+		 */
 		orig_pmd = pmdp_get_and_clear(tlb->mm, addr, pmd);
 		tlb_remove_pmd_tlb_entry(tlb, pmd, addr);
+		pgtable = pgtable_trans_huge_withdraw(tlb->mm, pmd);
 		if (is_huge_zero_pmd(orig_pmd)) {
 			tlb->mm->nr_ptes--;
 			spin_unlock(&tlb->mm->page_table_lock);
-- 
1.7.10


* [PATCH 3/7] powerpc/powernv: Add option CONFIG_POWERNV_MSI
From: Gavin Shan @ 2013-04-24  9:37 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Gavin Shan
In-Reply-To: <1366796259-29412-1-git-send-email-shangw@linux.vnet.ibm.com>

As Michael Ellerman suggested, add CONFIG_POWERNV_MSI for the PowerNV
platform, similar to CONFIG_PSERIES_MSI for the pSeries platform.
For now, it is not made dependent on CONFIG_EEH, since that is not
ready to be enabled yet.

Apart from that, CONFIG_PPC_MSI_BITMAP is also enabled when
CONFIG_POWERNV_MSI is selected.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/Kconfig |    5 +++++
 arch/powerpc/sysdev/Kconfig            |    1 +
 2 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/Kconfig b/arch/powerpc/platforms/powernv/Kconfig
index 74fea5c..d3e840d 100644
--- a/arch/powerpc/platforms/powernv/Kconfig
+++ b/arch/powerpc/platforms/powernv/Kconfig
@@ -8,6 +8,11 @@ config PPC_POWERNV
 	select PPC_PCI_CHOICE if EMBEDDED
 	default y
 
+config POWERNV_MSI
+	bool "Support PCI MSI on PowerNV platform"
+	depends on PCI_MSI
+	default y
+
 config PPC_POWERNV_RTAS
 	depends on PPC_POWERNV
 	bool "Support for RTAS based PowerNV platforms such as BML"
diff --git a/arch/powerpc/sysdev/Kconfig b/arch/powerpc/sysdev/Kconfig
index a84fecf..ab4cb54 100644
--- a/arch/powerpc/sysdev/Kconfig
+++ b/arch/powerpc/sysdev/Kconfig
@@ -19,6 +19,7 @@ config PPC_MSI_BITMAP
 	default y if MPIC
 	default y if FSL_PCI
 	default y if PPC4xx_MSI
+	default y if POWERNV_MSI
 
 source "arch/powerpc/sysdev/xics/Kconfig"
 
-- 
1.7.5.4


* [PATCH 1/7] powerpc/powernv: Supports PHB3
From: Gavin Shan @ 2013-04-24  9:37 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Gavin Shan
In-Reply-To: <1366796259-29412-1-git-send-email-shangw@linux.vnet.ibm.com>

This patch initializes PHB3 during system boot. The flag
"PNV_PHB_MODEL_PHB3" is introduced to differentiate the
IODA2-compatible PHB3 from other types of PHBs.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/platforms/powernv/pci-ioda.c |   62 +++++++++++++++--------------
 arch/powerpc/platforms/powernv/pci.c      |    6 ++-
 arch/powerpc/platforms/powernv/pci.h      |    8 ++-
 3 files changed, 42 insertions(+), 34 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index a5c5f15..3d4e958 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -852,18 +852,19 @@ static u32 pnv_ioda_bdfn_to_pe(struct pnv_phb *phb, struct pci_bus *bus,
 	return phb->ioda.pe_rmap[(bus->number << 8) | devfn];
 }
 
-void __init pnv_pci_init_ioda1_phb(struct device_node *np)
+void __init pnv_pci_init_ioda_phb(struct device_node *np, int ioda_type)
 {
 	struct pci_controller *hose;
 	static int primary = 1;
 	struct pnv_phb *phb;
 	unsigned long size, m32map_off, iomap_off, pemap_off;
 	const u64 *prop64;
+	const u32 *prop32;
 	u64 phb_id;
 	void *aux;
 	long rc;
 
-	pr_info(" Initializing IODA OPAL PHB %s\n", np->full_name);
+	pr_info(" Initializing IODA%d OPAL PHB %s\n", ioda_type, np->full_name);
 
 	prop64 = of_get_property(np, "ibm,opal-phbid", NULL);
 	if (!prop64) {
@@ -890,37 +891,34 @@ void __init pnv_pci_init_ioda1_phb(struct device_node *np)
 	hose->last_busno = 0xff;
 	hose->private_data = phb;
 	phb->opal_id = phb_id;
-	phb->type = PNV_PHB_IODA1;
+	phb->type = ioda_type;
 
 	/* Detect specific models for error handling */
 	if (of_device_is_compatible(np, "ibm,p7ioc-pciex"))
 		phb->model = PNV_PHB_MODEL_P7IOC;
+	else if (of_device_is_compatible(np, "ibm,p8-pciex"))
+		phb->model = PNV_PHB_MODEL_PHB3;
 	else
 		phb->model = PNV_PHB_MODEL_UNKNOWN;
 
-	/* We parse "ranges" now since we need to deduce the register base
-	 * from the IO base
-	 */
+	/* Parse 32-bit and IO ranges (if any) */
 	pci_process_bridge_OF_ranges(phb->hose, np, primary);
 	primary = 0;
 
-	/* Magic formula from Milton */
+	/* Get registers */
 	phb->regs = of_iomap(np, 0);
 	if (phb->regs == NULL)
 		pr_err("  Failed to map registers !\n");
 
-
-	/* XXX This is hack-a-thon. This needs to be changed so that:
-	 *  - we obtain stuff like PE# etc... from device-tree
-	 *  - we properly re-allocate M32 ourselves
-	 *    (the OFW one isn't very good)
-	 */
-
 	/* Initialize more IODA stuff */
-	phb->ioda.total_pe = 128;
+	prop32 = of_get_property(np, "ibm,opal-num-pes", NULL);
+	if (!prop32)
+		phb->ioda.total_pe = 1;
+	else
+		phb->ioda.total_pe = *prop32;
 
 	phb->ioda.m32_size = resource_size(&hose->mem_resources[0]);
-	/* OFW Has already off top 64k of M32 space (MSI space) */
+	/* FW Has already off top 64k of M32 space (MSI space) */
 	phb->ioda.m32_size += 0x10000;
 
 	phb->ioda.m32_segsize = phb->ioda.m32_size / phb->ioda.total_pe;
@@ -930,7 +928,10 @@ void __init pnv_pci_init_ioda1_phb(struct device_node *np)
 	phb->ioda.io_segsize = phb->ioda.io_size / phb->ioda.total_pe;
 	phb->ioda.io_pci_base = 0; /* XXX calculate this ? */
 
-	/* Allocate aux data & arrays */
+	/* Allocate aux data & arrays
+	 *
+	 * XXX TODO: Don't allocate io segmap on PHB3
+	 */
 	size = _ALIGN_UP(phb->ioda.total_pe / 8, sizeof(unsigned long));
 	m32map_off = size;
 	size += phb->ioda.total_pe * sizeof(phb->ioda.m32_segmap[0]);
@@ -960,7 +961,7 @@ void __init pnv_pci_init_ioda1_phb(struct device_node *np)
 	hose->mem_resources[2].start = 0;
 	hose->mem_resources[2].end = 0;
 
-#if 0
+#if 0 /* We should really do that ... */
 	rc = opal_pci_set_phb_mem_window(opal->phb_id,
 					 window_type,
 					 window_num,
@@ -974,16 +975,6 @@ void __init pnv_pci_init_ioda1_phb(struct device_node *np)
 		phb->ioda.m32_size, phb->ioda.m32_segsize,
 		phb->ioda.io_size, phb->ioda.io_segsize);
 
-	if (phb->regs)  {
-		pr_devel(" BUID     = 0x%016llx\n", in_be64(phb->regs + 0x100));
-		pr_devel(" PHB2_CR  = 0x%016llx\n", in_be64(phb->regs + 0x160));
-		pr_devel(" IO_BAR   = 0x%016llx\n", in_be64(phb->regs + 0x170));
-		pr_devel(" IO_BAMR  = 0x%016llx\n", in_be64(phb->regs + 0x178));
-		pr_devel(" IO_SAR   = 0x%016llx\n", in_be64(phb->regs + 0x180));
-		pr_devel(" M32_BAR  = 0x%016llx\n", in_be64(phb->regs + 0x190));
-		pr_devel(" M32_BAMR = 0x%016llx\n", in_be64(phb->regs + 0x198));
-		pr_devel(" M32_SAR  = 0x%016llx\n", in_be64(phb->regs + 0x1a0));
-	}
 	phb->hose->ops = &pnv_pci_ops;
 
 	/* Setup RID -> PE mapping function */
@@ -1011,7 +1002,18 @@ void __init pnv_pci_init_ioda1_phb(struct device_node *np)
 	rc = opal_pci_reset(phb_id, OPAL_PCI_IODA_TABLE_RESET, OPAL_ASSERT_RESET);
 	if (rc)
 		pr_warning("  OPAL Error %ld performing IODA table reset !\n", rc);
-	opal_pci_set_pe(phb_id, 0, 0, 7, 1, 1 , OPAL_MAP_PE);
+
+	/*
+	 * On IODA1 map everything to PE#0, on IODA2 we assume the IODA reset
+	 * has cleared the RTT which has the same effect
+	 */
+	if (ioda_type == PNV_PHB_IODA1)
+		opal_pci_set_pe(phb_id, 0, 0, 7, 1, 1 , OPAL_MAP_PE);
+}
+
+void pnv_pci_init_ioda2_phb(struct device_node *np)
+{
+	pnv_pci_init_ioda_phb(np, PNV_PHB_IODA2);
 }
 
 void __init pnv_pci_init_ioda_hub(struct device_node *np)
@@ -1034,6 +1036,6 @@ void __init pnv_pci_init_ioda_hub(struct device_node *np)
 	for_each_child_of_node(np, phbn) {
 		/* Look for IODA1 PHBs */
 		if (of_device_is_compatible(phbn, "ibm,ioda-phb"))
-			pnv_pci_init_ioda1_phb(phbn);
+			pnv_pci_init_ioda_phb(phbn, PNV_PHB_IODA1);
 	}
 }
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index 42eee93..a11b5a6 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -492,7 +492,7 @@ static void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
 		pnv_pci_dma_fallback_setup(hose, pdev);
 }
 
-/* Fixup wrong class code in p7ioc root complex */
+/* Fixup wrong class code in p7ioc and p8 root complex */
 static void pnv_p7ioc_rc_quirk(struct pci_dev *dev)
 {
 	dev->class = PCI_CLASS_BRIDGE_PCI << 8;
@@ -558,6 +558,10 @@ void __init pnv_pci_init(void)
 		if (!found_ioda)
 			for_each_compatible_node(np, NULL, "ibm,p5ioc2")
 				pnv_pci_init_p5ioc2_hub(np);
+
+		/* Look for ioda2 built-in PHB3's */
+		for_each_compatible_node(np, NULL, "ibm,ioda2-phb")
+			pnv_pci_init_ioda2_phb(np);
 	}
 
 	/* Setup the linkage between OF nodes and PHBs */
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index 42ddfba..f6314d6 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -4,9 +4,9 @@
 struct pci_dn;
 
 enum pnv_phb_type {
-	PNV_PHB_P5IOC2,
-	PNV_PHB_IODA1,
-	PNV_PHB_IODA2,
+	PNV_PHB_P5IOC2	= 0,
+	PNV_PHB_IODA1	= 1,
+	PNV_PHB_IODA2	= 2,
 };
 
 /* Precise PHB model for error management */
@@ -14,6 +14,7 @@ enum pnv_phb_model {
 	PNV_PHB_MODEL_UNKNOWN,
 	PNV_PHB_MODEL_P5IOC2,
 	PNV_PHB_MODEL_P7IOC,
+	PNV_PHB_MODEL_PHB3,
 };
 
 #define PNV_PCI_DIAG_BUF_SIZE	4096
@@ -148,6 +149,7 @@ extern void pnv_pci_setup_iommu_table(struct iommu_table *tbl,
 				      u64 dma_offset);
 extern void pnv_pci_init_p5ioc2_hub(struct device_node *np);
 extern void pnv_pci_init_ioda_hub(struct device_node *np);
+extern void pnv_pci_init_ioda2_phb(struct device_node *np);
 
 
 #endif /* __POWERNV_PCI_H */
-- 
1.7.5.4


* [PATCH 6/7] powerpc/powernv: Build DMA space for PE on PHB3
From: Gavin Shan @ 2013-04-24  9:37 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Gavin Shan
In-Reply-To: <1366796259-29412-1-git-send-email-shangw@linux.vnet.ibm.com>

This patch builds the 32-bit DMA space for individual PEs on PHB3.
The TVE# is determined by the combination of the PE# and fixed bits
of the DMA address, which are zero for the 32-bit DMA space.
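
The sizing done in this patch can be worked through in user space. Below is
a minimal sketch, assuming 4 KiB IOMMU pages and 8-byte TCEs as in the diff;
floor_pow2(), tce32_table_size() and tve32_index() are hypothetical helpers
for illustration, not kernel functions.

```c
#include <stdint.h>

#define TCE_PAGE_SIZE	0x1000u	/* 4 KiB IOMMU page, as in the patch */
#define TCE_ENTRY_SIZE	8u	/* one 64-bit TCE per page */

/* Round down to a power of two, like 1 << ilog2(x); x must be nonzero. */
static uint32_t floor_pow2(uint32_t x)
{
	uint32_t p = 1;

	while (p <= x / 2)
		p <<= 1;
	return p;
}

/* Bytes needed for a TCE table covering DMA addresses below the M32 base. */
static uint64_t tce32_table_size(uint32_t m32_pci_base)
{
	uint64_t end = floor_pow2(m32_pci_base);

	return (end / TCE_PAGE_SIZE) * TCE_ENTRY_SIZE;
}

/* TVE index for the 32-bit window: the PE number shifted by one bit. */
static uint32_t tve32_index(uint32_t pe_number)
{
	return pe_number << 1;
}
```

For example, with an M32 PCI base of 0x80000000 the window covers 2 GiB,
which is 524288 pages of 4 KiB, so the table takes 4 MiB.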

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c |  102 +++++++++++++++++++++++++++--
 1 files changed, 96 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 9f4d323..6bc4648 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -589,9 +589,8 @@ static void pnv_pci_ioda_setup_dma_pe(struct pnv_phb *phb,
 		 */
 		tbl->it_busno = 0;
 		tbl->it_index = (unsigned long)ioremap(be64_to_cpup(swinvp), 8);
-		tbl->it_type = TCE_PCI_SWINV_CREATE | TCE_PCI_SWINV_FREE;
-		if (phb->type == PNV_PHB_IODA1)
-			tbl->it_type |= TCE_PCI_SWINV_PAIR;
+		tbl->it_type = TCE_PCI_SWINV_CREATE | TCE_PCI_SWINV_FREE |
+			       TCE_PCI_SWINV_PAIR;
 	}
 	iommu_init_table(tbl, phb->hose->node);
 
@@ -609,6 +608,84 @@ static void pnv_pci_ioda_setup_dma_pe(struct pnv_phb *phb,
 		__free_pages(tce_mem, get_order(TCE32_TABLE_SIZE * segs));
 }
 
+static void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb *phb,
+				       struct pnv_ioda_pe *pe)
+{
+	struct page *tce_mem = NULL;
+	void *addr;
+	const __be64 *swinvp;
+	struct iommu_table *tbl;
+	unsigned int tce_table_size, end;
+	int64_t rc;
+
+	/* We shouldn't already have a 32-bit DMA associated */
+	if (WARN_ON(pe->tce32_seg >= 0))
+		return;
+
+	/* The PE will reserve all possible 32-bits space */
+	pe->tce32_seg = 0;
+	end = (1 << ilog2(phb->ioda.m32_pci_base));
+	tce_table_size = (end / 0x1000) * 8;
+	pe_info(pe, "Setting up 32-bit TCE table at 0..%08x\n",
+		end);
+
+	/* Allocate TCE table */
+	tce_mem = alloc_pages_node(phb->hose->node, GFP_KERNEL,
+				   get_order(tce_table_size));
+	if (!tce_mem) {
+		pe_err(pe, "Failed to allocate a 32-bit TCE memory\n");
+		goto fail;
+	}
+	addr = page_address(tce_mem);
+	memset(addr, 0, tce_table_size);
+
+	/*
+	 * Map TCE table through TVT. The TVE index is the PE number
+	 * shifted by 1 bit for 32-bits DMA space.
+	 */
+	rc = opal_pci_map_pe_dma_window(phb->opal_id, pe->pe_number,
+					pe->pe_number << 1, 1, __pa(addr),
+					tce_table_size, 0x1000);
+	if (rc) {
+		pe_err(pe, "Failed to configure 32-bit TCE table,"
+		       " err %ld\n", rc);
+		goto fail;
+	}
+
+	/* Setup linux iommu table */
+	tbl = &pe->tce32_table;
+	pnv_pci_setup_iommu_table(tbl, addr, tce_table_size, 0);
+
+	/* Hook the IOMMU table to PHB */
+	tbl->sysdata = phb;
+
+	/* OPAL variant of PHB3 invalidated TCEs */
+	swinvp = of_get_property(phb->hose->dn, "ibm,opal-tce-kill", NULL);
+	if (swinvp) {
+		/* We need a couple more fields -- an address and a data
+		 * to or.  Since the bus is only printed out on table free
+		 * errors, and on the first pass the data will be a relative
+		 * bus number, print that out instead.
+		 */
+		tbl->it_busno = 0;
+		tbl->it_index = (unsigned long)ioremap(be64_to_cpup(swinvp), 8);
+		tbl->it_type = TCE_PCI_SWINV_CREATE | TCE_PCI_SWINV_FREE;
+	}
+	iommu_init_table(tbl, phb->hose->node);
+
+	if (pe->pdev)
+		set_iommu_table_base(&pe->pdev->dev, tbl);
+	else
+		pnv_ioda_setup_bus_dma(pe, pe->pbus);
+
+	return;
+fail:
+	if (pe->tce32_seg >= 0)
+		pe->tce32_seg = -1;
+	if (tce_mem)
+		__free_pages(tce_mem, get_order(tce_table_size));
+}
+
 static void pnv_ioda_setup_dma(struct pnv_phb *phb)
 {
 	struct pci_controller *hose = phb->hose;
@@ -651,9 +728,22 @@ static void pnv_ioda_setup_dma(struct pnv_phb *phb)
 			if (segs > remaining)
 				segs = remaining;
 		}
-		pe_info(pe, "DMA weight %d, assigned %d DMA32 segments\n",
-			pe->dma_weight, segs);
-		pnv_pci_ioda_setup_dma_pe(phb, pe, base, segs);
+
+		/*
+		 * For IODA2 compliant PHB3, we needn't care about the weight.
+		 * The all available 32-bits DMA space will be assigned to
+		 * the specific PE.
+		 */
+		if (phb->type == PNV_PHB_IODA1) {
+			pe_info(pe, "DMA weight %d, assigned %d DMA32 segments\n",
+				pe->dma_weight, segs);
+			pnv_pci_ioda_setup_dma_pe(phb, pe, base, segs);
+		} else {
+			pe_info(pe, "Assign DMA32 space\n");
+			segs = 0;
+			pnv_pci_ioda2_setup_dma_pe(phb, pe);
+		}
+
 		remaining -= segs;
 		base += segs;
 	}
-- 
1.7.5.4


* [PATCH 2/7] powerpc/powernv: Retrieve IODA2 tables explicitly
From: Gavin Shan @ 2013-04-24  9:37 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Gavin Shan
In-Reply-To: <1366796259-29412-1-git-send-email-shangw@linux.vnet.ibm.com>

The PHB3, which is compatible with IODA2, has several tables (RTT/
PELTV/PEST/IVT/RBA) in system memory, with corresponding BARs that
track their system memory addresses. The tables are allocated by
firmware and exported through the device tree. This patch retrieves
the tables explicitly.
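
The helper added in the diff reads each table property as three 32-bit
big-endian cells. Below is a minimal user-space sketch of that decoding,
assuming the layout implied by the code (address high word, address low
word, length); read_be32() and decode_table_prop() are hypothetical names,
and the kernel's __va() translation is left out.

```c
#include <stdint.h>

/* Read one big-endian 32-bit cell regardless of host byte order. */
static uint32_t read_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

struct table_prop {
	uint64_t base;	/* address built from the first two cells */
	uint32_t len;	/* value of the third cell */
};

/* Decode a 12-byte property: <addr-hi addr-lo len>. */
static struct table_prop decode_table_prop(const unsigned char cells[12])
{
	struct table_prop t;

	t.base = ((uint64_t)read_be32(cells) << 32) | read_be32(cells + 4);
	t.len = read_be32(cells + 8);
	return t;
}
```

The cast to uint64_t before the shift matters: shifting a 32-bit value left
by 32 would otherwise be undefined in C.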

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/opal.h           |    5 +--
 arch/powerpc/platforms/powernv/pci-ioda.c |   35 +++++++++++++++++++++++++++++
 arch/powerpc/platforms/powernv/pci.h      |   13 ++++++++++
 3 files changed, 50 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index a4b28f1..0af7ba0 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -491,9 +491,8 @@ int64_t opal_pci_map_pe_mmio_window(uint64_t phb_id, uint16_t pe_number,
 				    uint16_t window_type, uint16_t window_num,
 				    uint16_t segment_num);
 int64_t opal_pci_set_phb_table_memory(uint64_t phb_id, uint64_t rtt_addr,
-				      uint64_t ivt_addr, uint64_t ivt_len,
-				      uint64_t reject_array_addr,
-				      uint64_t peltv_addr);
+				      uint64_t peltv_addr, uint64_t pest_addr,
+				      uint64_t ivt_addr, uint64_t rba_addr);
 int64_t opal_pci_set_pe(uint64_t phb_id, uint64_t pe_number, uint64_t bus_dev_func,
 			uint8_t bus_compare, uint8_t dev_compare, uint8_t func_compare,
 			uint8_t pe_action);
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 3d4e958..0c15870 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -852,6 +852,23 @@ static u32 pnv_ioda_bdfn_to_pe(struct pnv_phb *phb, struct pci_bus *bus,
 	return phb->ioda.pe_rmap[(bus->number << 8) | devfn];
 }
 
+static void __init pnv_pci_get_ioda2_table(struct device_node *np,
+					   const char *name,
+					   void **table,
+					   unsigned int *len)
+{
+	const u32 *prop32;
+	u64 base;
+
+	prop32 = of_get_property(np, name, NULL);
+	if (prop32) {
+		base = be32_to_cpup(prop32);
+		base = base << 32 | be32_to_cpup(prop32 + 1);
+		*table = __va(base);
+		*len = be32_to_cpup(prop32 + 2);
+	}
+}
+
 void __init pnv_pci_init_ioda_phb(struct device_node *np, int ioda_type)
 {
 	struct pci_controller *hose;
@@ -998,6 +1015,24 @@ void __init pnv_pci_init_ioda_phb(struct device_node *np, int ioda_type)
 	ppc_md.pcibios_window_alignment = pnv_pci_window_alignment;
 	pci_add_flags(PCI_REASSIGN_ALL_RSRC);
 
+	/* Retrieve variable IODA2 tables */
+	if (ioda_type == PNV_PHB_IODA2) {
+		pnv_pci_get_ioda2_table(np, "ibm,opal-rtt-table",
+				&phb->ioda.tbl_rtt, &phb->ioda.rtt_len);
+		pnv_pci_get_ioda2_table(np, "ibm,opal-peltv-table",
+				&phb->ioda.tbl_peltv, &phb->ioda.peltv_len);
+		pnv_pci_get_ioda2_table(np, "ibm,opal-pest-table",
+				&phb->ioda.tbl_pest, &phb->ioda.pest_len);
+		pnv_pci_get_ioda2_table(np, "ibm,opal-ivt-table",
+				&phb->ioda.tbl_ivt, &phb->ioda.ivt_len);
+		pnv_pci_get_ioda2_table(np, "ibm,opal-rba-table",
+				&phb->ioda.tbl_rba, &phb->ioda.rba_len);
+		/* Get IVE stride */
+		prop32 = of_get_property(np, "ibm,opal-ive-stride", NULL);
+		if (prop32)
+			phb->ioda.ive_stride = be32_to_cpup(prop32);
+	}
+
 	/* Reset IODA tables to a clean state */
 	rc = opal_pci_reset(phb_id, OPAL_PCI_IODA_TABLE_RESET, OPAL_ASSERT_RESET);
 	if (rc)
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index f6314d6..c048c29 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -100,6 +100,19 @@ struct pnv_phb {
 			unsigned int		io_segsize;
 			unsigned int		io_pci_base;
 
+			/* Variable tables for IODA2 */
+			void			*tbl_rtt;
+			void			*tbl_peltv;
+			void			*tbl_pest;
+			void			*tbl_ivt;
+			void			*tbl_rba;
+			unsigned int		ive_stride;
+			unsigned int		rtt_len;
+			unsigned int		peltv_len;
+			unsigned int		pest_len;
+			unsigned int		ivt_len;
+			unsigned int		rba_len;
+
 			/* PE allocation bitmap */
 			unsigned long		*pe_alloc;
 
-- 
1.7.5.4

^ permalink raw reply related
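
Note on [PATCH 2/7] above: pnv_pci_get_ioda2_table() reads each table
property as three big-endian 32-bit cells, <base-high base-low length>.
A minimal user-space sketch of that decoding follows. The names
read_be32(), struct table_desc and decode_table_prop() are hypothetical
and not part of the patch, and the size check is an assumption added
here; the patch itself passes NULL for the property length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read one big-endian 32-bit cell from raw property bytes. */
static uint32_t read_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Hypothetical stand-in for one decoded table property. */
struct table_desc {
	uint64_t base;	/* physical address built from <high, low> cells */
	uint32_t len;	/* third cell */
};

/*
 * Decode a property laid out as <base-high base-low length>.
 * Returns 0 on success, -1 if the property is missing or too short.
 */
static int decode_table_prop(const uint8_t *prop, size_t size,
			     struct table_desc *out)
{
	assert(out != NULL);
	if (!prop || size < 12)
		return -1;

	out->base = ((uint64_t)read_be32(prop) << 32) | read_be32(prop + 4);
	out->len = read_be32(prop + 8);
	return 0;
}
```

In the kernel the resulting physical base would still have to go
through __va() before it can be dereferenced, as the patch does.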

* [PATCH 7/7] powerpc/powernv: Fix invalid IOMMU table
From: Gavin Shan @ 2013-04-24  9:37 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Gavin Shan
In-Reply-To: <1366796259-29412-1-git-send-email-shangw@linux.vnet.ibm.com>

Ben found the root cause: commit 37f02195bee9c25ce44e25204f40b7961a6d7c9d
("powerpc/pci: fix PCI-e devices rescan issue on powerpc platform")
overwrites the IOMMU table of a PCI device while enabling that device.
This patch fixes up the IOMMU table after that point.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c |   33 ++++++++++------------------
 1 files changed, 12 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 6bc4648..c41696f 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -432,20 +432,21 @@ static void pnv_pci_ioda_setup_PEs(void)
 	}
 }
 
-static void pnv_pci_ioda_dma_dev_setup(struct pnv_phb *phb, struct pci_dev *dev)
+static void pnv_pci_ioda_dma_dev_setup(struct pnv_phb *phb, struct pci_dev *pdev)
 {
-	/* We delay DMA setup after we have assigned all PE# */
-}
+	struct pci_dn *pdn = pnv_ioda_get_pdn(pdev);
+	struct pnv_ioda_pe *pe;
 
-static void pnv_ioda_setup_bus_dma(struct pnv_ioda_pe *pe, struct pci_bus *bus)
-{
-	struct pci_dev *dev;
+	/*
+	 * The function can be called before the PE#
+	 * has been assigned. Do nothing in that
+	 * case.
+	 */
+	if (!pdn || pdn->pe_number == IODA_INVALID_PE)
+		return;
 
-	list_for_each_entry(dev, &bus->devices, bus_list) {
-		set_iommu_table_base(&dev->dev, &pe->tce32_table);
-		if (dev->subordinate)
-			pnv_ioda_setup_bus_dma(pe, dev->subordinate);
-	}
+	pe = &phb->ioda.pe_array[pdn->pe_number];
+	set_iommu_table_base(&pdev->dev, &pe->tce32_table);
 }
 
 void pnv_pci_ioda1_tce_invalidate(struct iommu_table *tbl,
@@ -594,11 +595,6 @@ static void pnv_pci_ioda_setup_dma_pe(struct pnv_phb *phb,
 	}
 	iommu_init_table(tbl, phb->hose->node);
 
-	if (pe->pdev)
-		set_iommu_table_base(&pe->pdev->dev, tbl);
-	else
-		pnv_ioda_setup_bus_dma(pe, pe->pbus);
-
 	return;
  fail:
 	/* XXX Failure: Try to fallback to 64-bit only ? */
@@ -673,11 +669,6 @@ static void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb *phb,
 	}
 	iommu_init_table(tbl, phb->hose->node);
 
-	if (pe->pdev)
-		set_iommu_table_base(&pe->pdev->dev, tbl);
-	else
-		pnv_ioda_setup_bus_dma(pe, pe->pbus);
-
 	return;
 fail:
 	if (pe->tce32_seg >= 0)
-- 
1.7.5.4

^ permalink raw reply related
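
Note on [PATCH 7/7] above: the fix moves the IOMMU table assignment
into the per-device hook, so it can be re-applied after something else
has overwritten it, and it bails out when no PE number has been
assigned yet. A small user-space sketch of that shape, under stated
assumptions: the struct layouts, INVALID_PE and dma_dev_setup() below
are simplified, hypothetical stand-ins for the kernel types and are not
taken from the patch.

```c
#include <stddef.h>

#define INVALID_PE (-1)	/* hypothetical stand-in for IODA_INVALID_PE */

struct iommu_table { int id; };
struct pe { struct iommu_table tce32_table; };

struct device {
	int pe_number;			/* INVALID_PE until a PE is assigned */
	struct iommu_table *table;	/* table used for DMA mappings */
};

/*
 * Point the device at the 32-bit TCE table of its PE. Safe to call
 * early (no PE yet) and safe to call again after the pointer has been
 * overwritten by other code.
 */
static void dma_dev_setup(struct pe *pe_array, struct device *dev)
{
	if (!dev || dev->pe_number == INVALID_PE)
		return;

	dev->table = &pe_array[dev->pe_number].tce32_table;
}
```

Because the hook is idempotent, calling it once more after the enable
path has clobbered the pointer is enough to restore the right table.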

* [PATCH 5/7] powerpc/powernv: TCE invalidation for PHB3
From: Gavin Shan @ 2013-04-24  9:37 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Gavin Shan
In-Reply-To: <1366796259-29412-1-git-send-email-shangw@linux.vnet.ibm.com>

A TCE should be invalidated when it is created or freed. The way to
do that differs between IODA1 and IODA2 compliant PHBs, so the patch
calls a separate function for each. Notably, the PCI address is used
to invalidate the corresponding TCE on an IODA2 compliant PHB3.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/iommu.h            |    1 +
 arch/powerpc/platforms/powernv/pci-ioda.c   |   75 ++++++++++++++++++++++++++-
 arch/powerpc/platforms/powernv/pci-p5ioc2.c |    1 +
 arch/powerpc/platforms/powernv/pci.c        |   60 +++++----------------
 arch/powerpc/platforms/powernv/pci.h        |    6 ++-
 5 files changed, 93 insertions(+), 50 deletions(-)

diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
index cbfe678..0db308e 100644
--- a/arch/powerpc/include/asm/iommu.h
+++ b/arch/powerpc/include/asm/iommu.h
@@ -76,6 +76,7 @@ struct iommu_table {
 	struct iommu_pool large_pool;
 	struct iommu_pool pools[IOMMU_NR_POOLS];
 	unsigned long *it_map;       /* A simple allocation bitmap for now */
+	void *sysdata;
 };
 
 struct scatterlist;
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 32197af..9f4d323 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -448,6 +448,73 @@ static void pnv_ioda_setup_bus_dma(struct pnv_ioda_pe *pe, struct pci_bus *bus)
 	}
 }
 
+void pnv_pci_ioda1_tce_invalidate(struct iommu_table *tbl,
+				  u64 *startp, u64 *endp)
+{
+	u64 __iomem *invalidate = (u64 __iomem *)tbl->it_index;
+	unsigned long start, end, inc;
+
+	start = __pa(startp);
+	end = __pa(endp);
+
+	/* BML uses this case for p6/p7/galaxy2: Shift addr and put in node */
+	if (tbl->it_busno) {
+		start <<= 12;
+		end <<= 12;
+		inc = 128 << 12;
+		start |= tbl->it_busno;
+		end |= tbl->it_busno;
+	} else if (tbl->it_type & TCE_PCI_SWINV_PAIR) {
+		/* p7ioc-style invalidation, 2 TCEs per write */
+		start |= (1ull << 63);
+		end |= (1ull << 63);
+		inc = 16;
+	} else {
+		/* Default (older HW) */
+		inc = 128;
+	}
+
+	end |= inc - 1;	/* round up end to be different than start */
+
+	mb(); /* Ensure above stores are visible */
+	while (start <= end) {
+		__raw_writeq(start, invalidate);
+		start += inc;
+	}
+
+	/*
+	 * The iommu layer will do another mb() for us on build()
+	 * and we don't care on free()
+	 */
+}
+
+void pnv_pci_ioda2_tce_invalidate(struct iommu_table *tbl,
+				  u64 *startp, u64 *endp)
+{
+	unsigned long start, end, inc;
+	u64 __iomem *invalidate = (u64 __iomem *)tbl->it_index;
+	struct pnv_ioda_pe *pe = container_of(tbl, struct pnv_ioda_pe,
+					      tce32_table);
+
+	/* We'll invalidate DMA address in PE scope */
+	start = 0x2ul << 60;
+	start |= (pe->pe_number & 0xFF);
+	end = start;
+
+	/* Figure out the start, end and step */
+	inc = tbl->it_offset + (((u64)startp - tbl->it_base) / sizeof(u64));
+	start |= (inc << 12);
+	inc = tbl->it_offset + (((u64)endp - tbl->it_base) / sizeof(u64));
+	end |= (inc << 12);
+	inc = (0x1ul << 12);
+	mb();
+
+	while (start <= end) {
+		__raw_writeq(start, invalidate);
+		start += inc;
+	}
+}
+
 static void pnv_pci_ioda_setup_dma_pe(struct pnv_phb *phb,
 				      struct pnv_ioda_pe *pe, unsigned int base,
 				      unsigned int segs)
@@ -509,6 +576,9 @@ static void pnv_pci_ioda_setup_dma_pe(struct pnv_phb *phb,
 	pnv_pci_setup_iommu_table(tbl, addr, TCE32_TABLE_SIZE * segs,
 				  base << 28);
 
+	/* Hook the IOMMU table to PHB */
+	tbl->sysdata = phb;
+
 	/* OPAL variant of P7IOC SW invalidated TCEs */
 	swinvp = of_get_property(phb->hose->dn, "ibm,opal-tce-kill", NULL);
 	if (swinvp) {
@@ -519,8 +589,9 @@ static void pnv_pci_ioda_setup_dma_pe(struct pnv_phb *phb,
 		 */
 		tbl->it_busno = 0;
 		tbl->it_index = (unsigned long)ioremap(be64_to_cpup(swinvp), 8);
-		tbl->it_type = TCE_PCI_SWINV_CREATE | TCE_PCI_SWINV_FREE
-			| TCE_PCI_SWINV_PAIR;
+		tbl->it_type = TCE_PCI_SWINV_CREATE | TCE_PCI_SWINV_FREE;
+		if (phb->type == PNV_PHB_IODA1)
+			tbl->it_type |= TCE_PCI_SWINV_PAIR;
 	}
 	iommu_init_table(tbl, phb->hose->node);
 
diff --git a/arch/powerpc/platforms/powernv/pci-p5ioc2.c b/arch/powerpc/platforms/powernv/pci-p5ioc2.c
index d5c066e..177ef26 100644
--- a/arch/powerpc/platforms/powernv/pci-p5ioc2.c
+++ b/arch/powerpc/platforms/powernv/pci-p5ioc2.c
@@ -167,6 +167,7 @@ static void __init pnv_pci_init_p5ioc2_phb(struct device_node *np,
 
 	/* Setup TCEs */
 	phb->dma_dev_setup = pnv_pci_p5ioc2_dma_dev_setup;
+	phb->p5ioc2.iommu_table.sysdata = phb;
 	pnv_pci_setup_iommu_table(&phb->p5ioc2.iommu_table,
 				  tce_mem, tce_size, 0);
 }
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index ea6a93d..f140c7a 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -348,52 +348,11 @@ struct pci_ops pnv_pci_ops = {
 	.write = pnv_pci_write_config,
 };
 
-
-static void pnv_tce_invalidate(struct iommu_table *tbl,
-			       u64 *startp, u64 *endp)
-{
-	u64 __iomem *invalidate = (u64 __iomem *)tbl->it_index;
-	unsigned long start, end, inc;
-
-	start = __pa(startp);
-	end = __pa(endp);
-
-
-	/* BML uses this case for p6/p7/galaxy2: Shift addr and put in node */
-	if (tbl->it_busno) {
-		start <<= 12;
-		end <<= 12;
-		inc = 128 << 12;
-		start |= tbl->it_busno;
-		end |= tbl->it_busno;
-	}
-	/* p7ioc-style invalidation, 2 TCEs per write */
-	else if (tbl->it_type & TCE_PCI_SWINV_PAIR) {
-		start |= (1ull << 63);
-		end |= (1ull << 63);
-		inc = 16;
-	}
-	/* Default (older HW) */
-	else
-		inc = 128;
-
-	end |= inc - 1;		/* round up end to be different than start */
-
-	mb(); /* Ensure above stores are visible */
-	while (start <= end) {
-		__raw_writeq(start, invalidate);
-		start += inc;
-	}
-	/* The iommu layer will do another mb() for us on build() and
-	 * we don't care on free()
-	 */
-}
-
-
 static int pnv_tce_build(struct iommu_table *tbl, long index, long npages,
 			 unsigned long uaddr, enum dma_data_direction direction,
 			 struct dma_attrs *attrs)
 {
+	struct pnv_phb *phb = tbl->sysdata;
 	u64 proto_tce;
 	u64 *tcep, *tces;
 	u64 rpn;
@@ -413,14 +372,19 @@ static int pnv_tce_build(struct iommu_table *tbl, long index, long npages,
 	 * need that flush. We'll probably turn it_type into a bit mask
 	 * of flags if that becomes the case
 	 */
-	if (tbl->it_type & TCE_PCI_SWINV_CREATE)
-		pnv_tce_invalidate(tbl, tces, tcep - 1);
+	if (tbl->it_type & TCE_PCI_SWINV_CREATE) {
+		if (phb->type == PNV_PHB_IODA1)
+			pnv_pci_ioda1_tce_invalidate(tbl, tces, tcep - 1);
+		else
+			pnv_pci_ioda2_tce_invalidate(tbl, tces, tcep - 1);
+	}
 
 	return 0;
 }
 
 static void pnv_tce_free(struct iommu_table *tbl, long index, long npages)
 {
+	struct pnv_phb *phb = tbl->sysdata;
 	u64 *tcep, *tces;
 
 	tces = tcep = ((u64 *)tbl->it_base) + index - tbl->it_offset;
@@ -428,8 +392,12 @@ static void pnv_tce_free(struct iommu_table *tbl, long index, long npages)
 	while (npages--)
 		*(tcep++) = 0;
 
-	if (tbl->it_type & TCE_PCI_SWINV_FREE)
-		pnv_tce_invalidate(tbl, tces, tcep - 1);
+	if (tbl->it_type & TCE_PCI_SWINV_FREE) {
+		if (phb->type == PNV_PHB_IODA1)
+			pnv_pci_ioda1_tce_invalidate(tbl, tces, tcep - 1);
+		else
+			pnv_pci_ioda2_tce_invalidate(tbl, tces, tcep - 1);
+	}
 }
 
 static unsigned long pnv_tce_get(struct iommu_table *tbl, long index)
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index c6690b3..3cdc878 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -164,6 +164,8 @@ extern void pnv_pci_setup_iommu_table(struct iommu_table *tbl,
 extern void pnv_pci_init_p5ioc2_hub(struct device_node *np);
 extern void pnv_pci_init_ioda_hub(struct device_node *np);
 extern void pnv_pci_init_ioda2_phb(struct device_node *np);
-
-
+extern void pnv_pci_ioda1_tce_invalidate(struct iommu_table *tbl,
+					 u64 *startp, u64 *endp);
+extern void pnv_pci_ioda2_tce_invalidate(struct iommu_table *tbl,
+					 u64 *startp, u64 *endp);
 #endif /* __POWERNV_PCI_H */
-- 
1.7.5.4

^ permalink raw reply related
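
Note on [PATCH 5/7] above: pnv_pci_ioda2_tce_invalidate() builds each
value it writes to the invalidate register from a fixed 0x2 in the top
bits, the low 8 bits of the PE number, and the TCE index shifted left
by 12, then steps by 1 << 12 from the first entry to the last. The
sketch below reproduces only that arithmetic in user space. The helper
names are hypothetical, the values are collected in an array instead of
being written to MMIO, and what the fields mean to the hardware is not
stated in the patch, so no meaning is assumed here.

```c
#include <stddef.h>
#include <stdint.h>

/* Build the invalidation value for one TCE index within a PE. */
static uint64_t ioda2_inval_word(unsigned int pe_number, uint64_t tce_index)
{
	uint64_t word = 0x2ULL << 60;

	word |= pe_number & 0xFF;
	word |= tce_index << 12;
	return word;
}

/*
 * Collect the values that would be written for TCE indexes
 * first..last (inclusive). Returns how many were stored in out[].
 */
static size_t ioda2_inval_range(unsigned int pe_number,
				uint64_t first, uint64_t last,
				uint64_t *out, size_t max)
{
	uint64_t start = ioda2_inval_word(pe_number, first);
	uint64_t end = ioda2_inval_word(pe_number, last);
	const uint64_t inc = 1ULL << 12;
	size_t n = 0;

	while (start <= end && n < max) {
		out[n++] = start;
		start += inc;
	}
	return n;
}
```

In the patch the index is derived as it_offset plus the entry's
position in the table, i.e. (pointer - it_base) / sizeof(u64).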

* [PATCH v3 0/7] powerpc/powernv: PHB3 Support
From: Gavin Shan @ 2013-04-24  9:37 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Gavin Shan

The patchset includes minimal support for PHB3. Initially, flag "PNV_PHB_IODA2"
is introduced to differentiate IODA2 compliant PHB3 from other types of PHBs and
do initialization accordingly for PHB3. Besides, variable IODA2 tables reside in
system memory and we allocate them in kernel, then pass them to f/w and enable
the corresponding BARs through OPAL API. The P/Q bits of IVE should be handled
on PHB3 by software and the patchset intends to cover that as well.

NOTE: The first patch comes from Ben.

v2 -> v3
	* Remove the unnecessary quirk. That's only useful with simics
	* Do MSI EOI in single OPAL API opal_pci_msi_eoi()
	* Use explicit branch to fully utilize CPU's prefetching engine
	  while doing TCE invalidation
	* Add one patch to fix invalid IOMMU table for PCI devices
v1 -> v2
	* Introduce CONFIG_POWERNV_MSI, which is similar to CONFIG_PSERIES_MSI
	* Enable CONFIG_PPC_MSI_BITMAP while selecting CONFIG_POWERNV_MSI
	* Eliminate (struct pnv_phb::msi_count) since it has been removed in
	  linux-next
	* Replace (CONFIG_PPC_POWERNV && CONFIG_PCI_MSI) with CONFIG_POWERNV_MSI
	* Move declaration of pnv_pci_msi_eoi() to asm/xics.h
	* Remove unnecessary "#ifdef ... #endif" in icp-native.c
	* Add support to invalidate TCE
	* Let firmware allocate the IODA2 tables and have the kernel retrieve
	  them through the device tree

---

arch/powerpc/include/asm/iommu.h               |    1 +
arch/powerpc/include/asm/opal.h                |    7 +-
arch/powerpc/include/asm/xics.h                |    3 +
arch/powerpc/platforms/powernv/Kconfig         |    5 +
arch/powerpc/platforms/powernv/opal-wrappers.S |    1 +
arch/powerpc/platforms/powernv/pci-ioda.c      |  301 ++++++++++++++++++++----
arch/powerpc/platforms/powernv/pci-p5ioc2.c    |    1 +
arch/powerpc/platforms/powernv/pci.c           |   85 +++----
arch/powerpc/platforms/powernv/pci.h           |   28 ++-
arch/powerpc/sysdev/Kconfig                    |    1 +
arch/powerpc/sysdev/xics/icp-native.c          |   27 ++-
11 files changed, 356 insertions(+), 104 deletions(-)


Thanks,
Gavin

^ permalink raw reply

* [PATCH 4/7] powerpc/powernv: Patch MSI EOI handler on P8
From: Gavin Shan @ 2013-04-24  9:37 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Gavin Shan
In-Reply-To: <1366796259-29412-1-git-send-email-shangw@linux.vnet.ibm.com>

The EOI handler of MSI/MSI-X interrupts for P8 (PHB3) needs additional
steps to handle the P/Q bits in the IVE before EOIing the corresponding
interrupt. The patch changes the EOI handler to cover that.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/opal.h                |    2 +
 arch/powerpc/include/asm/xics.h                |    3 ++
 arch/powerpc/platforms/powernv/opal-wrappers.S |    1 +
 arch/powerpc/platforms/powernv/pci-ioda.c      |   16 ++++++++++++++
 arch/powerpc/platforms/powernv/pci.c           |   19 ++++++++++++++++
 arch/powerpc/platforms/powernv/pci.h           |    1 +
 arch/powerpc/sysdev/xics/icp-native.c          |   27 +++++++++++++++++++++++-
 7 files changed, 68 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 0af7ba0..93dad52 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -117,6 +117,7 @@ extern int opal_enter_rtas(struct rtas_args *args,
 #define OPAL_SET_SLOT_LED_STATUS		55
 #define OPAL_GET_EPOW_STATUS			56
 #define OPAL_SET_SYSTEM_ATTENTION_LED		57
+#define OPAL_PCI_MSI_EOI			63
 
 #ifndef __ASSEMBLY__
 
@@ -505,6 +506,7 @@ int64_t opal_pci_get_xive_reissue(uint64_t phb_id, uint32_t xive_number,
 				  uint8_t *p_bit, uint8_t *q_bit);
 int64_t opal_pci_set_xive_reissue(uint64_t phb_id, uint32_t xive_number,
 				  uint8_t p_bit, uint8_t q_bit);
+int64_t opal_pci_msi_eoi(uint64_t phb_id, uint32_t ive_number);
 int64_t opal_pci_set_xive_pe(uint64_t phb_id, uint32_t pe_number,
 			     uint32_t xive_num);
 int64_t opal_get_xive_source(uint64_t phb_id, uint32_t xive_num,
diff --git a/arch/powerpc/include/asm/xics.h b/arch/powerpc/include/asm/xics.h
index 4ae9a09..c4b364b 100644
--- a/arch/powerpc/include/asm/xics.h
+++ b/arch/powerpc/include/asm/xics.h
@@ -72,6 +72,9 @@ extern int ics_opal_init(void);
 static inline int ics_opal_init(void) { return -ENODEV; }
 #endif
 
+/* Extra EOI handler for PHB3 */
+extern int pnv_pci_msi_eoi(unsigned int hw_irq);
+
 /* ICS instance, hooked up to chip_data of an irq */
 struct ics {
 	struct list_head link;
diff --git a/arch/powerpc/platforms/powernv/opal-wrappers.S b/arch/powerpc/platforms/powernv/opal-wrappers.S
index 3bb07e5..6fabe92 100644
--- a/arch/powerpc/platforms/powernv/opal-wrappers.S
+++ b/arch/powerpc/platforms/powernv/opal-wrappers.S
@@ -107,3 +107,4 @@ OPAL_CALL(opal_pci_mask_pe_error,		OPAL_PCI_MASK_PE_ERROR);
 OPAL_CALL(opal_set_slot_led_status,		OPAL_SET_SLOT_LED_STATUS);
 OPAL_CALL(opal_get_epow_status,			OPAL_GET_EPOW_STATUS);
 OPAL_CALL(opal_set_system_attention_led,	OPAL_SET_SYSTEM_ATTENTION_LED);
+OPAL_CALL(opal_pci_msi_eoi,			OPAL_PCI_MSI_EOI);
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 0c15870..32197af 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -646,6 +646,20 @@ static int pnv_pci_ioda_msi_setup(struct pnv_phb *phb, struct pci_dev *dev,
 	return 0;
 }
 
+static int pnv_pci_ioda_msi_eoi(struct pnv_phb *phb, unsigned int hw_irq)
+{
+	long rc;
+
+	rc = opal_pci_msi_eoi(phb->opal_id, hw_irq - phb->msi_base);
+	if (rc) {
+		pr_warning("%s: Failed to EOI IRQ#%d on PHB#%d, rc=%ld\n",
+			   __func__, hw_irq, phb->hose->global_number, rc);
+		return -EIO;
+	}
+
+	return 0;
+}
+
 static void pnv_pci_init_ioda_msis(struct pnv_phb *phb)
 {
 	unsigned int count;
@@ -667,6 +681,8 @@ static void pnv_pci_init_ioda_msis(struct pnv_phb *phb)
 	}
 
 	phb->msi_setup = pnv_pci_ioda_msi_setup;
+	if (phb->type == PNV_PHB_IODA2)
+		phb->msi_eoi = pnv_pci_ioda_msi_eoi;
 	phb->msi32_support = 1;
 	pr_info("  Allocated bitmap for %d MSIs (base IRQ 0x%x)\n",
 		count, phb->msi_base);
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index a11b5a6..ea6a93d 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -115,6 +115,25 @@ static void pnv_teardown_msi_irqs(struct pci_dev *pdev)
 		irq_dispose_mapping(entry->irq);
 	}
 }
+
+int pnv_pci_msi_eoi(unsigned int hw_irq)
+{
+	struct pci_controller *hose, *tmp;
+	struct pnv_phb *phb = NULL;
+
+	list_for_each_entry_safe(hose, tmp, &hose_list, list_node) {
+		phb = hose->private_data;
+		if (hw_irq >= phb->msi_base &&
+		    hw_irq < phb->msi_base + phb->msi_bmp.irq_count) {
+			if (!phb->msi_eoi)
+				return -EEXIST;
+			return phb->msi_eoi(phb, hw_irq);
+		}
+	}
+
+	/* For LSI interrupts, we needn't do it */
+	return 0;
+}
 #endif /* CONFIG_PCI_MSI */
 
 static void pnv_pci_dump_p7ioc_diag_data(struct pnv_phb *phb)
diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
index c048c29..c6690b3 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -81,6 +81,7 @@ struct pnv_phb {
 	int (*msi_setup)(struct pnv_phb *phb, struct pci_dev *dev,
 			 unsigned int hwirq, unsigned int is_64,
 			 struct msi_msg *msg);
+	int (*msi_eoi)(struct pnv_phb *phb, unsigned int hw_irq);
 	void (*dma_dev_setup)(struct pnv_phb *phb, struct pci_dev *pdev);
 	void (*fixup_phb)(struct pci_controller *hose);
 	u32 (*bdfn_to_pe)(struct pnv_phb *phb, struct pci_bus *bus, u32 devfn);
diff --git a/arch/powerpc/sysdev/xics/icp-native.c b/arch/powerpc/sysdev/xics/icp-native.c
index 48861d3..38dd2b1 100644
--- a/arch/powerpc/sysdev/xics/icp-native.c
+++ b/arch/powerpc/sysdev/xics/icp-native.c
@@ -89,6 +89,22 @@ static void icp_native_eoi(struct irq_data *d)
 	icp_native_set_xirr((xics_pop_cppr() << 24) | hw_irq);
 }
 
+static void icp_p8_native_eoi(struct irq_data *d)
+{
+	unsigned int hw_irq = (unsigned int)irqd_to_hwirq(d);
+	int ret;
+
+	/* Let firmware handle P/Q bits */
+	if (hw_irq != XICS_IPI) {
+		ret = pnv_pci_msi_eoi(hw_irq);
+		WARN_ON_ONCE(ret);
+	}
+
+	/* EOI on ICP */
+	iosync();
+	icp_native_set_xirr((xics_pop_cppr() << 24) | hw_irq);
+}
+
 static void icp_native_teardown_cpu(void)
 {
 	int cpu = smp_processor_id();
@@ -264,7 +280,7 @@ static int __init icp_native_init_one_node(struct device_node *np,
 	return 0;
 }
 
-static const struct icp_ops icp_native_ops = {
+static struct icp_ops icp_native_ops = {
 	.get_irq	= icp_native_get_irq,
 	.eoi		= icp_native_eoi,
 	.set_priority	= icp_native_set_cpu_priority,
@@ -296,6 +312,15 @@ int __init icp_native_init(void)
 	if (found == 0)
 		return -ENODEV;
 
+	/* Change the EOI handler for P8 */
+#ifdef CONFIG_POWERNV_MSI
+	np = of_find_compatible_node(NULL, NULL, "ibm,power8-xicp");
+	if (np) {
+		icp_native_ops.eoi = icp_p8_native_eoi;
+		of_node_put(np);
+	}
+#endif
+
 	icp_ops = &icp_native_ops;
 
 	return 0;
-- 
1.7.5.4

^ permalink raw reply related
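
Note on [PATCH 4/7] above: pnv_pci_msi_eoi() walks the PHBs, finds the
one whose MSI window [msi_base, msi_base + count) contains the hardware
IRQ, and calls that PHB's msi_eoi hook; IRQs outside every window (LSIs)
need no extra step. A self-contained sketch of that dispatch follows.
An array stands in for the kernel's hose list, and struct phb,
demo_msi_eoi(), dispatch_msi_eoi() and ERR_NO_HANDLER are hypothetical
names used only for illustration (the patch returns -EEXIST where this
sketch returns ERR_NO_HANDLER).

```c
#include <stddef.h>

struct phb;
typedef int (*msi_eoi_fn)(struct phb *phb, unsigned int hw_irq);

struct phb {
	unsigned int msi_base;	/* first hardware IRQ of the MSI window */
	unsigned int msi_count;	/* number of IRQs in the window */
	msi_eoi_fn msi_eoi;	/* NULL when the PHB has no EOI hook */
	unsigned int last_eoi;	/* window-relative IRQ seen by the demo hook */
};

#define ERR_NO_HANDLER (-1)

/* Demo hook: record the window-relative IRQ number, as the patch
 * passes hw_irq - msi_base down to firmware. */
static int demo_msi_eoi(struct phb *phb, unsigned int hw_irq)
{
	phb->last_eoi = hw_irq - phb->msi_base;
	return 0;
}

/* Find the PHB owning hw_irq and run its EOI hook, if any. */
static int dispatch_msi_eoi(struct phb *phbs, size_t nr, unsigned int hw_irq)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		struct phb *phb = &phbs[i];

		if (hw_irq >= phb->msi_base &&
		    hw_irq < phb->msi_base + phb->msi_count) {
			if (!phb->msi_eoi)
				return ERR_NO_HANDLER;
			return phb->msi_eoi(phb, hw_irq);
		}
	}

	/* Not an MSI of any PHB: nothing extra to do. */
	return 0;
}
```

The lookup is linear in the number of PHBs and runs on every EOI, which
is the cost of keeping the ICP code unaware of PHB internals.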

* [PATCH] powerpc: Add isync to copy_and_flush
From: Michael Neuling @ 2013-04-24 10:30 UTC (permalink / raw)
  To: benh; +Cc: Linux PPC dev, miltonm, Nishanth Aravamudan

In __after_prom_start we copy the kernel down to zero in two calls to 
copy_and_flush.  After the first call (copy from 0 to copy_to_here:)
we jump to the newly copied code soon after.

Unfortunately there's no isync between the copy of this code and the
jump to it.  Hence it's possible that stale instructions could still be
in the icache or pipeline before we branch to it.

We've seen this on real machines, where it results in no console output
after:
  calling quiesce...
  returning from prom_init

The patch below adds an isync to ensure that the copy and flush have
completed before any branching to the new instructions occurs.

Signed-off-by: Michael Neuling <mikey@neuling.org>
cc: stable@kernel.org
---
benh: we should get this in 3.9 ASAP.

diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index 0886ae6..b61363d 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -509,6 +509,7 @@ _GLOBAL(copy_and_flush)
 	sync
 	addi	r5,r5,8
 	addi	r6,r6,8
+	isync
 	blr
 
 .align 8

^ permalink raw reply related

* Re: "attempt to move .org backwards" still show up
From: Michael Neuling @ 2013-04-24 10:37 UTC (permalink / raw)
  To: Mike Qiu
  Cc: sfr, matt, gang.chen, linux-kernel, paulus, Aneesh Kumar K.V,
	linuxppc-dev
In-Reply-To: <51779941.8080403@linux.vnet.ibm.com>

Mike Qiu <qiudayu@linux.vnet.ibm.com> wrote:

> On 2013/4/24 16:31, Michael Ellerman wrote:
> > On Wed, Apr 24, 2013 at 04:22:53PM +0800, Mike Qiu wrote:
> >> Hi all
> >>
> >> I get an error message when I compile the source code in Power7 platform
> >> use the newest upstream kernel.
> > Hi Mike,
> >
> > It depends on what your .config is. What defconfig are you building?
> I just copy the config file from /boot/config.* to .config and use make
> menuconfig
> change nothing by manually, then save.

Can you post the resulting config here?

Do you have this commit in your tree?
  commit 087aa036eb79f24b856893190359ba812b460f45
  Author: Chen Gang <gang.chen@asianux.com>
  powerpc: make additional room in exception vector area

Mikey

^ permalink raw reply

* Re: [PATCH 2/3 v14] iommu/fsl: Add additional iommu attributes required by the PAMU driver.
From: Joerg Roedel @ 2013-04-24 10:51 UTC (permalink / raw)
  To: Sethi Varun-B16395
  Cc: Wood Scott-B07421, linux-kernel@vger.kernel.org,
	Yoder Stuart-B08248, iommu@lists.linux-foundation.org,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <C5ECD7A89D1DC44195F34B25E172658D4B4D9A@039-SN2MPN1-011.039d.mgd.msft.net>

On Tue, Apr 23, 2013 at 02:10:25PM +0000, Sethi Varun-B16395 wrote:
> I think it's fine to have the header under linux, actually I also the
> intel-iommu header under linux.

Yes, the difference is that VT-d runs on x86 and on ia64. So there is no
single arch where the header could be placed. The amd-iommu.h file on
the other hand is x86 only and should also be moved to asm/, as I just
found out :)

And as long as PAMU is only needed on a single architecture the header
should also be arch-specific. If that changes someday the header can be
moved to a generic place.


	Joerg

^ permalink raw reply

* [RFC] device-tree.git automatic sync from linux.git
From: Ian Campbell @ 2013-04-24 10:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mips, linux-c6x-dev, linux-xtensa, microblaze-uclinux, x86,
	linux, linuxppc-dev, linux-arm-kernel

Hi,

First off apologies for the large CC list -- I think this catches the
arch list for all the arches with device tree source in the tree.

Various folks have expressed an interest in eventually splitting the
device tree bindings out of the Linux git repository into a separate
tree. This should help reduce the cross talk between the code and the
bindings and make it harder to accidentally "co-evolve" the bindings
and the code (i.e. break compatibility). There are also other projects
(such as Xen) which would like to use device-tree as an OS agnostic
description of the hardware.

I was talking to Grant about this at this Spring's LinaroConnect in Hong
Kong and as a first step he was interested in a device-tree.git
repository which is automatically kept in sync with the main Linux tree.
Somehow I found myself volunteering to set up that tree.

An RFC repository can be found at:
        http://xenbits.xen.org/gitweb/?p=people/ianc/device-tree-rebasing.git

This is created using git filter-branch and retains the full history for
the device tree source files up to v3.9-rc8.

The master branch contains everything including the required build
infrastructure while upstream/master and upstream/dts contain the most
recently converted upstream master branch and the pristine converted
version respectively. Each upstream tag T is paired with a tag T-dts
which is the converted version of that tag.

Note that the tree will be potentially rebasing (hence the name) for the
time being while I'm still smoothing out the conversion process.

The paths to include in the conversion are described in
scripts/rewrite-paths.sed. The generic cases are:
        arch/ARCH/boot/dts/*.dts and *.dts? (for dtsi and dtsp etc)
	arch/ARCH/boot/*.dts and *.dts?
        arch/ARCH/include/dts/* (currently unused?)
which become src/ARCH/*.dts and *.dts? plus src/ARCH/include/*

There are also some special cases for some arches which don't follow
this pattern and for older versions of the kernel which were less
consistent. The paths were gleaned from git ls-tree + grep on every tag
in the tree, so if a file was added and moved between two rcs then the
original path may not be covered (so the move will look like it just
adds the files).

In principle this supports the new .dtsp files and includes the required
include paths in the conversion but none of them seem to be in mainline
yet, so we'll have to see!

The initial conversion took in excess of 40 hours[0] (running out of a
ramdisk). Even if the result is stable in terms of commit ids etc, a
fresh conversion every time isn't an option for a ~daily sync, so I had
to create a slightly hacked around git-filter-branch (found in
scripts/git-filter-branch) to support incremental filtering, which I
intend to send to the git folks soon.

Please let me know what you think.

Ian.

[0] real    2533m32.142s
    user    2393m35.039s
    sys     343m44.385s

^ permalink raw reply
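
Note on the RFC above: the generic path rules (arch/ARCH/boot/dts/*.dts,
arch/ARCH/boot/*.dts and arch/ARCH/include/dts/* becoming src/ARCH/...)
can be illustrated with a small C function. This is only a sketch of
the mapping as described in the mail, not the contents of
scripts/rewrite-paths.sed, which is a sed script and also handles
special cases not shown here. The function names are hypothetical, and
rejecting nested directories is a simplifying assumption of this
sketch; how the real script treats them is not stated.

```c
#include <stdio.h>
#include <string.h>

/* True if name ends in ".dts" or ".dts" plus one more character. */
static int has_dts_suffix(const char *name)
{
	const char *dot = strrchr(name, '.');
	size_t n;

	if (!dot || dot == name)
		return 0;
	n = strlen(dot);
	return strncmp(dot, ".dts", 4) == 0 && (n == 4 || n == 5);
}

/*
 * Map a kernel-tree path to the split-tree layout.
 * Returns 0 and fills out[] on a match, -1 if the path is not covered
 * by the generic rules or does not fit in the buffer.
 */
static int rewrite_dts_path(const char *path, char *out, size_t outsz)
{
	char arch[32];
	int consumed = 0;
	const char *rest;
	int n;

	/* Every generic rule starts with arch/ARCH/ */
	if (sscanf(path, "arch/%31[^/]/%n", arch, &consumed) != 1 ||
	    consumed == 0)
		return -1;
	rest = path + consumed;

	if (strncmp(rest, "include/dts/", 12) == 0 && rest[12] != '\0') {
		/* arch/ARCH/include/dts/FILE -> src/ARCH/include/FILE */
		n = snprintf(out, outsz, "src/%s/include/%s", arch, rest + 12);
	} else {
		/* arch/ARCH/boot/dts/FILE or arch/ARCH/boot/FILE */
		if (strncmp(rest, "boot/", 5) != 0)
			return -1;
		rest += 5;
		if (strncmp(rest, "dts/", 4) == 0)
			rest += 4;
		if (strchr(rest, '/') || !has_dts_suffix(rest))
			return -1;
		n = snprintf(out, outsz, "src/%s/%s", arch, rest);
	}

	return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

Paths that match none of the rules are simply dropped from the
converted history, which is why a file added under an uncovered path
and moved later shows up as a plain addition.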

* Re: [PATCH v2 01/15] powerpc/85xx: cache operations for Freescale SoCs based on BOOK3E
From: Zhao Chenhui @ 2013-04-24 11:08 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev, linux-kernel, r58472
In-Reply-To: <1366760770.5825.17@snotra>

On Tue, Apr 23, 2013 at 06:46:10PM -0500, Scott Wood wrote:
> On 04/19/2013 05:47:34 AM, Zhao Chenhui wrote:
> >These cache operations support Freescale SoCs based on BOOK3E.
> >Move L1 cache operations to fsl_booke_cache.S in order to maintain
> >easily. And, add cache operations for backside L2 cache and
> >platform cache.
> >
> >The backside L2 cache appears on e500mc and e5500 core. The
> >platform cache
> >supported by this patch is L2 Look-Aside Cache, which appears on SoCs
> >with e500v1/e500v2 core, such as MPC8572, P1020, etc.
> >
> >Signed-off-by: Zhao Chenhui <chenhui.zhao@freescale.com>
> >Signed-off-by: Li Yang <leoli@freescale.com>
> >---
> > arch/powerpc/include/asm/cacheflush.h |    8 ++
> > arch/powerpc/kernel/Makefile          |    1 +
> > arch/powerpc/kernel/fsl_booke_cache.S |  210
> >+++++++++++++++++++++++++++++++++
> > arch/powerpc/kernel/head_fsl_booke.S  |   74 ------------
> > 4 files changed, 219 insertions(+), 74 deletions(-)
> > create mode 100644 arch/powerpc/kernel/fsl_booke_cache.S
> >
> >diff --git a/arch/powerpc/include/asm/cacheflush.h
> >b/arch/powerpc/include/asm/cacheflush.h
> >index b843e35..bc3f937 100644
> >--- a/arch/powerpc/include/asm/cacheflush.h
> >+++ b/arch/powerpc/include/asm/cacheflush.h
> >@@ -32,6 +32,14 @@ extern void flush_dcache_page(struct page *page);
> >
> > extern void __flush_disable_L1(void);
> >
> >+#ifdef CONFIG_FSL_SOC_BOOKE
> >+void flush_dcache_L1(void);
> >+void flush_backside_L2_cache(void);
> >+void disable_backside_L2_cache(void);
> >+void flush_disable_L2(void);
> >+void invalidate_enable_L2(void);
> >+#endif
> 
> Don't ifdef prototypes unless there's a good reason, such as
> providing an inline alternative.

I'll get rid of this "#ifdef".
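
For reference, the one case Scott names where an #ifdef around a
prototype is justified is when the #else branch supplies an inline
alternative. A minimal sketch of that pattern follows. It is not part of
the patch: example_cpu_down_prepare() is a hypothetical caller, and
CONFIG_FSL_SOC_BOOKE is deliberately left undefined so that the stub is
what gets compiled.

```c
/*
 * Sketch only: an #ifdef around a prototype earns its keep when the
 * #else branch provides an inline stub, so callers need no #ifdef.
 * CONFIG_FSL_SOC_BOOKE is intentionally undefined in this example.
 */
#ifdef CONFIG_FSL_SOC_BOOKE
void flush_dcache_L1(void);			/* real routine, defined elsewhere */
#else
static inline void flush_dcache_L1(void) { }	/* no-op stub */
#endif

/* Hypothetical caller: compiles unchanged in either configuration. */
static int example_cpu_down_prepare(void)
{
	flush_dcache_L1();
	return 0;
}
```

Without such a stub, the plain unconditional prototype is simpler and
costs nothing if the function is never called.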

> 
> Why do you have "flush_backside_L2_cache" and
> "disable_backside_L2_cache" as something different from
> "flush_disable_L2"?  The latter should flush whatever L2 is present.
> Don't treat pre-corenet as the default.
> 

These L2 caches are very different. The backside L2 is integrated in
the e500mc/e5500 core and controlled by SPR registers. The L2 cache
handled by flush_disable_L2, by contrast, is on the SoC and is
controlled by registers mapped in CCSR.

> Why do we even need to distinguish L1 from L2 at all?  Shouldn't the
> function that gets exposed just be "flush and disable data caches
> that are specific to this cpu"?  What should happen on e6500?
> 
> -Scott

Yes. It is a good idea to use a set of uniform functions to operate the caches of
e500/e500mc/e5500/e6500 and SoCs. I'll think over your comments.
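
One possible shape for such a uniform interface, as a rough user-space
sketch rather than a proposal for the actual code: a per-core-type table
of cache routines behind a single entry point. Every name here
(core_type, cache_ops, flush_data_caches, the stub_* functions and their
counters) is hypothetical, and e6500 is left out because the thread has
not settled what should happen there.

```c
/* Hypothetical core types; only the two families discussed above. */
enum core_type { CORE_E500V2, CORE_E500MC, CORE_TYPE_COUNT };

/* Counters so the stand-in routines below have an observable effect. */
static int l1_flushes, backside_l2_flushes, soc_l2_flushes;

/* Stand-ins for the real assembly routines. */
static void stub_flush_l1(void)          { l1_flushes++; }
static void stub_flush_backside_l2(void) { backside_l2_flushes++; }
static void stub_flush_soc_l2(void)      { soc_l2_flushes++; }

struct cache_ops {
	void (*flush_l1)(void);
	void (*flush_l2)(void);
};

/* The L2 routine differs per core family; callers never see that. */
static const struct cache_ops cache_ops[CORE_TYPE_COUNT] = {
	[CORE_E500V2] = { stub_flush_l1, stub_flush_soc_l2 },
	[CORE_E500MC] = { stub_flush_l1, stub_flush_backside_l2 },
};

/* Single exposed entry point: flush whatever data caches this core has. */
static void flush_data_caches(enum core_type type)
{
	const struct cache_ops *ops = &cache_ops[type];

	if (ops->flush_l1)
		ops->flush_l1();
	if (ops->flush_l2)
		ops->flush_l2();
}
```

With this layout, callers only ever use flush_data_caches(), and the
backside-versus-SoC distinction stays inside the table.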

Thanks for your comments.

-Chenhui


* Re: [PATCH v2 13/15] powerpc/85xx: add support for e6500 L1 cache operation
From: Zhao Chenhui @ 2013-04-24 11:14 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev, linux-kernel, r58472
In-Reply-To: <1366761649.5825.20@snotra>

On Tue, Apr 23, 2013 at 07:00:49PM -0500, Scott Wood wrote:
> On 04/19/2013 05:47:46 AM, Zhao Chenhui wrote:
> >From: Chen-Hui Zhao <chenhui.zhao@freescale.com>
> >
> >The L1 Data Cache of e6500 contains no modified data, so no flush
> >is required.
> >
> >Signed-off-by: Zhao Chenhui <chenhui.zhao@freescale.com>
> >Signed-off-by: Li Yang <leoli@freescale.com>
> >Signed-off-by: Andy Fleming <afleming@freescale.com>
> >---
> > arch/powerpc/kernel/fsl_booke_cache.S |   11 ++++++++++-
> > 1 files changed, 10 insertions(+), 1 deletions(-)
> >
> >diff --git a/arch/powerpc/kernel/fsl_booke_cache.S b/arch/powerpc/kernel/fsl_booke_cache.S
> >index 232c47b..24a52bb 100644
> >--- a/arch/powerpc/kernel/fsl_booke_cache.S
> >+++ b/arch/powerpc/kernel/fsl_booke_cache.S
> >@@ -65,13 +65,22 @@ _GLOBAL(flush_dcache_L1)
> >
> > 	blr
> >
> >+#define PVR_E6500	0x8040
> >+
> > /* Flush L1 d-cache, invalidate and disable d-cache and i-cache */
> > _GLOBAL(__flush_disable_L1)
> >+/* L1 Data Cache of e6500 contains no modified data, no flush is required */
> >+	mfspr	r3, SPRN_PVR
> >+	rlwinm	r4, r3, 16, 0xffff
> >+	lis	r5, 0
> >+	ori	r5, r5, PVR_E6500@l
> >+	cmpw	r4, r5
> >+	beq	2f
> > 	mflr	r10
> > 	bl	flush_dcache_L1	/* Flush L1 d-cache */
> > 	mtlr	r10
> >
> >-	msync
> >+2:	msync
> > 	mfspr	r4, SPRN_L1CSR0	/* Invalidate and disable d-cache */
> > 	li	r5, 2
> > 	rlwimi	r4, r5, 0, 3
> 
> Note that disabling the cache is a core operation, rather than a
> thread operation.  Is this only called when the second thread is
> disabled?
> 
> -Scott

It is called only when a core is down.
I can add a comment in the code.
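
For readers less used to PowerPC assembly, the check in the quoted patch
(mfspr, then rlwinm with a rotate of 16 and a 0xffff mask, then a compare
against PVR_E6500) amounts to testing the upper 16 bits of the PVR. A
small C sketch of the same test follows; the helper names pvr_version()
and l1_flush_not_needed() are made up for illustration, and only the
0x8040 value comes from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define PVR_E6500	0x8040	/* value taken from the quoted patch */

/* Upper 16 bits of the PVR; same result as "rlwinm r4, r3, 16, 0xffff". */
static inline uint32_t pvr_version(uint32_t pvr)
{
	return (pvr >> 16) & 0xffff;
}

/* True when the patch would branch past the L1 d-cache flush. */
static bool l1_flush_not_needed(uint32_t pvr)
{
	return pvr_version(pvr) == PVR_E6500;
}
```

The lower 16 bits of the PVR are ignored by the comparison, so any
revision of a part with version field 0x8040 takes the "skip the flush"
path.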

-Chenhui


* RE: [PATCH 2/3 v14] iommu/fsl: Add additional iommu attributes required by the PAMU driver.
From: Sethi Varun-B16395 @ 2013-04-24 11:21 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Wood Scott-B07421, linux-kernel@vger.kernel.org,
	Yoder Stuart-B08248, iommu@lists.linux-foundation.org,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20130424105117.GI17148@8bytes.org>



> -----Original Message-----
> From: Joerg Roedel [mailto:joro@8bytes.org]
> Sent: Wednesday, April 24, 2013 4:21 PM
> To: Sethi Varun-B16395
> Cc: iommu@lists.linux-foundation.org; linuxppc-dev@lists.ozlabs.org;
> linux-kernel@vger.kernel.org; galak@kernel.crashing.org;
> benh@kernel.crashing.org; Yoder Stuart-B08248; Wood Scott-B07421
> Subject: Re: [PATCH 2/3 v14] iommu/fsl: Add additional iommu attributes
> required by the PAMU driver.
>
> On Tue, Apr 23, 2013 at 02:10:25PM +0000, Sethi Varun-B16395 wrote:
> > I think it's fine to have the header under linux; actually, I also see
> > the intel-iommu header under linux.
>
> Yes, the difference is that VT-d runs on x86 and on ia64. So there is no
> single arch where the header could be placed. The amd-iommu.h file on the
> other hand is x86 only and should also be moved to asm/, as I just found
> out :)
>
> And as long as PAMU is only needed on a single architecture the header
> should also be arch-specific. If that changes someday the header can be
> moved to a generic place.
>
Ok, I will post the next version of the patch set.

-Varun


* Re: [PATCH v2 12/15] powerpc/85xx: add time base sync support for e6500
From: Zhao Chenhui @ 2013-04-24 11:29 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev, linux-kernel, r58472
In-Reply-To: <1366761846.5825.21@snotra>

On Tue, Apr 23, 2013 at 07:04:06PM -0500, Scott Wood wrote:
> On 04/19/2013 05:47:45 AM, Zhao Chenhui wrote:
> >From: Chen-Hui Zhao <chenhui.zhao@freescale.com>
> >
> >For e6500, two threads in one core share one time base. Just need
> >to do time base sync on first thread of one core, and skip it on
> >the other thread.
> >
> >Signed-off-by: Zhao Chenhui <chenhui.zhao@freescale.com>
> >Signed-off-by: Li Yang <leoli@freescale.com>
> >Signed-off-by: Andy Fleming <afleming@freescale.com>
> >---
> > arch/powerpc/platforms/85xx/smp.c |   52 +++++++++++++++++++++++++++++++-----
> > 1 files changed, 44 insertions(+), 8 deletions(-)
> >
> >diff --git a/arch/powerpc/platforms/85xx/smp.c b/arch/powerpc/platforms/85xx/smp.c
> >index 74d8cde..5f3eee3 100644
> >--- a/arch/powerpc/platforms/85xx/smp.c
> >+++ b/arch/powerpc/platforms/85xx/smp.c
> >@@ -26,6 +26,7 @@
> > #include <asm/cacheflush.h>
> > #include <asm/dbell.h>
> > #include <asm/fsl_guts.h>
> >+#include <asm/cputhreads.h>
> >
> > #include <sysdev/fsl_soc.h>
> > #include <sysdev/mpic.h>
> >@@ -45,6 +46,7 @@ static u64 timebase;
> > static int tb_req;
> > static int tb_valid;
> > static u32 cur_booting_core;
> >+static bool rcpmv2;
> >
> > #ifdef CONFIG_PPC_E500MC
> > /* get a physical mask of online cores and booting core */
> >@@ -53,26 +55,40 @@ static inline u32 get_phy_cpu_mask(void)
> > 	u32 mask;
> > 	int cpu;
> >
> >-	mask = 1 << cur_booting_core;
> >-	for_each_online_cpu(cpu)
> >-		mask |= 1 << get_hard_smp_processor_id(cpu);
> >+	if (smt_capable()) {
> >+		/* two threads in one core share one time base */
> >+		mask = 1 << cpu_core_index_of_thread(cur_booting_core);
> >+		for_each_online_cpu(cpu)
> >+			mask |= 1 << cpu_core_index_of_thread(
> >+					get_hard_smp_processor_id(cpu));
> >+	} else {
> >+		mask = 1 << cur_booting_core;
> >+		for_each_online_cpu(cpu)
> >+			mask |= 1 << get_hard_smp_processor_id(cpu);
> >+	}
> 
> Where is smt_capable() defined?  I assume somewhere in the patchset
> but it's a pain to search 12 patches...
> 

It is defined in arch/powerpc/include/asm/topology.h.
	#define smt_capable()           (cpu_has_feature(CPU_FTR_SMT))

Thanks for your review again.

> Is this really about whether we're SMT-capable or whether we have
> rcpm v2?
> 
> -Scott

I think this "if" statement can be removed. cpu_core_index_of_thread()
returns the correct core number whether or not the core has threads.

Like this:
static inline u32 get_phy_cpu_mask(void)
{
	u32 mask;
	int cpu;

	mask = 1 << cpu_core_index_of_thread(cur_booting_core);
	for_each_online_cpu(cpu)
		mask |= 1 << cpu_core_index_of_thread(
				get_hard_smp_processor_id(cpu));

	return mask;
}
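
To illustrate why the single code path covers both cases, here is a
small user-space model of the mask computation. It assumes that
cpu_core_index_of_thread() shifts the hardware CPU number right by
log2(threads per core), which is how the helper is understood to behave;
the names core_index_of_thread(), phy_core_mask() and the explicit
threads_shift parameter are stand-ins introduced for this sketch and are
not kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of the helper: threads_shift is log2(threads per core). */
static uint32_t core_index_of_thread(uint32_t hw_cpu, unsigned int threads_shift)
{
	return hw_cpu >> threads_shift;
}

/* One bit per physical core that is online or currently booting. */
static uint32_t phy_core_mask(uint32_t booting_cpu, const uint32_t *online,
			      size_t n_online, unsigned int threads_shift)
{
	uint32_t mask;
	size_t i;

	assert(threads_shift < 32);

	mask = UINT32_C(1) << core_index_of_thread(booting_cpu, threads_shift);
	for (i = 0; i < n_online; i++)
		mask |= UINT32_C(1) << core_index_of_thread(online[i],
							    threads_shift);

	return mask;
}
```

With threads_shift set to 0 the helper is the identity, so the result
matches the old non-SMT branch; with threads_shift set to 1, both
threads of a core map to the same bit.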

-Chenhui
