LinuxPPC-Dev Archive on lore.kernel.org
* Re: [PATCH 13/16] debug_vm_pgtable/pmd_clear: Don't use pmd/pud_clear on pte entries
From: Anshuman Khandual @ 2020-08-13  5:27 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linux-mm, akpm; +Cc: linuxppc-dev
In-Reply-To: <20200812063358.369514-13-aneesh.kumar@linux.ibm.com>



On 08/12/2020 12:03 PM, Aneesh Kumar K.V wrote:
> pmd_clear() should not be used to clear pmd level pte entries.

Could you please elaborate on this? The proposed change set does
not match the description here.

> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  mm/debug_vm_pgtable.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index 061c19bba7f0..529892b9be2f 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -191,6 +191,8 @@ static void __init pmd_advanced_tests(struct mm_struct *mm,
>  	pmd = READ_ONCE(*pmdp);
>  	WARN_ON(pmd_young(pmd));
>  
> +	/*  Clear the pte entries  */
> +	pmdp_huge_get_and_clear(mm, vaddr, pmdp);
>  	pgtable = pgtable_trans_huge_withdraw(mm, pmdp);
>  }
>  
> @@ -313,6 +315,8 @@ static void __init pud_advanced_tests(struct mm_struct *mm,
>  	pudp_test_and_clear_young(vma, vaddr, pudp);
>  	pud = READ_ONCE(*pudp);
>  	WARN_ON(pud_young(pud));
> +
> +	pudp_huge_get_and_clear(mm, vaddr, pudp);
>  }
>  
>  static void __init pud_leaf_tests(unsigned long pfn, pgprot_t prot)
> @@ -431,8 +435,6 @@ static void __init pud_populate_tests(struct mm_struct *mm, pud_t *pudp,
>  	 * This entry points to next level page table page.
>  	 * Hence this must not qualify as pud_bad().
>  	 */
> -	pmd_clear(pmdp);
> -	pud_clear(pudp);

Both entries are cleared before creating a fresh page table entry.
Why is that a problem?

>  	pud_populate(mm, pudp, pmdp);
>  	pud = READ_ONCE(*pudp);
>  	WARN_ON(pud_bad(pud));
> @@ -564,7 +566,6 @@ static void __init pmd_populate_tests(struct mm_struct *mm, pmd_t *pmdp,
>  	 * This entry points to next level page table page.
>  	 * Hence this must not qualify as pmd_bad().
>  	 */
> -	pmd_clear(pmdp);

Ditto.

>  	pmd_populate(mm, pmdp, pgtable);
>  	pmd = READ_ONCE(*pmdp);
>  	WARN_ON(pmd_bad(pmd));
> 


* Re: [PATCH 16/16] debug_vm_pgtable/ppc64: Add a variant of pfn_pte/pmd
From: Anshuman Khandual @ 2020-08-13  5:30 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linux-mm, akpm; +Cc: linuxppc-dev
In-Reply-To: <20200812063358.369514-16-aneesh.kumar@linux.ibm.com>


On 08/12/2020 12:03 PM, Aneesh Kumar K.V wrote:
> The tests do expect _PAGE_PTE bit set by different page table accessors.
> This is not true for the kernel. Within the kernel, _PAGE_PTE bits are
> usually set by set_pte_at(). To make the below tests work correctly add test
> specific pfn_pte/pmd helpers that set _PAGE_PTE bit.
> 
> pte_t pte = pfn_pte(pfn, prot);
> WARN_ON(!pte_devmap(pte_mkdevmap(pte)));
> WARN_ON(!pte_savedwrite(pte_mk_savedwrite(pte_clear_savedwrite(pte))));
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  mm/debug_vm_pgtable.c | 65 +++++++++++++++++++++++++++----------------
>  1 file changed, 41 insertions(+), 24 deletions(-)
> 
> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
> index eea62d5e503b..153c925b5273 100644
> --- a/mm/debug_vm_pgtable.c
> +++ b/mm/debug_vm_pgtable.c
> @@ -31,6 +31,23 @@
>  #include <asm/pgalloc.h>
>  #include <asm/tlbflush.h>
>  
> +#ifdef CONFIG_PPC_BOOK3S_64
> +static inline pte_t debug_vm_pfn_pte(unsigned long pfn, pgprot_t pgprot)
> +{
> +	pte_t pte = pfn_pte(pfn, pgprot);
> +	return __pte(pte_val(pte) | _PAGE_PTE);
> +
> +}
> +static inline pmd_t debug_vm_pfn_pmd(unsigned long pfn, pgprot_t pgprot)
> +{
> +	pmd_t pmd = pfn_pmd(pfn, pgprot);
> +	return __pmd(pmd_val(pmd) | _PAGE_PTE);
> +}
> +#else
> +#define debug_vm_pfn_pte(pfn, pgprot) pfn_pte(pfn, pgprot)
> +#define debug_vm_pfn_pmd(pfn, pgprot) pfn_pmd(pfn, pgprot)
> +#endif

Again, no platform-specific constructs please. This defeats the whole purpose of
this test. If _PAGE_PTE is required for the helpers, then pfn_pmd/pte() could
be modified to accommodate that. We don't see similar issues on other platforms,
hence could you please explain why ppc64 is different here?


* Re: [PATCH v3 2/2] powerpc/uaccess: Add pre-update addressing to __get_user_asm() and __put_user_asm()
From: Christophe Leroy @ 2020-08-13  5:56 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Paul Mackerras, linuxppc-dev, linux-kernel
In-Reply-To: <20200812193712.GV6753@gate.crashing.org>



On 12/08/2020 at 21:37, Segher Boessenkool wrote:
> On Wed, Aug 12, 2020 at 12:25:17PM +0000, Christophe Leroy wrote:
>> Enable pre-update addressing mode in __get_user_asm() and __put_user_asm()
>>
>> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
>> ---
>> v3: new, split out from patch 1.
> 
> It still looks fine to me, you can keep my Reviewed-by: :-)
> 

Ah yes thanks, I forgot it when I committed it.

Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>

Christophe


* Re: INFO: task hung in pipe_release (2)
From: syzbot @ 2020-08-13  3:57 UTC (permalink / raw)
  To: James.Bottomley, amanieu, arnd, benh, bfields, borntraeger, bp,
	catalin.marinas, chris, christian, corbet, cyphar, dalias, davem,
	deller, dvyukov, fenghua.yu, geert, gor, heiko.carstens, hpa, ink,
	jcmvbkbc, jhogan, jlayton, kvalo, linux-alpha, linux-api,
	linux-arch, linux-arm-kernel, linux-fsdevel, linux-ia64,
	linux-kernel, linux-m68k, linux-mips, linux-parisc, linux-s390,
	linux-sh, linux-xtensa, linux, linux, linuxppc-dev,
	luis.f.correia, luto, martink, mattst88, ming.lei, ming.lei,
	mingo, monstr
In-Reply-To: <00000000000084b59f05abe928ee@google.com>

syzbot has bisected this issue to:

commit fddb5d430ad9fa91b49b1d34d0202ffe2fa0e179
Author: Aleksa Sarai <cyphar@cyphar.com>
Date:   Sat Jan 18 12:07:59 2020 +0000

    open: introduce openat2(2) syscall

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=164e716a900000
start commit:   6ba1b005 Merge tag 'asm-generic-fixes-5.8' of git://git.ke..
git tree:       upstream
final oops:     https://syzkaller.appspot.com/x/report.txt?x=154e716a900000
console output: https://syzkaller.appspot.com/x/log.txt?x=114e716a900000
kernel config:  https://syzkaller.appspot.com/x/.config?x=84f076779e989e69
dashboard link: https://syzkaller.appspot.com/bug?extid=61acc40a49a3e46e25ea
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=142ae224900000

Reported-by: syzbot+61acc40a49a3e46e25ea@syzkaller.appspotmail.com
Fixes: fddb5d430ad9 ("open: introduce openat2(2) syscall")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection


* Re: [PATCH 16/16] debug_vm_pgtable/ppc64: Add a variant of pfn_pte/pmd
From: Aneesh Kumar K.V @ 2020-08-13  6:37 UTC (permalink / raw)
  To: Anshuman Khandual, linux-mm, akpm; +Cc: linuxppc-dev
In-Reply-To: <fe7beb39-97e6-dd7c-59d4-e1a72bab3d71@arm.com>

On 8/13/20 11:00 AM, Anshuman Khandual wrote:
> 
> On 08/12/2020 12:03 PM, Aneesh Kumar K.V wrote:
>> The tests do expect _PAGE_PTE bit set by different page table accessors.
>> This is not true for the kernel. Within the kernel, _PAGE_PTE bits are
>> usually set by set_pte_at(). To make the below tests work correctly add test
>> specific pfn_pte/pmd helpers that set _PAGE_PTE bit.
>>
>> pte_t pte = pfn_pte(pfn, prot);
>> WARN_ON(!pte_devmap(pte_mkdevmap(pte)));
>> WARN_ON(!pte_savedwrite(pte_mk_savedwrite(pte_clear_savedwrite(pte))));
>>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> ---
>>   mm/debug_vm_pgtable.c | 65 +++++++++++++++++++++++++++----------------
>>   1 file changed, 41 insertions(+), 24 deletions(-)
>>
>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
>> index eea62d5e503b..153c925b5273 100644
>> --- a/mm/debug_vm_pgtable.c
>> +++ b/mm/debug_vm_pgtable.c
>> @@ -31,6 +31,23 @@
>>   #include <asm/pgalloc.h>
>>   #include <asm/tlbflush.h>
>>   
>> +#ifdef CONFIG_PPC_BOOK3S_64
>> +static inline pte_t debug_vm_pfn_pte(unsigned long pfn, pgprot_t pgprot)
>> +{
>> +	pte_t pte = pfn_pte(pfn, pgprot);
>> +	return __pte(pte_val(pte) | _PAGE_PTE);
>> +
>> +}
>> +static inline pmd_t debug_vm_pfn_pmd(unsigned long pfn, pgprot_t pgprot)
>> +{
>> +	pmd_t pmd = pfn_pmd(pfn, pgprot);
>> +	return __pmd(pmd_val(pmd) | _PAGE_PTE);
>> +}
>> +#else
>> +#define debug_vm_pfn_pte(pfn, pgprot) pfn_pte(pfn, pgprot)
>> +#define debug_vm_pfn_pmd(pfn, pgprot) pfn_pmd(pfn, pgprot)
>> +#endif
> 
> Again, no platform-specific constructs please. This defeats the whole purpose of
> this test. If _PAGE_PTE is required for the helpers, then pfn_pmd/pte() could
> be modified to accommodate that. We don't see similar issues on other platforms,
> hence could you please explain why ppc64 is different here?
> 

It is not platform specific. set_pte_at() is the one that sets the
_PAGE_PTE bit. We don't call that in the test. The test seems to
assume that pfn_pte() returns a complete pte, which is not true.

-aneesh
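
The distinction described above can be modelled in a few lines of
user-space C. This is only a sketch: the `model_*` names, the bit
position chosen for `MODEL_PAGE_PTE` and the simplified pte type are
assumptions made for illustration, not the ppc64 definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real ppc64 definitions differ. */
typedef struct { uint64_t val; } model_pte_t;

#define MODEL_PAGE_PTE  (1ULL << 62)	/* assumed "this is a leaf pte" marker */
#define MODEL_PFN_SHIFT 12

/* Like pfn_pte() as described in the thread: no leaf marker is added. */
static model_pte_t model_pfn_pte(uint64_t pfn, uint64_t prot)
{
	model_pte_t pte = { (pfn << MODEL_PFN_SHIFT) | prot };

	return pte;
}

/* Like set_pte_at() as described in the thread: the marker is added on install. */
static void model_set_pte_at(model_pte_t *ptep, model_pte_t pte)
{
	ptep->val = pte.val | MODEL_PAGE_PTE;
}

/* An accessor that, like the helpers under test, expects the marker. */
static bool model_pte_is_leaf(model_pte_t pte)
{
	return (pte.val & MODEL_PAGE_PTE) != 0;
}
```

In this model, a value taken straight from model_pfn_pte() fails the
model_pte_is_leaf() check, while the same value read back after
model_set_pte_at() passes it. That is the gap the test-only helpers in
the patch are meant to close.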


* Re: [PATCH 10/16] debug_vm_pgtable/thp: Use page table depost/withdraw with THP
From: Aneesh Kumar K.V @ 2020-08-13  6:38 UTC (permalink / raw)
  To: Anshuman Khandual, linux-mm, akpm; +Cc: linuxppc-dev
In-Reply-To: <40f2acb5-1da3-4c0a-5590-3fd12d128421@arm.com>

On 8/13/20 10:55 AM, Anshuman Khandual wrote:
> On 08/12/2020 12:03 PM, Aneesh Kumar K.V wrote:
>> Architectures like ppc64 use a deposited page table while updating the huge pte
>> entries.
>>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> ---
>>   mm/debug_vm_pgtable.c | 8 ++++++--
>>   1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
>> index 644d28861ce9..48475d288df1 100644
>> --- a/mm/debug_vm_pgtable.c
>> +++ b/mm/debug_vm_pgtable.c
>> @@ -147,7 +147,7 @@ static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot)
>>   static void __init pmd_advanced_tests(struct mm_struct *mm,
>>   				      struct vm_area_struct *vma, pmd_t *pmdp,
>>   				      unsigned long pfn, unsigned long vaddr,
>> -				      pgprot_t prot)
>> +				      pgprot_t prot, pgtable_t pgtable)
>>   {
>>   	pmd_t pmd;
>>   
>> @@ -158,6 +158,8 @@ static void __init pmd_advanced_tests(struct mm_struct *mm,
>>   	/* Align the address wrt HPAGE_PMD_SIZE */
>>   	vaddr = (vaddr & HPAGE_PMD_MASK) + HPAGE_PMD_SIZE;
>>   
>> +	pgtable_trans_huge_deposit(mm, pmdp, pgtable);
>> +
>>   	pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
>>   	set_pmd_at(mm, vaddr, pmdp, pmd);
>>   	pmdp_set_wrprotect(mm, vaddr, pmdp);
>> @@ -188,6 +190,8 @@ static void __init pmd_advanced_tests(struct mm_struct *mm,
>>   	pmdp_test_and_clear_young(vma, vaddr, pmdp);
>>   	pmd = READ_ONCE(*pmdp);
>>   	WARN_ON(pmd_young(pmd));
>> +
>> +	pgtable = pgtable_trans_huge_withdraw(mm, pmdp);
>>   }
>>   
>>   static void __init pmd_leaf_tests(unsigned long pfn, pgprot_t prot)
>> @@ -1002,7 +1006,7 @@ static int __init debug_vm_pgtable(void)
>>   	pgd_clear_tests(mm, pgdp);
>>   
>>   	pte_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
>> -	pmd_advanced_tests(mm, vma, pmdp, pmd_aligned, vaddr, prot);
>> +	pmd_advanced_tests(mm, vma, pmdp, pmd_aligned, vaddr, prot, saved_ptep);
>>   	pud_advanced_tests(mm, vma, pudp, pud_aligned, vaddr, prot);
>>   	hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
>>   
>>
> 
> Makes sense, if it is required for THP to work correctly, but it needs to be tested
> across enabled platforms. Why should the same not apply to pud_advanced_tests()
> on platforms that support PUD based THP?
> 


pud doesn't have page table deposit/withdraw semantics. We use those to
support hugepage split. With a pud mapping we don't split; we just drop
the hugepage and expect it to be faulted back in as a regular page.

-aneesh
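
The deposit/withdraw idea can be illustrated with a toy user-space
model. This is a sketch under stated assumptions: the `model_*` types
and functions are hypothetical, and a single pointer stands in for
whatever bookkeeping the kernel actually uses. It only shows why a
split needs a page table that was set aside in advance.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a page table page and a pmd slot. */
struct model_pgtable { int id; };

struct model_pmd_slot {
	int is_huge;				/* 1 while mapping a huge page */
	struct model_pgtable *deposited;	/* parked for a later split */
	struct model_pgtable *points_to;	/* next-level table once split */
};

/* Park a page table next to the huge mapping. */
static void model_deposit(struct model_pmd_slot *slot,
			  struct model_pgtable *pgtable)
{
	slot->deposited = pgtable;
}

/* Take the parked page table back; NULL if none was deposited. */
static struct model_pgtable *model_withdraw(struct model_pmd_slot *slot)
{
	struct model_pgtable *pgtable = slot->deposited;

	slot->deposited = NULL;
	return pgtable;
}

/* Split: replace the huge mapping with the table that was set aside. */
static int model_split_huge_pmd(struct model_pmd_slot *slot)
{
	struct model_pgtable *pgtable = model_withdraw(slot);

	if (!pgtable)
		return -1;	/* nothing deposited, the split cannot proceed */
	slot->is_huge = 0;
	slot->points_to = pgtable;
	return 0;
}
```

A pud-level mapping, as described above, has no equivalent of
model_split_huge_pmd(): the huge mapping is simply dropped, so there is
nothing to deposit or withdraw.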


* linux-next: runtime warning in Linus' tree
From: Stephen Rothwell @ 2020-08-13  6:46 UTC (permalink / raw)
  To: Roman Gushchin, Andrew Morton
  Cc: Linux PowerPC Mailing List, Linux Next Mailing List,
	Linus Torvalds, Linux Kernel Mailing List, Johannes Weiner

[-- Attachment #1: Type: text/plain, Size: 5734 bytes --]

Hi all,

Testing Linus' tree today, my qemu runs (PowerPC
powerpc_pseries_le_defconfig) produce the following WARNING:

[    0.021401][    T0] Mount-cache hash table entries: 8192 (order: 0, 65536 bytes, linear)
[    0.021529][    T0] Mountpoint-cache hash table entries: 8192 (order: 0, 65536 bytes, linear)
[    0.053969][    T0] ------------[ cut here ]------------
[    0.055220][    T0] WARNING: CPU: 0 PID: 0 at mm/memcontrol.c:5220 mem_cgroup_css_alloc+0x350/0x904
[    0.055355][    T0] Modules linked in:
[    0.055812][    T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.8.0 #5
[    0.055976][    T0] NIP:  c000000000410010 LR: c00000000040fd68 CTR: 0000000000000000
[    0.056097][    T0] REGS: c0000000011e7ab0 TRAP: 0700   Not tainted  (5.8.0)
[    0.056162][    T0] MSR:  8000000002029033 <SF,VEC,EE,ME,IR,DR,RI,LE>  CR: 24000888  XER: 00000000
[    0.056449][    T0] CFAR: c00000000040fd80 IRQMASK: 0 
[    0.056449][    T0] GPR00: c00000000040fd68 c0000000011e7d40 c0000000011e8300 0000000000000001 
[    0.056449][    T0] GPR04: 0000000000000228 0000000000000000 0000000000000001 ffffffffffffffff 
[    0.056449][    T0] GPR08: c00000007d003208 0000000000000000 0000000000000000 c00000007d002fe8 
[    0.056449][    T0] GPR12: 0000000000000001 c0000000013d0000 0000000000000000 00000000011dd528 
[    0.056449][    T0] GPR16: 00000000011dd840 00000000011dd690 0000000000000018 0000000000000003 
[    0.056449][    T0] GPR20: 0000000000000001 c0000000010cbcf8 0000000000000003 c0000000010cd540 
[    0.056449][    T0] GPR24: c0000000010e8778 c0000000010e9080 c0000000010cbcd8 0000000000000000 
[    0.056449][    T0] GPR28: 0000000000000000 c00000007e2a1000 c0000000010cbcc8 c00000000118ea00 
[    0.057109][    T0] NIP [c000000000410010] mem_cgroup_css_alloc+0x350/0x904
[    0.057177][    T0] LR [c00000000040fd68] mem_cgroup_css_alloc+0xa8/0x904
[    0.057394][    T0] Call Trace:
[    0.057534][    T0] [c0000000011e7d40] [c00000000040fd68] mem_cgroup_css_alloc+0xa8/0x904 (unreliable)
[    0.057814][    T0] [c0000000011e7dc0] [c000000000f5b13c] cgroup_init_subsys+0xbc/0x210
[    0.057903][    T0] [c0000000011e7e10] [c000000000f5b690] cgroup_init+0x220/0x598
[    0.057973][    T0] [c0000000011e7ee0] [c000000000f34354] start_kernel+0x67c/0x6ec
[    0.058047][    T0] [c0000000011e7f90] [c00000000000cb88] start_here_common+0x1c/0x614
[    0.058241][    T0] Instruction dump:
[    0.058420][    T0] eac10030 eae10038 eb410050 eb610058 4bffff60 60000000 60000000 60000000 
[    0.058550][    T0] 3be00100 4bfffdfc 60000000 60000000 <0fe00000> 4bfffd70 60000000 60000000 
[    0.059381][    T0] ---[ end trace cb2d79b4994ef1fe ]---
[    0.059810][    T0] ------------[ cut here ]------------
[    0.059872][    T0] WARNING: CPU: 0 PID: 0 at mm/memcontrol.c:5135 mem_cgroup_css_alloc+0x750/0x904
[    0.059930][    T0] Modules linked in:
[    0.060053][    T0] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.8.0 #5
[    0.060113][    T0] NIP:  c000000000410410 LR: c00000000040ff2c CTR: 0000000000000000
[    0.060171][    T0] REGS: c0000000011e7ab0 TRAP: 0700   Tainted: G        W          (5.8.0)
[    0.060229][    T0] MSR:  8000000002029033 <SF,VEC,EE,ME,IR,DR,RI,LE>  CR: 24000880  XER: 00000000
[    0.060332][    T0] CFAR: c00000000040fe48 IRQMASK: 0 
[    0.060332][    T0] GPR00: c00000000040ff2c c0000000011e7d40 c0000000011e8300 c00000007e234c00 
[    0.060332][    T0] GPR04: 0000000000000000 0000000000000000 c00000007e235000 0000000000000013 
[    0.060332][    T0] GPR08: 000000007ec00000 0000000000000000 0000000000000000 0000000000000001 
[    0.060332][    T0] GPR12: 0000000000000000 c0000000013d0000 0000000000000000 00000000011dd528 
[    0.060332][    T0] GPR16: 00000000011dd840 00000000011dd690 0000000000000018 0000000000000003 
[    0.060332][    T0] GPR20: c000000001223300 c000000000e95900 c00000000118ea00 c0000000012232c0 
[    0.060332][    T0] GPR24: c0000000010e8778 c0000000010e9080 0000000000400cc0 0000000000000000 
[    0.060332][    T0] GPR28: 0000000000000000 c00000007e2a1000 c00000007e234c00 0000000000000000 
[    0.060855][    T0] NIP [c000000000410410] mem_cgroup_css_alloc+0x750/0x904
[    0.060911][    T0] LR [c00000000040ff2c] mem_cgroup_css_alloc+0x26c/0x904
[    0.060958][    T0] Call Trace:
[    0.061003][    T0] [c0000000011e7d40] [c00000000040ff2c] mem_cgroup_css_alloc+0x26c/0x904 (unreliable)
[    0.061081][    T0] [c0000000011e7dc0] [c000000000f5b13c] cgroup_init_subsys+0xbc/0x210
[    0.061165][    T0] [c0000000011e7e10] [c000000000f5b690] cgroup_init+0x220/0x598
[    0.061233][    T0] [c0000000011e7ee0] [c000000000f34354] start_kernel+0x67c/0x6ec
[    0.061303][    T0] [c0000000011e7f90] [c00000000000cb88] start_here_common+0x1c/0x614
[    0.061364][    T0] Instruction dump:
[    0.061408][    T0] ebe1fff8 7c0803a6 4e800020 60000000 60000000 3d220004 e929d230 7c3c4800 
[    0.061508][    T0] 41820190 e93c03d2 4bfffc80 60000000 <0fe00000> 4bfffa38 60000000 60000000 
[    0.061630][    T0] ---[ end trace cb2d79b4994ef1ff ]---
[    0.096387][    T1] EEH: pSeries platform initialized
[    0.097232][    T1] POWER8 performance monitor hardware support registered

[The line numbers in the final linux-next are 5226 and 5141 due to
later patches.]

Introduced (or exposed) by commit

  3e38e0aaca9e ("mm: memcg: charge memcg percpu memory to the parent cgroup")

This commit actually adds the WARN_ON, so it either adds the bug that
sets it off, or the bug already existed.

Unfortunately, the version of this patch in linux-next up until today
is different.  :-(

I have left this as I have no idea how to fix it :-)

-- 
Cheers,
Stephen Rothwell

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]


* Re: [PATCH 13/16] debug_vm_pgtable/pmd_clear: Don't use pmd/pud_clear on pte entries
From: Aneesh Kumar K.V @ 2020-08-13  8:45 UTC (permalink / raw)
  To: Anshuman Khandual, linux-mm, akpm; +Cc: linuxppc-dev
In-Reply-To: <1bb841d2-4622-b122-7176-246eb3702c9f@arm.com>

On 8/13/20 10:57 AM, Anshuman Khandual wrote:
> 
> 
> On 08/12/2020 12:03 PM, Aneesh Kumar K.V wrote:
>> pmd_clear() should not be used to clear pmd level pte entries.
> 
> Could you please elaborate on this? The proposed change set does
> not match the description here.
> 

pmd_clear() is not meant to be used to clear a huge pte entry; we use
pmdp_huge_get_and_clear() for that. Hence pmd_clear() has a check that
adds a WARN if it finds _PAGE_PTE set on the entry.

In the test, we follow a hugepmd usage with a pmd_clear(). We should
instead use pmdp_huge_get_and_clear() at the end of the advanced pmd test.
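
A small user-space model of that check may help. It is a sketch only:
the `model_*` names, the bit chosen for `MODEL_PAGE_PTE` and the
warning counter are assumptions for illustration, not the kernel
implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins; the real ppc64 definitions differ. */
typedef struct { uint64_t val; } model_pmd_t;

#define MODEL_PAGE_PTE (1ULL << 62)	/* assumed "huge/leaf entry" marker */

static int model_warn_count;		/* counts simulated WARNs */

/* Meant for entries pointing to a page table; warns on a huge entry. */
static void model_pmd_clear(model_pmd_t *pmdp)
{
	if (pmdp->val & MODEL_PAGE_PTE) {
		model_warn_count++;
		fprintf(stderr, "model WARN: pmd_clear on a huge pte entry\n");
	}
	pmdp->val = 0;
}

/* Meant for huge entries: returns the old value and clears the slot. */
static model_pmd_t model_pmdp_huge_get_and_clear(model_pmd_t *pmdp)
{
	model_pmd_t old = *pmdp;

	pmdp->val = 0;
	return old;
}
```

If the huge entry is cleared with model_pmdp_huge_get_and_clear()
first, a later model_pmd_clear() on the same slot sees an empty entry
and stays silent. Calling model_pmd_clear() directly on the huge entry
trips the simulated warning, which mirrors what the original test
ordering did.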



>>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> ---
>>   mm/debug_vm_pgtable.c | 7 ++++---
>>   1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
>> index 061c19bba7f0..529892b9be2f 100644
>> --- a/mm/debug_vm_pgtable.c
>> +++ b/mm/debug_vm_pgtable.c
>> @@ -191,6 +191,8 @@ static void __init pmd_advanced_tests(struct mm_struct *mm,
>>   	pmd = READ_ONCE(*pmdp);
>>   	WARN_ON(pmd_young(pmd));
>>   
>> +	/*  Clear the pte entries  */
>> +	pmdp_huge_get_and_clear(mm, vaddr, pmdp);
>>   	pgtable = pgtable_trans_huge_withdraw(mm, pmdp);
>>   }
>>   
>> @@ -313,6 +315,8 @@ static void __init pud_advanced_tests(struct mm_struct *mm,
>>   	pudp_test_and_clear_young(vma, vaddr, pudp);
>>   	pud = READ_ONCE(*pudp);
>>   	WARN_ON(pud_young(pud));
>> +
>> +	pudp_huge_get_and_clear(mm, vaddr, pudp);
>>   }
>>   
>>   static void __init pud_leaf_tests(unsigned long pfn, pgprot_t prot)
>> @@ -431,8 +435,6 @@ static void __init pud_populate_tests(struct mm_struct *mm, pud_t *pudp,
>>   	 * This entry points to next level page table page.
>>   	 * Hence this must not qualify as pud_bad().
>>   	 */
>> -	pmd_clear(pmdp);
>> -	pud_clear(pudp);
> 
> Both entries are cleared before creating a fresh page table entry.
> Why is that a problem?
> 
>>   	pud_populate(mm, pudp, pmdp);
>>   	pud = READ_ONCE(*pudp);
>>   	WARN_ON(pud_bad(pud));
>> @@ -564,7 +566,6 @@ static void __init pmd_populate_tests(struct mm_struct *mm, pmd_t *pmdp,
>>   	 * This entry points to next level page table page.
>>   	 * Hence this must not qualify as pmd_bad().
>>   	 */
>> -	pmd_clear(pmdp);
> 
> Ditto.
> 
>>   	pmd_populate(mm, pmdp, pgtable);
>>   	pmd = READ_ONCE(*pmdp);
>>   	WARN_ON(pmd_bad(pmd));
>>



* [PATCH] powerpc: Drop _nmask_and_or_msr()
From: Christophe Leroy @ 2020-08-13 10:07 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel

_nmask_and_or_msr() is only used in two places, to set MSR_IP.

The SYNC is unnecessary as the users are not PowerPC 601.

It can easily be written in C.

Do it, and drop _nmask_and_or_msr().

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/kernel/misc_32.S                     | 13 -------------
 arch/powerpc/platforms/embedded6xx/mpc7448_hpc2.c |  3 ++-
 arch/powerpc/platforms/embedded6xx/storcenter.c   |  3 ++-
 3 files changed, 4 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index b24f866fef81..8d9cb5df580e 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -215,19 +215,6 @@ _GLOBAL(low_choose_7447a_dfs)
 
 #endif /* CONFIG_CPU_FREQ_PMAC && CONFIG_PPC_BOOK3S_32 */
 
-/*
- * complement mask on the msr then "or" some values on.
- *     _nmask_and_or_msr(nmask, value_to_or)
- */
-_GLOBAL(_nmask_and_or_msr)
-	mfmsr	r0		/* Get current msr */
-	andc	r0,r0,r3	/* And off the bits set in r3 (first parm) */
-	or	r0,r0,r4	/* Or on the bits in r4 (second parm) */
-	SYNC			/* Some chip revs have problems here... */
-	mtmsr	r0		/* Update machine state */
-	isync
-	blr			/* Done */
-
 #ifdef CONFIG_40x
 
 /*
diff --git a/arch/powerpc/platforms/embedded6xx/mpc7448_hpc2.c b/arch/powerpc/platforms/embedded6xx/mpc7448_hpc2.c
index 15437abe1f6d..b95c3380d2b5 100644
--- a/arch/powerpc/platforms/embedded6xx/mpc7448_hpc2.c
+++ b/arch/powerpc/platforms/embedded6xx/mpc7448_hpc2.c
@@ -147,7 +147,8 @@ static void __noreturn mpc7448_hpc2_restart(char *cmd)
 	local_irq_disable();
 
 	/* Set exception prefix high - to the firmware */
-	_nmask_and_or_msr(0, MSR_IP);
+	mtmsr(mfmsr() | MSR_IP);
+	isync();
 
 	for (;;) ;		/* Spin until reset happens */
 }
diff --git a/arch/powerpc/platforms/embedded6xx/storcenter.c b/arch/powerpc/platforms/embedded6xx/storcenter.c
index ed1914dd34bb..e346ddcef45e 100644
--- a/arch/powerpc/platforms/embedded6xx/storcenter.c
+++ b/arch/powerpc/platforms/embedded6xx/storcenter.c
@@ -101,7 +101,8 @@ static void __noreturn storcenter_restart(char *cmd)
 	local_irq_disable();
 
 	/* Set exception prefix high - to the firmware */
-	_nmask_and_or_msr(0, MSR_IP);
+	mtmsr(mfmsr() | MSR_IP);
+	isync();
 
 	/* Wait for reset to happen */
 	for (;;) ;
-- 
2.25.0
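
For readers following the diff above, the arithmetic behind the
replacement can be checked in plain user-space C. This is a sketch, not
kernel code: the `model_*` functions are hypothetical, the MSR is
treated as an ordinary 32-bit value, and the bit position used for
`MODEL_MSR_IP` is an assumption for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_MSR_IP (1u << 6)	/* assumed bit position, illustration only */

/* What the removed assembly computed: andc with nmask, then or with value. */
static uint32_t model_nmask_and_or(uint32_t msr, uint32_t nmask, uint32_t value)
{
	return (msr & ~nmask) | value;
}

/* What the two callers now compute directly. */
static uint32_t model_or_only(uint32_t msr, uint32_t value)
{
	return msr | value;
}
```

Both callers passed 0 as the mask, so the andc step cleared nothing and
the helper reduced to a plain OR. The model only covers the value
written; it says nothing about the SYNC and isync ordering that the
patch description addresses separately.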



* [PATCH 1/5] powerpc: Remove flush_instruction_cache for book3s/32
From: Christophe Leroy @ 2020-08-13 10:12 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel

The only callers of flush_instruction_cache() are:

arch/powerpc/kernel/swsusp_booke.S:	bl flush_instruction_cache
arch/powerpc/mm/nohash/40x.c:	flush_instruction_cache();
arch/powerpc/mm/nohash/44x.c:	flush_instruction_cache();
arch/powerpc/mm/nohash/fsl_booke.c:	flush_instruction_cache();
arch/powerpc/platforms/44x/machine_check.c:			flush_instruction_cache();
arch/powerpc/platforms/44x/machine_check.c:		flush_instruction_cache();

This function is not used by book3s/32, drop it.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/kernel/misc_32.S | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index b24f866fef81..bd870743c06f 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -271,9 +271,8 @@ _ASM_NOKPROBE_SYMBOL(real_writeb)
 
 /*
  * Flush instruction cache.
- * This is a no-op on the 601.
  */
-#ifndef CONFIG_PPC_8xx
+#if !defined(CONFIG_PPC_8xx) && !defined(CONFIG_PPC_BOOK3S_32)
 _GLOBAL(flush_instruction_cache)
 #if defined(CONFIG_4xx)
 	lis	r3, KERNELBASE@h
@@ -290,18 +289,11 @@ _GLOBAL(flush_instruction_cache)
 	mfspr	r3,SPRN_L1CSR1
 	ori	r3,r3,L1CSR1_ICFI|L1CSR1_ICLFR
 	mtspr	SPRN_L1CSR1,r3
-#elif defined(CONFIG_PPC_BOOK3S_601)
-	blr			/* for 601, do nothing */
-#else
-	/* 603/604 processor - use invalidate-all bit in HID0 */
-	mfspr	r3,SPRN_HID0
-	ori	r3,r3,HID0_ICFI
-	mtspr	SPRN_HID0,r3
 #endif /* CONFIG_4xx */
 	isync
 	blr
 EXPORT_SYMBOL(flush_instruction_cache)
-#endif /* CONFIG_PPC_8xx */
+#endif /* CONFIG_PPC_8xx || CONFIG_PPC_BOOK3S_32 */
 
 /*
  * Copy a whole page.  We use the dcbz instruction on the destination
-- 
2.25.0



* [PATCH 2/5] powerpc: Untangle flush_instruction_cache()
From: Christophe Leroy @ 2020-08-13 10:12 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <11a330af231af22874c006302a945388846f8112.1597313510.git.christophe.leroy@csgroup.eu>

flush_instruction_cache() is a mixup of each PPC32 sub-arch.

Untangle it by making one complete function for each sub-arch.

This makes it a lot more readable and maintainable.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/kernel/misc_32.S | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index bd870743c06f..a8f6ef513115 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -272,28 +272,31 @@ _ASM_NOKPROBE_SYMBOL(real_writeb)
 /*
  * Flush instruction cache.
  */
-#if !defined(CONFIG_PPC_8xx) && !defined(CONFIG_PPC_BOOK3S_32)
+#ifdef CONFIG_4xx
 _GLOBAL(flush_instruction_cache)
-#if defined(CONFIG_4xx)
 	lis	r3, KERNELBASE@h
 	iccci	0,r3
-#elif defined(CONFIG_FSL_BOOKE)
+	isync
+	blr
+EXPORT_SYMBOL(flush_instruction_cache)
+#endif
+
+#ifdef CONFIG_FSL_BOOKE
+_GLOBAL(flush_instruction_cache)
 #ifdef CONFIG_E200
 	mfspr   r3,SPRN_L1CSR0
 	ori     r3,r3,L1CSR0_CFI|L1CSR0_CLFC
 	/* msync; isync recommended here */
 	mtspr   SPRN_L1CSR0,r3
-	isync
-	blr
-#endif
+#else
 	mfspr	r3,SPRN_L1CSR1
 	ori	r3,r3,L1CSR1_ICFI|L1CSR1_ICLFR
 	mtspr	SPRN_L1CSR1,r3
-#endif /* CONFIG_4xx */
+#endif
 	isync
 	blr
 EXPORT_SYMBOL(flush_instruction_cache)
-#endif /* CONFIG_PPC_8xx || CONFIG_PPC_BOOK3S_32 */
+#endif
 
 /*
  * Copy a whole page.  We use the dcbz instruction on the destination
-- 
2.25.0



* [PATCH 3/5] powerpc: Remove flush_instruction_cache() on 8xx
From: Christophe Leroy @ 2020-08-13 10:12 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <11a330af231af22874c006302a945388846f8112.1597313510.git.christophe.leroy@csgroup.eu>

flush_instruction_cache() is never used on 8xx, remove it.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/mm/nohash/8xx.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/arch/powerpc/mm/nohash/8xx.c b/arch/powerpc/mm/nohash/8xx.c
index d2b37146ae6c..231ca95f9ffb 100644
--- a/arch/powerpc/mm/nohash/8xx.c
+++ b/arch/powerpc/mm/nohash/8xx.c
@@ -244,13 +244,6 @@ void set_context(unsigned long id, pgd_t *pgd)
 	mb();
 }
 
-void flush_instruction_cache(void)
-{
-	isync();
-	mtspr(SPRN_IC_CST, IDC_INVALL);
-	isync();
-}
-
 #ifdef CONFIG_PPC_KUEP
 void __init setup_kuep(bool disabled)
 {
-- 
2.25.0



* [PATCH 4/5] powerpc: Rewrite FSL_BOOKE flush_cache_instruction() in C
From: Christophe Leroy @ 2020-08-13 10:12 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <11a330af231af22874c006302a945388846f8112.1597313510.git.christophe.leroy@csgroup.eu>

Nothing prevents flush_instruction_cache() from being written in C.

Do it to improve readability and maintainability.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/kernel/misc_32.S      | 17 -----------------
 arch/powerpc/mm/nohash/fsl_booke.c | 16 ++++++++++++++++
 2 files changed, 16 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index a8f6ef513115..4f4a31d9fdd0 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -281,23 +281,6 @@ _GLOBAL(flush_instruction_cache)
 EXPORT_SYMBOL(flush_instruction_cache)
 #endif
 
-#ifdef CONFIG_FSL_BOOKE
-_GLOBAL(flush_instruction_cache)
-#ifdef CONFIG_E200
-	mfspr   r3,SPRN_L1CSR0
-	ori     r3,r3,L1CSR0_CFI|L1CSR0_CLFC
-	/* msync; isync recommended here */
-	mtspr   SPRN_L1CSR0,r3
-#else
-	mfspr	r3,SPRN_L1CSR1
-	ori	r3,r3,L1CSR1_ICFI|L1CSR1_ICLFR
-	mtspr	SPRN_L1CSR1,r3
-#endif
-	isync
-	blr
-EXPORT_SYMBOL(flush_instruction_cache)
-#endif
-
 /*
  * Copy a whole page.  We use the dcbz instruction on the destination
  * to reduce memory traffic (it eliminates the unnecessary reads of
diff --git a/arch/powerpc/mm/nohash/fsl_booke.c b/arch/powerpc/mm/nohash/fsl_booke.c
index 0c294827d6e5..36bda962d3b3 100644
--- a/arch/powerpc/mm/nohash/fsl_booke.c
+++ b/arch/powerpc/mm/nohash/fsl_booke.c
@@ -219,6 +219,22 @@ unsigned long __init mmu_mapin_ram(unsigned long base, unsigned long top)
 	return tlbcam_addrs[tlbcam_index - 1].limit - PAGE_OFFSET + 1;
 }
 
+void flush_instruction_cache(void)
+{
+	unsigned long tmp;
+
+	if (IS_ENABLED(CONFIG_E200)) {
+		tmp = mfspr(SPRN_L1CSR0);
+		tmp |= L1CSR0_CFI | L1CSR0_CLFC;
+		mtspr(SPRN_L1CSR0, tmp);
+	} else {
+		tmp = mfspr(SPRN_L1CSR1);
+		tmp |= L1CSR1_ICFI | L1CSR1_ICLFR;
+		mtspr(SPRN_L1CSR1, tmp);
+	}
+	isync();
+}
+
 /*
  * MMU_init_hw does the chip-specific initialization of the MMU hardware.
  */
-- 
2.25.0



* [PATCH 5/5] powerpc: Rewrite 4xx flush_cache_instruction() in C
From: Christophe Leroy @ 2020-08-13 10:12 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <11a330af231af22874c006302a945388846f8112.1597313510.git.christophe.leroy@csgroup.eu>

Nothing prevents flush_instruction_cache() from being written in C.

Do it to improve readability and maintainability.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/kernel/misc_32.S   | 13 -------------
 arch/powerpc/mm/nohash/4xx.c    | 15 +++++++++++++++
 arch/powerpc/mm/nohash/Makefile |  1 +
 3 files changed, 16 insertions(+), 13 deletions(-)
 create mode 100644 arch/powerpc/mm/nohash/4xx.c

diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 4f4a31d9fdd0..87717966f5cd 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -268,19 +268,6 @@ _ASM_NOKPROBE_SYMBOL(real_writeb)
 
 #endif /* CONFIG_40x */
 
-
-/*
- * Flush instruction cache.
- */
-#ifdef CONFIG_4xx
-_GLOBAL(flush_instruction_cache)
-	lis	r3, KERNELBASE@h
-	iccci	0,r3
-	isync
-	blr
-EXPORT_SYMBOL(flush_instruction_cache)
-#endif
-
 /*
  * Copy a whole page.  We use the dcbz instruction on the destination
  * to reduce memory traffic (it eliminates the unnecessary reads of
diff --git a/arch/powerpc/mm/nohash/4xx.c b/arch/powerpc/mm/nohash/4xx.c
new file mode 100644
index 000000000000..954c8aa42a32
--- /dev/null
+++ b/arch/powerpc/mm/nohash/4xx.c
@@ -0,0 +1,15 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * This file contains the routines for initializing the MMU
+ * on the 4xx series of chips.
+ */
+
+#include <asm/processor.h>
+#include <asm/page.h>
+#include <asm/cache.h>
+
+void flush_instruction_cache(void)
+{
+	iccci((void*)KERNELBASE);
+	isync();
+}
diff --git a/arch/powerpc/mm/nohash/Makefile b/arch/powerpc/mm/nohash/Makefile
index 0424f6ce5bd8..a7f7211b6373 100644
--- a/arch/powerpc/mm/nohash/Makefile
+++ b/arch/powerpc/mm/nohash/Makefile
@@ -4,6 +4,7 @@ ccflags-$(CONFIG_PPC64)	:= $(NO_MINIMAL_TOC)
 
 obj-y				+= mmu_context.o tlb.o tlb_low.o
 obj-$(CONFIG_PPC_BOOK3E_64)  	+= tlb_low_64e.o book3e_pgtable.o
+obj-$(CONFIG_4xx)		+= 4xx.o
 obj-$(CONFIG_40x)		+= 40x.o
 obj-$(CONFIG_44x)		+= 44x.o
 obj-$(CONFIG_PPC_8xx)		+= 8xx.o
-- 
2.25.0


^ permalink raw reply related

* Re: [PATCH 1/5] powerpc: Remove flush_instruction_cache for book3s/32
From: Christoph Hellwig @ 2020-08-13 12:13 UTC (permalink / raw)
  To: Christophe Leroy; +Cc: linuxppc-dev, Paul Mackerras, linux-kernel
In-Reply-To: <11a330af231af22874c006302a945388846f8112.1597313510.git.christophe.leroy@csgroup.eu>

On Thu, Aug 13, 2020 at 10:12:00AM +0000, Christophe Leroy wrote:
> -#ifndef CONFIG_PPC_8xx
> +#if !defined(CONFIG_PPC_8xx) && !defined(CONFIG_PPC_BOOK3S_32)
>  _GLOBAL(flush_instruction_cache)
>  #if defined(CONFIG_4xx)
>  	lis	r3, KERNELBASE@h
> @@ -290,18 +289,11 @@ _GLOBAL(flush_instruction_cache)
>  	mfspr	r3,SPRN_L1CSR1
>  	ori	r3,r3,L1CSR1_ICFI|L1CSR1_ICLFR
>  	mtspr	SPRN_L1CSR1,r3
> -#elif defined(CONFIG_PPC_BOOK3S_601)
> -	blr			/* for 601, do nothing */
> -#else
> -	/* 603/604 processor - use invalidate-all bit in HID0 */
> -	mfspr	r3,SPRN_HID0
> -	ori	r3,r3,HID0_ICFI
> -	mtspr	SPRN_HID0,r3
>  #endif /* CONFIG_4xx */
>  	isync
>  	blr
>  EXPORT_SYMBOL(flush_instruction_cache)
> -#endif /* CONFIG_PPC_8xx */
> +#endif /* CONFIG_PPC_8xx || CONFIG_PPC_BOOK3S_32 */

What about untangling this into entirely separate versions instead
of the ifdef mess?  Also the export does not seem to be needed at all.

^ permalink raw reply

* Re: [PATCH 1/5] powerpc: Remove flush_instruction_cache for book3s/32
From: Christoph Hellwig @ 2020-08-13 12:14 UTC (permalink / raw)
  To: Christophe Leroy; +Cc: linuxppc-dev, Paul Mackerras, linux-kernel
In-Reply-To: <20200813121308.GA16237@infradead.org>

On Thu, Aug 13, 2020 at 01:13:08PM +0100, Christoph Hellwig wrote:
> On Thu, Aug 13, 2020 at 10:12:00AM +0000, Christophe Leroy wrote:
> > -#ifndef CONFIG_PPC_8xx
> > +#if !defined(CONFIG_PPC_8xx) && !defined(CONFIG_PPC_BOOK3S_32)
> >  _GLOBAL(flush_instruction_cache)
> >  #if defined(CONFIG_4xx)
> >  	lis	r3, KERNELBASE@h
> > @@ -290,18 +289,11 @@ _GLOBAL(flush_instruction_cache)
> >  	mfspr	r3,SPRN_L1CSR1
> >  	ori	r3,r3,L1CSR1_ICFI|L1CSR1_ICLFR
> >  	mtspr	SPRN_L1CSR1,r3
> > -#elif defined(CONFIG_PPC_BOOK3S_601)
> > -	blr			/* for 601, do nothing */
> > -#else
> > -	/* 603/604 processor - use invalidate-all bit in HID0 */
> > -	mfspr	r3,SPRN_HID0
> > -	ori	r3,r3,HID0_ICFI
> > -	mtspr	SPRN_HID0,r3
> >  #endif /* CONFIG_4xx */
> >  	isync
> >  	blr
> >  EXPORT_SYMBOL(flush_instruction_cache)
> > -#endif /* CONFIG_PPC_8xx */
> > +#endif /* CONFIG_PPC_8xx || CONFIG_PPC_BOOK3S_32 */
> 
> What about untangling this into entirely separate versions instead
> of the ifdef mess?  Also the export does not seem to be needed at all.

Ok, I see that you do that later, sorry.

^ permalink raw reply

* Re: [PATCH] powerpc/papr_scm: Limit the readability of 'perf_stats' sysfs attribute
From: Aneesh Kumar K.V @ 2020-08-13 12:31 UTC (permalink / raw)
  To: Vaibhav Jain, linuxppc-dev, linux-nvdimm
  Cc: Dan Williams, Ira Weiny, Santosh Sivaraj, Oliver O'Halloran
In-Reply-To: <20200813043458.165718-1-vaibhav@linux.ibm.com>

On 8/13/20 10:04 AM, Vaibhav Jain wrote:
> The newly introduced 'perf_stats' attribute uses the default access
> mode of 0444 letting non-root users access performance stats of an
> nvdimm and potentially force the kernel into issuing large number of
> expensive HCALLs. Since the information exposed by this attribute
> cannot be cached hence its better to ward of access to this attribute
> from users who don't need to access these performance statistics.
> 
> Hence this patch adds check in perf_stats_show() to only let users
> that are 'perfmon_capable()' to read the nvdimm performance
> statistics.
> 
> Fixes: 2d02bf835e573 ('powerpc/papr_scm: Fetch nvdimm performance stats from PHYP')
> Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
> ---
>   arch/powerpc/platforms/pseries/papr_scm.c | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
> index f439f0dfea7d1..36c51bf8af9a8 100644
> --- a/arch/powerpc/platforms/pseries/papr_scm.c
> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> @@ -792,6 +792,10 @@ static ssize_t perf_stats_show(struct device *dev,
>   	struct nvdimm *dimm = to_nvdimm(dev);
>   	struct papr_scm_priv *p = nvdimm_provider_data(dimm);
>   
> +	/* Allow access only to perfmon capable users */
> +	if (!perfmon_capable())
> +		return -EACCES;
> +

An access check is usually done in open(). This is the read callback IIUC.

>   	if (!p->stat_buffer_len)
>   		return -ENOENT;
>   
> 

-aneesh

^ permalink raw reply

* [PATCH] sfc_ef100: Fix build failure on powerpc
From: Christophe Leroy @ 2020-08-13 14:39 UTC (permalink / raw)
  To: Solarflare linux maintainers, Edward Cree, Martin Habets,
	David S. Miller, Jakub Kicinski
  Cc: netdev, linuxppc-dev, linux-kernel

ppc6xx_defconfig fails to build the sfc.ko module, complaining
about the missing __umoddi3 symbol.

This is due to the following test

 		if (EFX_MIN_DMAQ_SIZE % reader->value) {

Because reader->value is u64.

As the EFX_MIN_DMAQ_SIZE value is 512, reader->value is expected to be
small enough for a u32 calculation, so cast it to (u32) for the test to
avoid the need for __umoddi3.

Fixes: adcfc3482fff ("sfc_ef100: read Design Parameters at probe time")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 drivers/net/ethernet/sfc/ef100_nic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/sfc/ef100_nic.c b/drivers/net/ethernet/sfc/ef100_nic.c
index 36598d0542ed..234400b69b07 100644
--- a/drivers/net/ethernet/sfc/ef100_nic.c
+++ b/drivers/net/ethernet/sfc/ef100_nic.c
@@ -979,7 +979,7 @@ static int ef100_process_design_param(struct efx_nic *efx,
 		 * EFX_MIN_DMAQ_SIZE is divisible by GRANULARITY.
 		 * This is very unlikely to fail.
 		 */
-		if (EFX_MIN_DMAQ_SIZE % reader->value) {
+		if (EFX_MIN_DMAQ_SIZE % (u32)reader->value) {
 			netif_err(efx, probe, efx->net_dev,
 				  "%s size granularity is %llu, can't guarantee safety\n",
 				  reader->type == ESE_EF100_DP_GZ_RXQ_SIZE_GRANULARITY ? "RXQ" : "TXQ",
-- 
2.25.0


^ permalink raw reply related

* [PATCH v3] powerpc/pseries: explicitly reschedule during drmem_lmb list traversal
From: Nathan Lynch @ 2020-08-13 15:11 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: cheloha, ldufour, tyreld

The drmem lmb list can have hundreds of thousands of entries, and
unfortunately lookups take the form of linear searches. As long as
this is the case, traversals have the potential to monopolize the CPU
and provoke lockup reports, workqueue stalls, and the like unless
they explicitly yield.

Rather than placing cond_resched() calls within various
for_each_drmem_lmb() loop blocks in the code, put it in the iteration
expression of the loop macro itself so users can't omit it.

Introduce a drmem_lmb_next() iteration helper function which calls
cond_resched() at a regular interval during array traversal. Each
iteration of the loop in DLPAR code paths can involve around ten RTAS
calls which can each take up to 250us, so this ensures the check is
performed at worst every few milliseconds.

Fixes: 6c6ea53725b3 ("powerpc/mm: Separate ibm, dynamic-memory data from DT format")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
---

Notes:
    Changes since v2:
    * Make drmem_lmb_next() more general.
    * Adjust reschedule interval for better code generation.
    * Add commentary to drmem_lmb_next() to explain the cond_resched()
      call.
    * Remove bounds assertions.
    
    Changes since v1:
    * Add bounds assertions in drmem_lmb_next().
    * Call cond_resched() in the iterator on only every 20th element
      instead of on every iteration, to reduce overhead in tight loops.

 arch/powerpc/include/asm/drmem.h | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/drmem.h b/arch/powerpc/include/asm/drmem.h
index 17ccc6474ab6..6fb928605ed1 100644
--- a/arch/powerpc/include/asm/drmem.h
+++ b/arch/powerpc/include/asm/drmem.h
@@ -8,6 +8,8 @@
 #ifndef _ASM_POWERPC_LMB_H
 #define _ASM_POWERPC_LMB_H
 
+#include <linux/sched.h>
+
 struct drmem_lmb {
 	u64     base_addr;
 	u32     drc_index;
@@ -26,8 +28,22 @@ struct drmem_lmb_info {
 
 extern struct drmem_lmb_info *drmem_info;
 
+static inline struct drmem_lmb *drmem_lmb_next(struct drmem_lmb *lmb,
+					       const struct drmem_lmb *start)
+{
+	/*
+	 * DLPAR code paths can take several milliseconds per element
+	 * when interacting with firmware. Ensure that we don't
+	 * unfairly monopolize the CPU.
+	 */
+	if (((++lmb - start) % 16) == 0)
+		cond_resched();
+
+	return lmb;
+}
+
 #define for_each_drmem_lmb_in_range(lmb, start, end)		\
-	for ((lmb) = (start); (lmb) < (end); (lmb)++)
+	for ((lmb) = (start); (lmb) < (end); lmb = drmem_lmb_next(lmb, start))
 
 #define for_each_drmem_lmb(lmb)					\
 	for_each_drmem_lmb_in_range((lmb),			\
-- 
2.25.4


^ permalink raw reply related
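
The iteration scheme described in the patch above can be tried outside the
kernel. What follows is a minimal user-space sketch, not the kernel code:
struct lmb, maybe_yield(), lmb_next(), for_each_lmb_in_range() and walk()
are hypothetical stand-ins for struct drmem_lmb, cond_resched(),
drmem_lmb_next(), for_each_drmem_lmb_in_range() and a caller. Only the
"yield every 16th element" arithmetic is taken from the patch.

```c
#include <stddef.h>

/* Stand-in for struct drmem_lmb; the fields do not matter here. */
struct lmb {
	unsigned long base_addr;
};

/* Counts how often the loop offered to yield the CPU. */
static unsigned long yield_count;

/* Stand-in for cond_resched(): only records that it was called. */
static void maybe_yield(void)
{
	yield_count++;
}

/* Advance to the next element, yielding on every 16th step from start. */
static struct lmb *lmb_next(struct lmb *lmb, const struct lmb *start)
{
	if (((++lmb - start) % 16) == 0)
		maybe_yield();

	return lmb;
}

/* The yield lives in the iteration expression, so users cannot omit it. */
#define for_each_lmb_in_range(lmb, start, end)			\
	for ((lmb) = (start); (lmb) < (end);			\
	     (lmb) = lmb_next((lmb), (start)))

/* Walk n elements and return how many loop bodies ran. */
static size_t walk(struct lmb *start, size_t n)
{
	struct lmb *lmb;
	size_t visited = 0;

	yield_count = 0;
	for_each_lmb_in_range(lmb, start, start + n)
		visited++;

	return visited;
}
```

Walking a 40-element array with this sketch visits all 40 elements and
calls maybe_yield() twice, at offsets 16 and 32. Using a power of two as
the interval lets the compiler reduce the modulo to a mask, which seems
consistent with the "better code generation" note in the v3 changelog.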

* Re: linux-next: runtime warning in Linus' tree
From: Johannes Weiner @ 2020-08-13 15:20 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: Linus Torvalds, Linux Kernel Mailing List,
	Linux Next Mailing List, Andrew Morton,
	Linux PowerPC Mailing List, Roman Gushchin
In-Reply-To: <20200813164654.061dbbd3@canb.auug.org.au>

On Thu, Aug 13, 2020 at 04:46:54PM +1000, Stephen Rothwell wrote:
> [    0.055220][    T0] WARNING: CPU: 0 PID: 0 at mm/memcontrol.c:5220 mem_cgroup_css_alloc+0x350/0x904

> [The line numbers in the final linux next are 5226 and 5141 due to
> later patches.]
> 
> Introduced (or exposed) by commit
> 
>   3e38e0aaca9e ("mm: memcg: charge memcg percpu memory to the parent cgroup")
> 
> This commit actually adds the WARN_ON, so it either adds the bug that
> sets it off, or the bug already existed.
> 
> Unfotunately, the version of this patch in linux-next up tuntil today
> is different.  :-(

Sorry, I made a last-minute request to include these checks in that
patch to make the code a bit more robust, but they trigger a false
positive here. Let's remove them.

---

From de8ea7c96c056c3cbe7b93995029986a158fb9cd Mon Sep 17 00:00:00 2001
From: Johannes Weiner <hannes@cmpxchg.org>
Date: Thu, 13 Aug 2020 10:40:54 -0400
Subject: [PATCH] mm: memcontrol: fix warning when allocating the root cgroup

Commit 3e38e0aaca9e ("mm: memcg: charge memcg percpu memory to the
parent cgroup") adds memory tracking to the memcg kernel structures
themselves to make cgroups liable for the memory they are consuming
through the allocation of child groups (which can be significant).

This code is a bit awkward as it's spread out through several
functions: The outermost function does memalloc_use_memcg(parent) to
set up current->active_memcg, which designates which cgroup to charge,
and the inner functions pass GFP_ACCOUNT to request charging for
specific allocations. To make sure this dependency is satisfied at all
times - to make sure we don't randomly charge whoever is calling the
functions - the inner functions warn on !current->active_memcg.

However, this triggers a false warning when the root memcg itself is
allocated. No parent exists in this case, and so current->active_memcg
is rightfully NULL. It's a false positive, not indicative of a bug.

Delete the warnings for now, we can revisit this later.

Fixes: 3e38e0aaca9e ("mm: memcg: charge memcg percpu memory to the parent cgroup")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/memcontrol.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index d59fd9af6e63..9d87082e64aa 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5137,9 +5137,6 @@ static int alloc_mem_cgroup_per_node_info(struct mem_cgroup *memcg, int node)
 	if (!pn)
 		return 1;
 
-	/* We charge the parent cgroup, never the current task */
-	WARN_ON_ONCE(!current->active_memcg);
-
 	pn->lruvec_stat_local = alloc_percpu_gfp(struct lruvec_stat,
 						 GFP_KERNEL_ACCOUNT);
 	if (!pn->lruvec_stat_local) {
@@ -5222,9 +5219,6 @@ static struct mem_cgroup *mem_cgroup_alloc(void)
 		goto fail;
 	}
 
-	/* We charge the parent cgroup, never the current task */
-	WARN_ON_ONCE(!current->active_memcg);
-
 	memcg->vmstats_local = alloc_percpu_gfp(struct memcg_vmstats_percpu,
 						GFP_KERNEL_ACCOUNT);
 	if (!memcg->vmstats_local)
-- 
2.28.0


^ permalink raw reply related
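
The false positive described in the commit message above can be shown with
a toy model. This is only a sketch under stated assumptions: struct memcg,
active_memcg, warnings, alloc_inner() and css_alloc() are hypothetical
stand-ins for the kernel structures, current->active_memcg, the
WARN_ON_ONCE() checks, the inner allocation helpers and the outermost
allocation function. None of it is the memcontrol.c code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy cgroup: only the parent link matters for this model. */
struct memcg {
	struct memcg *parent;
};

/* Models current->active_memcg: the cgroup that will be charged. */
static struct memcg *active_memcg;

/* Counts how many times the removed check would have fired. */
static unsigned int warnings;

/* Inner helper: with check enabled, complain if nobody is designated. */
static void alloc_inner(bool check)
{
	if (check && !active_memcg)
		warnings++;
}

/* Outer function: designate the parent, allocate, then restore. */
static void css_alloc(struct memcg *parent, bool check)
{
	struct memcg *old = active_memcg;

	active_memcg = parent;	/* models memalloc_use_memcg(parent) */
	alloc_inner(check);
	active_memcg = old;
}
```

Allocating the root group means calling css_alloc(NULL, true), which bumps
warnings even though nothing is wrong: the root has no parent to charge.
Allocating a child with a non-NULL parent leaves the counter unchanged.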

* Re: linux-next: runtime warning in Linus' tree
From: Roman Gushchin @ 2020-08-13 15:56 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Stephen Rothwell, Linux PowerPC Mailing List,
	Linux Kernel Mailing List, Linux Next Mailing List, Andrew Morton,
	Linus Torvalds
In-Reply-To: <20200813152033.GA701678@cmpxchg.org>

On Thu, Aug 13, 2020 at 11:20:33AM -0400, Johannes Weiner wrote:
> On Thu, Aug 13, 2020 at 04:46:54PM +1000, Stephen Rothwell wrote:
> > [    0.055220][    T0] WARNING: CPU: 0 PID: 0 at mm/memcontrol.c:5220 mem_cgroup_css_alloc+0x350/0x904
> 
> > [The line numbers in the final linux next are 5226 and 5141 due to
> > later patches.]
> > 
> > Introduced (or exposed) by commit
> > 
> >   3e38e0aaca9e ("mm: memcg: charge memcg percpu memory to the parent cgroup")
> > 
> > This commit actually adds the WARN_ON, so it either adds the bug that
> > sets it off, or the bug already existed.
> > 
> > Unfotunately, the version of this patch in linux-next up tuntil today
> > is different.  :-(
> 
> Sorry, I made a last-minute request to include these checks in that
> patch to make the code a bit more robust, but they trigger a false
> positive here. Let's remove them.
> 
> ---
> 
> From de8ea7c96c056c3cbe7b93995029986a158fb9cd Mon Sep 17 00:00:00 2001
> From: Johannes Weiner <hannes@cmpxchg.org>
> Date: Thu, 13 Aug 2020 10:40:54 -0400
> Subject: [PATCH] mm: memcontrol: fix warning when allocating the root cgroup
> 
> Commit 3e38e0aaca9e ("mm: memcg: charge memcg percpu memory to the
> parent cgroup") adds memory tracking to the memcg kernel structures
> themselves to make cgroups liable for the memory they are consuming
> through the allocation of child groups (which can be significant).
> 
> This code is a bit awkward as it's spread out through several
> functions: The outermost function does memalloc_use_memcg(parent) to
> set up current->active_memcg, which designates which cgroup to charge,
> and the inner functions pass GFP_ACCOUNT to request charging for
> specific allocations. To make sure this dependency is satisfied at all
> times - to make sure we don't randomly charge whoever is calling the
> functions - the inner functions warn on !current->active_memcg.
> 
> However, this triggers a false warning when the root memcg itself is
> allocated. No parent exists in this case, and so current->active_memcg
> is rightfully NULL. It's a false positive, not indicative of a bug.
> 
> Delete the warnings for now, we can revisit this later.
> 
> Fixes: 3e38e0aaca9e ("mm: memcg: charge memcg percpu memory to the parent cgroup")
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Acked-by: Roman Gushchin <guro@fb.com>

Thanks!


> ---
>  mm/memcontrol.c | 6 ------
>  1 file changed, 6 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index d59fd9af6e63..9d87082e64aa 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5137,9 +5137,6 @@ static int alloc_mem_cgroup_per_node_info(struct mem_cgroup *memcg, int node)
>  	if (!pn)
>  		return 1;
>  
> -	/* We charge the parent cgroup, never the current task */
> -	WARN_ON_ONCE(!current->active_memcg);
> -
>  	pn->lruvec_stat_local = alloc_percpu_gfp(struct lruvec_stat,
>  						 GFP_KERNEL_ACCOUNT);
>  	if (!pn->lruvec_stat_local) {
> @@ -5222,9 +5219,6 @@ static struct mem_cgroup *mem_cgroup_alloc(void)
>  		goto fail;
>  	}
>  
> -	/* We charge the parent cgroup, never the current task */
> -	WARN_ON_ONCE(!current->active_memcg);
> -
>  	memcg->vmstats_local = alloc_percpu_gfp(struct memcg_vmstats_percpu,
>  						GFP_KERNEL_ACCOUNT);
>  	if (!memcg->vmstats_local)
> -- 
> 2.28.0
> 

^ permalink raw reply

* Re: [PATCH] sfc_ef100: Fix build failure on powerpc
From: Segher Boessenkool @ 2020-08-13 15:57 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Solarflare linux maintainers, netdev, Martin Habets, linux-kernel,
	Edward Cree, Jakub Kicinski, linuxppc-dev, David S. Miller
In-Reply-To: <44e26ec6a1bc01b5b138c29b623c83d5846718b2.1597329390.git.christophe.leroy@csgroup.eu>

On Thu, Aug 13, 2020 at 02:39:10PM +0000, Christophe Leroy wrote:
> ppc6xx_defconfig fails building sfc.ko module, complaining
> about the lack of _umoddi3 symbol.
> 
> This is due to the following test
> 
>  		if (EFX_MIN_DMAQ_SIZE % reader->value) {
> 
> Because reader->value is u64.
> 
> As EFX_MIN_DMAQ_SIZE value is 512, reader->value is obviously small
> enough for an u32 calculation, so cast it as (u32) for the test, to
> avoid the need for _umoddi3.

That isn't the same e.g. if reader->value is 2**32 + small.  Which
probably cannot happen, but :-)


Segher

^ permalink raw reply
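
The difference pointed out in the reply above can be checked in user space.
This is a minimal sketch, not the driver code: MIN_DMAQ_SIZE stands in for
EFX_MIN_DMAQ_SIZE (512 according to the patch description), and rem_u64(),
rem_u32_cast() and divides_min_dmaq_size() are hypothetical helpers. The
last one shows one possible way to keep 32-bit arithmetic while rejecting
large values first; it is an assumption, not what the thread settled on.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for EFX_MIN_DMAQ_SIZE. */
#define MIN_DMAQ_SIZE 512u

/* Remainder computed in 64 bits, as in the original test. */
static uint64_t rem_u64(uint64_t granularity)
{
	assert(granularity != 0);	/* modulo by zero is undefined */
	return MIN_DMAQ_SIZE % granularity;
}

/* Remainder after truncating the divisor to 32 bits, as in the patch. */
static uint32_t rem_u32_cast(uint64_t granularity)
{
	/* A multiple of 2^32 would truncate to 0 and divide by zero. */
	assert((uint32_t)granularity != 0);
	return MIN_DMAQ_SIZE % (uint32_t)granularity;
}

/* Range check first, so the 32-bit modulo sees the full value. */
static int divides_min_dmaq_size(uint64_t granularity)
{
	if (granularity == 0 || granularity > MIN_DMAQ_SIZE)
		return 0;

	return MIN_DMAQ_SIZE % (uint32_t)granularity == 0;
}
```

For a granularity of 2^32 + 2, rem_u64() returns 512 (not divisible) while
rem_u32_cast() sees a divisor of 2 and returns 0 (divisible), so the two
tests disagree. divides_min_dmaq_size() returns 0 for that value because
anything above 512 cannot divide 512.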

* [PATCH] powerpc/book3s64/radix: Fix boot failure with large amount of guest memory
From: Aneesh Kumar K.V @ 2020-08-13 16:20 UTC (permalink / raw)
  To: linuxppc-dev, mpe; +Cc: Aneesh Kumar K.V, Shirisha Ganta, Sandipan Das, npiggin

If the hypervisor doesn't support hugepages, the kernel ends up allocating a large
number of page table pages. The early page table allocation was wrongly
setting the max memblock limit to ppc64_rma_size with radix translation
which resulted in boot failure as shown below.

Kernel panic - not syncing:
early_alloc_pgtable: Failed to allocate 16777216 bytes align=0x1000000 nid=-1 from=0x0000000000000000 max_addr=0xffffffffffffffff
 CPU: 0 PID: 0 Comm: swapper Not tainted 5.8.0-24.9-default+ #2
 Call Trace:
 [c0000000016f3d00] [c0000000007c6470] dump_stack+0xc4/0x114 (unreliable)
 [c0000000016f3d40] [c00000000014c78c] panic+0x164/0x418
 [c0000000016f3dd0] [c000000000098890] early_alloc_pgtable+0xe0/0xec
 [c0000000016f3e60] [c0000000010a5440] radix__early_init_mmu+0x360/0x4b4
 [c0000000016f3ef0] [c000000001099bac] early_init_mmu+0x1c/0x3c
 [c0000000016f3f10] [c00000000109a320] early_setup+0x134/0x170

This was because the kernel was checking for the radix feature before we enable the
feature via mmu_features. This resulted in the kernel using hash restrictions on
radix.

Rework the early init code such that the kernel boots with memblock restrictions
as imposed by hash. At that point, the kernel still hasn't finalized the
translation it will end up using.

We have three different ways of detecting radix.

1. dt_cpu_ftrs_scan -> used only in case of PowerNV
2. ibm,pa-features -> Used when we don't use dt_cpu_ftrs_scan
3. CAS -> Where we negotiate with hypervisor about the supported translation.

We look at 1 or 2 early in the boot and after that, we look at the CAS vector to
finalize the translation the kernel will use. We also support a kernel command
line option (disable_radix) to switch to hash.

Update the memblock limit after mmu_early_init_devtree() if the kernel is going
to use radix translation. This forces some of the memblock allocations we do before
mmu_early_init_devtree() to be within the RMA limit.

Fixes: 2bfd65e45e87 ("powerpc/mm/radix: Add radix callbacks for early init routines")
Reported-by: Shirisha Ganta <shiganta@in.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/mmu.h | 8 +++++---
 arch/powerpc/kernel/prom.c               | 6 ++++++
 arch/powerpc/mm/book3s64/radix_pgtable.c | 2 ++
 3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
index 55442d45c597..4245f99453f5 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -244,9 +244,11 @@ extern void radix__setup_initial_memory_limit(phys_addr_t first_memblock_base,
 static inline void setup_initial_memory_limit(phys_addr_t first_memblock_base,
 					      phys_addr_t first_memblock_size)
 {
-	if (early_radix_enabled())
-		return radix__setup_initial_memory_limit(first_memblock_base,
-						   first_memblock_size);
+	/*
+	 * Hash has more strict restrictions. At this point we don't
+	 * know which translations we will pick. Hence go with hash
+	 * restrictions.
+	 */
 	return hash__setup_initial_memory_limit(first_memblock_base,
 					   first_memblock_size);
 }
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index d8a2fb87ba0c..340900ae95a4 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -811,6 +811,12 @@ void __init early_init_devtree(void *params)
 
 	mmu_early_init_devtree();
 
+	/*
+	 * Reset ppc64_rma_size and memblock memory limit
+	 */
+	if (early_radix_enabled())
+		radix__setup_initial_memory_limit(memstart_addr, first_memblock_size);
+
 #ifdef CONFIG_PPC_POWERNV
 	/* Scan and build the list of machine check recoverable ranges */
 	of_scan_flat_dt(early_init_dt_scan_recoverable_ranges, NULL);
diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index 28c784976bed..094daf16acac 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -747,6 +747,8 @@ void radix__setup_initial_memory_limit(phys_addr_t first_memblock_base,
 	 * Radix mode is not limited by RMA / VRMA addressing.
 	 */
 	ppc64_rma_size = ULONG_MAX;
+
+	memblock_set_current_limit(MEMBLOCK_ALLOC_ANYWHERE);
 }
 
 #ifdef CONFIG_MEMORY_HOTPLUG
-- 
2.26.2


^ permalink raw reply related
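
The ordering that the patch above establishes can be modelled in a few
lines. This is a simplified sketch, not the kernel code: struct boot_state,
ALLOC_ANYWHERE, setup_initial_limit(), detect_mmu() and finalize_limit()
are hypothetical stand-ins for the early boot state,
MEMBLOCK_ALLOC_ANYWHERE, setup_initial_memory_limit(),
mmu_early_init_devtree() and the new early_radix_enabled() branch. The
hash limit is reduced here to "size of the first memory block", which is
an assumption made for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for MEMBLOCK_ALLOC_ANYWHERE: no upper bound on allocations. */
#define ALLOC_ANYWHERE UINT64_MAX

struct boot_state {
	bool radix_enabled;	/* known only after MMU detection */
	uint64_t alloc_limit;	/* highest address early allocations may use */
};

/* Step 1: translation mode unknown, so apply the stricter hash limit. */
static void setup_initial_limit(struct boot_state *s, uint64_t first_block_size)
{
	s->alloc_limit = first_block_size;
}

/* Step 2: the device tree scan decides which translation is in use. */
static void detect_mmu(struct boot_state *s, bool radix)
{
	s->radix_enabled = radix;
}

/* Step 3: only now is it safe to lift the limit for radix. */
static void finalize_limit(struct boot_state *s)
{
	if (s->radix_enabled)
		s->alloc_limit = ALLOC_ANYWHERE;
}
```

Running the three steps in order leaves a hash boot with the initial limit
and a radix boot with ALLOC_ANYWHERE. Allocations made between step 1 and
step 3 stay under the initial limit in both cases, which matches the last
paragraph of the commit message.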

* Re: [PATCH v2 3/4] powerpc/memhotplug: Make lmb size 64bit
From: Sasha Levin @ 2020-08-13 16:25 UTC (permalink / raw)
  To: Sasha Levin, Aneesh Kumar K.V, linuxppc-dev, mpe; +Cc: Nathan Lynch, stable
In-Reply-To: <20200806162329.276534-3-aneesh.kumar@linux.ibm.com>

Hi

[This is an automated email]

This commit has been processed because it contains a -stable tag.
The stable tag indicates that it's relevant for the following trees: all

The bot has tested the following trees: v5.8, v5.7.14, v5.4.57, v4.19.138, v4.14.193, v4.9.232, v4.4.232.

v5.8: Build OK!
v5.7.14: Build OK!
v5.4.57: Build OK!
v4.19.138: Failed to apply! Possible dependencies:
    Unable to calculate

v4.14.193: Failed to apply! Possible dependencies:
    Unable to calculate

v4.9.232: Failed to apply! Possible dependencies:
    1a367063ca0c ("powerpc/pseries: Check memory device state before onlining/offlining")
    25b587fba9a4 ("powerpc/pseries: Correct possible read beyond dlpar sysfs buffer")
    333f7b76865b ("powerpc/pseries: Implement indexed-count hotplug memory add")
    753843471cbb ("powerpc/pseries: Implement indexed-count hotplug memory remove")
    943db62c316c ("powerpc/pseries: Revert 'Auto-online hotplugged memory'")
    c21f515c7436 ("powerpc/pseries: Make the acquire/release of the drc for memory a seperate step")
    e70d59700fc3 ("powerpc/pseries: Introduce memory hotplug READD operation")
    f84775c2d5d9 ("powerpc/pseries: Fix build break when MEMORY_HOTREMOVE=n")

v4.4.232: Failed to apply! Possible dependencies:
    183deeea5871 ("powerpc/pseries: Consolidate CPU hotplug code to hotplug-cpu.c")
    1a367063ca0c ("powerpc/pseries: Check memory device state before onlining/offlining")
    1dc759566636 ("powerpc/pseries: Use kernel hotplug queue for PowerVM hotplug events")
    1f859adb9253 ("powerpc/pseries: Verify CPU doesn't exist before adding")
    25b587fba9a4 ("powerpc/pseries: Correct possible read beyond dlpar sysfs buffer")
    333f7b76865b ("powerpc/pseries: Implement indexed-count hotplug memory add")
    4a4bdfea7cb7 ("powerpc/pseries: Refactor dlpar_add_lmb() code")
    753843471cbb ("powerpc/pseries: Implement indexed-count hotplug memory remove")
    9054619ef54a ("powerpc/pseries: Add pseries hotplug workqueue")
    943db62c316c ("powerpc/pseries: Revert 'Auto-online hotplugged memory'")
    9dc512819e4b ("powerpc: Fix unused function warning 'lmb_to_memblock'")
    bdf5fc633804 ("powerpc/pseries: Update LMB associativity index during DLPAR add/remove")
    c21f515c7436 ("powerpc/pseries: Make the acquire/release of the drc for memory a seperate step")
    e70d59700fc3 ("powerpc/pseries: Introduce memory hotplug READD operation")
    e9d764f80396 ("powerpc/pseries: Enable kernel CPU dlpar from sysfs")
    ec999072442a ("powerpc/pseries: Auto-online hotplugged memory")
    f84775c2d5d9 ("powerpc/pseries: Fix build break when MEMORY_HOTREMOVE=n")
    fdb4f6e99ffa ("powerpc/pseries: Remove call to memblock_add()")


NOTE: The patch will not be queued to stable trees until it is upstream.

How should we proceed with this patch?

-- 
Thanks
Sasha

^ permalink raw reply

* [PATCH 3/9] powerpc: Remove CONFIG_PPC601_SYNC_FIX
From: Christophe Leroy @ 2020-08-13 16:36 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <11a330af231af22874c006302a945388846f8112.1597336548.git.christophe.leroy@csgroup.eu>

This config option isn't in any defconfig.

The very first versions of the PowerPC 601 have a bug which
requires an additional sync before and/or after some instructions.

This was more than 25 years ago and time has come to retire
those buggy versions of the 601 from the kernel.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/ppc_asm.h |  6 ------
 arch/powerpc/platforms/Kconfig     | 15 ---------------
 2 files changed, 21 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h
index b4cc6608131c..0b9dc814b81c 100644
--- a/arch/powerpc/include/asm/ppc_asm.h
+++ b/arch/powerpc/include/asm/ppc_asm.h
@@ -382,15 +382,9 @@ GLUE(.,name):
 #endif
 
 /* various errata or part fixups */
-#ifdef CONFIG_PPC601_SYNC_FIX
-#define SYNC		sync; isync
-#define SYNC_601	sync
-#define ISYNC_601	isync
-#else
 #define	SYNC
 #define SYNC_601
 #define ISYNC_601
-#endif
 
 #if defined(CONFIG_PPC_CELL) || defined(CONFIG_PPC_FSL_BOOK3E)
 #define MFTB(dest)			\
diff --git a/arch/powerpc/platforms/Kconfig b/arch/powerpc/platforms/Kconfig
index fb7515b4fa9c..f377a56ecc85 100644
--- a/arch/powerpc/platforms/Kconfig
+++ b/arch/powerpc/platforms/Kconfig
@@ -199,21 +199,6 @@ source "drivers/cpuidle/Kconfig"
 
 endmenu
 
-config PPC601_SYNC_FIX
-	bool "Workarounds for PPC601 bugs"
-	depends on PPC_BOOK3S_601 && PPC_PMAC
-	default y
-	help
-	  Some versions of the PPC601 (the first PowerPC chip) have bugs which
-	  mean that extra synchronization instructions are required near
-	  certain instructions, typically those that make major changes to the
-	  CPU state.  These extra instructions reduce performance slightly.
-	  If you say N here, these extra instructions will not be included,
-	  resulting in a kernel which will run faster but may not run at all
-	  on some systems with the PPC601 chip.
-
-	  If in doubt, say Y here.
-
 config TAU
 	bool "On-chip CPU temperature sensor support"
 	depends on PPC_BOOK3S_32
-- 
2.25.0


^ permalink raw reply related

