* [PATCH v2 5/6] sparc64/mm: Implement pXX_leaf_size() support
From: Peter Zijlstra @ 2020-11-26 12:01 UTC (permalink / raw)
To: kan.liang, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
eranian
Cc: linux-arch, ak, catalin.marinas, peterz, linuxppc-dev, willy,
linux-kernel, dave.hansen, npiggin, aneesh.kumar, sparclinux,
will, davem, kirill.shutemov
In-Reply-To: <20201126120114.071913521@infradead.org>
Sparc64 has non-pagetable aligned large page support; wire up the
pXX_leaf_size() functions to report the correct pagetable page size.
This enables PERF_SAMPLE_{DATA,CODE}_PAGE_SIZE to report accurate
pagetable leaf sizes.
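
As a rough userspace illustration of what these helpers compute: once the
page shift has been decoded from the TTE, the leaf size is just 1 << shift.
The demo_ names and the two shift constants below are assumptions made for
this sketch, not the encodings used by the sparc64 code.

```c
#include <stdint.h>

/* Illustrative shifts only; assumed for this sketch. */
#define DEMO_PAGE_SHIFT   13	/* 8K base page */
#define DEMO_HPAGE_SHIFT  23	/* 8M huge page */

/*
 * Convert an already-decoded page shift into a leaf size in bytes.
 * The caller must pass a shift below 64.
 */
static inline uint64_t demo_leaf_size(unsigned int shift)
{
	return UINT64_C(1) << shift;
}
```

With these assumed values, a base-page entry reports 8192 bytes and a huge
entry reports 8388608 bytes, regardless of which page-table level holds it.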
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
arch/sparc/include/asm/pgtable_64.h | 13 +++++++++++++
arch/sparc/mm/hugetlbpage.c | 19 +++++++++++++------
2 files changed, 26 insertions(+), 6 deletions(-)
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -1121,6 +1121,19 @@ extern unsigned long cmdline_memory_size
asmlinkage void do_sparc64_fault(struct pt_regs *regs);
+#ifdef CONFIG_HUGETLB_PAGE
+
+#define pud_leaf_size pud_leaf_size
+extern unsigned long pud_leaf_size(pud_t pud);
+
+#define pmd_leaf_size pmd_leaf_size
+extern unsigned long pmd_leaf_size(pmd_t pmd);
+
+#define pte_leaf_size pte_leaf_size
+extern unsigned long pte_leaf_size(pte_t pte);
+
+#endif /* CONFIG_HUGETLB_PAGE */
+
#endif /* !(__ASSEMBLY__) */
#endif /* !(_SPARC64_PGTABLE_H) */
--- a/arch/sparc/mm/hugetlbpage.c
+++ b/arch/sparc/mm/hugetlbpage.c
@@ -247,14 +247,17 @@ static unsigned int sun4u_huge_tte_to_sh
return shift;
}
-static unsigned int huge_tte_to_shift(pte_t entry)
+static unsigned long tte_to_shift(pte_t entry)
{
- unsigned long shift;
-
if (tlb_type == hypervisor)
- shift = sun4v_huge_tte_to_shift(entry);
- else
- shift = sun4u_huge_tte_to_shift(entry);
+ return sun4v_huge_tte_to_shift(entry);
+
+ return sun4u_huge_tte_to_shift(entry);
+}
+
+static unsigned int huge_tte_to_shift(pte_t entry)
+{
+ unsigned long shift = tte_to_shift(entry);
if (shift == PAGE_SHIFT)
WARN_ONCE(1, "tto_to_shift: invalid hugepage tte=0x%lx\n",
@@ -272,6 +275,10 @@ static unsigned long huge_tte_to_size(pt
return size;
}
+unsigned long pud_leaf_size(pud_t pud) { return 1UL << tte_to_shift((pte_t)pud); }
+unsigned long pmd_leaf_size(pmd_t pmd) { return 1UL << tte_to_shift((pte_t)pmd); }
+unsigned long pte_leaf_size(pte_t pte) { return 1UL << tte_to_shift((pte_t)pte); }
+
pte_t *huge_pte_alloc(struct mm_struct *mm,
unsigned long addr, unsigned long sz)
{
* [PATCH v2 6/6] powerpc/8xx: Implement pXX_leaf_size() support
From: Peter Zijlstra @ 2020-11-26 12:01 UTC (permalink / raw)
To: kan.liang, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
eranian
Cc: linux-arch, ak, catalin.marinas, peterz, linuxppc-dev, willy,
linux-kernel, dave.hansen, npiggin, aneesh.kumar, sparclinux,
will, davem, kirill.shutemov
In-Reply-To: <20201126120114.071913521@infradead.org>
Christophe Leroy wrote:
> I can help with powerpc 8xx. It is a 32-bit powerpc. The PGD has 1024
> entries, that means each entry maps 4M.
>
> Page sizes are 4k, 16k, 512k and 8M.
>
> For the 8M pages we use hugepd with a single entry. The two related PGD
> entries point to the same hugepd.
>
> For the other sizes, they are in standard page tables. 16k pages appear
> 4 times in the page table. 512k entries appear 128 times in the page
> table.
>
> When the PGD entry has _PMD_PAGE_8M bits, the PMD entry points to a
> hugepd which holds the single 8M entry.
>
> In the PTE, we have two bits: _PAGE_SPS and _PAGE_HUGE
>
> _PAGE_HUGE means it is a 512k page
> _PAGE_SPS means it is not a 4k page
>
> The kernel can be built with either 4k pages as the standard page size or
> 16k pages. It doesn't change the page table layout though.
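
The arithmetic and the flag decode order in the description above can be
checked with a small userspace sketch. Everything prefixed demo_/DEMO_ is
hypothetical, and the flag bit values are placeholders; the real definitions
live in pte-8xx.h.

```c
#include <stdint.h>

#define DEMO_SZ_4K    (4u * 1024)
#define DEMO_SZ_16K   (16u * 1024)
#define DEMO_SZ_512K  (512u * 1024)

/* Placeholder flag bits, assumed for illustration only. */
#define DEMO_PAGE_SPS   0x1u	/* "not a 4k page" */
#define DEMO_PAGE_HUGE  0x2u	/* "512k page" */

/* Number of 4k page-table slots that one leaf of the given size occupies. */
static inline unsigned int demo_slots_per_leaf(uint32_t leaf_size)
{
	return leaf_size / DEMO_SZ_4K;
}

/* Same decode order as described: HUGE first, then SPS, otherwise 4k. */
static inline uint32_t demo_pte_leaf_size(uint32_t pte_flags)
{
	if (pte_flags & DEMO_PAGE_HUGE)
		return DEMO_SZ_512K;
	if (pte_flags & DEMO_PAGE_SPS)
		return DEMO_SZ_16K;
	return DEMO_SZ_4K;
}
```

This matches the numbers quoted above: a 16k page fills 4 slots, a 512k page
fills 128 slots, and 1024 PGD entries of 4M each cover the 4G address space.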
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
arch/powerpc/include/asm/nohash/32/pte-8xx.h | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
--- a/arch/powerpc/include/asm/nohash/32/pte-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
@@ -135,6 +135,29 @@ static inline pte_t pte_mkhuge(pte_t pte
}
#define pte_mkhuge pte_mkhuge
+
+static inline unsigned long pgd_leaf_size(pgd_t pgd)
+{
+ if (pgd_val(pgd) & _PMD_PAGE_8M)
+ return SZ_8M;
+ return SZ_4M;
+}
+
+#define pgd_leaf_size pgd_leaf_size
+
+static inline unsigned long pte_leaf_size(pte_t pte)
+{
+ pte_basic_t val = pte_val(pte);
+
+ if (val & _PAGE_HUGE)
+ return SZ_512K;
+ if (val & _PAGE_SPS)
+ return SZ_16K;
+ return SZ_4K;
+}
+
+#define pte_leaf_size pte_leaf_size
+
#endif
#endif /* __KERNEL__ */
* [PATCH v2 2/6] mm: Introduce pXX_leaf_size()
From: Peter Zijlstra @ 2020-11-26 12:01 UTC (permalink / raw)
To: kan.liang, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
eranian
Cc: linux-arch, ak, catalin.marinas, peterz, linuxppc-dev, willy,
linux-kernel, dave.hansen, npiggin, aneesh.kumar, sparclinux,
will, davem, kirill.shutemov
In-Reply-To: <20201126120114.071913521@infradead.org>
A number of architectures have non-pagetable aligned huge/large pages.
For such architectures a leaf can actually be part of a larger entry.
Provide generic helpers to determine the size of a page-table leaf.
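
For readers unfamiliar with the override idiom these helpers rely on, here
is a minimal userspace sketch. An architecture defines a function plus a
macro of the same name; the generic header then supplies a fallback only
when no such macro exists. All demo_ names and sizes are assumptions for
illustration.

```c
/* "Arch" side: provide the function and a same-named macro as a marker. */
static inline unsigned long demo_pmd_leaf_size(unsigned long pmd)
{
	/* Hypothetical encoding: bit 0 selects a smaller 64K leaf. */
	return (pmd & 1UL) ? (1UL << 16) : (1UL << 21);
}
#define demo_pmd_leaf_size demo_pmd_leaf_size

/* "Generic" side: fall back only where no override was defined. */
#ifndef demo_pmd_leaf_size
#define demo_pmd_leaf_size(x) (1UL << 21)
#endif
#ifndef demo_pte_leaf_size
#define demo_pte_leaf_size(x) (1UL << 12)
#endif
```

Here demo_pmd_leaf_size() resolves to the arch function, while
demo_pte_leaf_size() resolves to the generic constant because nothing
overrode it.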
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
include/linux/pgtable.h | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1536,4 +1536,20 @@ typedef unsigned int pgtbl_mod_mask;
#define pmd_leaf(x) 0
#endif
+#ifndef pgd_leaf_size
+#define pgd_leaf_size(x) (1ULL << PGDIR_SHIFT)
+#endif
+#ifndef p4d_leaf_size
+#define p4d_leaf_size(x) P4D_SIZE
+#endif
+#ifndef pud_leaf_size
+#define pud_leaf_size(x) PUD_SIZE
+#endif
+#ifndef pmd_leaf_size
+#define pmd_leaf_size(x) PMD_SIZE
+#endif
+#ifndef pte_leaf_size
+#define pte_leaf_size(x) PAGE_SIZE
+#endif
+
#endif /* _LINUX_PGTABLE_H */
* Re: [PATCH 0/5] perf/mm: Fix PERF_SAMPLE_*_PAGE_SIZE
From: Peter Zijlstra @ 2020-11-26 10:46 UTC (permalink / raw)
To: Christophe Leroy
Cc: mark.rutland, aneesh.kumar, willy, catalin.marinas, will,
alexander.shishkin, linuxppc-dev, npiggin, linux-kernel, acme,
davem, dave.hansen, ak, eranian, sparclinux, linux-arch, jolsa,
mingo, kirill.shutemov, kan.liang
In-Reply-To: <20201120122004.GG3021@hirez.programming.kicks-ass.net>
On Fri, Nov 20, 2020 at 01:20:04PM +0100, Peter Zijlstra wrote:
> > > I can help with powerpc 8xx. It is a 32-bit powerpc. The PGD has 1024
> > > entries, that means each entry maps 4M.
> > >
> > > Page sizes are 4k, 16k, 512k and 8M.
> > >
> > > For the 8M pages we use hugepd with a single entry. The two related PGD
> > > entries point to the same hugepd.
> > >
> > > For the other sizes, they are in standard page tables. 16k pages appear
> > > 4 times in the page table. 512k entries appear 128 times in the page
> > > table.
> > >
> > > When the PGD entry has _PMD_PAGE_8M bits, the PMD entry points to a
> > > hugepd which holds the single 8M entry.
> > >
> > > In the PTE, we have two bits: _PAGE_SPS and _PAGE_HUGE
> > >
> > > _PAGE_HUGE means it is a 512k page
> > > _PAGE_SPS means it is not a 4k page
> > >
> > > The kernel can be built with either 4k pages as the standard page size or
> > > 16k pages. It doesn't change the page table layout though.
> > >
> > > Hope this is clear. Now I don't really know how to wire that up to your series.
Does the below accurately reflect things?
Let me go find a suitable cross-compiler ..
diff --git a/arch/powerpc/include/asm/nohash/32/pte-8xx.h b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
index 1581204467e1..fcc48d590d88 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
@@ -135,6 +135,29 @@ static inline pte_t pte_mkhuge(pte_t pte)
}
#define pte_mkhuge pte_mkhuge
+
+static inline unsigned long pgd_leaf_size(pgd_t pgd)
+{
+ if (pgd_val(pgd) & _PMD_PAGE_8M)
+ return SZ_8M;
+ return SZ_4M;
+}
+
+#define pgd_leaf_size pgd_leaf_size
+
+static inline unsigned long pte_leaf_size(pte_t pte)
+{
+ pte_basic_t val = pte_val(pte);
+
+ if (val & _PAGE_HUGE)
+ return SZ_512K;
+ if (val & _PAGE_SPS)
+ return SZ_16K;
+ return SZ_4K;
+}
+
+#define pte_leaf_size pte_leaf_size
+
#endif
#endif /* __KERNEL__ */
* Re: [PATCH v6 16/22] powerpc/book3s64/kuap: Improve error reporting with KUAP
From: Christophe Leroy @ 2020-11-26 10:39 UTC (permalink / raw)
To: Michael Ellerman, Aneesh Kumar K.V, linuxppc-dev
In-Reply-To: <87r1ogxy8j.fsf@mpe.ellerman.id.au>
Le 26/11/2020 à 10:29, Michael Ellerman a écrit :
> "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
>> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>>
>>> Le 25/11/2020 à 06:16, Aneesh Kumar K.V a écrit :
>>>> With hash translation use DSISR_KEYFAULT to identify a wrong access.
>>>> With Radix we look at the AMR value and type of fault.
>>>>
>>>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>>>> ---
>>>> arch/powerpc/include/asm/book3s/32/kup.h | 4 +--
>>>> arch/powerpc/include/asm/book3s/64/kup.h | 27 ++++++++++++++++----
>>>> arch/powerpc/include/asm/kup.h | 4 +--
>>>> arch/powerpc/include/asm/nohash/32/kup-8xx.h | 4 +--
>>>> arch/powerpc/mm/fault.c | 2 +-
>>>> 5 files changed, 29 insertions(+), 12 deletions(-)
>>>>
>>>> diff --git a/arch/powerpc/include/asm/book3s/32/kup.h b/arch/powerpc/include/asm/book3s/32/kup.h
>>>> index 32fd4452e960..b18cd931e325 100644
>>>> --- a/arch/powerpc/include/asm/book3s/32/kup.h
>>>> +++ b/arch/powerpc/include/asm/book3s/32/kup.h
>>>> @@ -177,8 +177,8 @@ static inline void restore_user_access(unsigned long flags)
>>>> allow_user_access(to, to, end - addr, KUAP_READ_WRITE);
>>>> }
>>>>
>>>> -static inline bool
>>>> -bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
>>>> +static inline bool bad_kuap_fault(struct pt_regs *regs, unsigned long address,
>>>> + bool is_write, unsigned long error_code)
>>>> {
>>>> unsigned long begin = regs->kuap & 0xf0000000;
>>>> unsigned long end = regs->kuap << 28;
>>>> diff --git a/arch/powerpc/include/asm/book3s/64/kup.h b/arch/powerpc/include/asm/book3s/64/kup.h
>>>> index 4a3d0d601745..2922c442a218 100644
>>>> --- a/arch/powerpc/include/asm/book3s/64/kup.h
>>>> +++ b/arch/powerpc/include/asm/book3s/64/kup.h
>>>> @@ -301,12 +301,29 @@ static inline void set_kuap(unsigned long value)
>>>> isync();
>>>> }
>>>>
>>>> -static inline bool
>>>> -bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
>>>> +#define RADIX_KUAP_BLOCK_READ UL(0x4000000000000000)
>>>> +#define RADIX_KUAP_BLOCK_WRITE UL(0x8000000000000000)
>>>> +
>>>> +static inline bool bad_kuap_fault(struct pt_regs *regs, unsigned long address,
>>>> + bool is_write, unsigned long error_code)
>>>> {
>>>> - return WARN(mmu_has_feature(MMU_FTR_KUAP) &&
>>>> - (regs->kuap & (is_write ? AMR_KUAP_BLOCK_WRITE : AMR_KUAP_BLOCK_READ)),
>>>> - "Bug: %s fault blocked by AMR!", is_write ? "Write" : "Read");
>>>> + if (!mmu_has_feature(MMU_FTR_KUAP))
>>>> + return false;
>>>> +
>>>> + if (radix_enabled()) {
>>>> + /*
>>>> + * Will be a storage protection fault.
>>>> + * Only check the details of AMR[0]
>>>> + */
>>>> + return WARN((regs->kuap & (is_write ? RADIX_KUAP_BLOCK_WRITE : RADIX_KUAP_BLOCK_READ)),
>>>> + "Bug: %s fault blocked by AMR!", is_write ? "Write" : "Read");
>>>
>>> I think it is pointless to keep the WARN() here.
>>>
>>> I have a series aiming at removing them. See
>>> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/cc9129bdda1dbc2f0a09cf45fece7d0b0e690784.1605541983.git.christophe.leroy@csgroup.eu/
>>
>> Can we do this as a separate patch as you posted above? We can drop the
>> WARN in that while keeping the hash branch to look at DSISR value.
>
> Yeah we can reconcile Christophe's series with yours later.
>
> I'm still not 100% convinced I want to drop that WARN.
Ok, you can still take the rest of the series as that patch is the last one.
But I really can't see the point of the WARN. When I hit a KUAP bad fault, I get a double dump
(see below). The second one is the interesting one: it tells me everything about the fault. The
WARN only provides internals of the do_page_fault() function. What interesting information do I get from there?
[ 37.842509] lkdtm: attempting bad write at b7bae000
[ 37.842526] ------------[ cut here ]------------
[ 37.842536] Bug: write fault blocked by segment registers !
[ 37.842598] WARNING: CPU: 0 PID: 434 at arch/powerpc/include/asm/book3s/32/kup.h:189 do_page_fault+0x3c8/0x5f0
[ 37.842630] CPU: 0 PID: 434 Comm: busybox Not tainted 5.10.0-rc5-s3k-dev-01343-g8bec80f73baa #4165
[ 37.842650] NIP: c00155e4 LR: c00155e4 CTR: 00000000
[ 37.842670] REGS: e6719c78 TRAP: 0700 Not tainted (5.10.0-rc5-s3k-dev-01343-g8bec80f73baa)
[ 37.842683] MSR: 00021032 <ME,IR,DR,RI> CR: 22002224 XER: 20000000
[ 37.842750]
[ 37.842750] GPR00: c00155e4 e6719d30 c113c660 0000002f c097adf8 c097af10 00000027 00000027
[ 37.842750] GPR08: c0b0afbc 00000000 00000023 00000001 24002224 100d166a 100a0920 00000000
[ 37.842750] GPR16: 100cac0c 100b0000 10169444 1016a685 100d0000 100d0000 00000000 100a0900
[ 37.842750] GPR24: ffffffef ffffffff c1392220 00000300 c076f424 02000000 b7bae000 e6719d70
[ 37.843049] NIP [c00155e4] do_page_fault+0x3c8/0x5f0
[ 37.843074] LR [c00155e4] do_page_fault+0x3c8/0x5f0
[ 37.843087] Call Trace:
[ 37.843114] [e6719d30] [c00155e4] do_page_fault+0x3c8/0x5f0 (unreliable)
[ 37.843154] [e6719d60] [c0014384] handle_page_fault+0x10/0x3c
[ 37.843211] --- interrupt: 301 at lkdtm_ACCESS_USERSPACE+0xdc/0xe4
[ 37.843211] LR = lkdtm_ACCESS_USERSPACE+0xd0/0xe4
[ 37.843238] [e6719e48] [c039d76c] direct_entry+0xe0/0x164
[ 37.843281] [e6719e68] [c0286730] full_proxy_write+0x78/0xbc
[ 37.843325] [e6719e88] [c01657a8] vfs_write+0xdc/0x458
[ 37.843359] [e6719f08] [c0165cb0] ksys_write+0x6c/0x11c
[ 37.843397] [e6719f38] [c0014164] ret_from_syscall+0x0/0x34
[ 37.843426] --- interrupt: c01 at 0xfd55784
[ 37.843426] LR = 0xfe16244
[ 37.843438] Instruction dump:
[ 37.843459] 38600007 4bff7a19 3bc00000 4bfffdbc 419e0110 813f00b0 55280006 7c1e4040
[ 37.843529] 408000f4 3c60c080 3863e148 4801552d <0fe00000> 3c80c072 3c60c097 38840d84
[ 37.843602] ---[ end trace 29c115c8ef352681 ]---
[ 37.843627] Kernel attempted to write user page (b7bae000) - exploit attempt? (uid: 0)
[ 37.851531] BUG: Unable to handle kernel data access on write at 0xb7bae000
[ 37.858472] Faulting instruction address: 0xc039e550
[ 37.863432] Oops: Kernel access of bad area, sig: 11 [#1]
[ 37.868822] BE PAGE_SIZE=4K PREEMPT CMPCPRO
[ 37.873029] SAF3000 DIE NOTIFICATION
[ 37.876624] CPU: 0 PID: 434 Comm: busybox Tainted: G W 5.10.0-rc5-s3k-dev-01343-g8bec80f73baa #4165
[ 37.886940] NIP: c039e550 LR: c039e544 CTR: 00000000
[ 37.891988] REGS: e6719d70 TRAP: 0300 Tainted: G W (5.10.0-rc5-s3k-dev-01343-g8bec80f73baa)
[ 37.901866] MSR: 00009032 <EE,ME,IR,DR,RI> CR: 24002224 XER: 00000000
[ 37.908617] DAR: b7bae000 DSISR: 0a000000
[ 37.908617] GPR00: c039e544 e6719e28 c113c660 c083aad8 c097adf8 c097af10 00000027 00000027
[ 37.908617] GPR08: c0b0afbc c0dec0de 00000023 00000001 28002224 100d166a 100a0920 00000000
[ 37.908617] GPR16: 100cac0c 100b0000 10169444 1016a685 100d0000 100d0000 00000000 100a0900
[ 37.908617] GPR24: ffffffef ffffffff e6719f10 00000011 c076f424 c1cb1000 c0839e24 b7bae000
[ 37.946267] NIP [c039e550] lkdtm_ACCESS_USERSPACE+0xdc/0xe4
[ 37.951842] LR [c039e544] lkdtm_ACCESS_USERSPACE+0xd0/0xe4
[ 37.957316] Call Trace:
[ 37.959782] [e6719e28] [c039e544] lkdtm_ACCESS_USERSPACE+0xd0/0xe4 (unreliable)
[ 37.967102] [e6719e48] [c039d76c] direct_entry+0xe0/0x164
[ 37.972524] [e6719e68] [c0286730] full_proxy_write+0x78/0xbc
[ 37.978204] [e6719e88] [c01657a8] vfs_write+0xdc/0x458
[ 37.983358] [e6719f08] [c0165cb0] ksys_write+0x6c/0x11c
[ 37.988605] [e6719f38] [c0014164] ret_from_syscall+0x0/0x34
[ 37.994185] --- interrupt: c01 at 0xfd55784
[ 37.994185] LR = 0xfe16244
[ 38.001385] Instruction dump:
[ 38.004360] 3863ac00 3d29c0df 3929c0de 91210008 4bcd04c9 3c60c084 3863ac24 7fe4fb78
[ 38.012149] 4bcd04b9 3c60c084 81210008 3863aad8 <913f0000> 4bffff80 3c60c084 3863aa00
[ 38.020120] ---[ end trace 29c115c8ef352682 ]---
Christophe
* Re: [PATCH 0/4] powerpc/64s: Fix for radix TLB invalidation bug
From: Aneesh Kumar K.V @ 2020-11-26 10:36 UTC (permalink / raw)
To: Nicholas Piggin, linuxppc-dev
Cc: Milton Miller, Paul Mackerras, Nicholas Piggin
In-Reply-To: <20201126102530.691335-1-npiggin@gmail.com>
Nicholas Piggin <npiggin@gmail.com> writes:
> This fixes a tricky bug that was noticed by TLB multi-hits in a guest
> stress testing CPU hotplug, but TLB invalidation means any kind of
> data corruption is possible.
>
> Thanks,
> Nick
>
> Nicholas Piggin (4):
> powerpc/64s: Fix hash ISA v3.0 TLBIEL instruction generation
> powerpc/64s/pseries: Fix hash tlbiel_all_isa300 for guest kernels
> kernel/cpu: add arch override for clear_tasks_mm_cpumask() mm handling
> powerpc/64s: Trim offlined CPUs from mm_cpumasks
>
> arch/powerpc/include/asm/book3s/64/mmu.h | 12 ++++++++++
> arch/powerpc/mm/book3s64/hash_native.c | 23 +++++++++++++-------
> arch/powerpc/mm/book3s64/mmu_context.c | 20 +++++++++++++++++
> arch/powerpc/platforms/powermac/smp.c | 2 ++
> arch/powerpc/platforms/powernv/smp.c | 3 +++
> arch/powerpc/platforms/pseries/hotplug-cpu.c | 3 +++
> kernel/cpu.c | 6 ++++-
> 7 files changed, 60 insertions(+), 9 deletions(-)
>
You can add for the series
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
-aneesh
* Re: [PATCH 1/2] powerpc: sstep: Fix load and update instructions
From: Ravi Bangoria @ 2020-11-26 10:36 UTC (permalink / raw)
To: Sandipan Das, mpe
Cc: Ravi Bangoria, jniethe5, paulus, naveen.n.rao, linuxppc-dev, dja
In-Reply-To: <20201119054139.244083-1-sandipan@linux.ibm.com>
On 11/19/20 11:11 AM, Sandipan Das wrote:
> The Power ISA says that the fixed-point load and update
> instructions must neither use R0 for the base address (RA)
> nor have the destination (RT) and the base address (RA) as
> the same register. In these cases, the instruction is
> invalid. This applies to the following instructions.
> * Load Byte and Zero with Update (lbzu)
> * Load Byte and Zero with Update Indexed (lbzux)
> * Load Halfword and Zero with Update (lhzu)
> * Load Halfword and Zero with Update Indexed (lhzux)
> * Load Halfword Algebraic with Update (lhau)
> * Load Halfword Algebraic with Update Indexed (lhaux)
> * Load Word and Zero with Update (lwzu)
> * Load Word and Zero with Update Indexed (lwzux)
> * Load Word Algebraic with Update Indexed (lwaux)
> * Load Doubleword with Update (ldu)
> * Load Doubleword with Update Indexed (ldux)
>
> However, the following behaviour is observed using some
> invalid opcodes where RA = RT.
>
> A userspace program using an invalid instruction word like
> 0xe9ce0001, i.e. "ldu r14, 0(r14)", runs and exits without
> getting terminated abruptly. The instruction performs the
> load operation but does not write the effective address to
> the base address register. Attaching an uprobe at that
> instruction's address results in emulation which writes the
> effective address to the base register. Thus, the final value
> of the base address register is different.
>
> To remove any inconsistencies, this adds an additional check
> for the aforementioned instructions to make sure that they
> are treated as unknown by the emulation infrastructure when
> RA = 0 or RA = RT. The kernel will then fall back to executing
> the instruction on hardware.
>
> Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
For the series:
Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
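
For reference, the RA = 0 / RA = RT condition from the quoted commit message
can be sketched in userspace C as below. The demo_ helpers are hypothetical
and are not the sstep code; they assume the caller has already established
that the word is one of the listed load-with-update instructions.

```c
#include <stdbool.h>
#include <stdint.h>

/* RT and RA occupy the two 5-bit fields that follow the 6-bit primary opcode. */
static inline unsigned int demo_rt(uint32_t insn)
{
	return (insn >> 21) & 0x1f;
}

static inline unsigned int demo_ra(uint32_t insn)
{
	return (insn >> 16) & 0x1f;
}

/* A load-with-update form is invalid when RA is 0 or when RA equals RT. */
static inline bool demo_load_update_invalid(uint32_t insn)
{
	unsigned int ra = demo_ra(insn);

	return ra == 0 || ra == demo_rt(insn);
}
```

For the word 0xe9ce0001 from the commit message ("ldu r14, 0(r14)"), both
fields decode to 14, so the check flags it as invalid.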
* [PATCH 4/4] powerpc/64s: Trim offlined CPUs from mm_cpumasks
From: Nicholas Piggin @ 2020-11-26 10:25 UTC (permalink / raw)
To: linuxppc-dev
Cc: Aneesh Kumar K.V, Paul Mackerras, Nicholas Piggin, Milton Miller
In-Reply-To: <20201126102530.691335-1-npiggin@gmail.com>
When offlining a CPU, powerpc/64s does not flush TLBs, rather it just
leaves the CPU set in mm_cpumasks, so it continues to receive TLBIEs
to manage its TLBs.
However the exit_flush_lazy_tlbs() function expects that after
returning, all CPUs (except self) have flushed TLBs for that mm, in
which case TLBIEL can be used for this flush. This breaks for offline
CPUs because they don't get the IPI to flush their TLB. This can lead
to stale translations.
Fix this by clearing the CPU from mm_cpumasks, then flushing all TLBs
before going offline.
These offlined CPU bits stuck in the cpumask also prevent the cpumask
from being trimmed back to local mode, which means continual broadcast
IPIs or TLBIEs are needed for TLB flushing. This patch prevents that
situation too.
A cast of many were involved in working this out, but in particular
Milton, Aneesh and Paul made key discoveries.
Fixes: 0cef77c7798a7 ("powerpc/64s/radix: flush remote CPUs out of single-threaded mm_cpumask")
Debugged-by: Milton Miller <miltonm@us.ibm.com>
Debugged-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Debugged-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/include/asm/book3s/64/mmu.h | 12 ++++++++++++
arch/powerpc/mm/book3s64/mmu_context.c | 20 ++++++++++++++++++++
arch/powerpc/platforms/powermac/smp.c | 2 ++
arch/powerpc/platforms/powernv/smp.c | 3 +++
arch/powerpc/platforms/pseries/hotplug-cpu.c | 3 +++
5 files changed, 40 insertions(+)
diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
index e0b52940e43c..750918451dd2 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -242,6 +242,18 @@ extern void radix_init_pseries(void);
static inline void radix_init_pseries(void) { };
#endif
+#ifdef CONFIG_HOTPLUG_CPU
+#define arch_clear_mm_cpumask_cpu(cpu, mm) \
+ do { \
+ if (cpumask_test_cpu(cpu, mm_cpumask(mm))) { \
+ atomic_dec(&(mm)->context.active_cpus); \
+ cpumask_clear_cpu(cpu, mm_cpumask(mm)); \
+ } \
+ } while (0)
+
+void cleanup_cpu_mmu_context(void);
+#endif
+
static inline int get_user_context(mm_context_t *ctx, unsigned long ea)
{
int index = ea >> MAX_EA_BITS_PER_CONTEXT;
diff --git a/arch/powerpc/mm/book3s64/mmu_context.c b/arch/powerpc/mm/book3s64/mmu_context.c
index 1c54821de7bf..0c8557220ae2 100644
--- a/arch/powerpc/mm/book3s64/mmu_context.c
+++ b/arch/powerpc/mm/book3s64/mmu_context.c
@@ -17,6 +17,7 @@
#include <linux/export.h>
#include <linux/gfp.h>
#include <linux/slab.h>
+#include <linux/cpu.h>
#include <asm/mmu_context.h>
#include <asm/pgalloc.h>
@@ -307,3 +308,22 @@ void radix__switch_mmu_context(struct mm_struct *prev, struct mm_struct *next)
isync();
}
#endif
+
+/**
+ * cleanup_cpu_mmu_context - Clean up MMU details for this CPU (newly offlined)
+ *
+ * This clears the CPU from mm_cpumask for all processes, and then flushes the
+ * local TLB to ensure TLB coherency in case the CPU is onlined again.
+ *
+ * KVM guest translations are not necessarily flushed here. If KVM started
+ * using mm_cpumask or the Linux APIs which do, this would have to be resolved.
+ */
+#ifdef CONFIG_HOTPLUG_CPU
+void cleanup_cpu_mmu_context(void)
+{
+ int cpu = smp_processor_id();
+
+ clear_tasks_mm_cpumask(cpu);
+ tlbiel_all();
+}
+#endif
diff --git a/arch/powerpc/platforms/powermac/smp.c b/arch/powerpc/platforms/powermac/smp.c
index 74ebe664b016..adae2a6712e1 100644
--- a/arch/powerpc/platforms/powermac/smp.c
+++ b/arch/powerpc/platforms/powermac/smp.c
@@ -911,6 +911,8 @@ static int smp_core99_cpu_disable(void)
mpic_cpu_set_priority(0xf);
+ cleanup_cpu_mmu_context();
+
return 0;
}
diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c
index 54c4ba45c7ce..cbb67813cd5d 100644
--- a/arch/powerpc/platforms/powernv/smp.c
+++ b/arch/powerpc/platforms/powernv/smp.c
@@ -143,6 +143,9 @@ static int pnv_smp_cpu_disable(void)
xive_smp_disable_cpu();
else
xics_migrate_irqs_away();
+
+ cleanup_cpu_mmu_context();
+
return 0;
}
diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index f2837e33bf5d..a02012f1b04a 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -90,6 +90,9 @@ static int pseries_cpu_disable(void)
xive_smp_disable_cpu();
else
xics_migrate_irqs_away();
+
+ cleanup_cpu_mmu_context();
+
return 0;
}
--
2.23.0
* [PATCH 3/4] kernel/cpu: add arch override for clear_tasks_mm_cpumask() mm handling
From: Nicholas Piggin @ 2020-11-26 10:25 UTC (permalink / raw)
To: linuxppc-dev
Cc: Peter Zijlstra, Aneesh Kumar K.V, Paul Mackerras, Nicholas Piggin,
Milton Miller
In-Reply-To: <20201126102530.691335-1-npiggin@gmail.com>
powerpc/64s keeps a counter in the mm which counts bits set in
mm_cpumask as well as other things. This means it can't use the generic
code to clear bits out of the mask, because that code doesn't adjust the
arch-specific counter.
Add an arch override that allows powerpc/64s to use
clear_tasks_mm_cpumask().
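
To illustrate why a plain bit clear is not enough, here is a simplified
userspace model. The struct and function names are hypothetical, and the
model assumes the counter tracks only the mask bits, whereas the real
counter also covers other things.

```c
#include <stdint.h>

/* Hypothetical stand-in for an mm with a CPU mask and a derived counter. */
struct demo_mm {
	uint64_t cpumask;	/* bit n set: CPU n may hold translations */
	int active_cpus;	/* in this model: number of bits set above */
};

/* Generic behaviour: clear the bit and nothing else. */
static inline void demo_generic_clear(struct demo_mm *mm, unsigned int cpu)
{
	mm->cpumask &= ~(UINT64_C(1) << cpu);
}

/* Arch override: drop the counter only if the bit was actually set. */
static inline void demo_arch_clear(struct demo_mm *mm, unsigned int cpu)
{
	uint64_t bit = UINT64_C(1) << cpu;

	if (mm->cpumask & bit) {
		mm->active_cpus--;
		mm->cpumask &= ~bit;
	}
}
```

With the generic variant the mask and the counter drift apart after the
first clear; the override keeps them consistent and is safe to call twice
for the same CPU.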
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
kernel/cpu.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 6ff2578ecf17..2b8d7a5db383 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -815,6 +815,10 @@ void __init cpuhp_threads_init(void)
}
#ifdef CONFIG_HOTPLUG_CPU
+#ifndef arch_clear_mm_cpumask_cpu
+#define arch_clear_mm_cpumask_cpu(cpu, mm) cpumask_clear_cpu(cpu, mm_cpumask(mm))
+#endif
+
/**
* clear_tasks_mm_cpumask - Safely clear tasks' mm_cpumask for a CPU
* @cpu: a CPU id
@@ -850,7 +854,7 @@ void clear_tasks_mm_cpumask(int cpu)
t = find_lock_task_mm(p);
if (!t)
continue;
- cpumask_clear_cpu(cpu, mm_cpumask(t->mm));
+ arch_clear_mm_cpumask_cpu(cpu, t->mm);
task_unlock(t);
}
rcu_read_unlock();
--
2.23.0
* [PATCH 2/4] powerpc/64s/pseries: Fix hash tlbiel_all_isa300 for guest kernels
From: Nicholas Piggin @ 2020-11-26 10:25 UTC (permalink / raw)
To: linuxppc-dev
Cc: Aneesh Kumar K.V, Paul Mackerras, Nicholas Piggin, Milton Miller
In-Reply-To: <20201126102530.691335-1-npiggin@gmail.com>
tlbiel_all() is currently not usable in !HVMODE when running hash.
Remove the HV-privileged flushes when running in a guest to make it usable.
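
As a rough model of the resulting flush sequence (not the kernel code), the
sketch below only counts which kinds of flush a flush-all pass would issue
in each mode. The struct and function names are hypothetical.

```c
#include <stdbool.h>

/* Counts of each flush kind issued by one (hypothetical) flush-all pass. */
struct demo_flush_plan {
	unsigned int partition_table_flushes;	/* HV-privileged */
	unsigned int process_table_flushes;
	unsigned int tlb_set_flushes;
};

static inline struct demo_flush_plan
demo_plan_flush_all(bool hv_mode, unsigned int num_sets)
{
	struct demo_flush_plan plan = { 0, 0, 0 };

	/* The partition table cache is only flushed in HV mode. */
	if (hv_mode)
		plan.partition_table_flushes = 1;

	/* The process table cache flush is issued in either mode. */
	plan.process_table_flushes = 1;

	/* Every TLB set, starting from set 0, is flushed in either mode. */
	plan.tlb_set_flushes = num_sets;

	return plan;
}
```

The point of the model is that a guest skips only the partition table
flush; the per-set TLB flushes now cover set 0 as well, in both modes.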
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/mm/book3s64/hash_native.c | 21 ++++++++++++++-------
1 file changed, 14 insertions(+), 7 deletions(-)
diff --git a/arch/powerpc/mm/book3s64/hash_native.c b/arch/powerpc/mm/book3s64/hash_native.c
index 97fa42d7027e..52e170bd95ae 100644
--- a/arch/powerpc/mm/book3s64/hash_native.c
+++ b/arch/powerpc/mm/book3s64/hash_native.c
@@ -92,16 +92,15 @@ static void tlbiel_all_isa300(unsigned int num_sets, unsigned int is)
asm volatile("ptesync": : :"memory");
/*
- * Flush the first set of the TLB, and any caching of partition table
- * entries. Then flush the remaining sets of the TLB. Hash mode uses
- * partition scoped TLB translations.
+ * Flush the partition table cache if this is HV mode.
*/
- tlbiel_hash_set_isa300(0, is, 0, 2, 0);
- for (set = 1; set < num_sets; set++)
- tlbiel_hash_set_isa300(set, is, 0, 0, 0);
+ if (early_cpu_has_feature(CPU_FTR_HVMODE))
+ tlbiel_hash_set_isa300(0, is, 0, 2, 0);
/*
- * Now invalidate the process table cache.
+ * Now invalidate the process table cache. UPRT=0 HPT modes (what
+ * current hardware implements) do not use the process table, but
+ * add the flushes anyway.
*
* From ISA v3.0B p. 1078:
* The following forms are invalid.
@@ -110,6 +109,14 @@ static void tlbiel_all_isa300(unsigned int num_sets, unsigned int is)
*/
tlbiel_hash_set_isa300(0, is, 0, 2, 1);
+ /*
+ * Then flush the sets of the TLB proper. Hash mode uses
+ * partition scoped TLB translations, which may be flushed
+ * in !HV mode.
+ */
+ for (set = 0; set < num_sets; set++)
+ tlbiel_hash_set_isa300(set, is, 0, 0, 0);
+
ppc_after_tlbiel_barrier();
asm volatile(PPC_ISA_3_0_INVALIDATE_ERAT "; isync" : : :"memory");
--
2.23.0
* [PATCH 1/4] powerpc/64s: Fix hash ISA v3.0 TLBIEL instruction generation
From: Nicholas Piggin @ 2020-11-26 10:25 UTC (permalink / raw)
To: linuxppc-dev
Cc: Aneesh Kumar K.V, Paul Mackerras, Nicholas Piggin, Milton Miller
In-Reply-To: <20201126102530.691335-1-npiggin@gmail.com>
A typo has the R field of the instruction assigned by lucky dip a la
register allocator.
Fixes: d4748276ae14c ("powerpc/64s: Improve local TLB flush for boot and MCE on POWER9")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/mm/book3s64/hash_native.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/mm/book3s64/hash_native.c b/arch/powerpc/mm/book3s64/hash_native.c
index 0203cdf48c54..97fa42d7027e 100644
--- a/arch/powerpc/mm/book3s64/hash_native.c
+++ b/arch/powerpc/mm/book3s64/hash_native.c
@@ -68,7 +68,7 @@ static __always_inline void tlbiel_hash_set_isa300(unsigned int set, unsigned in
rs = ((unsigned long)pid << PPC_BITLSHIFT(31));
asm volatile(PPC_TLBIEL(%0, %1, %2, %3, %4)
- : : "r"(rb), "r"(rs), "i"(ric), "i"(prs), "r"(r)
+ : : "r"(rb), "r"(rs), "i"(ric), "i"(prs), "i"(r)
: "memory");
}
--
2.23.0
* [PATCH 0/4] powerpc/64s: Fix for radix TLB invalidation bug
From: Nicholas Piggin @ 2020-11-26 10:25 UTC (permalink / raw)
To: linuxppc-dev
Cc: Aneesh Kumar K.V, Paul Mackerras, Nicholas Piggin, Milton Miller
This fixes a tricky bug that was noticed through TLB multi-hits in a guest
stress testing CPU hotplug. Because it is a TLB invalidation bug, any kind
of data corruption is possible.
Thanks,
Nick
Nicholas Piggin (4):
powerpc/64s: Fix hash ISA v3.0 TLBIEL instruction generation
powerpc/64s/pseries: Fix hash tlbiel_all_isa300 for guest kernels
kernel/cpu: add arch override for clear_tasks_mm_cpumask() mm handling
powerpc/64s: Trim offlined CPUs from mm_cpumasks
arch/powerpc/include/asm/book3s/64/mmu.h | 12 ++++++++++
arch/powerpc/mm/book3s64/hash_native.c | 23 +++++++++++++-------
arch/powerpc/mm/book3s64/mmu_context.c | 20 +++++++++++++++++
arch/powerpc/platforms/powermac/smp.c | 2 ++
arch/powerpc/platforms/powernv/smp.c | 3 +++
arch/powerpc/platforms/pseries/hotplug-cpu.c | 3 +++
kernel/cpu.c | 6 ++++-
7 files changed, 60 insertions(+), 9 deletions(-)
--
2.23.0
* [PATCH v2 31/36] powerpc: asm: hvconsole: Move 'hvc_vio_init_early's prototype to shared location
From: Lee Jones @ 2020-11-26 9:36 UTC (permalink / raw)
To: linux-kernel, Michael Ellerman, Benjamin Herrenschmidt,
Paul Mackerras, linuxppc-dev
In-Reply-To: <20201104193549.4026187-32-lee.jones@linaro.org>
Fixes the following W=1 kernel build warning(s):
drivers/tty/hvc/hvc_vio.c:385:13: warning: no previous prototype for ‘hvc_vio_init_early’ [-Wmissing-prototypes]
385 | void __init hvc_vio_init_early(void)
| ^~~~~~~~~~~~~~~~~~
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
---
v2:
- Removed 'extern' keyword
arch/powerpc/include/asm/hvconsole.h | 3 +++
arch/powerpc/platforms/pseries/pseries.h | 3 ---
arch/powerpc/platforms/pseries/setup.c | 1 +
3 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/include/asm/hvconsole.h b/arch/powerpc/include/asm/hvconsole.h
index 999ed5ac90531..ccb2034506f0f 100644
--- a/arch/powerpc/include/asm/hvconsole.h
+++ b/arch/powerpc/include/asm/hvconsole.h
@@ -24,5 +24,8 @@
extern int hvc_get_chars(uint32_t vtermno, char *buf, int count);
extern int hvc_put_chars(uint32_t vtermno, const char *buf, int count);
+/* Provided by HVC VIO */
+void hvc_vio_init_early(void);
+
#endif /* __KERNEL__ */
#endif /* _PPC64_HVCONSOLE_H */
diff --git a/arch/powerpc/platforms/pseries/pseries.h b/arch/powerpc/platforms/pseries/pseries.h
index 593840847cd3d..693f58d784b5b 100644
--- a/arch/powerpc/platforms/pseries/pseries.h
+++ b/arch/powerpc/platforms/pseries/pseries.h
@@ -43,9 +43,6 @@ extern void pSeries_final_fixup(void);
/* Poweron flag used for enabling auto ups restart */
extern unsigned long rtas_poweron_auto;
-/* Provided by HVC VIO */
-extern void hvc_vio_init_early(void);
-
/* Dynamic logical Partitioning/Mobility */
extern void dlpar_free_cc_nodes(struct device_node *);
extern void dlpar_free_cc_property(struct property *);
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 090c13f6c8815..b5513eefd12c9 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -71,6 +71,7 @@
#include <asm/swiotlb.h>
#include <asm/svm.h>
#include <asm/dtl.h>
+#include <asm/hvconsole.h>
#include "pseries.h"
#include "../../../../drivers/pci/pci.h"
--
2.25.1
^ permalink raw reply related
* Re: [PATCH v6 16/22] powerpc/book3s64/kuap: Improve error reporting with KUAP
From: Michael Ellerman @ 2020-11-26 9:29 UTC (permalink / raw)
To: Aneesh Kumar K.V, Christophe Leroy, linuxppc-dev
In-Reply-To: <87h7pctvdl.fsf@linux.ibm.com>
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>
>> Le 25/11/2020 à 06:16, Aneesh Kumar K.V a écrit :
>>> With hash translation use DSISR_KEYFAULT to identify a wrong access.
>>> With Radix we look at the AMR value and type of fault.
>>>
>>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>>> ---
>>> arch/powerpc/include/asm/book3s/32/kup.h | 4 +--
>>> arch/powerpc/include/asm/book3s/64/kup.h | 27 ++++++++++++++++----
>>> arch/powerpc/include/asm/kup.h | 4 +--
>>> arch/powerpc/include/asm/nohash/32/kup-8xx.h | 4 +--
>>> arch/powerpc/mm/fault.c | 2 +-
>>> 5 files changed, 29 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/arch/powerpc/include/asm/book3s/32/kup.h b/arch/powerpc/include/asm/book3s/32/kup.h
>>> index 32fd4452e960..b18cd931e325 100644
>>> --- a/arch/powerpc/include/asm/book3s/32/kup.h
>>> +++ b/arch/powerpc/include/asm/book3s/32/kup.h
>>> @@ -177,8 +177,8 @@ static inline void restore_user_access(unsigned long flags)
>>> allow_user_access(to, to, end - addr, KUAP_READ_WRITE);
>>> }
>>>
>>> -static inline bool
>>> -bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
>>> +static inline bool bad_kuap_fault(struct pt_regs *regs, unsigned long address,
>>> + bool is_write, unsigned long error_code)
>>> {
>>> unsigned long begin = regs->kuap & 0xf0000000;
>>> unsigned long end = regs->kuap << 28;
>>> diff --git a/arch/powerpc/include/asm/book3s/64/kup.h b/arch/powerpc/include/asm/book3s/64/kup.h
>>> index 4a3d0d601745..2922c442a218 100644
>>> --- a/arch/powerpc/include/asm/book3s/64/kup.h
>>> +++ b/arch/powerpc/include/asm/book3s/64/kup.h
>>> @@ -301,12 +301,29 @@ static inline void set_kuap(unsigned long value)
>>> isync();
>>> }
>>>
>>> -static inline bool
>>> -bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
>>> +#define RADIX_KUAP_BLOCK_READ UL(0x4000000000000000)
>>> +#define RADIX_KUAP_BLOCK_WRITE UL(0x8000000000000000)
>>> +
>>> +static inline bool bad_kuap_fault(struct pt_regs *regs, unsigned long address,
>>> + bool is_write, unsigned long error_code)
>>> {
>>> - return WARN(mmu_has_feature(MMU_FTR_KUAP) &&
>>> - (regs->kuap & (is_write ? AMR_KUAP_BLOCK_WRITE : AMR_KUAP_BLOCK_READ)),
>>> - "Bug: %s fault blocked by AMR!", is_write ? "Write" : "Read");
>>> + if (!mmu_has_feature(MMU_FTR_KUAP))
>>> + return false;
>>> +
>>> + if (radix_enabled()) {
>>> + /*
>>> + * Will be a storage protection fault.
>>> + * Only check the details of AMR[0]
>>> + */
>>> + return WARN((regs->kuap & (is_write ? RADIX_KUAP_BLOCK_WRITE : RADIX_KUAP_BLOCK_READ)),
>>> + "Bug: %s fault blocked by AMR!", is_write ? "Write" : "Read");
>>
>> I think it is pointless to keep the WARN() here.
>>
>> I have a series aiming at removing them. See
>> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/cc9129bdda1dbc2f0a09cf45fece7d0b0e690784.1605541983.git.christophe.leroy@csgroup.eu/
>
> Can we do this as a separate patch as you posted above? We can drop the
> WARN in that while keeping the hash branch to look at DSISR value.
Yeah we can reconcile Christophe's series with yours later.
I'm still not 100% convinced I want to drop that WARN.
cheers
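For readers following the thread, here is a minimal userspace sketch of the radix check under discussion. It is not the kernel function: the name radix_kuap_blocked_sketch() is hypothetical, the WARN() side effect is left out, and only the two mask values are taken from the quoted patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the quoted patch. */
#define RADIX_KUAP_BLOCK_READ  UINT64_C(0x4000000000000000)
#define RADIX_KUAP_BLOCK_WRITE UINT64_C(0x8000000000000000)

/*
 * Hypothetical userspace stand-in for the radix branch of bad_kuap_fault():
 * returns true when the saved AMR value blocks the kind of access that
 * faulted. The WARN() side effect is deliberately left out.
 */
static bool radix_kuap_blocked_sketch(uint64_t saved_amr, bool is_write)
{
	uint64_t mask = is_write ? RADIX_KUAP_BLOCK_WRITE : RADIX_KUAP_BLOCK_READ;

	return (saved_amr & mask) != 0;
}
```

Whether the real function should also WARN() when this returns true is exactly the point left open in the reply above.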
^ permalink raw reply
* Re: [PATCH 1/2] powerpc: sstep: Fix load and update instructions
From: Sandipan Das @ 2020-11-26 9:06 UTC (permalink / raw)
To: Ravi Bangoria; +Cc: jniethe5, paulus, naveen.n.rao, linuxppc-dev, dja
In-Reply-To: <daf02936-8a92-e909-7495-7a48f01cfe31@linux.ibm.com>
Hi,
On 25/11/20 3:39 pm, Ravi Bangoria wrote:
>
>> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
>> index 855457ed09b5..25a5436be6c6 100644
>> --- a/arch/powerpc/lib/sstep.c
>> +++ b/arch/powerpc/lib/sstep.c
>> @@ -2157,11 +2157,15 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
>> case 23: /* lwzx */
>> case 55: /* lwzux */
>> + if (u && (ra == 0 || ra == rd))
>> + return -1;
>
> I guess you also need to split case 23 and 55?
>
'u' takes care of that. It will be set for lwzux but not lwzx.
- Sandipan
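A small userspace sketch of why one shared case works for both opcodes. The helper names are hypothetical, and the derivation of the update flag from bit 0x40 of the instruction word is an assumption made here for illustration: it is the only bit in which extended opcodes 23 and 55 differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* X-form extended opcodes under primary opcode 31, as in the quoted hunk. */
#define XO_LWZX  23u
#define XO_LWZUX 55u

/* Assemble an X-form word: opcode 31 | RT | RA | RB | XO (bit 0 is Rc). */
static uint32_t xform_word_sketch(uint32_t xo, uint32_t rd, uint32_t ra,
				  uint32_t rb)
{
	return (31u << 26) | (rd << 21) | (ra << 16) | (rb << 11) | (xo << 1);
}

/*
 * Hypothetical helper mirroring the check in the patch. Assumption: the
 * update flag is bit 0x40 of the word, the only bit in which extended
 * opcodes 23 and 55 differ.
 */
static bool invalid_load_update_sketch(uint32_t word)
{
	uint32_t rd = (word >> 21) & 0x1f;
	uint32_t ra = (word >> 16) & 0x1f;
	bool u = (word & 0x40) != 0;

	return u && (ra == 0 || ra == rd);
}
```

With this model, lwzx is never rejected regardless of ra, while lwzux is rejected only when ra is 0 or equal to rd.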
^ permalink raw reply
* Re: [PATCH V4 2/5] ocxl: Initiate a TLB invalidate command
From: Frederic Barrat @ 2020-11-26 9:00 UTC (permalink / raw)
To: Christophe Lombard, linuxppc-dev, fbarrat, ajd
In-Reply-To: <20201125155013.39955-3-clombard@linux.vnet.ibm.com>
On 25/11/2020 16:50, Christophe Lombard wrote:
> When a TLB Invalidate is required for the Logical Partition, the following
> sequence has to be performed:
>
> 1. Load MMIO ATSD AVA register with the necessary value, if required.
> 2. Write the MMIO ATSD launch register to initiate the TLB Invalidate
> command.
> 3. Poll the MMIO ATSD status register to determine when the TLB Invalidate
> has been completed.
>
> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
> ---
Thanks!
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
> arch/powerpc/include/asm/pnv-ocxl.h | 51 ++++++++++++++++++++
> arch/powerpc/platforms/powernv/ocxl.c | 69 +++++++++++++++++++++++++++
> 2 files changed, 120 insertions(+)
>
> diff --git a/arch/powerpc/include/asm/pnv-ocxl.h b/arch/powerpc/include/asm/pnv-ocxl.h
> index 60c3c74427d9..9acd1fbf1197 100644
> --- a/arch/powerpc/include/asm/pnv-ocxl.h
> +++ b/arch/powerpc/include/asm/pnv-ocxl.h
> @@ -3,12 +3,59 @@
> #ifndef _ASM_PNV_OCXL_H
> #define _ASM_PNV_OCXL_H
>
> +#include <linux/bitfield.h>
> #include <linux/pci.h>
>
> #define PNV_OCXL_TL_MAX_TEMPLATE 63
> #define PNV_OCXL_TL_BITS_PER_RATE 4
> #define PNV_OCXL_TL_RATE_BUF_SIZE ((PNV_OCXL_TL_MAX_TEMPLATE+1) * PNV_OCXL_TL_BITS_PER_RATE / 8)
>
> +#define PNV_OCXL_ATSD_TIMEOUT 1
> +
> +/* TLB Management Instructions */
> +#define PNV_OCXL_ATSD_LNCH 0x00
> +/* Radix Invalidate */
> +#define PNV_OCXL_ATSD_LNCH_R PPC_BIT(0)
> +/* Radix Invalidation Control
> + * 0b00 Just invalidate TLB.
> + * 0b01 Invalidate just Page Walk Cache.
> + * 0b10 Invalidate TLB, Page Walk Cache, and any
> + * caching of Partition and Process Table Entries.
> + */
> +#define PNV_OCXL_ATSD_LNCH_RIC PPC_BITMASK(1, 2)
> +/* Number and Page Size of translations to be invalidated */
> +#define PNV_OCXL_ATSD_LNCH_LP PPC_BITMASK(3, 10)
> +/* Invalidation Criteria
> + * 0b00 Invalidate just the target VA.
> + * 0b01 Invalidate matching PID.
> + */
> +#define PNV_OCXL_ATSD_LNCH_IS PPC_BITMASK(11, 12)
> +/* 0b1: Process Scope, 0b0: Partition Scope */
> +#define PNV_OCXL_ATSD_LNCH_PRS PPC_BIT(13)
> +/* Invalidation Flag */
> +#define PNV_OCXL_ATSD_LNCH_B PPC_BIT(14)
> +/* Actual Page Size to be invalidated
> + * 000 4KB
> + * 101 64KB
> + * 001 2MB
> + * 010 1GB
> + */
> +#define PNV_OCXL_ATSD_LNCH_AP PPC_BITMASK(15, 17)
> +/* Defines the large page select
> + * L=0b0 for 4KB pages
> + * L=0b1 for large pages)
> + */
> +#define PNV_OCXL_ATSD_LNCH_L PPC_BIT(18)
> +/* Process ID */
> +#define PNV_OCXL_ATSD_LNCH_PID PPC_BITMASK(19, 38)
> +/* NoFlush – Assumed to be 0b0 */
> +#define PNV_OCXL_ATSD_LNCH_F PPC_BIT(39)
> +#define PNV_OCXL_ATSD_LNCH_OCAPI_SLBI PPC_BIT(40)
> +#define PNV_OCXL_ATSD_LNCH_OCAPI_SINGLETON PPC_BIT(41)
> +#define PNV_OCXL_ATSD_AVA 0x08
> +#define PNV_OCXL_ATSD_AVA_AVA PPC_BITMASK(0, 51)
> +#define PNV_OCXL_ATSD_STAT 0x10
> +
> int pnv_ocxl_get_actag(struct pci_dev *dev, u16 *base, u16 *enabled, u16 *supported);
> int pnv_ocxl_get_pasid_count(struct pci_dev *dev, int *count);
>
> @@ -31,4 +78,8 @@ int pnv_ocxl_spa_remove_pe_from_cache(void *platform_data, int pe_handle);
> int pnv_ocxl_map_lpar(struct pci_dev *dev, uint64_t lparid,
> uint64_t lpcr, void __iomem **arva);
> void pnv_ocxl_unmap_lpar(void __iomem *arva);
> +void pnv_ocxl_tlb_invalidate(void __iomem *arva,
> + unsigned long pid,
> + unsigned long addr,
> + unsigned long page_size);
> #endif /* _ASM_PNV_OCXL_H */
> diff --git a/arch/powerpc/platforms/powernv/ocxl.c b/arch/powerpc/platforms/powernv/ocxl.c
> index 57fc1062677b..9105efcf242a 100644
> --- a/arch/powerpc/platforms/powernv/ocxl.c
> +++ b/arch/powerpc/platforms/powernv/ocxl.c
> @@ -528,3 +528,72 @@ void pnv_ocxl_unmap_lpar(void __iomem *arva)
> iounmap(arva);
> }
> EXPORT_SYMBOL_GPL(pnv_ocxl_unmap_lpar);
> +
> +void pnv_ocxl_tlb_invalidate(void __iomem *arva,
> + unsigned long pid,
> + unsigned long addr,
> + unsigned long page_size)
> +{
> + unsigned long timeout = jiffies + (HZ * PNV_OCXL_ATSD_TIMEOUT);
> + u64 val = 0ull;
> + int pend;
> + u8 size;
> +
> + if (!(arva))
> + return;
> +
> + if (addr) {
> + /* load Abbreviated Virtual Address register with
> + * the necessary value
> + */
> + val |= FIELD_PREP(PNV_OCXL_ATSD_AVA_AVA, addr >> (63-51));
> + out_be64(arva + PNV_OCXL_ATSD_AVA, val);
> + }
> +
> + /* Write access initiates a shoot down to initiate the
> + * TLB Invalidate command
> + */
> + val = PNV_OCXL_ATSD_LNCH_R;
> + val |= FIELD_PREP(PNV_OCXL_ATSD_LNCH_RIC, 0b10);
> + if (addr)
> + val |= FIELD_PREP(PNV_OCXL_ATSD_LNCH_IS, 0b00);
> + else {
> + val |= FIELD_PREP(PNV_OCXL_ATSD_LNCH_IS, 0b01);
> + val |= PNV_OCXL_ATSD_LNCH_OCAPI_SINGLETON;
> + }
> + val |= PNV_OCXL_ATSD_LNCH_PRS;
> + /* Actual Page Size to be invalidated
> + * 000 4KB
> + * 101 64KB
> + * 001 2MB
> + * 010 1GB
> + */
> + size = 0b101;
> + if (page_size == 0x1000)
> + size = 0b000;
> + if (page_size == 0x200000)
> + size = 0b001;
> + if (page_size == 0x40000000)
> + size = 0b010;
> + val |= FIELD_PREP(PNV_OCXL_ATSD_LNCH_AP, size);
> + val |= FIELD_PREP(PNV_OCXL_ATSD_LNCH_PID, pid);
> + out_be64(arva + PNV_OCXL_ATSD_LNCH, val);
> +
> + /* Poll the ATSD status register to determine when the
> + * TLB Invalidate has been completed.
> + */
> + val = in_be64(arva + PNV_OCXL_ATSD_STAT);
> + pend = val >> 63;
> +
> + while (pend) {
> + if (time_after_eq(jiffies, timeout)) {
> + pr_err("%s - Timeout while reading XTS MMIO ATSD status register (val=%#llx, pidr=0x%lx)\n",
> + __func__, val, pid);
> + return;
> + }
> + cpu_relax();
> + val = in_be64(arva + PNV_OCXL_ATSD_STAT);
> + pend = val >> 63;
> + }
> +}
> +EXPORT_SYMBOL_GPL(pnv_ocxl_tlb_invalidate);
>
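The launch-register encoding in the quoted patch can be tried out in userspace with the sketch below. The field layout is copied from the quoted header. Everything with a _sketch suffix is a hypothetical stand-in (field_prep_sketch() replaces the kernel's FIELD_PREP()), and no MMIO access is modelled.

```c
#include <stdint.h>

/* IBM bit numbering: bit 0 is the most significant bit of a 64-bit value. */
#define PPC_BIT(n)          (UINT64_C(1) << (63 - (n)))
#define PPC_BITMASK(bs, be) ((PPC_BIT(bs) - PPC_BIT(be)) | PPC_BIT(bs))

/* Field layout copied from the quoted header. */
#define ATSD_LNCH_R               PPC_BIT(0)
#define ATSD_LNCH_RIC             PPC_BITMASK(1, 2)
#define ATSD_LNCH_IS              PPC_BITMASK(11, 12)
#define ATSD_LNCH_PRS             PPC_BIT(13)
#define ATSD_LNCH_AP              PPC_BITMASK(15, 17)
#define ATSD_LNCH_PID             PPC_BITMASK(19, 38)
#define ATSD_LNCH_OCAPI_SINGLETON PPC_BIT(41)

/* Stand-in for FIELD_PREP(): shift val to the mask's lowest set bit. */
static uint64_t field_prep_sketch(uint64_t mask, uint64_t val)
{
	uint64_t lowest_bit = mask & (~mask + 1);

	return (val * lowest_bit) & mask;
}

/* Actual Page Size encoding; other sizes fall back to 64KB as in the patch. */
static uint8_t atsd_ap_encoding_sketch(unsigned long page_size)
{
	switch (page_size) {
	case 0x1000:     return 0x0; /* 4KB */
	case 0x200000:   return 0x1; /* 2MB */
	case 0x40000000: return 0x2; /* 1GB */
	default:         return 0x5; /* 64KB */
	}
}

/* Build the launch-register value the way the quoted function does. */
static uint64_t atsd_launch_value_sketch(unsigned long pid, unsigned long addr,
					 unsigned long page_size)
{
	uint64_t val = ATSD_LNCH_R;

	val |= field_prep_sketch(ATSD_LNCH_RIC, 0x2);
	if (addr) {
		val |= field_prep_sketch(ATSD_LNCH_IS, 0x0); /* target VA only */
	} else {
		val |= field_prep_sketch(ATSD_LNCH_IS, 0x1); /* matching PID */
		val |= ATSD_LNCH_OCAPI_SINGLETON;
	}
	val |= ATSD_LNCH_PRS;
	val |= field_prep_sketch(ATSD_LNCH_AP, atsd_ap_encoding_sketch(page_size));
	val |= field_prep_sketch(ATSD_LNCH_PID, pid);
	return val;
}
```

Because the masks use IBM bit numbering, the PID field (bits 19 to 38) ends up at a shift of 25 from the least significant bit, and the AP field (bits 15 to 17) at a shift of 46.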
^ permalink raw reply
* [PATCH v4 2/2] powerpc/pseries: Pass MSI affinity to irq_create_mapping()
From: Laurent Vivier @ 2020-11-26 8:28 UTC (permalink / raw)
To: linux-kernel
Cc: Laurent Vivier, Michael S . Tsirkin, linux-pci, Greg Kurz,
linux-block, Paul Mackerras, Marc Zyngier, Thomas Gleixner,
linuxppc-dev, Christoph Hellwig
In-Reply-To: <20201126082852.1178497-1-lvivier@redhat.com>
With virtio multiqueue, normally each queue IRQ is mapped to a CPU.
Commit 0d9f0a52c8b9f ("virtio_scsi: use virtio IRQ affinity") exposed
an existing shortcoming of the arch code by moving virtio_scsi to
the automatic IRQ affinity assignment.
The affinity is correctly computed in msi_desc but this is not applied
to the system IRQs.
It appears the affinity is correctly passed to rtas_setup_msi_irqs() but
lost at this point and never passed to irq_domain_alloc_descs()
(see commit 06ee6d571f0e ("genirq: Add affinity hint to irq allocation"))
because irq_create_mapping() doesn't take an affinity parameter.
As the previous patch has added irq_create_mapping_affinity(), which takes
an affinity parameter, we can forward the affinity from rtas_setup_msi_irqs()
to irq_domain_alloc_descs().
With this change, the virtqueues are correctly dispatched between the CPUs
on pseries.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
---
arch/powerpc/platforms/pseries/msi.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/platforms/pseries/msi.c b/arch/powerpc/platforms/pseries/msi.c
index 133f6adcb39c..b3ac2455faad 100644
--- a/arch/powerpc/platforms/pseries/msi.c
+++ b/arch/powerpc/platforms/pseries/msi.c
@@ -458,7 +458,8 @@ static int rtas_setup_msi_irqs(struct pci_dev *pdev, int nvec_in, int type)
return hwirq;
}
- virq = irq_create_mapping(NULL, hwirq);
+ virq = irq_create_mapping_affinity(NULL, hwirq,
+ entry->affinity);
if (!virq) {
pr_debug("rtas_msi: Failed mapping hwirq %d\n", hwirq);
--
2.28.0
^ permalink raw reply related
* [PATCH v4 1/2] genirq/irqdomain: Add an irq_create_mapping_affinity() function
From: Laurent Vivier @ 2020-11-26 8:28 UTC (permalink / raw)
To: linux-kernel
Cc: Laurent Vivier, Michael S . Tsirkin, linux-pci, Greg Kurz,
linux-block, Paul Mackerras, Marc Zyngier, Thomas Gleixner,
linuxppc-dev, Christoph Hellwig
In-Reply-To: <20201126082852.1178497-1-lvivier@redhat.com>
There is currently no way to convey the affinity of an interrupt
via irq_create_mapping(), which creates issues for devices that
expect that affinity to be managed by the kernel.
In order to sort this out, rename irq_create_mapping() to
irq_create_mapping_affinity() with an additional affinity parameter
that can conveniently be passed down to irq_domain_alloc_descs().
irq_create_mapping() is then re-implemented as a wrapper around
irq_create_mapping_affinity().
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
---
include/linux/irqdomain.h | 12 ++++++++++--
kernel/irq/irqdomain.c | 13 ++++++++-----
2 files changed, 18 insertions(+), 7 deletions(-)
diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
index 71535e87109f..ea5a337e0f8b 100644
--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -384,11 +384,19 @@ extern void irq_domain_associate_many(struct irq_domain *domain,
extern void irq_domain_disassociate(struct irq_domain *domain,
unsigned int irq);
-extern unsigned int irq_create_mapping(struct irq_domain *host,
- irq_hw_number_t hwirq);
+extern unsigned int irq_create_mapping_affinity(struct irq_domain *host,
+ irq_hw_number_t hwirq,
+ const struct irq_affinity_desc *affinity);
extern unsigned int irq_create_fwspec_mapping(struct irq_fwspec *fwspec);
extern void irq_dispose_mapping(unsigned int virq);
+static inline unsigned int irq_create_mapping(struct irq_domain *host,
+ irq_hw_number_t hwirq)
+{
+ return irq_create_mapping_affinity(host, hwirq, NULL);
+}
+
+
/**
* irq_linear_revmap() - Find a linux irq from a hw irq number.
* @domain: domain owning this hardware interrupt
diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index cf8b374b892d..e4ca69608f3b 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -624,17 +624,19 @@ unsigned int irq_create_direct_mapping(struct irq_domain *domain)
EXPORT_SYMBOL_GPL(irq_create_direct_mapping);
/**
- * irq_create_mapping() - Map a hardware interrupt into linux irq space
+ * irq_create_mapping_affinity() - Map a hardware interrupt into linux irq space
* @domain: domain owning this hardware interrupt or NULL for default domain
* @hwirq: hardware irq number in that domain space
+ * @affinity: irq affinity
*
* Only one mapping per hardware interrupt is permitted. Returns a linux
* irq number.
* If the sense/trigger is to be specified, set_irq_type() should be called
* on the number returned from that call.
*/
-unsigned int irq_create_mapping(struct irq_domain *domain,
- irq_hw_number_t hwirq)
+unsigned int irq_create_mapping_affinity(struct irq_domain *domain,
+ irq_hw_number_t hwirq,
+ const struct irq_affinity_desc *affinity)
{
struct device_node *of_node;
int virq;
@@ -660,7 +662,8 @@ unsigned int irq_create_mapping(struct irq_domain *domain,
}
/* Allocate a virtual interrupt number */
- virq = irq_domain_alloc_descs(-1, 1, hwirq, of_node_to_nid(of_node), NULL);
+ virq = irq_domain_alloc_descs(-1, 1, hwirq, of_node_to_nid(of_node),
+ affinity);
if (virq <= 0) {
pr_debug("-> virq allocation failed\n");
return 0;
@@ -676,7 +679,7 @@ unsigned int irq_create_mapping(struct irq_domain *domain,
return virq;
}
-EXPORT_SYMBOL_GPL(irq_create_mapping);
+EXPORT_SYMBOL_GPL(irq_create_mapping_affinity);
/**
* irq_create_strict_mappings() - Map a range of hw irqs to fixed linux irqs
--
2.28.0
^ permalink raw reply related
* [PATCH v4 0/2] powerpc/pseries: fix MSI/X IRQ affinity on pseries
From: Laurent Vivier @ 2020-11-26 8:28 UTC (permalink / raw)
To: linux-kernel
Cc: Laurent Vivier, Michael S . Tsirkin, linux-pci, Greg Kurz,
linux-block, Paul Mackerras, Marc Zyngier, Thomas Gleixner,
linuxppc-dev, Christoph Hellwig
With virtio, in the multiqueue case, each queue IRQ is normally
bound to a different CPU using the affinity mask.
This works fine on x86_64 but is totally ignored on pseries.
This is not obvious at first look because irqbalance is doing
some balancing to improve that.
It appears that the "managed" flag set in the MSI entry
is never copied to the system IRQ entry.
This series passes the affinity mask from rtas_setup_msi_irqs()
to irq_domain_alloc_descs() by adding irq_create_mapping_affinity(),
a variant of irq_create_mapping() that takes an affinity parameter.
The first patch adds the new function (no functional change), the
second patch passes the actual affinity mask to
irq_create_mapping_affinity() in rtas_setup_msi_irqs().
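As a rough illustration of the shape of the change, the following userspace sketch shows an affinity-aware entry point plus a thin wrapper that keeps the old call signature. All names and types here are hypothetical stand-ins, not the kernel API. The allocator stub only records the pointer it was handed so the flow can be observed.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel type; only the shape matters here. */
struct irq_affinity_desc_sketch {
	unsigned long cpu_mask;
};

/* Records what the allocator was handed, so the flow can be observed. */
static const struct irq_affinity_desc_sketch *last_affinity_sketch;

/* Stand-in for the descriptor allocator: remembers the affinity. */
static int alloc_descs_sketch(unsigned long hwirq,
			      const struct irq_affinity_desc_sketch *affinity)
{
	last_affinity_sketch = affinity;
	return (int)hwirq + 16; /* arbitrary non-zero virq for the sketch */
}

/* New entry point: the caller's affinity is forwarded to the allocator. */
static unsigned int create_mapping_affinity_sketch(unsigned long hwirq,
		const struct irq_affinity_desc_sketch *affinity)
{
	int virq = alloc_descs_sketch(hwirq, affinity);

	return virq <= 0 ? 0 : (unsigned int)virq;
}

/* Old entry point kept as a wrapper, so existing callers see no change. */
static unsigned int create_mapping_sketch(unsigned long hwirq)
{
	return create_mapping_affinity_sketch(hwirq, NULL);
}
```

Keeping the old name as a wrapper that passes NULL is what lets the first patch stay free of functional change.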
For instance, with a 32-CPU VM and a 32-queue virtio-scsi interface:
... -smp 32 -device virtio-scsi-pci,id=virtio_scsi_pci0,num_queues=32
for IRQ in $(grep virtio2-request /proc/interrupts |cut -d: -f1); do
for file in /proc/irq/$IRQ/ ; do
echo -n "IRQ: $(basename $file) CPU: " ; cat $file/smp_affinity_list
done
done
Without the patch (and without irqbalance):
IRQ: 268 CPU: 0-31
IRQ: 269 CPU: 0-31
IRQ: 270 CPU: 0-31
IRQ: 271 CPU: 0-31
IRQ: 272 CPU: 0-31
IRQ: 273 CPU: 0-31
IRQ: 274 CPU: 0-31
IRQ: 275 CPU: 0-31
IRQ: 276 CPU: 0-31
IRQ: 277 CPU: 0-31
IRQ: 278 CPU: 0-31
IRQ: 279 CPU: 0-31
IRQ: 280 CPU: 0-31
IRQ: 281 CPU: 0-31
IRQ: 282 CPU: 0-31
IRQ: 283 CPU: 0-31
IRQ: 284 CPU: 0-31
IRQ: 285 CPU: 0-31
IRQ: 286 CPU: 0-31
IRQ: 287 CPU: 0-31
IRQ: 288 CPU: 0-31
IRQ: 289 CPU: 0-31
IRQ: 290 CPU: 0-31
IRQ: 291 CPU: 0-31
IRQ: 292 CPU: 0-31
IRQ: 293 CPU: 0-31
IRQ: 294 CPU: 0-31
IRQ: 295 CPU: 0-31
IRQ: 296 CPU: 0-31
IRQ: 297 CPU: 0-31
IRQ: 298 CPU: 0-31
IRQ: 299 CPU: 0-31
With the patch:
IRQ: 265 CPU: 0
IRQ: 266 CPU: 1
IRQ: 267 CPU: 2
IRQ: 268 CPU: 3
IRQ: 269 CPU: 4
IRQ: 270 CPU: 5
IRQ: 271 CPU: 6
IRQ: 272 CPU: 7
IRQ: 273 CPU: 8
IRQ: 274 CPU: 9
IRQ: 275 CPU: 10
IRQ: 276 CPU: 11
IRQ: 277 CPU: 12
IRQ: 278 CPU: 13
IRQ: 279 CPU: 14
IRQ: 280 CPU: 15
IRQ: 281 CPU: 16
IRQ: 282 CPU: 17
IRQ: 283 CPU: 18
IRQ: 284 CPU: 19
IRQ: 285 CPU: 20
IRQ: 286 CPU: 21
IRQ: 287 CPU: 22
IRQ: 288 CPU: 23
IRQ: 289 CPU: 24
IRQ: 290 CPU: 25
IRQ: 291 CPU: 26
IRQ: 292 CPU: 27
IRQ: 293 CPU: 28
IRQ: 294 CPU: 29
IRQ: 295 CPU: 30
IRQ: 299 CPU: 31
This matches what we have on an x86_64 system.
v4: update changelog of PATCH 2, add Michael's Acked-by
v3: update changelog of PATCH 1 with comments from Thomas Gleixner and
Marc Zyngier.
v2: add a wrapper around original irq_create_mapping() with the
affinity parameter. Update comments
Laurent Vivier (2):
genirq/irqdomain: Add an irq_create_mapping_affinity() function
powerpc/pseries: Pass MSI affinity to irq_create_mapping()
arch/powerpc/platforms/pseries/msi.c | 3 ++-
include/linux/irqdomain.h | 12 ++++++++++--
kernel/irq/irqdomain.c | 13 ++++++++-----
3 files changed, 20 insertions(+), 8 deletions(-)
--
2.28.0
^ permalink raw reply
* Re: [PATCH v6 16/22] powerpc/book3s64/kuap: Improve error reporting with KUAP
From: Aneesh Kumar K.V @ 2020-11-26 7:44 UTC (permalink / raw)
To: Christophe Leroy, linuxppc-dev, mpe
In-Reply-To: <bd854266-6cb5-3a04-ae80-a53e03f1e1d3@csgroup.eu>
Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> Le 25/11/2020 à 06:16, Aneesh Kumar K.V a écrit :
>> With hash translation use DSISR_KEYFAULT to identify a wrong access.
>> With Radix we look at the AMR value and type of fault.
>>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> ---
>> arch/powerpc/include/asm/book3s/32/kup.h | 4 +--
>> arch/powerpc/include/asm/book3s/64/kup.h | 27 ++++++++++++++++----
>> arch/powerpc/include/asm/kup.h | 4 +--
>> arch/powerpc/include/asm/nohash/32/kup-8xx.h | 4 +--
>> arch/powerpc/mm/fault.c | 2 +-
>> 5 files changed, 29 insertions(+), 12 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/book3s/32/kup.h b/arch/powerpc/include/asm/book3s/32/kup.h
>> index 32fd4452e960..b18cd931e325 100644
>> --- a/arch/powerpc/include/asm/book3s/32/kup.h
>> +++ b/arch/powerpc/include/asm/book3s/32/kup.h
>> @@ -177,8 +177,8 @@ static inline void restore_user_access(unsigned long flags)
>> allow_user_access(to, to, end - addr, KUAP_READ_WRITE);
>> }
>>
>> -static inline bool
>> -bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
>> +static inline bool bad_kuap_fault(struct pt_regs *regs, unsigned long address,
>> + bool is_write, unsigned long error_code)
>> {
>> unsigned long begin = regs->kuap & 0xf0000000;
>> unsigned long end = regs->kuap << 28;
>> diff --git a/arch/powerpc/include/asm/book3s/64/kup.h b/arch/powerpc/include/asm/book3s/64/kup.h
>> index 4a3d0d601745..2922c442a218 100644
>> --- a/arch/powerpc/include/asm/book3s/64/kup.h
>> +++ b/arch/powerpc/include/asm/book3s/64/kup.h
>> @@ -301,12 +301,29 @@ static inline void set_kuap(unsigned long value)
>> isync();
>> }
>>
>> -static inline bool
>> -bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
>> +#define RADIX_KUAP_BLOCK_READ UL(0x4000000000000000)
>> +#define RADIX_KUAP_BLOCK_WRITE UL(0x8000000000000000)
>> +
>> +static inline bool bad_kuap_fault(struct pt_regs *regs, unsigned long address,
>> + bool is_write, unsigned long error_code)
>> {
>> - return WARN(mmu_has_feature(MMU_FTR_KUAP) &&
>> - (regs->kuap & (is_write ? AMR_KUAP_BLOCK_WRITE : AMR_KUAP_BLOCK_READ)),
>> - "Bug: %s fault blocked by AMR!", is_write ? "Write" : "Read");
>> + if (!mmu_has_feature(MMU_FTR_KUAP))
>> + return false;
>> +
>> + if (radix_enabled()) {
>> + /*
>> + * Will be a storage protection fault.
>> + * Only check the details of AMR[0]
>> + */
>> + return WARN((regs->kuap & (is_write ? RADIX_KUAP_BLOCK_WRITE : RADIX_KUAP_BLOCK_READ)),
>> + "Bug: %s fault blocked by AMR!", is_write ? "Write" : "Read");
>
> I think it is pointless to keep the WARN() here.
>
> I have a series aiming at removing them. See
> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/cc9129bdda1dbc2f0a09cf45fece7d0b0e690784.1605541983.git.christophe.leroy@csgroup.eu/
Can we do this as a separate patch as you posted above? We can drop the
WARN in that while keeping the hash branch to look at DSISR value.
-aneesh
^ permalink raw reply
* Re: [PATCH v6 09/22] powerpc/exec: Set thread.regs early during exec
From: Christophe Leroy @ 2020-11-26 7:43 UTC (permalink / raw)
To: Aneesh Kumar K.V, linuxppc-dev, mpe
In-Reply-To: <87k0u8tvoj.fsf@linux.ibm.com>
Le 26/11/2020 à 08:38, Aneesh Kumar K.V a écrit :
> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>
>> Le 25/11/2020 à 06:16, Aneesh Kumar K.V a écrit :
> ....
>
>> +++ b/arch/powerpc/kernel/process.c
>>> @@ -1530,10 +1530,32 @@ void flush_thread(void)
>>> #ifdef CONFIG_PPC_BOOK3S_64
>>> void arch_setup_new_exec(void)
>>> {
>>> - if (radix_enabled())
>>> - return;
>>> - hash__setup_new_exec();
>>> + if (!radix_enabled())
>>> + hash__setup_new_exec();
>>> +
>>> + /*
>>> + * If we exec out of a kernel thread then thread.regs will not be
>>> + * set. Do it now.
>>> + */
>>> + if (!current->thread.regs) {
>>> + struct pt_regs *regs = task_stack_page(current) + THREAD_SIZE;
>>> + current->thread.regs = regs - 1;
>>> + }
>>> +
>>> +}
>>> +#else
>>> +void arch_setup_new_exec(void)
>>> +{
>>> + /*
>>> + * If we exec out of a kernel thread then thread.regs will not be
>>> + * set. Do it now.
>>> + */
>>> + if (!current->thread.regs) {
>>> + struct pt_regs *regs = task_stack_page(current) + THREAD_SIZE;
>>> + current->thread.regs = regs - 1;
>>> + }
>>> }
>>> +
>>> #endif
>>
>> No need to duplicate arch_setup_new_exec() I think. radix_enabled() is defined at all times so the
>> first function should be valid at all times.
>>
>
> arch/powerpc/kernel/process.c: In function ‘arch_setup_new_exec’:
> arch/powerpc/kernel/process.c:1529:3: error: implicit declaration of function ‘hash__setup_new_exec’; did you mean ‘arch_setup_new_exec’? [-Werror=implicit-function-declaration]
> 1529 | hash__setup_new_exec();
> | ^~~~~~~~~~~~~~~~~~~~
> | arch_setup_new_exec
>
>
> That requires us to have hash__setup_new_exec prototype for all platforms.
Yes indeed.
So maybe just enclose that part in the #ifdef instead of duplicating the common part?
Christophe
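One way to read this suggestion, as a userspace sketch only: a single function body, with just the hash-specific call guarded by the #ifdef. Every identifier ending in _sketch is hypothetical, the stack size is an arbitrary illustrative value, and the config macro is a stand-in for CONFIG_PPC_BOOK3S_64.

```c
#include <stddef.h>

#define THREAD_SIZE_SKETCH 16384 /* assumed stack size, for illustration only */

struct pt_regs_sketch {
	unsigned long gpr[32];
};

struct task_sketch {
	char *stack;                 /* base of the kernel stack */
	struct pt_regs_sketch *regs; /* NULL when exec'ing out of a kernel thread */
	int hash_setup_calls;
};

#ifdef CONFIG_PPC_BOOK3S_64_SKETCH
static int radix_enabled_sketch(void)
{
	return 0;
}

static void hash_setup_new_exec_sketch(struct task_sketch *t)
{
	t->hash_setup_calls++;
}
#endif

static void arch_setup_new_exec_sketch(struct task_sketch *t)
{
#ifdef CONFIG_PPC_BOOK3S_64_SKETCH
	/* Only this part is configuration specific. */
	if (!radix_enabled_sketch())
		hash_setup_new_exec_sketch(t);
#endif
	/* Common part: the register frame sits at the very top of the stack. */
	if (!t->regs) {
		struct pt_regs_sketch *top =
			(struct pt_regs_sketch *)(t->stack + THREAD_SIZE_SKETCH);

		t->regs = top - 1;
	}
}
```

The "regs - 1" step is plain pointer arithmetic: it moves back by one whole register frame from the end of the stack, so the frame occupies the last sizeof(struct pt_regs_sketch) bytes.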
^ permalink raw reply
* Re: [PATCH v6 09/22] powerpc/exec: Set thread.regs early during exec
From: Aneesh Kumar K.V @ 2020-11-26 7:38 UTC (permalink / raw)
To: Christophe Leroy, linuxppc-dev, mpe
In-Reply-To: <f5960226-f451-41ed-2992-bbe0acf9d190@csgroup.eu>
Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> Le 25/11/2020 à 06:16, Aneesh Kumar K.V a écrit :
....
> +++ b/arch/powerpc/kernel/process.c
>> @@ -1530,10 +1530,32 @@ void flush_thread(void)
>> #ifdef CONFIG_PPC_BOOK3S_64
>> void arch_setup_new_exec(void)
>> {
>> - if (radix_enabled())
>> - return;
>> - hash__setup_new_exec();
>> + if (!radix_enabled())
>> + hash__setup_new_exec();
>> +
>> + /*
>> + * If we exec out of a kernel thread then thread.regs will not be
>> + * set. Do it now.
>> + */
>> + if (!current->thread.regs) {
>> + struct pt_regs *regs = task_stack_page(current) + THREAD_SIZE;
>> + current->thread.regs = regs - 1;
>> + }
>> +
>> +}
>> +#else
>> +void arch_setup_new_exec(void)
>> +{
>> + /*
>> + * If we exec out of a kernel thread then thread.regs will not be
>> + * set. Do it now.
>> + */
>> + if (!current->thread.regs) {
>> + struct pt_regs *regs = task_stack_page(current) + THREAD_SIZE;
>> + current->thread.regs = regs - 1;
>> + }
>> }
>> +
>> #endif
>
> No need to duplicate arch_setup_new_exec() I think. radix_enabled() is defined at all times so the
> first function should be valid at all times.
>
arch/powerpc/kernel/process.c: In function ‘arch_setup_new_exec’:
arch/powerpc/kernel/process.c:1529:3: error: implicit declaration of function ‘hash__setup_new_exec’; did you mean ‘arch_setup_new_exec’? [-Werror=implicit-function-declaration]
1529 | hash__setup_new_exec();
| ^~~~~~~~~~~~~~~~~~~~
| arch_setup_new_exec
That requires us to have hash__setup_new_exec prototype for all platforms.
-aneesh
^ permalink raw reply
* [PATCH] ASoC: fsl: Fix config name of CONFIG_ARCH_MXC
From: Shengjiu Wang @ 2020-11-26 6:14 UTC (permalink / raw)
To: timur, nicoleotsuka, Xiubo.Lee, festevam, broonie, perex, tiwai,
alsa-devel
Cc: linuxppc-dev, linux-kernel
Kconfig "depends on" expressions reference symbols without the CONFIG_
prefix, so CONFIG_ARCH_MXC should be ARCH_MXC.
Fixes: 674226db62ec ("ASoC: fsl: SND_SOC_FSL_AUD2HTX should depend on ARCH_MXC")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
---
sound/soc/fsl/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/soc/fsl/Kconfig b/sound/soc/fsl/Kconfig
index 7d48f4f98e8b..835a14821360 100644
--- a/sound/soc/fsl/Kconfig
+++ b/sound/soc/fsl/Kconfig
@@ -107,7 +107,7 @@ config SND_SOC_FSL_XCVR
config SND_SOC_FSL_AUD2HTX
tristate "AUDIO TO HDMI TX module support"
- depends on CONFIG_ARCH_MXC || COMPILE_TEST
+ depends on ARCH_MXC || COMPILE_TEST
help
Say Y if you want to add AUDIO TO HDMI TX support for NXP.
--
2.27.0
^ permalink raw reply related
* Re: [PATCH] tpm: ibmvtpm: fix error return code in tpm_ibmvtpm_probe()
From: Jarkko Sakkinen @ 2020-11-26 3:35 UTC (permalink / raw)
To: Wang Hai, mpe, benh, paulus, peterhuewe, jgg, stefanb, nayna
Cc: linux-integrity, linuxppc-dev, linux-kernel
In-Reply-To: <20201124135244.31932-1-wanghai38@huawei.com>
On Tue, 2020-11-24 at 21:52 +0800, Wang Hai wrote:
> Fix to return a negative error code from the error handling
> case instead of 0, as done elsewhere in this function.
>
> Fixes: d8d74ea3c002 ("tpm: ibmvtpm: Wait for buffer to be set before
> proceeding")
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: Wang Hai <wanghai38@huawei.com>
Provide a reasoning for -ETIMEDOUT in the commit message.
/Jarkko
> ---
> drivers/char/tpm/tpm_ibmvtpm.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/char/tpm/tpm_ibmvtpm.c
> b/drivers/char/tpm/tpm_ibmvtpm.c
> index 994385bf37c0..813eb2cac0ce 100644
> --- a/drivers/char/tpm/tpm_ibmvtpm.c
> +++ b/drivers/char/tpm/tpm_ibmvtpm.c
> @@ -687,6 +687,7 @@ static int tpm_ibmvtpm_probe(struct vio_dev
> *vio_dev,
> ibmvtpm->rtce_buf != NULL,
> HZ)) {
> dev_err(dev, "CRQ response timed out\n");
> + rc = -ETIMEDOUT;
> goto init_irq_cleanup;
> }
>
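The class of bug fixed by the quoted hunk can be reproduced in a few lines of userspace C. This is a sketch under assumptions: probe_sketch() and wait_for_buffer_sketch() are hypothetical stand-ins, with the latter modelling a wait that returns 0 when the timeout elapses with the condition still false.

```c
#include <errno.h>

/*
 * Stand-in for a timed wait: returns 0 when the timeout elapsed with the
 * condition still false, non-zero otherwise.
 */
static long wait_for_buffer_sketch(int buffer_ready)
{
	return buffer_ready ? 1 : 0;
}

/* Hypothetical probe routine using the goto-cleanup idiom. */
static int probe_sketch(int buffer_ready, int *cleaned_up)
{
	int rc = 0; /* earlier steps succeeded, so rc is still 0 here */

	*cleaned_up = 0;
	if (!wait_for_buffer_sketch(buffer_ready)) {
		rc = -ETIMEDOUT; /* without this line the failure path returns 0 */
		goto cleanup;
	}
	return 0;

cleanup:
	*cleaned_up = 1;
	return rc;
}
```

If the assignment is dropped, the cleanup path still runs but the caller sees 0 and treats the probe as successful.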
^ permalink raw reply
* Re: [PATCH v6 04/22] powerpc/book3s64/kuap/kuep: Move uamor setup to pkey init
From: Michael Ellerman @ 2020-11-26 3:28 UTC (permalink / raw)
To: Aneesh Kumar K.V, linuxppc-dev; +Cc: Aneesh Kumar K.V
In-Reply-To: <20201125051634.509286-5-aneesh.kumar@linux.ibm.com>
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
> This patch consolidates UAMOR update across pkey, kuap and kuep features.
> The boot cpu initializes UAMOR via pkey init and both radix/hash do the
> secondary cpu UAMOR init in early_init_mmu_secondary.
>
> We don't check for mmu_feature in radix secondary init because UAMOR
> is a supported SPRN with all CPUs supporting radix translation.
> The old code was not updating UAMOR if we had smap disabled and smep enabled.
> This change handles that case.
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
> arch/powerpc/mm/book3s64/radix_pgtable.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
> index 3adcf730f478..bfe441af916a 100644
> --- a/arch/powerpc/mm/book3s64/radix_pgtable.c
> +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
> @@ -620,9 +620,6 @@ void setup_kuap(bool disabled)
> cur_cpu_spec->mmu_features |= MMU_FTR_RADIX_KUAP;
> }
>
> - /* Make sure userspace can't change the AMR */
> - mtspr(SPRN_UAMOR, 0);
> -
> /*
> * Set the default kernel AMR values on all cpus.
> */
> @@ -721,6 +718,11 @@ void radix__early_init_mmu_secondary(void)
>
> radix__switch_mmu_context(NULL, &init_mm);
> tlbiel_all();
> +
> +#ifdef CONFIG_PPC_PKEY
> + /* Make sure userspace can't change the AMR */
> + mtspr(SPRN_UAMOR, 0);
> +#endif
If PPC_PKEY is disabled I think this leaves UAMOR unset, which means it
could potentially allow AMR to be used as a covert channel between
processes.
cheers
^ permalink raw reply