* Re: [PATCH V3 10/15] arch/kmap: Define kmap_atomic_prot() for all arch's
From: Guenter Roeck @ 2020-05-17 17:37 UTC (permalink / raw)
To: ira.weiny
Cc: Peter Zijlstra, Dave Hansen, dri-devel, linux-mips,
James E.J. Bottomley, Max Filippov, Paul Mackerras,
H. Peter Anvin, sparclinux, Dan Williams, Helge Deller, x86,
linux-csky, Christoph Hellwig, Ingo Molnar, linux-snps-arc,
linux-xtensa, Borislav Petkov, Al Viro, Andy Lutomirski,
Thomas Gleixner, linux-arm-kernel, Chris Zankel,
Thomas Bogendoerfer, linux-parisc, linux-kernel, Christian Koenig,
Andrew Morton, linuxppc-dev, David S. Miller
In-Reply-To: <20200507150004.1423069-11-ira.weiny@intel.com>
Hi,
On Thu, May 07, 2020 at 07:59:58AM -0700, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
>
> To support kmap_atomic_prot(), all architectures need to support
> protections passed to their kmap_atomic_high() function. Pass
> protections into kmap_atomic_high() and change the name to
> kmap_atomic_high_prot() to match.
>
> Then define kmap_atomic_prot() as a core function which calls
> kmap_atomic_high_prot() when needed.
>
> Finally, redefine kmap_atomic() as a wrapper of kmap_atomic_prot() with
> the default kmap_prot exported by the architectures.
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
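The layering described in the commit message above can be sketched in user space as follows. This is only a rough model under stated assumptions: struct page, pgprot_t, preempt_count, highmem_window and the value of kmap_prot are hypothetical stand-ins, not the kernel definitions, and the single counter stands in for preempt_disable()/pagefault_disable(). It shows the core function dispatching to the arch hook only for highmem pages, and kmap_atomic() reduced to a wrapper that passes the default protection.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
typedef unsigned long pgprot_t;
struct page {
	bool highmem;       /* models PageHighMem(page) */
	void *lowmem_addr;  /* models page_address(page) */
};

static int preempt_count;               /* models the disable/enable nesting */
static pgprot_t last_prot;              /* protection last seen by the arch hook */
static char highmem_window[64];         /* pretend temporary mapping slot */
static const pgprot_t kmap_prot = 0x3;  /* assumed arch default protection */

/* Arch hook: only reached for highmem pages, and now receives the protection. */
static inline void *kmap_atomic_high_prot(struct page *page, pgprot_t prot)
{
	(void)page;
	last_prot = prot;
	return highmem_window;
}

/* Core function: enter the atomic section once, then dispatch. */
static inline void *kmap_atomic_prot(struct page *page, pgprot_t prot)
{
	preempt_count++;
	if (!page->highmem)
		return page->lowmem_addr;
	return kmap_atomic_high_prot(page, prot);
}

/* kmap_atomic() becomes a wrapper that supplies the default protection. */
static inline void *kmap_atomic(struct page *page)
{
	return kmap_atomic_prot(page, kmap_prot);
}

/* Leaving the atomic section must undo exactly one level of nesting. */
static inline void kunmap_atomic(void *addr)
{
	(void)addr;
	preempt_count--;
}
```

In this model every kmap_atomic() must be matched by exactly one kunmap_atomic(); a counter left non-zero would correspond to the in_atomic(): 1 state shown in the report below.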
This patch causes a variety of crashes when booting powerpc images in qemu.
There are lots of warnings such as:
WARNING: CPU: 0 PID: 0 at lib/locking-selftest.c:743 irqsafe1_hard_spin_12+0x50/0xb0
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Tainted: G W 5.7.0-rc5-next-20200515 #1
NIP: c0660c7c LR: c0660c44 CTR: c0660c2c
REGS: c1223e68 TRAP: 0700 Tainted: G W (5.7.0-rc5-next-20200515)
MSR: 00021000 <CE,ME> CR: 28000224 XER: 20000000
GPR00: c0669c78 c1223f20 c113d560 c0660c44 00000000 00000001 c1223ea8 00000001
GPR08: 00000000 00000001 0000fffc ffffffff 88000222 00000000 00000000 00000000
GPR16: 00000000 00000000 00000000 00000000 c0000000 00000000 00000000 c1125084
GPR24: c1125084 c1230000 c1879538 fffffffc 00000001 00000000 c1011afc c1230000
NIP [c0660c7c] irqsafe1_hard_spin_12+0x50/0xb0
LR [c0660c44] irqsafe1_hard_spin_12+0x18/0xb0
Call Trace:
[c1223f20] [c1880000] megasas_mgmt_info+0xee4/0x1008 (unreliable)
[c1223f40] [c0669c78] dotest+0x38/0x550
[c1223f70] [c066aa4c] locking_selftest+0x8bc/0x1d54
[c1223fa0] [c10e0bc8] start_kernel+0x3ec/0x510
[c1223ff0] [c00003a0] set_ivor+0x118/0x154
Instruction dump:
81420000 38e80001 3d4a0001 2c080000 91420000 90e20488 40820008 91020470
81290000 5529031e 7d290034 5529d97e <0f090000> 3fe0c11c 3bff3964 3bff00ac
irq event stamp: 588
hardirqs last enabled at (587): [<c00b9fe4>] vprintk_emit+0x1b4/0x33c
hardirqs last disabled at (588): [<c0660c44>] irqsafe1_hard_spin_12+0x18/0xb0
softirqs last enabled at (0): [<00000000>] 0x0
softirqs last disabled at (0): [<00000000>] 0x0
---[ end trace b18fe9e172f99d03 ]---
This is followed by:
BUG: sleeping function called from invalid context at lib/mpi/mpi-pow.c:245
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 14, name: cryptomgr_test
INFO: lockdep is turned off.
CPU: 0 PID: 14 Comm: cryptomgr_test Tainted: G W 5.7.0-rc5-next-20200515 #1
Call Trace:
[ce221b58] [c008755c] ___might_sleep+0x280/0x2a8 (unreliable)
[ce221b78] [c06bc524] mpi_powm+0x634/0xc50
[ce221c38] [c05eafdc] rsa_dec+0x88/0x134
[ce221c78] [c05f3b40] test_akcipher_one+0x678/0x804
[ce221dc8] [c05f3d7c] alg_test_akcipher+0xb0/0x130
[ce221df8] [c05ee674] alg_test.part.0+0xb4/0x458
[ce221ed8] [c05ed2b0] cryptomgr_test+0x30/0x50
[ce221ef8] [c007cd74] kthread+0x134/0x170
[ce221f38] [c001433c] ret_from_kernel_thread+0x14/0x1c
Kernel panic - not syncing: Aiee, killing interrupt handler!
CPU: 0 PID: 14 Comm: cryptomgr_test Tainted: G W 5.7.0-rc5-next-20200515 #1
Call Trace:
[ce221e08] [c00530fc] panic+0x148/0x34c (unreliable)
[ce221e68] [c0056460] do_exit+0xac0/0xb40
[ce221eb8] [c00f5be8] find_kallsyms_symbol_value+0x0/0x128
[ce221ed8] [c05ed2d0] crypto_alg_put+0x0/0x70
[ce221ef8] [c007cd74] kthread+0x134/0x170
[ce221f38] [c001433c] ret_from_kernel_thread+0x14/0x1c
Bisect log is attached. The patch cannot easily be reverted since
reverting it results in compile errors.
Note that similar failures are seen with sparc32 images. Those bisect
to a different patch, but reverting that patch doesn't fix the problem.
The failure pattern (warnings followed by a crash in cryptomgr_test)
is the same.
Guenter
---
# bad: [bdecf38f228bcca73b31ada98b5b7ba1215eb9c9] Add linux-next specific files for 20200515
# good: [2ef96a5bb12be62ef75b5828c0aab838ebb29cb8] Linux 5.7-rc5
git bisect start 'HEAD' 'v5.7-rc5'
# good: [3674d7aa7a8e61d993886c2fb7c896c5ef85e988] Merge remote-tracking branch 'crypto/master'
git bisect good 3674d7aa7a8e61d993886c2fb7c896c5ef85e988
# good: [87f6f21783522e6d62127cf33ae5e95f50874beb] Merge remote-tracking branch 'spi/for-next'
git bisect good 87f6f21783522e6d62127cf33ae5e95f50874beb
# good: [5c428e8277d5d97c85126387d4e00aa5adde4400] Merge remote-tracking branch 'staging/staging-next'
git bisect good 5c428e8277d5d97c85126387d4e00aa5adde4400
# good: [f68de67ed934e7bdef4799fd7777c86f33f14982] Merge remote-tracking branch 'hyperv/hyperv-next'
git bisect good f68de67ed934e7bdef4799fd7777c86f33f14982
# bad: [54acd2dc52b069da59639eea0d0c92726f32fb01] mm/memblock: fix a typo in comment "implict"->"implicit"
git bisect bad 54acd2dc52b069da59639eea0d0c92726f32fb01
# good: [784a17aa58a529b84f7cc50f351ed4acf3bd11f3] mm: remove the pgprot argument to __vmalloc
git bisect good 784a17aa58a529b84f7cc50f351ed4acf3bd11f3
# good: [6cd8137ff37e9a37aee2d2a8889c8beb8eab192f] khugepaged: replace the usage of system(3) in the test
git bisect good 6cd8137ff37e9a37aee2d2a8889c8beb8eab192f
# bad: [6987da379826ed01b8a1cf046b67cc8cc10117cc] sparc: remove unnecessary includes
git bisect bad 6987da379826ed01b8a1cf046b67cc8cc10117cc
# good: [bc17b545388f64c09e83e367898e28f60277c584] mm/hugetlb: define a generic fallback for is_hugepage_only_range()
git bisect good bc17b545388f64c09e83e367898e28f60277c584
# good: [9b5aa5b43f957f03a1f4a9aff5f7924e2ebbc011] arch-kmap_atomic-consolidate-duplicate-code-checkpatch-fixes
git bisect good 9b5aa5b43f957f03a1f4a9aff5f7924e2ebbc011
# bad: [89194ba5ee31567eeee9c81101b334c8e3248198] arch/kmap: define kmap_atomic_prot() for all arch's
git bisect bad 89194ba5ee31567eeee9c81101b334c8e3248198
# good: [022785d2bea99f8bc2a37b7b6c525eea26f6ac59] arch-kunmap_atomic-consolidate-duplicate-code-checkpatch-fixes
git bisect good 022785d2bea99f8bc2a37b7b6c525eea26f6ac59
# good: [a13c2f39e3f0519ddee57d26cc66ec70e3546106] arch/kmap: don't hard code kmap_prot values
git bisect good a13c2f39e3f0519ddee57d26cc66ec70e3546106
# first bad commit: [89194ba5ee31567eeee9c81101b334c8e3248198] arch/kmap: define kmap_atomic_prot() for all arch's
* Re: [PATCH v8 11/30] powerpc: Use a datatype for instructions
From: Jordan Niethe @ 2020-05-17 10:48 UTC (permalink / raw)
To: linuxppc-dev
Cc: Christophe Leroy, Alistair Popple, Nicholas Piggin, Balamuruhan S,
naveen.n.rao, Daniel Axtens
In-Reply-To: <20200506034050.24806-12-jniethe5@gmail.com>
mpe, this is to go with the fixup I posted for mmu_patch_addis() in
[PATCH v8 12/30] powerpc: Use a function for reading instructions.
Thanks to Christophe for pointing it out.
diff --git a/arch/powerpc/mm/nohash/8xx.c b/arch/powerpc/mm/nohash/8xx.c
--- a/arch/powerpc/mm/nohash/8xx.c
+++ b/arch/powerpc/mm/nohash/8xx.c
@@ -98,11 +98,12 @@ static void mmu_patch_cmp_limit(s32 *site, unsigned long mapped)
static void mmu_patch_addis(s32 *site, long simm)
{
- unsigned int instr = *(unsigned int *)patch_site_addr(site);
+ struct ppc_inst instr = *(struct ppc_inst *)patch_site_addr(site);
+ unsigned int val = ppc_inst_val(instr);
- instr &= 0xffff0000;
- instr |= ((unsigned long)simm) >> 16;
- patch_instruction_site(site, ppc_inst(instr));
+ val &= 0xffff0000;
+ val |= ((unsigned long)simm) >> 16;
+ patch_instruction_site(site, ppc_inst(val));
}
static void mmu_mapin_ram_chunk(unsigned long offset, unsigned long top, pgprot_t prot)
--
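As a worked illustration of the masking done in mmu_patch_addis() above, here is a small user-space sketch. It assumes 32-bit values, as on 8xx, and uses a hypothetical helper name, patch_addis_imm(); it is not kernel code. The high halfword of the instruction (opcode and register fields) is kept and the low halfword is replaced by the upper 16 bits of simm.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the two statements in mmu_patch_addis():
 * keep the upper 16 bits of the instruction word and replace the lower
 * 16 bits (the immediate field) with the upper half of simm.
 */
static inline uint32_t patch_addis_imm(uint32_t insn, int32_t simm)
{
	uint32_t val = insn;

	val &= 0xffff0000u;           /* drop the old immediate */
	val |= (uint32_t)simm >> 16;  /* install the high half of simm */
	return val;
}
```

For example, 0x3c640000 encodes addis r3,r4,0; patching it with simm = 0x12340000 gives 0x3c641234.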
* Re: [PATCH v8 12/30] powerpc: Use a function for reading instructions
From: Jordan Niethe @ 2020-05-17 10:44 UTC (permalink / raw)
To: Christophe Leroy
Cc: Christophe Leroy, Alistair Popple, Nicholas Piggin, Balamuruhan S,
naveen.n.rao, linuxppc-dev, Daniel Axtens
In-Reply-To: <a7005edf-cdda-4aec-b7b0-fd9f45776147@csgroup.eu>
On Sun, May 17, 2020 at 4:39 AM Christophe Leroy
<christophe.leroy@csgroup.eu> wrote:
>
>
>
> Le 06/05/2020 à 05:40, Jordan Niethe a écrit :
> > Prefixed instructions will mean there are instructions of different
> > length. As a result dereferencing a pointer to an instruction will not
> > necessarily give the desired result. Introduce a function for reading
> > instructions from memory into the instruction data type.
>
>
> Shouldn't this function be used in mmu_patch_addis() in mm/nohash/8xx.c ?
>
> Christophe
Yes, that would be a good idea. mpe, here is a fix, along with one I'll
post for [PATCH v8 11/30] powerpc: Use a datatype for instructions.
diff --git a/arch/powerpc/mm/nohash/8xx.c b/arch/powerpc/mm/nohash/8xx.c
--- a/arch/powerpc/mm/nohash/8xx.c
+++ b/arch/powerpc/mm/nohash/8xx.c
@@ -98,7 +98,7 @@ static void mmu_patch_cmp_limit(s32 *site, unsigned long mapped)
static void mmu_patch_addis(s32 *site, long simm)
{
- struct ppc_inst instr = *(struct ppc_inst *)patch_site_addr(site);
+ struct ppc_inst instr = ppc_inst_read((struct ppc_inst *)patch_site_addr(site));
unsigned int val = ppc_inst_val(instr);
val &= 0xffff0000;
--
>
> >
> > Reviewed-by: Alistair Popple <alistair@popple.id.au>
> > Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> > ---
> > v4: New to series
> > v5: - Rename read_inst() -> probe_kernel_read_inst()
> > - No longer modify uprobe probe type in this patch
> > v6: - feature-fixups.c: do_final_fixups(): Use here
> > - arch_prepare_kprobe(): patch_instruction(): no longer part of this
> > patch
> > - Move probe_kernel_read_inst() out of this patch
> > - Use in uprobes
> > v8: style
> > ---
> > arch/powerpc/include/asm/inst.h | 5 +++++
> > arch/powerpc/kernel/kprobes.c | 6 +++---
> > arch/powerpc/kernel/mce_power.c | 2 +-
> > arch/powerpc/kernel/optprobes.c | 4 ++--
> > arch/powerpc/kernel/trace/ftrace.c | 4 ++--
> > arch/powerpc/kernel/uprobes.c | 2 +-
> > arch/powerpc/lib/code-patching.c | 26 ++++++++++++++------------
> > arch/powerpc/lib/feature-fixups.c | 4 ++--
> > arch/powerpc/xmon/xmon.c | 6 +++---
> > 9 files changed, 33 insertions(+), 26 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/inst.h b/arch/powerpc/include/asm/inst.h
> > index 19d8bb7a1c2b..552e953bf04f 100644
> > --- a/arch/powerpc/include/asm/inst.h
> > +++ b/arch/powerpc/include/asm/inst.h
> > @@ -27,6 +27,11 @@ static inline struct ppc_inst ppc_inst_swab(struct ppc_inst x)
> > return ppc_inst(swab32(ppc_inst_val(x)));
> > }
> >
> > +static inline struct ppc_inst ppc_inst_read(const struct ppc_inst *ptr)
> > +{
> > + return *ptr;
> > +}
> > +
> > static inline bool ppc_inst_equal(struct ppc_inst x, struct ppc_inst y)
> > {
> > return ppc_inst_val(x) == ppc_inst_val(y);
> > diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
> > index a08ae5803622..f64312dca84f 100644
> > --- a/arch/powerpc/kernel/kprobes.c
> > +++ b/arch/powerpc/kernel/kprobes.c
> > @@ -106,7 +106,7 @@ kprobe_opcode_t *kprobe_lookup_name(const char *name, unsigned int offset)
> > int arch_prepare_kprobe(struct kprobe *p)
> > {
> > int ret = 0;
> > - struct ppc_inst insn = *(struct ppc_inst *)p->addr;
> > + struct ppc_inst insn = ppc_inst_read((struct ppc_inst *)p->addr);
> >
> > if ((unsigned long)p->addr & 0x03) {
> > printk("Attempt to register kprobe at an unaligned address\n");
> > @@ -127,7 +127,7 @@ int arch_prepare_kprobe(struct kprobe *p)
> > if (!ret) {
> > memcpy(p->ainsn.insn, p->addr,
> > MAX_INSN_SIZE * sizeof(kprobe_opcode_t));
> > - p->opcode = *p->addr;
> > + p->opcode = ppc_inst_val(insn);
> > flush_icache_range((unsigned long)p->ainsn.insn,
> > (unsigned long)p->ainsn.insn + sizeof(kprobe_opcode_t));
> > }
> > @@ -217,7 +217,7 @@ NOKPROBE_SYMBOL(arch_prepare_kretprobe);
> > static int try_to_emulate(struct kprobe *p, struct pt_regs *regs)
> > {
> > int ret;
> > - struct ppc_inst insn = *(struct ppc_inst *)p->ainsn.insn;
> > + struct ppc_inst insn = ppc_inst_read((struct ppc_inst *)p->ainsn.insn);
> >
> > /* regs->nip is also adjusted if emulate_step returns 1 */
> > ret = emulate_step(regs, insn);
> > diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
> > index cd23218c60bb..45c51ba0071b 100644
> > --- a/arch/powerpc/kernel/mce_power.c
> > +++ b/arch/powerpc/kernel/mce_power.c
> > @@ -374,7 +374,7 @@ static int mce_find_instr_ea_and_phys(struct pt_regs *regs, uint64_t *addr,
> > pfn = addr_to_pfn(regs, regs->nip);
> > if (pfn != ULONG_MAX) {
> > instr_addr = (pfn << PAGE_SHIFT) + (regs->nip & ~PAGE_MASK);
> > - instr = *(struct ppc_inst *)(instr_addr);
> > + instr = ppc_inst_read((struct ppc_inst *)instr_addr);
> > if (!analyse_instr(&op, &tmp, instr)) {
> > pfn = addr_to_pfn(regs, op.ea);
> > *addr = op.ea;
> > diff --git a/arch/powerpc/kernel/optprobes.c b/arch/powerpc/kernel/optprobes.c
> > index 5a71fef71c22..52c1ab3f85aa 100644
> > --- a/arch/powerpc/kernel/optprobes.c
> > +++ b/arch/powerpc/kernel/optprobes.c
> > @@ -100,9 +100,9 @@ static unsigned long can_optimize(struct kprobe *p)
> > * Ensure that the instruction is not a conditional branch,
> > * and that can be emulated.
> > */
> > - if (!is_conditional_branch(*(struct ppc_inst *)p->ainsn.insn) &&
> > + if (!is_conditional_branch(ppc_inst_read((struct ppc_inst *)p->ainsn.insn)) &&
> > analyse_instr(&op, &regs,
> > - *(struct ppc_inst *)p->ainsn.insn) == 1) {
> > + ppc_inst_read((struct ppc_inst *)p->ainsn.insn)) == 1) {
> > emulate_update_regs(&regs, &op);
> > nip = regs.nip;
> > }
> > diff --git a/arch/powerpc/kernel/trace/ftrace.c b/arch/powerpc/kernel/trace/ftrace.c
> > index 3117ed675735..acd5b889815f 100644
> > --- a/arch/powerpc/kernel/trace/ftrace.c
> > +++ b/arch/powerpc/kernel/trace/ftrace.c
> > @@ -848,7 +848,7 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
> > struct ppc_inst old, new;
> > int ret;
> >
> > - old = *(struct ppc_inst *)&ftrace_call;
> > + old = ppc_inst_read((struct ppc_inst *)&ftrace_call);
> > new = ftrace_call_replace(ip, (unsigned long)func, 1);
> > ret = ftrace_modify_code(ip, old, new);
> >
> > @@ -856,7 +856,7 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
> > /* Also update the regs callback function */
> > if (!ret) {
> > ip = (unsigned long)(&ftrace_regs_call);
> > - old = *(struct ppc_inst *)&ftrace_regs_call;
> > + old = ppc_inst_read((struct ppc_inst *)&ftrace_regs_call);
> > new = ftrace_call_replace(ip, (unsigned long)func, 1);
> > ret = ftrace_modify_code(ip, old, new);
> > }
> > diff --git a/arch/powerpc/kernel/uprobes.c b/arch/powerpc/kernel/uprobes.c
> > index 31c870287f2b..6893d40a48c5 100644
> > --- a/arch/powerpc/kernel/uprobes.c
> > +++ b/arch/powerpc/kernel/uprobes.c
> > @@ -174,7 +174,7 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)
> > * emulate_step() returns 1 if the insn was successfully emulated.
> > * For all other cases, we need to single-step in hardware.
> > */
> > - ret = emulate_step(regs, auprobe->insn);
> > + ret = emulate_step(regs, ppc_inst_read(&auprobe->insn));
> > if (ret > 0)
> > return true;
> >
> > diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> > index 1dff9d9d6645..435fc8e9f45d 100644
> > --- a/arch/powerpc/lib/code-patching.c
> > +++ b/arch/powerpc/lib/code-patching.c
> > @@ -348,9 +348,9 @@ static unsigned long branch_bform_target(const struct ppc_inst *instr)
> >
> > unsigned long branch_target(const struct ppc_inst *instr)
> > {
> > - if (instr_is_branch_iform(*instr))
> > + if (instr_is_branch_iform(ppc_inst_read(instr)))
> > return branch_iform_target(instr);
> > - else if (instr_is_branch_bform(*instr))
> > + else if (instr_is_branch_bform(ppc_inst_read(instr)))
> > return branch_bform_target(instr);
> >
> > return 0;
> > @@ -358,7 +358,8 @@ unsigned long branch_target(const struct ppc_inst *instr)
> >
> > int instr_is_branch_to_addr(const struct ppc_inst *instr, unsigned long addr)
> > {
> > - if (instr_is_branch_iform(*instr) || instr_is_branch_bform(*instr))
> > + if (instr_is_branch_iform(ppc_inst_read(instr)) ||
> > + instr_is_branch_bform(ppc_inst_read(instr)))
> > return branch_target(instr) == addr;
> >
> > return 0;
> > @@ -368,13 +369,14 @@ int translate_branch(struct ppc_inst *instr, const struct ppc_inst *dest,
> > const struct ppc_inst *src)
> > {
> > unsigned long target;
> > -
> > target = branch_target(src);
> >
> > - if (instr_is_branch_iform(*src))
> > - return create_branch(instr, dest, target, ppc_inst_val(*src));
> > - else if (instr_is_branch_bform(*src))
> > - return create_cond_branch(instr, dest, target, ppc_inst_val(*src));
> > + if (instr_is_branch_iform(ppc_inst_read(src)))
> > + return create_branch(instr, dest, target,
> > + ppc_inst_val(ppc_inst_read(src)));
> > + else if (instr_is_branch_bform(ppc_inst_read(src)))
> > + return create_cond_branch(instr, dest, target,
> > + ppc_inst_val(ppc_inst_read(src)));
> >
> > return 1;
> > }
> > @@ -598,7 +600,7 @@ static void __init test_translate_branch(void)
> > patch_instruction(q, instr);
> > check(instr_is_branch_to_addr(p, addr));
> > check(instr_is_branch_to_addr(q, addr));
> > - check(ppc_inst_equal(*q, ppc_inst(0x4a000000)));
> > + check(ppc_inst_equal(ppc_inst_read(q), ppc_inst(0x4a000000)));
> >
> > /* Maximum positive case, move x to x - 32 MB + 4 */
> > p = buf + 0x2000000;
> > @@ -609,7 +611,7 @@ static void __init test_translate_branch(void)
> > patch_instruction(q, instr);
> > check(instr_is_branch_to_addr(p, addr));
> > check(instr_is_branch_to_addr(q, addr));
> > - check(ppc_inst_equal(*q, ppc_inst(0x49fffffc)));
> > + check(ppc_inst_equal(ppc_inst_read(q), ppc_inst(0x49fffffc)));
> >
> > /* Jump to x + 16 MB moved to x + 20 MB */
> > p = buf;
> > @@ -655,7 +657,7 @@ static void __init test_translate_branch(void)
> > patch_instruction(q, instr);
> > check(instr_is_branch_to_addr(p, addr));
> > check(instr_is_branch_to_addr(q, addr));
> > - check(ppc_inst_equal(*q, ppc_inst(0x43ff8000)));
> > + check(ppc_inst_equal(ppc_inst_read(q), ppc_inst(0x43ff8000)));
> >
> > /* Maximum positive case, move x to x - 32 KB + 4 */
> > p = buf + 0x8000;
> > @@ -667,7 +669,7 @@ static void __init test_translate_branch(void)
> > patch_instruction(q, instr);
> > check(instr_is_branch_to_addr(p, addr));
> > check(instr_is_branch_to_addr(q, addr));
> > - check(ppc_inst_equal(*q, ppc_inst(0x43ff7ffc)));
> > + check(ppc_inst_equal(ppc_inst_read(q), ppc_inst(0x43ff7ffc)));
> >
> > /* Jump to x + 12 KB moved to x + 20 KB */
> > p = buf;
> > diff --git a/arch/powerpc/lib/feature-fixups.c b/arch/powerpc/lib/feature-fixups.c
> > index fb6e8e8abf4e..c0d3ed4efb7e 100644
> > --- a/arch/powerpc/lib/feature-fixups.c
> > +++ b/arch/powerpc/lib/feature-fixups.c
> > @@ -48,7 +48,7 @@ static int patch_alt_instruction(struct ppc_inst *src, struct ppc_inst *dest,
> > int err;
> > struct ppc_inst instr;
> >
> > - instr = *src;
> > + instr = ppc_inst_read(src);
> >
> > if (instr_is_relative_branch(*src)) {
> > struct ppc_inst *target = (struct ppc_inst *)branch_target(src);
> > @@ -403,7 +403,7 @@ static void do_final_fixups(void)
> > length = (__end_interrupts - _stext) / sizeof(struct ppc_inst);
> >
> > while (length--) {
> > - raw_patch_instruction(dest, *src);
> > + raw_patch_instruction(dest, ppc_inst_read(src));
> > src++;
> > dest++;
> > }
> > diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
> > index e0132d6d24d0..68e0b05d9226 100644
> > --- a/arch/powerpc/xmon/xmon.c
> > +++ b/arch/powerpc/xmon/xmon.c
> > @@ -702,13 +702,13 @@ static int xmon_core(struct pt_regs *regs, int fromipi)
> > if ((regs->msr & (MSR_IR|MSR_PR|MSR_64BIT)) == (MSR_IR|MSR_64BIT)) {
> > bp = at_breakpoint(regs->nip);
> > if (bp != NULL) {
> > - int stepped = emulate_step(regs, bp->instr[0]);
> > + int stepped = emulate_step(regs, ppc_inst_read(bp->instr));
> > if (stepped == 0) {
> > regs->nip = (unsigned long) &bp->instr[0];
> > atomic_inc(&bp->ref_count);
> > } else if (stepped < 0) {
> > printf("Couldn't single-step %s instruction\n",
> > - (IS_RFID(bp->instr[0])? "rfid": "mtmsrd"));
> > + IS_RFID(ppc_inst_read(bp->instr))? "rfid": "mtmsrd");
> > }
> > }
> > }
> > @@ -949,7 +949,7 @@ static void remove_bpts(void)
> > if (mread(bp->address, &instr, 4) == 4
> > && ppc_inst_equal(instr, ppc_inst(bpinstr))
> > && patch_instruction(
> > - (struct ppc_inst *)bp->address, bp->instr[0]) != 0)
> > + (struct ppc_inst *)bp->address, ppc_inst_read(bp->instr)) != 0)
> > printf("Couldn't remove breakpoint at %lx\n",
> > bp->address);
> > }
> >
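In the quoted patch ppc_inst_read() simply returns *ptr. The reason for routing every read through one function is the one given in the commit message: once prefixed instructions exist, a plain dereference no longer reads a whole instruction. A rough user-space sketch of what a length-aware reader could look like is below. All names here (struct ppc_inst_sketch, sketch_is_prefix(), sketch_inst_read(), sketch_inst_len()) are hypothetical, and the test "primary opcode 1 marks a prefix word" is an assumption stated for illustration, not something taken from this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical instruction type that can hold a prefix plus a suffix word. */
struct ppc_inst_sketch {
	uint32_t val;     /* a word instruction, or the prefix word */
	uint32_t suffix;  /* second word; only meaningful when prefixed */
};

/* Assumption for this sketch: primary opcode 1 identifies a prefix word. */
static inline bool sketch_is_prefix(uint32_t word)
{
	return (word >> 26) == 1;
}

/* Read one instruction, fetching the second word only when it is needed. */
static inline struct ppc_inst_sketch sketch_inst_read(const uint32_t *ptr)
{
	struct ppc_inst_sketch inst = { .val = ptr[0], .suffix = 0 };

	if (sketch_is_prefix(inst.val))
		inst.suffix = ptr[1];
	return inst;
}

/* Length in bytes of the instruction that was read. */
static inline unsigned int sketch_inst_len(struct ppc_inst_sketch inst)
{
	return sketch_is_prefix(inst.val) ? 8 : 4;
}
```

Callers that already go through a read function would not need to change again when the reader learns about the second word, which is the point of converting the open-coded dereferences first.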
* Re: [PATCH] powerpc/64s: Fix early_init_mmu section mismatch
From: Christian Zigotzky @ 2020-05-17 10:12 UTC (permalink / raw)
To: Nicholas Piggin, linuxppc-dev
In-Reply-To: <20200429070247.1678172-1-npiggin@gmail.com>
Hi All,
This patch wasn't included in the PowerPC fixes 5.7-4. Please add it.
Thanks,
Christian
On 29 April 2020 at 09:02 am, Nicholas Piggin wrote:
> Christian reports:
>
> MODPOST vmlinux.o
> WARNING: modpost: vmlinux.o(.text.unlikely+0x1a0): Section mismatch in
> reference from the function .early_init_mmu() to the function
> .init.text:.radix__early_init_mmu()
> The function .early_init_mmu() references
> the function __init .radix__early_init_mmu().
> This is often because .early_init_mmu lacks a __init
> annotation or the annotation of .radix__early_init_mmu is wrong.
>
> WARNING: modpost: vmlinux.o(.text.unlikely+0x1ac): Section mismatch in
> reference from the function .early_init_mmu() to the function
> .init.text:.hash__early_init_mmu()
> The function .early_init_mmu() references
> the function __init .hash__early_init_mmu().
> This is often because .early_init_mmu lacks a __init
> annotation or the annotation of .hash__early_init_mmu is wrong.
>
> The compiler is uninlining early_init_mmu and not putting it in an init
> section because there is no annotation. Add it.
>
> Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de>
> Tested-by: Christian Zigotzky <chzigotzky@xenosoft.de>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> arch/powerpc/include/asm/book3s/64/mmu.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
> index bb3deb76c951..3ffe5f967483 100644
> --- a/arch/powerpc/include/asm/book3s/64/mmu.h
> +++ b/arch/powerpc/include/asm/book3s/64/mmu.h
> @@ -208,7 +208,7 @@ void hash__early_init_devtree(void);
> void radix__early_init_devtree(void);
> extern void hash__early_init_mmu(void);
> extern void radix__early_init_mmu(void);
> -static inline void early_init_mmu(void)
> +static inline void __init early_init_mmu(void)
> {
> if (radix_enabled())
> return radix__early_init_mmu();
* Re: [PATCH v8 08/30] powerpc: Use a function for getting the instruction op code
From: Jordan Niethe @ 2020-05-17 7:41 UTC (permalink / raw)
To: Michael Ellerman
Cc: Christophe Leroy, Alistair Popple, Nicholas Piggin, Balamuruhan S,
naveen.n.rao, linuxppc-dev, Daniel Axtens
In-Reply-To: <87v9kw9lx3.fsf@mpe.ellerman.id.au>
On Sat, May 16, 2020 at 9:08 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> Jordan Niethe <jniethe5@gmail.com> writes:
> > mpe, as suggested by Christophe could you please add this.
>
> I did that and ...
>
> > diff --git a/arch/powerpc/include/asm/inst.h b/arch/powerpc/include/asm/inst.h
> > --- a/arch/powerpc/include/asm/inst.h
> > +++ b/arch/powerpc/include/asm/inst.h
> > @@ -2,6 +2,8 @@
> > #ifndef _ASM_INST_H
> > #define _ASM_INST_H
> >
> > +#include <asm/disassemble.h>
>
> .. this eventually breaks the build in some driver, because get_ra() is
> redefined.
>
> So I've backed out this change for now.
Thanks, that is fine with me.
>
> If we want to use the macros in disassemble.h we'll need to namespace
> them better, eg. make them ppc_get_ra() and so on.
>
> cheers
>
> > /*
> > * Instruction data type for POWER
> > */
> > @@ -15,7 +17,7 @@ static inline u32 ppc_inst_val(u32 x)
> >
> > static inline int ppc_inst_primary_opcode(u32 x)
> > {
> > - return ppc_inst_val(x) >> 26;
> > + return get_op(ppc_inst_val(x));
> > }
> >
> > #endif /* _ASM_INST_H */
> > --
> > 2.17.1
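The namespacing suggested above could look roughly like the following user-space sketch. ppc_get_ra() is the name proposed in the mail; ppc_get_op() and ppc_get_rt() are extrapolated from "and so on" and are assumptions, as are the field positions (primary opcode in the top 6 bits, then two 5-bit register fields). The unprefixed get_ra() here is a made-up stand-in for whatever an unrelated driver defines.

```c
#include <stdint.h>

/* Prefixed accessors: unlikely to collide with names used by drivers. */
#define ppc_get_op(inst)  (((uint32_t)(inst)) >> 26)
#define ppc_get_rt(inst)  ((((uint32_t)(inst)) >> 21) & 0x1f)
#define ppc_get_ra(inst)  ((((uint32_t)(inst)) >> 16) & 0x1f)

/* Stand-in for a driver's own, unrelated get_ra(); no redefinition clash. */
static inline int get_ra(void)
{
	return 42;
}

/* The helper from the quoted diff, expressed with the prefixed macro. */
static inline int ppc_inst_primary_opcode(uint32_t x)
{
	return (int)ppc_get_op(x);
}
```

With 0x3c640000 (addis r3,r4,0) the accessors yield opcode 15, rt 3 and ra 4, while the driver-style get_ra() stays usable alongside them.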
* Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.7-4 tag
From: pr-tracker-bot @ 2020-05-17 4:45 UTC (permalink / raw)
To: Michael Ellerman
Cc: christophe.leroy, Linus Torvalds, nayna, linux-kernel, npiggin,
linuxppc-dev
In-Reply-To: <87pnb49j0c.fsf@mpe.ellerman.id.au>
The pull request you sent on Sat, 16 May 2020 22:11:47 +1000:
> https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git tags/powerpc-5.7-4
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/befc42e5dd4977b63dd3b0c0db05e21d56f13c2e
Thank you!
--
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker
* [powerpc:next-test] BUILD SUCCESS 7b92ee1db93df64553a36f23fec86298d37ee10a
From: kbuild test robot @ 2020-05-17 2:27 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next-test
branch HEAD: 7b92ee1db93df64553a36f23fec86298d37ee10a powerpc/watchpoint/xmon: Support 2nd DAWR
i386-tinyconfig vmlinux size:
+-------+------------------------------------+-----------------------------------------------------------------------+
| DELTA | SYMBOL | COMMIT |
+-------+------------------------------------+-----------------------------------------------------------------------+
| +118 | TOTAL | ae83d0b416db..7b92ee1db93d (ALL COMMITS) |
| +114 | TEXT | ae83d0b416db..7b92ee1db93d (ALL COMMITS) |
| +1355 | balance_dirty_pages() | ae83d0b416db..7b92ee1db93d (ALL COMMITS) |
| +615 | __setup_rt_frame() | ae83d0b416db..7b92ee1db93d (ALL COMMITS) |
| +136 | arch/x86/events/zhaoxin/built-in.* | ae83d0b416db..7b92ee1db93d (ALL COMMITS) |
| +136 | arch/x86/events/zhaoxin/built-in.* | 722c1963aba5 selftests/powerpc: Add README for GZIP engine tests |
| -136 | arch/x86/events/zhaoxin/built-in.* | 45591da76588 powerpc/vas: Include linux/types.h in uapi/asm/vas-api.h |
| +113 | klist_release() | ae83d0b416db..7b92ee1db93d (ALL COMMITS) |
| +93 | change_clocksource() | ae83d0b416db..7b92ee1db93d (ALL COMMITS) |
| +86 | release_bdi() | ae83d0b416db..7b92ee1db93d (ALL COMMITS) |
| +84 | kobject_release() | ae83d0b416db..7b92ee1db93d (ALL COMMITS) |
| -68 | bdi_put() | ae83d0b416db..7b92ee1db93d (ALL COMMITS) |
| -77 | kobject_put() | ae83d0b416db..7b92ee1db93d (ALL COMMITS) |
| -79 | timekeeping_notify() | ae83d0b416db..7b92ee1db93d (ALL COMMITS) |
| -99 | klist_dec_and_del() | ae83d0b416db..7b92ee1db93d (ALL COMMITS) |
| -555 | do_signal() | ae83d0b416db..7b92ee1db93d (ALL COMMITS) |
| -1383 | balance_dirty_pages_ratelimited() | ae83d0b416db..7b92ee1db93d (ALL COMMITS) |
+-------+------------------------------------+-----------------------------------------------------------------------+
elapsed time: 695m
configs tested: 67
configs skipped: 77
The following configs have been built successfully.
More configs may be tested in the coming days.
sparc allyesconfig
mips allyesconfig
m68k allyesconfig
arm em_x270_defconfig
powerpc pq2fads_defconfig
mips cu1000-neo_defconfig
powerpc ppc40x_defconfig
sh shx3_defconfig
m68k hp300_defconfig
ia64 allmodconfig
powerpc powernv_defconfig
arm aspeed_g5_defconfig
sh titan_defconfig
microblaze mmu_defconfig
arm jornada720_defconfig
arm spear3xx_defconfig
i386 allnoconfig
xtensa allyesconfig
h8300 allyesconfig
h8300 allmodconfig
xtensa defconfig
nios2 defconfig
nios2 allyesconfig
openrisc defconfig
c6x allyesconfig
c6x allnoconfig
openrisc allyesconfig
powerpc defconfig
powerpc allyesconfig
powerpc rhel-kconfig
powerpc allmodconfig
powerpc allnoconfig
i386 randconfig-a006-20200517
i386 randconfig-a005-20200517
i386 randconfig-a003-20200517
i386 randconfig-a001-20200517
i386 randconfig-a004-20200517
i386 randconfig-a002-20200517
x86_64 randconfig-a005-20200517
x86_64 randconfig-a003-20200517
x86_64 randconfig-a006-20200517
x86_64 randconfig-a004-20200517
x86_64 randconfig-a001-20200517
x86_64 randconfig-a002-20200517
i386 randconfig-a012-20200517
i386 randconfig-a016-20200517
i386 randconfig-a014-20200517
i386 randconfig-a011-20200517
i386 randconfig-a013-20200517
i386 randconfig-a015-20200517
i386 randconfig-a012-20200515
i386 randconfig-a016-20200515
i386 randconfig-a014-20200515
i386 randconfig-a013-20200515
x86_64 defconfig
i386 allyesconfig
um allmodconfig
um allnoconfig
um allyesconfig
um defconfig
x86_64 rhel
x86_64 rhel-7.6
x86_64 rhel-7.6-kselftests
x86_64 rhel-7.2-clear
x86_64 lkp
x86_64 fedora-25
x86_64 kexec
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
* [powerpc:merge] BUILD SUCCESS 5b55902a1d35b73daec747b11b903959b8e7aa70
From: kbuild test robot @ 2020-05-17 0:06 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git merge
branch HEAD: 5b55902a1d35b73daec747b11b903959b8e7aa70 Automatic merge of 'master', 'next' and 'fixes' (2020-05-16 21:36)
elapsed time: 553m
configs tested: 117
configs skipped: 3
The following configs have been built successfully.
More configs may be tested in the coming days.
arm64 allyesconfig
arm64 defconfig
arm64 allmodconfig
arm64 allnoconfig
arm defconfig
arm allyesconfig
arm allmodconfig
arm allnoconfig
mips allyesconfig
m68k allyesconfig
s390 zfcpdump_defconfig
powerpc mpc866_ads_defconfig
powerpc adder875_defconfig
sh microdev_defconfig
arm shannon_defconfig
sh kfr2r09-romimage_defconfig
powerpc pq2fads_defconfig
mips bmips_stb_defconfig
csky alldefconfig
mips malta_kvm_guest_defconfig
arm imx_v4_v5_defconfig
arm moxart_defconfig
c6x dsk6455_defconfig
mips gpr_defconfig
sh sh2007_defconfig
arm vt8500_v6_v7_defconfig
powerpc g5_defconfig
powerpc cell_defconfig
arm multi_v7_defconfig
i386 allnoconfig
i386 defconfig
i386 debian-10.3
i386 allyesconfig
ia64 allmodconfig
ia64 defconfig
ia64 allnoconfig
ia64 allyesconfig
m68k allmodconfig
m68k allnoconfig
m68k sun3_defconfig
m68k defconfig
nds32 defconfig
nds32 allnoconfig
csky allyesconfig
csky defconfig
alpha defconfig
alpha allyesconfig
xtensa allyesconfig
h8300 allyesconfig
h8300 allmodconfig
xtensa defconfig
nios2 defconfig
nios2 allyesconfig
openrisc defconfig
c6x allyesconfig
c6x allnoconfig
openrisc allyesconfig
arc defconfig
arc allyesconfig
sh allmodconfig
sh allnoconfig
microblaze allnoconfig
mips allnoconfig
mips allmodconfig
parisc allnoconfig
parisc defconfig
parisc allyesconfig
parisc allmodconfig
powerpc defconfig
powerpc allyesconfig
powerpc rhel-kconfig
powerpc allmodconfig
powerpc allnoconfig
i386 randconfig-a006-20200515
i386 randconfig-a005-20200515
i386 randconfig-a003-20200515
i386 randconfig-a001-20200515
i386 randconfig-a004-20200515
i386 randconfig-a002-20200515
i386 randconfig-a012-20200515
i386 randconfig-a016-20200515
i386 randconfig-a014-20200515
i386 randconfig-a011-20200515
i386 randconfig-a013-20200515
i386 randconfig-a015-20200515
x86_64 randconfig-a005-20200515
x86_64 randconfig-a003-20200515
x86_64 randconfig-a006-20200515
x86_64 randconfig-a004-20200515
x86_64 randconfig-a001-20200515
x86_64 randconfig-a002-20200515
riscv allyesconfig
riscv allnoconfig
riscv defconfig
riscv allmodconfig
s390 allyesconfig
s390 allnoconfig
s390 allmodconfig
s390 defconfig
x86_64 defconfig
sparc allyesconfig
sparc defconfig
sparc64 defconfig
sparc64 allnoconfig
sparc64 allyesconfig
sparc64 allmodconfig
um allmodconfig
um allnoconfig
um allyesconfig
um defconfig
x86_64 rhel
x86_64 rhel-7.6
x86_64 rhel-7.6-kselftests
x86_64 rhel-7.2-clear
x86_64 lkp
x86_64 fedora-25
x86_64 kexec
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
* Re: [PATCH V3 07/15] arch/kunmap_atomic: Consolidate duplicate code
From: Guenter Roeck @ 2020-05-16 22:33 UTC (permalink / raw)
To: ira.weiny
Cc: Peter Zijlstra, Dave Hansen, dri-devel, linux-mips,
James E.J. Bottomley, Max Filippov, Paul Mackerras,
H. Peter Anvin, sparclinux, Dan Williams, Helge Deller, x86,
linux-csky, Christoph Hellwig, Ingo Molnar, linux-snps-arc,
linux-xtensa, Borislav Petkov, Al Viro, Andy Lutomirski,
Thomas Gleixner, linux-arm-kernel, Chris Zankel,
Thomas Bogendoerfer, linux-parisc, linux-kernel, Christian Koenig,
Andrew Morton, linuxppc-dev, David S. Miller
In-Reply-To: <20200507150004.1423069-8-ira.weiny@intel.com>
On Thu, May 07, 2020 at 07:59:55AM -0700, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
>
> Every single architecture (including !CONFIG_HIGHMEM) calls...
>
> pagefault_enable();
> preempt_enable();
>
> ... before returning from __kunmap_atomic(). Lift this code into the
> kunmap_atomic() macro.
>
> While we are at it rename __kunmap_atomic() to kunmap_atomic_high() to
> be consistent.
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
This patch results in:
Starting init: /bin/sh exists but couldn't execute it (error -14)
when trying to boot microblazeel:petalogix-ml605 in qemu.
Bisect log attached.
Guenter
---
# bad: [bdecf38f228bcca73b31ada98b5b7ba1215eb9c9] Add linux-next specific files for 20200515
# good: [2ef96a5bb12be62ef75b5828c0aab838ebb29cb8] Linux 5.7-rc5
git bisect start 'HEAD' 'v5.7-rc5'
# good: [3674d7aa7a8e61d993886c2fb7c896c5ef85e988] Merge remote-tracking branch 'crypto/master'
git bisect good 3674d7aa7a8e61d993886c2fb7c896c5ef85e988
# good: [87f6f21783522e6d62127cf33ae5e95f50874beb] Merge remote-tracking branch 'spi/for-next'
git bisect good 87f6f21783522e6d62127cf33ae5e95f50874beb
# good: [5c428e8277d5d97c85126387d4e00aa5adde4400] Merge remote-tracking branch 'staging/staging-next'
git bisect good 5c428e8277d5d97c85126387d4e00aa5adde4400
# good: [f68de67ed934e7bdef4799fd7777c86f33f14982] Merge remote-tracking branch 'hyperv/hyperv-next'
git bisect good f68de67ed934e7bdef4799fd7777c86f33f14982
# bad: [54acd2dc52b069da59639eea0d0c92726f32fb01] mm/memblock: fix a typo in comment "implict"->"implicit"
git bisect bad 54acd2dc52b069da59639eea0d0c92726f32fb01
# good: [784a17aa58a529b84f7cc50f351ed4acf3bd11f3] mm: remove the pgprot argument to __vmalloc
git bisect good 784a17aa58a529b84f7cc50f351ed4acf3bd11f3
# good: [6cd8137ff37e9a37aee2d2a8889c8beb8eab192f] khugepaged: replace the usage of system(3) in the test
git bisect good 6cd8137ff37e9a37aee2d2a8889c8beb8eab192f
# bad: [6987da379826ed01b8a1cf046b67cc8cc10117cc] sparc: remove unnecessary includes
git bisect bad 6987da379826ed01b8a1cf046b67cc8cc10117cc
# good: [bc17b545388f64c09e83e367898e28f60277c584] mm/hugetlb: define a generic fallback for is_hugepage_only_range()
git bisect good bc17b545388f64c09e83e367898e28f60277c584
# bad: [9b5aa5b43f957f03a1f4a9aff5f7924e2ebbc011] arch-kmap_atomic-consolidate-duplicate-code-checkpatch-fixes
git bisect bad 9b5aa5b43f957f03a1f4a9aff5f7924e2ebbc011
# good: [0941a38ff0790c1004270f952067a5918a4ba32d] arch/kmap: remove redundant arch specific kmaps
git bisect good 0941a38ff0790c1004270f952067a5918a4ba32d
# good: [56e635a64c2cbfa815c851af10e0f811e809977b] arch-kunmap-remove-duplicate-kunmap-implementations-fix
git bisect good 56e635a64c2cbfa815c851af10e0f811e809977b
# bad: [60f96b2233c790d4f1c49317643051f1670bcb29] arch/kmap_atomic: consolidate duplicate code
git bisect bad 60f96b2233c790d4f1c49317643051f1670bcb29
# good: [7b3708dc3bf72a647243064fe7ddf9a76248ddfd] {x86,powerpc,microblaze}/kmap: move preempt disable
git bisect good 7b3708dc3bf72a647243064fe7ddf9a76248ddfd
# first bad commit: [60f96b2233c790d4f1c49317643051f1670bcb29] arch/kmap_atomic: consolidate duplicate code
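For anyone reading the bisect result without the patch at hand, the consolidation described in the commit message can be pictured roughly as below. This is only a user-space sketch: every demo_* name is hypothetical, the counters stand in for the kernel's preempt and pagefault state, and it makes no claim about what actually goes wrong on microblaze.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's preempt and pagefault counters. */
static int demo_preempt_count;
static int demo_pagefault_count;

static void demo_preempt_disable(void)   { demo_preempt_count++; }
static void demo_preempt_enable(void)    { demo_preempt_count--; }
static void demo_pagefault_disable(void) { demo_pagefault_count++; }
static void demo_pagefault_enable(void)  { demo_pagefault_count--; }

/* Arch-specific part: only tears down the mapping, no enable calls. */
static void demo_kunmap_atomic_high(void *addr)
{
	(void)addr; /* a real implementation would clear the fixmap slot here */
}

/* Generic wrapper: the enable calls live in exactly one place. */
#define demo_kunmap_atomic(addr)		\
do {						\
	demo_kunmap_atomic_high(addr);		\
	demo_pagefault_enable();		\
	demo_preempt_enable();			\
} while (0)

/* Mapping side of the sketch; returns the address unchanged. */
static void *demo_kmap_atomic(void *page)
{
	demo_preempt_disable();
	demo_pagefault_disable();
	return page;
}
```

With this shape, an early return inside the arch helper can no longer skip the enable calls, because the wrapper always runs them after the helper returns.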
* Re: [PATCH v8 12/30] powerpc: Use a function for reading instructions
From: Christophe Leroy @ 2020-05-16 18:39 UTC (permalink / raw)
To: Jordan Niethe, linuxppc-dev
Cc: christophe.leroy, alistair, npiggin, bala24, naveen.n.rao, dja
In-Reply-To: <20200506034050.24806-13-jniethe5@gmail.com>
On 06/05/2020 at 05:40, Jordan Niethe wrote:
> Prefixed instructions will mean there are instructions of different
> length. As a result dereferencing a pointer to an instruction will not
> necessarily give the desired result. Introduce a function for reading
> instructions from memory into the instruction data type.
Shouldn't this function be used in mmu_patch_addis() in mm/nohash/8xx.c?
Christophe
>
> Reviewed-by: Alistair Popple <alistair@popple.id.au>
> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> ---
> v4: New to series
> v5: - Rename read_inst() -> probe_kernel_read_inst()
> - No longer modify uprobe probe type in this patch
> v6: - feature-fixups.c: do_final_fixups(): Use here
> - arch_prepare_kprobe(): patch_instruction(): no longer part of this
> patch
> - Move probe_kernel_read_inst() out of this patch
> - Use in uprobes
> v8: style
> ---
> arch/powerpc/include/asm/inst.h | 5 +++++
> arch/powerpc/kernel/kprobes.c | 6 +++---
> arch/powerpc/kernel/mce_power.c | 2 +-
> arch/powerpc/kernel/optprobes.c | 4 ++--
> arch/powerpc/kernel/trace/ftrace.c | 4 ++--
> arch/powerpc/kernel/uprobes.c | 2 +-
> arch/powerpc/lib/code-patching.c | 26 ++++++++++++++------------
> arch/powerpc/lib/feature-fixups.c | 4 ++--
> arch/powerpc/xmon/xmon.c | 6 +++---
> 9 files changed, 33 insertions(+), 26 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/inst.h b/arch/powerpc/include/asm/inst.h
> index 19d8bb7a1c2b..552e953bf04f 100644
> --- a/arch/powerpc/include/asm/inst.h
> +++ b/arch/powerpc/include/asm/inst.h
> @@ -27,6 +27,11 @@ static inline struct ppc_inst ppc_inst_swab(struct ppc_inst x)
> return ppc_inst(swab32(ppc_inst_val(x)));
> }
>
> +static inline struct ppc_inst ppc_inst_read(const struct ppc_inst *ptr)
> +{
> + return *ptr;
> +}
> +
> static inline bool ppc_inst_equal(struct ppc_inst x, struct ppc_inst y)
> {
> return ppc_inst_val(x) == ppc_inst_val(y);
> diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
> index a08ae5803622..f64312dca84f 100644
> --- a/arch/powerpc/kernel/kprobes.c
> +++ b/arch/powerpc/kernel/kprobes.c
> @@ -106,7 +106,7 @@ kprobe_opcode_t *kprobe_lookup_name(const char *name, unsigned int offset)
> int arch_prepare_kprobe(struct kprobe *p)
> {
> int ret = 0;
> - struct ppc_inst insn = *(struct ppc_inst *)p->addr;
> + struct ppc_inst insn = ppc_inst_read((struct ppc_inst *)p->addr);
>
> if ((unsigned long)p->addr & 0x03) {
> printk("Attempt to register kprobe at an unaligned address\n");
> @@ -127,7 +127,7 @@ int arch_prepare_kprobe(struct kprobe *p)
> if (!ret) {
> memcpy(p->ainsn.insn, p->addr,
> MAX_INSN_SIZE * sizeof(kprobe_opcode_t));
> - p->opcode = *p->addr;
> + p->opcode = ppc_inst_val(insn);
> flush_icache_range((unsigned long)p->ainsn.insn,
> (unsigned long)p->ainsn.insn + sizeof(kprobe_opcode_t));
> }
> @@ -217,7 +217,7 @@ NOKPROBE_SYMBOL(arch_prepare_kretprobe);
> static int try_to_emulate(struct kprobe *p, struct pt_regs *regs)
> {
> int ret;
> - struct ppc_inst insn = *(struct ppc_inst *)p->ainsn.insn;
> + struct ppc_inst insn = ppc_inst_read((struct ppc_inst *)p->ainsn.insn);
>
> /* regs->nip is also adjusted if emulate_step returns 1 */
> ret = emulate_step(regs, insn);
> diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
> index cd23218c60bb..45c51ba0071b 100644
> --- a/arch/powerpc/kernel/mce_power.c
> +++ b/arch/powerpc/kernel/mce_power.c
> @@ -374,7 +374,7 @@ static int mce_find_instr_ea_and_phys(struct pt_regs *regs, uint64_t *addr,
> pfn = addr_to_pfn(regs, regs->nip);
> if (pfn != ULONG_MAX) {
> instr_addr = (pfn << PAGE_SHIFT) + (regs->nip & ~PAGE_MASK);
> - instr = *(struct ppc_inst *)(instr_addr);
> + instr = ppc_inst_read((struct ppc_inst *)instr_addr);
> if (!analyse_instr(&op, &tmp, instr)) {
> pfn = addr_to_pfn(regs, op.ea);
> *addr = op.ea;
> diff --git a/arch/powerpc/kernel/optprobes.c b/arch/powerpc/kernel/optprobes.c
> index 5a71fef71c22..52c1ab3f85aa 100644
> --- a/arch/powerpc/kernel/optprobes.c
> +++ b/arch/powerpc/kernel/optprobes.c
> @@ -100,9 +100,9 @@ static unsigned long can_optimize(struct kprobe *p)
> * Ensure that the instruction is not a conditional branch,
> * and that can be emulated.
> */
> - if (!is_conditional_branch(*(struct ppc_inst *)p->ainsn.insn) &&
> + if (!is_conditional_branch(ppc_inst_read((struct ppc_inst *)p->ainsn.insn)) &&
> analyse_instr(&op, &regs,
> - *(struct ppc_inst *)p->ainsn.insn) == 1) {
> + ppc_inst_read((struct ppc_inst *)p->ainsn.insn)) == 1) {
> emulate_update_regs(&regs, &op);
> nip = regs.nip;
> }
> diff --git a/arch/powerpc/kernel/trace/ftrace.c b/arch/powerpc/kernel/trace/ftrace.c
> index 3117ed675735..acd5b889815f 100644
> --- a/arch/powerpc/kernel/trace/ftrace.c
> +++ b/arch/powerpc/kernel/trace/ftrace.c
> @@ -848,7 +848,7 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
> struct ppc_inst old, new;
> int ret;
>
> - old = *(struct ppc_inst *)&ftrace_call;
> + old = ppc_inst_read((struct ppc_inst *)&ftrace_call);
> new = ftrace_call_replace(ip, (unsigned long)func, 1);
> ret = ftrace_modify_code(ip, old, new);
>
> @@ -856,7 +856,7 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
> /* Also update the regs callback function */
> if (!ret) {
> ip = (unsigned long)(&ftrace_regs_call);
> - old = *(struct ppc_inst *)&ftrace_regs_call;
> + old = ppc_inst_read((struct ppc_inst *)&ftrace_regs_call);
> new = ftrace_call_replace(ip, (unsigned long)func, 1);
> ret = ftrace_modify_code(ip, old, new);
> }
> diff --git a/arch/powerpc/kernel/uprobes.c b/arch/powerpc/kernel/uprobes.c
> index 31c870287f2b..6893d40a48c5 100644
> --- a/arch/powerpc/kernel/uprobes.c
> +++ b/arch/powerpc/kernel/uprobes.c
> @@ -174,7 +174,7 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)
> * emulate_step() returns 1 if the insn was successfully emulated.
> * For all other cases, we need to single-step in hardware.
> */
> - ret = emulate_step(regs, auprobe->insn);
> + ret = emulate_step(regs, ppc_inst_read(&auprobe->insn));
> if (ret > 0)
> return true;
>
> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
> index 1dff9d9d6645..435fc8e9f45d 100644
> --- a/arch/powerpc/lib/code-patching.c
> +++ b/arch/powerpc/lib/code-patching.c
> @@ -348,9 +348,9 @@ static unsigned long branch_bform_target(const struct ppc_inst *instr)
>
> unsigned long branch_target(const struct ppc_inst *instr)
> {
> - if (instr_is_branch_iform(*instr))
> + if (instr_is_branch_iform(ppc_inst_read(instr)))
> return branch_iform_target(instr);
> - else if (instr_is_branch_bform(*instr))
> + else if (instr_is_branch_bform(ppc_inst_read(instr)))
> return branch_bform_target(instr);
>
> return 0;
> @@ -358,7 +358,8 @@ unsigned long branch_target(const struct ppc_inst *instr)
>
> int instr_is_branch_to_addr(const struct ppc_inst *instr, unsigned long addr)
> {
> - if (instr_is_branch_iform(*instr) || instr_is_branch_bform(*instr))
> + if (instr_is_branch_iform(ppc_inst_read(instr)) ||
> + instr_is_branch_bform(ppc_inst_read(instr)))
> return branch_target(instr) == addr;
>
> return 0;
> @@ -368,13 +369,14 @@ int translate_branch(struct ppc_inst *instr, const struct ppc_inst *dest,
> const struct ppc_inst *src)
> {
> unsigned long target;
> -
> target = branch_target(src);
>
> - if (instr_is_branch_iform(*src))
> - return create_branch(instr, dest, target, ppc_inst_val(*src));
> - else if (instr_is_branch_bform(*src))
> - return create_cond_branch(instr, dest, target, ppc_inst_val(*src));
> + if (instr_is_branch_iform(ppc_inst_read(src)))
> + return create_branch(instr, dest, target,
> + ppc_inst_val(ppc_inst_read(src)));
> + else if (instr_is_branch_bform(ppc_inst_read(src)))
> + return create_cond_branch(instr, dest, target,
> + ppc_inst_val(ppc_inst_read(src)));
>
> return 1;
> }
> @@ -598,7 +600,7 @@ static void __init test_translate_branch(void)
> patch_instruction(q, instr);
> check(instr_is_branch_to_addr(p, addr));
> check(instr_is_branch_to_addr(q, addr));
> - check(ppc_inst_equal(*q, ppc_inst(0x4a000000)));
> + check(ppc_inst_equal(ppc_inst_read(q), ppc_inst(0x4a000000)));
>
> /* Maximum positive case, move x to x - 32 MB + 4 */
> p = buf + 0x2000000;
> @@ -609,7 +611,7 @@ static void __init test_translate_branch(void)
> patch_instruction(q, instr);
> check(instr_is_branch_to_addr(p, addr));
> check(instr_is_branch_to_addr(q, addr));
> - check(ppc_inst_equal(*q, ppc_inst(0x49fffffc)));
> + check(ppc_inst_equal(ppc_inst_read(q), ppc_inst(0x49fffffc)));
>
> /* Jump to x + 16 MB moved to x + 20 MB */
> p = buf;
> @@ -655,7 +657,7 @@ static void __init test_translate_branch(void)
> patch_instruction(q, instr);
> check(instr_is_branch_to_addr(p, addr));
> check(instr_is_branch_to_addr(q, addr));
> - check(ppc_inst_equal(*q, ppc_inst(0x43ff8000)));
> + check(ppc_inst_equal(ppc_inst_read(q), ppc_inst(0x43ff8000)));
>
> /* Maximum positive case, move x to x - 32 KB + 4 */
> p = buf + 0x8000;
> @@ -667,7 +669,7 @@ static void __init test_translate_branch(void)
> patch_instruction(q, instr);
> check(instr_is_branch_to_addr(p, addr));
> check(instr_is_branch_to_addr(q, addr));
> - check(ppc_inst_equal(*q, ppc_inst(0x43ff7ffc)));
> + check(ppc_inst_equal(ppc_inst_read(q), ppc_inst(0x43ff7ffc)));
>
> /* Jump to x + 12 KB moved to x + 20 KB */
> p = buf;
> diff --git a/arch/powerpc/lib/feature-fixups.c b/arch/powerpc/lib/feature-fixups.c
> index fb6e8e8abf4e..c0d3ed4efb7e 100644
> --- a/arch/powerpc/lib/feature-fixups.c
> +++ b/arch/powerpc/lib/feature-fixups.c
> @@ -48,7 +48,7 @@ static int patch_alt_instruction(struct ppc_inst *src, struct ppc_inst *dest,
> int err;
> struct ppc_inst instr;
>
> - instr = *src;
> + instr = ppc_inst_read(src);
>
> if (instr_is_relative_branch(*src)) {
> struct ppc_inst *target = (struct ppc_inst *)branch_target(src);
> @@ -403,7 +403,7 @@ static void do_final_fixups(void)
> length = (__end_interrupts - _stext) / sizeof(struct ppc_inst);
>
> while (length--) {
> - raw_patch_instruction(dest, *src);
> + raw_patch_instruction(dest, ppc_inst_read(src));
> src++;
> dest++;
> }
> diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
> index e0132d6d24d0..68e0b05d9226 100644
> --- a/arch/powerpc/xmon/xmon.c
> +++ b/arch/powerpc/xmon/xmon.c
> @@ -702,13 +702,13 @@ static int xmon_core(struct pt_regs *regs, int fromipi)
> if ((regs->msr & (MSR_IR|MSR_PR|MSR_64BIT)) == (MSR_IR|MSR_64BIT)) {
> bp = at_breakpoint(regs->nip);
> if (bp != NULL) {
> - int stepped = emulate_step(regs, bp->instr[0]);
> + int stepped = emulate_step(regs, ppc_inst_read(bp->instr));
> if (stepped == 0) {
> regs->nip = (unsigned long) &bp->instr[0];
> atomic_inc(&bp->ref_count);
> } else if (stepped < 0) {
> printf("Couldn't single-step %s instruction\n",
> - (IS_RFID(bp->instr[0])? "rfid": "mtmsrd"));
> + IS_RFID(ppc_inst_read(bp->instr))? "rfid": "mtmsrd");
> }
> }
> }
> @@ -949,7 +949,7 @@ static void remove_bpts(void)
> if (mread(bp->address, &instr, 4) == 4
> && ppc_inst_equal(instr, ppc_inst(bpinstr))
> && patch_instruction(
> - (struct ppc_inst *)bp->address, bp->instr[0]) != 0)
> + (struct ppc_inst *)bp->address, ppc_inst_read(bp->instr)) != 0)
> printf("Couldn't remove breakpoint at %lx\n",
> bp->address);
> }
>
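To illustrate why the commit message wants a read helper rather than a plain dereference, here is a small user-space sketch of a reader that copes with two instruction lengths. It is not the code from this series: the demo_* names are hypothetical, and treating primary opcode 1 as the prefix marker is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container: one word plus an optional suffix word. */
struct demo_inst {
	uint32_t val;
	uint32_t suffix;
	int has_suffix;
};

/* Assumed for illustration: primary opcode 1 marks a prefixed instruction. */
#define DEMO_OP_PREFIX 1u

/* Read one instruction; fetch the second word only when the first is a prefix. */
static struct demo_inst demo_inst_read(const void *ptr)
{
	struct demo_inst x = { 0, 0, 0 };

	memcpy(&x.val, ptr, sizeof(x.val));
	if ((x.val >> 26) == DEMO_OP_PREFIX) {
		memcpy(&x.suffix, (const unsigned char *)ptr + 4,
		       sizeof(x.suffix));
		x.has_suffix = 1;
	}
	return x;
}

/* Length in bytes of the instruction that was read. */
static unsigned int demo_inst_len(struct demo_inst x)
{
	return x.has_suffix ? 8 : 4;
}
```

Callers that go through such a helper do not need to know the instruction length up front, which is what a bare *ptr cannot offer once lengths differ.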
* Re: [PATCH v4 03/14] arm64: add support for folded p4d page tables
From: Mike Rapoport @ 2020-05-16 17:20 UTC (permalink / raw)
To: Andrew Morton
Cc: Rich Felker, linux-ia64, Geert Uytterhoeven, linux-sh, linux-mm,
Paul Mackerras, linux-hexagon, Will Deacon, kvmarm, Jonas Bonn,
linux-arch, Brian Cain, Marc Zyngier, Russell King, Ley Foon Tan,
Mike Rapoport, Catalin Marinas, Julien Thierry, uclinux-h8-devel,
Fenghua Yu, Arnd Bergmann, Suzuki K Poulose, kvm-ppc,
Stefan Kristiansson, openrisc, Stafford Horne, Guan Xuetao,
linux-arm-kernel, Christophe Leroy, Tony Luck, Yoshinori Sato,
linux-kernel, James Morse, nios2-dev, linuxppc-dev
In-Reply-To: <20200515114012.49f45aa01efb7d8b918bc0f5@linux-foundation.org>
On Fri, May 15, 2020 at 11:40:12AM -0700, Andrew Morton wrote:
> On Tue, 14 Apr 2020 18:34:44 +0300 Mike Rapoport <rppt@kernel.org> wrote:
>
> > Implement primitives necessary for the 4th level folding, add walks of p4d
> > level where appropriate, replace 5level-fixup.h with pgtable-nop4d.h and
> > remove __ARCH_USE_5LEVEL_HACK.
>
> This needed some rework due to arm changes in linux-next. Please check
> my handiwork and test it once I've merged this into linux-next?
Looks ok to me. It passed defconfig and a couple of randconfig builds
and qemu-system-aarch64 boots fine with this.
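As background for the p4d_* helpers in the quoted diff, "folding" a page table level can be sketched in user space as below. The demo_* names are hypothetical and the types are simplified; the sketch only mirrors the general idea behind the generic nop4d header, not the arm64 code itself.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint64_t pgd; } demo_pgd_t;

/* Folded level: a p4d entry is just a wrapper around the pgd entry. */
typedef struct { demo_pgd_t pgd; } demo_p4d_t;

/* With the level folded, the pgd entry is never reported as empty itself. */
static inline int demo_pgd_none(demo_pgd_t pgd)
{
	(void)pgd;
	return 0;
}

/* Stepping down to the p4d level returns the very same slot. */
static inline demo_p4d_t *demo_p4d_offset(demo_pgd_t *pgd, unsigned long addr)
{
	(void)addr;
	return (demo_p4d_t *)pgd;
}

/* The real emptiness check happens one level down. */
static inline int demo_p4d_none(demo_p4d_t p4d)
{
	return p4d.pgd.pgd == 0;
}
```

A walker written for five levels then runs unchanged on a four-level configuration, since the pgd step is a no-op and the checks land on the p4d helpers.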
> Rejects were
>
> --- arch/arm64/include/asm/pgtable.h~arm64-add-support-for-folded-p4d-page-tables
> +++ arch/arm64/include/asm/pgtable.h
> @@ -596,49 +604,50 @@ static inline phys_addr_t pud_page_paddr
>
> #define pud_ERROR(pud) __pud_error(__FILE__, __LINE__, pud_val(pud))
>
> -#define pgd_none(pgd) (!pgd_val(pgd))
> -#define pgd_bad(pgd) (!(pgd_val(pgd) & 2))
> -#define pgd_present(pgd) (pgd_val(pgd))
> +#define p4d_none(p4d) (!p4d_val(p4d))
> +#define p4d_bad(p4d) (!(p4d_val(p4d) & 2))
> +#define p4d_present(p4d) (p4d_val(p4d))
>
> -static inline void set_pgd(pgd_t *pgdp, pgd_t pgd)
> +static inline void set_p4d(p4d_t *p4dp, p4d_t p4d)
> {
> - if (in_swapper_pgdir(pgdp)) {
> - set_swapper_pgd(pgdp, pgd);
> + if (in_swapper_pgdir(p4dp)) {
> + set_swapper_pgd((pgd_t *)p4dp, __pgd(p4d_val(p4d)));
> return;
> }
>
> - WRITE_ONCE(*pgdp, pgd);
> + WRITE_ONCE(*p4dp, p4d);
> dsb(ishst);
> isb();
> }
>
> -static inline void pgd_clear(pgd_t *pgdp)
> +static inline void p4d_clear(p4d_t *p4dp)
> {
> - set_pgd(pgdp, __pgd(0));
> + set_p4d(p4dp, __p4d(0));
> }
>
> -static inline phys_addr_t pgd_page_paddr(pgd_t pgd)
> +static inline phys_addr_t p4d_page_paddr(p4d_t p4d)
> {
> - return __pgd_to_phys(pgd);
> + return __p4d_to_phys(p4d);
> }
>
> /* Find an entry in the frst-level page table. */
> #define pud_index(addr) (((addr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1))
>
> -#define pud_offset_phys(dir, addr) (pgd_page_paddr(READ_ONCE(*(dir))) + pud_index(addr) * sizeof(pud_t))
> +#define pud_offset_phys(dir, addr) (p4d_page_paddr(READ_ONCE(*(dir))) + pud_index(addr) * sizeof(pud_t))
> #define pud_offset(dir, addr) ((pud_t *)__va(pud_offset_phys((dir), (addr))))
>
> #define pud_set_fixmap(addr) ((pud_t *)set_fixmap_offset(FIX_PUD, addr))
> -#define pud_set_fixmap_offset(pgd, addr) pud_set_fixmap(pud_offset_phys(pgd, addr))
> +#define pud_set_fixmap_offset(p4d, addr) pud_set_fixmap(pud_offset_phys(p4d, addr))
> #define pud_clear_fixmap() clear_fixmap(FIX_PUD)
>
> -#define pgd_page(pgd) pfn_to_page(__phys_to_pfn(__pgd_to_phys(pgd)))
> +#define p4d_page(p4d) pfn_to_page(__phys_to_pfn(__p4d_to_phys(p4d)))
>
> /* use ONLY for statically allocated translation tables */
> #define pud_offset_kimg(dir,addr) ((pud_t *)__phys_to_kimg(pud_offset_phys((dir), (addr))))
>
> #else
>
> +#define p4d_page_paddr(p4d) ({ BUILD_BUG(); 0;})
> #define pgd_page_paddr(pgd) ({ BUILD_BUG(); 0;})
>
> /* Match pud_offset folding in <asm/generic/pgtable-nopud.h> */
>
>
>
> and
>
> --- arch/arm64/kvm/mmu.c~arm64-add-support-for-folded-p4d-page-tables
> +++ arch/arm64/kvm/mmu.c
> @@ -469,7 +517,7 @@ static void stage2_flush_memslot(struct
> do {
> next = stage2_pgd_addr_end(kvm, addr, end);
> if (!stage2_pgd_none(kvm, *pgd))
> - stage2_flush_puds(kvm, pgd, addr, next);
> + stage2_flush_p4ds(kvm, pgd, addr, next);
> } while (pgd++, addr = next, addr != end);
> }
>
>
>
> Result:
>
> From: Mike Rapoport <rppt@linux.ibm.com>
> Subject: arm64: add support for folded p4d page tables
>
> Implement primitives necessary for the 4th level folding, add walks of p4d
> level where appropriate, replace 5level-fixup.h with pgtable-nop4d.h and
> remove __ARCH_USE_5LEVEL_HACK.
>
> Link: http://lkml.kernel.org/r/20200414153455.21744-4-rppt@kernel.org
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Brian Cain <bcain@codeaurora.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Christophe Leroy <christophe.leroy@c-s.fr>
> Cc: Fenghua Yu <fenghua.yu@intel.com>
> Cc: Geert Uytterhoeven <geert+renesas@glider.be>
> Cc: Guan Xuetao <gxt@pku.edu.cn>
> Cc: James Morse <james.morse@arm.com>
> Cc: Jonas Bonn <jonas@southpole.se>
> Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
> Cc: Ley Foon Tan <ley.foon.tan@intel.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Rich Felker <dalias@libc.org>
> Cc: Russell King <linux@armlinux.org.uk>
> Cc: Stafford Horne <shorne@gmail.com>
> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> Cc: Tony Luck <tony.luck@intel.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
>
> arch/arm64/include/asm/kvm_mmu.h | 10 -
> arch/arm64/include/asm/pgalloc.h | 10 -
> arch/arm64/include/asm/pgtable-types.h | 5
> arch/arm64/include/asm/pgtable.h | 37 ++-
> arch/arm64/include/asm/stage2_pgtable.h | 48 +++--
> arch/arm64/kernel/hibernate.c | 44 +++-
> arch/arm64/kvm/mmu.c | 209 ++++++++++++++++++----
> arch/arm64/mm/fault.c | 9
> arch/arm64/mm/hugetlbpage.c | 15 +
> arch/arm64/mm/kasan_init.c | 26 ++
> arch/arm64/mm/mmu.c | 52 +++--
> arch/arm64/mm/pageattr.c | 7
> 12 files changed, 368 insertions(+), 104 deletions(-)
>
> --- a/arch/arm64/include/asm/kvm_mmu.h~arm64-add-support-for-folded-p4d-page-tables
> +++ a/arch/arm64/include/asm/kvm_mmu.h
> @@ -172,8 +172,8 @@ void kvm_clear_hyp_idmap(void);
> __pmd(__phys_to_pmd_val(__pa(ptep)) | PMD_TYPE_TABLE)
> #define kvm_mk_pud(pmdp) \
> __pud(__phys_to_pud_val(__pa(pmdp)) | PMD_TYPE_TABLE)
> -#define kvm_mk_pgd(pudp) \
> - __pgd(__phys_to_pgd_val(__pa(pudp)) | PUD_TYPE_TABLE)
> +#define kvm_mk_p4d(pmdp) \
> + __p4d(__phys_to_p4d_val(__pa(pmdp)) | PUD_TYPE_TABLE)
>
> #define kvm_set_pud(pudp, pud) set_pud(pudp, pud)
>
> @@ -299,6 +299,12 @@ static inline bool kvm_s2pud_young(pud_t
> #define hyp_pud_table_empty(pudp) kvm_page_empty(pudp)
> #endif
>
> +#ifdef __PAGETABLE_P4D_FOLDED
> +#define hyp_p4d_table_empty(p4dp) (0)
> +#else
> +#define hyp_p4d_table_empty(p4dp) kvm_page_empty(p4dp)
> +#endif
> +
> struct kvm;
>
> #define kvm_flush_dcache_to_poc(a,l) __flush_dcache_area((a), (l))
> --- a/arch/arm64/include/asm/pgalloc.h~arm64-add-support-for-folded-p4d-page-tables
> +++ a/arch/arm64/include/asm/pgalloc.h
> @@ -73,17 +73,17 @@ static inline void pud_free(struct mm_st
> free_page((unsigned long)pudp);
> }
>
> -static inline void __pgd_populate(pgd_t *pgdp, phys_addr_t pudp, pgdval_t prot)
> +static inline void __p4d_populate(p4d_t *p4dp, phys_addr_t pudp, p4dval_t prot)
> {
> - set_pgd(pgdp, __pgd(__phys_to_pgd_val(pudp) | prot));
> + set_p4d(p4dp, __p4d(__phys_to_p4d_val(pudp) | prot));
> }
>
> -static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgdp, pud_t *pudp)
> +static inline void p4d_populate(struct mm_struct *mm, p4d_t *p4dp, pud_t *pudp)
> {
> - __pgd_populate(pgdp, __pa(pudp), PUD_TYPE_TABLE);
> + __p4d_populate(p4dp, __pa(pudp), PUD_TYPE_TABLE);
> }
> #else
> -static inline void __pgd_populate(pgd_t *pgdp, phys_addr_t pudp, pgdval_t prot)
> +static inline void __p4d_populate(p4d_t *p4dp, phys_addr_t pudp, p4dval_t prot)
> {
> BUILD_BUG();
> }
> --- a/arch/arm64/include/asm/pgtable.h~arm64-add-support-for-folded-p4d-page-tables
> +++ a/arch/arm64/include/asm/pgtable.h
> @@ -298,6 +298,11 @@ static inline pte_t pgd_pte(pgd_t pgd)
> return __pte(pgd_val(pgd));
> }
>
> +static inline pte_t p4d_pte(p4d_t p4d)
> +{
> + return __pte(p4d_val(p4d));
> +}
> +
> static inline pte_t pud_pte(pud_t pud)
> {
> return __pte(pud_val(pud));
> @@ -401,6 +406,9 @@ static inline pmd_t pmd_mkdevmap(pmd_t p
>
> #define set_pmd_at(mm, addr, pmdp, pmd) set_pte_at(mm, addr, (pte_t *)pmdp, pmd_pte(pmd))
>
> +#define __p4d_to_phys(p4d) __pte_to_phys(p4d_pte(p4d))
> +#define __phys_to_p4d_val(phys) __phys_to_pte_val(phys)
> +
> #define __pgd_to_phys(pgd) __pte_to_phys(pgd_pte(pgd))
> #define __phys_to_pgd_val(phys) __phys_to_pte_val(phys)
>
> @@ -592,49 +600,50 @@ static inline phys_addr_t pud_page_paddr
>
> #define pud_ERROR(pud) __pud_error(__FILE__, __LINE__, pud_val(pud))
>
> -#define pgd_none(pgd) (!pgd_val(pgd))
> -#define pgd_bad(pgd) (!(pgd_val(pgd) & 2))
> -#define pgd_present(pgd) (pgd_val(pgd))
> +#define p4d_none(p4d) (!p4d_val(p4d))
> +#define p4d_bad(p4d) (!(p4d_val(p4d) & 2))
> +#define p4d_present(p4d) (p4d_val(p4d))
>
> -static inline void set_pgd(pgd_t *pgdp, pgd_t pgd)
> +static inline void set_p4d(p4d_t *p4dp, p4d_t p4d)
> {
> - if (in_swapper_pgdir(pgdp)) {
> - set_swapper_pgd(pgdp, pgd);
> + if (in_swapper_pgdir(p4dp)) {
> + set_swapper_pgd((pgd_t *)p4dp, __pgd(p4d_val(p4d)));
> return;
> }
>
> - WRITE_ONCE(*pgdp, pgd);
> + WRITE_ONCE(*p4dp, p4d);
> dsb(ishst);
> isb();
> }
>
> -static inline void pgd_clear(pgd_t *pgdp)
> +static inline void p4d_clear(p4d_t *p4dp)
> {
> - set_pgd(pgdp, __pgd(0));
> + set_p4d(p4dp, __p4d(0));
> }
>
> -static inline phys_addr_t pgd_page_paddr(pgd_t pgd)
> +static inline phys_addr_t p4d_page_paddr(p4d_t p4d)
> {
> - return __pgd_to_phys(pgd);
> + return __p4d_to_phys(p4d);
> }
>
> /* Find an entry in the frst-level page table. */
> #define pud_index(addr) (((addr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1))
>
> -#define pud_offset_phys(dir, addr) (pgd_page_paddr(READ_ONCE(*(dir))) + pud_index(addr) * sizeof(pud_t))
> +#define pud_offset_phys(dir, addr) (p4d_page_paddr(READ_ONCE(*(dir))) + pud_index(addr) * sizeof(pud_t))
> #define pud_offset(dir, addr) ((pud_t *)__va(pud_offset_phys((dir), (addr))))
>
> #define pud_set_fixmap(addr) ((pud_t *)set_fixmap_offset(FIX_PUD, addr))
> -#define pud_set_fixmap_offset(pgd, addr) pud_set_fixmap(pud_offset_phys(pgd, addr))
> +#define pud_set_fixmap_offset(p4d, addr) pud_set_fixmap(pud_offset_phys(p4d, addr))
> #define pud_clear_fixmap() clear_fixmap(FIX_PUD)
>
> -#define pgd_page(pgd) phys_to_page(__pgd_to_phys(pgd))
> +#define p4d_page(p4d) pfn_to_page(__phys_to_pfn(__p4d_to_phys(p4d)))
>
> /* use ONLY for statically allocated translation tables */
> #define pud_offset_kimg(dir,addr) ((pud_t *)__phys_to_kimg(pud_offset_phys((dir), (addr))))
>
> #else
>
> +#define p4d_page_paddr(p4d) ({ BUILD_BUG(); 0;})
> #define pgd_page_paddr(pgd) ({ BUILD_BUG(); 0;})
>
> /* Match pud_offset folding in <asm/generic/pgtable-nopud.h> */
> --- a/arch/arm64/include/asm/pgtable-types.h~arm64-add-support-for-folded-p4d-page-tables
> +++ a/arch/arm64/include/asm/pgtable-types.h
> @@ -14,6 +14,7 @@
> typedef u64 pteval_t;
> typedef u64 pmdval_t;
> typedef u64 pudval_t;
> +typedef u64 p4dval_t;
> typedef u64 pgdval_t;
>
> /*
> @@ -44,13 +45,11 @@ typedef struct { pteval_t pgprot; } pgpr
> #define __pgprot(x) ((pgprot_t) { (x) } )
>
> #if CONFIG_PGTABLE_LEVELS == 2
> -#define __ARCH_USE_5LEVEL_HACK
> #include <asm-generic/pgtable-nopmd.h>
> #elif CONFIG_PGTABLE_LEVELS == 3
> -#define __ARCH_USE_5LEVEL_HACK
> #include <asm-generic/pgtable-nopud.h>
> #elif CONFIG_PGTABLE_LEVELS == 4
> -#include <asm-generic/5level-fixup.h>
> +#include <asm-generic/pgtable-nop4d.h>
> #endif
>
> #endif /* __ASM_PGTABLE_TYPES_H */
> --- a/arch/arm64/include/asm/stage2_pgtable.h~arm64-add-support-for-folded-p4d-page-tables
> +++ a/arch/arm64/include/asm/stage2_pgtable.h
> @@ -68,41 +68,67 @@ static inline bool kvm_stage2_has_pud(st
> #define S2_PUD_SIZE (1UL << S2_PUD_SHIFT)
> #define S2_PUD_MASK (~(S2_PUD_SIZE - 1))
>
> -static inline bool stage2_pgd_none(struct kvm *kvm, pgd_t pgd)
> +#define stage2_pgd_none(kvm, pgd) pgd_none(pgd)
> +#define stage2_pgd_clear(kvm, pgd) pgd_clear(pgd)
> +#define stage2_pgd_present(kvm, pgd) pgd_present(pgd)
> +#define stage2_pgd_populate(kvm, pgd, p4d) pgd_populate(NULL, pgd, p4d)
> +
> +static inline p4d_t *stage2_p4d_offset(struct kvm *kvm,
> + pgd_t *pgd, unsigned long address)
> +{
> + return p4d_offset(pgd, address);
> +}
> +
> +static inline void stage2_p4d_free(struct kvm *kvm, p4d_t *p4d)
> +{
> +}
> +
> +static inline bool stage2_p4d_table_empty(struct kvm *kvm, p4d_t *p4dp)
> +{
> + return false;
> +}
> +
> +static inline phys_addr_t stage2_p4d_addr_end(struct kvm *kvm,
> + phys_addr_t addr, phys_addr_t end)
> +{
> + return end;
> +}
> +
> +static inline bool stage2_p4d_none(struct kvm *kvm, p4d_t p4d)
> {
> if (kvm_stage2_has_pud(kvm))
> - return pgd_none(pgd);
> + return p4d_none(p4d);
> else
> return 0;
> }
>
> -static inline void stage2_pgd_clear(struct kvm *kvm, pgd_t *pgdp)
> +static inline void stage2_p4d_clear(struct kvm *kvm, p4d_t *p4dp)
> {
> if (kvm_stage2_has_pud(kvm))
> - pgd_clear(pgdp);
> + p4d_clear(p4dp);
> }
>
> -static inline bool stage2_pgd_present(struct kvm *kvm, pgd_t pgd)
> +static inline bool stage2_p4d_present(struct kvm *kvm, p4d_t p4d)
> {
> if (kvm_stage2_has_pud(kvm))
> - return pgd_present(pgd);
> + return p4d_present(p4d);
> else
> return 1;
> }
>
> -static inline void stage2_pgd_populate(struct kvm *kvm, pgd_t *pgd, pud_t *pud)
> +static inline void stage2_p4d_populate(struct kvm *kvm, p4d_t *p4d, pud_t *pud)
> {
> if (kvm_stage2_has_pud(kvm))
> - pgd_populate(NULL, pgd, pud);
> + p4d_populate(NULL, p4d, pud);
> }
>
> static inline pud_t *stage2_pud_offset(struct kvm *kvm,
> - pgd_t *pgd, unsigned long address)
> + p4d_t *p4d, unsigned long address)
> {
> if (kvm_stage2_has_pud(kvm))
> - return pud_offset(pgd, address);
> + return pud_offset(p4d, address);
> else
> - return (pud_t *)pgd;
> + return (pud_t *)p4d;
> }
>
> static inline void stage2_pud_free(struct kvm *kvm, pud_t *pud)
> --- a/arch/arm64/kernel/hibernate.c~arm64-add-support-for-folded-p4d-page-tables
> +++ a/arch/arm64/kernel/hibernate.c
> @@ -184,6 +184,7 @@ static int trans_pgd_map_page(pgd_t *tra
> pgprot_t pgprot)
> {
> pgd_t *pgdp;
> + p4d_t *p4dp;
> pud_t *pudp;
> pmd_t *pmdp;
> pte_t *ptep;
> @@ -196,7 +197,15 @@ static int trans_pgd_map_page(pgd_t *tra
> pgd_populate(&init_mm, pgdp, pudp);
> }
>
> - pudp = pud_offset(pgdp, dst_addr);
> + p4dp = p4d_offset(pgdp, dst_addr);
> + if (p4d_none(READ_ONCE(*p4dp))) {
> + pudp = (void *)get_safe_page(GFP_ATOMIC);
> + if (!pudp)
> + return -ENOMEM;
> + p4d_populate(&init_mm, p4dp, pudp);
> + }
> +
> + pudp = pud_offset(p4dp, dst_addr);
> if (pud_none(READ_ONCE(*pudp))) {
> pmdp = (void *)get_safe_page(GFP_ATOMIC);
> if (!pmdp)
> @@ -419,7 +428,7 @@ static int copy_pmd(pud_t *dst_pudp, pud
> return 0;
> }
>
> -static int copy_pud(pgd_t *dst_pgdp, pgd_t *src_pgdp, unsigned long start,
> +static int copy_pud(p4d_t *dst_p4dp, p4d_t *src_p4dp, unsigned long start,
> unsigned long end)
> {
> pud_t *dst_pudp;
> @@ -427,15 +436,15 @@ static int copy_pud(pgd_t *dst_pgdp, pgd
> unsigned long next;
> unsigned long addr = start;
>
> - if (pgd_none(READ_ONCE(*dst_pgdp))) {
> + if (p4d_none(READ_ONCE(*dst_p4dp))) {
> dst_pudp = (pud_t *)get_safe_page(GFP_ATOMIC);
> if (!dst_pudp)
> return -ENOMEM;
> - pgd_populate(&init_mm, dst_pgdp, dst_pudp);
> + p4d_populate(&init_mm, dst_p4dp, dst_pudp);
> }
> - dst_pudp = pud_offset(dst_pgdp, start);
> + dst_pudp = pud_offset(dst_p4dp, start);
>
> - src_pudp = pud_offset(src_pgdp, start);
> + src_pudp = pud_offset(src_p4dp, start);
> do {
> pud_t pud = READ_ONCE(*src_pudp);
>
> @@ -454,6 +463,27 @@ static int copy_pud(pgd_t *dst_pgdp, pgd
> return 0;
> }
>
> +static int copy_p4d(pgd_t *dst_pgdp, pgd_t *src_pgdp, unsigned long start,
> + unsigned long end)
> +{
> + p4d_t *dst_p4dp;
> + p4d_t *src_p4dp;
> + unsigned long next;
> + unsigned long addr = start;
> +
> + dst_p4dp = p4d_offset(dst_pgdp, start);
> + src_p4dp = p4d_offset(src_pgdp, start);
> + do {
> + next = p4d_addr_end(addr, end);
> + if (p4d_none(READ_ONCE(*src_p4dp)))
> + continue;
> + if (copy_pud(dst_p4dp, src_p4dp, addr, next))
> + return -ENOMEM;
> + } while (dst_p4dp++, src_p4dp++, addr = next, addr != end);
> +
> + return 0;
> +}
> +
> static int copy_page_tables(pgd_t *dst_pgdp, unsigned long start,
> unsigned long end)
> {
> @@ -466,7 +496,7 @@ static int copy_page_tables(pgd_t *dst_p
> next = pgd_addr_end(addr, end);
> if (pgd_none(READ_ONCE(*src_pgdp)))
> continue;
> - if (copy_pud(dst_pgdp, src_pgdp, addr, next))
> + if (copy_p4d(dst_pgdp, src_pgdp, addr, next))
> return -ENOMEM;
> } while (dst_pgdp++, src_pgdp++, addr = next, addr != end);
>
> --- a/arch/arm64/kvm/mmu.c~arm64-add-support-for-folded-p4d-page-tables
> +++ a/arch/arm64/kvm/mmu.c
> @@ -158,13 +158,22 @@ static void *mmu_memory_cache_alloc(stru
>
> static void clear_stage2_pgd_entry(struct kvm *kvm, pgd_t *pgd, phys_addr_t addr)
> {
> - pud_t *pud_table __maybe_unused = stage2_pud_offset(kvm, pgd, 0UL);
> + p4d_t *p4d_table __maybe_unused = stage2_p4d_offset(kvm, pgd, 0UL);
> stage2_pgd_clear(kvm, pgd);
> kvm_tlb_flush_vmid_ipa(kvm, addr);
> - stage2_pud_free(kvm, pud_table);
> + stage2_p4d_free(kvm, p4d_table);
> put_page(virt_to_page(pgd));
> }
>
> +static void clear_stage2_p4d_entry(struct kvm *kvm, p4d_t *p4d, phys_addr_t addr)
> +{
> + pud_t *pud_table __maybe_unused = stage2_pud_offset(kvm, p4d, 0);
> + stage2_p4d_clear(kvm, p4d);
> + kvm_tlb_flush_vmid_ipa(kvm, addr);
> + stage2_pud_free(kvm, pud_table);
> + put_page(virt_to_page(p4d));
> +}
> +
> static void clear_stage2_pud_entry(struct kvm *kvm, pud_t *pud, phys_addr_t addr)
> {
> pmd_t *pmd_table __maybe_unused = stage2_pmd_offset(kvm, pud, 0);
> @@ -208,12 +217,20 @@ static inline void kvm_pud_populate(pud_
> dsb(ishst);
> }
>
> -static inline void kvm_pgd_populate(pgd_t *pgdp, pud_t *pudp)
> +static inline void kvm_p4d_populate(p4d_t *p4dp, pud_t *pudp)
> {
> - WRITE_ONCE(*pgdp, kvm_mk_pgd(pudp));
> + WRITE_ONCE(*p4dp, kvm_mk_p4d(pudp));
> dsb(ishst);
> }
>
> +static inline void kvm_pgd_populate(pgd_t *pgdp, p4d_t *p4dp)
> +{
> +#ifndef __PAGETABLE_P4D_FOLDED
> + WRITE_ONCE(*pgdp, kvm_mk_pgd(p4dp));
> + dsb(ishst);
> +#endif
> +}
> +
> /*
> * Unmapping vs dcache management:
> *
> @@ -293,13 +310,13 @@ static void unmap_stage2_pmds(struct kvm
> clear_stage2_pud_entry(kvm, pud, start_addr);
> }
>
> -static void unmap_stage2_puds(struct kvm *kvm, pgd_t *pgd,
> +static void unmap_stage2_puds(struct kvm *kvm, p4d_t *p4d,
> phys_addr_t addr, phys_addr_t end)
> {
> phys_addr_t next, start_addr = addr;
> pud_t *pud, *start_pud;
>
> - start_pud = pud = stage2_pud_offset(kvm, pgd, addr);
> + start_pud = pud = stage2_pud_offset(kvm, p4d, addr);
> do {
> next = stage2_pud_addr_end(kvm, addr, end);
> if (!stage2_pud_none(kvm, *pud)) {
> @@ -317,6 +334,23 @@ static void unmap_stage2_puds(struct kvm
> } while (pud++, addr = next, addr != end);
>
> if (stage2_pud_table_empty(kvm, start_pud))
> + clear_stage2_p4d_entry(kvm, p4d, start_addr);
> +}
> +
> +static void unmap_stage2_p4ds(struct kvm *kvm, pgd_t *pgd,
> + phys_addr_t addr, phys_addr_t end)
> +{
> + phys_addr_t next, start_addr = addr;
> + p4d_t *p4d, *start_p4d;
> +
> + start_p4d = p4d = stage2_p4d_offset(kvm, pgd, addr);
> + do {
> + next = stage2_p4d_addr_end(kvm, addr, end);
> + if (!stage2_p4d_none(kvm, *p4d))
> + unmap_stage2_puds(kvm, p4d, addr, next);
> + } while (p4d++, addr = next, addr != end);
> +
> + if (stage2_p4d_table_empty(kvm, start_p4d))
> clear_stage2_pgd_entry(kvm, pgd, start_addr);
> }
>
> @@ -351,7 +385,7 @@ static void unmap_stage2_range(struct kv
> break;
> next = stage2_pgd_addr_end(kvm, addr, end);
> if (!stage2_pgd_none(kvm, *pgd))
> - unmap_stage2_puds(kvm, pgd, addr, next);
> + unmap_stage2_p4ds(kvm, pgd, addr, next);
> /*
> * If the range is too large, release the kvm->mmu_lock
> * to prevent starvation and lockup detector warnings.
> @@ -391,13 +425,13 @@ static void stage2_flush_pmds(struct kvm
> } while (pmd++, addr = next, addr != end);
> }
>
> -static void stage2_flush_puds(struct kvm *kvm, pgd_t *pgd,
> +static void stage2_flush_puds(struct kvm *kvm, p4d_t *p4d,
> phys_addr_t addr, phys_addr_t end)
> {
> pud_t *pud;
> phys_addr_t next;
>
> - pud = stage2_pud_offset(kvm, pgd, addr);
> + pud = stage2_pud_offset(kvm, p4d, addr);
> do {
> next = stage2_pud_addr_end(kvm, addr, end);
> if (!stage2_pud_none(kvm, *pud)) {
> @@ -409,6 +443,20 @@ static void stage2_flush_puds(struct kvm
> } while (pud++, addr = next, addr != end);
> }
>
> +static void stage2_flush_p4ds(struct kvm *kvm, pgd_t *pgd,
> + phys_addr_t addr, phys_addr_t end)
> +{
> + p4d_t *p4d;
> + phys_addr_t next;
> +
> + p4d = stage2_p4d_offset(kvm, pgd, addr);
> + do {
> + next = stage2_p4d_addr_end(kvm, addr, end);
> + if (!stage2_p4d_none(kvm, *p4d))
> + stage2_flush_puds(kvm, p4d, addr, next);
> + } while (p4d++, addr = next, addr != end);
> +}
> +
> static void stage2_flush_memslot(struct kvm *kvm,
> struct kvm_memory_slot *memslot)
> {
> @@ -421,7 +469,7 @@ static void stage2_flush_memslot(struct
> do {
> next = stage2_pgd_addr_end(kvm, addr, end);
> if (!stage2_pgd_none(kvm, *pgd))
> - stage2_flush_puds(kvm, pgd, addr, next);
> + stage2_flush_p4ds(kvm, pgd, addr, next);
>
> if (next != end)
> cond_resched_lock(&kvm->mmu_lock);
> @@ -454,12 +502,21 @@ static void stage2_flush_vm(struct kvm *
>
> static void clear_hyp_pgd_entry(pgd_t *pgd)
> {
> - pud_t *pud_table __maybe_unused = pud_offset(pgd, 0UL);
> + p4d_t *p4d_table __maybe_unused = p4d_offset(pgd, 0UL);
> pgd_clear(pgd);
> - pud_free(NULL, pud_table);
> + p4d_free(NULL, p4d_table);
> put_page(virt_to_page(pgd));
> }
>
> +static void clear_hyp_p4d_entry(p4d_t *p4d)
> +{
> + pud_t *pud_table __maybe_unused = pud_offset(p4d, 0);
> + VM_BUG_ON(p4d_huge(*p4d));
> + p4d_clear(p4d);
> + pud_free(NULL, pud_table);
> + put_page(virt_to_page(p4d));
> +}
> +
> static void clear_hyp_pud_entry(pud_t *pud)
> {
> pmd_t *pmd_table __maybe_unused = pmd_offset(pud, 0);
> @@ -511,12 +568,12 @@ static void unmap_hyp_pmds(pud_t *pud, p
> clear_hyp_pud_entry(pud);
> }
>
> -static void unmap_hyp_puds(pgd_t *pgd, phys_addr_t addr, phys_addr_t end)
> +static void unmap_hyp_puds(p4d_t *p4d, phys_addr_t addr, phys_addr_t end)
> {
> phys_addr_t next;
> pud_t *pud, *start_pud;
>
> - start_pud = pud = pud_offset(pgd, addr);
> + start_pud = pud = pud_offset(p4d, addr);
> do {
> next = pud_addr_end(addr, end);
> /* Hyp doesn't use huge puds */
> @@ -525,6 +582,23 @@ static void unmap_hyp_puds(pgd_t *pgd, p
> } while (pud++, addr = next, addr != end);
>
> if (hyp_pud_table_empty(start_pud))
> + clear_hyp_p4d_entry(p4d);
> +}
> +
> +static void unmap_hyp_p4ds(pgd_t *pgd, phys_addr_t addr, phys_addr_t end)
> +{
> + phys_addr_t next;
> + p4d_t *p4d, *start_p4d;
> +
> + start_p4d = p4d = p4d_offset(pgd, addr);
> + do {
> + next = p4d_addr_end(addr, end);
> + /* Hyp doesn't use huge p4ds */
> + if (!p4d_none(*p4d))
> + unmap_hyp_puds(p4d, addr, next);
> + } while (p4d++, addr = next, addr != end);
> +
> + if (hyp_p4d_table_empty(start_p4d))
> clear_hyp_pgd_entry(pgd);
> }
>
> @@ -548,7 +622,7 @@ static void __unmap_hyp_range(pgd_t *pgd
> do {
> next = pgd_addr_end(addr, end);
> if (!pgd_none(*pgd))
> - unmap_hyp_puds(pgd, addr, next);
> + unmap_hyp_p4ds(pgd, addr, next);
> } while (pgd++, addr = next, addr != end);
> }
>
> @@ -658,7 +732,7 @@ static int create_hyp_pmd_mappings(pud_t
> return 0;
> }
>
> -static int create_hyp_pud_mappings(pgd_t *pgd, unsigned long start,
> +static int create_hyp_pud_mappings(p4d_t *p4d, unsigned long start,
> unsigned long end, unsigned long pfn,
> pgprot_t prot)
> {
> @@ -669,7 +743,7 @@ static int create_hyp_pud_mappings(pgd_t
>
> addr = start;
> do {
> - pud = pud_offset(pgd, addr);
> + pud = pud_offset(p4d, addr);
>
> if (pud_none_or_clear_bad(pud)) {
> pmd = pmd_alloc_one(NULL, addr);
> @@ -691,12 +765,45 @@ static int create_hyp_pud_mappings(pgd_t
> return 0;
> }
>
> +static int create_hyp_p4d_mappings(pgd_t *pgd, unsigned long start,
> + unsigned long end, unsigned long pfn,
> + pgprot_t prot)
> +{
> + p4d_t *p4d;
> + pud_t *pud;
> + unsigned long addr, next;
> + int ret;
> +
> + addr = start;
> + do {
> + p4d = p4d_offset(pgd, addr);
> +
> + if (p4d_none(*p4d)) {
> + pud = pud_alloc_one(NULL, addr);
> + if (!pud) {
> + kvm_err("Cannot allocate Hyp pud\n");
> + return -ENOMEM;
> + }
> + kvm_p4d_populate(p4d, pud);
> + get_page(virt_to_page(p4d));
> + }
> +
> + next = p4d_addr_end(addr, end);
> + ret = create_hyp_pud_mappings(p4d, addr, next, pfn, prot);
> + if (ret)
> + return ret;
> + pfn += (next - addr) >> PAGE_SHIFT;
> + } while (addr = next, addr != end);
> +
> + return 0;
> +}
> +
> static int __create_hyp_mappings(pgd_t *pgdp, unsigned long ptrs_per_pgd,
> unsigned long start, unsigned long end,
> unsigned long pfn, pgprot_t prot)
> {
> pgd_t *pgd;
> - pud_t *pud;
> + p4d_t *p4d;
> unsigned long addr, next;
> int err = 0;
>
> @@ -707,18 +814,18 @@ static int __create_hyp_mappings(pgd_t *
> pgd = pgdp + kvm_pgd_index(addr, ptrs_per_pgd);
>
> if (pgd_none(*pgd)) {
> - pud = pud_alloc_one(NULL, addr);
> - if (!pud) {
> - kvm_err("Cannot allocate Hyp pud\n");
> + p4d = p4d_alloc_one(NULL, addr);
> + if (!p4d) {
> + kvm_err("Cannot allocate Hyp p4d\n");
> err = -ENOMEM;
> goto out;
> }
> - kvm_pgd_populate(pgd, pud);
> + kvm_pgd_populate(pgd, p4d);
> get_page(virt_to_page(pgd));
> }
>
> next = pgd_addr_end(addr, end);
> - err = create_hyp_pud_mappings(pgd, addr, next, pfn, prot);
> + err = create_hyp_p4d_mappings(pgd, addr, next, pfn, prot);
> if (err)
> goto out;
> pfn += (next - addr) >> PAGE_SHIFT;
> @@ -1015,22 +1122,40 @@ void kvm_free_stage2_pgd(struct kvm *kvm
> free_pages_exact(pgd, stage2_pgd_size(kvm));
> }
>
> -static pud_t *stage2_get_pud(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
> +static p4d_t *stage2_get_p4d(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
> phys_addr_t addr)
> {
> pgd_t *pgd;
> - pud_t *pud;
> + p4d_t *p4d;
>
> pgd = kvm->arch.pgd + stage2_pgd_index(kvm, addr);
> if (stage2_pgd_none(kvm, *pgd)) {
> if (!cache)
> return NULL;
> - pud = mmu_memory_cache_alloc(cache);
> - stage2_pgd_populate(kvm, pgd, pud);
> + p4d = mmu_memory_cache_alloc(cache);
> + stage2_pgd_populate(kvm, pgd, p4d);
> get_page(virt_to_page(pgd));
> }
>
> - return stage2_pud_offset(kvm, pgd, addr);
> + return stage2_p4d_offset(kvm, pgd, addr);
> +}
> +
> +static pud_t *stage2_get_pud(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
> + phys_addr_t addr)
> +{
> + p4d_t *p4d;
> + pud_t *pud;
> +
> + p4d = stage2_get_p4d(kvm, cache, addr);
> + if (stage2_p4d_none(kvm, *p4d)) {
> + if (!cache)
> + return NULL;
> + pud = mmu_memory_cache_alloc(cache);
> + stage2_p4d_populate(kvm, p4d, pud);
> + get_page(virt_to_page(p4d));
> + }
> +
> + return stage2_pud_offset(kvm, p4d, addr);
> }
>
> static pmd_t *stage2_get_pmd(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
> @@ -1423,18 +1548,18 @@ static void stage2_wp_pmds(struct kvm *k
> }
>
> /**
> - * stage2_wp_puds - write protect PGD range
> + * stage2_wp_puds - write protect P4D range
> * @pgd: pointer to pgd entry
> * @addr: range start address
> * @end: range end address
> */
> -static void stage2_wp_puds(struct kvm *kvm, pgd_t *pgd,
> +static void stage2_wp_puds(struct kvm *kvm, p4d_t *p4d,
> phys_addr_t addr, phys_addr_t end)
> {
> pud_t *pud;
> phys_addr_t next;
>
> - pud = stage2_pud_offset(kvm, pgd, addr);
> + pud = stage2_pud_offset(kvm, p4d, addr);
> do {
> next = stage2_pud_addr_end(kvm, addr, end);
> if (!stage2_pud_none(kvm, *pud)) {
> @@ -1449,6 +1574,26 @@ static void stage2_wp_puds(struct kvm *
> }
>
> /**
> + * stage2_wp_p4ds - write protect PGD range
> + * @pgd: pointer to pgd entry
> + * @addr: range start address
> + * @end: range end address
> + */
> +static void stage2_wp_p4ds(struct kvm *kvm, pgd_t *pgd,
> + phys_addr_t addr, phys_addr_t end)
> +{
> + p4d_t *p4d;
> + phys_addr_t next;
> +
> + p4d = stage2_p4d_offset(kvm, pgd, addr);
> + do {
> + next = stage2_p4d_addr_end(kvm, addr, end);
> + if (!stage2_p4d_none(kvm, *p4d))
> + stage2_wp_puds(kvm, p4d, addr, next);
> + } while (p4d++, addr = next, addr != end);
> +}
> +
> +/**
> * stage2_wp_range() - write protect stage2 memory region range
> * @kvm: The KVM pointer
> * @addr: Start address of range
> @@ -1475,7 +1620,7 @@ static void stage2_wp_range(struct kvm *
> break;
> next = stage2_pgd_addr_end(kvm, addr, end);
> if (stage2_pgd_present(kvm, *pgd))
> - stage2_wp_puds(kvm, pgd, addr, next);
> + stage2_wp_p4ds(kvm, pgd, addr, next);
> } while (pgd++, addr = next, addr != end);
> }
>
> --- a/arch/arm64/mm/fault.c~arm64-add-support-for-folded-p4d-page-tables
> +++ a/arch/arm64/mm/fault.c
> @@ -145,6 +145,7 @@ static void show_pte(unsigned long addr)
> pr_alert("[%016lx] pgd=%016llx", addr, pgd_val(pgd));
>
> do {
> + p4d_t *p4dp, p4d;
> pud_t *pudp, pud;
> pmd_t *pmdp, pmd;
> pte_t *ptep, pte;
> @@ -152,7 +153,13 @@ static void show_pte(unsigned long addr)
> if (pgd_none(pgd) || pgd_bad(pgd))
> break;
>
> - pudp = pud_offset(pgdp, addr);
> + p4dp = p4d_offset(pgdp, addr);
> + p4d = READ_ONCE(*p4dp);
> + pr_cont(", p4d=%016llx", p4d_val(p4d));
> + if (p4d_none(p4d) || p4d_bad(p4d))
> + break;
> +
> + pudp = pud_offset(p4dp, addr);
> pud = READ_ONCE(*pudp);
> pr_cont(", pud=%016llx", pud_val(pud));
> if (pud_none(pud) || pud_bad(pud))
> --- a/arch/arm64/mm/hugetlbpage.c~arm64-add-support-for-folded-p4d-page-tables
> +++ a/arch/arm64/mm/hugetlbpage.c
> @@ -67,11 +67,13 @@ static int find_num_contig(struct mm_str
> pte_t *ptep, size_t *pgsize)
> {
> pgd_t *pgdp = pgd_offset(mm, addr);
> + p4d_t *p4dp;
> pud_t *pudp;
> pmd_t *pmdp;
>
> *pgsize = PAGE_SIZE;
> - pudp = pud_offset(pgdp, addr);
> + p4dp = p4d_offset(pgdp, addr);
> + pudp = pud_offset(p4dp, addr);
> pmdp = pmd_offset(pudp, addr);
> if ((pte_t *)pmdp == ptep) {
> *pgsize = PMD_SIZE;
> @@ -217,12 +219,14 @@ pte_t *huge_pte_alloc(struct mm_struct *
> unsigned long addr, unsigned long sz)
> {
> pgd_t *pgdp;
> + p4d_t *p4dp;
> pud_t *pudp;
> pmd_t *pmdp;
> pte_t *ptep = NULL;
>
> pgdp = pgd_offset(mm, addr);
> - pudp = pud_alloc(mm, pgdp, addr);
> + p4dp = p4d_offset(pgdp, addr);
> + pudp = pud_alloc(mm, p4dp, addr);
> if (!pudp)
> return NULL;
>
> @@ -261,6 +265,7 @@ pte_t *huge_pte_offset(struct mm_struct
> unsigned long addr, unsigned long sz)
> {
> pgd_t *pgdp;
> + p4d_t *p4dp;
> pud_t *pudp, pud;
> pmd_t *pmdp, pmd;
>
> @@ -268,7 +273,11 @@ pte_t *huge_pte_offset(struct mm_struct
> if (!pgd_present(READ_ONCE(*pgdp)))
> return NULL;
>
> - pudp = pud_offset(pgdp, addr);
> + p4dp = p4d_offset(pgdp, addr);
> + if (!p4d_present(READ_ONCE(*p4dp)))
> + return NULL;
> +
> + pudp = pud_offset(p4dp, addr);
> pud = READ_ONCE(*pudp);
> if (sz != PUD_SIZE && pud_none(pud))
> return NULL;
> --- a/arch/arm64/mm/kasan_init.c~arm64-add-support-for-folded-p4d-page-tables
> +++ a/arch/arm64/mm/kasan_init.c
> @@ -84,17 +84,17 @@ static pmd_t *__init kasan_pmd_offset(pu
> return early ? pmd_offset_kimg(pudp, addr) : pmd_offset(pudp, addr);
> }
>
> -static pud_t *__init kasan_pud_offset(pgd_t *pgdp, unsigned long addr, int node,
> +static pud_t *__init kasan_pud_offset(p4d_t *p4dp, unsigned long addr, int node,
> bool early)
> {
> - if (pgd_none(READ_ONCE(*pgdp))) {
> + if (p4d_none(READ_ONCE(*p4dp))) {
> phys_addr_t pud_phys = early ?
> __pa_symbol(kasan_early_shadow_pud)
> : kasan_alloc_zeroed_page(node);
> - __pgd_populate(pgdp, pud_phys, PMD_TYPE_TABLE);
> + __p4d_populate(p4dp, pud_phys, PMD_TYPE_TABLE);
> }
>
> - return early ? pud_offset_kimg(pgdp, addr) : pud_offset(pgdp, addr);
> + return early ? pud_offset_kimg(p4dp, addr) : pud_offset(p4dp, addr);
> }
>
> static void __init kasan_pte_populate(pmd_t *pmdp, unsigned long addr,
> @@ -126,11 +126,11 @@ static void __init kasan_pmd_populate(pu
> } while (pmdp++, addr = next, addr != end && pmd_none(READ_ONCE(*pmdp)));
> }
>
> -static void __init kasan_pud_populate(pgd_t *pgdp, unsigned long addr,
> +static void __init kasan_pud_populate(p4d_t *p4dp, unsigned long addr,
> unsigned long end, int node, bool early)
> {
> unsigned long next;
> - pud_t *pudp = kasan_pud_offset(pgdp, addr, node, early);
> + pud_t *pudp = kasan_pud_offset(p4dp, addr, node, early);
>
> do {
> next = pud_addr_end(addr, end);
> @@ -138,6 +138,18 @@ static void __init kasan_pud_populate(pg
> } while (pudp++, addr = next, addr != end && pud_none(READ_ONCE(*pudp)));
> }
>
> +static void __init kasan_p4d_populate(pgd_t *pgdp, unsigned long addr,
> + unsigned long end, int node, bool early)
> +{
> + unsigned long next;
> + p4d_t *p4dp = p4d_offset(pgdp, addr);
> +
> + do {
> + next = p4d_addr_end(addr, end);
> + kasan_pud_populate(p4dp, addr, next, node, early);
> + } while (p4dp++, addr = next, addr != end);
> +}
> +
> static void __init kasan_pgd_populate(unsigned long addr, unsigned long end,
> int node, bool early)
> {
> @@ -147,7 +159,7 @@ static void __init kasan_pgd_populate(un
> pgdp = pgd_offset_k(addr);
> do {
> next = pgd_addr_end(addr, end);
> - kasan_pud_populate(pgdp, addr, next, node, early);
> + kasan_p4d_populate(pgdp, addr, next, node, early);
> } while (pgdp++, addr = next, addr != end);
> }
>
> --- a/arch/arm64/mm/mmu.c~arm64-add-support-for-folded-p4d-page-tables
> +++ a/arch/arm64/mm/mmu.c
> @@ -290,18 +290,19 @@ static void alloc_init_pud(pgd_t *pgdp,
> {
> unsigned long next;
> pud_t *pudp;
> - pgd_t pgd = READ_ONCE(*pgdp);
> + p4d_t *p4dp = p4d_offset(pgdp, addr);
> + p4d_t p4d = READ_ONCE(*p4dp);
>
> - if (pgd_none(pgd)) {
> + if (p4d_none(p4d)) {
> phys_addr_t pud_phys;
> BUG_ON(!pgtable_alloc);
> pud_phys = pgtable_alloc(PUD_SHIFT);
> - __pgd_populate(pgdp, pud_phys, PUD_TYPE_TABLE);
> - pgd = READ_ONCE(*pgdp);
> + __p4d_populate(p4dp, pud_phys, PUD_TYPE_TABLE);
> + p4d = READ_ONCE(*p4dp);
> }
> - BUG_ON(pgd_bad(pgd));
> + BUG_ON(p4d_bad(p4d));
>
> - pudp = pud_set_fixmap_offset(pgdp, addr);
> + pudp = pud_set_fixmap_offset(p4dp, addr);
> do {
> pud_t old_pud = READ_ONCE(*pudp);
>
> @@ -672,6 +673,7 @@ static void __init map_kernel(pgd_t *pgd
> READ_ONCE(*pgd_offset_k(FIXADDR_START)));
> } else if (CONFIG_PGTABLE_LEVELS > 3) {
> pgd_t *bm_pgdp;
> + p4d_t *bm_p4dp;
> pud_t *bm_pudp;
> /*
> * The fixmap shares its top level pgd entry with the kernel
> @@ -681,7 +683,8 @@ static void __init map_kernel(pgd_t *pgd
> */
> BUG_ON(!IS_ENABLED(CONFIG_ARM64_16K_PAGES));
> bm_pgdp = pgd_offset_raw(pgdp, FIXADDR_START);
> - bm_pudp = pud_set_fixmap_offset(bm_pgdp, FIXADDR_START);
> + bm_p4dp = p4d_offset(bm_pgdp, FIXADDR_START);
> + bm_pudp = pud_set_fixmap_offset(bm_p4dp, FIXADDR_START);
> pud_populate(&init_mm, bm_pudp, lm_alias(bm_pmd));
> pud_clear_fixmap();
> } else {
> @@ -715,6 +718,7 @@ void __init paging_init(void)
> int kern_addr_valid(unsigned long addr)
> {
> pgd_t *pgdp;
> + p4d_t *p4dp;
> pud_t *pudp, pud;
> pmd_t *pmdp, pmd;
> pte_t *ptep, pte;
> @@ -726,7 +730,11 @@ int kern_addr_valid(unsigned long addr)
> if (pgd_none(READ_ONCE(*pgdp)))
> return 0;
>
> - pudp = pud_offset(pgdp, addr);
> + p4dp = p4d_offset(pgdp, addr);
> + if (p4d_none(READ_ONCE(*p4dp)))
> + return 0;
> +
> + pudp = pud_offset(p4dp, addr);
> pud = READ_ONCE(*pudp);
> if (pud_none(pud))
> return 0;
> @@ -1069,6 +1077,7 @@ int __meminit vmemmap_populate(unsigned
> unsigned long addr = start;
> unsigned long next;
> pgd_t *pgdp;
> + p4d_t *p4dp;
> pud_t *pudp;
> pmd_t *pmdp;
>
> @@ -1079,7 +1088,11 @@ int __meminit vmemmap_populate(unsigned
> if (!pgdp)
> return -ENOMEM;
>
> - pudp = vmemmap_pud_populate(pgdp, addr, node);
> + p4dp = vmemmap_p4d_populate(pgdp, addr, node);
> + if (!p4dp)
> + return -ENOMEM;
> +
> + pudp = vmemmap_pud_populate(p4dp, addr, node);
> if (!pudp)
> return -ENOMEM;
>
> @@ -1114,11 +1127,12 @@ void vmemmap_free(unsigned long start, u
> static inline pud_t * fixmap_pud(unsigned long addr)
> {
> pgd_t *pgdp = pgd_offset_k(addr);
> - pgd_t pgd = READ_ONCE(*pgdp);
> + p4d_t *p4dp = p4d_offset(pgdp, addr);
> + p4d_t p4d = READ_ONCE(*p4dp);
>
> - BUG_ON(pgd_none(pgd) || pgd_bad(pgd));
> + BUG_ON(p4d_none(p4d) || p4d_bad(p4d));
>
> - return pud_offset_kimg(pgdp, addr);
> + return pud_offset_kimg(p4dp, addr);
> }
>
> static inline pmd_t * fixmap_pmd(unsigned long addr)
> @@ -1144,25 +1158,27 @@ static inline pte_t * fixmap_pte(unsigne
> */
> void __init early_fixmap_init(void)
> {
> - pgd_t *pgdp, pgd;
> + pgd_t *pgdp;
> + p4d_t *p4dp, p4d;
> pud_t *pudp;
> pmd_t *pmdp;
> unsigned long addr = FIXADDR_START;
>
> pgdp = pgd_offset_k(addr);
> - pgd = READ_ONCE(*pgdp);
> + p4dp = p4d_offset(pgdp, addr);
> + p4d = READ_ONCE(*p4dp);
> if (CONFIG_PGTABLE_LEVELS > 3 &&
> - !(pgd_none(pgd) || pgd_page_paddr(pgd) == __pa_symbol(bm_pud))) {
> + !(p4d_none(p4d) || p4d_page_paddr(p4d) == __pa_symbol(bm_pud))) {
> /*
> * We only end up here if the kernel mapping and the fixmap
> * share the top level pgd entry, which should only happen on
> * 16k/4 levels configurations.
> */
> BUG_ON(!IS_ENABLED(CONFIG_ARM64_16K_PAGES));
> - pudp = pud_offset_kimg(pgdp, addr);
> + pudp = pud_offset_kimg(p4dp, addr);
> } else {
> - if (pgd_none(pgd))
> - __pgd_populate(pgdp, __pa_symbol(bm_pud), PUD_TYPE_TABLE);
> + if (p4d_none(p4d))
> + __p4d_populate(p4dp, __pa_symbol(bm_pud), PUD_TYPE_TABLE);
> pudp = fixmap_pud(addr);
> }
> if (pud_none(READ_ONCE(*pudp)))
> --- a/arch/arm64/mm/pageattr.c~arm64-add-support-for-folded-p4d-page-tables
> +++ a/arch/arm64/mm/pageattr.c
> @@ -198,6 +198,7 @@ void __kernel_map_pages(struct page *pag
> bool kernel_page_present(struct page *page)
> {
> pgd_t *pgdp;
> + p4d_t *p4dp;
> pud_t *pudp, pud;
> pmd_t *pmdp, pmd;
> pte_t *ptep;
> @@ -210,7 +211,11 @@ bool kernel_page_present(struct page *pa
> if (pgd_none(READ_ONCE(*pgdp)))
> return false;
>
> - pudp = pud_offset(pgdp, addr);
> + p4dp = p4d_offset(pgdp, addr);
> + if (p4d_none(READ_ONCE(*p4dp)))
> + return false;
> +
> + pudp = pud_offset(p4dp, addr);
> pud = READ_ONCE(*pudp);
> if (pud_none(pud))
> return false;
> _
>
--
Sincerely yours,
Mike.
^ permalink raw reply
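A note for readers of the arm64 patch quoted above: every walker gains a p4d step, yet on a configuration where that level is folded the step is meant to cost nothing. The sketch below is a user-space toy, loosely modelled on how a folded level is usually expressed; all toy_* names and types are hypothetical and are not the kernel's definitions. It assumes the folded p4d entry is simply the pgd entry viewed through a wrapper type, and that the "address end" helper for a folded level returns the end of the range unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy types; the kernel's real pgd_t/p4d_t are different. */
typedef struct { uint64_t val; } toy_pgd_t;
typedef struct { toy_pgd_t pgd; } toy_p4d_t;	/* folded: a p4d wraps a pgd */

/* Folded level: the only "p4d entry" is the pgd entry itself. */
static inline toy_p4d_t *toy_p4d_offset(toy_pgd_t *pgdp, unsigned long addr)
{
	(void)addr;			/* no index bits at a folded level */
	return (toy_p4d_t *)pgdp;
}

static inline int toy_p4d_none(toy_p4d_t p4d)
{
	return p4d.pgd.val == 0;
}

/* Folded level: the range is never split, so the next boundary is 'end'. */
static inline unsigned long toy_p4d_addr_end(unsigned long addr,
					     unsigned long end)
{
	(void)addr;
	return end;
}

/*
 * Same loop shape as the copy_p4d()/unmap_*_p4ds() helpers in the patch.
 * Returns how many populated p4d entries were visited in [start, end).
 */
int toy_walk_p4d(toy_pgd_t *pgdp, unsigned long start, unsigned long end)
{
	toy_p4d_t *p4dp = toy_p4d_offset(pgdp, start);
	unsigned long addr = start, next;
	int visited = 0;

	assert(start < end);
	do {
		next = toy_p4d_addr_end(addr, end);
		if (toy_p4d_none(*p4dp))
			continue;	/* jumps to the loop condition */
		visited++;		/* a real walker descends to the pud level here */
	} while (p4dp++, addr = next, addr != end);

	return visited;
}
```

Under these assumptions the loop body runs exactly once per pgd entry, so the extra level only changes the types passed down to the pud helpers.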
* [PATCH v3 7/9] powerpc/ps3: Add check for otheros image size
From: Geoff Levand @ 2020-05-16 16:20 UTC (permalink / raw)
To: Michael Ellerman
Cc: linuxppc-dev, Geert Uytterhoeven, Markus Elfring,
Emmanuel Nicolet
In-Reply-To: <4e8defeb49d62dd9d435e5ea3ddc5668e56fa496.1589049250.git.geoff@infradead.org>
The ps3's otheros flash loader has a size limit of 16 MiB for the
uncompressed image. If that limit is exceeded, output the
flash image file as 'otheros-too-big.bld'.
Signed-off-by: Geoff Levand <geoff@infradead.org>
---
v2: Change from decimal to hex values. Output an INFO message to screen.
v3: Remove the INFO message.
arch/powerpc/boot/wrapper | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/boot/wrapper b/arch/powerpc/boot/wrapper
index 35ace40d9fc2..d0b5f202c49c 100755
--- a/arch/powerpc/boot/wrapper
+++ b/arch/powerpc/boot/wrapper
@@ -571,7 +571,18 @@ ps3)
count=$overlay_size bs=1
odir="$(dirname "$ofile.bin")"
- rm -f "$odir/otheros.bld"
- gzip -n --force -9 --stdout "$ofile.bin" > "$odir/otheros.bld"
+
+ # The ps3's flash loader has a size limit of 16 MiB for the uncompressed
+ # image. If a compressed image that exceeded this limit is written to
+ # flash the loader will decompress that image until the 16 MiB limit is
+ # reached, then enter the system reset vector of the partially decompressed
+ # image. No warning is issued.
+ rm -f "$odir"/{otheros,otheros-too-big}.bld
+ size=$(${CROSS}nm --no-sort --radix=d "$ofile" | egrep ' _end$' | cut -d' ' -f1)
+ bld="otheros.bld"
+ if [ $size -gt $((0x1000000)) ]; then
+ bld="otheros-too-big.bld"
+ fi
+ gzip -n --force -9 --stdout "$ofile.bin" > "$odir/$bld"
;;
esac
--
2.20.1
^ permalink raw reply related
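The decision the wrapper makes in the patch above can be restated in C to make the boundary explicit. This is only an illustration: the wrapper itself is a shell script, and the toy_* names below are hypothetical. It assumes the size compared is the value the script reads from the `_end` symbol, and that the comparison is strict, as `-gt` is in the script, so an image of exactly 16 MiB still gets the normal name.

```c
#include <assert.h>

/* 16 MiB limit for the uncompressed image, as stated in the patch. */
#define TOY_PS3_FLASH_LIMIT 0x1000000UL

static_assert(TOY_PS3_FLASH_LIMIT == 16UL * 1024 * 1024,
	      "0x1000000 is 16 MiB");

/* Non-zero when the uncompressed size is strictly above the limit. */
int toy_ps3_too_big(unsigned long uncompressed_size)
{
	return uncompressed_size > TOY_PS3_FLASH_LIMIT;
}

/* Pick the output file name the same way the wrapper does. */
const char *toy_ps3_bld_name(unsigned long uncompressed_size)
{
	return toy_ps3_too_big(uncompressed_size) ? "otheros-too-big.bld"
						  : "otheros.bld";
}
```

For example, toy_ps3_bld_name(0x1000000UL) yields "otheros.bld", while one byte more yields "otheros-too-big.bld".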
* Re: [PATCH v2 7/9] powerpc/ps3: Add check for otheros image size
From: Geoff Levand @ 2020-05-16 16:03 UTC (permalink / raw)
To: Michael Ellerman
Cc: linuxppc-dev, Geert Uytterhoeven, Markus Elfring,
Emmanuel Nicolet
In-Reply-To: <87y2pu9cqd.fsf@mpe.ellerman.id.au>
Hi Michael,
On 5/14/20 7:02 PM, Michael Ellerman wrote:
> Geoff Levand <geoff@infradead.org> writes:
...
>> + # The ps3's flash loader has a size limit of 16 MiB for the uncompressed
>> + # image. If a compressed image that exceeded this limit is written to
>> + # flash the loader will decompress that image until the 16 MiB limit is
>> + # reached, then enter the system reset vector of the partially decompressed
>> + # image. No warning is issued.
>> + rm -f "$odir"/{otheros,otheros-too-big}.bld
>> + size=$(${CROSS}nm --no-sort --radix=d "$ofile" | egrep ' _end$' | cut -d' ' -f1)
>> + bld="otheros.bld"
>> + if [ $size -gt $((0x1000000)) ]; then
>> + bld="otheros-too-big.bld"
>> + echo " INFO: Uncompressed kernel is too large to program into PS3 flash memory;" \
>
> This now appears on all my ppc64_defconfig builds, which I don't really
> like.
No, neither do I. I didn't think of that case.
> That does highlight the fact that ppc64_defconfig including
> CONFIG_PPC_PS3 is not really helpful for people actually wanting to run
> the kernel on a PS3.
No, this is just for the bootloader image (.bld) that can be
programmed into flash memory. This is what is used to create,
for example, a petitboot bootloader image.
Normal usage is for the bootloader in flash to load a vmlinux
image from disk or network, in which case running a ppc64_defconfig
image would be fine.
> So I wonder if we should drop CONFIG_PPC_PS3 from ppc64_defconfig, in
> which case I'd be happy to keep the INFO message because it should only
> appear on ps3 specific builds.
I'd like to keep CONFIG_PPC_PS3 set in ppc64_defconfig. I feel it
is useful to get some build testing of the PS3 platform code.
> The other option would be to drop the message, or only print it when
> we're doing a verbose build.
Building a bootloader image to program into flash memory is
something only very advanced users would be doing. I don't
think they would need this message. They would see the file
name and understand the situation. I'll post a v3 patch that
removes the message.
-Geoff
^ permalink raw reply
* Re: [PATCH v5 2/2] powerpc/rtas: Implement reentrant rtas call
From: kbuild test robot @ 2020-05-16 12:15 UTC (permalink / raw)
To: Leonardo Bras, Michael Ellerman, Benjamin Herrenschmidt,
Paul Mackerras, Nicholas Piggin, Thomas Gleixner, Allison Randal,
Greg Kroah-Hartman, Thiago Jung Bauermann, Anshuman Khandual,
Daniel Axtens, Nathan Lynch, Gautham R. Shenoy, Nadav Amit
Cc: linuxppc-dev, kbuild-all, linux-kernel
In-Reply-To: <20200516052137.175881-3-leobras.c@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 3330 bytes --]
Hi Leonardo,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on powerpc/next]
[also build test ERROR on v5.7-rc5 next-20200515]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]
url: https://github.com/0day-ci/linux/commits/Leonardo-Bras/Implement-reentrant-rtas-call/20200516-132358
base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc64-randconfig-r006-20200515 (attached as .config)
compiler: powerpc-linux-gcc (GCC) 9.3.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day GCC_VERSION=9.3.0 make.cross ARCH=powerpc64
If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <lkp@intel.com>
All errors (new ones prefixed by >>, old ones prefixed by <<):
arch/powerpc/kernel/rtas.c: In function 'rtas_call_reentrant':
>> arch/powerpc/kernel/rtas.c:519:9: error: 'local_paca' undeclared (first use in this function); did you mean 'local_inc'?
519 | args = local_paca->reentrant_args;
| ^~~~~~~~~~
| local_inc
arch/powerpc/kernel/rtas.c:519:9: note: each undeclared identifier is reported only once for each function it appears in
vim +519 arch/powerpc/kernel/rtas.c
486
487 /**
488 * rtas_call_reentrant() - Used for reentrant rtas calls
489 * @token: Token for desired reentrant RTAS call
490 * @nargs: Number of Input Parameters
491 * @nret: Number of Output Parameters
492 * @outputs: Array of outputs
493 * @...: Inputs for desired RTAS call
494 *
495 * According to LoPAR documentation, only "ibm,int-on", "ibm,int-off",
496 * "ibm,get-xive" and "ibm,set-xive" are currently reentrant.
497 * Reentrant calls need their own rtas_args buffer, so not using rtas.args, but
498 * PACA one instead.
499 *
500 * Return: -1 on error,
501 * First output value of RTAS call if (nret > 0),
502 * 0 otherwise,
503 */
504
505 int rtas_call_reentrant(int token, int nargs, int nret, int *outputs, ...)
506 {
507 va_list list;
508 struct rtas_args *args;
509 unsigned long flags;
510 int i, ret = 0;
511
512 if (!rtas.entry || token == RTAS_UNKNOWN_SERVICE)
513 return -1;
514
515 local_irq_save(flags);
516 preempt_disable();
517
518 /* We use the per-cpu (PACA) rtas args buffer */
> 519 args = local_paca->reentrant_args;
520
521 va_start(list, outputs);
522 va_rtas_call_unlocked(args, token, nargs, nret, list);
523 va_end(list);
524
525 if (nret > 1 && outputs)
526 for (i = 0; i < nret - 1; ++i)
527 outputs[i] = be32_to_cpu(args->rets[i + 1]);
528
529 if (nret > 0)
530 ret = be32_to_cpu(args->rets[0]);
531
532 local_irq_restore(flags);
533 preempt_enable();
534
535 return ret;
536 }
537
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 35659 bytes --]
^ permalink raw reply
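The tail of the listing in the report above follows a convention worth spelling out: the first return word is the call's status and becomes the function's return value, while the remaining nret - 1 words are copied to the caller's outputs array, each converted from big-endian. The sketch below is a user-space illustration of just that unpacking step. The toy_* names and the byte-array representation of the return words are hypothetical and stand in for the kernel's rtas_args and be32_to_cpu(); it says nothing about the build error itself.

```c
#include <assert.h>
#include <stdint.h>

/* Decode a big-endian 32-bit word from memory, whatever the host endianness. */
static inline uint32_t toy_be32_to_cpu(const uint8_t *b)
{
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/*
 * Mirror of the unpacking at lines 525-530 of the listing:
 * rets[0] is the status, rets[1..nret-1] go to outputs[0..nret-2].
 */
int toy_unpack_rets(uint8_t rets[][4], int nret, int *outputs)
{
	int i, ret = 0;

	assert(nret >= 0);

	if (nret > 1 && outputs)
		for (i = 0; i < nret - 1; ++i)
			outputs[i] = (int)toy_be32_to_cpu(rets[i + 1]);

	if (nret > 0)
		ret = (int)toy_be32_to_cpu(rets[0]);

	return ret;
}
```

With three return words holding 0, 5 and 256, the function returns 0 and fills outputs with 5 and 256.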
* [GIT PULL] Please pull powerpc/linux.git powerpc-5.7-4 tag
From: Michael Ellerman @ 2020-05-16 12:11 UTC (permalink / raw)
To: Linus Torvalds
Cc: christophe.leroy, nayna, linux-kernel, npiggin, linuxppc-dev
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
Hi Linus,
Please pull some more powerpc fixes for 5.7.
This is actually three weeks worth of fixes, I was going to send most of them
last week but my build box had a hiccup so I didn't. ie. we haven't just found
all these just before rc6.
cheers
The following changes since commit 5990cdee689c6885b27c6d969a3d58b09002b0bc:
lib/mpi: Fix building for powerpc with clang (2020-04-24 13:14:59 +1000)
are available in the git repository at:
https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git tags/powerpc-5.7-4
for you to fetch changes up to 249c9b0cd193d983c3a0b00f3fd3b92333bfeebe:
powerpc/40x: Make more space for system call exception (2020-05-12 21:22:11 +1000)
- ------------------------------------------------------------------
powerpc fixes for 5.7 #4
A fix for unrecoverable SLB faults in the interrupt exit path, introduced by the
recent rewrite of interrupt exit in C.
Four fixes for our KUAP (Kernel Userspace Access Prevention) support on 64-bit.
These are all fairly minor with the exception of the change to evaluate the
get/put_user() arguments before we enable user access, which reduces the amount
of code we run with user access enabled.
A fix for our secure boot IMA rules, if enforcement of module signatures is
enabled at runtime rather than build time.
A fix to our 32-bit VDSO clock_getres() which wasn't falling back to the syscall
for unknown clocks.
A build fix for CONFIG_PPC_KUAP_DEBUG on 32-bit BookS, and another for 40x.
Thanks to:
Christophe Leroy, Hugh Dickins, Nicholas Piggin, Aurelien Jarno, Mimi Zohar,
Nayna Jain.
- ------------------------------------------------------------------
Christophe Leroy (3):
powerpc/32s: Fix build failure with CONFIG_PPC_KUAP_DEBUG
powerpc/vdso32: Fallback on getres syscall when clock is unknown
powerpc/40x: Make more space for system call exception
Michael Ellerman (2):
Merge KUAP fix from topic/uaccess-ppc into fixes
powerpc/64s: Fix unrecoverable SLB crashes due to preemption check
Nayna Jain (1):
powerpc/ima: Fix secure boot rules in ima arch policy
Nicholas Piggin (4):
powerpc/uaccess: Evaluate macro arguments once, before user access is allowed
powerpc/64/kuap: Move kuap checks out of MSR[RI]=0 regions of exit code
powerpc/64s/kuap: Restore AMR in system reset exception
powerpc/64s/kuap: Restore AMR in fast_interrupt_return
arch/powerpc/include/asm/book3s/32/kup.h | 2 +-
arch/powerpc/include/asm/hw_irq.h | 20 +++++++-
arch/powerpc/include/asm/uaccess.h | 49 ++++++++++++++------
arch/powerpc/kernel/entry_64.S | 4 +-
arch/powerpc/kernel/exceptions-64s.S | 1 +
arch/powerpc/kernel/head_40x.S | 3 +-
arch/powerpc/kernel/ima_arch.c | 6 +--
arch/powerpc/kernel/syscall_64.c | 20 ++++----
arch/powerpc/kernel/vdso32/gettimeofday.S | 6 +--
9 files changed, 78 insertions(+), 33 deletions(-)
-----BEGIN PGP SIGNATURE-----
iQIzBAEBCAAdFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl6/2FcACgkQUevqPMjh
pYAKFQ/9EnGGclynmL6LtPGZwUx85SjGDlWSLTL3edJFQZ4x66QkSBT2R6HF9PYA
HMNC8ei5gTUBOSM9SBtapbYPfVrPO0PtjLzCIKASXpr/vmh7wPN9Fs0kcofKcRoi
p0fnxabTxFMRjOljAsywhbNBMzT5YFN91E0Ab20x/TKsh4PbXm1NfOvHy5R1J/Hw
EcomxGwp6yLEGGMl6hshEmPI49C2+BchO5rUxEYziGQmnfoG3QtPMLg3f9Spe764
MdkPd9lLgP/jLSBdIIG/qGg8OT3O3tN6l/cXZE6nHri7qHAe/1UubXQ4R5zzCmut
4hKwmDIugRdaX0MX55NKq1DRAvw6txK596Gfcas4ooO+4CXYD+0kmIcMTQsQyWxg
SY5ZpHyrU6GTvvcvAR7NVKZVDrw/xrnlpxE5L2lqRE41BUWj1dRmU3NCRpny/otp
WuXqi9rKeFJrPIO2ziBbj3a/205BbkYmVz+kDhemWQ7nh137ryUlaXGTbUfqgE3z
sTNyw84Sc1NTbd5QfFABJDDIMT3kotyDgWeLiMuTw0u6FPcaixpYCSO/DcoxxrZr
2Q6G/4QEgm6lT1tDZ0Te3dStT6PKCg6YjiC7mefMeo3OnnWJGIE3iVJJTwtDDv3a
kuzhIMynKpNZ/26kAmydpEYjFHxmyX0nwzb8704aL6WxgGhVyTw=
=LCRY
-----END PGP SIGNATURE-----
* [PATCH v8 22.5/30] powerpc/optprobes: Add register argument to patch_imm64_load_insns()
From: Michael Ellerman @ 2020-05-16 11:54 UTC (permalink / raw)
To: linuxppc-dev; +Cc: christophe.leroy, jniethe5
In-Reply-To: <20200506034050.24806-24-jniethe5@gmail.com>
From: Jordan Niethe <jniethe5@gmail.com>
Currently patch_imm32_load_insns() is used to load an instruction to
r4 to be emulated by emulate_step(). For prefixed instructions we
would like to be able to load a 64-bit immediate to r4. To prepare for
this, make patch_imm64_load_insns() take an argument that decides which
register to load the immediate to, rather than hardcoding r3.
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
arch/powerpc/kernel/optprobes.c | 34 ++++++++++++++++-----------------
1 file changed, 17 insertions(+), 17 deletions(-)
v8: Split out of patch 23.
diff --git a/arch/powerpc/kernel/optprobes.c b/arch/powerpc/kernel/optprobes.c
index 52c1ab3f85aa..8eea8dbb93fa 100644
--- a/arch/powerpc/kernel/optprobes.c
+++ b/arch/powerpc/kernel/optprobes.c
@@ -162,38 +162,38 @@ void patch_imm32_load_insns(unsigned int val, kprobe_opcode_t *addr)
/*
* Generate instructions to load provided immediate 64-bit value
- * to register 'r3' and patch these instructions at 'addr'.
+ * to register 'reg' and patch these instructions at 'addr'.
*/
-void patch_imm64_load_insns(unsigned long val, kprobe_opcode_t *addr)
+void patch_imm64_load_insns(unsigned long val, int reg, kprobe_opcode_t *addr)
{
- /* lis r3,(op)@highest */
+ /* lis reg,(op)@highest */
patch_instruction((struct ppc_inst *)addr,
- ppc_inst(PPC_INST_ADDIS | ___PPC_RT(3) |
+ ppc_inst(PPC_INST_ADDIS | ___PPC_RT(reg) |
((val >> 48) & 0xffff)));
addr++;
- /* ori r3,r3,(op)@higher */
+ /* ori reg,reg,(op)@higher */
patch_instruction((struct ppc_inst *)addr,
- ppc_inst(PPC_INST_ORI | ___PPC_RA(3) |
- ___PPC_RS(3) | ((val >> 32) & 0xffff)));
+ ppc_inst(PPC_INST_ORI | ___PPC_RA(reg) |
+ ___PPC_RS(reg) | ((val >> 32) & 0xffff)));
addr++;
- /* rldicr r3,r3,32,31 */
+ /* rldicr reg,reg,32,31 */
patch_instruction((struct ppc_inst *)addr,
- ppc_inst(PPC_INST_RLDICR | ___PPC_RA(3) |
- ___PPC_RS(3) | __PPC_SH64(32) | __PPC_ME64(31)));
+ ppc_inst(PPC_INST_RLDICR | ___PPC_RA(reg) |
+ ___PPC_RS(reg) | __PPC_SH64(32) | __PPC_ME64(31)));
addr++;
- /* oris r3,r3,(op)@h */
+ /* oris reg,reg,(op)@h */
patch_instruction((struct ppc_inst *)addr,
- ppc_inst(PPC_INST_ORIS | ___PPC_RA(3) |
- ___PPC_RS(3) | ((val >> 16) & 0xffff)));
+ ppc_inst(PPC_INST_ORIS | ___PPC_RA(reg) |
+ ___PPC_RS(reg) | ((val >> 16) & 0xffff)));
addr++;
- /* ori r3,r3,(op)@l */
+ /* ori reg,reg,(op)@l */
patch_instruction((struct ppc_inst *)addr,
- ppc_inst(PPC_INST_ORI | ___PPC_RA(3) |
- ___PPC_RS(3) | (val & 0xffff)));
+ ppc_inst(PPC_INST_ORI | ___PPC_RA(reg) |
+ ___PPC_RS(reg) | (val & 0xffff)));
}
int arch_prepare_optimized_kprobe(struct optimized_kprobe *op, struct kprobe *p)
@@ -249,7 +249,7 @@ int arch_prepare_optimized_kprobe(struct optimized_kprobe *op, struct kprobe *p)
* Fixup the template with instructions to:
* 1. load the address of the actual probepoint
*/
- patch_imm64_load_insns((unsigned long)op, buff + TMPL_OP_IDX);
+ patch_imm64_load_insns((unsigned long)op, 3, buff + TMPL_OP_IDX);
/*
* 2. branch to optimized_callback() and emulate_step()
--
2.25.1
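A minimal user-space sketch of what the five patched instructions do to the target register, assuming the usual 64-bit semantics of lis, ori, rldicr and oris. model_imm64_load() is a hypothetical helper that only mimics the data flow; it emits no opcodes and is independent of which register 'reg' names.

```c
#include <assert.h>	/* only needed for the checks in the note below */
#include <stdint.h>

/*
 * Model the effect of the lis/ori/rldicr/oris/ori sequence on a 64-bit
 * register, one 16-bit chunk of 'val' at a time.
 */
uint64_t model_imm64_load(uint64_t val)
{
	uint64_t highest = (val >> 48) & 0xffff;
	uint64_t reg;

	/* lis reg,(val)@highest: chunk shifted left 16, sign-extended */
	reg = highest << 16;
	if (highest & 0x8000)
		reg |= 0xffffffff00000000ULL;

	/* ori reg,reg,(val)@higher */
	reg |= (val >> 32) & 0xffff;

	/* rldicr reg,reg,32,31: rotate left 32, keep only the upper 32 bits */
	reg = ((reg << 32) | (reg >> 32)) & 0xffffffff00000000ULL;

	/* oris reg,reg,(val)@h */
	reg |= ((val >> 16) & 0xffff) << 16;

	/* ori reg,reg,(val)@l */
	reg |= val & 0xffff;

	return reg;
}
```

The rldicr step is what discards the sign-extension bits left behind by lis, so model_imm64_load(v) gives back v for any 64-bit v, including kernel addresses with the top bit set.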
* Re: [PATCH v8 08/30] powerpc: Use a function for getting the instruction op code
From: Michael Ellerman @ 2020-05-16 11:08 UTC (permalink / raw)
To: Jordan Niethe, linuxppc-dev
Cc: Christophe Leroy, Alistair Popple, Nicholas Piggin, Balamuruhan S,
naveen.n.rao, Daniel Axtens
In-Reply-To: <CACzsE9o0DNZ+fwO4Zh-oUp8B+zMukXAr_bicCi0V5PYcnJO7_A@mail.gmail.com>
Jordan Niethe <jniethe5@gmail.com> writes:
> mpe, as suggested by Christophe could you please add this.
I did that and ...
> diff --git a/arch/powerpc/include/asm/inst.h b/arch/powerpc/include/asm/inst.h
> --- a/arch/powerpc/include/asm/inst.h
> +++ b/arch/powerpc/include/asm/inst.h
> @@ -2,6 +2,8 @@
> #ifndef _ASM_INST_H
> #define _ASM_INST_H
>
> +#include <asm/disassemble.h>
.. this eventually breaks the build in some driver, because get_ra() is
redefined.
So I've backed out this change for now.
If we want to use the macros in disassemble.h we'll need to namespace
them better, eg. make them ppc_get_ra() and so on.
cheers
> /*
> * Instruction data type for POWER
> */
> @@ -15,7 +17,7 @@ static inline u32 ppc_inst_val(u32 x)
>
> static inline int ppc_inst_primary_opcode(u32 x)
> {
> - return ppc_inst_val(x) >> 26;
> + return get_op(ppc_inst_val(x));
> }
>
> #endif /* _ASM_INST_H */
> --
> 2.17.1
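A rough sketch of what namespaced accessors could look like, for illustration only. The ppc_get_*() names are hypothetical (nothing in this thread defines them), and the field positions assume the standard PowerPC instruction layout: primary opcode in the top 6 bits, followed by the 5-bit RT/RS and RA fields.

```c
#include <assert.h>	/* only needed for the checks in the note below */
#include <stdint.h>

/* Hypothetical namespaced accessors; the names are illustrative only. */
uint32_t ppc_get_op(uint32_t inst)
{
	return inst >> 26;		/* primary opcode, top 6 bits */
}

uint32_t ppc_get_rt(uint32_t inst)
{
	return (inst >> 21) & 0x1f;	/* RT/RS field */
}

uint32_t ppc_get_ra(uint32_t inst)
{
	return (inst >> 16) & 0x1f;	/* RA field */
}

/* Same result as the open-coded "x >> 26" in the quoted diff. */
int ppc_inst_primary_opcode(uint32_t x)
{
	return (int)ppc_get_op(x);
}
```

For example, 0x3c600000 (lis r3,0) decodes to primary opcode 15 with RT = 3, and 0x60630000 (ori r3,r3,0) decodes to primary opcode 24 with RA = 3. With a prefix on every helper, a driver that defines its own get_ra() no longer clashes.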
* Re: [PATCH v5 2/2] powerpc/rtas: Implement reentrant rtas call
From: Nicholas Piggin @ 2020-05-16 7:36 UTC (permalink / raw)
To: Allison Randal, Thiago Jung Bauermann, Benjamin Herrenschmidt,
Daniel Axtens, Gautham R. Shenoy, Greg Kroah-Hartman,
Anshuman Khandual, Leonardo Bras, Michael Ellerman, Nadav Amit,
Nathan Lynch, Paul Mackerras, Thomas Gleixner
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <20200516052137.175881-3-leobras.c@gmail.com>
Excerpts from Leonardo Bras's message of May 16, 2020 3:21 pm:
> Implement rtas_call_reentrant() for reentrant rtas-calls:
> "ibm,int-on", "ibm,int-off", "ibm,get-xive" and "ibm,set-xive".
>
> On LoPAPR Version 1.1 (March 24, 2016), from 7.3.10.1 to 7.3.10.4,
> items 2 and 3 say:
>
> 2 - For the PowerPC External Interrupt option: The * call must be
> reentrant to the number of processors on the platform.
> 3 - For the PowerPC External Interrupt option: The * argument call
> buffer for each simultaneous call must be physically unique.
>
> So, these rtas-calls can be called in a lockless way, if using
> a different buffer for each cpu doing such rtas call.
>
> For this, it was suggested to add the buffer (struct rtas_args)
> in the PACA struct, so each cpu can have its own buffer.
> The PACA struct received a pointer to rtas buffer, which is
> allocated in the memory range available to rtas 32-bit.
>
> Reentrant rtas calls are useful to avoid deadlocks in crashing,
> where rtas-calls are needed, but some other thread crashed holding
> the rtas.lock.
>
> This is a backtrace of a deadlock from a kdump testing environment:
>
> #0 arch_spin_lock
> #1 lock_rtas ()
> #2 rtas_call (token=8204, nargs=1, nret=1, outputs=0x0)
> #3 ics_rtas_mask_real_irq (hw_irq=4100)
> #4 machine_kexec_mask_interrupts
> #5 default_machine_crash_shutdown
> #6 machine_crash_shutdown
> #7 __crash_kexec
> #8 crash_kexec
> #9 oops_end
>
> Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
> ---
> arch/powerpc/include/asm/paca.h | 2 ++
> arch/powerpc/include/asm/rtas.h | 1 +
> arch/powerpc/kernel/paca.c | 20 +++++++++++
> arch/powerpc/kernel/rtas.c | 52 +++++++++++++++++++++++++++++
> arch/powerpc/sysdev/xics/ics-rtas.c | 22 ++++++------
> 5 files changed, 86 insertions(+), 11 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
> index e3cc9eb9204d..87cd9c2220cc 100644
> --- a/arch/powerpc/include/asm/paca.h
> +++ b/arch/powerpc/include/asm/paca.h
> @@ -29,6 +29,7 @@
> #include <asm/hmi.h>
> #include <asm/cpuidle.h>
> #include <asm/atomic.h>
> +#include <asm/rtas-types.h>
>
> #include <asm-generic/mmiowb_types.h>
>
> @@ -270,6 +271,7 @@ struct paca_struct {
> #ifdef CONFIG_MMIOWB
> struct mmiowb_state mmiowb_state;
> #endif
> + struct rtas_args *reentrant_args;
> } ____cacheline_aligned;
>
> extern void copy_mm_to_paca(struct mm_struct *mm);
> diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
> index c35c5350b7e4..fa7509c85881 100644
> --- a/arch/powerpc/include/asm/rtas.h
> +++ b/arch/powerpc/include/asm/rtas.h
> @@ -236,6 +236,7 @@ extern struct rtas_t rtas;
> extern int rtas_token(const char *service);
> extern int rtas_service_present(const char *service);
> extern int rtas_call(int token, int, int, int *, ...);
> +int rtas_call_reentrant(int token, int nargs, int nret, int *outputs, ...);
> void rtas_call_unlocked(struct rtas_args *args, int token, int nargs,
> int nret, ...);
> extern void __noreturn rtas_restart(char *cmd);
> diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c
> index 3f91ccaa9c74..88c9b61489fc 100644
> --- a/arch/powerpc/kernel/paca.c
> +++ b/arch/powerpc/kernel/paca.c
> @@ -16,6 +16,7 @@
> #include <asm/kexec.h>
> #include <asm/svm.h>
> #include <asm/ultravisor.h>
> +#include <asm/rtas.h>
>
> #include "setup.h"
>
> @@ -164,6 +165,23 @@ static struct slb_shadow * __init new_slb_shadow(int cpu, unsigned long limit)
>
> #endif /* CONFIG_PPC_BOOK3S_64 */
>
> +/**
> + * new_rtas_args() - Allocates rtas args
> + * @cpu: CPU number
> + * @limit: Memory limit for this allocation
> + *
> + * Allocates a struct rtas_args and return it's pointer.
> + *
> + * Return: Pointer to allocated rtas_args
> + */
> +static struct rtas_args * __init new_rtas_args(int cpu, unsigned long limit)
> +{
> + limit = min_t(unsigned long, limit, RTAS_INSTANTIATE_MAX);
> +
> + return alloc_paca_data(sizeof(struct rtas_args), L1_CACHE_BYTES,
> + limit, cpu);
> +}
> +
> /* The Paca is an array with one entry per processor. Each contains an
> * lppaca, which contains the information shared between the
> * hypervisor and Linux.
> @@ -202,6 +220,7 @@ void __init __nostackprotector initialise_paca(struct paca_struct *new_paca, int
> /* For now -- if we have threads this will be adjusted later */
> new_paca->tcd_ptr = &new_paca->tcd;
> #endif
> + new_paca->reentrant_args = NULL;
> }
>
> /* Put the paca pointer into r13 and SPRG_PACA */
> @@ -274,6 +293,7 @@ void __init allocate_paca(int cpu)
> #ifdef CONFIG_PPC_BOOK3S_64
> paca->slb_shadow_ptr = new_slb_shadow(cpu, limit);
> #endif
> + paca->reentrant_args = new_rtas_args(cpu, limit);
Good, I think this should work as you want now. Can you allocate it like
lppacas? Put it under PSERIES (and in the paca) and check for !HV?
Thanks,
Nick
* [PATCH v5 2/2] powerpc/rtas: Implement reentrant rtas call
From: Leonardo Bras @ 2020-05-16 5:21 UTC (permalink / raw)
To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Nicholas Piggin, Thomas Gleixner, Leonardo Bras, Allison Randal,
Greg Kroah-Hartman, Thiago Jung Bauermann, Anshuman Khandual,
Daniel Axtens, Nathan Lynch, Gautham R. Shenoy, Nadav Amit
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <20200516052137.175881-1-leobras.c@gmail.com>
Implement rtas_call_reentrant() for reentrant rtas-calls:
"ibm,int-on", "ibm,int-off", "ibm,get-xive" and "ibm,set-xive".
On LoPAPR Version 1.1 (March 24, 2016), from 7.3.10.1 to 7.3.10.4,
items 2 and 3 say:
2 - For the PowerPC External Interrupt option: The * call must be
reentrant to the number of processors on the platform.
3 - For the PowerPC External Interrupt option: The * argument call
buffer for each simultaneous call must be physically unique.
So, these rtas-calls can be called in a lockless way, if using
a different buffer for each cpu doing such rtas call.
For this, it was suggested to add the buffer (struct rtas_args)
in the PACA struct, so each cpu can have its own buffer.
The PACA struct received a pointer to rtas buffer, which is
allocated in the memory range available to rtas 32-bit.
Reentrant rtas calls are useful to avoid deadlocks in crashing,
where rtas-calls are needed, but some other thread crashed holding
the rtas.lock.
This is a backtrace of a deadlock from a kdump testing environment:
#0 arch_spin_lock
#1 lock_rtas ()
#2 rtas_call (token=8204, nargs=1, nret=1, outputs=0x0)
#3 ics_rtas_mask_real_irq (hw_irq=4100)
#4 machine_kexec_mask_interrupts
#5 default_machine_crash_shutdown
#6 machine_crash_shutdown
#7 __crash_kexec
#8 crash_kexec
#9 oops_end
Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
---
arch/powerpc/include/asm/paca.h | 2 ++
arch/powerpc/include/asm/rtas.h | 1 +
arch/powerpc/kernel/paca.c | 20 +++++++++++
arch/powerpc/kernel/rtas.c | 52 +++++++++++++++++++++++++++++
arch/powerpc/sysdev/xics/ics-rtas.c | 22 ++++++------
5 files changed, 86 insertions(+), 11 deletions(-)
diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index e3cc9eb9204d..87cd9c2220cc 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -29,6 +29,7 @@
#include <asm/hmi.h>
#include <asm/cpuidle.h>
#include <asm/atomic.h>
+#include <asm/rtas-types.h>
#include <asm-generic/mmiowb_types.h>
@@ -270,6 +271,7 @@ struct paca_struct {
#ifdef CONFIG_MMIOWB
struct mmiowb_state mmiowb_state;
#endif
+ struct rtas_args *reentrant_args;
} ____cacheline_aligned;
extern void copy_mm_to_paca(struct mm_struct *mm);
diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index c35c5350b7e4..fa7509c85881 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -236,6 +236,7 @@ extern struct rtas_t rtas;
extern int rtas_token(const char *service);
extern int rtas_service_present(const char *service);
extern int rtas_call(int token, int, int, int *, ...);
+int rtas_call_reentrant(int token, int nargs, int nret, int *outputs, ...);
void rtas_call_unlocked(struct rtas_args *args, int token, int nargs,
int nret, ...);
extern void __noreturn rtas_restart(char *cmd);
diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c
index 3f91ccaa9c74..88c9b61489fc 100644
--- a/arch/powerpc/kernel/paca.c
+++ b/arch/powerpc/kernel/paca.c
@@ -16,6 +16,7 @@
#include <asm/kexec.h>
#include <asm/svm.h>
#include <asm/ultravisor.h>
+#include <asm/rtas.h>
#include "setup.h"
@@ -164,6 +165,23 @@ static struct slb_shadow * __init new_slb_shadow(int cpu, unsigned long limit)
#endif /* CONFIG_PPC_BOOK3S_64 */
+/**
+ * new_rtas_args() - Allocates rtas args
+ * @cpu: CPU number
+ * @limit: Memory limit for this allocation
+ *
+ * Allocates a struct rtas_args and returns its pointer.
+ *
+ * Return: Pointer to allocated rtas_args
+ */
+static struct rtas_args * __init new_rtas_args(int cpu, unsigned long limit)
+{
+ limit = min_t(unsigned long, limit, RTAS_INSTANTIATE_MAX);
+
+ return alloc_paca_data(sizeof(struct rtas_args), L1_CACHE_BYTES,
+ limit, cpu);
+}
+
/* The Paca is an array with one entry per processor. Each contains an
* lppaca, which contains the information shared between the
* hypervisor and Linux.
@@ -202,6 +220,7 @@ void __init __nostackprotector initialise_paca(struct paca_struct *new_paca, int
/* For now -- if we have threads this will be adjusted later */
new_paca->tcd_ptr = &new_paca->tcd;
#endif
+ new_paca->reentrant_args = NULL;
}
/* Put the paca pointer into r13 and SPRG_PACA */
@@ -274,6 +293,7 @@ void __init allocate_paca(int cpu)
#ifdef CONFIG_PPC_BOOK3S_64
paca->slb_shadow_ptr = new_slb_shadow(cpu, limit);
#endif
+ paca->reentrant_args = new_rtas_args(cpu, limit);
paca_struct_size += sizeof(struct paca_struct);
}
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index c5fa251b8950..6e22eb4fc0e7 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -41,6 +41,7 @@
#include <asm/time.h>
#include <asm/mmu.h>
#include <asm/topology.h>
+#include <asm/paca.h>
/* This is here deliberately so it's only used in this file */
void enter_rtas(unsigned long);
@@ -483,6 +484,57 @@ int rtas_call(int token, int nargs, int nret, int *outputs, ...)
}
EXPORT_SYMBOL(rtas_call);
+/**
+ * rtas_call_reentrant() - Used for reentrant rtas calls
+ * @token: Token for desired reentrant RTAS call
+ * @nargs: Number of Input Parameters
+ * @nret: Number of Output Parameters
+ * @outputs: Array of outputs
+ * @...: Inputs for desired RTAS call
+ *
+ * According to LoPAPR documentation, only "ibm,int-on", "ibm,int-off",
+ * "ibm,get-xive" and "ibm,set-xive" are currently reentrant.
+ * Reentrant calls need their own rtas_args buffer, so they use the per-cpu
+ * PACA buffer instead of rtas.args.
+ *
+ * Return: -1 on error,
+ * First output value of RTAS call if (nret > 0),
+ * 0 otherwise,
+ */
+
+int rtas_call_reentrant(int token, int nargs, int nret, int *outputs, ...)
+{
+ va_list list;
+ struct rtas_args *args;
+ unsigned long flags;
+ int i, ret = 0;
+
+ if (!rtas.entry || token == RTAS_UNKNOWN_SERVICE)
+ return -1;
+
+ local_irq_save(flags);
+ preempt_disable();
+
+ /* We use the per-cpu (PACA) rtas args buffer */
+ args = local_paca->reentrant_args;
+
+ va_start(list, outputs);
+ va_rtas_call_unlocked(args, token, nargs, nret, list);
+ va_end(list);
+
+ if (nret > 1 && outputs)
+ for (i = 0; i < nret - 1; ++i)
+ outputs[i] = be32_to_cpu(args->rets[i + 1]);
+
+ if (nret > 0)
+ ret = be32_to_cpu(args->rets[0]);
+
+ local_irq_restore(flags);
+ preempt_enable();
+
+ return ret;
+}
+
/* For RTAS_BUSY (-2), delay for 1 millisecond. For an extended busy status
* code of 990n, perform the hinted delay of 10^n (last digit) milliseconds.
*/
diff --git a/arch/powerpc/sysdev/xics/ics-rtas.c b/arch/powerpc/sysdev/xics/ics-rtas.c
index 6aabc74688a6..4cf18000f07c 100644
--- a/arch/powerpc/sysdev/xics/ics-rtas.c
+++ b/arch/powerpc/sysdev/xics/ics-rtas.c
@@ -50,8 +50,8 @@ static void ics_rtas_unmask_irq(struct irq_data *d)
server = xics_get_irq_server(d->irq, irq_data_get_affinity_mask(d), 0);
- call_status = rtas_call(ibm_set_xive, 3, 1, NULL, hw_irq, server,
- DEFAULT_PRIORITY);
+ call_status = rtas_call_reentrant(ibm_set_xive, 3, 1, NULL, hw_irq,
+ server, DEFAULT_PRIORITY);
if (call_status != 0) {
printk(KERN_ERR
"%s: ibm_set_xive irq %u server %x returned %d\n",
@@ -60,7 +60,7 @@ static void ics_rtas_unmask_irq(struct irq_data *d)
}
/* Now unmask the interrupt (often a no-op) */
- call_status = rtas_call(ibm_int_on, 1, 1, NULL, hw_irq);
+ call_status = rtas_call_reentrant(ibm_int_on, 1, 1, NULL, hw_irq);
if (call_status != 0) {
printk(KERN_ERR "%s: ibm_int_on irq=%u returned %d\n",
__func__, hw_irq, call_status);
@@ -91,7 +91,7 @@ static void ics_rtas_mask_real_irq(unsigned int hw_irq)
if (hw_irq == XICS_IPI)
return;
- call_status = rtas_call(ibm_int_off, 1, 1, NULL, hw_irq);
+ call_status = rtas_call_reentrant(ibm_int_off, 1, 1, NULL, hw_irq);
if (call_status != 0) {
printk(KERN_ERR "%s: ibm_int_off irq=%u returned %d\n",
__func__, hw_irq, call_status);
@@ -99,8 +99,8 @@ static void ics_rtas_mask_real_irq(unsigned int hw_irq)
}
/* Have to set XIVE to 0xff to be able to remove a slot */
- call_status = rtas_call(ibm_set_xive, 3, 1, NULL, hw_irq,
- xics_default_server, 0xff);
+ call_status = rtas_call_reentrant(ibm_set_xive, 3, 1, NULL, hw_irq,
+ xics_default_server, 0xff);
if (call_status != 0) {
printk(KERN_ERR "%s: ibm_set_xive(0xff) irq=%u returned %d\n",
__func__, hw_irq, call_status);
@@ -131,7 +131,7 @@ static int ics_rtas_set_affinity(struct irq_data *d,
if (hw_irq == XICS_IPI || hw_irq == XICS_IRQ_SPURIOUS)
return -1;
- status = rtas_call(ibm_get_xive, 1, 3, xics_status, hw_irq);
+ status = rtas_call_reentrant(ibm_get_xive, 1, 3, xics_status, hw_irq);
if (status) {
printk(KERN_ERR "%s: ibm,get-xive irq=%u returns %d\n",
@@ -146,8 +146,8 @@ static int ics_rtas_set_affinity(struct irq_data *d,
return -1;
}
- status = rtas_call(ibm_set_xive, 3, 1, NULL,
- hw_irq, irq_server, xics_status[1]);
+ status = rtas_call_reentrant(ibm_set_xive, 3, 1, NULL,
+ hw_irq, irq_server, xics_status[1]);
if (status) {
printk(KERN_ERR "%s: ibm,set-xive irq=%u returns %d\n",
@@ -179,7 +179,7 @@ static int ics_rtas_map(struct ics *ics, unsigned int virq)
return -EINVAL;
/* Check if RTAS knows about this interrupt */
- rc = rtas_call(ibm_get_xive, 1, 3, status, hw_irq);
+ rc = rtas_call_reentrant(ibm_get_xive, 1, 3, status, hw_irq);
if (rc)
return -ENXIO;
@@ -198,7 +198,7 @@ static long ics_rtas_get_server(struct ics *ics, unsigned long vec)
{
int rc, status[2];
- rc = rtas_call(ibm_get_xive, 1, 3, status, vec);
+ rc = rtas_call_reentrant(ibm_get_xive, 1, 3, status, vec);
if (rc)
return -1;
return status[0];
--
2.25.4
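The return convention documented for rtas_call_reentrant() above (first return word is the status, the remaining nret - 1 words go to outputs[]) can be sketched in user space as follows. sketch_be32_to_cpu() and sketch_unpack_rets() are hypothetical stand-ins, not kernel functions, and the byte-array input is only an assumption made so the example runs on hosts of either endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_RETS 16	/* mirrors the 16-entry args[] array */

/* Decode one big-endian 32-bit word regardless of host endianness. */
uint32_t sketch_be32_to_cpu(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * 'rets' points at 'nret' big-endian words. Word 0 is the call status;
 * words 1..nret-1 are copied to outputs[] when outputs is non-NULL.
 * Returns the status word, or 0 when nret is 0.
 */
int sketch_unpack_rets(const uint8_t *rets, int nret, int *outputs)
{
	int i, ret = 0;

	assert(nret >= 0 && nret <= SKETCH_MAX_RETS);

	if (nret > 1 && outputs)
		for (i = 0; i < nret - 1; ++i)
			outputs[i] = (int)sketch_be32_to_cpu(rets + 4 * (i + 1));

	if (nret > 0)
		ret = (int)sketch_be32_to_cpu(rets);

	return ret;
}
```

This matches the callers in ics-rtas.c: an "ibm,get-xive" call with nret = 3 fills a two-element status array, while the nret = 1 calls pass NULL for outputs and only look at the returned status.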
* [PATCH v5 1/2] powerpc/rtas: Move type/struct definitions from rtas.h into rtas-types.h
From: Leonardo Bras @ 2020-05-16 5:21 UTC (permalink / raw)
To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Nicholas Piggin, Thomas Gleixner, Leonardo Bras, Allison Randal,
Greg Kroah-Hartman, Thiago Jung Bauermann, Anshuman Khandual,
Daniel Axtens, Nathan Lynch, Gautham R. Shenoy, Nadav Amit
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <20200516052137.175881-1-leobras.c@gmail.com>
Including rtas.h in other headers just to get an rtas* struct may cause a
lot of errors, because of the include dependencies needed by its inline
functions.
Create rtas-types.h and move all type/struct definitions there from
rtas.h, then include rtas-types.h in rtas.h.
Also, as suggested by checkpatch.pl, replace uint8_t with u8, and keep
the same type pattern for the whole file, as the two are the same
according to powerpc/boot/types.h.
Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
---
arch/powerpc/include/asm/rtas-types.h | 126 ++++++++++++++++++++++++++
arch/powerpc/include/asm/rtas.h | 118 +-----------------------
2 files changed, 127 insertions(+), 117 deletions(-)
create mode 100644 arch/powerpc/include/asm/rtas-types.h
diff --git a/arch/powerpc/include/asm/rtas-types.h b/arch/powerpc/include/asm/rtas-types.h
new file mode 100644
index 000000000000..87354e28f160
--- /dev/null
+++ b/arch/powerpc/include/asm/rtas-types.h
@@ -0,0 +1,126 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef _POWERPC_RTAS_TYPES_H
+#define _POWERPC_RTAS_TYPES_H
+#ifdef __KERNEL__
+
+#include <linux/spinlock_types.h>
+
+typedef __be32 rtas_arg_t;
+
+struct rtas_args {
+ __be32 token;
+ __be32 nargs;
+ __be32 nret;
+ rtas_arg_t args[16];
+ rtas_arg_t *rets; /* Pointer to return values in args[]. */
+};
+
+struct rtas_t {
+ unsigned long entry; /* physical address pointer */
+ unsigned long base; /* physical address pointer */
+ unsigned long size;
+ arch_spinlock_t lock;
+ struct rtas_args args;
+ struct device_node *dev; /* virtual address pointer */
+};
+
+struct rtas_suspend_me_data {
+ atomic_t working; /* number of cpus accessing this struct */
+ atomic_t done;
+ int token; /* ibm,suspend-me */
+ atomic_t error;
+ struct completion *complete; /* wait on this until working == 0 */
+};
+
+struct rtas_error_log {
+ /* Byte 0 */
+ u8 byte0; /* Architectural version */
+
+ /* Byte 1 */
+ u8 byte1;
+ /* XXXXXXXX
+ * XXX 3: Severity level of error
+ * XX 2: Degree of recovery
+ * X 1: Extended log present?
+ * XX 2: Reserved
+ */
+
+ /* Byte 2 */
+ u8 byte2;
+ /* XXXXXXXX
+ * XXXX 4: Initiator of event
+ * XXXX 4: Target of failed operation
+ */
+ u8 byte3; /* General event or error*/
+ __be32 extended_log_length; /* length in bytes */
+ unsigned char buffer[1]; /* Start of extended log */
+ /* Variable length. */
+};
+
+/* RTAS general extended event log, Version 6. The extended log starts
+ * from "buffer" field of struct rtas_error_log defined above.
+ */
+struct rtas_ext_event_log_v6 {
+ /* Byte 0 */
+ u8 byte0;
+ /* XXXXXXXX
+ * X 1: Log valid
+ * X 1: Unrecoverable error
+ * X 1: Recoverable (correctable or successfully retried)
+ * X 1: Bypassed unrecoverable error (degraded operation)
+ * X 1: Predictive error
+ * X 1: "New" log (always 1 for data returned from RTAS)
+ * X 1: Big Endian
+ * X 1: Reserved
+ */
+
+ /* Byte 1 */
+ u8 byte1; /* reserved */
+
+ /* Byte 2 */
+ u8 byte2;
+ /* XXXXXXXX
+ * X 1: Set to 1 (indicating log is in PowerPC format)
+ * XXX 3: Reserved
+ * XXXX 4: Log format used for bytes 12-2047
+ */
+
+ /* Byte 3 */
+ u8 byte3; /* reserved */
+ /* Byte 4-11 */
+ u8 reserved[8]; /* reserved */
+ /* Byte 12-15 */
+ __be32 company_id; /* Company ID of the company */
+ /* that defines the format for */
+ /* the vendor specific log type */
+ /* Byte 16-end of log */
+ u8 vendor_log[1]; /* Start of vendor specific log */
+ /* Variable length. */
+};
+
+/* Vendor specific Platform Event Log Format, Version 6, section header */
+struct pseries_errorlog {
+ __be16 id; /* 0x00 2-byte ASCII section ID */
+ __be16 length; /* 0x02 Section length in bytes */
+ u8 version; /* 0x04 Section version */
+ u8 subtype; /* 0x05 Section subtype */
+ __be16 creator_component; /* 0x06 Creator component ID */
+ u8 data[]; /* 0x08 Start of section data */
+};
+
+/* RTAS pseries hotplug errorlog section */
+struct pseries_hp_errorlog {
+ u8 resource;
+ u8 action;
+ u8 id_type;
+ u8 reserved;
+ union {
+ __be32 drc_index;
+ __be32 drc_count;
+ struct { __be32 count, index; } ic;
+ char drc_name[1];
+ } _drc_u;
+};
+
+#endif /* __KERNEL__ */
+#endif /* _POWERPC_RTAS_TYPES_H */
diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index 3c1887351c71..c35c5350b7e4 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -5,6 +5,7 @@
#include <linux/spinlock.h>
#include <asm/page.h>
+#include <asm/rtas-types.h>
#include <linux/time.h>
#include <linux/cpumask.h>
@@ -42,33 +43,6 @@
*
*/
-typedef __be32 rtas_arg_t;
-
-struct rtas_args {
- __be32 token;
- __be32 nargs;
- __be32 nret;
- rtas_arg_t args[16];
- rtas_arg_t *rets; /* Pointer to return values in args[]. */
-};
-
-struct rtas_t {
- unsigned long entry; /* physical address pointer */
- unsigned long base; /* physical address pointer */
- unsigned long size;
- arch_spinlock_t lock;
- struct rtas_args args;
- struct device_node *dev; /* virtual address pointer */
-};
-
-struct rtas_suspend_me_data {
- atomic_t working; /* number of cpus accessing this struct */
- atomic_t done;
- int token; /* ibm,suspend-me */
- atomic_t error;
- struct completion *complete; /* wait on this until working == 0 */
-};
-
/* RTAS event classes */
#define RTAS_INTERNAL_ERROR 0x80000000 /* set bit 0 */
#define RTAS_EPOW_WARNING 0x40000000 /* set bit 1 */
@@ -148,31 +122,6 @@ struct rtas_suspend_me_data {
/* RTAS check-exception vector offset */
#define RTAS_VECTOR_EXTERNAL_INTERRUPT 0x500
-struct rtas_error_log {
- /* Byte 0 */
- uint8_t byte0; /* Architectural version */
-
- /* Byte 1 */
- uint8_t byte1;
- /* XXXXXXXX
- * XXX 3: Severity level of error
- * XX 2: Degree of recovery
- * X 1: Extended log present?
- * XX 2: Reserved
- */
-
- /* Byte 2 */
- uint8_t byte2;
- /* XXXXXXXX
- * XXXX 4: Initiator of event
- * XXXX 4: Target of failed operation
- */
- uint8_t byte3; /* General event or error*/
- __be32 extended_log_length; /* length in bytes */
- unsigned char buffer[1]; /* Start of extended log */
- /* Variable length. */
-};
-
static inline uint8_t rtas_error_severity(const struct rtas_error_log *elog)
{
return (elog->byte1 & 0xE0) >> 5;
@@ -212,47 +161,6 @@ uint32_t rtas_error_extended_log_length(const struct rtas_error_log *elog)
#define RTAS_V6EXT_COMPANY_ID_IBM (('I' << 24) | ('B' << 16) | ('M' << 8))
-/* RTAS general extended event log, Version 6. The extended log starts
- * from "buffer" field of struct rtas_error_log defined above.
- */
-struct rtas_ext_event_log_v6 {
- /* Byte 0 */
- uint8_t byte0;
- /* XXXXXXXX
- * X 1: Log valid
- * X 1: Unrecoverable error
- * X 1: Recoverable (correctable or successfully retried)
- * X 1: Bypassed unrecoverable error (degraded operation)
- * X 1: Predictive error
- * X 1: "New" log (always 1 for data returned from RTAS)
- * X 1: Big Endian
- * X 1: Reserved
- */
-
- /* Byte 1 */
- uint8_t byte1; /* reserved */
-
- /* Byte 2 */
- uint8_t byte2;
- /* XXXXXXXX
- * X 1: Set to 1 (indicating log is in PowerPC format)
- * XXX 3: Reserved
- * XXXX 4: Log format used for bytes 12-2047
- */
-
- /* Byte 3 */
- uint8_t byte3; /* reserved */
- /* Byte 4-11 */
- uint8_t reserved[8]; /* reserved */
- /* Byte 12-15 */
- __be32 company_id; /* Company ID of the company */
- /* that defines the format for */
- /* the vendor specific log type */
- /* Byte 16-end of log */
- uint8_t vendor_log[1]; /* Start of vendor specific log */
- /* Variable length. */
-};
-
static
inline uint8_t rtas_ext_event_log_format(struct rtas_ext_event_log_v6 *ext_log)
{
@@ -287,16 +195,6 @@ inline uint32_t rtas_ext_event_company_id(struct rtas_ext_event_log_v6 *ext_log)
#define PSERIES_ELOG_SECT_ID_HOTPLUG (('H' << 8) | 'P')
#define PSERIES_ELOG_SECT_ID_MCE (('M' << 8) | 'C')
-/* Vendor specific Platform Event Log Format, Version 6, section header */
-struct pseries_errorlog {
- __be16 id; /* 0x00 2-byte ASCII section ID */
- __be16 length; /* 0x02 Section length in bytes */
- uint8_t version; /* 0x04 Section version */
- uint8_t subtype; /* 0x05 Section subtype */
- __be16 creator_component; /* 0x06 Creator component ID */
- uint8_t data[]; /* 0x08 Start of section data */
-};
-
static
inline uint16_t pseries_errorlog_id(struct pseries_errorlog *sect)
{
@@ -309,20 +207,6 @@ inline uint16_t pseries_errorlog_length(struct pseries_errorlog *sect)
return be16_to_cpu(sect->length);
}
-/* RTAS pseries hotplug errorlog section */
-struct pseries_hp_errorlog {
- u8 resource;
- u8 action;
- u8 id_type;
- u8 reserved;
- union {
- __be32 drc_index;
- __be32 drc_count;
- struct { __be32 count, index; } ic;
- char drc_name[1];
- } _drc_u;
-};
-
#define PSERIES_HP_ELOG_RESOURCE_CPU 1
#define PSERIES_HP_ELOG_RESOURCE_MEM 2
#define PSERIES_HP_ELOG_RESOURCE_SLOT 3
--
2.25.4
* [PATCH v5 0/2] Implement reentrant rtas call
From: Leonardo Bras @ 2020-05-16 5:21 UTC (permalink / raw)
To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Allison Randal, Greg Kroah-Hartman, Thomas Gleixner,
Nicholas Piggin, Leonardo Bras, Nathan Lynch, Gautham R. Shenoy,
Nadav Amit
Cc: linuxppc-dev, linux-kernel
Patch 2 implements rtas_call_reentrant() for reentrant rtas-calls:
"ibm,int-on", "ibm,int-off", "ibm,get-xive" and "ibm,set-xive",
according to LoPAPR Version 1.1 (March 24, 2016).
For that, it's necessary that every call uses a different
rtas buffer (rtas_args). Paul Mackerras suggested using the PACA
structure for creating a per-cpu buffer for these calls.
Patch 1 was necessary to make PACA have a 'struct rtas_args' member.
Reentrant rtas calls can be useful to avoid deadlocks in crashing,
where rtas-calls are needed, but some other thread crashed holding
the rtas.lock.
This is a backtrace of a deadlock from a kdump testing environment:
#0 arch_spin_lock
#1 lock_rtas ()
#2 rtas_call (token=8204, nargs=1, nret=1, outputs=0x0)
#3 ics_rtas_mask_real_irq (hw_irq=4100)
#4 machine_kexec_mask_interrupts
#5 default_machine_crash_shutdown
#6 machine_crash_shutdown
#7 __crash_kexec
#8 crash_kexec
#9 oops_end
Signed-off-by: Leonardo Bras <leobras.c@gmail.com>
---
Changes since v4:
- Instead of having the full buffer on PACA, add only a pointer and
allocate the buffer during allocate_paca(), making sure it's in a
memory range available for RTAS (32-bit). (Thanks Nick Piggin!)
Changes since v3:
- Adds protection from preemption and interruption
Changes since v2:
- Fixed build failure from ppc64e, by including spinlock_types.h on
rtas-types.h
- Improved commit messages
Changes since v1:
- Moved buffer from stack to PACA (as suggested by Paul Mackerras)
- Added missing output bits
- Improve documentation following kernel-doc format (as suggested by
Nathan Lynch)
Leonardo Bras (2):
powerpc/rtas: Move type/struct definitions from rtas.h into
rtas-types.h
powerpc/rtas: Implement reentrant rtas call
arch/powerpc/include/asm/paca.h | 2 +
arch/powerpc/include/asm/rtas-types.h | 126 ++++++++++++++++++++++++++
arch/powerpc/include/asm/rtas.h | 119 +-----------------------
arch/powerpc/kernel/rtas.c | 42 +++++++++
arch/powerpc/sysdev/xics/ics-rtas.c | 22 ++---
5 files changed, 183 insertions(+), 128 deletions(-)
create mode 100644 arch/powerpc/include/asm/rtas-types.h
--
2.25.4
* Re: [PATCH v4 2/2] powerpc/rtas: Implement reentrant rtas call
From: Leonardo Bras @ 2020-05-16 4:08 UTC (permalink / raw)
To: Nicholas Piggin, Allison Randal, Benjamin Herrenschmidt,
Gautham R. Shenoy, Greg Kroah-Hartman, Michael Ellerman,
Nadav Amit, Nathan Lynch, Paul Mackerras, Thomas Gleixner
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <1589525800.2asfsw2zlu.astroid@bobo.none>
Hello Nick,
On Fri, 2020-05-15 at 17:30 +1000, Nicholas Piggin wrote:
> Excerpts from Leonardo Bras's message of May 15, 2020 9:51 am:
> > Implement rtas_call_reentrant() for reentrant rtas-calls:
> > "ibm,int-on", "ibm,int-off", "ibm,get-xive" and "ibm,set-xive".
> >
> > On LoPAPR Version 1.1 (March 24, 2016), from 7.3.10.1 to 7.3.10.4,
> > items 2 and 3 say:
> >
> > 2 - For the PowerPC External Interrupt option: The * call must be
> > reentrant to the number of processors on the platform.
> > 3 - For the PowerPC External Interrupt option: The * argument call
> > buffer for each simultaneous call must be physically unique.
> >
> > So, these rtas-calls can be called in a lockless way, if using
> > a different buffer for each cpu doing such rtas call.
>
> What about rtas_call_unlocked? Do the callers need to take the rtas
> lock?
>
> Machine checks must call ibm,nmi-interlock too, which we really don't
> want to take a lock for either. Hopefully that's in a class of its own
> and we can essentially ignore with respect to other rtas calls.
>
> The spec is pretty vague too :(
>
> "The ibm,get-xive call must be reentrant to the number of processors on
> the platform."
>
> This suggests ibm,get-xive can be called concurrently by multiple
> processors. It doesn't say anything about being re-entrant against any
> of the other re-entrant calls. Maybe that could be reasonably assumed,
> but I don't know if it's reasonable to assume it can be called
> concurrently with a *non-reentrant* call, is it?
This was discussed on a previous version of the patchset:
https://lore.kernel.org/linuxppc-dev/875zcy2v8o.fsf@linux.ibm.com/
He checked with partition firmware development and these calls can be
used concurrently with arbitrary other RTAS calls.
>
> > For this, it was suggested to add the buffer (struct rtas_args)
> > in the PACA struct, so each cpu can have its own buffer.
>
> You can't do this, paca is not limited to RTAS_INSTANTIATE_MAX.
> Which is good, because I didn't want you to add another 88 bytes to the
> paca :) Can you make it a pointer and allocate it separately? Check
> the slb_shadow allocation, you could use a similar pattern.
Sure, I will send the next version with this change.
>
> The other option would be to have just one more rtas args, and have the
> crashing CPU always use that. That would skirt the re-entrancy issue -- the
> concurrency is only ever a last resort. Would be a bit trickier though.
It seems like a good idea, but I would like to try the previous alternative
first.
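The pointer-plus-separate-allocation alternative could look roughly like the userspace sketch below. The limit value, the struct and the field name are placeholders chosen for this example, not taken from the kernel; the point is only that the per-CPU area grows by one pointer and that the buffer must lie entirely below the RTAS-addressable limit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical upper bound for RTAS-addressable memory in this sketch. */
#define RTAS_LIMIT_SKETCH (1ULL << 32)

/* Per-CPU area holding only a pointer, so its own size barely changes. */
struct paca_sketch {
	void *rtas_args_reentrant;	/* hypothetical field name */
};

/* True if [addr, addr + size) lies entirely below the limit. */
static bool fits_below_limit(uint64_t addr, uint64_t size, uint64_t limit)
{
	return size <= limit && addr <= limit - size;
}

/*
 * Attach a separately allocated buffer to the per-CPU area, but only if
 * its (assumed) physical address range is usable by RTAS.
 */
static bool paca_attach_rtas_args(struct paca_sketch *paca, void *buf,
				  uint64_t phys, uint64_t size)
{
	if (!fits_below_limit(phys, size, RTAS_LIMIT_SKETCH))
		return false;
	paca->rtas_args_reentrant = buf;
	return true;
}
```

The range check is written as addr <= limit - size so that it cannot overflow for addresses near the top of the 64-bit space.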
> Thanks,
> Nick
Thank you Nick!
* Re: powerpc/pci: [PATCH 1/1 V2] PCIE PHB reset
From: Gustavo Romero @ 2020-05-15 22:02 UTC (permalink / raw)
To: wenxiong, linuxppc-dev; +Cc: brking, oohall, wenxiong
In-Reply-To: <1589573097-12892-1-git-send-email-wenxiong@linux.vnet.ibm.com>
Hi Xiong,
On 5/15/20 5:04 PM, wenxiong@linux.vnet.ibm.com wrote:
> From: Wen Xiong <wenxiong@linux.vnet.ibm.com>
>
> Several device drivers hit EEH (Extended Error Handling) when triggering
> kdump on Pseries PowerVM. This patch implements a reset of the PHBs
> in the general PCI code. The PHB reset stops all PCI transactions from the
> previous kernel. We have tested the patch in several environments:
What exactly do you mean by "previous kernel" here? Could that comment be
expanded a bit further?
Thanks,
Gustavo
* powerpc/pci: [PATCH 1/1 V2] PCIE PHB reset
From: wenxiong @ 2020-05-15 20:04 UTC (permalink / raw)
To: linuxppc-dev; +Cc: brking, Wen Xiong, oohall, wenxiong
From: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Several device drivers hit EEH (Extended Error Handling) when triggering
kdump on Pseries PowerVM. This patch implements a reset of the PHBs
in the general PCI code. The PHB reset stops all PCI transactions from the
previous kernel. We have tested the patch in several environments:
- direct slot adapters
- adapters under the switch
- a VF adapter in PowerVM
- a VF adapter/adapter in KVM guest.
Signed-off-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
---
arch/powerpc/platforms/pseries/pci.c | 152 +++++++++++++++++++++++++++
1 file changed, 152 insertions(+)
diff --git a/arch/powerpc/platforms/pseries/pci.c b/arch/powerpc/platforms/pseries/pci.c
index 911534b89c85..cb7e4276cf04 100644
--- a/arch/powerpc/platforms/pseries/pci.c
+++ b/arch/powerpc/platforms/pseries/pci.c
@@ -11,6 +11,8 @@
#include <linux/kernel.h>
#include <linux/pci.h>
#include <linux/string.h>
+#include <linux/crash_dump.h>
+#include <linux/delay.h>
#include <asm/eeh.h>
#include <asm/pci-bridge.h>
@@ -354,3 +356,153 @@ int pseries_root_bridge_prepare(struct pci_host_bridge *bridge)
return 0;
}
+
+/**
+ * pseries_get_pdn_addr - Retrieve PHB address
+ * @phb: PCI controller (PHB)
+ *
+ * Retrieve the associated PHB address. There are two RTAS function
+ * calls dedicated to this purpose. We try the newer function first
+ * and fall back to the older one. The caller should make sure the
+ * config address has been figured out from the FDT node before
+ * calling this function.
+ *
+ */
+static int pseries_get_pdn_addr(struct pci_controller *phb)
+{
+ int ret = -1;
+ int rets[3];
+ int ibm_get_config_addr_info;
+ int ibm_get_config_addr_info2;
+ int config_addr = 0;
+ struct pci_dn *root_pdn, *pdn;
+
+ ibm_get_config_addr_info2 = rtas_token("ibm,get-config-addr-info2");
+ ibm_get_config_addr_info = rtas_token("ibm,get-config-addr-info");
+
+ root_pdn = PCI_DN(phb->dn);
+ pdn = list_first_entry(&root_pdn->child_list, struct pci_dn, list);
+ config_addr = (pdn->busno << 16) | (pdn->devfn << 8);
+
+ if (ibm_get_config_addr_info2 != RTAS_UNKNOWN_SERVICE) {
+ /*
+ * First of all, we need to make sure there is one PE
+ * associated with the device. If option is 1, it
+ * queries if config address is supported in a PE or not.
+ * If option is 0, it returns PE config address or config
+ * address for the PE primary bus.
+ */
+ ret = rtas_call(ibm_get_config_addr_info2, 4, 2, rets,
+ config_addr, BUID_HI(pdn->phb->buid),
+ BUID_LO(pdn->phb->buid), 1);
+ if (ret || (rets[0] == 0)) {
+ pr_warn("%s: Failed to get address for PHB#%x-PE# option=%d config_addr=%x\n",
+ __func__, pdn->phb->global_number, 1, rets[0]);
+ return -1;
+ }
+
+ /* Retrieve the associated PE config address */
+ ret = rtas_call(ibm_get_config_addr_info2, 4, 2, rets,
+ config_addr, BUID_HI(pdn->phb->buid),
+ BUID_LO(pdn->phb->buid), 0);
+ if (ret) {
+ pr_warn("%s: Failed to get address for PHB#%x-PE# option=%d config_addr=%x\n",
+ __func__, pdn->phb->global_number, 0, rets[0]);
+ return -1;
+ }
+ return rets[0];
+ }
+
+ if (ibm_get_config_addr_info != RTAS_UNKNOWN_SERVICE) {
+ ret = rtas_call(ibm_get_config_addr_info, 4, 2, rets,
+ config_addr, BUID_HI(pdn->phb->buid),
+ BUID_LO(pdn->phb->buid), 0);
+ if (ret || rets[0]) {
+ pr_warn("%s: Failed to get address for PHB#%x-PE# config_addr=%x\n",
+ __func__, pdn->phb->global_number, rets[0]);
+ return -1;
+ }
+ return rets[0];
+ }
+
+ return ret;
+}
+
+static int __init pseries_phb_reset(void)
+{
+ struct pci_controller *phb;
+ int config_addr;
+ int ibm_set_slot_reset;
+ int ibm_configure_pe;
+ int ret;
+
+ if (is_kdump_kernel() || reset_devices) {
+ pr_info("Issue PHB reset ...\n");
+ ibm_set_slot_reset = rtas_token("ibm,set-slot-reset");
+ ibm_configure_pe = rtas_token("ibm,configure-pe");
+
+ if (ibm_set_slot_reset == RTAS_UNKNOWN_SERVICE ||
+ ibm_configure_pe == RTAS_UNKNOWN_SERVICE) {
+ pr_info("%s: EEH functionality not supported\n",
+ __func__);
+ }
+
+ list_for_each_entry(phb, &hose_list, list_node) {
+ config_addr = pseries_get_pdn_addr(phb);
+ if (config_addr == -1)
+ continue;
+
+ ret = rtas_call(ibm_set_slot_reset, 4, 1, NULL,
+ config_addr, BUID_HI(phb->buid),
+ BUID_LO(phb->buid), EEH_RESET_FUNDAMENTAL);
+
+ /* If fundamental-reset not supported, try hot-reset */
+ if (ret == -8)
+ ret = rtas_call(ibm_set_slot_reset, 4, 1, NULL,
+ config_addr, BUID_HI(phb->buid),
+ BUID_LO(phb->buid), EEH_RESET_HOT);
+
+ if (ret) {
+ pr_err("%s: PHB#%x-PE# failed with rtas_call activate reset=%d\n",
+ __func__, phb->global_number, ret);
+ continue;
+ }
+ }
+ msleep(EEH_PE_RST_SETTLE_TIME);
+
+ list_for_each_entry(phb, &hose_list, list_node) {
+ config_addr = pseries_get_pdn_addr(phb);
+ if (config_addr == -1)
+ continue;
+
+ ret = rtas_call(ibm_set_slot_reset, 4, 1, NULL,
+ config_addr, BUID_HI(phb->buid),
+ BUID_LO(phb->buid), EEH_RESET_DEACTIVATE);
+ if (ret) {
+ pr_err("%s: PHB#%x-PE# failed with rtas_call deactivate reset=%d\n",
+ __func__, phb->global_number, ret);
+ continue;
+ }
+ }
+ msleep(EEH_PE_RST_SETTLE_TIME);
+
+ list_for_each_entry(phb, &hose_list, list_node) {
+ config_addr = pseries_get_pdn_addr(phb);
+ if (config_addr == -1)
+ continue;
+
+ ret = rtas_call(ibm_configure_pe, 3, 1, NULL,
+ config_addr, BUID_HI(phb->buid),
+ BUID_LO(phb->buid));
+ if (ret) {
+ pr_err("%s: PHB#%x-PE# failed with rtas_call configure_pe =%d\n",
+ __func__, phb->global_number, ret);
+ continue;
+ }
+ }
+ }
+
+ return 0;
+}
+machine_postcore_initcall(pseries, pseries_phb_reset);
+
--
2.18.1
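For readers following the config_addr computation in pseries_get_pdn_addr(), here is a small standalone sketch of how the bus number and devfn are packed. The helper names are made up for this example. make_devfn() follows the usual PCI devfn encoding (5-bit slot, 3-bit function), and buid_hi()/buid_lo() are assumed to split the 64-bit BUID into its upper and lower 32 bits, which is how BUID_HI()/BUID_LO() are used in the calls above.

```c
#include <stdint.h>

/* Encode slot and function into a devfn byte (usual PCI convention). */
static uint8_t make_devfn(uint8_t slot, uint8_t func)
{
	return (uint8_t)(((slot & 0x1f) << 3) | (func & 0x07));
}

/* Pack bus number and devfn the way the patch computes config_addr. */
static uint32_t make_config_addr(uint8_t busno, uint8_t devfn)
{
	return ((uint32_t)busno << 16) | ((uint32_t)devfn << 8);
}

/* Assumed equivalents of BUID_HI()/BUID_LO(): split a 64-bit BUID. */
static uint32_t buid_hi(uint64_t buid)
{
	return (uint32_t)(buid >> 32);
}

static uint32_t buid_lo(uint64_t buid)
{
	return (uint32_t)buid;
}
```

For example, bus 0x01, slot 2, function 0 gives devfn 0x10 and a config address of 0x00011000.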
* Re: [PATCH v2 00/12] mm: consolidate definitions of page table accessors
From: Andrew Morton @ 2020-05-15 21:12 UTC (permalink / raw)
To: Mike Rapoport
Cc: linux-m68k, Rich Felker, linux-ia64, linux-sh, Catalin Marinas,
Heiko Carstens, linux-mips, Max Filippov, Guo Ren, Matthew Wilcox,
sparclinux, linux-hexagon, linux-riscv, Vincent Chen, Will Deacon,
Greg Ungerer, linux-arch, linux-s390, linux-c6x-dev, Brian Cain,
Helge Deller, x86, Russell King, Ley Foon Tan, Mike Rapoport,
Ingo Molnar, Geert Uytterhoeven, linux-parisc, Mark Salter,
Matt Turner, linux-snps-arc, linux-xtensa, Arnd Bergmann,
linux-alpha, linux-um, Tony Luck, Borislav Petkov, Greentime Hu,
Paul Walmsley, Stafford Horne, linux-csky, Guan Xuetao,
linux-arm-kernel, Chris Zankel, Michal Simek, Thomas Bogendoerfer,
Yoshinori Sato, Nick Hu, linux-mm, Vineet Gupta, linux-kernel,
openrisc, Thomas Gleixner, Richard Weinberger, linuxppc-dev,
David S. Miller
In-Reply-To: <20200514170327.31389-1-rppt@kernel.org>
On Thu, 14 May 2020 20:03:15 +0300 Mike Rapoport <rppt@kernel.org> wrote:
> The low level page table accessors (pXY_index(), pXY_offset()) are
> duplicated across all architectures and sometimes more than once. For
> instance, we have 31 definitions of pgd_offset() for 25 supported
> architectures.
>
> Most of these definitions are actually identical and typically it boils
> down to, e.g.
>
> static inline unsigned long pmd_index(unsigned long address)
> {
> return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1);
> }
>
> static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
> {
> return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(address);
> }
>
> These definitions can be shared among 90% of the arches provided XYZ_SHIFT,
> PTRS_PER_XYZ and xyz_page_vaddr() are defined.
>
> For architectures that really need a custom version there is always
> the possibility to override the generic version with the usual ifdefs magic.
>
> These patches introduce include/linux/pgtable.h that replaces
> include/asm-generic/pgtable.h and add the definitions of the page table
> accessors to the new header.
hm,
> 712 files changed, 684 insertions(+), 2021 deletions(-)
big!
There's a lot of stuff going on at present (I suspect everyone is
sitting at home coding up a storm). However this all merged up fairly
cleanly, haven't tried compiling it yet.
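For anyone who wants to poke at the generic pmd_index() quoted above outside the kernel, a userspace sketch follows. The shift and table size are assumptions picked for illustration (4 KiB pages with 512-entry tables); real values come from each architecture's headers, and the _SKETCH/_sketch names are not kernel identifiers.

```c
#include <stdint.h>

/* Assumed values for illustration only; architectures define their own. */
#define PMD_SHIFT_SKETCH    21
#define PTRS_PER_PMD_SKETCH 512

/* Same arithmetic as the quoted pmd_index(): shift, then mask. */
static unsigned long pmd_index_sketch(uint64_t address)
{
	return (unsigned long)((address >> PMD_SHIFT_SKETCH) &
			       (PTRS_PER_PMD_SKETCH - 1));
}
```

With these values, address 0x40200000 shifts down to 513, and masking with 511 wraps it to index 1 in the next PMD table.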