LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH] perf tools: Resolve symbols against debug file first
From: Namhyung Kim @ 2021-01-15  7:37 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Mark Rutland, Peter Zijlstra, Jiri Slaby, linuxppc-dev,
	linux-kernel, Arnaldo Carvalho de Melo, Alexander Shishkin,
	Ingo Molnar, Jiri Olsa
In-Reply-To: <87sg73kbsq.fsf@mpe.ellerman.id.au>

Hello,

On Thu, Jan 14, 2021 at 8:17 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> Namhyung Kim <namhyung@kernel.org> writes:
> > On Wed, Jan 13, 2021 at 8:43 PM Jiri Slaby <jslaby@suse.cz> wrote:
> >>
> >> On 13. 01. 21, 11:46, Jiri Olsa wrote:
> >> > On Wed, Jan 13, 2021 at 09:01:28AM +0100, Jiri Slaby wrote:
> >> >> With LTO, there are symbols like these:
> >> >> /usr/lib/debug/usr/lib64/libantlr4-runtime.so.4.8-4.8-1.4.x86_64.debug
> >> >>   10305: 0000000000955fa4     0 NOTYPE  LOCAL  DEFAULT   29 Predicate.cpp.2bc410e7
> >> >>
> >> >> This comes from a runtime/debug split done by the standard way:
> >> >> objcopy --only-keep-debug $runtime $debug
> >> >> objcopy --add-gnu-debuglink=$debugfn -R .comment -R .GCC.command.line --strip-all $runtime
> >> >>
> ...
> >> >> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
> >> >> index f3577f7d72fe..a31b716fa61c 100644
> >> >> --- a/tools/perf/util/symbol-elf.c
> >> >> +++ b/tools/perf/util/symbol-elf.c
> >> >> @@ -1226,12 +1226,20 @@ int dso__load_sym(struct dso *dso, struct map *map, struct symsrc *syms_ss,
> >> >>              if (sym.st_shndx == SHN_ABS)
> >> >>                      continue;
> >> >>
> >> >> -            sec = elf_getscn(runtime_ss->elf, sym.st_shndx);
> >> >> +            sec = elf_getscn(syms_ss->elf, sym.st_shndx);
> >> >>              if (!sec)
> >> >>                      goto out_elf_end;
> >> >
> >> > we iterate symbols from syms_ss, so the fix seems to be correct
> >> > to call elf_getscn on syms_ss, not on runtime_ss as we do now
> >> >
> >> > I'd think this worked only when runtime_ss == syms_ss
> >>
> >> No, because the headers are copied 1:1 from runtime_ss to syms_ss. And
> >> runtime_ss is then stripped, so only .debug* sections are removed there.
> >> (And syms_ss's are set as NOBITS.)
> >>
> >> We iterated .debug* sections in syms_ss and used runtime_ss section
> >> _headers_ only to adjust symbols (sometimes). That worked.
> >
> > It seems PPC has an opd section only in the runtime_ss and that's why
> > we use it for section headers.
>
> At least on my system (Ubuntu 20.04.1) I see .opd in the debug file with
> NOBITS set:
>
> $ readelf -e vmlinux.debug | grep opd
>   [37] .opd              NOBITS           c000000001c1f548  01202e14
>
>
> But possibly that's not the case with older toolchains?

I was referring to this commit:

commit 261360b6e90a782f0a63d8f61a67683c376c88cf
Author: Cody P Schafer <cody@linux.vnet.ibm.com>
Date:   Fri Aug 10 15:23:01 2012 -0700

    perf symbols: Convert dso__load_syms to take 2 symsrc's

    To properly handle platforms with an opd section, both a runtime image
    (which contains the opd section but possibly lacks symbols) and a symbol
    image (which probably lacks an opd section but has symbols).

    The next patch ("perf symbol: use both runtime and debug images")
    adjusts the callsite in dso__load() to take advantage of being able to
    pass both runtime & debug images.

    Assumptions made here:

     - The opd section, if it exists in the runtime image, has headers in
       both the runtime image and the debug/syms image.

     - The index of the opd section (again, only if it exists in the runtime
       image) is the same in both the runtime and debug/symbols image.

    Both of these are true on RHEL, but it is unclear how accurate they are
    in general (on platforms with function descriptors in opd sections).

    Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>


Thanks,
Namhyung

^ permalink raw reply

* [PATCH] Adds a new ioctl32 syscall for backwards compatibility layers
From: sonicadvance1 @ 2021-01-15  7:02 UTC (permalink / raw)
  Cc: linux-xtensa, Rich Felker, Jan Kara, Miklos Szeredi,
	Dominik Brodowski, linux-kernel, James E.J. Bottomley,
	Max Filippov, Paul Mackerras, H. Peter Anvin, sparclinux,
	linux-ia64, Christian Brauner, Vincenzo Frascino, Will Deacon,
	linux-arch, Stephen Rothwell, Arnd Bergmann, Yoshinori Sato,
	linux-sh, Helge Deller, x86, YueHaibing, Russell King,
	Christian Borntraeger, Ingo Molnar, Geert Uytterhoeven,
	linux-arm-kernel, Catalin Marinas, Matt Turner, linux-mips,
	Fenghua Yu, Arnaldo Carvalho de Melo, Vasily Gorbik, Ryan Houdek,
	Brian Gerst, Heiko Carstens, David Rientjes, Willem de Bruijn,
	Nicholas Piggin, Suren Baghdasaryan, Aleksa Sarai,
	Thomas Bogendoerfer, Ivan Kokshaysky, Alexander Viro,
	Andy Lutomirski, Thomas Gleixner, Xiaoming Ni, Vlastimil Babka,
	Richard Henderson, Chris Zankel, Michal Simek, Tony Luck,
	linux-parisc, linux-m68k, linux-s390, Oleg Nesterov, Minchan Kim,
	Eric W. Biederman, linux-alpha, linux-fsdevel, Borislav Petkov,
	Andrew Morton, linux-api, linuxppc-dev, David S. Miller
In-Reply-To: <20210106064807.253112-1-Sonicadvance1@gmail.com>

From: Ryan Houdek <Sonicadvance1@gmail.com>

Problem presented:
A backwards compatibility layer that allows running x86-64 and x86
processes inside of an AArch64 process.
  - CPU is emulated
  - Syscall interface is mostly passthrough
  - Some syscalls require patching or emulation depending on behaviour
  - Not viable from the emulator design to use an AArch32 host process

x86-64 and x86 userspace emulator source:
https://github.com/FEX-Emu/FEX
Usage of ioctl32 is currently in a downstream fork. This will be the
first user of the syscall.

Cross documentation:
https://github.com/FEX-Emu/FEX/wiki/32Bit-x86-Woes#ioctl---54

ioctls are opaque from the emulator perspective and the data wants to be
passed through a syscall as unimpeded as possible.
Sadly due to ioctl struct differences between x86 and x86-64, we need a
syscall that exposes the compatibility ioctl handler to userspace in a
64bit process.

This is necessary behaves of the behaviour differences that occur
between an x86 process doing an ioctl and an x86-64 process doing an
ioctl.

Both of which are captured and passed through the AArch64 ioctl space.
This is implementing a new ioctl32 syscall that allows us to pass 32bit
x86 ioctls through to the kernel with zero or minimal manipulation.

The only supported hosts where we care about this currently is AArch64
and x86-64 (For testing purposes).
PPC64LE, MIPS64LE, and RISC-V64 might be interesting to support in the
future; But I don't have any platforms that get anywhere near Cortex-A77
performance in those architectures. Nor do I have the time to bring up
the emulator on them.
x86-64 can get to the compatibility ioctl through the int $0x80 handler.

This does not solve the following problems:
1) compat_alloc_user_space inside ioctl
2) ioctls that check task mode instead of entry point for behaviour
3) ioctls allocating memory
4) struct packing problems between architectures

Workarounds for the problems presented:
1a) Do a stack pivot to the lower 32bits from userspace
  - Forces host 64bit process to have its thread stacks to live in 32bit
  space. Not ideal.
  - Only do a stack pivot on ioctl to save previous 32bit VA space
1b) Teach kernel that compat_alloc_userspace can return a 64bit pointer
  - x86-64 truncates stack from this function
  - AArch64 returns the full stack pointer
  - Only ~29 users. Validating all of them support a 64bit stack is
  trivial?

2a) Any application using these can be checked for compatibility in
userspace and put on a block list.
2b) Fix any ioctls doing broken behaviour based on task mode rather than
ioctl entry point

3a) Userspace consumes all VA space above 32bit. Forcing allocations to
occur in lower 32bits
  - This is the current implementation
3b) Ensure any allocation in the ioctl handles ioctl entrypoint rather
than just allow generic memory allocations in full VA space
  - This is hard to guarantee

4a) Blocklist any application using ioctls that have different struct
packing across the boundary
  - Can happen when struct packing of 32bit x86 application goes down
  the aarch64 compat_ioctl path
  - Userspace is a AArch64 process passing 32bit x86 ioctl structures
  through the compat_ioctl path which is typically for AArch32 processes
  - None currently identified
4b) Work with upstream kernel and userspace projects to evaluate and fix
  - Identify the problem ioctls
  - Implement a new ioctl with more sane struct packing that matches
  cross-arch
  - Implement new ioctl while maintaining backwards compatibility with
  previous ioctl handler
  - Change upstream project to use the new compatibility ioctl
  - ioctl deprecation will be case by case per device and project
4b) Userspace implements a full ioctl emulation layer
  - Parses the full ioctl tree
  - Either passes through ioctls that it doesn't understand or
  transforms ioctls that it knows are trouble
  - Has the downside that it can still run in to edge cases that will
  fail
  - Performance of additional tracking is a concern
  - Prone to failure keeping the kernel ioctl and userspace ioctl
  handling in sync
  - Really want to have it in the kernel space as much as possible

Signed-off-by: Ryan Houdek <Sonicadvance1@gmail.com>
---
 arch/alpha/kernel/syscalls/syscall.tbl      |  1 +
 arch/arm/tools/syscall.tbl                  |  1 +
 arch/arm64/include/asm/unistd.h             |  2 +-
 arch/arm64/include/asm/unistd32.h           |  2 ++
 arch/ia64/kernel/syscalls/syscall.tbl       |  1 +
 arch/m68k/kernel/syscalls/syscall.tbl       |  1 +
 arch/microblaze/kernel/syscalls/syscall.tbl |  1 +
 arch/mips/kernel/syscalls/syscall_n32.tbl   |  1 +
 arch/mips/kernel/syscalls/syscall_n64.tbl   |  2 ++
 arch/mips/kernel/syscalls/syscall_o32.tbl   |  1 +
 arch/parisc/kernel/syscalls/syscall.tbl     |  1 +
 arch/powerpc/kernel/syscalls/syscall.tbl    |  1 +
 arch/s390/kernel/syscalls/syscall.tbl       |  1 +
 arch/sh/kernel/syscalls/syscall.tbl         |  1 +
 arch/sparc/kernel/syscalls/syscall.tbl      |  1 +
 arch/x86/entry/syscalls/syscall_32.tbl      |  1 +
 arch/x86/entry/syscalls/syscall_64.tbl      |  1 +
 arch/xtensa/kernel/syscalls/syscall.tbl     |  1 +
 fs/ioctl.c                                  | 18 ++++++++++++++++--
 include/linux/syscalls.h                    |  4 ++++
 include/uapi/asm-generic/unistd.h           |  9 ++++++++-
 kernel/sys_ni.c                             |  3 +++
 tools/include/uapi/asm-generic/unistd.h     |  9 ++++++++-
 23 files changed, 59 insertions(+), 5 deletions(-)

diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index a6617067dbe6..81e70fd241d7 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -481,3 +481,4 @@
 549	common	faccessat2			sys_faccessat2
 550	common	process_madvise			sys_process_madvise
 551	common	epoll_pwait2			sys_epoll_pwait2
+552	common	ioctl32			sys_ni_syscall
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index 20e1170e2e0a..98fbf1af1169 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -455,3 +455,4 @@
 439	common	faccessat2			sys_faccessat2
 440	common	process_madvise			sys_process_madvise
 441	common	epoll_pwait2			sys_epoll_pwait2
+442	common	ioctl32			sys_ni_syscall
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index 86a9d7b3eabe..949788f5ba40 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -38,7 +38,7 @@
 #define __ARM_NR_compat_set_tls		(__ARM_NR_COMPAT_BASE + 5)
 #define __ARM_NR_COMPAT_END		(__ARM_NR_COMPAT_BASE + 0x800)
 
-#define __NR_compat_syscalls		442
+#define __NR_compat_syscalls		443
 #endif
 
 #define __ARCH_WANT_SYS_CLONE
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index cccfbbefbf95..35e3bc83dbdc 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -891,6 +891,8 @@ __SYSCALL(__NR_faccessat2, sys_faccessat2)
 __SYSCALL(__NR_process_madvise, sys_process_madvise)
 #define __NR_epoll_pwait2 441
 __SYSCALL(__NR_epoll_pwait2, compat_sys_epoll_pwait2)
+#define __NR_ioctl32 442
+__SYSCALL(__NR_ioctl32, compat_sys_ioctl)
 
 /*
  * Please add new compat syscalls above this comment and update
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
index bfc00f2bd437..087fc9627357 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -362,3 +362,4 @@
 439	common	faccessat2			sys_faccessat2
 440	common	process_madvise			sys_process_madvise
 441	common	epoll_pwait2			sys_epoll_pwait2
+442	common	sys_ioctl32			sys_ioctl32
diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
index 7fe4e45c864c..502b2f87ab60 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -441,3 +441,4 @@
 439	common	faccessat2			sys_faccessat2
 440	common	process_madvise			sys_process_madvise
 441	common	epoll_pwait2			sys_epoll_pwait2
+442	common	ioctl32			sys_ni_syscall
diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
index a522adf194ab..e69be6c836d2 100644
--- a/arch/microblaze/kernel/syscalls/syscall.tbl
+++ b/arch/microblaze/kernel/syscalls/syscall.tbl
@@ -447,3 +447,4 @@
 439	common	faccessat2			sys_faccessat2
 440	common	process_madvise			sys_process_madvise
 441	common	epoll_pwait2			sys_epoll_pwait2
+442	common	ioctl32			sys_ni_syscall
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
index 0f03ad223f33..ba395218446f 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -380,3 +380,4 @@
 439	n32	faccessat2			sys_faccessat2
 440	n32	process_madvise			sys_process_madvise
 441	n32	epoll_pwait2			compat_sys_epoll_pwait2
+442	n32	ioctl32			sys_ni_syscall
diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl
index 91649690b52f..f42f939702e2 100644
--- a/arch/mips/kernel/syscalls/syscall_n64.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n64.tbl
@@ -356,3 +356,5 @@
 439	n64	faccessat2			sys_faccessat2
 440	n64	process_madvise			sys_process_madvise
 441	n64	epoll_pwait2			sys_epoll_pwait2
+441	n64	epoll_pwait2			sys_epoll_pwait2
+442	n64	ioctl32			sys_ioctl32
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 4bad0c40aed6..b08ff6066f06 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -429,3 +429,4 @@
 439	o32	faccessat2			sys_faccessat2
 440	o32	process_madvise			sys_process_madvise
 441	o32	epoll_pwait2			sys_epoll_pwait2		compat_sys_epoll_pwait2
+442	o32	ioctl32			sys_ni_syscall
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
index 6bcc31966b44..84d2b88d92fa 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -439,3 +439,4 @@
 439	common	faccessat2			sys_faccessat2
 440	common	process_madvise			sys_process_madvise
 441	common	epoll_pwait2			sys_epoll_pwait2		compat_sys_epoll_pwait2
+442	64	ioctl32			sys_ioctl32
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index f744eb5cba88..9f04d73cf649 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -531,3 +531,4 @@
 439	common	faccessat2			sys_faccessat2
 440	common	process_madvise			sys_process_madvise
 441	common	epoll_pwait2			sys_epoll_pwait2		compat_sys_epoll_pwait2
+442	64	sys_ioctl32				sys_ioctl32
diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
index d443423495e5..2c90c0ecb5c7 100644
--- a/arch/s390/kernel/syscalls/syscall.tbl
+++ b/arch/s390/kernel/syscalls/syscall.tbl
@@ -444,3 +444,4 @@
 439  common	faccessat2		sys_faccessat2			sys_faccessat2
 440  common	process_madvise		sys_process_madvise		sys_process_madvise
 441  common	epoll_pwait2		sys_epoll_pwait2		compat_sys_epoll_pwait2
+442	64	sys_ioctl32			sys_ni_syscall
diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl
index 9df40ac0ebc0..1e02a13fa049 100644
--- a/arch/sh/kernel/syscalls/syscall.tbl
+++ b/arch/sh/kernel/syscalls/syscall.tbl
@@ -444,3 +444,4 @@
 439	common	faccessat2			sys_faccessat2
 440	common	process_madvise			sys_process_madvise
 441	common	epoll_pwait2			sys_epoll_pwait2
+442	common	ioctl32			sys_ni_syscall
diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
index 40d8c7cd8298..f7d24678d0b1 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -487,3 +487,4 @@
 439	common	faccessat2			sys_faccessat2
 440	common	process_madvise			sys_process_madvise
 441	common	epoll_pwait2			sys_epoll_pwait2		compat_sys_epoll_pwait2
+442	64	sys_ioctl32			sys_ioctl32
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 874aeacde2dd..b1a3461e1e20 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -446,3 +446,4 @@
 439	i386	faccessat2		sys_faccessat2
 440	i386	process_madvise		sys_process_madvise
 441	i386	epoll_pwait2		sys_epoll_pwait2		compat_sys_epoll_pwait2
+442	i386	ioctl32		sys_ni_syscall
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 78672124d28b..0250a04df0df 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -363,6 +363,7 @@
 439	common	faccessat2		sys_faccessat2
 440	common	process_madvise		sys_process_madvise
 441	common	epoll_pwait2		sys_epoll_pwait2
+442	64	ioctl32		sys_ioctl32
 
 #
 # Due to a historical design error, certain syscalls are numbered differently
diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl
index 46116a28eeed..34b653b36b7b 100644
--- a/arch/xtensa/kernel/syscalls/syscall.tbl
+++ b/arch/xtensa/kernel/syscalls/syscall.tbl
@@ -412,3 +412,4 @@
 439	common	faccessat2			sys_faccessat2
 440	common	process_madvise			sys_process_madvise
 441	common	epoll_pwait2			sys_epoll_pwait2
+442	common	ioctl32			sys_ni_syscall
diff --git a/fs/ioctl.c b/fs/ioctl.c
index 4e6cc0a7d69c..7b324a21a257 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -790,8 +790,8 @@ long compat_ptr_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 }
 EXPORT_SYMBOL(compat_ptr_ioctl);
 
-COMPAT_SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd,
-		       compat_ulong_t, arg)
+long do_ioctl32(unsigned int fd, unsigned int cmd,
+			compat_ulong_t arg)
 {
 	struct fd f = fdget(fd);
 	int error;
@@ -850,4 +850,18 @@ COMPAT_SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd,
 
 	return error;
 }
+
+COMPAT_SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd,
+			compat_ulong_t, arg)
+{
+	return do_ioctl32(fd, cmd, arg);
+}
+
+#if BITS_PER_LONG == 64
+SYSCALL_DEFINE3(ioctl32, unsigned int, fd, unsigned int, cmd,
+			compat_ulong_t, arg)
+{
+	return do_ioctl32(fd, cmd, arg);
+}
+#endif
 #endif
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index f3929aff39cf..fb7bac17167a 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -386,6 +386,10 @@ asmlinkage long sys_inotify_rm_watch(int fd, __s32 wd);
 /* fs/ioctl.c */
 asmlinkage long sys_ioctl(unsigned int fd, unsigned int cmd,
 				unsigned long arg);
+#if defined(CONFIG_COMPAT) && BITS_PER_LONG == 64
+asmlinkage long sys_ioctl32(unsigned int fd, unsigned int cmd,
+				compat_ulong_t arg);
+#endif
 
 /* fs/ioprio.c */
 asmlinkage long sys_ioprio_set(int which, int who, int ioprio);
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 728752917785..18279e5b7b4f 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -862,8 +862,15 @@ __SYSCALL(__NR_process_madvise, sys_process_madvise)
 #define __NR_epoll_pwait2 441
 __SC_COMP(__NR_epoll_pwait2, sys_epoll_pwait2, compat_sys_epoll_pwait2)
 
+#define __NR_ioctl32 442
+#ifdef CONFIG_COMPAT
+__SC_COMP(__NR_ioctl32, sys_ioctl32, compat_sys_ioctl)
+#else
+__SC_COMP(__NR_ioctl32, sys_ni_syscall, sys_ni_syscall)
+#endif
+
 #undef __NR_syscalls
-#define __NR_syscalls 442
+#define __NR_syscalls 443
 
 /*
  * 32 bit systems traditionally used different
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index 19aa806890d5..5a2f25eb341c 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -302,6 +302,9 @@ COND_SYSCALL(recvmmsg_time32);
 COND_SYSCALL_COMPAT(recvmmsg_time32);
 COND_SYSCALL_COMPAT(recvmmsg_time64);
 
+COND_SYSCALL(ioctl32);
+COND_SYSCALL_COMPAT(ioctl32);
+
 /*
  * Architecture specific syscalls: see further below
  */
diff --git a/tools/include/uapi/asm-generic/unistd.h b/tools/include/uapi/asm-generic/unistd.h
index 728752917785..18279e5b7b4f 100644
--- a/tools/include/uapi/asm-generic/unistd.h
+++ b/tools/include/uapi/asm-generic/unistd.h
@@ -862,8 +862,15 @@ __SYSCALL(__NR_process_madvise, sys_process_madvise)
 #define __NR_epoll_pwait2 441
 __SC_COMP(__NR_epoll_pwait2, sys_epoll_pwait2, compat_sys_epoll_pwait2)
 
+#define __NR_ioctl32 442
+#ifdef CONFIG_COMPAT
+__SC_COMP(__NR_ioctl32, sys_ioctl32, compat_sys_ioctl)
+#else
+__SC_COMP(__NR_ioctl32, sys_ni_syscall, sys_ni_syscall)
+#endif
+
 #undef __NR_syscalls
-#define __NR_syscalls 442
+#define __NR_syscalls 443
 
 /*
  * 32 bit systems traditionally used different
-- 
2.27.0


^ permalink raw reply related

* [PATCH v2] Adds a new ioctl32 syscall for backwards compatibility layers
From: sonicadvance1 @ 2021-01-15  7:17 UTC (permalink / raw)
  Cc: linux-xtensa, Rich Felker, linux-ia64, Miklos Szeredi,
	Dominik Brodowski, linux-mips, James E.J. Bottomley, Max Filippov,
	Paul Mackerras, H. Peter Anvin, sparclinux, Christian Brauner,
	Vincenzo Frascino, Will Deacon, linux-arch, Stephen Rothwell,
	Arnd Bergmann, Yoshinori Sato, linux-sh, Helge Deller, x86,
	YueHaibing, Russell King, Christian Borntraeger, Ingo Molnar,
	Geert Uytterhoeven, linux-arm-kernel, Catalin Marinas,
	Matt Turner, Fenghua Yu, Arnaldo Carvalho de Melo, Vasily Gorbik,
	Ryan Houdek, Brian Gerst, Heiko Carstens, David Rientjes,
	Willem de Bruijn, Nicholas Piggin, Suren Baghdasaryan,
	linux-fsdevel, Aleksa Sarai, Thomas Bogendoerfer, Ivan Kokshaysky,
	Alexander Viro, Andy Lutomirski, Thomas Gleixner, Xiaoming Ni,
	Vlastimil Babka, Richard Henderson, Chris Zankel, Michal Simek,
	Tony Luck, linux-parisc, linux-m68k, linux-s390, linux-kernel,
	Minchan Kim, Eric W. Biederman, linux-alpha, Joe Perches,
	Borislav Petkov, Andrew Morton, linux-api, linuxppc-dev,
	David S. Miller
In-Reply-To: <20210115070326.294332-1-Sonicadvance1@gmail.com>

From: Ryan Houdek <Sonicadvance1@gmail.com>

Problem presented:
A backwards compatibility layer that allows running x86-64 and x86
processes inside of an AArch64 process.
  - CPU is emulated
  - Syscall interface is mostly passthrough
  - Some syscalls require patching or emulation depending on behaviour
  - Not viable from the emulator design to use an AArch32 host process

x86-64 and x86 userspace emulator source:
https://github.com/FEX-Emu/FEX
Usage of ioctl32 is currently in a downstream fork. This will be the
first user of the syscall.

Cross documentation:
https://github.com/FEX-Emu/FEX/wiki/32Bit-x86-Woes#ioctl---54

ioctls are opaque from the emulator perspective and the data wants to be
passed through a syscall as unimpeded as possible.
Sadly due to ioctl struct differences between x86 and x86-64, we need a
syscall that exposes the compatibility ioctl handler to userspace in a
64bit process.

This is necessary behaves of the behaviour differences that occur
between an x86 process doing an ioctl and an x86-64 process doing an
ioctl.

Both of which are captured and passed through the AArch64 ioctl space.
This is implementing a new ioctl32 syscall that allows us to pass 32bit
x86 ioctls through to the kernel with zero or minimal manipulation.

The only supported hosts where we care about this currently is AArch64
and x86-64 (For testing purposes).
PPC64LE, MIPS64LE, and RISC-V64 might be interesting to support in the
future; But I don't have any platforms that get anywhere near Cortex-A77
performance in those architectures. Nor do I have the time to bring up
the emulator on them.
x86-64 can get to the compatibility ioctl through the int $0x80 handler.

This does not solve the following problems:
1) compat_alloc_user_space inside ioctl
2) ioctls that check task mode instead of entry point for behaviour
3) ioctls allocating memory
4) struct packing problems between architectures

Workarounds for the problems presented:
1a) Do a stack pivot to the lower 32bits from userspace
  - Forces host 64bit process to have its thread stacks to live in 32bit
  space. Not ideal.
  - Only do a stack pivot on ioctl to save previous 32bit VA space
1b) Teach kernel that compat_alloc_userspace can return a 64bit pointer
  - x86-64 truncates stack from this function
  - AArch64 returns the full stack pointer
  - Only ~29 users. Validating all of them support a 64bit stack is
  trivial?

2a) Any application using these can be checked for compatibility in
userspace and put on a block list.
2b) Fix any ioctls doing broken behaviour based on task mode rather than
ioctl entry point

3a) Userspace consumes all VA space above 32bit. Forcing allocations to
occur in lower 32bits
  - This is the current implementation
3b) Ensure any allocation in the ioctl handles ioctl entrypoint rather
than just allow generic memory allocations in full VA space
  - This is hard to guarantee

4a) Blocklist any application using ioctls that have different struct
packing across the boundary
  - Can happen when struct packing of 32bit x86 application goes down
  the aarch64 compat_ioctl path
  - Userspace is a AArch64 process passing 32bit x86 ioctl structures
  through the compat_ioctl path which is typically for AArch32 processes
  - None currently identified
4b) Work with upstream kernel and userspace projects to evaluate and fix
  - Identify the problem ioctls
  - Implement a new ioctl with more sane struct packing that matches
  cross-arch
  - Implement new ioctl while maintaining backwards compatibility with
  previous ioctl handler
  - Change upstream project to use the new compatibility ioctl
  - ioctl deprecation will be case by case per device and project
4b) Userspace implements a full ioctl emulation layer
  - Parses the full ioctl tree
  - Either passes through ioctls that it doesn't understand or
  transforms ioctls that it knows are trouble
  - Has the downside that it can still run in to edge cases that will
  fail
  - Performance of additional tracking is a concern
  - Prone to failure keeping the kernel ioctl and userspace ioctl
  handling in sync
  - Really want to have it in the kernel space as much as possible

Changes in v2:
- Added the syscall to all architecture tables
- Disabled on 32bit and BE platforms. They can call ioctl directly.
- Disabled on x86-64 as well since you can call this from ia32 or x32
dispatch tables

Signed-off-by: Ryan Houdek <Sonicadvance1@gmail.com>
---
 arch/alpha/kernel/syscalls/syscall.tbl      |  1 +
 arch/arm/tools/syscall.tbl                  |  1 +
 arch/arm64/include/asm/unistd.h             |  2 +-
 arch/arm64/include/asm/unistd32.h           |  2 ++
 arch/ia64/kernel/syscalls/syscall.tbl       |  1 +
 arch/m68k/kernel/syscalls/syscall.tbl       |  1 +
 arch/microblaze/kernel/syscalls/syscall.tbl |  1 +
 arch/mips/kernel/syscalls/syscall_n32.tbl   |  1 +
 arch/mips/kernel/syscalls/syscall_n64.tbl   |  2 ++
 arch/mips/kernel/syscalls/syscall_o32.tbl   |  1 +
 arch/parisc/kernel/syscalls/syscall.tbl     |  1 +
 arch/powerpc/kernel/syscalls/syscall.tbl    |  1 +
 arch/s390/kernel/syscalls/syscall.tbl       |  1 +
 arch/sh/kernel/syscalls/syscall.tbl         |  1 +
 arch/sparc/kernel/syscalls/syscall.tbl      |  1 +
 arch/x86/entry/syscalls/syscall_32.tbl      |  1 +
 arch/x86/entry/syscalls/syscall_64.tbl      |  1 +
 arch/xtensa/kernel/syscalls/syscall.tbl     |  1 +
 fs/ioctl.c                                  | 18 ++++++++++++++++--
 include/linux/syscalls.h                    |  4 ++++
 include/uapi/asm-generic/unistd.h           |  9 ++++++++-
 kernel/sys_ni.c                             |  3 +++
 tools/include/uapi/asm-generic/unistd.h     |  9 ++++++++-
 23 files changed, 59 insertions(+), 5 deletions(-)

diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index a6617067dbe6..81e70fd241d7 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -481,3 +481,4 @@
 549	common	faccessat2			sys_faccessat2
 550	common	process_madvise			sys_process_madvise
 551	common	epoll_pwait2			sys_epoll_pwait2
+552	common	ioctl32			sys_ni_syscall
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index 20e1170e2e0a..98fbf1af1169 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -455,3 +455,4 @@
 439	common	faccessat2			sys_faccessat2
 440	common	process_madvise			sys_process_madvise
 441	common	epoll_pwait2			sys_epoll_pwait2
+442	common	ioctl32			sys_ni_syscall
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index 86a9d7b3eabe..949788f5ba40 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -38,7 +38,7 @@
 #define __ARM_NR_compat_set_tls		(__ARM_NR_COMPAT_BASE + 5)
 #define __ARM_NR_COMPAT_END		(__ARM_NR_COMPAT_BASE + 0x800)
 
-#define __NR_compat_syscalls		442
+#define __NR_compat_syscalls		443
 #endif
 
 #define __ARCH_WANT_SYS_CLONE
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index cccfbbefbf95..35e3bc83dbdc 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -891,6 +891,8 @@ __SYSCALL(__NR_faccessat2, sys_faccessat2)
 __SYSCALL(__NR_process_madvise, sys_process_madvise)
 #define __NR_epoll_pwait2 441
 __SYSCALL(__NR_epoll_pwait2, compat_sys_epoll_pwait2)
+#define __NR_ioctl32 442
+__SYSCALL(__NR_ioctl32, compat_sys_ioctl)
 
 /*
  * Please add new compat syscalls above this comment and update
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
index bfc00f2bd437..087fc9627357 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -362,3 +362,4 @@
 439	common	faccessat2			sys_faccessat2
 440	common	process_madvise			sys_process_madvise
 441	common	epoll_pwait2			sys_epoll_pwait2
+442	common	sys_ioctl32			sys_ioctl32
diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
index 7fe4e45c864c..502b2f87ab60 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -441,3 +441,4 @@
 439	common	faccessat2			sys_faccessat2
 440	common	process_madvise			sys_process_madvise
 441	common	epoll_pwait2			sys_epoll_pwait2
+442	common	ioctl32			sys_ni_syscall
diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
index a522adf194ab..e69be6c836d2 100644
--- a/arch/microblaze/kernel/syscalls/syscall.tbl
+++ b/arch/microblaze/kernel/syscalls/syscall.tbl
@@ -447,3 +447,4 @@
 439	common	faccessat2			sys_faccessat2
 440	common	process_madvise			sys_process_madvise
 441	common	epoll_pwait2			sys_epoll_pwait2
+442	common	ioctl32			sys_ni_syscall
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
index 0f03ad223f33..ba395218446f 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -380,3 +380,4 @@
 439	n32	faccessat2			sys_faccessat2
 440	n32	process_madvise			sys_process_madvise
 441	n32	epoll_pwait2			compat_sys_epoll_pwait2
+442	n32	ioctl32			sys_ni_syscall
diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl
index 91649690b52f..f42f939702e2 100644
--- a/arch/mips/kernel/syscalls/syscall_n64.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n64.tbl
@@ -356,3 +356,5 @@
 439	n64	faccessat2			sys_faccessat2
 440	n64	process_madvise			sys_process_madvise
 441	n64	epoll_pwait2			sys_epoll_pwait2
+441	n64	epoll_pwait2			sys_epoll_pwait2
+442	n64	ioctl32			sys_ioctl32
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 4bad0c40aed6..b08ff6066f06 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -429,3 +429,4 @@
 439	o32	faccessat2			sys_faccessat2
 440	o32	process_madvise			sys_process_madvise
 441	o32	epoll_pwait2			sys_epoll_pwait2		compat_sys_epoll_pwait2
+442	o32	ioctl32			sys_ni_syscall
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
index 6bcc31966b44..84d2b88d92fa 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -439,3 +439,4 @@
 439	common	faccessat2			sys_faccessat2
 440	common	process_madvise			sys_process_madvise
 441	common	epoll_pwait2			sys_epoll_pwait2		compat_sys_epoll_pwait2
+442	64	ioctl32			sys_ioctl32
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index f744eb5cba88..9f04d73cf649 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -531,3 +531,4 @@
 439	common	faccessat2			sys_faccessat2
 440	common	process_madvise			sys_process_madvise
 441	common	epoll_pwait2			sys_epoll_pwait2		compat_sys_epoll_pwait2
+442	64	sys_ioctl32				sys_ioctl32
diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
index d443423495e5..2c90c0ecb5c7 100644
--- a/arch/s390/kernel/syscalls/syscall.tbl
+++ b/arch/s390/kernel/syscalls/syscall.tbl
@@ -444,3 +444,4 @@
 439  common	faccessat2		sys_faccessat2			sys_faccessat2
 440  common	process_madvise		sys_process_madvise		sys_process_madvise
 441  common	epoll_pwait2		sys_epoll_pwait2		compat_sys_epoll_pwait2
+442	64	sys_ioctl32			sys_ni_syscall
diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl
index 9df40ac0ebc0..1e02a13fa049 100644
--- a/arch/sh/kernel/syscalls/syscall.tbl
+++ b/arch/sh/kernel/syscalls/syscall.tbl
@@ -444,3 +444,4 @@
 439	common	faccessat2			sys_faccessat2
 440	common	process_madvise			sys_process_madvise
 441	common	epoll_pwait2			sys_epoll_pwait2
+442	common	ioctl32			sys_ni_syscall
diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
index 40d8c7cd8298..f7d24678d0b1 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -487,3 +487,4 @@
 439	common	faccessat2			sys_faccessat2
 440	common	process_madvise			sys_process_madvise
 441	common	epoll_pwait2			sys_epoll_pwait2		compat_sys_epoll_pwait2
+442	64	sys_ioctl32			sys_ioctl32
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 874aeacde2dd..b1a3461e1e20 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -446,3 +446,4 @@
 439	i386	faccessat2		sys_faccessat2
 440	i386	process_madvise		sys_process_madvise
 441	i386	epoll_pwait2		sys_epoll_pwait2		compat_sys_epoll_pwait2
+442	i386	ioctl32		sys_ni_syscall
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 78672124d28b..0250a04df0df 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -363,6 +363,7 @@
 439	common	faccessat2		sys_faccessat2
 440	common	process_madvise		sys_process_madvise
 441	common	epoll_pwait2		sys_epoll_pwait2
+442	64	ioctl32		sys_ioctl32
 
 #
 # Due to a historical design error, certain syscalls are numbered differently
diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl
index 46116a28eeed..34b653b36b7b 100644
--- a/arch/xtensa/kernel/syscalls/syscall.tbl
+++ b/arch/xtensa/kernel/syscalls/syscall.tbl
@@ -412,3 +412,4 @@
 439	common	faccessat2			sys_faccessat2
 440	common	process_madvise			sys_process_madvise
 441	common	epoll_pwait2			sys_epoll_pwait2
+442	common	ioctl32			sys_ni_syscall
diff --git a/fs/ioctl.c b/fs/ioctl.c
index 4e6cc0a7d69c..7b324a21a257 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -790,8 +790,8 @@ long compat_ptr_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 }
 EXPORT_SYMBOL(compat_ptr_ioctl);
 
-COMPAT_SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd,
-		       compat_ulong_t, arg)
+long do_ioctl32(unsigned int fd, unsigned int cmd,
+			compat_ulong_t arg)
 {
 	struct fd f = fdget(fd);
 	int error;
@@ -850,4 +850,18 @@ COMPAT_SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd,
 
 	return error;
 }
+
+COMPAT_SYSCALL_DEFINE3(ioctl, unsigned int, fd, unsigned int, cmd,
+			compat_ulong_t, arg)
+{
+	return do_ioctl32(fd, cmd, arg);
+}
+
+#if BITS_PER_LONG == 64
+SYSCALL_DEFINE3(ioctl32, unsigned int, fd, unsigned int, cmd,
+			compat_ulong_t, arg)
+{
+	return do_ioctl32(fd, cmd, arg);
+}
+#endif
 #endif
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index f3929aff39cf..fb7bac17167a 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -386,6 +386,10 @@ asmlinkage long sys_inotify_rm_watch(int fd, __s32 wd);
 /* fs/ioctl.c */
 asmlinkage long sys_ioctl(unsigned int fd, unsigned int cmd,
 				unsigned long arg);
+#if defined(CONFIG_COMPAT) && BITS_PER_LONG == 64
+asmlinkage long sys_ioctl32(unsigned int fd, unsigned int cmd,
+				compat_ulong_t arg);
+#endif
 
 /* fs/ioprio.c */
 asmlinkage long sys_ioprio_set(int which, int who, int ioprio);
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 728752917785..18279e5b7b4f 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -862,8 +862,15 @@ __SYSCALL(__NR_process_madvise, sys_process_madvise)
 #define __NR_epoll_pwait2 441
 __SC_COMP(__NR_epoll_pwait2, sys_epoll_pwait2, compat_sys_epoll_pwait2)
 
+#define __NR_ioctl32 442
+#ifdef CONFIG_COMPAT
+__SC_COMP(__NR_ioctl32, sys_ioctl32, compat_sys_ioctl)
+#else
+__SC_COMP(__NR_ioctl32, sys_ni_syscall, sys_ni_syscall)
+#endif
+
 #undef __NR_syscalls
-#define __NR_syscalls 442
+#define __NR_syscalls 443
 
 /*
  * 32 bit systems traditionally used different
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index 19aa806890d5..5a2f25eb341c 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -302,6 +302,9 @@ COND_SYSCALL(recvmmsg_time32);
 COND_SYSCALL_COMPAT(recvmmsg_time32);
 COND_SYSCALL_COMPAT(recvmmsg_time64);
 
+COND_SYSCALL(ioctl32);
+COND_SYSCALL_COMPAT(ioctl32);
+
 /*
  * Architecture specific syscalls: see further below
  */
diff --git a/tools/include/uapi/asm-generic/unistd.h b/tools/include/uapi/asm-generic/unistd.h
index 728752917785..18279e5b7b4f 100644
--- a/tools/include/uapi/asm-generic/unistd.h
+++ b/tools/include/uapi/asm-generic/unistd.h
@@ -862,8 +862,15 @@ __SYSCALL(__NR_process_madvise, sys_process_madvise)
 #define __NR_epoll_pwait2 441
 __SC_COMP(__NR_epoll_pwait2, sys_epoll_pwait2, compat_sys_epoll_pwait2)
 
+#define __NR_ioctl32 442
+#ifdef CONFIG_COMPAT
+__SC_COMP(__NR_ioctl32, sys_ioctl32, compat_sys_ioctl)
+#else
+__SC_COMP(__NR_ioctl32, sys_ni_syscall, sys_ni_syscall)
+#endif
+
 #undef __NR_syscalls
-#define __NR_syscalls 442
+#define __NR_syscalls 443
 
 /*
  * 32 bit systems traditionally used different
-- 
2.27.0


^ permalink raw reply related

* [PATCH] powerpc: dts: p2020rdb: add missing peripherials
From: Pawel Dembicki @ 2021-01-15  7:37 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: devicetree, linux-kernel, Rob Herring, Pawel Dembicki,
	Paul Mackerras

This patch adds dts entry for some peripherials:
- i2c: temperature sensor ADT7461
- i2c: eeprom m24256
- i2c: eeprom at24c01
- i2c: pmic zl2006
- i2c: gpio expander
- phy: reset pins for phy
- dsa: switch vsc7385

It was required to adjust rgmii settings for enet0 because switch with
dsa driver act different without 8081 sterring.

Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
---
 arch/powerpc/boot/dts/fsl/p2020rdb.dts | 73 ++++++++++++++++++++++++--
 1 file changed, 70 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/boot/dts/fsl/p2020rdb.dts b/arch/powerpc/boot/dts/fsl/p2020rdb.dts
index 3acd3890b397..1f2ddeca0375 100644
--- a/arch/powerpc/boot/dts/fsl/p2020rdb.dts
+++ b/arch/powerpc/boot/dts/fsl/p2020rdb.dts
@@ -7,6 +7,9 @@
 
 /include/ "p2020si-pre.dtsi"
 
+#include <dt-bindings/gpio/gpio.h>
+#include <dt-bindings/input/input.h>
+
 / {
 	model = "fsl,P2020RDB";
 	compatible = "fsl,P2020RDB";
@@ -131,22 +134,84 @@ partition@1100000 {
 		L2switch@2,0 {
 			#address-cells = <1>;
 			#size-cells = <1>;
-			compatible = "vitesse-7385";
+			compatible = "vitesse,vsc7385";
 			reg = <0x2 0x0 0x20000>;
-		};
+			reset-gpios = <&gpio0 12 GPIO_ACTIVE_LOW>;
+
+			ports {
+				#address-cells = <1>;
+				#size-cells = <0>;
 
+				port@1 {
+					reg = <1>;
+					label = "e1-sw-p1";
+				};
+				port@2 {
+					reg = <2>;
+					label = "e1-sw-p2";
+				};
+				port@3 {
+					reg = <3>;
+					label = "e1-sw-p3";
+				};
+				port@4 {
+					reg = <4>;
+					label = "e1-sw-p4";
+				};
+				port@6 {
+					reg = <6>;
+					label = "cpu";
+					ethernet = <&enet0>;
+					phy-mode = "rgmii";
+					fixed-link {
+						speed = <1000>;
+						full-duplex;
+						pause;
+					};
+				};
+			};
+		};
 	};
 
 	soc: soc@ffe00000 {
 		ranges = <0x0 0x0 0xffe00000 0x100000>;
 
+		gpio0: gpio-controller@fc00 {
+		};
+
 		i2c@3000 {
+			temperature-sensor@4c {
+				compatible = "adi,adt7461";
+				reg = <0x4c>;
+			};
+
+			eeprom@50 {
+				compatible = "atmel,24c256";
+				reg = <0x50>;
+			};
 			rtc@68 {
 				compatible = "dallas,ds1339";
 				reg = <0x68>;
 			};
 		};
 
+		i2c@3100 {
+			pmic@11 {
+				compatible = "zl2006";
+				reg = <0x11>;
+			};
+
+			gpio@18 {
+				compatible = "nxp,pca9557";
+				reg = <0x18>;
+			};
+
+			eeprom@52 {
+				compatible = "atmel,24c01";
+				reg = <0x52>;
+			};
+		};
+
 		spi@7000 {
 			flash@0 {
 				#address-cells = <1>;
@@ -200,10 +265,12 @@ mdio@24520 {
 			phy0: ethernet-phy@0 {
 				interrupts = <3 1 0 0>;
 				reg = <0x0>;
+				reset-gpios = <&gpio0 14 GPIO_ACTIVE_LOW>;
 			};
 			phy1: ethernet-phy@1 {
 				interrupts = <3 1 0 0>;
 				reg = <0x1>;
+				reset-gpios = <&gpio0 6 GPIO_ACTIVE_LOW>;
 			};
 			tbi-phy@2 {
 				device_type = "tbi-phy";
@@ -233,7 +300,7 @@ ptp_clock@24e00 {
 
 		enet0: ethernet@24000 {
 			fixed-link = <1 1 1000 0 0>;
-			phy-connection-type = "rgmii-id";
+			phy-connection-type = "rgmii";
 		};
 
 		enet1: ethernet@25000 {
-- 
2.25.1


^ permalink raw reply related

* Re: [PATCH 1/3] tty: hvcs: Drop unnecessary if block
From: Jiri Slaby @ 2021-01-15  8:19 UTC (permalink / raw)
  To: Uwe Kleine-König, Greg Kroah-Hartman
  Cc: sparclinux, linuxppc-dev, David S . Miller, linux-kernel
In-Reply-To: <20210114175718.137483-2-u.kleine-koenig@pengutronix.de>

On 14. 01. 21, 18:57, Uwe Kleine-König wrote:
> If hvcs_probe() succeeded dev_set_drvdata() is called with a non-NULL
> value, and if hvcs_probe() failed hvcs_remove() isn't called.
> 
> So there is no way dev_get_drvdata() can return NULL in hvcs_remove() and
> the check can just go away.
> 
> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

Reviewed-by: Jiri Slaby <jirislaby@kernel.org>

> ---
>   drivers/tty/hvc/hvcs.c | 3 ---
>   1 file changed, 3 deletions(-)
> 
> diff --git a/drivers/tty/hvc/hvcs.c b/drivers/tty/hvc/hvcs.c
> index 509d1042825a..3e0461285c34 100644
> --- a/drivers/tty/hvc/hvcs.c
> +++ b/drivers/tty/hvc/hvcs.c
> @@ -825,9 +825,6 @@ static int hvcs_remove(struct vio_dev *dev)
>   	unsigned long flags;
>   	struct tty_struct *tty;
>   
> -	if (!hvcsd)
> -		return -ENODEV;
> -
>   	/* By this time the vty-server won't be getting any more interrupts */
>   
>   	spin_lock_irqsave(&hvcsd->lock, flags);
> 


-- 
js

^ permalink raw reply

* Re: [PATCH 2/3] tty: vcc: Drop unnecessary if block
From: Jiri Slaby @ 2021-01-15  8:20 UTC (permalink / raw)
  To: Uwe Kleine-König, Greg Kroah-Hartman
  Cc: sparclinux, linuxppc-dev, David S . Miller, linux-kernel
In-Reply-To: <20210114175718.137483-3-u.kleine-koenig@pengutronix.de>

On 14. 01. 21, 18:57, Uwe Kleine-König wrote:
> If vcc_probe() succeeded dev_set_drvdata() is called with a non-NULL
> value, and if vcc_probe() failed vcc_remove() isn't called.
> 
> So there is no way dev_get_drvdata() can return NULL in vcc_remove() and
> the check can just go away.
> 
> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

Reviewed-by: Jiri Slaby <jirislaby@kernel.org>

> ---
>   drivers/tty/vcc.c | 3 ---
>   1 file changed, 3 deletions(-)
> 
> diff --git a/drivers/tty/vcc.c b/drivers/tty/vcc.c
> index 9ffd42e333b8..d9b0dc6deae9 100644
> --- a/drivers/tty/vcc.c
> +++ b/drivers/tty/vcc.c
> @@ -681,9 +681,6 @@ static int vcc_remove(struct vio_dev *vdev)
>   {
>   	struct vcc_port *port = dev_get_drvdata(&vdev->dev);
>   
> -	if (!port)
> -		return -ENODEV;
> -
>   	del_timer_sync(&port->rx_timer);
>   	del_timer_sync(&port->tx_timer);
>   
> 


-- 
js

^ permalink raw reply

* Re: [PATCH 3/3] tty: vcc: Drop impossible to hit WARN_ON
From: Jiri Slaby @ 2021-01-15  8:20 UTC (permalink / raw)
  To: Uwe Kleine-König, Greg Kroah-Hartman
  Cc: sparclinux, linuxppc-dev, David S . Miller, linux-kernel
In-Reply-To: <20210114175718.137483-4-u.kleine-koenig@pengutronix.de>

On 14. 01. 21, 18:57, Uwe Kleine-König wrote:
> vcc_get() returns the port that has provided port->index. As the port that
> is about to be removed isn't removed yet this trivially will find this
> port. So simplify the call to not assign an identical value to the port
> pointer and drop the warning that is never hit.
> 
> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

Reviewed-by: Jiri Slaby <jirislaby@kernel.org>

> ---
>   drivers/tty/vcc.c | 7 ++-----
>   1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/tty/vcc.c b/drivers/tty/vcc.c
> index d9b0dc6deae9..e2d6205f83ce 100644
> --- a/drivers/tty/vcc.c
> +++ b/drivers/tty/vcc.c
> @@ -692,12 +692,9 @@ static int vcc_remove(struct vio_dev *vdev)
>   		tty_vhangup(port->tty);
>   
>   	/* Get exclusive reference to VCC, ensures that there are no other
> -	 * clients to this port
> +	 * clients to this port. This cannot fail.
>   	 */
> -	port = vcc_get(port->index, true);
> -
> -	if (WARN_ON(!port))
> -		return -ENODEV;
> +	vcc_get(port->index, true);
>   
>   	tty_unregister_device(vcc_tty_driver, port->index);
>   
> 


-- 
js

^ permalink raw reply

* Re: [PATCH v2 0/7] Rid W=1 warnings in Ethernet
From: Lee Jones @ 2021-01-15 11:18 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Paul Durrant, Kurt Kanzenbach, Alexei Starovoitov,
	Gustavo A. R. Silva, Peter Cammaert, Paul Mackerras,
	Sukadev Bhattiprolu, Wei Liu, Daniel Borkmann, John Fastabend,
	Santiago Leon, xen-devel, Grygorii Strashko, Thomas Falcon,
	Jesper Dangaard Brouer, Jens Osterkamp, Rusty Russell,
	Daris A Nevil, Lijun Pan, Ivan Khoronzhuk, Nicolas Pitre,
	Geoff Levand, netdev, linux-kernel, Erik Stahlman, John Allen,
	Utz Bacher, Dany Madden, bpf, linuxppc-dev, David S. Miller,
	Russell King
In-Reply-To: <20210114195422.GB3975472@dell>

On Thu, 14 Jan 2021, Lee Jones wrote:

> On Thu, 14 Jan 2021, Jakub Kicinski wrote:
> 
> > On Thu, 14 Jan 2021 08:33:49 +0000 Lee Jones wrote:
> > > On Wed, 13 Jan 2021, Jakub Kicinski wrote:
> > > 
> > > > On Wed, 13 Jan 2021 16:41:16 +0000 Lee Jones wrote:  
> > > > > Resending the stragglers again.                                                                                  
> > > > > 
> > > > > This set is part of a larger effort attempting to clean-up W=1                                                   
> > > > > kernel builds, which are currently overwhelmingly riddled with                                                   
> > > > > niggly little warnings.                                                                                          
> > > > >                                                                                                                  
> > > > > v2:                                                                                                              
> > > > >  - Squashed IBM patches                                                                                      
> > > > >  - Fixed real issue in SMSC
> > > > >  - Added Andrew's Reviewed-by tags on remainder  
> > > > 
> > > > Does not apply, please rebase on net-next/master.  
> > > 
> > > These are based on Tuesday's next/master.
> > 
> > What's next/master?
> 
> I'm not sure if this is a joke, or not? :)
> 
> next/master == Linux Next.  The daily merged repo where all of the
> *-next branches end up to ensure interoperability.  It's also the
> branch that is most heavily tested by the auto-builders to ensure the
> vast majority of issues are ironed out before hitting Mainline.
> 
> > This is net-next:
> > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/
> 
> Looks like net-next gets merged into next/master:
> 
> commit 452958f1f3d1c8980a8414f9c37c8c6de24c7d32
> Merge: 1eabba209a17a f50e2f9f79164
> Author: Stephen Rothwell <sfr@canb.auug.org.au>
> Date:   Thu Jan 14 10:35:40 2021 +1100
> 
>     Merge remote-tracking branch 'net-next/master'
> 
> So I'm not sure what it's conflicting with.
> 
> Do you have patches in net-next that didn't make it into next/master
> for some reason?
> 
> I'll try to rebase again tomorrow.
> 
> Hopefully I am able to reproduce your issue by then.

Okay so my development branch rebased again with no issue.

I also took the liberty to checkout net-next and cherry-pick the
patches [0], which again didn't cause a problem.

I'm not sure what else to suggest.  Is your local copy up-to-date?

[0]

lee@dell:~/projects/linux/kernel [net-next]$ gcp 0cea4b05acd57..924e1f46aba5e
Auto-merging drivers/net/ethernet/smsc/smc91x.c
[net-next 19811db3120a2] net: ethernet: smsc: smc91x: Fix function name in kernel-doc header
 Date: Mon Oct 19 12:47:11 2020 +0100
 1 file changed, 1 insertion(+), 1 deletion(-)
[net-next dc6f4490cb64e] net: xen-netback: xenbus: Demote nonconformant kernel-doc headers
 Date: Tue Oct 20 08:00:43 2020 +0100
 1 file changed, 2 insertions(+), 2 deletions(-)
[net-next a3363cb09ae59] net: ethernet: ti: am65-cpsw-qos: Demote non-conformant function header
 Date: Tue Oct 20 09:05:38 2020 +0100
 1 file changed, 1 insertion(+), 1 deletion(-)
[net-next 005462f886c3e] net: ethernet: ti: am65-cpts: Document am65_cpts_rx_enable()'s 'en' parameter
 Date: Tue Oct 20 09:06:57 2020 +0100
 1 file changed, 1 insertion(+), 1 deletion(-)
Auto-merging drivers/net/xen-netfront.c
[net-next 9fcc32e395e09] net: ethernet: ibm: ibmvnic: Fix some kernel-doc misdemeanours
 Date: Tue Oct 20 09:42:19 2020 +0100
 2 files changed, 16 insertions(+), 17 deletions(-)
[net-next 15ba865fef481] net: ethernet: toshiba: ps3_gelic_net: Fix some kernel-doc misdemeanours
 Date: Tue Oct 20 10:10:14 2020 +0100
 1 file changed, 5 insertions(+), 3 deletions(-)
[net-next f815eb7cbd7f5] net: ethernet: toshiba: spider_net: Document a whole bunch of function parameters
 Date: Tue Oct 20 10:18:21 2020 +0100
 1 file changed, 11 insertions(+), 7 deletions(-)

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog

^ permalink raw reply

* Re: [PATCH -next] pci/controller/dwc: convert comma to semicolon
From: Lorenzo Pieralisi @ 2021-01-15 11:36 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: robh, roy.zang, minghuan.Lian, linux-pci, linux-kernel,
	Zheng Yongjun, linux-arm-kernel, linuxppc-dev, mingkai.hu
In-Reply-To: <20210106190722.GA1327553@bjorn-Precision-5520>

On Wed, Jan 06, 2021 at 01:07:22PM -0600, Bjorn Helgaas wrote:
> On Wed, Dec 16, 2020 at 09:19:44PM +0800, Zheng Yongjun wrote:
> > Replace a comma between expression statements by a semicolon.
> 
> Looks like a good fix, but read this about the changelog title:
> 
> https://lore.kernel.org/r/20171026223701.GA25649@bhelgaas-glaptop.roam.corp.google.com

I would request NXP maintainers to take this patch, rewrite it as
Bjorn requested and resend it as fast as possible, this is a very
relevant fix.

Thanks,
Lorenzo

> > Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
> > ---
> >  drivers/pci/controller/dwc/pci-layerscape-ep.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/pci/controller/dwc/pci-layerscape-ep.c b/drivers/pci/controller/dwc/pci-layerscape-ep.c
> > index 84206f265e54..917ba8d254fc 100644
> > --- a/drivers/pci/controller/dwc/pci-layerscape-ep.c
> > +++ b/drivers/pci/controller/dwc/pci-layerscape-ep.c
> > @@ -178,7 +178,7 @@ static int __init ls_pcie_ep_probe(struct platform_device *pdev)
> >  	pci->dev = dev;
> >  	pci->ops = pcie->drvdata->dw_pcie_ops;
> >  
> > -	ls_epc->bar_fixed_64bit = (1 << BAR_2) | (1 << BAR_4),
> > +	ls_epc->bar_fixed_64bit = (1 << BAR_2) | (1 << BAR_4);
> >  
> >  	pcie->pci = pci;
> >  	pcie->ls_epc = ls_epc;
> > -- 
> > 2.22.0
> > 
> > 
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply

* Re: [PATCH 5/6] powerpc/rtas: rename RTAS_RMOBUF_MAX to RTAS_USER_REGION_SIZE
From: kernel test robot @ 2021-01-15 12:04 UTC (permalink / raw)
  To: Nathan Lynch, linuxppc-dev
  Cc: tyreld, kbuild-all, ajd, aik, aneesh.kumar, brking
In-Reply-To: <20210114220004.1138993-6-nathanl@linux.ibm.com>

[-- Attachment #1: Type: text/plain, Size: 2928 bytes --]

Hi Nathan,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on v5.11-rc3 next-20210115]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Nathan-Lynch/powerpc-rtas-miscellaneous-cleanups-user-region-allocation/20210115-113045
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-allyesconfig (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/656368735841abab983fa593b2ddae52a6273a71
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Nathan-Lynch/powerpc-rtas-miscellaneous-cleanups-user-region-allocation/20210115-113045
        git checkout 656368735841abab983fa593b2ddae52a6273a71
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   arch/powerpc/kernel/rtas.c: In function 'in_rmo_buf':
>> arch/powerpc/kernel/rtas.c:990:26: error: 'RTAS_RMOBUF_MAX' undeclared (first use in this function)
     990 |   base < (rtas_rmo_buf + RTAS_RMOBUF_MAX) &&
         |                          ^~~~~~~~~~~~~~~
   arch/powerpc/kernel/rtas.c:990:26: note: each undeclared identifier is reported only once for each function it appears in
   arch/powerpc/kernel/rtas.c:994:1: error: control reaches end of non-void function [-Werror=return-type]
     994 | }
         | ^
   cc1: some warnings being treated as errors


vim +/RTAS_RMOBUF_MAX +990 arch/powerpc/kernel/rtas.c

bd59380c5ba4147d Andrew Donnellan 2020-08-20  986  
bd59380c5ba4147d Andrew Donnellan 2020-08-20  987  static bool in_rmo_buf(u32 base, u32 end)
bd59380c5ba4147d Andrew Donnellan 2020-08-20  988  {
bd59380c5ba4147d Andrew Donnellan 2020-08-20  989  	return base >= rtas_rmo_buf &&
bd59380c5ba4147d Andrew Donnellan 2020-08-20 @990  		base < (rtas_rmo_buf + RTAS_RMOBUF_MAX) &&
bd59380c5ba4147d Andrew Donnellan 2020-08-20  991  		base <= end &&
bd59380c5ba4147d Andrew Donnellan 2020-08-20  992  		end >= rtas_rmo_buf &&
bd59380c5ba4147d Andrew Donnellan 2020-08-20  993  		end < (rtas_rmo_buf + RTAS_RMOBUF_MAX);
bd59380c5ba4147d Andrew Donnellan 2020-08-20  994  }
bd59380c5ba4147d Andrew Donnellan 2020-08-20  995  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 72490 bytes --]

^ permalink raw reply

* Re: [PATCH v3] powerpc/vdso: fix clock_gettime_fallback for vdso32
From: Michael Ellerman @ 2021-01-15 12:23 UTC (permalink / raw)
  To: Paul Mackerras, Benjamin Herrenschmidt, Michael Ellerman, schwab,
	Christophe Leroy
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <0c0ab0eb3cc80687c326f76ff0dd5762b8812ecc.1610452505.git.christophe.leroy@csgroup.eu>

On Tue, 12 Jan 2021 11:55:15 +0000 (UTC), Christophe Leroy wrote:
> The second argument of __kernel_clock_gettime64 points to a struct
> __kernel_timespec, with 64-bit time_t, so use the clock_gettime64 syscall
> in the fallback function for the 32-bit vdso.  Similarily,
> clock_getres_fallback should use the clock_getres_time64 syscall, though
> it isn't yet called from the 32-bit vdso.
> 
> [chleroy: Moved into a single #ifdef __powerpc64__ block]

Applied to powerpc/fixes.

[1/1] powerpc/vdso: Fix clock_gettime_fallback for vdso32
      https://git.kernel.org/powerpc/c/41131a5e54ae7ba5a2bb8d7b30d1818b3f5b13d2

cheers

^ permalink raw reply

* Re: [PATCH v2] powerpc: fix alignment bug whithin the init sections
From: Michael Ellerman @ 2021-01-15 12:23 UTC (permalink / raw)
  To: Ariel Marcovitch, mpe
  Cc: keescook, maskray, linux-kernel, npiggin, oss, paulus,
	ariel.marcovitch, naveen.n.rao, linuxppc-dev, dja
In-Reply-To: <20210102201156.10805-1-ariel.marcovitch@gmail.com>

On Sat, 2 Jan 2021 22:11:56 +0200, Ariel Marcovitch wrote:
> This is a bug that causes early crashes in builds with a
> .exit.text section smaller than a page and a .init.text section that
> ends in the beginning of a physical page (this is kinda random, which
> might explain why this wasn't really encountered before).
> 
> The init sections are ordered like this:
> 	.init.text
> 	.exit.text
> 	.init.data
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc: Fix alignment bug within the init sections
      https://git.kernel.org/powerpc/c/2225a8dda263edc35a0e8b858fe2945cf6240fde

cheers

^ permalink raw reply

* Re: [PATCH v2] powerpc: Handle .text.{hot, unlikely}.* in linker script
From: Michael Ellerman @ 2021-01-15 12:23 UTC (permalink / raw)
  To: Nathan Chancellor, Michael Ellerman
  Cc: Nick Desaulniers, linux-kernel, stable, clang-built-linux,
	Paul Mackerras, linuxppc-dev
In-Reply-To: <20210104205952.1399409-1-natechancellor@gmail.com>

On Mon, 4 Jan 2021 13:59:53 -0700, Nathan Chancellor wrote:
> Commit eff8728fe698 ("vmlinux.lds.h: Add PGO and AutoFDO input
> sections") added ".text.unlikely.*" and ".text.hot.*" due to an LLVM
> change [1].
> 
> After another LLVM change [2], these sections are seen in some PowerPC
> builds, where there is a orphan section warning then build failure:
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc: Handle .text.{hot,unlikely}.* in linker script
      https://git.kernel.org/powerpc/c/3ce47d95b7346dcafd9bed3556a8d072cb2b8571

cheers

^ permalink raw reply

* [PATCH v3] powerpc/mce: Remove per cpu variables from MCE handlers
From: Ganesh Goudar @ 2021-01-15 12:58 UTC (permalink / raw)
  To: linuxppc-dev, mpe; +Cc: Ganesh Goudar, mahesh, npiggin

Access to per-cpu variables requires translation to be enabled on
pseries machine running in hash mmu mode, Since part of MCE handler
runs in realmode and part of MCE handling code is shared between ppc
architectures pseries and powernv, it becomes difficult to manage
these variables differently on different architectures, So have
these variables in paca instead of having them as per-cpu variables
to avoid complications.

Maximum recursive depth of MCE is 4, Considering the maximum depth
allowed reduce the size of event to 10 from 100.

Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
---
v2: Dynamically allocate memory for machine check event info

v3: Remove check for hash mmu lpar, use memblock_alloc_try_nid
    to allocate memory.
---
 arch/powerpc/include/asm/mce.h     | 21 ++++++++-
 arch/powerpc/include/asm/paca.h    |  4 ++
 arch/powerpc/kernel/mce.c          | 76 +++++++++++++++++-------------
 arch/powerpc/kernel/setup-common.c |  2 +-
 4 files changed, 69 insertions(+), 34 deletions(-)

diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
index e6c27ae843dc..8d6e3a7a9f37 100644
--- a/arch/powerpc/include/asm/mce.h
+++ b/arch/powerpc/include/asm/mce.h
@@ -204,7 +204,18 @@ struct mce_error_info {
 	bool			ignore_event;
 };
 
-#define MAX_MC_EVT	100
+#define MAX_MC_EVT	10
+
+struct mce_info {
+	int mce_nest_count;
+	struct machine_check_event mce_event[MAX_MC_EVT];
+	/* Queue for delayed MCE events. */
+	int mce_queue_count;
+	struct machine_check_event mce_event_queue[MAX_MC_EVT];
+	/* Queue for delayed MCE UE events. */
+	int mce_ue_count;
+	struct machine_check_event  mce_ue_event_queue[MAX_MC_EVT];
+};
 
 /* Release flags for get_mce_event() */
 #define MCE_EVENT_RELEASE	true
@@ -233,5 +244,13 @@ long __machine_check_early_realmode_p7(struct pt_regs *regs);
 long __machine_check_early_realmode_p8(struct pt_regs *regs);
 long __machine_check_early_realmode_p9(struct pt_regs *regs);
 long __machine_check_early_realmode_p10(struct pt_regs *regs);
+#define get_mce_info() local_paca->mce_info
+#endif /* CONFIG_PPC_BOOK3S_64 */
+
+#ifdef CONFIG_PPC_BOOK3S_64
+void mce_init(void);
+#else
+static inline void mce_init(void) { };
 #endif /* CONFIG_PPC_BOOK3S_64 */
+
 #endif /* __ASM_PPC64_MCE_H__ */
diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 9454d29ff4b4..38e0c55e845d 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -29,6 +29,7 @@
 #include <asm/hmi.h>
 #include <asm/cpuidle.h>
 #include <asm/atomic.h>
+#include <asm/mce.h>
 
 #include <asm-generic/mmiowb_types.h>
 
@@ -273,6 +274,9 @@ struct paca_struct {
 #ifdef CONFIG_MMIOWB
 	struct mmiowb_state mmiowb_state;
 #endif
+#ifdef CONFIG_PPC_BOOK3S_64
+	struct mce_info *mce_info;
+#endif /* CONFIG_PPC_BOOK3S_64 */
 } ____cacheline_aligned;
 
 extern void copy_mm_to_paca(struct mm_struct *mm);
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index 9f3e133b57b7..feeb3231b541 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -17,22 +17,13 @@
 #include <linux/irq_work.h>
 #include <linux/extable.h>
 #include <linux/ftrace.h>
+#include <linux/memblock.h>
 
 #include <asm/machdep.h>
 #include <asm/mce.h>
 #include <asm/nmi.h>
 
-static DEFINE_PER_CPU(int, mce_nest_count);
-static DEFINE_PER_CPU(struct machine_check_event[MAX_MC_EVT], mce_event);
-
-/* Queue for delayed MCE events. */
-static DEFINE_PER_CPU(int, mce_queue_count);
-static DEFINE_PER_CPU(struct machine_check_event[MAX_MC_EVT], mce_event_queue);
-
-/* Queue for delayed MCE UE events. */
-static DEFINE_PER_CPU(int, mce_ue_count);
-static DEFINE_PER_CPU(struct machine_check_event[MAX_MC_EVT],
-					mce_ue_event_queue);
+#include "setup.h"
 
 static void machine_check_process_queued_event(struct irq_work *work);
 static void machine_check_ue_irq_work(struct irq_work *work);
@@ -103,8 +94,8 @@ void save_mce_event(struct pt_regs *regs, long handled,
 		    struct mce_error_info *mce_err,
 		    uint64_t nip, uint64_t addr, uint64_t phys_addr)
 {
-	int index = __this_cpu_inc_return(mce_nest_count) - 1;
-	struct machine_check_event *mce = this_cpu_ptr(&mce_event[index]);
+	int index = get_mce_info()->mce_nest_count++;
+	struct machine_check_event *mce = &get_mce_info()->mce_event[index];
 
 	/*
 	 * Return if we don't have enough space to log mce event.
@@ -191,7 +182,7 @@ void save_mce_event(struct pt_regs *regs, long handled,
  */
 int get_mce_event(struct machine_check_event *mce, bool release)
 {
-	int index = __this_cpu_read(mce_nest_count) - 1;
+	int index = get_mce_info()->mce_nest_count - 1;
 	struct machine_check_event *mc_evt;
 	int ret = 0;
 
@@ -201,7 +192,7 @@ int get_mce_event(struct machine_check_event *mce, bool release)
 
 	/* Check if we have MCE info to process. */
 	if (index < MAX_MC_EVT) {
-		mc_evt = this_cpu_ptr(&mce_event[index]);
+		mc_evt = &get_mce_info()->mce_event[index];
 		/* Copy the event structure and release the original */
 		if (mce)
 			*mce = *mc_evt;
@@ -211,7 +202,7 @@ int get_mce_event(struct machine_check_event *mce, bool release)
 	}
 	/* Decrement the count to free the slot. */
 	if (release)
-		__this_cpu_dec(mce_nest_count);
+		get_mce_info()->mce_nest_count--;
 
 	return ret;
 }
@@ -233,13 +224,13 @@ static void machine_check_ue_event(struct machine_check_event *evt)
 {
 	int index;
 
-	index = __this_cpu_inc_return(mce_ue_count) - 1;
+	index = get_mce_info()->mce_ue_count++;
 	/* If queue is full, just return for now. */
 	if (index >= MAX_MC_EVT) {
-		__this_cpu_dec(mce_ue_count);
+		get_mce_info()->mce_ue_count--;
 		return;
 	}
-	memcpy(this_cpu_ptr(&mce_ue_event_queue[index]), evt, sizeof(*evt));
+	memcpy(&get_mce_info()->mce_ue_event_queue[index], evt, sizeof(*evt));
 
 	/* Queue work to process this event later. */
 	irq_work_queue(&mce_ue_event_irq_work);
@@ -256,13 +247,13 @@ void machine_check_queue_event(void)
 	if (!get_mce_event(&evt, MCE_EVENT_RELEASE))
 		return;
 
-	index = __this_cpu_inc_return(mce_queue_count) - 1;
+	index = get_mce_info()->mce_queue_count++;
 	/* If queue is full, just return for now. */
 	if (index >= MAX_MC_EVT) {
-		__this_cpu_dec(mce_queue_count);
+		get_mce_info()->mce_queue_count--;
 		return;
 	}
-	memcpy(this_cpu_ptr(&mce_event_queue[index]), &evt, sizeof(evt));
+	memcpy(&get_mce_info()->mce_event_queue[index], &evt, sizeof(evt));
 
 	/* Queue irq work to process this event later. */
 	irq_work_queue(&mce_event_process_work);
@@ -289,9 +280,9 @@ static void machine_process_ue_event(struct work_struct *work)
 	int index;
 	struct machine_check_event *evt;
 
-	while (__this_cpu_read(mce_ue_count) > 0) {
-		index = __this_cpu_read(mce_ue_count) - 1;
-		evt = this_cpu_ptr(&mce_ue_event_queue[index]);
+	while (get_mce_info()->mce_ue_count > 0) {
+		index = get_mce_info()->mce_ue_count - 1;
+		evt = &get_mce_info()->mce_ue_event_queue[index];
 		blocking_notifier_call_chain(&mce_notifier_list, 0, evt);
 #ifdef CONFIG_MEMORY_FAILURE
 		/*
@@ -304,7 +295,7 @@ static void machine_process_ue_event(struct work_struct *work)
 		 */
 		if (evt->error_type == MCE_ERROR_TYPE_UE) {
 			if (evt->u.ue_error.ignore_event) {
-				__this_cpu_dec(mce_ue_count);
+				get_mce_info()->mce_ue_count--;
 				continue;
 			}
 
@@ -320,7 +311,7 @@ static void machine_process_ue_event(struct work_struct *work)
 					"was generated\n");
 		}
 #endif
-		__this_cpu_dec(mce_ue_count);
+		get_mce_info()->mce_ue_count--;
 	}
 }
 /*
@@ -338,17 +329,17 @@ static void machine_check_process_queued_event(struct irq_work *work)
 	 * For now just print it to console.
 	 * TODO: log this error event to FSP or nvram.
 	 */
-	while (__this_cpu_read(mce_queue_count) > 0) {
-		index = __this_cpu_read(mce_queue_count) - 1;
-		evt = this_cpu_ptr(&mce_event_queue[index]);
+	while (get_mce_info()->mce_queue_count > 0) {
+		index = get_mce_info()->mce_queue_count - 1;
+		evt = &get_mce_info()->mce_event_queue[index];
 
 		if (evt->error_type == MCE_ERROR_TYPE_UE &&
 		    evt->u.ue_error.ignore_event) {
-			__this_cpu_dec(mce_queue_count);
+			get_mce_info()->mce_queue_count--;
 			continue;
 		}
 		machine_check_print_event_info(evt, false, false);
-		__this_cpu_dec(mce_queue_count);
+		get_mce_info()->mce_queue_count--;
 	}
 }
 
@@ -741,3 +732,24 @@ long hmi_exception_realmode(struct pt_regs *regs)
 
 	return 1;
 }
+
+void __init mce_init(void)
+{
+	struct mce_info *mce_info;
+	u64 limit;
+	int i;
+
+	limit = min(ppc64_bolted_size(), ppc64_rma_size);
+	for_each_possible_cpu(i) {
+		mce_info = memblock_alloc_try_nid(sizeof(*mce_info),
+						  __alignof__(*mce_info),
+						  MEMBLOCK_LOW_LIMIT,
+						  limit, cpu_to_node(i));
+		if (!mce_info)
+			goto err;
+		paca_ptrs[i]->mce_info = mce_info;
+	}
+	return;
+err:
+	panic("Failed to allocate memory for MCE event data\n");
+}
diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
index 71f38e9248be..17dc451f0e45 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -916,7 +916,6 @@ void __init setup_arch(char **cmdline_p)
 	/* On BookE, setup per-core TLB data structures. */
 	setup_tlb_core_data();
 #endif
-
 	/* Print various info about the machine that has been gathered so far. */
 	print_system_info();
 
@@ -938,6 +937,7 @@ void __init setup_arch(char **cmdline_p)
 	exc_lvl_early_init();
 	emergency_stack_init();
 
+	mce_init();
 	smp_release_cpus();
 
 	initmem_init();
-- 
2.26.2


^ permalink raw reply related

* Re: [PATCH v2 0/7] Rid W=1 warnings in Ethernet
From: Christophe Leroy @ 2021-01-15 13:14 UTC (permalink / raw)
  To: Lee Jones, Jakub Kicinski
  Cc: Paul Durrant, Kurt Kanzenbach, Alexei Starovoitov, linux-kernel,
	Peter Cammaert, Paul Mackerras, Sukadev Bhattiprolu, Wei Liu,
	Daniel Borkmann, Utz Bacher, John Fastabend, Santiago Leon,
	xen-devel, Grygorii Strashko, Thomas Falcon,
	Jesper Dangaard Brouer, Jens Osterkamp, Rusty Russell,
	Daris A Nevil, Lijun Pan, Ivan Khoronzhuk, Nicolas Pitre,
	Geoff Levand, netdev, Gustavo A. R. Silva, Erik Stahlman,
	John Allen, Dany Madden, bpf, linuxppc-dev, David S. Miller,
	Russell King
In-Reply-To: <20210115111823.GH3975472@dell>



Le 15/01/2021 à 12:18, Lee Jones a écrit :
> On Thu, 14 Jan 2021, Lee Jones wrote:
> 
>> On Thu, 14 Jan 2021, Jakub Kicinski wrote:
>>
>>> On Thu, 14 Jan 2021 08:33:49 +0000 Lee Jones wrote:
>>>> On Wed, 13 Jan 2021, Jakub Kicinski wrote:
>>>>
>>>>> On Wed, 13 Jan 2021 16:41:16 +0000 Lee Jones wrote:
>>>>>> Resending the stragglers again.
>>>>>>
>>>>>> This set is part of a larger effort attempting to clean-up W=1
>>>>>> kernel builds, which are currently overwhelmingly riddled with
>>>>>> niggly little warnings.
>>>>>>                                                                                                                   
>>>>>> v2:
>>>>>>   - Squashed IBM patches
>>>>>>   - Fixed real issue in SMSC
>>>>>>   - Added Andrew's Reviewed-by tags on remainder
>>>>>
>>>>> Does not apply, please rebase on net-next/master.
>>>>
>>>> These are based on Tuesday's next/master.
>>>
>>> What's next/master?
>>
>> I'm not sure if this is a joke, or not? :)
>>
>> next/master == Linux Next.  The daily merged repo where all of the
>> *-next branches end up to ensure interoperability.  It's also the
>> branch that is most heavily tested by the auto-builders to ensure the
>> vast majority of issues are ironed out before hitting Mainline.
>>
>>> This is net-next:
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/
>>
>> Looks like net-next gets merged into next/master:
>>
>> commit 452958f1f3d1c8980a8414f9c37c8c6de24c7d32
>> Merge: 1eabba209a17a f50e2f9f79164
>> Author: Stephen Rothwell <sfr@canb.auug.org.au>
>> Date:   Thu Jan 14 10:35:40 2021 +1100
>>
>>      Merge remote-tracking branch 'net-next/master'
>>
>> So I'm not sure what it's conflicting with.
>>
>> Do you have patches in net-next that didn't make it into next/master
>> for some reason?
>>
>> I'll try to rebase again tomorrow.
>>
>> Hopefully I am able to reproduce your issue by then.
> 
> Okay so my development branch rebased again with no issue.

Rebasing is not same as patches application.

> 
> I also took the liberty to checkout net-next and cherry-pick the
> patches [0], which again didn't cause a problem.

Also normal, cherry-picking is not the same as applying a patch series.

> 
> I'm not sure what else to suggest.  Is your local copy up-to-date?

I guess so, I have the same problem as Jakub, see below. I had to use 'git am -3' to apply you 
series. As you can see, git falls back to 3 way merge for patch 1, which means your series is close 
to but not fully in sync with net-next.


[root@localhost linux-powerpc]# git remote -v
net-next	https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git (fetch)
net-next	https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git (push)

[root@localhost linux-powerpc]# git checkout net-next/master -b net-next
Switched to a new branch 'net-next'

[root@localhost linux-powerpc]# git am /root/Downloads/Rid-W-1-warnings-in-Ethernet.patch
Applying: net: ethernet: smsc: smc91x: Fix function name in kernel-doc header
error: patch failed: drivers/net/ethernet/smsc/smc91x.c:2192
error: drivers/net/ethernet/smsc/smc91x.c: patch does not apply
Patch failed at 0001 net: ethernet: smsc: smc91x: Fix function name in kernel-doc header
hint: Use 'git am --show-current-patch' to see the failed patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

[root@localhost linux-powerpc]# git am --abort

[root@localhost linux-powerpc]# git am -3 /root/Downloads/Rid-W-1-warnings-in-Ethernet.patch
Applying: net: ethernet: smsc: smc91x: Fix function name in kernel-doc header
Using index info to reconstruct a base tree...
M	drivers/net/ethernet/smsc/smc91x.c
Falling back to patching base and 3-way merge...
Auto-merging drivers/net/ethernet/smsc/smc91x.c
Applying: net: xen-netback: xenbus: Demote nonconformant kernel-doc headers
Applying: net: ethernet: ti: am65-cpsw-qos: Demote non-conformant function header
Applying: net: ethernet: ti: am65-cpts: Document am65_cpts_rx_enable()'s 'en' parameter
Applying: net: ethernet: ibm: ibmvnic: Fix some kernel-doc misdemeanours
Applying: net: ethernet: toshiba: ps3_gelic_net: Fix some kernel-doc misdemeanours
Applying: net: ethernet: toshiba: spider_net: Document a whole bunch of function parameters


Christophe

^ permalink raw reply

* Re: [PATCH v2 0/7] Rid W=1 warnings in Ethernet
From: Lee Jones @ 2021-01-15 13:38 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Paul Durrant, Kurt Kanzenbach, Alexei Starovoitov, linux-kernel,
	Peter Cammaert, Paul Mackerras, Sukadev Bhattiprolu, Wei Liu,
	Daniel Borkmann, Utz Bacher, John Fastabend, Santiago Leon,
	xen-devel, Grygorii Strashko, Thomas Falcon,
	Jesper Dangaard Brouer, Jens Osterkamp, Rusty Russell,
	Daris A Nevil, Lijun Pan, Jakub Kicinski, Ivan Khoronzhuk,
	Nicolas Pitre, Geoff Levand, netdev, Gustavo A. R. Silva,
	Erik Stahlman, John Allen, Dany Madden, bpf, linuxppc-dev,
	David S. Miller, Russell King
In-Reply-To: <bc775cc3-fda3-0280-5f92-53058996f02f@csgroup.eu>

On Fri, 15 Jan 2021, Christophe Leroy wrote:

> 
> 
> Le 15/01/2021 à 12:18, Lee Jones a écrit :
> > On Thu, 14 Jan 2021, Lee Jones wrote:
> > 
> > > On Thu, 14 Jan 2021, Jakub Kicinski wrote:
> > > 
> > > > On Thu, 14 Jan 2021 08:33:49 +0000 Lee Jones wrote:
> > > > > On Wed, 13 Jan 2021, Jakub Kicinski wrote:
> > > > > 
> > > > > > On Wed, 13 Jan 2021 16:41:16 +0000 Lee Jones wrote:
> > > > > > > Resending the stragglers again.
> > > > > > > 
> > > > > > > This set is part of a larger effort attempting to clean-up W=1
> > > > > > > kernel builds, which are currently overwhelmingly riddled with
> > > > > > > niggly little warnings.
> > > > > > > v2:
> > > > > > >   - Squashed IBM patches
> > > > > > >   - Fixed real issue in SMSC
> > > > > > >   - Added Andrew's Reviewed-by tags on remainder
> > > > > > 
> > > > > > Does not apply, please rebase on net-next/master.
> > > > > 
> > > > > These are based on Tuesday's next/master.
> > > > 
> > > > What's next/master?
> > > 
> > > I'm not sure if this is a joke, or not? :)
> > > 
> > > next/master == Linux Next.  The daily merged repo where all of the
> > > *-next branches end up to ensure interoperability.  It's also the
> > > branch that is most heavily tested by the auto-builders to ensure the
> > > vast majority of issues are ironed out before hitting Mainline.
> > > 
> > > > This is net-next:
> > > > 
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/
> > > 
> > > Looks like net-next gets merged into next/master:
> > > 
> > > commit 452958f1f3d1c8980a8414f9c37c8c6de24c7d32
> > > Merge: 1eabba209a17a f50e2f9f79164
> > > Author: Stephen Rothwell <sfr@canb.auug.org.au>
> > > Date:   Thu Jan 14 10:35:40 2021 +1100
> > > 
> > >      Merge remote-tracking branch 'net-next/master'
> > > 
> > > So I'm not sure what it's conflicting with.
> > > 
> > > Do you have patches in net-next that didn't make it into next/master
> > > for some reason?
> > > 
> > > I'll try to rebase again tomorrow.
> > > 
> > > Hopefully I am able to reproduce your issue by then.
> > 
> > Okay so my development branch rebased again with no issue.
> 
> Rebasing is not same as patches application.
> 
> > 
> > I also took the liberty to checkout net-next and cherry-pick the
> > patches [0], which again didn't cause a problem.
> 
> Also normal, cherry-picking is not the same as applying a patch series.
> 
> > 
> > I'm not sure what else to suggest.  Is your local copy up-to-date?
> 
> I guess so, I have the same problem as Jakub, see below. I had to use 'git
> am -3' to apply you series. As you can see, git falls back to 3 way merge
> for patch 1, which means your series is close to but not fully in sync with
> net-next.
> 
> 
> [root@localhost linux-powerpc]# git remote -v
> net-next	https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git (fetch)
> net-next	https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git (push)
> 
> [root@localhost linux-powerpc]# git checkout net-next/master -b net-next
> Switched to a new branch 'net-next'
> 
> [root@localhost linux-powerpc]# git am /root/Downloads/Rid-W-1-warnings-in-Ethernet.patch
> Applying: net: ethernet: smsc: smc91x: Fix function name in kernel-doc header
> error: patch failed: drivers/net/ethernet/smsc/smc91x.c:2192
> error: drivers/net/ethernet/smsc/smc91x.c: patch does not apply
> Patch failed at 0001 net: ethernet: smsc: smc91x: Fix function name in kernel-doc header
> hint: Use 'git am --show-current-patch' to see the failed patch
> When you have resolved this problem, run "git am --continue".
> If you prefer to skip this patch, run "git am --skip" instead.
> To restore the original branch and stop patching, run "git am --abort".
> 
> [root@localhost linux-powerpc]# git am --abort
> 
> [root@localhost linux-powerpc]# git am -3 /root/Downloads/Rid-W-1-warnings-in-Ethernet.patch
> Applying: net: ethernet: smsc: smc91x: Fix function name in kernel-doc header
> Using index info to reconstruct a base tree...
> M	drivers/net/ethernet/smsc/smc91x.c
> Falling back to patching base and 3-way merge...
> Auto-merging drivers/net/ethernet/smsc/smc91x.c
> Applying: net: xen-netback: xenbus: Demote nonconformant kernel-doc headers
> Applying: net: ethernet: ti: am65-cpsw-qos: Demote non-conformant function header
> Applying: net: ethernet: ti: am65-cpts: Document am65_cpts_rx_enable()'s 'en' parameter
> Applying: net: ethernet: ibm: ibmvnic: Fix some kernel-doc misdemeanours
> Applying: net: ethernet: toshiba: ps3_gelic_net: Fix some kernel-doc misdemeanours
> Applying: net: ethernet: toshiba: spider_net: Document a whole bunch of function parameters

Seeing as you went to all that effort, I thought it was only fair that
I did the same.  After some digging my tentative conclusion is that
Linux -next is up-to-date with net-next, but net-next is not
up-to-date with Linux -next.

I think this patch is conflicting:

 smc91x: remove GPIOLIB dependency.

Was that taken in via another tree?  If not resolved there is a chance
that this may cause a conflict when net-next is merged into Mainline.

..

Okay, so what would you like me to do?  Would you like me to re-submit
the set based only on net-next, or are you happy with your 3-way
merge?

I'll do as you ask.

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog

^ permalink raw reply

* Re: [PATCH 6/6] powerpc/rtas: constrain user region allocation to RMA
From: Nathan Lynch @ 2021-01-15 15:38 UTC (permalink / raw)
  To: Alexey Kardashevskiy, linuxppc-dev; +Cc: tyreld, brking, ajd, aneesh.kumar
In-Reply-To: <5276937f-b72a-89ba-d0d8-19e4be55ae35@ozlabs.ru>

Alexey Kardashevskiy <aik@ozlabs.ru> writes:
> On 15/01/2021 09:00, Nathan Lynch wrote:
>> Memory locations passed as arguments from the OS to RTAS usually need
>> to be addressable in 32-bit mode and must reside in the Real Mode
>> Area. On PAPR guests, the RMA starts at logical address 0 and is the
>> first logical memory block reported in the LPAR’s device tree.
>> 
>> On powerpc targets with RTAS, Linux makes available to user space a
>> region of memory suitable for arguments to be passed to RTAS via
>> sys_rtas(). This region (rtas_rmo_buf) is allocated via the memblock
>> API during boot in order to ensure that it satisfies the requirements
>> described above.
>> 
>> With radix MMU, the upper limit supplied to the memblock allocation
>> can exceed the bounds of the first logical memory block, since
>> ppc64_rma_size is ULONG_MAX and RTAS_INSTANTIATE_MAX is 1GB. (512MB is
>> a common size of the first memory block according to a small sample of
>> LPARs I have checked.) This leads to failures when user space invokes
>> an RTAS function that uses a work area, such as
>> ibm,configure-connector.
>> 
>> Alter the determination of the upper limit for rtas_rmo_buf's
>> allocation to consult the device tree directly, ensuring placement
>> within the RMA regardless of the MMU in use.
>
> Can we tie this with RTAS (which also needs to be in RMA) and simply add 
> extra 64K in prom_instantiate_rtas() and advertise this address 
> (ALIGH_UP(rtas-base + rtas-size, PAGE_SIZE)) to the user space? We do 
> not need this RMO area before that point.

Can you explain more about what advantage that would bring? I'm not
seeing it. It's a more significant change than what I've written
here. Would it interact well with kexec?

> And probably do the same with per-cpu RTAS argument structures mentioned 
> in the cover letter?

I don't think so, since those need to be allocated with the pacas and
limited to the maximum possible CPUs, which is discovered by the kernel
much later.

But maybe I misunderstand what you're suggesting.

^ permalink raw reply

* Re: [PATCH 5/6] powerpc/rtas: rename RTAS_RMOBUF_MAX to RTAS_USER_REGION_SIZE
From: Nathan Lynch @ 2021-01-15 15:56 UTC (permalink / raw)
  To: Alexey Kardashevskiy, linuxppc-dev; +Cc: tyreld, brking, ajd, aneesh.kumar
In-Reply-To: <d91c5b15-7c3d-a332-45ac-1b865341e962@ozlabs.ru>

Alexey Kardashevskiy <aik@ozlabs.ru> writes:
> On 15/01/2021 09:00, Nathan Lynch wrote:
>> diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
>> index 332e1000ca0f..1aa7ab1cbc84 100644
>> --- a/arch/powerpc/include/asm/rtas.h
>> +++ b/arch/powerpc/include/asm/rtas.h
>> @@ -19,8 +19,11 @@
>>   #define RTAS_UNKNOWN_SERVICE (-1)
>>   #define RTAS_INSTANTIATE_MAX (1ULL<<30) /* Don't instantiate rtas at/above this value */
>>   
>> -/* Buffer size for ppc_rtas system call. */
>> -#define RTAS_RMOBUF_MAX (64 * 1024)
>> +/* Work areas shared with RTAS must be 4K, naturally aligned. */
>
> Why exactly 4K and not (for example) PAGE_SIZE?

4K is a platform requirement and isn't related to Linux's configured
page size. See the PAPR specification for RTAS functions such as
ibm,configure-connector, ibm,update-nodes, ibm,update-properties.

There are other calls with work area parameters where alignment isn't
specified (e.g. ibm,get-system-parameter) but 4KB alignment is a safe
choice for those.

>> +#define RTAS_WORK_AREA_SIZE   4096
>> +
>> +/* Work areas allocated for user space access. */
>> +#define RTAS_USER_REGION_SIZE (RTAS_WORK_AREA_SIZE * 16)
>
> This is still 64K but no clarity why. There is 16 of something, what
> is it?

There are 16 4KB work areas in the region. I can name it
RTAS_NR_USER_WORK_AREAS or similar.

^ permalink raw reply

* Re: [PATCH 4/6] powerpc/rtas: move syscall filter setup into separate function
From: Nathan Lynch @ 2021-01-15 16:04 UTC (permalink / raw)
  To: Alexey Kardashevskiy, linuxppc-dev; +Cc: tyreld, brking, ajd, aneesh.kumar
In-Reply-To: <20e340a9-0c61-cb21-659f-cf60d75f3cd3@ozlabs.ru>

Alexey Kardashevskiy <aik@ozlabs.ru> writes:
> On 15/01/2021 09:00, Nathan Lynch wrote:
>> +static void __init rtas_syscall_filter_init(void)
>> +{
>> +	unsigned int i;
>> +
>> +	for (i = 0; i < ARRAY_SIZE(rtas_filters); i++) {
>> +		rtas_filters[i].token = rtas_token(rtas_filters[i].name);
>> +	}
>> +
>
> Unnecessary curly braces (I understand it is cut-n-paste but still) and 
> an empty line.

Yes, will fix in v2, thanks.

^ permalink raw reply

* [PATCH v6 01/39] KVM: PPC: Book3S HV: Context tracking exit guest context before enabling irqs
From: Nicholas Piggin @ 2021-01-15 16:49 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: kvm-ppc, Nicholas Piggin
In-Reply-To: <20210115165012.1260253-1-npiggin@gmail.com>

Interrupts that occur in kernel mode expect that context tracking
is set to kernel. Enabling local irqs before context tracking
switches from guest to host means interrupts can come in and trigger
warnings about wrong context, and possibly worse.

Cc: kvm-ppc@vger.kernel.org
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kvm/book3s_hv.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 6f612d240392..d348e77cee20 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -3407,8 +3407,9 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
 
 	kvmppc_set_host_core(pcpu);
 
+	guest_exit_irqoff();
+
 	local_irq_enable();
-	guest_exit();
 
 	/* Let secondaries go back to the offline loop */
 	for (i = 0; i < controlled_threads; ++i) {
@@ -4217,8 +4218,9 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
 
 	kvmppc_set_host_core(pcpu);
 
+	guest_exit_irqoff();
+
 	local_irq_enable();
-	guest_exit();
 
 	cpumask_clear_cpu(pcpu, &kvm->arch.cpu_in_guest);
 
-- 
2.23.0


^ permalink raw reply related

* [PATCH v6 00/39] powerpc: interrupt wrappers
From: Nicholas Piggin @ 2021-01-15 16:49 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

This adds interrupt handler wrapper functions, similar to the
generic / x86 code, and moves several common operations into them
from either asm or open coded in the individual handlers.

This series is based on upstream plus "powerpc/64s: fix scv entry
fallback flush vs interrupt".

Since v1:
- Fixed a couple of compile issues
- Fixed perf weirdness (sometimes NMI, sometimes not)
- Also move irq_enter/exit into wrappers

Since v2:
- Rebased upstream
- Took code in patch 3 from Christophe
- Fixed some compile errors from 0day

Since v3:
- Rebased
- Split Christophe's 32s DABR patch into its own patch
- Fixed missing asm from 32s on patch 3 noticed by Christophe.
- Moved changes around, split out one more patch (patch 9) to make
  changes more logical and atomic.
- Add comments explaining _RAW handlers (SLB, HPTE) interrupts better

Since v4:
- Rebased (on top of scv fallback flush fix)
- Rearranged a few changes into different patches from Christophe,
  e.g., the ___do_page_fault change from patch 2 to 10. I didn't
  do everything (e.g., splitting to update __hash_page to drop the
  msr argument before the bulk of patch 2 seemed like churn without
  much improvement), and also other things like removing the new
  ___do_page_fault variant if we can change hash fault context tracking
  I didn't get time to completely investigate and implement. I think
  this shouldn't be a showstopper though we can make more improvements
  as we go.

Since v5:
- Lots of good review suggestions from Christophe, see v5 email threads.
- Major change being do_break is left in asm and selected early as an
  alternate interrupt handler now, which is a smaller step and matches
  other subarchs better.
- Rearranged patches, split, moved things, bug fixes, etc.
- Converted a few more missed exception handlers for debug and ras
  stuff.
- Found a few relatively minor bugs, comment updates, and tidy ups that
  don't really have to be part of this series but might as well be.

Thanks,
Nick

Christophe Leroy (1):
  powerpc/32s: move DABR match out of handle_page_fault

Nicholas Piggin (38):
  KVM: PPC: Book3S HV: Context tracking exit guest context before
    enabling irqs
  powerpc/64s: move DABR match out of handle_page_fault
  powerpc/64s: move the hash fault handling logic to C
  powerpc: remove arguments from fault handler functions
  powerpc: do_break get registers from regs
  powerpc: bad_page_fault get registers from regs
  powerpc: rearrange do_page_fault error case to be inside
    exception_enter
  powerpc/64s: move bad_page_fault handling to C
  powerpc/64s: split do_hash_fault
  powerpc/mm: Remove stale do_page_fault comment referring to SLB faults
  powerpc/64s: slb comment update
  powerpc/traps: add NOKPROBE_SYMBOL for sreset and mce
  powerpc/perf: move perf irq/nmi handling details into traps.c
  powerpc/time: move timer_broadcast_interrupt prototype to asm/time.h
  powerpc: add and use unknown_async_exception
  powerpc/fsl_booke/32: CacheLockingException remove args
  powerpc: DebugException remove args
  powerpc/cell: tidy up pervasive declarations
  powerpc: introduce die_mce
  powerpc/mce: ensure machine check handler always tests RI
  powerpc: improve handling of unrecoverable system reset
  powerpc: interrupt handler wrapper functions
  powerpc: add interrupt wrapper entry / exit stub functions
  powerpc: convert interrupt handlers to use wrappers
  powerpc: add interrupt_cond_local_irq_enable helper
  powerpc/64: context tracking remove _TIF_NOHZ
  powerpc/64s/hash: improve context tracking of hash faults
  powerpc/64: context tracking move to interrupt wrappers
  powerpc/64: add context tracking to asynchronous interrupts
  powerpc: handle irq_enter/irq_exit in interrupt handler wrappers
  powerpc/64s: move context tracking exit to interrupt exit path
  powerpc/64s: reconcile interrupts in C
  powerpc/64: move account_stolen_time into its own function
  powerpc/64: entry cpu time accounting in C
  powerpc: move NMI entry/exit code into wrapper
  powerpc/64s: move NMI soft-mask handling to C
  powerpc/64s: runlatch interrupt handling in C
  powerpc/64s: power4 nap fixup in C

 arch/powerpc/Kconfig                       |   1 -
 arch/powerpc/include/asm/asm-prototypes.h  |  29 --
 arch/powerpc/include/asm/bug.h             |   9 +-
 arch/powerpc/include/asm/cputime.h         |  14 +
 arch/powerpc/include/asm/debug.h           |   4 -
 arch/powerpc/include/asm/hw_irq.h          |   9 -
 arch/powerpc/include/asm/interrupt.h       | 434 +++++++++++++++++++++
 arch/powerpc/include/asm/ppc_asm.h         |  24 --
 arch/powerpc/include/asm/processor.h       |   1 +
 arch/powerpc/include/asm/thread_info.h     |  10 +-
 arch/powerpc/include/asm/time.h            |   2 +
 arch/powerpc/kernel/dbell.c                |   9 +-
 arch/powerpc/kernel/entry_32.S             |  25 +-
 arch/powerpc/kernel/exceptions-64e.S       |   8 +-
 arch/powerpc/kernel/exceptions-64s.S       | 310 ++-------------
 arch/powerpc/kernel/head_40x.S             |  11 +-
 arch/powerpc/kernel/head_8xx.S             |  11 +-
 arch/powerpc/kernel/head_book3s_32.S       |  14 +-
 arch/powerpc/kernel/head_booke.h           |   6 +-
 arch/powerpc/kernel/head_fsl_booke.S       |   6 +-
 arch/powerpc/kernel/idle_book3s.S          |   4 +
 arch/powerpc/kernel/irq.c                  |   7 +-
 arch/powerpc/kernel/mce.c                  |  16 +-
 arch/powerpc/kernel/process.c              |   8 +-
 arch/powerpc/kernel/ptrace/ptrace.c        |   4 -
 arch/powerpc/kernel/signal.c               |   4 -
 arch/powerpc/kernel/syscall_64.c           |  30 +-
 arch/powerpc/kernel/tau_6xx.c              |   5 +-
 arch/powerpc/kernel/time.c                 |   7 +-
 arch/powerpc/kernel/traps.c                | 263 +++++++------
 arch/powerpc/kernel/watchdog.c             |  15 +-
 arch/powerpc/kvm/book3s_hv.c               |   7 +-
 arch/powerpc/kvm/book3s_hv_builtin.c       |   1 +
 arch/powerpc/kvm/booke.c                   |   1 +
 arch/powerpc/mm/book3s64/hash_utils.c      |  96 +++--
 arch/powerpc/mm/book3s64/slb.c             |  40 +-
 arch/powerpc/mm/fault.c                    |  80 ++--
 arch/powerpc/perf/core-book3s.c            |  35 +-
 arch/powerpc/perf/core-fsl-emb.c           |  25 --
 arch/powerpc/platforms/8xx/machine_check.c |   2 +-
 arch/powerpc/platforms/cell/pervasive.c    |   1 +
 arch/powerpc/platforms/cell/pervasive.h    |   3 -
 arch/powerpc/platforms/cell/ras.c          |   6 +-
 arch/powerpc/platforms/cell/ras.h          |   9 +-
 arch/powerpc/platforms/powernv/idle.c      |   1 +
 arch/powerpc/platforms/powernv/opal.c      |   2 +-
 arch/powerpc/platforms/pseries/ras.c       |   2 +-
 47 files changed, 867 insertions(+), 744 deletions(-)
 create mode 100644 arch/powerpc/include/asm/interrupt.h

-- 
2.23.0


^ permalink raw reply

* [PATCH v6 02/39] powerpc/32s: move DABR match out of handle_page_fault
From: Nicholas Piggin @ 2021-01-15 16:49 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20210115165012.1260253-1-npiggin@gmail.com>

From: Christophe Leroy <christophe.leroy@csgroup.eu>

handle_page_fault() has some code dedicated to book3s/32 to
call do_break() when the DSI is a DABR match.

On other platforms, do_break() is handled separately.

Do the same for book3s/32, do it earlier in the process of DSI.

This change also avoid doing the test on ISI.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/kernel/entry_32.S       | 15 ---------------
 arch/powerpc/kernel/head_book3s_32.S |  3 +++
 2 files changed, 3 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 1c9b0ccc2172..238eacfda7b0 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -670,10 +670,6 @@ ppc_swapcontext:
 	.globl	handle_page_fault
 handle_page_fault:
 	addi	r3,r1,STACK_FRAME_OVERHEAD
-#ifdef CONFIG_PPC_BOOK3S_32
-	andis.  r0,r5,DSISR_DABRMATCH@h
-	bne-    handle_dabr_fault
-#endif
 	bl	do_page_fault
 	cmpwi	r3,0
 	beq+	ret_from_except
@@ -687,17 +683,6 @@ handle_page_fault:
 	bl	__bad_page_fault
 	b	ret_from_except_full
 
-#ifdef CONFIG_PPC_BOOK3S_32
-	/* We have a data breakpoint exception - handle it */
-handle_dabr_fault:
-	SAVE_NVGPRS(r1)
-	lwz	r0,_TRAP(r1)
-	clrrwi	r0,r0,1
-	stw	r0,_TRAP(r1)
-	bl      do_break
-	b	ret_from_except_full
-#endif
-
 /*
  * This routine switches between two different tasks.  The process
  * state of one is saved on its kernel stack.  Then the state
diff --git a/arch/powerpc/kernel/head_book3s_32.S b/arch/powerpc/kernel/head_book3s_32.S
index 349bf3f0c3af..fc9a12768a14 100644
--- a/arch/powerpc/kernel/head_book3s_32.S
+++ b/arch/powerpc/kernel/head_book3s_32.S
@@ -680,7 +680,10 @@ handle_page_fault_tramp_1:
 	lwz	r5, _DSISR(r11)
 	/* fall through */
 handle_page_fault_tramp_2:
+	andis.	r0, r5, DSISR_DABRMATCH@h
+	bne-	1f
 	EXC_XFER_LITE(0x300, handle_page_fault)
+1:	EXC_XFER_STD(0x300, do_break)
 
 #ifdef CONFIG_VMAP_STACK
 .macro save_regs_thread		thread
-- 
2.23.0


^ permalink raw reply related

* [PATCH v6 03/39] powerpc/64s: move DABR match out of handle_page_fault
From: Nicholas Piggin @ 2021-01-15 16:49 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20210115165012.1260253-1-npiggin@gmail.com>

Similar to the 32/s change, move the test and call to the do_break
handler to the DSI.

Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/exceptions-64s.S | 34 +++++++++++++---------------
 1 file changed, 16 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 6e53f7638737..a6333b986a57 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1439,6 +1439,8 @@ EXC_COMMON_BEGIN(data_access_common)
 	GEN_COMMON data_access
 	ld	r4,_DAR(r1)
 	ld	r5,_DSISR(r1)
+	andis.	r0,r5,DSISR_DABRMATCH@h
+	bne-	1f
 BEGIN_MMU_FTR_SECTION
 	ld	r6,_MSR(r1)
 	li	r3,0x300
@@ -1447,6 +1449,18 @@ MMU_FTR_SECTION_ELSE
 	b	handle_page_fault
 ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
 
+1:	/* We have a data breakpoint exception - handle it */
+	ld      r4,_DAR(r1)
+	ld      r5,_DSISR(r1)
+	addi    r3,r1,STACK_FRAME_OVERHEAD
+	bl      do_break
+	/*
+	 * do_break() may have changed the NV GPRS while handling a breakpoint.
+	 * If so, we need to restore them with their updated values.
+	 */
+	REST_NVGPRS(r1)
+	b       interrupt_return
+
 	GEN_KVM data_access
 
 
@@ -3228,7 +3242,7 @@ disable_machine_check:
 	.balign	IFETCH_ALIGN_BYTES
 do_hash_page:
 #ifdef CONFIG_PPC_BOOK3S_64
-	lis	r0,(DSISR_BAD_FAULT_64S | DSISR_DABRMATCH | DSISR_KEYFAULT)@h
+	lis	r0,(DSISR_BAD_FAULT_64S | DSISR_KEYFAULT)@h
 	ori	r0,r0,DSISR_BAD_FAULT_64S@l
 	and.	r0,r5,r0		/* weird error? */
 	bne-	handle_page_fault	/* if not, try to insert a HPTE */
@@ -3262,15 +3276,13 @@ do_hash_page:
 	/* Error */
 	blt-	13f
 
-	/* Reload DAR/DSISR into r4/r5 for the DABR check below */
+	/* Reload DAR/DSISR into r4/r5 for handle_page_fault */
 	ld	r4,_DAR(r1)
 	ld      r5,_DSISR(r1)
 #endif /* CONFIG_PPC_BOOK3S_64 */
 
 /* Here we have a page fault that hash_page can't handle. */
 handle_page_fault:
-11:	andis.  r0,r5,DSISR_DABRMATCH@h
-	bne-    handle_dabr_fault
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	do_page_fault
 	cmpdi	r3,0
@@ -3281,20 +3293,6 @@ handle_page_fault:
 	bl	__bad_page_fault
 	b	interrupt_return
 
-/* We have a data breakpoint exception - handle it */
-handle_dabr_fault:
-	ld      r4,_DAR(r1)
-	ld      r5,_DSISR(r1)
-	addi    r3,r1,STACK_FRAME_OVERHEAD
-	bl      do_break
-	/*
-	 * do_break() may have changed the NV GPRS while handling a breakpoint.
-	 * If so, we need to restore them with their updated values.
-	 */
-	REST_NVGPRS(r1)
-	b       interrupt_return
-
-
 #ifdef CONFIG_PPC_BOOK3S_64
 /* We have a page fault that hash_page could handle but HV refused
  * the PTE insertion
-- 
2.23.0


^ permalink raw reply related

* [PATCH v6 04/39] powerpc/64s: move the hash fault handling logic to C
From: Nicholas Piggin @ 2021-01-15 16:49 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20210115165012.1260253-1-npiggin@gmail.com>

The fault handling still has some complex logic particularly around
hash table handling, in asm. Implement most of this in C.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/book3s/64/mmu-hash.h |   1 +
 arch/powerpc/kernel/exceptions-64s.S          | 127 ++++--------------
 arch/powerpc/mm/book3s64/hash_utils.c         |  77 +++++++----
 3 files changed, 78 insertions(+), 127 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index 066b1d34c7bc..60a669379aa0 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -454,6 +454,7 @@ static inline unsigned long hpt_hash(unsigned long vpn,
 #define HPTE_NOHPTE_UPDATE	0x2
 #define HPTE_USE_KERNEL_KEY	0x4
 
+int do_hash_fault(struct pt_regs *regs, unsigned long ea, unsigned long dsisr);
 extern int __hash_page_4K(unsigned long ea, unsigned long access,
 			  unsigned long vsid, pte_t *ptep, unsigned long trap,
 			  unsigned long flags, int ssize, int subpage_prot);
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index a6333b986a57..07aba8af99d3 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1401,14 +1401,15 @@ END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
  *
  * Handling:
  * - Hash MMU
- *   Go to do_hash_page first to see if the HPT can be filled from an entry in
- *   the Linux page table. Hash faults can hit in kernel mode in a fairly
+ *   Go to do_hash_fault, which attempts to fill the HPT from an entry in the
+ *   Linux page table. Hash faults can hit in kernel mode in a fairly
  *   arbitrary state (e.g., interrupts disabled, locks held) when accessing
  *   "non-bolted" regions, e.g., vmalloc space. However these should always be
- *   backed by Linux page tables.
+ *   backed by Linux page table entries.
  *
- *   If none is found, do a Linux page fault. Linux page faults can happen in
- *   kernel mode due to user copy operations of course.
+ *   If no entry is found the Linux page fault handler is invoked (by
+ *   do_hash_fault). Linux page faults can happen in kernel mode due to user
+ *   copy operations of course.
  *
  *   KVM: The KVM HDSI handler may perform a load with MSR[DR]=1 in guest
  *   MMU context, which may cause a DSI in the host, which must go to the
@@ -1439,27 +1440,29 @@ EXC_COMMON_BEGIN(data_access_common)
 	GEN_COMMON data_access
 	ld	r4,_DAR(r1)
 	ld	r5,_DSISR(r1)
+	addi	r3,r1,STACK_FRAME_OVERHEAD
 	andis.	r0,r5,DSISR_DABRMATCH@h
 	bne-	1f
 BEGIN_MMU_FTR_SECTION
-	ld	r6,_MSR(r1)
-	li	r3,0x300
-	b	do_hash_page		/* Try to handle as hpte fault */
+	bl	do_hash_fault
 MMU_FTR_SECTION_ELSE
-	b	handle_page_fault
+	bl	do_page_fault
 ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
+	cmpdi	r3,0
+	beq+	interrupt_return
+	mr	r5,r3
+	addi	r3,r1,STACK_FRAME_OVERHEAD
+	ld	r4,_DAR(r1)
+	bl	__bad_page_fault
+	b	interrupt_return
 
-1:	/* We have a data breakpoint exception - handle it */
-	ld      r4,_DAR(r1)
-	ld      r5,_DSISR(r1)
-	addi    r3,r1,STACK_FRAME_OVERHEAD
-	bl      do_break
+1:	bl	do_break
 	/*
 	 * do_break() may have changed the NV GPRS while handling a breakpoint.
 	 * If so, we need to restore them with their updated values.
 	 */
 	REST_NVGPRS(r1)
-	b       interrupt_return
+	b	interrupt_return
 
 	GEN_KVM data_access
 
@@ -1554,13 +1557,19 @@ EXC_COMMON_BEGIN(instruction_access_common)
 	GEN_COMMON instruction_access
 	ld	r4,_DAR(r1)
 	ld	r5,_DSISR(r1)
+	addi	r3,r1,STACK_FRAME_OVERHEAD
 BEGIN_MMU_FTR_SECTION
-	ld      r6,_MSR(r1)
-	li	r3,0x400
-	b	do_hash_page		/* Try to handle as hpte fault */
+	bl	do_hash_fault
 MMU_FTR_SECTION_ELSE
-	b	handle_page_fault
+	bl	do_page_fault
 ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
+	cmpdi	r3,0
+	beq+	interrupt_return
+	mr	r5,r3
+	addi	r3,r1,STACK_FRAME_OVERHEAD
+	ld	r4,_DAR(r1)
+	bl	__bad_page_fault
+	b	interrupt_return
 
 	GEN_KVM instruction_access
 
@@ -3235,83 +3244,3 @@ disable_machine_check:
 	RFI_TO_KERNEL
 1:	mtlr	r0
 	blr
-
-/*
- * Hash table stuff
- */
-	.balign	IFETCH_ALIGN_BYTES
-do_hash_page:
-#ifdef CONFIG_PPC_BOOK3S_64
-	lis	r0,(DSISR_BAD_FAULT_64S | DSISR_KEYFAULT)@h
-	ori	r0,r0,DSISR_BAD_FAULT_64S@l
-	and.	r0,r5,r0		/* weird error? */
-	bne-	handle_page_fault	/* if not, try to insert a HPTE */
-
-	/*
-	 * If we are in an "NMI" (e.g., an interrupt when soft-disabled), then
-	 * don't call hash_page, just fail the fault. This is required to
-	 * prevent re-entrancy problems in the hash code, namely perf
-	 * interrupts hitting while something holds H_PAGE_BUSY, and taking a
-	 * hash fault. See the comment in hash_preload().
-	 */
-	ld	r11, PACA_THREAD_INFO(r13)
-	lwz	r0,TI_PREEMPT(r11)
-	andis.	r0,r0,NMI_MASK@h
-	bne	77f
-
-	/*
-	 * r3 contains the trap number
-	 * r4 contains the faulting address
-	 * r5 contains dsisr
-	 * r6 msr
-	 *
-	 * at return r3 = 0 for success, 1 for page fault, negative for error
-	 */
-	bl	__hash_page		/* build HPTE if possible */
-        cmpdi	r3,0			/* see if __hash_page succeeded */
-
-	/* Success */
-	beq	interrupt_return	/* Return from exception on success */
-
-	/* Error */
-	blt-	13f
-
-	/* Reload DAR/DSISR into r4/r5 for handle_page_fault */
-	ld	r4,_DAR(r1)
-	ld      r5,_DSISR(r1)
-#endif /* CONFIG_PPC_BOOK3S_64 */
-
-/* Here we have a page fault that hash_page can't handle. */
-handle_page_fault:
-	addi	r3,r1,STACK_FRAME_OVERHEAD
-	bl	do_page_fault
-	cmpdi	r3,0
-	beq+	interrupt_return
-	mr	r5,r3
-	addi	r3,r1,STACK_FRAME_OVERHEAD
-	ld	r4,_DAR(r1)
-	bl	__bad_page_fault
-	b	interrupt_return
-
-#ifdef CONFIG_PPC_BOOK3S_64
-/* We have a page fault that hash_page could handle but HV refused
- * the PTE insertion
- */
-13:	mr	r5,r3
-	addi	r3,r1,STACK_FRAME_OVERHEAD
-	ld	r4,_DAR(r1)
-	bl	low_hash_fault
-	b	interrupt_return
-#endif
-
-/*
- * We come here as a result of a DSI at a point where we don't want
- * to call hash_page, such as when we are accessing memory (possibly
- * user memory) inside a PMU interrupt that occurred while interrupts
- * were soft-disabled.  We want to invoke the exception handler for
- * the access, or panic if there isn't a handler.
- */
-77:	addi	r3,r1,STACK_FRAME_OVERHEAD
-	li	r5,SIGSEGV
-	bl	bad_page_fault
-	b	interrupt_return
diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index 73b06adb6eeb..e866cae57e2f 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -1512,16 +1512,40 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap,
 }
 EXPORT_SYMBOL_GPL(hash_page);
 
-int __hash_page(unsigned long trap, unsigned long ea, unsigned long dsisr,
-		unsigned long msr)
+int do_hash_fault(struct pt_regs *regs, unsigned long ea, unsigned long dsisr)
 {
 	unsigned long access = _PAGE_PRESENT | _PAGE_READ;
 	unsigned long flags = 0;
-	struct mm_struct *mm = current->mm;
-	unsigned int region_id = get_region_id(ea);
+	struct mm_struct *mm;
+	unsigned int region_id;
+	int err;
+
+	if (unlikely(dsisr & (DSISR_BAD_FAULT_64S | DSISR_KEYFAULT)))
+		goto page_fault;
+
+	/*
+	 * If we are in an "NMI" (e.g., an interrupt when soft-disabled), then
+	 * don't call hash_page, just fail the fault. This is required to
+	 * prevent re-entrancy problems in the hash code, namely perf
+	 * interrupts hitting while something holds H_PAGE_BUSY, and taking a
+	 * hash fault. See the comment in hash_preload().
+	 *
+	 * We come here as a result of a DSI at a point where we don't want
+	 * to call hash_page, such as when we are accessing memory (possibly
+	 * user memory) inside a PMU interrupt that occurred while interrupts
+	 * were soft-disabled.  We want to invoke the exception handler for
+	 * the access, or panic if there isn't a handler.
+	 */
+	if (unlikely(in_nmi())) {
+		bad_page_fault(regs, ea, SIGSEGV);
+		return 0;
+	}
 
+	region_id = get_region_id(ea);
 	if ((region_id == VMALLOC_REGION_ID) || (region_id == IO_REGION_ID))
 		mm = &init_mm;
+	else
+		mm = current->mm;
 
 	if (dsisr & DSISR_NOHPTE)
 		flags |= HPTE_NOHPTE_UPDATE;
@@ -1537,13 +1561,31 @@ int __hash_page(unsigned long trap, unsigned long ea, unsigned long dsisr,
 	 * 2) user space access kernel space.
 	 */
 	access |= _PAGE_PRIVILEGED;
-	if ((msr & MSR_PR) || (region_id == USER_REGION_ID))
+	if (user_mode(regs) || (region_id == USER_REGION_ID))
 		access &= ~_PAGE_PRIVILEGED;
 
-	if (trap == 0x400)
+	if (regs->trap == 0x400)
 		access |= _PAGE_EXEC;
 
-	return hash_page_mm(mm, ea, access, trap, flags);
+	err = hash_page_mm(mm, ea, access, regs->trap, flags);
+	if (unlikely(err < 0)) {
+		// failed to instert a hash PTE due to an hypervisor error
+		if (user_mode(regs)) {
+			if (IS_ENABLED(CONFIG_PPC_SUBPAGE_PROT) && err == -2)
+				_exception(SIGSEGV, regs, SEGV_ACCERR, ea);
+			else
+				_exception(SIGBUS, regs, BUS_ADRERR, ea);
+		} else {
+			bad_page_fault(regs, ea, SIGBUS);
+		}
+		err = 0;
+
+	} else if (err) {
+page_fault:
+		err = do_page_fault(regs, ea, dsisr);
+	}
+
+	return err;
 }
 
 #ifdef CONFIG_PPC_MM_SLICES
@@ -1843,27 +1885,6 @@ void flush_hash_range(unsigned long number, int local)
 	}
 }
 
-/*
- * low_hash_fault is called when we the low level hash code failed
- * to instert a PTE due to an hypervisor error
- */
-void low_hash_fault(struct pt_regs *regs, unsigned long address, int rc)
-{
-	enum ctx_state prev_state = exception_enter();
-
-	if (user_mode(regs)) {
-#ifdef CONFIG_PPC_SUBPAGE_PROT
-		if (rc == -2)
-			_exception(SIGSEGV, regs, SEGV_ACCERR, address);
-		else
-#endif
-			_exception(SIGBUS, regs, BUS_ADRERR, address);
-	} else
-		bad_page_fault(regs, address, SIGBUS);
-
-	exception_exit(prev_state);
-}
-
 long hpte_insert_repeating(unsigned long hash, unsigned long vpn,
 			   unsigned long pa, unsigned long rflags,
 			   unsigned long vflags, int psize, int ssize)
-- 
2.23.0


^ permalink raw reply related

* [PATCH v6 05/39] powerpc: remove arguments from fault handler functions
From: Nicholas Piggin @ 2021-01-15 16:49 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20210115165012.1260253-1-npiggin@gmail.com>

Make mm fault handlers all just take the pt_regs * argument and load
DAR/DSISR from that. Make those that return a value return long.

This is done to make the function signatures match other handlers, which
will help with a future patch to add wrappers. Explicit arguments could
be added for performance but that would require more wrapper macro
variants.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/asm-prototypes.h     |  4 ++--
 arch/powerpc/include/asm/book3s/64/mmu-hash.h |  2 +-
 arch/powerpc/include/asm/bug.h                |  2 +-
 arch/powerpc/kernel/entry_32.S                |  7 +------
 arch/powerpc/kernel/exceptions-64e.S          |  2 --
 arch/powerpc/kernel/exceptions-64s.S          | 17 ++++-------------
 arch/powerpc/kernel/head_40x.S                | 10 +++++-----
 arch/powerpc/kernel/head_8xx.S                |  6 +++---
 arch/powerpc/kernel/head_book3s_32.S          |  5 ++---
 arch/powerpc/kernel/head_booke.h              |  4 +---
 arch/powerpc/mm/book3s64/hash_utils.c         |  8 +++++---
 arch/powerpc/mm/book3s64/slb.c                | 11 +++++++----
 arch/powerpc/mm/fault.c                       |  5 ++---
 13 files changed, 34 insertions(+), 49 deletions(-)

diff --git a/arch/powerpc/include/asm/asm-prototypes.h b/arch/powerpc/include/asm/asm-prototypes.h
index d0b832cbbec8..22c9d08fa3a4 100644
--- a/arch/powerpc/include/asm/asm-prototypes.h
+++ b/arch/powerpc/include/asm/asm-prototypes.h
@@ -82,8 +82,8 @@ void kernel_bad_stack(struct pt_regs *regs);
 void system_reset_exception(struct pt_regs *regs);
 void machine_check_exception(struct pt_regs *regs);
 void emulation_assist_interrupt(struct pt_regs *regs);
-long do_slb_fault(struct pt_regs *regs, unsigned long ea);
-void do_bad_slb_fault(struct pt_regs *regs, unsigned long ea, long err);
+long do_slb_fault(struct pt_regs *regs);
+void do_bad_slb_fault(struct pt_regs *regs);
 
 /* signals, syscalls and interrupts */
 long sys_swapcontext(struct ucontext __user *old_ctx,
diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index 60a669379aa0..b9968e297da2 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -454,7 +454,7 @@ static inline unsigned long hpt_hash(unsigned long vpn,
 #define HPTE_NOHPTE_UPDATE	0x2
 #define HPTE_USE_KERNEL_KEY	0x4
 
-int do_hash_fault(struct pt_regs *regs, unsigned long ea, unsigned long dsisr);
+long do_hash_fault(struct pt_regs *regs);
 extern int __hash_page_4K(unsigned long ea, unsigned long access,
 			  unsigned long vsid, pte_t *ptep, unsigned long trap,
 			  unsigned long flags, int ssize, int subpage_prot);
diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
index 464f8ca8a5c9..f7827e993196 100644
--- a/arch/powerpc/include/asm/bug.h
+++ b/arch/powerpc/include/asm/bug.h
@@ -111,7 +111,7 @@
 #ifndef __ASSEMBLY__
 
 struct pt_regs;
-extern int do_page_fault(struct pt_regs *, unsigned long, unsigned long);
+long do_page_fault(struct pt_regs *);
 extern void bad_page_fault(struct pt_regs *, unsigned long, int);
 void __bad_page_fault(struct pt_regs *regs, unsigned long address, int sig);
 extern void _exception(int, struct pt_regs *, int, unsigned long);
diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 238eacfda7b0..d6ea3f2d6cc0 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -276,8 +276,7 @@ reenable_mmu:
 	 * We save a bunch of GPRs,
 	 * r3 can be different from GPR3(r1) at this point, r9 and r11
 	 * contains the old MSR and handler address respectively,
-	 * r4 & r5 can contain page fault arguments that need to be passed
-	 * along as well. r0, r6-r8, r12, CCR, CTR, XER etc... are left
+	 * r0, r4-r8, r12, CCR, CTR, XER etc... are left
 	 * clobbered as they aren't useful past this point.
 	 */
 
@@ -285,15 +284,11 @@ reenable_mmu:
 	stw	r9,8(r1)
 	stw	r11,12(r1)
 	stw	r3,16(r1)
-	stw	r4,20(r1)
-	stw	r5,24(r1)
 
 	/* If we are disabling interrupts (normal case), simply log it with
 	 * lockdep
 	 */
 1:	bl	trace_hardirqs_off
-	lwz	r5,24(r1)
-	lwz	r4,20(r1)
 	lwz	r3,16(r1)
 	lwz	r11,12(r1)
 	lwz	r9,8(r1)
diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index 74d07dc0bb48..43e71d86dcbf 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -1011,8 +1011,6 @@ storage_fault_common:
 	std	r14,_DAR(r1)
 	std	r15,_DSISR(r1)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
-	mr	r4,r14
-	mr	r5,r15
 	ld	r14,PACA_EXGEN+EX_R14(r13)
 	ld	r15,PACA_EXGEN+EX_R15(r13)
 	bl	do_page_fault
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 07aba8af99d3..839dcb94eea7 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1438,10 +1438,9 @@ EXC_VIRT_BEGIN(data_access, 0x4300, 0x80)
 EXC_VIRT_END(data_access, 0x4300, 0x80)
 EXC_COMMON_BEGIN(data_access_common)
 	GEN_COMMON data_access
-	ld	r4,_DAR(r1)
-	ld	r5,_DSISR(r1)
+	ld	r4,_DSISR(r1)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
-	andis.	r0,r5,DSISR_DABRMATCH@h
+	andis.	r0,r4,DSISR_DABRMATCH@h
 	bne-	1f
 BEGIN_MMU_FTR_SECTION
 	bl	do_hash_fault
@@ -1504,10 +1503,9 @@ EXC_VIRT_BEGIN(data_access_slb, 0x4380, 0x80)
 EXC_VIRT_END(data_access_slb, 0x4380, 0x80)
 EXC_COMMON_BEGIN(data_access_slb_common)
 	GEN_COMMON data_access_slb
-	ld	r4,_DAR(r1)
-	addi	r3,r1,STACK_FRAME_OVERHEAD
 BEGIN_MMU_FTR_SECTION
 	/* HPT case, do SLB fault */
+	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	do_slb_fault
 	cmpdi	r3,0
 	bne-	1f
@@ -1519,8 +1517,6 @@ MMU_FTR_SECTION_ELSE
 ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
 	std	r3,RESULT(r1)
 	RECONCILE_IRQ_STATE(r10, r11)
-	ld	r4,_DAR(r1)
-	ld	r5,RESULT(r1)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	do_bad_slb_fault
 	b	interrupt_return
@@ -1555,8 +1551,6 @@ EXC_VIRT_BEGIN(instruction_access, 0x4400, 0x80)
 EXC_VIRT_END(instruction_access, 0x4400, 0x80)
 EXC_COMMON_BEGIN(instruction_access_common)
 	GEN_COMMON instruction_access
-	ld	r4,_DAR(r1)
-	ld	r5,_DSISR(r1)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 BEGIN_MMU_FTR_SECTION
 	bl	do_hash_fault
@@ -1602,10 +1596,9 @@ EXC_VIRT_BEGIN(instruction_access_slb, 0x4480, 0x80)
 EXC_VIRT_END(instruction_access_slb, 0x4480, 0x80)
 EXC_COMMON_BEGIN(instruction_access_slb_common)
 	GEN_COMMON instruction_access_slb
-	ld	r4,_DAR(r1)
-	addi	r3,r1,STACK_FRAME_OVERHEAD
 BEGIN_MMU_FTR_SECTION
 	/* HPT case, do SLB fault */
+	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	do_slb_fault
 	cmpdi	r3,0
 	bne-	1f
@@ -1617,8 +1610,6 @@ MMU_FTR_SECTION_ELSE
 ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
 	std	r3,RESULT(r1)
 	RECONCILE_IRQ_STATE(r10, r11)
-	ld	r4,_DAR(r1)
-	ld	r5,RESULT(r1)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	do_bad_slb_fault
 	b	interrupt_return
diff --git a/arch/powerpc/kernel/head_40x.S b/arch/powerpc/kernel/head_40x.S
index a1ae00689e0f..3c5577ac4dc8 100644
--- a/arch/powerpc/kernel/head_40x.S
+++ b/arch/powerpc/kernel/head_40x.S
@@ -179,9 +179,9 @@ _ENTRY(saved_ksp_limit)
  */
 	START_EXCEPTION(0x0300,	DataStorage)
 	EXCEPTION_PROLOG
-	mfspr	r5, SPRN_ESR		/* Grab the ESR, save it, pass arg3 */
+	mfspr	r5, SPRN_ESR		/* Grab the ESR, save it */
 	stw	r5, _ESR(r11)
-	mfspr	r4, SPRN_DEAR		/* Grab the DEAR, save it, pass arg2 */
+	mfspr	r4, SPRN_DEAR		/* Grab the DEAR, save it */
 	stw	r4, _DEAR(r11)
 	EXC_XFER_LITE(0x300, handle_page_fault)
 
@@ -191,9 +191,9 @@ _ENTRY(saved_ksp_limit)
  */
 	START_EXCEPTION(0x0400, InstructionAccess)
 	EXCEPTION_PROLOG
-	mr	r4,r12			/* Pass SRR0 as arg2 */
-	stw	r4, _DEAR(r11)
-	li	r5,0			/* Pass zero as arg3 */
+	li	r5,0
+	stw	r5, _ESR(r11)		/* Zero ESR */
+	stw	r12, _DEAR(r11)		/* SRR0 as DEAR */
 	EXC_XFER_LITE(0x400, handle_page_fault)
 
 /* 0x0500 - External Interrupt Exception */
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 52702f3db6df..0b2c247cfdff 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -312,14 +312,14 @@ DataStoreTLBMiss:
 	. = 0x1300
 InstructionTLBError:
 	EXCEPTION_PROLOG
-	mr	r4,r12
 	andis.	r5,r9,DSISR_SRR1_MATCH_32S@h /* Filter relevant SRR1 bits */
 	andis.	r10,r9,SRR1_ISI_NOPT@h
 	beq+	.Litlbie
-	tlbie	r4
+	tlbie	r12
 	/* 0x400 is InstructionAccess exception, needed by bad_page_fault() */
 .Litlbie:
-	stw	r4, _DAR(r11)
+	stw	r12, _DAR(r11)
+	stw	r5, _DSISR(r11)
 	EXC_XFER_LITE(0x400, handle_page_fault)
 
 /* This is the data TLB error on the MPC8xx.  This could be due to
diff --git a/arch/powerpc/kernel/head_book3s_32.S b/arch/powerpc/kernel/head_book3s_32.S
index fc9a12768a14..94ad1372c490 100644
--- a/arch/powerpc/kernel/head_book3s_32.S
+++ b/arch/powerpc/kernel/head_book3s_32.S
@@ -352,9 +352,9 @@ BEGIN_MMU_FTR_SECTION
 	bl	hash_page
 END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
 #endif	/* CONFIG_VMAP_STACK */
-1:	mr	r4,r12
 	andis.	r5,r9,DSISR_SRR1_MATCH_32S@h /* Filter relevant SRR1 bits */
-	stw	r4, _DAR(r11)
+	stw	r5, _DSISR(r11)
+	stw	r12, _DAR(r11)
 	EXC_XFER_LITE(0x400, handle_page_fault)
 
 /* External interrupt */
@@ -676,7 +676,6 @@ handle_page_fault_tramp_1:
 #ifdef CONFIG_VMAP_STACK
 	EXCEPTION_PROLOG_2 handle_dar_dsisr=1
 #endif
-	lwz	r4, _DAR(r11)
 	lwz	r5, _DSISR(r11)
 	/* fall through */
 handle_page_fault_tramp_2:
diff --git a/arch/powerpc/kernel/head_booke.h b/arch/powerpc/kernel/head_booke.h
index 74e230c200fb..0fbdacc7fab7 100644
--- a/arch/powerpc/kernel/head_booke.h
+++ b/arch/powerpc/kernel/head_booke.h
@@ -476,9 +476,7 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_EMB_HV)
 	NORMAL_EXCEPTION_PROLOG(INST_STORAGE);		      \
 	mfspr	r5,SPRN_ESR;		/* Grab the ESR and save it */	      \
 	stw	r5,_ESR(r11);						      \
-	mr      r4,r12;                 /* Pass SRR0 as arg2 */		      \
-	stw	r4, _DEAR(r11);						      \
-	li      r5,0;                   /* Pass zero as arg3 */		      \
+	stw	r12, _DEAR(r11);	/* Pass SRR0 as arg2 */		      \
 	EXC_XFER_LITE(0x0400, handle_page_fault)
 
 #define ALIGNMENT_EXCEPTION						      \
diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index e866cae57e2f..9a499af3eebf 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -1512,13 +1512,15 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap,
 }
 EXPORT_SYMBOL_GPL(hash_page);
 
-int do_hash_fault(struct pt_regs *regs, unsigned long ea, unsigned long dsisr)
+long do_hash_fault(struct pt_regs *regs)
 {
+	unsigned long ea = regs->dar;
+	unsigned long dsisr = regs->dsisr;
 	unsigned long access = _PAGE_PRESENT | _PAGE_READ;
 	unsigned long flags = 0;
 	struct mm_struct *mm;
 	unsigned int region_id;
-	int err;
+	long err;
 
 	if (unlikely(dsisr & (DSISR_BAD_FAULT_64S | DSISR_KEYFAULT)))
 		goto page_fault;
@@ -1582,7 +1584,7 @@ int do_hash_fault(struct pt_regs *regs, unsigned long ea, unsigned long dsisr)
 
 	} else if (err) {
 page_fault:
-		err = do_page_fault(regs, ea, dsisr);
+		err = do_page_fault(regs);
 	}
 
 	return err;
diff --git a/arch/powerpc/mm/book3s64/slb.c b/arch/powerpc/mm/book3s64/slb.c
index 584567970c11..985902ce0272 100644
--- a/arch/powerpc/mm/book3s64/slb.c
+++ b/arch/powerpc/mm/book3s64/slb.c
@@ -813,8 +813,9 @@ static long slb_allocate_user(struct mm_struct *mm, unsigned long ea)
 	return slb_insert_entry(ea, context, flags, ssize, false);
 }
 
-long do_slb_fault(struct pt_regs *regs, unsigned long ea)
+long do_slb_fault(struct pt_regs *regs)
 {
+	unsigned long ea = regs->dar;
 	unsigned long id = get_region_id(ea);
 
 	/* IRQs are not reconciled here, so can't check irqs_disabled */
@@ -865,13 +866,15 @@ long do_slb_fault(struct pt_regs *regs, unsigned long ea)
 	}
 }
 
-void do_bad_slb_fault(struct pt_regs *regs, unsigned long ea, long err)
+void do_bad_slb_fault(struct pt_regs *regs)
 {
+	int err = regs->result;
+
 	if (err == -EFAULT) {
 		if (user_mode(regs))
-			_exception(SIGSEGV, regs, SEGV_BNDERR, ea);
+			_exception(SIGSEGV, regs, SEGV_BNDERR, regs->dar);
 		else
-			bad_page_fault(regs, ea, SIGSEGV);
+			bad_page_fault(regs, regs->dar, SIGSEGV);
 	} else if (err == -EINVAL) {
 		unrecoverable_exception(regs);
 	} else {
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 8961b44f350c..273ff845eccf 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -542,12 +542,11 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 }
 NOKPROBE_SYMBOL(__do_page_fault);
 
-int do_page_fault(struct pt_regs *regs, unsigned long address,
-		  unsigned long error_code)
+long do_page_fault(struct pt_regs *regs)
 {
 	const struct exception_table_entry *entry;
 	enum ctx_state prev_state = exception_enter();
-	int rc = __do_page_fault(regs, address, error_code);
+	int rc = __do_page_fault(regs, regs->dar, regs->dsisr);
 	exception_exit(prev_state);
 	if (likely(!rc))
 		return 0;
-- 
2.23.0


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox