* Re: Fwd: [powerpc/Baremetal]Kernel OOPS while executing memory hotplug on Power8 baremetal
From: vrbagal1 @ 2018-06-07 10:37 UTC (permalink / raw)
To: Bart Van Assche, axboe, kent.overstreet, snitzer, linux-block
Cc: linux-scsi, linuxppc-dev, sachinp, Linuxppc-dev
In-Reply-To: <042fc8ee69b74c57815c0edfdbb253495e9d7718.camel@wdc.com>
On 2018-06-07 13:12, Bart Van Assche wrote:
> On Thu, 2018-06-07 at 12:56 +0530, Venkat Rao B wrote:
>> On Thursday 07 June 2018 12:46 PM, Bart Van Assche wrote:
>> > On Thu, 2018-06-07 at 12:38 +0530, vrbagal1 wrote:
>> > > Observing Kernel oops and machine reboots while executing memory hotplug
>> > > test case, on Power8 Baremetal machine.
>> > >
>> > > I see this is introduced some where between rc6 and 4.17.
>> >
>> > Please provide the exact versions (git commit IDs) of the kernel versions
>> > you have tested.
>>
>> Commit Id ---> 5037be168f
>
> The reason I was asking for the commit ID is because I saw that clone_endio()
> occurs in the oops, which means that the dm driver is involved. An important fix
> for the dm driver went upstream recently, namely d37753540568 ("dm: Use kzalloc
> for all structs with embedded biosets/mempools"). Can you double check whether
> that commit is present in your tree? If it is not present, please update to the
> latest master and retest. If it is present, please report how to reproduce
> this oops to Kent Overstreet, Jens Axboe, linux-block and Mike Snitzer.
>
> Thanks,
>
> Bart.
Yes, the fix is present in the tree that I tested.
Steps to reproduce:
Step 1: Clone and install avocado:
git clone https://github.com/avocado-framework/avocado.git
Step 2: Clone the test repository:
git clone https://github.com/avocado-framework-tests/avocado-misc-tests.git
The test case is
https://github.com/avocado-framework-tests/avocado-misc-tests/blob/master/memory/memhotplug.py
Step 3: Run the test:
avocado run avocado-misc-tests/memory/memhotplug.py
Regards,
Venkat.
^ permalink raw reply
* Re: [v2 PATCH 0/5] powerpc/pseries: Machien check handler improvements.
From: Nicholas Piggin @ 2018-06-07 10:45 UTC (permalink / raw)
To: Mahesh J Salgaonkar; +Cc: linuxppc-dev, Laurent Dufour, Aneesh Kumar K.V
In-Reply-To: <152836568375.29173.3046879842311381046.stgit@jupiter.in.ibm.com>
On Thu, 07 Jun 2018 15:36:25 +0530
Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> wrote:
> This patch series includes some improvements to the machine check handler
> for pseries. Patch 1 fixes an issue where the machine check handler crashes
> the kernel while accessing a vmalloc-ed buffer in NMI context.
> Patch 3 dumps the SLB contents on SLB MCE errors to improve debuggability.
> Patch 4 displays the MCE error details on the console.
>
> Change in V2:
> - patch 4: Display additional info (NIP and task info) in MCE error details.
> - patch 5: Fix endian bug while restoring r3 in MCE handler.
>
> ---
>
> Mahesh Salgaonkar (5):
> powerpc/pseries: convert rtas_log_buf to linear allocation.
> powerpc/pseries: Define MCE error event section.
> powerpc/pseries: Dump and flush SLB contents on SLB MCE errors.
> powerpc/pseries: Display machine check error details.
> powerpc/pseries: Fix endainness while restoring of r3 in MCE handler.
These look good. Should patch 5 be moved to patch 2, and the first 2
patches marked for stable?
Do you also plan to dump SLB contents for bare metal MCEs?
Thanks,
Nick
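As a general illustration of the class of bug that patch 5 names (not the patch itself, which the cover letter does not show): if a saved register value sits in memory in big-endian byte order, a plain native load on a little-endian kernel returns it byte-swapped. The sketch below decodes such a value independently of host byte order. The name load_be64 and the premise that the saved r3 is stored big-endian are assumptions made for the example; kernel code would normally use be64_to_cpu() for this conversion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode a 64-bit big-endian value from a byte buffer.
 * Works the same on little-endian and big-endian hosts because it
 * assembles the result byte by byte instead of casting the pointer.
 * Hypothetical helper, for illustration only.
 */
uint64_t load_be64(const unsigned char *buf)
{
	uint64_t value = 0;
	size_t i;

	assert(buf != NULL);

	for (i = 0; i < 8; i++)
		value = (value << 8) | buf[i];  /* most significant byte first */
	return value;
}
```

Reading the same eight bytes through a uint64_t pointer would only give the same result on a big-endian host, which is why a handler built for both endiannesses needs the explicit conversion.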
^ permalink raw reply
* Re: [RFC PATCH -tip v5 07/27] powerpc/kprobes: Remove jprobe powerpc implementation
From: Naveen N. Rao @ 2018-06-07 11:31 UTC (permalink / raw)
To: Masami Hiramatsu, Ingo Molnar, Thomas Gleixner
Cc: Andrew Morton, Ananth N Mavinakayanahalli, Benjamin Herrenschmidt,
H . Peter Anvin, linux-arch, linux-kernel, linuxppc-dev,
Ingo Molnar, Michael Ellerman, Paul Mackerras, Steven Rostedt
In-Reply-To: <152812751377.10068.6090934299713110701.stgit@devbox>
Masami Hiramatsu wrote:
> Remove arch dependent setjump/longjump functions
> and unused fields in kprobe_ctlblk for jprobes
> from arch/powerpc. This also reverts commits
> related to the __is_active_jprobe() function.
>
> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>
> Cc: linuxppc-dev@lists.ozlabs.org
> ---
> arch/powerpc/include/asm/kprobes.h | 2 -
> arch/powerpc/kernel/kprobes-ftrace.c | 15 -------
> arch/powerpc/kernel/kprobes.c | 54 ------------------------
> arch/powerpc/kernel/trace/ftrace_64_mprofile.S | 39 ++---------------
> 4 files changed, 5 insertions(+), 105 deletions(-)
LGTM.
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
- Naveen
^ permalink raw reply
* Re: [RFC PATCH -tip v5 18/27] powerpc/kprobes: Don't call the ->break_handler() in arm kprobes code
From: Naveen N. Rao @ 2018-06-07 11:37 UTC (permalink / raw)
To: Masami Hiramatsu, Ingo Molnar, Thomas Gleixner
Cc: Andrew Morton, H . Peter Anvin, linux-arch, linux-kernel,
linuxppc-dev, Ingo Molnar, Paul Mackerras, Steven Rostedt
In-Reply-To: <152812783350.10068.4690566636762511152.stgit@devbox>
Masami Hiramatsu wrote:
> Don't call the ->break_handler() from the arm kprobes code,
^^^ powerpc
> because it was only used by jprobes which got removed.
>
> This also makes skip_singlestep() a static function since
> only kprobes-ftrace.c is using this function.
>
> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>
> Cc: linuxppc-dev@lists.ozlabs.org
> ---
> arch/powerpc/include/asm/kprobes.h | 10 ----------
> arch/powerpc/kernel/kprobes-ftrace.c | 16 +++-------------
> arch/powerpc/kernel/kprobes.c | 31 +++++++++++-------------------------
> 3 files changed, 14 insertions(+), 43 deletions(-)
With 2 small comments...
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
- Naveen
>
> diff --git a/arch/powerpc/include/asm/kprobes.h b/arch/powerpc/include/asm/kprobes.h
> index 674036db558b..785c464b6588 100644
> --- a/arch/powerpc/include/asm/kprobes.h
> +++ b/arch/powerpc/include/asm/kprobes.h
> @@ -102,16 +102,6 @@ extern int kprobe_exceptions_notify(struct notifier_block *self,
> extern int kprobe_fault_handler(struct pt_regs *regs, int trapnr);
> extern int kprobe_handler(struct pt_regs *regs);
> extern int kprobe_post_handler(struct pt_regs *regs);
> -#ifdef CONFIG_KPROBES_ON_FTRACE
> -extern int skip_singlestep(struct kprobe *p, struct pt_regs *regs,
> - struct kprobe_ctlblk *kcb);
> -#else
> -static inline int skip_singlestep(struct kprobe *p, struct pt_regs *regs,
> - struct kprobe_ctlblk *kcb)
> -{
> - return 0;
> -}
> -#endif
> #else
> static inline int kprobe_handler(struct pt_regs *regs) { return 0; }
> static inline int kprobe_post_handler(struct pt_regs *regs) { return 0; }
> diff --git a/arch/powerpc/kernel/kprobes-ftrace.c b/arch/powerpc/kernel/kprobes-ftrace.c
> index 1b316331c2d9..3869b0e5d5c7 100644
> --- a/arch/powerpc/kernel/kprobes-ftrace.c
> +++ b/arch/powerpc/kernel/kprobes-ftrace.c
> @@ -26,8 +26,8 @@
> #include <linux/ftrace.h>
>
> static nokprobe_inline
> -int __skip_singlestep(struct kprobe *p, struct pt_regs *regs,
> - struct kprobe_ctlblk *kcb, unsigned long orig_nip)
> +int skip_singlestep(struct kprobe *p, struct pt_regs *regs,
> + struct kprobe_ctlblk *kcb, unsigned long orig_nip)
> {
> /*
> * Emulate singlestep (and also recover regs->nip)
> @@ -44,16 +44,6 @@ int __skip_singlestep(struct kprobe *p, struct pt_regs *regs,
> return 1;
> }
>
> -int skip_singlestep(struct kprobe *p, struct pt_regs *regs,
> - struct kprobe_ctlblk *kcb)
> -{
> - if (kprobe_ftrace(p))
> - return __skip_singlestep(p, regs, kcb, 0);
> - else
> - return 0;
> -}
> -NOKPROBE_SYMBOL(skip_singlestep);
> -
> /* Ftrace callback handler for kprobes */
> void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
> struct ftrace_ops *ops, struct pt_regs *regs)
> @@ -82,7 +72,7 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
> __this_cpu_write(current_kprobe, p);
> kcb->kprobe_status = KPROBE_HIT_ACTIVE;
> if (!p->pre_handler || !p->pre_handler(p, regs))
> - __skip_singlestep(p, regs, kcb, orig_nip);
> + skip_singlestep(p, regs, kcb, orig_nip);
We can probably get rid of skip_singlestep() completely along with
orig_nip since instructions are always 4 bytes on powerpc. So, the
changes we do to nip should help to recover the value automatically.
- Naveen
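A minimal user-space sketch of the suggestion above, assuming the handler is entered with nip already one instruction past the probed address. Because every powerpc instruction is 4 bytes, rewinding nip by 4 for the pre-handler and adding 4 afterwards restores the entry value without saving it in a separate orig_nip. All names here (mock_regs, mock_probe_hit, the two example handlers) are hypothetical stand-ins, not the kernel's types or functions.

```c
#include <assert.h>
#include <stddef.h>

/* Every powerpc instruction is 4 bytes wide. */
#define MOCK_INSN_SIZE 4UL

/* Hypothetical stand-in for struct pt_regs; only nip is modelled. */
struct mock_regs {
	unsigned long nip;
};

typedef int (*mock_pre_handler_t)(struct mock_regs *regs);

/* Last nip value seen by mock_observe(), kept for inspection. */
unsigned long mock_seen_nip;

/* Example pre-handler: records nip and asks for normal processing. */
int mock_observe(struct mock_regs *regs)
{
	mock_seen_nip = regs->nip;
	return 0;
}

/* Example pre-handler: redirects execution and returns non-zero. */
int mock_redirect(struct mock_regs *regs)
{
	regs->nip = 0x2000UL;
	return 1;
}

/*
 * Model of a probe hit that arrives with nip already past the probed
 * instruction. nip is rewound for the pre-handler, then advanced by the
 * fixed instruction width unless the handler changed the flow itself.
 */
unsigned long mock_probe_hit(struct mock_regs *regs, mock_pre_handler_t pre)
{
	assert(regs != NULL);

	regs->nip -= MOCK_INSN_SIZE;          /* point at the probed instruction */
	if (pre == NULL || pre(regs) == 0)
		regs->nip += MOCK_INSN_SIZE;  /* emulate the single-step */
	return regs->nip;
}
```

With a variable-length instruction set the second adjustment would need the decoded instruction length, which is the case where a saved copy of the original nip is the simpler option.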
^ permalink raw reply
* Re: [RFC PATCH -tip v5 24/27] bpf: error-inject: kprobes: Clear current_kprobe and enable preempt in kprobe
From: Naveen N. Rao @ 2018-06-07 11:42 UTC (permalink / raw)
To: Masami Hiramatsu, Ingo Molnar, Thomas Gleixner
Cc: Andrew Morton, Alexei Starovoitov, Catalin Marinas, Rich Felker,
David S. Miller, Fenghua Yu, Heiko Carstens, H . Peter Anvin,
Josef Bacik, James Hogan, linux-arch, linux-arm-kernel,
Russell King, linux-ia64, linux-kernel, linux-mips, linuxppc-dev,
linux-s390, linux-sh, linux-snps-arc, Ingo Molnar, Paul Mackerras,
Ralf Baechle, Steven Rostedt, Martin Schwidefsky, sparclinux,
Tony Luck, Vineet Gupta, Will Deacon, x86, Yoshinori Sato
In-Reply-To: <152812800822.10068.3306094708706993432.stgit@devbox>
Masami Hiramatsu wrote:
> Clear current_kprobe and enable preemption in kprobe
> even if pre_handler returns !0.
>
> This simplifies function override using kprobes.
>
> Jprobe used to require keeping preemption disabled and
> keeping current_kprobe until it returned to the original function
> entry. For this reason kprobe_int3_handler() and similar
> arch dependent kprobe handlers check the pre_handler result
> and exit without enabling preemption if the result is !0.
>
> After removing the jprobe, Kprobes does not need to
> keep preempt disabled even if user handler returns !0
> anymore.
>
> But since the function override handler in error-inject
> and bpf also returns !0 if it overrides a function,
> to balance the preempt count, it enables preemption
> and resets current kprobe by itself.
>
> That is a bad design that is very buggy. This fixes
> such unbalanced preempt-count and current_kprobes setting
> in kprobes, bpf and error-inject.
>
> Note: for powerpc and x86, this removes all preempt_disable
> from kprobe_ftrace_handler because ftrace callbacks are
> called under preempt disabled.
>
> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
> Cc: Vineet Gupta <vgupta@synopsys.com>
> Cc: Russell King <linux@armlinux.org.uk>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Tony Luck <tony.luck@intel.com>
> Cc: Fenghua Yu <fenghua.yu@intel.com>
> Cc: Ralf Baechle <ralf@linux-mips.org>
> Cc: James Hogan <jhogan@kernel.org>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
> Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
> Cc: Rich Felker <dalias@libc.org>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>
> Cc: Josef Bacik <jbacik@fb.com>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: x86@kernel.org
> Cc: linux-snps-arc@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-ia64@vger.kernel.org
> Cc: linux-mips@linux-mips.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-s390@vger.kernel.org
> Cc: linux-sh@vger.kernel.org
> Cc: sparclinux@vger.kernel.org
> ---
> Changes in v5:
> - Fix kprobe_ftrace_handler in arch/powerpc too.
> ---
> arch/arc/kernel/kprobes.c | 5 +++--
> arch/arm/probes/kprobes/core.c | 10 +++++-----
> arch/arm64/kernel/probes/kprobes.c | 10 +++++-----
> arch/ia64/kernel/kprobes.c | 13 ++++---------
> arch/mips/kernel/kprobes.c | 4 ++--
> arch/powerpc/kernel/kprobes-ftrace.c | 15 ++++++---------
> arch/powerpc/kernel/kprobes.c | 7 +++++--
For the powerpc bits:
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Thanks,
Naveen
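The balancing rule described in the quoted commit message can be modelled in user space roughly as follows. This is a sketch under stated assumptions, not the kernel code: the counter, the current-probe pointer and every mock_ name are hypothetical. It shows the core handler undoing its own preempt-disable and current_kprobe setup when the pre-handler returns non-zero, so that an overriding handler no longer has to do that unwinding itself.

```c
#include <assert.h>
#include <stddef.h>

struct mock_kprobe;
typedef int (*mock_pre_handler_t)(struct mock_kprobe *p);

/* Hypothetical stand-in for struct kprobe. */
struct mock_kprobe {
	mock_pre_handler_t pre_handler;
};

/* Stand-ins for the preempt count and the per-CPU current_kprobe. */
int mock_preempt_count;
struct mock_kprobe *mock_current_kprobe;

/* Example handler that overrides the probed function (returns !0). */
int mock_override_handler(struct mock_kprobe *p)
{
	(void)p;
	return 1;
}

/* Normal path: after the single-step, post-processing unwinds the state. */
void mock_post_singlestep(void)
{
	mock_current_kprobe = NULL;
	mock_preempt_count--;
}

/* Returns 1 if the pre-handler overrode the function, 0 otherwise. */
int mock_handle_probe(struct mock_kprobe *p)
{
	assert(p != NULL);

	mock_preempt_count++;                 /* models preempt_disable() */
	mock_current_kprobe = p;

	if (p->pre_handler == NULL || p->pre_handler(p) == 0) {
		mock_post_singlestep();       /* single-step, then unwind */
		return 0;
	}

	/* Override path: the core unwinds, not the user handler. */
	mock_current_kprobe = NULL;
	mock_preempt_count--;
	return 1;
}
```

Either path leaves the counter where it started, which is the invariant the commit message says was previously maintained by the override handlers themselves.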
^ permalink raw reply
* [GIT PULL] Please pull powerpc/linux.git powerpc-4.18-1 tag
From: Michael Ellerman @ 2018-06-07 12:00 UTC (permalink / raw)
To: Linus Torvalds
Cc: aik, akshay.adiga, alastair, andrew.donnellan, aneesh.kumar,
aneesh.kumar, anju, anton, arnd, bauerman, bsingharora,
christophe.leroy, clg, colin.king, dale, duwe, ego, fabio.estevam,
fbarrat, fthain, haren, hbathini, j.neuschaefer, jpoimboe,
jrdr.linux, linux-kernel, linux, linuxppc-dev, linuxram, maddy,
mahesh, malat, mgreer, mikey, msuchanek, naveen.n.rao, npiggin,
olof, paul.gortmaker, paulus, peda, ravi.bangoria, rdunlap, remi,
rostedt, ruscur, sbobroff, shilpa.bhat, stewart, vaibhav, vaibhav,
viro, wei.guo.simon, weiyongjun1, wsa+renesas, xieyisheng1,
yuehaibing
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
Hi Linus,
Please pull powerpc updates for 4.18:
The following changes since commit 6da6c0db5316275015e8cc2959f12a17584aeb64:
Linux v4.17-rc3 (2018-04-29 14:17:42 -0700)
are available in the git repository at:
https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git tags/powerpc-4.18-1
for you to fetch changes up to ff5bc793e47b537bf3e904fada585e102c54dd8b:
powerpc/64s/radix: Fix missing ptesync in flush_cache_vmap (2018-06-06 18:50:53 +1000)
- ------------------------------------------------------------------
powerpc updates for 4.18
Notable changes:
- Support for split PMD page table lock on 64-bit Book3S (Power8/9).
- Add support for HAVE_RELIABLE_STACKTRACE, so we properly support live
patching again.
- Add support for patching barrier_nospec in copy_from_user() and syscall entry.
- A couple of fixes for our data breakpoints on Book3S.
- A series from Nick optimising TLB/mm handling with the Radix MMU.
- Numerous small cleanups to squash sparse/gcc warnings from Mathieu Malaterre.
- Several series optimising various parts of the 32-bit code from Christophe Leroy.
- Removal of support for two old machines, "SBC834xE" and "C2K" ("GEFanuc,C2K"),
which is why the diffstat has so many deletions.
And many other small improvements & fixes.
There's a few out-of-area changes. Some minor ftrace changes OK'ed by Steve, and
a fix to our powernv cpuidle driver. Then there's a series touching mm, x86 and
fs/proc/task_mmu.c, which cleans up some details around pkey support. It was
ack'ed/reviewed by Ingo & Dave and has been in next for several weeks.
Thanks to:
Akshay Adiga, Alastair D'Silva, Alexey Kardashevskiy, Al Viro, Andrew
Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Arnd Bergmann, Balbir Singh,
Cédric Le Goater, Christophe Leroy, Christophe Lombard, Colin Ian King, Dave
Hansen, Fabio Estevam, Finn Thain, Frederic Barrat, Gautham R. Shenoy, Haren
Myneni, Hari Bathini, Ingo Molnar, Jonathan Neuschäfer, Josh Poimboeuf,
Kamalesh Babulal, Madhavan Srinivasan, Mahesh Salgaonkar, Mark Greer, Mathieu
Malaterre, Matthew Wilcox, Michael Neuling, Michal Suchanek, Naveen N. Rao,
Nicholas Piggin, Nicolai Stange, Olof Johansson, Paul Gortmaker, Paul
Mackerras, Peter Rosin, Pridhiviraj Paidipeddi, Ram Pai, Rashmica Gupta, Ravi
Bangoria, Russell Currey, Sam Bobroff, Samuel Mendoza-Jonas, Segher
Boessenkool, Shilpasri G Bhat, Simon Guo, Souptick Joarder, Stewart Smith,
Thiago Jung Bauermann, Torsten Duwe, Vaibhav Jain, Wei Yongjun, Wolfram Sang,
Yisheng Xie, YueHaibing.
- ------------------------------------------------------------------
Akshay Adiga (1):
powerpc/powernv/cpuidle: Init all present cpus for deep states
Al Viro (6):
powerpc/syscalls: Switch trivial cases to SYSCALL_DEFINE
powerpc/syscalls: signal_{32, 64} - switch to SYSCALL_DEFINE
powerpc/syscalls: switch rtas(2) to SYSCALL_DEFINE
powerpc/syscalls: kill ppc32_select()
powerpc/syscalls: timer_create can be handle by perfectly normal COMPAT_SYS_SPU
powerpc/ptrace: Use copy_{from, to}_user() rather than open-coding
Alastair D'Silva (7):
powerpc: Add TIDR CPU feature for POWER9
powerpc: Use TIDR CPU feature to control TIDR allocation
powerpc: use task_pid_nr() for TID allocation
ocxl: Rename pnv_ocxl_spa_remove_pe to clarify it's action
ocxl: Expose the thread_id needed for wait on POWER9
ocxl: Add an IOCTL so userspace knows what OCXL features are available
ocxl: Document new OCXL IOCTLs
Alexey Kardashevskiy (2):
powerpc/ioda: Use ibm, supported-tce-sizes for IOMMU page size mask
powerpc/powernv/ioda2: Remove redundant free of TCE pages
Aneesh Kumar K.V (20):
powerpc/kvm: Switch kvm pmd allocator to custom allocator
powerpc/mm/book3s64: Move book3s64 code to pgtable-book3s64
powerpc/mm: Use pmd_lockptr instead of opencoding it
powerpc/mm: Rename pte fragment functions
powerpc/mm/book3e/64: Remove unsupported 64Kpage size from 64bit booke
powerpc/mm/nohash: Remove pte fragment dependency from nohash
powerpc/mm/book3s64/4k: Switch 4k pagesize config to use pagetable fragment
powerpc/book3s64/mm: Simplify the rcu callback for page table free
powerpc/mm: Implement helpers for pagetable fragment support at PMD level
powerpc/mm: Use page fragments for allocation page table at PMD level
powerpc/book3s64: Enable split pmd ptlock.
powerpc/livepatch: Fix build error with kprobes disabled.
powerpc/mm: Fix kernel crash on page table free
powerpc/mm/hugetlb: Update huge_ptep_set_access_flags to call __ptep_set_access_flags directly
powerpc/mm/radix: Move function from radix.h to pgtable-radix.c
powerpc/mm: Change function prototype
powerpc/mm/radix: Change pte relax sequence to handle nest MMU hang
powerpc/mm/hash: Add missing isync prior to kernel stack SLB switch
powerpc/mm/hugetlb: Update hugetlb related locks
powerpc/mm/hash: hard disable irq in the SLB insert path
Anju T Sudhakar (5):
powerpc/perf: Fix memory allocation for core-imc based on num_possible_cpus()
powerpc/perf: Rearrange memory freeing in imc init
powerpc/perf: Replace the direct return with goto statement
powerpc/perf: Return appropriate value for unknown domain
powerpc/perf: Unregister thread-imc if core-imc not supported
Arnd Bergmann (5):
powerpc: always enable RTC_LIB
powerpc: rtas: clean up time handling
powerpc: use time64_t in read_persistent_clock
powerpc: use time64_t in update_persistent_clock
powerpc: remove unused to_tm() helper
Balbir Singh (1):
Revert "powerpc/powernv: Increase memory block size to 1GB on radix"
Christophe Leroy (30):
powerpc/nohash: Remove hash related code from nohash headers.
powerpc/nohash: Remove _PAGE_BUSY
powerpc/nohash: Use IS_ENABLED() to simplify __set_pte_at()
powerpc: get rid of PMD_PAGE_SIZE() and _PMD_SIZE
Revert "powerpc/64: Fix checksum folding in csum_add()"
powerpc: Avoid an unnecessary test and branch in longjmp()
powerpc/32: Use stmw/lmw for registers save/restore in asm
powerpc/mm: Use instruction symbolic names in store_updates_sp()
powerpc/mm: Only read faulting instruction when necessary in do_page_fault()
powerpc/8xx: fix invalid register expression in head_8xx.S
powerpc/64: Fix strncpy() related build failures with GCC 8.1
powerpc: Fix build by disabling attribute-alias warning for SYSCALL_DEFINEx
powerpc/dma: remove unnecessary BUG()
powerpc/mm: constify FIRST_CONTEXT in mmu_context_nohash
powerpc/mm: Avoid unnecessary test and reduce code size
powerpc/mm: constify LAST_CONTEXT in mmu_context_nohash
powerpc/mm: Remove stale_map[] handling on non SMP processors
powerpc/64: optimises from64to32()
powerpc/misc: merge reloc_offset() and add_reloc_offset()
powerpc/boot: remove unused variable in mpc8xx
powerpc/8xx: Remove RTC clock on 88x
powerpc/signal32: Use fault_in_pages_readable() to prefault user context
powerpc/lib: Adjust .balign inside string functions for PPC32
powerpc/32: Optimise __csum_partial()
powerpc: Implement csum_ipv6_magic in assembly
powerpc/Makefile: set -mcpu=860 flag for the 8xx
powerpc/time: inline arch_vtime_task_switch()
powerpc/lib: optimise 32 bits __clear_user()
powerpc/lib: optimise PPC32 memcmp
powerpc: fix build failure by disabling attribute-alias warning in pci_32
Colin Ian King (4):
macintosh/windfarm: fix spelling mistake: "ttarged" -> "ttarget"
powerpc/rtas: Fix spelling mistake "Discharching" -> "Discharging"
powerpc: fix spelling mistake: "Usupported" -> "Unsupported"
powerpc-opal: fix spelling mistake "Uniterrupted" -> "Uninterrupted"
Cédric Le Goater (4):
powerpc/64/kexec: fix race in kexec when XIVE is shutdown
powerpc/xive: fix hcall H_INT_RESET to support long busy delays
powerpc/xive: shutdown XIVE when kexec or kdump is performed
powerpc/xive: prepare all hcalls to support long busy delays
Fabio Estevam (1):
powerpc: cpm_gpio: Remove owner assignment from platform_driver
Finn Thain (1):
powerpc/lib: Fix "integer constant is too large" build failure
Gautham R. Shenoy (1):
cpuidle: powernv: Fix promotion from snooze if next state disabled
Haren Myneni (1):
powerpc/powernv: copy/paste - Mask SO bit in CR
Hari Bathini (1):
powerpc/fadump: Do not use hugepages when fadump is active
Jonathan Neuschäfer (6):
powerpc: wii_defconfig: Disable Ethernet driver support code
powerpc: wii_defconfig: Enable GPIO-related options
powerpc: wii_defconfig: Enable Wii SDHCI driver
powerpc: wii_defconfig: Disable BCMA support
powerpc/embedded6xx/flipper-pic: Don't match all IRQ domains
powerpc/embedded6xx/hlwd-pic: Prevent interrupts from being handled by Starlet
Josh Poimboeuf (1):
powerpc/modules: remove unused mod_arch_specific.toc field
Madhavan Srinivasan (1):
powerpc/perf: Update raw-event code encoding comment for power8
Mahesh Salgaonkar (2):
powerpc/fadump: exclude memory holes while reserving memory in second kernel
powerpc/fadump: Unregister fadump on kexec down path.
Mark Greer (5):
powerpc/embedded6xx: Remove C2K board support
powerpc/boot: Remove support for Marvell MPSC serial controller
powerpc/boot: Remove support for Marvell mv64x60 i2c controller
powerpc/boot: Remove core support for Marvell mv64x60 hostbridges
powerpc: Remove core support for Marvell mv64x60 hostbridges
Mathieu Malaterre (21):
powerpc/kvm: Prefer fault_in_pages_readable function
powerpc/xmon: Add __printf annotation to xmon_printf()
powerpc: Add __printf verification to prom_printf
powerpc/altivec: Add missing prototypes for altivec
powerpc/sparse: Fix plain integer as NULL pointer warning
powerpc/mm/radix: Use do/while(0) trick for single statement block
powerpc/wii: Make hlwd_pic_init function static
powerpc/chrp/setup: Remove idu_size variable and make some functions static
powerpc/powermac: Mark variable x as unused
powerpc/chrp/pci: Make some functions static
powerpc/powermac: Move pmac_pfunc_base_install prototype to header file
powerpc/powermac: Add missing prototype for note_bootable_part()
powerpc/52xx: Add missing functions prototypes
powerpc: Add missing prototype
powerpc/tau: Synchronize function prototypes and body
powerpc: Make function btext_initialize static
powerpc/tau: Make some function static
powerpc/chrp/time: Make some functions static, add missing header include
powerpc/32: Add a missing include header
powerpc: Add a missing include header
powerpc/prom: Fix %u/%llx usage since prom_printf() change
Michael Ellerman (36):
powerpc: Only support DYNAMIC_FTRACE not static
tracing: Remove PPC32 wart from config TRACING_SUPPORT
mm/pkeys: Remove include of asm/mmu_context.h from pkeys.h
mm/pkeys, powerpc, x86: Provide an empty vma_pkey() in linux/pkeys.h
x86/pkeys: Move vma_pkey() into asm/pkeys.h
x86/pkeys: Add arch_pkeys_enabled()
mm/pkeys: Add an empty arch_pkeys_enabled()
powerpc/pkeys: Drop private VM_PKEY definitions
powerpc/pseries: hcall_exit tracepoint retval should be signed
powerpc/syscalls: Add COMPAT_SPU_NEW() macro
powerpc: Make it clearer that systbl check errors are errors
powerpc/lib: Fix feature fixup test of external branch
powerpc/lib: Fix the feature fixup tests to actually work
powerpc/lib: Rename ftr_fixup_test7 to ftr_fixup_test_too_big
powerpc/lib: Add alt patching test of branching past the last instruction
powerpc/prom: Drop support for old FDT versions
powerpc/powernv: Fix memtrace build when NUMA=n
Merge branch 'topic/ppc-kvm' into next
powerpc/io: Add __raw_writeq_be() __raw_rm_writeq_be()
powerpc/powernv: Use __raw_[rm_]writeq_be() in pci-ioda.c
powerpc/powernv: Use __raw_[rm_]writeq_be() in npu-dma.c
powerpc/xmon: Specify the full format in DUMP() macro
powerpc/xmon: Realign paca dump fields
powerpc/xmon: Update paca fields dumped in xmon
Merge branch 'topic/ppc-kvm' into next
Merge branch 'topic/kbuild' into next
Merge branch 'fixes' into next
Merge branch 'topic/pkey' into next
powerpc: Rename thread_struct.fs to addr_limit
powerpc: Check address limit on user-mode return (TIF_FSCHECK)
powerpc/64: Save stack pointer when we hard disable interrupts
powerpc/nmi: Add an API for sending "safe" NMIs
powerpc/64s: Wire up arch_trigger_cpumask_backtrace()
powerpc/stacktrace: Update copyright
powerpc: Use barrier_nospec in copy_from_user()
powerpc/64: Use barrier_nospec in syscall entry
Michael Neuling (6):
powerpc/ptrace: Fix enforcement of DAWR constraints
powerpc/ptrace: Fix setting 512B aligned breakpoints with PTRACE_SET_DEBUGREG
selftests/powerpc: Remove redundant cp_abort test
selftests/powerpc: Add missing .gitignores
selftests/powerpc: Add ptrace hw breakpoint test
selftests/powerpc: Add perf breakpoint test
Michal Suchanek (6):
powerpc/xmon: Also setup debugger hooks when single-stepping
powerpc/64s: Add barrier_nospec
powerpc/64s: Add support for ori barrier_nospec patching
powerpc/64s: Patch barrier_nospec in modules
powerpc/64s: Enable barrier_nospec based on firmware settings
powerpc/64s: Enhance the information in cpu_show_spectre_v1()
Naveen N. Rao (10):
powerpc64/ftrace: Add a field in paca to disable ftrace in unsafe code paths
powerpc64/ftrace: Rearrange #ifdef sections in ftrace.h
powerpc64/ftrace: Add helpers to hard disable ftrace
powerpc64/ftrace: Delay enabling ftrace on secondary cpus
powerpc64/ftrace: Disable ftrace during hotplug
powerpc64/ftrace: Disable ftrace during kvm entry/exit
powerpc64/kexec: Hard disable ftrace before switching to the new kernel
powerpc64/module: Tighten detection of mcount call sites with -mprofile-kernel
powerpc64/ftrace: Use the generic version of ftrace_replace_code()
powerpc64/ftrace: Implement support for ftrace_regs_caller()
Nicholas Piggin (32):
powerpc/config: powernv_defconfig updates
powerpc/watchdog: don't update the watchdog timestamp if a lockup is detected
powerpc/watchdog: provide more data in watchdog messages
selftests/powerpc: fix exec benchmark
powerpc/mm/radix: implement LPID based TLB flushes to be used by KVM
powerpc/powernv: Fix opal_event_shutdown() called with interrupts disabled
powerpc/kbuild: Set default generic machine type for 32-bit compile
powerpc/kbuild: Remove CROSS32 defines from top level powerpc Makefile
powerpc/kbuild: Use flags variables rather than overriding LD/CC/AS
powerpc/64: irq_work avoid interrupt when called with hardware irqs enabled
powerpc/pseries: put cede MSR[EE] check under IRQ_SOFT_MASK_DEBUG
powerpc/64s: micro-optimise __hard_irq_enable() for mtmsrd L=1 support
powerpc/64: remove start_tb and accum_tb from thread_struct
powerpc/pseries: lparcfg calculate PURR on demand
powerpc: generic clockevents broadcast receiver call tick_receive_broadcast
powerpc: allow soft-NMI watchdog to cover timer interrupts with large decrementers
powerpc: move timer broadcast code under GENERIC_CLOCKEVENTS_BROADCAST ifdef
powerpc: move a stray NMI IPI case under NMI_IPI ifdef
powerpc/time: account broadcast timer event interrupts separately
powerpc/pmu/fsl: fix is_nmi test for irq mask change
powerpc/64: change softe to irqmask in show_regs and xmon
powerpc/powernv: call OPAL_QUIESCE before OPAL_SIGNAL_SYSTEM_RESET
powerpc/powernv: process all OPAL event interrupts with kopald
powerpc/64s/radix: do not flush TLB when relaxing access
powerpc/64s/radix: do not flush TLB on spurious fault
powerpc/64s/radix: make ptep_get_and_clear_full non-atomic for the full case
powerpc/64s/radix: prefetch user address in update_mmu_cache
powerpc/64s/radix: avoid ptesync after set_pte and ptep_set_access_flags
powerpc/64s/radix: optimise pte_update
powerpc/64s/radix: flush remote CPUs out of single-threaded mm_cpumask
powerpc/64s: Fix compiler store ordering to SLB shadow area
powerpc/64s/radix: Fix missing ptesync in flush_cache_vmap
Olof Johansson (1):
powerpc/pasemi: Set PCI_SCAN_ALL_PCI_DEVS
Paul Gortmaker (1):
powerpc: remove retired sbc834x support
Peter Rosin (1):
powerpc/fsl/dts: fix the i2c-mux compatible for t104xqds
Ram Pai (4):
mm, powerpc, x86: define VM_PKEY_BITx bits if CONFIG_ARCH_HAS_PKEYS is enabled
mm, powerpc, x86: introduce an additional vma bit for powerpc pkey
mm/pkeys, x86, powerpc: Display pkey in smaps if arch supports pkeys
powerpc/pkeys: Detach execute_only key on !PROT_EXEC
Ravi Bangoria (3):
powerpc/sstep: Introduce GETTYPE macro
powerpc/sstep: Fix kernel crash if VSX is not present
powerpc/sstep: Fix emulate_step test if VSX not present
Russell Currey (1):
powerpc/xive: Remove (almost) unused macros
Sam Bobroff (12):
powerpc/eeh: Add final message for successful recovery
powerpc/eeh: Fix use-after-release of EEH driver
powerpc/eeh: Remove unused eeh_pcid_name()
powerpc/eeh: Strengthen types of eeh traversal functions
powerpc/eeh: Add message when PE processing at parent
powerpc/eeh: Clean up pci_ers_result handling
powerpc/eeh: Introduce eeh_for_each_pe()
powerpc/eeh: Introduce eeh_edev_actionable()
powerpc/eeh: Introduce eeh_set_channel_state()
powerpc/eeh: Introduce eeh_set_irq_state()
powerpc/eeh: Cleaner handling of EEH_DEV_NO_HANDLER
powerpc/eeh: Refactor report functions
Shilpasri G Bhat (3):
powernv: opal-sensor: Add support to read 64bit sensor values
hwmon: (ibmpowernv): Add support to read 64 bit sensors
hwmon: (ibmpowernv) Add energy sensors
Simon Guo (3):
powerpc: Export msr_check_and_set() to modules
powerpc/reg: Add TEXASR related macros
powerpc: Export tm_enable()/tm_disable/tm_abort() APIs
Souptick Joarder (1):
powerpc/cell/spufs: Change return type to vm_fault_t
Stewart Smith (1):
hvc_opal: don't set tb_ticks_per_usec in udbg_init_opal_common()
Thiago Jung Bauermann (2):
selftests/powerpc: Add ptrace tests for Protection Key registers
selftests/powerpc: Add core file test for Protection Key registers
Torsten Duwe (1):
powerpc/livepatch: Implement reliable stack tracing for the consistency model
Vaibhav Jain (2):
cxl: Disable prefault_mode in Radix mode
cxl: Configure PSL to not use APC virtual machines
Wei Yongjun (1):
ocxl: Fix missing unlock on error in afu_ioctl_enable_p9_wait()
Wolfram Sang (1):
powerpc/watchdog: fix typo 'can by' to 'can be'
Yisheng Xie (1):
powerpc/xmon: use match_string() helper
YueHaibing (1):
powerpc/xics: Add missing of_node_put() in error path
Documentation/ABI/testing/sysfs-class-cxl | 4 +-
Documentation/accelerators/ocxl.rst | 11 +
Documentation/devicetree/bindings/marvell.txt | 516 ------------------
arch/powerpc/Kconfig | 3 +
arch/powerpc/Makefile | 32 +-
arch/powerpc/boot/Makefile | 23 +-
arch/powerpc/boot/cuboot-c2k.c | 189 -------
arch/powerpc/boot/dts/c2k.dts | 366 -------------
arch/powerpc/boot/dts/fsl/t104xqds.dtsi | 2 +-
arch/powerpc/boot/dts/sbc8349.dts | 331 ------------
arch/powerpc/boot/mpc8xx.c | 3 +-
arch/powerpc/boot/mpsc.c | 169 ------
arch/powerpc/boot/mv64x60.c | 581 ---------------------
arch/powerpc/boot/mv64x60.h | 70 ---
arch/powerpc/boot/mv64x60_i2c.c | 204 --------
arch/powerpc/boot/ops.h | 1 -
arch/powerpc/boot/serial.c | 4 -
arch/powerpc/configs/83xx/sbc834x_defconfig | 74 ---
arch/powerpc/configs/c2k_defconfig | 389 --------------
arch/powerpc/configs/powernv_defconfig | 108 ++--
arch/powerpc/configs/wii_defconfig | 14 +
arch/powerpc/include/asm/asm-prototypes.h | 20 +-
arch/powerpc/include/asm/barrier.h | 15 +
arch/powerpc/include/asm/book3s/32/pgalloc.h | 1 +
arch/powerpc/include/asm/book3s/32/pgtable.h | 7 +-
arch/powerpc/include/asm/book3s/64/hash-4k.h | 8 +-
arch/powerpc/include/asm/book3s/64/hash-64k.h | 7 +
arch/powerpc/include/asm/book3s/64/hash.h | 10 -
arch/powerpc/include/asm/book3s/64/mmu.h | 7 +-
arch/powerpc/include/asm/book3s/64/pgalloc.h | 46 +-
arch/powerpc/include/asm/book3s/64/pgtable.h | 28 +-
arch/powerpc/include/asm/book3s/64/radix-4k.h | 3 +
arch/powerpc/include/asm/book3s/64/radix-64k.h | 4 +
arch/powerpc/include/asm/book3s/64/radix.h | 86 ++-
.../powerpc/include/asm/book3s/64/tlbflush-radix.h | 7 +
arch/powerpc/include/asm/book3s/64/tlbflush.h | 12 +-
arch/powerpc/include/asm/cache.h | 3 +
arch/powerpc/include/asm/cacheflush.h | 14 +-
arch/powerpc/include/asm/checksum.h | 15 +-
arch/powerpc/include/asm/cputable.h | 3 +-
arch/powerpc/include/asm/cputime.h | 16 +-
arch/powerpc/include/asm/eeh.h | 11 +-
arch/powerpc/include/asm/feature-fixups.h | 9 +
arch/powerpc/include/asm/ftrace.h | 31 +-
arch/powerpc/include/asm/hardirq.h | 1 +
arch/powerpc/include/asm/hugetlb.h | 21 +-
arch/powerpc/include/asm/hw_irq.h | 11 +-
arch/powerpc/include/asm/imc-pmu.h | 1 +
arch/powerpc/include/asm/io.h | 10 +
arch/powerpc/include/asm/machdep.h | 2 +-
arch/powerpc/include/asm/mmu-book3e.h | 6 -
arch/powerpc/include/asm/mmu_context.h | 5 -
arch/powerpc/include/asm/module.h | 10 +-
arch/powerpc/include/asm/mpc52xx.h | 6 +-
arch/powerpc/include/asm/nmi.h | 6 +
arch/powerpc/include/asm/nohash/32/pgalloc.h | 1 +
arch/powerpc/include/asm/nohash/32/pgtable.h | 41 +-
arch/powerpc/include/asm/nohash/32/pte-40x.h | 3 -
arch/powerpc/include/asm/nohash/64/pgalloc.h | 95 ++--
arch/powerpc/include/asm/nohash/64/pgtable-64k.h | 57 --
arch/powerpc/include/asm/nohash/64/pgtable.h | 46 +-
arch/powerpc/include/asm/nohash/pgtable.h | 58 +-
arch/powerpc/include/asm/nohash/pte-book3e.h | 6 -
arch/powerpc/include/asm/opal-api.h | 8 +
arch/powerpc/include/asm/opal.h | 5 +-
arch/powerpc/include/asm/paca.h | 3 +-
arch/powerpc/include/asm/page.h | 1 +
arch/powerpc/include/asm/pgtable.h | 1 +
arch/powerpc/include/asm/pkeys.h | 13 -
arch/powerpc/include/asm/plpar_wrappers.h | 8 +-
arch/powerpc/include/asm/pmac_pfunc.h | 1 +
arch/powerpc/include/asm/pnv-ocxl.h | 2 +-
arch/powerpc/include/asm/ppc-opcode.h | 1 +
arch/powerpc/include/asm/ppc_asm.h | 6 +-
arch/powerpc/include/asm/processor.h | 10 +-
arch/powerpc/include/asm/pte-common.h | 8 -
arch/powerpc/include/asm/reg.h | 32 +-
arch/powerpc/include/asm/rheap.h | 3 +
arch/powerpc/include/asm/rtas.h | 2 +-
arch/powerpc/include/asm/setup.h | 9 +
arch/powerpc/include/asm/smp.h | 1 +
arch/powerpc/include/asm/sstep.h | 2 +
arch/powerpc/include/asm/switch_to.h | 1 -
arch/powerpc/include/asm/syscalls.h | 2 +-
arch/powerpc/include/asm/systbl.h | 6 +-
arch/powerpc/include/asm/thread_info.h | 8 +-
arch/powerpc/include/asm/time.h | 11 -
arch/powerpc/include/asm/tlb.h | 13 +
arch/powerpc/include/asm/tm.h | 2 -
arch/powerpc/include/asm/trace.h | 7 +-
arch/powerpc/include/asm/uaccess.h | 21 +-
arch/powerpc/include/asm/xive-regs.h | 6 -
arch/powerpc/include/asm/xmon.h | 2 +-
arch/powerpc/include/asm/xor.h | 12 +-
arch/powerpc/include/asm/xor_altivec.h | 19 +
arch/powerpc/kernel/align.c | 2 +-
arch/powerpc/kernel/asm-offsets.c | 1 +
arch/powerpc/kernel/btext.c | 10 +-
arch/powerpc/kernel/dma.c | 2 -
arch/powerpc/kernel/dt_cpu_ftrs.c | 1 +
arch/powerpc/kernel/eeh.c | 19 +-
arch/powerpc/kernel/eeh_driver.c | 496 ++++++++++--------
arch/powerpc/kernel/eeh_pe.c | 26 +-
arch/powerpc/kernel/entry_64.S | 11 +
arch/powerpc/kernel/exceptions-64s.S | 1 +
arch/powerpc/kernel/fadump.c | 40 +-
arch/powerpc/kernel/head_8xx.S | 2 +-
arch/powerpc/kernel/hw_breakpoint.c | 4 +-
arch/powerpc/kernel/irq.c | 8 +-
arch/powerpc/kernel/kvm.c | 4 +-
arch/powerpc/kernel/machine_kexec.c | 2 +
arch/powerpc/kernel/machine_kexec_64.c | 8 +-
arch/powerpc/kernel/misc.S | 36 +-
arch/powerpc/kernel/module.c | 6 +
arch/powerpc/kernel/module_32.c | 4 +-
arch/powerpc/kernel/module_64.c | 44 +-
arch/powerpc/kernel/nvram_64.c | 4 +-
arch/powerpc/kernel/pci_32.c | 11 +-
arch/powerpc/kernel/pci_64.c | 8 +-
arch/powerpc/kernel/ppc_save_regs.S | 4 +
arch/powerpc/kernel/process.c | 147 ++----
arch/powerpc/kernel/prom.c | 23 +-
arch/powerpc/kernel/prom_init.c | 189 ++++---
arch/powerpc/kernel/ptrace.c | 21 +-
arch/powerpc/kernel/rtas-proc.c | 26 +-
arch/powerpc/kernel/rtas-rtc.c | 4 +-
arch/powerpc/kernel/rtas.c | 7 +-
arch/powerpc/kernel/security.c | 71 +++
arch/powerpc/kernel/setup-common.c | 6 -
arch/powerpc/kernel/setup.h | 6 +
arch/powerpc/kernel/setup_64.c | 7 +
arch/powerpc/kernel/signal.c | 4 +
arch/powerpc/kernel/signal.h | 6 +-
arch/powerpc/kernel/signal_32.c | 61 ++-
arch/powerpc/kernel/signal_64.c | 19 +-
arch/powerpc/kernel/smp.c | 46 +-
arch/powerpc/kernel/stacktrace.c | 181 ++++++-
arch/powerpc/kernel/sys_ppc32.c | 9 -
arch/powerpc/kernel/syscalls.c | 4 +
arch/powerpc/kernel/systbl.S | 2 +-
arch/powerpc/kernel/systbl_chk.c | 2 +-
arch/powerpc/kernel/systbl_chk.sh | 4 +-
arch/powerpc/kernel/tau_6xx.c | 15 +-
arch/powerpc/kernel/time.c | 227 +++-----
arch/powerpc/kernel/tm.S | 12 +
arch/powerpc/kernel/trace/ftrace.c | 212 ++++++--
arch/powerpc/kernel/trace/ftrace_32.S | 20 -
arch/powerpc/kernel/trace/ftrace_64.S | 29 -
arch/powerpc/kernel/trace/ftrace_64_mprofile.S | 88 +++-
arch/powerpc/kernel/trace/ftrace_64_pg.S | 6 +-
arch/powerpc/kernel/vdso32/Makefile | 15 +-
arch/powerpc/kernel/vecemu.c | 1 +
arch/powerpc/kernel/vmlinux.lds.S | 7 +
arch/powerpc/kernel/watchdog.c | 32 +-
arch/powerpc/kvm/book3s_64_mmu_radix.c | 36 +-
arch/powerpc/kvm/book3s_hv.c | 4 +
arch/powerpc/kvm/book3s_hv_rmhandlers.S | 3 +
arch/powerpc/lib/Makefile | 5 +-
arch/powerpc/lib/checksum_32.S | 46 +-
arch/powerpc/lib/checksum_64.S | 28 +
arch/powerpc/lib/feature-fixups-test.S | 42 +-
arch/powerpc/lib/feature-fixups.c | 60 ++-
arch/powerpc/lib/memcmp_32.S | 45 ++
arch/powerpc/lib/sstep.c | 26 +-
arch/powerpc/lib/string.S | 70 +--
arch/powerpc/lib/string_32.S | 90 ++++
arch/powerpc/lib/test_emulate_step.c | 21 +-
arch/powerpc/lib/xor_vmx_glue.c | 1 +
arch/powerpc/mm/fault.c | 76 ++-
arch/powerpc/mm/hash_utils_64.c | 10 +-
arch/powerpc/mm/hugetlbpage.c | 40 +-
arch/powerpc/mm/mem.c | 4 +-
arch/powerpc/mm/mmu_context.c | 6 +-
arch/powerpc/mm/mmu_context_book3s64.c | 39 +-
arch/powerpc/mm/mmu_context_nohash.c | 135 ++---
arch/powerpc/mm/pgtable-book3s64.c | 279 +++++++++-
arch/powerpc/mm/pgtable-hash64.c | 8 +-
arch/powerpc/mm/pgtable-radix.c | 38 +-
arch/powerpc/mm/pgtable.c | 49 +-
arch/powerpc/mm/pgtable_64.c | 171 ------
arch/powerpc/mm/pkeys.c | 4 +-
arch/powerpc/mm/ppc_mmu_32.c | 2 +-
arch/powerpc/mm/slb.c | 21 +-
arch/powerpc/mm/subpage-prot.c | 8 +-
arch/powerpc/mm/tlb-radix.c | 366 ++++++++++++-
arch/powerpc/mm/tlb_hash32.c | 10 +-
arch/powerpc/perf/core-fsl-emb.c | 2 +-
arch/powerpc/perf/imc-pmu.c | 68 ++-
arch/powerpc/perf/isa207-common.h | 64 ---
arch/powerpc/perf/power8-pmu.c | 64 +++
arch/powerpc/platforms/83xx/Kconfig | 7 -
arch/powerpc/platforms/83xx/Makefile | 1 -
arch/powerpc/platforms/83xx/sbc834x.c | 73 ---
arch/powerpc/platforms/8xx/adder875.c | 2 -
arch/powerpc/platforms/8xx/ep88xc.c | 2 -
arch/powerpc/platforms/8xx/m8xx_setup.c | 11 +-
arch/powerpc/platforms/8xx/mpc885ads_setup.c | 2 -
arch/powerpc/platforms/Kconfig.cputype | 4 +
arch/powerpc/platforms/cell/spu_callbacks.c | 2 +-
arch/powerpc/platforms/cell/spu_syscalls.c | 3 +-
arch/powerpc/platforms/cell/spufs/file.c | 33 +-
arch/powerpc/platforms/chrp/pci.c | 12 +-
arch/powerpc/platforms/chrp/setup.c | 12 +-
arch/powerpc/platforms/chrp/time.c | 6 +-
arch/powerpc/platforms/embedded6xx/Kconfig | 10 -
arch/powerpc/platforms/embedded6xx/Makefile | 1 -
arch/powerpc/platforms/embedded6xx/c2k.c | 148 ------
arch/powerpc/platforms/embedded6xx/flipper-pic.c | 8 -
arch/powerpc/platforms/embedded6xx/hlwd-pic.c | 7 +-
arch/powerpc/platforms/maple/maple.h | 2 +-
arch/powerpc/platforms/maple/time.c | 5 +-
arch/powerpc/platforms/pasemi/pasemi.h | 2 +-
arch/powerpc/platforms/pasemi/pci.c | 2 +
arch/powerpc/platforms/pasemi/time.c | 4 +-
arch/powerpc/platforms/powermac/bootx_init.c | 6 +-
arch/powerpc/platforms/powermac/pci.c | 4 +-
arch/powerpc/platforms/powermac/pmac.h | 2 +-
arch/powerpc/platforms/powermac/setup.c | 9 +-
arch/powerpc/platforms/powermac/smp.c | 1 -
arch/powerpc/platforms/powermac/time.c | 48 +-
arch/powerpc/platforms/powernv/copy-paste.h | 6 +-
arch/powerpc/platforms/powernv/idle.c | 4 +-
arch/powerpc/platforms/powernv/memtrace.c | 2 +-
arch/powerpc/platforms/powernv/npu-dma.c | 5 +-
arch/powerpc/platforms/powernv/ocxl.c | 4 +-
arch/powerpc/platforms/powernv/opal-hmi.c | 2 +-
arch/powerpc/platforms/powernv/opal-imc.c | 22 +-
arch/powerpc/platforms/powernv/opal-irqchip.c | 89 ++--
arch/powerpc/platforms/powernv/opal-rtc.c | 5 +-
arch/powerpc/platforms/powernv/opal-sensor.c | 53 ++
arch/powerpc/platforms/powernv/opal-wrappers.S | 2 +
arch/powerpc/platforms/powernv/opal.c | 23 +-
arch/powerpc/platforms/powernv/pci-ioda.c | 46 +-
arch/powerpc/platforms/powernv/powernv.h | 3 +-
arch/powerpc/platforms/powernv/setup.c | 11 +-
arch/powerpc/platforms/powernv/smp.c | 17 +-
arch/powerpc/platforms/ps3/platform.h | 2 +-
arch/powerpc/platforms/ps3/repository.c | 4 +-
arch/powerpc/platforms/ps3/time.c | 26 +-
arch/powerpc/platforms/pseries/hvCall_inst.c | 2 +-
arch/powerpc/platforms/pseries/kexec.c | 7 +-
arch/powerpc/platforms/pseries/lpar.c | 3 +-
arch/powerpc/platforms/pseries/lparcfg.c | 18 +-
arch/powerpc/platforms/pseries/setup.c | 1 +
arch/powerpc/sysdev/Makefile | 3 -
arch/powerpc/sysdev/cpm_gpio.c | 1 -
arch/powerpc/sysdev/mv64x60.h | 13 -
arch/powerpc/sysdev/mv64x60_dev.c | 535 -------------------
arch/powerpc/sysdev/mv64x60_pci.c | 171 ------
arch/powerpc/sysdev/mv64x60_pic.c | 297 -----------
arch/powerpc/sysdev/mv64x60_udbg.c | 152 ------
arch/powerpc/sysdev/xics/xics-common.c | 7 +-
arch/powerpc/sysdev/xive/native.c | 2 +-
arch/powerpc/sysdev/xive/spapr.c | 88 +++-
arch/powerpc/tools/gcc-check-mprofile-kernel.sh | 12 +-
arch/powerpc/xmon/nonstdio.h | 8 +-
arch/powerpc/xmon/spu-dis.c | 18 +-
arch/powerpc/xmon/xmon.c | 261 ++++-----
arch/x86/include/asm/mmu_context.h | 15 -
arch/x86/include/asm/pkeys.h | 13 +
arch/x86/kernel/setup.c | 8 -
drivers/cpuidle/cpuidle-powernv.c | 32 +-
drivers/hwmon/ibmpowernv.c | 9 +-
drivers/macintosh/via-pmu.c | 18 +-
drivers/macintosh/windfarm_pm121.c | 2 +-
drivers/macintosh/windfarm_pm81.c | 2 +-
drivers/macintosh/windfarm_pm91.c | 2 +-
drivers/misc/cxl/pci.c | 4 +-
drivers/misc/cxl/sysfs.c | 16 +-
drivers/misc/ocxl/context.c | 5 +-
drivers/misc/ocxl/file.c | 80 +++
drivers/misc/ocxl/link.c | 38 +-
drivers/misc/ocxl/ocxl_internal.h | 1 +
drivers/tty/hvc/hvc_opal.c | 1 -
fs/proc/task_mmu.c | 13 +-
include/linux/mm.h | 14 +-
include/linux/pkeys.h | 13 +-
include/misc/ocxl.h | 9 +
include/uapi/misc/ocxl.h | 14 +
kernel/sys_ni.c | 2 +-
kernel/trace/Kconfig | 6 +-
scripts/recordmcount.pl | 18 +-
tools/testing/selftests/powerpc/Makefile | 1 -
.../testing/selftests/powerpc/alignment/.gitignore | 1 +
.../selftests/powerpc/benchmarks/exec_target.c | 7 +-
.../selftests/powerpc/context_switch/.gitignore | 1 -
.../selftests/powerpc/context_switch/Makefile | 5 -
.../selftests/powerpc/context_switch/cp_abort.c | 110 ----
tools/testing/selftests/powerpc/include/reg.h | 1 +
tools/testing/selftests/powerpc/ptrace/.gitignore | 2 +
tools/testing/selftests/powerpc/ptrace/Makefile | 6 +-
tools/testing/selftests/powerpc/ptrace/child.h | 139 +++++
tools/testing/selftests/powerpc/ptrace/core-pkey.c | 461 ++++++++++++++++
.../selftests/powerpc/ptrace/perf-hwbreak.c | 195 +++++++
.../selftests/powerpc/ptrace/ptrace-hwbreak.c | 342 ++++++++++++
.../testing/selftests/powerpc/ptrace/ptrace-pkey.c | 327 ++++++++++++
tools/testing/selftests/powerpc/ptrace/ptrace.h | 38 ++
tools/testing/selftests/powerpc/tm/.gitignore | 1 +
298 files changed, 5696 insertions(+), 6903 deletions(-)
delete mode 100644 Documentation/devicetree/bindings/marvell.txt
delete mode 100644 arch/powerpc/boot/cuboot-c2k.c
delete mode 100644 arch/powerpc/boot/dts/c2k.dts
delete mode 100644 arch/powerpc/boot/dts/sbc8349.dts
delete mode 100644 arch/powerpc/boot/mpsc.c
delete mode 100644 arch/powerpc/boot/mv64x60.c
delete mode 100644 arch/powerpc/boot/mv64x60.h
delete mode 100644 arch/powerpc/boot/mv64x60_i2c.c
delete mode 100644 arch/powerpc/configs/83xx/sbc834x_defconfig
delete mode 100644 arch/powerpc/configs/c2k_defconfig
delete mode 100644 arch/powerpc/include/asm/nohash/64/pgtable-64k.h
create mode 100644 arch/powerpc/include/asm/xor_altivec.h
create mode 100644 arch/powerpc/lib/memcmp_32.S
create mode 100644 arch/powerpc/lib/string_32.S
delete mode 100644 arch/powerpc/platforms/83xx/sbc834x.c
delete mode 100644 arch/powerpc/platforms/embedded6xx/c2k.c
delete mode 100644 arch/powerpc/sysdev/mv64x60.h
delete mode 100644 arch/powerpc/sysdev/mv64x60_dev.c
delete mode 100644 arch/powerpc/sysdev/mv64x60_pci.c
delete mode 100644 arch/powerpc/sysdev/mv64x60_pic.c
delete mode 100644 arch/powerpc/sysdev/mv64x60_udbg.c
delete mode 100644 tools/testing/selftests/powerpc/context_switch/.gitignore
delete mode 100644 tools/testing/selftests/powerpc/context_switch/Makefile
delete mode 100644 tools/testing/selftests/powerpc/context_switch/cp_abort.c
create mode 100644 tools/testing/selftests/powerpc/ptrace/child.h
create mode 100644 tools/testing/selftests/powerpc/ptrace/core-pkey.c
create mode 100644 tools/testing/selftests/powerpc/ptrace/perf-hwbreak.c
create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-hwbreak.c
create mode 100644 tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c
-----BEGIN PGP SIGNATURE-----
iQIcBAEBCAAGBQJbGR2pAAoJEFHr6jzI4aWAzq8QAJEPg0NmJ2CHmwqYCmrUG5HQ
NelMOxgWUCaOpXaQ6dWHhSgXMfK82cM7xQ1MjlsBdUCdfTRfaqVCjXCXvXiPDo/T
2ST0JT6xe8MoPT8RbqYY5dguaFbwdgDMsf0cgglOmWOegnyVUOEH7D9KiJ5iUeb0
dUq0ZsZ7vroQFCw3wJwsRwTBw+a8Jtb6yM8QWujzo4TOePpgWVr2qJvcBUPieOcf
kkheIQgp3Q7N1AK+LEx8gODrWKClkoKv5ACxXYzC4xvhFMA7H9PYVY1YB2/68Db6
NHBkeCoxYC8RwfDLczEpQnj8FgOfatJNqcCygqe/fHPGB9mydZNSSaIgJwyd64Rz
Nkqy6wzirPSAbVr0mpx/RNQaraWDDoLHCk7QK/1QrS/cpJb6bvURqjsMwkRVCnS+
x5MZvgb+Pkdt1aXXT6X6Qgso3QbNYvqJRmya9tjnUfyrorauwu+Grj8AteU50ACC
n8hSppD7qqU99KqoySsgrHsqj+ShrVL6n/TgOJOkdMtJexGoMxsUy1UgWO2pxqFc
uOsekJaxYrPHRnDkePTRUTHa27oqj0MJ5kwYBM1P0W6O5L1VY21IVzALgmj0O+6r
KC+ONQVCst7jlVF5E0vTBnzDrp30WOZVxmU9iGb0ha9X/JU0pvHaci7CXtKpko33
705n/Q3r4XFNNbHc4zfN
=2yFv
-----END PGP SIGNATURE-----
^ permalink raw reply
* Re: Fwd: [powerpc/Baremetal]Kernel OOPS while executing memory hotplug on Power8 baremetal
From: Michael Ellerman @ 2018-06-07 12:51 UTC (permalink / raw)
To: vrbagal1, Bart Van Assche, axboe, kent.overstreet, snitzer,
linux-block
Cc: sachinp, Linuxppc-dev, linuxppc-dev, linux-scsi
In-Reply-To: <e01428cf15ab45bd42a45a14424e5384@linux.vnet.ibm.com>
vrbagal1 <vrbagal1@linux.vnet.ibm.com> writes:
> On 2018-06-07 13:12, Bart Van Assche wrote:
>> On Thu, 2018-06-07 at 12:56 +0530, Venkat Rao B wrote:
>>> On Thursday 07 June 2018 12:46 PM, Bart Van Assche wrote:
>>> > On Thu, 2018-06-07 at 12:38 +0530, vrbagal1 wrote:
>>> > > Observing Kernel oops and machine reboots while executing memory hotplug
>>> > > test case, on Power8 Baremetal machine.
>>> > >
>>> > > I see this is introduced some where between rc6 and 4.17.
>>> >
>>> > Please provide the exact versions (git commit IDs) of the kernel versions
>>> > you have tested.
>>>
>>> Commit Id ---> 5037be168f
>>
>> The reason I was asking for the commit ID is because I saw that
>> clone_endio()
>> occurs in the oops which means that the dm driver is involved. An
>> important fix
>> for the dm driver went upstream recently, namely d37753540568 ("dm: Use
>> kzalloc
>> for all structs with embedded biosets/mempools"). Can you double check
>> whether
>> that commit it present in your tree? If it is not present, please
>> update to the
>> latest master and retest. If it is present, please report how to
>> reproduce
>> this oops to Kent Overstreet, Jens Axboe, linux-block and Mike Snitzer.
>>
>> Thanks,
>>
>> Bart.
>
>
> Yes, the fix is present in the tree, which I have tested.
>
> Steps to reproduce:
>
> Step1: Clone and Install avocado git clone
> https://github.com/avocado-framework/avocado.git
> Step2: Clone
> https://github.com/avocado-framework-tests/avocado-misc-tests.git
> Test case is
> https://github.com/avocado-framework-tests/avocado-misc-tests/blob/master/memory/memhotplug.py
> Step3: Command to run the test is avocado run
> avocado-misc-tests/memory/memhotplug.py
That gave me:
$ avocado run avocado-misc-tests/memory/memhotplug.py
avocado: command not found
Was I meant to install it?
I tried this which worked (I think):
$ ./scripts/avocado run avocado-misc-tests/memory/memhotplug.py
Failed to load plugin from module "avocado_runner_vm": ImportError('No module named libvirt',)
JOB ID : 28deb5a455fb876a7e177deb2b46eab640f313c8
JOB LOG : /home/michael/avocado/job-results/job-2018-06-07T22.27-28deb5a/job.log
(1/4) avocado-misc-tests/memory/memhotplug.py:memstress.test_hotplug_loop: PASS (10.62 s)
(2/4) avocado-misc-tests/memory/memhotplug.py:memstress.test_hotplug_toggle: PASS (245.15 s)
(3/4) avocado-misc-tests/memory/memhotplug.py:memstress.test_dlpar_mem_hotplug: PASS (0.37 s)
(4/4) avocado-misc-tests/memory/memhotplug.py:memstress.test_hotplug_per_numa_node: PASS (41.09 s)
RESULTS : PASS 4 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
JOB TIME : 323.45 s
JOB HTML : /home/michael/avocado/job-results/job-2018-06-07T22.27-28deb5a/results.html
So what's different about your system?
What does 'lsblk -O' say on your system?
cheers
^ permalink raw reply
* Re: [RFC PATCH -tip v5 07/27] powerpc/kprobes: Remove jprobe powerpc implementation
From: Masami Hiramatsu @ 2018-06-07 14:23 UTC (permalink / raw)
To: Naveen N. Rao
Cc: Ingo Molnar, Thomas Gleixner, Andrew Morton,
Ananth N Mavinakayanahalli, Benjamin Herrenschmidt,
H . Peter Anvin, linux-arch, linux-kernel, linuxppc-dev,
Ingo Molnar, Michael Ellerman, Paul Mackerras, Steven Rostedt
In-Reply-To: <1528370755.shi1gq6h7g.naveen@linux.ibm.com>
On Thu, 07 Jun 2018 17:01:23 +0530
"Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote:
> Masami Hiramatsu wrote:
> > Remove arch dependent setjump/longjump functions
> > and unused fields in kprobe_ctlblk for jprobes
> > from arch/powerpc. This also reverts commits
> > related __is_active_jprobe() function.
> >
> > Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
> >
> > Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > Cc: Paul Mackerras <paulus@samba.org>
> > Cc: Michael Ellerman <mpe@ellerman.id.au>
> > Cc: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>
> > Cc: linuxppc-dev@lists.ozlabs.org
> > ---
> > arch/powerpc/include/asm/kprobes.h | 2 -
> > arch/powerpc/kernel/kprobes-ftrace.c | 15 -------
> > arch/powerpc/kernel/kprobes.c | 54 ------------------------
> > arch/powerpc/kernel/trace/ftrace_64_mprofile.S | 39 ++---------------
> > 4 files changed, 5 insertions(+), 105 deletions(-)
>
> LGTM.
>
> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Thanks Naveen!
>
> - Naveen
>
>
--
Masami Hiramatsu <mhiramat@kernel.org>
^ permalink raw reply
* Re: [RFC PATCH -tip v5 18/27] powerpc/kprobes: Don't call the ->break_handler() in arm kprobes code
From: Masami Hiramatsu @ 2018-06-07 14:28 UTC (permalink / raw)
To: Naveen N. Rao
Cc: Ingo Molnar, Thomas Gleixner, Andrew Morton, H . Peter Anvin,
linux-arch, linux-kernel, linuxppc-dev, Ingo Molnar,
Paul Mackerras, Steven Rostedt
In-Reply-To: <1528371112.vwnh1m0k39.naveen@linux.ibm.com>
On Thu, 07 Jun 2018 17:07:00 +0530
"Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote:
> Masami Hiramatsu wrote:
> > Don't call the ->break_handler() from the arm kprobes code,
> ^^^ powerpc
>
> > because it was only used by jprobes which got removed.
> >
> > This also makes skip_singlestep() a static function since
> > only ftrace-kprobe.c is using this function.
> >
> > Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
> > Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > Cc: Paul Mackerras <paulus@samba.org>
> > Cc: Michael Ellerman <mpe@ellerman.id.au>
> > Cc: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>
> > Cc: linuxppc-dev@lists.ozlabs.org
> > ---
> > arch/powerpc/include/asm/kprobes.h | 10 ----------
> > arch/powerpc/kernel/kprobes-ftrace.c | 16 +++-------------
> > arch/powerpc/kernel/kprobes.c | 31 +++++++++++--------------------
> > 3 files changed, 14 insertions(+), 43 deletions(-)
>
> With 2 small comments...
2 ? or 1 ?
> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
>
> - Naveen
>
> >
> > diff --git a/arch/powerpc/include/asm/kprobes.h b/arch/powerpc/include/asm/kprobes.h
> > index 674036db558b..785c464b6588 100644
> > --- a/arch/powerpc/include/asm/kprobes.h
> > +++ b/arch/powerpc/include/asm/kprobes.h
> > @@ -102,16 +102,6 @@ extern int kprobe_exceptions_notify(struct notifier_block *self,
> > extern int kprobe_fault_handler(struct pt_regs *regs, int trapnr);
> > extern int kprobe_handler(struct pt_regs *regs);
> > extern int kprobe_post_handler(struct pt_regs *regs);
> > -#ifdef CONFIG_KPROBES_ON_FTRACE
> > -extern int skip_singlestep(struct kprobe *p, struct pt_regs *regs,
> > - struct kprobe_ctlblk *kcb);
> > -#else
> > -static inline int skip_singlestep(struct kprobe *p, struct pt_regs *regs,
> > - struct kprobe_ctlblk *kcb)
> > -{
> > - return 0;
> > -}
> > -#endif
> > #else
> > static inline int kprobe_handler(struct pt_regs *regs) { return 0; }
> > static inline int kprobe_post_handler(struct pt_regs *regs) { return 0; }
> > diff --git a/arch/powerpc/kernel/kprobes-ftrace.c b/arch/powerpc/kernel/kprobes-ftrace.c
> > index 1b316331c2d9..3869b0e5d5c7 100644
> > --- a/arch/powerpc/kernel/kprobes-ftrace.c
> > +++ b/arch/powerpc/kernel/kprobes-ftrace.c
> > @@ -26,8 +26,8 @@
> > #include <linux/ftrace.h>
> >
> > static nokprobe_inline
> > -int __skip_singlestep(struct kprobe *p, struct pt_regs *regs,
> > - struct kprobe_ctlblk *kcb, unsigned long orig_nip)
> > +int skip_singlestep(struct kprobe *p, struct pt_regs *regs,
> > + struct kprobe_ctlblk *kcb, unsigned long orig_nip)
> > {
> > /*
> > * Emulate singlestep (and also recover regs->nip)
> > @@ -44,16 +44,6 @@ int __skip_singlestep(struct kprobe *p, struct pt_regs *regs,
> > return 1;
> > }
> >
> > -int skip_singlestep(struct kprobe *p, struct pt_regs *regs,
> > - struct kprobe_ctlblk *kcb)
> > -{
> > - if (kprobe_ftrace(p))
> > - return __skip_singlestep(p, regs, kcb, 0);
> > - else
> > - return 0;
> > -}
> > -NOKPROBE_SYMBOL(skip_singlestep);
> > -
> > /* Ftrace callback handler for kprobes */
> > void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
> > struct ftrace_ops *ops, struct pt_regs *regs)
> > @@ -82,7 +72,7 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
> > __this_cpu_write(current_kprobe, p);
> > kcb->kprobe_status = KPROBE_HIT_ACTIVE;
> > if (!p->pre_handler || !p->pre_handler(p, regs))
> > - __skip_singlestep(p, regs, kcb, orig_nip);
> > + skip_singlestep(p, regs, kcb, orig_nip);
>
> We can probably get rid of skip_singlestep() completely along with
> orig_nip since instructions are always 4 bytes on powerpc. So, the
> changes we do to nip should help to recover the value automatically.
Good point! Yes, skip_singlestep() is no longer exported, so we can just fold
it into kprobe_ftrace_handler() to simplify things.
Thank you!
>
> - Naveen
>
>
--
Masami Hiramatsu <mhiramat@kernel.org>
^ permalink raw reply
* Re: Fwd: [powerpc/Baremetal]Kernel OOPS while executing memory hotplug on Power8 baremetal
From: Jens Axboe @ 2018-06-07 14:45 UTC (permalink / raw)
To: vrbagal1, Bart Van Assche, kent.overstreet, snitzer, linux-block
Cc: linux-scsi, linuxppc-dev, sachinp, Linuxppc-dev
In-Reply-To: <e01428cf15ab45bd42a45a14424e5384@linux.vnet.ibm.com>
On 6/7/18 4:37 AM, vrbagal1 wrote:
> On 2018-06-07 13:12, Bart Van Assche wrote:
>> On Thu, 2018-06-07 at 12:56 +0530, Venkat Rao B wrote:
>>> On Thursday 07 June 2018 12:46 PM, Bart Van Assche wrote:
>>>> On Thu, 2018-06-07 at 12:38 +0530, vrbagal1 wrote:
>>>>> Observing Kernel oops and machine reboots while executing memory hotplug
>>>>> test case, on Power8 Baremetal machine.
>>>>>
>>>>> I see this is introduced some where between rc6 and 4.17.
>>>>
>>>> Please provide the exact versions (git commit IDs) of the kernel versions
>>>> you have tested.
>>>
>>> Commit Id ---> 5037be168f
>>
>> The reason I was asking for the commit ID is because I saw that
>> clone_endio()
>> occurs in the oops which means that the dm driver is involved. An
>> important fix
>> for the dm driver went upstream recently, namely d37753540568 ("dm: Use
>> kzalloc
>> for all structs with embedded biosets/mempools"). Can you double check
>> whether
>> that commit it present in your tree? If it is not present, please
>> update to the
>> latest master and retest. If it is present, please report how to
>> reproduce
>> this oops to Kent Overstreet, Jens Axboe, linux-block and Mike Snitzer.
>>
>> Thanks,
>>
>> Bart.
>
>
> Yes, the fix is present in the tree, which I have tested.
>
> Steps to reproduce:
>
> Step1: Clone and Install avocado git clone
> https://github.com/avocado-framework/avocado.git
> Step2: Clone
> https://github.com/avocado-framework-tests/avocado-misc-tests.git
> Test case is
> https://github.com/avocado-framework-tests/avocado-misc-tests/blob/master/memory/memhotplug.py
> Step3: Command to run the test is avocado run
> avocado-misc-tests/memory/memhotplug.py
Can you try with the below? Not a fully formed fix since I'd prefer
if the dm bioset copy stuff was changed instead, but worth a shot.
diff --git a/block/bio.c b/block/bio.c
index 595663e0281a..45bdee67d28b 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1967,6 +1967,27 @@ int bioset_init(struct bio_set *bs,
}
EXPORT_SYMBOL(bioset_init);
+void bioset_move(struct bio_set *dst, struct bio_set *src)
+{
+ dst->bio_slab = src->bio_slab;
+ dst->front_pad = src->front_pad;
+ mempool_move(&dst->bio_pool, &src->bio_pool);
+ mempool_move(&dst->bvec_pool, &src->bvec_pool);
+#if defined(CONFIG_BLK_DEV_INTEGRITY)
+ mempool_move(&dst->bio_integrity_pool, &src->bio_integrity_pool);
+ mempool_move(&dst->bvec_integrity_pool, &src->bvec_integrity_pool);
+#endif
+ BUG_ON(!bio_list_empty(&src->rescue_list));
+ BUG_ON(work_pending(&src->rescue_work));
+ spin_lock_init(&dst->rescue_lock);
+ bio_list_init(&dst->rescue_list);
+ INIT_WORK(&dst->rescue_work, bio_alloc_rescue);
+ dst->rescue_workqueue = src->rescue_workqueue;
+
+ memset(src, 0, sizeof(*src));
+}
+EXPORT_SYMBOL(bioset_move);
+
#ifdef CONFIG_BLK_CGROUP
/**
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 98dff36b89a3..87f636815baf 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1982,10 +1982,8 @@ static void __bind_mempools(struct mapped_device *md, struct dm_table *t)
bioset_initialized(&md->bs) ||
bioset_initialized(&md->io_bs));
- md->bs = p->bs;
- memset(&p->bs, 0, sizeof(p->bs));
- md->io_bs = p->io_bs;
- memset(&p->io_bs, 0, sizeof(p->io_bs));
+ bioset_move(&md->bs, &p->bs);
+ bioset_move(&md->io_bs, &p->io_bs);
out:
/* mempool bind completed, no longer need any mempools in the table */
dm_table_free_md_mempools(t);
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 810a8bee8f85..7581231dd0a3 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -417,6 +417,7 @@ enum {
extern int bioset_init(struct bio_set *, unsigned int, unsigned int, int flags);
extern void bioset_exit(struct bio_set *);
extern int biovec_init_pool(mempool_t *pool, int pool_entries);
+extern void bioset_move(struct bio_set *dst, struct bio_set *src);
extern struct bio *bio_alloc_bioset(gfp_t, unsigned int, struct bio_set *);
extern void bio_put(struct bio *);
diff --git a/include/linux/mempool.h b/include/linux/mempool.h
index 0c964ac107c2..20818919180c 100644
--- a/include/linux/mempool.h
+++ b/include/linux/mempool.h
@@ -47,6 +47,7 @@ extern int mempool_resize(mempool_t *pool, int new_min_nr);
extern void mempool_destroy(mempool_t *pool);
extern void *mempool_alloc(mempool_t *pool, gfp_t gfp_mask) __malloc;
extern void mempool_free(void *element, mempool_t *pool);
+extern void mempool_move(mempool_t *dst, mempool_t *src);
/*
* A mempool_alloc_t and mempool_free_t that get the memory from
diff --git a/mm/mempool.c b/mm/mempool.c
index b54f2c20e5e0..dd402653367b 100644
--- a/mm/mempool.c
+++ b/mm/mempool.c
@@ -181,6 +181,8 @@ int mempool_init_node(mempool_t *pool, int min_nr, mempool_alloc_t *alloc_fn,
mempool_free_t *free_fn, void *pool_data,
gfp_t gfp_mask, int node_id)
{
+ memset(pool, 0, sizeof(*pool));
+
spin_lock_init(&pool->lock);
pool->min_nr = min_nr;
pool->pool_data = pool_data;
@@ -546,3 +548,19 @@ void mempool_free_pages(void *element, void *pool_data)
__free_pages(element, order);
}
EXPORT_SYMBOL(mempool_free_pages);
+
+void mempool_move(mempool_t *dst, mempool_t *src)
+{
+ BUG_ON(waitqueue_active(&src->wait));
+
+ spin_lock_init(&dst->lock);
+ dst->min_nr = src->min_nr;
+ dst->curr_nr = src->curr_nr;
+ memcpy(dst->elements, src->elements, sizeof(void *) * src->curr_nr);
+ dst->pool_data = src->pool_data;
+ dst->alloc = src->alloc;
+ dst->free = src->free;
+ init_waitqueue_head(&dst->wait);
+
+ memset(src, 0, sizeof(*src));
+}
--
Jens Axboe
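For readers following the thread, here is a minimal user-space sketch of why a move helper that re-initializes some fields can be preferable to a plain struct assignment. It assumes the problem is an embedded, self-referential list head (as in a wait queue); all demo_* names are hypothetical and this is not the kernel's mempool_t or bio_set.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical miniature of a kernel-style circular list head. */
struct demo_list_head {
	struct demo_list_head *next, *prev;
};

/* Hypothetical pool with an embedded wait list; not the kernel's mempool_t. */
struct demo_pool {
	int min_nr;
	struct demo_list_head wait;
};

static void demo_list_init(struct demo_list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* An empty circular list points back at its own head. */
static int demo_list_empty(const struct demo_list_head *h)
{
	return h->next == h;
}

void demo_pool_init(struct demo_pool *p, int min_nr)
{
	p->min_nr = min_nr;
	demo_list_init(&p->wait);
}

/* Copy plain fields, re-initialize address-dependent ones, clear the source. */
void demo_pool_move(struct demo_pool *dst, struct demo_pool *src)
{
	assert(demo_list_empty(&src->wait));	/* nobody may be waiting on src */
	dst->min_nr = src->min_nr;
	demo_list_init(&dst->wait);
	memset(src, 0, sizeof(*src));
}

/* Returns 1 if a plain struct assignment leaves the copy's list usable. */
int demo_plain_copy_is_consistent(void)
{
	struct demo_pool src, dst;

	demo_pool_init(&src, 4);
	dst = src;	/* dst.wait.next still points at &src.wait */
	return demo_list_empty(&dst.wait);
}

/* Returns 1 if the move helper leaves dst usable and src cleared. */
int demo_move_is_consistent(void)
{
	struct demo_pool src, dst;

	demo_pool_init(&src, 4);
	demo_pool_move(&dst, &src);
	return demo_list_empty(&dst.wait) && dst.min_nr == 4 && src.min_nr == 0;
}
```

In this sketch the plain copy leaves dst's list head pointing into src, so dst no longer looks like an empty list; the move variant avoids that by re-initializing the head at its new address.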
^ permalink raw reply related
* [RFC PATCH 4/5] powerpc: Add VSX regset to compat_regsets
From: Pedro Franco de Carvalho @ 2018-06-07 15:25 UTC (permalink / raw)
To: linuxppc-dev
In-Reply-To: <20180607152534.29427-1-pedromfc@linux.vnet.ibm.com>
This patch copies the missing VSX regset to the compat_regsets
array.
Not having this regset can cause issues in fs/binfmt_elf.c in the
fill_thread_core_info function, which iterates over all the regsets
defined in compat_regsets to fill note info for a core dump of a
32-bit thread. However, the number of regset notes allocated for
writing is the number of regsets with core_note_type != 0. If the
regset array has an entry with core_note_type == 0, which is the case
for the missing VSX element, this can cause later regsets to be
written outside the bounds of the allocated notes.
The compat_regset is also missing entries for REGSET_PMR and
REGSET_PKEY, but because these are at the end of the powerpc_regset
enum, the designated initializers for the compat_regset array don't
cause implicit elements to be created, like they did for REGSET_VSX.
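As a rough illustration of the paragraph above (a sketch with hypothetical demo_* names and made-up note type values, not the real regset definitions): with designated initializers, a skipped index in the middle of the array becomes a zero-filled element, while a skipped index past the last initializer does not exist at all.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the powerpc regset enum and
 * struct user_regset; only the field relevant here is kept. */
enum demo_regset {
	DEMO_GPR,
	DEMO_FPR,
	DEMO_VMX,
	DEMO_VSX,	/* middle of the enum, never initialized below */
	DEMO_SPE,
	DEMO_PMR,	/* end of the enum, never initialized below */
};

struct demo_user_regset {
	unsigned int core_note_type;	/* 0 means "no core note" */
};

/* Note type values are arbitrary placeholders. */
static const struct demo_user_regset demo_compat_regsets[] = {
	[DEMO_GPR] = { .core_note_type = 1 },
	[DEMO_FPR] = { .core_note_type = 2 },
	[DEMO_VMX] = { .core_note_type = 3 },
	[DEMO_SPE] = { .core_note_type = 4 },
};

#define DEMO_NREGSETS \
	(sizeof(demo_compat_regsets) / sizeof(demo_compat_regsets[0]))

/* Number of array elements, including the implicit zero-filled gap. */
size_t demo_nregsets(void)
{
	return DEMO_NREGSETS;
}

/* Number of entries that would get a core note (core_note_type != 0). */
size_t demo_count_notes(void)
{
	size_t i, n = 0;

	for (i = 0; i < DEMO_NREGSETS; i++)
		if (demo_compat_regsets[i].core_note_type != 0)
			n++;
	assert(n <= DEMO_NREGSETS);
	return n;
}

/* Note type stored at a given array index. */
unsigned int demo_note_type(size_t i)
{
	return demo_compat_regsets[i].core_note_type;
}
```

Here the array has five elements but only four of them carry a note type, so code that sizes a buffer by the note count and then indexes it by array position would step past the end, which is the kind of mismatch described above.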
---
arch/powerpc/kernel/ptrace.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index 69123feaef9e..2da0668a96dc 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -2237,6 +2237,13 @@ static const struct user_regset compat_regsets[] = {
.active = vr_active, .get = vr_get, .set = vr_set
},
#endif
+#ifdef CONFIG_VSX
+ [REGSET_VSX] = {
+ .core_note_type = NT_PPC_VSX, .n = 32,
+ .size = sizeof(double), .align = sizeof(double),
+ .active = vsr_active, .get = vsr_get, .set = vsr_set
+ },
+#endif
#ifdef CONFIG_SPE
[REGSET_SPE] = {
.core_note_type = NT_PPC_SPE, .n = 35,
--
2.13.6
^ permalink raw reply related
* [RFC PATCH 2/5] powerpc: Flush checkpointed gpr state for 32-bit processes in ptrace
From: Pedro Franco de Carvalho @ 2018-06-07 15:25 UTC (permalink / raw)
To: linuxppc-dev
In-Reply-To: <20180607152534.29427-1-pedromfc@linux.vnet.ibm.com>
Currently ptrace doesn't flush the register state when the
checkpointed GPRs of a 32-bit thread are accessed. This can cause core
dumps to have stale data in the checkpointed GPR note.
---
arch/powerpc/kernel/ptrace.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index 6618570c6d56..be8ca03a0bd5 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -2124,6 +2124,16 @@ static int tm_cgpr32_get(struct task_struct *target,
unsigned int pos, unsigned int count,
void *kbuf, void __user *ubuf)
{
+ if (!cpu_has_feature(CPU_FTR_TM))
+ return -ENODEV;
+
+ if (!MSR_TM_ACTIVE(target->thread.regs->msr))
+ return -ENODATA;
+
+ flush_tmregs_to_thread(target);
+ flush_fp_to_thread(target);
+ flush_altivec_to_thread(target);
+
return gpr32_get_common(target, regset, pos, count, kbuf, ubuf,
&target->thread.ckpt_regs.gpr[0]);
}
@@ -2133,6 +2143,16 @@ static int tm_cgpr32_set(struct task_struct *target,
unsigned int pos, unsigned int count,
const void *kbuf, const void __user *ubuf)
{
+ if (!cpu_has_feature(CPU_FTR_TM))
+ return -ENODEV;
+
+ if (!MSR_TM_ACTIVE(target->thread.regs->msr))
+ return -ENODATA;
+
+ flush_tmregs_to_thread(target);
+ flush_fp_to_thread(target);
+ flush_altivec_to_thread(target);
+
return gpr32_set_common(target, regset, pos, count, kbuf, ubuf,
&target->thread.ckpt_regs.gpr[0]);
}
--
2.13.6
^ permalink raw reply related
* [RFC PATCH 0/5] powerpc: Misc. ptrace regset fixes
From: Pedro Franco de Carvalho @ 2018-06-07 15:25 UTC (permalink / raw)
To: linuxppc-dev
This series attempts to fix a few issues with ptrace regsets.
Patch 1 simply inverts the active predicate for ebb_set. I don't know
if there was a reason for having opposite predicates in
ebb_get/ebb_set, but I assumed this was a typo.
Patch 2 adds the usual HTM prologue for regsets to the tm_cgpr32
get/set functions, so that the cgprs are flushed. I don't really
understand the need for flushing the fp and altivec states, but I
copied that over since it was done in the regular tm_cgpr get/set
functions.
Patch 3 changes the pmu get/set functions so that they don't read or
write outside the bounds of thread_struct.mmcr0. The endianness of the
kernel is used to determine where the mmcr0 word should be placed (or
read from) in its corresponding 64-bit slot in the regset. I am not
sure if this is the correct way to go, or if the endianness of the
thread being traced should determine this position (can the kernel run
threads with a different endianness?). I used the kernel endianness
because that is what seems to happen for other registers smaller than
their regset fields (for instance, it seems that checkpointed CR is
saved by the kernel as a doubleword, so the position of the word
depends on the kernel's endianness). The rest of the function assumes
that unsigned longs are doublewords, so the patch assumes that an
unsigned is a word. This patch (and the original pmu_get/set
functions) might not work if the kernel is compiled in 32 bits.
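To make the slot placement described for patch 3 concrete, here is a
minimal user-space sketch. It is not kernel code: word_offset_in_slot(),
host_is_big_endian() and read_word_from_slot() are hypothetical helpers
invented for this illustration, and the only assumption carried over from
the text is that a 32-bit value lives in a 64-bit slot at an offset chosen
by endianness.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte offset of a 32-bit word inside its 64-bit slot (hypothetical helper). */
size_t word_offset_in_slot(int big_endian)
{
	/* Big-endian keeps the low-order word in the second half of the slot. */
	return big_endian ? sizeof(uint32_t) : 0;
}

/* Returns nonzero when the machine running this code is big-endian. */
int host_is_big_endian(void)
{
	const uint16_t probe = 1;
	unsigned char first;

	memcpy(&first, &probe, 1);
	return first == 0;
}

/* Zero-extend a word into a 64-bit slot, then read it back via the offset. */
uint32_t read_word_from_slot(uint32_t value)
{
	uint64_t slot = value;
	uint32_t out;
	size_t off = word_offset_in_slot(host_is_big_endian());

	memcpy(&out, (unsigned char *)&slot + off, sizeof(out));
	return out;
}
```

Reading the word back through the computed offset returns the original
value on either byte order, which is the property the mmcr0_offset logic
in patch 3 appears to rely on.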
Patch 4 adds the VSX regset to compat_regsets; its absence could cause
out-of-bounds writes in fs/binfmt_elf.c.
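The effect of a missing designated initializer can be seen in a small
stand-alone sketch. The enum, struct and values below are made up for
illustration and are not the kernel's definitions; the point is only that
an omitted index in the middle of the array still produces an element, and
that element is zero-filled.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the regset enum and struct user_regset. */
enum demo_regset { DEMO_GPR, DEMO_FPR, DEMO_VSX, DEMO_SPE };

struct demo_entry {
	unsigned int note_type;
	unsigned int n;
};

static const struct demo_entry demo_regsets[] = {
	[DEMO_GPR] = { .note_type = 1, .n = 48 },
	[DEMO_FPR] = { .note_type = 2, .n = 33 },
	/* DEMO_VSX is deliberately left out. */
	[DEMO_SPE] = { .note_type = 4, .n = 35 },
};

/* Number of elements the compiler actually allocated. */
size_t demo_regset_count(void)
{
	return sizeof(demo_regsets) / sizeof(demo_regsets[0]);
}

/* Nonzero when the entry at index i was implicitly zero-initialized. */
int demo_entry_is_empty(size_t i)
{
	return demo_regsets[i].note_type == 0 && demo_regsets[i].n == 0;
}
```

The array has four elements even though only three were written out, so
any code that walks it by index sees an all-zero entry at DEMO_VSX.
Entries omitted at the end of the enum, by contrast, simply make the array
shorter.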
Patch 5 adds the PMU regset to compat_regsets.
I also noticed that the regset for CGPRs for 32-bit threads has 48 * 8
bytes (same as the one for 64-bit threads), but the data only occupies
the first 48 * 4 bytes (like for the 32-bit GPR regset). I am not sure
if this was intended, or if it can be changed now that other programs
might already assume the 48 * 8 size. If the kernel is compiled in
32-bits, the size will change (because it depends on sizeof (long)),
but I don't know if HTM and the corresponding regsets are supported in
the first place for a 32-bit kernel.
I haven't added the PKEY regset to compat_regsets. Does that make
sense for 32-bit threads?
Pedro Franco de Carvalho (5):
powerpc: Fix inverted active predicate for setting the EBB regset
powerpc: Flush checkpointed gpr state for 32-bit processes in ptrace
powerpc: Fix pmu get/set functions
powerpc: Add VSX regset to compat_regsets
powerpc: Add PMU regset to compat_regsets
arch/powerpc/kernel/ptrace.c | 65 ++++++++++++++++++++++++++++++++++++++++----
1 file changed, 60 insertions(+), 5 deletions(-)
--
2.13.6
^ permalink raw reply
* [RFC PATCH 1/5] powerpc: Fix inverted active predicate for setting the EBB regset
From: Pedro Franco de Carvalho @ 2018-06-07 15:25 UTC (permalink / raw)
To: linuxppc-dev
In-Reply-To: <20180607152534.29427-1-pedromfc@linux.vnet.ibm.com>
Currently, the ebb_set function for writing to the EBB regset returns
ENODATA when ebb is active in the thread, and copies in the data when
it is inactive. This patch inverts the condition so that it matches
ebb_get and ebb_active.
---
arch/powerpc/kernel/ptrace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index d23cf632edf0..6618570c6d56 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -1701,7 +1701,7 @@ static int ebb_set(struct task_struct *target,
if (!cpu_has_feature(CPU_FTR_ARCH_207S))
return -ENODEV;
- if (target->thread.used_ebb)
+ if (!target->thread.used_ebb)
return -ENODATA;
ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
--
2.13.6
^ permalink raw reply related
* [RFC PATCH 3/5] powerpc: Fix pmu get/set functions
From: Pedro Franco de Carvalho @ 2018-06-07 15:25 UTC (permalink / raw)
To: linuxppc-dev
In-Reply-To: <20180607152534.29427-1-pedromfc@linux.vnet.ibm.com>
The PMU regset exposed through ptrace has 5 64-bit words, which are
all copied in and out. However, mmcr0 in the thread_struct is an
unsigned, which causes pmu_set to clobber the next variable in the
thread_struct (used_ebb), and pmu_get to return the same variable in
one half of the mmcr0 slot.
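The overlap can be illustrated outside the kernel with a hypothetical
layout. struct demo_thread below is an assumption made for this sketch
(four unsigned long fields followed by mmcr0 and used_ebb); it is not the
real thread_struct.

```c
#include <stddef.h>

/* Hypothetical layout: mmcr0 is a 32-bit field followed by used_ebb. */
struct demo_thread {
	unsigned long siar;
	unsigned long sdar;
	unsigned long sier;
	unsigned long mmcr2;
	unsigned int mmcr0;
	unsigned int used_ebb;
};

/* Nonzero if an access of `width` bytes starting at mmcr0 touches used_ebb. */
int access_reaches_used_ebb(size_t width)
{
	size_t start = offsetof(struct demo_thread, mmcr0);
	size_t next = offsetof(struct demo_thread, used_ebb);

	return start + width > next;
}
```

With this layout an 8-byte copy starting at mmcr0 runs into used_ebb,
while a 4-byte copy stays inside the field.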
---
arch/powerpc/kernel/ptrace.c | 31 +++++++++++++++++++++++++++----
1 file changed, 27 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index be8ca03a0bd5..69123feaef9e 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -1733,6 +1733,9 @@ static int pmu_get(struct task_struct *target,
unsigned int pos, unsigned int count,
void *kbuf, void __user *ubuf)
{
+ int ret = 0;
+ unsigned long mmcr0 = target->thread.mmcr0;
+
/* Build tests */
BUILD_BUG_ON(TSO(siar) + sizeof(unsigned long) != TSO(sdar));
BUILD_BUG_ON(TSO(sdar) + sizeof(unsigned long) != TSO(sier));
@@ -1742,9 +1745,16 @@ static int pmu_get(struct task_struct *target,
if (!cpu_has_feature(CPU_FTR_ARCH_207S))
return -ENODEV;
- return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
- &target->thread.siar, 0,
+ ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf,
+ &target->thread.siar, 0,
+ 4 * sizeof(unsigned long));
+
+ if (!ret)
+ ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf,
+ &mmcr0, 4 * sizeof(unsigned long),
5 * sizeof(unsigned long));
+
+ return ret;
}
static int pmu_set(struct task_struct *target,
@@ -1754,6 +1764,12 @@ static int pmu_set(struct task_struct *target,
{
int ret = 0;
+#ifdef __BIG_ENDIAN
+ int mmcr0_offset = sizeof(unsigned);
+#else
+ int mmcr0_offset = 0;
+#endif
+
/* Build tests */
BUILD_BUG_ON(TSO(siar) + sizeof(unsigned long) != TSO(sdar));
BUILD_BUG_ON(TSO(sdar) + sizeof(unsigned long) != TSO(sier));
@@ -1783,9 +1799,16 @@ static int pmu_set(struct task_struct *target,
4 * sizeof(unsigned long));
if (!ret)
+ ret = user_regset_copyin_ignore(&pos, &count, &kbuf,
+ &ubuf, 4 * sizeof(unsigned long),
+ 4 * sizeof(unsigned long) + mmcr0_offset);
+
+ if (!ret)
ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
- &target->thread.mmcr0, 4 * sizeof(unsigned long),
- 5 * sizeof(unsigned long));
+ &target->thread.mmcr0,
+ 4 * sizeof(unsigned long) + mmcr0_offset,
+ 4 * sizeof(unsigned long) + mmcr0_offset
+ + sizeof (unsigned));
return ret;
}
#endif
--
2.13.6
^ permalink raw reply related
* [RFC PATCH 5/5] powerpc: Add PMU regset to compat_regsets
From: Pedro Franco de Carvalho @ 2018-06-07 15:25 UTC (permalink / raw)
To: linuxppc-dev
In-Reply-To: <20180607152534.29427-1-pedromfc@linux.vnet.ibm.com>
This patch allows setting and getting PMU registers from 32-bit
threads.
---
arch/powerpc/kernel/ptrace.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index 2da0668a96dc..3a9c4ae65429 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -2317,6 +2317,11 @@ static const struct user_regset compat_regsets[] = {
.size = sizeof(u64), .align = sizeof(u64),
.active = ebb_active, .get = ebb_get, .set = ebb_set
},
+ [REGSET_PMR] = {
+ .core_note_type = NT_PPC_PMU, .n = ELF_NPMU,
+ .size = sizeof(u64), .align = sizeof(u64),
+ .active = pmu_active, .get = pmu_get, .set = pmu_set
+ },
#endif
};
--
2.13.6
^ permalink raw reply related
* Re: Fwd: [powerpc/Baremetal]Kernel OOPS while executing memory hotplug on Power8 baremetal
From: Jens Axboe @ 2018-06-07 15:40 UTC (permalink / raw)
To: vrbagal1, Bart Van Assche, kent.overstreet, snitzer, linux-block
Cc: linux-scsi, linuxppc-dev, sachinp, Linuxppc-dev
In-Reply-To: <b1b67ad6-b300-7def-5851-b11baa6edf97@kernel.dk>
On 6/7/18 8:45 AM, Jens Axboe wrote:
> On 6/7/18 4:37 AM, vrbagal1 wrote:
>> On 2018-06-07 13:12, Bart Van Assche wrote:
>>> On Thu, 2018-06-07 at 12:56 +0530, Venkat Rao B wrote:
>>>> On Thursday 07 June 2018 12:46 PM, Bart Van Assche wrote:
>>>>> On Thu, 2018-06-07 at 12:38 +0530, vrbagal1 wrote:
>>>>>> Observing Kernel oops and machine reboots while executing memory hotplug
>>>>>> test case, on Power8 Baremetal machine.
>>>>>>
>>>>>> I see this is introduced some where between rc6 and 4.17.
>>>>>
>>>>> Please provide the exact versions (git commit IDs) of the kernel versions
>>>>> you have tested.
>>>>
>>>> Commit Id ---> 5037be168f
>>>
>>> The reason I was asking for the commit ID is because I saw that
>>> clone_endio()
>>> occurs in the oops which means that the dm driver is involved. An
>>> important fix
>>> for the dm driver went upstream recently, namely d37753540568 ("dm: Use
>>> kzalloc
>>> for all structs with embedded biosets/mempools"). Can you double check
>>> whether
>>> that commit is present in your tree? If it is not present, please
>>> update to the
>>> latest master and retest. If it is present, please report how to
>>> reproduce
>>> this oops to Kent Overstreet, Jens Axboe, linux-block and Mike Snitzer.
>>>
>>> Thanks,
>>>
>>> Bart.
>>
>>
>> Yes, the fix is present in the tree, which I have tested.
>>
>> Steps to reproduce:
>>
>> Step1: Clone and Install avocado git clone
>> https://github.com/avocado-framework/avocado.git
>> Step2: Clone
>> https://github.com/avocado-framework-tests/avocado-misc-tests.git
>> Test case is
>> https://github.com/avocado-framework-tests/avocado-misc-tests/blob/master/memory/memhotplug.py
>> Step3: Command to run the test is avocado run
>> avocado-misc-tests/memory/memhotplug.py
>
> Can you try with the below? Not a fully formed fix since I'd prefer
> if the dm bioset copy stuff was changed instead, but worth a shot.
This is closer to an actual fix, please try that instead.
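As background for the diff below, which replaces structure assignment with
re-initialization: the following is a generic user-space sketch of why
copying a structure by assignment can be unsafe when it embeds
self-referential members. It is an illustration only, not a statement about
the exact cause of this oops; demo_list, demo_pool and the helper functions
are hypothetical names, not block-layer APIs.

```c
#include <stddef.h>

/* Hypothetical circular list head, pointing at itself when empty. */
struct demo_list {
	struct demo_list *next;
	struct demo_list *prev;
};

/* Hypothetical pool that embeds such a list head. */
struct demo_pool {
	struct demo_list waiters;
	int min_nr;
};

void demo_pool_init(struct demo_pool *p, int min_nr)
{
	p->waiters.next = &p->waiters;
	p->waiters.prev = &p->waiters;
	p->min_nr = min_nr;
}

/* An empty embedded list must point back at its own address. */
int demo_pool_is_self_consistent(const struct demo_pool *p)
{
	return p->waiters.next == &p->waiters &&
	       p->waiters.prev == &p->waiters;
}

/* Re-initialize dst from src's parameters instead of copying its bytes. */
void demo_pool_init_from_src(struct demo_pool *dst, const struct demo_pool *src)
{
	demo_pool_init(dst, src->min_nr);
}

/* Returns 1: after `dst = src`, dst's list head still points into src. */
int demo_copy_by_assignment_is_broken(void)
{
	struct demo_pool src, dst;

	demo_pool_init(&src, 4);
	dst = src;
	return !demo_pool_is_self_consistent(&dst);
}
```

Re-initializing from the source's parameters, as demo_pool_init_from_src()
does here, keeps every embedded pointer valid for the new object.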
diff --git a/block/bio.c b/block/bio.c
index 595663e0281a..0616d86b15c6 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1967,6 +1967,21 @@ int bioset_init(struct bio_set *bs,
}
EXPORT_SYMBOL(bioset_init);
+int bioset_init_from_src(struct bio_set *new, struct bio_set *src)
+{
+ unsigned int pool_size = src->bio_pool.min_nr;
+ int flags;
+
+ flags = 0;
+ if (src->bvec_pool.min_nr)
+ flags |= BIOSET_NEED_BVECS;
+ if (src->rescue_workqueue)
+ flags |= BIOSET_NEED_RESCUER;
+
+ return bioset_init(new, pool_size, src->front_pad, flags);
+}
+EXPORT_SYMBOL(bioset_init_from_src);
+
#ifdef CONFIG_BLK_CGROUP
/**
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 98dff36b89a3..20a8d63754bf 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1953,9 +1953,10 @@ static void free_dev(struct mapped_device *md)
kvfree(md);
}
-static void __bind_mempools(struct mapped_device *md, struct dm_table *t)
+static int __bind_mempools(struct mapped_device *md, struct dm_table *t)
{
struct dm_md_mempools *p = dm_table_get_md_mempools(t);
+ int ret = 0;
if (dm_table_bio_based(t)) {
/*
@@ -1982,13 +1983,16 @@ static void __bind_mempools(struct mapped_device *md, struct dm_table *t)
bioset_initialized(&md->bs) ||
bioset_initialized(&md->io_bs));
- md->bs = p->bs;
- memset(&p->bs, 0, sizeof(p->bs));
- md->io_bs = p->io_bs;
- memset(&p->io_bs, 0, sizeof(p->io_bs));
+ ret = bioset_init_from_src(&md->bs, &p->bs);
+ if (ret)
+ goto out;
+ ret = bioset_init_from_src(&md->io_bs, &p->io_bs);
+ if (ret)
+ bioset_exit(&md->bs);
out:
/* mempool bind completed, no longer need any mempools in the table */
dm_table_free_md_mempools(t);
+ return ret;
}
/*
@@ -2033,6 +2037,7 @@ static struct dm_table *__bind(struct mapped_device *md, struct dm_table *t,
struct request_queue *q = md->queue;
bool request_based = dm_table_request_based(t);
sector_t size;
+ int ret;
lockdep_assert_held(&md->suspend_lock);
@@ -2068,7 +2073,11 @@ static struct dm_table *__bind(struct mapped_device *md, struct dm_table *t,
md->immutable_target = dm_table_get_immutable_target(t);
}
- __bind_mempools(md, t);
+ ret = __bind_mempools(md, t);
+ if (ret) {
+ old_map = ERR_PTR(ret);
+ goto out;
+ }
old_map = rcu_dereference_protected(md->map, lockdep_is_held(&md->suspend_lock));
rcu_assign_pointer(md->map, (void *)t);
@@ -2078,6 +2087,7 @@ static struct dm_table *__bind(struct mapped_device *md, struct dm_table *t,
if (old_map)
dm_sync_table(md);
+out:
return old_map;
}
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 810a8bee8f85..307682ac2f31 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -417,6 +417,7 @@ enum {
extern int bioset_init(struct bio_set *, unsigned int, unsigned int, int flags);
extern void bioset_exit(struct bio_set *);
extern int biovec_init_pool(mempool_t *pool, int pool_entries);
+extern int bioset_init_from_src(struct bio_set *new, struct bio_set *src);
extern struct bio *bio_alloc_bioset(gfp_t, unsigned int, struct bio_set *);
extern void bio_put(struct bio *);
--
Jens Axboe
^ permalink raw reply related
* Re: [RFC V2] virtio: Add platform specific DMA API translation for virito devices
From: Michael S. Tsirkin @ 2018-06-07 16:28 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Anshuman Khandual, Ram Pai, robh, aik, jasowang, linux-kernel,
virtualization, joe, linuxppc-dev, elfring, david, cohuck,
pawel.moll, Tom Lendacky, Rustad, Mark D
In-Reply-To: <20180607052306.GA1532@infradead.org>
On Wed, Jun 06, 2018 at 10:23:06PM -0700, Christoph Hellwig wrote:
> On Thu, May 31, 2018 at 08:43:58PM +0300, Michael S. Tsirkin wrote:
> > Pls work on a long term solution. Short term needs can be served by
> > enabling the iommu platform in qemu.
>
> So, I spent some time looking at converting virtio to dma ops overrides,
> and the current virtio spec, and the sad truth I have to tell is that
> both the spec and the Linux implementation are complete and utterly fucked
> up.
Let me restate it: DMA API has support for a wide range of hardware, and
hardware based virtio implementations likely won't benefit from all of
it.
And given virtio right now is optimized for specific workloads, improving
portability without regressing performance isn't easy.
I think it's unsurprising since it started as a strictly guest/host
mechanism. People did implement offloads on specific platforms though,
and they are known to work. To improve portability even further,
we might need to make spec and code changes.
I'm not really sympathetic to people complaining that they can't even
set a flag in qemu though. If that's the case the stack in question is
way too inflexible.
> Both in the flag naming and the implementation there is an implication
> of DMA API == IOMMU, which is fundamentally wrong.
Maybe we need to extend the meaning of PLATFORM_IOMMU or rename it.
It's possible that some setups will benefit from a more
fine-grained approach where some aspects of the DMA
API are bypassed, others aren't.
This seems to be what was being asked for in this thread,
with comments claiming IOMMU flag adds too much overhead.
> The DMA API does a few different things:
>
> a) address translation
>
> This does include IOMMUs. But it also includes random offsets
> between PCI bars and system memory that we see on various
> platforms.
I don't think you mean bars. That's unrelated to DMA.
> Worse so some of these offsets might be based on
> banks, e.g. on the broadcom bmips platform. It also deals
> with bitmask in physical addresses related to memory encryption
> like AMD SEV. I'd be really curious how for example the
> Intel virtio based NIC is going to work on any of those
> platforms.
SEV guys report that they just set the iommu flag and then it all works.
I guess if there's translation we can think of this as a kind of iommu.
Maybe we should rename PLATFORM_IOMMU to PLATFORM_TRANSLATION?
And apparently some people complain that just setting that flag makes
qemu check translation on each access with an unacceptable performance
overhead. Forcing same behaviour for everyone on general principles
even without the flag is unlikely to make them happy.
> b) coherency
>
> On many architectures DMA is not cache coherent, and we need
> to invalidate and/or write back cache lines before doing
> DMA. Again, I wonder how this is ever going to work with
> hardware based virtio implementations.
You mean dma_Xmb and friends?
There's a new feature VIRTIO_F_IO_BARRIER that's being proposed
for that.
> Even worse I think this
> is actually broken at least for VIVT even for virtualized
> implementations. E.g. a KVM guest is going to access memory
> using different virtual addresses than qemu, vhost might throw
> in another different address space.
I don't really know what VIVT is. Could you help me please?
> c) bounce buffering
>
> Many DMA implementations can not address all physical memory
> due to addressing limitations. In such cases we copy the
> DMA memory into a known addressable bounce buffer and DMA
> from there.
Don't do it then?
> d) flushing write combining buffers or similar
>
> On some hardware platforms we need workarounds to e.g. read
> from a certain mmio address to make sure DMA can actually
> see memory written by the host.
I guess it isn't an issue as long as WC isn't actually used.
It will become an issue when virtio spec adds some WC capability -
I suspect we can ignore this for now.
>
> All of this is bypassed by virtio by default despite generally being
> platform issues, not particular to a given device.
It's both a device and a platform issue. A PV device is often more like
another CPU than like a PCI device.
--
MST
^ permalink raw reply
* Re: [v2 PATCH 0/5] powerpc/pseries: Machien check handler improvements.
From: Mahesh Jagannath Salgaonkar @ 2018-06-07 16:32 UTC (permalink / raw)
To: Nicholas Piggin; +Cc: linuxppc-dev, Laurent Dufour, Aneesh Kumar K.V
In-Reply-To: <20180607204554.05565a56@roar.ozlabs.ibm.com>
On 06/07/2018 04:15 PM, Nicholas Piggin wrote:
> On Thu, 07 Jun 2018 15:36:25 +0530
> Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> wrote:
>
>> This patch series includes some improvement to Machine check handler
>> for pseries. Patch 1 fixes an issue where machine check handler crashes
>> kernel while accessing vmalloc-ed buffer while in nmi context.
>> Patch 3 dumps the SLB contents on SLB MCE errors to improve the debuggability.
>> Patch 4 displays the MCE error details on console.
>>
>> Change in V2:
>> - patch 4: Display additional info (NIP and task info) in MCE error details.
>> - patch 5: Fix endian bug while restoring r3 in MCE handler.
>>
>> ---
>>
>> Mahesh Salgaonkar (5):
>> powerpc/pseries: convert rtas_log_buf to linear allocation.
>> powerpc/pseries: Define MCE error event section.
>> powerpc/pseries: Dump and flush SLB contents on SLB MCE errors.
>> powerpc/pseries: Display machine check error details.
>> powerpc/pseries: Fix endainness while restoring of r3 in MCE handler.
>
> These look good, should patch 5 be moved to patch 2 and the first 2
> patches marked for stable?
Yup. Will move patch 5 to 2nd position.
>
> Do you also plan to dump SLB contents for bare metal MCEs?
Yes. That's the plan. Will do that separately.
Thanks,
-Mahesh.
^ permalink raw reply
* Re: [RFC PATCH -tip v5 18/27] powerpc/kprobes: Don't call the ->break_handler() in arm kprobes code
From: Naveen N. Rao @ 2018-06-07 16:37 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: Andrew Morton, H . Peter Anvin, linux-arch, linux-kernel,
linuxppc-dev, Ingo Molnar, Ingo Molnar, Paul Mackerras,
Steven Rostedt, Thomas Gleixner
In-Reply-To: <20180607232802.c5fcea960e94ef2f3cd4cde8@kernel.org>
Masami Hiramatsu wrote:
> On Thu, 07 Jun 2018 17:07:00 +0530
> "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote:
>
>> Masami Hiramatsu wrote:
>> > Don't call the ->break_handler() from the arm kprobes code,
>> ^^^ powerpc
>>
>> > because it was only used by jprobes which got removed.
>> >
>> > This also makes skip_singlestep() a static function since
>> > only ftrace-kprobe.c is using this function.
>> >
>> > Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
>> > Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> > Cc: Paul Mackerras <paulus@samba.org>
>> > Cc: Michael Ellerman <mpe@ellerman.id.au>
>> > Cc: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>
>> > Cc: linuxppc-dev@lists.ozlabs.org
>> > ---
>> > arch/powerpc/include/asm/kprobes.h | 10 ----------
>> > arch/powerpc/kernel/kprobes-ftrace.c | 16 +++-------------
>> > arch/powerpc/kernel/kprobes.c | 31 +++++++++++--------------------
>> > 3 files changed, 14 insertions(+), 43 deletions(-)
>>
>> With 2 small comments...
>
> 2 ? or 1 ?
Two, with one in the commit log above :)
- Naveen
^ permalink raw reply
* Re: [RFC PATCH kernel 0/5] powerpc/P9/vfio: Pass through NVIDIA Tesla V100
From: Alex Williamson @ 2018-06-07 17:04 UTC (permalink / raw)
To: Alexey Kardashevskiy
Cc: linuxppc-dev, David Gibson, kvm-ppc, Benjamin Herrenschmidt,
Ram Pai, kvm, Alistair Popple
In-Reply-To: <20180607084420.29513-1-aik@ozlabs.ru>
On Thu, 7 Jun 2018 18:44:15 +1000
Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> Here is an rfc of some patches adding pass-through support
> for NVIDIA V100 GPU found in some POWER9 boxes.
>
> The example P9 system has 6 GPUs, each accompanied with 2 bridges
> representing the hardware links (aka NVLink2):
>
> 4 0004:04:00.0 3D: NVIDIA Corporation GV100GL [Tesla V100 SXM2] (rev a1)
> 5 0004:05:00.0 3D: NVIDIA Corporation GV100GL [Tesla V100 SXM2] (rev a1)
> 6 0004:06:00.0 3D: NVIDIA Corporation GV100GL [Tesla V100 SXM2] (rev a1)
> 4 0006:00:00.0 Bridge: IBM Device 04ea (rev 01)
> 4 0006:00:00.1 Bridge: IBM Device 04ea (rev 01)
> 5 0006:00:01.0 Bridge: IBM Device 04ea (rev 01)
> 5 0006:00:01.1 Bridge: IBM Device 04ea (rev 01)
> 6 0006:00:02.0 Bridge: IBM Device 04ea (rev 01)
> 6 0006:00:02.1 Bridge: IBM Device 04ea (rev 01)
> 10 0007:00:00.0 Bridge: IBM Device 04ea (rev 01)
> 10 0007:00:00.1 Bridge: IBM Device 04ea (rev 01)
> 11 0007:00:01.0 Bridge: IBM Device 04ea (rev 01)
> 11 0007:00:01.1 Bridge: IBM Device 04ea (rev 01)
> 12 0007:00:02.0 Bridge: IBM Device 04ea (rev 01)
> 12 0007:00:02.1 Bridge: IBM Device 04ea (rev 01)
> 10 0035:03:00.0 3D: NVIDIA Corporation GV100GL [Tesla V100 SXM2] (rev a1)
> 11 0035:04:00.0 3D: NVIDIA Corporation GV100GL [Tesla V100 SXM2] (rev a1)
> 12 0035:05:00.0 3D: NVIDIA Corporation GV100GL [Tesla V100 SXM2] (rev a1)
>
> ^^ the number is an IOMMU group ID.
Can we back up and discuss whether the IOMMU grouping of NVLink
connected devices makes sense? AIUI we have a PCI view of these
devices and from that perspective they're isolated. That's the view of
the device used to generate the grouping. However, not visible to us,
these devices are interconnected via NVLink. What isolation properties
does NVLink provide given that its entire purpose for existing seems to
be to provide a high performance link for p2p between devices?
> Each bridge represents an additional hardware interface called "NVLink2",
> it is not a PCI link but a separate bus. The design inherits from original
> NVLink from POWER8.
>
> The new feature of V100 is 16GB of cache coherent memory on GPU board.
> This memory is presented to the host via the device tree and remains offline
> until the NVIDIA driver loads, trains NVLink2 (via the config space of these
> bridges above) and the nvidia-persistenced daemon then onlines it.
> The memory remains online as long as nvidia-persistenced is running, when
> it stops, it offlines the memory.
>
> The number of GPUs suggests passing them through to a guest. However,
> in order to do so we cannot use the NVIDIA driver so we have a host with
> a 128GB window (bigger or equal to actual GPU RAM size) in a system memory
> with no page structs backing this window and we cannot touch this memory
> before the NVIDIA driver configures it in a host or a guest as
> HMI (hardware management interrupt?) occurs.
Having a lot of GPUs only suggests assignment to a guest if there's
actually isolation provided between those GPUs. Otherwise we'd need to
assign them as one big group, which gets a lot less useful. Thanks,
Alex
> On the example system the GPU RAM windows are located at:
> 0x0400 0000 0000
> 0x0420 0000 0000
> 0x0440 0000 0000
> 0x2400 0000 0000
> 0x2420 0000 0000
> 0x2440 0000 0000
>
> So the complications are:
>
> 1. cannot touch the GPU memory till it is trained, i.e. cannot add ptes
> to VFIO-to-userspace or guest-to-host-physical translations till
> the driver trains it (i.e. nvidia-persistenced has started), otherwise
> prefetching happens and HMI occurs; I am trying to get this changed
> somehow;
>
> 2. since it appears as normal cache coherent memory, it will be used
> for DMA which means it has to be pinned and mapped in the host. Having
> no page structs makes it different from the usual case - we only need
> translate user addresses to host physical and map GPU RAM memory but
> pinning is not required.
>
> This series maps GPU RAM via the GPU vfio-pci device so QEMU can then
> register this memory as a KVM memory slot and present memory nodes to
> the guest. Unless NVIDIA provides a userspace driver, this is no use
> for things like DPDK.
>
>
> There is another problem which the series does not address but worth
> mentioning - it is not strictly necessary to map GPU RAM to the guest
> exactly where it is in the host (I tested this to some extent), we still
> might want to represent the memory at the same offset as on the host
> which increases the size of a TCE table needed to cover such a huge
> window: (((0x244000000000 + 0x2000000000) >> 16)*8)>>20 = 4656MB
> I am addressing this in a separate patchset by allocating indirect TCE
> levels on demand and using 16MB IOMMU pages in the guest as we can now
> back emulated pages with the smaller hardware ones.
>
>
> This is an RFC. Please comment. Thanks.
>
>
>
> Alexey Kardashevskiy (5):
> vfio/spapr_tce: Simplify page contained test
> powerpc/iommu_context: Change referencing in API
> powerpc/iommu: Do not pin memory of a memory device
> vfio_pci: Allow mapping extra regions
> vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] [10de:1db1] subdriver
>
> drivers/vfio/pci/Makefile | 1 +
> arch/powerpc/include/asm/mmu_context.h | 5 +-
> drivers/vfio/pci/vfio_pci_private.h | 11 ++
> include/uapi/linux/vfio.h | 3 +
> arch/powerpc/kernel/iommu.c | 8 +-
> arch/powerpc/mm/mmu_context_iommu.c | 70 +++++++++---
> drivers/vfio/pci/vfio_pci.c | 19 +++-
> drivers/vfio/pci/vfio_pci_nvlink2.c | 190 +++++++++++++++++++++++++++++++++
> drivers/vfio/vfio_iommu_spapr_tce.c | 42 +++++---
> drivers/vfio/pci/Kconfig | 4 +
> 10 files changed, 319 insertions(+), 34 deletions(-)
> create mode 100644 drivers/vfio/pci/vfio_pci_nvlink2.c
>
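The TCE table size estimate in the quoted cover letter can be checked with
a short calculation. The sketch below assumes, as the quoted expression
does, 64K IOMMU pages (a shift of 16) and 8 bytes per table entry;
tce_table_size_mb() is a hypothetical helper written for this check, not a
kernel function.

```c
#include <stdint.h>

/*
 * Size in MB of a flat table that covers addresses [0, window_end),
 * with one entry of entry_bytes per page of (1 << page_shift) bytes.
 */
uint64_t tce_table_size_mb(uint64_t window_end, unsigned int page_shift,
			   unsigned int entry_bytes)
{
	uint64_t entries = window_end >> page_shift;

	return (entries * entry_bytes) >> 20;
}
```

Evaluating it for the last GPU RAM window, with window_end set to
0x244000000000 + 0x2000000000, a page shift of 16 and 8-byte entries,
gives 4656 MB.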
^ permalink raw reply
* Re: [RFC PATCH kernel 5/5] vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] [10de:1db1] subdriver
From: Alex Williamson @ 2018-06-07 17:04 UTC (permalink / raw)
To: Alexey Kardashevskiy
Cc: linuxppc-dev, David Gibson, kvm-ppc, Benjamin Herrenschmidt,
Ram Pai, kvm, Alistair Popple
In-Reply-To: <20180607084420.29513-6-aik@ozlabs.ru>
On Thu, 7 Jun 2018 18:44:20 +1000
Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> Some POWER9 chips come with special NVLink2 links which provide
> cacheable memory access to the RAM physically located on NVIDIA GPU.
> This memory is presented to a host via the device tree but remains
> offline until the NVIDIA driver onlines it.
>
> This exports this RAM to the userspace as a new region so
> the NVIDIA driver in the guest can train these links and online GPU RAM.
>
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> drivers/vfio/pci/Makefile | 1 +
> drivers/vfio/pci/vfio_pci_private.h | 8 ++
> include/uapi/linux/vfio.h | 3 +
> drivers/vfio/pci/vfio_pci.c | 9 ++
> drivers/vfio/pci/vfio_pci_nvlink2.c | 190 ++++++++++++++++++++++++++++++++++++
> drivers/vfio/pci/Kconfig | 4 +
> 6 files changed, 215 insertions(+)
> create mode 100644 drivers/vfio/pci/vfio_pci_nvlink2.c
>
> diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
> index 76d8ec0..9662c06 100644
> --- a/drivers/vfio/pci/Makefile
> +++ b/drivers/vfio/pci/Makefile
> @@ -1,5 +1,6 @@
>
> vfio-pci-y := vfio_pci.o vfio_pci_intrs.o vfio_pci_rdwr.o vfio_pci_config.o
> vfio-pci-$(CONFIG_VFIO_PCI_IGD) += vfio_pci_igd.o
> +vfio-pci-$(CONFIG_VFIO_PCI_NVLINK2) += vfio_pci_nvlink2.o
>
> obj-$(CONFIG_VFIO_PCI) += vfio-pci.o
> diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
> index 86aab05..7115b9b 100644
> --- a/drivers/vfio/pci/vfio_pci_private.h
> +++ b/drivers/vfio/pci/vfio_pci_private.h
> @@ -160,4 +160,12 @@ static inline int vfio_pci_igd_init(struct vfio_pci_device *vdev)
> return -ENODEV;
> }
> #endif
> +#ifdef CONFIG_VFIO_PCI_NVLINK2
> +extern int vfio_pci_nvlink2_init(struct vfio_pci_device *vdev);
> +#else
> +static inline int vfio_pci_nvlink2_init(struct vfio_pci_device *vdev)
> +{
> + return -ENODEV;
> +}
> +#endif
> #endif /* VFIO_PCI_PRIVATE_H */
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index 1aa7b82..2fe8227 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -301,6 +301,9 @@ struct vfio_region_info_cap_type {
> #define VFIO_REGION_SUBTYPE_INTEL_IGD_HOST_CFG (2)
> #define VFIO_REGION_SUBTYPE_INTEL_IGD_LPC_CFG (3)
>
> +/* NVIDIA GPU NV2 */
> +#define VFIO_REGION_SUBTYPE_NVIDIA_NVLINK2 (4)
You're continuing the Intel vendor ID sub-types for an NVIDIA vendor ID
subtype. Each vendor has their own address space of sub-types.
> +
> /*
> * The MSIX mappable capability informs that MSIX data of a BAR can be mmapped
> * which allows direct access to non-MSIX registers which happened to be within
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index 7bddf1e..38c9475 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -306,6 +306,15 @@ static int vfio_pci_enable(struct vfio_pci_device *vdev)
> }
> }
>
> + if (pdev->vendor == PCI_VENDOR_ID_NVIDIA &&
> + pdev->device == 0x1db1 &&
> + IS_ENABLED(CONFIG_VFIO_PCI_NVLINK2)) {
Can't we do better than check this based on device ID? Perhaps PCIe
capability hints at this?
Is it worthwhile to continue with assigning the device in the !ENABLED
case? For instance, maybe it would be better to provide a weak
definition of vfio_pci_nvlink2_init() that would cause us to fail here
if we don't have this device specific support enabled. I realize
you're following the example set forth for IGD, but those regions are
optional, for better or worse.
> + ret = vfio_pci_nvlink2_init(vdev);
> + if (ret)
> + dev_warn(&vdev->pdev->dev,
> + "Failed to setup NVIDIA NV2 RAM region\n");
> + }
> +
> vfio_pci_probe_mmaps(vdev);
>
> return 0;
> diff --git a/drivers/vfio/pci/vfio_pci_nvlink2.c b/drivers/vfio/pci/vfio_pci_nvlink2.c
> new file mode 100644
> index 0000000..451c5cb
> --- /dev/null
> +++ b/drivers/vfio/pci/vfio_pci_nvlink2.c
> @@ -0,0 +1,190 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +/*
> + * VFIO PCI NVIDIA Whitherspoon GPU support a.k.a. NVLink2.
> + *
> + * Copyright (C) 2018 IBM Corp. All rights reserved.
> + * Author: Alexey Kardashevskiy <aik@ozlabs.ru>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * Register an on-GPU RAM region for cacheable access.
> + *
> + * Derived from original vfio_pci_igd.c:
> + * Copyright (C) 2016 Red Hat, Inc. All rights reserved.
> + * Author: Alex Williamson <alex.williamson@redhat.com>
> + */
> +
> +#include <linux/io.h>
> +#include <linux/pci.h>
> +#include <linux/uaccess.h>
> +#include <linux/vfio.h>
> +#include <linux/sched/mm.h>
> +#include <linux/mmu_context.h>
> +
> +#include "vfio_pci_private.h"
> +
> +struct vfio_pci_nvlink2_data {
> + unsigned long gpu_hpa;
> + unsigned long useraddr;
> + unsigned long size;
> + struct mm_struct *mm;
> + struct mm_iommu_table_group_mem_t *mem;
> +};
> +
> +static size_t vfio_pci_nvlink2_rw(struct vfio_pci_device *vdev,
> + char __user *buf, size_t count, loff_t *ppos, bool iswrite)
> +{
> + unsigned int i = VFIO_PCI_OFFSET_TO_INDEX(*ppos) - VFIO_PCI_NUM_REGIONS;
> + void *base = vdev->region[i].data;
> + loff_t pos = *ppos & VFIO_PCI_OFFSET_MASK;
> +
> + if (pos >= vdev->region[i].size)
> + return -EINVAL;
> +
> + count = min(count, (size_t)(vdev->region[i].size - pos));
> +
> + if (iswrite) {
> + if (copy_from_user(base + pos, buf, count))
> + return -EFAULT;
> + } else {
> + if (copy_to_user(buf, base + pos, count))
> + return -EFAULT;
> + }
> + *ppos += count;
> +
> + return count;
> +}
> +
> +static void vfio_pci_nvlink2_release(struct vfio_pci_device *vdev,
> + struct vfio_pci_region *region)
> +{
> + struct vfio_pci_nvlink2_data *data = region->data;
> + long ret;
> +
> + ret = mm_iommu_put(data->mm, data->mem);
> + WARN_ON(ret);
> +
> + mmdrop(data->mm);
> + kfree(data);
> +}
> +
> +static int vfio_pci_nvlink2_mmap_fault(struct vm_fault *vmf)
> +{
> + struct vm_area_struct *vma = vmf->vma;
> + struct vfio_pci_region *region = vma->vm_private_data;
> + struct vfio_pci_nvlink2_data *data = region->data;
> + int ret;
> + unsigned long vmf_off = (vmf->address - vma->vm_start) >> PAGE_SHIFT;
> + unsigned long nv2pg = data->gpu_hpa >> PAGE_SHIFT;
> + unsigned long vm_pgoff = vma->vm_pgoff &
> + ((1U << (VFIO_PCI_OFFSET_SHIFT - PAGE_SHIFT)) - 1);
> + unsigned long pfn = nv2pg + vm_pgoff + vmf_off;
> +
> + ret = vm_insert_pfn(vma, vmf->address, pfn);
> + /* TODO: make it a tracepoint */
> + pr_debug("NVLink2: vmf=%lx hpa=%lx ret=%d\n",
> + vmf->address, pfn << PAGE_SHIFT, ret);
> + if (ret)
> + return VM_FAULT_SIGSEGV;
> +
> + return VM_FAULT_NOPAGE;
> +}
> +
> +static const struct vm_operations_struct vfio_pci_nvlink2_mmap_vmops = {
> + .fault = vfio_pci_nvlink2_mmap_fault,
> +};
> +
> +static int vfio_pci_nvlink2_mmap(struct vfio_pci_device *vdev,
> + struct vfio_pci_region *region, struct vm_area_struct *vma)
> +{
> + long ret;
> + struct vfio_pci_nvlink2_data *data = region->data;
> +
> + if (data->useraddr)
> + return -EPERM;
> +
> + if (vma->vm_end - vma->vm_start > data->size)
> + return -EINVAL;
> +
> + vma->vm_private_data = region;
> + vma->vm_flags |= VM_PFNMAP;
> + vma->vm_ops = &vfio_pci_nvlink2_mmap_vmops;
> +
> + /*
> + * Calling mm_iommu_newdev() here once as the region is not
> + * registered yet and therefore right initialization will happen now.
> + * Other places will use mm_iommu_find() which returns
> + * registered @mem and does not go gup().
> + */
> + data->useraddr = vma->vm_start;
> + data->mm = current->mm;
> + atomic_inc(&data->mm->mm_count);
> + ret = mm_iommu_newdev(data->mm, data->useraddr,
> + (vma->vm_end - vma->vm_start) >> PAGE_SHIFT,
> + data->gpu_hpa, &data->mem);
> +
> + pr_debug("VFIO NVLINK2 mmap: useraddr=%lx hpa=%lx size=%lx ret=%ld\n",
> + data->useraddr, data->gpu_hpa,
> + vma->vm_end - vma->vm_start, ret);
> +
> + return ret;
> +}
> +
> +static const struct vfio_pci_regops vfio_pci_nvlink2_regops = {
> + .rw = vfio_pci_nvlink2_rw,
> + .release = vfio_pci_nvlink2_release,
> + .mmap = vfio_pci_nvlink2_mmap,
> +};
> +
> +int vfio_pci_nvlink2_init(struct vfio_pci_device *vdev)
> +{
> + int len = 0, ret;
> + struct device_node *npu_node, *mem_node;
> + struct pci_dev *npu_dev;
> + uint32_t *mem_phandle, *val;
> + struct vfio_pci_nvlink2_data *data;
> +
> + npu_dev = pnv_pci_get_npu_dev(vdev->pdev, 0);
> + if (!npu_dev)
> + return -EINVAL;
> +
> + npu_node = pci_device_to_OF_node(npu_dev);
> + if (!npu_node)
> + return -EINVAL;
> +
> + mem_phandle = (void *) of_get_property(npu_node, "memory-region", NULL);
> + if (!mem_phandle)
> + return -EINVAL;
> +
> + mem_node = of_find_node_by_phandle(be32_to_cpu(*mem_phandle));
> + if (!mem_node)
> + return -EINVAL;
> +
> + val = (uint32_t *) of_get_property(mem_node, "reg", &len);
> + if (!val || len != 2 * sizeof(uint64_t))
> + return -EINVAL;
> +
> + data = kzalloc(sizeof(*data), GFP_KERNEL);
> + if (!data)
> + return -ENOMEM;
> +
> + data->gpu_hpa = ((uint64_t)be32_to_cpu(val[0]) << 32) |
> + be32_to_cpu(val[1]);
> + data->size = ((uint64_t)be32_to_cpu(val[2]) << 32) |
> + be32_to_cpu(val[3]);
> +
> + dev_dbg(&vdev->pdev->dev, "%lx..%lx\n", data->gpu_hpa,
> + data->gpu_hpa + data->size - 1);
> +
> + ret = vfio_pci_register_dev_region(vdev,
> + PCI_VENDOR_ID_NVIDIA | VFIO_REGION_TYPE_PCI_VENDOR_TYPE,
> + VFIO_REGION_SUBTYPE_NVIDIA_NVLINK2,
> + &vfio_pci_nvlink2_regops, data->size,
> + VFIO_REGION_INFO_FLAG_READ, data);
> + if (ret)
> + kfree(data);
> +
> + return ret;
> +}
> diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
> index 24ee260..2725bc8 100644
> --- a/drivers/vfio/pci/Kconfig
> +++ b/drivers/vfio/pci/Kconfig
> @@ -30,3 +30,7 @@ config VFIO_PCI_INTX
> config VFIO_PCI_IGD
> depends on VFIO_PCI
> def_bool y if X86
> +
> +config VFIO_PCI_NVLINK2
> + depends on VFIO_PCI
> + def_bool y if PPC_POWERNV
As written, this also depends on PPC_POWERNV (or at least TCE); it's not
a portable implementation that we could reuse on x86, ARM, or any
other platform if hardware appeared for it. Can we improve that as
well to make this less POWER-specific? Thanks,
Alex
^ permalink raw reply
* Re: [RFC PATCH kernel 4/5] vfio_pci: Allow mapping extra regions
From: Alex Williamson @ 2018-06-07 17:04 UTC (permalink / raw)
To: Alexey Kardashevskiy
Cc: linuxppc-dev, David Gibson, kvm-ppc, Benjamin Herrenschmidt,
Ram Pai, kvm, Alistair Popple
In-Reply-To: <20180607084420.29513-5-aik@ozlabs.ru>
On Thu, 7 Jun 2018 18:44:19 +1000
Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
What's an "extra region"? -ENOCOMMITLOG
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> drivers/vfio/pci/vfio_pci_private.h | 3 +++
> drivers/vfio/pci/vfio_pci.c | 10 ++++++++--
> 2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
> index cde3b5d..86aab05 100644
> --- a/drivers/vfio/pci/vfio_pci_private.h
> +++ b/drivers/vfio/pci/vfio_pci_private.h
> @@ -59,6 +59,9 @@ struct vfio_pci_regops {
> size_t count, loff_t *ppos, bool iswrite);
> void (*release)(struct vfio_pci_device *vdev,
> struct vfio_pci_region *region);
> + int (*mmap)(struct vfio_pci_device *vdev,
> + struct vfio_pci_region *region,
> + struct vm_area_struct *vma);
> };
>
> struct vfio_pci_region {
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index 3729937..7bddf1e 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -1123,10 +1123,16 @@ static int vfio_pci_mmap(void *device_data, struct vm_area_struct *vma)
> return -EINVAL;
> if ((vma->vm_flags & VM_SHARED) == 0)
> return -EINVAL;
> + if (index >= VFIO_PCI_NUM_REGIONS) {
> + int regnum = index - VFIO_PCI_NUM_REGIONS;
> + struct vfio_pci_region *region = vdev->region + regnum;
> +
> + if (region && region->ops && region->ops->mmap)
> + return region->ops->mmap(vdev, region, vma);
> + return -EINVAL;
> + }
> if (index >= VFIO_PCI_ROM_REGION_INDEX)
> return -EINVAL;
> - if (!vdev->bar_mmap_supported[index])
> - return -EINVAL;
This seems unrelated. Thanks,
Alex
> phys_len = PAGE_ALIGN(pci_resource_len(pdev, index));
> req_len = vma->vm_end - vma->vm_start;
^ permalink raw reply
* [v3 PATCH 0/5] powerpc/pseries: Machien check handler improvements.
From: Mahesh J Salgaonkar @ 2018-06-07 17:27 UTC (permalink / raw)
To: linuxppc-dev
Cc: Michael Ellerman, stable, Aneesh Kumar K.V, Aneesh Kumar K.V,
Michael Ellerman, Laurent Dufour, Nicholas Piggin
This patch series includes some improvements to the machine check handler
for pseries. Patch 1 fixes an issue where the machine check handler crashes
the kernel while accessing a vmalloc-ed buffer in NMI context.
Patch 2 fixes an endianness bug while restoring r3 in the MCE handler.
Patch 4 dumps the SLB contents on SLB MCE errors to improve debuggability.
Patch 5 displays the MCE error details on the console.
Changes in V3:
- Moved patch 5 to patch 2
Changes in V2:
- patch 3: Display additional info (NIP and task info) in MCE error details.
- patch 5: Fix endianness bug while restoring r3 in MCE handler.
---
Mahesh Salgaonkar (5):
powerpc/pseries: convert rtas_log_buf to linear allocation.
powerpc/pseries: Fix endainness while restoring of r3 in MCE handler.
powerpc/pseries: Define MCE error event section.
powerpc/pseries: Dump and flush SLB contents on SLB MCE errors.
powerpc/pseries: Display machine check error details.
arch/powerpc/include/asm/book3s/64/mmu-hash.h | 1
arch/powerpc/include/asm/rtas.h | 109 ++++++++++++++++++
arch/powerpc/kernel/rtasd.c | 2
arch/powerpc/mm/slb.c | 35 ++++++
arch/powerpc/platforms/pseries/ras.c | 155 +++++++++++++++++++++++++
5 files changed, 299 insertions(+), 3 deletions(-)
--
Signature
^ permalink raw reply
* [v3 PATCH 1/5] powerpc/pseries: convert rtas_log_buf to linear allocation.
From: Mahesh J Salgaonkar @ 2018-06-07 17:28 UTC (permalink / raw)
To: linuxppc-dev
Cc: stable, Aneesh Kumar K.V, Aneesh Kumar K.V, Michael Ellerman,
Laurent Dufour, Nicholas Piggin
In-Reply-To: <152839244928.25118.15100234720683911223.stgit@jupiter.in.ibm.com>
From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
rtas_log_buf is a buffer that holds RTAS event data communicated to the
kernel by the hypervisor. This buffer is then used to pass RTAS event
data to user space through procfs. It is allocated from the vmalloc
(non-linear mapping) area.
On a machine check interrupt, register r3 points to the RTAS extended
event log passed by the hypervisor, which contains the MCE event. The
pseries machine check handler then logs this error into rtas_log_buf.
Because rtas_log_buf is a vmalloc-ed (non-linear) buffer, we end up
taking a page fault (vector 0x300) while accessing it. Since the machine
check interrupt handler runs in NMI context, we cannot afford to take any
page fault. Page faults are not honored in NMI context and cause a
kernel panic. This patch fixes the issue by allocating rtas_log_buf
using kmalloc.
Fixes: b96672dd840f ("powerpc: Machine check interrupt is a non-maskable interrupt")
Cc: stable@vger.kernel.org
Suggested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
---
arch/powerpc/kernel/rtasd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/rtasd.c b/arch/powerpc/kernel/rtasd.c
index f915db93cd42..3957d4ae2ba2 100644
--- a/arch/powerpc/kernel/rtasd.c
+++ b/arch/powerpc/kernel/rtasd.c
@@ -559,7 +559,7 @@ static int __init rtas_event_scan_init(void)
rtas_error_log_max = rtas_get_error_log_max();
rtas_error_log_buffer_max = rtas_error_log_max + sizeof(int);
- rtas_log_buf = vmalloc(rtas_error_log_buffer_max*LOG_NUMBER);
+ rtas_log_buf = kmalloc(rtas_error_log_buffer_max*LOG_NUMBER, GFP_KERNEL);
if (!rtas_log_buf) {
printk(KERN_ERR "rtasd: no memory\n");
return -ENOMEM;
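
For a rough sense of the allocation size involved, the sketch below
redoes the arithmetic from the hunk in plain userspace C. The values
2048 for the maximum error log size and 64 for LOG_NUMBER are
assumptions for illustration (the thread does not state them), and the
helper names are hypothetical. Since kmalloc memory is physically
contiguous, the sketch also computes the smallest page order that would
cover the buffer:

```c
#include <stddef.h>

/* Assumed values for illustration only; not taken from this thread. */
#define SKETCH_ERROR_LOG_MAX	2048u
#define SKETCH_LOG_NUMBER	64u

/* Mirrors "rtas_error_log_max + sizeof(int)" times LOG_NUMBER. */
static size_t sketch_log_buf_size(size_t error_log_max, size_t nr_slots)
{
	return (error_log_max + sizeof(int)) * nr_slots;
}

/* Smallest order such that (page_size << order) covers "bytes". */
static unsigned int sketch_alloc_order(size_t bytes, size_t page_size)
{
	unsigned int order = 0;

	while ((page_size << order) < bytes)
		order++;

	return order;
}
```

With the assumed values and a 4-byte int,
sketch_log_buf_size(SKETCH_ERROR_LOG_MAX, SKETCH_LOG_NUMBER) is 131328
bytes, which sketch_alloc_order() rounds up to order 6 with 4 KiB pages
or order 2 with 64 KiB pages.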
^ permalink raw reply related