LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [musl] ppc64le and 32-bit LE userland compatibility
From: Will Springer @ 2020-05-30 22:56 UTC (permalink / raw)
  To: Rich Felker, linuxppc-dev
  Cc: libc-alpha, eery, daniel, musl, binutils, libc-dev
In-Reply-To: <20200529192426.GM1079@brightrain.aerifal.cx>

On Friday, May 29, 2020 12:24:27 PM PDT Rich Felker wrote:
> The argument passing for pread/pwrite is historically a mess and
> differs between archs. musl has a dedicated macro that archs can
> define to override it. But it looks like it should match regardless of
> BE vs LE, and musl already defines it for powerpc with the default
> definition, adding a zero arg to start on an even arg-slot index,
> which is an odd register (since ppc32 args start with an odd one, r3).
> 
> > [6]:
> > https://gist.github.com/Skirmisher/02891c1a8cafa0ff18b2460933ef4f3c
> I don't think this is correct, but I'm confused about where it's
> getting messed up because it looks like it should already be right.

Hmm, interesting. Will have to go back to it I guess...

> > This was enough to fix up the `file` bug. I'm no seasoned kernel
> > hacker, though, and there is still concern over the right way to
> > approach this, whether it should live in the kernel or libc, etc.
> > Frankly, I don't know the ABI structure enough to understand why the
> > register padding has to be different in this case, or what
> > lower-level component is responsible for it.. For comparison, I had a
> > look at the mips tree, since it's bi-endian and has a similar 32/64
> > situation. There is a macro conditional upon endianness that is
> > responsible for munging long longs; it uses __MIPSEB__ and __MIPSEL__
> > instead of an if/else on the generic __LITTLE_ENDIAN__. Not sure what
> > to make of that. (It also simply swaps registers for LE, unlike what
> > I did for ppc.)
> Indeed the problem is probably that you need to swap registers for LE,
> not remove the padding slot. Did you check what happens if you pass a
> value larger than 32 bits?
> 
> If so, the right way to fix this on the kernel side would be to
> construct the value as a union rather than by bitwise ops so it's
> endian-agnostic:
> 
> 	(union { u32 parts[2]; u64 val; }){{ arg1, arg2 }}.val
> 
> But the kernel folks might prefer endian ifdefs for some odd reason...

You are right, this does seem odd considering what the other archs do. 
It's quite possible I made a silly mistake, of course...

I haven't tested with values outside the 32-bit range yet; again, this is 
new territory for me, so I haven't exactly done exhaustive tests on 
everything. I'll give it a closer look.

> > Also worth noting is the one other outstanding bug, where the
> > time-related syscalls in the 32-bit vDSO seem to return garbage. It
> > doesn't look like an endian bug to me, and it doesn't affect standard
> > syscalls (which is why if you run `date` on musl it prints the
> > correct time, unlike on glibc). The vDSO time functions are
> > implemented in ppc asm (arch/powerpc/kernel/vdso32/ gettimeofday.S),
> > and I've never touched the stuff, so if anyone has a clue I'm all
> > ears.
> Not sure about this. Worst-case, just leave it disabled until someone
> finds a fix.

Apparently these asm implementations are being replaced by the generic C 
ones [1], so it may be this fixes itself on its own.

Thanks,
Will [she/her]

[1]: https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=173231







^ permalink raw reply

* Re: ppc64le and 32-bit LE userland compatibility
From: Will Springer @ 2020-05-30 22:17 UTC (permalink / raw)
  To: linuxppc-dev, Christophe Leroy
  Cc: libc-alpha, eery, daniel, musl, binutils, libc-dev
In-Reply-To: <8be94d2e-8e20-52b6-22e6-152b79a94139@csgroup.eu>

On Saturday, May 30, 2020 8:37:43 AM PDT Christophe Leroy wrote:
> There is a series at
> https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=173231 to
> switch powerpc to the Generic C VDSO.
> 
> Can you try and see whether it fixes your issue ?
> 
> Christophe

Sure thing, I spotted that after making the initial post. Will report back 
with results.

Will [she/her]





^ permalink raw reply

* Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.7-6 tag
From: pr-tracker-bot @ 2020-05-30 19:35 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linux-kernel, linuxppc-dev, Linus Torvalds, ajd, dja
In-Reply-To: <87lfl9ikmp.fsf@mpe.ellerman.id.au>

The pull request you sent on Sun, 31 May 2020 00:05:02 +1000:

> https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git tags/powerpc-5.7-6

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/ffeb595d84811dde16a28b33d8a7cf26d51d51b3

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker

^ permalink raw reply

* Re: ppc64le and 32-bit LE userland compatibility
From: Segher Boessenkool @ 2020-05-30 19:22 UTC (permalink / raw)
  To: Will Springer
  Cc: libc-alpha, eery, daniel, musl, binutils, libc-dev, linuxppc-dev
In-Reply-To: <2047231.C4sosBPzcN@sheen>

Hi!

On Fri, May 29, 2020 at 07:03:48PM +0000, Will Springer wrote:
> Hey all, a couple of us over in #talos-workstation on freenode have been
> working on an effort to bring up a Linux PowerPC userland that runs in 32-bit
> little-endian mode, aka ppcle. As far as we can tell, no ABI has ever been
> designated for this

https://www.polyomino.org.uk/publications/2011/Power-Arch-32-bit-ABI-supp-1.0-Embedded.pdf

> (unless you count the patchset from a decade ago [1]), so

The original sysv PowerPC supplement
http://refspecs.linux-foundation.org/elf/elfspec_ppc.pdf
supports LE as well, and most powerpcle ports use that.  But, the
big-endian Linux ABI differs in quite a few places, and it of course
makes a lot better sense if powerpcle-linux follows that.

> it's pretty much uncharted territory as far as Linux is concerned. We want to
> sync up with libc and the relevant kernel folks to establish the best path
> forward.
> 
> The practical application that drove these early developments (as you might
> expect) is x86 emulation. The box86 project [2] implements a translation layer
> for ia32 library calls to native architecture ones; this way, emulation
> overhead is significantly reduced by relying on native libraries where
> possible (libc, libGL, etc.) instead of emulating an entire x86 userspace.
> box86 is primarily targeted at ARM, but it can be adapted to other
> architectures—so long as they match ia32's 32-bit, little-endian nature. Hence
> the need for a ppcle userland; modern POWER brought ppc64le as a supported
> configuration, but without a 32-bit equivalent there is no option for a 32/64
> multilib environment, as seen with ppc/ppc64 and arm/aarch64.
> 
> Surprisingly, beyond minor patching of gcc to get crosscompile going,

What patches did you need?  I regularly build >30 cross compilers (on
both BE and LE hosts; I haven't used 32-bit hosts for a long time, but
in the past those worked fine as well).  I also cross-built
powerpcle-linux-gcc quite a few times (from powerpc64le, from powerpc64,
from various x86).

> bootstrapping the initial userland was not much of a problem. The work has
> been done on top of the Void Linux PowerPC project [3], and much of that is
> now present in its source package tree [4].
> 
> The first issue with running the userland came from the ppc32 signal handler 
> forcing BE in the MSR, causing any 32LE process receiving a signal (such as a 
> shell receiving SIGCHLD) to terminate with SIGILL. This was trivially patched, 

Heh :-)

> along with enabling the 32-bit vDSO on ppc64le kernels [5]. (Given that this 
> behavior has been in place since 2006, I don't think anyone has been using the 
> kernel in this state to run ppcle userlands.)

Almost no project that used 32-bit PowerPC in LE mode has sent patches
to the upstreams.

> The next problem concerns the ABI more directly. The failure mode was `file`
> surfacing EINVAL from pread64 when invoked on an ELF; pread64 was passed a
> garbage value for `pos`, which didn't appear to be caused by anything in 
> `file`. Initially it seemed as though the 32-bit components of the arg were
> getting swapped, and we made hacky fixes to glibc and musl to put them in the
> "right order"; however, we weren't sure if that was the correct approach, or
> if there were knock-on effects we didn't know about. So we found the relevant
> compat code path in the kernel, at arch/powerpc/kernel/sys_ppc32.c, where
> there exists this comment:
> 
> > /*
> >  * long long munging:
> >  * The 32 bit ABI passes long longs in an odd even register pair.
> >  */
> 
> It seems that the opposite is true in LE mode, and something is expecting long
> longs to start on an even register. I realized this after I tried swapping hi/
> lo `u32`s here and didn't see an improvement. I whipped up a patch [6] that
> switches which syscalls use padding arguments depending on endianness, while
> hopefully remaining tidy enough to be unobtrusive. (I took some liberties with
> variable names/types so that the macro could be consistent.)

The ABI says long longs are passed in the same order in registers as it
would be in memory; so the high part and the low part are swapped between
BE and LE.  Which registers make up a pair is exactly the same between
the two.  (You can verify this with an existing powerpcle-* compiler, too;
I did, and we implement it correctly as far as I can see).

> This was enough to fix up the `file` bug. I'm no seasoned kernel hacker,
> though, and there is still concern over the right way to approach this,
> whether it should live in the kernel or libc, etc. Frankly, I don't know the
> ABI structure enough to understand why the register padding has to be
> different in this case, or what lower-level component is responsible for it. 
> For comparison, I had a look at the mips tree, since it's bi-endian and has a 
> similar 32/64 situation. There is a macro conditional upon endianness that is 
> responsible for munging long longs; it uses __MIPSEB__ and __MIPSEL__ instead 
> of an if/else on the generic __LITTLE_ENDIAN__. Not sure what to make of that. 
> (It also simply swaps registers for LE, unlike what I did for ppc.)

But you should :-)

> Also worth noting is the one other outstanding bug, where the time-related
> syscalls in the 32-bit vDSO seem to return garbage. It doesn't look like an
> endian bug to me, and it doesn't affect standard syscalls (which is why if you
> run `date` on musl it prints the correct time, unlike on glibc). The vDSO time
> functions are implemented in ppc asm (arch/powerpc/kernel/vdso32/
> gettimeofday.S), and I've never touched the stuff, so if anyone has a clue I'm 
> all ears.
> 
> Again, I'd appreciate feedback on the approach to take here, in order to 
> touch/special-case only the minimum necessary, while keeping the kernel/libc 
> folks happy.

A huge factor in having good GCC support for powerpcle-linux (or anything
else) is someone needs to regularly test it, and share test results with
us (via gcc-testresults@).  Hint hint hint :-)

That way we know it is in good shape, know when we are regressing it,
know there is interest in it.

gl;hf,


Segher

^ permalink raw reply

* Re: [PATCH] powerpc/32: disable KASAN with pages bigger than 16k
From: Christophe Leroy @ 2020-05-30 17:22 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <7195fcde7314ccbf7a081b356084a69d421b10d4.1590660977.git.christophe.leroy@csgroup.eu>



Le 28/05/2020 à 12:17, Christophe Leroy a écrit :
> Mapping of early shadow area is implemented by using a single static
> page table having all entries pointing to the same early shadow page.
> The shadow area must therefore occupy full PGD entries.
> 
> The shadow area has a size of 128Mbytes starting at 0xf8000000.
> With 4k pages, a PGD entry is 4Mbytes
> With 16k pages, a PGD entry is 64Mbytes
> With 64k pages, a PGD entry is 256Mbytes which is too big.

That's for 32k pages that a PGD is 256Mbytes.

With 64k pages, a PGD entry is 1Gbytes which is too big.

Michael, can you fix the commit log ?

Thanks
Christophe

> 
> Until we rework the early shadow mapping, disable KASAN when the
> page size is too big.
> 
> Reported-by: kbuild test robot <lkp@intel.com>
> Fixes: 2edb16efc899 ("powerpc/32: Add KASAN support")
> Cc: stable@vger.kernel.org
> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> ---
>   arch/powerpc/Kconfig | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 1dfa59126fcf..066b36bf9efa 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -169,8 +169,8 @@ config PPC
>   	select HAVE_ARCH_AUDITSYSCALL
>   	select HAVE_ARCH_HUGE_VMAP		if PPC_BOOK3S_64 && PPC_RADIX_MMU
>   	select HAVE_ARCH_JUMP_LABEL
> -	select HAVE_ARCH_KASAN			if PPC32
> -	select HAVE_ARCH_KASAN_VMALLOC		if PPC32
> +	select HAVE_ARCH_KASAN			if PPC32 && PPC_PAGE_SHIFT <= 14
> +	select HAVE_ARCH_KASAN_VMALLOC		if PPC32 && PPC_PAGE_SHIFT <= 14
>   	select HAVE_ARCH_KGDB
>   	select HAVE_ARCH_MMAP_RND_BITS
>   	select HAVE_ARCH_MMAP_RND_COMPAT_BITS	if COMPAT
> 

^ permalink raw reply

* [PATCH v2] powerpc/32s: Fix another build failure with CONFIG_PPC_KUAP_DEBUG
From: Christophe Leroy @ 2020-05-30 17:16 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel

'thread' doesn't exist in kuap_check() macro.

Use 'current' instead.

Reported-by: kbuild test robot <lkp@intel.com>
Fixes: a68c31fc01ef ("powerpc/32s: Implement Kernel Userspace Access Protection")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
v2: Changed author and signed-off-by ... and added asm/bug.h

There are days when you'd better go sleeping instead of wanting to send the fix.
And there are days you stare at your code, think it is good, prepare the patch,
test it, find another bug, fix it, test it ... and send the first patch you prepared :(
hence this second fix for the same bug.
---
 arch/powerpc/include/asm/book3s/32/kup.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/book3s/32/kup.h b/arch/powerpc/include/asm/book3s/32/kup.h
index db0a1c281587..a41cfc7cc669 100644
--- a/arch/powerpc/include/asm/book3s/32/kup.h
+++ b/arch/powerpc/include/asm/book3s/32/kup.h
@@ -2,6 +2,7 @@
 #ifndef _ASM_POWERPC_BOOK3S_32_KUP_H
 #define _ASM_POWERPC_BOOK3S_32_KUP_H
 
+#include <asm/bug.h>
 #include <asm/book3s/32/mmu-hash.h>
 
 #ifdef __ASSEMBLY__
@@ -75,7 +76,7 @@
 
 .macro kuap_check	current, gpr
 #ifdef CONFIG_PPC_KUAP_DEBUG
-	lwz	\gpr, KUAP(thread)
+	lwz	\gpr, THREAD + KUAP(\current)
 999:	twnei	\gpr, 0
 	EMIT_BUG_ENTRY 999b, __FILE__, __LINE__, (BUGFLAG_WARNING | BUGFLAG_ONCE)
 #endif
-- 
2.25.0


^ permalink raw reply related

* Re: [RFC PATCH 1/2] libnvdimm: Add prctl control for disabling synchronous fault support.
From: Dan Williams @ 2020-05-30 16:35 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: Jan Kara, linux-nvdimm, jack, Jeff Moyer, Oliver O'Halloran,
	Michal Suchánek, linuxppc-dev
In-Reply-To: <7f163562-e7e3-7668-7415-c40e57c32582@linux.ibm.com>

On Sat, May 30, 2020 at 12:18 AM Aneesh Kumar K.V
<aneesh.kumar@linux.ibm.com> wrote:
>
> On 5/30/20 12:52 AM, Dan Williams wrote:
> > On Fri, May 29, 2020 at 3:55 AM Aneesh Kumar K.V
> > <aneesh.kumar@linux.ibm.com> wrote:
> >>
> >> On 5/29/20 3:22 PM, Jan Kara wrote:
> >>> Hi!
> >>>
> >>> On Fri 29-05-20 15:07:31, Aneesh Kumar K.V wrote:
> >>>> Thanks Michal. I also missed Jeff in this email thread.
> >>>
> >>> And I think you'll also need some of the sched maintainers for the prctl
> >>> bits...
> >>>
> >>>> On 5/29/20 3:03 PM, Michal Suchánek wrote:
> >>>>> Adding Jan
> >>>>>
> >>>>> On Fri, May 29, 2020 at 11:11:39AM +0530, Aneesh Kumar K.V wrote:
> >>>>>> With POWER10, architecture is adding new pmem flush and sync instructions.
> >>>>>> The kernel should prevent the usage of MAP_SYNC if applications are not using
> >>>>>> the new instructions on newer hardware.
> >>>>>>
> >>>>>> This patch adds a prctl option MAP_SYNC_ENABLE that can be used to enable
> >>>>>> the usage of MAP_SYNC. The kernel config option is added to allow the user
> >>>>>> to control whether MAP_SYNC should be enabled by default or not.
> >>>>>>
> >>>>>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> >>> ...
> >>>>>> diff --git a/kernel/fork.c b/kernel/fork.c
> >>>>>> index 8c700f881d92..d5a9a363e81e 100644
> >>>>>> --- a/kernel/fork.c
> >>>>>> +++ b/kernel/fork.c
> >>>>>> @@ -963,6 +963,12 @@ __cacheline_aligned_in_smp DEFINE_SPINLOCK(mmlist_lock);
> >>>>>>     static unsigned long default_dump_filter = MMF_DUMP_FILTER_DEFAULT;
> >>>>>> +#ifdef CONFIG_ARCH_MAP_SYNC_DISABLE
> >>>>>> +unsigned long default_map_sync_mask = MMF_DISABLE_MAP_SYNC_MASK;
> >>>>>> +#else
> >>>>>> +unsigned long default_map_sync_mask = 0;
> >>>>>> +#endif
> >>>>>> +
> >>>
> >>> I'm not sure CONFIG is really the right approach here. For a distro that would
> >>> basically mean to disable MAP_SYNC for all PPC kernels unless application
> >>> explicitly uses the right prctl. Shouldn't we rather initialize
> >>> default_map_sync_mask on boot based on whether the CPU we run on requires
> >>> new flush instructions or not? Otherwise the patch looks sensible.
> >>>
> >>
> >> yes that is correct. We ideally want to deny MAP_SYNC only w.r.t
> >> POWER10. But on a virtualized platform there is no easy way to detect
> >> that. We could ideally hook this into the nvdimm driver where we look at
> >> the new compat string ibm,persistent-memory-v2 and then disable MAP_SYNC
> >> if we find a device with the specific value.
> >>
> >> BTW with the recent changes I posted for the nvdimm driver, older kernel
> >> won't initialize persistent memory device on newer hardware. Newer
> >> hardware will present the device to OS with a different device tree
> >> compat string.
> >>
> >> My expectation  w.r.t this patch was, Distro would want to  mark
> >> CONFIG_ARCH_MAP_SYNC_DISABLE=n based on the different application
> >> certification.  Otherwise application will have to end up calling the
> >> prctl(MMF_DISABLE_MAP_SYNC, 0) any way. If that is the case, should this
> >> be dependent on P10?
> >>
> >> With that I am wondering should we even have this patch? Can we expect
> >> userspace get updated to use new instruction?.
> >>
> >> With ppc64 we never had a real persistent memory device available for
> >> end user to try. The available persistent memory stack was using vPMEM
> >> which was presented as a volatile memory region for which there is no
> >> need to use any of the flush instructions. We could safely assume that
> >> as we get applications certified/verified for working with pmem device
> >> on ppc64, they would all be using the new instructions?
> >
> > I think prctl is the wrong interface for this. I was thinking a sysfs
> > interface along the same lines as /sys/block/pmemX/dax/write_cache.
> > That attribute is toggling DAXDEV_WRITE_CACHE for the determination of
> > whether the platform or the kernel needs to handle cache flushing
> > relative to power loss. A similar attribute can be established for
> > DAXDEV_SYNC, it would simply default to off based on a configuration
> > time policy, but be dynamically changeable at runtime via sysfs.
> >
> > These flags are device properties that affect the kernel and
> > userspace's handling of persistence.
> >
>
> That will not handle the scenario with multiple applications using the
> same fsdax mount point where one is updated to use the new instruction
> and the other is not.

Right, it needs to be a global setting / flag day to switch from one
regime to another. Per-process control is a recipe for disaster.

^ permalink raw reply

* Re: ppc64le and 32-bit LE userland compatibility
From: Christophe Leroy @ 2020-05-30 15:37 UTC (permalink / raw)
  To: Will Springer, linuxppc-dev
  Cc: libc-alpha, eery, daniel, musl, binutils, libc-dev
In-Reply-To: <2047231.C4sosBPzcN@sheen>



Le 29/05/2020 à 21:03, Will Springer a écrit :

[...]

> 
> Also worth noting is the one other outstanding bug, where the time-related
> syscalls in the 32-bit vDSO seem to return garbage. It doesn't look like an
> endian bug to me, and it doesn't affect standard syscalls (which is why if you
> run `date` on musl it prints the correct time, unlike on glibc). The vDSO time
> functions are implemented in ppc asm (arch/powerpc/kernel/vdso32/
> gettimeofday.S), and I've never touched the stuff, so if anyone has a clue I'm
> all ears.
> 

There is a series at 
https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=173231 to 
switch powerpc to the Generic C VDSO.

Can you try and see whether it fixes your issue ?

Christophe

^ permalink raw reply

* [GIT PULL] Please pull powerpc/linux.git powerpc-5.7-6 tag
From: Michael Ellerman @ 2020-05-30 14:05 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linuxppc-dev, ajd, dja

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Hi Linus,

Please pull two more powerpc fixes for 5.7. These are both regressions with
small "obviously correct" fixes.

cheers

The following changes since commit 8659a0e0efdd975c73355dbc033f79ba3b31e82c:

  powerpc/64s: Disable STRICT_KERNEL_RWX (2020-05-22 00:04:51 +1000)

are available in the git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git tags/powerpc-5.7-6

for you to fetch changes up to 2f26ed1764b42a8c40d9c48441c73a70d805decf:

  powerpc/64s: Disable sanitisers for C syscall/interrupt entry/exit code (2020-05-29 21:12:09 +1000)

- ------------------------------------------------------------------
powerpc fixes for 5.7 #6

A fix for the recent change to how we restore non-volatile GPRs, which broke our
emulation of reading from the DSCR (Data Stream Control Register).

And a fix for the recent rewrite of interrupt/syscall exit in C, we need to
exclude KCOV from that code, otherwise it can lead to unrecoverable faults.

Thanks to:
  Daniel Axtens.

- ------------------------------------------------------------------
Daniel Axtens (1):
      powerpc/64s: Disable sanitisers for C syscall/interrupt entry/exit code

Michael Ellerman (1):
      powerpc/64s: Fix restore of NV GPRs after facility unavailable exception


 arch/powerpc/kernel/Makefile         | 3 +++
 arch/powerpc/kernel/exceptions-64s.S | 2 ++
 2 files changed, 5 insertions(+)
-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl7SZ9oACgkQUevqPMjh
pYCF0g//cJYxcH020cVm2jZL5Smcfqr3mYRcGnY/IsEJ28y7HFc30HnoxiBtwO2Z
BwZqqW80IQpDieHS9GZ//IbGaHrrrj5HeyGfA5F6IsKrZXsb9EG++njaKqGZljX5
NStMVMJ+zFs8b+ZVPtA5yQVy4EQO5QlB3A19BIsI7VnqNI/h6R9wXHaAguVz2DUL
AydK/cdU0LdOXnPewY3S3P4HVaUkBE/UrqpekBePJ89obOvUVsbP1C+syDYQLgzA
y43zFPZurVb5UOpEXs9hKoqLnciM5LBKCbk6JV001Hvalb8Bobo/4vvrmxC/BSUs
B0kI6ZHvojTj/cQ7c26B0uEab6AZYdP7TzAIjpP2xoLHvLaWlbGlrOzSz5xNp5B5
lkT8WZCcU8/kQx2hsjOcOOQV6JIMkTDeB7lTm3iwcVQihJDemNu0XxZIAi8gbOzr
46BXNFKVv9ItjzxSIbUtcxl5mpiwnV0XoDw4IM7xDdKnpLG8z9yzrYuBPyQ2gEDB
KhnvTJQry9B4I98EoKoXvMGMttL9BmrYWJ/7P89hTWZT4fs6dOox9Pf+UOHBp3Tk
I/qxG9zzWlf5zXJKUKyTqLv4SIbQNCx7IAqkPkaR9AGCoLCNyapNfZ5orfh6TFHc
n4J9suaFG9TFF7frIMGZNcIt8MBjPlXcJhQQ+s2KCOD4NZE+534=
=KF+h
-----END PGP SIGNATURE-----

^ permalink raw reply

* Re: [RFC PATCH 1/2] libnvdimm: Add prctl control for disabling synchronous fault support.
From: Aneesh Kumar K.V @ 2020-05-30  7:18 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jan Kara, linux-nvdimm, jack, Jeff Moyer, Oliver O'Halloran,
	Michal Suchánek, linuxppc-dev
In-Reply-To: <CAPcyv4jrss3dFcCOar3JTFnuN0_pgFNtBPiJzUdKxtiax6pPgQ@mail.gmail.com>

On 5/30/20 12:52 AM, Dan Williams wrote:
> On Fri, May 29, 2020 at 3:55 AM Aneesh Kumar K.V
> <aneesh.kumar@linux.ibm.com> wrote:
>>
>> On 5/29/20 3:22 PM, Jan Kara wrote:
>>> Hi!
>>>
>>> On Fri 29-05-20 15:07:31, Aneesh Kumar K.V wrote:
>>>> Thanks Michal. I also missed Jeff in this email thread.
>>>
>>> And I think you'll also need some of the sched maintainers for the prctl
>>> bits...
>>>
>>>> On 5/29/20 3:03 PM, Michal Suchánek wrote:
>>>>> Adding Jan
>>>>>
>>>>> On Fri, May 29, 2020 at 11:11:39AM +0530, Aneesh Kumar K.V wrote:
>>>>>> With POWER10, architecture is adding new pmem flush and sync instructions.
>>>>>> The kernel should prevent the usage of MAP_SYNC if applications are not using
>>>>>> the new instructions on newer hardware.
>>>>>>
>>>>>> This patch adds a prctl option MAP_SYNC_ENABLE that can be used to enable
>>>>>> the usage of MAP_SYNC. The kernel config option is added to allow the user
>>>>>> to control whether MAP_SYNC should be enabled by default or not.
>>>>>>
>>>>>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>>> ...
>>>>>> diff --git a/kernel/fork.c b/kernel/fork.c
>>>>>> index 8c700f881d92..d5a9a363e81e 100644
>>>>>> --- a/kernel/fork.c
>>>>>> +++ b/kernel/fork.c
>>>>>> @@ -963,6 +963,12 @@ __cacheline_aligned_in_smp DEFINE_SPINLOCK(mmlist_lock);
>>>>>>     static unsigned long default_dump_filter = MMF_DUMP_FILTER_DEFAULT;
>>>>>> +#ifdef CONFIG_ARCH_MAP_SYNC_DISABLE
>>>>>> +unsigned long default_map_sync_mask = MMF_DISABLE_MAP_SYNC_MASK;
>>>>>> +#else
>>>>>> +unsigned long default_map_sync_mask = 0;
>>>>>> +#endif
>>>>>> +
>>>
>>> I'm not sure CONFIG is really the right approach here. For a distro that would
>>> basically mean to disable MAP_SYNC for all PPC kernels unless application
>>> explicitly uses the right prctl. Shouldn't we rather initialize
>>> default_map_sync_mask on boot based on whether the CPU we run on requires
>>> new flush instructions or not? Otherwise the patch looks sensible.
>>>
>>
>> yes that is correct. We ideally want to deny MAP_SYNC only w.r.t
>> POWER10. But on a virtualized platform there is no easy way to detect
>> that. We could ideally hook this into the nvdimm driver where we look at
>> the new compat string ibm,persistent-memory-v2 and then disable MAP_SYNC
>> if we find a device with the specific value.
>>
>> BTW with the recent changes I posted for the nvdimm driver, older kernel
>> won't initialize persistent memory device on newer hardware. Newer
>> hardware will present the device to OS with a different device tree
>> compat string.
>>
>> My expectation  w.r.t this patch was, Distro would want to  mark
>> CONFIG_ARCH_MAP_SYNC_DISABLE=n based on the different application
>> certification.  Otherwise application will have to end up calling the
>> prctl(MMF_DISABLE_MAP_SYNC, 0) any way. If that is the case, should this
>> be dependent on P10?
>>
>> With that I am wondering should we even have this patch? Can we expect
>> userspace get updated to use new instruction?.
>>
>> With ppc64 we never had a real persistent memory device available for
>> end user to try. The available persistent memory stack was using vPMEM
>> which was presented as a volatile memory region for which there is no
>> need to use any of the flush instructions. We could safely assume that
>> as we get applications certified/verified for working with pmem device
>> on ppc64, they would all be using the new instructions?
> 
> I think prctl is the wrong interface for this. I was thinking a sysfs
> interface along the same lines as /sys/block/pmemX/dax/write_cache.
> That attribute is toggling DAXDEV_WRITE_CACHE for the determination of
> whether the platform or the kernel needs to handle cache flushing
> relative to power loss. A similar attribute can be established for
> DAXDEV_SYNC, it would simply default to off based on a configuration
> time policy, but be dynamically changeable at runtime via sysfs.
> 
> These flags are device properties that affect the kernel and
> userspace's handling of persistence.
> 

That will not handle the scenario with multiple applications using the 
same fsdax mount point where one is updated to use the new instruction 
and the other is not.

-aneeseh

^ permalink raw reply

* Re: [PATCH v3 5/7] powerpc/pmem/of_pmem: Update of_pmem to use the new barrier instruction.
From: kbuild test robot @ 2020-05-30  3:08 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linuxppc-dev, mpe, linux-nvdimm
  Cc: kbuild-all, Aneesh Kumar K.V, clang-built-linux, oohall, alistair,
	dan.j.williams
In-Reply-To: <20200519055502.128318-5-aneesh.kumar@linux.ibm.com>

[-- Attachment #1: Type: text/plain, Size: 13484 bytes --]

Hi "Aneesh,

I love your patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on v5.7-rc7]
[cannot apply to next-20200529]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Aneesh-Kumar-K-V/powerpc-pmem-Restrict-papr_scm-to-P8-and-above/20200519-195350
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-randconfig-r016-20200529 (attached as .config)
compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project 2d068e534f1671459e1b135852c1b3c10502e929)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install powerpc cross compiling tool for clang build
        # apt-get install binutils-powerpc-linux-gnu
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>, old ones prefixed by <<):

In file included from arch/powerpc/kernel/syscall_64.c:4:
In file included from arch/powerpc/include/asm/asm-prototypes.h:12:
>> arch/powerpc/include/asm/cacheflush.h:126:6: error: implicit declaration of function 'cpu_has_feature' [-Werror,-Wimplicit-function-declaration]
if (cpu_has_feature(CPU_FTR_ARCH_207S))
^
arch/powerpc/include/asm/cacheflush.h:126:6: note: did you mean 'mmu_has_feature'?
arch/powerpc/include/asm/mmu.h:234:20: note: 'mmu_has_feature' declared here
static inline bool mmu_has_feature(unsigned long feature)
^
In file included from arch/powerpc/kernel/syscall_64.c:4:
In file included from arch/powerpc/include/asm/asm-prototypes.h:17:
In file included from arch/powerpc/include/asm/mmu_context.h:12:
In file included from arch/powerpc/include/asm/cputhreads.h:7:
>> arch/powerpc/include/asm/cpu_has_feature.h:49:20: error: static declaration of 'cpu_has_feature' follows non-static declaration
static inline bool cpu_has_feature(unsigned long feature)
^
arch/powerpc/include/asm/cacheflush.h:126:6: note: previous implicit declaration is here
if (cpu_has_feature(CPU_FTR_ARCH_207S))
^
2 errors generated.
--
In file included from arch/powerpc/kernel/kprobes.c:23:
>> arch/powerpc/include/asm/cacheflush.h:126:6: error: implicit declaration of function 'cpu_has_feature' [-Werror,-Wimplicit-function-declaration]
if (cpu_has_feature(CPU_FTR_ARCH_207S))
^
arch/powerpc/include/asm/cacheflush.h:126:6: note: did you mean 'mmu_has_feature'?
arch/powerpc/include/asm/mmu.h:234:20: note: 'mmu_has_feature' declared here
static inline bool mmu_has_feature(unsigned long feature)
^
1 error generated.
--
In file included from arch/powerpc/kernel/optprobes.c:15:
>> arch/powerpc/include/asm/cacheflush.h:126:6: error: implicit declaration of function 'cpu_has_feature' [-Werror,-Wimplicit-function-declaration]
if (cpu_has_feature(CPU_FTR_ARCH_207S))
^
arch/powerpc/include/asm/cacheflush.h:126:6: note: did you mean 'mmu_has_feature'?
arch/powerpc/include/asm/mmu.h:234:20: note: 'mmu_has_feature' declared here
static inline bool mmu_has_feature(unsigned long feature)
^
arch/powerpc/kernel/optprobes.c:149:6: warning: no previous prototype for function 'patch_imm32_load_insns' [-Wmissing-prototypes]
void patch_imm32_load_insns(unsigned int val, kprobe_opcode_t *addr)
^
arch/powerpc/kernel/optprobes.c:149:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
void patch_imm32_load_insns(unsigned int val, kprobe_opcode_t *addr)
^
static
arch/powerpc/kernel/optprobes.c:167:6: warning: no previous prototype for function 'patch_imm64_load_insns' [-Wmissing-prototypes]
void patch_imm64_load_insns(unsigned long val, int reg, kprobe_opcode_t *addr)
^
arch/powerpc/kernel/optprobes.c:167:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
void patch_imm64_load_insns(unsigned long val, int reg, kprobe_opcode_t *addr)
^
static
2 warnings and 1 error generated.
--
In file included from arch/powerpc/kernel/epapr_paravirt.c:11:
>> arch/powerpc/include/asm/cacheflush.h:126:6: error: implicit declaration of function 'cpu_has_feature' [-Werror,-Wimplicit-function-declaration]
if (cpu_has_feature(CPU_FTR_ARCH_207S))
^
arch/powerpc/include/asm/cacheflush.h:126:6: note: did you mean 'mmu_has_feature'?
arch/powerpc/include/asm/mmu.h:234:20: note: 'mmu_has_feature' declared here
static inline bool mmu_has_feature(unsigned long feature)
^
In file included from arch/powerpc/kernel/epapr_paravirt.c:13:
In file included from arch/powerpc/include/asm/machdep.h:8:
In file included from include/linux/dma-mapping.h:11:
In file included from include/linux/scatterlist.h:9:
In file included from arch/powerpc/include/asm/io.h:33:
In file included from arch/powerpc/include/asm/delay.h:7:
In file included from arch/powerpc/include/asm/time.h:17:
>> arch/powerpc/include/asm/cpu_has_feature.h:49:20: error: static declaration of 'cpu_has_feature' follows non-static declaration
static inline bool cpu_has_feature(unsigned long feature)
^
arch/powerpc/include/asm/cacheflush.h:126:6: note: previous implicit declaration is here
if (cpu_has_feature(CPU_FTR_ARCH_207S))
^
In file included from arch/powerpc/kernel/epapr_paravirt.c:13:
In file included from arch/powerpc/include/asm/machdep.h:8:
In file included from include/linux/dma-mapping.h:11:
In file included from include/linux/scatterlist.h:9:
In file included from arch/powerpc/include/asm/io.h:605:
arch/powerpc/include/asm/io-defs.h:23:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
DEF_PCI_AC_RET(inb, u8, (unsigned long port), (port), pio, port)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/powerpc/include/asm/io.h:593:9: note: expanded from macro 'DEF_PCI_AC_RET'
return __do_##name al;                                                    ^~~~~~~~~~~~~~
<scratch space>:32:1: note: expanded from here
__do_inb
^
arch/powerpc/include/asm/io.h:524:53: note: expanded from macro '__do_inb'
#define __do_inb(port)          readb((PCI_IO_ADDR)_IO_BASE + port);
~~~~~~~~~~~~~~~~~~~~~ ^
In file included from arch/powerpc/kernel/epapr_paravirt.c:13:
In file included from arch/powerpc/include/asm/machdep.h:8:
In file included from include/linux/dma-mapping.h:11:
In file included from include/linux/scatterlist.h:9:
In file included from arch/powerpc/include/asm/io.h:605:
arch/powerpc/include/asm/io-defs.h:24:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
DEF_PCI_AC_RET(inw, u16, (unsigned long port), (port), pio, port)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/powerpc/include/asm/io.h:593:9: note: expanded from macro 'DEF_PCI_AC_RET'
return __do_##name al;                                                    ^~~~~~~~~~~~~~
<scratch space>:34:1: note: expanded from here
__do_inw
^
arch/powerpc/include/asm/io.h:525:53: note: expanded from macro '__do_inw'
#define __do_inw(port)          readw((PCI_IO_ADDR)_IO_BASE + port);
~~~~~~~~~~~~~~~~~~~~~ ^
In file included from arch/powerpc/kernel/epapr_paravirt.c:13:
In file included from arch/powerpc/include/asm/machdep.h:8:
In file included from include/linux/dma-mapping.h:11:
In file included from include/linux/scatterlist.h:9:
In file included from arch/powerpc/include/asm/io.h:605:
arch/powerpc/include/asm/io-defs.h:25:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
DEF_PCI_AC_RET(inl, u32, (unsigned long port), (port), pio, port)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/powerpc/include/asm/io.h:593:9: note: expanded from macro 'DEF_PCI_AC_RET'
return __do_##name al;                                                    ^~~~~~~~~~~~~~
<scratch space>:36:1: note: expanded from here
__do_inl
^
arch/powerpc/include/asm/io.h:526:53: note: expanded from macro '__do_inl'
#define __do_inl(port)          readl((PCI_IO_ADDR)_IO_BASE + port);
~~~~~~~~~~~~~~~~~~~~~ ^
In file included from arch/powerpc/kernel/epapr_paravirt.c:13:
In file included from arch/powerpc/include/asm/machdep.h:8:
In file included from include/linux/dma-mapping.h:11:
In file included from include/linux/scatterlist.h:9:
In file included from arch/powerpc/include/asm/io.h:605:
arch/powerpc/include/asm/io-defs.h:26:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
DEF_PCI_AC_NORET(outb, (u8 val, unsigned long port), (val, port), pio, port)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/powerpc/include/asm/io.h:602:3: note: expanded from macro 'DEF_PCI_AC_NORET'
__do_##name al;                                                    ^~~~~~~~~~~~~~
<scratch space>:38:1: note: expanded from here
__do_outb
^
arch/powerpc/include/asm/io.h:521:62: note: expanded from macro '__do_outb'
#define __do_outb(val, port)    writeb(val,(PCI_IO_ADDR)_IO_BASE+port);
~~~~~~~~~~~~~~~~~~~~~^
In file included from arch/powerpc/kernel/epapr_paravirt.c:13:
In file included from arch/powerpc/include/asm/machdep.h:8:
In file included from include/linux/dma-mapping.h:11:
In file included from include/linux/scatterlist.h:9:
In file included from arch/powerpc/include/asm/io.h:605:
arch/powerpc/include/asm/io-defs.h:27:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
DEF_PCI_AC_NORET(outw, (u16 val, unsigned long port), (val, port), pio, port)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/powerpc/include/asm/io.h:602:3: note: expanded from macro 'DEF_PCI_AC_NORET'
__do_##name al;                                                    ^~~~~~~~~~~~~~
<scratch space>:40:1: note: expanded from here
__do_outw
^
arch/powerpc/include/asm/io.h:522:62: note: expanded from macro '__do_outw'
#define __do_outw(val, port)    writew(val,(PCI_IO_ADDR)_IO_BASE+port);
~~~~~~~~~~~~~~~~~~~~~^
In file included from arch/powerpc/kernel/epapr_paravirt.c:13:
In file included from arch/powerpc/include/asm/machdep.h:8:
In file included from include/linux/dma-mapping.h:11:
In file included from include/linux/scatterlist.h:9:
In file included from arch/powerpc/include/asm/io.h:605:
arch/powerpc/include/asm/io-defs.h:28:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
DEF_PCI_AC_NORET(outl, (u32 val, unsigned long port), (val, port), pio, port)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/powerpc/include/asm/io.h:602:3: note: expanded from macro 'DEF_PCI_AC_NORET'
__do_##name al;                                 --
In file included from arch/powerpc/lib/pmem.c:10:
>> arch/powerpc/include/asm/cacheflush.h:126:6: error: implicit declaration of function 'cpu_has_feature' [-Werror,-Wimplicit-function-declaration]
if (cpu_has_feature(CPU_FTR_ARCH_207S))
^
arch/powerpc/include/asm/cacheflush.h:126:6: note: did you mean 'mmu_has_feature'?
arch/powerpc/include/asm/mmu.h:234:20: note: 'mmu_has_feature' declared here
static inline bool mmu_has_feature(unsigned long feature)
^
arch/powerpc/lib/pmem.c:44:6: error: implicit declaration of function 'cpu_has_feature' [-Werror,-Wimplicit-function-declaration]
if (cpu_has_feature(CPU_FTR_ARCH_207S))
^
arch/powerpc/lib/pmem.c:50:6: error: implicit declaration of function 'cpu_has_feature' [-Werror,-Wimplicit-function-declaration]
if (cpu_has_feature(CPU_FTR_ARCH_207S))
^
arch/powerpc/lib/pmem.c:57:6: warning: no previous prototype for function 'arch_wb_cache_pmem' [-Wmissing-prototypes]
void arch_wb_cache_pmem(void *addr, size_t size)
^
arch/powerpc/lib/pmem.c:57:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
void arch_wb_cache_pmem(void *addr, size_t size)
^
static
arch/powerpc/lib/pmem.c:64:6: warning: no previous prototype for function 'arch_invalidate_pmem' [-Wmissing-prototypes]
void arch_invalidate_pmem(void *addr, size_t size)
^
arch/powerpc/lib/pmem.c:64:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
void arch_invalidate_pmem(void *addr, size_t size)
^
static
2 warnings and 3 errors generated.

vim +/cpu_has_feature +126 arch/powerpc/include/asm/cacheflush.h

   113	
   114	#define copy_to_user_page(vma, page, vaddr, dst, src, len) \
   115		do { \
   116			memcpy(dst, src, len); \
   117			flush_icache_user_range(vma, page, vaddr, len); \
   118		} while (0)
   119	#define copy_from_user_page(vma, page, vaddr, dst, src, len) \
   120		memcpy(dst, src, len)
   121	
   122	
   123	#define arch_pmem_flush_barrier arch_pmem_flush_barrier
   124	static inline void  arch_pmem_flush_barrier(void)
   125	{
 > 126		if (cpu_has_feature(CPU_FTR_ARCH_207S))
   127			asm volatile(PPC_PHWSYNC ::: "memory");
   128	}
   129	#endif /* __KERNEL__ */
   130	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 33734 bytes --]

^ permalink raw reply

* Re: [PATCH v3 3/7] powerpc/pmem: Add flush routines using new pmem store and sync instruction
From: kbuild test robot @ 2020-05-30  0:47 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linuxppc-dev, mpe, linux-nvdimm
  Cc: kbuild-all, Aneesh Kumar K.V, clang-built-linux, oohall, alistair,
	dan.j.williams
In-Reply-To: <20200519055502.128318-3-aneesh.kumar@linux.ibm.com>

[-- Attachment #1: Type: text/plain, Size: 3057 bytes --]

Hi "Aneesh,

I love your patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on v5.7-rc7 next-20200529]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Aneesh-Kumar-K-V/powerpc-pmem-Restrict-papr_scm-to-P8-and-above/20200519-195350
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-randconfig-r016-20200529 (attached as .config)
compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project 2d068e534f1671459e1b135852c1b3c10502e929)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install powerpc cross compiling tool for clang build
        # apt-get install binutils-powerpc-linux-gnu
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>, old ones prefixed by <<):

>> arch/powerpc/lib/pmem.c:44:6: error: implicit declaration of function 'cpu_has_feature' [-Werror,-Wimplicit-function-declaration]
if (cpu_has_feature(CPU_FTR_ARCH_207S))
^
arch/powerpc/lib/pmem.c:44:6: note: did you mean 'mmu_has_feature'?
arch/powerpc/include/asm/mmu.h:234:20: note: 'mmu_has_feature' declared here
static inline bool mmu_has_feature(unsigned long feature)
^
arch/powerpc/lib/pmem.c:50:6: error: implicit declaration of function 'cpu_has_feature' [-Werror,-Wimplicit-function-declaration]
if (cpu_has_feature(CPU_FTR_ARCH_207S))
^
arch/powerpc/lib/pmem.c:57:6: warning: no previous prototype for function 'arch_wb_cache_pmem' [-Wmissing-prototypes]
void arch_wb_cache_pmem(void *addr, size_t size)
^
arch/powerpc/lib/pmem.c:57:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
void arch_wb_cache_pmem(void *addr, size_t size)
^
static
arch/powerpc/lib/pmem.c:64:6: warning: no previous prototype for function 'arch_invalidate_pmem' [-Wmissing-prototypes]
void arch_invalidate_pmem(void *addr, size_t size)
^
arch/powerpc/lib/pmem.c:64:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
void arch_invalidate_pmem(void *addr, size_t size)
^
static
2 warnings and 2 errors generated.

vim +/cpu_has_feature +44 arch/powerpc/lib/pmem.c

    41	
    42	static inline void clean_pmem_range(unsigned long start, unsigned long stop)
    43	{
  > 44		if (cpu_has_feature(CPU_FTR_ARCH_207S))
    45			return __clean_pmem_range(start, stop);
    46	}
    47	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 33734 bytes --]

^ permalink raw reply

* Re: [PATCH net] drivers/net/ibmvnic: Update VNIC protocol version reporting
From: David Miller @ 2020-05-30  0:21 UTC (permalink / raw)
  To: tlfalcon; +Cc: netdev, linuxppc-dev
In-Reply-To: <1590682757-3814-1-git-send-email-tlfalcon@linux.ibm.com>

From: Thomas Falcon <tlfalcon@linux.ibm.com>
Date: Thu, 28 May 2020 11:19:17 -0500

> VNIC protocol version is reported in big-endian format, but it
> is not byteswapped before logging. Fix that, and remove version
> comparison as only one protocol version exists at this time.
> 
> Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>

Applied, thanks.

^ permalink raw reply

* Re: [RFC][PATCH v3 1/5] sparc64: Fix asm/percpu.h build error
From: David Miller @ 2020-05-29 23:29 UTC (permalink / raw)
  To: peterz
  Cc: linux-s390, linuxppc-dev, bigeasy, x86, heiko.carstens,
	linux-kernel, rostedt, a.darwish, sparclinux, tglx, will, mingo
In-Reply-To: <20200529214203.673108357@infradead.org>

From: Peter Zijlstra <peterz@infradead.org>
Date: Fri, 29 May 2020 23:35:51 +0200

> ../arch/sparc/include/asm/percpu_64.h:7:24: warning: call-clobbered register used for global register variable
> register unsigned long __local_per_cpu_offset asm("g5");

The "-ffixed-g5" option on the command line tells gcc that we are
using 'g5' as a fixed register, so some part of your build isn't using
the:

KBUILD_CFLAGS += -ffixed-g4 -ffixed-g5 -fcall-used-g7 -Wno-sign-compare

from arch/sparc/Makefile for some reason.

^ permalink raw reply

* ppc64le and 32-bit LE userland compatibility
From: Will Springer @ 2020-05-29 19:03 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: libc-alpha, eery, daniel, musl, binutils, libc-dev

Hey all, a couple of us over in #talos-workstation on freenode have been
working on an effort to bring up a Linux PowerPC userland that runs in 32-bit
little-endian mode, aka ppcle. As far as we can tell, no ABI has ever been
designated for this (unless you count the patchset from a decade ago [1]), so
it's pretty much uncharted territory as far as Linux is concerned. We want to
sync up with libc and the relevant kernel folks to establish the best path
forward.

The practical application that drove these early developments (as you might
expect) is x86 emulation. The box86 project [2] implements a translation layer
for ia32 library calls to native architecture ones; this way, emulation
overhead is significantly reduced by relying on native libraries where
possible (libc, libGL, etc.) instead of emulating an entire x86 userspace.
box86 is primarily targeted at ARM, but it can be adapted to other
architectures—so long as they match ia32's 32-bit, little-endian nature. Hence
the need for a ppcle userland; modern POWER brought ppc64le as a supported
configuration, but without a 32-bit equivalent there is no option for a 32/64
multilib environment, as seen with ppc/ppc64 and arm/aarch64.

Surprisingly, beyond minor patching of gcc to get crosscompile going,
bootstrapping the initial userland was not much of a problem. The work has
been done on top of the Void Linux PowerPC project [3], and much of that is
now present in its source package tree [4].

The first issue with running the userland came from the ppc32 signal handler 
forcing BE in the MSR, causing any 32LE process receiving a signal (such as a 
shell receiving SIGCHLD) to terminate with SIGILL. This was trivially patched, 
along with enabling the 32-bit vDSO on ppc64le kernels [5]. (Given that this 
behavior has been in place since 2006, I don't think anyone has been using the 
kernel in this state to run ppcle userlands.)

The next problem concerns the ABI more directly. The failure mode was `file`
surfacing EINVAL from pread64 when invoked on an ELF; pread64 was passed a
garbage value for `pos`, which didn't appear to be caused by anything in 
`file`. Initially it seemed as though the 32-bit components of the arg were
getting swapped, and we made hacky fixes to glibc and musl to put them in the
"right order"; however, we weren't sure if that was the correct approach, or
if there were knock-on effects we didn't know about. So we found the relevant
compat code path in the kernel, at arch/powerpc/kernel/sys_ppc32.c, where
there exists this comment:

> /*
>  * long long munging:
>  * The 32 bit ABI passes long longs in an odd even register pair.
>  */

It seems that the opposite is true in LE mode, and something is expecting long
longs to start on an even register. I realized this after I tried swapping hi/
lo `u32`s here and didn't see an improvement. I whipped up a patch [6] that
switches which syscalls use padding arguments depending on endianness, while
hopefully remaining tidy enough to be unobtrusive. (I took some liberties with
variable names/types so that the macro could be consistent.)

This was enough to fix up the `file` bug. I'm no seasoned kernel hacker,
though, and there is still concern over the right way to approach this,
whether it should live in the kernel or libc, etc. Frankly, I don't know the
ABI structure enough to understand why the register padding has to be
different in this case, or what lower-level component is responsible for it. 
For comparison, I had a look at the mips tree, since it's bi-endian and has a 
similar 32/64 situation. There is a macro conditional upon endianness that is 
responsible for munging long longs; it uses __MIPSEB__ and __MIPSEL__ instead 
of an if/else on the generic __LITTLE_ENDIAN__. Not sure what to make of that. 
(It also simply swaps registers for LE, unlike what I did for ppc.)

Also worth noting is the one other outstanding bug, where the time-related
syscalls in the 32-bit vDSO seem to return garbage. It doesn't look like an
endian bug to me, and it doesn't affect standard syscalls (which is why if you
run `date` on musl it prints the correct time, unlike on glibc). The vDSO time
functions are implemented in ppc asm (arch/powerpc/kernel/vdso32/
gettimeofday.S), and I've never touched the stuff, so if anyone has a clue I'm 
all ears.

Again, I'd appreciate feedback on the approach to take here, in order to 
touch/special-case only the minimum necessary, while keeping the kernel/libc 
folks happy.

Cheers,
Will [she/her]

(p.s. there is ancillary interest in a ppcle-native kernel as well; that's a 
good deal more work and not the focus of this message at all, but it is a 
topic of interest)

[1]: https://lwn.net/Articles/408845/
[2]: https://github.com/ptitSeb/box86
[3]: https://voidlinux-ppc.org/
[4]: https://github.com/void-ppc/void-packages
[5]: https://gist.github.com/eerykitty/01707dc6bca2be32b4c5e30d15d15dcf
[6]: https://gist.github.com/Skirmisher/02891c1a8cafa0ff18b2460933ef4f3c





^ permalink raw reply

* [powerpc:next-test 160/198] arch/powerpc/platforms/powernv/pci-ioda.c:1889:13: warning: function 'pnv_ioda_setup_bus_dma' is not needed and will not be emitted
From: kbuild test robot @ 2020-05-29 22:22 UTC (permalink / raw)
  To: Oliver, O'Halloran,; +Cc: clang-built-linux, kbuild-all, linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 2998 bytes --]

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next-test
head:   e376ca093587eafd840bb0f9df04090e2a54249c
commit: dc3d8f85bb571c3640ebba24b82a527cf2cb3f24 [160/198] powerpc/powernv/pci: Re-work bus PE configuration
config: powerpc64-randconfig-r024-20200529 (attached as .config)
compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project 2d068e534f1671459e1b135852c1b3c10502e929)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install powerpc64 cross compiling tool for clang build
        # apt-get install binutils-powerpc64-linux-gnu
        git checkout dc3d8f85bb571c3640ebba24b82a527cf2cb3f24
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <lkp@intel.com>

All warnings (new ones prefixed by >>, old ones prefixed by <<):

>> arch/powerpc/platforms/powernv/pci-ioda.c:1889:13: warning: function 'pnv_ioda_setup_bus_dma' is not needed and will not be emitted [-Wunneeded-internal-declaration]
static void pnv_ioda_setup_bus_dma(struct pnv_ioda_pe *pe, struct pci_bus *bus)
^
1 warning generated.

vim +/pnv_ioda_setup_bus_dma +1889 arch/powerpc/platforms/powernv/pci-ioda.c

fe7e85c6f5ff63 Gavin Shan             2014-09-30  1888  
5eada8a3f087df Alexey Kardashevskiy   2018-12-19 @1889  static void pnv_ioda_setup_bus_dma(struct pnv_ioda_pe *pe, struct pci_bus *bus)
74251fe21bfa93 Benjamin Herrenschmidt 2013-07-01  1890  {
74251fe21bfa93 Benjamin Herrenschmidt 2013-07-01  1891  	struct pci_dev *dev;
74251fe21bfa93 Benjamin Herrenschmidt 2013-07-01  1892  
74251fe21bfa93 Benjamin Herrenschmidt 2013-07-01  1893  	list_for_each_entry(dev, &bus->devices, bus_list) {
b348aa65297659 Alexey Kardashevskiy   2015-06-05  1894  		set_iommu_table_base(&dev->dev, pe->table_group.tables[0]);
0617fc0ca412b5 Christoph Hellwig      2019-02-13  1895  		dev->dev.archdata.dma_offset = pe->tce_bypass_base;
dff4a39e880062 Gavin Shan             2014-07-15  1896  
5c89a87d13d168 Alexey Kardashevskiy   2015-06-18  1897  		if ((pe->flags & PNV_IODA_PE_BUS_ALL) && dev->subordinate)
5eada8a3f087df Alexey Kardashevskiy   2018-12-19  1898  			pnv_ioda_setup_bus_dma(pe, dev->subordinate);
74251fe21bfa93 Benjamin Herrenschmidt 2013-07-01  1899  	}
74251fe21bfa93 Benjamin Herrenschmidt 2013-07-01  1900  }
74251fe21bfa93 Benjamin Herrenschmidt 2013-07-01  1901  

:::::: The code at line 1889 was first introduced by commit
:::::: 5eada8a3f087df74af1c2797770a3e2249374fe1 powerpc/iommu_api: Move IOMMU groups setup to a single place

:::::: TO: Alexey Kardashevskiy <aik@ozlabs.ru>
:::::: CC: Michael Ellerman <mpe@ellerman.id.au>

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 33114 bytes --]

^ permalink raw reply

* [PATCH v9 2/5] seq_buf: Export seq_buf_printf
From: Vaibhav Jain @ 2020-05-29 21:47 UTC (permalink / raw)
  To: linuxppc-dev, linux-nvdimm, linux-kernel
  Cc: Santosh Sivaraj, Cezary Rojewski, Piotr Maziarz, Steven Rostedt,
	Christoph Hellwig, Oliver O'Halloran, Aneesh Kumar K . V,
	Borislav Petkov, Vaibhav Jain, Dan Williams, Ira Weiny
In-Reply-To: <20200529214719.223344-1-vaibhav@linux.ibm.com>

'seq_buf' provides a very useful abstraction for writing to a string
buffer without needing to worry about it over-flowing. However even
though the API has been stable for couple of years now its still not
exported to kernel loadable modules limiting its usage.

Hence this patch proposes update to 'seq_buf.c' to mark
seq_buf_printf() which is part of the seq_buf API to be exported to
kernel loadable GPL modules. This symbol will be used in later parts
of this patch-set to simplify content creation for a sysfs attribute.

Cc: Piotr Maziarz <piotrx.maziarz@linux.intel.com>
Cc: Cezary Rojewski <cezary.rojewski@intel.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
Changelog:

v8..v9:
* None

v7..v8:
* Updated the patch title [ Christoph Hellwig ]
* Updated patch description to replace confusing term 'external kernel
  modules' to 'kernel lodable modules'.

Resend:
* Added ack from Steven Rostedt

v6..v7:
* New patch in the series
---
 lib/seq_buf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/seq_buf.c b/lib/seq_buf.c
index 4e865d42ab03..707453f5d58e 100644
--- a/lib/seq_buf.c
+++ b/lib/seq_buf.c
@@ -91,6 +91,7 @@ int seq_buf_printf(struct seq_buf *s, const char *fmt, ...)
 
 	return ret;
 }
+EXPORT_SYMBOL_GPL(seq_buf_printf);
 
 #ifdef CONFIG_BINARY_PRINTF
 /**
-- 
2.26.2


^ permalink raw reply related

* [PATCH v9 5/5] powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH
From: Vaibhav Jain @ 2020-05-29 21:47 UTC (permalink / raw)
  To: linuxppc-dev, linux-nvdimm, linux-kernel
  Cc: Santosh Sivaraj, Steven Rostedt, Oliver O'Halloran,
	Aneesh Kumar K . V, Vaibhav Jain, Dan Williams, Ira Weiny
In-Reply-To: <20200529214719.223344-1-vaibhav@linux.ibm.com>

This patch implements support for PDSM request 'PAPR_PDSM_HEALTH'
that returns a newly introduced 'struct nd_papr_pdsm_health' instance
containing dimm health information back to user space in response to
ND_CMD_CALL. This functionality is implemented in newly introduced
papr_pdsm_health() that queries the nvdimm health information and
then copies this information to the package payload whose layout is
defined by 'struct nd_papr_pdsm_health'.

The patch also introduces a new member 'struct papr_scm_priv.health'
thats an instance of 'struct nd_papr_pdsm_health' to cache the health
information of a nvdimm. As a result functions drc_pmem_query_health()
and flags_show() are updated to populate and use this new struct
instead of a u64 integer that was earlier used.

Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
Changelog:

v8..v9:
* s/PAPR_SCM_PDSM_HEALTH/PAPR_PDSM_HEALTH/g  [ Dan , Aneesh ]
* s/PAPR_SCM_PSDM_DIMM_*/PAPR_PDSM_DIMM_*/g
* Renamed papr_scm_get_health() to papr_psdm_health()
* Updated patch description to replace papr-scm dimm with nvdimm.

v7..v8:
* None

Resend:
* None

v6..v7:
* Updated flags_show() to use seq_buf_printf(). [Mpe]
* Updated papr_scm_get_health() to use newly introduced
  __drc_pmem_query_health() bypassing the cache [Mpe].

v5..v6:
* Added attribute '__packed' to 'struct nd_papr_pdsm_health_v1' to
  gaurd against possibility of different compilers adding different
  paddings to the struct [ Dan Williams ]

* Updated 'struct nd_papr_pdsm_health_v1' to use __u8 instead of
  'bool' and also updated drc_pmem_query_health() to take this into
  account. [ Dan Williams ]

v4..v5:
* None

v3..v4:
* Call the DSM_PAPR_SCM_HEALTH service function from
  papr_scm_service_dsm() instead of papr_scm_ndctl(). [Aneesh]

v2..v3:
* Updated struct nd_papr_scm_dimm_health_stat_v1 to use '__xx' types
  as its exported to the userspace [Aneesh]
* Changed the constants DSM_PAPR_SCM_DIMM_XX indicating dimm health
  from enum to #defines [Aneesh]

v1..v2:
* New patch in the series
---
 arch/powerpc/include/uapi/asm/papr_pdsm.h |  39 +++++++
 arch/powerpc/platforms/pseries/papr_scm.c | 125 +++++++++++++++++++---
 2 files changed, 147 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/include/uapi/asm/papr_pdsm.h b/arch/powerpc/include/uapi/asm/papr_pdsm.h
index 6407fefcc007..411725a91591 100644
--- a/arch/powerpc/include/uapi/asm/papr_pdsm.h
+++ b/arch/powerpc/include/uapi/asm/papr_pdsm.h
@@ -115,6 +115,7 @@ struct nd_pdsm_cmd_pkg {
  */
 enum papr_pdsm {
 	PAPR_PDSM_MIN = 0x0,
+	PAPR_PDSM_HEALTH,
 	PAPR_PDSM_MAX,
 };
 
@@ -133,4 +134,42 @@ static inline void *pdsm_cmd_to_payload(struct nd_pdsm_cmd_pkg *pcmd)
 		return (void *)(pcmd->payload);
 }
 
+/* Various nvdimm health indicators */
+#define PAPR_PDSM_DIMM_HEALTHY       0
+#define PAPR_PDSM_DIMM_UNHEALTHY     1
+#define PAPR_PDSM_DIMM_CRITICAL      2
+#define PAPR_PDSM_DIMM_FATAL         3
+
+/*
+ * Struct exchanged between kernel & ndctl in for PAPR_PDSM_HEALTH
+ * Various flags indicate the health status of the dimm.
+ *
+ * dimm_unarmed		: Dimm not armed. So contents wont persist.
+ * dimm_bad_shutdown	: Previous shutdown did not persist contents.
+ * dimm_bad_restore	: Contents from previous shutdown werent restored.
+ * dimm_scrubbed	: Contents of the dimm have been scrubbed.
+ * dimm_locked		: Contents of the dimm cant be modified until CEC reboot
+ * dimm_encrypted	: Contents of dimm are encrypted.
+ * dimm_health		: Dimm health indicator. One of PAPR_PDSM_DIMM_XXXX
+ */
+struct nd_papr_pdsm_health_v1 {
+	__u8 dimm_unarmed;
+	__u8 dimm_bad_shutdown;
+	__u8 dimm_bad_restore;
+	__u8 dimm_scrubbed;
+	__u8 dimm_locked;
+	__u8 dimm_encrypted;
+	__u16 dimm_health;
+} __packed;
+
+/*
+ * Typedef the current struct for dimm_health so that any application
+ * or kernel recompiled after introducing a new version automatically
+ * supports the new version.
+ */
+#define nd_papr_pdsm_health nd_papr_pdsm_health_v1
+
+/* Current version number for the dimm health struct */
+#define ND_PAPR_PDSM_HEALTH_VERSION 1
+
 #endif /* _UAPI_ASM_POWERPC_PAPR_PDSM_H_ */
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index 5e2237e7ec08..c0606c0c659c 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -88,7 +88,7 @@ struct papr_scm_priv {
 	unsigned long lasthealth_jiffies;
 
 	/* Health information for the dimm */
-	u64 health_bitmap;
+	struct nd_papr_pdsm_health health;
 };
 
 static int drc_pmem_bind(struct papr_scm_priv *p)
@@ -201,6 +201,7 @@ static int drc_pmem_query_n_bind(struct papr_scm_priv *p)
 static int __drc_pmem_query_health(struct papr_scm_priv *p)
 {
 	unsigned long ret[PLPAR_HCALL_BUFSIZE];
+	u64 health;
 	long rc;
 
 	/* issue the hcall */
@@ -208,18 +209,46 @@ static int __drc_pmem_query_health(struct papr_scm_priv *p)
 	if (rc != H_SUCCESS) {
 		dev_err(&p->pdev->dev,
 			 "Failed to query health information, Err:%ld\n", rc);
-		rc = -ENXIO;
-		goto out;
+		return -ENXIO;
 	}
 
 	p->lasthealth_jiffies = jiffies;
-	p->health_bitmap = ret[0] & ret[1];
+	health = ret[0] & ret[1];
 
 	dev_dbg(&p->pdev->dev,
 		"Queried dimm health info. Bitmap:0x%016lx Mask:0x%016lx\n",
 		ret[0], ret[1]);
-out:
-	return rc;
+
+	memset(&p->health, 0, sizeof(p->health));
+
+	/* Check for various masks in bitmap and set the buffer */
+	if (health & PAPR_PMEM_UNARMED_MASK)
+		p->health.dimm_unarmed = 1;
+
+	if (health & PAPR_PMEM_BAD_SHUTDOWN_MASK)
+		p->health.dimm_bad_shutdown = 1;
+
+	if (health & PAPR_PMEM_BAD_RESTORE_MASK)
+		p->health.dimm_bad_restore = 1;
+
+	if (health & PAPR_PMEM_ENCRYPTED)
+		p->health.dimm_encrypted = 1;
+
+	if (health & PAPR_PMEM_SCRUBBED_AND_LOCKED) {
+		p->health.dimm_locked = 1;
+		p->health.dimm_scrubbed = 1;
+	}
+
+	if (health & PAPR_PMEM_HEALTH_UNHEALTHY)
+		p->health.dimm_health = PAPR_PDSM_DIMM_UNHEALTHY;
+
+	if (health & PAPR_PMEM_HEALTH_CRITICAL)
+		p->health.dimm_health = PAPR_PDSM_DIMM_CRITICAL;
+
+	if (health & PAPR_PMEM_HEALTH_FATAL)
+		p->health.dimm_health = PAPR_PDSM_DIMM_FATAL;
+
+	return 0;
 }
 
 /* Min interval in seconds for assuming stable dimm health */
@@ -403,6 +432,58 @@ static int is_cmd_valid(struct nvdimm *nvdimm, unsigned int cmd, void *buf,
 	return 0;
 }
 
+/* Fetch the DIMM health info and populate it in provided package. */
+static int papr_pdsm_health(struct papr_scm_priv *p,
+			       struct nd_pdsm_cmd_pkg *pkg)
+{
+	int rc;
+	size_t copysize = sizeof(p->health);
+
+	/* Ensure dimm health mutex is taken preventing concurrent access */
+	rc = mutex_lock_interruptible(&p->health_mutex);
+	if (rc)
+		goto out;
+
+	/* Always fetch upto date dimm health data ignoring cached values */
+	rc = __drc_pmem_query_health(p);
+	if (rc)
+		goto out_unlock;
+	/*
+	 * If the requested payload version is greater than one we know
+	 * about, return the payload version we know about and let
+	 * caller/userspace handle.
+	 */
+	if (pkg->payload_version > ND_PAPR_PDSM_HEALTH_VERSION)
+		pkg->payload_version = ND_PAPR_PDSM_HEALTH_VERSION;
+
+	if (pkg->hdr.nd_size_out < copysize) {
+		dev_dbg(&p->pdev->dev, "Truncated payload (%u). Expected (%lu)",
+			pkg->hdr.nd_size_out, copysize);
+		rc = -ENOSPC;
+		goto out_unlock;
+	}
+
+	dev_dbg(&p->pdev->dev, "Copying payload size=%lu version=0x%x\n",
+		copysize, pkg->payload_version);
+
+	/* Copy the health struct to the payload */
+	memcpy(pdsm_cmd_to_payload(pkg), &p->health, copysize);
+	pkg->hdr.nd_fw_size = copysize;
+
+out_unlock:
+	mutex_unlock(&p->health_mutex);
+
+out:
+	/*
+	 * Put the error in out package and return success from function
+	 * so that errors if any are propogated back to userspace.
+	 */
+	pkg->cmd_status = rc;
+	dev_dbg(&p->pdev->dev, "completion code = %d\n", rc);
+
+	return 0;
+}
+
 static int papr_scm_service_pdsm(struct papr_scm_priv *p,
 				struct nd_pdsm_cmd_pkg *call_pkg)
 {
@@ -417,6 +498,9 @@ static int papr_scm_service_pdsm(struct papr_scm_priv *p,
 
 	/* Depending on the DSM command call appropriate service routine */
 	switch (call_pkg->hdr.nd_command) {
+	case PAPR_PDSM_HEALTH:
+		return papr_pdsm_health(p, call_pkg);
+
 	default:
 		dev_dbg(&p->pdev->dev, "Unsupported PDSM request 0x%llx\n",
 			call_pkg->hdr.nd_command);
@@ -485,34 +569,41 @@ static ssize_t flags_show(struct device *dev,
 	struct nvdimm *dimm = to_nvdimm(dev);
 	struct papr_scm_priv *p = nvdimm_provider_data(dimm);
 	struct seq_buf s;
-	u64 health;
 	int rc;
 
 	rc = drc_pmem_query_health(p);
 	if (rc)
 		return rc;
 
-	/* Copy health_bitmap locally, check masks & update out buffer */
-	health = READ_ONCE(p->health_bitmap);
-
 	seq_buf_init(&s, buf, PAGE_SIZE);
-	if (health & PAPR_PMEM_UNARMED_MASK)
+
+	/* Protect concurrent modifications to papr_scm_priv */
+	rc = mutex_lock_interruptible(&p->health_mutex);
+	if (rc)
+		return rc;
+
+	if (p->health.dimm_unarmed)
 		seq_buf_printf(&s, "not_armed ");
 
-	if (health & PAPR_PMEM_BAD_SHUTDOWN_MASK)
+	if (p->health.dimm_bad_shutdown)
 		seq_buf_printf(&s, "flush_fail ");
 
-	if (health & PAPR_PMEM_BAD_RESTORE_MASK)
+	if (p->health.dimm_bad_restore)
 		seq_buf_printf(&s, "restore_fail ");
 
-	if (health & PAPR_PMEM_ENCRYPTED)
+	if (p->health.dimm_encrypted)
 		seq_buf_printf(&s, "encrypted ");
 
-	if (health & PAPR_PMEM_SMART_EVENT_MASK)
+	if (p->health.dimm_health)
 		seq_buf_printf(&s, "smart_notify ");
 
-	if (health & PAPR_PMEM_SCRUBBED_AND_LOCKED)
-		seq_buf_printf(&s, "scrubbed locked ");
+	if (p->health.dimm_scrubbed)
+		seq_buf_printf(&s, "scrubbed ");
+
+	if (p->health.dimm_locked)
+		seq_buf_printf(&s, "locked ");
+
+	mutex_unlock(&p->health_mutex);
 
 	if (seq_buf_used(&s))
 		seq_buf_printf(&s, "\n");
-- 
2.26.2


^ permalink raw reply related

* [PATCH v9 4/5] ndctl/papr_scm, uapi: Add support for PAPR nvdimm specific methods
From: Vaibhav Jain @ 2020-05-29 21:47 UTC (permalink / raw)
  To: linuxppc-dev, linux-nvdimm, linux-kernel
  Cc: Santosh Sivaraj, Steven Rostedt, Oliver O'Halloran,
	Aneesh Kumar K . V, Vaibhav Jain, Dan Williams, Ira Weiny
In-Reply-To: <20200529214719.223344-1-vaibhav@linux.ibm.com>

Introduce support for PAPR NVDIMM Specific Methods (PDSM) in papr_scm
module and add the command family NVDIMM_FAMILY_PAPR to the white list
of NVDIMM command sets. Also advertise support for ND_CMD_CALL for the
nvdimm command mask and implement necessary scaffolding in the module
to handle ND_CMD_CALL ioctl and PDSM requests that we receive.

The layout of the PDSM request as we expect from libnvdimm/libndctl is
described in newly introduced uapi header 'papr_pdsm.h' which
defines a new 'struct nd_pdsm_cmd_pkg' header. This header is used
to communicate the PDSM request via member
'nd_cmd_pkg.nd_command' and size of payload that need to be
sent/received for servicing the PDSM.

A new function is_cmd_valid() is implemented that reads the args to
papr_scm_ndctl() and performs sanity tests on them. A new function
papr_scm_service_pdsm() is introduced and is called from
papr_scm_ndctl() in case of a PDSM request is received via ND_CMD_CALL
command from libnvdimm.

Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
Changelog:

v8..v9:
* Reduced the usage of term SCM replacing it with appropriate
  replacement [ Dan Williams, Aneesh ]
* Renamed 'papr_scm_pdsm.h' to 'papr_pdsm.h'
* s/PAPR_SCM_PDSM_*/PAPR_PDSM_*/g
* s/NVDIMM_FAMILY_PAPR_SCM/NVDIMM_FAMILY_PAPR/g
* Minor updates to 'papr_psdm.h' to replace usage of term 'SCM'.
* Minor update to patch description.

v7..v8:
* Removed the 'payload_offset' field from 'struct
  nd_pdsm_cmd_pkg'. Instead command payload is always assumed to start
  at 'nd_pdsm_cmd_pkg.payload'. [ Aneesh ]
* To enable introducing new fields to 'struct nd_pdsm_cmd_pkg',
  'reserved' field of 10-bytes is introduced. [ Aneesh ]
* Fixed a typo in "Backward Compatibility" section of papr_scm_pdsm.h
  [ Ira ]

Resend:
* None

v6..v7 :
* Removed the re-definitions of __packed macro from papr_scm_pdsm.h
  [Mpe].
* Removed the usage of __KERNEL__ macros in papr_scm_pdsm.h [Mpe].
* Removed macros that were unused in papr_scm.c from papr_scm_pdsm.h
  [Mpe].
* Made functions defined in papr_scm_pdsm.h as static inline. [Mpe]

v5..v6 :
* Changed the usage of the term DSM to PDSM to distinguish it from the
  ACPI term [ Dan Williams ]
* Renamed papr_scm_dsm.h to papr_scm_pdsm.h and updated various struct
  to reflect the new terminology.
* Updated the patch description and title to reflect the new terminology.
* Squashed patch to introduce new command family in 'ndctl.h' with
  this patch [ Dan Williams ]
* Updated the papr_scm_pdsm method starting index from 0x10000 to 0x0
  [ Dan Williams ]
* Removed redundant license text from the papr_scm_psdm.h file.
  [ Dan Williams ]
* s/envelop/envelope/ at various places [ Dan Williams ]
* Added '__packed' attribute to command package header to gaurd
  against different compiler adding paddings between the fields.
  [ Dan Williams]
* Converted various pr_debug to dev_debug [ Dan Williams ]

v4..v5 :
* None

v3..v4 :
* None

v2..v3 :
* Updated the patch prefix to 'ndctl/uapi' [Aneesh]

v1..v2 :
* None
---
 arch/powerpc/include/uapi/asm/papr_pdsm.h | 136 ++++++++++++++++++++++
 arch/powerpc/platforms/pseries/papr_scm.c | 101 +++++++++++++++-
 include/uapi/linux/ndctl.h                |   1 +
 3 files changed, 232 insertions(+), 6 deletions(-)
 create mode 100644 arch/powerpc/include/uapi/asm/papr_pdsm.h

diff --git a/arch/powerpc/include/uapi/asm/papr_pdsm.h b/arch/powerpc/include/uapi/asm/papr_pdsm.h
new file mode 100644
index 000000000000..6407fefcc007
--- /dev/null
+++ b/arch/powerpc/include/uapi/asm/papr_pdsm.h
@@ -0,0 +1,136 @@
+/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
+/*
+ * PAPR nvDimm Specific Methods (PDSM) and structs for libndctl
+ *
+ * (C) Copyright IBM 2020
+ *
+ * Author: Vaibhav Jain <vaibhav at linux.ibm.com>
+ */
+
+#ifndef _UAPI_ASM_POWERPC_PAPR_PDSM_H_
+#define _UAPI_ASM_POWERPC_PAPR_PDSM_H_
+
+#include <linux/types.h>
+
+/*
+ * PDSM Envelope:
+ *
+ * The ioctl ND_CMD_CALL transfers data between user-space and kernel via
+ * envelope which consists of a header and user-defined payload sections.
+ * The header is described by 'struct nd_pdsm_cmd_pkg' which expects a
+ * payload following it and accessible via 'nd_pdsm_cmd_pkg.payload' field.
+ * There is reserved field that can used to introduce new fields to the
+ * structure in future. It also tries to ensure that 'nd_pdsm_cmd_pkg.payload'
+ * lies at a 8-byte boundary.
+ *
+ *  +-------------+---------------------+---------------------------+
+ *  |   64-Bytes  |       16-Bytes      |       Max 176-Bytes       |
+ *  +-------------+---------------------+---------------------------+
+ *  |               nd_pdsm_cmd_pkg     |                           |
+ *  |-------------+                     |                           |
+ *  |  nd_cmd_pkg |                     |                           |
+ *  +-------------+---------------------+---------------------------+
+ *  | nd_family   |                     |                           |
+ *  | nd_size_out | cmd_status          |                           |
+ *  | nd_size_in  | payload_version     |     payload               |
+ *  | nd_command  | reserved            |                           |
+ *  | nd_fw_size  |                     |                           |
+ *  +-------------+---------------------+---------------------------+
+ *
+ * PDSM Header:
+ *
+ * The header is defined as 'struct nd_pdsm_cmd_pkg' which embeds a
+ * 'struct nd_cmd_pkg' instance. The PDSM command is assigned to member
+ * 'nd_cmd_pkg.nd_command'. Apart from size information of the envelope which is
+ * contained in 'struct nd_cmd_pkg', the header also has members following
+ * members:
+ *
+ * 'cmd_status'		: (Out) Errors if any encountered while servicing PDSM.
+ * 'payload_version'	: (In/Out) Version number associated with the payload.
+ * 'reserved'		: Not used and reserved for future.
+ *
+ * PDSM Payload:
+ *
+ * The layout of the PDSM Payload is defined by various structs shared between
+ * papr_scm and libndctl so that contents of payload can be interpreted. During
+ * servicing of a PDSM the papr_scm module will read input args from the payload
+ * field by casting its contents to an appropriate struct pointer based on the
+ * PDSM command. Similarly the output of servicing the PDSM command will be
+ * copied to the payload field using the same struct.
+ *
+ * 'libnvdimm' enforces a hard limit of 256 bytes on the envelope size, which
+ * leaves around 176 bytes for the envelope payload (ignoring any padding that
+ * the compiler may silently introduce).
+ *
+ * Payload Version:
+ *
+ * A 'payload_version' field is present in PDSM header that indicates a specific
+ * version of the structure present in PDSM Payload for a given PDSM command.
+ * This provides backward compatibility in case the PDSM Payload structure
+ * evolves and different structures are supported by 'papr_scm' and 'libndctl'.
+ *
+ * When sending a PDSM Payload to 'papr_scm', 'libndctl' should send the version
+ * of the payload struct it supports via 'payload_version' field. The 'papr_scm'
+ * module when servicing the PDSM envelope checks the 'payload_version' and then
+ * uses 'payload struct version' == MIN('payload_version field',
+ * 'max payload-struct-version supported by papr_scm') to service the PDSM.
+ * After servicing the PDSM, 'papr_scm' put the negotiated version of payload
+ * struct in returned 'payload_version' field.
+ *
+ * Libndctl on receiving the envelope back from papr_scm again checks the
+ * 'payload_version' field and based on it use the appropriate version dsm
+ * struct to parse the results.
+ *
+ * Backward Compatibility:
+ *
+ * Above scheme of exchanging different versioned PDSM struct between libndctl
+ * and papr_scm should provide backward compatibility until following two
+ * assumptions/conditions when defining new PDSM structs hold:
+ *
+ * Let T(X) = { set of attributes in PDSM struct 'T' versioned X }
+ *
+ * 1. T(X) is a proper subset of T(Y) if Y > X.
+ *    i.e Each new version of PDSM struct should retain existing struct
+ *    attributes from previous version
+ *
+ * 2. If an entity (libndctl or papr_scm) supports a PDSM struct T(X) then
+ *    it should also support T(1), T(2)...T(X - 1).
+ *    i.e When adding support for new version of a PDSM struct, libndctl
+ *    and papr_scm should retain support of the existing PDSM struct
+ *    version they support.
+ */
+
+/* PDSM-header + payload expected with ND_CMD_CALL ioctl from libnvdimm */
+struct nd_pdsm_cmd_pkg {
+	struct nd_cmd_pkg hdr;	/* Package header containing sub-cmd */
+	__s32 cmd_status;	/* Out: Sub-cmd status returned back */
+	__u16 reserved[5];	/* Ignored and to be used in future */
+	__u16 payload_version;	/* In/Out: version of the payload */
+	__u8 payload[];		/* In/Out: Sub-cmd data buffer */
+} __packed;
+
+/*
+ * Methods to be embedded in ND_CMD_CALL request. These are sent to the kernel
+ * via 'nd_pdsm_cmd_pkg.hdr.nd_command' member of the ioctl struct
+ */
+enum papr_pdsm {
+	PAPR_PDSM_MIN = 0x0,
+	PAPR_PDSM_MAX,
+};
+
+/* Convert a libnvdimm nd_cmd_pkg to pdsm specific pkg */
+static inline struct nd_pdsm_cmd_pkg *nd_to_pdsm_cmd_pkg(struct nd_cmd_pkg *cmd)
+{
+	return (struct nd_pdsm_cmd_pkg *) cmd;
+}
+
+/* Return the payload pointer for a given pcmd */
+static inline void *pdsm_cmd_to_payload(struct nd_pdsm_cmd_pkg *pcmd)
+{
+	if (pcmd->hdr.nd_size_in == 0 && pcmd->hdr.nd_size_out == 0)
+		return NULL;
+	else
+		return (void *)(pcmd->payload);
+}
+
+#endif /* _UAPI_ASM_POWERPC_PAPR_PDSM_H_ */
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index 149431594839..5e2237e7ec08 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -15,13 +15,15 @@
 #include <linux/seq_buf.h>
 
 #include <asm/plpar_wrappers.h>
+#include <asm/papr_pdsm.h>
 
 #define BIND_ANY_ADDR (~0ul)
 
 #define PAPR_SCM_DIMM_CMD_MASK \
 	((1ul << ND_CMD_GET_CONFIG_SIZE) | \
 	 (1ul << ND_CMD_GET_CONFIG_DATA) | \
-	 (1ul << ND_CMD_SET_CONFIG_DATA))
+	 (1ul << ND_CMD_SET_CONFIG_DATA) | \
+	 (1ul << ND_CMD_CALL))
 
 /* DIMM health bitmap bitmap indicators */
 /* SCM device is unable to persist memory contents */
@@ -350,16 +352,97 @@ static int papr_scm_meta_set(struct papr_scm_priv *p,
 	return 0;
 }
 
+/*
+ * Validate the inputs args to dimm-control function and return '0' if valid.
+ * This also does initial sanity validation to ND_CMD_CALL sub-command packages.
+ */
+static int is_cmd_valid(struct nvdimm *nvdimm, unsigned int cmd, void *buf,
+		       unsigned int buf_len)
+{
+	unsigned long cmd_mask = PAPR_SCM_DIMM_CMD_MASK;
+	struct nd_pdsm_cmd_pkg *pkg = nd_to_pdsm_cmd_pkg(buf);
+	struct papr_scm_priv *p;
+
+	/* Only dimm-specific calls are supported atm */
+	if (!nvdimm)
+		return -EINVAL;
+
+	/* get the provider date from struct nvdimm */
+	p = nvdimm_provider_data(nvdimm);
+
+	if (!test_bit(cmd, &cmd_mask)) {
+		dev_dbg(&p->pdev->dev, "Unsupported cmd=%u\n", cmd);
+		return -EINVAL;
+	} else if (cmd == ND_CMD_CALL) {
+
+		/* Verify the envelope package */
+		if (!buf || buf_len < sizeof(struct nd_pdsm_cmd_pkg)) {
+			dev_dbg(&p->pdev->dev, "Invalid pkg size=%u\n",
+				buf_len);
+			return -EINVAL;
+		}
+
+		/* Verify that the PDSM family is valid */
+		if (pkg->hdr.nd_family != NVDIMM_FAMILY_PAPR) {
+			dev_dbg(&p->pdev->dev, "Invalid pkg family=0x%llx\n",
+				pkg->hdr.nd_family);
+			return -EINVAL;
+
+		}
+
+		/* We except a payload with all PDSM commands */
+		if (pdsm_cmd_to_payload(pkg) == NULL) {
+			dev_dbg(&p->pdev->dev,
+				"Empty payload for sub-command=0x%llx\n",
+				pkg->hdr.nd_command);
+			return -EINVAL;
+		}
+	}
+
+	/* Command looks valid */
+	return 0;
+}
+
+static int papr_scm_service_pdsm(struct papr_scm_priv *p,
+				struct nd_pdsm_cmd_pkg *call_pkg)
+{
+	/* unknown subcommands return error in packages */
+	if (call_pkg->hdr.nd_command <= PAPR_PDSM_MIN ||
+	    call_pkg->hdr.nd_command >= PAPR_PDSM_MAX) {
+		dev_dbg(&p->pdev->dev, "Invalid PDSM request 0x%llx\n",
+			call_pkg->hdr.nd_command);
+		call_pkg->cmd_status = -EINVAL;
+		return 0;
+	}
+
+	/* Depending on the DSM command call appropriate service routine */
+	switch (call_pkg->hdr.nd_command) {
+	default:
+		dev_dbg(&p->pdev->dev, "Unsupported PDSM request 0x%llx\n",
+			call_pkg->hdr.nd_command);
+		call_pkg->cmd_status = -ENOENT;
+		return 0;
+	}
+}
+
 static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
 			  struct nvdimm *nvdimm, unsigned int cmd, void *buf,
 			  unsigned int buf_len, int *cmd_rc)
 {
 	struct nd_cmd_get_config_size *get_size_hdr;
 	struct papr_scm_priv *p;
+	struct nd_pdsm_cmd_pkg *call_pkg = NULL;
+	int rc;
 
-	/* Only dimm-specific calls are supported atm */
-	if (!nvdimm)
-		return -EINVAL;
+	/* Use a local variable in case cmd_rc pointer is NULL */
+	if (cmd_rc == NULL)
+		cmd_rc = &rc;
+
+	*cmd_rc = is_cmd_valid(nvdimm, cmd, buf, buf_len);
+	if (*cmd_rc) {
+		pr_debug("Invalid cmd=0x%x. Err=%d\n", cmd, *cmd_rc);
+		return *cmd_rc;
+	}
 
 	p = nvdimm_provider_data(nvdimm);
 
@@ -381,13 +464,19 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
 		*cmd_rc = papr_scm_meta_set(p, buf);
 		break;
 
+	case ND_CMD_CALL:
+		call_pkg = nd_to_pdsm_cmd_pkg(buf);
+		*cmd_rc = papr_scm_service_pdsm(p, call_pkg);
+		break;
+
 	default:
-		return -EINVAL;
+		dev_dbg(&p->pdev->dev, "Unknown command = %d\n", cmd);
+		*cmd_rc = -EINVAL;
 	}
 
 	dev_dbg(&p->pdev->dev, "returned with cmd_rc = %d\n", *cmd_rc);
 
-	return 0;
+	return *cmd_rc;
 }
 
 static ssize_t flags_show(struct device *dev,
diff --git a/include/uapi/linux/ndctl.h b/include/uapi/linux/ndctl.h
index de5d90212409..0e09dc5cec19 100644
--- a/include/uapi/linux/ndctl.h
+++ b/include/uapi/linux/ndctl.h
@@ -244,6 +244,7 @@ struct nd_cmd_pkg {
 #define NVDIMM_FAMILY_HPE2 2
 #define NVDIMM_FAMILY_MSFT 3
 #define NVDIMM_FAMILY_HYPERV 4
+#define NVDIMM_FAMILY_PAPR 5
 
 #define ND_IOCTL_CALL			_IOWR(ND_IOCTL, ND_CMD_CALL,\
 					struct nd_cmd_pkg)
-- 
2.26.2


^ permalink raw reply related

* [PATCH v9 3/5] powerpc/papr_scm: Fetch nvdimm health information from PHYP
From: Vaibhav Jain @ 2020-05-29 21:47 UTC (permalink / raw)
  To: linuxppc-dev, linux-nvdimm, linux-kernel
  Cc: Santosh Sivaraj, Steven Rostedt, Oliver O'Halloran,
	Aneesh Kumar K . V, Vaibhav Jain, Dan Williams, Ira Weiny
In-Reply-To: <20200529214719.223344-1-vaibhav@linux.ibm.com>

Implement support for fetching nvdimm health information via
H_SCM_HEALTH hcall as documented in Ref[1]. The hcall returns a pair
of 64-bit bitmap, bitwise-and of which is then stored in
'struct papr_scm_priv' and subsequently partially exposed to
user-space via newly introduced dimm specific attribute
'papr/flags'. Since the hcall is costly, the health information is
cached and only re-queried, 60s after the previous successful hcall.

The patch also adds a  documentation text describing flags reported by
the the new sysfs attribute 'papr/flags' is also introduced at
Documentation/ABI/testing/sysfs-bus-papr-pmem.

[1] commit 58b278f568f0 ("powerpc: Provide initial documentation for
PAPR hcalls")

Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
Changelog:

v8..v9:
* Rename some variables and defines to reduce usage of term SCM
  replacing it with PMEM [Dan Williams, Aneesh]
* s/PAPR_SCM_DIMM/PAPR_PMEM/g
* s/papr_scm_nd_attributes/papr_nd_attributes/g
* s/papr_scm_nd_attribute_group/papr_nd_attribute_group/g
* s/papr_scm_dimm_attr_groups/papr_nd_attribute_groups/g
* Renamed file sysfs-bus-papr-scm to sysfs-bus-papr-pmem

v7..v8:
* Update type of variable 'rc' in __drc_pmem_query_health() and
  drc_pmem_query_health() to long and int respectively. [ Ira ]
* Updated the patch description to s/64 bit Big Endian Number/64-bit
  bitmap/ [ Ira, Aneesh ].

Resend:
* None

v6..v7 :
* Used the exported buf_seq_printf() function to generate content for
  'papr/flags'
* Moved the PAPR_SCM_DIMM_* bit-flags macro definitions to papr_scm.c
  and removed the papr_scm.h file [Mpe]
* Some minor consistency issued in sysfs-bus-papr-scm
  documentation. [Mpe]
* s/dimm_mutex/health_mutex/g [Mpe]
* Split drc_pmem_query_health() into two function one of which takes
  care of caching and locking. [Mpe]
* Fixed a local copy creation of dimm health information using
  READ_ONCE(). [Mpe]

v5..v6 :
* Change the flags sysfs attribute from 'papr_flags' to 'papr/flags'
  [Dan Williams]
* Include documentation for 'papr/flags' attr [Dan Williams]
* Change flag 'save_fail' to 'flush_fail' [Dan Williams]
* Caching of health bitmap to reduce expensive hcalls [Dan Williams]
* Removed usage of PPC_BIT from 'papr-scm.h' header [Mpe]
* Replaced two __be64 integers from papr_scm_priv to a single u64
  integer [Mpe]
* Updated patch description to reflect the changes made in this
  version.
* Removed avoidable usage of 'papr_scm_priv.dimm_mutex' from
  flags_show() [Dan Williams]

v4..v5 :
* None

v3..v4 :
* None

v2..v3 :
* Removed PAPR_SCM_DIMM_HEALTH_NON_CRITICAL as a condition for
       	 NVDIMM unarmed [Aneesh]

v1..v2 :
* New patch in the series.
---
 Documentation/ABI/testing/sysfs-bus-papr-pmem |  27 +++
 arch/powerpc/platforms/pseries/papr_scm.c     | 169 +++++++++++++++++-
 2 files changed, 194 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-papr-pmem

diff --git a/Documentation/ABI/testing/sysfs-bus-papr-pmem b/Documentation/ABI/testing/sysfs-bus-papr-pmem
new file mode 100644
index 000000000000..5b10d036a8d4
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-papr-pmem
@@ -0,0 +1,27 @@
+What:		/sys/bus/nd/devices/nmemX/papr/flags
+Date:		Apr, 2020
+KernelVersion:	v5.8
+Contact:	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>, linux-nvdimm@lists.01.org,
+Description:
+		(RO) Report flags indicating various states of a
+		papr-pmem NVDIMM device. Each flag maps to a one or
+		more bits set in the dimm-health-bitmap retrieved in
+		response to H_SCM_HEALTH hcall. The details of the bit
+		flags returned in response to this hcall is available
+		at 'Documentation/powerpc/papr_hcalls.rst' . Below are
+		the flags reported in this sysfs file:
+
+		* "not_armed"	: Indicates that NVDIMM contents will not
+				  survive a power cycle.
+		* "flush_fail"	: Indicates that NVDIMM contents
+				  couldn't be flushed during last
+				  shut-down event.
+		* "restore_fail": Indicates that NVDIMM contents
+				  couldn't be restored during NVDIMM
+				  initialization.
+		* "encrypted"	: NVDIMM contents are encrypted.
+		* "smart_notify": There is health event for the NVDIMM.
+		* "scrubbed"	: Indicating that contents of the
+				  NVDIMM have been scrubbed.
+		* "locked"	: Indicating that NVDIMM contents cant
+				  be modified until next power cycle.
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index f35592423380..149431594839 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -12,6 +12,7 @@
 #include <linux/libnvdimm.h>
 #include <linux/platform_device.h>
 #include <linux/delay.h>
+#include <linux/seq_buf.h>
 
 #include <asm/plpar_wrappers.h>
 
@@ -22,6 +23,44 @@
 	 (1ul << ND_CMD_GET_CONFIG_DATA) | \
 	 (1ul << ND_CMD_SET_CONFIG_DATA))
 
+/* DIMM health bitmap bitmap indicators */
+/* SCM device is unable to persist memory contents */
+#define PAPR_PMEM_UNARMED                   (1ULL << (63 - 0))
+/* SCM device failed to persist memory contents */
+#define PAPR_PMEM_SHUTDOWN_DIRTY            (1ULL << (63 - 1))
+/* SCM device contents are persisted from previous IPL */
+#define PAPR_PMEM_SHUTDOWN_CLEAN            (1ULL << (63 - 2))
+/* SCM device contents are not persisted from previous IPL */
+#define PAPR_PMEM_EMPTY                     (1ULL << (63 - 3))
+/* SCM device memory life remaining is critically low */
+#define PAPR_PMEM_HEALTH_CRITICAL           (1ULL << (63 - 4))
+/* SCM device will be garded off next IPL due to failure */
+#define PAPR_PMEM_HEALTH_FATAL              (1ULL << (63 - 5))
+/* SCM contents cannot persist due to current platform health status */
+#define PAPR_PMEM_HEALTH_UNHEALTHY          (1ULL << (63 - 6))
+/* SCM device is unable to persist memory contents in certain conditions */
+#define PAPR_PMEM_HEALTH_NON_CRITICAL       (1ULL << (63 - 7))
+/* SCM device is encrypted */
+#define PAPR_PMEM_ENCRYPTED                 (1ULL << (63 - 8))
+/* SCM device has been scrubbed and locked */
+#define PAPR_PMEM_SCRUBBED_AND_LOCKED       (1ULL << (63 - 9))
+
+/* Bits status indicators for health bitmap indicating unarmed dimm */
+#define PAPR_PMEM_UNARMED_MASK (PAPR_PMEM_UNARMED |		\
+				PAPR_PMEM_HEALTH_UNHEALTHY)
+
+/* Bits status indicators for health bitmap indicating unflushed dimm */
+#define PAPR_PMEM_BAD_SHUTDOWN_MASK (PAPR_PMEM_SHUTDOWN_DIRTY)
+
+/* Bits status indicators for health bitmap indicating unrestored dimm */
+#define PAPR_PMEM_BAD_RESTORE_MASK  (PAPR_PMEM_EMPTY)
+
+/* Bit status indicators for smart event notification */
+#define PAPR_PMEM_SMART_EVENT_MASK (PAPR_PMEM_HEALTH_CRITICAL | \
+				    PAPR_PMEM_HEALTH_FATAL |	\
+				    PAPR_PMEM_HEALTH_UNHEALTHY)
+
+/* private struct associated with each region */
 struct papr_scm_priv {
 	struct platform_device *pdev;
 	struct device_node *dn;
@@ -39,6 +78,15 @@ struct papr_scm_priv {
 	struct resource res;
 	struct nd_region *region;
 	struct nd_interleave_set nd_set;
+
+	/* Protect dimm health data from concurrent read/writes */
+	struct mutex health_mutex;
+
+	/* Last time the health information of the dimm was updated */
+	unsigned long lasthealth_jiffies;
+
+	/* Health information for the dimm */
+	u64 health_bitmap;
 };
 
 static int drc_pmem_bind(struct papr_scm_priv *p)
@@ -144,6 +192,62 @@ static int drc_pmem_query_n_bind(struct papr_scm_priv *p)
 	return drc_pmem_bind(p);
 }
 
+/*
+ * Issue hcall to retrieve dimm health info and populate papr_scm_priv with the
+ * health information.
+ */
+static int __drc_pmem_query_health(struct papr_scm_priv *p)
+{
+	unsigned long ret[PLPAR_HCALL_BUFSIZE];
+	long rc;
+
+	/* issue the hcall */
+	rc = plpar_hcall(H_SCM_HEALTH, ret, p->drc_index);
+	if (rc != H_SUCCESS) {
+		dev_err(&p->pdev->dev,
+			 "Failed to query health information, Err:%ld\n", rc);
+		rc = -ENXIO;
+		goto out;
+	}
+
+	p->lasthealth_jiffies = jiffies;
+	p->health_bitmap = ret[0] & ret[1];
+
+	dev_dbg(&p->pdev->dev,
+		"Queried dimm health info. Bitmap:0x%016lx Mask:0x%016lx\n",
+		ret[0], ret[1]);
+out:
+	return rc;
+}
+
+/* Min interval in seconds for assuming stable dimm health */
+#define MIN_HEALTH_QUERY_INTERVAL 60
+
+/* Query cached health info and if needed call drc_pmem_query_health */
+static int drc_pmem_query_health(struct papr_scm_priv *p)
+{
+	unsigned long cache_timeout;
+	int rc;
+
+	/* Protect concurrent modifications to papr_scm_priv */
+	rc = mutex_lock_interruptible(&p->health_mutex);
+	if (rc)
+		return rc;
+
+	/* Jiffies offset for which the health data is assumed to be same */
+	cache_timeout = p->lasthealth_jiffies +
+		msecs_to_jiffies(MIN_HEALTH_QUERY_INTERVAL * 1000);
+
+	/* Fetch new health info is its older than MIN_HEALTH_QUERY_INTERVAL */
+	if (time_after(jiffies, cache_timeout))
+		rc = __drc_pmem_query_health(p);
+	else
+		/* Assume cached health data is valid */
+		rc = 0;
+
+	mutex_unlock(&p->health_mutex);
+	return rc;
+}
 
 static int papr_scm_meta_get(struct papr_scm_priv *p,
 			     struct nd_cmd_get_config_data_hdr *hdr)
@@ -286,6 +390,64 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor *nd_desc,
 	return 0;
 }
 
+static ssize_t flags_show(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	struct nvdimm *dimm = to_nvdimm(dev);
+	struct papr_scm_priv *p = nvdimm_provider_data(dimm);
+	struct seq_buf s;
+	u64 health;
+	int rc;
+
+	rc = drc_pmem_query_health(p);
+	if (rc)
+		return rc;
+
+	/* Copy health_bitmap locally, check masks & update out buffer */
+	health = READ_ONCE(p->health_bitmap);
+
+	seq_buf_init(&s, buf, PAGE_SIZE);
+	if (health & PAPR_PMEM_UNARMED_MASK)
+		seq_buf_printf(&s, "not_armed ");
+
+	if (health & PAPR_PMEM_BAD_SHUTDOWN_MASK)
+		seq_buf_printf(&s, "flush_fail ");
+
+	if (health & PAPR_PMEM_BAD_RESTORE_MASK)
+		seq_buf_printf(&s, "restore_fail ");
+
+	if (health & PAPR_PMEM_ENCRYPTED)
+		seq_buf_printf(&s, "encrypted ");
+
+	if (health & PAPR_PMEM_SMART_EVENT_MASK)
+		seq_buf_printf(&s, "smart_notify ");
+
+	if (health & PAPR_PMEM_SCRUBBED_AND_LOCKED)
+		seq_buf_printf(&s, "scrubbed locked ");
+
+	if (seq_buf_used(&s))
+		seq_buf_printf(&s, "\n");
+
+	return seq_buf_used(&s);
+}
+DEVICE_ATTR_RO(flags);
+
+/* papr_scm specific dimm attributes */
+static struct attribute *papr_nd_attributes[] = {
+	&dev_attr_flags.attr,
+	NULL,
+};
+
+static struct attribute_group papr_nd_attribute_group = {
+	.name = "papr",
+	.attrs = papr_nd_attributes,
+};
+
+static const struct attribute_group *papr_nd_attr_groups[] = {
+	&papr_nd_attribute_group,
+	NULL,
+};
+
 static int papr_scm_nvdimm_init(struct papr_scm_priv *p)
 {
 	struct device *dev = &p->pdev->dev;
@@ -312,8 +474,8 @@ static int papr_scm_nvdimm_init(struct papr_scm_priv *p)
 	dimm_flags = 0;
 	set_bit(NDD_LABELING, &dimm_flags);
 
-	p->nvdimm = nvdimm_create(p->bus, p, NULL, dimm_flags,
-				  PAPR_SCM_DIMM_CMD_MASK, 0, NULL);
+	p->nvdimm = nvdimm_create(p->bus, p, papr_nd_attr_groups,
+				  dimm_flags, PAPR_SCM_DIMM_CMD_MASK, 0, NULL);
 	if (!p->nvdimm) {
 		dev_err(dev, "Error creating DIMM object for %pOF\n", p->dn);
 		goto err;
@@ -399,6 +561,9 @@ static int papr_scm_probe(struct platform_device *pdev)
 	if (!p)
 		return -ENOMEM;
 
+	/* Initialize the dimm mutex */
+	mutex_init(&p->health_mutex);
+
 	/* optional DT properties */
 	of_property_read_u32(dn, "ibm,metadata-size", &metadata_size);
 
-- 
2.26.2


^ permalink raw reply related

* [PATCH v9 1/5] powerpc: Document details on H_SCM_HEALTH hcall
From: Vaibhav Jain @ 2020-05-29 21:47 UTC (permalink / raw)
  To: linuxppc-dev, linux-nvdimm, linux-kernel
  Cc: Santosh Sivaraj, Steven Rostedt, Oliver O'Halloran,
	Aneesh Kumar K . V, Vaibhav Jain, Dan Williams, Ira Weiny
In-Reply-To: <20200529214719.223344-1-vaibhav@linux.ibm.com>

Add documentation to 'papr_hcalls.rst' describing the bitmap flags
that are returned from H_SCM_HEALTH hcall as per the PAPR-SCM
specification.

Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
Changelog:

v8..v9:
* s/SCM/PMEM device. [ Dan Williams, Aneesh ]

v7..v8:
* Added a clarification on bit-ordering of Health Bitmap

Resend:
* None

v6..v7:
* None

v5..v6:
* New patch in the series
---
 Documentation/powerpc/papr_hcalls.rst | 46 ++++++++++++++++++++++++---
 1 file changed, 42 insertions(+), 4 deletions(-)

diff --git a/Documentation/powerpc/papr_hcalls.rst b/Documentation/powerpc/papr_hcalls.rst
index 3493631a60f8..48fcf1255a33 100644
--- a/Documentation/powerpc/papr_hcalls.rst
+++ b/Documentation/powerpc/papr_hcalls.rst
@@ -220,13 +220,51 @@ from the LPAR memory.
 **H_SCM_HEALTH**
 
 | Input: drcIndex
-| Out: *health-bitmap, health-bit-valid-bitmap*
+| Out: *health-bitmap (r4), health-bit-valid-bitmap (r5)*
 | Return Value: *H_Success, H_Parameter, H_Hardware*
 
 Given a DRC Index return the info on predictive failure and overall health of
-the NVDIMM. The asserted bits in the health-bitmap indicate a single predictive
-failure and health-bit-valid-bitmap indicate which bits in health-bitmap are
-valid.
+the PMEM device. The asserted bits in the health-bitmap indicate one or more states
+(described in table below) of the PMEM device and health-bit-valid-bitmap indicate
+which bits in health-bitmap are valid. The bits are reported in
+reverse bit ordering for example a value of 0xC400000000000000
+indicates bits 0, 1, and 5 are valid.
+
+Health Bitmap Flags:
+
++------+-----------------------------------------------------------------------+
+|  Bit |               Definition                                              |
++======+=======================================================================+
+|  00  | PMEM device is unable to persist memory contents.                     |
+|      | If the system is powered down, nothing will be saved.                 |
++------+-----------------------------------------------------------------------+
+|  01  | PMEM device failed to persist memory contents. Either contents were   |
+|      | not saved successfully on power down or were not restored properly on |
+|      | power up.                                                             |
++------+-----------------------------------------------------------------------+
+|  02  | PMEM device contents are persisted from previous IPL. The data from   |
+|      | the last boot were successfully restored.                             |
++------+-----------------------------------------------------------------------+
+|  03  | PMEM device contents are not persisted from previous IPL. There was no|
+|      | data to restore from the last boot.                                   |
++------+-----------------------------------------------------------------------+
+|  04  | PMEM device memory life remaining is critically low                   |
++------+-----------------------------------------------------------------------+
+|  05  | PMEM device will be garded off next IPL due to failure                |
++------+-----------------------------------------------------------------------+
+|  06  | PMEM device contents cannot persist due to current platform health    |
+|      | status. A hardware failure may prevent data from being saved or       |
+|      | restored.                                                             |
++------+-----------------------------------------------------------------------+
+|  07  | PMEM device is unable to persist memory contents in certain conditions|
++------+-----------------------------------------------------------------------+
+|  08  | PMEM device is encrypted                                              |
++------+-----------------------------------------------------------------------+
+|  09  | PMEM device has successfully completed a requested erase or secure    |
+|      | erase procedure.                                                      |
++------+-----------------------------------------------------------------------+
+|10:63 | Reserved / Unused                                                     |
++------+-----------------------------------------------------------------------+
 
 **H_SCM_PERFORMANCE_STATS**
 
-- 
2.26.2


^ permalink raw reply related

* [PATCH v9 0/5] powerpc/papr_scm: Add support for reporting nvdimm health
From: Vaibhav Jain @ 2020-05-29 21:47 UTC (permalink / raw)
  To: linuxppc-dev, linux-nvdimm, linux-kernel
  Cc: Santosh Sivaraj, Steven Rostedt, Oliver O'Halloran,
	Aneesh Kumar K . V, Vaibhav Jain, Dan Williams, Ira Weiny

Changes since v8 [1]:

* Updated proposed changes to remove usage of term 'SCM' due to
  ambiguity with 'PMEM' and 'NVDIMM'. [ Dan Williams ]
* Replaced the usage of term 'SCM' with 'PMEM' in most contexts.
  [ Aneesh ]
* Renamed NVDIMM health defines from PAPR_SCM_DIMM_* to PAPR_PMEM_* .
* Updates to various newly introduced identifiers in 'papr_scm.c'
  removing the 'SCM' prefix from their names.
* Renamed NVDIMM_FAMILY_PAPR_SCM to NVDIMM_FAMILY_PAPR
* Renamed PAPR_SCM_PDSM_HEALTH PAPR_PDSM_HEALTH
* Renamed uapi header 'papr_scm_pdsm.h' to 'papr_pdsm.h'.
* Renamed sysfs ABI document from sysfs-bus-papr-scm to
  sysfs-bus-papr-pmem.
* No behavioural changes from v8 [1].

[1] https://lore.kernel.org/linux-nvdimm/20200527041244.37821-1-vaibhav@linux.ibm.com/
---

The PAPR standard[2][4] provides mechanisms to query the health and
performance stats of an NVDIMM via various hcalls as described in
Ref[3].  Until now these stats were never available nor exposed to the
user-space tools like 'ndctl'. This is partly due to PAPR platform not
having support for ACPI and NFIT. Hence 'ndctl' is unable to query and
report the dimm health status and a user had no way to determine the
current health status of a NDVIMM.

To overcome this limitation, this patch-set updates papr_scm kernel
module to query and fetch NVDIMM health stats using hcalls described
in Ref[3].  This health and performance stats are then exposed to
userspace via sysfs and PAPR-NVDIMM-Specific-Methods(PDSM) issued by
libndctl.

These changes coupled with proposed ndtcl changes located at Ref[5]
should provide a way for the user to retrieve NVDIMM health status
using ndtcl.

Below is a sample output using proposed kernel + ndctl for PAPR NVDIMM
in a emulation environment:

 # ndctl list -DH
[
  {
    "dev":"nmem0",
    "health":{
      "health_state":"fatal",
      "shutdown_state":"dirty"
    }
  }
]

Dimm health report output on a pseries guest lpar with vPMEM or HMS
based NVDIMMs that are in perfectly healthy conditions:

 # ndctl list -d nmem0 -H
[
  {
    "dev":"nmem0",
    "health":{
      "health_state":"ok",
      "shutdown_state":"clean"
    }
  }
]

PAPR NVDIMM-Specific-Methods(PDSM)
==================================

PDSM requests are issued by vendor specific code in libndctl to
execute certain operations or fetch information from NVDIMMS. PDSMs
requests can be sent to papr_scm module via libndctl(userspace) and
libnvdimm (kernel) using the ND_CMD_CALL ioctl command which can be
handled in the dimm control function papr_scm_ndctl(). Current
patchset proposes a single PDSM to retrieve NVDIMM health, defined in
the newly introduced uapi header named 'papr_pdsm.h'. Support for
more PDSMs will be added in future.

Structure of the patch-set
==========================

The patch-set starts with a doc patch documenting details of hcall
H_SCM_HEALTH. Second patch exports kernel symbol seq_buf_printf()
thats used in subsequent patches to generate sysfs attribute content.

Third patch implements support for fetching NVDIMM health information
from PHYP and partially exposing it to user-space via a NVDIMM sysfs
flag.

Fourth patches deal with implementing support for servicing PDSM
commands in papr_scm module.

Finally the last patch implements support for servicing PDSM
'PAPR_PDSM_HEALTH' that returns the NVDIMM health information to
libndctl.

References:
[2] "Power Architecture Platform Reference"
      https://en.wikipedia.org/wiki/Power_Architecture_Platform_Reference
[3] commit 58b278f568f0
     ("powerpc: Provide initial documentation for PAPR hcalls")
[4] "Linux on Power Architecture Platform Reference"
     https://members.openpowerfoundation.org/document/dl/469
[5] https://github.com/vaibhav92/ndctl/tree/papr_scm_health_v9

---

Vaibhav Jain (5):
  powerpc: Document details on H_SCM_HEALTH hcall
  seq_buf: Export seq_buf_printf
  powerpc/papr_scm: Fetch nvdimm health information from PHYP
  ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods
  powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH

 Documentation/ABI/testing/sysfs-bus-papr-pmem |  27 ++
 Documentation/powerpc/papr_hcalls.rst         |  46 ++-
 arch/powerpc/include/uapi/asm/papr_pdsm.h     | 175 +++++++++
 arch/powerpc/platforms/pseries/papr_scm.c     | 363 +++++++++++++++++-
 include/uapi/linux/ndctl.h                    |   1 +
 lib/seq_buf.c                                 |   1 +
 6 files changed, 600 insertions(+), 13 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-papr-pmem
 create mode 100644 arch/powerpc/include/uapi/asm/papr_pdsm.h

-- 
2.26.2


^ permalink raw reply

* [PATCH v3 0/5] lockdep: Change IRQ state tracking to use per-cpu variables
From: Peter Zijlstra @ 2020-05-29 21:35 UTC (permalink / raw)
  To: mingo, will, tglx
  Cc: linux-s390, peterz, bigeasy, x86, heiko.carstens, linux-kernel,
	rostedt, a.darwish, sparclinux, linuxppc-dev, davem

Ahmed and Sebastian wanted additional lockdep_assert*() macros and ran
into header hell.

Move the IRQ state into per-cpu variables, which removes the dependency on
task_struct, which is what generated the header-hell.

These patches are intended to go on top of:

 https://lkml.kernel.org/r/20200529212728.795169701@infradead.org

but should apply on current tip/master without much difficulty.

There are a few build fixes for Sparc64, PowerPC64 and s390. Especially the
Sparc one I'm not sure about.


^ permalink raw reply

* [PATCH v3 2/5] powerpc64: Break asm/percpu.h vs spinlock_types.h dependency
From: Peter Zijlstra @ 2020-05-29 21:35 UTC (permalink / raw)
  To: mingo, will, tglx
  Cc: linux-s390, peterz, bigeasy, x86, heiko.carstens, linux-kernel,
	rostedt, a.darwish, sparclinux, linuxppc-dev, davem
In-Reply-To: <20200529213550.683440625@infradead.org>

In order to use <asm/percpu.h> in lockdep.h, we need to make sure
asm/percpu.h does not itself depend on lockdep.

The below seems to make that so and builds powerpc64-defconfig +
PROVE_LOCKING.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/powerpc/include/asm/dtl.h         |   52 +++++++++++++++++++++++++++++++++
 arch/powerpc/include/asm/lppaca.h      |   44 ---------------------------
 arch/powerpc/kernel/time.c             |    1 
 arch/powerpc/kvm/book3s_hv.c           |    1 
 arch/powerpc/platforms/pseries/dtl.c   |    1 
 arch/powerpc/platforms/pseries/lpar.c  |    1 
 arch/powerpc/platforms/pseries/setup.c |    1 
 arch/powerpc/platforms/pseries/svm.c   |    1 
 8 files changed, 58 insertions(+), 44 deletions(-)

--- /dev/null
+++ b/arch/powerpc/include/asm/dtl.h
@@ -0,0 +1,52 @@
+#ifndef _ASM_POWERPC_DTL_H
+#define _ASM_POWERPC_DTL_H
+
+#include <asm/lppaca.h>
+#include <linux/spinlock_types.h>
+
+/*
+ * Layout of entries in the hypervisor's dispatch trace log buffer.
+ */
+struct dtl_entry {
+	u8	dispatch_reason;
+	u8	preempt_reason;
+	__be16	processor_id;
+	__be32	enqueue_to_dispatch_time;
+	__be32	ready_to_enqueue_time;
+	__be32	waiting_to_ready_time;
+	__be64	timebase;
+	__be64	fault_addr;
+	__be64	srr0;
+	__be64	srr1;
+};
+
+#define DISPATCH_LOG_BYTES	4096	/* bytes per cpu */
+#define N_DISPATCH_LOG		(DISPATCH_LOG_BYTES / sizeof(struct dtl_entry))
+
+/*
+ * Dispatch trace log event enable mask:
+ *   0x1: voluntary virtual processor waits
+ *   0x2: time-slice preempts
+ *   0x4: virtual partition memory page faults
+ */
+#define DTL_LOG_CEDE		0x1
+#define DTL_LOG_PREEMPT		0x2
+#define DTL_LOG_FAULT		0x4
+#define DTL_LOG_ALL		(DTL_LOG_CEDE | DTL_LOG_PREEMPT | DTL_LOG_FAULT)
+
+extern struct kmem_cache *dtl_cache;
+extern rwlock_t dtl_access_lock;
+
+/*
+ * When CONFIG_VIRT_CPU_ACCOUNTING_NATIVE = y, the cpu accounting code controls
+ * reading from the dispatch trace log.  If other code wants to consume
+ * DTL entries, it can set this pointer to a function that will get
+ * called once for each DTL entry that gets processed.
+ */
+extern void (*dtl_consumer)(struct dtl_entry *entry, u64 index);
+
+extern void register_dtl_buffer(int cpu);
+extern void alloc_dtl_buffers(unsigned long *time_limit);
+extern long hcall_vphn(unsigned long cpu, u64 flags, __be32 *associativity);
+
+#endif /* _ASM_POWERPC_DTL_H */
--- a/arch/powerpc/include/asm/lppaca.h
+++ b/arch/powerpc/include/asm/lppaca.h
@@ -42,7 +42,6 @@
  */
 #include <linux/cache.h>
 #include <linux/threads.h>
-#include <linux/spinlock_types.h>
 #include <asm/types.h>
 #include <asm/mmu.h>
 #include <asm/firmware.h>
@@ -146,49 +145,6 @@ struct slb_shadow {
 	} save_area[SLB_NUM_BOLTED];
 } ____cacheline_aligned;
 
-/*
- * Layout of entries in the hypervisor's dispatch trace log buffer.
- */
-struct dtl_entry {
-	u8	dispatch_reason;
-	u8	preempt_reason;
-	__be16	processor_id;
-	__be32	enqueue_to_dispatch_time;
-	__be32	ready_to_enqueue_time;
-	__be32	waiting_to_ready_time;
-	__be64	timebase;
-	__be64	fault_addr;
-	__be64	srr0;
-	__be64	srr1;
-};
-
-#define DISPATCH_LOG_BYTES	4096	/* bytes per cpu */
-#define N_DISPATCH_LOG		(DISPATCH_LOG_BYTES / sizeof(struct dtl_entry))
-
-/*
- * Dispatch trace log event enable mask:
- *   0x1: voluntary virtual processor waits
- *   0x2: time-slice preempts
- *   0x4: virtual partition memory page faults
- */
-#define DTL_LOG_CEDE		0x1
-#define DTL_LOG_PREEMPT		0x2
-#define DTL_LOG_FAULT		0x4
-#define DTL_LOG_ALL		(DTL_LOG_CEDE | DTL_LOG_PREEMPT | DTL_LOG_FAULT)
-
-extern struct kmem_cache *dtl_cache;
-extern rwlock_t dtl_access_lock;
-
-/*
- * When CONFIG_VIRT_CPU_ACCOUNTING_NATIVE = y, the cpu accounting code controls
- * reading from the dispatch trace log.  If other code wants to consume
- * DTL entries, it can set this pointer to a function that will get
- * called once for each DTL entry that gets processed.
- */
-extern void (*dtl_consumer)(struct dtl_entry *entry, u64 index);
-
-extern void register_dtl_buffer(int cpu);
-extern void alloc_dtl_buffers(unsigned long *time_limit);
 extern long hcall_vphn(unsigned long cpu, u64 flags, __be32 *associativity);
 
 #endif /* CONFIG_PPC_BOOK3S */
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -69,6 +69,7 @@
 #include <asm/vdso_datapage.h>
 #include <asm/firmware.h>
 #include <asm/asm-prototypes.h>
+#include <asm/dtl.h>
 
 /* powerpc clocksource/clockevent code */
 
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -74,6 +74,7 @@
 #include <asm/hw_breakpoint.h>
 #include <asm/kvm_book3s_uvmem.h>
 #include <asm/ultravisor.h>
+#include <asm/dtl.h>
 
 #include "book3s.h"
 
--- a/arch/powerpc/platforms/pseries/dtl.c
+++ b/arch/powerpc/platforms/pseries/dtl.c
@@ -12,6 +12,7 @@
 #include <asm/smp.h>
 #include <linux/uaccess.h>
 #include <asm/firmware.h>
+#include <asm/dtl.h>
 #include <asm/lppaca.h>
 #include <asm/debugfs.h>
 #include <asm/plpar_wrappers.h>
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -40,6 +40,7 @@
 #include <asm/fadump.h>
 #include <asm/asm-prototypes.h>
 #include <asm/debugfs.h>
+#include <asm/dtl.h>
 
 #include "pseries.h"
 
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -70,6 +70,7 @@
 #include <asm/asm-const.h>
 #include <asm/swiotlb.h>
 #include <asm/svm.h>
+#include <asm/dtl.h>
 
 #include "pseries.h"
 #include "../../../../drivers/pci/pci.h"
--- a/arch/powerpc/platforms/pseries/svm.c
+++ b/arch/powerpc/platforms/pseries/svm.c
@@ -11,6 +11,7 @@
 #include <asm/svm.h>
 #include <asm/swiotlb.h>
 #include <asm/ultravisor.h>
+#include <asm/dtl.h>
 
 static int __init init_svm(void)
 {



^ permalink raw reply

* [RFC][PATCH v3 1/5] sparc64: Fix asm/percpu.h build error
From: Peter Zijlstra @ 2020-05-29 21:35 UTC (permalink / raw)
  To: mingo, will, tglx
  Cc: linux-s390, peterz, bigeasy, x86, heiko.carstens, linux-kernel,
	rostedt, a.darwish, sparclinux, linuxppc-dev, davem
In-Reply-To: <20200529213550.683440625@infradead.org>

In order to break a header dependency between lockdep and task_struct,
I need per-cpu stuff from lockdep.

Including <asm/percpu.h> from lockdep.h gives a build error, this
patch cures that, but results in the following warning:

../arch/sparc/include/asm/percpu_64.h:7:24: warning: call-clobbered register used for global register variable
register unsigned long __local_per_cpu_offset asm("g5");

But i've no idea how to fix that :/ but it does build.

Not-Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/sparc/include/asm/trap_block.h |    2 ++
 1 file changed, 2 insertions(+)

--- a/arch/sparc/include/asm/trap_block.h
+++ b/arch/sparc/include/asm/trap_block.h
@@ -2,6 +2,8 @@
 #ifndef _SPARC_TRAP_BLOCK_H
 #define _SPARC_TRAP_BLOCK_H
 
+#include <linux/threads.h>
+
 #include <asm/hypervisor.h>
 #include <asm/asi.h>
 



^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox