* Re: KVM on POWER8 host lock up since 10d91611f426 ("powerpc/64s: Reimplement book3s idle code in C")
From: Nicholas Piggin @ 2020-10-19 4:50 UTC (permalink / raw)
To: Michal Suchánek; +Cc: ro, linuxppc-dev, Hari Bathini
In-Reply-To: <1603066878.gtbyofrzyo.astroid@bobo.none>
Excerpts from Nicholas Piggin's message of October 19, 2020 11:00 am:
> Excerpts from Michal Suchánek's message of October 17, 2020 6:14 am:
>> On Mon, Sep 07, 2020 at 11:13:47PM +1000, Nicholas Piggin wrote:
>>> Excerpts from Michael Ellerman's message of August 31, 2020 8:50 pm:
>>> > Michal Suchánek <msuchanek@suse.de> writes:
>>> >> On Mon, Aug 31, 2020 at 11:14:18AM +1000, Nicholas Piggin wrote:
>>> >>> Excerpts from Michal Suchánek's message of August 31, 2020 6:11 am:
>>> >>> > Hello,
>>> >>> >
>>> >>> > on POWER8 KVM hosts lock up since commit 10d91611f426 ("powerpc/64s:
>>> >>> > Reimplement book3s idle code in C").
>>> >>> >
>>> >>> > The symptom is host locking up completely after some hours of KVM
>>> >>> > workload with messages like
>>> >>> >
>>> >>> > 2020-08-30T10:51:31+00:00 obs-power8-01 kernel: KVM: couldn't grab cpu 47
>>> >>> > 2020-08-30T10:51:31+00:00 obs-power8-01 kernel: KVM: couldn't grab cpu 71
>>> >>> > 2020-08-30T10:51:31+00:00 obs-power8-01 kernel: KVM: couldn't grab cpu 47
>>> >>> > 2020-08-30T10:51:31+00:00 obs-power8-01 kernel: KVM: couldn't grab cpu 71
>>> >>> > 2020-08-30T10:51:31+00:00 obs-power8-01 kernel: KVM: couldn't grab cpu 47
>>> >>> >
>>> >>> > printed before the host locks up.
>>> >>> >
>>> >>> > The machines run sandboxed builds which is a mixed workload resulting in
>>> >>> > IO/single core/mutiple core load over time and there are periods of no
>>> >>> > activity and no VMS runnig as well. The VMs are shortlived so VM
>>> >>> > setup/terdown is somewhat excercised as well.
>>> >>> >
>>> >>> > POWER9 with the new guest entry fast path does not seem to be affected.
>>> >>> >
>>> >>> > Reverted the patch and the followup idle fixes on top of 5.2.14 and
>>> >>> > re-applied commit a3f3072db6ca ("powerpc/powernv/idle: Restore IAMR
>>> >>> > after idle") which gives same idle code as 5.1.16 and the kernel seems
>>> >>> > stable.
>>> >>> >
>>> >>> > Config is attached.
>>> >>> >
>>> >>> > I cannot easily revert this commit, especially if I want to use the same
>>> >>> > kernel on POWER8 and POWER9 - many of the POWER9 fixes are applicable
>>> >>> > only to the new idle code.
>>> >>> >
>>> >>> > Any idea what can be the problem?
>>> >>>
>>> >>> So hwthread_state is never getting back to to HWTHREAD_IN_IDLE on
>>> >>> those threads. I wonder what they are doing. POWER8 doesn't have a good
>>> >>> NMI IPI and I don't know if it supports pdbg dumping registers from the
>>> >>> BMC unfortunately.
>>> >>
>>> >> It may be possible to set up fadump with a later kernel version that
>>> >> supports it on powernv and dump the whole kernel.
>>> >
>>> > Your firmware won't support it AFAIK.
>>> >
>>> > You could try kdump, but if we have CPUs stuck in KVM then there's a
>>> > good chance it won't work :/
>>>
>>> I haven't had any luck yet reproducing this still. Testing with sub
>>> cores of various different combinations, etc. I'll keep trying though.
>>
>> Hello,
>>
>> I tried running some KVM guests to simulate the workload and what I get
>> is guests failing to start with a rcu stall. Tried both 5.3 and 5.9
>> kernel and qemu 4.2.1 and 5.1.0
>>
>> To start some guests I run
>>
>> for i in $(seq 0 9) ; do /opt/qemu/bin/qemu-system-ppc64 -m 2048 -accel kvm -smp 8 -kernel /boot/vmlinux -initrd /boot/initrd -nodefaults -nographic -serial mon:telnet::444$i,server,wait & done
>>
>> To simulate some workload I run
>>
>> xz -zc9T0 < /dev/zero > /dev/null &
>> while true; do
>> killall -STOP xz; sleep 1; killall -CONT xz; sleep 1;
>> done &
>>
>> on the host and add a job that executes this to the ramdisk. However, most
>> guests never get to the point where the job is executed.
>>
>> Any idea what might be the problem?
>
> I would say try without pv queued spin locks (but if the same thing is
> happening with 5.3 then it must be something else I guess).
>
> I'll try to test a similar setup on a POWER8 here.
Couldn't reproduce the guest hang, they seem to run fine even with
queued spinlocks. Might have a different .config.
I might have got a lockup in the host (although different symptoms than
the original report). I'll look into that a bit further.
Thanks,
Nick
^ permalink raw reply
* Re: [PATCH] asm-generic: Force inlining of get_order() to work around gcc10 poor decision
From: Joel Stanley @ 2020-10-19 4:55 UTC (permalink / raw)
To: Christophe Leroy, Segher Boessenkool
Cc: linux-arch, Arnd Bergmann, Linux Kernel Mailing List,
Masahiro Yamada, Andrew Morton, linuxppc-dev
In-Reply-To: <96c6172d619c51acc5c1c4884b80785c59af4102.1602949927.git.christophe.leroy@csgroup.eu>
On Sat, 17 Oct 2020 at 15:55, Christophe Leroy
<christophe.leroy@csgroup.eu> wrote:
>
> When building mpc885_ads_defconfig with gcc 10.1,
> the function get_order() appears 50 times in vmlinux:
>
> [linux]# ppc-linux-objdump -x vmlinux | grep get_order | wc -l
> 50
>
> [linux]# size vmlinux
> text data bss dec hex filename
> 3842620 675624 135160 4653404 47015c vmlinux
>
> In the old days, marking a function 'static inline' was forcing
> GCC to inline, but since commit ac7c3e4ff401 ("compiler: enable
> CONFIG_OPTIMIZE_INLINING forcibly") GCC may decide to not inline
> a function.
>
> It looks like GCC 10 is taking poor decisions on this.
>
> get_order() compiles into the following tiny function,
> occupying 20 bytes of text.
>
> 0000007c <get_order>:
> 7c: 38 63 ff ff addi r3,r3,-1
> 80: 54 63 a3 3e rlwinm r3,r3,20,12,31
> 84: 7c 63 00 34 cntlzw r3,r3
> 88: 20 63 00 20 subfic r3,r3,32
> 8c: 4e 80 00 20 blr
>
> By forcing get_order() to be __always_inline, the size of text is
> reduced by 1940 bytes, that is almost twice the space occupied by
> 50 times get_order()
>
> [linux-powerpc]# size vmlinux
> text data bss dec hex filename
> 3840680 675588 135176 4651444 46f9b4 vmlinux
I see similar results with GCC 10.2 building for arm32. There are 143
instances of get_order with aspeed_g5_defconfig.
Before:
9071838 2630138 186468 11888444 b5673c vmlinux
After:
9069886 2630126 186468 11886480 b55f90 vmlinux
1952 bytes smaller with your patch applied. Did you raise this with
anyone from GCC?
Reviewed-by: Joel Stanley <joel@jms.id.au>
> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> ---
> include/asm-generic/getorder.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/asm-generic/getorder.h b/include/asm-generic/getorder.h
> index e9f20b813a69..f2979e3a96b6 100644
> --- a/include/asm-generic/getorder.h
> +++ b/include/asm-generic/getorder.h
> @@ -26,7 +26,7 @@
> *
> * The result is undefined if the size is 0.
> */
> -static inline __attribute_const__ int get_order(unsigned long size)
> +static __always_inline __attribute_const__ int get_order(unsigned long size)
> {
> if (__builtin_constant_p(size)) {
> if (!size)
> --
> 2.25.0
>
^ permalink raw reply
* Re: [PATCH v4 0/2] powerpc/mce: Fix mce handler and add selftest
From: Ganesh @ 2020-10-19 5:26 UTC (permalink / raw)
To: Michael Ellerman, mpe, linuxppc-dev; +Cc: msuchanek, mahesh, npiggin, keescook
In-Reply-To: <160284791867.1794337.3709853894956238598.b4-ty@ellerman.id.au>
On 10/16/20 5:02 PM, Michael Ellerman wrote:
> On Fri, 9 Oct 2020 12:10:03 +0530, Ganesh Goudar wrote:
>> This patch series fixes mce handling for pseries, Adds LKDTM test
>> for SLB multihit recovery and enables selftest for the same,
>> basically to test MCE handling on pseries/powernv machines running
>> in hash mmu mode.
>>
>> v4:
>> * Use radix_enabled() to check if its in Hash or Radix mode.
>> * Use FW_FEATURE_LPAR instead of machine_is_pseries().
>>
>> [...]
> Patch 1 applied to powerpc/fixes.
>
> [1/2] powerpc/mce: Avoid nmi_enter/exit in real mode on pseries hash
> https://git.kernel.org/powerpc/c/8d0e2101274358d9b6b1f27232b40253ca48bab5
>
> cheers
Thank you, Any comments on patch 2.
^ permalink raw reply
* Re: [PATCH] asm-generic: Force inlining of get_order() to work around gcc10 poor decision
From: Christophe Leroy @ 2020-10-19 5:50 UTC (permalink / raw)
To: Joel Stanley, Segher Boessenkool
Cc: linux-arch, Arnd Bergmann, Linux Kernel Mailing List,
Masahiro Yamada, Andrew Morton, linuxppc-dev
In-Reply-To: <CACPK8XfgK0Bj3sLjKCi80jS6vK34FN5BTkU8FvBGcMR=RQn4Xw@mail.gmail.com>
Le 19/10/2020 à 06:55, Joel Stanley a écrit :
> On Sat, 17 Oct 2020 at 15:55, Christophe Leroy
> <christophe.leroy@csgroup.eu> wrote:
>>
>> When building mpc885_ads_defconfig with gcc 10.1,
>> the function get_order() appears 50 times in vmlinux:
>>
>> [linux]# ppc-linux-objdump -x vmlinux | grep get_order | wc -l
>> 50
>>
>> [linux]# size vmlinux
>> text data bss dec hex filename
>> 3842620 675624 135160 4653404 47015c vmlinux
>>
>> In the old days, marking a function 'static inline' was forcing
>> GCC to inline, but since commit ac7c3e4ff401 ("compiler: enable
>> CONFIG_OPTIMIZE_INLINING forcibly") GCC may decide to not inline
>> a function.
>>
>> It looks like GCC 10 is taking poor decisions on this.
>>
>> get_order() compiles into the following tiny function,
>> occupying 20 bytes of text.
>>
>> 0000007c <get_order>:
>> 7c: 38 63 ff ff addi r3,r3,-1
>> 80: 54 63 a3 3e rlwinm r3,r3,20,12,31
>> 84: 7c 63 00 34 cntlzw r3,r3
>> 88: 20 63 00 20 subfic r3,r3,32
>> 8c: 4e 80 00 20 blr
>>
>> By forcing get_order() to be __always_inline, the size of text is
>> reduced by 1940 bytes, that is almost twice the space occupied by
>> 50 times get_order()
>>
>> [linux-powerpc]# size vmlinux
>> text data bss dec hex filename
>> 3840680 675588 135176 4651444 46f9b4 vmlinux
>
> I see similar results with GCC 10.2 building for arm32. There are 143
> instances of get_order with aspeed_g5_defconfig.
>
> Before:
> 9071838 2630138 186468 11888444 b5673c vmlinux
> After:
> 9069886 2630126 186468 11886480 b55f90 vmlinux
>
> 1952 bytes smaller with your patch applied. Did you raise this with
> anyone from GCC?
Yes I did, see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97445
For the time being, it's at a standstill.
Christophe
>
> Reviewed-by: Joel Stanley <joel@jms.id.au>
>
>
>
>> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
>> ---
>> include/asm-generic/getorder.h | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/include/asm-generic/getorder.h b/include/asm-generic/getorder.h
>> index e9f20b813a69..f2979e3a96b6 100644
>> --- a/include/asm-generic/getorder.h
>> +++ b/include/asm-generic/getorder.h
>> @@ -26,7 +26,7 @@
>> *
>> * The result is undefined if the size is 0.
>> */
>> -static inline __attribute_const__ int get_order(unsigned long size)
>> +static __always_inline __attribute_const__ int get_order(unsigned long size)
>> {
>> if (__builtin_constant_p(size)) {
>> if (!size)
>> --
>> 2.25.0
>>
^ permalink raw reply
* Re: [PATCH] asm-generic: Force inlining of get_order() to work around gcc10 poor decision
From: Segher Boessenkool @ 2020-10-19 8:32 UTC (permalink / raw)
To: Christophe Leroy
Cc: linux-arch, Arnd Bergmann, Linux Kernel Mailing List,
Masahiro Yamada, Joel Stanley, Andrew Morton, linuxppc-dev
In-Reply-To: <0bd0afae-f043-2811-944b-c94d90e231d2@csgroup.eu>
On Mon, Oct 19, 2020 at 07:50:41AM +0200, Christophe Leroy wrote:
> Le 19/10/2020 à 06:55, Joel Stanley a écrit :
> >>In the old days, marking a function 'static inline' was forcing
> >>GCC to inline, but since commit ac7c3e4ff401 ("compiler: enable
> >>CONFIG_OPTIMIZE_INLINING forcibly") GCC may decide to not inline
> >>a function.
> >>
> >>It looks like GCC 10 is taking poor decisions on this.
> >1952 bytes smaller with your patch applied. Did you raise this with
> >anyone from GCC?
>
> Yes I did, see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97445
>
> For the time being, it's at a standstill.
The kernel should just use __always_inline if that is what it *wants*;
that is true here most likely. GCC could perhaps improve its heuristics
so that it no longer thinks these functions are often too big for
inlining (they *are* pretty big, but not after basic optimisations with
constant integer arguments).
Segher
^ permalink raw reply
* Re: [PATCH] asm-generic: Force inlining of get_order() to work around gcc10 poor decision
From: Christophe Leroy @ 2020-10-19 8:54 UTC (permalink / raw)
To: Segher Boessenkool
Cc: linux-arch, Arnd Bergmann, Linux Kernel Mailing List,
Masahiro Yamada, Joel Stanley, Andrew Morton, linuxppc-dev
In-Reply-To: <20201019083225.GN2672@gate.crashing.org>
Le 19/10/2020 à 10:32, Segher Boessenkool a écrit :
> On Mon, Oct 19, 2020 at 07:50:41AM +0200, Christophe Leroy wrote:
>> Le 19/10/2020 à 06:55, Joel Stanley a écrit :
>>>> In the old days, marking a function 'static inline' was forcing
>>>> GCC to inline, but since commit ac7c3e4ff401 ("compiler: enable
>>>> CONFIG_OPTIMIZE_INLINING forcibly") GCC may decide to not inline
>>>> a function.
>>>>
>>>> It looks like GCC 10 is taking poor decisions on this.
>
>>> 1952 bytes smaller with your patch applied. Did you raise this with
>>> anyone from GCC?
>>
>> Yes I did, see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97445
>>
>> For the time being, it's at a standstill.
>
> The kernel should just use __always_inline if that is what it *wants*;
> that is true here most likely. GCC could perhaps improve its heuristics
> so that it no longer thinks these functions are often too big for
> inlining (they *are* pretty big, but not after basic optimisations with
> constant integer arguments).
>
Yes I guess __always_inline is to be added on functions like this defined in headers for exactly
that, and that's the purpose of this patch.
However I find it odd that get_order() is outlined by GCC even in some object files that don't use
it at all, for instance in fs/pipe.o
Christophe
^ permalink raw reply
* [PATCH] drm/amd/display: Fix missing declaration of enable_kernel_vsx()
From: Christophe Leroy @ 2020-10-19 9:39 UTC (permalink / raw)
To: Harry Wentland, Leo Li, Alex Deucher, Christian König,
David Airlie, Daniel Vetter
Cc: dri-devel, linuxppc-dev, linux-kernel, amd-gfx
Include <asm/switch-to.h> in order to avoid following build failure
because of missing declaration of enable_kernel_vsx()
CC [M] drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.o
In file included from ./drivers/gpu/drm/amd/amdgpu/../display/dc/dm_services_types.h:29,
from ./drivers/gpu/drm/amd/amdgpu/../display/dc/dm_services.h:37,
from drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:27:
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c: In function 'dcn_bw_apply_registry_override':
./drivers/gpu/drm/amd/amdgpu/../display/dc/os_types.h:64:3: error: implicit declaration of function 'enable_kernel_vsx'; did you mean 'enable_kernel_fp'? [-Werror=implicit-function-declaration]
64 | enable_kernel_vsx(); \
| ^~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:640:2: note: in expansion of macro 'DC_FP_START'
640 | DC_FP_START();
| ^~~~~~~~~~~
./drivers/gpu/drm/amd/amdgpu/../display/dc/os_types.h:75:3: error: implicit declaration of function 'disable_kernel_vsx'; did you mean 'disable_kernel_fp'? [-Werror=implicit-function-declaration]
75 | disable_kernel_vsx(); \
| ^~~~~~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:676:2: note: in expansion of macro 'DC_FP_END'
676 | DC_FP_END();
| ^~~~~~~~~
cc1: some warnings being treated as errors
make[5]: *** [drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.o] Error 1
Fixes: 16a9dea110a6 ("amdgpu: Enable initial DCN support on POWER")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
drivers/gpu/drm/amd/display/dc/os_types.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/display/dc/os_types.h b/drivers/gpu/drm/amd/display/dc/os_types.h
index c3bbfe397e8d..9000cf188544 100644
--- a/drivers/gpu/drm/amd/display/dc/os_types.h
+++ b/drivers/gpu/drm/amd/display/dc/os_types.h
@@ -33,6 +33,7 @@
#include <linux/slab.h>
#include <asm/byteorder.h>
+#include <asm/switch-to.h>
#include <drm/drm_print.h>
--
2.25.0
^ permalink raw reply related
* Re: [PATCH v4 2/2] lkdtm/powerpc: Add SLB multihit test
From: Michael Ellerman @ 2020-10-19 10:59 UTC (permalink / raw)
To: Ganesh Goudar, linuxppc-dev
Cc: msuchanek, Ganesh Goudar, keescook, npiggin, mahesh
In-Reply-To: <20201009064005.19777-3-ganeshgr@linux.ibm.com>
Hi Ganesh,
Some comments below ...
Ganesh Goudar <ganeshgr@linux.ibm.com> writes:
> To check machine check handling, add support to inject slb
> multihit errors.
>
> Cc: Kees Cook <keescook@chromium.org>
> Reviewed-by: Michal Suchánek <msuchanek@suse.de>
> Co-developed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
> Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
> ---
> drivers/misc/lkdtm/Makefile | 1 +
> drivers/misc/lkdtm/core.c | 3 +
> drivers/misc/lkdtm/lkdtm.h | 3 +
> drivers/misc/lkdtm/powerpc.c | 156 ++++++++++++++++++++++++
> tools/testing/selftests/lkdtm/tests.txt | 1 +
> 5 files changed, 164 insertions(+)
> create mode 100644 drivers/misc/lkdtm/powerpc.c
>
..
> diff --git a/drivers/misc/lkdtm/powerpc.c b/drivers/misc/lkdtm/powerpc.c
> new file mode 100644
> index 000000000000..f388b53dccba
> --- /dev/null
> +++ b/drivers/misc/lkdtm/powerpc.c
> @@ -0,0 +1,156 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include "lkdtm.h"
> +#include <linux/slab.h>
> +#include <linux/vmalloc.h>
Usual style is to include the linux headers first and then the local header.
> +
> +/* Gets index for new slb entry */
> +static inline unsigned long get_slb_index(void)
> +{
> + unsigned long index;
> +
> + index = get_paca()->stab_rr;
> +
> + /*
> + * simple round-robin replacement of slb starting at SLB_NUM_BOLTED.
> + */
> + if (index < (mmu_slb_size - 1))
> + index++;
> + else
> + index = SLB_NUM_BOLTED;
> + get_paca()->stab_rr = index;
> + return index;
> +}
I'm not sure we need that really?
We can just always insert at SLB_MUM_BOLTED and SLB_NUM_BOLTED + 1.
Or we could allocate from the top down using mmu_slb_size - 1, and
mmu_slb_size - 2.
> +#define slb_esid_mask(ssize) \
> + (((ssize) == MMU_SEGSIZE_256M) ? ESID_MASK : ESID_MASK_1T)
> +
> +/* Form the operand for slbmte */
> +static inline unsigned long mk_esid_data(unsigned long ea, int ssize,
> + unsigned long slot)
> +{
> + return (ea & slb_esid_mask(ssize)) | SLB_ESID_V | slot;
> +}
> +
> +#define slb_vsid_shift(ssize) \
> + ((ssize) == MMU_SEGSIZE_256M ? SLB_VSID_SHIFT : SLB_VSID_SHIFT_1T)
> +
> +/* Form the operand for slbmte */
> +static inline unsigned long mk_vsid_data(unsigned long ea, int ssize,
> + unsigned long flags)
> +{
> + return (get_kernel_vsid(ea, ssize) << slb_vsid_shift(ssize)) | flags |
> + ((unsigned long)ssize << SLB_VSID_SSIZE_SHIFT);
> +}
I realise it's not much code, but I'd rather those were in a header,
rather than copied from slb.c. That way they can never skew vs the
versions in slb.c
Best place I think would be arch/powerpc/include/asm/book3s/64/mmu-hash.h
> +
> +/* Inserts new slb entry */
It inserts two.
> +static void insert_slb_entry(char *p, int ssize)
> +{
> + unsigned long flags, entry;
> +
> + flags = SLB_VSID_KERNEL | mmu_psize_defs[MMU_PAGE_64K].sllp;
That won't work if the kernel is built for 4K pages. Or at least it
won't work the way we want it to.
You should use mmu_linear_psize.
But for vmalloc you should use mmu_vmalloc_psize, so it will need to be
a parameter.
> + preempt_disable();
> +
> + entry = get_slb_index();
> + asm volatile("slbmte %0,%1" :
> + : "r" (mk_vsid_data((unsigned long)p, ssize, flags)),
> + "r" (mk_esid_data((unsigned long)p, ssize, entry))
> + : "memory");
> +
> + entry = get_slb_index();
> + asm volatile("slbmte %0,%1" :
> + : "r" (mk_vsid_data((unsigned long)p, ssize, flags)),
> + "r" (mk_esid_data((unsigned long)p, ssize, entry))
> + : "memory");
> + preempt_enable();
> + /*
> + * This triggers exception, If handled correctly we must recover
> + * from this error.
> + */
> + p[0] = '!';
That doesn't belong in here, it should be done by the caller.
That would also mean p could be unsigned long in here, so you wouldn't
have to cast it four times.
> +}
> +
> +/* Inject slb multihit on vmalloc-ed address i.e 0xD00... */
> +static void inject_vmalloc_slb_multihit(void)
> +{
> + char *p;
> +
> + p = vmalloc(2048);
vmalloc() allocates whole pages, so it may as well be vmalloc(PAGE_SIZE).
> + if (!p)
> + return;
That's unlikely, but it should be an error that's propagated up to the caller.
> +
> + insert_slb_entry(p, MMU_SEGSIZE_1T);
> + vfree(p);
> +}
> +
> +/* Inject slb multihit on kmalloc-ed address i.e 0xC00... */
> +static void inject_kmalloc_slb_multihit(void)
> +{
> + char *p;
> +
> + p = kmalloc(2048, GFP_KERNEL);
> + if (!p)
> + return;
> +
> + insert_slb_entry(p, MMU_SEGSIZE_1T);
> + kfree(p);
> +}
> +
> +/*
> + * Few initial SLB entries are bolted. Add a test to inject
> + * multihit in bolted entry 0.
> + */
> +static void insert_dup_slb_entry_0(void)
> +{
> + unsigned long test_address = 0xC000000000000000;
Should use PAGE_OFFSET;
> + volatile unsigned long *test_ptr;
Does it need to be a volatile?
The slbmte should act as a compiler barrier (it has a memory clobber)
and a CPU barrier as well?
> + unsigned long entry, i = 0;
> + unsigned long esid, vsid;
Please group your variables:
unsigned long esid, vsid, entry, test_address, i;
volatile unsigned long *test_ptr;
And then initialise them as appropriate.
> + test_ptr = (unsigned long *)test_address;
> + preempt_disable();
> +
> + asm volatile("slbmfee %0,%1" : "=r" (esid) : "r" (i));
> + asm volatile("slbmfev %0,%1" : "=r" (vsid) : "r" (i));
Why do we need to read them out of the SLB rather than just computing
the values?
> + entry = get_slb_index();
> +
> + /* for i !=0 we would need to mask out the old entry number */
Or you could just compute esid and then it wouldn't be an issue.
> + asm volatile("slbmte %0,%1" :
> + : "r" (vsid),
> + "r" (esid | entry)
> + : "memory");
At this point we've just inserted a duplicate of entry 0. So you don't
need to insert a third entry do you?
> + asm volatile("slbmfee %0,%1" : "=r" (esid) : "r" (i));
> + asm volatile("slbmfev %0,%1" : "=r" (vsid) : "r" (i));
> + entry = get_slb_index();
> +
> + /* for i !=0 we would need to mask out the old entry number */
> + asm volatile("slbmte %0,%1" :
> + : "r" (vsid),
> + "r" (esid | entry)
> + : "memory");
> +
> + pr_info("%s accessing test address 0x%lx: 0x%lx\n",
> + __func__, test_address, *test_ptr);
This prints the first two instructions of the kernel. I happen to know
what values they should have, but most people won't understand what
they're seeing. A better test would be to read the value at the top of
the function and then load it again here and check we got the right
thing.
eg.
unsigned long val;
val = *test_ptr;
...
if (val == *test_ptr)
pr_info("Success ...")
else
pr_info("Fail ...")
> +
> + preempt_enable();
> +}
> +
> +void lkdtm_PPC_SLB_MULTIHIT(void)
> +{
> + if (mmu_has_feature(MMU_FTR_HPTE_TABLE)) {
That will be true even if radix is enabled. You need to use radix_enabled().
> + pr_info("Injecting SLB multihit errors\n");
> + /*
> + * These need not be separate tests, And they do pretty
> + * much same thing. In any case we must recover from the
> + * errors introduced by these functions, machine would not
> + * survive these tests in case of failure to handle.
> + */
> + inject_vmalloc_slb_multihit();
> + inject_kmalloc_slb_multihit();
> + insert_dup_slb_entry_0();
> + pr_info("Recovered from SLB multihit errors\n");
> + } else {
> + pr_err("XFAIL: This test is for ppc64 and with hash mode MMU only\n");
The whole file is only built if CONFIG_PPC64, so you don't need to
mention ppc64 in the message.
It should say something more like:
XFAIL: can't test SLB multi-hit when using Radix MMU
cheers
^ permalink raw reply
* [PATCH v1 0/2] Use H_RPT_INVALIDATE for nested guest
From: Bharata B Rao @ 2020-10-19 11:26 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev; +Cc: aneesh.kumar, Bharata B Rao, npiggin
This patchset adds support for the new hcall H_RPT_INVALIDATE
(currently handles nested case only) and replaces the nested tlb flush
calls with this new hcall if the support for the same exists.
Changes in v1:
-------------
- Removed the bits that added the FW_FEATURE_RPT_INVALIDATE feature
as they are already upstream.
v0: https://lore.kernel.org/linuxppc-dev/20200703104420.21349-1-bharata@linux.ibm.com/T/#m1800c5f5b3d4f6a154ae58fc1c617c06f286358f
H_RPT_INVALIDATE
================
Syntax:
int64 /* H_Success: Return code on successful completion */
/* H_Busy - repeat the call with the same */
/* H_Parameter, H_P2, H_P3, H_P4, H_P5 : Invalid parameters */
hcall(const uint64 H_RPT_INVALIDATE, /* Invalidate RPT translation lookaside information */
uint64 pid, /* PID/LPID to invalidate */
uint64 target, /* Invalidation target */
uint64 type, /* Type of lookaside information */
uint64 pageSizes, /* Page sizes */
uint64 start, /* Start of Effective Address (EA) range (inclusive) */
uint64 end) /* End of EA range (exclusive) */
Invalidation targets (target)
-----------------------------
Core MMU 0x01 /* All virtual processors in the partition */
Core local MMU 0x02 /* Current virtual processor */
Nest MMU 0x04 /* All nest/accelerator agents in use by the partition */
A combination of the above can be specified, except core and core local.
Type of translation to invalidate (type)
---------------------------------------
NESTED 0x0001 /* Invalidate nested guest partition-scope */
TLB 0x0002 /* Invalidate TLB */
PWC 0x0004 /* Invalidate Page Walk Cache */
PRT 0x0008 /* Invalidate Process Table Entries if NESTED is clear */
PAT 0x0008 /* Invalidate Partition Table Entries if NESTED is set */
A combination of the above can be specified.
Page size mask (pageSizes)
--------------------------
4K 0x01
64K 0x02
2M 0x04
1G 0x08
All sizes (-1UL)
A combination of the above can be specified.
All page sizes can be selected with -1.
Semantics: Invalidate radix tree lookaside information
matching the parameters given.
* Return H_P2, H_P3 or H_P4 if target, type, or pageSizes parameters are
different from the defined values.
* Return H_PARAMETER if NESTED is set and pid is not a valid nested
LPID allocated to this partition
* Return H_P5 if (start, end) doesn't form a valid range. Start and end
should be a valid Quadrant address and end > start.
* Return H_NotSupported if the partition is not in running in radix
translation mode.
* May invalidate more translation information than requested.
* If start = 0 and end = -1, set the range to cover all valid addresses.
Else start and end should be aligned to 4kB (lower 11 bits clear).
* If NESTED is clear, then invalidate process scoped lookaside information.
Else pid specifies a nested LPID, and the invalidation is performed
on nested guest partition table and nested guest partition scope real
addresses.
* If pid = 0 and NESTED is clear, then valid addresses are quadrant 3 and
quadrant 0 spaces, Else valid addresses are quadrant 0.
* Pages which are fully covered by the range are to be invalidated.
Those which are partially covered are considered outside invalidation
range, which allows a caller to optimally invalidate ranges that may
contain mixed page sizes.
* Return H_SUCCESS on success.
Bharata B Rao (2):
KVM: PPC: Book3S HV: Add support for H_RPT_INVALIDATE (nested case
only)
KVM: PPC: Book3S HV: Use H_RPT_INVALIDATE in nested KVM
Documentation/virt/kvm/api.rst | 17 +++
.../include/asm/book3s/64/tlbflush-radix.h | 18 +++
arch/powerpc/include/asm/kvm_book3s.h | 3 +
arch/powerpc/kvm/book3s_64_mmu_radix.c | 26 ++++-
arch/powerpc/kvm/book3s_hv.c | 32 ++++++
arch/powerpc/kvm/book3s_hv_nested.c | 107 +++++++++++++++++-
arch/powerpc/kvm/powerpc.c | 3 +
arch/powerpc/mm/book3s64/radix_tlb.c | 4 -
include/uapi/linux/kvm.h | 1 +
9 files changed, 200 insertions(+), 11 deletions(-)
--
2.26.2
^ permalink raw reply
* [PATCH v1 2/2] KVM: PPC: Book3S HV: Use H_RPT_INVALIDATE in nested KVM
From: Bharata B Rao @ 2020-10-19 11:26 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev; +Cc: aneesh.kumar, Bharata B Rao, npiggin
In-Reply-To: <20201019112642.53016-1-bharata@linux.ibm.com>
In the nested KVM case, replace H_TLB_INVALIDATE by the new hcall
H_RPT_INVALIDATE if available. The availability of this hcall
is determined from "hcall-rpt-invalidate" string in ibm,hypertas-functions
DT property.
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
---
arch/powerpc/kvm/book3s_64_mmu_radix.c | 26 +++++++++++++++++++++-----
arch/powerpc/kvm/book3s_hv_nested.c | 13 +++++++++++--
2 files changed, 32 insertions(+), 7 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index 22a677b18695e..9934a91adcc3b 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -21,6 +21,7 @@
#include <asm/pte-walk.h>
#include <asm/ultravisor.h>
#include <asm/kvm_book3s_uvmem.h>
+#include <asm/plpar_wrappers.h>
/*
* Supported radix tree geometry.
@@ -318,9 +319,17 @@ void kvmppc_radix_tlbie_page(struct kvm *kvm, unsigned long addr,
}
psi = shift_to_mmu_psize(pshift);
- rb = addr | (mmu_get_ap(psi) << PPC_BITLSHIFT(58));
- rc = plpar_hcall_norets(H_TLB_INVALIDATE, H_TLBIE_P1_ENC(0, 0, 1),
- lpid, rb);
+ if (!firmware_has_feature(FW_FEATURE_RPT_INVALIDATE)) {
+ rb = addr | (mmu_get_ap(psi) << PPC_BITLSHIFT(58));
+ rc = plpar_hcall_norets(H_TLB_INVALIDATE,
+ H_TLBIE_P1_ENC(0, 0, 1), lpid, rb);
+ } else {
+ rc = pseries_rpt_invalidate(lpid, H_RPTI_TARGET_CMMU,
+ H_RPTI_TYPE_NESTED |
+ H_RPTI_TYPE_TLB,
+ psize_to_rpti_pgsize(psi),
+ addr, addr + psize);
+ }
if (rc)
pr_err("KVM: TLB page invalidation hcall failed, rc=%ld\n", rc);
}
@@ -334,8 +343,15 @@ static void kvmppc_radix_flush_pwc(struct kvm *kvm, unsigned int lpid)
return;
}
- rc = plpar_hcall_norets(H_TLB_INVALIDATE, H_TLBIE_P1_ENC(1, 0, 1),
- lpid, TLBIEL_INVAL_SET_LPID);
+ if (!firmware_has_feature(FW_FEATURE_RPT_INVALIDATE))
+ rc = plpar_hcall_norets(H_TLB_INVALIDATE,
+ H_TLBIE_P1_ENC(1, 0, 1),
+ lpid, TLBIEL_INVAL_SET_LPID);
+ else
+ rc = pseries_rpt_invalidate(lpid, H_RPTI_TARGET_CMMU,
+ H_RPTI_TYPE_NESTED |
+ H_RPTI_TYPE_PWC, H_RPTI_PAGE_ALL,
+ 0, -1UL);
if (rc)
pr_err("KVM: TLB PWC invalidation hcall failed, rc=%ld\n", rc);
}
diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c
index 3ec0231628b42..2a187c782e89b 100644
--- a/arch/powerpc/kvm/book3s_hv_nested.c
+++ b/arch/powerpc/kvm/book3s_hv_nested.c
@@ -19,6 +19,7 @@
#include <asm/pgalloc.h>
#include <asm/pte-walk.h>
#include <asm/reg.h>
+#include <asm/plpar_wrappers.h>
static struct patb_entry *pseries_partition_tb;
@@ -402,8 +403,16 @@ static void kvmhv_flush_lpid(unsigned int lpid)
return;
}
- rc = plpar_hcall_norets(H_TLB_INVALIDATE, H_TLBIE_P1_ENC(2, 0, 1),
- lpid, TLBIEL_INVAL_SET_LPID);
+ if (!firmware_has_feature(FW_FEATURE_RPT_INVALIDATE))
+ rc = plpar_hcall_norets(H_TLB_INVALIDATE,
+ H_TLBIE_P1_ENC(2, 0, 1),
+ lpid, TLBIEL_INVAL_SET_LPID);
+ else
+ rc = pseries_rpt_invalidate(lpid, H_RPTI_TARGET_CMMU,
+ H_RPTI_TYPE_NESTED |
+ H_RPTI_TYPE_TLB | H_RPTI_TYPE_PWC |
+ H_RPTI_TYPE_PAT,
+ H_RPTI_PAGE_ALL, 0, -1UL);
if (rc)
pr_err("KVM: TLB LPID invalidation hcall failed, rc=%ld\n", rc);
}
--
2.26.2
^ permalink raw reply related
* [PATCH v1 1/2] KVM: PPC: Book3S HV: Add support for H_RPT_INVALIDATE (nested case only)
From: Bharata B Rao @ 2020-10-19 11:26 UTC (permalink / raw)
To: kvm-ppc, linuxppc-dev; +Cc: aneesh.kumar, Bharata B Rao, npiggin
In-Reply-To: <20201019112642.53016-1-bharata@linux.ibm.com>
Implements H_RPT_INVALIDATE hcall and supports only nested case
currently.
A KVM capability KVM_CAP_RPT_INVALIDATE is added to indicate the
support for this hcall.
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
---
Documentation/virt/kvm/api.rst | 17 ++++
.../include/asm/book3s/64/tlbflush-radix.h | 18 ++++
arch/powerpc/include/asm/kvm_book3s.h | 3 +
arch/powerpc/kvm/book3s_hv.c | 32 +++++++
arch/powerpc/kvm/book3s_hv_nested.c | 94 +++++++++++++++++++
arch/powerpc/kvm/powerpc.c | 3 +
arch/powerpc/mm/book3s64/radix_tlb.c | 4 -
include/uapi/linux/kvm.h | 1 +
8 files changed, 168 insertions(+), 4 deletions(-)
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 1f26d83e6b168..67e98a56271ae 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -5852,6 +5852,23 @@ controlled by the kvm module parameter halt_poll_ns. This capability allows
the maximum halt time to specified on a per-VM basis, effectively overriding
the module parameter for the target VM.
+7.21 KVM_CAP_RPT_INVALIDATE
+--------------------------
+
+:Capability: KVM_CAP_RPT_INVALIDATE
+:Architectures: ppc
+:Type: vm
+
+This capability indicates that the kernel is capable of handling
+H_RPT_INVALIDATE hcall.
+
+In order to enable the use of H_RPT_INVALIDATE in the guest,
+user space might have to advertise it for the guest. For example,
+IBM pSeries (sPAPR) guest starts using it if "hcall-rpt-invalidate" is
+present in the "ibm,hypertas-functions" device-tree property.
+
+This capability is always enabled.
+
8. Other capabilities.
======================
diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
index 94439e0cefc9c..aace7e9b2397d 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
@@ -4,6 +4,10 @@
#include <asm/hvcall.h>
+#define RIC_FLUSH_TLB 0
+#define RIC_FLUSH_PWC 1
+#define RIC_FLUSH_ALL 2
+
struct vm_area_struct;
struct mm_struct;
struct mmu_gather;
@@ -21,6 +25,20 @@ static inline u64 psize_to_rpti_pgsize(unsigned long psize)
return H_RPTI_PAGE_ALL;
}
+static inline int rpti_pgsize_to_psize(unsigned long page_size)
+{
+ if (page_size == H_RPTI_PAGE_4K)
+ return MMU_PAGE_4K;
+ if (page_size == H_RPTI_PAGE_64K)
+ return MMU_PAGE_64K;
+ if (page_size == H_RPTI_PAGE_2M)
+ return MMU_PAGE_2M;
+ if (page_size == H_RPTI_PAGE_1G)
+ return MMU_PAGE_1G;
+ else
+ return MMU_PAGE_64K; /* Default */
+}
+
static inline int mmu_get_ap(int psize)
{
return mmu_psize_defs[psize].ap;
diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index d32ec9ae73bd4..0f1c5fa6e8ce3 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -298,6 +298,9 @@ void kvmhv_set_ptbl_entry(unsigned int lpid, u64 dw0, u64 dw1);
void kvmhv_release_all_nested(struct kvm *kvm);
long kvmhv_enter_nested_guest(struct kvm_vcpu *vcpu);
long kvmhv_do_nested_tlbie(struct kvm_vcpu *vcpu);
+long kvmhv_h_rpti_nested(struct kvm_vcpu *vcpu, unsigned long lpid,
+ unsigned long type, unsigned long pg_sizes,
+ unsigned long start, unsigned long end);
int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu,
u64 time_limit, unsigned long lpcr);
void kvmhv_save_hv_regs(struct kvm_vcpu *vcpu, struct hv_guest_state *hr);
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 3bd3118c76330..6cbd37af91ebf 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -904,6 +904,28 @@ static int kvmppc_get_yield_count(struct kvm_vcpu *vcpu)
return yield_count;
}
+static long kvmppc_h_rpt_invalidate(struct kvm_vcpu *vcpu,
+ unsigned long pid, unsigned long target,
+ unsigned long type, unsigned long pg_sizes,
+ unsigned long start, unsigned long end)
+{
+ if (end < start)
+ return H_P5;
+
+ if ((!type & H_RPTI_TYPE_NESTED))
+ return H_P3;
+
+ if (!nesting_enabled(vcpu->kvm))
+ return H_FUNCTION;
+
+ /* Support only cores as target */
+ if (target != H_RPTI_TARGET_CMMU)
+ return H_P2;
+
+ return kvmhv_h_rpti_nested(vcpu, pid, (type & ~H_RPTI_TYPE_NESTED),
+ pg_sizes, start, end);
+}
+
int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
{
unsigned long req = kvmppc_get_gpr(vcpu, 3);
@@ -1112,6 +1134,14 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
*/
ret = kvmppc_h_svm_init_abort(vcpu->kvm);
break;
+ case H_RPT_INVALIDATE:
+ ret = kvmppc_h_rpt_invalidate(vcpu, kvmppc_get_gpr(vcpu, 4),
+ kvmppc_get_gpr(vcpu, 5),
+ kvmppc_get_gpr(vcpu, 6),
+ kvmppc_get_gpr(vcpu, 7),
+ kvmppc_get_gpr(vcpu, 8),
+ kvmppc_get_gpr(vcpu, 9));
+ break;
default:
return RESUME_HOST;
@@ -1158,6 +1188,7 @@ static int kvmppc_hcall_impl_hv(unsigned long cmd)
case H_XIRR_X:
#endif
case H_PAGE_INIT:
+ case H_RPT_INVALIDATE:
return 1;
}
@@ -5334,6 +5365,7 @@ static unsigned int default_hcall_list[] = {
H_XIRR,
H_XIRR_X,
#endif
+ H_RPT_INVALIDATE,
0
};
diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c
index 6822d23a2da4d..3ec0231628b42 100644
--- a/arch/powerpc/kvm/book3s_hv_nested.c
+++ b/arch/powerpc/kvm/book3s_hv_nested.c
@@ -1149,6 +1149,100 @@ long kvmhv_do_nested_tlbie(struct kvm_vcpu *vcpu)
return H_SUCCESS;
}
+static long do_tlb_invalidate_nested_tlb(struct kvm_vcpu *vcpu,
+ unsigned long lpid,
+ unsigned long page_size,
+ unsigned long ap,
+ unsigned long start,
+ unsigned long end)
+{
+ unsigned long addr = start;
+ int ret;
+
+ do {
+ ret = kvmhv_emulate_tlbie_tlb_addr(vcpu, lpid, ap,
+ get_epn(addr));
+ if (ret)
+ return ret;
+ addr += page_size;
+ } while (addr < end);
+
+ return ret;
+}
+
+static long do_tlb_invalidate_nested_all(struct kvm_vcpu *vcpu,
+ unsigned long lpid)
+{
+ struct kvm *kvm = vcpu->kvm;
+ struct kvm_nested_guest *gp;
+
+ gp = kvmhv_get_nested(kvm, lpid, false);
+ if (gp) {
+ kvmhv_emulate_tlbie_lpid(vcpu, gp, RIC_FLUSH_ALL);
+ kvmhv_put_nested(gp);
+ }
+ return H_SUCCESS;
+}
+
+long kvmhv_h_rpti_nested(struct kvm_vcpu *vcpu, unsigned long lpid,
+ unsigned long type, unsigned long pg_sizes,
+ unsigned long start, unsigned long end)
+{
+ struct kvm_nested_guest *gp;
+ long ret;
+ unsigned long psize, ap;
+
+ /*
+ * If L2 lpid isn't valid, we need to return H_PARAMETER.
+ * Nested KVM issues a L2 lpid flush call when creating
+ * partition table entries for L2. This happens even before
+ * the corresponding shadow lpid is created in HV. Until
+ * this is fixed, ignore such flush requests.
+ */
+ gp = kvmhv_find_nested(vcpu->kvm, lpid);
+ if (!gp)
+ return H_SUCCESS;
+
+ if ((type & H_RPTI_TYPE_NESTED_ALL) == H_RPTI_TYPE_NESTED_ALL)
+ return do_tlb_invalidate_nested_all(vcpu, lpid);
+
+ if ((type & H_RPTI_TYPE_TLB) == H_RPTI_TYPE_TLB) {
+ if (pg_sizes & H_RPTI_PAGE_64K) {
+ psize = rpti_pgsize_to_psize(pg_sizes & H_RPTI_PAGE_64K);
+ ap = mmu_get_ap(psize);
+
+ ret = do_tlb_invalidate_nested_tlb(vcpu, lpid,
+ (1UL << 16),
+ ap, start, end);
+ if (ret)
+ return H_P4;
+ }
+
+ if (pg_sizes & H_RPTI_PAGE_2M) {
+ psize = rpti_pgsize_to_psize(pg_sizes & H_RPTI_PAGE_2M);
+ ap = mmu_get_ap(psize);
+
+ ret = do_tlb_invalidate_nested_tlb(vcpu, lpid,
+ (1UL << 21),
+ ap, start, end);
+ if (ret)
+ return H_P4;
+ }
+
+ if (pg_sizes & H_RPTI_PAGE_1G) {
+ psize = rpti_pgsize_to_psize(pg_sizes & H_RPTI_PAGE_1G);
+ ap = mmu_get_ap(psize);
+
+ ret = do_tlb_invalidate_nested_tlb(vcpu, lpid,
+ (1UL << 30),
+ ap, start, end);
+ if (ret)
+ return H_P4;
+ }
+ }
+ return H_SUCCESS;
+}
+
/* Used to convert a nested guest real address to a L1 guest real address */
static int kvmhv_translate_addr_nested(struct kvm_vcpu *vcpu,
struct kvm_nested_guest *gp,
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 13999123b7358..8ab4a71c40b4a 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -678,6 +678,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
r = hv_enabled && kvmppc_hv_ops->enable_svm &&
!kvmppc_hv_ops->enable_svm(NULL);
break;
+ case KVM_CAP_RPT_INVALIDATE:
+ r = 1;
+ break;
#endif
default:
r = 0;
diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
index b487b489d4b68..3a2b12d1d49b8 100644
--- a/arch/powerpc/mm/book3s64/radix_tlb.c
+++ b/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -18,10 +18,6 @@
#include <asm/cputhreads.h>
#include <asm/plpar_wrappers.h>
-#define RIC_FLUSH_TLB 0
-#define RIC_FLUSH_PWC 1
-#define RIC_FLUSH_ALL 2
-
/*
* tlbiel instruction for radix, set invalidation
* i.e., r=1 and is=01 or is=10 or is=11
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 7d8eced6f459b..109ede74735d4 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1037,6 +1037,7 @@ struct kvm_ppc_resize_hpt {
#define KVM_CAP_SMALLER_MAXPHYADDR 185
#define KVM_CAP_S390_DIAG318 186
#define KVM_CAP_STEAL_TIME 187
+#define KVM_CAP_RPT_INVALIDATE 188
#ifdef KVM_CAP_IRQ_ROUTING
--
2.26.2
^ permalink raw reply related
* [PATCH 2/3] powerpc: Fix incorrect stw{, ux, u, x} instructions in __set_pte_at
From: Christophe Leroy @ 2020-10-19 12:12 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <5ffcb064f695d5285bf1faab91bffa3f9245fc26.1603109522.git.christophe.leroy@csgroup.eu>
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
The placeholder for instruction selection should use the second
argument's operand, which is %1, not %0. This could generate incorrect
assembly code if the instruction selection for argument %0 ever differs
from argument %1.
Fixes: 9bf2b5cdc5fe ("powerpc: Fixes for CONFIG_PTE_64BIT for SMP support")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: <stable@vger.kernel.org> # v2.6.28+
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/include/asm/book3s/32/pgtable.h | 2 +-
arch/powerpc/include/asm/nohash/pgtable.h | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 36443cda8dcf..34f5ca391f0c 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -524,7 +524,7 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
__asm__ __volatile__("\
stw%U0%X0 %2,%0\n\
eieio\n\
- stw%U0%X0 %L2,%1"
+ stw%U1%X1 %L2,%1"
: "=m" (*ptep), "=m" (*((unsigned char *)ptep+4))
: "r" (pte) : "memory");
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h
index 4b7c3472eab1..a00e4c1746d6 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -199,7 +199,7 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
__asm__ __volatile__("\
stw%U0%X0 %2,%0\n\
eieio\n\
- stw%U0%X0 %L2,%1"
+ stw%U1%X1 %L2,%1"
: "=m" (*ptep), "=m" (*((unsigned char *)ptep+4))
: "r" (pte) : "memory");
return;
--
2.25.0
^ permalink raw reply related
* [PATCH 3/3] powerpc: Fix pre-update addressing in inline assembly
From: Christophe Leroy @ 2020-10-19 12:12 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
In-Reply-To: <5ffcb064f695d5285bf1faab91bffa3f9245fc26.1603109522.git.christophe.leroy@csgroup.eu>
In several places, inline assembly uses the "%Un" modifier
to enable the use of instruction with pre-update addressing,
but the associated "<>" constraint is missing.
As mentioned in previous patch, this fails with gcc 4.9, so
"<>" can't be used directly.
Use UPD_CONSTR macro everywhere %Un modifier is used.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/include/asm/atomic.h | 9 +++++----
arch/powerpc/include/asm/book3s/32/pgtable.h | 2 +-
arch/powerpc/include/asm/io.h | 4 ++--
arch/powerpc/include/asm/nohash/pgtable.h | 2 +-
arch/powerpc/kvm/powerpc.c | 4 ++--
5 files changed, 11 insertions(+), 10 deletions(-)
diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
index 8a55eb8cc97b..b82f9154e45a 100644
--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -10,6 +10,7 @@
#include <linux/types.h>
#include <asm/cmpxchg.h>
#include <asm/barrier.h>
+#include <asm/ppc_asm.h>
/*
* Since *_return_relaxed and {cmp}xchg_relaxed are implemented with
@@ -26,14 +27,14 @@ static __inline__ int atomic_read(const atomic_t *v)
{
int t;
- __asm__ __volatile__("lwz%U1%X1 %0,%1" : "=r"(t) : "m"(v->counter));
+ __asm__ __volatile__("lwz%U1%X1 %0,%1" : "=r"(t) : "m"UPD_CONSTR(v->counter));
return t;
}
static __inline__ void atomic_set(atomic_t *v, int i)
{
- __asm__ __volatile__("stw%U0%X0 %1,%0" : "=m"(v->counter) : "r"(i));
+ __asm__ __volatile__("stw%U0%X0 %1,%0" : "=m"UPD_CONSTR(v->counter) : "r"(i));
}
#define ATOMIC_OP(op, asm_op) \
@@ -316,14 +317,14 @@ static __inline__ s64 atomic64_read(const atomic64_t *v)
{
s64 t;
- __asm__ __volatile__("ld%U1%X1 %0,%1" : "=r"(t) : "m"(v->counter));
+ __asm__ __volatile__("ld%U1%X1 %0,%1" : "=r"(t) : "m"UPD_CONSTR(v->counter));
return t;
}
static __inline__ void atomic64_set(atomic64_t *v, s64 i)
{
- __asm__ __volatile__("std%U0%X0 %1,%0" : "=m"(v->counter) : "r"(i));
+ __asm__ __volatile__("std%U0%X0 %1,%0" : "=m"UPD_CONSTR(v->counter) : "r"(i));
}
#define ATOMIC64_OP(op, asm_op) \
diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 34f5ca391f0c..0e1b6e020cef 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -525,7 +525,7 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
stw%U0%X0 %2,%0\n\
eieio\n\
stw%U1%X1 %L2,%1"
- : "=m" (*ptep), "=m" (*((unsigned char *)ptep+4))
+ : "=m"UPD_CONSTR (*ptep), "=m"UPD_CONSTR (*((unsigned char *)ptep+4))
: "r" (pte) : "memory");
#else
diff --git a/arch/powerpc/include/asm/io.h b/arch/powerpc/include/asm/io.h
index 58635960403c..87964dfb838e 100644
--- a/arch/powerpc/include/asm/io.h
+++ b/arch/powerpc/include/asm/io.h
@@ -122,7 +122,7 @@ static inline u##size name(const volatile u##size __iomem *addr) \
{ \
u##size ret; \
__asm__ __volatile__("sync;"#insn"%U1%X1 %0,%1;twi 0,%0,0;isync"\
- : "=r" (ret) : "m" (*addr) : "memory"); \
+ : "=r" (ret) : "m"UPD_CONSTR (*addr) : "memory"); \
return ret; \
}
@@ -130,7 +130,7 @@ static inline u##size name(const volatile u##size __iomem *addr) \
static inline void name(volatile u##size __iomem *addr, u##size val) \
{ \
__asm__ __volatile__("sync;"#insn"%U0%X0 %1,%0" \
- : "=m" (*addr) : "r" (val) : "memory"); \
+ : "=m"UPD_CONSTR (*addr) : "r" (val) : "memory"); \
mmiowb_set_pending(); \
}
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h
index a00e4c1746d6..55ef2112ed00 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -200,7 +200,7 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
stw%U0%X0 %2,%0\n\
eieio\n\
stw%U1%X1 %L2,%1"
- : "=m" (*ptep), "=m" (*((unsigned char *)ptep+4))
+ : "=m"UPD_CONSTR (*ptep), "=m"UPD_CONSTR (*((unsigned char *)ptep+4))
: "r" (pte) : "memory");
return;
}
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 13999123b735..cf52d26f49cd 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -1087,7 +1087,7 @@ static inline u64 sp_to_dp(u32 fprs)
preempt_disable();
enable_kernel_fp();
- asm ("lfs%U1%X1 0,%1; stfd%U0%X0 0,%0" : "=m" (fprd) : "m" (fprs)
+ asm ("lfs%U1%X1 0,%1; stfd%U0%X0 0,%0" : "=m"UPD_CONSTR (fprd) : "m"UPD_CONSTR (fprs)
: "fr0");
preempt_enable();
return fprd;
@@ -1099,7 +1099,7 @@ static inline u32 dp_to_sp(u64 fprd)
preempt_disable();
enable_kernel_fp();
- asm ("lfd%U1%X1 0,%1; stfs%U0%X0 0,%0" : "=m" (fprs) : "m" (fprd)
+ asm ("lfd%U1%X1 0,%1; stfs%U0%X0 0,%0" : "=m"UPD_CONSTR (fprs) : "m"UPD_CONSTR (fprd)
: "fr0");
preempt_enable();
return fprs;
--
2.25.0
^ permalink raw reply related
* [PATCH 1/3] powerpc/uaccess: Don't use "m<>" constraint with GCC 4.9
From: Christophe Leroy @ 2020-10-19 12:12 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
GCC 4.9 sometimes fails to build with "m<>" constraint in
inline assembly.
CC lib/iov_iter.o
In file included from ./arch/powerpc/include/asm/cmpxchg.h:6:0,
from ./arch/powerpc/include/asm/atomic.h:11,
from ./include/linux/atomic.h:7,
from ./include/linux/crypto.h:15,
from ./include/crypto/hash.h:11,
from lib/iov_iter.c:2:
lib/iov_iter.c: In function 'iovec_from_user.part.30':
./arch/powerpc/include/asm/uaccess.h:287:2: error: 'asm' operand has impossible constraints
__asm__ __volatile__( \
^
./include/linux/compiler.h:78:42: note: in definition of macro 'unlikely'
# define unlikely(x) __builtin_expect(!!(x), 0)
^
./arch/powerpc/include/asm/uaccess.h:583:34: note: in expansion of macro 'unsafe_op_wrap'
#define unsafe_get_user(x, p, e) unsafe_op_wrap(__get_user_allowed(x, p), e)
^
./arch/powerpc/include/asm/uaccess.h:329:10: note: in expansion of macro '__get_user_asm'
case 4: __get_user_asm(x, (u32 __user *)ptr, retval, "lwz"); break; \
^
./arch/powerpc/include/asm/uaccess.h:363:3: note: in expansion of macro '__get_user_size_allowed'
__get_user_size_allowed(__gu_val, __gu_addr, __gu_size, __gu_err); \
^
./arch/powerpc/include/asm/uaccess.h:100:2: note: in expansion of macro '__get_user_nocheck'
__get_user_nocheck((x), (ptr), sizeof(*(ptr)), false)
^
./arch/powerpc/include/asm/uaccess.h:583:49: note: in expansion of macro '__get_user_allowed'
#define unsafe_get_user(x, p, e) unsafe_op_wrap(__get_user_allowed(x, p), e)
^
lib/iov_iter.c:1663:3: note: in expansion of macro 'unsafe_get_user'
unsafe_get_user(len, &uiov[i].iov_len, uaccess_end);
^
make[1]: *** [scripts/Makefile.build:283: lib/iov_iter.o] Error 1
Define a UPD_CONSTR macro that is "<>" by default and
only "" with GCC prior to GCC 5.
Fixes: fcf1f26895a4 ("powerpc/uaccess: Add pre-update addressing to __put_user_asm_goto()")
Fixes: 2f279eeb68b8 ("powerpc/uaccess: Add pre-update addressing to __get_user_asm() and __put_user_asm()")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/include/asm/ppc_asm.h | 14 ++++++++++++++
arch/powerpc/include/asm/uaccess.h | 4 ++--
2 files changed, 16 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h
index 511786f0e40d..471c7c57fc98 100644
--- a/arch/powerpc/include/asm/ppc_asm.h
+++ b/arch/powerpc/include/asm/ppc_asm.h
@@ -803,6 +803,20 @@ END_FTR_SECTION_NESTED(CPU_FTR_CELL_TB_BUG, CPU_FTR_CELL_TB_BUG, 96)
#endif /* !CONFIG_PPC_BOOK3E */
+#else /* __ASSEMBLY */
+
+/*
+ * Inline assembly memory constraint
+ *
+ * GCC 4.9 doesn't properly handle pre update memory constraint "m<>"
+ *
+ */
+#if defined(GCC_VERSION) && GCC_VERSION < 50000
+#define UPD_CONSTR ""
+#else
+#define UPD_CONSTR "<>"
+#endif
+
#endif /* __ASSEMBLY__ */
/*
diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index 604d705f1bb8..8f27ea48fadb 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -223,7 +223,7 @@ do { \
"1: " op "%U1%X1 %0,%1 # put_user\n" \
EX_TABLE(1b, %l2) \
: \
- : "r" (x), "m<>" (*addr) \
+ : "r" (x), "m"UPD_CONSTR (*addr) \
: \
: label)
@@ -294,7 +294,7 @@ extern long __get_user_bad(void);
".previous\n" \
EX_TABLE(1b, 3b) \
: "=r" (err), "=r" (x) \
- : "m<>" (*addr), "i" (-EFAULT), "0" (err))
+ : "m"UPD_CONSTR (*addr), "i" (-EFAULT), "0" (err))
#ifdef __powerpc64__
#define __get_user_asm2(x, addr, err) \
--
2.25.0
^ permalink raw reply related
* Re: [PATCH] drm/amd/display: Fix missing declaration of enable_kernel_vsx()
From: kernel test robot @ 2020-10-19 12:52 UTC (permalink / raw)
To: Christophe Leroy, Harry Wentland, Leo Li, Alex Deucher,
Christian König, David Airlie, Daniel Vetter
Cc: linuxppc-dev, amd-gfx, kbuild-all, linux-kernel, dri-devel
In-Reply-To: <1a5d454cf080a00c04ae488ef6e98d5fcc933550.1603100151.git.christophe.leroy@csgroup.eu>
[-- Attachment #1: Type: text/plain, Size: 2296 bytes --]
Hi Christophe,
I love your patch! Yet something to improve:
[auto build test ERROR on linus/master]
[also build test ERROR on v5.9 next-20201016]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Christophe-Leroy/drm-amd-display-Fix-missing-declaration-of-enable_kernel_vsx/20201019-174155
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 7cf726a59435301046250c42131554d9ccc566b8
config: arc-randconfig-r013-20201019 (attached as .config)
compiler: arceb-elf-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/33f0ea8bebc4132d957107f55776d8f1e02df928
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Christophe-Leroy/drm-amd-display-Fix-missing-declaration-of-enable_kernel_vsx/20201019-174155
git checkout 33f0ea8bebc4132d957107f55776d8f1e02df928
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arc
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All errors (new ones prefixed by >>):
In file included from drivers/gpu/drm/amd/amdgpu/../display/dc/dm_services_types.h:29,
from drivers/gpu/drm/amd/amdgpu/../include/dm_pp_interface.h:26,
from drivers/gpu/drm/amd/amdgpu/amdgpu.h:67,
from drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:40:
>> drivers/gpu/drm/amd/amdgpu/../display/dc/os_types.h:36:10: fatal error: asm/switch-to.h: No such file or directory
36 | #include <asm/switch-to.h>
| ^~~~~~~~~~~~~~~~~
compilation terminated.
vim +36 drivers/gpu/drm/amd/amdgpu/../display/dc/os_types.h
34
35 #include <asm/byteorder.h>
> 36 #include <asm/switch-to.h>
37
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 31866 bytes --]
^ permalink raw reply
* Re: [PATCH v4 2/2] lkdtm/powerpc: Add SLB multihit test
From: Michal Suchánek @ 2020-10-19 13:15 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev, Ganesh Goudar, keescook, npiggin, mahesh
In-Reply-To: <87362azdjm.fsf@mpe.ellerman.id.au>
On Mon, Oct 19, 2020 at 09:59:57PM +1100, Michael Ellerman wrote:
> Hi Ganesh,
>
> Some comments below ...
>
> Ganesh Goudar <ganeshgr@linux.ibm.com> writes:
> > To check machine check handling, add support to inject slb
> > multihit errors.
> >
> > Cc: Kees Cook <keescook@chromium.org>
> > Reviewed-by: Michal Suchánek <msuchanek@suse.de>
> > Co-developed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
> > Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
> > Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
> > ---
> > drivers/misc/lkdtm/Makefile | 1 +
> > drivers/misc/lkdtm/core.c | 3 +
> > drivers/misc/lkdtm/lkdtm.h | 3 +
> > drivers/misc/lkdtm/powerpc.c | 156 ++++++++++++++++++++++++
> > tools/testing/selftests/lkdtm/tests.txt | 1 +
> > 5 files changed, 164 insertions(+)
> > create mode 100644 drivers/misc/lkdtm/powerpc.c
> >
> ..
> > diff --git a/drivers/misc/lkdtm/powerpc.c b/drivers/misc/lkdtm/powerpc.c
> > new file mode 100644
> > index 000000000000..f388b53dccba
> > --- /dev/null
> > +++ b/drivers/misc/lkdtm/powerpc.c
> > @@ -0,0 +1,156 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +#include "lkdtm.h"
> > +#include <linux/slab.h>
> > +#include <linux/vmalloc.h>
>
> Usual style is to include the linux headers first and then the local header.
>
> > +
> > +/* Gets index for new slb entry */
> > +static inline unsigned long get_slb_index(void)
> > +{
> > + unsigned long index;
> > +
> > + index = get_paca()->stab_rr;
> > +
> > + /*
> > + * simple round-robin replacement of slb starting at SLB_NUM_BOLTED.
> > + */
> > + if (index < (mmu_slb_size - 1))
> > + index++;
> > + else
> > + index = SLB_NUM_BOLTED;
> > + get_paca()->stab_rr = index;
> > + return index;
> > +}
>
> I'm not sure we need that really?
>
> We can just always insert at SLB_MUM_BOLTED and SLB_NUM_BOLTED + 1.
>
> Or we could allocate from the top down using mmu_slb_size - 1, and
> mmu_slb_size - 2.
>
>
> > +#define slb_esid_mask(ssize) \
> > + (((ssize) == MMU_SEGSIZE_256M) ? ESID_MASK : ESID_MASK_1T)
> > +
> > +/* Form the operand for slbmte */
> > +static inline unsigned long mk_esid_data(unsigned long ea, int ssize,
> > + unsigned long slot)
> > +{
> > + return (ea & slb_esid_mask(ssize)) | SLB_ESID_V | slot;
> > +}
> > +
> > +#define slb_vsid_shift(ssize) \
> > + ((ssize) == MMU_SEGSIZE_256M ? SLB_VSID_SHIFT : SLB_VSID_SHIFT_1T)
> > +
> > +/* Form the operand for slbmte */
> > +static inline unsigned long mk_vsid_data(unsigned long ea, int ssize,
> > + unsigned long flags)
> > +{
> > + return (get_kernel_vsid(ea, ssize) << slb_vsid_shift(ssize)) | flags |
> > + ((unsigned long)ssize << SLB_VSID_SSIZE_SHIFT);
> > +}
>
> I realise it's not much code, but I'd rather those were in a header,
> rather than copied from slb.c. That way they can never skew vs the
> versions in slb.c
>
> Best place I think would be arch/powerpc/include/asm/book3s/64/mmu-hash.h
>
>
> > +
> > +/* Inserts new slb entry */
>
> It inserts two.
>
> > +static void insert_slb_entry(char *p, int ssize)
> > +{
> > + unsigned long flags, entry;
> > +
> > + flags = SLB_VSID_KERNEL | mmu_psize_defs[MMU_PAGE_64K].sllp;
>
> That won't work if the kernel is built for 4K pages. Or at least it
> won't work the way we want it to.
>
> You should use mmu_linear_psize.
>
> But for vmalloc you should use mmu_vmalloc_psize, so it will need to be
> a parameter.
>
> > + preempt_disable();
> > +
> > + entry = get_slb_index();
> > + asm volatile("slbmte %0,%1" :
> > + : "r" (mk_vsid_data((unsigned long)p, ssize, flags)),
> > + "r" (mk_esid_data((unsigned long)p, ssize, entry))
> > + : "memory");
> > +
> > + entry = get_slb_index();
> > + asm volatile("slbmte %0,%1" :
> > + : "r" (mk_vsid_data((unsigned long)p, ssize, flags)),
> > + "r" (mk_esid_data((unsigned long)p, ssize, entry))
> > + : "memory");
> > + preempt_enable();
> > + /*
> > + * This triggers exception, If handled correctly we must recover
> > + * from this error.
> > + */
> > + p[0] = '!';
>
> That doesn't belong in here, it should be done by the caller.
>
> That would also mean p could be unsigned long in here, so you wouldn't
> have to cast it four times.
>
> > +}
> > +
> > +/* Inject slb multihit on vmalloc-ed address i.e 0xD00... */
> > +static void inject_vmalloc_slb_multihit(void)
> > +{
> > + char *p;
> > +
> > + p = vmalloc(2048);
>
> vmalloc() allocates whole pages, so it may as well be vmalloc(PAGE_SIZE).
>
> > + if (!p)
> > + return;
>
> That's unlikely, but it should be an error that's propagated up to the caller.
>
> > +
> > + insert_slb_entry(p, MMU_SEGSIZE_1T);
> > + vfree(p);
> > +}
> > +
> > +/* Inject slb multihit on kmalloc-ed address i.e 0xC00... */
> > +static void inject_kmalloc_slb_multihit(void)
> > +{
> > + char *p;
> > +
> > + p = kmalloc(2048, GFP_KERNEL);
> > + if (!p)
> > + return;
> > +
> > + insert_slb_entry(p, MMU_SEGSIZE_1T);
> > + kfree(p);
> > +}
> > +
> > +/*
> > + * Few initial SLB entries are bolted. Add a test to inject
> > + * multihit in bolted entry 0.
> > + */
> > +static void insert_dup_slb_entry_0(void)
> > +{
> > + unsigned long test_address = 0xC000000000000000;
>
> Should use PAGE_OFFSET;
>
> > + volatile unsigned long *test_ptr;
>
> Does it need to be a volatile?
> The slbmte should act as a compiler barrier (it has a memory clobber)
> and a CPU barrier as well?
>
> > + unsigned long entry, i = 0;
> > + unsigned long esid, vsid;
>
> Please group your variables:
>
> unsigned long esid, vsid, entry, test_address, i;
> volatile unsigned long *test_ptr;
>
> And then initialise them as appropriate.
>
> > + test_ptr = (unsigned long *)test_address;
> > + preempt_disable();
> > +
> > + asm volatile("slbmfee %0,%1" : "=r" (esid) : "r" (i));
> > + asm volatile("slbmfev %0,%1" : "=r" (vsid) : "r" (i));
>
> Why do we need to read them out of the SLB rather than just computing
> the values?
It ensures that the entry is perfect duplicate without copying even more
code from other parts of the kernel, doesn't it?
Especially when inserting only one duplicate as suggested later it
ensures that the test really does what it should.
>
> > + entry = get_slb_index();
> > +
> > + /* for i !=0 we would need to mask out the old entry number */
>
> Or you could just compute esid and then it wouldn't be an issue.
>
> > + asm volatile("slbmte %0,%1" :
> > + : "r" (vsid),
> > + "r" (esid | entry)
> > + : "memory");
>
> At this point we've just inserted a duplicate of entry 0. So you don't
> need to insert a third entry do you?
This code was obviously adapted from the previous one which needed two
entries in case there was none for the memory region to start with.
Addin only one duplicate should suffice and it can be easily tested that
it still generates a MCE.
>
> > + asm volatile("slbmfee %0,%1" : "=r" (esid) : "r" (i));
> > + asm volatile("slbmfev %0,%1" : "=r" (vsid) : "r" (i));
> > + entry = get_slb_index();
> > +
> > + /* for i !=0 we would need to mask out the old entry number */
> > + asm volatile("slbmte %0,%1" :
> > + : "r" (vsid),
> > + "r" (esid | entry)
> > + : "memory");
> > +
> > + pr_info("%s accessing test address 0x%lx: 0x%lx\n",
> > + __func__, test_address, *test_ptr);
>
> This prints the first two instructions of the kernel. I happen to know
> what values they should have, but most people won't understand what
> they're seeing. A better test would be to read the value at the top of
> the function and then load it again here and check we got the right
> thing.
It does not really matter what we read back so long as the compiler does
not optimize out the read. The point here is to access an address in the
range covered by the SLB entry 0. The failure case is that the system
crashes and the test never finishes.
Thanks
Michal
^ permalink raw reply
* Re: [PATCH] drm/amd/display: Fix missing declaration of enable_kernel_vsx()
From: kernel test robot @ 2020-10-19 14:06 UTC (permalink / raw)
To: Christophe Leroy, Harry Wentland, Leo Li, Alex Deucher,
Christian König, David Airlie, Daniel Vetter
Cc: kbuild-all, linux-kernel, amd-gfx, clang-built-linux, dri-devel,
linuxppc-dev
In-Reply-To: <1a5d454cf080a00c04ae488ef6e98d5fcc933550.1603100151.git.christophe.leroy@csgroup.eu>
[-- Attachment #1: Type: text/plain, Size: 2488 bytes --]
Hi Christophe,
I love your patch! Yet something to improve:
[auto build test ERROR on linus/master]
[also build test ERROR on v5.9 next-20201016]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Christophe-Leroy/drm-amd-display-Fix-missing-declaration-of-enable_kernel_vsx/20201019-174155
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 7cf726a59435301046250c42131554d9ccc566b8
config: x86_64-randconfig-a015-20201019 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 094e9f4779eb9b5c6a49014f2f80b8cbb833572f)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install x86_64 cross compiling tool for clang build
# apt-get install binutils-x86-64-linux-gnu
# https://github.com/0day-ci/linux/commit/33f0ea8bebc4132d957107f55776d8f1e02df928
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Christophe-Leroy/drm-amd-display-Fix-missing-declaration-of-enable_kernel_vsx/20201019-174155
git checkout 33f0ea8bebc4132d957107f55776d8f1e02df928
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All errors (new ones prefixed by >>):
In file included from drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/tonga_baco.c:23:
In file included from drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgpu.h:67:
In file included from drivers/gpu/drm/amd/amdgpu/../include/dm_pp_interface.h:26:
In file included from drivers/gpu/drm/amd/amdgpu/../display/dc/dm_services_types.h:29:
>> drivers/gpu/drm/amd/amdgpu/../display/dc/os_types.h:36:10: fatal error: 'asm/switch-to.h' file not found
#include <asm/switch-to.h>
^~~~~~~~~~~~~~~~~
1 error generated.
vim +36 drivers/gpu/drm/amd/amdgpu/../display/dc/os_types.h
34
35 #include <asm/byteorder.h>
> 36 #include <asm/switch-to.h>
37
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 32643 bytes --]
^ permalink raw reply
* Re: [PATCH 3/3] powerpc: Fix pre-update addressing in inline assembly
From: kernel test robot @ 2020-10-19 15:35 UTC (permalink / raw)
To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras,
Michael Ellerman
Cc: clang-built-linux, linuxppc-dev, kbuild-all, linux-kernel
In-Reply-To: <fbcdb173cc42da62f00285dfef8c2f7d4960b5c7.1603109522.git.christophe.leroy@csgroup.eu>
[-- Attachment #1: Type: text/plain, Size: 5813 bytes --]
Hi Christophe,
I love your patch! Yet something to improve:
[auto build test ERROR on powerpc/next]
[also build test ERROR on linus/master next-20201016]
[cannot apply to kvm-ppc/kvm-ppc-next mpe/next v5.9]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Christophe-Leroy/powerpc-uaccess-Don-t-use-m-constraint-with-GCC-4-9/20201019-201504
base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc64-randconfig-r012-20201019 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 094e9f4779eb9b5c6a49014f2f80b8cbb833572f)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install powerpc64 cross compiling tool for clang build
# apt-get install binutils-powerpc64-linux-gnu
# https://github.com/0day-ci/linux/commit/d57fd8d270993414b8c0414d7be4b03cc3de1856
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Christophe-Leroy/powerpc-uaccess-Don-t-use-m-constraint-with-GCC-4-9/20201019-201504
git checkout d57fd8d270993414b8c0414d7be4b03cc3de1856
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All errors (new ones prefixed by >>):
In file included from arch/powerpc/kernel/asm-offsets.c:14:
In file included from include/linux/compat.h:14:
In file included from include/linux/sem.h:5:
In file included from include/uapi/linux/sem.h:5:
In file included from include/linux/ipc.h:5:
In file included from include/linux/spinlock.h:51:
In file included from include/linux/preempt.h:78:
In file included from ./arch/powerpc/include/generated/asm/preempt.h:1:
In file included from include/asm-generic/preempt.h:5:
In file included from include/linux/thread_info.h:21:
In file included from arch/powerpc/include/asm/current.h:13:
In file included from arch/powerpc/include/asm/paca.h:31:
In file included from arch/powerpc/include/asm/atomic.h:13:
In file included from arch/powerpc/include/asm/ppc_asm.h:9:
In file included from arch/powerpc/include/asm/processor.h:40:
>> arch/powerpc/include/asm/ptrace.h:288:20: error: use of undeclared identifier 'THREAD_SIZE'
return ((addr & ~(THREAD_SIZE - 1)) ==
^
arch/powerpc/include/asm/ptrace.h:289:35: error: use of undeclared identifier 'THREAD_SIZE'
(kernel_stack_pointer(regs) & ~(THREAD_SIZE - 1)));
^
In file included from arch/powerpc/kernel/asm-offsets.c:21:
include/linux/mman.h:137:9: warning: division by zero is undefined [-Wdivision-by-zero]
_calc_vm_trans(flags, MAP_LOCKED, VM_LOCKED ) |
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/mman.h:115:21: note: expanded from macro '_calc_vm_trans'
: ((x) & (bit1)) / ((bit1) / (bit2))))
^ ~~~~~~~~~~~~~~~~~
include/linux/mman.h:138:9: warning: division by zero is undefined [-Wdivision-by-zero]
_calc_vm_trans(flags, MAP_SYNC, VM_SYNC );
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/mman.h:115:21: note: expanded from macro '_calc_vm_trans'
: ((x) & (bit1)) / ((bit1) / (bit2))))
^ ~~~~~~~~~~~~~~~~~
2 warnings and 2 errors generated.
make[2]: *** [scripts/Makefile.build:117: arch/powerpc/kernel/asm-offsets.s] Error 1
make[2]: Target '__build' not remade because of errors.
make[1]: *** [Makefile:1202: prepare0] Error 2
make[1]: Target 'prepare' not remade because of errors.
make: *** [Makefile:185: __sub-make] Error 2
make: Target 'prepare' not remade because of errors.
vim +/THREAD_SIZE +288 arch/powerpc/include/asm/ptrace.h
359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 275
359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 276 /**
359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 277 * regs_within_kernel_stack() - check the address in the stack
359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 278 * @regs: pt_regs which contains kernel stack pointer.
359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 279 * @addr: address which is checked.
359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 280 *
359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 281 * regs_within_kernel_stack() checks @addr is within the kernel stack page(s).
359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 282 * If @addr is within the kernel stack, it returns true. If not, returns false.
359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 283 */
359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 284
359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 285 static inline bool regs_within_kernel_stack(struct pt_regs *regs,
359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 286 unsigned long addr)
359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 287 {
359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 @288 return ((addr & ~(THREAD_SIZE - 1)) ==
359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 289 (kernel_stack_pointer(regs) & ~(THREAD_SIZE - 1)));
359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 290 }
359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 291
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 26981 bytes --]
^ permalink raw reply
* Re: [PATCH] drm/amd/display: Fix missing declaration of enable_kernel_vsx()
From: Christophe Leroy @ 2020-10-19 15:40 UTC (permalink / raw)
To: kernel test robot, Harry Wentland, Leo Li, Alex Deucher,
Christian König, David Airlie, Daniel Vetter
Cc: linuxppc-dev, amd-gfx, kbuild-all, linux-kernel, dri-devel
In-Reply-To: <202010192006.INRpAG6V-lkp@intel.com>
Le 19/10/2020 à 14:52, kernel test robot a écrit :
> Hi Christophe,
>
> I love your patch! Yet something to improve:
>
> [auto build test ERROR on linus/master]
> [also build test ERROR on v5.9 next-20201016]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
>
> url: https://github.com/0day-ci/linux/commits/Christophe-Leroy/drm-amd-display-Fix-missing-declaration-of-enable_kernel_vsx/20201019-174155
> base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 7cf726a59435301046250c42131554d9ccc566b8
> config: arc-randconfig-r013-20201019 (attached as .config)
> compiler: arceb-elf-gcc (GCC) 9.3.0
> reproduce (this is a W=1 build):
> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # https://github.com/0day-ci/linux/commit/33f0ea8bebc4132d957107f55776d8f1e02df928
> git remote add linux-review https://github.com/0day-ci/linux
> git fetch --no-tags linux-review Christophe-Leroy/drm-amd-display-Fix-missing-declaration-of-enable_kernel_vsx/20201019-174155
> git checkout 33f0ea8bebc4132d957107f55776d8f1e02df928
> # save the attached .config to linux build tree
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arc
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
>
> All errors (new ones prefixed by >>):
>
> In file included from drivers/gpu/drm/amd/amdgpu/../display/dc/dm_services_types.h:29,
> from drivers/gpu/drm/amd/amdgpu/../include/dm_pp_interface.h:26,
> from drivers/gpu/drm/amd/amdgpu/amdgpu.h:67,
> from drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:40:
>>> drivers/gpu/drm/amd/amdgpu/../display/dc/os_types.h:36:10: fatal error: asm/switch-to.h: No such file or directory
> 36 | #include <asm/switch-to.h>
> | ^~~~~~~~~~~~~~~~~
> compilation terminated.
Argh ! Yes that's a typo. And anyway it fixes nothing because <asm/switch_to.h> is already included.
The issue is that enable_kernel_vsx() is only declared when CONFIG_VSX is set. The simplest solution
will probably be to declare it at all time.
Christophe
>
> vim +36 drivers/gpu/drm/amd/amdgpu/../display/dc/os_types.h
>
> 34
> 35 #include <asm/byteorder.h>
> > 36 #include <asm/switch-to.h>
> 37
>
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
>
^ permalink raw reply
* Re: [PATCH 1/4] KVM: PPC: Book3S HV: Make struct kernel_param_ops definition const
From: Paolo Bonzini @ 2020-10-19 16:01 UTC (permalink / raw)
To: Joe Perches, Paul Mackerras
Cc: kvm-ppc, Athira Rajeev, Tianjia Zhang, Ram Pai, linux-kernel,
Sean Christopherson, Bharata B Rao, Davidlohr Bueso,
Nicholas Piggin, Laurent Dufour, linuxppc-dev
In-Reply-To: <d130e88dd4c82a12d979da747cc0365c72c3ba15.1601770305.git.joe@perches.com>
On 04/10/20 02:18, Joe Perches wrote:
> This should be const, so make it so.
>
> Signed-off-by: Joe Perches <joe@perches.com>
> ---
> arch/powerpc/kvm/book3s_hv.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 4ba06a2a306c..2b215852cdc9 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -111,7 +111,7 @@ module_param(one_vm_per_core, bool, S_IRUGO | S_IWUSR);
> MODULE_PARM_DESC(one_vm_per_core, "Only run vCPUs from the same VM on a core (requires indep_threads_mode=N)");
>
> #ifdef CONFIG_KVM_XICS
> -static struct kernel_param_ops module_param_ops = {
> +static const struct kernel_param_ops module_param_ops = {
> .set = param_set_int,
> .get = param_get_int,
> };
>
Queued, thanks.
Paolo
^ permalink raw reply
* Re: [PATCH 11/20] dt-bindings: usb: dwc3: Add synopsys,dwc3 compatible string
From: Serge Semin @ 2020-10-19 16:22 UTC (permalink / raw)
To: Rob Herring
Cc: Neil Armstrong, linux-kernel, Pavel Parkhomenko, Kevin Hilman,
Krzysztof Kozlowski, Andy Gross, linux-snps-arc, devicetree,
Mathias Nyman, Lad Prabhakar, Alexey Malahov, Bjorn Andersson,
linux-arm-kernel, Roger Quadros, Felipe Balbi, Greg Kroah-Hartman,
Yoshihiro Shimoda, linux-usb, linux-mips, Serge Semin,
Manu Gautam, linuxppc-dev
In-Reply-To: <20201016185340.GA1734346@bogus>
On Fri, Oct 16, 2020 at 01:53:40PM -0500, Rob Herring wrote:
> On Thu, Oct 15, 2020 at 12:35:54AM +0300, Serge Semin wrote:
> > On Wed, Oct 14, 2020 at 10:18:18PM +0200, Krzysztof Kozlowski wrote:
> > > On Wed, Oct 14, 2020 at 01:13:53PM +0300, Serge Semin wrote:
> > > > The DWC USB3 driver and some DTS files like Exynos 5250, Keystone k2e, etc
> > > > expects the DWC USB3 DT node to have the compatible string with the
> > > > "synopsys" vendor prefix. Let's add the corresponding compatible string to
> > > > the controller DT schema, but mark it as deprecated seeing the Synopsys,
> > > > Inc. is presented with just "snps" vendor prefix.
> > >
> >
> > > Instead of adding deprecated schema just correct the DTSes to use snps.
> > > The "synopsys" is not even in vendor prefixes.
> >
> > Yeah, it's not, but the driver and some dts'es use it this way. I am not sure
> > that the solution suggested by you is much better than mine. So let's hear the
> > Rob'es opinion out in this matter. @Rob, what do you think?
>
> I think we should fix the dts files given there's only 5.
Ok. I'll do that.
-Sergey
>
> Rob
^ permalink raw reply
* Re: [PATCH] asm-generic: Force inlining of get_order() to work around gcc10 poor decision
From: Segher Boessenkool @ 2020-10-19 17:55 UTC (permalink / raw)
To: Christophe Leroy
Cc: linux-arch, Arnd Bergmann, Linux Kernel Mailing List,
Masahiro Yamada, Joel Stanley, Andrew Morton, linuxppc-dev
In-Reply-To: <188e00e1-ae41-693e-1d05-f8d87e7ee696@csgroup.eu>
On Mon, Oct 19, 2020 at 10:54:40AM +0200, Christophe Leroy wrote:
> Le 19/10/2020 à 10:32, Segher Boessenkool a écrit :
> >The kernel should just use __always_inline if that is what it *wants*;
> >that is true here most likely. GCC could perhaps improve its heuristics
> >so that it no longer thinks these functions are often too big for
> >inlining (they *are* pretty big, but not after basic optimisations with
> >constant integer arguments).
>
> Yes I guess __always_inline is to be added on functions like this defined
> in headers for exactly that, and that's the purpose of this patch.
>
> However I find it odd that get_order() is outlined by GCC even in some
> object files that don't use it at all, for instance in fs/pipe.o
It is (arguably) too big too always inline if you do not consider that
__builtin_constant_p will remove half of the function one way or
another. Not sure if that is what happens here, but now we have a PR
(thanks!) and we will find out.
Segher
^ permalink raw reply
* Re: [PATCH 3/3] powerpc: Fix pre-update addressing in inline assembly
From: Christophe Leroy @ 2020-10-19 18:23 UTC (permalink / raw)
To: kernel test robot, Benjamin Herrenschmidt, Paul Mackerras,
Michael Ellerman
Cc: clang-built-linux, linuxppc-dev, kbuild-all, linux-kernel
In-Reply-To: <202010192300.35IC9AK7-lkp@intel.com>
Le 19/10/2020 à 17:35, kernel test robot a écrit :
> Hi Christophe,
>
> I love your patch! Yet something to improve:
>
> [auto build test ERROR on powerpc/next]
> [also build test ERROR on linus/master next-20201016]
> [cannot apply to kvm-ppc/kvm-ppc-next mpe/next v5.9]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
>
> url: https://github.com/0day-ci/linux/commits/Christophe-Leroy/powerpc-uaccess-Don-t-use-m-constraint-with-GCC-4-9/20201019-201504
> base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
> config: powerpc64-randconfig-r012-20201019 (attached as .config)
> compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 094e9f4779eb9b5c6a49014f2f80b8cbb833572f)
> reproduce (this is a W=1 build):
> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # install powerpc64 cross compiling tool for clang build
> # apt-get install binutils-powerpc64-linux-gnu
> # https://github.com/0day-ci/linux/commit/d57fd8d270993414b8c0414d7be4b03cc3de1856
> git remote add linux-review https://github.com/0day-ci/linux
> git fetch --no-tags linux-review Christophe-Leroy/powerpc-uaccess-Don-t-use-m-constraint-with-GCC-4-9/20201019-201504
> git checkout d57fd8d270993414b8c0414d7be4b03cc3de1856
> # save the attached .config to linux build tree
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
>
> All errors (new ones prefixed by >>):
>
> In file included from arch/powerpc/kernel/asm-offsets.c:14:
> In file included from include/linux/compat.h:14:
> In file included from include/linux/sem.h:5:
> In file included from include/uapi/linux/sem.h:5:
> In file included from include/linux/ipc.h:5:
> In file included from include/linux/spinlock.h:51:
> In file included from include/linux/preempt.h:78:
> In file included from ./arch/powerpc/include/generated/asm/preempt.h:1:
> In file included from include/asm-generic/preempt.h:5:
> In file included from include/linux/thread_info.h:21:
> In file included from arch/powerpc/include/asm/current.h:13:
> In file included from arch/powerpc/include/asm/paca.h:31:
> In file included from arch/powerpc/include/asm/atomic.h:13:
> In file included from arch/powerpc/include/asm/ppc_asm.h:9:
> In file included from arch/powerpc/include/asm/processor.h:40:
>>> arch/powerpc/include/asm/ptrace.h:288:20: error: use of undeclared identifier 'THREAD_SIZE'
> return ((addr & ~(THREAD_SIZE - 1)) ==
> ^
> arch/powerpc/include/asm/ptrace.h:289:35: error: use of undeclared identifier 'THREAD_SIZE'
> (kernel_stack_pointer(regs) & ~(THREAD_SIZE - 1)));
Most likely a circular inclusion problem.
I'll have to put it in a header that doesn't include pile of other stuff. The least bad candidate
seems to be asm-const.h
Christophe
> ^
> In file included from arch/powerpc/kernel/asm-offsets.c:21:
> include/linux/mman.h:137:9: warning: division by zero is undefined [-Wdivision-by-zero]
> _calc_vm_trans(flags, MAP_LOCKED, VM_LOCKED ) |
> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> include/linux/mman.h:115:21: note: expanded from macro '_calc_vm_trans'
> : ((x) & (bit1)) / ((bit1) / (bit2))))
> ^ ~~~~~~~~~~~~~~~~~
> include/linux/mman.h:138:9: warning: division by zero is undefined [-Wdivision-by-zero]
> _calc_vm_trans(flags, MAP_SYNC, VM_SYNC );
> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> include/linux/mman.h:115:21: note: expanded from macro '_calc_vm_trans'
> : ((x) & (bit1)) / ((bit1) / (bit2))))
> ^ ~~~~~~~~~~~~~~~~~
> 2 warnings and 2 errors generated.
> make[2]: *** [scripts/Makefile.build:117: arch/powerpc/kernel/asm-offsets.s] Error 1
> make[2]: Target '__build' not remade because of errors.
> make[1]: *** [Makefile:1202: prepare0] Error 2
> make[1]: Target 'prepare' not remade because of errors.
> make: *** [Makefile:185: __sub-make] Error 2
> make: Target 'prepare' not remade because of errors.
>
> vim +/THREAD_SIZE +288 arch/powerpc/include/asm/ptrace.h
>
> 359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 275
> 359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 276 /**
> 359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 277 * regs_within_kernel_stack() - check the address in the stack
> 359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 278 * @regs: pt_regs which contains kernel stack pointer.
> 359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 279 * @addr: address which is checked.
> 359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 280 *
> 359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 281 * regs_within_kernel_stack() checks @addr is within the kernel stack page(s).
> 359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 282 * If @addr is within the kernel stack, it returns true. If not, returns false.
> 359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 283 */
> 359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 284
> 359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 285 static inline bool regs_within_kernel_stack(struct pt_regs *regs,
> 359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 286 unsigned long addr)
> 359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 287 {
> 359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 @288 return ((addr & ~(THREAD_SIZE - 1)) ==
> 359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 289 (kernel_stack_pointer(regs) & ~(THREAD_SIZE - 1)));
> 359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 290 }
> 359e4284a3f37ab Mahesh Salgaonkar 2010-04-07 291
>
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
>
^ permalink raw reply
* Re: [PATCH 1/3] powerpc/uaccess: Don't use "m<>" constraint with GCC 4.9
From: Segher Boessenkool @ 2020-10-19 20:10 UTC (permalink / raw)
To: Christophe Leroy; +Cc: Paul Mackerras, linuxppc-dev, linux-kernel
In-Reply-To: <5ffcb064f695d5285bf1faab91bffa3f9245fc26.1603109522.git.christophe.leroy@csgroup.eu>
On Mon, Oct 19, 2020 at 12:12:46PM +0000, Christophe Leroy wrote:
> GCC 4.9 sometimes fails to build with "m<>" constraint in
> inline assembly.
> --- a/arch/powerpc/include/asm/uaccess.h
> +++ b/arch/powerpc/include/asm/uaccess.h
> @@ -223,7 +223,7 @@ do { \
> "1: " op "%U1%X1 %0,%1 # put_user\n" \
> EX_TABLE(1b, %l2) \
> : \
> - : "r" (x), "m<>" (*addr) \
> + : "r" (x), "m"UPD_CONSTR (*addr) \
> : \
> : label)
>
> @@ -294,7 +294,7 @@ extern long __get_user_bad(void);
> ".previous\n" \
> EX_TABLE(1b, 3b) \
> : "=r" (err), "=r" (x) \
> - : "m<>" (*addr), "i" (-EFAULT), "0" (err))
> + : "m"UPD_CONSTR (*addr), "i" (-EFAULT), "0" (err))
Wow, ugly! But these are the only two places that use this, so
Acked-by: Segher Boessenkool <segher@kernel.crashing.org>
I just hope that we get rid of 4.9 before we would use this a lot more ;-)
Segher
^ permalink raw reply
* Re: [PATCH 2/3] powerpc: Fix incorrect stw{, ux, u, x} instructions in __set_pte_at
From: Segher Boessenkool @ 2020-10-19 20:14 UTC (permalink / raw)
To: Christophe Leroy; +Cc: Paul Mackerras, linuxppc-dev, linux-kernel
In-Reply-To: <1b26e1b8544ea46ad0da102d1367694cd23c222c.1603109522.git.christophe.leroy@csgroup.eu>
On Mon, Oct 19, 2020 at 12:12:47PM +0000, Christophe Leroy wrote:
> From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>
> The placeholder for instruction selection should use the second
> argument's operand, which is %1, not %0. This could generate incorrect
> assembly code if the instruction selection for argument %0 ever differs
> from argument %1.
"Instruction selection" isn't correct here... "if the memory addressing
of operand 0 is a different form from that of operand 1", perhaps?
The patch looks fine of course :-)
Acked-by: Segher Boessenkool <segher@kernel.crashing.org>
Segher
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox