LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH 6/6] powerpc/rtas: constrain user region allocation to RMA
From: Alexey Kardashevskiy @ 2021-01-20  4:49 UTC (permalink / raw)
  To: Nathan Lynch, linuxppc-dev; +Cc: tyreld, brking, ajd, aneesh.kumar
In-Reply-To: <871regxwzh.fsf@linux.ibm.com>



On 20/01/2021 11:39, Nathan Lynch wrote:
> Alexey Kardashevskiy <aik@ozlabs.ru> writes:
>> On 16/01/2021 02:38, Nathan Lynch wrote:
>>> Alexey Kardashevskiy <aik@ozlabs.ru> writes:
>>>> On 15/01/2021 09:00, Nathan Lynch wrote:
>>>>> Memory locations passed as arguments from the OS to RTAS usually need
>>>>> to be addressable in 32-bit mode and must reside in the Real Mode
>>>>> Area. On PAPR guests, the RMA starts at logical address 0 and is the
>>>>> first logical memory block reported in the LPAR’s device tree.
>>>>>
>>>>> On powerpc targets with RTAS, Linux makes available to user space a
>>>>> region of memory suitable for arguments to be passed to RTAS via
>>>>> sys_rtas(). This region (rtas_rmo_buf) is allocated via the memblock
>>>>> API during boot in order to ensure that it satisfies the requirements
>>>>> described above.
>>>>>
>>>>> With radix MMU, the upper limit supplied to the memblock allocation
>>>>> can exceed the bounds of the first logical memory block, since
>>>>> ppc64_rma_size is ULONG_MAX and RTAS_INSTANTIATE_MAX is 1GB. (512MB is
>>>>> a common size of the first memory block according to a small sample of
>>>>> LPARs I have checked.) This leads to failures when user space invokes
>>>>> an RTAS function that uses a work area, such as
>>>>> ibm,configure-connector.
>>>>>
>>>>> Alter the determination of the upper limit for rtas_rmo_buf's
>>>>> allocation to consult the device tree directly, ensuring placement
>>>>> within the RMA regardless of the MMU in use.
>>>>
>>>> Can we tie this with RTAS (which also needs to be in RMA) and simply add
>>>> extra 64K in prom_instantiate_rtas() and advertise this address
>>>> (ALIGH_UP(rtas-base + rtas-size, PAGE_SIZE)) to the user space? We do
>>>> not need this RMO area before that point.
>>>
>>> Can you explain more about what advantage that would bring? I'm not
>>> seeing it. It's a more significant change than what I've written
>>> here.
>>
>>
>> We already allocate space for RTAS and (like RMO) it needs to be in RMA,
>> and RMO is useless without RTAS. We can reuse RTAS allocation code for
>> RMO like this:
> 
> When you say RMO I assume you are referring to rtas_rmo_buf? (I don't
> think it is well-named.)
> 
> 
>> ===
>> diff --git a/arch/powerpc/kernel/prom_init.c
>> b/arch/powerpc/kernel/prom_init.c
>> index e9d4eb6144e1..d9527d3e01d2 100644
>> --- a/arch/powerpc/kernel/prom_init.c
>> +++ b/arch/powerpc/kernel/prom_init.c
>> @@ -1821,7 +1821,8 @@ static void __init prom_instantiate_rtas(void)
>>           if (size == 0)
>>                   return;
>>
>> -       base = alloc_down(size, PAGE_SIZE, 0);
>> +       /* One page for RTAS, one for RMO */
> 
> One page for RTAS? RTAS is ~20MB on LPARs I've checked:
> 
> # lsprop /proc/device-tree/rtas/{rtas-size,linux,rtas-base}
> /proc/device-tree/rtas/rtas-size
> 		 01370000 (20381696)

You are right, I did not sleep well when replied, sorry about that :) I 
tried it with KVM where RTAS is just a few KBs (20 constant bytes + MCE 
log, depends on cpu number) so it worked for me.


> 
>> +       base = alloc_down(size, PAGE_SIZE + PAGE_SIZE, 0);
> 
> This changes the alignment but not the size of the allocation.


Should be:

base = alloc_down(ALIGN_UP(size, PAGE_SIZE) + PAGE_SIZE, PAGE_SIZE, 0);

> 
> 
>>           if (base == 0)
>>                   prom_panic("Could not allocate memory for RTAS\n");
>>
>> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
>> index d126d71ea5bd..885d95cf4ed3 100644
>> --- a/arch/powerpc/kernel/rtas.c
>> +++ b/arch/powerpc/kernel/rtas.c
>> @@ -1186,6 +1186,7 @@ void __init rtas_initialize(void)
>>           rtas.size = size;
>>           no_entry = of_property_read_u32(rtas.dev, "linux,rtas-entry",
>> &entry);
>>           rtas.entry = no_entry ? rtas.base : entry;
>> +       rtas_rmo_buf = rtas.base + PAGE_SIZE;
> 
> I think this would overlay the user region on top of the RTAS private
> data area, allowing user space to corrupt it.


Right, my bad. Should be:

rtas_rmo_buf = ALIGN_UP(rtas.base + rtas.size, PAGE_SIZE);


> 
>>
>>           /* If RTAS was found, allocate the RMO buffer for it and look for
>>            * the stop-self token if any
>> @@ -1196,11 +1197,6 @@ void __init rtas_initialize(void)
>>                   ibm_suspend_me_token = rtas_token("ibm,suspend-me");
>>           }
>>    #endif
>> -       rtas_rmo_buf = memblock_phys_alloc_range(RTAS_RMOBUF_MAX, PAGE_SIZE,
>> -                                                0, rtas_region);
>> -       if (!rtas_rmo_buf)
>> -               panic("ERROR: RTAS: Failed to allocate %lx bytes below
>> %pa\n",
>> -                     PAGE_SIZE, &rtas_region);
>> ===
>>
>> May be store in the FDT as "linux,rmo-base" next to "linux,rtas-base",
>> for clarity, as sharing symbols between prom and main kernel is a bit
>> tricky.
>>
>> The benefit is that we do not do the same thing   (== find 64K in RMA)
>> in 2 different ways and if the RMO allocated my way is broken - we'll
>> know it much sooner as RTAS itself will break too.
> 
> Implementation details aside... I'll grant that combining the
> allocations into one in prom_init reduces some duplication in the sense
> that both are subject to the same constraints (mostly - the RTAS data
> area must not cross a 256MB boundary, while the user region may). But
> they really are distinct concerns. The RTAS private data area is
> specified in the platform architecture, the OS is obligated to allocate
> it and pass it to instantiate-rtas, etc etc. However the user region
> (rtas_rmo_buf) is purely a Linux construct which is there to support
> sys_rtas.

Not purely - it should be an address which RTAS accepts. Cannot argue 
with the rest though, it all sounds correct.

> Now, there are multiple sites in the kernel proper that must allocate
> memory suitable for passing to RTAS. Obviously there is value in
> consolidating the logic for that purpose in one place, so I'll work on
> adding that in v2. OK?

Sure.


-- 
Alexey

^ permalink raw reply

* Re: [PATCH 5/6] powerpc/rtas: rename RTAS_RMOBUF_MAX to RTAS_USER_REGION_SIZE
From: Alexey Kardashevskiy @ 2021-01-20  5:05 UTC (permalink / raw)
  To: Nathan Lynch, linuxppc-dev; +Cc: tyreld, brking, ajd, aneesh.kumar
In-Reply-To: <87y2gowgo6.fsf@linux.ibm.com>



On 20/01/2021 12:17, Nathan Lynch wrote:
> Alexey Kardashevskiy <aik@ozlabs.ru> writes:
>> On 16/01/2021 02:56, Nathan Lynch wrote:
>>> Alexey Kardashevskiy <aik@ozlabs.ru> writes:
>>>> On 15/01/2021 09:00, Nathan Lynch wrote:
>>>>> diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
>>>>> index 332e1000ca0f..1aa7ab1cbc84 100644
>>>>> --- a/arch/powerpc/include/asm/rtas.h
>>>>> +++ b/arch/powerpc/include/asm/rtas.h
>>>>> @@ -19,8 +19,11 @@
>>>>>     #define RTAS_UNKNOWN_SERVICE (-1)
>>>>>     #define RTAS_INSTANTIATE_MAX (1ULL<<30) /* Don't instantiate rtas at/above this value */
>>>>>     
>>>>> -/* Buffer size for ppc_rtas system call. */
>>>>> -#define RTAS_RMOBUF_MAX (64 * 1024)
>>>>> +/* Work areas shared with RTAS must be 4K, naturally aligned. */
>>>>
>>>> Why exactly 4K and not (for example) PAGE_SIZE?
>>>
>>> 4K is a platform requirement and isn't related to Linux's configured
>>> page size. See the PAPR specification for RTAS functions such as
>>> ibm,configure-connector, ibm,update-nodes, ibm,update-properties.
>>
>> Good, since we are documenting things here - add to the comment ("per
>> PAPR")?
> 
> But almost every constant in this header relates to a specification or
> requirement in PAPR.


Yup, "almost".

> 
>>> There are other calls with work area parameters where alignment isn't
>>> specified (e.g. ibm,get-system-parameter) but 4KB alignment is a safe
>>> choice for those.
>>>
>>>>> +#define RTAS_WORK_AREA_SIZE   4096
>>>>> +
>>>>> +/* Work areas allocated for user space access. */
>>>>> +#define RTAS_USER_REGION_SIZE (RTAS_WORK_AREA_SIZE * 16)
>>>>
>>>> This is still 64K but no clarity why. There is 16 of something, what
>>>> is it?
>>>
>>> There are 16 4KB work areas in the region. I can name it
>>> RTAS_NR_USER_WORK_AREAS or similar.
>>
>>
>> Why 16? PAPR (then add "per PAPR") or we just like 16 ("should be
>> enough")?
> 
> PAPR doesn't know anything about the user region; it's a Linux
> construct. It's been 64KB since pre-git days and I'm not sure what the
> original reason is. At this point, maintaining a kernel-user ABI seems
> like enough justification for the value.

I am not arguing keeping the numbers but you are replacing one magic 
number with another and for neither it is horribly obvious where they 
came from. Is 16 the max number of concurrently running sys_rtas system 
calls? Does the userspace ensure there is no more than 16? btw where is 
that userspace code? I thought 
https://github.com/power-ras/ppc64-diag.git but no. Thanks,



-- 
Alexey

^ permalink raw reply

* Re: [PATCH v3 1/8] powerpc/uaccess: Add unsafe_copy_from_user
From: Christopher M. Riedl @ 2021-01-20  5:08 UTC (permalink / raw)
  To: Christophe Leroy, Michael Ellerman, linuxppc-dev
In-Reply-To: <1e6309a4-58fe-c6a4-6e47-d8659177846c@csgroup.eu>

On Tue Jan 19, 2021 at 11:27 AM CST, Christophe Leroy wrote:
>
>
> Le 19/01/2021 à 18:02, Christopher M. Riedl a écrit :
> > On Tue Jan 19, 2021 at 6:33 AM CST, Christophe Leroy wrote:
> >>
> >>
> >> Le 19/01/2021 à 03:11, Michael Ellerman a écrit :
> >>> "Christopher M. Riedl" <cmr@codefail.de> writes:
> >>>> On Mon Jan 11, 2021 at 7:22 AM CST, Christophe Leroy wrote:
> >>>>> Le 09/01/2021 à 04:25, Christopher M. Riedl a écrit :
> >>>>>> Implement raw_copy_from_user_allowed() which assumes that userspace read
> >>>>>> access is open. Use this new function to implement raw_copy_from_user().
> >>>>>> Finally, wrap the new function to follow the usual "unsafe_" convention
> >>>>>> of taking a label argument.
> >>>>>
> >>>>> I think there is no point implementing raw_copy_from_user_allowed(), see
> >>>>> https://github.com/linuxppc/linux/commit/4b842e4e25b1 and
> >>>>> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/8c74fc9ce8131cabb10b3e95dc0e430f396ee83e.1610369143.git.christophe.leroy@csgroup.eu/
> >>>>>
> >>>>> You should simply do:
> >>>>>
> >>>>> #define unsafe_copy_from_user(d, s, l, e) \
> >>>>> unsafe_op_wrap(__copy_tofrom_user((__force void __user *)d, s, l), e)
> >>>>>
> >>>>
> >>>> I gave this a try and the signal ops decreased by ~8K. Now, to be
> >>>> honest, I am not sure what an "acceptable" benchmark number here
> >>>> actually is - so maybe this is ok? Same loss with both radix and hash:
> >>>>
> >>>> 	|                                      | hash   | radix  |
> >>>> 	| ------------------------------------ | ------ | ------ |
> >>>> 	| linuxppc/next                        | 118693 | 133296 |
> >>>> 	| linuxppc/next w/o KUAP+KUEP          | 228911 | 228654 |
> >>>> 	| unsafe-signal64                      | 200480 | 234067 |
> >>>> 	| unsafe-signal64 (__copy_tofrom_user) | 192467 | 225119 |
> >>>>
> >>>> To put this into perspective, prior to KUAP and uaccess flush, signal
> >>>> performance in this benchmark was ~290K on hash.
> >>>
> >>> If I'm doing the math right 8K is ~4% of the best number.
> >>>
> >>> It seems like 4% is worth a few lines of code to handle these constant
> >>> sizes. It's not like we have performance to throw away.
> >>>
> >>> Or, we should chase down where the call sites are that are doing small
> >>> constant copies with copy_to/from_user() and change them to use
> >>> get/put_user().
> >>>
> >>
> >> Christopher, when you say you gave it a try, is I my series or only the
> >> following ?
> >>
> >> #define unsafe_copy_from_user(d, s, l, e) \
> >> unsafe_op_wrap(__copy_tofrom_user((__force void __user *)d, s, l), e)
> >>
> > 
> > I only used the above to replace this patch in my series (so none of my
> > changes implementing raw_copy_from_user_allowed() are included).
>
> Then I see no reason why the performance would be different, because you
> only call
> unsafe_copy_from_user() with non trivial lengthes.
>

Ok I made a mistake - this actually included the other improvements from
your feedback on this series; specifically moving the unsafe_get_user()
call to read the msr regs value into the #ifdef. My first pass resulted
in another __get_user() call which explains some of the perf loss.

With that mistake fixed, the benchmark performance is still only ~197k
on hash (consistent). I agree that there are no places where
unsafe_copy_from_user() is called with trivial lengths so the only
conclusion I can draw is that the changes in this patch marginally
speed-up the other __copy_from_user() calls. I started comparing the
disassembly but nothing immediately obvious stands out to me. In fact, I
observe the speed-up even if I keep this patch _and_ apply this change:

#define unsafe_copy_from_user(d, s, l, e) \
unsafe_op_wrap(__copy_tofrom_user((__force void __user *)d, s, l), e)

Related to that, I think sigset_t is always 8B on ppc64 so these will
probably need a wrapper if we pick-up your series to get rid of trivial
size optimizations:

__copy_from_user(&set, &uc->uc_sigmask, sizeof(set))

I can bring performance up to ~200K on hash again by replacing the two
__copy_from_user(&set, ...) with direct calls to __get_user_size() (it's
kind of hacky I think). See the last commit in this tree:

https://git.sr.ht/~cmr/linux/log/unsafe-signal64-v4

All my comments apply to performance on radix as well.

> > 
> >>
> >> Because I see no use of unsafe_copy_from_user() that would explain that.
> >>
> >> Christophe


^ permalink raw reply

* Re: [PATCH] powerpc/47x: Disable 256k page size
From: Michael Ellerman @ 2021-01-20  5:45 UTC (permalink / raw)
  To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <2fed79b1154c872194f98bac4422c23918325e61.1611039590.git.christophe.leroy@csgroup.eu>

Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> PPC47x_TLBE_SIZE isn't defined for 256k pages, so
> this size of page shall not be selected for 47x.
>
> Reported-by: kernel test robot <lkp@intel.com>
> Fixes: e7f75ad01d59 ("powerpc/47x: Base ppc476 support")
> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> ---
>  arch/powerpc/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 107bb4319e0e..a685e42d3993 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -772,7 +772,7 @@ config PPC_64K_PAGES
>  
>  config PPC_256K_PAGES
>  	bool "256k page size"
> -	depends on 44x && !STDBINUTILS
> +	depends on 44x && !STDBINUTILS && !PPC_47x

Do we still need this STDBINUTILS thing?

It's pretty gross, and I notice we have zero defconfigs which disable
it, meaning it's only randconfig builds that will ever test 256K pages.

Can we just drop it and say if you enable 256K pages you need to know
what you're doing?

cheers

^ permalink raw reply

* Re: [PATCH] powerpc/47x: Disable 256k page size
From: Christophe Leroy @ 2021-01-20  6:31 UTC (permalink / raw)
  To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <87h7ncqhz1.fsf@mpe.ellerman.id.au>



Le 20/01/2021 à 06:45, Michael Ellerman a écrit :
> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>> PPC47x_TLBE_SIZE isn't defined for 256k pages, so
>> this size of page shall not be selected for 47x.
>>
>> Reported-by: kernel test robot <lkp@intel.com>
>> Fixes: e7f75ad01d59 ("powerpc/47x: Base ppc476 support")
>> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
>> ---
>>   arch/powerpc/Kconfig | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>> index 107bb4319e0e..a685e42d3993 100644
>> --- a/arch/powerpc/Kconfig
>> +++ b/arch/powerpc/Kconfig
>> @@ -772,7 +772,7 @@ config PPC_64K_PAGES
>>   
>>   config PPC_256K_PAGES
>>   	bool "256k page size"
>> -	depends on 44x && !STDBINUTILS
>> +	depends on 44x && !STDBINUTILS && !PPC_47x
> 
> Do we still need this STDBINUTILS thing?
> 
> It's pretty gross, and I notice we have zero defconfigs which disable
> it, meaning it's only randconfig builds that will ever test 256K pages.
> 
> Can we just drop it and say if you enable 256K pages you need to know
> what you're doing?

I guess we can, yes. I'll send a patch for that.

Christophe

> 

^ permalink raw reply

* [PATCH 1/2] powerpc/32s: Add missing call to kuep_lock on syscall entry
From: Christophe Leroy @ 2021-01-20  7:33 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel

Userspace Execution protection and fast syscall entry were implemented
independently from each other and were both merged in kernel 5.2,
leading to syscall entry missing userspace execution protection.

On syscall entry, execution of user space memory must be
locked in the same way as on exception entry.

Fixes: b86fb88855ea ("powerpc/32: implement fast entry for syscalls on non BOOKE")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/kernel/entry_32.S | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 1c9b0ccc2172..9bc4e7dd0bee 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -356,6 +356,9 @@ trace_syscall_entry_irq_off:
 
 	.globl	transfer_to_syscall
 transfer_to_syscall:
+#ifdef CONFIG_PPC_BOOK3S_32
+	kuep_lock r11, r12
+#endif
 #ifdef CONFIG_TRACE_IRQFLAGS
 	andi.	r12,r9,MSR_EE
 	beq-	trace_syscall_entry_irq_off
-- 
2.25.0


^ permalink raw reply related

* [PATCH 2/2] powerpc/32s: Unroll kuep_lock and kuep_unlock macros
From: Christophe Leroy @ 2021-01-20  7:34 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <00e462bd8b271bed0fec62dc9b25ffd4e0ec19cf.1611128021.git.christophe.leroy@csgroup.eu>

Unroll the loops in kuep_lock and kuep_unlock.

Benchmarked on an mpc 8321 with a standard kernel having a
3M/1M user/kernel memory split, i.e. 12 segments for user.

Without KUEP, null_syscall benchmark is 220 cycles.
With KUEP, null_syscall benchmark is 439 cycles.

Once loops are unrolled, null_syscall benchmark is 366 cycles.
This is almost 17% reduction.

It is assumed that userspace covers at least 4 segments and
at most 14 segments.

The isync is removed, it saves 8 cycles. For kuep_unlock, the rfi
will do the synchronisation. For kuep_lock, we get a small window
during which exec is still possible, but is won't last more than a
few instructions.

Both macros are called two times so the size increase is in
the noise (approx 120 instructions).

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/include/asm/book3s/32/kup.h | 67 ++++++++++++++++++------
 1 file changed, 52 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/kup.h b/arch/powerpc/include/asm/book3s/32/kup.h
index a0117a9d5b06..e800b515ac02 100644
--- a/arch/powerpc/include/asm/book3s/32/kup.h
+++ b/arch/powerpc/include/asm/book3s/32/kup.h
@@ -7,21 +7,61 @@
 
 #ifdef __ASSEMBLY__
 
-.macro kuep_update_sr	gpr1, gpr2		/* NEVER use r0 as gpr2 due to addis */
-101:	mtsrin	\gpr1, \gpr2
-	addi	\gpr1, \gpr1, 0x111		/* next VSID */
-	rlwinm	\gpr1, \gpr1, 0, 0xf0ffffff	/* clear VSID overflow */
-	addis	\gpr2, \gpr2, 0x1000		/* address of next segment */
-	bdnz	101b
-	isync
+.macro kuep_increment gpr1, gpr2
+	addi	\gpr1, \gpr1, 0x222		/* Next second VSID */
+	addi	\gpr2, \gpr2, 0x222		/* Next second VSID */
+	rlwinm	\gpr1, \gpr1, 0, 0xf0ffffff	/* Clear VSID overflow */
+	rlwinm	\gpr2, \gpr2, 0, 0xf0ffffff	/* Clear VSID overflow */
+.endm
+
+.macro kuep_update_sr gpr1, gpr2		/* NEVER use r0 as gpr1 or gpr2 due to addi */
+	addi	\gpr2, \gpr1, 0x111		/* Next VSID */
+	rlwinm	\gpr2, \gpr2, 0, 0xf0ffffff	/* Clear VSID overflow */
+	mtsr	0, \gpr1
+	mtsr	1, \gpr2
+	kuep_increment \gpr1, \gpr2
+	mtsr	2, \gpr1
+	mtsr	3, \gpr2
+#if NUM_USER_SEGMENTS > 4
+	kuep_increment \gpr1, \gpr2
+	mtsr	4, \gpr1
+#if NUM_USER_SEGMENTS > 5
+	mtsr	5, \gpr2
+#if NUM_USER_SEGMENTS > 6
+	kuep_increment \gpr1, \gpr2
+	mtsr	6, \gpr1
+#if NUM_USER_SEGMENTS > 7
+	mtsr	7, \gpr2
+#if NUM_USER_SEGMENTS > 8
+	kuep_increment \gpr1, \gpr2
+	mtsr	8, \gpr1
+#if NUM_USER_SEGMENTS > 9
+	mtsr	9, \gpr2
+#if NUM_USER_SEGMENTS > 10
+	kuep_increment \gpr1, \gpr2
+	mtsr	10, \gpr1
+#if NUM_USER_SEGMENTS > 11
+	mtsr	11, \gpr2
+#if NUM_USER_SEGMENTS > 12
+	kuep_increment \gpr1, \gpr2
+	mtsr	12, \gpr1
+#if NUM_USER_SEGMENTS > 13
+	mtsr	13, \gpr2
+#endif
+#endif
+#endif
+#endif
+#endif
+#endif
+#endif
+#endif
+#endif
+#endif
 .endm
 
 .macro kuep_lock	gpr1, gpr2
 #ifdef CONFIG_PPC_KUEP
-	li	\gpr1, NUM_USER_SEGMENTS
-	li	\gpr2, 0
-	mtctr	\gpr1
-	mfsrin	\gpr1, \gpr2
+	mfsr	\gpr1, 0
 	oris	\gpr1, \gpr1, SR_NX@h		/* set Nx */
 	kuep_update_sr \gpr1, \gpr2
 #endif
@@ -29,10 +69,7 @@
 
 .macro kuep_unlock	gpr1, gpr2
 #ifdef CONFIG_PPC_KUEP
-	li	\gpr1, NUM_USER_SEGMENTS
-	li	\gpr2, 0
-	mtctr	\gpr1
-	mfsrin	\gpr1, \gpr2
+	mfsr	\gpr1, 0
 	rlwinm	\gpr1, \gpr1, 0, ~SR_NX		/* Clear Nx */
 	kuep_update_sr \gpr1, \gpr2
 #endif
-- 
2.25.0


^ permalink raw reply related

* [PATCH 1/2] powerpc/47x: Disable 256k page size
From: Christophe Leroy @ 2021-01-20  7:49 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel

PPC47x_TLBE_SIZE isn't defined for 256k pages, so
this size of page shall not be selected for 47x.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: e7f75ad01d59 ("powerpc/47x: Base ppc476 support")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 107bb4319e0e..a685e42d3993 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -772,7 +772,7 @@ config PPC_64K_PAGES
 
 config PPC_256K_PAGES
 	bool "256k page size"
-	depends on 44x && !STDBINUTILS
+	depends on 44x && !STDBINUTILS && !PPC_47x
 	help
 	  Make the page size 256k.
 
-- 
2.25.0


^ permalink raw reply related

* [PATCH 2/2] powerpc/44x: Remove STDBINUTILS kconfig option
From: Christophe Leroy @ 2021-01-20  7:49 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <2fed79b1154c872194f98bac4422c23918325e61.1611128938.git.christophe.leroy@csgroup.eu>

STDBINUTILS is just a toggle to allow 256k page size
to appear in the possible page sizes list for the 44x.

Make 256k page size appear all the time with an
explicit warning on binutils, and remove this unneccessary
STDBINUTILS config option.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/Kconfig | 27 +++++++--------------------
 1 file changed, 7 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index a685e42d3993..3e29995540a7 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -720,18 +720,6 @@ config ARCH_MEMORY_PROBE
 	def_bool y
 	depends on MEMORY_HOTPLUG
 
-config STDBINUTILS
-	bool "Using standard binutils settings"
-	depends on 44x
-	default y
-	help
-	  Turning this option off allows you to select 256KB PAGE_SIZE on 44x.
-	  Note, that kernel will be able to run only those applications,
-	  which had been compiled using binutils later than 2.17.50.0.3 with
-	  '-zmax-page-size' set to 256K (the default is 64K). Or, if using
-	  the older binutils, you can patch them with a trivial patch, which
-	  changes the ELF_MAXPAGESIZE definition from 0x10000 to 0x40000.
-
 choice
 	prompt "Page size"
 	default PPC_4K_PAGES
@@ -771,17 +759,16 @@ config PPC_64K_PAGES
 	select HAVE_ARCH_SOFT_DIRTY if PPC_BOOK3S_64
 
 config PPC_256K_PAGES
-	bool "256k page size"
-	depends on 44x && !STDBINUTILS && !PPC_47x
+	bool "256k page size (Requires non-standard binutils settings)"
+	depends on 44x && !PPC_47x
 	help
 	  Make the page size 256k.
 
-	  As the ELF standard only requires alignment to support page
-	  sizes up to 64k, you will need to compile all of your user
-	  space applications with a non-standard binutils settings
-	  (see the STDBINUTILS description for details).
-
-	  Say N unless you know what you are doing.
+	  That kernel will be able to run only those applications,
+	  which had been compiled using binutils later than 2.17.50.0.3 with
+	  '-zmax-page-size' set to 256K (the default is 64K). Or, if using
+	  the older binutils, you can patch them with a trivial patch, which
+	  changes the ELF_MAXPAGESIZE definition from 0x10000 to 0x40000.
 
 endchoice
 
-- 
2.25.0


^ permalink raw reply related

* [PATCH] powerpc/64: prevent replayed interrupt handlers from running softirqs
From: Nicholas Piggin @ 2021-01-20  7:50 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

Running softirqs enables interrupts, which can then end up recursing
into the irq soft-mask code we're adjusting, including replaying
interrupts itself, which might be theoretically unbounded.

This abridged trace shows how this can occur:

! NIP replay_soft_interrupts
  LR  interrupt_exit_kernel_prepare
  Call Trace:
    interrupt_exit_kernel_prepare (unreliable)
    interrupt_return
  --- interrupt: ea0 at __rb_reserve_next
  NIP __rb_reserve_next
  LR __rb_reserve_next
  Call Trace:
    ring_buffer_lock_reserve
    trace_function
    function_trace_call
    ftrace_call
    __do_softirq
    irq_exit
    timer_interrupt
!   replay_soft_interrupts
    interrupt_exit_kernel_prepare
    interrupt_return
  --- interrupt: ea0 at arch_local_irq_restore

Fix this by disabling bhs (softirqs) around the interrupt replay.

I don't know that commit 3282a3da25bd ("powerpc/64: Implement soft
interrupt replay in C") actually introduced the problem. Prior to that
change, the interrupt replay seems like it should still be subect to
this recusion, however it's done after all the irq state has been fixed
up at the end of the replay, so it seems reasonable to fix back to this
commit.

Fixes: 3282a3da25bd ("powerpc/64: Implement soft interrupt replay in C")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/irq.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 6b1eca53e36c..7064135f9dc3 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -188,6 +188,18 @@ void replay_soft_interrupts(void)
 	unsigned char happened = local_paca->irq_happened;
 	struct pt_regs regs;
 
+	/*
+	 * Prevent softirqs from being run when an interrupt handler returns
+	 * and calls irq_exit(), because softirq processing enables interrupts.
+	 * If an interrupt is taken, it may then call replay_soft_interrupts
+	 * on its way out, which gets messy and recursive.
+	 *
+	 * softirqs created by replayed interrupts will be run at the end of
+	 * this function when bhs are enabled (if they were enabled in our
+	 * caller).
+	 */
+	local_bh_disable();
+
 	ppc_save_regs(&regs);
 	regs.softe = IRQS_ENABLED;
 
@@ -263,6 +275,8 @@ void replay_soft_interrupts(void)
 		trace_hardirqs_off();
 		goto again;
 	}
+
+	local_bh_enable();
 }
 
 notrace void arch_local_irq_restore(unsigned long mask)
-- 
2.23.0


^ permalink raw reply related

* RE: [PATCH 2/2] powerpc/44x: Remove STDBINUTILS kconfig option
From: David Laight @ 2021-01-20  9:33 UTC (permalink / raw)
  To: 'Christophe Leroy', Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman
  Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
In-Reply-To: <f9981e819009aa121a998dc483052ec76f78f991.1611128938.git.christophe.leroy@csgroup.eu>

From: Christophe Leroy
> Sent: 20 January 2021 07:49
> 
> STDBINUTILS is just a toggle to allow 256k page size
> to appear in the possible page sizes list for the 44x.
> 
> Make 256k page size appear all the time with an
> explicit warning on binutils, and remove this unneccessary
> STDBINUTILS config option.
> 
> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> ---
>  arch/powerpc/Kconfig | 27 +++++++--------------------
>  1 file changed, 7 insertions(+), 20 deletions(-)
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index a685e42d3993..3e29995540a7 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -720,18 +720,6 @@ config ARCH_MEMORY_PROBE
>  	def_bool y
>  	depends on MEMORY_HOTPLUG
> 
> -config STDBINUTILS
> -	bool "Using standard binutils settings"
> -	depends on 44x
> -	default y
> -	help
> -	  Turning this option off allows you to select 256KB PAGE_SIZE on 44x.
> -	  Note, that kernel will be able to run only those applications,
> -	  which had been compiled using binutils later than 2.17.50.0.3 with
> -	  '-zmax-page-size' set to 256K (the default is 64K). Or, if using
> -	  the older binutils, you can patch them with a trivial patch, which
> -	  changes the ELF_MAXPAGESIZE definition from 0x10000 to 0x40000.
> -
>  choice
>  	prompt "Page size"
>  	default PPC_4K_PAGES
> @@ -771,17 +759,16 @@ config PPC_64K_PAGES
>  	select HAVE_ARCH_SOFT_DIRTY if PPC_BOOK3S_64
> 
>  config PPC_256K_PAGES
> -	bool "256k page size"
> -	depends on 44x && !STDBINUTILS && !PPC_47x
> +	bool "256k page size (Requires non-standard binutils settings)"
> +	depends on 44x && !PPC_47x
>  	help
>  	  Make the page size 256k.
> 
> -	  As the ELF standard only requires alignment to support page
> -	  sizes up to 64k, you will need to compile all of your user
> -	  space applications with a non-standard binutils settings
> -	  (see the STDBINUTILS description for details).
> -
> -	  Say N unless you know what you are doing.
> +	  That kernel will be able to run only those applications,
> +	  which had been compiled using binutils later than 2.17.50.0.3 with
> +	  '-zmax-page-size' set to 256K (the default is 64K). Or, if using
> +	  the older binutils, you can patch them with a trivial patch, which
> +	  changes the ELF_MAXPAGESIZE definition from 0x10000 to 0x40000.


The kernel will only be able to run applications that have been
compiled with '-zmax-page-size' set to 256K (the default is 64K)
using binutils later than 2.17.50.0.3, or by patching the
ELF_MAXPAGESIZE definition from 0x10000 to 0x40000 in older versions.

> 
>  endchoice
> 
> --
> 2.25.0

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply

* [PATCH] [PATCH] powerpc/sstep: Check ISA 3.0 instruction validity before emulation
From: Ananth N Mavinakayanahalli @ 2021-01-20  9:46 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: naveen.n.rao, ravi.bangoria, dja

We currently unconditionally try to newer emulate instructions on older
Power versions that could cause issues. Gate it.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@linux.ibm.com>
---
 arch/powerpc/lib/sstep.c |   40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index bf7a7d62ae8b..ed119858e5e9 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -1528,6 +1528,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 		goto compute_done;
 
 	case 19:
+		if (!cpu_has_feature(CPU_FTR_ARCH_300))
+			return -1;
 		if (((word >> 1) & 0x1f) == 2) {
 			/* addpcis */
 			imm = (short) (word & 0xffc1);	/* d0 + d2 fields */
@@ -2439,6 +2441,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 268:	/* lxvx */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(LOAD_VSX, 0, 16);
 			op->element_size = 16;
@@ -2448,6 +2452,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 		case 269:	/* lxvl */
 		case 301: {	/* lxvll */
 			int nb;
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->ea = ra ? regs->gpr[ra] : 0;
 			nb = regs->gpr[rb] & 0xff;
@@ -2475,6 +2481,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 364:	/* lxvwsx */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(LOAD_VSX, 0, 4);
 			op->element_size = 4;
@@ -2482,6 +2490,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 396:	/* stxvx */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(STORE_VSX, 0, 16);
 			op->element_size = 16;
@@ -2491,6 +2501,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 		case 397:	/* stxvl */
 		case 429: {	/* stxvll */
 			int nb;
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->ea = ra ? regs->gpr[ra] : 0;
 			nb = regs->gpr[rb] & 0xff;
@@ -2542,6 +2554,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 781:	/* lxsibzx */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(LOAD_VSX, 0, 1);
 			op->element_size = 8;
@@ -2549,6 +2563,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 812:	/* lxvh8x */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(LOAD_VSX, 0, 16);
 			op->element_size = 2;
@@ -2556,6 +2572,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 813:	/* lxsihzx */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(LOAD_VSX, 0, 2);
 			op->element_size = 8;
@@ -2569,6 +2587,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 876:	/* lxvb16x */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(LOAD_VSX, 0, 16);
 			op->element_size = 1;
@@ -2582,6 +2602,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 909:	/* stxsibx */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(STORE_VSX, 0, 1);
 			op->element_size = 8;
@@ -2589,6 +2611,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 940:	/* stxvh8x */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(STORE_VSX, 0, 16);
 			op->element_size = 2;
@@ -2596,6 +2620,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 941:	/* stxsihx */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(STORE_VSX, 0, 2);
 			op->element_size = 8;
@@ -2609,6 +2635,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 1004:	/* stxvb16x */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(STORE_VSX, 0, 16);
 			op->element_size = 1;
@@ -2717,12 +2745,16 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			op->type = MKOP(LOAD_FP, 0, 16);
 			break;
 		case 2:		/* lxsd */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd + 32;
 			op->type = MKOP(LOAD_VSX, 0, 8);
 			op->element_size = 8;
 			op->vsx_flags = VSX_CHECK_VEC;
 			break;
 		case 3:		/* lxssp */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd + 32;
 			op->type = MKOP(LOAD_VSX, 0, 4);
 			op->element_size = 8;
@@ -2775,6 +2807,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 1:		/* lxv */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->ea = dqform_ea(word, regs);
 			if (word & 8)
 				op->reg = rd + 32;
@@ -2785,6 +2819,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 
 		case 2:		/* stxsd with LSB of DS field = 0 */
 		case 6:		/* stxsd with LSB of DS field = 1 */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->ea = dsform_ea(word, regs);
 			op->reg = rd + 32;
 			op->type = MKOP(STORE_VSX, 0, 8);
@@ -2794,6 +2830,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 
 		case 3:		/* stxssp with LSB of DS field = 0 */
 		case 7:		/* stxssp with LSB of DS field = 1 */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->ea = dsform_ea(word, regs);
 			op->reg = rd + 32;
 			op->type = MKOP(STORE_VSX, 0, 4);
@@ -2802,6 +2840,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 5:		/* stxv */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->ea = dqform_ea(word, regs);
 			if (word & 8)
 				op->reg = rd + 32;



^ permalink raw reply related

* Re: [PATCH] [PATCH] powerpc/sstep: Check ISA 3.0 instruction validity before emulation
From: Naveen N. Rao @ 2021-01-20 10:14 UTC (permalink / raw)
  To: Ananth N Mavinakayanahalli; +Cc: naveen.n.rao, linuxppc-dev, ravi.bangoria, dja
In-Reply-To: <161113596420.206556.5023431229030762544.stgit@thinktux.local>

On 2021/01/20 03:16PM, Ananth N Mavinakayanahalli wrote:
> We currently unconditionally try to newer emulate instructions on older
				      ^^^^^ never?
				      Or: "emulate newer"?

> Power versions that could cause issues. Gate it.
> 
> Signed-off-by: Ananth N Mavinakayanahalli <ananth@linux.ibm.com>
> ---
>  arch/powerpc/lib/sstep.c |   40 ++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 40 insertions(+)

Thanks!

> 
> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index bf7a7d62ae8b..ed119858e5e9 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -1528,6 +1528,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
>  		goto compute_done;
>  
>  	case 19:
> +		if (!cpu_has_feature(CPU_FTR_ARCH_300))
> +			return -1;
>  		if (((word >> 1) & 0x1f) == 2) {
>  			/* addpcis */
>  			imm = (short) (word & 0xffc1);	/* d0 + d2 fields */

The cpu feature check should be within the if condition above since 
there are other instructions under opcode 19. This is not an issue right 
now as we don't emulate any of the others after this point, but it would 
be good to restrict the change to specific instructions.

Rest of the changes below look good to me. The only other v3.0 
instruction we need to gate is the 'scv' instruction. It would be good 
to handle that too.

- Naveen


^ permalink raw reply

* [PATCH] [PATCH V2] powerpc/sstep: Check ISA 3.0 instruction validity before emulation
From: Ananth N Mavinakayanahalli @ 2021-01-20 10:27 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: naveen.n.rao, ravi.bangoria, dja

We currently unconditionally try to emulate newer instructions on older
Power versions that could cause issues. Gate it.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@linux.ibm.com>
---
 arch/powerpc/lib/sstep.c |   40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index bf7a7d62ae8b..ed119858e5e9 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -1528,6 +1528,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 		goto compute_done;
 
 	case 19:
+		if (!cpu_has_feature(CPU_FTR_ARCH_300))
+			return -1;
 		if (((word >> 1) & 0x1f) == 2) {
 			/* addpcis */
 			imm = (short) (word & 0xffc1);	/* d0 + d2 fields */
@@ -2439,6 +2441,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 268:	/* lxvx */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(LOAD_VSX, 0, 16);
 			op->element_size = 16;
@@ -2448,6 +2452,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 		case 269:	/* lxvl */
 		case 301: {	/* lxvll */
 			int nb;
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->ea = ra ? regs->gpr[ra] : 0;
 			nb = regs->gpr[rb] & 0xff;
@@ -2475,6 +2481,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 364:	/* lxvwsx */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(LOAD_VSX, 0, 4);
 			op->element_size = 4;
@@ -2482,6 +2490,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 396:	/* stxvx */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(STORE_VSX, 0, 16);
 			op->element_size = 16;
@@ -2491,6 +2501,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 		case 397:	/* stxvl */
 		case 429: {	/* stxvll */
 			int nb;
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->ea = ra ? regs->gpr[ra] : 0;
 			nb = regs->gpr[rb] & 0xff;
@@ -2542,6 +2554,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 781:	/* lxsibzx */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(LOAD_VSX, 0, 1);
 			op->element_size = 8;
@@ -2549,6 +2563,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 812:	/* lxvh8x */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(LOAD_VSX, 0, 16);
 			op->element_size = 2;
@@ -2556,6 +2572,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 813:	/* lxsihzx */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(LOAD_VSX, 0, 2);
 			op->element_size = 8;
@@ -2569,6 +2587,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 876:	/* lxvb16x */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(LOAD_VSX, 0, 16);
 			op->element_size = 1;
@@ -2582,6 +2602,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 909:	/* stxsibx */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(STORE_VSX, 0, 1);
 			op->element_size = 8;
@@ -2589,6 +2611,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 940:	/* stxvh8x */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(STORE_VSX, 0, 16);
 			op->element_size = 2;
@@ -2596,6 +2620,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 941:	/* stxsihx */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(STORE_VSX, 0, 2);
 			op->element_size = 8;
@@ -2609,6 +2635,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 1004:	/* stxvb16x */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(STORE_VSX, 0, 16);
 			op->element_size = 1;
@@ -2717,12 +2745,16 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			op->type = MKOP(LOAD_FP, 0, 16);
 			break;
 		case 2:		/* lxsd */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd + 32;
 			op->type = MKOP(LOAD_VSX, 0, 8);
 			op->element_size = 8;
 			op->vsx_flags = VSX_CHECK_VEC;
 			break;
 		case 3:		/* lxssp */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd + 32;
 			op->type = MKOP(LOAD_VSX, 0, 4);
 			op->element_size = 8;
@@ -2775,6 +2807,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 1:		/* lxv */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->ea = dqform_ea(word, regs);
 			if (word & 8)
 				op->reg = rd + 32;
@@ -2785,6 +2819,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 
 		case 2:		/* stxsd with LSB of DS field = 0 */
 		case 6:		/* stxsd with LSB of DS field = 1 */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->ea = dsform_ea(word, regs);
 			op->reg = rd + 32;
 			op->type = MKOP(STORE_VSX, 0, 8);
@@ -2794,6 +2830,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 
 		case 3:		/* stxssp with LSB of DS field = 0 */
 		case 7:		/* stxssp with LSB of DS field = 1 */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->ea = dsform_ea(word, regs);
 			op->reg = rd + 32;
 			op->type = MKOP(STORE_VSX, 0, 4);
@@ -2802,6 +2840,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 5:		/* stxv */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->ea = dqform_ea(word, regs);
 			if (word & 8)
 				op->reg = rd + 32;



^ permalink raw reply related

* Re: [PATCH] [PATCH] powerpc/sstep: Check ISA 3.0 instruction validity before emulation
From: Ananth N Mavinakayanahalli @ 2021-01-20 10:40 UTC (permalink / raw)
  To: Naveen N. Rao; +Cc: linuxppc-dev, ravi.bangoria, dja
In-Reply-To: <20210120101445.GA80@DESKTOP-TDPLP67.localdomain>

On 1/20/21 3:44 PM, Naveen N. Rao wrote:
> On 2021/01/20 03:16PM, Ananth N Mavinakayanahalli wrote:
...

>> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
>> index bf7a7d62ae8b..ed119858e5e9 100644
>> --- a/arch/powerpc/lib/sstep.c
>> +++ b/arch/powerpc/lib/sstep.c
>> @@ -1528,6 +1528,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
>>   		goto compute_done;
>>   
>>   	case 19:
>> +		if (!cpu_has_feature(CPU_FTR_ARCH_300))
>> +			return -1;
>>   		if (((word >> 1) & 0x1f) == 2) {
>>   			/* addpcis */
>>   			imm = (short) (word & 0xffc1);	/* d0 + d2 fields */
> 
> The cpu feature check should be within the if condition above since
> there are other instructions under opcode 19. This is not an issue right
> now as we don't emulate any of the others after this point, but it would
> be good to restrict the change to specific instructions.
> 
> Rest of the changes below look good to me. The only other v3.0
> instruction we need to gate is the 'scv' instruction. It would be good
> to handle that too.

I missed this in v2.. will send a v3. There is a bigger change needed to
accommodate scv since we currently don't check for it in analyze_insn().

-- 
Ananth

^ permalink raw reply

* Re: [PATCH v6 25/39] powerpc: convert interrupt handlers to use wrappers
From: kernel test robot @ 2021-01-20 10:48 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev; +Cc: kbuild-all, Nicholas Piggin
In-Reply-To: <20210115165012.1260253-26-npiggin@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4075 bytes --]

Hi Nicholas,

I love your patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on v5.11-rc4 next-20210120]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Nicholas-Piggin/powerpc-interrupt-wrappers/20210116-023244
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-skiroot_defconfig (attached as .config)
compiler: powerpc64le-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/04d5131f1545e1e992962a5339135b605eb421a5
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Nicholas-Piggin/powerpc-interrupt-wrappers/20210116-023244
        git checkout 04d5131f1545e1e992962a5339135b605eb421a5
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from arch/powerpc/mm/book3s64/hash_utils.c:41:
>> arch/powerpc/mm/book3s64/hash_utils.c:1516:30: error: no previous prototype for '__do_hash_fault' [-Werror=missing-prototypes]
    1516 | DEFINE_INTERRUPT_HANDLER_RET(__do_hash_fault)
         |                              ^~~~~~~~~~~~~~~
   arch/powerpc/include/asm/interrupt.h:150:24: note: in definition of macro 'DEFINE_INTERRUPT_HANDLER_RET'
     150 | __visible noinstr long func(struct pt_regs *regs)   \
         |                        ^~~~
   arch/powerpc/mm/book3s64/hash_utils.c:1905:6: error: no previous prototype for 'hpte_insert_repeating' [-Werror=missing-prototypes]
    1905 | long hpte_insert_repeating(unsigned long hash, unsigned long vpn,
         |      ^~~~~~~~~~~~~~~~~~~~~
   cc1: all warnings being treated as errors


vim +/__do_hash_fault +1516 arch/powerpc/mm/book3s64/hash_utils.c

  1515	
> 1516	DEFINE_INTERRUPT_HANDLER_RET(__do_hash_fault)
  1517	{
  1518		unsigned long ea = regs->dar;
  1519		unsigned long dsisr = regs->dsisr;
  1520		unsigned long access = _PAGE_PRESENT | _PAGE_READ;
  1521		unsigned long flags = 0;
  1522		struct mm_struct *mm;
  1523		unsigned int region_id;
  1524		long err;
  1525	
  1526		region_id = get_region_id(ea);
  1527		if ((region_id == VMALLOC_REGION_ID) || (region_id == IO_REGION_ID))
  1528			mm = &init_mm;
  1529		else
  1530			mm = current->mm;
  1531	
  1532		if (dsisr & DSISR_NOHPTE)
  1533			flags |= HPTE_NOHPTE_UPDATE;
  1534	
  1535		if (dsisr & DSISR_ISSTORE)
  1536			access |= _PAGE_WRITE;
  1537		/*
  1538		 * We set _PAGE_PRIVILEGED only when
  1539		 * kernel mode access kernel space.
  1540		 *
  1541		 * _PAGE_PRIVILEGED is NOT set
  1542		 * 1) when kernel mode access user space
  1543		 * 2) user space access kernel space.
  1544		 */
  1545		access |= _PAGE_PRIVILEGED;
  1546		if (user_mode(regs) || (region_id == USER_REGION_ID))
  1547			access &= ~_PAGE_PRIVILEGED;
  1548	
  1549		if (regs->trap == 0x400)
  1550			access |= _PAGE_EXEC;
  1551	
  1552		err = hash_page_mm(mm, ea, access, regs->trap, flags);
  1553		if (unlikely(err < 0)) {
  1554			// failed to instert a hash PTE due to an hypervisor error
  1555			if (user_mode(regs)) {
  1556				if (IS_ENABLED(CONFIG_PPC_SUBPAGE_PROT) && err == -2)
  1557					_exception(SIGSEGV, regs, SEGV_ACCERR, ea);
  1558				else
  1559					_exception(SIGBUS, regs, BUS_ADRERR, ea);
  1560			} else {
  1561				bad_page_fault(regs, SIGBUS);
  1562			}
  1563			err = 0;
  1564		}
  1565	
  1566		return err;
  1567	}
  1568	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 21496 bytes --]

^ permalink raw reply

* [PATCH] PCI: dwc: layerscape: convert to builtin_platform_driver()
From: Michael Walle @ 2021-01-20 10:52 UTC (permalink / raw)
  To: linuxppc-dev, linux-pci, linux-arm-kernel, linux-kernel
  Cc: Rob Herring, Lorenzo Pieralisi, Saravana Kannan, Roy Zang,
	Greg Kroah-Hartman, Minghuan Lian, Michael Walle, Bjorn Helgaas,
	Mingkai Hu

fw_devlink will defer the probe until all suppliers are ready. We can't
use builtin_platform_driver_probe() because it doesn't retry after probe
deferral. Convert it to builtin_platform_driver().

Fixes: e590474768f1 ("driver core: Set fw_devlink=on by default")
Signed-off-by: Michael Walle <michael@walle.cc>
---
 drivers/pci/controller/dwc/pci-layerscape.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/controller/dwc/pci-layerscape.c b/drivers/pci/controller/dwc/pci-layerscape.c
index 44ad34cdc3bc..5b9c625df7b8 100644
--- a/drivers/pci/controller/dwc/pci-layerscape.c
+++ b/drivers/pci/controller/dwc/pci-layerscape.c
@@ -232,7 +232,7 @@ static const struct of_device_id ls_pcie_of_match[] = {
 	{ },
 };
 
-static int __init ls_pcie_probe(struct platform_device *pdev)
+static int ls_pcie_probe(struct platform_device *pdev)
 {
 	struct device *dev = &pdev->dev;
 	struct dw_pcie *pci;
@@ -271,10 +271,11 @@ static int __init ls_pcie_probe(struct platform_device *pdev)
 }
 
 static struct platform_driver ls_pcie_driver = {
+	.probe = ls_pcie_probe,
 	.driver = {
 		.name = "layerscape-pcie",
 		.of_match_table = ls_pcie_of_match,
 		.suppress_bind_attrs = true,
 	},
 };
-builtin_platform_driver_probe(ls_pcie_driver, ls_pcie_probe);
+builtin_platform_driver(ls_pcie_driver);
-- 
2.20.1


^ permalink raw reply related

* [PATCH v3] [PATCH] powerpc/sstep: Check ISA 3.0 instruction validity before emulation
From: Ananth N Mavinakayanahalli @ 2021-01-20 11:13 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: naveen.n.rao, ravi.bangoria, dja

We currently unconditionally try to emulate newer instructions on older
Power versions that could cause issues. Gate it.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@linux.ibm.com>
---

[v3] Addressed Naveen's comments on scv and addpcis
[v2] Fixed description

 arch/powerpc/lib/sstep.c |   46 ++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 44 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
index bf7a7d62ae8b..5a425a4a1d88 100644
--- a/arch/powerpc/lib/sstep.c
+++ b/arch/powerpc/lib/sstep.c
@@ -1304,9 +1304,11 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 		if ((word & 0xfe2) == 2)
 			op->type = SYSCALL;
 		else if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) &&
-				(word & 0xfe3) == 1)
+				(word & 0xfe3) == 1) {	/* scv */
 			op->type = SYSCALL_VECTORED_0;
-		else
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
+		} else
 			op->type = UNKNOWN;
 		return 0;
 #endif
@@ -1530,6 +1532,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 	case 19:
 		if (((word >> 1) & 0x1f) == 2) {
 			/* addpcis */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			imm = (short) (word & 0xffc1);	/* d0 + d2 fields */
 			imm |= (word >> 15) & 0x3e;	/* d1 field */
 			op->val = regs->nip + (imm << 16) + 4;
@@ -2439,6 +2443,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 268:	/* lxvx */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(LOAD_VSX, 0, 16);
 			op->element_size = 16;
@@ -2448,6 +2454,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 		case 269:	/* lxvl */
 		case 301: {	/* lxvll */
 			int nb;
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->ea = ra ? regs->gpr[ra] : 0;
 			nb = regs->gpr[rb] & 0xff;
@@ -2475,6 +2483,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 364:	/* lxvwsx */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(LOAD_VSX, 0, 4);
 			op->element_size = 4;
@@ -2482,6 +2492,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 396:	/* stxvx */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(STORE_VSX, 0, 16);
 			op->element_size = 16;
@@ -2491,6 +2503,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 		case 397:	/* stxvl */
 		case 429: {	/* stxvll */
 			int nb;
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->ea = ra ? regs->gpr[ra] : 0;
 			nb = regs->gpr[rb] & 0xff;
@@ -2542,6 +2556,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 781:	/* lxsibzx */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(LOAD_VSX, 0, 1);
 			op->element_size = 8;
@@ -2549,6 +2565,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 812:	/* lxvh8x */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(LOAD_VSX, 0, 16);
 			op->element_size = 2;
@@ -2556,6 +2574,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 813:	/* lxsihzx */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(LOAD_VSX, 0, 2);
 			op->element_size = 8;
@@ -2569,6 +2589,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 876:	/* lxvb16x */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(LOAD_VSX, 0, 16);
 			op->element_size = 1;
@@ -2582,6 +2604,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 909:	/* stxsibx */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(STORE_VSX, 0, 1);
 			op->element_size = 8;
@@ -2589,6 +2613,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 940:	/* stxvh8x */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(STORE_VSX, 0, 16);
 			op->element_size = 2;
@@ -2596,6 +2622,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 941:	/* stxsihx */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(STORE_VSX, 0, 2);
 			op->element_size = 8;
@@ -2609,6 +2637,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 1004:	/* stxvb16x */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd | ((word & 1) << 5);
 			op->type = MKOP(STORE_VSX, 0, 16);
 			op->element_size = 1;
@@ -2717,12 +2747,16 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			op->type = MKOP(LOAD_FP, 0, 16);
 			break;
 		case 2:		/* lxsd */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd + 32;
 			op->type = MKOP(LOAD_VSX, 0, 8);
 			op->element_size = 8;
 			op->vsx_flags = VSX_CHECK_VEC;
 			break;
 		case 3:		/* lxssp */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->reg = rd + 32;
 			op->type = MKOP(LOAD_VSX, 0, 4);
 			op->element_size = 8;
@@ -2775,6 +2809,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 1:		/* lxv */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->ea = dqform_ea(word, regs);
 			if (word & 8)
 				op->reg = rd + 32;
@@ -2785,6 +2821,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 
 		case 2:		/* stxsd with LSB of DS field = 0 */
 		case 6:		/* stxsd with LSB of DS field = 1 */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->ea = dsform_ea(word, regs);
 			op->reg = rd + 32;
 			op->type = MKOP(STORE_VSX, 0, 8);
@@ -2794,6 +2832,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 
 		case 3:		/* stxssp with LSB of DS field = 0 */
 		case 7:		/* stxssp with LSB of DS field = 1 */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->ea = dsform_ea(word, regs);
 			op->reg = rd + 32;
 			op->type = MKOP(STORE_VSX, 0, 4);
@@ -2802,6 +2842,8 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
 			break;
 
 		case 5:		/* stxv */
+			if (!cpu_has_feature(CPU_FTR_ARCH_300))
+				return -1;
 			op->ea = dqform_ea(word, regs);
 			if (word & 8)
 				op->reg = rd + 32;



^ permalink raw reply related

* Re: [PATCH] powerpc/uprobes: Don't allow probe on suffix of prefixed instruction
From: Ravi Bangoria @ 2021-01-20 11:18 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ravi Bangoria, linux-kernel, rostedt, paulus, sandipan, jniethe5,
	naveen.n.rao, linuxppc-dev
In-Reply-To: <20210119172603.GA16696@redhat.com>



On 1/19/21 10:56 PM, Oleg Nesterov wrote:
> On 01/19, Ravi Bangoria wrote:
>>
>> Probe on 2nd word of a prefixed instruction is invalid scenario and
>> should be restricted.
> 
> I don't understand this ppc-specific problem, but...

So far (upto Power9), instruction size was fixed - 4 bytes. But Power10
introduced a prefixed instruction which consist of 8 bytes, where first
4 bytes is prefix and remaining is suffix.

This patch checks whether the Uprobe is on the 2nd word (suffix) of a
prefixed instruction. If so, consider it as invalid Uprobe.

> 
>> +#ifdef CONFIG_PPC64
>> +int arch_uprobe_verify_opcode(struct page *page, unsigned long vaddr,
>> +			      uprobe_opcode_t opcode)
>> +{
>> +	uprobe_opcode_t prefix;
>> +	void *kaddr;
>> +	struct ppc_inst inst;
>> +
>> +	/* Don't check if vaddr is pointing to the beginning of page */
>> +	if (!(vaddr & ~PAGE_MASK))
>> +		return 0;
> 
> So the fix is incomplete? Or insn at the start of page can't be prefixed?

Prefixed instruction can not cross 64 byte boundary. If it does, kernel
generates SIGBUS. Considering all powerpc supported page sizes to be
multiple of 64 bytes, there will never be a scenario where prefix and
suffix will be on different pages. i.e. a beginning of the page should
never be a suffix.

> 
>> +int __weak arch_uprobe_verify_opcode(struct page *page, unsigned long vaddr,
>> +				     uprobe_opcode_t opcode)
>> +{
>> +	return 0;
>> +}
>> +
>>   static int verify_opcode(struct page *page, unsigned long vaddr, uprobe_opcode_t *new_opcode)
>>   {
>>   	uprobe_opcode_t old_opcode;
>> @@ -275,6 +281,8 @@ static int verify_opcode(struct page *page, unsigned long vaddr, uprobe_opcode_t
>>   	if (is_swbp_insn(new_opcode)) {
>>   		if (is_swbp)		/* register: already installed? */
>>   			return 0;
>> +		if (arch_uprobe_verify_opcode(page, vaddr, old_opcode))
>> +			return -EINVAL;
> 
> Well, this doesn't look good...
> 
> To me it would be better to change the prepare_uprobe() path to copy
> the potential prefix into uprobe->arch and check ppc_inst_prefixed()
> in arch_uprobe_analyze_insn(). What do you think?

Agreed. The only reason I was checking via verify_opcode() is to make the
code more simpler. If I need to check via prepare_uprobe(), I'll need to
abuse uprobe->offset by setting it to uprobe->offset - 4 to read previous
4 bytes of current instruction. Which, IMHO, is not that straightforward
with current implementation of prepare_uprobe().

But while replying here, I'm thinking... I should be able to grab a page
using mm and vaddr, which are already available in arch_uprobe_analyze_insn().
With that, I should be able to do all this inside arch_uprobe_analyze_insn()
only. I'll try this and send v2 if that works.

Thanks for the review.
Ravi

^ permalink raw reply

* Re: [PATCH v6 25/39] powerpc: convert interrupt handlers to use wrappers
From: kernel test robot @ 2021-01-20 11:45 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev
  Cc: clang-built-linux, kbuild-all, Nicholas Piggin
In-Reply-To: <20210115165012.1260253-26-npiggin@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4556 bytes --]

Hi Nicholas,

I love your patch! Perhaps something to improve:

[auto build test WARNING on powerpc/next]
[also build test WARNING on v5.11-rc4 next-20210120]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Nicholas-Piggin/powerpc-interrupt-wrappers/20210116-023244
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-randconfig-r035-20210120 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 22b68440e1647e16b5ee24b924986207173c02d1)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install powerpc cross compiling tool for clang build
        # apt-get install binutils-powerpc-linux-gnu
        # https://github.com/0day-ci/linux/commit/04d5131f1545e1e992962a5339135b605eb421a5
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Nicholas-Piggin/powerpc-interrupt-wrappers/20210116-023244
        git checkout 04d5131f1545e1e992962a5339135b605eb421a5
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> arch/powerpc/mm/book3s64/hash_utils.c:1516:30: warning: no previous prototype for function '__do_hash_fault' [-Wmissing-prototypes]
   DEFINE_INTERRUPT_HANDLER_RET(__do_hash_fault)
                                ^
   arch/powerpc/mm/book3s64/hash_utils.c:1516:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   DEFINE_INTERRUPT_HANDLER_RET(__do_hash_fault)
   ^
   arch/powerpc/include/asm/interrupt.h:150:19: note: expanded from macro 'DEFINE_INTERRUPT_HANDLER_RET'
   __visible noinstr long func(struct pt_regs *regs)                       \
                     ^
   arch/powerpc/mm/book3s64/hash_utils.c:1905:6: warning: no previous prototype for function 'hpte_insert_repeating' [-Wmissing-prototypes]
   long hpte_insert_repeating(unsigned long hash, unsigned long vpn,
        ^
   arch/powerpc/mm/book3s64/hash_utils.c:1905:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   long hpte_insert_repeating(unsigned long hash, unsigned long vpn,
   ^
   static 
   2 warnings generated.


vim +/__do_hash_fault +1516 arch/powerpc/mm/book3s64/hash_utils.c

  1515	
> 1516	DEFINE_INTERRUPT_HANDLER_RET(__do_hash_fault)
  1517	{
  1518		unsigned long ea = regs->dar;
  1519		unsigned long dsisr = regs->dsisr;
  1520		unsigned long access = _PAGE_PRESENT | _PAGE_READ;
  1521		unsigned long flags = 0;
  1522		struct mm_struct *mm;
  1523		unsigned int region_id;
  1524		long err;
  1525	
  1526		region_id = get_region_id(ea);
  1527		if ((region_id == VMALLOC_REGION_ID) || (region_id == IO_REGION_ID))
  1528			mm = &init_mm;
  1529		else
  1530			mm = current->mm;
  1531	
  1532		if (dsisr & DSISR_NOHPTE)
  1533			flags |= HPTE_NOHPTE_UPDATE;
  1534	
  1535		if (dsisr & DSISR_ISSTORE)
  1536			access |= _PAGE_WRITE;
  1537		/*
  1538		 * We set _PAGE_PRIVILEGED only when
  1539		 * kernel mode access kernel space.
  1540		 *
  1541		 * _PAGE_PRIVILEGED is NOT set
  1542		 * 1) when kernel mode access user space
  1543		 * 2) user space access kernel space.
  1544		 */
  1545		access |= _PAGE_PRIVILEGED;
  1546		if (user_mode(regs) || (region_id == USER_REGION_ID))
  1547			access &= ~_PAGE_PRIVILEGED;
  1548	
  1549		if (regs->trap == 0x400)
  1550			access |= _PAGE_EXEC;
  1551	
  1552		err = hash_page_mm(mm, ea, access, regs->trap, flags);
  1553		if (unlikely(err < 0)) {
  1554			// failed to instert a hash PTE due to an hypervisor error
  1555			if (user_mode(regs)) {
  1556				if (IS_ENABLED(CONFIG_PPC_SUBPAGE_PROT) && err == -2)
  1557					_exception(SIGSEGV, regs, SEGV_ACCERR, ea);
  1558				else
  1559					_exception(SIGBUS, regs, BUS_ADRERR, ea);
  1560			} else {
  1561				bad_page_fault(regs, SIGBUS);
  1562			}
  1563			err = 0;
  1564		}
  1565	
  1566		return err;
  1567	}
  1568	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 33697 bytes --]

^ permalink raw reply

* Re: [PATCH 6/6] powerpc/rtas: constrain user region allocation to RMA
From: Michael Ellerman @ 2021-01-20 12:06 UTC (permalink / raw)
  To: Nathan Lynch, Alexey Kardashevskiy, linuxppc-dev
  Cc: tyreld, brking, ajd, aneesh.kumar
In-Reply-To: <871regxwzh.fsf@linux.ibm.com>

Nathan Lynch <nathanl@linux.ibm.com> writes:
> Alexey Kardashevskiy <aik@ozlabs.ru> writes:
>> On 16/01/2021 02:38, Nathan Lynch wrote:
>>> Alexey Kardashevskiy <aik@ozlabs.ru> writes:
>>>> On 15/01/2021 09:00, Nathan Lynch wrote:
>>>>> Memory locations passed as arguments from the OS to RTAS usually need
>>>>> to be addressable in 32-bit mode and must reside in the Real Mode
>>>>> Area. On PAPR guests, the RMA starts at logical address 0 and is the
>>>>> first logical memory block reported in the LPAR’s device tree.
>>>>>
>>>>> On powerpc targets with RTAS, Linux makes available to user space a
>>>>> region of memory suitable for arguments to be passed to RTAS via
>>>>> sys_rtas(). This region (rtas_rmo_buf) is allocated via the memblock
>>>>> API during boot in order to ensure that it satisfies the requirements
>>>>> described above.
>>>>>
>>>>> With radix MMU, the upper limit supplied to the memblock allocation
>>>>> can exceed the bounds of the first logical memory block, since
>>>>> ppc64_rma_size is ULONG_MAX and RTAS_INSTANTIATE_MAX is 1GB. (512MB is
>>>>> a common size of the first memory block according to a small sample of
>>>>> LPARs I have checked.) This leads to failures when user space invokes
>>>>> an RTAS function that uses a work area, such as
>>>>> ibm,configure-connector.
>>>>>
>>>>> Alter the determination of the upper limit for rtas_rmo_buf's
>>>>> allocation to consult the device tree directly, ensuring placement
>>>>> within the RMA regardless of the MMU in use.
>>>>
>>>> Can we tie this with RTAS (which also needs to be in RMA) and simply add
>>>> extra 64K in prom_instantiate_rtas() and advertise this address
>>>> (ALIGH_UP(rtas-base + rtas-size, PAGE_SIZE)) to the user space? We do
>>>> not need this RMO area before that point.
>>> 
>>> Can you explain more about what advantage that would bring? I'm not
>>> seeing it. It's a more significant change than what I've written
>>> here.
>>
>>
>> We already allocate space for RTAS and (like RMO) it needs to be in RMA, 
>> and RMO is useless without RTAS. We can reuse RTAS allocation code for 
>> RMO like this:
>
> When you say RMO I assume you are referring to rtas_rmo_buf? (I don't
> think it is well-named.)
...

RMO (Real mode offset) is the old term we used to use to refer to what
is now called the RMA (Real mode area). There are still many references
to RMO in Linux, but they almost certainly all refer to what we now call
the RMA.

>> May be store in the FDT as "linux,rmo-base" next to "linux,rtas-base", 
>> for clarity, as sharing symbols between prom and main kernel is a bit 
>> tricky.
>>
>> The benefit is that we do not do the same thing   (== find 64K in RMA) 
>> in 2 different ways and if the RMO allocated my way is broken - we'll 
>> know it much sooner as RTAS itself will break too.
>
> Implementation details aside... I'll grant that combining the
> allocations into one in prom_init reduces some duplication in the sense
> that both are subject to the same constraints (mostly - the RTAS data
> area must not cross a 256MB boundary, while the user region may). But
> they really are distinct concerns. The RTAS private data area is
> specified in the platform architecture, the OS is obligated to allocate
> it and pass it to instantiate-rtas, etc etc. However the user region
> (rtas_rmo_buf) is purely a Linux construct which is there to support
> sys_rtas.
>
> Now, there are multiple sites in the kernel proper that must allocate
> memory suitable for passing to RTAS. Obviously there is value in
> consolidating the logic for that purpose in one place, so I'll work on
> adding that in v2. OK?

I don't think we want to move any allocations into prom_init.c unless we
have to.

It's best thought of as a trampoline, that runs before the kernel
proper, to transition from live OF to a flat DT environment. One thing
that must be done as part of that is instantiating RTAS, because it's
basically a runtime copy of the live OF. But any other allocs are for
Linux to handle later, IMHO.

cheers

^ permalink raw reply

* Re: [PATCH 6/6] powerpc/rtas: constrain user region allocation to RMA
From: Michael Ellerman @ 2021-01-20 12:13 UTC (permalink / raw)
  To: Nathan Lynch, linuxppc-dev; +Cc: aik, tyreld, brking, ajd, aneesh.kumar
In-Reply-To: <874kjcy73z.fsf@linux.ibm.com>

Nathan Lynch <nathanl@linux.ibm.com> writes:
> Michael Ellerman <mpe@ellerman.id.au> writes:
>> Nathan Lynch <nathanl@linux.ibm.com> writes:
>>> Memory locations passed as arguments from the OS to RTAS usually need
>>> to be addressable in 32-bit mode and must reside in the Real Mode
>>> Area. On PAPR guests, the RMA starts at logical address 0 and is the
>>> first logical memory block reported in the LPAR’s device tree.
>>>
>>> On powerpc targets with RTAS, Linux makes available to user space a
>>> region of memory suitable for arguments to be passed to RTAS via
>>> sys_rtas(). This region (rtas_rmo_buf) is allocated via the memblock
>>> API during boot in order to ensure that it satisfies the requirements
>>> described above.
>>>
>>> With radix MMU, the upper limit supplied to the memblock allocation
>>> can exceed the bounds of the first logical memory block, since
>>> ppc64_rma_size is ULONG_MAX and RTAS_INSTANTIATE_MAX is 1GB.
>>
>> Why does the size of the first memory block matter for radix?
>
> Here is my understanding: in the platform architecture, the size of the
> first memory block equals the RMA, regardless of the MMU mode. It just
> so happens that when using radix, Linux can pass ibm,configure-connector
> a work area address outside of the RMA because the allocation
> constraints for the work area are computed differently. It would be
> wrong of the OS to pass RTAS arguments outside of this region with hash
> MMU as well.

If that's the requirement then shouldn't we be adjusting ppc64_rma_size?
Otherwise aren't other uses of ppc64_rma_size going to run into similar
problems.

Or does the RMA only apply for RTAS calls when using radix?

cheers

^ permalink raw reply

* [PATCH] powerpc: Fix build error in paravirt.h
From: Michal Suchanek @ 2021-01-20 13:28 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Juergen Gross, Srikar Dronamraju, Deep Shah, VMware, Inc.,
	linux-kernel, virtualization, Paul Mackerras, Waiman Long,
	Michal Suchanek

./arch/powerpc/include/asm/paravirt.h:83:44: error: implicit declaration
of function 'smp_processor_id'; did you mean 'raw_smp_processor_id'?

smp_processor_id is defined in linux/smp.h but it is not included.

The build error happens only when the patch is applied to 5.3 kernel but
it only works by chance in mainline.

Fixes: ca3f969dcb11 ("powerpc/paravirt: Use is_kvm_guest() in vcpu_is_preempted()")
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
---
 arch/powerpc/include/asm/paravirt.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/include/asm/paravirt.h b/arch/powerpc/include/asm/paravirt.h
index edc08f04aef7..5d1726bb28e7 100644
--- a/arch/powerpc/include/asm/paravirt.h
+++ b/arch/powerpc/include/asm/paravirt.h
@@ -10,6 +10,7 @@
 #endif
 
 #ifdef CONFIG_PPC_SPLPAR
+#include <linux/smp.h>
 #include <asm/kvm_guest.h>
 #include <asm/cputhreads.h>
 
-- 
2.26.2


^ permalink raw reply related

* Re: [PATCH v3] [PATCH] powerpc/sstep: Check ISA 3.0 instruction validity before emulation
From: Naveen N. Rao @ 2021-01-20 13:50 UTC (permalink / raw)
  To: Ananth N Mavinakayanahalli; +Cc: naveen.n.rao, linuxppc-dev, ravi.bangoria, dja
In-Reply-To: <161114113785.214433.12934683302522893921.stgit@thinktux.local>

On 2021/01/20 04:43PM, Ananth N Mavinakayanahalli wrote:
> We currently unconditionally try to emulate newer instructions on older
> Power versions that could cause issues. Gate it.
> 
> Signed-off-by: Ananth N Mavinakayanahalli <ananth@linux.ibm.com>
> ---
> 
> [v3] Addressed Naveen's comments on scv and addpcis
> [v2] Fixed description
> 
>  arch/powerpc/lib/sstep.c |   46 ++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 44 insertions(+), 2 deletions(-)

Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>

- Naveen


^ permalink raw reply

* Re: [PATCH] PCI: dwc: layerscape: convert to builtin_platform_driver()
From: Rob Herring @ 2021-01-20 14:23 UTC (permalink / raw)
  To: Michael Walle
  Cc: Roy Zang, Lorenzo Pieralisi, Saravana Kannan, PCI,
	linux-kernel@vger.kernel.org, Minghuan Lian, linux-arm-kernel,
	Greg Kroah-Hartman, Bjorn Helgaas, linuxppc-dev, Mingkai Hu
In-Reply-To: <20210120105246.23218-1-michael@walle.cc>

On Wed, Jan 20, 2021 at 4:53 AM Michael Walle <michael@walle.cc> wrote:
>
> fw_devlink will defer the probe until all suppliers are ready. We can't
> use builtin_platform_driver_probe() because it doesn't retry after probe
> deferral. Convert it to builtin_platform_driver().

If builtin_platform_driver_probe() doesn't work with fw_devlink, then
shouldn't it be fixed or removed? Then we're not fixing drivers later
when folks start caring about deferred probe and devlink.

I'd really prefer if we convert this to a module instead. (And all the
other PCI host drivers.)

> Fixes: e590474768f1 ("driver core: Set fw_devlink=on by default")

This happened!?

> Signed-off-by: Michael Walle <michael@walle.cc>
> ---
>  drivers/pci/controller/dwc/pci-layerscape.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/controller/dwc/pci-layerscape.c b/drivers/pci/controller/dwc/pci-layerscape.c
> index 44ad34cdc3bc..5b9c625df7b8 100644
> --- a/drivers/pci/controller/dwc/pci-layerscape.c
> +++ b/drivers/pci/controller/dwc/pci-layerscape.c
> @@ -232,7 +232,7 @@ static const struct of_device_id ls_pcie_of_match[] = {
>         { },
>  };
>
> -static int __init ls_pcie_probe(struct platform_device *pdev)
> +static int ls_pcie_probe(struct platform_device *pdev)
>  {
>         struct device *dev = &pdev->dev;
>         struct dw_pcie *pci;
> @@ -271,10 +271,11 @@ static int __init ls_pcie_probe(struct platform_device *pdev)
>  }
>
>  static struct platform_driver ls_pcie_driver = {
> +       .probe = ls_pcie_probe,
>         .driver = {
>                 .name = "layerscape-pcie",
>                 .of_match_table = ls_pcie_of_match,
>                 .suppress_bind_attrs = true,
>         },
>  };
> -builtin_platform_driver_probe(ls_pcie_driver, ls_pcie_probe);
> +builtin_platform_driver(ls_pcie_driver);
> --
> 2.20.1
>

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox