LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: Fwd: Re: [PATCH v17 02/10] of: Add a common kexec FDT setup function
From: Thiago Jung Bauermann @ 2021-02-12  3:21 UTC (permalink / raw)
  To: Lakshmi Ramasubramanian
  Cc: Rob Herring, linux-integrity, linuxppc-dev, linux-arm-kernel,
	Mimi Zohar
In-Reply-To: <e7f3ae2e-20bc-9901-fb8d-80a3163e7d5e@linux.microsoft.com>


Lakshmi Ramasubramanian <nramas@linux.microsoft.com> writes:

> On 2/11/21 6:11 PM, Thiago Jung Bauermann wrote:
>> Lakshmi Ramasubramanian <nramas@linux.microsoft.com> writes:
>> 
>>> On 2/11/21 3:59 PM, Thiago Jung Bauermann wrote:
>>>> Lakshmi Ramasubramanian <nramas@linux.microsoft.com> writes:
>>>>
>>>>> On 2/11/21 9:42 AM, Lakshmi Ramasubramanian wrote:
>>>>>> Hi Rob,
>>>>>> [PATCH] powerpc: Rename kexec elfcorehdr_addr to elf_headers_mem
>>>>>> This change causes build problem for x86_64 architecture (please see the
>>>>>> mail from kernel test bot below) since arch/x86/include/asm/kexec.h uses
>>>>>> "elf_load_addr" for the ELF header buffer address and not
>>>>>> "elf_headers_mem".
>>>>>> struct kimage_arch {
>>>>>>        ...
>>>>>>        /* Core ELF header buffer */
>>>>>>        void *elf_headers;
>>>>>>        unsigned long elf_headers_sz;
>>>>>>        unsigned long elf_load_addr;
>>>>>> };
>>>>>> I am thinking of limiting of_kexec_alloc_and_setup_fdt() to ARM64 and
>>>>>> PPC64 since they are the only ones using this function now.
>>>>>> #if defined(CONFIG_ARM64) && defined(CONFIG_PPC64)
>>>>> Sorry - I meant to say
>>>>> #if defined(CONFIG_ARM64) || defined(CONFIG_PPC64)
>>>>>
>>>> Does it build correctly if you rename elf_headers_mem to elf_load_addr?
>>>> Or the other way around, renaming x86's elf_load_addr to
>>>> elf_headers_mem. I don't really have a preference.
>>>
>>> Yes - changing arm64 and ppc from "elf_headers_mem" to "elf_load_addr" builds
>>> fine.
>>>
>>> But I am concerned about a few other architectures that also define "struct
>>> kimage_arch" such as "parisc", "arm" which do not have any ELF related fields.
>>> They would not build if the config defines CONFIG_KEXEC_FILE and
>>> CONFIG_OF_FLATTREE.
>>>
>>> Do you think that could be an issue?
>> That's a good point. But in practice, arm doesn't support
>> CONFIG_KEXEC_FILE. And while parisc does support CONFIG_KEXEC_FILE, as
>> far as I could determine it doesn't support CONFIG_OF.
>> So IMHO we don't need to worry about them. We'll cross that bridge if we
>> get there. If they ever implement KEXEC_FILE or OF_FLATTREE support,
>> then (again, IMHO) the natural solution would be for them to name the
>> ELF header member the same way the other arches do.
>> And since no other architecture defines struct kimage_arch, those are
>> the only ones we need to consider.
>> 
>
> Sounds good Thiago.
>
> I'll rename arm64 and ppc kimage_arch ELF address field to match that defined
> for x86/x64.
>
> Also, will add "fdt_size" param to of_kexec_alloc_and_setup_fdt(). For now, I'll
> use 2*fdt_totalsize(initial_boot_params) for ppc.
>
> Will send the updated patches shortly.

Sounds good. There will be a small conflict with powerpc/next because of
the patch I mentioned, but it's simple to fix by whoever merges the
series.

-- 
Thiago Jung Bauermann
IBM Linux Technology Center

^ permalink raw reply

* linux-next: manual merge of the spi tree with the powerpc tree
From: Stephen Rothwell @ 2021-02-12  4:31 UTC (permalink / raw)
  To: Mark Brown, Michael Ellerman, PowerPC
  Cc: Linux Next Mailing List, Linux Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 1454 bytes --]

Hi all,

Today's linux-next merge of the spi tree got a conflict in:

  drivers/spi/spi-mpc52xx.c

between commit:

  e10656114d32 ("spi: mpc52xx: Avoid using get_tbl()")

from the powerpc tree and commit:

  258ea99fe25a ("spi: spi-mpc52xx: Use new structure for SPI transfer delays")

from the spi tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

BTW Mark: the author's address in 258ea99fe25a uses a non existent domain :-(
-- 
Cheers,
Stephen Rothwell

diff --cc drivers/spi/spi-mpc52xx.c
index e6a30f232370,36f941500676..000000000000
--- a/drivers/spi/spi-mpc52xx.c
+++ b/drivers/spi/spi-mpc52xx.c
@@@ -247,8 -247,10 +247,10 @@@ static int mpc52xx_spi_fsmstate_transfe
  	/* Is the transfer complete? */
  	ms->len--;
  	if (ms->len == 0) {
 -		ms->timestamp = get_tbl();
 +		ms->timestamp = mftb();
- 		ms->timestamp += ms->transfer->delay_usecs * tb_ticks_per_usec;
+ 		if (ms->transfer->delay.unit == SPI_DELAY_UNIT_USECS)
+ 			ms->timestamp += ms->transfer->delay.value *
+ 					 tb_ticks_per_usec;
  		ms->state = mpc52xx_spi_fsmstate_wait;
  		return FSM_CONTINUE;
  	}

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* Re: [PATCH v5 04/10] powerpc: Reference param in MSR_TM_ACTIVE() macro
From: Daniel Axtens @ 2021-02-12  4:52 UTC (permalink / raw)
  To: Christopher M. Riedl, linuxppc-dev
In-Reply-To: <20210203184323.20792-5-cmr@codefail.de>

Hi Chris,

(totally bikeshedding, but I'd use the full word parameter in the
subject if you have enough spare characters.)

> Unlike the other MSR_TM_* macros, MSR_TM_ACTIVE does not reference or
> use its parameter unless CONFIG_PPC_TRANSACTIONAL_MEM is defined. This
> causes an 'unused variable' compile warning unless the variable is also
> guarded with CONFIG_PPC_TRANSACTIONAL_MEM.
>
> Reference but do nothing with the argument in the macro to avoid a
> potential compile warning.

Andrew asked why we weren't seeing these warnings already.

I was able to reproduce them with CONFIG_PPC_TRANSACTIONAL_MEM off, but
only if I compiled with W=1:

arch/powerpc/kernel/process.c: In function ‘enable_kernel_fp’:
arch/powerpc/kernel/process.c:210:16: error: variable ‘cpumsr’ set but not used [-Werror=unused-but-set-variable]
  210 |  unsigned long cpumsr;
      |                ^~~~~~

Having said that, this change seems like a good idea, squashing warnings
at W=1 is still valuable.

Reviewed-by: Daniel Axtens <dja@axtens.net>

Kind regards,
Daniel

> Signed-off-by: Christopher M. Riedl <cmr@codefail.de>
> ---
>  arch/powerpc/include/asm/reg.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
> index e40a921d78f9..c5a3e856191c 100644
> --- a/arch/powerpc/include/asm/reg.h
> +++ b/arch/powerpc/include/asm/reg.h
> @@ -124,7 +124,7 @@
>  #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
>  #define MSR_TM_ACTIVE(x) (((x) & MSR_TS_MASK) != 0) /* Transaction active? */
>  #else
> -#define MSR_TM_ACTIVE(x) 0
> +#define MSR_TM_ACTIVE(x) ((void)(x), 0)
>  #endif
>  
>  #if defined(CONFIG_PPC_BOOK3S_64)
> -- 
> 2.26.1

^ permalink raw reply

* Re: [PATCH v5 05/10] powerpc/signal64: Remove TM ifdefery in middle of if/else block
From: Daniel Axtens @ 2021-02-12  5:21 UTC (permalink / raw)
  To: Christopher M. Riedl, linuxppc-dev
In-Reply-To: <20210203184323.20792-6-cmr@codefail.de>

Hi Chris,

> Rework the messy ifdef breaking up the if-else for TM similar to
> commit f1cf4f93de2f ("powerpc/signal32: Remove ifdefery in middle of if/else").

I'm not sure what 'the messy ifdef' and 'the if-else for TM' is (yet):
perhaps you could start the commit message with a tiny bit of
background.

> Unlike that commit for ppc32, the ifdef can't be removed entirely since
> uc_transact in sigframe depends on CONFIG_PPC_TRANSACTIONAL_MEM.
>
> Signed-off-by: Christopher M. Riedl <cmr@codefail.de>
> ---
>  arch/powerpc/kernel/signal_64.c | 16 +++++++---------
>  1 file changed, 7 insertions(+), 9 deletions(-)
>
> diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
> index b211a8ea4f6e..8e1d804ce552 100644
> --- a/arch/powerpc/kernel/signal_64.c
> +++ b/arch/powerpc/kernel/signal_64.c
> @@ -710,9 +710,7 @@ SYSCALL_DEFINE0(rt_sigreturn)
>  	struct pt_regs *regs = current_pt_regs();
>  	struct ucontext __user *uc = (struct ucontext __user *)regs->gpr[1];
>  	sigset_t set;
> -#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
>  	unsigned long msr;
> -#endif
>  
>  	/* Always make any pending restarted system calls return -EINTR */
>  	current->restart_block.fn = do_no_restart_syscall;
> @@ -765,7 +763,10 @@ SYSCALL_DEFINE0(rt_sigreturn)
>  
>  	if (__get_user(msr, &uc->uc_mcontext.gp_regs[PT_MSR]))
>  		goto badframe;
> +#endif
> +
>  	if (MSR_TM_ACTIVE(msr)) {
> +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
>  		/* We recheckpoint on return. */
>  		struct ucontext __user *uc_transact;
>  
> @@ -778,9 +779,8 @@ SYSCALL_DEFINE0(rt_sigreturn)
>  		if (restore_tm_sigcontexts(current, &uc->uc_mcontext,
>  					   &uc_transact->uc_mcontext))
>  			goto badframe;
> -	} else
>  #endif
> -	{
> +	} else {
>  		/*
>  		 * Fall through, for non-TM restore
>  		 *

I think you can get rid of all the ifdefs in _this function_ by
providing a couple of stubs:

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index a66f435dabbf..19059a4b798f 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1120,6 +1120,7 @@ void restore_tm_state(struct pt_regs *regs)
 #else
 #define tm_recheckpoint_new_task(new)
 #define __switch_to_tm(prev, new)
+void tm_reclaim_current(uint8_t cause) {}
 #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
 
 static inline void save_sprs(struct thread_struct *t)
diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
index 8e1d804ce552..ed58ca019ad9 100644
--- a/arch/powerpc/kernel/signal_64.c
+++ b/arch/powerpc/kernel/signal_64.c
@@ -594,6 +594,13 @@ static long restore_tm_sigcontexts(struct task_struct *tsk,
 
 	return err;
 }
+#else
+static long restore_tm_sigcontexts(struct task_struct *tsk,
+				   struct sigcontext __user *sc,
+				   struct sigcontext __user *tm_sc)
+{
+	return -EINVAL;
+}
 #endif
 
 /*
@@ -722,7 +729,6 @@ SYSCALL_DEFINE0(rt_sigreturn)
 		goto badframe;
 	set_current_blocked(&set);
 
-#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 	/*
 	 * If there is a transactional state then throw it away.
 	 * The purpose of a sigreturn is to destroy all traces of the
@@ -763,10 +769,8 @@ SYSCALL_DEFINE0(rt_sigreturn)
 
 	if (__get_user(msr, &uc->uc_mcontext.gp_regs[PT_MSR]))
 		goto badframe;
-#endif
 
 	if (MSR_TM_ACTIVE(msr)) {
-#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 		/* We recheckpoint on return. */
 		struct ucontext __user *uc_transact;
 
@@ -779,7 +783,6 @@ SYSCALL_DEFINE0(rt_sigreturn)
 		if (restore_tm_sigcontexts(current, &uc->uc_mcontext,
 					   &uc_transact->uc_mcontext))
 			goto badframe;
-#endif
 	} else {
 		/*
 		 * Fall through, for non-TM restore

My only concern here was whether it was valid to access
 	if (__get_user(msr, &uc->uc_mcontext.gp_regs[PT_MSR]))
if CONFIG_PPC_TRANSACTIONAL_MEM was not defined, but I didn't think of
any obvious reason why it wouldn't be...

> @@ -818,10 +818,8 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
>  	unsigned long newsp = 0;
>  	long err = 0;
>  	struct pt_regs *regs = tsk->thread.regs;
> -#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
>  	/* Save the thread's msr before get_tm_stackpointer() changes it */
>  	unsigned long msr = regs->msr;
> -#endif

I wondered if that comment still makes sense in the absence of the
ifdef. It is preserved in commit f1cf4f93de2f ("powerpc/signal32: Remove
ifdefery in middle of if/else"), and I can't figure out how you would
improve it, so I guess it's probably good as it is.

>  	frame = get_sigframe(ksig, tsk, sizeof(*frame), 0);
>  	if (!access_ok(frame, sizeof(*frame)))
> @@ -836,8 +834,9 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
>  	/* Create the ucontext.  */
>  	err |= __put_user(0, &frame->uc.uc_flags);
>  	err |= __save_altstack(&frame->uc.uc_stack, regs->gpr[1]);
> -#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> +
>  	if (MSR_TM_ACTIVE(msr)) {
> +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
>  		/* The ucontext_t passed to userland points to the second
>  		 * ucontext_t (for transactional state) with its uc_link ptr.
>  		 */
> @@ -847,9 +846,8 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
>  					    tsk, ksig->sig, NULL,
>  					    (unsigned long)ksig->ka.sa.sa_handler,
>  					    msr);
> -	} else
>  #endif
> -	{
> +	} else {
>  		err |= __put_user(0, &frame->uc.uc_link);
>  		prepare_setup_sigcontext(tsk, 1);
>  		err |= setup_sigcontext(&frame->uc.uc_mcontext, tsk, ksig->sig,

That seems reasonable to me.

Kind regards,
Daniel


^ permalink raw reply related

* Re: [PATCH v5 06/10] powerpc/signal64: Replace setup_sigcontext() w/ unsafe_setup_sigcontext()
From: Daniel Axtens @ 2021-02-12  5:41 UTC (permalink / raw)
  To: Christopher M. Riedl, linuxppc-dev
In-Reply-To: <20210203184323.20792-7-cmr@codefail.de>

Hi Chris,

> Previously setup_sigcontext() performed a costly KUAP switch on every
> uaccess operation. These repeated uaccess switches cause a significant
> drop in signal handling performance.
>
> Rewrite setup_sigcontext() to assume that a userspace write access window
> is open. Replace all uaccess functions with their 'unsafe' versions
> which avoid the repeated uaccess switches.

Just to clarify the commit message a bit: you're also changing the
callers of the old safe versions to first open the window, then call the
unsafe variants, then close the window again.

> Signed-off-by: Christopher M. Riedl <cmr@codefail.de>
> ---
>  arch/powerpc/kernel/signal_64.c | 70 ++++++++++++++++++++-------------
>  1 file changed, 43 insertions(+), 27 deletions(-)
>
> diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
> index 8e1d804ce552..4248e4489ff1 100644
> --- a/arch/powerpc/kernel/signal_64.c
> +++ b/arch/powerpc/kernel/signal_64.c
> @@ -101,9 +101,13 @@ static void prepare_setup_sigcontext(struct task_struct *tsk, int ctx_has_vsx_re
>   * Set up the sigcontext for the signal frame.
>   */
>  
> -static long setup_sigcontext(struct sigcontext __user *sc,
> -		struct task_struct *tsk, int signr, sigset_t *set,
> -		unsigned long handler, int ctx_has_vsx_region)
> +#define unsafe_setup_sigcontext(sc, tsk, signr, set, handler,		\
> +				ctx_has_vsx_region, e)			\
> +	unsafe_op_wrap(__unsafe_setup_sigcontext(sc, tsk, signr, set,	\
> +			handler, ctx_has_vsx_region), e)
> +static long notrace __unsafe_setup_sigcontext(struct sigcontext __user *sc,
> +					struct task_struct *tsk, int signr, sigset_t *set,
> +					unsigned long handler, int ctx_has_vsx_region)
>  {
>  	/* When CONFIG_ALTIVEC is set, we _always_ setup v_regs even if the
>  	 * process never used altivec yet (MSR_VEC is zero in pt_regs of
> @@ -118,20 +122,19 @@ static long setup_sigcontext(struct sigcontext __user *sc,
>  #endif
>  	struct pt_regs *regs = tsk->thread.regs;
>  	unsigned long msr = regs->msr;
> -	long err = 0;
>  	/* Force usr to alway see softe as 1 (interrupts enabled) */
>  	unsigned long softe = 0x1;
>  
>  	BUG_ON(tsk != current);
>  
>  #ifdef CONFIG_ALTIVEC
> -	err |= __put_user(v_regs, &sc->v_regs);
> +	unsafe_put_user(v_regs, &sc->v_regs, efault_out);
>  
>  	/* save altivec registers */
>  	if (tsk->thread.used_vr) {
>  		/* Copy 33 vec registers (vr0..31 and vscr) to the stack */
> -		err |= __copy_to_user(v_regs, &tsk->thread.vr_state,
> -				      33 * sizeof(vector128));
> +		unsafe_copy_to_user(v_regs, &tsk->thread.vr_state,
> +				    33 * sizeof(vector128), efault_out);
>  		/* set MSR_VEC in the MSR value in the frame to indicate that sc->v_reg)
>  		 * contains valid data.
>  		 */
> @@ -140,12 +143,12 @@ static long setup_sigcontext(struct sigcontext __user *sc,
>  	/* We always copy to/from vrsave, it's 0 if we don't have or don't
>  	 * use altivec.
>  	 */
> -	err |= __put_user(tsk->thread.vrsave, (u32 __user *)&v_regs[33]);
> +	unsafe_put_user(tsk->thread.vrsave, (u32 __user *)&v_regs[33], efault_out);
>  #else /* CONFIG_ALTIVEC */
> -	err |= __put_user(0, &sc->v_regs);
> +	unsafe_put_user(0, &sc->v_regs, efault_out);
>  #endif /* CONFIG_ALTIVEC */
>  	/* copy fpr regs and fpscr */
> -	err |= copy_fpr_to_user(&sc->fp_regs, tsk);
> +	unsafe_copy_fpr_to_user(&sc->fp_regs, tsk, efault_out);
>  
>  	/*
>  	 * Clear the MSR VSX bit to indicate there is no valid state attached
> @@ -160,24 +163,27 @@ static long setup_sigcontext(struct sigcontext __user *sc,
>  	 */
>  	if (tsk->thread.used_vsr && ctx_has_vsx_region) {
>  		v_regs += ELF_NVRREG;
> -		err |= copy_vsx_to_user(v_regs, tsk);
> +		unsafe_copy_vsx_to_user(v_regs, tsk, efault_out);
>  		/* set MSR_VSX in the MSR value in the frame to
>  		 * indicate that sc->vs_reg) contains valid data.
>  		 */
>  		msr |= MSR_VSX;
>  	}
>  #endif /* CONFIG_VSX */
> -	err |= __put_user(&sc->gp_regs, &sc->regs);
> +	unsafe_put_user(&sc->gp_regs, &sc->regs, efault_out);
>  	WARN_ON(!FULL_REGS(regs));
> -	err |= __copy_to_user(&sc->gp_regs, regs, GP_REGS_SIZE);
> -	err |= __put_user(msr, &sc->gp_regs[PT_MSR]);
> -	err |= __put_user(softe, &sc->gp_regs[PT_SOFTE]);
> -	err |= __put_user(signr, &sc->signal);
> -	err |= __put_user(handler, &sc->handler);
> +	unsafe_copy_to_user(&sc->gp_regs, regs, GP_REGS_SIZE, efault_out);
> +	unsafe_put_user(msr, &sc->gp_regs[PT_MSR], efault_out);
> +	unsafe_put_user(softe, &sc->gp_regs[PT_SOFTE], efault_out);
> +	unsafe_put_user(signr, &sc->signal, efault_out);
> +	unsafe_put_user(handler, &sc->handler, efault_out);
>  	if (set != NULL)
> -		err |=  __put_user(set->sig[0], &sc->oldmask);
> +		unsafe_put_user(set->sig[0], &sc->oldmask, efault_out);
>  
> -	return err;
> +	return 0;
> +
> +efault_out:
> +	return -EFAULT;
>  }
>  
>  #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> @@ -664,12 +670,15 @@ SYSCALL_DEFINE3(swapcontext, struct ucontext __user *, old_ctx,
>  
>  	if (old_ctx != NULL) {
>  		prepare_setup_sigcontext(current, ctx_has_vsx_region);
> -		if (!access_ok(old_ctx, ctx_size)
> -		    || setup_sigcontext(&old_ctx->uc_mcontext, current, 0, NULL, 0,
> -					ctx_has_vsx_region)
> -		    || __copy_to_user(&old_ctx->uc_sigmask,
> -				      &current->blocked, sizeof(sigset_t)))
> +		if (!user_write_access_begin(old_ctx, ctx_size))
>  			return -EFAULT;

I notice we get rid of access_ok, but that's fine because
user_write_access_begin includes access_ok since Linus decided that was
a good idea.

> +
> +		unsafe_setup_sigcontext(&old_ctx->uc_mcontext, current, 0, NULL,
> +					0, ctx_has_vsx_region, efault_out);
> +		unsafe_copy_to_user(&old_ctx->uc_sigmask, &current->blocked,
> +				    sizeof(sigset_t), efault_out);
> +
> +		user_write_access_end();
>  	}
>  	if (new_ctx == NULL)
>  		return 0;
> @@ -698,6 +707,10 @@ SYSCALL_DEFINE3(swapcontext, struct ucontext __user *, old_ctx,
>  	/* This returns like rt_sigreturn */
>  	set_thread_flag(TIF_RESTOREALL);
>  	return 0;
> +
> +efault_out:
> +	user_write_access_end();
> +	return -EFAULT;
>  }
>  
>  
> @@ -850,9 +863,12 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
>  	} else {
>  		err |= __put_user(0, &frame->uc.uc_link);
>  		prepare_setup_sigcontext(tsk, 1);
> -		err |= setup_sigcontext(&frame->uc.uc_mcontext, tsk, ksig->sig,
> -					NULL, (unsigned long)ksig->ka.sa.sa_handler,
> -					1);
> +		if (!user_write_access_begin(frame, sizeof(struct rt_sigframe)))
> +			return -EFAULT;

Here you're opening a window for all of `frame`...

> +		err |= __unsafe_setup_sigcontext(&frame->uc.uc_mcontext, tsk,
... but here you're only passing in frame->uc.uc_mcontext for writing.

ISTR that the underlying AMR switch is fully on / fully off so I don't
think it really matters, but in this case should you be calling
user_write_access_begin() with &frame->uc.uc_mcontext and the size of
that?

> +						ksig->sig, NULL,
> +						(unsigned long)ksig->ka.sa.sa_handler, 1);
> +		user_write_access_end();
>  	}
>  	err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set));
>  	if (err)


Apart from the size thing, everything looks good to me. I checked that
all the things you've changed from safe to unsafe pass the same
parameters, and they all looked good to me.

With those caveats,
 Reviewed-by: Daniel Axtens <dja@axtens.net>

Kind regards,
Daniel

^ permalink raw reply

* Re: [PATCH v2 0/2] of: of_device.h cleanups
From: Greg Kroah-Hartman @ 2021-02-12  7:34 UTC (permalink / raw)
  To: Rob Herring
  Cc: cocci, devicetree, Michal Marek, Rafael J. Wysocki, linuxppc-dev,
	Nicolas Palix, Patrice Chotard, linux-kernel, Julia Lawall,
	Gilles Muller, linux-usb, Paul Mackerras, netdev, Jakub Kicinski,
	Frank Rowand, David S. Miller, linux-arm-kernel, Felipe Balbi
In-Reply-To: <20210211232745.1498137-1-robh@kernel.org>

On Thu, Feb 11, 2021 at 05:27:43PM -0600, Rob Herring wrote:
> This is a couple of cleanups for of_device.h. They fell out from my
> attempt at decoupling of_device.h and of_platform.h which is a mess
> and I haven't finished, but there's no reason to wait on these.

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

^ permalink raw reply

* [PATCH for 5.10] powerpc/32: Preserve cr1 in exception prolog stack check to fix build error
From: Christophe Leroy @ 2021-02-12  8:57 UTC (permalink / raw)
  To: fedora.dm0, stable; +Cc: linuxppc-dev, linux-kernel

This is backport of 3642eb21256a ("powerpc/32: Preserve cr1 in
exception prolog stack check to fix build error") for kernel 5.10

It fixes the build failure on v5.10 reported by kernel test robot
and by David Michael.

This fix is not in Linux tree yet, it is in next branch in powerpc tree.

(cherry picked from commit 3642eb21256a317ac14e9ed560242c6d20cf06d9)

THREAD_ALIGN_SHIFT = THREAD_SHIFT + 1 = PAGE_SHIFT + 1
Maximum PAGE_SHIFT is 18 for 256k pages so
THREAD_ALIGN_SHIFT is 19 at the maximum.

No need to clobber cr1, it can be preserved when moving r1
into CR when we check stack overflow.

This reduces the number of instructions in Machine Check Exception
prolog and fixes a build failure reported by the kernel test robot
on v5.10 stable when building with RTAS + VMAP_STACK + KVM. That
build failure is due to too many instructions in the prolog hence
not fitting between 0x200 and 0x300. Allthough the problem doesn't
show up in mainline, it is still worth the change.

Fixes: 98bf2d3f4970 ("powerpc/32s: Fix RTAS machine check with VMAP stack")
Cc: stable@vger.kernel.org
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/5ae4d545e3ac58e133d2599e0deb88843cb494fc.1612768623.git.christophe.leroy@csgroup.eu
---
 arch/powerpc/kernel/head_32.h        | 2 +-
 arch/powerpc/kernel/head_book3s_32.S | 6 ------
 2 files changed, 1 insertion(+), 7 deletions(-)

diff --git a/arch/powerpc/kernel/head_32.h b/arch/powerpc/kernel/head_32.h
index c88e66adecb5..fef0b34a77c9 100644
--- a/arch/powerpc/kernel/head_32.h
+++ b/arch/powerpc/kernel/head_32.h
@@ -56,7 +56,7 @@
 1:
 	tophys_novmstack r11, r11
 #ifdef CONFIG_VMAP_STACK
-	mtcrf	0x7f, r1
+	mtcrf	0x3f, r1
 	bt	32 - THREAD_ALIGN_SHIFT, stack_overflow
 #endif
 .endm
diff --git a/arch/powerpc/kernel/head_book3s_32.S b/arch/powerpc/kernel/head_book3s_32.S
index d66da35f2e8d..2729d8fa6e77 100644
--- a/arch/powerpc/kernel/head_book3s_32.S
+++ b/arch/powerpc/kernel/head_book3s_32.S
@@ -280,12 +280,6 @@ MachineCheck:
 7:	EXCEPTION_PROLOG_2
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 #ifdef CONFIG_PPC_CHRP
-#ifdef CONFIG_VMAP_STACK
-	mfspr	r4, SPRN_SPRG_THREAD
-	tovirt(r4, r4)
-	lwz	r4, RTAS_SP(r4)
-	cmpwi	cr1, r4, 0
-#endif
 	beq	cr1, machine_check_tramp
 	twi	31, 0, 0
 #else
-- 
2.25.0


^ permalink raw reply related

* Re: linux-next: manual merge of the spi tree with the powerpc tree
From: Mark Brown @ 2021-02-12 12:27 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: Linux Next Mailing List, PowerPC, Linux Kernel Mailing List
In-Reply-To: <20210212152755.5c82563a@canb.auug.org.au>

[-- Attachment #1: Type: text/plain, Size: 237 bytes --]

On Fri, Feb 12, 2021 at 03:31:42PM +1100, Stephen Rothwell wrote:

> BTW Mark: the author's address in 258ea99fe25a uses a non existent domain :-(

Ugh, I think that's something gone wrong with b4 :(  A bit late now to
try to fix it up.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* Re: [PATCH v17 02/10] of: Add a common kexec FDT setup function
From: Rob Herring @ 2021-02-12 14:38 UTC (permalink / raw)
  To: Lakshmi Ramasubramanian
  Cc: Mark Rutland, tao.li, Mimi Zohar, Paul Mackerras,
	vincenzo.frascino, Frank Rowand, Sasha Levin, Masahiro Yamada,
	James Morris, AKASHI, Takahiro, linux-arm-kernel, Catalin Marinas,
	Serge E. Hallyn, devicetree, Pavel Tatashin, Will Deacon,
	Prakhar Srivastava, Hsin-Yi Wang, Allison Randal,
	Christophe Leroy, Matthias Brugger, balajib, dmitry.kasatkin,
	linux-kernel@vger.kernel.org, James Morse, Greg Kroah-Hartman,
	Joe Perches, linux-integrity, linuxppc-dev, Thiago Jung Bauermann
In-Reply-To: <8a3aa3d2-2eba-549a-9970-a2b0fe3586c9@linux.microsoft.com>

On Thu, Feb 11, 2021 at 7:17 PM Lakshmi Ramasubramanian
<nramas@linux.microsoft.com> wrote:
>
> On 2/11/21 5:09 PM, Thiago Jung Bauermann wrote:
> >
> > There's actually a complication that I just noticed and needs to be
> > addressed. More below.
> >
>
> <...>
>
> >> +
> >> +/*
> >> + * of_kexec_alloc_and_setup_fdt - Alloc and setup a new Flattened Device Tree
> >> + *
> >> + * @image:          kexec image being loaded.
> >> + * @initrd_load_addr:       Address where the next initrd will be loaded.
> >> + * @initrd_len:             Size of the next initrd, or 0 if there will be none.
> >> + * @cmdline:                Command line for the next kernel, or NULL if there will
> >> + *                  be none.
> >> + *
> >> + * Return: fdt on success, or NULL errno on error.
> >> + */
> >> +void *of_kexec_alloc_and_setup_fdt(const struct kimage *image,
> >> +                               unsigned long initrd_load_addr,
> >> +                               unsigned long initrd_len,
> >> +                               const char *cmdline)
> >> +{
> >> +    void *fdt;
> >> +    int ret, chosen_node;
> >> +    const void *prop;
> >> +    unsigned long fdt_size;
> >> +
> >> +    fdt_size = fdt_totalsize(initial_boot_params) +
> >> +               (cmdline ? strlen(cmdline) : 0) +
> >> +               FDT_EXTRA_SPACE;
> >
> > Just adding 4 KB to initial_boot_params won't be enough for crash
> > kernels on ppc64. The current powerpc code doubles the size of
> > initial_boot_params (which is normally larger than 4 KB) and even that
> > isn't enough. A patch was added to powerpc/next today which uses a more
> > precise (but arch-specific) formula:
> >
> > https://lore.kernel.org/linuxppc-dev/161243826811.119001.14083048209224609814.stgit@hbathini/
> >
> > So I believe we need a hook here where architectures can provide their
> > own specific calculation for the size of the fdt. Perhaps a weakly
> > defined function providing a default implementation which an
> > arch-specific file can override (a la arch_kexec_kernel_image_load())?
> >
> > Then the powerpc specific hook would be the kexec_fdt_totalsize_ppc64()
> > function from the patch I linked above.
> >
>
> Do you think it'd better to add "fdt_size" parameter to
> of_kexec_alloc_and_setup_fdt() so that the caller can provide the
> desired FDT buffer size?

Yes, I guess so. But please define the param as extra size, not total
size. The kernel command line size addition can be in the common code.

The above change is also going to conflict, so I think this may have
to wait. Or I'll take the common and arm bits and powerpc can be
converted next cycle (or after the merge window).

Rob

^ permalink raw reply

* Re: [PATCH 2/2] powerpc/uaccess: Move might_fault() into user_access_begin()
From: Christophe Leroy @ 2021-02-12 16:10 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev; +Cc: aik
In-Reply-To: <20210208135717.2618798-2-mpe@ellerman.id.au>



Le 08/02/2021 à 14:57, Michael Ellerman a écrit :
> We have a might_fault() check in __unsafe_put_user_goto(), but that is
> dangerous as it potentially calls lots of code while user access is
> enabled.
> 
> It also triggers the check Alexey added to the irq restore path to
> catch cases like that:
> 
>    WARNING: CPU: 30 PID: 1 at arch/powerpc/include/asm/book3s/64/kup.h:324 arch_local_irq_restore+0x160/0x190
>    NIP arch_local_irq_restore+0x160/0x190
>    LR  lock_is_held_type+0x140/0x200
>    Call Trace:
>      0xc00000007f392ff8 (unreliable)
>      ___might_sleep+0x180/0x320
>      __might_fault+0x50/0xe0
>      filldir64+0x2d0/0x5d0
>      call_filldir+0xc8/0x180
>      ext4_readdir+0x948/0xb40
>      iterate_dir+0x1ec/0x240
>      sys_getdents64+0x80/0x290
>      system_call_exception+0x160/0x280
>      system_call_common+0xf0/0x27c
> 
> So remove the might fault check from unsafe_put_user().
> 
> Any call to unsafe_put_user() must be inside a region that's had user
> access enabled with user_access_begin(), so move the might_fault() in
> there. That also allows us to drop the is_kernel_addr() test, because
> there should be no code using user_access_begin() in order to access a
> kernel address.

x86 and mips only have might_fault() on get_user() and put_user(), neither on __get_user() nor on 
__put_user() nor on the unsafe alternative.

When have might_fault() in __get_user_nocheck() that is used by __get_user() and 
__get_user_allowed() ie by unsafe_get_user().

Shoudln't those be dropped as well ?

Christophe

> 
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> ---
>   arch/powerpc/include/asm/uaccess.h | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
> index 70347ee34c94..71640eca7341 100644
> --- a/arch/powerpc/include/asm/uaccess.h
> +++ b/arch/powerpc/include/asm/uaccess.h
> @@ -214,8 +214,6 @@ do {								\
>   #define __unsafe_put_user_goto(x, ptr, size, label)		\
>   do {								\
>   	__typeof__(*(ptr)) __user *__pu_addr = (ptr);		\
> -	if (!is_kernel_addr((unsigned long)__pu_addr))		\
> -		might_fault();					\
>   	__chk_user_ptr(ptr);					\
>   	__put_user_size_goto((x), __pu_addr, (size), label);	\
>   } while (0)
> @@ -494,6 +492,8 @@ extern void memcpy_page_flushcache(char *to, struct page *page, size_t offset,
>   
>   static __must_check inline bool user_access_begin(const void __user *ptr, size_t len)
>   {
> +	might_fault();
> +
>   	if (unlikely(!access_ok(ptr, len)))
>   		return false;
>   	allow_read_write_user((void __user *)ptr, ptr, len);
> 

^ permalink raw reply

* [PATCH] powerpc/pseries: Don't enforce MSI affinity with kdump
From: Greg Kurz @ 2021-02-12 16:41 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: lvivier, Greg Kurz, stable, linux-kernel, Cédric Le Goater,
	linuxppc-dev

Depending on the number of online CPUs in the original kernel, it is
likely for CPU #0 to be offline in a kdump kernel. The associated IRQs
in the affinity mappings provided by irq_create_affinity_masks() are
thus not started by irq_startup(), as per-design with managed IRQs.

This can be a problem with multi-queue block devices driven by blk-mq :
such a non-started IRQ is very likely paired with the single queue
enforced by blk-mq during kdump (see blk_mq_alloc_tag_set()). This
causes the device to remain silent and likely hangs the guest at
some point.

This is a regression caused by commit 9ea69a55b3b9 ("powerpc/pseries:
Pass MSI affinity to irq_create_mapping()"). Note that this only happens
with the XIVE interrupt controller because XICS has a workaround to bypass
affinity, which is activated during kdump with the "noirqdistrib" kernel
parameter.

The issue comes from a combination of factors:
- discrepancy between the number of queues detected by the multi-queue
  block driver, that was used to create the MSI vectors, and the single
  queue mode enforced later on by blk-mq because of kdump (i.e. keeping
  all queues fixes the issue)
- CPU#0 offline (i.e. kdump always succeed with CPU#0)

Given that I couldn't reproduce on x86, which seems to always have CPU#0
online even during kdump, I'm not sure where this should be fixed. Hence
going for another approach : fine-grained affinity is for performance
and we don't really care about that during kdump. Simply revert to the
previous working behavior of ignoring affinity masks in this case only.

Fixes: 9ea69a55b3b9 ("powerpc/pseries: Pass MSI affinity to irq_create_mapping()")
Cc: lvivier@redhat.com
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kurz <groug@kaod.org>
---
 arch/powerpc/platforms/pseries/msi.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/msi.c b/arch/powerpc/platforms/pseries/msi.c
index b3ac2455faad..29d04b83288d 100644
--- a/arch/powerpc/platforms/pseries/msi.c
+++ b/arch/powerpc/platforms/pseries/msi.c
@@ -458,8 +458,28 @@ static int rtas_setup_msi_irqs(struct pci_dev *pdev, int nvec_in, int type)
 			return hwirq;
 		}
 
-		virq = irq_create_mapping_affinity(NULL, hwirq,
-						   entry->affinity);
+		/*
+		 * Depending on the number of online CPUs in the original
+		 * kernel, it is likely for CPU #0 to be offline in a kdump
+		 * kernel. The associated IRQs in the affinity mappings
+		 * provided by irq_create_affinity_masks() are thus not
+		 * started by irq_startup(), as per-design for managed IRQs.
+		 * This can be a problem with multi-queue block devices driven
+		 * by blk-mq : such a non-started IRQ is very likely paired
+		 * with the single queue enforced by blk-mq during kdump (see
+		 * blk_mq_alloc_tag_set()). This causes the device to remain
+		 * silent and likely hangs the guest at some point.
+		 *
+		 * We don't really care for fine-grained affinity when doing
+		 * kdump actually : simply ignore the pre-computed affinity
+		 * masks in this case and let the default mask with all CPUs
+		 * be used when creating the IRQ mappings.
+		 */
+		if (is_kdump_kernel())
+			virq = irq_create_mapping(NULL, hwirq);
+		else
+			virq = irq_create_mapping_affinity(NULL, hwirq,
+							   entry->affinity);
 
 		if (!virq) {
 			pr_debug("rtas_msi: Failed mapping hwirq %d\n", hwirq);
-- 
2.26.2


^ permalink raw reply related

* Re: [PATCH] powerpc/pseries: Don't enforce MSI affinity with kdump
From: Laurent Vivier @ 2021-02-12 16:51 UTC (permalink / raw)
  To: Greg Kurz, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel, stable, Cédric Le Goater
In-Reply-To: <20210212164132.821332-1-groug@kaod.org>

On 12/02/2021 17:41, Greg Kurz wrote:
> Depending on the number of online CPUs in the original kernel, it is
> likely for CPU #0 to be offline in a kdump kernel. The associated IRQs
> in the affinity mappings provided by irq_create_affinity_masks() are
> thus not started by irq_startup(), as per-design with managed IRQs.
> 
> This can be a problem with multi-queue block devices driven by blk-mq :
> such a non-started IRQ is very likely paired with the single queue
> enforced by blk-mq during kdump (see blk_mq_alloc_tag_set()). This
> causes the device to remain silent and likely hangs the guest at
> some point.
> 
> This is a regression caused by commit 9ea69a55b3b9 ("powerpc/pseries:
> Pass MSI affinity to irq_create_mapping()"). Note that this only happens
> with the XIVE interrupt controller because XICS has a workaround to bypass
> affinity, which is activated during kdump with the "noirqdistrib" kernel
> parameter.
> 
> The issue comes from a combination of factors:
> - discrepancy between the number of queues detected by the multi-queue
>   block driver, that was used to create the MSI vectors, and the single
>   queue mode enforced later on by blk-mq because of kdump (i.e. keeping
>   all queues fixes the issue)
> - CPU#0 offline (i.e. kdump always succeed with CPU#0)
> 
> Given that I couldn't reproduce on x86, which seems to always have CPU#0
> online even during kdump, I'm not sure where this should be fixed. Hence
> going for another approach : fine-grained affinity is for performance
> and we don't really care about that during kdump. Simply revert to the
> previous working behavior of ignoring affinity masks in this case only.
> 
> Fixes: 9ea69a55b3b9 ("powerpc/pseries: Pass MSI affinity to irq_create_mapping()")
> Cc: lvivier@redhat.com
> Cc: stable@vger.kernel.org
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
>  arch/powerpc/platforms/pseries/msi.c | 24 ++++++++++++++++++++++--
>  1 file changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/msi.c b/arch/powerpc/platforms/pseries/msi.c
> index b3ac2455faad..29d04b83288d 100644
> --- a/arch/powerpc/platforms/pseries/msi.c
> +++ b/arch/powerpc/platforms/pseries/msi.c
> @@ -458,8 +458,28 @@ static int rtas_setup_msi_irqs(struct pci_dev *pdev, int nvec_in, int type)
>  			return hwirq;
>  		}
>  
> -		virq = irq_create_mapping_affinity(NULL, hwirq,
> -						   entry->affinity);
> +		/*
> +		 * Depending on the number of online CPUs in the original
> +		 * kernel, it is likely for CPU #0 to be offline in a kdump
> +		 * kernel. The associated IRQs in the affinity mappings
> +		 * provided by irq_create_affinity_masks() are thus not
> +		 * started by irq_startup(), as per-design for managed IRQs.
> +		 * This can be a problem with multi-queue block devices driven
> +		 * by blk-mq : such a non-started IRQ is very likely paired
> +		 * with the single queue enforced by blk-mq during kdump (see
> +		 * blk_mq_alloc_tag_set()). This causes the device to remain
> +		 * silent and likely hangs the guest at some point.
> +		 *
> +		 * We don't really care for fine-grained affinity when doing
> +		 * kdump actually : simply ignore the pre-computed affinity
> +		 * masks in this case and let the default mask with all CPUs
> +		 * be used when creating the IRQ mappings.
> +		 */
> +		if (is_kdump_kernel())
> +			virq = irq_create_mapping(NULL, hwirq);
> +		else
> +			virq = irq_create_mapping_affinity(NULL, hwirq,
> +							   entry->affinity);
>  
>  		if (!virq) {
>  			pr_debug("rtas_msi: Failed mapping hwirq %d\n", hwirq);
> 

Reviewed-by: Laurent Vivier <lvivier@redhat.com>


^ permalink raw reply

* Re: [PATCH v17 02/10] of: Add a common kexec FDT setup function
From: Lakshmi Ramasubramanian @ 2021-02-12 17:19 UTC (permalink / raw)
  To: Rob Herring
  Cc: Mark Rutland, tao.li, Mimi Zohar, Paul Mackerras,
	vincenzo.frascino, Frank Rowand, Sasha Levin, Masahiro Yamada,
	James Morris, AKASHI, Takahiro, linux-arm-kernel, Catalin Marinas,
	Serge E. Hallyn, devicetree, Pavel Tatashin, Will Deacon,
	Prakhar Srivastava, Hsin-Yi Wang, Allison Randal,
	Christophe Leroy, Matthias Brugger, balajib, dmitry.kasatkin,
	linux-kernel@vger.kernel.org, James Morse, Greg Kroah-Hartman,
	Joe Perches, linux-integrity, linuxppc-dev, Thiago Jung Bauermann
In-Reply-To: <CAL_JsqJ3sDzjsJXtb6EzE77BL+PhUxDJYUngLTqcm0popd7Ajw@mail.gmail.com>

On 2/12/21 6:38 AM, Rob Herring wrote:
> On Thu, Feb 11, 2021 at 7:17 PM Lakshmi Ramasubramanian
> <nramas@linux.microsoft.com> wrote:
>>
>> On 2/11/21 5:09 PM, Thiago Jung Bauermann wrote:
>>>
>>> There's actually a complication that I just noticed and needs to be
>>> addressed. More below.
>>>
>>
>> <...>
>>
>>>> +
>>>> +/*
>>>> + * of_kexec_alloc_and_setup_fdt - Alloc and setup a new Flattened Device Tree
>>>> + *
>>>> + * @image:          kexec image being loaded.
>>>> + * @initrd_load_addr:       Address where the next initrd will be loaded.
>>>> + * @initrd_len:             Size of the next initrd, or 0 if there will be none.
>>>> + * @cmdline:                Command line for the next kernel, or NULL if there will
>>>> + *                  be none.
>>>> + *
>>>> + * Return: fdt on success, or NULL errno on error.
>>>> + */
>>>> +void *of_kexec_alloc_and_setup_fdt(const struct kimage *image,
>>>> +                               unsigned long initrd_load_addr,
>>>> +                               unsigned long initrd_len,
>>>> +                               const char *cmdline)
>>>> +{
>>>> +    void *fdt;
>>>> +    int ret, chosen_node;
>>>> +    const void *prop;
>>>> +    unsigned long fdt_size;
>>>> +
>>>> +    fdt_size = fdt_totalsize(initial_boot_params) +
>>>> +               (cmdline ? strlen(cmdline) : 0) +
>>>> +               FDT_EXTRA_SPACE;
>>>
>>> Just adding 4 KB to initial_boot_params won't be enough for crash
>>> kernels on ppc64. The current powerpc code doubles the size of
>>> initial_boot_params (which is normally larger than 4 KB) and even that
>>> isn't enough. A patch was added to powerpc/next today which uses a more
>>> precise (but arch-specific) formula:
>>>
>>> https://lore.kernel.org/linuxppc-dev/161243826811.119001.14083048209224609814.stgit@hbathini/
>>>
>>> So I believe we need a hook here where architectures can provide their
>>> own specific calculation for the size of the fdt. Perhaps a weakly
>>> defined function providing a default implementation which an
>>> arch-specific file can override (a la arch_kexec_kernel_image_load())?
>>>
>>> Then the powerpc specific hook would be the kexec_fdt_totalsize_ppc64()
>>> function from the patch I linked above.
>>>
>>
>> Do you think it'd better to add "fdt_size" parameter to
>> of_kexec_alloc_and_setup_fdt() so that the caller can provide the
>> desired FDT buffer size?
> 
> Yes, I guess so. But please define the param as extra size, not total
> size. The kernel command line size addition can be in the common code.

Will do. Just to clarify -

The common code will do:

fdt_totalsize(initial_boot_params) + strlen(cmdline) + extra_fdt_size

The caller will pass "extra_fdt_size"
ARM64 => 4KB
PPC64 => fdt_totalsize(initial_boot_params) - which will be updated when 
the patch Thiago had referred to is merged.

> 
> The above change is also going to conflict, so I think this may have
> to wait. Or I'll take the common and arm bits and powerpc can be
> converted next cycle (or after the merge window).
> 

thanks.

  -lakshmi



^ permalink raw reply

* Re: [PATCH] powerpc/pseries: Don't enforce MSI affinity with kdump
From: Cédric Le Goater @ 2021-02-12 17:27 UTC (permalink / raw)
  To: Greg Kurz, Michael Ellerman; +Cc: lvivier, linuxppc-dev, linux-kernel, stable
In-Reply-To: <20210212164132.821332-1-groug@kaod.org>

On 2/12/21 5:41 PM, Greg Kurz wrote:
> Depending on the number of online CPUs in the original kernel, it is
> likely for CPU #0 to be offline in a kdump kernel. The associated IRQs
> in the affinity mappings provided by irq_create_affinity_masks() are
> thus not started by irq_startup(), as per-design with managed IRQs.
> 
> This can be a problem with multi-queue block devices driven by blk-mq :
> such a non-started IRQ is very likely paired with the single queue
> enforced by blk-mq during kdump (see blk_mq_alloc_tag_set()). This
> causes the device to remain silent and likely hangs the guest at
> some point.
> 
> This is a regression caused by commit 9ea69a55b3b9 ("powerpc/pseries:
> Pass MSI affinity to irq_create_mapping()"). Note that this only happens
> with the XIVE interrupt controller because XICS has a workaround to bypass
> affinity, which is activated during kdump with the "noirqdistrib" kernel
> parameter.
> 
> The issue comes from a combination of factors:
> - discrepancy between the number of queues detected by the multi-queue
>   block driver, that was used to create the MSI vectors, and the single
>   queue mode enforced later on by blk-mq because of kdump (i.e. keeping
>   all queues fixes the issue)
> - CPU#0 offline (i.e. kdump always succeed with CPU#0)
> 
> Given that I couldn't reproduce on x86, which seems to always have CPU#0
> online even during kdump, I'm not sure where this should be fixed. Hence
> going for another approach : fine-grained affinity is for performance
> and we don't really care about that during kdump. Simply revert to the
> previous working behavior of ignoring affinity masks in this case only.
> 
> Fixes: 9ea69a55b3b9 ("powerpc/pseries: Pass MSI affinity to irq_create_mapping()")
> Cc: lvivier@redhat.com
> Cc: stable@vger.kernel.org
> Signed-off-by: Greg Kurz <groug@kaod.org>


Reviewed-by: Cédric Le Goater <clg@kaod.org>

Thanks for tracking this issue. 

This layer needs a rework. Patches adding a MSI domain should be ready 
in a couple of releases. Hopefully. 

C. 

> ---
>  arch/powerpc/platforms/pseries/msi.c | 24 ++++++++++++++++++++++--
>  1 file changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/msi.c b/arch/powerpc/platforms/pseries/msi.c
> index b3ac2455faad..29d04b83288d 100644
> --- a/arch/powerpc/platforms/pseries/msi.c
> +++ b/arch/powerpc/platforms/pseries/msi.c
> @@ -458,8 +458,28 @@ static int rtas_setup_msi_irqs(struct pci_dev *pdev, int nvec_in, int type)
>  			return hwirq;
>  		}
>  
> -		virq = irq_create_mapping_affinity(NULL, hwirq,
> -						   entry->affinity);
> +		/*
> +		 * Depending on the number of online CPUs in the original
> +		 * kernel, it is likely for CPU #0 to be offline in a kdump
> +		 * kernel. The associated IRQs in the affinity mappings
> +		 * provided by irq_create_affinity_masks() are thus not
> +		 * started by irq_startup(), as per-design for managed IRQs.
> +		 * This can be a problem with multi-queue block devices driven
> +		 * by blk-mq : such a non-started IRQ is very likely paired
> +		 * with the single queue enforced by blk-mq during kdump (see
> +		 * blk_mq_alloc_tag_set()). This causes the device to remain
> +		 * silent and likely hangs the guest at some point.
> +		 *
> +		 * We don't really care for fine-grained affinity when doing
> +		 * kdump actually : simply ignore the pre-computed affinity
> +		 * masks in this case and let the default mask with all CPUs
> +		 * be used when creating the IRQ mappings.
> +		 */
> +		if (is_kdump_kernel())
> +			virq = irq_create_mapping(NULL, hwirq);
> +		else
> +			virq = irq_create_mapping_affinity(NULL, hwirq,
> +							   entry->affinity);
>  
>  		if (!virq) {
>  			pr_debug("rtas_msi: Failed mapping hwirq %d\n", hwirq);
> 


^ permalink raw reply

* Re: [PATCH v17 02/10] of: Add a common kexec FDT setup function
From: Rob Herring @ 2021-02-12 18:24 UTC (permalink / raw)
  To: Lakshmi Ramasubramanian
  Cc: Mark Rutland, tao.li, Mimi Zohar, Paul Mackerras,
	vincenzo.frascino, Frank Rowand, Sasha Levin, Masahiro Yamada,
	James Morris, AKASHI, Takahiro, linux-arm-kernel, Catalin Marinas,
	Serge E. Hallyn, devicetree, Pavel Tatashin, Will Deacon,
	Prakhar Srivastava, Hsin-Yi Wang, Allison Randal,
	Christophe Leroy, Matthias Brugger, balajib, dmitry.kasatkin,
	linux-kernel@vger.kernel.org, James Morse, Greg Kroah-Hartman,
	Joe Perches, linux-integrity, linuxppc-dev, Thiago Jung Bauermann
In-Reply-To: <55685b61-dac0-2f24-f74a-939acf74a4f2@linux.microsoft.com>

On Fri, Feb 12, 2021 at 11:19 AM Lakshmi Ramasubramanian
<nramas@linux.microsoft.com> wrote:
>
> On 2/12/21 6:38 AM, Rob Herring wrote:
> > On Thu, Feb 11, 2021 at 7:17 PM Lakshmi Ramasubramanian
> > <nramas@linux.microsoft.com> wrote:
> >>
> >> On 2/11/21 5:09 PM, Thiago Jung Bauermann wrote:
> >>>
> >>> There's actually a complication that I just noticed and needs to be
> >>> addressed. More below.
> >>>
> >>
> >> <...>
> >>
> >>>> +
> >>>> +/*
> >>>> + * of_kexec_alloc_and_setup_fdt - Alloc and setup a new Flattened Device Tree
> >>>> + *
> >>>> + * @image:          kexec image being loaded.
> >>>> + * @initrd_load_addr:       Address where the next initrd will be loaded.
> >>>> + * @initrd_len:             Size of the next initrd, or 0 if there will be none.
> >>>> + * @cmdline:                Command line for the next kernel, or NULL if there will
> >>>> + *                  be none.
> >>>> + *
> >>>> + * Return: fdt on success, or NULL errno on error.
> >>>> + */
> >>>> +void *of_kexec_alloc_and_setup_fdt(const struct kimage *image,
> >>>> +                               unsigned long initrd_load_addr,
> >>>> +                               unsigned long initrd_len,
> >>>> +                               const char *cmdline)
> >>>> +{
> >>>> +    void *fdt;
> >>>> +    int ret, chosen_node;
> >>>> +    const void *prop;
> >>>> +    unsigned long fdt_size;
> >>>> +
> >>>> +    fdt_size = fdt_totalsize(initial_boot_params) +
> >>>> +               (cmdline ? strlen(cmdline) : 0) +
> >>>> +               FDT_EXTRA_SPACE;
> >>>
> >>> Just adding 4 KB to initial_boot_params won't be enough for crash
> >>> kernels on ppc64. The current powerpc code doubles the size of
> >>> initial_boot_params (which is normally larger than 4 KB) and even that
> >>> isn't enough. A patch was added to powerpc/next today which uses a more
> >>> precise (but arch-specific) formula:
> >>>
> >>> https://lore.kernel.org/linuxppc-dev/161243826811.119001.14083048209224609814.stgit@hbathini/
> >>>
> >>> So I believe we need a hook here where architectures can provide their
> >>> own specific calculation for the size of the fdt. Perhaps a weakly
> >>> defined function providing a default implementation which an
> >>> arch-specific file can override (a la arch_kexec_kernel_image_load())?
> >>>
> >>> Then the powerpc specific hook would be the kexec_fdt_totalsize_ppc64()
> >>> function from the patch I linked above.
> >>>
> >>
> >> Do you think it'd better to add "fdt_size" parameter to
> >> of_kexec_alloc_and_setup_fdt() so that the caller can provide the
> >> desired FDT buffer size?
> >
> > Yes, I guess so. But please define the param as extra size, not total
> > size. The kernel command line size addition can be in the common code.
>
> Will do. Just to clarify -
>
> The common code will do:
>
> fdt_totalsize(initial_boot_params) + strlen(cmdline) + extra_fdt_size
>
> The caller will pass "extra_fdt_size"
> ARM64 => 4KB
> PPC64 => fdt_totalsize(initial_boot_params) - which will be updated when
> the patch Thiago had referred to is merged.

Yes, I'd leave the 4KB in there by default and arm64 use 0.

Rob

^ permalink raw reply

* Re: [PATCH v17 02/10] of: Add a common kexec FDT setup function
From: Lakshmi Ramasubramanian @ 2021-02-12 18:27 UTC (permalink / raw)
  To: Rob Herring
  Cc: Mark Rutland, tao.li, Mimi Zohar, Paul Mackerras,
	vincenzo.frascino, Frank Rowand, Sasha Levin, Masahiro Yamada,
	James Morris, AKASHI, Takahiro, linux-arm-kernel, Catalin Marinas,
	Serge E. Hallyn, devicetree, Pavel Tatashin, Will Deacon,
	Prakhar Srivastava, Hsin-Yi Wang, Allison Randal,
	Christophe Leroy, Matthias Brugger, balajib, dmitry.kasatkin,
	linux-kernel@vger.kernel.org, James Morse, Greg Kroah-Hartman,
	Joe Perches, linux-integrity, linuxppc-dev, Thiago Jung Bauermann
In-Reply-To: <CAL_JsqKDCgtJngxqMCRdC9evEQpHnryEaMvfgYEh0Mcto6dLHA@mail.gmail.com>

On 2/12/21 10:24 AM, Rob Herring wrote:
> On Fri, Feb 12, 2021 at 11:19 AM Lakshmi Ramasubramanian
> <nramas@linux.microsoft.com> wrote:
>>
>> On 2/12/21 6:38 AM, Rob Herring wrote:
>>> On Thu, Feb 11, 2021 at 7:17 PM Lakshmi Ramasubramanian
>>> <nramas@linux.microsoft.com> wrote:
>>>>
>>>> On 2/11/21 5:09 PM, Thiago Jung Bauermann wrote:
>>>>>
>>>>> There's actually a complication that I just noticed and needs to be
>>>>> addressed. More below.
>>>>>
>>>>
>>>> <...>
>>>>
>>>>>> +
>>>>>> +/*
>>>>>> + * of_kexec_alloc_and_setup_fdt - Alloc and setup a new Flattened Device Tree
>>>>>> + *
>>>>>> + * @image:          kexec image being loaded.
>>>>>> + * @initrd_load_addr:       Address where the next initrd will be loaded.
>>>>>> + * @initrd_len:             Size of the next initrd, or 0 if there will be none.
>>>>>> + * @cmdline:                Command line for the next kernel, or NULL if there will
>>>>>> + *                  be none.
>>>>>> + *
>>>>>> + * Return: fdt on success, or NULL errno on error.
>>>>>> + */
>>>>>> +void *of_kexec_alloc_and_setup_fdt(const struct kimage *image,
>>>>>> +                               unsigned long initrd_load_addr,
>>>>>> +                               unsigned long initrd_len,
>>>>>> +                               const char *cmdline)
>>>>>> +{
>>>>>> +    void *fdt;
>>>>>> +    int ret, chosen_node;
>>>>>> +    const void *prop;
>>>>>> +    unsigned long fdt_size;
>>>>>> +
>>>>>> +    fdt_size = fdt_totalsize(initial_boot_params) +
>>>>>> +               (cmdline ? strlen(cmdline) : 0) +
>>>>>> +               FDT_EXTRA_SPACE;
>>>>>
>>>>> Just adding 4 KB to initial_boot_params won't be enough for crash
>>>>> kernels on ppc64. The current powerpc code doubles the size of
>>>>> initial_boot_params (which is normally larger than 4 KB) and even that
>>>>> isn't enough. A patch was added to powerpc/next today which uses a more
>>>>> precise (but arch-specific) formula:
>>>>>
>>>>> https://lore.kernel.org/linuxppc-dev/161243826811.119001.14083048209224609814.stgit@hbathini/
>>>>>
>>>>> So I believe we need a hook here where architectures can provide their
>>>>> own specific calculation for the size of the fdt. Perhaps a weakly
>>>>> defined function providing a default implementation which an
>>>>> arch-specific file can override (a la arch_kexec_kernel_image_load())?
>>>>>
>>>>> Then the powerpc specific hook would be the kexec_fdt_totalsize_ppc64()
>>>>> function from the patch I linked above.
>>>>>
>>>>
>>>> Do you think it'd better to add "fdt_size" parameter to
>>>> of_kexec_alloc_and_setup_fdt() so that the caller can provide the
>>>> desired FDT buffer size?
>>>
>>> Yes, I guess so. But please define the param as extra size, not total
>>> size. The kernel command line size addition can be in the common code.
>>
>> Will do. Just to clarify -
>>
>> The common code will do:
>>
>> fdt_totalsize(initial_boot_params) + strlen(cmdline) + extra_fdt_size
>>
>> The caller will pass "extra_fdt_size"
>> ARM64 => 4KB
>> PPC64 => fdt_totalsize(initial_boot_params) - which will be updated when
>> the patch Thiago had referred to is merged.
> 
> Yes, I'd leave the 4KB in there by default and arm64 use 0.
> 

Sounds good.

common:
fdt_totalsize(initial_boot_params) + strlen(cmdline) + 0x1000 + extra

arm64 => 0 for extra
ppc => fdt_totalsize(initial_boot_params) for extra.

  -lakshmi


^ permalink raw reply

* Re: [PATCH v17 02/10] of: Add a common kexec FDT setup function
From: Thiago Jung Bauermann @ 2021-02-12 19:39 UTC (permalink / raw)
  To: Lakshmi Ramasubramanian
  Cc: Mark Rutland, tao.li, Mimi Zohar, Paul Mackerras,
	vincenzo.frascino, Frank Rowand, Sasha Levin, Rob Herring,
	Masahiro Yamada, James Morris, AKASHI, Takahiro, linux-arm-kernel,
	Catalin Marinas, Serge E. Hallyn, devicetree, Pavel Tatashin,
	Will Deacon, Prakhar Srivastava, Hsin-Yi Wang, Allison Randal,
	Christophe Leroy, Matthias Brugger, balajib, dmitry.kasatkin,
	linux-kernel@vger.kernel.org, James Morse, Greg Kroah-Hartman,
	Joe Perches, linux-integrity, linuxppc-dev
In-Reply-To: <11aff288-feaa-8e43-fcda-12bc12fbf4cf@linux.microsoft.com>


Lakshmi Ramasubramanian <nramas@linux.microsoft.com> writes:

> On 2/12/21 10:24 AM, Rob Herring wrote:
>> On Fri, Feb 12, 2021 at 11:19 AM Lakshmi Ramasubramanian
>> <nramas@linux.microsoft.com> wrote:
>>>
>>> On 2/12/21 6:38 AM, Rob Herring wrote:
>>>> On Thu, Feb 11, 2021 at 7:17 PM Lakshmi Ramasubramanian
>>>> <nramas@linux.microsoft.com> wrote:
>>>>>
>>>>> On 2/11/21 5:09 PM, Thiago Jung Bauermann wrote:
>>>>>>
>>>>>> There's actually a complication that I just noticed and needs to be
>>>>>> addressed. More below.
>>>>>>
>>>>>
>>>>> <...>
>>>>>
>>>>>>> +
>>>>>>> +/*
>>>>>>> + * of_kexec_alloc_and_setup_fdt - Alloc and setup a new Flattened Device Tree
>>>>>>> + *
>>>>>>> + * @image:          kexec image being loaded.
>>>>>>> + * @initrd_load_addr:       Address where the next initrd will be loaded.
>>>>>>> + * @initrd_len:             Size of the next initrd, or 0 if there will be none.
>>>>>>> + * @cmdline:                Command line for the next kernel, or NULL if there will
>>>>>>> + *                  be none.
>>>>>>> + *
>>>>>>> + * Return: fdt on success, or NULL errno on error.
>>>>>>> + */
>>>>>>> +void *of_kexec_alloc_and_setup_fdt(const struct kimage *image,
>>>>>>> +                               unsigned long initrd_load_addr,
>>>>>>> +                               unsigned long initrd_len,
>>>>>>> +                               const char *cmdline)
>>>>>>> +{
>>>>>>> +    void *fdt;
>>>>>>> +    int ret, chosen_node;
>>>>>>> +    const void *prop;
>>>>>>> +    unsigned long fdt_size;
>>>>>>> +
>>>>>>> +    fdt_size = fdt_totalsize(initial_boot_params) +
>>>>>>> +               (cmdline ? strlen(cmdline) : 0) +
>>>>>>> +               FDT_EXTRA_SPACE;
>>>>>>
>>>>>> Just adding 4 KB to initial_boot_params won't be enough for crash
>>>>>> kernels on ppc64. The current powerpc code doubles the size of
>>>>>> initial_boot_params (which is normally larger than 4 KB) and even that
>>>>>> isn't enough. A patch was added to powerpc/next today which uses a more
>>>>>> precise (but arch-specific) formula:
>>>>>>
>>>>>> https://lore.kernel.org/linuxppc-dev/161243826811.119001.14083048209224609814.stgit@hbathini/
>>>>>>
>>>>>> So I believe we need a hook here where architectures can provide their
>>>>>> own specific calculation for the size of the fdt. Perhaps a weakly
>>>>>> defined function providing a default implementation which an
>>>>>> arch-specific file can override (a la arch_kexec_kernel_image_load())?
>>>>>>
>>>>>> Then the powerpc specific hook would be the kexec_fdt_totalsize_ppc64()
>>>>>> function from the patch I linked above.
>>>>>>
>>>>>
>>>>> Do you think it'd better to add "fdt_size" parameter to
>>>>> of_kexec_alloc_and_setup_fdt() so that the caller can provide the
>>>>> desired FDT buffer size?
>>>>
>>>> Yes, I guess so. But please define the param as extra size, not total
>>>> size. The kernel command line size addition can be in the common code.
>>>
>>> Will do. Just to clarify -
>>>
>>> The common code will do:
>>>
>>> fdt_totalsize(initial_boot_params) + strlen(cmdline) + extra_fdt_size
>>>
>>> The caller will pass "extra_fdt_size"
>>> ARM64 => 4KB
>>> PPC64 => fdt_totalsize(initial_boot_params) - which will be updated when
>>> the patch Thiago had referred to is merged.
>> Yes, I'd leave the 4KB in there by default and arm64 use 0.
>> 
>
> Sounds good.
>
> common:
> fdt_totalsize(initial_boot_params) + strlen(cmdline) + 0x1000 + extra
>
> arm64 => 0 for extra
> ppc => fdt_totalsize(initial_boot_params) for extra.

Looks good to me.

-- 
Thiago Jung Bauermann
IBM Linux Technology Center

^ permalink raw reply

* Re: [RFC PATCH 2/9] KVM: PPC: Book3S 64: Move GUEST_MODE_SKIP test into KVM
From: Fabiano Rosas @ 2021-02-12 20:33 UTC (permalink / raw)
  To: Nicholas Piggin, kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin
In-Reply-To: <20210202030313.3509446-3-npiggin@gmail.com>

Nicholas Piggin <npiggin@gmail.com> writes:

> Move the GUEST_MODE_SKIP logic into KVM code. This is quite a KVM
> internal detail that has no real need to be in common handlers.

LGTM,

>
> (XXX: Need to confirm CBE handlers etc)

Do these interrupts exist only in Cell? I see that they set HSRRs and
MSR_HV, but CPU_FTRS_CELL does not contain CPU_HVMODE. So I don't get
why they use the KVM macros.

And for the instruction_breakpoint (0x1300) I think it would help if we
could at least restrict when it is built. But I don't know what
ISA/processor version it is from.

>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  arch/powerpc/kernel/exceptions-64s.S | 64 ----------------------------
>  arch/powerpc/kvm/book3s_64_entry.S   | 50 ++++++++++++++++++----
>  2 files changed, 42 insertions(+), 72 deletions(-)
>
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index 65659ea3cec4..e6f7fc7c61a1 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -133,7 +133,6 @@ name:
>  #define IBRANCH_TO_COMMON	.L_IBRANCH_TO_COMMON_\name\() /* ENTRY branch to common */
>  #define IREALMODE_COMMON	.L_IREALMODE_COMMON_\name\() /* Common runs in realmode */
>  #define IMASK		.L_IMASK_\name\()	/* IRQ soft-mask bit */
> -#define IKVM_SKIP	.L_IKVM_SKIP_\name\()	/* Generate KVM skip handler */
>  #define IKVM_REAL	.L_IKVM_REAL_\name\()	/* Real entry tests KVM */
>  #define __IKVM_REAL(name)	.L_IKVM_REAL_ ## name
>  #define IKVM_VIRT	.L_IKVM_VIRT_\name\()	/* Virt entry tests KVM */
> @@ -191,9 +190,6 @@ do_define_int n
>  	.ifndef IMASK
>  		IMASK=0
>  	.endif
> -	.ifndef IKVM_SKIP
> -		IKVM_SKIP=0
> -	.endif
>  	.ifndef IKVM_REAL
>  		IKVM_REAL=0
>  	.endif
> @@ -254,15 +250,10 @@ do_define_int n
>  	.balign IFETCH_ALIGN_BYTES
>  \name\()_kvm:
>
> -	.if IKVM_SKIP
> -	cmpwi	r10,KVM_GUEST_MODE_SKIP
> -	beq	89f
> -	.else
>  BEGIN_FTR_SECTION
>  	ld	r10,IAREA+EX_CFAR(r13)
>  	std	r10,HSTATE_CFAR(r13)
>  END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
> -	.endif
>
>  	ld	r10,IAREA+EX_CTR(r13)
>  	mtctr	r10
> @@ -289,27 +280,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
>  	ori	r12,r12,(IVEC)
>  	.endif
>  	b	kvmppc_interrupt
> -
> -	.if IKVM_SKIP
> -89:	mtocrf	0x80,r9
> -	ld	r10,IAREA+EX_CTR(r13)
> -	mtctr	r10
> -	ld	r9,IAREA+EX_R9(r13)
> -	ld	r10,IAREA+EX_R10(r13)
> -	ld	r11,IAREA+EX_R11(r13)
> -	ld	r12,IAREA+EX_R12(r13)
> -	.if IHSRR_IF_HVMODE
> -	BEGIN_FTR_SECTION
> -	b	kvmppc_skip_Hinterrupt
> -	FTR_SECTION_ELSE
> -	b	kvmppc_skip_interrupt
> -	ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)
> -	.elseif IHSRR
> -	b	kvmppc_skip_Hinterrupt
> -	.else
> -	b	kvmppc_skip_interrupt
> -	.endif
> -	.endif
>  .endm
>
>  #else
> @@ -1128,7 +1098,6 @@ INT_DEFINE_BEGIN(machine_check)
>  	ISET_RI=0
>  	IDAR=1
>  	IDSISR=1
> -	IKVM_SKIP=1
>  	IKVM_REAL=1
>  INT_DEFINE_END(machine_check)
>
> @@ -1419,7 +1388,6 @@ INT_DEFINE_BEGIN(data_access)
>  	IVEC=0x300
>  	IDAR=1
>  	IDSISR=1
> -	IKVM_SKIP=1
>  	IKVM_REAL=1
>  INT_DEFINE_END(data_access)
>
> @@ -1469,7 +1437,6 @@ INT_DEFINE_BEGIN(data_access_slb)
>  	IAREA=PACA_EXSLB
>  	IRECONCILE=0
>  	IDAR=1
> -	IKVM_SKIP=1
>  	IKVM_REAL=1
>  INT_DEFINE_END(data_access_slb)
>
> @@ -2116,7 +2083,6 @@ INT_DEFINE_BEGIN(h_data_storage)
>  	IHSRR=1
>  	IDAR=1
>  	IDSISR=1
> -	IKVM_SKIP=1
>  	IKVM_REAL=1
>  	IKVM_VIRT=1
>  INT_DEFINE_END(h_data_storage)
> @@ -2573,7 +2539,6 @@ EXC_VIRT_NONE(0x5100, 0x100)
>  INT_DEFINE_BEGIN(cbe_system_error)
>  	IVEC=0x1200
>  	IHSRR=1
> -	IKVM_SKIP=1
>  	IKVM_REAL=1
>  INT_DEFINE_END(cbe_system_error)
>
> @@ -2598,7 +2563,6 @@ EXC_VIRT_NONE(0x5200, 0x100)
>  INT_DEFINE_BEGIN(instruction_breakpoint)
>  	IVEC=0x1300
>  #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
> -	IKVM_SKIP=1
>  	IKVM_REAL=1
>  #endif
>  INT_DEFINE_END(instruction_breakpoint)
> @@ -2744,7 +2708,6 @@ EXC_COMMON_BEGIN(denorm_exception_common)
>  INT_DEFINE_BEGIN(cbe_maintenance)
>  	IVEC=0x1600
>  	IHSRR=1
> -	IKVM_SKIP=1
>  	IKVM_REAL=1
>  INT_DEFINE_END(cbe_maintenance)
>
> @@ -2797,7 +2760,6 @@ EXC_COMMON_BEGIN(altivec_assist_common)
>  INT_DEFINE_BEGIN(cbe_thermal)
>  	IVEC=0x1800
>  	IHSRR=1
> -	IKVM_SKIP=1
>  	IKVM_REAL=1
>  INT_DEFINE_END(cbe_thermal)
>
> @@ -3081,32 +3043,6 @@ EXPORT_SYMBOL(do_uaccess_flush)
>  MASKED_INTERRUPT
>  MASKED_INTERRUPT hsrr=1
>
> -#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
> -kvmppc_skip_interrupt:
> -	/*
> -	 * Here all GPRs are unchanged from when the interrupt happened
> -	 * except for r13, which is saved in SPRG_SCRATCH0.
> -	 */
> -	mfspr	r13, SPRN_SRR0
> -	addi	r13, r13, 4
> -	mtspr	SPRN_SRR0, r13
> -	GET_SCRATCH0(r13)
> -	RFI_TO_KERNEL
> -	b	.
> -
> -kvmppc_skip_Hinterrupt:
> -	/*
> -	 * Here all GPRs are unchanged from when the interrupt happened
> -	 * except for r13, which is saved in SPRG_SCRATCH0.
> -	 */
> -	mfspr	r13, SPRN_HSRR0
> -	addi	r13, r13, 4
> -	mtspr	SPRN_HSRR0, r13
> -	GET_SCRATCH0(r13)
> -	HRFI_TO_KERNEL
> -	b	.
> -#endif
> -
>  	/*
>  	 * Relocation-on interrupts: A subset of the interrupts can be delivered
>  	 * with IR=1/DR=1, if AIL==2 and MSR.HV won't be changed by delivering
> diff --git a/arch/powerpc/kvm/book3s_64_entry.S b/arch/powerpc/kvm/book3s_64_entry.S
> index 22e34b95f478..8e7216f3c3ee 100644
> --- a/arch/powerpc/kvm/book3s_64_entry.S
> +++ b/arch/powerpc/kvm/book3s_64_entry.S
> @@ -1,9 +1,10 @@
> +#include <asm/asm-offsets.h>
>  #include <asm/cache.h>
> -#include <asm/ppc_asm.h>
> +#include <asm/exception-64s.h>
>  #include <asm/kvm_asm.h>
> -#include <asm/reg.h>
> -#include <asm/asm-offsets.h>
>  #include <asm/kvm_book3s_asm.h>
> +#include <asm/ppc_asm.h>
> +#include <asm/reg.h>
>
>  /*
>   * We come here from the first-level interrupt handlers.
> @@ -18,17 +19,50 @@ kvmppc_interrupt:
>  	 * guest R12 saved in shadow VCPU SCRATCH0
>  	 * guest R13 saved in SPRN_SCRATCH0
>  	 */
> +	std	r9,HSTATE_SCRATCH2(r13)
> +	lbz	r9,HSTATE_IN_GUEST(r13)
> +	cmpwi	r9,KVM_GUEST_MODE_SKIP
> +	beq	maybe_skip
> +no_skip:
>  #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
> -	std	r9, HSTATE_SCRATCH2(r13)
> -	lbz	r9, HSTATE_IN_GUEST(r13)
> -	cmpwi	r9, KVM_GUEST_MODE_HOST_HV
> +	cmpwi	r9,KVM_GUEST_MODE_HOST_HV
>  	beq	kvmppc_bad_host_intr
>  #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
> -	cmpwi	r9, KVM_GUEST_MODE_GUEST
> -	ld	r9, HSTATE_SCRATCH2(r13)
> +	cmpwi	r9,KVM_GUEST_MODE_GUEST
> +	ld	r9,HSTATE_SCRATCH2(r13)
>  	beq	kvmppc_interrupt_pr
>  #endif
>  	b	kvmppc_interrupt_hv
>  #else
>  	b	kvmppc_interrupt_pr
>  #endif
> +
> +maybe_skip:
> +	cmpwi	r12,0x200
> +	beq	1f
> +	cmpwi	r12,0x300
> +	beq	1f
> +	cmpwi	r12,0x380
> +	beq	1f
> +#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
> +	/* XXX: cbe stuff? instruction breakpoint? */
> +	cmpwi	r12,0xe02
> +	beq	2f
> +#endif
> +	b	no_skip
> +1:	mfspr	r9,SPRN_SRR0
> +	addi	r9,r9,4
> +	mtspr	SPRN_SRR0,r9
> +	ld	r12,HSTATE_SCRATCH0(r13)
> +	ld	r9,HSTATE_SCRATCH2(r13)
> +	GET_SCRATCH0(r13)
> +	RFI_TO_KERNEL
> +#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
> +2:	mfspr	r9,SPRN_HSRR0
> +	addi	r9,r9,4
> +	mtspr	SPRN_HSRR0,r9
> +	ld	r12,HSTATE_SCRATCH0(r13)
> +	ld	r9,HSTATE_SCRATCH2(r13)
> +	GET_SCRATCH0(r13)
> +	HRFI_TO_KERNEL
> +#endif

^ permalink raw reply

* Re: [RFC PATCH 3/9] KVM: PPC: Book3S 64: add hcall interrupt handler
From: Fabiano Rosas @ 2021-02-12 20:33 UTC (permalink / raw)
  To: Nicholas Piggin, kvm-ppc; +Cc: linuxppc-dev, Nicholas Piggin
In-Reply-To: <20210202030313.3509446-4-npiggin@gmail.com>

Nicholas Piggin <npiggin@gmail.com> writes:

> Add a separate hcall entry point. This can be used to deal with the
> different calling convention.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com>

> ---
>  arch/powerpc/kernel/exceptions-64s.S | 4 ++--
>  arch/powerpc/kvm/book3s_64_entry.S   | 4 ++++
>  2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index e6f7fc7c61a1..c25395b5921a 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -2028,13 +2028,13 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
>  	 * Requires __LOAD_FAR_HANDLER beause kvmppc_interrupt lives
>  	 * outside the head section.
>  	 */
> -	__LOAD_FAR_HANDLER(r10, kvmppc_interrupt)
> +	__LOAD_FAR_HANDLER(r10, kvmppc_hcall)
>  	mtctr   r10
>  	ld	r10,PACA_EXGEN+EX_R10(r13)
>  	bctr
>  #else
>  	ld	r10,PACA_EXGEN+EX_R10(r13)
> -	b       kvmppc_interrupt
> +	b       kvmppc_hcall
>  #endif
>  #endif
>
> diff --git a/arch/powerpc/kvm/book3s_64_entry.S b/arch/powerpc/kvm/book3s_64_entry.S
> index 8e7216f3c3ee..3b894b90862f 100644
> --- a/arch/powerpc/kvm/book3s_64_entry.S
> +++ b/arch/powerpc/kvm/book3s_64_entry.S
> @@ -9,6 +9,10 @@
>  /*
>   * We come here from the first-level interrupt handlers.
>   */
> +.global	kvmppc_hcall
> +.balign IFETCH_ALIGN_BYTES
> +kvmppc_hcall:
> +
>  .global	kvmppc_interrupt
>  .balign IFETCH_ALIGN_BYTES
>  kvmppc_interrupt:

^ permalink raw reply

* Re: [PATCH] powerpc/pseries: Don't enforce MSI affinity with kdump
From: kernel test robot @ 2021-02-12 21:05 UTC (permalink / raw)
  To: Greg Kurz, Michael Ellerman
  Cc: lvivier, kbuild-all, Greg Kurz, stable, linux-kernel,
	Cédric Le Goater, linuxppc-dev
In-Reply-To: <20210212164132.821332-1-groug@kaod.org>

[-- Attachment #1: Type: text/plain, Size: 6268 bytes --]

Hi Greg,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on linus/master v5.11-rc7 next-20210211]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Greg-Kurz/powerpc-pseries-Don-t-enforce-MSI-affinity-with-kdump/20210213-004658
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-allyesconfig (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/1e5f7523fcfc57ab9437b8c7b29a974b62bde79d
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Greg-Kurz/powerpc-pseries-Don-t-enforce-MSI-affinity-with-kdump/20210213-004658
        git checkout 1e5f7523fcfc57ab9437b8c7b29a974b62bde79d
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   arch/powerpc/platforms/pseries/msi.c: In function 'rtas_setup_msi_irqs':
>> arch/powerpc/platforms/pseries/msi.c:478:7: error: implicit declaration of function 'is_kdump_kernel' [-Werror=implicit-function-declaration]
     478 |   if (is_kdump_kernel())
         |       ^~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors


vim +/is_kdump_kernel +478 arch/powerpc/platforms/pseries/msi.c

   369	
   370	static int rtas_setup_msi_irqs(struct pci_dev *pdev, int nvec_in, int type)
   371	{
   372		struct pci_dn *pdn;
   373		int hwirq, virq, i, quota, rc;
   374		struct msi_desc *entry;
   375		struct msi_msg msg;
   376		int nvec = nvec_in;
   377		int use_32bit_msi_hack = 0;
   378	
   379		if (type == PCI_CAP_ID_MSIX)
   380			rc = check_req_msix(pdev, nvec);
   381		else
   382			rc = check_req_msi(pdev, nvec);
   383	
   384		if (rc)
   385			return rc;
   386	
   387		quota = msi_quota_for_device(pdev, nvec);
   388	
   389		if (quota && quota < nvec)
   390			return quota;
   391	
   392		if (type == PCI_CAP_ID_MSIX && check_msix_entries(pdev))
   393			return -EINVAL;
   394	
   395		/*
   396		 * Firmware currently refuse any non power of two allocation
   397		 * so we round up if the quota will allow it.
   398		 */
   399		if (type == PCI_CAP_ID_MSIX) {
   400			int m = roundup_pow_of_two(nvec);
   401			quota = msi_quota_for_device(pdev, m);
   402	
   403			if (quota >= m)
   404				nvec = m;
   405		}
   406	
   407		pdn = pci_get_pdn(pdev);
   408	
   409		/*
   410		 * Try the new more explicit firmware interface, if that fails fall
   411		 * back to the old interface. The old interface is known to never
   412		 * return MSI-Xs.
   413		 */
   414	again:
   415		if (type == PCI_CAP_ID_MSI) {
   416			if (pdev->no_64bit_msi) {
   417				rc = rtas_change_msi(pdn, RTAS_CHANGE_32MSI_FN, nvec);
   418				if (rc < 0) {
   419					/*
   420					 * We only want to run the 32 bit MSI hack below if
   421					 * the max bus speed is Gen2 speed
   422					 */
   423					if (pdev->bus->max_bus_speed != PCIE_SPEED_5_0GT)
   424						return rc;
   425	
   426					use_32bit_msi_hack = 1;
   427				}
   428			} else
   429				rc = -1;
   430	
   431			if (rc < 0)
   432				rc = rtas_change_msi(pdn, RTAS_CHANGE_MSI_FN, nvec);
   433	
   434			if (rc < 0) {
   435				pr_debug("rtas_msi: trying the old firmware call.\n");
   436				rc = rtas_change_msi(pdn, RTAS_CHANGE_FN, nvec);
   437			}
   438	
   439			if (use_32bit_msi_hack && rc > 0)
   440				rtas_hack_32bit_msi_gen2(pdev);
   441		} else
   442			rc = rtas_change_msi(pdn, RTAS_CHANGE_MSIX_FN, nvec);
   443	
   444		if (rc != nvec) {
   445			if (nvec != nvec_in) {
   446				nvec = nvec_in;
   447				goto again;
   448			}
   449			pr_debug("rtas_msi: rtas_change_msi() failed\n");
   450			return rc;
   451		}
   452	
   453		i = 0;
   454		for_each_pci_msi_entry(entry, pdev) {
   455			hwirq = rtas_query_irq_number(pdn, i++);
   456			if (hwirq < 0) {
   457				pr_debug("rtas_msi: error (%d) getting hwirq\n", rc);
   458				return hwirq;
   459			}
   460	
   461			/*
   462			 * Depending on the number of online CPUs in the original
   463			 * kernel, it is likely for CPU #0 to be offline in a kdump
   464			 * kernel. The associated IRQs in the affinity mappings
   465			 * provided by irq_create_affinity_masks() are thus not
   466			 * started by irq_startup(), as per-design for managed IRQs.
   467			 * This can be a problem with multi-queue block devices driven
   468			 * by blk-mq : such a non-started IRQ is very likely paired
   469			 * with the single queue enforced by blk-mq during kdump (see
   470			 * blk_mq_alloc_tag_set()). This causes the device to remain
   471			 * silent and likely hangs the guest at some point.
   472			 *
   473			 * We don't really care for fine-grained affinity when doing
   474			 * kdump actually : simply ignore the pre-computed affinity
   475			 * masks in this case and let the default mask with all CPUs
   476			 * be used when creating the IRQ mappings.
   477			 */
 > 478			if (is_kdump_kernel())
   479				virq = irq_create_mapping(NULL, hwirq);
   480			else
   481				virq = irq_create_mapping_affinity(NULL, hwirq,
   482								   entry->affinity);
   483	
   484			if (!virq) {
   485				pr_debug("rtas_msi: Failed mapping hwirq %d\n", hwirq);
   486				return -ENOSPC;
   487			}
   488	
   489			dev_dbg(&pdev->dev, "rtas_msi: allocated virq %d\n", virq);
   490			irq_set_msi_desc(virq, entry);
   491	
   492			/* Read config space back so we can restore after reset */
   493			__pci_read_msi_msg(entry, &msg);
   494			entry->msg = msg;
   495		}
   496	
   497		return 0;
   498	}
   499	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 72574 bytes --]

^ permalink raw reply

* Re: [PATCH v5 07/10] powerpc/signal64: Replace restore_sigcontext() w/ unsafe_restore_sigcontext()
From: Daniel Axtens @ 2021-02-12 21:17 UTC (permalink / raw)
  To: Christopher M. Riedl, linuxppc-dev
In-Reply-To: <20210203184323.20792-8-cmr@codefail.de>

Hi Chris,

> Previously restore_sigcontext() performed a costly KUAP switch on every
> uaccess operation. These repeated uaccess switches cause a significant
> drop in signal handling performance.
>
> Rewrite restore_sigcontext() to assume that a userspace read access
> window is open. Replace all uaccess functions with their 'unsafe'
> versions which avoid the repeated uaccess switches.
>

Much of the same comments apply here as to the last patch:
 - the commit message might be improved by adding that you are also
   changing the calling functions to open the uaccess window before
   calling into the new unsafe functions

 - I have checked that the safe to unsafe conversions look right.

 - I think you're opening too wide a window in user_read_access_begin,
   it seems to me that it could be reduced to just the
   uc_mcontext. (Again, not that it makes a difference with the current
   HW.)

Kind regards,
Daniel

> Signed-off-by: Christopher M. Riedl <cmr@codefail.de>
> ---
>  arch/powerpc/kernel/signal_64.c | 68 ++++++++++++++++++++-------------
>  1 file changed, 41 insertions(+), 27 deletions(-)
>
> diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
> index 4248e4489ff1..d668f8af18fe 100644
> --- a/arch/powerpc/kernel/signal_64.c
> +++ b/arch/powerpc/kernel/signal_64.c
> @@ -326,14 +326,14 @@ static long setup_tm_sigcontexts(struct sigcontext __user *sc,
>  /*
>   * Restore the sigcontext from the signal frame.
>   */
> -
> -static long restore_sigcontext(struct task_struct *tsk, sigset_t *set, int sig,
> -			      struct sigcontext __user *sc)
> +#define unsafe_restore_sigcontext(tsk, set, sig, sc, e) \
> +	unsafe_op_wrap(__unsafe_restore_sigcontext(tsk, set, sig, sc), e)
> +static long notrace __unsafe_restore_sigcontext(struct task_struct *tsk, sigset_t *set,
> +						int sig, struct sigcontext __user *sc)
>  {
>  #ifdef CONFIG_ALTIVEC
>  	elf_vrreg_t __user *v_regs;
>  #endif
> -	unsigned long err = 0;
>  	unsigned long save_r13 = 0;
>  	unsigned long msr;
>  	struct pt_regs *regs = tsk->thread.regs;
> @@ -348,27 +348,28 @@ static long restore_sigcontext(struct task_struct *tsk, sigset_t *set, int sig,
>  		save_r13 = regs->gpr[13];
>  
>  	/* copy the GPRs */
> -	err |= __copy_from_user(regs->gpr, sc->gp_regs, sizeof(regs->gpr));
> -	err |= __get_user(regs->nip, &sc->gp_regs[PT_NIP]);
> +	unsafe_copy_from_user(regs->gpr, sc->gp_regs, sizeof(regs->gpr),
> +			      efault_out);
> +	unsafe_get_user(regs->nip, &sc->gp_regs[PT_NIP], efault_out);
>  	/* get MSR separately, transfer the LE bit if doing signal return */
> -	err |= __get_user(msr, &sc->gp_regs[PT_MSR]);
> +	unsafe_get_user(msr, &sc->gp_regs[PT_MSR], efault_out);
>  	if (sig)
>  		regs->msr = (regs->msr & ~MSR_LE) | (msr & MSR_LE);
> -	err |= __get_user(regs->orig_gpr3, &sc->gp_regs[PT_ORIG_R3]);
> -	err |= __get_user(regs->ctr, &sc->gp_regs[PT_CTR]);
> -	err |= __get_user(regs->link, &sc->gp_regs[PT_LNK]);
> -	err |= __get_user(regs->xer, &sc->gp_regs[PT_XER]);
> -	err |= __get_user(regs->ccr, &sc->gp_regs[PT_CCR]);
> +	unsafe_get_user(regs->orig_gpr3, &sc->gp_regs[PT_ORIG_R3], efault_out);
> +	unsafe_get_user(regs->ctr, &sc->gp_regs[PT_CTR], efault_out);
> +	unsafe_get_user(regs->link, &sc->gp_regs[PT_LNK], efault_out);
> +	unsafe_get_user(regs->xer, &sc->gp_regs[PT_XER], efault_out);
> +	unsafe_get_user(regs->ccr, &sc->gp_regs[PT_CCR], efault_out);
>  	/* Don't allow userspace to set SOFTE */
>  	set_trap_norestart(regs);
> -	err |= __get_user(regs->dar, &sc->gp_regs[PT_DAR]);
> -	err |= __get_user(regs->dsisr, &sc->gp_regs[PT_DSISR]);
> -	err |= __get_user(regs->result, &sc->gp_regs[PT_RESULT]);
> +	unsafe_get_user(regs->dar, &sc->gp_regs[PT_DAR], efault_out);
> +	unsafe_get_user(regs->dsisr, &sc->gp_regs[PT_DSISR], efault_out);
> +	unsafe_get_user(regs->result, &sc->gp_regs[PT_RESULT], efault_out);
>  
>  	if (!sig)
>  		regs->gpr[13] = save_r13;
>  	if (set != NULL)
> -		err |=  __get_user(set->sig[0], &sc->oldmask);
> +		unsafe_get_user(set->sig[0], &sc->oldmask, efault_out);
>  
>  	/*
>  	 * Force reload of FP/VEC.
> @@ -378,29 +379,28 @@ static long restore_sigcontext(struct task_struct *tsk, sigset_t *set, int sig,
>  	regs->msr &= ~(MSR_FP | MSR_FE0 | MSR_FE1 | MSR_VEC | MSR_VSX);
>  
>  #ifdef CONFIG_ALTIVEC
> -	err |= __get_user(v_regs, &sc->v_regs);
> -	if (err)
> -		return err;
> +	unsafe_get_user(v_regs, &sc->v_regs, efault_out);
>  	if (v_regs && !access_ok(v_regs, 34 * sizeof(vector128)))
>  		return -EFAULT;
>  	/* Copy 33 vec registers (vr0..31 and vscr) from the stack */
>  	if (v_regs != NULL && (msr & MSR_VEC) != 0) {
> -		err |= __copy_from_user(&tsk->thread.vr_state, v_regs,
> -					33 * sizeof(vector128));
> +		unsafe_copy_from_user(&tsk->thread.vr_state, v_regs,
> +				      33 * sizeof(vector128), efault_out);
>  		tsk->thread.used_vr = true;
>  	} else if (tsk->thread.used_vr) {
>  		memset(&tsk->thread.vr_state, 0, 33 * sizeof(vector128));
>  	}
>  	/* Always get VRSAVE back */
>  	if (v_regs != NULL)
> -		err |= __get_user(tsk->thread.vrsave, (u32 __user *)&v_regs[33]);
> +		unsafe_get_user(tsk->thread.vrsave, (u32 __user *)&v_regs[33],
> +				efault_out);
>  	else
>  		tsk->thread.vrsave = 0;
>  	if (cpu_has_feature(CPU_FTR_ALTIVEC))
>  		mtspr(SPRN_VRSAVE, tsk->thread.vrsave);
>  #endif /* CONFIG_ALTIVEC */
>  	/* restore floating point */
> -	err |= copy_fpr_from_user(tsk, &sc->fp_regs);
> +	unsafe_copy_fpr_from_user(tsk, &sc->fp_regs, efault_out);
>  #ifdef CONFIG_VSX
>  	/*
>  	 * Get additional VSX data. Update v_regs to point after the
> @@ -409,14 +409,17 @@ static long restore_sigcontext(struct task_struct *tsk, sigset_t *set, int sig,
>  	 */
>  	v_regs += ELF_NVRREG;
>  	if ((msr & MSR_VSX) != 0) {
> -		err |= copy_vsx_from_user(tsk, v_regs);
> +		unsafe_copy_vsx_from_user(tsk, v_regs, efault_out);
>  		tsk->thread.used_vsr = true;
>  	} else {
>  		for (i = 0; i < 32 ; i++)
>  			tsk->thread.fp_state.fpr[i][TS_VSRLOWOFFSET] = 0;
>  	}
>  #endif
> -	return err;
> +	return 0;
> +
> +efault_out:
> +	return -EFAULT;
>  }
>  
>  #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> @@ -701,8 +704,14 @@ SYSCALL_DEFINE3(swapcontext, struct ucontext __user *, old_ctx,
>  	if (__copy_from_user(&set, &new_ctx->uc_sigmask, sizeof(set)))
>  		do_exit(SIGSEGV);
>  	set_current_blocked(&set);
> -	if (restore_sigcontext(current, NULL, 0, &new_ctx->uc_mcontext))
> +
> +	if (!user_read_access_begin(new_ctx, ctx_size))
> +		return -EFAULT;
> +	if (__unsafe_restore_sigcontext(current, NULL, 0, &new_ctx->uc_mcontext)) {
> +		user_read_access_end();
>  		do_exit(SIGSEGV);
> +	}
> +	user_read_access_end();
>  
>  	/* This returns like rt_sigreturn */
>  	set_thread_flag(TIF_RESTOREALL);
> @@ -807,8 +816,13 @@ SYSCALL_DEFINE0(rt_sigreturn)
>  		 * causing a TM bad thing.
>  		 */
>  		current->thread.regs->msr &= ~MSR_TS_MASK;
> -		if (restore_sigcontext(current, NULL, 1, &uc->uc_mcontext))
> +		if (!user_read_access_begin(uc, sizeof(*uc)))
> +			return -EFAULT;
> +		if (__unsafe_restore_sigcontext(current, NULL, 1, &uc->uc_mcontext)) {
> +			user_read_access_end();
>  			goto badframe;
> +		}
> +		user_read_access_end();
>  	}
>  
>  	if (restore_altstack(&uc->uc_stack))
> -- 
> 2.26.1

^ permalink raw reply

* Re: [PATCH v5 10/10] powerpc/signal64: Use __get_user() to copy sigset_t
From: Daniel Axtens @ 2021-02-12 21:21 UTC (permalink / raw)
  To: Christopher M. Riedl, linuxppc-dev
In-Reply-To: <20210203184323.20792-11-cmr@codefail.de>

"Christopher M. Riedl" <cmr@codefail.de> writes:

> Usually sigset_t is exactly 8B which is a "trivial" size and does not
> warrant using __copy_from_user(). Use __get_user() directly in
> anticipation of future work to remove the trivial size optimizations
> from __copy_from_user(). Calling __get_user() also results in a small
> boost to signal handling throughput here.

Modulo the comments from Christophe, this looks good to me. It seems to
do what it says on the tin.

Reviewed-by: Daniel Axtens <dja@axtens.net>

Do you know if this patch is responsible for the slightly increase in
radix performance you observed in your cover letter? The rest of the
series shouldn't really make things faster than the no-KUAP case...

Kind regards,
Daniel

>
> Signed-off-by: Christopher M. Riedl <cmr@codefail.de>
> ---
>  arch/powerpc/kernel/signal_64.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
> index 817b64e1e409..42fdc4a7ff72 100644
> --- a/arch/powerpc/kernel/signal_64.c
> +++ b/arch/powerpc/kernel/signal_64.c
> @@ -97,6 +97,14 @@ static void prepare_setup_sigcontext(struct task_struct *tsk, int ctx_has_vsx_re
>  #endif /* CONFIG_VSX */
>  }
>  
> +static inline int get_user_sigset(sigset_t *dst, const sigset_t *src)
> +{
> +	if (sizeof(sigset_t) <= 8)
> +		return __get_user(dst->sig[0], &src->sig[0]);
> +	else
> +		return __copy_from_user(dst, src, sizeof(sigset_t));
> +}
> +
>  /*
>   * Set up the sigcontext for the signal frame.
>   */
> @@ -701,8 +709,9 @@ SYSCALL_DEFINE3(swapcontext, struct ucontext __user *, old_ctx,
>  	 * We kill the task with a SIGSEGV in this situation.
>  	 */
>  
> -	if (__copy_from_user(&set, &new_ctx->uc_sigmask, sizeof(set)))
> +	if (get_user_sigset(&set, &new_ctx->uc_sigmask))
>  		do_exit(SIGSEGV);
> +
>  	set_current_blocked(&set);
>  
>  	if (!user_read_access_begin(new_ctx, ctx_size))
> @@ -740,8 +749,9 @@ SYSCALL_DEFINE0(rt_sigreturn)
>  	if (!access_ok(uc, sizeof(*uc)))
>  		goto badframe;
>  
> -	if (__copy_from_user(&set, &uc->uc_sigmask, sizeof(set)))
> +	if (get_user_sigset(&set, &uc->uc_sigmask))
>  		goto badframe;
> +
>  	set_current_blocked(&set);
>  
>  #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> -- 
> 2.26.1

^ permalink raw reply

* Re: linux-next: manual merge of the spi tree with the powerpc tree
From: Stephen Rothwell @ 2021-02-13  1:58 UTC (permalink / raw)
  To: Mark Brown; +Cc: Linux Next Mailing List, PowerPC, Linux Kernel Mailing List
In-Reply-To: <20210212122759.GA6057@sirena.org.uk>

[-- Attachment #1: Type: text/plain, Size: 1036 bytes --]

Hi Mark,

On Fri, 12 Feb 2021 12:27:59 +0000 Mark Brown <broonie@kernel.org> wrote:
>
> On Fri, Feb 12, 2021 at 03:31:42PM +1100, Stephen Rothwell wrote:
> 
> > BTW Mark: the author's address in 258ea99fe25a uses a non existent domain :-(  
> 
> Ugh, I think that's something gone wrong with b4 :(  A bit late now to
> try to fix it up.

Not sure about that, the email (following the link to lore from the
commit) has the same address (...@public.gmane.com) and that domain
does not exist. In fact the email headers (in lore) look like this:

From: Sergiu Cuciurean <sergiu.cuciurean-OyLXuOCK7orQT0dZR+AlfA@public.gmane.org>
To: <broonie-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	<linux-spi-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Cc: Sergiu Cuciurean
	<sergiu.cuciurean-OyLXuOCK7orQT0dZR+AlfA@public.gmane.org>

So I am suprised that it was received by anyone.  Maybe gmane has an
internal reply system that is screwed.
-- 
Cheers,
Stephen Rothwell

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* [PATCH 4/4] ASoC: fsl: drop unneeded snd_soc_dai_set_drvdata
From: Julia Lawall @ 2021-02-13 10:19 UTC (permalink / raw)
  To: Timur Tabi
  Cc: alsa-devel, linuxppc-dev, Xiubo Li, Shengjiu Wang, Takashi Iwai,
	kernel-janitors, Liam Girdwood, Jaroslav Kysela, Nicolin Chen,
	Mark Brown, Fabio Estevam, linux-kernel
In-Reply-To: <20210213101907.1318496-1-Julia.Lawall@inria.fr>

snd_soc_dai_set_drvdata is not needed when the set data comes from
snd_soc_dai_get_drvdata or dev_get_drvdata.  The problem was fixed
usingthe following semantic patch: (http://coccinelle.lip6.fr/)

// <smpl>
@@
expression x,y,e;
@@
	x = dev_get_drvdata(y->dev)
	... when != x = e
-	snd_soc_dai_set_drvdata(y,x);

@@
expression x,y,e;
@@
	x = snd_soc_dai_get_drvdata(y)
	... when != x = e
-	snd_soc_dai_set_drvdata(y,x);
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>

---
 sound/soc/fsl/fsl_micfil.c |    2 --
 sound/soc/fsl/fsl_sai.c    |    2 --
 sound/soc/fsl/fsl_xcvr.c   |    1 -
 3 files changed, 5 deletions(-)

diff --git a/sound/soc/fsl/fsl_micfil.c b/sound/soc/fsl/fsl_micfil.c
index 8aedf6590b28..fd21017fa2bd 100644
--- a/sound/soc/fsl/fsl_micfil.c
+++ b/sound/soc/fsl/fsl_micfil.c
@@ -423,8 +423,6 @@ static int fsl_micfil_dai_probe(struct snd_soc_dai *cpu_dai)
 		return ret;
 	}
 
-	snd_soc_dai_set_drvdata(cpu_dai, micfil);
-
 	return 0;
 }
 
diff --git a/sound/soc/fsl/fsl_sai.c b/sound/soc/fsl/fsl_sai.c
index 5e65b456d3e2..8876d0ed37d9 100644
--- a/sound/soc/fsl/fsl_sai.c
+++ b/sound/soc/fsl/fsl_sai.c
@@ -727,8 +727,6 @@ static int fsl_sai_dai_probe(struct snd_soc_dai *cpu_dai)
 	snd_soc_dai_init_dma_data(cpu_dai, &sai->dma_params_tx,
 				&sai->dma_params_rx);
 
-	snd_soc_dai_set_drvdata(cpu_dai, sai);
-
 	return 0;
 }
 
diff --git a/sound/soc/fsl/fsl_xcvr.c b/sound/soc/fsl/fsl_xcvr.c
index dd228b421e2c..8a3bde718697 100644
--- a/sound/soc/fsl/fsl_xcvr.c
+++ b/sound/soc/fsl/fsl_xcvr.c
@@ -869,7 +869,6 @@ static int fsl_xcvr_dai_probe(struct snd_soc_dai *dai)
 	struct fsl_xcvr *xcvr = snd_soc_dai_get_drvdata(dai);
 
 	snd_soc_dai_init_dma_data(dai, &xcvr->dma_prms_tx, &xcvr->dma_prms_rx);
-	snd_soc_dai_set_drvdata(dai, xcvr);
 
 	snd_soc_add_dai_controls(dai, &fsl_xcvr_mode_kctl, 1);
 	snd_soc_add_dai_controls(dai, &fsl_xcvr_arc_mode_kctl, 1);


^ permalink raw reply related

* [powerpc:merge] BUILD SUCCESS 463d77b1993642f56ce6929be97c08ef065f48ba
From: kernel test robot @ 2021-02-13 13:33 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev

tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git merge
branch HEAD: 463d77b1993642f56ce6929be97c08ef065f48ba  Automatic merge of 'next' into merge (2021-02-12 00:44)

elapsed time: 2807m

configs tested: 205
configs skipped: 3

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm                                 defconfig
arm64                            allyesconfig
arm64                               defconfig
arm                              allyesconfig
arm                              allmodconfig
alpha                            alldefconfig
arc                     haps_hs_smp_defconfig
sh                               j2_defconfig
powerpc                      ppc64e_defconfig
arm                       netwinder_defconfig
arm                     davinci_all_defconfig
sh                        sh7763rdp_defconfig
c6x                        evmc6474_defconfig
arm                         shannon_defconfig
arc                        vdk_hs38_defconfig
sh                      rts7751r2d1_defconfig
m68k                            q40_defconfig
arm                            lart_defconfig
arm                            pleb_defconfig
arm                   milbeaut_m10v_defconfig
h8300                            alldefconfig
powerpc                         ps3_defconfig
h8300                     edosk2674_defconfig
arm                          moxart_defconfig
arm                             rpc_defconfig
mips                         mpc30x_defconfig
arm                  colibri_pxa300_defconfig
powerpc                    ge_imp3a_defconfig
m68k                          atari_defconfig
mips                      bmips_stb_defconfig
powerpc                     stx_gp3_defconfig
arm                           sunxi_defconfig
openrisc                         alldefconfig
sh                           se7712_defconfig
h8300                               defconfig
sh                            hp6xx_defconfig
arm                          simpad_defconfig
powerpc                 mpc8315_rdb_defconfig
mips                     cu1830-neo_defconfig
mips                         tb0219_defconfig
sh                          rsk7264_defconfig
arm                            u300_defconfig
arm                          pxa3xx_defconfig
sh                           se7206_defconfig
sh                         ap325rxa_defconfig
xtensa                  cadence_csp_defconfig
powerpc                       eiger_defconfig
mips                         rt305x_defconfig
xtensa                    xip_kc705_defconfig
powerpc                     tqm5200_defconfig
sh                     sh7710voipgw_defconfig
mips                      maltaaprp_defconfig
powerpc                 linkstation_defconfig
mips                           jazz_defconfig
powerpc                 mpc832x_mds_defconfig
mips                       capcella_defconfig
powerpc                      chrp32_defconfig
um                           x86_64_defconfig
powerpc                      obs600_defconfig
arm64                            alldefconfig
arm                         axm55xx_defconfig
x86_64                           alldefconfig
mips                        jmr3927_defconfig
mips                       bmips_be_defconfig
powerpc                 mpc834x_mds_defconfig
powerpc                   bluestone_defconfig
sh                          rsk7203_defconfig
nds32                               defconfig
powerpc                    gamecube_defconfig
mips                          ath79_defconfig
powerpc64                        alldefconfig
s390                          debug_defconfig
powerpc                 xes_mpc85xx_defconfig
powerpc                       holly_defconfig
powerpc                     kmeter1_defconfig
powerpc                     mpc5200_defconfig
openrisc                  or1klitex_defconfig
arm                       cns3420vb_defconfig
arm                           h5000_defconfig
arm                        clps711x_defconfig
arc                        nsim_700_defconfig
powerpc                      pmac32_defconfig
openrisc                 simple_smp_defconfig
powerpc                        icon_defconfig
sh                                  defconfig
mips                      malta_kvm_defconfig
powerpc                     taishan_defconfig
arm                          pxa910_defconfig
arm                         lpc32xx_defconfig
arm                          badge4_defconfig
powerpc                    sam440ep_defconfig
powerpc                  mpc885_ads_defconfig
c6x                        evmc6457_defconfig
openrisc                            defconfig
arm                           omap1_defconfig
xtensa                         virt_defconfig
mips                         bigsur_defconfig
arm                        magician_defconfig
arm                         vf610m4_defconfig
sparc                       sparc32_defconfig
mips                            e55_defconfig
parisc                              defconfig
parisc                generic-32bit_defconfig
powerpc                      pcm030_defconfig
mips                    maltaup_xpa_defconfig
arm                         palmz72_defconfig
powerpc                     tqm8548_defconfig
microblaze                          defconfig
powerpc                      ep88xc_defconfig
sh                          kfr2r09_defconfig
mips                           ip32_defconfig
ia64                             allmodconfig
ia64                                defconfig
ia64                             allyesconfig
m68k                             allmodconfig
m68k                                defconfig
m68k                             allyesconfig
nios2                               defconfig
arc                              allyesconfig
nds32                             allnoconfig
c6x                              allyesconfig
nios2                            allyesconfig
csky                                defconfig
alpha                               defconfig
alpha                            allyesconfig
xtensa                           allyesconfig
h8300                            allyesconfig
arc                                 defconfig
sh                               allmodconfig
s390                             allyesconfig
s390                             allmodconfig
parisc                           allyesconfig
s390                                defconfig
i386                             allyesconfig
sparc                            allyesconfig
sparc                               defconfig
i386                               tinyconfig
i386                                defconfig
mips                             allyesconfig
mips                             allmodconfig
powerpc                          allyesconfig
powerpc                          allmodconfig
powerpc                           allnoconfig
x86_64               randconfig-a006-20210209
x86_64               randconfig-a001-20210209
x86_64               randconfig-a005-20210209
x86_64               randconfig-a004-20210209
x86_64               randconfig-a002-20210209
x86_64               randconfig-a003-20210209
x86_64               randconfig-a003-20210212
x86_64               randconfig-a002-20210212
x86_64               randconfig-a004-20210212
x86_64               randconfig-a001-20210212
x86_64               randconfig-a005-20210212
x86_64               randconfig-a006-20210212
i386                 randconfig-a001-20210209
i386                 randconfig-a005-20210209
i386                 randconfig-a003-20210209
i386                 randconfig-a002-20210209
i386                 randconfig-a006-20210209
i386                 randconfig-a004-20210209
i386                 randconfig-a003-20210212
i386                 randconfig-a005-20210212
i386                 randconfig-a002-20210212
i386                 randconfig-a001-20210212
i386                 randconfig-a004-20210212
i386                 randconfig-a006-20210212
x86_64               randconfig-a016-20210211
x86_64               randconfig-a013-20210211
x86_64               randconfig-a012-20210211
x86_64               randconfig-a015-20210211
x86_64               randconfig-a014-20210211
x86_64               randconfig-a011-20210211
i386                 randconfig-a016-20210209
i386                 randconfig-a013-20210209
i386                 randconfig-a012-20210209
i386                 randconfig-a014-20210209
i386                 randconfig-a011-20210209
i386                 randconfig-a015-20210209
riscv                    nommu_k210_defconfig
riscv                            allyesconfig
riscv                    nommu_virt_defconfig
riscv                             allnoconfig
riscv                               defconfig
riscv                          rv32_defconfig
riscv                            allmodconfig
x86_64                                   rhel
x86_64                           allyesconfig
x86_64                    rhel-7.6-kselftests
x86_64                              defconfig
x86_64                               rhel-8.3
x86_64                      rhel-8.3-kbuiltin
x86_64                                  kexec

clang tested configs:
x86_64               randconfig-a003-20210211
x86_64               randconfig-a002-20210211
x86_64               randconfig-a001-20210211
x86_64               randconfig-a004-20210211
x86_64               randconfig-a005-20210211
x86_64               randconfig-a006-20210211
x86_64               randconfig-a013-20210209
x86_64               randconfig-a014-20210209
x86_64               randconfig-a015-20210209
x86_64               randconfig-a012-20210209
x86_64               randconfig-a016-20210209
x86_64               randconfig-a011-20210209

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox