LinuxPPC-Dev Archive on lore.kernel.org

LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH bpf 1/6] bpf: support 64-bit offsets for bpf function calls
From: Sandipan Das @ 2018-05-17  6:35 UTC (permalink / raw)
  To: ast, daniel; +Cc: netdev, linuxppc-dev, naveen.n.rao
In-Reply-To: <20180517063548.6373-1-sandipan@linux.vnet.ibm.com>

The imm field of a bpf instruction is a signed 32-bit integer.
For JIT bpf-to-bpf function calls, it stores the offset of the
start address of the callee's JITed image from __bpf_call_base.

For some architectures, such as powerpc64, this offset may be
as large as 64 bits and cannot be accomodated in the imm field
without truncation.

We resolve this by:

[1] Additionally using the auxillary data of each function to
    keep a list of start addresses of the JITed images for all
    functions determined by the verifier.

[2] Retaining the subprog id inside the off field of the call
    instructions and using it to index into the list mentioned
    above and lookup the callee's address.

To make sure that the existing JIT compilers continue to work
without requiring changes, we keep the imm field as it is.

Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
---
 kernel/bpf/verifier.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index d5e1a6c4165d..aa76879f4fd1 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5373,11 +5373,24 @@ static int jit_subprogs(struct bpf_verifier_env *env)
 			    insn->src_reg != BPF_PSEUDO_CALL)
 				continue;
 			subprog = insn->off;
-			insn->off = 0;
 			insn->imm = (u64 (*)(u64, u64, u64, u64, u64))
 				func[subprog]->bpf_func -
 				__bpf_call_base;
 		}
+
+		/* we use the aux data to keep a list of the start addresses
+		 * of the JITed images for each function in the program
+		 *
+		 * for some architectures, such as powerpc64, the imm field
+		 * might not be large enough to hold the offset of the start
+		 * address of the callee's JITed image from __bpf_call_base
+		 *
+		 * in such cases, we can lookup the start address of a callee
+		 * by using its subprog id, available from the off field of
+		 * the call instruction, as an index for this list
+		 */
+		func[i]->aux->func = func;
+		func[i]->aux->func_cnt = env->subprog_cnt + 1;
 	}
 	for (i = 0; i < env->subprog_cnt; i++) {
 		old_bpf_func = func[i]->bpf_func;
-- 
2.14.3

^ permalink raw reply related

* Re: [PATCH 0/3] Add support to disable sensor groups in P9
From: Shilpasri G Bhat @ 2018-05-17  6:10 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: mpe, linuxppc-dev, linux-hwmon, linux-kernel, stewart
In-Reply-To: <20180515150205.GA29321@roeck-us.net>



On 05/15/2018 08:32 PM, Guenter Roeck wrote:
> On Thu, Mar 22, 2018 at 04:24:32PM +0530, Shilpasri G Bhat wrote:
>> This patch series adds support to enable/disable OCC based
>> inband-sensor groups at runtime. The environmental sensor groups are
>> managed in HWMON and the remaining platform specific sensor groups are
>> managed in /sys/firmware/opal.
>>
>> The firmware changes required for this patch is posted below:
>> https://lists.ozlabs.org/pipermail/skiboot/2018-March/010812.html
>>
> 
> Sorry for not getting back earlier. This is a tough one.
> 

Thanks for the reply. I have tried to answer your questions according to my
understanding below:

> Key problem is that you are changing the ABI with those new attributes.
> On top of that, the attributes _do_ make some sense (many chips support
> enabling/disabling of individual sensors), suggesting that those or
> similar attributes may or even should at some point be added to the ABI.
> 
> At the same time, returning "0" as measurement values when sensors are
> disabled does not seem like a good idea, since "0" is a perfectly valid
> measurement, at least for most sensors.

I agree.

> 
> Given that, we need to have a discussion about adding _enable attributes to
> the ABI 

> what is the scope,
IIUC the scope should be RW and the attribute is defined for each supported
sensor group

> when should the attributes exist and when not,
We control this currently via device-tree

> do we want/need power_enable or powerX_enable or both, and so on), and 
We need power_enable right now

> what to return if a sensor is disabled (such as -ENODATA).
-ENODATA sounds good.

Thanks and Regards,
Shilpa

 Once we have an
> agreement, we can continue with an implementation.
> 
> Guenter
> 
>> Shilpasri G Bhat (3):
>>   powernv:opal-sensor-groups: Add support to enable sensor groups
>>   hwmon: ibmpowernv: Add attributes to enable/disable sensor groups
>>   powernv: opal-sensor-groups: Add attributes to disable/enable sensors
>>
>>  .../ABI/testing/sysfs-firmware-opal-sensor-groups  |  34 ++++++
>>  Documentation/hwmon/ibmpowernv                     |  31 ++++-
>>  arch/powerpc/include/asm/opal-api.h                |   4 +-
>>  arch/powerpc/include/asm/opal.h                    |   2 +
>>  .../powerpc/platforms/powernv/opal-sensor-groups.c | 104 ++++++++++++-----
>>  arch/powerpc/platforms/powernv/opal-wrappers.S     |   1 +
>>  drivers/hwmon/ibmpowernv.c                         | 127 +++++++++++++++++++--
>>  7 files changed, 265 insertions(+), 38 deletions(-)
>>  create mode 100644 Documentation/ABI/testing/sysfs-firmware-opal-sensor-groups
>>
>> -- 
>> 1.8.3.1
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-hwmon" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply

* Re: [PATCH] cpuidle/powernv : init all present cpus for deep states
From: Akshay Adiga @ 2018-05-17  5:55 UTC (permalink / raw)
  To: Stewart Smith; +Cc: linux-kernel, linuxppc-dev, ego, npiggin
In-Reply-To: <87efiby6a9.fsf@linux.vnet.ibm.com>

Yes this needs to be sent to  stable. 

Fixes: d405a98c ("powerpc/powernv: Move cpuidle related code from setup.c
to new file")

^ permalink raw reply

* [PATCH 1/2] powerpc/ptrace: Fix enforcement of DAWR contraints
From: Michael Neuling @ 2018-05-17  5:37 UTC (permalink / raw)
  To: mpe
  Cc: linuxppc-dev, Edjunior Barbosa Machado, Pedro Franco de Carvalho,
	Ulrich Weigand, mikey

Back when we first introduced the DAWR in this commit:
  4ae7ebe952 powerpc: Change hardware breakpoint to allow longer ranges

We screwed up the constraint making it a 1024 byte boundary rather
than a 512. This makes the check overly permissive. Fortunately GDB is
the only real user and it always did they right thing, so we never
noticed.

This fixes the constraint to 512 bytes.

Signed-off-by: Michael Neuling <mikey@neuling.org>
cc: <stable@vger.kernel.org> # v3.9+
---
 arch/powerpc/kernel/hw_breakpoint.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
index 4c1012b80d..80547dad37 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -178,8 +178,8 @@ int arch_validate_hwbkpt_settings(struct perf_event *bp)
 	if (cpu_has_feature(CPU_FTR_DAWR)) {
 		length_max = 512 ; /* 64 doublewords */
 		/* DAWR region can't cross 512 boundary */
-		if ((bp->attr.bp_addr >> 10) != 
-		    ((bp->attr.bp_addr + bp->attr.bp_len - 1) >> 10))
+		if ((bp->attr.bp_addr >> 9) !=
+		    ((bp->attr.bp_addr + bp->attr.bp_len - 1) >> 9))
 			return -EINVAL;
 	}
 	if (info->len >
-- 
2.14.1

^ permalink raw reply related

* [PATCH 2/2] powerpc/ptrace: Fix setting 512B aligned breakpoints with PTRACE_SET_DEBUGREG
From: Michael Neuling @ 2018-05-17  5:37 UTC (permalink / raw)
  To: mpe
  Cc: linuxppc-dev, Edjunior Barbosa Machado, Pedro Franco de Carvalho,
	Ulrich Weigand, mikey
In-Reply-To: <20180517053715.24011-1-mikey@neuling.org>

In this change:
  e2a800beac powerpc/hw_brk: Fix off by one error when validating DAWR region end

We fixed setting the DAWR end point to its max value via
PPC_PTRACE_SETHWDEBUG. Unfortunately we broke PTRACE_SET_DEBUGREG when
setting a 512 byte aligned breakpoint.

PTRACE_SET_DEBUGREG currently sets the length of the breakpoint to
zero (memset() in hw_breakpoint_init()).  This worked with
arch_validate_hwbkpt_settings() before the above patch was applied but
is now broken if the breakpoint is 512byte aligned.

This sets the length of the breakpoint to 8 bytes when using
PTRACE_SET_DEBUGREG.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: stable@vger.kernel.org # 3.10+
---
 arch/powerpc/kernel/ptrace.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index d23cf632ed..0f63dd5972 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -2443,6 +2443,7 @@ static int ptrace_set_debugreg(struct task_struct *task, unsigned long addr,
 	/* Create a new breakpoint request if one doesn't exist already */
 	hw_breakpoint_init(&attr);
 	attr.bp_addr = hw_brk.address;
+	attr.bp_len = 8;
 	arch_bp_generic_fields(hw_brk.type,
 			       &attr.bp_type);

-- 
2.14.1

^ permalink raw reply related

* [powerpc:merge 138/143] arch/powerpc/include/asm/ftrace.h:2:0: error: unterminated #ifndef
From: kbuild test robot @ 2018-05-17  5:03 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: kbuild-all, linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 1955 bytes --]

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git merge
head:   46cf23553743d51ea53be61efce633061fa47f17
commit: 900be8ab1549359ba980cfb042a043128204a963 [138/143] Automatic merge of branches 'master', 'next' and 'fixes' into merge
config: powerpc-kilauea_defconfig (attached as .config)
compiler: powerpc-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        git checkout 900be8ab1549359ba980cfb042a043128204a963
        # save the attached .config to linux build tree
        make.cross ARCH=powerpc 

Note: the powerpc/merge HEAD 46cf23553743d51ea53be61efce633061fa47f17 builds fine.
      It only hurts bisectibility.

All errors (new ones prefixed by >>):

   In file included from include/linux/ftrace.h:21:0,
                    from include/linux/perf_event.h:48,
                    from include/linux/trace_events.h:10,
                    from include/trace/syscall.h:7,
                    from include/linux/syscalls.h:82,
                    from drivers/pci/syscall.c:10:
>> arch/powerpc/include/asm/ftrace.h:2:0: error: unterminated #ifndef
    #ifndef _ASM_POWERPC_FTRACE
    

vim +2 arch/powerpc/include/asm/ftrace.h

ccbfac29 include/asm-powerpc/ftrace.h Steven Rostedt 2008-05-22 @2  #ifndef _ASM_POWERPC_FTRACE
ccbfac29 include/asm-powerpc/ftrace.h Steven Rostedt 2008-05-22  3  #define _ASM_POWERPC_FTRACE
ccbfac29 include/asm-powerpc/ftrace.h Steven Rostedt 2008-05-22  4  

:::::: The code at line 2 was first introduced by commit
:::::: ccbfac2923c9febaeaf07a50054027a92b502718 ftrace: powerpc clean ups

:::::: TO: Steven Rostedt <rostedt@goodmis.org>
:::::: CC: Thomas Gleixner <tglx@linutronix.de>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 13373 bytes --]

^ permalink raw reply

* Re: [PATCH 2/2] KVM: PPC: Book3S HV: lockless tlbie for HPT hcalls
From: Paul Mackerras @ 2018-05-17  3:53 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: Nicholas Piggin, kvm-ppc, linuxppc-dev
In-Reply-To: <874ljarih1.fsf@concordia.ellerman.id.au>

On Mon, May 14, 2018 at 02:04:10PM +1000, Michael Ellerman wrote:
[snip]
> OK good, in commit:
> 
> c17b98cf6028 ("KVM: PPC: Book3S HV: Remove code for PPC970 processors") (Dec 2014)
> 
> So we should be able to do the patch below.
> 
> cheers
> 
> 
> diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
> index 17498e9a26e4..7756b0c6da75 100644
> --- a/arch/powerpc/include/asm/kvm_host.h
> +++ b/arch/powerpc/include/asm/kvm_host.h
> @@ -269,7 +269,6 @@ struct kvm_arch {
>  	unsigned long host_lpcr;
>  	unsigned long sdr1;
>  	unsigned long host_sdr1;
> -	int tlbie_lock;
>  	unsigned long lpcr;
>  	unsigned long vrma_slb_v;
>  	int mmu_ready;
> diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
> index 78e6a392330f..89d909b3b881 100644
> --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
> +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
> @@ -434,24 +434,6 @@ static inline int is_mmio_hpte(unsigned long v, unsigned long r)
>  		(HPTE_R_KEY_HI | HPTE_R_KEY_LO));
>  }
>  
> -static inline int try_lock_tlbie(unsigned int *lock)
> -{
> -	unsigned int tmp, old;
> -	unsigned int token = LOCK_TOKEN;
> -
> -	asm volatile("1:lwarx	%1,0,%2\n"
> -		     "	cmpwi	cr0,%1,0\n"
> -		     "	bne	2f\n"
> -		     "  stwcx.	%3,0,%2\n"
> -		     "	bne-	1b\n"
> -		     "  isync\n"
> -		     "2:"
> -		     : "=&r" (tmp), "=&r" (old)
> -		     : "r" (lock), "r" (token)
> -		     : "cc", "memory");
> -	return old == 0;
> -}
> -
>  static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues,
>  		      long npages, int global, bool need_sync)
>  {
> @@ -463,8 +445,6 @@ static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues,
>  	 * the RS field, this is backwards-compatible with P7 and P8.
>  	 */
>  	if (global) {
> -		while (!try_lock_tlbie(&kvm->arch.tlbie_lock))
> -			cpu_relax();
>  		if (need_sync)
>  			asm volatile("ptesync" : : : "memory");
>  		for (i = 0; i < npages; ++i) {
> @@ -483,7 +463,6 @@ static void do_tlbies(struct kvm *kvm, unsigned long *rbvalues,
>  		}
>  
>  		asm volatile("eieio; tlbsync; ptesync" : : : "memory");
> -		kvm->arch.tlbie_lock = 0;
>  	} else {
>  		if (need_sync)
>  			asm volatile("ptesync" : : : "memory");

Seems reasonable; is that a patch submission? :)

Paul.

^ permalink raw reply

* [PATCH] powerpc: Ensure gcc doesn't move around cache flushing in __patch_instruction
From: Benjamin Herrenschmidt @ 2018-05-17  3:06 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Michael Neuling

The current asm statement in __patch_instruction() for the cache flushes
doesn't have a "volatile" statement and no memory clobber. That means
gcc can potentially move it around (or move the store done by put_user
past the flush).

Add both to ensure gcc doesn't play games.

Found by code inspection, no actual bug reported.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---

--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -32,8 +32,9 @@ static int __patch_instruction(unsigned int *exec_addr, unsigned int instr,
        if (err)
                return err;
 
-       asm ("dcbst 0, %0; sync; icbi 0,%1; sync; isync" :: "r" (patch_addr),
-                                                           "r" (exec_addr));
+       asm volatile("dcbst 0, %0; sync; icbi 0,%1; sync; isync"
+                    :: "r" (patch_addr), "r" (exec_addr)
+                    : "memory");
 
        return 0;
 }

^ permalink raw reply

* Re: [PATCH v2 05/10] KVM: PPC: reimplement non-SIMD LOAD/STORE instruction mmio emulation with analyse_intr() input
From: Simon Guo @ 2018-05-17  2:30 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: kvm-ppc, kvm, linuxppc-dev
In-Reply-To: <20180516234918.GB24778@fergus.ozlabs.ibm.com>

Hi Paul,
On Thu, May 17, 2018 at 09:49:18AM +1000, Paul Mackerras wrote:
> On Mon, May 07, 2018 at 02:20:11PM +0800, wei.guo.simon@gmail.com wrote:
> > From: Simon Guo <wei.guo.simon@gmail.com>
> > 
> > This patch reimplements non-SIMD LOAD/STORE instruction MMIO emulation
> > with analyse_intr() input. It utilizes the BYTEREV/UPDATE/SIGNEXT
> > properties exported by analyse_instr() and invokes
> > kvmppc_handle_load(s)/kvmppc_handle_store() accordingly.
> > 
> > It also move CACHEOP type handling into the skeleton.
> > 
> > instruction_type within kvm_ppc.h is renamed to avoid conflict with
> > sstep.h.
> > 
> > Suggested-by: Paul Mackerras <paulus@ozlabs.org>
> > Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
> 
> Looks pretty good, but one comment below...
> 
> > -		case OP_31_XOP_LWAUX:
> > -			emulated = kvmppc_handle_loads(run, vcpu, rt, 4, 1);
> > -			kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
> > -			break;
> > +	if (emulated == EMULATE_DONE)
> > +		goto out;
> 
> Shouldn't this also goto out if emulated == EMULATE_DO_MMIO?
Yes. Let me update this.

Thanks,
- Simon

^ permalink raw reply

* Re: [PATCH v2 07/10] KVM: PPC: reimplement LOAD_FP/STORE_FP instruction mmio emulation with analyse_intr() input
From: Simon Guo @ 2018-05-17  2:26 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: kvm-ppc, kvm, linuxppc-dev
In-Reply-To: <20180516235207.GC24778@fergus.ozlabs.ibm.com>

On Thu, May 17, 2018 at 09:52:07AM +1000, Paul Mackerras wrote:
> On Mon, May 07, 2018 at 02:20:13PM +0800, wei.guo.simon@gmail.com wrote:
> > From: Simon Guo <wei.guo.simon@gmail.com>
> > 
> > This patch reimplements LOAD_FP/STORE_FP instruction MMIO emulation with
> > analyse_intr() input. It utilizes the FPCONV/UPDATE properties exported by
> > analyse_instr() and invokes kvmppc_handle_load(s)/kvmppc_handle_store()
> > accordingly.
> > 
> > For FP store MMIO emulation, the FP regs need to be flushed firstly so
> > that the right FP reg vals can be read from vcpu->arch.fpr, which will
> > be stored into MMIO data.
> > 
> > Suggested-by: Paul Mackerras <paulus@ozlabs.org>
> > Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
> 
> One comment below, otherwise looks good...
> 
> >  arch/powerpc/kvm/emulate_loadstore.c | 197 +++++++----------------------------
> >  1 file changed, 40 insertions(+), 157 deletions(-)
> > 
> > diff --git a/arch/powerpc/kvm/emulate_loadstore.c b/arch/powerpc/kvm/emulate_loadstore.c
> > index 2a91845..5a6571c 100644
> > --- a/arch/powerpc/kvm/emulate_loadstore.c
> > +++ b/arch/powerpc/kvm/emulate_loadstore.c
> > @@ -138,6 +138,22 @@ int kvmppc_emulate_loadstore(struct kvm_vcpu *vcpu)
> >  
> >  			break;
> >  		}
> > +#ifdef CONFIG_PPC_FPU
> > +		case LOAD_FP:
> > +			if (kvmppc_check_fp_disabled(vcpu))
> > +				return EMULATE_DONE;
> > +
> > +			if (op.type & FPCONV)
> > +				vcpu->arch.mmio_sp64_extend = 1;
> > +
> > +			emulated = kvmppc_handle_load(run, vcpu,
> > +					KVM_MMIO_REG_FPR|op.reg, size, 1);
> 
> You need to check the SIGNEXT flag and call kvmppc_handle_loads if it
> is set, because of the lfiwax case:
> 
> > -		case OP_31_XOP_LFIWAX:
> > -			if (kvmppc_check_fp_disabled(vcpu))
> > -				return EMULATE_DONE;
> > -			emulated = kvmppc_handle_loads(run, vcpu,
> > -				KVM_MMIO_REG_FPR|rt, 4, 1);
> > -			break;

Yes. I need to handle that. Thanks for point it out.

BR,
 - Simon

^ permalink raw reply

* Re: administrivia: mails containing HTML attachments
From: Andrew Donnellan @ 2018-05-17  1:42 UTC (permalink / raw)
  To: Stephen Rothwell, ppc-dev
In-Reply-To: <20180516144829.2dbefb0b@canb.auug.org.au>

On 16/05/18 14:48, Stephen Rothwell wrote:
> Hi all,
> 
> I have decided that any email sent to the linuxppc-dev mailing list
> that contains an HTML attachment (or is just an HTML email) will be
> rejected.  The vast majority of such mail are spam (and I have to spend
> time dropping them manually at the moment) and, I presume, anyone on
> this list is capable of sending no HTML email.
> 
> Feel free to discuss ;-)

As it turns out, there's some bugs in how Patchwork handles HTML emails 
that can screw up the resulting patch mboxes when retrieved from 
Patchwork. We're hoping to fix that in the next Patchwork release, but 
in the meantime this restriction should stop us running into any issues 
on this list.

-- 
Andrew Donnellan              OzLabs, ADL Canberra
andrew.donnellan@au1.ibm.com  IBM Australia Limited

^ permalink raw reply

* Re: [PATCH 07/14] powerpc: Add support for restartable sequences
From: Boqun Feng @ 2018-05-17  1:19 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Peter Zijlstra, Paul E. McKenney, Andy Lutomirski, Dave Watson,
	linux-kernel, linux-api, Paul Turner, Andrew Morton, Russell King,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Andrew Hunter,
	Andi Kleen, Chris Lameter, Ben Maurer, rostedt, Josh Triplett,
	Linus Torvalds, Catalin Marinas, Will Deacon, Michael Kerrisk,
	Joel Fernandes, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, linuxppc-dev
In-Reply-To: <112970629.1913.1526501596485.JavaMail.zimbra@efficios.com>

[-- Attachment #1: Type: text/plain, Size: 3578 bytes --]

On Wed, May 16, 2018 at 04:13:16PM -0400, Mathieu Desnoyers wrote:
> ----- On May 16, 2018, at 12:18 PM, Peter Zijlstra peterz@infradead.org wrote:
> 
> > On Mon, Apr 30, 2018 at 06:44:26PM -0400, Mathieu Desnoyers wrote:
> >> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> >> index c32a181a7cbb..ed21a777e8c6 100644
> >> --- a/arch/powerpc/Kconfig
> >> +++ b/arch/powerpc/Kconfig
> >> @@ -223,6 +223,7 @@ config PPC
> >>  	select HAVE_SYSCALL_TRACEPOINTS
> >>  	select HAVE_VIRT_CPU_ACCOUNTING
> >>  	select HAVE_IRQ_TIME_ACCOUNTING
> >> +	select HAVE_RSEQ
> >>  	select IRQ_DOMAIN
> >>  	select IRQ_FORCED_THREADING
> >>  	select MODULES_USE_ELF_RELA
> >> diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
> >> index 61db86ecd318..d3bb3aaaf5ac 100644
> >> --- a/arch/powerpc/kernel/signal.c
> >> +++ b/arch/powerpc/kernel/signal.c
> >> @@ -133,6 +133,8 @@ static void do_signal(struct task_struct *tsk)
> >>  	/* Re-enable the breakpoints for the signal stack */
> >>  	thread_change_pc(tsk, tsk->thread.regs);
> >>  
> >> +	rseq_signal_deliver(tsk->thread.regs);
> >> +
> >>  	if (is32) {
> >>          	if (ksig.ka.sa.sa_flags & SA_SIGINFO)
> >>  			ret = handle_rt_signal32(&ksig, oldset, tsk);
> >> @@ -164,6 +166,7 @@ void do_notify_resume(struct pt_regs *regs, unsigned long
> >> thread_info_flags)
> >>  	if (thread_info_flags & _TIF_NOTIFY_RESUME) {
> >>  		clear_thread_flag(TIF_NOTIFY_RESUME);
> >>  		tracehook_notify_resume(regs);
> >> +		rseq_handle_notify_resume(regs);
> >>  	}
> >>  
> >>  	user_enter();
> > 
> > Again no rseq_syscall().
> 
> Same question for PowerPC as for ARM:
> 
> Considering that rseq_syscall is implemented as follows:
> 
> +void rseq_syscall(struct pt_regs *regs)
> +{
> +       unsigned long ip = instruction_pointer(regs);
> +       struct task_struct *t = current;
> +       struct rseq_cs rseq_cs;
> +
> +       if (!t->rseq)
> +               return;
> +       if (!access_ok(VERIFY_READ, t->rseq, sizeof(*t->rseq)) ||
> +           rseq_get_rseq_cs(t, &rseq_cs) || in_rseq_cs(ip, &rseq_cs))
> +               force_sig(SIGSEGV, t);
> +}
> 
> and that x86 calls it from syscall_return_slowpath() (which AFAIU is
> now used in the fast-path since KPTI), I wonder where we should call

So we actually detect this after the syscall takes effect, right? I
wonder whether this could be problematic, because "disallowing syscall"
in rseq areas may means the syscall won't take effect to some people, I
guess?

> this on PowerPC ?  I was under the impression that PowerPC return to
> userspace fast-path was not calling C code unless work flags were set,
> but I might be wrong.
> 

I think you're right. So we have to introduce callsite to rseq_syscall()
in syscall path, something like:

diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 51695608c68b..a25734a96640 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -222,6 +222,9 @@ system_call_exit:
 	mtmsrd	r11,1
 #endif /* CONFIG_PPC_BOOK3E */
 
+	addi    r3,r1,STACK_FRAME_OVERHEAD
+	bl	rseq_syscall
+
 	ld	r9,TI_FLAGS(r12)
 	li	r11,-MAX_ERRNO
 	andi.	r0,r9,(_TIF_SYSCALL_DOTRACE|_TIF_SINGLESTEP|_TIF_USER_WORK_MASK|_TIF_PERSYSCALL_MASK)

But I think it's important for us to first decide where (before or after
the syscall) we do the detection.

Regards,
Boqun

> Thoughts ?
> 
> Thanks!
> 
> Mathieu
> 
> -- 
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply related

* Re: [PATCH] selftests/powerpc: add test to verify rfi flush across a system call
From: Michael Ellerman @ 2018-05-17  1:03 UTC (permalink / raw)
  To: Naveen N. Rao; +Cc: Anton Blanchard, linuxppc-dev
In-Reply-To: <20180406165546.19433-1-naveen.n.rao@linux.vnet.ibm.com>

"Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> writes:

> diff --git a/tools/testing/selftests/powerpc/utils.c b/tools/testing/selftests/powerpc/utils.c
> index d46916867a6f..c6b1d20ed3ba 100644
> --- a/tools/testing/selftests/powerpc/utils.c
> +++ b/tools/testing/selftests/powerpc/utils.c
> @@ -104,3 +111,149 @@ int pick_online_cpu(void)
>  	printf("No cpus in affinity mask?!\n");
>  	return -1;
>  }
...
> +
> +static void sigill_handler(int signr, siginfo_t *info, void *unused)
> +{
> +	static int warned = 0;
> +	ucontext_t *ctx = (ucontext_t *)unused;
> +	unsigned int *pc = (unsigned int *)ctx->uc_mcontext.gp_regs[PT_NIP];

The above doesn't work on 32-bit, and this code is sometimes built 32-bit.

For an example of how to handle 32 and 64-bit, see eg:

  tools/testing/selftests/powerpc/primitives/load_unaligned_zeropad.c

cheers

^ permalink raw reply

* Re: [PATCH v2 00/10] KVM: PPC: reimplement mmio emulation with analyse_instr()
From: Paul Mackerras @ 2018-05-17  0:04 UTC (permalink / raw)
  To: wei.guo.simon; +Cc: kvm-ppc, kvm, linuxppc-dev
In-Reply-To: <1525674016-6703-1-git-send-email-wei.guo.simon@gmail.com>

On Mon, May 07, 2018 at 02:20:06PM +0800, wei.guo.simon@gmail.com wrote:
> From: Simon Guo <wei.guo.simon@gmail.com>
> 
> We already have analyse_instr() which analyzes instructions for the instruction
> type, size, addtional flags, etc. What kvmppc_emulate_loadstore() did is somehow
> duplicated and it will be good to utilize analyse_instr() to reimplement the
> code. The advantage is that the code logic will be shared and more clean to be 
> maintained.
> 
> This patch series reimplement kvmppc_emulate_loadstore() for various load/store
> instructions. 
> 
> The testcase locates at:
> https://github.com/justdoitqd/publicFiles/blob/master/test_mmio.c
> 
> - Tested at both PR/HV KVM. 
> - Also tested with little endian host & big endian guest.

This series looks good.  I have just a couple of comments on two of
the patches.

Thanks,
Paul.

^ permalink raw reply

* Re: [PATCH v2 07/10] KVM: PPC: reimplement LOAD_FP/STORE_FP instruction mmio emulation with analyse_intr() input
From: Paul Mackerras @ 2018-05-16 23:52 UTC (permalink / raw)
  To: wei.guo.simon; +Cc: kvm-ppc, kvm, linuxppc-dev
In-Reply-To: <1525674016-6703-8-git-send-email-wei.guo.simon@gmail.com>

On Mon, May 07, 2018 at 02:20:13PM +0800, wei.guo.simon@gmail.com wrote:
> From: Simon Guo <wei.guo.simon@gmail.com>
> 
> This patch reimplements LOAD_FP/STORE_FP instruction MMIO emulation with
> analyse_intr() input. It utilizes the FPCONV/UPDATE properties exported by
> analyse_instr() and invokes kvmppc_handle_load(s)/kvmppc_handle_store()
> accordingly.
> 
> For FP store MMIO emulation, the FP regs need to be flushed firstly so
> that the right FP reg vals can be read from vcpu->arch.fpr, which will
> be stored into MMIO data.
> 
> Suggested-by: Paul Mackerras <paulus@ozlabs.org>
> Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>

One comment below, otherwise looks good...

>  arch/powerpc/kvm/emulate_loadstore.c | 197 +++++++----------------------------
>  1 file changed, 40 insertions(+), 157 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/emulate_loadstore.c b/arch/powerpc/kvm/emulate_loadstore.c
> index 2a91845..5a6571c 100644
> --- a/arch/powerpc/kvm/emulate_loadstore.c
> +++ b/arch/powerpc/kvm/emulate_loadstore.c
> @@ -138,6 +138,22 @@ int kvmppc_emulate_loadstore(struct kvm_vcpu *vcpu)
>  
>  			break;
>  		}
> +#ifdef CONFIG_PPC_FPU
> +		case LOAD_FP:
> +			if (kvmppc_check_fp_disabled(vcpu))
> +				return EMULATE_DONE;
> +
> +			if (op.type & FPCONV)
> +				vcpu->arch.mmio_sp64_extend = 1;
> +
> +			emulated = kvmppc_handle_load(run, vcpu,
> +					KVM_MMIO_REG_FPR|op.reg, size, 1);

You need to check the SIGNEXT flag and call kvmppc_handle_loads if it
is set, because of the lfiwax case:

> -		case OP_31_XOP_LFIWAX:
> -			if (kvmppc_check_fp_disabled(vcpu))
> -				return EMULATE_DONE;
> -			emulated = kvmppc_handle_loads(run, vcpu,
> -				KVM_MMIO_REG_FPR|rt, 4, 1);
> -			break;

Paul.

^ permalink raw reply

* Re: [PATCH v2 05/10] KVM: PPC: reimplement non-SIMD LOAD/STORE instruction mmio emulation with analyse_intr() input
From: Paul Mackerras @ 2018-05-16 23:49 UTC (permalink / raw)
  To: wei.guo.simon; +Cc: kvm-ppc, kvm, linuxppc-dev
In-Reply-To: <1525674016-6703-6-git-send-email-wei.guo.simon@gmail.com>

On Mon, May 07, 2018 at 02:20:11PM +0800, wei.guo.simon@gmail.com wrote:
> From: Simon Guo <wei.guo.simon@gmail.com>
> 
> This patch reimplements non-SIMD LOAD/STORE instruction MMIO emulation
> with analyse_intr() input. It utilizes the BYTEREV/UPDATE/SIGNEXT
> properties exported by analyse_instr() and invokes
> kvmppc_handle_load(s)/kvmppc_handle_store() accordingly.
> 
> It also move CACHEOP type handling into the skeleton.
> 
> instruction_type within kvm_ppc.h is renamed to avoid conflict with
> sstep.h.
> 
> Suggested-by: Paul Mackerras <paulus@ozlabs.org>
> Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>

Looks pretty good, but one comment below...

> -		case OP_31_XOP_LWAUX:
> -			emulated = kvmppc_handle_loads(run, vcpu, rt, 4, 1);
> -			kvmppc_set_gpr(vcpu, ra, vcpu->arch.vaddr_accessed);
> -			break;
> +	if (emulated == EMULATE_DONE)
> +		goto out;

Shouldn't this also goto out if emulated == EMULATE_DO_MMIO?

Paul.

^ permalink raw reply

* Re: [PATCH] Revert "powerpc/64: Fix checksum folding in csum_add()"
From: Paul Mackerras @ 2018-05-16 23:10 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Michael Ellerman, Scott Wood,
	linux-kernel, linuxppc-dev, Shile Zhang
In-Reply-To: <20180410063437.217D2653BC@po15720vm.idsi0.si.c-s.fr>

On Tue, Apr 10, 2018 at 08:34:37AM +0200, Christophe Leroy wrote:
> This reverts commit 6ad966d7303b70165228dba1ee8da1a05c10eefe.
> 
> That commit was pointless, because csum_add() sums two 32 bits
> values, so the sum is 0x1fffffffe at the maximum.
> And then when adding upper part (1) and lower part (0xfffffffe),
> the result is 0xffffffff which doesn't carry.
> Any lower value will not carry either.
> 
> And behind the fact that this commit is useless, it also kills the
> whole purpose of having an arch specific inline csum_add()
> because the resulting code gets even worse than what is obtained
> with the generic implementation of csum_add()
> 
> 0000000000000240 <.csum_add>:
>  240:	38 00 ff ff 	li      r0,-1
>  244:	7c 84 1a 14 	add     r4,r4,r3
>  248:	78 00 00 20 	clrldi  r0,r0,32
>  24c:	78 89 00 22 	rldicl  r9,r4,32,32
>  250:	7c 80 00 38 	and     r0,r4,r0
>  254:	7c 09 02 14 	add     r0,r9,r0
>  258:	78 09 00 22 	rldicl  r9,r0,32,32
>  25c:	7c 00 4a 14 	add     r0,r0,r9
>  260:	78 03 00 20 	clrldi  r3,r0,32
>  264:	4e 80 00 20 	blr
> 
> In comparison, the generic implementation of csum_add() gives:
> 
> 0000000000000290 <.csum_add>:
>  290:	7c 63 22 14 	add     r3,r3,r4
>  294:	7f 83 20 40 	cmplw   cr7,r3,r4
>  298:	7c 10 10 26 	mfocrf  r0,1
>  29c:	54 00 ef fe 	rlwinm  r0,r0,29,31,31
>  2a0:	7c 60 1a 14 	add     r3,r0,r3
>  2a4:	78 63 00 20 	clrldi  r3,r3,32
>  2a8:	4e 80 00 20 	blr
> 
> And the reverted implementation for PPC64 gives:
> 
> 0000000000000240 <.csum_add>:
>  240:	7c 84 1a 14 	add     r4,r4,r3
>  244:	78 80 00 22 	rldicl  r0,r4,32,32
>  248:	7c 80 22 14 	add     r4,r0,r4
>  24c:	78 83 00 20 	clrldi  r3,r4,32
>  250:	4e 80 00 20 	blr
> 
> Fixes: 6ad966d7303b7 ("powerpc/64: Fix checksum folding in csum_add()")
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>

Seems I was right first time... :)

Acked-by: Paul Mackerras <paulus@ozlabs.org>

^ permalink raw reply

* Re: [PATCH] pkeys: Introduce PKEY_ALLOC_SIGNALINHERIT and change signal semantics
From: Ram Pai @ 2018-05-16 21:07 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Florian Weimer, Dave Hansen, Linux-MM, Linux API, linux-x86_64,
	linux-arch, X86 ML, linuxppc-dev
In-Reply-To: <CALCETrVQs=ix-w9_MLJWikzmBG-e2Fzg61TrZLNVv5R3XFOs=g@mail.gmail.com>

On Wed, May 16, 2018 at 01:37:46PM -0700, Andy Lutomirski wrote:
> On Wed, May 16, 2018 at 1:35 PM Ram Pai <linuxram@us.ibm.com> wrote:
> 
> > On Tue, May 08, 2018 at 02:40:46PM +0200, Florian Weimer wrote:
> > > On 05/08/2018 04:49 AM, Andy Lutomirski wrote:
> > > >On Mon, May 7, 2018 at 2:48 AM Florian Weimer <fweimer@redhat.com>
> wrote:
> > > >
> > > >>On 05/03/2018 06:05 AM, Andy Lutomirski wrote:
> > > >>>On Wed, May 2, 2018 at 7:11 PM Ram Pai <linuxram@us.ibm.com> wrote:
> > > >>>
> > > >>>>On Wed, May 02, 2018 at 09:23:49PM +0000, Andy Lutomirski wrote:
> > > >>>>>
> > > >>>>>>If I recall correctly, the POWER maintainer did express a strong
> > > >>>desire
> > > >>>>>>back then for (what is, I believe) their current semantics, which
> my
> > > >>>>>>PKEY_ALLOC_SIGNALINHERIT patch implements for x86, too.
> > > >>>>>
> > > >>>>>Ram, I really really don't like the POWER semantics.  Can you give
> > > >some
> > > >>>>>justification for them?  Does POWER at least have an atomic way for
> > > >>>>>userspace to modify just the key it wants to modify or, even
> better,
> > > >>>>>special load and store instructions to use alternate keys?
> > > >>>
> > > >>>>I wouldn't call it POWER semantics. The way I implemented it on
> power
> > > >>>>lead to the semantics, given that nothing was explicitly stated
> > > >>>>about how the semantics should work within a signal handler.
> > > >>>
> > > >>>I think that this is further evidence that we should introduce a new
> > > >>>pkey_alloc() mode and deprecate the old.  To the extent possible,
> this
> > > >>>thing should work the same way on x86 and POWER.
> > > >
> > > >>Do you propose to change POWER or to change x86?
> > > >
> > > >Sorry for being slow to reply.  I propose to introduce a new
> > > >PKEY_ALLOC_something variant on x86 and POWER and to make the behavior
> > > >match on both.
> > >
> > > So basically implement PKEY_ALLOC_SETSIGNAL for POWER, and keep the
> > > existing (different) behavior without the flag?
> > >
> > > Ram, would you be okay with that?  Could you give me a hand if
> > > necessary?  (I assume we have silicon in-house because it's a
> > > long-standing feature of the POWER platform which was simply dormant
> > > on Linux until now.)
> 
> > Yes. I can help you with that.
> 
> > So let me see if I understand the overall idea.
> 
> > Application can allocate new keys through a new syscall
> > sys_pkey_alloc_1(flags, init_val, sig_init_val)
> 
> > 'sig_init_val' is the permission-state of the key in signal context.
> 
> > The kernel will set the permission of each keys to their
> > corresponding values when entering the signal handler and revert
> > on return from the signal handler.
> 
> > just like init_val, sig_init_val also percolates to children threads.
> 
> 
> I was imagining it would be just pkey_alloc(SOME_NEW_FLAG, init_val); and
> the init val would be used for the current thread and for signal handlers.

what would change the key-permission-values enforced in signal-handler
context?  Or can it never be changed, ones set through sys_pkey_alloc()?

I suppose key-permission-values change done in non-signal-handler context,
will not apply to those in signal-handler context.

Can the signal handler change the key-permission-values from the
signal-handler context?

RP

^ permalink raw reply

* Re: [PATCH] pkeys: Introduce PKEY_ALLOC_SIGNALINHERIT and change signal semantics
From: Andy Lutomirski @ 2018-05-16 20:54 UTC (permalink / raw)
  To: linuxram
  Cc: Florian Weimer, Andrew Lutomirski, Dave Hansen, Linux-MM,
	Linux API, linux-x86_64, linux-arch, X86 ML, linuxppc-dev
In-Reply-To: <20180516205244.GB5479@ram.oc3035372033.ibm.com>

On Wed, May 16, 2018 at 1:52 PM Ram Pai <linuxram@us.ibm.com> wrote:

> On Mon, May 14, 2018 at 02:01:23PM +0200, Florian Weimer wrote:
> > On 05/09/2018 04:41 PM, Andy Lutomirski wrote:
> > >Hmm.  I can get on board with the idea that fork() / clone() /
> > >pthread_create() are all just special cases of the idea that the thread
> > >that*calls*  them should have the right pkey values, and the latter is
> > >already busted given our inability to asynchronously propagate the new
mode
> > >in pkey_alloc().  So let's so PKEY_ALLOC_SETSIGNAL as a starting point.
> >
> > Ram, any suggestions for implementing this on POWER?

> I suspect the changes will go in
> restore_user_regs() and save_user_regs().  These are the functions
> that save and restore register state before entry and exit into/from
> a signal handler.

> >
> > >One thing we could do, though: the current initual state on process
> > >creation is all access blocked on all keys.  We could change it so that
> > >half the keys are fully blocked and half are read-only.  Then we could
add
> > >a PKEY_ALLOC_STRICT or similar that allocates a key with the correct
> > >initial state*and*  does the setsignal thing.  If there are no keys
left
> > >with the correct initial state, then it fails.
> >
> > The initial PKRU value can currently be configured by the system
> > administrator.  I fear this approach has too many moving parts to be
> > viable.

> Sounds like on x86  keys can go active in signal-handler
> without any explicit allocation request by the application.  This is not
> the case on power. Is that API requirement? Hope not.

On x86, signals are currently delivered with all keys locked all the way
down (except for the magic one we use to emulate no-read access).  I would
hesitate to change this for existing applications.

^ permalink raw reply

* Re: [PATCH] pkeys: Introduce PKEY_ALLOC_SIGNALINHERIT and change signal semantics
From: Ram Pai @ 2018-05-16 20:52 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Andy Lutomirski, Dave Hansen, Linux-MM, Linux API, linux-x86_64,
	linux-arch, X86 ML, linuxppc-dev
In-Reply-To: <008010c1-20a1-c307-25ac-8a69d672d031@redhat.com>

On Mon, May 14, 2018 at 02:01:23PM +0200, Florian Weimer wrote:
> On 05/09/2018 04:41 PM, Andy Lutomirski wrote:
> >Hmm.  I can get on board with the idea that fork() / clone() /
> >pthread_create() are all just special cases of the idea that the thread
> >that*calls*  them should have the right pkey values, and the latter is
> >already busted given our inability to asynchronously propagate the new mode
> >in pkey_alloc().  So let's so PKEY_ALLOC_SETSIGNAL as a starting point.
> 
> Ram, any suggestions for implementing this on POWER?

I suspect the changes will go in 
restore_user_regs() and save_user_regs().  These are the functions
that save and restore register state before entry and exit into/from
a signal handler.

> 
> >One thing we could do, though: the current initual state on process
> >creation is all access blocked on all keys.  We could change it so that
> >half the keys are fully blocked and half are read-only.  Then we could add
> >a PKEY_ALLOC_STRICT or similar that allocates a key with the correct
> >initial state*and*  does the setsignal thing.  If there are no keys left
> >with the correct initial state, then it fails.
> 
> The initial PKRU value can currently be configured by the system
> administrator.  I fear this approach has too many moving parts to be
> viable.

Sounds like on x86  keys can go active in signal-handler 
without any explicit allocation request by the application.  This is not
the case on power. Is that API requirement? Hope not.

RP

^ permalink raw reply

* Re: [PATCH] pkeys: Introduce PKEY_ALLOC_SIGNALINHERIT and change signal semantics
From: Andy Lutomirski @ 2018-05-16 20:37 UTC (permalink / raw)
  To: linuxram
  Cc: Florian Weimer, Andrew Lutomirski, Dave Hansen, Linux-MM,
	Linux API, linux-x86_64, linux-arch, X86 ML, linuxppc-dev
In-Reply-To: <20180516203534.GA5479@ram.oc3035372033.ibm.com>

On Wed, May 16, 2018 at 1:35 PM Ram Pai <linuxram@us.ibm.com> wrote:

> On Tue, May 08, 2018 at 02:40:46PM +0200, Florian Weimer wrote:
> > On 05/08/2018 04:49 AM, Andy Lutomirski wrote:
> > >On Mon, May 7, 2018 at 2:48 AM Florian Weimer <fweimer@redhat.com>
wrote:
> > >
> > >>On 05/03/2018 06:05 AM, Andy Lutomirski wrote:
> > >>>On Wed, May 2, 2018 at 7:11 PM Ram Pai <linuxram@us.ibm.com> wrote:
> > >>>
> > >>>>On Wed, May 02, 2018 at 09:23:49PM +0000, Andy Lutomirski wrote:
> > >>>>>
> > >>>>>>If I recall correctly, the POWER maintainer did express a strong
> > >>>desire
> > >>>>>>back then for (what is, I believe) their current semantics, which
my
> > >>>>>>PKEY_ALLOC_SIGNALINHERIT patch implements for x86, too.
> > >>>>>
> > >>>>>Ram, I really really don't like the POWER semantics.  Can you give
> > >some
> > >>>>>justification for them?  Does POWER at least have an atomic way for
> > >>>>>userspace to modify just the key it wants to modify or, even
better,
> > >>>>>special load and store instructions to use alternate keys?
> > >>>
> > >>>>I wouldn't call it POWER semantics. The way I implemented it on
power
> > >>>>lead to the semantics, given that nothing was explicitly stated
> > >>>>about how the semantics should work within a signal handler.
> > >>>
> > >>>I think that this is further evidence that we should introduce a new
> > >>>pkey_alloc() mode and deprecate the old.  To the extent possible,
this
> > >>>thing should work the same way on x86 and POWER.
> > >
> > >>Do you propose to change POWER or to change x86?
> > >
> > >Sorry for being slow to reply.  I propose to introduce a new
> > >PKEY_ALLOC_something variant on x86 and POWER and to make the behavior
> > >match on both.
> >
> > So basically implement PKEY_ALLOC_SETSIGNAL for POWER, and keep the
> > existing (different) behavior without the flag?
> >
> > Ram, would you be okay with that?  Could you give me a hand if
> > necessary?  (I assume we have silicon in-house because it's a
> > long-standing feature of the POWER platform which was simply dormant
> > on Linux until now.)

> Yes. I can help you with that.

> So let me see if I understand the overall idea.

> Application can allocate new keys through a new syscall
> sys_pkey_alloc_1(flags, init_val, sig_init_val)

> 'sig_init_val' is the permission-state of the key in signal context.

> The kernel will set the permission of each keys to their
> corresponding values when entering the signal handler and revert
> on return from the signal handler.

> just like init_val, sig_init_val also percolates to children threads.


I was imagining it would be just pkey_alloc(SOME_NEW_FLAG, init_val); and
the init val would be used for the current thread and for signal handlers.
New threads would inherit their parents' values.  The latter is certainly
up for negotiation, but it's the simplest behavior, and it's not obvious to
be that it's wrong.

--Andy

^ permalink raw reply

* Re: [PATCH] pkeys: Introduce PKEY_ALLOC_SIGNALINHERIT and change signal semantics
From: Ram Pai @ 2018-05-16 20:35 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Andy Lutomirski, Dave Hansen, Linux-MM, Linux API, linux-x86_64,
	linux-arch, X86 ML, linuxppc-dev
In-Reply-To: <314e1a48-db94-9b37-8793-a95a2082c9e2@redhat.com>

On Tue, May 08, 2018 at 02:40:46PM +0200, Florian Weimer wrote:
> On 05/08/2018 04:49 AM, Andy Lutomirski wrote:
> >On Mon, May 7, 2018 at 2:48 AM Florian Weimer <fweimer@redhat.com> wrote:
> >
> >>On 05/03/2018 06:05 AM, Andy Lutomirski wrote:
> >>>On Wed, May 2, 2018 at 7:11 PM Ram Pai <linuxram@us.ibm.com> wrote:
> >>>
> >>>>On Wed, May 02, 2018 at 09:23:49PM +0000, Andy Lutomirski wrote:
> >>>>>
> >>>>>>If I recall correctly, the POWER maintainer did express a strong
> >>>desire
> >>>>>>back then for (what is, I believe) their current semantics, which my
> >>>>>>PKEY_ALLOC_SIGNALINHERIT patch implements for x86, too.
> >>>>>
> >>>>>Ram, I really really don't like the POWER semantics.  Can you give
> >some
> >>>>>justification for them?  Does POWER at least have an atomic way for
> >>>>>userspace to modify just the key it wants to modify or, even better,
> >>>>>special load and store instructions to use alternate keys?
> >>>
> >>>>I wouldn't call it POWER semantics. The way I implemented it on power
> >>>>lead to the semantics, given that nothing was explicitly stated
> >>>>about how the semantics should work within a signal handler.
> >>>
> >>>I think that this is further evidence that we should introduce a new
> >>>pkey_alloc() mode and deprecate the old.  To the extent possible, this
> >>>thing should work the same way on x86 and POWER.
> >
> >>Do you propose to change POWER or to change x86?
> >
> >Sorry for being slow to reply.  I propose to introduce a new
> >PKEY_ALLOC_something variant on x86 and POWER and to make the behavior
> >match on both.
> 
> So basically implement PKEY_ALLOC_SETSIGNAL for POWER, and keep the
> existing (different) behavior without the flag?
> 
> Ram, would you be okay with that?  Could you give me a hand if
> necessary?  (I assume we have silicon in-house because it's a
> long-standing feature of the POWER platform which was simply dormant
> on Linux until now.)

Yes. I can help you with that.

So let me see if I understand the overall idea.

Application can allocate new keys through a new syscall
sys_pkey_alloc_1(flags, init_val, sig_init_val) 

'sig_init_val' is the permission-state of the key in signal context.

The kernel will set the permission of each keys to their
corresponding values when entering the signal handler and revert
on return from the signal handler.

just like init_val, sig_init_val also percolates to children threads.

Is this a correct understanding?
RP

^ permalink raw reply

* Re: [PATCH 07/14] powerpc: Add support for restartable sequences
From: Mathieu Desnoyers @ 2018-05-16 20:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Paul E. McKenney, Boqun Feng, Andy Lutomirski, Dave Watson,
	linux-kernel, linux-api, Paul Turner, Andrew Morton, Russell King,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Andrew Hunter,
	Andi Kleen, Chris Lameter, Ben Maurer, rostedt, Josh Triplett,
	Linus Torvalds, Catalin Marinas, Will Deacon, Michael Kerrisk,
	Joel Fernandes, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, linuxppc-dev
In-Reply-To: <20180516161837.GI12198@hirez.programming.kicks-ass.net>

----- On May 16, 2018, at 12:18 PM, Peter Zijlstra peterz@infradead.org wrote:

> On Mon, Apr 30, 2018 at 06:44:26PM -0400, Mathieu Desnoyers wrote:
>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>> index c32a181a7cbb..ed21a777e8c6 100644
>> --- a/arch/powerpc/Kconfig
>> +++ b/arch/powerpc/Kconfig
>> @@ -223,6 +223,7 @@ config PPC
>>  	select HAVE_SYSCALL_TRACEPOINTS
>>  	select HAVE_VIRT_CPU_ACCOUNTING
>>  	select HAVE_IRQ_TIME_ACCOUNTING
>> +	select HAVE_RSEQ
>>  	select IRQ_DOMAIN
>>  	select IRQ_FORCED_THREADING
>>  	select MODULES_USE_ELF_RELA
>> diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
>> index 61db86ecd318..d3bb3aaaf5ac 100644
>> --- a/arch/powerpc/kernel/signal.c
>> +++ b/arch/powerpc/kernel/signal.c
>> @@ -133,6 +133,8 @@ static void do_signal(struct task_struct *tsk)
>>  	/* Re-enable the breakpoints for the signal stack */
>>  	thread_change_pc(tsk, tsk->thread.regs);
>>  
>> +	rseq_signal_deliver(tsk->thread.regs);
>> +
>>  	if (is32) {
>>          	if (ksig.ka.sa.sa_flags & SA_SIGINFO)
>>  			ret = handle_rt_signal32(&ksig, oldset, tsk);
>> @@ -164,6 +166,7 @@ void do_notify_resume(struct pt_regs *regs, unsigned long
>> thread_info_flags)
>>  	if (thread_info_flags & _TIF_NOTIFY_RESUME) {
>>  		clear_thread_flag(TIF_NOTIFY_RESUME);
>>  		tracehook_notify_resume(regs);
>> +		rseq_handle_notify_resume(regs);
>>  	}
>>  
>>  	user_enter();
> 
> Again no rseq_syscall().

Same question for PowerPC as for ARM:

Considering that rseq_syscall is implemented as follows:

+void rseq_syscall(struct pt_regs *regs)
+{
+       unsigned long ip = instruction_pointer(regs);
+       struct task_struct *t = current;
+       struct rseq_cs rseq_cs;
+
+       if (!t->rseq)
+               return;
+       if (!access_ok(VERIFY_READ, t->rseq, sizeof(*t->rseq)) ||
+           rseq_get_rseq_cs(t, &rseq_cs) || in_rseq_cs(ip, &rseq_cs))
+               force_sig(SIGSEGV, t);
+}

and that x86 calls it from syscall_return_slowpath() (which AFAIU is
now used in the fast-path since KPTI), I wonder where we should call
this on PowerPC ?  I was under the impression that PowerPC return to
userspace fast-path was not calling C code unless work flags were set,
but I might be wrong.

Thoughts ?

Thanks!

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply

* Re: [PATCH 07/14] powerpc: Add support for restartable sequences
From: Peter Zijlstra @ 2018-05-16 16:18 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Paul E . McKenney, Boqun Feng, Andy Lutomirski, Dave Watson,
	linux-kernel, linux-api, Paul Turner, Andrew Morton, Russell King,
	Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Andrew Hunter,
	Andi Kleen, Chris Lameter, Ben Maurer, Steven Rostedt,
	Josh Triplett, Linus Torvalds, Catalin Marinas, Will Deacon,
	Michael Kerrisk, Joel Fernandes, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, linuxppc-dev
In-Reply-To: <20180430224433.17407-8-mathieu.desnoyers@efficios.com>

On Mon, Apr 30, 2018 at 06:44:26PM -0400, Mathieu Desnoyers wrote:
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index c32a181a7cbb..ed21a777e8c6 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -223,6 +223,7 @@ config PPC
>  	select HAVE_SYSCALL_TRACEPOINTS
>  	select HAVE_VIRT_CPU_ACCOUNTING
>  	select HAVE_IRQ_TIME_ACCOUNTING
> +	select HAVE_RSEQ
>  	select IRQ_DOMAIN
>  	select IRQ_FORCED_THREADING
>  	select MODULES_USE_ELF_RELA
> diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c
> index 61db86ecd318..d3bb3aaaf5ac 100644
> --- a/arch/powerpc/kernel/signal.c
> +++ b/arch/powerpc/kernel/signal.c
> @@ -133,6 +133,8 @@ static void do_signal(struct task_struct *tsk)
>  	/* Re-enable the breakpoints for the signal stack */
>  	thread_change_pc(tsk, tsk->thread.regs);
>  
> +	rseq_signal_deliver(tsk->thread.regs);
> +
>  	if (is32) {
>          	if (ksig.ka.sa.sa_flags & SA_SIGINFO)
>  			ret = handle_rt_signal32(&ksig, oldset, tsk);
> @@ -164,6 +166,7 @@ void do_notify_resume(struct pt_regs *regs, unsigned long thread_info_flags)
>  	if (thread_info_flags & _TIF_NOTIFY_RESUME) {
>  		clear_thread_flag(TIF_NOTIFY_RESUME);
>  		tracehook_notify_resume(regs);
> +		rseq_handle_notify_resume(regs);
>  	}
>  
>  	user_enter();

Again no rseq_syscall().

^ permalink raw reply

* Re: [PATCH] pkeys: Introduce PKEY_ALLOC_SIGNALINHERIT and change signal semantics
From: Dave Hansen @ 2018-05-16 17:01 UTC (permalink / raw)
  To: Florian Weimer, Andy Lutomirski
  Cc: Andy Lutomirski, linuxram, Linux-MM, Linux API, linux-x86_64,
	linux-arch, X86 ML, linuxppc-dev
In-Reply-To: <c727a0da-dd5b-4ca2-375c-773ec550ab25@redhat.com>

On 05/14/2018 08:34 AM, Florian Weimer wrote:
>>> The initial PKRU value can currently be configured by the system
>>> administrator.  I fear this approach has too many moving parts to be
>>> viable.
>>
>> Honestly, I think we should drop that option. I don’t see how we can
>> expect an administrator to do this usefully.
> 
> I don't disagree—it makes things way less predictable in practice.

I originally put that thing in there to make Andy happy with the initial
permissions, and give us a way to back it out if something went wrong.
I have no objections to removing it either.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox