LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH stable 4.4] powerpc/lib: fix book3s/32 boot failure due to code patching
From: Christophe Leroy @ 2019-05-15 13:30 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Greg KH,
	erhard_f, Michael Neuling
  Cc: linuxppc-dev, linux-kernel, stable

[Backport of upstream commit b45ba4a51cde29b2939365ef0c07ad34c8321789]

On powerpc32, patch_instruction() is called by apply_feature_fixups()
which is called from early_init()

There is the following note in front of early_init():
 * Note that the kernel may be running at an address which is different
 * from the address that it was linked at, so we must use RELOC/PTRRELOC
 * to access static data (including strings).  -- paulus

Therefore init_mem_is_free must be accessed with PTRRELOC()

Link: https://bugzilla.kernel.org/show_bug.cgi?id=203597
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>

---
Can't apply the upstream commit as such due to several other unrelated stuff
like for instance STRICT_KERNEL_RWX which are missing.
So instead, using same approach as for commit 252eb55816a6f69ef9464cad303cdb3326cdc61d

Removed the Fixes: tag as I don't know yet the commit Id of the fixed commit on 4.4 branch.
---
 arch/powerpc/lib/code-patching.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index 2604192c0719..65ce778aee46 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -28,7 +28,7 @@ int patch_instruction(unsigned int *addr, unsigned int instr)
 	int err;
 
 	/* Make sure we aren't patching a freed init section */
-	if (init_mem_is_free && is_init(addr)) {
+	if (*PTRRELOC(&init_mem_is_free) && is_init(addr)) {
 		pr_debug("Skipping init section patching addr: 0x%px\n", addr);
 		return 0;
 	}
-- 
2.13.3


^ permalink raw reply related

* Re: [PATCH stable 4.9] powerpc/lib: fix book3s/32 boot failure due to code patching
From: Christophe Leroy @ 2019-05-15 13:31 UTC (permalink / raw)
  To: Greg KH
  Cc: erhard_f, Michael Neuling, linux-kernel, stable@vger.kernel.org,
	Paul Mackerras, linuxppc-dev
In-Reply-To: <20190515130825.GA3794@kroah.com>



Le 15/05/2019 à 15:08, Greg KH a écrit :
> On Wed, May 15, 2019 at 02:35:36PM +0200, Christophe Leroy wrote:
>>
>>
>> Le 15/05/2019 à 10:29, Greg KH a écrit :
>>> On Wed, May 15, 2019 at 06:40:47AM +0000, Christophe Leroy wrote:
>>>> [Backport of upstream commit b45ba4a51cde29b2939365ef0c07ad34c8321789]
>>>>
>>>> On powerpc32, patch_instruction() is called by apply_feature_fixups()
>>>> which is called from early_init()
>>>>
>>>> There is the following note in front of early_init():
>>>>    * Note that the kernel may be running at an address which is different
>>>>    * from the address that it was linked at, so we must use RELOC/PTRRELOC
>>>>    * to access static data (including strings).  -- paulus
>>>>
>>>> Therefore init_mem_is_free must be accessed with PTRRELOC()
>>>>
>>>> Fixes: 1c38a84d4586 ("powerpc: Avoid code patching freed init sections")
>>>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=203597
>>>> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
>>>>
>>>> ---
>>>> Can't apply the upstream commit as such due to several other unrelated stuff
>>>> like for instance STRICT_KERNEL_RWX which are missing.
>>>> So instead, using same approach as for commit 252eb55816a6f69ef9464cad303cdb3326cdc61d
>>>
>>> Now queued up, thanks.
>>>
>>
>> Should go to 4.4 as well since the commit it fixes is now queued for 4.4
>> ([PATCH 4.4 056/266] powerpc: Avoid code patching freed init sections)
> 
> Ok, can someone send me a backport that actually applies there?
> 

Done

Christophe

^ permalink raw reply

* [Bug 203609] New: Build error: implicit declaration of function 'cpu_mitigations_off'
From: bugzilla-daemon @ 2019-05-15 13:36 UTC (permalink / raw)
  To: linuxppc-dev

https://bugzilla.kernel.org/show_bug.cgi?id=203609

            Bug ID: 203609
           Summary: Build error: implicit declaration of function
                    'cpu_mitigations_off'
           Product: Platform Specific/Hardware
           Version: 2.5
    Kernel Version: 4.19.43 and 4.14.119
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: PPC-64
          Assignee: platform_ppc-64@kernel-bugs.osdl.org
          Reporter: jason@bluehome.net
        Regression: No

Created attachment 282765
  --> https://bugzilla.kernel.org/attachment.cgi?id=282765&action=edit
Build log

This just showed up in 4.19.43 and 4.14.119. 4.19.42 and 4.14.118 were fine.
I'm building with GCC 8.3 for ppc64el. 4.19.43 and 4.14.119 also build fine for
32- and 64-bit x86.

arch/powerpc/kernel/security.c: In function 'setup_barrier_nospec':
arch/powerpc/kernel/security.c:59:21: error: implicit declaration of function
'cpu_mitigations_off' [-Werror=implicit-function-declaration]
  if (!no_nospec && !cpu_mitigations_off())
                     ^~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
scripts/Makefile.build:303: recipe for target 'arch/powerpc/kernel/security.o'
failed
make[1]: *** [arch/powerpc/kernel/security.o] Error 1
Makefile:1051: recipe for target 'arch/powerpc/kernel' failed
make: *** [arch/powerpc/kernel] Error 2

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply

* Re: [PATCH] powerpc: silence a -Wcast-function-type warning in dawr_write_file_bool
From: Mathieu Malaterre @ 2019-05-15 13:37 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Michael Neuling, linuxppc-dev, Paul Mackerras, LKML
In-Reply-To: <20190515131441.GA3072@infradead.org>

Hi Christoph,

On Wed, May 15, 2019 at 3:14 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Wed, May 15, 2019 at 02:09:42PM +0200, Mathieu Malaterre wrote:
> > In commit c1fe190c0672 ("powerpc: Add force enable of DAWR on P9
> > option") the following piece of code was added:
> >
> >    smp_call_function((smp_call_func_t)set_dawr, &null_brk, 0);
> >
> > Since GCC 8 this trigger the following warning about incompatible
> > function types:
>
> And the warning is there for a reason, and should not be hidden
> behind a cast.  This should instead be fixed by something like this:

<meme: I have no idea what I am doing>
OK, thanks for the quick feedback, will send a v2 asap.

> diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
> index da307dd93ee3..a26b67a1be83 100644
> --- a/arch/powerpc/kernel/hw_breakpoint.c
> +++ b/arch/powerpc/kernel/hw_breakpoint.c
> @@ -384,6 +384,12 @@ void hw_breakpoint_pmu_read(struct perf_event *bp)
>  bool dawr_force_enable;
>  EXPORT_SYMBOL_GPL(dawr_force_enable);
>
> +
> +static void set_dawr_cb(void *info)
> +{
> +       set_dawr(info);
> +}
> +
>  static ssize_t dawr_write_file_bool(struct file *file,
>                                     const char __user *user_buf,
>                                     size_t count, loff_t *ppos)
> @@ -403,7 +409,7 @@ static ssize_t dawr_write_file_bool(struct file *file,
>
>         /* If we are clearing, make sure all CPUs have the DAWR cleared */
>         if (!dawr_force_enable)
> -               smp_call_function((smp_call_func_t)set_dawr, &null_brk, 0);
> +               smp_call_function(set_dawr_cb, &null_brk, 0);
>
>         return rc;
>  }

^ permalink raw reply

* [PATCH v2] powerpc: silence a -Wcast-function-type warning in dawr_write_file_bool
From: Mathieu Malaterre @ 2019-05-15 13:51 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Michael Neuling, Mathieu Malaterre, linux-kernel,
	Christoph Hellwig, Paul Mackerras, linuxppc-dev
In-Reply-To: <20190515120942.3812-1-malat@debian.org>

In commit c1fe190c0672 ("powerpc: Add force enable of DAWR on P9
option") the following piece of code was added:

   smp_call_function((smp_call_func_t)set_dawr, &null_brk, 0);

Since GCC 8 this triggers the following warning about incompatible
function types:

  arch/powerpc/kernel/hw_breakpoint.c:408:21: error: cast between incompatible function types from 'int (*)(struct arch_hw_breakpoint *)' to 'void (*)(void *)' [-Werror=cast-function-type]

Since the warning is there for a reason, and should not be hidden behind
a cast, provide an intermediate callback function to avoid the warning.

Fixes: c1fe190c0672 ("powerpc: Add force enable of DAWR on P9 option")
Suggested-by: Christoph Hellwig <hch@infradead.org>
Cc: Michael Neuling <mikey@neuling.org>
Signed-off-by: Mathieu Malaterre <malat@debian.org>
---
v2: do not hide warning using a hack

 arch/powerpc/kernel/hw_breakpoint.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/hw_breakpoint.c b/arch/powerpc/kernel/hw_breakpoint.c
index f70fb89dbf60..969092d84a2f 100644
--- a/arch/powerpc/kernel/hw_breakpoint.c
+++ b/arch/powerpc/kernel/hw_breakpoint.c
@@ -386,6 +386,11 @@ void hw_breakpoint_pmu_read(struct perf_event *bp)
 bool dawr_force_enable;
 EXPORT_SYMBOL_GPL(dawr_force_enable);
 
+static void set_dawr_cb(void *info)
+{
+	set_dawr(info);
+}
+
 static ssize_t dawr_write_file_bool(struct file *file,
 				    const char __user *user_buf,
 				    size_t count, loff_t *ppos)
@@ -405,7 +410,7 @@ static ssize_t dawr_write_file_bool(struct file *file,
 
 	/* If we are clearing, make sure all CPUs have the DAWR cleared */
 	if (!dawr_force_enable)
-		smp_call_function((smp_call_func_t)set_dawr, &null_brk, 0);
+		smp_call_function(set_dawr_cb, &null_brk, 0);
 
 	return rc;
 }
-- 
2.20.1


^ permalink raw reply related

* Re: [PATCH] crypto: talitos - fix skcipher failure due to wrong output IV
From: Horia Geanta @ 2019-05-15 14:05 UTC (permalink / raw)
  To: Christophe Leroy, Herbert Xu, David S. Miller
  Cc: linuxppc-dev@lists.ozlabs.org, linux-crypto@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <a5b0d31d8fc9fc9bc2b69baa5330466090825a39.1557923113.git.christophe.leroy@c-s.fr>

On 5/15/2019 3:29 PM, Christophe Leroy wrote:
> Selftests report the following:
> 
> [    2.984845] alg: skcipher: cbc-aes-talitos encryption test failed (wrong output IV) on test vector 0, cfg="in-place"
> [    2.995377] 00000000: 3d af ba 42 9d 9e b4 30 b4 22 da 80 2c 9f ac 41
> [    3.032673] alg: skcipher: cbc-des-talitos encryption test failed (wrong output IV) on test vector 0, cfg="in-place"
> [    3.043185] 00000000: fe dc ba 98 76 54 32 10
> [    3.063238] alg: skcipher: cbc-3des-talitos encryption test failed (wrong output IV) on test vector 0, cfg="in-place"
> [    3.073818] 00000000: 7d 33 88 93 0f 93 b2 42
> 
> This above dumps show that the actual output IV is indeed the input IV.
> This is due to the IV not being copied back into the request.
> 
> This patch fixes that.
> 
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>

While here, could you please check ecb mode (which by definition does not have
an IV) is behaving correctly?
Looking in driver_algs[] list of crypto algorithms supported by talitos,
ecb(aes,des,3des) are declared with ivsize != 0.

Thanks,
Horia

^ permalink raw reply

* Re: [PATCH stable 4.4] powerpc/lib: fix book3s/32 boot failure due to code patching
From: Greg KH @ 2019-05-15 14:16 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: erhard_f, Michael Neuling, linux-kernel, stable@vger.kernel.org,
	Paul Mackerras, linuxppc-dev
In-Reply-To: <71dbc8bdad5da9f6cb0446535fb2a29c68fccf80.1557926850.git.christophe.leroy@c-s.fr>

On Wed, May 15, 2019 at 01:30:42PM +0000, Christophe Leroy wrote:
> [Backport of upstream commit b45ba4a51cde29b2939365ef0c07ad34c8321789]
> 
> On powerpc32, patch_instruction() is called by apply_feature_fixups()
> which is called from early_init()
> 
> There is the following note in front of early_init():
>  * Note that the kernel may be running at an address which is different
>  * from the address that it was linked at, so we must use RELOC/PTRRELOC
>  * to access static data (including strings).  -- paulus
> 
> Therefore init_mem_is_free must be accessed with PTRRELOC()
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=203597
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> 
> ---
> Can't apply the upstream commit as such due to several other unrelated stuff
> like for instance STRICT_KERNEL_RWX which are missing.
> So instead, using same approach as for commit 252eb55816a6f69ef9464cad303cdb3326cdc61d
> 
> Removed the Fixes: tag as I don't know yet the commit Id of the fixed commit on 4.4 branch.
> ---
>  arch/powerpc/lib/code-patching.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Now added, thanks.

greg k-h

^ permalink raw reply

* Re: [PATCH 1/2] pid: add pidfd_open()
From: Christian Brauner @ 2019-05-15 14:16 UTC (permalink / raw)
  To: Yann Droneaud
  Cc: linux-ia64, linux-sh, linux-kernel, dhowells, linux-kselftest,
	sparclinux, linux-api, elena.reshetova, linux-arch, linux-s390,
	linux-xtensa, keescook, arnd, jannh, linux-m68k, viro, luto, oleg,
	tglx, linux-arm-kernel, linux-parisc, cyphar, torvalds,
	linux-mips, luto, ebiederm, linux-alpha, akpm, linuxppc-dev
In-Reply-To: <4c5ae46657e1931a832def5645db61eb0bf1accd.camel@opteya.com>

On Wed, May 15, 2019 at 04:00:20PM +0200, Yann Droneaud wrote:
> Hi,
> 
> Le mercredi 15 mai 2019 à 12:03 +0200, Christian Brauner a écrit :
> > 
> > diff --git a/kernel/pid.c b/kernel/pid.c
> > index 20881598bdfa..237d18d6ecb8 100644
> > --- a/kernel/pid.c
> > +++ b/kernel/pid.c
> > @@ -451,6 +452,53 @@ struct pid *find_ge_pid(int nr, struct
> > pid_namespace *ns)
> >  	return idr_get_next(&ns->idr, &nr);
> >  }
> >  
> > +/**
> > + * pidfd_open() - Open new pid file descriptor.
> > + *
> > + * @pid:   pid for which to retrieve a pidfd
> > + * @flags: flags to pass
> > + *
> > + * This creates a new pid file descriptor with the O_CLOEXEC flag set for
> > + * the process identified by @pid. Currently, the process identified by
> > + * @pid must be a thread-group leader. This restriction currently exists
> > + * for all aspects of pidfds including pidfd creation (CLONE_PIDFD cannot
> > + * be used with CLONE_THREAD) and pidfd polling (only supports thread group
> > + * leaders).
> > + *
> 
> Would it be possible to create file descriptor with "restricted"
> operation ?
> 
> - O_RDONLY: waiting for process completion allowed (for example)
> - O_WRONLY: sending process signal allowed

Yes, something like this is likely going to be possible in the future.
We had discussion around this. But mapping this to O_RDONLY and O_WRONLY
is not the right model. It makes more sense to have specialized flags
that restrict actions.

> 
> For example, a process could send over a Unix socket a process a pidfd,
> allowing this to only wait for completion, but not sending signal ?
> 
> I see the permission check is not done in pidfd_open(), so what prevent
> a user from sending a signal to another user owned process ?

That's supposed to be possible. You can do the same right now already
with pids. Tools like LMK need this probably very much.
Permission checking for signals is done at send time right now.
And if you can't signal via a pid you can't signal via a pidfd as
they're both subject to the same permissions checks.

> 
> If it's in pidfd_send_signal(), then, passing the socket through
> SCM_RIGHT won't be useful if the target process is not owned by the
> same user, or root.
> 
> > + * Return: On success, a cloexec pidfd is returned.
> > + *         On error, a negative errno number will be returned.
> > + */
> > +SYSCALL_DEFINE2(pidfd_open, pid_t, pid, unsigned int, flags)
> > +{
> > +	int fd, ret;
> > +	struct pid *p;
> > +	struct task_struct *tsk;
> > +
> > +	if (flags)
> > +		return -EINVAL;
> > +
> > +	if (pid <= 0)
> > +		return -EINVAL;
> > +
> > +	p = find_get_pid(pid);
> > +	if (!p)
> > +		return -ESRCH;
> > +
> > +	rcu_read_lock();
> > +	tsk = pid_task(p, PIDTYPE_PID);
> > +	if (!tsk)
> > +		ret = -ESRCH;
> > +	else if (unlikely(!thread_group_leader(tsk)))
> > +		ret = -EINVAL;
> > +	else
> > +		ret = 0;
> > +	rcu_read_unlock();
> > +
> > +	fd = ret ?: pidfd_create(p);
> > +	put_pid(p);
> > +	return fd;
> > +}
> > +
> >  void __init pid_idr_init(void)
> >  {
> >  	/* Verify no one has done anything silly: */
> 
> Regards.
> 
> -- 
> Yann Droneaud
> OPTEYA
> 
> 

^ permalink raw reply

* Re: [PATCH 1/2] pid: add pidfd_open()
From: Oleg Nesterov @ 2019-05-15 14:38 UTC (permalink / raw)
  To: Christian Brauner
  Cc: linux-ia64, linux-sh, linux-mips, dhowells, linux-kselftest,
	sparclinux, linux-api, elena.reshetova, linux-arch, linux-s390,
	linux-xtensa, keescook, arnd, jannh, linux-m68k, viro, luto, tglx,
	linux-arm-kernel, linux-parisc, cyphar, torvalds, linux-kernel,
	luto, ebiederm, linux-alpha, akpm, linuxppc-dev
In-Reply-To: <20190515100400.3450-1-christian@brauner.io>

On 05/15, Christian Brauner wrote:
>
> +SYSCALL_DEFINE2(pidfd_open, pid_t, pid, unsigned int, flags)
> +{
> +	int fd, ret;
> +	struct pid *p;
> +	struct task_struct *tsk;
> +
> +	if (flags)
> +		return -EINVAL;
> +
> +	if (pid <= 0)
> +		return -EINVAL;
> +
> +	p = find_get_pid(pid);
> +	if (!p)
> +		return -ESRCH;
> +
> +	rcu_read_lock();
> +	tsk = pid_task(p, PIDTYPE_PID);

You do not need find_get_pid() before rcu_lock and put_pid() at the end.
You can just do find_vpid() under rcu_read_lock().

> +	if (!tsk)
> +		ret = -ESRCH;
> +	else if (unlikely(!thread_group_leader(tsk)))
> +		ret = -EINVAL;

it seems that you can do a single check

	tsk = pid_task(p, PIDTYPE_TGID);
	if (!tsk)
		ret = -ESRCH;

this even looks more correct if we race with exec changing the leader.

Oleg.


^ permalink raw reply

* Re: [PATCH 1/2] pid: add pidfd_open()
From: Christian Brauner @ 2019-05-15 14:49 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: linux-ia64, linux-sh, linux-mips, dhowells, linux-kselftest,
	sparclinux, linux-api, elena.reshetova, linux-arch, linux-s390,
	linux-xtensa, keescook, arnd, jannh, linux-m68k, viro, luto, tglx,
	linux-arm-kernel, linux-parisc, cyphar, torvalds, linux-kernel,
	luto, ebiederm, linux-alpha, akpm, linuxppc-dev
In-Reply-To: <20190515143857.GB18892@redhat.com>

On Wed, May 15, 2019 at 04:38:58PM +0200, Oleg Nesterov wrote:
> On 05/15, Christian Brauner wrote:
> >
> > +SYSCALL_DEFINE2(pidfd_open, pid_t, pid, unsigned int, flags)
> > +{
> > +	int fd, ret;
> > +	struct pid *p;
> > +	struct task_struct *tsk;
> > +
> > +	if (flags)
> > +		return -EINVAL;
> > +
> > +	if (pid <= 0)
> > +		return -EINVAL;
> > +
> > +	p = find_get_pid(pid);
> > +	if (!p)
> > +		return -ESRCH;
> > +
> > +	rcu_read_lock();
> > +	tsk = pid_task(p, PIDTYPE_PID);
> 
> You do not need find_get_pid() before rcu_lock and put_pid() at the end.
> You can just do find_vpid() under rcu_read_lock().

Will do.

> 
> > +	if (!tsk)
> > +		ret = -ESRCH;
> > +	else if (unlikely(!thread_group_leader(tsk)))
> > +		ret = -EINVAL;
> 
> it seems that you can do a single check
> 
> 	tsk = pid_task(p, PIDTYPE_TGID);
> 	if (!tsk)
> 		ret = -ESRCH;
> 
> this even looks more correct if we race with exec changing the leader.

The logic here being that you can only reach the thread_group leader
from struct pid if PIDTYPE_PID == PIDTYPE_TGID for this struct pid?

Thanks, Oleg.
Christian

^ permalink raw reply

* Re: [PATCH stable 4.4] powerpc/lib: fix book3s/32 boot failure due to code patching
From: Christophe Leroy @ 2019-05-15 15:13 UTC (permalink / raw)
  To: Greg KH
  Cc: erhard_f, Michael Neuling, linux-kernel, stable@vger.kernel.org,
	Paul Mackerras, linuxppc-dev
In-Reply-To: <20190515141604.GB8999@kroah.com>



Le 15/05/2019 à 16:16, Greg KH a écrit :
> On Wed, May 15, 2019 at 01:30:42PM +0000, Christophe Leroy wrote:
>> [Backport of upstream commit b45ba4a51cde29b2939365ef0c07ad34c8321789]
>>
>> On powerpc32, patch_instruction() is called by apply_feature_fixups()
>> which is called from early_init()
>>
>> There is the following note in front of early_init():
>>   * Note that the kernel may be running at an address which is different
>>   * from the address that it was linked at, so we must use RELOC/PTRRELOC
>>   * to access static data (including strings).  -- paulus
>>
>> Therefore init_mem_is_free must be accessed with PTRRELOC()
>>
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=203597
>> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
>>
>> ---
>> Can't apply the upstream commit as such due to several other unrelated stuff
>> like for instance STRICT_KERNEL_RWX which are missing.
>> So instead, using same approach as for commit 252eb55816a6f69ef9464cad303cdb3326cdc61d
>>
>> Removed the Fixes: tag as I don't know yet the commit Id of the fixed commit on 4.4 branch.
>> ---
>>   arch/powerpc/lib/code-patching.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Now added, thanks.
> 

Thanks,

However you took the commit log from the upstream commit, which doesn't 
corresponds exactly to the change being done here and described in the 
backport patch

Christophe

^ permalink raw reply

* Re: [PATCH 1/2] pid: add pidfd_open()
From: Oleg Nesterov @ 2019-05-15 15:19 UTC (permalink / raw)
  To: Christian Brauner
  Cc: linux-ia64, linux-sh, linux-mips, dhowells, linux-kselftest,
	sparclinux, linux-api, elena.reshetova, linux-arch, linux-s390,
	linux-xtensa, keescook, arnd, jannh, linux-m68k, viro, luto, tglx,
	linux-arm-kernel, linux-parisc, cyphar, torvalds, linux-kernel,
	luto, ebiederm, linux-alpha, akpm, linuxppc-dev
In-Reply-To: <20190515144927.f2yxyi6w6lhn3xx7@brauner.io>

On 05/15, Christian Brauner wrote:
>
> On Wed, May 15, 2019 at 04:38:58PM +0200, Oleg Nesterov wrote:
> >
> > it seems that you can do a single check
> >
> > 	tsk = pid_task(p, PIDTYPE_TGID);
> > 	if (!tsk)
> > 		ret = -ESRCH;
> >
> > this even looks more correct if we race with exec changing the leader.
>
> The logic here being that you can only reach the thread_group leader
> from struct pid if PIDTYPE_PID == PIDTYPE_TGID for this struct pid?

Not exactly... it is not that PIDTYPE_PID == PIDTYPE_TGID for this pid,
struct pid has no "type" or something like this.

The logic is that pid->tasks[PIDTYPE_XXX] is the list of task which use
this pid as "XXX" type.

For example, clone(CLONE_THREAD) creates a pid which has a single non-
empty list, pid->tasks[PIDTYPE_PID]. This pid can't be used as TGID or
SID.

So if pid_task(PIDTYPE_TGID) returns non-NULL we know that this pid was
used for a group-leader, see copy_process() which does

	if (thread_group_leader(p))
		attach_pid(p, PIDTYPE_TGID);


If we race with exec which changes the leader pid_task(TGID) can return
the old leader. We do not care, but this means that we should not check
thread_group_leader().

Oleg.


^ permalink raw reply

* Re: [PATCH 1/2] pid: add pidfd_open()
From: Christian Brauner @ 2019-05-15 15:30 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: linux-ia64, linux-sh, linux-mips, dhowells, linux-kselftest,
	sparclinux, linux-api, elena.reshetova, linux-arch, linux-s390,
	linux-xtensa, keescook, arnd, jannh, linux-m68k, viro, luto, tglx,
	linux-arm-kernel, linux-parisc, cyphar, torvalds, linux-kernel,
	luto, ebiederm, linux-alpha, akpm, linuxppc-dev
In-Reply-To: <20190515151912.GE18892@redhat.com>

On Wed, May 15, 2019 at 05:19:13PM +0200, Oleg Nesterov wrote:
> On 05/15, Christian Brauner wrote:
> >
> > On Wed, May 15, 2019 at 04:38:58PM +0200, Oleg Nesterov wrote:
> > >
> > > it seems that you can do a single check
> > >
> > > 	tsk = pid_task(p, PIDTYPE_TGID);
> > > 	if (!tsk)
> > > 		ret = -ESRCH;
> > >
> > > this even looks more correct if we race with exec changing the leader.
> >
> > The logic here being that you can only reach the thread_group leader
> > from struct pid if PIDTYPE_PID == PIDTYPE_TGID for this struct pid?
> 
> Not exactly... it is not that PIDTYPE_PID == PIDTYPE_TGID for this pid,
> struct pid has no "type" or something like this.
> 
> The logic is that pid->tasks[PIDTYPE_XXX] is the list of task which use
> this pid as "XXX" type.
> 
> For example, clone(CLONE_THREAD) creates a pid which has a single non-
> empty list, pid->tasks[PIDTYPE_PID]. This pid can't be used as TGID or
> SID.
> 
> So if pid_task(PIDTYPE_TGID) returns non-NULL we know that this pid was
> used for a group-leader, see copy_process() which does

Ah, this was what I was asking myself when I worked on thread-specific
signal sending. This clarifies quite a lot of things!

Though I wonder how one reliably gets a the PGID or SID from a
PIDTYPE_PID.

> 
> 	if (thread_group_leader(p))
> 		attach_pid(p, PIDTYPE_TGID);
> 
> 
> If we race with exec which changes the leader pid_task(TGID) can return
> the old leader. We do not care, but this means that we should not check
> thread_group_leader().

Nice!

Thank you, Oleg! :)
Christian

^ permalink raw reply

* Re: [PATCH 1/2] pid: add pidfd_open()
From: Oleg Nesterov @ 2019-05-15 15:35 UTC (permalink / raw)
  To: Christian Brauner
  Cc: linux-ia64, linux-sh, linux-mips, dhowells, linux-kselftest,
	sparclinux, linux-api, elena.reshetova, linux-arch, linux-s390,
	linux-xtensa, keescook, arnd, jannh, linux-m68k, viro, luto, tglx,
	linux-arm-kernel, linux-parisc, cyphar, torvalds, linux-kernel,
	luto, ebiederm, linux-alpha, akpm, linuxppc-dev
In-Reply-To: <20190515143857.GB18892@redhat.com>

On 05/15, Oleg Nesterov wrote:
>
> On 05/15, Christian Brauner wrote:
> >
> > +SYSCALL_DEFINE2(pidfd_open, pid_t, pid, unsigned int, flags)
> > +{
> > +	int fd, ret;
> > +	struct pid *p;
> > +	struct task_struct *tsk;
> > +
> > +	if (flags)
> > +		return -EINVAL;
> > +
> > +	if (pid <= 0)
> > +		return -EINVAL;
> > +
> > +	p = find_get_pid(pid);
> > +	if (!p)
> > +		return -ESRCH;
> > +
> > +	rcu_read_lock();
> > +	tsk = pid_task(p, PIDTYPE_PID);
>
> You do not need find_get_pid() before rcu_lock and put_pid() at the end.
> You can just do find_vpid() under rcu_read_lock().

Ah, sorry. Somehow I forgot you need to call pidfd_create(pid), you can't
do this under rcu_read_lock().

So I was wrong, you can't avoid get/put_pid.

Oleg.


^ permalink raw reply

* Re: [PATCH 1/2] pid: add pidfd_open()
From: Christian Brauner @ 2019-05-15 15:40 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: linux-ia64, linux-sh, linux-mips, dhowells, linux-kselftest,
	sparclinux, linux-api, elena.reshetova, linux-arch, linux-s390,
	linux-xtensa, keescook, arnd, jannh, linux-m68k, viro, luto, tglx,
	linux-arm-kernel, linux-parisc, cyphar, torvalds, linux-kernel,
	luto, ebiederm, linux-alpha, akpm, linuxppc-dev
In-Reply-To: <20190515153515.GA20783@redhat.com>

On Wed, May 15, 2019 at 05:35:15PM +0200, Oleg Nesterov wrote:
> On 05/15, Oleg Nesterov wrote:
> >
> > On 05/15, Christian Brauner wrote:
> > >
> > > +SYSCALL_DEFINE2(pidfd_open, pid_t, pid, unsigned int, flags)
> > > +{
> > > +	int fd, ret;
> > > +	struct pid *p;
> > > +	struct task_struct *tsk;
> > > +
> > > +	if (flags)
> > > +		return -EINVAL;
> > > +
> > > +	if (pid <= 0)
> > > +		return -EINVAL;
> > > +
> > > +	p = find_get_pid(pid);
> > > +	if (!p)
> > > +		return -ESRCH;
> > > +
> > > +	rcu_read_lock();
> > > +	tsk = pid_task(p, PIDTYPE_PID);
> >
> > You do not need find_get_pid() before rcu_lock and put_pid() at the end.
> > You can just do find_vpid() under rcu_read_lock().
> 
> Ah, sorry. Somehow I forgot you need to call pidfd_create(pid), you can't
> do this under rcu_read_lock().
> 
> So I was wrong, you can't avoid get/put_pid.

Yeah, I haven't made any changes yet.

Christian

^ permalink raw reply

* Re: [PATCH 1/2] pid: add pidfd_open()
From: Daniel Colascione @ 2019-05-15 17:45 UTC (permalink / raw)
  To: Christian Brauner
  Cc: linux-ia64, linux-sh, linux-kernel, David Howells,
	open list:KERNEL SELFTEST FRAMEWORK, sparclinux, Linux API,
	elena.reshetova, linux-arch, linux-s390, linux-xtensa, Kees Cook,
	Arnd Bergmann, Jann Horn, linux-m68k, Al Viro, Andy Lutomirski,
	Oleg Nesterov, Thomas Gleixner, linux-arm-kernel, linux-parisc,
	Aleksa Sarai, Linus Torvalds, linux-mips, Andy Lutomirski,
	Eric W. Biederman, linux-alpha, Andrew Morton, linuxppc-dev
In-Reply-To: <20190515100400.3450-1-christian@brauner.io>

On Wed, May 15, 2019 at 3:04 AM Christian Brauner <christian@brauner.io> wrote:
>
> This adds the pidfd_open() syscall. It allows a caller to retrieve pollable
> pidfds for a process which did not get created via CLONE_PIDFD, i.e. for a
> process that is created via traditional fork()/clone() calls that is only
> referenced by a PID:

Thanks for doing this work. I'm really looking forward to this new
approach to process management.

> int pidfd = pidfd_open(1234, 0);
> ret = pidfd_send_signal(pidfd, SIGSTOP, NULL, 0);
>
> With the introduction of pidfds through CLONE_PIDFD it is possible to
> created pidfds at process creation time.
> However, a lot of processes get created with traditional PID-based calls
> such as fork() or clone() (without CLONE_PIDFD). For these processes a
> caller can currently not create a pollable pidfd. This is a huge problem
> for Android's low memory killer (LMK) and service managers such as systemd.
> Both are examples of tools that want to make use of pidfds to get reliable
> notification of process exit for non-parents (pidfd polling) and race-free
> signal sending (pidfd_send_signal()). They intend to switch to this API for
> process supervision/management as soon as possible. Having no way to get
> pollable pidfds from PID-only processes is one of the biggest blockers for
> them in adopting this api. With pidfd_open() making it possible to retrieve
> pidfd for PID-based processes we enable them to adopt this api.
>
> In line with Arnd's recent changes to consolidate syscall numbers across
> architectures, I have added the pidfd_open() syscall to all architectures
> at the same time.

I'm glad it's easier now.

>  arch/alpha/kernel/syscalls/syscall.tbl      |  1 +
>  arch/arm64/include/asm/unistd32.h           |  2 +
>  arch/ia64/kernel/syscalls/syscall.tbl       |  1 +
>  arch/m68k/kernel/syscalls/syscall.tbl       |  1 +
>  arch/microblaze/kernel/syscalls/syscall.tbl |  1 +
>  arch/mips/kernel/syscalls/syscall_n32.tbl   |  1 +
>  arch/parisc/kernel/syscalls/syscall.tbl     |  1 +
>  arch/powerpc/kernel/syscalls/syscall.tbl    |  1 +
>  arch/s390/kernel/syscalls/syscall.tbl       |  1 +
>  arch/sh/kernel/syscalls/syscall.tbl         |  1 +
>  arch/sparc/kernel/syscalls/syscall.tbl      |  1 +
>  arch/x86/entry/syscalls/syscall_32.tbl      |  1 +
>  arch/x86/entry/syscalls/syscall_64.tbl      |  1 +
>  arch/xtensa/kernel/syscalls/syscall.tbl     |  1 +

It'd be nice to arrange the system call tables so that we need to
change only one file when adding a new system call.

[Snip system call wiring]

> --- a/include/linux/pid.h
> +++ b/include/linux/pid.h
> @@ -67,6 +67,7 @@ struct pid
>  extern struct pid init_struct_pid;
>
>  extern const struct file_operations pidfd_fops;
> +extern int pidfd_create(struct pid *pid);
>
>  static inline struct pid *get_pid(struct pid *pid)
>  {
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index e2870fe1be5b..989055e0b501 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -929,6 +929,7 @@ asmlinkage long sys_clock_adjtime32(clockid_t which_clock,
>                                 struct old_timex32 __user *tx);
>  asmlinkage long sys_syncfs(int fd);
>  asmlinkage long sys_setns(int fd, int nstype);
> +asmlinkage long sys_pidfd_open(pid_t pid, unsigned int flags);
>  asmlinkage long sys_sendmmsg(int fd, struct mmsghdr __user *msg,
>                              unsigned int vlen, unsigned flags);
>  asmlinkage long sys_process_vm_readv(pid_t pid,
> diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
> index dee7292e1df6..94a257a93d20 100644
> --- a/include/uapi/asm-generic/unistd.h
> +++ b/include/uapi/asm-generic/unistd.h
> @@ -832,9 +832,11 @@ __SYSCALL(__NR_io_uring_setup, sys_io_uring_setup)
>  __SYSCALL(__NR_io_uring_enter, sys_io_uring_enter)
>  #define __NR_io_uring_register 427
>  __SYSCALL(__NR_io_uring_register, sys_io_uring_register)
> +#define __NR_pidfd_open 428
> +__SYSCALL(__NR_pidfd_open, sys_pidfd_open)
>
>  #undef __NR_syscalls
> -#define __NR_syscalls 428
> +#define __NR_syscalls 429
>
>  /*
>   * 32 bit systems traditionally used different
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 737db1828437..980cc1d2b8d4 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1714,7 +1714,7 @@ const struct file_operations pidfd_fops = {
>   * Return: On success, a cloexec pidfd is returned.
>   *         On error, a negative errno number will be returned.
>   */
> -static int pidfd_create(struct pid *pid)
> +int pidfd_create(struct pid *pid)
>  {
>         int fd;
>
> diff --git a/kernel/pid.c b/kernel/pid.c
> index 20881598bdfa..237d18d6ecb8 100644
> --- a/kernel/pid.c
> +++ b/kernel/pid.c
> @@ -38,6 +38,7 @@
>  #include <linux/syscalls.h>
>  #include <linux/proc_ns.h>
>  #include <linux/proc_fs.h>
> +#include <linux/sched/signal.h>
>  #include <linux/sched/task.h>
>  #include <linux/idr.h>
>
> @@ -451,6 +452,53 @@ struct pid *find_ge_pid(int nr, struct pid_namespace *ns)
>         return idr_get_next(&ns->idr, &nr);
>  }
>
> +/**
> + * pidfd_open() - Open new pid file descriptor.
> + *
> + * @pid:   pid for which to retrieve a pidfd
> + * @flags: flags to pass
> + *
> + * This creates a new pid file descriptor with the O_CLOEXEC flag set for
> + * the process identified by @pid. Currently, the process identified by
> + * @pid must be a thread-group leader. This restriction currently exists
> + * for all aspects of pidfds including pidfd creation (CLONE_PIDFD cannot
> + * be used with CLONE_THREAD) and pidfd polling (only supports thread group
> + * leaders).
> + *
> + * Return: On success, a cloexec pidfd is returned.
> + *         On error, a negative errno number will be returned.
> + */
> +SYSCALL_DEFINE2(pidfd_open, pid_t, pid, unsigned int, flags)
> +{
> +       int fd, ret;
> +       struct pid *p;
> +       struct task_struct *tsk;
> +
> +       if (flags)
> +               return -EINVAL;

If we support blocking operations on pidfds, we'll want to be able to
put them in non-blocking mode. Does it make sense to accept and ignore
O_NONBLOCK here now?

> +       if (pid <= 0)
> +               return -EINVAL;

WDYT of defining pid == 0 to mean "open myself"?

> +       p = find_get_pid(pid);
> +       if (!p)
> +               return -ESRCH;
> +
> +       rcu_read_lock();
> +       tsk = pid_task(p, PIDTYPE_PID);
> +       if (!tsk)
> +               ret = -ESRCH;
> +       else if (unlikely(!thread_group_leader(tsk)))
> +               ret = -EINVAL;
> +       else
> +               ret = 0;
> +       rcu_read_unlock();
> +
> +       fd = ret ?: pidfd_create(p);
> +       put_pid(p);
> +       return fd;
> +}

^ permalink raw reply

* Re: [PATCH] crypto: talitos - fix skcipher failure due to wrong output IV
From: Christophe Leroy @ 2019-05-15 18:49 UTC (permalink / raw)
  To: Horia Geanta, Herbert Xu, David S. Miller
  Cc: linuxppc-dev@lists.ozlabs.org, linux-crypto@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <VI1PR0402MB34858D80A15D4B55F64570E398090@VI1PR0402MB3485.eurprd04.prod.outlook.com>



Le 15/05/2019 à 16:05, Horia Geanta a écrit :
> On 5/15/2019 3:29 PM, Christophe Leroy wrote:
>> Selftests report the following:
>>
>> [    2.984845] alg: skcipher: cbc-aes-talitos encryption test failed (wrong output IV) on test vector 0, cfg="in-place"
>> [    2.995377] 00000000: 3d af ba 42 9d 9e b4 30 b4 22 da 80 2c 9f ac 41
>> [    3.032673] alg: skcipher: cbc-des-talitos encryption test failed (wrong output IV) on test vector 0, cfg="in-place"
>> [    3.043185] 00000000: fe dc ba 98 76 54 32 10
>> [    3.063238] alg: skcipher: cbc-3des-talitos encryption test failed (wrong output IV) on test vector 0, cfg="in-place"
>> [    3.073818] 00000000: 7d 33 88 93 0f 93 b2 42
>>
>> This above dumps show that the actual output IV is indeed the input IV.
>> This is due to the IV not being copied back into the request.
>>
>> This patch fixes that.
>>
>> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> Reviewed-by: Horia Geantă <horia.geanta@nxp.com>

It's missing a Fixes: tag and a Cc: to stable.

I'll resend tomorrow.

> 
> While here, could you please check ecb mode (which by definition does not have
> an IV) is behaving correctly?
> Looking in driver_algs[] list of crypto algorithms supported by talitos,
> ecb(aes,des,3des) are declared with ivsize != 0.

According to /proc/crypto, test are passed for ecb.

Christophe

> 
> Thanks,
> Horia
> 

^ permalink raw reply

* Re: [PATCH 1/2] pid: add pidfd_open()
From: Yann Droneaud @ 2019-05-15 15:29 UTC (permalink / raw)
  To: Christian Brauner
  Cc: linux-ia64, linux-sh, linux-kernel, dhowells, linux-kselftest,
	sparclinux, linux-api, elena.reshetova, linux-arch, linux-s390,
	linux-xtensa, keescook, arnd, jannh, linux-m68k, viro, luto, oleg,
	tglx, linux-arm-kernel, linux-parisc, cyphar, torvalds,
	linux-mips, luto, ebiederm, linux-alpha, akpm, linuxppc-dev
In-Reply-To: <20190515141634.lrc5ynllcmjr64mn@brauner.io>

Hi,

Le mercredi 15 mai 2019 à 16:16 +0200, Christian Brauner a écrit :
> On Wed, May 15, 2019 at 04:00:20PM +0200, Yann Droneaud wrote:
> > Le mercredi 15 mai 2019 à 12:03 +0200, Christian Brauner a écrit :
> > > diff --git a/kernel/pid.c b/kernel/pid.c
> > > index 20881598bdfa..237d18d6ecb8 100644
> > > --- a/kernel/pid.c
> > > +++ b/kernel/pid.c
> > > @@ -451,6 +452,53 @@ struct pid *find_ge_pid(int nr, struct
> > > pid_namespace *ns)
> > >  	return idr_get_next(&ns->idr, &nr);
> > >  }
> > >  
> > > +/**
> > > + * pidfd_open() - Open new pid file descriptor.
> > > + *
> > > + * @pid:   pid for which to retrieve a pidfd
> > > + * @flags: flags to pass
> > > + *
> > > + * This creates a new pid file descriptor with the O_CLOEXEC flag set for
> > > + * the process identified by @pid. Currently, the process identified by
> > > + * @pid must be a thread-group leader. This restriction currently exists
> > > + * for all aspects of pidfds including pidfd creation (CLONE_PIDFD cannot
> > > + * be used with CLONE_THREAD) and pidfd polling (only supports thread group
> > > + * leaders).
> > > + *
> > 
> > Would it be possible to create file descriptor with "restricted"
> > operation ?
> > 
> > - O_RDONLY: waiting for process completion allowed (for example)
> > - O_WRONLY: sending process signal allowed
> 
> Yes, something like this is likely going to be possible in the future.
> We had discussion around this. But mapping this to O_RDONLY and O_WRONLY
> is not the right model. It makes more sense to have specialized flags
> that restrict actions.

Yes, dedicated flags are the way to go. I've used the old open() flags
here as examples as an echo of the O_CLOEXEC flag used in the comment.

> > For example, a process could send over a Unix socket a process a pidfd,
> > allowing this to only wait for completion, but not sending signal ?
> > 
> > I see the permission check is not done in pidfd_open(), so what prevent
> > a user from sending a signal to another user owned process ?
> 
> That's supposed to be possible. You can do the same right now already
> with pids. Tools like LMK need this probably very much.
> Permission checking for signals is done at send time right now.
> And if you can't signal via a pid you can't signal via a pidfd as
> they're both subject to the same permissions checks.
> 

I would have expect it to behave like most other file descriptor,
permission check done at opening time, which allow more privileged
process to open the file descriptor, then pass it to a less privileged
process, or change its own privileged with setuid() and such. Then the
less privileged process can act on behalf of the privileged process
through the file descriptor.

> > If it's in pidfd_send_signal(), then, passing the socket through
> > SCM_RIGHT won't be useful if the target process is not owned by the
> > same user, or root.
> > 

If the permission check is done at sending time, the scenario above
case cannot be implemented.

Sending a pidfd through SCM_RIGHT is only useful if the receiver
process is equally or more privileged than the sender then.

For isolation purpose, I would have expect to be able to give a right
to send a signal to a highly privileged process a specific less
privileged process though Unix socket.

But I can't come up with a specific use case. So I dunno.

Regards.

-- 
Yann Droneaud
OPTEYA



^ permalink raw reply

* Re: [PATCH 1/2] pid: add pidfd_open()
From: Yann Droneaud @ 2019-05-15 14:00 UTC (permalink / raw)
  To: Christian Brauner, jannh, oleg, viro, torvalds, linux-kernel,
	arnd, dhowells
  Cc: linux-ia64, linux-sh, linux-mips, linux-kselftest, sparclinux,
	elena.reshetova, linux-arch, linux-s390, linux-xtensa, keescook,
	linux-m68k, luto, tglx, linux-arm-kernel, linux-parisc, linux-api,
	cyphar, luto, ebiederm, linux-alpha, akpm, linuxppc-dev
In-Reply-To: <20190515100400.3450-1-christian@brauner.io>

Hi,

Le mercredi 15 mai 2019 à 12:03 +0200, Christian Brauner a écrit :
> 
> diff --git a/kernel/pid.c b/kernel/pid.c
> index 20881598bdfa..237d18d6ecb8 100644
> --- a/kernel/pid.c
> +++ b/kernel/pid.c
> @@ -451,6 +452,53 @@ struct pid *find_ge_pid(int nr, struct
> pid_namespace *ns)
>  	return idr_get_next(&ns->idr, &nr);
>  }
>  
> +/**
> + * pidfd_open() - Open new pid file descriptor.
> + *
> + * @pid:   pid for which to retrieve a pidfd
> + * @flags: flags to pass
> + *
> + * This creates a new pid file descriptor with the O_CLOEXEC flag set for
> + * the process identified by @pid. Currently, the process identified by
> + * @pid must be a thread-group leader. This restriction currently exists
> + * for all aspects of pidfds including pidfd creation (CLONE_PIDFD cannot
> + * be used with CLONE_THREAD) and pidfd polling (only supports thread group
> + * leaders).
> + *

Would it be possible to create file descriptor with "restricted"
operation ?

- O_RDONLY: waiting for process completion allowed (for example)
- O_WRONLY: sending process signal allowed

For example, a process could send over a Unix socket a process a pidfd,
allowing this to only wait for completion, but not sending signal ?

I see the permission check is not done in pidfd_open(), so what prevent
a user from sending a signal to another user owned process ?

If it's in pidfd_send_signal(), then, passing the socket through
SCM_RIGHT won't be useful if the target process is not owned by the
same user, or root.

> + * Return: On success, a cloexec pidfd is returned.
> + *         On error, a negative errno number will be returned.
> + */
> +SYSCALL_DEFINE2(pidfd_open, pid_t, pid, unsigned int, flags)
> +{
> +	int fd, ret;
> +	struct pid *p;
> +	struct task_struct *tsk;
> +
> +	if (flags)
> +		return -EINVAL;
> +
> +	if (pid <= 0)
> +		return -EINVAL;
> +
> +	p = find_get_pid(pid);
> +	if (!p)
> +		return -ESRCH;
> +
> +	rcu_read_lock();
> +	tsk = pid_task(p, PIDTYPE_PID);
> +	if (!tsk)
> +		ret = -ESRCH;
> +	else if (unlikely(!thread_group_leader(tsk)))
> +		ret = -EINVAL;
> +	else
> +		ret = 0;
> +	rcu_read_unlock();
> +
> +	fd = ret ?: pidfd_create(p);
> +	put_pid(p);
> +	return fd;
> +}
> +
>  void __init pid_idr_init(void)
>  {
>  	/* Verify no one has done anything silly: */

Regards.

-- 
Yann Droneaud
OPTEYA



^ permalink raw reply

* Re: [PATCH 1/2] pid: add pidfd_open()
From: Aleksa Sarai @ 2019-05-15 14:51 UTC (permalink / raw)
  To: Christian Brauner
  Cc: linux-ia64, linux-sh, linux-kernel, dhowells, linux-kselftest,
	sparclinux, linux-api, elena.reshetova, Yann Droneaud, linux-s390,
	linux-arch, linux-xtensa, keescook, arnd, jannh, linux-m68k, viro,
	luto, oleg, tglx, linux-arm-kernel, linux-parisc, torvalds,
	linux-mips, luto, ebiederm, linux-alpha, akpm, linuxppc-dev
In-Reply-To: <20190515141634.lrc5ynllcmjr64mn@brauner.io>

[-- Attachment #1: Type: text/plain, Size: 805 bytes --]

On 2019-05-15, Christian Brauner <christian@brauner.io> wrote:
> On Wed, May 15, 2019 at 04:00:20PM +0200, Yann Droneaud wrote:
> > Would it be possible to create file descriptor with "restricted"
> > operation ?
> > 
> > - O_RDONLY: waiting for process completion allowed (for example)
> > - O_WRONLY: sending process signal allowed
> 
> Yes, something like this is likely going to be possible in the future.
> We had discussion around this. But mapping this to O_RDONLY and O_WRONLY
> is not the right model. It makes more sense to have specialized flags
> that restrict actions.

Not to mention that the O_* flags have silly values which we shouldn't
replicate in new syscalls IMHO.

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH] powerpc/mm/book3s64: Implement STRICT_MODULE_RWX
From: Russell Currey @ 2019-05-16  0:45 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linuxppc-dev
In-Reply-To: <20190515064111.GA15778@infradead.org>

On Tue, 2019-05-14 at 23:41 -0700, Christoph Hellwig wrote:
> > + * This program is free software; you can redistribute it and/or
> > modify it
> > + * under the terms of the GNU General Public License as published
> > by the
> > + * Free Software Foundation; either version 2 of the License, or
> > (at your
> > + * option) any later version.
> 
> This license boilerplate should not be added together with an SPDX
> tag.
> 
> > +// we need this to have a single pointer to pass into
> > apply_to_page_range()
> 
> Please use normal /* - */ style comments.

I was under the impression they're allowed (in powerpc at least, if not
the wider kernel nowadays) but happy to defer on this.


^ permalink raw reply

* Re: [RFC PATCH] powerpc/mm: Implement STRICT_MODULE_RWX
From: Russell Currey @ 2019-05-16  0:51 UTC (permalink / raw)
  To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman
  Cc: linuxppc-dev, linux-kernel
In-Reply-To: <df502ffe07caa38c46b0144fc824fff447f4105b.1557901092.git.christophe.leroy@c-s.fr>

On Wed, 2019-05-15 at 06:20 +0000, Christophe Leroy wrote:
> Strict module RWX is just like strict kernel RWX, but for modules -
> so
> loadable modules aren't marked both writable and executable at the
> same
> time.  This is handled by the generic code in kernel/module.c, and
> simply requires the architecture to implement the set_memory() set of
> functions, declared with ARCH_HAS_SET_MEMORY.
> 
> There's nothing other than these functions required to turn
> ARCH_HAS_STRICT_MODULE_RWX on, so turn that on too.
> 
> With STRICT_MODULE_RWX enabled, there are as many W+X pages at
> runtime
> as there are with CONFIG_MODULES=n (none), so in Russel's testing it
> works
> well on both Hash and Radix book3s64.
> 
> There's a TODO in the code for also applying the page permission
> changes
> to the backing pages in the linear mapping: this is pretty simple for
> Radix and (seemingly) a lot harder for Hash, so I've left it for now
> since there's still a notable security benefit for the patch as-is.
> 
> Technically can be enabled without STRICT_KERNEL_RWX, but
> that doesn't gets you a whole lot, so we should leave it off by
> default
> until we can get STRICT_KERNEL_RWX to the point where it's enabled by
> default.
> 
> Signed-off-by: Russell Currey <ruscur@russell.cc>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> ---

Thanks for this, I figured you'd know how to make this work on 32bit
too.  I'll test on my end today.

Note that there are two Ls in my name!  To quote the great Rusty, "This
Russel disease must be stamped out before it becomes widespread".



^ permalink raw reply

* Re: [v4 PATCH 1/2] [PowerPC] Add simd.h implementation
From: Shawn Landden @ 2019-05-16  1:12 UTC (permalink / raw)
  To: Christophe Leroy; +Cc: linuxppc-dev, Paul Mackerras
In-Reply-To: <568f62f6-7810-7179-f9aa-03f9df198fb6@c-s.fr>

On Wed, May 15, 2019 at 1:27 AM Christophe Leroy
<christophe.leroy@c-s.fr> wrote:
> Could you please as usual list here the changes provided by each version
> to ease the review ?
A bunch of embarrassing stuff that caused it not to build on some
set-ups (the functions were under the wrong include guards), and I
added include guards on simd.h so that you can use may_use_simd() even
if you don't have the FPU enabled (ARM's simd.h does this).

^ permalink raw reply

* [PATCH] powerpc/64s: Make boot look nice(r)
From: Nicholas Piggin @ 2019-05-16  2:04 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

Radix boot looks like this:

 -----------------------------------------------------
 phys_mem_size     = 0x200000000
 dcache_bsize      = 0x80
 icache_bsize      = 0x80
 cpu_features      = 0x0000c06f8f5fb1a7
   possible        = 0x0000fbffcf5fb1a7
   always          = 0x00000003800081a1
 cpu_user_features = 0xdc0065c2 0xaee00000
 mmu_features      = 0xbc006041
 firmware_features = 0x0000000010000000
 hash-mmu: ppc64_pft_size    = 0x0
 hash-mmu: kernel vmalloc start   = 0xc008000000000000
 hash-mmu: kernel IO start        = 0xc00a000000000000
 hash-mmu: kernel vmemmap start   = 0xc00c000000000000
 -----------------------------------------------------

Fix:

 -----------------------------------------------------
 phys_mem_size     = 0x200000000
 dcache_bsize      = 0x80
 icache_bsize      = 0x80
 cpu_features      = 0x0000c06f8f5fb1a7
   possible        = 0x0000fbffcf5fb1a7
   always          = 0x00000003800081a1
 cpu_user_features = 0xdc0065c2 0xaee00000
 mmu_features      = 0xbc006041
 firmware_features = 0x0000000010000000
 vmalloc start     = 0xc008000000000000
 IO start          = 0xc00a000000000000
 vmemmap start     = 0xc00c000000000000
 -----------------------------------------------------

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/setup-common.c    | 8 +++++++-
 arch/powerpc/mm/book3s64/hash_utils.c | 3 ---
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
index aad9f5df6ab6..f2da8c809c85 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -810,9 +810,15 @@ static __init void print_system_info(void)
 	pr_info("mmu_features      = 0x%08x\n", cur_cpu_spec->mmu_features);
 #ifdef CONFIG_PPC64
 	pr_info("firmware_features = 0x%016lx\n", powerpc_firmware_features);
+#ifdef CONFIG_PPC_BOOK3S
+	pr_info("vmalloc start     = 0x%lx\n", KERN_VIRT_START);
+	pr_info("IO start          = 0x%lx\n", KERN_IO_START);
+	pr_info("vmemmap start     = 0x%lx\n", (unsigned long)vmemmap);
+#endif
 #endif
 
-	print_system_hash_info();
+	if (!early_radix_enabled())
+		print_system_hash_info();
 
 	if (PHYSICAL_START > 0)
 		pr_info("physical_start    = 0x%llx\n",
diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index 919a861a8ec0..8b307b796b83 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -1954,7 +1954,4 @@ void __init print_system_hash_info(void)
 
 	if (htab_hash_mask)
 		pr_info("htab_hash_mask    = 0x%lx\n", htab_hash_mask);
-	pr_info("kernel vmalloc start   = 0x%lx\n", KERN_VIRT_START);
-	pr_info("kernel IO start        = 0x%lx\n", KERN_IO_START);
-	pr_info("kernel vmemmap start   = 0x%lx\n", (unsigned long)vmemmap);
 }
-- 
2.20.1


^ permalink raw reply related

* Re: [PATCH] crypto: vmx - fix copy-paste error in CTR mode
From: Daniel Axtens @ 2019-05-16  2:12 UTC (permalink / raw)
  To: Herbert Xu
  Cc: leo.barbosa, nayna, Stephan Mueller, Nayna, omosnacek,
	Eric Biggers, marcelo.cerri, pfsmorigo, linux-crypto, leitao,
	George Wilson, linuxppc-dev
In-Reply-To: <874l5w1axv.fsf@dja-thinkpad.axtens.net>

Daniel Axtens <dja@axtens.net> writes:

> Herbert Xu <herbert@gondor.apana.org.au> writes:
>
>> On Wed, May 15, 2019 at 03:35:51AM +1000, Daniel Axtens wrote:
>>>
>>> By all means disable vmx ctr if I don't get an answer to you in a
>>> timeframe you are comfortable with, but I am going to at least try to
>>> have a look.
>>
>> I'm happy to give you guys more time.  How much time do you think
>> you will need?
>>
> Give me till the end of the week: if I haven't solved it by then I will
> probably have to give up and go on to other things anyway.

So as you've hopefully seen, I've nailed it down and posted a patch.
(http://patchwork.ozlabs.org/patch/1099934/)

I'm also seeing issues with ghash with the extended tests:

[    7.582926] alg: hash: p8_ghash test failed (wrong result) on test vector 0, cfg="random: use_final src_divs=[<reimport>9.72%@+39832, <reimport>18.2%@+65504, <reimport,nosimd>45.57%@alignmask+18, <reimport,nosimd>15.6%@+65496, 6.83%@+65514, <reimport,nosimd>1.2%@+25, <reim"

It seems to happen when one of the source divisions has nosimd and the
final result uses the simd finaliser, so that's interesting.

Regards,
Daniel

>
> (FWIW, it seems to happen when encoding greater than 4 but less than 8
> AES blocks - in particular with both 7 and 5 blocks encoded I can see it
> go wrong from block 4 onwards. No idea why yet, and the asm is pretty
> dense, but that's where I'm at at the moment.)
>
> Regards,
> Daniel
>
>> Thanks,
>> -- 
>> Email: Herbert Xu <herbert@gondor.apana.org.au>
>> Home Page: http://gondor.apana.org.au/~herbert/
>> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox