Linux Security Modules development
 help / color / mirror / Atom feed
* Re: [PATCH v33 12/21] x86/sgx: Allow a limited use of ATTRIBUTE.PROVISIONKEY for attestation
From: Jarkko Sakkinen @ 2020-07-03  2:38 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Borislav Petkov, x86, linux-sgx, linux-kernel,
	linux-security-module, Jethro Beekman, Andy Lutomirski, akpm,
	andriy.shevchenko, asapek, cedric.xing, chenalexchen,
	conradparker, cyhanish, dave.hansen, haitao.huang, josh,
	kai.huang, kai.svahn, kmoy, ludloff, nhorman, npmccallum,
	puiterwijk, rientjes, tglx, yaozhangx
In-Reply-To: <20200629220400.GI12312@linux.intel.com>

On Mon, Jun 29, 2020 at 03:04:00PM -0700, Sean Christopherson wrote:
> On Mon, Jun 29, 2020 at 06:02:42PM +0200, Borislav Petkov wrote:
> > On Thu, Jun 18, 2020 at 01:08:34AM +0300, Jarkko Sakkinen wrote:
> > > Provisioning Certification Enclave (PCE), the root of trust for other
> > > enclaves, generates a signing key from a fused key called Provisioning
> > > Certification Key. PCE can then use this key to certify an attestation key
> > > of a QE, e.g. we get the chain of trust down to the hardware if the Intel
> > 
> > What's a QE?
> > 
> > I don't see this acronym resolved anywhere in the whole patchset.
> 
> Quoting Enclave.
> 
> > > signed PCE is used.
> > > 
> > > To use the needed keys, ATTRIBUTE.PROVISIONKEY is required but should be
> > > only allowed for those who actually need it so that only the trusted
> > > parties can certify QE's.
> > > 
> > > Obviously the attestation service should know the public key of the used
> > > PCE and that way detect illegit attestation, but whitelisting the legit
> > > users still adds an additional layer of defence.
> > > 
> > > Add new device file called /dev/sgx/provision. The sole purpose of this
> > > file is to provide file descriptors that act as privilege tokens to allow
> > > to build enclaves with ATTRIBUTE.PROVISIONKEY set. A new ioctl called
> > > SGX_IOC_ENCLAVE_SET_ATTRIBUTE is used to assign this token to an enclave.
> > 
> > So I'm sure I'm missing something here: what controls which
> > enclave can open /dev/sgx/provision and thus pass the FD to
> > SGX_IOC_ENCLAVE_SET_ATTRIBUTE?
> 
> /dev/sgx/provision is root-only by default, the expectation is that the admin
> will configure the system to grant only specific enclaves access to the
> PROVISION_KEY.
> 
> > And in general, how does that whole flow look like: what calls
> > SGX_IOC_ENCLAVE_SET_ATTRIBUTE when?
> 
> The basic gist is that the host process of an enclave that needs/wants access
> to the PROVISION_KEY will invoke SGX_IOC_ENCLAVE_SET_ATTRIBUTE when building
> the enclave.  Any enclave can request access to PROVISION_KEY, but practically
> speaking only the PCE and QE (or their non-Intel equivalents) actually need
> access to the key.  KVM (future series) will also respect /dev/sgx/provision,
> i.e. require a similar ioctl() to expose the PROVISION_KEY to a guest.
> 
> E.g. for my own personal testing, I never do anything attestation related, so
> none of the enclaves I run request PROVISION_KEY, but I do expose it to VMs to
> test the KVM paths.
> 
> In this series, access is fairly binary, i.e. there's no additional kernel
> infrastructure to help userspace make per-enclave decisions.  There have been
> more than a few proposals on how to extend the kernel to help provide better
> granularity, e.g. LSM hooks, but it was generally agreed to punt that stuff
> to post-upstreaming to keep things "simple" once we went far enough down
> various paths to ensure we weren't painting ourselves into a corner.
> 
> If you want super gory details, Intel's whitepaper on attestation in cloud
> environments is a good starting point[*], but I don't recommended doing much
> more than skimming unless you really like attestation stuff or are
> masochistic, which IMO amount to the same thing :-)
> 
> [*] https://download.01.org/intel-sgx/dcap-1.0/docs/SGX_ECDSA_QuoteGenReference_DCAP_API_Linux_1.0.pdf

Section 3 in [*] is what describes the infrastructure. DCAP is only a
component in the whole attestation infrastructure.

[*] https://software.intel.com/sites/default/files/managed/f1/b8/intel-sgx-support-for-third-party-attestation.pdf

/Jarkko

^ permalink raw reply

* Re: [PATCH v33 12/21] x86/sgx: Allow a limited use of ATTRIBUTE.PROVISIONKEY for attestation
From: Jarkko Sakkinen @ 2020-07-03  2:32 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: x86, linux-sgx, linux-kernel, linux-security-module,
	Jethro Beekman, Andy Lutomirski, akpm, andriy.shevchenko, asapek,
	cedric.xing, chenalexchen, conradparker, cyhanish, dave.hansen,
	haitao.huang, josh, kai.huang, kai.svahn, kmoy, ludloff, nhorman,
	npmccallum, puiterwijk, rientjes, sean.j.christopherson, tglx,
	yaozhangx
In-Reply-To: <20200629160242.GB32176@zn.tnic>

On Mon, Jun 29, 2020 at 06:02:42PM +0200, Borislav Petkov wrote:
> On Thu, Jun 18, 2020 at 01:08:34AM +0300, Jarkko Sakkinen wrote:
> > Provisioning Certification Enclave (PCE), the root of trust for other
> > enclaves, generates a signing key from a fused key called Provisioning
> > Certification Key. PCE can then use this key to certify an attestation key
> > of a QE, e.g. we get the chain of trust down to the hardware if the Intel
> 
> What's a QE?
> 
> I don't see this acronym resolved anywhere in the whole patchset.

Quoting Enclave.

> 
> > signed PCE is used.
> > 
> > To use the needed keys, ATTRIBUTE.PROVISIONKEY is required but should be
> > only allowed for those who actually need it so that only the trusted
> > parties can certify QE's.
> > 
> > Obviously the attestation service should know the public key of the used
> > PCE and that way detect illegit attestation, but whitelisting the legit
> > users still adds an additional layer of defence.
> > 
> > Add new device file called /dev/sgx/provision. The sole purpose of this
> > file is to provide file descriptors that act as privilege tokens to allow
> > to build enclaves with ATTRIBUTE.PROVISIONKEY set. A new ioctl called
> > SGX_IOC_ENCLAVE_SET_ATTRIBUTE is used to assign this token to an enclave.
> 
> So I'm sure I'm missing something here: what controls which
> enclave can open /dev/sgx/provision and thus pass the FD to
> SGX_IOC_ENCLAVE_SET_ATTRIBUTE?
> 
> And in general, how does that whole flow look like: what calls
> SGX_IOC_ENCLAVE_SET_ATTRIBUTE when?

I've documented it in the Remote Attestation section:

https://github.com/jsakkine-intel/linux-sgx/blob/master/Documentation/x86/sgx.rst

/Jarkko

^ permalink raw reply

* Re: linux-next: umh: fix processed error when UMH_WAIT_PROC is used seems to break linux bridge on s390x (bisected)
From: Tetsuo Handa @ 2020-07-03  0:52 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: David Howells, Christian Borntraeger, Christoph Hellwig,
	Eric W. Biederman, ast, axboe, bfields, bridge, chainsaw,
	christian.brauner, chuck.lever, davem, gregkh, jarkko.sakkinen,
	jmorris, josh, keescook, keyrings, kuba, lars.ellenberg,
	linux-fsdevel, linux-kernel, linux-nfs, linux-security-module,
	nikolay, philipp.reisner, ravenexp, roopa, serge, slyfox, viro,
	yangtiezhu, netdev, markward, linux-s390
In-Reply-To: <20200702194656.GV4332@42.do-not-panic.com>

On 2020/07/03 4:46, Luis Chamberlain wrote:
> On Thu, Jul 02, 2020 at 01:26:53PM +0900, Tetsuo Handa wrote:
>> On 2020/07/02 0:38, Luis Chamberlain wrote:
>>> @@ -156,6 +156,18 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info)
>>>  		 */
>>>  		if (KWIFEXITED(ret))
>>>  			sub_info->retval = KWEXITSTATUS(ret);
>>> +		/*
>>> +		 * Do we really want to be passing the signal, or do we pass
>>> +		 * a single error code for all cases?
>>> +		 */
>>> +		else if (KWIFSIGNALED(ret))
>>> +			sub_info->retval = KWTERMSIG(ret);
>>
>> No, this is bad. Caller of usermode helper is unable to distinguish exit(9)
>> and e.g. SIGKILL'ed by the OOM-killer.
> 
> Right, the question is: do we care?

Yes, we have to care.

> And the umh patch "umh: fix processed error when UMH_WAIT_PROC is used"
> changed this to:
> 
> -       if (ret >= 0) {
> +       if (ret != 0) {
> 
> Prior to the patch negative return values from userspace were still
> being captured, and likewise signals, but the error value was not
> raw, not the actual value. After the patch, since we check for ret != 0
> we still upkeep the sanity check for any error, correct the error value,
> but as you noted signals were ignored as I made the wrong assumption
> we would ignore them. The umh sub_info->retval is set after my original
> patch only if KWIFSIGNALED(ret)), and ignored signals, and so that
> would be now capitured with the additional KWIFSIGNALED(ret)) check.

"call_usermodehelper_keys() == 0" (i.e. usermode helper was successfully
started and successfully terminated via exit(0)) is different from "there is
nothing to do". call_sbin_request_key() == 0 case still has to check for
possibility of -ENOKEY case.

> 
> The question still stands:
> 
> Do we want to open code all these checks or simply wrap them up in
> the umh. If we do the later, as you note exit(9) and a SIGKILL will
> be the same to the inspector in the kernel. But do we care?

Yes, we do care.

> 
> Do we really want umh callers differntiatin between signals and exit values?

Yes, we do.

> 
> The alternative to making a compromise is using generic wrappers for
> things which make sense and letting the callers use those.

I suggest just introducing KWIFEXITED()/KWEXITSTATUS()/KWIFSIGNALED()/KWTERMSIG()
macros and fixing the callers, for some callers are not aware of possibility of
KWIFSIGNALED() case.

For example, conn_try_outdate_peer() in drivers/block/drbd/drbd_nl.c misbehaves if
drbd_usermode_helper process was terminated by a signal, for the switch() statement
after returning from conn_helper() is assuming that the return value of conn_helper()
is a KWEXITSTATUS() value if drbd_usermode_helper process was successfully started.
If drbd_usermode_helper process was terminated by SIGQUIT (which is 3),
conn_try_outdate_peer() will by error hit "case P_INCONSISTENT:" (which is 3);
conn_try_outdate_peer() should hit "default: /* The script is broken ... */"
unless KWIFEXITED() == true.

Your patch is trying to obnubilate the return code.


^ permalink raw reply

* Re: [PATCH v3 00/16] Make the user mode driver code a better citizen
From: Tetsuo Handa @ 2020-07-02 23:51 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, David Miller, Greg Kroah-Hartman,
	Alexei Starovoitov, Kees Cook, Andrew Morton, Alexei Starovoitov,
	Al Viro, bpf, linux-fsdevel, Daniel Borkmann, Jakub Kicinski,
	Masahiro Yamada, Gary Lin, Bruno Meneguele, LSM List,
	Casey Schaufler, Luis Chamberlain, Linus Torvalds,
	Christian Brauner
In-Reply-To: <87y2o1swee.fsf_-_@x220.int.ebiederm.org>

On 2020/07/03 1:40, Eric W. Biederman wrote:
> 
> This is the third round of my changeset to split the user mode driver
> code from the user mode helper code, and to make the code use common
> facilities to get things done instead of recreating them just
> for the user mode driver code.

I won't test this version, for you are ignoring my comments.

^ permalink raw reply

* Re: [PATCH v2 09/11] ima: Move validation of the keyrings conditional into ima_validate_rule()
From: Tyler Hicks @ 2020-07-02 22:16 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: Dmitry Kasatkin, James Morris, Serge E . Hallyn,
	Lakshmi Ramasubramanian, Prakhar Srivastava, linux-kernel,
	linux-integrity, linux-security-module
In-Reply-To: <1593558449.5057.12.camel@linux.ibm.com>

On 2020-06-30 19:07:29, Mimi Zohar wrote:
> On Fri, 2020-06-26 at 17:38 -0500, Tyler Hicks wrote:
> > Use ima_validate_rule() to ensure that the combination of a hook
> > function and the keyrings conditional is valid and that the keyrings
> > conditional is not specified without an explicit KEY_CHECK func
> > conditional. This is a code cleanup and has no user-facing change.
> > 
> > Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
> > ---
> > 
> > * v2
> >   - Allowed IMA_DIGSIG_REQUIRED, IMA_PERMIT_DIRECTIO,
> >     IMA_MODSIG_ALLOWED, and IMA_CHECK_BLACKLIST conditionals to be
> >     present in the rule entry flags for non-buffer hook functions.
> > 
> >  security/integrity/ima/ima_policy.c | 13 +++++++++++--
> >  1 file changed, 11 insertions(+), 2 deletions(-)
> > 
> > diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c
> > index 8cdca2399d59..43d49ad958fb 100644
> > --- a/security/integrity/ima/ima_policy.c
> > +++ b/security/integrity/ima/ima_policy.c
> > @@ -1000,6 +1000,15 @@ static bool ima_validate_rule(struct ima_rule_entry *entry)
> >  		case KEXEC_KERNEL_CHECK:
> >  		case KEXEC_INITRAMFS_CHECK:
> >  		case POLICY_CHECK:
> > +			if (entry->flags & ~(IMA_FUNC | IMA_MASK | IMA_FSMAGIC |
> > +					     IMA_UID | IMA_FOWNER | IMA_FSUUID |
> > +					     IMA_INMASK | IMA_EUID | IMA_PCR |
> > +					     IMA_FSNAME | IMA_DIGSIG_REQUIRED |
> > +					     IMA_PERMIT_DIRECTIO |
> > +					     IMA_MODSIG_ALLOWED |
> > +					     IMA_CHECK_BLACKLIST))
> 
> Other than KEYRINGS, this patch should continue to behave the same.
>  However, this list gives the impressions that all of these flags are
> permitted on all of the above flags, which isn't true.
> 
> For example, both IMA_MODSIG_ALLOWED & IMA_CHECK_BLACKLIST are limited
> to appended signatures, meaning KERNEL_CHECK and KEXEC_KERNEL_CHECK.

Just to clarify, are both IMA_MODSIG_ALLOWED and IMA_CHECK_BLACKLIST
limited to KEXEC_KERNEL_CHECK, KEXEC_INITRAMFS_CHECK, and MODULE_CHECK?
That's what ima_hook_supports_modsig() suggests.

>  Both should only be allowed on APPRAISE action rules.

For completeness, it looks like DONT_APPRAISE should not be allowed.

> IMA_PCR should be limited to MEASURE action rules.

It looks like DONT_MEASURE should not be allowed.

> IMA_DIGSIG_REQUIRED should be limited to APPRAISE action rules.

It looks like DONT_APPRAISE should not be allowed.

> 
> > +				return false;
> > +
> >  			break;
> >  		case KEXEC_CMDLINE:
> >  			if (entry->action & ~(MEASURE | DONT_MEASURE))
> > @@ -1027,7 +1036,8 @@ static bool ima_validate_rule(struct ima_rule_entry *entry)
> >  		default:
> >  			return false;
> >  		}
> > -	}
> > +	} else if (entry->flags & IMA_KEYRINGS)
> > +		return false;
> 
> IMA_MODSIG_ALLOWED and IMA_CHECK_BLACKLIST need to be added here as
> well.

That makes sense.

Tyler

> 
> Mimi
> 
> >  
> >  	return true;
> >  }
> > @@ -1209,7 +1219,6 @@ static int ima_parse_rule(char *rule, struct ima_rule_entry *entry)
> >  			keyrings_len = strlen(args[0].from) + 1;
> >  
> >  			if ((entry->keyrings) ||
> > -			    (entry->func != KEY_CHECK) ||
> >  			    (keyrings_len < 2)) {
> >  				result = -EINVAL;
> >  				break;

^ permalink raw reply

* Re: [PATCH v4 3/3] prctl: Allow ptrace capable processes to change /proc/self/exe
From: Paul Moore @ 2020-07-02 22:00 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Adrian Reber, Christian Brauner, Eric Biederman, Pavel Emelyanov,
	Oleg Nesterov, Dmitry Safonov, Andrei Vagin, Nicolas Viennot,
	Michał Cłapiński, Kamil Yurtsever, Dirk Petersen,
	Christine Flood, Casey Schaufler, Mike Rapoport,
	Radostin Stoyanov, Cyrill Gorcunov, Stephen Smalley,
	Sargun Dhillon, Arnd Bergmann, linux-security-module,
	linux-kernel, selinux, Eric Paris, Jann Horn, linux-fsdevel
In-Reply-To: <20200702211647.GB3283@mail.hallyn.com>

On Thu, Jul 2, 2020 at 5:16 PM Serge E. Hallyn <serge@hallyn.com> wrote:
> On Wed, Jul 01, 2020 at 08:49:06AM +0200, Adrian Reber wrote:
> > From: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
> >
> > Previously, the current process could only change the /proc/self/exe
> > link with local CAP_SYS_ADMIN.
> > This commit relaxes this restriction by permitting such change with
> > CAP_CHECKPOINT_RESTORE, and the ability to use ptrace.
> >
> > With access to ptrace facilities, a process can do the following: fork a
> > child, execve() the target executable, and have the child use ptrace()
> > to replace the memory content of the current process. This technique
> > makes it possible to masquerade an arbitrary program as any executable,
> > even setuid ones.
> >
> > Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
> > Signed-off-by: Adrian Reber <areber@redhat.com>
>
> This is scary.  But I believe it is safe.
>
> Reviewed-by: Serge Hallyn <serge@hallyn.com>
>
> I am a bit curious about the implications of the selinux patch.
> IIUC you are using the permission of the tracing process to
> execute the file without transition, so this is a way to work
> around the policy which might prevent the tracee from doing so.
> Given that SELinux wants to be MAC, I'm not *quite* sure that's
> considered kosher.  You also are skipping the PROCESS__PTRACE
> to SECCLASS_PROCESS check which selinux_bprm_set_creds does later
> on.  Again I'm just not quite sure what's considered normal there
> these days.
>
> Paul, do you have input there?

I agree, the SELinux hook looks wrong.  Building on what Christian
said, this looks more like a ptrace operation than an exec operation.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply

* Re: [PATCH v4 3/3] prctl: Allow ptrace capable processes to change /proc/self/exe
From: Serge E. Hallyn @ 2020-07-02 21:58 UTC (permalink / raw)
  To: Christian Brauner, Paul Moore
  Cc: Adrian Reber, Eric Biederman, Pavel Emelyanov, Oleg Nesterov,
	Dmitry Safonov, Andrei Vagin, Nicolas Viennot,
	Michał Cłapiński, Kamil Yurtsever, Dirk Petersen,
	Christine Flood, Casey Schaufler, Mike Rapoport,
	Radostin Stoyanov, Cyrill Gorcunov, Serge Hallyn, Stephen Smalley,
	Sargun Dhillon, Arnd Bergmann, linux-security-module,
	linux-kernel, selinux, Eric Paris, Jann Horn, linux-fsdevel
In-Reply-To: <20200701085537.k7whjbn3icu5rpsq@wittgenstein>

On Wed, Jul 01, 2020 at 10:55:37AM +0200, Christian Brauner wrote:
> On Wed, Jul 01, 2020 at 08:49:06AM +0200, Adrian Reber wrote:
> > From: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
> > 
> > Previously, the current process could only change the /proc/self/exe
> > link with local CAP_SYS_ADMIN.
> > This commit relaxes this restriction by permitting such change with
> > CAP_CHECKPOINT_RESTORE, and the ability to use ptrace.
> > 
> > With access to ptrace facilities, a process can do the following: fork a
> > child, execve() the target executable, and have the child use ptrace()
> > to replace the memory content of the current process. This technique
> > makes it possible to masquerade an arbitrary program as any executable,
> > even setuid ones.
> > 
> > Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
> > Signed-off-by: Adrian Reber <areber@redhat.com>
> > ---
> >  include/linux/lsm_hook_defs.h |  1 +
> >  include/linux/security.h      |  6 ++++++
> >  kernel/sys.c                  | 12 ++++--------
> >  security/commoncap.c          | 26 ++++++++++++++++++++++++++
> >  security/security.c           |  5 +++++
> >  security/selinux/hooks.c      | 14 ++++++++++++++
> >  6 files changed, 56 insertions(+), 8 deletions(-)
> > 
> > diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
> > index 0098852bb56a..90e51d5e093b 100644
> > --- a/include/linux/lsm_hook_defs.h
> > +++ b/include/linux/lsm_hook_defs.h
> > @@ -211,6 +211,7 @@ LSM_HOOK(int, 0, task_kill, struct task_struct *p, struct kernel_siginfo *info,
> >  	 int sig, const struct cred *cred)
> >  LSM_HOOK(int, -ENOSYS, task_prctl, int option, unsigned long arg2,
> >  	 unsigned long arg3, unsigned long arg4, unsigned long arg5)
> > +LSM_HOOK(int, 0, prctl_set_mm_exe_file, struct file *exe_file)
> >  LSM_HOOK(void, LSM_RET_VOID, task_to_inode, struct task_struct *p,
> >  	 struct inode *inode)
> >  LSM_HOOK(int, 0, ipc_permission, struct kern_ipc_perm *ipcp, short flag)
> > diff --git a/include/linux/security.h b/include/linux/security.h
> > index 2797e7f6418e..0f594eb7e766 100644
> > --- a/include/linux/security.h
> > +++ b/include/linux/security.h
> > @@ -412,6 +412,7 @@ int security_task_kill(struct task_struct *p, struct kernel_siginfo *info,
> >  			int sig, const struct cred *cred);
> >  int security_task_prctl(int option, unsigned long arg2, unsigned long arg3,
> >  			unsigned long arg4, unsigned long arg5);
> > +int security_prctl_set_mm_exe_file(struct file *exe_file);
> >  void security_task_to_inode(struct task_struct *p, struct inode *inode);
> >  int security_ipc_permission(struct kern_ipc_perm *ipcp, short flag);
> >  void security_ipc_getsecid(struct kern_ipc_perm *ipcp, u32 *secid);
> > @@ -1124,6 +1125,11 @@ static inline int security_task_prctl(int option, unsigned long arg2,
> >  	return cap_task_prctl(option, arg2, arg3, arg4, arg5);
> >  }
> >  
> > +static inline int security_prctl_set_mm_exe_file(struct file *exe_file)
> > +{
> > +	return cap_prctl_set_mm_exe_file(exe_file);
> > +}
> > +
> >  static inline void security_task_to_inode(struct task_struct *p, struct inode *inode)
> >  { }
> >  
> > diff --git a/kernel/sys.c b/kernel/sys.c
> > index 00a96746e28a..bb53e8408c63 100644
> > --- a/kernel/sys.c
> > +++ b/kernel/sys.c
> > @@ -1851,6 +1851,10 @@ static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
> >  	if (err)
> >  		goto exit;
> >  
> > +	err = security_prctl_set_mm_exe_file(exe.file);
> > +	if (err)
> > +		goto exit;
> > +
> >  	/*
> >  	 * Forbid mm->exe_file change if old file still mapped.
> >  	 */
> > @@ -2006,14 +2010,6 @@ static int prctl_set_mm_map(int opt, const void __user *addr, unsigned long data
> >  	}
> >  
> >  	if (prctl_map.exe_fd != (u32)-1) {
> > -		/*
> > -		 * Make sure the caller has the rights to
> > -		 * change /proc/pid/exe link: only local sys admin should
> > -		 * be allowed to.
> > -		 */
> > -		if (!ns_capable(current_user_ns(), CAP_SYS_ADMIN))
> > -			return -EINVAL;
> > -
> >  		error = prctl_set_mm_exe_file(mm, prctl_map.exe_fd);
> >  		if (error)
> >  			return error;
> > diff --git a/security/commoncap.c b/security/commoncap.c
> > index 59bf3c1674c8..663d00fe2ecc 100644
> > --- a/security/commoncap.c
> > +++ b/security/commoncap.c
> > @@ -1291,6 +1291,31 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
> >  	}
> >  }
> >  
> > +/**
> > + * cap_prctl_set_mm_exe_file - Determine whether /proc/self/exe can be changed
> > + * by the current process.
> > + * @exe_file: The new exe file
> > + * Returns 0 if permission is granted, -ve if denied.
> > + *
> > + * The current process is permitted to change its /proc/self/exe link via two policies:
> > + * 1) The current user can do checkpoint/restore. At the time of this writing,
> > + *    this means CAP_SYS_ADMIN or CAP_CHECKPOINT_RESTORE capable.
> > + * 2) The current user can use ptrace.
> > + *
> > + * With access to ptrace facilities, a process can do the following:
> > + * fork a child, execve() the target executable, and have the child use
> > + * ptrace() to replace the memory content of the current process.
> > + * This technique makes it possible to masquerade an arbitrary program as the
> > + * target executable, even if it is setuid.
> > + */
> > +int cap_prctl_set_mm_exe_file(struct file *exe_file)
> > +{
> > +	if (checkpoint_restore_ns_capable(current_user_ns()))
> > +		return 0;
> > +
> > +	return security_ptrace_access_check(current, PTRACE_MODE_ATTACH_REALCREDS);
> > +}
> 
> What is the reason for having this be a new security hook? Doesn't look
> like it needs to be unless I'm missing something. This just seems more
> complex than it needs to be.

Yeah, agreed, and that in turn risks actually making it less safe.  Is
there a good reason not to do what Christian suggests?

> I might be wrong here but if you look at the callsites for
> security_ptrace_access_check() right now, you'll see that it's only
> called from kernel/ptrace.c in __ptrace_may_access() and that function
> checks right at the top:
> 
> 	/* Don't let security modules deny introspection */
> 	if (same_thread_group(task, current))
> 		return 0;
> 
> since you're passing in same_thread_group(current, current) you're
> passing this check and never even hitting
> security_ptrace_access_check(). So the contract seems to be (as is
> obvious from the comment) that a task can't be denied ptrace
> introspection. But if you're using security_ptrace_access_check(current)
> here and _if_ there would be any lsm that would deny ptrace
> introspection to current you'd suddenly introduce a callsite where
> ptrace introspection is denied. That seems wrong. So either you meant to
> do something else here or you really just want:
> 
> checkpoint_restore_ns_capable(current_user_ns())
> 
> and none of the rest. But I might be missing the big picture in this
> patch.
> 
> > +	if (checkpoint_restore_ns_capable(current_user_ns()))
> > +		return 0;
> > +
> >  /**
> >   * cap_vm_enough_memory - Determine whether a new virtual mapping is permitted
> >   * @mm: The VM space in which the new mapping is to be made
> > @@ -1356,6 +1381,7 @@ static struct security_hook_list capability_hooks[] __lsm_ro_after_init = {
> >  	LSM_HOOK_INIT(mmap_file, cap_mmap_file),
> >  	LSM_HOOK_INIT(task_fix_setuid, cap_task_fix_setuid),
> >  	LSM_HOOK_INIT(task_prctl, cap_task_prctl),
> > +	LSM_HOOK_INIT(prctl_set_mm_exe_file, cap_prctl_set_mm_exe_file),
> >  	LSM_HOOK_INIT(task_setscheduler, cap_task_setscheduler),
> >  	LSM_HOOK_INIT(task_setioprio, cap_task_setioprio),
> >  	LSM_HOOK_INIT(task_setnice, cap_task_setnice),
> > diff --git a/security/security.c b/security/security.c
> > index 2bb912496232..13a1ed32f9e3 100644
> > --- a/security/security.c
> > +++ b/security/security.c
> > @@ -1790,6 +1790,11 @@ int security_task_prctl(int option, unsigned long arg2, unsigned long arg3,
> >  	return rc;
> >  }
> >  
> > +int security_prctl_set_mm_exe_file(struct file *exe_file)
> > +{
> > +	return call_int_hook(prctl_set_mm_exe_file, 0, exe_file);
> > +}
> > +
> >  void security_task_to_inode(struct task_struct *p, struct inode *inode)
> >  {
> >  	call_void_hook(task_to_inode, p, inode);
> > diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> > index ca901025802a..fca5581392b8 100644
> > --- a/security/selinux/hooks.c
> > +++ b/security/selinux/hooks.c
> > @@ -4156,6 +4156,19 @@ static int selinux_task_kill(struct task_struct *p, struct kernel_siginfo *info,
> >  			    secid, task_sid(p), SECCLASS_PROCESS, perm, NULL);
> >  }
> >  
> > +static int selinux_prctl_set_mm_exe_file(struct file *exe_file)
> > +{
> > +	u32 sid = current_sid();
> > +
> > +	struct common_audit_data ad = {
> > +		.type = LSM_AUDIT_DATA_FILE,
> > +		.u.file = exe_file,
> > +	};
> > +
> > +	return avc_has_perm(&selinux_state, sid, sid,
> > +			    SECCLASS_FILE, FILE__EXECUTE_NO_TRANS, &ad);
> > +}
> > +
> >  static void selinux_task_to_inode(struct task_struct *p,
> >  				  struct inode *inode)
> >  {
> > @@ -7057,6 +7070,7 @@ static struct security_hook_list selinux_hooks[] __lsm_ro_after_init = {
> >  	LSM_HOOK_INIT(task_getscheduler, selinux_task_getscheduler),
> >  	LSM_HOOK_INIT(task_movememory, selinux_task_movememory),
> >  	LSM_HOOK_INIT(task_kill, selinux_task_kill),
> > +	LSM_HOOK_INIT(prctl_set_mm_exe_file, selinux_prctl_set_mm_exe_file),
> >  	LSM_HOOK_INIT(task_to_inode, selinux_task_to_inode),
> >  
> >  	LSM_HOOK_INIT(ipc_permission, selinux_ipc_permission),
> > -- 
> > 2.26.2
> > 

^ permalink raw reply

* Re: [PATCH v4 3/3] prctl: Allow ptrace capable processes to change /proc/self/exe
From: Serge E. Hallyn @ 2020-07-02 21:16 UTC (permalink / raw)
  To: Adrian Reber, Paul Moore
  Cc: Christian Brauner, Eric Biederman, Pavel Emelyanov, Oleg Nesterov,
	Dmitry Safonov, Andrei Vagin, Nicolas Viennot,
	Michał Cłapiński, Kamil Yurtsever, Dirk Petersen,
	Christine Flood, Casey Schaufler, Mike Rapoport,
	Radostin Stoyanov, Cyrill Gorcunov, Serge Hallyn, Stephen Smalley,
	Sargun Dhillon, Arnd Bergmann, linux-security-module,
	linux-kernel, selinux, Eric Paris, Jann Horn, linux-fsdevel
In-Reply-To: <20200701064906.323185-4-areber@redhat.com>

On Wed, Jul 01, 2020 at 08:49:06AM +0200, Adrian Reber wrote:
> From: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
> 
> Previously, the current process could only change the /proc/self/exe
> link with local CAP_SYS_ADMIN.
> This commit relaxes this restriction by permitting such change with
> CAP_CHECKPOINT_RESTORE, and the ability to use ptrace.
> 
> With access to ptrace facilities, a process can do the following: fork a
> child, execve() the target executable, and have the child use ptrace()
> to replace the memory content of the current process. This technique
> makes it possible to masquerade an arbitrary program as any executable,
> even setuid ones.
> 
> Signed-off-by: Nicolas Viennot <Nicolas.Viennot@twosigma.com>
> Signed-off-by: Adrian Reber <areber@redhat.com>

This is scary.  But I believe it is safe.

Reviewed-by: Serge Hallyn <serge@hallyn.com>

I am a bit curious about the implications of the selinux patch.
IIUC you are using the permission of the tracing process to
execute the file without transition, so this is a way to work
around the policy which might prevent the tracee from doing so.
Given that SELinux wants to be MAC, I'm not *quite* sure that's
considered kosher.  You also are skipping the PROCESS__PTRACE
to SECCLASS_PROCESS check which selinux_bprm_set_creds does later
on.  Again I'm just not quite sure what's considered normal there
these days.

Paul, do you have input there?

> ---
>  include/linux/lsm_hook_defs.h |  1 +
>  include/linux/security.h      |  6 ++++++
>  kernel/sys.c                  | 12 ++++--------
>  security/commoncap.c          | 26 ++++++++++++++++++++++++++
>  security/security.c           |  5 +++++
>  security/selinux/hooks.c      | 14 ++++++++++++++
>  6 files changed, 56 insertions(+), 8 deletions(-)
> 
> diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
> index 0098852bb56a..90e51d5e093b 100644
> --- a/include/linux/lsm_hook_defs.h
> +++ b/include/linux/lsm_hook_defs.h
> @@ -211,6 +211,7 @@ LSM_HOOK(int, 0, task_kill, struct task_struct *p, struct kernel_siginfo *info,
>  	 int sig, const struct cred *cred)
>  LSM_HOOK(int, -ENOSYS, task_prctl, int option, unsigned long arg2,
>  	 unsigned long arg3, unsigned long arg4, unsigned long arg5)
> +LSM_HOOK(int, 0, prctl_set_mm_exe_file, struct file *exe_file)
>  LSM_HOOK(void, LSM_RET_VOID, task_to_inode, struct task_struct *p,
>  	 struct inode *inode)
>  LSM_HOOK(int, 0, ipc_permission, struct kern_ipc_perm *ipcp, short flag)
> diff --git a/include/linux/security.h b/include/linux/security.h
> index 2797e7f6418e..0f594eb7e766 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -412,6 +412,7 @@ int security_task_kill(struct task_struct *p, struct kernel_siginfo *info,
>  			int sig, const struct cred *cred);
>  int security_task_prctl(int option, unsigned long arg2, unsigned long arg3,
>  			unsigned long arg4, unsigned long arg5);
> +int security_prctl_set_mm_exe_file(struct file *exe_file);
>  void security_task_to_inode(struct task_struct *p, struct inode *inode);
>  int security_ipc_permission(struct kern_ipc_perm *ipcp, short flag);
>  void security_ipc_getsecid(struct kern_ipc_perm *ipcp, u32 *secid);
> @@ -1124,6 +1125,11 @@ static inline int security_task_prctl(int option, unsigned long arg2,
>  	return cap_task_prctl(option, arg2, arg3, arg4, arg5);
>  }
>  
> +static inline int security_prctl_set_mm_exe_file(struct file *exe_file)
> +{
> +	return cap_prctl_set_mm_exe_file(exe_file);
> +}
> +
>  static inline void security_task_to_inode(struct task_struct *p, struct inode *inode)
>  { }
>  
> diff --git a/kernel/sys.c b/kernel/sys.c
> index 00a96746e28a..bb53e8408c63 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -1851,6 +1851,10 @@ static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
>  	if (err)
>  		goto exit;
>  
> +	err = security_prctl_set_mm_exe_file(exe.file);
> +	if (err)
> +		goto exit;
> +
>  	/*
>  	 * Forbid mm->exe_file change if old file still mapped.
>  	 */
> @@ -2006,14 +2010,6 @@ static int prctl_set_mm_map(int opt, const void __user *addr, unsigned long data
>  	}
>  
>  	if (prctl_map.exe_fd != (u32)-1) {
> -		/*
> -		 * Make sure the caller has the rights to
> -		 * change /proc/pid/exe link: only local sys admin should
> -		 * be allowed to.
> -		 */
> -		if (!ns_capable(current_user_ns(), CAP_SYS_ADMIN))
> -			return -EINVAL;
> -
>  		error = prctl_set_mm_exe_file(mm, prctl_map.exe_fd);
>  		if (error)
>  			return error;
> diff --git a/security/commoncap.c b/security/commoncap.c
> index 59bf3c1674c8..663d00fe2ecc 100644
> --- a/security/commoncap.c
> +++ b/security/commoncap.c
> @@ -1291,6 +1291,31 @@ int cap_task_prctl(int option, unsigned long arg2, unsigned long arg3,
>  	}
>  }
>  
> +/**
> + * cap_prctl_set_mm_exe_file - Determine whether /proc/self/exe can be changed
> + * by the current process.
> + * @exe_file: The new exe file
> + * Returns 0 if permission is granted, -ve if denied.
> + *
> + * The current process is permitted to change its /proc/self/exe link via two policies:
> + * 1) The current user can do checkpoint/restore. At the time of this writing,
> + *    this means CAP_SYS_ADMIN or CAP_CHECKPOINT_RESTORE capable.
> + * 2) The current user can use ptrace.
> + *
> + * With access to ptrace facilities, a process can do the following:
> + * fork a child, execve() the target executable, and have the child use
> + * ptrace() to replace the memory content of the current process.
> + * This technique makes it possible to masquerade an arbitrary program as the
> + * target executable, even if it is setuid.
> + */
> +int cap_prctl_set_mm_exe_file(struct file *exe_file)
> +{
> +	if (checkpoint_restore_ns_capable(current_user_ns()))
> +		return 0;
> +
> +	return security_ptrace_access_check(current, PTRACE_MODE_ATTACH_REALCREDS);
> +}
> +
>  /**
>   * cap_vm_enough_memory - Determine whether a new virtual mapping is permitted
>   * @mm: The VM space in which the new mapping is to be made
> @@ -1356,6 +1381,7 @@ static struct security_hook_list capability_hooks[] __lsm_ro_after_init = {
>  	LSM_HOOK_INIT(mmap_file, cap_mmap_file),
>  	LSM_HOOK_INIT(task_fix_setuid, cap_task_fix_setuid),
>  	LSM_HOOK_INIT(task_prctl, cap_task_prctl),
> +	LSM_HOOK_INIT(prctl_set_mm_exe_file, cap_prctl_set_mm_exe_file),
>  	LSM_HOOK_INIT(task_setscheduler, cap_task_setscheduler),
>  	LSM_HOOK_INIT(task_setioprio, cap_task_setioprio),
>  	LSM_HOOK_INIT(task_setnice, cap_task_setnice),
> diff --git a/security/security.c b/security/security.c
> index 2bb912496232..13a1ed32f9e3 100644
> --- a/security/security.c
> +++ b/security/security.c
> @@ -1790,6 +1790,11 @@ int security_task_prctl(int option, unsigned long arg2, unsigned long arg3,
>  	return rc;
>  }
>  
> +int security_prctl_set_mm_exe_file(struct file *exe_file)
> +{
> +	return call_int_hook(prctl_set_mm_exe_file, 0, exe_file);
> +}
> +
>  void security_task_to_inode(struct task_struct *p, struct inode *inode)
>  {
>  	call_void_hook(task_to_inode, p, inode);
> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> index ca901025802a..fca5581392b8 100644
> --- a/security/selinux/hooks.c
> +++ b/security/selinux/hooks.c
> @@ -4156,6 +4156,19 @@ static int selinux_task_kill(struct task_struct *p, struct kernel_siginfo *info,
>  			    secid, task_sid(p), SECCLASS_PROCESS, perm, NULL);
>  }
>  
> +static int selinux_prctl_set_mm_exe_file(struct file *exe_file)
> +{
> +	u32 sid = current_sid();
> +
> +	struct common_audit_data ad = {
> +		.type = LSM_AUDIT_DATA_FILE,
> +		.u.file = exe_file,
> +	};
> +
> +	return avc_has_perm(&selinux_state, sid, sid,
> +			    SECCLASS_FILE, FILE__EXECUTE_NO_TRANS, &ad);
> +}
> +
>  static void selinux_task_to_inode(struct task_struct *p,
>  				  struct inode *inode)
>  {
> @@ -7057,6 +7070,7 @@ static struct security_hook_list selinux_hooks[] __lsm_ro_after_init = {
>  	LSM_HOOK_INIT(task_getscheduler, selinux_task_getscheduler),
>  	LSM_HOOK_INIT(task_movememory, selinux_task_movememory),
>  	LSM_HOOK_INIT(task_kill, selinux_task_kill),
> +	LSM_HOOK_INIT(prctl_set_mm_exe_file, selinux_prctl_set_mm_exe_file),
>  	LSM_HOOK_INIT(task_to_inode, selinux_task_to_inode),
>  
>  	LSM_HOOK_INIT(ipc_permission, selinux_ipc_permission),
> -- 
> 2.26.2

^ permalink raw reply

* Re: [PATCH v33 12/21] x86/sgx: Allow a limited use of ATTRIBUTE.PROVISIONKEY for attestation
From: Dr. Greg @ 2020-07-02 20:47 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Sean Christopherson, Jarkko Sakkinen, x86, linux-sgx,
	linux-kernel, linux-security-module, Jethro Beekman,
	Andy Lutomirski, akpm, andriy.shevchenko, asapek, cedric.xing,
	chenalexchen, conradparker, cyhanish, dave.hansen, haitao.huang,
	josh, kai.huang, kai.svahn, kmoy, ludloff, nhorman, npmccallum,
	puiterwijk, rientjes, tglx, yaozhangx
In-Reply-To: <20200630084956.GB1093@zn.tnic>

Good afternoon, I hope the week is progressing productively to an end
for everyone.

I think it was almost two months ago now that Thomas Gleixner
indicated that security and privacy issues that we were raising with
respect to this driver, with what we believe is legitimate domain
expertise, threatened the upstreaming of the driver.  I think the case
can be made that those claims were somewhat specious given a fast
forward to today and the continued uncertainty regarding the
architecture of this driver.

So I will take that risk once again in order to provide some context
for this thread.

On Tue, Jun 30, 2020 at 10:49:56AM +0200, Borislav Petkov wrote:
> On Mon, Jun 29, 2020 at 03:04:00PM -0700, Sean Christopherson wrote:
> > > I don't see this acronym resolved anywhere in the whole patchset.
> > 
> > Quoting Enclave.

> Yah, pls add it somewhere.

For the benefit of everyone not deeply involved with this and perhaps
something that Jarkko can cut and paste if he desires.

The Quoting Enclave (QE) is the trusted endpoint that is responsible
for signing the report of an enclave's initialized status on a
platform.

In Enhanced Privacy ID (EPID) based attestation, the QE is the
custodian of one of the private EPID group keys that is provisioned to
the platform by the Intel Attestation Service (IAS).  The quoting
enclave generates a signature over the attesting enclave's report
structure using that key.  The IAS uses its public copy of the group
key to verify that the signature is from a trusted endpoint running on
a member of the group.

The EPID provisioning process 'trusts' that the platform, to which the
private group key is being delegated to, is a known Intel platform by
virtue of the fact that the Platform Certification Enclave (PCE) is
able to generate an identifier that could only be generated by having
access to a specific symmetric encryption key.  A key that is
available only to enclaves that have been initialized with the
PROVISION_KEY attribute bit.

The QE encrypts the private group key using a signer specific
(MRSIGNER) symmetric encryption key that only the QE can generate
while in a trusted initialization state.  That provides a mechanisum
for persisting this chain of trust outside the context of execution of
the QE.

Things are slightly different from a mechanistic perspective when the
Data Center Attestation Protocol (DCAP) is being used, but
conceptually the same.  In that attestation variant, the QE carries
the PROVISION_KEY attribute so that it can certify that the report
signature, generated with the Elliptic Curve Digital Signature
Algorithm (ECDSA) rather then EPID, is from a known platform.

> > /dev/sgx/provision is root-only by default, the expectation is
> > that the admin will configure the system to grant only specific
> > enclaves access to the PROVISION_KEY.

> Uuh, I don't like "the expectation is" - the reality happens to turn
> differently, more often than not.

Indeed, which is why we have consistently maintained that the platform
owner should be allowed to use cryptographic access controls over
which enclaves can possess the PROVISION_KEY attribute.

Given the security threat on a platform that is capable of supporting
Enclave Dynamic Memory Management (EDMM), the ability to initialize an
enclave is also a legitimate candidate for cryptographic access
control.

We can bikeshed the significance of these issues all day, but for the
record, access to the PROVISION_KEY allows platforms to be absolutely
and precisely fingerprinted for their entire lifespan.

Initialized enclaves on an EDMM capable platform have the ability to
execute arbitrary code, over which the kernel security infrastructure
has no visibility into or control over.

There is little doubt that EDMM will be a mainstay of the Confidential
Computing Initiative (CCI), which is the target of this driver.  A
review of the archives will indicate that RedHat/IBM has already
confirmed that their candidate for this infrastructure (Enarx)
requires EDMM.

To refresh everyone further, the last version of the SFLC patches that
we proposed, allows a platform owner to run the driver in a default
mode, which uses the proposed DAC controls over all of this or to opt
for stronger cryptographic controls.  The approach that we took has
virtually no impact on the architecture and footprint of the driver.

> > In this series, access is fairly binary, i.e. there's no
> > additional kernel infrastructure to help userspace make
> > per-enclave decisions.  There have been more than a few proposals
> > on how to extend the kernel to help provide better granularity,
> > e.g. LSM hooks, but it was generally agreed to punt that stuff to
> > post-upstreaming to keep things "simple" once we went far enough
> > down various paths to ensure we weren't painting ourselves into a
> > corner.

> So this all sounds to me like we should not upstream
> /dev/sgx/provision now but delay it until the infrastructure for
> that has been made more concrete. We can always add it
> then. Changing it after the fact - if we have to and for whatever
> reason - would be a lot harder for a user-visible interface which
> someone has started using already.
>
> So I'd leave that out from the initial patchset.

Without access to the PROVISION_KEY attribute, the two standard
mechanisms for attestation will not function, the technology is
arguably useless for its intended purposes without attestation.

Once again, I will leave it to community bike shedding as to whether
or not Linux wants the ability to support unfettered deterministic
platform fingerprinting that can be conducted for the physical
lifespan of the platform.

Once again, for the record, our approach affords compatibility with
the now 6+ year old out-of-tree user interface, without requiring its
use.  I would argue that the continually voiced concerns about
'painting ourselves into a corner', with respect to access
authorization, is significantly less of a problem with our approach
rather then without it.

> > If you want super gory details, Intel's whitepaper on attestation
> > in cloud environments is a good starting point[*], but I don't
> > recommended doing much more than skimming unless you really like
> > attestation stuff or are masochistic, which IMO amount to the same
> > thing :-)

> No thanks. :)

Interesting reflections and perhaps worthy of some introspection by
the Linux development community.

I will concede that there are a lot of musty corners that need to be
explored in order to understand how all of this works and the
security/privacy issues involved.  I certainly wouldn't wish
exploration on those corners on anyone.

I think everyone will concede that my ideas and suggestions on these
issues have been deemed to be unpopular, if not without merit.  They
do, however, come from the perspective of someone who directed a
complete and independent implementation of all of this infrastructure.

A process which has left me with no uncertainty whatsoever with
respect to the issues involved in all of this.

> Regards/Gruss,
>     Boris.

Best wishes for a pleasant and productive weekend to everyone.

Dr. Greg

As always,
Dr. Greg Wettstein, Ph.D, Worker      Autonomously self-defensive
Enjellic Systems Development, LLC     IOT platforms and edge devices.
4206 N. 19th Ave.
Fargo, ND  58102
PH: 701-281-1686                      EMAIL: greg@enjellic.com
------------------------------------------------------------------------------
"I suppose that could happen but he wouldn't know a Galois Field
 if it kicked him in the nuts."
                                -- Anonymous mathematician
                                   Resurrection.

^ permalink raw reply

* Re: [PATCH v4 2/3] selftests: add clone3() CAP_CHECKPOINT_RESTORE test
From: Serge E. Hallyn @ 2020-07-02 20:53 UTC (permalink / raw)
  To: Adrian Reber
  Cc: Christian Brauner, Eric Biederman, Pavel Emelyanov, Oleg Nesterov,
	Dmitry Safonov, Andrei Vagin, Nicolas Viennot,
	Michał Cłapiński, Kamil Yurtsever, Dirk Petersen,
	Christine Flood, Casey Schaufler, Mike Rapoport,
	Radostin Stoyanov, Cyrill Gorcunov, Serge Hallyn, Stephen Smalley,
	Sargun Dhillon, Arnd Bergmann, linux-security-module,
	linux-kernel, selinux, Eric Paris, Jann Horn, linux-fsdevel
In-Reply-To: <20200701064906.323185-3-areber@redhat.com>

On Wed, Jul 01, 2020 at 08:49:05AM +0200, Adrian Reber wrote:
> This adds a test that changes its UID, uses capabilities to
> get CAP_CHECKPOINT_RESTORE and uses clone3() with set_tid to
> create a process with a given PID as non-root.

Seems worth also verifying that it fails if you have no capabilities.
I don't see that in the existing clone3/ test dir.


> Signed-off-by: Adrian Reber <areber@redhat.com>
> ---
>  tools/testing/selftests/clone3/Makefile       |   4 +-
>  .../clone3/clone3_cap_checkpoint_restore.c    | 203 ++++++++++++++++++
>  2 files changed, 206 insertions(+), 1 deletion(-)
>  create mode 100644 tools/testing/selftests/clone3/clone3_cap_checkpoint_restore.c
> 
> diff --git a/tools/testing/selftests/clone3/Makefile b/tools/testing/selftests/clone3/Makefile
> index cf976c732906..ef7564cb7abe 100644
> --- a/tools/testing/selftests/clone3/Makefile
> +++ b/tools/testing/selftests/clone3/Makefile
> @@ -1,6 +1,8 @@
>  # SPDX-License-Identifier: GPL-2.0
>  CFLAGS += -g -I../../../../usr/include/
> +LDLIBS += -lcap
>  
> -TEST_GEN_PROGS := clone3 clone3_clear_sighand clone3_set_tid
> +TEST_GEN_PROGS := clone3 clone3_clear_sighand clone3_set_tid \
> +	clone3_cap_checkpoint_restore
>  
>  include ../lib.mk
> diff --git a/tools/testing/selftests/clone3/clone3_cap_checkpoint_restore.c b/tools/testing/selftests/clone3/clone3_cap_checkpoint_restore.c
> new file mode 100644
> index 000000000000..2cc3d57b91f2
> --- /dev/null
> +++ b/tools/testing/selftests/clone3/clone3_cap_checkpoint_restore.c
> @@ -0,0 +1,203 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +/*
> + * Based on Christian Brauner's clone3() example.
> + * These tests are assuming to be running in the host's
> + * PID namespace.
> + */
> +
> +/* capabilities related code based on selftests/bpf/test_verifier.c */
> +
> +#define _GNU_SOURCE
> +#include <errno.h>
> +#include <linux/types.h>
> +#include <linux/sched.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <stdbool.h>
> +#include <sys/capability.h>
> +#include <sys/prctl.h>
> +#include <sys/syscall.h>
> +#include <sys/types.h>
> +#include <sys/un.h>
> +#include <sys/wait.h>
> +#include <unistd.h>
> +#include <sched.h>
> +
> +#include "../kselftest.h"
> +#include "clone3_selftests.h"
> +
> +#ifndef MAX_PID_NS_LEVEL
> +#define MAX_PID_NS_LEVEL 32
> +#endif
> +
> +static void child_exit(int ret)
> +{
> +	fflush(stdout);
> +	fflush(stderr);
> +	_exit(ret);
> +}
> +
> +static int call_clone3_set_tid(pid_t * set_tid, size_t set_tid_size)
> +{
> +	int status;
> +	pid_t pid = -1;
> +
> +	struct clone_args args = {
> +		.exit_signal = SIGCHLD,
> +		.set_tid = ptr_to_u64(set_tid),
> +		.set_tid_size = set_tid_size,
> +	};
> +
> +	pid = sys_clone3(&args, sizeof(struct clone_args));
> +	if (pid < 0) {
> +		ksft_print_msg("%s - Failed to create new process\n",
> +			       strerror(errno));
> +		return -errno;
> +	}
> +
> +	if (pid == 0) {
> +		int ret;
> +		char tmp = 0;
> +
> +		ksft_print_msg
> +		    ("I am the child, my PID is %d (expected %d)\n",
> +		     getpid(), set_tid[0]);
> +
> +		if (set_tid[0] != getpid())
> +			child_exit(EXIT_FAILURE);
> +		child_exit(EXIT_SUCCESS);
> +	}
> +
> +	ksft_print_msg("I am the parent (%d). My child's pid is %d\n",
> +		       getpid(), pid);
> +
> +	if (waitpid(pid, &status, 0) < 0) {
> +		ksft_print_msg("Child returned %s\n", strerror(errno));
> +		return -errno;
> +	}
> +
> +	if (!WIFEXITED(status))
> +		return -1;
> +
> +	return WEXITSTATUS(status);
> +}
> +
> +static int test_clone3_set_tid(pid_t * set_tid,
> +			       size_t set_tid_size, int expected)
> +{
> +	int ret;
> +
> +	ksft_print_msg("[%d] Trying clone3() with CLONE_SET_TID to %d\n",
> +		       getpid(), set_tid[0]);
> +	ret = call_clone3_set_tid(set_tid, set_tid_size);
> +
> +	ksft_print_msg
> +	    ("[%d] clone3() with CLONE_SET_TID %d says :%d - expected %d\n",
> +	     getpid(), set_tid[0], ret, expected);
> +	if (ret != expected) {
> +		ksft_test_result_fail
> +		    ("[%d] Result (%d) is different than expected (%d)\n",
> +		     getpid(), ret, expected);
> +		return -1;
> +	}
> +	ksft_test_result_pass
> +	    ("[%d] Result (%d) matches expectation (%d)\n", getpid(), ret,
> +	     expected);
> +
> +	return 0;
> +}
> +
> +struct libcap {
> +	struct __user_cap_header_struct hdr;
> +	struct __user_cap_data_struct data[2];
> +};
> +
> +static int set_capability()
> +{
> +	cap_value_t cap_values[] = { CAP_SETUID, CAP_SETGID };
> +	struct libcap *cap;
> +	int ret = -1;
> +	cap_t caps;
> +
> +	caps = cap_get_proc();
> +	if (!caps) {
> +		perror("cap_get_proc");
> +		return -1;
> +	}
> +
> +	/* Drop all capabilities */
> +	if (cap_clear(caps)) {
> +		perror("cap_clear");
> +		goto out;
> +	}
> +
> +	cap_set_flag(caps, CAP_EFFECTIVE, 2, cap_values, CAP_SET);
> +	cap_set_flag(caps, CAP_PERMITTED, 2, cap_values, CAP_SET);
> +
> +	cap = (struct libcap *) caps;
> +
> +	/* 40 -> CAP_CHECKPOINT_RESTORE */
> +	cap->data[1].effective |= 1 << (40 - 32);
> +	cap->data[1].permitted |= 1 << (40 - 32);
> +
> +	if (cap_set_proc(caps)) {
> +		perror("cap_set_proc");
> +		goto out;
> +	}
> +	ret = 0;
> +out:
> +	if (cap_free(caps))
> +		perror("cap_free");
> +	return ret;
> +}
> +
> +int main(int argc, char *argv[])
> +{
> +	pid_t pid;
> +	int status;
> +	int ret = 0;
> +	pid_t set_tid[1];
> +	uid_t uid = getuid();
> +
> +	ksft_print_header();
> +	test_clone3_supported();
> +	ksft_set_plan(2);
> +
> +	if (uid != 0) {
> +		ksft_cnt.ksft_xskip = ksft_plan;
> +		ksft_print_msg("Skipping all tests as non-root\n");
> +		return ksft_exit_pass();
> +	}
> +
> +	memset(&set_tid, 0, sizeof(set_tid));
> +
> +	/* Find the current active PID */
> +	pid = fork();
> +	if (pid == 0) {
> +		ksft_print_msg("Child has PID %d\n", getpid());
> +		child_exit(EXIT_SUCCESS);
> +	}
> +	if (waitpid(pid, &status, 0) < 0)
> +		ksft_exit_fail_msg("Waiting for child %d failed", pid);
> +
> +	/* After the child has finished, its PID should be free. */
> +	set_tid[0] = pid;
> +
> +	if (set_capability())
> +		ksft_test_result_fail
> +		    ("Could not set CAP_CHECKPOINT_RESTORE\n");
> +	prctl(PR_SET_KEEPCAPS, 1, 0, 0, 0);
> +	/* This would fail without CAP_CHECKPOINT_RESTORE */
> +	setgid(1000);
> +	setuid(1000);
> +	set_tid[0] = pid;
> +	ret |= test_clone3_set_tid(set_tid, 1, -EPERM);
> +	if (set_capability())
> +		ksft_test_result_fail
> +		    ("Could not set CAP_CHECKPOINT_RESTORE\n");
> +	/* This should work as we have CAP_CHECKPOINT_RESTORE as non-root */
> +	ret |= test_clone3_set_tid(set_tid, 1, 0);
> +
> +	return !ret ? ksft_exit_pass() : ksft_exit_fail();
> +}
> -- 
> 2.26.2

^ permalink raw reply

* Re: [PATCH v8 2/2] tpm: Add support for event log pointer found in TPM2 ACPI table
From: Jarkko Sakkinen @ 2020-07-02 20:44 UTC (permalink / raw)
  To: Stefan Berger
  Cc: linux-integrity, linux-kernel, linux-acpi, linux-security-module,
	Stefan Berger
In-Reply-To: <20200626153948.2059251-3-stefanb@linux.vnet.ibm.com>

On Fri, Jun 26, 2020 at 11:39:48AM -0400, Stefan Berger wrote:
> From: Stefan Berger <stefanb@linux.ibm.com>
> 
> In case a TPM2 is attached, search for a TPM2 ACPI table when trying
> to get the event log from ACPI. If one is found, use it to get the
> start and length of the log area. This allows non-UEFI systems, such
> as SeaBIOS, to pass an event log when using a TPM2.
> 
> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
> ---
>  drivers/char/tpm/eventlog/acpi.c | 63 +++++++++++++++++++++-----------
>  1 file changed, 42 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/char/tpm/eventlog/acpi.c b/drivers/char/tpm/eventlog/acpi.c
> index 63ada5e53f13..51312c460335 100644
> --- a/drivers/char/tpm/eventlog/acpi.c
> +++ b/drivers/char/tpm/eventlog/acpi.c
> @@ -49,9 +49,9 @@ int tpm_read_log_acpi(struct tpm_chip *chip)
>  	void __iomem *virt;
>  	u64 len, start;
>  	struct tpm_bios_log *log;
> -
> -	if (chip->flags & TPM_CHIP_FLAG_TPM2)
> -		return -ENODEV;
> +	struct acpi_table_tpm2 *tbl;
> +	struct acpi_tpm2_phy *t2phy;

This is a rather cryptic variable name. Should be either 'phy' or
'tpm2_phy' (I don't care that much which one). Did not spot this
earlier.

/Jarkko

^ permalink raw reply

* Re: linux-next: umh: fix processed error when UMH_WAIT_PROC is used seems to break linux bridge on s390x (bisected)
From: Luis Chamberlain @ 2020-07-02 19:46 UTC (permalink / raw)
  To: Tetsuo Handa, David Howells
  Cc: Christian Borntraeger, Christoph Hellwig, Eric W. Biederman, ast,
	axboe, bfields, bridge, chainsaw, christian.brauner, chuck.lever,
	davem, gregkh, jarkko.sakkinen, jmorris, josh, keescook, keyrings,
	kuba, lars.ellenberg, linux-fsdevel, linux-kernel, linux-nfs,
	linux-security-module, nikolay, philipp.reisner, ravenexp, roopa,
	serge, slyfox, viro, yangtiezhu, netdev, markward, linux-s390
In-Reply-To: <e3f3e501-2cb7-b683-4b85-2002b7603244@i-love.sakura.ne.jp>

On Thu, Jul 02, 2020 at 01:26:53PM +0900, Tetsuo Handa wrote:
> On 2020/07/02 0:38, Luis Chamberlain wrote:
> > @@ -156,6 +156,18 @@ static void call_usermodehelper_exec_sync(struct subprocess_info *sub_info)
> >  		 */
> >  		if (KWIFEXITED(ret))
> >  			sub_info->retval = KWEXITSTATUS(ret);
> > +		/*
> > +		 * Do we really want to be passing the signal, or do we pass
> > +		 * a single error code for all cases?
> > +		 */
> > +		else if (KWIFSIGNALED(ret))
> > +			sub_info->retval = KWTERMSIG(ret);
> 
> No, this is bad. Caller of usermode helper is unable to distinguish exit(9)
> and e.g. SIGKILL'ed by the OOM-killer.

Right, the question is: do we care?

> Please pass raw exit status value.
> 
> I feel that caller of usermode helper should not use exit status value.
> For example, call_sbin_request_key() is checking
> 
>   test_bit(KEY_FLAG_USER_CONSTRUCT, &key->flags) || key_validate(key) < 0
> 
> condition (if usermode helper was invoked) in order to "ignore any errors from
> userspace if the key was instantiated".

For those not familiar with this code path, or if you cannot decipher
the above, the code path in question was:

static int call_sbin_request_key(struct key *authkey, void *aux)                
{
	...
	/* do it */                                                             
	ret = call_usermodehelper_keys(request_key, argv, envp, keyring,        
				       UMH_WAIT_PROC);
	kdebug("usermode -> 0x%x", ret);
	if (ret >= 0) {
		/* ret is the exit/wait code */
		if (test_bit(KEY_FLAG_USER_CONSTRUCT, &key->flags) ||
		    key_validate(key) < 0)
		    ret = -ENOKEY;
		/* ignore any errors from userspace if the key was      
		 * instantiated */  
		 ret = 0;
	}
	...
}

And the umh patch "umh: fix processed error when UMH_WAIT_PROC is used"
changed this to:

-       if (ret >= 0) {
+       if (ret != 0) {

Prior to the patch negative return values from userspace were still
being captured, and likewise signals, but the error value was not
raw, not the actual value. After the patch, since we check for ret != 0
we still upkeep the sanity check for any error, correct the error value,
but as you noted signals were ignored as I made the wrong assumption
we would ignore them. The umh sub_info->retval is set after my original
patch only if KWIFSIGNALED(ret)), and ignored signals, and so that
would be now capitured with the additional KWIFSIGNALED(ret)) check.

The question still stands:

Do we want to open code all these checks or simply wrap them up in
the umh. If we do the later, as you note exit(9) and a SIGKILL will
be the same to the inspector in the kernel. But do we care?

Do we really want umh callers differntiatin between signals and exit values?

The alternative to making a compromise is using generic wrappers for
things which make sense and letting the callers use those.

  Luis

> > +		/* Same here */
> > +		else if (KWIFSTOPPED((ret)))
> > +			sub_info->retval = KWSTOPSIG(ret);
> > +		/* And are we really sure we want this? */
> > +		else if (KWIFCONTINUED((ret)))
> > +			sub_info->retval = 0;
> >  	}
> >  
> >  	/* Restore default kernel sig handler */
> > 
> 

^ permalink raw reply

* [PATCH v3 11/16] bpfilter: Move bpfilter_umh back into init data
From: Eric W. Biederman @ 2020-07-02 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Miller, Greg Kroah-Hartman, Tetsuo Handa,
	Alexei Starovoitov, Kees Cook, Andrew Morton, Alexei Starovoitov,
	Al Viro, bpf, linux-fsdevel, Daniel Borkmann, Jakub Kicinski,
	Masahiro Yamada, Gary Lin, Bruno Meneguele, LSM List,
	Casey Schaufler, Luis Chamberlain, Linus Torvalds,
	Christian Brauner, Eric W. Biederman, Greg Kroah-Hartman
In-Reply-To: <87y2o1swee.fsf_-_@x220.int.ebiederm.org>

To allow for restarts 61fbf5933d42 ("net: bpfilter: restart
bpfilter_umh when error occurred") moved the blob holding the
userspace binary out of the init sections.

Now that loading the blob into a filesystem is separate from executing
the blob the blob no longer needs to live .rodata to allow for restarting.
So move the blob back to .init.rodata.

v1: https://lkml.kernel.org/r/87sgeidlvq.fsf_-_@x220.int.ebiederm.org
v2: https://lkml.kernel.org/r/87ftad4ozc.fsf_-_@x220.int.ebiederm.org
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 net/bpfilter/bpfilter_umh_blob.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/bpfilter/bpfilter_umh_blob.S b/net/bpfilter/bpfilter_umh_blob.S
index 9ea6100dca87..40311d10d2f2 100644
--- a/net/bpfilter/bpfilter_umh_blob.S
+++ b/net/bpfilter/bpfilter_umh_blob.S
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-	.section .rodata, "a"
+	.section .init.rodata, "a"
 	.global bpfilter_umh_start
 bpfilter_umh_start:
 	.incbin "net/bpfilter/bpfilter_umh"
-- 
2.25.0


^ permalink raw reply related

* [PATCH v3 16/16] umd: Stop using split_argv
From: Eric W. Biederman @ 2020-07-02 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Miller, Greg Kroah-Hartman, Tetsuo Handa,
	Alexei Starovoitov, Kees Cook, Andrew Morton, Alexei Starovoitov,
	Al Viro, bpf, linux-fsdevel, Daniel Borkmann, Jakub Kicinski,
	Masahiro Yamada, Gary Lin, Bruno Meneguele, LSM List,
	Casey Schaufler, Luis Chamberlain, Linus Torvalds,
	Christian Brauner, Eric W. Biederman
In-Reply-To: <87y2o1swee.fsf_-_@x220.int.ebiederm.org>

There is exactly one argument so there is nothing to split.  All
split_argv does now is cause confusion and avoid the need for a cast
when passing a "const char *" string to call_usermodehelper_setup.

So avoid confusion and the possibility of an odd driver name causing
problems by just using a fixed argv array with a cast in the call to
call_usermodehelper_setup.

v1: https://lkml.kernel.org/r/87sged3a9n.fsf_-_@x220.int.ebiederm.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 kernel/usermode_driver.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/kernel/usermode_driver.c b/kernel/usermode_driver.c
index cd136f86f799..0b35212ffc3d 100644
--- a/kernel/usermode_driver.c
+++ b/kernel/usermode_driver.c
@@ -160,27 +160,21 @@ static void umd_cleanup(struct subprocess_info *info)
 int fork_usermode_driver(struct umd_info *info)
 {
 	struct subprocess_info *sub_info;
-	char **argv = NULL;
+	const char *argv[] = { info->driver_name, NULL };
 	int err;
 
 	if (WARN_ON_ONCE(info->tgid))
 		return -EBUSY;
 
 	err = -ENOMEM;
-	argv = argv_split(GFP_KERNEL, info->driver_name, NULL);
-	if (!argv)
-		goto out;
-
-	sub_info = call_usermodehelper_setup(info->driver_name, argv, NULL,
-					     GFP_KERNEL,
+	sub_info = call_usermodehelper_setup(info->driver_name,
+					     (char **)argv, NULL, GFP_KERNEL,
 					     umd_setup, umd_cleanup, info);
 	if (!sub_info)
 		goto out;
 
 	err = call_usermodehelper_exec(sub_info, UMH_WAIT_EXEC);
 out:
-	if (argv)
-		argv_free(argv);
 	return err;
 }
 EXPORT_SYMBOL_GPL(fork_usermode_driver);
-- 
2.25.0


^ permalink raw reply related

* [PATCH v3 13/16] exit: Factor thread_group_exited out of pidfd_poll
From: Eric W. Biederman @ 2020-07-02 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Miller, Greg Kroah-Hartman, Tetsuo Handa,
	Alexei Starovoitov, Kees Cook, Andrew Morton, Alexei Starovoitov,
	Al Viro, bpf, linux-fsdevel, Daniel Borkmann, Jakub Kicinski,
	Masahiro Yamada, Gary Lin, Bruno Meneguele, LSM List,
	Casey Schaufler, Luis Chamberlain, Linus Torvalds,
	Christian Brauner, Eric W. Biederman
In-Reply-To: <87y2o1swee.fsf_-_@x220.int.ebiederm.org>

Create an independent helper thread_group_exited report return true
when all threads have passed exit_notify in do_exit.  AKA all of the
threads are at least zombies and might be dead or completely gone.

Create this helper by taking the logic out of pidfd_poll where
it is already tested, and adding a missing READ_ONCE on
the read of task->exit_state.

I will be changing the user mode driver code to use this same logic
to know when a user mode driver needs to be restarted.

Place the new helper thread_group_exited in kernel/exit.c and
EXPORT it so it can be used by modules.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/sched/signal.h |  2 ++
 kernel/exit.c                | 24 ++++++++++++++++++++++++
 kernel/fork.c                |  6 +-----
 3 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
index 0ee5e696c5d8..1bad18a1d8ba 100644
--- a/include/linux/sched/signal.h
+++ b/include/linux/sched/signal.h
@@ -674,6 +674,8 @@ static inline int thread_group_empty(struct task_struct *p)
 #define delay_group_leader(p) \
 		(thread_group_leader(p) && !thread_group_empty(p))
 
+extern bool thread_group_exited(struct pid *pid);
+
 extern struct sighand_struct *__lock_task_sighand(struct task_struct *task,
 							unsigned long *flags);
 
diff --git a/kernel/exit.c b/kernel/exit.c
index d3294b611df1..a7f112feb0f6 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1713,6 +1713,30 @@ COMPAT_SYSCALL_DEFINE5(waitid,
 }
 #endif
 
+/**
+ * thread_group_exited - check that a thread group has exited
+ * @pid: tgid of thread group to be checked.
+ *
+ * Test if thread group is has exited (all threads are zombies, dead
+ * or completely gone).
+ *
+ * Return: true if the thread group has exited. false otherwise.
+ */
+bool thread_group_exited(struct pid *pid)
+{
+	struct task_struct *task;
+	bool exited;
+
+	rcu_read_lock();
+	task = pid_task(pid, PIDTYPE_PID);
+	exited = !task ||
+		(READ_ONCE(task->exit_state) && thread_group_empty(task));
+	rcu_read_unlock();
+
+	return exited;
+}
+EXPORT_SYMBOL(thread_group_exited);
+
 __weak void abort(void)
 {
 	BUG();
diff --git a/kernel/fork.c b/kernel/fork.c
index 142b23645d82..bf215af7a904 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1787,22 +1787,18 @@ static void pidfd_show_fdinfo(struct seq_file *m, struct file *f)
  */
 static __poll_t pidfd_poll(struct file *file, struct poll_table_struct *pts)
 {
-	struct task_struct *task;
 	struct pid *pid = file->private_data;
 	__poll_t poll_flags = 0;
 
 	poll_wait(file, &pid->wait_pidfd, pts);
 
-	rcu_read_lock();
-	task = pid_task(pid, PIDTYPE_PID);
 	/*
 	 * Inform pollers only when the whole thread group exits.
 	 * If the thread group leader exits before all other threads in the
 	 * group, then poll(2) should block, similar to the wait(2) family.
 	 */
-	if (!task || (task->exit_state && thread_group_empty(task)))
+	if (thread_group_exited(pid))
 		poll_flags = EPOLLIN | EPOLLRDNORM;
-	rcu_read_unlock();
 
 	return poll_flags;
 }
-- 
2.25.0


^ permalink raw reply related

* [PATCH v3 12/16] umd: Track user space drivers with struct pid
From: Eric W. Biederman @ 2020-07-02 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Miller, Greg Kroah-Hartman, Tetsuo Handa,
	Alexei Starovoitov, Kees Cook, Andrew Morton, Alexei Starovoitov,
	Al Viro, bpf, linux-fsdevel, Daniel Borkmann, Jakub Kicinski,
	Masahiro Yamada, Gary Lin, Bruno Meneguele, LSM List,
	Casey Schaufler, Luis Chamberlain, Linus Torvalds,
	Christian Brauner, Eric W. Biederman, Greg Kroah-Hartman
In-Reply-To: <87y2o1swee.fsf_-_@x220.int.ebiederm.org>

Use struct pid instead of user space pid values that are prone to wrap
araound.

In addition track the entire thread group instead of just the first
thread that is started by exec.  There are no multi-threaded user mode
drivers today but there is nothing preclucing user drivers from being
multi-threaded, so it is just a good idea to track the entire process.

Take a reference count on the tgid's in question to make it possible
to remove exit_umh in a future change.

As a struct pid is available directly use kill_pid_info.

The prior process signalling code was iffy in using a userspace pid
known to be in the initial pid namespace and then looking up it's task
in whatever the current pid namespace is.  It worked only because
kernel threads always run in the initial pid namespace.

As the tgid is now refcounted verify the tgid is NULL at the start of
fork_usermode_driver to avoid the possibility of silent pid leaks.

v1: https://lkml.kernel.org/r/87mu4qdlv2.fsf_-_@x220.int.ebiederm.org
v2: https://lkml.kernel.org/r/a70l4oy8.fsf_-_@x220.int.ebiederm.org
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/usermode_driver.h |  2 +-
 kernel/exit.c                   |  3 ++-
 kernel/usermode_driver.c        | 15 ++++++++++-----
 net/bpfilter/bpfilter_kern.c    | 13 +++++--------
 net/ipv4/bpfilter/sockopt.c     |  3 ++-
 5 files changed, 20 insertions(+), 16 deletions(-)

diff --git a/include/linux/usermode_driver.h b/include/linux/usermode_driver.h
index 97c919b7147c..45adbffb31d9 100644
--- a/include/linux/usermode_driver.h
+++ b/include/linux/usermode_driver.h
@@ -25,7 +25,7 @@ struct umd_info {
 	struct list_head list;
 	void (*cleanup)(struct umd_info *info);
 	struct path wd;
-	pid_t pid;
+	struct pid *tgid;
 };
 int umd_load_blob(struct umd_info *info, const void *data, size_t len);
 int umd_unload_blob(struct umd_info *info);
diff --git a/kernel/exit.c b/kernel/exit.c
index a081deea52ca..d3294b611df1 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -805,7 +805,8 @@ void __noreturn do_exit(long code)
 	exit_task_namespaces(tsk);
 	exit_task_work(tsk);
 	exit_thread(tsk);
-	exit_umh(tsk);
+	if (group_dead)
+		exit_umh(tsk);
 
 	/*
 	 * Flush inherited counters to the parent - before the parent
diff --git a/kernel/usermode_driver.c b/kernel/usermode_driver.c
index a86798759f83..f77f8d7ce9e3 100644
--- a/kernel/usermode_driver.c
+++ b/kernel/usermode_driver.c
@@ -133,7 +133,7 @@ static int umd_setup(struct subprocess_info *info, struct cred *new)
 	set_fs_pwd(current->fs, &umd_info->wd);
 	umd_info->pipe_to_umh = to_umh[1];
 	umd_info->pipe_from_umh = from_umh[0];
-	umd_info->pid = task_pid_nr(current);
+	umd_info->tgid = get_pid(task_tgid(current));
 	current->flags |= PF_UMH;
 	return 0;
 }
@@ -146,6 +146,8 @@ static void umd_cleanup(struct subprocess_info *info)
 	if (info->retval) {
 		fput(umd_info->pipe_to_umh);
 		fput(umd_info->pipe_from_umh);
+		put_pid(umd_info->tgid);
+		umd_info->tgid = NULL;
 	}
 }
 
@@ -155,9 +157,9 @@ static void umd_cleanup(struct subprocess_info *info)
  *
  * Returns either negative error or zero which indicates success in
  * executing a usermode driver. In such case 'struct umd_info *info'
- * is populated with two pipes and a pid of the process. The caller is
+ * is populated with two pipes and a tgid of the process. The caller is
  * responsible for health check of the user process, killing it via
- * pid, and closing the pipes when user process is no longer needed.
+ * tgid, and closing the pipes when user process is no longer needed.
  */
 int fork_usermode_driver(struct umd_info *info)
 {
@@ -165,6 +167,9 @@ int fork_usermode_driver(struct umd_info *info)
 	char **argv = NULL;
 	int err;
 
+	if (WARN_ON_ONCE(info->tgid))
+		return -EBUSY;
+
 	err = -ENOMEM;
 	argv = argv_split(GFP_KERNEL, info->driver_name, NULL);
 	if (!argv)
@@ -192,11 +197,11 @@ EXPORT_SYMBOL_GPL(fork_usermode_driver);
 void __exit_umh(struct task_struct *tsk)
 {
 	struct umd_info *info;
-	pid_t pid = tsk->pid;
+	struct pid *tgid = task_tgid(tsk);
 
 	mutex_lock(&umh_list_lock);
 	list_for_each_entry(info, &umh_list, list) {
-		if (info->pid == pid) {
+		if (info->tgid == tgid) {
 			list_del(&info->list);
 			mutex_unlock(&umh_list_lock);
 			goto out;
diff --git a/net/bpfilter/bpfilter_kern.c b/net/bpfilter/bpfilter_kern.c
index 28883b00609d..08ea77c2b137 100644
--- a/net/bpfilter/bpfilter_kern.c
+++ b/net/bpfilter/bpfilter_kern.c
@@ -15,16 +15,13 @@ extern char bpfilter_umh_end;
 
 static void shutdown_umh(void)
 {
-	struct task_struct *tsk;
+	struct umd_info *info = &bpfilter_ops.info;
+	struct pid *tgid = info->tgid;
 
 	if (bpfilter_ops.stop)
 		return;
 
-	tsk = get_pid_task(find_vpid(bpfilter_ops.info.pid), PIDTYPE_PID);
-	if (tsk) {
-		send_sig(SIGKILL, tsk, 1);
-		put_task_struct(tsk);
-	}
+	kill_pid(tgid, SIGKILL, 1);
 }
 
 static void __stop_umh(void)
@@ -48,7 +45,7 @@ static int __bpfilter_process_sockopt(struct sock *sk, int optname,
 	req.cmd = optname;
 	req.addr = (long __force __user)optval;
 	req.len = optlen;
-	if (!bpfilter_ops.info.pid)
+	if (!bpfilter_ops.info.tgid)
 		goto out;
 	n = __kernel_write(bpfilter_ops.info.pipe_to_umh, &req, sizeof(req),
 			   &pos);
@@ -81,7 +78,7 @@ static int start_umh(void)
 	if (err)
 		return err;
 	bpfilter_ops.stop = false;
-	pr_info("Loaded bpfilter_umh pid %d\n", bpfilter_ops.info.pid);
+	pr_info("Loaded bpfilter_umh pid %d\n", pid_nr(bpfilter_ops.info.tgid));
 
 	/* health check that usermode process started correctly */
 	if (__bpfilter_process_sockopt(NULL, 0, NULL, 0, 0) != 0) {
diff --git a/net/ipv4/bpfilter/sockopt.c b/net/ipv4/bpfilter/sockopt.c
index 5050de28333d..56cbc43145f6 100644
--- a/net/ipv4/bpfilter/sockopt.c
+++ b/net/ipv4/bpfilter/sockopt.c
@@ -18,7 +18,8 @@ static void bpfilter_umh_cleanup(struct umd_info *info)
 	bpfilter_ops.stop = true;
 	fput(info->pipe_to_umh);
 	fput(info->pipe_from_umh);
-	info->pid = 0;
+	put_pid(info->tgid);
+	info->tgid = NULL;
 	mutex_unlock(&bpfilter_ops.lock);
 }
 
-- 
2.25.0


^ permalink raw reply related

* [PATCH v3 15/16] umd: Remove exit_umh
From: Eric W. Biederman @ 2020-07-02 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Miller, Greg Kroah-Hartman, Tetsuo Handa,
	Alexei Starovoitov, Kees Cook, Andrew Morton, Alexei Starovoitov,
	Al Viro, bpf, linux-fsdevel, Daniel Borkmann, Jakub Kicinski,
	Masahiro Yamada, Gary Lin, Bruno Meneguele, LSM List,
	Casey Schaufler, Luis Chamberlain, Linus Torvalds,
	Christian Brauner, Eric W. Biederman, Greg Kroah-Hartman
In-Reply-To: <87y2o1swee.fsf_-_@x220.int.ebiederm.org>

The bpfilter code no longer uses the umd_info.cleanup callback.  This
callback is what exit_umh exists to call.  So remove exit_umh and all
of it's associated booking.

v1: https://lkml.kernel.org/r/87bll6dlte.fsf_-_@x220.int.ebiederm.org
v2: https://lkml.kernel.org/r/87y2o53abg.fsf_-_@x220.int.ebiederm.org
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/sched.h           |  1 -
 include/linux/usermode_driver.h | 16 ----------------
 kernel/exit.c                   |  3 ---
 kernel/usermode_driver.c        | 28 ----------------------------
 4 files changed, 48 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 59d1e92bb88e..edb2020875ad 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1511,7 +1511,6 @@ extern struct pid *cad_pid;
 #define PF_KTHREAD		0x00200000	/* I am a kernel thread */
 #define PF_RANDOMIZE		0x00400000	/* Randomize virtual address space */
 #define PF_SWAPWRITE		0x00800000	/* Allowed to write to swap */
-#define PF_UMH			0x02000000	/* I'm an Usermodehelper process */
 #define PF_NO_SETAFFINITY	0x04000000	/* Userland is not allowed to meddle with cpus_mask */
 #define PF_MCE_EARLY		0x08000000      /* Early kill for mce process policy */
 #define PF_MEMALLOC_NOCMA	0x10000000	/* All allocation request will have _GFP_MOVABLE cleared */
diff --git a/include/linux/usermode_driver.h b/include/linux/usermode_driver.h
index 45adbffb31d9..073a9e0ec07d 100644
--- a/include/linux/usermode_driver.h
+++ b/include/linux/usermode_driver.h
@@ -4,26 +4,10 @@
 #include <linux/umh.h>
 #include <linux/path.h>
 
-#ifdef CONFIG_BPFILTER
-void __exit_umh(struct task_struct *tsk);
-
-static inline void exit_umh(struct task_struct *tsk)
-{
-	if (unlikely(tsk->flags & PF_UMH))
-		__exit_umh(tsk);
-}
-#else
-static inline void exit_umh(struct task_struct *tsk)
-{
-}
-#endif
-
 struct umd_info {
 	const char *driver_name;
 	struct file *pipe_to_umh;
 	struct file *pipe_from_umh;
-	struct list_head list;
-	void (*cleanup)(struct umd_info *info);
 	struct path wd;
 	struct pid *tgid;
 };
diff --git a/kernel/exit.c b/kernel/exit.c
index a7f112feb0f6..4ec82859bfe5 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -63,7 +63,6 @@
 #include <linux/random.h>
 #include <linux/rcuwait.h>
 #include <linux/compat.h>
-#include <linux/usermode_driver.h>
 
 #include <linux/uaccess.h>
 #include <asm/unistd.h>
@@ -805,8 +804,6 @@ void __noreturn do_exit(long code)
 	exit_task_namespaces(tsk);
 	exit_task_work(tsk);
 	exit_thread(tsk);
-	if (group_dead)
-		exit_umh(tsk);
 
 	/*
 	 * Flush inherited counters to the parent - before the parent
diff --git a/kernel/usermode_driver.c b/kernel/usermode_driver.c
index f77f8d7ce9e3..cd136f86f799 100644
--- a/kernel/usermode_driver.c
+++ b/kernel/usermode_driver.c
@@ -9,9 +9,6 @@
 #include <linux/task_work.h>
 #include <linux/usermode_driver.h>
 
-static LIST_HEAD(umh_list);
-static DEFINE_MUTEX(umh_list_lock);
-
 static struct vfsmount *blob_to_mnt(const void *data, size_t len, const char *name)
 {
 	struct file_system_type *type;
@@ -134,7 +131,6 @@ static int umd_setup(struct subprocess_info *info, struct cred *new)
 	umd_info->pipe_to_umh = to_umh[1];
 	umd_info->pipe_from_umh = from_umh[0];
 	umd_info->tgid = get_pid(task_tgid(current));
-	current->flags |= PF_UMH;
 	return 0;
 }
 
@@ -182,11 +178,6 @@ int fork_usermode_driver(struct umd_info *info)
 		goto out;
 
 	err = call_usermodehelper_exec(sub_info, UMH_WAIT_EXEC);
-	if (!err) {
-		mutex_lock(&umh_list_lock);
-		list_add(&info->list, &umh_list);
-		mutex_unlock(&umh_list_lock);
-	}
 out:
 	if (argv)
 		argv_free(argv);
@@ -194,23 +185,4 @@ int fork_usermode_driver(struct umd_info *info)
 }
 EXPORT_SYMBOL_GPL(fork_usermode_driver);
 
-void __exit_umh(struct task_struct *tsk)
-{
-	struct umd_info *info;
-	struct pid *tgid = task_tgid(tsk);
-
-	mutex_lock(&umh_list_lock);
-	list_for_each_entry(info, &umh_list, list) {
-		if (info->tgid == tgid) {
-			list_del(&info->list);
-			mutex_unlock(&umh_list_lock);
-			goto out;
-		}
-	}
-	mutex_unlock(&umh_list_lock);
-	return;
-out:
-	if (info->cleanup)
-		info->cleanup(info);
-}
 
-- 
2.25.0


^ permalink raw reply related

* [PATCH v3 14/16] bpfilter: Take advantage of the facilities of struct pid
From: Eric W. Biederman @ 2020-07-02 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Miller, Greg Kroah-Hartman, Tetsuo Handa,
	Alexei Starovoitov, Kees Cook, Andrew Morton, Alexei Starovoitov,
	Al Viro, bpf, linux-fsdevel, Daniel Borkmann, Jakub Kicinski,
	Masahiro Yamada, Gary Lin, Bruno Meneguele, LSM List,
	Casey Schaufler, Luis Chamberlain, Linus Torvalds,
	Christian Brauner, Eric W. Biederman, Greg Kroah-Hartman
In-Reply-To: <87y2o1swee.fsf_-_@x220.int.ebiederm.org>

Instead of relying on the exit_umh cleanup callback use the fact a
struct pid can be tested to see if a process still exists, and that
struct pid has a wait queue that notifies when the process dies.

v1: https://lkml.kernel.org/r/87h7uydlu9.fsf_-_@x220.int.ebiederm.org
v2: https://lkml.kernel.org/r/874kqt4owu.fsf_-_@x220.int.ebiederm.org
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/bpfilter.h     |  3 ++-
 net/bpfilter/bpfilter_kern.c | 15 +++++----------
 net/ipv4/bpfilter/sockopt.c  | 15 ++++++++-------
 3 files changed, 15 insertions(+), 18 deletions(-)

diff --git a/include/linux/bpfilter.h b/include/linux/bpfilter.h
index ec9972d822e0..9b114c718a76 100644
--- a/include/linux/bpfilter.h
+++ b/include/linux/bpfilter.h
@@ -10,6 +10,8 @@ int bpfilter_ip_set_sockopt(struct sock *sk, int optname, char __user *optval,
 			    unsigned int optlen);
 int bpfilter_ip_get_sockopt(struct sock *sk, int optname, char __user *optval,
 			    int __user *optlen);
+void bpfilter_umh_cleanup(struct umd_info *info);
+
 struct bpfilter_umh_ops {
 	struct umd_info info;
 	/* since ip_getsockopt() can run in parallel, serialize access to umh */
@@ -18,7 +20,6 @@ struct bpfilter_umh_ops {
 		       char __user *optval,
 		       unsigned int optlen, bool is_set);
 	int (*start)(void);
-	bool stop;
 };
 extern struct bpfilter_umh_ops bpfilter_ops;
 #endif
diff --git a/net/bpfilter/bpfilter_kern.c b/net/bpfilter/bpfilter_kern.c
index 08ea77c2b137..9616fb7defeb 100644
--- a/net/bpfilter/bpfilter_kern.c
+++ b/net/bpfilter/bpfilter_kern.c
@@ -18,10 +18,11 @@ static void shutdown_umh(void)
 	struct umd_info *info = &bpfilter_ops.info;
 	struct pid *tgid = info->tgid;
 
-	if (bpfilter_ops.stop)
-		return;
-
-	kill_pid(tgid, SIGKILL, 1);
+	if (tgid) {
+		kill_pid(tgid, SIGKILL, 1);
+		wait_event(tgid->wait_pidfd, thread_group_exited(tgid));
+		bpfilter_umh_cleanup(info);
+	}
 }
 
 static void __stop_umh(void)
@@ -77,7 +78,6 @@ static int start_umh(void)
 	err = fork_usermode_driver(&bpfilter_ops.info);
 	if (err)
 		return err;
-	bpfilter_ops.stop = false;
 	pr_info("Loaded bpfilter_umh pid %d\n", pid_nr(bpfilter_ops.info.tgid));
 
 	/* health check that usermode process started correctly */
@@ -100,16 +100,11 @@ static int __init load_umh(void)
 		return err;
 
 	mutex_lock(&bpfilter_ops.lock);
-	if (!bpfilter_ops.stop) {
-		err = -EFAULT;
-		goto out;
-	}
 	err = start_umh();
 	if (!err && IS_ENABLED(CONFIG_INET)) {
 		bpfilter_ops.sockopt = &__bpfilter_process_sockopt;
 		bpfilter_ops.start = &start_umh;
 	}
-out:
 	mutex_unlock(&bpfilter_ops.lock);
 	if (err)
 		umd_unload_blob(&bpfilter_ops.info);
diff --git a/net/ipv4/bpfilter/sockopt.c b/net/ipv4/bpfilter/sockopt.c
index 56cbc43145f6..9063c6767d34 100644
--- a/net/ipv4/bpfilter/sockopt.c
+++ b/net/ipv4/bpfilter/sockopt.c
@@ -12,16 +12,14 @@
 struct bpfilter_umh_ops bpfilter_ops;
 EXPORT_SYMBOL_GPL(bpfilter_ops);
 
-static void bpfilter_umh_cleanup(struct umd_info *info)
+void bpfilter_umh_cleanup(struct umd_info *info)
 {
-	mutex_lock(&bpfilter_ops.lock);
-	bpfilter_ops.stop = true;
 	fput(info->pipe_to_umh);
 	fput(info->pipe_from_umh);
 	put_pid(info->tgid);
 	info->tgid = NULL;
-	mutex_unlock(&bpfilter_ops.lock);
 }
+EXPORT_SYMBOL_GPL(bpfilter_umh_cleanup);
 
 static int bpfilter_mbox_request(struct sock *sk, int optname,
 				 char __user *optval,
@@ -39,7 +37,11 @@ static int bpfilter_mbox_request(struct sock *sk, int optname,
 			goto out;
 		}
 	}
-	if (bpfilter_ops.stop) {
+	if (bpfilter_ops.info.tgid &&
+	    thread_group_exited(bpfilter_ops.info.tgid))
+		bpfilter_umh_cleanup(&bpfilter_ops.info);
+
+	if (!bpfilter_ops.info.tgid) {
 		err = bpfilter_ops.start();
 		if (err)
 			goto out;
@@ -70,9 +72,8 @@ int bpfilter_ip_get_sockopt(struct sock *sk, int optname, char __user *optval,
 static int __init bpfilter_sockopt_init(void)
 {
 	mutex_init(&bpfilter_ops.lock);
-	bpfilter_ops.stop = true;
+	bpfilter_ops.info.tgid = NULL;
 	bpfilter_ops.info.driver_name = "bpfilter_umh";
-	bpfilter_ops.info.cleanup = &bpfilter_umh_cleanup;
 
 	return 0;
 }
-- 
2.25.0


^ permalink raw reply related

* [PATCH v3 04/16] umh: Remove call_usermodehelper_setup_file.
From: Eric W. Biederman @ 2020-07-02 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Miller, Greg Kroah-Hartman, Tetsuo Handa,
	Alexei Starovoitov, Kees Cook, Andrew Morton, Alexei Starovoitov,
	Al Viro, bpf, linux-fsdevel, Daniel Borkmann, Jakub Kicinski,
	Masahiro Yamada, Gary Lin, Bruno Meneguele, LSM List,
	Casey Schaufler, Luis Chamberlain, Linus Torvalds,
	Christian Brauner, Eric W. Biederman, Greg Kroah-Hartman
In-Reply-To: <87y2o1swee.fsf_-_@x220.int.ebiederm.org>

The only caller of call_usermodehelper_setup_file is fork_usermode_blob.
In fork_usermode_blob replace call_usermodehelper_setup_file with
call_usermodehelper_setup and delete fork_usermodehelper_setup_file.

For this to work the argv_free is moved from umh_clean_and_save_pid
to fork_usermode_blob.

v1: https://lkml.kernel.org/r/87zh8qf0mp.fsf_-_@x220.int.ebiederm.org
v2: https://lkml.kernel.org/r/87o8p163u1.fsf_-_@x220.int.ebiederm.org
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/umh.h |  3 ---
 kernel/umh.c        | 42 +++++++++++-------------------------------
 2 files changed, 11 insertions(+), 34 deletions(-)

diff --git a/include/linux/umh.h b/include/linux/umh.h
index aae16a0ebd0f..de08af00c68a 100644
--- a/include/linux/umh.h
+++ b/include/linux/umh.h
@@ -39,9 +39,6 @@ call_usermodehelper_setup(const char *path, char **argv, char **envp,
 			  int (*init)(struct subprocess_info *info, struct cred *new),
 			  void (*cleanup)(struct subprocess_info *), void *data);
 
-struct subprocess_info *call_usermodehelper_setup_file(struct file *file,
-			  int (*init)(struct subprocess_info *info, struct cred *new),
-			  void (*cleanup)(struct subprocess_info *), void *data);
 struct umh_info {
 	const char *cmdline;
 	struct file *pipe_to_umh;
diff --git a/kernel/umh.c b/kernel/umh.c
index 26c3d493f168..b8fa9b99b366 100644
--- a/kernel/umh.c
+++ b/kernel/umh.c
@@ -402,33 +402,6 @@ struct subprocess_info *call_usermodehelper_setup(const char *path, char **argv,
 }
 EXPORT_SYMBOL(call_usermodehelper_setup);
 
-struct subprocess_info *call_usermodehelper_setup_file(struct file *file,
-		int (*init)(struct subprocess_info *info, struct cred *new),
-		void (*cleanup)(struct subprocess_info *info), void *data)
-{
-	struct subprocess_info *sub_info;
-	struct umh_info *info = data;
-	const char *cmdline = (info->cmdline) ? info->cmdline : "usermodehelper";
-
-	sub_info = kzalloc(sizeof(struct subprocess_info), GFP_KERNEL);
-	if (!sub_info)
-		return NULL;
-
-	sub_info->argv = argv_split(GFP_KERNEL, cmdline, NULL);
-	if (!sub_info->argv) {
-		kfree(sub_info);
-		return NULL;
-	}
-
-	INIT_WORK(&sub_info->work, call_usermodehelper_exec_work);
-	sub_info->path = "none";
-	sub_info->file = file;
-	sub_info->init = init;
-	sub_info->cleanup = cleanup;
-	sub_info->data = data;
-	return sub_info;
-}
-
 static int umd_setup(struct subprocess_info *info, struct cred *new)
 {
 	struct umh_info *umh_info = info->data;
@@ -479,8 +452,6 @@ static void umd_cleanup(struct subprocess_info *info)
 		fput(umh_info->pipe_to_umh);
 		fput(umh_info->pipe_from_umh);
 	}
-
-	argv_free(info->argv);
 }
 
 /**
@@ -501,7 +472,9 @@ static void umd_cleanup(struct subprocess_info *info)
  */
 int fork_usermode_blob(void *data, size_t len, struct umh_info *info)
 {
+	const char *cmdline = (info->cmdline) ? info->cmdline : "usermodehelper";
 	struct subprocess_info *sub_info;
+	char **argv = NULL;
 	struct file *file;
 	ssize_t written;
 	loff_t pos = 0;
@@ -520,11 +493,16 @@ int fork_usermode_blob(void *data, size_t len, struct umh_info *info)
 	}
 
 	err = -ENOMEM;
-	sub_info = call_usermodehelper_setup_file(file, umd_setup, umd_cleanup,
-						  info);
+	argv = argv_split(GFP_KERNEL, cmdline, NULL);
+	if (!argv)
+		goto out;
+
+	sub_info = call_usermodehelper_setup("none", argv, NULL, GFP_KERNEL,
+					     umd_setup, umd_cleanup, info);
 	if (!sub_info)
 		goto out;
 
+	sub_info->file = file;
 	err = call_usermodehelper_exec(sub_info, UMH_WAIT_EXEC);
 	if (!err) {
 		mutex_lock(&umh_list_lock);
@@ -532,6 +510,8 @@ int fork_usermode_blob(void *data, size_t len, struct umh_info *info)
 		mutex_unlock(&umh_list_lock);
 	}
 out:
+	if (argv)
+		argv_free(argv);
 	fput(file);
 	return err;
 }
-- 
2.25.0


^ permalink raw reply related

* [PATCH v3 10/16] exec: Remove do_execve_file
From: Eric W. Biederman @ 2020-07-02 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Miller, Greg Kroah-Hartman, Tetsuo Handa,
	Alexei Starovoitov, Kees Cook, Andrew Morton, Alexei Starovoitov,
	Al Viro, bpf, linux-fsdevel, Daniel Borkmann, Jakub Kicinski,
	Masahiro Yamada, Gary Lin, Bruno Meneguele, LSM List,
	Casey Schaufler, Luis Chamberlain, Linus Torvalds,
	Christian Brauner, Eric W. Biederman, Tetsuo Handa,
	Greg Kroah-Hartman
In-Reply-To: <87y2o1swee.fsf_-_@x220.int.ebiederm.org>

Now that the last callser has been removed remove this code from exec.

For anyone thinking of resurrecing do_execve_file please note that
the code was buggy in several fundamental ways.

- It did not ensure the file it was passed was read-only and that
  deny_write_access had been called on it.  Which subtlely breaks
  invaniants in exec.

- The caller of do_execve_file was expected to hold and put a
  reference to the file, but an extra reference for use by exec was
  not taken so that when exec put it's reference to the file an
  underflow occured on the file reference count.

- The point of the interface was so that a pathname did not need to
  exist.  Which breaks pathname based LSMs.

Tetsuo Handa originally reported these issues[1].  While it was clear
that deny_write_access was missing the fundamental incompatibility
with the passed in O_RDWR filehandle was not immediately recognized.

All of these issues were fixed by modifying the usermode driver code
to have a path, so it did not need this hack.

Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
[1] https://lore.kernel.org/linux-fsdevel/2a8775b4-1dd5-9d5c-aa42-9872445e0942@i-love.sakura.ne.jp/
v1: https://lkml.kernel.org/r/871rm2f0hi.fsf_-_@x220.int.ebiederm.org
v2: https://lkml.kernel.org/r/87lfk54p0m.fsf_-_@x220.int.ebiederm.org
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 fs/exec.c               | 38 +++++++++-----------------------------
 include/linux/binfmts.h |  1 -
 2 files changed, 9 insertions(+), 30 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index e6e8a9a70327..23dfbb820626 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1818,13 +1818,14 @@ static int exec_binprm(struct linux_binprm *bprm)
 /*
  * sys_execve() executes a new program.
  */
-static int __do_execve_file(int fd, struct filename *filename,
-			    struct user_arg_ptr argv,
-			    struct user_arg_ptr envp,
-			    int flags, struct file *file)
+static int do_execveat_common(int fd, struct filename *filename,
+			      struct user_arg_ptr argv,
+			      struct user_arg_ptr envp,
+			      int flags)
 {
 	char *pathbuf = NULL;
 	struct linux_binprm *bprm;
+	struct file *file;
 	struct files_struct *displaced;
 	int retval;
 
@@ -1863,8 +1864,7 @@ static int __do_execve_file(int fd, struct filename *filename,
 	check_unsafe_exec(bprm);
 	current->in_execve = 1;
 
-	if (!file)
-		file = do_open_execat(fd, filename, flags);
+	file = do_open_execat(fd, filename, flags);
 	retval = PTR_ERR(file);
 	if (IS_ERR(file))
 		goto out_unmark;
@@ -1872,9 +1872,7 @@ static int __do_execve_file(int fd, struct filename *filename,
 	sched_exec();
 
 	bprm->file = file;
-	if (!filename) {
-		bprm->filename = "none";
-	} else if (fd == AT_FDCWD || filename->name[0] == '/') {
+	if (fd == AT_FDCWD || filename->name[0] == '/') {
 		bprm->filename = filename->name;
 	} else {
 		if (filename->name[0] == '\0')
@@ -1935,8 +1933,7 @@ static int __do_execve_file(int fd, struct filename *filename,
 	task_numa_free(current, false);
 	free_bprm(bprm);
 	kfree(pathbuf);
-	if (filename)
-		putname(filename);
+	putname(filename);
 	if (displaced)
 		put_files_struct(displaced);
 	return retval;
@@ -1967,27 +1964,10 @@ static int __do_execve_file(int fd, struct filename *filename,
 	if (displaced)
 		reset_files_struct(displaced);
 out_ret:
-	if (filename)
-		putname(filename);
+	putname(filename);
 	return retval;
 }
 
-static int do_execveat_common(int fd, struct filename *filename,
-			      struct user_arg_ptr argv,
-			      struct user_arg_ptr envp,
-			      int flags)
-{
-	return __do_execve_file(fd, filename, argv, envp, flags, NULL);
-}
-
-int do_execve_file(struct file *file, void *__argv, void *__envp)
-{
-	struct user_arg_ptr argv = { .ptr.native = __argv };
-	struct user_arg_ptr envp = { .ptr.native = __envp };
-
-	return __do_execve_file(AT_FDCWD, NULL, argv, envp, 0, file);
-}
-
 int do_execve(struct filename *filename,
 	const char __user *const __user *__argv,
 	const char __user *const __user *__envp)
diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
index 4a20b7517dd0..7c27d7b57871 100644
--- a/include/linux/binfmts.h
+++ b/include/linux/binfmts.h
@@ -141,6 +141,5 @@ extern int do_execveat(int, struct filename *,
 		       const char __user * const __user *,
 		       const char __user * const __user *,
 		       int);
-int do_execve_file(struct file *file, void *__argv, void *__envp);
 
 #endif /* _LINUX_BINFMTS_H */
-- 
2.25.0


^ permalink raw reply related

* [PATCH v3 09/16] umh: Stop calling do_execve_file
From: Eric W. Biederman @ 2020-07-02 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Miller, Greg Kroah-Hartman, Tetsuo Handa,
	Alexei Starovoitov, Kees Cook, Andrew Morton, Alexei Starovoitov,
	Al Viro, bpf, linux-fsdevel, Daniel Borkmann, Jakub Kicinski,
	Masahiro Yamada, Gary Lin, Bruno Meneguele, LSM List,
	Casey Schaufler, Luis Chamberlain, Linus Torvalds,
	Christian Brauner, Eric W. Biederman, Greg Kroah-Hartman
In-Reply-To: <87y2o1swee.fsf_-_@x220.int.ebiederm.org>

With the user mode driver code changed to not set subprocess_info.file
there are no more users of subproces_info.file.  Remove this field
from struct subprocess_info and remove the only user in
call_usermodehelper_exec_async that would call do_execve_file instead
of do_execve if file was set.

v1: https://lkml.kernel.org/r/877dvuf0i7.fsf_-_@x220.int.ebiederm.org
v2: https://lkml.kernel.org/r/87r1tx4p2a.fsf_-_@x220.int.ebiederm.org
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/umh.h |  1 -
 kernel/umh.c        | 10 +++-------
 2 files changed, 3 insertions(+), 8 deletions(-)

diff --git a/include/linux/umh.h b/include/linux/umh.h
index 73173c4a07e5..244aff638220 100644
--- a/include/linux/umh.h
+++ b/include/linux/umh.h
@@ -22,7 +22,6 @@ struct subprocess_info {
 	const char *path;
 	char **argv;
 	char **envp;
-	struct file *file;
 	int wait;
 	int retval;
 	int (*init)(struct subprocess_info *info, struct cred *new);
diff --git a/kernel/umh.c b/kernel/umh.c
index 3e4e453d45c8..6ca2096298b9 100644
--- a/kernel/umh.c
+++ b/kernel/umh.c
@@ -98,13 +98,9 @@ static int call_usermodehelper_exec_async(void *data)
 
 	commit_creds(new);
 
-	if (sub_info->file)
-		retval = do_execve_file(sub_info->file,
-					sub_info->argv, sub_info->envp);
-	else
-		retval = do_execve(getname_kernel(sub_info->path),
-				   (const char __user *const __user *)sub_info->argv,
-				   (const char __user *const __user *)sub_info->envp);
+	retval = do_execve(getname_kernel(sub_info->path),
+			   (const char __user *const __user *)sub_info->argv,
+			   (const char __user *const __user *)sub_info->envp);
 out:
 	sub_info->retval = retval;
 	/*
-- 
2.25.0


^ permalink raw reply related

* [PATCH v3 08/16] umd: Transform fork_usermode_blob into fork_usermode_driver
From: Eric W. Biederman @ 2020-07-02 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Miller, Greg Kroah-Hartman, Tetsuo Handa,
	Alexei Starovoitov, Kees Cook, Andrew Morton, Alexei Starovoitov,
	Al Viro, bpf, linux-fsdevel, Daniel Borkmann, Jakub Kicinski,
	Masahiro Yamada, Gary Lin, Bruno Meneguele, LSM List,
	Casey Schaufler, Luis Chamberlain, Linus Torvalds,
	Christian Brauner, Eric W. Biederman, Greg Kroah-Hartman
In-Reply-To: <87y2o1swee.fsf_-_@x220.int.ebiederm.org>

Instead of loading a binary blob into a temporary file with
shmem_kernel_file_setup load a binary blob into a temporary tmpfs
filesystem.  This means that the blob can be stored in an init section
and discared, and it means the binary blob will have a filename so can
be executed normally.

The only tricky thing about this code is that in the helper function
blob_to_mnt __fput_sync is used.  That is because a file can not be
executed if it is still open for write, and the ordinary delayed close
for kernel threads does not happen soon enough, which causes the
following exec to fail.  The function umd_load_blob is not called with
any locks so this should be safe.

Executing the blob normally winds up correcting several problems with
the user mode driver code discovered by Tetsuo Handa[1].  By passing
an ordinary filename into the exec, it is no longer necessary to
figure out how to turn a O_RDWR file descriptor into a properly
referende counted O_EXEC file descriptor that forbids all writes.  For
path based LSMs there are no new special cases.

[1] https://lore.kernel.org/linux-fsdevel/2a8775b4-1dd5-9d5c-aa42-9872445e0942@i-love.sakura.ne.jp/
v1: https://lkml.kernel.org/r/87d05mf0j9.fsf_-_@x220.int.ebiederm.org
v2: https://lkml.kernel.org/r/87wo3p4p35.fsf_-_@x220.int.ebiederm.org
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/usermode_driver.h |   6 +-
 kernel/usermode_driver.c        | 126 ++++++++++++++++++++++++--------
 net/bpfilter/bpfilter_kern.c    |  14 +++-
 3 files changed, 113 insertions(+), 33 deletions(-)

diff --git a/include/linux/usermode_driver.h b/include/linux/usermode_driver.h
index 48cf25e3145d..97c919b7147c 100644
--- a/include/linux/usermode_driver.h
+++ b/include/linux/usermode_driver.h
@@ -2,6 +2,7 @@
 #define __LINUX_USERMODE_DRIVER_H__
 
 #include <linux/umh.h>
+#include <linux/path.h>
 
 #ifdef CONFIG_BPFILTER
 void __exit_umh(struct task_struct *tsk);
@@ -23,8 +24,11 @@ struct umd_info {
 	struct file *pipe_from_umh;
 	struct list_head list;
 	void (*cleanup)(struct umd_info *info);
+	struct path wd;
 	pid_t pid;
 };
-int fork_usermode_blob(void *data, size_t len, struct umd_info *info);
+int umd_load_blob(struct umd_info *info, const void *data, size_t len);
+int umd_unload_blob(struct umd_info *info);
+int fork_usermode_driver(struct umd_info *info);
 
 #endif /* __LINUX_USERMODE_DRIVER_H__ */
diff --git a/kernel/usermode_driver.c b/kernel/usermode_driver.c
index 46d60d855e93..a86798759f83 100644
--- a/kernel/usermode_driver.c
+++ b/kernel/usermode_driver.c
@@ -4,11 +4,98 @@
  */
 #include <linux/shmem_fs.h>
 #include <linux/pipe_fs_i.h>
+#include <linux/mount.h>
+#include <linux/fs_struct.h>
+#include <linux/task_work.h>
 #include <linux/usermode_driver.h>
 
 static LIST_HEAD(umh_list);
 static DEFINE_MUTEX(umh_list_lock);
 
+static struct vfsmount *blob_to_mnt(const void *data, size_t len, const char *name)
+{
+	struct file_system_type *type;
+	struct vfsmount *mnt;
+	struct file *file;
+	ssize_t written;
+	loff_t pos = 0;
+
+	type = get_fs_type("tmpfs");
+	if (!type)
+		return ERR_PTR(-ENODEV);
+
+	mnt = kern_mount(type);
+	put_filesystem(type);
+	if (IS_ERR(mnt))
+		return mnt;
+
+	file = file_open_root(mnt->mnt_root, mnt, name, O_CREAT | O_WRONLY, 0700);
+	if (IS_ERR(file)) {
+		mntput(mnt);
+		return ERR_CAST(file);
+	}
+
+	written = kernel_write(file, data, len, &pos);
+	if (written != len) {
+		int err = written;
+		if (err >= 0)
+			err = -ENOMEM;
+		filp_close(file, NULL);
+		mntput(mnt);
+		return ERR_PTR(err);
+	}
+
+	fput(file);
+
+	/* Flush delayed fput so exec can open the file read-only */
+	flush_delayed_fput();
+	task_work_run();
+	return mnt;
+}
+
+/**
+ * umd_load_blob - Remember a blob of bytes for fork_usermode_driver
+ * @info: information about usermode driver
+ * @data: a blob of bytes that can be executed as a file
+ * @len:  The lentgh of the blob
+ *
+ */
+int umd_load_blob(struct umd_info *info, const void *data, size_t len)
+{
+	struct vfsmount *mnt;
+
+	if (WARN_ON_ONCE(info->wd.dentry || info->wd.mnt))
+		return -EBUSY;
+
+	mnt = blob_to_mnt(data, len, info->driver_name);
+	if (IS_ERR(mnt))
+		return PTR_ERR(mnt);
+
+	info->wd.mnt = mnt;
+	info->wd.dentry = mnt->mnt_root;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(umd_load_blob);
+
+/**
+ * umd_unload_blob - Disassociate @info from a previously loaded blob
+ * @info: information about usermode driver
+ *
+ */
+int umd_unload_blob(struct umd_info *info)
+{
+	if (WARN_ON_ONCE(!info->wd.mnt ||
+			 !info->wd.dentry ||
+			 info->wd.mnt->mnt_root != info->wd.dentry))
+		return -EINVAL;
+
+	kern_unmount(info->wd.mnt);
+	info->wd.mnt = NULL;
+	info->wd.dentry = NULL;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(umd_unload_blob);
+
 static int umd_setup(struct subprocess_info *info, struct cred *new)
 {
 	struct umd_info *umd_info = info->data;
@@ -43,6 +130,7 @@ static int umd_setup(struct subprocess_info *info, struct cred *new)
 		return err;
 	}
 
+	set_fs_pwd(current->fs, &umd_info->wd);
 	umd_info->pipe_to_umh = to_umh[1];
 	umd_info->pipe_from_umh = from_umh[0];
 	umd_info->pid = task_pid_nr(current);
@@ -62,39 +150,21 @@ static void umd_cleanup(struct subprocess_info *info)
 }
 
 /**
- * fork_usermode_blob - fork a blob of bytes as a usermode process
- * @data: a blob of bytes that can be do_execv-ed as a file
- * @len: length of the blob
- * @info: information about usermode process (shouldn't be NULL)
+ * fork_usermode_driver - fork a usermode driver
+ * @info: information about usermode driver (shouldn't be NULL)
  *
- * Returns either negative error or zero which indicates success
- * in executing a blob of bytes as a usermode process. In such
- * case 'struct umd_info *info' is populated with two pipes
- * and a pid of the process. The caller is responsible for health
- * check of the user process, killing it via pid, and closing the
- * pipes when user process is no longer needed.
+ * Returns either negative error or zero which indicates success in
+ * executing a usermode driver. In such case 'struct umd_info *info'
+ * is populated with two pipes and a pid of the process. The caller is
+ * responsible for health check of the user process, killing it via
+ * pid, and closing the pipes when user process is no longer needed.
  */
-int fork_usermode_blob(void *data, size_t len, struct umd_info *info)
+int fork_usermode_driver(struct umd_info *info)
 {
 	struct subprocess_info *sub_info;
 	char **argv = NULL;
-	struct file *file;
-	ssize_t written;
-	loff_t pos = 0;
 	int err;
 
-	file = shmem_kernel_file_setup(info->driver_name, len, 0);
-	if (IS_ERR(file))
-		return PTR_ERR(file);
-
-	written = kernel_write(file, data, len, &pos);
-	if (written != len) {
-		err = written;
-		if (err >= 0)
-			err = -ENOMEM;
-		goto out;
-	}
-
 	err = -ENOMEM;
 	argv = argv_split(GFP_KERNEL, info->driver_name, NULL);
 	if (!argv)
@@ -106,7 +176,6 @@ int fork_usermode_blob(void *data, size_t len, struct umd_info *info)
 	if (!sub_info)
 		goto out;
 
-	sub_info->file = file;
 	err = call_usermodehelper_exec(sub_info, UMH_WAIT_EXEC);
 	if (!err) {
 		mutex_lock(&umh_list_lock);
@@ -116,10 +185,9 @@ int fork_usermode_blob(void *data, size_t len, struct umd_info *info)
 out:
 	if (argv)
 		argv_free(argv);
-	fput(file);
 	return err;
 }
-EXPORT_SYMBOL_GPL(fork_usermode_blob);
+EXPORT_SYMBOL_GPL(fork_usermode_driver);
 
 void __exit_umh(struct task_struct *tsk)
 {
diff --git a/net/bpfilter/bpfilter_kern.c b/net/bpfilter/bpfilter_kern.c
index c0f0990f30b6..28883b00609d 100644
--- a/net/bpfilter/bpfilter_kern.c
+++ b/net/bpfilter/bpfilter_kern.c
@@ -77,9 +77,7 @@ static int start_umh(void)
 	int err;
 
 	/* fork usermode process */
-	err = fork_usermode_blob(&bpfilter_umh_start,
-				 &bpfilter_umh_end - &bpfilter_umh_start,
-				 &bpfilter_ops.info);
+	err = fork_usermode_driver(&bpfilter_ops.info);
 	if (err)
 		return err;
 	bpfilter_ops.stop = false;
@@ -98,6 +96,12 @@ static int __init load_umh(void)
 {
 	int err;
 
+	err = umd_load_blob(&bpfilter_ops.info,
+			    &bpfilter_umh_start,
+			    &bpfilter_umh_end - &bpfilter_umh_start);
+	if (err)
+		return err;
+
 	mutex_lock(&bpfilter_ops.lock);
 	if (!bpfilter_ops.stop) {
 		err = -EFAULT;
@@ -110,6 +114,8 @@ static int __init load_umh(void)
 	}
 out:
 	mutex_unlock(&bpfilter_ops.lock);
+	if (err)
+		umd_unload_blob(&bpfilter_ops.info);
 	return err;
 }
 
@@ -122,6 +128,8 @@ static void __exit fini_umh(void)
 		bpfilter_ops.sockopt = NULL;
 	}
 	mutex_unlock(&bpfilter_ops.lock);
+
+	umd_unload_blob(&bpfilter_ops.info);
 }
 module_init(load_umh);
 module_exit(fini_umh);
-- 
2.25.0


^ permalink raw reply related

* [PATCH v3 07/16] umd: Rename umd_info.cmdline umd_info.driver_name
From: Eric W. Biederman @ 2020-07-02 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Miller, Greg Kroah-Hartman, Tetsuo Handa,
	Alexei Starovoitov, Kees Cook, Andrew Morton, Alexei Starovoitov,
	Al Viro, bpf, linux-fsdevel, Daniel Borkmann, Jakub Kicinski,
	Masahiro Yamada, Gary Lin, Bruno Meneguele, LSM List,
	Casey Schaufler, Luis Chamberlain, Linus Torvalds,
	Christian Brauner, Eric W. Biederman, Greg Kroah-Hartman
In-Reply-To: <87y2o1swee.fsf_-_@x220.int.ebiederm.org>

The only thing supplied in the cmdline today is the driver name so
rename the field to clarify the code.

As this value is always supplied stop trying to handle the case of
a NULL cmdline.

Additionally since we now have a name we can count on use the
driver_name any place where the code is looking for a name
of the binary.

v1: https://lkml.kernel.org/r/87imfef0k3.fsf_-_@x220.int.ebiederm.org
v2: https://lkml.kernel.org/r/87366d63os.fsf_-_@x220.int.ebiederm.org
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/usermode_driver.h |  2 +-
 kernel/usermode_driver.c        | 11 ++++-------
 net/ipv4/bpfilter/sockopt.c     |  2 +-
 3 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/include/linux/usermode_driver.h b/include/linux/usermode_driver.h
index 7131ea611bab..48cf25e3145d 100644
--- a/include/linux/usermode_driver.h
+++ b/include/linux/usermode_driver.h
@@ -18,7 +18,7 @@ static inline void exit_umh(struct task_struct *tsk)
 #endif
 
 struct umd_info {
-	const char *cmdline;
+	const char *driver_name;
 	struct file *pipe_to_umh;
 	struct file *pipe_from_umh;
 	struct list_head list;
diff --git a/kernel/usermode_driver.c b/kernel/usermode_driver.c
index e73550e946d6..46d60d855e93 100644
--- a/kernel/usermode_driver.c
+++ b/kernel/usermode_driver.c
@@ -67,9 +67,6 @@ static void umd_cleanup(struct subprocess_info *info)
  * @len: length of the blob
  * @info: information about usermode process (shouldn't be NULL)
  *
- * If info->cmdline is set it will be used as command line for the
- * user process, else "usermodehelper" is used.
- *
  * Returns either negative error or zero which indicates success
  * in executing a blob of bytes as a usermode process. In such
  * case 'struct umd_info *info' is populated with two pipes
@@ -79,7 +76,6 @@ static void umd_cleanup(struct subprocess_info *info)
  */
 int fork_usermode_blob(void *data, size_t len, struct umd_info *info)
 {
-	const char *cmdline = (info->cmdline) ? info->cmdline : "usermodehelper";
 	struct subprocess_info *sub_info;
 	char **argv = NULL;
 	struct file *file;
@@ -87,7 +83,7 @@ int fork_usermode_blob(void *data, size_t len, struct umd_info *info)
 	loff_t pos = 0;
 	int err;
 
-	file = shmem_kernel_file_setup("", len, 0);
+	file = shmem_kernel_file_setup(info->driver_name, len, 0);
 	if (IS_ERR(file))
 		return PTR_ERR(file);
 
@@ -100,11 +96,12 @@ int fork_usermode_blob(void *data, size_t len, struct umd_info *info)
 	}
 
 	err = -ENOMEM;
-	argv = argv_split(GFP_KERNEL, cmdline, NULL);
+	argv = argv_split(GFP_KERNEL, info->driver_name, NULL);
 	if (!argv)
 		goto out;
 
-	sub_info = call_usermodehelper_setup("none", argv, NULL, GFP_KERNEL,
+	sub_info = call_usermodehelper_setup(info->driver_name, argv, NULL,
+					     GFP_KERNEL,
 					     umd_setup, umd_cleanup, info);
 	if (!sub_info)
 		goto out;
diff --git a/net/ipv4/bpfilter/sockopt.c b/net/ipv4/bpfilter/sockopt.c
index c0dbcc86fcdb..5050de28333d 100644
--- a/net/ipv4/bpfilter/sockopt.c
+++ b/net/ipv4/bpfilter/sockopt.c
@@ -70,7 +70,7 @@ static int __init bpfilter_sockopt_init(void)
 {
 	mutex_init(&bpfilter_ops.lock);
 	bpfilter_ops.stop = true;
-	bpfilter_ops.info.cmdline = "bpfilter_umh";
+	bpfilter_ops.info.driver_name = "bpfilter_umh";
 	bpfilter_ops.info.cleanup = &bpfilter_umh_cleanup;
 
 	return 0;
-- 
2.25.0


^ permalink raw reply related

* [PATCH v3 06/16] umd: For clarity rename umh_info umd_info
From: Eric W. Biederman @ 2020-07-02 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Miller, Greg Kroah-Hartman, Tetsuo Handa,
	Alexei Starovoitov, Kees Cook, Andrew Morton, Alexei Starovoitov,
	Al Viro, bpf, linux-fsdevel, Daniel Borkmann, Jakub Kicinski,
	Masahiro Yamada, Gary Lin, Bruno Meneguele, LSM List,
	Casey Schaufler, Luis Chamberlain, Linus Torvalds,
	Christian Brauner, Eric W. Biederman, Greg Kroah-Hartman
In-Reply-To: <87y2o1swee.fsf_-_@x220.int.ebiederm.org>

This structure is only used for user mode drivers so change
the prefix from umh to umd to make that clear.

v1: https://lkml.kernel.org/r/87o8p6f0kw.fsf_-_@x220.int.ebiederm.org
v2: https://lkml.kernel.org/r/878sg563po.fsf_-_@x220.int.ebiederm.org
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/bpfilter.h        |  2 +-
 include/linux/usermode_driver.h |  6 +++---
 kernel/usermode_driver.c        | 20 ++++++++++----------
 net/ipv4/bpfilter/sockopt.c     |  2 +-
 4 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/include/linux/bpfilter.h b/include/linux/bpfilter.h
index d6d6206052a6..ec9972d822e0 100644
--- a/include/linux/bpfilter.h
+++ b/include/linux/bpfilter.h
@@ -11,7 +11,7 @@ int bpfilter_ip_set_sockopt(struct sock *sk, int optname, char __user *optval,
 int bpfilter_ip_get_sockopt(struct sock *sk, int optname, char __user *optval,
 			    int __user *optlen);
 struct bpfilter_umh_ops {
-	struct umh_info info;
+	struct umd_info info;
 	/* since ip_getsockopt() can run in parallel, serialize access to umh */
 	struct mutex lock;
 	int (*sockopt)(struct sock *sk, int optname,
diff --git a/include/linux/usermode_driver.h b/include/linux/usermode_driver.h
index c5f6dc950227..7131ea611bab 100644
--- a/include/linux/usermode_driver.h
+++ b/include/linux/usermode_driver.h
@@ -17,14 +17,14 @@ static inline void exit_umh(struct task_struct *tsk)
 }
 #endif
 
-struct umh_info {
+struct umd_info {
 	const char *cmdline;
 	struct file *pipe_to_umh;
 	struct file *pipe_from_umh;
 	struct list_head list;
-	void (*cleanup)(struct umh_info *info);
+	void (*cleanup)(struct umd_info *info);
 	pid_t pid;
 };
-int fork_usermode_blob(void *data, size_t len, struct umh_info *info);
+int fork_usermode_blob(void *data, size_t len, struct umd_info *info);
 
 #endif /* __LINUX_USERMODE_DRIVER_H__ */
diff --git a/kernel/usermode_driver.c b/kernel/usermode_driver.c
index 5b05863af855..e73550e946d6 100644
--- a/kernel/usermode_driver.c
+++ b/kernel/usermode_driver.c
@@ -11,7 +11,7 @@ static DEFINE_MUTEX(umh_list_lock);
 
 static int umd_setup(struct subprocess_info *info, struct cred *new)
 {
-	struct umh_info *umh_info = info->data;
+	struct umd_info *umd_info = info->data;
 	struct file *from_umh[2];
 	struct file *to_umh[2];
 	int err;
@@ -43,21 +43,21 @@ static int umd_setup(struct subprocess_info *info, struct cred *new)
 		return err;
 	}
 
-	umh_info->pipe_to_umh = to_umh[1];
-	umh_info->pipe_from_umh = from_umh[0];
-	umh_info->pid = task_pid_nr(current);
+	umd_info->pipe_to_umh = to_umh[1];
+	umd_info->pipe_from_umh = from_umh[0];
+	umd_info->pid = task_pid_nr(current);
 	current->flags |= PF_UMH;
 	return 0;
 }
 
 static void umd_cleanup(struct subprocess_info *info)
 {
-	struct umh_info *umh_info = info->data;
+	struct umd_info *umd_info = info->data;
 
 	/* cleanup if umh_setup() was successful but exec failed */
 	if (info->retval) {
-		fput(umh_info->pipe_to_umh);
-		fput(umh_info->pipe_from_umh);
+		fput(umd_info->pipe_to_umh);
+		fput(umd_info->pipe_from_umh);
 	}
 }
 
@@ -72,12 +72,12 @@ static void umd_cleanup(struct subprocess_info *info)
  *
  * Returns either negative error or zero which indicates success
  * in executing a blob of bytes as a usermode process. In such
- * case 'struct umh_info *info' is populated with two pipes
+ * case 'struct umd_info *info' is populated with two pipes
  * and a pid of the process. The caller is responsible for health
  * check of the user process, killing it via pid, and closing the
  * pipes when user process is no longer needed.
  */
-int fork_usermode_blob(void *data, size_t len, struct umh_info *info)
+int fork_usermode_blob(void *data, size_t len, struct umd_info *info)
 {
 	const char *cmdline = (info->cmdline) ? info->cmdline : "usermodehelper";
 	struct subprocess_info *sub_info;
@@ -126,7 +126,7 @@ EXPORT_SYMBOL_GPL(fork_usermode_blob);
 
 void __exit_umh(struct task_struct *tsk)
 {
-	struct umh_info *info;
+	struct umd_info *info;
 	pid_t pid = tsk->pid;
 
 	mutex_lock(&umh_list_lock);
diff --git a/net/ipv4/bpfilter/sockopt.c b/net/ipv4/bpfilter/sockopt.c
index 0480918bfc7c..c0dbcc86fcdb 100644
--- a/net/ipv4/bpfilter/sockopt.c
+++ b/net/ipv4/bpfilter/sockopt.c
@@ -12,7 +12,7 @@
 struct bpfilter_umh_ops bpfilter_ops;
 EXPORT_SYMBOL_GPL(bpfilter_ops);
 
-static void bpfilter_umh_cleanup(struct umh_info *info)
+static void bpfilter_umh_cleanup(struct umd_info *info)
 {
 	mutex_lock(&bpfilter_ops.lock);
 	bpfilter_ops.stop = true;
-- 
2.25.0


^ permalink raw reply related

* [PATCH v3 05/16] umh: Separate the user mode driver and the user mode helper support
From: Eric W. Biederman @ 2020-07-02 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Miller, Greg Kroah-Hartman, Tetsuo Handa,
	Alexei Starovoitov, Kees Cook, Andrew Morton, Alexei Starovoitov,
	Al Viro, bpf, linux-fsdevel, Daniel Borkmann, Jakub Kicinski,
	Masahiro Yamada, Gary Lin, Bruno Meneguele, LSM List,
	Casey Schaufler, Luis Chamberlain, Linus Torvalds,
	Christian Brauner, Eric W. Biederman, Greg Kroah-Hartman
In-Reply-To: <87y2o1swee.fsf_-_@x220.int.ebiederm.org>

This makes it clear which code is part of the core user mode
helper support and which code is needed to implement user mode
drivers.

This makes the kernel smaller for everyone who does not use a usermode
driver.

v1: https://lkml.kernel.org/r/87tuyyf0ln.fsf_-_@x220.int.ebiederm.org
v2: https://lkml.kernel.org/r/87imf963s6.fsf_-_@x220.int.ebiederm.org
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 include/linux/bpfilter.h        |   2 +-
 include/linux/sched.h           |   8 --
 include/linux/umh.h             |  10 ---
 include/linux/usermode_driver.h |  30 +++++++
 kernel/Makefile                 |   1 +
 kernel/exit.c                   |   1 +
 kernel/umh.c                    | 139 ------------------------------
 kernel/usermode_driver.c        | 146 ++++++++++++++++++++++++++++++++
 8 files changed, 179 insertions(+), 158 deletions(-)
 create mode 100644 include/linux/usermode_driver.h
 create mode 100644 kernel/usermode_driver.c

diff --git a/include/linux/bpfilter.h b/include/linux/bpfilter.h
index d815622cd31e..d6d6206052a6 100644
--- a/include/linux/bpfilter.h
+++ b/include/linux/bpfilter.h
@@ -3,7 +3,7 @@
 #define _LINUX_BPFILTER_H
 
 #include <uapi/linux/bpfilter.h>
-#include <linux/umh.h>
+#include <linux/usermode_driver.h>
 
 struct sock;
 int bpfilter_ip_set_sockopt(struct sock *sk, int optname, char __user *optval,
diff --git a/include/linux/sched.h b/include/linux/sched.h
index b62e6aaf28f0..59d1e92bb88e 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2020,14 +2020,6 @@ static inline void rseq_execve(struct task_struct *t)
 
 #endif
 
-void __exit_umh(struct task_struct *tsk);
-
-static inline void exit_umh(struct task_struct *tsk)
-{
-	if (unlikely(tsk->flags & PF_UMH))
-		__exit_umh(tsk);
-}
-
 #ifdef CONFIG_DEBUG_RSEQ
 
 void rseq_syscall(struct pt_regs *regs);
diff --git a/include/linux/umh.h b/include/linux/umh.h
index de08af00c68a..73173c4a07e5 100644
--- a/include/linux/umh.h
+++ b/include/linux/umh.h
@@ -39,16 +39,6 @@ call_usermodehelper_setup(const char *path, char **argv, char **envp,
 			  int (*init)(struct subprocess_info *info, struct cred *new),
 			  void (*cleanup)(struct subprocess_info *), void *data);
 
-struct umh_info {
-	const char *cmdline;
-	struct file *pipe_to_umh;
-	struct file *pipe_from_umh;
-	struct list_head list;
-	void (*cleanup)(struct umh_info *info);
-	pid_t pid;
-};
-int fork_usermode_blob(void *data, size_t len, struct umh_info *info);
-
 extern int
 call_usermodehelper_exec(struct subprocess_info *info, int wait);
 
diff --git a/include/linux/usermode_driver.h b/include/linux/usermode_driver.h
new file mode 100644
index 000000000000..c5f6dc950227
--- /dev/null
+++ b/include/linux/usermode_driver.h
@@ -0,0 +1,30 @@
+#ifndef __LINUX_USERMODE_DRIVER_H__
+#define __LINUX_USERMODE_DRIVER_H__
+
+#include <linux/umh.h>
+
+#ifdef CONFIG_BPFILTER
+void __exit_umh(struct task_struct *tsk);
+
+static inline void exit_umh(struct task_struct *tsk)
+{
+	if (unlikely(tsk->flags & PF_UMH))
+		__exit_umh(tsk);
+}
+#else
+static inline void exit_umh(struct task_struct *tsk)
+{
+}
+#endif
+
+struct umh_info {
+	const char *cmdline;
+	struct file *pipe_to_umh;
+	struct file *pipe_from_umh;
+	struct list_head list;
+	void (*cleanup)(struct umh_info *info);
+	pid_t pid;
+};
+int fork_usermode_blob(void *data, size_t len, struct umh_info *info);
+
+#endif /* __LINUX_USERMODE_DRIVER_H__ */
diff --git a/kernel/Makefile b/kernel/Makefile
index f3218bc5ec69..43928759893a 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -12,6 +12,7 @@ obj-y     = fork.o exec_domain.o panic.o \
 	    notifier.o ksysfs.o cred.o reboot.o \
 	    async.o range.o smpboot.o ucount.o
 
+obj-$(CONFIG_BPFILTER) += usermode_driver.o
 obj-$(CONFIG_MODULES) += kmod.o
 obj-$(CONFIG_MULTIUSER) += groups.o
 
diff --git a/kernel/exit.c b/kernel/exit.c
index 727150f28103..a081deea52ca 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -63,6 +63,7 @@
 #include <linux/random.h>
 #include <linux/rcuwait.h>
 #include <linux/compat.h>
+#include <linux/usermode_driver.h>
 
 #include <linux/uaccess.h>
 #include <asm/unistd.h>
diff --git a/kernel/umh.c b/kernel/umh.c
index b8fa9b99b366..3e4e453d45c8 100644
--- a/kernel/umh.c
+++ b/kernel/umh.c
@@ -26,8 +26,6 @@
 #include <linux/ptrace.h>
 #include <linux/async.h>
 #include <linux/uaccess.h>
-#include <linux/shmem_fs.h>
-#include <linux/pipe_fs_i.h>
 
 #include <trace/events/module.h>
 
@@ -38,8 +36,6 @@ static kernel_cap_t usermodehelper_bset = CAP_FULL_SET;
 static kernel_cap_t usermodehelper_inheritable = CAP_FULL_SET;
 static DEFINE_SPINLOCK(umh_sysctl_lock);
 static DECLARE_RWSEM(umhelper_sem);
-static LIST_HEAD(umh_list);
-static DEFINE_MUTEX(umh_list_lock);
 
 static void call_usermodehelper_freeinfo(struct subprocess_info *info)
 {
@@ -402,121 +398,6 @@ struct subprocess_info *call_usermodehelper_setup(const char *path, char **argv,
 }
 EXPORT_SYMBOL(call_usermodehelper_setup);
 
-static int umd_setup(struct subprocess_info *info, struct cred *new)
-{
-	struct umh_info *umh_info = info->data;
-	struct file *from_umh[2];
-	struct file *to_umh[2];
-	int err;
-
-	/* create pipe to send data to umh */
-	err = create_pipe_files(to_umh, 0);
-	if (err)
-		return err;
-	err = replace_fd(0, to_umh[0], 0);
-	fput(to_umh[0]);
-	if (err < 0) {
-		fput(to_umh[1]);
-		return err;
-	}
-
-	/* create pipe to receive data from umh */
-	err = create_pipe_files(from_umh, 0);
-	if (err) {
-		fput(to_umh[1]);
-		replace_fd(0, NULL, 0);
-		return err;
-	}
-	err = replace_fd(1, from_umh[1], 0);
-	fput(from_umh[1]);
-	if (err < 0) {
-		fput(to_umh[1]);
-		replace_fd(0, NULL, 0);
-		fput(from_umh[0]);
-		return err;
-	}
-
-	umh_info->pipe_to_umh = to_umh[1];
-	umh_info->pipe_from_umh = from_umh[0];
-	umh_info->pid = task_pid_nr(current);
-	current->flags |= PF_UMH;
-	return 0;
-}
-
-static void umd_cleanup(struct subprocess_info *info)
-{
-	struct umh_info *umh_info = info->data;
-
-	/* cleanup if umh_setup() was successful but exec failed */
-	if (info->retval) {
-		fput(umh_info->pipe_to_umh);
-		fput(umh_info->pipe_from_umh);
-	}
-}
-
-/**
- * fork_usermode_blob - fork a blob of bytes as a usermode process
- * @data: a blob of bytes that can be do_execv-ed as a file
- * @len: length of the blob
- * @info: information about usermode process (shouldn't be NULL)
- *
- * If info->cmdline is set it will be used as command line for the
- * user process, else "usermodehelper" is used.
- *
- * Returns either negative error or zero which indicates success
- * in executing a blob of bytes as a usermode process. In such
- * case 'struct umh_info *info' is populated with two pipes
- * and a pid of the process. The caller is responsible for health
- * check of the user process, killing it via pid, and closing the
- * pipes when user process is no longer needed.
- */
-int fork_usermode_blob(void *data, size_t len, struct umh_info *info)
-{
-	const char *cmdline = (info->cmdline) ? info->cmdline : "usermodehelper";
-	struct subprocess_info *sub_info;
-	char **argv = NULL;
-	struct file *file;
-	ssize_t written;
-	loff_t pos = 0;
-	int err;
-
-	file = shmem_kernel_file_setup("", len, 0);
-	if (IS_ERR(file))
-		return PTR_ERR(file);
-
-	written = kernel_write(file, data, len, &pos);
-	if (written != len) {
-		err = written;
-		if (err >= 0)
-			err = -ENOMEM;
-		goto out;
-	}
-
-	err = -ENOMEM;
-	argv = argv_split(GFP_KERNEL, cmdline, NULL);
-	if (!argv)
-		goto out;
-
-	sub_info = call_usermodehelper_setup("none", argv, NULL, GFP_KERNEL,
-					     umd_setup, umd_cleanup, info);
-	if (!sub_info)
-		goto out;
-
-	sub_info->file = file;
-	err = call_usermodehelper_exec(sub_info, UMH_WAIT_EXEC);
-	if (!err) {
-		mutex_lock(&umh_list_lock);
-		list_add(&info->list, &umh_list);
-		mutex_unlock(&umh_list_lock);
-	}
-out:
-	if (argv)
-		argv_free(argv);
-	fput(file);
-	return err;
-}
-EXPORT_SYMBOL_GPL(fork_usermode_blob);
-
 /**
  * call_usermodehelper_exec - start a usermode application
  * @sub_info: information about the subprocessa
@@ -678,26 +559,6 @@ static int proc_cap_handler(struct ctl_table *table, int write,
 	return 0;
 }
 
-void __exit_umh(struct task_struct *tsk)
-{
-	struct umh_info *info;
-	pid_t pid = tsk->pid;
-
-	mutex_lock(&umh_list_lock);
-	list_for_each_entry(info, &umh_list, list) {
-		if (info->pid == pid) {
-			list_del(&info->list);
-			mutex_unlock(&umh_list_lock);
-			goto out;
-		}
-	}
-	mutex_unlock(&umh_list_lock);
-	return;
-out:
-	if (info->cleanup)
-		info->cleanup(info);
-}
-
 struct ctl_table usermodehelper_table[] = {
 	{
 		.procname	= "bset",
diff --git a/kernel/usermode_driver.c b/kernel/usermode_driver.c
new file mode 100644
index 000000000000..5b05863af855
--- /dev/null
+++ b/kernel/usermode_driver.c
@@ -0,0 +1,146 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * umd - User mode driver support
+ */
+#include <linux/shmem_fs.h>
+#include <linux/pipe_fs_i.h>
+#include <linux/usermode_driver.h>
+
+static LIST_HEAD(umh_list);
+static DEFINE_MUTEX(umh_list_lock);
+
+static int umd_setup(struct subprocess_info *info, struct cred *new)
+{
+	struct umh_info *umh_info = info->data;
+	struct file *from_umh[2];
+	struct file *to_umh[2];
+	int err;
+
+	/* create pipe to send data to umh */
+	err = create_pipe_files(to_umh, 0);
+	if (err)
+		return err;
+	err = replace_fd(0, to_umh[0], 0);
+	fput(to_umh[0]);
+	if (err < 0) {
+		fput(to_umh[1]);
+		return err;
+	}
+
+	/* create pipe to receive data from umh */
+	err = create_pipe_files(from_umh, 0);
+	if (err) {
+		fput(to_umh[1]);
+		replace_fd(0, NULL, 0);
+		return err;
+	}
+	err = replace_fd(1, from_umh[1], 0);
+	fput(from_umh[1]);
+	if (err < 0) {
+		fput(to_umh[1]);
+		replace_fd(0, NULL, 0);
+		fput(from_umh[0]);
+		return err;
+	}
+
+	umh_info->pipe_to_umh = to_umh[1];
+	umh_info->pipe_from_umh = from_umh[0];
+	umh_info->pid = task_pid_nr(current);
+	current->flags |= PF_UMH;
+	return 0;
+}
+
+static void umd_cleanup(struct subprocess_info *info)
+{
+	struct umh_info *umh_info = info->data;
+
+	/* cleanup if umh_setup() was successful but exec failed */
+	if (info->retval) {
+		fput(umh_info->pipe_to_umh);
+		fput(umh_info->pipe_from_umh);
+	}
+}
+
+/**
+ * fork_usermode_blob - fork a blob of bytes as a usermode process
+ * @data: a blob of bytes that can be do_execv-ed as a file
+ * @len: length of the blob
+ * @info: information about usermode process (shouldn't be NULL)
+ *
+ * If info->cmdline is set it will be used as command line for the
+ * user process, else "usermodehelper" is used.
+ *
+ * Returns either negative error or zero which indicates success
+ * in executing a blob of bytes as a usermode process. In such
+ * case 'struct umh_info *info' is populated with two pipes
+ * and a pid of the process. The caller is responsible for health
+ * check of the user process, killing it via pid, and closing the
+ * pipes when user process is no longer needed.
+ */
+int fork_usermode_blob(void *data, size_t len, struct umh_info *info)
+{
+	const char *cmdline = (info->cmdline) ? info->cmdline : "usermodehelper";
+	struct subprocess_info *sub_info;
+	char **argv = NULL;
+	struct file *file;
+	ssize_t written;
+	loff_t pos = 0;
+	int err;
+
+	file = shmem_kernel_file_setup("", len, 0);
+	if (IS_ERR(file))
+		return PTR_ERR(file);
+
+	written = kernel_write(file, data, len, &pos);
+	if (written != len) {
+		err = written;
+		if (err >= 0)
+			err = -ENOMEM;
+		goto out;
+	}
+
+	err = -ENOMEM;
+	argv = argv_split(GFP_KERNEL, cmdline, NULL);
+	if (!argv)
+		goto out;
+
+	sub_info = call_usermodehelper_setup("none", argv, NULL, GFP_KERNEL,
+					     umd_setup, umd_cleanup, info);
+	if (!sub_info)
+		goto out;
+
+	sub_info->file = file;
+	err = call_usermodehelper_exec(sub_info, UMH_WAIT_EXEC);
+	if (!err) {
+		mutex_lock(&umh_list_lock);
+		list_add(&info->list, &umh_list);
+		mutex_unlock(&umh_list_lock);
+	}
+out:
+	if (argv)
+		argv_free(argv);
+	fput(file);
+	return err;
+}
+EXPORT_SYMBOL_GPL(fork_usermode_blob);
+
+void __exit_umh(struct task_struct *tsk)
+{
+	struct umh_info *info;
+	pid_t pid = tsk->pid;
+
+	mutex_lock(&umh_list_lock);
+	list_for_each_entry(info, &umh_list, list) {
+		if (info->pid == pid) {
+			list_del(&info->list);
+			mutex_unlock(&umh_list_lock);
+			goto out;
+		}
+	}
+	mutex_unlock(&umh_list_lock);
+	return;
+out:
+	if (info->cleanup)
+		info->cleanup(info);
+}
+
-- 
2.25.0


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox