Linux Security Modules development
 help / color / mirror / Atom feed
* Re: [PATCH net-next] cipso: Remove unused inline functions
From: David Miller @ 2020-07-15 14:45 UTC (permalink / raw)
  To: yuehaibing; +Cc: paul, kuba, netdev, linux-security-module, linux-kernel
In-Reply-To: <20200715021846.34096-1-yuehaibing@huawei.com>

From: YueHaibing <yuehaibing@huawei.com>
Date: Wed, 15 Jul 2020 10:18:46 +0800

> They are not used any more since commit b1edeb102397 ("netlabel: Replace
> protocol/NetLabel linking with refrerence counts")
> 
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>

Applied.

^ permalink raw reply

* Re: [PATCH bpf-next v4 1/4] bpf: Generalize bpf_sk_storage
From: KP Singh @ 2020-07-15  8:57 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: bpf, Linux Security Module list
In-Reply-To: <CACYkzJ7GmSPt8WV5nSNB2irH+Dn9QywSabZyLvqEcae2-jTuHQ@mail.gmail.com>

[+lists]

I inadvertently missed them in my previous reply.

On Wed, Jul 15, 2020 at 10:22 AM KP Singh <kpsingh@chromium.org> wrote:
>
> On Wed, Jul 15, 2020 at 8:43 AM Martin KaFai Lau <kafai@fb.com> wrote:
> >
> > On Tue, Jul 14, 2020 at 11:42:56PM +0200, KP Singh wrote:
> > >
> > >
> > > On Fri, Jul 10, 2020 at 8:59 AM Martin KaFai Lau <kafai@fb.com> wrote:
> > > >
> > > > On Thu, Jul 09, 2020 at 12:12:36PM +0200, KP Singh wrote:
> > > > > From: KP Singh <kpsingh@google.com>
> > > > >
> > > > > Refactor the functionality in bpf_sk_storage.c so that concept of
> > > > > storage linked to kernel objects can be extended to other objects like
> > > > > inode, task_struct etc.
> > > > >
> > > > > bpf_sk_storage is updated to be bpf_local_storage with a union that
> > > > > contains a pointer to the owner object.
> > > >
> > > > > The type of the
> > > > > bpf_local_storage can be determined using the newly added
> > > > > bpf_local_storage_type enum.
> > > > This is out dated.
> > > >
> > > > >
> > > > > Each new local storage will still be a separate map and provide its own
> > > > > set of helpers. This allows for future object specific extensions and
> > > > > still share a lot of the underlying implementation.
> > > > Thanks for v4.
> > > >
> > > > I do find it quite hard to follow by directly moving to
> > > > bpf_local_storage.c without doing all the renaming locally
> > > > at bpf_sk_storage.c first.  I will try my best to follow.
> > > >
> > >
> > > Thanks for painfully going through it. Will make it easier next time :)
> > >
> > > > There are some unnecessary name/convention change and function
> > > > folding that do not help on this side either.  Please keep them
> > > > unchanged for now and they can use another patch in the future if needed.
> > > > It will be easier to have a mostly one to one naming change
> > > > and please mention them in the commit message.
> > >
> > > So I am going to split the first change as:
> > >
> > > - A mechcanical change that does the following renames:
> > >
> > > Flags/consts:
> > >
> > >   SK_STORAGE_CREATE_FLAG_MASK -> BPF_LOCAL_STORAGE_CREATE_FLAG_MASK
> > >   BPF_SK_STORAGE_CACHE_SIZE -> BPF_LOCAL_STORAGE_CACHE_SIZE
> > >   MAX_VALUE_SIZE -> BPF_LOCAL_STORAGE_MAX_VALUE_SIZE
> > >
> > > Structs:
> > >
> > >   bucket -> bpf_local_storage_map_bucket
> > >   bpf_sk_storage_map -> bpf_local_storage_map
> > >   bpf_sk_storage_data -> bpf_local_storage_data
> > >   bpf_sk_storage_elem -> bpf_local_storage_elem
> > >   bpf_sk_storage -> bpf_local_storage
> > >   selem_linked_to_sk -> selem_linked_to_storage
> > >   selem_alloc -> bpf_selem_alloc
> > >
> > >   and in bpf_local_storage change the name of the sk -> owner
> > >   (the type change happens in a subsequent patch).
> > >
> > > Functions:
> > >
> > >   __selem_unlink_sk -> bpf_selem_unlink_storage
> > >   __selem_link_sk -> bpf_selem_link_storage
> > >   selem_unlink_sk -> __bpf_selem_unlink_storage
> > >   sk_storage_update -> bpf_local_storage_update
> > >   __sk_storage_lookup -> bpf_local_storage_lookup
> > >   bpf_sk_storage_map_free -> bpf_local_storage_map_free
> > >   bpf_sk_storage_map_alloc -> bpf_local_storage_map_alloc
> > >   bpf_sk_storage_map_alloc_check -> bpf_local_storage_map_alloc_check
> > >   bpf_sk_storage_map_check_btf -> bpf_local_storage_map_check_btf
> > >
> > > - Split the caching generalization into its separate patch.
> > > - Do the rest of the changes within bpf_sk_storage.c without any
> > >   splitting to make the review easier.
> > > - Another mechanical no-change split into
> > >   bpf_local_storage.
> > >
> > > Hope this would make the review easier for you. Let me know if you
> > > have any concerns with the naming / split of patches.
> > That will be much better. Thanks!
> >
> > > > [ ... ]
> > > >
> > > > > +static struct bpf_local_storage_data *
> > > > > +sk_storage_update(void *owner, struct bpf_map *map, void *value, u64 map_flags)
> > > > >  {
> > > > > -     struct bpf_sk_storage_data *old_sdata = NULL;
> > > > > -     struct bpf_sk_storage_elem *selem;
> > > > > -     struct bpf_sk_storage *sk_storage;
> > > > > -     struct bpf_sk_storage_map *smap;
> > > > > +     struct bpf_local_storage_data *old_sdata = NULL;
> > > > > +     struct bpf_local_storage_elem *selem;
> > > > > +     struct bpf_local_storage *local_storage;
> > > > > +     struct bpf_local_storage_map *smap;
> > > > > +     struct sock *sk;
> > > > >       int err;
> > > > >
> > > > > -     /* BPF_EXIST and BPF_NOEXIST cannot be both set */
> > > > > -     if (unlikely((map_flags & ~BPF_F_LOCK) > BPF_EXIST) ||
> > > > > -         /* BPF_F_LOCK can only be used in a value with spin_lock */
> > > > > -         unlikely((map_flags & BPF_F_LOCK) && !map_value_has_spin_lock(map)))
> > > > > -             return ERR_PTR(-EINVAL);
> > > > > +     err = bpf_local_storage_check_update_flags(map, map_flags);
> > > > > +     if (err)
> > > > > +             return ERR_PTR(err);
> > > > >
> > > > > -     smap = (struct bpf_sk_storage_map *)map;
> > > > > -     sk_storage = rcu_dereference(sk->sk_bpf_storage);
> > > > > -     if (!sk_storage || hlist_empty(&sk_storage->list)) {
> > > > > -             /* Very first elem for this sk */
> > > > > -             err = check_flags(NULL, map_flags);
> > > > > -             if (err)
> > > > > -                     return ERR_PTR(err);
> > > > > +     sk = owner;
> > > > > +     local_storage = rcu_dereference(sk->sk_bpf_storage);
> > > > > +     smap = (struct bpf_local_storage_map *)map;
> > > > >
> > > > > -             selem = selem_alloc(smap, sk, value, true);
> > > > > +     if (!local_storage || hlist_empty(&local_storage->list)) {
> > > > > +             /* Very first elem */
> > > > > +             selem = map->ops->map_selem_alloc(smap, owner, value, !old_sdata);
> > >
> > > > hmmm... If this map_selem_alloc is directly called here in sk_storage instead
> > > > of the common local_storage, does it have to be in map_ops?
> > >
> > > map_selem_alloc is also called from bpf_local_storage_update as well.
> > > However, map_local_storage_alloc is only called from here
> > > and we probably don't need that, so I removed it.
> > Ah. right, I meant map_local_storage_alloc.
> > Sorry for the confusion.
> >
> > >
> > > >
> > > > >               if (!selem)
> > > > >                       return ERR_PTR(-ENOMEM);
> > > > >
> > > > > -             err = sk_storage_alloc(sk, smap, selem);
> > > > > +             err = map->ops->map_local_storage_alloc(owner, smap, selem);
> > > > >               if (err) {
> > > > >                       kfree(selem);
> > > > >                       atomic_sub(smap->elem_size, &sk->sk_omem_alloc);
> > > >
> > > > [ ... ]
> > > >
> > > > > -static void bpf_sk_storage_map_free(struct bpf_map *map)
> > > > > +static void *bpf_sk_storage_lookup_elem(struct bpf_map *map, void *key)
> > > > Hmmm... this change here... keep scrolling down and down .... :)
> > > >
> > > > >  {
> > > > > -     struct bpf_sk_storage_elem *selem;
> > > > > -     struct bpf_sk_storage_map *smap;
> > > > > -     struct bucket *b;
> > > > > -     unsigned int i;
> > >
> > > [...]
> > >
> > > > > -
> > > > > -     smap = (struct bpf_sk_storage_map *)map;
> > > > > -}
> > > > > -
> > > > > -static void *bpf_fd_sk_storage_lookup_elem(struct bpf_map *map, void *key)
> > > > .... finally got it :p
> > > >
> > > > > -{
> > > > > -     struct bpf_sk_storage_data *sdata;
> > > > > +     struct bpf_local_storage_data *sdata;
> > > > >       struct socket *sock;
> > > > > -     int fd, err;
> > > > > +     int fd, err = -EINVAL;
> > > > This is a bug fix or to suppress compiler warning?
> > >
> > > I did not see any compiler warning. This came up in an internal
> > > discussion to protect against the (albeit hypothetical) case where the
> > > sockfd_lookup does not set err.
> > I don't see an issue in sockfd_lookup().
> > There are other cases in the kernel depending on sockfd_lookup() to set
> > the err properly.  I don't see it is enough to only workaround it in
> > this lookup function but not everywhere else.
> > If sockfd_lookup() had a bug, fix it there instead of asking
> > everybody to work around it.
>
> I agree. I dropped this change.
>
> >
> > >
> > > >
> > > > >
> > > > >       fd = *(int *)key;
> > > > >       sock = sockfd_lookup(fd, &err);
> > > > > @@ -752,17 +223,18 @@ static void *bpf_fd_sk_storage_lookup_elem(struct bpf_map *map, void *key)
> > > > >       return ERR_PTR(err);
> > > > >  }
> > > > >
> > > >
> > > > [ ... ]
> > > >
> > > > >  static int sk_storage_map_btf_id;
> > > > >  const struct bpf_map_ops sk_storage_map_ops = {
> > > > > -     .map_alloc_check = bpf_sk_storage_map_alloc_check,
> > > > > -     .map_alloc = bpf_sk_storage_map_alloc,
> > > > > -     .map_free = bpf_sk_storage_map_free,
> > > > > +     .map_alloc_check = bpf_local_storage_map_alloc_check,
> > > > > +     .map_alloc = sk_storage_map_alloc,
> > > > > +     .map_free = sk_storage_map_free,
> > > > >       .map_get_next_key = notsupp_get_next_key,
> > > > > -     .map_lookup_elem = bpf_fd_sk_storage_lookup_elem,
> > > > > -     .map_update_elem = bpf_fd_sk_storage_update_elem,
> > > > > -     .map_delete_elem = bpf_fd_sk_storage_delete_elem,
> > > > Why this "_fd_" name change?
> > >
> > > Shouldn't really be needed as a part of this series. So I will drop
> > > it. Do you want the corresponding inode functions to also have fd
> > > in the name?
> > I don't have strong opinion on the name here or in bpf_inode_storage.
>
> Me neither. :)
>
> > I think the idea in this patch is to have consistent naming with
> > bpf_inode_storage.
>
> I think the fd makes sense here (in sk_storage) because the key is an fd,
> it would not make sense for an inode storage though.
>
> >
> > For a short and small patch, I don't mind to squash this naming change
> > into a single patch.  However, this refactoring effort is not a small change.
> >
> > My only point is, if unncessary renaming/function-folding
> > is really desired in bpf_sk_storage,  please do it in a separate patch.
> > Unnecessary changes will make this refactoring effort less clean and harder
> > for the future reviewer to look back what has been done and why.
> >
>
> I agree. I think I had ended up renaming them to understand the code better
> and forgot to revert these changes, as you might have seen tracking / reviewing
> them in a giant patch was hard.
>
> > Thanks,
> > Martin

^ permalink raw reply

* Re: [PATCH 7/7] exec: Implement kernel_execve
From: Christoph Hellwig @ 2020-07-15  6:42 UTC (permalink / raw)
  To: Kees Cook
  Cc: Eric W. Biederman, linux-kernel, Linus Torvalds, Andy Lutomirski,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Al Viro, Luis Chamberlain, linux-fsdevel, Tetsuo Handa,
	linux-security-module, Serge E. Hallyn, James Morris,
	Kentaro Takeda, Casey Schaufler, John Johansen, Christoph Hellwig
In-Reply-To: <202007141446.A72A4437C@keescook>

On Tue, Jul 14, 2020 at 02:49:23PM -0700, Kees Cook wrote:
> On Tue, Jul 14, 2020 at 08:31:40AM -0500, Eric W. Biederman wrote:
> > +static int count_strings_kernel(const char *const *argv)
> > +{
> > +	int i;
> > +
> > +	if (!argv)
> > +		return 0;
> > +
> > +	for (i = 0; argv[i]; ++i) {
> > +		if (i >= MAX_ARG_STRINGS)
> > +			return -E2BIG;
> > +		if (fatal_signal_pending(current))
> > +			return -ERESTARTNOHAND;
> > +		cond_resched();
> > +	}
> > +	return i;
> > +}
> 
> I notice count() is only ever called with MAX_ARG_STRINGS. Perhaps
> refactor that too? (And maybe rename it to count_strings_user()?)

Liks this?

http://git.infradead.org/users/hch/misc.git/commitdiff/35a3129dab5b712b018c30681d15de42d9509731

^ permalink raw reply

* Re: [PATCH 7/7] exec: Implement kernel_execve
From: Christoph Hellwig @ 2020-07-15  6:42 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, Linus Torvalds, Kees Cook, Andy Lutomirski,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Al Viro, Luis Chamberlain, linux-fsdevel, Tetsuo Handa,
	linux-security-module, Serge E. Hallyn, James Morris,
	Kentaro Takeda, Casey Schaufler, John Johansen, Christoph Hellwig
In-Reply-To: <87wo365ikj.fsf@x220.int.ebiederm.org>

> +static int count_strings_kernel(const char *const *argv)
> +{
> +	int i;
> +
> +	if (!argv)
> +		return 0;
> +
> +	for (i = 0; argv[i]; ++i) {
> +		if (i >= MAX_ARG_STRINGS)
> +			return -E2BIG;
> +		if (fatal_signal_pending(current))
> +			return -ERESTARTNOHAND;
> +		cond_resched();

I don't think we need a fatal_signal_pending and cond_resched() is
needed in each step given that we don't actually do anything.

^ permalink raw reply

* Re: [PATCH 6/7] exec: Factor bprm_stack_limits out of prepare_arg_pages
From: Christoph Hellwig @ 2020-07-15  6:38 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, Linus Torvalds, Kees Cook, Andy Lutomirski,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Al Viro, Luis Chamberlain, linux-fsdevel, Tetsuo Handa,
	linux-security-module, Serge E. Hallyn, James Morris,
	Kentaro Takeda, Casey Schaufler, John Johansen, Christoph Hellwig
In-Reply-To: <87365u6x60.fsf@x220.int.ebiederm.org>

On Tue, Jul 14, 2020 at 08:31:03AM -0500, Eric W. Biederman wrote:
> 
> In preparation for implementiong kernel_execve (which will take kernel
> pointers not userspace pointers) factor out bprm_stack_limits out of
> prepare_arg_pages.  This separates the counting which depends upon the
> getting data from userspace from the calculations of the stack limits
> which is usable in kernel_execve.
> 
> The remove prepare_args_pages and compute bprm->argc and bprm->envc
> directly in do_execveat_common, before bprm_stack_limits is called.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---

This looks basically identical to my "exec: split prepare_arg_pages".
I slightly prefer the version I had, but this looks ok as well:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply

* Re: [PATCH 5/7] exec: Factor bprm_execve out of do_execve_common
From: Christoph Hellwig @ 2020-07-15  6:36 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, Linus Torvalds, Kees Cook, Andy Lutomirski,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Al Viro, Luis Chamberlain, linux-fsdevel, Tetsuo Handa,
	linux-security-module, Serge E. Hallyn, James Morris,
	Kentaro Takeda, Casey Schaufler, John Johansen, Christoph Hellwig
In-Reply-To: <878sfm6x6x.fsf@x220.int.ebiederm.org>

On Tue, Jul 14, 2020 at 08:30:30AM -0500, Eric W. Biederman wrote:
> 
> Currently it is necessary for the usermode helper code and the code
> that launches init to use set_fs so that pages coming from the kernel
> look like they are coming from userspace.
> 
> To allow that usage of set_fs to be removed cleanly the argument
> copying from userspace needs to happen earlier.  Factor bprm_execve
> out of do_execve_common to separate out the copying of arguments
> to the newe stack, and the rest of exec.
> 
> In separating bprm_execve from do_execve_common the copying
> of the arguments onto the new stack happens earlier.
> 
> As the copying of the arguments does not depend any security hooks,
> files, the file table, current->in_execve, current->fs->in_exec,
> bprm->unsafe, or creds this is safe.
> 
> Likewise the security hook security_creds_for_exec does not depend upon
> preventing the argument copying from happening.
> 
> In addition to making it possible to implement kernel_execve that
> performs the copying differently, this separation of bprm_execve from
> do_execve_common makes for a nice separation of responsibilities making
> the exec code easier to navigate.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  fs/exec.c | 108 +++++++++++++++++++++++++++++-------------------------
>  1 file changed, 58 insertions(+), 50 deletions(-)
> 
> diff --git a/fs/exec.c b/fs/exec.c
> index afb168bf5e23..50508892fa71 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1856,44 +1856,16 @@ static int exec_binprm(struct linux_binprm *bprm)
>  /*
>   * sys_execve() executes a new program.
>   */
> -static int do_execveat_common(int fd, struct filename *filename,
> -			      struct user_arg_ptr argv,
> -			      struct user_arg_ptr envp,
> -			      int flags)
> +static int bprm_execve(struct linux_binprm *bprm,
> +		       int fd, struct filename *filename, int flags)

int fd easily fits onto the previous line.

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply

* Re: [PATCH 4/7] exec: Move bprm_mm_init into alloc_bprm
From: Christoph Hellwig @ 2020-07-15  6:35 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, Linus Torvalds, Kees Cook, Andy Lutomirski,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Al Viro, Luis Chamberlain, linux-fsdevel, Tetsuo Handa,
	linux-security-module, Serge E. Hallyn, James Morris,
	Kentaro Takeda, Casey Schaufler, John Johansen, Christoph Hellwig
In-Reply-To: <87eepe6x7p.fsf@x220.int.ebiederm.org>

On Tue, Jul 14, 2020 at 08:30:02AM -0500, Eric W. Biederman wrote:
> 
> Currently it is necessary for the usermode helper code and the code that
> launches init to use set_fs so that pages coming from the kernel look like
> they are coming from userspace.
> 
> To allow that usage of set_fs to be removed cleanly the argument copying
> from userspace needs to happen earlier.  Move the allocation and
> initialization of bprm->mm into alloc_bprm so that the bprm->mm is
> available early to store the new user stack into.  This is a prerequisite
> for copying argv and envp into the new user stack early before ther rest of
> exec.
> 
> To keep the things consistent the cleanup of bprm->mm is moved into
> free_bprm.  So that bprm->mm will be cleaned up whenever bprm->mm is
> allocated and free_bprm are called.
> 
> Moving bprm_mm_init earlier is safe as it does not depend on any files,
> current->in_execve, current->fs->in_exec, bprm->unsafe, or the if the file
> table is shared. (AKA bprm_mm_init does not depend on any of the code that
> happens between alloc_bprm and where it was previously called.)
> 
> This moves bprm->mm cleanup after current->fs->in_exec is set to 0.  This
> is safe because current->fs->in_exec is only used to preventy taking an
> additional reference on the fs_struct.
> 
> This moves bprm->mm cleanup after current->in_execve is set to 0.  This is
> safe because current->in_execve is only used by the lsms (apparmor and
> tomoyou) and always for LSM specific functions, never for anything to do
> with the mm.
> 
> This adds bprm->mm cleanup into the successful return path.  This is safe
> because being on the successful return path implies that begin_new_exec
> succeeded and set brpm->mm to NULL.  As bprm->mm is NULL bprm cleanup I am
> moving into free_bprm will do nothing.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply

* Re: [PATCH 3/7] exec: Move initialization of bprm->filename into alloc_bprm
From: Christoph Hellwig @ 2020-07-15  6:34 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, Linus Torvalds, Kees Cook, Andy Lutomirski,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Al Viro, Luis Chamberlain, linux-fsdevel, Tetsuo Handa,
	linux-security-module, Serge E. Hallyn, James Morris,
	Kentaro Takeda, Casey Schaufler, John Johansen, Christoph Hellwig
In-Reply-To: <87k0z66x8f.fsf@x220.int.ebiederm.org>

On Tue, Jul 14, 2020 at 08:29:36AM -0500, Eric W. Biederman wrote:
>  
> -static struct linux_binprm *alloc_bprm(void)
> +static struct linux_binprm *alloc_bprm(int fd, struct filename *filename)
>  {
>  	struct linux_binprm *bprm = kzalloc(sizeof(*bprm), GFP_KERNEL);
> +	int retval = -ENOMEM;
>  	if (!bprm)
> -		return ERR_PTR(-ENOMEM);
> +		goto out;
> +

Ok, so here we add to it.   Two nitpicks:

The abose is missing a blank line after the declarations, which really
makes things more readable.  Also no need for the goto there.

Otherwisel ooks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply

* Re: [PATCH 2/7] exec: Factor out alloc_bprm
From: Christoph Hellwig @ 2020-07-15  6:30 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, Linus Torvalds, Kees Cook, Andy Lutomirski,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Al Viro, Luis Chamberlain, linux-fsdevel, Tetsuo Handa,
	linux-security-module, Serge E. Hallyn, James Morris,
	Kentaro Takeda, Casey Schaufler, John Johansen, Christoph Hellwig
In-Reply-To: <87pn8y6x9a.fsf@x220.int.ebiederm.org>

On Tue, Jul 14, 2020 at 08:29:05AM -0500, Eric W. Biederman wrote:
> 
> Currently it is necessary for the usermode helper code and the code
> that launches init to use set_fs so that pages coming from the kernel
> look like they are coming from userspace.
> 
> To allow that usage of set_fs to be removed cleanly the argument
> copying from userspace needs to happen earlier.  Move the allocation
> of the bprm into it's own function (alloc_bprm) and move the call of
> alloc_bprm before unshare_files so that bprm can ultimately be
> allocated, the arguments can be placed on the new stack, and then the
> bprm can be passed into the core of exec.
> 
> Neither the allocation of struct binprm nor the unsharing depend upon each
> other so swapping the order in which they are called is trivially safe.
> 
> To keep things consistent the order of cleanup at the end of
> do_execve_common swapped to match the order of initialization.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>  fs/exec.c | 29 +++++++++++++++++++----------
>  1 file changed, 19 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/exec.c b/fs/exec.c
> index 23dfbb820626..526156d6461d 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1560,6 +1560,14 @@ static void free_bprm(struct linux_binprm *bprm)
>  	kfree(bprm);
>  }
>  
> +static struct linux_binprm *alloc_bprm(void)
> +{
> +	struct linux_binprm *bprm = kzalloc(sizeof(*bprm), GFP_KERNEL);
> +	if (!bprm)
> +		return ERR_PTR(-ENOMEM);
> +	return bprm;

Unless this helper grows later I really don't see the point of it.
Also a NULL return vs ERR_PTR would simplify this a bit (again unless
this grows more code later with different return codes, but then again
it might make sense to only add the helper once it becomes useful).

The actual allocation order move looks fine, though.

^ permalink raw reply

* Re: [PATCH 1/7] exec: Remove unnecessary spaces from binfmts.h
From: Christoph Hellwig @ 2020-07-15  6:28 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, Linus Torvalds, Kees Cook, Andy Lutomirski,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Al Viro, Luis Chamberlain, linux-fsdevel, Tetsuo Handa,
	linux-security-module, Serge E. Hallyn, James Morris,
	Kentaro Takeda, Casey Schaufler, John Johansen, Christoph Hellwig
In-Reply-To: <87v9iq6x9x.fsf@x220.int.ebiederm.org>

On Tue, Jul 14, 2020 at 08:28:42AM -0500, Eric W. Biederman wrote:
> 
> The general convention in the linux kernel is to define a pointer
> member as "type *name".  The declaration of struct linux_binprm has
> several pointer defined as "type * name".  Update them to the
> form of "type *name" for consistency.
> 
> Suggested-by: Kees Cook <keescook@chromium.org>
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply

* Re: [PATCH net-next] cipso: Remove unused inline functions
From: Paul Moore @ 2020-07-15  2:28 UTC (permalink / raw)
  To: YueHaibing; +Cc: davem, kuba, netdev, linux-security-module, linux-kernel
In-Reply-To: <20200715021846.34096-1-yuehaibing@huawei.com>

On Tue, Jul 14, 2020 at 10:21 PM YueHaibing <yuehaibing@huawei.com> wrote:
>
> They are not used any more since commit b1edeb102397 ("netlabel: Replace
> protocol/NetLabel linking with refrerence counts")
>
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>
> ---
>  include/net/cipso_ipv4.h | 12 ------------
>  1 file changed, 12 deletions(-)

Looks good to me, thanks for the patch.

Acked-by: Paul Moore <paul@paul-moore.com>

-- 
paul moore
www.paul-moore.com

^ permalink raw reply

* [PATCH net-next] cipso: Remove unused inline functions
From: YueHaibing @ 2020-07-15  2:18 UTC (permalink / raw)
  To: paul, davem, kuba, yuehaibing; +Cc: netdev, linux-security-module, linux-kernel

They are not used any more since commit b1edeb102397 ("netlabel: Replace
protocol/NetLabel linking with refrerence counts")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
---
 include/net/cipso_ipv4.h | 12 ------------
 1 file changed, 12 deletions(-)

diff --git a/include/net/cipso_ipv4.h b/include/net/cipso_ipv4.h
index 428b6725b248..53dd7d988a2d 100644
--- a/include/net/cipso_ipv4.h
+++ b/include/net/cipso_ipv4.h
@@ -150,18 +150,6 @@ static inline int cipso_v4_doi_walk(u32 *skip_cnt,
 {
 	return 0;
 }
-
-static inline int cipso_v4_doi_domhsh_add(struct cipso_v4_doi *doi_def,
-					  const char *domain)
-{
-	return -ENOSYS;
-}
-
-static inline int cipso_v4_doi_domhsh_remove(struct cipso_v4_doi *doi_def,
-					     const char *domain)
-{
-	return 0;
-}
 #endif /* CONFIG_NETLABEL */
 
 /*
-- 
2.17.1



^ permalink raw reply related

* Re: [PATCH 7/7] exec: Implement kernel_execve
From: Kees Cook @ 2020-07-14 21:49 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, Linus Torvalds, Andy Lutomirski, H. Peter Anvin,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Al Viro,
	Luis Chamberlain, linux-fsdevel, Tetsuo Handa,
	linux-security-module, Serge E. Hallyn, James Morris,
	Kentaro Takeda, Casey Schaufler, John Johansen, Christoph Hellwig
In-Reply-To: <87wo365ikj.fsf@x220.int.ebiederm.org>

On Tue, Jul 14, 2020 at 08:31:40AM -0500, Eric W. Biederman wrote:
> +static int count_strings_kernel(const char *const *argv)
> +{
> +	int i;
> +
> +	if (!argv)
> +		return 0;
> +
> +	for (i = 0; argv[i]; ++i) {
> +		if (i >= MAX_ARG_STRINGS)
> +			return -E2BIG;
> +		if (fatal_signal_pending(current))
> +			return -ERESTARTNOHAND;
> +		cond_resched();
> +	}
> +	return i;
> +}

I notice count() is only ever called with MAX_ARG_STRINGS. Perhaps
refactor that too? (And maybe rename it to count_strings_user()?)

Otherwise, looks good:

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook

^ permalink raw reply

* Re: [PATCH 6/7] exec: Factor bprm_stack_limits out of prepare_arg_pages
From: Kees Cook @ 2020-07-14 21:41 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, Linus Torvalds, Andy Lutomirski, H. Peter Anvin,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Al Viro,
	Luis Chamberlain, linux-fsdevel, Tetsuo Handa,
	linux-security-module, Serge E. Hallyn, James Morris,
	Kentaro Takeda, Casey Schaufler, John Johansen, Christoph Hellwig
In-Reply-To: <87365u6x60.fsf@x220.int.ebiederm.org>

On Tue, Jul 14, 2020 at 08:31:03AM -0500, Eric W. Biederman wrote:
> 
> In preparation for implementiong kernel_execve (which will take kernel
> pointers not userspace pointers) factor out bprm_stack_limits out of
> prepare_arg_pages.  This separates the counting which depends upon the
> getting data from userspace from the calculations of the stack limits
> which is usable in kernel_execve.
> 
> The remove prepare_args_pages and compute bprm->argc and bprm->envc
> directly in do_execveat_common, before bprm_stack_limits is called.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook

^ permalink raw reply

* Re: [PATCH 3/7] exec: Move initialization of bprm->filename into alloc_bprm
From: Kees Cook @ 2020-07-14 21:38 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, Linus Torvalds, Andy Lutomirski, H. Peter Anvin,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Al Viro,
	Luis Chamberlain, linux-fsdevel, Tetsuo Handa,
	linux-security-module, Serge E. Hallyn, James Morris,
	Kentaro Takeda, Casey Schaufler, John Johansen, Christoph Hellwig
In-Reply-To: <87k0z66x8f.fsf@x220.int.ebiederm.org>

On Tue, Jul 14, 2020 at 08:29:36AM -0500, Eric W. Biederman wrote:
> 
> Currently it is necessary for the usermode helper code and the code
> that launches init to use set_fs so that pages coming from the kernel
> look like they are coming from userspace.
> 
> To allow that usage of set_fs to be removed cleanly the argument
> copying from userspace needs to happen earlier.  Move the computation
> of bprm->filename and possible allocation of a name in the case
> of execveat into alloc_bprm to make that possible.
> 
> The exectuable name, the arguments, and the environment are
> copied into the new usermode stack which is stored in bprm
> until exec passes the point of no return.
> 
> As the executable name is copied first onto the usermode stack
> it needs to be known.  As there are no dependencies to computing
> the executable name, compute it early in alloc_bprm.
> 
> As an implementation detail if the filename needs to be generated
> because it embeds a file descriptor store that filename in a new field
> bprm->fdpath, and free it in free_bprm.  Previously this was done in
> an independent variable pathbuf.  I have renamed pathbuf fdpath
> because fdpath is more suggestive of what kind of path is in the
> variable.  I moved fdpath into struct linux_binprm because it is
> tightly tied to the other variables in struct linux_binprm, and as
> such is needed to allow the call alloc_binprm to move.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook

^ permalink raw reply

* Re: [PATCH 2/7] exec: Factor out alloc_bprm
From: Kees Cook @ 2020-07-14 21:38 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, Linus Torvalds, Andy Lutomirski, H. Peter Anvin,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Al Viro,
	Luis Chamberlain, linux-fsdevel, Tetsuo Handa,
	linux-security-module, Serge E. Hallyn, James Morris,
	Kentaro Takeda, Casey Schaufler, John Johansen, Christoph Hellwig
In-Reply-To: <87pn8y6x9a.fsf@x220.int.ebiederm.org>

On Tue, Jul 14, 2020 at 08:29:05AM -0500, Eric W. Biederman wrote:
> 
> Currently it is necessary for the usermode helper code and the code
> that launches init to use set_fs so that pages coming from the kernel
> look like they are coming from userspace.
> 
> To allow that usage of set_fs to be removed cleanly the argument
> copying from userspace needs to happen earlier.  Move the allocation
> of the bprm into it's own function (alloc_bprm) and move the call of
> alloc_bprm before unshare_files so that bprm can ultimately be
> allocated, the arguments can be placed on the new stack, and then the
> bprm can be passed into the core of exec.
> 
> Neither the allocation of struct binprm nor the unsharing depend upon each
> other so swapping the order in which they are called is trivially safe.
> 
> To keep things consistent the order of cleanup at the end of
> do_execve_common swapped to match the order of initialization.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook

^ permalink raw reply

* Re: [PATCH 1/7] exec: Remove unnecessary spaces from binfmts.h
From: Kees Cook @ 2020-07-14 21:38 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, Linus Torvalds, Andy Lutomirski, H. Peter Anvin,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Al Viro,
	Luis Chamberlain, linux-fsdevel, Tetsuo Handa,
	linux-security-module, Serge E. Hallyn, James Morris,
	Kentaro Takeda, Casey Schaufler, John Johansen, Christoph Hellwig
In-Reply-To: <87v9iq6x9x.fsf@x220.int.ebiederm.org>

On Tue, Jul 14, 2020 at 08:28:42AM -0500, Eric W. Biederman wrote:
> 
> The general convention in the linux kernel is to define a pointer
> member as "type *name".  The declaration of struct linux_binprm has
> several pointer defined as "type * name".  Update them to the
> form of "type *name" for consistency.
> 
> Suggested-by: Kees Cook <keescook@chromium.org>
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook

^ permalink raw reply

* Re: [PATCH 5/7] exec: Factor bprm_execve out of do_execve_common
From: Kees Cook @ 2020-07-14 21:38 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, Linus Torvalds, Andy Lutomirski, H. Peter Anvin,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Al Viro,
	Luis Chamberlain, linux-fsdevel, Tetsuo Handa,
	linux-security-module, Serge E. Hallyn, James Morris,
	Kentaro Takeda, Casey Schaufler, John Johansen, Christoph Hellwig
In-Reply-To: <878sfm6x6x.fsf@x220.int.ebiederm.org>

On Tue, Jul 14, 2020 at 08:30:30AM -0500, Eric W. Biederman wrote:
> 
> Currently it is necessary for the usermode helper code and the code
> that launches init to use set_fs so that pages coming from the kernel
> look like they are coming from userspace.
> 
> To allow that usage of set_fs to be removed cleanly the argument
> copying from userspace needs to happen earlier.  Factor bprm_execve
> out of do_execve_common to separate out the copying of arguments
> to the newe stack, and the rest of exec.
> 
> In separating bprm_execve from do_execve_common the copying
> of the arguments onto the new stack happens earlier.
> 
> As the copying of the arguments does not depend any security hooks,
> files, the file table, current->in_execve, current->fs->in_exec,
> bprm->unsafe, or creds this is safe.
> 
> Likewise the security hook security_creds_for_exec does not depend upon
> preventing the argument copying from happening.
> 
> In addition to making it possible to implement kernel_execve that
> performs the copying differently, this separation of bprm_execve from
> do_execve_common makes for a nice separation of responsibilities making
> the exec code easier to navigate.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook

^ permalink raw reply

* Re: [PATCH 4/7] exec: Move bprm_mm_init into alloc_bprm
From: Kees Cook @ 2020-07-14 21:37 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, Linus Torvalds, Andy Lutomirski, H. Peter Anvin,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Al Viro,
	Luis Chamberlain, linux-fsdevel, Tetsuo Handa,
	linux-security-module, Serge E. Hallyn, James Morris,
	Kentaro Takeda, Casey Schaufler, John Johansen, Christoph Hellwig
In-Reply-To: <87eepe6x7p.fsf@x220.int.ebiederm.org>

On Tue, Jul 14, 2020 at 08:30:02AM -0500, Eric W. Biederman wrote:
> 
> Currently it is necessary for the usermode helper code and the code that
> launches init to use set_fs so that pages coming from the kernel look like
> they are coming from userspace.
> 
> To allow that usage of set_fs to be removed cleanly the argument copying
> from userspace needs to happen earlier.  Move the allocation and
> initialization of bprm->mm into alloc_bprm so that the bprm->mm is
> available early to store the new user stack into.  This is a prerequisite
> for copying argv and envp into the new user stack early before ther rest of
> exec.
> 
> To keep the things consistent the cleanup of bprm->mm is moved into
> free_bprm.  So that bprm->mm will be cleaned up whenever bprm->mm is
> allocated and free_bprm are called.
> 
> Moving bprm_mm_init earlier is safe as it does not depend on any files,
> current->in_execve, current->fs->in_exec, bprm->unsafe, or the if the file
> table is shared. (AKA bprm_mm_init does not depend on any of the code that
> happens between alloc_bprm and where it was previously called.)
> 
> This moves bprm->mm cleanup after current->fs->in_exec is set to 0.  This
> is safe because current->fs->in_exec is only used to preventy taking an
> additional reference on the fs_struct.
> 
> This moves bprm->mm cleanup after current->in_execve is set to 0.  This is
> safe because current->in_execve is only used by the lsms (apparmor and
> tomoyou) and always for LSM specific functions, never for anything to do
> with the mm.
> 
> This adds bprm->mm cleanup into the successful return path.  This is safe
> because being on the successful return path implies that begin_new_exec
> succeeded and set brpm->mm to NULL.  As bprm->mm is NULL bprm cleanup I am
> moving into free_bprm will do nothing.
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Reviewed-by: Kees Cook <keescook@chromium.org>

This looks correct, and is required before moving the arg pages stuff,
so good. :)

-- 
Kees Cook

^ permalink raw reply

* Re: [PATCH ghak84 v4] audit: purge audit_log_string from the intra-kernel audit API
From: Richard Guy Briggs @ 2020-07-14 21:00 UTC (permalink / raw)
  To: Paul Moore
  Cc: john.johansen, Linux-Audit Mailing List, LKML,
	Linux Security Module list, Eric Paris
In-Reply-To: <CAHC9VhQusQsdQc7EfdjdH5mp6qqqYVPHnG9nNhUhf3DS_cdWwA@mail.gmail.com>

On 2020-07-14 16:29, Paul Moore wrote:
> On Tue, Jul 14, 2020 at 1:44 PM Richard Guy Briggs <rgb@redhat.com> wrote:
> > On 2020-07-14 12:21, Paul Moore wrote:
> > > On Mon, Jul 13, 2020 at 3:52 PM Richard Guy Briggs <rgb@redhat.com> wrote:
> > > >
> > > > audit_log_string() was inteded to be an internal audit function and
> > > > since there are only two internal uses, remove them.  Purge all external
> > > > uses of it by restructuring code to use an existing audit_log_format()
> > > > or using audit_log_format().
> > > >
> > > > Please see the upstream issue
> > > > https://github.com/linux-audit/audit-kernel/issues/84
> > > >
> > > > Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
> > > > ---
> > > > Passes audit-testsuite.
> > > >
> > > > Changelog:
> > > > v4
> > > > - use double quotes in all replaced audit_log_string() calls
> > > >
> > > > v3
> > > > - fix two warning: non-void function does not return a value in all control paths
> > > >         Reported-by: kernel test robot <lkp@intel.com>
> > > >
> > > > v2
> > > > - restructure to piggyback on existing audit_log_format() calls, checking quoting needs for each.
> > > >
> > > > v1 Vlad Dronov
> > > > - https://github.com/nefigtut/audit-kernel/commit/dbbcba46335a002f44b05874153a85b9cc18aebf
> > > >
> > > >  include/linux/audit.h     |  5 -----
> > > >  kernel/audit.c            |  4 ++--
> > > >  security/apparmor/audit.c | 10 ++++------
> > > >  security/apparmor/file.c  | 25 +++++++------------------
> > > >  security/apparmor/ipc.c   | 46 +++++++++++++++++++++++-----------------------
> > > >  security/apparmor/net.c   | 14 ++++++++------
> > > >  security/lsm_audit.c      |  4 ++--
> > > >  7 files changed, 46 insertions(+), 62 deletions(-)
> > >
> > > Thanks for restoring the quotes, just one question below ...
> > >
> > > > diff --git a/security/apparmor/ipc.c b/security/apparmor/ipc.c
> > > > index 4ecedffbdd33..fe36d112aad9 100644
> > > > --- a/security/apparmor/ipc.c
> > > > +++ b/security/apparmor/ipc.c
> > > > @@ -20,25 +20,23 @@
> > > >
> > > >  /**
> > > >   * audit_ptrace_mask - convert mask to permission string
> > > > - * @buffer: buffer to write string to (NOT NULL)
> > > >   * @mask: permission mask to convert
> > > > + *
> > > > + * Returns: pointer to static string
> > > >   */
> > > > -static void audit_ptrace_mask(struct audit_buffer *ab, u32 mask)
> > > > +static const char *audit_ptrace_mask(u32 mask)
> > > >  {
> > > >         switch (mask) {
> > > >         case MAY_READ:
> > > > -               audit_log_string(ab, "read");
> > > > -               break;
> > > > +               return "read";
> > > >         case MAY_WRITE:
> > > > -               audit_log_string(ab, "trace");
> > > > -               break;
> > > > +               return "trace";
> > > >         case AA_MAY_BE_READ:
> > > > -               audit_log_string(ab, "readby");
> > > > -               break;
> > > > +               return "readby";
> > > >         case AA_MAY_BE_TRACED:
> > > > -               audit_log_string(ab, "tracedby");
> > > > -               break;
> > > > +               return "tracedby";
> > > >         }
> > > > +       return "";
> > >
> > > Are we okay with this returning an empty string ("") in this case?
> > > Should it be a question mark ("?")?
> > >
> > > My guess is that userspace parsing should be okay since it still has
> > > quotes, I'm just not sure if we wanted to use a question mark as we do
> > > in other cases where the field value is empty/unknown.
> >
> > Previously, it would have been an empty value, not even double quotes.
> > "?" might be an improvement.
> 
> Did you want to fix that now in this patch, or leave it to later?  As
> I said above, I'm not too bothered by it with the quotes so either way
> is fine by me.

I'd defer to Steve, otherwise I'd say leave it, since there wasn't
anything there before and this makes that more evident.

> John, I'm assuming you are okay with this patch?
> 
> -- 
> paul moore

- RGB

--
Richard Guy Briggs <rgb@redhat.com>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635


^ permalink raw reply

* Re: [PATCH ghak84 v4] audit: purge audit_log_string from the intra-kernel audit API
From: Paul Moore @ 2020-07-14 20:29 UTC (permalink / raw)
  To: Richard Guy Briggs, john.johansen
  Cc: Linux-Audit Mailing List, LKML, Linux Security Module list,
	Eric Paris
In-Reply-To: <20200714174353.ds7lj3iisy67t2zu@madcap2.tricolour.ca>

On Tue, Jul 14, 2020 at 1:44 PM Richard Guy Briggs <rgb@redhat.com> wrote:
> On 2020-07-14 12:21, Paul Moore wrote:
> > On Mon, Jul 13, 2020 at 3:52 PM Richard Guy Briggs <rgb@redhat.com> wrote:
> > >
> > > audit_log_string() was inteded to be an internal audit function and
> > > since there are only two internal uses, remove them.  Purge all external
> > > uses of it by restructuring code to use an existing audit_log_format()
> > > or using audit_log_format().
> > >
> > > Please see the upstream issue
> > > https://github.com/linux-audit/audit-kernel/issues/84
> > >
> > > Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
> > > ---
> > > Passes audit-testsuite.
> > >
> > > Changelog:
> > > v4
> > > - use double quotes in all replaced audit_log_string() calls
> > >
> > > v3
> > > - fix two warning: non-void function does not return a value in all control paths
> > >         Reported-by: kernel test robot <lkp@intel.com>
> > >
> > > v2
> > > - restructure to piggyback on existing audit_log_format() calls, checking quoting needs for each.
> > >
> > > v1 Vlad Dronov
> > > - https://github.com/nefigtut/audit-kernel/commit/dbbcba46335a002f44b05874153a85b9cc18aebf
> > >
> > >  include/linux/audit.h     |  5 -----
> > >  kernel/audit.c            |  4 ++--
> > >  security/apparmor/audit.c | 10 ++++------
> > >  security/apparmor/file.c  | 25 +++++++------------------
> > >  security/apparmor/ipc.c   | 46 +++++++++++++++++++++++-----------------------
> > >  security/apparmor/net.c   | 14 ++++++++------
> > >  security/lsm_audit.c      |  4 ++--
> > >  7 files changed, 46 insertions(+), 62 deletions(-)
> >
> > Thanks for restoring the quotes, just one question below ...
> >
> > > diff --git a/security/apparmor/ipc.c b/security/apparmor/ipc.c
> > > index 4ecedffbdd33..fe36d112aad9 100644
> > > --- a/security/apparmor/ipc.c
> > > +++ b/security/apparmor/ipc.c
> > > @@ -20,25 +20,23 @@
> > >
> > >  /**
> > >   * audit_ptrace_mask - convert mask to permission string
> > > - * @buffer: buffer to write string to (NOT NULL)
> > >   * @mask: permission mask to convert
> > > + *
> > > + * Returns: pointer to static string
> > >   */
> > > -static void audit_ptrace_mask(struct audit_buffer *ab, u32 mask)
> > > +static const char *audit_ptrace_mask(u32 mask)
> > >  {
> > >         switch (mask) {
> > >         case MAY_READ:
> > > -               audit_log_string(ab, "read");
> > > -               break;
> > > +               return "read";
> > >         case MAY_WRITE:
> > > -               audit_log_string(ab, "trace");
> > > -               break;
> > > +               return "trace";
> > >         case AA_MAY_BE_READ:
> > > -               audit_log_string(ab, "readby");
> > > -               break;
> > > +               return "readby";
> > >         case AA_MAY_BE_TRACED:
> > > -               audit_log_string(ab, "tracedby");
> > > -               break;
> > > +               return "tracedby";
> > >         }
> > > +       return "";
> >
> > Are we okay with this returning an empty string ("") in this case?
> > Should it be a question mark ("?")?
> >
> > My guess is that userspace parsing should be okay since it still has
> > quotes, I'm just not sure if we wanted to use a question mark as we do
> > in other cases where the field value is empty/unknown.
>
> Previously, it would have been an empty value, not even double quotes.
> "?" might be an improvement.

Did you want to fix that now in this patch, or leave it to later?  As
I said above, I'm not too bothered by it with the quotes so either way
is fine by me.

John, I'm assuming you are okay with this patch?

-- 
paul moore
www.paul-moore.com

^ permalink raw reply

* Re: [merged][PATCH v3 00/16] Make the user mode driver code a better citizen
From: Alexei Starovoitov @ 2020-07-14 19:42 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel, David Miller, Greg Kroah-Hartman, Tetsuo Handa,
	Kees Cook, Andrew Morton, Alexei Starovoitov, Al Viro, bpf,
	linux-fsdevel, Daniel Borkmann, Jakub Kicinski, Masahiro Yamada,
	Gary Lin, Bruno Meneguele, LSM List, Casey Schaufler,
	Luis Chamberlain, Linus Torvalds, Christian Brauner
In-Reply-To: <87r1tke44q.fsf_-_@x220.int.ebiederm.org>

On Thu, Jul 09, 2020 at 05:05:09PM -0500, Eric W. Biederman wrote:
> 
> I have merged all of this into my exec-next tree.
> 
> The code is also available on the frozen branch:
> 
>    git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git usermode-driver-cleanup
> 
> Declaring this set of changes done now, allows the work that depends
> upon this change to proceed.

Now I've pulled it into bpf-next as well.
In the mean time there were changes to kernel_write that broke bpfilter.ko
I fixed it up as well.
Thanks.

^ permalink raw reply

* Re: [PATCH v6 5/7] fs,doc: Enable to enforce noexec mounts or file exec through O_MAYEXEC
From: Randy Dunlap @ 2020-07-14 18:40 UTC (permalink / raw)
  To: Mickaël Salaün, linux-kernel
  Cc: Aleksa Sarai, Alexei Starovoitov, Al Viro, Andrew Morton,
	Andy Lutomirski, Christian Brauner, Christian Heimes,
	Daniel Borkmann, Deven Bowers, Dmitry Vyukov, Eric Biggers,
	Eric Chiang, Florian Weimer, James Morris, Jan Kara, Jann Horn,
	Jonathan Corbet, Kees Cook, Lakshmi Ramasubramanian,
	Matthew Garrett, Matthew Wilcox, Michael Kerrisk,
	Mickaël Salaün, Mimi Zohar, Philippe Trébuchet,
	Scott Shell, Sean Christopherson, Shuah Khan, Steve Dower,
	Steve Grubb, Tetsuo Handa, Thibaut Sautereau, Vincent Strubel,
	kernel-hardening, linux-api, linux-integrity,
	linux-security-module, linux-fsdevel
In-Reply-To: <20200714181638.45751-6-mic@digikod.net>

Hi,

On 7/14/20 11:16 AM, Mickaël Salaün wrote:

> ---
>  Documentation/admin-guide/sysctl/fs.rst | 45 +++++++++++++++++++++++++
>  fs/namei.c                              | 29 +++++++++++++---
>  include/linux/fs.h                      |  1 +
>  kernel/sysctl.c                         | 12 +++++--
>  4 files changed, 80 insertions(+), 7 deletions(-)
> 
> diff --git a/Documentation/admin-guide/sysctl/fs.rst b/Documentation/admin-guide/sysctl/fs.rst
> index 2a45119e3331..02ec384b8bbf 100644
> --- a/Documentation/admin-guide/sysctl/fs.rst
> +++ b/Documentation/admin-guide/sysctl/fs.rst

Reviewed-by: Randy Dunlap <rdunlap@infradead.org>

with one tiny nit:

> @@ -165,6 +166,50 @@ system needs to prune the inode list instead of allocating
> +The ability to restrict code execution must be thought as a system-wide policy,
> +which first starts by restricting mount points with the ``noexec`` option.
> +This option is also automatically applied to special filesystems such as /proc
> +.  This prevents files on such mount points to be directly executed by the

Can you move that period from the beginning of the line to the end of the
previous line?

> +kernel or mapped as executable memory (e.g. libraries).  With script
> +interpreters using the ``O_MAYEXEC`` flag, the executable permission can then
> +be checked before reading commands from files. This makes it possible to
> +enforce the ``noexec`` at the interpreter level, and thus propagates this
> +security policy to scripts.  To be fully effective, these interpreters also
> +need to handle the other ways to execute code: command line parameters (e.g.,
> +option ``-e`` for Perl), module loading (e.g., option ``-m`` for Python),
> +stdin, file sourcing, environment variables, configuration files, etc.
> +According to the threat model, it may be acceptable to allow some script
> +interpreters (e.g. Bash) to interpret commands from stdin, may it be a TTY or a
> +pipe, because it may not be enough to (directly) perform syscalls.

thanks.
-- 
~Randy


^ permalink raw reply

* [PATCH v6 4/7] fs: Introduce O_MAYEXEC flag for openat2(2)
From: Mickaël Salaün @ 2020-07-14 18:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mickaël Salaün, Aleksa Sarai, Alexei Starovoitov,
	Al Viro, Andrew Morton, Andy Lutomirski, Christian Brauner,
	Christian Heimes, Daniel Borkmann, Deven Bowers, Dmitry Vyukov,
	Eric Biggers, Eric Chiang, Florian Weimer, James Morris, Jan Kara,
	Jann Horn, Jonathan Corbet, Kees Cook, Lakshmi Ramasubramanian,
	Matthew Garrett, Matthew Wilcox, Michael Kerrisk,
	Mickaël Salaün, Mimi Zohar, Philippe Trébuchet,
	Scott Shell, Sean Christopherson, Shuah Khan, Steve Dower,
	Steve Grubb, Tetsuo Handa, Thibaut Sautereau, Vincent Strubel,
	kernel-hardening, linux-api, linux-integrity,
	linux-security-module, linux-fsdevel
In-Reply-To: <20200714181638.45751-1-mic@digikod.net>

When the O_MAYEXEC flag is passed, openat2(2) may be subject to
additional restrictions depending on a security policy managed by the
kernel through a sysctl or implemented by an LSM thanks to the
inode_permission hook.  This new flag is ignored by open(2) and
openat(2) because of their unspecified flags handling.

The underlying idea is to be able to restrict scripts interpretation
according to a policy defined by the system administrator.  For this to
be possible, script interpreters must use the O_MAYEXEC flag
appropriately.  To be fully effective, these interpreters also need to
handle the other ways to execute code: command line parameters (e.g.,
option -e for Perl), module loading (e.g., option -m for Python), stdin,
file sourcing, environment variables, configuration files, etc.
According to the threat model, it may be acceptable to allow some script
interpreters (e.g. Bash) to interpret commands from stdin, may it be a
TTY or a pipe, because it may not be enough to (directly) perform
syscalls.  Further documentation can be found in a following patch.

Even without enforced security policy, userland interpreters can set it
to enforce the system policy at their level, knowing that it will not
break anything on running systems which do not care about this feature.
However, on systems which want this feature enforced, there will be
knowledgeable people (i.e. sysadmins who enforced O_MAYEXEC
deliberately) to manage it.  A simple security policy implementation,
configured through a dedicated sysctl, is available in a following
patch.

O_MAYEXEC should not be confused with the O_EXEC flag which is intended
for execute-only, which obviously doesn't work for scripts.  However, a
similar behavior could be implemented in userland with O_PATH:
https://lore.kernel.org/lkml/1e2f6913-42f2-3578-28ed-567f6a4bdda1@digikod.net/

The implementation of O_MAYEXEC almost duplicates what execve(2) and
uselib(2) are already doing: setting MAY_OPENEXEC in acc_mode (which can
then be checked as MAY_EXEC, if enforced), and propagating FMODE_EXEC to
_fmode via __FMODE_EXEC flag (which can then trigger a
fanotify/FAN_OPEN_EXEC event).

This is an updated subset of the patch initially written by Vincent
Strubel for CLIP OS 4:
https://github.com/clipos-archive/src_platform_clip-patches/blob/f5cb330d6b684752e403b4e41b39f7004d88e561/1901_open_mayexec.patch
This patch has been used for more than 12 years with customized script
interpreters.  Some examples (with the original name O_MAYEXEC) can be
found here:
https://github.com/clipos-archive/clipos4_portage-overlay/search?q=O_MAYEXEC

Co-developed-by: Vincent Strubel <vincent.strubel@ssi.gouv.fr>
Signed-off-by: Vincent Strubel <vincent.strubel@ssi.gouv.fr>
Co-developed-by: Thibaut Sautereau <thibaut.sautereau@ssi.gouv.fr>
Signed-off-by: Thibaut Sautereau <thibaut.sautereau@ssi.gouv.fr>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Reviewed-by: Deven Bowers <deven.desai@linux.microsoft.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
---

Changes since v5:
* Update commit message.

Changes since v3:
* Switch back to O_MAYEXEC, but only handle it with openat2(2) which
  checks unknown flags (suggested by Aleksa Sarai). Cf.
  https://lore.kernel.org/lkml/20200430015429.wuob7m5ofdewubui@yavin.dot.cyphar.com/

Changes since v2:
* Replace O_MAYEXEC with RESOLVE_MAYEXEC from openat2(2).  This change
  enables to not break existing application using bogus O_* flags that
  may be ignored by current kernels by using a new dedicated flag, only
  usable through openat2(2) (suggested by Jeff Layton).  Using this flag
  will results in an error if the running kernel does not support it.
  User space needs to manage this case, as with other RESOLVE_* flags.
  The best effort approach to security (for most common distros) will
  simply consists of ignoring such an error and retry without
  RESOLVE_MAYEXEC.  However, a fully controlled system may which to
  error out if such an inconsistency is detected.

Changes since v1:
* Set __FMODE_EXEC when using O_MAYEXEC to make this information
  available through the new fanotify/FAN_OPEN_EXEC event (suggested by
  Jan Kara and Matthew Bobrowski):
  https://lore.kernel.org/lkml/20181213094658.GA996@lithium.mbobrowski.org/
---
 fs/fcntl.c                       | 2 +-
 fs/open.c                        | 8 ++++++++
 include/linux/fcntl.h            | 2 +-
 include/linux/fs.h               | 2 ++
 include/uapi/asm-generic/fcntl.h | 7 +++++++
 5 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/fs/fcntl.c b/fs/fcntl.c
index 2e4c0fa2074b..0357ad667563 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -1033,7 +1033,7 @@ static int __init fcntl_init(void)
 	 * Exceptions: O_NONBLOCK is a two bit define on parisc; O_NDELAY
 	 * is defined as O_NONBLOCK on some platforms and not on others.
 	 */
-	BUILD_BUG_ON(21 - 1 /* for O_RDONLY being 0 */ !=
+	BUILD_BUG_ON(22 - 1 /* for O_RDONLY being 0 */ !=
 		HWEIGHT32(
 			(VALID_OPEN_FLAGS & ~(O_NONBLOCK | O_NDELAY)) |
 			__FMODE_EXEC | __FMODE_NONOTIFY));
diff --git a/fs/open.c b/fs/open.c
index 623b7506a6db..38e434bdbbb6 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -987,6 +987,8 @@ inline struct open_how build_open_how(int flags, umode_t mode)
 		.mode = mode & S_IALLUGO,
 	};
 
+	/* O_MAYEXEC is ignored by syscalls relying on build_open_how(). */
+	how.flags &= ~O_MAYEXEC;
 	/* O_PATH beats everything else. */
 	if (how.flags & O_PATH)
 		how.flags &= O_PATH_FLAGS;
@@ -1054,6 +1056,12 @@ inline int build_open_flags(const struct open_how *how, struct open_flags *op)
 	if (flags & __O_SYNC)
 		flags |= O_DSYNC;
 
+	/* Checks execution permissions on open. */
+	if (flags & O_MAYEXEC) {
+		acc_mode |= MAY_OPENEXEC;
+		flags |= __FMODE_EXEC;
+	}
+
 	op->open_flag = flags;
 
 	/* O_TRUNC implies we need access checks for write permissions */
diff --git a/include/linux/fcntl.h b/include/linux/fcntl.h
index 7bcdcf4f6ab2..e188a360fa5f 100644
--- a/include/linux/fcntl.h
+++ b/include/linux/fcntl.h
@@ -10,7 +10,7 @@
 	(O_RDONLY | O_WRONLY | O_RDWR | O_CREAT | O_EXCL | O_NOCTTY | O_TRUNC | \
 	 O_APPEND | O_NDELAY | O_NONBLOCK | O_NDELAY | __O_SYNC | O_DSYNC | \
 	 FASYNC	| O_DIRECT | O_LARGEFILE | O_DIRECTORY | O_NOFOLLOW | \
-	 O_NOATIME | O_CLOEXEC | O_PATH | __O_TMPFILE)
+	 O_NOATIME | O_CLOEXEC | O_PATH | __O_TMPFILE | O_MAYEXEC)
 
 /* List of all valid flags for the how->upgrade_mask argument: */
 #define VALID_UPGRADE_FLAGS \
diff --git a/include/linux/fs.h b/include/linux/fs.h
index f5abba86107d..56f835c9a87a 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -101,6 +101,8 @@ typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset,
 #define MAY_CHDIR		0x00000040
 /* called from RCU mode, don't block */
 #define MAY_NOT_BLOCK		0x00000080
+/* the inode is opened with O_MAYEXEC */
+#define MAY_OPENEXEC		0x00000100
 
 /*
  * flags in file.f_mode.  Note that FMODE_READ and FMODE_WRITE must correspond
diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
index 9dc0bf0c5a6e..bca90620119f 100644
--- a/include/uapi/asm-generic/fcntl.h
+++ b/include/uapi/asm-generic/fcntl.h
@@ -97,6 +97,13 @@
 #define O_NDELAY	O_NONBLOCK
 #endif
 
+/*
+ * Code execution from file is intended, checks such permission.  A simple
+ * policy can be enforced system-wide as explained in
+ * Documentation/admin-guide/sysctl/fs.rst .
+ */
+#define O_MAYEXEC	040000000
+
 #define F_DUPFD		0	/* dup */
 #define F_GETFD		1	/* get close_on_exec */
 #define F_SETFD		2	/* set/clear close_on_exec */
-- 
2.27.0


^ permalink raw reply related

* [PATCH v6 0/7] Add support for O_MAYEXEC
From: Mickaël Salaün @ 2020-07-14 18:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mickaël Salaün, Aleksa Sarai, Alexei Starovoitov,
	Al Viro, Andrew Morton, Andy Lutomirski, Christian Brauner,
	Christian Heimes, Daniel Borkmann, Deven Bowers, Dmitry Vyukov,
	Eric Biggers, Eric Chiang, Florian Weimer, James Morris, Jan Kara,
	Jann Horn, Jonathan Corbet, Kees Cook, Lakshmi Ramasubramanian,
	Matthew Garrett, Matthew Wilcox, Michael Kerrisk,
	Mickaël Salaün, Mimi Zohar, Philippe Trébuchet,
	Scott Shell, Sean Christopherson, Shuah Khan, Steve Dower,
	Steve Grubb, Tetsuo Handa, Thibaut Sautereau, Vincent Strubel,
	kernel-hardening, linux-api, linux-integrity,
	linux-security-module, linux-fsdevel

Hi,

This sixth patch series mainly adds Kees Cook's file permission check
relocations which help to simplify and generalize the previous series.
I removed the MAY_EXECMOUNT flag patch which is not useful anymore with
this refactoring.  I also removed the static enforcement configuration
through Kconfig to make this series simpler and because it works in pair
with mount configurations (i.e. requires the same capability:
CAP_SYS_ADMIN).

As requested by Mimi Zohar, I completed the series with one of her
patches for IMA.  I also picked Kees Cook's patches to consolidate exec
permission checking into do_filp_open()'s flow:
https://lore.kernel.org/lkml/20200605160013.3954297-1-keescook@chromium.org/


# Goal of O_MAYEXEC

The goal of this patch series is to enable to control script execution
with interpreters help.  A new O_MAYEXEC flag, usable through
openat2(2), is added to enable userspace script interpreters to delegate
to the kernel (and thus the system security policy) the permission to
interpret/execute scripts or other files containing what can be seen as
commands.

A simple system-wide security policy can be enforced by the system
administrator through a sysctl configuration consistent with the mount
points or the file access rights.  The documentation patch explains the
prerequisites.

Furthermore, the security policy can also be delegated to an LSM, either
a MAC system or an integrity system.  For instance, the new kernel
MAY_OPENEXEC flag closes a major IMA measurement/appraisal interpreter
integrity gap by bringing the ability to check the use of scripts [1].
Other uses are expected, such as for magic-links [2], SGX integration
[3], bpffs [4] or IPE [5].


# Prerequisite of its use

Userspace needs to adapt to take advantage of this new feature.  For
example, the PEP 578 [6] (Runtime Audit Hooks) enables Python 3.8 to be
extended with policy enforcement points related to code interpretation,
which can be used to align with the PowerShell audit features.
Additional Python security improvements (e.g. a limited interpreter
withou -c, stdin piping of code) are on their way [7].


# Examples

The initial idea comes from CLIP OS 4 and the original implementation
has been used for more than 12 years:
https://github.com/clipos-archive/clipos4_doc
Chrome OS has a similar approach:
https://chromium.googlesource.com/chromiumos/docs/+/master/security/noexec_shell_scripts.md

Userland patches can be found here:
https://github.com/clipos-archive/clipos4_portage-overlay/search?q=O_MAYEXEC
Actually, there is more than the O_MAYEXEC changes (which matches this search)
e.g., to prevent Python interactive execution. There are patches for
Bash, Wine, Java (Icedtea), Busybox's ash, Perl and Python. There are
also some related patches which do not directly rely on O_MAYEXEC but
which restrict the use of browser plugins and extensions, which may be
seen as scripts too:
https://github.com/clipos-archive/clipos4_portage-overlay/tree/master/www-client

An introduction to O_MAYEXEC was given at the Linux Security Summit
Europe 2018 - Linux Kernel Security Contributions by ANSSI:
https://www.youtube.com/watch?v=chNjCRtPKQY&t=17m15s
The "write xor execute" principle was explained at Kernel Recipes 2018 -
CLIP OS: a defense-in-depth OS:
https://www.youtube.com/watch?v=PjRE0uBtkHU&t=11m14s
See also an overview article: https://lwn.net/Articles/820000/


This patch series can be applied on top of v5.8-rc5 .  This can be tested
with CONFIG_SYSCTL.  I would really appreciate constructive comments on
this patch series.

Previous version:
https://lore.kernel.org/lkml/20200505153156.925111-1-mic@digikod.net/


[1] https://lore.kernel.org/lkml/1544647356.4028.105.camel@linux.ibm.com/
[2] https://lore.kernel.org/lkml/20190904201933.10736-6-cyphar@cyphar.com/
[3] https://lore.kernel.org/lkml/CALCETrVovr8XNZSroey7pHF46O=kj_c5D9K8h=z2T_cNrpvMig@mail.gmail.com/
[4] https://lore.kernel.org/lkml/CALCETrVeZ0eufFXwfhtaG_j+AdvbzEWE0M3wjXMWVEO7pj+xkw@mail.gmail.com/
[5] https://lore.kernel.org/lkml/20200406221439.1469862-12-deven.desai@linux.microsoft.com/
[6] https://www.python.org/dev/peps/pep-0578/
[7] https://lore.kernel.org/lkml/0c70debd-e79e-d514-06c6-4cd1e021fa8b@python.org/

Regards,

Kees Cook (3):
  exec: Change uselib(2) IS_SREG() failure to EACCES
  exec: Move S_ISREG() check earlier
  exec: Move path_noexec() check earlier

Mickaël Salaün (3):
  fs: Introduce O_MAYEXEC flag for openat2(2)
  fs,doc: Enable to enforce noexec mounts or file exec through O_MAYEXEC
  selftest/openat2: Add tests for O_MAYEXEC enforcing

Mimi Zohar (1):
  ima: add policy support for the new file open MAY_OPENEXEC flag

 Documentation/ABI/testing/ima_policy          |   2 +-
 Documentation/admin-guide/sysctl/fs.rst       |  45 +++
 fs/exec.c                                     |  23 +-
 fs/fcntl.c                                    |   2 +-
 fs/namei.c                                    |  31 ++-
 fs/open.c                                     |  14 +-
 include/linux/fcntl.h                         |   2 +-
 include/linux/fs.h                            |   3 +
 include/uapi/asm-generic/fcntl.h              |   7 +
 kernel/sysctl.c                               |  12 +-
 security/integrity/ima/ima_main.c             |   3 +-
 security/integrity/ima/ima_policy.c           |  15 +-
 tools/testing/selftests/kselftest_harness.h   |   3 +
 tools/testing/selftests/openat2/Makefile      |   3 +-
 tools/testing/selftests/openat2/config        |   1 +
 tools/testing/selftests/openat2/helpers.h     |   1 +
 .../testing/selftests/openat2/omayexec_test.c | 262 ++++++++++++++++++
 17 files changed, 400 insertions(+), 29 deletions(-)
 create mode 100644 tools/testing/selftests/openat2/config
 create mode 100644 tools/testing/selftests/openat2/omayexec_test.c

-- 
2.27.0


^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox