Linux Security Modules development

Linux Security Modules development
 help / color / mirror / Atom feed

* Re: [PATCH 1/2] rtnetlink: gate MAC address with an LSM hook
From: David Miller @ 2019-08-22 23:19 UTC (permalink / raw)
  To: jeffv; +Cc: netdev, linux-security-module, selinux
In-Reply-To: <20190821134547.96929-1-jeffv@google.com>

From: Jeff Vander Stoep <jeffv@google.com>
Date: Wed, 21 Aug 2019 15:45:47 +0200

> MAC addresses are often considered sensitive because they are
> usually unique and can be used to identify/track a device or
> user [1].
> 
> The MAC address is accessible via the RTM_NEWLINK message type of a
> netlink route socket[2]. Ideally we could grant/deny access to the
> MAC address on a case-by-case basis without blocking the entire
> RTM_NEWLINK message type which contains a lot of other useful
> information. This can be achieved using a new LSM hook on the netlink
> message receive path. Using this new hook, individual LSMs can select
> which processes are allowed access to the real MAC, otherwise a
> default value of zeros is returned. Offloading access control
> decisions like this to an LSM is convenient because it preserves the
> status quo for most Linux users while giving the various LSMs
> flexibility to make finer grained decisions on access to sensitive
> data based on policy.
> 
> [1] https://adamdrake.com/mac-addresses-udids-and-privacy.html
> [2] Other access vectors like ioctl(SIOCGIFHWADDR) are already covered
> by existing LSM hooks.
> 
> Signed-off-by: Jeff Vander Stoep <jeffv@google.com>

I'm sure the MAC address will escape into userspace via other means,
dumping pieces of networking config in other contexts, etc.  I mean,
if I can get a link dump, I can dump the neighbor table as well.

I kinda think this is all very silly whack-a-mole kind of stuff, to
be quite honest.

And like others have said, tomorrow you'll be like "oh crap, we should
block X too" and we'll get another hook, another config knob, another
rulset update, etc.

^ permalink raw reply

* Re: [PATCH v2 bpf-next 1/4] bpf: unprivileged BPF access via /dev/bpf
From: Alexei Starovoitov @ 2019-08-22 22:48 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Andy Lutomirski, Song Liu, Kees Cook, Networking, bpf,
	Alexei Starovoitov, Kernel Team, Lorenz Bauer, Jann Horn, Greg KH,
	Linux API, LSM List, Chenbo Feng
In-Reply-To: <98fee747-795a-ff10-fa98-10ddb5afcc03@iogearbox.net>

On Thu, Aug 22, 2019 at 04:17:43PM +0200, Daniel Borkmann wrote:
> 
> > > Hence unprivileged bpf is actually something that can be deprecated.
> 
> There is actually a publicly known use-case on unprivileged bpf wrt
> socket filters, see the SO_ATTACH_BPF on sockets section as an example:
> 
>   https://blog.cloudflare.com/cloudflare-architecture-and-how-bpf-eats-the-world/
> 
> If I'd have to take a good guess, I'd think it's major use-case is in
> SO_ATTACH_REUSEPORT_EBPF in the wild, I don't think the sysctl can be
> outright flipped or deprecated w/o breaking existing applications unless
> it's cleanly modeled into some sort of customizable CAP_BPF* type policy
> (more below) where this would be the lowest common denominator.

The cloudflare use case is the perfect example that a lot of program types
are used together.
Do people use SO_ATTACH_BPF ? Of course.
All program types are used by somebody. Before accepting them we had long
conversations with authors to understand that the use cases are real.
Some progs are probably used less than others by now.
Like cls_bpf without exts_integrated is probably not used at all.
We still have to support it, of course.
That cloudflare example demonstrates that kernel.unprivileged_bpf_disabled=1
is a reality. Companies that care about security already switched it on.
Different bits in cloudflare setup need different level of capabilities.
Some (like SO_ATTACH_BPF) need unpriv. Another need CAP_NET_ADMIN.
But common demoninator for them all is still CAP_SYS_ADMIN.
And that's why the system as a whole is not as safe as it could have
been with CAP_BPF. The system needs root in many places.
Folks going out of the way to reduce that SYS_ADMIN to something less.
The example with systemd and NET_ADMIN is just one of them.

^ permalink raw reply

* Re: New skb extension for use by LSMs (skb "security blob")?
From: David Miller @ 2019-08-22 22:36 UTC (permalink / raw)
  To: casey; +Cc: fw, paul, netdev, linux-security-module, selinux
In-Reply-To: <5d524e45-0d80-3a1c-4fd2-7610d2197bf8@schaufler-ca.com>

From: Casey Schaufler <casey@schaufler-ca.com>
Date: Thu, 22 Aug 2019 15:34:44 -0700

> On 8/22/2019 3:28 PM, David Miller wrote:
>> From: Casey Schaufler <casey@schaufler-ca.com>
>> Date: Thu, 22 Aug 2019 14:59:37 -0700
>>
>>> Sure, you *can* do that, but it would be insane to do so.
>> We look up the neighbour table entries on every single packet we
>> transmit from the kernel in the same exact way.
>>
>> And it was exactly to get rid of a pointer in a data structure.
> 
> I very much expect that the lifecycle management issues would
> be completely different, but I'll admit to having little understanding
> of the details of the neighbour table.

Neighbour table entries can live anywhere from essentially forever down
to several microseconds.

If your hash is good, and you use RCU locking on the read side, it's a
single pointer dereference in cost.

^ permalink raw reply

* Re: New skb extension for use by LSMs (skb "security blob")?
From: Casey Schaufler @ 2019-08-22 22:34 UTC (permalink / raw)
  To: David Miller; +Cc: fw, paul, netdev, linux-security-module, selinux, casey
In-Reply-To: <20190822.152857.1388207414767202364.davem@davemloft.net>

On 8/22/2019 3:28 PM, David Miller wrote:
> From: Casey Schaufler <casey@schaufler-ca.com>
> Date: Thu, 22 Aug 2019 14:59:37 -0700
>
>> Sure, you *can* do that, but it would be insane to do so.
> We look up the neighbour table entries on every single packet we
> transmit from the kernel in the same exact way.
>
> And it was exactly to get rid of a pointer in a data structure.

I very much expect that the lifecycle management issues would
be completely different, but I'll admit to having little understanding
of the details of the neighbour table.

^ permalink raw reply

* Re: New skb extension for use by LSMs (skb "security blob")?
From: David Miller @ 2019-08-22 22:28 UTC (permalink / raw)
  To: casey; +Cc: fw, paul, netdev, linux-security-module, selinux
In-Reply-To: <e2e22b41-2aa1-6a52-107d-e4efd9dcacf4@schaufler-ca.com>

From: Casey Schaufler <casey@schaufler-ca.com>
Date: Thu, 22 Aug 2019 14:59:37 -0700

> Sure, you *can* do that, but it would be insane to do so.

We look up the neighbour table entries on every single packet we
transmit from the kernel in the same exact way.

And it was exactly to get rid of a pointer in a data structure.

^ permalink raw reply

* Re: New skb extension for use by LSMs (skb "security blob")?
From: Casey Schaufler @ 2019-08-22 21:59 UTC (permalink / raw)
  To: David Miller; +Cc: fw, paul, netdev, linux-security-module, selinux, casey
In-Reply-To: <20190822.141845.217313560870249775.davem@davemloft.net>

On 8/22/2019 2:18 PM, David Miller wrote:
> From: Casey Schaufler <casey@schaufler-ca.com>
> Date: Thu, 22 Aug 2019 13:35:01 -0700
>
>> If the secmark where replaced by a security blob, the u32 secmark field
>> in an sk_buff would be replaced by a void * security field.
> You can already use the secmark to hash to some kind of pointer or other
> object.

Would you really want that used in the most common configuration?
Sure, you *can* do that, but it would be insane to do so.

^ permalink raw reply

* Re: New skb extension for use by LSMs (skb "security blob")?
From: David Miller @ 2019-08-22 21:18 UTC (permalink / raw)
  To: casey; +Cc: fw, paul, netdev, linux-security-module, selinux
In-Reply-To: <32646e98-2ed6-a63a-5589-fefd57e85f66@schaufler-ca.com>

From: Casey Schaufler <casey@schaufler-ca.com>
Date: Thu, 22 Aug 2019 13:35:01 -0700

> If the secmark where replaced by a security blob, the u32 secmark field
> in an sk_buff would be replaced by a void * security field.

You can already use the secmark to hash to some kind of pointer or other
object.

^ permalink raw reply

* Re: New skb extension for use by LSMs (skb "security blob")?
From: David Miller @ 2019-08-22 21:17 UTC (permalink / raw)
  To: casey; +Cc: paul, fw, netdev, linux-security-module, selinux
In-Reply-To: <00ab1a3e-fd57-fe42-04fa-d82c1585b360@schaufler-ca.com>

From: Casey Schaufler <casey@schaufler-ca.com>
Date: Thu, 22 Aug 2019 13:10:43 -0700

> Given that the original objection to using a skb extension for a
> security blob was that an extension is dynamic, and that the ubiquitous
> nature of LSM use makes that unreasonable, it would seem that supporting
> the security blob as a basic part if the skb would be the obvious and
> correct solution. If the normal case is that there is an LSM that would
> befit from the native (unextended) support of a blob, it would seem
> that that is the case that should be optimized.

The objection is the cost, either in terms of dynamic allocation or in
terms of fixed space allocated inside of the SKB.

If you are given a u32 (which you already have) it can be used as an
IDR-like space to look up pointers if necessary.

^ permalink raw reply

* Re: New skb extension for use by LSMs (skb "security blob")?
From: Casey Schaufler @ 2019-08-22 20:35 UTC (permalink / raw)
  To: Florian Westphal
  Cc: Paul Moore, netdev, linux-security-module, selinux, casey
In-Reply-To: <20190822201520.GJ20113@breakpoint.cc>

On 8/22/2019 1:15 PM, Florian Westphal wrote:
> Casey Schaufler <casey@schaufler-ca.com> wrote:
>> Given that the original objection to using a skb extension for a
>> security blob was that an extension is dynamic, and that the ubiquitous
>> nature of LSM use makes that unreasonable, it would seem that supporting
>> the security blob as a basic part if the skb would be the obvious and
>> correct solution. If the normal case is that there is an LSM that would
>> befit from the native (unextended) support of a blob, it would seem
>> that that is the case that should be optimized.
> What is this "blob"? i.e., what would you like to add to sk_buff to make
> whatever use cases you have in mind work?

In LSM terminology a blob is a set of data managed and used by
the LSM (either in the infrastructure or the security module).
Blob pointers are included in the system data structures to which
they relate. The inode has an i_security field, which is a void *.
If the secmark where replaced by a security blob, the u32 secmark field
in an sk_buff would be replaced by a void * security field.

^ permalink raw reply

* Re: New skb extension for use by LSMs (skb "security blob")?
From: Florian Westphal @ 2019-08-22 20:15 UTC (permalink / raw)
  To: Casey Schaufler
  Cc: Paul Moore, Florian Westphal, netdev, linux-security-module,
	selinux
In-Reply-To: <00ab1a3e-fd57-fe42-04fa-d82c1585b360@schaufler-ca.com>

Casey Schaufler <casey@schaufler-ca.com> wrote:
> Given that the original objection to using a skb extension for a
> security blob was that an extension is dynamic, and that the ubiquitous
> nature of LSM use makes that unreasonable, it would seem that supporting
> the security blob as a basic part if the skb would be the obvious and
> correct solution. If the normal case is that there is an LSM that would
> befit from the native (unextended) support of a blob, it would seem
> that that is the case that should be optimized.

What is this "blob"? i.e., what would you like to add to sk_buff to make
whatever use cases you have in mind work?

^ permalink raw reply

* Re: New skb extension for use by LSMs (skb "security blob")?
From: Casey Schaufler @ 2019-08-22 20:10 UTC (permalink / raw)
  To: Paul Moore, Florian Westphal
  Cc: netdev, linux-security-module, selinux, casey
In-Reply-To: <CAHC9VhQ_+3ywPu0QRzP3cSgPH2i9Br994wJttp-yXy2GA4FrNg@mail.gmail.com>

On 8/22/2019 9:32 AM, Paul Moore wrote:
> On Thu, Aug 22, 2019 at 3:03 AM Florian Westphal <fw@strlen.de> wrote:
>> Paul Moore <paul@paul-moore.com> wrote:
>>> Hello netdev,
>>>
>>> I was just made aware of the skb extension work, and it looks very
>>> appealing from a LSM perspective.  As some of you probably remember,
>>> we (the LSM folks) have wanted a proper security blob in the skb for
>>> quite some time, but netdev has been resistant to this idea thus far.
>> Is that "blob" in addition to skb->secmark, or a replacement?
> That's a good question.  While I thought about that, I wasn't sure if
> that was worth bringing up as previous attempts to trade the secmark
> field for a void pointer met with failure.  Last time I played with it
> I was able to take the additional 32-bits from holes in the skb, and
> possibly even improve some of the cacheline groupings (but that is
> always going to be a dependent on use case I think), but that wasn't
> enough.
>
> I think we could consider freeing up the secmark in the main skb, and
> move it to a skb extension, but this would potentially increase the
> chances that we would need to add an extension to a skb.  I don't have
> any hard numbers, but based on discussions and questions I suspect
> Secmark is more widely used than NetLabel and/or labeled IPsec;
> although I'm confident it is still a minor percentage of the overall
> Linux installed base.

Smack uses both extensively. As far as Smack is concerned giving up
the secmark for a blob would be just fine.

I am also working on security module stacking, and a blob in the
skb would dramatically improve the options for making that work
rationally.

> For me the big question is what would it take for us to get a security
> blob associated with the skb?  Would moving the secmark into the skb
> extension be enough?  Something else?  Or is this simply never going
> to happen?  I want to remain optimistic, but I've been trying for this
> off-and-on for over a decade and keep running into a brick wall ;)

Given that the original objection to using a skb extension for a
security blob was that an extension is dynamic, and that the ubiquitous
nature of LSM use makes that unreasonable, it would seem that supporting
the security blob as a basic part if the skb would be the obvious and
correct solution. If the normal case is that there is an LSM that would
befit from the native (unextended) support of a blob, it would seem
that that is the case that should be optimized.

^ permalink raw reply

* Re: New skb extension for use by LSMs (skb "security blob")?
From: Casey Schaufler @ 2019-08-22 18:50 UTC (permalink / raw)
  To: David Miller, paul; +Cc: netdev, linux-security-module, selinux, casey
In-Reply-To: <20190821.205454.2103510420957943248.davem@davemloft.net>

On 8/21/2019 8:54 PM, David Miller wrote:
> From: Paul Moore <paul@paul-moore.com>
> Date: Wed, 21 Aug 2019 23:27:03 -0400
>
>> On Wed, Aug 21, 2019 at 6:50 PM David Miller <davem@davemloft.net> wrote:
>>> From: Paul Moore <paul@paul-moore.com>
>>> Date: Wed, 21 Aug 2019 18:00:09 -0400
>>>
>>>> I was just made aware of the skb extension work, and it looks very
>>>> appealing from a LSM perspective.  As some of you probably remember,
>>>> we (the LSM folks) have wanted a proper security blob in the skb for
>>>> quite some time, but netdev has been resistant to this idea thus far.
>>>>
>>>> If I were to propose a patchset to add a SKB_EXT_SECURITY skb
>>>> extension (a single extension ID to be shared among the different
>>>> LSMs), would that be something that netdev would consider merging, or
>>>> is there still a philosophical objection to things like this?
>>> Unlike it's main intended user (MPTCP), it sounds like LSM's would use
>>> this in a way such that it would be enabled on most systems all the
>>> time.

Only SELinux and Smack use the networking hooks today,
although I understand that AppArmor has plans to do so
in the not too distant future. Smack enables labeled
networking at all times. While Smack doesn't have the
expansive use that SELinux does because of Android, it
is used extensively in embedded systems via Tizen and
Yocto Project deployments.

>>> That really defeats the whole purpose of making it dynamic. :-/

It argues that fulfilling the needs of LSMs ought to be a basic
feature of the skb, rather than a dynamic extension. When LSMs were
introduced 20 years ago it was assumed their use would be rare, and
it was. Today almost all Linux systems use LSMs, and once AppArmor
adds network labeling it will be quite difficult to find a major
distribution that doesn't need the support.

>> I would be okay with only adding a skb extension when we needed it,
>> which I'm currently thinking would only be when we had labeled
>> networking actually configured at runtime and not just built into the
>> kernel.  In SELinux we do something similar today when it comes to our
>> per-packet access controls; if labeled networking is not configured we
>> bail out of the LSM hooks early to improve performance (we would just
>> be comparing unlabeled_t to unlabeled_t anyway).  I think the other
>> LSMs would be okay with this usage as well.

Smack uses labeled (CIPSO now, CALIPSO 'soon') networking by default
and depends on it heavily for basic system policy enforcement.

>> While a number of distros due enable some form of LSM and the labeled
>> networking bits at build time, vary few (if any?) provide a default
>> configuration so I would expect no additional overhead in the common
>> case.

Tizen isn't a distro, but neither is Android.

>> Would that be acceptable?
> I honestly don't know, I kinda feared that once the SKB extension went in
> people would start dumping things there and that's exactly what's happening.

As Paul has mentioned, the LSM community (Paul and me in particular)
have been looking for a better way to deal with the network stack
for a long time.

> I just so happened to be reviewing:
>
> 	https://patchwork.ozlabs.org/patch/1150091/
>
> while you were writing this email.
>
> It's rediculous, the vultures are out.

^ permalink raw reply

* Re: [PATCH] smack: use GFP_NOFS while holding inode_smack::smk_lock
From: Casey Schaufler @ 2019-08-22 16:44 UTC (permalink / raw)
  To: Eric Biggers, linux-security-module
  Cc: syzkaller-bugs, syzbot+0eefc1e06a77d327a056, stable, casey
In-Reply-To: <20190822055441.20569-1-ebiggers@kernel.org>

On 8/21/2019 10:54 PM, Eric Biggers wrote:
> From: Eric Biggers <ebiggers@google.com>
>
> inode_smack::smk_lock is taken during smack_d_instantiate(), which is
> called during a filesystem transaction when creating a file on ext4.
> Therefore to avoid a deadlock, all code that takes this lock must use
> GFP_NOFS, to prevent memory reclaim from waiting for the filesystem
> transaction to complete.
>
> Reported-by: syzbot+0eefc1e06a77d327a056@syzkaller.appspotmail.com
> Cc: stable@vger.kernel.org
> Signed-off-by: Eric Biggers <ebiggers@google.com>

I will run tests on this, and will take it in the Smack tree
unless there are unexpected failures.

> ---
>  security/smack/smack_access.c | 6 +++---
>  security/smack/smack_lsm.c    | 2 +-
>  2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/security/smack/smack_access.c b/security/smack/smack_access.c
> index f1c93a7be9ec..38ac3da4e791 100644
> --- a/security/smack/smack_access.c
> +++ b/security/smack/smack_access.c
> @@ -465,7 +465,7 @@ char *smk_parse_smack(const char *string, int len)
>  	if (i == 0 || i >= SMK_LONGLABEL)
>  		return ERR_PTR(-EINVAL);
>  
> -	smack = kzalloc(i + 1, GFP_KERNEL);
> +	smack = kzalloc(i + 1, GFP_NOFS);
>  	if (smack == NULL)
>  		return ERR_PTR(-ENOMEM);
>  
> @@ -500,7 +500,7 @@ int smk_netlbl_mls(int level, char *catset, struct netlbl_lsm_secattr *sap,
>  			if ((m & *cp) == 0)
>  				continue;
>  			rc = netlbl_catmap_setbit(&sap->attr.mls.cat,
> -						  cat, GFP_KERNEL);
> +						  cat, GFP_NOFS);
>  			if (rc < 0) {
>  				netlbl_catmap_free(sap->attr.mls.cat);
>  				return rc;
> @@ -536,7 +536,7 @@ struct smack_known *smk_import_entry(const char *string, int len)
>  	if (skp != NULL)
>  		goto freeout;
>  
> -	skp = kzalloc(sizeof(*skp), GFP_KERNEL);
> +	skp = kzalloc(sizeof(*skp), GFP_NOFS);
>  	if (skp == NULL) {
>  		skp = ERR_PTR(-ENOMEM);
>  		goto freeout;
> diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
> index 50c536cad85b..7e4d3145a018 100644
> --- a/security/smack/smack_lsm.c
> +++ b/security/smack/smack_lsm.c
> @@ -288,7 +288,7 @@ static struct smack_known *smk_fetch(const char *name, struct inode *ip,
>  	if (!(ip->i_opflags & IOP_XATTR))
>  		return ERR_PTR(-EOPNOTSUPP);
>  
> -	buffer = kzalloc(SMK_LONGLABEL, GFP_KERNEL);
> +	buffer = kzalloc(SMK_LONGLABEL, GFP_NOFS);
>  	if (buffer == NULL)
>  		return ERR_PTR(-ENOMEM);
>  

^ permalink raw reply

* Re: New skb extension for use by LSMs (skb "security blob")?
From: Paul Moore @ 2019-08-22 16:32 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netdev, linux-security-module, selinux
In-Reply-To: <20190822070358.GE20113@breakpoint.cc>

On Thu, Aug 22, 2019 at 3:03 AM Florian Westphal <fw@strlen.de> wrote:
> Paul Moore <paul@paul-moore.com> wrote:
> > Hello netdev,
> >
> > I was just made aware of the skb extension work, and it looks very
> > appealing from a LSM perspective.  As some of you probably remember,
> > we (the LSM folks) have wanted a proper security blob in the skb for
> > quite some time, but netdev has been resistant to this idea thus far.
>
> Is that "blob" in addition to skb->secmark, or a replacement?

That's a good question.  While I thought about that, I wasn't sure if
that was worth bringing up as previous attempts to trade the secmark
field for a void pointer met with failure.  Last time I played with it
I was able to take the additional 32-bits from holes in the skb, and
possibly even improve some of the cacheline groupings (but that is
always going to be a dependent on use case I think), but that wasn't
enough.

I think we could consider freeing up the secmark in the main skb, and
move it to a skb extension, but this would potentially increase the
chances that we would need to add an extension to a skb.  I don't have
any hard numbers, but based on discussions and questions I suspect
Secmark is more widely used than NetLabel and/or labeled IPsec;
although I'm confident it is still a minor percentage of the overall
Linux installed base.

For me the big question is what would it take for us to get a security
blob associated with the skb?  Would moving the secmark into the skb
extension be enough?  Something else?  Or is this simply never going
to happen?  I want to remain optimistic, but I've been trying for this
off-and-on for over a decade and keep running into a brick wall ;)

-- 
paul moore
www.paul-moore.com

^ permalink raw reply

* Re: [PATCH v2] tomoyo: Don't check open/getattr permission on sockets.
From: Eric Biggers @ 2019-08-22 15:47 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Al Viro, linux-fsdevel, syzbot, jmorris, linux-kernel,
	linux-security-module, serge, syzkaller-bugs, takedakn,
	David S. Miller
In-Reply-To: <201908220742.x7M7gQJW078160@www262.sakura.ne.jp>

On Thu, Aug 22, 2019 at 04:42:26PM +0900, Tetsuo Handa wrote:
> Eric Biggers wrote:
> > On Thu, Aug 22, 2019 at 03:55:31PM +0900, Tetsuo Handa wrote:
> > > > Also, isn't the same bug in other places too?:
> > > > 
> > > > 	- tomoyo_path_chmod()
> > > > 	- tomoyo_path_chown()
> > > > 	- smack_inode_getsecurity()
> > > > 	- smack_inode_setsecurity()
> > > 
> > > What's the bug? The file descriptor returned by open(O_PATH) cannot be
> > > passed to read(2), write(2), fchmod(2), fchown(2), fgetxattr(2), mmap(2) etc.
> > > 
> > 
> > chmod(2), chown(2), getxattr(2), and setxattr(2) take a path, not a fd.
> > 
> 
> OK. Then, is the correct fix
> 
>   inode_lock(inode);
>   if (SOCKET_I(inode)->sk) {
>     // Can access SOCKET_I(sock)->sk->*
>   } else {
>     // Already close()d. Don't touch.
>   }
>   inode_unlock(inode);
> 
> thanks to
> 
>   commit 6d8c50dcb029872b ("socket: close race condition between sock_close() and sockfs_setattr()")
>   commit ff7b11aa481f682e ("net: socket: set sock->sk to NULL after calling proto_ops::release()")
> 
> changes?

inode_lock() is already held during security_path_chmod(),
security_path_chown(), and security_inode_setxattr().
So you can't just take it again.

- Eric

^ permalink raw reply

* RFC: very rough draft of a bpf permission model
From: Andy Lutomirski @ 2019-08-22 15:17 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Daniel Borkmann, Alexei Starovoitov, Song Liu, Kees Cook,
	Networking, bpf, Alexei Starovoitov, Kernel Team, Lorenz Bauer,
	Jann Horn, Greg KH, Linux API, LSM List, Chenbo Feng
In-Reply-To: <CALCETrUWQbPK3Pc6P5i_UqHPXJmZVyvuYXfq+VRtD6A3emaRhw@mail.gmail.com>

BPF security strawman, v0.1

This is very rough.  Most of this, especially the API details, needs
work before it's ready to implement.  The whole concept also needs
review.

= Goals =

The overall goal is to make it possible to use eBPF without having
what is effectively administrator access.  For example, an eBPF user
should not be able to directly tamper with other processes (unless
this permission is explicitly granted) and should not be able to
read or write other users' eBPF maps.

It should be possible to use eBPF inside a user namespace without breaking
the userns security model.

Due to the risk of speculation attacks and such being carried out via
eBPF, it should not become possible to use too much of eBPF without the
administrator's permission.  (NB: it is already possible to use
*classic* BPF without any permission, and classic BPF is translated
internally to eBPF, so this goal can only be met to a limited extent.)

= Definitions =

Global capability: A capability bit in the caller's effective mask, so
long as the caller is in the root user namespace.  Tasks in non-root
user namespaces never have global capabilibies.  This is what capable()
checks.

Namespace capability: A capability over a specific user namespace.
Tasks in a user namespace have all the capabilities in their effective
mask over their user namespace.  A namespace capability generally
indicates that the capability applies to the user namespace itself and
to all non-user namespaces that live in the user namespace.  For
example, CAP_NET_ADMIN means that you can configure all networks
namespaces in the current user namespace.  This is what ns_capable()
checks.

Anything that requires a global capability will not work in a non-root
user namespace.

= unprivileged_bpf_disabled =

Nothing in here supercedes unprivileged_bpf_disabled.  If
unprivileged_bpf_disabled = 1, then these proposals should not allow anything
that is disallowed today.  The idea is to make unprivileged_bpf_disabled=0
both safer and more useful.

= Test runs =

Global CAP_SYS_ADMIN is needed to test-run a program.  Test-running a program
exposes its own attack surface.  It's also the only way to run a program at
all if you merely have permission to load the program but not to attach it
anywhere.  Some of the proposed changes below will make it possible to load
most program types without

= Access to programs and maps =

There are two basic security concerns when accessing programs and maps:
the attack surface against the kernel and the ability to access other
people's maps.

Unprivileged processes may read a map if they have an FMODE_READ descriptor
for the map.  Unprivileged processes may write a map if they have an
FMODE_WRITE descriptor to the map.  Unprivileged processes may open a
persistent map with a mode consistent with the permissions in bpffs.

Unprivileged processes may create a bpffs inode for an existing map
if the have an RW file descriptor for the map.  (This is a change to
current behavior.  Daniel, Alexei thought the current behavior was
intentional.  Do you recall whether this is the case?)

The _BY_ID map APIs inherently have no concept of ownership of maps.  These
APIs will continue to require global CAP_SYS_ADMIN.

The small number of things that currently require the _BY_ID APIs, e.g.,
reading maps of maps, can be addressed if needed with new APIs that
return fds instead of ids.  Otherwise using them will continue to require
global capabilities.

Unprivileged processes may create exactly the set of maps that they can
create today.  Future proposals may extend this by a variety of means;
this current proposal makes no changes.

= Program loading =

Loading a program carries the following risks:

 - It exposes the attack surface in the program verifier itself.  That is
   possible, although unlikely, that merely verifying a malicious program
   could crash or otherwise cause a kernel malfunction.

 - It exposes the attack surface of insufficient checks in the verifier.
   That is, a verifier bug could allow a malicious program that is dangerous
   when run.

 - It exposes all of the functions that the program type can call.
   Some functions, e.g. bpf_probe_read(), should require privilege to call.

 - It exposes resource attacks.  Currently, privileged users can load programs
   that use more resources than unprivileged users can load.

 - It exposes pointer-to-integer conversions.  This requires global
   capabilities.

 - The program could contain speculation attack gadgets.

 - Loading a program is a prerequisite to attaching the program.

I propose the following:

Flag functions that require privilege as such.  Loading a program that calls
such a function will require a global capability.  The privileged functions are
mainly used for tracing, I think, and kernel tracing should require global
capabilities.

Loading a program that uses privileged verifier features (function calls or
pointer-to-integer-conversions) will continue to require privileges.

Loading a function that uses excessive resources can continue to require
global capabilities or it could use a new set of cgroup settings that
adjust the bpf complexity limits.

Loading a function that bypasses the various speculation attack hardening
features (e.g. constant blinding) requires global capabilities.

Other than this, bpf program types can have a new flag set to allow
them to be loaded without any privileges.  Some bpf program types
may need additional care, e.g. perf bpf events.  They can be attached
without privilege even in current kernels, and this might need to change.

(optional) Add an API to load a program where the program source comes from a
file specified by id instead of in memory.  This would allow LSMs to require
that bpf() programs be appropriately labeled.  If LSMs require use of this
API, it will make it much harder to exploit the verifier or speculation bugs.

As a possible future extension, a way to selectively grant the ability to
use specific program types without privilege could be useful.  This
could be done with a cgroup option, for example.

= Cgroup attach =

Cgroups have their own hierarchy that does not necessarily follow the
namespace hierarchy.  Unless cgroups integrate with namespaces in ways
that they currently don't, namespace capabilites cannot be used to grant
permission to operate on cgroups.

I propose that attaching and detaching bpf programs to cgroups use a
permission model similar to the model for changing non-bpf cgroup
settings.  In particular, each bpf_attach_type will get a new file in a
cgroup directory.  So there will be
/sys/fs/cgroup/cgroup_name/bpf.inet_ingress, bpf.inet_egress, etc.

A new API will be added to bpf() to attach and detach programs.  The new
API will take an fd to the bpf.attach_type file instead of to the cgroup
directory.  It will require FMODE_WRITE.  This API will *not* require
any capability.

To prevent anyone with a delegated cgroup from automatically being
able to use all bpf program types, the new bpf.attach_type files
will be opt-in as part of the hierarchy.  This could be done by writing
"+bpf.*" or "+bpf.inet_ingress" to cgroup.subtree_control to make
all the bpf.attach_type files or just bpf.inet_ingress available
in descendents of the cgroup in question.  This could alternatively
be a new bpf.subtree_control file if that seems better.

The result of these changes will be that root can use the old
attach API or the new attach API.  Unprivileged programs cannot
use the old attach API.  Unprivileged programs can use the new
attach API if they are explicitly granted permission by all their
ancestor cgroup managers.

= Additional mitigations =

Optional: there may be cases where a user can load a bpf program
but can't attach or otherwise execute it.  Nonetheless, it's plausible
that such a program could be speculatively executed.  The kernel could
mitigate this by only marking a JITted bpf program executable when it
is first attached or test-run.

^ permalink raw reply

* Re: [PATCH v2 bpf-next 1/4] bpf: unprivileged BPF access via /dev/bpf
From: Andy Lutomirski @ 2019-08-22 15:16 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Andy Lutomirski, Alexei Starovoitov, Song Liu, Kees Cook,
	Networking, bpf, Alexei Starovoitov, Kernel Team, Lorenz Bauer,
	Jann Horn, Greg KH, Linux API, LSM List, Chenbo Feng
In-Reply-To: <98fee747-795a-ff10-fa98-10ddb5afcc03@iogearbox.net>

On Thu, Aug 22, 2019 at 7:17 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 8/7/19 7:24 AM, Andy Lutomirski wrote:
> > On Mon, Aug 5, 2019 at 6:11 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> >> On Mon, Aug 05, 2019 at 02:25:35PM -0700, Andy Lutomirski wrote:
> >>> It tries to make the kernel respect the access modes for fds.  Without
> >>> this patch, there seem to be some holes: nothing looked at program fds
> >>> and, unless I missed something, you could take a readonly fd for a
> >>> program, pin the program, and reopen it RW.
> >>
> >> I think it's by design. iirc Daniel had a use case for something like this.
> >
> > That seems odd.  Daniel, can you elaborate?
>
> [ ... catching up late. ]
>
> Not from my side, the change was added by Chenbo back then for Android
> use-case to replace xt_qtaguid and xt_owner with BPF programs and to
> allow unprivileged applications to read maps. More on their architecture:
>
>    https://source.android.com/devices/tech/datausage/ebpf-traffic-monitor
>
>  From the cover-letter:
>
>    [...]
>    The network-control daemon (netd) creates and loads an eBPF object for
>    network packet filtering and analysis. It passes the object FD to an
>    unprivileged network monitor app (netmonitor), which is not allowed to
>    create, modify or load eBPF objects, but is allowed to read the traffic
>    stats from the map.
>    [...]

I suspect that this use case is, in fact, mostly broken in current
kernels.  An unprivileged process with a read-only fd to a bpf map can
BPF_OBJ_PIN the map and then re-open it read-write.  As far as I can
tell, the only thing mitigating this is that it won't work unless the
attacker has write access to some directory in bpffs.

> > Trusted by whom?  In a non-nested container, the container manager
> > *might* be trusted by the outside world.  In a *nested* container,
> > unless the inner container management is controlled from outside the
> > outer container, it's not trusted.  I don't know much about how
> > Facebook's containers work, but the LXC/LXD/Podman world is moving
> > very strongly toward user namespaces and maximally-untrusted
> > containers, and I think bpf() should work in that context.
>
> [...] and if we opt-in with CAP_NET_ADMIN, for example, then it should
> ideally be possible for that container to install BPF programs for
> mangling, dropping, forwarding etc as long as it's only affecting it's
> /own/ netns like the rest of networking subsystem controls that work
> in that case. I would actually like to get to this at some point and
> make it more approachable as long as there is a way for an admin to
> /opt into it/ via policy (aka not by default).

For better or for worse, I think this would need a massive
re-architecting of the way bpf filtering works.  bpf filters attach to
cgroups, which aren't scoped to network namespaces at all.  So we need
a different permission model.

> Thinking out loud, I'd
> love some sort of a hybrid, that is, a mixture of CAP_BPF_ADMIN and
> customizable seccomp policy. Meaning, there could be several CAP_BPF
> type sub-policies e.g. from allowing everything (equivalent to the
> /dev/bpf on/off handle or CAP_SYS_ADMIN we have today) down to
> programmable user defined policy that can be tailored to specific
> needs like granting apps to BPF_OBJ_GET and BPF_MAP_LOOKUP elements
> or granting to load+mangle a specific subset of maps (e.g. BPF_MAP_TYPE_{ARRAY,
> HASH,LRU_HASH,LPM_TRIE}) and prog types (...) plus attaching them to
> their own netns, and if that is untrusted, then same restrictions/
> mitigations could be done by the verifier as with (current) unprivileged
> BPF, enabled via programmable policy as well. We wouldn't make any
> static/fixed assumptions, but allow users to define them based on their
> own use-cases. Haven't looked how feasible this would be, but something
> to take into consideration when we rework the current [admittedly
> suboptimal all-or-nothing] model we have. Is this something you had in
> mind as well for your wip proposal, Andy?
>

Hmm.  Fine-grained seccomp stuff like this is very much in scope for
the seccomp discussion that's happening at LPC this year.
Unfortunately, I'm not there, but I'm participating via the mailing
list.

I also finally finished typing a very rough draft of my bpf ideas.
I'll email them out momentarily in a separate email.  I think it
should come fairly close to doing what you want.

^ permalink raw reply

* Re: [PATCH v2 bpf-next 1/4] bpf: unprivileged BPF access via /dev/bpf
From: Daniel Borkmann @ 2019-08-22 14:17 UTC (permalink / raw)
  To: Andy Lutomirski, Alexei Starovoitov
  Cc: Song Liu, Kees Cook, Networking, bpf, Alexei Starovoitov,
	Kernel Team, Lorenz Bauer, Jann Horn, Greg KH, Linux API,
	LSM List, Chenbo Feng
In-Reply-To: <CALCETrXEHL3+NAY6P6vUj7Pvd9ZpZsYC6VCLXOaNxb90a_POGw@mail.gmail.com>

On 8/7/19 7:24 AM, Andy Lutomirski wrote:
> On Mon, Aug 5, 2019 at 6:11 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
>> On Mon, Aug 05, 2019 at 02:25:35PM -0700, Andy Lutomirski wrote:
>>> It tries to make the kernel respect the access modes for fds.  Without
>>> this patch, there seem to be some holes: nothing looked at program fds
>>> and, unless I missed something, you could take a readonly fd for a
>>> program, pin the program, and reopen it RW.
>>
>> I think it's by design. iirc Daniel had a use case for something like this.
> 
> That seems odd.  Daniel, can you elaborate?

[ ... catching up late. ]

Not from my side, the change was added by Chenbo back then for Android
use-case to replace xt_qtaguid and xt_owner with BPF programs and to
allow unprivileged applications to read maps. More on their architecture:

   https://source.android.com/devices/tech/datausage/ebpf-traffic-monitor

 From the cover-letter:

   [...]
   The network-control daemon (netd) creates and loads an eBPF object for
   network packet filtering and analysis. It passes the object FD to an
   unprivileged network monitor app (netmonitor), which is not allowed to
   create, modify or load eBPF objects, but is allowed to read the traffic
   stats from the map.
   [...]

Iuuc, netd would be in charge with the ability to r/w into maps and
pin them, but with the ability to to hand off some map fds as r/o to
unprivileged applications in order for them to query data.

>> Hence unprivileged bpf is actually something that can be deprecated.

There is actually a publicly known use-case on unprivileged bpf wrt
socket filters, see the SO_ATTACH_BPF on sockets section as an example:

   https://blog.cloudflare.com/cloudflare-architecture-and-how-bpf-eats-the-world/

If I'd have to take a good guess, I'd think it's major use-case is in
SO_ATTACH_REUSEPORT_EBPF in the wild, I don't think the sysctl can be
outright flipped or deprecated w/o breaking existing applications unless
it's cleanly modeled into some sort of customizable CAP_BPF* type policy
(more below) where this would be the lowest common denominator.

> I hope not.  There are a couple setsockopt uses right now, and and
> seccomp will surely want it someday.  And the bpf-inside-container use
> case really is unprivileged bpf -- containers are, in many (most?)
> cases, explicitly not trusted by the host.
[...]
>> Inside containers and inside nested containers we need to start processes
>> that will use bpf. All of the processes are trusted.
> 
> Trusted by whom?  In a non-nested container, the container manager
> *might* be trusted by the outside world.  In a *nested* container,
> unless the inner container management is controlled from outside the
> outer container, it's not trusted.  I don't know much about how
> Facebook's containers work, but the LXC/LXD/Podman world is moving
> very strongly toward user namespaces and maximally-untrusted
> containers, and I think bpf() should work in that context.

[...] and if we opt-in with CAP_NET_ADMIN, for example, then it should
ideally be possible for that container to install BPF programs for
mangling, dropping, forwarding etc as long as it's only affecting it's
/own/ netns like the rest of networking subsystem controls that work
in that case. I would actually like to get to this at some point and
make it more approachable as long as there is a way for an admin to
/opt into it/ via policy (aka not by default). Thinking out loud, I'd
love some sort of a hybrid, that is, a mixture of CAP_BPF_ADMIN and
customizable seccomp policy. Meaning, there could be several CAP_BPF
type sub-policies e.g. from allowing everything (equivalent to the
/dev/bpf on/off handle or CAP_SYS_ADMIN we have today) down to
programmable user defined policy that can be tailored to specific
needs like granting apps to BPF_OBJ_GET and BPF_MAP_LOOKUP elements
or granting to load+mangle a specific subset of maps (e.g. BPF_MAP_TYPE_{ARRAY,
HASH,LRU_HASH,LPM_TRIE}) and prog types (...) plus attaching them to
their own netns, and if that is untrusted, then same restrictions/
mitigations could be done by the verifier as with (current) unprivileged
BPF, enabled via programmable policy as well. We wouldn't make any
static/fixed assumptions, but allow users to define them based on their
own use-cases. Haven't looked how feasible this would be, but something
to take into consideration when we rework the current [admittedly
suboptimal all-or-nothing] model we have. Is this something you had in
mind as well for your wip proposal, Andy?

Thanks,
Daniel

^ permalink raw reply

* Re: [PATCH v2] tomoyo: Don't check open/getattr permission on sockets.
From: Tetsuo Handa @ 2019-08-22  7:42 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Al Viro, linux-fsdevel, syzbot, jmorris, linux-kernel,
	linux-security-module, serge, syzkaller-bugs, takedakn,
	David S. Miller
In-Reply-To: <20190822070129.GL6111@zzz.localdomain>

Eric Biggers wrote:
> On Thu, Aug 22, 2019 at 03:55:31PM +0900, Tetsuo Handa wrote:
> > > Also, isn't the same bug in other places too?:
> > > 
> > > 	- tomoyo_path_chmod()
> > > 	- tomoyo_path_chown()
> > > 	- smack_inode_getsecurity()
> > > 	- smack_inode_setsecurity()
> > 
> > What's the bug? The file descriptor returned by open(O_PATH) cannot be
> > passed to read(2), write(2), fchmod(2), fchown(2), fgetxattr(2), mmap(2) etc.
> > 
> 
> chmod(2), chown(2), getxattr(2), and setxattr(2) take a path, not a fd.
> 

OK. Then, is the correct fix

  inode_lock(inode);
  if (SOCKET_I(inode)->sk) {
    // Can access SOCKET_I(sock)->sk->*
  } else {
    // Already close()d. Don't touch.
  }
  inode_unlock(inode);

thanks to

  commit 6d8c50dcb029872b ("socket: close race condition between sock_close() and sockfs_setattr()")
  commit ff7b11aa481f682e ("net: socket: set sock->sk to NULL after calling proto_ops::release()")

changes?

^ permalink raw reply

* Re: New skb extension for use by LSMs (skb "security blob")?
From: Florian Westphal @ 2019-08-22  7:03 UTC (permalink / raw)
  To: Paul Moore; +Cc: netdev, linux-security-module, selinux
In-Reply-To: <CAHC9VhSz1_KA1tCJtNjwK26BOkGhKGbPT7v1O82mWPduvWwd4A@mail.gmail.com>

Paul Moore <paul@paul-moore.com> wrote:
> Hello netdev,
> 
> I was just made aware of the skb extension work, and it looks very
> appealing from a LSM perspective.  As some of you probably remember,
> we (the LSM folks) have wanted a proper security blob in the skb for
> quite some time, but netdev has been resistant to this idea thus far.

Is that "blob" in addition to skb->secmark, or a replacement?

^ permalink raw reply

* Re: [PATCH v2] tomoyo: Don't check open/getattr permission on sockets.
From: Eric Biggers @ 2019-08-22  7:01 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Al Viro, linux-fsdevel, syzbot, jmorris, linux-kernel,
	linux-security-module, serge, syzkaller-bugs, takedakn,
	David S. Miller
In-Reply-To: <201908220655.x7M6tVmv029579@www262.sakura.ne.jp>

On Thu, Aug 22, 2019 at 03:55:31PM +0900, Tetsuo Handa wrote:
> Eric Biggers wrote:
> > What happened to this patch?
> 
> I have to learn how to manage a git tree for sending
> pull requests, but I can't find time to try.
> 
> > 
> > Also, isn't the same bug in other places too?:
> > 
> > 	- tomoyo_path_chmod()
> > 	- tomoyo_path_chown()
> > 	- smack_inode_getsecurity()
> > 	- smack_inode_setsecurity()
> 
> What's the bug? The file descriptor returned by open(O_PATH) cannot be
> passed to read(2), write(2), fchmod(2), fchown(2), fgetxattr(2), mmap(2) etc.
> 

chmod(2), chown(2), getxattr(2), and setxattr(2) take a path, not a fd.

- Eric

^ permalink raw reply

* Re: [PATCH v2] tomoyo: Don't check open/getattr permission on sockets.
From: Tetsuo Handa @ 2019-08-22  6:55 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Al Viro, linux-fsdevel, syzbot, jmorris, linux-kernel,
	linux-security-module, serge, syzkaller-bugs, takedakn,
	David S. Miller
In-Reply-To: <20190822063018.GK6111@zzz.localdomain>

Eric Biggers wrote:
> What happened to this patch?

I have to learn how to manage a git tree for sending
pull requests, but I can't find time to try.

> 
> Also, isn't the same bug in other places too?:
> 
> 	- tomoyo_path_chmod()
> 	- tomoyo_path_chown()
> 	- smack_inode_getsecurity()
> 	- smack_inode_setsecurity()

What's the bug? The file descriptor returned by open(O_PATH) cannot be
passed to read(2), write(2), fchmod(2), fchown(2), fgetxattr(2), mmap(2) etc.

^ permalink raw reply

* Re: [PATCH v2] tomoyo: Don't check open/getattr permission on sockets.
From: Eric Biggers @ 2019-08-22  6:30 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Al Viro, linux-fsdevel, syzbot, jmorris, linux-kernel,
	linux-security-module, serge, syzkaller-bugs, takedakn,
	David S. Miller
In-Reply-To: <8f874b03-b129-205f-5f05-125479701275@i-love.sakura.ne.jp>

Hi Tetsuo,

On Sat, Jun 22, 2019 at 01:45:30PM +0900, Tetsuo Handa wrote:
> On 2019/06/19 5:49, Al Viro wrote:
> > On Sun, Jun 16, 2019 at 03:49:00PM +0900, Tetsuo Handa wrote:
> >> Hello, Al.
> >>
> >> Q1: Do you agree that we should fix TOMOYO side rather than SOCKET_I()->sk
> >>     management.
> > 
> > You do realize that sockets are not unique in that respect, right?
> > All kinds of interesting stuff can be accessed via /proc/*/fd/*, and
> > it _can_ be closed under you.  So I'd suggest checking how your code
> > copes with similar for pipes, FIFOs, epoll, etc., accessed that way...
> 
> I know all kinds of interesting stuff can be accessed via /proc/*/fd/*,
> and it _can_ be closed under me.
> 
> Regarding sockets, I was accessing "struct socket" memory and
> "struct sock" memory which are outside of "struct inode" memory.
> 
> But regarding other objects, I am accessing "struct dentry" memory,
> "struct super_block" memory and "struct inode" memory. I'm expecting
> that these memory can't be kfree()d as long as "struct path" holds
> a reference. Is my expectation correct?
> 
> > 
> > We are _not_ going to be checking that in fs/open.c - the stuff found
> > via /proc/*/fd/* can have the associated file closed by the time
> > we get to calling ->open() and we won't know that until said call.
> 
> OK. Then, fixing TOMOYO side is the correct way.
> 
> > 
> >> Q2: Do you see any problem with using f->f_path.dentry->d_inode ?
> >>     Do we need to use d_backing_inode() or d_inode() ?
> > 
> > Huh?  What's wrong with file_inode(f), in the first place?  And
> > just when can that be NULL, while we are at it?
> 
> Oh, I was not aware of file_inode(). Thanks.
> 
> > 
> >>>  static int tomoyo_inode_getattr(const struct path *path)
> >>>  {
> >>> +	/* It is not safe to call tomoyo_get_socket_name(). */
> >>> +	if (path->dentry->d_inode && S_ISSOCK(path->dentry->d_inode->i_mode))
> >>> +		return 0;
> > 
> > Can that be called for a negative?
> > 
> 
> I check for NULL when I'm not sure it is guaranteed to hold a valid pointer.
> You meant "we are sure that path->dentry->d_inode is valid", don't you?
> 
> By the way, "negative" associates with IS_ERR() range. I guess that
> "NULL" is the better name...
> 
> Anyway, here is V2 patch.
> 
> From c63c4074300921d6d1c33c3b8dc9c84ebfededf5 Mon Sep 17 00:00:00 2001
> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Date: Sat, 22 Jun 2019 13:14:26 +0900
> Subject: [PATCH v2] tomoyo: Don't check open/getattr permission on sockets.
> 
> syzbot is reporting that use of SOCKET_I()->sk from open() can result in
> use after free problem [1], for socket's inode is still reachable via
> /proc/pid/fd/n despite destruction of SOCKET_I()->sk already completed.
> 
> But there is no point with calling security_file_open() on sockets
> because open("/proc/pid/fd/n", !O_PATH) on sockets fails with -ENXIO.
> 
> There is some point with calling security_inode_getattr() on sockets
> because stat("/proc/pid/fd/n") and fstat(open("/proc/pid/fd/n", O_PATH))
> are valid. If we want to access "struct sock"->sk_{family,type,protocol}
> fields, we will need to use security_socket_post_create() hook and
> security_inode_free() hook in order to remember these fields because
> security_sk_free() hook is called before the inode is destructed. But
> since information which can be protected by checking
> security_inode_getattr() on sockets is trivial, let's not be bothered by
> "struct inode"->i_security management.
> 
> There is point with calling security_file_ioctl() on sockets. Since
> ioctl(open("/proc/pid/fd/n", O_PATH)) is invalid, security_file_ioctl()
> on sockets should remain safe.
> 
> [1] https://syzkaller.appspot.com/bug?id=73d590010454403d55164cca23bd0565b1eb3b74
> 
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Reported-by: syzbot <syzbot+0341f6a4d729d4e0acf1@syzkaller.appspotmail.com>
> ---
>  security/tomoyo/tomoyo.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/security/tomoyo/tomoyo.c b/security/tomoyo/tomoyo.c
> index 716c92e..8ea3f5d 100644
> --- a/security/tomoyo/tomoyo.c
> +++ b/security/tomoyo/tomoyo.c
> @@ -126,6 +126,9 @@ static int tomoyo_bprm_check_security(struct linux_binprm *bprm)
>   */
>  static int tomoyo_inode_getattr(const struct path *path)
>  {
> +	/* It is not safe to call tomoyo_get_socket_name(). */
> +	if (S_ISSOCK(d_inode(path->dentry)->i_mode))
> +		return 0;
>  	return tomoyo_path_perm(TOMOYO_TYPE_GETATTR, path, NULL);
>  }
>  
> @@ -316,6 +319,9 @@ static int tomoyo_file_open(struct file *f)
>  	/* Don't check read permission here if called from do_execve(). */
>  	if (current->in_execve)
>  		return 0;
> +	/* Sockets can't be opened by open(). */
> +	if (S_ISSOCK(file_inode(f)->i_mode))
> +		return 0;
>  	return tomoyo_check_open_permission(tomoyo_domain(), &f->f_path,
>  					    f->f_flags);
>  }
> -- 

What happened to this patch?

Also, isn't the same bug in other places too?:

	- tomoyo_path_chmod()
	- tomoyo_path_chown()
	- smack_inode_getsecurity()
	- smack_inode_setsecurity()

- Eric

^ permalink raw reply

* [PATCH] smack: use GFP_NOFS while holding inode_smack::smk_lock
From: Eric Biggers @ 2019-08-22  5:54 UTC (permalink / raw)
  To: Casey Schaufler, linux-security-module
  Cc: syzkaller-bugs, syzbot+0eefc1e06a77d327a056, stable
In-Reply-To: <CACT4Y+aGjkmq4cEaQXagd_YqjE4a1HoNgcEzqeNj-g0Hg_hQHw@mail.gmail.com>

From: Eric Biggers <ebiggers@google.com>

inode_smack::smk_lock is taken during smack_d_instantiate(), which is
called during a filesystem transaction when creating a file on ext4.
Therefore to avoid a deadlock, all code that takes this lock must use
GFP_NOFS, to prevent memory reclaim from waiting for the filesystem
transaction to complete.

Reported-by: syzbot+0eefc1e06a77d327a056@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 security/smack/smack_access.c | 6 +++---
 security/smack/smack_lsm.c    | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/security/smack/smack_access.c b/security/smack/smack_access.c
index f1c93a7be9ec..38ac3da4e791 100644
--- a/security/smack/smack_access.c
+++ b/security/smack/smack_access.c
@@ -465,7 +465,7 @@ char *smk_parse_smack(const char *string, int len)
 	if (i == 0 || i >= SMK_LONGLABEL)
 		return ERR_PTR(-EINVAL);
 
-	smack = kzalloc(i + 1, GFP_KERNEL);
+	smack = kzalloc(i + 1, GFP_NOFS);
 	if (smack == NULL)
 		return ERR_PTR(-ENOMEM);
 
@@ -500,7 +500,7 @@ int smk_netlbl_mls(int level, char *catset, struct netlbl_lsm_secattr *sap,
 			if ((m & *cp) == 0)
 				continue;
 			rc = netlbl_catmap_setbit(&sap->attr.mls.cat,
-						  cat, GFP_KERNEL);
+						  cat, GFP_NOFS);
 			if (rc < 0) {
 				netlbl_catmap_free(sap->attr.mls.cat);
 				return rc;
@@ -536,7 +536,7 @@ struct smack_known *smk_import_entry(const char *string, int len)
 	if (skp != NULL)
 		goto freeout;
 
-	skp = kzalloc(sizeof(*skp), GFP_KERNEL);
+	skp = kzalloc(sizeof(*skp), GFP_NOFS);
 	if (skp == NULL) {
 		skp = ERR_PTR(-ENOMEM);
 		goto freeout;
diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index 50c536cad85b..7e4d3145a018 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -288,7 +288,7 @@ static struct smack_known *smk_fetch(const char *name, struct inode *ip,
 	if (!(ip->i_opflags & IOP_XATTR))
 		return ERR_PTR(-EOPNOTSUPP);
 
-	buffer = kzalloc(SMK_LONGLABEL, GFP_KERNEL);
+	buffer = kzalloc(SMK_LONGLABEL, GFP_NOFS);
 	if (buffer == NULL)
 		return ERR_PTR(-ENOMEM);
 
-- 
2.22.1


^ permalink raw reply related

* Re: New skb extension for use by LSMs (skb "security blob")?
From: David Miller @ 2019-08-22  3:54 UTC (permalink / raw)
  To: paul; +Cc: netdev, linux-security-module, selinux
In-Reply-To: <CAHC9VhRLexftb5mK8_izVQkv9w46m=aPukws2d2m+yrMvHUF_g@mail.gmail.com>

From: Paul Moore <paul@paul-moore.com>
Date: Wed, 21 Aug 2019 23:27:03 -0400

> On Wed, Aug 21, 2019 at 6:50 PM David Miller <davem@davemloft.net> wrote:
>> From: Paul Moore <paul@paul-moore.com>
>> Date: Wed, 21 Aug 2019 18:00:09 -0400
>>
>> > I was just made aware of the skb extension work, and it looks very
>> > appealing from a LSM perspective.  As some of you probably remember,
>> > we (the LSM folks) have wanted a proper security blob in the skb for
>> > quite some time, but netdev has been resistant to this idea thus far.
>> >
>> > If I were to propose a patchset to add a SKB_EXT_SECURITY skb
>> > extension (a single extension ID to be shared among the different
>> > LSMs), would that be something that netdev would consider merging, or
>> > is there still a philosophical objection to things like this?
>>
>> Unlike it's main intended user (MPTCP), it sounds like LSM's would use
>> this in a way such that it would be enabled on most systems all the
>> time.
>>
>> That really defeats the whole purpose of making it dynamic. :-/
> 
> I would be okay with only adding a skb extension when we needed it,
> which I'm currently thinking would only be when we had labeled
> networking actually configured at runtime and not just built into the
> kernel.  In SELinux we do something similar today when it comes to our
> per-packet access controls; if labeled networking is not configured we
> bail out of the LSM hooks early to improve performance (we would just
> be comparing unlabeled_t to unlabeled_t anyway).  I think the other
> LSMs would be okay with this usage as well.
> 
> While a number of distros due enable some form of LSM and the labeled
> networking bits at build time, vary few (if any?) provide a default
> configuration so I would expect no additional overhead in the common
> case.
> 
> Would that be acceptable?

I honestly don't know, I kinda feared that once the SKB extension went in
people would start dumping things there and that's exactly what's happening.
I just so happened to be reviewing:

	https://patchwork.ozlabs.org/patch/1150091/

while you were writing this email.

It's rediculous, the vultures are out.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox