From: "Mickaël Salaün" <mic@digikod.net>
To: Sargun Dhillon <sargun@sargun.me>
Cc: Kees Cook <keescook@chromium.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	linux-security-module <linux-security-module@vger.kernel.org>,
	Network Development <netdev@vger.kernel.org>,
	"Reshetova, Elena" <elena.reshetova@intel.com>
Subject: Re: [RFC 0/4] RFC: Add Checmate, BPF-driven minor LSM
Date: Mon, 15 Aug 2016 12:59:13 +0200
Message-ID: <57B1A081.9030209@digikod.net>
In-Reply-To: <20160815030952.GC31242@ircssh.c.rugged-nimbus-611.internal>



On 15/08/2016 05:09, Sargun Dhillon wrote:
> On Mon, Aug 15, 2016 at 12:57:44AM +0200, Mickaël Salaün wrote:
>> Our approaches have some common points (i.e. use eBPF in an LSM, stacked 
>> filters like seccomp) but I'm focused on a kind of unprivileged LSM (i.e. no 
>> CAP_SYS_ADMIN), to make standalone sandboxes, which brings more constraints 
>> (e.g. no use of unsafe functions like bpf_probe_read(), take care of privacy, 
>> SUID exec, stable ABI…). However, I don't want to handle resource limits, 
>> which should be the job of cgroups.
>>
> Kind of. Sometimes describing these resource limits is difficult. For example, I 
> have a customer who is trying to restrict containers from burning up all the 
> ephemeral ports on the machine. In this, they have an incredibly elaborate chain 
> of wiring to prevent a given container from connecting to the same (proto, 
> destip, destport) more than 1000 times.
> 
> I'm unsure of how you'd model that in a cgroup. 

This looks like a Netfilter rule. Have you tried applying this limitation with the connlimit module?
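
To make the quoted requirement concrete, here is a minimal Python sketch of the policy itself: refuse a new connection once a given (proto, destip, destport) tuple already has too many open. It is not how connlimit works internally. The names `ConnectionLimiter`, `try_connect`, and `close` are hypothetical, and the sketch assumes the limit of 1000 applies to concurrently open connections tracked by one limiter instance per container.

```python
from collections import Counter


class ConnectionLimiter:
    """Caps concurrently open connections per (proto, dest_ip, dest_port).

    Hypothetical illustration only: one instance is assumed per container.
    """

    def __init__(self, limit=1000):
        self.limit = limit
        self.active = Counter()  # open connections per destination tuple

    def try_connect(self, proto, dest_ip, dest_port):
        """Return True and record the connection if the tuple is under its cap."""
        key = (proto, dest_ip, dest_port)
        if self.active[key] >= self.limit:
            return False
        self.active[key] += 1
        return True

    def close(self, proto, dest_ip, dest_port):
        """Release one previously recorded connection for the tuple."""
        key = (proto, dest_ip, dest_port)
        if self.active[key] > 0:
            self.active[key] -= 1
```

Each destination tuple is counted independently, so exhausting the cap for one destination does not block connections to another.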


> 
>> For now, I'm focusing on file-system access control which is one of the more 
>> complex systems to filter properly. I also plan to support basic network access 
>> control.
>>
>> What you are trying to accomplish seems more related to a Netfilter extension 
>> (something like ipset but with eBPF maybe?).
>>
> I don't only want to do network access control, I also want to write to the 
> value once it's copied into kernel space. There are a lot of benefits of doing 
> this at the syscall level, but the two primary ones are performance, and 
> capability. 
> 
> One of the biggest complaints with our current approach to filtering & load 
> balancing (iptables) is that it hides information. When people connect through 
> the load balancer, they want to find out who they connected to, and without some 
> high application-level mechanism, this isn't possible. On the other hand, if we 
> just rewrite the destination address in the connect hook, we can pretty easily
> allow them to do getpeername.

What exactly is not doable with Netfilter (e.g. REDIRECT or TPROXY)?
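
As a small illustration of the getpeername point in the quoted text, the following Python sketch shows that getpeername() on a connected client socket returns the address that socket was connected to. The helper name `peer_seen_by_client` is hypothetical, and the loopback listener exists only for the demo. It does not set up any Netfilter rule or connect hook.

```python
import socket


def peer_seen_by_client():
    """Connect to a throwaway loopback listener.

    Returns (listener_addr, peer_addr), where peer_addr is what the
    client socket reports through getpeername().
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
        server.listen(1)
        listener_addr = server.getsockname()
        # The handshake completes against the listen backlog, so no accept()
        # call is needed for the client to become connected.
        with socket.create_connection(listener_addr) as client:
            return listener_addr, client.getpeername()
```

Calling `peer_seen_by_client()` returns two equal address tuples here, because nothing between the client and the listener rewrites the destination.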


> 
> I'm curious about your filesystem access limiter. Do you have a way to make it so
> that a given container can only write, say, 100mb of data to disk? 

It's a filesystem access-control mechanism. It doesn't deal with quotas, and it is focused on process hierarchies (which is more generic) rather than on containers.

What is not doable with a quota mount option? It may be more appropriate to enhance the VFS (or overlayfs) to apply this kind of limitation, if needed.



