Linux Security Modules development
 help / color / mirror / Atom feed
* Re: [RFC PATCH 11/10] pipe: Add fsync() support [ver #2]
From: David Howells @ 2019-11-02 19:34 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: dhowells, Konstantin Khlebnikov, Rasmus Villemoes,
	Greg Kroah-Hartman, Peter Zijlstra, Nicolas Dichtel, raven,
	Christian Brauner, keyrings, linux-usb, linux-block, LSM List,
	linux-fsdevel, Linux API, Linux Kernel Mailing List
In-Reply-To: <CAHk-=wg_X_7JSYT-a3qHrzvuWGMyffDWtQ4n7adBp_fe5w0BsA@mail.gmail.com>

Linus Torvalds <torvalds@linux-foundation.org> wrote:

> > > Similar synchronization is required for reusing memory after vmsplice()?
> > > I don't see other way how sender could safely change these pages.

Actually, it's probably worse than that.  If the output of the pipe gets teed
or spliced somewhere else, you still don't know when the vmspliced pages are
finished with.

David


^ permalink raw reply

* Re: [RFC PATCH 11/10] pipe: Add fsync() support [ver #2]
From: Andy Lutomirski @ 2019-11-02 20:31 UTC (permalink / raw)
  To: David Howells
  Cc: Linus Torvalds, Konstantin Khlebnikov, Rasmus Villemoes,
	Greg Kroah-Hartman, Peter Zijlstra, Nicolas Dichtel, raven,
	Christian Brauner, keyrings, linux-usb, linux-block, LSM List,
	linux-fsdevel, Linux API, Linux Kernel Mailing List
In-Reply-To: <25886.1572723272@warthog.procyon.org.uk>



> On Nov 2, 2019, at 12:34 PM, David Howells <dhowells@redhat.com> wrote:
> 
> Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
>>>> Similar synchronization is required for reusing memory after vmsplice()?
>>>> I don't see other way how sender could safely change these pages.
> 
> Actually, it's probably worse than that.  If the output of the pipe gets teed
> or spliced somewhere else, you still don't know when the vmspliced pages are
> finished with.
> 
> 

I sometimes wonder whether vmsplice should be disallowed or severely restricted. Even ignoring these usability issues, it makes me very uncomfortable that you can have some data queue up on a pipe, tee() it, and get *different* data in the original pipe and the teed copy because the sender used vmsplice and is messing with you.

Add in the fact that it’s not obvious that vmsplice *can* be used correctly, and I’m wondering if we should just remove it or make it just do write() under the hood.

I suppose the kernel could guarantee that it stops referring to the vmsplice source pages as soon as anything sees *or* tees the data. This way it would be, at least in principle, possible to say “hey, the pipe has consumed the first n vmspliced bytes, so I can reuse that memory”.

^ permalink raw reply

* Re: [RFC PATCH 11/10] pipe: Add fsync() support [ver #2]
From: Linus Torvalds @ 2019-11-02 22:03 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: David Howells, Konstantin Khlebnikov, Rasmus Villemoes,
	Greg Kroah-Hartman, Peter Zijlstra, Nicolas Dichtel, raven,
	Christian Brauner, keyrings, linux-usb, linux-block, LSM List,
	linux-fsdevel, Linux API, Linux Kernel Mailing List
In-Reply-To: <CC3328CC-2F05-461E-AAC3-8DEBAB1BA162@amacapital.net>

On Sat, Nov 2, 2019 at 1:31 PM Andy Lutomirski <luto@amacapital.net> wrote:
>
> Add in the fact that it’s not obvious that vmsplice *can* be used correctly, and I’m wondering if we should just remove it or make it just do write() under the hood.

Sure it can. Just don't modify the data you vmsplice. It's really that simple.

That said, if we don't have any actual users, then we should look at
removing it (maybe turning it into "write()" as you say). Not because
it's hard to use, but simply because it probably doesn't have that
many uses.

               Linus

^ permalink raw reply

* Re: [RFC PATCH 11/10] pipe: Add fsync() support [ver #2]
From: Linus Torvalds @ 2019-11-02 22:09 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: David Howells, Konstantin Khlebnikov, Rasmus Villemoes,
	Greg Kroah-Hartman, Peter Zijlstra, Nicolas Dichtel, raven,
	Christian Brauner, keyrings, linux-usb, linux-block, LSM List,
	linux-fsdevel, Linux API, Linux Kernel Mailing List
In-Reply-To: <CAHk-=wj1BLz6s9cG9Ptk4ULxrTy=MkF7ZH=HF67d7M5HL1fd_A@mail.gmail.com>

On Sat, Nov 2, 2019 at 3:03 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Sat, Nov 2, 2019 at 1:31 PM Andy Lutomirski <luto@amacapital.net> wrote:
> >
> > Add in the fact that it’s not obvious that vmsplice *can* be used correctly, and I’m wondering if we should just remove it or make it just do write() under the hood.
>
> Sure it can. Just don't modify the data you vmsplice. It's really that simple.
>
> That said, if we don't have any actual users, then we should look at
> removing it (maybe turning it into "write()" as you say). Not because
> it's hard to use, but simply because it probably doesn't have that
> many uses.

Looking at debian code search, there are _some_ uses (including
openssl and fuse):

  https://codesearch.debian.net/search?q=%3D+vmsplice%28&literal=1

but I didn't check any more closely what they do.

             Linus

^ permalink raw reply

* Re: [RFC PATCH 11/10] pipe: Add fsync() support [ver #2]
From: Andy Lutomirski @ 2019-11-02 22:30 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Howells, Konstantin Khlebnikov, Rasmus Villemoes,
	Greg Kroah-Hartman, Peter Zijlstra, Nicolas Dichtel, raven,
	Christian Brauner, keyrings, linux-usb, linux-block, LSM List,
	linux-fsdevel, Linux API, Linux Kernel Mailing List
In-Reply-To: <CAHk-=wj1BLz6s9cG9Ptk4ULxrTy=MkF7ZH=HF67d7M5HL1fd_A@mail.gmail.com>



> On Nov 2, 2019, at 3:04 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
> On Sat, Nov 2, 2019 at 1:31 PM Andy Lutomirski <luto@amacapital.net> wrote:
>> 
>> Add in the fact that it’s not obvious that vmsplice *can* be used correctly, and I’m wondering if we should just remove it or make it just do write() under the hood.
> 
> Sure it can. Just don't modify the data you vmsplice. It's really that simple.

So you allocate memory, vmsplice, and munmap() without reusing it?  Just plain free() won’t be good enough. I suspect the TLB overhead will make this a loss in most workloads?

Or maybe you vmsplice from a read-only mapping of a file that you know no one modifies?  This could be useful, but you can just splice() from the file directly.

^ permalink raw reply

* Re: [RFC PATCH 11/10] pipe: Add fsync() support [ver #2]
From: Linus Torvalds @ 2019-11-02 23:02 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: David Howells, Konstantin Khlebnikov, Rasmus Villemoes,
	Greg Kroah-Hartman, Peter Zijlstra, Nicolas Dichtel, raven,
	Christian Brauner, keyrings, linux-usb, linux-block, LSM List,
	linux-fsdevel, Linux API, Linux Kernel Mailing List
In-Reply-To: <E590C3AF-1D09-4927-B83F-DD0A6A148B6D@amacapital.net>

On Sat, Nov 2, 2019 at 3:30 PM Andy Lutomirski <luto@amacapital.net> wrote:
>
> So you allocate memory, vmsplice, and munmap() without reusing it?

You can re-use it as much as you want. Just don't write to it.

So the traditional argument for this was "I do a caching http server".
If you don't ever load the data into user space at all and just push
file data out, you just use splice() from the file to the target. But
if you generate some of the data in memory, and you cache it, you use
vmsplice().

And then it really is very easy to set up: make sure you generate your
caches with a new clean private mmap, and you can throw them out with
munmap (or just over-mmap it with the new cache, of course).

If you don't cache it, then there's no advantage to vmsplice() - just
write() it and forget about it. The whole (and only) point of
vmsplice() is when you want to zero-copy the data, and that's
generally likely only an advantage if you can do it multiple times.

But I don't think anybody actually _did_ any of that. But that's
basically the argument for the three splice operations:
write/vmsplice/splice(). Which one you use depends on the lifetime and
the source of your data. write() is obviously for the copy case (the
source data might not be stable), while splice() is for the "data from
another source", and vmsplace() is "data is from stable data in my
vm".

There's the reverse op, of course, but we never implemented that:
mmap() on the pipe could do the reverse of a vmsplice() (moving from
the pipe to the vm), but it would only work if everything was
page-aligned, which it effectively never is. It's basically a
benchmark-only operation.

And the existence of vmsplice() is because we actually had code to
play games with making write() do a zero-copy but mark the source as
being COW. It was _wonderful_ for benchmarks, and was completely
useless for real world case because in the real world you always took
the COW fault. So vmsplice() is basically a "hey, I know what I'm
doing, and you can just take the page as-is because the source is
stable".

             Linus

^ permalink raw reply

* Re: [RFC PATCH 11/10] pipe: Add fsync() support [ver #2]
From: Linus Torvalds @ 2019-11-02 23:09 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: David Howells, Konstantin Khlebnikov, Rasmus Villemoes,
	Greg Kroah-Hartman, Peter Zijlstra, Nicolas Dichtel, raven,
	Christian Brauner, keyrings, linux-usb, linux-block, LSM List,
	linux-fsdevel, Linux API, Linux Kernel Mailing List
In-Reply-To: <CAHk-=wgzRU9RjkZG0L9_yrnFN69REkrSokTQOGZMUkvdispvuQ@mail.gmail.com>

On Sat, Nov 2, 2019 at 4:02 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> But I don't think anybody actually _did_ any of that. But that's
> basically the argument for the three splice operations:
> write/vmsplice/splice(). Which one you use depends on the lifetime and
> the source of your data. write() is obviously for the copy case (the
> source data might not be stable), while splice() is for the "data from
> another source", and vmsplace() is "data is from stable data in my
> vm".

Btw, it's really worth noting that "splice()" and friends are from a
more happy-go-lucky time when we were experimenting with new
interfaces, and in a day and age when people thought that interfaces
like "sendpage()" and zero-copy and playing games with the VM was a
great thing to do.

It turns out that VM games are almost always more expensive than just
copying the data in the first place, but hey, people didn't know that,
and zero-copy was seen a big deal.

The reality is that almost nobody uses splice and vmsplice at all, and
they have been a much bigger headache than they are worth. If I could
go back in time and not do them, I would. But there have been a few
very special uses that seem to actually like the interfaces.

But it's entirely possible that we should kill vmsplice() (likely by
just implementing the semantics as "write()") because it's not common
enough to have the complexity.

             Linus

^ permalink raw reply

* Re: [RFC PATCH 11/10] pipe: Add fsync() support [ver #2]
From: Andy Lutomirski @ 2019-11-02 23:14 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Howells, Konstantin Khlebnikov, Rasmus Villemoes,
	Greg Kroah-Hartman, Peter Zijlstra, Nicolas Dichtel, raven,
	Christian Brauner, keyrings, USB list, linux-block, LSM List,
	linux-fsdevel, Linux API, Linux Kernel Mailing List
In-Reply-To: <CAHk-=wgPQutQ8d8kUCvAFi+hfNWgaNLiZPkbg-GXY2DCtD-Z5Q@mail.gmail.com>

On Sat, Nov 2, 2019 at 4:10 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Sat, Nov 2, 2019 at 4:02 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > But I don't think anybody actually _did_ any of that. But that's
> > basically the argument for the three splice operations:
> > write/vmsplice/splice(). Which one you use depends on the lifetime and
> > the source of your data. write() is obviously for the copy case (the
> > source data might not be stable), while splice() is for the "data from
> > another source", and vmsplace() is "data is from stable data in my
> > vm".
>
> Btw, it's really worth noting that "splice()" and friends are from a
> more happy-go-lucky time when we were experimenting with new
> interfaces, and in a day and age when people thought that interfaces
> like "sendpage()" and zero-copy and playing games with the VM was a
> great thing to do.

I suppose a nicer interface might be:


madvise(buf, len, MADV_STABILIZE);

(MADV_STABILIZE is an imaginary operation that write protects the
memory a la fork() but without the copying part.)

vmsplice_safer(fd, ...);

Where vmsplice_safer() is like vmsplice, except that it only works on
write-protected pages.  If you vmsplice_safer() some memory and then
write to the memory, the pipe keeps the old copy.

But this can all be done with memfd and splice, too, I think.


>
> It turns out that VM games are almost always more expensive than just
> copying the data in the first place, but hey, people didn't know that,
> and zero-copy was seen a big deal.
>
> The reality is that almost nobody uses splice and vmsplice at all, and
> they have been a much bigger headache than they are worth. If I could
> go back in time and not do them, I would. But there have been a few
> very special uses that seem to actually like the interfaces.
>
> But it's entirely possible that we should kill vmsplice() (likely by
> just implementing the semantics as "write()") because it's not common
> enough to have the complexity.

I think this is the right choice.

FWIW, the openssl vmsplice() call looks dubious, but I suspect it's
okay because it's vmsplicing to a netlink socket, and the kernel code
on the other end won't read the data after it returns a response.

--Andy

^ permalink raw reply

* Re: [RFC PATCH 07/10] pipe: Conditionalise wakeup in pipe_read() [ver #2]
From: Konstantin Khlebnikov @ 2019-11-03 11:04 UTC (permalink / raw)
  To: David Howells
  Cc: torvalds, Rasmus Villemoes, Greg Kroah-Hartman, Peter Zijlstra,
	nicolas.dichtel, raven, Christian Brauner, keyrings, linux-usb,
	linux-block, linux-security-module, linux-fsdevel, linux-api,
	linux-kernel
In-Reply-To: <3165.1572539884@warthog.procyon.org.uk>

On 31/10/2019 19.38, David Howells wrote:
> Okay, attached is a change that might give you what you want.  I tried my
> pipe-bench program (see cover note) with perf.  The output of the program with
> the patch applied was:
> 
> -       pipe                  305127298     36262221772       302185181         7887690
> 
> The output of perf with the patch applied:
> 
>          239,943.92 msec task-clock                #    1.997 CPUs utilized
>              17,728      context-switches          #   73.884 M/sec
>                 124      cpu-migrations            #    0.517 M/sec
>               9,330      page-faults               #   38.884 M/sec
>     885,107,207,365      cycles                    # 3688822.793 GHz
>   1,386,873,499,490      instructions              #    1.57  insn per cycle
>     311,037,372,339      branches                  # 1296296921.931 M/sec
>          33,467,827      branch-misses             #    0.01% of all branches
> 
> And without:
> 
>          239,891.87 msec task-clock                #    1.997 CPUs utilized
>              22,187      context-switches          #   92.488 M/sec
>                 133      cpu-migrations            #    0.554 M/sec
>               9,334      page-faults               #   38.909 M/sec
>     884,906,976,128      cycles                    # 3688787.725 GHz
>   1,391,986,932,265      instructions              #    1.57  insn per cycle
>     311,394,686,857      branches                  # 1298067400.849 M/sec
>          30,242,823      branch-misses             #    0.01% of all branches
> 
> So it did make something like a 20% reduction in context switches.

Ok. Looks promising. Depending on workload reduction might be much bigger.

I suppose buffer resize (grow) makes wakeup unconditionally. Should be ok.

> 
> David
> ---
> diff --git a/fs/pipe.c b/fs/pipe.c
> index e3d5f7a39123..5167921edd73 100644
> --- a/fs/pipe.c
> +++ b/fs/pipe.c
> @@ -276,7 +276,7 @@ pipe_read(struct kiocb *iocb, struct iov_iter *to)
>   	size_t total_len = iov_iter_count(to);
>   	struct file *filp = iocb->ki_filp;
>   	struct pipe_inode_info *pipe = filp->private_data;
> -	int do_wakeup;
> +	int do_wakeup, wake;
>   	ssize_t ret;
> 
>   	/* Null read succeeds. */
> @@ -329,11 +329,12 @@ pipe_read(struct kiocb *iocb, struct iov_iter *to)
>   				tail++;
>   				pipe->tail = tail;
>   				do_wakeup = 1;
> -				if (head - (tail - 1) == pipe->max_usage)
> +				wake = head - (tail - 1) == pipe->max_usage / 2;
> +				if (wake)
>   					wake_up_interruptible_sync_poll_locked(
>   						&pipe->wait, EPOLLOUT | EPOLLWRNORM);
>   				spin_unlock_irq(&pipe->wait.lock);
> -				if (head - (tail - 1) == pipe->max_usage)
> +				if (wake)
>   					kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
>   			}
>   			total_len -= chars;
> 

^ permalink raw reply

* Re: [RFC PATCH 04/10] pipe: Use head and tail pointers for the ring, not cursor and length [ver #2]
From: Matthew Wilcox @ 2019-11-03 11:17 UTC (permalink / raw)
  To: David Howells
  Cc: Linus Torvalds, Rasmus Villemoes, Greg Kroah-Hartman,
	Peter Zijlstra, Nicolas Dichtel, raven, Christian Brauner,
	keyrings, linux-usb, linux-block, LSM List, linux-fsdevel,
	Linux API, Linux Kernel Mailing List
In-Reply-To: <24075.1572533871@warthog.procyon.org.uk>

On Thu, Oct 31, 2019 at 02:57:51PM +0000, David Howells wrote:
> Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
> > It's shorter and more obvious to just write
> > 
> >    pipe->head = head;
> > 
> > than it is to write
> > 
> >    pipe_commit_write(pipe, head);
> 
> But easier to find the latter.  But whatever.

May I suggest that you use a name that's easier to grep for?

$ git grep -cw p_tail
drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwsignal.c:9
$ git grep -cw p_head
drivers/net/wireless/broadcom/brcm80211/brcmfmac/fwsignal.c:3


^ permalink raw reply

* Re: [RFC PATCH 11/10] pipe: Add fsync() support [ver #2]
From: Konstantin Khlebnikov @ 2019-11-03 12:02 UTC (permalink / raw)
  To: Andy Lutomirski, Linus Torvalds
  Cc: David Howells, Rasmus Villemoes, Greg Kroah-Hartman,
	Peter Zijlstra, Nicolas Dichtel, raven, Christian Brauner,
	keyrings, USB list, linux-block, LSM List, linux-fsdevel,
	Linux API, Linux Kernel Mailing List, Miklos Szeredi
In-Reply-To: <CALCETrWZjW88OY2mh7v8cUU_6XTSJTkQhAfNbSC17AdhEWwVAA@mail.gmail.com>

On 03/11/2019 02.14, Andy Lutomirski wrote:
> On Sat, Nov 2, 2019 at 4:10 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> On Sat, Nov 2, 2019 at 4:02 PM Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>>>
>>> But I don't think anybody actually _did_ any of that. But that's
>>> basically the argument for the three splice operations:
>>> write/vmsplice/splice(). Which one you use depends on the lifetime and
>>> the source of your data. write() is obviously for the copy case (the
>>> source data might not be stable), while splice() is for the "data from
>>> another source", and vmsplace() is "data is from stable data in my
>>> vm".
>>
>> Btw, it's really worth noting that "splice()" and friends are from a
>> more happy-go-lucky time when we were experimenting with new
>> interfaces, and in a day and age when people thought that interfaces
>> like "sendpage()" and zero-copy and playing games with the VM was a
>> great thing to do.
> 
> I suppose a nicer interface might be:
> 
> 
> madvise(buf, len, MADV_STABILIZE);
> 
> (MADV_STABILIZE is an imaginary operation that write protects the
> memory a la fork() but without the copying part.)
> 
> vmsplice_safer(fd, ...);
> 
> Where vmsplice_safer() is like vmsplice, except that it only works on
> write-protected pages.  If you vmsplice_safer() some memory and then
> write to the memory, the pipe keeps the old copy.
> 
> But this can all be done with memfd and splice, too, I think.

Looks monstrous. This will kill all fun and profit. =)

I think vmsplice should at least deprecate and ignore SPLICE_F_GIFT.

It almost never works - if page still mapped then page_count in
generic_pipe_buf_steal() will be at least 2 (pte and pipe gup).
But if user munmap vma between splicing and consuming (and page not
stuck in lazy tlb and per-cpu vectors) then page from anon lru
could be spliced into file. Ouch.

And looks like fuse device still accepts SPLICE_F_MOVE.

> 
> 
>>
>> It turns out that VM games are almost always more expensive than just
>> copying the data in the first place, but hey, people didn't know that,
>> and zero-copy was seen a big deal.
>>
>> The reality is that almost nobody uses splice and vmsplice at all, and
>> they have been a much bigger headache than they are worth. If I could
>> go back in time and not do them, I would. But there have been a few
>> very special uses that seem to actually like the interfaces.
>>
>> But it's entirely possible that we should kill vmsplice() (likely by
>> just implementing the semantics as "write()") because it's not common
>> enough to have the complexity.
> 
> I think this is the right choice.
> 
> FWIW, the openssl vmsplice() call looks dubious, but I suspect it's
> okay because it's vmsplicing to a netlink socket, and the kernel code
> on the other end won't read the data after it returns a response.
> 
> --Andy
> 

^ permalink raw reply

* Re: [Patch v3 6/7] doc: keys: Document usage of TEE based Trusted Keys
From: Sumit Garg @ 2019-11-04  6:58 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Jens Wiklander, dhowells, Jonathan Corbet, jejb, Mimi Zohar,
	James Morris, Serge E. Hallyn, Casey Schaufler, Ard Biesheuvel,
	Daniel Thompson, Stuart Yoder, Janne Karhunen,
	open list:ASYMMETRIC KEYS, linux-integrity, linux-security-module,
	Linux Doc Mailing List, Linux Kernel Mailing List,
	linux-arm-kernel, tee-dev @ lists . linaro . org
In-Reply-To: <20191101201957.GA8369@linux.intel.com>

On Sat, 2 Nov 2019 at 01:50, Jarkko Sakkinen
<jarkko.sakkinen@linux.intel.com> wrote:
>
> On Fri, Nov 01, 2019 at 03:04:18PM +0530, Sumit Garg wrote:
>
> > Isn't this statement contradicting with your earlier statement
> > regarding the right order would be to complete TEE patches review
> > first and then come up with documentation here [2]?
> >
> > [1] https://lore.kernel.org/linux-integrity/1568025601.4614.253.camel@linux.ibm.com/
> > [2] https://lore.kernel.org/linux-integrity/20190909163643.qxmzpcggi567hmhv@linux.intel.com/
>
> With the intersecting issues, namely key generation and conflicting
> keyctl parameters, that was not a well considered statement.

Okay, let me work on documentation first, but I think resending whole
patch-set just for documentation review and rework would be an
overkill. Would minor revisions of this patch only like v3.1, v3.2
etc. work for you? And later I could send next version of this
patch-set once we agree on documentation.

-Sumit

>
> /Jarkko

^ permalink raw reply

* Re: kernel panic while using get_random_bytes
From: Temp Sha @ 2019-11-04  9:35 UTC (permalink / raw)
  To: linux-security-module
In-Reply-To: <CANe=CUmBHF=L8EUVvupGSYKY_m2PH_4aH=pL7Fky57BY4JQdvA@mail.gmail.com>

hi, can any one help me with this issue?  please let me know if this
is not correct form to ask this question?
thanks

On Mon, Oct 14, 2019 at 1:51 PM Temp Sha <temp.sha@gmail.com> wrote:
>
> hi,
>
> i use get_random_bytes() function for my randomness requirement in
> kernel version 4.14.142
> however is gives panic as soon as I call   get_random_bytes() in my module.
>
>
> Oct 10 07:20:18 BUG: unable to handle kernel paging request at 00007f5563ced000
> IP: chacha20_block+0x24d/0x280
> PGD 800000010f7f8067 P4D 800000010f7f8067 PUD 161316067 PMD 1015a8067 PTE 0
> Oops: 0002 [#1] PREEMPT SMP PTI
> Modules linked in: ipi_hsl(PO) mymod(PO) e1000 ipv6 ftdi_sio usbserial
> xt_tcpudp xt_mark iptable_nat nf_nat_ipv4 nf_conntrack_ipv4
> nf_defrag_ipv4 nf_nat xt_connlimit nf_conntrack iptable_filter
> ip_tables x_tables
> CPU: 0 PID: 1841 Comm: hexdump Tainted: P           O    4.14.142-ws-symbol #1
> Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
> task: ffff8881611da000 task.stack: ffffc900504c4000
> RIP: 0010:chacha20_block+0x24d/0x280
> RSP: 0018:ffffc900504c7c70 EFLAGS: 00010886
> RAX: 0000000000000000 RBX: 00000000a88c95b0 RCX: 00007f5563ced000
> RDX: ffff88810f79da00 RSI: 0000000015c4332e RDI: 000000007613f298
> RBP: ffffc900504c7d00 R08: 000000009d39d68d R09: 00000000bfbdb51f
> R10: 00000000ed798a26 R11: 0000000083c184dc R12: 0000000036fc61e0
> R13: 00000000f9004639 R14: 0000000042c0d351 R15: 000000008a6cef0f
> FS:  00007f5563cef700(0000) GS:ffff888167e00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f5563ced000 CR3: 000000016117c000 CR4: 00000000000006b0
> Call Trace:
>  _extract_crng+0x6d/0xc0
>  extract_crng+0x3a/0x40
>  _get_random_bytes+0x56/0x1c0
>  ? vprintk_func+0x3f/0xd0
>  ? printk+0x3e/0x46
>  get_random_bytes+0x2f/0x40
>  xyz_packets+0x1110/0x11e0 [mymod]
>  proc_reg_read+0x3d/0x60
>  __vfs_read+0x23/0x120
>  ? vm_mmap_pgoff+0x9d/0xd0
>  vfs_read+0x8e/0x110
>  SyS_read+0x48/0xc0
>  do_syscall_64+0x5c/0x260
>  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> RIP: 0033:0x7f55631f
>
>
> what could be the problem?
>
>
> thanks

^ permalink raw reply

* [PATCH bpf-next v13 7/7] landlock: Add user and kernel documentation for Landlock
From: Mickaël Salaün @ 2019-11-04 17:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mickaël Salaün, Alexei Starovoitov, Andy Lutomirski,
	Casey Schaufler, Daniel Borkmann, David Drysdale, Florent Revest,
	James Morris, Jann Horn, John Johansen, Jonathan Corbet,
	Kees Cook, KP Singh, Michael Kerrisk, Mickaël Salaün,
	Paul Moore, Sargun Dhillon, Serge E . Hallyn, Shuah Khan,
	Stephen Smalley, Tejun Heo, Tetsuo Handa, Tycho Andersen,
	Will Drewry, bpf, kernel-hardening, linux-api,
	linux-security-module, netdev
In-Reply-To: <20191104172146.30797-1-mic@digikod.net>

This documentation can be built with the Sphinx framework.

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: James Morris <jmorris@namei.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
Cc: Will Drewry <wad@chromium.org>
---

Changes since v12:
* enhance the "unprivileged use" explanation
* add more explanation about the domain/credentials inheritance
* update and add self-reference Sphinx links
* more clearly explain the capability-based security principles for
  program context

Changes since v11:
* cosmetic improvements

Changes since v10:
* replace the filesystem hooks with the ptrace one
* remove the triggers
* update example
* add documenation for Landlock domains and seccomp interaction
* reference more kernel documenation (e.g. LSM hooks)

Changes since v9:
* update with expected attach type and expected attach triggers

Changes since v8:
* remove documentation related to chaining and tagging according to this
  patch series

Changes since v7:
* update documentation according to the Landlock revamp

Changes since v6:
* add a check for ctx->event
* rename BPF_PROG_TYPE_LANDLOCK to BPF_PROG_TYPE_LANDLOCK_RULE
* rename Landlock version to ABI to better reflect its purpose and add a
  dedicated changelog section
* update tables
* relax no_new_privs recommendations
* remove ABILITY_WRITE related functions
* reword rule "appending" to "prepending" and explain it
* cosmetic fixes

Changes since v5:
* update the rule hierarchy inheritance explanation
* briefly explain ctx->arg2
* add ptrace restrictions
* explain EPERM
* update example (subtype)
* use ":manpage:"
---
 Documentation/security/index.rst           |   1 +
 Documentation/security/landlock/index.rst  |  22 +++
 Documentation/security/landlock/kernel.rst | 166 +++++++++++++++++++++
 Documentation/security/landlock/user.rst   | 153 +++++++++++++++++++
 4 files changed, 342 insertions(+)
 create mode 100644 Documentation/security/landlock/index.rst
 create mode 100644 Documentation/security/landlock/kernel.rst
 create mode 100644 Documentation/security/landlock/user.rst

diff --git a/Documentation/security/index.rst b/Documentation/security/index.rst
index fc503dd689a7..4d213e76ddf4 100644
--- a/Documentation/security/index.rst
+++ b/Documentation/security/index.rst
@@ -15,3 +15,4 @@ Security Documentation
    self-protection
    siphash
    tpm/index
+   landlock/index
diff --git a/Documentation/security/landlock/index.rst b/Documentation/security/landlock/index.rst
new file mode 100644
index 000000000000..1eced757b05d
--- /dev/null
+++ b/Documentation/security/landlock/index.rst
@@ -0,0 +1,22 @@
+=========================================
+Landlock LSM: programmatic access control
+=========================================
+
+:Author: Mickaël Salaün
+
+Landlock is a stackable Linux Security Module (LSM) that makes it possible to
+create security sandboxes, programmable access-controls or safe endpoint
+security agents.  This kind of sandbox is expected to help mitigate the
+security impact of bugs or unexpected/malicious behaviors in user-space
+applications.  The current version allows only a process with the global
+CAP_SYS_ADMIN capability to create such sandboxes but the ultimate goal of
+Landlock is to empower any process, including unprivileged ones, to securely
+restrict themselves.  Landlock is inspired by seccomp-bpf but instead of
+filtering syscalls and their raw arguments, a Landlock rule can inspect the use
+of kernel objects like processes and hence make a decision according to the
+kernel semantic.
+
+.. toctree::
+
+    user
+    kernel
diff --git a/Documentation/security/landlock/kernel.rst b/Documentation/security/landlock/kernel.rst
new file mode 100644
index 000000000000..ec0109b17e6f
--- /dev/null
+++ b/Documentation/security/landlock/kernel.rst
@@ -0,0 +1,166 @@
+==============================
+Landlock: kernel documentation
+==============================
+
+eBPF properties
+===============
+
+To get an expressive language while still being safe and small, Landlock is
+based on eBPF. Landlock should be usable by untrusted processes and must
+therefore expose a minimal attack surface. The eBPF bytecode is minimal,
+powerful, widely used and designed to be used by untrusted applications. Thus,
+reusing the eBPF support in the kernel enables a generic approach while
+minimizing new code.
+
+An eBPF program has access to an eBPF context containing some fields used to
+inspect the current object. These arguments may be used directly (e.g. raw
+value) or passed to helper functions according to their types (e.g. pointer).
+It is then possible to do complex access checks without race conditions or
+inconsistent evaluation (i.e.  `incorrect mirroring of the OS code and state
+<https://www.ndss-symposium.org/ndss2003/traps-and-pitfalls-practical-problems-system-call-interposition-based-security-tools/>`_).
+
+A Landlock hook describes a particular access type.  For now, there is one hook
+dedicated to ptrace related operations: ``BPF_LANDLOCK_PTRACE``.  A Landlock
+program is tied to one hook.  This makes it possible to statically check
+context accesses, potentially performed by such program, and hence prevents
+kernel address leaks and ensure the right use of hook arguments with eBPF
+functions.  Any user can add multiple Landlock programs per Landlock hook.
+They are stacked and evaluated one after the other, starting from the most
+recent program, as seccomp-bpf does with its filters.  Underneath, a hook is an
+abstraction over a set of LSM hooks.
+
+
+Guiding principles
+==================
+
+Unprivileged use
+----------------
+
+* As far as possible, Landlock helpers and contexts should be *designed* to be
+  usable by unprivileged programs while following the system security policy
+  enforced by other access control mechanisms (e.g. DAC, LSM).  Indeed, a
+  Landlock program shall not interfere with other access-controls enforced on
+  the system.
+
+Because one of the Landlock's goal is to create scoped access-control (i.e.
+sandboxing), it makes sense to make it possible to have access-control-safe
+programs.  This enables to avoid unneeded security risks when writing a
+security policy.  We should also keep in mind that a Landlock program may be
+written and loaded in the kernel by a trusted process, but applied by a
+non-root (and possibly malicious) process to sandbox itself e.g., using a
+sandboxer service.  This sandboxed process must not be able to leverage one of
+the Landlock program applied on itself to do a privilege escalation nor to
+infer data that should not be accessible otherwise (i.e. side-channels).
+
+However, when justified, it should be possible to have dedicated
+privileged-only program types e.g., to make a security decision based on
+properties inaccessible by unprivileged processes, or to log actions with
+additional metadata.  As explained above, these properties should not be
+inferable from the enforced access-control.  Care must be taken to not only
+focus on these programs' context or helpers to avoid putting everything in a
+root-only realm (cf. `CAP_SYS_ADMIN: the new root
+<https://lwn.net/Articles/486306/>`_).
+
+It should be noted that ``CAP_SYS_ADMIN`` is currently required for loading and
+for enforcing any Landlock programs, but more fine-grained rights may be
+discussed in the future.
+
+
+Landlock hook and context
+-------------------------
+
+* A Landlock hook shall be focused on access control on kernel objects instead
+  of syscall filtering (i.e. syscall arguments), which is the purpose of
+  seccomp-bpf.
+* A Landlock context provided by a hook shall express the minimal and more
+  generic interface to control an access for a kernel object.  This may be
+  implemented with kernel pointers used as security capabilities (i.e.
+  unforgeable token enabling actions on an object according to a set of
+  rights).
+* A hook shall guaranty that all the BPF function calls from a program are
+  safe.  Thus, the related Landlock context arguments shall always be of the
+  same type for a particular hook.  For example, a network hook could share
+  helpers with a file hook because of UNIX socket.  However, the same helpers
+  may not be compatible for a file system handle and a net handle.
+* Multiple hooks may use the same context interface.
+
+
+Landlock helpers
+----------------
+
+* Landlock helpers shall be as generic as possible while at the same time being
+  as simple as possible and following the syscall creation principles (cf.
+  :doc:`/process/adding-syscalls`).
+* The only behavior change allowed on a helper is to fix a (logical) bug to
+  match the initial semantic.
+* Helpers shall be reentrant, i.e. only take inputs from arguments (e.g. from
+  the BPF context), to enable a hook to use a cache.  Future program options
+  might change this cache behavior.
+* It is quite easy to add new helpers to extend Landlock.  The main concern
+  should be about the possibility to leak information from the kernel that may
+  not be accessible otherwise (i.e. side-channel attack).
+
+
+Landlock domain
+===============
+
+A Landlock domain is a set of eBPF programs.  There is a list for each
+different program types that can be run on a specific Landlock hook (e.g.
+ptrace).  A domain is tied to a set of subjects (i.e. tasks).
+
+A Landlock program should not try (nor be able) to infer which subject is
+currently enforced, but to have a unique security policy for all subjects tied
+to the same domain.  This make the reasoning much easier and help avoid
+pitfalls.
+
+.. kernel-doc:: security/landlock/common.h
+    :functions: landlock_domain
+
+.. kernel-doc:: security/landlock/domain_manage.c
+    :functions: landlock_prepend_prog
+
+
+Adding a Landlock program with seccomp
+--------------------------------------
+
+The :manpage:`seccomp(2)` syscall can be used with the
+``SECCOMP_PREPEND_LANDLOCK_PROG`` operation to prepend a Landlock program to
+the current task's domain.
+
+.. kernel-doc:: security/landlock/domain_syscall.c
+    :functions: landlock_seccomp_prepend_prog
+
+
+Running a list of Landlock programs
+-----------------------------------
+
+.. kernel-doc:: security/landlock/bpf_run.c
+    :functions: landlock_access_denied
+
+
+LSM hooks
+=========
+
+.. kernel-doc:: security/landlock/hooks_ptrace.c
+    :functions: hook_ptrace_access_check
+
+.. kernel-doc:: security/landlock/hooks_ptrace.c
+    :functions: hook_ptrace_traceme
+
+
+Questions and answers
+=====================
+
+Why a program does not return an errno or a kill code?
+------------------------------------------------------
+
+seccomp filters can return multiple kind of code, including an errno value or a
+kill signal, which may be convenient for access control.  Those return codes
+are hardwired in the userland ABI.  Instead, Landlock's approach is to return a
+bitmask to allow or deny an action, which is much simpler and more generic.
+Moreover, we do not really have a choice because, unlike to seccomp, Landlock
+programs are not enforced at the syscall entry point but may be executed at any
+point in the kernel (through LSM hooks) where an errno return code may not make
+sense.  However, with this simple ABI and with the ability to call helpers,
+Landlock may gain features similar to seccomp-bpf in the future while being
+compatible with previous programs.
diff --git a/Documentation/security/landlock/user.rst b/Documentation/security/landlock/user.rst
new file mode 100644
index 000000000000..ef48e7752f1b
--- /dev/null
+++ b/Documentation/security/landlock/user.rst
@@ -0,0 +1,153 @@
+=================================
+Landlock: userspace documentation
+=================================
+
+Landlock programs
+=================
+
+eBPF programs are used to create security programs.  They are contained and can
+call only a whitelist of dedicated functions. Moreover, they can only loop
+under strict conditions, which protects from denial of service.  More
+information on BPF can be found in :doc:`/bpf/index`.
+
+
+Writing a program
+-----------------
+
+To enforce a security policy, a thread first needs to create a Landlock
+program.  The easiest way to write an eBPF program depicting a security program
+is to write it in the C language.  As described in `samples/bpf/README.rst`_,
+LLVM can compile such programs.  A simple eBPF program can also be written by
+hand has done in `tools/testing/selftests/landlock/`_.
+
+Once the eBPF program is created, the next step is to create the metadata
+describing the Landlock program.  This metadata includes an expected attach
+type which contains the hook type to which the program is tied.
+
+A hook is a policy decision point which exposes the same context type for
+each program evaluation.
+
+A Landlock hook describes the kind of kernel object for which a program will be
+triggered to allow or deny an action.  For example, the hook
+``BPF_LANDLOCK_PTRACE`` can be triggered every time a landlocked thread
+performs a set of action related to debugging (cf. :manpage:`ptrace(2)`) or if
+the kernel needs to know if a process manipulation requested by something else
+is legitimate.
+
+The next step is to fill a :c:type:`struct bpf_load_program_attr
+<bpf_load_program_attr>` with ``BPF_PROG_TYPE_LANDLOCK_HOOK``, the expected
+attach type and other BPF program metadata.  This bpf_attr must then be passed
+to the :manpage:`bpf(2)` syscall alongside the ``BPF_PROG_LOAD`` command.  If
+everything is deemed correct by the kernel, the thread gets a file descriptor
+referring to this program.
+
+In the following code, the `insn` variable is an array of BPF instructions
+which can be extracted from an ELF file as is done in bpf_load_file() from
+`samples/bpf/bpf_load.c`_.
+
+.. code-block:: c
+
+    int prog_fd;
+    struct bpf_load_program_attr load_attr;
+
+    memset(&load_attr, 0, sizeof(struct bpf_load_program_attr));
+    load_attr.prog_type = BPF_PROG_TYPE_LANDLOCK_HOOK;
+    load_attr.expected_attach_type = BPF_LANDLOCK_PTRACE;
+    load_attr.insns = insns;
+    load_attr.insns_cnt = sizeof(insn) / sizeof(struct bpf_insn);
+    load_attr.license = "GPL";
+
+    prog_fd = bpf_load_program_xattr(&load_attr, log_buf, log_buf_sz);
+    if (prog_fd == -1)
+        exit(1);
+
+
+Enforcing a program
+-------------------
+
+Once the Landlock program has been created or received (e.g. through a UNIX
+socket), the thread willing to sandbox itself (and its future children) should
+perform the following two steps.
+
+The thread should first request to never be allowed to get new privileges with
+a call to :manpage:`prctl(2)` and the ``PR_SET_NO_NEW_PRIVS`` option.  More
+information can be found in :doc:`/userspace-api/no_new_privs`.
+
+.. code-block:: c
+
+    if (prctl(PR_SET_NO_NEW_PRIVS, 1, NULL, 0, 0))
+        exit(1);
+
+A thread can apply a program to itself by using the :manpage:`seccomp(2)`
+syscall.  The operation is ``SECCOMP_PREPEND_LANDLOCK_PROG``, the flags must be
+empty and the `args` argument must point to a valid Landlock program file
+descriptor.
+
+.. code-block:: c
+
+    if (seccomp(SECCOMP_PREPEND_LANDLOCK_PROG, 0, &fd))
+        exit(1);
+
+If the syscall succeeds, the program is now enforced on the calling thread and
+will be enforced on all its subsequently created children of the thread as
+well.  Once a thread is landlocked, there is no way to remove this security
+policy, only stacking more restrictions is allowed.  The program evaluation is
+performed from the newest to the oldest.
+
+When a syscall ask for an action on a kernel object, if this action is denied,
+then an ``EACCES`` errno code is returned through the syscall.
+
+
+.. _inherited_programs:
+
+Inherited programs
+------------------
+
+Every new thread resulting from a :manpage:`clone(2)` inherits Landlock program
+restrictions from its parent.  This is similar to the seccomp inheritance (cf.
+:doc:`/userspace-api/seccomp_filter`) or any other LSM dealing with task's
+:manpage:`credentials(7)`.  For instance, one process's thread may apply
+Landlock programs to itself, but they will not be automatically applied to
+other sibling threads (unlike POSIX thread credential changes, cf.
+:manpage:`nptl(7)`).
+
+
+Ptrace restrictions
+-------------------
+
+A sandboxed process has less privileges than a non-sandboxed process and must
+then be subject to additional restrictions when manipulating another process.
+To be allowed to use :manpage:`ptrace(2)` and related syscalls on a target
+process, a sandboxed process should have a subset of the target process
+programs.  This security policy can easily be implemented like in
+`tools/testing/selftests/landlock/test_ptrace.c`_.
+
+
+Landlock structures and constants
+=================================
+
+Contexts
+--------
+
+.. kernel-doc:: include/uapi/linux/landlock.h
+    :functions: landlock_context_ptrace
+
+
+Return types
+------------
+
+.. kernel-doc:: include/uapi/linux/landlock.h
+    :functions: landlock_ret
+
+
+Additional documentation
+========================
+
+See https://landlock.io
+
+
+.. Links
+.. _samples/bpf/README.rst: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/samples/bpf/README.rst
+.. _tools/testing/selftests/landlock/: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/tools/testing/selftests/landlock/
+.. _samples/bpf/bpf_load.c: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/samples/bpf/bpf_load.c
+.. _tools/testing/selftests/landlock/test_ptrace.c: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/tools/testing/selftests/landlock/test_ptrace.c
-- 
2.23.0


^ permalink raw reply related

* [PATCH bpf-next v13 1/7] bpf,landlock: Define an eBPF program type for Landlock hooks
From: Mickaël Salaün @ 2019-11-04 17:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mickaël Salaün, Alexei Starovoitov, Andy Lutomirski,
	Casey Schaufler, Daniel Borkmann, David Drysdale, Florent Revest,
	James Morris, Jann Horn, John Johansen, Jonathan Corbet,
	Kees Cook, KP Singh, Michael Kerrisk, Mickaël Salaün,
	Paul Moore, Sargun Dhillon, Serge E . Hallyn, Shuah Khan,
	Stephen Smalley, Tejun Heo, Tetsuo Handa, Tycho Andersen,
	Will Drewry, bpf, kernel-hardening, linux-api,
	linux-security-module, netdev
In-Reply-To: <20191104172146.30797-1-mic@digikod.net>

Add a new type of eBPF program used by Landlock hooks.  The goal of this
type of program is to accept or deny a requested access from userspace
to a kernel object (e.g. ptrace a process).  This will be more useful
with the next commit adding a new eBPF helper.

The context of this program type contains two items of type PTR_TO_TASK,
one for the tracer and one for the tracee.  The underlying kernel
structure is currently a task_struct pointer, but it could seamlessly
evolve to a task wrapper with dedicated rights (i.e.  capability-based
security) to fit with different use cases (e.g. get and log the task's
PID).

This new BPF program type will be registered with the Landlock LSM
initialization.

Add an initial Landlock Kconfig and update the MAINTAINERS file.

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: James Morris <jmorris@namei.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
Cc: Will Drewry <wad@chromium.org>
---

Changes since v10:
* replace file system program types with a (simpler) ptrace program type
* add an eBPF task pointer type
* split files

Changes since v9:
* handle inode put and map put, which fix unmount (reported by Al Viro)
* replace subtype with expected_attach_type and expected_attach_triggers
* check eBPF program return code

Changes since v8:
* Remove the chaining concept from the eBPF program contexts (chain and
  cookie). We need to keep these subtypes this way to be able to make
  them evolve, though.
* remove bpf_landlock_put_extra() because there is no more a "previous"
  field to free (for now)

Changes since v7:
* cosmetic fixes
* rename LANDLOCK_SUBTYPE_* to LANDLOCK_*
* cleanup UAPI definitions and move them from bpf.h to landlock.h
  (suggested by Alexei Starovoitov)
* disable Landlock by default (suggested by Alexei Starovoitov)
* rename BPF_PROG_TYPE_LANDLOCK_{RULE,HOOK}
* update the Kconfig
* update the MAINTAINERS file
* replace the IOCTL, LOCK and FCNTL events with FS_PICK, FS_WALK and
  FS_GET hook types
* add the ability to chain programs with an eBPF program file descriptor
  (i.e. the "previous" field in a Landlock subtype) and keep a state
  with a "cookie" value available from the context
* add a "triggers" subtype bitfield to match specific actions (e.g.
  append, chdir, read...)

Changes since v6:
* add 3 more sub-events: IOCTL, LOCK, FCNTL
  https://lkml.kernel.org/r/2fbc99a6-f190-f335-bd14-04bdeed35571@digikod.net
* rename LANDLOCK_VERSION to LANDLOCK_ABI to better reflect its purpose,
  and move it from landlock.h to common.h
* rename BPF_PROG_TYPE_LANDLOCK to BPF_PROG_TYPE_LANDLOCK_RULE: an eBPF
  program could be used for something else than a rule
* simplify struct landlock_context by removing the arch and syscall_nr fields
* remove all eBPF map functions call, remove ABILITY_WRITE
* refactor bpf_landlock_func_proto() (suggested by Kees Cook)
* constify pointers
* fix doc inclusion

Changes since v5:
* rename file hooks.c to init.c
* fix spelling

Changes since v4:
* merge a minimal (not enabled) LSM code and Kconfig in this commit

Changes since v3:
* split commit
* revamp the landlock_context:
  * add arch, syscall_nr and syscall_cmd (ioctl, fcntl…) to be able to
    cross-check action with the event type
  * replace args array with dedicated fields to ease the addition of new
    fields
---
 MAINTAINERS                    |  8 ++++
 include/linux/bpf.h            |  1 +
 include/linux/bpf_types.h      |  3 ++
 include/uapi/linux/bpf.h       |  2 +
 include/uapi/linux/landlock.h  | 39 ++++++++++++++++
 kernel/bpf/syscall.c           |  9 ++++
 kernel/bpf/verifier.c          |  7 +++
 security/Kconfig               |  1 +
 security/Makefile              |  2 +
 security/landlock/Kconfig      | 19 ++++++++
 security/landlock/Makefile     |  4 ++
 security/landlock/bpf_ptrace.c | 30 ++++++++++++
 security/landlock/bpf_ptrace.h | 17 +++++++
 security/landlock/bpf_verify.c | 83 ++++++++++++++++++++++++++++++++++
 security/landlock/common.h     | 30 ++++++++++++
 15 files changed, 255 insertions(+)
 create mode 100644 include/uapi/linux/landlock.h
 create mode 100644 security/landlock/Kconfig
 create mode 100644 security/landlock/Makefile
 create mode 100644 security/landlock/bpf_ptrace.c
 create mode 100644 security/landlock/bpf_ptrace.h
 create mode 100644 security/landlock/bpf_verify.c
 create mode 100644 security/landlock/common.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 7fc074632eac..4cabb85ea52d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9146,6 +9146,14 @@ F:	net/core/skmsg.c
 F:	net/core/sock_map.c
 F:	net/ipv4/tcp_bpf.c
 
+LANDLOCK SECURITY MODULE
+M:	Mickaël Salaün <mic@digikod.net>
+S:	Supported
+F:	include/uapi/linux/landlock.h
+F:	security/landlock/
+K:	landlock
+K:	LANDLOCK
+
 LANTIQ / INTEL Ethernet drivers
 M:	Hauke Mehrtens <hauke@hauke-m.de>
 L:	netdev@vger.kernel.org
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 171be30fe0ae..819a3e207438 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -291,6 +291,7 @@ enum bpf_reg_type {
 	PTR_TO_TP_BUFFER,	 /* reg points to a writable raw tp's buffer */
 	PTR_TO_XDP_SOCK,	 /* reg points to struct xdp_sock */
 	PTR_TO_BTF_ID,		 /* reg points to kernel struct */
+	PTR_TO_TASK,		 /* reg points to struct task_struct */
 };
 
 /* The information passed from prog-specific *_is_valid_access
diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
index 36a9c2325176..bddabc961a3b 100644
--- a/include/linux/bpf_types.h
+++ b/include/linux/bpf_types.h
@@ -38,6 +38,9 @@ BPF_PROG_TYPE(BPF_PROG_TYPE_LIRC_MODE2, lirc_mode2)
 #ifdef CONFIG_INET
 BPF_PROG_TYPE(BPF_PROG_TYPE_SK_REUSEPORT, sk_reuseport)
 #endif
+#ifdef CONFIG_SECURITY_LANDLOCK
+BPF_PROG_TYPE(BPF_PROG_TYPE_LANDLOCK_HOOK, landlock)
+#endif
 
 BPF_MAP_TYPE(BPF_MAP_TYPE_ARRAY, array_map_ops)
 BPF_MAP_TYPE(BPF_MAP_TYPE_PERCPU_ARRAY, percpu_array_map_ops)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 4af8b0819a32..6e4147790f96 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -173,6 +173,7 @@ enum bpf_prog_type {
 	BPF_PROG_TYPE_CGROUP_SYSCTL,
 	BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE,
 	BPF_PROG_TYPE_CGROUP_SOCKOPT,
+	BPF_PROG_TYPE_LANDLOCK_HOOK,
 };
 
 enum bpf_attach_type {
@@ -199,6 +200,7 @@ enum bpf_attach_type {
 	BPF_CGROUP_UDP6_RECVMSG,
 	BPF_CGROUP_GETSOCKOPT,
 	BPF_CGROUP_SETSOCKOPT,
+	BPF_LANDLOCK_PTRACE,
 	__MAX_BPF_ATTACH_TYPE
 };
 
diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h
new file mode 100644
index 000000000000..3ffe3cbdbad6
--- /dev/null
+++ b/include/uapi/linux/landlock.h
@@ -0,0 +1,39 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Landlock - UAPI headers
+ *
+ * Copyright © 2017-2019 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2018-2019 ANSSI
+ */
+
+#ifndef _UAPI__LINUX_LANDLOCK_H__
+#define _UAPI__LINUX_LANDLOCK_H__
+
+#include <linux/types.h>
+
+/**
+ * DOC: landlock_ret
+ *
+ * The return value of a landlock program is a bitmask that can allow or deny
+ * the action for which the program is run.
+ *
+ * In the future, this could be used to trigger an audit event as well.
+ *
+ * - %LANDLOCK_RET_ALLOW
+ * - %LANDLOCK_RET_DENY
+ */
+#define LANDLOCK_RET_ALLOW	0
+#define LANDLOCK_RET_DENY	1
+
+/**
+ * struct landlock_context_ptrace - context accessible to BPF_LANDLOCK_PTRACE
+ *
+ * @tracer: pointer to the task requesting to debug @tracee
+ * @tracee: pointer to the task being debugged
+ */
+struct landlock_context_ptrace {
+	__u64 tracer;
+	__u64 tracee;
+};
+
+#endif /* _UAPI__LINUX_LANDLOCK_H__ */
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index ff5225759553..5159e582a0d8 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1621,6 +1621,15 @@ bpf_prog_load_check_attach(enum bpf_prog_type prog_type,
 		default:
 			return -EINVAL;
 		}
+#ifdef CONFIG_SECURITY_LANDLOCK
+	case BPF_PROG_TYPE_LANDLOCK_HOOK:
+		switch (expected_attach_type) {
+		case BPF_LANDLOCK_PTRACE:
+			return 0;
+		default:
+			return -EINVAL;
+		}
+#endif
 	default:
 		return 0;
 	}
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index c59778c0fc4d..ebf1991906b7 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -421,6 +421,7 @@ static const char * const reg_type_str[] = {
 	[PTR_TO_TP_BUFFER]	= "tp_buffer",
 	[PTR_TO_XDP_SOCK]	= "xdp_sock",
 	[PTR_TO_BTF_ID]		= "ptr_",
+	[PTR_TO_TASK]		= "task",
 };
 
 static char slot_type_char[] = {
@@ -1878,6 +1879,7 @@ static bool is_spillable_regtype(enum bpf_reg_type type)
 	case PTR_TO_TCP_SOCK:
 	case PTR_TO_TCP_SOCK_OR_NULL:
 	case PTR_TO_XDP_SOCK:
+	case PTR_TO_TASK:
 		return true;
 	default:
 		return false;
@@ -2600,6 +2602,9 @@ static int check_ptr_alignment(struct bpf_verifier_env *env,
 	case PTR_TO_XDP_SOCK:
 		pointer_desc = "xdp_sock ";
 		break;
+	case PTR_TO_TASK:
+		pointer_desc = "task ";
+		break;
 	default:
 		break;
 	}
@@ -4527,6 +4532,7 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env,
 	case PTR_TO_TCP_SOCK:
 	case PTR_TO_TCP_SOCK_OR_NULL:
 	case PTR_TO_XDP_SOCK:
+	case PTR_TO_TASK:
 		verbose(env, "R%d pointer arithmetic on %s prohibited\n",
 			dst, reg_type_str[ptr_reg->type]);
 		return -EACCES;
@@ -6278,6 +6284,7 @@ static int check_return_code(struct bpf_verifier_env *env)
 	case BPF_PROG_TYPE_CGROUP_DEVICE:
 	case BPF_PROG_TYPE_CGROUP_SYSCTL:
 	case BPF_PROG_TYPE_CGROUP_SOCKOPT:
+	case BPF_PROG_TYPE_LANDLOCK_HOOK:
 		break;
 	default:
 		return 0;
diff --git a/security/Kconfig b/security/Kconfig
index 2a1a2d396228..9d9981394fb0 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -238,6 +238,7 @@ source "security/loadpin/Kconfig"
 source "security/yama/Kconfig"
 source "security/safesetid/Kconfig"
 source "security/lockdown/Kconfig"
+source "security/landlock/Kconfig"
 
 source "security/integrity/Kconfig"
 
diff --git a/security/Makefile b/security/Makefile
index be1dd9d2cb2f..60b7f6f2fd30 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -12,6 +12,7 @@ subdir-$(CONFIG_SECURITY_YAMA)		+= yama
 subdir-$(CONFIG_SECURITY_LOADPIN)	+= loadpin
 subdir-$(CONFIG_SECURITY_SAFESETID)    += safesetid
 subdir-$(CONFIG_SECURITY_LOCKDOWN_LSM)	+= lockdown
+subdir-$(CONFIG_SECURITY_LANDLOCK)		+= landlock
 
 # always enable default capabilities
 obj-y					+= commoncap.o
@@ -29,6 +30,7 @@ obj-$(CONFIG_SECURITY_YAMA)		+= yama/
 obj-$(CONFIG_SECURITY_LOADPIN)		+= loadpin/
 obj-$(CONFIG_SECURITY_SAFESETID)       += safesetid/
 obj-$(CONFIG_SECURITY_LOCKDOWN_LSM)	+= lockdown/
+obj-$(CONFIG_SECURITY_LANDLOCK)	+= landlock/
 obj-$(CONFIG_CGROUP_DEVICE)		+= device_cgroup.o
 
 # Object integrity file lists
diff --git a/security/landlock/Kconfig b/security/landlock/Kconfig
new file mode 100644
index 000000000000..44921bd72380
--- /dev/null
+++ b/security/landlock/Kconfig
@@ -0,0 +1,19 @@
+config SECURITY_LANDLOCK
+	bool "Landlock support"
+	depends on SECURITY
+	depends on BPF_SYSCALL
+	depends on SECCOMP_FILTER
+	default n
+	help
+	  This selects Landlock, a programmatic access control.  It enables to
+	  restrict processes on the fly (i.e. create a sandbox) or log some
+	  actions.  The security policy is a set of eBPF programs, dedicated to
+	  allow or deny a list of actions on specific kernel objects (e.g.
+	  process).
+
+	  You need to enable seccomp filter to apply a security policy to a
+	  process hierarchy (e.g. application with built-in sandboxing).
+
+	  See Documentation/security/landlock/ for further information.
+
+	  If you are unsure how to answer this question, answer N.
diff --git a/security/landlock/Makefile b/security/landlock/Makefile
new file mode 100644
index 000000000000..682b798c6b76
--- /dev/null
+++ b/security/landlock/Makefile
@@ -0,0 +1,4 @@
+obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
+
+landlock-y := \
+	bpf_verify.o bpf_ptrace.o
diff --git a/security/landlock/bpf_ptrace.c b/security/landlock/bpf_ptrace.c
new file mode 100644
index 000000000000..2ec73078ad01
--- /dev/null
+++ b/security/landlock/bpf_ptrace.c
@@ -0,0 +1,30 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Landlock LSM - eBPF ptrace
+ *
+ * Copyright © 2017-2019 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2019 ANSSI
+ */
+
+#include <linux/bpf.h>
+#include <uapi/linux/landlock.h>
+
+#include "bpf_ptrace.h"
+
+bool landlock_is_valid_access_ptrace(int off, enum bpf_access_type type,
+		enum bpf_reg_type *reg_type, int *max_size)
+{
+	if (type != BPF_READ)
+		return false;
+
+	switch (off) {
+	case offsetof(struct landlock_context_ptrace, tracer):
+		/* fall through */
+	case offsetof(struct landlock_context_ptrace, tracee):
+		*reg_type = PTR_TO_TASK;
+		*max_size = sizeof(u64);
+		return true;
+	default:
+		return false;
+	}
+}
diff --git a/security/landlock/bpf_ptrace.h b/security/landlock/bpf_ptrace.h
new file mode 100644
index 000000000000..85edce37b70a
--- /dev/null
+++ b/security/landlock/bpf_ptrace.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Landlock LSM - eBPF ptrace headers
+ *
+ * Copyright © 2017-2019 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2019 ANSSI
+ */
+
+#ifndef _SECURITY_LANDLOCK_BPF_PTRACE_H
+#define _SECURITY_LANDLOCK_BPF_PTRACE_H
+
+#include <linux/bpf.h>
+
+bool landlock_is_valid_access_ptrace(int off, enum bpf_access_type type,
+		enum bpf_reg_type *reg_type, int *max_size);
+
+#endif /* _SECURITY_LANDLOCK_BPF_PTRACE_H */
diff --git a/security/landlock/bpf_verify.c b/security/landlock/bpf_verify.c
new file mode 100644
index 000000000000..6ed921588178
--- /dev/null
+++ b/security/landlock/bpf_verify.c
@@ -0,0 +1,83 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Landlock LSM - eBPF program verifications
+ *
+ * Copyright © 2016-2019 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2018-2019 ANSSI
+ */
+
+#include <linux/bpf.h>
+#include <linux/filter.h>
+
+#include "common.h"
+#include "bpf_ptrace.h"
+
+static bool bpf_landlock_is_valid_access(int off, int size,
+		enum bpf_access_type type, const struct bpf_prog *prog,
+		struct bpf_insn_access_aux *info)
+{
+	enum bpf_reg_type reg_type = NOT_INIT;
+	int max_size = 0;
+
+	if (WARN_ON(!prog->expected_attach_type))
+		return false;
+
+	if (off < 0)
+		return false;
+	if (size <= 0 || size > sizeof(__u64))
+		return false;
+
+	/* set register type and max size */
+	switch (get_hook_type(prog)) {
+	case LANDLOCK_HOOK_PTRACE:
+		if (!landlock_is_valid_access_ptrace(off, type, &reg_type,
+					&max_size))
+			return false;
+		break;
+	}
+
+	/* check memory range access */
+	switch (reg_type) {
+	case NOT_INIT:
+		return false;
+	case SCALAR_VALUE:
+		/* allow partial raw value */
+		if (size > max_size)
+			return false;
+		info->ctx_field_size = max_size;
+		break;
+	default:
+		/* deny partial pointer */
+		if (size != max_size)
+			return false;
+	}
+
+	info->reg_type = reg_type;
+	return true;
+}
+
+static const struct bpf_func_proto *bpf_landlock_func_proto(
+		enum bpf_func_id func_id,
+		const struct bpf_prog *prog)
+{
+	if (WARN_ON(!prog->expected_attach_type))
+		return NULL;
+
+	switch (func_id) {
+	case BPF_FUNC_map_lookup_elem:
+		return &bpf_map_lookup_elem_proto;
+	case BPF_FUNC_map_update_elem:
+		return &bpf_map_update_elem_proto;
+	case BPF_FUNC_map_delete_elem:
+		return &bpf_map_delete_elem_proto;
+	default:
+		return NULL;
+	}
+}
+
+const struct bpf_verifier_ops landlock_verifier_ops = {
+	.get_func_proto	= bpf_landlock_func_proto,
+	.is_valid_access = bpf_landlock_is_valid_access,
+};
+
+const struct bpf_prog_ops landlock_prog_ops = {};
diff --git a/security/landlock/common.h b/security/landlock/common.h
new file mode 100644
index 000000000000..0234c4bc4acd
--- /dev/null
+++ b/security/landlock/common.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Landlock LSM - private headers
+ *
+ * Copyright © 2016-2019 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2018-2019 ANSSI
+ */
+
+#ifndef _SECURITY_LANDLOCK_COMMON_H
+#define _SECURITY_LANDLOCK_COMMON_H
+
+#include <linux/bpf.h>
+#include <linux/filter.h>
+
+enum landlock_hook_type {
+	LANDLOCK_HOOK_PTRACE = 1,
+};
+
+static inline enum landlock_hook_type get_hook_type(const struct bpf_prog *prog)
+{
+	switch (prog->expected_attach_type) {
+	case BPF_LANDLOCK_PTRACE:
+		return LANDLOCK_HOOK_PTRACE;
+	default:
+		WARN_ON(1);
+		return BPF_LANDLOCK_PTRACE;
+	}
+}
+
+#endif /* _SECURITY_LANDLOCK_COMMON_H */
-- 
2.23.0


^ permalink raw reply related

* [PATCH bpf-next v13 3/7] landlock,seccomp: Apply Landlock programs to process hierarchy
From: Mickaël Salaün @ 2019-11-04 17:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mickaël Salaün, Alexei Starovoitov, Andy Lutomirski,
	Casey Schaufler, Daniel Borkmann, David Drysdale, Florent Revest,
	James Morris, Jann Horn, John Johansen, Jonathan Corbet,
	Kees Cook, KP Singh, Michael Kerrisk, Mickaël Salaün,
	Paul Moore, Sargun Dhillon, Serge E . Hallyn, Shuah Khan,
	Stephen Smalley, Tejun Heo, Tetsuo Handa, Tycho Andersen,
	Will Drewry, bpf, kernel-hardening, linux-api,
	linux-security-module, netdev
In-Reply-To: <20191104172146.30797-1-mic@digikod.net>

The seccomp(2) syscall can be used by a task to apply a Landlock program
to itself. As a seccomp filter, a Landlock program is enforced for the
current task and all its future children. A program is immutable and a
task can only add new restricting programs to itself, forming a list of
programs.

A Landlock program is tied to a Landlock hook. If the action on a kernel
object is allowed by the other Linux security mechanisms (e.g. DAC,
capabilities, other LSM), then a Landlock hook related to this kind of
object is triggered. The list of programs for this hook is then
evaluated. Each program return a binary value which can deny the action
on a kernel object with a non-zero value. If every programs of the list
return zero, then the action on the object is allowed.

The next commit adds the LSM hooks to enforce the memory protection
programs on the appropriate process hierarchies.

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: James Morris <jmorris@namei.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
Cc: Will Drewry <wad@chromium.org>
Link: https://lore.kernel.org/lkml/c10a503d-5e35-7785-2f3d-25ed8dd63fab@digikod.net/
---

Changes since v12:
* move the landlock_put_domain() call from domain_manage.c: same
  semantic (for this call) but less error-prone and self-explanatory
  (only put a domain when it is effectively replaced)
* only handle copy of domains (cf. domain management changes)
* use a consistent naming
* extend comment about unprivileged enforcement

Changes since v11:
* fix build of seccomp without Landlock (reported by kbuild test robot)

Changes since v10:
* rewrite the Landlock program attaching mechanisme to not rely on
  internal seccomp structures but only on the (LSM-stacked) task's
  credentials:
  * make the use of seccomp (and task's credentials) optional if not
    relying on its syscall, which may be useful for domains defined by
    other means (e.g. cgroups or system-wide thanks to a dedicated
    securityfs)

Changes since v9:
* replace subtype with expected_attach_type and expected_attach_triggers

Changes since v8:
* Remove the chaining concept from the eBPF program contexts (chain and
  cookie). We need to keep these subtypes this way to be able to make
  them evolve, though.

Changes since v7:
* handle and verify program chains
* split and rename providers.c to enforce.c and enforce_seccomp.c
* rename LANDLOCK_SUBTYPE_* to LANDLOCK_*

Changes since v6:
* rename some functions with more accurate names to reflect that an eBPF
  program for Landlock could be used for something else than a rule
* reword rule "appending" to "prepending" and explain it
* remove the superfluous no_new_privs check, only check global
  CAP_SYS_ADMIN when prepending a Landlock rule (needed for containers)
* create and use {get,put}_seccomp_landlock() (suggested by Kees Cook)
* replace ifdef with static inlined function (suggested by Kees Cook)
* use get_user() (suggested by Kees Cook)
* replace atomic_t with refcount_t (requested by Kees Cook)
* move struct landlock_{rule,events} from landlock.h to common.h
* cleanup headers

Changes since v5:
* remove struct landlock_node and use a similar inheritance mechanisme
  as seccomp-bpf (requested by Andy Lutomirski)
* rename SECCOMP_ADD_LANDLOCK_RULE to SECCOMP_APPEND_LANDLOCK_RULE
* rename file manager.c to providers.c
* add comments
* typo and cosmetic fixes

Changes since v4:
* merge manager and seccomp patches
* return -EFAULT in seccomp(2) when user_bpf_fd is null to easely check
  if Landlock is supported
* only allow a process with the global CAP_SYS_ADMIN to use Landlock
  (will be lifted in the future)
* add an early check to exit as soon as possible if the current process
  does not have Landlock rules

Changes since v3:
* remove the hard link with seccomp (suggested by Andy Lutomirski and
  Kees Cook):
  * remove the cookie which could imply multiple evaluation of Landlock
    rules
  * remove the origin field in struct landlock_data
* remove documentation fix (merged upstream)
* rename the new seccomp command to SECCOMP_ADD_LANDLOCK_RULE
* internal renaming
* split commit
* new design to be able to inherit on the fly the parent rules

Changes since v2:
* Landlock programs can now be run without seccomp filter but for any
  syscall (from the process) or interruption
* move Landlock related functions and structs into security/landlock/*
  (to manage cgroups as well)
* fix seccomp filter handling: run Landlock programs for each of their
  legitimate seccomp filter
* properly clean up all seccomp results
* cosmetic changes to ease the understanding
* fix some ifdef
---
 MAINTAINERS                        |  1 +
 include/linux/landlock.h           | 25 ++++++++
 include/linux/lsm_hooks.h          |  1 +
 include/uapi/linux/seccomp.h       |  1 +
 kernel/seccomp.c                   |  4 ++
 security/landlock/Makefile         |  5 +-
 security/landlock/common.h         | 16 +++++
 security/landlock/domain_syscall.c | 93 ++++++++++++++++++++++++++++++
 security/landlock/hooks_cred.c     | 47 +++++++++++++++
 security/landlock/hooks_cred.h     | 14 +++++
 security/landlock/init.c           | 30 ++++++++++
 security/security.c                | 15 +++++
 12 files changed, 250 insertions(+), 2 deletions(-)
 create mode 100644 include/linux/landlock.h
 create mode 100644 security/landlock/domain_syscall.c
 create mode 100644 security/landlock/hooks_cred.c
 create mode 100644 security/landlock/hooks_cred.h
 create mode 100644 security/landlock/init.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 4cabb85ea52d..32bfd88159b0 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9149,6 +9149,7 @@ F:	net/ipv4/tcp_bpf.c
 LANDLOCK SECURITY MODULE
 M:	Mickaël Salaün <mic@digikod.net>
 S:	Supported
+F:	include/linux/landlock.h
 F:	include/uapi/linux/landlock.h
 F:	security/landlock/
 K:	landlock
diff --git a/include/linux/landlock.h b/include/linux/landlock.h
new file mode 100644
index 000000000000..ffbf2397c459
--- /dev/null
+++ b/include/linux/landlock.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Landlock LSM - public kernel headers
+ *
+ * Copyright © 2016-2019 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2018-2019 ANSSI
+ */
+
+#ifndef _LINUX_LANDLOCK_H
+#define _LINUX_LANDLOCK_H
+
+#include <linux/errno.h>
+
+#if defined(CONFIG_SECCOMP_FILTER) && defined(CONFIG_SECURITY_LANDLOCK)
+extern int landlock_seccomp_prepend_prog(unsigned int flags,
+		const int __user *user_bpf_fd);
+#else /* CONFIG_SECCOMP_FILTER && CONFIG_SECURITY_LANDLOCK */
+static inline int landlock_seccomp_prepend_prog(unsigned int flags,
+		const int __user *user_bpf_fd)
+{
+		return -EINVAL;
+}
+#endif /* CONFIG_SECCOMP_FILTER && CONFIG_SECURITY_LANDLOCK */
+
+#endif /* _LINUX_LANDLOCK_H */
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index a3763247547c..a8ba679b388a 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -2106,6 +2106,7 @@ extern void security_add_hooks(struct security_hook_list *hooks, int count,
 enum lsm_order {
 	LSM_ORDER_FIRST = -1,	/* This is only for capabilities. */
 	LSM_ORDER_MUTABLE = 0,
+	LSM_ORDER_LAST = 1,	/* potentially-unprivileged LSM */
 };
 
 struct lsm_info {
diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
index 90734aa5aa36..bce6534e7feb 100644
--- a/include/uapi/linux/seccomp.h
+++ b/include/uapi/linux/seccomp.h
@@ -16,6 +16,7 @@
 #define SECCOMP_SET_MODE_FILTER		1
 #define SECCOMP_GET_ACTION_AVAIL	2
 #define SECCOMP_GET_NOTIF_SIZES		3
+#define SECCOMP_PREPEND_LANDLOCK_PROG	4
 
 /* Valid flags for SECCOMP_SET_MODE_FILTER */
 #define SECCOMP_FILTER_FLAG_TSYNC		(1UL << 0)
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index dba52a7db5e8..0688a1439587 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -19,6 +19,7 @@
 #include <linux/compat.h>
 #include <linux/coredump.h>
 #include <linux/kmemleak.h>
+#include <linux/landlock.h>
 #include <linux/nospec.h>
 #include <linux/prctl.h>
 #include <linux/sched.h>
@@ -1397,6 +1398,9 @@ static long do_seccomp(unsigned int op, unsigned int flags,
 			return -EINVAL;
 
 		return seccomp_get_notif_sizes(uargs);
+	case SECCOMP_PREPEND_LANDLOCK_PROG:
+		return landlock_seccomp_prepend_prog(flags,
+				(const int __user *)uargs);
 	default:
 		return -EINVAL;
 	}
diff --git a/security/landlock/Makefile b/security/landlock/Makefile
index dd5f70185778..0b291f2c027c 100644
--- a/security/landlock/Makefile
+++ b/security/landlock/Makefile
@@ -1,5 +1,6 @@
 obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
 
-landlock-y := \
+landlock-y := init.o \
 	bpf_verify.o bpf_ptrace.o \
-	domain_manage.o
+	domain_manage.o domain_syscall.o \
+	hooks_cred.o
diff --git a/security/landlock/common.h b/security/landlock/common.h
index fb2990eb5fb4..3ae8340a5b3d 100644
--- a/security/landlock/common.h
+++ b/security/landlock/common.h
@@ -10,9 +10,15 @@
 #define _SECURITY_LANDLOCK_COMMON_H
 
 #include <linux/bpf.h>
+#include <linux/cred.h>
 #include <linux/filter.h>
+#include <linux/lsm_hooks.h>
 #include <linux/refcount.h>
 
+#define LANDLOCK_NAME "landlock"
+
+extern struct lsm_blob_sizes landlock_blob_sizes;
+
 enum landlock_hook_type {
 	LANDLOCK_HOOK_PTRACE = 1,
 };
@@ -43,6 +49,16 @@ struct landlock_domain {
 	refcount_t usage;
 };
 
+struct landlock_cred_security {
+	struct landlock_domain *domain;
+};
+
+static inline struct landlock_cred_security *landlock_cred(
+		const struct cred *cred)
+{
+	return cred->security + landlock_blob_sizes.lbs_cred;
+}
+
 /**
  * get_hook_index - get an index for the programs of struct landlock_prog_set
  *
diff --git a/security/landlock/domain_syscall.c b/security/landlock/domain_syscall.c
new file mode 100644
index 000000000000..022393841a0a
--- /dev/null
+++ b/security/landlock/domain_syscall.c
@@ -0,0 +1,93 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Landlock LSM - seccomp syscall
+ *
+ * Copyright © 2016-2018 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2018-2019 ANSSI
+ */
+
+#ifdef CONFIG_SECCOMP_FILTER
+
+#include <asm/current.h>
+#include <linux/bpf.h>
+#include <linux/capability.h>
+#include <linux/err.h>
+#include <linux/errno.h>
+#include <linux/filter.h>
+#include <linux/landlock.h>
+#include <linux/refcount.h>
+#include <linux/sched.h>
+#include <linux/seccomp.h>
+#include <linux/uaccess.h>
+
+#include "common.h"
+#include "domain_manage.h"
+
+/**
+ * landlock_seccomp_prepend_prog - attach a Landlock program to the current
+ *                                 task
+ *
+ * current->cred->security[landlock]->domain is lazily allocated. When a new
+ * credential is created, only a pointer is copied.  When a new Landlock
+ * program is added by a task, if there is other references to this task's
+ * domain, then a new allocation is made to contain an array pointing to
+ * Landlock program lists.  This design enable low-performance impact and is
+ * memory efficient while keeping the property of prepend-only programs.
+ *
+ * For now, installing a Landlock program requires that the requesting task has
+ * the global CAP_SYS_ADMIN. We cannot force the use of no_new_privs to not
+ * exclude containers where a process may legitimately acquire more privileges
+ * thanks to an SUID binary.
+ *
+ * @flags: not used, must be 0
+ * @user_bpf_fd: file descriptor pointing to a loaded Landlock prog
+ */
+int landlock_seccomp_prepend_prog(unsigned int flags,
+		const int __user *user_bpf_fd)
+{
+	struct landlock_domain *new_dom;
+	struct cred *new_cred;
+	struct landlock_cred_security *new_llcred;
+	struct bpf_prog *prog;
+	int bpf_fd, err;
+
+	/*
+	 * It is planned to replaced the CAP_SYS_ADMIN check with a
+	 * no_new_privs check to allow unprivileged tasks to sandbox
+	 * themselves.  However, they may not be allowed to directly create an
+	 * eBPF program, but could received it from a privileged service.
+	 */
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+	/* enable to check if Landlock is supported with early EFAULT */
+	if (!user_bpf_fd)
+		return -EFAULT;
+	if (flags)
+		return -EINVAL;
+	err = get_user(bpf_fd, user_bpf_fd);
+	if (err)
+		return err;
+	prog = bpf_prog_get(bpf_fd);
+	if (IS_ERR(prog))
+		return PTR_ERR(prog);
+
+	new_cred = prepare_creds();
+	if (!new_cred) {
+		bpf_prog_put(prog);
+		return -ENOMEM;
+	}
+	new_llcred = landlock_cred(new_cred);
+	/* the new creds are an atomic copy of the current creds */
+	new_dom = landlock_prepend_prog(new_llcred->domain, prog);
+	bpf_prog_put(prog);
+	if (IS_ERR(new_dom)) {
+		abort_creds(new_cred);
+		return PTR_ERR(new_dom);
+	}
+	/* replace the old (prepared) domain */
+	landlock_put_domain(new_llcred->domain);
+	new_llcred->domain = new_dom;
+	return commit_creds(new_cred);
+}
+
+#endif /* CONFIG_SECCOMP_FILTER */
diff --git a/security/landlock/hooks_cred.c b/security/landlock/hooks_cred.c
new file mode 100644
index 000000000000..def8678019a0
--- /dev/null
+++ b/security/landlock/hooks_cred.c
@@ -0,0 +1,47 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Landlock LSM - credential hooks
+ *
+ * Copyright © 2017-2019 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2018-2019 ANSSI
+ */
+
+#include <linux/cred.h>
+#include <linux/lsm_hooks.h>
+
+#include "common.h"
+#include "domain_manage.h"
+#include "hooks_cred.h"
+
+static int hook_cred_prepare(struct cred *new, const struct cred *old,
+		gfp_t gfp)
+{
+	const struct landlock_cred_security *cred_old = landlock_cred(old);
+	struct landlock_cred_security *cred_new = landlock_cred(new);
+	struct landlock_domain *dom_old;
+
+	dom_old = cred_old->domain;
+	if (dom_old) {
+		landlock_get_domain(dom_old);
+		cred_new->domain = dom_old;
+	} else {
+		cred_new->domain = NULL;
+	}
+	return 0;
+}
+
+static void hook_cred_free(struct cred *cred)
+{
+	landlock_put_domain(landlock_cred(cred)->domain);
+}
+
+static struct security_hook_list landlock_hooks[] = {
+	LSM_HOOK_INIT(cred_prepare, hook_cred_prepare),
+	LSM_HOOK_INIT(cred_free, hook_cred_free),
+};
+
+__init void landlock_add_hooks_cred(void)
+{
+	security_add_hooks(landlock_hooks, ARRAY_SIZE(landlock_hooks),
+			LANDLOCK_NAME);
+}
diff --git a/security/landlock/hooks_cred.h b/security/landlock/hooks_cred.h
new file mode 100644
index 000000000000..641d66f6bf9a
--- /dev/null
+++ b/security/landlock/hooks_cred.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Landlock LSM - credential hooks headers
+ *
+ * Copyright © 2019 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2019 ANSSI
+ */
+
+#ifndef _SECURITY_LANDLOCK_HOOKS_CRED_H
+#define _SECURITY_LANDLOCK_HOOKS_CRED_H
+
+__init void landlock_add_hooks_cred(void);
+
+#endif /* _SECURITY_LANDLOCK_HOOKS_CRED_H */
diff --git a/security/landlock/init.c b/security/landlock/init.c
new file mode 100644
index 000000000000..8836ec4defd3
--- /dev/null
+++ b/security/landlock/init.c
@@ -0,0 +1,30 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Landlock LSM - initialization
+ *
+ * Copyright © 2016-2019 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2018-2019 ANSSI
+ */
+
+#include <linux/lsm_hooks.h>
+
+#include "common.h"
+#include "hooks_cred.h"
+
+static int __init landlock_init(void)
+{
+	pr_info(LANDLOCK_NAME ": Registering hooks\n");
+	landlock_add_hooks_cred();
+	return 0;
+}
+
+struct lsm_blob_sizes landlock_blob_sizes __lsm_ro_after_init = {
+	.lbs_cred = sizeof(struct landlock_cred_security),
+};
+
+DEFINE_LSM(LANDLOCK_NAME) = {
+	.name = LANDLOCK_NAME,
+	.order = LSM_ORDER_LAST,
+	.blobs = &landlock_blob_sizes,
+	.init = landlock_init,
+};
diff --git a/security/security.c b/security/security.c
index 1bc000f834e2..03c7dce9e014 100644
--- a/security/security.c
+++ b/security/security.c
@@ -264,6 +264,21 @@ static void __init ordered_lsm_parse(const char *order, const char *origin)
 		}
 	}
 
+	/*
+	 * In case of an unprivileged access-control, we don't want to give the
+	 * ability to any process to do some checks (e.g. through an eBPF
+	 * program) on kernel objects (e.g. files) if a privileged security
+	 * policy forbid their access.  We must then load
+	 * potentially-unprivileged security modules after all other LSMs.
+	 *
+	 * LSM_ORDER_LAST is always last and does not appear in the modifiable
+	 * ordered list of enabled LSMs.
+	 */
+	for (lsm = __start_lsm_info; lsm < __end_lsm_info; lsm++) {
+		if (lsm->order == LSM_ORDER_LAST)
+			append_ordered_lsm(lsm, "last");
+	}
+
 	/* Disable all LSMs not in the ordered list. */
 	for (lsm = __start_lsm_info; lsm < __end_lsm_info; lsm++) {
 		if (exists_ordered_lsm(lsm))
-- 
2.23.0


^ permalink raw reply related

* [PATCH bpf-next v13 6/7] bpf,landlock: Add tests for the Landlock ptrace program type
From: Mickaël Salaün @ 2019-11-04 17:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mickaël Salaün, Alexei Starovoitov, Andy Lutomirski,
	Casey Schaufler, Daniel Borkmann, David Drysdale, Florent Revest,
	James Morris, Jann Horn, John Johansen, Jonathan Corbet,
	Kees Cook, KP Singh, Michael Kerrisk, Mickaël Salaün,
	Paul Moore, Sargun Dhillon, Serge E . Hallyn, Shuah Khan,
	Stephen Smalley, Tejun Heo, Tetsuo Handa, Tycho Andersen,
	Will Drewry, bpf, kernel-hardening, linux-api,
	linux-security-module, netdev
In-Reply-To: <20191104172146.30797-1-mic@digikod.net>

Test eBPF program context access and ptrace hooks semantic.

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: James Morris <jmorris@namei.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Will Drewry <wad@chromium.org>
---

Changes since v12:
* add documentation and diagrams (by Vincent Dagonneau)

Changes since v11:
* cosmetic fixes

Changes since v10:
* rework tests with new Landlock ptrace programs which restrict ptrace
  thanks to the task_landlock_ptrace_ancestor() helper
* simplify ptrace tests (make expect_ptrace implicit)
* add tests:
  * check a child process tracing its parent
  * check Landlock domain without ptrace enforcement (e.g. useful for
    audit/signaling purpose)
  * check inherited-only domains
  * check task pointer arithmetic
* fix flaky test for multi-core
* increase log size
* cosmetic renames
* update and improve the Makefile

Changes since v9:
* replace subtype with expected_attach_type and expected_attach_triggers
* rename inode_map_lookup() into inode_map_lookup_elem()
* check for inode map entry without value (which is now possible thanks
  to the pointer null check)
* use read-only inode map for Landlock programs

Changes since v8:
* update eBPF include path for macros
* use TEST_GEN_PROGS and use the generic "clean" target
* add more verbose errors
* update the bpf/verifier files
* remove chain tests (from landlock and bpf/verifier)
* replace the whitelist tests with blacklist tests (because of stateless
  Landlock programs): remove "dotdot" tests and other depth tests
* sync the landlock Makefile with its bpf sibling directory and use
  bpf_load_program_xattr()

Changes since v7:
* update tests and add new ones for filesystem hierarchy and Landlock
  chains.

Changes since v6:
* use the new kselftest_harness.h
* use const variables
* replace ASSERT_STEP with ASSERT_*
* rename BPF_PROG_TYPE_LANDLOCK to BPF_PROG_TYPE_LANDLOCK_RULE
* force sample library rebuild
* fix install target

Changes since v5:
* add subtype test
* add ptrace tests
* split and rename files
* cleanup and rebase
---
 scripts/bpf_helpers_doc.py                    |   1 +
 tools/include/uapi/linux/bpf.h                |  23 +-
 tools/include/uapi/linux/landlock.h           |  22 ++
 tools/lib/bpf/libbpf_probes.c                 |   3 +
 tools/testing/selftests/bpf/config            |   3 +
 tools/testing/selftests/bpf/test_verifier.c   |   1 +
 .../testing/selftests/bpf/verifier/landlock.c |  56 ++++
 tools/testing/selftests/landlock/.gitignore   |   5 +
 tools/testing/selftests/landlock/Makefile     |  27 ++
 tools/testing/selftests/landlock/config       |   5 +
 tools/testing/selftests/landlock/test.h       |  48 +++
 tools/testing/selftests/landlock/test_base.c  |  24 ++
 .../testing/selftests/landlock/test_ptrace.c  | 289 ++++++++++++++++++
 13 files changed, 506 insertions(+), 1 deletion(-)
 create mode 100644 tools/include/uapi/linux/landlock.h
 create mode 100644 tools/testing/selftests/bpf/verifier/landlock.c
 create mode 100644 tools/testing/selftests/landlock/.gitignore
 create mode 100644 tools/testing/selftests/landlock/Makefile
 create mode 100644 tools/testing/selftests/landlock/config
 create mode 100644 tools/testing/selftests/landlock/test.h
 create mode 100644 tools/testing/selftests/landlock/test_base.c
 create mode 100644 tools/testing/selftests/landlock/test_ptrace.c

diff --git a/scripts/bpf_helpers_doc.py b/scripts/bpf_helpers_doc.py
index 7548569e8076..8e4c0fe75663 100755
--- a/scripts/bpf_helpers_doc.py
+++ b/scripts/bpf_helpers_doc.py
@@ -466,6 +466,7 @@ class PrinterHelpers(Printer):
             'const struct sk_buff': 'const struct __sk_buff',
             'struct sk_msg_buff': 'struct sk_msg_md',
             'struct xdp_buff': 'struct xdp_md',
+            'struct task_struct': 'void',
     }
 
     def print_header(self):
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 4af8b0819a32..c88436b97163 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -173,6 +173,7 @@ enum bpf_prog_type {
 	BPF_PROG_TYPE_CGROUP_SYSCTL,
 	BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE,
 	BPF_PROG_TYPE_CGROUP_SOCKOPT,
+	BPF_PROG_TYPE_LANDLOCK_HOOK,
 };
 
 enum bpf_attach_type {
@@ -199,6 +200,7 @@ enum bpf_attach_type {
 	BPF_CGROUP_UDP6_RECVMSG,
 	BPF_CGROUP_GETSOCKOPT,
 	BPF_CGROUP_SETSOCKOPT,
+	BPF_LANDLOCK_PTRACE,
 	__MAX_BPF_ATTACH_TYPE
 };
 
@@ -2775,6 +2777,24 @@ union bpf_attr {
  * 		restricted to raw_tracepoint bpf programs.
  * 	Return
  * 		0 on success, or a negative error in case of failure.
+ *
+ * int bpf_task_landlock_ptrace_ancestor(struct task_struct *parent, struct task_struct *child)
+ *	Description
+ *		Check the relation of a potentially parent task with a child
+ *		one, according to their Landlock ptrace hook programs.
+ *	Return
+ *		**-EINVAL** if the child's ptrace programs are not comparable
+ *		to the parent ones, i.e. one of them is an empty set.
+ *
+ *		**-ENOENT** if the parent's ptrace programs are either in a
+ *		separate hierarchy of the child ones, or if the parent's ptrace
+ *		programs are a superset of the child ones.
+ *
+ *		0 if the parent's ptrace programs are the same as the child
+ *		ones.
+ *
+ *		1 if the parent's ptrace programs are indeed a subset of the
+ *		child ones.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -2888,7 +2908,8 @@ union bpf_attr {
 	FN(sk_storage_delete),		\
 	FN(send_signal),		\
 	FN(tcp_gen_syncookie),		\
-	FN(skb_output),
+	FN(skb_output),			\
+	FN(task_landlock_ptrace_ancestor),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
diff --git a/tools/include/uapi/linux/landlock.h b/tools/include/uapi/linux/landlock.h
new file mode 100644
index 000000000000..3db2d190c4e7
--- /dev/null
+++ b/tools/include/uapi/linux/landlock.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Landlock - UAPI headers
+ *
+ * Copyright © 2017-2019 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2018-2019 ANSSI
+ */
+
+#ifndef _UAPI__LINUX_LANDLOCK_H__
+#define _UAPI__LINUX_LANDLOCK_H__
+
+#include <linux/types.h>
+
+#define LANDLOCK_RET_ALLOW	0
+#define LANDLOCK_RET_DENY	1
+
+struct landlock_context_ptrace {
+	__u64 tracer;
+	__u64 tracee;
+};
+
+#endif /* _UAPI__LINUX_LANDLOCK_H__ */
diff --git a/tools/lib/bpf/libbpf_probes.c b/tools/lib/bpf/libbpf_probes.c
index 4b0b0364f5fc..1e0d6346a7c7 100644
--- a/tools/lib/bpf/libbpf_probes.c
+++ b/tools/lib/bpf/libbpf_probes.c
@@ -78,6 +78,9 @@ probe_load(enum bpf_prog_type prog_type, const struct bpf_insn *insns,
 	case BPF_PROG_TYPE_KPROBE:
 		xattr.kern_version = get_kernel_version();
 		break;
+	case BPF_PROG_TYPE_LANDLOCK_HOOK:
+		xattr.expected_attach_type = BPF_LANDLOCK_PTRACE;
+		break;
 	case BPF_PROG_TYPE_UNSPEC:
 	case BPF_PROG_TYPE_SOCKET_FILTER:
 	case BPF_PROG_TYPE_SCHED_CLS:
diff --git a/tools/testing/selftests/bpf/config b/tools/testing/selftests/bpf/config
index 5dc109f4c097..3161a88a6059 100644
--- a/tools/testing/selftests/bpf/config
+++ b/tools/testing/selftests/bpf/config
@@ -35,3 +35,6 @@ CONFIG_MPLS_ROUTING=m
 CONFIG_MPLS_IPTUNNEL=m
 CONFIG_IPV6_SIT=m
 CONFIG_BPF_JIT=y
+CONFIG_SECCOMP_FILTER=y
+CONFIG_SECURITY=y
+CONFIG_SECURITY_LANDLOCK=y
diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index d27fd929abb9..74f249dafc0b 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -30,6 +30,7 @@
 #include <linux/bpf.h>
 #include <linux/if_ether.h>
 #include <linux/btf.h>
+#include <linux/landlock.h>
 
 #include <bpf/bpf.h>
 #include <bpf/libbpf.h>
diff --git a/tools/testing/selftests/bpf/verifier/landlock.c b/tools/testing/selftests/bpf/verifier/landlock.c
new file mode 100644
index 000000000000..59cd333745dc
--- /dev/null
+++ b/tools/testing/selftests/bpf/verifier/landlock.c
@@ -0,0 +1,56 @@
+{
+	"landlock/ptrace: always accept",
+	.prog_type = BPF_PROG_TYPE_LANDLOCK_HOOK,
+	.expected_attach_type = BPF_LANDLOCK_PTRACE,
+	.insns = {
+		BPF_MOV32_IMM(BPF_REG_0, 0),
+		BPF_EXIT_INSN(),
+	},
+	.result = ACCEPT,
+},
+{
+	"landlock/ptrace: forbid arbitrary return value",
+	.prog_type = BPF_PROG_TYPE_LANDLOCK_HOOK,
+	.expected_attach_type = BPF_LANDLOCK_PTRACE,
+	.insns = {
+		BPF_MOV32_IMM(BPF_REG_0, 2),
+		BPF_EXIT_INSN(),
+	},
+	.result = REJECT,
+	.errstr = "At program exit the register R0 has value (0x2; 0x0) should have been in (0x0; 0x1)",
+},
+{
+	"landlock/ptrace: read context and call dedicated helper",
+	.prog_type = BPF_PROG_TYPE_LANDLOCK_HOOK,
+	.expected_attach_type = BPF_LANDLOCK_PTRACE,
+	.insns = {
+		BPF_MOV64_REG(BPF_REG_6, BPF_REG_1),
+		BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_6,
+			offsetof(struct landlock_context_ptrace, tracer)),
+		BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_6,
+			offsetof(struct landlock_context_ptrace, tracer)),
+		BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
+				BPF_FUNC_task_landlock_ptrace_ancestor),
+		BPF_MOV32_IMM(BPF_REG_0, 0),
+		BPF_EXIT_INSN(),
+	},
+	.result = ACCEPT,
+},
+{
+	"landlock/ptrace: forbid pointer arithmetic",
+	.prog_type = BPF_PROG_TYPE_LANDLOCK_HOOK,
+	.expected_attach_type = BPF_LANDLOCK_PTRACE,
+	.insns = {
+		BPF_MOV64_REG(BPF_REG_6, BPF_REG_1),
+		BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_6,
+			offsetof(struct landlock_context_ptrace, tracer)),
+		BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 1),
+		BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_6,
+			offsetof(struct landlock_context_ptrace, tracee)),
+		BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, 1),
+		BPF_MOV32_IMM(BPF_REG_0, 0),
+		BPF_EXIT_INSN(),
+	},
+	.result = REJECT,
+	.errstr = "R1 pointer arithmetic on task prohibited",
+},
diff --git a/tools/testing/selftests/landlock/.gitignore b/tools/testing/selftests/landlock/.gitignore
new file mode 100644
index 000000000000..4c5c01d23fe0
--- /dev/null
+++ b/tools/testing/selftests/landlock/.gitignore
@@ -0,0 +1,5 @@
+/feature
+/fixdep
+/*libbpf*
+/test_base
+/test_ptrace
diff --git a/tools/testing/selftests/landlock/Makefile b/tools/testing/selftests/landlock/Makefile
new file mode 100644
index 000000000000..2da77c30e77f
--- /dev/null
+++ b/tools/testing/selftests/landlock/Makefile
@@ -0,0 +1,27 @@
+# SPDX-License-Identifier: GPL-2.0
+
+LIBDIR := $(abspath ../../../lib)
+BPFDIR := $(LIBDIR)/bpf
+TOOLSDIR := $(abspath ../../../include)
+APIDIR := $(TOOLSDIR)/uapi
+
+CFLAGS += -g -Wall -O2 -I$(APIDIR) -I$(LIBDIR) -I$(TOOLSDIR)
+LDLIBS += -lelf
+
+test_src = $(wildcard test_*.c)
+
+TEST_GEN_PROGS := $(test_src:.c=)
+
+include ../lib.mk
+
+BPFOBJ := $(OUTPUT)/libbpf.a
+
+$(TEST_GEN_PROGS): $(BPFOBJ) ../kselftest_harness.h
+
+.PHONY: force
+
+# force a rebuild of BPFOBJ when its dependencies are updated
+force:
+
+$(BPFOBJ): force
+	$(MAKE) -C $(BPFDIR) OUTPUT=$(OUTPUT)/
diff --git a/tools/testing/selftests/landlock/config b/tools/testing/selftests/landlock/config
new file mode 100644
index 000000000000..fa5081b840ad
--- /dev/null
+++ b/tools/testing/selftests/landlock/config
@@ -0,0 +1,5 @@
+CONFIG_BPF=y
+CONFIG_BPF_SYSCALL=y
+CONFIG_SECCOMP_FILTER=y
+CONFIG_SECURITY=y
+CONFIG_SECURITY_LANDLOCK=y
diff --git a/tools/testing/selftests/landlock/test.h b/tools/testing/selftests/landlock/test.h
new file mode 100644
index 000000000000..836df68b6bb8
--- /dev/null
+++ b/tools/testing/selftests/landlock/test.h
@@ -0,0 +1,48 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Landlock helpers
+ *
+ * Copyright © 2017-2019 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2019 ANSSI
+ */
+
+#include <bpf/bpf.h>
+#include <errno.h>
+#include <linux/filter.h>
+#include <linux/landlock.h>
+#include <linux/seccomp.h>
+#include <sys/prctl.h>
+#include <sys/syscall.h>
+
+#include "../kselftest_harness.h"
+#include "../../../../samples/bpf/bpf_load.h"
+
+#ifndef SECCOMP_PREPEND_LANDLOCK_PROG
+#define SECCOMP_PREPEND_LANDLOCK_PROG	4
+#endif
+
+#ifndef seccomp
+static int __attribute__((unused)) seccomp(unsigned int op, unsigned int flags,
+		void *args)
+{
+	errno = 0;
+	return syscall(__NR_seccomp, op, flags, args);
+}
+#endif
+
+static int __attribute__((unused)) ll_bpf_load_program(
+		const struct bpf_insn *bpf_insns, size_t insns_len,
+		char *log_buf, size_t log_buf_sz,
+		const enum bpf_attach_type attach_type)
+{
+	struct bpf_load_program_attr load_attr;
+
+	memset(&load_attr, 0, sizeof(struct bpf_load_program_attr));
+	load_attr.prog_type = BPF_PROG_TYPE_LANDLOCK_HOOK;
+	load_attr.expected_attach_type = attach_type;
+	load_attr.insns = bpf_insns;
+	load_attr.insns_cnt = insns_len / sizeof(struct bpf_insn);
+	load_attr.license = "GPL";
+
+	return bpf_load_program_xattr(&load_attr, log_buf, log_buf_sz);
+}
diff --git a/tools/testing/selftests/landlock/test_base.c b/tools/testing/selftests/landlock/test_base.c
new file mode 100644
index 000000000000..db46f39048cb
--- /dev/null
+++ b/tools/testing/selftests/landlock/test_base.c
@@ -0,0 +1,24 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Landlock tests - base
+ *
+ * Copyright © 2017-2019 Mickaël Salaün <mic@digikod.net>
+ */
+
+#define _GNU_SOURCE
+#include <errno.h>
+
+#include "test.h"
+
+TEST(seccomp_landlock)
+{
+	int ret;
+
+	ret = seccomp(SECCOMP_PREPEND_LANDLOCK_PROG, 0, NULL);
+	EXPECT_EQ(-1, ret);
+	EXPECT_EQ(EFAULT, errno) {
+		TH_LOG("Kernel does not support CONFIG_SECURITY_LANDLOCK");
+	}
+}
+
+TEST_HARNESS_MAIN
diff --git a/tools/testing/selftests/landlock/test_ptrace.c b/tools/testing/selftests/landlock/test_ptrace.c
new file mode 100644
index 000000000000..7f4945a61758
--- /dev/null
+++ b/tools/testing/selftests/landlock/test_ptrace.c
@@ -0,0 +1,289 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Landlock tests - ptrace
+ *
+ * Copyright © 2017-2019 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2019 ANSSI
+ */
+
+#define _GNU_SOURCE
+#include <signal.h>
+#include <sys/ptrace.h>
+#include <sys/types.h>
+#include <sys/wait.h>
+#include <unistd.h>
+
+#include "test.h"
+
+#define LOG_SIZE 512
+
+static void create_domain(struct __test_metadata *_metadata,
+		bool scoped_ptrace, bool inherited_only)
+{
+	const struct bpf_insn prog_void[] = {
+		BPF_MOV32_IMM(BPF_REG_0, LANDLOCK_RET_ALLOW),
+		BPF_EXIT_INSN(),
+	};
+	const struct bpf_insn prog_check[] = {
+		BPF_ALU64_REG(BPF_MOV, BPF_REG_6, BPF_REG_1),
+		BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_6,
+			offsetof(struct landlock_context_ptrace, tracer)),
+		BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_6,
+			offsetof(struct landlock_context_ptrace, tracee)),
+		BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
+				BPF_FUNC_task_landlock_ptrace_ancestor),
+		/*
+		 * If @tracee is an ancestor or at the same level of @tracer,
+		 * then allow ptrace (warning: do not use BPF_JGE 0).
+		 */
+		BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, inherited_only ? 0 : 1, 2),
+		BPF_MOV32_IMM(BPF_REG_0, LANDLOCK_RET_DENY),
+		BPF_EXIT_INSN(),
+		BPF_MOV32_IMM(BPF_REG_0, LANDLOCK_RET_ALLOW),
+		BPF_EXIT_INSN(),
+	};
+	int prog;
+	char log[LOG_SIZE] = "";
+
+	if (scoped_ptrace)
+		prog = ll_bpf_load_program(prog_check, sizeof(prog_check),
+				log, sizeof(log), BPF_LANDLOCK_PTRACE);
+	else
+		prog = ll_bpf_load_program(prog_void, sizeof(prog_void),
+				log, sizeof(log), BPF_LANDLOCK_PTRACE);
+	ASSERT_NE(-1, prog) {
+		TH_LOG("Failed to load the %s program: %s\n%s",
+				scoped_ptrace ? "check" : "void",
+				strerror(errno), log);
+	}
+	ASSERT_EQ(0, seccomp(SECCOMP_PREPEND_LANDLOCK_PROG, 0, &prog)) {
+		TH_LOG("Failed to create a Landlock domain: %s",
+				strerror(errno));
+	}
+	EXPECT_EQ(0, close(prog));
+}
+
+/* test PTRACE_TRACEME and PTRACE_ATTACH for parent and child */
+static void _check_ptrace(struct __test_metadata *_metadata,
+		bool scoped_ptrace, bool domain_both,
+		bool domain_parent, bool domain_child)
+{
+	pid_t child, parent;
+	int status;
+	int pipe_child[2], pipe_parent[2];
+	char buf_parent;
+	const bool inherited_only = domain_both && !domain_parent &&
+		!domain_child;
+
+	parent = getpid();
+
+	ASSERT_EQ(0, pipe(pipe_child));
+	ASSERT_EQ(0, pipe(pipe_parent));
+	if (domain_both)
+		create_domain(_metadata, scoped_ptrace, inherited_only);
+
+	child = fork();
+	ASSERT_LE(0, child);
+	if (child == 0) {
+		char buf_child;
+
+		EXPECT_EQ(0, close(pipe_parent[1]));
+		EXPECT_EQ(0, close(pipe_child[0]));
+		if (domain_child)
+			create_domain(_metadata, scoped_ptrace, inherited_only);
+
+		/* sync #1 */
+		ASSERT_EQ(1, read(pipe_parent[0], &buf_child, 1)) {
+			TH_LOG("Failed to read() sync #1 from parent");
+		}
+		ASSERT_EQ('.', buf_child);
+
+		/* test the parent protection */
+		ASSERT_EQ((domain_child && scoped_ptrace) ? -1 : 0,
+				ptrace(PTRACE_ATTACH, parent, NULL, 0));
+		if (domain_child && scoped_ptrace) {
+			ASSERT_EQ(EPERM, errno);
+		} else {
+			ASSERT_EQ(parent, waitpid(parent, &status, 0));
+			ASSERT_EQ(1, WIFSTOPPED(status));
+			ASSERT_EQ(0, ptrace(PTRACE_DETACH, parent, NULL, 0));
+		}
+
+		/* sync #2 */
+		ASSERT_EQ(1, write(pipe_child[1], ".", 1)) {
+			TH_LOG("Failed to write() sync #2 to parent");
+		}
+
+		/* test traceme */
+		ASSERT_EQ((domain_parent && scoped_ptrace) ? -1 : 0,
+				ptrace(PTRACE_TRACEME));
+		if (domain_parent && scoped_ptrace) {
+			ASSERT_EQ(EPERM, errno);
+		} else {
+			ASSERT_EQ(0, raise(SIGSTOP));
+		}
+
+		/* sync #3 */
+		ASSERT_EQ(1, read(pipe_parent[0], &buf_child, 1)) {
+			TH_LOG("Failed to read() sync #3 from parent");
+		}
+		ASSERT_EQ('.', buf_child);
+		_exit(_metadata->passed ? EXIT_SUCCESS : EXIT_FAILURE);
+	}
+
+	EXPECT_EQ(0, close(pipe_child[1]));
+	EXPECT_EQ(0, close(pipe_parent[0]));
+	if (domain_parent)
+		create_domain(_metadata, scoped_ptrace, inherited_only);
+
+	/* sync #1 */
+	ASSERT_EQ(1, write(pipe_parent[1], ".", 1)) {
+		TH_LOG("Failed to write() sync #1 to child");
+	}
+
+	/* test the parent protection */
+	/* sync #2 */
+	ASSERT_EQ(1, read(pipe_child[0], &buf_parent, 1)) {
+		TH_LOG("Failed to read() sync #2 from child");
+	}
+	ASSERT_EQ('.', buf_parent);
+
+	/* test traceme */
+	if (!(domain_parent && scoped_ptrace)) {
+		ASSERT_EQ(child, waitpid(child, &status, 0));
+		ASSERT_EQ(1, WIFSTOPPED(status));
+		ASSERT_EQ(0, ptrace(PTRACE_DETACH, child, NULL, 0));
+	}
+	/* test attach */
+	ASSERT_EQ((domain_parent && scoped_ptrace) ? -1 : 0,
+			ptrace(PTRACE_ATTACH, child, NULL, 0));
+	if (domain_parent && scoped_ptrace) {
+		ASSERT_EQ(EPERM, errno);
+	} else {
+		ASSERT_EQ(child, waitpid(child, &status, 0));
+		ASSERT_EQ(1, WIFSTOPPED(status));
+		ASSERT_EQ(0, ptrace(PTRACE_DETACH, child, NULL, 0));
+	}
+
+	/* sync #3 */
+	ASSERT_EQ(1, write(pipe_parent[1], ".", 1)) {
+		TH_LOG("Failed to write() sync #3 to child");
+	}
+	ASSERT_EQ(child, waitpid(child, &status, 0));
+	if (WIFSIGNALED(status) || WEXITSTATUS(status))
+		_metadata->passed = 0;
+}
+
+/* keep the *_scoped order to check program inheritance */
+#define CHECK_PTRACE(name, domain_both, domain_parent, domain_child) \
+	TEST(name ## _unscoped) { \
+		_check_ptrace(_metadata, false, domain_both, domain_parent, \
+				domain_child); \
+	} \
+	TEST(name ## _scoped) { \
+		_check_ptrace(_metadata, false, domain_both, domain_parent, \
+				domain_child); \
+		_check_ptrace(_metadata, true, domain_both, domain_parent, \
+				domain_child); \
+	}
+
+/*
+ * Test multiple tracing combinations between a parent process P1 and a child
+ * process P2.
+ *
+ * Yama's scoped ptrace is presumed disabled.  If enabled, this additional
+ * restriction is enforced before any Landlock check, which means that all P2
+ * requests to trace P1 would be denied.
+ */
+
+/*
+ *        No domain
+ *
+ *   P1-.               P1 -> P2 : allow
+ *       \              P2 -> P1 : allow
+ *        'P2
+ */
+CHECK_PTRACE(allow_without_domain, false, false, false);
+
+/*
+ *        Child domain
+ *
+ *   P1--.              P1 -> P2 : allow
+ *        \             P2 -> P1 : deny
+ *        .'-----.
+ *        |  P2  |
+ *        '------'
+ */
+CHECK_PTRACE(allow_with_one_domain, false, false, true);
+
+/*
+ *        Parent domain
+ * .------.
+ * |  P1  --.           P1 -> P2 : deny
+ * '------'  \          P2 -> P1 : allow
+ *            '
+ *            P2
+ */
+CHECK_PTRACE(deny_with_parent_domain, false, true, false);
+
+/*
+ *        Parent + child domain (siblings)
+ * .------.
+ * |  P1  ---.          P1 -> P2 : deny
+ * '------'   \         P2 -> P1 : deny
+ *         .---'--.
+ *         |  P2  |
+ *         '------'
+ */
+CHECK_PTRACE(deny_with_sibling_domain, false, true, true);
+
+/*
+ *         Same domain (inherited)
+ * .-------------.
+ * | P1----.     |      P1 -> P2 : allow
+ * |        \    |      P2 -> P1 : allow
+ * |         '   |
+ * |         P2  |
+ * '-------------'
+ */
+CHECK_PTRACE(allow_sibling_domain, true, false, false);
+
+/*
+ *         Inherited + child domain
+ * .-----------------.
+ * |  P1----.        |  P1 -> P2 : allow
+ * |         \       |  P2 -> P1 : deny
+ * |        .-'----. |
+ * |        |  P2  | |
+ * |        '------' |
+ * '-----------------'
+ */
+CHECK_PTRACE(allow_with_nested_domain, true, false, true);
+
+/*
+ *         Inherited + parent domain
+ * .-----------------.
+ * |.------.         |  P1 -> P2 : deny
+ * ||  P1  ----.     |  P2 -> P1 : allow
+ * |'------'    \    |
+ * |             '   |
+ * |             P2  |
+ * '-----------------'
+ */
+CHECK_PTRACE(deny_with_nested_and_parent_domain, true, true, false);
+
+/*
+ *         Inherited + parent and child domain (siblings)
+ * .-----------------.
+ * | .------.        |  P1 -> P2 : deny
+ * | |  P1  .        |  P2 -> P1 : deny
+ * | '------'\       |
+ * |          \      |
+ * |        .--'---. |
+ * |        |  P2  | |
+ * |        '------' |
+ * '-----------------'
+ */
+CHECK_PTRACE(deny_with_forked_domain, true, true, true);
+
+TEST_HARNESS_MAIN
-- 
2.23.0


^ permalink raw reply related

* [PATCH bpf-next v13 4/7] landlock: Add ptrace LSM hooks
From: Mickaël Salaün @ 2019-11-04 17:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mickaël Salaün, Alexei Starovoitov, Andy Lutomirski,
	Casey Schaufler, Daniel Borkmann, David Drysdale, Florent Revest,
	James Morris, Jann Horn, John Johansen, Jonathan Corbet,
	Kees Cook, KP Singh, Michael Kerrisk, Mickaël Salaün,
	Paul Moore, Sargun Dhillon, Serge E . Hallyn, Shuah Khan,
	Stephen Smalley, Tejun Heo, Tetsuo Handa, Tycho Andersen,
	Will Drewry, bpf, kernel-hardening, linux-api,
	linux-security-module, netdev
In-Reply-To: <20191104172146.30797-1-mic@digikod.net>

Add a first Landlock hook that can be used to enforce a security policy
or to audit some process activities.  For a sandboxing use-case, it is
needed to inform the kernel if a task can legitimately debug another.
ptrace(2) can also be used by an attacker to impersonate another task
and remain undetected while performing malicious activities.

Using ptrace(2) and related features on a target process can lead to a
privilege escalation.  A sandboxed task must then be able to tell the
kernel if another task is more privileged, via ptrace_may_access().

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: James Morris <jmorris@namei.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
Cc: Will Drewry <wad@chromium.org>
---

Changes since v12:
* ensure preemption is disabled before running BPF programs, cf. commit
  568f196756ad ("bpf: check that BPF programs run with preemption
  disabled")

Changes since v10:
* revamp and replace the static policy with a Landlock hook which may be
  used by the corresponding BPF_LANDLOCK_PTRACE program (attach) type
  and a dedicated process_cmp_landlock_ptrace() BPF helper
* check prog return value against LANDLOCK_RET_DENY (ret is a bitmask)

Changes since v6:
* factor out ptrace check
* constify pointers
* cleanup headers
* use the new security_add_hooks()
---
 security/landlock/Makefile       |   4 +-
 security/landlock/bpf_run.c      |  65 ++++++++++++++++++
 security/landlock/bpf_run.h      |  25 +++++++
 security/landlock/hooks_ptrace.c | 114 +++++++++++++++++++++++++++++++
 security/landlock/hooks_ptrace.h |  19 ++++++
 security/landlock/init.c         |   2 +
 6 files changed, 227 insertions(+), 2 deletions(-)
 create mode 100644 security/landlock/bpf_run.c
 create mode 100644 security/landlock/bpf_run.h
 create mode 100644 security/landlock/hooks_ptrace.c
 create mode 100644 security/landlock/hooks_ptrace.h

diff --git a/security/landlock/Makefile b/security/landlock/Makefile
index 0b291f2c027c..93e4c2f31c8a 100644
--- a/security/landlock/Makefile
+++ b/security/landlock/Makefile
@@ -1,6 +1,6 @@
 obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
 
 landlock-y := init.o \
-	bpf_verify.o bpf_ptrace.o \
+	bpf_verify.o bpf_run.o bpf_ptrace.o \
 	domain_manage.o domain_syscall.o \
-	hooks_cred.o
+	hooks_cred.o hooks_ptrace.o
diff --git a/security/landlock/bpf_run.c b/security/landlock/bpf_run.c
new file mode 100644
index 000000000000..454cd4b6e99b
--- /dev/null
+++ b/security/landlock/bpf_run.c
@@ -0,0 +1,65 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Landlock LSM - eBPF program evaluation
+ *
+ * Copyright © 2016-2019 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2018-2019 ANSSI
+ */
+
+#include <asm/current.h>
+#include <linux/bpf.h>
+#include <linux/errno.h>
+#include <linux/filter.h>
+#include <linux/preempt.h>
+#include <linux/rculist.h>
+#include <uapi/linux/landlock.h>
+
+#include "bpf_run.h"
+#include "common.h"
+#include "hooks_ptrace.h"
+
+static const void *get_prog_ctx(struct landlock_hook_ctx *hook_ctx)
+{
+	switch (hook_ctx->type) {
+	case LANDLOCK_HOOK_PTRACE:
+		return landlock_get_ctx_ptrace(hook_ctx->ctx_ptrace);
+	}
+	WARN_ON(1);
+	return NULL;
+}
+
+/**
+ * landlock_access_denied - run Landlock programs tied to a hook
+ *
+ * @domain: Landlock domain pointer
+ * @hook_ctx: non-NULL valid eBPF context pointer
+ *
+ * Return true if at least one program return deny, false otherwise.
+ */
+bool landlock_access_denied(struct landlock_domain *domain,
+		struct landlock_hook_ctx *hook_ctx)
+{
+	struct landlock_prog_list *prog_list;
+	const size_t hook = get_hook_index(hook_ctx->type);
+
+	if (!domain)
+		return false;
+
+	for (prog_list = domain->programs[hook]; prog_list;
+			prog_list = prog_list->prev) {
+		u32 ret;
+		const void *prog_ctx;
+
+		prog_ctx = get_prog_ctx(hook_ctx);
+		if (!prog_ctx || WARN_ON(IS_ERR(prog_ctx)))
+			return true;
+		preempt_disable();
+		rcu_read_lock();
+		ret = BPF_PROG_RUN(prog_list->prog, prog_ctx);
+		rcu_read_unlock();
+		preempt_enable();
+		if (ret & LANDLOCK_RET_DENY)
+			return true;
+	}
+	return false;
+}
diff --git a/security/landlock/bpf_run.h b/security/landlock/bpf_run.h
new file mode 100644
index 000000000000..3461cbb8ec12
--- /dev/null
+++ b/security/landlock/bpf_run.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Landlock LSM - eBPF program evaluation headers
+ *
+ * Copyright © 2016-2019 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2018-2019 ANSSI
+ */
+
+#ifndef _SECURITY_LANDLOCK_BPF_RUN_H
+#define _SECURITY_LANDLOCK_BPF_RUN_H
+
+#include "common.h"
+#include "hooks_ptrace.h"
+
+struct landlock_hook_ctx {
+	enum landlock_hook_type type;
+	union {
+		struct landlock_hook_ctx_ptrace *ctx_ptrace;
+	};
+};
+
+bool landlock_access_denied(struct landlock_domain *domain,
+		struct landlock_hook_ctx *hook_ctx);
+
+#endif /* _SECURITY_LANDLOCK_BPF_RUN_H */
diff --git a/security/landlock/hooks_ptrace.c b/security/landlock/hooks_ptrace.c
new file mode 100644
index 000000000000..8e518a472d04
--- /dev/null
+++ b/security/landlock/hooks_ptrace.c
@@ -0,0 +1,114 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Landlock LSM - ptrace hooks
+ *
+ * Copyright © 2017-2019 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2019 ANSSI
+ */
+
+#include <asm/current.h>
+#include <linux/cred.h>
+#include <linux/errno.h>
+#include <linux/kernel.h>
+#include <linux/lsm_hooks.h>
+#include <linux/sched.h>
+#include <uapi/linux/landlock.h>
+
+#include "bpf_run.h"
+#include "common.h"
+#include "hooks_ptrace.h"
+
+struct landlock_hook_ctx_ptrace {
+	struct landlock_context_ptrace prog_ctx;
+};
+
+const struct landlock_context_ptrace *landlock_get_ctx_ptrace(
+		const struct landlock_hook_ctx_ptrace *hook_ctx)
+{
+	if (WARN_ON(!hook_ctx))
+		return NULL;
+
+	return &hook_ctx->prog_ctx;
+}
+
+static int check_ptrace(struct landlock_domain *domain,
+		struct task_struct *tracer, struct task_struct *tracee)
+{
+	struct landlock_hook_ctx_ptrace ctx_ptrace = {
+		.prog_ctx = {
+			.tracer = (uintptr_t)tracer,
+			.tracee = (uintptr_t)tracee,
+		},
+	};
+	struct landlock_hook_ctx hook_ctx = {
+		.type = LANDLOCK_HOOK_PTRACE,
+		.ctx_ptrace = &ctx_ptrace,
+	};
+
+	return landlock_access_denied(domain, &hook_ctx) ? -EPERM : 0;
+}
+
+/**
+ * hook_ptrace_access_check - determine whether the current process may access
+ *			      another
+ *
+ * @child: the process to be accessed
+ * @mode: the mode of attachment
+ *
+ * If the current task (i.e. tracer) has one or multiple BPF_LANDLOCK_PTRACE
+ * programs, then run them with the `struct landlock_context_ptrace` context.
+ * If one of these programs return LANDLOCK_RET_DENY, then deny access with
+ * -EPERM, else allow it by returning 0.
+ */
+static int hook_ptrace_access_check(struct task_struct *child,
+		unsigned int mode)
+{
+	struct landlock_domain *dom_current;
+	const size_t hook = get_hook_index(LANDLOCK_HOOK_PTRACE);
+
+	dom_current = landlock_cred(current_cred())->domain;
+	if (!(dom_current && dom_current->programs[hook]))
+		return 0;
+	return check_ptrace(dom_current, current, child);
+}
+
+/**
+ * hook_ptrace_traceme - determine whether another process may trace the
+ *			 current one
+ *
+ * @parent: the task proposed to be the tracer
+ *
+ * If the parent task (i.e. tracer) has one or multiple BPF_LANDLOCK_PTRACE
+ * programs, then run them with the `struct landlock_context_ptrace` context.
+ * If one of these programs return LANDLOCK_RET_DENY, then deny access with
+ * -EPERM, else allow it by returning 0.
+ */
+static int hook_ptrace_traceme(struct task_struct *parent)
+{
+	struct landlock_domain *dom_parent;
+	const size_t hook = get_hook_index(LANDLOCK_HOOK_PTRACE);
+	int ret;
+
+	rcu_read_lock();
+	dom_parent = landlock_cred(__task_cred(parent))->domain;
+	if (!(dom_parent && dom_parent->programs[hook])) {
+		ret = 0;
+		goto put_rcu;
+	}
+	ret = check_ptrace(dom_parent, parent, current);
+
+put_rcu:
+	rcu_read_unlock();
+	return ret;
+}
+
+static struct security_hook_list landlock_hooks[] = {
+	LSM_HOOK_INIT(ptrace_access_check, hook_ptrace_access_check),
+	LSM_HOOK_INIT(ptrace_traceme, hook_ptrace_traceme),
+};
+
+__init void landlock_add_hooks_ptrace(void)
+{
+	security_add_hooks(landlock_hooks, ARRAY_SIZE(landlock_hooks),
+			LANDLOCK_NAME);
+}
diff --git a/security/landlock/hooks_ptrace.h b/security/landlock/hooks_ptrace.h
new file mode 100644
index 000000000000..53fe651bdb3e
--- /dev/null
+++ b/security/landlock/hooks_ptrace.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Landlock LSM - ptrace hooks headers
+ *
+ * Copyright © 2017-2019 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2019 ANSSI
+ */
+
+#ifndef _SECURITY_LANDLOCK_HOOKS_PTRACE_H
+#define _SECURITY_LANDLOCK_HOOKS_PTRACE_H
+
+struct landlock_hook_ctx_ptrace;
+
+const struct landlock_context_ptrace *landlock_get_ctx_ptrace(
+		const struct landlock_hook_ctx_ptrace *hook_ctx);
+
+__init void landlock_add_hooks_ptrace(void);
+
+#endif /* _SECURITY_LANDLOCK_HOOKS_PTRACE_H */
diff --git a/security/landlock/init.c b/security/landlock/init.c
index 8836ec4defd3..541aad17418e 100644
--- a/security/landlock/init.c
+++ b/security/landlock/init.c
@@ -10,11 +10,13 @@
 
 #include "common.h"
 #include "hooks_cred.h"
+#include "hooks_ptrace.h"
 
 static int __init landlock_init(void)
 {
 	pr_info(LANDLOCK_NAME ": Registering hooks\n");
 	landlock_add_hooks_cred();
+	landlock_add_hooks_ptrace();
 	return 0;
 }
 
-- 
2.23.0


^ permalink raw reply related

* [PATCH bpf-next v13 2/7] landlock: Add the management of domains
From: Mickaël Salaün @ 2019-11-04 17:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mickaël Salaün, Alexei Starovoitov, Andy Lutomirski,
	Casey Schaufler, Daniel Borkmann, David Drysdale, Florent Revest,
	James Morris, Jann Horn, John Johansen, Jonathan Corbet,
	Kees Cook, KP Singh, Michael Kerrisk, Mickaël Salaün,
	Paul Moore, Sargun Dhillon, Serge E . Hallyn, Shuah Khan,
	Stephen Smalley, Tejun Heo, Tetsuo Handa, Tycho Andersen,
	Will Drewry, bpf, kernel-hardening, linux-api,
	linux-security-module, netdev
In-Reply-To: <20191104172146.30797-1-mic@digikod.net>

A Landlock domain is a set of eBPF programs.  There is a list for each
different program types that can be run on a specific Landlock hook
(e.g. ptrace).  A domain is tied to a set of subjects (i.e. tasks).  A
Landlock program should not try (nor be able) to infer which subject is
currently enforced, but to have a unique security policy for all
subjects tied to the same domain.  This make the reasoning much easier
and help avoid pitfalls.

The next commits tie a domain to a task's credentials thanks to
seccomp(2), but we could use cgroups or a security file-system to
enforce a sysadmin-defined policy .

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: James Morris <jmorris@namei.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
Cc: Will Drewry <wad@chromium.org>
---

Changes since v12:
* move the landlock_put_domain() call from landlock_prepend_prog() to
  domain_syscall.c:landlock_seccomp_prepend_prog() (next commit): keep
  the same semantic but don't conditionally own the domain (safeguard
  domain owning)
* always copy a domain to-be-modified which guarantee domain
  immutability: the callers of landlock_prepend_prog() are now fully in
  charge of concurrency handling, which is currently only handled by
  prepare_creds() and commit_creds() (next commit)
* simplify landlock_prepend_prog()
* use a consistent naming

Changes since v11:
* remove old code from previous refactoring (removing the program
  chaining concept) and simplify program prepending (reported by Serge
  E. Hallyn):
  * simplify landlock_prepend_prog() and merge it with
    store_landlock_prog()
  * add new_prog_list() and rework new_landlock_domain()
  * remove the extra page allocation checks, only rely on the eBPF
    program checks
* replace the -EINVAL for the duplicate program check with the -EEXIST

Changes since v10:
* rename files and names to clearly define a domain
* create a standalone patch to ease review
---
 security/landlock/Makefile        |   3 +-
 security/landlock/common.h        |  38 ++++++++
 security/landlock/domain_manage.c | 152 ++++++++++++++++++++++++++++++
 security/landlock/domain_manage.h |  22 +++++
 4 files changed, 214 insertions(+), 1 deletion(-)
 create mode 100644 security/landlock/domain_manage.c
 create mode 100644 security/landlock/domain_manage.h

diff --git a/security/landlock/Makefile b/security/landlock/Makefile
index 682b798c6b76..dd5f70185778 100644
--- a/security/landlock/Makefile
+++ b/security/landlock/Makefile
@@ -1,4 +1,5 @@
 obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
 
 landlock-y := \
-	bpf_verify.o bpf_ptrace.o
+	bpf_verify.o bpf_ptrace.o \
+	domain_manage.o
diff --git a/security/landlock/common.h b/security/landlock/common.h
index 0234c4bc4acd..fb2990eb5fb4 100644
--- a/security/landlock/common.h
+++ b/security/landlock/common.h
@@ -11,11 +11,49 @@
 
 #include <linux/bpf.h>
 #include <linux/filter.h>
+#include <linux/refcount.h>
 
 enum landlock_hook_type {
 	LANDLOCK_HOOK_PTRACE = 1,
 };
 
+#define _LANDLOCK_HOOK_LAST LANDLOCK_HOOK_PTRACE
+
+struct landlock_prog_list {
+	struct landlock_prog_list *prev;
+	struct bpf_prog *prog;
+	refcount_t usage;
+};
+
+/**
+ * struct landlock_domain - Landlock programs enforced on a set of tasks
+ *
+ * When prepending a new program, if &struct landlock_domain is shared with
+ * other tasks, then duplicate it and prepend the program to this new &struct
+ * landlock_domain.
+ *
+ * @usage: reference count to manage the object lifetime. When a task needs to
+ *	   add Landlock programs and if @usage is greater than 1, then the
+ *	   task must duplicate &struct landlock_domain to not change the
+ *	   children's programs as well.
+ * @programs: array of non-NULL &struct landlock_prog_list pointers
+ */
+struct landlock_domain {
+	struct landlock_prog_list *programs[_LANDLOCK_HOOK_LAST];
+	refcount_t usage;
+};
+
+/**
+ * get_hook_index - get an index for the programs of struct landlock_prog_set
+ *
+ * @type: a Landlock hook type
+ */
+static inline size_t get_hook_index(enum landlock_hook_type type)
+{
+	/* type ID > 0 for loaded programs */
+	return type - 1;
+}
+
 static inline enum landlock_hook_type get_hook_type(const struct bpf_prog *prog)
 {
 	switch (prog->expected_attach_type) {
diff --git a/security/landlock/domain_manage.c b/security/landlock/domain_manage.c
new file mode 100644
index 000000000000..b65960611d09
--- /dev/null
+++ b/security/landlock/domain_manage.c
@@ -0,0 +1,152 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Landlock LSM - domain management
+ *
+ * Copyright © 2016-2019 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2018-2019 ANSSI
+ */
+
+#include <linux/err.h>
+#include <linux/errno.h>
+#include <linux/refcount.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+
+#include "common.h"
+#include "domain_manage.h"
+
+void landlock_get_domain(struct landlock_domain *domain)
+{
+	if (!domain)
+		return;
+	refcount_inc(&domain->usage);
+}
+
+static void put_prog_list(struct landlock_prog_list *prog_list)
+{
+	struct landlock_prog_list *orig = prog_list;
+
+	/* clean up single-reference branches iteratively */
+	while (orig && refcount_dec_and_test(&orig->usage)) {
+		struct landlock_prog_list *freeme = orig;
+
+		if (orig->prog)
+			bpf_prog_put(orig->prog);
+		orig = orig->prev;
+		kfree(freeme);
+	}
+}
+
+void landlock_put_domain(struct landlock_domain *domain)
+{
+	if (domain && refcount_dec_and_test(&domain->usage)) {
+		size_t i;
+
+		for (i = 0; i < ARRAY_SIZE(domain->programs); i++)
+			put_prog_list(domain->programs[i]);
+		kfree(domain);
+	}
+}
+
+static struct landlock_prog_list *create_prog_list(struct bpf_prog *prog)
+{
+	struct landlock_prog_list *new_list;
+
+	if (WARN_ON(IS_ERR_OR_NULL(prog)))
+		return ERR_PTR(-EFAULT);
+	if (prog->type != BPF_PROG_TYPE_LANDLOCK_HOOK)
+		return ERR_PTR(-EINVAL);
+	prog = bpf_prog_inc(prog);
+	if (IS_ERR(prog))
+		return ERR_CAST(prog);
+	new_list = kzalloc(sizeof(*new_list), GFP_KERNEL);
+	if (!new_list) {
+		bpf_prog_put(prog);
+		return ERR_PTR(-ENOMEM);
+	}
+	new_list->prog = prog;
+	refcount_set(&new_list->usage, 1);
+	return new_list;
+}
+
+static struct landlock_domain *create_domain(struct bpf_prog *prog)
+{
+	struct landlock_domain *new_dom;
+	struct landlock_prog_list *new_list;
+	size_t hook;
+
+	/* programs[] filled with NULL values */
+	new_dom = kzalloc(sizeof(*new_dom), GFP_KERNEL);
+	if (!new_dom)
+		return ERR_PTR(-ENOMEM);
+	refcount_set(&new_dom->usage, 1);
+	new_list = create_prog_list(prog);
+	if (IS_ERR(new_list)) {
+		kfree(new_dom);
+		return ERR_CAST(new_list);
+	}
+	hook = get_hook_index(get_hook_type(prog));
+	new_dom->programs[hook] = new_list;
+	return new_dom;
+}
+
+/**
+ * landlock_prepend_prog - extend a Landlock domain with an eBPF program
+ *
+ * Prepend @prog to @domain if @prog is not already in @domain.
+ *
+ * @domain: domain to copy and extend with @prog. This domain must not be
+ *          modified by another function than this one to guarantee domain
+ *          immutability.
+ * @prog: non-NULL Landlock program to prepend to a copy of @domain.  @prog
+ *        will be owned by landlock_prepend_prog(). You can then call
+ *        bpf_prog_put(@prog) after.
+ *
+ * Return a copy of @domain (with @prog prepended) when OK. Return a pointer
+ * error otherwise.
+ */
+struct landlock_domain *landlock_prepend_prog(struct landlock_domain *domain,
+		struct bpf_prog *prog)
+{
+	struct landlock_prog_list *walker;
+	struct landlock_domain *new_dom;
+	size_t i, hook;
+
+	if (WARN_ON(!prog))
+		return ERR_PTR(-EFAULT);
+	if (prog->type != BPF_PROG_TYPE_LANDLOCK_HOOK)
+		return ERR_PTR(-EINVAL);
+
+	if (!domain)
+		return create_domain(prog);
+
+	hook = get_hook_index(get_hook_type(prog));
+	/* check for similar program */
+	for (walker = domain->programs[hook]; walker;
+			walker = walker->prev) {
+		/* don't allow duplicate programs */
+		if (prog == walker->prog)
+			return ERR_PTR(-EEXIST);
+	}
+
+	new_dom = create_domain(prog);
+	if (IS_ERR(new_dom))
+		return new_dom;
+
+	/* copy @domain (which is guarantee to be immutable) */
+	for (i = 0; i < ARRAY_SIZE(new_dom->programs); i++) {
+		struct landlock_prog_list *current_list;
+		struct landlock_prog_list **new_list;
+
+		current_list = domain->programs[i];
+		if (!current_list)
+			continue;
+		refcount_inc(&current_list->usage);
+		new_list = &new_dom->programs[i];
+		if (*new_list)
+			new_list = &(*new_list)->prev;
+		/* do not increment usage */
+		*new_list = current_list;
+	}
+	return new_dom;
+}
diff --git a/security/landlock/domain_manage.h b/security/landlock/domain_manage.h
new file mode 100644
index 000000000000..d32c65c941c0
--- /dev/null
+++ b/security/landlock/domain_manage.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Landlock LSM - domain management headers
+ *
+ * Copyright © 2016-2019 Mickaël Salaün <mic@digikod.net>
+ * Copyright © 2018-2019 ANSSI
+ */
+
+#ifndef _SECURITY_LANDLOCK_DOMAIN_MANAGE_H
+#define _SECURITY_LANDLOCK_DOMAIN_MANAGE_H
+
+#include <linux/filter.h>
+
+#include "common.h"
+
+void landlock_get_domain(struct landlock_domain *domain);
+void landlock_put_domain(struct landlock_domain *domain);
+
+struct landlock_domain *landlock_prepend_prog(struct landlock_domain *domain,
+		struct bpf_prog *prog);
+
+#endif /* _SECURITY_LANDLOCK_DOMAIN_MANAGE_H */
-- 
2.23.0


^ permalink raw reply related

* [PATCH bpf-next v13 5/7] bpf,landlock: Add task_landlock_ptrace_ancestor() helper
From: Mickaël Salaün @ 2019-11-04 17:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mickaël Salaün, Alexei Starovoitov, Andy Lutomirski,
	Casey Schaufler, Daniel Borkmann, David Drysdale, Florent Revest,
	James Morris, Jann Horn, John Johansen, Jonathan Corbet,
	Kees Cook, KP Singh, Michael Kerrisk, Mickaël Salaün,
	Paul Moore, Sargun Dhillon, Serge E . Hallyn, Shuah Khan,
	Stephen Smalley, Tejun Heo, Tetsuo Handa, Tycho Andersen,
	Will Drewry, bpf, kernel-hardening, linux-api,
	linux-security-module, netdev
In-Reply-To: <20191104172146.30797-1-mic@digikod.net>

This new task_landlock_ptrace_ancestor() helper can be used to identify
if the Landlock domain tied to the current tracer is in the same
hierarchy as the domain of tracee.  This may be a way for user space to
programmatically defines that a set of tasks is less privileged than
another set of tasks.

Indeed, ptrace(2) can be used to impersonate an unsandboxed process and
lead to a privilege escalation.  A common use-case when sandboxing a
process is then to forbid it to debug a less-privileged process.  A
sandbox process (tracer) should only be allowed to trace another process
(tracee) if the tracee has fewer privileges than the tracer.  This
policy can be implemented with this helper.

More complex helpers could be added in the future to enable other ways
to check the relation between the tracer and the tracee (e.g. check
other program types or other hierarchies) if there is a use case.

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: James Morris <jmorris@namei.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Serge E. Hallyn <serge@hallyn.com>
Cc: Will Drewry <wad@chromium.org>
---

Changes since v10:
* new patch taking inspiration from the previous static ptrace policy
---
 include/linux/bpf.h            |  2 +
 include/uapi/linux/bpf.h       | 21 ++++++++++-
 kernel/bpf/verifier.c          |  4 ++
 security/landlock/bpf_ptrace.c | 68 ++++++++++++++++++++++++++++++++++
 security/landlock/bpf_verify.c |  4 ++
 5 files changed, 98 insertions(+), 1 deletion(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 819a3e207438..67ec198a90cb 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -214,6 +214,7 @@ enum bpf_arg_type {
 	ARG_PTR_TO_LONG,	/* pointer to long */
 	ARG_PTR_TO_SOCKET,	/* pointer to bpf_sock (fullsock) */
 	ARG_PTR_TO_BTF_ID,	/* pointer to in-kernel struct */
+	ARG_PTR_TO_TASK,	/* pointer to task_struct */
 };
 
 /* type of values returned from helper functions */
@@ -1088,6 +1089,7 @@ extern const struct bpf_func_proto bpf_get_local_storage_proto;
 extern const struct bpf_func_proto bpf_strtol_proto;
 extern const struct bpf_func_proto bpf_strtoul_proto;
 extern const struct bpf_func_proto bpf_tcp_sock_proto;
+extern const struct bpf_func_proto bpf_task_landlock_ptrace_ancestor_proto;
 
 /* Shared helpers among cBPF and eBPF. */
 void bpf_user_rnd_init_once(void);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 6e4147790f96..c88436b97163 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2777,6 +2777,24 @@ union bpf_attr {
  * 		restricted to raw_tracepoint bpf programs.
  * 	Return
  * 		0 on success, or a negative error in case of failure.
+ *
+ * int bpf_task_landlock_ptrace_ancestor(struct task_struct *parent, struct task_struct *child)
+ *	Description
+ *		Check the relation of a potentially parent task with a child
+ *		one, according to their Landlock ptrace hook programs.
+ *	Return
+ *		**-EINVAL** if the child's ptrace programs are not comparable
+ *		to the parent ones, i.e. one of them is an empty set.
+ *
+ *		**-ENOENT** if the parent's ptrace programs are either in a
+ *		separate hierarchy of the child ones, or if the parent's ptrace
+ *		programs are a superset of the child ones.
+ *
+ *		0 if the parent's ptrace programs are the same as the child
+ *		ones.
+ *
+ *		1 if the parent's ptrace programs are indeed a subset of the
+ *		child ones.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -2890,7 +2908,8 @@ union bpf_attr {
 	FN(sk_storage_delete),		\
 	FN(send_signal),		\
 	FN(tcp_gen_syncookie),		\
-	FN(skb_output),
+	FN(skb_output),			\
+	FN(task_landlock_ptrace_ancestor),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index ebf1991906b7..af8f1a777a2d 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3492,6 +3492,10 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 regno,
 		    type != PTR_TO_MAP_VALUE &&
 		    type != expected_type)
 			goto err_type;
+	} else if (arg_type == ARG_PTR_TO_TASK) {
+		expected_type = PTR_TO_TASK;
+		if (type != expected_type)
+			goto err_type;
 	} else {
 		verbose(env, "unsupported arg_type %d\n", arg_type);
 		return -EFAULT;
diff --git a/security/landlock/bpf_ptrace.c b/security/landlock/bpf_ptrace.c
index 2ec73078ad01..0e1362951463 100644
--- a/security/landlock/bpf_ptrace.c
+++ b/security/landlock/bpf_ptrace.c
@@ -7,9 +7,13 @@
  */
 
 #include <linux/bpf.h>
+#include <linux/cred.h>
+#include <linux/kernel.h>
+#include <linux/rcupdate.h>
 #include <uapi/linux/landlock.h>
 
 #include "bpf_ptrace.h"
+#include "common.h"
 
 bool landlock_is_valid_access_ptrace(int off, enum bpf_access_type type,
 		enum bpf_reg_type *reg_type, int *max_size)
@@ -28,3 +32,67 @@ bool landlock_is_valid_access_ptrace(int off, enum bpf_access_type type,
 		return false;
 	}
 }
+
+/**
+ * domain_ptrace_ancestor - check domain ordering according to ptrace
+ *
+ * @parent: a parent domain
+ * @child: a potential child of @parent
+ *
+ * Check if the @parent domain is less or equal to (i.e. a subset of) the
+ * @child domain.
+ */
+static int domain_ptrace_ancestor(const struct landlock_domain *parent,
+		const struct landlock_domain *child)
+{
+	const struct landlock_prog_list *child_progs, *parent_progs;
+	const size_t hook = get_hook_index(LANDLOCK_HOOK_PTRACE);
+
+	if (!parent || !child)
+		/* @parent or @child has no ptrace restriction */
+		return -EINVAL;
+	parent_progs = parent->programs[hook];
+	child_progs = child->programs[hook];
+	if (!parent_progs || !child_progs)
+		/* @parent or @child has no ptrace restriction */
+		return -EINVAL;
+	if (child_progs == parent_progs)
+		/* @parent is at the same level as @child */
+		return 0;
+	for (child_progs = child_progs->prev; child_progs;
+			child_progs = child_progs->prev) {
+		if (child_progs == parent_progs)
+			/* @parent is one of the ancestors of @child */
+			return 1;
+	}
+	/*
+	 * Either there is no relationship between @parent and @child, or
+	 * @child is one of the ancestors of @parent.
+	 */
+	return -ENOENT;
+}
+
+/*
+ * Cf. include/uapi/linux/bpf.h - bpf_task_landlock_ptrace_ancestor
+ */
+BPF_CALL_2(bpf_task_landlock_ptrace_ancestor, const struct task_struct *,
+		parent, const struct task_struct *, child)
+{
+	const struct landlock_domain *dom_parent, *dom_child;
+
+	WARN_ON_ONCE(!rcu_read_lock_held());
+	if (WARN_ON(!parent || !child))
+		return -EFAULT;
+	dom_parent = landlock_cred(__task_cred(parent))->domain;
+	dom_child = landlock_cred(__task_cred(child))->domain;
+	return domain_ptrace_ancestor(dom_parent, dom_child);
+}
+
+const struct bpf_func_proto bpf_task_landlock_ptrace_ancestor_proto = {
+	.func		= bpf_task_landlock_ptrace_ancestor,
+	.gpl_only	= false,
+	.pkt_access	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_TASK,
+	.arg2_type	= ARG_PTR_TO_TASK,
+};
diff --git a/security/landlock/bpf_verify.c b/security/landlock/bpf_verify.c
index 6ed921588178..a1d2db75d51d 100644
--- a/security/landlock/bpf_verify.c
+++ b/security/landlock/bpf_verify.c
@@ -70,6 +70,10 @@ static const struct bpf_func_proto *bpf_landlock_func_proto(
 		return &bpf_map_update_elem_proto;
 	case BPF_FUNC_map_delete_elem:
 		return &bpf_map_delete_elem_proto;
+	case BPF_FUNC_task_landlock_ptrace_ancestor:
+		if (get_hook_type(prog) == LANDLOCK_HOOK_PTRACE)
+			return &bpf_task_landlock_ptrace_ancestor_proto;
+		return NULL;
 	default:
 		return NULL;
 	}
-- 
2.23.0


^ permalink raw reply related

* [PATCH bpf-next v13 0/7] Landlock LSM
From: Mickaël Salaün @ 2019-11-04 17:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Mickaël Salaün, Alexei Starovoitov, Andy Lutomirski,
	Casey Schaufler, Daniel Borkmann, David Drysdale, Florent Revest,
	James Morris, Jann Horn, John Johansen, Jonathan Corbet,
	Kees Cook, KP Singh, Michael Kerrisk, Mickaël Salaün,
	Paul Moore, Sargun Dhillon, Serge E . Hallyn, Shuah Khan,
	Stephen Smalley, Tejun Heo, Tetsuo Handa, Tycho Andersen,
	Will Drewry, bpf, kernel-hardening, linux-api,
	linux-security-module, netdev

Hi,

Following the previous series [1], this thirteenth series mainly clean
up and add safeguards to the domain management.  This series also
improves the documentation, add comments, and fixes some latent bugs.

This is the first step of the roadmap discussed at LPC [2].  While the
intended final goal is to allow unprivileged (or non-root) users to use
Landlock, this series allows only a process with global CAP_SYS_ADMIN to
load and enforce a rule.  This may help to get feedback and avoid
unexpected behaviors.

This series can be applied on top of bpf-next, commit e93d99180abd
("selftests/bpf: Restore $(OUTPUT)/test_stub.o rule").  This can be
tested with CONFIG_BPF_SYSCALL, CONFIG_SECCOMP_FILTER and
CONFIG_SECURITY_LANDLOCK.  This patch series can be found in a Git
repository here:
https://github.com/landlock-lsm/linux/commits/landlock-v13
I would really appreciate constructive comments on the design and the
code.


# Landlock LSM

Landlock is a stackable LSM [3] intended to be used as a low-level
framework to build custom access-control/audit systems or safe endpoint
security agents.  There is currently one Landlock hook dedicated to
check ptrace(2).  This hook accepts a dedicated eBPF program, called a
Landlock program, which can currently compare its position in the
hierarchy of similar programs tied to other processes.  This enables to
enforce programmatic scoped ptrace restrictions.

The final goal of this new Linux Security Module (LSM) called Landlock
is to allow any process, including unprivileged ones, to create powerful
security sandboxes comparable to XNU Sandbox, FreeBSD Capsicum or
OpenBSD Pledge (which could be implemented with Landlock).  This kind of
sandbox is expected to help mitigate the security impact of bugs or
unexpected/malicious behaviors in user-space applications.

The use of seccomp and Landlock is more suitable with the help of a
user-space library (e.g.  libseccomp) that could help to specify a
high-level language to express a security policy instead of raw eBPF
programs.  Moreover, thanks to the LLVM front-end, it is quite easy to
write an eBPF program with a subset of the C language.

The documentation patch contains some kernel documentation, explanations
on how to use Landlock and a FAQ.  The compiled documentation and some
talks can be found here: https://landlock.io


# Frequently asked questions

## Why is seccomp-bpf not enough?

A seccomp filter can access only raw syscall arguments (i.e. the
register values) which means that it is not possible to filter according
to the value pointed to by an argument, such as a file pathname. As an
embryonic Landlock version demonstrated (i.e. seccomp-object), filtering
at the syscall level is complicated (e.g. need to take care of race
conditions). This is mainly because the access control checkpoints of
the kernel are not at this high-level but more underneath, at the
LSM-hook level. The LSM hooks are designed to handle this kind of
checks.  Landlock abstracts this approach to leverage the ability of
unprivileged users to limit themselves.

Cf. section "What it isn't?" in
Documentation/userspace-api/seccomp_filter.rst


## Why use the seccomp(2) syscall?

Landlock use the same semantic as seccomp to apply access rule
restrictions. It add a new layer of security for the current process
which is inherited by its children. It makes sense to use an unique
access-restricting syscall (that should be allowed by seccomp filters)
which can only drop privileges. Moreover, a Landlock rule could come
from outside a process (e.g.  passed through a UNIX socket). It is then
useful to differentiate the creation/load of Landlock eBPF programs via
bpf(2), from rule enforcement via seccomp(2).


## Why a new LSM? Are SELinux, AppArmor, Smack and Tomoyo not good
   enough?

The current access control LSMs are fine for their purpose which is to
give the *root* the ability to enforce a security policy for the
*system*. What is missing is a way to enforce a security policy for any
application by its developer and *unprivileged user* as seccomp can do
for raw syscall filtering.

Differences from other (access control) LSMs:
* not only dedicated to administrators (i.e. no_new_priv);
* limited kernel attack surface (e.g. policy parsing);
* constrained policy rules (no DoS: deterministic execution time);
* do not leak more information than the loader process can legitimately
  have access to (minimize metadata inference).


# Changes since v12

* make landlock_prepend_prog() more consistant by putting its caller in
  charge of the lifetime of the Landlock domain being updated
* always copy a domain being updated, which garantee its immutability,
  and behave as a copy-on-write thanks to commit_creds()
* fix preemption
* improve documentation and comments


# Changes since v11

* rework domain management
* minor fixes


# Changes since v10

* remove all the file system related features: program types, inode
  map and expected_attach_triggers
* replace the static ptrace security policy with a new and simpler
  ptrace program (attached) type and a task_landlock_ptrace_ancestor()
  eBPF helper
* do not rely on seccomp internal structure but use stacked credentials
  insdead
* extend ptrace tests
* add more documentation
* split and rename files/patches
* miscellaneous fixes


Previous changes can be found in a previous cover-letter [4].


[1] https://lore.kernel.org/lkml/20191031164445.29426-1-mic@digikod.net/
[2] https://lore.kernel.org/lkml/5828776A.1010104@digikod.net/
[3] https://lore.kernel.org/lkml/50db058a-7dde-441b-a7f9-f6837fe8b69f@schaufler-ca.com/
[4] https://lore.kernel.org/lkml/20190721213116.23476-1-mic@digikod.net/

Regards,

Mickaël Salaün (7):
  bpf,landlock: Define an eBPF program type for Landlock hooks
  landlock: Add the management of domains
  landlock,seccomp: Apply Landlock programs to process hierarchy
  landlock: Add ptrace LSM hooks
  bpf,landlock: Add task_landlock_ptrace_ancestor() helper
  bpf,landlock: Add tests for the Landlock ptrace program type
  landlock: Add user and kernel documentation for Landlock

 Documentation/security/index.rst              |   1 +
 Documentation/security/landlock/index.rst     |  22 ++
 Documentation/security/landlock/kernel.rst    | 166 ++++++++++
 Documentation/security/landlock/user.rst      | 153 ++++++++++
 MAINTAINERS                                   |   9 +
 include/linux/bpf.h                           |   3 +
 include/linux/bpf_types.h                     |   3 +
 include/linux/landlock.h                      |  25 ++
 include/linux/lsm_hooks.h                     |   1 +
 include/uapi/linux/bpf.h                      |  23 +-
 include/uapi/linux/landlock.h                 |  39 +++
 include/uapi/linux/seccomp.h                  |   1 +
 kernel/bpf/syscall.c                          |   9 +
 kernel/bpf/verifier.c                         |  11 +
 kernel/seccomp.c                              |   4 +
 scripts/bpf_helpers_doc.py                    |   1 +
 security/Kconfig                              |   1 +
 security/Makefile                             |   2 +
 security/landlock/Kconfig                     |  19 ++
 security/landlock/Makefile                    |   6 +
 security/landlock/bpf_ptrace.c                |  98 ++++++
 security/landlock/bpf_ptrace.h                |  17 ++
 security/landlock/bpf_run.c                   |  65 ++++
 security/landlock/bpf_run.h                   |  25 ++
 security/landlock/bpf_verify.c                |  87 ++++++
 security/landlock/common.h                    |  84 +++++
 security/landlock/domain_manage.c             | 152 +++++++++
 security/landlock/domain_manage.h             |  22 ++
 security/landlock/domain_syscall.c            |  93 ++++++
 security/landlock/hooks_cred.c                |  47 +++
 security/landlock/hooks_cred.h                |  14 +
 security/landlock/hooks_ptrace.c              | 114 +++++++
 security/landlock/hooks_ptrace.h              |  19 ++
 security/landlock/init.c                      |  32 ++
 security/security.c                           |  15 +
 tools/include/uapi/linux/bpf.h                |  23 +-
 tools/include/uapi/linux/landlock.h           |  22 ++
 tools/lib/bpf/libbpf_probes.c                 |   3 +
 tools/testing/selftests/bpf/config            |   3 +
 tools/testing/selftests/bpf/test_verifier.c   |   1 +
 .../testing/selftests/bpf/verifier/landlock.c |  56 ++++
 tools/testing/selftests/landlock/.gitignore   |   5 +
 tools/testing/selftests/landlock/Makefile     |  27 ++
 tools/testing/selftests/landlock/config       |   5 +
 tools/testing/selftests/landlock/test.h       |  48 +++
 tools/testing/selftests/landlock/test_base.c  |  24 ++
 .../testing/selftests/landlock/test_ptrace.c  | 289 ++++++++++++++++++
 47 files changed, 1887 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/security/landlock/index.rst
 create mode 100644 Documentation/security/landlock/kernel.rst
 create mode 100644 Documentation/security/landlock/user.rst
 create mode 100644 include/linux/landlock.h
 create mode 100644 include/uapi/linux/landlock.h
 create mode 100644 security/landlock/Kconfig
 create mode 100644 security/landlock/Makefile
 create mode 100644 security/landlock/bpf_ptrace.c
 create mode 100644 security/landlock/bpf_ptrace.h
 create mode 100644 security/landlock/bpf_run.c
 create mode 100644 security/landlock/bpf_run.h
 create mode 100644 security/landlock/bpf_verify.c
 create mode 100644 security/landlock/common.h
 create mode 100644 security/landlock/domain_manage.c
 create mode 100644 security/landlock/domain_manage.h
 create mode 100644 security/landlock/domain_syscall.c
 create mode 100644 security/landlock/hooks_cred.c
 create mode 100644 security/landlock/hooks_cred.h
 create mode 100644 security/landlock/hooks_ptrace.c
 create mode 100644 security/landlock/hooks_ptrace.h
 create mode 100644 security/landlock/init.c
 create mode 100644 tools/include/uapi/linux/landlock.h
 create mode 100644 tools/testing/selftests/bpf/verifier/landlock.c
 create mode 100644 tools/testing/selftests/landlock/.gitignore
 create mode 100644 tools/testing/selftests/landlock/Makefile
 create mode 100644 tools/testing/selftests/landlock/config
 create mode 100644 tools/testing/selftests/landlock/test.h
 create mode 100644 tools/testing/selftests/landlock/test_base.c
 create mode 100644 tools/testing/selftests/landlock/test_ptrace.c

-- 
2.23.0


^ permalink raw reply

* Business Proposal - Please Reply
From: Yuval @ 2019-11-04 18:17 UTC (permalink / raw)
  To: linux-security-module

Hello

My name is Yuval Rose. I have an urgent lucrative business 
opportunity for you. I got your details on the internet when I 
was searching for a reliable person that can handle this deal and 
I believe you can handle it because you bears the same Last-Name 
with the dead investor of the business am about to introduce to 
you. Waiting for your speedy reply for further and complete 
details.

Send reply to: yuval@free2312.com


Best Regards
Yuval

^ permalink raw reply

* Re: [Patch v3 6/7] doc: keys: Document usage of TEE based Trusted Keys
From: Jarkko Sakkinen @ 2019-11-04 20:55 UTC (permalink / raw)
  To: Sumit Garg
  Cc: Jens Wiklander, dhowells, Jonathan Corbet, jejb, Mimi Zohar,
	James Morris, Serge E. Hallyn, Casey Schaufler, Ard Biesheuvel,
	Daniel Thompson, Stuart Yoder, Janne Karhunen,
	open list:ASYMMETRIC KEYS, linux-integrity, linux-security-module,
	Linux Doc Mailing List, Linux Kernel Mailing List,
	linux-arm-kernel, tee-dev @ lists . linaro . org
In-Reply-To: <CAFA6WYNwSSaZv5OM=q+LCyn0mEdpg7K+W_v2_NBHhtktg1BFXw@mail.gmail.com>

On Mon, Nov 04, 2019 at 12:28:47PM +0530, Sumit Garg wrote:
> On Sat, 2 Nov 2019 at 01:50, Jarkko Sakkinen
> <jarkko.sakkinen@linux.intel.com> wrote:
> >
> > On Fri, Nov 01, 2019 at 03:04:18PM +0530, Sumit Garg wrote:
> >
> > > Isn't this statement contradicting with your earlier statement
> > > regarding the right order would be to complete TEE patches review
> > > first and then come up with documentation here [2]?
> > >
> > > [1] https://lore.kernel.org/linux-integrity/1568025601.4614.253.camel@linux.ibm.com/
> > > [2] https://lore.kernel.org/linux-integrity/20190909163643.qxmzpcggi567hmhv@linux.intel.com/
> >
> > With the intersecting issues, namely key generation and conflicting
> > keyctl parameters, that was not a well considered statement.
> 
> Okay, let me work on documentation first, but I think resending whole
> patch-set just for documentation review and rework would be an
> overkill. Would minor revisions of this patch only like v3.1, v3.2
> etc. work for you? And later I could send next version of this
> patch-set once we agree on documentation.

Yeah, we could iterate through the documentation patch and once
we are happy with it you can bundle it to your main patch set.

/Jarkko

^ permalink raw reply

* Re: [PATCH v14 4/5] overlayfs: internal getxattr operations without sepolicy checking
From: Mark Salyzyn @ 2019-11-04 21:47 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: linux-kernel, kernel-team, Miklos Szeredi, Jonathan Corbet,
	Vivek Goyal, Eric W . Biederman, Randy Dunlap, Stephen Smalley,
	overlayfs, linux-doc, LSM List
In-Reply-To: <CAOQ4uxgWOmV_x5gRZ9tR+u86GE6JoXn-MSxKkvi87e9owMApZw@mail.gmail.com>

On 10/22/19 11:39 PM, Amir Goldstein wrote:
> On Tue, Oct 22, 2019 at 11:46 PM Mark Salyzyn <salyzyn@android.com> wrote:
>> Check impure, opaque, origin & meta xattr with no sepolicy audit
>> (using __vfs_getxattr) since these operations are internal to
>> overlayfs operations and do not disclose any data.  This became
>> an issue for credential override off since sys_admin would have
>> been required by the caller; whereas would have been inherently
>> present for the creator since it performed the mount.
>>
>> This is a change in operations since we do not check in the new
>> ovl_do_vfs_getxattr function if the credential override is off or
>> not.  Reasoning is that the sepolicy check is unnecessary overhead,
>> especially since the check can be expensive.
>>
>> Because for override credentials off, this affects _everyone_ that
>> underneath performs private xattr calls without the appropriate
>> sepolicy permissions and sys_admin capability.  Providing blanket
>> support for sys_admin would be bad for all possible callers.
>>
>> For the override credentials on, this will affect only the mounter,
>> should it lack sepolicy permissions. Not considered a security
>> problem since mounting by definition has sys_admin capabilities,
>> but sepolicy contexts would still need to be crafted.
>>
> It sounds reasonable to me, but I am not a "security person".
>
>> It should be noted that there is precedence, __vfs_getxattr is used
>> in other filesystems for their own internal trusted xattr management.
>>
> Urgh! "other" filesystems meaning ecryptfs_getxattr()?
> That looks like a loop hole to read any trusted xattr without any
> security checks. Not sure its a good example...

Yes. But it also makes sense since ecryptfs_getxattr is performed inside 
a layer where the security check is done above by the filesystem that 
called it (AFAIK)? This is used by the filesystem, or the security 
layers to pull the actual sepolicy, rather than getting an EPERM and no 
data. The xattr 'data' is needed by the internal layers.

-- Mark


^ permalink raw reply

* [PATCH v15 3/4] overlayfs: internal getxattr operations without sepolicy checking
From: Mark Salyzyn @ 2019-11-04 21:52 UTC (permalink / raw)
  To: linux-kernel
  Cc: kernel-team, Mark Salyzyn, Miklos Szeredi, Jonathan Corbet,
	Vivek Goyal, Eric W . Biederman, Amir Goldstein, Randy Dunlap,
	Stephen Smalley, linux-unionfs, linux-doc, linux-security-module
In-Reply-To: <20191104215253.141818-1-salyzyn@android.com>

Check impure, opaque, origin & meta xattr with no sepolicy audit
(using __vfs_getxattr) since these operations are internal to
overlayfs operations and do not disclose any data.  This became
an issue for credential override off since sys_admin would have
been required by the caller; whereas would have been inherently
present for the creator since it performed the mount.

This is a change in operations since we do not check in the new
ovl_do_vfs_getxattr function if the credential override is off or
not.  Reasoning is that the sepolicy check is unnecessary overhead,
especially since the check can be expensive.

Because for override credentials off, this affects _everyone_ that
underneath performs private xattr calls without the appropriate
sepolicy permissions and sys_admin capability.  Providing blanket
support for sys_admin would be bad for all possible callers.

For the override credentials on, this will affect only the mounter,
should it lack sepolicy permissions. Not considered a security
problem since mounting by definition has sys_admin capabilities,
but sepolicy contexts would still need to be crafted.

It should be noted that there is precedence, __vfs_getxattr is used
in other filesystems for their own internal trusted xattr management.

Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: linux-unionfs@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: kernel-team@android.com
Cc: linux-security-module@vger.kernel.org

v15 - revert to v13 as xattr_gs_args was rejected.
    - move ovl_do_wrapper from util.c to inline in overlayfs.h

v14 - rebase to use xattr_gs_args.

v13 - rebase to use __vfs_getxattr flags option

v12 - rebase

v11 - switch name to ovl_do_vfs_getxattr, fortify comment

v10 - added to patch series
---
 fs/overlayfs/namei.c     | 12 +++++++-----
 fs/overlayfs/overlayfs.h |  8 ++++++++
 fs/overlayfs/util.c      | 18 +++++++++---------
 3 files changed, 24 insertions(+), 14 deletions(-)

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index e9717c2f7d45..f5aba0a0767b 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -106,10 +106,11 @@ int ovl_check_fh_len(struct ovl_fh *fh, int fh_len)
 
 static struct ovl_fh *ovl_get_fh(struct dentry *dentry, const char *name)
 {
-	int res, err;
+	ssize_t res;
+	int err;
 	struct ovl_fh *fh = NULL;
 
-	res = vfs_getxattr(dentry, name, NULL, 0);
+	res = ovl_do_vfs_getxattr(dentry, name, NULL, 0);
 	if (res < 0) {
 		if (res == -ENODATA || res == -EOPNOTSUPP)
 			return NULL;
@@ -123,7 +124,7 @@ static struct ovl_fh *ovl_get_fh(struct dentry *dentry, const char *name)
 	if (!fh)
 		return ERR_PTR(-ENOMEM);
 
-	res = vfs_getxattr(dentry, name, fh, res);
+	res = ovl_do_vfs_getxattr(dentry, name, fh, res);
 	if (res < 0)
 		goto fail;
 
@@ -141,10 +142,11 @@ static struct ovl_fh *ovl_get_fh(struct dentry *dentry, const char *name)
 	return NULL;
 
 fail:
-	pr_warn_ratelimited("overlayfs: failed to get origin (%i)\n", res);
+	pr_warn_ratelimited("overlayfs: failed to get origin (%zi)\n", res);
 	goto out;
 invalid:
-	pr_warn_ratelimited("overlayfs: invalid origin (%*phN)\n", res, fh);
+	pr_warn_ratelimited("overlayfs: invalid origin (%*phN)\n",
+			    (int)res, fh);
 	goto out;
 }
 
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index ab3d031c422b..55b872c28bf9 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -200,6 +200,14 @@ static inline bool ovl_open_flags_need_copy_up(int flags)
 	return ((OPEN_FMODE(flags) & FMODE_WRITE) || (flags & O_TRUNC));
 }
 
+static inline ssize_t ovl_do_vfs_getxattr(struct dentry *dentry,
+					  const char *name, void *buf,
+					  size_t size)
+{
+	return __vfs_getxattr(dentry, d_inode(dentry), name, buf, size,
+			      XATTR_NOSECURITY);
+}
+
 /* util.c */
 int ovl_want_write(struct dentry *dentry);
 void ovl_drop_write(struct dentry *dentry);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index f5678a3f8350..2050c5084a82 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -537,9 +537,9 @@ void ovl_copy_up_end(struct dentry *dentry)
 
 bool ovl_check_origin_xattr(struct dentry *dentry)
 {
-	int res;
+	ssize_t res;
 
-	res = vfs_getxattr(dentry, OVL_XATTR_ORIGIN, NULL, 0);
+	res = ovl_do_vfs_getxattr(dentry, OVL_XATTR_ORIGIN, NULL, 0);
 
 	/* Zero size value means "copied up but origin unknown" */
 	if (res >= 0)
@@ -550,13 +550,13 @@ bool ovl_check_origin_xattr(struct dentry *dentry)
 
 bool ovl_check_dir_xattr(struct dentry *dentry, const char *name)
 {
-	int res;
+	ssize_t res;
 	char val;
 
 	if (!d_is_dir(dentry))
 		return false;
 
-	res = vfs_getxattr(dentry, name, &val, 1);
+	res = ovl_do_vfs_getxattr(dentry, name, &val, 1);
 	if (res == 1 && val == 'y')
 		return true;
 
@@ -837,13 +837,13 @@ int ovl_lock_rename_workdir(struct dentry *workdir, struct dentry *upperdir)
 /* err < 0, 0 if no metacopy xattr, 1 if metacopy xattr found */
 int ovl_check_metacopy_xattr(struct dentry *dentry)
 {
-	int res;
+	ssize_t res;
 
 	/* Only regular files can have metacopy xattr */
 	if (!S_ISREG(d_inode(dentry)->i_mode))
 		return 0;
 
-	res = vfs_getxattr(dentry, OVL_XATTR_METACOPY, NULL, 0);
+	res = ovl_do_vfs_getxattr(dentry, OVL_XATTR_METACOPY, NULL, 0);
 	if (res < 0) {
 		if (res == -ENODATA || res == -EOPNOTSUPP)
 			return 0;
@@ -852,7 +852,7 @@ int ovl_check_metacopy_xattr(struct dentry *dentry)
 
 	return 1;
 out:
-	pr_warn_ratelimited("overlayfs: failed to get metacopy (%i)\n", res);
+	pr_warn_ratelimited("overlayfs: failed to get metacopy (%zi)\n", res);
 	return res;
 }
 
@@ -878,7 +878,7 @@ ssize_t ovl_getxattr(struct dentry *dentry, char *name, char **value,
 	ssize_t res;
 	char *buf = NULL;
 
-	res = vfs_getxattr(dentry, name, NULL, 0);
+	res = ovl_do_vfs_getxattr(dentry, name, NULL, 0);
 	if (res < 0) {
 		if (res == -ENODATA || res == -EOPNOTSUPP)
 			return -ENODATA;
@@ -890,7 +890,7 @@ ssize_t ovl_getxattr(struct dentry *dentry, char *name, char **value,
 		if (!buf)
 			return -ENOMEM;
 
-		res = vfs_getxattr(dentry, name, buf, res);
+		res = ovl_do_vfs_getxattr(dentry, name, buf, res);
 		if (res < 0)
 			goto fail;
 	}
-- 
2.24.0.rc1.363.gb1bccd3e3d-goog


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox