Linux Security Modules development

Linux Security Modules development
 help / color / mirror / Atom feed

* Re: [RFC PATCH v2 2/5] x86/sgx: Require userspace to define enclave pages' protection bits
From: Sean Christopherson @ 2019-06-12 14:34 UTC (permalink / raw)
  To: Andy Lutomirski, q
  Cc: Xing, Cedric, Andy Lutomirski, Jarkko Sakkinen, Stephen Smalley,
	James Morris, Serge E . Hallyn, LSM List, Paul Moore, Eric Paris,
	selinux@vger.kernel.org, Jethro Beekman, Hansen, Dave,
	Thomas Gleixner, Linus Torvalds, LKML, X86 ML,
	linux-sgx@vger.kernel.org, Andrew Morton, nhorman@redhat.com,
	npmccallum@redhat.com, Ayoun, Serge, Katz-zamir, Shay,
	Huang, Haitao, Andy Shevchenko, Svahn, Kai, Borislav Petkov,
	Josh Triplett, Huang, Kai, David Rientjes, Roberts, William C,
	Tricca, Philip B
In-Reply-To: <331B31BF-9892-4FB3-9265-3E37412F80F4@amacapital.net>

On Tue, Jun 11, 2019 at 05:09:28PM -0700, Andy Lutomirski wrote:
> 
> On Jun 10, 2019, at 3:28 PM, Xing, Cedric <cedric.xing@intel.com> wrote:
> 
> >> From: Andy Lutomirski [mailto:luto@kernel.org]
> >> Sent: Monday, June 10, 2019 12:15 PM
> >> This seems like an odd workflow.  Shouldn't the #PF return back to
> >> untrusted userspace so that the untrusted user code can make its own
> >> decision as to whether it wants to EAUG a page there as opposed to, say,
> >> killing the enclave or waiting to keep resource usage under control?
> > 
> > This may seem odd to some at the first glance. But if you can think of how
> > static heap (pre-allocated by EADD before EINIT) works, the load parses the
> > "metadata" coming with the enclave to decide the address/size of the heap,
> > EADDs it, and calls it done. In the case of "dynamic" heap (allocated
> > dynamically by EAUG after EINIT), the same thing applies - the loader
> > determines the range of the heap, tells the SGX module about it, and calls
> > it done. Everything else is the between the enclave and the SGX module.
> > 
> > In practice, untrusted code usually doesn't know much about enclaves, just
> > like it doesn't know much about the shared objects loaded into its address
> > space either. Without the necessary knowledge, untrusted code usually just
> > does what it is told (via o-calls, or return value from e-calls), without
> > judging that's right or wrong. 
> > 
> > When it comes to #PF like what I described, of course a signal could be
> > sent to the untrusted code but what would it do then? Usually it'd just
> > come back asking for a page at the fault address. So we figured it'd be
> > more efficient to just have the kernel EAUG at #PF. 
> > 
> > Please don't get me wrong though, as I'm not dictating what the s/w flow
> > shall be. It's just going to be a choice offered to user mode. And that
> > choice was planned to be offered via mprotect() - i.e. a writable vma
> > causes kernel to EAUG while a non-writable vma will result in a signal
> > (then the user mode could decide whether to EAUG). The key point is
> > flexibility - as we want to allow all reasonable s/w flows instead of
> > dictating one over others. We had similar discussions on vDSO API before.
> > And I think you accepted my approach because of its flexibility. Am I
> > right?
> 
> As long as user code can turn this off, I have no real objection. But it
> might make sense to have it be more explicit — have an ioctl set up a range
> as “EAUG-on-demand”.

This was part of the motivation behind changing SGX_IOC_ENCLAVE_ADD_PAGE
to SGX_IOC_ENCLAVE_ADD_REGION and adding a @flags parameter.  E.g. adding
support for "EAUG-on-demand" regions would just be a new flag.

> But this is all currently irrelevant. We can argue about it when the patches
> show up. :)

^ permalink raw reply

* Re: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
From: Sean Christopherson @ 2019-06-12 14:25 UTC (permalink / raw)
  To: Dr. Greg
  Cc: Stephen Smalley, Cedric Xing, linux-security-module, selinux,
	linux-kernel, linux-sgx, jarkko.sakkinen, luto, jmorris, serge,
	paul, eparis, jethro, dave.hansen, tglx, torvalds, akpm, nhorman,
	pmccallum, serge.ayoun, shay.katz-zamir, haitao.huang,
	andriy.shevchenko, kai.svahn, bp, josh, kai.huang, rientjes,
	william.c.roberts, philip.b.tricca
In-Reply-To: <20190612093221.GA24188@wind.enjellic.com>

On Wed, Jun 12, 2019 at 04:32:21AM -0500, Dr. Greg wrote:
> With SGX2 we will, by necessity, have to admit the notion that a
> platform owner will not have any effective visibility into code that
> is loaded and executed, since it can come in over a secured network
> connection in an enclave security context.  This advocates for the
> simplest approach possible to providing some type of regulation to any
> form of WX page access.

I believe we're all on the same page in the sense that we all want the
"simplest approach possible", but there's a sliding scale of complexity
between the kernel and userspace.  We can make life simple for userspace
at the cost of additional complexity in the kernel, and vice versa.  The
disagreement is over where to shove the extra complexity.

> Current state of the art, and there doesn't appear to be a reason to
> change this, is to package an enclave in the form of an ELF shared
> library.  It seems straight forward to inherit and act on page
> privileges from the privileges specified on the ELF sections that are
> loaded.  Loaders will have a file descriptor available so an mmap of
> the incoming page with the specified privileges should trigger the
> required LSM interventions and tie them to a specific enclave.
> 
> The current enclave 'standard' also uses layout metadata, stored in a
> special .notes section of the shared image, to direct a loader with
> respect to construction of the enclave stack, heap, TCS and other
> miscellaneous regions not directly coded by the ELF TEXT sections.  It
> seems straight forward to extend this paradigm to declare region(s) of
> an enclave that are eligible to be generated at runtime (EAUG'ed) with
> the RWX protections needed to support dynamically loaded code.
> 
> If an enclave wishes to support this functionality, it would seem
> straight forward to require an enclave to provide a single zero page
> which the loader will mmap with those protections in order to trigger
> the desired LSM checks against that specific enclave.

This is effectively #1, e.g. would require userspace to pre-declare its
intent to make regions W->X.

^ permalink raw reply

* Re: [PATCH v3 0/2] ima/evm fixes for v5.2
From: Janne Karhunen @ 2019-06-12 13:38 UTC (permalink / raw)
  To: Roberto Sassu
  Cc: Mimi Zohar, dmitry.kasatkin, mjg59, linux-integrity,
	linux-security-module, silviu.vlasceanu
In-Reply-To: <d9efe3c7-20dd-bbb0-40d8-40f69cba5b88@huawei.com>

On Wed, Jun 12, 2019 at 4:11 PM Roberto Sassu <roberto.sassu@huawei.com> wrote:

> > - after initialization
> >    - deny reading|writing anything without security.ima
> >    - deny reading|writing anything invalid
> >    - allow everything else
> >
> > The logic is pretty handy as it even creates additional layer of
> > security around the early initialization files as they become
> > unreadable after use.
>
> What if they should be legitimately used after the HMAC key is unsealed
> and before switching to the persistent root file system?

Any examples? Log files and such are mostly 'one way' and should
probably be whitelisted in the policy?


> > Now, if we initialize the system with a random key like in your patch,
> > this logic is to change quite drastically? It sounds to me the
> > userland may actually break, all the userland initialization files in
> > the existing ima configurations that do not use digsigs would become
> > unreadable given that the random key is put in? Remember, those files
> > can be protected via other means (most commonly signed ramdisk).
>
> No, the first patch is about adding the ability to verify files created
> during each boot. For any other file, EVM returns INTEGRITY_UNKNOWN as
> before. The second patch changes the behavior, as INTEGRITY_UNKNOWN is
> considered as an error for the enforce-evm appraisal mode. The second
> patch aims at making the system more secure, as no file would be
> accessible unless it is verified.
>
> It is true that configurations without digsigs won't work anymore but
> the alternative is accepting any file until the HMAC key is unsealed.

That's a pretty big change for the userland IMHO. Quite a few
configurations out there will break, including mine I believe, so I
hope there is a solid reason asking people to change their stuff. I'm
fine holding off all writing until it is safe to do so for now..


--
Janne

^ permalink raw reply

* Re: [PATCH v2 1/2] LSM: switch to blocking policy update notifiers
From: Mimi Zohar @ 2019-06-12 13:28 UTC (permalink / raw)
  To: Janne Karhunen, sds, paul, linux-integrity, linux-security-module
In-Reply-To: <20190612074456.2504-1-janne.karhunen@gmail.com>

Hi Paul,

On Wed, 2019-06-12 at 10:44 +0300, Janne Karhunen wrote:
> Atomic policy updaters are not very useful as they cannot
> usually perform the policy updates on their own. Since it
> seems that there is no strict need for the atomicity,
> switch to the blocking variant. While doing so, rename
> the functions accordingly.
> 
> Changelog v2
> - Rebase to 'next-queued-testing'
> 
> Signed-off-by: Janne Karhunen <janne.karhunen@gmail.com>
> Acked-by: Paul Moore <paul@paul-moore.com>

The patches need to be upstreamed together.  Do you have any problems
with my upstreaming them via linux-integrity?

Mimi


^ permalink raw reply

* Re: [PATCH v2 1/2] ima: use the lsm policy update notifier
From: Mimi Zohar @ 2019-06-12 13:24 UTC (permalink / raw)
  To: Janne Karhunen, sds, paul, linux-integrity, linux-security-module
In-Reply-To: <20190612074456.2504-2-janne.karhunen@gmail.com>

On Wed, 2019-06-12 at 10:44 +0300, Janne Karhunen wrote:
> Don't do lazy policy updates while running the rule matching,
> run the updates as they happen.
> 
> Depends on commit 141a61ce6c60 ("LSM: switch to blocking policy update notifiers")
> 
> Changelog v2
> - Rebase to 'next-queued-testing'
> - Use memset to initialize the lsm rule array
> - Don't duplicate elements that are immutable during the rule copy
> 
> =========
> Signed-off-by: Janne Karhunen <janne.karhunen@gmail.com>
> 

Thanks, this looks a lot better.

Mimi


^ permalink raw reply

* Re: [PATCH v3 0/2] ima/evm fixes for v5.2
From: Roberto Sassu @ 2019-06-12 13:11 UTC (permalink / raw)
  To: Janne Karhunen
  Cc: Mimi Zohar, dmitry.kasatkin, mjg59, linux-integrity,
	linux-security-module, silviu.vlasceanu
In-Reply-To: <CAE=NcraYOw9B3RFu3_DbJs9nPT87AtQEptC7zF4kAu4FP8YhxA@mail.gmail.com>

On 6/12/2019 1:28 PM, Janne Karhunen wrote:
> On Thu, Jun 6, 2019 at 3:27 PM Roberto Sassu <roberto.sassu@huawei.com> wrote:
>>
>> Previous versions included the patch 'ima: don't ignore INTEGRITY_UNKNOWN
>> EVM status'. However, I realized that this patch cannot be accepted alone
>> because IMA-Appraisal would deny access to new files created during the
>> boot.
> 
> The early initialization logic seems to have been changing, the
> original one as I have understood it:
> - before initialization
>    - allow reading anything without security.ima
>    - deny reading anything with security.ima

These two should be probably inverted: deny..., allow...


>    - allow all writes

Allow writing anything with security.ima
Allow writing new files


> - after initialization
>    - deny reading|writing anything without security.ima
>    - deny reading|writing anything invalid
>    - allow everything else
> 
> The logic is pretty handy as it even creates additional layer of
> security around the early initialization files as they become
> unreadable after use.

What if they should be legitimately used after the HMAC key is unsealed
and before switching to the persistent root file system?


> Now, if we initialize the system with a random key like in your patch,
> this logic is to change quite drastically? It sounds to me the
> userland may actually break, all the userland initialization files in
> the existing ima configurations that do not use digsigs would become
> unreadable given that the random key is put in? Remember, those files
> can be protected via other means (most commonly signed ramdisk).

No, the first patch is about adding the ability to verify files created
during each boot. For any other file, EVM returns INTEGRITY_UNKNOWN as
before. The second patch changes the behavior, as INTEGRITY_UNKNOWN is
considered as an error for the enforce-evm appraisal mode. The second
patch aims at making the system more secure, as no file would be
accessible unless it is verified.

It is true that configurations without digsigs won't work anymore but
the alternative is accepting any file until the HMAC key is unsealed.

Signing the ramdisk is for sure a possibility, but IMA would be
sufficient to provide integrity protection as it checks any file in the
ram disk.

Unfortunately I found an issue in patch 1/2. These changes should be
applied:

--
diff --git a/security/integrity/evm/evm_main.c 
b/security/integrity/evm/evm_main.c
index faa4a02a3139..f4af595678fe 100644
--- a/security/integrity/evm/evm_main.c
+++ b/security/integrity/evm/evm_main.c
@@ -310,10 +310,14 @@ EXPORT_SYMBOL_GPL(evm_verifyxattr);
  static enum integrity_status evm_verify_current_integrity(struct 
dentry *dentry)
  {
         struct inode *inode = d_backing_inode(dentry);
+       int rc;

         if (!evm_key_loaded() || !S_ISREG(inode->i_mode) || evm_fixmode)
                 return 0;
-       return evm_verify_hmac(dentry, NULL, NULL, 0, NULL);
+       rc = evm_verify_hmac(dentry, NULL, NULL, 0, NULL);
+       if (rc == INTEGRITY_UNKNOWN && !evm_persistent_key_loaded())
+               return 0;
+       return rc;
  }

  /*
--

Roberto

-- 
HUAWEI TECHNOLOGIES Duesseldorf GmbH, HRB 56063
Managing Director: Bo PENG, Jian LI, Yanli SHI

^ permalink raw reply related

* Re: What do LSMs *actually* need for checks on notifications?
From: David Howells @ 2019-06-12 11:43 UTC (permalink / raw)
  To: Stephen Smalley
  Cc: dhowells, Casey Schaufler, Andy Lutomirski, viro, linux-usb,
	linux-security-module, linux-fsdevel, linux-kernel
In-Reply-To: <05ddc1e6-78ba-b60e-73b1-ffe86de2f2f8@tycho.nsa.gov>

Stephen Smalley <sds@tycho.nsa.gov> wrote:

> >   (6) The security attributes of all the objects between the object in (5)
> >       and the object in (4), assuming we work from (5) towards (4) if the
> >       two aren't coincident (WATCH_INFO_RECURSIVE).
> 
> Does this apply to anything other than mount notifications?

Not at the moment.  I'm considering making it such that you can make a watch
on a keyring get automatically propagated to keys that get added to the
keyring (and removed upon unlink) - the idea being that there is no 'single
parent path' concept for a keyring as there is for a directory.

I'm also pondering the idea of making it possible to have superblock watches
automatically propagated to superblocks created by automount points on the
watched superblock.

> And for mount notifications, isn't the notification actually for a change to
> the mount namespace, not a change to any file?

Yes.

> Hence, the real "object" for events that trigger mount notifications is the
> mount namespace, right?

Um... arguably.  Would that mean that that would need a label from somewhere?

> The watched path is just a way of identifying a subtree of the mount
> namespace for notifications - it isn't the real object being watched.

I like that argument.

Thanks,
David

^ permalink raw reply

* Re: [PATCH v3 0/2] ima/evm fixes for v5.2
From: Janne Karhunen @ 2019-06-12 11:28 UTC (permalink / raw)
  To: Roberto Sassu
  Cc: Mimi Zohar, dmitry.kasatkin, mjg59, linux-integrity,
	linux-security-module, silviu.vlasceanu
In-Reply-To: <20190606112620.26488-1-roberto.sassu@huawei.com>

On Thu, Jun 6, 2019 at 3:27 PM Roberto Sassu <roberto.sassu@huawei.com> wrote:
>
> Previous versions included the patch 'ima: don't ignore INTEGRITY_UNKNOWN
> EVM status'. However, I realized that this patch cannot be accepted alone
> because IMA-Appraisal would deny access to new files created during the
> boot.

The early initialization logic seems to have been changing, the
original one as I have understood it:
- before initialization
  - allow reading anything without security.ima
  - deny reading anything with security.ima
  - allow all writes
- after initialization
  - deny reading|writing anything without security.ima
  - deny reading|writing anything invalid
  - allow everything else

The logic is pretty handy as it even creates additional layer of
security around the early initialization files as they become
unreadable after use.

Now, if we initialize the system with a random key like in your patch,
this logic is to change quite drastically? It sounds to me the
userland may actually break, all the userland initialization files in
the existing ima configurations that do not use digsigs would become
unreadable given that the random key is put in? Remember, those files
can be protected via other means (most commonly signed ramdisk).

--
Janne

^ permalink raw reply

* Re: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
From: Dr. Greg @ 2019-06-12  9:32 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Stephen Smalley, Cedric Xing, linux-security-module, selinux,
	linux-kernel, linux-sgx, jarkko.sakkinen, luto, jmorris, serge,
	paul, eparis, jethro, dave.hansen, tglx, torvalds, akpm, nhorman,
	pmccallum, serge.ayoun, shay.katz-zamir, haitao.huang,
	andriy.shevchenko, kai.svahn, bp, josh, kai.huang, rientjes,
	william.c.roberts, philip.b.tricca
In-Reply-To: <20190611220243.GB3416@linux.intel.com>

On Tue, Jun 11, 2019 at 03:02:43PM -0700, Sean Christopherson wrote:

Good morning, I hope the week is proceeding well for everyone.

> On Tue, Jun 11, 2019 at 09:40:25AM -0400, Stephen Smalley wrote:
> > I haven't looked at this code closely, but it feels like a lot of
> > SGX-specific logic embedded into SELinux that will have to be repeated or
> > reused for every security module.  Does SGX not track this state itself?

> SGX does track equivalent state.
> 
> There are three proposals on the table (I think):
> 
>   1. Require userspace to explicitly specificy (maximal) enclave page
>      permissions at build time.  The enclave page permissions are provided
>      to, and checked by, LSMs at enclave build time.
> 
>      Pros: Low-complexity kernel implementation, straightforward auditing
>      Cons: Sullies the SGX UAPI to some extent, may increase complexity of
>            SGX2 enclave loaders.
> 
>   2. Pre-check LSM permissions and dynamically track mappings to enclave
>      pages, e.g. add an SGX mprotect() hook to restrict W->X and WX
>      based on the pre-checked permissions.
> 
>      Pros: Does not impact SGX UAPI, medium kernel complexity
>      Cons: Auditing is complex/weird, requires taking enclave-specific
>            lock during mprotect() to query/update tracking.
> 
>   3. Implement LSM hooks in SGX to allow LSMs to track enclave regions
>      from cradle to grave, but otherwise defer everything to LSMs.
> 
>      Pros: Does not impact SGX UAPI, maximum flexibility, precise auditing
>      Cons: Most complex and "heaviest" kernel implementation of the three,
>            pushes more SGX details into LSMs.
> 
> My RFC series[1] implements #1.  My understanding is that Andy (Lutomirski)
> prefers #2.  Cedric's RFC series implements #3.
> 
> Perhaps the easiest way to make forward progress is to rule out the
> options we absolutely *don't* want by focusing on the potentially blocking
> issue with each option:
>
>   #1 - SGX UAPI funkiness
> 
>   #2 - Auditing complexity, potential enclave lock contention
> 
>   #3 - Pushing SGX details into LSMs and complexity of kernel implementation

At the risk of repeating myself, I believe the issue that has not
received full clarity is that, for a security relevant solution, there
has to be two separate aspects of LSM coverage for SGX.  I believe
that a high level review of the requirements may assist in selection
of a course of action for the driver.

The first aspect of LSM control has been covered extensively and that
is the notion of implementing control over the ability of a user
identity to request some cohort of page privileges.  The cohort of
obvious concern is the ability of a page to possess both WRITE and
EXECUTE privileges at sometime during its lifetime.

Given that SGX2 support is the ultimate and necesary goal for this
driver, the selected proposal should be the one that gives the most
simplistic application of this policy.  As I have noted previously,
once SGX2 becomes available, the only relevant security control that
can be realized with this type of LSM support is whether or not the
platform owner wishes to limit access by a user identity to the
ability to dynamically load code in enclave context.

With SGX2 we will, by necessity, have to admit the notion that a
platform owner will not have any effective visibility into code that
is loaded and executed, since it can come in over a secured network
connection in an enclave security context.  This advocates for the
simplest approach possible to providing some type of regulation to any
form of WX page access.

Current state of the art, and there doesn't appear to be a reason to
change this, is to package an enclave in the form of an ELF shared
library.  It seems straight forward to inherit and act on page
privileges from the privileges specified on the ELF sections that are
loaded.  Loaders will have a file descriptor available so an mmap of
the incoming page with the specified privileges should trigger the
required LSM interventions and tie them to a specific enclave.

The current enclave 'standard' also uses layout metadata, stored in a
special .notes section of the shared image, to direct a loader with
respect to construction of the enclave stack, heap, TCS and other
miscellaneous regions not directly coded by the ELF TEXT sections.  It
seems straight forward to extend this paradigm to declare region(s) of
an enclave that are eligible to be generated at runtime (EAUG'ed) with
the RWX protections needed to support dynamically loaded code.

If an enclave wishes to support this functionality, it would seem
straight forward to require an enclave to provide a single zero page
which the loader will mmap with those protections in order to trigger
the desired LSM checks against that specific enclave.

The simplest driver approach that achieves the desired introspection
of permissions in the described framework will implement as much LSM
security as is possible with SGX technology and with minimal
disruption to the existing SGX software eco-system.

This leaves the second aspect of LSM security and that is the ability
to inspect and act on the initialized characteristics of the enclave.
This is the aspect of SGX LSM functionality that has not been clearly
called out.

All that is needed here is an LSM hook that gets handed a pointer to
the signature structure (SIGSTRUCT) that is passed to the EINIT ioctl.
If the SIGSTRUCT does not match the proposed enclave image that the
processor has computed secondary to the enclave image creation process
the enclave will not initialize, so all that is needed is for an LSM
to be allowed to interpret and act on the characteristics defined in
that structure before the enclave is actually initialized.

As we have now collectively demonstrated, it is easy to get lost in
minutia with respect to all of this.  I believe if we can focus on a
solution that implements what I have discussed above we will achieve
as much as can be achieved with respect to platform security for SGX
systems.

Best wishes for a productive remainder of the week.

Dr. Greg

As always,
Dr. G.W. Wettstein, Ph.D.   Enjellic Systems Development, LLC.
4206 N. 19th Ave.           Specializing in information infra-structure
Fargo, ND  58102            development.
PH: 701-281-1686            EMAIL: greg@enjellic.com
------------------------------------------------------------------------------
"Nullum magnum ingenium sine mixtura dementiae fuit."
        (There is no great genius without some touch of madness.)
                                -- Seneca

^ permalink raw reply

* Re: [RFC][PATCH 00/13] Mount, FS, Block and Keyrings notifications [ver #4]
From: David Howells @ 2019-06-12  8:55 UTC (permalink / raw)
  To: Stephen Smalley
  Cc: dhowells, Andy Lutomirski, Casey Schaufler, Andy Lutomirski,
	Al Viro, USB list, LSM List, Greg Kroah-Hartman, raven,
	Linux FS Devel, Linux API, linux-block, keyrings, LKML,
	Paul Moore
In-Reply-To: <cf3f4865-b6d7-7303-0212-960439e0c119@tycho.nsa.gov>

Stephen Smalley <sds@tycho.nsa.gov> wrote:

> 2) If notifications can be triggered by read-like operations (as in fanotify,
> for example), then a "read" can be turned into a "write" flow through a
> notification.

I don't think any of the things can be classed as "read-like" operations.  At
the moment, there are the following groups:

 (1) Addition of objects (eg. key_link, mount).

 (2) Modifications to things (eg. keyctl_write, remount).

 (3) Removal of objects (eg. key_unlink, unmount, fput+FMODE_NEED_UNMOUNT).

 (4) I/O or hardware errors (eg. USB device add/remove, EDQUOT, ENOSPC).

I have not currently defined any access events.

I've been looking at the possibility of having epoll generate events this way,
but that's not as straightforward as I'd hoped and fanotify could potentially
use it also, but in both those cases, the process is already getting the
events currently by watching for them using synchronous waiting syscalls.
Instead this would generate an event to say it had happened.

David

^ permalink raw reply

* [PATCH v2 1/2] ima: use the lsm policy update notifier
From: Janne Karhunen @ 2019-06-12  7:44 UTC (permalink / raw)
  To: sds, paul, zohar, linux-integrity, linux-security-module; +Cc: Janne Karhunen
In-Reply-To: <20190612074456.2504-1-janne.karhunen@gmail.com>

Don't do lazy policy updates while running the rule matching,
run the updates as they happen.

Depends on commit 141a61ce6c60 ("LSM: switch to blocking policy update notifiers")

Changelog v2
- Rebase to 'next-queued-testing'
- Use memset to initialize the lsm rule array
- Don't duplicate elements that are immutable during the rule copy

=========
Signed-off-by: Janne Karhunen <janne.karhunen@gmail.com>
---
 security/integrity/ima/ima.h        |   2 +
 security/integrity/ima/ima_main.c   |   8 ++
 security/integrity/ima/ima_policy.c | 116 +++++++++++++++++++++++-----
 3 files changed, 106 insertions(+), 20 deletions(-)

diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index 18b48a6d0b80..579544527246 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -155,6 +155,8 @@ unsigned long ima_get_binary_runtime_size(void);
 int ima_init_template(void);
 void ima_init_template_list(void);
 int __init ima_init_digests(void);
+int ima_lsm_policy_change(struct notifier_block *nb, unsigned long event,
+			  void *lsm_data);
 
 /*
  * used to protect h_table and sha_table
diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index af341a80118f..a7e7e2d7224c 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -43,6 +43,10 @@ int ima_appraise;
 int ima_hash_algo = HASH_ALGO_SHA1;
 static int hash_setup_done;
 
+static struct notifier_block ima_lsm_policy_notifier = {
+	.notifier_call = ima_lsm_policy_change,
+};
+
 static int __init hash_setup(char *str)
 {
 	struct ima_template_desc *template_desc = ima_template_desc_current();
@@ -622,6 +626,10 @@ static int __init init_ima(void)
 		error = ima_init();
 	}
 
+	error = register_blocking_lsm_notifier(&ima_lsm_policy_notifier);
+	if (error)
+		pr_warn("Couldn't register LSM notifier, error %d\n", error);
+
 	if (!error)
 		ima_update_policy_flag();
 
diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c
index fd9b01881d17..e1550859f870 100644
--- a/security/integrity/ima/ima_policy.c
+++ b/security/integrity/ima/ima_policy.c
@@ -250,31 +250,113 @@ static int __init default_appraise_policy_setup(char *str)
 }
 __setup("ima_appraise_tcb", default_appraise_policy_setup);
 
+static void ima_lsm_free_rule(struct ima_rule_entry *entry)
+{
+	int i;
+
+	for (i = 0; i < MAX_LSM_RULES; i++) {
+		kfree(entry->lsm[i].rule);
+		kfree(entry->lsm[i].args_p);
+	}
+	kfree(entry);
+}
+
+static struct ima_rule_entry *ima_lsm_copy_rule(struct ima_rule_entry *entry)
+{
+	struct ima_rule_entry *nentry;
+	int i, result;
+
+	nentry = kmalloc(sizeof(*nentry), GFP_KERNEL);
+	if (!nentry)
+		return NULL;
+
+	/*
+	 * Immutable elements are copied over as pointers and data; only
+	 * lsm rules can change
+	 */
+	memcpy(nentry, entry, sizeof(*nentry));
+	memset(nentry->lsm, 0, FIELD_SIZEOF(struct ima_rule_entry, lsm));
+
+	for (i = 0; i < MAX_LSM_RULES; i++) {
+		if (!entry->lsm[i].rule)
+			continue;
+
+		nentry->lsm[i].type = entry->lsm[i].type;
+		nentry->lsm[i].args_p = kstrdup(entry->lsm[i].args_p,
+						GFP_KERNEL);
+		if (!nentry->lsm[i].args_p)
+			goto out_err;
+
+		result = security_filter_rule_init(nentry->lsm[i].type,
+						   Audit_equal,
+						   nentry->lsm[i].args_p,
+						   &nentry->lsm[i].rule);
+		if (result == -EINVAL)
+			pr_warn("ima: rule for LSM \'%d\' is undefined\n",
+				entry->lsm[i].type);
+	}
+	return nentry;
+
+out_err:
+	ima_lsm_free_rule(nentry);
+	return NULL;
+}
+
+static int ima_lsm_update_rule(struct ima_rule_entry *entry)
+{
+	struct ima_rule_entry *nentry;
+
+	nentry = ima_lsm_copy_rule(entry);
+	if (!nentry)
+		return -ENOMEM;
+
+	list_replace_rcu(&entry->list, &nentry->list);
+	synchronize_rcu();
+	ima_lsm_free_rule(entry);
+
+	return 0;
+}
+
 /*
  * The LSM policy can be reloaded, leaving the IMA LSM based rules referring
  * to the old, stale LSM policy.  Update the IMA LSM based rules to reflect
- * the reloaded LSM policy.  We assume the rules still exist; and BUG_ON() if
- * they don't.
+ * the reloaded LSM policy.
  */
 static void ima_lsm_update_rules(void)
 {
-	struct ima_rule_entry *entry;
-	int result;
-	int i;
+	struct ima_rule_entry *entry, *e;
+	int i, result, needs_update;
 
-	list_for_each_entry(entry, &ima_policy_rules, list) {
+	list_for_each_entry_safe(entry, e, &ima_policy_rules, list) {
+		needs_update = 0;
 		for (i = 0; i < MAX_LSM_RULES; i++) {
-			if (!entry->lsm[i].rule)
-				continue;
-			result = security_filter_rule_init(entry->lsm[i].type,
-							   Audit_equal,
-							   entry->lsm[i].args_p,
-							   &entry->lsm[i].rule);
-			BUG_ON(!entry->lsm[i].rule);
+			if (entry->lsm[i].rule) {
+				needs_update = 1;
+				break;
+			}
+		}
+		if (!needs_update)
+			continue;
+
+		result = ima_lsm_update_rule(entry);
+		if (result) {
+			pr_err("ima: lsm rule update error %d\n",
+				result);
+			return;
 		}
 	}
 }
 
+int ima_lsm_policy_change(struct notifier_block *nb, unsigned long event,
+			  void *lsm_data)
+{
+	if (event != LSM_POLICY_CHANGE)
+		return NOTIFY_DONE;
+
+	ima_lsm_update_rules();
+	return NOTIFY_OK;
+}
+
 /**
  * ima_match_rules - determine whether an inode matches the measure rule.
  * @rule: a pointer to a rule
@@ -328,11 +410,10 @@ static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode,
 	for (i = 0; i < MAX_LSM_RULES; i++) {
 		int rc = 0;
 		u32 osid;
-		int retried = 0;
 
 		if (!rule->lsm[i].rule)
 			continue;
-retry:
+
 		switch (i) {
 		case LSM_OBJ_USER:
 		case LSM_OBJ_ROLE:
@@ -353,11 +434,6 @@ static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode,
 		default:
 			break;
 		}
-		if ((rc < 0) && (!retried)) {
-			retried = 1;
-			ima_lsm_update_rules();
-			goto retry;
-		}
 		if (!rc)
 			return false;
 	}
-- 
2.17.1


^ permalink raw reply related

* [PATCH v2 1/2] LSM: switch to blocking policy update notifiers
From: Janne Karhunen @ 2019-06-12  7:44 UTC (permalink / raw)
  To: sds, paul, zohar, linux-integrity, linux-security-module; +Cc: Janne Karhunen

Atomic policy updaters are not very useful as they cannot
usually perform the policy updates on their own. Since it
seems that there is no strict need for the atomicity,
switch to the blocking variant. While doing so, rename
the functions accordingly.

Changelog v2
- Rebase to 'next-queued-testing'

Signed-off-by: Janne Karhunen <janne.karhunen@gmail.com>
Acked-by: Paul Moore <paul@paul-moore.com>
---
 drivers/infiniband/core/device.c |  6 +++---
 include/linux/security.h         |  6 +++---
 security/security.c              | 23 +++++++++++++----------
 security/selinux/hooks.c         |  2 +-
 security/selinux/selinuxfs.c     |  2 +-
 5 files changed, 21 insertions(+), 18 deletions(-)

diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 78dc07c6ac4b..61c0c93a2e73 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -2499,7 +2499,7 @@ static int __init ib_core_init(void)
 		goto err_mad;
 	}
 
-	ret = register_lsm_notifier(&ibdev_lsm_nb);
+	ret = register_blocking_lsm_notifier(&ibdev_lsm_nb);
 	if (ret) {
 		pr_warn("Couldn't register LSM notifier. ret %d\n", ret);
 		goto err_sa;
@@ -2518,7 +2518,7 @@ static int __init ib_core_init(void)
 	return 0;
 
 err_compat:
-	unregister_lsm_notifier(&ibdev_lsm_nb);
+	unregister_blocking_lsm_notifier(&ibdev_lsm_nb);
 err_sa:
 	ib_sa_cleanup();
 err_mad:
@@ -2544,7 +2544,7 @@ static void __exit ib_core_cleanup(void)
 	nldev_exit();
 	rdma_nl_unregister(RDMA_NL_LS);
 	unregister_pernet_device(&rdma_dev_net_ops);
-	unregister_lsm_notifier(&ibdev_lsm_nb);
+	unregister_blocking_lsm_notifier(&ibdev_lsm_nb);
 	ib_sa_cleanup();
 	ib_mad_cleanup();
 	addr_cleanup();
diff --git a/include/linux/security.h b/include/linux/security.h
index 659071c2e57c..fc655fbe44ad 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -189,9 +189,9 @@ static inline const char *kernel_load_data_id_str(enum kernel_load_data_id id)
 
 #ifdef CONFIG_SECURITY
 
-int call_lsm_notifier(enum lsm_event event, void *data);
-int register_lsm_notifier(struct notifier_block *nb);
-int unregister_lsm_notifier(struct notifier_block *nb);
+int call_blocking_lsm_notifier(enum lsm_event event, void *data);
+int register_blocking_lsm_notifier(struct notifier_block *nb);
+int unregister_blocking_lsm_notifier(struct notifier_block *nb);
 
 /* prototypes */
 extern int security_init(void);
diff --git a/security/security.c b/security/security.c
index 613a5c00e602..47e5849d7557 100644
--- a/security/security.c
+++ b/security/security.c
@@ -39,7 +39,7 @@
 #define LSM_COUNT (__end_lsm_info - __start_lsm_info)
 
 struct security_hook_heads security_hook_heads __lsm_ro_after_init;
-static ATOMIC_NOTIFIER_HEAD(lsm_notifier_chain);
+static BLOCKING_NOTIFIER_HEAD(blocking_lsm_notifier_chain);
 
 static struct kmem_cache *lsm_file_cache;
 static struct kmem_cache *lsm_inode_cache;
@@ -430,23 +430,26 @@ void __init security_add_hooks(struct security_hook_list *hooks, int count,
 		panic("%s - Cannot get early memory.\n", __func__);
 }
 
-int call_lsm_notifier(enum lsm_event event, void *data)
+int call_blocking_lsm_notifier(enum lsm_event event, void *data)
 {
-	return atomic_notifier_call_chain(&lsm_notifier_chain, event, data);
+	return blocking_notifier_call_chain(&blocking_lsm_notifier_chain,
+					    event, data);
 }
-EXPORT_SYMBOL(call_lsm_notifier);
+EXPORT_SYMBOL(call_blocking_lsm_notifier);
 
-int register_lsm_notifier(struct notifier_block *nb)
+int register_blocking_lsm_notifier(struct notifier_block *nb)
 {
-	return atomic_notifier_chain_register(&lsm_notifier_chain, nb);
+	return blocking_notifier_chain_register(&blocking_lsm_notifier_chain,
+						nb);
 }
-EXPORT_SYMBOL(register_lsm_notifier);
+EXPORT_SYMBOL(register_blocking_lsm_notifier);
 
-int unregister_lsm_notifier(struct notifier_block *nb)
+int unregister_blocking_lsm_notifier(struct notifier_block *nb)
 {
-	return atomic_notifier_chain_unregister(&lsm_notifier_chain, nb);
+	return blocking_notifier_chain_unregister(&blocking_lsm_notifier_chain,
+						  nb);
 }
-EXPORT_SYMBOL(unregister_lsm_notifier);
+EXPORT_SYMBOL(unregister_blocking_lsm_notifier);
 
 /**
  * lsm_cred_alloc - allocate a composite cred blob
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index c61787b15f27..c1e37018c8eb 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -197,7 +197,7 @@ static int selinux_lsm_notifier_avc_callback(u32 event)
 {
 	if (event == AVC_CALLBACK_RESET) {
 		sel_ib_pkey_flush();
-		call_lsm_notifier(LSM_POLICY_CHANGE, NULL);
+		call_blocking_lsm_notifier(LSM_POLICY_CHANGE, NULL);
 	}
 
 	return 0;
diff --git a/security/selinux/selinuxfs.c b/security/selinux/selinuxfs.c
index 145ee62f205a..1e2e3e4b5fdb 100644
--- a/security/selinux/selinuxfs.c
+++ b/security/selinux/selinuxfs.c
@@ -180,7 +180,7 @@ static ssize_t sel_write_enforce(struct file *file, const char __user *buf,
 		selnl_notify_setenforce(new_value);
 		selinux_status_update_setenforce(state, new_value);
 		if (!new_value)
-			call_lsm_notifier(LSM_POLICY_CHANGE, NULL);
+			call_blocking_lsm_notifier(LSM_POLICY_CHANGE, NULL);
 	}
 	length = count;
 out:
-- 
2.17.1


^ permalink raw reply related

* Re: [PATCH 09/10] usb: Add USB subsystem notifications [ver #3]
From: Felipe Balbi @ 2019-06-12  6:58 UTC (permalink / raw)
  To: Alan Stern
  Cc: Mathias Nyman, Greg Kroah-Hartman, David Howells, viro, linux-usb,
	raven, linux-fsdevel, linux-api, linux-block, keyrings,
	linux-security-module, linux-kernel
In-Reply-To: <Pine.LNX.4.44L0.1906110950440.1535-100000@iolanthe.rowland.org>


Hi,

Alan Stern <stern@rowland.harvard.edu> writes:

> On Tue, 11 Jun 2019, Felipe Balbi wrote:
>
>> >> >> > So for "severe" issues, yes, we should do this, but perhaps not for all
>> >> >> > of the "normal" things we see when a device is yanked out of the system
>> >> >> > and the like.
>> >> >> 
>> >> >> Then what counts as a "severe" issue?  Anything besides enumeration 
>> >> >> failure?
>> >> >
>> >> > Not that I can think of at the moment, other than the other recently
>> >> > added KOBJ_CHANGE issue.  I'm sure we have other "hard failure" issues
>> >> > in the USB stack that people will want exposed over time.
>> >> 
>> >> From an XHCI standpoint, Transaction Errors might be one thing. They
>> >> happen rarely and are a strong indication that the bus itself is
>> >> bad. Either bad cable, misbehaving PHYs, improper power management, etc.
>> >
>> > Don't you also get transaction errors if the user unplugs a device in 
>> > the middle of a transfer?  That's not the sort of thing we want to sent 
>> > notifications about.
>> 
>> Mathias, do we get Transaction Error if user removes cable during a
>> transfer? I thought we would just get Port Status Change with CC bit
>> cleared, no?
>
> Even if xHCI doesn't give Transaction Errors when a cable is unplugged 
> during a transfer, other host controllers do.  Sometimes quite a lot -- 
> they continue to occur until the kernel polls the parent hub's 
> interrupt ep and learns that the port is disconnected, which can take 
> up to 250 ms.

my comment was specific about XHCI. It even started with "From an XHCI
standpoint" :-)

-- 
balbi

^ permalink raw reply

* Re: [RFC PATCH v2 2/5] x86/sgx: Require userspace to define enclave pages' protection bits
From: Andy Lutomirski @ 2019-06-12  0:09 UTC (permalink / raw)
  To: Xing, Cedric
  Cc: Andy Lutomirski, Christopherson, Sean J, Jarkko Sakkinen,
	Stephen Smalley, James Morris, Serge E . Hallyn, LSM List,
	Paul Moore, Eric Paris, selinux@vger.kernel.org, Jethro Beekman,
	Hansen, Dave, Thomas Gleixner, Linus Torvalds, LKML, X86 ML,
	linux-sgx@vger.kernel.org, Andrew Morton, nhorman@redhat.com,
	npmccallum@redhat.com, Ayoun, Serge, Katz-zamir, Shay,
	Huang, Haitao, Andy Shevchenko, Svahn, Kai, Borislav Petkov,
	Josh Triplett, Huang, Kai, David Rientjes, Roberts, William C,
	Tricca, Philip B
In-Reply-To: <960B34DE67B9E140824F1DCDEC400C0F655010EF@ORSMSX116.amr.corp.intel.com>



On Jun 10, 2019, at 3:28 PM, Xing, Cedric <cedric.xing@intel.com> wrote:

>> From: Andy Lutomirski [mailto:luto@kernel.org]
>> Sent: Monday, June 10, 2019 12:15 PM
>> 
>> On Mon, Jun 10, 2019 at 11:29 AM Xing, Cedric <cedric.xing@intel.com>
>> wrote:
>>> 
>>>> From: Christopherson, Sean J
>>>> Sent: Wednesday, June 05, 2019 7:12 PM
>>>> 
>>>> +/**
>>>> + * sgx_map_allowed - check vma protections against the associated
>>>> enclave page
>>>> + * @encl:    an enclave
>>>> + * @start:   start address of the mapping (inclusive)
>>>> + * @end:     end address of the mapping (exclusive)
>>>> + * @prot:    protection bits of the mapping
>>>> + *
>>>> + * Verify a userspace mapping to an enclave page would not violate
>>>> +the security
>>>> + * requirements of the *kernel*.  Note, this is in no way related
>>>> +to the
>>>> + * page protections enforced by hardware via the EPCM.  The EPCM
>>>> +protections
>>>> + * can be directly extended by the enclave, i.e. cannot be relied
>>>> +upon by the
>>>> + * kernel for security guarantees of any kind.
>>>> + *
>>>> + * Return:
>>>> + *   0 on success,
>>>> + *   -EACCES if the mapping is disallowed
>>>> + */
>>>> +int sgx_map_allowed(struct sgx_encl *encl, unsigned long start,
>>>> +                 unsigned long end, unsigned long prot) {
>>>> +     struct sgx_encl_page *page;
>>>> +     unsigned long addr;
>>>> +
>>>> +     prot &= (VM_READ | VM_WRITE | VM_EXEC);
>>>> +     if (!prot || !encl)
>>>> +             return 0;
>>>> +
>>>> +     mutex_lock(&encl->lock);
>>>> +
>>>> +     for (addr = start; addr < end; addr += PAGE_SIZE) {
>>>> +             page = radix_tree_lookup(&encl->page_tree, addr >>
>>>> PAGE_SHIFT);
>>>> +
>>>> +             /*
>>>> +              * Do not allow R|W|X to a non-existent page, or
>> protections
>>>> +              * beyond those of the existing enclave page.
>>>> +              */
>>>> +             if (!page || (prot & ~page->prot))
>>>> +                     return -EACCES;
>>> 
>>> In SGX2, pages will be "mapped" before being populated.
>>> 
>>> Here's a brief summary for those who don't have enough background on
>> how new EPC pages could be added to a running enclave in SGX2:
>>>  - There are 2 new instructions - EACCEPT and EAUG.
>>>  - EAUG is used by SGX module to add (augment) a new page to an
>> existing enclave. The newly added page is *inaccessible* until the
>> enclave *accepts* it.
>>>  - EACCEPT is the instruction for an enclave to accept a new page.
>>> 
>>> And the s/w flow for an enclave to request new EPC pages is expected
>> to be something like the following:
>>>  - The enclave issues EACCEPT at the linear address that it would
>> like a new page.
>>>  - EACCEPT results in #PF, as there's no page at the linear address
>> above.
>>>  - SGX module is notified about the #PF, in form of its vma->vm_ops-
>>> fault() being called by kernel.
>>>  - SGX module EAUGs a new EPC page at the fault address, and resumes
>> the enclave.
>>>  - EACCEPT is reattempted, and succeeds at this time.
>> 
>> This seems like an odd workflow.  Shouldn't the #PF return back to
>> untrusted userspace so that the untrusted user code can make its own
>> decision as to whether it wants to EAUG a page there as opposed to, say,
>> killing the enclave or waiting to keep resource usage under control?
> 
> This may seem odd to some at the first glance. But if you can think of how static heap (pre-allocated by EADD before EINIT) works, the load parses the "metadata" coming with the enclave to decide the address/size of the heap, EADDs it, and calls it done. In the case of "dynamic" heap (allocated dynamically by EAUG after EINIT), the same thing applies - the loader determines the range of the heap, tells the SGX module about it, and calls it done. Everything else is the between the enclave and the SGX module.
> 
> In practice, untrusted code usually doesn't know much about enclaves, just like it doesn't know much about the shared objects loaded into its address space either. Without the necessary knowledge, untrusted code usually just does what it is told (via o-calls, or return value from e-calls), without judging that's right or wrong. 
> 
> When it comes to #PF like what I described, of course a signal could be sent to the untrusted code but what would it do then? Usually it'd just come back asking for a page at the fault address. So we figured it'd be more efficient to just have the kernel EAUG at #PF. 
> 
> Please don't get me wrong though, as I'm not dictating what the s/w flow shall be. It's just going to be a choice offered to user mode. And that choice was planned to be offered via mprotect() - i.e. a writable vma causes kernel to EAUG while a non-writable vma will result in a signal (then the user mode could decide whether to EAUG). The key point is flexibility - as we want to allow all reasonable s/w flows instead of dictating one over others. We had similar discussions on vDSO API before. And I think you accepted my approach because of its flexibility. Am I right?

As long as user code can turn this off, I have no real objection. But it might make sense to have it be more explicit — have an ioctl set up a range as “EAUG-on-demand”.

But this is all currently irrelevant. We can argue about it when the patches show up. :)

^ permalink raw reply

* RE: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
From: Xing, Cedric @ 2019-06-11 22:55 UTC (permalink / raw)
  To: Stephen Smalley, linux-security-module@vger.kernel.org,
	selinux@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-sgx@vger.kernel.org
  Cc: jarkko.sakkinen@linux.intel.com, luto@kernel.org,
	jmorris@namei.org, serge@hallyn.com, paul@paul-moore.com,
	eparis@parisplace.org, jethro@fortanix.com, Hansen, Dave,
	tglx@linutronix.de, torvalds@linux-foundation.org,
	akpm@linux-foundation.org, nhorman@redhat.com,
	pmccallum@redhat.com, Ayoun, Serge, Katz-zamir, Shay,
	Huang, Haitao, andriy.shevchenko@linux.intel.com, Svahn, Kai,
	bp@alien8.de, josh@joshtriplett.org, Huang, Kai,
	rientjes@google.com, Roberts, William C, Tricca, Philip B
In-Reply-To: <b6f099cd-c0eb-d5cf-847d-27a15ac5ceaf@tycho.nsa.gov>

> From: linux-sgx-owner@vger.kernel.org [mailto:linux-sgx-
> owner@vger.kernel.org] On Behalf Of Stephen Smalley
> Sent: Tuesday, June 11, 2019 6:40 AM
> 
> >
> > +#ifdef CONFIG_INTEL_SGX
> > +	rc = sgxsec_mprotect(vma, prot);
> > +	if (rc <= 0)
> > +		return rc;
> 
> Why are you skipping the file_map_prot_check() call when rc == 0?
> What would SELinux check if you didn't do so -
> FILE__READ|FILE__WRITE|FILE__EXECUTE to /dev/sgx/enclave?  Is it a
> problem to let SELinux proceed with that check?

We can continue the check. But in practice, all FILE__{READ|WRITE|EXECUTE} are needed for every enclave, then what's the point of checking them? FILE__EXECMOD may be the only flag that has a meaning, but it's kind of redundant because sigstruct file was checked against that already.

> > +static int selinux_enclave_load(struct file *encl, unsigned long addr,
> > +				unsigned long size, unsigned long prot,
> > +				struct vm_area_struct *source)
> > +{
> > +	if (source) {
> > +		/**
> > +		 * Adding page from source => EADD request
> > +		 */
> > +		int rc = selinux_file_mprotect(source, prot, prot);
> > +		if (rc)
> > +			return rc;
> > +
> > +		if (!(prot & VM_EXEC) &&
> > +		    selinux_file_mprotect(source, VM_EXEC, VM_EXEC))
> 
> I wouldn't conflate VM_EXEC with PROT_EXEC even if they happen to be
> defined with the same values currently.  Elsewhere the kernel appears to
> explicitly translate them ala calc_vm_prot_bits().

Thanks! I'd change them to PROT_EXEC in the next version.

> 
> Also, this will mean that we will always perform an execute check on all
> sources, thereby triggering audit denial messages for any EADD sources
> that are only intended to be data.  Depending on the source, this could
> trigger PROCESS__EXECMEM or FILE__EXECMOD or FILE__EXECUTE.  In a world
> where users often just run any denials they see through audit2allow,
> they'll end up always allowing them all.  How can they tell whether it
> was needed? It would be preferable if we could only trigger execute
> checks when there is some probability that execute will be requested in
> the future.  Alternatives would be to silence the audit of these
> permission checks always via use of _noaudit() interfaces or to silence
> audit of these permissions via dontaudit rules in policy, but the latter
> would hide all denials of the permission by the process, not just those
> triggered from security_enclave_load().  And if we silence them, then we
> won't see them even if they were needed.

*_noaudit() is exactly what I wanted. But I couldn't find selinux_file_mprotect_noaudit()/file_has_perm_noaudit(), and I'm reluctant to duplicate code. Any suggestions?
 
> 
> > +			prot = 0;
> > +		else {
> > +			prot = SGX__EXECUTE;
> > +			if (source->vm_file &&
> > +			    !file_has_perm(current_cred(), source->vm_file,
> > +					   FILE__EXECMOD))
> > +				prot |= SGX__EXECMOD;
> 
> Similarly, this means that we will always perform a FILE__EXECMOD check
> on all executable sources, triggering audit denial messages for any EADD
> source that is executable but to which EXECMOD is not allowed, and again
> the most common pattern will be that users will add EXECMOD to all
> executable sources to avoid this.
> 
> > +		}
> > +		return sgxsec_eadd(encl, addr, size, prot);
> > +	} else {
> > +		/**
> > +		  * Adding page from NULL => EAUG request
> > +		  */
> > +		return sgxsec_eaug(encl, addr, size, prot);
> > +	}
> > +}
> > +
> > +static int selinux_enclave_init(struct file *encl,
> > +				const struct sgx_sigstruct *sigstruct,
> > +				struct vm_area_struct *vma)
> > +{
> > +	int rc = 0;
> > +
> > +	if (!vma)
> > +		rc = -EINVAL;
> 
> Is it ever valid to call this hook with a NULL vma?  If not, this should
> be handled/prevented by the caller.  If so, I'd just return -EINVAL
> immediately here.

vma shall never be NULL. I'll update it in the next version.

> 
> > +
> > +	if (!rc && !(vma->vm_flags & VM_EXEC))
> > +		rc = selinux_file_mprotect(vma, VM_EXEC, VM_EXEC);
> 
> I had thought we were trying to avoid overloading FILE__EXECUTE (or
> whatever gets checked here, e.g. could be PROCESS__EXECMEM or
> FILE__EXECMOD) on the sigstruct file, since the caller isn't truly
> executing code from it.

Agreed. Another problem with FILE__EXECMOD on the sigstruct file is that user code would then be allowed to modify SIGSTRUCT at will, which effectively wipes out the protection provided by FILE__EXECUTE.

> 
> I'd define new ENCLAVE__* permissions, including an up-front
> ENCLAVE__INIT permission that governs whether the sigstruct file can be
> used at all irrespective of memory protections.

Agreed.

> 
> Then you can also have ENCLAVE__EXECUTE, ENCLAVE__EXECMEM,
> ENCLAVE__EXECMOD for the execute-related checks.  Or you can use the
> /dev/sgx/enclave inode as the target for the execute checks and just
> reuse the file permissions there.

Now we've got 2 options - 1) New ENCLAVE__* flags on sigstruct file or 2) FILE__* on /dev/sgx/enclave. Which one do you think makes more sense?

ENCLAVE__EXECMEM seems to offer finer granularity (than PROCESS__EXECMEM) but I wonder if it'd have any real use in practice.

> > +int sgxsec_mprotect(struct vm_area_struct *vma, size_t prot) {
> > +	struct enclave_sec *esec;
> > +	int rc;
> > +
> > +	if (!vma->vm_file || !(esec = __esec(selinux_file(vma->vm_file))))
> {
> > +		/* Positive return value indicates non-enclave VMA */
> > +		return 1;
> > +	}
> > +
> > +	down_read(&esec->sem);
> > +	rc = enclave_mprotect(&esec->regions, vma->vm_start, vma->vm_end,
> > +prot);
> 
> Why is it safe for this to only use down_read()? enclave_mprotect() can
> call enclave_prot_set_cb() which modifies the list?

Probably because it was too late at night when I wrote this line:-( Good catch!

> 
> I haven't looked at this code closely, but it feels like a lot of SGX-
> specific logic embedded into SELinux that will have to be repeated or
> reused for every security module.  Does SGX not track this state itself?

I can tell you have looked quite closely, and I truly think you for your time!

You are right that there are SGX specific stuff. More precisely, SGX enclaves don't have access to anything except memory, so there are only 3 questions that need to be answered for each enclave page: 1) whether X is allowed; 2) whether W->X is allowed and 3 whether WX is allowed. This proposal tries to cache the answers to those questions upon creation of each enclave page, meaning it involves a) figuring out the answers and b) "remember" them for every page. #b is generic, mostly captured in intel_sgx.c, and could be shared among all LSM modules; while #a is SELinux specific. I could move intel_sgx.c up one level in the directory hierarchy if that's what you'd suggest.

By "SGX", did you mean the SGX subsystem being upstreamed? It doesn’t track that state. In practice, there's no way for SGX to track it because there's no vm_ops->may_mprotect() callback. It doesn't follow the philosophy of Linux either, as mprotect() doesn't track it for regular memory. And it doesn't have a use without LSM, so I believe it makes more sense to track it inside LSM.

^ permalink raw reply

* Re: [RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux
From: Sean Christopherson @ 2019-06-11 22:02 UTC (permalink / raw)
  To: Stephen Smalley
  Cc: Cedric Xing, linux-security-module, selinux, linux-kernel,
	linux-sgx, jarkko.sakkinen, luto, jmorris, serge, paul, eparis,
	jethro, dave.hansen, tglx, torvalds, akpm, nhorman, pmccallum,
	serge.ayoun, shay.katz-zamir, haitao.huang, andriy.shevchenko,
	kai.svahn, bp, josh, kai.huang, rientjes, william.c.roberts,
	philip.b.tricca
In-Reply-To: <b6f099cd-c0eb-d5cf-847d-27a15ac5ceaf@tycho.nsa.gov>

On Tue, Jun 11, 2019 at 09:40:25AM -0400, Stephen Smalley wrote:
> I haven't looked at this code closely, but it feels like a lot of
> SGX-specific logic embedded into SELinux that will have to be repeated or
> reused for every security module.  Does SGX not track this state itself?

SGX does track equivalent state.

There are three proposals on the table (I think):

  1. Require userspace to explicitly specificy (maximal) enclave page
     permissions at build time.  The enclave page permissions are provided
     to, and checked by, LSMs at enclave build time.

     Pros: Low-complexity kernel implementation, straightforward auditing
     Cons: Sullies the SGX UAPI to some extent, may increase complexity of
           SGX2 enclave loaders.

  2. Pre-check LSM permissions and dynamically track mappings to enclave
     pages, e.g. add an SGX mprotect() hook to restrict W->X and WX
     based on the pre-checked permissions.

     Pros: Does not impact SGX UAPI, medium kernel complexity
     Cons: Auditing is complex/weird, requires taking enclave-specific
           lock during mprotect() to query/update tracking.

  3. Implement LSM hooks in SGX to allow LSMs to track enclave regions
     from cradle to grave, but otherwise defer everything to LSMs.

     Pros: Does not impact SGX UAPI, maximum flexibility, precise auditing
     Cons: Most complex and "heaviest" kernel implementation of the three,
           pushes more SGX details into LSMs.

My RFC series[1] implements #1.  My understanding is that Andy (Lutomirski)
prefers #2.  Cedric's RFC series implements #3.

Perhaps the easiest way to make forward progress is to rule out the
options we absolutely *don't* want by focusing on the potentially blocking
issue with each option:

  #1 - SGX UAPI funkiness

  #2 - Auditing complexity, potential enclave lock contention

  #3 - Pushing SGX details into LSMs and complexity of kernel implementation


[1] https://lkml.kernel.org/r/20190606021145.12604-1-sean.j.christopherson@intel.com

^ permalink raw reply

* Re: [PATCH -next] security: Make capability_hooks static
From: James Morris @ 2019-06-11 21:44 UTC (permalink / raw)
  To: YueHaibing; +Cc: serge, linux-kernel, linux-security-module
In-Reply-To: <20190611134815.16612-1-yuehaibing@huawei.com>

On Tue, 11 Jun 2019, YueHaibing wrote:

> Fix sparse warning:
> 
> security/commoncap.c:1347:27: warning:
>  symbol 'capability_hooks' was not declared. Should it be static?
> 
> Reported-by: Hulk Robot <hulkci@huawei.com>
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>
> ---
>  security/commoncap.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Thanks!

Applied to
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security.git next-lsm

-- 
James Morris
<jmorris@namei.org>


^ permalink raw reply

* Re: [PATCH v3 05/33] docs: cgroup-v1: convert docs to ReST and rename to *.rst
From: Tejun Heo @ 2019-06-11 19:03 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Doc Mailing List, Mauro Carvalho Chehab, linux-kernel,
	Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H. Peter Anvin, x86, Jens Axboe, Li Zefan, Johannes Weiner,
	Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, James Morris, Serge E. Hallyn, linux-block,
	cgroups, netdev, bpf, linux-security-module
In-Reply-To: <79865a4248ce5b042106e5ec69bb493292a8d392.1560045490.git.mchehab+samsung@kernel.org>

On Sat, Jun 08, 2019 at 11:26:55PM -0300, Mauro Carvalho Chehab wrote:
> Convert the cgroup-v1 files to ReST format, in order to
> allow a later addition to the admin-guide.
> 
> The conversion is actually:
>   - add blank lines and identation in order to identify paragraphs;
>   - fix tables markups;
>   - add some lists markups;
>   - mark literal blocks;
>   - adjust title markups.
> 
> At its new index.rst, let's add a :orphan: while this is not linked to
> the main index.rst file, in order to avoid build warnings.
> 
> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>

Acked-by: Tejun Heo <tj@kernel.org>

Please feel free to route with the rest of the series.  If you want
the patch to be routed through the cgroup tree, please let me know.

Thanks.

-- 
tejun

^ permalink raw reply

* Re: [PATCH v7 0/3] add new ima hook ima_kexec_cmdline to measure kexec boot cmdline args
From: prakhar srivastava @ 2019-06-11 18:48 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: linux-integrity, linux-security-module, linux-kernel,
	Roberto Sassu, vgoyal
In-Reply-To: <1560267426.4464.173.camel@linux.ibm.com>

On Tue, Jun 11, 2019 at 8:37 AM Mimi Zohar <zohar@linux.ibm.com> wrote:
>
> Hi Prakhar,
>
> The patch/patch set title in the Subject line should not explain "how"
> you add a new feature.  In this case an appropriate patch set title
> would be, "Add support for measuring the boot command line".
>  Similarly, the first patch in this patch set could be named "Define a
> new IMA hook to measure the boot command line arguments".
>
> On Thu, 2019-06-06 at 17:23 -0700, Prakhar Srivastava wrote:
> > The motive behind the patch series is to measure the boot cmdline args
> > used for soft reboot/kexec case.
>
> When mentoring, I suggest starting out with a simple status statement
> (eg. "The kexec boot command line arguments are not currently being
> measured."), followed by the problem statement in the first paragraph.
>
> >
> > For secure boot attestation, it is necessary to measure the kernel
>
> Secure boot enforces local file data integrity.  The term here should
> be "trusted boot attestation".
>
> > command line and the kernel version.
>
> The original version of this patch set included the kernel version.
> This version is just measuring the boot command line arguments.
>
Sorry missed it while updating the cover letter.
<snip>

> > The ima logs need to be carried over to the next kernel, which will be followed
> > up by other patchsets for x86_64 and arm64.
> >
> > The kexec cmdline hash
>
> ^stored in the "d-ng" field of the template data
>
I will add another template-name for ima-buf
> > can be verified using
>
> > sudo cat /sys/kernel/security/integrity/ima/ascii_runtime_measurements |
> >   grep  kexec-cmdline | cut -d' ' -f 6 | xxd -r -p | sha256sum
>
> Until per policy template field rule support is added, a template name
> needs to be defined.  Please define "ima-buf" as:
> {.name = "ima-buf", .fmt = "d-ng|n-ng|buf"}
>
> I'm still seeing some scripts/checkpatch "WARNING: line over 80
> characters".  scripts/Lindent should provide the correct way of
> formatting these lines.
>
> Some people feel that references to Lindent should be removed, but I
> tend to agree with the Documentation/hwmon/submitting-patches.rst
> comment pertaining to scripts/Lindent.
>
> "* Running your patch or driver file(s) through checkpatch does not
> mean its formatting is clean. If unsure about formatting in your new
> driver, run it through Lindent. Lindent is not perfect, and you may
> have to do some minor cleanup, but it is a good start."
>
I will double check fix the issues.
> Examples of where the line formatting is off is the call to
> ima_get_action() in process_buffer_measurement() and the call to
> process_buffer_measurement() in ima_kexec_cmdline().
>
Thanks,
Prakhar Srivastava
> thanks,
>
> Mimi
<snip>

^ permalink raw reply

* Re: [RFC PATCH v2 3/5] x86/sgx: Enforce noexec filesystem restriction for enclaves
From: Stephen Smalley @ 2019-06-11 17:21 UTC (permalink / raw)
  To: Andy Lutomirski, Jarkko Sakkinen
  Cc: Sean Christopherson, Cedric Xing, James Morris, Serge E . Hallyn,
	LSM List, Paul Moore, Eric Paris, selinux, Jethro Beekman,
	Dave Hansen, Thomas Gleixner, Linus Torvalds, LKML, X86 ML,
	linux-sgx, Andrew Morton, nhorman, npmccallum, Serge Ayoun,
	Shay Katz-zamir, Haitao Huang, Andy Shevchenko, Kai Svahn,
	Borislav Petkov, Josh Triplett, Kai Huang, David Rientjes,
	William Roberts, Philip Tricca
In-Reply-To: <CALCETrVovr8XNZSroey7pHF46O=kj_c5D9K8h=z2T_cNrpvMig@mail.gmail.com>

On 6/10/19 12:44 PM, Andy Lutomirski wrote:
> On Mon, Jun 10, 2019 at 9:00 AM Jarkko Sakkinen
> <jarkko.sakkinen@linux.intel.com> wrote:
>>
>> On Wed, Jun 05, 2019 at 07:11:43PM -0700, Sean Christopherson wrote:
>>> +             goto out;
>>> +     }
>>> +
>>> +     /*
>>> +      * Query VM_MAYEXEC as an indirect path_noexec() check (see do_mmap()),
>>> +      * but with some future proofing against other cases that may deny
>>> +      * execute permissions.
>>> +      */
>>> +     if (!(vma->vm_flags & VM_MAYEXEC)) {
>>> +             ret = -EACCES;
>>> +             goto out;
>>> +     }
>>> +
>>> +     if (copy_from_user(dst, (void __user *)src, PAGE_SIZE))
>>> +             ret = -EFAULT;
>>> +     else
>>> +             ret = 0;
>>> +
>>> +out:
>>> +     up_read(&current->mm->mmap_sem);
>>> +
>>> +     return ret;
>>> +}
>>
>> I would suggest to express the above instead like this for clarity
>> and consistency:
>>
>>                  goto err_map_sem;
>>          }
>>
>>          /* Query VM_MAYEXEC as an indirect path_noexec() check
>>           * (see do_mmap()).
>>           */
>>          if (!(vma->vm_flags & VM_MAYEXEC)) {
>>                  ret = -EACCES;
>>                  goto err_mmap_sem;
>>          }
>>
>>          if (copy_from_user(dst, (void __user *)src, PAGE_SIZE)) {
>>                  ret = -EFAULT;
>>                  goto err_mmap_sem;
>>          }
>>
>>          return 0;
>>
>> err_mmap_sem:
>>          up_read(&current->mm->mmap_sem);
>>          return ret;
>> }
>>
>> The comment about future proofing is unnecessary.
>>
> 
> I'm also torn as to whether this patch is needed at all.  If we ever
> get O_MAYEXEC, then enclave loaders should use it to enforce noexec in
> userspace.  Otherwise I'm unconvinced it's that special.

What's a situation where we would want to allow this?  Why is it 
different than do_mmap()?




^ permalink raw reply

* Re: What do LSMs *actually* need for checks on notifications?
From: Casey Schaufler @ 2019-06-11 16:22 UTC (permalink / raw)
  To: David Howells, Stephen Smalley, Andy Lutomirski
  Cc: viro, linux-usb, linux-security-module, linux-fsdevel,
	linux-kernel, casey
In-Reply-To: <31009.1560262869@warthog.procyon.org.uk>

On 6/11/2019 7:21 AM, David Howells wrote:
> To see if we can try and make progress on this, can we try and come at this
> from another angle: what do LSMs *actually* need to do this?  And I grant that
> each LSM might require different things.
>
> -~-
>
> [A] There are a bunch of things available, some of which may be coincident,
> depending on the context:
>
>  (1) The creds of the process that created a watch_queue (ie. opened
>      /dev/watch_queue).

Smack needs this for the filesystem access required to open /dev/watch_queue.

>  (2) The creds of the process that set a watch (ie. called watch_sb,
>      KEYCTL_NOTIFY, ...);

Smack needs this to set a watch on any named object (file, key, ...).
Smack needs this as the object information for event delivery.

>  (3) The creds of the process that tripped the event (which might be the
>      system).

Smack needs this as the subject information for the event delivery.

>  (4) The security attributes of the object on which the watch was set (uid,
>      gid, mode, labels).

Smack needs this to set a watch on the named object (file, key, ...).
I am going to say that if you can't access an object you can't watch it.
I think that read access is sufficient provided that no one else can
see what watches I've created.

>  (5) The security attributes of the object on which the event was tripped.

Smack does not need these for the event mechanism as that object isn't
involved in the event delivery, except as may be required by (4).

>  (6) The security attributes of all the objects between the object in (5) and
>      the object in (4), assuming we work from (5) towards (4) if the two
>      aren't coincident (WATCH_INFO_RECURSIVE).

Smack needs these only as they would apply to (4).

> At the moment, when post_one_notification() wants to write a notification into
> a queue, it calls security_post_notification() to ask if it should be allowed
> to do so.  This is passed (1) and (3) above plus the notification record.

Is "current" (2)? Smack needs (2) for the event delivery access check.

> [B] There are a number of places I can usefully potentially add hooks:
>
>  (a) The point at which a watch queue is created (ie. /dev/watch_queue is
>      opened).

Smack would not need a new check as the filesystem checks should suffice.

>  (b) The point at which a watch is set (ie. watch_sb).

Smack would need a check to ensure the watcher has access
in cases where what is being watched is an object.

>  (c) The point at which a notification is generated (ie. an automount point is
>      tripped).

Smack does not require an explicit check here.

>  (d) The point at which a notification is delivered (ie. we write the message
>      into the queue).

Smack requires a check here. (2) as the object and (3) as the subject.

>  (e) All the points at which we walk over an object in a chain from (c) to
>      find the watch on which we can effect (d) (eg. we walk rootwards from a
>      mountpoint to find watches on a branch in the mount topology).

Smack does not require anything beyond existing checks.

> [C] Problems that need to be resolved:
>
>  (x) Do I need to put a security pointer in struct watch for the active LSM to
>      fill in?  If so, I presume this would need passing to
>      security_post_notification().

Smack does not need this.

>  (y) What checks should be done on object destruction after final put and what
>      contexts need to be supplied?

Classically there is no such thing as filesystem object deletion.
By making it possible to set a watch on that you've inadvertently
added a security relevant action to the security model. :o

>      This one is made all the harder because the creds that are in force when
>      close(), exit(), exec(), dup2(), etc. close a file descriptor might need
>      to be propagated to deferred-fput, which must in turn propagate them to
>      af_unix-cleanup, and thence back to deferred-fput and thence to implicit
>      unmount (dissolve_on_fput()[*]).
>
>      [*] Though it should be noted that if this happens, the subtree cannot be
>      	 attached to the root of a namespace.
>
>      Further, if several processes are sharing a file object, it's not
>      predictable as to which process the final notification will come from.

How about we don't add filesystem object deletion to the security model?

If what you really care about is removal of last filesystem reference
(last unlink) this shouldn't be any harder than any other watched change
(e.g. chmod) to the object.

If, on the other hand, you really want to watch for the last inode
reference and actual destruction of the thing I will suggest an
argument based on the model itself. If all of the directory entries
are unlinked the object no longer exists in the filesystem namespace.
If all of the fd's are closed (by whatever mechanism, we don't really
care) it no longer exists in any process space. At that point it has
no names, and is no longer a named object. No process (subject) deletes
it. They can't. They don't have access to it without a name. The
deletion is a system event, like setting the clock. There is no policy
that says when or even if the destruction occurs.

If and when the system gets around to cleaning up what has become
nothing more than system resources any outstanding watches can be
triggered using the system credential.

>  (z) Do intermediate objects, say in a mount topology notification, actually
>      need to be checked against the watcher's creds?  For a mount topology
>      notification, would this require calling inode_permission() for each
>      intervening directory?

Smack would not require this. Paths are an illusion.

>      Doing that might be impractical as it would probably have to be done
>      outside of of the RCU read lock and the filesystem ->permission() hooks
>      might want to sleep (to touch disk or talk to a server).
>
> David

^ permalink raw reply

* Re: What do LSMs *actually* need for checks on notifications?
From: Stephen Smalley @ 2019-06-11 15:57 UTC (permalink / raw)
  To: David Howells, Casey Schaufler, Andy Lutomirski
  Cc: viro, linux-usb, linux-security-module, linux-fsdevel,
	linux-kernel
In-Reply-To: <31009.1560262869@warthog.procyon.org.uk>

On 6/11/19 10:21 AM, David Howells wrote:
> To see if we can try and make progress on this, can we try and come at this
> from another angle: what do LSMs *actually* need to do this?  And I grant that
> each LSM might require different things.

I think part of the problem here is that the discussion is too abstract 
and not dealing with the specifics of the notifications in question. 
Those details matter.

> 
> -~-
> 
> [A] There are a bunch of things available, some of which may be coincident,
> depending on the context:
> 
>   (1) The creds of the process that created a watch_queue (ie. opened
>       /dev/watch_queue).

These will be used when checking permissions to open /dev/watch_queue.

>   (2) The creds of the process that set a watch (ie. called watch_sb,
>       KEYCTL_NOTIFY, ...);

These will be used when checking permissions to set a watch.

>   (3) The creds of the process that tripped the event (which might be the
>       system).

These will be used when checking permission to perform whatever 
operation tripped the event (if the event is triggered by a userspace 
operation).

>   (4) The security attributes of the object on which the watch was set (uid,
>       gid, mode, labels).

These will be used when checking permissions to set the watch.

>   (5) The security attributes of the object on which the event was tripped.

These will be used when checking permission to perform whatever 
operation tripped the event.

>   (6) The security attributes of all the objects between the object in (5) and
>       the object in (4), assuming we work from (5) towards (4) if the two
>       aren't coincident (WATCH_INFO_RECURSIVE).

Does this apply to anything other than mount notifications?  And for 
mount notifications, isn't the notification actually for a change to the 
mount namespace, not a change to any file?  Hence, the real "object" for 
events that trigger mount notifications is the mount namespace, right? 
The watched path is just a way of identifying a subtree of the mount 
namespace for notifications - it isn't the real object being watched.

> At the moment, when post_one_notification() wants to write a notification into
> a queue, it calls security_post_notification() to ask if it should be allowed
> to do so.  This is passed (1) and (3) above plus the notification record.

Not convinced we need this.

> [B] There are a number of places I can usefully potentially add hooks:
> 
>   (a) The point at which a watch queue is created (ie. /dev/watch_queue is
>       opened).

Already covered by existing hooks on opening files.

>   (b) The point at which a watch is set (ie. watch_sb).

Yes, this requires a hook and corresponding check.

>   (c) The point at which a notification is generated (ie. an automount point is
>       tripped).

Preferably covered by existing hooks on object accesses that would 
generate notifications.

>   (d) The point at which a notification is delivered (ie. we write the message
>       into the queue).

Preferably not needed.

>   (e) All the points at which we walk over an object in a chain from (c) to
>       find the watch on which we can effect (d) (eg. we walk rootwards from a
>       mountpoint to find watches on a branch in the mount topology).

Not necessary if the real object of mount notifications is the mount 
namespace and if we do not support recursive notifications on e.g. 
directories or some other object where the two can truly diverge.

> [C] Problems that need to be resolved:
> 
>   (x) Do I need to put a security pointer in struct watch for the active LSM to
>       fill in?  If so, I presume this would need passing to
>       security_post_notification().

I don't see why or where it would get used.

>   (y) What checks should be done on object destruction after final put and what
>       contexts need to be supplied?

IMHO, no.

> 
>       This one is made all the harder because the creds that are in force when
>       close(), exit(), exec(), dup2(), etc. close a file descriptor might need
>       to be propagated to deferred-fput, which must in turn propagate them to
>       af_unix-cleanup, and thence back to deferred-fput and thence to implicit
>       unmount (dissolve_on_fput()[*]).
> 
>       [*] Though it should be noted that if this happens, the subtree cannot be
>       	 attached to the root of a namespace.
> 
>       Further, if several processes are sharing a file object, it's not
>       predictable as to which process the final notification will come from.
> 
>   (z) Do intermediate objects, say in a mount topology notification, actually
>       need to be checked against the watcher's creds?  For a mount topology
>       notification, would this require calling inode_permission() for each
>       intervening directory?

I don't think so, because the real object is the mount namespace, not 
the individual directories.

> 
>       Doing that might be impractical as it would probably have to be done
>       outside of of the RCU read lock and the filesystem ->permission() hooks
>       might want to sleep (to touch disk or talk to a server).



^ permalink raw reply

* Re: [PATCH v7 0/3] add new ima hook ima_kexec_cmdline to measure kexec boot cmdline args
From: Mimi Zohar @ 2019-06-11 15:37 UTC (permalink / raw)
  To: Prakhar Srivastava, linux-integrity, linux-security-module,
	linux-kernel
  Cc: roberto.sassu, vgoyal
In-Reply-To: <20190607002330.2999-1-prsriva02@gmail.com>

Hi Prakhar,

The patch/patch set title in the Subject line should not explain "how"
you add a new feature.  In this case an appropriate patch set title
would be, "Add support for measuring the boot command line".
 Similarly, the first patch in this patch set could be named "Define a
new IMA hook to measure the boot command line arguments".

On Thu, 2019-06-06 at 17:23 -0700, Prakhar Srivastava wrote:
> The motive behind the patch series is to measure the boot cmdline args
> used for soft reboot/kexec case.

When mentoring, I suggest starting out with a simple status statement
(eg. "The kexec boot command line arguments are not currently being
measured."), followed by the problem statement in the first paragraph.

> 
> For secure boot attestation, it is necessary to measure the kernel

Secure boot enforces local file data integrity.  The term here should
be "trusted boot attestation".

> command line and the kernel version. 

The original version of this patch set included the kernel version.  
This version is just measuring the boot command line arguments.

> For cold boot, the boot loader
> can be enhanced to measure these parameters.
> (https://mjg59.dreamwidth.org/48897.html)
> However, for attestation across soft reboot boundary, these values 
> also need to be measured during kexec_file_load.
> 
> Currently for Kexec(kexec_file_load)/soft reboot scenario the boot cmdline
> args for the next kernel are not measured. For 
> normal case of boot/hardreboot the cmdline args are measured into the TPM.
> The hash of boot command line is calculated and added to the current 
> running kernel's measurement list.
> On a soft reboot like kexec, no cmdline arguments measurement takes place.
> 
> To achive the above the patch series does the following
>   -adds a new ima hook: ima_kexec_cmdline which measures the cmdline args
>    into the ima log, behind a new ima policy entry KEXEC_CMDLINE.
>   -since the cmldine args cannot be appraised, a new template field(buf) is
>    added. The template field contains the buffer passed(cmldine args), which
>    can be used to appraise/attest at a later stage.
>   -call the ima_kexec_cmdline(...) hook from kexec_file_load call.
> 
> The ima logs need to be carried over to the next kernel, which will be followed
> up by other patchsets for x86_64 and arm64.
> 
> The kexec cmdline hash

^stored in the "d-ng" field of the template data

> can be verified using

> sudo cat /sys/kernel/security/integrity/ima/ascii_runtime_measurements | 
>   grep  kexec-cmdline | cut -d' ' -f 6 | xxd -r -p | sha256sum

Until per policy template field rule support is added, a template name
needs to be defined.  Please define "ima-buf" as:
{.name = "ima-buf", .fmt = "d-ng|n-ng|buf"}

I'm still seeing some scripts/checkpatch "WARNING: line over 80
characters".  scripts/Lindent should provide the correct way of
formatting these lines.

Some people feel that references to Lindent should be removed, but I
tend to agree with the Documentation/hwmon/submitting-patches.rst
comment pertaining to scripts/Lindent.

"* Running your patch or driver file(s) through checkpatch does not
mean its formatting is clean. If unsure about formatting in your new
driver, run it through Lindent. Lindent is not perfect, and you may
have to do some minor cleanup, but it is a good start."

Examples of where the line formatting is off is the call to
ima_get_action() in process_buffer_measurement() and the call to
process_buffer_measurement() in ima_kexec_cmdline().

thanks,

Mimi

> 
> Changelog:
> V7:
>   - rebased to next-queued-testing
>   https://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity.git/log/?h=next-queued-testing
> 
> V6:
>   -add a new ima hook and policy to measure the cmdline
>     args(ima_kexec_cmdline)
>   -add a new template field buf to contain the buffer measured.
>   [suggested by Mimi Zohar]
>    add new fields to ima_event_data to store/read buffer data.
>   [suggested by Roberto]
>   -call ima_kexec_cmdline from kexec_file_load path
> 
> v5:
>   -add a new ima hook and policy to measure the cmdline
>     args(ima_kexec_cmdline)
>   -add a new template field buf to contain the buffer measured.
>     [suggested by Mimi Zohar]
>   -call ima_kexec_cmdline from kexec_file_load path
> 
> v4:
>   - per feedback from LSM community, removed the LSM hook and renamed the
>     IMA policy to KEXEC_CMDLINE
> 
> v3: (rebase changes to next-general)
>   - Add policy checks for buffer[suggested by Mimi Zohar]
>   - use the IMA_XATTR to add buffer
>   - Add kexec_cmdline used for kexec file load
>   - Add an LSM hook to allow usage by other LSM.[suggestd by Mimi Zohar]
> 
> v2:
>   - Add policy checks for buffer[suggested by Mimi Zohar]
>   - Add an LSM hook to allow usage by other LSM.[suggestd by Mimi Zohar]
>   - use the IMA_XATTR to add buffer instead of sig template
> 
> v1:
>   -Add kconfigs to control the ima_buffer_check
>   -measure the cmdline args suffixed with the kernel file name
>   -add the buffer to the template sig field.
> 
> Prakhar Srivastava (3):
>   Add a new ima hook ima_kexec_cmdline to measure cmdline args
>   add a new ima template field buf
>   call ima_kexec_cmdline to measure the cmdline args
> 
>  Documentation/ABI/testing/ima_policy      |  1 +
>  Documentation/security/IMA-templates.rst  |  2 +-
>  include/linux/ima.h                       |  2 +
>  kernel/kexec_file.c                       |  8 ++-
>  security/integrity/ima/ima.h              |  3 +
>  security/integrity/ima/ima_api.c          |  5 +-
>  security/integrity/ima/ima_init.c         |  2 +-
>  security/integrity/ima/ima_main.c         | 80 +++++++++++++++++++++++
>  security/integrity/ima/ima_policy.c       |  9 +++
>  security/integrity/ima/ima_template.c     |  2 +
>  security/integrity/ima/ima_template_lib.c | 20 ++++++
>  security/integrity/ima/ima_template_lib.h |  4 ++
>  12 files changed, 131 insertions(+), 7 deletions(-)
> 


^ permalink raw reply

* Re: [RFC][PATCH 00/13] Mount, FS, Block and Keyrings notifications [ver #4]
From: Stephen Smalley @ 2019-06-11 14:32 UTC (permalink / raw)
  To: Andy Lutomirski, Casey Schaufler
  Cc: Andy Lutomirski, David Howells, Al Viro, USB list, LSM List,
	Greg Kroah-Hartman, raven, Linux FS Devel, Linux API, linux-block,
	keyrings, LKML, Paul Moore
In-Reply-To: <97BA9EB5-4E62-4E3A-BD97-CEC34F16FCFF@amacapital.net>

On 6/10/19 8:13 PM, Andy Lutomirski wrote:
> 
> 
>> On Jun 10, 2019, at 2:25 PM, Casey Schaufler <casey@schaufler-ca.com> wrote:
>>
>>> On 6/10/2019 12:53 PM, Andy Lutomirski wrote:
>>> On Mon, Jun 10, 2019 at 12:34 PM Casey Schaufler <casey@schaufler-ca.com> wrote:
>>>>>>> I think you really need to give an example of a coherent policy that
>>>>>>> needs this.
>>>>>> I keep telling you, and you keep ignoring what I say.
>>>>>>
>>>>>>> As it stands, your analogy seems confusing.
>>>>>> It's pretty simple. I have given both the abstract
>>>>>> and examples.
>>>>> You gave the /dev/null example, which is inapplicable to this patchset.
>>>> That addressed an explicit objection, and pointed out
>>>> an exception to a generality you had asserted, which was
>>>> not true. It's also a red herring regarding the current
>>>> discussion.
>>> This argument is pointless.
>>>
>>> Please humor me and just give me an example.  If you think you have
>>> already done so, feel free to repeat yourself.  If you have no
>>> example, then please just say so.
>>
>> To repeat the /dev/null example:
>>
>> Process A and process B both open /dev/null.
>> A and B can write and read to their hearts content
>> to/from /dev/null without ever once communicating.
>> The mutual accessibility of /dev/null in no way implies that
>> A and B can communicate. If A can set a watch on /dev/null,
>> and B triggers an event, there still has to be an access
>> check on the delivery of the event because delivering an event
>> to A is not an action on /dev/null, but on A.
>>
> 
> At discussed, this is an irrelevant straw man. This patch series does not produce events when this happens. I’m looking for a relevant example, please.
>>
>>
>>>   An unprivileged
>>> user can create a new userns and a new mount ns, but then they're
>>> modifying a whole different mount tree.
>>
>> Within those namespaces you can still have multiple users,
>> constrained be system access control policy.
> 
> And the one doing the mounting will be constrained by MAC and DAC policy, as always.  The namespace creator is, from the perspective of those processes, admin.
> 
>>
>>>
>>>>>>> Similarly, if someone
>>>>>>> tries to receive a packet on a socket, we check whether they have the
>>>>>>> right to receive on that socket (from the endpoint in question) and,
>>>>>>> if the sender is local, whether the sender can send to that socket.
>>>>>>> We do not check whether the sender can send to the receiver.
>>>>>> Bzzzt! Smack sure does.
>>>>> This seems dubious. I’m still trying to get you to explain to a non-Smack person why this makes sense.
>>>> Process A sends a packet to process B.
>>>> If A has access to TopSecret data and B is not
>>>> allowed to see TopSecret data, the delivery should
>>>> be prevented. Is that nonsensical?
>>> It makes sense.  As I see it, the way that a sensible policy should do
>>> this is by making sure that there are no sockets, pipes, etc that
>>> Process A can write and that Process B can read.
>>
>> You can't explain UDP controls without doing the access check
>> on packet delivery. The sendmsg() succeeds when the packet leaves
>> the sender. There doesn't even have to be a socket bound to the
>> port. The only opportunity you have for control is on packet
>> delivery, which is the only point at which you can have the
>> information required.
> 
> Huh?  You sendmsg() from an address to an address.  My point is that, for most purposes, that’s all the information that’s needed.
> 
>>
>>> If you really want to prevent a malicious process with TopSecret data
>>> from sending it to a different process, then you can't use Linux on
>>> x86 or ARM.  Maybe that will be fixed some day, but you're going to
>>> need to use an extremely tight sandbox to make this work.
>>
>> I won't be commenting on that.
> 
> Then why is preventing this is an absolute requirement? It’s unattainable.
> 
>>
>>>
>>>>>>> The signal example is inapplicable.
>>>>>>  From a modeling viewpoint the actions are identical.
>>>>> This seems incorrect to me
>>>> What would be correct then? Some convoluted combination
>>>> of system entities that aren't owned or controlled by
>>>> any mechanism?
>>>>
>>> POSIX signal restrictions aren't there to prevent two processes from
>>> communicating.  They're there to prevent the sender from manipulating
>>> or crashing the receiver without appropriate privilege.
>>
>> POSIX signal restrictions have a long history. In the P10031e/2c
>> debates both communication and manipulation where seriously
>> considered. I would say both are true.
>>
>>>>> and, I think, to most everyone else reading this.
>>>> That's quite the assertion. You may even be correct.
>>>>
>>>>> Can you explain?
>>>>>
>>>>> In SELinux-ese, when you write to a file, the subject is the writer and the object is the file.  When you send a signal to a process, the object is the target process.
>>>> YES!!!!!!!!!!!!
>>>>
>>>> And when a process triggers a notification it is the subject
>>>> and the watching process is the object!
>>>>
>>>> Subject == active entity
>>>> Object  == passive entity
>>>>
>>>> Triggering an event is, like calling kill(), an action!
>>>>
>>> And here is where I disagree with your interpretation.  Triggering an
>>> event is a side effect of writing to the file.  There are *two*
>>> security relevant actions, not one, and they are:
>>>
>>> First, the write:
>>>
>>> Subject == the writer
>>> Action == write
>>> Object == the file
>>>
>>> Then the event, which could be modeled in a couple of ways:
>>>
>>> Subject == the file
>>
>> Files   are   not   subjects. They are passive entities.
>>
>>> Action == notify
>>> Object == the recipient
> 
> Great. Then use the variant below.
> 
>>>
>>> or
>>>
>>> Subject == the recipient
>>> Action == watch
>>> Object == the file
>>>
>>> By conflating these two actions into one, you've made the modeling
>>> very hard, and you start running into all these nasty questions like
>>> "who actually closed this open file"
>>
>> No, I've made the code more difficult.
>> You can not call
>> the file a subject. That is just wrong. It's not a valid
>> model.
> 
> You’ve ignored the “Action == watch” variant. Do you care to comment?

While I agree with this model in general, I will note two caveats when 
trying to apply this to watches/notifications:

1) The object on which the notification was triggered and the object on 
which the watch was placed are not necessarily the same and access to 
one might not imply access to the other,

2) If notifications can be triggered by read-like operations (as in 
fanotify, for example), then a "read" can be turned into a "write" flow 
through a notification.

Whether or not these caveats are applicable to the notifications in this 
series I am not clear.

^ permalink raw reply

* What do LSMs *actually* need for checks on notifications?
From: David Howells @ 2019-06-11 14:21 UTC (permalink / raw)
  To: Casey Schaufler, Stephen Smalley, Andy Lutomirski
  Cc: dhowells, viro, linux-usb, linux-security-module, linux-fsdevel,
	linux-kernel
In-Reply-To: <155991702981.15579.6007568669839441045.stgit@warthog.procyon.org.uk>

To see if we can try and make progress on this, can we try and come at this
from another angle: what do LSMs *actually* need to do this?  And I grant that
each LSM might require different things.

-~-

[A] There are a bunch of things available, some of which may be coincident,
depending on the context:

 (1) The creds of the process that created a watch_queue (ie. opened
     /dev/watch_queue).

 (2) The creds of the process that set a watch (ie. called watch_sb,
     KEYCTL_NOTIFY, ...);

 (3) The creds of the process that tripped the event (which might be the
     system).

 (4) The security attributes of the object on which the watch was set (uid,
     gid, mode, labels).

 (5) The security attributes of the object on which the event was tripped.

 (6) The security attributes of all the objects between the object in (5) and
     the object in (4), assuming we work from (5) towards (4) if the two
     aren't coincident (WATCH_INFO_RECURSIVE).

At the moment, when post_one_notification() wants to write a notification into
a queue, it calls security_post_notification() to ask if it should be allowed
to do so.  This is passed (1) and (3) above plus the notification record.


[B] There are a number of places I can usefully potentially add hooks:

 (a) The point at which a watch queue is created (ie. /dev/watch_queue is
     opened).

 (b) The point at which a watch is set (ie. watch_sb).

 (c) The point at which a notification is generated (ie. an automount point is
     tripped).

 (d) The point at which a notification is delivered (ie. we write the message
     into the queue).

 (e) All the points at which we walk over an object in a chain from (c) to
     find the watch on which we can effect (d) (eg. we walk rootwards from a
     mountpoint to find watches on a branch in the mount topology).


[C] Problems that need to be resolved:

 (x) Do I need to put a security pointer in struct watch for the active LSM to
     fill in?  If so, I presume this would need passing to
     security_post_notification().

 (y) What checks should be done on object destruction after final put and what
     contexts need to be supplied?

     This one is made all the harder because the creds that are in force when
     close(), exit(), exec(), dup2(), etc. close a file descriptor might need
     to be propagated to deferred-fput, which must in turn propagate them to
     af_unix-cleanup, and thence back to deferred-fput and thence to implicit
     unmount (dissolve_on_fput()[*]).

     [*] Though it should be noted that if this happens, the subtree cannot be
     	 attached to the root of a namespace.

     Further, if several processes are sharing a file object, it's not
     predictable as to which process the final notification will come from.

 (z) Do intermediate objects, say in a mount topology notification, actually
     need to be checked against the watcher's creds?  For a mount topology
     notification, would this require calling inode_permission() for each
     intervening directory?

     Doing that might be impractical as it would probably have to be done
     outside of of the RCU read lock and the filesystem ->permission() hooks
     might want to sleep (to touch disk or talk to a server).

David

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox