linux-mm.kvack.org archive mirror
From: Kees Cook <keescook@chromium.org>
To: Topi Miettinen <toiwoton@gmail.com>
Cc: "Catalin Marinas" <catalin.marinas@arm.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Christoph Hellwig" <hch@infradead.org>,
	"Lennart Poettering" <lennart@poettering.net>,
	"Zbigniew Jędrzejewski-Szmek" <zbyszek@in.waw.pl>,
	"Will Deacon" <will@kernel.org>,
	"Alexander Viro" <viro@zeniv.linux.org.uk>,
	"Eric Biederman" <ebiederm@xmission.com>,
	"Szabolcs Nagy" <szabolcs.nagy@arm.com>,
	"Mark Brown" <broonie@kernel.org>,
	"Jeremy Linton" <jeremy.linton@arm.com>,
	linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org,
	linux-abi-devel@lists.sourceforge.net,
	linux-hardening@vger.kernel.org, "Jann Horn" <jannh@google.com>,
	"Salvatore Mesoraca" <s.mesoraca16@gmail.com>,
	"Igor Zhbanov" <izh1979@gmail.com>
Subject: Re: [PATCH RFC 0/4] mm, arm64: In-kernel support for memory-deny-write-execute (MDWE)
Date: Wed, 20 Apr 2022 16:21:45 -0700
Message-ID: <202204201610.093C9D5FE8@keescook>
In-Reply-To: <c62170c6-5993-2417-4143-5a37a98b227c@gmail.com>

On Wed, Apr 20, 2022 at 10:34:33PM +0300, Topi Miettinen wrote:
> On 20.4.2022 16.01, Catalin Marinas wrote:
> > On Thu, Apr 14, 2022 at 11:52:17AM -0700, Kees Cook wrote:
> > > On Wed, Apr 13, 2022 at 02:49:42PM +0100, Catalin Marinas wrote:
> > > > The background to this is that systemd has a configuration option called
> > > > MemoryDenyWriteExecute [1], implemented as a SECCOMP BPF filter. Its aim
> > > > is to prevent a user task from inadvertently creating an executable
> > > > mapping that is (or was) writeable. Since such a BPF filter is stateless,
> > > > it cannot detect mappings that were previously writeable but
> > > > subsequently changed to read-only. Therefore the filter simply rejects
> > > > any mprotect(PROT_EXEC). The side-effect is that on arm64 with BTI
> > > > support (Branch Target Identification), the dynamic loader cannot change
> > > > an ELF section from PROT_EXEC to PROT_EXEC|PROT_BTI using mprotect().
> > > > For libraries, it can resort to unmapping and re-mapping but for the
> > > > main executable it does not have a file descriptor. The original bug
> > > > report in the Red Hat bugzilla - [2] - and subsequent glibc workaround
> > > > for libraries - [3].
> > > 
> > > Right, so, the systemd filter is a big hammer solution for the kernel
> > > not having a very easy way to provide W^X mapping protections to
> > > userspace. There's stuff in SELinux, and there have been several
> > > attempts[1] at other LSMs to do it too, but nothing stuck.
> > > 
> > > Given the filter, and the implementation of how to enable BTI, I see two
> > > solutions:
> > > 
> > > - provide a way to do W^X so systemd can implement the feature differently
> > > - provide a way to turn on BTI separate from mprotect to bypass the filter
> > > 
> > > I would agree, the latter seems like the greater hack,
> > 
> > We discussed such hacks in the past but they are just working around the
> > fundamental issue - systemd wants W^X but with BPF it can only achieve
> > it by preventing mprotect(PROT_EXEC) irrespective of whether the mapping
> > was already executable. If we find a better solution for W^X, we
> > wouldn't have to hack anything for mprotect(PROT_EXEC|PROT_BTI).
> > 
> > > so I welcome
> > > this RFC, though I think it might need to explore a bit of the feature
> > > space exposed by other solutions[1] (i.e. see SARA and NAX), otherwise
> > > it risks being too narrowly implemented. For example, playing well with
> > > JITs should be part of the design, and will likely need some kind of
> > > ELF flags and/or "sealing" mode, and to handle the vma alias case as
> > > Jann Horn pointed out[2].
> > 
> > I agree we should look at what we want to cover, though trying to avoid
> > re-inventing SELinux. With this patchset I went for the minimum that
> > systemd MDWE does with BPF.
> > 
> > I think JITs get around it using something like memfd with two separate
> > mappings to the same page. We could try to prevent such aliases but
> > allow it if an ELF note is detected (or get the JIT to issue a prctl()).
> > 
> > Anyway, with a prctl() we can allow finer-grained control starting with
> > anonymous and file mappings and later extending to vma aliases,
> > writeable files etc. On top we can add a seal mask so that a process
> > cannot disable a control that was set. Something like (I'm not good at
> > names):
> > 
> > 	prctl(PR_MDWX_SET, flags, seal_mask);
> > 	prctl(PR_MDWX_GET);
> > 
> > with flags like:
> > 
> > 	PR_MDWX_MMAP - basics, should cover mmap() and mprotect()
> > 	PR_MDWX_ALIAS - vma aliases, allowed with an ELF note
> > 	PR_MDWX_WRITEABLE_FILE
> > 
> > (needs some more thinking)
> > 
> 
> For systemd, feature compatibility with the BPF version is important so that
> we could automatically switch to the kernel version once available without
> regressions. So I think PR_MDWX_MMAP (or maybe PR_MDWX_COMPAT) should match
> exactly what MemoryDenyWriteExecute=yes as implemented with BPF has: only
> forbid mmap(PROT_EXEC|PROT_WRITE) and mprotect(PROT_EXEC). Like BPF, once
> installed there should be no way to escape and ELF flags should be also
> ignored. ARM BTI should be allowed though (allow PROT_EXEC|PROT_BTI if the
> old flags had PROT_EXEC).
> 
> Then we could have improved versions (other PR_MDWX_ prctls) with lots more
> checks. This could be enabled with MemoryDenyWriteExecute=strict or so.
> 
> Perhaps also more relaxed versions (like SARA) could be interesting (system
> service running Python with FFI, or perhaps JVM etc), enabled with for
> example MemoryDenyWriteExecute=trampolines. That way even those programs
> would get some protection (though there would be a gap in the defences).

Yup, I think we're all on the same page. Catalin, can you respin with a
prctl for enabling MDWE? I propose just:

	prctl(PR_MDWX_SET, flags);
	prctl(PR_MDWX_GET);

	PR_MDWX_FLAG_MMAP
		disallows PROT_EXEC on any VMA that is or was PROT_WRITE,
		covering at least: mmap, mprotect, pkey_mprotect, and shmat.

I don't think anything should be allowed to be disabled once set.
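
To make the difference between the two rules concrete, here is a toy
userspace model. It is only a sketch, not kernel code and not taken from the
patch series: struct toy_vma, toy_mmap(), toy_mprotect() and
bpf_style_allows_mprotect() are made-up names, and TOY_PROT_BTI is a stand-in
for the arm64 PROT_BTI bit. It assumes the stateful rule amounts to a sticky
"was writable" bit per mapping, and that an mmap(PROT_WRITE|PROT_EXEC)
request is refused as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>	/* PROT_READ, PROT_WRITE, PROT_EXEC */

/* Stand-in for arm64's PROT_BTI so the sketch builds on any architecture. */
#define TOY_PROT_BTI 0x10

/* Toy mapping: current protection plus a sticky "was ever writable" bit. */
struct toy_vma {
	int prot;
	bool was_writable;
};

/*
 * Stateless rule, as the thread describes the seccomp BPF filter:
 * with no history available, reject every mprotect() that asks for PROT_EXEC.
 */
static bool bpf_style_allows_mprotect(int new_prot)
{
	return !(new_prot & PROT_EXEC);
}

/* Stateful rule: no PROT_EXEC on a mapping that is or was PROT_WRITE. */
static bool mdwe_allows(const struct toy_vma *vma, int new_prot)
{
	bool writable = vma->was_writable || (new_prot & PROT_WRITE);

	return !((new_prot & PROT_EXEC) && writable);
}

/* Model of mmap(): a fresh mapping has no write history yet. */
static bool toy_mmap(struct toy_vma *vma, int prot)
{
	struct toy_vma fresh = { 0, false };

	assert(vma != NULL);
	if (!mdwe_allows(&fresh, prot))
		return false;
	vma->prot = prot;
	vma->was_writable = (prot & PROT_WRITE) != 0;
	return true;
}

/* Model of mprotect(): the write history survives dropping PROT_WRITE. */
static bool toy_mprotect(struct toy_vma *vma, int new_prot)
{
	assert(vma != NULL);
	if (!mdwe_allows(vma, new_prot))
		return false;
	vma->prot = new_prot;
	if (new_prot & PROT_WRITE)
		vma->was_writable = true;
	return true;
}
```

In this model the BTI case passes: a mapping created PROT_READ|PROT_EXEC can
be changed to PROT_READ|PROT_EXEC|TOY_PROT_BTI, which the stateless rule
would refuse. A mapping created PROT_READ|PROT_WRITE can never gain
PROT_EXEC, even after being made read-only first.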

-- 
Kees Cook

