From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: from mail.linutronix.de (146.0.238.70:993) by
 crypto-ml.lab.linutronix.de with IMAP4-SSL for ; 25 Feb 2019 16:30:24 -0000
Received: from mail.kernel.org ([198.145.29.99]) by Galois.linutronix.de
 with esmtps (TLS1.2:DHE_RSA_AES_256_CBC_SHA256:256) (Exim 4.80)
 (envelope-from ) id 1gyJ94-00025T-Pj for speck@linutronix.de;
 Mon, 25 Feb 2019 17:30:23 +0100
Date: Mon, 25 Feb 2019 17:30:13 +0100
From: Greg KH 
Subject: [MODERATED] Re: [PATCH v6 10/43] MDSv6
Message-ID: <20190225163013.GB22318@kroah.com>
References: 
MIME-Version: 1.0
In-Reply-To: 
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
To: speck@linutronix.de
List-ID: 

I'm sorry, I just can't stop...

On Sun, Feb 24, 2019 at 07:07:16AM -0800, speck for Andi Kleen wrote:
> +Guidance for driver/subsystem developers
> +----------------------------------------
> +
> +[These generally need to be enforced in code review for new code now]
> +
> +When you touch user supplied data of *other* processes in system call
> +context add lazy_clear_cpu().

What do you mean by "user supplied data of *other* processes"?  I think
I understand the kernel, and I have no idea what this means.

> +For the cases below we care only about data from other processes.
> +Touching non cryptographic data from the current process is always allowed.

Define "non cryptographic data".  Is data coming across a serial port
crypto data?  From a camera?

> +
> +Touching only pointers to user data is always allowed.

But not touching the data the pointer points to?

> +
> +When your interrupt does touch user data directly mark it with IRQF_USER_DATA.

I still don't know what "user data" means.  Is a serial stream coming
from a bluetooth device "user data"?  Is a program that talks directly
to a USB device without a special kernel driver reading "user data"?
Is a serial port data stream "user data"?
> +
> +When your tasklet does touch user data directly, mark it TASKLET_USER_DATA
> +using tasklet_init_flags/or DECLARE_TASKLET_USERDATA*.

Same as above.

> +
> +When your timer does touch user data mark it with TIMER_USER_DATA
> +If it is a hrtimer and touches user data, mark it with HRTIMER_MODE_USER_DATA.

Mark what, the timer function?

> +When your irq poll handler does touch user data, mark it lazy_clear_cpu().

Mark what?  That's a function call?

> +For networking code, make sure to only touch user data through
> +skb_push/put/copy [add more], unless it is data from the current
> +process. If that is not ensured add lazy_clear_cpu or
> +lazy_clear_cpu_interrupt.

How do you know if data coming across the network is for the "current"
process?  What does "current process" even mean here?

How about UIO drivers, how does their data get classified?  Lots of
networking stacks use UIO now...  virtual io channels?

> +Any cryptographic code touching key data should use memzero_explicit
> +or kzfree to free the data.

We do that today, right?  If not, that needs to be done regardless.

And what about password data?  I _think_ we got most of that now out of
the tty layer, but we could be wrong :)

> +
> +If your RCU callback touches user data add lazy_clear_cpu().

Ugh, really?

> +
> +These steps are currently only needed for code that runs on MDS affected
> +CPUs, which is currently only x86. But might be worth being prepared
> +if other architectures become affected too.

As someone who probably reviews more new drivers than anyone else right
now, my first recommendation is going to be, "Buy a non-Intel processor,
we have no idea what they expect from driver writers now, just give up
trying to appease them."

Because if I, the person that is responsible for reviewing those
drivers, has no idea what to do here, how can some random driver author
be expected to know?

Seriously, this is crazy.
And if you all are going to expect me to start auditing all new drivers
based on these new rules, I need a _big_ raise.

I somehow doubt you are asking all other OS vendors to do all of this
crud.

> +Implementation details/assumptions
> +----------------------------------
> +
> +Any buffer clearing is done lazily on next kernel exit. lazy_clear*
> +is only a few fast instructions with no cache misses setting
> +a flag and can be used frequently even in fast paths.

I can not parse this paragraph at all; what are you trying to say?

> +Protecting process data
> +-----------------------
> +
> +If a system call touches data of its own process, CPU state does not
> +need to be cleared, because it has already access to it.

How do you know if it is its own process or not?  What about something
like IPC data?

> +On context switching we clear data, unless the context switch is
> +inside a process. We also clear after any context switches from kernel
> +threads.
> +
> +Cryptographic keys inside the kernel should be protected.

"Protected"?  How?  Huh?  ugh.

> +Sandboxes
> +---------
> +
> +We don't do anything special for seccomp processes
> +
> +If there is a sandbox inside the process the process should take care
> +itself of clearing its own sensitive data before running sandbox
> +code. This would include data touched by system calls.

i.e. "Userspace code is hosed, sorry."?

> +BPF
> +---
> +
> +Assume BPF execution does not touch other user's data, so does
> +not need to schedule a clear for itself.

Can you assume that?

> +BPF could attack the rest of the kernel if it can successfully
> +measure side channel side effects.

Can it do such a measurement?
> +When the BPF program was loaded unprivileged, always clear the CPU
> +to prevent any exploits written in BPF using side channels to read
> +data leaked from other kernel code
> +
> +We only do this when running in an interrupt, or if an clear cpu is
> +already scheduled (which means for example there was a context
> +switch, or crypto operation before)
> +
> +In process context we assume the code only accesses data of the
> +current user and check that the BPF running was loaded by the
> +same user so even if data leaked it would not cross privilege
> +boundaries.
> +
> +Technically we would only need to do this if the BPF program
> +contains conditional branches and loads dominated by them, but
> +let's assume that nearly all do.
> +
> +This could be further optimized by batching clears for
> +many similar EBPF executions in a row (e.g. for packet
> +processing). This would need ensuring that no sensitive
> +data is touched inbetween the EBPF executions, and also
> +that all EBPF scripts are set up by the same uid.
> +We could add such optimizations later based on
> +profile data.

Please contact the BPF people before writing any of the above.  The
fact that you all have not done so is scandalous.

greg k-h