From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail.linutronix.de (146.0.238.70:993) by
	crypto-ml.lab.linutronix.de with IMAP4-SSL for ;
	25 Feb 2019 21:19:21 -0000
Received: from mga02.intel.com ([134.134.136.20]) by Galois.linutronix.de
	with esmtps (TLS1.2:DHE_RSA_AES_256_CBC_SHA256:256) (Exim 4.80)
	(envelope-from ) id 1gyNeh-0000Ah-1N for speck@linutronix.de;
	Mon, 25 Feb 2019 22:19:19 +0100
Date: Mon, 25 Feb 2019 13:19:15 -0800
From: Andi Kleen
Subject: [MODERATED] Re: [PATCH v6 39/43] MDSv6
Message-ID: <20190225211915.GX16922@tassilo.jf.intel.com>
References: <4e5e24fd0c2111686f32a55581efa5070cf0a160.1551019522.git.ak@linux.intel.com>
	<20190225152654.GB19947@kroah.com>
	<20190225162825.GR16922@tassilo.jf.intel.com>
	<20190225164729.GA10883@kroah.com>
	<20190225170503.GU16922@tassilo.jf.intel.com>
	<20190225174936.GA3230@kroah.com>
	<20190225181023.GW16922@tassilo.jf.intel.com>
	<20190225201154.GA11688@kroah.com>
MIME-Version: 1.0
In-Reply-To: <20190225201154.GA11688@kroah.com>
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
To: speck@linutronix.de
List-ID:

> > I wasn't really looking for busses, or anything like that, mainly just
> > going through the code and looking for copies or direct data access.
>
> To make this clearer to me, is this a better definition of the problem:
>
>  - Whenever the kernel touches data that userspace will eventually
>    see, but does not want any other userspace process to see, the CPU
>    buffers need to be flushed before returning from the kernel to
>    userspace.
>
> Is that correct? If not, how would you rephrase this in a single
> sentence?

Here's my version:

In an asynchronous context (such as interrupts or softirqs), whenever
the kernel touches data with the CPU that originates from, or will
finally end up in, a process's address space that may not be the
current process, the CPU buffers need to be cleared before returning
from the kernel to user space.
This excludes some metadata, such as network addresses, which are
already visible on the network and which are difficult to protect
efficiently.

[btw, please avoid the term "flushing"; it's a clear, not a flush]

> If this statement is true, do you care about multiple copies of that
> data, or only the "last" one as you just need to flush things when
> returning to userspace?

In principle all touching needs to be protected. But the flag to clear
the CPU lazily only needs to be set once per asynchronous event. So if
you can guarantee that someone else already set it in the same
interrupt, it doesn't need to be set again.

> I ask as your patchset, for the USB data flow cases, calls
> lazy_clear_cpu_interrupt() twice on the same irq, each time after the
> buffer was copied off, which is very odd to me.

lazy_clear_cpu_interrupt() is very cheap (as in a few cycles): it's
just a few instructions that set a flag. So I didn't make any attempt
to minimize the calls; it can just be used wherever convenient. The
actual clear action happens only on the next kernel-to-user
transition.

> So do we care about the first copy, every copy, the last copy, or do we
> just need to call lazy_clear_cpu_interrupt() at least once per data
> flow?

At least once.

> As an example, here's the data coming in from a USB keyboard through to
> userspace:
>  - USB irq
>  - USB data potentially copied from USB controller buffer ->
>    usbmon buffer
>  - USB data copied from USB controller buffer -> HID driver buffer
>  - HID driver parsing and copying -> input data buffer
>  - input data buffer copied -> input event buffer(s)

This particular case is covered by the lazy_clear_cpu()s in the input
layer. But if there are other HID drivers that do the "USB data
copied" step in interrupt context, they might need lazy_clear_cpu()s
too. Did I miss some here?
> So we have USB layer copying once, maybe twice, HID layer copying, and
> input layer copying (at least once, maybe multiple times as you can
> multiplex input streams together). Where do we care about calling
> lazy_clear_cpu_interrupt() on this call chain? Everywhere, or just at
> least once?

At least once.

> If only once, why not just care about the input layer, as that is where
> things get sent to userspace and that is the _type_ of data you are
> concerned about.

Yes, that's what I did. It's in the patchkit. The tty layer does it
too. Generally I only patched the drivers themselves when there wasn't
a convenient middle layer or utility function that they called.

> [Note, I'm not sure about the input layer, it's just an example, things
> could be better than it used to be, but you get the idea here.]
>
> Same goes for USB storage data. Sometimes that uses dma bounce buffers,
> do we care about those? If not, do we care later if the SCSI layer

You mean swiotlb? That is covered in the patch kit.

> strips things off and reconstructs data paths and feeds that to the
> block layer?

Does usb-storage copy the data somewhere in the path? If yes, I might
have missed something. Can you point to the code?

> If so, shouldn't we only care about the block layer and
> not scsi or USB?

I audited the block layer, and all the users seemed to use
kmap_atomic, so I put the clear hook in there. For SCSI we only marked
those drivers that did PIO etc.

> I'm asking you to think of the data either at the physical layer (USB,
> PCI, DMA, MIPI, I2C, etc.) or at the logical layer (tty, bluetooth,
> mouse, sound input, block, etc.) or both. Right now it's not obvious
> from your patchset what the correct thing is at all.

It was done case by case:

input drivers   --> done in the input layer
scsi drivers    --> only needed to mark explicit PIO
network drivers --> most rely on skb annotation (just marked a few
                    outliers)
usb             --> maybe we missed something here; the only obvious
                    copy seemed to be usbmon

-Andi