From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=mF0B=RA=linuxfoundation.org=gregkh@kernel.org>
Received: from mail.linutronix.de (146.0.238.70:993) by
  crypto-ml.lab.linutronix.de with IMAP4-SSL for <speck@linutronix.de>; 25 Feb
  2019 21:01:08 -0000
Received: from mail.kernel.org ([198.145.29.99])
	by Galois.linutronix.de with esmtps (TLS1.2:DHE_RSA_AES_256_CBC_SHA256:256)
	(Exim 4.80)
	(envelope-from <SRS0=mF0B=RA=linuxfoundation.org=gregkh@kernel.org>)
	id 1gyNN4-0007ZV-Ce
	for speck@linutronix.de; Mon, 25 Feb 2019 22:01:07 +0100
Date: Mon, 25 Feb 2019 22:00:57 +0100
From: Greg KH <gregkh@linuxfoundation.org>
Subject: [MODERATED] Re: [PATCH v6 39/43] MDSv6
Message-ID: <20190225210057.GA12242@kroah.com>
References: <cover.1551019522.git.ak@linux.intel.com>
 <4e5e24fd0c2111686f32a55581efa5070cf0a160.1551019522.git.ak@linux.intel.com>
 <20190225152654.GB19947@kroah.com>
 <20190225162825.GR16922@tassilo.jf.intel.com>
 <20190225164729.GA10883@kroah.com>
 <20190225170503.GU16922@tassilo.jf.intel.com>
 <20190225174936.GA3230@kroah.com>
 <20190225181023.GW16922@tassilo.jf.intel.com>
 <20190225201154.GA11688@kroah.com>
MIME-Version: 1.0
In-Reply-To: <20190225201154.GA11688@kroah.com>
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
To: speck@linutronix.de
List-ID: <speck.linutronix.de>

On Mon, Feb 25, 2019 at 09:11:54PM +0100, speck for Greg KH wrote:
> On Mon, Feb 25, 2019 at 10:10:23AM -0800, speck for Andi Kleen wrote:
> > > You obviously do not care about all bus data, only some.  So which
> > > "some"?  As we have class drivers, why not focus on the type of data you
> > 
> > For most drivers anything that would end up being copy_*_user()ed.
> > 
> > (modulo some corner cases like a network address. While I would
> > like to protect those too, I don't see a practical way to do so.
> > And they're already visible on the network)
> > 
> > > feel is an issue?  That way you have a chance to keep on top of this
> > > thing.
> > 
> > Ok so the problem is the terminology? 
> > 
> > We should call it sensitive data?
> > 
> > I'll not be able to list all the cases explicitely in the document
> > (this would require understanding all drivers and all subsystems
> > which I certainly don't), there has to be some discretion.
> > 
> > > If you only focus on a bus, then you will miss those types of things
> > > that do not use a bus.  If you focus on the type of data, then you have
> > > a chance to do this right.
> > 
> > I wasn't really looking for busses, or anything like that, mainly just going
> > through the code and looking for copies or direct data access.
> 
> To make this clearer to me, is this a better definition of the problem:
> 
>   - Whenever the kernel touches data that userspace will eventually
>     see, but does not want any other userspace process to see, the CPU
>     buffers need to be flushed before returning from the kernel to
>     userspace.
> 
> Is that correct?  If not, how would you rephrase this in a single
> sentence?

Ok, I just read the "Deep Dive: Intel Analysis of MSBDS" pdf that Intel
gave me last weekend, and I think the above sentence is correct.  But it
would be great for someone to verify it.

> If this statement is true, do you care about multiple copies of that
> data, or only the "last" one as you just need to flush things when
> returning to userspace?

And to answer myself, I think that all is needed is to flush when
returning from userspace, which lazy_clear_cpu_interrupt() _should_ be
marking to have happen.

So my original point remains for the usbmon patch, it's pointless :)

> I ask as your patchset, for the USB data flow cases, calls
> lazy_clear_cpu_interrupt() twice on the same irq, each time after the
> buffer was copied off, which is very odd to me.
> 
> So do we care about the first copy, every copy, the last copy, or do we
> just need to call lazy_clear_cpu_interrupt() at least once per data
> flow?
> 
> As an example, here's the data coming in from a USB keyboard through to
> userspace:
> 	- USB irq
> 	- USB data potentially copied from USB controller buffer ->
> 	  usbmon buffer
> 	- USB data copied from USB controller buffer -> HID driver buffer
> 	- HID driver parsing and copying -> input data buffer
> 	- input data buffer copied -> input event buffer(s)
> 	- irq return
> 
> So we have USB layer copying once, maybe twice, HID layer copying, and
> input layer copying (at least once, maybe multiple times as you can
> multiplex input streams together).  Where do we care about calling
> lazy_clear_cpu_interrupt() on this call chain?  Everywhere, or just at
> least once?
> 
> If only once, why not just care about the input layer, as that is where
> things get sent to userspace and that is the _type_ of data you are
> concerned about.
> 
> [Note, I'm not sure about the input layer, it's just an example, things
> could be better than it used to be, but you get the idea here.]
> 
> Same goes for USB storage data.  Sometimes that uses dma bounce buffers,
> do we care about those?  If not, do we care later if the SCSI layer
> strips things off and reconstructs data paths and feeds that to the
> block layer?  If so, shouldn't we only care about the block layer and
> not scsi or USB?
> 
> I'm asking you to think of the data either at the physical layer (USB,
> PCI, DMA, MIPI, I2C, etc.) or at the logical layer (tty, bluetooth,
> mouse, sound input, block, etc.) or both.  Right now it's not obvious
> from your patchset what the correct thing is at all.

After reading the paper, I think you need to focus on the logical layer,
when possible, and for busses that copy data in an irq, care about them
_IFF_ they are carrying data for a logical type you care about.

greg k-h