From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <ak@linux.intel.com>
Received: from mga04.intel.com ([192.55.52.120])
	by Galois.linutronix.de with esmtps (TLS1.2:DHE_RSA_AES_256_CBC_SHA256:256)
	(Exim 4.80)
	(envelope-from <ak@linux.intel.com>)
	id 1fLs85-0001yw-Up
	for speck@linutronix.de; Thu, 24 May 2018 17:26:14 +0200
Date: Thu, 24 May 2018 08:26:09 -0700
From: Andi Kleen <ak@linux.intel.com>
Subject: [MODERATED] Re: [PATCH v5 1/8] L1TFv4 6
Message-ID: <20180524152609.GO4486@tassilo.jf.intel.com>
References: <cover.1527111786.git.ak@linux.intel.com>
 <20180523215737.7C50E61169@crypto-ml.lab.linutronix.de>
 <70cc1acf-95a1-6de4-7bc8-46dfd54c686a@linux.intel.com>
MIME-Version: 1.0
In-Reply-To: <70cc1acf-95a1-6de4-7bc8-46dfd54c686a@linux.intel.com>
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
To: speck@linutronix.de
List-ID: <speck.linutronix.de>

> BTW, this reminds me: Let's say we trust guest kernels.  Do we need KVM
> code to _prevent_ guests running in 32-bit non-PAE mode?  Wouldn't any
> 32-bit non-PAE guest effectively have the ability to read the bottom
> ~4GB of host memory?

If you really trust the guest kernel you need to also trust it to not
use PAE.

If the host has full flush mitigations there is no need to trust
the guest kernel, and it can even use PAE.

> 
> If we don't trust guest kernels, then what is the point of this patch? :)

See below.

> 
> Problem:
> 
> This patch is intended to protect against a 32-bit unprivileged guest
> application using PROT_NONE on normal guest memory to attack host memory.

No, it does not protect host memory, it protects memory inside the guest.

This is explained in detail in the earlier patch:

>>>

    Q: Why does the guest need to be protected when the
    hypervisor already has L1TF mitigations?
    A: Here's an example:
    You have physical pages 1 and 2. They get mapped into a guest as
    GPA 1 -> PA 2
    GPA 2 -> PA 1
    through EPT.
    
    The L1TF speculation ignores the EPT remapping.
    
    Now the guest kernel maps GPA 1 to process A and GPA 2 to process B,
    and they belong to different users and should be isolated.
    
    A sets the GPA 1 PA 2 PTE to PROT_NONE to bypass the EPT remapping
    and gets read access to the underlying physical page. Which
    in this case points to PA 2, so it can read process B's data,
    if it happened to be in L1.
    
    So we broke isolation inside the guest.
    
    There's nothing the hypervisor can do about this. This
    mitigation has to be done in the guest.

<<<


> 
> Background:
> 
> 32-bit 'unsigned long' PFNs can only point to 44 bits of memory
> (32+PAGE_SHIFT).  We enforce this via __PHYSICAL_PAGE_MASK, but
> unfortunately our L1TF workaround bits are limited by
> __PHYSICAL_PAGE_MASK as well.
> 
> Example:
> 
> Imagine a 32-bit PAE PTE pointing to memory at guest physical address 1GB:
> 
> 	0x0000000040000067
> 
> Then the attacker calls mprotect(PROT_NONE).  We invert the PTE's
> physical address bits (and add _PAGE_PROT_NONE), but only those bits set
> in __PHYSICAL_PAGE_MASK.  We get:
> 
> 	0x00000fffbffff100

These three paragraphs are correct and I can add them.
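
For anyone who wants to check the arithmetic in the quoted example, here
is a minimal standalone sketch. It is not the kernel code: the macro
names, the helper protnone_invert() and the check function are made up
for illustration. It assumes PAGE_SHIFT is 12, a 44-bit physical mask
as described above for 32-bit PAE, and _PAGE_PROT_NONE at bit 8 (0x100).
It also drops all other flag bits, which is what the example value shows.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for a 32-bit PAE kernel, taken from the quoted example. */
#define EX_PAGE_SHIFT		12
#define EX_PHYS_MASK_SHIFT	44	/* 32-bit PFN + PAGE_SHIFT */
#define EX_PAGE_MASK		(~((UINT64_C(1) << EX_PAGE_SHIFT) - 1))
#define EX_PHYS_PAGE_MASK	\
	(EX_PAGE_MASK & ((UINT64_C(1) << EX_PHYS_MASK_SHIFT) - 1))
#define EX_PAGE_PROT_NONE	UINT64_C(0x100)	/* assumed bit position */

/*
 * Hypothetical helper: invert the physical address bits of a PTE, but
 * only those covered by the physical page mask, then mark the entry
 * PROT_NONE. All other flag bits are dropped for simplicity.
 */
static uint64_t protnone_invert(uint64_t pte)
{
	uint64_t addr = pte & EX_PHYS_PAGE_MASK;

	return (~addr & EX_PHYS_PAGE_MASK) | EX_PAGE_PROT_NONE;
}

/* Reproduces the quoted example: a PTE for guest physical address 1GB. */
static void protnone_example_check(void)
{
	assert(EX_PHYS_PAGE_MASK == UINT64_C(0x00000ffffffff000));
	assert(protnone_invert(UINT64_C(0x0000000040000067)) ==
	       UINT64_C(0x00000fffbffff100));
}
```

The point the sketch makes is that the inversion cannot set any bit above
bit 43, because the mask stops there.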


-Andi