xen-devel.lists.xenproject.org archive mirror
From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Xen-devel <xen-devel@lists.xen.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Subject: [PATCH FAIRLY-RFC 00/44] x86: Prerequisite work for a Xen KAISER solution
Date: Thu, 4 Jan 2018 20:21:25 +0000
Message-ID: <1515097329-31902-1-git-send-email-andrew.cooper3@citrix.com>

This work was developed as an SP3 mitigation, but shelved when it became clear
that it wasn't viable to get done in the timeframe.

To protect against SP3 attacks, most mappings need to be flushed while in
user context.  However, to protect against all cross-VM attacks, it is
necessary to ensure that the Xen stacks are not mapped in any other CPU's
address space, or an attacker can still recover at least the GPR state of
separate VMs.

To have isolated stacks, Xen needs a per-pcpu isolated region, which requires
that two pCPUs never share the same %cr3.  This is trivial for 32bit PV guests
and HVM guests due to the existing per-vcpu Monitor Tables, but is problematic
for 64bit PV guests, which will run on the same %cr3 when scheduling different
threads from the same process.

To avoid breaking the PV ABI, Xen needs to shadow the guest L4 pagetables if
it wants to maintain the unique %cr3 property it needs.
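
As a rough illustration of the unique-%cr3 idea described above (not the
algorithm in pt-shadow.c), the sketch below gives each pCPU a private copy of
the guest's L4 and installs a per-pCPU entry in one slot.  All names here
(PERCPU_SLOT, NR_PCPUS, shadow_guest_l4(), guest_l4e_write()) and the slot
number are hypothetical, and pagetable entries are modelled as plain integers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define L4_ENTRIES  512u  /* entries in an x86-64 L4 pagetable */
#define PERCPU_SLOT 257u  /* hypothetical slot for the per-pCPU region */
#define NR_PCPUS    4u    /* hypothetical, small for illustration */

typedef uint64_t l4e_t;
struct l4_table { l4e_t e[L4_ENTRIES]; };

/* One private L4 per pCPU: what that pCPU's %cr3 would point at. */
struct l4_table shadow_l4[NR_PCPUS];
/* Per-pCPU L4 entry covering that pCPU's isolated region (stack etc). */
l4e_t percpu_l4e[NR_PCPUS];
/* Which guest L4 (if any) each pCPU is currently shadowing. */
const struct l4_table *shadowed_guest[NR_PCPUS];

/* Copy the guest L4 into this pCPU's shadow, then add the private slot. */
void shadow_guest_l4(unsigned int cpu, const struct l4_table *guest)
{
    assert(cpu < NR_PCPUS);
    memcpy(&shadow_l4[cpu], guest, sizeof(*guest));
    shadow_l4[cpu].e[PERCPU_SLOT] = percpu_l4e[cpu];
    shadowed_guest[cpu] = guest;
}

/*
 * A guest L4e update has to reach every shadow of that table.  Returns 0
 * on success, or -1 if the guest targets the slot reserved in this sketch.
 */
int guest_l4e_write(struct l4_table *guest, unsigned int slot, l4e_t val)
{
    if (slot >= L4_ENTRIES || slot == PERCPU_SLOT)
        return -1;

    guest->e[slot] = val;
    for (unsigned int cpu = 0; cpu < NR_PCPUS; cpu++)
        if (shadowed_guest[cpu] == guest)
            shadow_l4[cpu].e[slot] = val;

    return 0;
}
```

Two pCPUs running threads of the same process then see identical guest
entries but different PERCPU_SLOT entries, so they never share an L4.  Even
in this toy form, the full-table copy and the write propagation hint at where
the overhead comes from.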

tl;dr: The shadowing algorithm in pt-shadow.c is too much of a performance
overhead to be viable, and very high risk to productise in an embargo window.
If we want to continue down this route, we need either a clever alternative
to the shadowing algorithm I came up with, or a change to the PV ABI to
require VMs not to share L4 pagetables.

Either way, these patches are presented to start a discussion of the issues.
The series as a whole is not in a suitable state for committing.

~Andrew

Andrew Cooper (44):
  passthrough/vtd: Don't DMA to the stack in queue_invalidate_wait()
  x86/idt: Factor out enabling and disabling of ISTs
  x86/pv: Rename invalidate_shadow_ldt() to pv_destroy_ldt()
  x86/boot: Introduce cpu_smpboot_bsp() to dynamically allocate BSP state
  x86/boot: Move arch_init_memory() earlier in the boot sequence
  x86/boot: Allocate percpu pagetables for the idle vcpus
  x86/boot: Use percpu pagetables for the idle vcpus
  x86/pv: Avoid an opencoded mov to %cr3 in toggle_guest_mode()
  x86/mm: Track the current %cr3 in a per_cpu variable
  x86/pt-shadow: Initial infrastructure for L4 PV pagetable shadowing
  x86/pt-shadow: Always set _PAGE_ACCESSED on L4e updates
  x86/fixmap: Temporarily add a percpu fixmap range
  x86/pt-shadow: Shadow L4 tables from 64bit PV guests
  x86/mm: Added safety checks that pagetables aren't shared
  x86: Rearrange the virtual layout to introduce a PERCPU linear slot
  xen/ipi: Introduce arch_ipi_param_ok() to check IPI parameters
  x86/smp: Infrastructure for allocating and freeing percpu pagetables
  x86/mm: Maintain the correct percpu mappings on context switch
  x86/boot: Defer TSS/IST setup until later during boot on the BSP
  x86/smp: Allocate a percpu linear range for the IDT
  x86/smp: Switch to using the percpu IDT mappings
  x86/mm: Track whether the current cr3 has a short or extended directmap
  x86/smp: Allocate percpu resources for map_domain_page() to use
  x86/mapcache: Reimplement map_domain_page() from scratch
  x86/fixmap: Drop percpu fixmap range
  x86/pt-shadow: Maintain a small cache of shadowed frames
  x86/smp: Allocate a percpu linear range for the compat translation area.
  x86/xlat: Use the percpu compat translation area
  x86/smp: Allocate percpu resources for the GDT and LDT
  x86/pv: Break handle_ldt_mapping_fault() out of handle_gdt_ldt_mapping_fault()
  x86/pv: Drop support for paging out the LDT
  x86: Always reload the LDT on vcpu context switch
  x86/smp: Use the percpu GDT/LDT mappings
  x86: Drop the PERDOMAIN mappings
  x86/smp: Allocate the stack in the percpu range
  x86/monitor: Capture Xen's intent to use monitor at boot time
  x86/misc: Move some IPI parameters off the stack
  x86/mca: Move __HYPERVISOR_mca IPI parameters off the stack
  x86/smp: Introduce get_smp_ipi_buf() and take more IPI parameters off the stack
  x86/boot: Switch the APs to the percpu pagetables before entering C
  x86/smp: Switch to using the percpu stacks
  x86/smp: Allocate a percpu linear range for the TSS
  x86/smp: Use the percpu TSS mapping
  misc debugging

 xen/arch/x86/acpi/cpu_idle.c         |  30 +--
 xen/arch/x86/acpi/cpufreq/cpufreq.c  |  57 +++--
 xen/arch/x86/acpi/cpufreq/powernow.c |  26 +--
 xen/arch/x86/acpi/lib.c              |  16 +-
 xen/arch/x86/boot/x86_64.S           |  24 +-
 xen/arch/x86/cpu/common.c            |  90 +-------
 xen/arch/x86/cpu/mcheck/mce.c        | 143 +++++++-----
 xen/arch/x86/cpu/mtrr/main.c         |  27 ++-
 xen/arch/x86/domain.c                |  94 ++++----
 xen/arch/x86/domain_page.c           | 353 +++++++++--------------------
 xen/arch/x86/domctl.c                |  13 +-
 xen/arch/x86/efi/efi-boot.h          |   8 +-
 xen/arch/x86/hvm/hvm.c               |  14 --
 xen/arch/x86/hvm/save.c              |   4 -
 xen/arch/x86/hvm/svm/svm.c           |   8 +-
 xen/arch/x86/hvm/vmx/vmcs.c          |  51 ++---
 xen/arch/x86/mm.c                    | 380 ++++++-------------------------
 xen/arch/x86/mm/p2m-ept.c            |   5 +-
 xen/arch/x86/mm/shadow/multi.c       |   4 +
 xen/arch/x86/platform_hypercall.c    |  40 ++--
 xen/arch/x86/psr.c                   |   9 +-
 xen/arch/x86/pv/Makefile             |   1 +
 xen/arch/x86/pv/descriptor-tables.c  |  62 ++++-
 xen/arch/x86/pv/dom0_build.c         |   5 -
 xen/arch/x86/pv/domain.c             |  55 +----
 xen/arch/x86/pv/emulate.h            |   4 +-
 xen/arch/x86/pv/mm.c                 |   6 +-
 xen/arch/x86/pv/mm.h                 |  35 ++-
 xen/arch/x86/pv/pt-shadow.c          | 428 +++++++++++++++++++++++++++++++++++
 xen/arch/x86/setup.c                 | 130 +++++++++--
 xen/arch/x86/shutdown.c              |   8 +-
 xen/arch/x86/smp.c                   |   2 +
 xen/arch/x86/smpboot.c               | 399 +++++++++++++++++++++++++++++---
 xen/arch/x86/sysctl.c                |  10 +-
 xen/arch/x86/tboot.c                 |  29 +--
 xen/arch/x86/time.c                  |   7 +-
 xen/arch/x86/traps.c                 | 328 +++++++++++++++++++++------
 xen/arch/x86/x86_64/mm.c             |  34 +--
 xen/arch/x86/xen.lds.S               |   2 +
 xen/common/efi/runtime.c             |  23 +-
 xen/common/smp.c                     |   1 +
 xen/drivers/passthrough/vtd/qinval.c |   8 +-
 xen/include/asm-arm/mm.h             |   1 -
 xen/include/asm-arm/smp.h            |   3 +
 xen/include/asm-x86/config.h         |  77 +++----
 xen/include/asm-x86/cpufeature.h     |   5 +-
 xen/include/asm-x86/cpufeatures.h    |   1 +
 xen/include/asm-x86/domain.h         |  67 +-----
 xen/include/asm-x86/hvm/vmx/vmcs.h   |   1 -
 xen/include/asm-x86/ldt.h            |  19 +-
 xen/include/asm-x86/mm.h             |  32 +--
 xen/include/asm-x86/mwait.h          |   3 +
 xen/include/asm-x86/page.h           |   1 +
 xen/include/asm-x86/processor.h      |  22 +-
 xen/include/asm-x86/pv/mm.h          |   3 +
 xen/include/asm-x86/pv/pt-shadow.h   | 100 ++++++++
 xen/include/asm-x86/smp.h            |  39 ++++
 xen/include/asm-x86/system.h         |   1 +
 xen/include/asm-x86/x86_64/uaccess.h |   6 +-
 xen/include/xen/smp.h                |   2 -
 60 files changed, 2027 insertions(+), 1329 deletions(-)
 create mode 100644 xen/arch/x86/pv/pt-shadow.c
 create mode 100644 xen/include/asm-x86/pv/pt-shadow.h

-- 
2.1.4



