* [RFC PATCH v2 00/17] RFC: SGX Virtualization design and draft patches
@ 2017-12-04 0:15 Boqun Feng
2017-12-04 0:15 ` [PATCH v2 01/17] xen: x86: expose SGX to HVM domain in CPU featureset Boqun Feng
` (17 more replies)
0 siblings, 18 replies; 23+ messages in thread
From: Boqun Feng @ 2017-12-04 0:15 UTC (permalink / raw)
To: xen-devel
Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
George Dunlap, Andrew Cooper, Ian Jackson,
Marek Marczykowski-Górecki, Tim Deegan, kai.huang,
Julien Grall, Jan Beulich, David Scott, Boqun Feng
Hi all,
This is v2 of the RFC SGX virtualization design and draft patches; you can
find v1 at:
https://lists.gt.net/xen/devel/483404
In this new version, I fixed a few things according to the feedback on the
previous version (mostly cleanups and code movement).
Besides, Kai and I redesigned the SGX MSR setup part and introduced the
new XL parameters 'lehash' and 'lewr'.
Another big change is that I modified the EPC management to fit EPC pages
into 'struct page_info'; in patches #6 and #7, unscrubbable pages,
'PGC_epc', 'MEMF_epc' and 'XENZONE_EPC' are introduced, so that EPC
management is fully integrated into Xen's existing memory management.
This might be the controversial bit, so patches 6~8 are simply meant to show
the idea and drive deeper discussion.
Detailed changes since v1: (modifications tagged "[New]" are entirely new
in this series; reviews and comments are highly welcome for those parts)
* Make SGX related mostly common for x86 by: 1) moving sgx.[ch] to
arch/x86/ and include/asm-x86/ and 2) renaming EPC related functions
with domain_* prefix.
* Rename ioremap_cache() with ioremap_wb() and make it x86-specific as
suggested by Jan Beulich.
* Remove percpu sgx_cpudata; during bootup, secondary CPUs now check
whether they read a different value than the boot CPU, and if so SGX is
disabled.
* Remove domain_has_sgx_{,launch_control}, and make sure we can
rely on domain's arch.cpuid->feat.sgx{_lc} for setting checks.
* Cleanup the code for CPUID handling as suggested by Andrew Cooper.
* Adjust to the msr_policy framework for SGX MSRs handling, and remove
unnecessary fields like 'readable' and 'writable'.
* Use 'page_info' to maintain EPC pages, and [New] add a draft
implementation employing the Xen heap for EPC page management. Please
see patches 6~8.
* [New] Modify the XL parameter for SGX, please see section 2.1.1 in
the updated design doc.
* [New] Use the _set_vcpu_msrs hypercall in the toolstack to set the SGX
related MSRs. Please see patch #17.
* ACPI related tool changes are temporarily dropped in this patchset,
as I need more time to resolve the comments and do related tests.
The updated design doc is as follows. As in the previous version, there are
some particular points in the design for which we don't know which
implementation is better; a question mark (?) is added to the right of those
menu items. For SGX live migration, thanks to Wei Liu for commenting in the
previous version's review that it would be nice to support if we can, but
we'd like to hear more from you, so we still put a question mark on this
item. Your comments on those "question mark (?)" parts (and other comments
as well, of course) are highly appreciated.
===================================================================
1. SGX Introduction
1.1 Overview
1.1.1 Enclave
1.1.2 EPC (Enclave Page Cache)
1.1.3 ENCLS and ENCLU
1.2 Discovering SGX Capability
1.2.1 Enumerate SGX via CPUID
1.2.2 Intel SGX Opt-in Configuration
1.3 Enclave Life Cycle
1.3.1 Constructing & Destroying Enclave
1.3.2 Enclave Entry and Exit
1.3.2.1 Synchronous Entry and Exit
1.3.2.2 Asynchronous Enclave Exit
1.3.3 EPC Eviction and Reload
1.4 SGX Launch Control
1.5 SGX Interaction with IA32 and Intel 64 Architecture
2. SGX Virtualization Design
2.1 High Level Toolstack Changes
2.1.1 New 'sgx' XL configure file parameter
2.1.2 New XL commands (?)
2.1.3 Notify domain's virtual EPC base and size to Xen
2.2 High Level Hypervisor Changes
2.2.1 EPC Management
2.2.2 EPC Virtualization
2.2.3 Populate EPC for Guest
2.2.4 Launch Control Support
2.2.5 CPUID Emulation
2.2.6 EPT Violation & ENCLS Trapping Handling
2.2.7 Guest Suspend & Resume
2.2.8 Destroying Domain
2.3 Additional Point: Live Migration, Snapshot Support (?)
3. Reference
1. SGX Introduction
1.1 Overview
1.1.1 Enclave
Intel Software Guard Extensions (SGX) is a set of instructions and memory
access mechanisms providing secure access for sensitive applications and
data. SGX allows an application to use a particular part of its address
space as an *enclave*, which is a protected area providing confidentiality
and integrity even in the presence of privileged malware. Accesses to the
enclave memory area from any software not resident in the enclave are
prevented, including those from privileged software. The diagram below
illustrates an enclave within an application.
|-----------------------|
| |
| |---------------| |
| | OS kernel | | |-----------------------|
| |---------------| | | |
| | | | | |---------------| |
| |---------------| | | | Entry table | |
| | Enclave |---|-----> | |---------------| |
| |---------------| | | | Enclave stack | |
| | App code | | | |---------------| |
| |---------------| | | | Enclave heap | |
| | Enclave | | | |---------------| |
| |---------------| | | | Enclave code | |
| | App code | | | |---------------| |
| |---------------| | | |
| | | |-----------------------|
|-----------------------|
SGX supports SGX1 and SGX2 extensions. SGX1 provides basic enclave support,
and SGX2 allows additional flexibility in runtime management of enclave
resources and thread execution within an enclave.
1.1.2 EPC (Enclave Page Cache)
Just like normal application memory management, enclave memory management
can be divided into two parts: address space allocation and memory
commitment. Address space allocation means allocating a particular range of
linear address space for the enclave. Memory commitment means assigning
actual resources to the enclave.
The Enclave Page Cache (EPC) is the physical resource used to commit to
enclaves. EPC is divided into 4K pages: an EPC page is 4K in size and always
aligned to a 4K boundary. Hardware performs additional access control checks
to restrict access to EPC pages. The Enclave Page Cache Map (EPCM) is a
secure structure which holds one entry for each EPC page and is used by
hardware to track the status of each EPC page (it is invisible to software).
Typically EPC and EPCM are reserved by BIOS as Processor Reserved Memory,
but the actual amount, size, and layout of EPC are model-specific and depend
on BIOS settings. EPC is enumerated via the new SGX CPUID leaf, and is
reported as reserved memory.
EPC pages can be either invalid or valid. There are 4 valid EPC page types
in SGX1: regular EPC page, SGX Enclave Control Structure (SECS) page, Thread
Control Structure (TCS) page, and Version Array (VA) page. SGX2 adds the
Trimmed EPC page. Each enclave is associated with one SECS page, and each
thread in an enclave is associated with one TCS page. VA pages are used in
EPC page eviction and reload. The Trimmed EPC page type is introduced in
SGX2 for when a particular 4K page of an enclave is going to be freed
(trimmed) at runtime after the enclave has been initialized.
1.1.3 ENCLS and ENCLU
Two new instructions, ENCLS and ENCLU, are introduced to manage enclaves and
EPC. ENCLS can only run in ring 0, while ENCLU can only run in ring 3. Both
ENCLS and ENCLU have multiple leaf functions, with EAX indicating the
specific leaf function.
SGX1 supports the following ENCLS and ENCLU leaves:
ENCLS:
- ECREATE, EADD, EEXTEND, EINIT, EREMOVE (Enclave build and destroy)
- EPA, EBLOCK, ETRACK, EWB, ELDU/ELDB (EPC eviction & reload)
ENCLU:
- EENTER, EEXIT, ERESUME (Enclave entry, exit, re-enter)
- EGETKEY, EREPORT (SGX key derivation, attestation)
Additionally, SGX2 supports the following ENCLS and ENCLU leaves for adding
and removing EPC pages to/from an enclave at runtime, after the enclave has
been initialized, along with changing page permissions.
ENCLS:
- EAUG, EMODT, EMODPR
ENCLU:
- EACCEPT, EACCEPTCOPY, EMODPE
The VMM is able to interfere with ENCLS running in a guest (see 1.5.1 VMX
Changes for Supporting SGX Virtualization) but is unable to interfere with
ENCLU.
1.2 Discovering SGX Capability
1.2.1 Enumerate SGX via CPUID
If CPUID.0x7.0:EBX.SGX (bit 2) is 1, then the processor supports SGX, and
SGX capabilities and resources can be enumerated via the new SGX CPUID leaf
(0x12). CPUID.0x12.0x0 reports SGX capabilities, such as the presence of
SGX1 and SGX2, and the enclave's maximum size for both 32-bit and 64-bit
applications. CPUID.0x12.0x1 reports the bits that may be set in
SECS.ATTRIBUTES. CPUID.0x12.0x2 reports the EPC resource's base and size.
A platform may support multiple EPC sections, and CPUID.0x12.0x3 and further
sub-leaves can be used to detect the existence of multiple EPC sections
(until CPUID reports an invalid EPC section).
Refer to 37.7.2 Intel SGX Resource Enumeration Leaves for full description of
SGX CPUID 0x12.
1.2.2 Intel SGX Opt-in Configuration
On processors that support Intel SGX, IA32_FEATURE_CONTROL also provides the
SGX_ENABLE bit (bit 18) to turn SGX on or off. Before system software can
enable and use SGX, BIOS is required to set IA32_FEATURE_CONTROL.SGX_ENABLE
= 1 to opt in to SGX.
Setting SGX_ENABLE follows the rules of IA32_FEATURE_CONTROL.LOCK (bit 0).
Software is considered to have opted into Intel SGX if and only if
IA32_FEATURE_CONTROL.SGX_ENABLE and IA32_FEATURE_CONTROL.LOCK are both set
to 1.
The setting of IA32_FEATURE_CONTROL.SGX_ENABLE (bit 18) is not reflected in
the SGX CPUID leaves. Enclave instructions will behave differently according
to the value of CPUID.0x7.0x0:EBX.SGX and whether BIOS has opted in to SGX.
Refer to 37.7.1 Intel SGX Opt-in Configuration for more information.
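For illustration, the opt-in check then boils down to something like the
snippet below (mirroring the check done by detect_sgx() in patch #2):

    /* SGX is only usable once BIOS has opted in: both LOCK (bit 0) and
     * SGX_ENABLE (bit 18) must be set in IA32_FEATURE_CONTROL. */
    static bool sgx_opted_in(void)
    {
        const uint64_t required = IA32_FEATURE_CONTROL_SGX_ENABLE |
                                  IA32_FEATURE_CONTROL_LOCK;
        uint64_t val;

        rdmsrl(MSR_IA32_FEATURE_CONTROL, val);

        return (val & required) == required;
    }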
1.3 Enclave Life Cycle
1.3.1 Constructing & Destroying Enclave
An enclave is created via the ENCLS[ECREATE] leaf by privileged software.
Basically, ECREATE converts an invalid EPC page into a SECS page, according
to a source SECS structure residing in normal memory. The source SECS
contains the enclave's info such as base (linear) address, size, enclave
attributes, enclave measurement, etc.
After ECREATE, for each 4K page of the enclave's linear address space,
privileged software uses EADD and EEXTEND to add one EPC page to it. Enclave
code/data (residing in normal memory) is loaded into the enclave during EADD
for each of the enclave's 4K pages. After all EPC pages are added to the
enclave, privileged software calls EINIT to initialize the enclave, and then
the enclave is ready to run.
While the enclave is constructed, the enclave measurement, which is a SHA256
hash value, is also built according to the enclave's size, its code/data and
their location in the enclave, etc. The measurement can be used to uniquely
identify the enclave. The SIGSTRUCT passed to the EINIT leaf also contains
the measurement specified by untrusted software, via MRENCLAVE. EINIT will
check the two measurements and will only succeed when the two match.
An enclave is destroyed by running EREMOVE on all of the enclave's EPC
pages, and then on the enclave's SECS. EREMOVE will report an
SGX_CHILD_PRESENT error if it is called on a SECS while there are still
regular EPC pages that haven't been removed from the enclave.
Please refer to SDM chapter 39.1 Constructing an Enclave for more information.
1.3.2 Enclave Entry and Exit
1.3.2.1 Synchronous Entry and Exit
After the enclave is constructed, non-privileged software uses ENCLU[EENTER]
to enter the enclave to run. While a process runs in the enclave,
non-privileged software can use ENCLU[EEXIT] to exit the enclave and return
to normal mode.
1.3.2.2 Asynchronous Enclave Exit
Asynchronous and synchronous events, such as exceptions, interrupts, traps,
SMIs, and VM exits, may occur while executing inside an enclave. These events
are referred to as Enclave Exiting Events (EEE). Upon an EEE, the processor
state is securely saved inside the enclave and then replaced by a synthetic
state to prevent leakage of secrets. The process of securely saving state and
establishing the synthetic state is called an Asynchronous Enclave Exit (AEX).
After an AEX, non-privileged software uses ENCLU[ERESUME] to re-enter the
enclave. The SGX userspace software maintains a small piece of code (residing
in normal memory) which basically calls ERESUME to re-enter the enclave. The
address of this piece of code is called the Asynchronous Exit Pointer (AEP).
The AEP is specified as a parameter to EENTER and is kept internally by the
enclave. Upon an AEX, the AEP is pushed onto the stack, and upon returning
from EEE handling, e.g. via IRET, the AEP is loaded into RIP and ERESUME is
subsequently called to re-enter the enclave.
During an AEX the processor does the context saving and restoring
automatically, therefore no change to the interrupt handling of the OS kernel
or VMM is required. It is the SGX userspace software's responsibility to set
up the AEP correctly.
Please refer to SDM chapter 39.2 Enclave Entry and Exit for more information.
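As a rough userspace illustration (not part of this patchset; the ENCLU leaf
number and register usage are from my reading of the SDM), entry with an AEP
looks roughly like:

    /* Illustrative only: ENCLU is opcode 0F 01 D7; for EENTER, RBX holds
     * the TCS address and RCX the AEP.  On an AEX the AEP ends up as the
     * interrupted RIP, so returning from the event handler lands on code
     * that simply ERESUMEs the enclave. */
    #define ENCLU_EENTER 2

    static void enter_enclave(void *tcs, void *aep)
    {
        asm volatile ( ".byte 0x0f, 0x01, 0xd7"
                       :
                       : "a" (ENCLU_EENTER), "b" (tcs), "c" (aep)
                       : "memory", "cc" );
    }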
1.3.3 EPC Eviction and Reload
SGX also allows privileged software to evict any EPC pages that are used by
an enclave. The idea is the same as normal memory swapping.
Below is the sequence to evict a regular EPC page:
1) Select one or multiple regular EPC pages from one enclave
2) Remove EPT/PT mapping for selected EPC pages
3) Send IPIs to remote CPUs to flush TLB of selected EPC pages
4) EBLOCK on selected EPC pages
5) ETRACK on enclave's SECS page
6) allocate one available slot (8-byte) in VA page
7) EWB on selected EPC pages
With EWB taking:
- a VA slot, to store the eviction version info.
- one normal 4K page in memory, to store the encrypted content of the EPC page.
- one struct PCMD in memory, to store metadata.
(A VA slot is an 8-byte slot in a VA page, which is a particular type of EPC
page.)
And below is the sequence to evict a SECS page or VA page:
1) locate the SECS (or VA) page
2) remove the EPT/PT mapping for the SECS (or VA) page
3) send IPIs to remote CPUs to flush TLBs
4) allocate one available slot (8-byte) in a VA page
5) EWB on the SECS (or VA) page
And for evicting a SECS page, all regular EPC pages that belong to that SECS
must be evicted first, otherwise EWB returns an SGX_CHILD_PRESENT error.
And to reload an EPC page:
1) ELDU/ELDB on EPC page
2) setup EPT/PT mapping
With ELDU/ELDB taking:
- location of SECS page
- linear address of enclave's 4K page (that we are going to reload to)
- VA slot (used in EWB)
- 4K page in memory (used in EWB)
- struct PCMD in memory (used in EWB)
Please refer to SDM chapter 39.5 EPC and Management of EPC pages for more
information.
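Purely as an illustration of the regular-page sequence above, the flow could
be structured like the sketch below; every type and helper name here is
hypothetical and only stands in for the corresponding step:

    /* Hypothetical helpers; ordering follows the sequence above. */
    struct epc_page;
    struct pcmd;

    extern void unmap_epc_page(struct epc_page *page);          /* step 2 */
    extern void flush_epc_tlbs(struct epc_page *page);          /* step 3 */
    extern int encls_eblock(struct epc_page *page);             /* step 4 */
    extern int encls_etrack(struct epc_page *secs);             /* step 5 */
    extern int encls_ewb(struct epc_page *page, void *backing,
                         struct pcmd *pcmd, uint64_t *va_slot); /* step 7 */

    static int evict_epc_page(struct epc_page *secs, struct epc_page *page,
                              void *backing, struct pcmd *pcmd,
                              uint64_t *va_slot)
    {
        int rc;

        unmap_epc_page(page);
        flush_epc_tlbs(page);

        if ( (rc = encls_eblock(page)) || (rc = encls_etrack(secs)) )
            return rc;

        /* EWB writes the encrypted page to 'backing', metadata to 'pcmd'
         * and the version counter to the VA slot. */
        return encls_ewb(page, backing, pcmd, va_slot);
    }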
1.4 SGX Launch Control
SGX requires running a "Launch Enclave" (LE) before running any other
enclave. This is because the LE is the only enclave that does not require an
EINITTOKEN in EINIT. Running any other enclave requires a valid EINITTOKEN,
which contains a MAC over (the first 192 bytes of) the EINITTOKEN, calculated
with the EINITTOKEN key. EINIT verifies the MAC by internally deriving the
EINITTOKEN key, and only an EINITTOKEN with a matching MAC is accepted by
EINIT. The EINITTOKEN key derivation depends on some info from the LE. The
typical process is that the LE generates an EINITTOKEN for another enclave
according to the LE itself and the target enclave, and calculates the MAC by
using ENCLU[EGETKEY] to get the EINITTOKEN key. Only the LE is able to get
the EINITTOKEN key.
Running an LE requires the SHA256 hash of the LE signer's RSA public key
(SHA256 of sigstruct->modulus) to equal the value in the
IA32_SGXLEPUBKEYHASH[0-3] MSRs (the 4 MSRs together make up the 256-bit
SHA256 hash value).
If CPUID.0x7.0x0:EBX.SGX and CPUID.0x7.0x0:ECX.SGX_LAUNCH_CONTROL (bit 30)
are both set, then the IA32_SGXLEPUBKEYHASHn MSRs are available, and
IA32_FEATURE_CONTROL has the SGX_LAUNCH_CONTROL_ENABLE bit (bit 17).
Setting SGX_LAUNCH_CONTROL_ENABLE to 1 enables runtime changes of
IA32_SGXLEPUBKEYHASHn after IA32_FEATURE_CONTROL is locked. Otherwise,
IA32_SGXLEPUBKEYHASHn are read-only after IA32_FEATURE_CONTROL is locked.
After reset, IA32_SGXLEPUBKEYHASHn are set to the hash of Intel's default
key. On a system that has only CPUID.0x7.0x0:EBX.SGX set,
IA32_SGXLEPUBKEYHASHn are not available. On such a system EINIT will always
treat IA32_SGXLEPUBKEYHASHn as having Intel's default value, thus only
Intel's LE is able to run.
On systems with IA32_SGXLEPUBKEYHASHn available, it is up to the BIOS
implementation whether to let the user configure IA32_SGXLEPUBKEYHASHn in
*locked* mode (IA32_SGXLEPUBKEYHASHn are read-only after IA32_FEATURE_CONTROL
is locked) or *unlocked* mode (IA32_SGXLEPUBKEYHASHn remain writable by the
kernel at runtime). BIOS may or may not also provide a way for the user to
set a custom IA32_SGXLEPUBKEYHASHn value.
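As a small sketch of the *unlocked* case (the MSR index 0x8C for
IA32_SGXLEPUBKEYHASH0 is an assumption on my part, not something defined
anywhere in this series yet):

    /* Assumed MSR index; IA32_SGXLEPUBKEYHASH1..3 follow consecutively. */
    #define MSR_IA32_SGXLEPUBKEYHASH0 0x0000008c

    /* Install the SHA256 hash of an LE signer's public key, 64 bits per
     * MSR, so that EINIT will accept that signer's launch enclave. */
    static void set_le_pubkey_hash(const uint64_t hash[4])
    {
        unsigned int i;

        for ( i = 0; i < 4; i++ )
            wrmsrl(MSR_IA32_SGXLEPUBKEYHASH0 + i, hash[i]);
    }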
1.5 SGX Interaction with IA32 and Intel 64 Architecture
SDM Chapter 42 describes SGX interaction with various features of the IA32
and Intel 64 architecture. Below are the major ones; refer to Chapter 42 for
the full description of SGX interaction with the various features.
1.5.1 VMX Changes for Supporting SGX Virtualization
A new 64-bit ENCLS-exiting bitmap control field is added to the VMCS
(encoding 202EH) to control VMEXIT on ENCLS leaf functions. A new "enable
ENCLS exiting" control bit (bit 15) is defined in the secondary
processor-based VM-execution controls; setting it to 1 enables the
ENCLS-exiting bitmap control. The ENCLS-exiting bitmap controls which ENCLS
leaves will trigger VMEXIT.
Additionally, two new bits are added to indicate whether a VMEXIT (of any
kind) came from an enclave. The two bits below are set if the VMEXIT is from
an enclave:
- Bit 27 in the Exit Reason field of the Basic VM-exit information.
- Bit 4 in the Interruptibility State of the Guest Non-Register State of the
VMCS.
Refer to 42.5 Interactions with VMX, 27.2.1 Basic VM-Exit Information, and
27.3.4 Saving Non-Register State.
1.5.2 Interaction with XSAVE
SGX defines a sub-field called X-Feature Request Mask (XFRM) in the
ATTRIBUTES field of the SECS. On enclave entry, SGX hardware verifies that
the features requested in SECS.ATTRIBUTES.XFRM are already enabled in XCR0.
Upon an AEX, SGX saves the processor extended state and miscellaneous state
to the enclave's state-save area (SSA), and clears the secrets from the
processor extended state used by the enclave (to avoid leaking secrets).
Refer to 42.7 Interaction with Processor Extended State and Miscellaneous
State.
1.5.3 Interaction with S state
When the processor goes into an S3-S5 state, EPC content is destroyed, and
consequently all enclaves are destroyed as well.
Refer to 42.14 Interaction with S States.
2. SGX Virtualization Design
2.1 High Level Toolstack Changes:
2.1.1 New 'sgx' XL configure file parameter
EPC is a limited resource. In order to use EPC efficiently among all
domains, the administrator should be able to specify a domain's virtual EPC
size when creating the guest, and should also be able to query every
domain's virtual EPC size.
For SGX Launch Control virtualization, we should allow the admin to create a
VM with the VM's virtual IA32_SGXLEPUBKEYHASHn either locked or unlocked, and
we should also allow the admin to create a VM with a custom
IA32_SGXLEPUBKEYHASHn value.
For the above purposes, the new 'sgx' XL config file parameter below is added:
sgx = 'epc=<size>,lehash=<sha256-hash>,lewr=<0|1>'
Here 'epc' specifies the VM's EPC size in MB and is mandatory.
When the physical machine is in *locked* mode, neither 'lehash' nor 'lewr'
can be specified, as the physical machine is unable to change
IA32_SGXLEPUBKEYHASHn at runtime. Specifying either 'lehash' or 'lewr' will
cause VM creation to fail in that case, and the VM's initial
IA32_SGXLEPUBKEYHASHn value will be set to the value of the physical MSRs.
When the physical machine is in *unlocked* mode, the VM's initial
IA32_SGXLEPUBKEYHASHn value will be set to 'lehash' if specified, or to
Intel's default value otherwise. The VM's SGX_LAUNCH_CONTROL_ENABLE bit in
IA32_FEATURE_CONTROL will be set or cleared depending on whether 'lewr' is
specified (or explicitly set to true or false).
Please also refer to 2.2.4 Launch Control Support.
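For example, a guest config might contain something like the line below
(values purely illustrative); a 64-hex-digit SHA256 value would be given for
'lehash' when a non-default launch enclave signer is wanted:

    # 64MB of virtual EPC; guest kernel may rewrite IA32_SGXLEPUBKEYHASHn
    sgx = 'epc=64,lewr=1'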
2.1.2 New XL commands (?)
The administrator should be able to get the physical EPC size, and every
domain's virtual EPC size. For this purpose, we can introduce 2 additional
commands:
# xl sgxinfo
Which will print out the physical EPC size, and other SGX info (such as SGX1,
SGX2, etc.) if necessary.
# xl sgxlist <did>
Which will print out a particular domain's virtual EPC size, or list the
virtual EPC sizes of all supported domains.
Alternatively, we can also extend existing XL commands by adding a new option:
# xl info -sgx
Which will print out physical EPC size along with other physinfo. And
# xl list <did> -sgx
Which will print out domain's virtual EPC size.
Comments?
In this RFC the two new commands are not implemented yet.
2.1.3 Notify domain's virtual EPC base and size to Xen
Xen needs to know a guest's EPC base and size in order to populate EPC pages
for it. The toolstack notifies the EPC base and size to Xen via
XEN_DOMCTL_set_cpuid.
2.2 High Level Xen Hypervisor Changes:
2.2.1 EPC Management
The Xen hypervisor needs to detect SGX, discover EPC, and manage EPC before
exposing SGX to guests. EPC is detected via SGX CPUID 0x12.0x2. It's possible
that there are multiple EPC sections (enumerated via sub-leaves 0x3 and so
on, until an invalid EPC section is reported), but this is typically on
multi-socket servers, on which each package would have its own EPC.
EPC is reported as reserved memory (so it is not reported as normal memory).
EPC must be managed in 4K pages. CPU hardware uses the EPCM to track the
status of each EPC page. Xen needs to manage EPC and provide functions to,
e.g., allocate and free EPC pages for guests.
Although on typical physical machines (at least existing machines) EPC is at
most ~100MB in size, we cannot assume anything about the EPC size, so in
terms of EPC management it's better to integrate EPC management into Xen's
memory management framework to take advantage of Xen's existing memory
management algorithms. Specifically, one 'struct page_info' will be created
for each EPC page, just like for normal memory, and a new flag will be
defined to identify whether a 'struct page_info' is EPC or normal memory.
The existing memory allocation API alloc_domheap_pages will be reused to
allocate EPC pages, by adding a new memflag 'MEMF_epc' to indicate EPC
allocation rather than normal memory allocation. The new 'MEMF_epc' can also
be used for EPC ballooning (if required in the future), as with the new
flag, the existing XENMEM_increase{decrease}_reservation and
XENMEM_populate_physmap can be reused for EPC as well.
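The intended usage, with the MEMF_epc flag proposed in patches 6~8, would
then look roughly like the sketch below (helper names made up for
illustration):

    /* EPC pages come from the normal domheap allocator; only the proposed
     * MEMF_epc flag distinguishes them from ordinary RAM. */
    static struct page_info *alloc_epc_page(struct domain *d)
    {
        return alloc_domheap_pages(d, 0, MEMF_epc);   /* one 4K EPC page */
    }

    static void free_epc_page(struct page_info *pg)
    {
        free_domheap_page(pg);
    }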
2.2.2 EPC Virtualization
This part covers how to populate EPC for guests. We have 3 choices:
- Static Partitioning
- Oversubscription
- Ballooning
Static Partitioning means all EPC pages are allocated and mapped to the guest
when it is created, and there is no runtime change of page table mappings for
EPC pages. Oversubscription means the Xen hypervisor supports EPC page
swapping between domains, i.e. Xen is able to evict an EPC page from one
domain and assign it to the domain that needs the EPC. With oversubscription,
EPC can be assigned to a domain on demand, when an EPT violation happens.
Ballooning is similar to memory ballooning: it is basically "Static
Partitioning" + a balloon driver in the guest.
Static Partitioning is the easiest in terms of implementation, and there will
be no hypervisor overhead (except EPT overhead of course), because with
"Static Partitioning" there are no EPT violations for EPC, and Xen doesn't
need to turn on ENCLS VMEXIT for the guest, as ENCLS runs perfectly in
non-root mode.
Ballooning is "Static Partitioning" + a balloon driver in the guest. Like
"Static Partitioning", ballooning doesn't need to turn on ENCLS VMEXIT, and
doesn't have EPT violations for EPC either. To support ballooning, we need a
balloon driver in the guest to issue hypercalls to give up or reclaim EPC
pages. In terms of the hypercall, we have two choices: 1) add a new hypercall
for EPC ballooning; 2) use the existing
XENMEM_{increase/decrease}_reservation with a new memory flag, i.e.
XENMEMF_epc. I'll discuss adding a dedicated hypercall (or not) in more
detail later.
Oversubscription looks nice but requires a more complicated implementation.
Firstly, as explained in 1.3.3 EPC Eviction and Reload, we need to follow
specific steps to evict EPC pages, and in order to do that, Xen basically
needs to trap ENCLS from guests and keep track of EPC page status and
enclave info from all guests. This is because:
- To evict a regular EPC page, Xen needs to know the SECS location.
- Xen needs to know the EPC page type: evicting a regular EPC page and
evicting a SECS or VA page require different steps.
- Xen needs to know the EPC page status: whether the page is blocked or not.
This info can only be obtained by trapping ENCLS from the guest and parsing
its parameters (to identify the SECS page, etc). Parsing ENCLS parameters
means we need to know which ENCLS leaf is being trapped, and we need to
translate the guest's virtual addresses to get physical addresses in order
to locate EPC pages. And once ENCLS is trapped, we have to emulate ENCLS in
Xen, which means we need to reconstruct the ENCLS parameters by remapping
all of the guest's virtual addresses to Xen's virtual addresses
(gva->gpa->pa->xen_va), as ENCLS always uses *effective addresses*, which
are translated by the processor when running ENCLS.
--------------------------------------------------------------
| ENCLS |
--------------------------------------------------------------
| /|\
ENCLS VMEXIT| | VMENTRY
| |
\|/ |
1) parse ENCLS parameters
2) reconstruct(remap) guest's ENCLS parameters
3) run ENCLS on behalf of guest (and skip ENCLS)
4) on success, update EPC/enclave info, or inject error
And Xen needs to maintain each EPC page's status (type, blocked or not, in an
enclave or not, etc). Xen also needs to maintain all enclaves' info from all
guests, in order to find the correct SECS for a regular EPC page, and the
enclave's linear address as well.
So in general, "Static Partitioning" has the simplest implementation, but is
obviously not the best way to use EPC efficiently; "Ballooning" has all the
pros of Static Partitioning but requires a guest balloon driver;
"Oversubscription" is best in terms of flexibility but requires a complicated
hypervisor implementation.
We will start with "Static Partitioning". If "Ballooning" is required in the
future, we will support it. "Oversubscription" should not be needed in the
foreseeable future.
2.2.3 Populate EPC for Guest
The toolstack notifies Xen about the domain's EPC base and size via
XEN_DOMCTL_set_cpuid, so currently Xen populates all EPC pages for the guest
in XEN_DOMCTL_set_cpuid, particularly in the handling of
XEN_DOMCTL_set_cpuid for CPUID.0x12.0x2. Once Xen checks that the values
passed from the toolstack are valid, Xen will allocate all EPC pages and set
up the EPT mappings for the guest.
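A rough sketch of that populate step (the helper is hypothetical; p2m_epc is
the new p2m type added in patch #5 and MEMF_epc the flag from patches 6~8):

    /* Back every gfn of the guest's virtual EPC range with a freshly
     * allocated EPC page, mapped with the new p2m_epc type. */
    static int populate_epc(struct domain *d, unsigned long gfn_base,
                            unsigned long nr_pages)
    {
        unsigned long i;

        for ( i = 0; i < nr_pages; i++ )
        {
            struct page_info *pg = alloc_domheap_pages(d, 0, MEMF_epc);
            int rc;

            if ( !pg )
                return -ENOMEM;

            rc = guest_physmap_add_entry(d, _gfn(gfn_base + i),
                                         _mfn(page_to_mfn(pg)), 0, p2m_epc);
            if ( rc )
                return rc;
        }

        return 0;
    }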
2.2.4 Launch Control Support
To support running multiple domains, each running its own LE signed by a
different owner, the physical machine's BIOS must leave
IA32_SGXLEPUBKEYHASHn *unlocked* before handing over to Xen. Xen will trap
the domain's writes to IA32_SGXLEPUBKEYHASHn and keep the values in the vcpu
internally, and write the values to the physical MSRs when the vcpu is
scheduled in. This guarantees that when EINIT runs in the guest, the guest's
virtual IA32_SGXLEPUBKEYHASHn have been written to the physical MSRs.
The SGX_LAUNCH_CONTROL_ENABLE bit in the guest's IA32_FEATURE_CONTROL is
controlled by the newly added 'lewr' XL parameter (see 2.1.1 New 'sgx' XL
configure file parameter).
If the physical IA32_SGXLEPUBKEYHASHn are *locked* by the machine's BIOS,
then only MSR reads are allowed from the guest, and Xen will inject an error
for the guest's MSR writes. In addition, if the physical
IA32_SGXLEPUBKEYHASHn are *locked*, then creating a guest with the 'lehash'
or 'lewr' parameter will fail, as in that case Xen is not able to write the
guest's virtual IA32_SGXLEPUBKEYHASHn to the physical MSRs.
If the physical IA32_SGXLEPUBKEYHASHn are not available
(CPUID.0x7.0x0:ECX.SGX_LAUNCH_CONTROL is not present), then creating a VM
with 'lehash' or 'lewr' will also fail. In addition, any MSR read/write of
IA32_SGXLEPUBKEYHASHn from the guest is invalid and Xen will inject an error
in that case.
2.2.5 CPUID Emulation
Most of the native SGX CPUID info can be exposed to the guest, except the
two parts below:
- Sub-leaf 0x2 needs to report the domain's virtual EPC base and size,
instead of the physical EPC info.
- Sub-leaf 0x1 needs to be consistent with the guest's XCR0. For the reason
behind this, please refer to 1.5.2 Interaction with XSAVE.
2.2.6 EPT Violation & ENCLS Trapping Handling
Only needed when Xen supports EPC Oversubscription, as explained above.
2.2.7 Guest Suspend & Resume
On hardware, EPC is destroyed when the power state goes to S3-S5. So Xen
will destroy the guest's EPC when the guest's power state goes into S3-S5.
Currently Xen is notified by QEMU of S state changes via
HVM_PARAM_ACPI_S_STATE, where Xen will destroy the EPC if the S state is
S3-S5.
Specifically, Xen will run EREMOVE on each of the guest's EPC pages, as the
guest may not handle EPC suspend & resume correctly, in which case the
guest's EPC pages may physically still be valid, so Xen needs to run EREMOVE
to make sure all EPC pages become invalid. Otherwise further EPC operations
in the guest may fault, as the guest assumes all EPC pages are invalid after
it is resumed.
For a SECS page, EREMOVE may fail with SGX_CHILD_PRESENT, in which case Xen
will keep the SECS page on a list, and call EREMOVE on it again after
EREMOVE has been called on all other EPC pages. This time EREMOVE on the
SECS will succeed, as all children (regular EPC pages) have already been
removed.
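In its simplest form, this reset could be structured like the sketch below
(the eremove() wrapper around ENCLS[EREMOVE] is hypothetical; the "list of
deferred SECS pages" degenerates here into a plain second pass over the
whole range):

    extern int eremove(void *epc_page_va);   /* ENCLS[EREMOVE] wrapper */

    /* Run EREMOVE over the guest's EPC twice: the first pass invalidates
     * all regular pages (EREMOVE on a SECS with children just fails with
     * SGX_CHILD_PRESENT), and the second pass then succeeds on the SECS
     * pages because their children are gone. */
    static void reset_guest_epc(unsigned char *epc_va, unsigned long nr_pages)
    {
        unsigned long i;

        for ( i = 0; i < nr_pages; i++ )
            eremove(epc_va + i * PAGE_SIZE);

        for ( i = 0; i < nr_pages; i++ )
            eremove(epc_va + i * PAGE_SIZE);
    }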
2.2.8 Destroying Domain
Normally Xen just frees all EPC pages of a domain when it is destroyed. But
Xen will also do EREMOVE on all of the guest's EPC pages (as described in
2.2.7 above) before freeing them, as the guest may shut down unexpectedly
(e.g. the user kills the guest), in which case the guest's EPC pages may
still be valid.
2.3 Additional Point: Live Migration, Snapshot Support (?)
Actually, from the hardware's point of view, SGX is not migratable. There
are two reasons:
- The SGX key architecture cannot be virtualized.
Some keys are bound to the CPU, for example the Sealing key and the EREPORT
key. If a VM is migrated to another machine, the same enclave will derive
different keys. Taking the Sealing key as an example, the Sealing key is
typically used by an enclave (an enclave can get the sealing key via EGETKEY)
to *seal* its secrets to the outside (e.g. persistent storage) for further
use. If the Sealing key changes after VM migration, the enclave can never get
the sealed secrets back using the sealing key, as it has changed and the old
sealing key cannot be recovered.
- There is no ENCLS leaf to evict an EPC page to normal memory while at the
same time keeping its content in EPC. Currently, once an EPC page is evicted,
the EPC page becomes invalid. So technically, we are unable to implement live
migration (or checkpointing, or snapshot) for enclaves.
But, with some workarounds, and given some facts about existing SGX drivers,
technically we are able to support live migration (or even checkpointing and
snapshot). This is because:
- Changing a key (which is bound to the CPU) is not a problem in reality.
Take the Sealing key as an example. Losing sealed data is not a problem,
because the sealing key is only supposed to encrypt secrets that can be
provisioned again. The typical work model is: the enclave gets secrets
provisioned from remote (the service provider), and uses the sealing key to
store them for further use. When the enclave tries to *unseal* using the
sealing key, if the sealing key has changed, the enclave will find the data
is corrupted (integrity check failure), so it will ask for the secrets to be
provisioned again from remote.
Another reason is that, in a data center, VMs typically share lots of data,
and as the sealing key is bound to the CPU, data encrypted by one enclave on
one machine cannot be shared by another enclave on another machine. So from
the SGX app writer's point of view, the developer should treat the Sealing
key as a changeable key, and should handle loss of sealed data anyway. The
Sealing key should only be used to seal secrets that can easily be
provisioned again.
For other keys such as the EREPORT key and the provisioning key, which are
used for local attestation and remote attestation, losing them is not a
problem either, due to the second reason below.
- Sudden loss of EPC is not a problem.
On hardware, EPC is lost if the system goes to S3-S5, or is reset, or shut
down, and the SGX driver needs to handle loss of EPC due to power
transitions. This is done by cooperation between the SGX driver and the
userspace SGX SDK/apps. However, during live migration there may be no power
transition in the guest, so there may be no EPC loss during live migration.
And technically we cannot *really* live migrate enclaves (explained above),
so it looks infeasible. But the fact is that both the Linux SGX driver and
the Windows SGX driver already support *sudden* loss of EPC (not just EPC
loss during power transitions), which means both drivers are able to recover
in case EPC is lost at any point at runtime. With this, technically we are
able to support live migration by simply ignoring EPC. After the VM is
migrated, the destination VM will only suffer a *sudden* loss of EPC, which
both the Windows SGX driver and the Linux SGX driver are already able to
handle.
But we must point out that such *sudden* loss of EPC is not hardware
behavior, and SGX drivers for other OSes (such as FreeBSD) may not implement
this, so for those guests the destination VM will behave in an unexpected
manner. But I am not sure we need to care about other OSes.
For the same reason, we are able to support checkpointing for SGX guests
(only Linux and Windows).
For snapshot, we can support snapshotting an SGX guest by either:
- Suspending the guest before the snapshot (S3-S5). This works for all
guests but requires the user to manually suspend the guest.
- Issuing a hypercall to destroy the guest's EPC in save_vm. This only works
for Linux and Windows but doesn't require user intervention.
What are your comments?
3. Reference
- Intel SGX Homepage
https://software.intel.com/en-us/sgx
- Linux SGX SDK
https://01.org/intel-software-guard-extensions
- Linux SGX driver for upstreaming
https://github.com/01org/linux-sgx
- Intel SGX Specification (SDM Vol 3D)
https://software.intel.com/sites/default/files/managed/7c/f1/332831-sdm-vol-3d.pdf
- Paper: Intel SGX Explained
https://eprint.iacr.org/2016/086.pdf
- ISCA 2015 tutorial slides for Intel® SGX - Intel® Software
https://software.intel.com/sites/default/files/332680-002.pdf
Boqun Feng (5):
xen: mm: introduce non-scrubbable pages
xen: mm: manage EPC pages in Xen heaps
xen: x86/mm: add SGX EPC management
xen: x86: add functions to populate and destroy EPC for domain
xen: tools: add SGX to applying MSR policy
Kai Huang (12):
xen: x86: expose SGX to HVM domain in CPU featureset
xen: x86: add early stage SGX feature detection
xen: vmx: detect ENCLS VMEXIT
xen: x86/mm: introduce ioremap_wb()
xen: p2m: new 'p2m_epc' type for EPC mapping
xen: x86: add SGX cpuid handling support.
xen: vmx: handle SGX related MSRs
xen: vmx: handle ENCLS VMEXIT
xen: vmx: handle VMEXIT from SGX enclave
xen: x86: reset EPC when guest got suspended.
xen: tools: add new 'sgx' parameter support
xen: tools: add SGX to applying CPUID policy
docs/misc/xen-command-line.markdown | 8 +
tools/libxc/Makefile | 1 +
tools/libxc/include/xc_dom.h | 4 +
tools/libxc/include/xenctrl.h | 16 +
tools/libxc/xc_cpuid_x86.c | 68 ++-
tools/libxc/xc_msr_x86.h | 10 +
tools/libxc/xc_sgx.c | 82 +++
tools/libxl/libxl.h | 3 +-
tools/libxl/libxl_cpuid.c | 15 +-
tools/libxl/libxl_create.c | 10 +
tools/libxl/libxl_dom.c | 65 ++-
tools/libxl/libxl_internal.h | 2 +
tools/libxl/libxl_nocpuid.c | 4 +-
tools/libxl/libxl_types.idl | 11 +
tools/libxl/libxl_x86.c | 12 +
tools/ocaml/libs/xc/xenctrl_stubs.c | 11 +-
tools/python/xen/lowlevel/xc/xc.c | 11 +-
tools/xl/xl_parse.c | 86 +++
tools/xl/xl_parse.h | 1 +
xen/arch/x86/Makefile | 1 +
xen/arch/x86/cpu/common.c | 15 +
xen/arch/x86/cpuid.c | 62 ++-
xen/arch/x86/domctl.c | 87 ++-
xen/arch/x86/hvm/hvm.c | 3 +
xen/arch/x86/hvm/vmx/vmcs.c | 16 +-
xen/arch/x86/hvm/vmx/vmx.c | 68 +++
xen/arch/x86/hvm/vmx/vvmx.c | 11 +
xen/arch/x86/mm.c | 9 +-
xen/arch/x86/mm/p2m-ept.c | 3 +
xen/arch/x86/mm/p2m.c | 41 ++
xen/arch/x86/msr.c | 6 +-
xen/arch/x86/sgx.c | 815 ++++++++++++++++++++++++++++
xen/common/page_alloc.c | 39 +-
xen/include/asm-arm/mm.h | 9 +
xen/include/asm-x86/cpufeature.h | 4 +
xen/include/asm-x86/cpuid.h | 29 +-
xen/include/asm-x86/hvm/hvm.h | 3 +
xen/include/asm-x86/hvm/vmx/vmcs.h | 8 +
xen/include/asm-x86/hvm/vmx/vmx.h | 3 +
xen/include/asm-x86/mm.h | 19 +-
xen/include/asm-x86/msr-index.h | 6 +
xen/include/asm-x86/msr.h | 5 +
xen/include/asm-x86/p2m.h | 12 +-
xen/include/asm-x86/sgx.h | 86 +++
xen/include/public/arch-x86/cpufeatureset.h | 3 +-
xen/include/xen/mm.h | 2 +
xen/tools/gen-cpuid.py | 3 +
47 files changed, 1757 insertions(+), 31 deletions(-)
create mode 100644 tools/libxc/xc_sgx.c
create mode 100644 xen/arch/x86/sgx.c
create mode 100644 xen/include/asm-x86/sgx.h
--
2.15.0
* [PATCH v2 01/17] xen: x86: expose SGX to HVM domain in CPU featureset
2017-12-04 0:15 [RFC PATCH v2 00/17] RFC: SGX Virtualization design and draft patches Boqun Feng
@ 2017-12-04 0:15 ` Boqun Feng
2017-12-04 11:13 ` Julien Grall
2017-12-04 0:15 ` [PATCH v2 02/17] xen: x86: add early stage SGX feature detection Boqun Feng
` (16 subsequent siblings)
17 siblings, 1 reply; 23+ messages in thread
From: Boqun Feng @ 2017-12-04 0:15 UTC (permalink / raw)
To: xen-devel
Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
George Dunlap, Andrew Cooper, Ian Jackson,
Marek Marczykowski-Górecki, Tim Deegan, kai.huang,
Julien Grall, Jan Beulich, David Scott, Boqun Feng
From: Kai Huang <kai.huang@linux.intel.com>
Expose SGX in the CPU featureset for HVM domains. SGX will not be supported
for PV domains, as ENCLS (which the SGX driver in the guest essentially
runs) must run in ring 0, while a PV kernel runs in ring 3. Theoretically we
could support SGX in PV domains by either emulating the #GP caused by ENCLS
running in ring 3, or by a PV ENCLS interface, but that is really not
necessary at this stage.
SGX Launch Control is also exposed in the CPU featureset for HVM domains.
SGX Launch Control depends on SGX.
Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Signed-off-by: Boqun Feng <boqun.feng@intel.com>
---
xen/include/public/arch-x86/cpufeatureset.h | 3 ++-
xen/tools/gen-cpuid.py | 3 +++
2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
index be6da8eaf17c..1f8510eebb1d 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -193,7 +193,7 @@ XEN_CPUFEATURE(XSAVES, 4*32+ 3) /*S XSAVES/XRSTORS instructions */
/* Intel-defined CPU features, CPUID level 0x00000007:0.ebx, word 5 */
XEN_CPUFEATURE(FSGSBASE, 5*32+ 0) /*A {RD,WR}{FS,GS}BASE instructions */
XEN_CPUFEATURE(TSC_ADJUST, 5*32+ 1) /*S TSC_ADJUST MSR available */
-XEN_CPUFEATURE(SGX, 5*32+ 2) /* Software Guard extensions */
+XEN_CPUFEATURE(SGX, 5*32+ 2) /*H Intel Software Guard extensions */
XEN_CPUFEATURE(BMI1, 5*32+ 3) /*A 1st bit manipulation extensions */
XEN_CPUFEATURE(HLE, 5*32+ 4) /*A Hardware Lock Elision */
XEN_CPUFEATURE(AVX2, 5*32+ 5) /*A AVX2 instructions */
@@ -230,6 +230,7 @@ XEN_CPUFEATURE(PKU, 6*32+ 3) /*H Protection Keys for Userspace */
XEN_CPUFEATURE(OSPKE, 6*32+ 4) /*! OS Protection Keys Enable */
XEN_CPUFEATURE(AVX512_VPOPCNTDQ, 6*32+14) /*A POPCNT for vectors of DW/QW */
XEN_CPUFEATURE(RDPID, 6*32+22) /*A RDPID instruction */
+XEN_CPUFEATURE(SGX_LC, 6*32+30) /*H Intel SGX Launch Control */
/* AMD-defined CPU features, CPUID level 0x80000007.edx, word 7 */
XEN_CPUFEATURE(ITSC, 7*32+ 8) /* Invariant TSC */
diff --git a/xen/tools/gen-cpuid.py b/xen/tools/gen-cpuid.py
index 9ec4486f2b4b..4fef21203086 100755
--- a/xen/tools/gen-cpuid.py
+++ b/xen/tools/gen-cpuid.py
@@ -256,6 +256,9 @@ def crunch_numbers(state):
AVX512F: [AVX512DQ, AVX512IFMA, AVX512PF, AVX512ER, AVX512CD,
AVX512BW, AVX512VL, AVX512VBMI, AVX512_4VNNIW,
AVX512_4FMAPS, AVX512_VPOPCNTDQ],
+
+ # SGX Launch Control depends on SGX
+ SGX: [SGX_LC],
}
deep_features = tuple(sorted(deps.keys()))
--
2.15.0
* [PATCH v2 02/17] xen: x86: add early stage SGX feature detection
2017-12-04 0:15 [RFC PATCH v2 00/17] RFC: SGX Virtualization design and draft patches Boqun Feng
2017-12-04 0:15 ` [PATCH v2 01/17] xen: x86: expose SGX to HVM domain in CPU featureset Boqun Feng
@ 2017-12-04 0:15 ` Boqun Feng
2017-12-04 0:15 ` [PATCH v2 03/17] xen: vmx: detect ENCLS VMEXIT Boqun Feng
` (15 subsequent siblings)
17 siblings, 0 replies; 23+ messages in thread
From: Boqun Feng @ 2017-12-04 0:15 UTC (permalink / raw)
To: xen-devel
Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
George Dunlap, Andrew Cooper, Ian Jackson,
Marek Marczykowski-Górecki, Tim Deegan, kai.huang,
Julien Grall, Jan Beulich, David Scott, Boqun Feng
From: Kai Huang <kai.huang@linux.intel.com>
This patch adds early stage SGX feature detection via SGX CPUID 0x12.
Function detect_sgx is added to detect SGX info on each CPU (called from
identify_cpu). The SDM says the SGX info returned by CPUID is per-thread,
and we cannot assume all threads will return the same SGX info, so we have
to detect SGX for each CPU. For simplicity, currently SGX is only supported
when all CPUs report the same SGX info.
Besides, a boot parameter 'sgx' is added to allow the sysadmin to control
whether SGX is exposed to guests.
The SDM also says it's possible to have multiple EPC sections, but this is
only for multi-socket servers, which we don't support now (there are other
things that need to be done as well, e.g. NUMA EPC, scheduling, etc), so
currently only one EPC section is supported.
The detection result is in the X86_FEATURE_SGX bit of 'boot_cpu_data',
and 'cpu_has_sgx' should be the only way to query whether SGX support is
enabled in the whole system.
Dedicated files sgx.c and sgx.h are added for the bulk of the above SGX
detection code, and for further SGX code as well.
Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Signed-off-by: Boqun Feng <boqun.feng@intel.com>
---
docs/misc/xen-command-line.markdown | 8 ++
xen/arch/x86/Makefile | 1 +
xen/arch/x86/cpu/common.c | 15 +++
xen/arch/x86/sgx.c | 191 ++++++++++++++++++++++++++++++++++++
xen/include/asm-x86/cpufeature.h | 1 +
xen/include/asm-x86/msr-index.h | 1 +
xen/include/asm-x86/sgx.h | 61 ++++++++++++
7 files changed, 278 insertions(+)
create mode 100644 xen/arch/x86/sgx.c
create mode 100644 xen/include/asm-x86/sgx.h
diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 781110d4b2a5..81f9936face2 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1601,6 +1601,14 @@ hypervisors handle SErrors:
All SErrors will crash the whole system. This option will avoid all overhead
of the dsb/isb pairs.
+### sgx (Intel)
+> = <boolean>
+
+> Default: false
+
+Flag to enable Software Guard Extensions support
+for guest.
+
### smap
> `= <boolean> | hvm`
diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index d5d58a205ec8..c8a843fef540 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -54,6 +54,7 @@ obj-y += platform_hypercall.o x86_64/platform_hypercall.o
obj-y += psr.o
obj-y += setup.o
obj-y += shutdown.o
+obj-y += sgx.o
obj-y += smp.o
obj-y += smpboot.o
obj-y += srat.o
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 6cf362849e85..0a93d5759a76 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -11,6 +11,7 @@
#include <asm/apic.h>
#include <mach_apic.h>
#include <asm/setup.h>
+#include <asm/sgx.h>
#include <public/sysctl.h> /* for XEN_INVALID_{SOCKET,CORE}_ID */
#include "cpu.h"
@@ -430,14 +431,28 @@ void identify_cpu(struct cpuinfo_x86 *c)
* executed, c == &boot_cpu_data.
*/
if ( c != &boot_cpu_data ) {
+ struct sgx_cpuinfo tmp;
/* AND the already accumulated flags with these */
for ( i = 0 ; i < NCAPINTS ; i++ )
boot_cpu_data.x86_capability[i] &= c->x86_capability[i];
mcheck_init(c, false);
+ /*
+ * Check SGX CPUID info all for all CPUs, and only support SGX when all
+ * CPUs report the same SGX info. SDM (37.7.2 Intel SGX Resource
+ * Enumeration Leaves) says "software should not assume that if Intel
+ * SGX instructions are supported on one hardware thread, they are also
+ * supported elsewhere.". For simplicity, we only support SGX when all
+ * CPUs reports consistent SGX info.
+ */
+ detect_sgx(&tmp);
+ if ( memcmp(&tmp, &boot_sgx_cpudata, sizeof(tmp)) )
+ disable_sgx();
} else {
mcheck_init(c, true);
+ detect_sgx(&boot_sgx_cpudata);
+
mtrr_bp_init();
}
}
diff --git a/xen/arch/x86/sgx.c b/xen/arch/x86/sgx.c
new file mode 100644
index 000000000000..ead917543f3e
--- /dev/null
+++ b/xen/arch/x86/sgx.c
@@ -0,0 +1,191 @@
+/*
+ * Intel Software Guard Extensions support
+ *
+ * Copyright (c) 2017, Intel Corporation
+ *
+ * Author: Kai Huang <kai.huang@linux.intel.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/sched.h>
+#include <asm/cpufeature.h>
+#include <asm/msr-index.h>
+#include <asm/msr.h>
+#include <asm/sgx.h>
+
+struct sgx_cpuinfo __read_mostly boot_sgx_cpudata;
+
+static bool __read_mostly opt_sgx_enabled = false;
+boolean_param("sgx", opt_sgx_enabled);
+
+static void __detect_sgx(struct sgx_cpuinfo *sgxinfo)
+{
+ u32 eax, ebx, ecx, edx;
+ uint64_t val;
+ uint64_t sgx_enabled = IA32_FEATURE_CONTROL_SGX_ENABLE |
+ IA32_FEATURE_CONTROL_LOCK;
+ int cpu = smp_processor_id();
+
+ memset(sgxinfo, 0, sizeof(*sgxinfo));
+
+ /*
+ * In reality if SGX is not enabled in BIOS, SGX CPUID should report
+ * invalid SGX info, but we do the check anyway to make sure.
+ */
+ rdmsrl(MSR_IA32_FEATURE_CONTROL, val);
+
+ if ( (val & sgx_enabled) != sgx_enabled )
+ {
+ printk("CPU%d: SGX disabled in BIOS.\n", cpu);
+ goto not_supported;
+ }
+
+ sgxinfo->lewr = !!(val & IA32_FEATURE_CONTROL_SGX_LE_WR);
+
+ /*
+ * CPUID.0x12.0x0:
+ *
+ * EAX [0]: whether SGX1 is supported.
+ * [1]: whether SGX2 is supported.
+ * EBX [31:0]: miscselect
+ * ECX [31:0]: reserved
+ * EDX [7:0]: MaxEnclaveSize_Not64
+ * [15:8]: MaxEnclaveSize_64
+ */
+ cpuid_count(SGX_CPUID, 0x0, &eax, &ebx, &ecx, &edx);
+ sgxinfo->cap = eax & (SGX_CAP_SGX1 | SGX_CAP_SGX2);
+ sgxinfo->miscselect = ebx;
+ sgxinfo->max_enclave_size32 = edx & 0xff;
+ sgxinfo->max_enclave_size64 = (edx & 0xff00) >> 8;
+
+ if ( !(eax & SGX_CAP_SGX1) )
+ {
+ /* We may reach here if BIOS doesn't enable SGX */
+ printk("CPU%d: CPUID.0x12.0x0 reports not SGX support.\n", cpu);
+ goto not_supported;
+ }
+
+ /*
+ * CPUID.0x12.0x1:
+ *
+ * EAX [31:0]: bitmask of 1-setting of SECS.ATTRIBUTES[31:0]
+ * EBX [31:0]: bitmask of 1-setting of SECS.ATTRIBUTES[63:32]
+ * ECX [31:0]: bitmask of 1-setting of SECS.ATTRIBUTES[95:64]
+ * EDX [31:0]: bitmask of 1-setting of SECS.ATTRIBUTES[127:96]
+ */
+ cpuid_count(SGX_CPUID, 0x1, &eax, &ebx, &ecx, &edx);
+ sgxinfo->secs_attr_bitmask[0] = eax;
+ sgxinfo->secs_attr_bitmask[1] = ebx;
+ sgxinfo->secs_attr_bitmask[2] = ecx;
+ sgxinfo->secs_attr_bitmask[3] = edx;
+
+ /*
+ * CPUID.0x12.0x2:
+ *
+ * EAX [3:0]: 0000: this sub-leaf is invalid
+ * 0001: this sub-leaf enumerates EPC resource
+ * [11:4]: reserved
+ * [31:12]: bits 31:12 of physical address of EPC base (when
+ * EAX[3:0] is 0001, which applies to following)
+ * EBX [19:0]: bits 51:32 of physical address of EPC base
+ * [31:20]: reserved
+ * ECX [3:0]: 0000: EDX:ECX are 0
+ * 0001: this is EPC section.
+ * [11:4]: reserved
+ * [31:12]: bits 31:12 of EPC size
+ * EDX [19:0]: bits 51:32 of EPC size
+ * [31:20]: reserved
+ *
+ * TODO: So far assume there's only one EPC resource.
+ */
+ cpuid_count(SGX_CPUID, 0x2, &eax, &ebx, &ecx, &edx);
+ if ( !(eax & 0x1) || !(ecx & 0x1) )
+ {
+ /* We may reach here if BIOS doesn't enable SGX */
+ printk("CPU%d: CPUID.0x12.0x2 reports invalid EPC resource.\n", cpu);
+ goto not_supported;
+ }
+ sgxinfo->epc_base = (((u64)(ebx & 0xfffff)) << 32) | (eax & 0xfffff000);
+ sgxinfo->epc_size = (((u64)(edx & 0xfffff)) << 32) | (ecx & 0xfffff000);
+
+ return;
+
+not_supported:
+ memset(sgxinfo, 0, sizeof(*sgxinfo));
+ disable_sgx();
+}
+
+void detect_sgx(struct sgx_cpuinfo *sgxinfo)
+{
+ if ( !opt_sgx_enabled )
+ {
+ setup_clear_cpu_cap(X86_FEATURE_SGX);
+ return;
+ }
+ else if ( sgxinfo != &boot_sgx_cpudata &&
+ ( !cpu_has_sgx || boot_cpu_data.cpuid_level < SGX_CPUID ))
+ {
+ setup_clear_cpu_cap(X86_FEATURE_SGX);
+ return;
+ }
+
+ __detect_sgx(sgxinfo);
+}
+
+void disable_sgx(void)
+{
+ /*
+ * X86_FEATURE_SGX is cleared in boot_cpu_data so that cpu_has_sgx
+ * can be used anywhere to check whether SGX is supported by Xen.
+ *
+ * FIXME: also adjust boot_cpu_data.cpuid_level ?
+ */
+ setup_clear_cpu_cap(X86_FEATURE_SGX);
+ opt_sgx_enabled = false;
+}
+
+static void __init print_sgx_cpuinfo(struct sgx_cpuinfo *sgxinfo)
+{
+ printk("SGX: \n"
+ "\tCAP: %s,%s\n"
+ "\tEPC: [0x%"PRIx64", 0x%"PRIx64")\n",
+ boot_sgx_cpudata.cap & SGX_CAP_SGX1 ? "SGX1" : "",
+ boot_sgx_cpudata.cap & SGX_CAP_SGX2 ? "SGX2" : "",
+ boot_sgx_cpudata.epc_base,
+ boot_sgx_cpudata.epc_base + boot_sgx_cpudata.epc_size);
+}
+
+static int __init sgx_init(void)
+{
+ if ( !cpu_has_sgx )
+ goto not_supported;
+
+ print_sgx_cpuinfo(&boot_sgx_cpudata);
+
+ return 0;
+not_supported:
+ disable_sgx();
+ return -EINVAL;
+}
+__initcall(sgx_init);
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index 84cc51d2bdc8..9793f8c1c586 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -85,6 +85,7 @@
/* CPUID level 0x00000007:0.ebx */
#define cpu_has_fsgsbase boot_cpu_has(X86_FEATURE_FSGSBASE)
+#define cpu_has_sgx boot_cpu_has(X86_FEATURE_SGX)
#define cpu_has_bmi1 boot_cpu_has(X86_FEATURE_BMI1)
#define cpu_has_hle boot_cpu_has(X86_FEATURE_HLE)
#define cpu_has_avx2 boot_cpu_has(X86_FEATURE_AVX2)
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index b99c623367b8..63e11931cd09 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -298,6 +298,7 @@
#define IA32_FEATURE_CONTROL_ENABLE_SENTER 0x8000
#define IA32_FEATURE_CONTROL_SGX_ENABLE 0x40000
#define IA32_FEATURE_CONTROL_LMCE_ON 0x100000
+#define IA32_FEATURE_CONTROL_SGX_LE_WR 0x20000
#define MSR_IA32_TSC_ADJUST 0x0000003b
diff --git a/xen/include/asm-x86/sgx.h b/xen/include/asm-x86/sgx.h
new file mode 100644
index 000000000000..b37ebde64e84
--- /dev/null
+++ b/xen/include/asm-x86/sgx.h
@@ -0,0 +1,61 @@
+/*
+ * Intel Software Guard Extensions support
+ *
+ * Copyright (c) 2016-2017, Intel Corporation.
+ *
+ * Author: Kai Huang <kai.huang@linux.intel.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_X86_SGX_H__
+#define __ASM_X86_SGX_H__
+
+#include <xen/config.h>
+#include <xen/types.h>
+#include <xen/init.h>
+#include <asm/processor.h>
+
+#define SGX_CPUID 0x12
+
+/*
+ * SGX info reported by SGX CPUID.
+ *
+ * TODO:
+ *
+ * SDM (37.7.2 Intel SGX Resource Enumeration Leaves) actually says it's
+ * possible there are multiple EPC resources on the machine (CPUID.0x12,
+ * ECX starting with 0x2 enumerates available EPC resources until invalid
+ * EPC resource is returned). But this is only for multiple socket server,
+ * which we current don't support now (there are additional things need to
+ * be done as well). So far for simplicity we assume there is only one EPC.
+ */
+struct sgx_cpuinfo {
+#define SGX_CAP_SGX1 (1UL << 0)
+#define SGX_CAP_SGX2 (1UL << 1)
+ uint32_t cap;
+ uint32_t miscselect;
+ uint8_t max_enclave_size64;
+ uint8_t max_enclave_size32;
+ uint32_t secs_attr_bitmask[4];
+ uint64_t epc_base;
+ uint64_t epc_size;
+ bool lewr;
+};
+
+extern struct sgx_cpuinfo __read_mostly boot_sgx_cpudata;
+/* Detect SGX info for particular CPU via SGX CPUID */
+void detect_sgx(struct sgx_cpuinfo *sgxinfo);
+void disable_sgx(void);
+#define sgx_lewr() (boot_sgx_cpudata.lewr)
+
+#endif /* __ASM_X86_SGX_H__ */
--
2.15.0
* [PATCH v2 03/17] xen: vmx: detect ENCLS VMEXIT
2017-12-04 0:15 [RFC PATCH v2 00/17] RFC: SGX Virtualization design and draft patches Boqun Feng
2017-12-04 0:15 ` [PATCH v2 01/17] xen: x86: expose SGX to HVM domain in CPU featureset Boqun Feng
2017-12-04 0:15 ` [PATCH v2 02/17] xen: x86: add early stage SGX feature detection Boqun Feng
@ 2017-12-04 0:15 ` Boqun Feng
2017-12-04 0:15 ` [PATCH v2 04/17] xen: x86/mm: introduce ioremap_wb() Boqun Feng
` (14 subsequent siblings)
17 siblings, 0 replies; 23+ messages in thread
From: Boqun Feng @ 2017-12-04 0:15 UTC (permalink / raw)
To: xen-devel
Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
George Dunlap, Andrew Cooper, Ian Jackson,
Marek Marczykowski-Górecki, Tim Deegan, kai.huang,
Julien Grall, Jan Beulich, David Scott, Boqun Feng
From: Kai Huang <kai.huang@linux.intel.com>
If ENCLS VMEXIT is not present then we cannot support SGX virtualization.
This patch detects the presence of ENCLS VMEXIT, and disables SGX if ENCLS
VMEXIT is not present.
Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Signed-off-by: Boqun Feng <boqun.feng@intel.com>
---
xen/arch/x86/hvm/vmx/vmcs.c | 16 +++++++++++++++-
xen/include/asm-x86/hvm/vmx/vmcs.h | 3 +++
2 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index b5100b50215a..dfcecc4fd1b0 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -40,6 +40,7 @@
#include <asm/shadow.h>
#include <asm/tboot.h>
#include <asm/apic.h>
+#include <asm/sgx.h>
static bool_t __read_mostly opt_vpid_enabled = 1;
boolean_param("vpid", opt_vpid_enabled);
@@ -143,6 +144,7 @@ static void __init vmx_display_features(void)
P(cpu_has_vmx_virt_exceptions, "Virtualisation Exceptions");
P(cpu_has_vmx_pml, "Page Modification Logging");
P(cpu_has_vmx_tsc_scaling, "TSC Scaling");
+ P(cpu_has_vmx_encls, "SGX ENCLS Exiting");
#undef P
if ( !printed )
@@ -238,7 +240,8 @@ static int vmx_init_vmcs_config(void)
SECONDARY_EXEC_ENABLE_VM_FUNCTIONS |
SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS |
SECONDARY_EXEC_XSAVES |
- SECONDARY_EXEC_TSC_SCALING);
+ SECONDARY_EXEC_TSC_SCALING |
+ SECONDARY_EXEC_ENABLE_ENCLS);
rdmsrl(MSR_IA32_VMX_MISC, _vmx_misc_cap);
if ( _vmx_misc_cap & VMX_MISC_VMWRITE_ALL )
opt |= SECONDARY_EXEC_ENABLE_VMCS_SHADOWING;
@@ -341,6 +344,14 @@ static int vmx_init_vmcs_config(void)
_vmx_secondary_exec_control &= ~ SECONDARY_EXEC_PAUSE_LOOP_EXITING;
}
+ /*
+ * Turn off SGX if ENCLS VMEXIT is not present. Actually on real machine,
+ * if SGX CPUID is present (CPUID.0x7.0x0:EBX.SGX = 1), then ENCLS VMEXIT
+ * will always be present. We do the check anyway here.
+ */
+ if ( !(_vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_ENCLS) )
+ disable_sgx();
+
min = VM_EXIT_ACK_INTR_ON_EXIT;
opt = VM_EXIT_SAVE_GUEST_PAT | VM_EXIT_LOAD_HOST_PAT |
VM_EXIT_CLEAR_BNDCFGS;
@@ -1136,6 +1147,9 @@ static int construct_vmcs(struct vcpu *v)
/* Disable PML anyway here as it will only be enabled in log dirty mode */
v->arch.hvm_vmx.secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_PML;
+ /* Disable ENCLS VMEXIT. It will only be turned on when needed. */
+ v->arch.hvm_vmx.secondary_exec_control &= ~SECONDARY_EXEC_ENABLE_ENCLS;
+
/* Host data selectors. */
__vmwrite(HOST_SS_SELECTOR, __HYPERVISOR_DS);
__vmwrite(HOST_DS_SELECTOR, __HYPERVISOR_DS);
diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h b/xen/include/asm-x86/hvm/vmx/vmcs.h
index 8fb9e3ceee4e..d0293b1a3620 100644
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -245,6 +245,7 @@ extern u32 vmx_vmentry_control;
#define SECONDARY_EXEC_ENABLE_INVPCID 0x00001000
#define SECONDARY_EXEC_ENABLE_VM_FUNCTIONS 0x00002000
#define SECONDARY_EXEC_ENABLE_VMCS_SHADOWING 0x00004000
+#define SECONDARY_EXEC_ENABLE_ENCLS 0x00008000
#define SECONDARY_EXEC_ENABLE_PML 0x00020000
#define SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS 0x00040000
#define SECONDARY_EXEC_XSAVES 0x00100000
@@ -325,6 +326,8 @@ extern u64 vmx_ept_vpid_cap;
(vmx_secondary_exec_control & SECONDARY_EXEC_XSAVES)
#define cpu_has_vmx_tsc_scaling \
(vmx_secondary_exec_control & SECONDARY_EXEC_TSC_SCALING)
+#define cpu_has_vmx_encls \
+ (vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_ENCLS)
#define VMCS_RID_TYPE_MASK 0x80000000
--
2.15.0
* [PATCH v2 04/17] xen: x86/mm: introduce ioremap_wb()
2017-12-04 0:15 [RFC PATCH v2 00/17] RFC: SGX Virtualization design and draft patches Boqun Feng
` (2 preceding siblings ...)
2017-12-04 0:15 ` [PATCH v2 03/17] xen: vmx: detect ENCLS VMEXIT Boqun Feng
@ 2017-12-04 0:15 ` Boqun Feng
2017-12-04 0:15 ` [PATCH v2 05/17] xen: p2m: new 'p2m_epc' type for EPC mapping Boqun Feng
` (13 subsequent siblings)
17 siblings, 0 replies; 23+ messages in thread
From: Boqun Feng @ 2017-12-04 0:15 UTC (permalink / raw)
To: xen-devel
Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
George Dunlap, Andrew Cooper, Ian Jackson,
Marek Marczykowski-Górecki, Tim Deegan, kai.huang,
Julien Grall, Jan Beulich, David Scott, Boqun Feng
From: Kai Huang <kai.huang@linux.intel.com>
Currently Xen only has a non-cacheable version of ioremap for x86.
Although EPC is reported as reserved memory in e820, it can be mapped
as cacheable. This patch introduces ioremap_wb() (ioremap for cacheable,
write-back memory).
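As an illustration only (this is not part of the patch; the real caller
appears in the EPC management patch later in the series), the new helper
is meant to be used roughly like:

    /* Sketch: map the whole EPC range cacheable/write-back instead of UC-. */
    void *epc_va = ioremap_wb(epc_base_maddr, epc_size);

    if ( !epc_va )
        return -EFAULT;   /* mapping failed, SGX will be disabled */

where epc_base_maddr and epc_size stand for the EPC base address and size
reported by CPUID leaf 0x12.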
Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Signed-off-by: Boqun Feng <boqun.feng@intel.com>
---
xen/arch/x86/mm.c | 9 +++++++--
xen/include/asm-x86/mm.h | 7 +++++++
2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 886a5ee327df..1111db1d1f40 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -5207,7 +5207,7 @@ void *__init arch_vmap_virt_end(void)
return (void *)fix_to_virt(__end_of_fixed_addresses);
}
-void __iomem *ioremap(paddr_t pa, size_t len)
+void __iomem *__ioremap(paddr_t pa, size_t len, unsigned int flags)
{
mfn_t mfn = _mfn(PFN_DOWN(pa));
void *va;
@@ -5222,12 +5222,17 @@ void __iomem *ioremap(paddr_t pa, size_t len)
unsigned int offs = pa & (PAGE_SIZE - 1);
unsigned int nr = PFN_UP(offs + len);
- va = __vmap(&mfn, nr, 1, 1, PAGE_HYPERVISOR_UCMINUS, VMAP_DEFAULT) + offs;
+ va = __vmap(&mfn, nr, 1, 1, flags, VMAP_DEFAULT) + offs;
}
return (void __force __iomem *)va;
}
+void __iomem *ioremap(paddr_t pa, size_t len)
+{
+ return __ioremap(pa, len, PAGE_HYPERVISOR_UCMINUS);
+}
+
int create_perdomain_mapping(struct domain *d, unsigned long va,
unsigned int nr, l1_pgentry_t **pl1tab,
struct page_info **ppg)
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index 83626085e0a6..77e3c3ba68d1 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -629,4 +629,11 @@ static inline bool arch_mfn_in_directmap(unsigned long mfn)
return mfn <= (virt_to_mfn(eva - 1) + 1);
}
+extern void __iomem *__ioremap(paddr_t, size_t, unsigned int);
+
+static inline void __iomem *ioremap_wb(paddr_t pa, size_t len)
+{
+ return __ioremap(pa, len, PAGE_HYPERVISOR);
+}
+
#endif /* __ASM_X86_MM_H__ */
--
2.15.0
* [PATCH v2 05/17] xen: p2m: new 'p2m_epc' type for EPC mapping
2017-12-04 0:15 [RFC PATCH v2 00/17] RFC: SGX Virtualization design and draft patches Boqun Feng
` (3 preceding siblings ...)
2017-12-04 0:15 ` [PATCH v2 04/17] xen: x86/mm: introduce ioremap_wb() Boqun Feng
@ 2017-12-04 0:15 ` Boqun Feng
2017-12-04 0:15 ` [PATCH v2 06/17] xen: mm: introduce non-scrubbable pages Boqun Feng
` (12 subsequent siblings)
17 siblings, 0 replies; 23+ messages in thread
From: Boqun Feng @ 2017-12-04 0:15 UTC (permalink / raw)
To: xen-devel
Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
George Dunlap, Andrew Cooper, Ian Jackson,
Marek Marczykowski-Górecki, Tim Deegan, kai.huang,
Julien Grall, Jan Beulich, David Scott, Boqun Feng
From: Kai Huang <kai.huang@linux.intel.com>
A new 'p2m_epc' type is added for EPC mappings. Two wrapper functions,
set_epc_p2m_entry and clear_epc_p2m_entry, are also added for later use.
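For illustration only (not part of this patch; the actual caller is the
later EPC population patch), the wrappers are intended to be used along
these lines:

    /* Sketch: install an EPC mapping at guest frame 'gfn'... */
    rc = set_epc_p2m_entry(d, gfn, _mfn(page_to_mfn(epg)));

    /* ...and tear it down again when the page is returned. */
    rc = clear_epc_p2m_entry(d, gfn, _mfn(page_to_mfn(epg)));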
Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
---
xen/arch/x86/mm/p2m-ept.c | 3 +++
xen/arch/x86/mm/p2m.c | 41 +++++++++++++++++++++++++++++++++++++++++
xen/include/asm-x86/p2m.h | 12 ++++++++++--
3 files changed, 54 insertions(+), 2 deletions(-)
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index b4996ce658ac..34c2e2f8ac1c 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -182,6 +182,9 @@ static void ept_p2m_type_to_flags(struct p2m_domain *p2m, ept_entry_t *entry,
entry->a = !!cpu_has_vmx_ept_ad;
entry->d = 0;
break;
+ case p2m_epc:
+ entry->r = entry->w = entry->x = 1;
+ break;
}
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index c72a3cdebb81..8eeafe4b250c 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1192,6 +1192,12 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn_l,
return ret;
}
+int set_epc_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
+{
+ return set_typed_p2m_entry(d, gfn, mfn, PAGE_ORDER_4K, p2m_epc,
+ p2m_get_hostp2m(d)->default_access);
+}
+
/*
* Returns:
* 0 for success
@@ -1278,6 +1284,41 @@ int clear_identity_p2m_entry(struct domain *d, unsigned long gfn_l)
return ret;
}
+int clear_epc_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
+{
+ struct p2m_domain *p2m = p2m_get_hostp2m(d);
+ mfn_t omfn;
+ p2m_type_t ot;
+ p2m_access_t oa;
+ int ret = 0;
+
+ gfn_lock(p2m, gfn, 0);
+
+ omfn = p2m->get_entry(p2m, _gfn(gfn), &ot, &oa, 0, NULL, NULL);
+ if ( mfn_eq(omfn, INVALID_MFN) || !p2m_is_epc(ot) )
+ {
+ printk(XENLOG_G_WARNING
+ "d%d: invalid EPC map to clear: gfn 0x%lx, type %d.\n",
+ d->domain_id, gfn, ot);
+ goto out;
+ }
+ if ( !mfn_eq(mfn, omfn) )
+ {
+ printk(XENLOG_G_WARNING
+ "d%d: mistaken EPC mfn to clear: gfn 0x%lx, "
+ "omfn 0x%lx, mfn 0x%lx.\n",
+ d->domain_id, gfn, mfn_x(omfn), mfn_x(mfn));
+ }
+
+ ret = p2m_set_entry(p2m, _gfn(gfn), INVALID_MFN, PAGE_ORDER_4K, p2m_invalid,
+ p2m->default_access);
+
+out:
+ gfn_unlock(p2m, gfn, 0);
+
+ return ret;
+}
+
/* Returns: 0 for success, -errno for failure */
int set_shared_p2m_entry(struct domain *d, unsigned long gfn_l, mfn_t mfn)
{
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 17b1d0c8d326..40a40dd54380 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -72,6 +72,7 @@ typedef enum {
p2m_ram_broken = 13, /* Broken page, access cause domain crash */
p2m_map_foreign = 14, /* ram pages from foreign domain */
p2m_ioreq_server = 15,
+ p2m_epc = 16, /* EPC */
} p2m_type_t;
/* Modifiers to the query */
@@ -142,10 +143,13 @@ typedef unsigned int p2m_query_t;
| p2m_to_mask(p2m_ram_logdirty) )
#define P2M_SHARED_TYPES (p2m_to_mask(p2m_ram_shared))
+#define P2M_EPC_TYPES (p2m_to_mask(p2m_epc))
+
/* Valid types not necessarily associated with a (valid) MFN. */
#define P2M_INVALID_MFN_TYPES (P2M_POD_TYPES \
| p2m_to_mask(p2m_mmio_direct) \
- | P2M_PAGING_TYPES)
+ | P2M_PAGING_TYPES \
+ | P2M_EPC_TYPES)
/* Broken type: the frame backing this pfn has failed in hardware
* and must not be touched. */
@@ -153,6 +157,7 @@ typedef unsigned int p2m_query_t;
/* Useful predicates */
#define p2m_is_ram(_t) (p2m_to_mask(_t) & P2M_RAM_TYPES)
+#define p2m_is_epc(_t) (p2m_to_mask(_t) & P2M_EPC_TYPES)
#define p2m_is_hole(_t) (p2m_to_mask(_t) & P2M_HOLE_TYPES)
#define p2m_is_mmio(_t) (p2m_to_mask(_t) & P2M_MMIO_TYPES)
#define p2m_is_readonly(_t) (p2m_to_mask(_t) & P2M_RO_TYPES)
@@ -163,7 +168,7 @@ typedef unsigned int p2m_query_t;
/* Grant types are *not* considered valid, because they can be
unmapped at any time and, unless you happen to be the shadow or p2m
implementations, there's no way of synchronising against that. */
-#define p2m_is_valid(_t) (p2m_to_mask(_t) & (P2M_RAM_TYPES | P2M_MMIO_TYPES))
+#define p2m_is_valid(_t) (p2m_to_mask(_t) & (P2M_RAM_TYPES | P2M_MMIO_TYPES | P2M_EPC_TYPES))
#define p2m_has_emt(_t) (p2m_to_mask(_t) & (P2M_RAM_TYPES | p2m_to_mask(p2m_mmio_direct)))
#define p2m_is_pageable(_t) (p2m_to_mask(_t) & P2M_PAGEABLE_TYPES)
#define p2m_is_paging(_t) (p2m_to_mask(_t) & P2M_PAGING_TYPES)
@@ -635,6 +640,9 @@ int clear_identity_p2m_entry(struct domain *d, unsigned long gfn);
int p2m_add_foreign(struct domain *tdom, unsigned long fgfn,
unsigned long gpfn, domid_t foreign_domid);
+int set_epc_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn);
+int clear_epc_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn);
+
/*
* Populate-on-demand
*/
--
2.15.0
* [PATCH v2 06/17] xen: mm: introduce non-scrubbable pages
2017-12-04 0:15 [RFC PATCH v2 00/17] RFC: SGX Virtualization design and draft patches Boqun Feng
` (4 preceding siblings ...)
2017-12-04 0:15 ` [PATCH v2 05/17] xen: p2m: new 'p2m_epc' type for EPC mapping Boqun Feng
@ 2017-12-04 0:15 ` Boqun Feng
2017-12-04 0:15 ` [PATCH v2 07/17] xen: mm: manage EPC pages in Xen heaps Boqun Feng
` (11 subsequent siblings)
17 siblings, 0 replies; 23+ messages in thread
From: Boqun Feng @ 2017-12-04 0:15 UTC (permalink / raw)
To: xen-devel
Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
George Dunlap, Andrew Cooper, Ian Jackson,
Marek Marczykowski-Górecki, Tim Deegan, kai.huang,
Julien Grall, Jan Beulich, David Scott, Boqun Feng
We are about to use the existing heap allocator for EPC page management,
and we need to prevent EPC pages from being scrubbed or merged with
normal memory pages, because EPC pages cannot be accessed outside
enclaves.
To do so, we use one bit in 'page_info::u::free' to record whether a
page can be scrubbed. 'page_scrubbable' is introduced to test this bit;
for now it always returns 'true' on architectures that have no
unscrubbable pages such as EPC (i.e. ARM).
Besides, during the page merging stage, we cannot allow scrubbable and
unscrubbable pages to be merged, so 'page_mergeable' is introduced,
which simply tests whether two pages have the same scrubbable
attribute.
In 'scrub_one_page', scrubbing is aborted once the page is found to be
unscrubbable.
Signed-off-by: Boqun Feng <boqun.feng@intel.com>
---
xen/common/page_alloc.c | 10 +++++++---
xen/include/asm-arm/mm.h | 7 +++++++
xen/include/asm-x86/mm.h | 7 +++++++
3 files changed, 21 insertions(+), 3 deletions(-)
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 5616a8226376..220d7d91c62b 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -1364,6 +1364,8 @@ static void free_heap_pages(
if ( pg[i].u.free.need_tlbflush )
page_set_tlbflush_timestamp(&pg[i]);
+ pg[i].u.free.scrubbable = true;
+
/* This page is not a guest frame any more. */
page_set_owner(&pg[i], NULL); /* set_gpfn_from_mfn snoops pg owner */
set_gpfn_from_mfn(mfn + i, INVALID_M2P_ENTRY);
@@ -1402,7 +1404,8 @@ static void free_heap_pages(
if ( !mfn_valid(_mfn(page_to_mfn(predecessor))) ||
!page_state_is(predecessor, free) ||
(PFN_ORDER(predecessor) != order) ||
- (phys_to_nid(page_to_maddr(predecessor)) != node) )
+ (phys_to_nid(page_to_maddr(predecessor)) != node) ||
+ !page_mergeable(predecessor, pg) )
break;
check_and_stop_scrub(predecessor);
@@ -1425,7 +1428,8 @@ static void free_heap_pages(
if ( !mfn_valid(_mfn(page_to_mfn(successor))) ||
!page_state_is(successor, free) ||
(PFN_ORDER(successor) != order) ||
- (phys_to_nid(page_to_maddr(successor)) != node) )
+ (phys_to_nid(page_to_maddr(successor)) != node) ||
+ !page_mergeable(successor, pg) )
break;
check_and_stop_scrub(successor);
@@ -2379,7 +2383,7 @@ __initcall(pagealloc_keyhandler_init);
void scrub_one_page(struct page_info *pg)
{
- if ( unlikely(pg->count_info & PGC_broken) )
+ if ( !page_scrubbable(pg) || unlikely(pg->count_info & PGC_broken) )
return;
#ifndef NDEBUG
diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
index ad2f2a43dcbc..c715e2290510 100644
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -55,6 +55,9 @@ struct page_info
/* Do TLBs need flushing for safety before next page use? */
bool need_tlbflush:1;
+ /* Could this page be scrubbed when it's free? */
+ bool scrubbable:1;
+
#define BUDDY_NOT_SCRUBBING 0
#define BUDDY_SCRUBBING 1
#define BUDDY_SCRUB_ABORT 2
@@ -150,6 +153,10 @@ extern vaddr_t xenheap_virt_start;
(mfn_valid(_mfn(mfn)) && is_xen_heap_page(__mfn_to_page(mfn)))
#endif
+#define page_scrubbable(_p) true
+
+#define page_mergeable(_p1, _p2) true
+
#define is_xen_fixed_mfn(mfn) \
((pfn_to_paddr(mfn) >= virt_to_maddr(&_start)) && \
(pfn_to_paddr(mfn) <= virt_to_maddr(&_end)))
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index 77e3c3ba68d1..b0f0ea0a8b5d 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -98,6 +98,8 @@ struct page_info
/* Do TLBs need flushing for safety before next page use? */
bool need_tlbflush;
+ /* Could this page be scrubbed when it's free? */
+ bool scrubbable;
#define BUDDY_NOT_SCRUBBING 0
#define BUDDY_SCRUBBING 1
@@ -283,6 +285,11 @@ struct page_info
/* OOS fixup entries */
#define SHADOW_OOS_FIXUPS 2
+#define page_scrubbable(_p) ((_p)->u.free.scrubbable)
+
+#define page_mergeable(_p1, _p2) \
+ (page_scrubbable(_p1) == page_scrubbable(_p2))
+
#define page_get_owner(_p) \
((struct domain *)((_p)->v.inuse._domain ? \
pdx_to_virt((_p)->v.inuse._domain) : NULL))
--
2.15.0
* [PATCH v2 07/17] xen: mm: manage EPC pages in Xen heaps
2017-12-04 0:15 [RFC PATCH v2 00/17] RFC: SGX Virtualization design and draft patches Boqun Feng
` (5 preceding siblings ...)
2017-12-04 0:15 ` [PATCH v2 06/17] xen: mm: introduce non-scrubbable pages Boqun Feng
@ 2017-12-04 0:15 ` Boqun Feng
2017-12-04 0:15 ` [PATCH v2 08/17] xen: x86/mm: add SGX EPC management Boqun Feng
` (10 subsequent siblings)
17 siblings, 0 replies; 23+ messages in thread
From: Boqun Feng @ 2017-12-04 0:15 UTC (permalink / raw)
To: xen-devel
Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
George Dunlap, Andrew Cooper, Ian Jackson,
Marek Marczykowski-Górecki, Tim Deegan, kai.huang,
Julien Grall, Jan Beulich, David Scott, Boqun Feng
EPC is a limited resource reserved by the BIOS, and is reported as
reserved memory in e820 rather than as normal memory. EPC must be
managed in 4K pages, and cannot be accessed outside of enclaves.
Using the existing memory allocation API (i.e. the heaps) allows us to
manage EPC pages in an efficient way, and may benefit an EPC ballooning
implementation in the future.
In order to use the existing heap mechanism to manage EPC pages, a
dedicated MEMZONE is required, because we need to avoid mixing EPC
pages and normal pages in one zone. And for page_to_zone() to return
the proper zone number, 'PGC_epc' and 'is_epc_page' are introduced,
similar to 'PGC_xen_heap' and 'is_xen_heap_page'.
In 'free_heap_pages', 'need_scrub' is cleared if the page is found to
be an EPC page, because EPC pages cannot be scrubbed. There are no
entries for EPC pages in the M2P table, as it is not used for them, so
the related setup is skipped.
Besides, a 'MEMF_epc' memflag is introduced to tell the allocator to
return EPC pages rather than normal memory.
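For illustration (a sketch, not part of this patch; thin wrappers around
this are added in a later patch), EPC allocation and freeing then go
through the ordinary domheap interfaces:

    /* Sketch: ask the allocator for an EPC page instead of normal RAM. */
    struct page_info *epg = alloc_domheap_page(NULL, MEMF_epc);

    if ( epg )
        free_domheap_page(epg);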
Signed-off-by: Boqun Feng <boqun.feng@intel.com>
---
xen/common/page_alloc.c | 31 +++++++++++++++++++++++++------
xen/include/asm-arm/mm.h | 2 ++
xen/include/asm-x86/mm.h | 5 ++++-
xen/include/xen/mm.h | 2 ++
4 files changed, 33 insertions(+), 7 deletions(-)
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 220d7d91c62b..3b9d2c1a534f 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -377,12 +377,14 @@ mfn_t __init alloc_boot_pages(unsigned long nr_pfns, unsigned long pfn_align)
* BINARY BUDDY ALLOCATOR
*/
-#define MEMZONE_XEN 0
+#define MEMZONE_EPC 0
+#define MEMZONE_XEN 1
#define NR_ZONES (PADDR_BITS - PAGE_SHIFT + 1)
#define bits_to_zone(b) (((b) < (PAGE_SHIFT + 1)) ? 1 : ((b) - PAGE_SHIFT))
-#define page_to_zone(pg) (is_xen_heap_page(pg) ? MEMZONE_XEN : \
- (flsl(page_to_mfn(pg)) ? : 1))
+#define page_to_zone(pg) (is_epc_page(pg) ? MEMZONE_EPC : \
+ is_xen_heap_page(pg) ? MEMZONE_XEN : \
+ (flsl(page_to_mfn(pg)) ? : MEMZONE_XEN + 1))
typedef struct page_list_head heap_by_zone_and_order_t[NR_ZONES][MAX_ORDER+1];
static heap_by_zone_and_order_t *_heap[MAX_NUMNODES];
@@ -921,7 +923,12 @@ static struct page_info *alloc_heap_pages(
}
node = phys_to_nid(page_to_maddr(pg));
- zone = page_to_zone(pg);
+
+ if ( memflags & MEMF_epc )
+ zone = MEMZONE_EPC;
+ else
+ zone = page_to_zone(pg);
+
buddy_order = PFN_ORDER(pg);
first_dirty = pg->u.free.first_dirty;
@@ -1332,10 +1339,14 @@ static void free_heap_pages(
unsigned long mask, mfn = page_to_mfn(pg);
unsigned int i, node = phys_to_nid(page_to_maddr(pg)), tainted = 0;
unsigned int zone = page_to_zone(pg);
+ bool is_epc = false;
ASSERT(order <= MAX_ORDER);
ASSERT(node >= 0);
+ is_epc = is_epc_page(pg);
+ need_scrub = need_scrub && !is_epc;
+
spin_lock(&heap_lock);
for ( i = 0; i < (1 << order); i++ )
@@ -1364,11 +1375,13 @@ static void free_heap_pages(
if ( pg[i].u.free.need_tlbflush )
page_set_tlbflush_timestamp(&pg[i]);
- pg[i].u.free.scrubbable = true;
+ pg[i].u.free.scrubbable = !is_epc;
/* This page is not a guest frame any more. */
page_set_owner(&pg[i], NULL); /* set_gpfn_from_mfn snoops pg owner */
- set_gpfn_from_mfn(mfn + i, INVALID_M2P_ENTRY);
+
+ if ( !is_epc )
+ set_gpfn_from_mfn(mfn + i, INVALID_M2P_ENTRY);
if ( need_scrub )
{
@@ -2232,6 +2245,12 @@ struct page_info *alloc_domheap_pages(
if ( memflags & MEMF_no_owner )
memflags |= MEMF_no_refcount;
+ /* MEMF_epc implies MEMF_no_scrub */
+ if ((memflags & MEMF_epc) &&
+ !(pg = alloc_heap_pages(MEMZONE_EPC, MEMZONE_EPC, order,
+ memflags | MEMF_no_scrub, d)))
+ return NULL;
+
if ( dma_bitsize && ((dma_zone = bits_to_zone(dma_bitsize)) < zone_hi) )
pg = alloc_heap_pages(dma_zone + 1, zone_hi, order, memflags, d);
diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
index c715e2290510..bca26f027402 100644
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -153,6 +153,8 @@ extern vaddr_t xenheap_virt_start;
(mfn_valid(_mfn(mfn)) && is_xen_heap_page(__mfn_to_page(mfn)))
#endif
+#define is_epc_page(page) false
+
#define page_scrubbable(_p) true
#define page_mergeable(_p1, _p2) true
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index b0f0ea0a8b5d..1dedb8099801 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -259,8 +259,10 @@ struct page_info
#define PGC_state_free PG_mask(3, 9)
#define page_state_is(pg, st) (((pg)->count_info&PGC_state) == PGC_state_##st)
+#define _PGC_epc PG_shift(10)
+#define PGC_epc PG_mask(1, 10)
/* Count of references to this frame. */
-#define PGC_count_width PG_shift(9)
+#define PGC_count_width PG_shift(10)
#define PGC_count_mask ((1UL<<PGC_count_width)-1)
/*
@@ -271,6 +273,7 @@ struct page_info
#define PGC_need_scrub PGC_allocated
#define is_xen_heap_page(page) ((page)->count_info & PGC_xen_heap)
+#define is_epc_page(page) ((page)->count_info & PGC_epc)
#define is_xen_heap_mfn(mfn) \
(__mfn_valid(mfn) && is_xen_heap_page(__mfn_to_page(mfn)))
#define is_xen_fixed_mfn(mfn) \
diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
index e813c07b225c..721a2975c1d4 100644
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -250,6 +250,8 @@ struct npfec {
#define MEMF_no_icache_flush (1U<<_MEMF_no_icache_flush)
#define _MEMF_no_scrub 8
#define MEMF_no_scrub (1U<<_MEMF_no_scrub)
+#define _MEMF_epc 9
+#define MEMF_epc (1U<<_MEMF_epc)
#define _MEMF_node 16
#define MEMF_node_mask ((1U << (8 * sizeof(nodeid_t))) - 1)
#define MEMF_node(n) ((((n) + 1) & MEMF_node_mask) << _MEMF_node)
--
2.15.0
* [PATCH v2 08/17] xen: x86/mm: add SGX EPC management
2017-12-04 0:15 [RFC PATCH v2 00/17] RFC: SGX Virtualization design and draft patches Boqun Feng
` (6 preceding siblings ...)
2017-12-04 0:15 ` [PATCH v2 07/17] xen: mm: manage EPC pages in Xen heaps Boqun Feng
@ 2017-12-04 0:15 ` Boqun Feng
2017-12-04 0:15 ` [PATCH v2 09/17] xen: x86: add functions to populate and destroy EPC for domain Boqun Feng
` (9 subsequent siblings)
17 siblings, 0 replies; 23+ messages in thread
From: Boqun Feng @ 2017-12-04 0:15 UTC (permalink / raw)
To: xen-devel
Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
George Dunlap, Andrew Cooper, Ian Jackson,
Marek Marczykowski-Górecki, Tim Deegan, kai.huang,
Julien Grall, Jan Beulich, David Scott, Boqun Feng
Now that the heap allocator supports EPC pages, managing EPC pages is
simply a matter of putting them into the heap at boot, provided SGX is
supported and the EPC section is reported consistently. Allocation and
reclamation are just heap allocation and reclamation with MEMF_epc.
One more thing we need to do is to populate the portion of the
'frame_table' covering the EPC pages and set up the mapping properly.
SGX is disabled if EPC initialization finds any problem.
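The boot-time flow this patch implements can be summarised as follows
(an illustrative outline only; the real code is in the hunks below):

    /*
     * Sketch of sgx_init_epc():
     *   1. Map the EPC range write-back:  epc_base_vaddr = ioremap_wb(...);
     *   2. Extend frame_table over EPC:   init_epc_frametable(epc_base_mfn, n);
     *   3. Tag every EPC page:            pg->count_info |= PGC_epc;
     *   4. Hand the range to the heap:    init_domheap_pages(base, end);
     */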
Signed-off-by: Boqun Feng <boqun.feng@intel.com>
---
xen/arch/x86/sgx.c | 161 ++++++++++++++++++++++++++++++++++++++++++++++
xen/include/asm-x86/sgx.h | 3 +
2 files changed, 164 insertions(+)
diff --git a/xen/arch/x86/sgx.c b/xen/arch/x86/sgx.c
index ead917543f3e..9409b041e4f7 100644
--- a/xen/arch/x86/sgx.c
+++ b/xen/arch/x86/sgx.c
@@ -22,6 +22,8 @@
#include <asm/cpufeature.h>
#include <asm/msr-index.h>
#include <asm/msr.h>
+#include <xen/errno.h>
+#include <xen/mm.h>
#include <asm/sgx.h>
struct sgx_cpuinfo __read_mostly boot_sgx_cpudata;
@@ -29,6 +31,13 @@ struct sgx_cpuinfo __read_mostly boot_sgx_cpudata;
static bool __read_mostly opt_sgx_enabled = false;
boolean_param("sgx", opt_sgx_enabled);
+#define total_epc_npages (boot_sgx_cpudata.epc_size >> PAGE_SHIFT)
+#define epc_base_mfn (boot_sgx_cpudata.epc_base >> PAGE_SHIFT)
+#define epc_base_maddr (boot_sgx_cpudata.epc_base)
+#define epc_end_maddr (epc_base_maddr + boot_sgx_cpudata.epc_size)
+
+static void *epc_base_vaddr = NULL;
+
static void __detect_sgx(struct sgx_cpuinfo *sgxinfo)
{
u32 eax, ebx, ecx, edx;
@@ -166,11 +175,163 @@ static void __init print_sgx_cpuinfo(struct sgx_cpuinfo *sgxinfo)
boot_sgx_cpudata.epc_base + boot_sgx_cpudata.epc_size);
}
+struct ft_page {
+ struct page_info *pg;
+ unsigned int order;
+ unsigned long idx;
+ struct list_head list;
+};
+
+static int extend_epc_frametable(unsigned long smfn, unsigned long emfn)
+{
+ unsigned long idx;
+ LIST_HEAD(ft_pages);
+ struct ft_page *ftp, *nftp;
+ int rc = 0;
+
+ for ( ; smfn < emfn; smfn += PDX_GROUP_COUNT )
+ {
+ idx = pfn_to_pdx(smfn) / PDX_GROUP_COUNT;
+
+ if (!test_bit(idx, pdx_group_valid))
+ {
+ unsigned long s = (unsigned long)pdx_to_page(idx * PDX_GROUP_COUNT);
+ struct page_info *pg;
+
+ ftp = xzalloc(struct ft_page);
+
+ if ( !ftp )
+ {
+ rc = -ENOMEM;
+ goto out;
+ }
+
+ pg = alloc_domheap_pages(NULL, PDX_GROUP_SHIFT - PAGE_SHIFT, 0);
+
+ if ( !pg )
+ {
+ xfree(ftp);
+ rc = -ENOMEM;
+ goto out;
+ }
+
+ ftp->order = PDX_GROUP_SHIFT - PAGE_SHIFT;
+ ftp->pg = pg;
+ ftp->idx = idx;
+
+ list_add_tail(&ftp->list, &ft_pages);
+
+ map_pages_to_xen(s, page_to_mfn(pg),
+ 1UL << (PDX_GROUP_SHIFT - PAGE_SHIFT),
+ PAGE_HYPERVISOR);
+ memset((void *)s, 0, sizeof(struct page_info) * PDX_GROUP_COUNT);
+ }
+ }
+
+out:
+ list_for_each_entry_safe(ftp, nftp, &ft_pages, list)
+ {
+ if ( rc )
+ {
+ unsigned long s = (unsigned long)pdx_to_page(ftp->idx * PDX_GROUP_COUNT);
+
+ destroy_xen_mappings(s, s + (1UL << PDX_GROUP_SHIFT));
+ free_domheap_pages(ftp->pg, ftp->order);
+ }
+ list_del(&ftp->list);
+ xfree(ftp);
+ }
+
+ if ( !rc )
+ set_pdx_range(smfn, emfn);
+
+ return rc;
+}
+
+static int __init init_epc_frametable(unsigned long mfn, unsigned long npages)
+{
+ return extend_epc_frametable(mfn, mfn + npages);
+}
+
+static int __init init_epc_heap(void)
+{
+ struct page_info *pg;
+ unsigned long nrpages = total_epc_npages;
+ unsigned long i;
+ int rc = 0;
+
+ rc = init_epc_frametable(epc_base_mfn, nrpages);
+
+ if ( rc )
+ return rc;
+
+ for ( i = 0; i < nrpages; i++ )
+ {
+ pg = mfn_to_page(epc_base_mfn + i);
+ pg->count_info |= PGC_epc;
+ }
+
+ init_domheap_pages(epc_base_maddr, epc_end_maddr);
+
+ return rc;
+}
+
+struct page_info *alloc_epc_page(void)
+{
+ struct page_info *pg = alloc_domheap_page(NULL, MEMF_epc);
+
+ if ( !pg )
+ return NULL;
+
+ /*
+ * PGC_epc will be cleared in free_heap_pages(), so we add it back at
+ * allocation time, so that is_epc_page() will return true, when this page
+ * gets freed.
+ */
+ pg->count_info |= PGC_epc;
+
+ return pg;
+}
+
+void free_epc_page(struct page_info *epg)
+{
+ free_domheap_page(epg);
+}
+
+
+static int __init sgx_init_epc(void)
+{
+ int rc = 0;
+
+ epc_base_vaddr = ioremap_wb(epc_base_maddr,
+ total_epc_npages << PAGE_SHIFT);
+
+ if ( !epc_base_vaddr )
+ {
+ printk("Failed to ioremap_wb EPC range. Disable SGX.\n");
+
+ return -EFAULT;
+ }
+
+ rc = init_epc_heap();
+
+ if ( rc )
+ {
+ printk("Failed to init heap for EPC pages. Disable SGX.\n");
+ iounmap(epc_base_vaddr);
+ }
+
+ return rc;
+}
+
static int __init sgx_init(void)
{
if ( !cpu_has_sgx )
goto not_supported;
+ if ( sgx_init_epc() )
+ goto not_supported;
+
print_sgx_cpuinfo(&boot_sgx_cpudata);
return 0;
diff --git a/xen/include/asm-x86/sgx.h b/xen/include/asm-x86/sgx.h
index b37ebde64e84..8fed664fa154 100644
--- a/xen/include/asm-x86/sgx.h
+++ b/xen/include/asm-x86/sgx.h
@@ -58,4 +58,7 @@ void detect_sgx(struct sgx_cpuinfo *sgxinfo);
void disable_sgx(void);
#define sgx_lewr() (boot_sgx_cpudata.lewr)
+struct page_info *alloc_epc_page(void);
+void free_epc_page(struct page_info *epg);
+
#endif /* __ASM_X86_SGX_H__ */
--
2.15.0
* [PATCH v2 09/17] xen: x86: add functions to populate and destroy EPC for domain
2017-12-04 0:15 [RFC PATCH v2 00/17] RFC: SGX Virtualization design and draft patches Boqun Feng
` (7 preceding siblings ...)
2017-12-04 0:15 ` [PATCH v2 08/17] xen: x86/mm: add SGX EPC management Boqun Feng
@ 2017-12-04 0:15 ` Boqun Feng
2017-12-04 0:15 ` [PATCH v2 10/17] xen: x86: add SGX cpuid handling support Boqun Feng
` (8 subsequent siblings)
17 siblings, 0 replies; 23+ messages in thread
From: Boqun Feng @ 2017-12-04 0:15 UTC (permalink / raw)
To: xen-devel
Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
George Dunlap, Andrew Cooper, Ian Jackson,
Marek Marczykowski-Górecki, Tim Deegan, kai.huang,
Julien Grall, Jan Beulich, David Scott, Boqun Feng
Add a per-domain structure to store SGX per-domain info. Currently only the
domain's EPC base and size are stored. Also add new functions for later use:
- domain_populate_epc # populate EPC when EPC base & size are notified.
- domain_reset_epc # reset domain's EPC to be invalid. Used when the domain
goes to S3-S5, or is being destroyed.
- domain_destroy_epc # destroy and free domain's EPC.
For now, those functions only work for HVM domains, and will return
-EFAULT if called for a non-HVM domain.
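An illustrative call sequence (not in this patch itself; later patches wire
these calls into the domctl and VMX teardown paths):

    /* Sketch: EPC lifecycle for an HVM guest. */
    if ( !domain_epc_populated(d) )
        rc = domain_populate_epc(d, epc_base_pfn, epc_npages); /* at creation */

    rc = domain_reset_epc(d, false);   /* e.g. S3-S5: EREMOVE, keep the pages */

    rc = domain_destroy_epc(d);        /* destruction: EREMOVE and free pages */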
Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Signed-off-by: Boqun Feng <boqun.feng@intel.com>
---
xen/arch/x86/hvm/vmx/vmx.c | 3 +
xen/arch/x86/sgx.c | 340 +++++++++++++++++++++++++++++++++++++
xen/include/asm-x86/hvm/vmx/vmcs.h | 2 +
xen/include/asm-x86/sgx.h | 13 ++
4 files changed, 358 insertions(+)
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index b18cceab55b2..92fb85b13a0c 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -417,6 +417,9 @@ static int vmx_domain_initialise(struct domain *d)
static void vmx_domain_destroy(struct domain *d)
{
+ if ( domain_epc_populated(d) )
+ domain_destroy_epc(d);
+
if ( !has_vlapic(d) )
return;
diff --git a/xen/arch/x86/sgx.c b/xen/arch/x86/sgx.c
index 9409b041e4f7..0c898c3086cb 100644
--- a/xen/arch/x86/sgx.c
+++ b/xen/arch/x86/sgx.c
@@ -25,6 +25,8 @@
#include <xen/errno.h>
#include <xen/mm.h>
#include <asm/sgx.h>
+#include <xen/sched.h>
+#include <asm/p2m.h>
struct sgx_cpuinfo __read_mostly boot_sgx_cpudata;
@@ -38,6 +40,344 @@ boolean_param("sgx", opt_sgx_enabled);
static void *epc_base_vaddr = NULL;
+static void *map_epc_page_to_xen(struct page_info *pg)
+{
+ BUG_ON(!epc_base_vaddr);
+
+ return (void *)((unsigned long)epc_base_vaddr +
+ ((page_to_mfn(pg) - epc_base_mfn) << PAGE_SHIFT));
+}
+
+/* ENCLS opcode */
+#define ENCLS .byte 0x0f, 0x01, 0xcf
+
+/*
+ * ENCLS leaf functions
+ *
+ * However, currently we only need EREMOVE.
+ */
+enum {
+ ECREATE = 0x0,
+ EADD = 0x1,
+ EINIT = 0x2,
+ EREMOVE = 0x3,
+ EDGBRD = 0x4,
+ EDGBWR = 0x5,
+ EEXTEND = 0x6,
+ ELDU = 0x8,
+ EBLOCK = 0x9,
+ EPA = 0xA,
+ EWB = 0xB,
+ ETRACK = 0xC,
+ EAUG = 0xD,
+ EMODPR = 0xE,
+ EMODT = 0xF,
+};
+
+/*
+ * ENCLS error code
+ *
+ * Currently we only need SGX_CHILD_PRESENT
+ */
+#define SGX_CHILD_PRESENT 13
+
+static inline int __encls(unsigned long rax, unsigned long rbx,
+ unsigned long rcx, unsigned long rdx)
+{
+ int ret;
+
+ asm volatile ( "ENCLS;\n\t"
+ : "=a" (ret)
+ : "a" (rax), "b" (rbx), "c" (rcx), "d" (rdx)
+ : "memory", "cc");
+
+ return ret;
+}
+
+static inline int __eremove(void *epc)
+{
+ unsigned long rbx = 0, rdx = 0;
+
+ return __encls(EREMOVE, rbx, (unsigned long)epc, rdx);
+}
+
+static int sgx_eremove(struct page_info *epg)
+{
+ void *addr = map_epc_page_to_xen(epg);
+ int ret;
+
+ BUG_ON(!addr);
+
+ ret = __eremove(addr);
+
+ return ret;
+}
+
+struct sgx_domain *to_sgx(struct domain *d)
+{
+ if (!is_hvm_domain(d))
+ return NULL;
+ else
+ return &d->arch.hvm_domain.vmx.sgx;
+}
+
+bool domain_epc_populated(struct domain *d)
+{
+ BUG_ON(!to_sgx(d));
+
+ return !!to_sgx(d)->epc_base_pfn;
+}
+
+/*
+ * Reset domain's EPC with EREMOVE. free_epc indicates whether to free EPC
+ * pages during reset. This will be called when domain goes into S3-S5 state
+ * (with free_epc being false), and when domain is destroyed (with free_epc
+ * being true).
+ *
+ * It is possible that EREMOVE will be called for SECS when it still has
+ * children present, in which case SGX_CHILD_PRESENT will be returned. In this
+ * case, SECS page is kept to a tmp list and after all EPC pages have been
+ * called with EREMOVE, we call EREMOVE for all the SECS pages again, and this
+ * time SGX_CHILD_PRESENT should never occur as all children should have been
+ * removed.
+ *
+ * If unexpected error returned by EREMOVE, it means the EPC page becomes
+ * abnormal, so it will not be freed even free_epc is true, as further use of
+ * this EPC can cause unexpected error, potentially damaging other domains.
+ */
+static int __domain_reset_epc(struct domain *d, unsigned long epc_base_pfn,
+ unsigned long epc_npages, bool free_epc)
+{
+ struct page_list_head secs_list;
+ struct page_info *epg, *tmp;
+ unsigned long i;
+ int ret = 0;
+
+ INIT_PAGE_LIST_HEAD(&secs_list);
+
+ for ( i = 0; i < epc_npages; i++ )
+ {
+ unsigned long gfn;
+ mfn_t mfn;
+ p2m_type_t t;
+ int r;
+
+ gfn = i + epc_base_pfn;
+ mfn = get_gfn_query(d, gfn, &t);
+ if ( unlikely(mfn_eq(mfn, INVALID_MFN)) )
+ {
+ printk("Domain %d: Reset EPC error: invalid MFN for gfn 0x%lx\n",
+ d->domain_id, gfn);
+ put_gfn(d, gfn);
+ ret = -EFAULT;
+ continue;
+ }
+
+ if ( unlikely(!p2m_is_epc(t)) )
+ {
+ printk("Domain %d: Reset EPC error: (gfn 0x%lx, mfn 0x%lx): "
+ "is not p2m_epc.\n", d->domain_id, gfn, mfn_x(mfn));
+ put_gfn(d, gfn);
+ ret = -EFAULT;
+ continue;
+ }
+
+ put_gfn(d, gfn);
+
+ epg = mfn_to_page(mfn_x(mfn));
+
+ /* EREMOVE the EPC page to make it invalid */
+ r = sgx_eremove(epg);
+ if ( r == SGX_CHILD_PRESENT )
+ {
+ page_list_add_tail(epg, &secs_list);
+ continue;
+ }
+
+ if ( r )
+ {
+ printk("Domain %d: Reset EPC error: (gfn 0x%lx, mfn 0x%lx): "
+ "EREMOVE returns %d\n", d->domain_id, gfn, mfn_x(mfn), r);
+ ret = r;
+ if ( free_epc )
+ printk("WARNING: EPC (mfn 0x%lx) becomes abnormal. "
+ "Remove it from useable EPC.", mfn_x(mfn));
+ continue;
+ }
+
+ if ( free_epc )
+ {
+ /* If EPC page is going to be freed, then also remove the mapping */
+ if ( clear_epc_p2m_entry(d, gfn, mfn) )
+ {
+ printk("Domain %d: Reset EPC error: (gfn 0x%lx, mfn 0x%lx): "
+ "clear p2m entry failed.\n", d->domain_id, gfn,
+ mfn_x(mfn));
+ ret = -EFAULT;
+ }
+ free_epc_page(epg);
+ }
+ }
+
+ page_list_for_each_safe(epg, tmp, &secs_list)
+ {
+ int r;
+
+ r = sgx_eremove(epg);
+ if ( r )
+ {
+ printk("Domain %d: Reset EPC error: mfn 0x%lx: "
+ "EREMOVE returns %d for SECS page\n",
+ d->domain_id, page_to_mfn(epg), r);
+ ret = r;
+ page_list_del(epg, &secs_list);
+
+ if ( free_epc )
+ printk("WARNING: EPC (mfn 0x%lx) becomes abnormal. "
+ "Remove it from useable EPC.",
+ page_to_mfn(epg));
+ continue;
+ }
+
+ if ( free_epc )
+ free_epc_page(epg);
+ }
+
+ return ret;
+}
+
+static void __domain_unpopulate_epc(struct domain *d,
+ unsigned long epc_base_pfn, unsigned long populated_npages)
+{
+ unsigned long i;
+
+ for ( i = 0; i < populated_npages; i++ )
+ {
+ struct page_info *epg;
+ unsigned long gfn;
+ mfn_t mfn;
+ p2m_type_t t;
+
+ gfn = i + epc_base_pfn;
+ mfn = get_gfn_query(d, gfn, &t);
+ if ( unlikely(mfn_eq(mfn, INVALID_MFN)) )
+ {
+ /*
+ * __domain_unpopulate_epc only called when creating the domain on
+ * failure, therefore we can just ignore this error.
+ */
+ printk("%s: Domain %u gfn 0x%lx returns invalid mfn\n", __func__,
+ d->domain_id, gfn);
+ put_gfn(d, gfn);
+ continue;
+ }
+
+ if ( unlikely(!p2m_is_epc(t)) )
+ {
+ printk("%s: Domain %u gfn 0x%lx returns non-EPC p2m type: %d\n",
+ __func__, d->domain_id, gfn, (int)t);
+ put_gfn(d, gfn);
+ continue;
+ }
+
+ put_gfn(d, gfn);
+
+ if ( clear_epc_p2m_entry(d, gfn, mfn) )
+ {
+ printk("clear_epc_p2m_entry failed: gfn 0x%lx, mfn 0x%lx\n",
+ gfn, mfn_x(mfn));
+ continue;
+ }
+
+ epg = mfn_to_page(mfn_x(mfn));
+ free_epc_page(epg);
+ }
+}
+
+static int __domain_populate_epc(struct domain *d, unsigned long epc_base_pfn,
+ unsigned long epc_npages)
+{
+ unsigned long i;
+ int ret;
+
+ for ( i = 0; i < epc_npages; i++ )
+ {
+ struct page_info *epg = alloc_epc_page();
+ unsigned long mfn;
+
+ if ( !epg )
+ {
+ printk("%s: Out of EPC\n", __func__);
+ ret = -ENOMEM;
+ goto err;
+ }
+
+ mfn = page_to_mfn(epg);
+ ret = set_epc_p2m_entry(d, i + epc_base_pfn, _mfn(mfn));
+ if ( ret )
+ {
+ printk("%s: set_epc_p2m_entry failed with %d: gfn 0x%lx, "
+ "mfn 0x%lx\n", __func__, ret, i + epc_base_pfn, mfn);
+ free_epc_page(epg);
+ goto err;
+ }
+ }
+
+ return 0;
+
+err:
+ __domain_unpopulate_epc(d, epc_base_pfn, i);
+ return ret;
+}
+
+int domain_populate_epc(struct domain *d, unsigned long epc_base_pfn,
+ unsigned long epc_npages)
+{
+ struct sgx_domain *sgx = to_sgx(d);
+ int ret;
+
+ if ( !sgx )
+ return -EFAULT;
+
+ if ( domain_epc_populated(d) )
+ return -EBUSY;
+
+ if ( !epc_base_pfn || !epc_npages )
+ return -EINVAL;
+
+ if ( (ret = __domain_populate_epc(d, epc_base_pfn, epc_npages)) )
+ return ret;
+
+ sgx->epc_base_pfn = epc_base_pfn;
+ sgx->epc_npages = epc_npages;
+
+ return 0;
+}
+
+/*
+ * Reset the domain's EPC with EREMOVE, optionally freeing the pages.
+ *
+ * This function returns an error immediately if there's any unexpected error
+ * during this process.
+ */
+int domain_reset_epc(struct domain *d, bool free_epc)
+{
+ struct sgx_domain *sgx = to_sgx(d);
+
+ if ( !sgx )
+ return -EFAULT;
+
+ if ( !domain_epc_populated(d) )
+ return 0;
+
+ return __domain_reset_epc(d, sgx->epc_base_pfn, sgx->epc_npages, free_epc);
+}
+
+int domain_destroy_epc(struct domain *d)
+{
+ return domain_reset_epc(d, true);
+}
+
static void __detect_sgx(struct sgx_cpuinfo *sgxinfo)
{
u32 eax, ebx, ecx, edx;
diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h b/xen/include/asm-x86/hvm/vmx/vmcs.h
index d0293b1a3620..44ff4f0a113f 100644
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -20,6 +20,7 @@
#include <asm/hvm/io.h>
#include <irq_vectors.h>
+#include <asm/sgx.h>
extern void vmcs_dump_vcpu(struct vcpu *v);
extern void setup_vmcs_dump(void);
@@ -63,6 +64,7 @@ struct vmx_domain {
unsigned long apic_access_mfn;
/* VMX_DOMAIN_* */
unsigned int status;
+ struct sgx_domain sgx;
};
/*
diff --git a/xen/include/asm-x86/sgx.h b/xen/include/asm-x86/sgx.h
index 8fed664fa154..855e7e638743 100644
--- a/xen/include/asm-x86/sgx.h
+++ b/xen/include/asm-x86/sgx.h
@@ -24,6 +24,7 @@
#include <xen/types.h>
#include <xen/init.h>
#include <asm/processor.h>
+#include <public/hvm/params.h> /* HVM_PARAM_SGX */
#define SGX_CPUID 0x12
@@ -61,4 +62,16 @@ void disable_sgx(void);
struct page_info *alloc_epc_page(void);
void free_epc_page(struct page_info *epg);
+struct sgx_domain {
+ unsigned long epc_base_pfn;
+ unsigned long epc_npages;
+};
+
+struct sgx_domain *to_sgx(struct domain *d);
+bool domain_epc_populated(struct domain *d);
+int domain_populate_epc(struct domain *d, unsigned long epc_base_pfn,
+ unsigned long epc_npages);
+int domain_reset_epc(struct domain *d, bool free_epc);
+int domain_destroy_epc(struct domain *d);
+
#endif /* __ASM_X86_SGX_H__ */
--
2.15.0
* [PATCH v2 10/17] xen: x86: add SGX cpuid handling support.
2017-12-04 0:15 [RFC PATCH v2 00/17] RFC: SGX Virtualization design and draft patches Boqun Feng
` (8 preceding siblings ...)
2017-12-04 0:15 ` [PATCH v2 09/17] xen: x86: add functions to populate and destroy EPC for domain Boqun Feng
@ 2017-12-04 0:15 ` Boqun Feng
2017-12-04 0:15 ` [PATCH v2 11/17] xen: vmx: handle SGX related MSRs Boqun Feng
` (7 subsequent siblings)
17 siblings, 0 replies; 23+ messages in thread
From: Boqun Feng @ 2017-12-04 0:15 UTC (permalink / raw)
To: xen-devel
Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
George Dunlap, Andrew Cooper, Ian Jackson,
Marek Marczykowski-Górecki, Tim Deegan, kai.huang,
Julien Grall, Jan Beulich, David Scott, Boqun Feng
From: Kai Huang <kai.huang@linux.intel.com>
This patch adds SGX CPUID handling support. The SGX feature bit is
reported in raw_policy and passed along to a guest, but in
recalculate_cpuid_policy() we clear it if SGX has been disabled for some
reason. For EPC info, the physical values are reported into raw_policy
and recalculated for the *_policy. For a particular domain, its EPC base
and size info will be filled in by the toolstack. Before the domain's
EPC base and size are properly configured, the guest's SGX CPUID should
report invalid EPC, which is also consistent with HW behavior.
Currently all EPC pages are fully populated for the domain when it is
created. Xen gets the domain's EPC base and size from the toolstack via
XEN_DOMCTL_set_cpuid, so the domain's EPC pages are also populated in
XEN_DOMCTL_set_cpuid, after receiving a valid EPC base and size. Failure
to populate EPC (e.g. there are not enough free EPC pages) results in
domain creation failure by making XEN_DOMCTL_set_cpuid return an error.
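To make the sub-leaf 2 encoding concrete (hypothetical numbers, for
illustration only): an EPC region at guest physical address 0x80000000
with a size of 64MB corresponds to base_pfn = 0x80000 and npages =
0x4000; the toolstack splits each value into a 20-bit low and a 20-bit
high field, and Xen reassembles them as in the domctl hunk below:

    /* Sketch: rebuild the EPC base pfn and page count from CPUID.0x12.0x2. */
    base_pfn = ((uint64_t)p->sgx.base_high << 20) | p->sgx.base_low;
    npages   = ((uint64_t)p->sgx.npages_high << 20) | p->sgx.npages_low;
    /* e.g. base_low = 0x80000, base_high = 0  =>  base_pfn = 0x80000 */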
Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Signed-off-by: Boqun Feng <boqun.feng@intel.com>
---
xen/arch/x86/cpuid.c | 62 ++++++++++++++++++++++++++++++++++++++++++++-
xen/arch/x86/domctl.c | 59 +++++++++++++++++++++++++++++++++++++++++-
xen/include/asm-x86/cpuid.h | 29 ++++++++++++++++++++-
3 files changed, 147 insertions(+), 3 deletions(-)
diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 5ee82d39d7cd..fcffbdec6bbe 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -9,6 +9,7 @@
#include <asm/paging.h>
#include <asm/processor.h>
#include <asm/xstate.h>
+#include <asm/sgx.h>
const uint32_t known_features[] = INIT_KNOWN_FEATURES;
const uint32_t special_features[] = INIT_SPECIAL_FEATURES;
@@ -152,6 +153,33 @@ static void recalculate_xstate(struct cpuid_policy *p)
}
}
+static void recalculate_sgx(struct cpuid_policy *p)
+{
+ if ( !p->feat.sgx || !p->sgx.sgx1 )
+ {
+ memset(&p->sgx, 0, sizeof (p->sgx));
+ return;
+ }
+
+ /*
+ * SDM 42.7.2.1 SECS.ATTRIBUTE.XFRM:
+ *
+ * Legal values for SECS.ATTRIBUTE.XFRM conform to these requirements:
+ * - XFRM[1:0] must be set to 0x3;
+ * - If processor does not support XSAVE, or if the system software has not
+ * enabled XSAVE, then XFRM[63:2] must be 0.
+ * - If the processor does support XSAVE, XFRM must contain a value that
+ * would be legal if loaded into XCR0.
+ */
+ p->sgx.xfrm_low = 0x3;
+ p->sgx.xfrm_high = 0;
+ if ( p->basic.xsave )
+ {
+ p->sgx.xfrm_low |= p->xstate.xcr0_low;
+ p->sgx.xfrm_high |= p->xstate.xcr0_high;
+ }
+}
+
/*
* Misc adjustments to the policy. Mostly clobbering reserved fields and
* duplicating shared fields. Intentionally hidden fields are annotated.
@@ -233,7 +261,7 @@ static void __init calculate_raw_policy(void)
{
switch ( i )
{
- case 0x4: case 0x7: case 0xd:
+ case 0x4: case 0x7: case 0xd: case 0x12:
/* Multi-invocation leaves. Deferred. */
continue;
}
@@ -293,6 +321,19 @@ static void __init calculate_raw_policy(void)
}
}
+ if ( p->basic.max_leaf >= SGX_CPUID )
+ {
+ /*
+ * For raw policy we just report native CPUID. For EPC on native it's
+ * possible that we will have multiple EPC sections (meaning subleaf 3,
+ * 4, ... may also be valid), but as the policy is for guest so we only
+ * need one EPC section (subleaf 2).
+ */
+ cpuid_count_leaf(SGX_CPUID, 0, &p->sgx.raw[0]);
+ cpuid_count_leaf(SGX_CPUID, 1, &p->sgx.raw[1]);
+ cpuid_count_leaf(SGX_CPUID, 2, &p->sgx.raw[2]);
+ }
+
/* Extended leaves. */
cpuid_leaf(0x80000000, &p->extd.raw[0]);
for ( i = 1; i < min(ARRAY_SIZE(p->extd.raw),
@@ -318,6 +359,7 @@ static void __init calculate_host_policy(void)
cpuid_featureset_to_policy(boot_cpu_data.x86_capability, p);
recalculate_xstate(p);
recalculate_misc(p);
+ recalculate_sgx(p);
if ( p->extd.svm )
{
@@ -351,6 +393,7 @@ static void __init calculate_pv_max_policy(void)
sanitise_featureset(pv_featureset);
cpuid_featureset_to_policy(pv_featureset, p);
recalculate_xstate(p);
+ recalculate_sgx(p);
p->extd.raw[0xa] = EMPTY_LEAF; /* No SVM for PV guests. */
}
@@ -408,6 +451,7 @@ static void __init calculate_hvm_max_policy(void)
sanitise_featureset(hvm_featureset);
cpuid_featureset_to_policy(hvm_featureset, p);
recalculate_xstate(p);
+ recalculate_sgx(p);
}
void __init init_guest_cpuid(void)
@@ -523,6 +567,14 @@ void recalculate_cpuid_policy(struct domain *d)
if ( p->basic.max_leaf < XSTATE_CPUID )
__clear_bit(X86_FEATURE_XSAVE, fs);
+ /*
+ * We check cpu_has_sgx here because during boot up SGX may be disabled
+ * via disable_sgx(), e.g. BIOS disables SGX by setting
+ * IA32_FEATURE_CONTROL_SGX_ENABLE=0
+ */
+ if ( p->basic.max_leaf < SGX_CPUID || !cpu_has_sgx )
+ __clear_bit(X86_FEATURE_SGX, fs);
+
sanitise_featureset(fs);
/* Fold host's FDP_EXCP_ONLY and NO_FPU_SEL into guest's view. */
@@ -545,6 +597,7 @@ void recalculate_cpuid_policy(struct domain *d)
recalculate_xstate(p);
recalculate_misc(p);
+ recalculate_sgx(p);
for ( i = 0; i < ARRAY_SIZE(p->cache.raw); ++i )
{
@@ -641,6 +694,13 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
*res = p->xstate.raw[subleaf];
break;
+ case SGX_CPUID:
+ if ( !p->feat.sgx || subleaf >= ARRAY_SIZE(p->sgx.raw) )
+ return;
+
+ *res = p->sgx.raw[subleaf];
+ break;
+
default:
*res = p->basic.raw[leaf];
break;
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 80b4df9ec95b..0ee9fb6458ec 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -53,6 +53,7 @@ static int update_domain_cpuid_info(struct domain *d,
struct cpuid_policy *p = d->arch.cpuid;
const struct cpuid_leaf leaf = { ctl->eax, ctl->ebx, ctl->ecx, ctl->edx };
int old_vendor = p->x86_vendor;
+ int ret = 0;
/*
* Skip update for leaves we don't care about. This avoids the overhead
@@ -74,6 +75,11 @@ static int update_domain_cpuid_info(struct domain *d,
if ( ctl->input[0] == XSTATE_CPUID &&
ctl->input[1] != 1 ) /* Everything else automatically calculated. */
return 0;
+
+ if ( ctl->input[0] == SGX_CPUID &&
+ ctl->input[1] >= ARRAY_SIZE(p->sgx.raw) )
+ return 0;
+
break;
case 0x40000000: case 0x40000100:
@@ -104,6 +110,10 @@ static int update_domain_cpuid_info(struct domain *d,
p->xstate.raw[ctl->input[1]] = leaf;
break;
+ case SGX_CPUID:
+ p->sgx.raw[ctl->input[1]] = leaf;
+ break;
+
default:
p->basic.raw[ctl->input[0]] = leaf;
break;
@@ -255,6 +265,53 @@ static int update_domain_cpuid_info(struct domain *d,
}
break;
+ case 0x12:
+ {
+ uint64_t base_pfn, npages;
+ struct sgx_domain *sd;
+
+ if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
+ break;
+
+ if ( ctl->input[1] != 2 )
+ break;
+
+ /* SGX is not enabled */
+ if ( !p->feat.sgx || !p->sgx.sgx1 )
+ break;
+
+ /*
+ * If SGX is enabled in CPUID, then we expect a valid EPC resource in
+ * sub-leaf 0x2. Return -EINVAL to notify the toolstack that there's
+ * something wrong.
+ */
+ if ( !p->sgx.base_valid || !p->sgx.size_valid )
+ {
+ ret = -EINVAL;
+ break;
+ }
+
+ base_pfn = (((uint64_t)(p->sgx.base_high)) << 20) |
+ (uint64_t)p->sgx.base_low;
+ npages = (((uint64_t)(p->sgx.npages_high)) << 20) |
+ (uint64_t)p->sgx.npages_low;
+
+ sd = to_sgx(d);
+
+ if ( !sd )
+ {
+ ret = -EFAULT;
+ break;
+ }
+
+ if ( !domain_epc_populated(d) )
+ ret = domain_populate_epc(d, base_pfn, npages);
+ else
+ if ( base_pfn != sd->epc_base_pfn || npages != sd->epc_npages )
+ ret = -EINVAL;
+
+ break;
+ }
case 0x80000001:
if ( is_pv_domain(d) && ((levelling_caps & LCAP_e1cd) == LCAP_e1cd) )
{
@@ -299,7 +356,7 @@ static int update_domain_cpuid_info(struct domain *d,
break;
}
- return 0;
+ return ret;
}
static int vcpu_set_vmce(struct vcpu *v,
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index d2dd841e1581..6d043843713a 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -61,10 +61,11 @@ extern struct cpuidmasks cpuidmask_defaults;
/* Whether or not cpuid faulting is available for the current domain. */
DECLARE_PER_CPU(bool, cpuid_faulting_enabled);
-#define CPUID_GUEST_NR_BASIC (0xdu + 1)
+#define CPUID_GUEST_NR_BASIC (0x12u + 1)
#define CPUID_GUEST_NR_FEAT (0u + 1)
#define CPUID_GUEST_NR_CACHE (5u + 1)
#define CPUID_GUEST_NR_XSTATE (62u + 1)
+#define CPUID_GUEST_NR_SGX (0x2u + 1)
#define CPUID_GUEST_NR_EXTD_INTEL (0x8u + 1)
#define CPUID_GUEST_NR_EXTD_AMD (0x1cu + 1)
#define CPUID_GUEST_NR_EXTD MAX(CPUID_GUEST_NR_EXTD_INTEL, \
@@ -169,6 +170,32 @@ struct cpuid_policy
} comp[CPUID_GUEST_NR_XSTATE];
} xstate;
+ union {
+ struct cpuid_leaf raw[CPUID_GUEST_NR_SGX];
+
+ struct {
+ /* Subleaf 0. */
+ bool sgx1:1, sgx2:1; uint32_t :30;
+ uint32_t miscselect;
+ uint32_t /* c */ :32;
+ uint8_t maxsize_legecy, maxsize_long; uint32_t :16; /* d */
+
+ /* Subleaf 1. */
+ bool init:1, debug:1, mode64:1, /*reserve*/:1, provisionkey:1,
+ einittokenkey:1; uint32_t :26;
+ uint32_t /* SW reserved */ :32;
+ uint32_t xfrm_low, xfrm_high;
+
+ /* Subleaf 2. */
+ bool base_valid:1; uint32_t :11;
+ uint32_t base_low:20;
+ uint32_t base_high:20, :12;
+ bool size_valid:1; uint32_t :11;
+ uint32_t npages_low:20;
+ uint32_t npages_high:20, :12;
+ };
+ } sgx;
+
/* Extended leaves: 0x800000xx */
union {
struct cpuid_leaf raw[CPUID_GUEST_NR_EXTD];
--
2.15.0
* [PATCH v2 11/17] xen: vmx: handle SGX related MSRs
2017-12-04 0:15 [RFC PATCH v2 00/17] RFC: SGX Virtualization design and draft patches Boqun Feng
` (9 preceding siblings ...)
2017-12-04 0:15 ` [PATCH v2 10/17] xen: x86: add SGX cpuid handling support Boqun Feng
@ 2017-12-04 0:15 ` Boqun Feng
2017-12-04 0:15 ` [PATCH v2 12/17] xen: vmx: handle ENCLS VMEXIT Boqun Feng
` (6 subsequent siblings)
17 siblings, 0 replies; 23+ messages in thread
From: Boqun Feng @ 2017-12-04 0:15 UTC (permalink / raw)
To: xen-devel
Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
George Dunlap, Andrew Cooper, Ian Jackson,
Marek Marczykowski-Górecki, Tim Deegan, kai.huang,
Julien Grall, Jan Beulich, David Scott, Boqun Feng
From: Kai Huang <kai.huang@linux.intel.com>
This patch handles the IA32_FEATURE_CONTROL and IA32_SGXLEPUBKEYHASHn MSRs.
For IA32_FEATURE_CONTROL, if SGX is exposed to the domain, then the
SGX_ENABLE bit is always set. The SGX_LE_WR bit defaults to 0, unless
1) SGX launch control is exposed to the domain and 2) the XL parameter
'lewr' is true (the handling of this parameter is in a later patch, so
for this patch the SGX_LE_WR bit is always 0). Writes to
IA32_FEATURE_CONTROL will fault.
For IA32_SGXLEPUBKEYHASHn, the vcpu's virtual ia32_sgxlepubkeyhash[0-3]
are added to the 'sgx' field of 'struct msr_vcpu_policy'.
When a vcpu is initialized, its virtual ia32_sgxlepubkeyhash values are
also initialized. The default values are the physical values of the
physical machine. Later on, we may reset those values with the content
of the XL parameter 'lehash'. Besides, if 'lewr' is true and no 'lehash'
is provided, we reset those values to Intel's default value, as physical
machines have Intel's default value in those MSRs.
For IA32_SGXLEPUBKEYHASHn MSR reads from the guest, if SGX launch
control is not exposed to the domain, the guest is not allowed to read
them either; otherwise the vcpu's virtual MSR value is returned.
For IA32_SGXLEPUBKEYHASHn MSR writes from the guest, we allow the guest
to write only if 'lewr' is set (so for this patch, writes will fault).
To make EINIT run successfully in the guest, the vcpu's virtual
IA32_SGXLEPUBKEYHASHn values are written to the physical MSRs when the
vcpu is scheduled in. Moreover, we cache the most recent
IA32_SGXLEPUBKEYHASHn values in a percpu variable, so that we don't need
to update them with wrmsr if the values have not changed.
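The per-cpu caching mentioned above amounts to the following (an
illustrative, loop-style equivalent of the macro used in the patch; the
four MSRs are architecturally consecutive, 0x8c-0x8f):

    /* Sketch: only execute wrmsr when the cached value actually differs. */
    for ( i = 0; i < 4; i++ )
        if ( vp->sgx.ia32_sgxlepubkeyhash[i] !=
             this_cpu(cpu_ia32_sgxlepubkeyhash[i]) )
        {
            wrmsrl(MSR_IA32_SGXLEPUBKEYHASH0 + i,
                   vp->sgx.ia32_sgxlepubkeyhash[i]);
            this_cpu(cpu_ia32_sgxlepubkeyhash[i]) =
                vp->sgx.ia32_sgxlepubkeyhash[i];
        }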
Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Signed-off-by: Boqun Feng <boqun.feng@intel.com>
---
xen/arch/x86/domctl.c | 28 ++++++++-
xen/arch/x86/hvm/vmx/vmx.c | 19 ++++++
xen/arch/x86/msr.c | 6 +-
xen/arch/x86/sgx.c | 123 +++++++++++++++++++++++++++++++++++++++
xen/include/asm-x86/cpufeature.h | 3 +
xen/include/asm-x86/msr-index.h | 5 ++
xen/include/asm-x86/msr.h | 5 ++
xen/include/asm-x86/sgx.h | 9 +++
8 files changed, 196 insertions(+), 2 deletions(-)
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 0ee9fb6458ec..eb5d4b346313 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -1352,13 +1352,16 @@ long arch_do_domctl(
ret = -EINVAL;
if ( (v == curr) || /* no vcpu_pause() */
- !is_pv_domain(d) )
+ (!is_pv_domain(d) && !d->arch.cpuid->feat.sgx_lc) )
break;
/* Count maximum number of optional msrs. */
if ( boot_cpu_has(X86_FEATURE_DBEXT) )
nr_msrs += 4;
+ if ( d->arch.cpuid->feat.sgx_lc )
+ nr_msrs += 5;
+
if ( domctl->cmd == XEN_DOMCTL_get_vcpu_msrs )
{
ret = 0; copyback = true;
@@ -1447,6 +1450,29 @@ long arch_do_domctl(
msr.index -= MSR_AMD64_DR1_ADDRESS_MASK - 1;
v->arch.pv_vcpu.dr_mask[msr.index] = msr.value;
continue;
+ case MSR_IA32_FEATURE_CONTROL:
+ if ( msr.value & IA32_FEATURE_CONTROL_SGX_LE_WR )
+ {
+ if ( d->arch.cpuid->feat.sgx_lc && sgx_lewr())
+ {
+ v->arch.msr->sgx.lewr = true;
+ continue;
+ }
+ else /* Try to set LE_WR while not supported */
+ break;
+ }
+ continue;
+ case MSR_IA32_SGXLEPUBKEYHASH0 ... MSR_IA32_SGXLEPUBKEYHASH3:
+ if ( d->arch.cpuid->feat.sgx_lc && sgx_lewr() )
+ {
+ sgx_set_vcpu_sgxlepubkeyhash(v,
+ msr.index - MSR_IA32_SGXLEPUBKEYHASH0,
+ msr.value);
+ continue;
+ }
+ else
+ break;
+ continue;
}
break;
}
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 92fb85b13a0c..ce1c95f69062 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1049,6 +1049,9 @@ static void vmx_ctxt_switch_to(struct vcpu *v)
if ( v->domain->arch.hvm_domain.pi_ops.switch_to )
v->domain->arch.hvm_domain.pi_ops.switch_to(v);
+
+ if ( v->domain->arch.cpuid->feat.sgx_lc && sgx_lewr() )
+ sgx_ctxt_switch_to(v);
}
@@ -2892,6 +2895,8 @@ static int is_last_branch_msr(u32 ecx)
static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
{
const struct vcpu *curr = current;
+ const struct msr_vcpu_policy *vp = curr->arch.msr;
+ const struct domain *d = current->domain;
HVM_DBG_LOG(DBG_LEVEL_MSR, "ecx=%#x", msr);
@@ -2915,11 +2920,19 @@ static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
*msr_content |= IA32_FEATURE_CONTROL_LMCE_ON;
if ( nestedhvm_enabled(curr->domain) )
*msr_content |= IA32_FEATURE_CONTROL_ENABLE_VMXON_OUTSIDE_SMX;
+ if ( d->arch.cpuid->feat.sgx )
+ *msr_content |= IA32_FEATURE_CONTROL_SGX_ENABLE;
+ if ( vp->sgx.lewr )
+ *msr_content |= IA32_FEATURE_CONTROL_SGX_LE_WR;
break;
case MSR_IA32_VMX_BASIC...MSR_IA32_VMX_VMFUNC:
if ( !nvmx_msr_read_intercept(msr, msr_content) )
goto gp_fault;
break;
+ case MSR_IA32_SGXLEPUBKEYHASH0 ... MSR_IA32_SGXLEPUBKEYHASH3:
+ if ( !sgx_msr_read_intercept(current, msr, msr_content) )
+ goto gp_fault;
+ break;
case MSR_IA32_MISC_ENABLE:
rdmsrl(MSR_IA32_MISC_ENABLE, *msr_content);
/* Debug Trace Store is not supported. */
@@ -3146,6 +3159,12 @@ static int vmx_msr_write_intercept(unsigned int msr, uint64_t msr_content)
case MSR_IA32_VMX_BASIC ... MSR_IA32_VMX_VMFUNC:
/* None of these MSRs are writeable. */
goto gp_fault;
+ break;
+
+ case MSR_IA32_SGXLEPUBKEYHASH0...MSR_IA32_SGXLEPUBKEYHASH3:
+ if ( !sgx_msr_write_intercept(current, msr, msr_content) )
+ goto gp_fault;
+ break;
case MSR_P6_PERFCTR(0)...MSR_P6_PERFCTR(7):
case MSR_P6_EVNTSEL(0)...MSR_P6_EVNTSEL(7):
diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index baba44f43d05..95cb41b4d825 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -23,6 +23,7 @@
#include <xen/lib.h>
#include <xen/sched.h>
#include <asm/msr.h>
+#include <asm/sgx.h>
struct msr_domain_policy __read_mostly hvm_max_msr_domain_policy,
__read_mostly pv_max_msr_domain_policy;
@@ -112,6 +113,8 @@ int init_vcpu_msr_policy(struct vcpu *v)
if ( is_control_domain(d) )
vp->misc_features_enables.available = false;
+ sgx_msr_vcpu_init(v, vp);
+
v->arch.msr = vp;
return 0;
@@ -119,8 +122,9 @@ int init_vcpu_msr_policy(struct vcpu *v)
int guest_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
{
- const struct msr_domain_policy *dp = v->domain->arch.msr;
const struct msr_vcpu_policy *vp = v->arch.msr;
+ const struct domain *d = v->domain;
+ const struct msr_domain_policy *dp = d->arch.msr;
switch ( msr )
{
diff --git a/xen/arch/x86/sgx.c b/xen/arch/x86/sgx.c
index 0c898c3086cb..d103eb243e7a 100644
--- a/xen/arch/x86/sgx.c
+++ b/xen/arch/x86/sgx.c
@@ -27,12 +27,15 @@
#include <asm/sgx.h>
#include <xen/sched.h>
#include <asm/p2m.h>
+#include <xen/percpu.h>
struct sgx_cpuinfo __read_mostly boot_sgx_cpudata;
static bool __read_mostly opt_sgx_enabled = false;
boolean_param("sgx", opt_sgx_enabled);
+DEFINE_PER_CPU(uint64_t[4], cpu_ia32_sgxlepubkeyhash);
+
#define total_epc_npages (boot_sgx_cpudata.epc_size >> PAGE_SHIFT)
#define epc_base_mfn (boot_sgx_cpudata.epc_base >> PAGE_SHIFT)
#define epc_base_maddr (boot_sgx_cpudata.epc_base)
@@ -378,6 +381,126 @@ int domain_destroy_epc(struct domain *d)
return domain_reset_epc(d, true);
}
+/* Digest of Intel signing key. MSR's default value after reset. */
+#define SGX_INTEL_DEFAULT_LEPUBKEYHASH0 0xa6053e051270b7ac
+#define SGX_INTEL_DEFAULT_LEPUBKEYHASH1 0x6cfbe8ba8b3b413d
+#define SGX_INTEL_DEFAULT_LEPUBKEYHASH2 0xc4916d99f2b3735d
+#define SGX_INTEL_DEFAULT_LEPUBKEYHASH3 0xd4f8c05909f9bb3b
+
+void sgx_set_vcpu_sgxlepubkeyhash(struct vcpu *v, int idx, uint64_t val)
+{
+ BUG_ON(idx < 0 || idx > 3);
+
+ v->arch.msr->sgx.ia32_sgxlepubkeyhash[idx] = val;
+}
+
+void sgx_msr_vcpu_init(struct vcpu *v, struct msr_vcpu_policy *vp)
+{
+ const struct domain *d = v->domain;
+
+ /* lewr is default false */
+ vp->sgx.lewr = false;
+
+ if ( d->arch.cpuid->feat.sgx_lc )
+ {
+ if ( sgx_lewr() )
+ {
+ vp->sgx.ia32_sgxlepubkeyhash[0] = SGX_INTEL_DEFAULT_LEPUBKEYHASH0;
+ vp->sgx.ia32_sgxlepubkeyhash[1] = SGX_INTEL_DEFAULT_LEPUBKEYHASH1;
+ vp->sgx.ia32_sgxlepubkeyhash[2] = SGX_INTEL_DEFAULT_LEPUBKEYHASH2;
+ vp->sgx.ia32_sgxlepubkeyhash[3] = SGX_INTEL_DEFAULT_LEPUBKEYHASH3;
+ }
+ else
+ {
+ rdmsrl(MSR_IA32_SGXLEPUBKEYHASH0, vp->sgx.ia32_sgxlepubkeyhash[0]);
+ rdmsrl(MSR_IA32_SGXLEPUBKEYHASH1, vp->sgx.ia32_sgxlepubkeyhash[1]);
+ rdmsrl(MSR_IA32_SGXLEPUBKEYHASH2, vp->sgx.ia32_sgxlepubkeyhash[2]);
+ rdmsrl(MSR_IA32_SGXLEPUBKEYHASH3, vp->sgx.ia32_sgxlepubkeyhash[3]);
+ }
+ }
+}
+
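+/*
+ * Lazily sync the guest's IA32_SGXLEPUBKEYHASHn values: only issue a WRMSR
+ * when the vcpu's value differs from what is currently programmed on this
+ * pCPU (cached in cpu_ia32_sgxlepubkeyhash).
+ */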
+#define sgx_try_to_write_msr(vp, i) \
+do \
+{ \
+ if ((vp)->sgx.ia32_sgxlepubkeyhash[i] != \
+ this_cpu(cpu_ia32_sgxlepubkeyhash[i])) \
+ { \
+ wrmsrl(MSR_IA32_SGXLEPUBKEYHASH##i, \
+ (vp)->sgx.ia32_sgxlepubkeyhash[i]); \
+ this_cpu(cpu_ia32_sgxlepubkeyhash[i]) = \
+ (vp)->sgx.ia32_sgxlepubkeyhash[i]; \
+ } \
+} while (0)
+
+void sgx_ctxt_switch_to(struct vcpu *v)
+{
+ struct msr_vcpu_policy *vp = v->arch.msr;
+
+ sgx_try_to_write_msr(vp, 0);
+ sgx_try_to_write_msr(vp, 1);
+ sgx_try_to_write_msr(vp, 2);
+ sgx_try_to_write_msr(vp, 3);
+}
+
+int sgx_msr_read_intercept(struct vcpu *v, unsigned int msr, u64 *msr_content)
+{
+ const struct msr_vcpu_policy *vp = v->arch.msr;
+ const struct domain *d = v->domain;
+ u64 data;
+ int r = 1;
+
+ if ( !d->arch.cpuid->feat.sgx_lc )
+ return 0;
+
+ switch ( msr )
+ {
+ case MSR_IA32_SGXLEPUBKEYHASH0 ... MSR_IA32_SGXLEPUBKEYHASH3:
+ data = vp->sgx.ia32_sgxlepubkeyhash[msr - MSR_IA32_SGXLEPUBKEYHASH0];
+ *msr_content = data;
+
+ break;
+ default:
+ r = 0;
+ break;
+ }
+
+ return r;
+}
+
+int sgx_msr_write_intercept(struct vcpu *v, unsigned int msr, u64 msr_content)
+{
+ struct msr_vcpu_policy *vp = v->arch.msr;
+ const struct domain *d = v->domain;
+ int r = 1;
+
+ /*
+ * SDM 35.1 Model-Specific Registers, table 35-2.
+ *
+ * IA32_SGXLEPUBKEYHASH[0..3]:
+ *
+ * - If CPUID.0x7.0:ECX[30] = 1, FEATURE_CONTROL[17] is available.
+ * - Write permitted if CPUID.0x12.0:EAX[0] = 1 &&
+ * FEATURE_CONTROL[17] = 1 && FEATURE_CONTROL[0] = 1.
+ */
+ if ( !d->arch.cpuid->feat.sgx_lc || !vp->sgx.lewr )
+ return 0;
+
+ switch ( msr )
+ {
+ case MSR_IA32_SGXLEPUBKEYHASH0 ... MSR_IA32_SGXLEPUBKEYHASH3:
+ vp->sgx.ia32_sgxlepubkeyhash[msr - MSR_IA32_SGXLEPUBKEYHASH0] =
+ msr_content;
+
+ break;
+ default:
+ r = 0;
+ break;
+ }
+
+ return r;
+}
+
static void __detect_sgx(struct sgx_cpuinfo *sgxinfo)
{
u32 eax, ebx, ecx, edx;
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index 9793f8c1c586..f15deb535871 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -98,6 +98,9 @@
#define cpu_has_smap boot_cpu_has(X86_FEATURE_SMAP)
#define cpu_has_sha boot_cpu_has(X86_FEATURE_SHA)
+/* CPUID level 0x00000007:0.ecx */
+#define cpu_has_sgx_lc boot_cpu_has(X86_FEATURE_SGX_LC)
+
/* CPUID level 0x80000007.edx */
#define cpu_has_itsc boot_cpu_has(X86_FEATURE_ITSC)
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index 63e11931cd09..004e0fb249d5 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -300,6 +300,11 @@
#define IA32_FEATURE_CONTROL_LMCE_ON 0x100000
#define IA32_FEATURE_CONTROL_SGX_LE_WR 0x20000
+#define MSR_IA32_SGXLEPUBKEYHASH0 0x0000008c
+#define MSR_IA32_SGXLEPUBKEYHASH1 0x0000008d
+#define MSR_IA32_SGXLEPUBKEYHASH2 0x0000008e
+#define MSR_IA32_SGXLEPUBKEYHASH3 0x0000008f
+
#define MSR_IA32_TSC_ADJUST 0x0000003b
#define MSR_IA32_APICBASE 0x0000001b
diff --git a/xen/include/asm-x86/msr.h b/xen/include/asm-x86/msr.h
index 751fa25a3694..e255a28f7fec 100644
--- a/xen/include/asm-x86/msr.h
+++ b/xen/include/asm-x86/msr.h
@@ -220,6 +220,11 @@ struct msr_vcpu_policy
bool available; /* This MSR is non-architectural */
bool cpuid_faulting;
} misc_features_enables;
+
+ struct {
+ bool lewr;
+ uint64_t ia32_sgxlepubkeyhash[4];
+ } sgx;
};
void init_guest_msr_policy(void);
diff --git a/xen/include/asm-x86/sgx.h b/xen/include/asm-x86/sgx.h
index 855e7e638743..97ca5dd5b1ef 100644
--- a/xen/include/asm-x86/sgx.h
+++ b/xen/include/asm-x86/sgx.h
@@ -24,6 +24,7 @@
#include <xen/types.h>
#include <xen/init.h>
#include <asm/processor.h>
+#include <asm/msr.h>
#include <public/hvm/params.h> /* HVM_PARAM_SGX */
#define SGX_CPUID 0x12
@@ -59,6 +60,8 @@ void detect_sgx(struct sgx_cpuinfo *sgxinfo);
void disable_sgx(void);
#define sgx_lewr() (boot_sgx_cpudata.lewr)
+DECLARE_PER_CPU(uint64_t[4], cpu_ia32_sgxlepubkeyhash);
+
struct page_info *alloc_epc_page(void);
void free_epc_page(struct page_info *epg);
@@ -74,4 +77,10 @@ int domain_populate_epc(struct domain *d, unsigned long epc_base_pfn,
int domain_reset_epc(struct domain *d, bool free_epc);
int domain_destroy_epc(struct domain *d);
+void sgx_set_vcpu_sgxlepubkeyhash(struct vcpu *v, int idx, uint64_t val);
+void sgx_ctxt_switch_to(struct vcpu *v);
+void sgx_msr_vcpu_init(struct vcpu *v, struct msr_vcpu_policy *vp);
+int sgx_msr_read_intercept(struct vcpu *v, unsigned int msr, u64 *msr_content);
+int sgx_msr_write_intercept(struct vcpu *v, unsigned int msr, u64 msr_content);
+
#endif /* __ASM_X86_SGX_H__ */
--
2.15.0
* [PATCH v2 12/17] xen: vmx: handle ENCLS VMEXIT
2017-12-04 0:15 [RFC PATCH v2 00/17] RFC: SGX Virtualization design and draft patches Boqun Feng
` (10 preceding siblings ...)
2017-12-04 0:15 ` [PATCH v2 11/17] xen: vmx: handle SGX related MSRs Boqun Feng
@ 2017-12-04 0:15 ` Boqun Feng
2017-12-04 0:15 ` [PATCH v2 13/17] xen: vmx: handle VMEXIT from SGX enclave Boqun Feng
` (5 subsequent siblings)
17 siblings, 0 replies; 23+ messages in thread
From: Boqun Feng @ 2017-12-04 0:15 UTC (permalink / raw)
To: xen-devel
Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
George Dunlap, Andrew Cooper, Ian Jackson,
Marek Marczykowski-Górecki, Tim Deegan, kai.huang,
Julien Grall, Jan Beulich, David Scott, Boqun Feng
From: Kai Huang <kai.huang@linux.intel.com>
Currently EPC is statically allocated and mapped into the guest, so we don't
have to trap ENCLS as it runs perfectly in VMX non-root mode. But exposing SGX
to the guest means we also expose the ENABLE_ENCLS bit to the L1 hypervisor,
therefore we cannot stop L1 from enabling ENCLS VMEXIT. An ENCLS VMEXIT from
an L2 guest is simply injected to L1; otherwise the ENCLS VMEXIT is unexpected
in L0 and we simply crash the domain.
Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
---
xen/arch/x86/hvm/vmx/vmx.c | 10 ++++++++++
xen/arch/x86/hvm/vmx/vvmx.c | 11 +++++++++++
xen/include/asm-x86/hvm/vmx/vmcs.h | 1 +
xen/include/asm-x86/hvm/vmx/vmx.h | 1 +
4 files changed, 23 insertions(+)
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index ce1c95f69062..c48c44565fc5 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -4118,6 +4118,16 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
vmx_handle_apic_write();
break;
+ case EXIT_REASON_ENCLS:
+ /*
+ * Currently L0 doesn't turn on ENCLS VMEXIT, but L0 cannot stop L1
+ * from enabling ENCLS VMEXIT. An ENCLS VMEXIT from an L2 guest has
+ * already been handled, so reaching here is a bug. We simply crash
+ * the domain.
+ */
+ domain_crash(v->domain);
+ break;
+
case EXIT_REASON_PML_FULL:
vmx_vcpu_flush_pml_buffer(v);
break;
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index dde02c076b9f..9c6123dc35ee 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -2094,6 +2094,12 @@ int nvmx_msr_read_intercept(unsigned int msr, u64 *msr_content)
SECONDARY_EXEC_ENABLE_VPID |
SECONDARY_EXEC_UNRESTRICTED_GUEST |
SECONDARY_EXEC_ENABLE_EPT;
+ /*
+ * If SGX is exposed to the guest, then the ENABLE_ENCLS bit must also
+ * be exposed to the guest.
+ */
+ if ( d->arch.cpuid->feat.sgx )
+ data |= SECONDARY_EXEC_ENABLE_ENCLS;
data = gen_vmx_msr(data, 0, host_data);
break;
case MSR_IA32_VMX_EXIT_CTLS:
@@ -2316,6 +2322,11 @@ int nvmx_n2_vmexit_handler(struct cpu_user_regs *regs,
case EXIT_REASON_VMXON:
case EXIT_REASON_INVEPT:
case EXIT_REASON_XSETBV:
+ /*
+ * L0 doesn't turn on ENCLS VMEXIT now, so an ENCLS VMEXIT must come from
+ * an L2 guest and must be because ENCLS VMEXIT is turned on by L1.
+ */
+ case EXIT_REASON_ENCLS:
/* inject to L1 */
nvcpu->nv_vmexit_pending = 1;
break;
diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h b/xen/include/asm-x86/hvm/vmx/vmcs.h
index 44ff4f0a113f..f68f3d0f6801 100644
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -407,6 +407,7 @@ enum vmcs_field {
VIRT_EXCEPTION_INFO = 0x0000202a,
XSS_EXIT_BITMAP = 0x0000202c,
TSC_MULTIPLIER = 0x00002032,
+ ENCLS_EXITING_BITMAP = 0x0000202e,
GUEST_PHYSICAL_ADDRESS = 0x00002400,
VMCS_LINK_POINTER = 0x00002800,
GUEST_IA32_DEBUGCTL = 0x00002802,
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
index 7341cb191ef2..8547de9168eb 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -215,6 +215,7 @@ static inline void pi_clear_sn(struct pi_desc *pi_desc)
#define EXIT_REASON_APIC_WRITE 56
#define EXIT_REASON_INVPCID 58
#define EXIT_REASON_VMFUNC 59
+#define EXIT_REASON_ENCLS 60
#define EXIT_REASON_PML_FULL 62
#define EXIT_REASON_XSAVES 63
#define EXIT_REASON_XRSTORS 64
--
2.15.0
* [PATCH v2 13/17] xen: vmx: handle VMEXIT from SGX enclave
2017-12-04 0:15 [RFC PATCH v2 00/17] RFC: SGX Virtualization design and draft patches Boqun Feng
` (11 preceding siblings ...)
2017-12-04 0:15 ` [PATCH v2 12/17] xen: vmx: handle ENCLS VMEXIT Boqun Feng
@ 2017-12-04 0:15 ` Boqun Feng
2017-12-04 0:15 ` [PATCH v2 14/17] xen: x86: reset EPC when guest got suspended Boqun Feng
` (4 subsequent siblings)
17 siblings, 0 replies; 23+ messages in thread
From: Boqun Feng @ 2017-12-04 0:15 UTC (permalink / raw)
To: xen-devel
Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
George Dunlap, Andrew Cooper, Ian Jackson,
Marek Marczykowski-Górecki, Tim Deegan, kai.huang,
Julien Grall, Jan Beulich, David Scott, Boqun Feng
From: Kai Huang <kai.huang@linux.intel.com>
VMX adds a new bit to both exit_reason and GUEST_INTERRUPT_STATE to indicate
whether the VMEXIT happened inside an enclave. Several instructions are also
invalid or behave differently inside an enclave according to the SDM. This
patch handles those cases.
Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
---
xen/arch/x86/hvm/vmx/vmx.c | 29 +++++++++++++++++++++++++++++
xen/include/asm-x86/hvm/vmx/vmcs.h | 2 ++
xen/include/asm-x86/hvm/vmx/vmx.h | 2 ++
3 files changed, 33 insertions(+)
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index c48c44565fc5..280fc82ca1ff 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -58,6 +58,7 @@
#include <asm/mce.h>
#include <asm/monitor.h>
#include <public/arch-x86/cpuid.h>
+#include <asm/sgx.h>
static bool_t __initdata opt_force_ept;
boolean_param("force-ept", opt_force_ept);
@@ -3536,6 +3537,7 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
unsigned long exit_qualification, exit_reason, idtv_info, intr_info = 0;
unsigned int vector = 0, mode;
struct vcpu *v = current;
+ bool_t exit_from_sgx_enclave;
__vmread(GUEST_RIP, ®s->rip);
__vmread(GUEST_RSP, ®s->rsp);
@@ -3561,6 +3563,11 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
perfc_incra(vmexits, exit_reason);
+ /*
+ * We need to handle several VMEXITs specially if the VMEXIT is from an
+ * enclave. Also clear bit 27 as it is of no further use.
+ */
+ exit_from_sgx_enclave = !!(exit_reason & VMX_EXIT_REASONS_FROM_ENCLAVE);
+ exit_reason &= ~VMX_EXIT_REASONS_FROM_ENCLAVE;
+
/* Handle the interrupt we missed before allowing any more in. */
switch ( (uint16_t)exit_reason )
{
@@ -4062,6 +4069,18 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
break;
case EXIT_REASON_INVD:
+ /*
+ * SDM 39.6.5 INVD Handling when Enclaves Are Enabled
+ *
+ * INVD causes #GP if EPC is enabled.
+ * FIXME: WBINVD??
+ */
+ if ( exit_from_sgx_enclave )
+ {
+ hvm_inject_hw_exception(TRAP_gp_fault, 0);
+ break;
+ }
+ /* Otherwise fall through */
case EXIT_REASON_WBINVD:
{
update_guest_eip(); /* Safe: INVD, WBINVD */
@@ -4073,6 +4092,16 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
{
paddr_t gpa;
+ /*
+ * Currently an EPT violation from an enclave is not possible, as all EPC
+ * pages are statically allocated to the guest when it is created. We
+ * simply crash the guest in this case.
+ */
+ if ( exit_from_sgx_enclave )
+ {
+ domain_crash(v->domain);
+ break;
+ }
__vmread(GUEST_PHYSICAL_ADDRESS, &gpa);
__vmread(EXIT_QUALIFICATION, &exit_qualification);
ept_handle_violation(exit_qualification, gpa);
diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h b/xen/include/asm-x86/hvm/vmx/vmcs.h
index f68f3d0f6801..52f137437b97 100644
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -338,6 +338,8 @@ extern u64 vmx_ept_vpid_cap;
#define VMX_INTR_SHADOW_MOV_SS 0x00000002
#define VMX_INTR_SHADOW_SMI 0x00000004
#define VMX_INTR_SHADOW_NMI 0x00000008
+#define VMX_INTR_ENCLAVE_INTR 0x00000010 /* VMEXIT was incident to
+ enclave mode */
#define VMX_BASIC_REVISION_MASK 0x7fffffff
#define VMX_BASIC_VMCS_SIZE_MASK (0x1fffULL << 32)
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
index 8547de9168eb..88d0dd600500 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -158,6 +158,8 @@ static inline void pi_clear_sn(struct pi_desc *pi_desc)
* Exit Reasons
*/
#define VMX_EXIT_REASONS_FAILED_VMENTRY 0x80000000
+/* Bit 27 is also set if VMEXIT is from SGX enclave mode */
+#define VMX_EXIT_REASONS_FROM_ENCLAVE 0x08000000
#define EXIT_REASON_EXCEPTION_NMI 0
#define EXIT_REASON_EXTERNAL_INTERRUPT 1
--
2.15.0
* [PATCH v2 14/17] xen: x86: reset EPC when guest got suspended.
2017-12-04 0:15 [RFC PATCH v2 00/17] RFC: SGX Virtualization design and draft patches Boqun Feng
` (12 preceding siblings ...)
2017-12-04 0:15 ` [PATCH v2 13/17] xen: vmx: handle VMEXIT from SGX enclave Boqun Feng
@ 2017-12-04 0:15 ` Boqun Feng
2017-12-04 0:15 ` [PATCH v2 15/17] xen: tools: add new 'sgx' parameter support Boqun Feng
` (3 subsequent siblings)
17 siblings, 0 replies; 23+ messages in thread
From: Boqun Feng @ 2017-12-04 0:15 UTC (permalink / raw)
To: xen-devel
Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
George Dunlap, Andrew Cooper, Ian Jackson,
Marek Marczykowski-Górecki, Tim Deegan, kai.huang,
Julien Grall, Jan Beulich, David Scott, Boqun Feng
From: Kai Huang <kai.huang@linux.intel.com>
EPC is destroyed when power state goes to S3-S5. Emulate this behavior.
A new function s3_suspend is added to hvm_function_table for this purpose.
Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
---
xen/arch/x86/hvm/hvm.c | 3 +++
xen/arch/x86/hvm/vmx/vmx.c | 7 +++++++
xen/include/asm-x86/hvm/hvm.h | 3 +++
3 files changed, 13 insertions(+)
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index c5e8467f3219..053c15afc46a 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3952,6 +3952,9 @@ static void hvm_s3_suspend(struct domain *d)
hvm_vcpu_reset_state(d->vcpu[0], 0xf000, 0xfff0);
+ if ( hvm_funcs.s3_suspend )
+ hvm_funcs.s3_suspend(d);
+
domain_unlock(d);
}
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 280fc82ca1ff..17190b06a421 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2307,6 +2307,12 @@ static bool vmx_get_pending_event(struct vcpu *v, struct x86_event *info)
return true;
}
+static void vmx_s3_suspend(struct domain *d)
+{
+ if ( d->arch.cpuid->feat.sgx )
+ domain_reset_epc(d, false);
+}
+
static struct hvm_function_table __initdata vmx_function_table = {
.name = "VMX",
.cpu_up_prepare = vmx_cpu_up_prepare,
@@ -2378,6 +2384,7 @@ static struct hvm_function_table __initdata vmx_function_table = {
.max_ratio = VMX_TSC_MULTIPLIER_MAX,
.setup = vmx_setup_tsc_scaling,
},
+ .s3_suspend = vmx_s3_suspend,
};
/* Handle VT-d posted-interrupt when VCPU is blocked. */
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 6ecad3331695..d9ff98a1b0ed 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -227,6 +227,9 @@ struct hvm_function_table {
/* Architecture function to setup TSC scaling ratio */
void (*setup)(struct vcpu *v);
} tsc_scaling;
+
+ /* Domain S3 suspend */
+ void (*s3_suspend)(struct domain *d);
};
extern struct hvm_function_table hvm_funcs;
--
2.15.0
* [PATCH v2 15/17] xen: tools: add new 'sgx' parameter support
2017-12-04 0:15 [RFC PATCH v2 00/17] RFC: SGX Virtualization design and draft patches Boqun Feng
` (13 preceding siblings ...)
2017-12-04 0:15 ` [PATCH v2 14/17] xen: x86: reset EPC when guest got suspended Boqun Feng
@ 2017-12-04 0:15 ` Boqun Feng
2017-12-04 0:15 ` [PATCH v2 16/17] xen: tools: add SGX to applying CPUID policy Boqun Feng
` (2 subsequent siblings)
17 siblings, 0 replies; 23+ messages in thread
From: Boqun Feng @ 2017-12-04 0:15 UTC (permalink / raw)
To: xen-devel
Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
George Dunlap, Andrew Cooper, Ian Jackson,
Marek Marczykowski-Górecki, Tim Deegan, kai.huang,
Julien Grall, Jan Beulich, David Scott, Boqun Feng
From: Kai Huang <kai.huang@linux.intel.com>
In order to configure a domain's SGX related attributes (EPC size, Launch
Enclave hash key, etc.), a new parameter 'sgx' is added to the XL
configuration file. The parameter should be in the following format:
sgx = 'epc=<size in MB>,lehash=<..>,lewr=<0|1>'
, in which 'lehash=<..>' and 'lewr=<0|1>' are optional.
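For example (the values below are purely illustrative), a guest with 64MB of
EPC whose launch enclave hash MSRs are left writable would use:

    sgx = 'epc=64,lewr=1'

'lehash=<..>', when present, takes the 64 hex characters of the SHA256 hash
of the launch enclave's public key.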
A new 'libxl_sgx_buildinfo', which contains the EPC base and size, the
Launch Enclave hash key and its writable permission, is also added to
libxl_domain_build_info. EPC base and size are also added to
'xc_dom_image' in order to add EPC to the e820 table. The EPC base is
calculated internally by the toolstack.
Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Signed-off-by: Boqun Feng <boqun.feng@intel.com>
---
tools/libxc/include/xc_dom.h | 4 +++
tools/libxl/libxl_create.c | 10 ++++++
tools/libxl/libxl_dom.c | 30 +++++++++++++++++
tools/libxl/libxl_internal.h | 2 ++
tools/libxl/libxl_types.idl | 11 +++++++
tools/libxl/libxl_x86.c | 12 +++++++
tools/xl/xl_parse.c | 76 ++++++++++++++++++++++++++++++++++++++++++++
tools/xl/xl_parse.h | 1 +
8 files changed, 146 insertions(+)
diff --git a/tools/libxc/include/xc_dom.h b/tools/libxc/include/xc_dom.h
index cdcdd07d2bc2..8440532d0e9d 100644
--- a/tools/libxc/include/xc_dom.h
+++ b/tools/libxc/include/xc_dom.h
@@ -203,6 +203,10 @@ struct xc_dom_image {
xen_paddr_t lowmem_end;
xen_paddr_t highmem_end;
xen_pfn_t vga_hole_size;
+#if defined(__i386__) || defined(__x86_64__)
+ xen_paddr_t epc_base;
+ xen_paddr_t epc_size;
+#endif
/* If unset disables the setup of the IOREQ pages. */
bool device_model;
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index f15fb215c24b..6a5863cd9637 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -59,6 +59,14 @@ void libxl__rdm_setdefault(libxl__gc *gc, libxl_domain_build_info *b_info)
LIBXL_RDM_MEM_BOUNDARY_MEMKB_DEFAULT;
}
+void libxl__sgx_setdefault(libxl__gc *gc, libxl_domain_build_info *b_info)
+{
+ if (b_info->u.hvm.sgx.epckb == LIBXL_MEMKB_DEFAULT)
+ b_info->u.hvm.sgx.epckb = 0;
+ b_info->u.hvm.sgx.epcbase = 0;
+ libxl_defbool_setdefault(&b_info->u.hvm.sgx.lewr, false);
+}
+
int libxl__domain_build_info_setdefault(libxl__gc *gc,
libxl_domain_build_info *b_info)
{
@@ -359,6 +367,8 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
libxl_defbool_setdefault(&b_info->u.hvm.gfx_passthru, false);
libxl__rdm_setdefault(gc, b_info);
+
+ libxl__sgx_setdefault(gc, b_info);
break;
case LIBXL_DOMAIN_TYPE_PV:
libxl_defbool_setdefault(&b_info->u.pv.e820_host, false);
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index ef834e652d65..bbdba7e6e292 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -1213,6 +1213,36 @@ int libxl__build_hvm(libxl__gc *gc, uint32_t domid,
highmem_end = (1ull << 32) + (lowmem_end - mmio_start);
lowmem_end = mmio_start;
}
+#if defined(__i386__) || defined(__x86_64__)
+ if (info->u.hvm.sgx.epckb) {
+ /*
+ * FIXME:
+ *
+ * Currently EPC base is put at highmem_end + 8G, which should be
+ * safe in most cases.
+ *
+ * I am not quite sure which is the best way to calculate the EPC base.
+ * IMO we can either:
+ * 1) put EPC between lowmem_end and mmio_start, but this brings
+ * additional logic to handle, e.g. lowmem_end may become too small
+ * if EPC is large (shall we limit the domain's EPC size?), and hvmloader
+ * will try to enlarge the MMIO space up to lowmem_end, or even relocate
+ * lowmem -- all of this makes things complicated, so putting EPC
+ * in the hole between lowmem_end and mmio_start is probably not good.
+ * 2) put EPC after highmem_end, but hvmloader may also relocate MMIO
+ * resources to the area after highmem_end. Maybe the ideal way is to
+ * put EPC right after highmem_end, and change hvmloader to detect
+ * EPC and put high MMIO resources after EPC. I've done this but I
+ * found a strange bug where the EPT mappings of EPC (at least part
+ * of them) get removed, and I still cannot find by what.
+ * Currently the EPC base is put at highmem_end + 8G, and hvmloader code
+ * is not changed to handle EPC, but this should be safe for most cases.
+ */
+ info->u.hvm.sgx.epcbase = highmem_end + (2ULL << 32);
+ }
+ dom->epc_size = (info->u.hvm.sgx.epckb << 10);
+ dom->epc_base = info->u.hvm.sgx.epcbase;
+#endif
dom->lowmem_end = lowmem_end;
dom->highmem_end = highmem_end;
dom->mmio_start = mmio_start;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index bfa95d861901..ec3522f1b0e0 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1253,6 +1253,8 @@ _hidden int libxl__domain_build_info_setdefault(libxl__gc *gc,
libxl_domain_build_info *b_info);
_hidden void libxl__rdm_setdefault(libxl__gc *gc,
libxl_domain_build_info *b_info);
+_hidden void libxl__sgx_setdefault(libxl__gc *gc,
+ libxl_domain_build_info *b_info);
_hidden const char *libxl__device_nic_devname(libxl__gc *gc,
uint32_t domid,
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index a23932434163..762de807c7ed 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -457,6 +457,16 @@ libxl_altp2m_mode = Enumeration("altp2m_mode", [
(3, "limited"),
], init_val = "LIBXL_ALTP2M_MODE_DISABLED")
+libxl_sgx_buildinfo = Struct("sgx_buildinfo", [
+ ("epcbase", uint64), # EPC base address
+ ("epckb", MemKB), # EPC size in KB
+ ("lehash0", uint64), # Default SGXPUBKEYHASH
+ ("lehash1", uint64), # Default SGXPUBKEYHASH
+ ("lehash2", uint64), # Default SGXPUBKEYHASH
+ ("lehash3", uint64), # Default SGXPUBKEYHASH
+ ("lewr", libxl_defbool), # SGXPUBKEYHASH writable or not
+ ], dir=DIR_IN)
+
libxl_domain_build_info = Struct("domain_build_info",[
("max_vcpus", integer),
("avail_vcpus", libxl_bitmap),
@@ -581,6 +591,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
("rdm", libxl_rdm_reserve),
("rdm_mem_boundary_memkb", MemKB),
("mca_caps", uint64),
+ ("sgx", libxl_sgx_buildinfo),
])),
("pv", Struct(None, [("kernel", string),
("slack_memkb", MemKB),
diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
index 5f91fe4f92d8..01bd2f8eeef0 100644
--- a/tools/libxl/libxl_x86.c
+++ b/tools/libxl/libxl_x86.c
@@ -539,6 +539,9 @@ int libxl__arch_domain_construct_memmap(libxl__gc *gc,
if (dom->acpi_modules[i].length)
e820_entries++;
+ if (dom->epc_base && dom->epc_size)
+ e820_entries++;
+
if (e820_entries >= E820MAX) {
LOGD(ERROR, domid, "Ooops! Too many entries in the memory map!");
rc = ERROR_INVAL;
@@ -579,6 +582,15 @@ int libxl__arch_domain_construct_memmap(libxl__gc *gc,
e820[nr].addr = ((uint64_t)1 << 32);
e820[nr].size = highmem_size;
e820[nr].type = E820_RAM;
+ nr++;
+ }
+
+ /* EPC */
+ if (dom->epc_base && dom->epc_size) {
+ e820[nr].addr = dom->epc_base;
+ e820[nr].size = dom->epc_size;
+ e820[nr].type = E820_RESERVED;
+ nr++;
}
if (xc_domain_set_memory_map(CTX->xch, domid, e820, e820_entries) != 0) {
diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
index 9a692d5ae644..e96612bc71f3 100644
--- a/tools/xl/xl_parse.c
+++ b/tools/xl/xl_parse.c
@@ -804,6 +804,60 @@ int parse_usbdev_config(libxl_device_usbdev *usbdev, char *token)
return 0;
}
+static uint64_t swap_uint64(uint64_t u)
+{
+ u = ((u << 8) & 0xFF00FF00FF00FF00ULL) | ((u >> 8) & 0x00FF00FF00FF00FFULL);
+ u = ((u << 16) & 0xFFFF0000FFFF0000ULL) | ((u >> 16) & 0x0000FFFF0000FFFFULL);
+ return (u << 32) | (u >> 32);
+}
+
+int parse_sgx_config(libxl_sgx_buildinfo *sgx, char *token)
+{
+ char *oparg;
+ long l;
+
+ if (MATCH_OPTION("epc", token, oparg)) {
+ l = strtol(oparg, NULL, 0);
+
+ /* Get EPC size. EPC base is calculated by toolstack later. */
+ if (l >= 0) {
+ sgx->epckb = l * 1024;
+ }
+ } else if (MATCH_OPTION("lehash", token, oparg)) {
+ if (strlen(oparg) != 64) { /* not 256bit hash */
+ fprintf(stderr, "'lehash=<...>' requires 256bit SHA256 hash\n");
+ return 1;
+ }
+
+ char buf[17];
+
+ memset(buf, 0, 17);
+
+ memcpy(buf, oparg, 16);
+ oparg += 16;
+ sgx->lehash0 = swap_uint64(strtoull(buf, NULL, 16));
+
+ memcpy(buf, oparg, 16);
+ oparg += 16;
+ sgx->lehash1 = swap_uint64(strtoull(buf, NULL, 16));
+
+ memcpy(buf, oparg, 16);
+ oparg += 16;
+ sgx->lehash2 = swap_uint64(strtoull(buf, NULL, 16));
+
+ memcpy(buf, oparg, 16);
+ oparg += 16;
+ sgx->lehash3 = swap_uint64(strtoull(buf, NULL, 16));
+ } else if (MATCH_OPTION("lewr", token, oparg)) {
+ libxl_defbool_set(&sgx->lewr, !!strtoul(oparg, NULL, 0));
+ } else {
+ fprintf(stderr, "Unknown string `%s' in sgx config\n", token);
+ return 1;
+ }
+
+ return 0;
+}
+
int parse_vdispl_config(libxl_device_vdispl *vdispl, char *token)
{
char *oparg;
@@ -1323,6 +1377,28 @@ void parse_config_data(const char *config_source,
if (!xlu_cfg_get_long (config, "rdm_mem_boundary", &l, 0))
b_info->u.hvm.rdm_mem_boundary_memkb = l * 1024;
+ if (!xlu_cfg_get_string(config, "sgx", &buf, 0)) {
+ char *buf2 = strdup(buf);
+ char *p;
+
+ b_info->u.hvm.sgx.lehash0 = 0;
+ b_info->u.hvm.sgx.lehash1 = 0;
+ b_info->u.hvm.sgx.lehash2 = 0;
+ b_info->u.hvm.sgx.lehash3 = 0;
+
+ p = strtok(buf2, ",");
+ if (!p)
+ goto skip_sgx;
+ do {
+ while (*p == ' ')
+ p++;
+ if (parse_sgx_config(&b_info->u.hvm.sgx, p))
+ exit(1);
+ } while ((p = strtok(NULL, ",")) != NULL);
+skip_sgx:
+ free(buf2);
+ }
+
switch (xlu_cfg_get_list(config, "mca_caps",
&mca_caps, &num_mca_caps, 1))
{
diff --git a/tools/xl/xl_parse.h b/tools/xl/xl_parse.h
index cc459fb43f4a..14eb69b8e6aa 100644
--- a/tools/xl/xl_parse.h
+++ b/tools/xl/xl_parse.h
@@ -31,6 +31,7 @@ void parse_disk_config_multistring(XLU_Config **config,
libxl_device_disk *disk);
int parse_usbctrl_config(libxl_device_usbctrl *usbctrl, char *token);
int parse_usbdev_config(libxl_device_usbdev *usbdev, char *token);
+int parse_sgx_config(libxl_sgx_buildinfo *sgx, char *token);
int parse_cpurange(const char *cpu, libxl_bitmap *cpumap);
int parse_nic_config(libxl_device_nic *nic, XLU_Config **config, char *token);
int parse_vdispl_config(libxl_device_vdispl *vdispl, char *token);
--
2.15.0
* [PATCH v2 16/17] xen: tools: add SGX to applying CPUID policy
2017-12-04 0:15 [RFC PATCH v2 00/17] RFC: SGX Virtualization design and draft patches Boqun Feng
` (14 preceding siblings ...)
2017-12-04 0:15 ` [PATCH v2 15/17] xen: tools: add new 'sgx' parameter support Boqun Feng
@ 2017-12-04 0:15 ` Boqun Feng
2017-12-04 0:15 ` [PATCH v2 17/17] xen: tools: add SGX to applying MSR policy Boqun Feng
2017-12-25 5:01 ` [RFC PATCH v2 00/17] RFC: SGX Virtualization design and draft patches Boqun Feng
17 siblings, 0 replies; 23+ messages in thread
From: Boqun Feng @ 2017-12-04 0:15 UTC (permalink / raw)
To: xen-devel
Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
George Dunlap, Andrew Cooper, Ian Jackson,
Marek Marczykowski-Górecki, Tim Deegan, kai.huang,
Julien Grall, Jan Beulich, David Scott, Boqun Feng
From: Kai Huang <kai.huang@linux.intel.com>
In libxc, a new structure 'xc_cpuid_policy_build_info_t' is added to carry
the domain's EPC base and size info from libxl. libxl_cpuid_apply_policy is
also changed to take 'libxl_domain_build_info_t' as a parameter, from which
the domain's EPC base and size can be obtained and passed to
xc_cpuid_apply_policy.
xc_cpuid_apply_policy is extended to support the SGX CPUID leaf. If the
hypervisor doesn't report the SGX feature in the host cpufeatureset, then
using the 'epc' parameter results in domain creation failure, as SGX cannot
be supported.
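As a purely illustrative example of the sub-leaf 2 fixup done below (the
numbers are made up): with epc_base = 0x240000000 and epc_size = 64MB
(0x4000000), CPUID leaf 0x12, sub-leaf 2 would be set to

    EAX = 0x40000001   bits 31:12 of the EPC base; low nibble 0x1 marks a
                       valid EPC section
    EBX = 0x00000002   upper bits of the EPC base
    ECX = 0x04000001   bits 31:12 of the EPC size; low nibble 0x1
    EDX = 0x00000000   upper bits of the EPC size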
Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
---
tools/libxc/include/xenctrl.h | 14 ++++++++
tools/libxc/xc_cpuid_x86.c | 68 ++++++++++++++++++++++++++++++++++---
tools/libxl/libxl.h | 3 +-
tools/libxl/libxl_cpuid.c | 15 ++++++--
tools/libxl/libxl_dom.c | 6 +++-
tools/libxl/libxl_nocpuid.c | 4 ++-
tools/ocaml/libs/xc/xenctrl_stubs.c | 11 +++++-
tools/python/xen/lowlevel/xc/xc.c | 11 +++++-
8 files changed, 121 insertions(+), 11 deletions(-)
diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 666db0b9193e..ad4429ca5ffd 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1827,6 +1827,19 @@ int xc_domain_debug_control(xc_interface *xch,
uint32_t vcpu);
#if defined(__i386__) || defined(__x86_64__)
+typedef struct xc_cpuid_policy_build_info_sgx {
+ uint64_t epc_base;
+ uint64_t epc_size;
+} xc_cpuid_policy_build_info_sgx_t;
+
+typedef struct xc_cpuid_policy_build_info {
+ xc_cpuid_policy_build_info_sgx_t sgx;
+} xc_cpuid_policy_build_info_t;
+
+int xc_cpuid_check(xc_interface *xch,
+ const unsigned int *input,
+ const char **config,
+ char **config_transformed);
int xc_cpuid_set(xc_interface *xch,
uint32_t domid,
const unsigned int *input,
@@ -1834,6 +1847,7 @@ int xc_cpuid_set(xc_interface *xch,
char **config_transformed);
int xc_cpuid_apply_policy(xc_interface *xch,
uint32_t domid,
+ xc_cpuid_policy_build_info_t *b_info,
uint32_t *featureset,
unsigned int nr_features);
void xc_cpuid_to_str(const unsigned int *regs,
diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index 25b922ea2184..a778acf79a64 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -38,7 +38,7 @@ enum {
#define clear_feature(idx, dst) ((dst) &= ~bitmaskof(idx))
#define set_feature(idx, dst) ((dst) |= bitmaskof(idx))
-#define DEF_MAX_BASE 0x0000000du
+#define DEF_MAX_BASE 0x00000012u
#define DEF_MAX_INTELEXT 0x80000008u
#define DEF_MAX_AMDEXT 0x8000001cu
@@ -178,6 +178,8 @@ struct cpuid_domain_info
/* HVM-only information. */
bool pae;
bool nestedhvm;
+
+ xc_cpuid_policy_build_info_t *b_info;
};
static void cpuid(const unsigned int *input, unsigned int *regs)
@@ -369,6 +371,12 @@ static void intel_xc_cpuid_policy(xc_interface *xch,
const struct cpuid_domain_info *info,
const unsigned int *input, unsigned int *regs)
{
+ xc_cpuid_policy_build_info_t *b_info = info->b_info;
+ xc_cpuid_policy_build_info_sgx_t *sgx = NULL;
+
+ if ( b_info )
+ sgx = &b_info->sgx;
+
switch ( input[0] )
{
case 0x00000004:
@@ -381,6 +389,30 @@ static void intel_xc_cpuid_policy(xc_interface *xch,
regs[3] &= 0x3ffu;
break;
+ case 0x00000012:
+ if ( !sgx ) {
+ regs[0] = regs[1] = regs[2] = regs[3] = 0;
+ break;
+ }
+
+ if ( !sgx->epc_base || !sgx->epc_size ) {
+ regs[0] = regs[1] = regs[2] = regs[3] = 0;
+ break;
+ }
+
+ if ( input[1] == 2 ) {
+ /*
+ * Fix up EPC base and size for SGX CPUID leaf 2. The Xen hypervisor
+ * depends on XEN_DOMCTL_set_cpuid to learn the domain's EPC base
+ * and size.
+ */
+ regs[0] = (uint32_t)(sgx->epc_base & 0xfffff000) | 0x1;
+ regs[1] = (uint32_t)(sgx->epc_base >> 32);
+ regs[2] = (uint32_t)(sgx->epc_size & 0xfffff000) | 0x1;
+ regs[3] = (uint32_t)(sgx->epc_size >> 32);
+ }
+ break;
+
case 0x80000000:
if ( regs[0] > DEF_MAX_INTELEXT )
regs[0] = DEF_MAX_INTELEXT;
@@ -444,6 +476,10 @@ static void xc_cpuid_hvm_policy(xc_interface *xch,
regs[1] = regs[2] = regs[3] = 0;
break;
+ case 0x00000012:
+ /* Intel SGX. Passthrough to Intel function */
+ break;
+
case 0x80000000:
/* Passthrough to cpu vendor specific functions */
break;
@@ -649,12 +685,13 @@ void xc_cpuid_to_str(const unsigned int *regs, char **strs)
}
}
-static void sanitise_featureset(struct cpuid_domain_info *info)
+static int sanitise_featureset(struct cpuid_domain_info *info)
{
const uint32_t fs_size = xc_get_cpu_featureset_size();
uint32_t disabled_features[fs_size];
static const uint32_t deep_features[] = INIT_DEEP_FEATURES;
unsigned int i, b;
+ xc_cpuid_policy_build_info_t *b_info = info->b_info;
if ( info->hvm )
{
@@ -707,9 +744,19 @@ static void sanitise_featureset(struct cpuid_domain_info *info)
disabled_features[i] &= ~dfs[i];
}
}
+
+ /* Cannot support 'epc' parameter if SGX is unavailable */
+ if ( b_info && b_info->sgx.epc_base && b_info->sgx.epc_size )
+ if (!test_bit(X86_FEATURE_SGX, info->featureset)) {
+ printf("Xen hypervisor doesn't support SGX.\n");
+ return -EFAULT;
+ }
+
+ return 0;
}
int xc_cpuid_apply_policy(xc_interface *xch, uint32_t domid,
+ xc_cpuid_policy_build_info_t *b_info,
uint32_t *featureset,
unsigned int nr_features)
{
@@ -722,6 +769,8 @@ int xc_cpuid_apply_policy(xc_interface *xch, uint32_t domid,
if ( rc )
goto out;
+ info.b_info = b_info;
+
cpuid(input, regs);
base_max = (regs[0] <= DEF_MAX_BASE) ? regs[0] : DEF_MAX_BASE;
input[0] = 0x80000000;
@@ -732,7 +781,9 @@ int xc_cpuid_apply_policy(xc_interface *xch, uint32_t domid,
else
ext_max = (regs[0] <= DEF_MAX_INTELEXT) ? regs[0] : DEF_MAX_INTELEXT;
- sanitise_featureset(&info);
+ rc = sanitise_featureset(&info);
+ if ( rc )
+ goto out;
input[0] = 0;
input[1] = XEN_CPUID_INPUT_UNUSED;
@@ -757,12 +808,21 @@ int xc_cpuid_apply_policy(xc_interface *xch, uint32_t domid,
continue;
}
+ /* Intel SGX */
+ if ( input[0] == 0x12 )
+ {
+ input[1]++;
+ /* Intel SGX leaf 0x12 has 3 sub-leaves */
+ if ( input[1] < 3 )
+ continue;
+ }
+
input[0]++;
if ( !(input[0] & 0x80000000u) && (input[0] > base_max ) )
input[0] = 0x80000000u;
input[1] = XEN_CPUID_INPUT_UNUSED;
- if ( (input[0] == 4) || (input[0] == 7) )
+ if ( (input[0] == 4) || (input[0] == 7) || (input[0] == 0x12) )
input[1] = 0;
else if ( input[0] == 0xd )
input[1] = 1; /* Xen automatically calculates almost everything. */
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 5e9aed739d7a..1a8a1d786ceb 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -2049,7 +2049,8 @@ libxl_device_pci *libxl_device_pci_assignable_list(libxl_ctx *ctx, int *num);
int libxl_cpuid_parse_config(libxl_cpuid_policy_list *cpuid, const char* str);
int libxl_cpuid_parse_config_xend(libxl_cpuid_policy_list *cpuid,
const char* str);
-void libxl_cpuid_apply_policy(libxl_ctx *ctx, uint32_t domid);
+int libxl_cpuid_apply_policy(libxl_ctx *ctx, uint32_t domid,
+ libxl_domain_build_info *info);
void libxl_cpuid_set(libxl_ctx *ctx, uint32_t domid,
libxl_cpuid_policy_list cpuid);
diff --git a/tools/libxl/libxl_cpuid.c b/tools/libxl/libxl_cpuid.c
index e692b6156979..5fb74322b99a 100644
--- a/tools/libxl/libxl_cpuid.c
+++ b/tools/libxl/libxl_cpuid.c
@@ -386,9 +386,20 @@ int libxl_cpuid_parse_config_xend(libxl_cpuid_policy_list *cpuid,
return 0;
}
-void libxl_cpuid_apply_policy(libxl_ctx *ctx, uint32_t domid)
+int libxl_cpuid_apply_policy(libxl_ctx *ctx, uint32_t domid,
+ libxl_domain_build_info *info)
{
- xc_cpuid_apply_policy(ctx->xch, domid, NULL, 0);
+ xc_cpuid_policy_build_info_t cpuid_binfo;
+
+ memset(&cpuid_binfo, 0, sizeof (xc_cpuid_policy_build_info_t));
+
+ /* Currently only Intel SGX needs info when applying CPUID policy */
+ if (info->type == LIBXL_DOMAIN_TYPE_HVM) {
+ cpuid_binfo.sgx.epc_base = info->u.hvm.sgx.epcbase;
+ cpuid_binfo.sgx.epc_size = (info->u.hvm.sgx.epckb << 10);
+ }
+
+ return xc_cpuid_apply_policy(ctx->xch, domid, &cpuid_binfo, NULL, 0);
}
void libxl_cpuid_set(libxl_ctx *ctx, uint32_t domid,
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index bbdba7e6e292..ac38ad65dd19 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -597,7 +597,11 @@ int libxl__build_post(libxl__gc *gc, uint32_t domid,
return ERROR_FAIL;
}
- libxl_cpuid_apply_policy(ctx, domid);
+ rc = libxl_cpuid_apply_policy(ctx, domid, info);
+ if (rc) {
+ LOG(ERROR, "Failed to apply CPUID policy (%d)", rc);
+ return ERROR_FAIL;
+ }
if (info->cpuid != NULL)
libxl_cpuid_set(ctx, domid, info->cpuid);
diff --git a/tools/libxl/libxl_nocpuid.c b/tools/libxl/libxl_nocpuid.c
index ef1161c4342b..70e0486e981b 100644
--- a/tools/libxl/libxl_nocpuid.c
+++ b/tools/libxl/libxl_nocpuid.c
@@ -34,8 +34,10 @@ int libxl_cpuid_parse_config_xend(libxl_cpuid_policy_list *cpuid,
return 0;
}
-void libxl_cpuid_apply_policy(libxl_ctx *ctx, uint32_t domid)
+int libxl_cpuid_apply_policy(libxl_ctx *ctx, uint32_t domid,
+ libxl_domain_build_info *info)
{
+ return 0;
}
void libxl_cpuid_set(libxl_ctx *ctx, uint32_t domid,
diff --git a/tools/ocaml/libs/xc/xenctrl_stubs.c b/tools/ocaml/libs/xc/xenctrl_stubs.c
index c66732f67c89..4c469dd22f6e 100644
--- a/tools/ocaml/libs/xc/xenctrl_stubs.c
+++ b/tools/ocaml/libs/xc/xenctrl_stubs.c
@@ -796,7 +796,16 @@ CAMLprim value stub_xc_domain_cpuid_apply_policy(value xch, value domid)
#if defined(__i386__) || defined(__x86_64__)
int r;
- r = xc_cpuid_apply_policy(_H(xch), _D(domid), NULL, 0);
+ /*
+ * FIXME:
+ *
+ * Don't support passing SGX info to xc_cpuid_apply_policy here. To be
+ * honest I don't know the purpose of this CAML function, so I don't
+ * know whether we need to allow *caller* of this function to pass SGX
+ * info. As the EPC base is calculated internally by the toolstack, I
+ * think it is also impossible to pass the EPC base from the *user*.
+ */
+ r = xc_cpuid_apply_policy(_H(xch), _D(domid), NULL, NULL, 0);
if (r < 0)
failwith_xc(_H(xch));
#else
diff --git a/tools/python/xen/lowlevel/xc/xc.c b/tools/python/xen/lowlevel/xc/xc.c
index f501764100ad..bdecd0466a9a 100644
--- a/tools/python/xen/lowlevel/xc/xc.c
+++ b/tools/python/xen/lowlevel/xc/xc.c
@@ -719,7 +719,16 @@ static PyObject *pyxc_dom_set_policy_cpuid(XcObject *self,
if ( !PyArg_ParseTuple(args, "i", &domid) )
return NULL;
- if ( xc_cpuid_apply_policy(self->xc_handle, domid, NULL, 0) )
+ /*
+ * FIXME:
+ *
+ * Don't support passing SGX info to xc_cpuid_apply_policy here. To be
+ * honest I don't know the purpose of this python function, so I don't
+ * know whether we need to allow *caller* of this function to pass SGX
+ * info. As the EPC base is calculated internally by the toolstack, I
+ * think it is also impossible to pass the EPC base from the *user*.
+ */
+ if ( xc_cpuid_apply_policy(self->xc_handle, domid, NULL, NULL, 0) )
return pyxc_error_to_exception(self->xc_handle);
Py_INCREF(zero);
--
2.15.0
* [PATCH v2 17/17] xen: tools: add SGX to applying MSR policy
2017-12-04 0:15 [RFC PATCH v2 00/17] RFC: SGX Virtualization design and draft patches Boqun Feng
` (15 preceding siblings ...)
2017-12-04 0:15 ` [PATCH v2 16/17] xen: tools: add SGX to applying CPUID policy Boqun Feng
@ 2017-12-04 0:15 ` Boqun Feng
2017-12-25 5:01 ` [RFC PATCH v2 00/17] RFC: SGX Virtualization design and draft patches Boqun Feng
17 siblings, 0 replies; 23+ messages in thread
From: Boqun Feng @ 2017-12-04 0:15 UTC (permalink / raw)
To: xen-devel
Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
George Dunlap, Andrew Cooper, Ian Jackson,
Marek Marczykowski-Górecki, Tim Deegan, kai.huang,
Julien Grall, Jan Beulich, David Scott, Boqun Feng
In libxc, a new function 'xc_msr_sgx_set' is added, which applies the SGX
related MSR policy to the target domain. This function takes the values of
'lewr' and 'lehash*' in 'libxl_sgx_buildinfo', and sets the proper MSRs in
all vcpus via the 'XEN_DOMCTL_set_vcpu_msrs' hypercall.
If the physical IA32_SGXLEPUBKEYHASHn MSRs are writable:
* The domain's IA32_FEATURE_CONTROL_SGX_LE_WR bit depends on 'lewr'
(default false)
* If 'lehash' is unset, do nothing, as we already set the proper value
in sgx_domain_msr_init().
* If 'lehash' is set, set the domain's virtual IA32_SGXLEPUBKEYHASHn
with its value, and later on the vcpu's virtual IA32_SGXLEPUBKEYHASHn
will be set with the same value.
If the physical IA32_SGXLEPUBKEYHASHn MSRs are not writable, using the
'lehash' or 'lewr' parameter results in domain creation failure.
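As a purely illustrative sketch of the resulting policy: with 'lewr=1' and a
'lehash' value in the domain config, xc_msr_sgx_set() issues
XEN_DOMCTL_set_vcpu_msrs for every vcpu with five entries:

    IA32_FEATURE_CONTROL     = LOCK | SGX_ENABLE | SGX_LE_WR
    IA32_SGXLEPUBKEYHASH0..3 = the four 64-bit words parsed from 'lehash'

With only 'lewr=1' and no 'lehash', a single IA32_FEATURE_CONTROL entry is
sent and the hash MSRs keep the defaults chosen by the hypervisor.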
Signed-off-by: Boqun Feng <boqun.feng@intel.com>
---
tools/libxc/Makefile | 1 +
tools/libxc/include/xenctrl.h | 2 ++
tools/libxc/xc_msr_x86.h | 10 ++++++
tools/libxc/xc_sgx.c | 82 +++++++++++++++++++++++++++++++++++++++++++
tools/libxl/libxl_dom.c | 29 +++++++++++++++
tools/xl/xl_parse.c | 10 ++++++
6 files changed, 134 insertions(+)
create mode 100644 tools/libxc/xc_sgx.c
diff --git a/tools/libxc/Makefile b/tools/libxc/Makefile
index 9a019e8dfed5..428430a15c40 100644
--- a/tools/libxc/Makefile
+++ b/tools/libxc/Makefile
@@ -41,6 +41,7 @@ CTRL_SRCS-y += xc_foreign_memory.c
CTRL_SRCS-y += xc_kexec.c
CTRL_SRCS-y += xc_resource.c
CTRL_SRCS-$(CONFIG_X86) += xc_psr.c
+CTRL_SRCS-$(CONFIG_X86) += xc_sgx.c
CTRL_SRCS-$(CONFIG_X86) += xc_pagetab.c
CTRL_SRCS-$(CONFIG_Linux) += xc_linux.c
CTRL_SRCS-$(CONFIG_FreeBSD) += xc_freebsd.c
diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index ad4429ca5ffd..abc9f711141a 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1855,6 +1855,8 @@ void xc_cpuid_to_str(const unsigned int *regs,
int xc_mca_op(xc_interface *xch, struct xen_mc *mc);
int xc_mca_op_inject_v2(xc_interface *xch, unsigned int flags,
xc_cpumap_t cpumap, unsigned int nr_cpus);
+int xc_msr_sgx_set(xc_interface *xch, uint32_t domid, bool lewr,
+ uint64_t *lehash, int max_vcpu);
#endif
struct xc_px_val {
diff --git a/tools/libxc/xc_msr_x86.h b/tools/libxc/xc_msr_x86.h
index 7f100e71a7a1..54eaa4de8945 100644
--- a/tools/libxc/xc_msr_x86.h
+++ b/tools/libxc/xc_msr_x86.h
@@ -24,6 +24,16 @@
#define MSR_IA32_CMT_EVTSEL 0x00000c8d
#define MSR_IA32_CMT_CTR 0x00000c8e
+#define MSR_IA32_FEATURE_CONTROL 0x0000003a
+#define IA32_FEATURE_CONTROL_LOCK 0x0001
+#define IA32_FEATURE_CONTROL_SGX_ENABLE 0x40000
+#define IA32_FEATURE_CONTROL_SGX_LE_WR 0x20000
+
+#define MSR_IA32_SGXLEPUBKEYHASH0 0x0000008c
+#define MSR_IA32_SGXLEPUBKEYHASH1 0x0000008d
+#define MSR_IA32_SGXLEPUBKEYHASH2 0x0000008e
+#define MSR_IA32_SGXLEPUBKEYHASH3 0x0000008f
+
#endif
/*
diff --git a/tools/libxc/xc_sgx.c b/tools/libxc/xc_sgx.c
new file mode 100644
index 000000000000..8f97ca0042e0
--- /dev/null
+++ b/tools/libxc/xc_sgx.c
@@ -0,0 +1,82 @@
+/*
+ * xc_sgx.c
+ *
+ * SGX related MSR setup
+ *
+ * Copyright (C) 2017 Intel Corporation
+ * Author Boqun Feng <boqun.feng@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include <assert.h>
+#include "xc_private.h"
+#include "xc_msr_x86.h"
+
+int xc_msr_sgx_set(xc_interface *xch, uint32_t domid, bool lewr,
+ uint64_t *lehash, int max_vcpu)
+{
+ int rc, i, nr_msrs;
+ DECLARE_DOMCTL;
+ xen_domctl_vcpu_msr_t sgx_msrs[5];
+ DECLARE_HYPERCALL_BUFFER(void, buffer);
+
+ if ( !lehash && !lewr )
+ return 0;
+
+ sgx_msrs[0].index = MSR_IA32_FEATURE_CONTROL;
+ sgx_msrs[0].reserved = 0;
+ sgx_msrs[0].value = IA32_FEATURE_CONTROL_LOCK |
+ IA32_FEATURE_CONTROL_SGX_ENABLE |
+ (lewr ? IA32_FEATURE_CONTROL_SGX_LE_WR : 0);
+
+ if ( !lehash )
+ nr_msrs = 1;
+ else
+ {
+ nr_msrs = 5;
+
+ for ( i = 0; i < 4; i++ )
+ {
+ sgx_msrs[i+1].index = MSR_IA32_SGXLEPUBKEYHASH0 + i;
+ sgx_msrs[i+1].reserved = 0;
+ sgx_msrs[i+1].value = lehash[i];
+ }
+ }
+
+ buffer = xc_hypercall_buffer_alloc(xch, buffer,
+ nr_msrs * sizeof(xen_domctl_vcpu_msr_t));
+ if ( !buffer )
+ {
+ ERROR("Unable to allocate %zu bytes for msr hypercall buffer",
+ nr_msrs * sizeof(xen_domctl_vcpu_msr_t));
+ return -1;
+ }
+
+ domctl.cmd = XEN_DOMCTL_set_vcpu_msrs;
+ domctl.domain = domid;
+ domctl.u.vcpu_msrs.msr_count = nr_msrs;
+ set_xen_guest_handle(domctl.u.vcpu_msrs.msrs, buffer);
+
+ memcpy(buffer, sgx_msrs, nr_msrs * sizeof(xen_domctl_vcpu_msr_t));
+
+ for ( i = 0; i < max_vcpu; i++ ) {
+ domctl.u.vcpu_msrs.vcpu = i;
+ rc = xc_domctl(xch, &domctl);
+
+ if (rc)
+ break;
+ }
+
+ xc_hypercall_buffer_free(xch, buffer);
+
+ return rc;
+}
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index ac38ad65dd19..d5e33f8940ba 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -358,6 +358,35 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid,
return ERROR_FAIL;
}
+ if (info->type == LIBXL_DOMAIN_TYPE_HVM)
+ {
+ uint64_t lehash[4];
+
+ if ( !info->u.hvm.sgx.lehash0 && !info->u.hvm.sgx.lehash1 &&
+ !info->u.hvm.sgx.lehash2 && !info->u.hvm.sgx.lehash3 )
+ {
+ rc = xc_msr_sgx_set(ctx->xch, domid,
+ libxl_defbool_val(info->u.hvm.sgx.lewr),
+ NULL, info->max_vcpus);
+ }
+ else
+ {
+ lehash[0] = info->u.hvm.sgx.lehash0;
+ lehash[1] = info->u.hvm.sgx.lehash1;
+ lehash[2] = info->u.hvm.sgx.lehash2;
+ lehash[3] = info->u.hvm.sgx.lehash3;
+
+ rc = xc_msr_sgx_set(ctx->xch, domid,
+ libxl_defbool_val(info->u.hvm.sgx.lewr),
+ lehash, info->max_vcpus);
+ }
+
+ if (rc) {
+ LOG(ERROR, "Unable to set SGX related MSRs (%d)", rc);
+ return ERROR_FAIL;
+ }
+ }
+
if (xc_domain_set_gnttab_limits(ctx->xch, domid, info->max_grant_frames,
info->max_maptrack_frames) != 0) {
LOG(ERROR, "Couldn't set grant table limits");
diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
index e96612bc71f3..211ee832ca31 100644
--- a/tools/xl/xl_parse.c
+++ b/tools/xl/xl_parse.c
@@ -828,6 +828,16 @@ int parse_sgx_config(libxl_sgx_buildinfo *sgx, char *token)
fprintf(stderr, "'lehash=<...>' requires 256bit SHA256 hash\n");
return 1;
}
+
+ /*
+ * 'lehash' is a hex string of 32 bytes in little-endian, i.e. the
+ * leftmost byte is the least significant byte.
+ *
+ * We convert the hex string 8 bytes(64 bit) a time to uint64 via
+ * strtoull(). And strtoull() treats the string as big-endian,
+ * therefore we need to swap the value afterwards to get the correct
+ * value.
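+ *
+ * For example (hypothetical value): the 16-character chunk
+ * "0123456789abcdef" is parsed by strtoull() as 0x0123456789abcdef and
+ * byte-swapped to 0xefcdab8967452301, which is the value that ends up in
+ * the corresponding IA32_SGXLEPUBKEYHASHn MSR.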
+ */
char buf[17];
--
2.15.0
* Re: [PATCH v2 01/17] xen: x86: expose SGX to HVM domain in CPU featureset
2017-12-04 0:15 ` [PATCH v2 01/17] xen: x86: expose SGX to HVM domain in CPU featureset Boqun Feng
@ 2017-12-04 11:13 ` Julien Grall
2017-12-04 13:10 ` Boqun Feng
0 siblings, 1 reply; 23+ messages in thread
From: Julien Grall @ 2017-12-04 11:13 UTC (permalink / raw)
To: Boqun Feng, xen-devel
Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jan Beulich,
George Dunlap, Andrew Cooper, Ian Jackson,
Marek Marczykowski-Górecki, Tim Deegan, kai.huang,
Julien Grall, Jun Nakajima, David Scott
Hello,
I am not sure to understand why I am being CCed. But it looks like you
CC everyone on each patch... Please CC only relevant person on each patch.
Cheers,
On 04/12/17 00:15, Boqun Feng wrote:
> From: Kai Huang <kai.huang@linux.intel.com>
>
> Expose SGX in CPU featureset for HVM domain. SGX will not be supported for
> PV domain, as ENCLS (which SGX driver in guest essentially runs) must run
> in ring 0, while PV kernel runs in ring 3. Theoretically we can support SGX
> in PV domain via either emulating #GP caused by ENCLS running in ring 3, or
> by PV ENCLS but it is really not necessary at this stage.
>
> SGX Launch Control is also exposed in CPU featureset for HVM domain. SGX
> Launch Control depends on SGX.
>
> Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
> Signed-off-by: Boqun Feng <boqun.feng@intel.com>
> ---
> xen/include/public/arch-x86/cpufeatureset.h | 3 ++-
> xen/tools/gen-cpuid.py | 3 +++
> 2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
> index be6da8eaf17c..1f8510eebb1d 100644
> --- a/xen/include/public/arch-x86/cpufeatureset.h
> +++ b/xen/include/public/arch-x86/cpufeatureset.h
> @@ -193,7 +193,7 @@ XEN_CPUFEATURE(XSAVES, 4*32+ 3) /*S XSAVES/XRSTORS instructions */
> /* Intel-defined CPU features, CPUID level 0x00000007:0.ebx, word 5 */
> XEN_CPUFEATURE(FSGSBASE, 5*32+ 0) /*A {RD,WR}{FS,GS}BASE instructions */
> XEN_CPUFEATURE(TSC_ADJUST, 5*32+ 1) /*S TSC_ADJUST MSR available */
> -XEN_CPUFEATURE(SGX, 5*32+ 2) /* Software Guard extensions */
> +XEN_CPUFEATURE(SGX, 5*32+ 2) /*H Intel Software Guard extensions */
> XEN_CPUFEATURE(BMI1, 5*32+ 3) /*A 1st bit manipulation extensions */
> XEN_CPUFEATURE(HLE, 5*32+ 4) /*A Hardware Lock Elision */
> XEN_CPUFEATURE(AVX2, 5*32+ 5) /*A AVX2 instructions */
> @@ -230,6 +230,7 @@ XEN_CPUFEATURE(PKU, 6*32+ 3) /*H Protection Keys for Userspace */
> XEN_CPUFEATURE(OSPKE, 6*32+ 4) /*! OS Protection Keys Enable */
> XEN_CPUFEATURE(AVX512_VPOPCNTDQ, 6*32+14) /*A POPCNT for vectors of DW/QW */
> XEN_CPUFEATURE(RDPID, 6*32+22) /*A RDPID instruction */
> +XEN_CPUFEATURE(SGX_LC, 6*32+30) /*H Intel SGX Launch Control */
>
> /* AMD-defined CPU features, CPUID level 0x80000007.edx, word 7 */
> XEN_CPUFEATURE(ITSC, 7*32+ 8) /* Invariant TSC */
> diff --git a/xen/tools/gen-cpuid.py b/xen/tools/gen-cpuid.py
> index 9ec4486f2b4b..4fef21203086 100755
> --- a/xen/tools/gen-cpuid.py
> +++ b/xen/tools/gen-cpuid.py
> @@ -256,6 +256,9 @@ def crunch_numbers(state):
> AVX512F: [AVX512DQ, AVX512IFMA, AVX512PF, AVX512ER, AVX512CD,
> AVX512BW, AVX512VL, AVX512VBMI, AVX512_4VNNIW,
> AVX512_4FMAPS, AVX512_VPOPCNTDQ],
> +
> + # SGX Launch Control depends on SGX
> + SGX: [SGX_LC],
> }
>
> deep_features = tuple(sorted(deps.keys()))
>
--
Julien Grall
* Re: [PATCH v2 01/17] xen: x86: expose SGX to HVM domain in CPU featureset
2017-12-04 11:13 ` Julien Grall
@ 2017-12-04 13:10 ` Boqun Feng
2017-12-04 14:13 ` Jan Beulich
0 siblings, 1 reply; 23+ messages in thread
From: Boqun Feng @ 2017-12-04 13:10 UTC (permalink / raw)
To: Julien Grall
Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu, Jan Beulich,
George Dunlap, Andrew Cooper, Ian Jackson,
Marek Marczykowski-Górecki, xen-devel, kai.huang,
Julien Grall, Jun Nakajima, David Scott
On Mon, Dec 04, 2017 at 11:13:45AM +0000, Julien Grall wrote:
> Hello,
>
Hi Julien,
> I am not sure to understand why I am being CCed. But it looks like you CC
> everyone on each patch... Please CC only relevant person on each patch.
>
Apologies... I thought the whole patchset would provide more context for
the reviewers. Will drop you from irrelevant patches in the next version. And
I guess it's OK for me to drop you from replies on irrelevant patches of
this version too?
Regards,
Boqun
> Cheers,
>
> On 04/12/17 00:15, Boqun Feng wrote:
> > From: Kai Huang <kai.huang@linux.intel.com>
> >
> > Expose SGX in CPU featureset for HVM domain. SGX will not be supported for
> > PV domain, as ENCLS (which SGX driver in guest essentially runs) must run
> > in ring 0, while PV kernel runs in ring 3. Theoretically we can support SGX
> > in PV domain via either emulating #GP caused by ENCLS running in ring 3, or
> > by PV ENCLS but it is really not necessary at this stage.
> >
> > SGX Launch Control is also exposed in CPU featureset for HVM domain. SGX
> > Launch Control depends on SGX.
> >
> > Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
> > Signed-off-by: Boqun Feng <boqun.feng@intel.com>
> > ---
> > xen/include/public/arch-x86/cpufeatureset.h | 3 ++-
> > xen/tools/gen-cpuid.py | 3 +++
> > 2 files changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
> > index be6da8eaf17c..1f8510eebb1d 100644
> > --- a/xen/include/public/arch-x86/cpufeatureset.h
> > +++ b/xen/include/public/arch-x86/cpufeatureset.h
> > @@ -193,7 +193,7 @@ XEN_CPUFEATURE(XSAVES, 4*32+ 3) /*S XSAVES/XRSTORS instructions */
> > /* Intel-defined CPU features, CPUID level 0x00000007:0.ebx, word 5 */
> > XEN_CPUFEATURE(FSGSBASE, 5*32+ 0) /*A {RD,WR}{FS,GS}BASE instructions */
> > XEN_CPUFEATURE(TSC_ADJUST, 5*32+ 1) /*S TSC_ADJUST MSR available */
> > -XEN_CPUFEATURE(SGX, 5*32+ 2) /* Software Guard extensions */
> > +XEN_CPUFEATURE(SGX, 5*32+ 2) /*H Intel Software Guard extensions */
> > XEN_CPUFEATURE(BMI1, 5*32+ 3) /*A 1st bit manipulation extensions */
> > XEN_CPUFEATURE(HLE, 5*32+ 4) /*A Hardware Lock Elision */
> > XEN_CPUFEATURE(AVX2, 5*32+ 5) /*A AVX2 instructions */
> > @@ -230,6 +230,7 @@ XEN_CPUFEATURE(PKU, 6*32+ 3) /*H Protection Keys for Userspace */
> > XEN_CPUFEATURE(OSPKE, 6*32+ 4) /*! OS Protection Keys Enable */
> > XEN_CPUFEATURE(AVX512_VPOPCNTDQ, 6*32+14) /*A POPCNT for vectors of DW/QW */
> > XEN_CPUFEATURE(RDPID, 6*32+22) /*A RDPID instruction */
> > +XEN_CPUFEATURE(SGX_LC, 6*32+30) /*H Intel SGX Launch Control */
> > /* AMD-defined CPU features, CPUID level 0x80000007.edx, word 7 */
> > XEN_CPUFEATURE(ITSC, 7*32+ 8) /* Invariant TSC */
> > diff --git a/xen/tools/gen-cpuid.py b/xen/tools/gen-cpuid.py
> > index 9ec4486f2b4b..4fef21203086 100755
> > --- a/xen/tools/gen-cpuid.py
> > +++ b/xen/tools/gen-cpuid.py
> > @@ -256,6 +256,9 @@ def crunch_numbers(state):
> > AVX512F: [AVX512DQ, AVX512IFMA, AVX512PF, AVX512ER, AVX512CD,
> > AVX512BW, AVX512VL, AVX512VBMI, AVX512_4VNNIW,
> > AVX512_4FMAPS, AVX512_VPOPCNTDQ],
> > +
> > + # SGX Launch Control depends on SGX
> > + SGX: [SGX_LC],
> > }
> > deep_features = tuple(sorted(deps.keys()))
> >
>
> --
> Julien Grall
* Re: [PATCH v2 01/17] xen: x86: expose SGX to HVM domain in CPU featureset
2017-12-04 13:10 ` Boqun Feng
@ 2017-12-04 14:13 ` Jan Beulich
2017-12-05 0:22 ` Boqun Feng
0 siblings, 1 reply; 23+ messages in thread
From: Jan Beulich @ 2017-12-04 14:13 UTC (permalink / raw)
To: Boqun Feng
Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
George Dunlap, Andrew Cooper, Julien Grall, Ian Jackson,
Marek Marczykowski-Górecki, xen-devel, kai.huang,
Julien Grall, Jun Nakajima, David Scott
>>> On 04.12.17 at 14:10, <boqun.feng@intel.com> wrote:
> On Mon, Dec 04, 2017 at 11:13:45AM +0000, Julien Grall wrote:
>> I am not sure to understand why I am being CCed. But it looks like you CC
>> everyone on each patch... Please CC only relevant person on each patch.
>>
>
> Apologies... I thought the whole patchset would provide more context for
> the reviewers. Will drop you from irrelevant patches in the next version. And
> I guess it's OK for me to drop you from replies on irrelevant patches of
> this version too?
You shouldn't do this for just Julien - Cc lists of patches should
generally be composed per patch. Most people are subscribed to
the list anyway, and hence receive a copy of the other patches.
In the worst case people can either tell you to always be Cc-ed
on an entire patch set, or go to the list archives. Yet when you
Cc everyone on everything, it is quite difficult for an individual to
tell which parts to actually pay special attention to.
Jan
* Re: [PATCH v2 01/17] xen: x86: expose SGX to HVM domain in CPU featureset
2017-12-04 14:13 ` Jan Beulich
@ 2017-12-05 0:22 ` Boqun Feng
0 siblings, 0 replies; 23+ messages in thread
From: Boqun Feng @ 2017-12-05 0:22 UTC (permalink / raw)
To: Jan Beulich
Cc: Tim Deegan, Kevin Tian, Stefano Stabellini, Wei Liu,
George Dunlap, Andrew Cooper, Julien Grall, Ian Jackson,
Marek Marczykowski-Górecki, xen-devel, kai.huang,
Julien Grall, Jun Nakajima, David Scott
On Mon, Dec 04, 2017 at 07:13:52AM -0700, Jan Beulich wrote:
> >>> On 04.12.17 at 14:10, <boqun.feng@intel.com> wrote:
> > On Mon, Dec 04, 2017 at 11:13:45AM +0000, Julien Grall wrote:
> >> I am not sure to understand why I am being CCed. But it looks like you CC
> >> everyone on each patch... Please CC only relevant person on each patch.
> >>
> >
> > Apologies... I thought the whole patch set would provide more context for
> > the reviewers. Will drop you from irrelevant patches in the next version. And
> > I guess it's OK for me to drop you from replies on irrelevant patches of
> > this version too?
>
> You shouldn't do this for just Julien - Cc lists of patches should
> generally be composed per patch. Most people are subscribed to
> the list anyway, and hence receive a copy of the other patches.
> In the worst case people can either tell you to always be Cc-ed
> on an entire patch set, or go to the list archives. Yet when you
> Cc everyone on everything, it is quite difficult for an individual to
> tell which parts to actually pay special attention to.
>
Good point ;-) I will compose the Cc lists per patch in next version.
Regards,
Boqun
> Jan
>
* Re: [RFC PATCH v2 00/17] RFC: SGX Virtualization design and draft patches
2017-12-04 0:15 [RFC PATCH v2 00/17] RFC: SGX Virtualization design and draft patches Boqun Feng
` (16 preceding siblings ...)
2017-12-04 0:15 ` [PATCH v2 17/17] xen: tools: add SGX to applying MSR policy Boqun Feng
@ 2017-12-25 5:01 ` Boqun Feng
17 siblings, 0 replies; 23+ messages in thread
From: Boqun Feng @ 2017-12-25 5:01 UTC (permalink / raw)
To: xen-devel
Cc: Kevin Tian, Stefano Stabellini, Wei Liu, George Dunlap,
Andrew Cooper, Ian Jackson, Tim Deegan, kai.huang, Jan Beulich
On Mon, Dec 04, 2017 at 08:15:11AM +0800, Boqun Feng wrote:
> Hi all,
>
> This is the v2 of RFC SGX Virtualization design and draft patches, you
Ping ;-)
Any comments?
Regards,
Boqun
> can find v1 at:
>
> https://lists.gt.net/xen/devel/483404
>
> In the new version, I fix a few things according to the feedbacks for
> previous version(mostly are cleanups and code movement).
>
> Besides, Kai and I redesign the SGX MSRs setting up part and introduce
> new XL parameter 'lehash' and 'lewr'.
>
> Another big change is that I modify the EPC management to fit EPC pages
> in 'struct page_info', and in patch #6 and #7, unscrubbable pages,
> 'PGC_epc', 'MEMF_epc' and 'XENZONE_EPC' are introduced, so that EPC
> management is fully integrated into existing memory management of xen.
> This might be the controversial bit, so patch 6~8 are simply to show the
> idea and drive deep discussion.
>
> Detailed changes since v1: (modifications with tag "[New]" is totally
> new in this series, reviews and comments are highly welcome for those
> parts)
>
> * Make SGX related mostly common for x86 by: 1) moving sgx.[ch] to
> arch/x86/ and include/asm-x86/ and 2) renaming EPC related functions
> with domain_* prefix.
>
> * Rename ioremap_cache() with ioremap_wb() and make it x86-specific as
> suggested by Jan Beulich.
>
> * Remove percpu sgx_cpudata, during bootup secondary CPUs now check
> whether they read different value than boot CPU, if so SGX is
> disabled.
>
> * Remove domain_has_sgx_{,launch_control}, and make sure we can
> rely on domain's arch.cpuid->feat.sgx{_lc} for setting checks.
>
> * Cleanup the code for CPUID handling as suggested by Andrew Cooper.
>
> * Adjust to msr_policy framework for SGX MSRs handling, and remove
> unnecessary fields like 'readable' and 'writable'
>
> * Use 'page_info' to maintain EPC pages, and [NEW] add an draft
> implementation for employing xenheap for EPC page management. Please
> see patch 6~8
>
> * [New] Modify the XL parameter for SGX, please see section 2.1.1 in
> the updated design doc.
>
> * [New] Use _set_vcpu_msrs hypercall in the toolstack to set the SGX
> related. Please see patch #17.
>
> * ACPI related tool changes are temporarily dropped in this patchset,
> as I need more time to resolve the comments and do related tests.
>
> And the update design doc is as follow, as the previous version in the
> design there are some particualr points that we don't know which
> implementation is better. For those a question mark (?) is added at the
> right of the menu. And for SGX live migration, thanks to Wei Liu for
> providing comments that it's nice to support if we can in previous
> version review, but we'd like hear more from you guys so we still put a
> question mark fot this item. Your comments on those "question mark (?)"
> parts (and other comments as well, of course) are highly appreciated.
>
> ===================================================================
> 1. SGX Introduction
> 1.1 Overview
> 1.1.1 Enclave
> 1.1.2 EPC (Enclave Page Cache)
> 1.1.3 ENCLS and ENCLU
> 1.2 Discovering SGX Capability
> 1.2.1 Enumerate SGX via CPUID
> 1.2.2 Intel SGX Opt-in Configuration
> 1.3 Enclave Life Cycle
> 1.3.1 Constructing & Destroying Enclave
> 1.3.2 Enclave Entry and Exit
> 1.3.2.1 Synchronous Entry and Exit
> 1.3.2.2 Asynchronous Enclave Exit
> 1.3.3 EPC Eviction and Reload
> 1.4 SGX Launch Control
> 1.5 SGX Interaction with IA32 and IA64 Architecture
> 2. SGX Virtualization Design
> 2.1 High Level Toolstack Changes
> 2.1.1 New 'sgx' XL configure file parameter
> 2.1.2 New XL commands (?)
> 2.1.3 Notify domain's virtual EPC base and size to Xen
> 2.2 High Level Hypervisor Changes
> 2.2.1 EPC Management
> 2.2.2 EPC Virtualization
> 2.2.3 Populate EPC for Guest
> 2.2.4 Launch Control Support
> 2.2.5 CPUID Emulation
> 2.2.6 EPT Violation & ENCLS Trapping Handling
> 2.2.7 Guest Suspend & Resume
> 2.2.8 Destroying Domain
> 2.3 Additional Point: Live Migration, Snapshot Support (?)
> 3. Reference
>
> 1. SGX Introduction
>
> 1.1 Overview
>
> 1.1.1 Enclave
>
> Intel Software Guard Extensions (SGX) is a set of instructions and memory
> access mechanisms that provide secure access for sensitive applications and
> data. SGX allows an application to set aside a particular part of its address
> space as an *enclave*, a protected area that provides confidentiality and
> integrity even in the presence of privileged malware. Accesses to the enclave
> memory area from any software not resident in the enclave are prevented,
> including those from privileged software. The diagram below illustrates the
> presence of an enclave in an application.
>
> |-----------------------|
> | |
> | |---------------| |
> | | OS kernel | | |-----------------------|
> | |---------------| | | |
> | | | | | |---------------| |
> | |---------------| | | | Entry table | |
> | | Enclave |---|-----> | |---------------| |
> | |---------------| | | | Enclave stack | |
> | | App code | | | |---------------| |
> | |---------------| | | | Enclave heap | |
> | | Enclave | | | |---------------| |
> | |---------------| | | | Enclave code | |
> | | App code | | | |---------------| |
> | |---------------| | | |
> | | | |-----------------------|
> |-----------------------|
>
> SGX supports SGX1 and SGX2 extensions. SGX1 provides basic enclave support,
> and SGX2 allows additional flexibility in runtime management of enclave
> resources and thread execution within an enclave.
>
> 1.1.2 EPC (Enclave Page Cache)
>
> Just like normal application memory management, enclave memory management can be
> divided into two parts: address space allocation and memory commitment. Address
> space allocation means allocating a particular range of linear address space for
> the enclave. Memory commitment means assigning actual resources to the enclave.
>
> Enclave Page Cache (EPC) is the physical resource used to commit to enclaves.
> EPC is divided into 4K pages. An EPC page is 4K in size and always aligned to a
> 4K boundary. Hardware performs additional access control checks to restrict
> access to EPC pages. The Enclave Page Cache Map (EPCM) is a secure structure
> which holds one entry for each EPC page, and is used by hardware to track the
> status of each EPC page (invisible to software). Typically EPC and EPCM are
> reserved by BIOS as Processor Reserved Memory, but the actual amount, size, and
> layout of EPC are model-specific and dependent on BIOS settings. EPC is
> enumerated via the new SGX CPUID leaf, and is reported as reserved memory.
>
> EPC pages can either be invalid or valid. There are 4 valid EPC types in SGX1:
> regular EPC page, SGX Enclave Control Structure (SECS) page, Thread Control
> Structure (TCS) page, and Version Array (VA) page. SGX2 adds Trimmed EPC page.
> Each enclave is associated with one SECS page. Each thread in an enclave is
> associated with one TCS page. VA pages are used in EPC page eviction and reload.
> The Trimmed EPC page type is introduced in SGX2 for when a particular 4K page in
> an enclave is going to be freed (trimmed) at runtime after the enclave is
> initialized.
>
> 1.1.3 ENCLS and ENCLU
>
> Two new instructions ENCLS and ENCLU are introduced to manage enclave and EPC.
> ENCLS can only run in ring 0, while ENCLU can only run in ring 3. Both ENCLS and
> ENCLU have multiple leaf functions, with EAX indicating the specific leaf
> function.
>
> SGX1 supports below ENCLS and ENCLU leaves:
>
> ENCLS:
> - ECREATE, EADD, EEXTEND, EINIT, EREMOVE (Enclave build and destroy)
> - EPA, EBLOCK, ETRACK, EWB, ELDU/ELDB (EPC eviction & reload)
>
> ENCLU:
> - EENTER, EEXIT, ERESUME (Enclave entry, exit, re-enter)
> - EGETKEY, EREPORT (SGX key derivation, attestation)
>
> Additionally, SGX2 supports the below ENCLS and ENCLU leaves for adding/removing
> EPC pages to/from an enclave at runtime after the enclave is initialized, along
> with permission changes.
>
> ENCLS:
> - EAUG, EMODT, EMODPR
>
> ENCLU:
> - EACCEPT, EACCEPTCOPY, EMODPE
>
> The VMM is able to interfere with ENCLS running in the guest (see 1.5.1 VMX
> Changes for Supporting SGX Virtualization) but is unable to interfere with ENCLU.
>
> 1.2 Discovering SGX Capability
>
> 1.2.1 Enumerate SGX via CPUID
>
> If CPUID.0x7.0:EBX.SGX (bit 2) is 1, then the processor supports SGX, and SGX
> capabilities and resources can be enumerated via the new SGX CPUID leaf (0x12).
> CPUID.0x12.0x0 reports SGX capability, such as the presence of SGX1, SGX2, and
> the enclave's maximum size for both 32-bit and 64-bit applications.
> CPUID.0x12.0x1 reports the availability of bits that can be set in
> SECS.ATTRIBUTES. CPUID.0x12.0x2 reports the EPC resource's base and size. The
> platform may support multiple EPC sections, and CPUID.0x12.0x3 and further
> sub-leaves can be used to detect the existence of multiple EPC sections (until
> CPUID reports an invalid EPC section).
>
> Refer to 37.7.2 Intel SGX Resource Enumeration Leaves for full description of
> SGX CPUID 0x12.
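>
> As a purely illustrative sketch (not code from this series), the enumeration
> described above could look like the following; it assumes GCC's <cpuid.h>
> helper and the sub-leaf field layout of leaf 0x12 as documented in the SDM:
>
>     #include <stdint.h>
>     #include <stdio.h>
>     #include <cpuid.h>          /* __get_cpuid_count() */
>
>     static void enumerate_sgx_epc(void)
>     {
>         uint32_t eax, ebx, ecx, edx;
>         unsigned int subleaf;
>
>         /* CPUID.0x7.0x0:EBX bit 2 is the SGX feature bit. */
>         __get_cpuid_count(0x7, 0, &eax, &ebx, &ecx, &edx);
>         if ( !(ebx & (1u << 2)) )
>             return;
>
>         /* Sub-leaves 0x2 and onwards of leaf 0x12 describe EPC sections. */
>         for ( subleaf = 2; ; subleaf++ )
>         {
>             uint64_t base, size;
>
>             __get_cpuid_count(0x12, subleaf, &eax, &ebx, &ecx, &edx);
>             if ( (eax & 0xf) == 0 )   /* type 0: invalid, no more sections */
>                 break;
>
>             base = ((uint64_t)(ebx & 0xfffff) << 32) | (eax & 0xfffff000);
>             size = ((uint64_t)(edx & 0xfffff) << 32) | (ecx & 0xfffff000);
>             printf("EPC section %u: base %#llx size %#llx\n", subleaf - 2,
>                    (unsigned long long)base, (unsigned long long)size);
>         }
>     }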
>
> 1.2.2 Intel SGX Opt-in Configuration
>
> On processors that support Intel SGX, IA32_FEATURE_CONTROL also provides the
> SGX_ENABLE bit (bit 18) to turn on/off SGX. Before system software can enable
> and use SGX, BIOS is required to set IA32_FEATURE_CONTROL.SGX_ENABLE = 1 to
> opt in to SGX.
>
> Setting SGX_ENABLE follows the rules of IA32_FEATURE_CONTROL.LOCK (bit 0).
> Software is considered to have opted into Intel SGX if and only if
> IA32_FEATURE_CONTROL.SGX_ENABLE and IA32_FEATURE_CONTROL.LOCK are set to 1.
>
> The setting of IA32_FEATURE_CONTROL.SGX_ENABLE (bit 18) is not reflected by
> SGX CPUID. Enclave instructions will behave differently according to the value
> of CPUID.0x7.0x0:EBX.SGX and whether BIOS has opted in to SGX.
>
> Refer to 37.7.1 Intel SGX Opt-in Configuration for more information.
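>
> For illustration only, a boot-time check of the opt-in state might look like
> the sketch below; the MSR index and bit positions are taken from the SDM, and
> rdmsrl() is assumed to behave like Xen's helper of the same name:
>
>     #define MSR_IA32_FEATURE_CONTROL    0x0000003a
>     #define FEATURE_CONTROL_LOCK        (1ULL << 0)
>     #define FEATURE_CONTROL_SGX_ENABLE  (1ULL << 18)
>
>     static bool sgx_bios_opted_in(void)
>     {
>         uint64_t fc;
>
>         rdmsrl(MSR_IA32_FEATURE_CONTROL, fc);
>
>         /* SGX is usable only if BIOS set both SGX_ENABLE and LOCK. */
>         return (fc & (FEATURE_CONTROL_LOCK | FEATURE_CONTROL_SGX_ENABLE)) ==
>                (FEATURE_CONTROL_LOCK | FEATURE_CONTROL_SGX_ENABLE);
>     }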
>
> 1.3 Enclave Life Cycle
>
> 1.3.1 Constructing & Destroying Enclave
>
> An enclave is created via the ENCLS[ECREATE] leaf by privileged software.
> Basically ECREATE converts an invalid EPC page into a SECS page, according to a
> source SECS structure that resides in normal memory. The source SECS contains
> the enclave's info such as base (linear) address, size, enclave attributes,
> enclave measurement, etc.
>
> After ECREATE, for each 4K linear address space page, privileged software uses
> EADD and EEXTEND to add one EPC page to it. Enclave code/data (residing in
> normal memory) is loaded into the enclave during EADD for each of the enclave's
> 4K pages. After all EPC pages are added to the enclave, privileged software
> calls EINIT to initialize the enclave, and then the enclave is ready to run.
>
> While the enclave is being constructed, the enclave measurement, which is a
> SHA256 hash value, is also built according to the enclave's size, the code/data
> itself and its location in the enclave, etc. The measurement can be used to
> uniquely identify the enclave. The SIGSTRUCT passed to the EINIT leaf also
> contains the measurement specified by untrusted software, via MRENCLAVE. EINIT
> will check the two measurements and will only succeed when the two match.
>
> An enclave is destroyed by running EREMOVE for all of the enclave's EPC pages,
> and then for the enclave's SECS. EREMOVE will report an SGX_CHILD_PRESENT error
> if it is called for the SECS while there are still regular EPC pages that
> haven't been removed from the enclave.
>
> Please refer to SDM chapter 39.1 Constructing an Enclave for more information.
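>
> To make the ENCLS flow above more concrete, here is a rough sketch of how
> ring-0 software drives enclave construction. It is not code from this series:
> encls() is a hypothetical wrapper (ENCLS selects its leaf via EAX, with
> leaf-specific operands in RBX/RCX/RDX and an error code returned in EAX), and
> the PAGEINFO/SIGSTRUCT/EINITTOKEN setup defined by the SDM is omitted:
>
>     enum encls_leaf { ECREATE = 0x0, EADD = 0x1, EINIT = 0x2, EREMOVE = 0x3,
>                       EEXTEND = 0x6 };
>
>     static inline int encls(enum encls_leaf leaf, unsigned long rbx,
>                             unsigned long rcx, unsigned long rdx)
>     {
>         int ret;
>
>         /* ENCLS is encoded as 0F 01 CF. */
>         asm volatile ( ".byte 0x0f, 0x01, 0xcf"
>                        : "=a" (ret)
>                        : "a" (leaf), "b" (rbx), "c" (rcx), "d" (rdx)
>                        : "memory", "cc" );
>         return ret;
>     }
>
>     /* secs_epc: free EPC page for the SECS; epc[]: free EPC pages for the
>      * enclave contents; pginfo is re-pointed at each source page by the
>      * caller's setup code (omitted here). */
>     static int build_enclave_sketch(void *secs_epc, void *pginfo,
>                                     void **epc, unsigned int npages,
>                                     void *sigstruct, void *token)
>     {
>         unsigned int i, off;
>
>         /* 1) ECREATE: turn a free EPC page into the enclave's SECS. */
>         encls(ECREATE, (unsigned long)pginfo, (unsigned long)secs_epc, 0);
>
>         /* 2) EADD each 4K page, then EEXTEND measures it in 256-byte chunks. */
>         for ( i = 0; i < npages; i++ )
>         {
>             encls(EADD, (unsigned long)pginfo, (unsigned long)epc[i], 0);
>             for ( off = 0; off < 4096; off += 256 )
>                 encls(EEXTEND, (unsigned long)secs_epc,
>                       (unsigned long)epc[i] + off, 0);
>         }
>
>         /* 3) EINIT: check SIGSTRUCT/token against the measurement built above. */
>         return encls(EINIT, (unsigned long)sigstruct, (unsigned long)secs_epc,
>                      (unsigned long)token);
>     }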
>
> 1.3.2 Enclave Entry and Exit
>
> 1.3.2.1 Synchronous Entry and Exit
>
> After the enclave is constructed, non-privileged software uses ENCLU[EENTER] to
> enter the enclave and run it. While the process runs in the enclave,
> non-privileged software can use ENCLU[EEXIT] to exit from the enclave and
> return to normal mode.
>
> 1.3.2.2 Asynchronous Enclave Exit
>
> Asynchronous and synchronous events, such as exceptions, interrupts, traps,
> SMIs, and VM exits may occur while executing inside an enclave. These events
> are referred to as Enclave Exiting Events (EEE). Upon an EEE, the processor
> state is securely saved inside the enclave and then replaced by a synthetic
> state to prevent leakage of secrets. The process of securely saving state and
> establishing the synthetic state is called an Asynchronous Enclave Exit (AEX).
>
> After AEX, non-privileged software uses ENCLU[ERESUME] to re-enter the enclave.
> The SGX userspace software maintains a small piece of code (residing in normal
> memory) which basically calls ERESUME to re-enter the enclave. The address of
> this piece of code is called the Asynchronous Exit Pointer (AEP). The AEP is
> specified as a parameter to EENTER and is kept internally in the enclave. Upon
> AEX, the AEP is pushed to the stack, and upon returning from EEE handling, such
> as via IRET, the AEP is loaded into RIP and ERESUME is subsequently called to
> re-enter the enclave.
>
> During AEX the processor does context saving and restoring automatically,
> therefore no change to the interrupt handling of the OS kernel or VMM is
> required. It is the SGX userspace software's responsibility to set up the AEP
> correctly.
>
> Please refer to SDM chapter 39.2 Enclave Entry and Exit for more information.
>
> 1.3.3 EPC Eviction and Reload
>
> SGX also allows privileged software to evict any EPC pages that are used by an
> enclave. The idea is the same as normal memory swapping. Below are the details
> of how to evict EPC pages.
>
> Below is the sequence to evict regular EPC page:
>
> 1) Select one or multiple regular EPC pages from one enclave
> 2) Remove EPT/PT mapping for selected EPC pages
> 3) Send IPIs to remote CPUs to flush TLB of selected EPC pages
> 4) EBLOCK on selected EPC pages
> 5) ETRACK on enclave's SECS page
> 6) allocate one available slot (8-byte) in VA page
> 7) EWB on selected EPC pages
>
> With EWB taking:
>
> - VA slot, to restore eviction version info.
> - one normal 4K page in memory, to store encrypted content of EPC page.
> - one struct PCMD in memory, to store meta data.
>
> (A VA slot is an 8-byte slot in a VA page, which is a particular type of EPC page.)
>
> And below is the sequence to evict a SECS page or VA page:
>
> 1) locate the SECS (or VA) page
> 2) remove EPT/PT mapping for the SECS (or VA) page
> 3) Send IPIs to remote CPUs
> 4) allocate one available slot (8-byte) in a VA page
> 5) EWB on the SECS (or VA) page
>
> And for evicting a SECS page, all regular EPC pages that belong to that SECS
> must be evicted first, otherwise EWB returns an SGX_CHILD_PRESENT error.
>
> And to reload an EPC page:
>
> 1) ELDU/ELDB on EPC page
> 2) setup EPT/PT mapping
>
> With ELDU/ELDB taking:
>
> - location of SECS page
> - linear address of enclave's 4K page (that we are going to reload to)
> - VA slot (used in EWB)
> - 4K page in memory (used in EWB)
> - struct PCMD in memory (used in EWB)
>
> Please refer to SDM chapter 39.5 EPC and Management of EPC pages for more
> information.
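>
> Purely as a structural sketch of the regular-page sequence above (every helper
> below is a placeholder, not an API from this series), the ordering constraints
> could be expressed as:
>
>     /* Caller has already picked the page (step 1) and allocated 'va_slot'
>      * (step 6); 'backing_page' and 'pcmd' live in normal memory. */
>     static int evict_epc_page_sketch(void *secs_page, void *epc_page,
>                                      uint64_t *va_slot, void *backing_page,
>                                      void *pcmd)
>     {
>         unmap_epc_page(epc_page);   /* 2) remove the EPT/PT mapping           */
>         flush_epc_tlbs();           /* 3) IPI remote CPUs to flush stale TLBs */
>         encls_eblock(epc_page);     /* 4) mark the page BLOCKED               */
>         encls_etrack(secs_page);    /* 5) start the tracking cycle on the SECS */
>
>         /* 7) EWB: encrypted contents go to 'backing_page', metadata to
>          * 'pcmd', and the version counter to the VA slot from step 6. */
>         return encls_ewb(epc_page, va_slot, backing_page, pcmd);
>     }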
>
> 1.4 SGX Launch Control
>
> SGX requires running "Launch Enclave" (LE) before running any other enclaves.
> This is because the LE is the only enclave that does not require an EINITTOKEN
> in EINIT. Running any other enclave requires a valid EINITTOKEN, which contains
> a MAC of (the first 192 bytes of) the EINITTOKEN calculated with the EINITTOKEN
> key. EINIT will verify the MAC by internally deriving the EINITTOKEN key, and
> only an EINITTOKEN with a matching MAC will be accepted by EINIT. The EINITTOKEN
> key derivation depends on some info from the LE. The typical process is that the
> LE generates an EINITTOKEN for another enclave according to the LE itself and
> the target enclave, and calculates the MAC by using ENCLU[EGETKEY] to get the
> EINITTOKEN key. Only the LE is able to get the EINITTOKEN key.
>
> Running the LE requires the SHA256 hash of the LE signer's RSA public key
> (SHA256 of sigstruct->modulus) to equal the value of the
> IA32_SGXLEPUBKEYHASH[0-3] MSRs (the 4 MSRs together make up the 256-bit SHA256
> hash value).
>
> If both CPUID.0x7.0x0:EBX.SGX and CPUID.0x7.0x0:ECX.SGX_LAUNCH_CONTROL[bit 30]
> are set, then the IA32_SGXLEPUBKEYHASHn MSRs are available, and the
> IA32_FEATURE_CONTROL MSR has the SGX_LAUNCH_CONTROL_ENABLE bit (bit 17)
> available. Setting the SGX_LAUNCH_CONTROL_ENABLE bit to 1 enables runtime
> changes of IA32_SGXLEPUBKEYHASHn after IA32_FEATURE_CONTROL is locked.
> Otherwise, IA32_SGXLEPUBKEYHASHn are read-only after IA32_FEATURE_CONTROL is
> locked. After reset, IA32_SGXLEPUBKEYHASHn will be set to the hash of Intel's
> default key. On systems that have only CPUID.0x7.0x0:EBX.SGX set,
> IA32_SGXLEPUBKEYHASHn are not available. On such systems EINIT will always
> treat IA32_SGXLEPUBKEYHASHn as Intel's default value, thus only Intel's LE is
> able to run.
>
> On systems with IA32_SGXLEPUBKEYHASHn available, it is up to the BIOS
> implementation to decide whether to let the user choose between leaving
> IA32_SGXLEPUBKEYHASHn in *locked* mode (IA32_SGXLEPUBKEYHASHn are read-only
> after IA32_FEATURE_CONTROL is locked) or *unlocked* mode (IA32_SGXLEPUBKEYHASHn
> are writable by the kernel at runtime). The BIOS may or may not also provide
> options to allow the user to set a custom value of IA32_SGXLEPUBKEYHASHn.
>
> 1.5 SGX Interaction with IA32 and IA64 Architecture
>
> SDM Chapter 42 describes SGX interaction with various features in IA32 and IA64
> architecture. Below outlines the major ones. Refer to Chapter 42 for full
> description of SGX interaction with various IA32 and IA64 features.
>
> 1.5.1 VMX Changes for Supporting SGX Virtualization
>
> A new 64-bit ENCLS-exiting bitmap control field is added to the VMCS (encoding
> 0202EH) to control VMEXIT on ENCLS leaf functions. A new "Enable ENCLS
> exiting" control bit (bit 15) is also defined in the secondary processor-based
> VM-execution controls. Setting "Enable ENCLS exiting" to 1 enables the
> ENCLS-exiting bitmap control. The ENCLS-exiting bitmap controls which ENCLS
> leaves will trigger VMEXIT.
>
> Additionally, two new bits are added to indicate whether a VMEXIT (of any kind)
> came from an enclave. The below two bits will be set if the VMEXIT is from an
> enclave:
> - Bit 27 in the Exit Reason field of the Basic VM-exit information.
> - Bit 4 in the Interruptibility State of the Guest Non-Register State of the VMCS.
>
> Refer to 42.5 Interactions with VMX, 27.2.1 Basic VM-Exit Information, and
> 27.3.4 Saving Non-Register State.
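>
> A minimal sketch of wiring these controls up, in the style of Xen's VMX code
> (the two #defines below are local to the sketch and use the encodings from the
> SDM; the names eventually used by the series may differ):
>
>     #define SECONDARY_EXEC_ENABLE_ENCLS_EXITING  (1u << 15)
>     #define ENCLS_EXITING_BITMAP                 0x0000202e
>
>     static void vmx_enable_encls_trapping(uint64_t encls_bitmap)
>     {
>         unsigned long secondary;
>
>         /* Turn on "Enable ENCLS exiting" in the secondary exec controls. */
>         __vmread(SECONDARY_VM_EXEC_CONTROL, &secondary);
>         __vmwrite(SECONDARY_VM_EXEC_CONTROL,
>                   secondary | SECONDARY_EXEC_ENABLE_ENCLS_EXITING);
>
>         /* Each bit set in the bitmap makes the corresponding ENCLS leaf
>          * (indexed by EAX) cause a VMEXIT. */
>         __vmwrite(ENCLS_EXITING_BITMAP, encls_bitmap);
>     }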
>
> 1.5.2 Interaction with XSAVE
>
> SGX defines a sub-field called X-Feature Request Mask (XFRM) in the attributes
> field of the SECS. On enclave entry, SGX hardware verifies that the XFRM bits in
> SECS.ATTRIBUTES are already enabled in XCR0.
>
> Upon AEX, SGX saves the processor extended state and miscellaneous state to the
> enclave's state-save area (SSA), and clears the processor extended state used by
> the enclave (to prevent leaking secrets).
>
> Refer to 42.7 Interaction with Processor Extended State and Miscellaneous State
>
> 1.5.3 Interaction with S state
>
> When the processor goes into the S3-S5 states, EPC is destroyed, and thus all
> enclaves are destroyed as well.
>
> Refer to 42.14 Interaction with S States.
>
> 2. SGX Virtualization Design
>
> 2.1 High Level Toolstack Changes:
>
> 2.1.1 New 'sgx' XL configure file parameter
>
> EPC is a limited resource. In order to use EPC efficiently among all domains,
> when creating a guest, the administrator should be able to specify the domain's
> virtual EPC size, and should also be able to query every domain's virtual EPC
> size.
>
> For SGX Launch Control virtualization, we should allow the admin to create a VM
> with the VM's virtual IA32_SGXLEPUBKEYHASHn either locked or unlocked, and we
> should also allow the admin to create a VM with a custom IA32_SGXLEPUBKEYHASHn
> value.
>
> For above purposes, below new 'sgx' XL configure file parameter is added:
>
> sgx = 'epc=<size>,lehash=<sha256-hash>,lewr=<0|1>'
>
> Here 'epc' specifies the VM's EPC size in MB; it is mandatory.
>
> When the physical machine is in *locked* mode, neither 'lehash' nor 'lewr'
> can be specified, as the physical machine is unable to change
> IA32_SGXLEPUBKEYHASHn at runtime. Adding either 'lehash' or 'lewr' will
> cause VM creation to fail in that case, and the VM's initial
> IA32_SGXLEPUBKEYHASHn value will be set to the value of the physical MSRs.
>
> When the physical machine is in *unlocked* mode, the VM's initial
> IA32_SGXLEPUBKEYHASHn value will be set to 'lehash' if specified, or to
> Intel's default value. The VM's SGX_LAUNCH_CONTROL_ENABLE bit in
> IA32_FEATURE_CONTROL will be set or cleared depending on whether 'lewr'
> is specified (or explicitly set to true or false).
>
> Please also refer to 2.2.4 Launch Control Support.
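>
> For illustration, a guest config using the proposed parameter might contain
> something like the below (the 64MB size is a made-up example value):
>
>     # 64MB of virtual EPC; guest may rewrite its virtual IA32_SGXLEPUBKEYHASHn
>     sgx = 'epc=64,lewr=1'
>
>     # or: fix the launch enclave signer's key hash at creation time
>     # sgx = 'epc=64,lehash=<sha256-of-LE-signer-public-key>'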
>
> 2.1.2 New XL commands (?)
>
> The administrator should be able to get the physical EPC size, and all domains'
> virtual EPC sizes. For this purpose, we can introduce 2 additional commands:
>
> # xl sgxinfo
>
> Which will print out physical EPC size, and other SGX info (such as SGX1, SGX2,
> etc) if necessary.
>
> # xl sgxlist <did>
>
> Which will print out particular domain's virtual EPC size, or list all virtual
> EPC sizes for all supported domains.
>
> Alternatively, we can also extend existing XL commands by adding new option
>
> # xl info -sgx
>
> Which will print out physical EPC size along with other physinfo. And
>
> # xl list <did> -sgx
>
> Which will print out domain's virtual EPC size.
>
> Comments?
>
> In this RFC the two new commands are not implemented yet.
>
> 2.1.3 Notify domain's virtual EPC base and size to Xen
>
> Xen needs to know guest's EPC base and size in order to populate EPC pages for
> it. Toolstack notifies EPC base and size to Xen via XEN_DOMCTL_set_cpuid.
>
> 2.2 High Level Xen Hypervisor Changes:
>
> 2.2.1 EPC Management
>
> The Xen hypervisor needs to detect SGX, discover EPC, and manage EPC before
> supporting SGX for guests. EPC is detected via SGX CPUID 0x12.0x2. It's possible
> that there are multiple EPC sections (enumerated via sub-leaves 0x3 and so on,
> until an invalid EPC section is reported), but this is typically the case on
> multi-socket servers, where each package has its own EPC.
>
> EPC is reported as reserved memory (so it is not reported as normal memory).
> EPC must be managed in 4K pages. CPU hardware uses the EPCM to track the status
> of each EPC page. Xen needs to manage EPC and provide functions to, e.g.,
> allocate and free EPC pages for guests.
>
> Although on typical physical machines (at least existing ones) EPC is at most
> ~100M in size, we cannot assume a bound on the EPC size; thus, in terms of EPC
> management, it's better to integrate EPC management into Xen's memory management
> framework to take advantage of Xen's existing memory management algorithms.
>
> Specifically, one 'struct page_info' will be created for each EPC page, just
> like normal memory, and a new flag will be defined to identify whether a 'struct
> page_info' is EPC or normal memory. The existing memory allocation API
> alloc_domheap_pages will be reused to allocate EPC pages, by adding a new memflag
> 'MEMF_epc' to indicate EPC allocation rather than normal memory allocation. The
> new 'MEMF_epc' can also be used for EPC ballooning (if required in the future),
> as with the new flag the existing XENMEM_increase{decrease}_reservation and
> XENMEM_populate_physmap can be reused for EPC as well.
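>
> A minimal sketch of what EPC allocation could look like once such a flag
> exists (names follow the description above; the actual code in patches 6~8
> may differ):
>
>     /* Allocate one EPC page for domain 'd' from the EPC zone. */
>     static struct page_info *domain_alloc_epc_page(struct domain *d)
>     {
>         /* Order-0 allocation; MEMF_epc steers the allocator to EPC
>          * ("XENZONE_EPC") instead of normal RAM, and the returned page is
>          * expected to carry the new PGC_epc flag. */
>         return alloc_domheap_pages(d, 0, MEMF_epc);
>     }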
>
> 2.2.2 EPC Virtualization
>
> This part is how to populate EPC for guests. We have 3 choices:
> - Static Partitioning
> - Oversubscription
> - Ballooning
>
> Static Partitioning means all EPC pages will be allocated and mapped to guest
> when it is created, and there's no runtime change of page table mappings for EPC
> pages. Oversubscription means Xen hypervisor supports EPC page swapping between
> domains, meaning Xen is able to evict EPC page from another domain and assign it
> to the domain that needs the EPC. With oversubscription, EPC can be assigned to
> domain on demand, when EPT violation happens. Ballooning is similar to memory
> ballooning. It is basically "Static Partitioning" + "Balloon driver" in guest.
>
> Static Partitioning is the easiest way in terms of implementation, and there
> will be no hypervisor overhead (except EPT overhead of course), because in
> "Static partitioning", there is no EPT violation for EPC, and Xen doesn't need
> to turn on ENCLS VMEXIT for guest as ENCLS runs perfectly in non-root mode.
>
> Ballooning is "Static Partitioning" + "Balloon driver" in the guest. Like
> "Static Partitioning", ballooning doesn't need to turn on ENCLS VMEXIT, and
> doesn't have EPT violations for EPC either. To support ballooning, we need a
> balloon driver in the guest to issue hypercalls to give up or reclaim EPC pages.
> In terms of the hypercall, we have two choices: 1) add a new hypercall for EPC
> ballooning; 2) use the existing XENMEM_{increase/decrease}_reservation with a
> new memory flag, i.e., XENMEMF_epc. I'll discuss whether to add a dedicated
> hypercall in more detail later.
>
> Oversubscription looks nice but requires a more complicated implementation.
> Firstly, as explained in 1.3.3 EPC Eviction and Reload, we need to follow
> specific steps to evict EPC pages, and in order to do that, Xen basically needs
> to trap ENCLS from guests and keep track of EPC page status and enclave info
> from all guests. This is because:
> - To evict a regular EPC page, Xen needs to know the SECS location
> - Xen needs to know the EPC page type: evicting a regular EPC page and evicting
>   a SECS or VA page follow different steps.
> - Xen needs to know the EPC page status: whether the page is blocked or not.
>
> This information can only be obtained by trapping ENCLS from the guest and
> parsing its parameters (to identify the SECS page, etc). Parsing ENCLS
> parameters means we need to know which ENCLS leaf is being trapped, and we need
> to translate the guest's virtual addresses to get physical addresses in order
> to locate EPC pages. And once ENCLS is trapped, we have to emulate ENCLS in
> Xen, which means we need to reconstruct the ENCLS parameters by remapping all
> of the guest's virtual addresses to Xen's virtual addresses
> (gva->gpa->pa->xen_va), as ENCLS always uses *effective addresses*, which are
> translated by the processor when ENCLS runs.
>
> --------------------------------------------------------------
> | ENCLS |
> --------------------------------------------------------------
> | /|\
> ENCLS VMEXIT| | VMENTRY
> | |
> \|/ |
>
> 1) parse ENCLS parameters
> 2) reconstruct(remap) guest's ENCLS parameters
> 3) run ENCLS on behalf of guest (and skip ENCLS)
> 4) on success, update EPC/enclave info, or inject error
>
> And Xen needs to maintain each EPC page's status (type, blocked or not, in an
> enclave or not, etc). Xen also needs to maintain all enclaves' info from all
> guests, in order to find the correct SECS for a regular EPC page, as well as
> the enclave's linear address.
>
> So in general, "Static Partitioning" has the simplest implementation, but is
> obviously not the best way to use EPC efficiently; "Ballooning" has all the
> pros of Static Partitioning but requires a guest balloon driver;
> "Oversubscription" is best in terms of flexibility but requires a complicated
> hypervisor implementation.
>
> We will start with "Static Partitioning". If "Ballooning" is required in the
> future, we will support it. "Oversubscription" should not be needed in the
> foreseeable future.
>
> 2.2.3 Populate EPC for Guest
>
> The toolstack notifies Xen about the domain's EPC base and size via
> XEN_DOMCTL_set_cpuid, so currently Xen populates all EPC pages for the guest in
> XEN_DOMCTL_set_cpuid, particularly when handling XEN_DOMCTL_set_cpuid for
> CPUID.0x12.0x2. Once Xen has checked that the values passed from the toolstack
> are valid, Xen will allocate all EPC pages and set up EPT mappings for the guest.
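>
> A very rough sketch of that population step, assuming static partitioning
> (the helper name and the exact use of the p2m API are illustrative; the real
> code lives in the "populate and destroy EPC" patch):
>
>     static int domain_populate_epc_sketch(struct domain *d,
>                                           unsigned long base_gfn,
>                                           unsigned long npages)
>     {
>         unsigned long i;
>         int rc;
>
>         for ( i = 0; i < npages; i++ )
>         {
>             struct page_info *pg = alloc_domheap_pages(d, 0, MEMF_epc);
>
>             if ( !pg )
>                 return -ENOMEM;
>
>             /* Map the EPC page into the guest with the new p2m_epc type. */
>             rc = guest_physmap_add_entry(d, _gfn(base_gfn + i),
>                                          _mfn(page_to_mfn(pg)), 0, p2m_epc);
>             if ( rc )
>                 return rc;
>         }
>
>         return 0;
>     }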
>
> 2.2.4 Launch Control Support
>
> To support running multiple domains, each running its own LE signed by a
> different owner, the physical machine's BIOS must leave IA32_SGXLEPUBKEYHASHn
> *unlocked* before handing over to Xen. Xen will trap the domain's writes to
> IA32_SGXLEPUBKEYHASHn and keep the values internally per vcpu, and write the
> values to the physical MSRs when the vcpu is scheduled in. This guarantees that
> when EINIT runs in the guest, the guest's virtual IA32_SGXLEPUBKEYHASHn have
> been written to the physical MSRs.
>
> The SGX_LAUNCH_CONTROL_ENABLE bit in the guest's IA32_FEATURE_CONTROL is
> controlled by the newly added 'lewr' XL parameter (see 2.1.1 New 'sgx' XL
> configure file parameter).
>
> If the physical IA32_SGXLEPUBKEYHASHn are *locked* by the machine's BIOS, then
> only MSR reads are allowed from the guest, and Xen will inject an error for the
> guest's MSR writes.
>
> In addition, if the physical IA32_SGXLEPUBKEYHASHn are *locked*, then creating
> a guest with the 'lehash' or 'lewr' parameter will fail, as in that case Xen is
> not able to propagate the guest's virtual IA32_SGXLEPUBKEYHASHn to the physical
> MSRs.
>
> If the physical IA32_SGXLEPUBKEYHASHn are not available
> (CPUID.0x7.0x0:ECX.SGX_LAUNCH_CONTROL is not present), then creating a VM with
> 'lehash' or 'lewr' will also fail. In addition, any MSR read/write of
> IA32_SGXLEPUBKEYHASHn from the guest is invalid and Xen will inject an error in
> that case.
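>
> A rough sketch of the WRMSR trap and the context-switch update described
> above; the real series routes this through the msr_policy framework, so the
> helper names and the per-vcpu field below are illustrative only:
>
>     #define MSR_IA32_SGXLEPUBKEYHASH0  0x0000008c  /* HASH1..3 are 0x8d-0x8f */
>
>     /* WRMSR interception: cache the value the guest wants per vcpu. */
>     static int vmx_write_sgxlepubkeyhash(struct vcpu *v, unsigned int msr,
>                                          uint64_t val)
>     {
>         unsigned int idx = msr - MSR_IA32_SGXLEPUBKEYHASH0;
>
>         /* Refuse if the guest has no SGX_LC, or writes are not permitted
>          * ('lewr' not set, or the physical MSRs are locked). */
>         if ( !v->domain->arch.cpuid->feat.sgx_lc ||
>              !sgx_lewr_allowed(v->domain) )
>             return X86EMUL_EXCEPTION;                /* inject #GP */
>
>         v->arch.sgx_lepubkeyhash[idx] = val;
>         return X86EMUL_OKAY;
>     }
>
>     /* On context switch in: make the physical MSRs match the vcpu's virtual
>      * values, so a subsequent EINIT in the guest sees the right hash. */
>     static void sgx_ctxt_switch_to(struct vcpu *v)
>     {
>         unsigned int i;
>
>         for ( i = 0; i < 4; i++ )
>             wrmsrl(MSR_IA32_SGXLEPUBKEYHASH0 + i, v->arch.sgx_lepubkeyhash[i]);
>     }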
>
> 2.2.5 CPUID Emulation
>
> Most of the native SGX CPUID info can be exposed to the guest, except the below
> two parts:
> - Sub-leaf 0x2 needs to report the domain's virtual EPC base and size, instead
>   of the physical EPC info.
> - Sub-leaf 0x1 needs to be consistent with the guest's XCR0. For the reason
>   behind this please refer to 1.5.2 Interaction with XSAVE.
>
> 2.2.6 EPT Violation & ENCLS Trapping Handling
>
> Only needed when Xen supports EPC Oversubscription, as explained above.
>
> 2.2.7 Guest Suspend & Resume
>
> On hardware, EPC is destroyed when power goes to S3-S5. So Xen will destroy the
> guest's EPC when the guest's power goes into S3-S5. Currently Xen is notified
> of S state changes by QEMU via HVM_PARAM_ACPI_S_STATE, where Xen will destroy
> the EPC if the S state is S3-S5.
>
> Specifically, Xen will run EREMOVE for each of the guest's EPC pages, as the
> guest may not handle EPC suspend & resume correctly, in which case the guest's
> EPC pages may physically still be valid; Xen needs to run EREMOVE to make sure
> all EPC pages become invalid. Otherwise further guest operations on EPC may
> fault, as the guest assumes all EPC pages are invalid after it is resumed.
>
> For SECS pages, EREMOVE may fail with SGX_CHILD_PRESENT, in which case Xen will
> keep the SECS page on a list, and call EREMOVE for those pages again after
> EREMOVE has been run on all other EPC pages. This time the EREMOVE on the SECS
> will succeed, as all children (regular EPC pages) have already been removed.
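>
> The "EREMOVE everything, retry SECS pages" idea above could be sketched as
> follows (encls_eremove() is a placeholder; the page-list usage mirrors Xen's
> page_list_* helpers, and SGX_CHILD_PRESENT is the SDM-defined error code):
>
>     static void domain_reset_epc_sketch(struct page_list_head *epc_list)
>     {
>         struct page_info *pg, *tmp;
>         PAGE_LIST_HEAD(secs_retry);
>
>         /* Pass 1: EREMOVE every page; defer SECS pages that still have
>          * children to a second pass. */
>         page_list_for_each_safe ( pg, tmp, epc_list )
>         {
>             if ( encls_eremove(page_to_maddr(pg)) == SGX_CHILD_PRESENT )
>             {
>                 page_list_del(pg, epc_list);
>                 page_list_add_tail(pg, &secs_retry);
>             }
>         }
>
>         /* Pass 2: all regular pages are gone now, so SECS removal succeeds. */
>         page_list_for_each ( pg, &secs_retry )
>             encls_eremove(page_to_maddr(pg));
>     }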
>
> 2.2.8 Destroying Domain
>
> Normally Xen just frees all of a domain's EPC pages when it is destroyed. But
> Xen will also do EREMOVE on all of the guest's EPC pages (as described in 2.2.7
> above) before freeing them, as the guest may shut down unexpectedly (e.g., the
> user kills the guest), in which case the guest's EPC may still be valid.
>
> 2.3 Additional Point: Live Migration, Snapshot Support (?)
>
> Actually from hardware's point of view, SGX is not migratable. There are two
> reasons:
>
> - SGX key architecture cannot be virtualized.
>
>   For example, some keys are bound to the CPU, such as the Sealing key, the
>   EREPORT key, etc. If a VM is migrated to another machine, the same enclave
>   will derive different keys. Taking the Sealing key as an example, the
>   Sealing key is typically used by an enclave (an enclave can get the sealing
>   key via EGETKEY) to *seal* its secrets to the outside (e.g., persistent
>   storage) for later use. If the Sealing key changes after VM migration, then
>   the enclave can never get the sealed secrets back using the sealing key, as
>   it has changed, and the old sealing key cannot be recovered.
>
> - There's no ENCLS leaf to evict an EPC page to normal memory while, at the
>   same time, keeping its content in EPC. Currently once an EPC page is
>   evicted, the EPC page becomes invalid. So technically, we are unable to
>   implement live migration (or checkpointing, or snapshots) for enclaves.
>
> But with some workarounds, and given some facts about existing SGX drivers,
> technically we are able to support live migration (or even checkpointing and
> snapshots). This is because:
>
> - Changing keys (which are bound to the CPU) is not a problem in reality
>
>   Take the Sealing key as an example. Losing sealed data is not a problem,
>   because the sealing key is only supposed to encrypt secrets that can be
>   provisioned again. The typical working model is: the enclave gets secrets
>   provisioned from a remote party (the service provider), and uses the sealing
>   key to store them for later use. When the enclave tries to *unseal* using
>   the sealing key, if the sealing key has changed, the enclave will find the
>   data corrupted (integrity check failure), so it will ask for the secrets to
>   be provisioned again from the remote party. Another reason is that, in a
>   data center, VMs typically share lots of data, and as the sealing key is
>   bound to the CPU, data encrypted by one enclave on one machine cannot be
>   shared by another enclave on another machine. So from an SGX app writer's
>   point of view, the developer should treat the Sealing key as a changeable
>   key, and should handle loss of sealed data anyway. The Sealing key should
>   only be used to seal secrets that can be easily provisioned again.
>
>   For other keys such as the EREPORT key and provisioning key, which are used
>   for local attestation and remote attestation, losing them is not a problem
>   either, due to the second reason below.
>
> - Sudden loss of EPC is not a problem.
>
>   On hardware, EPC will be lost if the system goes to S3-S5, or is reset, or
>   shut down, and the SGX driver needs to handle loss of EPC due to power
>   transitions. This is done by cooperation between the SGX driver and
>   userspace SGX SDK/apps. However during live migration there may be no power
>   transition in the guest, so there may be no EPC loss during live migration.
>   And technically we cannot *really* live migrate an enclave (explained
>   above), so it looks infeasible. But the fact is that both the Linux SGX
>   driver and the Windows SGX driver already support *sudden* loss of EPC (not
>   just EPC loss during power transitions), which means both drivers are able
>   to recover if EPC is lost at any point at runtime. With this, technically we
>   are able to support live migration by simply ignoring EPC. After the VM is
>   migrated, the destination VM will only suffer a *sudden* loss of EPC, which
>   both the Windows SGX driver and the Linux SGX driver are already able to
>   handle.
>
>   But we must point out that such *sudden* loss of EPC is not hardware
>   behavior, and SGX drivers for other OSes (such as FreeBSD) may not implement
>   this, so for those guests the destination VM will behave in an unexpected
>   manner. But I am not sure we need to care about other OSes.
>
> For the same reason, we are able to support checkpointing for SGX guests (only
> Linux and Windows).
>
> For snapshots, we can support snapshotting an SGX guest by either:
>
> - Suspending the guest before the snapshot (S3-S5). This works for all guests
>   but requires the user to manually suspend the guest.
> - Issuing a hypercall to destroy the guest's EPC in save_vm. This only works
>   for Linux and Windows but doesn't require user intervention.
>
> What's your comments?
>
> 3. Reference
>
> - Intel SGX Homepage
> https://software.intel.com/en-us/sgx
>
> - Linux SGX SDK
> https://01.org/intel-software-guard-extensions
>
> - Linux SGX driver for upstreaming
> https://github.com/01org/linux-sgx
>
> - Intel SGX Specification (SDM Vol 3D)
> https://software.intel.com/sites/default/files/managed/7c/f1/332831-sdm-vol-3d.pdf
>
> - Paper: Intel SGX Explained
> https://eprint.iacr.org/2016/086.pdf
>
> - ISCA 2015 tutorial slides for Intel® SGX - Intel® Software
> https://software.intel.com/sites/default/files/332680-002.pdf
>
> Boqun Feng (5):
> xen: mm: introduce non-scrubbable pages
> xen: mm: manage EPC pages in Xen heaps
> xen: x86/mm: add SGX EPC management
> xen: x86: add functions to populate and destroy EPC for domain
> xen: tools: add SGX to applying MSR policy
>
> Kai Huang (12):
> xen: x86: expose SGX to HVM domain in CPU featureset
> xen: x86: add early stage SGX feature detection
> xen: vmx: detect ENCLS VMEXIT
> xen: x86/mm: introduce ioremap_wb()
> xen: p2m: new 'p2m_epc' type for EPC mapping
> xen: x86: add SGX cpuid handling support.
> xen: vmx: handle SGX related MSRs
> xen: vmx: handle ENCLS VMEXIT
> xen: vmx: handle VMEXIT from SGX enclave
> xen: x86: reset EPC when guest got suspended.
> xen: tools: add new 'sgx' parameter support
> xen: tools: add SGX to applying CPUID policy
>
> docs/misc/xen-command-line.markdown | 8 +
> tools/libxc/Makefile | 1 +
> tools/libxc/include/xc_dom.h | 4 +
> tools/libxc/include/xenctrl.h | 16 +
> tools/libxc/xc_cpuid_x86.c | 68 ++-
> tools/libxc/xc_msr_x86.h | 10 +
> tools/libxc/xc_sgx.c | 82 +++
> tools/libxl/libxl.h | 3 +-
> tools/libxl/libxl_cpuid.c | 15 +-
> tools/libxl/libxl_create.c | 10 +
> tools/libxl/libxl_dom.c | 65 ++-
> tools/libxl/libxl_internal.h | 2 +
> tools/libxl/libxl_nocpuid.c | 4 +-
> tools/libxl/libxl_types.idl | 11 +
> tools/libxl/libxl_x86.c | 12 +
> tools/ocaml/libs/xc/xenctrl_stubs.c | 11 +-
> tools/python/xen/lowlevel/xc/xc.c | 11 +-
> tools/xl/xl_parse.c | 86 +++
> tools/xl/xl_parse.h | 1 +
> xen/arch/x86/Makefile | 1 +
> xen/arch/x86/cpu/common.c | 15 +
> xen/arch/x86/cpuid.c | 62 ++-
> xen/arch/x86/domctl.c | 87 ++-
> xen/arch/x86/hvm/hvm.c | 3 +
> xen/arch/x86/hvm/vmx/vmcs.c | 16 +-
> xen/arch/x86/hvm/vmx/vmx.c | 68 +++
> xen/arch/x86/hvm/vmx/vvmx.c | 11 +
> xen/arch/x86/mm.c | 9 +-
> xen/arch/x86/mm/p2m-ept.c | 3 +
> xen/arch/x86/mm/p2m.c | 41 ++
> xen/arch/x86/msr.c | 6 +-
> xen/arch/x86/sgx.c | 815 ++++++++++++++++++++++++++++
> xen/common/page_alloc.c | 39 +-
> xen/include/asm-arm/mm.h | 9 +
> xen/include/asm-x86/cpufeature.h | 4 +
> xen/include/asm-x86/cpuid.h | 29 +-
> xen/include/asm-x86/hvm/hvm.h | 3 +
> xen/include/asm-x86/hvm/vmx/vmcs.h | 8 +
> xen/include/asm-x86/hvm/vmx/vmx.h | 3 +
> xen/include/asm-x86/mm.h | 19 +-
> xen/include/asm-x86/msr-index.h | 6 +
> xen/include/asm-x86/msr.h | 5 +
> xen/include/asm-x86/p2m.h | 12 +-
> xen/include/asm-x86/sgx.h | 86 +++
> xen/include/public/arch-x86/cpufeatureset.h | 3 +-
> xen/include/xen/mm.h | 2 +
> xen/tools/gen-cpuid.py | 3 +
> 47 files changed, 1757 insertions(+), 31 deletions(-)
> create mode 100644 tools/libxc/xc_sgx.c
> create mode 100644 xen/arch/x86/sgx.c
> create mode 100644 xen/include/asm-x86/sgx.h
>
> --
> 2.15.0
>