From: Kamal Mostafa <kamal@canonical.com>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org,
kernel-team@lists.ubuntu.com
Cc: Andy Lutomirski <luto@amacapital.net>,
Paolo Bonzini <pbonzini@redhat.com>,
Kamal Mostafa <kamal@canonical.com>
Subject: [PATCH 4.2.y-ckt 81/98] KVM: MMU: fix ept=0/pte.u=1/pte.w=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0 combo
Date: Tue, 15 Mar 2016 16:31:15 -0700
Message-ID: <1458084692-23100-82-git-send-email-kamal@canonical.com>
In-Reply-To: <1458084692-23100-1-git-send-email-kamal@canonical.com>
4.2.8-ckt6 -stable review patch. If anyone has any objections, please let me know.
---8<------------------------------------------------------------
From: Paolo Bonzini <pbonzini@redhat.com>
commit 844a5fe219cf472060315971e15cbf97674a3324 upstream.
Yes, all of these are needed. :) This is admittedly a bit odd, but
kvm-unit-tests access.flat tests this if you run it with "-cpu host"
and of course ept=0.
KVM runs the guest with CR0.WP=1, so it must handle supervisor writes
specially when pte.u=1/pte.w=0/CR0.WP=0. Such writes cause a fault
when U=1 and W=0 in the SPTE, but they must succeed because CR0.WP=0.
When KVM gets the fault, it sets U=0 and W=1 in the shadow PTE and
restarts execution. This will still cause a user write to fault, while
supervisor writes will succeed. User reads will fault spuriously now,
and KVM will then flip U and W again in the SPTE (U=1, W=0). User reads
will be enabled and supervisor writes disabled, going back to the
original situation where supervisor writes fault spuriously.
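The U/W flipping described above can be sketched as a small standalone model. This is only an illustration under simplifying assumptions, not KVM's code: struct spte, access_ok(), guest_allows() and handle_fault() are hypothetical names, and SMAP, NX and the accessed/dirty bits are ignored.

```c
#include <stdbool.h>

/* Hypothetical model of the U and W bits of one shadow PTE. */
struct spte {
	bool u;
	bool w;
};

/* Hardware permission check with CR0.WP=1, as the guest runs under shadow paging. */
static bool access_ok(struct spte s, bool user, bool write)
{
	if (user && !s.u)
		return false;
	if (write && !s.w)
		return false;	/* with CR0.WP=1 supervisor writes honor W too */
	return true;
}

/* Guest view for pte.u=1/pte.w=0 with guest CR0.WP=0: only user writes fault. */
static bool guest_allows(bool user, bool write)
{
	return !(user && write);
}

/*
 * Returns true if the fault was spurious and the SPTE was flipped so the
 * access can be retried, false if it is a genuine guest fault.
 */
static bool handle_fault(struct spte *s, bool user, bool write)
{
	if (!guest_allows(user, write))
		return false;
	if (!user && write) {
		s->u = false;	/* supervisor write: kernel-only, writable */
		s->w = true;
	} else {
		s->u = true;	/* user access: user-visible, read-only */
		s->w = false;
	}
	return true;
}
```

Starting from U=1/W=0, a supervisor write fails access_ok(), handle_fault() flips the entry to U=0/W=1, and the retry succeeds. A later user read then fails and flips the entry back, which is the ping-pong the text describes.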
When SMEP is in effect, however, U=0 will enable kernel execution of
this page. To avoid this, KVM also sets NX=1 in the shadow PTE together
with U=0. If the guest has not enabled NX, the result is a continuous
stream of page faults due to the NX bit being reserved.
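Why the faults never stop can be shown with a minimal sketch. It assumes the architectural bit positions (EFER.NXE is bit 11, the PTE NX/XD bit is bit 63); rsvd_fault() and retries_until_success() are hypothetical helpers, and PTE_NX is an illustrative name, not a kernel macro.

```c
#include <stdbool.h>
#include <stdint.h>

#define EFER_NX	(1ULL << 11)	/* IA32_EFER.NXE */
#define PTE_NX	(1ULL << 63)	/* NX/XD bit of a 64-bit PTE (illustrative name) */

/* Bit 63 of a PTE is reserved, and faults on any access, while EFER.NX=0. */
static bool rsvd_fault(uint64_t pte, uint64_t efer)
{
	return (pte & PTE_NX) && !(efer & EFER_NX);
}

/*
 * Counts how many times the same access faults before it succeeds,
 * giving up after 'limit' attempts. The fault handler reinstalls the
 * same SPTE each time, so nothing changes between attempts.
 */
static int retries_until_success(uint64_t spte, uint64_t efer, int limit)
{
	int n = 0;

	while (n < limit && rsvd_fault(spte, efer))
		n++;
	return n;
}
```

With EFER_NX set in the effective EFER the count is 0. With it clear, the loop only ends because of the artificial limit, which stands in for the continuous stream of page faults.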
The fix is to force EFER.NX=1 even if the CPU is taking care of the EFER
switch. (All machines with SMEP have the CPU_LOAD_IA32_EFER vm-entry
control, so they do not use user-return notifiers for EFER---if they did,
EFER.NX would be forced to the same value as the host).
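The NX decision made by the fix can be restated as a pure function. This is a simplified sketch that loosely mirrors the vmx.c hunk in this patch; struct efer_plan, plan_nx() and merge_with_host() are hypothetical names, and host_has_smep stands in for boot_cpu_has(X86_FEATURE_SMEP).

```c
#include <stdbool.h>
#include <stdint.h>

#define EFER_NX	(1ULL << 11)	/* IA32_EFER.NXE */

/* Hypothetical result type: the EFER to load and the bits left to the host. */
struct efer_plan {
	uint64_t guest_efer;
	uint64_t ignore_bits;
};

static struct efer_plan plan_nx(uint64_t guest_efer, bool enable_ept,
				bool host_has_smep)
{
	struct efer_plan p = { guest_efer, 0 };

	if (!enable_ept) {
		if (host_has_smep)
			p.guest_efer |= EFER_NX;	/* shadow paging may set spte.nx */
		else if (!(p.guest_efer & EFER_NX))
			p.ignore_bits |= EFER_NX;	/* keep the host's NX value */
	}
	return p;
}

/* User-return-notifier path: ignored bits take the host's value. */
static uint64_t merge_with_host(struct efer_plan p, uint64_t host_efer)
{
	return (p.guest_efer & ~p.ignore_bits) | (host_efer & p.ignore_bits);
}
```

For example, plan_nx(0, false, true) yields guest_efer with EFER_NX set even though the guest left it clear, while with EPT enabled the guest value passes through unchanged.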
There is another bug in the reserved bit check, which I've split to a
separate patch for easier application to stable kernels.
Cc: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Fixes: f6577a5fa15d82217ca73c74cd2dcbc0f6c781dd
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
---
Documentation/virtual/kvm/mmu.txt | 3 ++-
arch/x86/kvm/vmx.c | 36 +++++++++++++++++++++++-------------
2 files changed, 25 insertions(+), 14 deletions(-)
diff --git a/Documentation/virtual/kvm/mmu.txt b/Documentation/virtual/kvm/mmu.txt
index 3a4d681..b653641 100644
--- a/Documentation/virtual/kvm/mmu.txt
+++ b/Documentation/virtual/kvm/mmu.txt
@@ -358,7 +358,8 @@ In the first case there are two additional complications:
- if CR4.SMEP is enabled: since we've turned the page into a kernel page,
the kernel may now execute it. We handle this by also setting spte.nx.
If we get a user fetch or read fault, we'll change spte.u=1 and
- spte.nx=gpte.nx back.
+ spte.nx=gpte.nx back. For this to work, KVM forces EFER.NX to 1 when
+ shadow paging is in use.
- if CR4.SMAP is disabled: since the page has been changed to a kernel
page, it can not be reused when CR4.SMAP is enabled. We set
CR4.SMAP && !CR0.WP into shadow page's role to avoid this case. Note,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index cb450d8..abf8cc7 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1718,26 +1718,31 @@ static void reload_tss(void)
static bool update_transition_efer(struct vcpu_vmx *vmx, int efer_offset)
{
- u64 guest_efer;
- u64 ignore_bits;
+ u64 guest_efer = vmx->vcpu.arch.efer;
+ u64 ignore_bits = 0;
- guest_efer = vmx->vcpu.arch.efer;
+ if (!enable_ept) {
+ /*
+ * NX is needed to handle CR0.WP=1, CR4.SMEP=1. Testing
+ * host CPUID is more efficient than testing guest CPUID
+ * or CR4. Host SMEP is anyway a requirement for guest SMEP.
+ */
+ if (boot_cpu_has(X86_FEATURE_SMEP))
+ guest_efer |= EFER_NX;
+ else if (!(guest_efer & EFER_NX))
+ ignore_bits |= EFER_NX;
+ }
/*
- * NX is emulated; LMA and LME handled by hardware; SCE meaningless
- * outside long mode
+ * LMA and LME handled by hardware; SCE meaningless outside long mode.
*/
- ignore_bits = EFER_NX | EFER_SCE;
+ ignore_bits |= EFER_SCE;
#ifdef CONFIG_X86_64
ignore_bits |= EFER_LMA | EFER_LME;
/* SCE is meaningful only in long mode on Intel */
if (guest_efer & EFER_LMA)
ignore_bits &= ~(u64)EFER_SCE;
#endif
- guest_efer &= ~ignore_bits;
- guest_efer |= host_efer & ignore_bits;
- vmx->guest_msrs[efer_offset].data = guest_efer;
- vmx->guest_msrs[efer_offset].mask = ~ignore_bits;
clear_atomic_switch_msr(vmx, MSR_EFER);
@@ -1748,16 +1753,21 @@ static bool update_transition_efer(struct vcpu_vmx *vmx, int efer_offset)
*/
if (cpu_has_load_ia32_efer ||
(enable_ept && ((vmx->vcpu.arch.efer ^ host_efer) & EFER_NX))) {
- guest_efer = vmx->vcpu.arch.efer;
if (!(guest_efer & EFER_LMA))
guest_efer &= ~EFER_LME;
if (guest_efer != host_efer)
add_atomic_switch_msr(vmx, MSR_EFER,
guest_efer, host_efer);
return false;
- }
+ } else {
+ guest_efer &= ~ignore_bits;
+ guest_efer |= host_efer & ignore_bits;
- return true;
+ vmx->guest_msrs[efer_offset].data = guest_efer;
+ vmx->guest_msrs[efer_offset].mask = ~ignore_bits;
+
+ return true;
+ }
}
static unsigned long segment_base(u16 selector)
--
2.7.0