stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>,
	"Zhang, Yang Z" <yang.z.zhang@intel.com>,
	"Liu, RongrongX" <rongrongx.liu@intel.com>,
	Felipe Reyes <freyes@suse.com>,
	Wanpeng Li <wanpeng.li@linux.intel.com>
Subject: [PATCH 3.14 42/88] KVM: nVMX: fix "acknowledge interrupt on exit" when APICv is in use
Date: Wed,  3 Sep 2014 15:05:16 -0700	[thread overview]
Message-ID: <20140903220517.231480248@linuxfoundation.org> (raw)
In-Reply-To: <20140903220515.958924632@linuxfoundation.org>

3.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Wanpeng Li <wanpeng.li@linux.intel.com>

commit 56cc2406d68c0f09505c389e276f27a99f495cbd upstream.

After commit 77b0f5d (KVM: nVMX: Ack and write vector info to intr_info
if L1 asks us to), "Acknowledge interrupt on exit" behavior can be
emulated. To do so, KVM will ask the APIC for the interrupt vector if
during a nested vmexit if VM_EXIT_ACK_INTR_ON_EXIT is set.  With APICv,
kvm_get_apic_interrupt would return -1 and give the following WARNING:

Call Trace:
 [<ffffffff81493563>] dump_stack+0x49/0x5e
 [<ffffffff8103f0eb>] warn_slowpath_common+0x7c/0x96
 [<ffffffffa059709a>] ? nested_vmx_vmexit+0xa4/0x233 [kvm_intel]
 [<ffffffff8103f11a>] warn_slowpath_null+0x15/0x17
 [<ffffffffa059709a>] nested_vmx_vmexit+0xa4/0x233 [kvm_intel]
 [<ffffffffa0594295>] ? nested_vmx_exit_handled+0x6a/0x39e [kvm_intel]
 [<ffffffffa0537931>] ? kvm_apic_has_interrupt+0x80/0xd5 [kvm]
 [<ffffffffa05972ec>] vmx_check_nested_events+0xc3/0xd3 [kvm_intel]
 [<ffffffffa051ebe9>] inject_pending_event+0xd0/0x16e [kvm]
 [<ffffffffa051efa0>] vcpu_enter_guest+0x319/0x704 [kvm]

To fix this, we cannot rely on the processor's virtual interrupt delivery,
because "acknowledge interrupt on exit" must only update the virtual
ISR/PPR/IRR registers (and SVI, which is just a cache of the virtual ISR)
but it should not deliver the interrupt through the IDT.  Thus, KVM has
to deliver the interrupt "by hand", similar to the treatment of EOI in
commit fc57ac2c9ca8 (KVM: lapic: sync highest ISR to hardware apic on
EOI, 2014-05-14).

The patch modifies kvm_cpu_get_interrupt to always acknowledge an
interrupt; there are only two callers, and the other is not affected
because it is never reached with kvm_apic_vid_enabled() == true.  Then it
modifies apic_set_isr and apic_clear_irr to update SVI and RVI in addition
to the registers.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: "Zhang, Yang Z" <yang.z.zhang@intel.com>
Tested-by: Liu, RongrongX <rongrongx.liu@intel.com>
Tested-by: Felipe Reyes <freyes@suse.com>
Fixes: 77b0f5d67ff2781f36831cba79674c3e97bd7acf
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/kvm/irq.c   |    2 -
 arch/x86/kvm/lapic.c |   52 ++++++++++++++++++++++++++++++++++++++-------------
 2 files changed, 40 insertions(+), 14 deletions(-)

--- a/arch/x86/kvm/irq.c
+++ b/arch/x86/kvm/irq.c
@@ -108,7 +108,7 @@ int kvm_cpu_get_interrupt(struct kvm_vcp
 
 	vector = kvm_cpu_get_extint(v);
 
-	if (kvm_apic_vid_enabled(v->kvm) || vector != -1)
+	if (vector != -1)
 		return vector;			/* PIC */
 
 	return kvm_get_apic_interrupt(v);	/* APIC */
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -352,25 +352,46 @@ static inline int apic_find_highest_irr(
 
 static inline void apic_clear_irr(int vec, struct kvm_lapic *apic)
 {
-	apic->irr_pending = false;
+	struct kvm_vcpu *vcpu;
+
+	vcpu = apic->vcpu;
+
 	apic_clear_vector(vec, apic->regs + APIC_IRR);
-	if (apic_search_irr(apic) != -1)
-		apic->irr_pending = true;
+	if (unlikely(kvm_apic_vid_enabled(vcpu->kvm)))
+		/* try to update RVI */
+		kvm_make_request(KVM_REQ_EVENT, vcpu);
+	else {
+		vec = apic_search_irr(apic);
+		apic->irr_pending = (vec != -1);
+	}
 }
 
 static inline void apic_set_isr(int vec, struct kvm_lapic *apic)
 {
-	/* Note that we never get here with APIC virtualization enabled.  */
+	struct kvm_vcpu *vcpu;
+
+	if (__apic_test_and_set_vector(vec, apic->regs + APIC_ISR))
+		return;
+
+	vcpu = apic->vcpu;
 
-	if (!__apic_test_and_set_vector(vec, apic->regs + APIC_ISR))
-		++apic->isr_count;
-	BUG_ON(apic->isr_count > MAX_APIC_VECTOR);
 	/*
-	 * ISR (in service register) bit is set when injecting an interrupt.
-	 * The highest vector is injected. Thus the latest bit set matches
-	 * the highest bit in ISR.
+	 * With APIC virtualization enabled, all caching is disabled
+	 * because the processor can modify ISR under the hood.  Instead
+	 * just set SVI.
 	 */
-	apic->highest_isr_cache = vec;
+	if (unlikely(kvm_apic_vid_enabled(vcpu->kvm)))
+		kvm_x86_ops->hwapic_isr_update(vcpu->kvm, vec);
+	else {
+		++apic->isr_count;
+		BUG_ON(apic->isr_count > MAX_APIC_VECTOR);
+		/*
+		 * ISR (in service register) bit is set when injecting an interrupt.
+		 * The highest vector is injected. Thus the latest bit set matches
+		 * the highest bit in ISR.
+		 */
+		apic->highest_isr_cache = vec;
+	}
 }
 
 static inline int apic_find_highest_isr(struct kvm_lapic *apic)
@@ -1627,11 +1648,16 @@ int kvm_get_apic_interrupt(struct kvm_vc
 	int vector = kvm_apic_has_interrupt(vcpu);
 	struct kvm_lapic *apic = vcpu->arch.apic;
 
-	/* Note that we never get here with APIC virtualization enabled.  */
-
 	if (vector == -1)
 		return -1;
 
+	/*
+	 * We get here even with APIC virtualization enabled, if doing
+	 * nested virtualization and L1 runs with the "acknowledge interrupt
+	 * on exit" mode.  Then we cannot inject the interrupt via RVI,
+	 * because the process would deliver it through the IDT.
+	 */
+
 	apic_set_isr(vector, apic);
 	apic_update_ppr(apic);
 	apic_clear_irr(vector, apic);



  parent reply	other threads:[~2014-09-03 22:05 UTC|newest]

Thread overview: 94+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-09-03 22:04 [PATCH 3.14 00/88] 3.14.18-stable review Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.14 01/88] stable_kernel_rules: Add pointer to netdev-FAQ for network patches Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.14 02/88] HID: logitech: perform bounds checking on device_id early enough Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.14 03/88] HID: fix a couple of off-by-ones Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.14 04/88] isofs: Fix unbounded recursion when processing relocated directories Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.14 05/88] USB: OHCI: fix bugs in debug routines Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.14 06/88] USB: OHCI: dont lose track of EDs when a controller dies Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.14 07/88] USB: devio: fix issue with log flooding Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.14 08/88] USB: serial: ftdi_sio: Annotate the current Xsens PID assignments Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.14 09/88] USB: serial: ftdi_sio: Add support for new Xsens devices Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.14 10/88] USB: ehci-pci: USB host controller support for Intel Quark X1000 Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.14 11/88] USB: Fix persist resume of some SS USB devices Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.14 12/88] ALSA: hda - fix an external mic jack problem on a HP machine Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.14 13/88] ALSA: usb-audio: Adjust Gamecom 780 volume level Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.14 14/88] ALSA: virtuoso: add Xonar Essence STX II support Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.14 15/88] ALSA: hda/ca0132 - Dont try loading firmware at resume when already failed Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.14 16/88] ALSA: usb-audio: fix BOSS ME-25 MIDI regression Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.14 17/88] ALSA: hda - Add mute LED pin quirk for HP 15 touchsmart Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.14 18/88] ALSA: hda - restore the gpio led after resume Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.14 19/88] ALSA: hda/realtek - Avoid setting wrong COEF on ALC269 & co Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.14 20/88] mei: start disconnect request timer consistently Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.14 21/88] sched: Fix sched_setparam() policy == -1 logic Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.14 22/88] ARM: dts: AM4372: Correct mailbox node data Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.14 23/88] ARM: 8097/1: unistd.h: relocate comments back to place Greg Kroah-Hartman
2014-09-03 22:04 ` [PATCH 3.14 25/88] drm: omapdrm: fix compiler errors Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 26/88] hwmon: (sis5595) Prevent overflow problem when writing large limits Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 27/88] hwmon: (amc6821) Fix possible race condition bug Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 28/88] hwmon: (lm78) Fix overflow problems seen when writing large temperature limits Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 29/88] hwmon: (gpio-fan) Prevent overflow problem when writing large limits Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 30/88] hwmon: (ads1015) Fix off-by-one for valid channel index checking Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 31/88] hwmon: (lm85) Fix various errors on attribute writes Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 32/88] hwmon: (ads1015) Fix out-of-bounds array access Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 33/88] hwmon: (dme1737) Prevent overflow problem when writing large limits Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 34/88] tpm: Add missing tpm_do_selftest to ST33 I2C driver Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 35/88] drivers/i2c/busses: use correct type for dma_map/unmap Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 36/88] ext4: fix ext4_discard_allocated_blocks() if we cant allocate the pa struct Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 37/88] serial: core: Preserve termios c_cflag for console resume Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 38/88] crypto: ux500 - make interrupt mode plausible Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 39/88] debugfs: Fix corrupted loop in debugfs_remove_recursive Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 40/88] KVM: x86: Inter-privilege level ret emulation is not implemeneted Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 41/88] KVM: x86: always exit on EOIs for interrupts listed in the IOAPIC redir table Greg Kroah-Hartman
2014-09-03 22:05 ` Greg Kroah-Hartman [this message]
2014-09-03 22:05 ` [PATCH 3.14 43/88] Revert "KVM: x86: Increase the number of fixed MTRR regs to 10" Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 44/88] kvm: iommu: fix the third parameter of kvm_iommu_put_pages (CVE-2014-3601) Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 45/88] ext4: fix BUG_ON in mb_free_blocks() Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 46/88] drm/radeon: add new KV pci id Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 47/88] drm/radeon: add new bonaire pci ids Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 48/88] drm/radeon: add additional SI " Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 49/88] PCI: Configure ASPM when enabling device Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 50/88] ACPI / PCI: Fix sysfs acpi_index and label errors Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 51/88] x86: dont exclude low BIOS area when allocating address space for non-PCI cards Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 52/88] powerpc/pci: Reorder pci bus/bridge unregistration during PHB removal Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 53/88] powerpc/powernv: Update dev->dma_mask in pci_set_dma_mask() path Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 54/88] x86_64/vsyscall: Fix warn_bad_vsyscall log output Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 55/88] hpsa: fix non-x86 builds Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 56/88] xen/events/fifo: ensure all bitops are properly aligned even on x86 Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 58/88] x86/xen: use vmap() to map grant table pages in PVH guests Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 59/88] x86/xen: resume timer irqs early Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 60/88] hpsa: fix bad -ENOMEM return value in hpsa_big_passthru_ioctl Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 61/88] Btrfs: Fix memory corruption by ulist_add_merge() on 32bit arch Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 62/88] Btrfs: fix csum tree corruption, duplicate and outdated checksums Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 63/88] Btrfs: read lock extent buffer while walking backrefs Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 64/88] Btrfs: fix compressed write corruption on enospc Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 65/88] Btrfs: fix crash on endio of reading corrupted block Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 66/88] mei: reset client state on queued connect request Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 67/88] mei: nfc: fix memory leak in error path Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 68/88] ext4: update i_disksize coherently with block allocation on " Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 69/88] jbd2: fix infinite loop when recovering corrupt journal blocks Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 70/88] jbd2: fix descriptor block size handling errors with journal_csum Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 71/88] staging: et131x: Fix errors caused by phydev->addr accesses before initialisation Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 72/88] staging/rtl8188eu: add 0df6:0076 Sitecom Europe B.V Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 73/88] staging: r8188eu: Add new USB ID Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 74/88] xhci: Treat not finding the event_seg on COMP_STOP the same as COMP_STOP_INVAL Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 75/88] usb: xhci: amd chipset also needs short TX quirk Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 76/88] ARM: OMAP2+: hwmod: Rearm wake-up interrupts for DT when MUSB is idled Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 77/88] USB: ftdi_sio: add Basic Micro ATOM Nano USB2Serial PID Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 79/88] USB: whiteheat: Added bounds checking for bulk command response Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 80/88] usb: ehci: using wIndex + 1 for hub port Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 81/88] usb: hub: Prevent hub autosuspend if usbcore.autosuspend is -1 Greg Kroah-Hartman
     [not found]   ` <1409826760@msgid.manchmal.in-ulm.de>
2014-09-04 14:00     ` Greg KH
2014-09-03 22:05 ` [PATCH 3.14 82/88] NFSD: Decrease nfsd_users in nfsd_startup_generic fail Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 83/88] NFS: Fix /proc/fs/nfsfs/servers and /proc/fs/nfsfs/volumes Greg Kroah-Hartman
2014-09-03 22:18   ` Trond Myklebust
2014-09-03 22:05 ` [PATCH 3.14 84/88] nfs3_list_one_acl(): check get_acl() result with IS_ERR_OR_NULL Greg Kroah-Hartman
2014-09-03 22:05 ` [PATCH 3.14 85/88] svcrdma: Select NFSv4.1 backchannel transport based on forward channel Greg Kroah-Hartman
2014-09-03 22:06 ` [PATCH 3.14 86/88] NFSv3: Fix another acl regression Greg Kroah-Hartman
2014-09-03 22:06 ` [PATCH 3.14 87/88] NFSv4: Fix problems with close in the presence of a delegation Greg Kroah-Hartman
2014-09-03 22:06 ` [PATCH 3.14 88/88] vm_is_stack: use for_each_thread() rather then buggy while_each_thread() Greg Kroah-Hartman
2014-09-03 23:44 ` [PATCH 3.14 00/88] 3.14.18-stable review Greg Kroah-Hartman
2014-09-04  4:55   ` Guenter Roeck
2014-09-04 14:01     ` Greg Kroah-Hartman
2014-09-04 10:44   ` Holger Hoffstätte
2014-09-04 14:01     ` Greg KH
2014-09-04 13:38 ` Shuah Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140903220517.231480248@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=freyes@suse.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=rongrongx.liu@intel.com \
    --cc=stable@vger.kernel.org \
    --cc=wanpeng.li@linux.intel.com \
    --cc=yang.z.zhang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).