From: Greg KH <gregkh@suse.de>
To: linux-kernel@vger.kernel.org, stable@kernel.org
Cc: stable-review@kernel.org, torvalds@linux-foundation.org,
akpm@linux-foundation.org, alan@lxorguk.ukuu.org.uk,
Suresh Siddha <suresh.b.siddha@intel.com>,
Chris Wright <chrisw@sous-sol.org>,
Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>,
"H. Peter Anvin" <hpa@linux.intel.com>
Subject: [29/49] x86, vt-d: Quirk for masking vtd spec errors to platform error handling logic
Date: Wed, 05 Jan 2011 15:00:47 -0800
Message-ID: <20110105230326.382281092@clark.site>
In-Reply-To: <20110105230438.GA26241@kroah.com>
2.6.32-longterm review patch. If anyone has any objections, please let us know.
------------------
From: Suresh Siddha <suresh.b.siddha@intel.com>
commit 254e42006c893f45bca48f313536fcba12206418 upstream.
On platforms with the Intel 7500 chipset, there were reports of system
hangs and NMIs during kexec/kdump when interrupt-remapping is enabled.
During kdump, there is a window where the devices might still be using the
old kernel's interrupt information while the kdump kernel is coming up. This
can cause vt-d faults, because the interrupt configuration from the old
kernel maps to null IRTE entries in the new kernel. (Without
interrupt-remapping enabled we still have the same issue, but in that case
we only see benign spurious interrupts hit the new kernel.)
Based on platform config settings, these platforms seem to generate NMI/SMI
when a vt-d fault happens and there were reports that the resulting SMI causes
the system to hang.
Fix it by masking vt-d spec defined errors to platform error reporting logic.
VT-d spec related errors are already handled by the VT-d OS code, so there is
no need to report the same error through other channels.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <1291667190.2675.8.camel@sbsiddha-MOBL3.sc.intel.com>
Reported-by: Max Asbock <masbock@linux.vnet.ibm.com>
Reported-and-tested-by: Takao Indoh <indou.takao@jp.fujitsu.com>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
---
drivers/pci/quirks.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -2575,6 +2575,29 @@ extern struct pci_fixup __end_pci_fixups
extern struct pci_fixup __start_pci_fixups_suspend[];
extern struct pci_fixup __end_pci_fixups_suspend[];
+#if defined(CONFIG_DMAR) || defined(CONFIG_INTR_REMAP)
+#define VTUNCERRMSK_REG 0x1ac
+#define VTD_MSK_SPEC_ERRORS (1 << 31)
+/*
+ * This is a quirk for masking vt-d spec defined errors to platform error
+ * handling logic. With out this, platforms using Intel 7500, 5500 chipsets
+ * (and the derivative chipsets like X58 etc) seem to generate NMI/SMI (based
+ * on the RAS config settings of the platform) when a vt-d fault happens.
+ * The resulting SMI caused the system to hang.
+ *
+ * VT-d spec related errors are already handled by the VT-d OS code, so no
+ * need to report the same error through other channels.
+ */
+static void vtd_mask_spec_errors(struct pci_dev *dev)
+{
+ u32 word;
+
+ pci_read_config_dword(dev, VTUNCERRMSK_REG, &word);
+ pci_write_config_dword(dev, VTUNCERRMSK_REG, word | VTD_MSK_SPEC_ERRORS);
+}
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x342e, vtd_mask_spec_errors);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x3c28, vtd_mask_spec_errors);
+#endif
void pci_fixup_device(enum pci_fixup_pass pass, struct pci_dev *dev)
{
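For readers who want to try the logic outside a kernel tree, here is a
minimal user-space sketch of what the quirk above does: a read-modify-write
that sets bit 31 of the dword at config offset 0x1ac, applied only to devices
whose vendor/device IDs match the two fixup entries. Everything prefixed
mock_ or MOCK_ is a hypothetical stand-in and not a kernel API. The byte
array standing in for config space and the plain table walk are assumptions
for illustration only; the kernel registers fixups through
DECLARE_PCI_FIXUP_EARLY and runs them from pci_fixup_device(). The sketch
also writes the mask as (1u << 31) so the shift stays in unsigned arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MOCK_CFG_SPACE_SIZE	4096		/* assumed: extended config space */
#define MOCK_VENDOR_ID_INTEL	0x8086
#define VTUNCERRMSK_REG		0x1ac		/* offset taken from the patch */
#define VTD_MSK_SPEC_ERRORS	(1u << 31)	/* unsigned variant of the patch's mask */

/* Hypothetical stand-in for struct pci_dev plus its config space. */
struct mock_dev {
	uint16_t vendor;
	uint16_t device;
	uint8_t cfg[MOCK_CFG_SPACE_SIZE];
};

/* Mock of a dword config read: copies 4 bytes out of the fake config space. */
static void mock_read_config_dword(const struct mock_dev *dev,
				   unsigned int off, uint32_t *val)
{
	assert(off % 4 == 0 && off + 4 <= MOCK_CFG_SPACE_SIZE);
	memcpy(val, &dev->cfg[off], sizeof(*val));
}

/* Mock of a dword config write: copies 4 bytes into the fake config space. */
static void mock_write_config_dword(struct mock_dev *dev,
				    unsigned int off, uint32_t val)
{
	assert(off % 4 == 0 && off + 4 <= MOCK_CFG_SPACE_SIZE);
	memcpy(&dev->cfg[off], &val, sizeof(val));
}

/* Same read-modify-write as the quirk: set the mask bit, keep the rest. */
static void mock_vtd_mask_spec_errors(struct mock_dev *dev)
{
	uint32_t word;

	mock_read_config_dword(dev, VTUNCERRMSK_REG, &word);
	mock_write_config_dword(dev, VTUNCERRMSK_REG, word | VTD_MSK_SPEC_ERRORS);
}

/* Hypothetical fixup table; the device IDs are the two from the patch. */
struct mock_fixup {
	uint16_t vendor;
	uint16_t device;
	void (*hook)(struct mock_dev *dev);
};

static const struct mock_fixup mock_early_fixups[] = {
	{ MOCK_VENDOR_ID_INTEL, 0x342e, mock_vtd_mask_spec_errors },
	{ MOCK_VENDOR_ID_INTEL, 0x3c28, mock_vtd_mask_spec_errors },
};

/* Run every hook whose vendor/device pair matches the device. */
static void mock_fixup_device(struct mock_dev *dev)
{
	size_t i;

	for (i = 0; i < sizeof(mock_early_fixups) / sizeof(mock_early_fixups[0]); i++) {
		const struct mock_fixup *f = &mock_early_fixups[i];

		if (f->vendor == dev->vendor && f->device == dev->device)
			f->hook(dev);
	}
}
```

Calling mock_fixup_device() on a zeroed mock_dev with vendor 0x8086 and
device 0x342e sets only bit 31 at offset 0x1ac and leaves the other bits of
that dword as they were. A device with a non-matching ID is not touched, and
running the hook twice gives the same result as running it once.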