From: Jiang Liu <jiang.liu@linux.intel.com>
To: Borislav Petkov <bp@alien8.de>
Cc: "Daniel Vetter" <daniel@ffwll.ch>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Alex Deucher" <alexdeucher@gmail.com>,
	"Alex Deucher" <alexander.deucher@amd.com>,
	"Christian König" <christian.koenig@amd.com>,
	"Mailing list - DRI developers" <dri-devel@lists.freedesktop.org>,
	lkml <linux-kernel@vger.kernel.org>
Subject: Re: WARNING: CPU: 4 PID: 863 at include/drm/drm_crtc.h:1577 drm_helper_choose_encoder_dpms+0x88/0x90() - evildoer found and neutralized
Date: Wed, 30 Sep 2015 15:45:39 +0800
Message-ID: <560B9323.6000309@linux.intel.com>
In-Reply-To: <20150929105138.GA12037@nazgul.tnic>

[-- Attachment #1: Type: text/plain, Size: 1282 bytes --]

On 2015/9/29 18:51, Borislav Petkov wrote:
> On Tue, Sep 29, 2015 at 04:50:36PM +0800, Jiang Liu wrote:
>> So could you please help to apply the attached debug patch to gather
>> more information about the regression?
> 
> Sure, just did.
> 
> I'm sending you a full s/r cycle attempt caught over serial in a private
> message.

Hi Boris,
From the log file, we learned that the NULL pointer dereference
was caused by the AMD IOMMU device. For normal MSI-enabled PCI devices, we
get valid irq numbers such as:
[ 74.661170] ahci 0000:04:00.0: irqdomain: freeze msi 1 irq28
[ 74.661297] radeon 0000:01:00.0: irqdomain: freeze msi 1 irq47
But for the AMD IOMMU device, we got an invalid irq number (0) after
enabling MSI:
[ 74.662488] pci 0000:00:00.2: irqdomain: freeze msi 1 irq0
which then caused a NULL pointer dereference when __pci_restore_msi_state()
was called by the system resume code.
So we need to figure out why we got irq number 0 after enabling
MSI for the AMD IOMMU device. The only hint I have is that the iommu driver
just grabs the PCI device without providing a PCI device driver for the
IOMMU PCI device; we have solved a similar case for the eata driver. So
could you please apply this debug patch to gather more info and send me
/proc/interrupts?
Thanks!
Gerry
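
As a rough aid for reading such logs, here is a minimal sketch (not part of
the debug patch) that scans lines of the form quoted above and flags devices
that report irq 0. The helper names parse_freeze_line() and count_irq0() are
hypothetical, and the line format is assumed from the three log lines in
this message only.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one log line of the assumed form
 *   "[ 74.662488] pci 0000:00:00.2: irqdomain: freeze msi 1 irq0"
 * On a match, copy the "driver bus-address" text into dev, store the
 * irq number in *irq and return 1. Return 0 for any other line.
 */
static int parse_freeze_line(const char *line, char *dev, size_t devlen,
			     int *irq)
{
	const char *tag = ": irqdomain: freeze msi ";
	const char *end = strstr(line, tag);
	const char *start;
	size_t n;
	int msi;

	if (!end || devlen == 0)
		return 0;

	/* The device text starts after the "]" closing the timestamp. */
	start = strchr(line, ']');
	if (!start || start >= end)
		return 0;
	start++;
	while (*start == ' ')
		start++;

	n = (size_t)(end - start);
	if (n >= devlen)
		n = devlen - 1;
	memcpy(dev, start, n);
	dev[n] = '\0';

	/* The remainder looks like "1 irq28": MSI-enabled flag, then irq. */
	return sscanf(end + strlen(tag), "%d irq%d", &msi, irq) == 2;
}

/* Count lines whose parsed irq is 0, printing each offending device. */
static int count_irq0(const char *const *lines, size_t nlines)
{
	char dev[64];
	size_t i;
	int irq, hits = 0;

	for (i = 0; i < nlines; i++) {
		if (parse_freeze_line(lines[i], dev, sizeof(dev), &irq) &&
		    irq == 0) {
			printf("suspicious: %s reports irq 0\n", dev);
			hits++;
		}
	}
	return hits;
}
```

Fed the three lines quoted above, count_irq0() would flag only the
0000:00:00.2 device; lines that do not match the assumed format are skipped.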

>
> Thanks.
> 

[-- Attachment #2: 0001-Debug-Gather-more-information-about-AMD-iommu-device.patch --]
[-- Type: text/x-patch, Size: 4545 bytes --]

From 57d3013c1c64d9407e432598c645c8de256e0b42 Mon Sep 17 00:00:00 2001
From: Jiang Liu <jiang.liu@linux.intel.com>
Date: Wed, 30 Sep 2015 14:49:29 +0800
Subject: [PATCH] Debug: Gather more information about AMD iommu device

Hi Boris,
	From the log file, we learned that the NULL pointer dereference
was caused by the AMD IOMMU device. For normal MSI-enabled PCI devices, we
get valid irq numbers such as:
[   74.661170] ahci 0000:04:00.0: irqdomain: freeze msi 1 irq28
[   74.661297] radeon 0000:01:00.0: irqdomain: freeze msi 1 irq47
	But for the AMD IOMMU device, we got an invalid irq number (0) after
enabling MSI:
[   74.662488] pci 0000:00:00.2: irqdomain: freeze msi 1 irq0
which then caused a NULL pointer dereference when __pci_restore_msi_state()
was called by the system resume code.
	So we need to figure out why we got irq number 0 after enabling
MSI for the AMD IOMMU device. The only hint I have is that the iommu driver
just grabs the PCI device without providing a PCI device driver for the
IOMMU PCI device; we have solved a similar case for the eata driver. So
could you please apply this debug patch to gather more info and send me
/proc/interrupts?
Thanks!
Gerry

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 arch/x86/kernel/apic/msi.c     |    6 +++++-
 drivers/iommu/amd_iommu_init.c |    8 ++++++++
 drivers/pci/msi.c              |    4 ++++
 kernel/irq/msi.c               |    1 +
 4 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index 5f1feb6854af..050dcf25577c 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -71,6 +71,7 @@ int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 {
 	struct irq_domain *domain;
 	struct irq_alloc_info info;
+	int ret;
 
 	init_irq_alloc_info(&info, NULL);
 	info.type = X86_IRQ_ALLOC_TYPE_MSI;
@@ -82,7 +83,10 @@ int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 	if (domain == NULL)
 		return -ENOSYS;
 
-	return pci_msi_domain_alloc_irqs(domain, dev, nvec, type);
+	ret = pci_msi_domain_alloc_irqs(domain, dev, nvec, type);
+	dev_warn(&dev->dev, "irqdomain: domain %p, def_domain %p ret%d\n",
+		 domain, msi_default_domain, ret);
+	return ret;
 }
 
 void native_teardown_msi_irq(unsigned int irq)
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 5ef347a13cb5..23cd4d861dba 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -1416,7 +1416,11 @@ static int iommu_setup_msi(struct amd_iommu *iommu)
 {
 	int r;
 
+	dev_warn(&iommu->dev->dev, "irqdomain: before enabling MSI for msi%d, irq%d\n",
+		 iommu->dev->msi_enabled, iommu->dev->irq);
 	r = pci_enable_msi(iommu->dev);
+	dev_warn(&iommu->dev->dev, "irqdomain: after enabling MSI for msi%d, irq%d\n",
+		 iommu->dev->msi_enabled, iommu->dev->irq);
 	if (r)
 		return r;
 
@@ -1428,6 +1432,8 @@ static int iommu_setup_msi(struct amd_iommu *iommu)
 
 	if (r) {
 		pci_disable_msi(iommu->dev);
+		dev_warn(&iommu->dev->dev, "irqdomain: failed to enable MSI for msi%d, irq%d\n",
+			 iommu->dev->msi_enabled, iommu->dev->irq);
 		return r;
 	}
 
@@ -1440,6 +1446,8 @@ static int iommu_init_msi(struct amd_iommu *iommu)
 {
 	int ret;
 
+	dev_warn(&iommu->dev->dev, "irqdomain: init msi for iommu %p int_enabled%d\n",
+		 iommu, iommu->int_enabled);
 	if (iommu->int_enabled)
 		goto enable_faults;
 
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index d4497141d083..0301a18663b0 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -602,6 +602,8 @@ static int msi_capability_init(struct pci_dev *dev, int nvec)
 	int ret;
 	unsigned mask;
 
+	dev_warn(&dev->dev, "irqdomain: enable msi cap msi_enabled%d irq%d\n",
+		 dev->msi_enabled, dev->irq);
 	pci_msi_set_enable(dev, 0);	/* Disable MSI during set up */
 
 	entry = msi_setup_entry(dev, nvec);
@@ -643,6 +645,8 @@ static int msi_capability_init(struct pci_dev *dev, int nvec)
 
 	pcibios_free_irq(dev);
 	dev->irq = entry->irq;
+	dev_warn(&dev->dev, "irqdomain: succeed to enable msi cap msi_enabled%d irq%d\n",
+		 dev->msi_enabled, dev->irq);
 	return 0;
 }
 
diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
index 7e6512b9dc1f..535cf59bc5a7 100644
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -298,6 +298,7 @@ int msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
 			return ret;
 		}
 
+		dev_warn(dev, "irqdomain: allocated virq%d\n", virq);
 		for (i = 0; i < desc->nvec_used; i++)
 			irq_set_msi_desc_off(virq, i, desc);
 	}
-- 
1.7.10.4

