From: Joerg Roedel <joro@8bytes.org>
To: Jiang Liu <jiang.liu@linux.intel.com>
Cc: lkml <linux-kernel@vger.kernel.org>,
"Maling list - DRI developers" <dri-devel@lists.freedesktop.org>,
"Borislav Petkov" <bp@alien8.de>,
"Bjorn Helgaas" <bhelgaas@google.com>,
"Alex Deucher" <alexander.deucher@amd.com>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Christian König" <christian.koenig@amd.com>
Subject: Re: WARNING: CPU: 4 PID: 863 at include/drm/drm_crtc.h:1577 drm_helper_choose_encoder_dpms+0x88/0x90() - evildoer found and neutralized
Date: Wed, 30 Sep 2015 20:06:21 +0200 [thread overview]
Message-ID: <20150930180621.GV3036@8bytes.org> (raw)
In-Reply-To: <560C153C.10600@linux.intel.com>
On Thu, Oct 01, 2015 at 01:00:44AM +0800, Jiang Liu wrote:
> Thanks Joerg, that makes sense. If some driver tries to binding to the
> IOMMU device, it will trigger the scenario as you described. For
> example, Xen backend driver will try to probe all PCI devices
> if enabled. I will do more investigation tomorrow.
Not only that, the probe code looks like this in __pci_device_probe:
error = -ENODEV;
id = pci_match_device(drv, pci_dev);
if (id)
error = pci_call_probe(drv, pci_dev, id);
if (error >= 0)
error = 0;
The pci_match_device() function will always return NULL for the iommu
pci_dev, because no driver matches the ids of it. So the function
returns -ENODEV, which will be handled in the caller (pci_device_probe):
error = pcibios_alloc_irq(pci_dev);
if (error < 0)
return error;
pci_dev_get(pci_dev);
error = __pci_device_probe(drv, pci_dev);
if (error) {
pcibios_free_irq(pci_dev);
pci_dev_put(pci_dev);
}
For the IOMMU pci_dev a pcibios-irq will be allocated (if there is one,
like on Boris' system) and because __pci_device_probe returns -ENODEV it
will be freed again with pcibios_free_irq().
The pcibios_free_irq() function will set dev->irq = 0, which overwrites
the value that pci_enable_msi() wrote there. So later in suspend/resume
code the msi-handling part tries to fetch the irq-descriptor for the
wrong irq (which is NULL) and causes the crash.
The issue got introduced because with your changes pci_enable_msi() is
only allowed after a pci-device was successfully probed by the driver.
But this assumption is not true, as the AMD IOMMU driver does not
register as a pci-driver.
Registering a pci-driver would actually be harmful, because a device can
be forcibly unbound from its driver, which would be pretty bad for an
IOMMU in the running system.
So the right fix is to allow pci_enable_msi() for pci-devices not
registered against a driver. The fix I sent Boris has issues (I think it
leaks pcibios irqs when MSI is in use), but was thinking about fixing it
in pci_device_probe by not allocating a pcibios-irq when MSI is already
active. What do you think?
Regards,
Joerg
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel
WARNING: multiple messages have this Message-ID (diff)
From: Joerg Roedel <joro@8bytes.org>
To: Jiang Liu <jiang.liu@linux.intel.com>
Cc: "Borislav Petkov" <bp@alien8.de>,
"Daniel Vetter" <daniel@ffwll.ch>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Bjorn Helgaas" <bhelgaas@google.com>,
"Alex Deucher" <alexdeucher@gmail.com>,
"Alex Deucher" <alexander.deucher@amd.com>,
"Christian König" <christian.koenig@amd.com>,
"Maling list - DRI developers" <dri-devel@lists.freedesktop.org>,
lkml <linux-kernel@vger.kernel.org>
Subject: Re: WARNING: CPU: 4 PID: 863 at include/drm/drm_crtc.h:1577 drm_helper_choose_encoder_dpms+0x88/0x90() - evildoer found and neutralized
Date: Wed, 30 Sep 2015 20:06:21 +0200 [thread overview]
Message-ID: <20150930180621.GV3036@8bytes.org> (raw)
In-Reply-To: <560C153C.10600@linux.intel.com>
On Thu, Oct 01, 2015 at 01:00:44AM +0800, Jiang Liu wrote:
> Thanks Joerg, that makes sense. If some driver tries to binding to the
> IOMMU device, it will trigger the scenario as you described. For
> example, Xen backend driver will try to probe all PCI devices
> if enabled. I will do more investigation tomorrow.
Not only that, the probe code looks like this in __pci_device_probe:
error = -ENODEV;
id = pci_match_device(drv, pci_dev);
if (id)
error = pci_call_probe(drv, pci_dev, id);
if (error >= 0)
error = 0;
The pci_match_device() function will always return NULL for the iommu
pci_dev, because no driver matches the ids of it. So the function
returns -ENODEV, which will be handled in the caller (pci_device_probe):
error = pcibios_alloc_irq(pci_dev);
if (error < 0)
return error;
pci_dev_get(pci_dev);
error = __pci_device_probe(drv, pci_dev);
if (error) {
pcibios_free_irq(pci_dev);
pci_dev_put(pci_dev);
}
For the IOMMU pci_dev a pcibios-irq will be allocated (if there is one,
like on Boris' system) and because __pci_device_probe returns -ENODEV it
will be freed again with pcibios_free_irq().
The pcibios_free_irq() function will set dev->irq = 0, which overwrites
the value that pci_enable_msi() wrote there. So later in suspend/resume
code the msi-handling part tries to fetch the irq-descriptor for the
wrong irq (which is NULL) and causes the crash.
The issue got introduced because with your changes pci_enable_msi() is
only allowed after a pci-device was successfully probed by the driver.
But this assumption is not true, as the AMD IOMMU driver does not
register as a pci-driver.
Registering a pci-driver would actually be harmful, because a device can
be forcibly unbound from its driver, which would be pretty bad for an
IOMMU in the running system.
So the right fix is to allow pci_enable_msi() for pci-devices not
registered against a driver. The fix I sent Boris has issues (I think it
leaks pcibios irqs when MSI is in use), but was thinking about fixing it
in pci_device_probe by not allocating a pcibios-irq when MSI is already
active. What do you think?
Regards,
Joerg
next prev parent reply other threads:[~2015-09-30 18:06 UTC|newest]
Thread overview: 42+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-09-21 13:31 WARNING: CPU: 4 PID: 863 at include/drm/drm_crtc.h:1577 drm_helper_choose_encoder_dpms+0x88/0x90() Borislav Petkov
2015-09-22 19:06 ` Borislav Petkov
2015-09-22 19:06 ` Borislav Petkov
2015-09-22 19:58 ` Alex Deucher
2015-09-22 19:58 ` Alex Deucher
2015-09-22 20:21 ` Borislav Petkov
2015-09-22 20:21 ` Borislav Petkov
2015-09-22 20:54 ` Alex Deucher
2015-09-22 20:54 ` Alex Deucher
2015-09-23 7:25 ` Daniel Vetter
2015-09-23 8:59 ` Borislav Petkov
2015-09-23 8:59 ` Borislav Petkov
2015-09-23 14:44 ` Daniel Vetter
2015-09-23 14:44 ` Daniel Vetter
2015-09-23 16:06 ` Borislav Petkov
2015-09-23 16:06 ` Borislav Petkov
2015-09-23 16:18 ` Borislav Petkov
2015-09-23 16:18 ` Borislav Petkov
2015-09-26 16:46 ` WARNING: CPU: 4 PID: 863 at include/drm/drm_crtc.h:1577 drm_helper_choose_encoder_dpms+0x88/0x90() - evildoer found and neutralized Borislav Petkov
2015-09-26 16:46 ` Borislav Petkov
2015-09-29 8:50 ` Jiang Liu
2015-09-29 10:51 ` Borislav Petkov
2015-09-29 10:51 ` Borislav Petkov
2015-09-30 7:45 ` Jiang Liu
2015-09-30 7:45 ` Jiang Liu
2015-09-30 12:44 ` Joerg Roedel
2015-09-30 17:00 ` Jiang Liu
2015-09-30 17:36 ` Borislav Petkov
2015-09-30 17:36 ` Borislav Petkov
2015-09-30 18:07 ` Joerg Roedel
2015-09-30 18:07 ` Joerg Roedel
2015-10-03 7:36 ` Jiang Liu
2015-10-03 7:36 ` Jiang Liu
2015-10-03 9:35 ` Borislav Petkov
2015-10-03 9:35 ` Borislav Petkov
2015-10-05 10:03 ` Joerg Roedel
2015-10-05 10:03 ` Joerg Roedel
2015-10-06 13:13 ` Jiang Liu
2015-10-09 10:24 ` Joerg Roedel
2015-10-09 10:24 ` Joerg Roedel
2015-09-30 18:06 ` Joerg Roedel [this message]
2015-09-30 18:06 ` Joerg Roedel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150930180621.GV3036@8bytes.org \
--to=joro@8bytes.org \
--cc=alexander.deucher@amd.com \
--cc=bhelgaas@google.com \
--cc=bp@alien8.de \
--cc=christian.koenig@amd.com \
--cc=dri-devel@lists.freedesktop.org \
--cc=jiang.liu@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.