* Device loses its IRQ number on driver unload?
From: Thomas Hellstrom @ 2015-03-09 10:04 UTC
To: dri-devel@lists.freedesktop.org

Hi,

I'm not sure this started with 4.0 but when I rmmod the device driver
like so
rmmod vmwgfx

The device loses its IRQ line as shown in lspci:
Flags: bus master, medium devsel, latency 64 <irq missing here>

and a subsequent modprobe will fail since pdev->irq is 0.

Is anyone else seeing this with other drivers?

Thanks,
Thomas
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

* Re: Device loses its IRQ number on driver unload?
From: Daniel Vetter @ 2015-03-09 15:22 UTC
To: Thomas Hellstrom
Cc: dri-devel@lists.freedesktop.org

On Mon, Mar 09, 2015 at 11:04:01AM +0100, Thomas Hellstrom wrote:
> Hi,
>
> I'm not sure this started with 4.0 but when I rmmod the device driver
> like so
> rmmod vmwgfx
>
> The device loses its IRQ line as shown in lspci:
> Flags: bus master, medium devsel, latency 64 <irq missing here>
>
> and a subsequent modprobe will fail since pdev->irq is 0.
>
> Is anyone else seeing this with other drivers?

I've seen occasionally (over the past couple of kernels) random zeros in pdev
but dismissed it as broken machines or bugs in i915 (we have them ...).
Usually the box died chasing a NULL pointer from pdev. Otherwise no.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: Device loses its IRQ number on driver unload?
From: Thomas Hellstrom @ 2015-03-09 16:02 UTC
To: Daniel Vetter
Cc: dri-devel@lists.freedesktop.org

On 03/09/2015 04:22 PM, Daniel Vetter wrote:
> On Mon, Mar 09, 2015 at 11:04:01AM +0100, Thomas Hellstrom wrote:
>> Hi,
>>
>> I'm not sure this started with 4.0 but when I rmmod the device driver
>> like so
>> rmmod vmwgfx
>>
>> The device loses its IRQ line as shown in lspci:
>> Flags: bus master, medium devsel, latency 64 <irq missing here>
>>
>> and a subsequent modprobe will fail since pdev->irq is 0.
>>
>> Is anyone else seeing this with other drivers?
> I've seen occasionally (over the past couple of kernels) random zeros in pdev
> but dismissed it as broken machines or bugs in i915 (we have them ...).
> Usually the box died chasing a NULL pointer from pdev. Otherwise no.
> -Daniel

OK. Thanks for the info. Since in my case this is 100% reproducible I
guess I have an excellent opportunity to bisect the problem :-/

/Thomas

* Re: Device loses its IRQ number on driver unload?
From: Dave Airlie @ 2015-03-09 20:25 UTC
To: Thomas Hellstrom
Cc: dri-devel@lists.freedesktop.org

On 10 March 2015 at 02:02, Thomas Hellstrom <thellstrom@vmware.com> wrote:
> On 03/09/2015 04:22 PM, Daniel Vetter wrote:
>> On Mon, Mar 09, 2015 at 11:04:01AM +0100, Thomas Hellstrom wrote:
>>> Hi,
>>>
>>> I'm not sure this started with 4.0 but when I rmmod the device driver
>>> like so
>>> rmmod vmwgfx
>>>
>>> The device loses its IRQ line as shown in lspci:
>>> Flags: bus master, medium devsel, latency 64 <irq missing here>
>>>
>>> and a subsequent modprobe will fail since pdev->irq is 0.
>>>
>>> Is anyone else seeing this with other drivers?
>> I've seen occasionally (over the past couple of kernels) random zeros in pdev
>> but dismissed it as broken machines or bugs in i915 (we have them ...).
>> Usually the box died chasing a NULL pointer from pdev. Otherwise no.
>> -Daniel
> OK. Thanks for the info. Since in my case this is 100% reproducible I
> guess I have an excellent opportunity to bisect the problem :-/
>

Does lspci -H1, or some option like that to directly access the hw, show it?

Just to see whether this is the kernel copy or the hw register getting messed up.

Dave.

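The kernel-copy versus hardware-register question above can also be checked by hand. What follows is only a minimal user-space sketch, not something posted in the thread. It assumes that the sysfs `irq` attribute under `/sys/bus/pci/devices/<address>/` reflects the kernel's cached value (pdev->irq) and that byte 0x3c of the `config` file in the same directory is the Interrupt Line register from PCI configuration space. The function names are hypothetical, and the paths are passed in so nothing about a particular device is hard-coded.

```c
#include <stdio.h>

/* Offset of the Interrupt Line register in the standard PCI config header. */
#define PCI_INTERRUPT_LINE_OFFSET 0x3c

/* Return the Interrupt Line byte from a config-space file, or -1 on error
 * (file missing, or shorter than 0x3d bytes). */
int read_config_irq_line(const char *config_path)
{
    FILE *f = fopen(config_path, "rb");
    int byte;

    if (!f)
        return -1;
    if (fseek(f, PCI_INTERRUPT_LINE_OFFSET, SEEK_SET) != 0) {
        fclose(f);
        return -1;
    }
    byte = fgetc(f);
    fclose(f);
    return byte == EOF ? -1 : byte;
}

/* Return the decimal IRQ number stored in a text file, or -1 on error. */
int read_kernel_irq(const char *irq_path)
{
    FILE *f = fopen(irq_path, "r");
    int irq = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%d", &irq) != 1)
        irq = -1;
    fclose(f);
    return irq;
}

/* Return 1 when the register still holds a line number but the kernel's
 * copy is 0, which is the symptom reported in this thread; 0 otherwise. */
int kernel_copy_lost_irq(const char *config_path, const char *irq_path)
{
    int hw = read_config_irq_line(config_path);
    int sw = read_kernel_irq(irq_path);

    return hw > 0 && sw == 0;
}
```

Calling kernel_copy_lost_irq() with the device's `config` and `irq` paths before and after rmmod would show whether only the kernel's copy changed. Reading `config` still goes through the kernel, so this is a weaker check than the direct hardware access of lspci -H1.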
* Re: Device loses its IRQ number on driver unload?
From: Thomas Hellstrom @ 2015-03-10 12:55 UTC
To: Dave Airlie
Cc: linux-graphics-maintainer, dri-devel@lists.freedesktop.org

On 03/09/2015 09:25 PM, Dave Airlie wrote:
> On 10 March 2015 at 02:02, Thomas Hellstrom <thellstrom@vmware.com> wrote:
>> On 03/09/2015 04:22 PM, Daniel Vetter wrote:
>>> On Mon, Mar 09, 2015 at 11:04:01AM +0100, Thomas Hellstrom wrote:
>>>> Hi,
>>>>
>>>> I'm not sure this started with 4.0 but when I rmmod the device driver
>>>> like so
>>>> rmmod vmwgfx
>>>>
>>>> The device loses its IRQ line as shown in lspci:
>>>> Flags: bus master, medium devsel, latency 64 <irq missing here>
>>>>
>>>> and a subsequent modprobe will fail since pdev->irq is 0.
>>>>
>>>> Is anyone else seeing this with other drivers?
>>> I've seen occasionally (over the past couple of kernels) random zeros in pdev
>>> but dismissed it as broken machines or bugs in i915 (we have them ...).
>>> Usually the box died chasing a NULL pointer from pdev. Otherwise no.
>>> -Daniel
>> OK. Thanks for the info. Since in my case this is 100% reproducible I
>> guess I have an excellent opportunity to bisect the problem :-/
>>
> Does lspci -H1, or some option like that to directly access the hw, show it?
>
> Just to see whether this is the kernel copy or the hw register getting messed up.
>
> Dave.

Hi, Dave,

lspci -H1 indeed shows the IRQ number. It turns out that the commit
introduced in 4.0 breaking this is

b4b55cda587442477a3a9f0669e26bba4b7800c0 is the first bad commit
commit b4b55cda587442477a3a9f0669e26bba4b7800c0
Author: Jiang Liu <jiang.liu@linux.intel.com>
Date: Thu Feb 5 13:44:47 2015 +0800

x86/PCI: Refine the way to release PCI IRQ resources


It's obvious from the commit message that unloading the driver *should*
drop the irq resource but it's not
obvious what's reallocating that resource on driver load...

Anyway, it turns out that adding a
pci_disable_device(pdev) in the pci driver's remove() method
(vmw_remove() in my case) appears to fix the problem:
The device irq is removed on driver unload and enabled again on driver
load. There appears to be no pci_disable_device() on driver exit in core drm.

However it still beats me why other drm drivers aren't seeing this, and
IMHO that commit should probably add a warning message if the pci device
isn't disabled on pci driver unload......

/Thomas

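To illustrate why a pci_disable_device() call in remove() could make the difference described above, here is a toy user-space model. It is a sketch under stated assumptions, not kernel code and not the vmwgfx fix itself. It assumes that device enabling is reference counted, so that a second enable without an intervening disable does no IRQ setup, and that the IRQ is dropped when the driver is unbound, as the thread reports. Every struct and function name below is hypothetical.

```c
/* Hypothetical stand-in for struct pci_dev; only what this model needs. */
struct toy_pci_dev {
    int hw_irq_line;  /* value in the config-space register, unchanged here */
    int irq;          /* the kernel's cached copy, like pdev->irq */
    int enable_cnt;   /* assumed enable reference count */
};

/* Models enabling the device: only the first enable sets up the IRQ. */
void toy_enable_device(struct toy_pci_dev *dev)
{
    if (dev->enable_cnt++ > 0)
        return;       /* already enabled, so nothing is re-allocated */
    dev->irq = dev->hw_irq_line;
}

/* Models disabling the device: drops one enable reference. */
void toy_disable_device(struct toy_pci_dev *dev)
{
    if (dev->enable_cnt > 0)
        dev->enable_cnt--;
}

/* Models the reported behaviour: the IRQ is released on driver unbind. */
void toy_driver_unbound(struct toy_pci_dev *dev)
{
    dev->irq = 0;
}

/* Models probe(): returns 0 on success, -1 if the device has no IRQ. */
int toy_probe(struct toy_pci_dev *dev)
{
    toy_enable_device(dev);
    return dev->irq ? 0 : -1;
}

/* Models remove(); call_disable selects whether the driver disables the
 * device before it is unbound. */
void toy_remove(struct toy_pci_dev *dev, int call_disable)
{
    if (call_disable)
        toy_disable_device(dev);
    toy_driver_unbound(dev);
}
```

In this model, toy_probe() followed by toy_remove(dev, 0) leaves enable_cnt at 1, so the next toy_probe() skips IRQ setup and fails with irq == 0. With toy_remove(dev, 1) the count returns to 0 and the next probe sets the IRQ up again, which matches the observation that the irq is enabled again on driver load.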
* Re: Device loses its IRQ number on driver unload?
From: Alex Deucher @ 2015-03-10 14:01 UTC
To: Thomas Hellstrom
Cc: linux-graphics-maintainer, dri-devel@lists.freedesktop.org

On Tue, Mar 10, 2015 at 8:55 AM, Thomas Hellstrom <thellstrom@vmware.com> wrote:
> On 03/09/2015 09:25 PM, Dave Airlie wrote:
>> On 10 March 2015 at 02:02, Thomas Hellstrom <thellstrom@vmware.com> wrote:
>>> On 03/09/2015 04:22 PM, Daniel Vetter wrote:
>>>> On Mon, Mar 09, 2015 at 11:04:01AM +0100, Thomas Hellstrom wrote:
>>>>> Hi,
>>>>>
>>>>> I'm not sure this started with 4.0 but when I rmmod the device driver
>>>>> like so
>>>>> rmmod vmwgfx
>>>>>
>>>>> The device loses its IRQ line as shown in lspci:
>>>>> Flags: bus master, medium devsel, latency 64 <irq missing here>
>>>>>
>>>>> and a subsequent modprobe will fail since pdev->irq is 0.
>>>>>
>>>>> Is anyone else seeing this with other drivers?
>>>> I've seen occasionally (over the past couple of kernels) random zeros in pdev
>>>> but dismissed it as broken machines or bugs in i915 (we have them ...).
>>>> Usually the box died chasing a NULL pointer from pdev. Otherwise no.
>>>> -Daniel
>>> OK. Thanks for the info. Since in my case this is 100% reproducible I
>>> guess I have an excellent opportunity to bisect the problem :-/
>>>
>> Does lspci -H1, or some option like that to directly access the hw, show it?
>>
>> Just to see whether this is the kernel copy or the hw register getting messed up.
>>
>> Dave.
> Hi, Dave,
>
> lspci -H1 indeed shows the IRQ number. It turns out that the commit
> introduced in 4.0 breaking this is
>
> b4b55cda587442477a3a9f0669e26bba4b7800c0 is the first bad commit
> commit b4b55cda587442477a3a9f0669e26bba4b7800c0
> Author: Jiang Liu <jiang.liu@linux.intel.com>
> Date: Thu Feb 5 13:44:47 2015 +0800
>
> x86/PCI: Refine the way to release PCI IRQ resources
>
>
> It's obvious from the commit message that unloading the driver *should*
> drop the irq resource but it's not
> obvious what's reallocating that resource on driver load...
>
> Anyway, it turns out that adding a
> pci_disable_device(pdev) in the pci driver's remove() method
> (vmw_remove() in my case) appears to fix the problem:
> The device irq is removed on driver unload and enabled again on driver
> load. There appears to be no pci_disable_device() on driver exit in core drm.
>
> However it still beats me why other drm drivers aren't seeing this, and
> IMHO that commit should probably add a warning message if the pci device
> isn't disabled on pci driver unload......

They are probably broken as well. I don't think module unload and
reload is commonly done with most drivers. FWIW, the drm core also
does not register a pci shutdown callback so when you use kexec,
nothing in the driver gets torn down properly.

Alex

* Re: Device loses its IRQ number on driver unload?
From: Dave Airlie @ 2015-03-10 21:05 UTC
To: Thomas Hellstrom
Cc: linux-graphics-maintainer, dri-devel@lists.freedesktop.org

On 10 March 2015 at 22:55, Thomas Hellstrom <thellstrom@vmware.com> wrote:
> On 03/09/2015 09:25 PM, Dave Airlie wrote:
>> On 10 March 2015 at 02:02, Thomas Hellstrom <thellstrom@vmware.com> wrote:
>>> On 03/09/2015 04:22 PM, Daniel Vetter wrote:
>>>> On Mon, Mar 09, 2015 at 11:04:01AM +0100, Thomas Hellstrom wrote:
>>>>> Hi,
>>>>>
>>>>> I'm not sure this started with 4.0 but when I rmmod the device driver
>>>>> like so
>>>>> rmmod vmwgfx
>>>>>
>>>>> The device loses its IRQ line as shown in lspci:
>>>>> Flags: bus master, medium devsel, latency 64 <irq missing here>
>>>>>
>>>>> and a subsequent modprobe will fail since pdev->irq is 0.
>>>>>
>>>>> Is anyone else seeing this with other drivers?
>>>> I've seen occasionally (over the past couple of kernels) random zeros in pdev
>>>> but dismissed it as broken machines or bugs in i915 (we have them ...).
>>>> Usually the box died chasing a NULL pointer from pdev. Otherwise no.
>>>> -Daniel
>>> OK. Thanks for the info. Since in my case this is 100% reproducible I
>>> guess I have an excellent opportunity to bisect the problem :-/
>>>
>> Does lspci -H1, or some option like that to directly access the hw, show it?
>>
>> Just to see whether this is the kernel copy or the hw register getting messed up.
>>
>> Dave.
> Hi, Dave,
>
> lspci -H1 indeed shows the IRQ number. It turns out that the commit
> introduced in 4.0 breaking this is
>
> b4b55cda587442477a3a9f0669e26bba4b7800c0 is the first bad commit
> commit b4b55cda587442477a3a9f0669e26bba4b7800c0
> Author: Jiang Liu <jiang.liu@linux.intel.com>
> Date: Thu Feb 5 13:44:47 2015 +0800
>
> x86/PCI: Refine the way to release PCI IRQ resources
>
>
> It's obvious from the commit message that unloading the driver *should*
> drop the irq resource but it's not
> obvious what's reallocating that resource on driver load...
>
> Anyway, it turns out that adding a
> pci_disable_device(pdev) in the pci driver's remove() method
> (vmw_remove() in my case) appears to fix the problem:
> The device irq is removed on driver unload and enabled again on driver
> load. There appears to be no pci_disable_device() on driver exit in core drm.

Yes, that is because at one time, pre-KMS, if you pci disabled the VGA device,
bad things would happen.

I think with a modesetting driver it shouldn't be a problem anymore.

Dave.

* Re: Device loses its IRQ number on driver unload?
From: Thomas Hellstrom @ 2015-03-11 6:40 UTC
To: Dave Airlie
Cc: Thomas Hellstrom, linux-graphics-maintainer, dri-devel@lists.freedesktop.org

On 03/10/2015 10:05 PM, Dave Airlie wrote:
> On 10 March 2015 at 22:55, Thomas Hellstrom <thellstrom@vmware.com> wrote:
>> On 03/09/2015 09:25 PM, Dave Airlie wrote:
>>> On 10 March 2015 at 02:02, Thomas Hellstrom <thellstrom@vmware.com> wrote:
>>>> On 03/09/2015 04:22 PM, Daniel Vetter wrote:
>>>>> On Mon, Mar 09, 2015 at 11:04:01AM +0100, Thomas Hellstrom wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I'm not sure this started with 4.0 but when I rmmod the device driver
>>>>>> like so
>>>>>> rmmod vmwgfx
>>>>>>
>>>>>> The device loses its IRQ line as shown in lspci:
>>>>>> Flags: bus master, medium devsel, latency 64 <irq missing here>
>>>>>>
>>>>>> and a subsequent modprobe will fail since pdev->irq is 0.
>>>>>>
>>>>>> Is anyone else seeing this with other drivers?
>>>>> I've seen occasionally (over the past couple of kernels) random zeros in pdev
>>>>> but dismissed it as broken machines or bugs in i915 (we have them ...).
>>>>> Usually the box died chasing a NULL pointer from pdev. Otherwise no.
>>>>> -Daniel
>>>> OK. Thanks for the info. Since in my case this is 100% reproducible I
>>>> guess I have an excellent opportunity to bisect the problem :-/
>>>>
>>> Does lspci -H1, or some option like that to directly access the hw, show it?
>>>
>>> Just to see whether this is the kernel copy or the hw register getting messed up.
>>>
>>> Dave.
>> Hi, Dave,
>>
>> lspci -H1 indeed shows the IRQ number. It turns out that the commit
>> introduced in 4.0 breaking this is
>>
>> b4b55cda587442477a3a9f0669e26bba4b7800c0 is the first bad commit
>> commit b4b55cda587442477a3a9f0669e26bba4b7800c0
>> Author: Jiang Liu <jiang.liu@linux.intel.com>
>> Date: Thu Feb 5 13:44:47 2015 +0800
>>
>> x86/PCI: Refine the way to release PCI IRQ resources
>>
>>
>> It's obvious from the commit message that unloading the driver *should*
>> drop the irq resource but it's not
>> obvious what's reallocating that resource on driver load...
>>
>> Anyway, it turns out that adding a
>> pci_disable_device(pdev) in the pci driver's remove() method
>> (vmw_remove() in my case) appears to fix the problem:
>> The device irq is removed on driver unload and enabled again on driver
>> load. There appears to be no pci_disable_device() on driver exit in core drm.
> Yes, that is because at one time, pre-KMS, if you pci disabled the VGA device,
> bad things would happen.
>
> I think with a modesetting driver it shouldn't be a problem anymore.
>
> Dave.

So what's the preferred remedy here? Should I file a bug against the
above commit or should we go ahead modifying
the DRM drivers?

Thanks,
Thomas

* Re: Device loses its IRQ number on driver unload?
From: Dave Airlie @ 2015-03-11 7:22 UTC
To: Thomas Hellstrom
Cc: Thomas Hellstrom, linux-graphics-maintainer, dri-devel@lists.freedesktop.org

>>
>> I think with a modesetting driver it shouldn't be a problem anymore.
>>
>> Dave.
>
> So what's the preferred remedy here? Should I file a bug against the
> above commit or should we go ahead modifying
> the DRM drivers?

I'd file against that first, and maybe see why it clears the value.

Dave.

* Re: Device loses its IRQ number on driver unload?
From: Thomas Hellstrom @ 2015-03-11 9:28 UTC
To: Dave Airlie
Cc: linux-graphics-maintainer, dri-devel@lists.freedesktop.org

On 03/11/2015 08:22 AM, Dave Airlie wrote:
>>> I think with a modesetting driver it shouldn't be a problem anymore.
>>>
>>> Dave.
>> So what's the preferred remedy here? Should I file a bug against the
>> above commit or should we go ahead modifying
>> the DRM drivers?
> I'd file against that first, and maybe see why it clears the value.
>
> Dave.

https://bugzilla.kernel.org/show_bug.cgi?id=94721

/Thomas
