linux-kernel.vger.kernel.org archive mirror
* dmar messages caused by graphics.
@ 2014-10-17 21:17 Dave Jones
  2014-10-20 10:05 ` Joerg Roedel
  2014-10-21 15:36 ` [Intel-gfx] " Daniel Vetter
  0 siblings, 2 replies; 3+ messages in thread
From: Dave Jones @ 2014-10-17 21:17 UTC
  To: Linux Kernel; +Cc: intel-gfx, joro

Just hit this while fuzz-testing (curiously, nothing graphics-related
was happening; X isn't even loaded on that box).

dmar: DRHD: handling fault status reg 2
dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 7ffffff000
DMAR:[fault reason 05] PTE Write access is not set


00:02.0 is:

00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3/4th
Gen Core Processor Integrated Graphics Controller (rev 06) (prog-if 00
		[VGA controller])

00: 86 80 12 04 07 04 90 00 06 00 00 03 00 00 00 00
10: 04 00 00 c0 00 00 00 00 0c 00 00 b0 00 00 00 00
20: 01 30 00 00 00 00 00 00 00 00 00 00 86 80 12 22
30: 00 00 00 00 90 00 00 00 00 00 00 00 0b 01 00 00
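For readers who want to check the dump by hand, here is a minimal sketch that decodes the standard PCI header fields from the 64 bytes above. It assumes the usual type 0 header layout (little-endian fields at their standard offsets); the helper names `parse_dump` and `decode_header` are made up for illustration and are not part of any tool.

```python
import struct

# First 64 bytes of config space for 00:02.0, copied from the dump above.
DUMP = """\
00: 86 80 12 04 07 04 90 00 06 00 00 03 00 00 00 00
10: 04 00 00 c0 00 00 00 00 0c 00 00 b0 00 00 00 00
20: 01 30 00 00 00 00 00 00 00 00 00 00 86 80 12 22
30: 00 00 00 00 90 00 00 00 00 00 00 00 0b 01 00 00
"""


def parse_dump(text):
    """Turn 'offset: xx xx ...' lines into raw bytes (hypothetical helper)."""
    data = bytearray()
    for line in text.strip().splitlines():
        hexbytes = line.partition(":")[2]
        data += bytes.fromhex("".join(hexbytes.split()))
    return bytes(data)


def decode_header(cfg):
    """Decode common fields of a type 0 PCI header (hypothetical helper)."""
    vendor, device, command, status = struct.unpack_from("<4H", cfg, 0x00)
    revision, prog_if, subclass, base_class = struct.unpack_from("<4B", cfg, 0x08)
    subsys_vendor, subsys_id = struct.unpack_from("<2H", cfg, 0x2C)
    return {
        "vendor": vendor,
        "device": device,
        "command": command,
        "status": status,
        "revision": revision,
        "prog_if": prog_if,
        "subclass": subclass,
        "base_class": base_class,
        "subsys_vendor": subsys_vendor,
        "subsys_id": subsys_id,
    }


header = decode_header(parse_dump(DUMP))
# Command register bit 2 is Bus Master Enable; the device can only issue DMA when it is set.
bus_master = bool(header["command"] & 0x0004)

for name, value in header.items():
    print(f"{name:>14}: {value:#06x}")
print(f"{'bus_master':>14}: {bus_master}")
```

The decoded vendor/device pair (8086:0412), revision 06 and class 03 agree with the lspci description above, and the command register has bus mastering enabled, so the device is at least permitted to issue DMA.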


So then I rebooted, and noticed it spewed the exact same message on boot up too.

I power cycled, and this time got:

[    0.576231] dmar: Host address width 39
[    0.576336] dmar: DRHD base: 0x000000fed90000 flags: 0x0
[    0.576491] dmar: IOMMU 0: reg_base_addr fed90000 ver 1:0 cap c0000020660462 ecap f0101a
[    0.576659] dmar: DRHD base: 0x000000fed91000 flags: 0x1
[    0.576793] dmar: IOMMU 1: reg_base_addr fed91000 ver 1:0 cap d2008020660462 ecap f010da
[    0.576961] dmar: RMRR base: 0x000000a2a1f000 end: 0x000000a2a32fff
[    0.577075] dmar: RMRR base: 0x000000ad800000 end: 0x000000af9fffff
[    6.715745] DMAR: No ATSR found
[    8.081845] [drm] DMAR active, disabling use of stolen memory
[    9.927343] dmar: DRHD: handling fault status reg 2
[    9.928335] dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 3c11284000 
DMAR:[fault reason 05] PTE Write access is not set
[   11.916211] dmar: DRHD: handling fault status reg 2
[   11.917105] dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 3c11284000 
DMAR:[fault reason 05] PTE Write access is not set


Same thing, different fault address.  It seems to change every time I boot.
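To compare fault addresses across boots, something like the following sketch could pull them out of dmesg output. The regular expression is an assumption based only on the message format shown in this mail, and `parse_faults` is a hypothetical helper. It also checks each address against the 39-bit host address width reported at boot.

```python
import re

# Pattern inferred from the log lines in this mail; other kernels may format them differently.
FAULT_RE = re.compile(
    r"Request device \[(?P<dev>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\] "
    r"fault addr (?P<addr>[0-9a-f]+)"
)

HOST_ADDRESS_WIDTH = 39  # from "dmar: Host address width 39"

LOG = """\
dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 7ffffff000
[    9.928335] dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 3c11284000
[   11.917105] dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 3c11284000
"""


def parse_faults(text):
    """Return a list of (device, address) tuples, one per fault line (hypothetical helper)."""
    return [(m["dev"], int(m["addr"], 16)) for m in FAULT_RE.finditer(text)]


faults = parse_faults(LOG)
distinct = sorted({addr for _, addr in faults})

for addr in distinct:
    fits = addr < (1 << HOST_ADDRESS_WIDTH)
    print(f"{addr:#x} within {HOST_ADDRESS_WIDTH}-bit width: {fits}")
```

Both addresses seen so far fall inside the 39-bit range, so the width check alone does not mark them as malformed.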


Looking in the logs, this started happening on the 15th. The first instance
was this, during boot:

[    9.917240] dmar: DRHD: handling fault status reg 2
[    9.918150] dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 7300000000 
[    9.918150] DMAR:[fault reason 05] PTE Write access is not set
[    9.919582] dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 7ffffff000 
[    9.919582] DMAR:[fault reason 05] PTE Write access is not set
[   10.157240] dmar: DRHD: handling fault status reg 3
[   10.158017] dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 3579736000 
[   10.158017] DMAR:[fault reason 05] PTE Write access is not set
[   11.926114] dmar: DRHD: handling fault status reg 3
[   11.927117] dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 7300000000 
[   11.927117] DMAR:[fault reason 05] PTE Write access is not set

That time, the 'reg 3' showed up.
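As a rough aid for reading the 'reg 2' versus 'reg 3' values, here is a sketch that decodes the low bits of the value, assuming it is the VT-d Fault Status Register and that the bit layout below (PFO in bit 0, PPF in bit 1, and so on) matches the VT-d specification. The table and `decode_fsts` are written for illustration and should be checked against the spec before relying on them.

```python
# Assumed layout of the low bits of the VT-d Fault Status Register.
FSTS_BITS = {
    0: "PFO",  # primary fault overflow
    1: "PPF",  # primary pending fault
    2: "AFO",  # advanced fault overflow
    3: "APF",  # advanced pending fault
    4: "IQE",  # invalidation queue error
    5: "ICE",  # invalidation completion error
    6: "ITE",  # invalidation time-out error
}


def decode_fsts(value):
    """Return the names of the flag bits set in `value` (hypothetical helper)."""
    return [name for bit, name in sorted(FSTS_BITS.items()) if value & (1 << bit)]


for reg in (0x2, 0x3):
    print(f"fault status reg {reg:x}: {', '.join(decode_fsts(reg))}")
```

Under that assumption, 'reg 2' is a pending fault only, while 'reg 3' additionally has the overflow bit set, which would suggest faults arrived faster than they could be recorded.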

Dying hardware? Or a bug?

	Dave



* Re: dmar messages caused by graphics.
  2014-10-17 21:17 dmar messages caused by graphics Dave Jones
@ 2014-10-20 10:05 ` Joerg Roedel
  2014-10-21 15:36 ` [Intel-gfx] " Daniel Vetter
  1 sibling, 0 replies; 3+ messages in thread
From: Joerg Roedel @ 2014-10-20 10:05 UTC
  To: Dave Jones, Linux Kernel, intel-gfx; +Cc: David Woodhouse, Jiang Liu

Adding David and Jiang, they might have an idea what's going wrong.

On Fri, Oct 17, 2014 at 05:17:16PM -0400, Dave Jones wrote:
> Just hit this while fuzz-testing, (curiously, no graphics
> related stuff was happening, X isn't even loaded on that box).
> 
> dmar: DRHD: handling fault status reg 2
> dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 7ffffff000
> DMAR:[fault reason 05] PTE Write access is not set
> 
> 
> 00:02:0 is..
> 
> 00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3/4th
> Gen Core Processor Integrated Graphics Controller (rev 06) (prog-if 00
> 		[VGA controller])
> 
> 00: 86 80 12 04 07 04 90 00 06 00 00 03 00 00 00 00
> 10: 04 00 00 c0 00 00 00 00 0c 00 00 b0 00 00 00 00
> 20: 01 30 00 00 00 00 00 00 00 00 00 00 86 80 12 22
> 30: 00 00 00 00 90 00 00 00 00 00 00 00 0b 01 00 00
> 
> 
> So then I rebooted, and noticed it spewed the exact same message on boot up too.
> 
> I power cycled, and this time got
> 
> [    0.576231] dmar: Host address width 39
> [    0.576336] dmar: DRHD base: 0x000000fed90000 flags: 0x0
> [    0.576491] dmar: IOMMU 0: reg_base_addr fed90000 ver 1:0 cap c0000020660462 ecap f0101a
> [    0.576659] dmar: DRHD base: 0x000000fed91000 flags: 0x1
> [    0.576793] dmar: IOMMU 1: reg_base_addr fed91000 ver 1:0 cap d2008020660462 ecap f010da
> [    0.576961] dmar: RMRR base: 0x000000a2a1f000 end: 0x000000a2a32fff
> [    0.577075] dmar: RMRR base: 0x000000ad800000 end: 0x000000af9fffff
> [    6.715745] DMAR: No ATSR found
> [    8.081845] [drm] DMAR active, disabling use of stolen memory
> [    9.927343] dmar: DRHD: handling fault status reg 2
> [    9.928335] dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 3c11284000 
> DMAR:[fault reason 05] PTE Write access is not set
> [   11.916211] dmar: DRHD: handling fault status reg 2
> [   11.917105] dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 3c11284000 
> DMAR:[fault reason 05] PTE Write access is not set
> 
> 
> Same thing, different fault address.  It seems to change every time I boot.
> 
> 
> Looking in the logs, this started happening on the 15th. The first instance
> was this during boot..
> 
> [    9.917240] dmar: DRHD: handling fault status reg 2
> [    9.918150] dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 7300000000 
> [    9.918150] DMAR:[fault reason 05] PTE Write access is not set
> [    9.919582] dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 7ffffff000 
> [    9.919582] DMAR:[fault reason 05] PTE Write access is not set
> [   10.157240] dmar: DRHD: handling fault status reg 3
> [   10.158017] dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 3579736000 
> [   10.158017] DMAR:[fault reason 05] PTE Write access is not set
> [   11.926114] dmar: DRHD: handling fault status reg 3
> [   11.927117] dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 7300000000 
> [   11.927117] DMAR:[fault reason 05] PTE Write access is not set
> 
> That time, the 'reg 3' showed up.
> 
> Dying hardware ? Or bug ?
> 
> 	Dave
> 


* Re: [Intel-gfx] dmar messages caused by graphics.
  2014-10-17 21:17 dmar messages caused by graphics Dave Jones
  2014-10-20 10:05 ` Joerg Roedel
@ 2014-10-21 15:36 ` Daniel Vetter
  1 sibling, 0 replies; 3+ messages in thread
From: Daniel Vetter @ 2014-10-21 15:36 UTC
  To: Dave Jones, Linux Kernel, intel-gfx, joro

On Fri, Oct 17, 2014 at 05:17:16PM -0400, Dave Jones wrote:
> Just hit this while fuzz-testing, (curiously, no graphics
> related stuff was happening, X isn't even loaded on that box).
> 
> dmar: DRHD: handling fault status reg 2
> dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 7ffffff000
> DMAR:[fault reason 05] PTE Write access is not set
> 
> 
> 00:02:0 is..
> 
> 00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3/4th
> Gen Core Processor Integrated Graphics Controller (rev 06) (prog-if 00
> 		[VGA controller])
> 
> 00: 86 80 12 04 07 04 90 00 06 00 00 03 00 00 00 00
> 10: 04 00 00 c0 00 00 00 00 0c 00 00 b0 00 00 00 00
> 20: 01 30 00 00 00 00 00 00 00 00 00 00 86 80 12 22
> 30: 00 00 00 00 90 00 00 00 00 00 00 00 0b 01 00 00
> 
> 
> So then I rebooted, and noticed it spewed the exact same message on boot up too.
> 
> I power cycled, and this time got
> 
> [    0.576231] dmar: Host address width 39
> [    0.576336] dmar: DRHD base: 0x000000fed90000 flags: 0x0
> [    0.576491] dmar: IOMMU 0: reg_base_addr fed90000 ver 1:0 cap c0000020660462 ecap f0101a
> [    0.576659] dmar: DRHD base: 0x000000fed91000 flags: 0x1
> [    0.576793] dmar: IOMMU 1: reg_base_addr fed91000 ver 1:0 cap d2008020660462 ecap f010da
> [    0.576961] dmar: RMRR base: 0x000000a2a1f000 end: 0x000000a2a32fff
> [    0.577075] dmar: RMRR base: 0x000000ad800000 end: 0x000000af9fffff
> [    6.715745] DMAR: No ATSR found
> [    8.081845] [drm] DMAR active, disabling use of stolen memory
> [    9.927343] dmar: DRHD: handling fault status reg 2
> [    9.928335] dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 3c11284000 
> DMAR:[fault reason 05] PTE Write access is not set
> [   11.916211] dmar: DRHD: handling fault status reg 2
> [   11.917105] dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 3c11284000 
> DMAR:[fault reason 05] PTE Write access is not set
> 
> 
> Same thing, different fault address.  It seems to change every time I boot.
> 
> 
> Looking in the logs, this started happening on the 15th. The first instance
> was this during boot..
> 
> [    9.917240] dmar: DRHD: handling fault status reg 2
> [    9.918150] dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 7300000000 
> [    9.918150] DMAR:[fault reason 05] PTE Write access is not set
> [    9.919582] dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 7ffffff000 
> [    9.919582] DMAR:[fault reason 05] PTE Write access is not set
> [   10.157240] dmar: DRHD: handling fault status reg 3
> [   10.158017] dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 3579736000 
> [   10.158017] DMAR:[fault reason 05] PTE Write access is not set
> [   11.926114] dmar: DRHD: handling fault status reg 3
> [   11.927117] dmar: DMAR:[DMA Write] Request device [00:02.0] fault addr 7300000000 
> [   11.927117] DMAR:[fault reason 05] PTE Write access is not set
> 
> That time, the 'reg 3' showed up.
> 
> Dying hardware ? Or bug ?

We see these occasionally after the GPU has gone bananas, and IIRC also
sometimes after a module reload (we probably botch the reinit a bit).
That it happens with nothing really going on in gfx is slightly
more disturbing indeed. Any chance this could be a kernel
regression?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

