Re: GPU RC6 breaks PCIe to PCI bridge connected to CPU PCIe slot on SandyBridge systems

linux-pci.vger.kernel.org archive mirror

* Re: GPU RC6 breaks PCIe to PCI bridge connected to CPU PCIe slot on SandyBridge systems
       [not found] <1704067.2NCOGYajHN@f17simon>
@ 2012-10-19 14:26 ` Daniel Vetter
  2012-10-19 14:52   ` Simon Farnsworth
  0 siblings, 1 reply; 5+ messages in thread
From: Daniel Vetter @ 2012-10-19 14:26 UTC (permalink / raw)
  To: Simon Farnsworth; +Cc: linux-pci, intel-gfx, bhelgaas

On Fri, Oct 19, 2012 at 12:26 PM, Simon Farnsworth
<simon.farnsworth@onelan.co.uk> wrote:
> Hello,
>
> I've just been trying to work out why a PCIe to PCI bridge worked with kernel
> 3.3, but not with kernel 3.5 or Linus's git master. I bisected down to bad:
> [aa46419186992e6b8b8010319f0ca7f40a0d13f5] drm/i915: enable plain RC6 on Sandy
> Bridge by default.
>
> I then confirmed that on a failing kernel (3.5 or Linus git
> 8d2b6b3ae280dcf6f6c7a95623670a57cdf562ed from Tuesday 16th, "Merge tag 'sh-
> for-linus' of git://github.com/pmundt/linux-sh"), I can make the failure
> disappear by adding i915.i915_enable_rc6=0 to the kernel command line, and I
> can make it reappear by changing the value of i915_enable_rc6 to -1.
>
> I've attached lspci -vvxxxxx output from a failure case and from a working
> case as lspci.faulty and lspci.working; if you need anything more, just ask
> for it.
>
> My hardware is an Intel DH67CF motherboard, with an i3-2100 CPU; there is a
> Startech branded PCIe to PCI bridge in the PCIe x16 slot (which I believe is
> connected to the CPU PCIe lanes, not the PCH PCIe lanes). I have a Hauppauge
> HVR-1110 in the PCI slot provided by the bridge.
>
> I have two test cases; one is transferring MPEG-2 transport streams from the
> DVB-T tuner on the card (no graphics involved), the other is using the V4L2
> interface to capture buffers via the mmap() mechanism, which are then uploaded
> to the XServer via the Xv extension, and composited using an OpenGL compositor
> that uses texture_from_pixmap.
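
To confirm which RC6 setting a given boot actually used, the kernel command line can be checked. The following is a minimal sketch, assuming the option is passed as i915.i915_enable_rc6=N exactly as quoted above; the helper name rc6_setting is hypothetical.

```python
def rc6_setting(cmdline, param="i915.i915_enable_rc6"):
    """Return the last integer value given for param, or None if it is absent."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if sep and key == param:
            # Later occurrences override earlier ones, as on a real command line.
            value = int(val)
    return value
```

On a running system the input would typically come from open("/proc/cmdline").read(); a result of None means the option was not given and the driver default applies.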

Ok, this is really freaky stuff. One thing to triage: Is it just
sufficient to put the gpu into rc6 to corrupt the dma transfers, or is
some light X/gpu load required? In either case, rc6 being able to
corrupt random dma transfers (or at least prevent them from reaching
their destination) would be a fitting explanation for the leftover rc6
issues on snb ...

Thanks, Daniel

>
> In both cases, the behaviour I see is that some DMA transfers don't transfer
> data; the DMA apparently completes, and addresses are updated correctly, but
> data bytes don't change. This results in corruption in the MPEG-2 transport
> streams, and in pixel spans not updating in the V4L2 case. Disabling RC6 fully
> fixes this.
>
> This isn't a deal-breaker for my application - I can force off RC6 and live
> with the extra power draw for now. However, I'd prefer to be able to run
> without command line options in the future.
>
> I'm happy to try patches, even if the goal is just to get you some more debug
> information; long term, I'd like to be able to remove the command line option,
> as I run a single software image on multiple boxes, not all of which have the
> PCIe to PCI bridge fitted, and on those that don't use the PCIe to PCI bridge,
> I'd like to run with the power savings of RC6.
> --
> Simon Farnsworth
> Software Engineer
> ONELAN Ltd
> http://www.onelan.com



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


* Re: GPU RC6 breaks PCIe to PCI bridge connected to CPU PCIe slot on SandyBridge systems
  2012-10-19 14:26 ` GPU RC6 breaks PCIe to PCI bridge connected to CPU PCIe slot on SandyBridge systems Daniel Vetter
@ 2012-10-19 14:52   ` Simon Farnsworth
       [not found]     ` <2233216.7bl6QCud67@f17simon>
  0 siblings, 1 reply; 5+ messages in thread
From: Simon Farnsworth @ 2012-10-19 14:52 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: linux-pci, intel-gfx, bhelgaas


On Friday 19 October 2012 16:26:08 Daniel Vetter wrote:
> On Fri, Oct 19, 2012 at 12:26 PM, Simon Farnsworth
> <simon.farnsworth@onelan.co.uk> wrote:
> > Hello,
> >
> > I've just been trying to work out why a PCIe to PCI bridge worked with kernel
> > 3.3, but not with kernel 3.5 or Linus's git master. I bisected down to bad:
> > [aa46419186992e6b8b8010319f0ca7f40a0d13f5] drm/i915: enable plain RC6 on Sandy
> > Bridge by default.
> >
> > I then confirmed that on a failing kernel (3.5 or Linus git
> > 8d2b6b3ae280dcf6f6c7a95623670a57cdf562ed from Tuesday 16th, "Merge tag 'sh-
> > for-linus' of git://github.com/pmundt/linux-sh"), I can make the failure
> > disappear by adding i915.i915_enable_rc6=0 to the kernel command line, and I
> > can make it reappear by changing the value of i915_enable_rc6 to -1.
> >
> > I've attached lspci -vvxxxxx output from a failure case and from a working
> > case as lspci.faulty and lspci.working; if you need anything more, just ask
> > for it.
> >
> > My hardware is an Intel DH67CF motherboard, with an i3-2100 CPU; there is a
> > Startech branded PCIe to PCI bridge in the PCIe x16 slot (which I believe is
> > connected to the CPU PCIe lanes, not the PCH PCIe lanes). I have a Hauppauge
> > HVR-1110 in the PCI slot provided by the bridge.
> >
> > I have two test cases; one is transferring MPEG-2 transport streams from the
> > DVB-T tuner on the card (no graphics involved), the other is using the V4L2
> > interface to capture buffers via the mmap() mechanism, which are then uploaded
> > to the XServer via the Xv extension, and composited using an OpenGL compositor
> > that uses texture_from_pixmap.
> 
> Ok, this is really freaky stuff. One thing to triage: Is it just
> sufficient to put the gpu into rc6 to corrupt the dma transfers, or is
> some light X/gpu load required? In either case, rc6 being able to
> corrupt random dma transfers (or at least prevent them from reaching
> their destination) would be a fitting explanation for the leftover rc6
> issues on snb ...
> 
In an attempt to have this happen with the GPU as idle as possible, I did the
following (note that I'm on a gigabit Ethernet segment, so I can burn network
bandwidth while testing):

1. Start X.org with -noreset, and don't start any X clients.
2. Run "xset dpms force off ; xrandr --output DP2 --off" (DP2 is the connected output).
3. On the affected machine, run "gst-launch v4l2src ! gdppay ! tcpclientsink host=f17simon port=65512"
4. On my desktop, run "gst-launch tcpserversrc host=0.0.0.0 port=65512 ! gdpdepay ! xvimagesink"

I see the corruption continue to happen, even though the GPU should be idle
and in RC6 state most of the time (confirmed by reading
/sys/class/drm/card0/power/rc6_residency_ms and seeing it increase between
reads). When I run intel_forcewaked from intel_gpu_tools, the corruption goes
away, and I can confirm by reading /sys/class/drm/card0/power/rc6_residency_ms
that the GPU does not enter RC6. Killing intel_forcewaked makes the corruption
reappear while streaming over the network (X11 idle).
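
The residency check described above (read the counter twice and see whether it increased) can be sketched as follows. This is only an illustration: the sysfs path is the one quoted in this message, card0 is assumed to be the Intel GPU, and the helper names are hypothetical.

```python
import time
from pathlib import Path

# Path quoted in the message above; card0 is assumed to be the Intel GPU.
RC6_RESIDENCY = Path("/sys/class/drm/card0/power/rc6_residency_ms")


def read_residency_ms(path=RC6_RESIDENCY):
    """Return the cumulative RC6 residency counter, in milliseconds."""
    return int(Path(path).read_text().strip())


def rc6_fraction(before_ms, after_ms, elapsed_s):
    """Fraction of the sampling interval spent in RC6 (0.0 if the counter did not move)."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return max(0, after_ms - before_ms) / (elapsed_s * 1000.0)


def sample(path=RC6_RESIDENCY, interval_s=1.0):
    """Read the counter twice, interval_s apart, and return the RC6 fraction."""
    start = time.monotonic()
    before = read_residency_ms(path)
    time.sleep(interval_s)
    after = read_residency_ms(path)
    return rc6_fraction(before, after, time.monotonic() - start)
```

Calling sample() while streaming should give a value well above zero when the GPU is idling in RC6, and 0.0 while intel_forcewaked is holding the GPU awake.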
-- 
Simon Farnsworth
Software Engineer
ONELAN Ltd
http://www.onelan.com



* Re: [Intel-gfx] GPU RC6 breaks PCIe to PCI bridge connected to CPU PCIe slot on SandyBridge systems
       [not found]     ` <2233216.7bl6QCud67@f17simon>
@ 2012-10-19 17:06       ` Simon Farnsworth
  2012-10-20 12:20         ` Andy Walls
  2012-10-19 17:18       ` Jesse Barnes
  1 sibling, 1 reply; 5+ messages in thread
From: Simon Farnsworth @ 2012-10-19 17:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, bhelgaas, linux-pci, linux-media, mchehab


On Friday 19 October 2012 17:10:17 Simon Farnsworth wrote:
> Mauro, Linux-Media
> 
> I have an issue where an SAA7134-based TV capture card connected via a PCIe to
> PCI bridge chip works when the GPU is kept out of RC6 state, but sometimes
> "skips" updating lines of the capture when the GPU is in RC6. We've confirmed
> that a CX23418 based chip doesn't have the problem, so the question is whether
> the SAA7134 and the saa7134 driver are at fault, or whether it's the PCIe bus.
> 
> This manifests as a regression, as I had no problems with kernel 3.3 (which
> never enabled RC6 on the Intel GPU), but I do have problems with 3.5 and with
> current Linus git master. I'm happy to try anything.
> 
> I've attached lspci -vvxxxxx output (suitable for feeding to lspci -F) for
> when the corruption is present (lspci.faulty) and when it's not
> (lspci.working). The speculation is that the SAA7134 is somehow more
> sensitive to the changes in timings that RC6 introduces than the CX23418, and
> that someone who understands the saa7134 driver might be able to make it less
> sensitive.
> 
And timings are definitely the problem; I have a userspace provided pm_qos
request asking for 0 exit latency, but I can see CPU cores entering C6. I'll
take this problem to an appropriate list.

There is still a bug in the SAA7134 driver, as the card clearly wants a
pm_qos request when streaming to stop the DMA latency becoming too high; this
doesn't directly affect me, as my userspace always requests minimal DMA
latency anyway, so consider this message as just closing down the thread for
now, and as a marker for the future (if people see such corruption, the
saa7134 driver needs a pm_qos request when streaming that isn't currently
present).
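
For reference, a userspace pm_qos latency request of the kind mentioned above can be sketched as follows. This assumes the standard /dev/cpu_dma_latency PM QoS device; the thread does not say which mechanism the userspace here actually uses, and the helper names are hypothetical.

```python
import os
import struct

# Standard PM QoS character device; writing to it normally requires root.
CPU_DMA_LATENCY = "/dev/cpu_dma_latency"


def encode_latency(latency_us):
    """Pack the latency target (microseconds) as a native 32-bit signed integer."""
    return struct.pack("i", latency_us)


def request_latency(latency_us=0, path=CPU_DMA_LATENCY):
    """Open the device, write the latency target, and return the open fd.

    The request is only honoured while the fd stays open; closing it
    (or exiting the process) withdraws the request.
    """
    fd = os.open(path, os.O_WRONLY)
    try:
        os.write(fd, encode_latency(latency_us))
    except OSError:
        os.close(fd)
        raise
    return fd
```

A capture application would call fd = request_latency(0) before it starts streaming and os.close(fd) once streaming stops. An in-driver fix would use the kernel's own pm_qos API instead, which is not shown here.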
-- 
Simon Farnsworth
Software Engineer
ONELAN Ltd
http://www.onelan.com



* Re: [Intel-gfx] GPU RC6 breaks PCIe to PCI bridge connected to CPU PCIe slot on SandyBridge systems
       [not found]     ` <2233216.7bl6QCud67@f17simon>
  2012-10-19 17:06       ` [Intel-gfx] " Simon Farnsworth
@ 2012-10-19 17:18       ` Jesse Barnes
  1 sibling, 0 replies; 5+ messages in thread
From: Jesse Barnes @ 2012-10-19 17:18 UTC (permalink / raw)
  To: Simon Farnsworth
  Cc: intel-gfx, bhelgaas, Daniel Vetter, linux-media, mchehab,
	linux-pci

RC6 plus CPU C6 would also put the whole package into a low power
state.  It's possible we're missing some initialization to keep things
up for other system activity like bus mastering on PCIe?

Just thinking out loud here, unfortunately I don't know of any settings
that might control this.  But package level changes are one other
thing that would be affected by RC6 enabling.

Jesse

On Fri, 19 Oct 2012 17:10:17 +0100
Simon Farnsworth <simon.farnsworth@onelan.co.uk> wrote:

> Mauro, Linux-Media
> 
> I have an issue where an SAA7134-based TV capture card connected via a PCIe to
> PCI bridge chip works when the GPU is kept out of RC6 state, but sometimes
> "skips" updating lines of the capture when the GPU is in RC6. We've confirmed
> that a CX23418 based chip doesn't have the problem, so the question is whether
> the SAA7134 and the saa7134 driver are at fault, or whether it's the PCIe bus.
> 
> This manifests as a regression, as I had no problems with kernel 3.3 (which
> never enabled RC6 on the Intel GPU), but I do have problems with 3.5 and with
> current Linus git master. I'm happy to try anything.
> 
> I've attached lspci -vvxxxxx output (suitable for feeding to lspci -F) for
> when the corruption is present (lspci.faulty) and when it's not
> (lspci.working). The speculation is that the SAA7134 is somehow more
> sensitive to the changes in timings that RC6 introduces than the CX23418, and
> that someone who understands the saa7134 driver might be able to make it less
> sensitive.
> 
> Details of the most recent tests follow:
> 
> On Friday 19 October 2012 15:52:32 Simon Farnsworth wrote:
> > On Friday 19 October 2012 16:26:08 Daniel Vetter wrote:
> > > Ok, this is really freaky stuff. One thing to triage: Is it just
> > > sufficient to put the gpu into rc6 to corrupt the dma transfers, or is
> > > some light X/gpu load required? In either case, rc6 being able to
> > > corrupt random dma transfers (or at least prevent them from reaching
> > > their destination) would be a fitting explanation for the leftover rc6
> > > issues on snb ...
> > > 
> > In an attempt to have this happen with the GPU as idle as possible, I did the
> > following (note that I'm on a gigabit Ethernet segment, so I can burn network
> > bandwidth while testing):
> > 
> > 1. Start X.org with -noreset, and don't start any X clients.
> > 2. Run "xset dpms force off ; xrandr --output DP2 --off" (DP2 is the connected output).
> > 3. On the affected machine, run "gst-launch v4l2src ! gdppay ! tcpclientsink host=f17simon port=65512"
> > 4. On my desktop, run "gst-launch tcpserversrc host=0.0.0.0 port=65512 ! gdpdepay ! xvimagesink"
> > 
> > I see the corruption continue to happen, even though the GPU should be idle
> > and in RC6 state most of the time (confirmed by reading
> > /sys/class/drm/card0/power/rc6_residency_ms and seeing it increase between
> > reads). When I run intel_forcewaked from intel_gpu_tools, the corruption goes
> > away, and I can confirm by reading /sys/class/drm/card0/power/rc6_residency_ms
> > that the GPU does not enter RC6. Killing intel_forcewaked makes the corruption
> > reappear while streaming over the network (X11 idle).
> > 
> As a follow up - Daniel requested via IRC that I try with a different capture
> card; I've switched to a HVR-1600 (cx18 driver instead of saa7134), and I've
> also tried with the X server forcibly quiesced via kill -STOP.
> 
> Quiescing the X server doesn't help; however, the HVR-1600 does not show the
> problem. This suggests that it's an interaction between the SAA7134 based TV
> card, the bridge chip, and the different PCIe timings when the GPU is in RC6.


-- 
Jesse Barnes, Intel Open Source Technology Center


* Re: [Intel-gfx] GPU RC6 breaks PCIe to PCI bridge connected to CPU PCIe slot on SandyBridge systems
  2012-10-19 17:06       ` [Intel-gfx] " Simon Farnsworth
@ 2012-10-20 12:20         ` Andy Walls
  0 siblings, 0 replies; 5+ messages in thread
From: Andy Walls @ 2012-10-20 12:20 UTC (permalink / raw)
  To: Simon Farnsworth
  Cc: intel-gfx, Daniel Vetter, bhelgaas, linux-pci, linux-media,
	mchehab

On Fri, 2012-10-19 at 18:06 +0100, Simon Farnsworth wrote:
> On Friday 19 October 2012 17:10:17 Simon Farnsworth wrote:
> > Mauro, Linux-Media
> > 
> > I have an issue where an SAA7134-based TV capture card connected via a PCIe to
> > PCI bridge chip works when the GPU is kept out of RC6 state, but sometimes
> > "skips" updating lines of the capture when the GPU is in RC6. We've confirmed
> > that a CX23418 based chip doesn't have the problem, so the question is whether
> > the SAA7134 and the saa7134 driver are at fault, or whether it's the PCIe bus.

My money's on the saa7134 driver's irq_handler or the driver's locking
scheme to protect data accessed by both irq handler and userspace file
operations (aka videobuf's locking) in the driver.

It could also be a system level problem with another driver's irq
handler being stupid.

> > This manifests as a regression, as I had no problems with kernel 3.3 (which
> > never enabled RC6 on the Intel GPU), but I do have problems with 3.5 and with
> > current Linus git master. I'm happy to try anything.

Profile the saa7134 driver in operation:

http://www.spinics.net/lists/linux-media/msg15762.html

That will give you and driver writers a clue as to where any big delays
are happening in the saa7134 driver.

Odds are the processor slowing down to a lower power/lower speed state
is exposing inefficiencies in the irq handling of the saa7134 driver.

> > I've attached lspci -vvxxxxx output (suitable for feeding to lspci -F) for
> > when the corruption is present (lspci.faulty) and when it's not
> > (lspci.working). 

Doing a diff between the two files and checking what devices have
changed registers I noted that only 3 devices' PCI config space
registers changed: 00:01.0 and 00:1c.1 (both PCIe ports/bridges) and
00:1a.0. 

$ lspci -F lspci.working -tv
-[0000:00]-+-00.0  Intel Corporation 2nd Generation Core Processor Family DRAM Controller
           +-01.0-[01-02]----00.0-[02]----08.0  Philips Semiconductors SAA7131/SAA7133/SAA7135 Video Broadcast Decoder
           +-02.0  Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller
           +-16.0  Intel Corporation 6 Series/C200 Series Chipset Family MEI Controller #1
           +-19.0  Intel Corporation 82579V Gigabit Network Connection
           +-1a.0  Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2
           +-1b.0  Intel Corporation 6 Series/C200 Series Chipset Family High Definition Audio Controller
           +-1c.0-[03]--
           +-1c.1-[04]----00.0  NEC Corporation uPD720200 USB 3.0 Host Controller
           +-1d.0  Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1
           +-1f.0  Intel Corporation H67 Express Chipset Family LPC Controller
           +-1f.2  Intel Corporation 6 Series/C200 Series Chipset Family 6 port SATA AHCI Controller
           \-1f.3  Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller

Obviously the changes to the bridge at 00:01.0 might matter, but I would
need to dig up the data sheet for the "00:01.0 PCI bridge [0604]: Intel
Corporation Xeon E3-1200/2nd Generation Core Processor Family PCI
Express Root Port [8086:0101] (rev 09) (prog-if 00 [Normal decode])" to
see if it really mattered.
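
The comparison of the two dumps can be sketched roughly as below. This is an illustration only: it assumes the usual lspci -x text layout (an unindented slot line such as "00:01.0 ...", followed by unindented "offset: byte byte ..." rows), and the function names are hypothetical.

```python
import re

# Slot line, e.g. "00:01.0 PCI bridge: ..." (optionally with a "0000:" domain prefix).
SLOT_RE = re.compile(r"^((?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s")
# Hex dump row, e.g. "10: 00 00 00 00 ..." (offsets may be three digits with -xxxx).
HEX_RE = re.compile(r"^([0-9a-f]{2,3}): ((?:[0-9a-f]{2}\s?)+)$")


def parse_dump(text):
    """Map each slot (e.g. '00:01.0') to {config-space offset: byte value}."""
    devices, current = {}, None
    for line in text.splitlines():
        line = line.rstrip()
        slot = SLOT_RE.match(line)
        if slot:
            current = devices.setdefault(slot.group(1), {})
            continue
        row = HEX_RE.match(line)
        if row and current is not None:
            base = int(row.group(1), 16)
            for i, byte in enumerate(row.group(2).split()):
                current[base + i] = int(byte, 16)
    return devices


def changed_registers(working, faulty):
    """Return {slot: [(offset, working_byte, faulty_byte), ...]} for bytes that differ."""
    result = {}
    for slot in sorted(working.keys() & faulty.keys()):
        a, b = working[slot], faulty[slot]
        diffs = [(off, a[off], b[off])
                 for off in sorted(a.keys() & b.keys()) if a[off] != b[off]]
        if diffs:
            result[slot] = diffs
    return result
```

Feeding it the contents of lspci.working and lspci.faulty would list, per device, the config-space offsets whose bytes differ. Those offsets still have to be looked up in the relevant data sheet to see whether they matter.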

> > The speculation is that the SAA7134 is somehow more
> > sensitive to the changes in timings that RC6 introduces than the CX23418, and
> > that someone who understands the saa7134 driver might be able to make it less
> > sensitive.

I heavily optimized the cx18 driver for the high throughput use case
(multiple cards running multiple data streams), which meant squeezing
every little bit of useless junk out of the irq handler and adding
highly granular buffer queue locking between the irq handling and the
userspace file operations calls.  Also the CX23418 firmware has a "best
effort" buffer notification handshake and the cx18 driver does some
extra recovery processing to handle when it is late on handling buffer
notifications.  All that optimization and robustness coding took me a few
months to get right.

I don't see that sort of optimization of the saa7134 driver coming
anytime soon.

Regards,
Andy

> And timings are definitely the problem; I have a userspace provided pm_qos
> request asking for 0 exit latency, but I can see CPU cores entering C6. I'll
> take this problem to an appropriate list.
> 
> There is still a bug in the SAA7134 driver, as the card clearly wants a
> pm_qos request when streaming to stop the DMA latency becoming too high; this
> doesn't directly affect me, as my userspace always requests minimal DMA
> latency anyway, so consider this message as just closing down the thread for
> now, and as a marker for the future (if people see such corruption, the
> saa7134 driver needs a pm_qos request when streaming that isn't currently
> present).

