public inbox for intel-gfx@lists.freedesktop.org
* [BUG] i915 RC6 lockup
@ 2012-10-16 13:19 Jonas Jelten
  2012-10-17  2:30 ` Ben Widawsky
  0 siblings, 1 reply; 3+ messages in thread
From: Jonas Jelten @ 2012-10-16 13:19 UTC (permalink / raw)
  To: intel-gfx

Hi list!

I think I've got a problem with the Intel driver:

Sometimes, I think especially after running graphics-intensive
applications, RC6 is disabled completely and my ThinkPad X220t heats
up to 90 degrees Celsius while idling.

At first I thought this was a CPU frequency scaling issue, as
cpufreq_powersave claims to be running at 800 MHz, but i7z
(http://code.google.com/p/i7z/) shows all multipliers at 25, i.e. a
2.5 GHz CPU clock.

Powertop 2.1 reports the GPU as 100% active, with 0% RC6, 0% RC6p and
0% RC6pp, while the CPU spends 99.9% of its time in C7 deep sleep, at
maximum frequency. /sys/kernel/debug/dri/0/i915_ring_freq_table also
points to the GPU as the cause.

intel_gpu_top shows the GPU as completely idle.

I'm on Arch Linux, kernel 3.6.2, xf86-video-intel-git
b42d81b63f5b6a571faffaadd42c74adce40128a (this is 2.20.10).
The problem first occurred with kernel 3.6.0.
Hardware: Core i5-2520M with HD 3000 graphics.

>cat /proc/cmdline
>cryptdevice=/dev/sda2:cryptroot root=/dev/mapper/cryptroot ro vga=791
>i915.i915_enable_rc6=7 i915.modeset=1 i915.lvds_downclock=1
>i915.semaphores=1 drm.vblankoffdelay=1 init=/bin/systemd
>initrd=../initramfs-linux.img BOOT_IMAGE=../vmlinuz-linux

Sometimes it can be fixed by suspending with pm-suspend and waking up
again. A reboot always fixes it, until the GPU randomly locks up again.

Please tell me how I can investigate further to catch the bug.

As this makes my laptop consume ~40 W, it would be really nice if this
got fixed.


Cheers,

Jonas

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

* Re: [BUG] i915 RC6 lockup
  2012-10-16 13:19 [BUG] i915 RC6 lockup Jonas Jelten
@ 2012-10-17  2:30 ` Ben Widawsky
  2012-10-19  7:59   ` Jonas Jelten
  0 siblings, 1 reply; 3+ messages in thread
From: Ben Widawsky @ 2012-10-17  2:30 UTC (permalink / raw)
  To: Jonas Jelten; +Cc: intel-gfx

On Tue, 16 Oct 2012 15:19:26 +0200
Jonas Jelten <jelten@in.tum.de> wrote:

> Hi list!
> 
> I think I've got a problem with the Intel driver:
> 
> Sometimes, I think especially after running graphics-intensive
> applications, RC6 is disabled completely and my ThinkPad X220t heats
> up to 90 degrees Celsius while idling.
> 
> At first I thought this was a CPU frequency scaling issue, as
> cpufreq_powersave claims to be running at 800 MHz, but i7z
> (http://code.google.com/p/i7z/) shows all multipliers at 25, i.e. a
> 2.5 GHz CPU clock.
> 
> Powertop 2.1 reports the GPU as 100% active, with 0% RC6, 0% RC6p and
> 0% RC6pp, while the CPU spends 99.9% of its time in C7 deep sleep, at
> maximum frequency. /sys/kernel/debug/dri/0/i915_ring_freq_table also
> points to the GPU as the cause.

Do you mean the GPU is 0% active? If you really mean 100%, then the
results are expected, though I'm not sure how powertop calculates GPU
activity. I'm guessing it's just 100 minus the rc6 state percentage,
which is probably a reasonable approximation when rc6 works.
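
A minimal Python sketch of that guess: it treats GPU activity as 100
minus the RC6 residency percentage over a sampling window. This is not
powertop's actual code. The function name and the idea of sampling a
residency counter in milliseconds (for example an i915 sysfs file such
as /sys/class/drm/card0/power/rc6_residency_ms, if your kernel exposes
one) are assumptions for illustration.

```python
def gpu_active_percent(rc6_ms_start, rc6_ms_end, wall_ms):
    """Estimate GPU 'activity' as 100 minus the RC6 residency percentage.

    rc6_ms_start / rc6_ms_end: RC6 residency counter (milliseconds) sampled
    at the start and end of the window (hypothetical inputs).
    wall_ms: length of the sampling window in milliseconds.
    """
    if wall_ms <= 0:
        raise ValueError("wall_ms must be positive")
    rc6_pct = 100.0 * (rc6_ms_end - rc6_ms_start) / wall_ms
    # Clamp, since counter granularity can push the ratio slightly out of range.
    rc6_pct = max(0.0, min(100.0, rc6_pct))
    return 100.0 - rc6_pct
```

If RC6 is never entered, the counter does not advance and the estimate
comes out as 100% "active" even on an idle GPU, which would match the
readings quoted above.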

> 
> intel_gpu_top shows the GPU as completely idle.

This indicates the above assumption is true: the GPU itself is idle, and
the 100% figure just reflects the missing rc6 residency.

> 
> I'm on Arch Linux, kernel 3.6.2, xf86-video-intel-git
> b42d81b63f5b6a571faffaadd42c74adce40128a (this is 2.20.10).
> The problem first occurred with kernel 3.6.0.
> Hardware: Core i5-2520M with HD 3000 graphics.

Obviously a bisect of the exact failing commit would be fantastic.

> 
> >cat /proc/cmdline
> >cryptdevice=/dev/sda2:cryptroot root=/dev/mapper/cryptroot ro vga=791
> >i915.i915_enable_rc6=7 i915.modeset=1 i915.lvds_downclock=1
> >i915.semaphores=1 drm.vblankoffdelay=1 init=/bin/systemd
> >initrd=../initramfs-linux.img BOOT_IMAGE=../vmlinuz-linux

First and most obvious, do not set rc6=7. If you do, do not file
bug reports with those results. RC6++ is known to be extremely broken,
and the fact that we let users hurt themselves so easily is probably
something we need to remedy. On HD3000, even rc6+ is strongly
discouraged.
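
For reference, i915_enable_rc6 is a bitmask in kernels of this period:
1 enables rc6, 2 enables deep rc6 (rc6+, shown as RC6p by powertop),
4 enables deepest rc6 (rc6++, RC6pp), and a negative value selects the
per-chip default. A small Python sketch to decode a value follows; the
function and constant names are made up for illustration.

```python
# Bit values for the i915_enable_rc6 module parameter.
RC6_STAGES = ((1, "rc6"), (2, "rc6p"), (4, "rc6pp"))


def decode_rc6_mask(value):
    """Return the list of RC6 stages requested by an i915_enable_rc6 value."""
    if value < 0:
        # Negative means "let the driver pick the per-chip default".
        return ["per-chip default"]
    return [name for bit, name in RC6_STAGES if value & bit]


print(decode_rc6_mask(7))  # all three stages requested
print(decode_rc6_mask(1))  # plain rc6 only
```

So the rc6=7 from the command line above requests all three stages,
while rc6=1 requests plain rc6 only.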

> 
> Sometimes it can be fixed by suspending with pm-suspend and waking up
> again. A reboot always fixes it, until the GPU randomly locks up again.
> 
> Please tell me how I can investigate further to catch the bug.

If you can reproduce it with rc6=1, then it echoes some other bugs
we're trying to track down. Finding the minimal test case that triggers
it would be helpful. You can also search the mailing list for
RPS-related patches, which seem relevant here. Trying some of those
and reporting your results would be helpful.

Double check your dmesg for any GPU hangs which may have occurred before
the laptop becomes a space heater.
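
One rough way to do that check is to filter saved dmesg output for
hang-related lines. The Python sketch below does only that. The search
patterns are assumptions: the exact i915 message wording varies between
kernel versions, so adjust them to what your kernel prints. The function
name is hypothetical.

```python
import re

# Assumed patterns; extend them if your kernel words its messages differently.
HANG_PATTERN = re.compile(r"gpu hung|gpu hang|hangcheck", re.IGNORECASE)


def find_gpu_hang_lines(dmesg_text):
    """Return the lines of a dmesg dump that look like GPU hang reports."""
    return [line for line in dmesg_text.splitlines()
            if HANG_PATTERN.search(line)]
```

Pass it the text of a dmesg dump captured after the problem shows up.
An empty result only means none of the assumed patterns matched, not
that no hang occurred.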


> 
> As this makes my laptop consume ~40 W, it would be really nice if this
> got fixed.
> 
> 
> Cheers,
> 
> Jonas
> 
> 

* Re: [BUG] i915 RC6 lockup
  2012-10-17  2:30 ` Ben Widawsky
@ 2012-10-19  7:59   ` Jonas Jelten
  0 siblings, 0 replies; 3+ messages in thread
From: Jonas Jelten @ 2012-10-19  7:59 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: intel-gfx


Others are also suffering from this:

https://bugs.archlinux.org/task/32025

