From: "Michel Dänzer" <michel@daenzer.net>
To: Carsten Emde <C.Emde@osadl.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Linux RT Users <linux-rt-users@vger.kernel.org>,
	DRI Development <dri-devel@lists.freedesktop.org>
Subject: Re: [OSADL QA 3.18.9-rt4 #1] Radeon driver hangs
Date: Wed, 25 Mar 2015 15:57:34 +0900
Message-ID: <55125C5E.8040309@daenzer.net>
In-Reply-To: <550F3EC8.1080109@osadl.org>

On 23.03.2015 07:14, Carsten Emde wrote:
> Hi Michel,
> 
>>>>> [..]
>>>>> The most striking problem of kernel 3.18.9-rt4 affects all systems
>>>>> that are equipped with Radeon graphics (irrespective of whether PCIe
>>>>> cards or APUs with on-chip graphics). They suffer from a hanging
>>>>> radeon driver. The hang occurs when accelerated graphics load is
>>>>> created by x11perf or gltestperf. Sometimes only the graphics are
>>>>> frozen while ssh login is still possible, sometimes the entire box is
>>>>> no longer accessible at all. In any case, a reboot is needed to
>>>>> recover from this situation.
>>>>>
>>>>> Here is a selection of kernel messages:
>>>> [...]
>>>> The commits from
>>>> http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes&id=f957063fee6392bb9365370db6db74dc0b2dce0a
>>>> to
>>>> http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes&id=cffefd9bb31cd35ab745d3b49005d10616d25bdc
>>>> and
>>>> http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-fixes&id=b6610101718d4ab90d793c482625e98eb1262cad
>>>> might help for this.
>>>
>>> Thanks a lot. I have applied these patches to a number of systems:
>>> # quilt applied | tail -7
>>> patches/drm-radeon-do-a-posting-read-in-r100_set_irq.patch
>>> patches/drm-radeon-do-a-posting-read-in-rs600_set_irq.patch
>>> patches/drm-radeon-do-a-posting-read-in-r600_set_irq.patch
>>> patches/drm-radeon-do-a-posting-read-in-evergreen_set_irq.patch
>>> patches/drm-radeon-do-a-posting-read-in-si_set_irq.patch
>>> patches/drm-radeon-do-a-posting-read-in-cik_set_irq.patch
>>> patches/drm-radeon-fix-wait-to-actually-occur-after-the-signaling-callback.patch
>>>
>>> The graphics boards still crash and freeze the screen, but in contrast
>>> to the earlier situation the systems remain accessible, and the X
>>> Window server can be restarted after the offending programs are
>>> removed. The crashes were reliably triggered by
>>> - gltestperf
>>>    or
>>> - x11perf -repeat 3 -subs 25 -time 2 -rect10
> This is not entirely correct, since gltestperf does not reliably crash
> the graphics controller. However, "x11perf -repeat 3 -subs 25 -time 2
> -rect10" reliably triggers the crash.
> 
>>> but the crashes also occur several times per day during normal work
>>> such as browsing the Internet or writing a text document. If you wish
>>> me to provide additional diagnostic information such as running test
>>> programs while the graphic boards are unresponsive, I certainly can do
>>> that.
>>
>> Does it also happen with a kernel built from a current drm-fixes tree?
>> http://cgit.freedesktop.org/~airlied/linux/log/?h=drm-fixes
> No. Apparently, you need full preemption to expose the problem.
> 
> The following list shows whether the command "x11perf
> -repeat 3 -subs 25 -time 2 -rect10" freezes the Radeon board under test
> (Radeon HD 7970 XFS / R9 280X):
> linux-3.12.33-rt47               no
> linux-3.14.34-rt32               no
> linux-3.14.34-drm-3.16.7-rt32*   no
> linux-3.18.7-rt1                YES
> linux-3.18.9-rt4                YES
> linux-3.18.9-rt5                YES
> linux-3.18.9-drm-3.16.7-rt5**    no
> linux-4.0.0-rc4                  no
> linux-drm-fixes                  no
> *DRM subsystem backported from linux-3.16.7 to linux-3.14.34-rt32.
> **DRM subsystem ported from linux-3.16.7 to linux-3.18.9-rt5. 

Can you test a non-rt 3.18.y kernel? There were some intermittent issues
around 3.18 fixed by the patches I referenced above. Maybe I missed some
other fixes, though. Maarten, do you remember any other fixes offhand
that might help?
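
When comparing rt and non-rt builds of the same release, it can help to record which kind of kernel a test actually ran on. The following is a minimal sketch, not part of the original report. The helper names `is_rt_version` and `describe_kernel` are hypothetical, and it assumes that PREEMPT_RT kernels advertise "PREEMPT RT" or "PREEMPT_RT" in the `uname -v` string:

```shell
#!/bin/sh
# Sketch: classify a kernel by its "uname -v" string.
# Assumption: PREEMPT_RT kernels advertise "PREEMPT RT" (older -rt patch
# sets) or "PREEMPT_RT" (newer ones) in that string.
is_rt_version() {
    case "$1" in
        *"PREEMPT RT"*|*PREEMPT_RT*) return 0 ;;
        *) return 1 ;;
    esac
}

describe_kernel() {
    # $1: release (uname -r), $2: version string (uname -v)
    if is_rt_version "$2"; then
        echo "$1: PREEMPT_RT"
    else
        echo "$1: non-rt"
    fi
}
```

On the machine under test this would be called as `describe_kernel "$(uname -r)" "$(uname -v)"` before starting the x11perf run.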


> More observations:
> If full function tracing is enabled (which makes the system about five
> times slower), the graphics controller no longer freezes. With partial
> function tracing such as "echo *drm* >set_ftrace_filter", the
> controller still freezes. The trace then contains only vblank interrupt
> processing; ioctls are no longer executed.
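
For reference, the partial function tracing described above could be set up roughly as follows. This is a minimal sketch, not taken from the original report. The function names are hypothetical, and it assumes that the tracing directory is available at /sys/kernel/debug/tracing (it may be mounted elsewhere) and that the script runs as root. The pattern is quoted so that the shell does not expand it against file names in the current directory:

```shell
#!/bin/sh
# Sketch: restrict the function tracer to DRM-related functions.
# TRACING_DIR is an assumption; tracefs may be mounted elsewhere.
TRACING_DIR="${TRACING_DIR:-/sys/kernel/debug/tracing}"

enable_drm_trace() {
    # Limit tracing to functions whose names contain "drm".
    echo '*drm*' > "$TRACING_DIR/set_ftrace_filter" || return 1
    # Select the function tracer and switch tracing on.
    echo function > "$TRACING_DIR/current_tracer" || return 1
    echo 1 > "$TRACING_DIR/tracing_on"
}

disable_drm_trace() {
    echo 0 > "$TRACING_DIR/tracing_on" || return 1
    echo nop > "$TRACING_DIR/current_tracer" || return 1
    # Writing an empty line clears the filter again.
    echo > "$TRACING_DIR/set_ftrace_filter"
}
```

Calling `enable_drm_trace` before the x11perf run and reading `$TRACING_DIR/trace` after the freeze gives the filtered trace; `disable_drm_trace` restores the default state.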
> 
> This is the location where the driver hangs:
> [25104.509258] INFO: task Xorg.bin:16591 blocked for more than 120 seconds.
> [25104.516322]       Not tainted 3.18.9-rt5 #2
> [25104.520715] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [25104.528853] Xorg.bin        D ffffffff8171ed90     0 16591  16239 0x10400080
> [25104.536102]  ffff8800ba0bb8d8 0000000000000002 ffff8800ba0bbfd8 0000000000000006
> [25104.536103]  000000000000dc08 ffff880626d0dc08 ffff8800ba0bbfd8 000000000000dc08
> [25104.536104]  ffff88061b2cdcd0 ffff880616d3a940 ffff880035c10000 ffff880616d3a940
> [25104.559274] Call Trace:
> [25104.561844]  [<ffffffff8171bb54>] schedule+0x34/0xa0
> [25104.561846]  [<ffffffff8171e2ac>] schedule_timeout+0x23c/0x2a0
> [25104.561870]  [<ffffffffa00e3ab6>] ? radeon_fence_process+0x16/0x40 [radeon]
> [25104.561879]  [<ffffffffa00e3b24>] ? radeon_fence_any_seq_signaled+0x44/0x90 [radeon]
> [25104.561887]  [<ffffffffa00e3e97>] radeon_fence_wait_seq_timeout.constprop.8+0x327/0x380 [radeon]
> [25104.561889]  [<ffffffff810d19c0>] ? __wake_up_sync+0x20/0x20
> [25104.561898]  [<ffffffffa00e4287>] radeon_fence_wait_any+0x57/0x70 [radeon]
> [25104.561914]  [<ffffffffa015a36f>] radeon_sa_bo_new+0x2af/0x4b0 [radeon]
> [25104.561916]  [<ffffffff81379b07>] ? debug_smp_processor_id+0x17/0x20
> [25104.561918]  [<ffffffff811d0b4a>] ? __kmalloc+0x8a/0x300
> [25104.561932]  [<ffffffffa01b2197>] radeon_ib_get+0x37/0xe0 [radeon]
> [25104.561943]  [<ffffffffa01003ee>] radeon_cs_ioctl+0x22e/0x860 [radeon]
> [25104.561952]  [<ffffffffa0005bc7>] drm_ioctl+0x197/0x670 [drm]
> [25104.561954]  [<ffffffff81379b07>] ? debug_smp_processor_id+0x17/0x20
> [25104.561956]  [<ffffffff810901ba>] ? unpin_current_cpu+0x1a/0x80
> [25104.561959]  [<ffffffff810ba200>] ? migrate_enable+0x90/0x1a0
> [25104.561966]  [<ffffffffa00c604c>] radeon_drm_ioctl+0x4c/0x80 [radeon]
> [25104.561967]  [<ffffffff811fdb88>] do_vfs_ioctl+0x2c8/0x4c0
> [25104.561969]  [<ffffffff81208a92>] ? __fget+0x72/0xb0
> [25104.561970]  [<ffffffff811fde01>] SyS_ioctl+0x81/0xa0
> [25104.561971]  [<ffffffff8171f99e>] tracesys_phase2+0xd4/0xd9
> 
> Conclusion:
> A change to the DRM subsystem between 3.16.7 and 3.18.9 introduced a
> race condition that freezes Radeon graphics. It requires full
> preemption to be exposed reliably.




-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
