From: bugzilla-daemon@freedesktop.org
To: dri-devel@lists.freedesktop.org
Subject: [Bug 100712] ring 0 stalled after bytes_moved_threshold reached - Cap Verde - HD 7770
Date: Tue, 18 Apr 2017 15:14:55 +0000
List-Id: dri-devel@lists.freedesktop.org

https://bugs.freedesktop.org/show_bug.cgi?id=100712

    Bug ID: 100712
   Summary: ring 0 stalled after bytes_moved_threshold reached - Cap Verde - HD 7770
   Product: DRI
   Version: DRI git
  Hardware: Other
        OS: All
    Status: NEW
  Severity: normal
  Priority: medium
 Component: DRM/Radeon
  Assignee: dri-devel@lists.freedesktop.org
  Reporter: julien.isorce@gmail.com

Kernel 4.9 from
https://cgit.freedesktop.org/~agd5f/linux/log/?h=amd-staging-4.9 and the latest
mesa (same result with the drm-next-4.12 branch).
Same result with kernel 4.8 and mesa 12.0.6.

In the kernel, in radeon_object.c::radeon_bo_list_validate, once
"bytes_moved > bytes_moved_threshold" is reached (this happens with 850 bos in
the same list_for_each_entry loop), I can see that radeon_ib_schedule emits a
fence that takes longer than radeon.lockup_timeout to be signaled.
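
To make the condition above concrete, here is a minimal user-space sketch of a per-submission move budget. It is not the kernel code: the struct, the enum and validate_list() are hypothetical names invented for illustration, and the model only covers the budget check itself. It says nothing about what the real validation call does with a buffer that is left in place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model; none of these names are kernel API. */
enum domain { DOMAIN_GTT = 1, DOMAIN_VRAM = 2 };

struct model_bo {
    uint64_t size;              /* buffer size in bytes */
    unsigned current_domain;    /* where the buffer lives now */
    unsigned preferred_domain;  /* single domain the buffer should end up in */
    unsigned allowed_domains;   /* bitmask of acceptable domains */
};

/*
 * Walk the list once. A buffer that would have to move is left in place
 * once *bytes_moved has exceeded the threshold, provided its current
 * domain is allowed. Returns the number of buffers actually moved.
 */
static size_t validate_list(struct model_bo *bos, size_t n,
                            uint64_t threshold, uint64_t *bytes_moved)
{
    size_t moved = 0;

    assert(bos != NULL && bytes_moved != NULL);

    for (size_t i = 0; i < n; i++) {
        struct model_bo *bo = &bos[i];
        unsigned target = bo->preferred_domain;
        int would_move = (target & bo->current_domain) == 0;
        int may_stay = (bo->allowed_domains & bo->current_domain) != 0;

        /* Budget spent: keep the buffer where it is. */
        if (would_move && may_stay && *bytes_moved > threshold)
            target = bo->current_domain;

        if ((target & bo->current_domain) == 0) {
            bo->current_domain = target;
            *bytes_moved += bo->size;
            moved++;
        }
    }
    return moved;
}
```

With four 100-byte buffers that all prefer another domain and a threshold of 150, the first two are moved (the counter is only compared before each move) and the remaining two stay where they are.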

In radeon_fence_activity, I checked that "last_emitted" is the seq number of
this last emitted fence, and that last_seq is equal to last_emitted - 1.

Then the next call to ttm_bo_wait blocks (15 * HZ > radeon.lockup_timeout)
until a GPU lockup is detected, which leads to a GPU reset.
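
The timing comparison in parentheses can be written out as a small check. This is only a sketch: the helper name is hypothetical, and the 10000 ms value mentioned below is an assumption about the radeon.lockup_timeout setting in use, not something stated in this report.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Returns true when a blocking wait of wait_seconds outlasts the lockup
 * timeout (given in milliseconds), i.e. the lockup timeout expires while
 * the caller is still waiting.
 */
static bool wait_outlasts_lockup_timeout(unsigned wait_seconds,
                                         unsigned lockup_timeout_ms)
{
    assert(lockup_timeout_ms > 0);
    return wait_seconds * 1000u > lockup_timeout_ms;
}
```

Assuming a lockup timeout of 10000 ms, a 15 second wait outlasts it, while a 4 second wait returns first.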

Also, it seems the fence is eventually signaled by swapper after more than 10
seconds, but by then it is too late. I had to reduce the "15" parameter above
to 4 to see that.

Is it normal that radeon_bo_list_validate still tries to move the bo once
bytes_moved_threshold is reached? Indeed, ttm_bo_validate is always called (it
blits from vram to vram).
Is it also normal that ttm_bo_validate is called with the evict flag set to
true once bytes_moved_threshold is reached?

