dri-devel.lists.freedesktop.org archive mirror
From: bugzilla-daemon@bugzilla.kernel.org
To: dri-devel@lists.freedesktop.org
Subject: [Bug 206987] [drm] [amdgpu] Whole system crashes when the driver is in mode_support_and_system_configuration
Date: Thu, 28 May 2020 16:24:21 +0000
Message-ID: <bug-206987-2300-wNzCdYTPKA@https.bugzilla.kernel.org/>
In-Reply-To: <bug-206987-2300@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=206987

--- Comment #18 from Petteri Aimonen (jpa@kernelbug.mail.kapsi.fi) ---
As best as I can tell, the crash seems to be caused by a floating-point
exception (such as underflow or overflow) in this function call at
dcn_calc_auto.c line 176:

dcn_bw_ceil2(v->byte_per_pixel_in_dety[k], 1.0)

In dcn_bw_ceil2() the exception occurs in this instruction:

addsd  0x0(%rip),%xmm3

which performs the addition flr + 0.00001.
At this point %xmm3 holds ((int)(v->byte_per_pixel_in_dety[k] / 1.0)) * 1.0.
The variable byte_per_pixel_in_dety is only ever assigned the constant values
1.0, 2.0, 4.0 and 8.0, so I don't see any reason for addsd to cause a SIMD
exception. I'm not sure if the exception is precise or if it could be delayed
from some prior instruction, but AFAIK it should be precise, because in
usermode the exception handler would attempt a recovery.
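
To make that claim easier to check outside the kernel, here is a minimal
userspace sketch in C. It is modeled only on the description above, not copied
from the driver: the names sketch_ceil2 and ceil2_exception_flags are
hypothetical, the zero-significance guard and the comparison after the
addition are assumptions, and it assumes an x86-64 build where float math uses
SSE so that MXCSR reflects the result. Mixing a float flr with the double
constant 0.00001 promotes the addition to double, which would explain why the
disassembly shows addsd rather than addss.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>	/* _mm_getcsr, _mm_setcsr, _MM_EXCEPT_* (x86 SSE only) */

/*
 * Hypothetical stand-in for the rounding helper described in the comment.
 * The guard against a zero significance and the final comparison are
 * assumptions; only "flr + 0.00001" is taken from the report.
 */
static float sketch_ceil2(float arg, float significance)
{
	float flr;

	if (significance == 0)
		return 0;
	flr = ((int)(arg / significance)) * significance;
	/* 0.00001 is a double constant, so this addition is done in double. */
	return flr + 0.00001 >= arg ? arg : flr + significance;
}

/*
 * Runs sketch_ceil2() with the sticky MXCSR flags cleared and returns the
 * flags it raised, ignoring "inexact" (which ordinary rounding sets).
 * The volatile temporaries keep the compiler from folding the arithmetic
 * away at build time.
 */
static unsigned int ceil2_exception_flags(float arg, float significance,
					  float *result)
{
	volatile float a = arg, s = significance;
	volatile float r;
	unsigned int saved, flags;

	assert(result != NULL);
	saved = _mm_getcsr();
	_mm_setcsr(saved & ~_MM_EXCEPT_MASK);	/* clear sticky flags */
	r = sketch_ceil2(a, s);
	flags = _mm_getcsr() & _MM_EXCEPT_MASK;
	_mm_setcsr(saved);			/* restore caller's state */
	*result = r;
	return flags & ~_MM_EXCEPT_INEXACT;
}
```

Under these assumptions, calling ceil2_exception_flags() with arg set to 1.0,
2.0, 4.0 or 8.0 and significance 1.0 should return 0, meaning nothing beyond
the inexact flag was raised. That matches the reasoning that addsd has no
obvious cause to fault on these inputs.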

Having XMM3 or MXCSR values would help, but they don't seem to get included in
the dmesg output and I'm not sure if they are available in a crash dump either.
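
If an MXCSR value does turn up in a dump, it can be decoded by hand. The
sketch below shows one way to do that in C; the helper names are hypothetical
and it relies only on the architectural MXCSR layout (exception flags in bits
0-5, the matching mask bits in bits 7-12). A flag that is set while its mask
bit is clear is a candidate for the exception that was actually delivered.

```c
#include <assert.h>
#include <stdint.h>

#define MXCSR_FLAG_BITS		0x3fu	/* IE, DE, ZE, OE, UE, PE */
#define MXCSR_MASK_SHIFT	7	/* IM, DM, ZM, OM, UM, PM start here */

/* Returns the exception flags that are both set and unmasked. */
static uint32_t mxcsr_unmasked_flags(uint32_t mxcsr)
{
	uint32_t flags = mxcsr & MXCSR_FLAG_BITS;
	uint32_t masks = (mxcsr >> MXCSR_MASK_SHIFT) & MXCSR_FLAG_BITS;

	return flags & ~masks;
}

/* Human-readable name for flag bit 0..5. */
static const char *mxcsr_flag_name(unsigned int bit)
{
	static const char *const names[6] = {
		"invalid", "denormal", "divide-by-zero",
		"overflow", "underflow", "precision",
	};

	assert(bit < 6);
	return names[bit];
}
```

For example, the power-on default 0x1f80 has every exception masked, so
mxcsr_unmasked_flags(0x1f80) is 0, while 0x1d84 (divide-by-zero flag set and
unmasked) yields 0x04.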

A Google search turned up
https://beowulf.beowulf.narkive.com/tAHxVcs0/simd-exception-kernel-panic-on-skylake-ep-triggered-by-openfoam
where the exception was delayed for some reason.

Analyzing the dmesg logs attached to this bug report, we have the following
crash locations:

Cyrax    2020-03-26 21:36: divss  xmm0,DWORD PTR [r14+0x17f8]
Cyrax    2020-04-04 07:40: divss  xmm0,DWORD PTR [r14+0x17f8]
Cyrax    2020-04-18 13:19: divss  xmm0,DWORD PTR [r14+0x17f8]
farmboy0 2020-04-19 11:43: not a simd exception
Cyrax    2020-04-23 05:15: divss  xmm0,DWORD PTR [r14+0x17f8]
Cyrax    2020-04-27 19:20: divss  xmm0,DWORD PTR [r14+0x17f8]
Cyrax    2020-05-02 14:18: divss  xmm0,DWORD PTR [r14+0x17f8]
PetteriA 2020-05-28 16:05: addsd  xmm3,QWORD PTR [rip+0x1de967]

So the crash locations appear fairly consistent for Cyrax's machine, but no two
machines have the same location.

For other users affected by this problem, it would help if you installed
kernel debugging symbols and used decode_stacktrace.sh to convert the raw
stack trace to code locations.

Also reported on the freedesktop AMD bug tracker:
https://gitlab.freedesktop.org/drm/amd/-/issues/1154

