AMD-GFX Archive on lore.kernel.org
From: "Christian König" <christian.koenig@amd.com>
To: "Michel Dänzer" <michel@daenzer.net>,
	jiadong.zhu@amd.com, amd-gfx@lists.freedesktop.org
Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>,
	Huang Rui <ray.huang@amd.com>,
	Luben Tuikov <Luben.Tuikov@amd.com>
Subject: Re: [PATCH 1/5] drm/amdgpu: Introduce gfx software ring (v8)
Date: Thu, 20 Oct 2022 16:59:01 +0200
Message-ID: <a4e05017-ac7d-9872-dfad-257be85d1572@amd.com>
In-Reply-To: <4046cec7-88a1-d91d-9553-678d5165d308@daenzer.net>

On 20.10.22 at 16:49, Michel Dänzer wrote:
> On 2022-10-18 11:08, jiadong.zhu@amd.com wrote:
>> From: "Jiadong.Zhu" <Jiadong.Zhu@amd.com>
>>
>> The software ring is created to support priority context while there is only
>> one hardware queue for gfx.
>>
>> Every software ring has its fence driver and could be used as an ordinary ring
>> for the GPU scheduler.
>> Multiple software rings are bound to a real ring with the ring muxer. The
>> packages committed on the software ring are copied to the real ring.
>>
>> v2: Use array to store software ring entry.
>> v3: Remove unnecessary prints.
>> v4: Remove amdgpu_ring_sw_init/fini functions,
>> using gtt for sw ring buffer for later dma copy
>> optimization.
>> v5: Allocate ring entry dynamically in the muxer.
>> v6: Update comments for the ring muxer.
>> v7: Modify for function naming.
>> v8: Combine software ring functions into amdgpu_ring_mux.c
> I tested patches 1-4 of this series (since Christian clearly nacked patch 5). I hit the oops below.

As long as you don't try to reset the GPU, you can also test patch 5.
It's just that we can't upstream it in this form, or GPU reset would
break immediately.

Regards,
Christian.

>
> amdgpu_sw_ring_ib_begin+0x70/0x80 is in amdgpu_mcbp_trigger_preempt according to scripts/faddr2line, specifically line 376:
>
> 	spin_unlock(&mux->lock);
>
> though I'm not sure why that would crash.
>
>
> Are you not able to reproduce this with a GNOME Wayland session?
>
>
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> #PF: supervisor instruction fetch in kernel mode
> #PF: error_code(0x0010) - not-present page
> PGD 0 P4D 0
> Oops: 0010 [#1] PREEMPT SMP NOPTI
> CPU: 7 PID: 281 Comm: gfx_high Tainted: G            E      6.0.2+ #1
> Hardware name: LENOVO 20NF0000GE/20NF0000GE, BIOS R11ET36W (1.16 ) 03/30/2020
> RIP: 0010:0x0
> Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
> RSP: 0018:ffffbd594073bdc8 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: ffff993c4a620000 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff993c4a62a350
> RBP: ffff993c4a62d530 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000114
> R13: ffff993c4a620000 R14: 0000000000000000 R15: ffff993c4a62d128
> FS:  0000000000000000(0000) GS:ffff993ef0bc0000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffffffffffffffd6 CR3: 00000001959fc000 CR4: 00000000003506e0
> Call Trace:
>   <TASK>
>   amdgpu_sw_ring_ib_begin+0x70/0x80 [amdgpu]
>   amdgpu_ib_schedule+0x15f/0x5d0 [amdgpu]
>   amdgpu_job_run+0x102/0x1c0 [amdgpu]
>   drm_sched_main+0x19a/0x440 [gpu_sched]
>   ? dequeue_task_stop+0x70/0x70
>   ? drm_sched_resubmit_jobs+0x10/0x10 [gpu_sched]
>   kthread+0xe9/0x110
>   ? kthread_complete_and_exit+0x20/0x20
>   ret_from_fork+0x22/0x30
>   </TASK>
> [...]
> note: gfx_high[281] exited with preempt_count 1
> [...]
> [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_low timeout, signaled seq=14864, emitted seq=14865
> [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process firefox.dpkg-di pid 3540 thread firefox:cs0 pid 4666
> amdgpu 0000:05:00.0: amdgpu: GPU reset begin!
>
>

