From: "Christian König" <ckoenig.leichtzumerken@gmail.com>
To: "Mikhail Gavrilov" <mikhail.v.gavrilov@gmail.com>,
"amd-gfx list" <amd-gfx@lists.freedesktop.org>,
dri-devel <dri-devel@lists.freedesktop.org>,
"Linux List Kernel Mailing" <linux-kernel@vger.kernel.org>,
"Christian König" <Christian.Koenig@amd.com>,
"Daniel Vetter" <daniel.vetter@ffwll.ch>
Subject: Re: BUG: KASAN: null-ptr-deref in drm_sched_job_cleanup+0x96/0x290 [gpu_sched]
Date: Wed, 19 Apr 2023 10:12:24 +0200
Message-ID: <10b2570f-a297-d236-fa7b-2e001a4dff12@gmail.com>
In-Reply-To: <CABXGCsOTEpJG_0NWdGXRvcXQ4iTav6AUJm-U4SQb-vVzjoL6rA@mail.gmail.com>
On 19.04.23 at 09:00, Mikhail Gavrilov wrote:
> Christian?
I'm already looking into this, but I can't figure out yet why we run into
problems here.
What happens is that a CS (command submission) is aborted without the job
being sent to the scheduler, and in that case the cleanup function doesn't
seem to work.
Christian.
>
> ❯ /usr/src/kernels/6.3.0-0.rc7.56.fc39.x86_64/scripts/faddr2line
> /lib/debug/lib/modules/6.3.0-0.rc7.56.fc39.x86_64/kernel/drivers/gpu/drm/scheduler/gpu-sched.ko.debug
> drm_sched_job_cleanup+0x9a
> drm_sched_job_cleanup+0x9a/0x130:
> drm_sched_job_cleanup at
> /usr/src/debug/kernel-6.3-rc7/linux-6.3.0-0.rc7.56.fc39.x86_64/drivers/gpu/drm/scheduler/sched_main.c:808
> (discriminator 3)
>
> ❯ cat -s -n /usr/src/debug/kernel-6.3-rc7/linux-6.3.0-0.rc7.56.fc39.x86_64/drivers/gpu/drm/scheduler/sched_main.c
> | head -818 | tail -20
> 799                 /* drm_sched_job_arm() has been called */
> 800                 dma_fence_put(&job->s_fence->finished);
> 801         } else {
> 802                 /* aborted job before committing to run it */
> 803                 drm_sched_fence_free(job->s_fence);
> 804         }
> 805
> 806         job->s_fence = NULL;
> 807
> 808         xa_for_each(&job->dependencies, index, fence) {
> 809                 dma_fence_put(fence);
> 810         }
> 811         xa_destroy(&job->dependencies);
> 812
> 813 }
> 814 EXPORT_SYMBOL(drm_sched_job_cleanup);
> 815
> 816 /**
> 817  * drm_sched_ready - is the scheduler ready
> 818  *
>
> ❯ git blame drivers/gpu/drm/scheduler/sched_main.c -L 800,819
> dbe48d030b285 drivers/gpu/drm/scheduler/sched_main.c (Daniel Vetter 2021-08-17 10:49:16 +0200 800) dma_fence_put(&job->s_fence->finished);
> dbe48d030b285 drivers/gpu/drm/scheduler/sched_main.c (Daniel Vetter 2021-08-17 10:49:16 +0200 801) } else {
> dbe48d030b285 drivers/gpu/drm/scheduler/sched_main.c (Daniel Vetter 2021-08-17 10:49:16 +0200 802) /* aborted job before committing to run it */
> d4c16733e7960 drivers/gpu/drm/scheduler/sched_main.c (Boris Brezillon 2021-09-03 14:05:54 +0200 803) drm_sched_fence_free(job->s_fence);
> dbe48d030b285 drivers/gpu/drm/scheduler/sched_main.c (Daniel Vetter 2021-08-17 10:49:16 +0200 804) }
> dbe48d030b285 drivers/gpu/drm/scheduler/sched_main.c (Daniel Vetter 2021-08-17 10:49:16 +0200 805)
> 26efecf955889 drivers/gpu/drm/scheduler/sched_main.c (Sharat Masetty 2018-10-29 15:02:28 +0530 806) job->s_fence = NULL;
> ebd5f74255b9f drivers/gpu/drm/scheduler/sched_main.c (Daniel Vetter 2021-08-05 12:46:49 +0200 807)
> ebd5f74255b9f drivers/gpu/drm/scheduler/sched_main.c (Daniel Vetter 2021-08-05 12:46:49 +0200 808) xa_for_each(&job->dependencies, index, fence) {
> ebd5f74255b9f drivers/gpu/drm/scheduler/sched_main.c (Daniel Vetter 2021-08-05 12:46:49 +0200 809) dma_fence_put(fence);
> ebd5f74255b9f drivers/gpu/drm/scheduler/sched_main.c (Daniel Vetter 2021-08-05 12:46:49 +0200 810) }
> ebd5f74255b9f drivers/gpu/drm/scheduler/sched_main.c (Daniel Vetter 2021-08-05 12:46:49 +0200 811) xa_destroy(&job->dependencies);
> ebd5f74255b9f drivers/gpu/drm/scheduler/sched_main.c (Daniel Vetter 2021-08-05 12:46:49 +0200 812)
> 26efecf955889 drivers/gpu/drm/scheduler/sched_main.c (Sharat Masetty 2018-10-29 15:02:28 +0530 813) }
> 26efecf955889 drivers/gpu/drm/scheduler/sched_main.c (Sharat Masetty 2018-10-29 15:02:28 +0530 814) EXPORT_SYMBOL(drm_sched_job_cleanup);
> 26efecf955889 drivers/gpu/drm/scheduler/sched_main.c (Sharat Masetty 2018-10-29 15:02:28 +0530 815)
> e688b728228b9 drivers/gpu/drm/amd/scheduler/gpu_scheduler.c (Christian König 2015-08-20 17:01:01 +0200 816) /**
> 2d33948e4e00b drivers/gpu/drm/scheduler/gpu_scheduler.c (Nayan Deshmukh 2018-05-29 11:23:07 +0530 817) * drm_sched_ready - is the scheduler ready
> 2d33948e4e00b drivers/gpu/drm/scheduler/gpu_scheduler.c (Nayan Deshmukh 2018-05-29 11:23:07 +0530 818) *
> 2d33948e4e00b drivers/gpu/drm/scheduler/gpu_scheduler.c (Nayan Deshmukh 2018-05-29 11:23:07 +0530 819) * @sched: scheduler instance
>
> Daniel, since Christian looks a little busy, can you help? git blame
> says that you are the author of the code that KASAN mentions in its
> report.
> The issue is reproducible on all the AMD hardware I have available:
> 6800M, 6900XT, 7900XTX.
>