All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Christian König" <ckoenig.leichtzumerken@gmail.com>
To: vitaly.prosyak@amd.com, amd-gfx@lists.freedesktop.org,
	dri-devel@lists.freedesktop.org
Cc: Christian Koenig <christian.koenig@amd.com>,
	Alex Deucher <alexander.deucher@amd.com>,
	Luben Tuikov <ltuikov89@gmail.com>,
	Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>,
	Joonkyo Jung <joonkyoj@yonsei.ac.kr>,
	Dokyung Song <dokyungs@yonsei.ac.kr>,
	jisoo.jang@yonsei.ac.kr, yw9865@yonsei.ac.kr
Subject: Re: [PATCH] drm/sched: fix null-ptr-deref in init entity
Date: Fri, 15 Mar 2024 15:04:04 +0100	[thread overview]
Message-ID: <62173dc5-2cc9-4bcc-9ef5-72631ae96e30@gmail.com> (raw)
In-Reply-To: <20240315023926.343164-1-vitaly.prosyak@amd.com>

Am 15.03.24 um 03:39 schrieb vitaly.prosyak@amd.com:
> From: Vitaly Prosyak <vitaly.prosyak@amd.com>
>
> The bug can be triggered by sending an amdgpu_cs_wait_ioctl
> to the AMDGPU DRM driver on any ASICs with valid context.
> The bug was reported by Joonkyo Jung <joonkyoj@yonsei.ac.kr>.
> For example the following code:
>
>      static void Syzkaller2(int fd)
>      {
> 	union drm_amdgpu_ctx arg1;
> 	union drm_amdgpu_wait_cs arg2;
>
> 	arg1.in.op = AMDGPU_CTX_OP_ALLOC_CTX;
> 	ret = drmIoctl(fd, 0x140106442 /* amdgpu_ctx_ioctl */, &arg1);
>
> 	arg2.in.handle = 0x0;
> 	arg2.in.timeout = 0x2000000000000;
> 	arg2.in.ip_type = AMD_IP_VPE /* 0x9 */;
> 	arg2->in.ip_instance = 0x0;
> 	arg2.in.ring = 0x0;
> 	arg2.in.ctx_id = arg1.out.alloc.ctx_id;
>
> 	drmIoctl(fd, 0xc0206449 /* AMDGPU_WAIT_CS * /, &arg2);
>      }
>
> The ioctl AMDGPU_WAIT_CS without previously submitted job could be assumed that
> the error should be returned, but the following commit 1decbf6bb0b4dc56c9da6c5e57b994ebfc2be3aa
> modified the logic and allowed to have sched_rq equal to NULL.
>
> As a result when there is no job the ioctl AMDGPU_WAIT_CS returns success.
> The change fixes null-ptr-deref in init entity and the stack below demonstrates
> the error condition:
>
> [  +0.000007] BUG: kernel NULL pointer dereference, address: 0000000000000028
> [  +0.007086] #PF: supervisor read access in kernel mode
> [  +0.005234] #PF: error_code(0x0000) - not-present page
> [  +0.005232] PGD 0 P4D 0
> [  +0.002501] Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI
> [  +0.005034] CPU: 10 PID: 9229 Comm: amd_basic Tainted: G    B   W    L     6.7.0+ #4
> [  +0.007797] Hardware name: ASUS System Product Name/ROG STRIX B550-F GAMING (WI-FI), BIOS 1401 12/03/2020
> [  +0.009798] RIP: 0010:drm_sched_entity_init+0x2d3/0x420 [gpu_sched]
> [  +0.006426] Code: 80 00 00 00 00 00 00 00 e8 1a 81 82 e0 49 89 9c 24 c0 00 00 00 4c 89 ef e8 4a 80 82 e0 49 8b 5d 00 48 8d 7b 28 e8 3d 80 82 e0 <48> 83 7b 28 00 0f 84 28 01 00 00 4d 8d ac 24 98 00 00 00 49 8d 5c
> [  +0.019094] RSP: 0018:ffffc90014c1fa40 EFLAGS: 00010282
> [  +0.005237] RAX: 0000000000000001 RBX: 0000000000000000 RCX: ffffffff8113f3fa
> [  +0.007326] RDX: fffffbfff0a7889d RSI: 0000000000000008 RDI: ffffffff853c44e0
> [  +0.007264] RBP: ffffc90014c1fa80 R08: 0000000000000001 R09: fffffbfff0a7889c
> [  +0.007266] R10: ffffffff853c44e7 R11: 0000000000000001 R12: ffff8881a719b010
> [  +0.007263] R13: ffff88810d412748 R14: 0000000000000002 R15: 0000000000000000
> [  +0.007264] FS:  00007ffff7045540(0000) GS:ffff8883cc900000(0000) knlGS:0000000000000000
> [  +0.008236] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  +0.005851] CR2: 0000000000000028 CR3: 000000011912e000 CR4: 0000000000350ef0
> [  +0.007175] Call Trace:
> [  +0.002561]  <TASK>
> [  +0.002141]  ? show_regs+0x6a/0x80
> [  +0.003473]  ? __die+0x25/0x70
> [  +0.003124]  ? page_fault_oops+0x214/0x720
> [  +0.004179]  ? preempt_count_sub+0x18/0xc0
> [  +0.004093]  ? __pfx_page_fault_oops+0x10/0x10
> [  +0.004590]  ? srso_return_thunk+0x5/0x5f
> [  +0.004000]  ? vprintk_default+0x1d/0x30
> [  +0.004063]  ? srso_return_thunk+0x5/0x5f
> [  +0.004087]  ? vprintk+0x5c/0x90
> [  +0.003296]  ? drm_sched_entity_init+0x2d3/0x420 [gpu_sched]
> [  +0.005807]  ? srso_return_thunk+0x5/0x5f
> [  +0.004090]  ? _printk+0xb3/0xe0
> [  +0.003293]  ? __pfx__printk+0x10/0x10
> [  +0.003735]  ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
> [  +0.005482]  ? do_user_addr_fault+0x345/0x770
> [  +0.004361]  ? exc_page_fault+0x64/0xf0
> [  +0.003972]  ? asm_exc_page_fault+0x27/0x30
> [  +0.004271]  ? add_taint+0x2a/0xa0
> [  +0.003476]  ? drm_sched_entity_init+0x2d3/0x420 [gpu_sched]
> [  +0.005812]  amdgpu_ctx_get_entity+0x3f9/0x770 [amdgpu]
> [  +0.009530]  ? finish_task_switch.isra.0+0x129/0x470
> [  +0.005068]  ? __pfx_amdgpu_ctx_get_entity+0x10/0x10 [amdgpu]
> [  +0.010063]  ? __kasan_check_write+0x14/0x20
> [  +0.004356]  ? srso_return_thunk+0x5/0x5f
> [  +0.004001]  ? mutex_unlock+0x81/0xd0
> [  +0.003802]  ? srso_return_thunk+0x5/0x5f
> [  +0.004096]  amdgpu_cs_wait_ioctl+0xf6/0x270 [amdgpu]
> [  +0.009355]  ? __pfx_amdgpu_cs_wait_ioctl+0x10/0x10 [amdgpu]
> [  +0.009981]  ? srso_return_thunk+0x5/0x5f
> [  +0.004089]  ? srso_return_thunk+0x5/0x5f
> [  +0.004090]  ? __srcu_read_lock+0x20/0x50
> [  +0.004096]  drm_ioctl_kernel+0x140/0x1f0 [drm]
> [  +0.005080]  ? __pfx_amdgpu_cs_wait_ioctl+0x10/0x10 [amdgpu]
> [  +0.009974]  ? __pfx_drm_ioctl_kernel+0x10/0x10 [drm]
> [  +0.005618]  ? srso_return_thunk+0x5/0x5f
> [  +0.004088]  ? __kasan_check_write+0x14/0x20
> [  +0.004357]  drm_ioctl+0x3da/0x730 [drm]
> [  +0.004461]  ? __pfx_amdgpu_cs_wait_ioctl+0x10/0x10 [amdgpu]
> [  +0.009979]  ? __pfx_drm_ioctl+0x10/0x10 [drm]
> [  +0.004993]  ? srso_return_thunk+0x5/0x5f
> [  +0.004090]  ? __kasan_check_write+0x14/0x20
> [  +0.004356]  ? srso_return_thunk+0x5/0x5f
> [  +0.004090]  ? _raw_spin_lock_irqsave+0x99/0x100
> [  +0.004712]  ? __pfx__raw_spin_lock_irqsave+0x10/0x10
> [  +0.005063]  ? __pfx_arch_do_signal_or_restart+0x10/0x10
> [  +0.005477]  ? srso_return_thunk+0x5/0x5f
> [  +0.004000]  ? preempt_count_sub+0x18/0xc0
> [  +0.004237]  ? srso_return_thunk+0x5/0x5f
> [  +0.004090]  ? _raw_spin_unlock_irqrestore+0x27/0x50
> [  +0.005069]  amdgpu_drm_ioctl+0x7e/0xe0 [amdgpu]
> [  +0.008912]  __x64_sys_ioctl+0xcd/0x110
> [  +0.003918]  do_syscall_64+0x5f/0xe0
> [  +0.003649]  ? noist_exc_debug+0xe6/0x120
> [  +0.004095]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
> [  +0.005150] RIP: 0033:0x7ffff7b1a94f
> [  +0.003647] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00
> [  +0.019097] RSP: 002b:00007fffffffe0a0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> [  +0.007708] RAX: ffffffffffffffda RBX: 000055555558b360 RCX: 00007ffff7b1a94f
> [  +0.007176] RDX: 000055555558b360 RSI: 00000000c0206449 RDI: 0000000000000003
> [  +0.007326] RBP: 00000000c0206449 R08: 000055555556ded0 R09: 000000007fffffff
> [  +0.007176] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fffffffe5d8
> [  +0.007238] R13: 0000000000000003 R14: 000055555555cba8 R15: 00007ffff7ffd040
> [  +0.007250]  </TASK>
>
> v2: Reworked check to guard against null ptr deref and added helpful comments
>      (Christian)
>
> Cc: Christian Koenig <christian.koenig@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Luben Tuikov <ltuikov89@gmail.com>
> Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
> Cc: Joonkyo Jung <joonkyoj@yonsei.ac.kr>
> Cc: Dokyung Song <dokyungs@yonsei.ac.kr>
> Cc: <jisoo.jang@yonsei.ac.kr>
> Cc: <yw9865@yonsei.ac.kr>
> Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

Going to push this to drm-misc-fixes.

Thanks,
Christian.

> ---
>   drivers/gpu/drm/scheduler/sched_entity.c | 12 +++++++++---
>   1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index 3c4f5a392b06..58c8161289fe 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -71,13 +71,19 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
>   	entity->guilty = guilty;
>   	entity->num_sched_list = num_sched_list;
>   	entity->priority = priority;
> +	/*
> +	 * It's perfectly valid to initialize an entity without having a valid
> +	 * scheduler attached. It's just not valid to use the scheduler before it
> +	 * is initialized itself.
> +	 */
>   	entity->sched_list = num_sched_list > 1 ? sched_list : NULL;
>   	RCU_INIT_POINTER(entity->last_scheduled, NULL);
>   	RB_CLEAR_NODE(&entity->rb_tree_node);
>   
> -	if (!sched_list[0]->sched_rq) {
> -		/* Warn drivers not to do this and to fix their DRM
> -		 * calling order.
> +	if (num_sched_list && !sched_list[0]->sched_rq) {
> +		/* Since every entry covered by num_sched_list
> +		 * should be non-NULL and therefore we warn drivers
> +		 * not to do this and to fix their DRM calling order.
>   		 */
>   		pr_warn("%s: called with uninitialized scheduler\n", __func__);
>   	} else if (num_sched_list) {


  reply	other threads:[~2024-03-15 14:04 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-15  2:39 [PATCH] drm/sched: fix null-ptr-deref in init entity vitaly.prosyak
2024-03-15 14:04 ` Christian König [this message]
2024-03-15 14:12   ` Alex Deucher
2024-03-15 14:32     ` Christian König
  -- strict thread matches above, loose matches on Subject: below --
2024-03-13 21:24 vitaly.prosyak

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=62173dc5-2cc9-4bcc-9ef5-72631ae96e30@gmail.com \
    --to=ckoenig.leichtzumerken@gmail.com \
    --cc=alexander.deucher@amd.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=bas@basnieuwenhuizen.nl \
    --cc=christian.koenig@amd.com \
    --cc=dokyungs@yonsei.ac.kr \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=jisoo.jang@yonsei.ac.kr \
    --cc=joonkyoj@yonsei.ac.kr \
    --cc=ltuikov89@gmail.com \
    --cc=vitaly.prosyak@amd.com \
    --cc=yw9865@yonsei.ac.kr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.