* [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw
@ 2025-09-24 15:29 Yifan Zhang
  2025-09-24 15:29 ` [PATCH v4 2/2] amd/amdkfd: enhance kfd process check in switch partition Yifan Zhang
  2025-09-24 22:48 ` [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw Philip Yang
  0 siblings, 2 replies; 16+ messages in thread
From: Yifan Zhang @ 2025-09-24 15:29 UTC (permalink / raw)
  To: amd-gfx
  Cc: Alexander.Deucher, Felix.Kuehling, Philip.Yang, Lijo.Lazar, Yifan Zhang

There is a race between amdgpu_amdkfd_device_fini_sw and the interrupt
path: if amdgpu_amdkfd_device_fini_sw runs in between kfd_cleanup_nodes
and kfree(kfd) while a KGD interrupt is generated, the interrupt handler
dereferences an already-freed kfd->nodes[i] entry.

kernel panic log (interleaved with concurrent probe messages):

BUG: kernel NULL pointer dereference, address: 0000000000000098
amdgpu 0000:c8:00.0: amdgpu: Requesting 4 partitions through PSP
PGD d78c68067 P4D d78c68067
kfd kfd: amdgpu: Allocated 3969056 bytes on gart
PUD 1465b8067 PMD 0
Oops: 0002 [#1] SMP NOPTI
kfd kfd: amdgpu: Total number of KFD nodes to be created: 4
CPU: 115 PID: 0 Comm: swapper/115 Kdump: loaded Tainted: G S W OE K
RIP: 0010:_raw_spin_lock_irqsave+0x12/0x40
Code: 89 e0 41 5c c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 41 54 9c 41 5c fa 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 0a 4c 89 e0 41 5c 89 c6 e8 07 38 5d
RSP: 0018:ffffc9001a6b0e28 EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000018
RDX: 0000000000000001 RSI: ffff8883bb623e00 RDI: 0000000000000098
RBP: ffff8883bb000000 R08: ffff888100055020 R09: ffff888100055020
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
R13: ffff888f2b97da00 R14: 0000000000000098 R15: ffff8883babdf000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000098 CR3: 0000000e7cae2006 CR4: 0000000002770ce0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 <IRQ>
 kgd2kfd_interrupt+0x6b/0x1f0 [amdgpu]
 ? amdgpu_fence_process+0xa4/0x150 [amdgpu]
kfd kfd: amdgpu: Node: 0, interrupt_bitmap: 3
 amdgpu_irq_dispatch+0x165/0x210 [amdgpu]
 amdgpu_ih_process+0x80/0x100 [amdgpu]
amdgpu: Virtual CRAT table created for GPU
 amdgpu_irq_handler+0x1f/0x60 [amdgpu]
 __handle_irq_event_percpu+0x3d/0x170
amdgpu: Topology: Add dGPU node [0x74a2:0x1002]
 handle_irq_event+0x5a/0xc0
 handle_edge_irq+0x93/0x240
kfd kfd: amdgpu: KFD node 1 partition 0 size 49148M
 asm_call_irq_on_stack+0xf/0x20
 </IRQ>
 common_interrupt+0xb3/0x130
 asm_common_interrupt+0x1e/0x40
5.10.134-010.ali5000.al8.x86_64 #1

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 349c351e242b..051a00152b08 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -1133,7 +1133,15 @@ void kgd2kfd_interrupt(struct kfd_dev *kfd, const void *ih_ring_entry)
 	}
 
 	for (i = 0; i < kfd->num_nodes; i++) {
-		node = kfd->nodes[i];
+		/* Bail out if another thread is between
+		 * kfd_cleanup_nodes and kfree(kfd) and has
+		 * set kfd->nodes[i] to NULL.
+		 */
+		if (kfd->nodes[i])
+			node = kfd->nodes[i];
+		else
+			return;
+
 		spin_lock_irqsave(&node->interrupt_lock, flags);
 
 		if (node->interrupts_active
-- 
2.43.0

^ permalink raw reply related	[flat|nested] 16+ messages in thread
* [PATCH v4 2/2] amd/amdkfd: enhance kfd process check in switch partition
  2025-09-24 15:29 [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw Yifan Zhang
@ 2025-09-24 15:29 ` Yifan Zhang
  2025-09-24 23:14   ` Philip Yang
  2025-09-26 19:52   ` Chen, Xiaogang
  2025-09-24 22:48 ` [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw Philip Yang
  1 sibling, 2 replies; 16+ messages in thread
From: Yifan Zhang @ 2025-09-24 15:29 UTC (permalink / raw)
  To: amd-gfx
  Cc: Alexander.Deucher, Felix.Kuehling, Philip.Yang, Lijo.Lazar, Yifan Zhang

The current switch-partition path only checks whether kfd_processes_table
is empty. A kfd_processes_table entry is deleted in
kfd_process_notifier_release, but the kfd_process itself is torn down
later, in kfd_process_wq_release.

Consider two concurrent paths:

Process A (workqueue):        kfd_process_wq_release -> access kfd_node members
Process B (switch partition): amdgpu_xcp_pre_partition_switch ->
                              amdgpu_amdkfd_device_fini_sw -> kfd_node teardown

Processes A and B may race as shown in the dmesg log below.

Resolve the race by adding an atomic counter, kfd_processes_count, which
is incremented when a kfd process is created and decremented when
kfd_process_wq_release finishes.

v2: Make kfd_processes_count per kfd_dev, move the decrement to
kfd_process_destroy_pdds, and fix a bug.
(Philip Yang)

[3966658.307702] divide error: 0000 [#1] SMP NOPTI
[3966658.350818] i10nm_edac
[3966658.356318] CPU: 124 PID: 38435 Comm: kworker/124:0 Kdump: loaded Tainted
[3966658.356890] Workqueue: kfd_process_wq kfd_process_wq_release [amdgpu]
[3966658.362839] nfit
[3966658.366457] RIP: 0010:kfd_get_num_sdma_engines+0x17/0x40 [amdgpu]
[3966658.366460] Code: 00 00 e9 ac 81 02 00 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 00 00 48 8b 4f 08 48 8b b7 00 01 00 00 8b 81 58 26 03 00 99 <f7> be b8 01 00 00 80 b9 70 2e 00 00 00 74 0b 83 f8 02 ba 02 00 00
[3966658.380967] x86_pkg_temp_thermal
[3966658.391529] RSP: 0018:ffffc900a0edfdd8 EFLAGS: 00010246
[3966658.391531] RAX: 0000000000000008 RBX: ffff8974e593b800 RCX: ffff888645900000
[3966658.391531] RDX: 0000000000000000 RSI: ffff888129154400 RDI: ffff888129151c00
[3966658.391532] RBP: ffff8883ad79d400 R08: 0000000000000000 R09: ffff8890d2750af4
[3966658.391532] R10: 0000000000000018 R11: 0000000000000018 R12: 0000000000000000
[3966658.391533] R13: ffff8883ad79d400 R14: ffffe87ff662ba00 R15: ffff8974e593b800
[3966658.391533] FS: 0000000000000000(0000) GS:ffff88fe7f600000(0000) knlGS:0000000000000000
[3966658.391534] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[3966658.391534] CR2: 0000000000d71000 CR3: 000000dd0e970004 CR4: 0000000002770ee0
[3966658.391535] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[3966658.391535] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[3966658.391536] PKRU: 55555554
[3966658.391536] Call Trace:
[3966658.391674] deallocate_sdma_queue+0x38/0xa0 [amdgpu]
[3966658.391762] process_termination_cpsch+0x1ed/0x480 [amdgpu]
[3966658.399754] intel_powerclamp
[3966658.402831] kfd_process_dequeue_from_all_devices+0x5b/0xc0 [amdgpu]
[3966658.402908] kfd_process_wq_release+0x1a/0x1a0 [amdgpu]
[3966658.410516] coretemp
[3966658.434016] process_one_work+0x1ad/0x380
[3966658.434021] worker_thread+0x49/0x310
[3966658.438963] kvm_intel
[3966658.446041] ? process_one_work+0x380/0x380
[3966658.446045] kthread+0x118/0x140
[3966658.446047] ? __kthread_bind_mask+0x60/0x60
[3966658.446050] ret_from_fork+0x1f/0x30
[3966658.446053] Modules linked in: kpatch_20765354(OEK)
[3966658.455310] kvm
[3966658.464534] mptcp_diag xsk_diag raw_diag unix_diag af_packet_diag netlink_diag udp_diag act_pedit act_mirred act_vlan cls_flower kpatch_21951273(OEK) kpatch_18424469(OEK) kpatch_19749756(OEK)
[3966658.473462] idxd_mdev
[3966658.482306] kpatch_17971294(OEK) sch_ingress xt_conntrack amdgpu(OE) amdxcp(OE) amddrm_buddy(OE) amd_sched(OE) amdttm(OE) amdkcl(OE) intel_ifs iptable_mangle tcm_loop target_core_pscsi tcp_diag target_core_file inet_diag target_core_iblock target_core_user target_core_mod coldpgs kpatch_18383292(OEK) ip6table_nat ip6table_filter ip6_tables ip_set_hash_ipportip ip_set_hash_ipportnet ip_set_hash_ipport ip_set_bitmap_port xt_comment iptable_nat nf_nat iptable_filter ip_tables ip_set ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 sn_core_odd(OE) i40e overlay binfmt_misc tun bonding(OE) aisqos(OE) aisqos_hotfixes(OE) rfkill uio_pci_generic uio cuse fuse nf_tables nfnetlink intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common i10nm_edac nfit x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm idxd_mdev
[3966658.491237] vfio_pci
[3966658.501196] vfio_pci vfio_virqfd mdev vfio_iommu_type1 vfio iax_crypto intel_pmt_telemetry iTCO_wdt intel_pmt_class iTCO_vendor_support irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl intel_cstate snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_hwdep snd_seq
[3966658.508537] vfio_virqfd
[3966658.517569] snd_seq_device ipmi_ssif isst_if_mbox_pci isst_if_mmio pcspkr snd_pcm idxd intel_uncore ses isst_if_common intel_vsec idxd_bus enclosure snd_timer mei_me snd i2c_i801 i2c_smbus mei i2c_ismt soundcore joydev acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad vfat fat
[3966658.526851] mdev
[3966658.536096] nfsd auth_rpcgss nfs_acl lockd grace slb_vtoa(OE) sunrpc dm_mod hookers mlx5_ib(OE) ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm_ttm_helper ttm mlx5_core(OE) mlxfw(OE)
[3966658.540381] vfio_iommu_type1
[3966658.544341] nvme mpt3sas tls drm nvme_core pci_hyperv_intf raid_class psample libcrc32c crc32c_intel mlxdevm(OE) i2c_core
[3966658.551254] vfio
[3966658.558742] scsi_transport_sas wmi pinctrl_emmitsburg sd_mod t10_pi sg ahci libahci libata rdma_ucm(OE) ib_uverbs(OE) rdma_cm(OE) iw_cm(OE) ib_cm(OE) ib_umad(OE) ib_core(OE) ib_ucm(OE) mlx_compat(OE)
[3966658.563004] iax_crypto
[3966658.570988] [last unloaded: diagnose]
[3966658.571027] ---[ end trace cc9dbb180f9ae537 ]---

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c  | 9 +++++++++
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h    | 2 ++
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 4 ++++
 3 files changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 051a00152b08..488c8c0e6ccd 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -1493,6 +1493,15 @@ int kgd2kfd_check_and_lock_kfd(struct kfd_dev *kfd)
 
 	mutex_lock(&kfd_processes_mutex);
 
+	/* kfd_processes_count is per kfd_dev, return -EBUSY without
+	 * further check
+	 */
+	if (!!atomic_read(&kfd->kfd_processes_count)) {
+		pr_debug("process_wq_release not finished\n");
+		r = -EBUSY;
+		goto out;
+	}
+
 	if (hash_empty(kfd_processes_table) && !kfd_is_locked(kfd))
 		goto out;
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index d01ef5ac0766..70ef051511bb 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -382,6 +382,8 @@ struct kfd_dev {
 
 	/* for dynamic partitioning */
 	int kfd_dev_lock;
+
+	atomic_t kfd_processes_count;
 };
 
 enum kfd_mempool {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 5be28c6c4f6a..ddfe30c13e9d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -1088,6 +1088,8 @@ static void kfd_process_destroy_pdds(struct kfd_process *p)
 		pdd->runtime_inuse = false;
 	}
 
+	atomic_dec(&pdd->dev->kfd->kfd_processes_count);
+
 	kfree(pdd);
 	p->pdds[i] = NULL;
 }
@@ -1649,6 +1651,8 @@ struct kfd_process_device *kfd_create_process_device_data(struct kfd_node *dev,
 	/* Init idr used for memory handle translation */
 	idr_init(&pdd->alloc_idr);
 
+	atomic_inc(&dev->kfd->kfd_processes_count);
+
 	return pdd;
 }
-- 
2.43.0

^ permalink raw reply related	[flat|nested] 16+ messages in thread
* Re: [PATCH v4 2/2] amd/amdkfd: enhance kfd process check in switch partition
  2025-09-24 15:29 ` [PATCH v4 2/2] amd/amdkfd: enhance kfd process check in switch partition Yifan Zhang
@ 2025-09-24 23:14   ` Philip Yang
  2025-09-26 19:52   ` Chen, Xiaogang
  1 sibling, 0 replies; 16+ messages in thread
From: Philip Yang @ 2025-09-24 23:14 UTC (permalink / raw)
  To: Yifan Zhang, amd-gfx
  Cc: Alexander.Deucher, Felix.Kuehling, Philip.Yang, Lijo.Lazar

On 2025-09-24 11:29, Yifan Zhang wrote:
> current switch partition only check if kfd_processes_table is empty.
> kfd_prcesses_table entry is deleted in kfd_process_notifier_release, but
> kfd_process tear down is in kfd_process_wq_release.
>
> consider two processes:
>
> Process A (workqueue) -> kfd_process_wq_release -> Access kfd_node member
> Process B switch partition -> amdgpu_xcp_pre_partition_switch -> amdgpu_amdkfd_device_fini_sw
> -> kfd_node tear down.
>
> Process A and B may trigger a race as shown in dmesg log.
>
> This patch is to resolve the race by adding an atomic kfd_process counter
> kfd_processes_count, it increment as create kfd process, decrement as
> finish kfd_process_wq_release.
>
> v2: Put kfd_processes_count per kfd_dev, move decrement to kfd_process_destroy_pdds
> and bug fix.
(Philip Yang)
>
> [snip dmesg log quoted in full above]
>
> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>

Because the kfd structure is kzalloc'ed, atomic_set to 0 is not needed on
x86; I don't know about other platforms.
To be safe:

@@ -495,6 +495,7 @@ struct kfd_dev *kgd2kfd_probe(struct amdgpu_device *adev, bool vf)
 	mutex_init(&kfd->doorbell_mutex);
 	ida_init(&kfd->doorbell_ida);
+	atomic_set(&kfd->kfd_processes_count, 0);
 
 	return kfd;
 }

With this added, this patch is

Reviewed-by: Philip Yang <Philip.Yang@amd.com>

> [snip patch diff quoted in full above]

^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: [PATCH v4 2/2] amd/amdkfd: enhance kfd process check in switch partition
  2025-09-24 15:29 ` [PATCH v4 2/2] amd/amdkfd: enhance kfd process check in switch partition Yifan Zhang
  2025-09-24 23:14   ` Philip Yang
@ 2025-09-26 19:52   ` Chen, Xiaogang
  2025-09-29 13:09     ` Philip Yang
  1 sibling, 1 reply; 16+ messages in thread
From: Chen, Xiaogang @ 2025-09-26 19:52 UTC (permalink / raw)
  To: Yifan Zhang, amd-gfx
  Cc: Alexander.Deucher, Felix.Kuehling, Philip.Yang, Lijo.Lazar

On 9/24/2025 10:29 AM, Yifan Zhang wrote:
> current switch partition only check if kfd_processes_table is empty.
> kfd_prcesses_table entry is deleted in kfd_process_notifier_release, but
> kfd_process tear down is in kfd_process_wq_release.
>
> consider two processes:
>
> Process A (workqueue) -> kfd_process_wq_release -> Access kfd_node member
> Process B switch partition -> amdgpu_xcp_pre_partition_switch -> amdgpu_amdkfd_device_fini_sw
> -> kfd_node tear down.
>
> Process A and B may trigger a race as shown in dmesg log.
>
> This patch is to resolve the race by adding an atomic kfd_process counter
> kfd_processes_count, it increment as create kfd process, decrement as
> finish kfd_process_wq_release.
>
> v2: Put kfd_processes_count per kfd_dev, move decrement to kfd_process_destroy_pdds
> and bug fix.
(Philip Yang)
>
> [snip dmesg log quoted in full above]
>
> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
> ---
> [snip kfd_device.c hunk quoted in full above]
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> index d01ef5ac0766..70ef051511bb 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> @@ -382,6 +382,8 @@ struct kfd_dev {
>
>  	/* for dynamic partitioning */
>  	int kfd_dev_lock;
> +
> +	atomic_t kfd_processes_count;

Why do we need to add a kfd process count per kfd_dev? A kfd process uses
all kfd nodes on the system. Is there a case where a kfd process uses only
some kfd_devs?

This does not seem to be the root cause or the solution.

Regards

Xiaogang

>  };
>
>  enum kfd_mempool {
> [snip kfd_process.c hunk quoted in full above]

^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: [PATCH v4 2/2] amd/amdkfd: enhance kfd process check in switch partition 2025-09-26 19:52 ` Chen, Xiaogang @ 2025-09-29 13:09 ` Philip Yang 0 siblings, 0 replies; 16+ messages in thread From: Philip Yang @ 2025-09-29 13:09 UTC (permalink / raw) To: Chen, Xiaogang, Yifan Zhang, amd-gfx Cc: Alexander.Deucher, Felix.Kuehling, Philip.Yang, Lijo.Lazar On 2025-09-26 15:52, Chen, Xiaogang wrote: > > On 9/24/2025 10:29 AM, Yifan Zhang wrote: >> current switch partition only check if kfd_processes_table is empty. >> kfd_prcesses_table entry is deleted in kfd_process_notifier_release, but >> kfd_process tear down is in kfd_process_wq_release. >> >> consider two processes: >> >> Process A (workqueue) -> kfd_process_wq_release -> Access kfd_node >> member >> Process B switch partition -> amdgpu_xcp_pre_partition_switch -> >> amdgpu_amdkfd_device_fini_sw >> -> kfd_node tear down. >> >> Process A and B may trigger a race as shown in dmesg log. >> >> This patch is to resolve the race by adding an atomic kfd_process >> counter >> kfd_processes_count, it increment as create kfd process, decrement as >> finish kfd_process_wq_release. >> >> v2: Put kfd_processes_count per kfd_dev, move decrement to >> kfd_process_destroy_pdds >> and bug fix. 
(Philip Yang) >> >> [3966658.307702] divide error: 0000 [#1] SMP NOPTI >> [3966658.350818] i10nm_edac >> [3966658.356318] CPU: 124 PID: 38435 Comm: kworker/124:0 Kdump: >> loaded Tainted >> [3966658.356890] Workqueue: kfd_process_wq kfd_process_wq_release >> [amdgpu] >> [3966658.362839] nfit >> [3966658.366457] RIP: 0010:kfd_get_num_sdma_engines+0x17/0x40 [amdgpu] >> [3966658.366460] Code: 00 00 e9 ac 81 02 00 66 66 2e 0f 1f 84 00 00 >> 00 00 00 90 0f 1f 44 00 00 48 8b 4f 08 48 8b b7 00 01 00 00 8b 81 58 >> 26 03 00 99 <f7> be b8 01 00 00 80 b9 70 2e 00 00 00 74 0b 83 f8 02 >> ba 02 00 00 >> [3966658.380967] x86_pkg_temp_thermal >> [3966658.391529] RSP: 0018:ffffc900a0edfdd8 EFLAGS: 00010246 >> [3966658.391531] RAX: 0000000000000008 RBX: ffff8974e593b800 RCX: >> ffff888645900000 >> [3966658.391531] RDX: 0000000000000000 RSI: ffff888129154400 RDI: >> ffff888129151c00 >> [3966658.391532] RBP: ffff8883ad79d400 R08: 0000000000000000 R09: >> ffff8890d2750af4 >> [3966658.391532] R10: 0000000000000018 R11: 0000000000000018 R12: >> 0000000000000000 >> [3966658.391533] R13: ffff8883ad79d400 R14: ffffe87ff662ba00 R15: >> ffff8974e593b800 >> [3966658.391533] FS: 0000000000000000(0000) >> GS:ffff88fe7f600000(0000) knlGS:0000000000000000 >> [3966658.391534] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> [3966658.391534] CR2: 0000000000d71000 CR3: 000000dd0e970004 CR4: >> 0000000002770ee0 >> [3966658.391535] DR0: 0000000000000000 DR1: 0000000000000000 DR2: >> 0000000000000000 >> [3966658.391535] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: >> 0000000000000400 >> [3966658.391536] PKRU: 55555554 >> [3966658.391536] Call Trace: >> [3966658.391674] deallocate_sdma_queue+0x38/0xa0 [amdgpu] >> [3966658.391762] process_termination_cpsch+0x1ed/0x480 [amdgpu] >> [3966658.399754] intel_powerclamp >> [3966658.402831] kfd_process_dequeue_from_all_devices+0x5b/0xc0 >> [amdgpu] >> [3966658.402908] kfd_process_wq_release+0x1a/0x1a0 [amdgpu] >> [3966658.410516] coretemp >> 
[3966658.434016] process_one_work+0x1ad/0x380 >> [3966658.434021] worker_thread+0x49/0x310 >> [3966658.438963] kvm_intel >> [3966658.446041] ? process_one_work+0x380/0x380 >> [3966658.446045] kthread+0x118/0x140 >> [3966658.446047] ? __kthread_bind_mask+0x60/0x60 >> [3966658.446050] ret_from_fork+0x1f/0x30 >> [3966658.446053] Modules linked in: kpatch_20765354(OEK) >> [3966658.455310] kvm >> [3966658.464534] mptcp_diag xsk_diag raw_diag unix_diag >> af_packet_diag netlink_diag udp_diag act_pedit act_mirred act_vlan >> cls_flower kpatch_21951273(OEK) kpatch_18424469(OEK) >> kpatch_19749756(OEK) >> [3966658.473462] idxd_mdev >> [3966658.482306] kpatch_17971294(OEK) sch_ingress xt_conntrack >> amdgpu(OE) amdxcp(OE) amddrm_buddy(OE) amd_sched(OE) amdttm(OE) >> amdkcl(OE) intel_ifs iptable_mangle tcm_loop target_core_pscsi >> tcp_diag target_core_file inet_diag target_core_iblock >> target_core_user target_core_mod coldpgs kpatch_18383292(OEK) >> ip6table_nat ip6table_filter ip6_tables ip_set_hash_ipportip >> ip_set_hash_ipportnet ip_set_hash_ipport ip_set_bitmap_port >> xt_comment iptable_nat nf_nat iptable_filter ip_tables ip_set >> ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs nf_conntrack nf_defrag_ipv6 >> nf_defrag_ipv4 sn_core_odd(OE) i40e overlay binfmt_misc tun >> bonding(OE) aisqos(OE) aisqos_hotfixes(OE) rfkill uio_pci_generic uio >> cuse fuse nf_tables nfnetlink intel_rapl_msr intel_rapl_common >> intel_uncore_frequency intel_uncore_frequency_common i10nm_edac nfit >> x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm idxd_mdev >> [3966658.491237] vfio_pci >> [3966658.501196] vfio_pci vfio_virqfd mdev vfio_iommu_type1 vfio >> iax_crypto intel_pmt_telemetry iTCO_wdt intel_pmt_class >> iTCO_vendor_support irqbypass crct10dif_pclmul crc32_pclmul >> ghash_clmulni_intel rapl intel_cstate snd_hda_intel snd_intel_dspcfg >> snd_hda_codec snd_hda_core snd_hwdep snd_seq >> [3966658.508537] vfio_virqfd >> [3966658.517569] snd_seq_device ipmi_ssif isst_if_mbox_pci >> 
isst_if_mmio pcspkr snd_pcm idxd intel_uncore ses isst_if_common >> intel_vsec idxd_bus enclosure snd_timer mei_me snd i2c_i801 i2c_smbus >> mei i2c_ismt soundcore joydev acpi_ipmi ipmi_si ipmi_devintf >> ipmi_msghandler acpi_power_meter acpi_pad vfat fat >> [3966658.526851] mdev >> [3966658.536096] nfsd auth_rpcgss nfs_acl lockd grace slb_vtoa(OE) >> sunrpc dm_mod hookers mlx5_ib(OE) ast i2c_algo_bit drm_vram_helper >> drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops >> drm_ttm_helper ttm mlx5_core(OE) mlxfw(OE) >> [3966658.540381] vfio_iommu_type1 >> [3966658.544341] nvme mpt3sas tls drm nvme_core pci_hyperv_intf >> raid_class psample libcrc32c crc32c_intel mlxdevm(OE) i2c_core >> [3966658.551254] vfio >> [3966658.558742] scsi_transport_sas wmi pinctrl_emmitsburg sd_mod >> t10_pi sg ahci libahci libata rdma_ucm(OE) ib_uverbs(OE) rdma_cm(OE) >> iw_cm(OE) ib_cm(OE) ib_umad(OE) ib_core(OE) ib_ucm(OE) mlx_compat(OE) >> [3966658.563004] iax_crypto >> [3966658.570988] [last unloaded: diagnose] >> [3966658.571027] ---[ end trace cc9dbb180f9ae537 ]--- >> >> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> >> --- >> drivers/gpu/drm/amd/amdkfd/kfd_device.c | 9 +++++++++ >> drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 2 ++ >> drivers/gpu/drm/amd/amdkfd/kfd_process.c | 4 ++++ >> 3 files changed, 15 insertions(+) >> >> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c >> b/drivers/gpu/drm/amd/amdkfd/kfd_device.c >> index 051a00152b08..488c8c0e6ccd 100644 >> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c >> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c >> @@ -1493,6 +1493,15 @@ int kgd2kfd_check_and_lock_kfd(struct kfd_dev >> *kfd) >> mutex_lock(&kfd_processes_mutex); >> + /* kfd_processes_count is per kfd_dev, return -EBUSY without >> + * further check >> + */ >> + if (!!atomic_read(&kfd->kfd_processes_count)) { >> + pr_debug("process_wq_release not finished\n"); >> + r = -EBUSY; >> + goto out; >> + } >> + >> if (hash_empty(kfd_processes_table) && 
>> !kfd_is_locked(kfd))
>>  		goto out;
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>> index d01ef5ac0766..70ef051511bb 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
>> @@ -382,6 +382,8 @@ struct kfd_dev {
>>  	/* for dynamic partitioning */
>>  	int kfd_dev_lock;
>> +
>> +	atomic_t kfd_processes_count;
>
> Why do we need a kfd process count per kfd_dev? A kfd process uses all
> kfd nodes on the system. Is there a case where a kfd process uses only
> some kfd_devs?
>
Yes, a cgroup can exclude devices from a process. A per-kfd_dev count
allows a device partition switch while a process is still running, as
long as that process's cgroup excludes this device.

Regards,
Philip

> It does not seem to be the root cause or the solution.
>
> Regards
>
> Xiaogang

>>  };
>>  enum kfd_mempool {
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> index 5be28c6c4f6a..ddfe30c13e9d 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>> @@ -1088,6 +1088,8 @@ static void kfd_process_destroy_pdds(struct kfd_process *p)
>>  		pdd->runtime_inuse = false;
>>  	}
>> +	atomic_dec(&pdd->dev->kfd->kfd_processes_count);
>> +
>>  	kfree(pdd);
>>  	p->pdds[i] = NULL;
>>  }
>> @@ -1649,6 +1651,8 @@ struct kfd_process_device *kfd_create_process_device_data(struct kfd_node *dev,
>>  	/* Init idr used for memory handle translation */
>>  	idr_init(&pdd->alloc_idr);
>> +	atomic_inc(&dev->kfd->kfd_processes_count);
>> +
>>  	return pdd;
>>  }

^ permalink raw reply [flat|nested] 16+ messages in thread
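The per-device counting scheme in the patch above can be sketched in plain userspace C to show the intended gating. All `sim_*` names and the open-coded EBUSY value are invented for this sketch; this is not the real KFD API.

```c
#include <assert.h>
#include <stdatomic.h>

#define SIM_EBUSY 16  /* stand-in for the kernel's EBUSY */

struct sim_dev {
	atomic_int kfd_processes_count;  /* one reference per pdd, as in the patch */
};

/* Mirrors the kfd_create_process_device_data() hunk: take a per-device reference. */
static void sim_pdd_create(struct sim_dev *dev)
{
	atomic_fetch_add(&dev->kfd_processes_count, 1);
}

/* Mirrors the kfd_process_destroy_pdds() hunk: drop the reference with the pdd. */
static void sim_pdd_destroy(struct sim_dev *dev)
{
	atomic_fetch_sub(&dev->kfd_processes_count, 1);
}

/* Mirrors the kgd2kfd_check_and_lock_kfd() hunk: refuse to lock for a
 * partition switch while any process still holds a pdd on this device. */
static int sim_check_and_lock(struct sim_dev *dev)
{
	if (atomic_load(&dev->kfd_processes_count) != 0)
		return -SIM_EBUSY;
	return 0;  /* would proceed to take the lock here */
}
```

The point Xiaogang raises still applies: this only changes *when* the switch is refused, it does not by itself order teardown against interrupts.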
* Re: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw
  2025-09-24 15:29 [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw Yifan Zhang
  2025-09-24 15:29 ` [PATCH v4 2/2] amd/amdkfd: enhance kfd process check in switch partition Yifan Zhang
@ 2025-09-24 22:48 ` Philip Yang
  2025-09-25  6:19   ` Lazar, Lijo
  2025-09-25 19:11   ` Chen, Xiaogang
  1 sibling, 2 replies; 16+ messages in thread
From: Philip Yang @ 2025-09-24 22:48 UTC (permalink / raw)
To: Yifan Zhang, amd-gfx
Cc: Alexander.Deucher, Felix.Kuehling, Philip.Yang, Lijo.Lazar

On 2025-09-24 11:29, Yifan Zhang wrote:
> There is a race between amdgpu_amdkfd_device_fini_sw and interrupt
> handling: if amdgpu_amdkfd_device_fini_sw runs in between
> kfd_cleanup_nodes and kfree(kfd), a KGD interrupt can still be
> generated and dereference a freed node.
>
> kernel panic log:
>
> BUG: kernel NULL pointer dereference, address: 0000000000000098
> amdgpu 0000:c8:00.0: amdgpu: Requesting 4 partitions through PSP
> PGD d78c68067 P4D d78c68067
> kfd kfd: amdgpu: Allocated 3969056 bytes on gart
> PUD 1465b8067 PMD 0
> Oops: 0002 [#1] SMP NOPTI
> kfd kfd: amdgpu: Total number of KFD nodes to be created: 4
> CPU: 115 PID: 0 Comm: swapper/115 Kdump: loaded Tainted: G S W OE K
> RIP: 0010:_raw_spin_lock_irqsave+0x12/0x40
> Code: 89 e0 41 5c c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 41 54 9c 41 5c fa 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 8a 4c 89 e0 41 5c 89 c6 e8 07 38 5d
> RSP: 0018:ffffc9001a6b0e28 EFLAGS: 00010046
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000018
> RDX: 0000000000000001 RSI: ffff8883bb623e00 RDI: 0000000000000098
> RBP: ffff8883bb000000 R08: ffff888100055020 R09: ffff888100055020
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
> R13: ffff888f2b97da00 R14: 0000000000000098 R15: ffff8883babdf000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000098 CR3: 0000000e7cae2006 CR4: 0000000002770ce0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
> PKRU: 55555554
> Call Trace:
>  <IRQ>
>  kgd2kfd_interrupt+0x6b/0x1f0 [amdgpu]
>  ? amdgpu_fence_process+0xa4/0x150 [amdgpu]
> kfd kfd: amdgpu: Node: 0, interrupt_bitmap: 3
>  amdgpu_irq_dispatch+0x165/0x210 [amdgpu]
>  amdgpu_ih_process+0x80/0x100 [amdgpu]
> amdgpu: Virtual CRAT table created for GPU
>  amdgpu_irq_handler+0x1f/0x60 [amdgpu]
>  __handle_irq_event_percpu+0x3d/0x170
> amdgpu: Topology: Add dGPU node [0x74a2:0x1002]
>  handle_irq_event+0x5a/0xc0
>  handle_edge_irq+0x93/0x240
> kfd kfd: amdgpu: KFD node 1 partition 0 size 49148M
>  asm_call_irq_on_stack+0xf/0x20
>  </IRQ>
>  common_interrupt+0xb3/0x130
>  asm_common_interrupt+0x1e/0x40
> 5.10.134-010.ali5000.al8.x86_64 #1
>
> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index 349c351e242b..051a00152b08 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -1133,7 +1133,15 @@ void kgd2kfd_interrupt(struct kfd_dev *kfd, const void *ih_ring_entry)
>  	}
>
>  	for (i = 0; i < kfd->num_nodes; i++) {
> -		node = kfd->nodes[i];
> +		/* Race if another thread in b/w
> +		 * kfd_cleanup_nodes and kfree(kfd),
> +		 * when kfd->nodes[i] = NULL
> +		 */
> +		if (kfd->nodes[i])
> +			node = kfd->nodes[i];
> +		else
> +			return;
> +
>  		spin_lock_irqsave(&node->interrupt_lock, flags);
>
>  		if (node->interrupts_active

^ permalink raw reply [flat|nested] 16+ messages in thread
* RE: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw
  2025-09-24 22:48 ` [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw Philip Yang
@ 2025-09-25  6:19   ` Lazar, Lijo
  2025-09-25  6:41     ` Zhang, Yifan
  2025-09-25 19:11   ` Chen, Xiaogang
  1 sibling, 1 reply; 16+ messages in thread
From: Lazar, Lijo @ 2025-09-25 6:19 UTC (permalink / raw)
To: Yang, Philip, Zhang, Yifan, amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander, Kuehling, Felix

[Public]

> Race if another thread in b/w kfd_cleanup_nodes

This code is there before cleanup of nodes.

	flush_workqueue(kfd->ih_wq);
	destroy_workqueue(kfd->ih_wq);

Shouldn't the interrupt handling code check if interrupt handling is enabled rather than checking individual node NULL pointers?

Thanks,
Lijo

-----Original Message-----
From: Yang, Philip <Philip.Yang@amd.com>
Sent: Thursday, September 25, 2025 4:19 AM
To: Zhang, Yifan <Yifan1.Zhang@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix <Felix.Kuehling@amd.com>; Yang, Philip <Philip.Yang@amd.com>; Lazar, Lijo <Lijo.Lazar@amd.com>
Subject: Re: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw

On 2025-09-24 11:29, Yifan Zhang wrote:
> There is race in amdgpu_amdkfd_device_fini_sw and interrupt.
> if amdgpu_amdkfd_device_fini_sw run in b/w kfd_cleanup_nodes and
> kfree(kfd), and KGD interrupt generated.
> [snip]

^ permalink raw reply [flat|nested] 16+ messages in thread
* RE: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw
  2025-09-25  6:19 ` Lazar, Lijo
@ 2025-09-25  6:41   ` Zhang, Yifan
  2025-09-25  6:58     ` Lazar, Lijo
  0 siblings, 1 reply; 16+ messages in thread
From: Zhang, Yifan @ 2025-09-25 6:41 UTC (permalink / raw)
To: Lazar, Lijo, Yang, Philip, amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander, Kuehling, Felix

[Public]

The interrupts come from KGD and are still active after the ih_wq flush,
even once kfd_dev is freed. And after the knode is freed,
node->interrupts_active is also inaccessible.

The race condition is as below:

Interrupt handling                            switch partition process
        |                                             |
        |                                     flush_workqueue(kfd->ih_wq);
        |                                     destroy_workqueue(kfd->ih_wq);
amdgpu_irq_dispatch                                   |
  amdgpu_amdkfd_interrupt                             |
    kgd2kfd_interrupt                                 |
        |                                     kfd_cleanup_nodes
        |                                     kfree(knode);
spin_lock_irqsave(&node->interrupt_lock, flags); /* NULL pointer */

-----Original Message-----
From: Lazar, Lijo <Lijo.Lazar@amd.com>
Sent: Thursday, September 25, 2025 2:19 PM
To: Yang, Philip <Philip.Yang@amd.com>; Zhang, Yifan <Yifan1.Zhang@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix <Felix.Kuehling@amd.com>
Subject: RE: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw

[Public]

> Race if another thread in b/w kfd_cleanup_nodes

This code is there before cleanup of nodes.

	flush_workqueue(kfd->ih_wq);
	destroy_workqueue(kfd->ih_wq);

Shouldn't the interrupt handling code check if interrupt handling is enabled rather than checking individual node NULL pointers?
Thanks,
Lijo

-----Original Message-----
From: Yang, Philip <Philip.Yang@amd.com>
Sent: Thursday, September 25, 2025 4:19 AM
To: Zhang, Yifan <Yifan1.Zhang@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix <Felix.Kuehling@amd.com>; Yang, Philip <Philip.Yang@amd.com>; Lazar, Lijo <Lijo.Lazar@amd.com>
Subject: Re: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw

[snip]

^ permalink raw reply [flat|nested] 16+ messages in thread
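For illustration, the per-node NULL check the patch adds can be modeled single-threaded in userspace C. The `sim_*` names are invented for this sketch, and a single-threaded model cannot show the real race: teardown must still publish NULL before freeing, with the ordering/locking discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SIM_NUM_NODES 4  /* illustrative, matches the 4-partition log above */

struct sim_node { int interrupts_handled; };

struct sim_kfd {
	struct sim_node *nodes[SIM_NUM_NODES];
};

/* Mirrors the kgd2kfd_interrupt() hunk: stop at the first NULL slot
 * instead of dereferencing a node already freed by kfd_cleanup_nodes(). */
static void sim_interrupt(struct sim_kfd *kfd)
{
	for (int i = 0; i < SIM_NUM_NODES; i++) {
		struct sim_node *node = kfd->nodes[i];

		if (!node)
			return;  /* teardown already ran for this slot */
		node->interrupts_handled++;
	}
}

/* Mirrors kfd_cleanup_nodes(): publish NULL in the slot before freeing. */
static void sim_cleanup(struct sim_kfd *kfd)
{
	for (int i = 0; i < SIM_NUM_NODES; i++) {
		struct sim_node *n = kfd->nodes[i];

		kfd->nodes[i] = NULL;
		free(n);
	}
}
```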
* RE: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw
  2025-09-25  6:41 ` Zhang, Yifan
@ 2025-09-25  6:58   ` Lazar, Lijo
  2025-09-25  7:06     ` Lazar, Lijo
  0 siblings, 1 reply; 16+ messages in thread
From: Lazar, Lijo @ 2025-09-25 6:58 UTC (permalink / raw)
To: Zhang, Yifan, Yang, Philip, amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander, Kuehling, Felix

[Public]

I meant something like this.

	destroy_workqueue(kfd->ih_wq);
	kfd->ih_wq = NULL;

Then check for NULL at the beginning of kgd2kfd_interrupt. If there is no IH workqueue, then there is no interrupt handling capability.

May also be within the loop. Not sure if that is really required; if some work is already scheduled previously, it should be inside flush_workqueue() of cleanup_nodes.
The race condition is as below: Interrupt handling switch partition process | | flush_workqueue(kfd->ih_wq); | destroy_workqueue(kfd->ih_wq); amdgpu_irq_dispatch | amdgpu_amdkfd_interrupt | kgd2kfd_interrupt | | kfd_cleanup_nodes | kfree(knode); spin_lock_irqsave(&node->interrupt_lock, flags); | //NULL Pointer -----Original Message----- From: Lazar, Lijo <Lijo.Lazar@amd.com> Sent: Thursday, September 25, 2025 2:19 PM To: Yang, Philip <Philip.Yang@amd.com>; Zhang, Yifan <Yifan1.Zhang@amd.com>; amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix <Felix.Kuehling@amd.com> Subject: RE: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw [Public] > Race if another thread in b/w kfd_cleanup_nodes This code is there before cleanup of nodes. flush_workqueue(kfd->ih_wq); destroy_workqueue(kfd->ih_wq); Shouldn't the interrupt handling code check if interrupt handling is enabled rather than checking individual node NULL pointers? Thanks, Lijo -----Original Message----- From: Yang, Philip <Philip.Yang@amd.com> Sent: Thursday, September 25, 2025 4:19 AM To: Zhang, Yifan <Yifan1.Zhang@amd.com>; amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix <Felix.Kuehling@amd.com>; Yang, Philip <Philip.Yang@amd.com>; Lazar, Lijo <Lijo.Lazar@amd.com> Subject: Re: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw On 2025-09-24 11:29, Yifan Zhang wrote: > There is race in amdgpu_amdkfd_device_fini_sw and interrupt. > if amdgpu_amdkfd_device_fini_sw run in b/w kfd_cleanup_nodes and > kfree(kfd), and KGD interrupt generated. 
> > kernel panic log: > > BUG: kernel NULL pointer dereference, address: 0000000000000098 amdgpu > 0000:c8:00.0: amdgpu: Requesting 4 partitions through PSP > > PGD d78c68067 P4D d78c68067 > > kfd kfd: amdgpu: Allocated 3969056 bytes on gart > > PUD 1465b8067 PMD @ > > Oops: @002 [#1] SMP NOPTI > > kfd kfd: amdgpu: Total number of KFD nodes to be created: 4 > CPU: 115 PID: @ Comm: swapper/115 Kdump: loaded Tainted: G S W OE K > > RIP: 0010:_raw_spin_lock_irqsave+0x12/0x40 > > Code: 89 e@ 41 5c c3 cc cc cc cc 66 66 2e Of 1f 84 00 00 00 00 00 OF > 1f 40 00 Of 1f 44% 00 00 41 54 9c 41 5c fa 31 cO ba 01 00 00 00 <fO> > OF b1 17 75 Ba 4c 89 e@ 41 Sc > > 89 c6 e8 07 38 5d > > RSP: 0018: ffffc90@1a6b0e28 EFLAGS: 00010046 > > RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000018 > 0000000000000001 RSI: ffff8883bb623e00 RDI: 0000000000000098 > ffff8883bb000000 RO8: ffff888100055020 ROO: ffff888100055020 > 0000000000000000 R11: 0000000000000000 R12: 0900000000000002 > ffff888F2b97da0@ R14: @000000000000098 R15: ffff8883babdfo00 > > CS: 010 DS: 0000 ES: 0000 CRO: 0000000080050033 > > CR2: 0000000000000098 CR3: 0000000e7cae2006 CR4: 0000000002770ce0 > 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > 0000000000000000 DR6: 00000000fffeO7FO DR7: 0000000000000400 > > PKRU: 55555554 > > Call Trace: > > <IRQ> > > kgd2kfd_interrupt+@x6b/0x1f@ [amdgpu] > > ? 
amdgpu_fence_process+0xa4/0x150 [amdgpu] > > kfd kfd: amdgpu: Node: 0, interrupt_bitmap: 3 YcpxFl Rant tErace > > amdgpu_irq_dispatch+0x165/0x210 [amdgpu] > > amdgpu_ih_process+0x80/0x100 [amdgpu] > > amdgpu: Virtual CRAT table created for GPU > > amdgpu_irq_handler+0x1f/@x60 [amdgpu] > > __handle_irq_event_percpu+0x3d/0x170 > > amdgpu: Topology: Add dGPU node [0x74a2:0x1002] > > handle_irq_event+0x5a/@xcO > > handle_edge_irq+0x93/0x240 > > kfd kfd: amdgpu: KFD node 1 partition @ size 49148M > > asm_call_irq_on_stack+0xf/@x20 > > </IRQ> > > common_interrupt+0xb3/0x130 > > asm_common_interrupt+0x1le/0x40 > > 5.10.134-010.a1i5000.a18.x86_64 #1 > > Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Philip Yang<Philip.Yang@amd.com> > --- > drivers/gpu/drm/amd/amdkfd/kfd_device.c | 10 +++++++++- > 1 file changed, 9 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c > b/drivers/gpu/drm/amd/amdkfd/kfd_device.c > index 349c351e242b..051a00152b08 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c > @@ -1133,7 +1133,15 @@ void kgd2kfd_interrupt(struct kfd_dev *kfd, const void *ih_ring_entry) > } > > for (i = 0; i < kfd->num_nodes; i++) { > - node = kfd->nodes[i]; > + /* Race if another thread in b/w > + * kfd_cleanup_nodes and kfree(kfd), > + * when kfd->nodes[i] = NULL > + */ > + if (kfd->nodes[i]) > + node = kfd->nodes[i]; > + else > + return; > + > spin_lock_irqsave(&node->interrupt_lock, flags); > > if (node->interrupts_active ^ permalink raw reply [flat|nested] 16+ messages in thread
* RE: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw
  2025-09-25  6:58 ` Lazar, Lijo
@ 2025-09-25  7:06   ` Lazar, Lijo
  2025-09-25  9:54     ` Zhang, Yifan
  0 siblings, 1 reply; 16+ messages in thread
From: Lazar, Lijo @ 2025-09-25 7:06 UTC (permalink / raw)
To: Lazar, Lijo, Zhang, Yifan, Yang, Philip, amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander, Kuehling, Felix

[Public]

On a second thought, probably this will require some locking. For ex: it's quite possible that destroy_workqueue(kfd->ih_wq) and queue_work(node->kfd->ih_wq, &node->interrupt_work) could be happening back-to-back. Node is not yet deleted.

Thanks,
Lijo

-----Original Message-----
From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of Lazar, Lijo
Sent: Thursday, September 25, 2025 12:29 PM
To: Zhang, Yifan <Yifan1.Zhang@amd.com>; Yang, Philip <Philip.Yang@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix <Felix.Kuehling@amd.com>
Subject: RE: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw

[Public]

I meant something like this.

	destroy_workqueue(kfd->ih_wq);
	kfd->ih_wq = NULL;

Then check for NULL at the beginning of kgd2kfd_interrupt. If there is no IH workqueue, then there is no interrupt handling capability.

May also be within the loop. Not sure if that is really required; if some work is already scheduled previously, it should be inside flush_workqueue() of cleanup_nodes.
	for (i = 0; kfd->ih_wq && i < kfd->num_nodes; i++)

Thanks,
Lijo

-----Original Message-----
From: Zhang, Yifan <Yifan1.Zhang@amd.com>
Sent: Thursday, September 25, 2025 12:11 PM
To: Lazar, Lijo <Lijo.Lazar@amd.com>; Yang, Philip <Philip.Yang@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix <Felix.Kuehling@amd.com>
Subject: RE: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw

[snip]

^ permalink raw reply [flat|nested] 16+ messages in thread
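The ordering concern raised here — no queue_work() may land after the workqueue is destroyed — can be modeled with an explicit active flag that teardown clears first. This is a userspace sketch with C11 atomics; the `sim_*` names are invented, and the flush/destroy calls are elided as comments.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct sim_node3 {
	atomic_bool interrupts_active;
	int queued;
};

/* Producer side (interrupt path): only queue work while the node is active. */
static void sim_try_queue(struct sim_node3 *n)
{
	if (atomic_load(&n->interrupts_active))
		n->queued++;  /* stands in for queue_work(kfd->ih_wq, ...) */
}

/* Teardown side: clear the flag first, so no new work can be queued;
 * only then is it safe to flush and destroy the workqueue. */
static void sim_teardown(struct sim_node3 *n)
{
	atomic_store(&n->interrupts_active, false);
	/* flush_workqueue(kfd->ih_wq); destroy_workqueue(kfd->ih_wq); */
}
```

A producer that checked the flag just before teardown cleared it can still be mid-queue; closing that window is exactly why locking (or a flush after the flag flip) is needed, as the reply says.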
* RE: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw
  2025-09-25  7:06 ` Lazar, Lijo
@ 2025-09-25  9:54   ` Zhang, Yifan
  2025-09-26  6:49     ` Lazar, Lijo
  0 siblings, 1 reply; 16+ messages in thread
From: Zhang, Yifan @ 2025-09-25 9:54 UTC (permalink / raw)
To: Lazar, Lijo, Yang, Philip, amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander, Kuehling, Felix

[Public]

Hi Lijo,

Do you mean a change like below? Besides readability, is there a functional improvement?

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index e9cfb80bd436..86676acd9cbe 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -663,6 +663,7 @@ static void kfd_cleanup_nodes(struct kfd_dev *kfd, unsigned int num_nodes)
 	 */
 	flush_workqueue(kfd->ih_wq);
 	destroy_workqueue(kfd->ih_wq);
+	kfd->ih_wq = NULL;

 	for (i = 0; i < num_nodes; i++) {
 		knode = kfd->nodes[i];
@@ -1125,7 +1126,7 @@ void kgd2kfd_interrupt(struct kfd_dev *kfd, const void *ih_ring_entry)
 	unsigned long flags;
 	struct kfd_node *node;

-	if (!kfd->init_complete)
+	if (!kfd->init_complete || !kfd->ih_wq)
 		return;

 	if (kfd->device_info.ih_ring_entry_size > sizeof(patched_ihre)) {

Regarding synchronizing destroy_workqueue(kfd->ih_wq) against
queue_work(node->kfd->ih_wq, &node->interrupt_work): that looks like
another issue; node->interrupts_active should be set to false before
destroy_workqueue(kfd->ih_wq) is called. I think something like below:

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index e9cfb80bd436..92d9fa99e954 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -662,6 +662,9 @@ static void kfd_cleanup_nodes(struct kfd_dev *kfd, unsigned int num_nodes)
 	 * can't be created because we stopped interrupt handling above.
 	 */
 	flush_workqueue(kfd->ih_wq);
+	for (i = 0; i < num_nodes; i++)
+		kfd_interrupt_exit(kfd->nodes[i]);
+
 	destroy_workqueue(kfd->ih_wq);

 	for (i = 0; i < num_nodes; i++) {

-----Original Message-----
From: Lazar, Lijo <Lijo.Lazar@amd.com>
Sent: Thursday, September 25, 2025 3:06 PM
To: Lazar, Lijo <Lijo.Lazar@amd.com>; Zhang, Yifan <Yifan1.Zhang@amd.com>; Yang, Philip <Philip.Yang@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix <Felix.Kuehling@amd.com>
Subject: RE: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw

[Public]

On a second thought, probably this will require some locking. For ex: it's quite possible that destroy_workqueue(kfd->ih_wq) and queue_work(node->kfd->ih_wq, &node->interrupt_work) could be happening back-to-back. Node is not yet deleted.

Thanks,
Lijo

-----Original Message-----
From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of Lazar, Lijo
Sent: Thursday, September 25, 2025 12:29 PM
To: Zhang, Yifan <Yifan1.Zhang@amd.com>; Yang, Philip <Philip.Yang@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix <Felix.Kuehling@amd.com>
Subject: RE: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw
for (i = 0; kfd->ih_wq && i < kfd->num_nodes; i++) Thanks, Lijo -----Original Message----- From: Zhang, Yifan <Yifan1.Zhang@amd.com> Sent: Thursday, September 25, 2025 12:11 PM To: Lazar, Lijo <Lijo.Lazar@amd.com>; Yang, Philip <Philip.Yang@amd.com>; amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix <Felix.Kuehling@amd.com> Subject: RE: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw [Public] The interrupts are from KGD, still active after flush ih_wq and kfd_dev is freed. And after knode is freed, node->interrupts_active is also inaccessible. The race condition is as below: Interrupt handling switch partition process | | flush_workqueue(kfd->ih_wq); | destroy_workqueue(kfd->ih_wq); amdgpu_irq_dispatch | amdgpu_amdkfd_interrupt | kgd2kfd_interrupt | | kfd_cleanup_nodes | kfree(knode); spin_lock_irqsave(&node->interrupt_lock, flags); | //NULL Pointer -----Original Message----- From: Lazar, Lijo <Lijo.Lazar@amd.com> Sent: Thursday, September 25, 2025 2:19 PM To: Yang, Philip <Philip.Yang@amd.com>; Zhang, Yifan <Yifan1.Zhang@amd.com>; amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix <Felix.Kuehling@amd.com> Subject: RE: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw [Public] > Race if another thread in b/w kfd_cleanup_nodes This code is there before cleanup of nodes. flush_workqueue(kfd->ih_wq); destroy_workqueue(kfd->ih_wq); Shouldn't the interrupt handling code check if interrupt handling is enabled rather than checking individual node NULL pointers? 
Thanks, Lijo -----Original Message----- From: Yang, Philip <Philip.Yang@amd.com> Sent: Thursday, September 25, 2025 4:19 AM To: Zhang, Yifan <Yifan1.Zhang@amd.com>; amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix <Felix.Kuehling@amd.com>; Yang, Philip <Philip.Yang@amd.com>; Lazar, Lijo <Lijo.Lazar@amd.com> Subject: Re: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw On 2025-09-24 11:29, Yifan Zhang wrote: > There is race in amdgpu_amdkfd_device_fini_sw and interrupt. > if amdgpu_amdkfd_device_fini_sw run in b/w kfd_cleanup_nodes and > kfree(kfd), and KGD interrupt generated. > > kernel panic log: > > BUG: kernel NULL pointer dereference, address: 0000000000000098 amdgpu > 0000:c8:00.0: amdgpu: Requesting 4 partitions through PSP > > PGD d78c68067 P4D d78c68067 > > kfd kfd: amdgpu: Allocated 3969056 bytes on gart > > PUD 1465b8067 PMD @ > > Oops: @002 [#1] SMP NOPTI > > kfd kfd: amdgpu: Total number of KFD nodes to be created: 4 > CPU: 115 PID: @ Comm: swapper/115 Kdump: loaded Tainted: G S W OE K > > RIP: 0010:_raw_spin_lock_irqsave+0x12/0x40 > > Code: 89 e@ 41 5c c3 cc cc cc cc 66 66 2e Of 1f 84 00 00 00 00 00 OF > 1f 40 00 Of 1f 44% 00 00 41 54 9c 41 5c fa 31 cO ba 01 00 00 00 <fO> > OF b1 17 75 Ba 4c 89 e@ 41 Sc > > 89 c6 e8 07 38 5d > > RSP: 0018: ffffc90@1a6b0e28 EFLAGS: 00010046 > > RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000018 > 0000000000000001 RSI: ffff8883bb623e00 RDI: 0000000000000098 > ffff8883bb000000 RO8: ffff888100055020 ROO: ffff888100055020 > 0000000000000000 R11: 0000000000000000 R12: 0900000000000002 > ffff888F2b97da0@ R14: @000000000000098 R15: ffff8883babdfo00 > > CS: 010 DS: 0000 ES: 0000 CRO: 0000000080050033 > > CR2: 0000000000000098 CR3: 0000000e7cae2006 CR4: 0000000002770ce0 > 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > 0000000000000000 DR6: 00000000fffeO7FO DR7: 0000000000000400 > > PKRU: 55555554 > > Call Trace: > > <IRQ> > 
> kgd2kfd_interrupt+@x6b/0x1f@ [amdgpu] > > ? amdgpu_fence_process+0xa4/0x150 [amdgpu] > > kfd kfd: amdgpu: Node: 0, interrupt_bitmap: 3 YcpxFl Rant tErace > > amdgpu_irq_dispatch+0x165/0x210 [amdgpu] > > amdgpu_ih_process+0x80/0x100 [amdgpu] > > amdgpu: Virtual CRAT table created for GPU > > amdgpu_irq_handler+0x1f/@x60 [amdgpu] > > __handle_irq_event_percpu+0x3d/0x170 > > amdgpu: Topology: Add dGPU node [0x74a2:0x1002] > > handle_irq_event+0x5a/@xcO > > handle_edge_irq+0x93/0x240 > > kfd kfd: amdgpu: KFD node 1 partition @ size 49148M > > asm_call_irq_on_stack+0xf/@x20 > > </IRQ> > > common_interrupt+0xb3/0x130 > > asm_common_interrupt+0x1le/0x40 > > 5.10.134-010.a1i5000.a18.x86_64 #1 > > Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Philip Yang<Philip.Yang@amd.com> > --- > drivers/gpu/drm/amd/amdkfd/kfd_device.c | 10 +++++++++- > 1 file changed, 9 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c > b/drivers/gpu/drm/amd/amdkfd/kfd_device.c > index 349c351e242b..051a00152b08 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c > @@ -1133,7 +1133,15 @@ void kgd2kfd_interrupt(struct kfd_dev *kfd, const void *ih_ring_entry) > } > > for (i = 0; i < kfd->num_nodes; i++) { > - node = kfd->nodes[i]; > + /* Race if another thread in b/w > + * kfd_cleanup_nodes and kfree(kfd), > + * when kfd->nodes[i] = NULL > + */ > + if (kfd->nodes[i]) > + node = kfd->nodes[i]; > + else > + return; > + > spin_lock_irqsave(&node->interrupt_lock, flags); > > if (node->interrupts_active ^ permalink raw reply related [flat|nested] 16+ messages in thread
* RE: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw
  2025-09-25  9:54       ` Zhang, Yifan
@ 2025-09-26  6:49         ` Lazar, Lijo
  2025-09-26 14:33           ` Zhang, Yifan
  0 siblings, 1 reply; 16+ messages in thread
From: Lazar, Lijo @ 2025-09-26  6:49 UTC (permalink / raw)
  To: Zhang, Yifan, Yang, Philip, amd-gfx@lists.freedesktop.org
  Cc: Deucher, Alexander, Kuehling, Felix

[Public]

The intention is to let the kgd2kfd_interrupt thread know that KFD is done with interrupt handling, so it can exit at the earliest point (even without going through the kfd node loop). I was thinking of checking the ih_wq value for NULL, but since that value is not written under a lock, the kgd2kfd_interrupt thread does not necessarily see the NULL value immediately.

> if (!kfd->init_complete && !kfd->ih_wq)

I think this should be:

	if (!kfd->ih_wq || !kfd->init_complete)

As for kfd_interrupt_exit(knode), it is better to keep it before flush_workqueue. That indicates that no more queueing is allowed from any node.

Thanks,
Lijo

-----Original Message-----
From: Zhang, Yifan <Yifan1.Zhang@amd.com>
Sent: Thursday, September 25, 2025 3:25 PM
To: Lazar, Lijo <Lijo.Lazar@amd.com>; Yang, Philip <Philip.Yang@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix <Felix.Kuehling@amd.com>
Subject: RE: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw

[earlier quoted thread snipped -- identical to the message above]

^ permalink raw reply related	[flat|nested] 16+ messages in thread
* Re: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw
  2025-09-26  6:49         ` Lazar, Lijo
@ 2025-09-26 14:33           ` Zhang, Yifan
  0 siblings, 0 replies; 16+ messages in thread
From: Zhang, Yifan @ 2025-09-26 14:33 UTC (permalink / raw)
  To: Lazar, Lijo, Yang, Philip, amd-gfx@lists.freedesktop.org
  Cc: Deucher, Alexander, Kuehling, Felix

[-- Attachment #1: Type: text/plain, Size: 14028 bytes --]

[Public]

kfd_interrupt_exit needs to be split into two parts, because knode->ih_fifo is still needed while ih_wq is being flushed. A change like this:

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index e9cfb80bd436..270459826147 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -655,19 +655,33 @@ static void kfd_cleanup_nodes(struct kfd_dev *kfd, unsigned int num_nodes)
 {
 	struct kfd_node *knode;
 	unsigned int i;
+	unsigned long flags;

 	/*
 	 * flush_work ensures that there are no outstanding
 	 * work-queue items that will access interrupt_ring. New work items
 	 * can't be created because we stopped interrupt handling above.
 	 */
+	for (i = 0; i < num_nodes; i++) {
+		knode = kfd->nodes[i];
+		spin_lock_irqsave(&knode->interrupt_lock, flags);
+		knode->interrupts_active = false;
+		spin_unlock_irqrestore(&knode->interrupt_lock, flags);
+	}
+
 	flush_workqueue(kfd->ih_wq);
 	destroy_workqueue(kfd->ih_wq);

+	for (i = 0; i < num_nodes; i++) {
+		spin_lock_irqsave(&knode->interrupt_lock, flags);
+		knode->kfd->ih_wq = NULL;
+		spin_unlock_irqrestore(&knode->interrupt_lock, flags);
+		kfifo_free(&knode->ih_fifo);
+	}
+
 	for (i = 0; i < num_nodes; i++) {
 		knode = kfd->nodes[i];
 		device_queue_manager_uninit(knode->dqm);
-		kfd_interrupt_exit(knode);
 		kfd_topology_remove_device(knode);
 		if (knode->gws)
 			amdgpu_amdkfd_free_gws(knode->adev, knode->gws);
@@ -1125,7 +1139,8 @@ void kgd2kfd_interrupt(struct kfd_dev *kfd, const void *ih_ring_entry)
 	unsigned long flags;
 	struct kfd_node *node;

-	if (!kfd->init_complete)
+	if (!kfd->init_complete || !kfd->ih_wq)
 		return;

 	if (kfd->device_info.ih_ring_entry_size > sizeof(patched_ihre)) {
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c b/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c
index 783c2f5a04e4..be10c0a9d391 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c
@@ -87,21 +87,6 @@ int kfd_interrupt_init(struct kfd_node *node)
 	return 0;
 }

-void kfd_interrupt_exit(struct kfd_node *node)
-{
-	/*
-	 * Stop the interrupt handler from writing to the ring and scheduling
-	 * workqueue items. The spinlock ensures that any interrupt running
-	 * after we have unlocked sees interrupts_active = false.
-	 */
-	unsigned long flags;
-
-	spin_lock_irqsave(&node->interrupt_lock, flags);
-	node->interrupts_active = false;
-	spin_unlock_irqrestore(&node->interrupt_lock, flags);
-	kfifo_free(&node->ih_fifo);
-}
-

________________________________
From: Lazar, Lijo <Lijo.Lazar@amd.com>
Sent: Friday, September 26, 2025 2:49 PM
To: Zhang, Yifan <Yifan1.Zhang@amd.com>; Yang, Philip <Philip.Yang@amd.com>; amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>
Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix <Felix.Kuehling@amd.com>
Subject: RE: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw

[earlier quoted thread snipped -- identical to the messages above]

[-- Attachment #2: Type: text/html, Size: 35205 bytes --]

^ permalink raw reply related	[flat|nested] 16+ messages in thread
* Re: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw
  2025-09-24 22:48 ` [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw Philip Yang
  2025-09-25  6:19   ` Lazar, Lijo
@ 2025-09-25 19:11   ` Chen, Xiaogang
  2025-09-26  1:15     ` Zhang, Yifan
  1 sibling, 1 reply; 16+ messages in thread
From: Chen, Xiaogang @ 2025-09-25 19:11 UTC (permalink / raw)
  To: Philip Yang, Yifan Zhang, amd-gfx
  Cc: Alexander.Deucher, Felix.Kuehling, Philip.Yang, Lijo.Lazar

On 9/24/2025 5:48 PM, Philip Yang wrote:
>
> On 2025-09-24 11:29, Yifan Zhang wrote:
>> There is race in amdgpu_amdkfd_device_fini_sw and interrupt.
>> if amdgpu_amdkfd_device_fini_sw run in b/w kfd_cleanup_nodes and
>> kfree(kfd), and KGD interrupt generated.
>>
>> [kernel panic log snipped -- see the original posting above]
>>
>> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
> Reviewed-by: Philip Yang <Philip.Yang@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdkfd/kfd_device.c | 10 +++++++++-
>>   1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> @@ -1133,7 +1133,15 @@ void kgd2kfd_interrupt(struct kfd_dev *kfd, const void *ih_ring_entry)
>>   	}
>>   	for (i = 0; i < kfd->num_nodes; i++) {
>> -		node = kfd->nodes[i];
>> +		/* Race if another thread in b/w
>> +		 * kfd_cleanup_nodes and kfree(kfd),
>> +		 * when kfd->nodes[i] = NULL
>> +		 */
>> +		if (kfd->nodes[i])
>> +			node = kfd->nodes[i];
>> +		else
>> +			return;
>> +

KFD interrupts are handled later in the node->kfd->ih_wq workqueue, in interrupt_wq, which uses kfd->nodes[i]. Checking "kfd->nodes[i] == NULL" here is not enough, since kfd_cleanup_nodes can free a kfd node at any time.

Regards

Xiaogang

>>   		spin_lock_irqsave(&node->interrupt_lock, flags);
>>   		if (node->interrupts_active

^ permalink raw reply	[flat|nested] 16+ messages in thread
* RE: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw
  2025-09-25 19:11 ` Chen, Xiaogang
@ 2025-09-26  1:15 ` Zhang, Yifan
  2025-09-26 15:26   ` Chen, Xiaogang
  0 siblings, 1 reply; 16+ messages in thread
From: Zhang, Yifan @ 2025-09-26  1:15 UTC (permalink / raw)
To: Chen, Xiaogang, Yang, Philip, amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander, Kuehling, Felix, Yang, Philip, Lazar, Lijo

[AMD Official Use Only - AMD Internal Distribution Only]

flush_workqueue(kfd->ih_wq) and destroy_workqueue(kfd->ih_wq) in kfd_cleanup_nodes clean up pending work items, and the node->interrupts_active check prevents new work items from being enqueued. So after kfd_cleanup_nodes frees a kfd node, there are no pending kfd ih_wq work items.

I agree there is still a potential race here, since the kfd->nodes[i] NULL check is not protected by a lock. I will address it together with the issue Lijo mentioned.

-----Original Message-----
From: Chen, Xiaogang <Xiaogang.Chen@amd.com>
Sent: Friday, September 26, 2025 3:11 AM
To: Yang, Philip <Philip.Yang@amd.com>; Zhang, Yifan <Yifan1.Zhang@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix <Felix.Kuehling@amd.com>; Yang, Philip <Philip.Yang@amd.com>; Lazar, Lijo <Lijo.Lazar@amd.com>
Subject: Re: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw

On 9/24/2025 5:48 PM, Philip Yang wrote:
>
> On 2025-09-24 11:29, Yifan Zhang wrote:
>> There is a race between amdgpu_amdkfd_device_fini_sw and the interrupt
>> handler, if amdgpu_amdkfd_device_fini_sw runs in between
>> kfd_cleanup_nodes and kfree(kfd) and a KGD interrupt is generated.
>>
>> kernel panic log:
>>
>> BUG: kernel NULL pointer dereference, address: 0000000000000098
>> [register dump and call trace trimmed; quoted in full in the original patch mail]
>>
>> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
> Reviewed-by: Philip Yang <Philip.Yang@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdkfd/kfd_device.c | 10 +++++++++-
>>   1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> index 349c351e242b..051a00152b08 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
>> @@ -1133,7 +1133,15 @@ void kgd2kfd_interrupt(struct kfd_dev *kfd,
>> const void *ih_ring_entry)
>>   	}
>>   	for (i = 0; i < kfd->num_nodes; i++) {
>> -		node = kfd->nodes[i];
>> +		/* Race if another thread in b/w
>> +		 * kfd_cleanup_nodes and kfree(kfd),
>> +		 * when kfd->nodes[i] = NULL
>> +		 */
>> +		if (kfd->nodes[i])
>> +			node = kfd->nodes[i];
>> +		else
>> +			return;
>> +

The KFD interrupt is handled later in the workqueue node->kfd->ih_wq by interrupt_wq, which uses kfd->nodes[i]. Checking "kfd->nodes[i] == NULL" here is not enough, as kfd_cleanup_nodes can free a kfd node at any time.

Regards

Xiaogang

>>   		spin_lock_irqsave(&node->interrupt_lock, flags);
>>   		if (node->interrupts_active
* Re: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw
  2025-09-26  1:15 ` Zhang, Yifan
@ 2025-09-26 15:26 ` Chen, Xiaogang
  0 siblings, 0 replies; 16+ messages in thread
From: Chen, Xiaogang @ 2025-09-26 15:26 UTC (permalink / raw)
To: Zhang, Yifan, Yang, Philip, amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander, Kuehling, Felix, Lazar, Lijo

On 9/25/2025 8:15 PM, Zhang, Yifan wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
>
> flush_workqueue(kfd->ih_wq) and destroy_workqueue(kfd->ih_wq) in kfd_cleanup_nodes clean up pending work items, and the node->interrupts_active check prevents new work items from being enqueued. So after kfd_cleanup_nodes frees a kfd node, there are no pending kfd ih_wq work items.
>
> I agree there is still a potential race here, since the kfd->nodes[i] NULL check is not protected by a lock. I will address it together with the issue Lijo mentioned.

Yes, the ih_wq is flushed and destroyed before the kfd node gets freed. But still, kfd->nodes[i] can become NULL at any time while kgd2kfd_interrupt runs in a different task. I think the real issue is that kfd->nodes[i] is not checked under node->interrupt_lock. Both kfd_cleanup_nodes and kgd2kfd_interrupt need to take this lock when they operate on a kfd node. Currently only kgd2kfd_interrupt uses it, and its locked region does not cover the kfd->nodes[i] access; kfd_cleanup_nodes also needs to take the lock to exclude kgd2kfd_interrupt from operating on the node.
Regards

Xiaogang

>
> -----Original Message-----
> From: Chen, Xiaogang <Xiaogang.Chen@amd.com>
> Sent: Friday, September 26, 2025 3:11 AM
> To: Yang, Philip <Philip.Yang@amd.com>; Zhang, Yifan <Yifan1.Zhang@amd.com>; amd-gfx@lists.freedesktop.org
> Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Kuehling, Felix <Felix.Kuehling@amd.com>; Yang, Philip <Philip.Yang@amd.com>; Lazar, Lijo <Lijo.Lazar@amd.com>
> Subject: Re: [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw
>
> On 9/24/2025 5:48 PM, Philip Yang wrote:
>> On 2025-09-24 11:29, Yifan Zhang wrote:
>>> There is a race between amdgpu_amdkfd_device_fini_sw and the interrupt
>>> handler, if amdgpu_amdkfd_device_fini_sw runs in between
>>> kfd_cleanup_nodes and kfree(kfd) and a KGD interrupt is generated.
>>>
>>> kernel panic log:
>>>
>>> BUG: kernel NULL pointer dereference, address: 0000000000000098
>>> [register dump, call trace, and patch hunk trimmed; quoted in full in the original patch mail]
>>>
>>> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
>> Reviewed-by: Philip Yang <Philip.Yang@amd.com>
> The KFD interrupt is handled later in the workqueue node->kfd->ih_wq by interrupt_wq, which uses kfd->nodes[i]. Checking "kfd->nodes[i] == NULL" here is not enough, as kfd_cleanup_nodes can free a kfd node at any time.
>
> Regards
>
> Xiaogang
>
>>>   		spin_lock_irqsave(&node->interrupt_lock, flags);
>>>   		if (node->interrupts_active
end of thread, other threads: [~2025-09-29 13:09 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-09-24 15:29 [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw Yifan Zhang
2025-09-24 15:29 ` [PATCH v4 2/2] amd/amdkfd: enhance kfd process check in switch partition Yifan Zhang
2025-09-24 23:14   ` Philip Yang
2025-09-26 19:52     ` Chen, Xiaogang
2025-09-29 13:09       ` Philip Yang
2025-09-24 22:48 ` [PATCH v4 1/2] amd/amdkfd: resolve a race in amdgpu_amdkfd_device_fini_sw Philip Yang
2025-09-25  6:19   ` Lazar, Lijo
2025-09-25  6:41     ` Zhang, Yifan
2025-09-25  6:58       ` Lazar, Lijo
2025-09-25  7:06         ` Lazar, Lijo
2025-09-25  9:54           ` Zhang, Yifan
2025-09-26  6:49             ` Lazar, Lijo
2025-09-26 14:33               ` Zhang, Yifan
2025-09-25 19:11 ` Chen, Xiaogang
2025-09-26  1:15   ` Zhang, Yifan
2025-09-26 15:26     ` Chen, Xiaogang
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox