On 10/20/2025 11:24 AM, Kim, Jonathan wrote: > > [Public] > > > You can see this error message on CTRL-C or gpu reset conditions. > > Those don’t seem like application race conditions to me i.e. the app > can’t do anything about terminations it can’t see coming. > > It appears any bad actor could spam dmesgs with these error messages > if they wanted to with the way the driver currently is. > If this warning message comes out due to some unexpected things happened(like ctrl-c) I think user would not be surprised to see some warnings. Regards Xiaogang > Jon > > *From:*amd-gfx *On Behalf Of > *Chen, Xiaogang > *Sent:* Monday, October 20, 2025 12:00 PM > *To:* Yang, Philip ; amd-gfx@lists.freedesktop.org > *Cc:* Kuehling, Felix > *Subject:* Re: [PATCH v3 1/2] drm/amdkfd: Fix false positive queue > buffer free warning > > > > > *Caution:*This message originated from an External Source. Use proper > caution when opening attachments, clicking links, or responding. > > On 10/20/2025 9:30 AM, Philip Yang wrote: > > Only show warning message if process mm is still alive when queue > > buffer is freed. > > If kfd_lookup_process_by_mm return NULL, means the process is already > > exited and mm is gone, it is fine to free queue buffer. > > Fixes: b049504e211e ("drm/amdkfd: Validate user queue svm memory residency") > > Signed-off-by: Philip Yang > > --- > > drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 5 +++-- > > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c > > index 4d4a47313f5b..d1b2f8525f80 100644 > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c > > @@ -2487,7 +2487,9 @@ svm_range_unmap_from_cpu(struct mm_struct *mm, struct svm_range *prange, > >         bool unmap_parent; > >         uint32_t i; > > > > -        if (atomic_read(&prange->queue_refcount)) { > > +        p = kfd_lookup_process_by_mm(mm); > > p->mm is null, not p  because driver set p->mm to null. But > still prange->queue_refcount is ref of queues from this prange. If app > unmap this prange app should have destroyed the queues associated with > this prange already. If not, it is not driver issue. I think driver > should send this warning anyway to indicate there are queues not > destroyed by app before app unmap the prange. It is an app race > condition, not driver. > > Regards > > Xiaogang > > + > > +        if (p && atomic_read(&prange->queue_refcount)) { > >                int r; > > > >                 pr_warn("Freeing queue vital buffer 0x%lx, queue evicted\n", > > @@ -2497,7 +2499,6 @@ svm_range_unmap_from_cpu(struct mm_struct *mm, struct svm_range *prange, > >                        pr_debug("failed %d to quiesce KFD queues\n", r); > >         } > > > > -        p = kfd_lookup_process_by_mm(mm); > >         if (!p) > >                return; > >         svms = &p->svms; >