AMD-GFX Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: "Christian König" <christian.koenig@amd.com>
To: "Pan, Xinhui" <Xinhui.Pan@amd.com>,
	"Christian König" <ckoenig.leichtzumerken@gmail.com>,
	"amd-gfx@lists.freedesktop.org" <amd-gfx@lists.freedesktop.org>
Cc: "Deucher, Alexander" <Alexander.Deucher@amd.com>
Subject: Re: 回复: [PATCH] drm/amdgpu: Fix a race of IB test
Date: Mon, 13 Sep 2021 08:19:24 +0200	[thread overview]
Message-ID: <91fd7ad8-b4f0-b30e-9f3e-eab229b7375e@amd.com> (raw)
In-Reply-To: <DM4PR12MB5165351AA7C45CCDBD15AAC187D79@DM4PR12MB5165.namprd12.prod.outlook.com>

Well is the delayed init work using direct submission or submission 
through the scheduler?

If the later we have the down_write of the reset semaphore pulled in 
through the scheduler dependency.

Anyway just having the sync before taking the lock should work.

Christian.

Am 11.09.21 um 12:18 schrieb Pan, Xinhui:
> [AMD Official Use Only]
>
> For the possible deadlock, we can just move flush_delayed_work above down_write. not a big thing.
> But I am not aware why delayed init work would try to lock reset_sem.
>
> delayed init work is enqueued when device resume. It calls amdgpu_ib_ring_tests directly. We need one sync method.
> But I see device resume itself woud flush it. So there is no race between them as userspace is still freezed.
>
> I will drop this flush in V2.
> ________________________________________
> 发件人: Christian König <ckoenig.leichtzumerken@gmail.com>
> 发送时间: 2021年9月11日 15:45
> 收件人: Pan, Xinhui; amd-gfx@lists.freedesktop.org
> 抄送: Deucher, Alexander; Koenig, Christian
> 主题: Re: [PATCH] drm/amdgpu: Fix a race of IB test
>
>
>
> Am 11.09.21 um 03:55 schrieb xinhui pan:
>> Direct IB submission should be exclusive. So use write lock.
>>
>> Signed-off-by: xinhui pan <xinhui.pan@amd.com>
>> ---
>>    drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 9 +++++++--
>>    1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
>> index 19323b4cce7b..acbe02928791 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
>> @@ -1358,10 +1358,15 @@ static int amdgpu_debugfs_test_ib_show(struct seq_file *m, void *unused)
>>        }
>>
>>        /* Avoid accidently unparking the sched thread during GPU reset */
>> -     r = down_read_killable(&adev->reset_sem);
>> +     r = down_write_killable(&adev->reset_sem);
>>        if (r)
>>                return r;
>>
>> +     /* Avoid concurrently IB test but not cancel it as I don't know whether we
>> +      * would add more code in the delayed init work.
>> +      */
>> +     flush_delayed_work(&adev->delayed_init_work);
>> +
> That won't work. It's at least theoretical possible that the delayed
> init work waits for the reset_sem which we are holding here.
>
> Very unlikely to happen, but lockdep might be able to point that out
> with a nice backtrace in the logs.
>
> On the other hand delayed init work and direct IB test through this
> interface should work at the same time, so I would just drop it.
>
> Christian.
>
>>        /* hold on the scheduler */
>>        for (i = 0; i < AMDGPU_MAX_RINGS; i++) {
>>                struct amdgpu_ring *ring = adev->rings[i];
>> @@ -1387,7 +1392,7 @@ static int amdgpu_debugfs_test_ib_show(struct seq_file *m, void *unused)
>>                kthread_unpark(ring->sched.thread);
>>        }
>>
>> -     up_read(&adev->reset_sem);
>> +     up_write(&adev->reset_sem);
>>
>>        pm_runtime_mark_last_busy(dev->dev);
>>        pm_runtime_put_autosuspend(dev->dev);


  reply	other threads:[~2021-09-13  6:19 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-11  1:55 [PATCH] drm/amdgpu: Fix a race of IB test xinhui pan
2021-09-11  7:45 ` Christian König
2021-09-11 10:18   ` 回复: " Pan, Xinhui
2021-09-13  6:19     ` Christian König [this message]
2021-09-13  6:55       ` 回复: " Pan, Xinhui
2021-09-13  6:58         ` Christian König

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=91fd7ad8-b4f0-b30e-9f3e-eab229b7375e@amd.com \
    --to=christian.koenig@amd.com \
    --cc=Alexander.Deucher@amd.com \
    --cc=Xinhui.Pan@amd.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=ckoenig.leichtzumerken@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox