AMD-GFX Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: "Christian König" <christian.koenig@amd.com>
To: "Andrey Grodzovsky" <andrey.grodzovsky@amd.com>,
	"Christian König" <ckoenig.leichtzumerken@gmail.com>,
	"amd-gfx list" <amd-gfx@lists.freedesktop.org>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	chris@chris-wilson.co.uk,
	"daniel.vetter@ffwll.ch" <daniel.vetter@ffwll.ch>
Subject: Re: Lockdep spalt on killing a processes
Date: Tue, 26 Oct 2021 12:54:54 +0200	[thread overview]
Message-ID: <bb70e916-e71b-9968-78d9-d7de8fef9e91@amd.com> (raw)
In-Reply-To: <da1ed5da-59ad-d12a-906c-c84b7462d8c4@amd.com>

Am 26.10.21 um 04:33 schrieb Andrey Grodzovsky:
> On 2021-10-25 3:56 p.m., Christian König wrote:
>
>> In general I'm all there to get this fixed, but there is one major 
>> problem: Drivers don't expect the lock to be dropped.
>
>
> I am probably missing something but in my approach we only modify the 
> code for those clients that call dma_fence_signal,
> not dma_fence_signal_locked. In those cases the drivers are agnostic 
> to lock behavior (or should be at least) since the lock
> is acquired within the dma fence code. Note that if you are worried 
> about calling the callback without lock then same exact
> concern is relevant to using the irq_work directly in the fence code 
> since the irq_work will execute at a later time without locked
> fence->lock (which is the point of using irq_work).

Yeah, I've seen that it just doesn't make much sense to me.

>>
>> What we could do is to change all drivers so they call always call 
>> the dma_fence_signal functions and drop the _locked variants. This 
>> way we could move calling the callback out of the spinlock.
>>
>> But that requires audit of all drivers, so quite a lot of work to do.
>
>
> As i said earlier - if we only modify dma_fence_signal and don't touch 
> dma_fence_signal_locked then our only concern should the users of 
> dma_fence_signal.

Yes, but what do you do with the drivers who call the _locked variant?

> Let me please know if I am still missing some point of yours.

Well, I mean we need to be able to handle this for all drivers.

Regards,
Christian.

>
> Andrey
>
>
>>
>> Regards,
>> Christian.
>>
>> Am 25.10.21 um 21:10 schrieb Andrey Grodzovsky:
>>> Adding back Daniel (somehow he got off the addresses list) and Chris 
>>> who worked a lot in this area.
>>>
>>> On 2021-10-21 2:34 a.m., Christian König wrote:
>>>>
>>>>
>>>> Am 20.10.21 um 21:32 schrieb Andrey Grodzovsky:
>>>>> On 2021-10-04 4:14 a.m., Christian König wrote:
>>>>>
>>>>>> The problem is a bit different.
>>>>>>
>>>>>> The callback is on the dependent fence, while we need to signal 
>>>>>> the scheduler fence.
>>>>>>
>>>>>> Daniel is right that this needs an irq_work struct to handle this 
>>>>>> properly.
>>>>>>
>>>>>> Christian.
>>>>>
>>>>>
>>>>> So we had some discussions with Christian regarding irq_work and 
>>>>> agreed I should look into doing it but stepping back for a sec -
>>>>>
>>>>> Why we insist on calling the dma_fence_cb  with fence->lock locked 
>>>>> ? Is it because of dma_fence_add_callback ?
>>>>> Because we first test for DMA_FENCE_FLAG_SIGNALED_BIT and only 
>>>>> after that lock the fence->lock ? If so, can't we
>>>>> move DMA_FENCE_FLAG_SIGNALED_BIT  check inside the locked section 
>>>>> ? Because if in theory
>>>>> we could call the cb with unlocked fence->lock (i.e. this kind of 
>>>>> iteration 
>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Fv5.15-rc6%2Fsource%2Fdrivers%2Fgpu%2Fdrm%2Fttm%2Fttm_resource.c%23L117&amp;data=04%7C01%7Candrey.grodzovsky%40amd.com%7Cc8a4525f94c244bebbd208d997f19242%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637707886075917091%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=YBq%2BNkDuYKERc8XJDWTfeM%2FSknpuCBHQYgN8Uo5PFv0%3D&amp;reserved=0)
>>>>> we wouldn't have the lockdep splat. And in general, is it really
>>>>> the correct approach to call a third party code from a call back 
>>>>> with locked spinlock ? We don't know what the cb does inside
>>>>> and I don't see any explicit restrictions in documentation of 
>>>>> dma_fence_func_t what can and cannot be done there.
>>>>
>>>> Yeah, that's exactly what I meant with using the irq_work directly 
>>>> in the fence code.
>>>
>>>
>>> My idea is not to use irq work at all but instead to implement 
>>> unlocked dma_fence cb execution using iteration
>>> which drops the spinlock each time next cb is executed and acquiring 
>>> it again after (until cb_list is empy).
>>>
>>>
>>>>
>>>>
>>>> The problem is dma_fence_signal_locked() which is used by quite a 
>>>> number of drivers to signal the fence while holding the lock.
>>>
>>>
>>> For this I think we should not reuse dma_fence_signal_locked inside 
>>> dma_fence_signal and instead implement it using the
>>> unlocked iteration I mentioned above. I looked a bit in the code and 
>>> the history and I see that until some time ago
>>> (this commit by Chris 0fc89b6802ba1fcc561b0c906e0cefd384e3b2e5), 
>>> indeed dma_fence_signal was doing it's own, locked iteration
>>> and wasn't reusing dma_fence_signal_locked. This way whoever relies 
>>> on the dma_fence_signal_locked won't be impacted
>>> an who is not (like us in 
>>> drm_sched_fence_scheduled/drm_sched_fence_finished) should also not 
>>> be impacted by more narrow
>>> scope of the lock. I also looked at dma_fence_default_wait and how 
>>> it locks the fence->lock and check if fence is signaled
>>> before wait start and I don't see a problem there either.
>>>
>>> I attached quick draft of this proposal to clarify.
>>>
>>> Andrey
>>>
>>>
>>>>
>>>> Otherwise we could indeed simplify the fence handling a lot.
>>>>
>>>> Christian.
>>>>
>>>>>
>>>>> Andrey
>>>>>
>>>>>
>>>>>>
>>>>>> Am 01.10.21 um 17:10 schrieb Andrey Grodzovsky:
>>>>>>> From what I see here you supposed to have actual deadlock and 
>>>>>>> not only warning, sched_fence->finished is  first signaled from 
>>>>>>> within
>>>>>>> hw fence done callback (drm_sched_job_done_cb) but then again 
>>>>>>> from within it's own callback (drm_sched_entity_kill_jobs_cb) 
>>>>>>> and so
>>>>>>> looks like same fence  object is recursively signaled twice. 
>>>>>>> This leads to attempt to lock fence->lock second time while it's 
>>>>>>> already
>>>>>>> locked. I don't see a need to call drm_sched_fence_finished from 
>>>>>>> within drm_sched_entity_kill_jobs_cb as this callback already 
>>>>>>> registered
>>>>>>> on sched_fence->finished fence (entity->last_scheduled == 
>>>>>>> s_fence->finished) and hence the signaling already took place.
>>>>>>>
>>>>>>> Andrey
>>>>>>>
>>>>>>> On 2021-10-01 6:50 a.m., Christian König wrote:
>>>>>>>> Hey, Andrey.
>>>>>>>>
>>>>>>>> while investigating some memory management problems I've got 
>>>>>>>> the logdep splat below.
>>>>>>>>
>>>>>>>> Looks like something is wrong with 
>>>>>>>> drm_sched_entity_kill_jobs_cb(), can you investigate?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Christian.
>>>>>>>>
>>>>>>>> [11176.741052] ============================================
>>>>>>>> [11176.741056] WARNING: possible recursive locking detected
>>>>>>>> [11176.741060] 5.15.0-rc1-00031-g9d546d600800 #171 Not tainted
>>>>>>>> [11176.741066] --------------------------------------------
>>>>>>>> [11176.741070] swapper/12/0 is trying to acquire lock:
>>>>>>>> [11176.741074] ffff9c337ed175a8 (&fence->lock){-.-.}-{3:3}, at: 
>>>>>>>> dma_fence_signal+0x28/0x80
>>>>>>>> [11176.741088]
>>>>>>>>                but task is already holding lock:
>>>>>>>> [11176.741092] ffff9c337ed172a8 (&fence->lock){-.-.}-{3:3}, at: 
>>>>>>>> dma_fence_signal+0x28/0x80
>>>>>>>> [11176.741100]
>>>>>>>>                other info that might help us debug this:
>>>>>>>> [11176.741104]  Possible unsafe locking scenario:
>>>>>>>>
>>>>>>>> [11176.741108]        CPU0
>>>>>>>> [11176.741110]        ----
>>>>>>>> [11176.741113]   lock(&fence->lock);
>>>>>>>> [11176.741118]   lock(&fence->lock);
>>>>>>>> [11176.741122]
>>>>>>>>                 *** DEADLOCK ***
>>>>>>>>
>>>>>>>> [11176.741125]  May be due to missing lock nesting notation
>>>>>>>>
>>>>>>>> [11176.741128] 2 locks held by swapper/12/0:
>>>>>>>> [11176.741133]  #0: ffff9c339c30f768 
>>>>>>>> (&ring->fence_drv.lock){-.-.}-{3:3}, at: 
>>>>>>>> dma_fence_signal+0x28/0x80
>>>>>>>> [11176.741142]  #1: ffff9c337ed172a8 
>>>>>>>> (&fence->lock){-.-.}-{3:3}, at: dma_fence_signal+0x28/0x80
>>>>>>>> [11176.741151]
>>>>>>>>                stack backtrace:
>>>>>>>> [11176.741155] CPU: 12 PID: 0 Comm: swapper/12 Not tainted 
>>>>>>>> 5.15.0-rc1-00031-g9d546d600800 #171
>>>>>>>> [11176.741160] Hardware name: System manufacturer System 
>>>>>>>> Product Name/PRIME X399-A, BIOS 0808 10/12/2018
>>>>>>>> [11176.741165] Call Trace:
>>>>>>>> [11176.741169]  <IRQ>
>>>>>>>> [11176.741173]  dump_stack_lvl+0x5b/0x74
>>>>>>>> [11176.741181]  dump_stack+0x10/0x12
>>>>>>>> [11176.741186]  __lock_acquire.cold+0x208/0x2df
>>>>>>>> [11176.741197]  lock_acquire+0xc6/0x2d0
>>>>>>>> [11176.741204]  ? dma_fence_signal+0x28/0x80
>>>>>>>> [11176.741212]  _raw_spin_lock_irqsave+0x4d/0x70
>>>>>>>> [11176.741219]  ? dma_fence_signal+0x28/0x80
>>>>>>>> [11176.741225]  dma_fence_signal+0x28/0x80
>>>>>>>> [11176.741230]  drm_sched_fence_finished+0x12/0x20 [gpu_sched]
>>>>>>>> [11176.741240] drm_sched_entity_kill_jobs_cb+0x1c/0x50 [gpu_sched]
>>>>>>>> [11176.741248] dma_fence_signal_timestamp_locked+0xac/0x1a0
>>>>>>>> [11176.741254]  dma_fence_signal+0x3b/0x80
>>>>>>>> [11176.741260]  drm_sched_fence_finished+0x12/0x20 [gpu_sched]
>>>>>>>> [11176.741268]  drm_sched_job_done.isra.0+0x7f/0x1a0 [gpu_sched]
>>>>>>>> [11176.741277]  drm_sched_job_done_cb+0x12/0x20 [gpu_sched]
>>>>>>>> [11176.741284] dma_fence_signal_timestamp_locked+0xac/0x1a0
>>>>>>>> [11176.741290]  dma_fence_signal+0x3b/0x80
>>>>>>>> [11176.741296]  amdgpu_fence_process+0xd1/0x140 [amdgpu]
>>>>>>>> [11176.741504]  sdma_v4_0_process_trap_irq+0x8c/0xb0 [amdgpu]
>>>>>>>> [11176.741731]  amdgpu_irq_dispatch+0xce/0x250 [amdgpu]
>>>>>>>> [11176.741954]  amdgpu_ih_process+0x81/0x100 [amdgpu]
>>>>>>>> [11176.742174]  amdgpu_irq_handler+0x26/0xa0 [amdgpu]
>>>>>>>> [11176.742393]  __handle_irq_event_percpu+0x4f/0x2c0
>>>>>>>> [11176.742402]  handle_irq_event_percpu+0x33/0x80
>>>>>>>> [11176.742408]  handle_irq_event+0x39/0x60
>>>>>>>> [11176.742414]  handle_edge_irq+0x93/0x1d0
>>>>>>>> [11176.742419]  __common_interrupt+0x50/0xe0
>>>>>>>> [11176.742426]  common_interrupt+0x80/0x90
>>>>>>>> [11176.742431]  </IRQ>
>>>>>>>> [11176.742436]  asm_common_interrupt+0x1e/0x40
>>>>>>>> [11176.742442] RIP: 0010:cpuidle_enter_state+0xff/0x470
>>>>>>>> [11176.742449] Code: 0f a3 05 04 54 24 01 0f 82 70 02 00 00 31 
>>>>>>>> ff e8 37 5d 6f ff 80 7d d7 00 0f 85 e9 01 00 00 e8 58 a2 7f ff 
>>>>>>>> fb 66 0f 1f 44 00 00 <45> 85 ff 0f 88 01 01 00 00 49 63 c7 4c 
>>>>>>>> 2b 75 c8 48 8d 14 40 48 8d
>>>>>>>> [11176.742455] RSP: 0018:ffffb6970021fe48 EFLAGS: 00000202
>>>>>>>> [11176.742461] RAX: 000000000059be25 RBX: 0000000000000002 RCX: 
>>>>>>>> 0000000000000000
>>>>>>>> [11176.742465] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 
>>>>>>>> ffffffff9efeed78
>>>>>>>> [11176.742470] RBP: ffffb6970021fe80 R08: 0000000000000001 R09: 
>>>>>>>> 0000000000000001
>>>>>>>> [11176.742473] R10: 0000000000000001 R11: 0000000000000001 R12: 
>>>>>>>> ffff9c3350b0e800
>>>>>>>> [11176.742477] R13: ffffffffa00e9680 R14: 00000a2a49ada060 R15: 
>>>>>>>> 0000000000000002
>>>>>>>> [11176.742483]  ? cpuidle_enter_state+0xf8/0x470
>>>>>>>> [11176.742489]  ? cpuidle_enter_state+0xf8/0x470
>>>>>>>> [11176.742495]  cpuidle_enter+0x2e/0x40
>>>>>>>> [11176.742500]  call_cpuidle+0x23/0x40
>>>>>>>> [11176.742506]  do_idle+0x201/0x280
>>>>>>>> [11176.742512]  cpu_startup_entry+0x20/0x30
>>>>>>>> [11176.742517]  start_secondary+0x11f/0x160
>>>>>>>> [11176.742523] secondary_startup_64_no_verify+0xb0/0xbb
>>>>>>>>
>>>>>>
>>>>
>>


  reply	other threads:[~2021-10-26 10:55 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-01 10:50 Lockdep spalt on killing a processes Christian König
2021-10-01 14:52 ` Daniel Vetter
2021-10-01 15:10 ` Andrey Grodzovsky
2021-10-04  8:14   ` Christian König
2021-10-04 15:27     ` Andrey Grodzovsky
2021-10-20 19:32     ` Andrey Grodzovsky
2021-10-21  6:34       ` Christian König
2021-10-25 19:10         ` Andrey Grodzovsky
2021-10-25 19:56           ` Christian König
2021-10-26  2:33             ` Andrey Grodzovsky
2021-10-26 10:54               ` Christian König [this message]
2021-10-27 14:27                 ` Andrey Grodzovsky
2021-10-27 14:34                   ` Christian König
2021-10-27 14:47                     ` Andrey Grodzovsky
2021-10-27 14:50                       ` Christian König
2021-10-27 19:58                         ` Andrey Grodzovsky
2021-10-28 17:26                           ` Andrey Grodzovsky
2021-10-29  7:07                             ` Christian König
2021-11-01 15:24                               ` Andrey Grodzovsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bb70e916-e71b-9968-78d9-d7de8fef9e91@amd.com \
    --to=christian.koenig@amd.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=andrey.grodzovsky@amd.com \
    --cc=chris@chris-wilson.co.uk \
    --cc=ckoenig.leichtzumerken@gmail.com \
    --cc=daniel.vetter@ffwll.ch \
    --cc=dri-devel@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox