* [PATCH] drm/amd/display: move the nonblocking commit helpers appropriately
@ 2018-09-24 13:31 Shirish S
[not found] ` <1537795907-1168-1-git-send-email-shirish.s-5C7GfCeVMHo@public.gmane.org>
0 siblings, 1 reply; 5+ messages in thread
From: Shirish S @ 2018-09-24 13:31 UTC (permalink / raw)
To: harry.wentland-5C7GfCeVMHo, sunpeng.li-5C7GfCeVMHo
Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Shirish S
[Why]
In multi-monitor scenario, if first crtc's flip done event occurs delayed
(but within timeout), due to non-blocking design of commit_tail(), there
are more than one commit's scheduled by the time the second crtc's
wait_for_completion_timeout() is called in drm_atomic_helper_wait_for_flip_done.
Due to these in-between additions and deletions in the atomic state, it is
found that the system fails while accessing common data structures of the
second crtc in drm_atomic_helper_wait_for_flip_done(), leading to crash as
below:
BUG: unable to handle kernel paging request at 000000010000001c
IP: do_raw_spin_lock+0xf/0x94
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
Call Trace:
__wait_for_common+0x36/0x60
drm_atomic_helper_wait_for_flip_done+0x47/0x89
amdgpu_dm_atomic_commit_tail+0xf4b/0xf84
? drm_atomic_helper_wait_for_dependencies+0x1cd/0x217
commit_tail+0x41/0x5f
[How]
Move drm_atomic_helper_commit_hw_done() post wait_for_flip_done(),
which cleans up the atomic state's commit and completes pending hw_done and
flip_done works as a result there wont be dangling flip waits on random commits.
Signed-off-by: Shirish S <shirish.s@amd.com>
---
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index 0f10d92..41a1958 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -4626,12 +4626,17 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state)
}
spin_unlock_irqrestore(&adev->ddev->event_lock, flags);
- /* Signal HW programming completion */
- drm_atomic_helper_commit_hw_done(state);
if (wait_for_vblank)
drm_atomic_helper_wait_for_flip_done(dev, state);
+ /* Atomic state pointer gets corrupted in case of frequent
+ * modesets operations like changing resolutions.
+ * Hence discard state->commit before signalling to user
+ * space.
+ */
+ drm_atomic_helper_commit_hw_done(state);
+
drm_atomic_helper_cleanup_planes(dev, state);
/*
--
2.7.4
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply related [flat|nested] 5+ messages in thread[parent not found: <1537795907-1168-1-git-send-email-shirish.s-5C7GfCeVMHo@public.gmane.org>]
* [PATCH] drm/amd/display: Work around race beetween hw_done and wait_for_flip [not found] ` <1537795907-1168-1-git-send-email-shirish.s-5C7GfCeVMHo@public.gmane.org> @ 2018-09-28 13:16 ` Harry Wentland [not found] ` <20180928131606.15931-1-harry.wentland-5C7GfCeVMHo@public.gmane.org> 2018-10-01 23:26 ` [PATCH v2] drm/amd/display: Signal hw_done() after waiting for flip_done() sunpeng.li-5C7GfCeVMHo 1 sibling, 1 reply; 5+ messages in thread From: Harry Wentland @ 2018-09-28 13:16 UTC (permalink / raw) To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, shirish.s-5C7GfCeVMHo, sunpeng.li-5C7GfCeVMHo Cc: Harry Wentland [Why] There is a race that between drm_crtc_commit_hw_done and drm_atomic_helper_wait_for_flip where the possibilty exist for the crtc->commit to be cleared after the null check in wait_for_flip_done but before the call to wait_for_completion_timeout on commit->flip_done. [How] Take a reference to all commits in the state before drm_crtc_commit_hw_done is called and release those after drm_atomic_helper_wait_for_flip has finished. Signed-off-by: Harry Wentland <harry.wentland@amd.com> --- Would something like this work? I get the strong sense that this happens because Intel and IMX use the helpers in the other order, hence the dependency between hw_done and wait_for_flip was missed. I'd rather make it obvious that there's (a) no reason to reorder these two calls on AMD HW (other than this unexpected dependency) and (b) this is something we'll probably want to fix in DRM. Sorry it took me a while to understand what was happening here. Been busy at XDC until Jordan and Leo reminded me to take another look. Harry drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index 0f10d920a785..ed9a7d680b63 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -4626,12 +4626,27 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state) } spin_unlock_irqrestore(&adev->ddev->event_lock, flags); + /* + * WORKAROUND: Take a ref for all crtc_state commits to avoid + * a race where the commit gets freed before commit_hw_done had + * a chance to look for commit->flip_done + */ + for_each_new_crtc_in_state(state, crtc, new_crtc_state, i) + drm_crtc_commit_get(new_crtc_state->commit); + /* Signal HW programming completion */ drm_atomic_helper_commit_hw_done(state); if (wait_for_vblank) drm_atomic_helper_wait_for_flip_done(dev, state); + /* + * WORKAROUND: put the commit refs from above (see comment on + * the drm_crtc_commit_get call above) + */ + for_each_new_crtc_in_state(state, crtc, new_crtc_state, i) + drm_crtc_commit_put(new_crtc_state->commit); + drm_atomic_helper_cleanup_planes(dev, state); /* -- 2.17.1 _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply related [flat|nested] 5+ messages in thread
[parent not found: <20180928131606.15931-1-harry.wentland-5C7GfCeVMHo@public.gmane.org>]
* RE: [PATCH] drm/amd/display: Work around race beetween hw_done and wait_for_flip [not found] ` <20180928131606.15931-1-harry.wentland-5C7GfCeVMHo@public.gmane.org> @ 2018-10-01 9:06 ` S, Shirish 0 siblings, 0 replies; 5+ messages in thread From: S, Shirish @ 2018-10-01 9:06 UTC (permalink / raw) To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org, Li, Sun peng (Leo) Cc: Wentland, Harry This workaround is not fixing the issue. Regards, Shirish S -----Original Message----- From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of Harry Wentland Sent: Friday, September 28, 2018 6:46 PM To: amd-gfx@lists.freedesktop.org; S, Shirish <Shirish.S@amd.com>; Li, Sun peng (Leo) <Sunpeng.Li@amd.com> Cc: Wentland, Harry <Harry.Wentland@amd.com> Subject: [PATCH] drm/amd/display: Work around race beetween hw_done and wait_for_flip [Why] There is a race that between drm_crtc_commit_hw_done and drm_atomic_helper_wait_for_flip where the possibilty exist for the crtc->commit to be cleared after the null check in wait_for_flip_done but before the call to wait_for_completion_timeout on commit->flip_done. [How] Take a reference to all commits in the state before drm_crtc_commit_hw_done is called and release those after drm_atomic_helper_wait_for_flip has finished. Signed-off-by: Harry Wentland <harry.wentland@amd.com> --- Would something like this work? I get the strong sense that this happens because Intel and IMX use the helpers in the other order, hence the dependency between hw_done and wait_for_flip was missed. I'd rather make it obvious that there's (a) no reason to reorder these two calls on AMD HW (other than this unexpected dependency) and (b) this is something we'll probably want to fix in DRM. Sorry it took me a while to understand what was happening here. Been busy at XDC until Jordan and Leo reminded me to take another look. Harry drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index 0f10d920a785..ed9a7d680b63 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -4626,12 +4626,27 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state) } spin_unlock_irqrestore(&adev->ddev->event_lock, flags); + /* + * WORKAROUND: Take a ref for all crtc_state commits to avoid + * a race where the commit gets freed before commit_hw_done had + * a chance to look for commit->flip_done + */ + for_each_new_crtc_in_state(state, crtc, new_crtc_state, i) + drm_crtc_commit_get(new_crtc_state->commit); + /* Signal HW programming completion */ drm_atomic_helper_commit_hw_done(state); if (wait_for_vblank) drm_atomic_helper_wait_for_flip_done(dev, state); + /* + * WORKAROUND: put the commit refs from above (see comment on + * the drm_crtc_commit_get call above) + */ + for_each_new_crtc_in_state(state, crtc, new_crtc_state, i) + drm_crtc_commit_put(new_crtc_state->commit); + drm_atomic_helper_cleanup_planes(dev, state); /* -- 2.17.1 _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH v2] drm/amd/display: Signal hw_done() after waiting for flip_done() [not found] ` <1537795907-1168-1-git-send-email-shirish.s-5C7GfCeVMHo@public.gmane.org> 2018-09-28 13:16 ` [PATCH] drm/amd/display: Work around race beetween hw_done and wait_for_flip Harry Wentland @ 2018-10-01 23:26 ` sunpeng.li-5C7GfCeVMHo [not found] ` <1538436399-28084-1-git-send-email-sunpeng.li-5C7GfCeVMHo@public.gmane.org> 1 sibling, 1 reply; 5+ messages in thread From: sunpeng.li-5C7GfCeVMHo @ 2018-10-01 23:26 UTC (permalink / raw) To: shirish.s-5C7GfCeVMHo Cc: Leo Li, harry.wentland-5C7GfCeVMHo, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW From: Shirish S <shirish.s@amd.com> In amdgpu_dm_commit_tail(), wait until flip_done() is signaled before we signal hw_done(). [Why] This is to temporary address a paging error that occurs when a nonblocking commit contends with another commit, particularly in a mirrored multi-display configuration. The error occurs in drm_atomic_helper_wait_for_flip_done(), when we attempt to access the contents of new_crtc_state->commit. Here's the sequence for a mirrored 2 display setup (irrelevant steps left out for clarity): **THREAD 1** | **THREAD 2** | Initialize atomic state for flip | | Queue worker | ... | Do work for flip | | Signal hw_done() on CRTC 1 | Signal hw_done() on CRTC 2 | | Wait for flip_done() on CRTC 1 <---- **PREEMPTED BY THREAD 1** Initialize atomic state for cursor update (1) | | Do cursor update worker | | Clear atomic state (2) | **DONE** | ... | | Wait for flip_done() on CRTC 2 | *ERROR* | The issue starts with (1). When the atomic state is initialized, the current CRTC states are duplicated to be the new_crtc_states, and referenced to be the old_crtc_states. (The new_crtc_states are to be filled with update data.) Some things to note: * Due to the mirrored configuration, the cursor updates on both CRTCs. * At this point, the pflip IRQ has already been handled, and flip_done signaled on all CRTCs. The cursor commit can therefore continue. * The old_crtc_states used by the cursor update are the **same states** as the new_crtc_states used by the flip worker. At (2), the old_crtc_state is freed (*), and the cursor commit completes. We then context switch back to the flip worker, where we attempt to access the new_crtc_state->commit object. This is problematic, as this state has already been freed. (*) Technically, 'state->crtcs[i].state' is freed, which was made to reference old_crtc_state in drm_atomic_helper_swap_state() [How] By moving hw_done() after wait_for_flip_done(), we're guaranteed that the new_crtc_state (from the flip worker's perspective) still exists. This is because any other commit will be blocked, waiting for the hw_done() signal. Note that both the i915 and imx drivers have this sequence flipped already, masking this problem. Signed-off-by: Shirish S <shirish.s@amd.com> Signed-off-by: Leo Li <sunpeng.li@amd.com> --- Hi Shirish, Sorry for the late reply. I took some time to dig at this issue, and found that a proper fix will need more thought. I'm OK with this fix for now, until we can address it with a proper fix. I've updated the commit message and comments below to reflect the underlying issue. Thanks, Leo drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index 774db34..bbfffaf 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -4755,12 +4755,18 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state) } spin_unlock_irqrestore(&adev->ddev->event_lock, flags); - /* Signal HW programming completion */ - drm_atomic_helper_commit_hw_done(state); if (wait_for_vblank) drm_atomic_helper_wait_for_flip_done(dev, state); + /* + * FIXME: + * Delay hw_done() until flip_done() is signaled. This is to block + * another commit from freeing the CRTC state while we're still + * waiting on flip_done. + */ + drm_atomic_helper_commit_hw_done(state); + drm_atomic_helper_cleanup_planes(dev, state); /* -- 2.7.4 _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply related [flat|nested] 5+ messages in thread
[parent not found: <1538436399-28084-1-git-send-email-sunpeng.li-5C7GfCeVMHo@public.gmane.org>]
* Re: [PATCH v2] drm/amd/display: Signal hw_done() after waiting for flip_done() [not found] ` <1538436399-28084-1-git-send-email-sunpeng.li-5C7GfCeVMHo@public.gmane.org> @ 2018-10-02 14:41 ` Harry Wentland 0 siblings, 0 replies; 5+ messages in thread From: Harry Wentland @ 2018-10-02 14:41 UTC (permalink / raw) To: sunpeng.li-5C7GfCeVMHo, shirish.s-5C7GfCeVMHo Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW On 2018-10-01 07:26 PM, sunpeng.li@amd.com wrote: > From: Shirish S <shirish.s@amd.com> > > In amdgpu_dm_commit_tail(), wait until flip_done() is signaled before > we signal hw_done(). > > [Why] > > This is to temporary address a paging error that occurs when a s/temporary/temporarily/g > nonblocking commit contends with another commit, particularly in a > mirrored multi-display configuration. The error occurs in > drm_atomic_helper_wait_for_flip_done(), when we attempt to access the > contents of new_crtc_state->commit. > > Here's the sequence for a mirrored 2 display setup (irrelevant steps > left out for clarity): > > **THREAD 1** | **THREAD 2** > | > Initialize atomic state for flip | > | > Queue worker | > ... > > | Do work for flip > | > | Signal hw_done() on CRTC 1 > | Signal hw_done() on CRTC 2 > | > | Wait for flip_done() on CRTC 1 > > <---- **PREEMPTED BY THREAD 1** > > Initialize atomic state for cursor > update (1) | > | > Do cursor update worker | > | > Clear atomic state (2) | > **DONE** | > ... > | > | Wait for flip_done() on CRTC 2 > | *ERROR* > | > > The issue starts with (1). When the atomic state is initialized, the > current CRTC states are duplicated to be the new_crtc_states, and > referenced to be the old_crtc_states. (The new_crtc_states are to be > filled with update data.) > > Some things to note: > > * Due to the mirrored configuration, the cursor updates on both CRTCs. > > * At this point, the pflip IRQ has already been handled, and flip_done > signaled on all CRTCs. The cursor commit can therefore continue. > > * The old_crtc_states used by the cursor update are the **same states** > as the new_crtc_states used by the flip worker. > > At (2), the old_crtc_state is freed (*), and the cursor commit > completes. We then context switch back to the flip worker, where we > attempt to access the new_crtc_state->commit object. This is > problematic, as this state has already been freed. > > (*) Technically, 'state->crtcs[i].state' is freed, which was made to > reference old_crtc_state in drm_atomic_helper_swap_state() > > [How] > > By moving hw_done() after wait_for_flip_done(), we're guaranteed that > the new_crtc_state (from the flip worker's perspective) still exists. > This is because any other commit will be blocked, waiting for the > hw_done() signal. > > Note that both the i915 and imx drivers have this sequence flipped > already, masking this problem. > > Signed-off-by: Shirish S <shirish.s@amd.com> > Signed-off-by: Leo Li <sunpeng.li@amd.com> Thanks for the additional explanation. Took me a while to understand what was happening here. Let's keep pursuing a proper solution that allows us to flip hw_done and wait_for_flip_done again but since that might be non-trivial this change is Reviewed-by: Harry Wentland <harry.wentland@amd.com> Harry > --- > > Hi Shirish, > > Sorry for the late reply. I took some time to dig at this issue, and found > that a proper fix will need more thought. I'm OK with this fix for now, until > we can address it with a proper fix. > > I've updated the commit message and comments below to reflect the underlying > issue. > > Thanks, > Leo > > drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 10 ++++++++-- > 1 file changed, 8 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > index 774db34..bbfffaf 100644 > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > @@ -4755,12 +4755,18 @@ static void amdgpu_dm_atomic_commit_tail(struct drm_atomic_state *state) > } > spin_unlock_irqrestore(&adev->ddev->event_lock, flags); > > - /* Signal HW programming completion */ > - drm_atomic_helper_commit_hw_done(state); > > if (wait_for_vblank) > drm_atomic_helper_wait_for_flip_done(dev, state); > > + /* > + * FIXME: > + * Delay hw_done() until flip_done() is signaled. This is to block > + * another commit from freeing the CRTC state while we're still > + * waiting on flip_done. > + */ > + drm_atomic_helper_commit_hw_done(state); > + > drm_atomic_helper_cleanup_planes(dev, state); > > /* > _______________________________________________ amd-gfx mailing list amd-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2018-10-02 14:41 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-09-24 13:31 [PATCH] drm/amd/display: move the nonblocking commit helpers appropriately Shirish S
[not found] ` <1537795907-1168-1-git-send-email-shirish.s-5C7GfCeVMHo@public.gmane.org>
2018-09-28 13:16 ` [PATCH] drm/amd/display: Work around race beetween hw_done and wait_for_flip Harry Wentland
[not found] ` <20180928131606.15931-1-harry.wentland-5C7GfCeVMHo@public.gmane.org>
2018-10-01 9:06 ` S, Shirish
2018-10-01 23:26 ` [PATCH v2] drm/amd/display: Signal hw_done() after waiting for flip_done() sunpeng.li-5C7GfCeVMHo
[not found] ` <1538436399-28084-1-git-send-email-sunpeng.li-5C7GfCeVMHo@public.gmane.org>
2018-10-02 14:41 ` Harry Wentland
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox