intel-gfx.lists.freedesktop.org archive mirror
* Incorrect plane programming sequence results into a corrupted display/hard hung system
       [not found] <7673AD08FC9483488D001D004D4635333156113B@ORSMSX106.amr.corp.intel.com>
@ 2018-04-12 16:57 ` Vyas, Tarun
  2018-04-12 19:28   ` Runyan, Arthur J
  0 siblings, 1 reply; 3+ messages in thread
From: Vyas, Tarun @ 2018-04-12 16:57 UTC
  To: intel-gfx@lists.freedesktop.org
  Cc: Herbert, Marc, Runyan, Arthur J, Shaikh, Azhar, Ciobanu, Nathan D,
	Lankhorst, Maarten


On KBL platforms, with HW overlay and/or PSR2, a hard hang with a corrupted display is observed while running tests that frequently disable/re-enable primary/overlay planes. Details are recorded in this FDO bug: https://bugs.freedesktop.org/show_bug.cgi?id=104975.

The issue has been root-caused to a race in which only part of the register updates gets latched at the next vblank; specifically, the updates that give away the current plane's buffer allocation are latched before the update that disables the plane (again, details are captured in the FDO bug above). Several workarounds have been tried:

1. Enable Isochronous Priority Control (IPC).

2. Increase the vblank evasion time to 250 usec (we also tried 500 usec, but that doesn't help).

3. Disable DOUBLE_BUFFER_CTL while the updates are done, inside intel_pipe_update_start and intel_pipe_update_end (doesn't help).

4. Grab all the required locks before starting the pipe_update.
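
For reference, the vblank evasion check in item 2 can be sketched roughly as follows. This is a minimal sketch, not the actual i915 code: struct mode_timing, usecs_to_scanlines and in_evasion_window are hypothetical names, and the timings in the usage note are just a common 1080p60 mode. It also shows why a longer window alone cannot be a guarantee: the check only says whether it is safe to *start* writing, and a delay after the check can still push the writes across the vblank.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mode timings; field names are illustrative only. */
struct mode_timing {
    uint32_t pixel_clock_khz; /* pixel clock in kHz */
    uint32_t htotal;          /* pixels per scanline, including blanking */
    uint32_t vblank_start;    /* first scanline of vertical blank */
};

/* Convert a duration in microseconds to scanlines, rounding up. */
static uint32_t usecs_to_scanlines(const struct mode_timing *m, uint32_t usecs)
{
    assert(m->htotal > 0);
    /* kHz * usec gives pixels * 1000 */
    uint64_t scaled_pixels = (uint64_t)usecs * m->pixel_clock_khz;
    uint64_t denom = 1000ull * m->htotal;
    return (uint32_t)((scaled_pixels + denom - 1) / denom);
}

/*
 * True when the current scanline is so close to vblank that a batch of
 * register writes started now might straddle the latch point. A caller
 * would wait until this returns false before writing.
 */
static bool in_evasion_window(const struct mode_timing *m,
                              uint32_t scanline, uint32_t evade_us)
{
    uint32_t lines = usecs_to_scanlines(m, evade_us);
    uint32_t min = m->vblank_start > lines ? m->vblank_start - lines : 0;
    return scanline >= min && scanline < m->vblank_start;
}
```

With a 148.5 MHz pixel clock and htotal of 2200, a 250 usec window works out to 17 scanlines, so writes would be held off from scanline 1063 until vblank start at 1080.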

Per Ville, none of the above optimizations guarantees a *full* update before the vblank. As a result, to fix this issue the right way, the plane programming sequence in the driver needs to be altered as described below:

"Buffer allocation overlap among enabled planes will cause a full frame underrun, and that becomes a hard hang if pkgC or SAGV are enabled.
You need to make sure the plane is disabled before reallocating the buffer it uses.  For a single pipe it is sufficient to initiate the disabling of the plane before the reallocation.  For multiple pipes it can be more complex.
In this case you should be doing something like this to ensure plane 2A turns off before plane 1A steals the buffer:


1. PLANE_CTL_2A -> disabled

2. PLANE_SURF_2A: touch to arm double buffer update

3. PLANE_BUF_CFG_1A -> (0-860)

4. PLANE_SURF_1A: touch to arm double buffer update

If the planes are on different pipes there needs to be a wait for vblank between step 2 and 3 to ensure the plane 2A disable completed."
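
The four steps quoted above could be sketched like this against a mock register file. Everything here is an assumption made for illustration: struct mock_hw, reg_write, wait_for_vblank and the (end << 16) | start packing in buf_cfg are hypothetical stand-ins, not the driver's MMIO helpers or a confirmed register layout. The only point is the ordering of the writes and where the vblank wait goes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register IDs for a mock register file, not MMIO offsets. */
enum reg { PLANE_CTL_2A, PLANE_SURF_2A, PLANE_BUF_CFG_1A, PLANE_SURF_1A, REG_COUNT };

#define LOG_VBLANK_WAIT REG_COUNT /* log marker for a vblank wait */
#define LOG_MAX 8

struct mock_hw {
    uint32_t regs[REG_COUNT];
    int log[LOG_MAX]; /* order of operations, for inspection */
    int n;
};

static void reg_write(struct mock_hw *hw, enum reg r, uint32_t val)
{
    assert(hw->n < LOG_MAX);
    hw->regs[r] = val;
    hw->log[hw->n++] = (int)r;
}

/* Stand-in for a real vblank wait; it only records that the wait happened. */
static void wait_for_vblank(struct mock_hw *hw)
{
    assert(hw->n < LOG_MAX);
    hw->log[hw->n++] = LOG_VBLANK_WAIT;
}

/* Pack a buffer range; this layout is assumed for the mock only. */
static uint32_t buf_cfg(uint32_t start, uint32_t end)
{
    return (end << 16) | start;
}

static void steal_buffer_from_2a(struct mock_hw *hw, bool different_pipes,
                                 uint32_t surf_1a)
{
    reg_write(hw, PLANE_CTL_2A, 0);                        /* 1. disable plane 2A */
    reg_write(hw, PLANE_SURF_2A, hw->regs[PLANE_SURF_2A]); /* 2. touch to arm */
    if (different_pipes)
        wait_for_vblank(hw); /* let the plane 2A disable complete first */
    reg_write(hw, PLANE_BUF_CFG_1A, buf_cfg(0, 860));      /* 3. grow 1A's range */
    reg_write(hw, PLANE_SURF_1A, surf_1a);                 /* 4. touch to arm */
}
```

Calling steal_buffer_from_2a(&hw, true, addr) on a zero-initialized mock_hw leaves five entries in hw.log: the two plane 2A writes, the vblank wait marker, then the two plane 1A writes.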


Please comment as required.

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* Re: Incorrect plane programming sequence results into a corrupted display/hard hung system
  2018-04-12 16:57 ` Incorrect plane programming sequence results into a corrupted display/hard hung system Vyas, Tarun
@ 2018-04-12 19:28   ` Runyan, Arthur J
  2018-04-17 19:17     ` Ville Syrjälä
  0 siblings, 1 reply; 3+ messages in thread
From: Runyan, Arthur J @ 2018-04-12 19:28 UTC
  To: Vyas, Tarun, intel-gfx@lists.freedesktop.org
  Cc: Herbert, Marc, Shaikh, Azhar, Ciobanu, Nathan D,
	Lankhorst, Maarten


This seems like a typical atomic modeset requirement.

IPC should not impact register programming.
Vblank evasion only works if you have a guarantee on worst-case interrupts/delays.  I think the locks are part of that guarantee.
Double buffer control should guarantee safe alignment of programming across planes on the same pipe.  Multiple pipes will still require a wait for vblank.

* Re: Incorrect plane programming sequence results into a corrupted display/hard hung system
  2018-04-12 19:28   ` Runyan, Arthur J
@ 2018-04-17 19:17     ` Ville Syrjälä
  0 siblings, 0 replies; 3+ messages in thread
From: Ville Syrjälä @ 2018-04-17 19:17 UTC
  To: Runyan, Arthur J
  Cc: intel-gfx@lists.freedesktop.org, Shaikh, Azhar, Ciobanu, Nathan D,
	Herbert, Marc, Lankhorst, Maarten

On Thu, Apr 12, 2018 at 07:28:48PM +0000, Runyan, Arthur J wrote:
> This seems like a typical atomic modeset requirement.
> 
> IPC should not impact register programming.
> Vblank evasion only works if you have a guarantee on worst-case interrupts/delays.  I think the locks are part of that guarantee.
> Double buffer control should guarantee safe alignment of programming across planes on the same pipe.

This is the part that troubles me. This was supposedly tested, but
apparently it didn't help. Are there some relevant registers that
don't respect the double buffer control?

> Multiple pipes will still require a wait for vblank.

I don't think the problems should be related to multiple pipes. We
shouldn't be overlapping any allocations between planes on different
pipes while they're running, and the pipe enable/disable code should
be sequencing things correctly (with appropriate vblank waits) to
avoid overlaps.
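
To make the invariant concrete, here is a rough sketch of an overlap check between buffer allocations. struct buf_range, ranges_overlap and overlaps_any are hypothetical names for this illustration, and treating ranges as half-open [start, end) blocks with an empty range meaning "plane disabled" is an assumption, not a statement about how the driver stores them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open block range [start, end) in the shared display buffer. */
struct buf_range {
    uint16_t start;
    uint16_t end;
};

/* An empty range stands for a plane that holds no allocation. */
static bool range_is_empty(struct buf_range r)
{
    return r.start >= r.end;
}

static bool ranges_overlap(struct buf_range a, struct buf_range b)
{
    if (range_is_empty(a) || range_is_empty(b))
        return false;
    return a.start < b.end && b.start < a.end;
}

/*
 * True if 'candidate' overlaps any allocation that is still active,
 * skipping index 'self' (the plane whose range is being changed).
 */
static bool overlaps_any(const struct buf_range *active, size_t n,
                         size_t self, struct buf_range candidate)
{
    assert(active != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (i != self && ranges_overlap(active[i], candidate))
            return true;
    }
    return false;
}
```

In the example from the original mail, growing plane 1A to (0-860) overlaps plane 2A's range for as long as 2A still holds an allocation, and stops overlapping once 2A's range has been cleared.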

-- 
Ville Syrjälä
Intel
