* [PATCH 1/5] drm/amd/display: Reduce number of arguments of dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport()
2022-08-30 20:34 [PATCH 0/5] drm/amd/display: Reduce stack usage for clang Nathan Chancellor
@ 2022-08-30 20:34 ` Nathan Chancellor
2022-08-30 20:34 ` [PATCH 2/5] drm/amd/display: Reduce number of arguments of dml32_CalculatePrefetchSchedule() Nathan Chancellor
` (5 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Nathan Chancellor @ 2022-08-30 20:34 UTC (permalink / raw)
To: Harry Wentland, Leo Li, Rodrigo Siqueira, Alex Deucher,
Christian König, Pan, Xinhui
Cc: Nick Desaulniers, Tom Rix, Sudip Mukherjee, amd-gfx, dri-devel,
llvm, patches, Nathan Chancellor
Most of the arguments are identical between the two call sites and they
can be accessed through the 'struct vba_vars_st' pointer created at the
top of dml32_ModeSupportAndSystemConfigurationFull(). This reduces the
total amount of stack space that
dml32_ModeSupportAndSystemConfigurationFull() uses by 216 bytes with
LLVM 16 (2152 -> 1936), helping clear up the following clang warning:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:1721:6: error: stack frame size (2152) exceeds limit (2048) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
^
1 error generated.
Additionally, while modifying the arguments to
dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport(), use 'v'
consistently, instead of 'v' mixed with 'mode_lib->vba'.
Link: https://github.com/ClangBuiltLinux/linux/issues/1681
Reported-by: "Sudip Mukherjee (Codethink)" <sudipm.mukherjee@gmail.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
---
.../dc/dml/dcn32/display_mode_vba_32.c | 118 ++-------
.../dc/dml/dcn32/display_mode_vba_util_32.c | 248 ++++++++----------
.../dc/dml/dcn32/display_mode_vba_util_32.h | 33 +--
3 files changed, 140 insertions(+), 259 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
index d8014bfbc3fe..7da90fba95fc 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
@@ -1163,58 +1163,28 @@ static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman
v->dummy_vars.DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation.mmSOCParameters.SMNLatency = mode_lib->vba.SMNLatency;
dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport(
- mode_lib->vba.USRRetrainingRequiredFinal,
- mode_lib->vba.UsesMALLForPStateChange,
- mode_lib->vba.PrefetchModePerState[mode_lib->vba.VoltageLevel][mode_lib->vba.maxMpcComb],
- mode_lib->vba.NumberOfActiveSurfaces,
- mode_lib->vba.MaxLineBufferLines,
- mode_lib->vba.LineBufferSizeFinal,
- mode_lib->vba.WritebackInterfaceBufferSize,
- mode_lib->vba.DCFCLK,
- mode_lib->vba.ReturnBW,
- mode_lib->vba.SynchronizeTimingsFinal,
- mode_lib->vba.SynchronizeDRRDisplaysForUCLKPStateChangeFinal,
- mode_lib->vba.DRRDisplay,
- v->dpte_group_bytes,
- v->meta_row_height,
- v->meta_row_height_chroma,
+ v,
+ v->PrefetchModePerState[v->VoltageLevel][v->maxMpcComb],
+ v->DCFCLK,
+ v->ReturnBW,
v->dummy_vars.DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation.mmSOCParameters,
- mode_lib->vba.WritebackChunkSize,
- mode_lib->vba.SOCCLK,
+ v->SOCCLK,
v->DCFCLKDeepSleep,
- mode_lib->vba.DETBufferSizeY,
- mode_lib->vba.DETBufferSizeC,
- mode_lib->vba.SwathHeightY,
- mode_lib->vba.SwathHeightC,
- mode_lib->vba.LBBitPerPixel,
+ v->DETBufferSizeY,
+ v->DETBufferSizeC,
+ v->SwathHeightY,
+ v->SwathHeightC,
v->SwathWidthY,
v->SwathWidthC,
- mode_lib->vba.HRatio,
- mode_lib->vba.HRatioChroma,
- mode_lib->vba.vtaps,
- mode_lib->vba.VTAPsChroma,
- mode_lib->vba.VRatio,
- mode_lib->vba.VRatioChroma,
- mode_lib->vba.HTotal,
- mode_lib->vba.VTotal,
- mode_lib->vba.VActive,
- mode_lib->vba.PixelClock,
- mode_lib->vba.BlendingAndTiming,
- mode_lib->vba.DPPPerPlane,
+ v->DPPPerPlane,
v->BytePerPixelDETY,
v->BytePerPixelDETC,
v->DSTXAfterScaler,
v->DSTYAfterScaler,
- mode_lib->vba.WritebackEnable,
- mode_lib->vba.WritebackPixelFormat,
- mode_lib->vba.WritebackDestinationWidth,
- mode_lib->vba.WritebackDestinationHeight,
- mode_lib->vba.WritebackSourceHeight,
v->UnboundedRequestEnabled,
v->CompressedBufferSizeInkByte,
/* Output */
- &v->Watermark,
&v->dummy_vars.DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation.dummy_dramchange_support,
v->MaxActiveDRAMClockChangeLatencySupported,
v->SubViewportLinesNeededInMALL,
@@ -3559,62 +3529,32 @@ void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l
{
dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport(
- mode_lib->vba.USRRetrainingRequiredFinal,
- mode_lib->vba.UsesMALLForPStateChange,
- mode_lib->vba.PrefetchModePerState[i][j],
- mode_lib->vba.NumberOfActiveSurfaces,
- mode_lib->vba.MaxLineBufferLines,
- mode_lib->vba.LineBufferSizeFinal,
- mode_lib->vba.WritebackInterfaceBufferSize,
- mode_lib->vba.DCFCLKState[i][j],
- mode_lib->vba.ReturnBWPerState[i][j],
- mode_lib->vba.SynchronizeTimingsFinal,
- mode_lib->vba.SynchronizeDRRDisplaysForUCLKPStateChangeFinal,
- mode_lib->vba.DRRDisplay,
- mode_lib->vba.dpte_group_bytes,
- mode_lib->vba.meta_row_height,
- mode_lib->vba.meta_row_height_chroma,
+ v,
+ v->PrefetchModePerState[i][j],
+ v->DCFCLKState[i][j],
+ v->ReturnBWPerState[i][j],
v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.mSOCParameters,
- mode_lib->vba.WritebackChunkSize,
- mode_lib->vba.SOCCLKPerState[i],
- mode_lib->vba.ProjectedDCFCLKDeepSleep[i][j],
- mode_lib->vba.DETBufferSizeYThisState,
- mode_lib->vba.DETBufferSizeCThisState,
- mode_lib->vba.SwathHeightYThisState,
- mode_lib->vba.SwathHeightCThisState,
- mode_lib->vba.LBBitPerPixel,
- mode_lib->vba.SwathWidthYThisState, // 24
- mode_lib->vba.SwathWidthCThisState,
- mode_lib->vba.HRatio,
- mode_lib->vba.HRatioChroma,
- mode_lib->vba.vtaps,
- mode_lib->vba.VTAPsChroma,
- mode_lib->vba.VRatio,
- mode_lib->vba.VRatioChroma,
- mode_lib->vba.HTotal,
- mode_lib->vba.VTotal,
- mode_lib->vba.VActive,
- mode_lib->vba.PixelClock,
- mode_lib->vba.BlendingAndTiming,
- mode_lib->vba.NoOfDPPThisState,
- mode_lib->vba.BytePerPixelInDETY,
- mode_lib->vba.BytePerPixelInDETC,
+ v->SOCCLKPerState[i],
+ v->ProjectedDCFCLKDeepSleep[i][j],
+ v->DETBufferSizeYThisState,
+ v->DETBufferSizeCThisState,
+ v->SwathHeightYThisState,
+ v->SwathHeightCThisState,
+ v->SwathWidthYThisState, // 24
+ v->SwathWidthCThisState,
+ v->NoOfDPPThisState,
+ v->BytePerPixelInDETY,
+ v->BytePerPixelInDETC,
v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.DSTXAfterScaler,
v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.DSTYAfterScaler,
- mode_lib->vba.WritebackEnable,
- mode_lib->vba.WritebackPixelFormat,
- mode_lib->vba.WritebackDestinationWidth,
- mode_lib->vba.WritebackDestinationHeight,
- mode_lib->vba.WritebackSourceHeight,
- mode_lib->vba.UnboundedRequestEnabledThisState,
- mode_lib->vba.CompressedBufferSizeInkByteThisState,
+ v->UnboundedRequestEnabledThisState,
+ v->CompressedBufferSizeInkByteThisState,
/* Output */
- &mode_lib->vba.Watermark, // Store the values in vba
- &mode_lib->vba.DRAMClockChangeSupport[i][j],
+ &v->DRAMClockChangeSupport[i][j],
&v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.dummy_single2[0], // double *MaxActiveDRAMClockChangeLatencySupported
&v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.dummy_integer[0], // Long SubViewportLinesNeededInMALL[]
- &mode_lib->vba.FCLKChangeSupport[i][j],
+ &v->FCLKChangeSupport[i][j],
&v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.dummy_single2[1], // double *MinActiveFCLKChangeLatencySupported
&mode_lib->vba.USRRetrainingSupport[i][j],
mode_lib->vba.ActiveDRAMClockChangeLatencyMarginPerState[i][j]);
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.c b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.c
index dc501ee7d01a..6c7b748ee62b 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.c
@@ -4209,58 +4209,28 @@ void dml32_CalculateFlipSchedule(
} // CalculateFlipSchedule
void dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport(
- bool USRRetrainingRequiredFinal,
- enum dm_use_mall_for_pstate_change_mode UseMALLForPStateChange[],
+ struct vba_vars_st *v,
unsigned int PrefetchMode,
- unsigned int NumberOfActiveSurfaces,
- unsigned int MaxLineBufferLines,
- unsigned int LineBufferSize,
- unsigned int WritebackInterfaceBufferSize,
double DCFCLK,
double ReturnBW,
- bool SynchronizeTimingsFinal,
- bool SynchronizeDRRDisplaysForUCLKPStateChangeFinal,
- bool DRRDisplay[],
- unsigned int dpte_group_bytes[],
- unsigned int meta_row_height[],
- unsigned int meta_row_height_chroma[],
SOCParametersList mmSOCParameters,
- unsigned int WritebackChunkSize,
double SOCCLK,
double DCFClkDeepSleep,
unsigned int DETBufferSizeY[],
unsigned int DETBufferSizeC[],
unsigned int SwathHeightY[],
unsigned int SwathHeightC[],
- unsigned int LBBitPerPixel[],
double SwathWidthY[],
double SwathWidthC[],
- double HRatio[],
- double HRatioChroma[],
- unsigned int VTaps[],
- unsigned int VTapsChroma[],
- double VRatio[],
- double VRatioChroma[],
- unsigned int HTotal[],
- unsigned int VTotal[],
- unsigned int VActive[],
- double PixelClock[],
- unsigned int BlendingAndTiming[],
unsigned int DPPPerSurface[],
double BytePerPixelDETY[],
double BytePerPixelDETC[],
double DSTXAfterScaler[],
double DSTYAfterScaler[],
- bool WritebackEnable[],
- enum source_format_class WritebackPixelFormat[],
- double WritebackDestinationWidth[],
- double WritebackDestinationHeight[],
- double WritebackSourceHeight[],
bool UnboundedRequestEnabled,
unsigned int CompressedBufferSizeInkByte,
/* Output */
- Watermarks *Watermark,
enum clock_change_support *DRAMClockChangeSupport,
double MaxActiveDRAMClockChangeLatencySupported[],
unsigned int SubViewportLinesNeededInMALL[],
@@ -4302,136 +4272,136 @@ void dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport(
unsigned int LBLatencyHidingSourceLinesY[DC__NUM_DPP__MAX];
unsigned int LBLatencyHidingSourceLinesC[DC__NUM_DPP__MAX];
- Watermark->UrgentWatermark = mmSOCParameters.UrgentLatency + mmSOCParameters.ExtraLatency;
- Watermark->USRRetrainingWatermark = mmSOCParameters.UrgentLatency + mmSOCParameters.ExtraLatency
+ v->Watermark.UrgentWatermark = mmSOCParameters.UrgentLatency + mmSOCParameters.ExtraLatency;
+ v->Watermark.USRRetrainingWatermark = mmSOCParameters.UrgentLatency + mmSOCParameters.ExtraLatency
+ mmSOCParameters.USRRetrainingLatency + mmSOCParameters.SMNLatency;
- Watermark->DRAMClockChangeWatermark = mmSOCParameters.DRAMClockChangeLatency + Watermark->UrgentWatermark;
- Watermark->FCLKChangeWatermark = mmSOCParameters.FCLKChangeLatency + Watermark->UrgentWatermark;
- Watermark->StutterExitWatermark = mmSOCParameters.SRExitTime + mmSOCParameters.ExtraLatency
+ v->Watermark.DRAMClockChangeWatermark = mmSOCParameters.DRAMClockChangeLatency + v->Watermark.UrgentWatermark;
+ v->Watermark.FCLKChangeWatermark = mmSOCParameters.FCLKChangeLatency + v->Watermark.UrgentWatermark;
+ v->Watermark.StutterExitWatermark = mmSOCParameters.SRExitTime + mmSOCParameters.ExtraLatency
+ 10 / DCFClkDeepSleep;
- Watermark->StutterEnterPlusExitWatermark = mmSOCParameters.SREnterPlusExitTime + mmSOCParameters.ExtraLatency
+ v->Watermark.StutterEnterPlusExitWatermark = mmSOCParameters.SREnterPlusExitTime + mmSOCParameters.ExtraLatency
+ 10 / DCFClkDeepSleep;
- Watermark->Z8StutterExitWatermark = mmSOCParameters.SRExitZ8Time + mmSOCParameters.ExtraLatency
+ v->Watermark.Z8StutterExitWatermark = mmSOCParameters.SRExitZ8Time + mmSOCParameters.ExtraLatency
+ 10 / DCFClkDeepSleep;
- Watermark->Z8StutterEnterPlusExitWatermark = mmSOCParameters.SREnterPlusExitZ8Time
+ v->Watermark.Z8StutterEnterPlusExitWatermark = mmSOCParameters.SREnterPlusExitZ8Time
+ mmSOCParameters.ExtraLatency + 10 / DCFClkDeepSleep;
#ifdef __DML_VBA_DEBUG__
dml_print("DML::%s: UrgentLatency = %f\n", __func__, mmSOCParameters.UrgentLatency);
dml_print("DML::%s: ExtraLatency = %f\n", __func__, mmSOCParameters.ExtraLatency);
dml_print("DML::%s: DRAMClockChangeLatency = %f\n", __func__, mmSOCParameters.DRAMClockChangeLatency);
- dml_print("DML::%s: UrgentWatermark = %f\n", __func__, Watermark->UrgentWatermark);
- dml_print("DML::%s: USRRetrainingWatermark = %f\n", __func__, Watermark->USRRetrainingWatermark);
- dml_print("DML::%s: DRAMClockChangeWatermark = %f\n", __func__, Watermark->DRAMClockChangeWatermark);
- dml_print("DML::%s: FCLKChangeWatermark = %f\n", __func__, Watermark->FCLKChangeWatermark);
- dml_print("DML::%s: StutterExitWatermark = %f\n", __func__, Watermark->StutterExitWatermark);
- dml_print("DML::%s: StutterEnterPlusExitWatermark = %f\n", __func__, Watermark->StutterEnterPlusExitWatermark);
- dml_print("DML::%s: Z8StutterExitWatermark = %f\n", __func__, Watermark->Z8StutterExitWatermark);
+ dml_print("DML::%s: UrgentWatermark = %f\n", __func__, v->Watermark.UrgentWatermark);
+ dml_print("DML::%s: USRRetrainingWatermark = %f\n", __func__, v->Watermark.USRRetrainingWatermark);
+ dml_print("DML::%s: DRAMClockChangeWatermark = %f\n", __func__, v->Watermark.DRAMClockChangeWatermark);
+ dml_print("DML::%s: FCLKChangeWatermark = %f\n", __func__, v->Watermark.FCLKChangeWatermark);
+ dml_print("DML::%s: StutterExitWatermark = %f\n", __func__, v->Watermark.StutterExitWatermark);
+ dml_print("DML::%s: StutterEnterPlusExitWatermark = %f\n", __func__, v->Watermark.StutterEnterPlusExitWatermark);
+ dml_print("DML::%s: Z8StutterExitWatermark = %f\n", __func__, v->Watermark.Z8StutterExitWatermark);
dml_print("DML::%s: Z8StutterEnterPlusExitWatermark = %f\n",
- __func__, Watermark->Z8StutterEnterPlusExitWatermark);
+ __func__, v->Watermark.Z8StutterEnterPlusExitWatermark);
#endif
TotalActiveWriteback = 0;
- for (k = 0; k < NumberOfActiveSurfaces; ++k) {
- if (WritebackEnable[k] == true)
+ for (k = 0; k < v->NumberOfActiveSurfaces; ++k) {
+ if (v->WritebackEnable[k] == true)
TotalActiveWriteback = TotalActiveWriteback + 1;
}
if (TotalActiveWriteback <= 1) {
- Watermark->WritebackUrgentWatermark = mmSOCParameters.WritebackLatency;
+ v->Watermark.WritebackUrgentWatermark = mmSOCParameters.WritebackLatency;
} else {
- Watermark->WritebackUrgentWatermark = mmSOCParameters.WritebackLatency
- + WritebackChunkSize * 1024.0 / 32.0 / SOCCLK;
+ v->Watermark.WritebackUrgentWatermark = mmSOCParameters.WritebackLatency
+ + v->WritebackChunkSize * 1024.0 / 32.0 / SOCCLK;
}
- if (USRRetrainingRequiredFinal)
- Watermark->WritebackUrgentWatermark = Watermark->WritebackUrgentWatermark
+ if (v->USRRetrainingRequiredFinal)
+ v->Watermark.WritebackUrgentWatermark = v->Watermark.WritebackUrgentWatermark
+ mmSOCParameters.USRRetrainingLatency;
if (TotalActiveWriteback <= 1) {
- Watermark->WritebackDRAMClockChangeWatermark = mmSOCParameters.DRAMClockChangeLatency
+ v->Watermark.WritebackDRAMClockChangeWatermark = mmSOCParameters.DRAMClockChangeLatency
+ mmSOCParameters.WritebackLatency;
- Watermark->WritebackFCLKChangeWatermark = mmSOCParameters.FCLKChangeLatency
+ v->Watermark.WritebackFCLKChangeWatermark = mmSOCParameters.FCLKChangeLatency
+ mmSOCParameters.WritebackLatency;
} else {
- Watermark->WritebackDRAMClockChangeWatermark = mmSOCParameters.DRAMClockChangeLatency
- + mmSOCParameters.WritebackLatency + WritebackChunkSize * 1024.0 / 32.0 / SOCCLK;
- Watermark->WritebackFCLKChangeWatermark = mmSOCParameters.FCLKChangeLatency
- + mmSOCParameters.WritebackLatency + WritebackChunkSize * 1024 / 32 / SOCCLK;
+ v->Watermark.WritebackDRAMClockChangeWatermark = mmSOCParameters.DRAMClockChangeLatency
+ + mmSOCParameters.WritebackLatency + v->WritebackChunkSize * 1024.0 / 32.0 / SOCCLK;
+ v->Watermark.WritebackFCLKChangeWatermark = mmSOCParameters.FCLKChangeLatency
+ + mmSOCParameters.WritebackLatency + v->WritebackChunkSize * 1024 / 32 / SOCCLK;
}
- if (USRRetrainingRequiredFinal)
- Watermark->WritebackDRAMClockChangeWatermark = Watermark->WritebackDRAMClockChangeWatermark
+ if (v->USRRetrainingRequiredFinal)
+ v->Watermark.WritebackDRAMClockChangeWatermark = v->Watermark.WritebackDRAMClockChangeWatermark
+ mmSOCParameters.USRRetrainingLatency;
- if (USRRetrainingRequiredFinal)
- Watermark->WritebackFCLKChangeWatermark = Watermark->WritebackFCLKChangeWatermark
+ if (v->USRRetrainingRequiredFinal)
+ v->Watermark.WritebackFCLKChangeWatermark = v->Watermark.WritebackFCLKChangeWatermark
+ mmSOCParameters.USRRetrainingLatency;
#ifdef __DML_VBA_DEBUG__
dml_print("DML::%s: WritebackDRAMClockChangeWatermark = %f\n",
- __func__, Watermark->WritebackDRAMClockChangeWatermark);
- dml_print("DML::%s: WritebackFCLKChangeWatermark = %f\n", __func__, Watermark->WritebackFCLKChangeWatermark);
- dml_print("DML::%s: WritebackUrgentWatermark = %f\n", __func__, Watermark->WritebackUrgentWatermark);
- dml_print("DML::%s: USRRetrainingRequiredFinal = %d\n", __func__, USRRetrainingRequiredFinal);
+ __func__, v->Watermark.WritebackDRAMClockChangeWatermark);
+ dml_print("DML::%s: WritebackFCLKChangeWatermark = %f\n", __func__, v->Watermark.WritebackFCLKChangeWatermark);
+ dml_print("DML::%s: WritebackUrgentWatermark = %f\n", __func__, v->Watermark.WritebackUrgentWatermark);
+ dml_print("DML::%s: v->USRRetrainingRequiredFinal = %d\n", __func__, v->USRRetrainingRequiredFinal);
dml_print("DML::%s: USRRetrainingLatency = %f\n", __func__, mmSOCParameters.USRRetrainingLatency);
#endif
- for (k = 0; k < NumberOfActiveSurfaces; ++k) {
- TotalPixelBW = TotalPixelBW + DPPPerSurface[k] * (SwathWidthY[k] * BytePerPixelDETY[k] * VRatio[k] +
- SwathWidthC[k] * BytePerPixelDETC[k] * VRatioChroma[k]) / (HTotal[k] / PixelClock[k]);
+ for (k = 0; k < v->NumberOfActiveSurfaces; ++k) {
+ TotalPixelBW = TotalPixelBW + DPPPerSurface[k] * (SwathWidthY[k] * BytePerPixelDETY[k] * v->VRatio[k] +
+ SwathWidthC[k] * BytePerPixelDETC[k] * v->VRatioChroma[k]) / (v->HTotal[k] / v->PixelClock[k]);
}
- for (k = 0; k < NumberOfActiveSurfaces; ++k) {
+ for (k = 0; k < v->NumberOfActiveSurfaces; ++k) {
- LBLatencyHidingSourceLinesY[k] = dml_min((double) MaxLineBufferLines, dml_floor(LineBufferSize / LBBitPerPixel[k] / (SwathWidthY[k] / dml_max(HRatio[k], 1.0)), 1)) - (VTaps[k] - 1);
- LBLatencyHidingSourceLinesC[k] = dml_min((double) MaxLineBufferLines, dml_floor(LineBufferSize / LBBitPerPixel[k] / (SwathWidthC[k] / dml_max(HRatioChroma[k], 1.0)), 1)) - (VTapsChroma[k] - 1);
+ LBLatencyHidingSourceLinesY[k] = dml_min((double) v->MaxLineBufferLines, dml_floor(v->LineBufferSizeFinal / v->LBBitPerPixel[k] / (SwathWidthY[k] / dml_max(v->HRatio[k], 1.0)), 1)) - (v->vtaps[k] - 1);
+ LBLatencyHidingSourceLinesC[k] = dml_min((double) v->MaxLineBufferLines, dml_floor(v->LineBufferSizeFinal / v->LBBitPerPixel[k] / (SwathWidthC[k] / dml_max(v->HRatioChroma[k], 1.0)), 1)) - (v->VTAPsChroma[k] - 1);
#ifdef __DML_VBA_DEBUG__
- dml_print("DML::%s: k=%d, MaxLineBufferLines = %d\n", __func__, k, MaxLineBufferLines);
- dml_print("DML::%s: k=%d, LineBufferSize = %d\n", __func__, k, LineBufferSize);
- dml_print("DML::%s: k=%d, LBBitPerPixel = %d\n", __func__, k, LBBitPerPixel[k]);
- dml_print("DML::%s: k=%d, HRatio = %f\n", __func__, k, HRatio[k]);
- dml_print("DML::%s: k=%d, VTaps = %d\n", __func__, k, VTaps[k]);
+ dml_print("DML::%s: k=%d, v->MaxLineBufferLines = %d\n", __func__, k, v->MaxLineBufferLines);
+ dml_print("DML::%s: k=%d, v->LineBufferSizeFinal = %d\n", __func__, k, v->LineBufferSizeFinal);
+ dml_print("DML::%s: k=%d, v->LBBitPerPixel = %d\n", __func__, k, v->LBBitPerPixel[k]);
+ dml_print("DML::%s: k=%d, v->HRatio = %f\n", __func__, k, v->HRatio[k]);
+ dml_print("DML::%s: k=%d, v->vtaps = %d\n", __func__, k, v->vtaps[k]);
#endif
- EffectiveLBLatencyHidingY = LBLatencyHidingSourceLinesY[k] / VRatio[k] * (HTotal[k] / PixelClock[k]);
- EffectiveLBLatencyHidingC = LBLatencyHidingSourceLinesC[k] / VRatioChroma[k] * (HTotal[k] / PixelClock[k]);
+ EffectiveLBLatencyHidingY = LBLatencyHidingSourceLinesY[k] / v->VRatio[k] * (v->HTotal[k] / v->PixelClock[k]);
+ EffectiveLBLatencyHidingC = LBLatencyHidingSourceLinesC[k] / v->VRatioChroma[k] * (v->HTotal[k] / v->PixelClock[k]);
EffectiveDETBufferSizeY = DETBufferSizeY[k];
if (UnboundedRequestEnabled) {
EffectiveDETBufferSizeY = EffectiveDETBufferSizeY
+ CompressedBufferSizeInkByte * 1024
- * (SwathWidthY[k] * BytePerPixelDETY[k] * VRatio[k])
- / (HTotal[k] / PixelClock[k]) / TotalPixelBW;
+ * (SwathWidthY[k] * BytePerPixelDETY[k] * v->VRatio[k])
+ / (v->HTotal[k] / v->PixelClock[k]) / TotalPixelBW;
}
LinesInDETY[k] = (double) EffectiveDETBufferSizeY / BytePerPixelDETY[k] / SwathWidthY[k];
LinesInDETYRoundedDownToSwath[k] = dml_floor(LinesInDETY[k], SwathHeightY[k]);
- FullDETBufferingTimeY = LinesInDETYRoundedDownToSwath[k] * (HTotal[k] / PixelClock[k]) / VRatio[k];
+ FullDETBufferingTimeY = LinesInDETYRoundedDownToSwath[k] * (v->HTotal[k] / v->PixelClock[k]) / v->VRatio[k];
ActiveClockChangeLatencyHidingY = EffectiveLBLatencyHidingY + FullDETBufferingTimeY
- - (DSTXAfterScaler[k] / HTotal[k] + DSTYAfterScaler[k]) * HTotal[k] / PixelClock[k];
+ - (DSTXAfterScaler[k] / v->HTotal[k] + DSTYAfterScaler[k]) * v->HTotal[k] / v->PixelClock[k];
- if (NumberOfActiveSurfaces > 1) {
+ if (v->NumberOfActiveSurfaces > 1) {
ActiveClockChangeLatencyHidingY = ActiveClockChangeLatencyHidingY
- - (1 - 1 / NumberOfActiveSurfaces) * SwathHeightY[k] * HTotal[k]
- / PixelClock[k] / VRatio[k];
+ - (1 - 1 / v->NumberOfActiveSurfaces) * SwathHeightY[k] * v->HTotal[k]
+ / v->PixelClock[k] / v->VRatio[k];
}
if (BytePerPixelDETC[k] > 0) {
LinesInDETC[k] = DETBufferSizeC[k] / BytePerPixelDETC[k] / SwathWidthC[k];
LinesInDETCRoundedDownToSwath[k] = dml_floor(LinesInDETC[k], SwathHeightC[k]);
- FullDETBufferingTimeC = LinesInDETCRoundedDownToSwath[k] * (HTotal[k] / PixelClock[k])
- / VRatioChroma[k];
+ FullDETBufferingTimeC = LinesInDETCRoundedDownToSwath[k] * (v->HTotal[k] / v->PixelClock[k])
+ / v->VRatioChroma[k];
ActiveClockChangeLatencyHidingC = EffectiveLBLatencyHidingC + FullDETBufferingTimeC
- - (DSTXAfterScaler[k] / HTotal[k] + DSTYAfterScaler[k]) * HTotal[k]
- / PixelClock[k];
- if (NumberOfActiveSurfaces > 1) {
+ - (DSTXAfterScaler[k] / v->HTotal[k] + DSTYAfterScaler[k]) * v->HTotal[k]
+ / v->PixelClock[k];
+ if (v->NumberOfActiveSurfaces > 1) {
ActiveClockChangeLatencyHidingC = ActiveClockChangeLatencyHidingC
- - (1 - 1 / NumberOfActiveSurfaces) * SwathHeightC[k] * HTotal[k]
- / PixelClock[k] / VRatioChroma[k];
+ - (1 - 1 / v->NumberOfActiveSurfaces) * SwathHeightC[k] * v->HTotal[k]
+ / v->PixelClock[k] / v->VRatioChroma[k];
}
ActiveClockChangeLatencyHiding = dml_min(ActiveClockChangeLatencyHidingY,
ActiveClockChangeLatencyHidingC);
@@ -4439,24 +4409,24 @@ void dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport(
ActiveClockChangeLatencyHiding = ActiveClockChangeLatencyHidingY;
}
- ActiveDRAMClockChangeLatencyMargin[k] = ActiveClockChangeLatencyHiding - Watermark->UrgentWatermark
- - Watermark->DRAMClockChangeWatermark;
- ActiveFCLKChangeLatencyMargin[k] = ActiveClockChangeLatencyHiding - Watermark->UrgentWatermark
- - Watermark->FCLKChangeWatermark;
- USRRetrainingLatencyMargin[k] = ActiveClockChangeLatencyHiding - Watermark->USRRetrainingWatermark;
-
- if (WritebackEnable[k]) {
- WritebackLatencyHiding = WritebackInterfaceBufferSize * 1024
- / (WritebackDestinationWidth[k] * WritebackDestinationHeight[k]
- / (WritebackSourceHeight[k] * HTotal[k] / PixelClock[k]) * 4);
- if (WritebackPixelFormat[k] == dm_444_64)
+ ActiveDRAMClockChangeLatencyMargin[k] = ActiveClockChangeLatencyHiding - v->Watermark.UrgentWatermark
+ - v->Watermark.DRAMClockChangeWatermark;
+ ActiveFCLKChangeLatencyMargin[k] = ActiveClockChangeLatencyHiding - v->Watermark.UrgentWatermark
+ - v->Watermark.FCLKChangeWatermark;
+ USRRetrainingLatencyMargin[k] = ActiveClockChangeLatencyHiding - v->Watermark.USRRetrainingWatermark;
+
+ if (v->WritebackEnable[k]) {
+ WritebackLatencyHiding = v->WritebackInterfaceBufferSize * 1024
+ / (v->WritebackDestinationWidth[k] * v->WritebackDestinationHeight[k]
+ / (v->WritebackSourceHeight[k] * v->HTotal[k] / v->PixelClock[k]) * 4);
+ if (v->WritebackPixelFormat[k] == dm_444_64)
WritebackLatencyHiding = WritebackLatencyHiding / 2;
WritebackDRAMClockChangeLatencyMargin = WritebackLatencyHiding
- - Watermark->WritebackDRAMClockChangeWatermark;
+ - v->Watermark.WritebackDRAMClockChangeWatermark;
WritebackFCLKChangeLatencyMargin = WritebackLatencyHiding
- - Watermark->WritebackFCLKChangeWatermark;
+ - v->Watermark.WritebackFCLKChangeWatermark;
ActiveDRAMClockChangeLatencyMargin[k] = dml_min(ActiveDRAMClockChangeLatencyMargin[k],
WritebackFCLKChangeLatencyMargin);
@@ -4464,22 +4434,22 @@ void dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport(
WritebackDRAMClockChangeLatencyMargin);
}
MaxActiveDRAMClockChangeLatencySupported[k] =
- (UseMALLForPStateChange[k] == dm_use_mall_pstate_change_phantom_pipe) ?
+ (v->UsesMALLForPStateChange[k] == dm_use_mall_pstate_change_phantom_pipe) ?
0 :
(ActiveDRAMClockChangeLatencyMargin[k]
+ mmSOCParameters.DRAMClockChangeLatency);
}
- for (i = 0; i < NumberOfActiveSurfaces; ++i) {
- for (j = 0; j < NumberOfActiveSurfaces; ++j) {
+ for (i = 0; i < v->NumberOfActiveSurfaces; ++i) {
+ for (j = 0; j < v->NumberOfActiveSurfaces; ++j) {
if (i == j ||
- (BlendingAndTiming[i] == i && BlendingAndTiming[j] == i) ||
- (BlendingAndTiming[j] == j && BlendingAndTiming[i] == j) ||
- (BlendingAndTiming[i] == BlendingAndTiming[j] && BlendingAndTiming[i] != i) ||
- (SynchronizeTimingsFinal && PixelClock[i] == PixelClock[j] &&
- HTotal[i] == HTotal[j] && VTotal[i] == VTotal[j] &&
- VActive[i] == VActive[j]) || (SynchronizeDRRDisplaysForUCLKPStateChangeFinal &&
- (DRRDisplay[i] || DRRDisplay[j]))) {
+ (v->BlendingAndTiming[i] == i && v->BlendingAndTiming[j] == i) ||
+ (v->BlendingAndTiming[j] == j && v->BlendingAndTiming[i] == j) ||
+ (v->BlendingAndTiming[i] == v->BlendingAndTiming[j] && v->BlendingAndTiming[i] != i) ||
+ (v->SynchronizeTimingsFinal && v->PixelClock[i] == v->PixelClock[j] &&
+ v->HTotal[i] == v->HTotal[j] && v->VTotal[i] == v->VTotal[j] &&
+ v->VActive[i] == v->VActive[j]) || (v->SynchronizeDRRDisplaysForUCLKPStateChangeFinal &&
+ (v->DRRDisplay[i] || v->DRRDisplay[j]))) {
SynchronizedSurfaces[i][j] = true;
} else {
SynchronizedSurfaces[i][j] = false;
@@ -4487,8 +4457,8 @@ void dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport(
}
}
- for (k = 0; k < NumberOfActiveSurfaces; ++k) {
- if ((UseMALLForPStateChange[k] != dm_use_mall_pstate_change_phantom_pipe) &&
+ for (k = 0; k < v->NumberOfActiveSurfaces; ++k) {
+ if ((v->UsesMALLForPStateChange[k] != dm_use_mall_pstate_change_phantom_pipe) &&
(!FoundFirstSurfaceWithMinActiveFCLKChangeMargin ||
ActiveFCLKChangeLatencyMargin[k] < MinActiveFCLKChangeMargin)) {
FoundFirstSurfaceWithMinActiveFCLKChangeMargin = true;
@@ -4500,9 +4470,9 @@ void dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport(
*MinActiveFCLKChangeLatencySupported = MinActiveFCLKChangeMargin + mmSOCParameters.FCLKChangeLatency;
SameTimingForFCLKChange = true;
- for (k = 0; k < NumberOfActiveSurfaces; ++k) {
+ for (k = 0; k < v->NumberOfActiveSurfaces; ++k) {
if (!SynchronizedSurfaces[k][SurfaceWithMinActiveFCLKChangeMargin]) {
- if ((UseMALLForPStateChange[k] != dm_use_mall_pstate_change_phantom_pipe) &&
+ if ((v->UsesMALLForPStateChange[k] != dm_use_mall_pstate_change_phantom_pipe) &&
(SameTimingForFCLKChange ||
ActiveFCLKChangeLatencyMargin[k] <
SecondMinActiveFCLKChangeMarginOneDisplayInVBLank)) {
@@ -4522,17 +4492,17 @@ void dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport(
}
*USRRetrainingSupport = true;
- for (k = 0; k < NumberOfActiveSurfaces; ++k) {
- if ((UseMALLForPStateChange[k] != dm_use_mall_pstate_change_phantom_pipe) &&
+ for (k = 0; k < v->NumberOfActiveSurfaces; ++k) {
+ if ((v->UsesMALLForPStateChange[k] != dm_use_mall_pstate_change_phantom_pipe) &&
(USRRetrainingLatencyMargin[k] < 0)) {
*USRRetrainingSupport = false;
}
}
- for (k = 0; k < NumberOfActiveSurfaces; ++k) {
- if (UseMALLForPStateChange[k] != dm_use_mall_pstate_change_full_frame &&
- UseMALLForPStateChange[k] != dm_use_mall_pstate_change_sub_viewport &&
- UseMALLForPStateChange[k] != dm_use_mall_pstate_change_phantom_pipe &&
+ for (k = 0; k < v->NumberOfActiveSurfaces; ++k) {
+ if (v->UsesMALLForPStateChange[k] != dm_use_mall_pstate_change_full_frame &&
+ v->UsesMALLForPStateChange[k] != dm_use_mall_pstate_change_sub_viewport &&
+ v->UsesMALLForPStateChange[k] != dm_use_mall_pstate_change_phantom_pipe &&
ActiveDRAMClockChangeLatencyMargin[k] < 0) {
if (PrefetchMode > 0) {
DRAMClockChangeSupportNumber = 2;
@@ -4546,10 +4516,10 @@ void dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport(
}
}
- for (k = 0; k < NumberOfActiveSurfaces; ++k) {
- if (UseMALLForPStateChange[k] == dm_use_mall_pstate_change_full_frame)
+ for (k = 0; k < v->NumberOfActiveSurfaces; ++k) {
+ if (v->UsesMALLForPStateChange[k] == dm_use_mall_pstate_change_full_frame)
DRAMClockChangeMethod = 1;
- else if (UseMALLForPStateChange[k] == dm_use_mall_pstate_change_sub_viewport)
+ else if (v->UsesMALLForPStateChange[k] == dm_use_mall_pstate_change_sub_viewport)
DRAMClockChangeMethod = 2;
}
@@ -4576,16 +4546,16 @@ void dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport(
*DRAMClockChangeSupport = dm_dram_clock_change_unsupported;
}
- for (k = 0; k < NumberOfActiveSurfaces; ++k) {
+ for (k = 0; k < v->NumberOfActiveSurfaces; ++k) {
unsigned int dst_y_pstate;
unsigned int src_y_pstate_l;
unsigned int src_y_pstate_c;
unsigned int src_y_ahead_l, src_y_ahead_c, sub_vp_lines_l, sub_vp_lines_c;
- dst_y_pstate = dml_ceil((mmSOCParameters.DRAMClockChangeLatency + mmSOCParameters.UrgentLatency) / (HTotal[k] / PixelClock[k]), 1);
- src_y_pstate_l = dml_ceil(dst_y_pstate * VRatio[k], SwathHeightY[k]);
+ dst_y_pstate = dml_ceil((mmSOCParameters.DRAMClockChangeLatency + mmSOCParameters.UrgentLatency) / (v->HTotal[k] / v->PixelClock[k]), 1);
+ src_y_pstate_l = dml_ceil(dst_y_pstate * v->VRatio[k], SwathHeightY[k]);
src_y_ahead_l = dml_floor(DETBufferSizeY[k] / BytePerPixelDETY[k] / SwathWidthY[k], SwathHeightY[k]) + LBLatencyHidingSourceLinesY[k];
- sub_vp_lines_l = src_y_pstate_l + src_y_ahead_l + meta_row_height[k];
+ sub_vp_lines_l = src_y_pstate_l + src_y_ahead_l + v->meta_row_height[k];
#ifdef __DML_VBA_DEBUG__
dml_print("DML::%s: k=%d, DETBufferSizeY = %d\n", __func__, k, DETBufferSizeY[k]);
@@ -4596,21 +4566,21 @@ dml_print("DML::%s: k=%d, LBLatencyHidingSourceLinesY = %d\n", __func__, k, LBL
dml_print("DML::%s: k=%d, dst_y_pstate = %d\n", __func__, k, dst_y_pstate);
dml_print("DML::%s: k=%d, src_y_pstate_l = %d\n", __func__, k, src_y_pstate_l);
dml_print("DML::%s: k=%d, src_y_ahead_l = %d\n", __func__, k, src_y_ahead_l);
-dml_print("DML::%s: k=%d, meta_row_height = %d\n", __func__, k, meta_row_height[k]);
+dml_print("DML::%s: k=%d, v->meta_row_height = %d\n", __func__, k, v->meta_row_height[k]);
dml_print("DML::%s: k=%d, sub_vp_lines_l = %d\n", __func__, k, sub_vp_lines_l);
#endif
SubViewportLinesNeededInMALL[k] = sub_vp_lines_l;
if (BytePerPixelDETC[k] > 0) {
- src_y_pstate_c = dml_ceil(dst_y_pstate * VRatioChroma[k], SwathHeightC[k]);
+ src_y_pstate_c = dml_ceil(dst_y_pstate * v->VRatioChroma[k], SwathHeightC[k]);
src_y_ahead_c = dml_floor(DETBufferSizeC[k] / BytePerPixelDETC[k] / SwathWidthC[k], SwathHeightC[k]) + LBLatencyHidingSourceLinesC[k];
- sub_vp_lines_c = src_y_pstate_c + src_y_ahead_c + meta_row_height_chroma[k];
+ sub_vp_lines_c = src_y_pstate_c + src_y_ahead_c + v->meta_row_height_chroma[k];
SubViewportLinesNeededInMALL[k] = dml_max(sub_vp_lines_l, sub_vp_lines_c);
#ifdef __DML_VBA_DEBUG__
dml_print("DML::%s: k=%d, src_y_pstate_c = %d\n", __func__, k, src_y_pstate_c);
dml_print("DML::%s: k=%d, src_y_ahead_c = %d\n", __func__, k, src_y_ahead_c);
-dml_print("DML::%s: k=%d, meta_row_height_chroma = %d\n", __func__, k, meta_row_height_chroma[k]);
+dml_print("DML::%s: k=%d, v->meta_row_height_chroma = %d\n", __func__, k, v->meta_row_height_chroma[k]);
dml_print("DML::%s: k=%d, sub_vp_lines_c = %d\n", __func__, k, sub_vp_lines_c);
#endif
}
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.h b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.h
index 626f6605e2d5..7cdf0418830a 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.h
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.h
@@ -30,6 +30,7 @@
#include "os_types.h"
#include "../dc_features.h"
#include "../display_mode_structs.h"
+#include "dml/display_mode_vba.h"
unsigned int dml32_dscceComputeDelay(
unsigned int bpc,
@@ -808,58 +809,28 @@ void dml32_CalculateFlipSchedule(
bool *ImmediateFlipSupportedForPipe);
void dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport(
- bool USRRetrainingRequiredFinal,
- enum dm_use_mall_for_pstate_change_mode UseMALLForPStateChange[],
+ struct vba_vars_st *v,
unsigned int PrefetchMode,
- unsigned int NumberOfActiveSurfaces,
- unsigned int MaxLineBufferLines,
- unsigned int LineBufferSize,
- unsigned int WritebackInterfaceBufferSize,
double DCFCLK,
double ReturnBW,
- bool SynchronizeTimingsFinal,
- bool SynchronizeDRRDisplaysForUCLKPStateChangeFinal,
- bool DRRDisplay[],
- unsigned int dpte_group_bytes[],
- unsigned int meta_row_height[],
- unsigned int meta_row_height_chroma[],
SOCParametersList mmSOCParameters,
- unsigned int WritebackChunkSize,
double SOCCLK,
double DCFClkDeepSleep,
unsigned int DETBufferSizeY[],
unsigned int DETBufferSizeC[],
unsigned int SwathHeightY[],
unsigned int SwathHeightC[],
- unsigned int LBBitPerPixel[],
double SwathWidthY[],
double SwathWidthC[],
- double HRatio[],
- double HRatioChroma[],
- unsigned int VTaps[],
- unsigned int VTapsChroma[],
- double VRatio[],
- double VRatioChroma[],
- unsigned int HTotal[],
- unsigned int VTotal[],
- unsigned int VActive[],
- double PixelClock[],
- unsigned int BlendingAndTiming[],
unsigned int DPPPerSurface[],
double BytePerPixelDETY[],
double BytePerPixelDETC[],
double DSTXAfterScaler[],
double DSTYAfterScaler[],
- bool WritebackEnable[],
- enum source_format_class WritebackPixelFormat[],
- double WritebackDestinationWidth[],
- double WritebackDestinationHeight[],
- double WritebackSourceHeight[],
bool UnboundedRequestEnabled,
unsigned int CompressedBufferSizeInkByte,
/* Output */
- Watermarks *Watermark,
enum clock_change_support *DRAMClockChangeSupport,
double MaxActiveDRAMClockChangeLatencySupported[],
unsigned int SubViewportLinesNeededInMALL[],
--
2.37.2
^ permalink raw reply related [flat|nested] 10+ messages in thread* [PATCH 2/5] drm/amd/display: Reduce number of arguments of dml32_CalculatePrefetchSchedule()
2022-08-30 20:34 [PATCH 0/5] drm/amd/display: Reduce stack usage for clang Nathan Chancellor
2022-08-30 20:34 ` [PATCH 1/5] drm/amd/display: Reduce number of arguments of dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport() Nathan Chancellor
@ 2022-08-30 20:34 ` Nathan Chancellor
2022-08-30 20:34 ` [PATCH 3/5] drm/amd/display: Reduce number of arguments of dml31's CalculateWatermarksAndDRAMSpeedChangeSupport() Nathan Chancellor
` (4 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Nathan Chancellor @ 2022-08-30 20:34 UTC (permalink / raw)
To: Harry Wentland, Leo Li, Rodrigo Siqueira, Alex Deucher,
Christian König, Pan, Xinhui
Cc: Nick Desaulniers, Tom Rix, Sudip Mukherjee, amd-gfx, dri-devel,
llvm, patches, Nathan Chancellor
Several of the arguments are identical between the two call sites and
they can be accessed through the 'struct vba_vars_st' pointer. This
reduces the total amount of stack space that
dml32_ModeSupportAndSystemConfigurationFull() uses by 208 bytes with
LLVM 16 (1936 -> 1728), helping clear up the following clang warning:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:1721:6: error: stack frame size (2152) exceeds limit (2048) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
^
1 error generated.
Additionally, while modifying the arguments to
dml32_CalculatePrefetchSchedule(), use 'v' consistently, instead of 'v'
mixed with 'mode_lib->vba'.
Link: https://github.com/ClangBuiltLinux/linux/issues/1681
Reported-by: "Sudip Mukherjee (Codethink)" <sudipm.mukherjee@gmail.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
---
.../dc/dml/dcn32/display_mode_vba_32.c | 118 +++++++-----------
.../dc/dml/dcn32/display_mode_vba_util_32.c | 75 +++++------
.../dc/dml/dcn32/display_mode_vba_util_32.h | 18 +--
3 files changed, 78 insertions(+), 133 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
index 7da90fba95fc..58c6cc58583a 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_32.c
@@ -755,30 +755,18 @@ static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman
v->dummy_vars.DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation.myPipe.BytePerPixelY = v->BytePerPixelY[k];
v->dummy_vars.DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation.myPipe.BytePerPixelC = v->BytePerPixelC[k];
v->dummy_vars.DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation.myPipe.ProgressiveToInterlaceUnitInOPP = mode_lib->vba.ProgressiveToInterlaceUnitInOPP;
- v->ErrorResult[k] = dml32_CalculatePrefetchSchedule(v->dummy_vars.DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation.HostVMInefficiencyFactor,
- &v->dummy_vars.DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation.myPipe, v->DSCDelay[k],
- mode_lib->vba.DPPCLKDelaySubtotal + mode_lib->vba.DPPCLKDelayCNVCFormater,
- mode_lib->vba.DPPCLKDelaySCL,
- mode_lib->vba.DPPCLKDelaySCLLBOnly,
- mode_lib->vba.DPPCLKDelayCNVCCursor,
- mode_lib->vba.DISPCLKDelaySubtotal,
- (unsigned int) (v->SwathWidthY[k] / mode_lib->vba.HRatio[k]),
- mode_lib->vba.OutputFormat[k],
- mode_lib->vba.MaxInterDCNTileRepeaters,
+ v->ErrorResult[k] = dml32_CalculatePrefetchSchedule(
+ v,
+ k,
+ v->dummy_vars.DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation.HostVMInefficiencyFactor,
+ &v->dummy_vars.DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation.myPipe,
+ v->DSCDelay[k],
+ (unsigned int) (v->SwathWidthY[k] / v->HRatio[k]),
dml_min(v->VStartupLines, v->MaxVStartupLines[k]),
v->MaxVStartupLines[k],
- mode_lib->vba.GPUVMMaxPageTableLevels,
- mode_lib->vba.GPUVMEnable,
- mode_lib->vba.HostVMEnable,
- mode_lib->vba.HostVMMaxNonCachedPageTableLevels,
- mode_lib->vba.HostVMMinPageSize,
- mode_lib->vba.DynamicMetadataEnable[k],
- mode_lib->vba.DynamicMetadataVMEnabled,
- mode_lib->vba.DynamicMetadataLinesBeforeActiveRequired[k],
- mode_lib->vba.DynamicMetadataTransmittedBytes[k],
v->UrgentLatency,
v->UrgentExtraLatency,
- mode_lib->vba.TCalc,
+ v->TCalc,
v->PDEAndMetaPTEBytesFrame[k],
v->MetaRowByte[k],
v->PixelPTEBytesPerRow[k],
@@ -792,8 +780,8 @@ static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman
v->MaxNumSwathC[k],
v->swath_width_luma_ub[k],
v->swath_width_chroma_ub[k],
- mode_lib->vba.SwathHeightY[k],
- mode_lib->vba.SwathHeightC[k],
+ v->SwathHeightY[k],
+ v->SwathHeightC[k],
TWait,
/* Output */
&v->DSTXAfterScaler[k],
@@ -3230,63 +3218,47 @@ void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l
mode_lib->vba.NoTimeForPrefetch[i][j][k] =
dml32_CalculatePrefetchSchedule(
+ v,
+ k,
v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.HostVMInefficiencyFactor,
&v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.myPipe,
- mode_lib->vba.DSCDelayPerState[i][k],
- mode_lib->vba.DPPCLKDelaySubtotal +
- mode_lib->vba.DPPCLKDelayCNVCFormater,
- mode_lib->vba.DPPCLKDelaySCL,
- mode_lib->vba.DPPCLKDelaySCLLBOnly,
- mode_lib->vba.DPPCLKDelayCNVCCursor,
- mode_lib->vba.DISPCLKDelaySubtotal,
- mode_lib->vba.SwathWidthYThisState[k] /
- mode_lib->vba.HRatio[k],
- mode_lib->vba.OutputFormat[k],
- mode_lib->vba.MaxInterDCNTileRepeaters,
- dml_min(mode_lib->vba.MaxVStartup,
- mode_lib->vba.MaximumVStartup[i][j][k]),
- mode_lib->vba.MaximumVStartup[i][j][k],
- mode_lib->vba.GPUVMMaxPageTableLevels,
- mode_lib->vba.GPUVMEnable, mode_lib->vba.HostVMEnable,
- mode_lib->vba.HostVMMaxNonCachedPageTableLevels,
- mode_lib->vba.HostVMMinPageSize,
- mode_lib->vba.DynamicMetadataEnable[k],
- mode_lib->vba.DynamicMetadataVMEnabled,
- mode_lib->vba.DynamicMetadataLinesBeforeActiveRequired[k],
- mode_lib->vba.DynamicMetadataTransmittedBytes[k],
- mode_lib->vba.UrgLatency[i],
- mode_lib->vba.ExtraLatency,
- mode_lib->vba.TimeCalc,
- mode_lib->vba.PDEAndMetaPTEBytesPerFrame[i][j][k],
- mode_lib->vba.MetaRowBytes[i][j][k],
- mode_lib->vba.DPTEBytesPerRow[i][j][k],
- mode_lib->vba.PrefetchLinesY[i][j][k],
- mode_lib->vba.SwathWidthYThisState[k],
- mode_lib->vba.PrefillY[k],
- mode_lib->vba.MaxNumSwY[k],
- mode_lib->vba.PrefetchLinesC[i][j][k],
- mode_lib->vba.SwathWidthCThisState[k],
- mode_lib->vba.PrefillC[k],
- mode_lib->vba.MaxNumSwC[k],
- mode_lib->vba.swath_width_luma_ub_this_state[k],
- mode_lib->vba.swath_width_chroma_ub_this_state[k],
- mode_lib->vba.SwathHeightYThisState[k],
- mode_lib->vba.SwathHeightCThisState[k], mode_lib->vba.TWait,
+ v->DSCDelayPerState[i][k],
+ v->SwathWidthYThisState[k] / v->HRatio[k],
+ dml_min(v->MaxVStartup, v->MaximumVStartup[i][j][k]),
+ v->MaximumVStartup[i][j][k],
+ v->UrgLatency[i],
+ v->ExtraLatency,
+ v->TimeCalc,
+ v->PDEAndMetaPTEBytesPerFrame[i][j][k],
+ v->MetaRowBytes[i][j][k],
+ v->DPTEBytesPerRow[i][j][k],
+ v->PrefetchLinesY[i][j][k],
+ v->SwathWidthYThisState[k],
+ v->PrefillY[k],
+ v->MaxNumSwY[k],
+ v->PrefetchLinesC[i][j][k],
+ v->SwathWidthCThisState[k],
+ v->PrefillC[k],
+ v->MaxNumSwC[k],
+ v->swath_width_luma_ub_this_state[k],
+ v->swath_width_chroma_ub_this_state[k],
+ v->SwathHeightYThisState[k],
+ v->SwathHeightCThisState[k], v->TWait,
/* Output */
&v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.DSTXAfterScaler[k],
&v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.DSTYAfterScaler[k],
- &mode_lib->vba.LineTimesForPrefetch[k],
- &mode_lib->vba.PrefetchBW[k],
- &mode_lib->vba.LinesForMetaPTE[k],
- &mode_lib->vba.LinesForMetaAndDPTERow[k],
- &mode_lib->vba.VRatioPreY[i][j][k],
- &mode_lib->vba.VRatioPreC[i][j][k],
- &mode_lib->vba.RequiredPrefetchPixelDataBWLuma[0][0][k],
- &mode_lib->vba.RequiredPrefetchPixelDataBWChroma[0][0][k],
- &mode_lib->vba.NoTimeForDynamicMetadata[i][j][k],
- &mode_lib->vba.Tno_bw[k],
- &mode_lib->vba.prefetch_vmrow_bw[k],
+ &v->LineTimesForPrefetch[k],
+ &v->PrefetchBW[k],
+ &v->LinesForMetaPTE[k],
+ &v->LinesForMetaAndDPTERow[k],
+ &v->VRatioPreY[i][j][k],
+ &v->VRatioPreC[i][j][k],
+ &v->RequiredPrefetchPixelDataBWLuma[0][0][k],
+ &v->RequiredPrefetchPixelDataBWChroma[0][0][k],
+ &v->NoTimeForDynamicMetadata[i][j][k],
+ &v->Tno_bw[k],
+ &v->prefetch_vmrow_bw[k],
&v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.dummy_single[0], // double *Tdmdl_vm
&v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.dummy_single[1], // double *Tdmdl
&v->dummy_vars.dml32_ModeSupportAndSystemConfigurationFull.dummy_single[2], // double *TSetup
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.c b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.c
index 6c7b748ee62b..61be183a4036 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.c
@@ -3366,28 +3366,14 @@ double dml32_CalculateExtraLatency(
} // CalculateExtraLatency
bool dml32_CalculatePrefetchSchedule(
+ struct vba_vars_st *v,
+ unsigned int k,
double HostVMInefficiencyFactor,
DmlPipe *myPipe,
unsigned int DSCDelay,
- double DPPCLKDelaySubtotalPlusCNVCFormater,
- double DPPCLKDelaySCL,
- double DPPCLKDelaySCLLBOnly,
- double DPPCLKDelayCNVCCursor,
- double DISPCLKDelaySubtotal,
unsigned int DPP_RECOUT_WIDTH,
- enum output_format_class OutputFormat,
- unsigned int MaxInterDCNTileRepeaters,
unsigned int VStartup,
unsigned int MaxVStartup,
- unsigned int GPUVMPageTableLevels,
- bool GPUVMEnable,
- bool HostVMEnable,
- unsigned int HostVMMaxNonCachedPageTableLevels,
- double HostVMMinPageSize,
- bool DynamicMetadataEnable,
- bool DynamicMetadataVMEnabled,
- int DynamicMetadataLinesBeforeActiveRequired,
- unsigned int DynamicMetadataTransmittedBytes,
double UrgentLatency,
double UrgentExtraLatency,
double TCalc,
@@ -3428,6 +3414,7 @@ bool dml32_CalculatePrefetchSchedule(
double *VUpdateWidthPix,
double *VReadyOffsetPix)
{
+ double DPPCLKDelaySubtotalPlusCNVCFormater = v->DPPCLKDelaySubtotal + v->DPPCLKDelayCNVCFormater;
bool MyError = false;
unsigned int DPPCycles, DISPCLKCycles;
double DSTTotalPixelsAfterScaler;
@@ -3464,27 +3451,27 @@ bool dml32_CalculatePrefetchSchedule(
double Tsw_est1 = 0;
double Tsw_est3 = 0;
- if (GPUVMEnable == true && HostVMEnable == true)
- HostVMDynamicLevelsTrips = HostVMMaxNonCachedPageTableLevels;
+ if (v->GPUVMEnable == true && v->HostVMEnable == true)
+ HostVMDynamicLevelsTrips = v->HostVMMaxNonCachedPageTableLevels;
else
HostVMDynamicLevelsTrips = 0;
#ifdef __DML_VBA_DEBUG__
- dml_print("DML::%s: GPUVMEnable = %d\n", __func__, GPUVMEnable);
- dml_print("DML::%s: GPUVMPageTableLevels = %d\n", __func__, GPUVMPageTableLevels);
+ dml_print("DML::%s: v->GPUVMEnable = %d\n", __func__, v->GPUVMEnable);
+ dml_print("DML::%s: v->GPUVMMaxPageTableLevels = %d\n", __func__, v->GPUVMMaxPageTableLevels);
dml_print("DML::%s: DCCEnable = %d\n", __func__, myPipe->DCCEnable);
- dml_print("DML::%s: HostVMEnable=%d HostVMInefficiencyFactor=%f\n",
- __func__, HostVMEnable, HostVMInefficiencyFactor);
+ dml_print("DML::%s: v->HostVMEnable=%d HostVMInefficiencyFactor=%f\n",
+ __func__, v->HostVMEnable, HostVMInefficiencyFactor);
#endif
dml32_CalculateVUpdateAndDynamicMetadataParameters(
- MaxInterDCNTileRepeaters,
+ v->MaxInterDCNTileRepeaters,
myPipe->Dppclk,
myPipe->Dispclk,
myPipe->DCFClkDeepSleep,
myPipe->PixelClock,
myPipe->HTotal,
myPipe->VBlank,
- DynamicMetadataTransmittedBytes,
- DynamicMetadataLinesBeforeActiveRequired,
+ v->DynamicMetadataTransmittedBytes[k],
+ v->DynamicMetadataLinesBeforeActiveRequired[k],
myPipe->InterlaceEnable,
myPipe->ProgressiveToInterlaceUnitInOPP,
TSetup,
@@ -3499,19 +3486,19 @@ bool dml32_CalculatePrefetchSchedule(
LineTime = myPipe->HTotal / myPipe->PixelClock;
trip_to_mem = UrgentLatency;
- Tvm_trips = UrgentExtraLatency + trip_to_mem * (GPUVMPageTableLevels * (HostVMDynamicLevelsTrips + 1) - 1);
+ Tvm_trips = UrgentExtraLatency + trip_to_mem * (v->GPUVMMaxPageTableLevels * (HostVMDynamicLevelsTrips + 1) - 1);
- if (DynamicMetadataVMEnabled == true)
+ if (v->DynamicMetadataVMEnabled == true)
*Tdmdl = TWait + Tvm_trips + trip_to_mem;
else
*Tdmdl = TWait + UrgentExtraLatency;
#ifdef __DML_VBA_ALLOW_DELTA__
- if (DynamicMetadataEnable == false)
+ if (v->DynamicMetadataEnable[k] == false)
*Tdmdl = 0.0;
#endif
- if (DynamicMetadataEnable == true) {
+ if (v->DynamicMetadataEnable[k] == true) {
if (VStartup * LineTime < *TSetup + *Tdmdl + Tdmbf + Tdmec + Tdmsks) {
*NotEnoughTimeForDynamicMetadata = true;
#ifdef __DML_VBA_DEBUG__
@@ -3531,17 +3518,17 @@ bool dml32_CalculatePrefetchSchedule(
*NotEnoughTimeForDynamicMetadata = false;
}
- *Tdmdl_vm = (DynamicMetadataEnable == true && DynamicMetadataVMEnabled == true &&
- GPUVMEnable == true ? TWait + Tvm_trips : 0);
+ *Tdmdl_vm = (v->DynamicMetadataEnable[k] == true && v->DynamicMetadataVMEnabled == true &&
+ v->GPUVMEnable == true ? TWait + Tvm_trips : 0);
if (myPipe->ScalerEnabled)
- DPPCycles = DPPCLKDelaySubtotalPlusCNVCFormater + DPPCLKDelaySCL;
+ DPPCycles = DPPCLKDelaySubtotalPlusCNVCFormater + v->DPPCLKDelaySCL;
else
- DPPCycles = DPPCLKDelaySubtotalPlusCNVCFormater + DPPCLKDelaySCLLBOnly;
+ DPPCycles = DPPCLKDelaySubtotalPlusCNVCFormater + v->DPPCLKDelaySCLLBOnly;
- DPPCycles = DPPCycles + myPipe->NumberOfCursors * DPPCLKDelayCNVCCursor;
+ DPPCycles = DPPCycles + myPipe->NumberOfCursors * v->DPPCLKDelayCNVCCursor;
- DISPCLKCycles = DISPCLKDelaySubtotal;
+ DISPCLKCycles = v->DISPCLKDelaySubtotal;
if (myPipe->Dppclk == 0.0 || myPipe->Dispclk == 0.0)
return true;
@@ -3567,7 +3554,7 @@ bool dml32_CalculatePrefetchSchedule(
dml_print("DML::%s: DSTXAfterScaler: %d\n", __func__, *DSTXAfterScaler);
#endif
- if (OutputFormat == dm_420 || (myPipe->InterlaceEnable && myPipe->ProgressiveToInterlaceUnitInOPP))
+ if (v->OutputFormat[k] == dm_420 || (myPipe->InterlaceEnable && myPipe->ProgressiveToInterlaceUnitInOPP))
*DSTYAfterScaler = 1;
else
*DSTYAfterScaler = 0;
@@ -3584,13 +3571,13 @@ bool dml32_CalculatePrefetchSchedule(
Tr0_trips = trip_to_mem * (HostVMDynamicLevelsTrips + 1);
- if (GPUVMEnable == true) {
+ if (v->GPUVMEnable == true) {
Tvm_trips_rounded = dml_ceil(4.0 * Tvm_trips / LineTime, 1.0) / 4.0 * LineTime;
Tr0_trips_rounded = dml_ceil(4.0 * Tr0_trips / LineTime, 1.0) / 4.0 * LineTime;
- if (GPUVMPageTableLevels >= 3) {
+ if (v->GPUVMMaxPageTableLevels >= 3) {
*Tno_bw = UrgentExtraLatency + trip_to_mem *
- (double) ((GPUVMPageTableLevels - 2) * (HostVMDynamicLevelsTrips + 1) - 1);
- } else if (GPUVMPageTableLevels == 1 && myPipe->DCCEnable != true) {
+ (double) ((v->GPUVMMaxPageTableLevels - 2) * (HostVMDynamicLevelsTrips + 1) - 1);
+ } else if (v->GPUVMMaxPageTableLevels == 1 && myPipe->DCCEnable != true) {
Tr0_trips_rounded = dml_ceil(4.0 * UrgentExtraLatency / LineTime, 1.0) /
4.0 * LineTime; // VBA_ERROR
*Tno_bw = UrgentExtraLatency;
@@ -3625,7 +3612,7 @@ bool dml32_CalculatePrefetchSchedule(
min_Lsw = dml_max(min_Lsw, 1.0);
Lsw_oto = dml_ceil(4.0 * dml_max(prefetch_sw_bytes / prefetch_bw_oto / LineTime, min_Lsw), 1.0) / 4.0;
- if (GPUVMEnable == true) {
+ if (v->GPUVMEnable == true) {
Tvm_oto = dml_max3(
Tvm_trips,
*Tno_bw + PDEAndMetaPTEBytesFrame * HostVMInefficiencyFactor / prefetch_bw_oto,
@@ -3633,7 +3620,7 @@ bool dml32_CalculatePrefetchSchedule(
} else
Tvm_oto = LineTime / 4.0;
- if ((GPUVMEnable == true || myPipe->DCCEnable == true)) {
+ if ((v->GPUVMEnable == true || myPipe->DCCEnable == true)) {
Tr0_oto = dml_max4(
Tr0_trips,
(MetaRowByte + PixelPTEBytesPerRow * HostVMInefficiencyFactor) / prefetch_bw_oto,
@@ -3836,7 +3823,7 @@ bool dml32_CalculatePrefetchSchedule(
#endif
if (prefetch_bw_equ > 0) {
- if (GPUVMEnable == true) {
+ if (v->GPUVMEnable == true) {
Tvm_equ = dml_max3(*Tno_bw + PDEAndMetaPTEBytesFrame *
HostVMInefficiencyFactor / prefetch_bw_equ,
Tvm_trips, LineTime / 4);
@@ -3844,7 +3831,7 @@ bool dml32_CalculatePrefetchSchedule(
Tvm_equ = LineTime / 4;
}
- if ((GPUVMEnable == true || myPipe->DCCEnable == true)) {
+ if ((v->GPUVMEnable == true || myPipe->DCCEnable == true)) {
Tr0_equ = dml_max4((MetaRowByte + PixelPTEBytesPerRow *
HostVMInefficiencyFactor) / prefetch_bw_equ, Tr0_trips,
(LineTime - Tvm_equ) / 2, LineTime / 4);
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.h b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.h
index 7cdf0418830a..3dbc9cf46aad 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.h
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/display_mode_vba_util_32.h
@@ -714,28 +714,14 @@ double dml32_CalculateExtraLatency(
unsigned int HostVMMaxNonCachedPageTableLevels);
bool dml32_CalculatePrefetchSchedule(
+ struct vba_vars_st *v,
+ unsigned int k,
double HostVMInefficiencyFactor,
DmlPipe *myPipe,
unsigned int DSCDelay,
- double DPPCLKDelaySubtotalPlusCNVCFormater,
- double DPPCLKDelaySCL,
- double DPPCLKDelaySCLLBOnly,
- double DPPCLKDelayCNVCCursor,
- double DISPCLKDelaySubtotal,
unsigned int DPP_RECOUT_WIDTH,
- enum output_format_class OutputFormat,
- unsigned int MaxInterDCNTileRepeaters,
unsigned int VStartup,
unsigned int MaxVStartup,
- unsigned int GPUVMPageTableLevels,
- bool GPUVMEnable,
- bool HostVMEnable,
- unsigned int HostVMMaxNonCachedPageTableLevels,
- double HostVMMinPageSize,
- bool DynamicMetadataEnable,
- bool DynamicMetadataVMEnabled,
- int DynamicMetadataLinesBeforeActiveRequired,
- unsigned int DynamicMetadataTransmittedBytes,
double UrgentLatency,
double UrgentExtraLatency,
double TCalc,
--
2.37.2
^ permalink raw reply related [flat|nested] 10+ messages in thread* [PATCH 3/5] drm/amd/display: Reduce number of arguments of dml31's CalculateWatermarksAndDRAMSpeedChangeSupport()
2022-08-30 20:34 [PATCH 0/5] drm/amd/display: Reduce stack usage for clang Nathan Chancellor
2022-08-30 20:34 ` [PATCH 1/5] drm/amd/display: Reduce number of arguments of dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport() Nathan Chancellor
2022-08-30 20:34 ` [PATCH 2/5] drm/amd/display: Reduce number of arguments of dml32_CalculatePrefetchSchedule() Nathan Chancellor
@ 2022-08-30 20:34 ` Nathan Chancellor
2022-08-30 20:34 ` [PATCH 4/5] drm/amd/display: Reduce number of arguments of dml31's CalculateFlipSchedule() Nathan Chancellor
` (3 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Nathan Chancellor @ 2022-08-30 20:34 UTC (permalink / raw)
To: Harry Wentland, Leo Li, Rodrigo Siqueira, Alex Deucher,
Christian König, Pan, Xinhui
Cc: Nick Desaulniers, Tom Rix, Sudip Mukherjee, amd-gfx, dri-devel,
llvm, patches, Nathan Chancellor
Most of the arguments are identical between the two call sites and they
can be accessed through the 'struct vba_vars_st' pointer. This reduces
the total amount of stack space that
dml31_ModeSupportAndSystemConfigurationFull() uses by 240 bytes with
LLVM 16 (2216 -> 1976), helping clear up the following clang warning:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.c:3908:6: error: stack frame size (2216) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
^
1 error generated.
Link: https://github.com/ClangBuiltLinux/linux/issues/1681
Reported-by: "Sudip Mukherjee (Codethink)" <sudipm.mukherjee@gmail.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
---
.../dc/dml/dcn31/display_mode_vba_31.c | 248 ++++--------------
1 file changed, 52 insertions(+), 196 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
index d63b4209b14c..21c74ee1deec 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
@@ -311,64 +311,28 @@ static void CalculateVupdateAndDynamicMetadataParameters(
static void CalculateWatermarksAndDRAMSpeedChangeSupport(
struct display_mode_lib *mode_lib,
unsigned int PrefetchMode,
- unsigned int NumberOfActivePlanes,
- unsigned int MaxLineBufferLines,
- unsigned int LineBufferSize,
- unsigned int WritebackInterfaceBufferSize,
double DCFCLK,
double ReturnBW,
- bool SynchronizedVBlank,
- unsigned int dpte_group_bytes[],
- unsigned int MetaChunkSize,
double UrgentLatency,
double ExtraLatency,
- double WritebackLatency,
- double WritebackChunkSize,
double SOCCLK,
- double DRAMClockChangeLatency,
- double SRExitTime,
- double SREnterPlusExitTime,
- double SRExitZ8Time,
- double SREnterPlusExitZ8Time,
double DCFCLKDeepSleep,
unsigned int DETBufferSizeY[],
unsigned int DETBufferSizeC[],
unsigned int SwathHeightY[],
unsigned int SwathHeightC[],
- unsigned int LBBitPerPixel[],
double SwathWidthY[],
double SwathWidthC[],
- double HRatio[],
- double HRatioChroma[],
- unsigned int vtaps[],
- unsigned int VTAPsChroma[],
- double VRatio[],
- double VRatioChroma[],
- unsigned int HTotal[],
- double PixelClock[],
- unsigned int BlendingAndTiming[],
unsigned int DPPPerPlane[],
double BytePerPixelDETY[],
double BytePerPixelDETC[],
- double DSTXAfterScaler[],
- double DSTYAfterScaler[],
- bool WritebackEnable[],
- enum source_format_class WritebackPixelFormat[],
- double WritebackDestinationWidth[],
- double WritebackDestinationHeight[],
- double WritebackSourceHeight[],
bool UnboundedRequestEnabled,
int unsigned CompressedBufferSizeInkByte,
enum clock_change_support *DRAMClockChangeSupport,
- double *UrgentWatermark,
- double *WritebackUrgentWatermark,
- double *DRAMClockChangeWatermark,
- double *WritebackDRAMClockChangeWatermark,
double *StutterExitWatermark,
double *StutterEnterPlusExitWatermark,
double *Z8StutterExitWatermark,
- double *Z8StutterEnterPlusExitWatermark,
- double *MinActiveDRAMClockChangeLatencySupported);
+ double *Z8StutterEnterPlusExitWatermark);
static void CalculateDCFCLKDeepSleep(
struct display_mode_lib *mode_lib,
@@ -3017,64 +2981,28 @@ static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman
CalculateWatermarksAndDRAMSpeedChangeSupport(
mode_lib,
PrefetchMode,
- v->NumberOfActivePlanes,
- v->MaxLineBufferLines,
- v->LineBufferSize,
- v->WritebackInterfaceBufferSize,
v->DCFCLK,
v->ReturnBW,
- v->SynchronizedVBlank,
- v->dpte_group_bytes,
- v->MetaChunkSize,
v->UrgentLatency,
v->UrgentExtraLatency,
- v->WritebackLatency,
- v->WritebackChunkSize,
v->SOCCLK,
- v->DRAMClockChangeLatency,
- v->SRExitTime,
- v->SREnterPlusExitTime,
- v->SRExitZ8Time,
- v->SREnterPlusExitZ8Time,
v->DCFCLKDeepSleep,
v->DETBufferSizeY,
v->DETBufferSizeC,
v->SwathHeightY,
v->SwathHeightC,
- v->LBBitPerPixel,
v->SwathWidthY,
v->SwathWidthC,
- v->HRatio,
- v->HRatioChroma,
- v->vtaps,
- v->VTAPsChroma,
- v->VRatio,
- v->VRatioChroma,
- v->HTotal,
- v->PixelClock,
- v->BlendingAndTiming,
v->DPPPerPlane,
v->BytePerPixelDETY,
v->BytePerPixelDETC,
- v->DSTXAfterScaler,
- v->DSTYAfterScaler,
- v->WritebackEnable,
- v->WritebackPixelFormat,
- v->WritebackDestinationWidth,
- v->WritebackDestinationHeight,
- v->WritebackSourceHeight,
v->UnboundedRequestEnabled,
v->CompressedBufferSizeInkByte,
&DRAMClockChangeSupport,
- &v->UrgentWatermark,
- &v->WritebackUrgentWatermark,
- &v->DRAMClockChangeWatermark,
- &v->WritebackDRAMClockChangeWatermark,
&v->StutterExitWatermark,
&v->StutterEnterPlusExitWatermark,
&v->Z8StutterExitWatermark,
- &v->Z8StutterEnterPlusExitWatermark,
- &v->MinActiveDRAMClockChangeLatencySupported);
+ &v->Z8StutterEnterPlusExitWatermark);
for (k = 0; k < v->NumberOfActivePlanes; ++k) {
if (v->WritebackEnable[k] == true) {
@@ -5384,64 +5312,28 @@ void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l
CalculateWatermarksAndDRAMSpeedChangeSupport(
mode_lib,
v->PrefetchModePerState[i][j],
- v->NumberOfActivePlanes,
- v->MaxLineBufferLines,
- v->LineBufferSize,
- v->WritebackInterfaceBufferSize,
v->DCFCLKState[i][j],
v->ReturnBWPerState[i][j],
- v->SynchronizedVBlank,
- v->dpte_group_bytes,
- v->MetaChunkSize,
v->UrgLatency[i],
v->ExtraLatency,
- v->WritebackLatency,
- v->WritebackChunkSize,
v->SOCCLKPerState[i],
- v->DRAMClockChangeLatency,
- v->SRExitTime,
- v->SREnterPlusExitTime,
- v->SRExitZ8Time,
- v->SREnterPlusExitZ8Time,
v->ProjectedDCFCLKDeepSleep[i][j],
v->DETBufferSizeYThisState,
v->DETBufferSizeCThisState,
v->SwathHeightYThisState,
v->SwathHeightCThisState,
- v->LBBitPerPixel,
v->SwathWidthYThisState,
v->SwathWidthCThisState,
- v->HRatio,
- v->HRatioChroma,
- v->vtaps,
- v->VTAPsChroma,
- v->VRatio,
- v->VRatioChroma,
- v->HTotal,
- v->PixelClock,
- v->BlendingAndTiming,
v->NoOfDPPThisState,
v->BytePerPixelInDETY,
v->BytePerPixelInDETC,
- v->DSTXAfterScaler,
- v->DSTYAfterScaler,
- v->WritebackEnable,
- v->WritebackPixelFormat,
- v->WritebackDestinationWidth,
- v->WritebackDestinationHeight,
- v->WritebackSourceHeight,
UnboundedRequestEnabledThisState,
CompressedBufferSizeInkByteThisState,
&v->DRAMClockChangeSupport[i][j],
- &v->UrgentWatermark,
- &v->WritebackUrgentWatermark,
- &v->DRAMClockChangeWatermark,
- &v->WritebackDRAMClockChangeWatermark,
- &dummy,
&dummy,
&dummy,
&dummy,
- &v->MinActiveDRAMClockChangeLatencySupported);
+ &dummy);
}
}
@@ -5566,64 +5458,28 @@ void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l
static void CalculateWatermarksAndDRAMSpeedChangeSupport(
struct display_mode_lib *mode_lib,
unsigned int PrefetchMode,
- unsigned int NumberOfActivePlanes,
- unsigned int MaxLineBufferLines,
- unsigned int LineBufferSize,
- unsigned int WritebackInterfaceBufferSize,
double DCFCLK,
double ReturnBW,
- bool SynchronizedVBlank,
- unsigned int dpte_group_bytes[],
- unsigned int MetaChunkSize,
double UrgentLatency,
double ExtraLatency,
- double WritebackLatency,
- double WritebackChunkSize,
double SOCCLK,
- double DRAMClockChangeLatency,
- double SRExitTime,
- double SREnterPlusExitTime,
- double SRExitZ8Time,
- double SREnterPlusExitZ8Time,
double DCFCLKDeepSleep,
unsigned int DETBufferSizeY[],
unsigned int DETBufferSizeC[],
unsigned int SwathHeightY[],
unsigned int SwathHeightC[],
- unsigned int LBBitPerPixel[],
double SwathWidthY[],
double SwathWidthC[],
- double HRatio[],
- double HRatioChroma[],
- unsigned int vtaps[],
- unsigned int VTAPsChroma[],
- double VRatio[],
- double VRatioChroma[],
- unsigned int HTotal[],
- double PixelClock[],
- unsigned int BlendingAndTiming[],
unsigned int DPPPerPlane[],
double BytePerPixelDETY[],
double BytePerPixelDETC[],
- double DSTXAfterScaler[],
- double DSTYAfterScaler[],
- bool WritebackEnable[],
- enum source_format_class WritebackPixelFormat[],
- double WritebackDestinationWidth[],
- double WritebackDestinationHeight[],
- double WritebackSourceHeight[],
bool UnboundedRequestEnabled,
int unsigned CompressedBufferSizeInkByte,
enum clock_change_support *DRAMClockChangeSupport,
- double *UrgentWatermark,
- double *WritebackUrgentWatermark,
- double *DRAMClockChangeWatermark,
- double *WritebackDRAMClockChangeWatermark,
double *StutterExitWatermark,
double *StutterEnterPlusExitWatermark,
double *Z8StutterExitWatermark,
- double *Z8StutterEnterPlusExitWatermark,
- double *MinActiveDRAMClockChangeLatencySupported)
+ double *Z8StutterEnterPlusExitWatermark)
{
struct vba_vars_st *v = &mode_lib->vba;
double EffectiveLBLatencyHidingY;
@@ -5643,103 +5499,103 @@ static void CalculateWatermarksAndDRAMSpeedChangeSupport(
double TotalPixelBW = 0.0;
int k, j;
- *UrgentWatermark = UrgentLatency + ExtraLatency;
+ v->UrgentWatermark = UrgentLatency + ExtraLatency;
#ifdef __DML_VBA_DEBUG__
dml_print("DML::%s: UrgentLatency = %f\n", __func__, UrgentLatency);
dml_print("DML::%s: ExtraLatency = %f\n", __func__, ExtraLatency);
- dml_print("DML::%s: UrgentWatermark = %f\n", __func__, *UrgentWatermark);
+ dml_print("DML::%s: UrgentWatermark = %f\n", __func__, v->UrgentWatermark);
#endif
- *DRAMClockChangeWatermark = DRAMClockChangeLatency + *UrgentWatermark;
+ v->DRAMClockChangeWatermark = v->DRAMClockChangeLatency + v->UrgentWatermark;
#ifdef __DML_VBA_DEBUG__
- dml_print("DML::%s: DRAMClockChangeLatency = %f\n", __func__, DRAMClockChangeLatency);
- dml_print("DML::%s: DRAMClockChangeWatermark = %f\n", __func__, *DRAMClockChangeWatermark);
+ dml_print("DML::%s: v->DRAMClockChangeLatency = %f\n", __func__, v->DRAMClockChangeLatency );
+ dml_print("DML::%s: DRAMClockChangeWatermark = %f\n", __func__, v->DRAMClockChangeWatermark);
#endif
v->TotalActiveWriteback = 0;
- for (k = 0; k < NumberOfActivePlanes; ++k) {
- if (WritebackEnable[k] == true) {
+ for (k = 0; k < v->NumberOfActivePlanes; ++k) {
+ if (v->WritebackEnable[k] == true) {
v->TotalActiveWriteback = v->TotalActiveWriteback + 1;
}
}
if (v->TotalActiveWriteback <= 1) {
- *WritebackUrgentWatermark = WritebackLatency;
+ v->WritebackUrgentWatermark = v->WritebackLatency;
} else {
- *WritebackUrgentWatermark = WritebackLatency + WritebackChunkSize * 1024.0 / 32.0 / SOCCLK;
+ v->WritebackUrgentWatermark = v->WritebackLatency + v->WritebackChunkSize * 1024.0 / 32.0 / SOCCLK;
}
if (v->TotalActiveWriteback <= 1) {
- *WritebackDRAMClockChangeWatermark = DRAMClockChangeLatency + WritebackLatency;
+ v->WritebackDRAMClockChangeWatermark = v->DRAMClockChangeLatency + v->WritebackLatency;
} else {
- *WritebackDRAMClockChangeWatermark = DRAMClockChangeLatency + WritebackLatency + WritebackChunkSize * 1024.0 / 32.0 / SOCCLK;
+ v->WritebackDRAMClockChangeWatermark = v->DRAMClockChangeLatency + v->WritebackLatency + v->WritebackChunkSize * 1024.0 / 32.0 / SOCCLK;
}
- for (k = 0; k < NumberOfActivePlanes; ++k) {
+ for (k = 0; k < v->NumberOfActivePlanes; ++k) {
TotalPixelBW = TotalPixelBW
- + DPPPerPlane[k] * (SwathWidthY[k] * BytePerPixelDETY[k] * VRatio[k] + SwathWidthC[k] * BytePerPixelDETC[k] * VRatioChroma[k])
- / (HTotal[k] / PixelClock[k]);
+ + DPPPerPlane[k] * (SwathWidthY[k] * BytePerPixelDETY[k] * v->VRatio[k] + SwathWidthC[k] * BytePerPixelDETC[k] * v->VRatioChroma[k])
+ / (v->HTotal[k] / v->PixelClock[k]);
}
- for (k = 0; k < NumberOfActivePlanes; ++k) {
+ for (k = 0; k < v->NumberOfActivePlanes; ++k) {
double EffectiveDETBufferSizeY = DETBufferSizeY[k];
v->LBLatencyHidingSourceLinesY = dml_min(
- (double) MaxLineBufferLines,
- dml_floor(LineBufferSize / LBBitPerPixel[k] / (SwathWidthY[k] / dml_max(HRatio[k], 1.0)), 1)) - (vtaps[k] - 1);
+ (double) v->MaxLineBufferLines,
+ dml_floor(v->LineBufferSize / v->LBBitPerPixel[k] / (SwathWidthY[k] / dml_max(v->HRatio[k], 1.0)), 1)) - (v->vtaps[k] - 1);
v->LBLatencyHidingSourceLinesC = dml_min(
- (double) MaxLineBufferLines,
- dml_floor(LineBufferSize / LBBitPerPixel[k] / (SwathWidthC[k] / dml_max(HRatioChroma[k], 1.0)), 1)) - (VTAPsChroma[k] - 1);
+ (double) v->MaxLineBufferLines,
+ dml_floor(v->LineBufferSize / v->LBBitPerPixel[k] / (SwathWidthC[k] / dml_max(v->HRatioChroma[k], 1.0)), 1)) - (v->VTAPsChroma[k] - 1);
- EffectiveLBLatencyHidingY = v->LBLatencyHidingSourceLinesY / VRatio[k] * (HTotal[k] / PixelClock[k]);
+ EffectiveLBLatencyHidingY = v->LBLatencyHidingSourceLinesY / v->VRatio[k] * (v->HTotal[k] / v->PixelClock[k]);
- EffectiveLBLatencyHidingC = v->LBLatencyHidingSourceLinesC / VRatioChroma[k] * (HTotal[k] / PixelClock[k]);
+ EffectiveLBLatencyHidingC = v->LBLatencyHidingSourceLinesC / v->VRatioChroma[k] * (v->HTotal[k] / v->PixelClock[k]);
if (UnboundedRequestEnabled) {
EffectiveDETBufferSizeY = EffectiveDETBufferSizeY
- + CompressedBufferSizeInkByte * 1024 * SwathWidthY[k] * BytePerPixelDETY[k] * VRatio[k] / (HTotal[k] / PixelClock[k]) / TotalPixelBW;
+ + CompressedBufferSizeInkByte * 1024 * SwathWidthY[k] * BytePerPixelDETY[k] * v->VRatio[k] / (v->HTotal[k] / v->PixelClock[k]) / TotalPixelBW;
}
LinesInDETY[k] = (double) EffectiveDETBufferSizeY / BytePerPixelDETY[k] / SwathWidthY[k];
LinesInDETYRoundedDownToSwath[k] = dml_floor(LinesInDETY[k], SwathHeightY[k]);
- FullDETBufferingTimeY = LinesInDETYRoundedDownToSwath[k] * (HTotal[k] / PixelClock[k]) / VRatio[k];
+ FullDETBufferingTimeY = LinesInDETYRoundedDownToSwath[k] * (v->HTotal[k] / v->PixelClock[k]) / v->VRatio[k];
if (BytePerPixelDETC[k] > 0) {
LinesInDETC = v->DETBufferSizeC[k] / BytePerPixelDETC[k] / SwathWidthC[k];
LinesInDETCRoundedDownToSwath = dml_floor(LinesInDETC, SwathHeightC[k]);
- FullDETBufferingTimeC = LinesInDETCRoundedDownToSwath * (HTotal[k] / PixelClock[k]) / VRatioChroma[k];
+ FullDETBufferingTimeC = LinesInDETCRoundedDownToSwath * (v->HTotal[k] / v->PixelClock[k]) / v->VRatioChroma[k];
} else {
LinesInDETC = 0;
FullDETBufferingTimeC = 999999;
}
ActiveDRAMClockChangeLatencyMarginY = EffectiveLBLatencyHidingY + FullDETBufferingTimeY
- - ((double) DSTXAfterScaler[k] / HTotal[k] + DSTYAfterScaler[k]) * HTotal[k] / PixelClock[k] - *UrgentWatermark - *DRAMClockChangeWatermark;
+ - ((double) v->DSTXAfterScaler[k] / v->HTotal[k] + v->DSTYAfterScaler[k]) * v->HTotal[k] / v->PixelClock[k] - v->UrgentWatermark - v->DRAMClockChangeWatermark;
- if (NumberOfActivePlanes > 1) {
+ if (v->NumberOfActivePlanes > 1) {
ActiveDRAMClockChangeLatencyMarginY = ActiveDRAMClockChangeLatencyMarginY
- - (1 - 1.0 / NumberOfActivePlanes) * SwathHeightY[k] * HTotal[k] / PixelClock[k] / VRatio[k];
+ - (1 - 1.0 / v->NumberOfActivePlanes) * SwathHeightY[k] * v->HTotal[k] / v->PixelClock[k] / v->VRatio[k];
}
if (BytePerPixelDETC[k] > 0) {
ActiveDRAMClockChangeLatencyMarginC = EffectiveLBLatencyHidingC + FullDETBufferingTimeC
- - ((double) DSTXAfterScaler[k] / HTotal[k] + DSTYAfterScaler[k]) * HTotal[k] / PixelClock[k] - *UrgentWatermark - *DRAMClockChangeWatermark;
+ - ((double) v->DSTXAfterScaler[k] / v->HTotal[k] + v->DSTYAfterScaler[k]) * v->HTotal[k] / v->PixelClock[k] - v->UrgentWatermark - v->DRAMClockChangeWatermark;
- if (NumberOfActivePlanes > 1) {
+ if (v->NumberOfActivePlanes > 1) {
ActiveDRAMClockChangeLatencyMarginC = ActiveDRAMClockChangeLatencyMarginC
- - (1 - 1.0 / NumberOfActivePlanes) * SwathHeightC[k] * HTotal[k] / PixelClock[k] / VRatioChroma[k];
+ - (1 - 1.0 / v->NumberOfActivePlanes) * SwathHeightC[k] * v->HTotal[k] / v->PixelClock[k] / v->VRatioChroma[k];
}
v->ActiveDRAMClockChangeLatencyMargin[k] = dml_min(ActiveDRAMClockChangeLatencyMarginY, ActiveDRAMClockChangeLatencyMarginC);
} else {
v->ActiveDRAMClockChangeLatencyMargin[k] = ActiveDRAMClockChangeLatencyMarginY;
}
- if (WritebackEnable[k] == true) {
- WritebackDRAMClockChangeLatencyHiding = WritebackInterfaceBufferSize * 1024
- / (WritebackDestinationWidth[k] * WritebackDestinationHeight[k] / (WritebackSourceHeight[k] * HTotal[k] / PixelClock[k]) * 4);
- if (WritebackPixelFormat[k] == dm_444_64) {
+ if (v->WritebackEnable[k] == true) {
+ WritebackDRAMClockChangeLatencyHiding = v->WritebackInterfaceBufferSize * 1024
+ / (v->WritebackDestinationWidth[k] * v->WritebackDestinationHeight[k] / (v->WritebackSourceHeight[k] * v->HTotal[k] / v->PixelClock[k]) * 4);
+ if (v->WritebackPixelFormat[k] == dm_444_64) {
WritebackDRAMClockChangeLatencyHiding = WritebackDRAMClockChangeLatencyHiding / 2;
}
WritebackDRAMClockChangeLatencyMargin = WritebackDRAMClockChangeLatencyHiding - v->WritebackDRAMClockChangeWatermark;
@@ -5749,14 +5605,14 @@ static void CalculateWatermarksAndDRAMSpeedChangeSupport(
v->MinActiveDRAMClockChangeMargin = 999999;
PlaneWithMinActiveDRAMClockChangeMargin = 0;
- for (k = 0; k < NumberOfActivePlanes; ++k) {
+ for (k = 0; k < v->NumberOfActivePlanes; ++k) {
if (v->ActiveDRAMClockChangeLatencyMargin[k] < v->MinActiveDRAMClockChangeMargin) {
v->MinActiveDRAMClockChangeMargin = v->ActiveDRAMClockChangeLatencyMargin[k];
- if (BlendingAndTiming[k] == k) {
+ if (v->BlendingAndTiming[k] == k) {
PlaneWithMinActiveDRAMClockChangeMargin = k;
} else {
- for (j = 0; j < NumberOfActivePlanes; ++j) {
- if (BlendingAndTiming[k] == j) {
+ for (j = 0; j < v->NumberOfActivePlanes; ++j) {
+ if (v->BlendingAndTiming[k] == j) {
PlaneWithMinActiveDRAMClockChangeMargin = j;
}
}
@@ -5764,11 +5620,11 @@ static void CalculateWatermarksAndDRAMSpeedChangeSupport(
}
}
- *MinActiveDRAMClockChangeLatencySupported = v->MinActiveDRAMClockChangeMargin + DRAMClockChangeLatency;
+ v->MinActiveDRAMClockChangeLatencySupported = v->MinActiveDRAMClockChangeMargin + v->DRAMClockChangeLatency ;
SecondMinActiveDRAMClockChangeMarginOneDisplayInVBLank = 999999;
- for (k = 0; k < NumberOfActivePlanes; ++k) {
- if (!((k == PlaneWithMinActiveDRAMClockChangeMargin) && (BlendingAndTiming[k] == k)) && !(BlendingAndTiming[k] == PlaneWithMinActiveDRAMClockChangeMargin)
+ for (k = 0; k < v->NumberOfActivePlanes; ++k) {
+ if (!((k == PlaneWithMinActiveDRAMClockChangeMargin) && (v->BlendingAndTiming[k] == k)) && !(v->BlendingAndTiming[k] == PlaneWithMinActiveDRAMClockChangeMargin)
&& v->ActiveDRAMClockChangeLatencyMargin[k] < SecondMinActiveDRAMClockChangeMarginOneDisplayInVBLank) {
SecondMinActiveDRAMClockChangeMarginOneDisplayInVBLank = v->ActiveDRAMClockChangeLatencyMargin[k];
}
@@ -5776,25 +5632,25 @@ static void CalculateWatermarksAndDRAMSpeedChangeSupport(
v->TotalNumberOfActiveOTG = 0;
- for (k = 0; k < NumberOfActivePlanes; ++k) {
- if (BlendingAndTiming[k] == k) {
+ for (k = 0; k < v->NumberOfActivePlanes; ++k) {
+ if (v->BlendingAndTiming[k] == k) {
v->TotalNumberOfActiveOTG = v->TotalNumberOfActiveOTG + 1;
}
}
if (v->MinActiveDRAMClockChangeMargin > 0 && PrefetchMode == 0) {
*DRAMClockChangeSupport = dm_dram_clock_change_vactive;
- } else if ((SynchronizedVBlank == true || v->TotalNumberOfActiveOTG == 1
+ } else if ((v->SynchronizedVBlank == true || v->TotalNumberOfActiveOTG == 1
|| SecondMinActiveDRAMClockChangeMarginOneDisplayInVBLank > 0) && PrefetchMode == 0) {
*DRAMClockChangeSupport = dm_dram_clock_change_vblank;
} else {
*DRAMClockChangeSupport = dm_dram_clock_change_unsupported;
}
- *StutterExitWatermark = SRExitTime + ExtraLatency + 10 / DCFCLKDeepSleep;
- *StutterEnterPlusExitWatermark = (SREnterPlusExitTime + ExtraLatency + 10 / DCFCLKDeepSleep);
- *Z8StutterExitWatermark = SRExitZ8Time + ExtraLatency + 10 / DCFCLKDeepSleep;
- *Z8StutterEnterPlusExitWatermark = SREnterPlusExitZ8Time + ExtraLatency + 10 / DCFCLKDeepSleep;
+ *StutterExitWatermark = v->SRExitTime + ExtraLatency + 10 / DCFCLKDeepSleep;
+ *StutterEnterPlusExitWatermark = (v->SREnterPlusExitTime + ExtraLatency + 10 / DCFCLKDeepSleep);
+ *Z8StutterExitWatermark = v->SRExitZ8Time + ExtraLatency + 10 / DCFCLKDeepSleep;
+ *Z8StutterEnterPlusExitWatermark = v->SREnterPlusExitZ8Time + ExtraLatency + 10 / DCFCLKDeepSleep;
#ifdef __DML_VBA_DEBUG__
dml_print("DML::%s: StutterExitWatermark = %f\n", __func__, *StutterExitWatermark);
--
2.37.2
^ permalink raw reply related [flat|nested] 10+ messages in thread* [PATCH 4/5] drm/amd/display: Reduce number of arguments of dml31's CalculateFlipSchedule()
2022-08-30 20:34 [PATCH 0/5] drm/amd/display: Reduce stack usage for clang Nathan Chancellor
` (2 preceding siblings ...)
2022-08-30 20:34 ` [PATCH 3/5] drm/amd/display: Reduce number of arguments of dml31's CalculateWatermarksAndDRAMSpeedChangeSupport() Nathan Chancellor
@ 2022-08-30 20:34 ` Nathan Chancellor
2022-08-30 20:34 ` [PATCH 5/5] drm/amd/display: Mark dml30's UseMinimumDCFCLK() as noinline for stack usage Nathan Chancellor
` (2 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Nathan Chancellor @ 2022-08-30 20:34 UTC (permalink / raw)
To: Harry Wentland, Leo Li, Rodrigo Siqueira, Alex Deucher,
Christian König, Pan, Xinhui
Cc: Nick Desaulniers, Tom Rix, Sudip Mukherjee, amd-gfx, dri-devel,
llvm, patches, Nathan Chancellor
Most of the arguments are identical between the two call sites and they
can be accessed through the 'struct vba_vars_st' pointer. This reduces
the total amount of stack space that
dml31_ModeSupportAndSystemConfigurationFull() uses by 112 bytes with
LLVM 16 (1976 -> 1864), helping clear up the following clang warning:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.c:3908:6: error: stack frame size (2216) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
^
1 error generated.
Link: https://github.com/ClangBuiltLinux/linux/issues/1681
Reported-by: "Sudip Mukherjee (Codethink)" <sudipm.mukherjee@gmail.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
---
.../dc/dml/dcn31/display_mode_vba_31.c | 172 +++++-------------
1 file changed, 47 insertions(+), 125 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
index 21c74ee1deec..60ee936e6436 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c
@@ -251,33 +251,13 @@ static void CalculateRowBandwidth(
static void CalculateFlipSchedule(
struct display_mode_lib *mode_lib,
+ unsigned int k,
double HostVMInefficiencyFactor,
double UrgentExtraLatency,
double UrgentLatency,
- unsigned int GPUVMMaxPageTableLevels,
- bool HostVMEnable,
- unsigned int HostVMMaxNonCachedPageTableLevels,
- bool GPUVMEnable,
- double HostVMMinPageSize,
double PDEAndMetaPTEBytesPerFrame,
double MetaRowBytes,
- double DPTEBytesPerRow,
- double BandwidthAvailableForImmediateFlip,
- unsigned int TotImmediateFlipBytes,
- enum source_format_class SourcePixelFormat,
- double LineTime,
- double VRatio,
- double VRatioChroma,
- double Tno_bw,
- bool DCCEnable,
- unsigned int dpte_row_height,
- unsigned int meta_row_height,
- unsigned int dpte_row_height_chroma,
- unsigned int meta_row_height_chroma,
- double *DestinationLinesToRequestVMInImmediateFlip,
- double *DestinationLinesToRequestRowInImmediateFlip,
- double *final_flip_bw,
- bool *ImmediateFlipSupportedForPipe);
+ double DPTEBytesPerRow);
static double CalculateWriteBackDelay(
enum source_format_class WritebackPixelFormat,
double WritebackHRatio,
@@ -2868,33 +2848,13 @@ static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman
for (k = 0; k < v->NumberOfActivePlanes; ++k) {
CalculateFlipSchedule(
mode_lib,
+ k,
HostVMInefficiencyFactor,
v->UrgentExtraLatency,
v->UrgentLatency,
- v->GPUVMMaxPageTableLevels,
- v->HostVMEnable,
- v->HostVMMaxNonCachedPageTableLevels,
- v->GPUVMEnable,
- v->HostVMMinPageSize,
v->PDEAndMetaPTEBytesFrame[k],
v->MetaRowByte[k],
- v->PixelPTEBytesPerRow[k],
- v->BandwidthAvailableForImmediateFlip,
- v->TotImmediateFlipBytes,
- v->SourcePixelFormat[k],
- v->HTotal[k] / v->PixelClock[k],
- v->VRatio[k],
- v->VRatioChroma[k],
- v->Tno_bw[k],
- v->DCCEnable[k],
- v->dpte_row_height[k],
- v->meta_row_height[k],
- v->dpte_row_height_chroma[k],
- v->meta_row_height_chroma[k],
- &v->DestinationLinesToRequestVMInImmediateFlip[k],
- &v->DestinationLinesToRequestRowInImmediateFlip[k],
- &v->final_flip_bw[k],
- &v->ImmediateFlipSupportedForPipe[k]);
+ v->PixelPTEBytesPerRow[k]);
}
v->total_dcn_read_bw_with_flip = 0.0;
@@ -3526,61 +3486,43 @@ static void CalculateRowBandwidth(
static void CalculateFlipSchedule(
struct display_mode_lib *mode_lib,
+ unsigned int k,
double HostVMInefficiencyFactor,
double UrgentExtraLatency,
double UrgentLatency,
- unsigned int GPUVMMaxPageTableLevels,
- bool HostVMEnable,
- unsigned int HostVMMaxNonCachedPageTableLevels,
- bool GPUVMEnable,
- double HostVMMinPageSize,
double PDEAndMetaPTEBytesPerFrame,
double MetaRowBytes,
- double DPTEBytesPerRow,
- double BandwidthAvailableForImmediateFlip,
- unsigned int TotImmediateFlipBytes,
- enum source_format_class SourcePixelFormat,
- double LineTime,
- double VRatio,
- double VRatioChroma,
- double Tno_bw,
- bool DCCEnable,
- unsigned int dpte_row_height,
- unsigned int meta_row_height,
- unsigned int dpte_row_height_chroma,
- unsigned int meta_row_height_chroma,
- double *DestinationLinesToRequestVMInImmediateFlip,
- double *DestinationLinesToRequestRowInImmediateFlip,
- double *final_flip_bw,
- bool *ImmediateFlipSupportedForPipe)
+ double DPTEBytesPerRow)
{
+ struct vba_vars_st *v = &mode_lib->vba;
double min_row_time = 0.0;
unsigned int HostVMDynamicLevelsTrips;
double TimeForFetchingMetaPTEImmediateFlip;
double TimeForFetchingRowInVBlankImmediateFlip;
double ImmediateFlipBW;
+ double LineTime = v->HTotal[k] / v->PixelClock[k];
- if (GPUVMEnable == true && HostVMEnable == true) {
- HostVMDynamicLevelsTrips = HostVMMaxNonCachedPageTableLevels;
+ if (v->GPUVMEnable == true && v->HostVMEnable == true) {
+ HostVMDynamicLevelsTrips = v->HostVMMaxNonCachedPageTableLevels;
} else {
HostVMDynamicLevelsTrips = 0;
}
- if (GPUVMEnable == true || DCCEnable == true) {
- ImmediateFlipBW = (PDEAndMetaPTEBytesPerFrame + MetaRowBytes + DPTEBytesPerRow) * BandwidthAvailableForImmediateFlip / TotImmediateFlipBytes;
+ if (v->GPUVMEnable == true || v->DCCEnable[k] == true) {
+ ImmediateFlipBW = (PDEAndMetaPTEBytesPerFrame + MetaRowBytes + DPTEBytesPerRow) * v->BandwidthAvailableForImmediateFlip / v->TotImmediateFlipBytes;
}
- if (GPUVMEnable == true) {
+ if (v->GPUVMEnable == true) {
TimeForFetchingMetaPTEImmediateFlip = dml_max3(
- Tno_bw + PDEAndMetaPTEBytesPerFrame * HostVMInefficiencyFactor / ImmediateFlipBW,
- UrgentExtraLatency + UrgentLatency * (GPUVMMaxPageTableLevels * (HostVMDynamicLevelsTrips + 1) - 1),
+ v->Tno_bw[k] + PDEAndMetaPTEBytesPerFrame * HostVMInefficiencyFactor / ImmediateFlipBW,
+ UrgentExtraLatency + UrgentLatency * (v->GPUVMMaxPageTableLevels * (HostVMDynamicLevelsTrips + 1) - 1),
LineTime / 4.0);
} else {
TimeForFetchingMetaPTEImmediateFlip = 0;
}
- *DestinationLinesToRequestVMInImmediateFlip = dml_ceil(4.0 * (TimeForFetchingMetaPTEImmediateFlip / LineTime), 1) / 4.0;
- if ((GPUVMEnable == true || DCCEnable == true)) {
+ v->DestinationLinesToRequestVMInImmediateFlip[k] = dml_ceil(4.0 * (TimeForFetchingMetaPTEImmediateFlip / LineTime), 1) / 4.0;
+ if ((v->GPUVMEnable == true || v->DCCEnable[k] == true)) {
TimeForFetchingRowInVBlankImmediateFlip = dml_max3(
(MetaRowBytes + DPTEBytesPerRow * HostVMInefficiencyFactor) / ImmediateFlipBW,
UrgentLatency * (HostVMDynamicLevelsTrips + 1),
@@ -3589,54 +3531,54 @@ static void CalculateFlipSchedule(
TimeForFetchingRowInVBlankImmediateFlip = 0;
}
- *DestinationLinesToRequestRowInImmediateFlip = dml_ceil(4.0 * (TimeForFetchingRowInVBlankImmediateFlip / LineTime), 1) / 4.0;
+ v->DestinationLinesToRequestRowInImmediateFlip[k] = dml_ceil(4.0 * (TimeForFetchingRowInVBlankImmediateFlip / LineTime), 1) / 4.0;
- if (GPUVMEnable == true) {
- *final_flip_bw = dml_max(
- PDEAndMetaPTEBytesPerFrame * HostVMInefficiencyFactor / (*DestinationLinesToRequestVMInImmediateFlip * LineTime),
- (MetaRowBytes + DPTEBytesPerRow * HostVMInefficiencyFactor) / (*DestinationLinesToRequestRowInImmediateFlip * LineTime));
- } else if ((GPUVMEnable == true || DCCEnable == true)) {
- *final_flip_bw = (MetaRowBytes + DPTEBytesPerRow * HostVMInefficiencyFactor) / (*DestinationLinesToRequestRowInImmediateFlip * LineTime);
+ if (v->GPUVMEnable == true) {
+ v->final_flip_bw[k] = dml_max(
+ PDEAndMetaPTEBytesPerFrame * HostVMInefficiencyFactor / (v->DestinationLinesToRequestVMInImmediateFlip[k] * LineTime),
+ (MetaRowBytes + DPTEBytesPerRow * HostVMInefficiencyFactor) / (v->DestinationLinesToRequestRowInImmediateFlip[k] * LineTime));
+ } else if ((v->GPUVMEnable == true || v->DCCEnable[k] == true)) {
+ v->final_flip_bw[k] = (MetaRowBytes + DPTEBytesPerRow * HostVMInefficiencyFactor) / (v->DestinationLinesToRequestRowInImmediateFlip[k] * LineTime);
} else {
- *final_flip_bw = 0;
+ v->final_flip_bw[k] = 0;
}
- if (SourcePixelFormat == dm_420_8 || SourcePixelFormat == dm_420_10 || SourcePixelFormat == dm_rgbe_alpha) {
- if (GPUVMEnable == true && DCCEnable != true) {
- min_row_time = dml_min(dpte_row_height * LineTime / VRatio, dpte_row_height_chroma * LineTime / VRatioChroma);
- } else if (GPUVMEnable != true && DCCEnable == true) {
- min_row_time = dml_min(meta_row_height * LineTime / VRatio, meta_row_height_chroma * LineTime / VRatioChroma);
+ if (v->SourcePixelFormat[k] == dm_420_8 || v->SourcePixelFormat[k] == dm_420_10 || v->SourcePixelFormat[k] == dm_rgbe_alpha) {
+ if (v->GPUVMEnable == true && v->DCCEnable[k] != true) {
+ min_row_time = dml_min(v->dpte_row_height[k] * LineTime / v->VRatio[k], v->dpte_row_height_chroma[k] * LineTime / v->VRatioChroma[k]);
+ } else if (v->GPUVMEnable != true && v->DCCEnable[k] == true) {
+ min_row_time = dml_min(v->meta_row_height[k] * LineTime / v->VRatio[k], v->meta_row_height_chroma[k] * LineTime / v->VRatioChroma[k]);
} else {
min_row_time = dml_min4(
- dpte_row_height * LineTime / VRatio,
- meta_row_height * LineTime / VRatio,
- dpte_row_height_chroma * LineTime / VRatioChroma,
- meta_row_height_chroma * LineTime / VRatioChroma);
+ v->dpte_row_height[k] * LineTime / v->VRatio[k],
+ v->meta_row_height[k] * LineTime / v->VRatio[k],
+ v->dpte_row_height_chroma[k] * LineTime / v->VRatioChroma[k],
+ v->meta_row_height_chroma[k] * LineTime / v->VRatioChroma[k]);
}
} else {
- if (GPUVMEnable == true && DCCEnable != true) {
- min_row_time = dpte_row_height * LineTime / VRatio;
- } else if (GPUVMEnable != true && DCCEnable == true) {
- min_row_time = meta_row_height * LineTime / VRatio;
+ if (v->GPUVMEnable == true && v->DCCEnable[k] != true) {
+ min_row_time = v->dpte_row_height[k] * LineTime / v->VRatio[k];
+ } else if (v->GPUVMEnable != true && v->DCCEnable[k] == true) {
+ min_row_time = v->meta_row_height[k] * LineTime / v->VRatio[k];
} else {
- min_row_time = dml_min(dpte_row_height * LineTime / VRatio, meta_row_height * LineTime / VRatio);
+ min_row_time = dml_min(v->dpte_row_height[k] * LineTime / v->VRatio[k], v->meta_row_height[k] * LineTime / v->VRatio[k]);
}
}
- if (*DestinationLinesToRequestVMInImmediateFlip >= 32 || *DestinationLinesToRequestRowInImmediateFlip >= 16
+ if (v->DestinationLinesToRequestVMInImmediateFlip[k] >= 32 || v->DestinationLinesToRequestRowInImmediateFlip[k] >= 16
|| TimeForFetchingMetaPTEImmediateFlip + 2 * TimeForFetchingRowInVBlankImmediateFlip > min_row_time) {
- *ImmediateFlipSupportedForPipe = false;
+ v->ImmediateFlipSupportedForPipe[k] = false;
} else {
- *ImmediateFlipSupportedForPipe = true;
+ v->ImmediateFlipSupportedForPipe[k] = true;
}
#ifdef __DML_VBA_DEBUG__
- dml_print("DML::%s: DestinationLinesToRequestVMInImmediateFlip = %f\n", __func__, *DestinationLinesToRequestVMInImmediateFlip);
- dml_print("DML::%s: DestinationLinesToRequestRowInImmediateFlip = %f\n", __func__, *DestinationLinesToRequestRowInImmediateFlip);
+ dml_print("DML::%s: DestinationLinesToRequestVMInImmediateFlip = %f\n", __func__, v->DestinationLinesToRequestVMInImmediateFlip[k]);
+ dml_print("DML::%s: DestinationLinesToRequestRowInImmediateFlip = %f\n", __func__, v->DestinationLinesToRequestRowInImmediateFlip[k]);
dml_print("DML::%s: TimeForFetchingMetaPTEImmediateFlip = %f\n", __func__, TimeForFetchingMetaPTEImmediateFlip);
dml_print("DML::%s: TimeForFetchingRowInVBlankImmediateFlip = %f\n", __func__, TimeForFetchingRowInVBlankImmediateFlip);
dml_print("DML::%s: min_row_time = %f\n", __func__, min_row_time);
- dml_print("DML::%s: ImmediateFlipSupportedForPipe = %d\n", __func__, *ImmediateFlipSupportedForPipe);
+ dml_print("DML::%s: ImmediateFlipSupportedForPipe = %d\n", __func__, v->ImmediateFlipSupportedForPipe[k]);
#endif
}
@@ -5228,33 +5170,13 @@ void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l
for (k = 0; k < v->NumberOfActivePlanes; k++) {
CalculateFlipSchedule(
mode_lib,
+ k,
HostVMInefficiencyFactor,
v->ExtraLatency,
v->UrgLatency[i],
- v->GPUVMMaxPageTableLevels,
- v->HostVMEnable,
- v->HostVMMaxNonCachedPageTableLevels,
- v->GPUVMEnable,
- v->HostVMMinPageSize,
v->PDEAndMetaPTEBytesPerFrame[i][j][k],
v->MetaRowBytes[i][j][k],
- v->DPTEBytesPerRow[i][j][k],
- v->BandwidthAvailableForImmediateFlip,
- v->TotImmediateFlipBytes,
- v->SourcePixelFormat[k],
- v->HTotal[k] / v->PixelClock[k],
- v->VRatio[k],
- v->VRatioChroma[k],
- v->Tno_bw[k],
- v->DCCEnable[k],
- v->dpte_row_height[k],
- v->meta_row_height[k],
- v->dpte_row_height_chroma[k],
- v->meta_row_height_chroma[k],
- &v->DestinationLinesToRequestVMInImmediateFlip[k],
- &v->DestinationLinesToRequestRowInImmediateFlip[k],
- &v->final_flip_bw[k],
- &v->ImmediateFlipSupportedForPipe[k]);
+ v->DPTEBytesPerRow[i][j][k]);
}
v->total_dcn_read_bw_with_flip = 0.0;
for (k = 0; k < v->NumberOfActivePlanes; k++) {
--
2.37.2
^ permalink raw reply related [flat|nested] 10+ messages in thread* [PATCH 5/5] drm/amd/display: Mark dml30's UseMinimumDCFCLK() as noinline for stack usage
2022-08-30 20:34 [PATCH 0/5] drm/amd/display: Reduce stack usage for clang Nathan Chancellor
` (3 preceding siblings ...)
2022-08-30 20:34 ` [PATCH 4/5] drm/amd/display: Reduce number of arguments of dml31's CalculateFlipSchedule() Nathan Chancellor
@ 2022-08-30 20:34 ` Nathan Chancellor
2022-09-11 20:58 ` [PATCH 0/5] drm/amd/display: Reduce stack usage for clang Maíra Canal
2022-09-12 21:50 ` Rodrigo Siqueira Jordao
6 siblings, 0 replies; 10+ messages in thread
From: Nathan Chancellor @ 2022-08-30 20:34 UTC (permalink / raw)
To: Harry Wentland, Leo Li, Rodrigo Siqueira, Alex Deucher,
Christian König, Pan, Xinhui
Cc: Nick Desaulniers, Tom Rix, Sudip Mukherjee, amd-gfx, dri-devel,
llvm, patches, Nathan Chancellor
This function consumes a lot of stack space and it blows up the size of
dml30_ModeSupportAndSystemConfigurationFull() with clang:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_mode_vba_30.c:3542:6: error: stack frame size (2200) exceeds limit (2048) in 'dml30_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
^
1 error generated.
Commit a0f7e7f759cf ("drm/amd/display: fix i386 frame size warning")
aimed to address this for i386 but it did not help x86_64.
To reduce the amount of stack space that
dml30_ModeSupportAndSystemConfigurationFull() uses, mark
UseMinimumDCFCLK() as noinline, using the _for_stack variant for
documentation. While this will increase the total amount of stack usage
between the two functions (1632 and 1304 bytes respectively), it will
make sure both stay below the limit of 2048 bytes for these files. The
aforementioned change does help reduce UseMinimumDCFCLK()'s stack usage
so it should not be reverted in favor of this change.
Link: https://github.com/ClangBuiltLinux/linux/issues/1681
Reported-by: "Sudip Mukherjee (Codethink)" <sudipm.mukherjee@gmail.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
---
drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
index b7fa003ffe06..6991a68061ef 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
@@ -6497,7 +6497,7 @@ static double CalculateUrgentLatency(
return ret;
}
-static void UseMinimumDCFCLK(
+static noinline_for_stack void UseMinimumDCFCLK(
struct display_mode_lib *mode_lib,
struct vba_vars_st *v,
int MaxPrefetchMode,
--
2.37.2
^ permalink raw reply related [flat|nested] 10+ messages in thread* Re: [PATCH 0/5] drm/amd/display: Reduce stack usage for clang
2022-08-30 20:34 [PATCH 0/5] drm/amd/display: Reduce stack usage for clang Nathan Chancellor
` (4 preceding siblings ...)
2022-08-30 20:34 ` [PATCH 5/5] drm/amd/display: Mark dml30's UseMinimumDCFCLK() as noinline for stack usage Nathan Chancellor
@ 2022-09-11 20:58 ` Maíra Canal
2022-09-12 21:50 ` Rodrigo Siqueira Jordao
6 siblings, 0 replies; 10+ messages in thread
From: Maíra Canal @ 2022-09-11 20:58 UTC (permalink / raw)
To: Nathan Chancellor, Harry Wentland, Leo Li, Rodrigo Siqueira,
Alex Deucher, Christian König, Pan, Xinhui
Cc: Tom Rix, llvm, Nick Desaulniers, patches, dri-devel, amd-gfx,
Sudip Mukherjee
Hi Nathan,
I have built-tested the whole series with clang 14.0.5 (Fedora
14.0.5-1.fc36), using:
$ make -kj"$(nproc)" ARCH=x86_64 LLVM=1 mrproper allmodconfig
drivers/gpu/drm/amd/amdgpu/
Great to see this patchset coming for DML!
To the whole series:
Tested-by: Maíra Canal <mairacanal@riseup.net>
Best Regards,
- Maíra Canal
On 8/30/22 17:34, Nathan Chancellor wrote:
> Hi all,
>
> This series aims to address the following warnings, which are visible
> when building x86_64 allmodconfig with clang after commit 3876a8b5e241
> ("drm/amd/display: Enable building new display engine with KCOV
> enabled").
>
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_mode_vba_30.c:3542:6: error: stack frame size (2200) exceeds limit (2048) in 'dml30_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
> void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
> ^
> 1 error generated.
>
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.c:3908:6: error: stack frame size (2216) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
> void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
> ^
> 1 error generated.
>
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:1721:6: error: stack frame size (2152) exceeds limit (2048) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
> void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
> ^
> 1 error generated.
>
> This series is based on commit b3235e8635e1 ("drm/amd/display: clean up
> some inconsistent indentings"). These warnings are fatal for
> allmodconfig due to CONFIG_WERROR so ideally, I would like to see these
> patches cherry-picked to a branch targeting mainline to allow our builds
> to go back to green. However, since this series is not exactly trivial
> in size, I can understand not wanting to apply these to mainline during
> the -rc cycle. If they cannot be cherry-picked to mainline, I can add a
> patch raising the value of -Wframe-larger-than for these files that can
> be cherry-picked to 6.0/mainline then add a revert of that change as the
> last patch in the stack so everything goes back to normal for -next/6.1.
> I am open to other options though!
>
> I have built this series against clang 16.0.0 (ToT) and GCC 12.2.0 for
> x86_64. It has seen no runtime testing, as my only test system with AMD
> graphics is a Renoir one, which as far as I understand it uses DCN 2.1.
>
> Nathan Chancellor (5):
> drm/amd/display: Reduce number of arguments of
> dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport()
> drm/amd/display: Reduce number of arguments of
> dml32_CalculatePrefetchSchedule()
> drm/amd/display: Reduce number of arguments of dml31's
> CalculateWatermarksAndDRAMSpeedChangeSupport()
> drm/amd/display: Reduce number of arguments of dml31's
> CalculateFlipSchedule()
> drm/amd/display: Mark dml30's UseMinimumDCFCLK() as noinline for stack
> usage
>
> .../dc/dml/dcn30/display_mode_vba_30.c | 2 +-
> .../dc/dml/dcn31/display_mode_vba_31.c | 420 +++++-------------
> .../dc/dml/dcn32/display_mode_vba_32.c | 236 +++-------
> .../dc/dml/dcn32/display_mode_vba_util_32.c | 323 ++++++--------
> .../dc/dml/dcn32/display_mode_vba_util_32.h | 51 +--
> 5 files changed, 318 insertions(+), 714 deletions(-)
>
>
> base-commit: b3235e8635e1dd7ac1a27a73330e9880dfe05154
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH 0/5] drm/amd/display: Reduce stack usage for clang
2022-08-30 20:34 [PATCH 0/5] drm/amd/display: Reduce stack usage for clang Nathan Chancellor
` (5 preceding siblings ...)
2022-09-11 20:58 ` [PATCH 0/5] drm/amd/display: Reduce stack usage for clang Maíra Canal
@ 2022-09-12 21:50 ` Rodrigo Siqueira Jordao
2022-09-12 22:02 ` Nathan Chancellor
6 siblings, 1 reply; 10+ messages in thread
From: Rodrigo Siqueira Jordao @ 2022-09-12 21:50 UTC (permalink / raw)
To: Nathan Chancellor, Harry Wentland, Leo Li, Alex Deucher,
Christian König, Pan, Xinhui
Cc: Nick Desaulniers, Tom Rix, Sudip Mukherjee, amd-gfx, dri-devel,
llvm, patches, Maíra Canal
On 2022-08-30 16:34, Nathan Chancellor wrote:
> Hi all,
>
> This series aims to address the following warnings, which are visible
> when building x86_64 allmodconfig with clang after commit 3876a8b5e241
> ("drm/amd/display: Enable building new display engine with KCOV
> enabled").
>
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_mode_vba_30.c:3542:6: error: stack frame size (2200) exceeds limit (2048) in 'dml30_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
> void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
> ^
> 1 error generated.
>
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.c:3908:6: error: stack frame size (2216) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
> void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
> ^
> 1 error generated.
>
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:1721:6: error: stack frame size (2152) exceeds limit (2048) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
> void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
> ^
> 1 error generated.
>
> This series is based on commit b3235e8635e1 ("drm/amd/display: clean up
> some inconsistent indentings"). These warnings are fatal for
> allmodconfig due to CONFIG_WERROR so ideally, I would like to see these
> patches cherry-picked to a branch targeting mainline to allow our builds
> to go back to green. However, since this series is not exactly trivial
> in size, I can understand not wanting to apply these to mainline during
> the -rc cycle. If they cannot be cherry-picked to mainline, I can add a
> patch raising the value of -Wframe-larger-than for these files that can
> be cherry-picked to 6.0/mainline then add a revert of that change as the
> last patch in the stack so everything goes back to normal for -next/6.1.
> I am open to other options though!
>
> I have built this series against clang 16.0.0 (ToT) and GCC 12.2.0 for
> x86_64. It has seen no runtime testing, as my only test system with AMD
> graphics is a Renoir one, which as far as I understand it uses DCN 2.1.
>
> Nathan Chancellor (5):
> drm/amd/display: Reduce number of arguments of
> dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport()
> drm/amd/display: Reduce number of arguments of
> dml32_CalculatePrefetchSchedule()
> drm/amd/display: Reduce number of arguments of dml31's
> CalculateWatermarksAndDRAMSpeedChangeSupport()
> drm/amd/display: Reduce number of arguments of dml31's
> CalculateFlipSchedule()
> drm/amd/display: Mark dml30's UseMinimumDCFCLK() as noinline for stack
> usage
>
> .../dc/dml/dcn30/display_mode_vba_30.c | 2 +-
> .../dc/dml/dcn31/display_mode_vba_31.c | 420 +++++-------------
> .../dc/dml/dcn32/display_mode_vba_32.c | 236 +++-------
> .../dc/dml/dcn32/display_mode_vba_util_32.c | 323 ++++++--------
> .../dc/dml/dcn32/display_mode_vba_util_32.h | 51 +--
> 5 files changed, 318 insertions(+), 714 deletions(-)
>
>
> base-commit: b3235e8635e1dd7ac1a27a73330e9880dfe05154
Hi Nathan,
First of all, thanks a lot for your patchset!
Sorry for the delay; it took me more time than I expected to review and
run a couple of tests in this patchset (most of them were IGT). Anyway,
I'm good with this change; this series is:
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
And I applied it to amd-staging-drm-next.
We will run some extra tests this week; if we find some issues, I'll
debug them.
Also, thanks, Maíra, for checking this patch as well.
Best Regards,
Siqueira
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH 0/5] drm/amd/display: Reduce stack usage for clang
2022-09-12 21:50 ` Rodrigo Siqueira Jordao
@ 2022-09-12 22:02 ` Nathan Chancellor
2022-09-13 22:48 ` Rodrigo Siqueira Jordao
0 siblings, 1 reply; 10+ messages in thread
From: Nathan Chancellor @ 2022-09-12 22:02 UTC (permalink / raw)
To: Rodrigo Siqueira Jordao
Cc: Harry Wentland, Leo Li, Alex Deucher, Christian König,
Pan, Xinhui, Nick Desaulniers, Tom Rix, Sudip Mukherjee, amd-gfx,
dri-devel, llvm, patches, Maíra Canal
Hi Rodrigo,
On Mon, Sep 12, 2022 at 05:50:31PM -0400, Rodrigo Siqueira Jordao wrote:
>
>
> On 2022-08-30 16:34, Nathan Chancellor wrote:
> > Hi all,
> >
> > This series aims to address the following warnings, which are visible
> > when building x86_64 allmodconfig with clang after commit 3876a8b5e241
> > ("drm/amd/display: Enable building new display engine with KCOV
> > enabled").
> >
> > drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_mode_vba_30.c:3542:6: error: stack frame size (2200) exceeds limit (2048) in 'dml30_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
> > void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
> > ^
> > 1 error generated.
> >
> > drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.c:3908:6: error: stack frame size (2216) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
> > void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
> > ^
> > 1 error generated.
> >
> > drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:1721:6: error: stack frame size (2152) exceeds limit (2048) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
> > void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
> > ^
> > 1 error generated.
> >
> > This series is based on commit b3235e8635e1 ("drm/amd/display: clean up
> > some inconsistent indentings"). These warnings are fatal for
> > allmodconfig due to CONFIG_WERROR so ideally, I would like to see these
> > patches cherry-picked to a branch targeting mainline to allow our builds
> > to go back to green. However, since this series is not exactly trivial
> > in size, I can understand not wanting to apply these to mainline during
> > the -rc cycle. If they cannot be cherry-picked to mainline, I can add a
> > patch raising the value of -Wframe-larger-than for these files that can
> > be cherry-picked to 6.0/mainline then add a revert of that change as the
> > last patch in the stack so everything goes back to normal for -next/6.1.
> > I am open to other options though!
> >
> > I have built this series against clang 16.0.0 (ToT) and GCC 12.2.0 for
> > x86_64. It has seen no runtime testing, as my only test system with AMD
> > graphics is a Renoir one, which as far as I understand it uses DCN 2.1.
> >
> > Nathan Chancellor (5):
> > drm/amd/display: Reduce number of arguments of
> > dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport()
> > drm/amd/display: Reduce number of arguments of
> > dml32_CalculatePrefetchSchedule()
> > drm/amd/display: Reduce number of arguments of dml31's
> > CalculateWatermarksAndDRAMSpeedChangeSupport()
> > drm/amd/display: Reduce number of arguments of dml31's
> > CalculateFlipSchedule()
> > drm/amd/display: Mark dml30's UseMinimumDCFCLK() as noinline for stack
> > usage
> >
> > .../dc/dml/dcn30/display_mode_vba_30.c | 2 +-
> > .../dc/dml/dcn31/display_mode_vba_31.c | 420 +++++-------------
> > .../dc/dml/dcn32/display_mode_vba_32.c | 236 +++-------
> > .../dc/dml/dcn32/display_mode_vba_util_32.c | 323 ++++++--------
> > .../dc/dml/dcn32/display_mode_vba_util_32.h | 51 +--
> > 5 files changed, 318 insertions(+), 714 deletions(-)
> >
> >
> > base-commit: b3235e8635e1dd7ac1a27a73330e9880dfe05154
>
> Hi Nathan,
>
> First of all, thanks a lot for your patchset!
>
> Sorry for the delay; it took me more time than I expected to review and run
> a couple of tests in this patchset (most of them were IGT). Anyway, I'm good
> with this change; this series is:
>
> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
>
> And I applied it to amd-staging-drm-next.
>
> We will run some extra tests this week; if we find some issues, I'll debug
> them.
>
> Also, thanks, Maíra, for checking this patch as well.
No worries on the delay, the series is not exactly the smallest one I
have ever sent :) While the changes were mostly mechanical, I could have
definitely messed something up and I appreciate you taking the time to
review it and run it through some tests. Please let me know if I can be
of further assistance on that front.
If you have any thoughts on the blurb I had in the cover letter around
how to handle the warnings this series resolves with regards to
mainline, I would love to hear them.
Cheers,
Nathan
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [PATCH 0/5] drm/amd/display: Reduce stack usage for clang
2022-09-12 22:02 ` Nathan Chancellor
@ 2022-09-13 22:48 ` Rodrigo Siqueira Jordao
0 siblings, 0 replies; 10+ messages in thread
From: Rodrigo Siqueira Jordao @ 2022-09-13 22:48 UTC (permalink / raw)
To: Nathan Chancellor
Cc: Harry Wentland, Leo Li, Alex Deucher, Christian König,
Pan, Xinhui, Nick Desaulniers, Tom Rix, Sudip Mukherjee, amd-gfx,
dri-devel, llvm, patches, Maíra Canal
On 2022-09-12 18:02, Nathan Chancellor wrote:
> Hi Rodrigo,
>
> On Mon, Sep 12, 2022 at 05:50:31PM -0400, Rodrigo Siqueira Jordao wrote:
>>
>>
>> On 2022-08-30 16:34, Nathan Chancellor wrote:
>>> Hi all,
>>>
>>> This series aims to address the following warnings, which are visible
>>> when building x86_64 allmodconfig with clang after commit 3876a8b5e241
>>> ("drm/amd/display: Enable building new display engine with KCOV
>>> enabled").
>>>
>>> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_mode_vba_30.c:3542:6: error: stack frame size (2200) exceeds limit (2048) in 'dml30_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
>>> void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
>>> ^
>>> 1 error generated.
>>>
>>> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.c:3908:6: error: stack frame size (2216) exceeds limit (2048) in 'dml31_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
>>> void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
>>> ^
>>> 1 error generated.
>>>
>>> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:1721:6: error: stack frame size (2152) exceeds limit (2048) in 'dml32_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
>>> void dml32_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
>>> ^
>>> 1 error generated.
>>>
>>> This series is based on commit b3235e8635e1 ("drm/amd/display: clean up
>>> some inconsistent indentings"). These warnings are fatal for
>>> allmodconfig due to CONFIG_WERROR so ideally, I would like to see these
>>> patches cherry-picked to a branch targeting mainline to allow our builds
>>> to go back to green. However, since this series is not exactly trivial
>>> in size, I can understand not wanting to apply these to mainline during
>>> the -rc cycle. If they cannot be cherry-picked to mainline, I can add a
>>> patch raising the value of -Wframe-larger-than for these files that can
>>> be cherry-picked to 6.0/mainline then add a revert of that change as the
>>> last patch in the stack so everything goes back to normal for -next/6.1.
>>> I am open to other options though!
>>>
>>> I have built this series against clang 16.0.0 (ToT) and GCC 12.2.0 for
>>> x86_64. It has seen no runtime testing, as my only test system with AMD
>>> graphics is a Renoir one, which as far as I understand it uses DCN 2.1.
>>>
>>> Nathan Chancellor (5):
>>> drm/amd/display: Reduce number of arguments of
>>> dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport()
>>> drm/amd/display: Reduce number of arguments of
>>> dml32_CalculatePrefetchSchedule()
>>> drm/amd/display: Reduce number of arguments of dml31's
>>> CalculateWatermarksAndDRAMSpeedChangeSupport()
>>> drm/amd/display: Reduce number of arguments of dml31's
>>> CalculateFlipSchedule()
>>> drm/amd/display: Mark dml30's UseMinimumDCFCLK() as noinline for stack
>>> usage
>>>
>>> .../dc/dml/dcn30/display_mode_vba_30.c | 2 +-
>>> .../dc/dml/dcn31/display_mode_vba_31.c | 420 +++++-------------
>>> .../dc/dml/dcn32/display_mode_vba_32.c | 236 +++-------
>>> .../dc/dml/dcn32/display_mode_vba_util_32.c | 323 ++++++--------
>>> .../dc/dml/dcn32/display_mode_vba_util_32.h | 51 +--
>>> 5 files changed, 318 insertions(+), 714 deletions(-)
>>>
>>>
>>> base-commit: b3235e8635e1dd7ac1a27a73330e9880dfe05154
>>
>> Hi Nathan,
>>
>> First of all, thanks a lot for your patchset!
>>
>> Sorry for the delay; it took me more time than I expected to review and run
>> a couple of tests in this patchset (most of them were IGT). Anyway, I'm good
>> with this change; this series is:
>>
>> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
>>
>> And I applied it to amd-staging-drm-next.
>>
>> We will run some extra tests this week; if we find some issues, I'll debug
>> them.
>>
>> Also, thanks, Maíra, for checking this patch as well.
>
> No worries on the delay, the series is not exactly the smallest one I
> have ever sent :) While the changes were mostly mechanical, I could have
> definitely messed something up and I appreciate you taking the time to
> review it and run it through some tests. Please let me know if I can be
> of further assistance on that front.
>
> If you have any thoughts on the blurb I had in the cover letter around
> how to handle the warnings this series resolves with regards to
> mainline, I would love to hear them.
Actually, I think it will be valuable to add a kernel-doc about DML and
create a Troubleshoot section where we can have some of the things you
described in the cover letter as part of this doc. I'll work on that,
and when I have a V1, I'll Cc you.
>
> Cheers,
> Nathan
^ permalink raw reply [flat|nested] 10+ messages in thread