From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Nathan Chancellor <nathan@kernel.org>,
Alex Deucher <alexander.deucher@amd.com>,
Sasha Levin <sashal@kernel.org>,
zaeem.mohamed@amd.com, tungyu.lu@amd.com, alex.hung@amd.com,
v.shevtsov@mt-integration.ru, llvm@lists.linux.dev
Subject: [PATCH AUTOSEL 6.18] drm/amd/display: Reduce number of arguments of dcn30's CalculatePrefetchSchedule()
Date: Mon, 12 Jan 2026 09:58:20 -0500 [thread overview]
Message-ID: <20260112145840.724774-19-sashal@kernel.org> (raw)
In-Reply-To: <20260112145840.724774-1-sashal@kernel.org>
From: Nathan Chancellor <nathan@kernel.org>
[ Upstream commit f54a91f5337cd918eb86cf600320d25b6cfd8209 ]
After an innocuous optimization change in clang-22,
dml30_ModeSupportAndSystemConfigurationFull() is over the 2048 byte
stack limit for display_mode_vba_30.c.
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_mode_vba_30.c:3529:6: warning: stack frame size (2096) exceeds limit (2048) in 'dml30_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]
3529 | void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
| ^
With clang-21, this function was already close to the limit:
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_mode_vba_30.c:3529:6: warning: stack frame size (1912) exceeds limit (1586) in 'dml30_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than]
3529 | void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
| ^
CalculatePrefetchSchedule() has a large number of parameters, which must
be passed on the stack. Most of the parameters between the two callsites
are the same, so they can be accessed through the existing mode_lib
pointer, instead of being passed as explicit arguments. Doing this
reduces the stack size of dml30_ModeSupportAndSystemConfigurationFull()
from 2096 bytes to 1912 bytes with clang-22.
Closes: https://github.com/ClangBuiltLinux/linux/issues/2117
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit b20b3fc4210f83089f835cdb91deec4b0778761a)
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
## Analysis of Commit for Stable Backport
### 1. COMMIT MESSAGE ANALYSIS
The commit addresses a **build fix** for stack frame size exceeding the
2048 byte limit when compiling with clang-22. The warning:
```
drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c:3529:6:
warning: stack frame size (2096) exceeds limit (2048) in
'dml30_ModeSupportAndSystemConfigurationFull'
```
This references a tracked ClangBuiltLinux issue and has been cherry-
picked from mainline, indicating maintainer intent for backport.
### 2. CODE CHANGE ANALYSIS
The fix reduces parameters to `CalculatePrefetchSchedule()` from ~45 to
significantly fewer by:
- Passing an index `k` and accessing values via `mode_lib->vba`
structure instead of explicit parameters
- Converting pointer-based output parameters to direct writes to `v->`
array elements (e.g., `v->DSTXAfterScaler[k]`, `v->Tno_bw[k]`)
- Updating both callsites in `DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParame
tersWatermarksAndPerformanceCalculation()` and
`dml30_ModeSupportAndSystemConfigurationFull()`
The transformation is mechanical - values that were passed as parameters
are now accessed through the structure, and output parameters now write
directly to struct arrays.
### 3. CLASSIFICATION
**BUILD FIX** - This category is explicitly allowed in stable per
Documentation/process/stable-kernel-rules.rst. The fix prevents
compilation warnings that can become errors with `-Werror`.
### 4. SCOPE AND RISK ASSESSMENT
- **Files Changed**: 1 file (display_mode_vba_30.c)
- **Lines Changed**: ~400 lines (substantial refactoring)
- **Subsystem**: AMD display DML code for DCN 3.0
**Risk Factors**:
- Large refactoring, not a minimal surgical fix
- AMD display code is complex; errors can cause display corruption
- The transformation must be exactly right for functional equivalence
**Mitigating Factors**:
- Changes are purely mechanical (struct access vs parameters)
- No logical changes to calculations
- Same values accessed, just differently
- Already cherry-picked with sign-offs from Nathan Chancellor and Alex
Deucher
### 5. USER IMPACT
- **Affected**: Users building with clang-22+ (especially with
`-Werror`)
- **Hardware**: AMD DCN 3.0 display (Navi2x/RDNA2)
- **Severity**: Build failure for affected users
### 6. STABILITY INDICATORS
- Signed-off-by: Nathan Chancellor (ClangBuiltLinux maintainer)
- Signed-off-by: Alex Deucher (AMD display maintainer)
- Already cherry-picked from mainline
(b20b3fc4210f83089f835cdb91deec4b0778761a)
- Linked to tracked issue: ClangBuiltLinux/linux#2117
### 7. STABLE TREE CRITERIA EVALUATION
| Criterion | Assessment |
|-----------|------------|
| Fixes real bug | ✅ Build failure with modern clang |
| Obviously correct | ⚠️ Mechanical but large |
| Small and contained | ❌ ~400 lines is large |
| No new features | ✅ Pure refactoring |
| Build fix exception | ✅ Explicitly allowed |
| Tested | ✅ Cherry-picked from mainline |
### Verdict
This is a **borderline case**. While the change is larger than typical
stable patches, build fixes are explicitly allowed in stable kernel
rules. The transformation is mechanical (not logic changes), the
maintainers have already cherry-picked it, and ClangBuiltLinux issues
receive priority attention. The fact that it was cherry-picked indicates
upstream maintainers believe it's appropriate for stable.
The risk is that display driver bugs can be severe, but the mechanical
nature of the transformation (accessing through struct vs parameters)
and maintainer endorsement provide reasonable confidence.
**YES**
.../dc/dml/dcn30/display_mode_vba_30.c | 258 +++++-------------
1 file changed, 73 insertions(+), 185 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
index 8d24763938ea6..2d19bb8de59c8 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
@@ -77,32 +77,14 @@ static unsigned int dscceComputeDelay(
static unsigned int dscComputeDelay(
enum output_format_class pixelFormat,
enum output_encoder_class Output);
-// Super monster function with some 45 argument
static bool CalculatePrefetchSchedule(
struct display_mode_lib *mode_lib,
- double PercentOfIdealDRAMFabricAndSDPPortBWReceivedAfterUrgLatencyPixelMixedWithVMData,
- double PercentOfIdealDRAMFabricAndSDPPortBWReceivedAfterUrgLatencyVMDataOnly,
+ unsigned int k,
Pipe *myPipe,
unsigned int DSCDelay,
- double DPPCLKDelaySubtotalPlusCNVCFormater,
- double DPPCLKDelaySCL,
- double DPPCLKDelaySCLLBOnly,
- double DPPCLKDelayCNVCCursor,
- double DISPCLKDelaySubtotal,
unsigned int DPP_RECOUT_WIDTH,
- enum output_format_class OutputFormat,
- unsigned int MaxInterDCNTileRepeaters,
unsigned int VStartup,
unsigned int MaxVStartup,
- unsigned int GPUVMPageTableLevels,
- bool GPUVMEnable,
- bool HostVMEnable,
- unsigned int HostVMMaxNonCachedPageTableLevels,
- double HostVMMinPageSize,
- bool DynamicMetadataEnable,
- bool DynamicMetadataVMEnabled,
- int DynamicMetadataLinesBeforeActiveRequired,
- unsigned int DynamicMetadataTransmittedBytes,
double UrgentLatency,
double UrgentExtraLatency,
double TCalc,
@@ -116,7 +98,6 @@ static bool CalculatePrefetchSchedule(
unsigned int MaxNumSwathY,
double PrefetchSourceLinesC,
unsigned int SwathWidthC,
- int BytePerPixelC,
double VInitPreFillC,
unsigned int MaxNumSwathC,
long swath_width_luma_ub,
@@ -124,9 +105,6 @@ static bool CalculatePrefetchSchedule(
unsigned int SwathHeightY,
unsigned int SwathHeightC,
double TWait,
- bool ProgressiveToInterlaceUnitInOPP,
- double *DSTXAfterScaler,
- double *DSTYAfterScaler,
double *DestinationLinesForPrefetch,
double *PrefetchBandwidth,
double *DestinationLinesToRequestVMInVBlank,
@@ -135,14 +113,7 @@ static bool CalculatePrefetchSchedule(
double *VRatioPrefetchC,
double *RequiredPrefetchPixDataBWLuma,
double *RequiredPrefetchPixDataBWChroma,
- bool *NotEnoughTimeForDynamicMetadata,
- double *Tno_bw,
- double *prefetch_vmrow_bw,
- double *Tdmdl_vm,
- double *Tdmdl,
- unsigned int *VUpdateOffsetPix,
- double *VUpdateWidthPix,
- double *VReadyOffsetPix);
+ bool *NotEnoughTimeForDynamicMetadata);
static double RoundToDFSGranularityUp(double Clock, double VCOSpeed);
static double RoundToDFSGranularityDown(double Clock, double VCOSpeed);
static void CalculateDCCConfiguration(
@@ -810,29 +781,12 @@ static unsigned int dscComputeDelay(enum output_format_class pixelFormat, enum o
static bool CalculatePrefetchSchedule(
struct display_mode_lib *mode_lib,
- double PercentOfIdealDRAMFabricAndSDPPortBWReceivedAfterUrgLatencyPixelMixedWithVMData,
- double PercentOfIdealDRAMFabricAndSDPPortBWReceivedAfterUrgLatencyVMDataOnly,
+ unsigned int k,
Pipe *myPipe,
unsigned int DSCDelay,
- double DPPCLKDelaySubtotalPlusCNVCFormater,
- double DPPCLKDelaySCL,
- double DPPCLKDelaySCLLBOnly,
- double DPPCLKDelayCNVCCursor,
- double DISPCLKDelaySubtotal,
unsigned int DPP_RECOUT_WIDTH,
- enum output_format_class OutputFormat,
- unsigned int MaxInterDCNTileRepeaters,
unsigned int VStartup,
unsigned int MaxVStartup,
- unsigned int GPUVMPageTableLevels,
- bool GPUVMEnable,
- bool HostVMEnable,
- unsigned int HostVMMaxNonCachedPageTableLevels,
- double HostVMMinPageSize,
- bool DynamicMetadataEnable,
- bool DynamicMetadataVMEnabled,
- int DynamicMetadataLinesBeforeActiveRequired,
- unsigned int DynamicMetadataTransmittedBytes,
double UrgentLatency,
double UrgentExtraLatency,
double TCalc,
@@ -846,7 +800,6 @@ static bool CalculatePrefetchSchedule(
unsigned int MaxNumSwathY,
double PrefetchSourceLinesC,
unsigned int SwathWidthC,
- int BytePerPixelC,
double VInitPreFillC,
unsigned int MaxNumSwathC,
long swath_width_luma_ub,
@@ -854,9 +807,6 @@ static bool CalculatePrefetchSchedule(
unsigned int SwathHeightY,
unsigned int SwathHeightC,
double TWait,
- bool ProgressiveToInterlaceUnitInOPP,
- double *DSTXAfterScaler,
- double *DSTYAfterScaler,
double *DestinationLinesForPrefetch,
double *PrefetchBandwidth,
double *DestinationLinesToRequestVMInVBlank,
@@ -865,15 +815,10 @@ static bool CalculatePrefetchSchedule(
double *VRatioPrefetchC,
double *RequiredPrefetchPixDataBWLuma,
double *RequiredPrefetchPixDataBWChroma,
- bool *NotEnoughTimeForDynamicMetadata,
- double *Tno_bw,
- double *prefetch_vmrow_bw,
- double *Tdmdl_vm,
- double *Tdmdl,
- unsigned int *VUpdateOffsetPix,
- double *VUpdateWidthPix,
- double *VReadyOffsetPix)
+ bool *NotEnoughTimeForDynamicMetadata)
{
+ struct vba_vars_st *v = &mode_lib->vba;
+ double DPPCLKDelaySubtotalPlusCNVCFormater = v->DPPCLKDelaySubtotal + v->DPPCLKDelayCNVCFormater;
bool MyError = false;
unsigned int DPPCycles = 0, DISPCLKCycles = 0;
double DSTTotalPixelsAfterScaler = 0;
@@ -905,26 +850,26 @@ static bool CalculatePrefetchSchedule(
double Tdmec = 0;
double Tdmsks = 0;
- if (GPUVMEnable == true && HostVMEnable == true) {
- HostVMInefficiencyFactor = PercentOfIdealDRAMFabricAndSDPPortBWReceivedAfterUrgLatencyPixelMixedWithVMData / PercentOfIdealDRAMFabricAndSDPPortBWReceivedAfterUrgLatencyVMDataOnly;
- HostVMDynamicLevelsTrips = HostVMMaxNonCachedPageTableLevels;
+ if (v->GPUVMEnable == true && v->HostVMEnable == true) {
+ HostVMInefficiencyFactor = v->PercentOfIdealDRAMFabricAndSDPPortBWReceivedAfterUrgLatencyPixelMixedWithVMData / v->PercentOfIdealDRAMFabricAndSDPPortBWReceivedAfterUrgLatencyVMDataOnly;
+ HostVMDynamicLevelsTrips = v->HostVMMaxNonCachedPageTableLevels;
} else {
HostVMInefficiencyFactor = 1;
HostVMDynamicLevelsTrips = 0;
}
CalculateDynamicMetadataParameters(
- MaxInterDCNTileRepeaters,
+ v->MaxInterDCNTileRepeaters,
myPipe->DPPCLK,
myPipe->DISPCLK,
myPipe->DCFCLKDeepSleep,
myPipe->PixelClock,
myPipe->HTotal,
myPipe->VBlank,
- DynamicMetadataTransmittedBytes,
- DynamicMetadataLinesBeforeActiveRequired,
+ v->DynamicMetadataTransmittedBytes[k],
+ v->DynamicMetadataLinesBeforeActiveRequired[k],
myPipe->InterlaceEnable,
- ProgressiveToInterlaceUnitInOPP,
+ v->ProgressiveToInterlaceUnitInOPP,
&Tsetup,
&Tdmbf,
&Tdmec,
@@ -932,16 +877,16 @@ static bool CalculatePrefetchSchedule(
LineTime = myPipe->HTotal / myPipe->PixelClock;
trip_to_mem = UrgentLatency;
- Tvm_trips = UrgentExtraLatency + trip_to_mem * (GPUVMPageTableLevels * (HostVMDynamicLevelsTrips + 1) - 1);
+ Tvm_trips = UrgentExtraLatency + trip_to_mem * (v->GPUVMMaxPageTableLevels * (HostVMDynamicLevelsTrips + 1) - 1);
- if (DynamicMetadataVMEnabled == true && GPUVMEnable == true) {
- *Tdmdl = TWait + Tvm_trips + trip_to_mem;
+ if (v->DynamicMetadataVMEnabled == true && v->GPUVMEnable == true) {
+ v->Tdmdl[k] = TWait + Tvm_trips + trip_to_mem;
} else {
- *Tdmdl = TWait + UrgentExtraLatency;
+ v->Tdmdl[k] = TWait + UrgentExtraLatency;
}
- if (DynamicMetadataEnable == true) {
- if (VStartup * LineTime < Tsetup + *Tdmdl + Tdmbf + Tdmec + Tdmsks) {
+ if (v->DynamicMetadataEnable[k] == true) {
+ if (VStartup * LineTime < Tsetup + v->Tdmdl[k] + Tdmbf + Tdmec + Tdmsks) {
*NotEnoughTimeForDynamicMetadata = true;
} else {
*NotEnoughTimeForDynamicMetadata = false;
@@ -949,39 +894,39 @@ static bool CalculatePrefetchSchedule(
dml_print("DML: Tdmbf: %fus - time for dmd transfer from dchub to dio output buffer\n", Tdmbf);
dml_print("DML: Tdmec: %fus - time dio takes to transfer dmd\n", Tdmec);
dml_print("DML: Tdmsks: %fus - time before active dmd must complete transmission at dio\n", Tdmsks);
- dml_print("DML: Tdmdl: %fus - time for fabric to become ready and fetch dmd \n", *Tdmdl);
+ dml_print("DML: Tdmdl: %fus - time for fabric to become ready and fetch dmd \n", v->Tdmdl[k]);
}
} else {
*NotEnoughTimeForDynamicMetadata = false;
}
- *Tdmdl_vm = (DynamicMetadataEnable == true && DynamicMetadataVMEnabled == true && GPUVMEnable == true ? TWait + Tvm_trips : 0);
+ v->Tdmdl_vm[k] = (v->DynamicMetadataEnable[k] == true && v->DynamicMetadataVMEnabled == true && v->GPUVMEnable == true ? TWait + Tvm_trips : 0);
if (myPipe->ScalerEnabled)
- DPPCycles = DPPCLKDelaySubtotalPlusCNVCFormater + DPPCLKDelaySCL;
+ DPPCycles = DPPCLKDelaySubtotalPlusCNVCFormater + v->DPPCLKDelaySCL;
else
- DPPCycles = DPPCLKDelaySubtotalPlusCNVCFormater + DPPCLKDelaySCLLBOnly;
+ DPPCycles = DPPCLKDelaySubtotalPlusCNVCFormater + v->DPPCLKDelaySCLLBOnly;
- DPPCycles = DPPCycles + myPipe->NumberOfCursors * DPPCLKDelayCNVCCursor;
+ DPPCycles = DPPCycles + myPipe->NumberOfCursors * v->DPPCLKDelayCNVCCursor;
- DISPCLKCycles = DISPCLKDelaySubtotal;
+ DISPCLKCycles = v->DISPCLKDelaySubtotal;
if (myPipe->DPPCLK == 0.0 || myPipe->DISPCLK == 0.0)
return true;
- *DSTXAfterScaler = DPPCycles * myPipe->PixelClock / myPipe->DPPCLK + DISPCLKCycles * myPipe->PixelClock / myPipe->DISPCLK
+ v->DSTXAfterScaler[k] = DPPCycles * myPipe->PixelClock / myPipe->DPPCLK + DISPCLKCycles * myPipe->PixelClock / myPipe->DISPCLK
+ DSCDelay;
- *DSTXAfterScaler = *DSTXAfterScaler + ((myPipe->ODMCombineEnabled)?18:0) + (myPipe->DPPPerPlane - 1) * DPP_RECOUT_WIDTH;
+ v->DSTXAfterScaler[k] = v->DSTXAfterScaler[k] + ((myPipe->ODMCombineEnabled)?18:0) + (myPipe->DPPPerPlane - 1) * DPP_RECOUT_WIDTH;
- if (OutputFormat == dm_420 || (myPipe->InterlaceEnable && ProgressiveToInterlaceUnitInOPP))
- *DSTYAfterScaler = 1;
+ if (v->OutputFormat[k] == dm_420 || (myPipe->InterlaceEnable && v->ProgressiveToInterlaceUnitInOPP))
+ v->DSTYAfterScaler[k] = 1;
else
- *DSTYAfterScaler = 0;
+ v->DSTYAfterScaler[k] = 0;
- DSTTotalPixelsAfterScaler = *DSTYAfterScaler * myPipe->HTotal + *DSTXAfterScaler;
- *DSTYAfterScaler = dml_floor(DSTTotalPixelsAfterScaler / myPipe->HTotal, 1);
- *DSTXAfterScaler = DSTTotalPixelsAfterScaler - ((double) (*DSTYAfterScaler * myPipe->HTotal));
+ DSTTotalPixelsAfterScaler = v->DSTYAfterScaler[k] * myPipe->HTotal + v->DSTXAfterScaler[k];
+ v->DSTYAfterScaler[k] = dml_floor(DSTTotalPixelsAfterScaler / myPipe->HTotal, 1);
+ v->DSTXAfterScaler[k] = DSTTotalPixelsAfterScaler - ((double) (v->DSTYAfterScaler[k] * myPipe->HTotal));
MyError = false;
@@ -990,33 +935,33 @@ static bool CalculatePrefetchSchedule(
Tvm_trips_rounded = dml_ceil(4.0 * Tvm_trips / LineTime, 1) / 4 * LineTime;
Tr0_trips_rounded = dml_ceil(4.0 * Tr0_trips / LineTime, 1) / 4 * LineTime;
- if (GPUVMEnable) {
- if (GPUVMPageTableLevels >= 3) {
- *Tno_bw = UrgentExtraLatency + trip_to_mem * ((GPUVMPageTableLevels - 2) - 1);
+ if (v->GPUVMEnable) {
+ if (v->GPUVMMaxPageTableLevels >= 3) {
+ v->Tno_bw[k] = UrgentExtraLatency + trip_to_mem * ((v->GPUVMMaxPageTableLevels - 2) - 1);
} else
- *Tno_bw = 0;
+ v->Tno_bw[k] = 0;
} else if (!myPipe->DCCEnable)
- *Tno_bw = LineTime;
+ v->Tno_bw[k] = LineTime;
else
- *Tno_bw = LineTime / 4;
+ v->Tno_bw[k] = LineTime / 4;
- dst_y_prefetch_equ = VStartup - (Tsetup + dml_max(TWait + TCalc, *Tdmdl)) / LineTime
- - (*DSTYAfterScaler + *DSTXAfterScaler / myPipe->HTotal);
+ dst_y_prefetch_equ = VStartup - (Tsetup + dml_max(TWait + TCalc, v->Tdmdl[k])) / LineTime
+ - (v->DSTYAfterScaler[k] + v->DSTXAfterScaler[k] / myPipe->HTotal);
dst_y_prefetch_equ = dml_min(dst_y_prefetch_equ, 63.75); // limit to the reg limit of U6.2 for DST_Y_PREFETCH
Lsw_oto = dml_max(PrefetchSourceLinesY, PrefetchSourceLinesC);
Tsw_oto = Lsw_oto * LineTime;
- prefetch_bw_oto = (PrefetchSourceLinesY * swath_width_luma_ub * BytePerPixelY + PrefetchSourceLinesC * swath_width_chroma_ub * BytePerPixelC) / Tsw_oto;
+ prefetch_bw_oto = (PrefetchSourceLinesY * swath_width_luma_ub * BytePerPixelY + PrefetchSourceLinesC * swath_width_chroma_ub * v->BytePerPixelC[k]) / Tsw_oto;
- if (GPUVMEnable == true) {
- Tvm_oto = dml_max3(*Tno_bw + PDEAndMetaPTEBytesFrame * HostVMInefficiencyFactor / prefetch_bw_oto,
+ if (v->GPUVMEnable == true) {
+ Tvm_oto = dml_max3(v->Tno_bw[k] + PDEAndMetaPTEBytesFrame * HostVMInefficiencyFactor / prefetch_bw_oto,
Tvm_trips,
LineTime / 4.0);
} else
Tvm_oto = LineTime / 4.0;
- if ((GPUVMEnable == true || myPipe->DCCEnable == true)) {
+ if ((v->GPUVMEnable == true || myPipe->DCCEnable == true)) {
Tr0_oto = dml_max3(
(MetaRowByte + PixelPTEBytesPerRow * HostVMInefficiencyFactor) / prefetch_bw_oto,
LineTime - Tvm_oto, LineTime / 4);
@@ -1042,10 +987,10 @@ static bool CalculatePrefetchSchedule(
dml_print("DML: Tdmbf: %fus - time for dmd transfer from dchub to dio output buffer\n", Tdmbf);
dml_print("DML: Tdmec: %fus - time dio takes to transfer dmd\n", Tdmec);
dml_print("DML: Tdmsks: %fus - time before active dmd must complete transmission at dio\n", Tdmsks);
- dml_print("DML: Tdmdl_vm: %fus - time for vm stages of dmd \n", *Tdmdl_vm);
- dml_print("DML: Tdmdl: %fus - time for fabric to become ready and fetch dmd \n", *Tdmdl);
- dml_print("DML: dst_x_after_scl: %f pixels - number of pixel clocks pipeline and buffer delay after scaler \n", *DSTXAfterScaler);
- dml_print("DML: dst_y_after_scl: %d lines - number of lines of pipeline and buffer delay after scaler \n", (int)*DSTYAfterScaler);
+ dml_print("DML: Tdmdl_vm: %fus - time for vm stages of dmd \n", v->Tdmdl_vm[k]);
+ dml_print("DML: Tdmdl: %fus - time for fabric to become ready and fetch dmd \n", v->Tdmdl[k]);
+ dml_print("DML: dst_x_after_scl: %f pixels - number of pixel clocks pipeline and buffer delay after scaler \n", v->DSTXAfterScaler[k]);
+ dml_print("DML: dst_y_after_scl: %d lines - number of lines of pipeline and buffer delay after scaler \n", (int)v->DSTYAfterScaler[k]);
*PrefetchBandwidth = 0;
*DestinationLinesToRequestVMInVBlank = 0;
@@ -1059,26 +1004,26 @@ static bool CalculatePrefetchSchedule(
double PrefetchBandwidth3 = 0;
double PrefetchBandwidth4 = 0;
- if (Tpre_rounded - *Tno_bw > 0)
+ if (Tpre_rounded - v->Tno_bw[k] > 0)
PrefetchBandwidth1 = (PDEAndMetaPTEBytesFrame * HostVMInefficiencyFactor + 2 * MetaRowByte
+ 2 * PixelPTEBytesPerRow * HostVMInefficiencyFactor
+ PrefetchSourceLinesY * swath_width_luma_ub * BytePerPixelY
- + PrefetchSourceLinesC * swath_width_chroma_ub * BytePerPixelC)
- / (Tpre_rounded - *Tno_bw);
+ + PrefetchSourceLinesC * swath_width_chroma_ub * v->BytePerPixelC[k])
+ / (Tpre_rounded - v->Tno_bw[k]);
else
PrefetchBandwidth1 = 0;
- if (VStartup == MaxVStartup && (PrefetchBandwidth1 > 4 * prefetch_bw_oto) && (Tpre_rounded - Tsw_oto / 4 - 0.75 * LineTime - *Tno_bw) > 0) {
- PrefetchBandwidth1 = (PDEAndMetaPTEBytesFrame * HostVMInefficiencyFactor + 2 * MetaRowByte + 2 * PixelPTEBytesPerRow * HostVMInefficiencyFactor) / (Tpre_rounded - Tsw_oto / 4 - 0.75 * LineTime - *Tno_bw);
+ if (VStartup == MaxVStartup && (PrefetchBandwidth1 > 4 * prefetch_bw_oto) && (Tpre_rounded - Tsw_oto / 4 - 0.75 * LineTime - v->Tno_bw[k]) > 0) {
+ PrefetchBandwidth1 = (PDEAndMetaPTEBytesFrame * HostVMInefficiencyFactor + 2 * MetaRowByte + 2 * PixelPTEBytesPerRow * HostVMInefficiencyFactor) / (Tpre_rounded - Tsw_oto / 4 - 0.75 * LineTime - v->Tno_bw[k]);
}
- if (Tpre_rounded - *Tno_bw - 2 * Tr0_trips_rounded > 0)
+ if (Tpre_rounded - v->Tno_bw[k] - 2 * Tr0_trips_rounded > 0)
PrefetchBandwidth2 = (PDEAndMetaPTEBytesFrame *
HostVMInefficiencyFactor + PrefetchSourceLinesY *
swath_width_luma_ub * BytePerPixelY +
PrefetchSourceLinesC * swath_width_chroma_ub *
- BytePerPixelC) /
- (Tpre_rounded - *Tno_bw - 2 * Tr0_trips_rounded);
+ v->BytePerPixelC[k]) /
+ (Tpre_rounded - v->Tno_bw[k] - 2 * Tr0_trips_rounded);
else
PrefetchBandwidth2 = 0;
@@ -1086,7 +1031,7 @@ static bool CalculatePrefetchSchedule(
PrefetchBandwidth3 = (2 * MetaRowByte + 2 * PixelPTEBytesPerRow *
HostVMInefficiencyFactor + PrefetchSourceLinesY *
swath_width_luma_ub * BytePerPixelY + PrefetchSourceLinesC *
- swath_width_chroma_ub * BytePerPixelC) / (Tpre_rounded -
+ swath_width_chroma_ub * v->BytePerPixelC[k]) / (Tpre_rounded -
Tvm_trips_rounded);
else
PrefetchBandwidth3 = 0;
@@ -1096,7 +1041,7 @@ static bool CalculatePrefetchSchedule(
}
if (Tpre_rounded - Tvm_trips_rounded - 2 * Tr0_trips_rounded > 0)
- PrefetchBandwidth4 = (PrefetchSourceLinesY * swath_width_luma_ub * BytePerPixelY + PrefetchSourceLinesC * swath_width_chroma_ub * BytePerPixelC)
+ PrefetchBandwidth4 = (PrefetchSourceLinesY * swath_width_luma_ub * BytePerPixelY + PrefetchSourceLinesC * swath_width_chroma_ub * v->BytePerPixelC[k])
/ (Tpre_rounded - Tvm_trips_rounded - 2 * Tr0_trips_rounded);
else
PrefetchBandwidth4 = 0;
@@ -1107,7 +1052,7 @@ static bool CalculatePrefetchSchedule(
bool Case3OK;
if (PrefetchBandwidth1 > 0) {
- if (*Tno_bw + PDEAndMetaPTEBytesFrame * HostVMInefficiencyFactor / PrefetchBandwidth1
+ if (v->Tno_bw[k] + PDEAndMetaPTEBytesFrame * HostVMInefficiencyFactor / PrefetchBandwidth1
>= Tvm_trips_rounded && (MetaRowByte + PixelPTEBytesPerRow * HostVMInefficiencyFactor) / PrefetchBandwidth1 >= Tr0_trips_rounded) {
Case1OK = true;
} else {
@@ -1118,7 +1063,7 @@ static bool CalculatePrefetchSchedule(
}
if (PrefetchBandwidth2 > 0) {
- if (*Tno_bw + PDEAndMetaPTEBytesFrame * HostVMInefficiencyFactor / PrefetchBandwidth2
+ if (v->Tno_bw[k] + PDEAndMetaPTEBytesFrame * HostVMInefficiencyFactor / PrefetchBandwidth2
>= Tvm_trips_rounded && (MetaRowByte + PixelPTEBytesPerRow * HostVMInefficiencyFactor) / PrefetchBandwidth2 < Tr0_trips_rounded) {
Case2OK = true;
} else {
@@ -1129,7 +1074,7 @@ static bool CalculatePrefetchSchedule(
}
if (PrefetchBandwidth3 > 0) {
- if (*Tno_bw + PDEAndMetaPTEBytesFrame * HostVMInefficiencyFactor / PrefetchBandwidth3
+ if (v->Tno_bw[k] + PDEAndMetaPTEBytesFrame * HostVMInefficiencyFactor / PrefetchBandwidth3
< Tvm_trips_rounded && (MetaRowByte + PixelPTEBytesPerRow * HostVMInefficiencyFactor) / PrefetchBandwidth3 >= Tr0_trips_rounded) {
Case3OK = true;
} else {
@@ -1152,13 +1097,13 @@ static bool CalculatePrefetchSchedule(
dml_print("DML: prefetch_bw_equ: %f\n", prefetch_bw_equ);
if (prefetch_bw_equ > 0) {
- if (GPUVMEnable) {
- Tvm_equ = dml_max3(*Tno_bw + PDEAndMetaPTEBytesFrame * HostVMInefficiencyFactor / prefetch_bw_equ, Tvm_trips, LineTime / 4);
+ if (v->GPUVMEnable) {
+ Tvm_equ = dml_max3(v->Tno_bw[k] + PDEAndMetaPTEBytesFrame * HostVMInefficiencyFactor / prefetch_bw_equ, Tvm_trips, LineTime / 4);
} else {
Tvm_equ = LineTime / 4;
}
- if ((GPUVMEnable || myPipe->DCCEnable)) {
+ if ((v->GPUVMEnable || myPipe->DCCEnable)) {
Tr0_equ = dml_max4(
(MetaRowByte + PixelPTEBytesPerRow * HostVMInefficiencyFactor) / prefetch_bw_equ,
Tr0_trips,
@@ -1227,7 +1172,7 @@ static bool CalculatePrefetchSchedule(
}
*RequiredPrefetchPixDataBWLuma = (double) PrefetchSourceLinesY / LinesToRequestPrefetchPixelData * BytePerPixelY * swath_width_luma_ub / LineTime;
- *RequiredPrefetchPixDataBWChroma = (double) PrefetchSourceLinesC / LinesToRequestPrefetchPixelData * BytePerPixelC * swath_width_chroma_ub / LineTime;
+ *RequiredPrefetchPixDataBWChroma = (double) PrefetchSourceLinesC / LinesToRequestPrefetchPixelData * v->BytePerPixelC[k] * swath_width_chroma_ub / LineTime;
} else {
MyError = true;
dml_print("DML: MyErr set %s:%d\n", __FILE__, __LINE__);
@@ -1243,9 +1188,9 @@ static bool CalculatePrefetchSchedule(
dml_print("DML: Tr0: %fus - time to fetch first row of data pagetables and first row of meta data (done in parallel)\n", TimeForFetchingRowInVBlank);
dml_print("DML: Tr1: %fus - time to fetch second row of data pagetables and second row of meta data (done in parallel)\n", TimeForFetchingRowInVBlank);
dml_print("DML: Tsw: %fus = time to fetch enough pixel data and cursor data to feed the scalers init position and detile\n", (double)LinesToRequestPrefetchPixelData * LineTime);
- dml_print("DML: To: %fus - time for propagation from scaler to optc\n", (*DSTYAfterScaler + ((*DSTXAfterScaler) / (double) myPipe->HTotal)) * LineTime);
+ dml_print("DML: To: %fus - time for propagation from scaler to optc\n", (v->DSTYAfterScaler[k] + ((v->DSTXAfterScaler[k]) / (double) myPipe->HTotal)) * LineTime);
dml_print("DML: Tvstartup - Tsetup - Tcalc - Twait - Tpre - To > 0\n");
- dml_print("DML: Tslack(pre): %fus - time left over in schedule\n", VStartup * LineTime - TimeForFetchingMetaPTE - 2 * TimeForFetchingRowInVBlank - (*DSTYAfterScaler + ((*DSTXAfterScaler) / (double) myPipe->HTotal)) * LineTime - TWait - TCalc - Tsetup);
+ dml_print("DML: Tslack(pre): %fus - time left over in schedule\n", VStartup * LineTime - TimeForFetchingMetaPTE - 2 * TimeForFetchingRowInVBlank - (v->DSTYAfterScaler[k] + ((v->DSTXAfterScaler[k]) / (double) myPipe->HTotal)) * LineTime - TWait - TCalc - Tsetup);
dml_print("DML: row_bytes = dpte_row_bytes (per_pipe) = PixelPTEBytesPerRow = : %d\n", PixelPTEBytesPerRow);
} else {
@@ -1276,7 +1221,7 @@ static bool CalculatePrefetchSchedule(
dml_print("DML: MyErr set %s:%d\n", __FILE__, __LINE__);
}
- *prefetch_vmrow_bw = dml_max(prefetch_vm_bw, prefetch_row_bw);
+ v->prefetch_vmrow_bw[k] = dml_max(prefetch_vm_bw, prefetch_row_bw);
}
if (MyError) {
@@ -2437,30 +2382,12 @@ static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman
v->ErrorResult[k] = CalculatePrefetchSchedule(
mode_lib,
- v->PercentOfIdealDRAMFabricAndSDPPortBWReceivedAfterUrgLatencyPixelMixedWithVMData,
- v->PercentOfIdealDRAMFabricAndSDPPortBWReceivedAfterUrgLatencyVMDataOnly,
+ k,
&myPipe,
v->DSCDelay[k],
- v->DPPCLKDelaySubtotal
- + v->DPPCLKDelayCNVCFormater,
- v->DPPCLKDelaySCL,
- v->DPPCLKDelaySCLLBOnly,
- v->DPPCLKDelayCNVCCursor,
- v->DISPCLKDelaySubtotal,
(unsigned int) (v->SwathWidthY[k] / v->HRatio[k]),
- v->OutputFormat[k],
- v->MaxInterDCNTileRepeaters,
dml_min(v->VStartupLines, v->MaxVStartupLines[k]),
v->MaxVStartupLines[k],
- v->GPUVMMaxPageTableLevels,
- v->GPUVMEnable,
- v->HostVMEnable,
- v->HostVMMaxNonCachedPageTableLevels,
- v->HostVMMinPageSize,
- v->DynamicMetadataEnable[k],
- v->DynamicMetadataVMEnabled,
- v->DynamicMetadataLinesBeforeActiveRequired[k],
- v->DynamicMetadataTransmittedBytes[k],
v->UrgentLatency,
v->UrgentExtraLatency,
v->TCalc,
@@ -2474,7 +2401,6 @@ static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman
v->MaxNumSwathY[k],
v->PrefetchSourceLinesC[k],
v->SwathWidthC[k],
- v->BytePerPixelC[k],
v->VInitPreFillC[k],
v->MaxNumSwathC[k],
v->swath_width_luma_ub[k],
@@ -2482,9 +2408,6 @@ static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman
v->SwathHeightY[k],
v->SwathHeightC[k],
TWait,
- v->ProgressiveToInterlaceUnitInOPP,
- &v->DSTXAfterScaler[k],
- &v->DSTYAfterScaler[k],
&v->DestinationLinesForPrefetch[k],
&v->PrefetchBandwidth[k],
&v->DestinationLinesToRequestVMInVBlank[k],
@@ -2493,14 +2416,7 @@ static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerforman
&v->VRatioPrefetchC[k],
&v->RequiredPrefetchPixDataBWLuma[k],
&v->RequiredPrefetchPixDataBWChroma[k],
- &v->NotEnoughTimeForDynamicMetadata[k],
- &v->Tno_bw[k],
- &v->prefetch_vmrow_bw[k],
- &v->Tdmdl_vm[k],
- &v->Tdmdl[k],
- &v->VUpdateOffsetPix[k],
- &v->VUpdateWidthPix[k],
- &v->VReadyOffsetPix[k]);
+ &v->NotEnoughTimeForDynamicMetadata[k]);
if (v->BlendingAndTiming[k] == k) {
double TotalRepeaterDelayTime = v->MaxInterDCNTileRepeaters * (2 / v->DPPCLK[k] + 3 / v->DISPCLK);
v->VUpdateWidthPix[k] = (14 / v->DCFCLKDeepSleep + 12 / v->DPPCLK[k] + TotalRepeaterDelayTime) * v->PixelClock[k];
@@ -4770,29 +4686,12 @@ void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l
v->NoTimeForPrefetch[i][j][k] = CalculatePrefetchSchedule(
mode_lib,
- v->PercentOfIdealDRAMFabricAndSDPPortBWReceivedAfterUrgLatencyPixelMixedWithVMData,
- v->PercentOfIdealDRAMFabricAndSDPPortBWReceivedAfterUrgLatencyVMDataOnly,
+ k,
&myPipe,
v->DSCDelayPerState[i][k],
- v->DPPCLKDelaySubtotal + v->DPPCLKDelayCNVCFormater,
- v->DPPCLKDelaySCL,
- v->DPPCLKDelaySCLLBOnly,
- v->DPPCLKDelayCNVCCursor,
- v->DISPCLKDelaySubtotal,
v->SwathWidthYThisState[k] / v->HRatio[k],
- v->OutputFormat[k],
- v->MaxInterDCNTileRepeaters,
dml_min(v->MaxVStartup, v->MaximumVStartup[i][j][k]),
v->MaximumVStartup[i][j][k],
- v->GPUVMMaxPageTableLevels,
- v->GPUVMEnable,
- v->HostVMEnable,
- v->HostVMMaxNonCachedPageTableLevels,
- v->HostVMMinPageSize,
- v->DynamicMetadataEnable[k],
- v->DynamicMetadataVMEnabled,
- v->DynamicMetadataLinesBeforeActiveRequired[k],
- v->DynamicMetadataTransmittedBytes[k],
v->UrgLatency[i],
v->ExtraLatency,
v->TimeCalc,
@@ -4806,7 +4705,6 @@ void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l
v->MaxNumSwY[k],
v->PrefetchLinesC[i][j][k],
v->SwathWidthCThisState[k],
- v->BytePerPixelC[k],
v->PrefillC[k],
v->MaxNumSwC[k],
v->swath_width_luma_ub_this_state[k],
@@ -4814,9 +4712,6 @@ void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l
v->SwathHeightYThisState[k],
v->SwathHeightCThisState[k],
v->TWait,
- v->ProgressiveToInterlaceUnitInOPP,
- &v->DSTXAfterScaler[k],
- &v->DSTYAfterScaler[k],
&v->LineTimesForPrefetch[k],
&v->PrefetchBW[k],
&v->LinesForMetaPTE[k],
@@ -4825,14 +4720,7 @@ void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l
&v->VRatioPreC[i][j][k],
&v->RequiredPrefetchPixelDataBWLuma[i][j][k],
&v->RequiredPrefetchPixelDataBWChroma[i][j][k],
- &v->NoTimeForDynamicMetadata[i][j][k],
- &v->Tno_bw[k],
- &v->prefetch_vmrow_bw[k],
- &v->Tdmdl_vm[k],
- &v->Tdmdl[k],
- &v->VUpdateOffsetPix[k],
- &v->VUpdateWidthPix[k],
- &v->VReadyOffsetPix[k]);
+ &v->NoTimeForDynamicMetadata[i][j][k]);
}
for (k = 0; k <= v->NumberOfActivePlanes - 1; k++) {
--
2.51.0
next prev parent reply other threads:[~2026-01-12 14:59 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-12 14:58 [PATCH AUTOSEL 6.18] HID: Elecom: Add support for ELECOM M-XT3DRBK (018C) Sasha Levin
2026-01-12 14:58 ` [PATCH AUTOSEL 6.18] x86/sev: Disable GCOV on noinstr object Sasha Levin
2026-01-12 14:58 ` [PATCH AUTOSEL 6.18-5.10] wifi: mac80211: collect station statistics earlier when disconnect Sasha Levin
2026-01-12 14:58 ` [PATCH AUTOSEL 6.18] btrfs: do not free data reservation in fallback from inline due to -ENOSPC Sasha Levin
2026-01-12 14:58 ` [PATCH AUTOSEL 6.18-5.10] btrfs: fix deadlock in wait_current_trans() due to ignored transaction type Sasha Levin
2026-01-19 11:46 ` Motiejus Jakštys
2026-01-20 11:03 ` Greg KH
2026-01-12 14:58 ` [PATCH AUTOSEL 6.18-5.10] HID: quirks: Add another Chicony HP 5MP Cameras to hid_ignore_list Sasha Levin
2026-01-12 14:58 ` [PATCH AUTOSEL 6.18-6.1] HID: intel-ish-hid: Update ishtp bus match to support device ID table Sasha Levin
2026-01-12 14:58 ` [PATCH AUTOSEL 6.18-5.10] HID: multitouch: add MT_QUIRK_STICKY_FINGERS to MT_CLS_VTL Sasha Levin
2026-01-12 14:58 ` [PATCH AUTOSEL 6.18-6.1] HID: i2c-hid: fix potential buffer overflow in i2c_hid_get_report() Sasha Levin
2026-01-12 14:58 ` [PATCH AUTOSEL 6.18] riscv: trace: fix snapshot deadlock with sbi ecall Sasha Levin
2026-01-12 14:58 ` [PATCH AUTOSEL 6.18-6.12] drm/amd/pm: Disable MMIO access during SMU Mode 1 reset Sasha Levin
2026-01-12 14:58 ` [PATCH AUTOSEL 6.18-6.12] riscv: Sanitize syscall table indexing under speculation Sasha Levin
2026-01-12 14:58 ` [PATCH AUTOSEL 6.18-5.15] netfilter: replace -EEXIST with -EBUSY Sasha Levin
2026-01-12 14:58 ` [PATCH AUTOSEL 6.18-6.12] PCI: qcom: Remove ASPM L0s support for MSM8996 SoC Sasha Levin
2026-01-12 14:58 ` [PATCH AUTOSEL 6.18-5.10] ALSA: hda/realtek: add HP Laptop 15s-eq1xxx mute LED quirk Sasha Levin
2026-01-12 14:58 ` [PATCH AUTOSEL 6.18-5.10] ring-buffer: Avoid softlockup in ring_buffer_resize() during memory free Sasha Levin
2026-01-12 14:58 ` [PATCH AUTOSEL 6.18-5.15] HID: playstation: Center initial joystick axes to prevent spurious events Sasha Levin
2026-01-12 14:58 ` [PATCH AUTOSEL 6.18-5.10] HID: intel-ish-hid: Reset enum_devices_done before enumeration Sasha Levin
2026-01-12 14:58 ` Sasha Levin [this message]
2026-01-12 14:58 ` [PATCH AUTOSEL 6.18-5.10] HID: Apply quirk HID_QUIRK_ALWAYS_POLL to Edifier QR30 (2d99:a101) Sasha Levin
2026-01-12 14:58 ` [PATCH AUTOSEL 6.18-6.1] btrfs: fix reservation leak in some error paths when inserting inline extent Sasha Levin
2026-01-12 14:58 ` [PATCH AUTOSEL 6.18-6.12] ALSA: hda/realtek: Add quirk for Acer Nitro AN517-55 Sasha Levin
2026-01-12 14:58 ` [PATCH AUTOSEL 6.18-6.12] HID: logitech: add HID++ support for Logitech MX Anywhere 3S Sasha Levin
2026-01-12 14:58 ` [PATCH AUTOSEL 6.18] HID: Intel-thc-hid: Intel-thc: Add safety check for reading DMA buffer Sasha Levin
2026-01-12 14:58 ` [PATCH AUTOSEL 6.18-5.10] HID: usbhid: paper over wrong bNumDescriptor field Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260112145840.724774-19-sashal@kernel.org \
--to=sashal@kernel.org \
--cc=alex.hung@amd.com \
--cc=alexander.deucher@amd.com \
--cc=llvm@lists.linux.dev \
--cc=nathan@kernel.org \
--cc=patches@lists.linux.dev \
--cc=stable@vger.kernel.org \
--cc=tungyu.lu@amd.com \
--cc=v.shevtsov@mt-integration.ru \
--cc=zaeem.mohamed@amd.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox