AMD-GFX Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH] Enable '-Werror' by default for all kernel builds
       [not found]       ` <CAHk-=wgZkQ+eZ02TaCpAWo_ffiLMwA2tYNHyL+B1dQ4YB0qfmA@mail.gmail.com>
@ 2021-09-07 17:33         ` Linus Torvalds
  2021-09-07 21:07           ` Harry Wentland
  0 siblings, 1 reply; 16+ messages in thread
From: Linus Torvalds @ 2021-09-07 17:33 UTC (permalink / raw)
  To: Arnd Bergmann, Harry Wentland, Leo Li, Alex Deucher,
	Christian König, Pan, Xinhui
  Cc: Nathan Chancellor, Guenter Roeck, Linux Kernel Mailing List, llvm,
	Nick Desaulniers, amd-gfx

[-- Attachment #1: Type: text/plain, Size: 2203 bytes --]

On Tue, Sep 7, 2021 at 10:10 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Do I know why? No. I do note that that code is disgusting.
>
> It's passing one of those structs around by value, for example. That's
> a 72-byte structure that is copied on the stack due to stupid calling
> conventions. Maybe clang generates a few extra temporaries for it as
> part of the function call stack setup? Who knows..

Ooh, yes.

This attached patch is crap - it converts the helper functions to use
const pointers instead of passing the whole structure, but it then
only converts that one file that *uses* them.

So the end result will not compile in general, but you can do

    make drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_rq_dlg_calc_30.o

and it compiles for me.

And while gcc doesn't care that much - it will apparently either
generate the argument stack every call - clang cares deeply.

The nasty 720-byte stack frame that clang generates turns into just a
320-byte one, and code generation in general looks a *lot* better.

Now, as mentioned, this patch is broken and incomplete. But I really
think the AMD GPU people need to do this. It makes those functions go
from practically unusable to not horribly disgusting.

So Harry/Leo/Alex/Christian and amd-gfx list - can you look into
making this ugly "make one file compile better" patch actually work
properly?

It *looks* lto me ike that code was perhaps written for a C++ compiler
and the helpers have been written as a "pass by reference", and the
arguments used to be

     const display_data_rq_misc_params_st& rq_misc_param

and then the compiler will pass the argument as a pointer. And then it
was converted to C, and the "pass by reference" in the function
declaration was turned into "pass by value", to avoid changing "." to
"->" in the use.

But a '&arg' thing in C++ really is a '*arg' pointer in C, and should
have been done as that.

Of course, it's also possible that that code was simply written by
somebody who didn't understand just *how* horrible it is to pass
structures bigger than a word or two by value.

Do we have a compiler warning for passing big structures by value?

                   Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 17439 bytes --]

 .../display/dc/dml/dcn30/display_rq_dlg_calc_30.c  | 142 ++++++++++-----------
 .../display/dc/dml/dcn30/display_rq_dlg_calc_30.h  |   2 +-
 .../amd/display/dc/dml/display_rq_dlg_helpers.h    |  20 +--
 3 files changed, 82 insertions(+), 82 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_rq_dlg_calc_30.c b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_rq_dlg_calc_30.c
index 0d934fae1c3a..9b5ee53d2de3 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_rq_dlg_calc_30.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_rq_dlg_calc_30.c
@@ -92,7 +92,7 @@ static void extract_rq_sizing_regs(struct display_mode_lib *mode_lib,
 	const display_data_rq_sizing_params_st rq_sizing)
 {
 	dml_print("DML_DLG: %s: rq_sizing param\n", __func__);
-	print__data_rq_sizing_params_st(mode_lib, rq_sizing);
+	print__data_rq_sizing_params_st(mode_lib, &rq_sizing);
 
 	rq_regs->chunk_size = dml_log2(rq_sizing.chunk_bytes) - 10;
 
@@ -113,28 +113,28 @@ static void extract_rq_sizing_regs(struct display_mode_lib *mode_lib,
 
 static void extract_rq_regs(struct display_mode_lib *mode_lib,
 	display_rq_regs_st *rq_regs,
-	const display_rq_params_st rq_param)
+	const display_rq_params_st *rq_param)
 {
 	unsigned int detile_buf_size_in_bytes = mode_lib->ip.det_buffer_size_kbytes * 1024;
 	unsigned int detile_buf_plane1_addr = 0;
 
-	extract_rq_sizing_regs(mode_lib, &(rq_regs->rq_regs_l), rq_param.sizing.rq_l);
+	extract_rq_sizing_regs(mode_lib, &(rq_regs->rq_regs_l), rq_param->sizing.rq_l);
 
-	rq_regs->rq_regs_l.pte_row_height_linear = dml_floor(dml_log2(rq_param.dlg.rq_l.dpte_row_height),
+	rq_regs->rq_regs_l.pte_row_height_linear = dml_floor(dml_log2(rq_param->dlg.rq_l.dpte_row_height),
 		1) - 3;
 
-	if (rq_param.yuv420) {
-		extract_rq_sizing_regs(mode_lib, &(rq_regs->rq_regs_c), rq_param.sizing.rq_c);
-		rq_regs->rq_regs_c.pte_row_height_linear = dml_floor(dml_log2(rq_param.dlg.rq_c.dpte_row_height),
+	if (rq_param->yuv420) {
+		extract_rq_sizing_regs(mode_lib, &(rq_regs->rq_regs_c), rq_param->sizing.rq_c);
+		rq_regs->rq_regs_c.pte_row_height_linear = dml_floor(dml_log2(rq_param->dlg.rq_c.dpte_row_height),
 			1) - 3;
 	}
 
-	rq_regs->rq_regs_l.swath_height = dml_log2(rq_param.dlg.rq_l.swath_height);
-	rq_regs->rq_regs_c.swath_height = dml_log2(rq_param.dlg.rq_c.swath_height);
+	rq_regs->rq_regs_l.swath_height = dml_log2(rq_param->dlg.rq_l.swath_height);
+	rq_regs->rq_regs_c.swath_height = dml_log2(rq_param->dlg.rq_c.swath_height);
 
 	// FIXME: take the max between luma, chroma chunk size?
 	// okay for now, as we are setting chunk_bytes to 8kb anyways
-	if (rq_param.sizing.rq_l.chunk_bytes >= 32 * 1024 || (rq_param.yuv420 && rq_param.sizing.rq_c.chunk_bytes >= 32 * 1024)) { //32kb
+	if (rq_param->sizing.rq_l.chunk_bytes >= 32 * 1024 || (rq_param->yuv420 && rq_param->sizing.rq_c.chunk_bytes >= 32 * 1024)) { //32kb
 		rq_regs->drq_expansion_mode = 0;
 	} else {
 		rq_regs->drq_expansion_mode = 2;
@@ -143,9 +143,9 @@ static void extract_rq_regs(struct display_mode_lib *mode_lib,
 	rq_regs->mrq_expansion_mode = 1;
 	rq_regs->crq_expansion_mode = 1;
 
-	if (rq_param.yuv420) {
-	if ((double)rq_param.misc.rq_l.stored_swath_bytes
-			/ (double)rq_param.misc.rq_c.stored_swath_bytes <= 1.5) {
+	if (rq_param->yuv420) {
+	if ((double)rq_param->misc.rq_l.stored_swath_bytes
+			/ (double)rq_param->misc.rq_c.stored_swath_bytes <= 1.5) {
 			detile_buf_plane1_addr = (detile_buf_size_in_bytes / 2.0 / 64.0); // half to chroma
 		} else {
 			detile_buf_plane1_addr = dml_round_to_multiple((unsigned int)((2.0 * detile_buf_size_in_bytes) / 3.0),
@@ -747,7 +747,7 @@ static void get_surf_rq_param(struct display_mode_lib *mode_lib,
 	display_data_rq_sizing_params_st *rq_sizing_param,
 	display_data_rq_dlg_params_st *rq_dlg_param,
 	display_data_rq_misc_params_st *rq_misc_param,
-	const display_pipe_params_st pipe_param,
+	const display_pipe_params_st *pipe_param,
 	bool is_chroma,
 	bool is_alpha)
 {
@@ -761,32 +761,32 @@ static void get_surf_rq_param(struct display_mode_lib *mode_lib,
 
 	// FIXME check if ppe apply for both luma and chroma in 422 case
 	if (is_chroma | is_alpha) {
-		vp_width = pipe_param.src.viewport_width_c / ppe;
-		vp_height = pipe_param.src.viewport_height_c;
-		data_pitch = pipe_param.src.data_pitch_c;
-		meta_pitch = pipe_param.src.meta_pitch_c;
-		surface_height = pipe_param.src.surface_height_y / 2.0;
+		vp_width = pipe_param->src.viewport_width_c / ppe;
+		vp_height = pipe_param->src.viewport_height_c;
+		data_pitch = pipe_param->src.data_pitch_c;
+		meta_pitch = pipe_param->src.meta_pitch_c;
+		surface_height = pipe_param->src.surface_height_y / 2.0;
 	} else {
-		vp_width = pipe_param.src.viewport_width / ppe;
-		vp_height = pipe_param.src.viewport_height;
-		data_pitch = pipe_param.src.data_pitch;
-		meta_pitch = pipe_param.src.meta_pitch;
-		surface_height = pipe_param.src.surface_height_y;
+		vp_width = pipe_param->src.viewport_width / ppe;
+		vp_height = pipe_param->src.viewport_height;
+		data_pitch = pipe_param->src.data_pitch;
+		meta_pitch = pipe_param->src.meta_pitch;
+		surface_height = pipe_param->src.surface_height_y;
 	}
 
-	if (pipe_param.dest.odm_combine) {
+	if (pipe_param->dest.odm_combine) {
 		unsigned int access_dir = 0;
 		unsigned int full_src_vp_width = 0;
 		unsigned int hactive_odm = 0;
 		unsigned int src_hactive_odm = 0;
-		access_dir = (pipe_param.src.source_scan == dm_vert); // vp access direction: horizontal or vertical accessed
-		hactive_odm  = pipe_param.dest.hactive / ((unsigned int)pipe_param.dest.odm_combine*2);
+		access_dir = (pipe_param->src.source_scan == dm_vert); // vp access direction: horizontal or vertical accessed
+		hactive_odm  = pipe_param->dest.hactive / ((unsigned int)pipe_param->dest.odm_combine*2);
 		if (is_chroma) {
-			full_src_vp_width = pipe_param.scale_ratio_depth.hscl_ratio_c * pipe_param.dest.full_recout_width;
-			src_hactive_odm  = pipe_param.scale_ratio_depth.hscl_ratio_c * hactive_odm;
+			full_src_vp_width = pipe_param->scale_ratio_depth.hscl_ratio_c * pipe_param->dest.full_recout_width;
+			src_hactive_odm  = pipe_param->scale_ratio_depth.hscl_ratio_c * hactive_odm;
 		} else {
-			full_src_vp_width = pipe_param.scale_ratio_depth.hscl_ratio * pipe_param.dest.full_recout_width;
-			src_hactive_odm  = pipe_param.scale_ratio_depth.hscl_ratio * hactive_odm;
+			full_src_vp_width = pipe_param->scale_ratio_depth.hscl_ratio * pipe_param->dest.full_recout_width;
+			src_hactive_odm  = pipe_param->scale_ratio_depth.hscl_ratio * hactive_odm;
 		}
 
 		if (access_dir == 0) {
@@ -815,7 +815,7 @@ static void get_surf_rq_param(struct display_mode_lib *mode_lib,
 	rq_sizing_param->meta_chunk_bytes = 2048;
 	rq_sizing_param->min_meta_chunk_bytes = 256;
 
-	if (pipe_param.src.hostvm)
+	if (pipe_param->src.hostvm)
 		rq_sizing_param->mpte_group_bytes = 512;
 	else
 		rq_sizing_param->mpte_group_bytes = 2048;
@@ -828,28 +828,28 @@ static void get_surf_rq_param(struct display_mode_lib *mode_lib,
 		vp_height,
 		data_pitch,
 		meta_pitch,
-		pipe_param.src.source_format,
-		pipe_param.src.sw_mode,
-		pipe_param.src.macro_tile_size,
-		pipe_param.src.source_scan,
-		pipe_param.src.hostvm,
+		pipe_param->src.source_format,
+		pipe_param->src.sw_mode,
+		pipe_param->src.macro_tile_size,
+		pipe_param->src.source_scan,
+		pipe_param->src.hostvm,
 		is_chroma,
 		surface_height);
 }
 
 static void dml_rq_dlg_get_rq_params(struct display_mode_lib *mode_lib,
 	display_rq_params_st *rq_param,
-	const display_pipe_params_st pipe_param)
+	const display_pipe_params_st *pipe_param)
 {
 	// get param for luma surface
-	rq_param->yuv420 = pipe_param.src.source_format == dm_420_8
-	|| pipe_param.src.source_format == dm_420_10
-	|| pipe_param.src.source_format == dm_rgbe_alpha
-	|| pipe_param.src.source_format == dm_420_12;
+	rq_param->yuv420 = pipe_param->src.source_format == dm_420_8
+	|| pipe_param->src.source_format == dm_420_10
+	|| pipe_param->src.source_format == dm_rgbe_alpha
+	|| pipe_param->src.source_format == dm_420_12;
 
-	rq_param->yuv420_10bpc = pipe_param.src.source_format == dm_420_10;
+	rq_param->yuv420_10bpc = pipe_param->src.source_format == dm_420_10;
 
-	rq_param->rgbe_alpha = (pipe_param.src.source_format == dm_rgbe_alpha)?1:0;
+	rq_param->rgbe_alpha = (pipe_param->src.source_format == dm_rgbe_alpha)?1:0;
 
 	get_surf_rq_param(mode_lib,
 		&(rq_param->sizing.rq_l),
@@ -859,7 +859,7 @@ static void dml_rq_dlg_get_rq_params(struct display_mode_lib *mode_lib,
 		0,
 		0);
 
-	if (is_dual_plane((enum source_format_class)(pipe_param.src.source_format))) {
+	if (is_dual_plane((enum source_format_class)(pipe_param->src.source_format))) {
 		// get param for chroma surface
 		get_surf_rq_param(mode_lib,
 			&(rq_param->sizing.rq_c),
@@ -871,21 +871,21 @@ static void dml_rq_dlg_get_rq_params(struct display_mode_lib *mode_lib,
 	}
 
 	// calculate how to split the det buffer space between luma and chroma
-	handle_det_buf_split(mode_lib, rq_param, pipe_param.src);
-	print__rq_params_st(mode_lib, *rq_param);
+	handle_det_buf_split(mode_lib, rq_param, pipe_param->src);
+	print__rq_params_st(mode_lib, rq_param);
 }
 
 void dml30_rq_dlg_get_rq_reg(struct display_mode_lib *mode_lib,
 	display_rq_regs_st *rq_regs,
-	const display_pipe_params_st pipe_param)
+	const display_pipe_params_st *pipe_param)
 {
 	display_rq_params_st rq_param = { 0 };
 
 	memset(rq_regs, 0, sizeof(*rq_regs));
 	dml_rq_dlg_get_rq_params(mode_lib, &rq_param, pipe_param);
-	extract_rq_regs(mode_lib, rq_regs, rq_param);
+	extract_rq_regs(mode_lib, rq_regs, &rq_param);
 
-	print__rq_regs_st(mode_lib, *rq_regs);
+	print__rq_regs_st(mode_lib, rq_regs);
 }
 
 static void calculate_ttu_cursor(struct display_mode_lib *mode_lib,
@@ -984,8 +984,8 @@ static void dml_rq_dlg_get_dlg_params(struct display_mode_lib *mode_lib,
 	const unsigned int pipe_idx,
 	display_dlg_regs_st *disp_dlg_regs,
 	display_ttu_regs_st *disp_ttu_regs,
-	const display_rq_dlg_params_st rq_dlg_param,
-	const display_dlg_sys_params_st dlg_sys_param,
+	const display_rq_dlg_params_st *rq_dlg_param,
+	const display_dlg_sys_params_st *dlg_sys_param,
 	const bool cstate_en,
 	const bool pstate_en,
 	const bool vm_en,
@@ -1137,7 +1137,7 @@ static void dml_rq_dlg_get_dlg_params(struct display_mode_lib *mode_lib,
 		* dml_pow(2, 8));
 	disp_dlg_regs->dlg_vblank_end = interlaced ? (vblank_end / 2) : vblank_end; // 15 bits
 
-	min_dcfclk_mhz = dlg_sys_param.deepsleep_dcfclk_mhz;
+	min_dcfclk_mhz = dlg_sys_param->deepsleep_dcfclk_mhz;
 	t_calc_us = get_tcalc(mode_lib, e2e_pipe_param, num_pipes);
 	min_ttu_vblank = get_min_ttu_vblank(mode_lib, e2e_pipe_param, num_pipes, pipe_idx);
 
@@ -1191,13 +1191,13 @@ static void dml_rq_dlg_get_dlg_params(struct display_mode_lib *mode_lib,
 	scl_enable = scl->scl_enable;
 
 	line_time_in_us = (htotal / pclk_freq_in_mhz);
-	swath_width_ub_l = rq_dlg_param.rq_l.swath_width_ub;
-	dpte_groups_per_row_ub_l = rq_dlg_param.rq_l.dpte_groups_per_row_ub;
-	swath_width_ub_c = rq_dlg_param.rq_c.swath_width_ub;
-	dpte_groups_per_row_ub_c = rq_dlg_param.rq_c.dpte_groups_per_row_ub;
+	swath_width_ub_l = rq_dlg_param->rq_l.swath_width_ub;
+	dpte_groups_per_row_ub_l = rq_dlg_param->rq_l.dpte_groups_per_row_ub;
+	swath_width_ub_c = rq_dlg_param->rq_c.swath_width_ub;
+	dpte_groups_per_row_ub_c = rq_dlg_param->rq_c.dpte_groups_per_row_ub;
 
-	meta_chunks_per_row_ub_l = rq_dlg_param.rq_l.meta_chunks_per_row_ub;
-	meta_chunks_per_row_ub_c = rq_dlg_param.rq_c.meta_chunks_per_row_ub;
+	meta_chunks_per_row_ub_l = rq_dlg_param->rq_l.meta_chunks_per_row_ub;
+	meta_chunks_per_row_ub_c = rq_dlg_param->rq_c.meta_chunks_per_row_ub;
 	vupdate_offset = dst->vupdate_offset;
 	vupdate_width = dst->vupdate_width;
 	vready_offset = dst->vready_offset;
@@ -1379,16 +1379,16 @@ static void dml_rq_dlg_get_dlg_params(struct display_mode_lib *mode_lib,
 	dml_print("DML_DLG: %s: vratio_pre_c=%3.2f\n", __func__, vratio_pre_c);
 
 	// Active
-	req_per_swath_ub_l = rq_dlg_param.rq_l.req_per_swath_ub;
-	req_per_swath_ub_c = rq_dlg_param.rq_c.req_per_swath_ub;
-	meta_row_height_l = rq_dlg_param.rq_l.meta_row_height;
-	meta_row_height_c = rq_dlg_param.rq_c.meta_row_height;
+	req_per_swath_ub_l = rq_dlg_param->rq_l.req_per_swath_ub;
+	req_per_swath_ub_c = rq_dlg_param->rq_c.req_per_swath_ub;
+	meta_row_height_l = rq_dlg_param->rq_l.meta_row_height;
+	meta_row_height_c = rq_dlg_param->rq_c.meta_row_height;
 	swath_width_pixels_ub_l = 0;
 	swath_width_pixels_ub_c = 0;
 	scaler_rec_in_width_l = 0;
 	scaler_rec_in_width_c = 0;
-	dpte_row_height_l = rq_dlg_param.rq_l.dpte_row_height;
-	dpte_row_height_c = rq_dlg_param.rq_c.dpte_row_height;
+	dpte_row_height_l = rq_dlg_param->rq_l.dpte_row_height;
+	dpte_row_height_c = rq_dlg_param->rq_c.dpte_row_height;
 
 	if (mode_422) {
 		swath_width_pixels_ub_l = swath_width_ub_l * 2;  // *2 for 2 pixel per element
@@ -1824,8 +1824,8 @@ static void dml_rq_dlg_get_dlg_params(struct display_mode_lib *mode_lib,
 	disp_ttu_regs->min_ttu_vblank = min_ttu_vblank * refclk_freq_in_mhz;
 	ASSERT(disp_ttu_regs->min_ttu_vblank < dml_pow(2, 24));
 
-	print__ttu_regs_st(mode_lib, *disp_ttu_regs);
-	print__dlg_regs_st(mode_lib, *disp_dlg_regs);
+	print__ttu_regs_st(mode_lib, disp_ttu_regs);
+	print__dlg_regs_st(mode_lib, disp_dlg_regs);
 }
 
 void dml30_rq_dlg_get_dlg_reg(struct display_mode_lib *mode_lib,
@@ -1861,20 +1861,20 @@ void dml30_rq_dlg_get_dlg_reg(struct display_mode_lib *mode_lib,
 	dlg_sys_param.t_srx_delay_us = mode_lib->ip.dcfclk_cstate_latency
 		/ dlg_sys_param.deepsleep_dcfclk_mhz; // TODO: Deprecated
 
-	print__dlg_sys_params_st(mode_lib, dlg_sys_param);
+	print__dlg_sys_params_st(mode_lib, &dlg_sys_param);
 
 	// system parameter calculation done
 
 	dml_print("DML_DLG: Calculation for pipe[%d] start\n\n", pipe_idx);
-	dml_rq_dlg_get_rq_params(mode_lib, &rq_param, e2e_pipe_param[pipe_idx].pipe);
+	dml_rq_dlg_get_rq_params(mode_lib, &rq_param, &e2e_pipe_param[pipe_idx].pipe);
 	dml_rq_dlg_get_dlg_params(mode_lib,
 		e2e_pipe_param,
 		num_pipes,
 		pipe_idx,
 		dlg_regs,
 		ttu_regs,
-		rq_param.dlg,
-		dlg_sys_param,
+		&rq_param.dlg,
+		&dlg_sys_param,
 		cstate_en,
 		pstate_en,
 		vm_en,
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_rq_dlg_calc_30.h b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_rq_dlg_calc_30.h
index c04965cceff3..b40abc41828a 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_rq_dlg_calc_30.h
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_rq_dlg_calc_30.h
@@ -41,7 +41,7 @@ struct display_mode_lib;
 //            See also: <display_rq_regs_st>
 void dml30_rq_dlg_get_rq_reg(struct display_mode_lib *mode_lib,
 		display_rq_regs_st *rq_regs,
-		const display_pipe_params_st pipe_param);
+		const display_pipe_params_st *pipe_param);
 
 // Function: dml_rq_dlg_get_dlg_reg
 //   Calculate and return DLG and TTU register struct given the system setting
diff --git a/drivers/gpu/drm/amd/display/dc/dml/display_rq_dlg_helpers.h b/drivers/gpu/drm/amd/display/dc/dml/display_rq_dlg_helpers.h
index 2555ef0358c2..fb61a0b1470f 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/display_rq_dlg_helpers.h
+++ b/drivers/gpu/drm/amd/display/dc/dml/display_rq_dlg_helpers.h
@@ -31,16 +31,16 @@
 /* Function: Printer functions
  *  Print various struct
  */
-void print__rq_params_st(struct display_mode_lib *mode_lib, display_rq_params_st rq_param);
-void print__data_rq_sizing_params_st(struct display_mode_lib *mode_lib, display_data_rq_sizing_params_st rq_sizing);
-void print__data_rq_dlg_params_st(struct display_mode_lib *mode_lib, display_data_rq_dlg_params_st rq_dlg_param);
-void print__data_rq_misc_params_st(struct display_mode_lib *mode_lib, display_data_rq_misc_params_st rq_misc_param);
-void print__rq_dlg_params_st(struct display_mode_lib *mode_lib, display_rq_dlg_params_st rq_dlg_param);
-void print__dlg_sys_params_st(struct display_mode_lib *mode_lib, display_dlg_sys_params_st dlg_sys_param);
+void print__rq_params_st(struct display_mode_lib *mode_lib, const display_rq_params_st *rq_param);
+void print__data_rq_sizing_params_st(struct display_mode_lib *mode_lib, const display_data_rq_sizing_params_st *rq_sizing);
+void print__data_rq_dlg_params_st(struct display_mode_lib *mode_lib, const display_data_rq_dlg_params_st *rq_dlg_param);
+void print__data_rq_misc_params_st(struct display_mode_lib *mode_lib, const display_data_rq_misc_params_st *rq_misc_param);
+void print__rq_dlg_params_st(struct display_mode_lib *mode_lib, const display_rq_dlg_params_st *rq_dlg_param);
+void print__dlg_sys_params_st(struct display_mode_lib *mode_lib, const display_dlg_sys_params_st *dlg_sys_param);
 
-void print__data_rq_regs_st(struct display_mode_lib *mode_lib, display_data_rq_regs_st data_rq_regs);
-void print__rq_regs_st(struct display_mode_lib *mode_lib, display_rq_regs_st rq_regs);
-void print__dlg_regs_st(struct display_mode_lib *mode_lib, display_dlg_regs_st dlg_regs);
-void print__ttu_regs_st(struct display_mode_lib *mode_lib, display_ttu_regs_st ttu_regs);
+void print__data_rq_regs_st(struct display_mode_lib *mode_lib, const display_data_rq_regs_st *data_rq_regs);
+void print__rq_regs_st(struct display_mode_lib *mode_lib, const display_rq_regs_st *rq_regs);
+void print__dlg_regs_st(struct display_mode_lib *mode_lib, const display_dlg_regs_st *dlg_regs);
+void print__ttu_regs_st(struct display_mode_lib *mode_lib, const display_ttu_regs_st *ttu_regs);
 
 #endif

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] Enable '-Werror' by default for all kernel builds
  2021-09-07 17:33         ` [PATCH] Enable '-Werror' by default for all kernel builds Linus Torvalds
@ 2021-09-07 21:07           ` Harry Wentland
  2021-09-08  3:52             ` Harry Wentland
  0 siblings, 1 reply; 16+ messages in thread
From: Harry Wentland @ 2021-09-07 21:07 UTC (permalink / raw)
  To: Linus Torvalds, Arnd Bergmann, Leo Li, Alex Deucher,
	Christian König, Pan, Xinhui
  Cc: Nathan Chancellor, Guenter Roeck, Linux Kernel Mailing List, llvm,
	Nick Desaulniers, amd-gfx



On 2021-09-07 1:33 p.m., Linus Torvalds wrote:
> On Tue, Sep 7, 2021 at 10:10 AM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> Do I know why? No. I do note that that code is disgusting.
>>
>> It's passing one of those structs around by value, for example. That's
>> a 72-byte structure that is copied on the stack due to stupid calling
>> conventions. Maybe clang generates a few extra temporaries for it as
>> part of the function call stack setup? Who knows..
> 
> Ooh, yes.
> 
> This attached patch is crap - it converts the helper functions to use
> const pointers instead of passing the whole structure, but it then
> only converts that one file that *uses* them.
> 
> So the end result will not compile in general, but you can do
> 
>     make drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_rq_dlg_calc_30.o
> 
> and it compiles for me.
> 
> And while gcc doesn't care that much - it will apparently either
> generate the argument stack every call - clang cares deeply.
> 
> The nasty 720-byte stack frame that clang generates turns into just a
> 320-byte one, and code generation in general looks a *lot* better.
> 
> Now, as mentioned, this patch is broken and incomplete. But I really
> think the AMD GPU people need to do this. It makes those functions go
> from practically unusable to not horribly disgusting.
> 
> So Harry/Leo/Alex/Christian and amd-gfx list - can you look into
> making this ugly "make one file compile better" patch actually work
> properly?
> 

Yes, will take a look at this tonight. We definitely shouldn't be passing
large structs by value.

Harry

> It *looks* lto me ike that code was perhaps written for a C++ compiler
> and the helpers have been written as a "pass by reference", and the
> arguments used to be
> 
>      const display_data_rq_misc_params_st& rq_misc_param
> 
> and then the compiler will pass the argument as a pointer. And then it
> was converted to C, and the "pass by reference" in the function
> declaration was turned into "pass by value", to avoid changing "." to
> "->" in the use.
> 
> But a '&arg' thing in C++ really is a '*arg' pointer in C, and should
> have been done as that.
> 
> Of course, it's also possible that that code was simply written by
> somebody who didn't understand just *how* horrible it is to pass
> structures bigger than a word or two by value.
> 
> Do we have a compiler warning for passing big structures by value?
> 
>                    Linus
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Enable '-Werror' by default for all kernel builds
  2021-09-07 21:07           ` Harry Wentland
@ 2021-09-08  3:52             ` Harry Wentland
  2021-09-08  4:41               ` Linus Torvalds
  0 siblings, 1 reply; 16+ messages in thread
From: Harry Wentland @ 2021-09-08  3:52 UTC (permalink / raw)
  To: Linus Torvalds, Arnd Bergmann, Leo Li, Alex Deucher,
	Christian König, Pan, Xinhui
  Cc: Nathan Chancellor, Guenter Roeck, Linux Kernel Mailing List, llvm,
	Nick Desaulniers, amd-gfx

[-- Attachment #1: Type: text/plain, Size: 3680 bytes --]



On 2021-09-07 5:07 p.m., Harry Wentland wrote:
> 
> 
> On 2021-09-07 1:33 p.m., Linus Torvalds wrote:
>> On Tue, Sep 7, 2021 at 10:10 AM Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>>>
>>> Do I know why? No. I do note that that code is disgusting.
>>>
>>> It's passing one of those structs around by value, for example. That's
>>> a 72-byte structure that is copied on the stack due to stupid calling
>>> conventions. Maybe clang generates a few extra temporaries for it as
>>> part of the function call stack setup? Who knows..
>>
>> Ooh, yes.
>>
>> This attached patch is crap - it converts the helper functions to use
>> const pointers instead of passing the whole structure, but it then
>> only converts that one file that *uses* them.
>>
>> So the end result will not compile in general, but you can do
>>
>>     make drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_rq_dlg_calc_30.o
>>
>> and it compiles for me.
>>
>> And while gcc doesn't care that much - it will apparently either
>> generate the argument stack every call - clang cares deeply.
>>
>> The nasty 720-byte stack frame that clang generates turns into just a
>> 320-byte one, and code generation in general looks a *lot* better.
>>
>> Now, as mentioned, this patch is broken and incomplete. But I really
>> think the AMD GPU people need to do this. It makes those functions go
>> from practically unusable to not horribly disgusting.
>>
>> So Harry/Leo/Alex/Christian and amd-gfx list - can you look into
>> making this ugly "make one file compile better" patch actually work
>> properly?
>>
> 
> Yes, will take a look at this tonight. We definitely shouldn't be passing
> large structs by value.
> 

Attached patches fix these x86_64 ones reported by Nick:

x86_64-alpine.log:drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:452:13: error: stack frame size (1800) exceeds limit (1280) in function 'dcn_bw_calc_rq_dlg_ttu' [-Werror,-Wframe-larger-than]
x86_64-alpine.log:drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn21/display_rq_dlg_calc_21.c:1657:6: error: stack frame size (1336) exceeds limit (1280) in function 'dml21_rq_dlg_get_dlg_reg' [-Werror,-Wframe-larger-than]
x86_64-alpine.log:drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_rq_dlg_calc_30.c:1831:6: error: stack frame size (1352) exceeds limit (1280) in function 'dml30_rq_dlg_get_dlg_reg' [-Werror,-Wframe-larger-than]

I'm also seeing one more that might be more challenging to fix but is nearly at 1024:

drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn21/display_mode_vba_21.c:3397:6: error: stack frame size of 1064 bytes in function 'dml21_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than=]

The attached patches build and boot without error or warning on a Radeon RX 5500 XT.

Harry

> Harry
> 
>> It *looks* lto me ike that code was perhaps written for a C++ compiler
>> and the helpers have been written as a "pass by reference", and the
>> arguments used to be
>>
>>      const display_data_rq_misc_params_st& rq_misc_param
>>
>> and then the compiler will pass the argument as a pointer. And then it
>> was converted to C, and the "pass by reference" in the function
>> declaration was turned into "pass by value", to avoid changing "." to
>> "->" in the use.
>>
>> But a '&arg' thing in C++ really is a '*arg' pointer in C, and should
>> have been done as that.
>>
>> Of course, it's also possible that that code was simply written by
>> somebody who didn't understand just *how* horrible it is to pass
>> structures bigger than a word or two by value.
>>
>> Do we have a compiler warning for passing big structures by value?
>>
>>                    Linus
>>
> 


[-- Attachment #2: 0001-drm-amd-display-Pass-display_pipe_params_st-as-const.patch --]
[-- Type: text/x-patch, Size: 30495 bytes --]

From 353a23b14584705cb194d9bfc91010caacaf42f8 Mon Sep 17 00:00:00 2001
From: Harry Wentland <harry.wentland@amd.com>
Date: Tue, 7 Sep 2021 19:40:06 -0400
Subject: [PATCH 1/2] drm/amd/display: Pass display_pipe_params_st as const in
 DML

[Why]
This neither needs to be on the stack nor passed by value
to each function call. In fact, when building with clang
it seems to break the Linux's default 1024 byte stack
frame limit.

[How]
We can simply pass this as a const pointer.

Signed-off-by: Harry Wentland <harry.wentland@amd.com>
---
 .../drm/amd/display/dc/dcn20/dcn20_resource.c |  2 +-
 .../dc/dml/dcn20/display_rq_dlg_calc_20.c     |  6 +-
 .../dc/dml/dcn20/display_rq_dlg_calc_20.h     |  4 +-
 .../dc/dml/dcn20/display_rq_dlg_calc_20v2.c   |  6 +-
 .../dc/dml/dcn20/display_rq_dlg_calc_20v2.h   |  4 +-
 .../dc/dml/dcn21/display_rq_dlg_calc_21.c     | 62 ++++++++--------
 .../dc/dml/dcn21/display_rq_dlg_calc_21.h     |  4 +-
 .../dc/dml/dcn30/display_rq_dlg_calc_30.c     | 72 +++++++++----------
 .../dc/dml/dcn30/display_rq_dlg_calc_30.h     |  4 +-
 .../dc/dml/dcn31/display_rq_dlg_calc_31.c     | 68 +++++++++---------
 .../dc/dml/dcn31/display_rq_dlg_calc_31.h     |  4 +-
 .../drm/amd/display/dc/dml/display_mode_lib.h |  4 +-
 12 files changed, 120 insertions(+), 120 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
index e3e01b17c164..4389b36f0760 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_resource.c
@@ -3152,7 +3152,7 @@ void dcn20_calculate_dlg_params(
 
 		context->bw_ctx.dml.funcs.rq_dlg_get_rq_reg(&context->bw_ctx.dml,
 				&context->res_ctx.pipe_ctx[i].rq_regs,
-				pipes[pipe_idx].pipe);
+				&pipes[pipe_idx].pipe);
 		pipe_idx++;
 	}
 }
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20.c b/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20.c
index 2091dd8c252d..8c168f348a27 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20.c
@@ -768,12 +768,12 @@ static void dml20_rq_dlg_get_rq_params(struct display_mode_lib *mode_lib,
 
 void dml20_rq_dlg_get_rq_reg(struct display_mode_lib *mode_lib,
 		display_rq_regs_st *rq_regs,
-		const display_pipe_params_st pipe_param)
+		const display_pipe_params_st *pipe_param)
 {
 	display_rq_params_st rq_param = {0};
 
 	memset(rq_regs, 0, sizeof(*rq_regs));
-	dml20_rq_dlg_get_rq_params(mode_lib, &rq_param, pipe_param.src);
+	dml20_rq_dlg_get_rq_params(mode_lib, &rq_param, pipe_param->src);
 	extract_rq_regs(mode_lib, rq_regs, rq_param);
 
 	print__rq_regs_st(mode_lib, *rq_regs);
@@ -1549,7 +1549,7 @@ static void dml20_rq_dlg_get_dlg_params(struct display_mode_lib *mode_lib,
 void dml20_rq_dlg_get_dlg_reg(struct display_mode_lib *mode_lib,
 		display_dlg_regs_st *dlg_regs,
 		display_ttu_regs_st *ttu_regs,
-		display_e2e_pipe_params_st *e2e_pipe_param,
+		const display_e2e_pipe_params_st *e2e_pipe_param,
 		const unsigned int num_pipes,
 		const unsigned int pipe_idx,
 		const bool cstate_en,
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20.h b/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20.h
index d0b90947f540..8b23867e97c1 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20.h
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20.h
@@ -43,7 +43,7 @@ struct display_mode_lib;
 void dml20_rq_dlg_get_rq_reg(
 		struct display_mode_lib *mode_lib,
 		display_rq_regs_st *rq_regs,
-		const display_pipe_params_st pipe_param);
+		const display_pipe_params_st *pipe_param);
 
 
 // Function: dml_rq_dlg_get_dlg_reg
@@ -61,7 +61,7 @@ void dml20_rq_dlg_get_dlg_reg(
 		struct display_mode_lib *mode_lib,
 		display_dlg_regs_st *dlg_regs,
 		display_ttu_regs_st *ttu_regs,
-		display_e2e_pipe_params_st *e2e_pipe_param,
+		const display_e2e_pipe_params_st *e2e_pipe_param,
 		const unsigned int num_pipes,
 		const unsigned int pipe_idx,
 		const bool cstate_en,
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20v2.c b/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20v2.c
index 1a0c14e465fa..26ececfd40cd 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20v2.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20v2.c
@@ -768,12 +768,12 @@ static void dml20v2_rq_dlg_get_rq_params(struct display_mode_lib *mode_lib,
 
 void dml20v2_rq_dlg_get_rq_reg(struct display_mode_lib *mode_lib,
 		display_rq_regs_st *rq_regs,
-		const display_pipe_params_st pipe_param)
+		const display_pipe_params_st *pipe_param)
 {
 	display_rq_params_st rq_param = {0};
 
 	memset(rq_regs, 0, sizeof(*rq_regs));
-	dml20v2_rq_dlg_get_rq_params(mode_lib, &rq_param, pipe_param.src);
+	dml20v2_rq_dlg_get_rq_params(mode_lib, &rq_param, pipe_param->src);
 	extract_rq_regs(mode_lib, rq_regs, rq_param);
 
 	print__rq_regs_st(mode_lib, *rq_regs);
@@ -1550,7 +1550,7 @@ static void dml20v2_rq_dlg_get_dlg_params(struct display_mode_lib *mode_lib,
 void dml20v2_rq_dlg_get_dlg_reg(struct display_mode_lib *mode_lib,
 		display_dlg_regs_st *dlg_regs,
 		display_ttu_regs_st *ttu_regs,
-		display_e2e_pipe_params_st *e2e_pipe_param,
+		const display_e2e_pipe_params_st *e2e_pipe_param,
 		const unsigned int num_pipes,
 		const unsigned int pipe_idx,
 		const bool cstate_en,
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20v2.h b/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20v2.h
index 27cf8bed9376..2b4e46ea1c3d 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20v2.h
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_rq_dlg_calc_20v2.h
@@ -43,7 +43,7 @@ struct display_mode_lib;
 void dml20v2_rq_dlg_get_rq_reg(
 		struct display_mode_lib *mode_lib,
 		display_rq_regs_st *rq_regs,
-		const display_pipe_params_st pipe_param);
+		const display_pipe_params_st *pipe_param);
 
 
 // Function: dml_rq_dlg_get_dlg_reg
@@ -61,7 +61,7 @@ void dml20v2_rq_dlg_get_dlg_reg(
 		struct display_mode_lib *mode_lib,
 		display_dlg_regs_st *dlg_regs,
 		display_ttu_regs_st *ttu_regs,
-		display_e2e_pipe_params_st *e2e_pipe_param,
+		const display_e2e_pipe_params_st *e2e_pipe_param,
 		const unsigned int num_pipes,
 		const unsigned int pipe_idx,
 		const bool cstate_en,
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn21/display_rq_dlg_calc_21.c b/drivers/gpu/drm/amd/display/dc/dml/dcn21/display_rq_dlg_calc_21.c
index 287e31052b30..736978c4d40a 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn21/display_rq_dlg_calc_21.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn21/display_rq_dlg_calc_21.c
@@ -694,7 +694,7 @@ static void get_surf_rq_param(
 		display_data_rq_sizing_params_st *rq_sizing_param,
 		display_data_rq_dlg_params_st *rq_dlg_param,
 		display_data_rq_misc_params_st *rq_misc_param,
-		const display_pipe_params_st pipe_param,
+		const display_pipe_params_st *pipe_param,
 		bool is_chroma)
 {
 	bool mode_422 = false;
@@ -706,30 +706,30 @@ static void get_surf_rq_param(
 
 	// FIXME check if ppe apply for both luma and chroma in 422 case
 	if (is_chroma) {
-		vp_width = pipe_param.src.viewport_width_c / ppe;
-		vp_height = pipe_param.src.viewport_height_c;
-		data_pitch = pipe_param.src.data_pitch_c;
-		meta_pitch = pipe_param.src.meta_pitch_c;
+		vp_width = pipe_param->src.viewport_width_c / ppe;
+		vp_height = pipe_param->src.viewport_height_c;
+		data_pitch = pipe_param->src.data_pitch_c;
+		meta_pitch = pipe_param->src.meta_pitch_c;
 	} else {
-		vp_width = pipe_param.src.viewport_width / ppe;
-		vp_height = pipe_param.src.viewport_height;
-		data_pitch = pipe_param.src.data_pitch;
-		meta_pitch = pipe_param.src.meta_pitch;
+		vp_width = pipe_param->src.viewport_width / ppe;
+		vp_height = pipe_param->src.viewport_height;
+		data_pitch = pipe_param->src.data_pitch;
+		meta_pitch = pipe_param->src.meta_pitch;
 	}
 
-	if (pipe_param.dest.odm_combine) {
+	if (pipe_param->dest.odm_combine) {
 		unsigned int access_dir;
 		unsigned int full_src_vp_width;
 		unsigned int hactive_half;
 		unsigned int src_hactive_half;
-		access_dir = (pipe_param.src.source_scan == dm_vert); // vp access direction: horizontal or vertical accessed
-		hactive_half  = pipe_param.dest.hactive / 2;
+		access_dir = (pipe_param->src.source_scan == dm_vert); // vp access direction: horizontal or vertical accessed
+		hactive_half  = pipe_param->dest.hactive / 2;
 		if (is_chroma) {
-			full_src_vp_width = pipe_param.scale_ratio_depth.hscl_ratio_c * pipe_param.dest.full_recout_width;
-			src_hactive_half  = pipe_param.scale_ratio_depth.hscl_ratio_c * hactive_half;
+			full_src_vp_width = pipe_param->scale_ratio_depth.hscl_ratio_c * pipe_param->dest.full_recout_width;
+			src_hactive_half  = pipe_param->scale_ratio_depth.hscl_ratio_c * hactive_half;
 		} else {
-			full_src_vp_width = pipe_param.scale_ratio_depth.hscl_ratio * pipe_param.dest.full_recout_width;
-			src_hactive_half  = pipe_param.scale_ratio_depth.hscl_ratio * hactive_half;
+			full_src_vp_width = pipe_param->scale_ratio_depth.hscl_ratio * pipe_param->dest.full_recout_width;
+			src_hactive_half  = pipe_param->scale_ratio_depth.hscl_ratio * hactive_half;
 		}
 
 		if (access_dir == 0) {
@@ -754,7 +754,7 @@ static void get_surf_rq_param(
 	rq_sizing_param->meta_chunk_bytes = 2048;
 	rq_sizing_param->min_meta_chunk_bytes = 256;
 
-	if (pipe_param.src.hostvm)
+	if (pipe_param->src.hostvm)
 		rq_sizing_param->mpte_group_bytes = 512;
 	else
 		rq_sizing_param->mpte_group_bytes = 2048;
@@ -768,23 +768,23 @@ static void get_surf_rq_param(
 			vp_height,
 			data_pitch,
 			meta_pitch,
-			pipe_param.src.source_format,
-			pipe_param.src.sw_mode,
-			pipe_param.src.macro_tile_size,
-			pipe_param.src.source_scan,
-			pipe_param.src.hostvm,
+			pipe_param->src.source_format,
+			pipe_param->src.sw_mode,
+			pipe_param->src.macro_tile_size,
+			pipe_param->src.source_scan,
+			pipe_param->src.hostvm,
 			is_chroma);
 }
 
 static void dml_rq_dlg_get_rq_params(
 		struct display_mode_lib *mode_lib,
 		display_rq_params_st *rq_param,
-		const display_pipe_params_st pipe_param)
+		const display_pipe_params_st *pipe_param)
 {
 	// get param for luma surface
-	rq_param->yuv420 = pipe_param.src.source_format == dm_420_8
-			|| pipe_param.src.source_format == dm_420_10;
-	rq_param->yuv420_10bpc = pipe_param.src.source_format == dm_420_10;
+	rq_param->yuv420 = pipe_param->src.source_format == dm_420_8
+			|| pipe_param->src.source_format == dm_420_10;
+	rq_param->yuv420_10bpc = pipe_param->src.source_format == dm_420_10;
 
 	get_surf_rq_param(
 			mode_lib,
@@ -794,7 +794,7 @@ static void dml_rq_dlg_get_rq_params(
 			pipe_param,
 			0);
 
-	if (is_dual_plane((enum source_format_class) (pipe_param.src.source_format))) {
+	if (is_dual_plane((enum source_format_class) (pipe_param->src.source_format))) {
 		// get param for chroma surface
 		get_surf_rq_param(
 				mode_lib,
@@ -806,14 +806,14 @@ static void dml_rq_dlg_get_rq_params(
 	}
 
 	// calculate how to split the det buffer space between luma and chroma
-	handle_det_buf_split(mode_lib, rq_param, pipe_param.src);
+	handle_det_buf_split(mode_lib, rq_param, pipe_param->src);
 	print__rq_params_st(mode_lib, *rq_param);
 }
 
 void dml21_rq_dlg_get_rq_reg(
 		struct display_mode_lib *mode_lib,
 		display_rq_regs_st *rq_regs,
-		const display_pipe_params_st pipe_param)
+		const display_pipe_params_st *pipe_param)
 {
 	display_rq_params_st rq_param = {0};
 
@@ -1658,7 +1658,7 @@ void dml21_rq_dlg_get_dlg_reg(
 		struct display_mode_lib *mode_lib,
 		display_dlg_regs_st *dlg_regs,
 		display_ttu_regs_st *ttu_regs,
-		display_e2e_pipe_params_st *e2e_pipe_param,
+		const display_e2e_pipe_params_st *e2e_pipe_param,
 		const unsigned int num_pipes,
 		const unsigned int pipe_idx,
 		const bool cstate_en,
@@ -1696,7 +1696,7 @@ void dml21_rq_dlg_get_dlg_reg(
 	// system parameter calculation done
 
 	dml_print("DML_DLG: Calculation for pipe[%d] start\n\n", pipe_idx);
-	dml_rq_dlg_get_rq_params(mode_lib, &rq_param, e2e_pipe_param[pipe_idx].pipe);
+	dml_rq_dlg_get_rq_params(mode_lib, &rq_param, &e2e_pipe_param[pipe_idx].pipe);
 	dml_rq_dlg_get_dlg_params(
 			mode_lib,
 			e2e_pipe_param,
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn21/display_rq_dlg_calc_21.h b/drivers/gpu/drm/amd/display/dc/dml/dcn21/display_rq_dlg_calc_21.h
index e8f7785e3fc6..af6ad0ca9cf8 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn21/display_rq_dlg_calc_21.h
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn21/display_rq_dlg_calc_21.h
@@ -44,7 +44,7 @@ struct display_mode_lib;
 void dml21_rq_dlg_get_rq_reg(
 		struct display_mode_lib *mode_lib,
 		display_rq_regs_st *rq_regs,
-		const display_pipe_params_st pipe_param);
+		const display_pipe_params_st *pipe_param);
 
 // Function: dml_rq_dlg_get_dlg_reg
 //   Calculate and return DLG and TTU register struct given the system setting
@@ -61,7 +61,7 @@ void dml21_rq_dlg_get_dlg_reg(
 		struct display_mode_lib *mode_lib,
 		display_dlg_regs_st *dlg_regs,
 		display_ttu_regs_st *ttu_regs,
-		display_e2e_pipe_params_st *e2e_pipe_param,
+		const display_e2e_pipe_params_st *e2e_pipe_param,
 		const unsigned int num_pipes,
 		const unsigned int pipe_idx,
 		const bool cstate_en,
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_rq_dlg_calc_30.c b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_rq_dlg_calc_30.c
index 0d934fae1c3a..2120e0941a09 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_rq_dlg_calc_30.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_rq_dlg_calc_30.c
@@ -747,7 +747,7 @@ static void get_surf_rq_param(struct display_mode_lib *mode_lib,
 	display_data_rq_sizing_params_st *rq_sizing_param,
 	display_data_rq_dlg_params_st *rq_dlg_param,
 	display_data_rq_misc_params_st *rq_misc_param,
-	const display_pipe_params_st pipe_param,
+	const display_pipe_params_st *pipe_param,
 	bool is_chroma,
 	bool is_alpha)
 {
@@ -761,32 +761,32 @@ static void get_surf_rq_param(struct display_mode_lib *mode_lib,
 
 	// FIXME check if ppe apply for both luma and chroma in 422 case
 	if (is_chroma | is_alpha) {
-		vp_width = pipe_param.src.viewport_width_c / ppe;
-		vp_height = pipe_param.src.viewport_height_c;
-		data_pitch = pipe_param.src.data_pitch_c;
-		meta_pitch = pipe_param.src.meta_pitch_c;
-		surface_height = pipe_param.src.surface_height_y / 2.0;
+		vp_width = pipe_param->src.viewport_width_c / ppe;
+		vp_height = pipe_param->src.viewport_height_c;
+		data_pitch = pipe_param->src.data_pitch_c;
+		meta_pitch = pipe_param->src.meta_pitch_c;
+		surface_height = pipe_param->src.surface_height_y / 2.0;
 	} else {
-		vp_width = pipe_param.src.viewport_width / ppe;
-		vp_height = pipe_param.src.viewport_height;
-		data_pitch = pipe_param.src.data_pitch;
-		meta_pitch = pipe_param.src.meta_pitch;
-		surface_height = pipe_param.src.surface_height_y;
+		vp_width = pipe_param->src.viewport_width / ppe;
+		vp_height = pipe_param->src.viewport_height;
+		data_pitch = pipe_param->src.data_pitch;
+		meta_pitch = pipe_param->src.meta_pitch;
+		surface_height = pipe_param->src.surface_height_y;
 	}
 
-	if (pipe_param.dest.odm_combine) {
+	if (pipe_param->dest.odm_combine) {
 		unsigned int access_dir = 0;
 		unsigned int full_src_vp_width = 0;
 		unsigned int hactive_odm = 0;
 		unsigned int src_hactive_odm = 0;
-		access_dir = (pipe_param.src.source_scan == dm_vert); // vp access direction: horizontal or vertical accessed
-		hactive_odm  = pipe_param.dest.hactive / ((unsigned int)pipe_param.dest.odm_combine*2);
+		access_dir = (pipe_param->src.source_scan == dm_vert); // vp access direction: horizontal or vertical accessed
+		hactive_odm  = pipe_param->dest.hactive / ((unsigned int) pipe_param->dest.odm_combine*2);
 		if (is_chroma) {
-			full_src_vp_width = pipe_param.scale_ratio_depth.hscl_ratio_c * pipe_param.dest.full_recout_width;
-			src_hactive_odm  = pipe_param.scale_ratio_depth.hscl_ratio_c * hactive_odm;
+			full_src_vp_width = pipe_param->scale_ratio_depth.hscl_ratio_c * pipe_param->dest.full_recout_width;
+			src_hactive_odm  = pipe_param->scale_ratio_depth.hscl_ratio_c * hactive_odm;
 		} else {
-			full_src_vp_width = pipe_param.scale_ratio_depth.hscl_ratio * pipe_param.dest.full_recout_width;
-			src_hactive_odm  = pipe_param.scale_ratio_depth.hscl_ratio * hactive_odm;
+			full_src_vp_width = pipe_param->scale_ratio_depth.hscl_ratio * pipe_param->dest.full_recout_width;
+			src_hactive_odm  = pipe_param->scale_ratio_depth.hscl_ratio * hactive_odm;
 		}
 
 		if (access_dir == 0) {
@@ -815,7 +815,7 @@ static void get_surf_rq_param(struct display_mode_lib *mode_lib,
 	rq_sizing_param->meta_chunk_bytes = 2048;
 	rq_sizing_param->min_meta_chunk_bytes = 256;
 
-	if (pipe_param.src.hostvm)
+	if (pipe_param->src.hostvm)
 		rq_sizing_param->mpte_group_bytes = 512;
 	else
 		rq_sizing_param->mpte_group_bytes = 2048;
@@ -828,28 +828,28 @@ static void get_surf_rq_param(struct display_mode_lib *mode_lib,
 		vp_height,
 		data_pitch,
 		meta_pitch,
-		pipe_param.src.source_format,
-		pipe_param.src.sw_mode,
-		pipe_param.src.macro_tile_size,
-		pipe_param.src.source_scan,
-		pipe_param.src.hostvm,
+		pipe_param->src.source_format,
+		pipe_param->src.sw_mode,
+		pipe_param->src.macro_tile_size,
+		pipe_param->src.source_scan,
+		pipe_param->src.hostvm,
 		is_chroma,
 		surface_height);
 }
 
 static void dml_rq_dlg_get_rq_params(struct display_mode_lib *mode_lib,
 	display_rq_params_st *rq_param,
-	const display_pipe_params_st pipe_param)
+	const display_pipe_params_st *pipe_param)
 {
 	// get param for luma surface
-	rq_param->yuv420 = pipe_param.src.source_format == dm_420_8
-	|| pipe_param.src.source_format == dm_420_10
-	|| pipe_param.src.source_format == dm_rgbe_alpha
-	|| pipe_param.src.source_format == dm_420_12;
+	rq_param->yuv420 = pipe_param->src.source_format == dm_420_8
+	|| pipe_param->src.source_format == dm_420_10
+	|| pipe_param->src.source_format == dm_rgbe_alpha
+	|| pipe_param->src.source_format == dm_420_12;
 
-	rq_param->yuv420_10bpc = pipe_param.src.source_format == dm_420_10;
+	rq_param->yuv420_10bpc = pipe_param->src.source_format == dm_420_10;
 
-	rq_param->rgbe_alpha = (pipe_param.src.source_format == dm_rgbe_alpha)?1:0;
+	rq_param->rgbe_alpha = (pipe_param->src.source_format == dm_rgbe_alpha)?1:0;
 
 	get_surf_rq_param(mode_lib,
 		&(rq_param->sizing.rq_l),
@@ -859,7 +859,7 @@ static void dml_rq_dlg_get_rq_params(struct display_mode_lib *mode_lib,
 		0,
 		0);
 
-	if (is_dual_plane((enum source_format_class)(pipe_param.src.source_format))) {
+	if (is_dual_plane((enum source_format_class)(pipe_param->src.source_format))) {
 		// get param for chroma surface
 		get_surf_rq_param(mode_lib,
 			&(rq_param->sizing.rq_c),
@@ -871,13 +871,13 @@ static void dml_rq_dlg_get_rq_params(struct display_mode_lib *mode_lib,
 	}
 
 	// calculate how to split the det buffer space between luma and chroma
-	handle_det_buf_split(mode_lib, rq_param, pipe_param.src);
+	handle_det_buf_split(mode_lib, rq_param, pipe_param->src);
 	print__rq_params_st(mode_lib, *rq_param);
 }
 
 void dml30_rq_dlg_get_rq_reg(struct display_mode_lib *mode_lib,
 	display_rq_regs_st *rq_regs,
-	const display_pipe_params_st pipe_param)
+	const display_pipe_params_st *pipe_param)
 {
 	display_rq_params_st rq_param = { 0 };
 
@@ -1831,7 +1831,7 @@ static void dml_rq_dlg_get_dlg_params(struct display_mode_lib *mode_lib,
 void dml30_rq_dlg_get_dlg_reg(struct display_mode_lib *mode_lib,
 	display_dlg_regs_st *dlg_regs,
 	display_ttu_regs_st *ttu_regs,
-	display_e2e_pipe_params_st *e2e_pipe_param,
+	const display_e2e_pipe_params_st *e2e_pipe_param,
 	const unsigned int num_pipes,
 	const unsigned int pipe_idx,
 	const bool cstate_en,
@@ -1866,7 +1866,7 @@ void dml30_rq_dlg_get_dlg_reg(struct display_mode_lib *mode_lib,
 	// system parameter calculation done
 
 	dml_print("DML_DLG: Calculation for pipe[%d] start\n\n", pipe_idx);
-	dml_rq_dlg_get_rq_params(mode_lib, &rq_param, e2e_pipe_param[pipe_idx].pipe);
+	dml_rq_dlg_get_rq_params(mode_lib, &rq_param, &e2e_pipe_param[pipe_idx].pipe);
 	dml_rq_dlg_get_dlg_params(mode_lib,
 		e2e_pipe_param,
 		num_pipes,
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_rq_dlg_calc_30.h b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_rq_dlg_calc_30.h
index c04965cceff3..625e41f8d575 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_rq_dlg_calc_30.h
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_rq_dlg_calc_30.h
@@ -41,7 +41,7 @@ struct display_mode_lib;
 //            See also: <display_rq_regs_st>
 void dml30_rq_dlg_get_rq_reg(struct display_mode_lib *mode_lib,
 		display_rq_regs_st *rq_regs,
-		const display_pipe_params_st pipe_param);
+		const display_pipe_params_st *pipe_param);
 
 // Function: dml_rq_dlg_get_dlg_reg
 //   Calculate and return DLG and TTU register struct given the system setting
@@ -57,7 +57,7 @@ void dml30_rq_dlg_get_rq_reg(struct display_mode_lib *mode_lib,
 void dml30_rq_dlg_get_dlg_reg(struct display_mode_lib             *mode_lib,
 		display_dlg_regs_st          *dlg_regs,
 		display_ttu_regs_st          *ttu_regs,
-		display_e2e_pipe_params_st   *e2e_pipe_param,
+		const display_e2e_pipe_params_st   *e2e_pipe_param,
 		const unsigned int            num_pipes,
 		const unsigned int            pipe_idx,
 		const bool                    cstate_en,
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_rq_dlg_calc_31.c b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_rq_dlg_calc_31.c
index c23905bc733a..57bd4e3f8a82 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_rq_dlg_calc_31.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_rq_dlg_calc_31.c
@@ -738,7 +738,7 @@ static void get_surf_rq_param(
 		display_data_rq_sizing_params_st *rq_sizing_param,
 		display_data_rq_dlg_params_st *rq_dlg_param,
 		display_data_rq_misc_params_st *rq_misc_param,
-		const display_pipe_params_st pipe_param,
+		const display_pipe_params_st *pipe_param,
 		bool is_chroma,
 		bool is_alpha)
 {
@@ -752,33 +752,33 @@ static void get_surf_rq_param(
 
 	// FIXME check if ppe apply for both luma and chroma in 422 case
 	if (is_chroma | is_alpha) {
-		vp_width = pipe_param.src.viewport_width_c / ppe;
-		vp_height = pipe_param.src.viewport_height_c;
-		data_pitch = pipe_param.src.data_pitch_c;
-		meta_pitch = pipe_param.src.meta_pitch_c;
-		surface_height = pipe_param.src.surface_height_y / 2.0;
+		vp_width = pipe_param->src.viewport_width_c / ppe;
+		vp_height = pipe_param->src.viewport_height_c;
+		data_pitch = pipe_param->src.data_pitch_c;
+		meta_pitch = pipe_param->src.meta_pitch_c;
+		surface_height = pipe_param->src.surface_height_y / 2.0;
 	} else {
-		vp_width = pipe_param.src.viewport_width / ppe;
-		vp_height = pipe_param.src.viewport_height;
-		data_pitch = pipe_param.src.data_pitch;
-		meta_pitch = pipe_param.src.meta_pitch;
-		surface_height = pipe_param.src.surface_height_y;
+		vp_width = pipe_param->src.viewport_width / ppe;
+		vp_height = pipe_param->src.viewport_height;
+		data_pitch = pipe_param->src.data_pitch;
+		meta_pitch = pipe_param->src.meta_pitch;
+		surface_height = pipe_param->src.surface_height_y;
 	}
 
-	if (pipe_param.dest.odm_combine) {
+	if (pipe_param->dest.odm_combine) {
 		unsigned int access_dir;
 		unsigned int full_src_vp_width;
 		unsigned int hactive_odm;
 		unsigned int src_hactive_odm;
 
-		access_dir = (pipe_param.src.source_scan == dm_vert); // vp access direction: horizontal or vertical accessed
-		hactive_odm = pipe_param.dest.hactive / ((unsigned int) pipe_param.dest.odm_combine * 2);
+		access_dir = (pipe_param->src.source_scan == dm_vert); // vp access direction: horizontal or vertical accessed
+		hactive_odm = pipe_param->dest.hactive / ((unsigned int) pipe_param->dest.odm_combine * 2);
 		if (is_chroma) {
-			full_src_vp_width = pipe_param.scale_ratio_depth.hscl_ratio_c * pipe_param.dest.full_recout_width;
-			src_hactive_odm = pipe_param.scale_ratio_depth.hscl_ratio_c * hactive_odm;
+			full_src_vp_width = pipe_param->scale_ratio_depth.hscl_ratio_c * pipe_param->dest.full_recout_width;
+			src_hactive_odm = pipe_param->scale_ratio_depth.hscl_ratio_c * hactive_odm;
 		} else {
-			full_src_vp_width = pipe_param.scale_ratio_depth.hscl_ratio * pipe_param.dest.full_recout_width;
-			src_hactive_odm = pipe_param.scale_ratio_depth.hscl_ratio * hactive_odm;
+			full_src_vp_width = pipe_param->scale_ratio_depth.hscl_ratio * pipe_param->dest.full_recout_width;
+			src_hactive_odm = pipe_param->scale_ratio_depth.hscl_ratio * hactive_odm;
 		}
 
 		if (access_dir == 0) {
@@ -808,7 +808,7 @@ static void get_surf_rq_param(
 	rq_sizing_param->meta_chunk_bytes = 2048;
 	rq_sizing_param->min_meta_chunk_bytes = 256;
 
-	if (pipe_param.src.hostvm)
+	if (pipe_param->src.hostvm)
 		rq_sizing_param->mpte_group_bytes = 512;
 	else
 		rq_sizing_param->mpte_group_bytes = 2048;
@@ -822,38 +822,38 @@ static void get_surf_rq_param(
 			vp_height,
 			data_pitch,
 			meta_pitch,
-			pipe_param.src.source_format,
-			pipe_param.src.sw_mode,
-			pipe_param.src.macro_tile_size,
-			pipe_param.src.source_scan,
-			pipe_param.src.hostvm,
+			pipe_param->src.source_format,
+			pipe_param->src.sw_mode,
+			pipe_param->src.macro_tile_size,
+			pipe_param->src.source_scan,
+			pipe_param->src.hostvm,
 			is_chroma,
 			surface_height);
 }
 
-static void dml_rq_dlg_get_rq_params(struct display_mode_lib *mode_lib, display_rq_params_st *rq_param, const display_pipe_params_st pipe_param)
+static void dml_rq_dlg_get_rq_params(struct display_mode_lib *mode_lib, display_rq_params_st *rq_param, const display_pipe_params_st *pipe_param)
 {
 	// get param for luma surface
-	rq_param->yuv420 = pipe_param.src.source_format == dm_420_8 || pipe_param.src.source_format == dm_420_10 || pipe_param.src.source_format == dm_rgbe_alpha
-			|| pipe_param.src.source_format == dm_420_12;
+	rq_param->yuv420 = pipe_param->src.source_format == dm_420_8 || pipe_param->src.source_format == dm_420_10 || pipe_param->src.source_format == dm_rgbe_alpha
+			|| pipe_param->src.source_format == dm_420_12;
 
-	rq_param->yuv420_10bpc = pipe_param.src.source_format == dm_420_10;
+	rq_param->yuv420_10bpc = pipe_param->src.source_format == dm_420_10;
 
-	rq_param->rgbe_alpha = (pipe_param.src.source_format == dm_rgbe_alpha) ? 1 : 0;
+	rq_param->rgbe_alpha = (pipe_param->src.source_format == dm_rgbe_alpha) ? 1 : 0;
 
 	get_surf_rq_param(mode_lib, &(rq_param->sizing.rq_l), &(rq_param->dlg.rq_l), &(rq_param->misc.rq_l), pipe_param, 0, 0);
 
-	if (is_dual_plane((enum source_format_class) (pipe_param.src.source_format))) {
+	if (is_dual_plane((enum source_format_class) (pipe_param->src.source_format))) {
 		// get param for chroma surface
 		get_surf_rq_param(mode_lib, &(rq_param->sizing.rq_c), &(rq_param->dlg.rq_c), &(rq_param->misc.rq_c), pipe_param, 1, rq_param->rgbe_alpha);
 	}
 
 	// calculate how to split the det buffer space between luma and chroma
-	handle_det_buf_split(mode_lib, rq_param, pipe_param.src);
+	handle_det_buf_split(mode_lib, rq_param, pipe_param->src);
 	print__rq_params_st(mode_lib, *rq_param);
 }
 
-void dml31_rq_dlg_get_rq_reg(struct display_mode_lib *mode_lib, display_rq_regs_st *rq_regs, const display_pipe_params_st pipe_param)
+void dml31_rq_dlg_get_rq_reg(struct display_mode_lib *mode_lib, display_rq_regs_st *rq_regs, const display_pipe_params_st *pipe_param)
 {
 	display_rq_params_st rq_param = {0};
 
@@ -1677,7 +1677,7 @@ void dml31_rq_dlg_get_dlg_reg(
 		struct display_mode_lib *mode_lib,
 		display_dlg_regs_st *dlg_regs,
 		display_ttu_regs_st *ttu_regs,
-		display_e2e_pipe_params_st *e2e_pipe_param,
+		const display_e2e_pipe_params_st *e2e_pipe_param,
 		const unsigned int num_pipes,
 		const unsigned int pipe_idx,
 		const bool cstate_en,
@@ -1704,7 +1704,7 @@ void dml31_rq_dlg_get_dlg_reg(
 	// system parameter calculation done
 
 	dml_print("DML_DLG: Calculation for pipe[%d] start\n\n", pipe_idx);
-	dml_rq_dlg_get_rq_params(mode_lib, &rq_param, e2e_pipe_param[pipe_idx].pipe);
+	dml_rq_dlg_get_rq_params(mode_lib, &rq_param, &e2e_pipe_param[pipe_idx].pipe);
 	dml_rq_dlg_get_dlg_params(
 			mode_lib,
 			e2e_pipe_param,
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_rq_dlg_calc_31.h b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_rq_dlg_calc_31.h
index adf8518f761f..8ee991351699 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_rq_dlg_calc_31.h
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_rq_dlg_calc_31.h
@@ -41,7 +41,7 @@ struct display_mode_lib;
 //            See also: <display_rq_regs_st>
 void dml31_rq_dlg_get_rq_reg(struct display_mode_lib *mode_lib,
 		display_rq_regs_st *rq_regs,
-		const display_pipe_params_st pipe_param);
+		const display_pipe_params_st *pipe_param);
 
 // Function: dml_rq_dlg_get_dlg_reg
 //   Calculate and return DLG and TTU register struct given the system setting
@@ -57,7 +57,7 @@ void dml31_rq_dlg_get_rq_reg(struct display_mode_lib *mode_lib,
 void dml31_rq_dlg_get_dlg_reg(struct display_mode_lib             *mode_lib,
 		display_dlg_regs_st          *dlg_regs,
 		display_ttu_regs_st          *ttu_regs,
-		display_e2e_pipe_params_st   *e2e_pipe_param,
+		const display_e2e_pipe_params_st *e2e_pipe_param,
 		const unsigned int            num_pipes,
 		const unsigned int            pipe_idx,
 		const bool                    cstate_en,
diff --git a/drivers/gpu/drm/amd/display/dc/dml/display_mode_lib.h b/drivers/gpu/drm/amd/display/dc/dml/display_mode_lib.h
index d42a0aeca6be..72b1957022aa 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/display_mode_lib.h
+++ b/drivers/gpu/drm/amd/display/dc/dml/display_mode_lib.h
@@ -49,7 +49,7 @@ struct dml_funcs {
 			struct display_mode_lib *mode_lib,
 			display_dlg_regs_st *dlg_regs,
 			display_ttu_regs_st *ttu_regs,
-			display_e2e_pipe_params_st *e2e_pipe_param,
+			const display_e2e_pipe_params_st *e2e_pipe_param,
 			const unsigned int num_pipes,
 			const unsigned int pipe_idx,
 			const bool cstate_en,
@@ -60,7 +60,7 @@ struct dml_funcs {
 	void (*rq_dlg_get_rq_reg)(
 		struct display_mode_lib *mode_lib,
 		display_rq_regs_st *rq_regs,
-		const display_pipe_params_st pipe_param);
+		const display_pipe_params_st *pipe_param);
 	void (*recalculate)(struct display_mode_lib *mode_lib);
 	void (*validate)(struct display_mode_lib *mode_lib);
 };
-- 
2.33.0


[-- Attachment #3: 0002-drm-amd-display-Allocate-structs-needed-by-dcn_bw_ca.patch --]
[-- Type: text/x-patch, Size: 22557 bytes --]

From eece72e18fc9afb237f11077c0c57d82c59b1d1f Mon Sep 17 00:00:00 2001
From: Harry Wentland <harry.wentland@amd.com>
Date: Tue, 7 Sep 2021 20:22:13 -0400
Subject: [PATCH 2/2] drm/amd/display: Allocate structs needed by
 dcn_bw_calc_rq_dlg_ttu in pipe_ctx

[Why & How]
dcn_bw_calc_rq_dlg_ttu uses a stack frame great than 1024. To solve this
we could allocate the rq_param, dlg_sys_param, and input structs
dynamically. Since this function is inside a kernel_fpu_begin()/end()
call we want to avoid memory allocation. Instead it's much
safer to pre-allocate these on the pipe_ctx.

Signed-off-by: Harry Wentland <harry.wentland@amd.com>
---
 .../gpu/drm/amd/display/dc/calcs/dcn_calcs.c  |  55 +++---
 .../display/dc/dml/dml1_display_rq_dlg_calc.c | 170 +++++++++---------
 .../display/dc/dml/dml1_display_rq_dlg_calc.h |   8 +-
 .../gpu/drm/amd/display/dc/inc/core_types.h   |   3 +
 4 files changed, 121 insertions(+), 115 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c b/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c
index 0e18df1283b6..e64145b0a32b 100644
--- a/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c
+++ b/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c
@@ -459,9 +459,9 @@ static void dcn_bw_calc_rq_dlg_ttu(
 	struct _vcs_dpi_display_dlg_regs_st *dlg_regs = &pipe->dlg_regs;
 	struct _vcs_dpi_display_ttu_regs_st *ttu_regs = &pipe->ttu_regs;
 	struct _vcs_dpi_display_rq_regs_st *rq_regs = &pipe->rq_regs;
-	struct _vcs_dpi_display_rq_params_st rq_param = {0};
-	struct _vcs_dpi_display_dlg_sys_params_st dlg_sys_param = {0};
-	struct _vcs_dpi_display_e2e_pipe_params_st input = { { { 0 } } };
+	struct _vcs_dpi_display_rq_params_st *rq_param = &pipe->dml_rq_param;
+	struct _vcs_dpi_display_dlg_sys_params_st *dlg_sys_param = &pipe->dml_dlg_sys_param;
+	struct _vcs_dpi_display_e2e_pipe_params_st *input = &pipe->dml_input;
 	float total_active_bw = 0;
 	float total_prefetch_bw = 0;
 	int total_flip_bytes = 0;
@@ -470,45 +470,48 @@ static void dcn_bw_calc_rq_dlg_ttu(
 	memset(dlg_regs, 0, sizeof(*dlg_regs));
 	memset(ttu_regs, 0, sizeof(*ttu_regs));
 	memset(rq_regs, 0, sizeof(*rq_regs));
+	memset(rq_param, 0, sizeof(*rq_param));
+	memset(dlg_sys_param, 0, sizeof(*dlg_sys_param));
+	memset(input, 0, sizeof(*input));
 
 	for (i = 0; i < number_of_planes; i++) {
 		total_active_bw += v->read_bandwidth[i];
 		total_prefetch_bw += v->prefetch_bandwidth[i];
 		total_flip_bytes += v->total_immediate_flip_bytes[i];
 	}
-	dlg_sys_param.total_flip_bw = v->return_bw - dcn_bw_max2(total_active_bw, total_prefetch_bw);
-	if (dlg_sys_param.total_flip_bw < 0.0)
-		dlg_sys_param.total_flip_bw = 0;
-
-	dlg_sys_param.t_mclk_wm_us = v->dram_clock_change_watermark;
-	dlg_sys_param.t_sr_wm_us = v->stutter_enter_plus_exit_watermark;
-	dlg_sys_param.t_urg_wm_us = v->urgent_watermark;
-	dlg_sys_param.t_extra_us = v->urgent_extra_latency;
-	dlg_sys_param.deepsleep_dcfclk_mhz = v->dcf_clk_deep_sleep;
-	dlg_sys_param.total_flip_bytes = total_flip_bytes;
-
-	pipe_ctx_to_e2e_pipe_params(pipe, &input.pipe);
-	input.clks_cfg.dcfclk_mhz = v->dcfclk;
-	input.clks_cfg.dispclk_mhz = v->dispclk;
-	input.clks_cfg.dppclk_mhz = v->dppclk;
-	input.clks_cfg.refclk_mhz = dc->res_pool->ref_clocks.dchub_ref_clock_inKhz / 1000.0;
-	input.clks_cfg.socclk_mhz = v->socclk;
-	input.clks_cfg.voltage = v->voltage_level;
+	dlg_sys_param->total_flip_bw = v->return_bw - dcn_bw_max2(total_active_bw, total_prefetch_bw);
+	if (dlg_sys_param->total_flip_bw < 0.0)
+		dlg_sys_param->total_flip_bw = 0;
+
+	dlg_sys_param->t_mclk_wm_us = v->dram_clock_change_watermark;
+	dlg_sys_param->t_sr_wm_us = v->stutter_enter_plus_exit_watermark;
+	dlg_sys_param->t_urg_wm_us = v->urgent_watermark;
+	dlg_sys_param->t_extra_us = v->urgent_extra_latency;
+	dlg_sys_param->deepsleep_dcfclk_mhz = v->dcf_clk_deep_sleep;
+	dlg_sys_param->total_flip_bytes = total_flip_bytes;
+
+	pipe_ctx_to_e2e_pipe_params(pipe, &input->pipe);
+	input->clks_cfg.dcfclk_mhz = v->dcfclk;
+	input->clks_cfg.dispclk_mhz = v->dispclk;
+	input->clks_cfg.dppclk_mhz = v->dppclk;
+	input->clks_cfg.refclk_mhz = dc->res_pool->ref_clocks.dchub_ref_clock_inKhz / 1000.0;
+	input->clks_cfg.socclk_mhz = v->socclk;
+	input->clks_cfg.voltage = v->voltage_level;
 //	dc->dml.logger = pool->base.logger;
-	input.dout.output_format = (v->output_format[in_idx] == dcn_bw_420) ? dm_420 : dm_444;
-	input.dout.output_type  = (v->output[in_idx] == dcn_bw_hdmi) ? dm_hdmi : dm_dp;
+	input->dout.output_format = (v->output_format[in_idx] == dcn_bw_420) ? dm_420 : dm_444;
+	input->dout.output_type  = (v->output[in_idx] == dcn_bw_hdmi) ? dm_hdmi : dm_dp;
 	//input[in_idx].dout.output_standard;
 
 	/*todo: soc->sr_enter_plus_exit_time??*/
-	dlg_sys_param.t_srx_delay_us = dc->dcn_ip->dcfclk_cstate_latency / v->dcf_clk_deep_sleep;
+	dlg_sys_param->t_srx_delay_us = dc->dcn_ip->dcfclk_cstate_latency / v->dcf_clk_deep_sleep;
 
-	dml1_rq_dlg_get_rq_params(dml, &rq_param, input.pipe.src);
+	dml1_rq_dlg_get_rq_params(dml, rq_param, input->pipe.src);
 	dml1_extract_rq_regs(dml, rq_regs, rq_param);
 	dml1_rq_dlg_get_dlg_params(
 			dml,
 			dlg_regs,
 			ttu_regs,
-			rq_param.dlg,
+			&rq_param->dlg,
 			dlg_sys_param,
 			input,
 			true,
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dml1_display_rq_dlg_calc.c b/drivers/gpu/drm/amd/display/dc/dml/dml1_display_rq_dlg_calc.c
index 8f2b1684c231..0561374cdada 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dml1_display_rq_dlg_calc.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dml1_display_rq_dlg_calc.c
@@ -231,22 +231,22 @@ static void extract_rq_sizing_regs(
 void dml1_extract_rq_regs(
 		struct display_mode_lib *mode_lib,
 		struct _vcs_dpi_display_rq_regs_st *rq_regs,
-		const struct _vcs_dpi_display_rq_params_st rq_param)
+		const struct _vcs_dpi_display_rq_params_st *rq_param)
 {
 	unsigned int detile_buf_size_in_bytes = mode_lib->ip.det_buffer_size_kbytes * 1024;
 	unsigned int detile_buf_plane1_addr = 0;
 
-	extract_rq_sizing_regs(mode_lib, &(rq_regs->rq_regs_l), rq_param.sizing.rq_l);
-	if (rq_param.yuv420)
-		extract_rq_sizing_regs(mode_lib, &(rq_regs->rq_regs_c), rq_param.sizing.rq_c);
+	extract_rq_sizing_regs(mode_lib, &(rq_regs->rq_regs_l), rq_param->sizing.rq_l);
+	if (rq_param->yuv420)
+		extract_rq_sizing_regs(mode_lib, &(rq_regs->rq_regs_c), rq_param->sizing.rq_c);
 
-	rq_regs->rq_regs_l.swath_height = dml_log2(rq_param.dlg.rq_l.swath_height);
-	rq_regs->rq_regs_c.swath_height = dml_log2(rq_param.dlg.rq_c.swath_height);
+	rq_regs->rq_regs_l.swath_height = dml_log2(rq_param->dlg.rq_l.swath_height);
+	rq_regs->rq_regs_c.swath_height = dml_log2(rq_param->dlg.rq_c.swath_height);
 
 	/* TODO: take the max between luma, chroma chunk size?
 	 * okay for now, as we are setting chunk_bytes to 8kb anyways
 	 */
-	if (rq_param.sizing.rq_l.chunk_bytes >= 32 * 1024) { /*32kb */
+	if (rq_param->sizing.rq_l.chunk_bytes >= 32 * 1024) { /*32kb */
 		rq_regs->drq_expansion_mode = 0;
 	} else {
 		rq_regs->drq_expansion_mode = 2;
@@ -255,9 +255,9 @@ void dml1_extract_rq_regs(
 	rq_regs->mrq_expansion_mode = 1;
 	rq_regs->crq_expansion_mode = 1;
 
-	if (rq_param.yuv420) {
-		if ((double) rq_param.misc.rq_l.stored_swath_bytes
-				/ (double) rq_param.misc.rq_c.stored_swath_bytes <= 1.5) {
+	if (rq_param->yuv420) {
+		if ((double) rq_param->misc.rq_l.stored_swath_bytes
+				/ (double) rq_param->misc.rq_c.stored_swath_bytes <= 1.5) {
 			detile_buf_plane1_addr = (detile_buf_size_in_bytes / 2.0 / 64.0); /* half to chroma */
 		} else {
 			detile_buf_plane1_addr = dml_round_to_multiple(
@@ -998,26 +998,26 @@ void dml1_rq_dlg_get_dlg_params(
 		struct display_mode_lib *mode_lib,
 		struct _vcs_dpi_display_dlg_regs_st *disp_dlg_regs,
 		struct _vcs_dpi_display_ttu_regs_st *disp_ttu_regs,
-		const struct _vcs_dpi_display_rq_dlg_params_st rq_dlg_param,
-		const struct _vcs_dpi_display_dlg_sys_params_st dlg_sys_param,
-		const struct _vcs_dpi_display_e2e_pipe_params_st e2e_pipe_param,
+		const struct _vcs_dpi_display_rq_dlg_params_st *rq_dlg_param,
+		const struct _vcs_dpi_display_dlg_sys_params_st *dlg_sys_param,
+		const struct _vcs_dpi_display_e2e_pipe_params_st *e2e_pipe_param,
 		const bool cstate_en,
 		const bool pstate_en,
 		const bool vm_en,
 		const bool iflip_en)
 {
 	/* Timing */
-	unsigned int htotal = e2e_pipe_param.pipe.dest.htotal;
-	unsigned int hblank_end = e2e_pipe_param.pipe.dest.hblank_end;
-	unsigned int vblank_start = e2e_pipe_param.pipe.dest.vblank_start;
-	unsigned int vblank_end = e2e_pipe_param.pipe.dest.vblank_end;
-	bool interlaced = e2e_pipe_param.pipe.dest.interlaced;
+	unsigned int htotal = e2e_pipe_param->pipe.dest.htotal;
+	unsigned int hblank_end = e2e_pipe_param->pipe.dest.hblank_end;
+	unsigned int vblank_start = e2e_pipe_param->pipe.dest.vblank_start;
+	unsigned int vblank_end = e2e_pipe_param->pipe.dest.vblank_end;
+	bool interlaced = e2e_pipe_param->pipe.dest.interlaced;
 	unsigned int min_vblank = mode_lib->ip.min_vblank_lines;
 
-	double pclk_freq_in_mhz = e2e_pipe_param.pipe.dest.pixel_rate_mhz;
-	double refclk_freq_in_mhz = e2e_pipe_param.clks_cfg.refclk_mhz;
-	double dppclk_freq_in_mhz = e2e_pipe_param.clks_cfg.dppclk_mhz;
-	double dispclk_freq_in_mhz = e2e_pipe_param.clks_cfg.dispclk_mhz;
+	double pclk_freq_in_mhz = e2e_pipe_param->pipe.dest.pixel_rate_mhz;
+	double refclk_freq_in_mhz = e2e_pipe_param->clks_cfg.refclk_mhz;
+	double dppclk_freq_in_mhz = e2e_pipe_param->clks_cfg.dppclk_mhz;
+	double dispclk_freq_in_mhz = e2e_pipe_param->clks_cfg.dispclk_mhz;
 
 	double ref_freq_to_pix_freq;
 	double prefetch_xy_calc_in_dcfclk;
@@ -1160,13 +1160,13 @@ void dml1_rq_dlg_get_dlg_params(
 	disp_dlg_regs->dlg_vblank_end = interlaced ? (vblank_end / 2) : vblank_end; /* 15 bits */
 
 	prefetch_xy_calc_in_dcfclk = 24.0; /* TODO: ip_param */
-	min_dcfclk_mhz = dlg_sys_param.deepsleep_dcfclk_mhz;
+	min_dcfclk_mhz = dlg_sys_param->deepsleep_dcfclk_mhz;
 	t_calc_us = prefetch_xy_calc_in_dcfclk / min_dcfclk_mhz;
-	min_ttu_vblank = dlg_sys_param.t_urg_wm_us;
+	min_ttu_vblank = dlg_sys_param->t_urg_wm_us;
 	if (cstate_en)
-		min_ttu_vblank = dml_max(dlg_sys_param.t_sr_wm_us, min_ttu_vblank);
+		min_ttu_vblank = dml_max(dlg_sys_param->t_sr_wm_us, min_ttu_vblank);
 	if (pstate_en)
-		min_ttu_vblank = dml_max(dlg_sys_param.t_mclk_wm_us, min_ttu_vblank);
+		min_ttu_vblank = dml_max(dlg_sys_param->t_mclk_wm_us, min_ttu_vblank);
 	min_ttu_vblank = min_ttu_vblank + t_calc_us;
 
 	min_dst_y_ttu_vblank = min_ttu_vblank * pclk_freq_in_mhz / (double) htotal;
@@ -1197,59 +1197,59 @@ void dml1_rq_dlg_get_dlg_params(
 	/* ------------------------- */
 	/* Prefetch Calc */
 	/* Source */
-	dcc_en = e2e_pipe_param.pipe.src.dcc;
+	dcc_en = e2e_pipe_param->pipe.src.dcc;
 	dual_plane = is_dual_plane(
-			(enum source_format_class) e2e_pipe_param.pipe.src.source_format);
+			(enum source_format_class) e2e_pipe_param->pipe.src.source_format);
 	mode_422 = 0; /* TODO */
-	access_dir = (e2e_pipe_param.pipe.src.source_scan == dm_vert); /* vp access direction: horizontal or vertical accessed */
+	access_dir = (e2e_pipe_param->pipe.src.source_scan == dm_vert); /* vp access direction: horizontal or vertical accessed */
 	bytes_per_element_l = get_bytes_per_element(
-			(enum source_format_class) e2e_pipe_param.pipe.src.source_format,
+			(enum source_format_class) e2e_pipe_param->pipe.src.source_format,
 			0);
 	bytes_per_element_c = get_bytes_per_element(
-			(enum source_format_class) e2e_pipe_param.pipe.src.source_format,
+			(enum source_format_class) e2e_pipe_param->pipe.src.source_format,
 			1);
-	vp_height_l = e2e_pipe_param.pipe.src.viewport_height;
-	vp_width_l = e2e_pipe_param.pipe.src.viewport_width;
-	vp_height_c = e2e_pipe_param.pipe.src.viewport_height_c;
-	vp_width_c = e2e_pipe_param.pipe.src.viewport_width_c;
+	vp_height_l = e2e_pipe_param->pipe.src.viewport_height;
+	vp_width_l = e2e_pipe_param->pipe.src.viewport_width;
+	vp_height_c = e2e_pipe_param->pipe.src.viewport_height_c;
+	vp_width_c = e2e_pipe_param->pipe.src.viewport_width_c;
 
 	/* Scaling */
-	htaps_l = e2e_pipe_param.pipe.scale_taps.htaps;
-	htaps_c = e2e_pipe_param.pipe.scale_taps.htaps_c;
-	hratios_l = e2e_pipe_param.pipe.scale_ratio_depth.hscl_ratio;
-	hratios_c = e2e_pipe_param.pipe.scale_ratio_depth.hscl_ratio_c;
-	vratio_l = e2e_pipe_param.pipe.scale_ratio_depth.vscl_ratio;
-	vratio_c = e2e_pipe_param.pipe.scale_ratio_depth.vscl_ratio_c;
+	htaps_l = e2e_pipe_param->pipe.scale_taps.htaps;
+	htaps_c = e2e_pipe_param->pipe.scale_taps.htaps_c;
+	hratios_l = e2e_pipe_param->pipe.scale_ratio_depth.hscl_ratio;
+	hratios_c = e2e_pipe_param->pipe.scale_ratio_depth.hscl_ratio_c;
+	vratio_l = e2e_pipe_param->pipe.scale_ratio_depth.vscl_ratio;
+	vratio_c = e2e_pipe_param->pipe.scale_ratio_depth.vscl_ratio_c;
 
 	line_time_in_us = (htotal / pclk_freq_in_mhz);
-	vinit_l = e2e_pipe_param.pipe.scale_ratio_depth.vinit;
-	vinit_c = e2e_pipe_param.pipe.scale_ratio_depth.vinit_c;
-	vinit_bot_l = e2e_pipe_param.pipe.scale_ratio_depth.vinit_bot;
-	vinit_bot_c = e2e_pipe_param.pipe.scale_ratio_depth.vinit_bot_c;
-
-	swath_height_l = rq_dlg_param.rq_l.swath_height;
-	swath_width_ub_l = rq_dlg_param.rq_l.swath_width_ub;
-	dpte_bytes_per_row_ub_l = rq_dlg_param.rq_l.dpte_bytes_per_row_ub;
-	dpte_groups_per_row_ub_l = rq_dlg_param.rq_l.dpte_groups_per_row_ub;
-	meta_pte_bytes_per_frame_ub_l = rq_dlg_param.rq_l.meta_pte_bytes_per_frame_ub;
-	meta_bytes_per_row_ub_l = rq_dlg_param.rq_l.meta_bytes_per_row_ub;
-
-	swath_height_c = rq_dlg_param.rq_c.swath_height;
-	swath_width_ub_c = rq_dlg_param.rq_c.swath_width_ub;
-	dpte_bytes_per_row_ub_c = rq_dlg_param.rq_c.dpte_bytes_per_row_ub;
-	dpte_groups_per_row_ub_c = rq_dlg_param.rq_c.dpte_groups_per_row_ub;
-
-	meta_chunks_per_row_ub_l = rq_dlg_param.rq_l.meta_chunks_per_row_ub;
-	vupdate_offset = e2e_pipe_param.pipe.dest.vupdate_offset;
-	vupdate_width = e2e_pipe_param.pipe.dest.vupdate_width;
-	vready_offset = e2e_pipe_param.pipe.dest.vready_offset;
+	vinit_l = e2e_pipe_param->pipe.scale_ratio_depth.vinit;
+	vinit_c = e2e_pipe_param->pipe.scale_ratio_depth.vinit_c;
+	vinit_bot_l = e2e_pipe_param->pipe.scale_ratio_depth.vinit_bot;
+	vinit_bot_c = e2e_pipe_param->pipe.scale_ratio_depth.vinit_bot_c;
+
+	swath_height_l = rq_dlg_param->rq_l.swath_height;
+	swath_width_ub_l = rq_dlg_param->rq_l.swath_width_ub;
+	dpte_bytes_per_row_ub_l = rq_dlg_param->rq_l.dpte_bytes_per_row_ub;
+	dpte_groups_per_row_ub_l = rq_dlg_param->rq_l.dpte_groups_per_row_ub;
+	meta_pte_bytes_per_frame_ub_l = rq_dlg_param->rq_l.meta_pte_bytes_per_frame_ub;
+	meta_bytes_per_row_ub_l = rq_dlg_param->rq_l.meta_bytes_per_row_ub;
+
+	swath_height_c = rq_dlg_param->rq_c.swath_height;
+	swath_width_ub_c = rq_dlg_param->rq_c.swath_width_ub;
+	dpte_bytes_per_row_ub_c = rq_dlg_param->rq_c.dpte_bytes_per_row_ub;
+	dpte_groups_per_row_ub_c = rq_dlg_param->rq_c.dpte_groups_per_row_ub;
+
+	meta_chunks_per_row_ub_l = rq_dlg_param->rq_l.meta_chunks_per_row_ub;
+	vupdate_offset = e2e_pipe_param->pipe.dest.vupdate_offset;
+	vupdate_width = e2e_pipe_param->pipe.dest.vupdate_width;
+	vready_offset = e2e_pipe_param->pipe.dest.vready_offset;
 
 	dppclk_delay_subtotal = mode_lib->ip.dppclk_delay_subtotal;
 	dispclk_delay_subtotal = mode_lib->ip.dispclk_delay_subtotal;
 	pixel_rate_delay_subtotal = dppclk_delay_subtotal * pclk_freq_in_mhz / dppclk_freq_in_mhz
 			+ dispclk_delay_subtotal * pclk_freq_in_mhz / dispclk_freq_in_mhz;
 
-	vstartup_start = e2e_pipe_param.pipe.dest.vstartup_start;
+	vstartup_start = e2e_pipe_param->pipe.dest.vstartup_start;
 
 	if (interlaced)
 		vstartup_start = vstartup_start / 2;
@@ -1276,13 +1276,13 @@ void dml1_rq_dlg_get_dlg_params(
 	dst_x_after_scaler = 0;
 	dst_y_after_scaler = 0;
 
-	if (e2e_pipe_param.pipe.src.is_hsplit)
+	if (e2e_pipe_param->pipe.src.is_hsplit)
 		dst_x_after_scaler = pixel_rate_delay_subtotal
-				+ e2e_pipe_param.pipe.dest.recout_width;
+				+ e2e_pipe_param->pipe.dest.recout_width;
 	else
 		dst_x_after_scaler = pixel_rate_delay_subtotal;
 
-	if (e2e_pipe_param.dout.output_format == dm_420)
+	if (e2e_pipe_param->dout.output_format == dm_420)
 		dst_y_after_scaler = 1;
 	else
 		dst_y_after_scaler = 0;
@@ -1334,7 +1334,7 @@ void dml1_rq_dlg_get_dlg_params(
 	DTRACE(
 			"DLG: %s: t_srx_delay_us     = %3.2f",
 			__func__,
-			(double) dlg_sys_param.t_srx_delay_us);
+			(double) dlg_sys_param->t_srx_delay_us);
 	DTRACE("DLG: %s: line_time_in_us    = %3.2f", __func__, (double) line_time_in_us);
 	DTRACE("DLG: %s: vupdate_offset     = %d", __func__, vupdate_offset);
 	DTRACE("DLG: %s: vupdate_width      = %d", __func__, vupdate_width);
@@ -1408,12 +1408,12 @@ void dml1_rq_dlg_get_dlg_params(
 	DTRACE("DLG: %s: dpte_row_bytes          = %d", __func__, dpte_row_bytes);
 
 	prefetch_bw = (vm_bytes + 2 * dpte_row_bytes + 2 * meta_row_bytes + sw_bytes) / t_pre_us;
-	flip_bw = ((vm_bytes + dpte_row_bytes + meta_row_bytes) * dlg_sys_param.total_flip_bw)
-			/ (double) dlg_sys_param.total_flip_bytes;
+	flip_bw = ((vm_bytes + dpte_row_bytes + meta_row_bytes) * dlg_sys_param->total_flip_bw)
+			/ (double) dlg_sys_param->total_flip_bytes;
 	t_vm_us = line_time_in_us / 4.0;
 	if (vm_en && dcc_en) {
 		t_vm_us = dml_max(
-				dlg_sys_param.t_extra_us,
+				dlg_sys_param->t_extra_us,
 				dml_max((double) vm_bytes / prefetch_bw, t_vm_us));
 
 		if (iflip_en && !dual_plane) {
@@ -1423,12 +1423,12 @@ void dml1_rq_dlg_get_dlg_params(
 		}
 	}
 
-	t_r0_us = dml_max(dlg_sys_param.t_extra_us - t_vm_us, line_time_in_us - t_vm_us);
+	t_r0_us = dml_max(dlg_sys_param->t_extra_us - t_vm_us, line_time_in_us - t_vm_us);
 
 	if (vm_en || dcc_en) {
 		t_r0_us = dml_max(
 				(double) (dpte_row_bytes + meta_row_bytes) / prefetch_bw,
-				dlg_sys_param.t_extra_us);
+				dlg_sys_param->t_extra_us);
 		t_r0_us = dml_max((double) (line_time_in_us - t_vm_us), t_r0_us);
 
 		if (iflip_en && !dual_plane) {
@@ -1550,15 +1550,15 @@ void dml1_rq_dlg_get_dlg_params(
 			disp_dlg_regs->refcyc_per_meta_chunk_vblank_l;/* dcc for 4:2:0 is not supported in dcn1.0.  assigned to be the same as _l for now */
 
 	/* Active */
-	req_per_swath_ub_l = rq_dlg_param.rq_l.req_per_swath_ub;
-	req_per_swath_ub_c = rq_dlg_param.rq_c.req_per_swath_ub;
-	meta_row_height_l = rq_dlg_param.rq_l.meta_row_height;
+	req_per_swath_ub_l = rq_dlg_param->rq_l.req_per_swath_ub;
+	req_per_swath_ub_c = rq_dlg_param->rq_c.req_per_swath_ub;
+	meta_row_height_l = rq_dlg_param->rq_l.meta_row_height;
 	swath_width_pixels_ub_l = 0;
 	swath_width_pixels_ub_c = 0;
 	scaler_rec_in_width_l = 0;
 	scaler_rec_in_width_c = 0;
-	dpte_row_height_l = rq_dlg_param.rq_l.dpte_row_height;
-	dpte_row_height_c = rq_dlg_param.rq_c.dpte_row_height;
+	dpte_row_height_l = rq_dlg_param->rq_l.dpte_row_height;
+	dpte_row_height_c = rq_dlg_param->rq_c.dpte_row_height;
 
 	disp_dlg_regs->dst_y_per_pte_row_nom_l = (unsigned int) ((double) dpte_row_height_l
 			/ (double) vratio_l * dml_pow(2, 2));
@@ -1650,14 +1650,14 @@ void dml1_rq_dlg_get_dlg_params(
 	refcyc_per_req_delivery_cur0 = 0.;
 
 	full_recout_width = 0;
-	if (e2e_pipe_param.pipe.src.is_hsplit) {
-		if (e2e_pipe_param.pipe.dest.full_recout_width == 0) {
+	if (e2e_pipe_param->pipe.src.is_hsplit) {
+		if (e2e_pipe_param->pipe.dest.full_recout_width == 0) {
 			DTRACE("DLG: %s: Warningfull_recout_width not set in hsplit mode", __func__);
-			full_recout_width = e2e_pipe_param.pipe.dest.recout_width * 2; /* assume half split for dcn1 */
+			full_recout_width = e2e_pipe_param->pipe.dest.recout_width * 2; /* assume half split for dcn1 */
 		} else
-			full_recout_width = e2e_pipe_param.pipe.dest.full_recout_width;
+			full_recout_width = e2e_pipe_param->pipe.dest.full_recout_width;
 	} else
-		full_recout_width = e2e_pipe_param.pipe.dest.recout_width;
+		full_recout_width = e2e_pipe_param->pipe.dest.recout_width;
 
 	refcyc_per_line_delivery_pre_l = get_refcyc_per_delivery(
 			mode_lib,
@@ -1824,9 +1824,9 @@ void dml1_rq_dlg_get_dlg_params(
 	}
 
 	/* TTU - Cursor */
-	hratios_cur0 = e2e_pipe_param.pipe.scale_ratio_depth.hscl_ratio;
-	cur0_src_width = e2e_pipe_param.pipe.src.cur0_src_width; /* cursor source width */
-	cur0_bpp = (enum cursor_bpp) e2e_pipe_param.pipe.src.cur0_bpp;
+	hratios_cur0 = e2e_pipe_param->pipe.scale_ratio_depth.hscl_ratio;
+	cur0_src_width = e2e_pipe_param->pipe.src.cur0_src_width; /* cursor source width */
+	cur0_bpp = (enum cursor_bpp) e2e_pipe_param->pipe.src.cur0_bpp;
 	cur0_req_size = 0;
 	cur0_req_width = 0;
 	cur0_width_ub = 0.0;
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dml1_display_rq_dlg_calc.h b/drivers/gpu/drm/amd/display/dc/dml/dml1_display_rq_dlg_calc.h
index 9c06913ad767..23d4f30826ff 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dml1_display_rq_dlg_calc.h
+++ b/drivers/gpu/drm/amd/display/dc/dml/dml1_display_rq_dlg_calc.h
@@ -33,7 +33,7 @@ struct display_mode_lib;
 void dml1_extract_rq_regs(
 		struct display_mode_lib *mode_lib,
 		struct _vcs_dpi_display_rq_regs_st *rq_regs,
-		const struct _vcs_dpi_display_rq_params_st rq_param);
+		const struct _vcs_dpi_display_rq_params_st *rq_param);
 /* Function: dml_rq_dlg_get_rq_params
  *  Calculate requestor related parameters that register definition agnostic
  *  (i.e. this layer does try to separate real values from register definition)
@@ -55,9 +55,9 @@ void dml1_rq_dlg_get_dlg_params(
 		struct display_mode_lib *mode_lib,
 		struct _vcs_dpi_display_dlg_regs_st *dlg_regs,
 		struct _vcs_dpi_display_ttu_regs_st *ttu_regs,
-		const struct _vcs_dpi_display_rq_dlg_params_st rq_dlg_param,
-		const struct _vcs_dpi_display_dlg_sys_params_st dlg_sys_param,
-		const struct _vcs_dpi_display_e2e_pipe_params_st e2e_pipe_param,
+		const struct _vcs_dpi_display_rq_dlg_params_st *rq_dlg_param,
+		const struct _vcs_dpi_display_dlg_sys_params_st *dlg_sys_param,
+		const struct _vcs_dpi_display_e2e_pipe_params_st *e2e_pipe_param,
 		const bool cstate_en,
 		const bool pstate_en,
 		const bool vm_en,
diff --git a/drivers/gpu/drm/amd/display/dc/inc/core_types.h b/drivers/gpu/drm/amd/display/dc/inc/core_types.h
index 45a6216dfa2a..4dca14b598dd 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/core_types.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/core_types.h
@@ -366,6 +366,9 @@ struct pipe_ctx {
 	struct _vcs_dpi_display_ttu_regs_st ttu_regs;
 	struct _vcs_dpi_display_rq_regs_st rq_regs;
 	struct _vcs_dpi_display_pipe_dest_params_st pipe_dlg_param;
+	struct _vcs_dpi_display_rq_params_st dml_rq_param;
+	struct _vcs_dpi_display_dlg_sys_params_st dml_dlg_sys_param;
+	struct _vcs_dpi_display_e2e_pipe_params_st dml_input;
 	int det_buffer_size_kb;
 	bool unbounded_req;
 #endif
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] Enable '-Werror' by default for all kernel builds
  2021-09-08  3:52             ` Harry Wentland
@ 2021-09-08  4:41               ` Linus Torvalds
  2021-09-09  0:48                 ` Harry Wentland
  0 siblings, 1 reply; 16+ messages in thread
From: Linus Torvalds @ 2021-09-08  4:41 UTC (permalink / raw)
  To: Harry Wentland
  Cc: Arnd Bergmann, Leo Li, Alex Deucher, Christian König,
	Pan, Xinhui, Nathan Chancellor, Guenter Roeck,
	Linux Kernel Mailing List, llvm, Nick Desaulniers, amd-gfx

On Tue, Sep 7, 2021 at 8:52 PM Harry Wentland <harry.wentland@amd.com> wrote:
>
> Attached patches fix these x86_64 ones reported by Nick:

Hmm.

You didn't seem to fix up the calling convention for print__xyz(),
which still take those xyz structs as pass-by-value.

Obviously it would be good to do things incrementally, so if that
attached patch was just [1/N] I won't complain..

> I'm also seeing one more that might be more challenging to fix but is nearly at 1024:
>
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn21/display_mode_vba_21.c:3397:6: error: stack frame size of 1064 bytes in function 'dml21_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than=]

Oh Gods, that function is truly something else..

Is there some reason why it's one humongous function, with the
occasional single-line comment?

Because it really looks to me like pretty much everywhere I see one of
those rare comments, I would go "this part should be a function of its
own", and then there would be one caller fuynction that just calls
each of those sub-functions one after the other.

That would - I think - make the code easier to read, and then it would
also make it very obvious where it magically uses a lot of stack.

My suspicion is actually "nowhere". The stack use is just hugely
spread out, and the compiler has just kept accumulating more spill
variables on the frame with no single big reason.

Yes, I see a couple of local structures:

                Pipe myPipe;
                HostVM myHostVM;

but more than that I see several function calls that have basically 62
arguments. And I wish I was making that number up. I'm not. That
"CalculatePrefetchSchedule()" call literally has 62 arguments.

But *all* of the top-level loops in that function literally look like
they could - and should - be functions in their own right. Some of
them would be fairly complex even so (ie that code under the comment

        //Prefetch Check

would be quite the big function all of its own.

We have a coding style thing:

    Documentation/process/coding-style.rst

that says that you should strive to have functions that are "short and
sweet" and fit on one or two screenfuls of text.

That one function from hell is 1832 lines of code.

It really could be improved upon.

               Linus

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Enable '-Werror' by default for all kernel builds
  2021-09-08  4:41               ` Linus Torvalds
@ 2021-09-09  0:48                 ` Harry Wentland
  0 siblings, 0 replies; 16+ messages in thread
From: Harry Wentland @ 2021-09-09  0:48 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Arnd Bergmann, Leo Li, Alex Deucher, Christian König,
	Pan, Xinhui, Nathan Chancellor, Guenter Roeck,
	Linux Kernel Mailing List, llvm, Nick Desaulniers, amd-gfx



On 2021-09-08 12:41 a.m., Linus Torvalds wrote:
> On Tue, Sep 7, 2021 at 8:52 PM Harry Wentland <harry.wentland@amd.com> wrote:
>>
>> Attached patches fix these x86_64 ones reported by Nick:
> 
> Hmm.
> 
> You didn't seem to fix up the calling convention for print__xyz(),
> which still take those xyz structs as pass-by-value.
> 
> Obviously it would be good to do things incrementally, so if that
> attached patch was just [1/N] I won't complain..
> 

You're right. I was focussed on the stack frame limit but fixed up
the rest as well now and sent the series out.

https://lkml.org/lkml/2021/9/8/933

>> I'm also seeing one more that might be more challenging to fix but is nearly at 1024:
>>
>> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn21/display_mode_vba_21.c:3397:6: error: stack frame size of 1064 bytes in function 'dml21_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than=]
> 
> Oh Gods, that function is truly something else..
> 
> Is there some reason why it's one humongous function, with the
> occasional single-line comment?
> 
> Because it really looks to me like pretty much everywhere I see one of
> those rare comments, I would go "this part should be a function of its
> own", and then there would be one caller fuynction that just calls
> each of those sub-functions one after the other.
> 

Yeah, that's what I'm thinking as well. It would likely fix the stack
size, even without dynamically allocating the two structs you mention
below.

> That would - I think - make the code easier to read, and then it would
> also make it very obvious where it magically uses a lot of stack.
> 
> My suspicion is actually "nowhere". The stack use is just hugely
> spread out, and the compiler has just kept accumulating more spill
> variables on the frame with no single big reason.
> 
> Yes, I see a couple of local structures:
> 
>                 Pipe myPipe;
>                 HostVM myHostVM;
> 
> but more than that I see several function calls that have basically 62
> arguments. And I wish I was making that number up. I'm not. That
> "CalculatePrefetchSchedule()" call literally has 62 arguments.
> 
> But *all* of the top-level loops in that function literally look like
> they could - and should - be functions in their own right. Some of
> them would be fairly complex even so (ie that code under the comment
> 
>         //Prefetch Check
> 
> would be quite the big function all of its own.
> 
> We have a coding style thing:
> 
>     Documentation/process/coding-style.rst
> 
> that says that you should strive to have functions that are "short and
> sweet" and fit on one or two screenfuls of text.
> 
> That one function from hell is 1832 lines of code.
> 
> It really could be improved upon.
> 

Absolutely.

The file comes with an (easy to miss) disclaimer that it is
"gcc-parseable HW gospel":
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20.c?h=v5.14#n32

The display_mode_vba* stuff deals with bandwidth formulas from HW
designers. At some point in the past we attempted to convert them
to something more readable and elegant but would often run into
difficulties getting support from the right people when things
wouldn't work. Using the HW designer's code directly tends to
short circuit any arguments about SW correctness.

In short, I don't really like this code but it works. It helps
prevent black screens and underflows on the display.

We try to follow the coding-style.rst for the most part elsewhere,
though there are still plenty of areas where we can improve.

Harry

>                Linus
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Enable '-Werror' by default for all kernel builds
       [not found]           ` <YTkyIAevt7XOd+8j@elver.google.com>
@ 2021-09-09  5:58             ` Christoph Hellwig
  2021-09-09  6:07               ` Guenter Roeck
                                 ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Christoph Hellwig @ 2021-09-09  5:58 UTC (permalink / raw)
  To: Marco Elver
  Cc: Guenter Roeck, Nathan Chancellor, Arnd Bergmann, Linus Torvalds,
	Linux Kernel Mailing List, llvm, Nick Desaulniers, Paul Walmsley,
	Palmer Dabbelt, Albert Ou, linux-riscv, Andrey Ryabinin,
	Alexander Potapenko, Dmitry Vyukov, Andrey Konovalov, kasan-dev,
	Christian =?unknown-8bit?B?S8O2bmln?=, Pan, Xinhui, amd-gfx

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=unknown-8bit, Size: 949 bytes --]

On Wed, Sep 08, 2021 at 11:58:56PM +0200, Marco Elver wrote:
> It'd be good to avoid. It has helped uncover build issues with KASAN in
> the past. Or at least make it dependent on the problematic architecture.
> For example if arm is a problem, something like this:

I'm also seeing quite a few stack size warnings with KASAN on x86_64
without COMPILT_TEST using gcc 10.2.1 from Debian.  In fact there are a
few warnings without KASAN, but with KASAN there are a lot more.
I'll try to find some time to dig into them.

While we're at it, with -Werror something like this is really futile:

drivers/gpu/drm/amd/amdgpu/amdgpu_object.c: In function ‘amdgpu_bo_support_uswc’:
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:493:2: warning: #warning
Please enable CONFIG_MTRR and CONFIG_X86_PAT for better performance thanks to write-combining [-Wcpp
  493 | #warning Please enable CONFIG_MTRR and CONFIG_X86_PAT for better performance \
      |  ^~~~~~~

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Enable '-Werror' by default for all kernel builds
  2021-09-09  5:58             ` Christoph Hellwig
@ 2021-09-09  6:07               ` Guenter Roeck
  2021-09-09  7:30                 ` Christian König
  2021-09-09 10:53               ` Marco Elver
  2021-09-09 16:46               ` Linus Torvalds
  2 siblings, 1 reply; 16+ messages in thread
From: Guenter Roeck @ 2021-09-09  6:07 UTC (permalink / raw)
  To: Christoph Hellwig, Marco Elver
  Cc: Nathan Chancellor, Arnd Bergmann, Linus Torvalds,
	Linux Kernel Mailing List, llvm, Nick Desaulniers, Paul Walmsley,
	Palmer Dabbelt, Albert Ou, linux-riscv, Andrey Ryabinin,
	Alexander Potapenko, Dmitry Vyukov, Andrey Konovalov, kasan-dev,
	Christian =?unknown-8bit?B?S8O2bmln?=, Pan, Xinhui, amd-gfx

On 9/8/21 10:58 PM, Christoph Hellwig wrote:
> On Wed, Sep 08, 2021 at 11:58:56PM +0200, Marco Elver wrote:
>> It'd be good to avoid. It has helped uncover build issues with KASAN in
>> the past. Or at least make it dependent on the problematic architecture.
>> For example if arm is a problem, something like this:
> 
> I'm also seeing quite a few stack size warnings with KASAN on x86_64
> without COMPILT_TEST using gcc 10.2.1 from Debian.  In fact there are a
> few warnings without KASAN, but with KASAN there are a lot more.
> I'll try to find some time to dig into them.
> 
> While we're at it, with -Werror something like this is really futile:
> 
> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c: In function ‘amdgpu_bo_support_uswc’:
> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:493:2: warning: #warning
> Please enable CONFIG_MTRR and CONFIG_X86_PAT for better performance thanks to write-combining [-Wcpp
>    493 | #warning Please enable CONFIG_MTRR and CONFIG_X86_PAT for better performance \
>        |  ^~~~~~~
> 

I have been wondering if all those #warning "errors" should either
be removed or be replaced with "#pragma message".

Guenter

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Enable '-Werror' by default for all kernel builds
  2021-09-09  6:07               ` Guenter Roeck
@ 2021-09-09  7:30                 ` Christian König
  2021-09-09 14:59                   ` Guenter Roeck
  0 siblings, 1 reply; 16+ messages in thread
From: Christian König @ 2021-09-09  7:30 UTC (permalink / raw)
  To: Guenter Roeck, Christoph Hellwig, Marco Elver
  Cc: Nathan Chancellor, Arnd Bergmann, Linus Torvalds,
	Linux Kernel Mailing List, llvm, Nick Desaulniers, Paul Walmsley,
	Palmer Dabbelt, Albert Ou, linux-riscv, Andrey Ryabinin,
	Alexander Potapenko, Dmitry Vyukov, Andrey Konovalov, kasan-dev,
	Pan, Xinhui, amd-gfx

Am 09.09.21 um 08:07 schrieb Guenter Roeck:
> On 9/8/21 10:58 PM, Christoph Hellwig wrote:
>> On Wed, Sep 08, 2021 at 11:58:56PM +0200, Marco Elver wrote:
>>> It'd be good to avoid. It has helped uncover build issues with KASAN in
>>> the past. Or at least make it dependent on the problematic 
>>> architecture.
>>> For example if arm is a problem, something like this:
>>
>> I'm also seeing quite a few stack size warnings with KASAN on x86_64
>> without COMPILT_TEST using gcc 10.2.1 from Debian.  In fact there are a
>> few warnings without KASAN, but with KASAN there are a lot more.
>> I'll try to find some time to dig into them.
>>
>> While we're at it, with -Werror something like this is really futile:
>>
>> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c: In function 
>> ‘amdgpu_bo_support_uswc’:
>> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:493:2: warning: #warning
>> Please enable CONFIG_MTRR and CONFIG_X86_PAT for better performance 
>> thanks to write-combining [-Wcpp
>>    493 | #warning Please enable CONFIG_MTRR and CONFIG_X86_PAT for 
>> better performance \
>>        |  ^~~~~~~

Ah, yes good point!

>
> I have been wondering if all those #warning "errors" should either
> be removed or be replaced with "#pragma message".

Well we started to add those warnings because people compiled their 
kernel with CONFIG_MTRR and CONFIG_X86_PAT and was then wondering why 
the performance of the display driver was so crappy.

When those warning now generate an error which you have to disable 
explicitly then that might not be bad at all.

It at least points people to this setting and makes it really clear that 
they are doing something very unusual and need to keep in mind that it 
might not have the desired result.

Regards,
Christian.

>
> Guenter


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Enable '-Werror' by default for all kernel builds
  2021-09-09  5:58             ` Christoph Hellwig
  2021-09-09  6:07               ` Guenter Roeck
@ 2021-09-09 10:53               ` Marco Elver
  2021-09-09 11:00                 ` Arnd Bergmann
  2021-09-09 16:46               ` Linus Torvalds
  2 siblings, 1 reply; 16+ messages in thread
From: Marco Elver @ 2021-09-09 10:53 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Guenter Roeck, Nathan Chancellor, Arnd Bergmann, Linus Torvalds,
	Linux Kernel Mailing List, llvm, Nick Desaulniers, Paul Walmsley,
	Palmer Dabbelt, Albert Ou, linux-riscv, Andrey Ryabinin,
	Alexander Potapenko, Dmitry Vyukov, Andrey Konovalov, kasan-dev,
	Christian König, Pan, Xinhui, amd-gfx

On Thu, 9 Sept 2021 at 07:59, Christoph Hellwig <hch@infradead.org> wrote:
> On Wed, Sep 08, 2021 at 11:58:56PM +0200, Marco Elver wrote:
> > It'd be good to avoid. It has helped uncover build issues with KASAN in
> > the past. Or at least make it dependent on the problematic architecture.
> > For example if arm is a problem, something like this:
>
> I'm also seeing quite a few stack size warnings with KASAN on x86_64
> without COMPILT_TEST using gcc 10.2.1 from Debian.  In fact there are a
> few warnings without KASAN, but with KASAN there are a lot more.
> I'll try to find some time to dig into them.

Right, this reminded me that we actually at least double the real
stack size for KASAN builds, because it inherently requires more stack
space. I think we need Wframe-larger-than to match that, otherwise
we'll just keep having this problem:

https://lkml.kernel.org/r/20210909104925.809674-1-elver@google.com

> While we're at it, with -Werror something like this is really futile:
>
> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c: In function ‘amdgpu_bo_support_uswc’:
> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:493:2: warning: #warning
> Please enable CONFIG_MTRR and CONFIG_X86_PAT for better performance thanks to write-combining [-Wcpp
>   493 | #warning Please enable CONFIG_MTRR and CONFIG_X86_PAT for better performance \
>       |  ^~~~~~~

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Enable '-Werror' by default for all kernel builds
  2021-09-09 10:53               ` Marco Elver
@ 2021-09-09 11:00                 ` Arnd Bergmann
  2021-09-09 11:43                   ` Marco Elver
  0 siblings, 1 reply; 16+ messages in thread
From: Arnd Bergmann @ 2021-09-09 11:00 UTC (permalink / raw)
  To: Marco Elver
  Cc: Christoph Hellwig, Guenter Roeck, Nathan Chancellor,
	Linus Torvalds, Linux Kernel Mailing List, llvm, Nick Desaulniers,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv,
	Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov,
	Andrey Konovalov, kasan-dev, Christian König, Pan, Xinhui,
	amd-gfx list

On Thu, Sep 9, 2021 at 12:54 PM Marco Elver <elver@google.com> wrote:
> On Thu, 9 Sept 2021 at 07:59, Christoph Hellwig <hch@infradead.org> wrote:
> > On Wed, Sep 08, 2021 at 11:58:56PM +0200, Marco Elver wrote:
> > > It'd be good to avoid. It has helped uncover build issues with KASAN in
> > > the past. Or at least make it dependent on the problematic architecture.
> > > For example if arm is a problem, something like this:
> >
> > I'm also seeing quite a few stack size warnings with KASAN on x86_64
> > without COMPILT_TEST using gcc 10.2.1 from Debian.  In fact there are a
> > few warnings without KASAN, but with KASAN there are a lot more.
> > I'll try to find some time to dig into them.
>
> Right, this reminded me that we actually at least double the real
> stack size for KASAN builds, because it inherently requires more stack
> space. I think we need Wframe-larger-than to match that, otherwise
> we'll just keep having this problem:
>
> https://lkml.kernel.org/r/20210909104925.809674-1-elver@google.com

The problem with this is that it completely defeats the point of the
stack size warnings in allmodconfig kernels when they have KASAN
enabled and end up missing obvious code bugs in drivers that put
large structures on the stack. Let's not go there.

        Arnd

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Enable '-Werror' by default for all kernel builds
  2021-09-09 11:00                 ` Arnd Bergmann
@ 2021-09-09 11:43                   ` Marco Elver
  2021-09-09 12:55                     ` Arnd Bergmann
  2021-09-09 16:53                     ` Linus Torvalds
  0 siblings, 2 replies; 16+ messages in thread
From: Marco Elver @ 2021-09-09 11:43 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Christoph Hellwig, Guenter Roeck, Nathan Chancellor,
	Linus Torvalds, Linux Kernel Mailing List, llvm, Nick Desaulniers,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv,
	Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov,
	Andrey Konovalov, kasan-dev, Christian König, Pan, Xinhui,
	amd-gfx list

On Thu, 9 Sept 2021 at 13:00, Arnd Bergmann <arnd@kernel.org> wrote:
> On Thu, Sep 9, 2021 at 12:54 PM Marco Elver <elver@google.com> wrote:
> > On Thu, 9 Sept 2021 at 07:59, Christoph Hellwig <hch@infradead.org> wrote:
> > > On Wed, Sep 08, 2021 at 11:58:56PM +0200, Marco Elver wrote:
> > > > It'd be good to avoid. It has helped uncover build issues with KASAN in
> > > > the past. Or at least make it dependent on the problematic architecture.
> > > > For example if arm is a problem, something like this:
> > >
> > > I'm also seeing quite a few stack size warnings with KASAN on x86_64
> > > without COMPILT_TEST using gcc 10.2.1 from Debian.  In fact there are a
> > > few warnings without KASAN, but with KASAN there are a lot more.
> > > I'll try to find some time to dig into them.
> >
> > Right, this reminded me that we actually at least double the real
> > stack size for KASAN builds, because it inherently requires more stack
> > space. I think we need Wframe-larger-than to match that, otherwise
> > we'll just keep having this problem:
> >
> > https://lkml.kernel.org/r/20210909104925.809674-1-elver@google.com
>
> The problem with this is that it completely defeats the point of the
> stack size warnings in allmodconfig kernels when they have KASAN
> enabled and end up missing obvious code bugs in drivers that put
> large structures on the stack. Let's not go there.

Sure, but the reality is that the real stack size is already doubled
for KASAN. And that should be reflected in Wframe-larger-than.

Either that, or we just have to live with the occasional warning (that
is likely benign). But with WERROR we're now forced to make the
defaults as sane as possible. If the worry is allmodconfig, maybe we
do have to make KASAN dependent on !COMPILE_TEST, even though that's
not great either because it has caught real issues in the past (it'll
also mean doing the same for all other instrumentation-based tools,
like KCSAN, UBSAN, etc.).

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Enable '-Werror' by default for all kernel builds
  2021-09-09 11:43                   ` Marco Elver
@ 2021-09-09 12:55                     ` Arnd Bergmann
  2021-09-09 16:53                     ` Linus Torvalds
  1 sibling, 0 replies; 16+ messages in thread
From: Arnd Bergmann @ 2021-09-09 12:55 UTC (permalink / raw)
  To: Marco Elver
  Cc: Christoph Hellwig, Guenter Roeck, Nathan Chancellor,
	Linus Torvalds, Linux Kernel Mailing List, llvm, Nick Desaulniers,
	Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv,
	Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov,
	Andrey Konovalov, kasan-dev, Christian König, Pan, Xinhui,
	amd-gfx list

On Thu, Sep 9, 2021 at 1:43 PM Marco Elver <elver@google.com> wrote:
> On Thu, 9 Sept 2021 at 13:00, Arnd Bergmann <arnd@kernel.org> wrote:
> > On Thu, Sep 9, 2021 at 12:54 PM Marco Elver <elver@google.com> wrote:
> > > On Thu, 9 Sept 2021 at 07:59, Christoph Hellwig <hch@infradead.org> wrote:
> > > > On Wed, Sep 08, 2021 at 11:58:56PM +0200, Marco Elver wrote:
> > > > > It'd be good to avoid. It has helped uncover build issues with KASAN in
> > > > > the past. Or at least make it dependent on the problematic architecture.
> > > > > For example if arm is a problem, something like this:
> > > >
> > > > I'm also seeing quite a few stack size warnings with KASAN on x86_64
> > > > without COMPILT_TEST using gcc 10.2.1 from Debian.  In fact there are a
> > > > few warnings without KASAN, but with KASAN there are a lot more.
> > > > I'll try to find some time to dig into them.
> > >
> > > Right, this reminded me that we actually at least double the real
> > > stack size for KASAN builds, because it inherently requires more stack
> > > space. I think we need Wframe-larger-than to match that, otherwise
> > > we'll just keep having this problem:
> > >
> > > https://lkml.kernel.org/r/20210909104925.809674-1-elver@google.com
> >
> > The problem with this is that it completely defeats the point of the
> > stack size warnings in allmodconfig kernels when they have KASAN
> > enabled and end up missing obvious code bugs in drivers that put
> > large structures on the stack. Let's not go there.
>
> Sure, but the reality is that the real stack size is already doubled
> for KASAN. And that should be reflected in Wframe-larger-than.

I don't think "double" is an accurate description of what is going on,
it's much more complex than this. There are some functions
that completely explode with KASAN_STACK enabled on clang,
and many other functions instances that don't grow much at all.

I've been building randconfig kernels for a long time with KASAN_STACK
enabled on gcc, and the limit increased to 1440 bytes for 32-bit
and not increased beyond the normal 2048 bytes for 64-bit. I have
some patches to address the outliers and should go through and
resend some of those.

With the same limits and patches using clang, and KASAN=y but
KASAN_STACK=n I also get no warnings in randconfig builds,
but KASAN_STACK on clang doesn't really seem to have a good
limit that would make an allmodconfig kernel build with no warnings.

These are the worst offenders I see based on configuration, using
an 32-bit ARM allmodconfig with my fixups:

gcc-11, KASAN, no KASAN_STACK, FRAME_WARN=1024:
(nothing)

gcc-11, KASAN_STACK:
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c:782:1:
warning: the frame size of 1416 bytes is larger than 1024 bytes
[-Wframe-larger-than=]
drivers/media/dvb-frontends/mxl5xx.c:1575:1: warning: the frame size
of 1240 bytes is larger than 1024 bytes [-Wframe-larger-than=]
drivers/mtd/nftlcore.c:468:1: warning: the frame size of 1232 bytes is
larger than 1024 bytes [-Wframe-larger-than=]
drivers/char/ipmi/ipmi_msghandler.c:4880:1: warning: the frame size of
1232 bytes is larger than 1024 bytes [-Wframe-larger-than=]
drivers/mtd/chips/cfi_cmdset_0001.c:1870:1: warning: the frame size of
1224 bytes is larger than 1024 bytes [-Wframe-larger-than=]
drivers/net/wireless/ath/ath9k/ar9003_paprd.c:749:1: warning: the
frame size of 1216 bytes is larger than 1024 bytes
[-Wframe-larger-than=]
drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c:136:1: warning:
the frame size of 1216 bytes is larger than 1024 bytes
[-Wframe-larger-than=]
drivers/ntb/hw/idt/ntb_hw_idt.c:1116:1: warning: the frame size of
1200 bytes is larger than 1024 bytes [-Wframe-larger-than=]
net/dcb/dcbnl.c:1172:1: warning: the frame size of 1192 bytes is
larger than 1024 bytes [-Wframe-larger-than=]
fs/select.c:1042:1: warning: the frame size of 1192 bytes is larger
than 1024 bytes [-Wframe-larger-than=]

clang-12 KASAN, no KASAN_STACK, FRAME_WARN=1024:

kernel/trace/trace_events_hist.c:4601:13: error: stack frame size 1384
exceeds limit 1024 in function 'hist_trigger_print_key'
[-Werror,-Wframe-larger-than]
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dce_calcs.c:3045:6:
error: stack frame size 1384 exceeds limit 1024 in function 'bw_calcs'
[-Werror,-Wframe-larger-than]
drivers/staging/fbtft/fbtft-core.c:992:5: error: stack frame size 1208
exceeds limit 1024 in function 'fbtft_init_display'
[-Werror,-Wframe-larger-than]
crypto/wp512.c:782:13: error: stack frame size 1176 exceeds limit 1024
in function 'wp512_process_buffer' [-Werror,-Wframe-larger-than]
drivers/staging/fbtft/fbtft-core.c:902:12: error: stack frame size
1080 exceeds limit 1024 in function 'fbtft_init_display_from_property'
[-Werror,-Wframe-larger-than]
drivers/mtd/chips/cfi_cmdset_0001.c:1872:12: error: stack frame size
1064 exceeds limit 1024 in function 'cfi_intelext_writev'
[-Werror,-Wframe-larger-than]
drivers/staging/rtl8723bs/core/rtw_security.c:1288:5: error: stack
frame size 1040 exceeds limit 1024 in function 'rtw_aes_decrypt'
[-Werror,-Wframe-larger-than]
drivers/ntb/hw/idt/ntb_hw_idt.c:1041:27: error: stack frame size 1032
exceeds limit 1024 in function 'idt_scan_mws'
[-Werror,-Wframe-larger-than]

clang-12, KASAN_STACK:

drivers/infiniband/hw/ocrdma/ocrdma_stats.c:686:16: error: stack frame
size 20608 exceeds limit 1024 in function 'ocrdma_dbgfs_ops_read'
[-Werror,-Wframe-larger-than]
lib/bitfield_kunit.c:60:20: error: stack frame size 10336 exceeds
limit 10240 in function 'test_bitfields_constants'
[-Werror,-Wframe-larger-than]
drivers/net/wireless/ralink/rt2x00/rt2800lib.c:9012:13: error: stack
frame size 9952 exceeds limit 1024 in function 'rt2800_init_rfcsr'
[-Werror,-Wframe-larger-than]
drivers/net/usb/r8152.c:7486:13: error: stack frame size 8768 exceeds
limit 1024 in function 'r8156b_hw_phy_cfg'
[-Werror,-Wframe-larger-than]
drivers/media/dvb-frontends/nxt200x.c:915:12: error: stack frame size
8192 exceeds limit 1024 in function 'nxt2004_init'
[-Werror,-Wframe-larger-than]
drivers/net/wan/slic_ds26522.c:203:12: error: stack frame size 8064
exceeds limit 1024 in function 'slic_ds26522_probe'
[-Werror,-Wframe-larger-than]
drivers/firmware/broadcom/bcm47xx_sprom.c:188:13: error: stack frame
size 8064 exceeds limit 1024 in function 'bcm47xx_sprom_fill_auto'
[-Werror,-Wframe-larger-than]
drivers/media/dvb-frontends/drxd_hard.c:2857:12: error: stack frame
size 7584 exceeds limit 1024 in function 'drxd_set_frontend'
[-Werror,-Wframe-larger-than]
drivers/media/dvb-frontends/nxt200x.c:519:12: error: stack frame size
6848 exceeds limit 1024 in function
'nxt200x_setup_frontend_parameters' [-Werror,-Wframe-larger-than]
drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c:17019:13:
error: stack frame size 6560 exceeds limit 1024 in function
'wlc_phy_workarounds_nphy' [-Werror,-Wframe-larger-than]

> Either that, or we just have to live with the occasional warning (that
> is likely benign). But with WERROR we're now forced to make the
> defaults as sane as possible. If the worry is allmodconfig, maybe we
> do have to make KASAN dependent on !COMPILE_TEST, even though that's
> not great either because it has caught real issues in the past (it'll
> also mean doing the same for all other instrumentation-based tools,
> like KCSAN, UBSAN, etc.).

I would prefer going back to marking KASAN_STACK as broken on clang, it does
not seem like the warnings on the symbol were enough to stop people from
attempting to using it, and the remaining warnings seem fixable with a small
increase of the FRAME_WARN when using KASAN with clang but no KASAN_STACK,
or when using KASAN_STACK with gcc.

      Arnd

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Enable '-Werror' by default for all kernel builds
  2021-09-09  7:30                 ` Christian König
@ 2021-09-09 14:59                   ` Guenter Roeck
  0 siblings, 0 replies; 16+ messages in thread
From: Guenter Roeck @ 2021-09-09 14:59 UTC (permalink / raw)
  To: Christian König, Christoph Hellwig, Marco Elver
  Cc: Nathan Chancellor, Arnd Bergmann, Linus Torvalds,
	Linux Kernel Mailing List, llvm, Nick Desaulniers, Paul Walmsley,
	Palmer Dabbelt, Albert Ou, linux-riscv, Andrey Ryabinin,
	Alexander Potapenko, Dmitry Vyukov, Andrey Konovalov, kasan-dev,
	Pan, Xinhui, amd-gfx

On 9/9/21 12:30 AM, Christian König wrote:
> Am 09.09.21 um 08:07 schrieb Guenter Roeck:
>> On 9/8/21 10:58 PM, Christoph Hellwig wrote:
>>> On Wed, Sep 08, 2021 at 11:58:56PM +0200, Marco Elver wrote:
>>>> It'd be good to avoid. It has helped uncover build issues with KASAN in
>>>> the past. Or at least make it dependent on the problematic architecture.
>>>> For example if arm is a problem, something like this:
>>>
>>> I'm also seeing quite a few stack size warnings with KASAN on x86_64
>>> without COMPILT_TEST using gcc 10.2.1 from Debian.  In fact there are a
>>> few warnings without KASAN, but with KASAN there are a lot more.
>>> I'll try to find some time to dig into them.
>>>
>>> While we're at it, with -Werror something like this is really futile:
>>>
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c: In function ‘amdgpu_bo_support_uswc’:
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:493:2: warning: #warning
>>> Please enable CONFIG_MTRR and CONFIG_X86_PAT for better performance thanks to write-combining [-Wcpp
>>>    493 | #warning Please enable CONFIG_MTRR and CONFIG_X86_PAT for better performance \
>>>        |  ^~~~~~~
> 
> Ah, yes good point!
> 
>>
>> I have been wondering if all those #warning "errors" should either
>> be removed or be replaced with "#pragma message".
> 
> Well we started to add those warnings because people compiled their kernel with CONFIG_MTRR and CONFIG_X86_PAT and was then wondering why the performance of the display driver was so crappy.
> 
> When those warning now generate an error which you have to disable explicitly then that might not be bad at all.
> 
> It at least points people to this setting and makes it really clear that they are doing something very unusual and need to keep in mind that it might not have the desired result.
> 

That specific warning is surrounded with "#ifndef CONFIG_COMPILE_TEST"
so it doesn't really matter because it doesn't cause test build failures.
Of course, we could do the same for any #warning which does now
cause a test build failure.

Guenter

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Enable '-Werror' by default for all kernel builds
  2021-09-09  5:58             ` Christoph Hellwig
  2021-09-09  6:07               ` Guenter Roeck
  2021-09-09 10:53               ` Marco Elver
@ 2021-09-09 16:46               ` Linus Torvalds
  2 siblings, 0 replies; 16+ messages in thread
From: Linus Torvalds @ 2021-09-09 16:46 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Marco Elver, Guenter Roeck, Nathan Chancellor, Arnd Bergmann,
	Linux Kernel Mailing List, llvm, Nick Desaulniers, Paul Walmsley,
	Palmer Dabbelt, Albert Ou, linux-riscv, Andrey Ryabinin,
	Alexander Potapenko, Dmitry Vyukov, Andrey Konovalov, kasan-dev,
	Christian König, Pan, Xinhui, amd-gfx

On Wed, Sep 8, 2021 at 10:59 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> While we're at it, with -Werror something like this is really futile:

Yeah, I'm thinking we could do

 -Wno-error=cpp

to at least allow the cpp warnings to come through without being fatal.

Because while they can be annoying too, they are most definitely under
our direct control, so..

I didn't actually test that, but I think it should work.

That said, maybe they should just be removed. They might be better off
just as Kconfig rules, rather than as a "hey, you screwed up your
Kconfig" warning after the fact.

             Linus

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Enable '-Werror' by default for all kernel builds
  2021-09-09 11:43                   ` Marco Elver
  2021-09-09 12:55                     ` Arnd Bergmann
@ 2021-09-09 16:53                     ` Linus Torvalds
  1 sibling, 0 replies; 16+ messages in thread
From: Linus Torvalds @ 2021-09-09 16:53 UTC (permalink / raw)
  To: Marco Elver
  Cc: Arnd Bergmann, Christoph Hellwig, Guenter Roeck,
	Nathan Chancellor, Linux Kernel Mailing List, llvm,
	Nick Desaulniers, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	linux-riscv, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov,
	Andrey Konovalov, kasan-dev, Christian König, Pan, Xinhui,
	amd-gfx list

On Thu, Sep 9, 2021 at 4:43 AM Marco Elver <elver@google.com> wrote:
>
> Sure, but the reality is that the real stack size is already doubled
> for KASAN. And that should be reflected in Wframe-larger-than.

I don't think that's true.

Quite the reverse, in fact.

Yes, the *dynamic* stack size is doubled due to KASAN, because it will
cause much deeper callchains.

But the individual frames don't grow that much apart from compilers
doing stupid things (ie apparently clang and KASAN_STACK), and if
anything, the deeper dynamic call chains means that the individual
frame size being small is even *more* important, but we do compensate
for the deeper stacks by making THREAD_SIZE_ORDER bigger at least on
x86.

Honestly, I am not even happy with the current "2048 bytes for
64-bit". The excuse has been that 64-bit needs more stack, but all it
ever did was clearly to just allow people to just do bad things.

Because a 1kB stack frame is horrendous even in 64-bit. That's not
"spill some registers" kind of stack frame. That's "put a big
structure on the stack" kind of stack frame regardless of any other
issues.

And no, "but we have 16kB of stack and we'll switch stacks on
interrupts" is not an excuse for one single level to use up 1kB, much
less 2kB.  Does anybody seriously believe that we don't quite normally
have stacks that are easily tens of frames deep?

Without having some true "this is the full callchain" information, the
best we can do is just limit individual stack frames. And 2kB is
*excessive*.

                     Linus

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Enable '-Werror' by default for all kernel builds
       [not found]       ` <YTkjJPCdR1VGaaVm@archlinux-ax161>
       [not found]         ` <75a10e8b-9f11-64c4-460b-9f3ac09965e2@roeck-us.net>
@ 2021-09-21 15:41         ` Arnd Bergmann
  1 sibling, 0 replies; 16+ messages in thread
From: Arnd Bergmann @ 2021-09-21 15:41 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Linus Torvalds, Guenter Roeck, Linux Kernel Mailing List, llvm,
	Nick Desaulniers, Paul Walmsley, Palmer Dabbelt, Albert Ou,
	linux-riscv, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov,
	Andrey Konovalov, kasan-dev, Harry Wentland, Alex Deucher,
	Christian König, xinhui pan, amd-gfx list

On Wed, Sep 8, 2021 at 10:55 PM Nathan Chancellor <nathan@kernel.org> wrote:
> On Tue, Sep 07, 2021 at 11:11:17AM +0200, Arnd Bergmann wrote:
> > On Tue, Sep 7, 2021 at 4:32 AM Nathan Chancellor <nathan@kernel.org> wrote:
function 'rtw_aes_decrypt' [-Werror,-Wframe-larger-than]
> > > arm32-fedora.log: drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dce_calcs.c:3043:6: error: stack frame size (1376) exceeds limit (1024) in function 'bw_calcs' [-Werror,-Wframe-larger-than]
> > > arm32-fedora.log: drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dce_calcs.c:77:13: error: stack frame size (5384) exceeds limit (1024) in function 'calculate_bandwidth' [-Werror,-Wframe-larger-than]
> > >
> > > Aside from the dce_calcs.c warnings, these do not seem too bad. I
> > > believe allmodconfig turns on UBSAN but it could also be aggressive
> > > inlining by clang. I intend to look at all -Wframe-large-than warnings
> > > closely later.
> >
> > I've had them close to zero in the past, but a couple of new ones came in.
> >
> > The amdgpu ones are probably not fixable unless they stop using 64-bit
> > floats in the kernel for
> > random calculations. The crypto/* ones tend to be compiler bugs, but hard to fix
>
> I have started taking a look at these. Most of the allmodconfig ones
> appear to be related to CONFIG_KASAN, which is now supported for
> CONFIG_ARM.
>
> The two in bpmp-debugfs.c appear regardless of CONFIG_KASAN and it turns
> out that you actually submitted a patch for these:
>
> https://lore.kernel.org/r/20201204193714.3134651-1-arnd@kernel.org/
>
> Is it worth resending or pinging that?

I'm now restarting from a clean tree for my randconfig patches to see which
ones are actually needed, will hopefully get to that.

> The dce_calcs.c ones also appear without CONFIG_KASAN, which you noted
> is probably unavoidable.

(adding amdgpu folks to Cc here)

Harry Wentland did a nice rework for dcn_calcs.c that should also be
portable to dce_calcs.c, I hope that he will be able to get to that as well.

Looking at my older patches now, I found that I had only suppressed that one
and given up fixing it, but I did put my analysis into
https://bugs.llvm.org/show_bug.cgi?id=42551, which should be helpful
for addressing it in either the kernel or the compiler.

        Arnd

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2021-09-21 15:42 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20210906142615.GA1917503@roeck-us.net>
     [not found] ` <CAHk-=wgjTePY1v_D-jszz4NrpTso0CdvB9PcdroPS=TNU1oZMQ@mail.gmail.com>
     [not found]   ` <YTbOs13waorzamZ6@Ryzen-9-3900X.localdomain>
     [not found]     ` <CAK8P3a3_Tdc-XVPXrJ69j3S9048uzmVJGrNcvi0T6yr6OrHkPw@mail.gmail.com>
     [not found]       ` <CAHk-=wgZkQ+eZ02TaCpAWo_ffiLMwA2tYNHyL+B1dQ4YB0qfmA@mail.gmail.com>
2021-09-07 17:33         ` [PATCH] Enable '-Werror' by default for all kernel builds Linus Torvalds
2021-09-07 21:07           ` Harry Wentland
2021-09-08  3:52             ` Harry Wentland
2021-09-08  4:41               ` Linus Torvalds
2021-09-09  0:48                 ` Harry Wentland
     [not found]       ` <YTkjJPCdR1VGaaVm@archlinux-ax161>
     [not found]         ` <75a10e8b-9f11-64c4-460b-9f3ac09965e2@roeck-us.net>
     [not found]           ` <YTkyIAevt7XOd+8j@elver.google.com>
2021-09-09  5:58             ` Christoph Hellwig
2021-09-09  6:07               ` Guenter Roeck
2021-09-09  7:30                 ` Christian König
2021-09-09 14:59                   ` Guenter Roeck
2021-09-09 10:53               ` Marco Elver
2021-09-09 11:00                 ` Arnd Bergmann
2021-09-09 11:43                   ` Marco Elver
2021-09-09 12:55                     ` Arnd Bergmann
2021-09-09 16:53                     ` Linus Torvalds
2021-09-09 16:46               ` Linus Torvalds
2021-09-21 15:41         ` Arnd Bergmann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox