linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] drm/amd/display: Fix unsafe uses of kernel mode FPU
@ 2025-10-02 21:00 Ard Biesheuvel
  2025-10-06 17:28 ` Alex Deucher
  2025-10-06 17:42 ` Christian König
  0 siblings, 2 replies; 6+ messages in thread
From: Ard Biesheuvel @ 2025-10-02 21:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arm-kernel, will, catalin.marinas, mark.rutland, broonie,
	Ard Biesheuvel, Austin Zheng, Jun Lei, Harry Wentland, Leo Li,
	Rodrigo Siqueira, Alex Deucher, Christian König, amd-gfx,
	dri-devel

From: Ard Biesheuvel <ardb@kernel.org>

The point of isolating code that uses kernel mode FPU in separate
compilation units is to ensure that even implicit uses of, e.g., SIMD
registers for spilling occur only in a context where this is permitted,
i.e., from inside a kernel_fpu_begin/end block.

This is important on arm64, which uses -mgeneral-regs-only to build all
kernel code, with the exception of such compilation units where FP or
SIMD registers are expected to be used. Given that the compiler may
invent uses of FP/SIMD anywhere in such a unit, none of its code may be
accessible from outside a kernel_fpu_begin/end block.

This means that all callers into such compilation units must use the
DC_FP start/end macros, which must not occur there themselves. For
robustness, all functions with external linkage that reside there should
call dc_assert_fp_enabled() to assert that the FPU context was set up
correctly.

Fix this for the DCN35, DCN351 and DCN36 implementations.

Cc: Austin Zheng <austin.zheng@amd.com>
Cc: Jun Lei <jun.lei@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Rodrigo Siqueira <siqueira@igalia.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 .../drm/amd/display/dc/dml/dcn31/dcn31_fpu.c    |  4 ++++
 .../drm/amd/display/dc/dml/dcn35/dcn35_fpu.c    |  6 ++++--
 .../drm/amd/display/dc/dml/dcn351/dcn351_fpu.c  |  4 ++--
 .../display/dc/resource/dcn35/dcn35_resource.c  | 16 +++++++++++++++-
 .../dc/resource/dcn351/dcn351_resource.c        | 17 ++++++++++++++++-
 .../display/dc/resource/dcn36/dcn36_resource.c  | 16 +++++++++++++++-
 6 files changed, 56 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c
index 17a21bcbde17..1a28061bb9ff 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c
@@ -808,6 +808,8 @@ void dcn316_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_param
 
 int dcn_get_max_non_odm_pix_rate_100hz(struct _vcs_dpi_soc_bounding_box_st *soc)
 {
+	dc_assert_fp_enabled();
+
 	return soc->clock_limits[0].dispclk_mhz * 10000.0 / (1.0 + soc->dcn_downspread_percent / 100.0);
 }
 
@@ -815,6 +817,8 @@ int dcn_get_approx_det_segs_required_for_pstate(
 		struct _vcs_dpi_soc_bounding_box_st *soc,
 		int pix_clk_100hz, int bpp, int seg_size_kb)
 {
+	dc_assert_fp_enabled();
+
 	/* Roughly calculate required crb to hide latency. In practice there is slightly
 	 * more buffer available for latency hiding
 	 */
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.c
index 5d73efa2f0c9..15a1d77dfe36 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.c
@@ -445,6 +445,8 @@ int dcn35_populate_dml_pipes_from_context_fpu(struct dc *dc,
 	bool upscaled = false;
 	const unsigned int max_allowed_vblank_nom = 1023;
 
+	dc_assert_fp_enabled();
+
 	dcn31_populate_dml_pipes_from_context(dc, context, pipes,
 					      validate_mode);
 
@@ -498,9 +500,7 @@ int dcn35_populate_dml_pipes_from_context_fpu(struct dc *dc,
 
 		pipes[pipe_cnt].pipe.src.unbounded_req_mode = false;
 
-		DC_FP_START();
 		dcn31_zero_pipe_dcc_fraction(pipes, pipe_cnt);
-		DC_FP_END();
 
 		pipes[pipe_cnt].pipe.dest.vfront_porch = timing->v_front_porch;
 		pipes[pipe_cnt].pipe.src.dcc_rate = 3;
@@ -581,6 +581,8 @@ void dcn35_decide_zstate_support(struct dc *dc, struct dc_state *context)
 	unsigned int i, plane_count = 0;
 	DC_LOGGER_INIT(dc->ctx->logger);
 
+	dc_assert_fp_enabled();
+
 	for (i = 0; i < dc->res_pool->pipe_count; i++) {
 		if (context->res_ctx.pipe_ctx[i].plane_state)
 			plane_count++;
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn351/dcn351_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn351/dcn351_fpu.c
index 6f516af82956..e5cfe73f640a 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn351/dcn351_fpu.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn351/dcn351_fpu.c
@@ -478,6 +478,8 @@ int dcn351_populate_dml_pipes_from_context_fpu(struct dc *dc,
 	bool upscaled = false;
 	const unsigned int max_allowed_vblank_nom = 1023;
 
+	dc_assert_fp_enabled();
+
 	dcn31_populate_dml_pipes_from_context(dc, context, pipes,
 					      validate_mode);
 
@@ -531,9 +533,7 @@ int dcn351_populate_dml_pipes_from_context_fpu(struct dc *dc,
 
 		pipes[pipe_cnt].pipe.src.unbounded_req_mode = false;
 
-		DC_FP_START();
 		dcn31_zero_pipe_dcc_fraction(pipes, pipe_cnt);
-		DC_FP_END();
 
 		pipes[pipe_cnt].pipe.dest.vfront_porch = timing->v_front_porch;
 		pipes[pipe_cnt].pipe.src.dcc_rate = 3;
diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.c
index 8475c6eec547..32678b66c410 100644
--- a/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.c
@@ -1760,6 +1760,20 @@ enum dc_status dcn35_patch_unknown_plane_state(struct dc_plane_state *plane_stat
 }
 
 
+static int populate_dml_pipes_from_context_fpu(struct dc *dc,
+					       struct dc_state *context,
+					       display_e2e_pipe_params_st *pipes,
+					       enum dc_validate_mode validate_mode)
+{
+	int ret;
+
+	DC_FP_START();
+	ret = dcn35_populate_dml_pipes_from_context_fpu(dc, context, pipes, validate_mode);
+	DC_FP_END();
+
+	return ret;
+}
+
 static struct resource_funcs dcn35_res_pool_funcs = {
 	.destroy = dcn35_destroy_resource_pool,
 	.link_enc_create = dcn35_link_encoder_create,
@@ -1770,7 +1784,7 @@ static struct resource_funcs dcn35_res_pool_funcs = {
 	.validate_bandwidth = dcn35_validate_bandwidth,
 	.calculate_wm_and_dlg = NULL,
 	.update_soc_for_wm_a = dcn31_update_soc_for_wm_a,
-	.populate_dml_pipes = dcn35_populate_dml_pipes_from_context_fpu,
+	.populate_dml_pipes = populate_dml_pipes_from_context_fpu,
 	.acquire_free_pipe_as_secondary_dpp_pipe = dcn20_acquire_free_pipe_for_layer,
 	.release_pipe = dcn20_release_pipe,
 	.add_stream_to_ctx = dcn30_add_stream_to_ctx,
diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn351/dcn351_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn351/dcn351_resource.c
index 0971c0f74186..677cee27589c 100644
--- a/drivers/gpu/drm/amd/display/dc/resource/dcn351/dcn351_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/resource/dcn351/dcn351_resource.c
@@ -1732,6 +1732,21 @@ static enum dc_status dcn351_validate_bandwidth(struct dc *dc,
 	return out ? DC_OK : DC_FAIL_BANDWIDTH_VALIDATE;
 }
 
+static int populate_dml_pipes_from_context_fpu(struct dc *dc,
+					       struct dc_state *context,
+					       display_e2e_pipe_params_st *pipes,
+					       enum dc_validate_mode validate_mode)
+{
+	int ret;
+
+	DC_FP_START();
+	ret = dcn351_populate_dml_pipes_from_context_fpu(dc, context, pipes, validate_mode);
+	DC_FP_END();
+
+	return ret;
+
+}
+
 static struct resource_funcs dcn351_res_pool_funcs = {
 	.destroy = dcn351_destroy_resource_pool,
 	.link_enc_create = dcn35_link_encoder_create,
@@ -1742,7 +1757,7 @@ static struct resource_funcs dcn351_res_pool_funcs = {
 	.validate_bandwidth = dcn351_validate_bandwidth,
 	.calculate_wm_and_dlg = NULL,
 	.update_soc_for_wm_a = dcn31_update_soc_for_wm_a,
-	.populate_dml_pipes = dcn351_populate_dml_pipes_from_context_fpu,
+	.populate_dml_pipes = populate_dml_pipes_from_context_fpu,
 	.acquire_free_pipe_as_secondary_dpp_pipe = dcn20_acquire_free_pipe_for_layer,
 	.release_pipe = dcn20_release_pipe,
 	.add_stream_to_ctx = dcn30_add_stream_to_ctx,
diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn36/dcn36_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn36/dcn36_resource.c
index 8bae7fcedc22..d81540515e5c 100644
--- a/drivers/gpu/drm/amd/display/dc/resource/dcn36/dcn36_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/resource/dcn36/dcn36_resource.c
@@ -1734,6 +1734,20 @@ static enum dc_status dcn35_validate_bandwidth(struct dc *dc,
 }
 
 
+static int populate_dml_pipes_from_context_fpu(struct dc *dc,
+					       struct dc_state *context,
+					       display_e2e_pipe_params_st *pipes,
+					       enum dc_validate_mode validate_mode)
+{
+	int ret;
+
+	DC_FP_START();
+	ret = dcn35_populate_dml_pipes_from_context_fpu(dc, context, pipes, validate_mode);
+	DC_FP_END();
+
+	return ret;
+}
+
 static struct resource_funcs dcn36_res_pool_funcs = {
 	.destroy = dcn36_destroy_resource_pool,
 	.link_enc_create = dcn35_link_encoder_create,
@@ -1744,7 +1758,7 @@ static struct resource_funcs dcn36_res_pool_funcs = {
 	.validate_bandwidth = dcn35_validate_bandwidth,
 	.calculate_wm_and_dlg = NULL,
 	.update_soc_for_wm_a = dcn31_update_soc_for_wm_a,
-	.populate_dml_pipes = dcn35_populate_dml_pipes_from_context_fpu,
+	.populate_dml_pipes = populate_dml_pipes_from_context_fpu,
 	.acquire_free_pipe_as_secondary_dpp_pipe = dcn20_acquire_free_pipe_for_layer,
 	.release_pipe = dcn20_release_pipe,
 	.add_stream_to_ctx = dcn30_add_stream_to_ctx,
-- 
2.51.0.618.g983fd99d29-goog



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm/amd/display: Fix unsafe uses of kernel mode FPU
  2025-10-02 21:00 [PATCH] drm/amd/display: Fix unsafe uses of kernel mode FPU Ard Biesheuvel
@ 2025-10-06 17:28 ` Alex Deucher
  2025-10-06 17:42 ` Christian König
  1 sibling, 0 replies; 6+ messages in thread
From: Alex Deucher @ 2025-10-06 17:28 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-kernel, linux-arm-kernel, will, catalin.marinas,
	mark.rutland, broonie, Ard Biesheuvel, Austin Zheng, Jun Lei,
	Harry Wentland, Leo Li, Rodrigo Siqueira, Alex Deucher,
	Christian König, amd-gfx, dri-devel

Applied.  Thanks!

On Thu, Oct 2, 2025 at 5:11 PM Ard Biesheuvel <ardb+git@google.com> wrote:
>
> From: Ard Biesheuvel <ardb@kernel.org>
>
> The point of isolating code that uses kernel mode FPU in separate
> compilation units is to ensure that even implicit uses of, e.g., SIMD
> registers for spilling occur only in a context where this is permitted,
> i.e., from inside a kernel_fpu_begin/end block.
>
> This is important on arm64, which uses -mgeneral-regs-only to build all
> kernel code, with the exception of such compilation units where FP or
> SIMD registers are expected to be used. Given that the compiler may
> invent uses of FP/SIMD anywhere in such a unit, none of its code may be
> accessible from outside a kernel_fpu_begin/end block.
>
> This means that all callers into such compilation units must use the
> DC_FP start/end macros, which must not occur there themselves. For
> robustness, all functions with external linkage that reside there should
> call dc_assert_fp_enabled() to assert that the FPU context was set up
> correctly.
>
> Fix this for the DCN35, DCN351 and DCN36 implementations.
>
> Cc: Austin Zheng <austin.zheng@amd.com>
> Cc: Jun Lei <jun.lei@amd.com>
> Cc: Harry Wentland <harry.wentland@amd.com>
> Cc: Leo Li <sunpeng.li@amd.com>
> Cc: Rodrigo Siqueira <siqueira@igalia.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: amd-gfx@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  .../drm/amd/display/dc/dml/dcn31/dcn31_fpu.c    |  4 ++++
>  .../drm/amd/display/dc/dml/dcn35/dcn35_fpu.c    |  6 ++++--
>  .../drm/amd/display/dc/dml/dcn351/dcn351_fpu.c  |  4 ++--
>  .../display/dc/resource/dcn35/dcn35_resource.c  | 16 +++++++++++++++-
>  .../dc/resource/dcn351/dcn351_resource.c        | 17 ++++++++++++++++-
>  .../display/dc/resource/dcn36/dcn36_resource.c  | 16 +++++++++++++++-
>  6 files changed, 56 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c
> index 17a21bcbde17..1a28061bb9ff 100644
> --- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c
> +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c
> @@ -808,6 +808,8 @@ void dcn316_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_param
>
>  int dcn_get_max_non_odm_pix_rate_100hz(struct _vcs_dpi_soc_bounding_box_st *soc)
>  {
> +       dc_assert_fp_enabled();
> +
>         return soc->clock_limits[0].dispclk_mhz * 10000.0 / (1.0 + soc->dcn_downspread_percent / 100.0);
>  }
>
> @@ -815,6 +817,8 @@ int dcn_get_approx_det_segs_required_for_pstate(
>                 struct _vcs_dpi_soc_bounding_box_st *soc,
>                 int pix_clk_100hz, int bpp, int seg_size_kb)
>  {
> +       dc_assert_fp_enabled();
> +
>         /* Roughly calculate required crb to hide latency. In practice there is slightly
>          * more buffer available for latency hiding
>          */
> diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.c
> index 5d73efa2f0c9..15a1d77dfe36 100644
> --- a/drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.c
> +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.c
> @@ -445,6 +445,8 @@ int dcn35_populate_dml_pipes_from_context_fpu(struct dc *dc,
>         bool upscaled = false;
>         const unsigned int max_allowed_vblank_nom = 1023;
>
> +       dc_assert_fp_enabled();
> +
>         dcn31_populate_dml_pipes_from_context(dc, context, pipes,
>                                               validate_mode);
>
> @@ -498,9 +500,7 @@ int dcn35_populate_dml_pipes_from_context_fpu(struct dc *dc,
>
>                 pipes[pipe_cnt].pipe.src.unbounded_req_mode = false;
>
> -               DC_FP_START();
>                 dcn31_zero_pipe_dcc_fraction(pipes, pipe_cnt);
> -               DC_FP_END();
>
>                 pipes[pipe_cnt].pipe.dest.vfront_porch = timing->v_front_porch;
>                 pipes[pipe_cnt].pipe.src.dcc_rate = 3;
> @@ -581,6 +581,8 @@ void dcn35_decide_zstate_support(struct dc *dc, struct dc_state *context)
>         unsigned int i, plane_count = 0;
>         DC_LOGGER_INIT(dc->ctx->logger);
>
> +       dc_assert_fp_enabled();
> +
>         for (i = 0; i < dc->res_pool->pipe_count; i++) {
>                 if (context->res_ctx.pipe_ctx[i].plane_state)
>                         plane_count++;
> diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn351/dcn351_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn351/dcn351_fpu.c
> index 6f516af82956..e5cfe73f640a 100644
> --- a/drivers/gpu/drm/amd/display/dc/dml/dcn351/dcn351_fpu.c
> +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn351/dcn351_fpu.c
> @@ -478,6 +478,8 @@ int dcn351_populate_dml_pipes_from_context_fpu(struct dc *dc,
>         bool upscaled = false;
>         const unsigned int max_allowed_vblank_nom = 1023;
>
> +       dc_assert_fp_enabled();
> +
>         dcn31_populate_dml_pipes_from_context(dc, context, pipes,
>                                               validate_mode);
>
> @@ -531,9 +533,7 @@ int dcn351_populate_dml_pipes_from_context_fpu(struct dc *dc,
>
>                 pipes[pipe_cnt].pipe.src.unbounded_req_mode = false;
>
> -               DC_FP_START();
>                 dcn31_zero_pipe_dcc_fraction(pipes, pipe_cnt);
> -               DC_FP_END();
>
>                 pipes[pipe_cnt].pipe.dest.vfront_porch = timing->v_front_porch;
>                 pipes[pipe_cnt].pipe.src.dcc_rate = 3;
> diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.c
> index 8475c6eec547..32678b66c410 100644
> --- a/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.c
> @@ -1760,6 +1760,20 @@ enum dc_status dcn35_patch_unknown_plane_state(struct dc_plane_state *plane_stat
>  }
>
>
> +static int populate_dml_pipes_from_context_fpu(struct dc *dc,
> +                                              struct dc_state *context,
> +                                              display_e2e_pipe_params_st *pipes,
> +                                              enum dc_validate_mode validate_mode)
> +{
> +       int ret;
> +
> +       DC_FP_START();
> +       ret = dcn35_populate_dml_pipes_from_context_fpu(dc, context, pipes, validate_mode);
> +       DC_FP_END();
> +
> +       return ret;
> +}
> +
>  static struct resource_funcs dcn35_res_pool_funcs = {
>         .destroy = dcn35_destroy_resource_pool,
>         .link_enc_create = dcn35_link_encoder_create,
> @@ -1770,7 +1784,7 @@ static struct resource_funcs dcn35_res_pool_funcs = {
>         .validate_bandwidth = dcn35_validate_bandwidth,
>         .calculate_wm_and_dlg = NULL,
>         .update_soc_for_wm_a = dcn31_update_soc_for_wm_a,
> -       .populate_dml_pipes = dcn35_populate_dml_pipes_from_context_fpu,
> +       .populate_dml_pipes = populate_dml_pipes_from_context_fpu,
>         .acquire_free_pipe_as_secondary_dpp_pipe = dcn20_acquire_free_pipe_for_layer,
>         .release_pipe = dcn20_release_pipe,
>         .add_stream_to_ctx = dcn30_add_stream_to_ctx,
> diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn351/dcn351_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn351/dcn351_resource.c
> index 0971c0f74186..677cee27589c 100644
> --- a/drivers/gpu/drm/amd/display/dc/resource/dcn351/dcn351_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn351/dcn351_resource.c
> @@ -1732,6 +1732,21 @@ static enum dc_status dcn351_validate_bandwidth(struct dc *dc,
>         return out ? DC_OK : DC_FAIL_BANDWIDTH_VALIDATE;
>  }
>
> +static int populate_dml_pipes_from_context_fpu(struct dc *dc,
> +                                              struct dc_state *context,
> +                                              display_e2e_pipe_params_st *pipes,
> +                                              enum dc_validate_mode validate_mode)
> +{
> +       int ret;
> +
> +       DC_FP_START();
> +       ret = dcn351_populate_dml_pipes_from_context_fpu(dc, context, pipes, validate_mode);
> +       DC_FP_END();
> +
> +       return ret;
> +
> +}
> +
>  static struct resource_funcs dcn351_res_pool_funcs = {
>         .destroy = dcn351_destroy_resource_pool,
>         .link_enc_create = dcn35_link_encoder_create,
> @@ -1742,7 +1757,7 @@ static struct resource_funcs dcn351_res_pool_funcs = {
>         .validate_bandwidth = dcn351_validate_bandwidth,
>         .calculate_wm_and_dlg = NULL,
>         .update_soc_for_wm_a = dcn31_update_soc_for_wm_a,
> -       .populate_dml_pipes = dcn351_populate_dml_pipes_from_context_fpu,
> +       .populate_dml_pipes = populate_dml_pipes_from_context_fpu,
>         .acquire_free_pipe_as_secondary_dpp_pipe = dcn20_acquire_free_pipe_for_layer,
>         .release_pipe = dcn20_release_pipe,
>         .add_stream_to_ctx = dcn30_add_stream_to_ctx,
> diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn36/dcn36_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn36/dcn36_resource.c
> index 8bae7fcedc22..d81540515e5c 100644
> --- a/drivers/gpu/drm/amd/display/dc/resource/dcn36/dcn36_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn36/dcn36_resource.c
> @@ -1734,6 +1734,20 @@ static enum dc_status dcn35_validate_bandwidth(struct dc *dc,
>  }
>
>
> +static int populate_dml_pipes_from_context_fpu(struct dc *dc,
> +                                              struct dc_state *context,
> +                                              display_e2e_pipe_params_st *pipes,
> +                                              enum dc_validate_mode validate_mode)
> +{
> +       int ret;
> +
> +       DC_FP_START();
> +       ret = dcn35_populate_dml_pipes_from_context_fpu(dc, context, pipes, validate_mode);
> +       DC_FP_END();
> +
> +       return ret;
> +}
> +
>  static struct resource_funcs dcn36_res_pool_funcs = {
>         .destroy = dcn36_destroy_resource_pool,
>         .link_enc_create = dcn35_link_encoder_create,
> @@ -1744,7 +1758,7 @@ static struct resource_funcs dcn36_res_pool_funcs = {
>         .validate_bandwidth = dcn35_validate_bandwidth,
>         .calculate_wm_and_dlg = NULL,
>         .update_soc_for_wm_a = dcn31_update_soc_for_wm_a,
> -       .populate_dml_pipes = dcn35_populate_dml_pipes_from_context_fpu,
> +       .populate_dml_pipes = populate_dml_pipes_from_context_fpu,
>         .acquire_free_pipe_as_secondary_dpp_pipe = dcn20_acquire_free_pipe_for_layer,
>         .release_pipe = dcn20_release_pipe,
>         .add_stream_to_ctx = dcn30_add_stream_to_ctx,
> --
> 2.51.0.618.g983fd99d29-goog
>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm/amd/display: Fix unsafe uses of kernel mode FPU
  2025-10-02 21:00 [PATCH] drm/amd/display: Fix unsafe uses of kernel mode FPU Ard Biesheuvel
  2025-10-06 17:28 ` Alex Deucher
@ 2025-10-06 17:42 ` Christian König
  2025-10-06 19:59   ` Ard Biesheuvel
  1 sibling, 1 reply; 6+ messages in thread
From: Christian König @ 2025-10-06 17:42 UTC (permalink / raw)
  To: Ard Biesheuvel, linux-kernel
  Cc: linux-arm-kernel, will, catalin.marinas, mark.rutland, broonie,
	Ard Biesheuvel, Austin Zheng, Jun Lei, Harry Wentland, Leo Li,
	Rodrigo Siqueira, Alex Deucher, amd-gfx, dri-devel

On 02.10.25 23:00, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
> 
> The point of isolating code that uses kernel mode FPU in separate
> compilation units is to ensure that even implicit uses of, e.g., SIMD
> registers for spilling occur only in a context where this is permitted,
> i.e., from inside a kernel_fpu_begin/end block.
> 
> This is important on arm64, which uses -mgeneral-regs-only to build all
> kernel code, with the exception of such compilation units where FP or
> SIMD registers are expected to be used. Given that the compiler may
> invent uses of FP/SIMD anywhere in such a unit, none of its code may be
> accessible from outside a kernel_fpu_begin/end block.
> 
> This means that all callers into such compilation units must use the
> DC_FP start/end macros, which must not occur there themselves. For
> robustness, all functions with external linkage that reside there should
> call dc_assert_fp_enabled() to assert that the FPU context was set up
> correctly.

Thanks a lot for that, I've pointed out this restriction before as well.

Since we had that issue multiple times now would it be somehow possible to automate rejecting new code getting this wrong?

E.g. adding something to the DC_FP_START()/DC_FP_END() or kernel_fpu_begin/end macros to make sure that they fail to compile on compolation units where FP use is enabled?

Regards,
Christian.

> 
> Fix this for the DCN35, DCN351 and DCN36 implementations.
> 
> Cc: Austin Zheng <austin.zheng@amd.com>
> Cc: Jun Lei <jun.lei@amd.com>
> Cc: Harry Wentland <harry.wentland@amd.com>
> Cc: Leo Li <sunpeng.li@amd.com>
> Cc: Rodrigo Siqueira <siqueira@igalia.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: amd-gfx@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  .../drm/amd/display/dc/dml/dcn31/dcn31_fpu.c    |  4 ++++
>  .../drm/amd/display/dc/dml/dcn35/dcn35_fpu.c    |  6 ++++--
>  .../drm/amd/display/dc/dml/dcn351/dcn351_fpu.c  |  4 ++--
>  .../display/dc/resource/dcn35/dcn35_resource.c  | 16 +++++++++++++++-
>  .../dc/resource/dcn351/dcn351_resource.c        | 17 ++++++++++++++++-
>  .../display/dc/resource/dcn36/dcn36_resource.c  | 16 +++++++++++++++-
>  6 files changed, 56 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c
> index 17a21bcbde17..1a28061bb9ff 100644
> --- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c
> +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c
> @@ -808,6 +808,8 @@ void dcn316_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw_param
>  
>  int dcn_get_max_non_odm_pix_rate_100hz(struct _vcs_dpi_soc_bounding_box_st *soc)
>  {
> +	dc_assert_fp_enabled();
> +
>  	return soc->clock_limits[0].dispclk_mhz * 10000.0 / (1.0 + soc->dcn_downspread_percent / 100.0);
>  }
>  
> @@ -815,6 +817,8 @@ int dcn_get_approx_det_segs_required_for_pstate(
>  		struct _vcs_dpi_soc_bounding_box_st *soc,
>  		int pix_clk_100hz, int bpp, int seg_size_kb)
>  {
> +	dc_assert_fp_enabled();
> +
>  	/* Roughly calculate required crb to hide latency. In practice there is slightly
>  	 * more buffer available for latency hiding
>  	 */
> diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.c
> index 5d73efa2f0c9..15a1d77dfe36 100644
> --- a/drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.c
> +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn35/dcn35_fpu.c
> @@ -445,6 +445,8 @@ int dcn35_populate_dml_pipes_from_context_fpu(struct dc *dc,
>  	bool upscaled = false;
>  	const unsigned int max_allowed_vblank_nom = 1023;
>  
> +	dc_assert_fp_enabled();
> +
>  	dcn31_populate_dml_pipes_from_context(dc, context, pipes,
>  					      validate_mode);
>  
> @@ -498,9 +500,7 @@ int dcn35_populate_dml_pipes_from_context_fpu(struct dc *dc,
>  
>  		pipes[pipe_cnt].pipe.src.unbounded_req_mode = false;
>  
> -		DC_FP_START();
>  		dcn31_zero_pipe_dcc_fraction(pipes, pipe_cnt);
> -		DC_FP_END();
>  
>  		pipes[pipe_cnt].pipe.dest.vfront_porch = timing->v_front_porch;
>  		pipes[pipe_cnt].pipe.src.dcc_rate = 3;
> @@ -581,6 +581,8 @@ void dcn35_decide_zstate_support(struct dc *dc, struct dc_state *context)
>  	unsigned int i, plane_count = 0;
>  	DC_LOGGER_INIT(dc->ctx->logger);
>  
> +	dc_assert_fp_enabled();
> +
>  	for (i = 0; i < dc->res_pool->pipe_count; i++) {
>  		if (context->res_ctx.pipe_ctx[i].plane_state)
>  			plane_count++;
> diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn351/dcn351_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn351/dcn351_fpu.c
> index 6f516af82956..e5cfe73f640a 100644
> --- a/drivers/gpu/drm/amd/display/dc/dml/dcn351/dcn351_fpu.c
> +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn351/dcn351_fpu.c
> @@ -478,6 +478,8 @@ int dcn351_populate_dml_pipes_from_context_fpu(struct dc *dc,
>  	bool upscaled = false;
>  	const unsigned int max_allowed_vblank_nom = 1023;
>  
> +	dc_assert_fp_enabled();
> +
>  	dcn31_populate_dml_pipes_from_context(dc, context, pipes,
>  					      validate_mode);
>  
> @@ -531,9 +533,7 @@ int dcn351_populate_dml_pipes_from_context_fpu(struct dc *dc,
>  
>  		pipes[pipe_cnt].pipe.src.unbounded_req_mode = false;
>  
> -		DC_FP_START();
>  		dcn31_zero_pipe_dcc_fraction(pipes, pipe_cnt);
> -		DC_FP_END();
>  
>  		pipes[pipe_cnt].pipe.dest.vfront_porch = timing->v_front_porch;
>  		pipes[pipe_cnt].pipe.src.dcc_rate = 3;
> diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.c
> index 8475c6eec547..32678b66c410 100644
> --- a/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn35/dcn35_resource.c
> @@ -1760,6 +1760,20 @@ enum dc_status dcn35_patch_unknown_plane_state(struct dc_plane_state *plane_stat
>  }
>  
>  
> +static int populate_dml_pipes_from_context_fpu(struct dc *dc,
> +					       struct dc_state *context,
> +					       display_e2e_pipe_params_st *pipes,
> +					       enum dc_validate_mode validate_mode)
> +{
> +	int ret;
> +
> +	DC_FP_START();
> +	ret = dcn35_populate_dml_pipes_from_context_fpu(dc, context, pipes, validate_mode);
> +	DC_FP_END();
> +
> +	return ret;
> +}
> +
>  static struct resource_funcs dcn35_res_pool_funcs = {
>  	.destroy = dcn35_destroy_resource_pool,
>  	.link_enc_create = dcn35_link_encoder_create,
> @@ -1770,7 +1784,7 @@ static struct resource_funcs dcn35_res_pool_funcs = {
>  	.validate_bandwidth = dcn35_validate_bandwidth,
>  	.calculate_wm_and_dlg = NULL,
>  	.update_soc_for_wm_a = dcn31_update_soc_for_wm_a,
> -	.populate_dml_pipes = dcn35_populate_dml_pipes_from_context_fpu,
> +	.populate_dml_pipes = populate_dml_pipes_from_context_fpu,
>  	.acquire_free_pipe_as_secondary_dpp_pipe = dcn20_acquire_free_pipe_for_layer,
>  	.release_pipe = dcn20_release_pipe,
>  	.add_stream_to_ctx = dcn30_add_stream_to_ctx,
> diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn351/dcn351_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn351/dcn351_resource.c
> index 0971c0f74186..677cee27589c 100644
> --- a/drivers/gpu/drm/amd/display/dc/resource/dcn351/dcn351_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn351/dcn351_resource.c
> @@ -1732,6 +1732,21 @@ static enum dc_status dcn351_validate_bandwidth(struct dc *dc,
>  	return out ? DC_OK : DC_FAIL_BANDWIDTH_VALIDATE;
>  }
>  
> +static int populate_dml_pipes_from_context_fpu(struct dc *dc,
> +					       struct dc_state *context,
> +					       display_e2e_pipe_params_st *pipes,
> +					       enum dc_validate_mode validate_mode)
> +{
> +	int ret;
> +
> +	DC_FP_START();
> +	ret = dcn351_populate_dml_pipes_from_context_fpu(dc, context, pipes, validate_mode);
> +	DC_FP_END();
> +
> +	return ret;
> +
> +}
> +
>  static struct resource_funcs dcn351_res_pool_funcs = {
>  	.destroy = dcn351_destroy_resource_pool,
>  	.link_enc_create = dcn35_link_encoder_create,
> @@ -1742,7 +1757,7 @@ static struct resource_funcs dcn351_res_pool_funcs = {
>  	.validate_bandwidth = dcn351_validate_bandwidth,
>  	.calculate_wm_and_dlg = NULL,
>  	.update_soc_for_wm_a = dcn31_update_soc_for_wm_a,
> -	.populate_dml_pipes = dcn351_populate_dml_pipes_from_context_fpu,
> +	.populate_dml_pipes = populate_dml_pipes_from_context_fpu,
>  	.acquire_free_pipe_as_secondary_dpp_pipe = dcn20_acquire_free_pipe_for_layer,
>  	.release_pipe = dcn20_release_pipe,
>  	.add_stream_to_ctx = dcn30_add_stream_to_ctx,
> diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn36/dcn36_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn36/dcn36_resource.c
> index 8bae7fcedc22..d81540515e5c 100644
> --- a/drivers/gpu/drm/amd/display/dc/resource/dcn36/dcn36_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn36/dcn36_resource.c
> @@ -1734,6 +1734,20 @@ static enum dc_status dcn35_validate_bandwidth(struct dc *dc,
>  }
>  
>  
> +static int populate_dml_pipes_from_context_fpu(struct dc *dc,
> +					       struct dc_state *context,
> +					       display_e2e_pipe_params_st *pipes,
> +					       enum dc_validate_mode validate_mode)
> +{
> +	int ret;
> +
> +	DC_FP_START();
> +	ret = dcn35_populate_dml_pipes_from_context_fpu(dc, context, pipes, validate_mode);
> +	DC_FP_END();
> +
> +	return ret;
> +}
> +
>  static struct resource_funcs dcn36_res_pool_funcs = {
>  	.destroy = dcn36_destroy_resource_pool,
>  	.link_enc_create = dcn35_link_encoder_create,
> @@ -1744,7 +1758,7 @@ static struct resource_funcs dcn36_res_pool_funcs = {
>  	.validate_bandwidth = dcn35_validate_bandwidth,
>  	.calculate_wm_and_dlg = NULL,
>  	.update_soc_for_wm_a = dcn31_update_soc_for_wm_a,
> -	.populate_dml_pipes = dcn35_populate_dml_pipes_from_context_fpu,
> +	.populate_dml_pipes = populate_dml_pipes_from_context_fpu,
>  	.acquire_free_pipe_as_secondary_dpp_pipe = dcn20_acquire_free_pipe_for_layer,
>  	.release_pipe = dcn20_release_pipe,
>  	.add_stream_to_ctx = dcn30_add_stream_to_ctx,



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm/amd/display: Fix unsafe uses of kernel mode FPU
  2025-10-06 17:42 ` Christian König
@ 2025-10-06 19:59   ` Ard Biesheuvel
  2025-10-07 20:52     ` Ard Biesheuvel
  0 siblings, 1 reply; 6+ messages in thread
From: Ard Biesheuvel @ 2025-10-06 19:59 UTC (permalink / raw)
  To: Christian König
  Cc: Ard Biesheuvel, linux-kernel, linux-arm-kernel, will,
	catalin.marinas, mark.rutland, broonie, Austin Zheng, Jun Lei,
	Harry Wentland, Leo Li, Rodrigo Siqueira, Alex Deucher, amd-gfx,
	dri-devel

On Mon, 6 Oct 2025 at 19:42, Christian König <christian.koenig@amd.com> wrote:
>
> On 02.10.25 23:00, Ard Biesheuvel wrote:
> > From: Ard Biesheuvel <ardb@kernel.org>
> >
> > The point of isolating code that uses kernel mode FPU in separate
> > compilation units is to ensure that even implicit uses of, e.g., SIMD
> > registers for spilling occur only in a context where this is permitted,
> > i.e., from inside a kernel_fpu_begin/end block.
> >
> > This is important on arm64, which uses -mgeneral-regs-only to build all
> > kernel code, with the exception of such compilation units where FP or
> > SIMD registers are expected to be used. Given that the compiler may
> > invent uses of FP/SIMD anywhere in such a unit, none of its code may be
> > accessible from outside a kernel_fpu_begin/end block.
> >
> > This means that all callers into such compilation units must use the
> > DC_FP start/end macros, which must not occur there themselves. For
> > robustness, all functions with external linkage that reside there should
> > call dc_assert_fp_enabled() to assert that the FPU context was set up
> > correctly.
>
> Thanks a lot for that, I've pointed out this restriction before as well.
>
> Since we had that issue multiple times now would it be somehow possible to automate rejecting new code getting this wrong?
>
> E.g. adding something to the DC_FP_START()/DC_FP_END() or kernel_fpu_begin/end macros to make sure that they fail to compile on compolation units where FP use is enabled?
>

Something like the below perhaps?

diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index 73a10f65ce8b..d03e3705bade 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -35,7 +35,7 @@ endif

 # The GCC option -ffreestanding is required in order to compile code containing
 # ARM/NEON intrinsics in a non C99-compliant environment (such as the kernel)
-CC_FLAGS_FPU   := -ffreestanding
+CC_FLAGS_FPU   := -ffreestanding -DIN_SIMD
 # Enable <arm_neon.h>
 CC_FLAGS_FPU   += -isystem $(shell $(CC) -print-file-name=include)
 CC_FLAGS_NO_FPU        := -mgeneral-regs-only
diff --git a/arch/arm64/include/asm/fpu.h b/arch/arm64/include/asm/fpu.h
index 2ae50bdce59b..1297e660bd89 100644
--- a/arch/arm64/include/asm/fpu.h
+++ b/arch/arm64/include/asm/fpu.h
@@ -8,8 +8,10 @@

 #include <asm/neon.h>

+#ifndef IN_SIMD
 #define kernel_fpu_available() cpu_has_neon()
 #define kernel_fpu_begin()     kernel_neon_begin()
 #define kernel_fpu_end()       kernel_neon_end()
+#endif

 #endif /* ! __ASM_FPU_H */


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm/amd/display: Fix unsafe uses of kernel mode FPU
  2025-10-06 19:59   ` Ard Biesheuvel
@ 2025-10-07 20:52     ` Ard Biesheuvel
  2025-10-08  0:53       ` Ard Biesheuvel
  0 siblings, 1 reply; 6+ messages in thread
From: Ard Biesheuvel @ 2025-10-07 20:52 UTC (permalink / raw)
  To: Christian König
  Cc: Ard Biesheuvel, linux-kernel, linux-arm-kernel, will,
	catalin.marinas, mark.rutland, broonie, Austin Zheng, Jun Lei,
	Harry Wentland, Leo Li, Rodrigo Siqueira, Alex Deucher, amd-gfx,
	dri-devel

On Mon, 6 Oct 2025 at 12:59, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Mon, 6 Oct 2025 at 19:42, Christian König <christian.koenig@amd.com> wrote:
> >
> > On 02.10.25 23:00, Ard Biesheuvel wrote:
> > > From: Ard Biesheuvel <ardb@kernel.org>
> > >
> > > The point of isolating code that uses kernel mode FPU in separate
> > > compilation units is to ensure that even implicit uses of, e.g., SIMD
> > > registers for spilling occur only in a context where this is permitted,
> > > i.e., from inside a kernel_fpu_begin/end block.
> > >
> > > This is important on arm64, which uses -mgeneral-regs-only to build all
> > > kernel code, with the exception of such compilation units where FP or
> > > SIMD registers are expected to be used. Given that the compiler may
> > > invent uses of FP/SIMD anywhere in such a unit, none of its code may be
> > > accessible from outside a kernel_fpu_begin/end block.
> > >
> > > This means that all callers into such compilation units must use the
> > > DC_FP start/end macros, which must not occur there themselves. For
> > > robustness, all functions with external linkage that reside there should
> > > call dc_assert_fp_enabled() to assert that the FPU context was set up
> > > correctly.
> >
> > Thanks a lot for that, I've pointed out this restriction before as well.
> >
> > Since we had that issue multiple times now would it be somehow possible to automate rejecting new code getting this wrong?
> >
> > E.g. adding something to the DC_FP_START()/DC_FP_END() or kernel_fpu_begin/end macros to make sure that they fail to compile on compolation units where FP use is enabled?
> >
>
> Something like the below perhaps?
>

Never mind, that doesn't work. dc_fpu_begin() is an out-of-line
function, and so it is the DC_FP_START() macro that evaluates to
something that includes an arch-provided assert. I'll code something
and send it out.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] drm/amd/display: Fix unsafe uses of kernel mode FPU
  2025-10-07 20:52     ` Ard Biesheuvel
@ 2025-10-08  0:53       ` Ard Biesheuvel
  0 siblings, 0 replies; 6+ messages in thread
From: Ard Biesheuvel @ 2025-10-08  0:53 UTC (permalink / raw)
  To: Christian König
  Cc: Ard Biesheuvel, linux-kernel, linux-arm-kernel, will,
	catalin.marinas, mark.rutland, broonie, Austin Zheng, Jun Lei,
	Harry Wentland, Leo Li, Rodrigo Siqueira, Alex Deucher, amd-gfx,
	dri-devel

On Tue, 7 Oct 2025 at 13:52, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Mon, 6 Oct 2025 at 12:59, Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > On Mon, 6 Oct 2025 at 19:42, Christian König <christian.koenig@amd.com> wrote:
> > >
> > > On 02.10.25 23:00, Ard Biesheuvel wrote:
> > > > From: Ard Biesheuvel <ardb@kernel.org>
> > > >
> > > > The point of isolating code that uses kernel mode FPU in separate
> > > > compilation units is to ensure that even implicit uses of, e.g., SIMD
> > > > registers for spilling occur only in a context where this is permitted,
> > > > i.e., from inside a kernel_fpu_begin/end block.
> > > >
> > > > This is important on arm64, which uses -mgeneral-regs-only to build all
> > > > kernel code, with the exception of such compilation units where FP or
> > > > SIMD registers are expected to be used. Given that the compiler may
> > > > invent uses of FP/SIMD anywhere in such a unit, none of its code may be
> > > > accessible from outside a kernel_fpu_begin/end block.
> > > >
> > > > This means that all callers into such compilation units must use the
> > > > DC_FP start/end macros, which must not occur there themselves. For
> > > > robustness, all functions with external linkage that reside there should
> > > > call dc_assert_fp_enabled() to assert that the FPU context was set up
> > > > correctly.
> > >
> > > Thanks a lot for that, I've pointed out this restriction before as well.
> > >
> > > Since we had that issue multiple times now would it be somehow possible to automate rejecting new code getting this wrong?
> > >
> > > E.g. adding something to the DC_FP_START()/DC_FP_END() or kernel_fpu_begin/end macros to make sure that they fail to compile on compolation units where FP use is enabled?
> > >
> >
> > Something like the below perhaps?
> >
>
> Never mind, that doesn't work. dc_fpu_begin() is an out-of-line
> function, and so it is the DC_FP_START() macro that evaluates to
> something that includes an arch-provided assert. I'll code something
> and send it out.

OK, so as it turns out, the logic already exists to force a build time
error in this case. However, due to the way the amdgpu driver
constructs its own API around kernel_fpu_begin() and kernel_fpu_end(),
the logic never fires for the users for DC_FP_START.

It is sufficient to include linux/fpu.h:

diff --git a/drivers/gpu/drm/amd/display/dc/os_types.h
b/drivers/gpu/drm/amd/display/dc/os_types.h
index 782316348941..6ef9b7f5e099 100644
--- a/drivers/gpu/drm/amd/display/dc/os_types.h
+++ b/drivers/gpu/drm/amd/display/dc/os_types.h
@@ -32,6 +32,7 @@
 #include <linux/delay.h>
 #include <linux/mm.h>
 #include <linux/vmalloc.h>
+#include <linux/fpu.h>

 #include <asm/byteorder.h>

Maybe this could be folded into this patch?


^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2025-10-08  0:53 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-02 21:00 [PATCH] drm/amd/display: Fix unsafe uses of kernel mode FPU Ard Biesheuvel
2025-10-06 17:28 ` Alex Deucher
2025-10-06 17:42 ` Christian König
2025-10-06 19:59   ` Ard Biesheuvel
2025-10-07 20:52     ` Ard Biesheuvel
2025-10-08  0:53       ` Ard Biesheuvel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).