AMD-GFX Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] drm/amd/display: use GFP_ATOMIC for bounding box
@ 2024-06-04 13:50 Alex Deucher
  2024-06-04 14:22 ` Christian König
  0 siblings, 1 reply; 5+ messages in thread
From: Alex Deucher @ 2024-06-04 13:50 UTC (permalink / raw)
  To: amd-gfx
  Cc: Alex Deucher, George Zhang, Arnd Bergmann, harry.wentland,
	sunpeng.li, Rodrigo.Siqueira

This can be called in atomic context.  Should fix:

BUG: sleeping function called from invalid context at include/linux/sched/mm.h:306
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 449, name: kworker/u64:8
preempt_count: 2, expected: 0
RCU nest depth: 0, expected: 0
Preemption disabled at:
ffffffffc0ce1580>] dc_fpu_begin+0x30/0xd0 [amdgpu]
CPU: 5 PID: 449 Comm: kworker/u64:8 Tainted: G        W          6.8.0+ #35
Hardware name: System manufacturer System Product Name/ROG STRIX X570-E GAMING WIFI II, BIOS 4204 02/24/2022
Workqueue: events_unbound async_run_entry_fn

Fixes: 88c61827cedc ("drm/amd/display: dynamically allocate dml2_configuration_options structures")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: George Zhang <george.zhang@amd.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: harry.wentland@amd.com
Cc: sunpeng.li@amd.com
Cc: Rodrigo.Siqueira@amd.com
---
 drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c  | 2 +-
 .../gpu/drm/amd/display/dc/resource/dcn321/dcn321_resource.c    | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c
index 0f11d7c8791c..3fe0d5334145 100644
--- a/drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c
@@ -2009,7 +2009,7 @@ static void dcn32_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw
 {
 	struct dml2_configuration_options *dml2_opt;
 
-	dml2_opt = kmemdup(&dc->dml2_options, sizeof(dc->dml2_options), GFP_KERNEL);
+	dml2_opt = kmemdup(&dc->dml2_options, sizeof(dc->dml2_options), GFP_ATOMIC);
 	if (!dml2_opt)
 		return;
 
diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn321/dcn321_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn321/dcn321_resource.c
index 07ca6f58447d..a61cf5741275 100644
--- a/drivers/gpu/drm/amd/display/dc/resource/dcn321/dcn321_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/resource/dcn321/dcn321_resource.c
@@ -1583,7 +1583,7 @@ static void dcn321_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *b
 {
 	struct dml2_configuration_options *dml2_opt;
 
-	dml2_opt = kmemdup(&dc->dml2_options, sizeof(dc->dml2_options), GFP_KERNEL);
+	dml2_opt = kmemdup(&dc->dml2_options, sizeof(dc->dml2_options), GFP_ATOMIC);
 	if (!dml2_opt)
 		return;
 
-- 
2.45.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] drm/amd/display: use GFP_ATOMIC for bounding box
  2024-06-04 13:50 [PATCH] drm/amd/display: use GFP_ATOMIC for bounding box Alex Deucher
@ 2024-06-04 14:22 ` Christian König
  2024-06-04 14:45   ` Alex Deucher
  2024-06-04 14:57   ` Arnd Bergmann
  0 siblings, 2 replies; 5+ messages in thread
From: Christian König @ 2024-06-04 14:22 UTC (permalink / raw)
  To: Alex Deucher, amd-gfx
  Cc: George Zhang, Arnd Bergmann, harry.wentland, sunpeng.li,
	Rodrigo.Siqueira

Am 04.06.24 um 15:50 schrieb Alex Deucher:
> This can be called in atomic context.  Should fix:
>
> BUG: sleeping function called from invalid context at include/linux/sched/mm.h:306
> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 449, name: kworker/u64:8
> preempt_count: 2, expected: 0
> RCU nest depth: 0, expected: 0
> Preemption disabled at:
> ffffffffc0ce1580>] dc_fpu_begin+0x30/0xd0 [amdgpu]
> CPU: 5 PID: 449 Comm: kworker/u64:8 Tainted: G        W          6.8.0+ #35
> Hardware name: System manufacturer System Product Name/ROG STRIX X570-E GAMING WIFI II, BIOS 4204 02/24/2022
> Workqueue: events_unbound async_run_entry_fn

That most likely only papers over the real problem and is not a valid fix.

The question is why is that an atomic context?  If the function is used 
under a spinlock then this might indeed be the right fix.

If it's because of floating point operation then that here won't work 
either.

In that case the only real fix is to avoid the allocation altogether.

Regards,
Christian.

>
> Fixes: 88c61827cedc ("drm/amd/display: dynamically allocate dml2_configuration_options structures")
> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> Cc: George Zhang <george.zhang@amd.com>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: harry.wentland@amd.com
> Cc: sunpeng.li@amd.com
> Cc: Rodrigo.Siqueira@amd.com
> ---
>   drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c  | 2 +-
>   .../gpu/drm/amd/display/dc/resource/dcn321/dcn321_resource.c    | 2 +-
>   2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c
> index 0f11d7c8791c..3fe0d5334145 100644
> --- a/drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c
> @@ -2009,7 +2009,7 @@ static void dcn32_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw
>   {
>   	struct dml2_configuration_options *dml2_opt;
>   
> -	dml2_opt = kmemdup(&dc->dml2_options, sizeof(dc->dml2_options), GFP_KERNEL);
> +	dml2_opt = kmemdup(&dc->dml2_options, sizeof(dc->dml2_options), GFP_ATOMIC);
>   	if (!dml2_opt)
>   		return;
>   
> diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn321/dcn321_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn321/dcn321_resource.c
> index 07ca6f58447d..a61cf5741275 100644
> --- a/drivers/gpu/drm/amd/display/dc/resource/dcn321/dcn321_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn321/dcn321_resource.c
> @@ -1583,7 +1583,7 @@ static void dcn321_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *b
>   {
>   	struct dml2_configuration_options *dml2_opt;
>   
> -	dml2_opt = kmemdup(&dc->dml2_options, sizeof(dc->dml2_options), GFP_KERNEL);
> +	dml2_opt = kmemdup(&dc->dml2_options, sizeof(dc->dml2_options), GFP_ATOMIC);
>   	if (!dml2_opt)
>   		return;
>   


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] drm/amd/display: use GFP_ATOMIC for bounding box
  2024-06-04 14:22 ` Christian König
@ 2024-06-04 14:45   ` Alex Deucher
  2024-06-04 14:57   ` Arnd Bergmann
  1 sibling, 0 replies; 5+ messages in thread
From: Alex Deucher @ 2024-06-04 14:45 UTC (permalink / raw)
  To: Christian König
  Cc: Alex Deucher, amd-gfx, George Zhang, Arnd Bergmann,
	harry.wentland, sunpeng.li, Rodrigo.Siqueira

On Tue, Jun 4, 2024 at 10:32 AM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> Am 04.06.24 um 15:50 schrieb Alex Deucher:
> > This can be called in atomic context.  Should fix:
> >
> > BUG: sleeping function called from invalid context at include/linux/sched/mm.h:306
> > in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 449, name: kworker/u64:8
> > preempt_count: 2, expected: 0
> > RCU nest depth: 0, expected: 0
> > Preemption disabled at:
> > ffffffffc0ce1580>] dc_fpu_begin+0x30/0xd0 [amdgpu]
> > CPU: 5 PID: 449 Comm: kworker/u64:8 Tainted: G        W          6.8.0+ #35
> > Hardware name: System manufacturer System Product Name/ROG STRIX X570-E GAMING WIFI II, BIOS 4204 02/24/2022
> > Workqueue: events_unbound async_run_entry_fn
>
> That most likely only papers over the real problem and is not a valid fix.
>
> The question is why is that an atomic context?  If the function is used
> under a spinlock then this might indeed be the right fix.
>
> If it's because of floating point operation then that here won't work
> either.
>
> In that case the only real fix is to avoid the allocation altogether.

Reverting the patch will bring back the stack size warning.  I guess
the display folks will need to look more closely at this.

Alex

>
> Regards,
> Christian.
>
> >
> > Fixes: 88c61827cedc ("drm/amd/display: dynamically allocate dml2_configuration_options structures")
> > Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> > Cc: George Zhang <george.zhang@amd.com>
> > Cc: Arnd Bergmann <arnd@arndb.de>
> > Cc: harry.wentland@amd.com
> > Cc: sunpeng.li@amd.com
> > Cc: Rodrigo.Siqueira@amd.com
> > ---
> >   drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c  | 2 +-
> >   .../gpu/drm/amd/display/dc/resource/dcn321/dcn321_resource.c    | 2 +-
> >   2 files changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c
> > index 0f11d7c8791c..3fe0d5334145 100644
> > --- a/drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c
> > +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c
> > @@ -2009,7 +2009,7 @@ static void dcn32_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *bw
> >   {
> >       struct dml2_configuration_options *dml2_opt;
> >
> > -     dml2_opt = kmemdup(&dc->dml2_options, sizeof(dc->dml2_options), GFP_KERNEL);
> > +     dml2_opt = kmemdup(&dc->dml2_options, sizeof(dc->dml2_options), GFP_ATOMIC);
> >       if (!dml2_opt)
> >               return;
> >
> > diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn321/dcn321_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn321/dcn321_resource.c
> > index 07ca6f58447d..a61cf5741275 100644
> > --- a/drivers/gpu/drm/amd/display/dc/resource/dcn321/dcn321_resource.c
> > +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn321/dcn321_resource.c
> > @@ -1583,7 +1583,7 @@ static void dcn321_update_bw_bounding_box(struct dc *dc, struct clk_bw_params *b
> >   {
> >       struct dml2_configuration_options *dml2_opt;
> >
> > -     dml2_opt = kmemdup(&dc->dml2_options, sizeof(dc->dml2_options), GFP_KERNEL);
> > +     dml2_opt = kmemdup(&dc->dml2_options, sizeof(dc->dml2_options), GFP_ATOMIC);
> >       if (!dml2_opt)
> >               return;
> >
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] drm/amd/display: use GFP_ATOMIC for bounding box
  2024-06-04 14:22 ` Christian König
  2024-06-04 14:45   ` Alex Deucher
@ 2024-06-04 14:57   ` Arnd Bergmann
  2024-06-05  9:06     ` Christian König
  1 sibling, 1 reply; 5+ messages in thread
From: Arnd Bergmann @ 2024-06-04 14:57 UTC (permalink / raw)
  To: Christian König, Alex Deucher, amd-gfx
  Cc: George Zhang, Harry Wentland, Leo Li, Rodrigo Siqueira

On Tue, Jun 4, 2024, at 16:22, Christian König wrote:
> Am 04.06.24 um 15:50 schrieb Alex Deucher:
>> This can be called in atomic context.  Should fix:
>>
>> BUG: sleeping function called from invalid context at include/linux/sched/mm.h:306
>> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 449, name: kworker/u64:8
>> preempt_count: 2, expected: 0
>> RCU nest depth: 0, expected: 0
>> Preemption disabled at:
>> ffffffffc0ce1580>] dc_fpu_begin+0x30/0xd0 [amdgpu]
>> CPU: 5 PID: 449 Comm: kworker/u64:8 Tainted: G        W          6.8.0+ #35
>> Hardware name: System manufacturer System Product Name/ROG STRIX X570-E GAMING WIFI II, BIOS 4204 02/24/2022
>> Workqueue: events_unbound async_run_entry_fn
>
> That most likely only papers over the real problem and is not a valid fix.
>
> The question is why is that an atomic context?  If the function is used 
> under a spinlock then this might indeed be the right fix.
>
> If it's because of floating point operation then that here won't work 
> either.

It looks like there is only one caller, and this turns on
floating point instructions just for the call:

        if (dc->res_pool->funcs->update_bw_bounding_box) {
                DC_FP_START();
                dc->res_pool->funcs->update_bw_bounding_box(dc, dc->clk_mgr->bw_params);
                DC_FP_END();
        }

but then they are enabled again inside of the function.

If we can drop the outer DC_FP_START(), that means
the GFP_KERNEL allocation works. On the other hand if
we actually have to enabled it before calling into
the function (e.g. because there is an architecture that
has incompatible function calling conventions when FP
is enabled), the inner one is redundant, but we can
potentially move the kmemdup() into the caller and
pass the copy by refernence.

      Arnd

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] drm/amd/display: use GFP_ATOMIC for bounding box
  2024-06-04 14:57   ` Arnd Bergmann
@ 2024-06-05  9:06     ` Christian König
  0 siblings, 0 replies; 5+ messages in thread
From: Christian König @ 2024-06-05  9:06 UTC (permalink / raw)
  To: Arnd Bergmann, Alex Deucher, amd-gfx
  Cc: George Zhang, Harry Wentland, Leo Li, Rodrigo Siqueira

Am 04.06.24 um 16:57 schrieb Arnd Bergmann:
> On Tue, Jun 4, 2024, at 16:22, Christian König wrote:
>> Am 04.06.24 um 15:50 schrieb Alex Deucher:
>>> This can be called in atomic context.  Should fix:
>>>
>>> BUG: sleeping function called from invalid context at include/linux/sched/mm.h:306
>>> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 449, name: kworker/u64:8
>>> preempt_count: 2, expected: 0
>>> RCU nest depth: 0, expected: 0
>>> Preemption disabled at:
>>> ffffffffc0ce1580>] dc_fpu_begin+0x30/0xd0 [amdgpu]
>>> CPU: 5 PID: 449 Comm: kworker/u64:8 Tainted: G        W          6.8.0+ #35
>>> Hardware name: System manufacturer System Product Name/ROG STRIX X570-E GAMING WIFI II, BIOS 4204 02/24/2022
>>> Workqueue: events_unbound async_run_entry_fn
>> That most likely only papers over the real problem and is not a valid fix.
>>
>> The question is why is that an atomic context?  If the function is used
>> under a spinlock then this might indeed be the right fix.
>>
>> If it's because of floating point operation then that here won't work
>> either.
> It looks like there is only one caller, and this turns on
> floating point instructions just for the call:
>
>          if (dc->res_pool->funcs->update_bw_bounding_box) {
>                  DC_FP_START();
>                  dc->res_pool->funcs->update_bw_bounding_box(dc, dc->clk_mgr->bw_params);
>                  DC_FP_END();
>          }
>
> but then they are enabled again inside of the function.
>
> If we can drop the outer DC_FP_START(), that means
> the GFP_KERNEL allocation works. On the other hand if
> we actually have to enabled it before calling into
> the function (e.g. because there is an architecture that
> has incompatible function calling conventions when FP
> is enabled), the inner one is redundant, but we can
> potentially move the kmemdup() into the caller and
> pass the copy by refernence.

Yeah exactly that's the case.

The DC_FP_START() and DC_FP_END() calls need to be outside of the 
function  because the compiler has no idea that it can't move any 
flouting point instructions outside of the critical section between the 
two functions.

So yes the calls to DC_FP_START() and DC_FP_END() from within floating 
point enabled code should be forbidden somehow.

And when that is done the caller should allocate any parameters needed 
and pass them by reference to avoid the GFP_ATOMIC.

Regards,
Christian.

>
>        Arnd


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-06-05 12:57 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-04 13:50 [PATCH] drm/amd/display: use GFP_ATOMIC for bounding box Alex Deucher
2024-06-04 14:22 ` Christian König
2024-06-04 14:45   ` Alex Deucher
2024-06-04 14:57   ` Arnd Bergmann
2024-06-05  9:06     ` Christian König

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox