* [PATCH] drm/i915/dmc: fix an unlikely NULL pointer deference at probe
@ 2025-12-02 18:39 Jani Nikula
2025-12-02 21:24 ` Imre Deak
2026-03-02 17:48 ` [PATCH v2] " Imre Deak
0 siblings, 2 replies; 12+ messages in thread
From: Jani Nikula @ 2025-12-02 18:39 UTC (permalink / raw)
To: intel-gfx, intel-xe; +Cc: jani.nikula, Mohammed Thasleem, Imre Deak, stable
intel_dmc_update_dc6_allowed_count() oopses when DMC hasn't been
initialized, and dmc is thus NULL.
That would be the case when the call path is
intel_power_domains_init_hw() -> {skl,bxt,icl}_display_core_init() ->
gen9_set_dc_state() -> intel_dmc_update_dc6_allowed_count(), as
intel_power_domains_init_hw() is called *before* intel_dmc_init().
However, gen9_set_dc_state() calls intel_dmc_update_dc6_allowed_count()
conditionally, depending on the current and target DC states. At probe,
the target is disabled, but if DC6 is enabled, the function is called,
and an oops follows. Apparently it's quite unlikely that DC6 is enabled
at probe, as we haven't seen this failure mode before.
Add NULL checks and switch the dmc->display references to just display.
Fixes: 88c1f9a4d36d ("drm/i915/dmc: Create debugfs entry for dc6 counter")
Cc: Mohammed Thasleem <mohammed.thasleem@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: <stable@vger.kernel.org> # v6.16+
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
---
Rare case, but this may also throw off the rc6 counting in debugfs when
it does happen.
---
drivers/gpu/drm/i915/display/intel_dmc.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_dmc.c b/drivers/gpu/drm/i915/display/intel_dmc.c
index 2fb6fec6dc99..169bbbc91f6d 100644
--- a/drivers/gpu/drm/i915/display/intel_dmc.c
+++ b/drivers/gpu/drm/i915/display/intel_dmc.c
@@ -1570,10 +1570,10 @@ void intel_dmc_update_dc6_allowed_count(struct intel_display *display,
struct intel_dmc *dmc = display_to_dmc(display);
u32 dc5_cur_count;
- if (DISPLAY_VER(dmc->display) < 14)
+ if (!dmc || DISPLAY_VER(display) < 14)
return;
- dc5_cur_count = intel_de_read(dmc->display, DG1_DMC_DEBUG_DC5_COUNT);
+ dc5_cur_count = intel_de_read(display, DG1_DMC_DEBUG_DC5_COUNT);
if (!start_tracking)
dmc->dc6_allowed.count += dc5_cur_count - dmc->dc6_allowed.dc5_start;
@@ -1587,7 +1587,7 @@ static bool intel_dmc_get_dc6_allowed_count(struct intel_display *display, u32 *
struct intel_dmc *dmc = display_to_dmc(display);
bool dc6_enabled;
- if (DISPLAY_VER(display) < 14)
+ if (!dmc || DISPLAY_VER(display) < 14)
return false;
mutex_lock(&power_domains->lock);
--
2.47.3
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] drm/i915/dmc: fix an unlikely NULL pointer deference at probe
2025-12-02 18:39 [PATCH] drm/i915/dmc: fix an unlikely NULL pointer deference at probe Jani Nikula
@ 2025-12-02 21:24 ` Imre Deak
2025-12-02 21:35 ` Imre Deak
2026-03-02 17:48 ` [PATCH v2] " Imre Deak
1 sibling, 1 reply; 12+ messages in thread
From: Imre Deak @ 2025-12-02 21:24 UTC (permalink / raw)
To: Jani Nikula; +Cc: intel-gfx, intel-xe, Mohammed Thasleem, stable
On Tue, Dec 02, 2025 at 08:39:50PM +0200, Jani Nikula wrote:
> intel_dmc_update_dc6_allowed_count() oopses when DMC hasn't been
> initialized, and dmc is thus NULL.
>
> That would be the case when the call path is
> intel_power_domains_init_hw() -> {skl,bxt,icl}_display_core_init() ->
> gen9_set_dc_state() -> intel_dmc_update_dc6_allowed_count(), as
> intel_power_domains_init_hw() is called *before* intel_dmc_init().
>
> However, gen9_set_dc_state() calls intel_dmc_update_dc6_allowed_count()
> conditionally, depending on the current and target DC states. At probe,
> the target is disabled, but if DC6 is enabled, the function is called,
> and an oops follows. Apparently it's quite unlikely that DC6 is enabled
> at probe, as we haven't seen this failure mode before.
>
> Add NULL checks and switch the dmc->display references to just display.
>
> Fixes: 88c1f9a4d36d ("drm/i915/dmc: Create debugfs entry for dc6 counter")
> Cc: Mohammed Thasleem <mohammed.thasleem@intel.com>
> Cc: Imre Deak <imre.deak@intel.com>
> Cc: <stable@vger.kernel.org> # v6.16+
> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
>
> ---
>
> Rare case, but this may also throw off the rc6 counting in debugfs when
> it does happen.
Yes, I missed the case where the driver is being loaded while DC6 is
enabled, this is what happens for the reporter:
i915 0000:00:04.0: [drm] *ERROR* DC state mismatch (0x0 -> 0x2)
That's odd, as DC6 requires the DMC firmware, which - if it's indeed
loaded by BIOS for instance - will be overwritten by the driver, not a
well specified sequence (even though the driver is trying to handle it
correctly by disabling any active firmware handler).
But as you pointed out this would also throw off the cooked-up DC6
counter tracking, so could instead the counter update depend on the
driver's DC state instead of the HW state? I.e. set
gen9_set_dc_state()/dc6_was_enabled,
intel_dmc_get_dc6_allowed_count()/dc6_enable if power_domains->dc_state
says that DC6 was indeed enabled by the driver (instead of checking the
HW state).
That would fix the reporter's oops when calling
intel_dmc_update_dc6_allowed_count(start_tracking=false), by not calling
it if the driver hasn't actually enabled DC6 and it would also keep the
DC6 counter tracking correct.
intel_dmc_update_dc6_allowed_count(start_tracking=true) would be also
guaranteed to be called only once the firmware is loaded, as until that
point enabling DC6 is blocked (by holding a reference on the DC_off
power well).
> ---
> drivers/gpu/drm/i915/display/intel_dmc.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_dmc.c b/drivers/gpu/drm/i915/display/intel_dmc.c
> index 2fb6fec6dc99..169bbbc91f6d 100644
> --- a/drivers/gpu/drm/i915/display/intel_dmc.c
> +++ b/drivers/gpu/drm/i915/display/intel_dmc.c
> @@ -1570,10 +1570,10 @@ void intel_dmc_update_dc6_allowed_count(struct intel_display *display,
> struct intel_dmc *dmc = display_to_dmc(display);
> u32 dc5_cur_count;
>
> - if (DISPLAY_VER(dmc->display) < 14)
> + if (!dmc || DISPLAY_VER(display) < 14)
> return;
>
> - dc5_cur_count = intel_de_read(dmc->display, DG1_DMC_DEBUG_DC5_COUNT);
> + dc5_cur_count = intel_de_read(display, DG1_DMC_DEBUG_DC5_COUNT);
>
> if (!start_tracking)
> dmc->dc6_allowed.count += dc5_cur_count - dmc->dc6_allowed.dc5_start;
> @@ -1587,7 +1587,7 @@ static bool intel_dmc_get_dc6_allowed_count(struct intel_display *display, u32 *
> struct intel_dmc *dmc = display_to_dmc(display);
> bool dc6_enabled;
>
> - if (DISPLAY_VER(display) < 14)
> + if (!dmc || DISPLAY_VER(display) < 14)
> return false;
>
> mutex_lock(&power_domains->lock);
> --
> 2.47.3
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] drm/i915/dmc: fix an unlikely NULL pointer deference at probe
2025-12-02 21:24 ` Imre Deak
@ 2025-12-02 21:35 ` Imre Deak
2025-12-03 8:13 ` Jani Nikula
0 siblings, 1 reply; 12+ messages in thread
From: Imre Deak @ 2025-12-02 21:35 UTC (permalink / raw)
To: Jani Nikula, intel-gfx, intel-xe, Mohammed Thasleem, stable
On Tue, Dec 02, 2025 at 11:24:42PM +0200, Imre Deak wrote:
> On Tue, Dec 02, 2025 at 08:39:50PM +0200, Jani Nikula wrote:
> > intel_dmc_update_dc6_allowed_count() oopses when DMC hasn't been
> > initialized, and dmc is thus NULL.
> >
> > That would be the case when the call path is
> > intel_power_domains_init_hw() -> {skl,bxt,icl}_display_core_init() ->
> > gen9_set_dc_state() -> intel_dmc_update_dc6_allowed_count(), as
> > intel_power_domains_init_hw() is called *before* intel_dmc_init().
> >
> > However, gen9_set_dc_state() calls intel_dmc_update_dc6_allowed_count()
> > conditionally, depending on the current and target DC states. At probe,
> > the target is disabled, but if DC6 is enabled, the function is called,
> > and an oops follows. Apparently it's quite unlikely that DC6 is enabled
> > at probe, as we haven't seen this failure mode before.
> >
> > Add NULL checks and switch the dmc->display references to just display.
> >
> > Fixes: 88c1f9a4d36d ("drm/i915/dmc: Create debugfs entry for dc6 counter")
> > Cc: Mohammed Thasleem <mohammed.thasleem@intel.com>
> > Cc: Imre Deak <imre.deak@intel.com>
> > Cc: <stable@vger.kernel.org> # v6.16+
> > Signed-off-by: Jani Nikula <jani.nikula@intel.com>
> >
> > ---
> >
> > Rare case, but this may also throw off the rc6 counting in debugfs when
> > it does happen.
>
> Yes, I missed the case where the driver is being loaded while DC6 is
> enabled, this is what happens for the reporter:
>
> i915 0000:00:04.0: [drm] *ERROR* DC state mismatch (0x0 -> 0x2)
>
> That's odd, as DC6 requires the DMC firmware, which - if it's indeed
> loaded by BIOS for instance - will be overwritten by the driver, not a
> well specified sequence (even though the driver is trying to handle it
> correctly by disabling any active firmware handler).
>
> But as you pointed out this would also throw off the cooked-up DC6
> counter tracking,
Actually the patch would keep the counter working, as the counter
wouldn't be updated in the dmc==NULL case. However I still think the
correct fix would be to check the correct DC state, which from the POV
of the counter tracking is the driver's version of the state, not the HW
state.
> so could instead the counter update depend on the
> driver's DC state instead of the HW state? I.e. set
> gen9_set_dc_state()/dc6_was_enabled,
> intel_dmc_get_dc6_allowed_count()/dc6_enable if power_domains->dc_state
> says that DC6 was indeed enabled by the driver (instead of checking the
> HW state).
>
> That would fix the reporter's oops when calling
> intel_dmc_update_dc6_allowed_count(start_tracking=false), by not calling
> it if the driver hasn't actually enabled DC6 and it would also keep the
> DC6 counter tracking correct.
>
> intel_dmc_update_dc6_allowed_count(start_tracking=true) would be also
> guaranteed to be called only once the firmware is loaded, as until that
> point enabling DC6 is blocked (by holding a reference on the DC_off
> power well).
>
> > ---
> > drivers/gpu/drm/i915/display/intel_dmc.c | 6 +++---
> > 1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/display/intel_dmc.c b/drivers/gpu/drm/i915/display/intel_dmc.c
> > index 2fb6fec6dc99..169bbbc91f6d 100644
> > --- a/drivers/gpu/drm/i915/display/intel_dmc.c
> > +++ b/drivers/gpu/drm/i915/display/intel_dmc.c
> > @@ -1570,10 +1570,10 @@ void intel_dmc_update_dc6_allowed_count(struct intel_display *display,
> > struct intel_dmc *dmc = display_to_dmc(display);
> > u32 dc5_cur_count;
> >
> > - if (DISPLAY_VER(dmc->display) < 14)
> > + if (!dmc || DISPLAY_VER(display) < 14)
> > return;
> >
> > - dc5_cur_count = intel_de_read(dmc->display, DG1_DMC_DEBUG_DC5_COUNT);
> > + dc5_cur_count = intel_de_read(display, DG1_DMC_DEBUG_DC5_COUNT);
> >
> > if (!start_tracking)
> > dmc->dc6_allowed.count += dc5_cur_count - dmc->dc6_allowed.dc5_start;
> > @@ -1587,7 +1587,7 @@ static bool intel_dmc_get_dc6_allowed_count(struct intel_display *display, u32 *
> > struct intel_dmc *dmc = display_to_dmc(display);
> > bool dc6_enabled;
> >
> > - if (DISPLAY_VER(display) < 14)
> > + if (!dmc || DISPLAY_VER(display) < 14)
> > return false;
> >
> > mutex_lock(&power_domains->lock);
> > --
> > 2.47.3
> >
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] drm/i915/dmc: fix an unlikely NULL pointer deference at probe
2025-12-02 21:35 ` Imre Deak
@ 2025-12-03 8:13 ` Jani Nikula
2025-12-03 10:38 ` Imre Deak
0 siblings, 1 reply; 12+ messages in thread
From: Jani Nikula @ 2025-12-03 8:13 UTC (permalink / raw)
To: imre.deak, intel-gfx, intel-xe, Mohammed Thasleem, stable
On Tue, 02 Dec 2025, Imre Deak <imre.deak@intel.com> wrote:
> On Tue, Dec 02, 2025 at 11:24:42PM +0200, Imre Deak wrote:
>> On Tue, Dec 02, 2025 at 08:39:50PM +0200, Jani Nikula wrote:
>> > intel_dmc_update_dc6_allowed_count() oopses when DMC hasn't been
>> > initialized, and dmc is thus NULL.
>> >
>> > That would be the case when the call path is
>> > intel_power_domains_init_hw() -> {skl,bxt,icl}_display_core_init() ->
>> > gen9_set_dc_state() -> intel_dmc_update_dc6_allowed_count(), as
>> > intel_power_domains_init_hw() is called *before* intel_dmc_init().
>> >
>> > However, gen9_set_dc_state() calls intel_dmc_update_dc6_allowed_count()
>> > conditionally, depending on the current and target DC states. At probe,
>> > the target is disabled, but if DC6 is enabled, the function is called,
>> > and an oops follows. Apparently it's quite unlikely that DC6 is enabled
>> > at probe, as we haven't seen this failure mode before.
>> >
>> > Add NULL checks and switch the dmc->display references to just display.
>> >
>> > Fixes: 88c1f9a4d36d ("drm/i915/dmc: Create debugfs entry for dc6 counter")
>> > Cc: Mohammed Thasleem <mohammed.thasleem@intel.com>
>> > Cc: Imre Deak <imre.deak@intel.com>
>> > Cc: <stable@vger.kernel.org> # v6.16+
>> > Signed-off-by: Jani Nikula <jani.nikula@intel.com>
>> >
>> > ---
>> >
>> > Rare case, but this may also throw off the rc6 counting in debugfs when
>> > it does happen.
>>
>> Yes, I missed the case where the driver is being loaded while DC6 is
>> enabled, this is what happens for the reporter:
>>
>> i915 0000:00:04.0: [drm] *ERROR* DC state mismatch (0x0 -> 0x2)
>>
>> That's odd, as DC6 requires the DMC firmware, which - if it's indeed
>> loaded by BIOS for instance - will be overwritten by the driver, not a
>> well specified sequence (even though the driver is trying to handle it
>> correctly by disabling any active firmware handler).
>>
>> But as you pointed out this would also throw off the cooked-up DC6
>> counter tracking,
>
> Actually the patch would keep the counter working, as the counter
> wouldn't be updated in the dmc==NULL case. However I still think the
> correct fix would be to check the correct DC state, which from the POV
> of the counter tracking is the driver's version of the state, not the HW
> state.
One thing I failed to mention is that this happens in a KASAN run in
QEMU. So I'm kind of not surprised we haven't hit this before. And it
impacts the deductions about the DC state.
I'm not quite sure what exactly you're suggesting, maybe a draft patch
would communicate the idea better than plain English? ;)
Anyway, I think "not oopsing" is a lot better than "inaccurate DC
counters in debugfs".
BR,
Jani.
>
>> so could instead the counter update depend on the
>> driver's DC state instead of the HW state? I.e. set
>> gen9_set_dc_state()/dc6_was_enabled,
>> intel_dmc_get_dc6_allowed_count()/dc6_enable if power_domains->dc_state
>> says that DC6 was indeed enabled by the driver (instead of checking the
>> HW state).
>>
>> That would fix the reporter's oops when calling
>> intel_dmc_update_dc6_allowed_count(start_tracking=false), by not calling
>> it if the driver hasn't actually enabled DC6 and it would also keep the
>> DC6 counter tracking correct.
>>
>> intel_dmc_update_dc6_allowed_count(start_tracking=true) would be also
>> guaranteed to be called only once the firmware is loaded, as until that
>> point enabling DC6 is blocked (by holding a reference on the DC_off
>> power well).
>>
>> > ---
>> > drivers/gpu/drm/i915/display/intel_dmc.c | 6 +++---
>> > 1 file changed, 3 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/i915/display/intel_dmc.c b/drivers/gpu/drm/i915/display/intel_dmc.c
>> > index 2fb6fec6dc99..169bbbc91f6d 100644
>> > --- a/drivers/gpu/drm/i915/display/intel_dmc.c
>> > +++ b/drivers/gpu/drm/i915/display/intel_dmc.c
>> > @@ -1570,10 +1570,10 @@ void intel_dmc_update_dc6_allowed_count(struct intel_display *display,
>> > struct intel_dmc *dmc = display_to_dmc(display);
>> > u32 dc5_cur_count;
>> >
>> > - if (DISPLAY_VER(dmc->display) < 14)
>> > + if (!dmc || DISPLAY_VER(display) < 14)
>> > return;
>> >
>> > - dc5_cur_count = intel_de_read(dmc->display, DG1_DMC_DEBUG_DC5_COUNT);
>> > + dc5_cur_count = intel_de_read(display, DG1_DMC_DEBUG_DC5_COUNT);
>> >
>> > if (!start_tracking)
>> > dmc->dc6_allowed.count += dc5_cur_count - dmc->dc6_allowed.dc5_start;
>> > @@ -1587,7 +1587,7 @@ static bool intel_dmc_get_dc6_allowed_count(struct intel_display *display, u32 *
>> > struct intel_dmc *dmc = display_to_dmc(display);
>> > bool dc6_enabled;
>> >
>> > - if (DISPLAY_VER(display) < 14)
>> > + if (!dmc || DISPLAY_VER(display) < 14)
>> > return false;
>> >
>> > mutex_lock(&power_domains->lock);
>> > --
>> > 2.47.3
>> >
--
Jani Nikula, Intel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] drm/i915/dmc: fix an unlikely NULL pointer deference at probe
2025-12-03 8:13 ` Jani Nikula
@ 2025-12-03 10:38 ` Imre Deak
2025-12-04 11:30 ` Jani Nikula
0 siblings, 1 reply; 12+ messages in thread
From: Imre Deak @ 2025-12-03 10:38 UTC (permalink / raw)
To: Jani Nikula; +Cc: intel-gfx, intel-xe, Mohammed Thasleem, stable
On Wed, Dec 03, 2025 at 10:13:44AM +0200, Jani Nikula wrote:
> On Tue, 02 Dec 2025, Imre Deak <imre.deak@intel.com> wrote:
> > On Tue, Dec 02, 2025 at 11:24:42PM +0200, Imre Deak wrote:
> >> On Tue, Dec 02, 2025 at 08:39:50PM +0200, Jani Nikula wrote:
> >> > intel_dmc_update_dc6_allowed_count() oopses when DMC hasn't been
> >> > initialized, and dmc is thus NULL.
> >> >
> >> > That would be the case when the call path is
> >> > intel_power_domains_init_hw() -> {skl,bxt,icl}_display_core_init() ->
> >> > gen9_set_dc_state() -> intel_dmc_update_dc6_allowed_count(), as
> >> > intel_power_domains_init_hw() is called *before* intel_dmc_init().
> >> >
> >> > However, gen9_set_dc_state() calls intel_dmc_update_dc6_allowed_count()
> >> > conditionally, depending on the current and target DC states. At probe,
> >> > the target is disabled, but if DC6 is enabled, the function is called,
> >> > and an oops follows. Apparently it's quite unlikely that DC6 is enabled
> >> > at probe, as we haven't seen this failure mode before.
> >> >
> >> > Add NULL checks and switch the dmc->display references to just display.
> >> >
> >> > Fixes: 88c1f9a4d36d ("drm/i915/dmc: Create debugfs entry for dc6 counter")
> >> > Cc: Mohammed Thasleem <mohammed.thasleem@intel.com>
> >> > Cc: Imre Deak <imre.deak@intel.com>
> >> > Cc: <stable@vger.kernel.org> # v6.16+
> >> > Signed-off-by: Jani Nikula <jani.nikula@intel.com>
> >> >
> >> > ---
> >> >
> >> > Rare case, but this may also throw off the rc6 counting in debugfs when
> >> > it does happen.
> >>
> >> Yes, I missed the case where the driver is being loaded while DC6 is
> >> enabled, this is what happens for the reporter:
> >>
> >> i915 0000:00:04.0: [drm] *ERROR* DC state mismatch (0x0 -> 0x2)
> >>
> >> That's odd, as DC6 requires the DMC firmware, which - if it's indeed
> >> loaded by BIOS for instance - will be overwritten by the driver, not a
> >> well specified sequence (even though the driver is trying to handle it
> >> correctly by disabling any active firmware handler).
> >>
> >> But as you pointed out this would also throw off the cooked-up DC6
> >> counter tracking,
> >
> > Actually the patch would keep the counter working, as the counter
> > wouldn't be updated in the dmc==NULL case. However I still think the
> > correct fix would be to check the correct DC state, which from the POV
> > of the counter tracking is the driver's version of the state, not the HW
> > state.
>
> One thing I failed to mention is that this happens in a KASAN run in
> QEMU. So I'm kind of not surprised we haven't hit this before. And it
> impacts the deductions about the DC state.
Ok, it's strange why QEMU decides to initialize the DC_STATE_EN register
to a non-zero value then. But in any case the driver should handle it.
> I'm not quite sure what exactly you're suggesting, maybe a draft patch
> would communicate the idea better than plain English? ;)
intel_dmc_get_dc6_allowed_count() still needs to check for dmc==NULL, as
the debugfs entry can be read at any point. With that, what I meant is:
in gen9_set_dc_state():
...
- dc6_was_enabled = val & DC_STATE_EN_UPTO_DC6;
+ dc6_was_enabled = power_domains->dc_state & DC_STATE_EN_UPTO_DC6;
...
in intel_dmc_get_dc6_allowed_count():
...
if (DISPLAY_VER(display) < 14)
return false;
+ if (!dmc) {
+ *count = 0;
+ return true;
+ }
+
mutex_lock(&power_domains->lock);
- dc6_enabled = intel_de_read(display, DC_STATE_EN) &
- DC_STATE_EN_UPTO_DC6;
+ dc6_enabled = power_domains->dc_state & DC_STATE_EN_UPTO_DC6;
...
> Anyway, I think "not oopsing" is a lot better than "inaccurate DC
> counters in debugfs".
Agreed, the above would ensure both.
>
> BR,
> Jani.
>
>
> >
> >> so could instead the counter update depend on the
> >> driver's DC state instead of the HW state? I.e. set
> >> gen9_set_dc_state()/dc6_was_enabled,
> >> intel_dmc_get_dc6_allowed_count()/dc6_enable if power_domains->dc_state
> >> says that DC6 was indeed enabled by the driver (instead of checking the
> >> HW state).
> >>
> >> That would fix the reporter's oops when calling
> >> intel_dmc_update_dc6_allowed_count(start_tracking=false), by not calling
> >> it if the driver hasn't actually enabled DC6 and it would also keep the
> >> DC6 counter tracking correct.
> >>
> >> intel_dmc_update_dc6_allowed_count(start_tracking=true) would be also
> >> guaranteed to be called only once the firmware is loaded, as until that
> >> point enabling DC6 is blocked (by holding a reference on the DC_off
> >> power well).
> >>
> >> > ---
> >> > drivers/gpu/drm/i915/display/intel_dmc.c | 6 +++---
> >> > 1 file changed, 3 insertions(+), 3 deletions(-)
> >> >
> >> > diff --git a/drivers/gpu/drm/i915/display/intel_dmc.c b/drivers/gpu/drm/i915/display/intel_dmc.c
> >> > index 2fb6fec6dc99..169bbbc91f6d 100644
> >> > --- a/drivers/gpu/drm/i915/display/intel_dmc.c
> >> > +++ b/drivers/gpu/drm/i915/display/intel_dmc.c
> >> > @@ -1570,10 +1570,10 @@ void intel_dmc_update_dc6_allowed_count(struct intel_display *display,
> >> > struct intel_dmc *dmc = display_to_dmc(display);
> >> > u32 dc5_cur_count;
> >> >
> >> > - if (DISPLAY_VER(dmc->display) < 14)
> >> > + if (!dmc || DISPLAY_VER(display) < 14)
> >> > return;
> >> >
> >> > - dc5_cur_count = intel_de_read(dmc->display, DG1_DMC_DEBUG_DC5_COUNT);
> >> > + dc5_cur_count = intel_de_read(display, DG1_DMC_DEBUG_DC5_COUNT);
> >> >
> >> > if (!start_tracking)
> >> > dmc->dc6_allowed.count += dc5_cur_count - dmc->dc6_allowed.dc5_start;
> >> > @@ -1587,7 +1587,7 @@ static bool intel_dmc_get_dc6_allowed_count(struct intel_display *display, u32 *
> >> > struct intel_dmc *dmc = display_to_dmc(display);
> >> > bool dc6_enabled;
> >> >
> >> > - if (DISPLAY_VER(display) < 14)
> >> > + if (!dmc || DISPLAY_VER(display) < 14)
> >> > return false;
> >> >
> >> > mutex_lock(&power_domains->lock);
> >> > --
> >> > 2.47.3
> >> >
>
> --
> Jani Nikula, Intel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] drm/i915/dmc: fix an unlikely NULL pointer deference at probe
2025-12-03 10:38 ` Imre Deak
@ 2025-12-04 11:30 ` Jani Nikula
2025-12-04 13:02 ` Imre Deak
0 siblings, 1 reply; 12+ messages in thread
From: Jani Nikula @ 2025-12-04 11:30 UTC (permalink / raw)
To: imre.deak; +Cc: intel-gfx, intel-xe, Mohammed Thasleem, stable
On Wed, 03 Dec 2025, Imre Deak <imre.deak@intel.com> wrote:
> On Wed, Dec 03, 2025 at 10:13:44AM +0200, Jani Nikula wrote:
>> On Tue, 02 Dec 2025, Imre Deak <imre.deak@intel.com> wrote:
>> > On Tue, Dec 02, 2025 at 11:24:42PM +0200, Imre Deak wrote:
>> >> On Tue, Dec 02, 2025 at 08:39:50PM +0200, Jani Nikula wrote:
>> >> > intel_dmc_update_dc6_allowed_count() oopses when DMC hasn't been
>> >> > initialized, and dmc is thus NULL.
>> >> >
>> >> > That would be the case when the call path is
>> >> > intel_power_domains_init_hw() -> {skl,bxt,icl}_display_core_init() ->
>> >> > gen9_set_dc_state() -> intel_dmc_update_dc6_allowed_count(), as
>> >> > intel_power_domains_init_hw() is called *before* intel_dmc_init().
>> >> >
>> >> > However, gen9_set_dc_state() calls intel_dmc_update_dc6_allowed_count()
>> >> > conditionally, depending on the current and target DC states. At probe,
>> >> > the target is disabled, but if DC6 is enabled, the function is called,
>> >> > and an oops follows. Apparently it's quite unlikely that DC6 is enabled
>> >> > at probe, as we haven't seen this failure mode before.
>> >> >
>> >> > Add NULL checks and switch the dmc->display references to just display.
>> >> >
>> >> > Fixes: 88c1f9a4d36d ("drm/i915/dmc: Create debugfs entry for dc6 counter")
>> >> > Cc: Mohammed Thasleem <mohammed.thasleem@intel.com>
>> >> > Cc: Imre Deak <imre.deak@intel.com>
>> >> > Cc: <stable@vger.kernel.org> # v6.16+
>> >> > Signed-off-by: Jani Nikula <jani.nikula@intel.com>
>> >> >
>> >> > ---
>> >> >
>> >> > Rare case, but this may also throw off the rc6 counting in debugfs when
>> >> > it does happen.
>> >>
>> >> Yes, I missed the case where the driver is being loaded while DC6 is
>> >> enabled, this is what happens for the reporter:
>> >>
>> >> i915 0000:00:04.0: [drm] *ERROR* DC state mismatch (0x0 -> 0x2)
>> >>
>> >> That's odd, as DC6 requires the DMC firmware, which - if it's indeed
>> >> loaded by BIOS for instance - will be overwritten by the driver, not a
>> >> well specified sequence (even though the driver is trying to handle it
>> >> correctly by disabling any active firmware handler).
>> >>
>> >> But as you pointed out this would also throw off the cooked-up DC6
>> >> counter tracking,
>> >
>> > Actually the patch would keep the counter working, as the counter
>> > wouldn't be updated in the dmc==NULL case. However I still think the
>> > correct fix would be to check the correct DC state, which from the POV
>> > of the counter tracking is the driver's version of the state, not the HW
>> > state.
>>
>> One thing I failed to mention is that this happens in a KASAN run in
>> QEMU. So I'm kind of not surprised we haven't hit this before. And it
>> impacts the deductions about the DC state.
>
> Ok, it's strange why QEMU decides to initialize the DC_STATE_EN register
> to a non-zero value then. But in any case the driver should handle it.
>
>> I'm not quite sure what exactly you're suggesting, maybe a draft patch
>> would communicate the idea better than plain English? ;)
>
> intel_dmc_get_dc6_allowed_count() still needs to check for dmc==NULL, as
> the debugfs entry can be read at any point. With that, what I meant is:
>
> in gen9_set_dc_state():
> ...
> - dc6_was_enabled = val & DC_STATE_EN_UPTO_DC6;
> + dc6_was_enabled = power_domains->dc_state & DC_STATE_EN_UPTO_DC6;
I still don't understand why we can trust our own value rather than
what's in the hardware in this case.
For resume, we even call gen9_sanitize_dc_state(), but not for probe.
> ...
>
> in intel_dmc_get_dc6_allowed_count():
> ...
> if (DISPLAY_VER(display) < 14)
> return false;
>
> + if (!dmc) {
> + *count = 0;
> + return true;
> + }
> +
This seems neat but is overkill. dmc is never NULL here, but I added the
check for completeness.
It's the intel_dmc_update_dc6_allowed_count() that's more fragile, and I
want that to have the !dmc check, instead of relying on the subtle
dependency on power_domains->dc_state.
> mutex_lock(&power_domains->lock);
> - dc6_enabled = intel_de_read(display, DC_STATE_EN) &
> - DC_STATE_EN_UPTO_DC6;
> + dc6_enabled = power_domains->dc_state & DC_STATE_EN_UPTO_DC6;
> ...
>
>> Anyway, I think "not oopsing" is a lot better than "inaccurate DC
>> counters in debugfs".
>
> Agreed, the above would ensure both.
>
>>
>> BR,
>> Jani.
>>
>>
>> >
>> >> so could instead the counter update depend on the
>> >> driver's DC state instead of the HW state? I.e. set
>> >> gen9_set_dc_state()/dc6_was_enabled,
>> >> intel_dmc_get_dc6_allowed_count()/dc6_enable if power_domains->dc_state
>> >> says that DC6 was indeed enabled by the driver (instead of checking the
>> >> HW state).
>> >>
>> >> That would fix the reporter's oops when calling
>> >> intel_dmc_update_dc6_allowed_count(start_tracking=false), by not calling
>> >> it if the driver hasn't actually enabled DC6 and it would also keep the
>> >> DC6 counter tracking correct.
>> >>
>> >> intel_dmc_update_dc6_allowed_count(start_tracking=true) would be also
>> >> guaranteed to be called only once the firmware is loaded, as until that
>> >> point enabling DC6 is blocked (by holding a reference on the DC_off
>> >> power well).
>> >>
>> >> > ---
>> >> > drivers/gpu/drm/i915/display/intel_dmc.c | 6 +++---
>> >> > 1 file changed, 3 insertions(+), 3 deletions(-)
>> >> >
>> >> > diff --git a/drivers/gpu/drm/i915/display/intel_dmc.c b/drivers/gpu/drm/i915/display/intel_dmc.c
>> >> > index 2fb6fec6dc99..169bbbc91f6d 100644
>> >> > --- a/drivers/gpu/drm/i915/display/intel_dmc.c
>> >> > +++ b/drivers/gpu/drm/i915/display/intel_dmc.c
>> >> > @@ -1570,10 +1570,10 @@ void intel_dmc_update_dc6_allowed_count(struct intel_display *display,
>> >> > struct intel_dmc *dmc = display_to_dmc(display);
>> >> > u32 dc5_cur_count;
>> >> >
>> >> > - if (DISPLAY_VER(dmc->display) < 14)
>> >> > + if (!dmc || DISPLAY_VER(display) < 14)
>> >> > return;
>> >> >
>> >> > - dc5_cur_count = intel_de_read(dmc->display, DG1_DMC_DEBUG_DC5_COUNT);
>> >> > + dc5_cur_count = intel_de_read(display, DG1_DMC_DEBUG_DC5_COUNT);
>> >> >
>> >> > if (!start_tracking)
>> >> > dmc->dc6_allowed.count += dc5_cur_count - dmc->dc6_allowed.dc5_start;
>> >> > @@ -1587,7 +1587,7 @@ static bool intel_dmc_get_dc6_allowed_count(struct intel_display *display, u32 *
>> >> > struct intel_dmc *dmc = display_to_dmc(display);
>> >> > bool dc6_enabled;
>> >> >
>> >> > - if (DISPLAY_VER(display) < 14)
>> >> > + if (!dmc || DISPLAY_VER(display) < 14)
>> >> > return false;
>> >> >
>> >> > mutex_lock(&power_domains->lock);
>> >> > --
>> >> > 2.47.3
>> >> >
>>
>> --
>> Jani Nikula, Intel
--
Jani Nikula, Intel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] drm/i915/dmc: fix an unlikely NULL pointer deference at probe
2025-12-04 11:30 ` Jani Nikula
@ 2025-12-04 13:02 ` Imre Deak
2026-01-07 11:16 ` Jani Nikula
0 siblings, 1 reply; 12+ messages in thread
From: Imre Deak @ 2025-12-04 13:02 UTC (permalink / raw)
To: Jani Nikula; +Cc: intel-gfx, intel-xe, Mohammed Thasleem, stable
On Thu, Dec 04, 2025 at 01:30:27PM +0200, Jani Nikula wrote:
> On Wed, 03 Dec 2025, Imre Deak <imre.deak@intel.com> wrote:
> > On Wed, Dec 03, 2025 at 10:13:44AM +0200, Jani Nikula wrote:
> >> On Tue, 02 Dec 2025, Imre Deak <imre.deak@intel.com> wrote:
> >> > On Tue, Dec 02, 2025 at 11:24:42PM +0200, Imre Deak wrote:
> >> >> On Tue, Dec 02, 2025 at 08:39:50PM +0200, Jani Nikula wrote:
> >> >> > intel_dmc_update_dc6_allowed_count() oopses when DMC hasn't been
> >> >> > initialized, and dmc is thus NULL.
> >> >> >
> >> >> > That would be the case when the call path is
> >> >> > intel_power_domains_init_hw() -> {skl,bxt,icl}_display_core_init() ->
> >> >> > gen9_set_dc_state() -> intel_dmc_update_dc6_allowed_count(), as
> >> >> > intel_power_domains_init_hw() is called *before* intel_dmc_init().
> >> >> >
> >> >> > However, gen9_set_dc_state() calls intel_dmc_update_dc6_allowed_count()
> >> >> > conditionally, depending on the current and target DC states. At probe,
> >> >> > the target is disabled, but if DC6 is enabled, the function is called,
> >> >> > and an oops follows. Apparently it's quite unlikely that DC6 is enabled
> >> >> > at probe, as we haven't seen this failure mode before.
> >> >> >
> >> >> > Add NULL checks and switch the dmc->display references to just display.
> >> >> >
> >> >> > Fixes: 88c1f9a4d36d ("drm/i915/dmc: Create debugfs entry for dc6 counter")
> >> >> > Cc: Mohammed Thasleem <mohammed.thasleem@intel.com>
> >> >> > Cc: Imre Deak <imre.deak@intel.com>
> >> >> > Cc: <stable@vger.kernel.org> # v6.16+
> >> >> > Signed-off-by: Jani Nikula <jani.nikula@intel.com>
> >> >> >
> >> >> > ---
> >> >> >
> >> >> > Rare case, but this may also throw off the rc6 counting in debugfs when
> >> >> > it does happen.
> >> >>
> >> >> Yes, I missed the case where the driver is being loaded while DC6 is
> >> >> enabled, this is what happens for the reporter:
> >> >>
> >> >> i915 0000:00:04.0: [drm] *ERROR* DC state mismatch (0x0 -> 0x2)
> >> >>
> >> >> That's odd, as DC6 requires the DMC firmware, which - if it's indeed
> >> >> loaded by BIOS for instance - will be overwritten by the driver, not a
> >> >> well specified sequence (even though the driver is trying to handle it
> >> >> correctly by disabling any active firmware handler).
> >> >>
> >> >> But as you pointed out this would also throw off the cooked-up DC6
> >> >> counter tracking,
> >> >
> >> > Actually the patch would keep the counter working, as the counter
> >> > wouldn't be updated in the dmc==NULL case. However I still think the
> >> > correct fix would be to check the correct DC state, which from the POV
> >> > of the counter tracking is the driver's version of the state, not the HW
> >> > state.
> >>
> >> One thing I failed to mention is that this happens in a KASAN run in
> >> QEMU. So I'm kind of not surprised we haven't hit this before. And it
> >> impacts the deductions about the DC state.
> >
> > Ok, it's strange why QEMU decides to initialize the DC_STATE_EN register
> > to a non-zero value then. But in any case the driver should handle it.
> >
> >> I'm not quite sure what exactly you're suggesting, maybe a draft patch
> >> would communicate the idea better than plain English? ;)
> >
> > intel_dmc_get_dc6_allowed_count() still needs to check for dmc==NULL, as
> > the debugfs entry can be read at any point. With that, what I meant is:
> >
> > in gen9_set_dc_state():
> > ...
> > - dc6_was_enabled = val & DC_STATE_EN_UPTO_DC6;
> > + dc6_was_enabled = power_domains->dc_state & DC_STATE_EN_UPTO_DC6;
>
> I still don't understand why we can trust our own value rather than
> what's in the hardware in this case.
The BIOS/FW can set random flags in the register, as in the above case,
so it can't be trusted. The counter update ending the tracking of the
duration of a DC6 enabled state only works if the driver did in fact
enable DC6 previously and has started the tracking accordingly. This is
only guaranteed if the driver has set DC_STATE_EN_UPTO_DC6 in
power_domains->dc_state, the corrsponding HW flag doesn't guarantee it.
> For resume, we even call gen9_sanitize_dc_state(), but not for probe.
After system suspend, the driver enabling DC9 by setting the
corrsponding DC9 flag in the DC_STATE_EN register, the HW/firmware will
disable DC9 while resuming. The SW version of the DC state will be
updated accordingly in the above function to reflect the disabled DC9
state.
> > ...
> >
> > in intel_dmc_get_dc6_allowed_count():
> > ...
> > if (DISPLAY_VER(display) < 14)
> > return false;
> >
> > + if (!dmc) {
> > + *count = 0;
> > + return true;
> > + }
> > +
>
> This seems neat but is overkill. dmc is never NULL here, but I added the
> check for completeness.
intel_dmc_get_dc6_allowed_count() shouldn't fall back on DISPLAY_VER>=14
to report the DC6 residency in a way that only works for older
platforms. Hence the function should return true for DISPLAY_VER>=14.
> It's the intel_dmc_update_dc6_allowed_count() that's more fragile, and I
> want that to have the !dmc check, instead of relying on the subtle
> dependency on power_domains->dc_state.
The counter tracking should depend on the power_domians->dc_state SW
state as described above, so that's the correct thing to do there.
dmc==NULL in intel_dmc_update_dc6_allowed_count() would be only a bug in
the driver, if you wanted to check for that it should be a
WARN_ON(!dmc) check.
> > mutex_lock(&power_domains->lock);
> > - dc6_enabled = intel_de_read(display, DC_STATE_EN) &
> > - DC_STATE_EN_UPTO_DC6;
> > + dc6_enabled = power_domains->dc_state & DC_STATE_EN_UPTO_DC6;
> > ...
> >
> >> Anyway, I think "not oopsing" is a lot better than "inaccurate DC
> >> counters in debugfs".
> >
> > Agreed, the above would ensure both.
> >
> >>
> >> BR,
> >> Jani.
> >>
> >>
> >> >
> >> >> so could instead the counter update depend on the
> >> >> driver's DC state instead of the HW state? I.e. set
> >> >> gen9_set_dc_state()/dc6_was_enabled,
> >> >> intel_dmc_get_dc6_allowed_count()/dc6_enable if power_domains->dc_state
> >> >> says that DC6 was indeed enabled by the driver (instead of checking the
> >> >> HW state).
> >> >>
> >> >> That would fix the reporter's oops when calling
> >> >> intel_dmc_update_dc6_allowed_count(start_tracking=false), by not calling
> >> >> it if the driver hasn't actually enabled DC6 and it would also keep the
> >> >> DC6 counter tracking correct.
> >> >>
> >> >> intel_dmc_update_dc6_allowed_count(start_tracking=true) would be also
> >> >> guaranteed to be called only once the firmware is loaded, as until that
> >> >> point enabling DC6 is blocked (by holding a reference on the DC_off
> >> >> power well).
> >> >>
> >> >> > ---
> >> >> > drivers/gpu/drm/i915/display/intel_dmc.c | 6 +++---
> >> >> > 1 file changed, 3 insertions(+), 3 deletions(-)
> >> >> >
> >> >> > diff --git a/drivers/gpu/drm/i915/display/intel_dmc.c b/drivers/gpu/drm/i915/display/intel_dmc.c
> >> >> > index 2fb6fec6dc99..169bbbc91f6d 100644
> >> >> > --- a/drivers/gpu/drm/i915/display/intel_dmc.c
> >> >> > +++ b/drivers/gpu/drm/i915/display/intel_dmc.c
> >> >> > @@ -1570,10 +1570,10 @@ void intel_dmc_update_dc6_allowed_count(struct intel_display *display,
> >> >> > struct intel_dmc *dmc = display_to_dmc(display);
> >> >> > u32 dc5_cur_count;
> >> >> >
> >> >> > - if (DISPLAY_VER(dmc->display) < 14)
> >> >> > + if (!dmc || DISPLAY_VER(display) < 14)
> >> >> > return;
> >> >> >
> >> >> > - dc5_cur_count = intel_de_read(dmc->display, DG1_DMC_DEBUG_DC5_COUNT);
> >> >> > + dc5_cur_count = intel_de_read(display, DG1_DMC_DEBUG_DC5_COUNT);
> >> >> >
> >> >> > if (!start_tracking)
> >> >> > dmc->dc6_allowed.count += dc5_cur_count - dmc->dc6_allowed.dc5_start;
> >> >> > @@ -1587,7 +1587,7 @@ static bool intel_dmc_get_dc6_allowed_count(struct intel_display *display, u32 *
> >> >> > struct intel_dmc *dmc = display_to_dmc(display);
> >> >> > bool dc6_enabled;
> >> >> >
> >> >> > - if (DISPLAY_VER(display) < 14)
> >> >> > + if (!dmc || DISPLAY_VER(display) < 14)
> >> >> > return false;
> >> >> >
> >> >> > mutex_lock(&power_domains->lock);
> >> >> > --
> >> >> > 2.47.3
> >> >> >
> >>
> >> --
> >> Jani Nikula, Intel
>
> --
> Jani Nikula, Intel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] drm/i915/dmc: fix an unlikely NULL pointer deference at probe
2025-12-04 13:02 ` Imre Deak
@ 2026-01-07 11:16 ` Jani Nikula
0 siblings, 0 replies; 12+ messages in thread
From: Jani Nikula @ 2026-01-07 11:16 UTC (permalink / raw)
To: imre.deak; +Cc: intel-gfx, intel-xe, Mohammed Thasleem, stable
On Thu, 04 Dec 2025, Imre Deak <imre.deak@intel.com> wrote:
> On Thu, Dec 04, 2025 at 01:30:27PM +0200, Jani Nikula wrote:
>> On Wed, 03 Dec 2025, Imre Deak <imre.deak@intel.com> wrote:
>> > On Wed, Dec 03, 2025 at 10:13:44AM +0200, Jani Nikula wrote:
>> >> On Tue, 02 Dec 2025, Imre Deak <imre.deak@intel.com> wrote:
>> >> > On Tue, Dec 02, 2025 at 11:24:42PM +0200, Imre Deak wrote:
>> >> >> On Tue, Dec 02, 2025 at 08:39:50PM +0200, Jani Nikula wrote:
>> >> >> > intel_dmc_update_dc6_allowed_count() oopses when DMC hasn't been
>> >> >> > initialized, and dmc is thus NULL.
>> >> >> >
>> >> >> > That would be the case when the call path is
>> >> >> > intel_power_domains_init_hw() -> {skl,bxt,icl}_display_core_init() ->
>> >> >> > gen9_set_dc_state() -> intel_dmc_update_dc6_allowed_count(), as
>> >> >> > intel_power_domains_init_hw() is called *before* intel_dmc_init().
>> >> >> >
>> >> >> > However, gen9_set_dc_state() calls intel_dmc_update_dc6_allowed_count()
>> >> >> > conditionally, depending on the current and target DC states. At probe,
>> >> >> > the target is disabled, but if DC6 is enabled, the function is called,
>> >> >> > and an oops follows. Apparently it's quite unlikely that DC6 is enabled
>> >> >> > at probe, as we haven't seen this failure mode before.
>> >> >> >
>> >> >> > Add NULL checks and switch the dmc->display references to just display.
>> >> >> >
>> >> >> > Fixes: 88c1f9a4d36d ("drm/i915/dmc: Create debugfs entry for dc6 counter")
>> >> >> > Cc: Mohammed Thasleem <mohammed.thasleem@intel.com>
>> >> >> > Cc: Imre Deak <imre.deak@intel.com>
>> >> >> > Cc: <stable@vger.kernel.org> # v6.16+
>> >> >> > Signed-off-by: Jani Nikula <jani.nikula@intel.com>
>> >> >> >
>> >> >> > ---
>> >> >> >
>> >> >> > Rare case, but this may also throw off the rc6 counting in debugfs when
>> >> >> > it does happen.
>> >> >>
>> >> >> Yes, I missed the case where the driver is being loaded while DC6 is
>> >> >> enabled, this is what happens for the reporter:
>> >> >>
>> >> >> i915 0000:00:04.0: [drm] *ERROR* DC state mismatch (0x0 -> 0x2)
>> >> >>
>> >> >> That's odd, as DC6 requires the DMC firmware, which - if it's indeed
>> >> >> loaded by BIOS for instance - will be overwritten by the driver, not a
>> >> >> well specified sequence (even though the driver is trying to handle it
>> >> >> correctly by disabling any active firmware handler).
>> >> >>
>> >> >> But as you pointed out this would also throw off the cooked-up DC6
>> >> >> counter tracking,
>> >> >
>> >> > Actually the patch would keep the counter working, as the counter
>> >> > wouldn't be updated in the dmc==NULL case. However I still think the
>> >> > correct fix would be to check the correct DC state, which from the POV
>> >> > of the counter tracking is the driver's version of the state, not the HW
>> >> > state.
>> >>
>> >> One thing I failed to mention is that this happens in a KASAN run in
>> >> QEMU. So I'm kind of not surprised we haven't hit this before. And it
>> >> impacts the deductions about the DC state.
>> >
>> > Ok, it's strange why QEMU decides to initialize the DC_STATE_EN register
>> > to a non-zero value then. But in any case the driver should handle it.
>> >
>> >> I'm not quite sure what exactly you're suggesting, maybe a draft patch
>> >> would communicate the idea better than plain English? ;)
>> >
>> > intel_dmc_get_dc6_allowed_count() still needs to check for dmc==NULL, as
>> > the debugfs entry can be read at any point. With that, what I meant is:
>> >
>> > in gen9_set_dc_state():
>> > ...
>> > - dc6_was_enabled = val & DC_STATE_EN_UPTO_DC6;
>> > + dc6_was_enabled = power_domains->dc_state & DC_STATE_EN_UPTO_DC6;
>>
>> I still don't understand why we can trust our own value rather than
>> what's in the hardware in this case.
>
> The BIOS/FW can set random flags in the register, as in the above case,
> so it can't be trusted. The counter update ending the tracking of the
> duration of a DC6 enabled state only works if the driver did in fact
> enable DC6 previously and has started the tracking accordingly. This is
> only guaranteed if the driver has set DC_STATE_EN_UPTO_DC6 in
> power_domains->dc_state, the corrsponding HW flag doesn't guarantee it.
>
>> For resume, we even call gen9_sanitize_dc_state(), but not for probe.
>
> After system suspend, the driver enabling DC9 by setting the
> corrsponding DC9 flag in the DC_STATE_EN register, the HW/firmware will
> disable DC9 while resuming. The SW version of the DC state will be
> updated accordingly in the above function to reflect the disabled DC9
> state.
>
>> > ...
>> >
>> > in intel_dmc_get_dc6_allowed_count():
>> > ...
>> > if (DISPLAY_VER(display) < 14)
>> > return false;
>> >
>> > + if (!dmc) {
>> > + *count = 0;
>> > + return true;
>> > + }
>> > +
>>
>> This seems neat but is overkill. dmc is never NULL here, but I added the
>> check for completeness.
>
> intel_dmc_get_dc6_allowed_count() shouldn't fall back on DISPLAY_VER>=14
> to report the DC6 residency in a way that only works for older
> platforms. Hence the function should return true for DISPLAY_VER>=14.
>
>> It's the intel_dmc_update_dc6_allowed_count() that's more fragile, and I
>> want that to have the !dmc check, instead of relying on the subtle
>> dependency on power_domains->dc_state.
>
> The counter tracking should depend on the power_domians->dc_state SW
> state as described above, so that's the correct thing to do there.
> dmc==NULL in intel_dmc_update_dc6_allowed_count() would be only a bug in
> the driver, if you wanted to check for that it should be a
> WARN_ON(!dmc) check.
So we still have the NULL pointer dereference at probe, albeit very
unlikely. I don't really know what to do here. Care to send a patch to
fix it?
BR,
Jani.
>
>> > mutex_lock(&power_domains->lock);
>> > - dc6_enabled = intel_de_read(display, DC_STATE_EN) &
>> > - DC_STATE_EN_UPTO_DC6;
>> > + dc6_enabled = power_domains->dc_state & DC_STATE_EN_UPTO_DC6;
>> > ...
>> >
>> >> Anyway, I think "not oopsing" is a lot better than "inaccurate DC
>> >> counters in debugfs".
>> >
>> > Agreed, the above would ensure both.
>> >
>> >>
>> >> BR,
>> >> Jani.
>> >>
>> >>
>> >> >
>> >> >> so could instead the counter update depend on the
>> >> >> driver's DC state instead of the HW state? I.e. set
>> >> >> gen9_set_dc_state()/dc6_was_enabled,
>> >> >> intel_dmc_get_dc6_allowed_count()/dc6_enable if power_domains->dc_state
>> >> >> says that DC6 was indeed enabled by the driver (instead of checking the
>> >> >> HW state).
>> >> >>
>> >> >> That would fix the reporter's oops when calling
>> >> >> intel_dmc_update_dc6_allowed_count(start_tracking=false), by not calling
>> >> >> it if the driver hasn't actually enabled DC6 and it would also keep the
>> >> >> DC6 counter tracking correct.
>> >> >>
>> >> >> intel_dmc_update_dc6_allowed_count(start_tracking=true) would be also
>> >> >> guaranteed to be called only once the firmware is loaded, as until that
>> >> >> point enabling DC6 is blocked (by holding a reference on the DC_off
>> >> >> power well).
>> >> >>
>> >> >> > ---
>> >> >> > drivers/gpu/drm/i915/display/intel_dmc.c | 6 +++---
>> >> >> > 1 file changed, 3 insertions(+), 3 deletions(-)
>> >> >> >
>> >> >> > diff --git a/drivers/gpu/drm/i915/display/intel_dmc.c b/drivers/gpu/drm/i915/display/intel_dmc.c
>> >> >> > index 2fb6fec6dc99..169bbbc91f6d 100644
>> >> >> > --- a/drivers/gpu/drm/i915/display/intel_dmc.c
>> >> >> > +++ b/drivers/gpu/drm/i915/display/intel_dmc.c
>> >> >> > @@ -1570,10 +1570,10 @@ void intel_dmc_update_dc6_allowed_count(struct intel_display *display,
>> >> >> > struct intel_dmc *dmc = display_to_dmc(display);
>> >> >> > u32 dc5_cur_count;
>> >> >> >
>> >> >> > - if (DISPLAY_VER(dmc->display) < 14)
>> >> >> > + if (!dmc || DISPLAY_VER(display) < 14)
>> >> >> > return;
>> >> >> >
>> >> >> > - dc5_cur_count = intel_de_read(dmc->display, DG1_DMC_DEBUG_DC5_COUNT);
>> >> >> > + dc5_cur_count = intel_de_read(display, DG1_DMC_DEBUG_DC5_COUNT);
>> >> >> >
>> >> >> > if (!start_tracking)
>> >> >> > dmc->dc6_allowed.count += dc5_cur_count - dmc->dc6_allowed.dc5_start;
>> >> >> > @@ -1587,7 +1587,7 @@ static bool intel_dmc_get_dc6_allowed_count(struct intel_display *display, u32 *
>> >> >> > struct intel_dmc *dmc = display_to_dmc(display);
>> >> >> > bool dc6_enabled;
>> >> >> >
>> >> >> > - if (DISPLAY_VER(display) < 14)
>> >> >> > + if (!dmc || DISPLAY_VER(display) < 14)
>> >> >> > return false;
>> >> >> >
>> >> >> > mutex_lock(&power_domains->lock);
>> >> >> > --
>> >> >> > 2.47.3
>> >> >> >
>> >>
>> >> --
>> >> Jani Nikula, Intel
>>
>> --
>> Jani Nikula, Intel
--
Jani Nikula, Intel
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v2] drm/i915/dmc: fix an unlikely NULL pointer deference at probe
2025-12-02 18:39 [PATCH] drm/i915/dmc: fix an unlikely NULL pointer deference at probe Jani Nikula
2025-12-02 21:24 ` Imre Deak
@ 2026-03-02 17:48 ` Imre Deak
2026-03-03 2:44 ` Tao Liu
2026-03-06 10:12 ` Jani Nikula
1 sibling, 2 replies; 12+ messages in thread
From: Imre Deak @ 2026-03-02 17:48 UTC (permalink / raw)
To: intel-gfx, intel-xe; +Cc: Mohammed Thasleem, Jani Nikula, Tao Liu, stable
intel_dmc_update_dc6_allowed_count() oopses when DMC hasn't been
initialized, and dmc is thus NULL.
That would be the case when the call path is
intel_power_domains_init_hw() -> {skl,bxt,icl}_display_core_init() ->
gen9_set_dc_state() -> intel_dmc_update_dc6_allowed_count(), as
intel_power_domains_init_hw() is called *before* intel_dmc_init().
However, gen9_set_dc_state() calls intel_dmc_update_dc6_allowed_count()
conditionally, depending on the current and target DC states. At probe,
the target is disabled, but if DC6 is enabled, the function is called,
and an oops follows. Apparently it's quite unlikely that DC6 is enabled
at probe, as we haven't seen this failure mode before.
It is also strange to have DC6 enabled at boot, since that would require
the DMC firmware (loaded by BIOS); the BIOS loading the DMC firmware and
the driver stopping / reprogramming the firmware is a poorly specified
sequence and as such unlikely an intentional BIOS behaviour. It's more
likely that BIOS is leaving an unintentionally enabled DC6 HW state
behind (without actually loading the required DMC firmware for this).
The tracking of the DC6 allowed counter only works if starting /
stopping the counter depends on the _SW_ DC6 state vs. the current _HW_
DC6 state (since stopping the counter requires the DC5 counter captured
when the counter was started). Thus, using the HW DC6 state is incorrect
and it also leads to the above oops. Fix both issues by using the SW DC6
state for the tracking.
This is v2 of the fix originally sent by Jani, updated based on the
first References: discussion below.
Link: https://lore.kernel.org/all/3626411dc9e556452c432d0919821b76d9991217@intel.com
Link: https://lore.kernel.org/all/20260228130946.50919-2-ltao@redhat.com
Fixes: 88c1f9a4d36d ("drm/i915/dmc: Create debugfs entry for dc6 counter")
Cc: Mohammed Thasleem <mohammed.thasleem@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Tao Liu <ltao@redhat.com>
Cc: <stable@vger.kernel.org> # v6.16+
Signed-off-by: Imre Deak <imre.deak@intel.com>
---
drivers/gpu/drm/i915/display/intel_display_power_well.c | 2 +-
drivers/gpu/drm/i915/display/intel_dmc.c | 3 +--
2 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_display_power_well.c b/drivers/gpu/drm/i915/display/intel_display_power_well.c
index 9c8d29839cafc..969b2c421d308 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power_well.c
+++ b/drivers/gpu/drm/i915/display/intel_display_power_well.c
@@ -852,7 +852,7 @@ void gen9_set_dc_state(struct intel_display *display, u32 state)
power_domains->dc_state, val & mask);
enable_dc6 = state & DC_STATE_EN_UPTO_DC6;
- dc6_was_enabled = val & DC_STATE_EN_UPTO_DC6;
+ dc6_was_enabled = power_domains->dc_state & DC_STATE_EN_UPTO_DC6;
if (!dc6_was_enabled && enable_dc6)
intel_dmc_update_dc6_allowed_count(display, true);
diff --git a/drivers/gpu/drm/i915/display/intel_dmc.c b/drivers/gpu/drm/i915/display/intel_dmc.c
index c3b411259a0c5..90ba932d940ac 100644
--- a/drivers/gpu/drm/i915/display/intel_dmc.c
+++ b/drivers/gpu/drm/i915/display/intel_dmc.c
@@ -1598,8 +1598,7 @@ static bool intel_dmc_get_dc6_allowed_count(struct intel_display *display, u32 *
return false;
mutex_lock(&power_domains->lock);
- dc6_enabled = intel_de_read(display, DC_STATE_EN) &
- DC_STATE_EN_UPTO_DC6;
+ dc6_enabled = power_domains->dc_state & DC_STATE_EN_UPTO_DC6;
if (dc6_enabled)
intel_dmc_update_dc6_allowed_count(display, false);
--
2.49.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v2] drm/i915/dmc: fix an unlikely NULL pointer deference at probe
2026-03-02 17:48 ` [PATCH v2] " Imre Deak
@ 2026-03-03 2:44 ` Tao Liu
2026-03-06 10:12 ` Jani Nikula
1 sibling, 0 replies; 12+ messages in thread
From: Tao Liu @ 2026-03-03 2:44 UTC (permalink / raw)
To: Imre Deak; +Cc: intel-gfx, intel-xe, Mohammed Thasleem, Jani Nikula, stable
I have tested this patch can work for my kdump case as described in
Link: https://lore.kernel.org/all/20260228130946.50919-2-ltao@redhat.com.
Tested-by: Tao Liu <ltao@redhat.com>
On Tue, Mar 3, 2026 at 6:49 AM Imre Deak <imre.deak@intel.com> wrote:
>
> intel_dmc_update_dc6_allowed_count() oopses when DMC hasn't been
> initialized, and dmc is thus NULL.
>
> That would be the case when the call path is
> intel_power_domains_init_hw() -> {skl,bxt,icl}_display_core_init() ->
> gen9_set_dc_state() -> intel_dmc_update_dc6_allowed_count(), as
> intel_power_domains_init_hw() is called *before* intel_dmc_init().
>
> However, gen9_set_dc_state() calls intel_dmc_update_dc6_allowed_count()
> conditionally, depending on the current and target DC states. At probe,
> the target is disabled, but if DC6 is enabled, the function is called,
> and an oops follows. Apparently it's quite unlikely that DC6 is enabled
> at probe, as we haven't seen this failure mode before.
>
> It is also strange to have DC6 enabled at boot, since that would require
> the DMC firmware (loaded by BIOS); the BIOS loading the DMC firmware and
> the driver stopping / reprogramming the firmware is a poorly specified
> sequence and as such unlikely an intentional BIOS behaviour. It's more
> likely that BIOS is leaving an unintentionally enabled DC6 HW state
> behind (without actually loading the required DMC firmware for this).
>
> The tracking of the DC6 allowed counter only works if starting /
> stopping the counter depends on the _SW_ DC6 state vs. the current _HW_
> DC6 state (since stopping the counter requires the DC5 counter captured
> when the counter was started). Thus, using the HW DC6 state is incorrect
> and it also leads to the above oops. Fix both issues by using the SW DC6
> state for the tracking.
>
> This is v2 of the fix originally sent by Jani, updated based on the
> first References: discussion below.
>
> Link: https://lore.kernel.org/all/3626411dc9e556452c432d0919821b76d9991217@intel.com
> Link: https://lore.kernel.org/all/20260228130946.50919-2-ltao@redhat.com
> Fixes: 88c1f9a4d36d ("drm/i915/dmc: Create debugfs entry for dc6 counter")
> Cc: Mohammed Thasleem <mohammed.thasleem@intel.com>
> Cc: Jani Nikula <jani.nikula@linux.intel.com>
> Cc: Tao Liu <ltao@redhat.com>
> Cc: <stable@vger.kernel.org> # v6.16+
> Signed-off-by: Imre Deak <imre.deak@intel.com>
> ---
> drivers/gpu/drm/i915/display/intel_display_power_well.c | 2 +-
> drivers/gpu/drm/i915/display/intel_dmc.c | 3 +--
> 2 files changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_display_power_well.c b/drivers/gpu/drm/i915/display/intel_display_power_well.c
> index 9c8d29839cafc..969b2c421d308 100644
> --- a/drivers/gpu/drm/i915/display/intel_display_power_well.c
> +++ b/drivers/gpu/drm/i915/display/intel_display_power_well.c
> @@ -852,7 +852,7 @@ void gen9_set_dc_state(struct intel_display *display, u32 state)
> power_domains->dc_state, val & mask);
>
> enable_dc6 = state & DC_STATE_EN_UPTO_DC6;
> - dc6_was_enabled = val & DC_STATE_EN_UPTO_DC6;
> + dc6_was_enabled = power_domains->dc_state & DC_STATE_EN_UPTO_DC6;
> if (!dc6_was_enabled && enable_dc6)
> intel_dmc_update_dc6_allowed_count(display, true);
>
> diff --git a/drivers/gpu/drm/i915/display/intel_dmc.c b/drivers/gpu/drm/i915/display/intel_dmc.c
> index c3b411259a0c5..90ba932d940ac 100644
> --- a/drivers/gpu/drm/i915/display/intel_dmc.c
> +++ b/drivers/gpu/drm/i915/display/intel_dmc.c
> @@ -1598,8 +1598,7 @@ static bool intel_dmc_get_dc6_allowed_count(struct intel_display *display, u32 *
> return false;
>
> mutex_lock(&power_domains->lock);
> - dc6_enabled = intel_de_read(display, DC_STATE_EN) &
> - DC_STATE_EN_UPTO_DC6;
> + dc6_enabled = power_domains->dc_state & DC_STATE_EN_UPTO_DC6;
> if (dc6_enabled)
> intel_dmc_update_dc6_allowed_count(display, false);
>
> --
> 2.49.1
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2] drm/i915/dmc: fix an unlikely NULL pointer deference at probe
2026-03-02 17:48 ` [PATCH v2] " Imre Deak
2026-03-03 2:44 ` Tao Liu
@ 2026-03-06 10:12 ` Jani Nikula
2026-03-06 14:57 ` Imre Deak
1 sibling, 1 reply; 12+ messages in thread
From: Jani Nikula @ 2026-03-06 10:12 UTC (permalink / raw)
To: Imre Deak, intel-gfx, intel-xe; +Cc: Mohammed Thasleem, Tao Liu, stable
On Mon, 02 Mar 2026, Imre Deak <imre.deak@intel.com> wrote:
> intel_dmc_update_dc6_allowed_count() oopses when DMC hasn't been
> initialized, and dmc is thus NULL.
>
> That would be the case when the call path is
> intel_power_domains_init_hw() -> {skl,bxt,icl}_display_core_init() ->
> gen9_set_dc_state() -> intel_dmc_update_dc6_allowed_count(), as
> intel_power_domains_init_hw() is called *before* intel_dmc_init().
>
> However, gen9_set_dc_state() calls intel_dmc_update_dc6_allowed_count()
> conditionally, depending on the current and target DC states. At probe,
> the target is disabled, but if DC6 is enabled, the function is called,
> and an oops follows. Apparently it's quite unlikely that DC6 is enabled
> at probe, as we haven't seen this failure mode before.
>
> It is also strange to have DC6 enabled at boot, since that would require
> the DMC firmware (loaded by BIOS); the BIOS loading the DMC firmware and
> the driver stopping / reprogramming the firmware is a poorly specified
> sequence and as such unlikely an intentional BIOS behaviour. It's more
> likely that BIOS is leaving an unintentionally enabled DC6 HW state
> behind (without actually loading the required DMC firmware for this).
>
> The tracking of the DC6 allowed counter only works if starting /
> stopping the counter depends on the _SW_ DC6 state vs. the current _HW_
> DC6 state (since stopping the counter requires the DC5 counter captured
> when the counter was started). Thus, using the HW DC6 state is incorrect
> and it also leads to the above oops. Fix both issues by using the SW DC6
> state for the tracking.
>
> This is v2 of the fix originally sent by Jani, updated based on the
> first References: discussion below.
>
> Link: https://lore.kernel.org/all/3626411dc9e556452c432d0919821b76d9991217@intel.com
> Link: https://lore.kernel.org/all/20260228130946.50919-2-ltao@redhat.com
> Fixes: 88c1f9a4d36d ("drm/i915/dmc: Create debugfs entry for dc6 counter")
> Cc: Mohammed Thasleem <mohammed.thasleem@intel.com>
> Cc: Jani Nikula <jani.nikula@linux.intel.com>
> Cc: Tao Liu <ltao@redhat.com>
> Cc: <stable@vger.kernel.org> # v6.16+
> Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
However, I still think the whole gen9_set_dc_state() is a bit fragile
wrt DMC loaded or not. Pretty much everything else wraps the relevant
parts within intel_dmc_has_payload(), and it's obvious what's going
on. The comment for the function primarily talks about DMC but there's
not even a mention of the possibility DMC is not loaded.
I also think intel_dmc_update_dc6_allowed_count() is fragile in oopsing
when DMC is not loaded, and I still think that should be fixed too.
The patch at hand looks like it fixes the root cause, but I still think
the parts around it could use some more robustness, if only to make it
evident to the reader what the possible conditions are.
BR,
Jani.
> ---
> drivers/gpu/drm/i915/display/intel_display_power_well.c | 2 +-
> drivers/gpu/drm/i915/display/intel_dmc.c | 3 +--
> 2 files changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_display_power_well.c b/drivers/gpu/drm/i915/display/intel_display_power_well.c
> index 9c8d29839cafc..969b2c421d308 100644
> --- a/drivers/gpu/drm/i915/display/intel_display_power_well.c
> +++ b/drivers/gpu/drm/i915/display/intel_display_power_well.c
> @@ -852,7 +852,7 @@ void gen9_set_dc_state(struct intel_display *display, u32 state)
> power_domains->dc_state, val & mask);
>
> enable_dc6 = state & DC_STATE_EN_UPTO_DC6;
> - dc6_was_enabled = val & DC_STATE_EN_UPTO_DC6;
> + dc6_was_enabled = power_domains->dc_state & DC_STATE_EN_UPTO_DC6;
> if (!dc6_was_enabled && enable_dc6)
> intel_dmc_update_dc6_allowed_count(display, true);
>
> diff --git a/drivers/gpu/drm/i915/display/intel_dmc.c b/drivers/gpu/drm/i915/display/intel_dmc.c
> index c3b411259a0c5..90ba932d940ac 100644
> --- a/drivers/gpu/drm/i915/display/intel_dmc.c
> +++ b/drivers/gpu/drm/i915/display/intel_dmc.c
> @@ -1598,8 +1598,7 @@ static bool intel_dmc_get_dc6_allowed_count(struct intel_display *display, u32 *
> return false;
>
> mutex_lock(&power_domains->lock);
> - dc6_enabled = intel_de_read(display, DC_STATE_EN) &
> - DC_STATE_EN_UPTO_DC6;
> + dc6_enabled = power_domains->dc_state & DC_STATE_EN_UPTO_DC6;
> if (dc6_enabled)
> intel_dmc_update_dc6_allowed_count(display, false);
--
Jani Nikula, Intel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2] drm/i915/dmc: fix an unlikely NULL pointer deference at probe
2026-03-06 10:12 ` Jani Nikula
@ 2026-03-06 14:57 ` Imre Deak
0 siblings, 0 replies; 12+ messages in thread
From: Imre Deak @ 2026-03-06 14:57 UTC (permalink / raw)
To: Jani Nikula; +Cc: intel-gfx, intel-xe, Mohammed Thasleem, Tao Liu, stable
On Fri, Mar 06, 2026 at 12:12:41PM +0200, Jani Nikula wrote:
> On Mon, 02 Mar 2026, Imre Deak <imre.deak@intel.com> wrote:
> > intel_dmc_update_dc6_allowed_count() oopses when DMC hasn't been
> > initialized, and dmc is thus NULL.
> >
> > That would be the case when the call path is
> > intel_power_domains_init_hw() -> {skl,bxt,icl}_display_core_init() ->
> > gen9_set_dc_state() -> intel_dmc_update_dc6_allowed_count(), as
> > intel_power_domains_init_hw() is called *before* intel_dmc_init().
> >
> > However, gen9_set_dc_state() calls intel_dmc_update_dc6_allowed_count()
> > conditionally, depending on the current and target DC states. At probe,
> > the target is disabled, but if DC6 is enabled, the function is called,
> > and an oops follows. Apparently it's quite unlikely that DC6 is enabled
> > at probe, as we haven't seen this failure mode before.
> >
> > It is also strange to have DC6 enabled at boot, since that would require
> > the DMC firmware (loaded by BIOS); the BIOS loading the DMC firmware and
> > the driver stopping / reprogramming the firmware is a poorly specified
> > sequence and as such unlikely an intentional BIOS behaviour. It's more
> > likely that BIOS is leaving an unintentionally enabled DC6 HW state
> > behind (without actually loading the required DMC firmware for this).
> >
> > The tracking of the DC6 allowed counter only works if starting /
> > stopping the counter depends on the _SW_ DC6 state vs. the current _HW_
> > DC6 state (since stopping the counter requires the DC5 counter captured
> > when the counter was started). Thus, using the HW DC6 state is incorrect
> > and it also leads to the above oops. Fix both issues by using the SW DC6
> > state for the tracking.
> >
> > This is v2 of the fix originally sent by Jani, updated based on the
> > first References: discussion below.
> >
> > Link: https://lore.kernel.org/all/3626411dc9e556452c432d0919821b76d9991217@intel.com
> > Link: https://lore.kernel.org/all/20260228130946.50919-2-ltao@redhat.com
> > Fixes: 88c1f9a4d36d ("drm/i915/dmc: Create debugfs entry for dc6 counter")
> > Cc: Mohammed Thasleem <mohammed.thasleem@intel.com>
> > Cc: Jani Nikula <jani.nikula@linux.intel.com>
> > Cc: Tao Liu <ltao@redhat.com>
> > Cc: <stable@vger.kernel.org> # v6.16+
> > Signed-off-by: Imre Deak <imre.deak@intel.com>
>
> Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Thanks.
> However, I still think the whole gen9_set_dc_state() is a bit fragile
> wrt DMC loaded or not. Pretty much everything else wraps the relevant
> parts within intel_dmc_has_payload(), and it's obvious what's going
> on. The comment for the function primarily talks about DMC but there's
> not even a mention of the possibility DMC is not loaded.
Yes, the DC state HW interface programmed via the DC_STATE_EN register
is not too intuitive. Some DC states are handled by the DMC firmware
(DC3CO, DC5, DC6) but others are not (DC9). Thus, the former states
require the FW to be loaded, but the latter state doesn't.
The fact that DC3CO, DC5, DC6 will not get enabled until the firmware is
loaded is ensured by holding a reference on the DC_OFF power well, while
the firmware is being loaded and that reference gets dropped only once
the firmware is loaded; the DC_OFF power reference in turn maintains the
DC_STATE_DISABLE state.
I agree with you in that there could be an assert/early return in
gen9_set_dc_state() according to the above:
if (drm_WARN_ON_ONCE(display->drm,
(state & (DC_STATE_EN_DC3CO | DC_STATE_EN_UPTO_DC5 | DC_STATE_EN_UPTO_DC6)) &&
!intel_dmc_has_payload(display)))
return;
and the function documentation should also explain the difference
between the DC states wrt. the firmware dependency and the way setting a
DC state handled by the firmware is prevented via the DC_OFF power
reference.
> I also think intel_dmc_update_dc6_allowed_count() is fragile in oopsing
> when DMC is not loaded, and I still think that should be fixed too.
The only way intel_dmc_update_dc6_allowed_count() can be called is
(after this patch) if DC6 was ever enabled by the driver. Based on the
above that is only possible once the firmware is already loaded.
intel_dmc_update_dc6_allowed_count() could also assert this:
if (drm_WARN_ON(display->drm, !intel_dmc_has_payload(display)))
return;
> The patch at hand looks like it fixes the root cause, but I still think
> the parts around it could use some more robustness, if only to make it
> evident to the reader what the possible conditions are.
>
>
> BR,
> Jani.
>
>
> > ---
> > drivers/gpu/drm/i915/display/intel_display_power_well.c | 2 +-
> > drivers/gpu/drm/i915/display/intel_dmc.c | 3 +--
> > 2 files changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/display/intel_display_power_well.c b/drivers/gpu/drm/i915/display/intel_display_power_well.c
> > index 9c8d29839cafc..969b2c421d308 100644
> > --- a/drivers/gpu/drm/i915/display/intel_display_power_well.c
> > +++ b/drivers/gpu/drm/i915/display/intel_display_power_well.c
> > @@ -852,7 +852,7 @@ void gen9_set_dc_state(struct intel_display *display, u32 state)
> > power_domains->dc_state, val & mask);
> >
> > enable_dc6 = state & DC_STATE_EN_UPTO_DC6;
> > - dc6_was_enabled = val & DC_STATE_EN_UPTO_DC6;
> > + dc6_was_enabled = power_domains->dc_state & DC_STATE_EN_UPTO_DC6;
> > if (!dc6_was_enabled && enable_dc6)
> > intel_dmc_update_dc6_allowed_count(display, true);
> >
> > diff --git a/drivers/gpu/drm/i915/display/intel_dmc.c b/drivers/gpu/drm/i915/display/intel_dmc.c
> > index c3b411259a0c5..90ba932d940ac 100644
> > --- a/drivers/gpu/drm/i915/display/intel_dmc.c
> > +++ b/drivers/gpu/drm/i915/display/intel_dmc.c
> > @@ -1598,8 +1598,7 @@ static bool intel_dmc_get_dc6_allowed_count(struct intel_display *display, u32 *
> > return false;
> >
> > mutex_lock(&power_domains->lock);
> > - dc6_enabled = intel_de_read(display, DC_STATE_EN) &
> > - DC_STATE_EN_UPTO_DC6;
> > + dc6_enabled = power_domains->dc_state & DC_STATE_EN_UPTO_DC6;
> > if (dc6_enabled)
> > intel_dmc_update_dc6_allowed_count(display, false);
>
> --
> Jani Nikula, Intel
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2026-03-06 14:58 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-12-02 18:39 [PATCH] drm/i915/dmc: fix an unlikely NULL pointer deference at probe Jani Nikula
2025-12-02 21:24 ` Imre Deak
2025-12-02 21:35 ` Imre Deak
2025-12-03 8:13 ` Jani Nikula
2025-12-03 10:38 ` Imre Deak
2025-12-04 11:30 ` Jani Nikula
2025-12-04 13:02 ` Imre Deak
2026-01-07 11:16 ` Jani Nikula
2026-03-02 17:48 ` [PATCH v2] " Imre Deak
2026-03-03 2:44 ` Tao Liu
2026-03-06 10:12 ` Jani Nikula
2026-03-06 14:57 ` Imre Deak
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox