public inbox for linux-kernel@vger.kernel.org
* [bug report] 7.0-rc1 flip_done timed out: amd igpu off when resuming in laptop (regression)
@ 2026-02-27 23:01 Rafael Passos
  2026-03-04 12:27 ` Rafael Passos
  0 siblings, 1 reply; 9+ messages in thread
From: Rafael Passos @ 2026-02-27 23:01 UTC (permalink / raw)
  To: amd-gfx, dri-devel, siqueira, linux-kernel; +Cc: rcpassos, davidtadokoro

Hi,

I found this bug while running v7.0-rc1 on my HP laptop.
The laptop resumes from suspend when I flip the screen up,
but the iGPU does not. Tested on Fedora and Arch.
The same build has no issues on my desktop (with a 6800XT).

HP ProBook x360 435 G7
CPU is an AMD Ryzen 7 4700U (8) @ 2.00 GHz with Vega iGPU

This started in a mainline build between Feb 10 and Feb 23.
I plan to do a lot of bisecting this weekend to find the exact cause.
I know ga9aabb3b839a is a good commit, and 6de23f8 is a bad one (the tag).
Hopefully I can submit a patch fixing this bug,
but helping to find it is ok as well.

Here is one sample of the kernel logs I collected over SSH.

[   98.539672] smpboot: Booting Node 0 Processor 7 APIC 0x7
[   98.543543] CPU7 is up
[   98.544343] ACPI: PM: Waking up from system sleep state S3
[   99.061284] ACPI: EC: interrupt unblocked
[   99.066648] ACPI: EC: event unblocked
[   99.067043] amdgpu 0000:04:00.0: [drm] PCIE GART of 1024M enabled.
[   99.067049] amdgpu 0000:04:00.0: [drm] PTB located at 0x000000F41FC00000
[   99.067070] amdgpu 0000:04:00.0: PSP is resuming...
[   99.067167] amdgpu 0000:04:00.0: reserve 0x400000 from 0xf41f800000 for PSP TMR
[   99.157012] amdgpu 0000:04:00.0: RAS: optional ras ta ucode is not available
[   99.168905] amdgpu 0000:04:00.0: RAP: optional rap ta ucode is not available
[   99.168909] amdgpu 0000:04:00.0: SECUREDISPLAY: optional securedisplay ta ucode is not available
[   99.168913] amdgpu 0000:04:00.0: SMU is resuming...
[   99.169354] amdgpu 0000:04:00.0: dpm has been disabled
[   99.170469] amdgpu 0000:04:00.0: SMU is resumed successfully!
[   99.171833] amdgpu 0000:04:00.0: kiq ring mec 2 pipe 1 q 0
[   99.176522] amdgpu 0000:04:00.0: [drm] DMUB hardware initialized: version=0x0101002B
[   99.176650] ------------[ cut here ]------------
[   99.176652] WARNING: drivers/gpu/drm/amd/amdgpu/../display/dc/hubbub/dcn20/dcn20_hubbub.c:587 at hubbub2_get_dchub_ref_freq+0xa1/0xb0 [amdgpu], CPU#3: kworker/u33:7/782
[   99.177462] Modules linked in: uinput rfcomm snd_seq_dummy snd_hrtimer ccm nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib sunrpc nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables qrtr cmac algif_hash algif_skcipher af_alg bnep vfat fat snd_sof_amd_acp70 snd_sof_amd_acp63 snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_pci_ps snd_soc_acpi_amd_match snd_soc_acpi_amd_sdca_quirks snd_amd_sdw_acpi soundwire_amd soundwire_generic_allocation snd_ctl_led soundwire_bus snd_soc_sdca snd_hda_codec_alc269 snd_hda_codec_realtek_lib amd_atl intel_rapl_msr snd_hda_codec_atihdmi snd_hda_scodec_component snd_soc_core snd_hda_codec_generic intel_rapl_common snd_hda_codec_hdmi iwlmvm snd_compress ac97_bus snd_pcm_dmaengine snd_hda_intel snd_hda_codec mac80211 snd_rpl_pci_acp6x btusb snd_acp_pci hp_bioscfg snd_hda_core
[   99.177516]  snd_amd_acpi_mach btmtk snd_intel_dspcfg snd_acp_legacy_common btrtl ptp snd_intel_sdw_acpi firmware_attributes_class snd_pci_acp6x kvm_amd pps_core snd_hwdep libarc4 uvcvideo btbcm hp_wmi snd_pci_acp5x videobuf2_vmalloc btintel kvm snd_seq uvc platform_profile videobuf2_memops snd_rn_pci_acp3x sparse_keymap snd_seq_device videobuf2_v4l2 irqbypass snd_acp_config videobuf2_common bluetooth rapl videodev wmi_bmof iwlwifi mc snd_pcm snd_soc_acpi pcspkr acpi_cpufreq i2c_piix4 i2c_smbus k10temp snd_pci_acp3x snd_timer cfg80211 hid_sensor_gyro_3d hid_sensor_magn_3d snd hid_sensor_accel_3d hid_sensor_trigger soundcore rfkill industrialio_triggered_buffer kfifo_buf wireless_hotkey hid_sensor_iio_common industrialio joydev mousedev mac_hid loop nfnetlink zram 842_decompress 842_compress lz4hc_compress lz4_compress dm_crypt encrypted_keys trusted asn1_encoder tee amdgpu hid_sensor_hub amdxcp i2c_algo_bit drm_ttm_helper ttm drm_exec drm_panel_backlight_quirks gpu_sched drm_suballoc_helper nvme rtsx_pci_sdmmc
[   99.177578]  drm_buddy nvme_core ucsi_acpi mmc_core drm_display_helper nvme_keyring typec_ucsi ghash_clmulni_intel nvme_auth roles hid_multitouch aesni_intel typec cec ccp hkdf amd_sfh video rtsx_pci i2c_hid_acpi wmi thunderbolt sp5100_tco i2c_hid serio_raw dm_mod i2c_dev
[   99.177603] CPU: 3 UID: 0 PID: 782 Comm: kworker/u33:7 Tainted: G        W           7.0.0-rc1-auyer+ #3 PREEMPT(full)  89085bfd3471dc2ddc421b17d990c1c500db5584
[   99.177608] Tainted: [W]=WARN
[   99.177609] Hardware name: HP HP ProBook x360 435 G7/8735, BIOS S80 Ver. 01.17.02 06/07/2024
[   99.177612] Workqueue: async async_run_entry_fn
[   99.177619] RIP: 0010:hubbub2_get_dchub_ref_freq+0xa1/0xb0 [amdgpu]
[   99.178421] Code: 8d 83 c0 63 ff ff 3d 20 4e 00 00 77 21 89 5d 00 48 8b 44 24 08 65 48 2b 05 84 05 36 d0 75 13 48 83 c4 10 5b 5d e9 af ae 7f ce <0f> 0b eb df 0f 0b eb db e8 e2 7a 7e ce 66 90 90 90 90 90 90 90 90
[   99.178423] RSP: 0018:ffffcd310072fc50 EFLAGS: 00010246
[   99.178426] RAX: 0000000000001000 RBX: 000000000000bb80 RCX: 0000000000000000
[   99.178428] RDX: ffffcd310072fc54 RSI: 00000000000039df RDI: ffff8ce014800000
[   99.178430] RBP: ffff8ce00f882bf8 R08: ffffcd310072fc50 R09: 000000000000000c
[   99.178431] R10: ffffcd313fecaf00 R11: ffffcd310072f898 R12: ffff8ce00f882800
[   99.178433] R13: ffff8ce00f848400 R14: ffff8ce00dc83e00 R15: ffff8ce0f6317ec0
[   99.178435] FS:  0000000000000000(0000) GS:ffff8ce75e767000(0000) knlGS:0000000000000000
[   99.178437] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   99.178439] CR2: 0000000000000000 CR3: 0000000211d28000 CR4: 0000000000350ef0
[   99.178441] Call Trace:
[   99.178446]  <TASK>
[   99.178450]  dcn10_init_hw+0x186/0x4e0 [amdgpu c06be2407ae9725ac90c67be79bb3ef89e7cefec]
[   99.179286]  dc_set_power_state+0xd1/0x150 [amdgpu c06be2407ae9725ac90c67be79bb3ef89e7cefec]
[   99.180031]  dm_resume+0x12e/0x8b0 [amdgpu c06be2407ae9725ac90c67be79bb3ef89e7cefec]
[   99.180833]  amdgpu_ip_block_resume+0x27/0x50 [amdgpu c06be2407ae9725ac90c67be79bb3ef89e7cefec]
[   99.181466]  amdgpu_device_ip_resume_phase3+0x6d/0x90 [amdgpu c06be2407ae9725ac90c67be79bb3ef89e7cefec]
[   99.182040]  amdgpu_device_resume+0xbb/0x380 [amdgpu c06be2407ae9725ac90c67be79bb3ef89e7cefec]
[   99.182620]  amdgpu_pmops_resume+0x46/0x80 [amdgpu c06be2407ae9725ac90c67be79bb3ef89e7cefec]
[   99.183194]  ? __pfx_pci_pm_resume+0x10/0x10
[   99.183200]  dpm_run_callback+0x51/0x180
[   99.183204]  ? dpm_wait_for_superior+0xf7/0x150
[   99.183207]  device_resume+0x15c/0x260
[   99.183210]  async_resume+0x21/0x30
[   99.183213]  async_run_entry_fn+0x36/0x160
[   99.183218]  process_one_work+0x193/0x390
[   99.183222]  worker_thread+0x1a1/0x310
[   99.183226]  ? __pfx_worker_thread+0x10/0x10
[   99.183229]  kthread+0xe3/0x120
[   99.183234]  ? __pfx_kthread+0x10/0x10
[   99.183238]  ret_from_fork+0x2bf/0x350
[   99.183243]  ? __pfx_kthread+0x10/0x10
[   99.183246]  ret_from_fork_asm+0x1a/0x30
[   99.183253]  </TASK>
[   99.183255] ---[ end trace 0000000000000000 ]---
[   99.299257] usb 1-4: reset full-speed USB device number 3 using xhci_hcd
[   99.307188] usb 3-3: reset high-speed USB device number 2 using xhci_hcd
[   99.404342] nvme nvme0: 8/0/0 default/read/poll queues
[   99.538424] usb 1-3: reset full-speed USB device number 2 using xhci_hcd
[  100.833055] amdgpu 0000:04:00.0: [drm] *ERROR* dpcd_set_link_settings:1122: core_link_write_dpcd (DP_DOWNSPREAD_CTRL) failed
[  100.919764] amdgpu 0000:04:00.0: [drm] *ERROR* dpcd_set_link_settings:1127: core_link_write_dpcd (DP_LANE_COUNT_SET) failed
[  101.006487] amdgpu 0000:04:00.0: [drm] *ERROR* dpcd_set_link_settings:1155: core_link_write_dpcd (DP_LINK_BW_SET) failed
[  102.589793] amdgpu 0000:04:00.0: [drm] *ERROR* dpcd_set_link_settings:1122: core_link_write_dpcd (DP_DOWNSPREAD_CTRL) failed
[  102.676500] amdgpu 0000:04:00.0: [drm] *ERROR* dpcd_set_link_settings:1127: core_link_write_dpcd (DP_LANE_COUNT_SET) failed
[  102.763207] amdgpu 0000:04:00.0: [drm] *ERROR* dpcd_set_link_settings:1155: core_link_write_dpcd (DP_LINK_BW_SET) failed
[  104.398168] amdgpu 0000:04:00.0: [drm] *ERROR* dpcd_set_link_settings:1122: core_link_write_dpcd (DP_DOWNSPREAD_CTRL) failed
[  104.484883] amdgpu 0000:04:00.0: [drm] *ERROR* dpcd_set_link_settings:1127: core_link_write_dpcd (DP_LANE_COUNT_SET) failed
[  104.571597] amdgpu 0000:04:00.0: [drm] *ERROR* dpcd_set_link_settings:1155: core_link_write_dpcd (DP_LINK_BW_SET) failed
[  106.253776] amdgpu 0000:04:00.0: [drm] *ERROR* dpcd_set_link_settings:1122: core_link_write_dpcd (DP_DOWNSPREAD_CTRL) failed
[  106.340510] amdgpu 0000:04:00.0: [drm] *ERROR* dpcd_set_link_settings:1127: core_link_write_dpcd (DP_LANE_COUNT_SET) failed
[  106.427251] amdgpu 0000:04:00.0: [drm] *ERROR* dpcd_set_link_settings:1155: core_link_write_dpcd (DP_LINK_BW_SET) failed
[  107.382304] amdgpu 0000:04:00.0: [drm] enabling link 0 failed: 15
[  107.568497] amdgpu 0000:04:00.0: ring gfx uses VM inv eng 0 on hub 0
[  107.568501] amdgpu 0000:04:00.0: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[  107.568503] amdgpu 0000:04:00.0: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[  107.568504] amdgpu 0000:04:00.0: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[  107.568506] amdgpu 0000:04:00.0: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[  107.568508] amdgpu 0000:04:00.0: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[  107.568509] amdgpu 0000:04:00.0: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[  107.568511] amdgpu 0000:04:00.0: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[  107.568512] amdgpu 0000:04:00.0: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[  107.568514] amdgpu 0000:04:00.0: ring kiq_0.2.1.0 uses VM inv eng 11 on hub 0
[  107.568516] amdgpu 0000:04:00.0: ring sdma0 uses VM inv eng 0 on hub 8
[  107.568518] amdgpu 0000:04:00.0: ring vcn_dec uses VM inv eng 1 on hub 8
[  107.568520] amdgpu 0000:04:00.0: ring vcn_enc0 uses VM inv eng 4 on hub 8
[  107.568521] amdgpu 0000:04:00.0: ring vcn_enc1 uses VM inv eng 5 on hub 8
[  107.568523] amdgpu 0000:04:00.0: ring jpeg_dec uses VM inv eng 6 on hub 8
[  107.584564] OOM killer enabled.
[  107.584569] Restarting tasks: Starting
[  107.585190] Restarting tasks: Done


regards,
Rafael Passos


* Re: [bug report] 7.0-rc1 flip_done timed out: amd igpu off when resuming in laptop (regression)
  2026-02-27 23:01 [bug report] 7.0-rc1 flip_done timed out: amd igpu off when resuming in laptop (regression) Rafael Passos
@ 2026-03-04 12:27 ` Rafael Passos
  2026-03-04 16:32   ` Alex Deucher
  0 siblings, 1 reply; 9+ messages in thread
From: Rafael Passos @ 2026-03-04 12:27 UTC (permalink / raw)
  To: amd-gfx, siqueira, linux-kernel, Martin Leung,
	Bhuvanachandra Pinninti, Ray Wu, Daniel Wheeler, Alex Deucher
  Cc: Rafael Passos, davidbtadokoro, dri-devel

I found the issue, but I'm still not sure how to proceed.
I would like some guidance in fixing this regression.

The issue is where a register is being read from.
Before this change, the MICROSECOND_TIME_BASE_DIV reg was read from
dce_hwseq_registers (dce_hwseq.h); now it is read from dccg_registers (dcn20_dccg.h).

The bisection led me to this commit: 4c595e75110ece20af3a68c1ebef8ed4c1b69afe
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4c595e75110ece20af3a68c1ebef8ed4c1b69afe

After a lot of debugging, I traced the issue to this file:
drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/diff/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c?id=4c595e75110ece20af3a68c1ebef8ed4c1b69afe

This card is dcn21, but it uses most of the dcn20 implementation.
For easy comparison, the following block contains the function with the original path
commented out (from dcn21), and the function it calls from dcn20:

```
bool dcn21_s0i3_golden_init_wa(struct dc *dc)
{
	if (dc->res_pool->dccg && dc->res_pool->dccg->funcs && dc->res_pool->dccg->funcs->is_s0i3_golden_init_wa_done){

		printk(KERN_CRIT "AUYER in %s", __func__);
		return !dc->res_pool->dccg->funcs->is_s0i3_golden_init_wa_done(dc->res_pool->dccg);
	}

	printk(KERN_CRIT "AUYER in %s", __func__);

	return false;
	
	// original flow:
	// struct dce_hwseq *hws = dc->hwseq;
	// uint32_t value = 0;
	// value = REG_READ(MICROSECOND_TIME_BASE_DIV);

	// return value != 0x00120464;
}

// is_s0i3_golden_init_wa_done -> dccg2_is_s0i3_golden_init_wa_done
bool dccg2_is_s0i3_golden_init_wa_done(struct dccg *dccg)
{
	struct dcn_dccg *dccg_dcn = TO_DCN_DCCG(dccg);

	return REG_READ(MICROSECOND_TIME_BASE_DIV) == 0x00120464;
}
```

I instrumented this code to compare the values.
On boot, both paths give the same result. When resuming from S3 sleep, they differ.
If I use the output of the pre-commit code path, the screen works.
At the end of this email is my "debugging patch", and the logs comparing what shows
up on boot vs on resuming from sleep.

I am attempting to implement a `dccg21_is_s0i3_golden_init_wa_done` to
replace the `dccg2_is_s0i3_golden_init_wa_done` that is used in dcn21_dccg.c.
Maybe dcn21 needs a separate register page (instead of using dcn20_dccg.h)?


Note the difference between log lines 2 and 5:
[    4.956404] [    T316] AUYER PATCHED in dcn21_s0i3_golden_init_wa, values compared to 0x00120464
[    4.956407] [    T316] AUYER in dcn21_s0i3_golden_init_wa, original flow value: 1180208, bool: 1
[    4.956411] [    T316] AUYER in dcn21_s0i3_golden_init_wa: MICROSECOND_TIME_BASE_DIV reg: 13b value: 1180208
[    4.956412] [    T316] AUYER in dccg21_is_s0i3_golden_init_wa_done
[    4.956415] [    T316] AUYER in dccg21_is_s0i3_golden_init_wa_done: MICROSECOND_TIME_BASE_DIV reg: 0, value: 1148576
[    4.956418] [    T316] AUYER in dcn21_s0i3_golden_init_wa, NEW flow value as bool 1


1 [    4.942660] [    T343] AUYER PATCHED in dcn21_s0i3_golden_init_wa
2 [    4.942662] [    T343] AUYER in dcn21_s0i3_golden_init_wa, original flow value: 1180208, comparing to 0x00120464 bool: 1
3 [    4.942665] [    T343] AUYER in dcn21_s0i3_golden_init_wa: MICROSECOND_TIME_BASE_DIV reg: 13b value: 1180208
4 [    4.942668] [    T343] AUYER in dccg2_is_s0i3_golden_init_wa_done: MICROSECOND_TIME_BASE_DIV reg: 0, value: 1148576
5 [    4.942671] [    T343] AUYER in dcn21_s0i3_golden_init_wa, NEW flow value as is: bool 1

On wake from S3:

1 [  279.431636] [   T5497] AUYER PATCHED in dcn21_s0i3_golden_init_wa
2 [  279.431638] [   T5497] AUYER in dcn21_s0i3_golden_init_wa, original flow value: 1180772, comparing to 0x00120464 bool: 0
3 [  279.431640] [   T5497] AUYER in dcn21_s0i3_golden_init_wa: MICROSECOND_TIME_BASE_DIV reg: 13b value: 1180772
4 [  279.431641] [   T5497] AUYER in dccg2_is_s0i3_golden_init_wa_done: MICROSECOND_TIME_BASE_DIV reg: 0, value: 1148576
5 [  279.431642] [   T5497] AUYER in dcn21_s0i3_golden_init_wa, NEW flow value as is: bool 1
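
The values above are printed with %d, so they are easier to compare against the golden value in hex. A minimal userspace sketch (not driver code; the helper name `wa_needed` and the variable names are made up for illustration) that applies the same comparison as the original flow to the three values seen in these logs:

```c
#include <stdbool.h>
#include <stdint.h>

/* Golden value that dcn21_s0i3_golden_init_wa compares against. */
#define GOLDEN_MICROSECOND_TIME_BASE_DIV 0x00120464u

/* Hypothetical helper: mirrors "return value != 0x00120464". */
static bool wa_needed(uint32_t reg_value)
{
	return reg_value != GOLDEN_MICROSECOND_TIME_BASE_DIV;
}

/* Values copied from the logs (decimal, as printed by %d). */
static const uint32_t old_path_boot   = 1180208; /* 0x00120230 */
static const uint32_t old_path_resume = 1180772; /* 0x00120464 */
static const uint32_t new_path_any    = 1148576; /* 0x001186A0 */
```

Here `wa_needed(old_path_resume)` is false while `wa_needed(new_path_any)` is true, which matches `bool: 0` in line 2 and `bool 1` in line 5 on wake from S3.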


The "patch" (just a test lab), to understand where these logs came from.
It applies cleanly to amddrm drm-next, and mainline.

---
 .../amd/display/dc/dccg/dcn20/dcn20_dccg.c    |  3 +++
 .../amd/display/dc/hwss/dcn21/dcn21_hwseq.c   | 25 ++++++++++++++++---
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dccg/dcn20/dcn20_dccg.c b/drivers/gpu/drm/amd/display/dc/dccg/dcn20/dcn20_dccg.c
index 13ba7f5ce13e..0ba20c7969ed 100644
--- a/drivers/gpu/drm/amd/display/dc/dccg/dcn20/dcn20_dccg.c
+++ b/drivers/gpu/drm/amd/display/dc/dccg/dcn20/dcn20_dccg.c
@@ -158,6 +158,9 @@ bool dccg2_is_s0i3_golden_init_wa_done(struct dccg *dccg)
 {
        struct dcn_dccg *dccg_dcn = TO_DCN_DCCG(dccg);
 
+       printk(KERN_CRIT "AUYER in %s: MICROSECOND_TIME_BASE_DIV reg: %x, value: %d",
+               __func__, dccg_dcn->regs->MICROSECOND_TIME_BASE_DIV, REG_READ(MICROSECOND_TIME_BASE_DIV));
+
        return REG_READ(MICROSECOND_TIME_BASE_DIV) == 0x00120464;
 }
 
diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c
index 062745389d9a..143c552e0fa9 100644
--- a/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c
@@ -88,10 +88,28 @@ int dcn21_init_sys_ctx(struct dce_hwseq *hws, struct dc *dc, struct dc_phy_addr_
 
 bool dcn21_s0i3_golden_init_wa(struct dc *dc)
 {
-       if (dc->res_pool->dccg && dc->res_pool->dccg->funcs && dc->res_pool->dccg->funcs->is_s0i3_golden_init_wa_done)
-               return !dc->res_pool->dccg->funcs->is_s0i3_golden_init_wa_done(dc->res_pool->dccg);
 
-       return false;
+       printk(KERN_CRIT "AUYER PATCHED in %s, values compared to 0x00120464", __func__);
+
+       // original flow
+       struct dce_hwseq *hws = dc->hwseq;
+       uint32_t value = 0;
+       value = REG_READ(MICROSECOND_TIME_BASE_DIV);
+
+       printk(KERN_CRIT "AUYER in %s, original flow value: %d, bool: %d",
+               __func__, value, value != 0x00120464);
+
+       printk(KERN_CRIT "AUYER in %s: MICROSECOND_TIME_BASE_DIV reg: %x value: %d",
+               __func__, hws->regs->MICROSECOND_TIME_BASE_DIV, REG_READ(MICROSECOND_TIME_BASE_DIV));
+
+       if (dc->res_pool->dccg && dc->res_pool->dccg->funcs && dc->res_pool->dccg->funcs->is_s0i3_golden_init_wa_done) {
+               // new flow
+               bool v2 = 0;
+               v2 = !dc->res_pool->dccg->funcs->is_s0i3_golden_init_wa_done(dc->res_pool->dccg);
+               printk(KERN_CRIT "AUYER in %s, NEW flow value as bool %d", __func__,  v2);
+       }
+
+       return value != 0x00120464;
 }
 
 void dcn21_exit_optimized_pwr_state(
@@ -298,4 +316,3 @@ bool dcn21_is_abm_supported(struct dc *dc,
        }
        return false;
 }
-
-- 
2.53.0




* Re: [bug report] 7.0-rc1 flip_done timed out: amd igpu off when resuming in laptop (regression)
  2026-03-04 12:27 ` Rafael Passos
@ 2026-03-04 16:32   ` Alex Deucher
  2026-03-08  0:04     ` [PATCH] drm/amd/display: fix resuming from S3 sleep for Renoir iGPU Rafael Passos
  0 siblings, 1 reply; 9+ messages in thread
From: Alex Deucher @ 2026-03-04 16:32 UTC (permalink / raw)
  To: Rafael Passos, Wentland, Harry, Leo (Sunpeng) Li,
	Bhuvana Chandra Pinninti
  Cc: amd-gfx, siqueira, linux-kernel, Martin Leung, Ray Wu,
	Daniel Wheeler, Alex Deucher, Rafael Passos, davidbtadokoro,
	dri-devel

+ Harry, Leo, Bhuvana

On Wed, Mar 4, 2026 at 8:42 AM Rafael Passos <rafael@rcpassos.me> wrote:
>
> I found the issue, but I'm still not sure how to proceed.
> I would like some guidance in fixing this regression.
>
> The issue is where a register is being read from.
> Before this change, the MICROSECOND_TIME_BASE_DIV reg was read from
> dce_hwseq_registers (dce_hwseq.h); now it is read from dccg_registers (dcn20_dccg.h).
>
> The bisection led me to this commit: 4c595e75110ece20af3a68c1ebef8ed4c1b69afe
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4c595e75110ece20af3a68c1ebef8ed4c1b69afe
>
> After a lot of debugging, I traced the issue to this file:
> drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/diff/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c?id=4c595e75110ece20af3a68c1ebef8ed4c1b69afe
>
> This card is dcn21, but it uses most of the dcn20 implementation.
> For easy comparison, the following block contains the function with the original path
> commented out (from dcn21), and the function it calls from dcn20:
>
> ```
> bool dcn21_s0i3_golden_init_wa(struct dc *dc)
> {
>         if (dc->res_pool->dccg && dc->res_pool->dccg->funcs && dc->res_pool->dccg->funcs->is_s0i3_golden_init_wa_done){
>
>                 printk(KERN_CRIT "AUYER in %s", __func__);
>                 return !dc->res_pool->dccg->funcs->is_s0i3_golden_init_wa_done(dc->res_pool->dccg);
>         }
>
>         printk(KERN_CRIT "AUYER in %s", __func__);
>
>         return false;
>
>         // original flow:
>         // struct dce_hwseq *hws = dc->hwseq;
>         // uint32_t value = 0;
>         // value = REG_READ(MICROSECOND_TIME_BASE_DIV);
>
>         // return value != 0x00120464;
> }
>
> // is_s0i3_golden_init_wa_done -> dccg2_is_s0i3_golden_init_wa_done
> bool dccg2_is_s0i3_golden_init_wa_done(struct dccg *dccg)
> {
>         struct dcn_dccg *dccg_dcn = TO_DCN_DCCG(dccg);
>
>         return REG_READ(MICROSECOND_TIME_BASE_DIV) == 0x00120464;
> }
> ```
>
> I instrumented this code to compare the values.
> On boot, both paths give the same result. When resuming from S3 sleep, they differ.
> If I use the output of the pre-commit code path, the screen works.
> At the end of this email is my "debugging patch", and the logs comparing what shows
> up on boot vs on resuming from sleep.
>
> I am attempting to implement a `dccg21_is_s0i3_golden_init_wa_done` to
> replace the `dccg2_is_s0i3_golden_init_wa_done` that is used in dcn21_dccg.c.
> Maybe dcn21 needs a separate register page (instead of using dcn20_dccg.h)?
>
>
> Note the difference between log lines 2 and 5:
> [    4.956404] [    T316] AUYER PATCHED in dcn21_s0i3_golden_init_wa, values compared to 0x00120464
> [    4.956407] [    T316] AUYER in dcn21_s0i3_golden_init_wa, original flow value: 1180208, bool: 1
> [    4.956411] [    T316] AUYER in dcn21_s0i3_golden_init_wa: MICROSECOND_TIME_BASE_DIV reg: 13b value: 1180208
> [    4.956412] [    T316] AUYER in dccg21_is_s0i3_golden_init_wa_done
> [    4.956415] [    T316] AUYER in dccg21_is_s0i3_golden_init_wa_done: MICROSECOND_TIME_BASE_DIV reg: 0, value: 1148576
> [    4.956418] [    T316] AUYER in dcn21_s0i3_golden_init_wa, NEW flow value as bool 1
>
>
> 1 [    4.942660] [    T343] AUYER PATCHED in dcn21_s0i3_golden_init_wa
> 2 [    4.942662] [    T343] AUYER in dcn21_s0i3_golden_init_wa, original flow value: 1180208, comparing to 0x00120464 bool: 1
> 3 [    4.942665] [    T343] AUYER in dcn21_s0i3_golden_init_wa: MICROSECOND_TIME_BASE_DIV reg: 13b value: 1180208
> 4 [    4.942668] [    T343] AUYER in dccg2_is_s0i3_golden_init_wa_done: MICROSECOND_TIME_BASE_DIV reg: 0, value: 1148576
> 5 [    4.942671] [    T343] AUYER in dcn21_s0i3_golden_init_wa, NEW flow value as is: bool 1
>
> On wake from S3:
>
> 1 [  279.431636] [   T5497] AUYER PATCHED in dcn21_s0i3_golden_init_wa
> 2 [  279.431638] [   T5497] AUYER in dcn21_s0i3_golden_init_wa, original flow value: 1180772, comparing to 0x00120464 bool: 0
> 3 [  279.431640] [   T5497] AUYER in dcn21_s0i3_golden_init_wa: MICROSECOND_TIME_BASE_DIV reg: 13b value: 1180772
> 4 [  279.431641] [   T5497] AUYER in dccg2_is_s0i3_golden_init_wa_done: MICROSECOND_TIME_BASE_DIV reg: 0, value: 1148576
> 5 [  279.431642] [   T5497] AUYER in dcn21_s0i3_golden_init_wa, NEW flow value as is: bool 1
>
>
> The "patch" (just a test lab), to understand where these logs came from.
> It applies cleanly to amddrm drm-next, and mainline.
>
> ---
>  .../amd/display/dc/dccg/dcn20/dcn20_dccg.c    |  3 +++
>  .../amd/display/dc/hwss/dcn21/dcn21_hwseq.c   | 25 ++++++++++++++++---
>  2 files changed, 24 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/dccg/dcn20/dcn20_dccg.c b/drivers/gpu/drm/amd/display/dc/dccg/dcn20/dcn20_dccg.c
> index 13ba7f5ce13e..0ba20c7969ed 100644
> --- a/drivers/gpu/drm/amd/display/dc/dccg/dcn20/dcn20_dccg.c
> +++ b/drivers/gpu/drm/amd/display/dc/dccg/dcn20/dcn20_dccg.c
> @@ -158,6 +158,9 @@ bool dccg2_is_s0i3_golden_init_wa_done(struct dccg *dccg)
>  {
>         struct dcn_dccg *dccg_dcn = TO_DCN_DCCG(dccg);
>
> +       printk(KERN_CRIT "AUYER in %s: MICROSECOND_TIME_BASE_DIV reg: %x, value: %d",
> +               __func__, dccg_dcn->regs->MICROSECOND_TIME_BASE_DIV, REG_READ(MICROSECOND_TIME_BASE_DIV));
> +
>         return REG_READ(MICROSECOND_TIME_BASE_DIV) == 0x00120464;
>  }
>
> diff --git a/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c b/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c
> index 062745389d9a..143c552e0fa9 100644
> --- a/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c
> +++ b/drivers/gpu/drm/amd/display/dc/hwss/dcn21/dcn21_hwseq.c
> @@ -88,10 +88,28 @@ int dcn21_init_sys_ctx(struct dce_hwseq *hws, struct dc *dc, struct dc_phy_addr_
>
>  bool dcn21_s0i3_golden_init_wa(struct dc *dc)
>  {
> -       if (dc->res_pool->dccg && dc->res_pool->dccg->funcs && dc->res_pool->dccg->funcs->is_s0i3_golden_init_wa_done)
> -               return !dc->res_pool->dccg->funcs->is_s0i3_golden_init_wa_done(dc->res_pool->dccg);
>
> -       return false;
> +       printk(KERN_CRIT "AUYER PATCHED in %s, values compared to 0x00120464", __func__);
> +
> +       // original flow
> +       struct dce_hwseq *hws = dc->hwseq;
> +       uint32_t value = 0;
> +       value = REG_READ(MICROSECOND_TIME_BASE_DIV);
> +
> +       printk(KERN_CRIT "AUYER in %s, original flow value: %d, bool: %d",
> +               __func__, value, value != 0x00120464);
> +
> +       printk(KERN_CRIT "AUYER in %s: MICROSECOND_TIME_BASE_DIV reg: %x value: %d",
> +               __func__, hws->regs->MICROSECOND_TIME_BASE_DIV, REG_READ(MICROSECOND_TIME_BASE_DIV));
> +
> +       if (dc->res_pool->dccg && dc->res_pool->dccg->funcs && dc->res_pool->dccg->funcs->is_s0i3_golden_init_wa_done) {
> +               // new flow
> +               bool v2 = 0;
> +               v2 = !dc->res_pool->dccg->funcs->is_s0i3_golden_init_wa_done(dc->res_pool->dccg);
> +               printk(KERN_CRIT "AUYER in %s, NEW flow value as bool %d", __func__,  v2);
> +       }
> +
> +       return value != 0x00120464;
>  }
>
>  void dcn21_exit_optimized_pwr_state(
> @@ -298,4 +316,3 @@ bool dcn21_is_abm_supported(struct dc *dc,
>         }
>         return false;
>  }
> -
> --
> 2.53.0
>
>


* [PATCH] drm/amd/display: fix resuming from S3 sleep for Renoir iGPU
  2026-03-04 16:32   ` Alex Deucher
@ 2026-03-08  0:04     ` Rafael Passos
  2026-03-08  8:19       ` kernel test robot
                         ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Rafael Passos @ 2026-03-08  0:04 UTC (permalink / raw)
  To: alexdeucher
  Cc: BhuvanaChandra.Pinninti, Harry.Wentland, Martin.Leung, Sunpeng.Li,
	alexander.deucher, amd-gfx, daniel.wheeler, davidbtadokoro,
	dri-devel, linux-kernel, rafael, ray.wu, rcpassos, siqueira

[WHAT]
Set the register offset MICROSECOND_TIME_BASE_DIV in dccg_registers for DCN21.
Introduce a new dccg21_init function, used in dccg_funcs.dccg_init for DCN21.
The new dccg21_init writes 0x00120464 to the MICROSECOND_TIME_BASE_DIV
register, instead of the 0x00120264 written by dccg2_init.

[WHY]
Commit 4c595e75110e changed dcn21_s0i3_golden_init_wa: it used to read
the MICROSECOND_TIME_BASE_DIV reg through hwseq, and now reads it
through dccg using dccg2_is_s0i3_golden_init_wa_done.
However, the offset of this register is not set in dccg_registers for DCN21.
Also, the value was initialized to 0x00120264 by dccg2_init, but
compared to 0x00120464. For this reason, we created a new dccg21_init
with the values specific to this card.

Fixes: 4c595e75110e ("drm/amd/display: Migrate DCCG registers access from hwseq to dccg component.")
Signed-off-by: Rafael Passos <rafael@rcpassos.me>
Co-developed-by: David Tadokoro <davidbtadokoro@ime.usp.br>
Signed-off-by: David Tadokoro <davidbtadokoro@ime.usp.br>
---

It took a lot of debugging to get to this point.
We are not sure this is the right fix, but it works.
We found that when reading the MICROSECOND_TIME_BASE_DIV register,
the offset was 13b in the old path and 0 in the new path.

dcn21_s0i3_golden_init_wa is called when booting
and when waking from sleep. It compares the value of
MICROSECOND_TIME_BASE_DIV to 0x00120464.
When booting, the value is different (and the function returns true).
When waking from sleep, the value should be equal; thus,
the function should return false.

After 4c595e75110e, the value read was always different from 0x00120464, so
this function always returned true, and the screen failed to wake.
This happened because the offset of MICROSECOND_TIME_BASE_DIV was 0,
and REG_READ always returned 0x1186A0 (value from MILLISECOND_TIME_BASE_DIV?).
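
To illustrate why an offset of 0 is a problem, here is a simplified userspace model. It does not use the real DC register macros: `fake_mmio`, `fake_reg_read`, `SOME_OTHER_REG` and the struct names are made up for this sketch, and index 0 of the fake register space merely stands in for whatever the hardware returns at offset 0. A field left out of a designated initializer is zero, so the read silently goes to a different register:

```c
#include <stdint.h>

/* Simulated register space: word index -> register value. */
static const uint32_t fake_mmio[0x200] = {
	[0x000] = 0x001186A0u, /* value the new path printed in the logs */
	[0x13b] = 0x00120464u, /* value the old path printed on resume */
};

/* Per-block table of register offsets, like dccg_registers in spirit. */
struct fake_dccg_registers {
	uint32_t SOME_OTHER_REG;            /* hypothetical placeholder */
	uint32_t MICROSECOND_TIME_BASE_DIV;
};

static uint32_t fake_reg_read(uint32_t offset)
{
	return fake_mmio[offset];
}

/* MICROSECOND_TIME_BASE_DIV is not listed, so its offset is implicitly 0. */
static const struct fake_dccg_registers regs_missing = {
	.SOME_OTHER_REG = 0x100,
};

/* Offset listed explicitly (0x13b is the offset printed in the old path). */
static const struct fake_dccg_registers regs_fixed = {
	.SOME_OTHER_REG = 0x100,
	.MICROSECOND_TIME_BASE_DIV = 0x13b,
};
```

In this model, `fake_reg_read(regs_missing.MICROSECOND_TIME_BASE_DIV)` returns 0x001186A0, while the same read through `regs_fixed` returns 0x00120464.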

Things we are unsure of:
- We used SR to set MICROSECOND_TIME_BASE_DIV directly in the
	dccg_registers struct. We did not find other examples of this.
	Should we add MICROSECOND_TIME_BASE_DIV to DCCG_COMMON_REG_LIST_DCN_BASE?
	I only added it to DCN21, because it is the hardware I have (and validated it works).
- We changed 0x00120264 to 0x00120464 in the init, but dccg2 has the
	same mismatch between the value written and the value checked. We would
	like to know if this issue also affects dccg2 (and other cards), or if we
	are missing something.
	Maybe we should change this value in dccg2_is_s0i3_golden_init_wa_done.
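
The second point can be modelled in a few lines. This is a userspace sketch with a single simulated register (`fake_reg` and the `model_*` function names are hypothetical, not driver symbols); only the two constants come from dccg2_init and dccg2_is_s0i3_golden_init_wa_done as described above. It ignores anything else that may write the register on real hardware:

```c
#include <stdbool.h>
#include <stdint.h>

static uint32_t fake_reg; /* stands in for MICROSECOND_TIME_BASE_DIV */

/* Value written by dccg2_init. */
static void model_dccg2_init(void)
{
	fake_reg = 0x00120264u;
}

/* Value written by the proposed dccg21_init. */
static void model_dccg21_init(void)
{
	fake_reg = 0x00120464u;
}

/* Same comparison as dccg2_is_s0i3_golden_init_wa_done. */
static bool model_wa_done(void)
{
	return fake_reg == 0x00120464u;
}
```

After `model_dccg2_init()` the check reports not done; after `model_dccg21_init()` it reports done. That asymmetry is what the question about dccg2 is asking about.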

It applies to the mainline master, amdgpu drm-next and amd-staging-drm-next.

Any feedback is appreciated. It was a fun-frustrating-veryfun journey. :)
Code written only by humans.


 .../drm/amd/display/dc/dccg/dcn21/dcn21_dccg.c  | 17 ++++++++++++++++-
 .../display/dc/resource/dcn21/dcn21_resource.c  |  3 ++-
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dccg/dcn21/dcn21_dccg.c b/drivers/gpu/drm/amd/display/dc/dccg/dcn21/dcn21_dccg.c
index 75c69348027e..6f96e9c189dc 100644
--- a/drivers/gpu/drm/amd/display/dc/dccg/dcn21/dcn21_dccg.c
+++ b/drivers/gpu/drm/amd/display/dc/dccg/dcn21/dcn21_dccg.c
@@ -96,6 +96,21 @@ static void dccg21_update_dpp_dto(struct dccg *dccg, int dpp_inst, int req_dppcl
 	dccg->pipe_dppclk_khz[dpp_inst] = req_dppclk;
 }
 
+void dccg21_init(struct dccg *dccg)
+{
+	struct dcn_dccg *dccg_dcn = TO_DCN_DCCG(dccg);
+
+	/* Hardcoded register values for DCN21
+	 * These are specific to 100Mhz refclk
+	 * Different ASICs with different refclk may override this in their own init
+	 */
+	REG_WRITE(MICROSECOND_TIME_BASE_DIV, 0x00120464);
+	REG_WRITE(MILLISECOND_TIME_BASE_DIV, 0x001186a0);
+	REG_WRITE(DISPCLK_FREQ_CHANGE_CNTL, 0x0e01003c);
+
+	if (REG(REFCLK_CNTL))
+		REG_WRITE(REFCLK_CNTL, 0);
+}
 
 static const struct dccg_funcs dccg21_funcs = {
 	.update_dpp_dto = dccg21_update_dpp_dto,
@@ -103,7 +118,7 @@ static const struct dccg_funcs dccg21_funcs = {
 	.set_fifo_errdet_ovr_en = dccg2_set_fifo_errdet_ovr_en,
 	.otg_add_pixel = dccg2_otg_add_pixel,
 	.otg_drop_pixel = dccg2_otg_drop_pixel,
-	.dccg_init = dccg2_init,
+	.dccg_init = dccg21_init,
 	.refclk_setup = dccg2_refclk_setup, /* Deprecated - for backward compatibility only */
 	.allow_clock_gating = dccg2_allow_clock_gating,
 	.enable_memory_low_power = dccg2_enable_memory_low_power,
diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn21/dcn21_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn21/dcn21_resource.c
index 0f4307f8f3dd..7f8f657eb0f2 100644
--- a/drivers/gpu/drm/amd/display/dc/resource/dcn21/dcn21_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/resource/dcn21/dcn21_resource.c
@@ -222,7 +222,8 @@ static const struct dce_audio_mask audio_mask = {
 };
 
 static const struct dccg_registers dccg_regs = {
-		DCCG_COMMON_REG_LIST_DCN_BASE()
+		DCCG_COMMON_REG_LIST_DCN_BASE(),
+		SR(MICROSECOND_TIME_BASE_DIV)
 };
 
 static const struct dccg_shift dccg_shift = {
-- 
2.53.0



* Re: [PATCH] drm/amd/display: fix resuming from S3 sleep for Renoir iGPU
  2026-03-08  0:04     ` [PATCH] drm/amd/display: fix resuming from S3 sleep for Renoir iGPU Rafael Passos
@ 2026-03-08  8:19       ` kernel test robot
  2026-03-08 16:23       ` kernel test robot
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: kernel test robot @ 2026-03-08  8:19 UTC (permalink / raw)
  To: Rafael Passos, alexdeucher
  Cc: oe-kbuild-all, BhuvanaChandra.Pinninti, Harry.Wentland,
	Martin.Leung, Sunpeng.Li, alexander.deucher, amd-gfx,
	daniel.wheeler, davidbtadokoro, dri-devel, linux-kernel, rafael,
	ray.wu, rcpassos, siqueira

Hi Rafael,

kernel test robot noticed the following build warnings:

[auto build test WARNING on drm-misc/drm-misc-next]
[also build test WARNING on linus/master v7.0-rc2 next-20260306]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Rafael-Passos/drm-amd-display-fix-resuming-from-S3-sleep-for-Renoir-iGPU/20260308-080715
base:   https://gitlab.freedesktop.org/drm/misc/kernel.git drm-misc-next
patch link:    https://lore.kernel.org/r/20260308000515.890688-1-rafael%40rcpassos.me
patch subject: [PATCH] drm/amd/display: fix resuming from S3 sleep for Renoir iGPU
config: x86_64-rhel-9.4-ltp (https://download.01.org/0day-ci/archive/20260308/202603080959.llKqWvRQ-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260308/202603080959.llKqWvRQ-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202603080959.llKqWvRQ-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/gpu/drm/amd/amdgpu/../display/dc/dccg/dcn21/dcn21_dccg.c:99:6: warning: no previous prototype for 'dccg21_init' [-Wmissing-prototypes]
      99 | void dccg21_init(struct dccg *dccg)
         |      ^~~~~~~~~~~


vim +/dccg21_init +99 drivers/gpu/drm/amd/amdgpu/../display/dc/dccg/dcn21/dcn21_dccg.c

    98	
  > 99	void dccg21_init(struct dccg *dccg)
   100	{
   101		struct dcn_dccg *dccg_dcn = TO_DCN_DCCG(dccg);
   102	
   103		/* Hardcoded register values for DCN21
   104		 * These are specific to 100Mhz refclk
   105		 * Different ASICs with different refclk may override this in their own init
   106		 */
   107		REG_WRITE(MICROSECOND_TIME_BASE_DIV, 0x00120464);
   108		REG_WRITE(MILLISECOND_TIME_BASE_DIV, 0x001186a0);
   109		REG_WRITE(DISPCLK_FREQ_CHANGE_CNTL, 0x0e01003c);
   110	
   111		if (REG(REFCLK_CNTL))
   112			REG_WRITE(REFCLK_CNTL, 0);
   113	}
   114	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
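
A -Wmissing-prototypes warning like the one reported above is usually handled in one of two ways: declare a prototype in a shared header, or, if the function is only used inside its own file, give it internal linkage with static. The sketch below illustrates the second option only. It is not the driver's code: struct toy_dccg, struct toy_dccg_funcs and toy_dccg21_init are hypothetical stand-ins invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's dccg object. */
struct toy_dccg {
	unsigned int initialized;
};

/* Hypothetical stand-in for the ops table that holds the init hook. */
struct toy_dccg_funcs {
	void (*dccg_init)(struct toy_dccg *dccg);
};

/* static gives internal linkage, so no previous prototype is expected. */
static void toy_dccg21_init(struct toy_dccg *dccg)
{
	assert(dccg != NULL);
	dccg->initialized = 1;
}

/* The function stays reachable from other files through the ops table. */
static const struct toy_dccg_funcs toy_dccg21_funcs = {
	.dccg_init = toy_dccg21_init,
};
```

Because callers go through the function pointer in the table, making the function static does not change how it is invoked.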

* Re: [PATCH] drm/amd/display: fix resuming from S3 sleep for Renoir iGPU
  2026-03-08  0:04     ` [PATCH] drm/amd/display: fix resuming from S3 sleep for Renoir iGPU Rafael Passos
  2026-03-08  8:19       ` kernel test robot
@ 2026-03-08 16:23       ` kernel test robot
  2026-03-09 17:02       ` Leo Li
  2026-03-09 18:12       ` kernel test robot
  3 siblings, 0 replies; 9+ messages in thread
From: kernel test robot @ 2026-03-08 16:23 UTC (permalink / raw)
  To: Rafael Passos, alexdeucher
  Cc: oe-kbuild-all, BhuvanaChandra.Pinninti, Harry.Wentland,
	Martin.Leung, Sunpeng.Li, alexander.deucher, amd-gfx,
	daniel.wheeler, davidbtadokoro, dri-devel, linux-kernel, rafael,
	ray.wu, rcpassos, siqueira

Hi Rafael,

kernel test robot noticed the following build warnings:

[auto build test WARNING on drm-misc/drm-misc-next]
[also build test WARNING on linus/master v7.0-rc2 next-20260306]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Rafael-Passos/drm-amd-display-fix-resuming-from-S3-sleep-for-Renoir-iGPU/20260308-080715
base:   https://gitlab.freedesktop.org/drm/misc/kernel.git drm-misc-next
patch link:    https://lore.kernel.org/r/20260308000515.890688-1-rafael%40rcpassos.me
patch subject: [PATCH] drm/amd/display: fix resuming from S3 sleep for Renoir iGPU
config: x86_64-rhel-9.4 (https://download.01.org/0day-ci/archive/20260309/202603090058.Jvh5jmdd-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260309/202603090058.Jvh5jmdd-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202603090058.Jvh5jmdd-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/gpu/drm/amd/amdgpu/../display/dc/dccg/dcn21/dcn21_dccg.c:99:6: warning: no previous prototype for 'dccg21_init' [-Wmissing-prototypes]
      99 | void dccg21_init(struct dccg *dccg)
         |      ^~~~~~~~~~~


vim +/dccg21_init +99 drivers/gpu/drm/amd/amdgpu/../display/dc/dccg/dcn21/dcn21_dccg.c

    98	
  > 99	void dccg21_init(struct dccg *dccg)
   100	{
   101		struct dcn_dccg *dccg_dcn = TO_DCN_DCCG(dccg);
   102	
   103		/* Hardcoded register values for DCN21
   104		 * These are specific to 100Mhz refclk
   105		 * Different ASICs with different refclk may override this in their own init
   106		 */
   107		REG_WRITE(MICROSECOND_TIME_BASE_DIV, 0x00120464);
   108		REG_WRITE(MILLISECOND_TIME_BASE_DIV, 0x001186a0);
   109		REG_WRITE(DISPCLK_FREQ_CHANGE_CNTL, 0x0e01003c);
   110	
   111		if (REG(REFCLK_CNTL))
   112			REG_WRITE(REFCLK_CNTL, 0);
   113	}
   114	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

* Re: [PATCH] drm/amd/display: fix resuming from S3 sleep for Renoir iGPU
  2026-03-08  0:04     ` [PATCH] drm/amd/display: fix resuming from S3 sleep for Renoir iGPU Rafael Passos
  2026-03-08  8:19       ` kernel test robot
  2026-03-08 16:23       ` kernel test robot
@ 2026-03-09 17:02       ` Leo Li
  2026-03-09 17:39         ` Rafael Passos
  2026-03-09 18:12       ` kernel test robot
  3 siblings, 1 reply; 9+ messages in thread
From: Leo Li @ 2026-03-09 17:02 UTC (permalink / raw)
  To: Rafael Passos, alexdeucher
  Cc: BhuvanaChandra.Pinninti, Harry.Wentland, Martin.Leung,
	alexander.deucher, amd-gfx, daniel.wheeler, davidbtadokoro,
	dri-devel, linux-kernel, ray.wu, rcpassos, siqueira, Ivan Lipski



On 2026-03-07 19:04, Rafael Passos wrote:
> [WHAT]
> Set the register offset MICROSECOND_TIME_BASE_DIV in dccg_registers for DCN21.
> Introduce a new dccg21_init function, used in dccg_funcs.dccg_init for DCN21.
> The new dccg21_init writes 0x00120464 to the MICROSECOND_TIME_BASE_DIV
> register instead of the 0x00120264 written by dccg2_init.
> 
> [WHY]
> A previous commit changed dcn21_s0i3_golden_init_wa: the function used to
> read the MICROSECOND_TIME_BASE_DIV register through hwseq, and now reads
> it through dccg using dccg2_is_s0i3_golden_init_wa_done.
> However, this register is not properly initialized in dccg.
> Also, dccg2_init initialized the value to 0x00120264, while the check
> compares it to 0x00120464. For this reason, we created a new dccg21_init
> with the values specific to this card.
> 
> Fixes: 4c595e75110e ("drm/amd/display: Migrate DCCG registers access from hwseq to dccg component.")
> Signed-off-by: Rafael Passos <rafael@rcpassos.me>
> Co-developed-by: David Tadokoro <davidbtadokoro@ime.usp.br>
> Signed-off-by: David Tadokoro <davidbtadokoro@ime.usp.br>
> ---
> 
> It took a lot of debugging to get to this point.
> We are not sure this is the right fix, but it works.
> We found that when reading the MICROSECOND_TIME_BASE_DIV register,
> the offset was 13b in the old path and 0 in the new path.
> 
> The dcn21_s0i3_golden_init_wa is called when booting
> and when waking from sleep. It compares the value from
> MICROSECOND_TIME_BASE_DIV to 0x00120464.
> When booting, the value is different, so this function returns true.
> When waking from sleep, the value should be equal, so
> this function should return false.
> 
> After 4c595e75110e, the value was always different from 0x00120464, so
> this function always returned true, and the screen failed to wake.
> This happened because the offset of MICROSECOND_TIME_BASE_DIV was 0,
> and REG_READ always returned 0x1186A0 (value from MILLISECOND_TIME_BASE_DIV?).
> 
> Things we are unsure of:
> - We used SR to set MICROSECOND_TIME_BASE_DIV directly in the
>         dccg_registers struct. We did not find other examples of this.
>         Should we add MICROSECOND_TIME_BASE_DIV to DCCG_COMMON_REG_LIST_DCN_BASE instead?
>         I only added it to DCN21, because it is the hardware I have (and validated it works).
> - We changed 0x00120264 to 0x00120464 in the init, but dccg2 has the
>         same mismatch between the value written and the value checked. We would
>         like to know if this issue also affects dccg2 (and other cards), or if we
>         are missing something.
>         Maybe we should change this value in dccg2_is_s0i3_golden_init_wa_done.
> 
> It applies to the mainline master, amdgpu drm-next and amd-staging-drm-next.
> 
> Any feedback is appreciated. It was a fun-frustrating-veryfun journey. :)
> Code written only by humans.

Hi Rafael,

Thanks for bisecting and identifying the root cause. A fix has been submitted here:
https://lore.kernel.org/all/20260306031932.136179-14-alex.hung@amd.com/

Additionally, the offending change missed updating register definitions, which was
fixed here:
https://lore.kernel.org/all/20260306031932.136179-10-alex.hung@amd.com/

- Leo

> 
> 
>  .../drm/amd/display/dc/dccg/dcn21/dcn21_dccg.c  | 17 ++++++++++++++++-
>  .../display/dc/resource/dcn21/dcn21_resource.c  |  3 ++-
>  2 files changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/dc/dccg/dcn21/dcn21_dccg.c b/drivers/gpu/drm/amd/display/dc/dccg/dcn21/dcn21_dccg.c
> index 75c69348027e..6f96e9c189dc 100644
> --- a/drivers/gpu/drm/amd/display/dc/dccg/dcn21/dcn21_dccg.c
> +++ b/drivers/gpu/drm/amd/display/dc/dccg/dcn21/dcn21_dccg.c
> @@ -96,6 +96,21 @@ static void dccg21_update_dpp_dto(struct dccg *dccg, int dpp_inst, int req_dppcl
>         dccg->pipe_dppclk_khz[dpp_inst] = req_dppclk;
>  }
> 
> +void dccg21_init(struct dccg *dccg)
> +{
> +       struct dcn_dccg *dccg_dcn = TO_DCN_DCCG(dccg);
> +
> +       /* Hardcoded register values for DCN21
> +        * These are specific to 100Mhz refclk
> +        * Different ASICs with different refclk may override this in their own init
> +        */
> +       REG_WRITE(MICROSECOND_TIME_BASE_DIV, 0x00120464);
> +       REG_WRITE(MILLISECOND_TIME_BASE_DIV, 0x001186a0);
> +       REG_WRITE(DISPCLK_FREQ_CHANGE_CNTL, 0x0e01003c);
> +
> +       if (REG(REFCLK_CNTL))
> +               REG_WRITE(REFCLK_CNTL, 0);
> +}
> 
>  static const struct dccg_funcs dccg21_funcs = {
>         .update_dpp_dto = dccg21_update_dpp_dto,
> @@ -103,7 +118,7 @@ static const struct dccg_funcs dccg21_funcs = {
>         .set_fifo_errdet_ovr_en = dccg2_set_fifo_errdet_ovr_en,
>         .otg_add_pixel = dccg2_otg_add_pixel,
>         .otg_drop_pixel = dccg2_otg_drop_pixel,
> -       .dccg_init = dccg2_init,
> +       .dccg_init = dccg21_init,
>         .refclk_setup = dccg2_refclk_setup, /* Deprecated - for backward compatibility only */
>         .allow_clock_gating = dccg2_allow_clock_gating,
>         .enable_memory_low_power = dccg2_enable_memory_low_power,
> diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn21/dcn21_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn21/dcn21_resource.c
> index 0f4307f8f3dd..7f8f657eb0f2 100644
> --- a/drivers/gpu/drm/amd/display/dc/resource/dcn21/dcn21_resource.c
> +++ b/drivers/gpu/drm/amd/display/dc/resource/dcn21/dcn21_resource.c
> @@ -222,7 +222,8 @@ static const struct dce_audio_mask audio_mask = {
>  };
> 
>  static const struct dccg_registers dccg_regs = {
> -               DCCG_COMMON_REG_LIST_DCN_BASE()
> +               DCCG_COMMON_REG_LIST_DCN_BASE(),
> +               SR(MICROSECOND_TIME_BASE_DIV)
>  };
> 
>  static const struct dccg_shift dccg_shift = {
> --
> 2.53.0
> 
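
The failure mode described in the quoted notes (a register whose offset is left at 0, so the check reads the wrong location and never sees 0x00120464) can be illustrated with a small toy model. This is only a sketch under stated assumptions, not driver code: struct toy_dccg, toy_reg_read, toy_s0i3_wa_check and the offset 0x10 are hypothetical names and values made up for illustration. Only the constant 0x00120464 comes from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_REG_COUNT           0x20u
#define TOY_US_TIME_BASE_OFFSET 0x10u       /* hypothetical offset, not the real one */
#define GOLDEN_US_TIME_BASE     0x00120464u /* value the check compares against */

/* Toy register file standing in for MMIO space; not real hardware access. */
struct toy_dccg {
	uint32_t regs[TOY_REG_COUNT];
	uint32_t us_time_base_offset; /* stays 0 if missing from the register list */
};

/* Read one toy register; the assert only guards the toy array bounds. */
static uint32_t toy_reg_read(const struct toy_dccg *dccg, uint32_t offset)
{
	assert(offset < TOY_REG_COUNT);
	return dccg->regs[offset];
}

/* Returns true when the value read differs from the golden value. */
static bool toy_s0i3_wa_check(const struct toy_dccg *dccg)
{
	return toy_reg_read(dccg, dccg->us_time_base_offset) != GOLDEN_US_TIME_BASE;
}
```

In this model, once the golden value has been written at the populated offset the check returns false. With the offset left at 0 the check reads an unrelated location and returns true regardless, which matches the behaviour reported after 4c595e75110e.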


* Re: [PATCH] drm/amd/display: fix resuming from S3 sleep for Renoir iGPU
  2026-03-09 17:02       ` Leo Li
@ 2026-03-09 17:39         ` Rafael Passos
  0 siblings, 0 replies; 9+ messages in thread
From: Rafael Passos @ 2026-03-09 17:39 UTC (permalink / raw)
  To: Leo Li, Rafael Passos, alexdeucher
  Cc: BhuvanaChandra.Pinninti, Harry.Wentland, Martin.Leung,
	alexander.deucher, amd-gfx, daniel.wheeler, davidbtadokoro,
	dri-devel, linux-kernel, ray.wu, rcpassos, siqueira, Ivan Lipski

On Mon Mar 9, 2026 at 2:02 PM -03, Leo Li wrote:
> Hi Rafael,
>
> Thanks for bisecting and identifying the root cause. A fix has been submitted here:
> https://lore.kernel.org/all/20260306031932.136179-14-alex.hung@amd.com/
>
> Additionally, the offending change missed updating register definitions, which was
> fixed here:
> https://lore.kernel.org/all/20260306031932.136179-10-alex.hung@amd.com/
>
> - Leo

Hi Leo,

Thanks for replying.
We missed that patch, since there is no reference to the report.
I understand the implementation ended up differently, and our patch is not
going forward. But at least we came to the same conclusions. :)

If I could ask, at least a
"Reported-by: Rafael Passos <rafael@rcpassos.me>"
tag would be nice, given the effort put into this.

Either way, thank you.

- Rafael Passos

* Re: [PATCH] drm/amd/display: fix resuming from S3 sleep for Renoir iGPU
  2026-03-08  0:04     ` [PATCH] drm/amd/display: fix resuming from S3 sleep for Renoir iGPU Rafael Passos
                         ` (2 preceding siblings ...)
  2026-03-09 17:02       ` Leo Li
@ 2026-03-09 18:12       ` kernel test robot
  3 siblings, 0 replies; 9+ messages in thread
From: kernel test robot @ 2026-03-09 18:12 UTC (permalink / raw)
  To: Rafael Passos, alexdeucher
  Cc: llvm, oe-kbuild-all, BhuvanaChandra.Pinninti, Harry.Wentland,
	Martin.Leung, Sunpeng.Li, alexander.deucher, amd-gfx,
	daniel.wheeler, davidbtadokoro, dri-devel, linux-kernel, rafael,
	ray.wu, rcpassos, siqueira

Hi Rafael,

kernel test robot noticed the following build warnings:

[auto build test WARNING on drm-misc/drm-misc-next]
[also build test WARNING on linus/master v7.0-rc3 next-20260306]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Rafael-Passos/drm-amd-display-fix-resuming-from-S3-sleep-for-Renoir-iGPU/20260308-080715
base:   https://gitlab.freedesktop.org/drm/misc/kernel.git drm-misc-next
patch link:    https://lore.kernel.org/r/20260308000515.890688-1-rafael%40rcpassos.me
patch subject: [PATCH] drm/amd/display: fix resuming from S3 sleep for Renoir iGPU
config: x86_64-rhel-9.4-rust (https://download.01.org/0day-ci/archive/20260310/202603100246.Tkzia4Ba-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
rustc: rustc 1.88.0 (6b00bc388 2025-06-23)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260310/202603100246.Tkzia4Ba-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202603100246.Tkzia4Ba-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/gpu/drm/amd/amdgpu/../display/dc/dccg/dcn21/dcn21_dccg.c:99:6: warning: no previous prototype for function 'dccg21_init' [-Wmissing-prototypes]
      99 | void dccg21_init(struct dccg *dccg)
         |      ^
   drivers/gpu/drm/amd/amdgpu/../display/dc/dccg/dcn21/dcn21_dccg.c:99:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
      99 | void dccg21_init(struct dccg *dccg)
         | ^
         | static 
   1 warning generated.


vim +/dccg21_init +99 drivers/gpu/drm/amd/amdgpu/../display/dc/dccg/dcn21/dcn21_dccg.c

    98	
  > 99	void dccg21_init(struct dccg *dccg)
   100	{
   101		struct dcn_dccg *dccg_dcn = TO_DCN_DCCG(dccg);
   102	
   103		/* Hardcoded register values for DCN21
   104		 * These are specific to 100Mhz refclk
   105		 * Different ASICs with different refclk may override this in their own init
   106		 */
   107		REG_WRITE(MICROSECOND_TIME_BASE_DIV, 0x00120464);
   108		REG_WRITE(MILLISECOND_TIME_BASE_DIV, 0x001186a0);
   109		REG_WRITE(DISPCLK_FREQ_CHANGE_CNTL, 0x0e01003c);
   110	
   111		if (REG(REFCLK_CNTL))
   112			REG_WRITE(REFCLK_CNTL, 0);
   113	}
   114	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

Thread overview: 9+ messages
2026-02-27 23:01 [bug report] 7.0-rc1 flip_done timed out: amd igpu off when resuming in laptop (regression) Rafael Passos
2026-03-04 12:27 ` Rafael Passos
2026-03-04 16:32   ` Alex Deucher
2026-03-08  0:04     ` [PATCH] drm/amd/display: fix resuming from S3 sleep for Renoir iGPU Rafael Passos
2026-03-08  8:19       ` kernel test robot
2026-03-08 16:23       ` kernel test robot
2026-03-09 17:02       ` Leo Li
2026-03-09 17:39         ` Rafael Passos
2026-03-09 18:12       ` kernel test robot
