From: Samuel Zhang <guoqing.zhang@amd.com>
To: <rafael@kernel.org>, <len.brown@intel.com>, <pavel@kernel.org>,
<alexander.deucher@amd.com>, <christian.koenig@amd.com>,
<mario.limonciello@amd.com>, <lijo.lazar@amd.com>
Cc: <victor.zhao@amd.com>, <haijun.chang@amd.com>, <Qing.Ma@amd.com>,
<amd-gfx@lists.freedesktop.org>,
<dri-devel@lists.freedesktop.org>, <linux-pm@vger.kernel.org>,
<linux-kernel@vger.kernel.org>,
Samuel Zhang <guoqing.zhang@amd.com>
Subject: [PATCH 0/3] reduce system memory requirement for hibernation
Date: Mon, 30 Jun 2025 18:41:13 +0800
Message-ID: <20250630104116.3050306-1-guoqing.zhang@amd.com>
Modern data center dGPUs are usually equipped with very large VRAM. On a
server with such dGPUs (192GB VRAM * 8) and 2TB of system memory,
hibernation fails because there is not enough free memory.
The root cause is that during hibernation all VRAM content is evicted to
GTT or shmem. In both cases it resides in system memory, and the kernel
tries to copy those pages into the hibernation image. In the worst case
this leaves 2 copies of the VRAM content in system memory, and 2TB is not
enough for the hibernation image: 192GB * 8 * 2 = 3TB > 2TB.
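
The arithmetic above can be reproduced with a short script. This is only
a worked example of the numbers quoted in this cover letter; the function
name and the `copies` parameter are illustrative and do not correspond to
anything in the kernel.

```python
GB_PER_TB = 1024


def worst_case_system_memory_gb(vram_per_gpu_gb, num_gpus, copies=2):
    """System memory (GB) taken by VRAM content in the worst case.

    copies=2 reflects the worst case described above: one copy evicted to
    GTT/shmem plus one copy made for the hibernation image.
    """
    return vram_per_gpu_gb * num_gpus * copies


needed_gb = worst_case_system_memory_gb(192, 8)
system_gb = 2 * GB_PER_TB

print(f"needed: {needed_gb} GB, available: {system_gb} GB")
print("fits" if needed_gb <= system_gb else "does not fit")
```
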
The fix consists of the following 2 changes. With them, far fewer pages
need to be copied to the hibernation image and hibernation can succeed.
1. Move GTT to shmem after evicting VRAM, so that the GTT pages can be freed.
2. Force-write shmem pages to the swap disk and free the shmem pages.
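
The effect of the two changes on memory accounting can be sketched with a
toy model. It is not kernel code: the `Memory` class and the three step
functions are hypothetical names that only track how many GB of VRAM
content sit in each place, assuming every step moves the full amount.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    vram_gb: int = 0
    gtt_gb: int = 0    # system memory
    shmem_gb: int = 0  # system memory
    swap_gb: int = 0   # on disk

    def resident_gb(self):
        """VRAM content held in system memory, i.e. pages the
        hibernation image would have to copy."""
        return self.gtt_gb + self.shmem_gb


def evict_vram_to_gtt(m):
    # Existing behaviour: VRAM content lands in system memory.
    m.gtt_gb += m.vram_gb
    m.vram_gb = 0


def move_gtt_to_shmem(m):
    # Change 1: GTT content goes to shmem and the GTT pages are freed.
    m.shmem_gb += m.gtt_gb
    m.gtt_gb = 0


def write_shmem_to_swap(m):
    # Change 2: shmem pages are written to swap and freed.
    m.swap_gb += m.shmem_gb
    m.shmem_gb = 0


m = Memory(vram_gb=192 * 8)
evict_vram_to_gtt(m)
before = m.resident_gb()   # without the fix
move_gtt_to_shmem(m)
write_shmem_to_swap(m)
after = m.resident_gb()    # with both changes applied

print(f"resident before: {before} GB, after: {after} GB, swap: {m.swap_gb} GB")
```

In this model the data is not lost, it just ends up on the swap disk
instead of being counted towards the hibernation image.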
After GTT is swapped out to shmem in the hibernation prepare stage,
swapping in and restoring the BOs in the thaw stage takes a long time
(50 minutes observed for 8 dGPUs). This is unnecessary, since the
follow-up hibernation stages do not use the GPU when hibernation
succeeds. The third patch simply skips the BO restore in the thaw stage
to reduce the hibernation time.
Samuel Zhang (3):
drm/amdgpu: move GTT to SHM after eviction for hibernation
PM: hibernate: shrink shmem pages after dev_pm_ops.prepare()
drm/amdgpu: skip kfd resume_process for dev_pm_ops.thaw()
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 13 ++++++++++++-
drivers/gpu/drm/ttm/ttm_resource.c | 18 ++++++++++++++++++
include/drm/ttm/ttm_resource.h | 1 +
kernel/power/hibernate.c | 13 +++++++++++++
6 files changed, 47 insertions(+), 2 deletions(-)
--
2.43.5