From: Matthew Brost <matthew.brost@intel.com>
To: intel-xe@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
"Carlos Santa" <carlos.santa@intel.com>,
"Christian Koenig" <christian.koenig@amd.com>,
"Huang Rui" <ray.huang@amd.com>,
"Matthew Auld" <matthew.auld@intel.com>,
"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
"Maxime Ripard" <mripard@kernel.org>,
"Thomas Zimmermann" <tzimmermann@suse.de>,
"David Airlie" <airlied@gmail.com>,
"Simona Vetter" <simona@ffwll.ch>,
"Daniel Colascione" <dancol@dancol.org>
Subject: [PATCH 0/3] drm/ttm, drm/xe: Avoid reclaim/eviction loops under fragmentation
Date: Mon, 20 Apr 2026 18:26:05 -0700 [thread overview]
Message-ID: <20260421012608.1474950-1-matthew.brost@intel.com> (raw)
TTM allocations at higher orders can drive Xe into a pathological
reclaim loop when memory is fragmented:
kswapd → shrinker → eviction → rebind (exec ioctl) → repeat
In this state, reclaim is triggered despite substantial free memory,
but fails to produce contiguous higher-order pages. The Xe shrinker then
evicts active buffer objects, increasing faulting and rebind activity
and further feeding the loop. The result is high CPU overhead and poor
GPU forward progress.
This issue was first reported in [1] and independently observed
internally and by Google.
A simple reproducer is:
- Boot an iGPU system with mem=8G
- Launch 10 Chrome tabs running the WebGL aquarium demo
- Configure each tab with ~5k fish
Under this workload, ftrace shows a continuous loop of:
xe_shrinker_scan (kswapd)
xe_vma_rebind_exec
Performance degrades significantly, with each tab dropping to ~2 FPS on
PTL.
At the same time, /proc/buddyinfo shows substantial free memory but no
higher-order availability. For example, the Normal zone:
Count: 4063 4595 3455 3400 3139 2762 2293 1655 643 0 0
This corresponds to ~2.8GB free memory, but no order-9 (2MB) blocks,
indicating severe fragmentation.
This series addresses the issue in two ways:
TTM: Restrict direct reclaim to beneficial_order. Larger allocations
use __GFP_NORETRY to fail quickly rather than triggering reclaim.
Xe: Introduce a heuristic in the shrinker to avoid eviction when
running under kswapd and the system appears memory-rich but
fragmented.
With these changes, the reclaim/eviction loop is eliminated. The same
workload improves to ~10 FPS per tab, and kswapd activity subsides.
Buddyinfo after applying this series shows restored higher-order
availability:
Count: 8526 7067 3092 1959 1292 660 194 28 20 13 1
Matt
[1] https://patchwork.freedesktop.org/patch/716404/?series=164353&rev=1
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Carlos Santa <carlos.santa@intel.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@gmail.com>
Cc: Simona Vetter <simona@ffwll.ch>
CC: dri-devel@lists.freedesktop.org
Cc: Daniel Colascione <dancol@dancol.org>
Matthew Brost (3):
drm/ttm: Issue direct reclaim at beneficial_order
drm/xe: Set TTM device beneficial_order to 9 (2M)
drm/xe: Avoid shrinker reclaim from kswapd under fragmentation
drivers/gpu/drm/ttm/ttm_pool.c | 4 ++--
drivers/gpu/drm/xe/xe_device.c | 3 ++-
drivers/gpu/drm/xe/xe_shrinker.c | 13 +++++++++++++
3 files changed, 17 insertions(+), 3 deletions(-)
--
2.34.1
next reply other threads:[~2026-04-21 1:26 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-21 1:26 Matthew Brost [this message]
2026-04-21 1:26 ` [PATCH 1/3] drm/ttm: Issue direct reclaim at beneficial_order Matthew Brost
2026-04-21 6:11 ` Christian König
2026-04-22 4:12 ` Matthew Brost
2026-04-22 6:41 ` Christian König
2026-04-22 7:32 ` Tvrtko Ursulin
2026-04-22 7:41 ` Christian König
2026-04-22 20:41 ` Matthew Brost
2026-04-23 8:44 ` Christian König
2026-04-28 13:45 ` Tvrtko Ursulin
2026-04-29 22:52 ` Daniel Colascione
2026-04-30 0:11 ` Dave Airlie
2026-04-30 7:59 ` Thomas Hellström
2026-04-30 8:14 ` Christian König
2026-04-30 7:34 ` Christian König
2026-04-30 3:00 ` Matthew Brost
2026-05-01 20:04 ` Thadeu Lima de Souza Cascardo
2026-04-21 1:26 ` [PATCH 2/3] drm/xe: Set TTM device beneficial_order to 9 (2M) Matthew Brost
2026-04-21 1:26 ` [PATCH 3/3] drm/xe: Avoid shrinker reclaim from kswapd under fragmentation Matthew Brost
2026-04-22 8:22 ` Thomas Hellström
2026-04-22 20:27 ` Matthew Brost
2026-04-21 5:56 ` ✓ CI.KUnit: success for drm/ttm, drm/xe: Avoid reclaim/eviction loops " Patchwork
2026-04-21 6:43 ` ✓ Xe.CI.BAT: " Patchwork
2026-04-21 8:29 ` ✗ Xe.CI.FULL: failure " Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260421012608.1474950-1-matthew.brost@intel.com \
--to=matthew.brost@intel.com \
--cc=airlied@gmail.com \
--cc=carlos.santa@intel.com \
--cc=christian.koenig@amd.com \
--cc=dancol@dancol.org \
--cc=dri-devel@lists.freedesktop.org \
--cc=intel-xe@lists.freedesktop.org \
--cc=maarten.lankhorst@linux.intel.com \
--cc=matthew.auld@intel.com \
--cc=mripard@kernel.org \
--cc=ray.huang@amd.com \
--cc=simona@ffwll.ch \
--cc=thomas.hellstrom@linux.intel.com \
--cc=tzimmermann@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox