From: Tejas Upadhyay <tejas.upadhyay@intel.com>
To: intel-xe@lists.freedesktop.org
Cc: matthew.auld@intel.com, matthew.brost@intel.com,
himal.prasad.ghimiray@intel.com,
Tejas Upadhyay <tejas.upadhyay@intel.com>
Subject: [RFC PATCH 0/5] Add memory page offlining support
Date: Wed, 11 Feb 2026 10:31:33 +0530
Message-ID: <20260211050132.1332599-7-tejas.upadhyay@intel.com>
This functionality represents a significant step in making
the xe driver gracefully handle hardware memory degradation.
By integrating with the DRM Buddy allocator, the driver
can permanently "carve out" faulty memory so it isn't reused
by subsequent allocations.
This series adds memory page offlining support with the following:
1. Link and track TTM BOs by physical address
2. Handle a reported physical address error by reserving the 4K page that contains the address
3. Add a debugfs entry to manually inject a physical address error
4. Add a buddy block allocation dump for debugging buddy-related issues
5. Add a sysfs entry that reports statistics on bad GPU VRAM pages to the user
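To make items 1 and 2 concrete, here is a minimal user-space sketch of the two calculations involved: finding which tracked allocation contains a faulty physical address, and rounding that address down to the start of its 4K page. It is not code from the series. The names sketch_block, sketch_page_start() and sketch_addr_to_block() are hypothetical, and a flat array stands in for whatever tracking structure the patches actually use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL /* 4K page, as in the cover letter */

/* Hypothetical record of one tracked allocation: [start, start + size). */
struct sketch_block {
	uint64_t start;
	uint64_t size;
	int owner_id; /* stand-in for the owning BO */
};

/* Round a faulty physical address down to the start of its 4K page. */
static uint64_t sketch_page_start(uint64_t addr)
{
	return addr & ~(SKETCH_PAGE_SIZE - 1);
}

/* Return the tracked block that contains addr, or NULL if none does. */
static const struct sketch_block *
sketch_addr_to_block(const struct sketch_block *blocks, size_t n, uint64_t addr)
{
	for (size_t i = 0; i < n; i++) {
		assert(blocks[i].size > 0);
		/* Compare the offset to the size so start + size cannot overflow. */
		if (addr >= blocks[i].start &&
		    addr - blocks[i].start < blocks[i].size)
			return &blocks[i];
	}
	return NULL;
}
```

In this sketch, a NULL result means the faulty address is not currently owned by any tracked allocation, so only the page reservation would be needed. How the series itself treats that case is not described in this cover letter.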
Opens:
1. mm->avail under drm_buddy triggers WARN_ON(mm->avail != mm->size) even though
no memory is leaked and multiple bind/unbind cycles work fine. Debug in progress.
2. dump_allocated_blocks() and xe_ttm_vram_addr_to_tbo() will move under drm_buddy;
for now they are part of the xe code just to showcase the concept.
Tejas Upadhyay (5):
drm/xe: Implement VRAM object tracking ability using physical address
drm/xe: Handle physical memory address error
[DO_NOT_REVIEW]drm/xe/cri: Add debugfs to inject faulty vram address
drm/xe: Add routine to dump allocated VRAM blocks
[DO NOT REVIEW]drm/xe/cri: Add sysfs interface for bad gpu vram pages
drivers/gpu/drm/xe/xe_debugfs.c | 49 +++
drivers/gpu/drm/xe/xe_device_sysfs.c | 2 +
drivers/gpu/drm/xe/xe_tile_sysfs.c | 1 +
drivers/gpu/drm/xe/xe_ttm_vram_mgr.c | 366 +++++++++++++++++++++
drivers/gpu/drm/xe/xe_ttm_vram_mgr.h | 6 +-
drivers/gpu/drm/xe/xe_ttm_vram_mgr_types.h | 23 ++
6 files changed, 446 insertions(+), 1 deletion(-)
--
2.52.0
Thread overview: 8+ messages
2026-02-11 5:01 Tejas Upadhyay [this message]
2026-02-11 5:01 ` [RFC PATCH 1/5] drm/xe: Implement VRAM object tracking ability using physical address Tejas Upadhyay
2026-02-11 6:26 ` Matthew Brost
2026-02-12 4:49 ` Upadhyay, Tejas
2026-02-11 5:01 ` [RFC PATCH 2/5] drm/xe: Handle physical memory address error Tejas Upadhyay
2026-02-11 5:01 ` [RFC PATCH 3/5] [DO_NOT_REVIEW]drm/xe/cri: Add debugfs to inject faulty vram address Tejas Upadhyay
2026-02-11 5:01 ` [RFC PATCH 4/5] drm/xe: Add routine to dump allocated VRAM blocks Tejas Upadhyay
2026-02-11 5:01 ` [RFC PATCH 5/5] [DO NOT REVIEW]drm/xe/cri: Add sysfs interface for bad gpu vram pages Tejas Upadhyay