From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: intel-xe@lists.freedesktop.org
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
"Somalapuram Amaranath" <Amaranath.Somalapuram@amd.com>,
"Christian König" <christian.koenig@amd.com>,
"Matthew Brost" <matthew.brost@intel.com>,
dri-devel@lists.freedesktop.org
Subject: [PATCH v5 00/12] TTM shrinker helpers and xe buffer object shrinker
Date: Tue, 18 Jun 2024 09:18:08 +0200
Message-ID: <20240618071820.130917-1-thomas.hellstrom@linux.intel.com>
This series implements TTM shrinker / eviction helpers and an xe bo
shrinker. It builds on two previous series, *and obsoletes these*. First,
https://www.mail-archive.com/dri-devel@lists.freedesktop.org/msg484425.html
Second, the previous TTM shrinker series
https://lore.kernel.org/linux-mm/b7491378-defd-4f1c-31e2-29e4c77e2d67@amd.com/T/
where the comment about layering
https://lore.kernel.org/linux-mm/b7491378-defd-4f1c-31e2-29e4c77e2d67@amd.com/T/#ma918844aa8a6efe8768fdcda0c6590d5c93850c9
is now addressed. This version also uses shmem objects for backup
rather than the direct swap-cache insertions used in the previous
series. It turns out that with per-page backup / shrinking, shmem objects
appear to work just as well as direct swap-cache insertions, with the
added benefit that the mechanism introduced in the previous TTM shrinker
series to avoid running out of swap entries isn't really needed.
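To make the per-page backup idea above concrete, here is a minimal
userspace sketch. It is not the TTM or shmem code: toy_backup, toy_bo,
backup_page(), restore_page(), toy_bo_shrink() and toy_bo_recover() are
hypothetical names made up for illustration, and a fixed in-memory array
stands in for the shmem object. The point it tries to show is only that
each page is handed to the backup store and freed before the next one is
touched, so shrinking never needs a large temporary allocation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SZ 4096	/* illustrative page size */
#define NPAGES  4	/* pages in the toy buffer object */

/* Toy backup store with one slot per page, standing in for a shmem object. */
struct toy_backup {
	unsigned char store[NPAGES][PAGE_SZ];
	unsigned char used[NPAGES];
};

struct toy_bo {
	void *pages[NPAGES];		/* NULL when backed up */
	unsigned long handles[NPAGES];	/* 0 when resident */
};

/* Store one page; returns a nonzero handle, or 0 if the store is full. */
static unsigned long backup_page(struct toy_backup *b, const void *page)
{
	for (unsigned long i = 0; i < NPAGES; i++) {
		if (b->used[i])
			continue;
		memcpy(b->store[i], page, PAGE_SZ);
		b->used[i] = 1;
		return i + 1;	/* handle 0 means "no backup" */
	}
	return 0;
}

/* Copy a stored page into dst and release its slot; returns 0 on success. */
static int restore_page(struct toy_backup *b, void *dst, unsigned long handle)
{
	if (!handle || handle > NPAGES || !b->used[handle - 1])
		return -1;
	memcpy(dst, b->store[handle - 1], PAGE_SZ);
	b->used[handle - 1] = 0;
	return 0;
}

/* Back up and free one page at a time; returns the number of pages freed. */
static int toy_bo_shrink(struct toy_bo *bo, struct toy_backup *b)
{
	int freed = 0;

	for (int i = 0; i < NPAGES; i++) {
		if (!bo->pages[i])
			continue;
		bo->handles[i] = backup_page(b, bo->pages[i]);
		if (!bo->handles[i])
			break;	/* partial shrink: the rest stays resident */
		free(bo->pages[i]);	/* memory is released page by page */
		bo->pages[i] = NULL;
		freed++;
	}
	return freed;
}

/* Bring backed-up pages back one at a time; returns 0 on success. */
static int toy_bo_recover(struct toy_bo *bo, struct toy_backup *b)
{
	for (int i = 0; i < NPAGES; i++) {
		void *page;

		if (!bo->handles[i])
			continue;
		page = malloc(PAGE_SZ);
		if (!page || restore_page(b, page, bo->handles[i])) {
			free(page);
			return -1;	/* handle kept, so a retry is possible */
		}
		bo->handles[i] = 0;
		bo->pages[i] = page;
	}
	return 0;
}

/* Fill, shrink, recover and verify; returns the number of failed checks. */
static int demo_shrink_recover(void)
{
	static struct toy_backup backup;	/* zero-initialized */
	struct toy_bo bo = { { NULL }, { 0 } };
	int bad = 0;

	for (int i = 0; i < NPAGES; i++) {
		bo.pages[i] = malloc(PAGE_SZ);
		assert(bo.pages[i]);
		memset(bo.pages[i], i + 1, PAGE_SZ);
	}
	bad += toy_bo_shrink(&bo, &backup) != NPAGES;
	bad += toy_bo_recover(&bo, &backup) != 0;
	for (int i = 0; i < NPAGES; i++) {
		bad += !bo.pages[i] || ((unsigned char *)bo.pages[i])[0] != i + 1;
		free(bo.pages[i]);
	}
	return bad;
}
```

A failed backup_page() simply ends the loop, leaving the remaining pages
resident, which is one plausible way to make a shrink attempt partial
rather than all-or-nothing.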
Patches 1-4 implement restartable LRU list iteration.
Patch 5 implements an LRU walker + resv locking helper.
Patch 6 moves TTM swapping over to the walker.
Patch 7 moves TTM eviction over to the walker.
Patch 8 could in theory be skipped but introduces a possibility to easily
add or test multiple backup backends, like the direct swap-cache
insertion or even files on fast dedicated NVMe storage, for example.
Patch 9 introduces helpers in the ttm_pool code for page-by-page shrinking
and recovery. It avoids having to temporarily allocate a huge amount of
memory to be able to shrink a buffer object. It also introduces the
possibility to immediately write-back pages if needed, since that tends
to be a bit delayed when left to kswapd.
Patch 10 adds a simple error injection to the above code to help increase
test coverage.
Patch 11 implements an xe bo shrinker and a common helper in TTM for
shrinking.
Patch 12 increases (removes) the XE_PL_TT watermark.
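As a rough illustration of what restartable LRU iteration with a hitch
can look like, here is a minimal userspace sketch. It is not the TTM
implementation: struct node, struct cursor, cursor_next() and the other
names are hypothetical, locking is only hinted at in comments, and bulk
sublist moves are not modelled. The idea shown is that the walker keeps
its position as a dummy node inside the list, so items can be removed or
added between steps without invalidating the walk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node; a hitch is a node that carries no payload. */
struct node {
	struct node *prev, *next;
	int is_hitch;
	int id;		/* payload, unused for hitches and the list head */
};

static void list_init(struct node *head)
{
	head->prev = head->next = head;
}

static void list_add_after(struct node *n, struct node *pos)
{
	n->prev = pos;
	n->next = pos->next;
	pos->next->prev = n;
	pos->next = n;
}

static void list_del(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}

static void item_add_tail(struct node *n, int id, struct node *head)
{
	n->is_hitch = 0;
	n->id = id;
	list_add_after(n, head->prev);
}

/* Cursor that remembers its position by living in the list itself. */
struct cursor {
	struct node *head;
	struct node hitch;
};

static void cursor_init(struct cursor *c, struct node *head)
{
	c->head = head;
	c->hitch.is_hitch = 1;
	list_add_after(&c->hitch, head);	/* start before the first item */
}

/*
 * Return the next item and park the hitch right behind it. In a real
 * implementation only this function would run under the list lock.
 */
static struct node *cursor_next(struct cursor *c)
{
	struct node *n = c->hitch.next;

	while (n != c->head && n->is_hitch)	/* skip other walkers' hitches */
		n = n->next;
	if (n == c->head)
		return NULL;
	list_del(&c->hitch);
	list_add_after(&c->hitch, n);
	return n;
}

static void cursor_fini(struct cursor *c)
{
	list_del(&c->hitch);
}

/* Walk items 1..4 while the list changes between steps; returns visit count. */
static int demo_walk(int *visited, int max)
{
	struct node head, items[5];
	struct cursor c;
	struct node *n;
	int count = 0;

	list_init(&head);
	for (int i = 0; i < 4; i++)
		item_add_tail(&items[i], i + 1, &head);
	cursor_init(&c, &head);
	while (count < max && (n = cursor_next(&c))) {
		visited[count++] = n->id;
		if (n->id == 1) {
			/* "Lock dropped": the list changes under the walker. */
			list_del(&items[0]);			/* item 1 processed */
			list_del(&items[1]);			/* item 2 removed */
			item_add_tail(&items[4], 5, &head);	/* item 5 added */
		}
	}
	cursor_fini(&c);
	return count;
}
```

In this sketch the walk visits items 1, 3, 4 and 5: item 2 disappears
while the walker is busy with item 1, and item 5 is picked up because it
is added behind the hitch.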
v2:
- Squash obsolete revision history in the patch commit messages.
- Fix a couple of review comments by Christian
- Don't store the mem_type in the TTM managers but in the
resource cursor.
- Rename introduced TTM *back_up* function names to *backup*
- Add ttm pool recovery fault injection.
- Shrinker xe kunit test
- Various bugfixes
v3:
- Address some review comments from Matthew Brost and Christian König.
- Use the restartable LRU walk for TTM swapping and eviction.
- Provide a POC drm_exec locking implementation for exhaustive
eviction. (Christian König).
v4:
- Remove the RFC exhaustive eviction part. While the path to exhaustive
eviction is pretty clear and demonstrated in v3, there is still some
drm_exec work that needs to be agreed and implemented.
- Add shrinker power management. On some hw we need to wake when shrinking.
- Fix the lru walker helper for -EALREADY errors.
- Add drm/xe: Increase the XE_PL_TT watermark.
v5:
- Also update the TTM kunit tests.
- Handle ghost- and zombie objects in the shrinker.
- A couple of compile- and UAF fixes reported by Kernel Build Robot and
Dan Carpenter.
Cc: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <dri-devel@lists.freedesktop.org>
Thomas Hellström (12):
  drm/ttm: Allow TTM LRU list nodes of different types
  drm/ttm: Slightly clean up LRU list iteration
  drm/ttm: Use LRU hitches
  drm/ttm, drm/amdgpu, drm/xe: Consider hitch moves within bulk sublist
    moves
  drm/ttm: Provide a generic LRU walker helper
  drm/ttm: Use the LRU walker helper for swapping
  drm/ttm: Use the LRU walker for eviction
  drm/ttm: Add a virtual base class for graphics memory backup
  drm/ttm/pool: Provide a helper to shrink pages
  drm/ttm: Use fault-injection to test error paths
  drm/ttm, drm/xe: Add a shrinker for xe bos
  drm/xe: Increase the XE_PL_TT watermark
 drivers/gpu/drm/Kconfig                       |  10 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |   4 +
 drivers/gpu/drm/ttm/Makefile                  |   2 +-
 drivers/gpu/drm/ttm/tests/ttm_bo_test.c       |   6 +-
 drivers/gpu/drm/ttm/tests/ttm_resource_test.c |   2 +-
 drivers/gpu/drm/ttm/ttm_backup_shmem.c        | 139 ++++++
 drivers/gpu/drm/ttm/ttm_bo.c                  | 463 ++++++++----------
 drivers/gpu/drm/ttm/ttm_bo_util.c             | 212 ++++++++
 drivers/gpu/drm/ttm/ttm_device.c              |  29 +-
 drivers/gpu/drm/ttm/ttm_pool.c                | 412 +++++++++++++++-
 drivers/gpu/drm/ttm/ttm_resource.c            | 264 ++++++++--
 drivers/gpu/drm/ttm/ttm_tt.c                  |  37 ++
 drivers/gpu/drm/xe/Makefile                   |   1 +
 drivers/gpu/drm/xe/tests/xe_bo.c              | 118 +++++
 drivers/gpu/drm/xe/tests/xe_bo_test.c         |   1 +
 drivers/gpu/drm/xe/tests/xe_bo_test.h         |   1 +
 drivers/gpu/drm/xe/xe_bo.c                    | 155 +++++-
 drivers/gpu/drm/xe/xe_bo.h                    |  26 +
 drivers/gpu/drm/xe/xe_device.c                |   8 +
 drivers/gpu/drm/xe/xe_device_types.h          |   2 +
 drivers/gpu/drm/xe/xe_shrinker.c              | 287 +++++++++++
 drivers/gpu/drm/xe/xe_shrinker.h              |  18 +
 drivers/gpu/drm/xe/xe_ttm_sys_mgr.c           |   3 +-
 drivers/gpu/drm/xe/xe_vm.c                    |   4 +
 include/drm/ttm/ttm_backup.h                  | 136 +++++
 include/drm/ttm/ttm_bo.h                      |  48 +-
 include/drm/ttm/ttm_pool.h                    |   5 +
 include/drm/ttm/ttm_resource.h                |  99 +++-
 include/drm/ttm/ttm_tt.h                      |  20 +
 29 files changed, 2133 insertions(+), 379 deletions(-)
 create mode 100644 drivers/gpu/drm/ttm/ttm_backup_shmem.c
 create mode 100644 drivers/gpu/drm/xe/xe_shrinker.c
 create mode 100644 drivers/gpu/drm/xe/xe_shrinker.h
 create mode 100644 include/drm/ttm/ttm_backup.h
--
2.44.0
Thread overview: 36+ messages
2024-06-18 7:18 Thomas Hellström [this message]
2024-06-18 7:18 ` [PATCH v5 01/12] drm/ttm: Allow TTM LRU list nodes of different types Thomas Hellström
2024-06-18 7:18 ` [PATCH v5 02/12] drm/ttm: Slightly clean up LRU list iteration Thomas Hellström
2024-06-18 7:18 ` [PATCH v5 03/12] drm/ttm: Use LRU hitches Thomas Hellström
2024-06-18 7:18 ` [PATCH v5 04/12] drm/ttm, drm/amdgpu, drm/xe: Consider hitch moves within bulk sublist moves Thomas Hellström
2024-06-19 3:37 ` Matthew Brost
2024-06-19 8:24 ` Thomas Hellström
2024-06-19 14:44 ` Matthew Brost
2024-06-18 7:18 ` [PATCH v5 05/12] drm/ttm: Provide a generic LRU walker helper Thomas Hellström
2024-06-18 22:11 ` Matthew Brost
2024-06-19 7:31 ` Thomas Hellström
2024-06-19 15:09 ` Matthew Brost
2024-06-18 7:18 ` [PATCH v5 06/12] drm/ttm: Use the LRU walker helper for swapping Thomas Hellström
2024-06-19 4:23 ` Matthew Brost
2024-06-19 8:36 ` Thomas Hellström
2024-06-18 7:18 ` [PATCH v5 07/12] drm/ttm: Use the LRU walker for eviction Thomas Hellström
2024-06-19 22:52 ` Matthew Brost
2024-06-24 9:06 ` Thomas Hellström
2024-06-19 23:33 ` Matthew Brost
2024-06-24 9:16 ` Thomas Hellström
2024-06-18 7:18 ` [PATCH v5 08/12] drm/ttm: Add a virtual base class for graphics memory backup Thomas Hellström
2024-06-20 15:17 ` Matthew Brost
2024-06-24 9:26 ` Thomas Hellström
2024-06-24 15:47 ` Thomas Hellström
2024-06-18 7:18 ` [PATCH v5 09/12] drm/ttm/pool: Provide a helper to shrink pages Thomas Hellström
2024-06-18 7:18 ` [PATCH v5 10/12] drm/ttm: Use fault-injection to test error paths Thomas Hellström
2024-06-18 7:18 ` [PATCH v5 11/12] drm/ttm, drm/xe: Add a shrinker for xe bos Thomas Hellström
2024-06-18 7:18 ` [PATCH v5 12/12] drm/xe: Increase the XE_PL_TT watermark Thomas Hellström
2024-06-18 7:24 ` ✓ CI.Patch_applied: success for TTM shrinker helpers and xe buffer object shrinker (rev5) Patchwork
2024-06-18 7:24 ` ✗ CI.checkpatch: warning " Patchwork
2024-06-18 7:25 ` ✓ CI.KUnit: success " Patchwork
2024-06-18 7:37 ` ✓ CI.Build: " Patchwork
2024-06-18 7:39 ` ✗ CI.Hooks: failure " Patchwork
2024-06-18 7:41 ` ✗ CI.checksparse: warning " Patchwork
2024-06-18 8:03 ` ✓ CI.BAT: success " Patchwork
2024-06-18 19:08 ` ✗ CI.FULL: failure " Patchwork