* [PATCH v4 00/12] TTM shrinker helpers and xe buffer object shrinker
@ 2024-06-14 10:25 Thomas Hellström
  2024-06-14 10:25 ` [PATCH v4 01/12] drm/ttm: Allow TTM LRU list nodes of different types Thomas Hellström
                   ` (14 more replies)
  0 siblings, 15 replies; 21+ messages in thread
From: Thomas Hellström @ 2024-06-14 10:25 UTC
  To: intel-xe
  Cc: Thomas Hellström, Somalapuram Amaranath,
	Christian König, Matthew Brost, dri-devel

This series implements TTM shrinker / eviction helpers and an xe bo
shrinker. It builds on two previous series, *and obsoletes these*. First

https://www.mail-archive.com/dri-devel@lists.freedesktop.org/msg484425.html

Second, the previous TTM shrinker series

https://lore.kernel.org/linux-mm/b7491378-defd-4f1c-31e2-29e4c77e2d67@amd.com/T/

The comment about layering in that series

https://lore.kernel.org/linux-mm/b7491378-defd-4f1c-31e2-29e4c77e2d67@amd.com/T/#ma918844aa8a6efe8768fdcda0c6590d5c93850c9

is now addressed, and this version also implements shmem objects for backup
rather than the direct swap-cache insertions used in the previous
series. It turns out that with per-page backup / shrinking, shmem objects
appear to work just as well as direct swap-cache insertions, with the
added benefit that what was introduced in the previous TTM shrinker series
to avoid running out of swap entries isn't really needed.

Patches 1-4 implement restartable LRU list iteration.

Patch 5 implements an LRU walker + resv locking helper.

Patch 6 moves TTM swapping over to the walker.

Patch 7 moves TTM eviction over to the walker.

Patch 8 could in theory be skipped, but it introduces a way to easily
add or test multiple backup backends, like the direct swap-cache
insertion, or even files on fast dedicated NVMe storage, for example.

Patch 9 introduces helpers in the ttm_pool code for page-by-page shrinking
and recovery. It avoids having to temporarily allocate a huge amount of
memory to be able to shrink a buffer object. It also introduces the
possibility to immediately write-back pages if needed, since that tends
to be a bit delayed when left to kswapd.

Patch 10 adds simple error injection to the above code to help increase
test coverage.

Patch 11 implements an xe bo shrinker and a common helper in TTM for
shrinking.

Patch 12-21 are really a separate POC series, for introducing drm_exec locking
in TTM. The patch touches both drm_exec and dma-buf and is for now marked as
an RFC:

Patch 12 increases (removes) the XE_PL_TT watermark.
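
To illustrate what "restartable LRU list iteration" with hitches (patches
1-4) means, here is a minimal userspace sketch. It is not the TTM code:
the names node, walk_begin(), walk_next() and walk_end() are hypothetical,
and locking is left out. The idea is that the walker keeps its own cursor
node (a "hitch") in the list, so the item just returned can be removed or
moved while the lock is dropped without the walker losing its position.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node kinds: real entries and walker cursors ("hitches"). */
enum node_type { NODE_ITEM, NODE_HITCH };

struct node {
	struct node *prev, *next;
	enum node_type type;
	int value;
};

/* Initialize an empty circular list; the head is never returned as an item. */
static void list_init(struct node *head)
{
	head->prev = head->next = head;
	head->type = NODE_HITCH;
}

/* Unlink a node and leave it pointing at itself. */
static void list_del(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}

/* Link node n directly after pos. */
static void list_add_after(struct node *n, struct node *pos)
{
	n->next = pos->next;
	n->prev = pos;
	pos->next->prev = n;
	pos->next = n;
}

static void list_add_tail(struct node *n, struct node *head)
{
	list_add_after(n, head->prev);
}

/* Start a walk: park the hitch at the front of the list. */
static void walk_begin(struct node *head, struct node *hitch)
{
	hitch->type = NODE_HITCH;
	list_add_after(hitch, head);
}

/*
 * Return the next real item after the hitch and move the hitch just past
 * it. Hitches belonging to other walkers are skipped. Returns NULL when
 * the end of the list is reached.
 */
static struct node *walk_next(struct node *head, struct node *hitch)
{
	struct node *n = hitch->next;

	assert(hitch->type == NODE_HITCH);
	while (n != head && n->type == NODE_HITCH)
		n = n->next;
	if (n == head)
		return NULL;

	list_del(hitch);
	list_add_after(hitch, n);
	return n;
}

/* Finish a walk: take the hitch out of the list. */
static void walk_end(struct node *hitch)
{
	list_del(hitch);
}
```

In this sketch the caller may unlink the node returned by walk_next()
before asking for the next one; since the hitch already sits behind that
node, the walk simply continues from the hitch.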

v2:
- Squash obsolete revision history in the patch commit messages.
- Fix a couple of review comments by Christian
- Don't store the mem_type in the TTM managers but in the
  resource cursor.
- Rename introduced TTM *back_up* function names to *backup*
- Add ttm pool recovery fault injection.
- Shrinker xe kunit test
- Various bugfixes

v3:
- Address some review comments from Matthew Brost and Christian König.
- Use the restartable LRU walk for TTM swapping and eviction.
- Provide a POC drm_exec locking implementation for exhaustive
  eviction. (Christian König).

v4:
- Remove the RFC exhaustive eviction part. While the path to exhaustive
  eviction is pretty clear and demonstrated in v3, there is still some
  drm_exec work that needs to be agreed and implemented.
- Add shrinker power management. On some hw we need to wake when shrinking.
- Fix the lru walker helper for -EALREADY errors.
- Add drm/xe: Increase the XE_PL_TT watermark.

Cc: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <dri-devel@lists.freedesktop.org>

Thomas Hellström (12):
  drm/ttm: Allow TTM LRU list nodes of different types
  drm/ttm: Slightly clean up LRU list iteration
  drm/ttm: Use LRU hitches
  drm/ttm, drm/amdgpu, drm/xe: Consider hitch moves within bulk sublist
    moves
  drm/ttm: Provide a generic LRU walker helper
  drm/ttm: Use the LRU walker helper for swapping
  drm/ttm: Use the LRU walker for eviction
  drm/ttm: Add a virtual base class for graphics memory backup
  drm/ttm/pool: Provide a helper to shrink pages
  drm/ttm: Use fault-injection to test error paths
  drm/ttm, drm/xe: Add a shrinker for xe bos
  drm/xe: Increase the XE_PL_TT watermark

 drivers/gpu/drm/Kconfig                |  10 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c |   4 +
 drivers/gpu/drm/ttm/Makefile           |   2 +-
 drivers/gpu/drm/ttm/ttm_backup_shmem.c | 137 ++++++++
 drivers/gpu/drm/ttm/ttm_bo.c           | 463 ++++++++++++-------------
 drivers/gpu/drm/ttm/ttm_bo_util.c      | 212 +++++++++++
 drivers/gpu/drm/ttm/ttm_device.c       |  29 +-
 drivers/gpu/drm/ttm/ttm_pool.c         | 412 +++++++++++++++++++++-
 drivers/gpu/drm/ttm/ttm_resource.c     | 264 +++++++++++---
 drivers/gpu/drm/ttm/ttm_tt.c           |  37 ++
 drivers/gpu/drm/xe/Makefile            |   1 +
 drivers/gpu/drm/xe/tests/xe_bo.c       | 118 +++++++
 drivers/gpu/drm/xe/tests/xe_bo_test.c  |   1 +
 drivers/gpu/drm/xe/tests/xe_bo_test.h  |   1 +
 drivers/gpu/drm/xe/xe_bo.c             | 139 +++++++-
 drivers/gpu/drm/xe/xe_bo.h             |   4 +
 drivers/gpu/drm/xe/xe_device.c         |   8 +
 drivers/gpu/drm/xe/xe_device_types.h   |   2 +
 drivers/gpu/drm/xe/xe_shrinker.c       | 287 +++++++++++++++
 drivers/gpu/drm/xe/xe_shrinker.h       |  18 +
 drivers/gpu/drm/xe/xe_ttm_sys_mgr.c    |   3 +-
 drivers/gpu/drm/xe/xe_vm.c             |   4 +
 include/drm/ttm/ttm_backup.h           | 136 ++++++++
 include/drm/ttm/ttm_bo.h               |  48 ++-
 include/drm/ttm/ttm_pool.h             |   5 +
 include/drm/ttm/ttm_resource.h         |  99 +++++-
 include/drm/ttm/ttm_tt.h               |  20 ++
 27 files changed, 2089 insertions(+), 375 deletions(-)
 create mode 100644 drivers/gpu/drm/ttm/ttm_backup_shmem.c
 create mode 100644 drivers/gpu/drm/xe/xe_shrinker.c
 create mode 100644 drivers/gpu/drm/xe/xe_shrinker.h
 create mode 100644 include/drm/ttm/ttm_backup.h

-- 
2.44.0


* Re: [PATCH v4 08/12] drm/ttm: Add a virtual base class for graphics memory backup
@ 2024-06-16 22:04 kernel test robot
  0 siblings, 0 replies; 21+ messages in thread
From: kernel test robot @ 2024-06-16 22:04 UTC
  To: oe-kbuild; +Cc: lkp, Dan Carpenter

BCC: lkp@intel.com
CC: oe-kbuild-all@lists.linux.dev
In-Reply-To: <20240614102548.4364-9-thomas.hellstrom@linux.intel.com>
References: <20240614102548.4364-9-thomas.hellstrom@linux.intel.com>
TO: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
TO: intel-xe@lists.freedesktop.org
CC: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
CC: "Christian König" <christian.koenig@amd.com>
CC: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
CC: Matthew Brost <matthew.brost@intel.com>
CC: dri-devel@lists.freedesktop.org

Hi Thomas,

kernel test robot noticed the following build warnings:

[auto build test WARNING on drm-xe/drm-xe-next]
[also build test WARNING on next-20240613]
[cannot apply to linus/master v6.10-rc3]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Thomas-Hellstr-m/drm-ttm-Allow-TTM-LRU-list-nodes-of-different-types/20240614-182911
base:   https://gitlab.freedesktop.org/drm/xe/kernel.git drm-xe-next
patch link:    https://lore.kernel.org/r/20240614102548.4364-9-thomas.hellstrom%40linux.intel.com
patch subject: [PATCH v4 08/12] drm/ttm: Add a virtual base class for graphics memory backup
:::::: branch date: 3 days ago
:::::: commit date: 3 days ago
config: x86_64-randconfig-161-20240617 (https://download.01.org/0day-ci/archive/20240617/202406170559.WdDkFEiU-lkp@intel.com/config)
compiler: gcc-13 (Ubuntu 13.2.0-4ubuntu3) 13.2.0

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Reported-by: Dan Carpenter <error27@gmail.com>
| Closes: https://lore.kernel.org/r/202406170559.WdDkFEiU-lkp@intel.com/

smatch warnings:
drivers/gpu/drm/ttm/ttm_backup_shmem.c:130 ttm_backup_shmem_create() error: dereferencing freed memory 'sbackup'

vim +/sbackup +130 drivers/gpu/drm/ttm/ttm_backup_shmem.c

827540d42dec01 Thomas Hellström 2024-06-14  109  
827540d42dec01 Thomas Hellström 2024-06-14  110  /**
827540d42dec01 Thomas Hellström 2024-06-14  111   * ttm_backup_shmem_create() - Create a shmem-based struct backup.
827540d42dec01 Thomas Hellström 2024-06-14  112   * @size: The maximum size (in bytes) to back up.
827540d42dec01 Thomas Hellström 2024-06-14  113   *
827540d42dec01 Thomas Hellström 2024-06-14  114   * Create a backup utilizing shmem objects.
827540d42dec01 Thomas Hellström 2024-06-14  115   *
827540d42dec01 Thomas Hellström 2024-06-14  116   * Return: A pointer to a struct ttm_backup on success,
827540d42dec01 Thomas Hellström 2024-06-14  117   * an error pointer on error.
827540d42dec01 Thomas Hellström 2024-06-14  118   */
827540d42dec01 Thomas Hellström 2024-06-14  119  struct ttm_backup *ttm_backup_shmem_create(loff_t size)
827540d42dec01 Thomas Hellström 2024-06-14  120  {
827540d42dec01 Thomas Hellström 2024-06-14  121  	struct ttm_backup_shmem *sbackup =
827540d42dec01 Thomas Hellström 2024-06-14  122  		kzalloc(sizeof(*sbackup), GFP_KERNEL | __GFP_ACCOUNT);
827540d42dec01 Thomas Hellström 2024-06-14  123  
827540d42dec01 Thomas Hellström 2024-06-14  124  	if (!sbackup)
827540d42dec01 Thomas Hellström 2024-06-14  125  		return ERR_PTR(-ENOMEM);
827540d42dec01 Thomas Hellström 2024-06-14  126  
827540d42dec01 Thomas Hellström 2024-06-14  127  	sbackup->filp = shmem_file_setup("ttm shmem backup", size, 0);
827540d42dec01 Thomas Hellström 2024-06-14  128  	if (IS_ERR(sbackup->filp)) {
827540d42dec01 Thomas Hellström 2024-06-14  129  		kfree(sbackup);
827540d42dec01 Thomas Hellström 2024-06-14 @130  		return ERR_CAST(sbackup->filp);
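
The warning at line 130 is that sbackup->filp is read after kfree(sbackup).
One common way to avoid this pattern is to keep the returned pointer in a
local variable and only store it into the object on success. The following
is a userspace sketch of that idea, not the actual fix for this patch: the
ERR_PTR/PTR_ERR/IS_ERR helpers are simplified stand-ins for the kernel
ones, and file_model, backup_model, file_setup_model() and
backup_create_model() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified userspace stand-ins for the kernel error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-ins for struct file and struct ttm_backup_shmem. */
struct file_model { int unused; };
struct backup_model { struct file_model *filp; };

/* Hypothetical stand-in for shmem_file_setup(); fails when fail != 0. */
static struct file_model *file_setup_model(int fail)
{
	struct file_model *filp;

	if (fail)
		return ERR_PTR(-ENOSPC);

	filp = calloc(1, sizeof(*filp));
	return filp ? filp : ERR_PTR(-ENOMEM);
}

static struct backup_model *backup_create_model(int fail)
{
	struct backup_model *sbackup = calloc(1, sizeof(*sbackup));
	struct file_model *filp;

	if (!sbackup)
		return ERR_PTR(-ENOMEM);

	/* Keep the result in a local so it outlives the free below. */
	filp = file_setup_model(fail);
	if (IS_ERR(filp)) {
		free(sbackup);
		/* Propagate the error without touching freed memory. */
		return ERR_PTR(PTR_ERR(filp));
	}

	sbackup->filp = filp;
	return sbackup;
}

static void backup_destroy_model(struct backup_model *sbackup)
{
	free(sbackup->filp);
	free(sbackup);
}
```

An equivalent alternative would be to read the error code out of the object
before freeing it; either way the freed object is never dereferenced.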

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


end of thread, other threads:[~2024-06-17 14:09 UTC | newest]

Thread overview: 21+ messages
2024-06-14 10:25 [PATCH v4 00/12] TTM shrinker helpers and xe buffer object shrinker Thomas Hellström
2024-06-14 10:25 ` [PATCH v4 01/12] drm/ttm: Allow TTM LRU list nodes of different types Thomas Hellström
2024-06-15  7:45   ` kernel test robot
2024-06-14 10:25 ` [PATCH v4 02/12] drm/ttm: Slightly clean up LRU list iteration Thomas Hellström
2024-06-14 10:25 ` [PATCH v4 03/12] drm/ttm: Use LRU hitches Thomas Hellström
2024-06-14 10:25 ` [PATCH v4 04/12] drm/ttm, drm/amdgpu, drm/xe: Consider hitch moves within bulk sublist moves Thomas Hellström
2024-06-14 10:25 ` [PATCH v4 05/12] drm/ttm: Provide a generic LRU walker helper Thomas Hellström
2024-06-14 10:25 ` [PATCH v4 06/12] drm/ttm: Use the LRU walker helper for swapping Thomas Hellström
2024-06-14 10:25 ` [PATCH v4 07/12] drm/ttm: Use the LRU walker for eviction Thomas Hellström
2024-06-14 10:25 ` [PATCH v4 08/12] drm/ttm: Add a virtual base class for graphics memory backup Thomas Hellström
2024-06-17 14:09   ` Dan Carpenter
2024-06-14 10:25 ` [PATCH v4 09/12] drm/ttm/pool: Provide a helper to shrink pages Thomas Hellström
2024-06-15 10:55   ` kernel test robot
2024-06-14 10:25 ` [PATCH v4 10/12] drm/ttm: Use fault-injection to test error paths Thomas Hellström
2024-06-14 10:25 ` [PATCH v4 11/12] drm/ttm, drm/xe: Add a shrinker for xe bos Thomas Hellström
2024-06-14 23:27   ` kernel test robot
2024-06-14 10:25 ` [PATCH v4 12/12] drm/xe: Increase the XE_PL_TT watermark Thomas Hellström
2024-06-14 12:00 ` ✓ CI.Patch_applied: success for TTM shrinker helpers and xe buffer object shrinker (rev4) Patchwork
2024-06-14 12:00 ` ✗ CI.checkpatch: warning " Patchwork
2024-06-14 12:01 ` ✗ CI.KUnit: failure " Patchwork
  -- strict thread matches above, loose matches on Subject: below --
2024-06-16 22:04 [PATCH v4 08/12] drm/ttm: Add a virtual base class for graphics memory backup kernel test robot
