public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
* [Intel-gfx] [RFC 0/8] Per client engine busyness
@ 2020-02-07 16:13 Tvrtko Ursulin
  2020-02-07 16:13 ` [Intel-gfx] [PATCH 1/6] drm/i915: Expose list of clients in sysfs Tvrtko Ursulin
                   ` (8 more replies)
  0 siblings, 9 replies; 17+ messages in thread
From: Tvrtko Ursulin @ 2020-02-07 16:13 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Another re-spin of the per-client engine busyness series. Highlights from this
version:

 * Refactor of the i915_drm_client concept based on feedback from Chris.
 * Tracking contexts belonging to a client had a flaw where reaped contexts
   would stop contributing to the busyness making the counters go backwards.
   I was simply broken. In this version I had to track per-client stats at the
   same call-sites where per-context is tracked. It's a little bit more runtime
   cost but not too bad.
 * GuC support is out - needs more time than I have at the moment to properly
   wire it up with the latest changes. Plus there is a monotonicity issue with
   either the value stored in pphwsp by the GPU or a bug in my patch which also
   needs to be debugged.

Internally we track time spent on engines for each struct intel_context. This
can serve as a building block for several features from the want list:
smarter scheduler decisions, getrusage(2)-like per-GEM-context functionality
wanted by some customers, cgroups controller, dynamic SSEU tuning,...

Externally, in sysfs, we expose time spent on GPU per client and per engine
class.

Sysfs interface enables us to implement a "top-like" tool for GPU tasks. Or with
a "screenshot":
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
intel-gpu-top -  906/ 955 MHz;    0% RC6;  5.30 Watts;      933 irqs/s

      IMC reads:     4414 MiB/s
     IMC writes:     3805 MiB/s

          ENGINE      BUSY                                      MI_SEMA MI_WAIT
     Render/3D/0   93.46% |████████████████████████████████▋  |      0%      0%
       Blitter/0    0.00% |                                   |      0%      0%
         Video/0    0.00% |                                   |      0%      0%
  VideoEnhance/0    0.00% |                                   |      0%      0%

  PID            NAME  Render/3D      Blitter        Video      VideoEnhance
 2733       neverball |██████▌     ||            ||            ||            |
 2047            Xorg |███▊        ||            ||            ||            |
 2737        glxgears |█▍          ||            ||            ||            |
 2128           xfwm4 |            ||            ||            ||            |
 2047            Xorg |            ||            ||            ||            |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Implementation wise we add a a bunch of files in sysfs like:

	# cd /sys/class/drm/card0/clients/
	# tree
	.
	├── 7
	│   ├── busy
	│   │   ├── 0
	│   │   ├── 1
	│   │   ├── 2
	│   │   └── 3
	│   ├── name
	│   └── pid
	├── 8
	│   ├── busy
	│   │   ├── 0
	│   │   ├── 1
	│   │   ├── 2
	│   │   └── 3
	│   ├── name
	│   └── pid
	└── 9
	    ├── busy
	    │   ├── 0
	    │   ├── 1
	    │   ├── 2
	    │   └── 3
	    ├── name
	    └── pid


Files in 'busy' directories are numbered using the engine class ABI values and
they contain accumulated nanoseconds each client spent on engines of a
respective class.

It is stil a RFC since it misses dedicated test cases to ensure things really
work as advertised.

Tvrtko Ursulin (6):
  drm/i915: Expose list of clients in sysfs
  drm/i915: Update client name on context create
  drm/i915: Make GEM contexts track DRM clients
  drm/i915: Track per-context engine busyness
  drm/i915: Track per drm client engine class busyness
  drm/i915: Expose per-engine client busyness

 drivers/gpu/drm/i915/Makefile                 |   3 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  31 +-
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  13 +-
 drivers/gpu/drm/i915/gt/intel_context.c       |  20 +
 drivers/gpu/drm/i915/gt/intel_context.h       |  35 ++
 drivers/gpu/drm/i915/gt/intel_context_types.h |   9 +
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  16 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  65 +++-
 drivers/gpu/drm/i915/i915_debugfs.c           |  31 +-
 drivers/gpu/drm/i915/i915_drm_client.c        | 350 ++++++++++++++++++
 drivers/gpu/drm/i915/i915_drm_client.h        |  88 +++++
 drivers/gpu/drm/i915/i915_drv.h               |   5 +
 drivers/gpu/drm/i915/i915_gem.c               |  35 +-
 drivers/gpu/drm/i915/i915_gpu_error.c         |  21 +-
 drivers/gpu/drm/i915/i915_sysfs.c             |   8 +
 15 files changed, 673 insertions(+), 57 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.c
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.h

-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread
* [Intel-gfx] [RFC 0/8] Per client engine busyness
@ 2020-01-10 13:30 Tvrtko Ursulin
  0 siblings, 0 replies; 17+ messages in thread
From: Tvrtko Ursulin @ 2020-01-10 13:30 UTC (permalink / raw)
  To: Intel-gfx; +Cc: kui.wen

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Another re-spin of the per-client engine busyness series. Highlights from this
version:

 * Two patches added on top of the series to provide the data in GuC mode.
   (For warnings and limitations please read the respective commit messages.)
 * Refactor to introduce a notion of i915_drm_client and move to separate source
   file.

Internally we track time spent on engines for each struct intel_context. This
can serve as a building block for several features from the want list:
smarter scheduler decisions, getrusage(2)-like per-GEM-context functionality
wanted by some customers, cgroups controller, dynamic SSEU tuning,...

Externally, in sysfs, we expose time spent on GPU per client and per engine
class.

Sysfs interface enables us to implement a "top-like" tool for GPU tasks. Or with
a "screenshot":
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
intel-gpu-top -  906/ 955 MHz;    0% RC6;  5.30 Watts;      933 irqs/s

      IMC reads:     4414 MiB/s
     IMC writes:     3805 MiB/s

          ENGINE      BUSY                                      MI_SEMA MI_WAIT
     Render/3D/0   93.46% |████████████████████████████████▋  |      0%      0%
       Blitter/0    0.00% |                                   |      0%      0%
         Video/0    0.00% |                                   |      0%      0%
  VideoEnhance/0    0.00% |                                   |      0%      0%

  PID            NAME  Render/3D      Blitter        Video      VideoEnhance
 2733       neverball |██████▌     ||            ||            ||            |
 2047            Xorg |███▊        ||            ||            ||            |
 2737        glxgears |█▍          ||            ||            ||            |
 2128           xfwm4 |            ||            ||            ||            |
 2047            Xorg |            ||            ||            ||            |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Implementation wise we add a a bunch of files in sysfs like:

	# cd /sys/class/drm/card0/clients/
	# tree
	.
	├── 7
	│   ├── busy
	│   │   ├── 0
	│   │   ├── 1
	│   │   ├── 2
	│   │   └── 3
	│   ├── name
	│   └── pid
	├── 8
	│   ├── busy
	│   │   ├── 0
	│   │   ├── 1
	│   │   ├── 2
	│   │   └── 3
	│   ├── name
	│   └── pid
	└── 9
	    ├── busy
	    │   ├── 0
	    │   ├── 1
	    │   ├── 2
	    │   └── 3
	    ├── name
	    └── pid


Files in 'busy' directories are numbered using the engine class ABI values and
they contain accumulated nanoseconds each client spent on engines of a
respective class.

Tvrtko Ursulin (8):
  drm/i915: Expose list of clients in sysfs
  drm/i915: Update client name on context create
  drm/i915: Track per-context engine busyness
  drm/i915: Track all user contexts per client
  drm/i915: Contexts can use struct pid stored in the client
  drm/i915: Expose per-engine client busyness
  drm/i915: Track hw reported context runtime
  drm/i915: Fallback to hw context runtime when sw tracking is not
    available

 drivers/gpu/drm/i915/Makefile                 |   3 +-
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  60 +++-
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  16 +-
 drivers/gpu/drm/i915/gt/intel_context.c       |  20 ++
 drivers/gpu/drm/i915/gt/intel_context.h       |  18 ++
 drivers/gpu/drm/i915/gt/intel_context_types.h |  14 +
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  16 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  55 +++-
 drivers/gpu/drm/i915/i915_debugfs.c           |   7 +-
 drivers/gpu/drm/i915/i915_drm_client.c        | 298 ++++++++++++++++++
 drivers/gpu/drm/i915/i915_drm_client.h        |  83 +++++
 drivers/gpu/drm/i915/i915_drv.h               |   5 +
 drivers/gpu/drm/i915/i915_gem.c               |  35 +-
 drivers/gpu/drm/i915/i915_gpu_error.c         |   8 +-
 drivers/gpu/drm/i915/i915_sysfs.c             |   8 +
 drivers/gpu/drm/i915/intel_device_info.c      |   2 +
 drivers/gpu/drm/i915/intel_device_info.h      |   1 +
 17 files changed, 606 insertions(+), 43 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.c
 create mode 100644 drivers/gpu/drm/i915/i915_drm_client.h

-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread
* [Intel-gfx] [RFC 0/8] Per client engine busyness
@ 2019-12-19 18:00 Tvrtko Ursulin
  0 siblings, 0 replies; 17+ messages in thread
From: Tvrtko Ursulin @ 2019-12-19 18:00 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Another re-spin of the per-client engine busyness series. Highlights from this
version:

 * Now tracks GPU time for clients who exit with GPU work left running.
 * No more global toggle - it is now constantly on.

Internally we track time spent on engines for each struct intel_context. This
can serve as a building block for several features from the want list:
smarter scheduler decisions, getrusage(2)-like per-GEM-context functionality
wanted by some customers, cgroups controller, dynamic SSEU tuning,...

Externally, in sysfs, we expose time spent on GPU per client and per engine
class.

Sysfs interface enables us to implement a "top-like" tool for GPU tasks. Or with
a "screenshot":
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
intel-gpu-top -  906/ 955 MHz;    0% RC6;  5.30 Watts;      933 irqs/s

      IMC reads:     4414 MiB/s
     IMC writes:     3805 MiB/s

          ENGINE      BUSY                                      MI_SEMA MI_WAIT
     Render/3D/0   93.46% |████████████████████████████████▋  |      0%      0%
       Blitter/0    0.00% |                                   |      0%      0%
         Video/0    0.00% |                                   |      0%      0%
  VideoEnhance/0    0.00% |                                   |      0%      0%

  PID            NAME  Render/3D      Blitter        Video      VideoEnhance
 2733       neverball |██████▌     ||            ||            ||            |
 2047            Xorg |███▊        ||            ||            ||            |
 2737        glxgears |█▍          ||            ||            ||            |
 2128           xfwm4 |            ||            ||            ||            |
 2047            Xorg |            ||            ||            ||            |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Implementation wise we add a a bunch of files in sysfs like:

	# cd /sys/class/drm/card0/clients/
	# tree
	.
	├── 7
	│   ├── busy
	│   │   ├── 0
	│   │   ├── 1
	│   │   ├── 2
	│   │   └── 3
	│   ├── name
	│   └── pid
	├── 8
	│   ├── busy
	│   │   ├── 0
	│   │   ├── 1
	│   │   ├── 2
	│   │   └── 3
	│   ├── name
	│   └── pid
	├── 9
	│   ├── busy
	│   │   ├── 0
	│   │   ├── 1
	│   │   ├── 2
	│   │   └── 3
	│   ├── name
	│   └── pid
	└── enable_stats

Files in 'busy' directories are numbered using the engine class ABI values and
they contain accumulated nanoseconds each client spent on engines of a
respective class.

I will post the corresponding patch to intel_gpu_top for reference as well.

Tvrtko Ursulin (8):
  drm/i915: Switch context id allocation directoy to xarray
  drm/i915: Reference count struct drm_i915_file_private
  drm/i915: Expose list of clients in sysfs
  drm/i915: Update client name on context create
  drm/i915: Track per-context engine busyness
  drm/i915: Track all user contexts per client
  drm/i915: Contexts can use struct pid stored in the client
  drm/i915: Expose per-engine client busyness

 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 113 ++++++---
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  16 +-
 .../gpu/drm/i915/gem/selftests/mock_context.c |   3 +-
 drivers/gpu/drm/i915/gt/intel_context.c       |  20 ++
 drivers/gpu/drm/i915/gt/intel_context.h       |  11 +
 drivers/gpu/drm/i915/gt/intel_context_types.h |   9 +
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |  16 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  52 +++-
 drivers/gpu/drm/i915/i915_debugfs.c           |   7 +-
 drivers/gpu/drm/i915/i915_drv.c               |   4 -
 drivers/gpu/drm/i915/i915_drv.h               |  66 ++++-
 drivers/gpu/drm/i915/i915_gem.c               | 236 +++++++++++++++++-
 drivers/gpu/drm/i915/i915_gpu_error.c         |   6 +-
 drivers/gpu/drm/i915/i915_sysfs.c             |   8 +
 14 files changed, 486 insertions(+), 81 deletions(-)

-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2020-02-09 11:02 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-02-07 16:13 [Intel-gfx] [RFC 0/8] Per client engine busyness Tvrtko Ursulin
2020-02-07 16:13 ` [Intel-gfx] [PATCH 1/6] drm/i915: Expose list of clients in sysfs Tvrtko Ursulin
2020-02-07 16:13 ` [Intel-gfx] [PATCH 2/6] drm/i915: Update client name on context create Tvrtko Ursulin
2020-02-07 16:13 ` [Intel-gfx] [PATCH 3/6] drm/i915: Make GEM contexts track DRM clients Tvrtko Ursulin
2020-02-07 16:13 ` [Intel-gfx] [PATCH 4/6] drm/i915: Track per-context engine busyness Tvrtko Ursulin
2020-02-09 11:02   ` Chris Wilson
2020-02-07 16:13 ` [Intel-gfx] [PATCH 5/6] drm/i915: Track per drm client engine class busyness Tvrtko Ursulin
2020-02-07 16:33   ` Chris Wilson
2020-02-07 16:49     ` Tvrtko Ursulin
2020-02-07 17:02       ` Chris Wilson
2020-02-07 17:24         ` Tvrtko Ursulin
2020-02-07 16:13 ` [Intel-gfx] [PATCH 6/6] drm/i915: Expose per-engine client busyness Tvrtko Ursulin
2020-02-07 20:19 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Per client engine busyness (rev4) Patchwork
2020-02-07 20:24 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2020-02-07 20:44 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
  -- strict thread matches above, loose matches on Subject: below --
2020-01-10 13:30 [Intel-gfx] [RFC 0/8] Per client engine busyness Tvrtko Ursulin
2019-12-19 18:00 Tvrtko Ursulin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox